update vocabulary documentation

Top-level and linux/ now have separate vocabulary.md files.
author: Kartik K. Agaram <vc@akkartik.com> 2021-03-08 23:49:07 -0800
committer: Kartik K. Agaram <vc@akkartik.com> 2021-03-08 23:50:35 -0800
commit: cec5ef31b3e383b7bdffe049a8c502a563f6b491 (patch)
tree: 9f6b410cc16991a709dc59258ae29dacd2feb98b /vocabulary.md
parent: 6508ab51ccd6a41d6d1da3502359e80611d8bda3 (diff)
download: mu-cec5ef31b3e383b7bdffe049a8c502a563f6b491.tar.gz
1 files changed, 173 insertions, 162 deletions
diff --git a/vocabulary.md b/vocabulary.md
index ce3eab23..9471d132 100644
--- a/vocabulary.md
+++ b/vocabulary.md
@@ -2,8 +2,15 @@
 
 ### Data Structures
 
-- Kernel strings: null-terminated regions of memory. Unsafe and to be avoided,
-  but needed for interacting with the kernel.
+- Handles: addresses to objects allocated on the heap. They're augmented with
+  book-keeping to guarantee memory-safety, and so cannot be stored in registers.
+  See [mu.md](mu.md) for details, but in brief:
+    - You need `addr` values to access data they point to.
+    - You can't store `addr` values in other types. They're temporary.
+    - You can store `handle` values in other types.
+    - To convert `handle` to `addr`, use `lookup`.
+    - Reclaiming memory (currently unimplemented) invalidates all `addr`
+      values.
 
 - Arrays: size-prefixed regions of memory containing multiple elements of a
   single type. Contents are preceded by 4 bytes (32 bits) containing the
@@ -24,185 +31,189 @@
 
   Invariant: 0 <= `read` <= `write` <= `size`
 
-- File descriptors (fd): Low-level 32-bit integers that the kernel uses to
-  track files opened by the program.
+  Writes to a stream abort if it's full. Reads from a stream abort if it's
+  empty.
 
-- File: 32-bit value containing either a fd or an address to a stream (fake
-  file).
+- Graphemes: 32-bit fragments of utf-8 that encode a single Unicode code-point.
+- Code-points: 32-bit integers representing a Unicode character.
 
-- Buffered files (buffered-file): Contain a file descriptor and a stream for
-  buffering reads/writes. Each `buffered-file` must exclusively perform either
-  reads or writes.
+### Functions
 
-### 'system calls'
+The most useful functions from 400.mu and later .mu files. Look for definitions
+(using `ctags`) to see type signatures.
 
-As I said at the top, a primary design goal of SubX (and Mu more broadly) is
-to explore ways to turn arbitrary manual tests into reproducible automated
-tests. SubX aims for this goal by baking testable interfaces deep into the
-stack, at the OS syscall level. The idea is that every syscall that interacts
-with hardware (and so the environment) should be *dependency injected* so that
-it's possible to insert fake hardware in tests.
+- `abort`: prints a message in red on the bottom left of the screen and halts.
 
-But those are big goals. Here are the syscalls I have so far:
-
-- `write`: takes two arguments, a file `f` and an address to array `s`.
-
-  Comparing this interface with the Unix `write()` syscall shows two benefits:
-
-  1. SubX can handle 'fake' file descriptors in tests.
-
-  1. `write()` accepts buffer and its size in separate arguments, which
-     requires callers to manage the two separately and so can be error-prone.
-     SubX's wrapper keeps the two together to increase the chances that we
-     never accidentally go out of array bounds.
-
-- `read`: takes two arguments, a file `f` and an address to stream `s`. Reads
-  as much data from `f` as can fit in (the free space of) `s`.
-
-  Like with `write()`, this wrapper around the Unix `read()` syscall adds the
-  ability to handle 'fake' file descriptors in tests, and reduces the chances
-  of clobbering outside array bounds.
-
-  One bit of weirdness here: in tests we do a redundant copy from one stream
-  to another. See [the comments before the implementation](http://akkartik.github.io/mu/html/060read.subx.html)
-  for a discussion of alternative interfaces.
-
-- `stop`: takes two arguments:
-  - `ed` is an address to an _exit descriptor_. Exit descriptors allow us to
-    `exit()` the program in production, but return to the test harness within
-    tests. That allows tests to make assertions about when `exit()` is called.
-  - `value` is the status code to `exit()` with.
-
-  For more details on exit descriptors and how to create one, see [the
-  comments before the implementation](http://akkartik.github.io/mu/html/059stop.subx.html).
-
-- `new-segment`
-
-  Allocates a whole new segment of memory for the program, discontiguous with
-  both existing code and data (heap) segments. Just a more opinionated form of
-  [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html).
+#### assertions for tests
 
-- `allocate`: takes two arguments, an address to allocation-descriptor `ad`
-  and an integer `n`
+- `check`: fails current test if given boolean is false (`= 0`).
+- `check-not`: fails current test if given boolean isn't false (`!= 0`).
+- `check-ints-equal`: fails current test if given ints aren't equal.
+- `check-strings-equal`: fails current test if given strings have different bytes.
+- `check-stream-equal`: fails current test if stream's data doesn't match
+  string in its entirety. Ignores the stream's read index.
+- `check-array-equal`: fails if an array's elements don't match what's written
+  in a whitespace-separated string.
+- `check-next-stream-line-equal`: fails current test if next line of stream
+  until newline doesn't match string.
 
-  Allocates a contiguous range of memory that is guaranteed to be exclusively
-  available to the caller. Returns the starting address to the range in `eax`.
+#### predicates
 
-  An allocation descriptor tracks allocated vs available addresses in some
-  contiguous range of memory. The int specifies the number of bytes to allocate.
+- `handle-equal?`: checks if two handles point at the identical address. Does
+  not compare payloads at their respective addresses.
 
-  Explicitly passing in an allocation descriptor allows for nested memory
-  management, where a sub-system gets a chunk of memory and further parcels it
-  out to individual allocations. Particularly helpful for (surprise) tests.
+- `array-equal?`: checks if two arrays (of ints only for now) have identical
+  elements.
 
-- ... _(to be continued)_
+- `string-equal?`: compares two strings.
+- `stream-data-equal?`: compares a stream with a string.
+- `next-stream-line-equal?`: compares with string the next line in a stream, from
+  `read` index to newline.
 
-I will continue to import syscalls over time from [the old Mu VM in the parent
-directory](https://github.com/akkartik/mu), which has experimented with
-interfaces for the screen, keyboard, mouse, disk and network.
+- `slice-empty?`: checks if the `start` and `end` of a slice are equal.
+- `slice-equal?`: compares a slice with a string.
+- `slice-starts-with?`: compares the start of a slice with a string.
 
-### primitives built atop system calls
+- `stream-full?`: checks if a write to a stream would abort.
+- `stream-empty?`: checks if a read from a stream would abort.
 
-_(Compound arguments are usually passed in by reference. Where the results are
-compound objects that don't fit in a register, the caller usually passes in
-allocated memory for it.)_
+#### arrays
 
-#### assertions for tests
-- `check-ints-equal`: fails current test if given ints aren't equal
-- `check-stream-equal`: fails current test if stream doesn't match string
-- `check-next-stream-line-equal`: fails current test if next line of stream
-  until newline doesn't match string
+- `populate`: allocates space for `n` objects of the appropriate type.
+- `copy-array`: allocates enough space and writes out a copy of an array of
+  some type.
+- `slice-to-string`: allocates space for an array of bytes and copies the
+  slice into it.
 
-#### error handling
-- `error`: takes three arguments, an exit-descriptor, a file and a string (message)
+#### streams
 
-  Prints out the message to the file and then exits using the provided
-  exit-descriptor.
+- `populate-stream`: allocates space in a stream for `n` objects of the
+  appropriate type.
+- `write-to-stream`: writes arbitrary objects to a stream of the appropriate
+  type.
+- `read-from-stream`: reads arbitrary objects from a stream of the appropriate
+  type.
+- `stream-to-array`: allocates just enough space and writes out a stream's
+  data between its read index (inclusive) and write index (exclusive).
 
-- `error-byte`: like `error` but takes an extra byte value that it prints out
-  at the end of the message.
-
-#### predicates
-- `kernel-string-equal?`: compares a kernel string with a string
-- `string-equal?`: compares two strings
-- `stream-data-equal?`: compares a stream with a string
-- `next-stream-line-equal?`: compares with string the next line in a stream, from
-  `read` index to newline
-
-- `slice-empty?`: checks if the `start` and `end` of a slice are equal
-- `slice-equal?`: compares a slice with a string
-- `slice-starts-with?`: compares the start of a slice with a string
-- `slice-ends-with?`: compares the end of a slice with a string
-
-#### writing to disk
-- `write`: string -> file
-  - Can also be used to cat a string into a stream.
-  - Will abort the entire program if destination is a stream and doesn't have
-    enough room.
-- `write-stream`: stream -> file
-  - Can also be used to cat one stream into another.
-  - Will abort the entire program if destination is a stream and doesn't have
-    enough room.
-- `write-slice`: slice -> stream
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `append-byte`: int -> stream
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `append-byte-hex`: int -> stream
-  - textual representation in hex, no '0x' prefix
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `print-int32`: int -> stream
-  - textual representation in hex, including '0x' prefix
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `write-buffered`: string -> buffered-file
-- `write-slice-buffered`: slice -> buffered-file
-- `flush`: buffered-file
-- `write-byte-buffered`: int -> buffered-file
-- `print-byte-buffered`: int -> buffered-file
-  - textual representation in hex, no '0x' prefix
-- `print-int32-buffered`: int -> buffered-file
-  - textual representation in hex, including '0x' prefix
-
-#### reading from disk
-- `read`: file -> stream
-  - Can also be used to cat one stream into another.
-  - Will silently stop reading when destination runs out of space.
-- `read-byte-buffered`: buffered-file -> byte
-- `read-line-buffered`: buffered-file -> stream
-  - Will abort the entire program if there isn't enough room.
-
-#### non-IO operations on streams
-- `new-stream`: allocates space for a stream of `n` elements, each occupying
-  `b` bytes.
-  - Will abort the entire program if `n*b` requires more than 32 bits.
 - `clear-stream`: resets everything in the stream to `0` (except its `size`).
 - `rewind-stream`: resets the read index of the stream to `0` without modifying
   its contents.
 
+- `write`: writes a string into a stream of bytes. Doesn't support streams of
+  other types.
+- `write-stream`: concatenates one stream into another.
+- `write-slice`: writes a slice into a stream of bytes.
+- `append-byte`: writes a single byte into a stream of bytes.
+- `append-byte-hex`: writes textual representation of lowest byte in hex to
+  a stream of bytes. Does not write a '0x' prefix.
+- `read-byte`: reads a single byte from a stream of bytes.
+
 #### reading/writing hex representations of integers
-- `is-hex-int?`: takes a slice argument, returns boolean result in `eax`
-- `parse-hex-int`: takes a slice argument, returns int result in `eax`
-- `is-hex-digit?`: takes a 32-bit word containing a single byte, returns
-  boolean result in `eax`.
-- `from-hex-char`: takes a hexadecimal digit character in `eax`, returns its
-  numeric value in `eax`
-- `to-hex-char`: takes a single-digit numeric value in `eax`, returns its
-  corresponding hexadecimal character in `eax`
-
-#### tokenization
-
-from a stream:
-- `next-token`: stream, delimiter byte -> slice
-- `skip-chars-matching`: stream, delimiter byte
-- `skip-chars-not-matching`: stream, delimiter byte
-
-from a slice:
-- `next-token-from-slice`: start, end, delimiter byte -> slice
-  - Given a slice and a delimiter byte, returns a new slice inside the input
-    that ends at the delimiter byte.
-
-- `skip-chars-matching-in-slice`: curr, end, delimiter byte -> new-curr (in `eax`)
-- `skip-chars-not-matching-in-slice`:  curr, end, delimiter byte -> new-curr (in `eax`)
+
+- `write-int32-hex`
+- `hex-int?`: checks if a slice contains an int in hex. Supports '0x' prefix.
+- `parse-hex-int`: reads int in hex from string
+- `parse-hex-int-from-slice`: reads int in hex from slice
+- `parse-array-of-ints`: reads in multiple ints in hex, separated by whitespace.
+- `hex-digit?`: checks if byte is in [0, 9] or [a, f] (lowercase only)
+
+- `write-int32-decimal`
+- `parse-decimal-int`
+- `parse-decimal-int-from-slice`
+- `parse-decimal-int-from-stream`
+- `parse-array-of-decimal-ints`
+- `decimal-digit?`: checks if byte is in [0, 9]
+
+#### printing to screen
+
+All screen primitives require a screen object, which can be either the real
+screen on the computer or a fake screen for tests.
+
+The real screen on the Mu computer can currently display only ASCII characters,
+though it's easy to import more of the font. There is only one font. All
+graphemes are 8 pixels wide and 16 pixels tall. These constraints only apply
+to the real screen.
+
+- `draw-grapheme`: draws a single grapheme at a given coordinate, with given
+  foreground and background colors.
+- `render-grapheme`: like `draw-grapheme`, but can also handle newlines,
+  assuming text is printed left-to-right, top-to-bottom.
+- `draw-code-point`
+- `clear-screen`
+
+- `draw-text-rightward`: draws a single line of text, stopping when it reaches
+  either the provided bound or the right screen margin.
+- `draw-stream-rightward`
+- `draw-text-rightward-over-full-screen`: does not provide a bound.
+- `draw-text-wrapping-right-then-down`: draws multiple lines of text on screen
+  with simplistic word-wrap (no hyphenation) within (x, y) bounds.
+- `draw-stream-wrapping-right-then-down`
+- `draw-text-wrapping-right-then-down-over-full-screen`
+- `draw-int32-hex-wrapping-right-then-down`
+- `draw-int32-hex-wrapping-right-then-down-over-full-screen`
+- `draw-int32-decimal-wrapping-right-then-down`
+- `draw-int32-decimal-wrapping-right-then-down-over-full-screen`
+
+Similar primitives for writing text top-to-bottom, left-to-right.
+
+- `draw-text-downward`
+- `draw-stream-downward`
+- `draw-text-wrapping-down-then-right`
+- `draw-stream-wrapping-down-then-right`
+- `draw-text-wrapping-down-then-right-over-full-screen`
+- `draw-int32-hex-wrapping-down-then-right`
+- `draw-int32-hex-wrapping-down-then-right-over-full-screen`
+- `draw-int32-decimal-wrapping-down-then-right`
+- `draw-int32-decimal-wrapping-down-then-right-over-full-screen`
+
+Screens remember the current cursor position.
+
+- `cursor-position`
+- `set-cursor-position`
+- `draw-grapheme-at-cursor`
+- `draw-code-point-at-cursor`
+- `draw-cursor`: highlights the current position of the cursor. Programs must
+  pass in the grapheme to draw at the cursor position, and are responsible for
+  clearing the highlight when the cursor moves.
+- `move-cursor-left`, `move-cursor-right`, `move-cursor-up`, `move-cursor-down`.
+  These primitives always silently fail if the desired movement would go out
+  of screen bounds.
+- `move-cursor-to-left-margin-of-next-line`
+- `move-cursor-rightward-and-downward`: move cursor one grapheme to the right
+
+- `draw-text-rightward-from-cursor`
+- `draw-text-wrapping-right-then-down-from-cursor`
+- `draw-text-wrapping-right-then-down-from-cursor-over-full-screen`
+- `draw-int32-hex-wrapping-right-then-down-from-cursor`
+- `draw-int32-hex-wrapping-right-then-down-from-cursor-over-full-screen`
+- `draw-int32-decimal-wrapping-right-then-down-from-cursor`
+- `draw-int32-decimal-wrapping-right-then-down-from-cursor-over-full-screen`
+
+- `draw-text-wrapping-down-then-right-from-cursor`
+- `draw-text-wrapping-down-then-right-from-cursor-over-full-screen`
+
+Assertions for tests:
+
+- `check-screen-row`: compare a screen from the left margin of a given row
+  index with a string. The row index counts downward from 0 at the top of the
+  screen. String can be smaller or larger than a single row, and defines the
+  region of interest. Strings longer than a row wrap around to the left margin
+  of the next screen row. Currently assumes text is printed left-to-right on
+  the screen.
+- `check-screen-row-from`: compare a fragment of a screen (left to right, top
+  to bottom) starting from a given (x, y) coordinate with an expected string.
+  Currently assumes text is printed left-to-right and top-to-bottom on the
+  screen.
+- `check-screen-row-in-color`: like `check-screen-row` but:
+  - also compares foreground color
+  - ignores screen locations where the expected string contains spaces
+- `check-screen-row-in-color-from`
+- `check-screen-row-in-background-color`
+- `check-screen-row-in-background-color-from`
+- `check-background-color-in-screen-row`: unlike previous functions, this
+  doesn't check screen contents, only background color. Ignores background
+  color where expected string contains spaces, and compares background color
+  where expected string does not contain spaces. Never compares the character
+  at any screen location.
+- `check-background-color-in-screen-row-from`
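
The stream semantics described in the diff (the invariant 0 <= `read` <= `write` <= `size`, writes aborting when full, reads aborting when empty) can be modeled with a rough sketch. This is Python rather than Mu, and the class, method, and exception names are hypothetical stand-ins chosen for illustration; they are not the signatures in the Mu sources.

```python
class StreamAbort(Exception):
    """Hypothetical stand-in for Mu's `abort`, which halts the program."""


class Stream:
    """Fixed-capacity stream with `read` and `write` indexes (illustrative model)."""

    def __init__(self, size):
        self.data = [0] * size
        self.size = size
        self.read = 0
        self.write = 0

    def _check_invariant(self):
        # Invariant from the vocabulary: 0 <= read <= write <= size
        assert 0 <= self.read <= self.write <= self.size

    def full(self):
        # Models `stream-full?`: a write would abort.
        return self.write >= self.size

    def empty(self):
        # Models `stream-empty?`: a read would abort.
        return self.read >= self.write

    def append(self, value):
        if self.full():
            raise StreamAbort("write to full stream")
        self.data[self.write] = value
        self.write += 1
        self._check_invariant()

    def next(self):
        if self.empty():
            raise StreamAbort("read from empty stream")
        value = self.data[self.read]
        self.read += 1
        self._check_invariant()
        return value

    def rewind(self):
        # Models `rewind-stream`: reset the read index, keep the contents.
        self.read = 0

    def clear(self):
        # Models `clear-stream`: reset everything except `size`.
        self.data = [0] * self.size
        self.read = 0
        self.write = 0

    def to_array(self):
        # Models `stream-to-array`: read index inclusive, write index exclusive.
        return self.data[self.read:self.write]
```

In this model, `rewind` makes previously read data visible again to `to_array`, while `clear` discards it, which mirrors the distinction the vocabulary draws between `rewind-stream` and `clear-stream`.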
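
The wrap-around behavior described for `check-screen-row` (the expected string starts at the left margin of a row and, if longer than a row, continues at the left margin of the next row) might look roughly like the sketch below. It is Python, not Mu: the function name and the list-of-strings fake screen are hypothetical, and treating text that runs past the bottom row as a mismatch is an assumption the diff does not confirm.

```python
def screen_row_matches(screen, row, expected):
    """Compare `expected` with `screen` starting at the left margin of `row`.

    `screen` is a list of equal-length strings, one per row (a hypothetical
    stand-in for a fake screen). Row 0 is the top of the screen.
    """
    width = len(screen[0])
    x, y = 0, row
    for ch in expected:
        if y >= len(screen):
            return False  # assumption: running off the bottom counts as a mismatch
        if screen[y][x] != ch:
            return False
        x += 1
        if x == width:
            # Wrap to the left margin of the next screen row.
            x, y = 0, y + 1
    return True
```

The expected string defines the region of interest here: a string shorter than a row checks only a prefix of that row, and a longer one spills into the rows below.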