about summary refs log tree commit diff stats
diff options
context:
space:
mode:
-rw-r--r--README.md788
-rw-r--r--apps/README.md6
-rwxr-xr-xapps/ex2bin214 -> 219 bytes
-rw-r--r--apps/ex2.2.mu19
-rw-r--r--apps/ex2.mu2
-rw-r--r--apps/ex2.subx10
-rw-r--r--bootstrap.md17
-rw-r--r--debugging.md122
-rw-r--r--editor.md19
-rw-r--r--html/apps/ex2.2.mu.html81
-rw-r--r--mu.md456
-rw-r--r--mu_summary259
-rw-r--r--subx.md162
-rw-r--r--subx_bare.md (renamed from subx_addressing_modes.md)55
-rwxr-xr-xtest_apps15
-rw-r--r--vocabulary.md208
16 files changed, 1127 insertions, 1092 deletions
diff --git a/README.md b/README.md
index aad6253d..c0740f89 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ Running the code you want to run, and nothing else.
   ```sh
   $ git clone https://github.com/akkartik/mu
   $ cd mu
-  $ ./translate_mu apps/ex2.mu  # emits a.elf
+  $ ./translate_mu apps/ex2.mu  # emit a.elf
   $ ./a.elf  # adds 3 and 4
   $ echo $?
   7
@@ -18,10 +18,13 @@ Running the code you want to run, and nothing else.
 
 [![Build Status](https://api.travis-ci.org/akkartik/mu.svg?branch=master)](https://travis-ci.org/akkartik/mu)
 
-Mu requires just a Unix-like OS and nothing else. The Mu translator is built
-up from machine code. You can also bootstrap it from C++. Both C++ and
-self-hosted versions emit identical binaries. The generated binaries require
-just a Unix-like kernel and nothing else. (There are more details in [this paper](http://akkartik.name/akkartik-convivial-20200607.pdf).)
+Rather than start from some syntax and introduce layers of translation to
+implement it, Mu starts from the processor's instruction set and tries to get
+to _some_ safe and clear syntax with as few layers of translation as possible.
+The emphasis is on internal consistency at any point in time rather than
+compatibility with the past. ([More details.](http://akkartik.name/akkartik-convivial-20200607.pdf))
+
+Currently Mu requires a 32-bit x86 processor and Linux kernel.
 
 ## Goals
 
@@ -56,668 +59,84 @@ In priority order:
   For now it's a thin veneer over machine code. I'm working on memory safety
   before expressive syntax.
 
-## Source Language
-
-Mu's main source language is [still being built](http://akkartik.name/post/mu-2019-2).
-When completed, it will be type- and memory-safe. At the moment it performs no
-checks. Here's the program we translated above:
-
-<img alt='ex2.mu' src='html/ex2.mu.png'>
-
-There are no expressions; functions consist of only statements that operate on
-variables. Most statements in Mu translate to [a single machine-code instruction](http://akkartik.github.io/mu/html/mu_instructions.html).
-Variables reside in memory by default. Programs must specify registers when
-they want to use them. Functions must return results in registers. Execution
-begins at the function `main`, which always returns its result in register
-`ebx`. [The paper](http://akkartik.name/akkartik-convivial-20200607.pdf) has
-more details, and there's a [summary](mu_summary) of all supported instructions.
-
-## SubX
-
-Mu is written in [a notation for a subset of x86 machine code called SubX](http://akkartik.name/post/mu-2019-1).
-Here's a SubX program (`apps/ex1.subx`) that returns 42:
-
-  ```sh
-  bb/copy-to-ebx  0x2a/imm32  # 42 in hex
-  b8/copy-to-eax  1/imm32/exit
-  cd/syscall  0x80/imm8
-  ```
-
-You can generate tiny zero-dependency ELF binaries from SubX that run on Linux.
-
-  ```sh
-  $ ./bootstrap translate init.linux apps/ex1.subx -o apps/ex1  # on Linux or BSD or Mac
-  $ ./apps/ex1  # only on Linux
-  $ echo $?
-  42
- ```
-
-(Running `bootstrap` requires a C++ compiler, transparently invoking it as
-necessary.)
-
-You can run the generated binaries on an interpreter/VM for better error
-messages.
-
-  ```sh
-  $ ./bootstrap run apps/ex1  # on Linux or BSD or Mac
-  $ echo $?
-  42
-  ```
-
-Emulated runs can generate a trace that permits [time-travel debugging](https://github.com/akkartik/mu/blob/master/tools/browse_trace.readme.md).
+## Toolchain
 
-  ```sh
-  $ ./bootstrap --debug translate init.linux apps/factorial.subx -o apps/factorial
-  saving address->label information to 'labels'
-  saving address->source information to 'source_lines'
+The Mu stack consists of:
+- the Mu type-safe language;
+- SubX, an unsafe notation for a subset of x86 machine code; and
+- _bare_ SubX, a more rudimentary form of SubX without certain syntax sugar.
 
-  $ ./bootstrap --trace run apps/factorial
-  saving trace to 'last_run'
+All Mu programs get translated through these layers, and the translators are
+mostly built out of lower levels. The translator from Mu to SubX is written in
+SubX, and the translator from SubX to bare SubX is built in bare SubX.
 
-  $ tools/browse_trace last_run  # text-mode debugger UI
-  ```
+To translate Mu programs through these layers into tiny zero-dependency ELF
+binaries for Linux, use `translate_mu`.
 
-You can write tests for your programs. The entire stack is thoroughly covered
-by automated tests. SubX's tagline: tests before syntax.
+Mu programs can be run in emulated mode to emit traces, which permit time-travel
+debugging. ([More details.](debugging.md))
 
-  ```sh
-  $ ./bootstrap test
-  $ ./bootstrap run apps/factorial test
-  ```
+### incomplete tools
 
-You can use SubX to translate itself. For example, running natively on Linux:
+The Mu translator is still a work in progress; not all incorrect programs
+result in good error messages.
 
-  ```sh
-  # generate translator phases using the C++ translator
-  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/hex.subx    -o hex
-  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/survey.subx -o survey
-  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/pack.subx   -o pack
-  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/assort.subx -o assort
-  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/dquotes.subx -o dquotes
-  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/tests.subx  -o tests
-  $ chmod +x hex survey pack assort dquotes tests
-
-  # use the generated translator phases to translate SubX programs
-  $ cat init.linux apps/ex1.subx |./tests |./dquotes |./assort |./pack |./survey |./hex > a.elf
-  $ chmod +x a.elf
-  $ ./a.elf
-  $ echo $?
-  42
-
-  # or, automating the above steps
-  $ ./translate_subx init.linux apps/ex1.subx
-  $ ./a.elf
-  $ echo $?
-  42
-  ```
-
-Or, running in a VM on other platforms (much slower):
-
-  ```sh
-  $ ./translate_subx_emulated init.linux apps/ex1.subx  # generates identical a.elf to above
-  $ ./bootstrap run a.elf
-  $ echo $?
-  42
-  ```
-
-You can package up SubX binaries with the minimal hobbyist OS [Soso](https://github.com/ozkl/soso)
-and run them on Qemu. (Requires graphics and sudo access. Currently doesn't
-work on a cloud server.)
+Once generated, ELF binaries can be packaged up with a Linux kernel into a
+bootable disk image:
 
   ```sh
+  $ ./translate_mu apps/ex2.mu  # emit a.elf
   # dependencies
-  $ sudo apt install build-essential util-linux nasm xorriso  # maybe also dosfstools and mtools
-  # package up a "hello world" program with a third-party kernel into mu_soso.iso
-  # requires sudo
-  $ tools/iso/soso init.soso apps/ex6.subx
-  # try it out
-  $ qemu-system-i386 -cdrom mu_soso.iso
-  ```
-
-You can also package up SubX binaries with a Linux kernel and run them on
-either Qemu or [a cloud server that supports custom images](http://akkartik.name/post/iso-on-linode).
-(Takes 12 minutes with 2GB RAM. Requires 12 million LoC of C for the Linux
-kernel; that number will gradually go down.)
-
-  ```sh
   $ sudo apt install build-essential flex bison wget libelf-dev libssl-dev xorriso
-  $ tools/iso/linux init.linux apps/ex6.subx
+  $ tools/iso/linux a.elf
   $ qemu-system-x86_64 -m 256M -cdrom mu_linux.iso -boot d
   ```
 
-## The syntax of SubX instructions
+The disk image also runs on [any cloud server that supports custom images](http://akkartik.name/post/iso-on-linode).
 
-Here is the above SubX example again:
+Mu also runs on the minimal hobbyist OS [Soso](https://github.com/ozkl/soso).
+(Requires graphics and sudo access. Currently doesn't work on a cloud server.)
 
   ```sh
-  bb/copy-to-ebx  0x2a/imm32  # 42 in hex
-  b8/copy-to-eax  1/imm32/exit
-  cd/syscall  0x80/imm8
-  ```
-
-Every line contains at most one instruction. Instructions consist of words
-separated by whitespace. Words may be _opcodes_ (defining the operation being
-performed) or _arguments_ (specifying the data the operation acts on). Any
-word can have extra _metadata_ attached to it after `/`. Some metadata is
-required (like the `/imm32` and `/imm8` above), but unrecognized metadata is
-silently skipped so you can attach comments to words (like the instruction
-name `/copy-to-eax` above, or the `/exit` argument).
-
-What do all these numbers mean? SubX supports a small subset of the 32-bit x86
-instruction set that likely runs on your computer. (Think of the name as short
-for "sub-x86".) The instruction set contains instructions like `89/copy`,
-`01/add`, `3d/compare` and `51/push-ecx` which modify registers and a byte-addressable
-memory. For a complete list of supported instructions, run `bootstrap help opcodes`.
-
-The registers instructions operate on are as follows:
-
-- Six general-purpose 32-bit registers: `0/eax`, `1/ebx`, `2/ecx`, `3/edx`,
-  `6/esi` and `7/edi`.
-- Two additional 32-bit registers: `4/esp` and `5/ebp`. (I suggest you only
-  use these to manage the call stack.)
-
-(SubX doesn't support floating-point registers yet. Intel processors support
-an 8-bit mode, 16-bit mode and 64-bit mode. SubX will never support them.
-There are also _many_ more instructions that SubX will never support.)
-
-While SubX doesn't provide the usual mnemonics for opcodes, it _does_ provide
-error-checking. If you miss an argument or accidentally add an extra argument,
-you'll get a nice error. SubX won't arbitrarily interpret bytes of data as
-instructions or vice versa.
-
-It's worth distinguishing between an instruction's arguments and its _operands_.
-Arguments are provided directly in instructions. Operands are pieces of data
-in register or memory that are operated on by instructions.
-
-Intel processors typically operate on no more than two operands, and at most
-one of them (the 'reg/mem' operand) can access memory. The address of the
-reg/mem operand is constructed by expressions of one of these forms:
-
-  - `%reg`: operate on just a register, not memory
-  - `*reg`: look up memory with the address in some register
-  - `*(reg + disp)`: add a constant to the address in some register
-  - `*(base + (index << scale) + disp)` where `base` and `index` are registers,
-    and `scale` and `disp` are 2- and 32-bit constants respectively.
-
-Under the hood, SubX turns expressions of these forms into multiple arguments
-with metadata in some complex ways. See [subx\_addressing\_modes.md](subx_addressing_modes.md).
-
-That covers the complexities of the reg/mem operand. The second operand is
-simpler. It comes from exactly one of the following argument types:
-
-  - `/r32`
-  - displacement: `/disp8` or `/disp32`
-  - immediate: `/imm8` or `/imm32`
-
-Putting all this together, here's an example that adds the integer in `eax` to
-the one at address `edx`:
-
-  ```
-  01/add %edx 0/r32/eax
+  $ ./translate_mu apps/ex2.mu  # emit a.elf
+  # dependencies
+  $ sudo apt install build-essential util-linux nasm xorriso  # maybe also dosfstools and mtools
+  $ tools/iso/soso a.elf  # requires sudo
+  $ qemu-system-i386 -cdrom mu_soso.iso
   ```
 
-## The syntax of SubX programs
-
-SubX programs map to the same ELF binaries that a conventional Linux system
-uses. Linux ELF binaries consist of a series of _segments_. In particular, they
-distinguish between code and data. Correspondingly, SubX programs consist of a
-series of segments, each starting with a header line: `==` followed by a name
-and approximate starting address.
-
-All code must lie in a segment called 'code'.
-
-Segments can be added to.
-
-```sh
-== code 0x09000000  # first mention requires starting address
-...A...
-
-== data 0x0a000000
-...B...
-
-== code             # no address necessary when adding
-...C...
-```
-
-The `code` segment now contains the instructions of `A` as well as `C`.
-
-Within the `code` segment, each line contains a comment, label or instruction.
-Comments start with a `#` and are ignored. Labels should always be the first
-word on a line, and they end with a `:`.
-
-Instructions can refer to labels in displacement or immediate arguments, and
-they'll obtain a value based on the address of the label: immediate arguments
-will contain the address directly, while displacement arguments will contain
-the difference between the address and the address of the current instruction.
-The latter is mostly useful for `jump` and `call` instructions.
-
-Functions are defined using labels. By convention, labels internal to functions
-(that must only be jumped to) start with a `$`. Any other labels must only be
-called, never jumped to. All labels must be unique.
-
-Functions are called using the following syntax:
-```
-(func arg1 arg2 ...)
-```
-
-Function arguments must be either literals (integers or strings) or a reg/mem
-operand using the syntax in the previous section.
-
-A special label is `Entry`, which can be used to specify/override the entry
-point of the program. It doesn't have to be unique, and the latest definition
-will override earlier ones.
-
-(The `Entry` label, along with duplicate segment headers, allows programs to
-be built up incrementally out of multiple [_layers_](http://akkartik.name/post/wart-layers).)
-
-Another special pair of labels are the block delimiters `{` and `}`. They can
-be nested, and jump instructions can take arguments `loop` or `break` that
-jump to the enclosing `{` and `}` respectively.
+## Syntax
 
-The data segment consists of labels as before and byte values. Referring to
-data labels in either `code` segment instructions or `data` segment values
-yields their address.
+The entire stack shares certain properties and conventions. Programs consist
+of functions and functions consist of statements, each performing a single
+operation. Operands to statements are always variables or constants. You can't
+say `a + b*c`, you have to break it up into two operations. Variables can live
+in memory or in registers. Registers must be explicitly specified. There are
+some shared lexical rules; comments always start with '#', and numbers are
+always written in hex.
 
-Automatic tests are an important part of SubX, and there's a simple mechanism
-to provide a test harness: all functions that start with `test-` are called in
-turn by a special, auto-generated function called `run-tests`. How you choose
-to call it is up to you.
+Here's an example program in Mu:
 
-I try to keep things simple so that there's less work to do when implementing
-SubX in SubX. But there _is_ one convenience: instructions can provide a
-string literal surrounded by quotes (`"`) in an `imm32` argument. SubX will
-transparently copy it to the `data` segment and replace it with its address.
-Strings are the only place where a SubX word is allowed to contain spaces.
-
-That should be enough information for writing SubX programs. The `apps/`
-directory provides some fodder for practice in the `apps/ex*.subx` files,
-giving a more gradual introduction to SubX features. In particular, you should
-work through `apps/factorial4.subx`, which demonstrates all the above ideas in
-concert.
-
-This repo includes binaries for all examples. At any commit, an example's
-binary should be identical bit for bit with the result of translating the
-corresponding `.subx` file. The binary should also be natively runnable on a
-Linux system running on Intel x86 processors, either 32- or 64-bit. If either
-of these invariants is broken it's a bug on my part.
-
-## Running
-
-`bootstrap` currently has the following sub-commands:
-
-- `bootstrap help`: some helpful documentation to have at your fingertips.
-
-- `bootstrap test`: runs all automated tests.
-
-- `bootstrap translate <input files> -o <output ELF binary>`: translates `.subx`
-  files into an executable ELF binary.
-
-- `bootstrap run <ELF binary> <args>`: simulates running the ELF binaries emitted
-  by `bootstrap translate`. Useful for testing and debugging.
-
-  Remember, not all 32-bit Linux binaries are guaranteed to run. I'm not
-  building general infrastructure here for all of the x86 instruction set.
-  SubX is about programming with a small, regular subset of 32-bit x86.
-
-## Getting your editor set up
-
-If you've read this far, it's time to set up your editor. Mu is really
-intended to be read interactively rather than on a browser.
-
-There is rudimentary syntax highlighting support for Mu and SubX files for
-various editors. Look for your editor in `mu.*` and `subx.*`, and follow the
-instructions within.
-
-The Vim files are most developed. In particular, I recommend some optional
-setup in subx.vim to use multiple colors for comments.
-
-If you use [Exuberant Ctags](http://ctags.sourceforge.net) for jumping easily
-from names to their definitions in your editor, copy the contents of `exuberant_ctags_rc`
-into your `.ctags` file.
-
-## A few hints for debugging
-
-Writing programs in SubX is surprisingly pleasant and addictive. Reading
-programs is a work in progress, and hopefully the extensive unit tests help.
-However, _debugging_ programs is where one really faces up to the low-level
-nature of SubX. Even the smallest modifications need testing to make sure they
-work. In my experience, there is no modification so small that I get it working
-on the first attempt. And when it doesn't work, there are no clear error
-messages. Machine code is too simple-minded for that. You can't use a debugger,
-since SubX's simplistic ELF binaries contain no debugging information. So
-debugging requires returning to basics and practicing with a new, more
-rudimentary but hopefully still workable toolkit:
-
-- Start by nailing down a concrete set of steps for reproducibly obtaining the
-  error or erroneous behavior.
-
-- If possible, turn the steps into a failing test. It's not always possible,
-  but SubX's primary goal is to keep improving the variety of tests one can
-  write.
-
-- Start running the single failing test alone. This involves modifying the top
-  of the program (or the final `.subx` file passed in to `bootstrap translate`) by
-  replacing the call to `run-tests` with a call to the appropriate `test-`
-  function.
-
-- Generate a trace for the failing test while running your program in emulated
-  mode (`bootstrap run`):
-  ```
-  $ ./bootstrap translate input.subx -o binary
-  $ ./bootstrap --trace run binary arg1 arg2  2>trace
-  ```
-  The ability to generate a trace is the essential reason for the existence of
-  `bootstrap run` mode. It gives far better visibility into program internals than
-  running natively.
-
-- As a further refinement, it is possible to render label names in the trace
-  by adding a second flag to the `bootstrap translate` command:
-  ```
-  $ ./bootstrap --debug translate input.subx -o binary
-  $ ./bootstrap --trace run binary arg1 arg2  2>trace
-  ```
-  `bootstrap --debug translate` emits a mapping from label to address in a file
-  called `labels`. `bootstrap --trace run` reads in the `labels` file if
-  it exists and prints out any matching label name as it traces each instruction
-  executed.
-
-  Here's a sample of what a trace looks like, with a few boxes highlighted:
+<img alt='ex2.mu' src='html/ex2.mu.png'>
 
-  <img alt='trace example' src='html/trace.png'>
+[More details on Mu syntax &rarr;](mu.md)
 
-  Each of the green boxes shows the trace emitted for a single instruction.
-  It starts with a line of the form `run: inst: ___` followed by the opcode
-  for the instruction, the state of registers before the instruction executes,
-  and various other facts deduced during execution. Some instructions first
-  print a matching label. In the above screenshot, the red boxes show that
-  address `0x0900005e` maps to label `$loop` and presumably marks the start of
-  some loop. Function names get similar `run: == label` lines.
+Here's an example program in SubX:
 
-- One trick when emitting traces with labels:
-  ```
-  $ grep label trace
-  ```
-  This is useful for quickly showing you the control flow for the run, and the
-  function executing when the error occurred. I find it useful to start with
-  this information, only looking at the complete trace after I've gotten
-  oriented on the control flow. Did it get to the loop I just modified? How
-  many times did it go through the loop?
-
-- Once you have SubX displaying labels in traces, it's a short step to modify
-  the program to insert more labels just to gain more insight. For example,
-  consider the following function:
-
-  <img alt='control example -- before' src='html/control0.png'>
-
-  This function contains a series of jump instructions. If a trace shows
-  `is-hex-lowercase-byte?` being encountered, and then `$is-hex-lowercase-byte?:end`
-  being encountered, it's still ambiguous what happened. Did we hit an early
-  exit, or did we execute all the way through? To clarify this, add temporary
-  labels after each jump:
-
-  <img alt='control example -- after' src='html/control1.png'>
-
-  Now the trace should have a lot more detail on which of these labels was
-  reached, and precisely when the exit was taken.
-
-- If you find yourself wondering, "when did the contents of this memory
-  address change?", `bootstrap run` has some rudimentary support for _watch
-  points_. Just insert a label starting with `$watch-` before an instruction
-  that writes to the address, and its value will start getting dumped to the
-  trace after every instruction thereafter.
-
-- Once we have a sense for precisely which instructions we want to look at,
-  it's time to look at the trace as a whole. Key is the state of registers
-  before each instruction. If a function is receiving bad arguments it becomes
-  natural to inspect what values were pushed on the stack before calling it,
-  tracing back further from there, and so on.
-
-  I occasionally want to see the precise state of the stack segment, in which
-  case I uncomment a commented-out call to `dump_stack()` in the `vm.cc`
-  layer. It makes the trace a lot more verbose and a lot less dense, necessitating
-  a lot more scrolling around, so I keep it turned off most of the time.
-
-- If the trace seems overwhelming, try [browsing it](https://github.com/akkartik/mu/blob/master/tools/browse_trace.readme.md)
-  in the 'time-travel debugger'.
-
-- Don't be afraid to slice and dice the trace using Unix tools. For example,
-  say you have a SubX binary that dies while running tests. You can see what
-  test it's segfaulting at by compiling it with debug information using
-  `./translate_subx_debug`, and then running:
-
-  ```
-  ./bootstrap --trace --dump run a.elf test 2>&1 |grep 'label test'
+  ```sh
+  == code
+  Entry:
+    # ebx = 1
+    bb/copy-to-ebx  1/imm32
+    # increment ebx
+    43/increment-ebx
+    # exit(ebx)
+    e8/call  syscall_exit/disp32
   ```
 
-  Just read out the last test printed out before the segfault.
-
-Hopefully these hints are enough to get you started. The main thing to
-remember is to not be afraid of modifying the sources. A good debugging
-session gets into a nice rhythm of generating a trace, staring at it for a
-while, modifying the sources, regenerating the trace, and so on. Email
-[me](mailto:mu@akkartik.com) if you'd like another pair of eyes to stare at a
-trace, or if you have questions or complaints.
-
-## Reference documentation on available primitives
-
-### Data Structures
-
-- Kernel strings: null-terminated regions of memory. Unsafe and to be avoided,
-  but needed for interacting with the kernel.
-
-- Arrays: length-prefixed regions of memory containing multiple elements of a
-  single type. Contents are preceded by 4 bytes (32 bits) containing the
-  `length` of the array in bytes.
-
-- Slices: a pair of 32-bit addresses denoting a [half-open](https://en.wikipedia.org/wiki/Interval_(mathematics))
-  \[`start`, `end`) interval to live memory with a consistent lifetime.
+[More details on SubX syntax &rarr;](subx.md)
 
-  Invariant: `start` <= `end`
-
-- Streams: strings prefixed by 32-bit `write` and `read` indexes that the next
-  write or read goes to, respectively.
-
-  - offset 0: write index
-  - offset 4: read index
-  - offset 8: length of array (in bytes)
-  - offset 12: start of array data
-
-  Invariant: 0 <= `read` <= `write` <= `length`
-
-- File descriptors (fd): Low-level 32-bit integers that the kernel uses to
-  track files opened by the program.
-
-- File: 32-bit value containing either a fd or an address to a stream (fake
-  file).
-
-- Buffered files (buffered-file): Contain a file descriptor and a stream for
-  buffering reads/writes. Each `buffered-file` must exclusively perform either
-  reads or writes.
-
-### 'system calls'
-
-As I said at the top, a primary design goal of SubX (and Mu more broadly) is
-to explore ways to turn arbitrary manual tests into reproducible automated
-tests. SubX aims for this goal by baking testable interfaces deep into the
-stack, at the OS syscall level. The idea is that every syscall that interacts
-with hardware (and so the environment) should be *dependency injected* so that
-it's possible to insert fake hardware in tests.
-
-But those are big goals. Here are the syscalls I have so far:
-
-- `write`: takes two arguments, a file `f` and an address to array `s`.
-
-  Comparing this interface with the Unix `write()` syscall shows two benefits:
-
-  1. SubX can handle 'fake' file descriptors in tests.
-
-  1. `write()` accepts buffer and its length in separate arguments, which
-     requires callers to manage the two separately and so can be error-prone.
-     SubX's wrapper keeps the two together to increase the chances that we
-     never accidentally go out of array bounds.
-
-- `read`: takes two arguments, a file `f` and an address to stream `s`. Reads
-  as much data from `f` as can fit in (the free space of) `s`.
-
-  Like with `write()`, this wrapper around the Unix `read()` syscall adds the
-  ability to handle 'fake' file descriptors in tests, and reduces the chances
-  of clobbering outside array bounds.
-
-  One bit of weirdness here: in tests we do a redundant copy from one stream
-  to another. See [the comments before the implementation](http://akkartik.github.io/mu/html/060read.subx.html)
-  for a discussion of alternative interfaces.
-
-- `stop`: takes two arguments:
-  - `ed` is an address to an _exit descriptor_. Exit descriptors allow us to
-    `exit()` the program in production, but return to the test harness within
-    tests. That allows tests to make assertions about when `exit()` is called.
-  - `value` is the status code to `exit()` with.
-
-  For more details on exit descriptors and how to create one, see [the
-  comments before the implementation](http://akkartik.github.io/mu/html/059stop.subx.html).
-
-- `new-segment`
-
-  Allocates a whole new segment of memory for the program, discontiguous with
-  both existing code and data (heap) segments. Just a more opinionated form of
-  [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html).
-
-- `allocate`: takes two arguments, an address to allocation-descriptor `ad`
-  and an integer `n`
-
-  Allocates a contiguous range of memory that is guaranteed to be exclusively
-  available to the caller. Returns the starting address to the range in `eax`.
-
-  An allocation descriptor tracks allocated vs available addresses in some
-  contiguous range of memory. The int specifies the number of bytes to allocate.
-
-  Explicitly passing in an allocation descriptor allows for nested memory
-  management, where a sub-system gets a chunk of memory and further parcels it
-  out to individual allocations. Particularly helpful for (surprise) tests.
-
-- ... _(to be continued)_
-
-I will continue to import syscalls over time from [the old Mu VM in the parent
-directory](https://github.com/akkartik/mu), which has experimented with
-interfaces for the screen, keyboard, mouse, disk and network.
-
-### primitives built atop system calls
-
-_(Compound arguments are usually passed in by reference. Where the results are
-compound objects that don't fit in a register, the caller usually passes in
-allocated memory for it.)_
-
-#### assertions for tests
-- `check-ints-equal`: fails current test if given ints aren't equal
-- `check-stream-equal`: fails current test if stream doesn't match string
-- `check-next-stream-line-equal`: fails current test if next line of stream
-  until newline doesn't match string
-
-#### error handling
-- `error`: takes three arguments, an exit-descriptor, a file and a string (message)
-
-  Prints out the message to the file and then exits using the provided
-  exit-descriptor.
-
-- `error-byte`: like `error` but takes an extra byte value that it prints out
-  at the end of the message.
-
-#### predicates
-- `kernel-string-equal?`: compares a kernel string with a string
-- `string-equal?`: compares two strings
-- `stream-data-equal?`: compares a stream with a string
-- `next-stream-line-equal?`: compares with string the next line in a stream, from
-  `read` index to newline
-
-- `slice-empty?`: checks if the `start` and `end` of a slice are equal
-- `slice-equal?`: compares a slice with a string
-- `slice-starts-with?`: compares the start of a slice with a string
-- `slice-ends-with?`: compares the end of a slice with a string
-
-#### writing to disk
-- `write`: string -> file
-  - Can also be used to cat a string into a stream.
-  - Will abort the entire program if destination is a stream and doesn't have
-    enough room.
-- `write-stream`: stream -> file
-  - Can also be used to cat one stream into another.
-  - Will abort the entire program if destination is a stream and doesn't have
-    enough room.
-- `write-slice`: slice -> stream
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `append-byte`: int -> stream
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `append-byte-hex`: int -> stream
-  - textual representation in hex, no '0x' prefix
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `print-int32`: int -> stream
-  - textual representation in hex, including '0x' prefix
-  - Will abort the entire program if there isn't enough room in the
-    destination stream.
-- `write-buffered`: string -> buffered-file
-- `write-slice-buffered`: slice -> buffered-file
-- `flush`: buffered-file
-- `write-byte-buffered`: int -> buffered-file
-- `print-byte-buffered`: int -> buffered-file
-  - textual representation in hex, no '0x' prefix
-- `print-int32-buffered`: int -> buffered-file
-  - textual representation in hex, including '0x' prefix
-
-#### reading from disk
-- `read`: file -> stream
-  - Can also be used to cat one stream into another.
-  - Will silently stop reading when destination runs out of space.
-- `read-byte-buffered`: buffered-file -> byte
-- `read-line-buffered`: buffered-file -> stream
-  - Will abort the entire program if there isn't enough room.
-
-#### non-IO operations on streams
-- `new-stream`: allocates space for a stream of `n` elements, each occupying
-  `b` bytes.
-  - Will abort the entire program if `n*b` requires more than 32 bits.
-- `clear-stream`: resets everything in the stream to `0` (except its `length`).
-- `rewind-stream`: resets the read index of the stream to `0` without modifying
-  its contents.
-
-#### reading/writing hex representations of integers
-- `is-hex-int?`: takes a slice argument, returns boolean result in `eax`
-- `parse-hex-int`: takes a slice argument, returns int result in `eax`
-- `is-hex-digit?`: takes a 32-bit word containing a single byte, returns
-  boolean result in `eax`.
-- `from-hex-char`: takes a hexadecimal digit character in `eax`, returns its
-  numeric value in `eax`
-- `to-hex-char`: takes a single-digit numeric value in `eax`, returns its
-  corresponding hexadecimal character in `eax`
-
-#### tokenization
-
-from a stream:
-- `next-token`: stream, delimiter byte -> slice
-- `skip-chars-matching`: stream, delimiter byte
-- `skip-chars-not-matching`: stream, delimiter byte
-
-from a slice:
-- `next-token-from-slice`: start, end, delimiter byte -> slice
-  - Given a slice and a delimiter byte, returns a new slice inside the input
-    that ends at the delimiter byte.
-
-- `skip-chars-matching-in-slice`: curr, end, delimiter byte -> new-curr (in `eax`)
-- `skip-chars-not-matching-in-slice`:  curr, end, delimiter byte -> new-curr (in `eax`)
-
-## Resources
-
-- [Single-page cheatsheet for the x86 ISA](https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x86_opcode_structure_and_instruction_overview.pdf)
-  (pdf; [cached local copy](https://github.com/akkartik/mu/blob/master/cheatsheet.pdf))
-- [Concise reference for the x86 ISA](https://c9x.me/x86)
-- [Intel processor manual](http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf) (pdf)
-- [&ldquo;Bootstrapping a compiler from nothing&rdquo;](http://web.archive.org/web/20061108010907/http://www.rano.org/bcompiler.html) by Edmund Grumley-Evans.
-- [&ldquo;Creating tiny ELF executables&rdquo;](https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html) by Brian Raiter.
-- [StoneKnifeForth](https://github.com/kragen/stoneknifeforth) by [Kragen Sitaker](http://canonical.org/~kragen).
-
-## Other forks
+## Forks
 
 Forks of Mu are encouraged. If you don't like something about this repo, feel
 free to make a fork. If you show it to me, I'll link to it here, so others can
@@ -726,64 +145,34 @@ use it. I might even pull your changes into this repo!
 - [mu-normie](https://git.sr.ht/~akkartik/mu-normie): with a more standard
   build system and C++ modules.
 
-## Conclusion
-
-The hypothesis of Mu and SubX is that designing the entire system to be
-testable from day 1 and from the ground up would radically impact the culture
-of the eco-system in a way that no bolted-on tool or service at higher levels
-can replicate:
-
-- Tests would make it easier to write programs that can be easily understood
-  by newcomers.
-
-- More broad-based understanding would lead to more forks.
+## Desiderata
 
-- Tests would make it easy to share code across forks. Copy the tests over,
-  and then copy code over and polish it until the tests pass. Manual work, but
-  tractable and without major risks.
-
-- The community would gain a diversified portfolio of forks for each program,
-  a “wavefront” of possible combinations of features and alternative
-  implementations of features. Application writers who wrote thorough tests
-  for their apps (something they just can’t do today) would be able to bounce
-  around between forks more easily without getting locked in to a single one
-  as currently happens.
-
-- There would be a stronger culture of reviewing the code for programs you use
-  or libraries you depend on. [More eyeballs would make more bugs shallow.](https://en.wikipedia.org/wiki/Linus%27s_Law)
-
-To falsify these hypotheses, here's a roadmap of the next few planned features:
+If you're still reading, here are some more things to check out:
 
-- Testable, dependency-injected vocabulary of primitives
-  - Streams: `read()`, `write()`. (✓)
-  - `exit()` (✓)
-  - Client-like non-blocking socket/file primitives: `load`, `save`
-  - Concurrency, and a framework for testing blocking code
-  - Server-like blocking socket/file primitives
+- The references on [Mu](mu.md) and [SubX](subx.md) syntax, and also [bare
+  SubX](subx_bare.md) without any syntax sugar.
 
-- Gradually streamline the bundled kernel, stripping away code we don't need.
+- [How to get your text editor set up for Mu and SubX programs.](editor.md)
 
----
+- [Some tips for debugging Mu and SubX programs.](debugging.md)
 
-If you're still reading, here are some more things to check out:
+- [Shared vocabulary of data types and functions shared by Mu programs.](vocabulary.md)
+  Mu programs can transparently call low-level functions written in SubX.
 
-a) Try running the tests: `./test_apps`
+- [A summary](mu_instructions) of how the Mu compiler translates instructions
+  to SubX. ([colorized version](http://akkartik.github.io/mu/html/mu_instructions.html))
 
-b) There's a handy [summary](mu_instructions) of how the Mu compiler translates
-instructions to SubX.
+- [Some starter exercises for learning SubX](https://github.com/akkartik/mu/pulls)
+  (labelled `hello`). Feel free to [ping me](mailto:ak@akkartik.com) with any questions.
 
-c) Check out the online help on SubX. Starting point: `./bootstrap`
+- [Commandline reference for the bootstrap C++ program.](bootstrap.md)
 
-d) Familiarize yourself with the list of opcodes supported in SubX: `./bootstrap
-help opcodes`. (It's also [in this repo](https://github.com/akkartik/mu/blob/master/subx_opcodes).)
-[Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
-are some tips on my setup for quickly finding the right opcode for any
-situation from within Vim.
+- The [list of x86 opcodes](subx_opcodes) supported in SubX: `./bootstrap
+  help opcodes`.
 
-e) Try working on [some starter SubX exercises](https://github.com/akkartik/mu/pulls)
-(labelled `hello`).
+- [Some details on the unconventional organization of this project.](http://akkartik.name/post/four-repos)
 
-f) SubX comes with some useful [syntax sugar](http://akkartik.name/post/mu-2019-1).
+- Previous prototypes: [mu0](https://github.com/akkartik/mu0), [mu1](https://github.com/akkartik/mu1).
 
 ## Credits
 
@@ -795,36 +184,15 @@ Mu builds on many ideas that have come before, especially:
   and [Richard Gabriel](http://dreamsongs.net/Files/PatternsOfSoftware.pdf) for
   the intellectual tools for reasoning about the higher order design of a
   codebase;
-- Unix and C for showing us how to co-evolve language and OS, and for teaching
-  the (much maligned, misunderstood and underestimated) value of concise
-  *implementation* in addition to a clean interface;
-- Donald Knuth's [literate programming](http://www.literateprogramming.com/knuthweb.pdf)
-  for liberating "code for humans to read" from the tyranny of compiler order;
 - [David Parnas](http://www.cs.umd.edu/class/spring2003/cmsc838p/Design/criteria.pdf)
   and others for highlighting the value of separating concerns and stepwise
   refinement;
-- [Lisp](http://www.paulgraham.com/rootsoflisp.html) for showing the power of
-  dynamic languages, late binding and providing the right primitives *a la
-  carte*, especially lisp macros;
 - The folklore of debugging by print and the trace facility in many lisp
   systems;
 - Automated tests for showing the value of developing programs inside an
   elaborate harness;
-- [Python doctest](http://docs.python.org/2/library/doctest.html) for
-  exemplifying interactive documentation that doubles as tests;
-- [ReStructuredText](https://en.wikipedia.org/wiki/ReStructuredText)
-  and [its antecedents](https://en.wikipedia.org/wiki/Setext) for showing that
-  markup can be clean;
-- BDD for challenging us all to write tests at a higher level;
-- JavaScript and CSS for demonstrating the power of a DOM for complex
-  structured documents;
-- Rust for demonstrating that a system-programming language can be safe;
-- Forth for demonstrating that ergonomics don't require grammar; and
 - [Minimal Linux Live](http://minimal.linux-bg.org) for teaching how to create
   a bootable disk image.
-- [Soso](https://github.com/ozkl/soso), a tiny hackable OS.
-
-## Coda
-
-- [Some details on the unconventional organization of this project.](http://akkartik.name/post/four-repos)
-- Previous prototypes: [mu0](https://github.com/akkartik/mu0), [mu1](https://github.com/akkartik/mu1).
+- [&ldquo;Bootstrapping a compiler from nothing&rdquo;](http://web.archive.org/web/20061108010907/http://www.rano.org/bcompiler.html) by Edmund Grumley-Evans.
+- [&ldquo;Creating tiny ELF executables&rdquo;](https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html) by Brian Raiter.
+- [StoneKnifeForth](https://github.com/kragen/stoneknifeforth) by [Kragen Sitaker](http://canonical.org/~kragen).
diff --git a/apps/README.md b/apps/README.md
index 1e1a5685..c246b44b 100644
--- a/apps/README.md
+++ b/apps/README.md
@@ -9,3 +9,9 @@ Some apps written in SubX and Mu, in 3 categories:
   * More ambitious translator for a memory-safe language (in progress): `mu`
 
 * Miscellaneous test programs.
+
+All SubX apps include binaries. At any commit, an example's binary should be
+identical bit for bit with the result of translating the corresponding `.subx`
+file. The binary should also be natively runnable on a Linux system running on
+Intel x86 processors, either 32- or 64-bit. If either of these invariants is
+broken, it's a bug.
diff --git a/apps/ex2 b/apps/ex2
index e18440e1..0917c0aa 100755
--- a/apps/ex2
+++ b/apps/ex2
Binary files differdiff --git a/apps/ex2.2.mu b/apps/ex2.2.mu
deleted file mode 100644
index b177e770..00000000
--- a/apps/ex2.2.mu
+++ /dev/null
@@ -1,19 +0,0 @@
-# Increment a number, and return the result in the exit code.
-#
-# To run:
-#   $ ./translate_mu apps/ex2.2.mu
-#   $ ./a.elf
-# Expected result:
-#   $ echo $?
-#   7
-
-fn main -> result/ebx: int {
-  result <- foo
-}
-
-fn foo -> result/ebx: int {
-  var n: int
-  copy-to n, 3
-  increment n
-  result <- copy n
-}
diff --git a/apps/ex2.mu b/apps/ex2.mu
index fc20aa2e..c3a73874 100644
--- a/apps/ex2.mu
+++ b/apps/ex2.mu
@@ -1,4 +1,4 @@
-# Add two numbers, and return the result in the exit code.
+# Add 3 and 4, and return the result in the exit code.
 #
 # To run:
 #   $ ./translate_mu apps/ex2.mu
diff --git a/apps/ex2.subx b/apps/ex2.subx
index 11c04432..14007329 100644
--- a/apps/ex2.subx
+++ b/apps/ex2.subx
@@ -1,4 +1,4 @@
-# Add 1 and 1, and return the result in the exit code.
+# Add 3 and 4, and return the result in the exit code.
 #
 # To run:
 #   $ ./bootstrap translate init.linux apps/ex2.subx -o apps/ex2
@@ -10,10 +10,10 @@
 == code
 
 Entry:
-# ebx = 1
-bb/copy-to-ebx  1/imm32
-# increment ebx
-43/increment-ebx
+# ebx = 3
+bb/copy-to-ebx  3/imm32
+# add 4 to ebx
+81 0/subop/add 3/mod/direct 3/rm32/ebx 4/imm32
 # exit(ebx)
 e8/call  syscall_exit/disp32
 
diff --git a/bootstrap.md b/bootstrap.md
new file mode 100644
index 00000000..ca9320e5
--- /dev/null
+++ b/bootstrap.md
@@ -0,0 +1,17 @@
+## Running
+
+`bootstrap` currently has the following sub-commands:
+
+- `bootstrap help`: some helpful documentation to have at your fingertips.
+
+- `bootstrap test`: runs all automated tests.
+
+- `bootstrap translate <input files> -o <output ELF binary>`: translates `.subx`
+  files into an executable ELF binary.
+
+- `bootstrap run <ELF binary> <args>`: simulates running the ELF binaries emitted
+  by `bootstrap translate`. Useful for testing and debugging.
+
+  Remember, not all 32-bit Linux binaries are guaranteed to run. I'm not
+  building general infrastructure here for all of the x86 instruction set.
+  SubX is about programming with a small, regular subset of 32-bit x86.
diff --git a/debugging.md b/debugging.md
new file mode 100644
index 00000000..ef4cf474
--- /dev/null
+++ b/debugging.md
@@ -0,0 +1,122 @@
+## A few hints for debugging
+
+Writing programs in SubX is surprisingly pleasant and addictive. Reading
+programs is a work in progress, and hopefully the extensive unit tests help.
+However, _debugging_ programs is where one really faces up to the low-level
+nature of SubX. Even the smallest modifications need testing to make sure they
+work. In my experience, there is no modification so small that I get it working
+on the first attempt. And when it doesn't work, there are no clear error
+messages. Machine code is too simple-minded for that. You can't use a debugger,
+since SubX's simplistic ELF binaries contain no debugging information. So
+debugging requires returning to basics and practicing with a new, more
+rudimentary but hopefully still workable toolkit:
+
+- Start by nailing down a concrete set of steps for reproducibly obtaining the
+  error or erroneous behavior.
+
+- If possible, turn the steps into a failing test. It's not always possible,
+  but SubX's primary goal is to keep improving the variety of tests one can
+  write.
+
+- Start running the single failing test alone. This involves modifying the top
+  of the program (or the final `.subx` file passed in to `bootstrap translate`) by
+  replacing the call to `run-tests` with a call to the appropriate `test-`
+  function.
+
+- Generate a trace for the failing test while running your program in emulated
+  mode (`bootstrap run`):
+  ```
+  $ ./bootstrap translate input.subx -o binary
+  $ ./bootstrap --trace run binary arg1 arg2  2>trace
+  ```
+  The ability to generate a trace is the essential reason for the existence of
+  `bootstrap run` mode. It gives far better visibility into program internals than
+  running natively.
+
+- As a further refinement, it is possible to render label names in the trace
+  by adding a second flag to the `bootstrap translate` command:
+  ```
+  $ ./bootstrap --debug translate input.subx -o binary
+  $ ./bootstrap --trace run binary arg1 arg2  2>trace
+  ```
+  `bootstrap --debug translate` emits a mapping from label to address in a file
+  called `labels`. `bootstrap --trace run` reads in the `labels` file if
+  it exists and prints out any matching label name as it traces each instruction
+  executed.
+
+  Here's a sample of what a trace looks like, with a few boxes highlighted:
+
+  <img alt='trace example' src='html/trace.png'>
+
+  Each of the green boxes shows the trace emitted for a single instruction.
+  It starts with a line of the form `run: inst: ___` followed by the opcode
+  for the instruction, the state of registers before the instruction executes,
+  and various other facts deduced during execution. Some instructions first
+  print a matching label. In the above screenshot, the red boxes show that
+  address `0x0900005e` maps to label `$loop` and presumably marks the start of
+  some loop. Function names get similar `run: == label` lines.
+
+- One trick when emitting traces with labels:
+  ```
+  $ grep label trace
+  ```
+  This is useful for quickly showing you the control flow for the run, and the
+  function executing when the error occurred. I find it useful to start with
+  this information, only looking at the complete trace after I've gotten
+  oriented on the control flow. Did it get to the loop I just modified? How
+  many times did it go through the loop?
+
+- Once you have SubX displaying labels in traces, it's a short step to modify
+  the program to insert more labels just to gain more insight. For example,
+  consider the following function:
+
+  <img alt='control example -- before' src='html/control0.png'>
+
+  This function contains a series of jump instructions. If a trace shows
+  `is-hex-lowercase-byte?` being encountered, and then `$is-hex-lowercase-byte?:end`
+  being encountered, it's still ambiguous what happened. Did we hit an early
+  exit, or did we execute all the way through? To clarify this, add temporary
+  labels after each jump:
+
+  <img alt='control example -- after' src='html/control1.png'>
+
+  Now the trace should have a lot more detail on which of these labels was
+  reached, and precisely when the exit was taken.
+
+- If you find yourself wondering, "when did the contents of this memory
+  address change?", `bootstrap run` has some rudimentary support for _watch
+  points_. Just insert a label starting with `$watch-` before an instruction
+  that writes to the address, and its value will start getting dumped to the
+  trace after every instruction thereafter.
+
+- Once we have a sense for precisely which instructions we want to look at,
+  it's time to look at the trace as a whole. Key is the state of registers
+  before each instruction. If a function is receiving bad arguments it becomes
+  natural to inspect what values were pushed on the stack before calling it,
+  tracing back further from there, and so on.
+
+  I occasionally want to see the precise state of the stack segment, in which
+  case I uncomment a commented-out call to `dump_stack()` in the `vm.cc`
+  layer. It makes the trace a lot more verbose and a lot less dense, necessitating
+  a lot more scrolling around, so I keep it turned off most of the time.
+
+- If the trace seems overwhelming, try [browsing it](https://github.com/akkartik/mu/blob/master/tools/browse_trace.readme.md)
+  in the 'time-travel debugger'.
+
+- Don't be afraid to slice and dice the trace using Unix tools. For example,
+  say you have a SubX binary that dies while running tests. You can see what
+  test it's segfaulting at by compiling it with debug information using
+  `./translate_subx_debug`, and then running:
+
+  ```
+  ./bootstrap --trace --dump run a.elf test 2>&1 |grep 'label test'
+  ```
+
+  Just read out the last test printed out before the segfault.
+
+Hopefully these hints are enough to get you started. The main thing to
+remember is to not be afraid of modifying the sources. A good debugging
+session gets into a nice rhythm of generating a trace, staring at it for a
+while, modifying the sources, regenerating the trace, and so on. Email
+[me](mailto:mu@akkartik.com) if you'd like another pair of eyes to stare at a
+trace, or if you have questions or complaints.
diff --git a/editor.md b/editor.md
new file mode 100644
index 00000000..65f2badb
--- /dev/null
+++ b/editor.md
@@ -0,0 +1,19 @@
+## Getting your editor set up
+
+If you've read this far, it's time to set up your editor. Mu is really
+intended to be read interactively rather than on a browser.
+
+There is rudimentary syntax highlighting support for Mu and SubX files for
+various editors. Look for your editor in `mu.*` and `subx.*`, and follow the
+instructions within.
+
+The Vim files are most developed. In particular, I recommend some optional
+setup in subx.vim to use multiple colors for comments.
+
+If you use [Exuberant Ctags](http://ctags.sourceforge.net) for jumping easily
+from names to their definitions in your editor, copy the contents of `exuberant_ctags_rc`
+into your `.ctags` file.
+
+[Here](https://lobste.rs/s/qglfdp/subx_minimalist_assembly_language_for#c_o9ddqk)
+are some tips on my setup for quickly finding the right opcode for any
+situation from within Vim.
diff --git a/html/apps/ex2.2.mu.html b/html/apps/ex2.2.mu.html
deleted file mode 100644
index 455305a6..00000000
--- a/html/apps/ex2.2.mu.html
+++ /dev/null
@@ -1,81 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
-<html>
-<head>
-<meta http-equiv="content-type" content="text/html; charset=UTF-8">
-<title>Mu - apps/ex2.2.mu</title>
-<meta name="Generator" content="Vim/8.1">
-<meta name="plugin-version" content="vim8.1_v1">
-<meta name="syntax" content="none">
-<meta name="settings" content="number_lines,use_css,pre_wrap,no_foldcolumn,expand_tabs,line_ids,prevent_copy=">
-<meta name="colorscheme" content="minimal-light">
-<style type="text/css">
-<!--
-pre { white-space: pre-wrap; font-family: monospace; color: #000000; background-color: #c6c6c6; }
-body { font-size:12pt; font-family: monospace; color: #000000; background-color: #c6c6c6; }
-a { color:inherit; }
-* { font-size:12pt; font-size: 1em; }
-.LineNr { }
-.muFunction { color: #af5f00; text-decoration: underline; }
-.SpecialChar { color: #d70000; }
-.Comment { color: #005faf; }
-.Constant { color: #008787; }
-.Delimiter { color: #c000c0; }
-.PreProc { color: #c000c0; }
--->
-</style>
-
-<script type='text/javascript'>
-<!--
-
-/* function to open any folds containing a jumped-to line before jumping to it */
-function JumpToLine()
-{
-  var lineNum;
-  lineNum = window.location.hash;
-  lineNum = lineNum.substr(1); /* strip off '#' */
-
-  if (lineNum.indexOf('L') == -1) {
-    lineNum = 'L'+lineNum;
-  }
-  var lineElem = document.getElementById(lineNum);
-  /* Always jump to new location even if the line was hidden inside a fold, or
-   * we corrected the raw number to a line ID.
-   */
-  if (lineElem) {
-    lineElem.scrollIntoView(true);
-  }
-  return true;
-}
-if ('onhashchange' in window) {
-  window.onhashchange = JumpToLine;
-}
-
--->
-</script>
-</head>
-<body onload='JumpToLine();'>
-<a href='https://github.com/akkartik/mu/blob/master/apps/ex2.2.mu'>https://github.com/akkartik/mu/blob/master/apps/ex2.2.mu</a>
-<pre id='vimCodeElement'>
-<span id="L1" class="LineNr"> 1 </span><span class="Comment"># Increment a number, and return the result in the exit code.</span>
-<span id="L2" class="LineNr"> 2 </span><span class="Comment">#</span>
-<span id="L3" class="LineNr"> 3 </span><span class="Comment"># To run:</span>
-<span id="L4" class="LineNr"> 4 </span><span class="Comment">#   $ ./translate_mu apps/ex2.2.mu</span>
-<span id="L5" class="LineNr"> 5 </span><span class="Comment">#   $ ./a.elf</span>
-<span id="L6" class="LineNr"> 6 </span><span class="Comment"># Expected result:</span>
-<span id="L7" class="LineNr"> 7 </span><span class="Comment">#   $ echo $?</span>
-<span id="L8" class="LineNr"> 8 </span><span class="Comment">#   7</span>
-<span id="L9" class="LineNr"> 9 </span>
-<span id="L10" class="LineNr">10 </span><span class="PreProc">fn</span> <span class="muFunction">main</span><span class="PreProc"> -&gt; </span>result/<span class="Constant">ebx</span>: int <span class="Delimiter">{</span>
-<span id="L11" class="LineNr">11 </span>  result <span class="SpecialChar">&lt;-</span> <a href='ex2.2.mu.html#L14'>foo</a>
-<span id="L12" class="LineNr">12 </span><span class="Delimiter">}</span>
-<span id="L13" class="LineNr">13 </span>
-<span id="L14" class="LineNr">14 </span><span class="PreProc">fn</span> <span class="muFunction"><a href='ex2.2.mu.html#L14'>foo</a></span><span class="PreProc"> -&gt; </span>result/<span class="Constant">ebx</span>: int <span class="Delimiter">{</span>
-<span id="L15" class="LineNr">15 </span>  <span class="PreProc">var</span> n: int
-<span id="L16" class="LineNr">16 </span>  copy-to n, <span class="Constant">3</span>
-<span id="L17" class="LineNr">17 </span>  increment n
-<span id="L18" class="LineNr">18 </span>  result <span class="SpecialChar">&lt;-</span> copy n
-<span id="L19" class="LineNr">19 </span><span class="Delimiter">}</span>
-</pre>
-</body>
-</html>
-<!-- vim: set foldmethod=manual : -->
diff --git a/mu.md b/mu.md
new file mode 100644
index 00000000..b093f0e0
--- /dev/null
+++ b/mu.md
@@ -0,0 +1,456 @@
+# Mu Syntax
+
+Here are two valid statements in Mu:
+
+```
+increment x
+y <- increment
+```
+
+Understanding when to use one vs the other is the critical idea in Mu. In
+short, the former increments a value in memory, while the latter increments a
+value in a register.
+
+Most languages start from some syntax and do what it takes to implement it.
+Mu, however, is designed as a safe way to program in [a regular subset of
+32-bit x86 machine code](subx.md), _satisficing_ rather than optimizing for a
+clean syntax. To keep the mapping to machine code lightweight, Mu exclusively
+uses statements and most statements map to a single instruction of machine
+code.
+
+Since the x86 instruction set restricts how many memory locations an instruction
+can use, Mu makes registers explicit as well. Variables must be explicitly
+mapped to registers; otherwise they live in memory.
+
+Statements consist of 3 parts: the operation, optional _inouts_ and optional
+_outputs_. Outputs come before the operation name and `<-`.
+
+Outputs are always registers; memory locations that need to be modified are
+passed in by reference using the `addr` type described below.
+
+So Mu programmers need to make two new categories of decisions: whether to
+define variables in registers or memory, and whether to put variables to the
+left or right. There's always exactly one way to write any given operation. In
+return for this overhead you get a lightweight and future-proof stack. And Mu
+will provide good error message to support you.
+
+Further down, this page enumerates all available primitives in Mu, and [a
+separate page](http://akkartik.github.io/mu/html/mu_instructions.html)
+describes how each primitive is translated to machine code.
+
+## Functions and calls
+
+Zooming out from single statements, here's a complete sample program in Mu:
+
+<img alt='ex2.mu' src='html/ex2.mu.png'>
+
+Mu programs are lists of functions. Each function has the following form:
+
+  ```
+  fn _name_ inout ... -> output ... {
+    _statement_
+    _statement_
+    ...
+  }
+  ```
+
+Each function has a header line, and some number of statements, each on a
+separate line. Headers describe inouts and outputs. Inouts can't be registers,
+and outputs _must_ be registers.
+
+The above program also demonstrates a function call (to the function `do-add`).
+Function calls look the same as primitive statements: they can return (multiple)
+outputs in registers, and modify inouts passed in by reference. In addition,
+there's one more constraint: output registers must match the function header.
+For example:
+
+  ```
+  fn f -> x/eax: int {
+    ...
+  }
+  fn g {
+    a/eax <- f  # ok
+    a/ebx <- f  # wrong
+  }
+  ```
+
+The function `main` is special; it is where the program starts running. It
+must always return a single int in register `ebx` (as the exit status of the
+process). It can also optionally accept an array of strings as input (from the
+shell command-line). To be precise, `main` must have one of the following
+two signatures:
+
+- `fn main -> x/ebx: int`
+- `fn main args: (addr array (addr array byte)) -> x/ebx: int`
+
+(The name of the output is flexible.)
+
+## Blocks
+
+Blocks are useful for grouping related statements. They're delimited by `{`
+and `}`, both each alone on a line.
+
+Blocks can nest:
+
+  ```
+  {
+    _statements_
+    {
+      _more statements_
+    }
+  }
+  ```
+
+Blocks can be named (with the name ending in a `:` on the same line as the
+`{`):
+
+  ```
+  $name: {
+    _statements_
+  }
+  ```
+
+Further down we'll see primitive statements for skipping or repeating blocks.
+Besides control flow, the other use for blocks is...
+
+## Local variables
+
+Functions can define new variables at any time with the keyword `var`. There
+are two variants of the `var` statement, for defining variables in registers
+or memory.
+
+  ```
+  var name: type
+  var name/reg: type <- ...
+  ```
+
+Variables on the stack are never initialized. (They're always implicitly
+zeroed them out.) Variables in registers are always initialized.
+
+Variables exist from their definition until the end of their containing block.
+Register variables may also die earlier if their register is clobbered by a
+new variable.
+
+## Arithmetic primitives
+
+Here is the list of arithmetic primitive operations supported by Mu. The name
+`n` indicates a literal integer rather than a variable, and `var/reg` indicates
+a variable in a register.
+
+  ```
+  var/reg <- increment
+  increment var
+  var/reg <- decrement
+  decrement var
+  var1/reg1 <- add var2/reg2
+  var/reg <- add var2
+  add-to var1, var2/reg
+  var/reg <- add n
+  add-to var, n
+
+  var1/reg1 <- sub var2/reg2
+  var/reg <- sub var2
+  sub-from var1, var2/reg
+  var/reg <- sub n
+  sub-from var, n
+
+  var1/reg1 <- and var2/reg2
+  var/reg <- and var2
+  and-with var1, var2/reg
+  var/reg <- and n
+  and-with var, n
+
+  var1/reg1 <- or var2/reg2
+  var/reg <- or var2
+  or-with var1, var2/reg
+  var/reg <- or n
+  or-with var, n
+
+  var1/reg1 <- xor var2/reg2
+  var/reg <- xor var2
+  xor-with var1, var2/reg
+  var/reg <- xor n
+  xor-with var, n
+
+  var/reg <- copy var2/reg2
+  copy-to var1, var2/reg
+  var/reg <- copy var2
+  var/reg <- copy n
+  copy-to var, n
+
+  compare var1, var2/reg
+  compare var1/reg, var2
+  compare var/eax, n
+  compare var, n
+
+  var/reg <- multiply var2
+  ```
+
+Any statement above that takes a variable in memory can be replaced with a
+dereference (`*`) of an address variable (of type `(addr ...)`) in a register.
+(Types can have multiple words, and are wrapped in `()` when they do.) But you
+can't dereference variables in memory. You have to load them into a register
+first.
+
+Excluding dereferences, the above statements must operate on non-address
+primitive types: `int` or `boolean`. (Booleans are really just `int`s, and Mu
+assumes any value but `0` is true.)
+
+## Operating on individual bytes
+
+A special-case is variables of type 'byte'. Mu is a 32-bit platform so for the
+most part only supports types that are multiples of 32 bits. However, we do
+want to support strings in ASCII and UTF-8, which will be arrays of bytes.
+
+Since most x86 instructions implicitly load 32 bits at a time from memory,
+variables of type 'byte' are only allowed in registers, not on the stack. Here
+are the possible statements for reading bytes to/from memory:
+
+  ```
+  var/reg <- copy-byte var2/reg2      # var: byte, var2: byte
+  var/reg <- copy-byte *var2/reg2     # var: byte, var2: (addr byte)
+  copy-byte-to *var1/reg1, var2/reg2  # var1: (addr byte), var2: byte
+  ```
+
+In addition, variables of type 'byte' are restricted to (the lowest bytes of)
+just 4 registers: eax, ecx, edx and ebx.
+
+## Primitive jumps
+
+There are two kinds of jumps, both with many variations: `break` and `loop`.
+`break` instructions jump to the end of the containing block. `loop` instructions
+jump to the beginning of the containing block.
+
+All jumps can take an optional label starting with '$':
+
+  ```
+  loop $foo
+  ```
+
+This instruction jumps to the beginning of the block called $foo. The corresponding
+`break` jumps to the end of the block. Either jump statement must lie somewhere
+inside such a block. Jumps are only legal to containing blocks. (Use named
+blocks with restraint; jumps to places far away can get confusing.)
+
+There are two unconditional jumps:
+
+  ```
+  loop
+  loop label
+  break
+  break label
+  ```
+
+The remaining jump instructions are all conditional. Conditional jumps rely on
+the result of the most recently executed `compare` instruction. (To keep
+programs easy to read, keep compare instructions close to the jump that uses
+them.)
+
+  ```
+  break-if-=
+  break-if-= label
+  break-if-!=
+  break-if-!= label
+  ```
+
+Inequalities are similar, but have unsigned and signed variants. For simplicity,
+always use signed integers; use the unsigned variants only to compare addresses.
+
+  ```
+  break-if-<
+  break-if-< label
+  break-if->
+  break-if-> label
+  break-if-<=
+  break-if-<= label
+  break-if->=
+  break-if->= label
+
+  break-if-addr<
+  break-if-addr< label
+  break-if-addr>
+  break-if-addr> label
+  break-if-addr<=
+  break-if-addr<= label
+  break-if-addr>=
+  break-if-addr>= label
+  ```
+
+Similarly, conditional loops:
+
+  ```
+  loop-if-=
+  loop-if-= label
+  loop-if-!=
+  loop-if-!= label
+
+  loop-if-<
+  loop-if-< label
+  loop-if->
+  loop-if-> label
+  loop-if-<=
+  loop-if-<= label
+  loop-if->=
+  loop-if->= label
+
+  loop-if-addr<
+  loop-if-addr< label
+  loop-if-addr>
+  loop-if-addr> label
+  loop-if-addr<=
+  loop-if-addr<= label
+  loop-if-addr>=
+  loop-if-addr>= label
+  ```
+
+## Addresses
+
+Passing objects by reference requires the `address` operation, which returns
+an object of type `addr`.
+
+  ```
+  var/reg: (addr T) <- address var2: T
+  ```
+
+Here `var2` can't live in a register.
+
+## Array operations
+
+Mu arrays are size-prefixed so that operations on them can check bounds as
+necessary at run-time. The `length` statement returns the number of elements
+in an array.
+
+  ```
+  var/reg: int <- length arr/reg: (addr array T)
+  ```
+
+The `index` statement takes an `addr` to an `array` and returns an `addr` to
+one of its elements, that can be read from or written to.
+
+  ```
+  var/reg: (addr T) <- index arr/reg: (addr array T), n
+  var/reg: (addr T) <- index arr: (array T sz), n
+  ```
+
+The index can also be a variable in a register, with a caveat:
+
+  ```
+  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
+  var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
+  ```
+
+The caveat: the size of T must be 1, 2, 4 or 8 bytes. The x86 instruction set
+has complex addressing modes that can index into an array in a single instruction
+in these situations.
+
+For types in general you'll need to split up the work, performing a `compute-offset`
+before the `index`.
+
+  ```
+  var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int     # arr can be in reg or mem
+  var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int         # arr can be in reg or mem
+  ```
+
+The `compute-offset` statement returns a value of type `(offset T)` after
+performing any necessary bounds checking. Now the offset can be passed to
+`index` as usual:
+
+  ```
+  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)
+  ```
+
+## Compound types
+
+Primitive types can be combined together using the `type` keyword. For
+example:
+
+  ```
+  type point {
+    x: int
+    y: int
+  }
+  ```
+
+Mu programs are currently sequences of `fn` and `type` definitions.
+
+To access within a compound type, use the `get` instruction. There are two
+forms. You need either a variable of the type itself (say `T`) in memory, or a
+variable of type `(addr T)` in a register.
+
+  ```
+  var/reg: (addr T_f) <- get var/reg: (addr T), f
+  var/reg: (addr T_f) <- get var: T, f
+  ```
+
+The `f` here is the field name from the `type` definition, and `T_f` must
+match the type of `f` in the `type` definition. For example, some legal
+instructions for the definition of `point` above:
+
+  ```
+  var a/eax: (addr int) <- get p, x
+  var a/eax: (addr int) <- get p, y
+  ```
+
+## Handles for safe access to the heap
+
+We've seen the `addr` type, but it's intended to be short-lived. In particular,
+you can't save `addr` values inside compound `type`s. To do that you need a
+"fat pointer" called a `handle` that is safe to keep around for extended
+periods and ensures it's used safely without corrupting the heap and causing
+security issues or hard-to-debug misbehavior.
+
+To actually _use_ a `handle`, we have to turn it into an `addr` first using
+the `lookup` statement.
+
+  ```
+  var y/reg: (addr T) <- lookup x
+  ```
+
+Now operate on the `addr` as usual, safe in the knowledge that you can later
+recover any writes to its payload from `x`.
+
+It's illegal to continue to use this `addr` after a function that reclaims
+heap memory. You have to repeat the lookup from the `handle`. (Luckily Mu
+doesn't implement reclamation yet.)
+
+Having two kinds of addresses takes some getting used to. Do we pass in
+variables by value, by `addr` or by `handle`? In inputs or outputs? Here are 3
+rules:
+
+  * Functions that need to look at the payload should accept an `(addr ...)`.
+  * Functions that need to treat a handle as a value, without looking at its
+  payload, should accept a `(handle ...)`. Helpers that save handles into data
+  structures are a common example.
+  * Functions that need to allocate memory should accept an `(addr handle
+  ...)`.
+
+Try to avoid mixing these use cases.
+
+You can copy handles to another variable on the stack like this:
+
+  ```
+  var x: (handle T)
+  # ..some code initializing x..
+  var y/eax: (addr handle T) <- address ...
+  copy-handle x, y
+  ```
+
+You can also save handles inside compound types like this:
+
+  ```
+  var y/reg: (addr handle T_f) <- get var: (addr T), f
+  copy-handle-to *y, x
+  ```
+
+Or this:
+
+  ```
+  var y/reg: (addr handle T) <- index arr: (addr array handle T), n
+  copy-handle-to *y, x
+  ```
+
+## Conclusion
+
+Anything not allowed here is forbidden. At least until you modify mu.subx.
+Please [contact me](mailto:ak@akkartik.com) or [report issues](https://github.com/akkartik/mu/issues)
+when you encounter a missing or misleading error message.
diff --git a/mu_summary b/mu_summary
deleted file mode 100644
index 286f8286..00000000
--- a/mu_summary
+++ /dev/null
@@ -1,259 +0,0 @@
-Mu programs are lists of functions. Each function has the following form:
-
-  fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
-    _instructions_
-  }
-
-Each function has a header line, and some number of instructions, each on a
-separate line.
-
-Instructions may be primitives or function calls. Either way, all instructions
-have one of the following forms:
-
-  # defining variables
-  var _name_: _type_
-  var _name_/_register_: _type_
-
-  # doing things with variables
-  _operation_ _inouts_
-  _outputs_ <- _operation_ _inouts_
-
-Instructions and functions may have inouts and outputs. Both inouts and
-outputs are variables.
-
-As seen above, variables can be defined to live in a register, like this:
-
-  n/eax
-
-Variables not assigned a register live in the stack.
-
-Function inouts must always be on the stack, and outputs must always be in
-registers. A function call must always write to the exact registers its
-definition requires. For example:
-
-  fn foo -> x/eax: int {
-    ...
-  }
-  fn main {
-    a/eax <- foo  # ok
-    a/ebx <- foo  # wrong
-  }
-
-Primitive inouts may be on the stack or in registers, but outputs must always
-be in registers.
-
-Functions can contain nested blocks inside { and }. Variables defined in a
-block don't exist outside it.
-
-  {
-    _instructions_
-    {
-      _more instructions_
-    }
-  }
-
-Blocks can be named like so:
-
-  $name: {
-    _instructions_
-  }
-
-## Primitive instructions
-
-Primitive instructions currently supported in Mu ('n' indicates a literal
-integer rather than a variable, and 'var/reg' indicates a variable in a
-register):
-
-  var/reg <- increment
-  increment var
-  var/reg <- decrement
-  decrement var
-  var1/reg1 <- add var2/reg2
-  var/reg <- add var2
-  add-to var1, var2/reg
-  var/reg <- add n
-  add-to var, n
-
-  var1/reg1 <- sub var2/reg2
-  var/reg <- sub var2
-  sub-from var1, var2/reg
-  var/reg <- sub n
-  sub-from var, n
-
-  var1/reg1 <- and var2/reg2
-  var/reg <- and var2
-  and-with var1, var2/reg
-  var/reg <- and n
-  and-with var, n
-
-  var1/reg1 <- or var2/reg2
-  var/reg <- or var2
-  or-with var1, var2/reg
-  var/reg <- or n
-  or-with var, n
-
-  var1/reg1 <- xor var2/reg2
-  var/reg <- xor var2
-  xor-with var1, var2/reg
-  var/reg <- xor n
-  xor-with var, n
-
-  var/reg <- copy var2/reg2
-  copy-to var1, var2/reg
-  var/reg <- copy var2
-  var/reg <- copy n
-  copy-to var, n
-
-  compare var1, var2/reg
-  compare var1/reg, var2
-  compare var/eax, n
-  compare var, n
-
-  var/reg <- multiply var2
-
-Notice that there are no primitive instructions operating on two variables in
-memory. That's a restriction of the underlying x86 processor.
-
-Any instruction above that takes a variable in memory can be replaced with a
-dereference (`*`) of an address variable in a register. But you can't dereference
-variables in memory.
-
-## Byte operations
-
-A special-case is variables of type 'byte'. Mu is a 32-bit platform so for the
-most part only supports types that are multiples of 32 bits. However, we do
-want to support strings in ASCII and UTF-8, which will be arrays of bytes.
-
-Since most x86 instructions implicitly load 32 bits at a time from memory,
-variables of type 'byte' are only allowed in registers, not on the stack. Here
-are the possible instructions for reading bytes to/from memory:
-
-  var/reg <- copy-byte var2/reg2      # var: byte, var2: byte
-  var/reg <- copy-byte *var2/reg2     # var: byte, var2: (addr byte)
-  copy-byte-to *var1/reg1, var2/reg2  # var1: (addr byte), var2: byte
-
-In addition, variables of type 'byte' are restricted to (the lowest bytes of)
-just 4 registers: eax, ecx, edx and ebx.
-
-## Primitive jump instructions
-
-There are two kinds of jumps, both with many variations: `break` and `loop`.
-`break` instructions jump to the end of the containing block. `loop` instructions
-jump to the beginning of the containing block.
-
-Jumps can take an optional label starting with '$':
-
-  loop $foo
-
-This instruction jumps to the beginning of the block called $foo. It must lie
-somewhere inside such a block. Jumps are only legal to containing blocks. Use
-named blocks with restraint; jumps to places far away can get confusing.
-
-There are two unconditional jumps:
-
-  loop
-  loop label
-  break
-  break label
-
-The remaining jump instructions are all conditional. Conditional jumps rely on
-the result of the most recently executed `compare` instruction. (To keep
-programs easy to read, keep compare instructions close to the jump that uses
-them.)
-
-  break-if-=
-  break-if-= label
-  break-if-!=
-  break-if-!= label
-
-Inequalities are similar, but have unsigned and signed variants. We assume
-unsigned variants are only ever used to compare addresses.
-
-  break-if-<
-  break-if-< label
-  break-if->
-  break-if-> label
-  break-if-<=
-  break-if-<= label
-  break-if->=
-  break-if->= label
-
-  break-if-addr<
-  break-if-addr< label
-  break-if-addr>
-  break-if-addr> label
-  break-if-addr<=
-  break-if-addr<= label
-  break-if-addr>=
-  break-if-addr>= label
-
-Similarly, conditional loops:
-
-  loop-if-=
-  loop-if-= label
-  loop-if-!=
-  loop-if-!= label
-
-  loop-if-<
-  loop-if-< label
-  loop-if->
-  loop-if-> label
-  loop-if-<=
-  loop-if-<= label
-  loop-if->=
-  loop-if->= label
-
-  loop-if-addr<
-  loop-if-addr< label
-  loop-if-addr>
-  loop-if-addr> label
-  loop-if-addr<=
-  loop-if-addr<= label
-  loop-if-addr>=
-  loop-if-addr>= label
-
-## Address operations
-
-  var/reg: (addr T) <- address var: T         # var must be in mem (on the stack)
-
-## Array operations
-
-  var/reg: int <- length arr/reg: (addr array T)
-  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
-  var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
-  var/reg: (addr T) <- index arr/reg: (addr array T), n
-  var/reg: (addr T) <- index arr: (array T sz), n
-
-  var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int     # arr can be in reg or mem
-  var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int         # arr can be in reg or mem
-  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)
-
-## User-defined types
-
-  var/reg: (addr T_f) <- get var/reg: (addr T), f
-    where record (product) type T has elements a, b, c, ... of types T_a, T_b, T_c, ...
-  var/reg: (addr T_f) <- get var: T, f
-
-## Handles for safe access to the heap
-
-Say we created a handle like this on the stack (it can't be in a register)
-  var x: (handle T)
-  allocate Heap, T, x
-
-You can copy handles to another variable on the stack like this:
-  var y: (handle T)
-  copy-handle-to y, x
-
-You can also save handles inside other user-defined types like this:
-  var y/reg: (addr handle T_f) <- get var: (addr T), f
-  copy-handle-to *y, x
-
-Or this:
-  var y/reg: (addr handle T) <- index arr: (addr array handle T), n
-  copy-handle-to *y, x
-
-Handles can be converted into addresses like this:
-  var y/reg: (addr T) <- lookup x
-
-It's illegal to continue to use this addr after a function that reclaims heap
-memory. You have to repeat the lookup.
diff --git a/subx.md b/subx.md
new file mode 100644
index 00000000..b1ab38bc
--- /dev/null
+++ b/subx.md
@@ -0,0 +1,162 @@
+## SubX
+
+SubX is a notation for a subset of x86 machine code. [The Mu translator](http://akkartik.github.io/mu/html/apps/mu.subx.html)
+is implemented in SubX and also emits SubX code.
+
+Here's an example program in SubX that adds 1 and 1 and returns the result to
+the parent shell process:
+
+  ```sh
+  == code
+  Entry:
+    # ebx = 1
+    bb/copy-to-ebx  1/imm32
+    # increment ebx
+    43/increment-ebx
+    # exit(ebx)
+    e8/call  syscall_exit/disp32
+  ```
+
+## The syntax of SubX instructions
+
+Just like in regular machine code, SubX programs consist mostly of instructions,
+which are basically sequences of numbers (always in hex). Instructions consist
+of words separated by whitespace. Words may be _opcodes_ (defining the
+operation being performed) or _arguments_ (specifying the data the operation
+acts on). Any word can have extra _metadata_ attached to it after `/`. Some
+metadata is required (like the `/imm32` and `/imm8` above), but unrecognized
+metadata is silently skipped so you can attach comments to words (like the
+instruction name `/copy-to-eax` above, or the `/exit` argument).
+
+What do all these numbers mean? SubX supports a small subset of the 32-bit x86
+instruction set that likely runs on your computer. (Think of the name as short
+for "sub-x86".) The instruction set contains instructions like `89/copy`,
+`01/add`, `3d/compare` and `51/push-ecx` which modify registers and a byte-addressable
+memory. For a complete list of supported instructions, run `bootstrap help
+opcodes`.
+
+The registers instructions operate on are as follows:
+
+- Six general-purpose 32-bit registers: `0/eax`, `1/ebx`, `2/ecx`, `3/edx`,
+  `6/esi` and `7/edi`.
+- Two additional 32-bit registers: `4/esp` and `5/ebp`. (I suggest you only
+  use these to manage the call stack.)
+
+(SubX doesn't support floating-point registers yet. Intel processors support
+an 8-bit mode, 16-bit mode and 64-bit mode. SubX will never support them.
+There are also _many_ more instructions that SubX will never support.)
+
+While SubX doesn't provide the usual mnemonics for opcodes, it _does_ provide
+error-checking. If you miss an argument or accidentally add an extra argument,
+you'll get a nice error. SubX won't arbitrarily interpret bytes of data as
+instructions or vice versa.
+
+It's worth distinguishing between an instruction's arguments and its _operands_.
+Arguments are provided directly in instructions. Operands are pieces of data
+in register or memory that are operated on by instructions.
+
+Intel processors typically operate on no more than two operands, and at most
+one of them (the 'reg/mem' operand) can access memory. The address of the
+reg/mem operand is constructed by expressions of one of these forms:
+
+  - `%reg`: operate on just a register, not memory
+  - `*reg`: look up memory with the address in some register
+  - `*(reg + disp)`: add a constant to the address in some register
+  - `*(base + (index << scale) + disp)` where `base` and `index` are registers,
+    and `scale` and `disp` are 2- and 32-bit constants respectively.
+
+Under the hood, SubX turns expressions of these forms into multiple arguments
+with metadata in some complex ways. See [the doc on bare SubX](subx_bare.md).
+
+That covers the complexities of the reg/mem operand. The second operand is
+simpler. It comes from exactly one of the following argument types:
+
+  - `/r32`
+  - displacement: `/disp8` or `/disp32`
+  - immediate: `/imm8` or `/imm32`
+
+Putting all this together, here's an example that adds the integer in `eax` to
+the one at address `edx`:
+
+  ```
+  01/add %edx 0/r32/eax
+  ```
+
+## The syntax of SubX programs
+
+SubX programs map to the same ELF binaries that a conventional Linux system
+uses. Linux ELF binaries consist of a series of _segments_. In particular, they
+distinguish between code and data. Correspondingly, SubX programs consist of a
+series of segments, each starting with a header line: `==` followed by a name
+and approximate starting address.
+
+All code must lie in a segment called 'code'.
+
+Segments can be added to.
+
+```sh
+== code 0x09000000  # first mention requires starting address
+...A...
+
+== data 0x0a000000
+...B...
+
+== code             # no address necessary when adding
+...C...
+```
+
+The `code` segment now contains the instructions of `A` as well as `C`.
+
+Within the `code` segment, each line contains a comment, label or instruction.
+Comments start with a `#` and are ignored. Labels should always be the first
+word on a line, and they end with a `:`.
+
+Instructions can refer to labels in displacement or immediate arguments, and
+they'll obtain a value based on the address of the label: immediate arguments
+will contain the address directly, while displacement arguments will contain
+the difference between the address and the address of the current instruction.
+The latter is mostly useful for `jump` and `call` instructions.
+
+Functions are defined using labels. By convention, labels internal to functions
+(that must only be jumped to) start with a `$`. Any other labels must only be
+called, never jumped to. All labels must be unique.
+
+Functions are called using the following syntax:
+```
+(func arg1 arg2 ...)
+```
+
+Function arguments must be either literals (integers or strings) or a reg/mem
+operand using the syntax in the previous section.
+
+A special label is `Entry`, which can be used to specify/override the entry
+point of the program. It doesn't have to be unique, and the latest definition
+will override earlier ones.
+
+(The `Entry` label, along with duplicate segment headers, allows programs to
+be built up incrementally out of multiple [_layers_](http://akkartik.name/post/wart-layers).)
+
+Another special pair of labels are the block delimiters `{` and `}`. They can
+be nested, and jump instructions can take arguments `loop` or `break` that
+jump to the enclosing `{` and `}` respectively.
+
+The data segment consists of labels as before and byte values. Referring to
+data labels in either `code` segment instructions or `data` segment values
+yields their address.
+
+Automatic tests are an important part of SubX, and there's a simple mechanism
+to provide a test harness: all functions that start with `test-` are called in
+turn by a special, auto-generated function called `run-tests`. How you choose
+to call it is up to you.
+
+I try to keep things simple so that there's less work to do when implementing
+SubX in SubX. But there _is_ one convenience: instructions can provide a
+string literal surrounded by quotes (`"`) in an `imm32` argument. SubX will
+transparently copy it to the `data` segment and replace it with its address.
+Strings are the only place where a SubX word is allowed to contain spaces.
+
+That should be enough information for writing SubX programs. The `apps/`
+directory provides some fodder for practice in the `apps/ex*.subx` files,
+giving a more gradual introduction to SubX features. In particular, you should
+work through `apps/factorial4.subx`, which demonstrates all the above ideas in
+concert.
diff --git a/subx_addressing_modes.md b/subx_bare.md
index 52455510..b2429ed2 100644
--- a/subx_addressing_modes.md
+++ b/subx_bare.md
@@ -1,7 +1,7 @@
-[The Readme](Readme.md) describes SubX notation with some details hidden
-behind _syntax sugar_ -- local rewrite rules that make programming in SubX
-less error-prone. However, much low-level SubX (before the syntax sugar is
-implemented) is written without syntax sugar. This document describes some
+[The SubX documentation](subx.md) describes SubX notation with some details
+hidden behind _syntax sugar_ -- local rewrite rules that make programming in
+SubX less error-prone. However, much low-level SubX (before the syntax sugar
+is implemented) is written without syntax sugar. This document describes some
 details of the syntax sugar: how the reg/mem operand is translated into
 arguments.
 
@@ -123,3 +123,50 @@ $ echo $?
 
 These details should now be enough information for reading and modifying
 low-level SubX programs.
+
+## Translating SubX programs
+
+This repo includes two translators for bare SubX. The first is [the bootstrap
+translator](bootstrap.md) implemented in C++. In addition, you can use SubX to
+translate itself. For example, running natively on Linux:
+
+  ```sh
+  # generate translator phases using the C++ translator
+  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/hex.subx    -o hex
+  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/survey.subx -o survey
+  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/pack.subx   -o pack
+  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/assort.subx -o assort
+  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/dquotes.subx -o dquotes
+  $ ./bootstrap translate init.linux 0*.subx apps/subx-params.subx apps/tests.subx  -o tests
+  $ chmod +x hex survey pack assort dquotes tests
+
+  # use the generated translator phases to translate SubX programs
+  $ cat init.linux apps/ex1.subx |./tests |./dquotes |./assort |./pack |./survey |./hex > a.elf
+  $ chmod +x a.elf
+  $ ./a.elf
+  $ echo $?
+  42
+
+  # or, automating the above steps
+  $ ./translate_subx init.linux apps/ex1.subx
+  $ ./a.elf
+  $ echo $?
+  42
+  ```
+
+Or, running in a VM on other platforms (much slower):
+
+  ```sh
+  $ ./translate_subx_emulated init.linux apps/ex1.subx  # generates identical a.elf to above
+  $ ./bootstrap run a.elf
+  $ echo $?
+  42
+  ```
+
+## Resources
+
+- [Single-page cheatsheet for the x86 ISA](https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x86_opcode_structure_and_instruction_overview.pdf)
+  (pdf; [cached local copy](https://github.com/akkartik/mu/blob/master/cheatsheet.pdf))
+- [Concise reference for the x86 ISA](https://c9x.me/x86)
+- [Concise reference for the x86 ISA #2](http://ref.x86asm.net/coder32.html)
+- [Intel processor manual](http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf) (pdf)
diff --git a/test_apps b/test_apps
index 325266e1..bcd61745 100755
--- a/test_apps
+++ b/test_apps
@@ -47,11 +47,11 @@ echo ex2
 test "$1" = 'record'  ||  git diff --exit-code apps/ex2
 test $EMULATED  &&  {
   ./bootstrap run apps/ex2  ||  ret=$?
-  test $ret -eq 2  # 1 + 1
+  test $ret -eq 7  # 3 + 4
 }
 test $NATIVE  &&  {
   apps/ex2  ||  ret=$?
-  test $ret -eq 2  # 1 + 1
+  test $ret -eq 7  # 3 + 4
 }
 
 echo ex3
@@ -365,17 +365,6 @@ test $NATIVE  &&  {
   test $ret -eq 7
 }
 
-echo ex2.2.mu
-./translate_mu apps/ex2.2.mu
-test $EMULATED  &&  {
-  ./bootstrap run a.elf  ||  ret=$?
-  test $ret -eq 4
-}
-test $NATIVE  &&  {
-  ./a.elf  ||  ret=$?
-  test $ret -eq 4
-}
-
 echo ex3.mu
 ./translate_mu apps/ex3.mu
 test $EMULATED  &&  {
diff --git a/vocabulary.md b/vocabulary.md
new file mode 100644
index 00000000..426ed590
--- /dev/null
+++ b/vocabulary.md
@@ -0,0 +1,208 @@
+## Reference documentation on available primitives
+
+### Data Structures
+
+- Kernel strings: null-terminated regions of memory. Unsafe and to be avoided,
+  but needed for interacting with the kernel.
+
+- Arrays: length-prefixed regions of memory containing multiple elements of a
+  single type. Contents are preceded by 4 bytes (32 bits) containing the
+  `length` of the array in bytes.
+
+- Slices: a pair of 32-bit addresses denoting a [half-open](https://en.wikipedia.org/wiki/Interval_(mathematics))
+  \[`start`, `end`) interval to live memory with a consistent lifetime.
+
+  Invariant: `start` <= `end`
+
+- Streams: strings prefixed by 32-bit `write` and `read` indexes that the next
+  write or read goes to, respectively.
+
+  - offset 0: write index
+  - offset 4: read index
+  - offset 8: length of array (in bytes)
+  - offset 12: start of array data
+
+  Invariant: 0 <= `read` <= `write` <= `length`
+
+- File descriptors (fd): Low-level 32-bit integers that the kernel uses to
+  track files opened by the program.
+
+- File: 32-bit value containing either a fd or an address to a stream (fake
+  file).
+
+- Buffered files (buffered-file): Contain a file descriptor and a stream for
+  buffering reads/writes. Each `buffered-file` must exclusively perform either
+  reads or writes.
+
+### 'system calls'
+
+As I said at the top, a primary design goal of SubX (and Mu more broadly) is
+to explore ways to turn arbitrary manual tests into reproducible automated
+tests. SubX aims for this goal by baking testable interfaces deep into the
+stack, at the OS syscall level. The idea is that every syscall that interacts
+with hardware (and so the environment) should be *dependency injected* so that
+it's possible to insert fake hardware in tests.
+
+But those are big goals. Here are the syscalls I have so far:
+
+- `write`: takes two arguments, a file `f` and an address to array `s`.
+
+  Comparing this interface with the Unix `write()` syscall shows two benefits:
+
+  1. SubX can handle 'fake' file descriptors in tests.
+
+  1. `write()` accepts buffer and its length in separate arguments, which
+     requires callers to manage the two separately and so can be error-prone.
+     SubX's wrapper keeps the two together to increase the chances that we
+     never accidentally go out of array bounds.
+
+- `read`: takes two arguments, a file `f` and an address to stream `s`. Reads
+  as much data from `f` as can fit in (the free space of) `s`.
+
+  Like with `write()`, this wrapper around the Unix `read()` syscall adds the
+  ability to handle 'fake' file descriptors in tests, and reduces the chances
+  of clobbering outside array bounds.
+
+  One bit of weirdness here: in tests we do a redundant copy from one stream
+  to another. See [the comments before the implementation](http://akkartik.github.io/mu/html/060read.subx.html)
+  for a discussion of alternative interfaces.
+
+- `stop`: takes two arguments:
+  - `ed` is an address to an _exit descriptor_. Exit descriptors allow us to
+    `exit()` the program in production, but return to the test harness within
+    tests. That allows tests to make assertions about when `exit()` is called.
+  - `value` is the status code to `exit()` with.
+
+  For more details on exit descriptors and how to create one, see [the
+  comments before the implementation](http://akkartik.github.io/mu/html/059stop.subx.html).
+
+- `new-segment`
+
+  Allocates a whole new segment of memory for the program, discontiguous with
+  both existing code and data (heap) segments. Just a more opinionated form of
+  [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html).
+
+- `allocate`: takes two arguments, an address to allocation-descriptor `ad`
+  and an integer `n`
+
+  Allocates a contiguous range of memory that is guaranteed to be exclusively
+  available to the caller. Returns the starting address to the range in `eax`.
+
+  An allocation descriptor tracks allocated vs available addresses in some
+  contiguous range of memory. The int specifies the number of bytes to allocate.
+
+  Explicitly passing in an allocation descriptor allows for nested memory
+  management, where a sub-system gets a chunk of memory and further parcels it
+  out to individual allocations. Particularly helpful for (surprise) tests.
+
+- ... _(to be continued)_
+
+I will continue to import syscalls over time from [the old Mu VM in the parent
+directory](https://github.com/akkartik/mu), which has experimented with
+interfaces for the screen, keyboard, mouse, disk and network.
+
+### primitives built atop system calls
+
+_(Compound arguments are usually passed in by reference. Where the results are
+compound objects that don't fit in a register, the caller usually passes in
+allocated memory for it.)_
+
+#### assertions for tests
+- `check-ints-equal`: fails current test if given ints aren't equal
+- `check-stream-equal`: fails current test if stream doesn't match string
+- `check-next-stream-line-equal`: fails current test if next line of stream
+  until newline doesn't match string
+
+#### error handling
+- `error`: takes three arguments, an exit-descriptor, a file and a string (message)
+
+  Prints out the message to the file and then exits using the provided
+  exit-descriptor.
+
+- `error-byte`: like `error` but takes an extra byte value that it prints out
+  at the end of the message.
+
+#### predicates
+- `kernel-string-equal?`: compares a kernel string with a string
+- `string-equal?`: compares two strings
+- `stream-data-equal?`: compares a stream with a string
+- `next-stream-line-equal?`: compares with string the next line in a stream, from
+  `read` index to newline
+
+- `slice-empty?`: checks if the `start` and `end` of a slice are equal
+- `slice-equal?`: compares a slice with a string
+- `slice-starts-with?`: compares the start of a slice with a string
+- `slice-ends-with?`: compares the end of a slice with a string
+
+#### writing to disk
+- `write`: string -> file
+  - Can also be used to cat a string into a stream.
+  - Will abort the entire program if destination is a stream and doesn't have
+    enough room.
+- `write-stream`: stream -> file
+  - Can also be used to cat one stream into another.
+  - Will abort the entire program if destination is a stream and doesn't have
+    enough room.
+- `write-slice`: slice -> stream
+  - Will abort the entire program if there isn't enough room in the
+    destination stream.
+- `append-byte`: int -> stream
+  - Will abort the entire program if there isn't enough room in the
+    destination stream.
+- `append-byte-hex`: int -> stream
+  - textual representation in hex, no '0x' prefix
+  - Will abort the entire program if there isn't enough room in the
+    destination stream.
+- `print-int32`: int -> stream
+  - textual representation in hex, including '0x' prefix
+  - Will abort the entire program if there isn't enough room in the
+    destination stream.
+- `write-buffered`: string -> buffered-file
+- `write-slice-buffered`: slice -> buffered-file
+- `flush`: buffered-file
+- `write-byte-buffered`: int -> buffered-file
+- `print-byte-buffered`: int -> buffered-file
+  - textual representation in hex, no '0x' prefix
+- `print-int32-buffered`: int -> buffered-file
+  - textual representation in hex, including '0x' prefix
+
+#### reading from disk
+- `read`: file -> stream
+  - Can also be used to cat one stream into another.
+  - Will silently stop reading when destination runs out of space.
+- `read-byte-buffered`: buffered-file -> byte
+- `read-line-buffered`: buffered-file -> stream
+  - Will abort the entire program if there isn't enough room.
+
+#### non-IO operations on streams
+- `new-stream`: allocates space for a stream of `n` elements, each occupying
+  `b` bytes.
+  - Will abort the entire program if `n*b` requires more than 32 bits.
+- `clear-stream`: resets everything in the stream to `0` (except its `length`).
+- `rewind-stream`: resets the read index of the stream to `0` without modifying
+  its contents.
+
+#### reading/writing hex representations of integers
+- `is-hex-int?`: takes a slice argument, returns boolean result in `eax`
+- `parse-hex-int`: takes a slice argument, returns int result in `eax`
+- `is-hex-digit?`: takes a 32-bit word containing a single byte, returns
+  boolean result in `eax`.
+- `from-hex-char`: takes a hexadecimal digit character in `eax`, returns its
+  numeric value in `eax`
+- `to-hex-char`: takes a single-digit numeric value in `eax`, returns its
+  corresponding hexadecimal character in `eax`
+
+#### tokenization
+
+from a stream:
+- `next-token`: stream, delimiter byte -> slice
+- `skip-chars-matching`: stream, delimiter byte
+- `skip-chars-not-matching`: stream, delimiter byte
+
+from a slice:
+- `next-token-from-slice`: start, end, delimiter byte -> slice
+  - Given a slice and a delimiter byte, returns a new slice inside the input
+    that ends at the delimiter byte.
+
+- `skip-chars-matching-in-slice`: curr, end, delimiter byte -> new-curr (in `eax`)
+- `skip-chars-not-matching-in-slice`:  curr, end, delimiter byte -> new-curr (in `eax`)