# SubX: a minimal notation for x86 machine code

SubX is a notation for a subset of the x86 instruction set. Here's a program (`examples/ex1.subx`) that returns 42:

```sh
bb/copy-to-ebx  0x2a/imm32  # 42 in hex
b8/copy-to-eax  1/imm32/exit
cd/syscall  0x80/imm8
```

You can generate tiny zero-dependency ELF binaries with it that run on Linux.

```sh
$ ./ntranslate init.linux examples/ex1.subx -o examples/ex1
$ ./examples/ex1
$ echo $?
42
```

You can run the generated binaries on an interpreter/VM for better error messages.

```sh
$ ./subx run examples/ex1  # on Linux or BSD or Mac
$ echo $?
42
```

Emulated runs can generate a trace that permits [time-travel debugging](https://github.com/akkartik/mu/blob/master/browse_trace/Readme.md).

```sh
$ ./subx --debug translate init.linux examples/factorial.subx -o examples/factorial
saving address->label information to 'labels'
saving address->source information to 'source_lines'
$ ./subx --debug --trace run examples/factorial
saving trace to 'last_run'
$ ./browse_trace/browse_trace last_run  # text-mode debugger UI
```

You can write tests for your programs. The entire stack is thoroughly covered by automated tests. SubX's tagline: tests before syntax.

```sh
$ ./subx test
$ ./subx run apps/factorial test
```

SubX is implemented in layers of syntax sugar over a tiny core. The core has two translators that emit identical binaries. The first, `subx`, is in C++. As a result it looks reasonably familiar but has a sprawling set of dependencies. The second, `ntranslate`, is self-hosted, so it takes some practice to read. However, it has a minuscule set of dependencies. These complementary strengths and weaknesses make it easy to audit and debug.
```sh
# generate translator phases using the C++ translator
$ ./subx translate init.linux 0*.subx apps/subx-params.subx apps/hex.subx -o hex
$ ./subx translate init.linux 0*.subx apps/subx-params.subx apps/survey.subx -o survey
$ ./subx translate init.linux 0*.subx apps/subx-params.subx apps/pack.subx -o pack
$ ./subx translate init.linux 0*.subx apps/subx-params.subx apps/assort.subx -o assort
$ ./subx translate init.linux 0*.subx apps/subx-params.subx apps/dquotes.subx -o dquotes
$ ./subx translate init.linux 0*.subx apps/subx-params.subx apps/tests.subx -o tests
$ chmod +x hex survey pack assort dquotes tests

# use the generated translator phases to translate SubX programs
$ cat init.linux examples/ex1.subx |./tests |./dquotes |./assort |./pack |./survey |./hex > a.elf
$ chmod +x a.elf
$ ./a.elf
$ echo $?
42

# or, automating the above steps
$ ./ntranslate init.linux ex1.subx
$ ./a.elf
$ echo $?
42
```

Or, running in a VM on other platforms:

```sh
$ ./translate init.linux ex1.subx  # generates identical a.elf to above
$ ./subx run a.elf
$ echo $?
42
```

You can package up SubX binaries with the minimal hobbyist OS [Soso](https://github.com/ozkl/soso) and run them on Qemu. (Requires graphics and sudo access. Currently doesn't work on a cloud server.)

```sh
# dependencies
$ sudo apt install util-linux nasm xorriso  # maybe also dosfstools and mtools
# package up a "hello world" program with a third-party kernel into mu_soso.iso
# requires sudo
$ ./gen_soso_iso init.soso examples/ex6.subx
# try it out
$ qemu-system-i386 -cdrom mu_soso.iso
```

You can also package up SubX binaries with a Linux kernel and run them on either Qemu or [a cloud server that supports custom images](http://akkartik.name/post/iso-on-linode). (Takes 12 minutes with 8GB RAM. Requires 12 million LoC of C for the Linux kernel; that number will gradually go down.)
```sh
$ sudo apt install build-essential flex bison wget libelf-dev libssl-dev xorriso
$ ./gen_linux_iso init.linux examples/ex6.subx
$ qemu-system-x86_64 -m 256M -cdrom mu.iso -boot d
```

## What it looks like

Here is the above example again:

```sh
bb/copy-to-ebx  0x2a/imm32  # 42 in hex
b8/copy-to-eax  1/imm32/exit
cd/syscall  0x80/imm8
```

Every line contains at most one instruction. Instructions consist of words separated by whitespace. Words may be _opcodes_ (defining the operation being performed) or _arguments_ (specifying the data the operation acts on). Any word can have extra _metadata_ attached to it after `/`. Some metadata is required (like the `/imm32` and `/imm8` above), but unrecognized metadata is silently skipped so you can attach comments to words (like the instruction name `/copy-to-eax` above, or the `/exit` operand).

SubX doesn't provide much syntax (there aren't even the usual mnemonics for opcodes), but it _does_ provide error-checking. If you miss an operand or accidentally add an extra operand you'll get a nice error. SubX won't arbitrarily interpret bytes of data as instructions or vice versa.

So much for syntax. What do all these numbers actually _mean_? SubX supports a small subset of the 32-bit x86 instruction set that likely runs on your computer. (Think of the name as short for "sub-x86".) Instructions operate on a few registers:

* Six general-purpose 32-bit registers: `eax`, `ebx`, `ecx`, `edx`, `esi` and `edi`
* Two additional 32-bit registers: `esp` and `ebp` (I suggest you only use these to manage the call stack.)
* Four 1-bit _flag_ registers for conditional branching:
  - zero/equal flag `ZF`
  - sign flag `SF`
  - overflow flag `OF`
  - carry flag `CF`

SubX programs consist of instructions like `89/copy`, `01/add`, `3d/compare` and `51/push-ecx` which modify these registers as well as a byte-addressable memory. For a complete list of supported instructions, run `subx help opcodes`.

(SubX doesn't support floating-point registers yet. Intel processors support an 8-bit mode, 16-bit mode and 64-bit mode. SubX will never support them. There are other flags. SubX will never support them. There are also _many_ more instructions that SubX will never support.)

It's worth distinguishing between an instruction's _operands_ and its _arguments_. Arguments are provided directly in instructions. Operands are pieces of data in register or memory that are operated on by instructions. Intel processors determine operands from arguments in fairly complex ways.

## Lengthy interlude: How x86 instructions compute operands

The [Intel processor manual](http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf) is the final source of truth on the x86 instruction set, but it can be forbidding to make sense of, so here's a quick orientation. You will need familiarity with binary numbers, and maybe a few other things. Email [me](mailto:mu@akkartik.com) any time if something isn't clear. I love explaining this stuff for as long as it takes. The bad news is that it takes some getting used to. The good news is that internalizing the next 500 words will give you a significantly deeper understanding of your computer.

Most instructions operate on an operand in register or memory ('reg/mem'), and a second operand in a register. The register operand is specified fairly directly using the 3-bit `/r32` argument:

- 0 means register `eax`
- 1 means register `ecx`
- 2 means register `edx`
- 3 means register `ebx`
- 4 means register `esp`
- 5 means register `ebp`
- 6 means register `esi`
- 7 means register `edi`

The reg/mem operand, however, gets complex. It can be specified by 1-7 arguments, each ranging in size from 2 bits to 4 bytes.

The key argument that's always present for reg/mem operands is `/mod`, the _addressing mode_.
This is a 2-bit argument that can take 4 possible values, and it determines what other arguments are required, and how to interpret them.

* If `/mod` is `3`: the operand is in the register described by the 3-bit `/rm32` argument similarly to `/r32` above.
* If `/mod` is `0`: the operand is in the address provided in the register described by `/rm32`. That's `*rm32` in C syntax.
* If `/mod` is `1`: the operand is in the address provided by adding the register in `/rm32` with the (1-byte) displacement. That's `*(rm32 + /disp8)` in C syntax.
* If `/mod` is `2`: the operand is in the address provided by adding the register in `/rm32` with the (4-byte) displacement. That's `*(rm32 + /disp32)` in C syntax.

In the last three cases, one exception occurs when the `/rm32` argument contains `4`. Rather than encoding register `esp`, it means the address is provided by three _whole new_ arguments (`/base`, `/index` and `/scale`) in a _totally_ different way (where `<<` is the left-shift operator):

```
reg/mem = *(base + (index << scale))
```

(There are a couple more exceptions ☹; see [Table 2-2](modrm.pdf) and [Table 2-3](sib.pdf) of the Intel manual for the complete story.)

Phew, that was a lot to take in. Some examples to work through as you reread and digest it:

1. To read directly from the `eax` register, `/mod` must be `3` (direct mode), and `/rm32` must be `0`. There must be no `/base`, `/index` or `/scale` arguments.
1. To read from `*eax` (in C syntax), `/mod` must be `0` (indirect mode), and the `/rm32` argument must be `0`. There must be no `/base`, `/index` or `/scale` arguments. (Intel calls the trio the 'SIB byte'.)
1. To read from `*(eax+4)`, `/mod` must be `1` (indirect + disp8 mode), `/rm32` must be `0`, there must be no SIB byte, and there must be a single displacement byte containing `4`.
1. To read from `*(eax+ecx+4)`, one approach would be to set `/mod` to `1` as above, `/rm32` to `4` (SIB byte next), `/base` to `0`, `/index` to `1` (`ecx`) and a single displacement byte to `4`. (What should the `scale` bits be? Can you think of another approach?)
1. To read from `*(eax+ecx+1000)`, one approach would be:
   - `/mod`: `2` (indirect + disp32)
   - `/rm32`: `4` (`/base`, `/index` and `/scale` arguments required)
   - `/base`: `0` (eax)
   - `/index`: `1` (ecx)
   - `/disp32
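
The addressing-mode rules above can be sketched in a few lines of C++. This is only an illustration, not code from SubX: the names `ModRM`, `SIB`, `decode_modrm`, `decode_sib` and `effective_address` are hypothetical. The bit positions follow the standard x86 layout of the ModR/M and SIB bytes (2 bits, then 3 bits, then 3 bits, from most to least significant). The sketch deliberately ignores the extra exceptions mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Fields of the ModR/M byte, named after the SubX arguments (hypothetical struct).
struct ModRM {
  uint8_t mod;   // bits 7-6: addressing mode
  uint8_t r32;   // bits 5-3: register operand
  uint8_t rm32;  // bits 2-0: reg/mem operand; 4 in an indirect mode means "SIB byte follows"
};

// Fields of the optional SIB byte (hypothetical struct).
struct SIB {
  uint8_t scale;  // bits 7-6
  uint8_t index;  // bits 5-3
  uint8_t base;   // bits 2-0
};

constexpr ModRM decode_modrm(uint8_t b) {
  return ModRM{static_cast<uint8_t>(b >> 6),
               static_cast<uint8_t>((b >> 3) & 7),
               static_cast<uint8_t>(b & 7)};
}

constexpr SIB decode_sib(uint8_t b) {
  return SIB{static_cast<uint8_t>(b >> 6),
             static_cast<uint8_t>((b >> 3) & 7),
             static_cast<uint8_t>(b & 7)};
}

// Address of the reg/mem operand for the three indirect modes.
// regs is indexed by register number (0 = eax, 1 = ecx, ...).
// s is only consulted when m.rm32 == 4; disp is only consulted when m.mod is 1 or 2.
// Assumption: none of the special-case encodings from the Intel tables apply.
uint32_t effective_address(const uint32_t regs[8], ModRM m, SIB s, int32_t disp) {
  assert(m.mod != 3);  // direct mode names a register, not an address
  uint32_t addr = (m.rm32 == 4)
      ? regs[s.base] + (regs[s.index] << s.scale)  // base + (index << scale)
      : regs[m.rm32];                              // plain register
  if (m.mod != 0) addr += static_cast<uint32_t>(disp);  // disp8 or disp32
  return addr;
}
```

For instance, `decode_modrm(0x44)` splits the byte into `/mod` `1`, `/r32` `0` and `/rm32` `4`, which says that a SIB byte and a 1-byte displacement follow.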
:(before "End Primitive Recipe Declarations")
_BROWSE_TRACE,
:(before "End Primitive Recipe Numbers")
Recipe_ordinal["$browse-trace"] = _BROWSE_TRACE;
:(before "End Primitive Recipe Checks")
case _BROWSE_TRACE: {
  break;
}
:(before "End Primitive Recipe Implementations")
case _BROWSE_TRACE: {
  start_trace_browser();
  break;
}

:(before "End Globals")
set<long long int> Visible;
long long int Top_of_screen = 0;
long long int Last_printed_row = 0;
map<int, long long int> Trace_index;  // screen row -> trace index

:(code)
void start_trace_browser() {
  if (!Trace_stream) return;
  cerr << "computing depth to display\n";
  // find the shallowest nonzero depth in the trace; only lines at that depth start out visible
  long long int min_depth = 9999;
  for (long long int i = 0; i < SIZE(Trace_stream->past_lines); ++i) {
    trace_line& curr_line = Trace_stream->past_lines.at(i);
    if (curr_line.depth == 0) continue;
    if (curr_line.depth < min_depth) min_depth = curr_line.depth;
  }
  cerr << "depth is " << min_depth << '\n';
  cerr << "computing lines to display\n";
  for (long long int i = 0; i < SIZE(Trace_stream->past_lines); ++i) {
    if (Trace_stream->past_lines.at(i).depth == min_depth)
      Visible.insert(i);
  }
  tb_init();
  Display_row = Display_column = 0;
  tb_event event;
  Top_of_screen = 0;
  refresh_screen_rows();
  while (true) {
    render();
    do {
      tb_poll_event(&event);
    } while (event.type != TB_EVENT_KEY);
    long long int key = event.key ? event.key : event.ch;
    if (key == 'q' || key == 'Q') break;
    if (key == 'j' || key == TB_KEY_ARROW_DOWN) {
      // move cursor one line down
      if (Display_row < Last_printed_row) ++Display_row;
    }
    if (key == 'k' || key == TB_KEY_ARROW_UP) {
      // move cursor one line up
      if (Display_row > 0) --Display_row;
    }
    if (key == 'H') {
      // move cursor to top of screen
      Display_row = 0;
    }
    if (key == 'M') {
      // move cursor to center of screen
      Display_row = tb_height()/2;
    }
    if (key == 'L') {
      // move cursor to bottom of screen
      Display_row =