author: Kartik Agaram <vc@akkartik.com> 2019-02-16 22:55:12 -0800
committer: Kartik Agaram <vc@akkartik.com> 2019-02-16 22:55:12 -0800
commit: de2990880ab7ac98db79747a2eaedc3442fd4afc
tree: 16f1e3cf257c78ff05de6e999c2118e0f537e830
parent: bf1fe33c369d815676c4b7b3a334a6d971988f64
4976 - recommend that operand order be fixed
I've been allowing operands in any order just because it simplifies implementation.
I don't actually rely on this flexibility; all the .subx programs in this
repo consistently use a single ordering.

Why is a hard-coded canonical order hard to implement? The order that seems
most logical to me is complicated by the "reg" bits in the ModR/M byte:

- In instructions that interpret it as an `/r32` operand, it needs to be
  deemphasized because it refers to a different argument of the instruction
  than the `/mod`, `/rm32`, `/base`, `/index` and `/scale` operands that
  capture the bulk of instruction decoding complexity and so should be
  emphasized. `/r32` can also be unused, which strengthens the case for
  deemphasizing it.

- In instructions that interpret the "reg" bits as a `/subop` operand,
  it should be colocated with the opcode because it performs the same function:
  specifying the *operation* the instruction performs.

In both cases, the bits in the `reg` bitfield are conceptually unrelated
to the other bitfields in the same byte. But they sometimes want to be
close to the opcode bytes on the left, and at other times need to be deemphasized
rightward. Supporting both these possibilities in a fixed order seems
complicated and stateful, particularly since all operands are optional in
general. On the other hand, just pulling out the operands you need to create
each byte, regardless of where in the instruction they occur, is nicely stateless.
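
To illustrate the stateless approach described above, here is a minimal sketch in Python. It is not SubX's actual implementation; the helper names (`parse_word`, `find_operand`, `modrm_byte`) are hypothetical, and it assumes each word looks like `value/tag/...` with a hexadecimal value. It builds the ModR/M byte by looking up each bitfield by its metadata tag, so the position of a word within the instruction never matters.

```python
# Hypothetical sketch of order-independent operand lookup.
# Assumes words of the form "value/tag1/tag2" with hexadecimal values.

def parse_word(word):
    """Split 'value/tag1/tag2' into (value, [tag1, tag2])."""
    value, *tags = word.split("/")
    return value, tags

def find_operand(words, tag, default=0):
    """Return the value of the first word carrying `tag`, wherever it occurs."""
    for word in words:
        value, tags = parse_word(word)
        if tag in tags:
            return int(value, 16)
    return default

def modrm_byte(words):
    """Pack the mod (2 bits), reg (3 bits) and r/m (3 bits) fields into one byte."""
    mod = find_operand(words, "mod")
    # The "reg" bits hold either a /subop or an /r32 operand.
    reg = find_operand(words, "subop", default=None)
    if reg is None:
        reg = find_operand(words, "r32")
    rm = find_operand(words, "rm32")
    return (mod << 6) | (reg << 3) | rm

# The same operands in two different orders yield the same byte.
in_order = "89/copy 3/mod/direct 3/rm32/ebx 0/r32/eax".split()
shuffled = "89/copy 0/r32/eax 3/rm32/ebx 3/mod/direct".split()
assert modrm_byte(in_order) == modrm_byte(shuffled)
```

Each lookup scans the whole instruction independently, so no state carries over from one bitfield to the next. Enforcing a canonical order would instead require tracking which operands have already been seen and which may still legally follow.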
-rw-r--r-- subx/Readme.md | 35
1 file changed, 21 insertions, 14 deletions
diff --git a/subx/Readme.md b/subx/Readme.md
index a0fff7ca..404f4dc6 100644
--- a/subx/Readme.md
+++ b/subx/Readme.md
@@ -303,15 +303,15 @@ Within the code segment, each line contains a comment, label or instruction.
 Comments start with a `#` and are ignored. Labels should always be the first
 word on a line, and they end with a `:`.
 
-Instructions consist of a sequence of opcode bytes and their operands. As
-mentioned above, each opcode and operand can contain _metadata_ after a `/`.
-Metadata can be either for SubX or act as a comment for the reader; SubX
-silently ignores unrecognized metadata. A single word can contain multiple
-pieces of metadata, each starting with a `/`.
-
-SubX uses metadata to express instruction encoding and get decent error
-messages. You must tag each instruction operand with the appropriate operand
-type:
+Instructions consist of a sequence of words. As mentioned above, each word can
+contain _metadata_ after a `/`. Metadata can be either required by SubX or act
+as a comment for the reader; SubX silently ignores unrecognized metadata. A
+single word can contain multiple pieces of metadata, each starting with a `/`.
+
+The words in an instruction consist of 1-3 opcode bytes, and different kinds
+of operands corresponding to the bitfields in an x86 instruction listed above.
+For error checking, these operands must be tagged with one of the following
+bits of metadata:
   - `mod`
   - `rm32` ("r/m" in the x86 instruction diagram above, but we can't use `/`
     in metadata tags)
@@ -321,11 +321,18 @@ type:
   - displacement: `disp8`, `disp16` or `disp32`
   - immediate: `imm8` or `imm32`
 
-You don't need to remember what order instruction operands are in,
-or pack bitfields by hand. SubX will do all that for you. If you get the types
-wrong, giving an instruction an incorrect operand or forgetting an operand,
-you should get a clear error message. Remember, don't use `subop` (sub-operand
-above) and `r32` (reg in the x86 figure above) in a single instruction.
+Different instructions (opcodes) require different operands. SubX will
+validate each instruction in your programs, and raise an error anytime you
+miss or spuriously add an operand.
+
+I recommend you order operands consistently in your programs. SubX allows
+operands in any order, but only because that's simplest to explain/implement.
+Switching order from instruction to instruction is likely to add to the
+reader's burden. Here's the order I've been using:
+
+```
+/subop  /mod /rm32  /base /index /scale  /r32  /displacement  /immediate
+```
 
 Instructions can refer to labels in displacement or immediate operands, and
 they'll obtain a value based on the address of the label: immediate operands