Diffstat (limited to 'mu_summary')
-rw-r--r-- | mu_summary | 259 |
1 files changed, 0 insertions, 259 deletions
diff --git a/mu_summary b/mu_summary
deleted file mode 100644
index 286f8286..00000000
--- a/mu_summary
+++ /dev/null
@@ -1,259 +0,0 @@
-Mu programs are lists of functions. Each function has the following form:
-
-  fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
-    _instructions_
-  }
-
-Each function has a header line, and some number of instructions, each on a
-separate line.
-
-Instructions may be primitives or function calls. Either way, all instructions
-have one of the following forms:
-
-  # defining variables
-  var _name_: _type_
-  var _name_/_register_: _type_
-
-  # doing things with variables
-  _operation_ _inouts_
-  _outputs_ <- _operation_ _inouts_
-
-Instructions and functions may have inouts and outputs. Both inouts and
-outputs are variables.
-
-As seen above, variables can be defined to live in a register, like this:
-
-  n/eax
-
-Variables not assigned a register live in the stack.
-
-Function inouts must always be on the stack, and outputs must always be in
-registers. A function call must always write to the exact registers its
-definition requires. For example:
-
-  fn foo -> x/eax: int {
-    ...
-  }
-  fn main {
-    a/eax <- foo  # ok
-    a/ebx <- foo  # wrong
-  }
-
-Primitive inouts may be on the stack or in registers, but outputs must always
-be in registers.
-
-Functions can contain nested blocks inside { and }. Variables defined in a
-block don't exist outside it.
-
-  {
-    _instructions_
-    {
-      _more instructions_
-    }
-  }
-
-Blocks can be named like so:
-
-  $name: {
-    _instructions_
-  }
-
-## Primitive instructions
-
-Primitive instructions currently supported in Mu ('n' indicates a literal
-integer rather than a variable, and 'var/reg' indicates a variable in a
-register):
-
-  var/reg <- increment
-  increment var
-  var/reg <- decrement
-  decrement var
-  var1/reg1 <- add var2/reg2
-  var/reg <- add var2
-  add-to var1, var2/reg
-  var/reg <- add n
-  add-to var, n
-
-  var1/reg1 <- sub var2/reg2
-  var/reg <- sub var2
-  sub-from var1, var2/reg
-  var/reg <- sub n
-  sub-from var, n
-
-  var1/reg1 <- and var2/reg2
-  var/reg <- and var2
-  and-with var1, var2/reg
-  var/reg <- and n
-  and-with var, n
-
-  var1/reg1 <- or var2/reg2
-  var/reg <- or var2
-  or-with var1, var2/reg
-  var/reg <- or n
-  or-with var, n
-
-  var1/reg1 <- xor var2/reg2
-  var/reg <- xor var2
-  xor-with var1, var2/reg
-  var/reg <- xor n
-  xor-with var, n
-
-  var/reg <- copy var2/reg2
-  copy-to var1, var2/reg
-  var/reg <- copy var2
-  var/reg <- copy n
-  copy-to var, n
-
-  compare var1, var2/reg
-  compare var1/reg, var2
-  compare var/eax, n
-  compare var, n
-
-  var/reg <- multiply var2
-
-Notice that there are no primitive instructions operating on two variables in
-memory. That's a restriction of the underlying x86 processor.
-
-Any instruction above that takes a variable in memory can be replaced with a
-dereference (`*`) of an address variable in a register. But you can't dereference
-variables in memory.
-
-## Byte operations
-
-A special-case is variables of type 'byte'. Mu is a 32-bit platform so for the
-most part only supports types that are multiples of 32 bits. However, we do
-want to support strings in ASCII and UTF-8, which will be arrays of bytes.
-
-Since most x86 instructions implicitly load 32 bits at a time from memory,
-variables of type 'byte' are only allowed in registers, not on the stack. Here
-are the possible instructions for reading bytes to/from memory:
-
-  var/reg <- copy-byte var2/reg2      # var: byte, var2: byte
-  var/reg <- copy-byte *var2/reg2     # var: byte, var2: (addr byte)
-  copy-byte-to *var1/reg1, var2/reg2  # var1: (addr byte), var2: byte
-
-In addition, variables of type 'byte' are restricted to (the lowest bytes of)
-just 4 registers: eax, ecx, edx and ebx.
-
-## Primitive jump instructions
-
-There are two kinds of jumps, both with many variations: `break` and `loop`.
-`break` instructions jump to the end of the containing block. `loop` instructions
-jump to the beginning of the containing block.
-
-Jumps can take an optional label starting with '$':
-
-  loop $foo
-
-This instruction jumps to the beginning of the block called $foo. It must lie
-somewhere inside such a block. Jumps are only legal to containing blocks. Use
-named blocks with restraint; jumps to places far away can get confusing.
-
-There are two unconditional jumps:
-
-  loop
-  loop label
-  break
-  break label
-
-The remaining jump instructions are all conditional. Conditional jumps rely on
-the result of the most recently executed `compare` instruction. (To keep
-programs easy to read, keep compare instructions close to the jump that uses
-them.)
-
-  break-if-=
-  break-if-= label
-  break-if-!=
-  break-if-!= label
-
-Inequalities are similar, but have unsigned and signed variants. We assume
-unsigned variants are only ever used to compare addresses.
-
-  break-if-<
-  break-if-< label
-  break-if->
-  break-if-> label
-  break-if-<=
-  break-if-<= label
-  break-if->=
-  break-if->= label
-
-  break-if-addr<
-  break-if-addr< label
-  break-if-addr>
-  break-if-addr> label
-  break-if-addr<=
-  break-if-addr<= label
-  break-if-addr>=
-  break-if-addr>= label
-
-Similarly, conditional loops:
-
-  loop-if-=
-  loop-if-= label
-  loop-if-!=
-  loop-if-!= label
-
-  loop-if-<
-  loop-if-< label
-  loop-if->
-  loop-if-> label
-  loop-if-<=
-  loop-if-<= label
-  loop-if->=
-  loop-if->= label
-
-  loop-if-addr<
-  loop-if-addr< label
-  loop-if-addr>
-  loop-if-addr> label
-  loop-if-addr<=
-  loop-if-addr<= label
-  loop-if-addr>=
-  loop-if-addr>= label
-
-## Address operations
-
-  var/reg: (addr T) <- address var: T  # var must be in mem (on the stack)
-
-## Array operations
-
-  var/reg: int <- length arr/reg: (addr array T)
-  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
-  var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
-  var/reg: (addr T) <- index arr/reg: (addr array T), n
-  var/reg: (addr T) <- index arr: (array T sz), n
-
-  var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int  # arr can be in reg or mem
-  var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int      # arr can be in reg or mem
-  var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)
-
-## User-defined types
-
-  var/reg: (addr T_f) <- get var/reg: (addr T), f
-    where record (product) type T has elements a, b, c, ... of types T_a, T_b, T_c, ...
-  var/reg: (addr T_f) <- get var: T, f
-
-## Handles for safe access to the heap
-
-Say we created a handle like this on the stack (it can't be in a register)
-  var x: (handle T)
-  allocate Heap, T, x
-
-You can copy handles to another variable on the stack like this:
-  var y: (handle T)
-  copy-handle-to y, x
-
-You can also save handles inside other user-defined types like this:
-  var y/reg: (addr handle T_f) <- get var: (addr T), f
-  copy-handle-to *y, x
-
-Or this:
-  var y/reg: (addr handle T) <- index arr: (addr array handle T), n
-  copy-handle-to *y, x
-
-Handles can be converted into addresses like this:
-  var y/reg: (addr T) <- lookup x
-
-It's illegal to continue to use this addr after a function that reclaims heap
-memory. You have to repeat the lookup.
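
For readers of this commit who want to see how the rules in the removed summary fit together, here is a minimal sketch in Mu. It uses only instruction forms listed in the deleted file (`copy`, `compare`, `add`, `increment`, `break-if->`, `loop`). The function name `sum-to` and the variable names are hypothetical, the sketch assumes a function may write to its named output as it would to any register variable, and it has not been run through the Mu translator.

```
# Hypothetical example: add up the integers 1..n.
# The inout 'n' lives on the stack; the output 'result' lives in eax.
fn sum-to n: int -> result/eax: int {
  result <- copy 0            # var/reg <- copy n (literal)
  var i/ecx: int <- copy 1    # loop counter kept in a register
  {
    compare i, n              # compare var1/reg, var2 (var2 in memory)
    break-if->                # signed comparison: leave the block once i > n
    result <- add i           # var1/reg1 <- add var2/reg2
    i <- increment            # var/reg <- increment
    loop                      # jump back to the start of the block
  }
}
```

The counter stays in a register because the summary lists no primitive that operates on two variables in memory, so `compare i, n` needs at least one register operand.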