Mu programs are lists of functions. Each function has the following form:
fn _name_ _inouts_with_types_ -> _outputs_with_types_ {
_instructions_
}
Each function has a header line, and some number of instructions, each on a
separate line.
Instructions may be primitives or function calls. Either way, all instructions
have one of the following forms:
# defining variables
var _name_: _type_
var _name_/_register_: _type_
# doing things with variables
_operation_ _inouts_
_outputs_ <- _operation_ _inouts_
Instructions and functions may have inouts and outputs. Both inouts and
outputs are variables.
As seen above, variables can be defined to live in a register, like this:
n/eax
Variables not assigned a register live in the stack.
Function inouts must always be on the stack, and outputs must always be in
registers. A function call must always write to the exact registers its
definition requires. For example:
fn foo -> x/eax: int {
...
}
fn main {
a/eax <- foo # ok
a/ebx <- foo # wrong
}
Primitive inouts may be on the stack or in registers, but outputs must always
be in registers.
Functions can contain nested blocks inside { and }. Variables defined in a
block don't exist outside it.
{
_instructions_
{
_more instructions_
}
}
Blocks can be named like so:
$name: {
_instructions_
}
## Primitive instructions
Primitive instructions currently supported in Mu ('n' indicates a literal
integer rather than a variable, and 'var/reg' indicates a variable in a
register):
var/reg <- increment
increment var
var/reg <- decrement
decrement var
var1/reg1 <- add var2/reg2
var/reg <- add var2
add-to var1, var2/reg
var/reg <- add n
add-to var, n
var1/reg1 <- sub var2/reg2
var/reg <- sub var2
sub-from var1, var2/reg
var/reg <- sub n
sub-from var, n
var1/reg1 <- and var2/reg2
var/reg <- and var2
and-with var1, var2/reg
var/reg <- and n
and-with var, n
var1/reg1 <- or var2/reg2
var/reg <- or var2
or-with var1, var2/reg
var/reg <- or n
or-with var, n
var1/reg1 <- xor var2/reg2
var/reg <- xor var2
xor-with var1, var2/reg
var/reg <- xor n
xor-with var, n
var1/reg1 <- copy var2/reg2
copy-to var1, var2/reg
var/reg <- copy var2
var/reg <- copy n
copy-to var, n
compare var1, var2/reg
compare var1/reg, var2
compare var/eax, n
compare var, n
var/reg <- multiply var2
Notice that there are no primitive instructions operating on two variables in
memory. That's a restriction of the underlying x86 processor.
Any instruction above that takes a variable in memory can be replaced with a
dereference (`*`) of an address variable in a register. But you can't dereference
variables in memory.
## Primitive jump instructions
There are two kinds of jumps, both with many variations: `break` and `loop`.
`break` instructions jump to the end of the containing block. `loop` instructions
jump to the beginning of the containing block.
Jumps can take an optional label starting with '$':
loop $foo
This instruction jumps to the beginning of the block called $foo. It must lie
somewhere inside such a block. Jumps are only legal to containing blocks. Use
named blocks with restraint; jumps to places far away can get confusing.
There are two unconditional jumps:
loop
loop label
break
break label
The remaining jump instructions are all conditional. Conditional jumps rely on
the result of the most recently executed `compare` instruction. (To keep
programs easy to read, keep compare instructions close to the jump that uses
them.)
break-if-=
break-if-= label
break-if-!=
break-if-!= label
Inequalities are similar, but have unsigned and signed variants. We assume
unsigned variants are only ever used to compare addresses.
break-if-<
break-if-< label
break-if->
break-if-> label
break-if-<=
break-if-<= label
break-if->=
break-if->= label
break-if-addr<
break-if-addr< label
break-if-addr>
break-if-addr> label
break-if-addr<=
break-if-addr<= label
break-if-addr>=
break-if-addr>= label
Similarly, conditional loops:
loop-if-=
loop-if-= label
loop-if-!=
loop-if-!= label
loop-if-<
loop-if-< label
loop-if->
loop-if-> label
loop-if-<=
loop-if-<= label
loop-if->=
loop-if->= label
loop-if-addr<
loop-if-addr< label
loop-if-addr>
loop-if-addr> label
loop-if-addr<=
loop-if-addr<= label
loop-if-addr>=
loop-if-addr>= label
## Address operations
var/reg: (addr T) <- address var: T # var must be in mem (on the stack)
## Array operations
var/reg: int <- length arr/reg: (addr array T)
var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: int
var/reg: (addr T) <- index arr: (array T sz), idx/reg: int
var/reg: (addr T) <- index arr/reg: (addr array T), n
var/reg: (addr T) <- index arr: (array T sz), n
var/reg: (offset T) <- compute-offset arr: (addr array T), idx/reg: int # arr can be in reg or mem
var/reg: (offset T) <- compute-offset arr: (addr array T), idx: int # arr can be in reg or mem
var/reg: (addr T) <- index arr/reg: (addr array T), idx/reg: (offset T)
## User-defined types
var/reg: (addr T_f) <- get var/reg: (addr T), f
where record (product) type T has elements a, b, c, ... of types T_a, T_b, T_c, ...
var/reg: (addr T_f) <- get var: T, f
## Handles for safe access to the heap
Say we created a handle like this on the stack (it can't be in a register)
var x: (handle T)
allocate Heap, T, x
You can copy handles to another variable on the stack like this:
var y: (handle T)
copy-handle-to y, x
You can also save handles inside other user-defined types like this:
var y/reg: (addr handle T_f) <- get var: (addr T), f
copy-handle-to *y, x
Or this:
var y/reg: (addr handle T) <- index arr: (addr array handle T), n
copy-handle-to *y, x
Handles can be converted into addresses like this:
var y/reg: (addr T) <- lookup x
It's illegal to continue to use this addr after a function that reclaims heap
memory. You have to repeat the lookup.