Diffstat (limited to 'archive/3.transect/compiler3')
-rw-r--r-- | archive/3.transect/compiler3 | 73 |
1 files changed, 0 insertions, 73 deletions
diff --git a/archive/3.transect/compiler3 b/archive/3.transect/compiler3
deleted file mode 100644
index 6bc6bf85..00000000
--- a/archive/3.transect/compiler3
+++ /dev/null
@@ -1,73 +0,0 @@
-== Goal
-
-A memory-safe language with a simple translator to x86 that can be feasibly written in x86.
-
-== Definitions of terms
-
-Memory-safe: it should be impossible to:
-  a) create a pointer out of arbitrary data, or
-  b) to access heap memory after it's been freed.
-
-Simple: do all the work in a 2-pass translator:
-  Pass 1: check each instruction's types in isolation.
-  Pass 2: emit code for each instruction in isolation.
-
-== Implications
-
-=> Each instruction matches a pattern and yields a template to emit.
-=> There's a 1-to-1 mapping between instructions in the source language and x86 machine code.
-  Zero runtime.
-=> Programmers have to decide how to use registers.
-=> Translator can't insert any instructions that write to registers. (We don't know if a register is in use.)
-
-== Lessons from Mu
-
-1. For easy bounds checking, never advance pointers to arrays or heap allocations. No pointer arithmetic.
-2. Store the array length with the array.
-3. Store an allocation id with heap allocations. Allocation id goes monotonically up, never gets reused. When it wraps around to zero the program panics.
-4. Heap pointers also carry around allocation id.
-5. When dereferencing a heap pointer, first ensure its alloc id matches the alloc id of the payload. This ensures some other copy of the pointer didn't get freed (and potentially reused)
-
-== Problem 1
-
-How to index into an array?
-
-  The array has a length that needs to be checked.
-  Its elements have a type T.
-  The base will be in memory, either on the stack or the heap.
-  The index may be in the register, stack or heap.
-
-That's too much work to do in a single instruction.
-
-So arrays have to take multiple steps. And we have to guard against the steps
-being misused in unsafe ways.
-
-To index into an array with elements of type T, starting with the size of the
-array in bytes:
-
-  step 1: get the offset the index is at
-    <reg offset> : (index T) <- index <reg/mem idx> : int, <literal> : (size T)
-  step 2: convert the array to address-of-element
-    <reg x> : (address T) <- advance <reg/mem A> : (array T), <reg offset> : (index T)
-    implicitly compares the offset with the size, panic if greater
-    =>
-      compare <reg offset> : (index T), <reg/mem> : (array T)
-      jge panic
-  step 3: use the address to the element
-    ...
-
-(index T) is a special type. You can do only two things with it:
-  - pass it to the advance instruction
-  - convert it to a number (but no converting back)
-
-(address T) is a short-term pointer. You can't store addresses in structs, you
-can't define global variables of that type, and you can't pass the type to the
-memory allocator to save to the heap. Only place you can store an (address T)
-is on the stack or a register.
-
-[But you can still be holding an address in a long-lived stack frame after
-it's been freed?!]
-
-== Problem 2
-
-How to dereference a heap allocation?
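
The deleted notes describe two runtime checks: the index/advance bounds check from Problem 1, and the allocation-id comparison from lessons 3 to 5. The sketch below models both in Python purely as an illustration; it is not code from the Mu translator. Every name in it (Panic, Index, Array, Slot, HeapPointer, Heap, lookup) is hypothetical. It also makes assumptions the notes do not state: alloc id 0 marks a freed slot, ids are capped at 32 bits, and negative indices are rejected. The bounds check follows the `jge` in the notes, so an offset equal to the size panics as well.

```python
# Illustrative model only; all names are hypothetical, not from the Mu sources.
from dataclasses import dataclass


class Panic(Exception):
    """Stands in for the jump to `panic` in the notes."""


@dataclass(frozen=True)
class Index:
    """Models (index T): a byte offset meant only to be passed to advance()."""
    offset: int


@dataclass
class Array:
    """Models (array T): the size is stored with the elements (lesson 2)."""
    elem_size: int
    data: list

    @property
    def size_in_bytes(self) -> int:
        return self.elem_size * len(self.data)


def index(idx: int, elem_size: int) -> Index:
    # Step 1: scale the index by the literal (size T).
    return Index(idx * elem_size)


def advance(array: Array, off: Index):
    # Step 2: compare the offset with the size in bytes, panic if >= (jge).
    # Rejecting negative offsets is an assumption of this sketch.
    if off.offset < 0 or off.offset >= array.size_in_bytes:
        raise Panic("index out of bounds")
    # A real translator would yield an (address T); here we return the element.
    return array.data[off.offset // array.elem_size]


@dataclass
class Slot:
    """A heap payload. alloc_id 0 means 'free' (assumption of this sketch)."""
    alloc_id: int = 0
    value: object = None


@dataclass(frozen=True)
class HeapPointer:
    """Lesson 4: the pointer carries the alloc id it was created with."""
    alloc_id: int
    slot: Slot


def lookup(ptr: HeapPointer):
    # Lesson 5: the pointer's alloc id must match the payload's alloc id.
    if ptr.alloc_id != ptr.slot.alloc_id:
        raise Panic("stale heap pointer")
    return ptr.slot.value


class Heap:
    def __init__(self, max_alloc_id: int = 2**32 - 1):  # width is assumed
        self.next_id = 1
        self.max_alloc_id = max_alloc_id
        self.free_slots = []

    def allocate(self, value) -> HeapPointer:
        # Lesson 3: ids only go up; panic instead of wrapping around.
        if self.next_id > self.max_alloc_id:
            raise Panic("allocation ids exhausted")
        slot = self.free_slots.pop() if self.free_slots else Slot()
        slot.alloc_id, slot.value = self.next_id, value
        self.next_id += 1
        return HeapPointer(slot.alloc_id, slot)

    def free(self, ptr: HeapPointer) -> None:
        lookup(ptr)  # freeing through a stale pointer panics too
        ptr.slot.alloc_id, ptr.slot.value = 0, None
        self.free_slots.append(ptr.slot)
```

In this model, a copy of a pointer made before `free` fails the `lookup` check even after the slot has been reused by a later allocation, because the new allocation gets a fresh id. The model does not address the bracketed concern in the notes about an (address T) outliving its allocation in a long-lived stack frame.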