about summary refs log tree commit diff stats
path: root/transect/compiler6
diff options
context:
space:
mode:
authorKartik Agaram <vc@akkartik.com>2018-09-17 22:57:10 -0700
committerKartik Agaram <vc@akkartik.com>2018-09-17 22:57:58 -0700
commitf09280141f18fbe8cef0ed576cf932e12e315666 (patch)
treed00962b07cb013f89d4fdb2fcf19c392afb62b5c /transect/compiler6
parent0a7b03727a736f73c16d37b22afef8496c60d657 (diff)
downloadmu-f09280141f18fbe8cef0ed576cf932e12e315666.tar.gz
4548: start of a compiler for a new experimental low-level language
Diffstat (limited to 'transect/compiler6')
-rw-r--r--transect/compiler636
1 files changed, 36 insertions, 0 deletions
diff --git a/transect/compiler6 b/transect/compiler6
new file mode 100644
index 00000000..48a7030f
--- /dev/null
+++ b/transect/compiler6
@@ -0,0 +1,36 @@
+== Goal
+
+A memory-safe language with a simple translator to x86 that can be feasibly written in x86.
+
+== Definitions of terms
+
+Memory-safe: it should be impossible to:
+  a) create a pointer out of arbitrary data, or
+  b) to access heap memory after it's been freed.
+
+Simple: do all the work in a 2-pass translator:
+  Pass 1: check each instruction's types in isolation.
+  Pass 2: emit code for each instruction in isolation.
+
+== types
+
+int
+char
+(address _)
+(array _ n)
+(ref _)
+
+addresses can't be saved to stack or global,
+      or included in compound types
+      or used across a call (to eliminate possibility of free)
+
+<reg x> : (address T) <- advance <reg/mem> : (array T), <reg offset> : (index T)
+
+arrays require a size
+(ref array _) may not include a size
+
+== open questions
+Is argv an address?
+Global variables are easiest to map to addresses.
+Ideally we'd represent 'indirect' as a '*' and we could just count to make
+sure that an instruction never has more than one '*'.