about summary refs log tree commit diff stats
diff options
context:
space:
mode:
-rw-r--r--awk/scheme/scheme/README.md309
-rw-r--r--awk/scheme/scheme/TODO.md19
-rwxr-xr-xawk/scheme/scheme/bin/compiler.awk4
-rwxr-xr-xawk/scheme/scheme/bin/vm.awk15
-rw-r--r--awk/scheme/scheme/diagram.md66
5 files changed, 202 insertions, 211 deletions
diff --git a/awk/scheme/scheme/README.md b/awk/scheme/scheme/README.md
index 6fcafe6..7711442 100644
--- a/awk/scheme/scheme/README.md
+++ b/awk/scheme/scheme/README.md
@@ -1,189 +1,186 @@
 # Awk-Scheme
 
 ## Overview
-This is a minimal Scheme implementation supporting basic arithmetic operations, list manipulation, and comparisons. All values are displayed with type tags (e.g., "N:42" for numbers).
+A Scheme interpreter implemented in AWK, featuring:
+- A compiler that translates Scheme expressions to stack-based VM instructions
+- A virtual machine that executes the compiled code
+- Support for basic arithmetic, list operations, functions, and variable bindings
+- Persistent global state between REPL sessions
+
+## Architecture
+
+### Components
+1. **Compiler** (`bin/compiler.awk`):
+   - Lexical analyzer for tokenizing input
+   - Recursive descent parser for Scheme expressions
+   - Code generator that produces stack-based VM instructions
+   - Handles nested expressions and proper scoping
+
+2. **Virtual Machine** (`bin/vm.awk`):
+   - Stack-based execution model
+   - Typed value system with runtime type checking
+   - Environment-based variable binding
+   - Function call/return mechanism
+   - Heap memory for cons cells
+   - State persistence for globals and functions
+
+3. **REPL** (`bin/repl`):
+   - Interactive command line interface
+   - Multi-line input support
+   - State persistence between sessions
+   - Debug mode support
+
+### Data Types
+All values are tagged with their type:
+- `N:value` - Numbers (integers)
+- `B:value` - Booleans (0/1)
+- `P:index` - Pairs (cons cells)
+- `F:name` - Functions
+- `NIL:` - Empty list
+- `S:name` - Symbols
+
+## Usage
+
+### Running the REPL
+```bash
+# Start interactive REPL
+./scheme
+
+# Run a Scheme file
+./scheme myprogram.scm
+
+# Enable debug output
+DEBUG=1 ./scheme
+```
 
-## Data Types
+### Basic Operations
 
-### Numbers
-- Represented as: `N:value`
-- Examples: `42`, `-5`, `123`
+1. **Arithmetic**:
 ```scheme
-scheme> 42
-N:42
+scheme> (+ 1 2 3)
+N:6
+scheme> (* 4 (- 10 5))
+N:20
 ```
 
-### Booleans
-- Represented as: `B:value` (1 for true, 0 for false)
-- Generated by comparison operations
+2. **Comparisons**:
 ```scheme
 scheme> (< 1 2)
 B:1
+scheme> (= 42 42)
+B:1
 ```
 
-### Nil (Empty List)
-- Represented as: `NIL:`
-- Used for list termination
+3. **Lists**:
 ```scheme
-scheme> nil
-NIL:
+scheme> (cons 1 (cons 2 nil))
+P:1
+scheme> (car (cons 1 2))
+N:1
 ```
 
-### Pairs
-- Represented as: `P:index`
-- Created using cons
-- Stored in heap with car and cdr values
-
-## Supported Operations
-
-### Arithmetic Operations
-1. Addition: `(+ x y ...)`
-   ```scheme
-   scheme> (+ 1 2)
-   N:3
-   scheme> (+ 1 2 3)
-   N:6
-   ```
-
-2. Subtraction: `(- x y ...)`
-   ```scheme
-   scheme> (- 5 3)
-   N:2
-   scheme> (- 10 2 3)  ; 10 - 2 - 3
-   N:5
-   ```
-
-3. Multiplication: `(* x y ...)`
-   ```scheme
-   scheme> (* 3 4)
-   N:12
-   scheme> (* 2 3 4)
-   N:24
-   ```
-
-4. Division: `(/ x y)`
-   ```scheme
-   scheme> (/ 10 2)
-   N:5
-   ```
-
-### Comparison Operations
-1. Less Than: `(< x y)`
-   ```scheme
-   scheme> (< 1 2)
-   B:1
-   scheme> (< 2 1)
-   B:0
-   ```
-
-2. Equality: `(= x y)`
-   ```scheme
-   scheme> (= 42 42)
-   B:1
-   ```
-
-### List Operations
-1. Cons: `(cons x y)`
-   - Creates a pair from two values
-   ```scheme
-   scheme> (cons 1 2)
-   P:1
-   scheme> (cons 1 nil)
-   P:1
-   ```
-
-2. Car: `(car pair)`
-   - Gets the first element of a pair
-   ```scheme
-   scheme> (car (cons 1 2))
-   N:1
-   ```
-
-3. Cdr: `(cdr pair)`
-   - Gets the second element of a pair
-   ```scheme
-   scheme> (cdr (cons 1 2))
-   N:2
-   ```
-
-### Building Lists
-Lists can be built using nested cons operations with nil as the terminator:
+4. **Variables**:
 ```scheme
-scheme> (cons 1 (cons 2 (cons 3 nil)))
-P:1  ; This represents the list (1 2 3)
+scheme> (define x 42)
+N:42
+scheme> x
+N:42
 ```
 
-### Define
-- Define a new variable or function
+5. **Functions**:
 ```scheme
-scheme> (define x 10)
-N:10
 scheme> (define add (x y) (+ x y))
-F:1
+scheme> (add 2 3)
+N:5
 ```
 
-### Let
-- Define local bindings within a let expression
+6. **Let Expressions**:
 ```scheme
 scheme> (let ((x 10) (y 20)) (+ x y))
 N:30
 ```
 
-## Expression Structure
-- All expressions must be properly parenthesized
-- Operators come first in a form (prefix notation)
-- Multiple expressions can be evaluated in sequence
-
-## REPL Features
-- Multi-line input supported (continues with "..." prompt until parentheses balance)
-- Ctrl+D to exit
-- Comments start with semicolon (;)
-
-## Error Handling
-The system will report errors for:
-- Stack underflow
-- Type mismatches
-- Unknown operations
-- Division by zero
-- Invalid list operations
-- Malformed expressions
-
-## Examples
-Here are some more complex examples:
-
-1. Nested arithmetic:
-```scheme
-scheme> (+ (* 3 4) (- 10 5))
-N:17
-```
-
-2. List construction and manipulation:
-```scheme
-scheme> (cons (+ 1 2) (cons (* 3 4) nil))
-P:1  ; Represents (3 12)
-```
-
-3. Combined operations:
-```scheme
-scheme> (car (cons (* 2 3) (+ 4 5)))
-N:6
-```
+## Implementation Details
+
+### Compiler Pipeline
+1. **Lexical Analysis**:
+   - Tokenizes input into numbers, symbols, and parentheses
+   - Handles whitespace and basic comments
+
+2. **Parsing**:
+   - Builds expression tree from tokens
+   - Validates syntax and expression structure
+
+3. **Code Generation**:
+   - Translates expressions to VM instructions
+   - Manages scope and variable bindings
+   - Handles function definitions
+
+### Virtual Machine
+1. **Stack Operations**:
+   - PUSH_CONST: Push constant value
+   - POP: Remove top value
+   - STORE: Store variable binding
+   - LOOKUP: Retrieve variable value
+
+2. **Function Calls**:
+   - CALL: Invoke function
+   - RETURN: Return from function
+   - Environment frame management
+
+3. **Memory Management**:
+   - Stack-based evaluation
+   - Simple heap for cons cells
+   - Basic reference counting (not fully implemented)
+
+### State Persistence
+- Global variables and functions persist between sessions
+- State stored in `/tmp/scheme_vm.state` and `/tmp/scheme_vm.env`
+- Automatic state loading/saving
+
+## Extending the System
+
+### Adding New Special Forms
+1. Add parsing in `compile_expr()`
+2. Implement code generation function
+3. Add corresponding VM instructions if needed
+
+### Adding New Data Types
+1. Define new type tag in VM
+2. Add type checking predicates
+3. Implement value construction/access
+4. Update relevant operations
+
+### Adding VM Instructions
+1. Add instruction handling in `execute()`
+2. Implement instruction logic
+3. Update compiler to generate new instruction
+
+### Debugging
+- Enable debug output: `DEBUG=1 ./scheme`
+- Debug messages show:
+  - Lexical analysis
+  - Parsing steps
+  - Code generation
+  - VM execution
+  - Stack operations
+  - Environment changes
 
 ## Limitations
 Current implementation does not support:
-- Variables or definition
-- Functions or lambda expressions
-- Control structures (if, cond)
-- Quote or quasiquote
-- String data type
-- Input/output operations
-- Standard library functions
+- Floating point numbers
+- Strings
+- Proper tail recursion
+- Garbage collection
+- Error recovery in REPL
+- Full numeric tower
+- Macros
+- Standard library
 
 ## Future Enhancements
-Possible additions could include:
-1. Let expressions for local bindings
-2. Function definitions
-3. Conditional expressions
-4. More numeric operations
-5. String support
-6. Additional list operations
\ No newline at end of file
+1. Proper tail call optimization
+2. Garbage collection
+3. Error recovery
+4. More data types
+5. Standard library
+6. Better debugging tools
\ No newline at end of file
diff --git a/awk/scheme/scheme/TODO.md b/awk/scheme/scheme/TODO.md
deleted file mode 100644
index 52218a8..0000000
--- a/awk/scheme/scheme/TODO.md
+++ /dev/null
@@ -1,19 +0,0 @@
-# Limitations
-Current implementation does not support:
-
-- Variables or definition
-- Functions or lambda expressions
-- Control structures (if, cond)
-- Quote or quasiquote
-- String data type
-- Input/output operations
-- Standard library functions
-
-# Future Enhancements
-Possible additions could include:
-
-- Let expressions for local bindings
-- Function definitions
-- Conditional expressions
-- More numeric operations
-- String support
diff --git a/awk/scheme/scheme/bin/compiler.awk b/awk/scheme/scheme/bin/compiler.awk
index 738ca9c..db48388 100755
--- a/awk/scheme/scheme/bin/compiler.awk
+++ b/awk/scheme/scheme/bin/compiler.awk
@@ -11,8 +11,8 @@ BEGIN {
     next_label = 0       # Counter for generating unique labels
     program = ""         # Accumulates the full program text
     
-    # Debug flag for development/troubleshooting
-    DEBUG = 0
+    # Debug mode disabled by default, can be enabled via DEBUG=1 environment variable
+    DEBUG = (ENVIRON["DEBUG"] == "1") ? 1 : 0
 }
 
 # Debug logging helper function
diff --git a/awk/scheme/scheme/bin/vm.awk b/awk/scheme/scheme/bin/vm.awk
index 1bd9db1..2c56fda 100755
--- a/awk/scheme/scheme/bin/vm.awk
+++ b/awk/scheme/scheme/bin/vm.awk
@@ -21,8 +21,8 @@ BEGIN {
     heap_ptr = 0      # Points to next free heap location
     pc = 0            # Program counter for instruction fetch
     
-    # Debug mode toggle
-    DEBUG = 0
+    # Debug mode disabled by default, can be enabled via DEBUG=1 environment variable
+    DEBUG = (ENVIRON["DEBUG"] == "1") ? 1 : 0
 
     # Environment for variable bindings
     env_size = 0     # Current size of environment stack
@@ -113,11 +113,15 @@ function makeValue(type, val) {
 }
 
 function getType(val) {
-    return substr(val, 1, index(val, ":") - 1)
+    type = substr(val, 1, index(val, ":") - 1)
+    debug("Get type: " type " from " val)
+    return type
 }
 
 function getValue(val) {
-    return substr(val, index(val, ":") + 1)
+    value = substr(val, index(val, ":") + 1)
+    debug("Get value: " value " from " val)
+    return value
 }
 
 # Type checking predicates
@@ -143,6 +147,7 @@ function pop() {
 
 function peek() {
     if (stack_ptr < 1) error("Stack empty")
+    debug("Peek: " stack[stack_ptr])
     return stack[stack_ptr]
 }
 
@@ -247,6 +252,7 @@ function vm_equal() {
     val2 = pop()
     val1 = pop()
     result = (val1 == val2) ? "1" : "0"
+    debug("Equal comparison: " val1 " == " val2 " -> " result)
     push(makeValue(T_BOOLEAN, result))
 }
 
@@ -257,6 +263,7 @@ function vm_less_than() {
     if (!isNumber(val1) || !isNumber(val2))
         error("LT requires numeric operands")
     result = (getValue(val1) < getValue(val2)) ? "1" : "0"
+    debug("Less than comparison: " val1 " < " val2 " -> " result)
     push(makeValue(T_BOOLEAN, result))
 }
 
diff --git a/awk/scheme/scheme/diagram.md b/awk/scheme/scheme/diagram.md
index 2c48d7c..4a719b4 100644
--- a/awk/scheme/scheme/diagram.md
+++ b/awk/scheme/scheme/diagram.md
@@ -1,6 +1,5 @@
-# Awk-Scheme Diagram
-
-How this is all orchestrated.
+# Awk-Scheme Architecture
+## Component Interaction Diagram
 
 ```
 +----------------+     Scheme Code      +----------------+    Assembly     +----------------+
@@ -8,36 +7,43 @@ How this is all orchestrated.
 |      REPL      |   "(+ 1 2)"          |    Compiler    |  "PUSH_CONST    |       VM       |
 |   (bin/repl)   |                      | compiler.awk   |   N:1           |    vm.awk      |
 |                |                      |                |   PUSH_CONST    |                |
-|  User Input    |                      | Translates     |   N:2           |  Stack-based   |
-|  Interface     |                      | Scheme to      |   ADD           |  Interpreter   |
-|                |                      | VM Assembly    |   HALT"         |                |
-|                | <------------------  |                | <-------------  |                |
+|  - Multi-line  |                      | - Lexer        |   N:2           | - Stack-based  |
+|  - Debug mode  |                      | - Parser       |   ADD           | - Type system  |
+|  - File input  |                      | - Code gen     |   HALT"         | - Environment  |
+|                | <-----------------   |                | <-------------  |                |
 |                |  Output: "N:3"       |                |   Result        |                |
 +----------------+                      +----------------+                 +----------------+
-       ^                                                                       |
-       |                                                                       |
-       |                              Data Flow                                |
-       |                                                                       v
-+------+-------------------------------------------------------------------------+
-|                                                                                |
-|  1. REPL (bin/repl) reads Scheme expression from user                          |
-|  2. Expression sent to compiler (bin/compiler.awk)                             |
-|  3. Compiler generates VM assembly instructions                                |
-|  4. VM (bin/vm.awk) executes assembly and maintains:                           |
-|     - Stack: For operation values                                              |
-|     - Heap: For storing pairs/lists                                            |
-|     - Type tags: N:(number), B:(boolean), P:(pair), NIL:                       |
-|  5. Result returned through REPL to user                                       |
-|                                                                                |
-+--------------------------------------------------------------------------------+
+       ^                                                                      |
+       |                                                                      v
+       |                                                              +----------------+
+       |                                                              |  Persistence   |
+       |                                                              | /tmp files:    |
+       +--------------------------------------------------------------+ - vm.state     |
+                                                                      | - vm.env       |
+                                                                      +----------------+
+
+Debug Flow (when DEBUG=1):
++----------------+     Debug Info      +----------------+    Debug Info    +----------------+
+|     REPL       | ----------------->  |   Compiler     | ------------->   |      VM        |
+|                |  [Input received]   |                | [Tokens found]   |                |
+| [Debug output] |                     | [Parsing tree] | [Stack ops]      | [Stack state]  |
+|     stderr     | <-----------------  |  [Gen code]    | <-------------   | [Environment]  |
++----------------+                     +----------------+                  +----------------+
+
+Execution Flow Example:
+┌─────────────┐         ┌─────────────┐         ┌─────────────┐         ┌─────────────┐
+│   Input:    │         │  Lexer/     │         │  Code Gen   │         │     VM      │
+│  (+ 1 2)    │ ------> │  Parser:    │ ------> │ PUSH_CONST  │ ------> │ Stack:      │
+│             │         │  (+         │         │   N:1       │         │  [N:1]      │
+│             │         │   1         │         │ PUSH_CONST  │         │  [N:1,N:2]  │
+│             │         │   2)        │         │   N:2       │         │  [N:3]      │
+│             │         │             │         │ ADD         │         │             │
+└─────────────┘         └─────────────┘         └─────────────┘         └─────────────┘
 
-Example Flow:
+State Management:
 ┌─────────────┐         ┌─────────────┐         ┌─────────────┐
-│   Input:    │         │  Compiler   │         │     VM      │
-│  (+ 1 2)    │    →    │ PUSH_CONST  │    →    │ Stack:      │
-│             │         │   N:1       │         │  [N:1]      │
-│             │         │ PUSH_CONST  │         │  [N:1,N:2]  │
-│             │         │   N:2       │         │  [N:3]      │
-│             │         │ ADD         │         │             │
+│   Global    │         │ Environment │         │  Function   │
+│  Variables  │ ------> │   Stack     │ ------> │   Calls     │
+│ (persist)   │         │ (frames)    │         │ (stack)     │
 └─────────────┘         └─────────────┘         └─────────────┘
 ```
\ No newline at end of file