diff options
Diffstat (limited to 'awk/scheme')
-rw-r--r-- | awk/scheme/scheme/README.md | 309 | ||||
-rw-r--r-- | awk/scheme/scheme/TODO.md | 19 | ||||
-rw-r--r-- | awk/scheme/scheme/TODO.txt | 69 | ||||
-rwxr-xr-x | awk/scheme/scheme/bin/compiler.awk | 250 | ||||
-rwxr-xr-x | awk/scheme/scheme/bin/vm.awk | 365 | ||||
-rw-r--r-- | awk/scheme/scheme/diagram.md | 66 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/awk-scheme-prompt.md | 189 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/awk-vm-implementation-prompt.md | 226 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/scheme-compilation-examples.md | 222 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/scheme-compiler-impl.md | 307 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/scheme-vm-prompt.md | 112 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/scheme-vm-spec.md | 130 | ||||
-rw-r--r-- | awk/scheme/scheme/docs/test-program.md | 48 | ||||
-rw-r--r-- | awk/scheme/scheme/examples/lambda.test.scm | 12 |
14 files changed, 726 insertions, 1598 deletions
diff --git a/awk/scheme/scheme/README.md b/awk/scheme/scheme/README.md index 6fcafe6..7711442 100644 --- a/awk/scheme/scheme/README.md +++ b/awk/scheme/scheme/README.md @@ -1,189 +1,186 @@ # Awk-Scheme ## Overview -This is a minimal Scheme implementation supporting basic arithmetic operations, list manipulation, and comparisons. All values are displayed with type tags (e.g., "N:42" for numbers). +A Scheme interpreter implemented in AWK, featuring: +- A compiler that translates Scheme expressions to stack-based VM instructions +- A virtual machine that executes the compiled code +- Support for basic arithmetic, list operations, functions, and variable bindings +- Persistent global state between REPL sessions + +## Architecture + +### Components +1. **Compiler** (`bin/compiler.awk`): + - Lexical analyzer for tokenizing input + - Recursive descent parser for Scheme expressions + - Code generator that produces stack-based VM instructions + - Handles nested expressions and proper scoping + +2. **Virtual Machine** (`bin/vm.awk`): + - Stack-based execution model + - Typed value system with runtime type checking + - Environment-based variable binding + - Function call/return mechanism + - Heap memory for cons cells + - State persistence for globals and functions + +3. **REPL** (`bin/repl`): + - Interactive command line interface + - Multi-line input support + - State persistence between sessions + - Debug mode support + +### Data Types +All values are tagged with their type: +- `N:value` - Numbers (integers) +- `B:value` - Booleans (0/1) +- `P:index` - Pairs (cons cells) +- `F:name` - Functions +- `NIL:` - Empty list +- `S:name` - Symbols + +## Usage + +### Running the REPL +```bash +# Start interactive REPL +./scheme + +# Run a Scheme file +./scheme myprogram.scm + +# Enable debug output +DEBUG=1 ./scheme +``` -## Data Types +### Basic Operations -### Numbers -- Represented as: `N:value` -- Examples: `42`, `-5`, `123` +1. **Arithmetic**: ```scheme -scheme> 42 -N:42 +scheme> (+ 1 2 3) +N:6 +scheme> (* 4 (- 10 5)) +N:20 ``` -### Booleans -- Represented as: `B:value` (1 for true, 0 for false) -- Generated by comparison operations +2. **Comparisons**: ```scheme scheme> (< 1 2) B:1 +scheme> (= 42 42) +B:1 ``` -### Nil (Empty List) -- Represented as: `NIL:` -- Used for list termination +3. **Lists**: ```scheme -scheme> nil -NIL: +scheme> (cons 1 (cons 2 nil)) +P:1 +scheme> (car (cons 1 2)) +N:1 ``` -### Pairs -- Represented as: `P:index` -- Created using cons -- Stored in heap with car and cdr values - -## Supported Operations - -### Arithmetic Operations -1. Addition: `(+ x y ...)` - ```scheme - scheme> (+ 1 2) - N:3 - scheme> (+ 1 2 3) - N:6 - ``` - -2. Subtraction: `(- x y ...)` - ```scheme - scheme> (- 5 3) - N:2 - scheme> (- 10 2 3) ; 10 - 2 - 3 - N:5 - ``` - -3. Multiplication: `(* x y ...)` - ```scheme - scheme> (* 3 4) - N:12 - scheme> (* 2 3 4) - N:24 - ``` - -4. Division: `(/ x y)` - ```scheme - scheme> (/ 10 2) - N:5 - ``` - -### Comparison Operations -1. Less Than: `(< x y)` - ```scheme - scheme> (< 1 2) - B:1 - scheme> (< 2 1) - B:0 - ``` - -2. Equality: `(= x y)` - ```scheme - scheme> (= 42 42) - B:1 - ``` - -### List Operations -1. Cons: `(cons x y)` - - Creates a pair from two values - ```scheme - scheme> (cons 1 2) - P:1 - scheme> (cons 1 nil) - P:1 - ``` - -2. Car: `(car pair)` - - Gets the first element of a pair - ```scheme - scheme> (car (cons 1 2)) - N:1 - ``` - -3. Cdr: `(cdr pair)` - - Gets the second element of a pair - ```scheme - scheme> (cdr (cons 1 2)) - N:2 - ``` - -### Building Lists -Lists can be built using nested cons operations with nil as the terminator: +4. **Variables**: ```scheme -scheme> (cons 1 (cons 2 (cons 3 nil))) -P:1 ; This represents the list (1 2 3) +scheme> (define x 42) +N:42 +scheme> x +N:42 ``` -### Define -- Define a new variable or function +5. **Functions**: ```scheme -scheme> (define x 10) -N:10 scheme> (define add (x y) (+ x y)) -F:1 +scheme> (add 2 3) +N:5 ``` -### Let -- Define local bindings within a let expression +6. **Let Expressions**: ```scheme scheme> (let ((x 10) (y 20)) (+ x y)) N:30 ``` -## Expression Structure -- All expressions must be properly parenthesized -- Operators come first in a form (prefix notation) -- Multiple expressions can be evaluated in sequence - -## REPL Features -- Multi-line input supported (continues with "..." prompt until parentheses balance) -- Ctrl+D to exit -- Comments start with semicolon (;) - -## Error Handling -The system will report errors for: -- Stack underflow -- Type mismatches -- Unknown operations -- Division by zero -- Invalid list operations -- Malformed expressions - -## Examples -Here are some more complex examples: - -1. Nested arithmetic: -```scheme -scheme> (+ (* 3 4) (- 10 5)) -N:17 -``` - -2. List construction and manipulation: -```scheme -scheme> (cons (+ 1 2) (cons (* 3 4) nil)) -P:1 ; Represents (3 12) -``` - -3. Combined operations: -```scheme -scheme> (car (cons (* 2 3) (+ 4 5))) -N:6 -``` +## Implementation Details + +### Compiler Pipeline +1. **Lexical Analysis**: + - Tokenizes input into numbers, symbols, and parentheses + - Handles whitespace and basic comments + +2. **Parsing**: + - Builds expression tree from tokens + - Validates syntax and expression structure + +3. **Code Generation**: + - Translates expressions to VM instructions + - Manages scope and variable bindings + - Handles function definitions + +### Virtual Machine +1. **Stack Operations**: + - PUSH_CONST: Push constant value + - POP: Remove top value + - STORE: Store variable binding + - LOOKUP: Retrieve variable value + +2. **Function Calls**: + - CALL: Invoke function + - RETURN: Return from function + - Environment frame management + +3. **Memory Management**: + - Stack-based evaluation + - Simple heap for cons cells + - Basic reference counting (not fully implemented) + +### State Persistence +- Global variables and functions persist between sessions +- State stored in `/tmp/scheme_vm.state` and `/tmp/scheme_vm.env` +- Automatic state loading/saving + +## Extending the System + +### Adding New Special Forms +1. Add parsing in `compile_expr()` +2. Implement code generation function +3. Add corresponding VM instructions if needed + +### Adding New Data Types +1. Define new type tag in VM +2. Add type checking predicates +3. Implement value construction/access +4. Update relevant operations + +### Adding VM Instructions +1. Add instruction handling in `execute()` +2. Implement instruction logic +3. Update compiler to generate new instruction + +### Debugging +- Enable debug output: `DEBUG=1 ./scheme` +- Debug messages show: + - Lexical analysis + - Parsing steps + - Code generation + - VM execution + - Stack operations + - Environment changes ## Limitations Current implementation does not support: -- Variables or definition -- Functions or lambda expressions -- Control structures (if, cond) -- Quote or quasiquote -- String data type -- Input/output operations -- Standard library functions +- Floating point numbers +- Strings +- Proper tail recursion +- Garbage collection +- Error recovery in REPL +- Full numeric tower +- Macros +- Standard library ## Future Enhancements -Possible additions could include: -1. Let expressions for local bindings -2. Function definitions -3. Conditional expressions -4. More numeric operations -5. String support -6. Additional list operations \ No newline at end of file +1. Proper tail call optimization +2. Garbage collection +3. Error recovery +4. More data types +5. Standard library +6. Better debugging tools \ No newline at end of file diff --git a/awk/scheme/scheme/TODO.md b/awk/scheme/scheme/TODO.md deleted file mode 100644 index 52218a8..0000000 --- a/awk/scheme/scheme/TODO.md +++ /dev/null @@ -1,19 +0,0 @@ -# Limitations -Current implementation does not support: - -- Variables or definition -- Functions or lambda expressions -- Control structures (if, cond) -- Quote or quasiquote -- String data type -- Input/output operations -- Standard library functions - -# Future Enhancements -Possible additions could include: - -- Let expressions for local bindings -- Function definitions -- Conditional expressions -- More numeric operations -- String support diff --git a/awk/scheme/scheme/TODO.txt b/awk/scheme/scheme/TODO.txt new file mode 100644 index 0000000..31723a4 --- /dev/null +++ b/awk/scheme/scheme/TODO.txt @@ -0,0 +1,69 @@ +Scheme Interpreter TODO +===================== + +Current State: +------------- +- Basic arithmetic operations (+,-,*,/) are working +- Let expressions with simple bindings are working +- Function definitions (define) and lambda expressions are partially implemented +- Stack-based virtual machine with environment for variable bindings + +Recent Changes: +-------------- +1. Fixed function table array naming conflicts (func_name -> func_def_names) +2. Modified vm_get_value to handle function names correctly +3. Attempted to fix argument handling in function calls +4. Modified lambda compilation to avoid pushing function name twice + +Current Issues: +-------------- +1. Function Definitions (define): + - Function calls are failing with "ADD requires numeric operands" + - Arguments may not be properly passed to functions + - Function environment may not be properly set up + +2. Lambda Expressions: + - Direct lambda calls are failing + - Lambda bindings in let expressions are failing + - Function environment for lambda parameters may not be correct + +Next Steps: +----------- +1. Debug Function Calls: + - Add detailed debug logging in vm_call_function + - Verify argument handling and environment setup + - Check if function code is being properly stored and retrieved + +2. Fix Function Environment: + - Review how parameters are stored in the environment + - Ensure environment is properly restored after function calls + - Verify parameter values are correctly passed to functions + +3. Improve Lambda Support: + - Review lambda compilation process + - Ensure lambda functions are properly stored and called + - Fix environment handling for lambda parameters + +4. Testing Strategy: + - Create test cases for each type of function call + - Add debug output to track stack and environment state + - Verify each step of function compilation and execution + +5. Code Cleanup: + - Review and document function handling code + - Ensure consistent naming and error handling + - Add comments explaining complex operations + +Technical Notes: +--------------- +- Function calls use a combination of LOOKUP, GET_VALUE, and CALL instructions +- Environment stack handles variable bindings and function parameters +- Function code is stored in FUNCTIONS table with unique names +- Lambda functions use __lambda_N naming scheme + +Debugging Tips: +-------------- +1. Enable DEBUG=1 to see detailed execution logs +2. Check stack contents before and after each operation +3. Verify environment state during function calls +4. Monitor function code storage and retrieval \ No newline at end of file diff --git a/awk/scheme/scheme/bin/compiler.awk b/awk/scheme/scheme/bin/compiler.awk index e7e8081..debaa2c 100755 --- a/awk/scheme/scheme/bin/compiler.awk +++ b/awk/scheme/scheme/bin/compiler.awk @@ -1,44 +1,55 @@ #!/usr/bin/awk -f +# This is a Scheme-to-Assembly compiler implemented in AWK +# It takes Scheme expressions as input and outputs assembly instructions +# for a custom stack-based virtual machine + BEGIN { - # Compiler state - curr_token = "" - input_buffer = "" - next_label = 0 - program = "" - - # Debug mode - DEBUG = 0 + # Compiler maintains internal state for code generation + curr_token = "" # Current token being processed by lexer + input_buffer = "" # Buffer for input text being tokenized + next_label = 0 # Counter for generating unique labels + program = "" # Accumulates the full program text + + # Debug mode disabled by default, can be enabled via DEBUG=1 environment variable + DEBUG = (ENVIRON["DEBUG"] == "1") ? 1 : 0 } +# Debug logging helper function function debug(msg) { if (DEBUG) printf("[DEBUG] %s\n", msg) > "/dev/stderr" } -# Process input line - just accumulate the raw input +# Each line of input is accumulated into the program string +# This allows handling multi-line expressions { if (program != "") program = program "\n" program = program $0 } +# Main compiler entry point - called after all input is read END { debug("Raw program:\n" program) if (program == "") exit - # Split program into expressions and compile each one + # Parse and compile each expression in the program split_expressions(program) } -# New function to handle multiple expressions -function split_expressions(prog, current, paren_count, i, c, expr, cleaned) { +# Splits input into individual Scheme expressions +# Handles nested parentheses and comments +function split_expressions(prog, current, paren_count, i, c, expr, cleaned) { current = "" paren_count = 0 - # Extract expressions between parentheses + # Clean up the input: + # 1. Remove comments (text between ; and next opening paren) + # 2. Normalize whitespace + # 3. Trim leading/trailing whitespace cleaned = prog gsub(/;[^(]*\(/, "(", cleaned) # Remove comments before expressions gsub(/\)[^)]*;/, ")", cleaned) # Remove comments after expressions - gsub(/[ \t\n]+/, " ", cleaned) # Normalize whitespace + gsub(/[ \t\n]+/, " ", cleaned) # Normalize whitespace to single spaces sub(/^[ \t\n]+/, "", cleaned) # Trim leading whitespace sub(/[ \t\n]+$/, "", cleaned) # Trim trailing whitespace @@ -46,6 +57,7 @@ function split_expressions(prog, current, paren_count, i, c, expr, cleaned) { if (cleaned == "") return + # Parse expressions by tracking parenthesis nesting for (i = 1; i <= length(cleaned); i++) { c = substr(cleaned, i, 1) @@ -59,7 +71,7 @@ function split_expressions(prog, current, paren_count, i, c, expr, cleaned) { if (c == ")") { paren_count-- if (paren_count == 0) { - # Found complete expression + # Complete expression found - compile it expr = current sub(/^\s+/, "", expr) sub(/\s+$/, "", expr) @@ -73,30 +85,34 @@ function split_expressions(prog, current, paren_count, i, c, expr, cleaned) { } } + # Add final HALT instruction print "HALT" } -# Lexer functions +# Lexer helper functions for character classification function is_digit(c) { return c >= "0" && c <= "9" } function is_whitespace(c) { return c == " " || c == "\t" || c == "\n" } +# Lexical analyzer - converts input into tokens +# Handles numbers, symbols, and parentheses function next_token() { - # Initialize input_buffer if this is first call + # Initialize input buffer on first call if (input_buffer == "") input_buffer = program - # Skip whitespace + # Skip whitespace between tokens while (length(input_buffer) > 0 && is_whitespace(substr(input_buffer, 1, 1))) input_buffer = substr(input_buffer, 2) if (length(input_buffer) == 0) return "EOF" + # Handle parentheses as single-character tokens c = substr(input_buffer, 1, 1) if (c == "(" || c == ")") { input_buffer = substr(input_buffer, 2) return c } - # Handle numbers + # Handle numbers (including negative numbers) if (is_digit(c) || c == "-" && length(input_buffer) > 1 && is_digit(substr(input_buffer, 2, 1))) { num = "" while (length(input_buffer) > 0) { @@ -108,7 +124,7 @@ function next_token() { return num } - # Handle symbols + # Handle symbols (identifiers and operators) sym = "" while (length(input_buffer) > 0) { c = substr(input_buffer, 1, 1) @@ -119,8 +135,9 @@ function next_token() { return sym } -# Parser functions -function parse_expr( token, result) { +# Recursive descent parser for Scheme expressions +# Returns parsed expression as a string +function parse_expr(token, result) { token = next_token() if (token == "EOF") return "" @@ -134,7 +151,8 @@ function parse_expr( token, result) { return token } -function parse_list( result, expr) { +# Parses a list expression (anything in parentheses) +function parse_list(result, expr) { result = "" while (1) { @@ -149,8 +167,9 @@ function parse_list( result, expr) { return "(" result ")" } -# Split expression into operator and arguments -function split_expr(expr, i, len, c, op, args, paren_count) { +# Splits an expression into operator and arguments +# Handles nested expressions correctly +function split_expr(expr, i, len, c, op, args, paren_count) { len = length(expr) paren_count = 0 @@ -174,8 +193,8 @@ function split_expr(expr, i, len, c, op, args, paren_count) { return op SUBSEP args } -# Split arguments handling nested parentheses -function split_args(args, arg_array, len, i, c, current, paren_count, arg_count) { +# Splits argument list handling nested parentheses +function split_args(args, arg_array, len, i, c, current, paren_count, arg_count) { len = length(args) current = "" paren_count = 0 @@ -202,26 +221,47 @@ function split_args(args, arg_array, len, i, c, current, paren_count, arg_cou return arg_count } -# Code generation functions +# Code generation for numeric literals function compile_number(num) { debug("Compiling number: " num) print "PUSH_CONST N:" num } -function compile_primitive_call(op, args, arg_array, nargs, i) { +# Code generation for primitive operations (+, -, *, cons, etc) +function compile_primitive_call(op, args, arg_array, nargs, i) { debug("Primitive call: op=" op " args=" args) nargs = split_args(args, arg_array) - # Compile arguments for all operations - for (i = 1; i <= nargs; i++) { - compile_expr(arg_array[i]) + # Check if this is a lambda function call + if (substr(op, 1, 1) == "(") { + # This is a lambda function call + # First compile the lambda function + compile_expr(op) + + # Then compile all arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } + + # Call the function + print "CALL __lambda_" (next_label - 1) + return } + # Then emit appropriate operation if (op == "+") { + # Compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } for (i = 1; i < nargs; i++) print "ADD" } else if (op == "-") { + # Compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } if (nargs == 1) { print "PUSH_CONST N:0" print "SWAP" @@ -230,55 +270,86 @@ function compile_primitive_call(op, args, arg_array, nargs, i) { print "SUB" } else if (op == "*") { + # Compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } for (i = 1; i < nargs; i++) print "MUL" } else if (op == "cons") { if (nargs != 2) error("cons requires 2 arguments") + # Compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } print "CONS" } else if (op == "car") { if (nargs != 1) error("car requires 1 argument") + # Compile argument + compile_expr(arg_array[1]) print "CAR" } else if (op == "cdr") { if (nargs != 1) error("cdr requires 1 argument") + # Compile argument + compile_expr(arg_array[1]) print "CDR" } else if (op == "<") { if (nargs != 2) error("< requires 2 arguments") + # Compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } print "LT" } else if (op == "=") { if (nargs != 2) error("= requires 2 arguments") + # Compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } print "EQ" } else { - # Function call + # Function call for user-defined functions debug("Function call: " op) - print "CALL " op + # Look up the function name + print "LOOKUP " op + # Get the actual function name + print "GET_VALUE" + # Then compile arguments + for (i = 1; i <= nargs; i++) { + compile_expr(arg_array[i]) + } + # Call the function + print "CALL" } } -function split_bindings(bindings, binding_array, count, current, paren_count, i, c) { +# Splits let bindings into individual variable/value pairs +function split_bindings(bindings, binding_array, count, current, paren_count, i, c, in_lambda) { count = 0 current = "" paren_count = 0 + in_lambda = 0 for (i = 1; i <= length(bindings); i++) { c = substr(bindings, i, 1) - # Track parentheses + # Track nested parentheses if (c == "(") { paren_count++ - if (paren_count == 1) { + if (paren_count == 1 && !in_lambda) { current = "" # Start new binding continue } } if (c == ")") { paren_count-- - if (paren_count == 0) { + if (paren_count == 0 && !in_lambda) { # End of binding binding_array[++count] = current current = "" @@ -286,6 +357,11 @@ function split_bindings(bindings, binding_array, count, current, paren_count, } } + # Track if we're inside a lambda expression + if (substr(bindings, i, 7) == "lambda ") { + in_lambda = 1 + } + # Only add character if we're inside a binding if (paren_count > 0) { current = current c @@ -295,7 +371,8 @@ function split_bindings(bindings, binding_array, count, current, paren_count, return count } -function compile_let(args, bindings, body, binding_array, nbindings, i, var, val, binding_parts) { +# Compiles let expressions (local variable bindings) +function compile_let(args, bindings, body, binding_array, nbindings, i, var, val, binding_parts) { # Split into bindings and body if (substr(args, 1, 1) != "(") error("Malformed let expression") @@ -323,17 +400,34 @@ function compile_let(args, bindings, body, binding_array, nbindings, i, var, nbindings = split_bindings(bindings, binding_array) for (i = 1; i <= nbindings; i++) { debug("Processing binding: " binding_array[i]) - split(binding_array[i], binding_parts, " ") - var = binding_parts[1] - val = binding_parts[2] + + # Find the variable name (everything up to the first space) + var = binding_array[i] + sub(/ .*$/, "", var) + + # Find the value (everything after the first space) + val = binding_array[i] + sub(/^[^ ]+ /, "", val) debug("Binding var: " var " val: " val) # Compile the value - compile_expr(val) - - # Store in environment - print "STORE " var + if (substr(val, 1, 1) == "(") { + # Handle lambda or other compound expressions + if (substr(val, 2, 6) == "lambda") { + # This is a lambda expression + # First compile the lambda + compile_lambda(substr(val, 8)) # Skip "(lambda " + # Store the function name in the environment + print "STORE " var + } else { + compile_expr(val) + print "STORE " var + } + } else { + compile_expr(val) + print "STORE " var + } } # Compile the body @@ -345,7 +439,8 @@ function compile_let(args, bindings, body, binding_array, nbindings, i, var, } } -function compile_define(args, name, params, body, param_array, nparams, i, paren_start, paren_end) { +# Compiles define expressions (function/variable definitions) +function compile_define(args, name, params, body, param_array, nparams, i, paren_start, paren_end) { # Set flag for global definition print "PUSH_CONST B:1" print "STORE from_define" # Must match exactly what vm_store checks for @@ -396,25 +491,81 @@ function compile_define(args, name, params, body, param_array, nparams, i, pa } } +# Compiles lambda expressions (anonymous functions) +function compile_lambda(args, params, body, param_array, nparams, i, lambda_name) { + # Generate a unique name for the lambda function + lambda_name = "__lambda_" next_label++ + + # Split into parameters and body + if (substr(args, 1, 1) != "(") error("Malformed lambda expression") + + # Find matching closing parenthesis for parameters + paren_count = 1 + i = 2 + while (paren_count > 0 && i <= length(args)) { + if (substr(args, i, 1) == "(") paren_count++ + if (substr(args, i, 1) == ")") paren_count-- + i++ + } + if (paren_count > 0) error("Unmatched parenthesis in lambda parameters") + + params = substr(args, 2, i - 3) # Remove parentheses + body = substr(args, i) + + # Trim whitespace from body + sub(/^[ \t\n]+/, "", body) + sub(/[ \t\n]+$/, "", body) + + debug("Lambda parameters: " params) + debug("Lambda body: " body) + + # Create function label + print "LABEL " lambda_name + + # Process parameters + nparams = split(params, param_array, " ") + for (i = 1; i <= nparams; i++) { + print "STORE " param_array[i] + } + + # Compile function body + compile_expr(body) + + # Clean up parameters and return + for (i = nparams; i >= 1; i--) { + print "POP_ENV" + } + print "RETURN" + + # Only push the function name if we're not in a direct call + if (!(args ~ /^\([^)]*\)[^(]*$/)) { + print "PUSH_CONST S:" lambda_name + } +} + +# Main expression compiler - dispatches based on expression type function compile_expr(expr, split_result, op, args) { debug("Compiling expression: " expr) + # Handle numeric literals if (expr ~ /^-?[0-9]+$/) { compile_number(expr) return } + # Handle nil constant if (expr == "nil") { print "PUSH_CONST NIL:" return } - # Add variable lookup + # Handle variable lookup if (expr ~ /^[a-zA-Z_][a-zA-Z0-9_]*$/) { print "LOOKUP " expr return } + # Handle compound expressions (lists) if (substr(expr, 1, 1) == "(") { expr = substr(expr, 2, length(expr) - 2) split_result = split_expr(expr) @@ -425,6 +576,8 @@ function compile_expr(expr, split_result, op, args) { compile_define(args) } else if (op == "let") { compile_let(args) + } else if (op == "lambda") { + compile_lambda(args) } else { compile_primitive_call(op, args) } @@ -434,6 +587,7 @@ function compile_expr(expr, split_result, op, args) { error("Unknown expression type: " expr) } +# Error reporting helper function error(msg) { print "Error: " msg > "/dev/stderr" exit 1 diff --git a/awk/scheme/scheme/bin/vm.awk b/awk/scheme/scheme/bin/vm.awk index cb2b992..4e7d2c7 100755 --- a/awk/scheme/scheme/bin/vm.awk +++ b/awk/scheme/scheme/bin/vm.awk @@ -1,42 +1,49 @@ #!/usr/bin/awk -f -# VM State Initialization +# This is a stack-based virtual machine for executing compiled Scheme code +# It implements a simple instruction set with support for: +# - Basic arithmetic operations +# - Function calls and returns +# - Variable bindings and lookups +# - Cons cells and list operations + BEGIN { - # Type tags - T_NUMBER = "N" - T_BOOLEAN = "B" - T_SYMBOL = "S" - T_PAIR = "P" - T_FUNCTION = "F" - T_NIL = "NIL" - - # VM registers - stack_ptr = 0 # Stack pointer - heap_ptr = 0 # Heap pointer - pc = 0 # Program counter + # Type system tags for runtime type checking + T_NUMBER = "N" # Numbers (integers) + T_BOOLEAN = "B" # Booleans (0/1) + T_SYMBOL = "S" # Symbols (identifiers) + T_PAIR = "P" # Cons cells (pairs) + T_FUNCTION = "F" # Function references + T_NIL = "NIL" # Empty list marker + + # Virtual machine registers + stack_ptr = 0 # Points to top of evaluation stack + heap_ptr = 0 # Points to next free heap location + pc = 0 # Program counter for instruction fetch - # Debug mode - DEBUG = 0 + # Debug mode disabled by default, can be enabled via DEBUG=1 environment variable + DEBUG = (ENVIRON["DEBUG"] == "1") ? 1 : 0 - # Environment for variables - env_size = 0 + # Environment for variable bindings + env_size = 0 # Current size of environment stack - # Function table (make it persistent) - delete func_name - delete func_pc - delete func_code - func_size = 0 + # Function table for storing defined functions + delete func_def_names # Function names + delete func_def_pc # Entry points + delete func_def_code # Function bodies + func_def_size = 0 # Number of defined functions - # Call stack + # Call stack for function returns call_stack_ptr = 0 - # State persistence + # State persistence configuration STATE_FILE = "/tmp/scheme_vm.state" if (PERSIST) { debug("Loading state from: " STATE_FILE) if ((getline line < STATE_FILE) >= 0) { # Check if file exists and is readable do { if (line ~ /^FUNC /) { + # Parse and load function definition sub(/^FUNC /, "", line) name = line sub(/ .*$/, "", name) @@ -46,30 +53,32 @@ BEGIN { debug("Loaded function: " name) debug("Code: " code) - func_name[func_size] = name - func_code[func_size] = code - func_size++ + # Store function in function table + func_def_names[func_def_size] = name + func_def_code[func_def_size] = code + func_def_size++ } } while ((getline line < STATE_FILE) > 0) close(STATE_FILE) } } - # Function environments - delete func_env_names - delete func_env_vals - delete func_env_sizes + # Function environment storage + delete func_env_names # Variable names in function scope + delete func_env_vals # Variable values in function scope + delete func_env_sizes # Size of each function's environment - # Global function storage - delete FUNCTIONS # Our own function storage array + # Global function registry + delete FUNCTIONS # Maps function names to implementations - # Environment persistence + # Environment persistence configuration ENV_STATE_FILE = "/tmp/scheme_vm.env" if (PERSIST) { debug("Loading environment state from: " ENV_STATE_FILE) if ((getline line < ENV_STATE_FILE) >= 0) { do { if (line ~ /^ENV /) { + # Parse and load environment binding sub(/^ENV /, "", line) name = line sub(/ .*$/, "", name) @@ -78,6 +87,7 @@ BEGIN { debug("Loaded env var: " name " = " val) + # Store in environment env_name[env_size] = name env_val[env_size] = val env_size++ @@ -87,28 +97,44 @@ BEGIN { } } - normal_exit = 0 # Track if we exited normally via HALT + # Register built-in functions + FUNCTIONS["+"] = "add" + FUNCTIONS["-"] = "subtract" + FUNCTIONS["*"] = "multiply" + FUNCTIONS["/"] = "divide" + FUNCTIONS["="] = "equals" + FUNCTIONS["<"] = "less_than" + FUNCTIONS[">"] = "greater_than" + FUNCTIONS["add1"] = "add_one" + + # Track if VM halted normally (vs error) + normal_exit = 0 } -# Debug output function +# Debug output helper function debug(msg) { if (DEBUG) printf("[DEBUG] %s\n", msg) > "/dev/stderr" } -# Value construction and access +# Value constructors and accessors +# Values are stored as type:value pairs for runtime type checking function makeValue(type, val) { return type ":" val } function getType(val) { - return substr(val, 1, index(val, ":") - 1) + type = substr(val, 1, index(val, ":") - 1) + debug("Get type: " type " from " val) + return type } function getValue(val) { - return substr(val, index(val, ":") + 1) + value = substr(val, index(val, ":") + 1) + debug("Get value: " value " from " val) + return value } -# Type checking +# Type checking predicates function isNumber(val) { return getType(val) == T_NUMBER } function isBoolean(val) { return getType(val) == T_BOOLEAN } function isSymbol(val) { return getType(val) == T_SYMBOL } @@ -131,13 +157,14 @@ function pop() { function peek() { if (stack_ptr < 1) error("Stack empty") + debug("Peek: " stack[stack_ptr]) return stack[stack_ptr] } -# Heap operations +# Heap operations for cons cells function allocate(val) { heap[++heap_ptr] = val - refs[heap_ptr] = 1 + refs[heap_ptr] = 1 # Reference counting (not fully implemented) debug("Allocate: " val " at " heap_ptr) return heap_ptr } @@ -156,7 +183,7 @@ function error(msg) { exit 1 } -# Arithmetic operations +# Arithmetic instruction implementations function vm_add() { if (stack_ptr < 2) error("ADD requires two operands") val2 = pop() @@ -199,7 +226,7 @@ function vm_divide() { push(makeValue(T_NUMBER, result)) } -# List operations +# List operation implementations function vm_cons() { if (stack_ptr < 2) error("CONS requires two operands") val2 = pop() @@ -235,6 +262,7 @@ function vm_equal() { val2 = pop() val1 = pop() result = (val1 == val2) ? "1" : "0" + debug("Equal comparison: " val1 " == " val2 " -> " result) push(makeValue(T_BOOLEAN, result)) } @@ -245,15 +273,17 @@ function vm_less_than() { if (!isNumber(val1) || !isNumber(val2)) error("LT requires numeric operands") result = (getValue(val1) < getValue(val2)) ? "1" : "0" + debug("Less than comparison: " val1 " < " val2 " -> " result) push(makeValue(T_BOOLEAN, result)) } -# Instruction execution +# Main instruction execution loop function execute(instr) { split(instr, parts, " ") op = parts[1] debug("Execute: " instr) + # Dispatch based on instruction opcode if (op == "PUSH_CONST") { push(parts[2]) } @@ -303,7 +333,7 @@ function execute(instr) { print peek() } else if (op == "HALT") { - normal_exit = 1 # Mark that we're exiting normally + normal_exit = 1 if (stack_ptr > 0) { result = peek() } @@ -333,41 +363,42 @@ function execute(instr) { else if (op == "RETURN") { vm_return() } + else if (op == "GET_VALUE") { + vm_get_value() + } else { error("Unknown instruction: " op) } } -# Program loading +# Load program instructions { - # Store each line as an instruction program[NR-1] = $0 } -# Program execution +# Main execution loop END { - # Execute program while (pc < length(program)) { execute(program[pc++]) } - # Only save state if we didn't halt normally + # Save state if we didn't halt normally if (!normal_exit && PERSIST) { save_state() } } -# Modify vm_store to handle global variables more consistently +# Variable binding implementation function vm_store(name) { debug("Storing " peek() " as " name " at env_size: " env_size) - # If this is from a define, mark it as global - if (lookup_no_error("from_define")) { # Check if from_define is set + # Handle global definitions specially + if (lookup_no_error("from_define")) { name = "__global_" name - # Clear the flag + # Clear the define flag for (i = env_size - 1; i >= 0; i--) { if (env_name[i] == "from_define") { - env_size-- # Remove the flag + env_size-- break } } @@ -375,7 +406,7 @@ function vm_store(name) { # Remove any previous definition of this global for (i = env_size - 1; i >= 0; i--) { if (env_name[i] == name) { - # Shift everything down to remove the old definition + # Shift everything down for (j = i; j < env_size - 1; j++) { env_name[j] = env_name[j + 1] env_val[j] = env_val[j + 1] @@ -386,19 +417,36 @@ function vm_store(name) { } } - # Store in current environment frame + # Handle lambda functions + val = peek() + if (isSymbol(val)) { + func_name = getValue(val) + if (func_name ~ /^__lambda_/) { + # Store the function code under the new name + FUNCTIONS[name] = FUNCTIONS[func_name] + # Store the new name in the environment + env_name[env_size] = name + env_val[env_size] = makeValue(T_SYMBOL, name) + env_size++ + return + } + } + + # Add to environment env_name[env_size] = name env_val[env_size] = peek() env_size++ + debug("Environment after store:") dump_env() } +# Remove top binding from environment function vm_pop_env() { if (env_size <= 0) error("Environment underflow") debug("Popping environment at size: " env_size) - # Don't pop if this is a global definition (from define) + # Don't pop globals if (env_name[env_size-1] ~ /^__global_/) { debug("Keeping global definition: " env_name[env_size-1]) return @@ -408,21 +456,23 @@ function vm_pop_env() { env_size-- } -# Modify vm_lookup to be more explicit about global lookups -function vm_lookup(name, i, global_name) { +# Variable lookup implementation +function vm_lookup(name, i, global_name, val) { debug("Looking up " name " in environment of size: " env_size) dump_env() - # First try looking up with global prefix + # Check if it's a function (built-in or user-defined) + if (name in FUNCTIONS) { + debug("Found function: " name) + push(makeValue(T_SYMBOL, name)) + return + } + + # Try global name first, then local global_name = "__global_" name for (i = env_size - 1; i >= 0; i--) { - if (env_name[i] == global_name) { - debug("Found global " name " = " env_val[i] " at position " i) - push(env_val[i]) - return - } - if (env_name[i] == name) { - debug("Found local " name " = " env_val[i] " at position " i) + if (env_name[i] == global_name || env_name[i] == name) { + debug("Found " name " = " env_val[i] " at position " i) push(env_val[i]) return } @@ -430,6 +480,7 @@ function vm_lookup(name, i, global_name) { error("Undefined variable: " name) } +# Function definition implementation function vm_define_function(name, start_pc) { debug("Defining function: " name " at " start_pc) @@ -443,51 +494,64 @@ function vm_define_function(name, start_pc) { } code = code "\nRETURN" - # Store in our function array + # Store function debug("Storing function: " name " = " code) FUNCTIONS[name] = code pc = i + 1 } -function vm_call_function(name, code_lines, j, saved_pc, saved_env_size, arg, param_name) { - debug("Calling function: " name) +# Function call implementation +function vm_call_function(func_name, code_lines, j, saved_pc, saved_env_size, arg, param_name) { + debug("Calling function: " func_name) - if (!(name in FUNCTIONS)) { - error("Undefined function: " name) + # If name is a symbol, get its value + if (isSymbol(func_name)) { + func_name = getValue(func_name) } - # Get argument from stack before modifying program - arg = pop() - debug("Function argument: " arg) + # Handle anonymous functions + if (func_name ~ /^__lambda_/) { + if (!(func_name in FUNCTIONS)) { + error("Undefined lambda function: " func_name) + } + } else if (!(func_name in FUNCTIONS)) { + error("Undefined function: " func_name) + } saved_pc = pc saved_env_size = env_size # Split function code into lines - split(FUNCTIONS[name], code_lines, "\n") + split(FUNCTIONS[func_name], code_lines, "\n") - # Extract parameter name from first STORE instruction + # Add function code to program at current position + for (j in code_lines) { + program[pc + j - 1] = code_lines[j] + } + + # Check if this is a parameterized function if (code_lines[1] ~ /^STORE /) { - param_name = substr(code_lines[1], 7) # Skip "STORE " + # This is a parameterized function (lambda) + # Get parameter name from STORE instruction + param_name = substr(code_lines[1], 7) debug("Found parameter name: " param_name) + + # Get argument from stack + arg = pop() + debug("Function argument: " arg) + + # Create new environment frame + debug("Creating new environment frame at size: " env_size) + env_name[env_size] = param_name + env_val[env_size] = arg + env_size++ } else { - error("Function missing parameter definition") - } - - # Create new environment frame with correct parameter name - debug("Creating new environment frame at size: " env_size) - env_name[env_size] = param_name - env_val[env_size] = arg - env_size++ - - # Add function code to program - for (j in code_lines) { - program[length(program)] = code_lines[j] + # This is a built-in function or non-parameterized function + debug("Calling non-parameterized function: " func_name) } - # Jump to function code - pc = length(program) - length(code_lines) + # Save return info and jump to function call_stack[++call_stack_ptr] = saved_pc env_stack[call_stack_ptr] = saved_env_size @@ -495,6 +559,7 @@ function vm_call_function(name, code_lines, j, saved_pc, saved_env_size, arg, dump_env() } +# Function return implementation function vm_return() { if (call_stack_ptr > 0) { # Save return value @@ -509,14 +574,14 @@ function vm_return() { # Restore program counter pc = call_stack[call_stack_ptr--] - # Push return value back + # Push return value push(ret_val) debug("Returned with value: " ret_val " and env_size: " env_size) } } -# Add debug function to dump environment in a more readable format +# Debug helper to dump environment contents function dump_env( i) { debug("Environment dump:") for (i = 0; i < env_size; i++) { @@ -524,16 +589,7 @@ function dump_env( i) { } } -# Add flag for define statements -function compile_define(args, name, params, body, param_array, nparams, i, paren_start, paren_end) { - # Set flag to mark this as a global definition - print "PUSH_CONST B:1" # Push true - print "STORE __from_define" - - # ... rest of existing compile_define function ... -} - -# Add helper function for looking up without error +# Helper for checking variable existence without error function lookup_no_error(name, i) { for (i = env_size - 1; i >= 0; i--) { if (env_name[i] == name) { @@ -543,22 +599,109 @@ function lookup_no_error(name, i) { return 0 } -# Add new function to handle state saving +# State persistence implementation function save_state() { debug("Saving state to: " STATE_FILE) - for (i = 0; i < func_size; i++) { - debug("Saving function: " func_name[i]) - print "FUNC " func_name[i] " " func_code[i] > STATE_FILE + for (i = 0; i < func_def_size; i++) { + debug("Saving function: " func_def_names[i]) + print "FUNC " func_def_names[i] " " func_def_code[i] > STATE_FILE } close(STATE_FILE) # Save environment state debug("Saving environment state to: " ENV_STATE_FILE) for (i = 0; i < env_size; i++) { - if (env_name[i] ~ /^__global_/) { # Only save global variables + if (env_name[i] ~ /^__global_/) { # Only save globals debug("Saving env var: " env_name[i] " = " env_val[i]) print "ENV " env_name[i] " " env_val[i] > ENV_STATE_FILE } } close(ENV_STATE_FILE) +} + +# Built-in function implementations +function equals() { + if (stack_ptr < 2) error("= requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("= requires numeric operands") + result = (getValue(val1) == getValue(val2)) ? 1 : 0 + push(makeValue(T_BOOLEAN, result)) +} + +function less_than() { + if (stack_ptr < 2) error("< requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("< requires numeric operands") + result = (getValue(val1) < getValue(val2)) ? 1 : 0 + push(makeValue(T_BOOLEAN, result)) +} + +function greater_than() { + if (stack_ptr < 2) error("> requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("> requires numeric operands") + result = (getValue(val1) > getValue(val2)) ? 1 : 0 + push(makeValue(T_BOOLEAN, result)) +} + +function add() { + if (stack_ptr < 2) error("+ requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("+ requires numeric operands") + result = getValue(val1) + getValue(val2) + push(makeValue(T_NUMBER, result)) +} + +function subtract() { + if (stack_ptr < 2) error("- requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("- requires numeric operands") + result = getValue(val1) - getValue(val2) + push(makeValue(T_NUMBER, result)) +} + +function multiply() { + if (stack_ptr < 2) error("* requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("* requires numeric operands") + result = getValue(val1) * getValue(val2) + push(makeValue(T_NUMBER, result)) +} + +function divide() { + if (stack_ptr < 2) error("/ requires two operands") + val2 = pop() + val1 = pop() + if (!isNumber(val1) || !isNumber(val2)) error("/ requires numeric operands") + if (getValue(val2) == 0) error("Division by zero") + result = getValue(val1) / getValue(val2) + push(makeValue(T_NUMBER, result)) +} + +function add_one() { + if (stack_ptr < 1) error("add1 requires one operand") + val = pop() + if (!isNumber(val)) error("add1 requires numeric operand") + result = getValue(val) + 1 + push(makeValue(T_NUMBER, result)) +} + +# Get value from top of stack +function vm_get_value() { + val = peek() + if (isSymbol(val)) { + name = getValue(val) + # If it's a function name, just push the name directly + if (name in FUNCTIONS) { + push(name) + } else { + push(makeValue(T_SYMBOL, name)) + } + } } \ No newline at end of file diff --git a/awk/scheme/scheme/diagram.md b/awk/scheme/scheme/diagram.md index 2c48d7c..4a719b4 100644 --- a/awk/scheme/scheme/diagram.md +++ b/awk/scheme/scheme/diagram.md @@ -1,6 +1,5 @@ -# Awk-Scheme Diagram - -How this is all orchestrated. +# Awk-Scheme Architecture +## Component Interaction Diagram ``` +----------------+ Scheme Code +----------------+ Assembly +----------------+ @@ -8,36 +7,43 @@ How this is all orchestrated. | REPL | "(+ 1 2)" | Compiler | "PUSH_CONST | VM | | (bin/repl) | | compiler.awk | N:1 | vm.awk | | | | | PUSH_CONST | | -| User Input | | Translates | N:2 | Stack-based | -| Interface | | Scheme to | ADD | Interpreter | -| | | VM Assembly | HALT" | | -| | <------------------ | | <------------- | | +| - Multi-line | | - Lexer | N:2 | - Stack-based | +| - Debug mode | | - Parser | ADD | - Type system | +| - File input | | - Code gen | HALT" | - Environment | +| | <----------------- | | <------------- | | | | Output: "N:3" | | Result | | +----------------+ +----------------+ +----------------+ - ^ | - | | - | Data Flow | - | v -+------+-------------------------------------------------------------------------+ -| | -| 1. REPL (bin/repl) reads Scheme expression from user | -| 2. Expression sent to compiler (bin/compiler.awk) | -| 3. Compiler generates VM assembly instructions | -| 4. VM (bin/vm.awk) executes assembly and maintains: | -| - Stack: For operation values | -| - Heap: For storing pairs/lists | -| - Type tags: N:(number), B:(boolean), P:(pair), NIL: | -| 5. Result returned through REPL to user | -| | -+--------------------------------------------------------------------------------+ + ^ | + | v + | +----------------+ + | | Persistence | + | | /tmp files: | + +--------------------------------------------------------------+ - vm.state | + | - vm.env | + +----------------+ + +Debug Flow (when DEBUG=1): ++----------------+ Debug Info +----------------+ Debug Info +----------------+ +| REPL | -----------------> | Compiler | -------------> | VM | +| | [Input received] | | [Tokens found] | | +| [Debug output] | | [Parsing tree] | [Stack ops] | [Stack state] | +| stderr | <----------------- | [Gen code] | <------------- | [Environment] | ++----------------+ +----------------+ +----------------+ + +Execution Flow Example: +┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ +│ Input: │ │ Lexer/ │ │ Code Gen │ │ VM │ +│ (+ 1 2) │ ------> │ Parser: │ ------> │ PUSH_CONST │ ------> │ Stack: │ +│ │ │ (+ │ │ N:1 │ │ [N:1] │ +│ │ │ 1 │ │ PUSH_CONST │ │ [N:1,N:2] │ +│ │ │ 2) │ │ N:2 │ │ [N:3] │ +│ │ │ │ │ ADD │ │ │ +└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ -Example Flow: +State Management: ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ -│ Input: │ │ Compiler │ │ VM │ -│ (+ 1 2) │ → │ PUSH_CONST │ → │ Stack: │ -│ │ │ N:1 │ │ [N:1] │ -│ │ │ PUSH_CONST │ │ [N:1,N:2] │ -│ │ │ N:2 │ │ [N:3] │ -│ │ │ ADD │ │ │ +│ Global │ │ Environment │ │ Function │ +│ Variables │ ------> │ Stack │ ------> │ Calls │ +│ (persist) │ │ (frames) │ │ (stack) │ └─────────────┘ └─────────────┘ └─────────────┘ ``` \ No newline at end of file diff --git a/awk/scheme/scheme/docs/awk-scheme-prompt.md b/awk/scheme/scheme/docs/awk-scheme-prompt.md deleted file mode 100644 index f7e0414..0000000 --- a/awk/scheme/scheme/docs/awk-scheme-prompt.md +++ /dev/null @@ -1,189 +0,0 @@ -# Implementation Request for AWK-based Scheme Virtual Machine - -I need help implementing a stack-based virtual machine for a minimal Scheme implementation in AWK. The implementation should work with standard AWK (gawk features optional). - -## Core Data Structures - -### Value Representation -Values in AWK will be represented as strings with type tags. Each value should be a string with format "TYPE:VALUE" where: -- Numbers: "N:123.45" -- Booleans: "B:1" or "B:0" -- Symbols: "S:symbol-name" -- Pairs: "P:car_idx,cdr_idx" (indices into heap array) -- Functions: "F:addr,env_idx" (instruction address and environment index) -- Nil: "NIL:" - -### Memory Model -Using AWK's associative arrays: -```awk -# Stack - numeric indexed array -stack[stack_ptr] - -# Heap - numeric indexed array for allocated objects -heap[heap_ptr] - -# Environments - numeric indexed array of frames -env[env_ptr] - -# Each environment frame is stored as concatenated strings: -# "name1:val1,name2:val2,..." -``` - -### Instruction Format -Instructions stored in array with format: -```awk -instr[addr] = "OPCODE OPERAND1 OPERAND2..." -``` - -## Core Components Needed - -1. Value Handling -```awk -# Type checking -function isNumber(val) { return substr(val, 1, 2) == "N:" } -function isSymbol(val) { return substr(val, 1, 2) == "S:" } -# etc. - -# Value extraction -function getValue(val) { return substr(val, 3) } -``` - -2. Memory Management -- Simple reference counting using an additional array -- Object allocation through heap_ptr increment -- Basic sweep of unreferenced heap cells - -3. Stack Operations -```awk -# Push and pop -function push(val) { stack[++stack_ptr] = val } -function pop() { return stack[stack_ptr--] } -``` - -4. Environment Management -- Environment represented as chain of frames in env[] array -- Each frame is a string of name:value pairs -- Lookup walks the chain for variable resolution - -## Implementation Steps - -1. Parser/Tokenizer: - - Read instruction format from input file - - Parse into instruction array - - Handle basic syntax for immediate values - -2. Value System: - - Type tag handling - - Value construction - - Type checking - - Value extraction - -3. Core VM: - - Instruction dispatch - - Stack manipulation - - Basic arithmetic - - Flow control - -4. Memory Management: - - Heap allocation - - Reference counting - - Basic garbage collection - -5. Function Handling: - - Call frames - - Return handling - - Tail call optimization - -## Initial Implementation Structure - -```awk -BEGIN { - # Initialize VM state - stack_ptr = 0 - heap_ptr = 0 - env_ptr = 0 - pc = 0 # Program counter -} - -# Main instruction dispatch -function execute(instr) { - split(instr, parts, " ") - op = parts[1] - - if (op == "PUSH_CONST") - push(parts[2]) - else if (op == "ADD") { - b = pop() - a = pop() - push(add(a, b)) - } - # etc. -} - -# Main execution loop -{ - # Load instruction into array - instr[$1] = $0 -} - -END { - # Execute loaded program - while (pc < length(instr)) { - execute(instr[pc++]) - } -} -``` - -## Testing Strategy - -1. Input File Format: -``` -0 PUSH_CONST N:5 -1 PUSH_CONST N:3 -2 ADD -3 HALT -``` - -2. Test Cases: -- Basic arithmetic -- Stack operations -- Function calls -- Error handling -- Memory management - -## AWK-Specific Considerations - -1. String Operations: -- Use split() for parsing -- substr() for type tags -- string concatenation for compound values - -2. Limitations: -- No native types besides numbers and strings -- No recursive function calls in AWK -- Limited stack size -- Memory management needs care - -3. Advantages: -- Associative arrays simplify environment handling -- Built-in string operations help with parsing -- Line-oriented processing suits instruction loading - -## Style Guidelines - -1. Use clear function names: - - makeNumber(n) - - makeSymbol(s) - - etc. - -2. Consistent error handling: - - Set ERRNO - - Print to STDERR - - Exit codes - -3. Document array usage: - - Purpose of each array - - Format of entries - - Lifetime management - -Please start with implementing the value type system and basic stack operations, showing how to represent and manipulate Scheme values in AWK's string-based environment. diff --git a/awk/scheme/scheme/docs/awk-vm-implementation-prompt.md b/awk/scheme/scheme/docs/awk-vm-implementation-prompt.md deleted file mode 100644 index 6596589..0000000 --- a/awk/scheme/scheme/docs/awk-vm-implementation-prompt.md +++ /dev/null @@ -1,226 +0,0 @@ -# AWK-based Scheme VM Implementation Guide - -I need help implementing a stack-based virtual machine in AWK that will support a minimal Scheme implementation. Please focus on AWK-specific approaches and limitations. - -## VM Architecture Overview - -The VM should be stack-based with these key components: -1. An instruction array storing the program -2. A value stack using numeric-indexed arrays -3. A heap for pairs and closures using associative arrays -4. An environment chain for variable lookup - -## Core Data Types - -Each value should be represented as a string with format "TYPE:VALUE": - -```awk -# Examples -"N:42" # Number -"B:1" # Boolean true -"B:0" # Boolean false -"S:x" # Symbol 'x' -"P:1,2" # Pair (cons cell) with heap indices -"F:100,1" # Function (instruction addr, env index) -"NIL:" # Empty list -``` - -## Instruction Set - -### Stack Operations -``` -PUSH_CONST val # Push constant onto stack - # Example: "PUSH_CONST N:42" - -POP # Remove top value from stack - -DUP # Duplicate top stack value - # Before: [a] - # After: [a a] - -SWAP # Swap top two stack values - # Before: [a b] - # After: [b a] -``` - -### Memory Operations -``` -LOAD_LOCAL idx # Load local variable at index - # Example: "LOAD_LOCAL 0" - -STORE_LOCAL idx # Store into local variable - # Example: "STORE_LOCAL 1" - -MAKE_ENV n # Create new environment frame with n slots - # Example: "MAKE_ENV 2" - -LOAD_FREE idx # Load from closure's environment - # Example: "LOAD_FREE 0" - -STORE_FREE idx # Store into closure's environment - # Example: "STORE_FREE 1" -``` - -### Function Operations -``` -CALL n # Call function with n arguments - # Example: "CALL 2" - -TAIL_CALL n # Tail-call with n arguments - # Example: "TAIL_CALL 1" - -RETURN # Return from function - -MAKE_FUNCTION addr # Create function object - # Example: "MAKE_FUNCTION 100" -``` - -### List Operations -``` -CONS # Create pair from two stack values - # Before: [a b] - # After: [(a . b)] - -CAR # Get first element of pair - # Before: [(a . b)] - # After: [a] - -CDR # Get second element of pair - # Before: [(a . b)] - # After: [b] -``` - -### Arithmetic Operations -``` -ADD # Add top two numbers - # Before: [N:3 N:4] - # After: [N:7] - -SUB # Subtract -MUL # Multiply -DIV # Divide -``` - -### Comparison Operations -``` -EQ # Generic equality test -NUM_LT # Numeric less than -NUM_GT # Numeric greater than -``` - -### Control Flow -``` -JUMP offset # Unconditional jump - # Example: "JUMP 100" - -JUMP_IF_FALSE offset # Jump if top of stack is false - # Example: "JUMP_IF_FALSE 200" -``` - -## Implementation Requirements - -1. Instruction Format: -```awk -# Each instruction stored as string in instr[] array -instr[addr] = "OPCODE [operand1] [operand2]..." -``` - -2. Value Handling: -```awk -# Type checking functions -function isNumber(val) { return substr(val, 1, 2) == "N:" } -function isPair(val) { return substr(val, 1, 2) == "P:" } -# etc. - -# Value constructors -function makeNumber(n) { return "N:" n } -function makePair(car_idx, cdr_idx) { return "P:" car_idx "," cdr_idx } -# etc. -``` - -3. Stack Operations: -```awk -# Basic stack manipulation -function push(val) { stack[++stack_ptr] = val } -function pop() { if (stack_ptr < 1) error("stack underflow"); return stack[stack_ptr--] } -``` - -4. Memory Management: -```awk -# Heap allocation -function allocate(val) { - heap[++heap_ptr] = val - refs[heap_ptr] = 1 - return heap_ptr -} - -# Reference counting -function incref(idx) { if (idx > 0) refs[idx]++ } -function decref(idx) { if (idx > 0 && --refs[idx] == 0) free_cell(idx) } -``` - -## Starting Implementation - -Please help implement this VM following these steps: - -1. Core VM loop: -```awk -BEGIN { - # Initialize VM state - stack_ptr = 0 - heap_ptr = 0 - pc = 0 -} - -# Load program -{ - instr[NR-1] = $0 -} - -END { - # Main execution loop - while (pc < length(instr)) { - execute(instr[pc++]) - } -} -``` - -2. Instruction execution: -```awk -function execute(instr) { - split(instr, parts, " ") - op = parts[1] - - if (op == "PUSH_CONST") - push(parts[2]) - else if (op == "ADD") { - b = pop() - a = pop() - push(add(a, b)) - } - # etc. -} -``` - -Please start by implementing: -1. The basic VM loop -2. Stack operations -3. Arithmetic operations -4. Simple control flow - -Then we can move on to: -1. Function calls and returns -2. Environment handling -3. Cons cells and list operations -4. Garbage collection - -Focus on AWK's strengths: -- Associative arrays for memory management -- String operations for value handling -- Line-oriented processing for instruction loading - -And handle AWK's limitations: -- No native complex types -- Limited recursion -- String-based value representation -- Memory management constraints diff --git a/awk/scheme/scheme/docs/scheme-compilation-examples.md b/awk/scheme/scheme/docs/scheme-compilation-examples.md deleted file mode 100644 index 43bae3a..0000000 --- a/awk/scheme/scheme/docs/scheme-compilation-examples.md +++ /dev/null @@ -1,222 +0,0 @@ -# Scheme to VM Instruction Examples - -## Basic Arithmetic - -### Simple Addition -```scheme -(+ 1 2) -``` -```awk -# VM Instructions -0 PUSH_CONST N:1 -1 PUSH_CONST N:2 -2 ADD -``` - -### Nested Arithmetic -```scheme -(+ (* 3 4) (- 10 5)) -``` -```awk -# VM Instructions -0 PUSH_CONST N:3 -1 PUSH_CONST N:4 -2 MUL # Stack now has 12 -3 PUSH_CONST N:10 -4 PUSH_CONST N:5 -5 SUB # Stack now has 5 -6 ADD # Final result 17 -``` - -## Variable Binding and Use - -### Let Expression -```scheme -(let ((x 5)) - (+ x 1)) -``` -```awk -# VM Instructions -0 MAKE_ENV 1 # Create environment with 1 slot -1 PUSH_CONST N:5 # Push initial value -2 STORE_LOCAL 0 # Store into x's slot -3 LOAD_LOCAL 0 # Load x -4 PUSH_CONST N:1 -5 ADD -``` - -### Nested Let -```scheme -(let ((x 5)) - (let ((y (+ x 1))) - (* x y))) -``` -```awk -# VM Instructions -0 MAKE_ENV 1 # Environment for x -1 PUSH_CONST N:5 -2 STORE_LOCAL 0 # Store x -3 MAKE_ENV 1 # Environment for y -4 LOAD_LOCAL 0 # Load x -5 PUSH_CONST N:1 -6 ADD -7 STORE_LOCAL 0 # Store y -8 LOAD_LOCAL 1 # Load x (from outer env) -9 LOAD_LOCAL 0 # Load y -10 MUL -``` - -## Function Definition and Call - -### Simple Function -```scheme -(define (add1 x) - (+ x 1)) -``` -```awk -# VM Instructions -0 MAKE_FUNCTION 2 # Create function pointing to instruction 2 -1 STORE_GLOBAL 0 # Store in global env slot for add1 -2 MAKE_ENV 1 # Function body starts here -3 LOAD_LOCAL 0 # Load parameter x -4 PUSH_CONST N:1 -5 ADD -6 RETURN -``` - -### Function Call -```scheme -(add1 5) -``` -```awk -# VM Instructions -0 PUSH_CONST N:5 # Push argument -1 LOAD_GLOBAL 0 # Load add1 function -2 CALL 1 # Call with 1 argument -``` - -## List Operations - -### List Construction -```scheme -(cons 1 (cons 2 '())) -``` -```awk -# VM Instructions -0 PUSH_CONST N:1 -1 PUSH_CONST N:2 -2 PUSH_CONST NIL: -3 CONS # Creates (2 . nil) -4 CONS # Creates (1 . (2 . nil)) -``` - -### List Operations -```scheme -(car (cons 1 2)) -``` -```awk -# VM Instructions -0 PUSH_CONST N:1 -1 PUSH_CONST N:2 -2 CONS -3 CAR -``` - -## Conditionals - -### If Expression -```scheme -(if (< x 0) - (- 0 x) - x) -``` -```awk -# VM Instructions -0 LOAD_LOCAL 0 # Load x -1 PUSH_CONST N:0 -2 NUM_LT # Compare x < 0 -3 JUMP_IF_FALSE 7 # Skip to else branch if false -4 PUSH_CONST N:0 -5 LOAD_LOCAL 0 # Load x again -6 SUB # Compute 0 - x -7 JUMP 9 # Skip else branch -8 LOAD_LOCAL 0 # Else branch: just load x -9 NOP # Continue here -``` - -## Closures - -### Create Closure -```scheme -(let ((x 1)) - (lambda (y) (+ x y))) -``` -```awk -# VM Instructions -0 MAKE_ENV 1 # Environment for x -1 PUSH_CONST N:1 -2 STORE_LOCAL 0 # Store x -3 MAKE_FUNCTION 5 # Create function -4 MAKE_CLOSURE 1 # Capture current env -5 MAKE_ENV 1 # Function body starts here -6 LOAD_FREE 0 # Load captured x -7 LOAD_LOCAL 0 # Load parameter y -8 ADD -9 RETURN -``` - -## Recursive Function - -### Factorial -```scheme -(define (factorial n) - (if (= n 0) - 1 - (* n (factorial (- n 1))))) -``` -```awk -# VM Instructions -0 MAKE_FUNCTION 2 # Create factorial function -1 STORE_GLOBAL 0 # Store in global env -2 MAKE_ENV 1 # Function body starts -3 LOAD_LOCAL 0 # Load n -4 PUSH_CONST N:0 -5 NUM_EQ # Compare n = 0 -6 JUMP_IF_FALSE 9 # Skip to else branch -7 PUSH_CONST N:1 # Return 1 for base case -8 RETURN -9 LOAD_LOCAL 0 # Load n -10 LOAD_LOCAL 0 # Load n again -11 PUSH_CONST N:1 -12 SUB # Compute n - 1 -13 LOAD_GLOBAL 0 # Load factorial function -14 TAIL_CALL 1 # Tail call factorial(n-1) -15 MUL # Multiply n * factorial(n-1) -16 RETURN -``` - -## Implementation Notes - -1. Environment Chain -- Each MAKE_ENV creates a new frame -- LOAD_LOCAL counts down the chain for outer references -- STORE_LOCAL works similarly - -2. Closure Creation -- MAKE_CLOSURE captures current environment -- LOAD_FREE accesses captured variables - -3. Tail Calls -- TAIL_CALL reuses current stack frame -- Essential for recursive functions - -4. Memory Management -- CONS allocates in heap -- Reference counting needed for heap objects -- Environment frames need reference counting too - -These examples show typical compilation patterns. The actual compiler would need to: -1. Track variable locations -2. Manage label generation -3. Detect tail call positions -4. Handle lexical scoping properly diff --git a/awk/scheme/scheme/docs/scheme-compiler-impl.md b/awk/scheme/scheme/docs/scheme-compiler-impl.md deleted file mode 100644 index db8a204..0000000 --- a/awk/scheme/scheme/docs/scheme-compiler-impl.md +++ /dev/null @@ -1,307 +0,0 @@ -# Scheme Compiler Implementation in AWK - -## Basic Compiler Structure - -We'll implement the compiler as a recursive descent processor that converts Scheme expressions into VM instructions. Here's the core structure: - -```awk -# Global state -BEGIN { - next_label = 0 # For generating unique labels - next_address = 0 # Current instruction address - depth = 0 # Current environment depth -} - -# Main compile function -function compile(expr, parts, len) { - if (isAtom(expr)) - return compileAtom(expr) - - split(expr, parts, " ") - len = length(parts) - - # Strip parentheses - parts[1] = substr(parts[1], 2) # Remove leading ( - parts[len] = substr(parts[len], 1, length(parts[len])-1) # Remove trailing ) - - # Dispatch based on first element - if (parts[1] == "+") - return compileAdd(parts, len) - else if (parts[1] == "let") - return compileLet(parts, len) - else if (parts[1] == "lambda") - return compileLambda(parts, len) - else if (parts[1] == "if") - return compileIf(parts, len) - else - return compileCall(parts, len) -} - -# Utility functions -function isAtom(expr) { - return substr(expr, 1, 1) != "(" -} - -function newLabel() { - return "L" next_label++ -} - -function emit(instr) { - program[next_address++] = instr -} -``` - -## Compilation Patterns - -### 1. Basic Arithmetic - -```awk -function compileAdd(parts, len, i) { - # Compile each argument - for (i = 2; i <= len; i++) - compile(parts[i]) - - # Emit adds between each value - for (i = 3; i <= len; i++) - emit("ADD") - - return next_address - 1 -} - -# Example usage: -# Input: (+ 1 2 3) -# Output: -# PUSH_CONST N:1 -# PUSH_CONST N:2 -# ADD -# PUSH_CONST N:3 -# ADD -``` - -### 2. Let Expressions - -```awk -function compileLet(parts, len, bindings, body, i, num_bindings) { - # Parse let structure - bindings = parts[2] - body = parts[3] - - # Count bindings - split(bindings, binding_parts, " ") - num_bindings = (length(binding_parts) - 2) / 2 # Account for parentheses - - # Create new environment frame - emit("MAKE_ENV " num_bindings) - - # Compile each binding value and store - for (i = 1; i <= num_bindings; i++) { - compile(binding_parts[i * 2]) - emit("STORE_LOCAL " (i - 1)) - } - - # Compile body in new environment - depth++ - compile(body) - depth-- - - return next_address - 1 -} - -# Example usage: -# Input: (let ((x 5)) (+ x 1)) -# Output: -# MAKE_ENV 1 -# PUSH_CONST N:5 -# STORE_LOCAL 0 -# LOAD_LOCAL 0 -# PUSH_CONST N:1 -# ADD -``` - -### 3. Lambda Expressions - -```awk -function compileLambda(parts, len, param_list, body, label_start, label_end) { - label_start = newLabel() - label_end = newLabel() - - # Emit jump around function body - emit("JUMP " label_end) - - # Mark function start - emit(label_start ":") - - # Parse parameters - param_list = parts[2] - body = parts[3] - - # Compile function body - depth++ - compile(body) - emit("RETURN") - depth-- - - # Mark function end - emit(label_end ":") - - # Create function object - emit("MAKE_FUNCTION " label_start) - - return next_address - 1 -} - -# Example usage: -# Input: (lambda (x) (+ x 1)) -# Output: -# JUMP L1 -# L0: -# LOAD_LOCAL 0 -# PUSH_CONST N:1 -# ADD -# RETURN -# L1: -# MAKE_FUNCTION L0 -``` - -### 4. If Expressions - -```awk -function compileIf(parts, len, condition, true_branch, false_branch, label_else, label_end) { - label_else = newLabel() - label_end = newLabel() - - # Compile condition - compile(parts[2]) - - # Jump to else if false - emit("JUMP_IF_FALSE " label_else) - - # Compile true branch - compile(parts[3]) - emit("JUMP " label_end) - - # Compile false branch - emit(label_else ":") - if (len > 4) # If there is an else branch - compile(parts[4]) - - emit(label_end ":") - - return next_address - 1 -} - -# Example usage: -# Input: (if (< x 0) (- 0 x) x) -# Output: -# LOAD_LOCAL 0 -# PUSH_CONST N:0 -# NUM_LT -# JUMP_IF_FALSE L0 -# PUSH_CONST N:0 -# LOAD_LOCAL 0 -# SUB -# JUMP L1 -# L0: -# LOAD_LOCAL 0 -# L1: -``` - -### 5. Function Calls - -```awk -function compileCall(parts, len, i, num_args) { - num_args = len - 1 - - # Compile each argument - for (i = 2; i <= len; i++) - compile(parts[i]) - - # Compile function reference - compile(parts[1]) - - # Emit call instruction - emit("CALL " num_args) - - return next_address - 1 -} - -# Example usage: -# Input: (add1 5) -# Output: -# PUSH_CONST N:5 -# LOAD_GLOBAL 0 -# CALL 1 -``` - -## Symbol Table Management - -```awk -# Track variables and their locations -function enterScope() { - scope_depth++ - next_local = 0 -} - -function leaveScope() { - scope_depth-- -} - -function addLocal(name) { - symbols[scope_depth "," name] = next_local++ -} - -function findVariable(name, i) { - for (i = scope_depth; i >= 0; i--) - if ((i "," name) in symbols) - return "LOCAL " i "," symbols[i "," name] - return "GLOBAL " name -} -``` - -## Example Usage - -Here's how to use the compiler: - -```awk -BEGIN { - # Initialize compiler - next_label = 0 - next_address = 0 - depth = 0 - - # Compile some example code - expr = "(let ((x 5)) (+ x 1))" - compile(expr) - - # Print generated code - for (i = 0; i < next_address; i++) - print i ": " program[i] -} -``` - -## Implementation Notes - -1. Variable Resolution - - Track scope depth for proper variable lookup - - Generate appropriate LOAD/STORE instructions - - Handle closure captures - -2. Label Generation - - Use unique labels for control flow - - Track label addresses for jumps - -3. Environment Management - - Track environment depth - - Generate correct frame sizes - - Handle nested scopes - -4. Error Handling - - Check for syntax errors - - Validate number of arguments - - Report meaningful errors - -5. Optimization Opportunities - - Tail call detection - - Constant folding - - Peephole optimization - - Dead code elimination diff --git a/awk/scheme/scheme/docs/scheme-vm-prompt.md b/awk/scheme/scheme/docs/scheme-vm-prompt.md deleted file mode 100644 index 7463494..0000000 --- a/awk/scheme/scheme/docs/scheme-vm-prompt.md +++ /dev/null @@ -1,112 +0,0 @@ -# Implementation Request for Scheme Virtual Machine - -I need help implementing a stack-based virtual machine for a minimal Scheme implementation in JavaScript. The implementation should follow these requirements: - -## Core Data Structures - -### Value Representation -Please help me implement a tagged union type for Scheme values with these cases: -- Numbers (using JavaScript numbers) -- Booleans (using JavaScript booleans) -- Symbols (as strings) -- Pairs (cons cells) -- Functions (closures) -- Nil - -Each value should include a type tag and the actual value data. The implementation should be memory efficient while still being easy to debug. - -### Stack Frame -Design a stack frame structure that includes: -- An array for local variables -- An operand stack (array) -- Return address (number) -- Previous frame pointer -- Captured environment reference (for closures) - -### Instruction Format -Each instruction should be represented as an object with: -- opcode (string or number) -- operands (if any) -- source position (for debugging) - -## Core Components Needed - -1. Value Constructor/Accessors -- Functions to create each type of value -- Type predicates (isNumber, isPair, etc.) -- Safe accessors that check types - -2. Memory Management -- A simple mark-and-sweep garbage collector -- Root set scanning of VM stack and global environment -- Object allocation interface - -3. Environment Management -- Environment chain implementation -- Closure creation and variable capture -- Variable lookup and storage - -4. Instruction Interpreter -- Main instruction dispatch loop -- Stack manipulation helpers -- Error handling with meaningful messages -- Debugging support (stack traces) - -## Initial Implementation Steps - -Please help me implement these components in this order: - -1. Value type system and basic operations -2. Stack frame and basic stack operations -3. Main instruction interpreter loop -4. Simple arithmetic and comparison operations -5. Function calls and returns -6. Closure creation and environment handling -7. Cons cells and list operations -8. Basic garbage collection - -## Starting Point - -Please start by showing me how to implement: - -1. The tagged value type system -2. Basic stack operations (push, pop) -3. A simple instruction interpreter that can handle: - - PUSH_CONST - - POP - - ADD - - RETURN - -The code should be in a functional style, avoiding mutation where practical, and should use modern JavaScript features. Please include basic error checking and type safety. - -## Testing Approach - -For each component, I'd like to see: -1. Basic unit tests -2. Example instruction sequences -3. Error cases to handle -4. Edge cases to consider - -The implementation should prioritize correctness and clarity over performance initially. - -## Additional Considerations - -Please also consider: -1. How to handle tail calls efficiently -2. Debug information tracking -3. Error recovery strategies -4. Performance bottlenecks to watch for -5. Future extensibility - -## Style Preferences - -The implementation should: -- Use functional programming patterns where appropriate -- Minimize state mutation -- Use clear naming conventions -- Include detailed comments explaining non-obvious code -- Use TypeScript-style JSDoc comments for better tooling support -- Target modern browsers without external dependencies -- Use ES modules for code organization - -Please start with the value type system implementation, showing how to create and manipulate Scheme values in a type-safe way. diff --git a/awk/scheme/scheme/docs/scheme-vm-spec.md b/awk/scheme/scheme/docs/scheme-vm-spec.md deleted file mode 100644 index 424602e..0000000 --- a/awk/scheme/scheme/docs/scheme-vm-spec.md +++ /dev/null @@ -1,130 +0,0 @@ -# Scheme Virtual Machine Specification - -## Stack and Memory Model -The VM uses a stack-based architecture with a separate heap for storing longer-lived objects. Each stack frame contains: -- Local variable space -- Operand stack -- Return address - -## Data Types -- Numbers (implemented as JavaScript numbers) -- Booleans -- Symbols (interned strings) -- Pairs (cons cells) -- Functions (closures) -- Nil - -## Instructions - -### Stack Operations -- PUSH_CONST value ; Push constant onto stack -- POP ; Remove top value from stack -- DUP ; Duplicate top stack value -- SWAP ; Swap top two stack values - -### Environment Operations -- LOAD_LOCAL idx ; Load local variable at index -- STORE_LOCAL idx ; Store into local variable -- MAKE_CLOSURE n ; Create closure with n free variables -- LOAD_FREE idx ; Load from closure's environment -- STORE_FREE idx ; Store into closure's environment - -### Function Operations -- CALL n ; Call function with n arguments -- TAIL_CALL n ; Tail-call with n arguments -- RETURN ; Return from function -- MAKE_FUNCTION addr ; Create function object - -### Pair Operations -- CONS ; Create pair from two stack values -- CAR ; Get first element of pair -- CDR ; Get second element of pair -- SET_CAR ; Set first element of pair -- SET_CDR ; Set second element of pair - -### Arithmetic Operations -- ADD ; Add top two numbers -- SUB ; Subtract -- MUL ; Multiply -- DIV ; Divide -- REMAINDER ; Modulo operation - -### Comparison Operations -- EQ ; Generic equality test -- NUM_EQ ; Numeric equality -- NUM_LT ; Less than -- NUM_GT ; Greater than - -### Control Flow -- JUMP offset ; Unconditional jump -- JUMP_IF_FALSE offset ; Conditional jump -- JUMP_IF_TRUE offset ; Conditional jump - -### Type Operations -- TYPE_OF ; Get type of value -- IS_PAIR ; Test if value is pair -- IS_NUMBER ; Test if value is number -- IS_SYMBOL ; Test if value is symbol - -## Example Instruction Sequences - -### Function Definition -```scheme -(define (add1 x) (+ x 1)) -``` -Compiles to: -``` -MAKE_FUNCTION L1 -STORE_LOCAL 0 -JUMP L2 -L1: - LOAD_LOCAL 0 ; Load x - PUSH_CONST 1 - ADD - RETURN -L2: -``` - -### List Creation -```scheme -(cons 1 (cons 2 '())) -``` -Compiles to: -``` -PUSH_CONST 1 -PUSH_CONST 2 -PUSH_CONST nil -CONS -CONS -``` - -### Closure Creation -```scheme -(let ((x 1)) - (lambda (y) (+ x y))) -``` -Compiles to: -``` -PUSH_CONST 1 ; Push initial value of x -MAKE_FUNCTION L1 ; Create function body -MAKE_CLOSURE 1 ; Create closure capturing x -JUMP L2 -L1: - LOAD_FREE 0 ; Load captured x - LOAD_LOCAL 0 ; Load parameter y - ADD - RETURN -L2: -``` - -## Implementation Notes - -1. The VM should implement proper tail-call optimization using the TAIL_CALL instruction. - -2. Garbage collection can be implemented using a simple mark-and-sweep collector, scanning the stack and heap for reachable objects. - -3. For efficiency, common constants (small integers, nil, etc.) can be preallocated and shared. - -4. The environment model uses static scope, with closures capturing their creation environment. - -5. Function calls maintain their own stack frame with local variables and operand stack. diff --git a/awk/scheme/scheme/docs/test-program.md b/awk/scheme/scheme/docs/test-program.md deleted file mode 100644 index ee20e32..0000000 --- a/awk/scheme/scheme/docs/test-program.md +++ /dev/null @@ -1,48 +0,0 @@ -# Test Program - -Save this as `test.scm.asm`: -``` -PUSH_CONST N:5 -PUSH_CONST N:3 -ADD -PUSH_CONST N:2 -SUB -HALT -``` - -Run with: -```bash -awk -f vm.awk test.scm.asm -``` - -Expected output: -``` -Result: N:6 -``` - -# Additional Test Cases - -## List Operations -``` -PUSH_CONST N:1 -PUSH_CONST N:2 -CONS -PUSH_CONST N:3 -CONS -CAR -HALT -``` - -## Error Cases -``` -# Stack underflow -POP -POP -HALT - -# Type error -PUSH_CONST S:hello -PUSH_CONST N:1 -ADD -HALT -``` diff --git a/awk/scheme/scheme/examples/lambda.test.scm b/awk/scheme/scheme/examples/lambda.test.scm new file mode 100644 index 0000000..1f2bb09 --- /dev/null +++ b/awk/scheme/scheme/examples/lambda.test.scm @@ -0,0 +1,12 @@ +; Test lambda function support +((lambda (x) (+ x 1)) 41) ; Should return 42 + +; Test lambda with multiple parameters +((lambda (x y) (+ x y)) 20 22) ; Should return 42 + +; Test lambda in let expression +(let ((add1 (lambda (x) (+ x 1)))) + (add1 41)) ; Should return 42 + +; Test nested lambda +((lambda (x) ((lambda (y) (+ x y)) 1)) 41) ; Should return 42 \ No newline at end of file |