about summary refs log tree commit diff stats
diff options
context:
space:
mode:
-rw-r--r--awk/scheme/scheme/TODO.txt95
-rw-r--r--awk/scheme/scheme/scratch/arch-notes.md115
-rw-r--r--awk/scheme/scheme/scratch/random.txt112
3 files changed, 275 insertions, 47 deletions
diff --git a/awk/scheme/scheme/TODO.txt b/awk/scheme/scheme/TODO.txt
index 31723a4..f576fba 100644
--- a/awk/scheme/scheme/TODO.txt
+++ b/awk/scheme/scheme/TODO.txt
@@ -5,65 +5,66 @@ Current State:
 -------------
 - Basic arithmetic operations (+,-,*,/) are working
 - Let expressions with simple bindings are working
-- Function definitions (define) and lambda expressions are partially implemented
+- Function definitions (define) and lambda expressions are implemented (no nested lambdas)
 - Stack-based virtual machine with environment for variable bindings
+- Closures supported for top-level lambdas (not nested)
+- State persistence for globals and functions between REPL sessions
 
 Recent Changes:
 --------------
-1. Fixed function table array naming conflicts (func_name -> func_def_names)
-2. Modified vm_get_value to handle function names correctly
-3. Attempted to fix argument handling in function calls
-4. Modified lambda compilation to avoid pushing function name twice
+1. Improved function table and environment handling
+2. Enhanced debug logging for function calls and environment
+3. Refined lambda compilation and closure creation
+4. Updated documentation and architectural notes
 
-Current Issues:
---------------
-1. Function Definitions (define):
-   - Function calls are failing with "ADD requires numeric operands"
-   - Arguments may not be properly passed to functions
-   - Function environment may not be properly set up
-
-2. Lambda Expressions:
-   - Direct lambda calls are failing
-   - Lambda bindings in let expressions are failing
-   - Function environment for lambda parameters may not be correct
+Current Issues / Limitations:
+----------------------------
+1. **Nested Lambda Support**:
+   - Lambdas defined inside other lambdas are not supported ("Undefined closure function" errors)
+2. **Reference Counting**:
+   - Reference counting for heap objects (cons cells) is stubbed but not enforced
+3. **Error Handling**:
+   - Minimal error recovery, especially in the REPL
+4. **Tail Recursion**:
+   - No proper tail call optimization
+5. **Data Types**:
+   - No floating point, string, or full numeric tower support
+6. **Garbage Collection**:
+   - No GC; memory for cons cells is not reclaimed
+7. **Standard Library**:
+   - No standard library or macros
 
 Next Steps:
 -----------
-1. Debug Function Calls:
-   - Add detailed debug logging in vm_call_function
-   - Verify argument handling and environment setup
-   - Check if function code is being properly stored and retrieved
-
-2. Fix Function Environment:
-   - Review how parameters are stored in the environment
-   - Ensure environment is properly restored after function calls
-   - Verify parameter values are correctly passed to functions
-
-3. Improve Lambda Support:
-   - Review lambda compilation process
-   - Ensure lambda functions are properly stored and called
-   - Fix environment handling for lambda parameters
-
-4. Testing Strategy:
-   - Create test cases for each type of function call
-   - Add debug output to track stack and environment state
-   - Verify each step of function compilation and execution
-
-5. Code Cleanup:
-   - Review and document function handling code
-   - Ensure consistent naming and error handling
-   - Add comments explaining complex operations
+1. **Implement Nested Lambda Support**:
+   - Refactor closure environment capture and restoration
+   - Update compiler and VM to support nested closures
+2. **Improve Memory Management**:
+   - Implement or enforce reference counting
+   - Plan for future garbage collection
+3. **Enhance Error Handling**:
+   - Add better error messages and recovery in REPL and VM
+4. **Tail Call Optimization**:
+   - Investigate and implement proper tail recursion
+5. **Testing and Debugging**:
+   - Expand test suite for edge cases (especially closures and let/lambda)
+   - Continue using DEBUG=1 for tracing execution
+6. **Documentation and Code Quality**:
+   - Keep architectural notes and code comments up to date
+   - Document any new patterns or changes
 
 Technical Notes:
 ---------------
-- Function calls use a combination of LOOKUP, GET_VALUE, and CALL instructions
-- Environment stack handles variable bindings and function parameters
-- Function code is stored in FUNCTIONS table with unique names
-- Lambda functions use __lambda_N naming scheme
+- Function calls use LOOKUP, GET_VALUE, and CALL instructions
+- Environment stack manages variable bindings and function parameters
+- Function code is stored in the FUNCTIONS table with unique names
+- Lambda functions use __lambda_N naming scheme; closures capture environment at creation
+- State is persisted to /tmp/scheme_vm.state and /tmp/scheme_vm.env
 
 Debugging Tips:
 --------------
 1. Enable DEBUG=1 to see detailed execution logs
-2. Check stack contents before and after each operation
-3. Verify environment state during function calls
-4. Monitor function code storage and retrieval 
\ No newline at end of file
+2. Check stack and environment contents before and after each operation
+3. Monitor closure creation and environment capture/restoration
+4. Verify function code storage and retrieval in the FUNCTIONS table
+5. Use and expand test cases for all supported features 
\ No newline at end of file
diff --git a/awk/scheme/scheme/scratch/arch-notes.md b/awk/scheme/scheme/scratch/arch-notes.md
new file mode 100644
index 0000000..060ca72
--- /dev/null
+++ b/awk/scheme/scheme/scratch/arch-notes.md
@@ -0,0 +1,115 @@
+# Awk-Scheme: Architectural Notes
+
+## Overview
+Awk-Scheme is a minimal Scheme interpreter implemented in AWK, composed of a compiler and a stack-based virtual machine (VM). The architecture is modular, with clear separation of concerns between parsing/compilation and execution. The design leverages several classic architectural patterns to achieve extensibility, maintainability, and clarity.
+
+---
+
+## 1. Program Flow: High-Level Pipeline
+
+1. **Input**: User provides Scheme code (via REPL or file).
+2. **Compilation**: The compiler (`bin/compiler.awk`) parses and compiles Scheme code into VM instructions.
+3. **Execution**: The VM (`bin/vm.awk`) executes the instructions, managing state, environment, and memory.
+
+---
+
+## 2. Compiler (`bin/compiler.awk`)
+
+### 2.1. Lexical Analysis (Lexer)
+- **Pattern**: *Lexer/Parser Separation* (classic compiler front-end)
+- **Why**: Decouples tokenization from parsing, making the code easier to reason about and extend.
+- **How**: The compiler tokenizes input into numbers, symbols, and parentheses, handling whitespace and comments.
+
+### 2.2. Parsing (Recursive Descent)
+- **Pattern**: *Recursive Descent Parser*
+- **Why**: Simple, direct mapping from grammar to code; easy to debug and extend for a small language.
+- **How**: The parser builds an expression tree from tokens, handling nested expressions and validating syntax.
+
+### 2.3. Code Generation
+- **Pattern**: *Visitor/Dispatcher* (for expression types)
+- **Why**: Each expression type (number, variable, list, special form) is handled by a dedicated function, improving maintainability.
+- **How**: The compiler emits stack-based VM instructions for each expression, handling special forms (define, let, lambda) and function calls.
+
+### 2.4. Closure Support
+- **Pattern**: *Closure Conversion* (from functional programming)
+- **Why**: Enables lexical scoping and first-class functions by capturing the environment at lambda creation.
+- **How**: The compiler emits instructions to capture the current environment and create closure objects.
+
+---
+
+## 3. Virtual Machine (`bin/vm.awk`)
+
+### 3.1. Stack-Based Execution
+- **Pattern**: *Stack Machine* (classic VM design)
+- **Why**: Simplicity and direct mapping from compiled instructions to execution; easy to implement in AWK.
+- **How**: The VM maintains an evaluation stack for operands and results, executing instructions sequentially.
+
+### 3.2. Typed Value System
+- **Pattern**: *Tagged Union* (Algebraic Data Type)
+- **Why**: Enables runtime type checking and safe operations on heterogeneous values.
+- **How**: All values are tagged (e.g., `N:`, `B:`, `P:`, `F:`, `CLOSURE:`) and checked at runtime.
+
+### 3.3. Environment Model
+- **Pattern**: *Environment Chain* (Lexical Scoping)
+- **Why**: Supports variable bindings, lexical scope, and closures.
+- **How**: The VM maintains an environment stack for variable bindings, with special handling for global and closure environments.
+
+### 3.4. Function and Closure Handling
+- **Pattern**: *Direct Function Execution* (no program array modification)
+- **Why**: Simplifies call/return logic and avoids mutation of the instruction stream.
+- **How**: Functions are stored as code blocks; calls execute code directly, with environment management for parameters and closures.
+
+### 3.5. Heap and Memory Management
+- **Pattern**: *Manual Heap with Reference Counting (partial)*
+- **Why**: Enables cons cell allocation and basic memory management.
+- **How**: The VM allocates cons cells on a heap array, with a placeholder for reference counting (not fully implemented).
+
+### 3.6. State Persistence
+- **Pattern**: *State Serialization*
+- **Why**: Allows global state and functions to persist across REPL sessions.
+- **How**: The VM serializes global variables and function definitions to files, loading them on startup.
+
+---
+
+## 4. Extensibility and Maintainability
+- **Pattern**: *Separation of Concerns*
+- **Why**: Compiler and VM are independent, making it easy to extend the language or change the execution model.
+- **Pattern**: *Table-Driven Dispatch* (for built-in functions)
+- **Why**: Adding new primitives or special forms is straightforward.
+
+---
+
+## 5. Notable Limitations
+- No support for nested lambdas (yet), proper tail recursion, or garbage collection.
+- Reference counting is stubbed but not enforced.
+- Error handling is minimal.
+
+---
+
+## 6. Summary Table: Patterns Used
+
+| Area                | Pattern(s) Used                  | Why?                                 |
+|---------------------|----------------------------------|--------------------------------------|
+| Lexing/Parsing      | Lexer/Parser, Recursive Descent  | Simplicity, extensibility            |
+| Code Generation     | Visitor/Dispatcher               | Maintainability, clarity             |
+| VM Execution        | Stack Machine, Tagged Union      | Simplicity, type safety              |
+| Environment         | Environment Chain, Closure Conv. | Lexical scoping, closures            |
+| Function Dispatch   | Table-Driven Dispatch            | Extensibility                        |
+| State Persistence   | State Serialization              | REPL continuity                      |
+| Memory Management   | Manual Heap, Ref Counting (stub) | List support, future GC              |
+
+---
+
+## 7. Architectural Choices: Rationale
+- **AWK as Implementation Language**: Chosen for portability and as a challenge; influences the use of arrays and string-based data structures.
+- **Stack Machine**: Maps well to AWK's capabilities and keeps the VM simple.
+- **Separation of Compiler/VM**: Enables clear boundaries and easier debugging.
+- **Explicit Typing**: Reduces runtime errors and clarifies value handling.
+
+---
+
+## 8. Flow Diagram (Textual)
+
+```
+User Input (Scheme) → [Compiler] → VM Instructions → [VM] → Result/State
+```
\ No newline at end of file
diff --git a/awk/scheme/scheme/scratch/random.txt b/awk/scheme/scheme/scratch/random.txt
new file mode 100644
index 0000000..20b3bcf
--- /dev/null
+++ b/awk/scheme/scheme/scratch/random.txt
@@ -0,0 +1,112 @@
+Other Uses for the Awk-Scheme VM
+================================
+
+The stack-based VM in Awk-Scheme is surprisingly versatile. Here are some realistic alternatives that could be implemented using the same VM architecture:
+
+## 1. Forth Implementation
+**Feasibility: High**
+
+Forth is a natural fit since it's already stack-based:
+- **Stack Operations**: VM already has PUSH_CONST, POP, DUP, SWAP
+- **Arithmetic**: ADD, SUB, MUL, DIV are perfect for Forth
+- **Memory**: Could extend heap for Forth's memory model
+- **Control Flow**: Would need to add conditional jumps and loops
+- **Words**: Could map Forth words to VM functions
+
+**Required Extensions:**
+- JMP, JZ (jump if zero), JNZ instructions
+- More stack operations (OVER, ROT, etc.)
+- Memory read/write operations
+- Input/output operations
+
+## 2. Simple Calculator Language
+**Feasibility: Very High**
+
+A basic calculator with variables and functions:
+- **Syntax**: `x = 5; y = x + 3; print y`
+- **Features**: Variables, basic math, simple functions
+- **VM Fit**: Perfect - already has arithmetic, variables, functions
+
+**Minimal Changes Needed:**
+- New parser for calculator syntax
+- Assignment operator handling
+- Print statement
+
+## 3. Configuration Language
+**Feasibility: High**
+
+A simple config language for key-value pairs and nested structures:
+- **Syntax**: `server { port = 8080; host = "localhost"; }`
+- **Features**: Nested objects, arrays, basic expressions
+- **VM Fit**: Good - can use cons cells for nested structures
+
+**Required Extensions:**
+- String support
+- Object/struct creation
+- Field access operations
+
+## 4. Simple Scripting Language
+**Feasibility: Medium**
+
+A basic scripting language with loops and conditionals:
+- **Syntax**: `if x > 0 { y = x * 2; } else { y = 0; }`
+- **Features**: Variables, conditionals, loops, functions
+- **VM Fit**: Good but needs control flow
+
+**Required Extensions:**
+- Conditional jumps
+- Loop constructs
+- Boolean logic operations
+
+## 5. Data Processing Language
+**Feasibility: Medium**
+
+A language for simple data transformations:
+- **Syntax**: `filter(x > 0) | map(x * 2) | sum()`
+- **Features**: Pipeline operations, list processing
+- **VM Fit**: Good - can use cons cells for lists
+
+**Required Extensions:**
+- List operations (map, filter, reduce)
+- Pipeline operator
+- More built-in functions
+
+## 6. Simple Logic/Constraint Language
+**Feasibility: High**
+
+A language for expressing simple constraints and rules:
+- **Syntax**: `rule: if age > 18 then can_vote = true`
+- **Features**: Rules, facts, simple inference
+- **VM Fit**: Good - boolean operations and variables
+
+**Required Extensions:**
+- Boolean logic (AND, OR, NOT)
+- Rule evaluation
+- Fact storage
+
+## VM Strengths for Alternative Languages:
+1. **Stack-based**: Natural for many languages
+2. **Typed values**: Good foundation for type safety
+3. **Environment model**: Supports variables and scoping
+4. **Function system**: Reusable for different syntaxes
+5. **Extensible**: Easy to add new instructions
+
+## VM Limitations for Alternative Languages:
+1. **Limited data types**: Only numbers, booleans, pairs, symbols
+2. **No strings**: Would need to add string support
+3. **Limited control flow**: Only function calls, no loops/conditionals
+4. **No I/O**: Would need file/console operations
+5. **Memory constraints**: Simple heap model
+
+## Most Realistic Next Steps:
+1. **Forth**: Natural progression, leverages existing stack operations
+2. **Calculator**: Minimal changes, good learning exercise
+3. **Config Language**: Practical use case, moderate complexity
+
+## Implementation Strategy:
+1. Keep the VM unchanged
+2. Write new compilers for different syntaxes
+3. Add minimal VM extensions as needed
+4. Reuse existing VM infrastructure (stack, environment, functions)
+
+The VM's simplicity is actually a strength - it's easy to understand and extend, making it a good foundation for experimenting with different language designs. 
\ No newline at end of file