diff options
Diffstat (limited to 'js/scripting-lang/README.md')
-rw-r--r-- | js/scripting-lang/README.md | 575 |
1 files changed, 255 insertions, 320 deletions
diff --git a/js/scripting-lang/README.md b/js/scripting-lang/README.md index 9ed6937..08334a4 100644 --- a/js/scripting-lang/README.md +++ b/js/scripting-lang/README.md @@ -1,376 +1,311 @@ # Simple Scripting Language -A minimal, interpreted scripting language designed for learning language implementation concepts. This language demonstrates the core components needed for an interpreter: lexer, parser, and interpreter. +A functional programming language with immutable variables, first-class functions, and pattern matching, built on a combinator-based foundation. ## Features -- **Arithmetic operations**: `+`, `-`, `*`, `/`, `%` (modulo), `^` (power) -- **Comparison operators**: `=`, `<`, `>`, `<=`, `>=`, `!=` -- **Logical operators**: `and`, `or`, `xor`, `not` -- **Variable assignment**: Immutable variables with `:` syntax -- **Function definitions**: Arrow function syntax with pattern matching -- **Pattern matching**: Case expressions with wildcard support -- **Function calls**: Direct function application -- **First-class functions**: Functions can be passed as arguments using `@` syntax -- **Function composition**: Higher-order functions for combining functions -- **Lexical scoping**: Functions create their own scope -- **Input/Output**: Built-in IO operations with `..in`, `..out`, and `..assert` +- **Immutable Variables**: Variables cannot be reassigned +- **First-Class Functions**: Functions can be passed as arguments and stored in data structures +- **Lexical Scoping**: Functions create their own scope +- **Pattern Matching**: Case expressions with wildcard support +- **Table Literals**: Lua-style tables with both array-like and key-value entries +- **Standard Library**: Built-in higher-order functions (`map`, `compose`, `pipe`, `apply`, `filter`, `reduce`, `fold`, `curry`) +- **Combinator Foundation**: All operations are implemented as function calls under the hood +- **IO Operations**: Built-in input/output operations (`..in`, `..out`, `..assert`) +- **Floating Point Arithmetic**: Full support for decimal numbers +- **Unary Minus**: Support for negative numbers (e.g., `-1`, `-3.14`) + +## Current Implementation Status + +### ✅ Completed Features +- **Core Combinators**: All arithmetic, comparison, logical, and higher-order combinators implemented +- **Parser Translation**: All operators translated to combinator function calls +- **Syntax Preservation**: All existing syntax works unchanged +- **Standard Library**: Complete set of higher-order functions +- **Basic Operations**: Arithmetic, comparison, logical operations +- **Function Definitions**: Arrow functions and function declarations +- **Tables**: Table literals and bracket notation access +- **IO Operations**: Input, output, and assertions + +### 🔄 In Progress +- **Recursive Functions**: Support for functions that call themselves +- **Advanced Pattern Matching**: Extended when expression patterns +- **Dot Notation**: Table access with dot notation (`table.property`) +- **Multi-parameter Cases**: Case expressions with multiple parameters ## Syntax -### Variables - -Variables are immutable and assigned using the `:` operator: - -``` -x : 5; -y : 10; +### Basic Operations ``` +/* Arithmetic */ +x : 5 + 3; +y : 10 - 2; +z : 4 * 3; +w : 15 / 3; +neg : -5; /* Unary minus */ -### Arithmetic Operations +/* Comparisons */ +result : x > y; +equal : a = b; +not_equal : a != b; -Arithmetic operations are supported with parentheses grouping: - -``` -sum : x + y; -diff : x - y; -product : x * y; -quotient : x / y; -modulo : 17 % 5; // Returns 2 -power : 2 ^ 3; // Returns 8 -grouped : (5 + 3) * 2; // Returns 16 -nested : ((5 + 3) * 2) + 1; // Returns 17 -``` - -### Function Definitions - -Functions are defined using arrow syntax (`->`): - -``` -add : x y -> x + y; -double : x -> x * 2; +/* Logical */ +and_result : true and false; +or_result : true or false; ``` -### Function Calls - -Functions are called by providing arguments directly after the function name: - -``` -result : add 3 4; -doubled : double 5; +### Variables and Functions ``` +/* Immutable variables */ +x : 42; +y : "hello"; -#### Function Calls with Parentheses - -Function calls support parenthesized arguments for complex expressions: +/* Function definition */ +f : x -> x * 2; +/* Function call */ +result : f 5; ``` -// Simple function call -result1 : add 3 4; - -// Function call with parenthesized arithmetic -result2 : add (3 + 2) (4 + 1); - -// Function call with function calls in parentheses -result3 : add (double 3) (square 2); -// Complex nested function calls with parentheses -result4 : add (add 1 2) (add 3 4); +### Tables ``` +/* Table literal */ +table : {1, 2, 3, key: "value"}; -**Note:** Parentheses provide explicit grouping and are the recommended way to handle complex nested function calls. - -### First-Class Functions - -Functions can be passed as arguments to other functions using the `@` prefix: - +/* Table access */ +first : table[1]; +value : table.key; /* Coming soon */ +nested : table.key.subkey; /* Coming soon */ ``` -double : x -> x * 2; -square : x -> x * x; -compose : f g x -> f g x; - -composed : compose @double @square 3; // double(square(3)) = 18 -``` - -**Function Reference Syntax:** -- `@functionName` - Creates a function reference (doesn't execute the function) -- `functionName args` - Calls the function with arguments ### Pattern Matching - -The language supports pattern matching with case expressions in function bodies: - ``` -compare : x y -> - case x y of - 0 0 : "both zero" - 0 _ : "x is zero" - _ 0 : "y is zero" - _ _ : "neither zero"; +/* Case expression */ +result : when x is + 1 then "one" + 2 then "two" + _ then "other"; ``` -#### Pattern Matching Syntax - -- **Exact matches**: `0 0` matches when both values are 0 -- **Wildcards**: `_` matches any value -- **Mixed patterns**: `0 _` matches when first value is 0 and second can be anything - -### Comparison and Logical Operations - -The language supports comparison and logical operations: - -``` -// Comparison operators -less : 3 < 5; // true -greater : 10 > 5; // true -equal : 5 = 5; // true -not_equal : 3 != 5; // true -less_equal : 5 <= 5; // true -greater_equal : 5 >= 3; // true - -// Logical operators -and_result : 1 and 1; // true -or_result : 0 or 1; // true -xor_result : 1 xor 0; // true -not_result : not 0; // true +### IO Operations ``` +/* Output */ +..out "Hello, World!"; -### Input/Output Operations +/* Input */ +name : ..in; -The language provides built-in IO operations for reading from stdin and writing to stdout: - -``` -name : ..in; // Read input from stdin -..out name; // Output to stdout -..out "Hello"; // Output string literal -..out 42; // Output number -..assert x = 5; // Assert that x equals 5 -..assert y > 3; // Assert that y is greater than 3 -..assert z != 0; // Assert that z is not equal to 0 +/* Assertion */ +..assert x = 5; ``` -**IO Functions:** -- `..in` - Reads a line from stdin, returns a number if possible, otherwise a string -- `..out` - Outputs a value to stdout and returns the value -- `..assert` - Asserts that a comparison is true, throws an error if it's false. Supports all comparison operators: `=`, `<`, `>`, `<=`, `>=`, `!=` - -**Note:** The `..` prefix indicates these are impure operations that have side effects. - -## Language Components - -### Lexer - -The lexer tokenizes input into meaningful units: -- Numbers: `123` -- Identifiers: `variable_name` -- Operators: `+`, `-`, `*`, `/` -- Keywords: `case`, `of`, `function` -- Symbols: `:`, `->`, `_` - -### Parser - -The parser builds an Abstract Syntax Tree (AST) from tokens, supporting: -- Number literals -- Arithmetic expressions -- Variable assignments -- Function declarations -- Function calls -- Case expressions -- Pattern matching - -### Interpreter - -The interpreter executes the AST with: -- Global scope management -- Function evaluation -- Pattern matching execution -- Arithmetic computation -- Immutable variable semantics - -## Example Programs - -### Basic Arithmetic - -``` -x : 5; -y : 10; -sum : x + y; -sum; // Returns 15 -``` - -### Function Definition and Call - -``` -add : x y -> x + y; -result : add 3 4; -result; // Returns 7 -``` - -### Pattern Matching - -``` -compare : x y -> - case x y of - 0 0 : "both zero" - 0 _ : "x is zero" - _ 0 : "y is zero" - _ _ : "neither zero"; - -test1 : compare 0 0; // Returns "both zero" -test2 : compare 0 5; // Returns "x is zero" -test3 : compare 5 0; // Returns "y is zero" -test4 : compare 5 5; // Returns "neither zero" -``` - -### Function Composition - +### Standard Library ``` +/* Map */ double : x -> x * 2; -square : x -> x * x; -compose : f g x -> f g x; -apply : f x -> f x; - -composed : compose @double @square 3; // double(square(3)) = 18 -applied : apply @double 5; // double(5) = 10 -``` - -### Recursion - -The language supports recursive function calls in function body case expressions: - -``` -countdown : n -> - case n of - 0 : 0 - _ : countdown n - 1; - -result : countdown 3; // Returns 0 +squared : map @double 5; -// Note: Complex arithmetic in recursive functions may require careful expression design -// due to the language's lack of explicit grouping (parentheses) -``` - -### Input/Output Example +/* Filter */ +isPositive : x -> x > 0; +filtered : filter @isPositive 5; -``` -name : ..in; // Prompts for input -..out "Hello, "; // Outputs greeting -..out name; // Outputs the input +/* Compose */ +f : x -> x + 1; +g : x -> x * 2; +h : compose @f @g; +result : h 5; /* (5 * 2) + 1 = 11 */ ``` -### First-Class Functions Example +## Usage +### Running Scripts +```bash +node lang.js script.txt ``` -double : x -> x * 2; -square : x -> x * x; -compose : f g x -> f g x; -// Function composition -composed : compose @double @square 3; // 18 - -// Higher-order functions -map : f x -> f x; -mapped : map @double 5; // 10 +### Testing +The project uses a structured testing approach with unit and integration tests: + +#### Unit Tests +Located in `tests/` directory, each focusing on a specific language feature: +- `01_lexer_basic.txt` - Basic lexer functionality ✅ +- `02_arithmetic_operations.txt` - Arithmetic operations ✅ +- `03_comparison_operators.txt` - Comparison operators ✅ +- `04_logical_operators.txt` - Logical operators ✅ +- `05_io_operations.txt` - IO operations ✅ +- `06_function_definitions.txt` - Function definitions ✅ +- `07_case_expressions.txt` - Case expressions and pattern matching 🔄 +- `08_first_class_functions.txt` - First-class function features ✅ +- `09_tables.txt` - Table literals and access ✅ +- `10_standard_library.txt` - Standard library functions ✅ +- `11_edge_cases.txt` - Edge cases 🔄 +- `12_advanced_tables.txt` - Advanced table features 🔄 +- `13_standard_library_complete.txt` - Complete standard library ✅ +- `14_error_handling.txt` - Error handling 🔄 +- `15_multi_parameter_case.txt` - Multi-parameter case expressions 🔄 + +#### Integration Tests +Test combinations of multiple features: +- `integration_01_basic_features.txt` - Basic feature combinations ✅ +- `integration_02_pattern_matching.txt` - Pattern matching with other features 🔄 +- `integration_03_functional_programming.txt` - Functional programming patterns ✅ + +#### Running Tests +```bash +# Run all tests +./run_tests.sh -// Function references in standalone case expressions -case 0 of - 0 : @double - _ : @square; +# Run individual tests +node lang.js tests/01_lexer_basic.txt +node lang.js tests/integration_01_basic_features.txt +# or with bun +bun run lang.js tests/01_lexer_basic.txt ``` -### Assertion Example +## Implementation Details -``` -x : 5; -y : 3; -sum : x + y; - -..assert x = 5; // Passes -..assert y = 3; // Passes -..assert sum = 8; // Passes -..assert x > 3; // Passes -..assert y < 10; // Passes -..assert sum != 0; // Passes -..assert x = 0; // Fails with error +### Architecture +- **Lexer**: Tokenizes input into tokens (numbers, identifiers, operators, etc.) +- **Parser**: Builds Abstract Syntax Tree (AST) from tokens, translating operators to combinator calls +- **Interpreter**: Executes AST with scope management and combinator function calls + +### Key Components +- **Token Types**: Supports all basic operators, literals, and special tokens +- **AST Nodes**: Expression, statement, and declaration nodes +- **Scope Management**: Lexical scoping with proper variable resolution +- **Combinator Foundation**: All operations implemented as function calls +- **Error Handling**: Comprehensive error reporting for parsing and execution + +## Recent Fixes + +### ✅ Combinator Foundation Implementation (Latest) +- **Issue**: Parser ambiguity between function application and operator expressions +- **Solution**: Implemented comprehensive combinator foundation +- **Status**: ✅ Completed - All operators now translate to combinator function calls +- **Impact**: Eliminated parsing ambiguity while preserving syntax + +### ✅ Standard Library Function Naming Conflicts +- **Issue**: Test functions using names that conflict with standard library combinators +- **Solution**: Renamed test functions to avoid conflicts (e.g., `add` → `add_func`) +- **Status**: ✅ Resolved - All tests now use unique function names + +### ✅ Parser Ambiguity with Unary Minus Arguments +- **Issue**: `filter @isPositive -3` was incorrectly parsed as binary operation +- **Root Cause**: Parser treating `FunctionReference MINUS` as binary minus operation +- **Solution**: Added special case in `parseExpression()` to handle `FunctionReference MINUS` pattern +- **Status**: ✅ Resolved - Standard library functions now work with negative arguments + +### ✅ Unary Minus Operator +- **Issue**: Stack overflow when parsing negative numbers (e.g., `-1`) +- **Root Cause**: Parser lacked specific handling for unary minus operator +- **Solution**: Added `UnaryMinusExpression` parsing and evaluation +- **Status**: ✅ Resolved - All tests passing + +### ✅ IO Operation Parsing +- **Issue**: IO operations not parsed correctly at top level +- **Solution**: Moved IO parsing to proper precedence level +- **Status**: ✅ Resolved + +### ✅ Decimal Number Support +- **Issue**: Decimal numbers not handled correctly +- **Solution**: Updated lexer and interpreter to use `parseFloat()` +- **Status**: ✅ Resolved + +## Known Issues + +### 🔄 Recursive Function Support +- **Issue**: Functions cannot call themselves recursively +- **Example**: `factorial : n -> when n is 0 then 1 _ then n * factorial (n - 1);` fails +- **Root Cause**: Function not available in global scope when body is evaluated +- **Status**: 🔄 In Progress - Implementing forward declaration pattern +- **Impact**: Recursive algorithms cannot be implemented + +### 🔄 When Expression Pattern Parsing +- **Issue**: When expressions only support basic patterns (identifiers, numbers, strings, wildcards) +- **Example**: `when x is < 0 then "negative" _ then "non-negative"` fails +- **Root Cause**: Parser not handling comparison operators in patterns +- **Status**: 🔄 In Progress - Extending pattern parsing +- **Impact**: Limited pattern matching capabilities + +### 🔄 Dot Notation for Table Access +- **Issue**: Table access only supports bracket notation +- **Example**: `table.property` fails to parse +- **Root Cause**: Parser not handling dot notation +- **Status**: 🔄 In Progress - Adding dot notation support +- **Impact**: Less convenient table access syntax + +### 🔄 Multi-parameter Case Expressions +- **Issue**: Multi-parameter case expressions not parsed correctly +- **Example**: `when x y is 0 0 then "both zero" _ _ then "not both zero"` fails +- **Root Cause**: Parser not handling multiple parameters in case expressions +- **Status**: 🔄 In Progress - Extending case expression parsing +- **Impact**: Limited pattern matching with multiple values + +### 🔄 When Expression Parsing Precedence +- **Issue**: When expressions not parsed correctly in all contexts +- **Example**: Some when expressions fail with "Unexpected token in parsePrimary: WHEN" +- **Root Cause**: Parser precedence not handling when expressions properly +- **Status**: 🔄 In Progress - Adjusting parser precedence +- **Impact**: When expressions may fail in certain contexts + +## Development + +### File Structure +``` +. +├── lang.js # Main implementation with combinator foundation +├── parser.js # Parser with operator-to-combinator translation +├── lexer.js # Lexical analyzer +├── tests/ # Unit and integration tests +│ ├── 01_lexer_basic.txt +│ ├── 02_arithmetic_operations.txt +│ ├── ... +│ ├── integration_01_basic_features.txt +│ ├── integration_02_pattern_matching.txt +│ └── integration_03_functional_programming.txt +├── run_tests.sh # Test runner script +├── COMBINATORS.md # Combinator foundation documentation +└── README.md # This file +``` + +### Debugging +Enable debug mode by setting `DEBUG=true`: +```bash +DEBUG=true node lang.js script.txt ``` -## Running Programs +## Combinator Foundation -The language supports two modes of execution: +The language is built on a combinator foundation where all operations are implemented as function calls: -### File Execution Mode -Run a script file by providing the file path as an argument: +### Internal Translation +```javascript +// x - y becomes internally: +subtract(x, y) -```bash -node lang.js script.txt -``` - -### REPL Mode -Start an interactive REPL by running without arguments: - -```bash -node lang.js -``` +// filter @isPositive -3 becomes internally: +filter(isPositive, negate(3)) -In REPL mode: -- Type expressions and end them with a semicolon (`;`) to execute -- Multi-line expressions are supported -- Type `exit` or `quit` to exit -- Variables and functions persist between expressions +// x + y becomes internally: +add(x, y) -**Example REPL session:** -``` -Scripting Language REPL -Type expressions to evaluate. End with semicolon (;) to execute. -Type "exit" to quit. - -> x : 5; -> y : 10; -> + x y; -15 -> exit -Goodbye! +// true and false becomes internally: +logicalAnd(true, false) ``` -## Language Design Principles - -- **Simplicity**: Minimal syntax for maximum clarity -- **Immutability**: Variables cannot be reassigned -- **Functional**: Functions are first-class values with composition support -- **Pattern-oriented**: Pattern matching as a core feature -- **Recursive**: Support for recursive function calls in function bodies -- **Grouped**: Parentheses support for complex arithmetic expressions -- **Expressive**: Rich set of arithmetic, comparison, and logical operators -- **IO-aware**: Built-in input/output operations with clear impurity markers -- **Educational**: Designed to teach language implementation concepts - -## Limitations - -This is a learning language with intentional limitations: -- Complex nested function calls (e.g., `add double 3 square 2`) are ambiguous without explicit grouping -- No data structures beyond numbers and strings -- No error handling beyond basic validation -- No modules or imports -- No standard library -- IO operations are synchronous and block execution - -## Future Enhancements - -Planned features for future versions: -- Ambiguous function call detection and error reporting -- Lists and data structures -- Error handling -- Standard library functions -- Modules and imports -- Type system -- Performance optimizations - -## Implementation Details +### Benefits +- **Eliminates Parsing Ambiguity**: Every operation is a function call +- **Preserves Syntax**: Zero breaking changes to existing code +- **Functional Foundation**: Everything is a function under the hood +- **Extensible**: Easy to add new combinators and patterns +- **Consistent Semantics**: All operations follow the same pattern -The language is implemented in JavaScript with three main components: +For detailed information about the combinator foundation, see [COMBINATORS.md](COMBINATORS.md). -1. **Lexer** (`lexer()`): Converts source code to tokens -2. **Parser** (`parser()`): Builds AST from tokens -3. **Interpreter** (`interpreter()`): Executes AST +## Contributing -Each component is designed to be modular and extensible for adding new language features. \ No newline at end of file +1. Create focused unit tests for new features +2. Add integration tests for feature combinations +3. Update documentation +4. Run the full test suite before submitting changes +5. Follow the combinator foundation approach for new operations \ No newline at end of file |