# Baba Yaga Testing Guide

This document explains the comprehensive testing approach for the Baba Yaga scripting language, which includes both JavaScript (reference) and C implementations.

## 🎯 Testing Philosophy

The Baba Yaga project uses a **shared test suite** approach to ensure maximum consistency between implementations:

- **JavaScript Implementation**: Reference implementation and source of truth
- **C Implementation**: Must match JS implementation behavior exactly
- **Shared Tests**: Single source of truth for expected behavior
- **Consistency**: Both implementations must pass identical tests

## 📁 Test Organization

```
project/
├── tests/                      # 🎯 Shared test suite (SOURCE OF TRUTH)
│   ├── unit/                   # Unit tests (23 files)
│   ├── integration/            # Integration tests (4 files)
│   ├── turing-completeness/    # Turing completeness proofs (7 files)
│   ├── run_shared_tests.sh     # Unified test runner
│   └── README.md               # Detailed test documentation
├── js/
│   ├── run_tests.sh            # JS-specific runner (uses shared tests)
│   └── tests/                  # 🚫 Legacy (tests moved to shared)
└── c/
    ├── run_tests.sh            # C-specific runner (uses shared tests + C-specific)
    └── turing_complete_demos/  # 🚫 Legacy (moved to shared)
```

## 🚀 Quick Start

### Run All Tests on Both Implementations

```bash
# Test JavaScript implementation
./tests/run_shared_tests.sh js

# Test C implementation
./tests/run_shared_tests.sh c

# Compare implementations (run both and compare results)
./tests/run_shared_tests.sh js > js_results.txt
./tests/run_shared_tests.sh c > c_results.txt
diff js_results.txt c_results.txt
```

### Run Specific Test Categories

```bash
# Unit tests only
./tests/run_shared_tests.sh js unit
./tests/run_shared_tests.sh c unit

# Integration tests only
./tests/run_shared_tests.sh js integration
./tests/run_shared_tests.sh c integration

# Turing completeness tests only
./tests/run_shared_tests.sh js turing
./tests/run_shared_tests.sh c turing
```

### Legacy Runners (Still Work)

```bash
# JavaScript (now uses shared tests)
cd js && ./run_tests.sh

# C (uses shared tests + C-specific tests)
cd c && ./run_tests.sh
```

## 📊 Test Categories Explained

### 1. Unit Tests (23 tests)

**Purpose**: Test individual language features in isolation

**Examples**:
- Lexer functionality
- Arithmetic operations
- Function definitions
- Pattern matching
- Data structures

**Coverage**: Every language feature has dedicated unit tests

### 2. Integration Tests (4 tests)

**Purpose**: Test how multiple features work together

**Examples**:
- Basic features integration
- Pattern matching with functions
- Functional programming patterns
- Multi-parameter expressions

**Coverage**: Common usage patterns and feature combinations

### 3. Turing Completeness Tests (7 tests)

**Purpose**: Formally prove the language is Turing complete

**Examples**:
- Recursion and infinite computation capability
- Data structure manipulation
- Lambda calculus foundations
- Complex algorithms (GCD, prime checking, sorting)

**Coverage**: All requirements for Turing completeness

## 🎖️ Test Results Interpretation

### ✅ All Tests Pass

```
=== Test Summary ===
Total tests: 34
Passed: 34
Failed: 0
✅ All tests passed!
```

**Meaning**: Implementation is fully consistent with the reference

### ⚠️ Some Tests Fail

```
=== Test Summary ===
Total tests: 34
Passed: 28
Failed: 6
❌ Some tests failed.
```

**Meaning**: Implementation has gaps or differences from the reference

### 🔍 Debugging Failed Tests

1. **Identify Pattern**: Do failures cluster around specific features?
2. **Check Reference**: Run the same test on the JS implementation
3. **Compare Output**: What's different between implementations? (see the sketch after this list)
4. **File Issue**: Document the discrepancy
5. **Fix or Document**: Either fix the implementation or document the limitation
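For step 3, a small helper can make the comparison concrete. The sketch below is not part of the repository; it only composes the runner invocations documented above (the output filenames, the `compare_category.sh` name, and the `head` limit are arbitrary choices):

```bash
# compare_category.sh -- hypothetical helper, not shipped with the project.
# Runs one shared test category on both implementations and shows the first
# differences, so a failure can be traced to a specific feature.
CATEGORY="${1:-unit}"   # unit | integration | turing

./tests/run_shared_tests.sh js "$CATEGORY" > "js_${CATEGORY}.txt"
./tests/run_shared_tests.sh c  "$CATEGORY" > "c_${CATEGORY}.txt"

# Show only the first differences; stable output ordering across
# implementations is assumed (which is the point of the shared suite).
diff "js_${CATEGORY}.txt" "c_${CATEGORY}.txt" | head -n 40
```

Run it from the project root, e.g. `bash compare_category.sh unit`.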
## 🔄 Development Workflow

### Adding New Language Features

1. **JS First**: Implement the feature in JavaScript (reference)
2. **Write Tests**: Add comprehensive tests to `tests/unit/`
3. **Verify JS**: Ensure the tests pass on the JS implementation
4. **Implement C**: Add the feature to the C implementation
5. **Verify Consistency**: Ensure the tests pass on both implementations

### Fixing Implementation Gaps

1. **Identify Gap**: A test failure indicates an inconsistency
2. **Reference Check**: The JS implementation is the source of truth
3. **Fix C Implementation**: Modify the C code to match the JS behavior
4. **Verify Fix**: Ensure the test now passes on both implementations

### Regression Testing

Before any release:

```bash
# Full test suite on both implementations
./tests/run_shared_tests.sh js && ./tests/run_shared_tests.sh c
```

If any tests fail, the implementations have diverged and need attention.

## 📈 Test Metrics & Goals

### Current Status

| Metric | Value |
|--------|-------|
| Total Test Files | 34 |
| Unit Tests | 23 |
| Integration Tests | 4 |
| Turing Completeness Tests | 7 |
| Test Assertions | 200+ |
| Implementation Coverage | JS: 100%, C: ~80% |

### Goals

- **100% Test Pass Rate**: Both implementations pass all tests
- **Feature Parity**: The C implementation matches JS completely
- **Regression Prevention**: Catch any implementation divergence
- **Documentation**: Every feature has corresponding tests

## 🛠️ Advanced Testing

### Performance Testing

```bash
# Stress tests (included in the unit tests)
./tests/run_shared_tests.sh js unit | grep "Performance"
./tests/run_shared_tests.sh c unit | grep "Performance"
```

### Specific Feature Testing

```bash
# Test only arithmetic
./tests/run_shared_tests.sh js unit | grep -A5 -B5 "Arithmetic"

# Test only functions
./tests/run_shared_tests.sh js unit | grep -A5 -B5 "Function"
```

### Debug Mode

```bash
# Enable debug output
DEBUG=1 ./tests/run_shared_tests.sh js unit
```

## 🤝 Contributing Guidelines

### For New Tests

1. **Location**: Add the test to the appropriate directory in `tests/`
2. **Format**: Follow the established test file format
3. **Assertions**: Include `..assert` statements for verification
4. **Both Implementations**: Verify the test works on both JS and C
5. **Documentation**: Update the test documentation

### For Implementation Changes

1. **Tests First**: Update or add tests before changing an implementation
2. **Reference Implementation**: Changes to JS require careful consideration
3. **Consistency Check**: Ensure both implementations still pass the tests
4. **Breaking Changes**: Document any intentional behavior changes

## 🏆 Quality Assurance

The shared test suite ensures:

- ✅ **Behavioral Consistency**: Both implementations behave identically
- ✅ **Feature Completeness**: All features are thoroughly tested
- ✅ **Regression Prevention**: Changes don't break existing functionality
- ✅ **Turing Completeness**: The language is formally proven complete
- ✅ **Documentation**: Tests serve as an executable specification

**Result**: Confidence that both Baba Yaga implementations are equivalent and reliable.
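As one way to automate the regression-prevention goal, the sketch below runs every shared category on both implementations and stops at the first divergence. It is a hypothetical pre-release script, not part of the repository, built only from the runner commands in this guide; it assumes identical behavior yields identical runner output (if the runner prints timings, compare the `=== Test Summary ===` blocks instead):

```bash
# check_parity.sh -- hypothetical pre-release check, not shipped with the project.
# Exits non-zero at the first category where the JS and C runner outputs differ.
set -euo pipefail

for category in unit integration turing; do
  echo "== Checking category: ${category} =="
  ./tests/run_shared_tests.sh js "${category}" > "js_${category}.txt"
  ./tests/run_shared_tests.sh c  "${category}" > "c_${category}.txt"

  # diff exits non-zero on any difference, which stops the script under set -e.
  diff "js_${category}.txt" "c_${category}.txt"
done

echo "✅ JS and C outputs match for all shared categories."
```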