author | Kartik Agaram <vc@akkartik.com> | 2020-01-02 01:28:24 -0800 |
---|---|---|
committer | Kartik Agaram <vc@akkartik.com> | 2020-01-02 01:28:24 -0800 |
commit | d02aa9ac0b9e1130ffcd5a27aa1304e80eee08d9 (patch) | |
tree | bf592fba4275002cbbde420cef8c806bb9b1b45f | |
parent | 01013f2ad2132dd945c6ceb168b85dc52e18882c (diff) | |
download | mu-d02aa9ac0b9e1130ffcd5a27aa1304e80eee08d9.tar.gz |
5863
Just clarified for myself why `subx translate` and `subx run` need to share code: emulation supports the tests first and foremost. In the process we clean up our architecture for levels of layers. It's a good idea but unused once we reconceive of "level 1" as just part of the test harness.
-rw-r--r-- | 010---vm.cc | 10 |
-rw-r--r-- | 011run.cc | 20 |
-rw-r--r-- | 030---translate.cc | 21 |
-rw-r--r-- | 031transforms.cc | 69 |
-rw-r--r-- | 032---operands.cc | 7 |
5 files changed, 25 insertions, 102 deletions
diff --git a/010---vm.cc b/010---vm.cc
index 6675cab9..d67667f7 100644
--- a/010---vm.cc
+++ b/010---vm.cc
@@ -1,11 +1,5 @@
-//: Core data structures for simulating the SubX VM (subset of an x86 processor)
-//:
-//: At the lowest level ("level 1") of abstraction, SubX executes x86
-//: instructions provided in the form of an array of bytes, loaded into memory
-//: starting at a specific address.
-//:
-//: SubX is fundamentally a translator. But having a VM to execute its
-//: translations affords greater confidence in it.
+//: Core data structures for simulating the SubX VM (subset of an x86 processor),
+//: either in tests or debug aids.
 
 //:: registers
 //: assume segment registers are hard-coded to 0
diff --git a/011run.cc b/011run.cc
index 585f9930..e4194687 100644
--- a/011run.cc
+++ b/011run.cc
@@ -78,15 +78,14 @@ void test_copy_imm32_to_EAX() {
   );
 }
 
-// top-level helper for scenarios: parse the input, transform any macros, load
-// the final hex bytes into memory, run it
+// top-level helper for tests: parse the input, load the hex bytes into memory, run
 void run(const string& text_bytes) {
   program p;
   istringstream in(text_bytes);
+  // Loading Test Program
   parse(in, p);
   if (trace_contains_errors()) return; // if any stage raises errors, stop immediately
-  transform(p);
-  if (trace_contains_errors()) return;
+  // Running Test Program
   load(p);
   if (trace_contains_errors()) return;
   // convenience to keep tests concise: 'Entry' label need not be provided
@@ -244,19 +243,6 @@ void test_detect_duplicate_segments() {
   );
 }
 
-//:: transform
-
-:(before "End Types")
-typedef void (*transform_fn)(program&);
-:(before "End Globals")
-vector<transform_fn> Transform;
-
-:(code)
-void transform(program& p) {
-  for (int t = 0; t < SIZE(Transform); ++t)
-    (*Transform.at(t))(p);
-}
-
 //:: load
 
 void load(const program& p) {
diff --git a/030---translate.cc b/030---translate.cc
index 9737834e..b950fce7 100644
--- a/030---translate.cc
+++ b/030---translate.cc
@@ -1,20 +1,9 @@
-//: The bedrock level 1 of abstraction is now done, and we're going to start
-//: building levels above it that make programming in x86 machine code a
-//: little more ergonomic.
-//:
-//: All levels will be "pass through by default". Whatever they don't
-//: understand they will silently pass through to lower levels.
-//:
-//: Since raw hex bytes of machine code are always possible to inject, SubX is
-//: not a language, and we aren't building a compiler. This is something
-//: deliberately leakier. Levels are more for improving auditing, checks and
-//: error messages rather than for hiding low-level details.
+//: After that lengthy prelude to define an x86 emulator, we are now ready to
+//: start translating SubX notation.
 
 //: Translator workflow: read 'source' file. Run a series of transforms on it,
 //: each passing through what it doesn't understand. The final program should
-//: be just machine code, suitable to write to an ELF binary.
-//:
-//: Higher levels usually transform code on the basis of metadata.
+//: be just machine code, suitable to emulate, or to write to an ELF binary.
 
 :(before "End Main")
 if (is_equal(argv[1], "translate")) {
@@ -69,6 +58,10 @@ if (is_equal(argv[1], "translate")) {
 }
 
 :(code)
+void transform(program& p) {
+  // End transform(program& p)
+}
+
 void print_translate_usage() {
   cerr << "Usage: subx translate file1 file2 ... -o output\n";
 }
diff --git a/031transforms.cc b/031transforms.cc
index a6e12502..5f13b697 100644
--- a/031transforms.cc
+++ b/031transforms.cc
@@ -1,64 +1,11 @@
-//: Ordering transforms is a well-known hard problem when building compilers.
-//: In our case we also have the additional notion of layers. The ordering of
-//: layers can have nothing in common with the ordering of transforms when
-//: SubX is tangled and run. This can be confusing for readers, particularly
-//: if later layers start inserting transforms at arbitrary points between
-//: transforms introduced earlier. Over time adding transforms can get harder
-//: and harder, having to meet the constraints of everything that's come
-//: before. It's worth thinking about organization up-front so the ordering is
-//: easy to hold in our heads, and it's obvious where to add a new transform.
-//: Some constraints:
-//:
-//:   1. Layers force us to build SubX bottom-up; since we want to be able to
-//:   build and run SubX after stopping loading at any layer, the overall
-//:   organization has to be to introduce primitives before we start using
-//:   them.
-//:
-//:   2. Transforms usually need to be run top-down, converting high-level
-//:   representations to low-level ones so that low-level layers can be
-//:   oblivious to them.
-//:
-//:   3. When running we'd often like new representations to be checked before
-//:   they are transformed away. The whole reason for new representations is
-//:   often to add new kinds of automatic checking for our machine code
-//:   programs.
-//:
-//: Putting these constraints together, we'll use the following broad
-//: organization:
-//:
-//:   a) We'll divide up our transforms into "levels", each level consisting
-//:   of multiple transforms, and dealing in some new set of representational
-//:   ideas. Levels will be added in reverse order to the one their transforms
-//:   will be run in.
-//:
-//:     To run all transforms:
-//:       Load transforms for level n
-//:       Load transforms for level n-1
-//:       ...
-//:       Load transforms for level 2
-//:       Run code at level 1
-//:
-//:   b) *Within* a level we'll usually introduce transforms in the order
-//:   they're run in.
-//:
-//:     To run transforms for level n:
-//:       Perform transform of layer l
-//:       Perform transform of layer l+1
-//:       ...
-//:
-//:   c) Within a level it's often most natural to introduce a new
-//:   representation by showing how it's transformed to the level below. To
-//:   make such exceptions more obvious checks usually won't be first-class
-//:   transforms; instead code that keeps the program unmodified will run
-//:   within transforms before they mutate the program. As an example:
-//:
-//:     Layer l introduces a transform
-//:     Layer l+1 adds precondition checks for the transform
-//:
-//: This may all seem abstract, but will hopefully make sense over time. The
-//: goals are basically to always have a working program after any layer, to
-//: have the order of layers make narrative sense, and to order transforms
-//: correctly at runtime.
+:(before "End Types")
+typedef void (*transform_fn)(program&);
+:(before "End Globals")
+vector<transform_fn> Transform;
+
+:(before "End transform(program& p)")
+for (int t = 0; t < SIZE(Transform); ++t)
+  (*Transform.at(t))(p);
 
 :(before "End One-time Setup")
 // Begin Transforms
diff --git a/032---operands.cc b/032---operands.cc
index 5203201e..5d434319 100644
--- a/032---operands.cc
+++ b/032---operands.cc
@@ -1,5 +1,4 @@
-//: Beginning of "level 2": tagging bytes with metadata around what field of
-//: an x86 instruction they're for.
+//: Metadata for fields of an x86 instruction.
 //:
 //: The x86 instruction set is variable-length, and how a byte is interpreted
 //: affects later instruction boundaries. A lot of the pain in programming
@@ -27,6 +26,10 @@ put_new(Help, "instructions",
 :(before "End Help Contents")
 cerr << " instructions\n";
 
+:(before "Running Test Program")
+transform(p);
+if (trace_contains_errors()) return;
+
 :(code)
 void test_pack_immediate_constants() {
   run(
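The pattern this commit moves around (a global list of `transform_fn` pointers that `transform()` applies in registration order, called from the test helper before loading) can be sketched in plain C++ without the layer directives. This is only an illustration under stated assumptions: the `program` struct here is a simplified stand-in for SubX's real type, and `expand_nop2`, `drop_comments`, and `translate_words` are hypothetical names that do not exist in the repository.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for SubX's program type: a flat list of words.
struct program {
  std::vector<std::string> words;
};

typedef void (*transform_fn)(program&);

// Passes run in the order they are registered.
std::vector<transform_fn> Transform;

// Hypothetical pass: expand the made-up word "nop2" into two 0x90 bytes,
// passing through every word it doesn't understand.
void expand_nop2(program& p) {
  std::vector<std::string> out;
  for (const std::string& w : p.words) {
    if (w == "nop2") {
      out.push_back("90");
      out.push_back("90");
    } else {
      out.push_back(w);
    }
  }
  p.words.swap(out);
}

// Hypothetical pass: drop words starting with '#' (treated as comments here).
void drop_comments(program& p) {
  std::vector<std::string> out;
  for (const std::string& w : p.words)
    if (w.empty() || w[0] != '#') out.push_back(w);
  p.words.swap(out);
}

// Apply every registered pass, in order.
void transform(program& p) {
  for (transform_fn t : Transform) {
    assert(t != nullptr);  // the registry should never hold a null pass
    (*t)(p);
  }
}

// Hypothetical analogue of the test helper: register passes once, then
// transform. A real helper would go on to load and emulate the bytes.
std::vector<std::string> translate_words(const std::vector<std::string>& words) {
  if (Transform.empty()) {
    Transform.push_back(drop_comments);
    Transform.push_back(expand_nop2);
  }
  program p;
  p.words = words;
  transform(p);
  return p.words;
}
```

Under these assumptions, `translate_words({"b8", "#note", "nop2"})` returns the words `b8`, `90`, `90`. Keeping the registry separate from the function that walks it is what lets a later layer add passes without touching `transform()` itself.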