| Commit message | Author | Age | Files | Lines |
| |
All extant SubX programs generate identical binaries using either the
C++ or the self-hosted SubX translators.
| |
A little more resizing of buffers. apps/hex.subx is now building an
identical binary.
I'm now aborting on allocation failures. That requires disabling
a couple of tests. (I'm not quite confident enough of this decision to
delete them outright.) I want to treat all segfaults as bugs, and
machine code is no place to add boilerplate checks for return values of
standard library functions.
| |
Ensure we don't create overly long lines. Now this works:
  $ ./diff_ntranslate 0*.subx apps/subx-common.subx
| |
Clean up. All apps now translating correctly except for the phases of
the self-hosted translator. Next step: SubX-in-SubX in SubX-in-SubX.
| |
Bugfix fifteen -- on the C++ side.
| |
All that debugging and it turns out the bug is on the C++ side!
| |
Snapshot while debugging survey.subx by print.
I can see the error in 1 minute with this command:
  subx run apps/survey < a.pack
(where a.pack is obtained from `ntranslate 049*.subx 05[0-8]*.subx`)
By contrast, using the trace requires 4.5 minutes:
  subx --trace run apps/survey < a.pack
It generates a trace of 4.4GB with almost 83M lines.
The trace takes 2 minutes to load.. oops, I forgot to load labels with
`--debug`.
| |
Even though the standard library is building and passing tests, the
binaries it generates aren't bit-for-bit identical with the
originals. Comparing using `diff_ntranslate`, it looks like the data
segment starting address isn't computed right in survey.subx
(`compute-addresses`) when I start translating layer 058. Deleting some
tests brings the code segment to a p_offset where bits 8-11 (the lowest
4 bits excluding the lowermost byte) are cleared and everything works.
However, if bits 8-11 are set, then they don't make it to p_vaddr and
p_paddr.
Tried reproducing with a unit test, but the unit test passes fine.
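
The constraint involved is the ELF loader's requirement that a segment's p_vaddr and p_offset agree modulo the segment alignment. A minimal sketch, assuming a 4KB alignment; `segment_vaddr` and `is_loadable` are hypothetical names for illustration, not functions from survey.subx:

```python
PAGE_SIZE = 0x1000  # assumed p_align (4KB pages)

def segment_vaddr(segment_start, p_offset):
    """Pick a load address whose low 12 bits match the file offset."""
    page_base = segment_start & ~(PAGE_SIZE - 1)
    return page_base | (p_offset & (PAGE_SIZE - 1))

def is_loadable(p_vaddr, p_offset):
    """True if the address and file offset agree modulo the page size."""
    return (p_vaddr - p_offset) % PAGE_SIZE == 0

# Hypothetical values: a file offset with bits 8-11 set.
good = segment_vaddr(0x09000000, 0x0f54)  # keeps all 12 low bits
bad = 0x09000054                          # bits 8-11 dropped
print(hex(good), is_loadable(good, 0x0f54), is_loadable(bad, 0x0f54))
```

With bits 8-11 lost, the address no longer matches the offset modulo the page size, which is the kind of mismatch described above.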
| |
Make the output of the `pack` phase a little easier to read.
| |
Translates 5k lines of input in 26 seconds.
I'm not sure why I need to grow the label table. It was already 512 entries
long, and I'm only using 373 so far.
| |
We can now translate layers 49-72 using the self-hosted translator.
The translator has now demonstrated translation of over 4k lines. Most verbose
phase output is 325KB, even if the final binary is 15KB.
Emulation is too slow now, so I'm back to debug by print on a Linux machine.
| |
We can now translate layers 49-56 using the self-hosted translator
(`translate` and `ntranslate`).
As a follow-up to commit 5404, the self-hosted translator is a little
more strict than the C++ translator in 3 places:
a) All .subx files must define a data segment.
b) All .subx files must define an `Entry` label.
c) All numbers must be in *lowercase* hex.
In all cases, where programs work with the C++ translator but violate
the self-hosted translator's assumptions, we must make sure we raise
errors rather than silently emit bad code.
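
A rough sketch of what raising errors for these three rules could look like. The syntax details are assumptions for illustration: a data segment header written as `== data`, the entry point written as `Entry:`, hex literals prefixed with `0x`, and `#` starting a comment. `strictness_errors` is a hypothetical helper, not part of either translator:

```python
import re

# A 0x-prefixed number containing at least one uppercase hex digit.
UPPER_HEX = re.compile(r"0x[0-9a-fA-F]*[A-F][0-9a-fA-F]*")

def strictness_errors(source):
    """Return a list of violations of rules (a), (b) and (c)."""
    errors = []
    has_data = False
    has_entry = False
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive: ignores '#' inside strings
        words = code.split()
        if words[:2] == ["==", "data"]:
            has_data = True
        if words and words[0] == "Entry:":
            has_entry = True
        for word in words:
            datum = word.split("/", 1)[0]  # drop trailing /metadata
            if UPPER_HEX.fullmatch(datum):
                errors.append(f"line {lineno}: uppercase hex in {word!r}")
    if not has_data:
        errors.append("no data segment")
    if not has_entry:
        errors.append("no Entry label")
    return errors

program = "== code 0x09000000\nEntry:\nbb/copy-to-ebx 0x2A/imm32\n"
for error in strictness_errors(program):
    print(error)
```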
| |
Break a dependency from `print-int32` to `from-hex-char`.
| |
We can now translate layers 49-55 using translate and ntranslate. Next
step is to support '\n' in dquotes.subx.
| |
We now have a new pass called 'tests' which code-generates a new
function called 'run-tests', just like the C++ layer `tests.cc`.
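
A sketch of the idea, under stated assumptions: test functions are labels whose names start with `test-`, each one is called with `e8/call <label>/disp32`, and the generated function ends with `c3/return`. The exact output format of the real pass is not shown in this log, and `generate_run_tests` is a hypothetical name:

```python
def generate_run_tests(source):
    """Emit a 'run-tests' function that calls every test- label in source."""
    lines = ["run-tests:"]
    for line in source.splitlines():
        word = line.strip()
        # A label definition is assumed to be a bare word ending in ':'.
        if word.startswith("test-") and word.endswith(":"):
            lines.append(f"  e8/call {word[:-1]}/disp32")
    lines.append("  c3/return")
    return "\n".join(lines)

program = "test-compare:\n  c3/return\nhelper:\n  c3/return\ntest-emit:\n  c3/return\n"
print(generate_run_tests(program))
```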
| |
Fix CI.
| |
Various buffer sizes needed to be grown for ex11. But the next
bottleneck is that we need to code-generate run-tests.
| |
Bugfix fourteen: we need different address computation logic for code vs
data labels.
It's really about different categories of instructions having different
address computation logic. This subtle distinction will make good error
messages hard. But that's a problem for later.
Now there's just one example program not translating.
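
The distinction can be sketched as follows, assuming a label in the code segment is a jump/call target (so its value is relative to the address just past the operand) while a label in the data segment is emitted as an absolute address. `resolve_label` and its parameters are hypothetical names, not the translator's:

```python
def resolve_label(label_address, label_in_code_segment, next_instruction_address):
    """Return the 32-bit value to emit for a label used as a disp32 operand."""
    if label_in_code_segment:
        # Jump/call target: displacement from the end of the instruction,
        # wrapped to 32 bits so backward jumps come out as two's complement.
        return (label_address - next_instruction_address) & 0xffffffff
    # Data label: the address itself.
    return label_address & 0xffffffff

# Hypothetical addresses for illustration.
print(hex(resolve_label(0x09000020, True, 0x09000005)))   # forward call
print(hex(resolve_label(0x09000000, True, 0x09000010)))   # backward jump
print(hex(resolve_label(0x0a000074, False, 0x09000005)))  # data label
```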
| |
Clean up.
| |
Figured out what's going on with bug fourteen: displacement operands
aren't always used relative to the PC. Does this mean I need to track
instruction boundaries past pack? :'(
No, I just need different logic for labels in code vs data segments.
This was an interesting bug for reminding me of the difference between
the emulator-level trace and the application-level trace. The former has
1.5 million lines, while the latter has a dozen. Luckily, just dumping
the latter immediately made obvious what the issue was.
Though this experience does suggest some further ideas for debugging
tools:
  slice trace by line and phase
  slice trace by start and end label
  debug UI for SubX translator
    2D layout: rows = lines of code; columns = translator phases
      each 'cell' in this layout contains a list of log lines
        shows what came in, what was emitted
      easily collapse any cell
These are domain-specific tools. Special-cased to the SubX translator
phases.
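
One possible data structure for the 2D layout, as a sketch only. It assumes each log record already carries its source line number and phase name; `build_grid` and `render` are hypothetical names:

```python
from collections import defaultdict

def build_grid(log_records):
    """Group (source_line, phase, message) records into cells."""
    grid = defaultdict(list)
    for source_line, phase, message in log_records:
        grid[(source_line, phase)].append(message)
    return grid

def render(grid, phases, collapsed=frozenset()):
    """One text row per source line, one column per phase."""
    rows = sorted({line for line, _ in grid})
    out = []
    for line in rows:
        cells = []
        for phase in phases:
            messages = grid.get((line, phase), [])
            if (line, phase) in collapsed:
                cells.append(f"[{len(messages)} lines]")  # collapsed cell
            else:
                cells.append("; ".join(messages))
        out.append(f"{line}: " + " | ".join(cells))
    return "\n".join(out)

records = [
    (1, "pack", "in: bb/copy-to-ebx"),
    (1, "pack", "out: bb"),
    (1, "survey", "offset 0"),
]
print(render(build_grid(records), ["pack", "survey"], collapsed={(1, "pack")}))
```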
| |
Bugfix thirteen: displacement calculations were wrong because current
offset was not being updated properly as words were being read and
emitted.
Now 10/12 example programs are translated correctly.
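
A sketch of the bookkeeping involved, assuming word widths are determined by their metadata (`/imm32` and `/disp32` take 4 bytes, `/imm16` and `/disp16` take 2, anything else takes 1) and that label definitions end in `:` and take no space. Both function names are hypothetical:

```python
def word_width(word):
    """Number of bytes a word occupies in the output (assumed rules)."""
    metadata = word.split("/")[1:]
    if "imm32" in metadata or "disp32" in metadata:
        return 4
    if "imm16" in metadata or "disp16" in metadata:
        return 2
    return 1

def compute_label_offsets(words):
    """Record each label's offset, advancing the offset on every other word."""
    offsets = {}
    current_offset = 0
    for word in words:
        if word.endswith(":"):
            offsets[word[:-1]] = current_offset
        else:
            current_offset += word_width(word)  # the step that must not be skipped
    return offsets

words = ["Entry:", "e8/call", "foo/disp32", "bb/copy-to-ebx", "1/imm32", "foo:", "c3/return"]
print(compute_label_offsets(words))
```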
| |
Bugfix twelve: ModR/M was being incorrectly computed.
This is one of two problems with subx/examples/ex3, so no new passing
examples.
| |
Bugfix eleven: segment flags were incorrectly computed. examples/ex1 now
verified! Added to CI.
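
For reference, a sketch of one 32-bit ELF program header entry with its flags. The field order and flag bits follow the ELF32 format; the choice of read+execute for code and read+write for data is an assumption about what the translator intends, and `program_header` is a hypothetical helper:

```python
import struct

PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4  # execute, write, read

def program_header(segment_name, offset, vaddr, size, align=0x1000):
    """Pack an Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align."""
    if segment_name == "code":
        flags = PF_R | PF_X  # assumed: code is readable and executable
    else:
        flags = PF_R | PF_W  # assumed: data is readable and writable
    return struct.pack("<8I", PT_LOAD, offset, vaddr, vaddr, size, size, flags, align)

# Hypothetical segment values for illustration.
entry = program_header("code", 0x54, 0x09000054, 0x10)
print(len(entry), struct.unpack("<8I", entry)[6])
```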
| |
Bugfix ten: type error in `convert`. I was calling `rewind-stream` on a
`buffered-file`.
examples/ex1 is now just one nibble off the canonical.
I *have* found one missing feature in the self-hosted translator,
though: dquotes doesn't support newlines in strings, even though the C++
version does. dquotes parses them right, but the value initialized in
the data segment is wrong.
| |
Bugfix nine: flush(out) after translation is done.
Still one remaining bug from comparing ELF binaries: emit-segments
prints nothing for some reason.
| |
Bugfix eight: incorrect segment count in ELF header.
The generated examples/ex1 is still not right. But it has the second
segment now. Or almost all of it. Final byte is missing for some reason.
| |
The result isn't an identical binary to before, and it segfaults when
run. But it's bugfix seven.
A couple of places where we make .subx files a little more strict:
  a) All .subx files must define a data segment. Even if they have no
     data.
  b) All .subx files must define an `Entry` label for the binary to start
     at. Earlier we used to default to the start of the code label. That's
     not too hard to add; we'd just need to:
       i) rename `get` to `get-or-abort`
       ii) clone a third variant of `get-or-insert` called `get` that returns
           null if the key is not found.
       iii) use `get` rather than `get-or-abort` when looking up the `Entry`
            label.
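
The three lookup variants in steps i) to iii) might look like this, sketched with a Python dict standing in for the SubX table. `entry_address` and the `code_start` fallback are hypothetical, added only to show how the non-aborting `get` would be used:

```python
def get_or_abort(table, key):
    """Look up key; abort the program if it is missing."""
    if key not in table:
        raise SystemExit(f"get-or-abort: key not found: {key}")
    return table[key]

def get_or_insert(table, key):
    """Look up key; create an empty row if it is missing."""
    return table.setdefault(key, {})

def get(table, key):
    """Look up key; return None (null) if it is missing."""
    return table.get(key)

def entry_address(labels, code_start):
    """Use the Entry label if defined, else fall back to code_start."""
    address = get(labels, "Entry")
    return code_start if address is None else address

print(hex(entry_address({"Entry": 0x09000060}, 0x09000054)))
print(hex(entry_address({}, 0x09000054)))
```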
| |
Clean up.
| |
All assertions in `test-convert-computes-addresses` still failing.
| |
map of how far we've gotten by now (functions with '*' independently tested):
  ✓ compute-offsets*
  ✓ compute-addresses*
  ✓ emit-output
    ✓ emit-headers
      ✓ emit-elf-header
        ✓ emit-hex-array*
      ✓ first emit-elf-program-header-entry
        ✓ emit-hex-array*
      ? second emit-elf-program-header-entry
        emit-hex-array*
    emit-segments*
| |
Clean up.
| |
Clean up.
| |
I carefully logged the segment a label is declared in but forgot to
actually save it in the table. This has been a theoretical concern for
some time, but I've never seen it actually happen until now. SubX is
just too low level.
Now I get past the first two phases but code generation fails to find
the 'Entry' label.
| |
Snapshot at a random moment, showing a new debugging trick: hacking on
the C++ level to dump memory contents on specific labels.
For some reason label 'x' doesn't have a segment assigned by the time we
get to compute-addresses.
|