| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several bugs found after performing multiple loops through convert-data.
This has been a general pattern: given how unsafe the x86 'language' is,
the regular amount of testing with a single input doesn't really give sufficient
confidence. Ever-present is the possibility that I forgot to pop something
from the stack, either a spilled register or a local. Calling functions
multiple times seems to help detect such bugs. So far I've been doing this
extra level of testing implicitly when I build the next higher abstraction.
But with `convert-data` the buck stopped, and much painful debugging ensued.
One thing that would help is if `write` on streams didn't remain silent
on overflow. But we actually need that sometimes, when streams are used
as buffers.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Add a bounds-check to `next-word`.
|
|
|
|
|
|
|
| |
New convention: compare 'with' for asymmetric comparisons (greater or lesser
than), and compare 'and' for symmetric comparisons. Worth making this distinction
even though the opcodes are identical; when we compare 'with', the order
of operands is significant.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Bugfix in top-level prototype: commandline args were broken since commit
4266 last June.
We still don't have automated tests for commandline args, but we'll add
an example program that'll increase the odds of detecting issues there.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've been saying for a while[1][2][3] that adding extra abstractions makes
things harder for newcomers, and adding new notations doubly so. And then
I notice this DSL in my own backyard. Makes me feel like a hypocrite.
[1] https://news.ycombinator.com/item?id=13565743#13570092
[2] https://lobste.rs/s/to8wpr/configuration_files_are_canary_warning
[3] https://lobste.rs/s/mdmcdi/little_languages_by_jon_bentley_1986#c_3miuf2
The implementation of the DSL was also highly hacky:
a) It was happening in the tangle/ tool, but was utterly unrelated to tangling
layers.
b) There were several persnickety constraints on the different kinds of
lines and the specific order they were expected in. I kept finding bugs
where the translator would silently do the wrong thing. Or the error messages
sucked, and readers may be stuck looking at the generated code to figure
out what happened. Fixing error messages would require a lot more code,
which is one of my arguments against DSLs in the first place: they may
be easy to implement, but they're hard to design to go with the grain of
the underlying platform. They require lots of iteration. Is that effort
worth prioritizing in this project?
On the other hand, the DSL did make at least some readers' life easier,
the ones who weren't immediately put off by having to learn a strange syntax.
There were fewer quotes to parse, fewer backslash escapes.
Anyway, since there are also people who dislike having to put up with strange
syntaxes, we'll call that consideration a wash and tear this DSL out.
---
This commit was sheer drudgery. Hopefully it won't need to be redone with
a new DSL because I grow sick of backslashes.
|
|
|
|
| |
Fix spurious errors in `test_layers` on non-Linux platforms.
|
|
|
|
|
|
|
|
| |
Fix CI. pack.subx was passing in emulation but not natively.
Commit 4954 on Feb 10 was a real dud. First I find I forgot to reclaim
space for locals (commit 4996). Now I find I haven't been tracking registers
properly either.
|
|
|
|
|
|
|
| |
Start running binaries natively in test_layers as well.
CI is still broken; need to investigate where my SubX emulation has a discrepancy
with native x86.
|
| |
|
|
|
|
|
|
|
|
|
| |
Yet another redrawing of responsibilities between convert and its helpers.
In the process I discovered a bug in `write-stream-buffered` which ended
up taking me through a detour to extract `browse_trace` into its own tool.
It turns out just having long buffers is enough to need browse_trace. Simple
operations like clearing a stream swamp a flat view of the trace.
|
|
|
|
|
| |
Thanks Peter van Hardenberg for causing me to run into this crash (the
first time I tried to demo sandboxes in a long time).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bring back support for incrementally printing the trace to the screen (stderr).
I previously thought I didn't need this as long as I'm always incrementally
saving to the 'last_run' trace file. But I quickly ran into a use for it:
when I want to see a complete trace including switching into the sandbox's
trace and back out again.
So there are now two separate commandline flags:
--trace to save the trace to file
--dump to print the trace to screen
The former won't handle sandbox traces. But the latter will.
I'm deemphasizing --dump in the help message since it should be rarely
used.
One other situation where I've used stderr in the past is for just raw
convenience. But trying to always use the trace was a foolish consistency.
Conclusion:
a) For simple debugging, feel free to just use cout/cerr. Delete them
before committing.
b) If the prints get too complex, switch to the trace and browse_trace
tool.
c) If using nested sandboxes, emit to stderr, redirect to file, and browse_trace.
I've gone back and forth on these questions in the past; now I'm trying
to be a little more rigorous about capturing reasoning.
|
|
|
|
|
| |
Fix CI after commit 4987. And track stack depths more correctly inside
sandboxes.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've extracted it into a separate binary, independent of my Mu prototype.
I also cleaned up my tracing layer to be a little nicer. Major improvements:
- Realized that incremental tracing really ought to be the default.
And to minimize printing traces to screen.
- Finally figured out how to combine layers and call stack frames in a
single dimension of depth. The answer: optimize for the experience of
`browse_trace`. Instructions occupy a range of depths based on their call
stack frame, and minor details of an instruction lie one level deeper
in each case.
Other than that, I spent some time adjusting levels everywhere to make
`browse_trace` useful.
|
|
|
|
|
|
| |
Now that our test runs are getting longer, debugging is again becoming a
bottleneck. Time to start using trace depths along with `mu browse-trace`
from the top-level.
|
| |
|
| |
|
|
|
|
|
|
|
| |
Standardize name for 'end of file' sentinel. `eof` seems like an ordinary
variable, and `EOF` looks too much like a register (particularly in code
like `if (EAX == EOF)`), so we'll go with `Eof`. Consistent capitalization
for globals, and constants are globals too.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Considering how much trouble a merge phase would be (commit 4978), it seems
simpler to just add the extra syntax for controlling the entry point of
the generated ELF binary.
But I wouldn't have noticed this if I hadn't taken the time to write out
the commit messages of 4976 and 4978.
Even if we happened to already have linked list primitives built, this
may still be a good idea considering that I'm saving quite a lot of code
in duplicated entrypoints.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Phase 1: coalesce different instances/fragments for each segment, correctly
prepending later fragments.
Phase 2: pack bitfields into bytes.
Phase 3: compute addresses for labels, compute the ELF header.
Phase 4: convert hex bytes to binary.
But ugh, phase 1 involves linked lists and I'll have to go down a rabbit
hole building up more standard library functions.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've been allowing operands in any order just because it simplifies implementation.
I don't actually rely on this flexibility; all the .subx programs in this
repo consistently use a single ordering.
Why is a hard-coded canonical order hard to implement? The order that seems
most logical to me is complicated by the "reg" bits in the ModR/M byte:
- In instructions that interpret it as an `/r32` operand, it needs to be
deemphasized because it refers to a different argument of the instruction
than the `/mod`, `/rm32`, `/base`, `/index` and `/scale` operands that
capture the bulk of instruction decoding complexity and so should be
emphasized. `/r32` can also be unused, which strengthens the case for
deemphasizing it.
- In instructions that interpret the "reg" bits as a `/subop` operand,
it should be colocated with the opcode because it performs the same function:
specifying the *operation* the instruction performs.
In both cases, the bits in the `reg` bitfield are conceptually unrelated
to the other bitfields in the same byte. But they sometimes want to be
close to the opcode bytes on the left, and at other times need to be deemphasized
rightward. Fixing both these possibilities seems complicated and stateful,
particularly since all operands are optional in general. On the other hand,
just pulling operands you need to create each byte, regardless of where
in the instruction they occur, that's nicely stateless.
|