| Commit message (Collapse) | Author | Age | Files | Lines |
| |
I thought I'd done this in the previous commit, but I hadn't. And, what's
more, there was a bug that seemed pretty tough for a time. Turns out my
self-hosted translator doesn't support '.' comment tokens in data segments.
Hopefully I'm past the valley of the shadow of death now.
"I HAVE NO TOOLS BECAUSE I’VE DESTROYED MY TOOLS WITH MY TOOLS."
-- James Mickens (https://www.usenix.org/system/files/1311_05-08_mickens.pdf)
| |
CI should start passing again now.
| |
Cleaner abstraction, but adds 3 instructions to our overhead for handles,
including one potentially-hard-to-predict jump :/
I wish I could have put the alloc id in eax for the comparison as well,
to save a few bytes of instruction space. But that messes up the non-null
case.
| |
Mystery solved of why the syntax sugar phases don't work even though they
don't use any functions whose signatures changed in the migration to handles.
The answer: they use the Registers table, and it needs to use handles rather
than raw strings.
| |
Mystery solved of why the syntax sugar phases don't work even though they
don't use any functions whose signatures changed in the migration to handles.
The answer: they use the Registers table, and it currently doesn't use
handles.
Rather than create a whole new set of functions that operate on addresses,
I'm going to create fake handles that are never intended to be reclaimed.
Which raises the question of the best way to do that. I'd like to continue
using string syntax, so I'm going to use a prefix in the payload that can
also be rendered as a string. But printable characters start at 0x20, and
we don't currently have escape sequences for null or any other
non-printable characters.
I _could_ use newlines, but that seems overly clever. So instead I'll once
again not worry about some hypothetical problem with running out of alloc-ids,
and just carve out half of the id space that can't be used for real alloc
ids. ASCII doesn't use the most significant bit of a byte, so it seems like
a natural separation.
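
The id-space split described above can be sketched as follows. This is an illustration in Python rather than SubX, and it assumes 32-bit alloc ids, a little-endian encoding, and id 0 being kept for null; `FAKE_BIT`, `is_fake_alloc_id` and `Allocator` are hypothetical names, not anything from the repository.

```python
import struct

# Assumption: alloc ids are 32 bits wide. The top bit marks the half of the
# id space reserved for "fake" handles that are never reclaimed.
FAKE_BIT = 0x80000000


def is_fake_alloc_id(alloc_id: int) -> bool:
    """True if the id lies in the reserved (never-reclaimed) half."""
    return (alloc_id & FAKE_BIT) != 0


class Allocator:
    """Hands out real alloc ids, which always have the top bit clear."""

    def __init__(self) -> None:
        self.next_id = 1  # assumption: 0 is kept for null handles

    def new_alloc_id(self) -> int:
        if self.next_id >= FAKE_BIT:
            raise RuntimeError("ran out of real alloc ids")
        alloc_id = self.next_id
        self.next_id += 1
        return alloc_id


def encode_alloc_id(alloc_id: int) -> bytes:
    """Little-endian 4-byte prefix as it might sit in front of a payload."""
    return struct.pack("<I", alloc_id)
```

Under these assumptions, every fake id encodes to a prefix containing at least one byte with its most significant bit set, so it can never be mistaken for four bytes of plain ASCII text.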
| |
For this one commit we need to bootstrap ourselves with subx_translate_debug.
| |
This bug was never caught because we've never tested with more than 2 segments.
| |
So far it's unclear how to do this in a series of small commits. Still
nibbling around the edges. In this commit we standardize some terminology:
The _length_ of an array or stream is denominated in high-level elements.
The _size_ is denominated in bytes.
The thing we encode into the type is always the size, not the length.
There's still an open question of what to do about the Mu `length` operator.
I'd like to modify it to provide the length. Currently it provides the
size. If I can't fix that I'll rename it.
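
The length/size terminology above amounts to a multiplication by the element size. A minimal sketch in Python, with hypothetical function names chosen only for illustration:

```python
def size_in_bytes(length: int, element_size: int) -> int:
    """Size (bytes) of an array holding `length` elements."""
    return length * element_size


def length_in_elements(size: int, element_size: int) -> int:
    """Length (elements) of an array occupying `size` bytes."""
    if size % element_size != 0:
        raise ValueError("size is not a whole number of elements")
    return size // element_size
```

For example, an array of three 4-byte ints has length 3 and size 12.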
| |
It's going to be hard work retrofitting 8-byte handles in place of 4-byte
addrs. Here we just clean up some unused args.
| |
At the SubX level we have to put up with null-terminated kernel strings
for commandline args. But so far we haven't done much with them. Rather
than try to support them we'll just convert them transparently to standard
length-prefixed strings.
In the process I realized that it's not quite right to treat the combination
of argc and argv as an array of kernel strings. Argc counts the number
of elements, whereas the length of an array is usually denominated in bytes.
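
The conversion from null-terminated kernel strings to length-prefixed strings can be sketched like this. It is a Python illustration, not the SubX code; it assumes a 4-byte little-endian length prefix, and `kernel_string_to_length_prefixed` and `convert_args` are hypothetical names.

```python
import struct
from typing import List


def kernel_string_to_length_prefixed(memory: bytes, start: int) -> bytes:
    """Copy the null-terminated string at `start` into length-prefixed form."""
    end = memory.index(b"\x00", start)  # ValueError if there is no terminator
    payload = memory[start:end]
    # Assumption: the prefix is a 4-byte little-endian byte count.
    return struct.pack("<I", len(payload)) + payload


def convert_args(memory: bytes, argv: List[int]) -> List[bytes]:
    """Convert every argument; len(argv) plays the role of argc (an element count)."""
    return [kernel_string_to_length_prefixed(memory, p) for p in argv]
```

Here `argv` is modelled as a list of start offsets into `memory`, which keeps the element count (argc) separate from any byte count.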
| |
If indexing into a type with power-of-2-sized elements we can access them
in one instruction:
  x/reg1: (addr int) <- index A/reg2: (addr array int), idx/reg3: int
This translates to a single instruction because x86 instructions support
an addressing mode with left-shifts.
For non-powers-of-2, however, we need a multiply. To keep things type-safe,
it is performed like this:
  x/reg1: (offset T) <- compute-offset A: (addr array T), idx: int
  y/reg2: (addr T) <- index A, x
An offset is just an int that is guaranteed to be a multiple of size-of(T).
Offsets can only be used in index instructions, and the types will eventually
be required to line up.
In the process, I have to expand Input-size because mu.subx is growing
big.
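
The two indexing paths above can be modelled in Python as plain address arithmetic. This sketch ignores any array header, and `index_pow2`, `Offset`, `compute_offset` and `index_with_offset` are hypothetical names used only to show the shift-versus-multiply distinction and the offset invariant.

```python
from dataclasses import dataclass


def index_pow2(base: int, idx: int, element_size: int) -> int:
    """Power-of-2 element sizes: scale the index with a left shift."""
    shift = element_size.bit_length() - 1
    if element_size <= 0 or (1 << shift) != element_size:
        raise ValueError("element size is not a power of 2")
    return base + (idx << shift)


@dataclass(frozen=True)
class Offset:
    """An int guaranteed to be a multiple of the element size it was built for."""
    value: int
    element_size: int


def compute_offset(idx: int, element_size: int) -> Offset:
    """General case: the multiply happens once, here."""
    return Offset(idx * element_size, element_size)


def index_with_offset(base: int, offset: Offset, element_size: int) -> int:
    """Offsets are only usable for indexing an array of the matching type."""
    if offset.element_size != element_size:
        raise TypeError("offset was computed for a different element size")
    return base + offset.value
```

Tagging the offset with its element size stands in for the type check that the commit says will eventually be enforced.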
| |
Support parsing ints from strings rather than slices.
| |
Fix CI.
| |
Fix CI. apps/survey was running out of space in the trace segment when
translating apps/mu.subx.
| |
Expand some buffer sizes to continue building mu.subx natively.
| |
Anytime we create a slice, the first check tends to be whether it's empty.
If we handle ill-formed slices here where start > end, that provides a
measure of safety.
In the Mu translator (mu.subx) we often check for a trailing ':' or ','
and decrement slice->end to ignore it. But that could conceivably yield
ill-formed slices if the slice started out empty. Now we make sure we never
operate on such ill-formed slices.
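
A minimal Python sketch of the idea, assuming a slice is a pair of positions into some text; `Slice`, `slice_is_empty` and `drop_trailing` are hypothetical names, not the routines in mu.subx.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    start: int  # index of the first byte
    end: int    # index one past the last byte


def slice_is_empty(s: Slice) -> bool:
    """Treat ill-formed slices (start > end) as empty too."""
    return s.start >= s.end


def drop_trailing(s: Slice, text: bytes, chars: bytes = b":,") -> Slice:
    """Ignore one trailing ':' or ',' without ever producing start > end."""
    if slice_is_empty(s):
        return s  # nothing to trim; decrementing end here would go wrong
    if text[s.end - 1] in chars:
        return Slice(s.start, s.end - 1)
    return s
```

Because the emptiness check uses `>=` rather than `==`, later code that tests for emptiness first is safe even if some earlier step did produce an ill-formed slice.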
| |
Layers 0-89 are used in self-hosting SubX.
Layers 90-99 are not needed for self-hosting SubX, and therefore could
use transitional levels of syntax sugar.
Layers 100 and up use all SubX syntax sugar.
| |
Try to make the comments consistent with the type system we'll eventually
have.
| |
Fix a bug in one test: it checks eax even though the component under test
returns nothing. It has been passing only by accident all these months.
| |
A couple more primitives now working. In the process I ran into an issue
with some buffer filling up when running ntranslate. Isolating it to survey.subx
was straightforward, but --trace ran out of RAM, and --trace --dump ran
out of (7GB of) disk. In the end what helped was just repeatedly inserting
exits at different points, and I realized there was a magic number that
hadn't been turned into a named constant.
| |
Support binary operations with reg/mem and reg operands.
Everything is passing. However, the self-hosting translator now generates
some discrepancies compared to the C++ translator :(
| |
Clean up pseudocode to match planned syntax for the type- and memory-safe
level-2 Mu language.
http://akkartik.name/post/mu-2019-2 is already out of date.
| |
Replace calculations of constants with labels.
| |
Move stack operations to a layer of their own.
It was some short-term pain to take out the syntax sugar from it, but we
need access to this layer from braces, which can't depend on sugar since
it's part of sugar. It's simpler to keep one clear line than to build
sometimes with some sugar and sometimes without.
| |
This undoes 5672 in favor of a new plan:
Layers 000 - 099 are for running without syntax sugar. We use them for
building syntax-sugar passes.
Layers 100 and up are for running with all syntax sugar.
The layers are arranged in approximate order so more phases rely on earlier
layers than later ones.
I plan to not use intermediate syntax sugar (just sigils without calls,
or sigils and calls without braces) anywhere except in the specific passes
implementing them.