diff options
Diffstat (limited to 'doc/intern.rst')
-rw-r--r-- | doc/intern.rst | 669 |
1 files changed, 669 insertions, 0 deletions
diff --git a/doc/intern.rst b/doc/intern.rst new file mode 100644 index 000000000..b253cac42 --- /dev/null +++ b/doc/intern.rst @@ -0,0 +1,669 @@ +========================================= + Internals of the Nim Compiler +========================================= + + +:Author: Andreas Rumpf +:Version: |nimversion| + +.. contents:: + + "Abstraction is layering ignorance on top of reality." -- unknown + + +Directory structure +=================== + +The Nim project's directory structure is: + +============ ============================================== +Path Purpose +============ ============================================== +``bin`` generated binary files +``build`` generated C code for the installation +``compiler`` the Nim compiler itself; note that this + code has been translated from a bootstrapping + version written in Pascal, so the code is **not** + a poster child of good Nim code +``config`` configuration files for Nim +``dist`` additional packages for the distribution +``doc`` the documentation; it is a bunch of + reStructuredText files +``lib`` the Nim library +``web`` website of Nim; generated by ``nimweb`` + from the ``*.txt`` and ``*.tmpl`` files +============ ============================================== + + +Bootstrapping the compiler +========================== + +Compiling the compiler is a simple matter of running:: + + nim c koch.nim + ./koch boot + +For a release version use:: + + nim c koch.nim + ./koch boot -d:release + +And for a debug version compatible with GDB:: + + nim c koch.nim + ./koch boot --debuginfo --linedir:on + +The ``koch`` program is Nim's maintenance script. It is a replacement for +make and shell scripting with the advantage that it is much more portable. +More information about its options can be found in the `koch <koch.html>`_ +documentation. + + +Coding Guidelines +================= + +* Use CamelCase, not underscored_identifiers. +* Indent with two spaces. +* Max line length is 80 characters. +* Provide spaces around binary operators if that enhances readability. +* Use a space after a colon, but not before it. +* Start types with a capital ``T``, unless they are pointers/references which + start with ``P``. + +See also the `API naming design <apis.html>`_ document. + + +Porting to new platforms +======================== + +Porting Nim to a new architecture is pretty easy, since C is the most +portable programming language (within certain limits) and Nim generates +C code, porting the code generator is not necessary. + +POSIX-compliant systems on conventional hardware are usually pretty easy to +port: Add the platform to ``platform`` (if it is not already listed there), +check that the OS, System modules work and recompile Nim. + +The only case where things aren't as easy is when the garbage +collector needs some assembler tweaking to work. The standard +version of the GC uses C's ``setjmp`` function to store all registers +on the hardware stack. It may be necessary that the new platform needs to +replace this generic code by some assembler code. + + +Runtime type information +======================== + +*Runtime type information* (RTTI) is needed for several aspects of the Nim +programming language: + +Garbage collection + The most important reason for RTTI. Generating + traversal procedures produces bigger code and is likely to be slower on + modern hardware as dynamic procedure binding is hard to predict. + +Complex assignments + Sequences and strings are implemented as + pointers to resizeable buffers, but Nim requires copying for + assignments. Apart from RTTI the compiler could generate copy procedures + for any type that needs one. However, this would make the code bigger and + the RTTI is likely already there for the GC. + +We already know the type information as a graph in the compiler. +Thus we need to serialize this graph as RTTI for C code generation. +Look at the file ``lib/system/hti.nim`` for more information. + + +Debugging the compiler +====================== + +You can of course use GDB or Visual Studio to debug the +compiler (via ``--debuginfo --lineDir:on``). However, there +are also lots of procs that aid in debugging: + + +.. code-block:: nim + # pretty prints the Nim AST + echo renderTree(someNode) + # outputs some JSON representation + debug(someNode) + # pretty prints some type + echo typeToString(someType) + debug(someType) + echo symbol.name.s + debug(symbol) + # pretty prints the Nim ast, but annotates symbol IDs: + echo renderTree(someNode, {renderIds}) + if n.info ?? "temp.nim": + # only output when it comes from "temp.nim" + echo renderTree(n) + if n.info ?? "temp.nim": + # why does it process temp.nim here? + writeStackTrace() + +To create a new compiler for each run, use ``koch temp``:: + + ./koch temp c /tmp/test.nim + +``koch temp`` creates a debug build of the compiler, which is useful +to create stacktraces for compiler debugging. + +``koch temp`` returns 125 as the exit code in case the compiler +compilation fails. This exit code tells ``git bisect`` to skip the +current commit.:: + + git bisect start bad-commit good-commit + git bisect run ./koch temp -r c test-source.nim + +The compiler's architecture +=========================== + +Nim uses the classic compiler architecture: A lexer/scanner feds tokens to a +parser. The parser builds a syntax tree that is used by the code generator. +This syntax tree is the interface between the parser and the code generator. +It is essential to understand most of the compiler's code. + +In order to compile Nim correctly, type-checking has to be separated from +parsing. Otherwise generics cannot work. + +.. include:: filelist.txt + + +The syntax tree +--------------- +The syntax tree consists of nodes which may have an arbitrary number of +children. Types and symbols are represented by other nodes, because they +may contain cycles. The AST changes its shape after semantic checking. This +is needed to make life easier for the code generators. See the "ast" module +for the type definitions. The `macros <macros.html>`_ module contains many +examples how the AST represents each syntactic structure. + + +How the RTL is compiled +======================= + +The ``system`` module contains the part of the RTL which needs support by +compiler magic (and the stuff that needs to be in it because the spec +says so). The C code generator generates the C code for it just like any other +module. However, calls to some procedures like ``addInt`` are inserted by +the CCG. Therefore the module ``magicsys`` contains a table (``compilerprocs``) +with all symbols that are marked as ``compilerproc``. ``compilerprocs`` are +needed by the code generator. A ``magic`` proc is not the same as a +``compilerproc``: A ``magic`` is a proc that needs compiler magic for its +semantic checking, a ``compilerproc`` is a proc that is used by the code +generator. + + +Compilation cache +================= + +The implementation of the compilation cache is tricky: There are lots +of issues to be solved for the front- and backend. + + +General approach: AST replay +---------------------------- + +We store a module's AST of a successful semantic check in a SQLite +database. There are plenty of features that require a sub sequence +to be re-applied, for example: + +.. code-block:: nim + {.compile: "foo.c".} # even if the module is loaded from the DB, + # "foo.c" needs to be compiled/linked. + +The solution is to **re-play** the module's top level statements. +This solves the problem without having to special case the logic +that fills the internal seqs which are affected by the pragmas. + +In fact, this decribes how the AST should be stored in the database, +as a "shallow" tree. Let's assume we compile module ``m`` with the +following contents: + +.. code-block:: nim + import strutils + + var x*: int = 90 + {.compile: "foo.c".} + proc p = echo "p" + proc q = echo "q" + static: + echo "static" + +Conceptually this is the AST we store for the module: + +.. code-block:: nim + import strutils + + var x* + {.compile: "foo.c".} + proc p + proc q + static: + echo "static" + +The symbol's ``ast`` field is loaded lazily, on demand. This is where most +savings come from, only the shallow outer AST is reconstructed immediately. + +It is also important that the replay involves the ``import`` statement so +that the dependencies are resolved properly. + + +Shared global compiletime state +------------------------------- + +Nim allows ``.global, compiletime`` variables that can be filled by macro +invokations across different modules. This feature breaks modularity in a +severe way. Plenty of different solutions have been proposed: + +- Restrict the types of global compiletime variables to ``Set[T]`` or + similar unordered, only-growable collections so that we can track + the module's write effects to these variables and reapply the changes + in a different order. +- In every module compilation, reset the variable to its default value. +- Provide a restrictive API that can load/save the compiletime state to + a file. + +(These solutions are not mutually exclusive.) + +Since we adopt the "replay the top level statements" idea, the natural +solution to this problem is to emit pseudo top level statements that +reflect the mutations done to the global variable. However, this is +MUCH harder than it sounds, for example ``squeaknim`` uses this +snippet: + +.. code-block:: nim + apicall.add(") module: '" & dllName & "'>\C" & + "\t^self externalCallFailed\C!\C\C") + stCode.add(st & "\C\t\"Generated by NimSqueak\"\C\t" & apicall) + +We can "replay" ``stCode.add`` only if the values of ``st`` +and ``apicall`` are known. And even then a hash table's ``add`` with its +hashing mechanism is too hard to replay. + +In practice, things are worse still, consider ``someGlobal[i][j].add arg``. +We only know the root is ``someGlobal`` but the concrete path to the data +is unknown as is the value that is added. We could compute a "diff" between +the global states and use that to compute a symbol patchset, but this is +quite some work, expensive to do at runtime (it would need to run after +every module has been compiled) and also would break for hash tables. + +We need an API that hides the complex aliasing problems by not relying +on Nim's global variables. The obvious solution is to use string keys +instead of global variables: + +.. code-block:: nim + + proc cachePut*(key: string; value: string) + proc cacheGet*(key: string): string + +However, the values being strings/json is quite problematic: Many +lookup tables that are built at compiletime embed *proc vars* and +types which have no obvious string representation... Seems like +AST diffing is still the best idea as it will not require to use +an alien API and works with some existing Nimble packages, at least. + +On the other hand, in Nim's future I would like to replace the VM +by native code. A diff algorithm wouldn't work for that. +Instead the native code would work with an API like ``put``, ``get``: + +.. code-block:: nim + + proc cachePut*(key: string; value: NimNode) + proc cacheGet*(key: string): NimNode + +The API should embrace the AST diffing notion: See the +module ``macrocache`` for the final details. + + + +Methods and type converters +--------------------------- + +In the following +sections *global* means *shared between modules* or *property of the whole +program*. + +Nim contains language features that are *global*. The best example for that +are multi methods: Introducing a new method with the same name and some +compatible object parameter means that the method's dispatcher needs to take +the new method into account. So the dispatching logic is only completely known +after the whole program has been translated! + +Other features that are *implicitly* triggered cause problems for modularity +too. Type converters fall into this category: + +.. code-block:: nim + # module A + converter toBool(x: int): bool = + result = x != 0 + +.. code-block:: nim + # module B + import A + + if 1: + echo "ugly, but should work" + +If in the above example module ``B`` is re-compiled, but ``A`` is not then +``B`` needs to be aware of ``toBool`` even though ``toBool`` is not referenced +in ``B`` *explicitly*. + +Both the multi method and the type converter problems are solved by the +AST replay implementation. + + +Generics +~~~~~~~~ + +We cache generic instantiations and need to ensure this caching works +well with the incremental compilation feature. Since the cache is +attached to the ``PSym`` datastructure, it should work without any +special logic. + + +Backend issues +-------------- + +- Init procs must not be "forgotten" to be called. +- Files must not be "forgotten" to be linked. +- Method dispatchers are global. +- DLL loading via ``dlsym`` is global. +- Emulated thread vars are global. + +However the biggest problem is that dead code elimination breaks modularity! +To see why, consider this scenario: The module ``G`` (for example the huge +Gtk2 module...) is compiled with dead code elimination turned on. So none +of ``G``'s procs is generated at all. + +Then module ``B`` is compiled that requires ``G.P1``. Ok, no problem, +``G.P1`` is loaded from the symbol file and ``G.c`` now contains ``G.P1``. + +Then module ``A`` (that depends on ``B`` and ``G``) is compiled and ``B`` +and ``G`` are left unchanged. ``A`` requires ``G.P2``. + +So now ``G.c`` MUST contain both ``P1`` and ``P2``, but we haven't even +loaded ``P1`` from the symbol file, nor do we want to because we then quickly +would restore large parts of the whole program. + +Solution +~~~~~~~~ + +The backend must have some logic so that if the currently processed module +is from the compilation cache, the ``ast`` field is not accessed. Instead +the generated C(++) for the symbol's body needs to be cached too and +inserted back into the produced C file. This approach seems to deal with +all the outlined problems above. + + +Debugging Nim's memory management +================================= + +The following paragraphs are mostly a reminder for myself. Things to keep +in mind: + +* If an assertion in Nim's memory manager or GC fails, the stack trace + keeps allocating memory! Thus a stack overflow may happen, hiding the + real issue. +* What seem to be C code generation problems is often a bug resulting from + not producing prototypes, so that some types default to ``cint``. Testing + without the ``-w`` option helps! + + +The Garbage Collector +===================== + +Introduction +------------ + +I use the term *cell* here to refer to everything that is traced +(sequences, refs, strings). +This section describes how the GC works. + +The basic algorithm is *Deferrent Reference Counting* with cycle detection. +References on the stack are not counted for better performance and easier C +code generation. + +Each cell has a header consisting of a RC and a pointer to its type +descriptor. However the program does not know about these, so they are placed at +negative offsets. In the GC code the type ``PCell`` denotes a pointer +decremented by the right offset, so that the header can be accessed easily. It +is extremely important that ``pointer`` is not confused with a ``PCell`` +as this would lead to a memory corruption. + + +The CellSet data structure +-------------------------- + +The GC depends on an extremely efficient datastructure for storing a +set of pointers - this is called a ``TCellSet`` in the source code. +Inserting, deleting and searching are done in constant time. However, +modifying a ``TCellSet`` during traversation leads to undefined behaviour. + +.. code-block:: Nim + type + TCellSet # hidden + + proc cellSetInit(s: var TCellSet) # initialize a new set + proc cellSetDeinit(s: var TCellSet) # empty the set and free its memory + proc incl(s: var TCellSet, elem: PCell) # include an element + proc excl(s: var TCellSet, elem: PCell) # exclude an element + + proc `in`(elem: PCell, s: TCellSet): bool # tests membership + + iterator elements(s: TCellSet): (elem: PCell) + + +All the operations have to perform efficiently. Because a Cellset can +become huge a hash table alone is not suitable for this. + +We use a mixture of bitset and hash table for this. The hash table maps *pages* +to a page descriptor. The page descriptor contains a bit for any possible cell +address within this page. So including a cell is done as follows: + +- Find the page descriptor for the page the cell belongs to. +- Set the appropriate bit in the page descriptor indicating that the + cell points to the start of a memory block. + +Removing a cell is analogous - the bit has to be set to zero. +Single page descriptors are never deleted from the hash table. This is not +needed as the data structures needs to be rebuilt periodically anyway. + +Complete traversal is done in this way:: + + for each page descriptor d: + for each bit in d: + if bit == 1: + traverse the pointer belonging to this bit + + +Further complications +--------------------- + +In Nim the compiler cannot always know if a reference +is stored on the stack or not. This is caused by var parameters. +Consider this example: + +.. code-block:: Nim + proc setRef(r: var ref TNode) = + new(r) + + proc usage = + var + r: ref TNode + setRef(r) # here we should not update the reference counts, because + # r is on the stack + setRef(r.left) # here we should update the refcounts! + +We have to decide at runtime whether the reference is on the stack or not. +The generated code looks roughly like this: + +.. code-block:: C + void setref(TNode** ref) { + unsureAsgnRef(ref, newObj(TNode_TI, sizeof(TNode))) + } + void usage(void) { + setRef(&r) + setRef(&r->left) + } + +Note that for systems with a continuous stack (which most systems have) +the check whether the ref is on the stack is very cheap (only two +comparisons). + + +Code generation for closures +============================ + +Code generation for closures is implemented by `lambda lifting`:idx:. + +Design +------ + +A ``closure`` proc var can call ordinary procs of the default Nim calling +convention. But not the other way round! A closure is implemented as a +``tuple[prc, env]``. ``env`` can be nil implying a call without a closure. +This means that a call through a closure generates an ``if`` but the +interoperability is worth the cost of the ``if``. Thunk generation would be +possible too, but it's slightly more effort to implement. + +Tests with GCC on Amd64 showed that it's really beneficical if the +'environment' pointer is passed as the last argument, not as the first argument. + +Proper thunk generation is harder because the proc that is to wrap +could stem from a complex expression: + +.. code-block:: nim + receivesClosure(returnsDefaultCC[i]) + +A thunk would need to call 'returnsDefaultCC[i]' somehow and that would require +an *additional* closure generation... Ok, not really, but it requires to pass +the function to call. So we'd end up with 2 indirect calls instead of one. +Another much more severe problem which this solution is that it's not GC-safe +to pass a proc pointer around via a generic ``ref`` type. + + +Example code: + +.. code-block:: nim + proc add(x: int): proc (y: int): int {.closure.} = + return proc (y: int): int = + return x + y + + var add2 = add(2) + echo add2(5) #OUT 7 + +This should produce roughly this code: + +.. code-block:: nim + type + PEnv = ref object + x: int # data + + proc anon(y: int, c: PEnv): int = + return y + c.x + + proc add(x: int): tuple[prc, data] = + var env: PEnv + new env + env.x = x + result = (anon, env) + + var add2 = add(2) + let tmp = if add2.data == nil: add2.prc(5) else: add2.prc(5, add2.data) + echo tmp + + +Beware of nesting: + +.. code-block:: nim + proc add(x: int): proc (y: int): proc (z: int): int {.closure.} {.closure.} = + return lamba (y: int): proc (z: int): int {.closure.} = + return lambda (z: int): int = + return x + y + z + + var add24 = add(2)(4) + echo add24(5) #OUT 11 + +This should produce roughly this code: + +.. code-block:: nim + type + PEnvX = ref object + x: int # data + + PEnvY = ref object + y: int + ex: PEnvX + + proc lambdaZ(z: int, ey: PEnvY): int = + return ey.ex.x + ey.y + z + + proc lambdaY(y: int, ex: PEnvX): tuple[prc, data: PEnvY] = + var ey: PEnvY + new ey + ey.y = y + ey.ex = ex + result = (lambdaZ, ey) + + proc add(x: int): tuple[prc, data: PEnvX] = + var ex: PEnvX + ex.x = x + result = (labmdaY, ex) + + var tmp = add(2) + var tmp2 = tmp.fn(4, tmp.data) + var add24 = tmp2.fn(4, tmp2.data) + echo add24(5) + + +We could get rid of nesting environments by always inlining inner anon procs. +More useful is escape analysis and stack allocation of the environment, +however. + + +Alternative +----------- + +Process the closure of all inner procs in one pass and accumulate the +environments. This is however not always possible. + + +Accumulator +----------- + +.. code-block:: nim + proc getAccumulator(start: int): proc (): int {.closure} = + var i = start + return lambda: int = + inc i + return i + + proc p = + var delta = 7 + proc accumulator(start: int): proc(): int = + var x = start-1 + result = proc (): int = + x = x + delta + inc delta + return x + + var a = accumulator(3) + var b = accumulator(4) + echo a() + b() + + +Internals +--------- + +Lambda lifting is implemented as part of the ``transf`` pass. The ``transf`` +pass generates code to setup the environment and to pass it around. However, +this pass does not change the types! So we have some kind of mismatch here; on +the one hand the proc expression becomes an explicit tuple, on the other hand +the tyProc(ccClosure) type is not changed. For C code generation it's also +important the hidden formal param is ``void*`` and not something more +specialized. However the more specialized env type needs to passed to the +backend somehow. We deal with this by modifying ``s.ast[paramPos]`` to contain +the formal hidden parameter, but not ``s.typ``! + + |