-rwxr-xr-x  doc/intern.txt   85
-rwxr-xr-x  doc/nimrodc.txt  26
-rwxr-xr-x  todo.txt          5
-rwxr-xr-x  web/news.txt      3
4 files changed, 112 insertions, 7 deletions
diff --git a/doc/intern.txt b/doc/intern.txt
index c6c0108a7..475aad75f 100755
--- a/doc/intern.txt
+++ b/doc/intern.txt
@@ -112,7 +112,7 @@ Look at the file ``lib/system/hti.nim`` for more information.
 The compiler's architecture
 ===========================
 
-Nimrod uses the classic compiler architecture: A scanner feds tokens to a
+Nimrod uses the classic compiler architecture: A lexer/scanner feeds tokens to a
 parser. The parser builds a syntax tree that is used by the code generator.
 This syntax tree is the interface between the parser and the code generator.
 It is essential to understand most of the compiler's code.
@@ -148,6 +148,89 @@ semantic checking, a ``compilerproc`` is a proc that is used by the code
 generator.
 
 
+Compilation cache
+=================
+
+The implementation of the `compilation cache`:idx: is tricky: There are lots
+of issues to be solved for the front- and backend. In the following 
+sections *global* means *shared between modules* or *property of the whole
+program*.
+
+
+Frontend issues
+---------------
+
+Nimrod contains language features that are *global*. The best example of this
+is multi methods: Introducing a new method with the same name and some
+compatible object parameter means that the method's dispatcher needs to take
+the new method into account. So the dispatching logic is only completely known
+after the whole program has been translated!
+
+Other features that are *implicitly* triggered cause problems for modularity 
+too. Type converters fall into this category:
+
+.. code-block:: nimrod
+  # module A
+  converter toBool(x: int): bool =
+    result = x != 0
+    
+.. code-block:: nimrod
+  # module B
+  import A
+  
+  if 1:
+    echo "ugly, but should work"
+
+If in the above example module ``B`` is re-compiled but ``A`` is not, then
+``B`` needs to be aware of ``toBool`` even though ``toBool`` is not referenced
+in ``B`` *explicitly*.
+
+Both the multi method and the type converter problems are solved by storing 
+them in special sections in the ROD file that are loaded *unconditionally*
+when the ROD file is read.
+
+
+Backend issues
+--------------
+
+- Calls to init procs must not be "forgotten".
+- Files that need to be linked must not be "forgotten".
+- Anything that is contained in ``nim__dat.c`` is shared between modules
+  implicitly.
+- Method dispatchers are global.
+- DLL loading via ``dlsym`` is global.
+- Emulated thread vars are global.
+
+
+However, the biggest problem is that dead code elimination breaks modularity!
+To see why, consider this scenario: The module ``G`` (for example the huge
+Gtk2 module...) is compiled with dead code elimination turned on. So none
+of ``G``'s procs is generated at all.
+
+Then module ``B``, which requires ``G.P1``, is compiled. Ok, no problem,
+``G.P1`` is loaded from the symbol file and ``G.c`` now contains ``G.P1``.
+
+Then module ``A`` (that depends on ``B`` and ``G``) is compiled and ``B``
+and ``G`` are left unchanged. ``A`` requires ``G.P2``.
+
+So now ``G.c`` MUST contain both ``P1`` and ``P2``, but we haven't even
+loaded ``P1`` from the symbol file, nor do we want to because we would then
+quickly restore large parts of the whole program. But we also don't want to
+store ``P1`` in ``B.c`` because that would mean storing every symbol in the
+module that refers to it, which ultimately means the main module and putting
+everything in a single C file.
+
+There is however another solution: The old file ``G.c`` containing ``P1`` is
+**merged** with the new file ``G.c`` containing ``P2``. This is the solution
+that is implemented in the C code generator (have a look at the ``ccgmerge``
+module). The merging may lead to *cruft* (aka dead code) in generated C code
+which can only be removed by recompiling a project with the compilation cache
+turned off. Nevertheless the merge solution is way superior to the
+cheap solution "turn off dead code elimination if the compilation cache is 
+turned on".
+
+
+
 Debugging Nimrod's memory management
 ====================================
 
diff --git a/doc/nimrodc.txt b/doc/nimrodc.txt
index d249d4860..08128521f 100755
--- a/doc/nimrodc.txt
+++ b/doc/nimrodc.txt
@@ -76,6 +76,28 @@ Linux does not compile on Windows, for instance. The comment on top of the
 C file lists the OS, CPU and CC the file has been compiled for.
 
 
+Compilation cache
+=================
+
+**Warning**: The compilation cache is still highly experimental!
+
+The ``nimcache`` directory may also contain so-called `rod`:idx:
+or `symbol files`:idx:. These files are pre-compiled modules that are used by
+the compiler to perform `incremental compilation`:idx:. This means that only
+modules that have changed since the last compilation (or the modules depending
+on them etc.) are re-compiled. However, by default no symbol files are
+generated; use the ``--symbolFiles:on`` command line switch to activate them.
+
+Unfortunately, for technical reasons ``--symbolFiles:on`` needs
+to *aggregate* some generated C code. This means that the resulting executable
+might contain some cruft even when dead code elimination is turned on. So
+the final release build should be done with ``--symbolFiles:off``.
+
+Due to the aggregation of C code it is also recommended that each project
+resides in its own directory so that the generated ``nimcache`` directory
+is not shared between different projects.
+
+
 Cross compilation
 =================
 
@@ -315,7 +337,7 @@ ENDB. See the documentation of `endb <endb.html>`_ for further information.
 Volatile pragma
 ---------------
 The `volatile`:idx: pragma is for variables only. It declares the variable as
-``volatile``, whatever that means in C/C++ (its semantics are not well defined
+``volatile``, whatever that means in C/C++ (its semantics are not well defined
 in C/C++).
 
 **Note**: This pragma will not exist for the LLVM backend.
@@ -331,7 +353,7 @@ input management. To start Nimrod in interactive mode use the command
 ``nimrod i``. To quit use the ``quit()`` command. To determine whether an input
 line is an incomplete statement to be continued these rules are used:
 
-1. The line ends with ``[-+*/\\<>!\?\|%&$@~,;:=#^]\s*$`` (operator symbol 
+1. The line ends with ``[-+*/\\<>!\?\|%&$@~,;:=#^]\s*$`` (operator symbol 
    followed by optional whitespace).
 2. The line starts with a space (indentation).
 3. The line is within a triple quoted string literal. However, the detection 
diff --git a/todo.txt b/todo.txt
index 7add7ceb3..4db53a105 100755
--- a/todo.txt
+++ b/todo.txt
@@ -18,16 +18,13 @@ incremental compilation
 
 - adapt thread var implementation to care about the new merge operation
 - write test cases: needs test script support
+  - test type converters
   - test thread var
   - test method generation
-  - test type converters
   - test init sections
   - test DLL interfacing!
   - hallo.rod is missing initial statements: feature or bug?
 - fix remaining bugs
-- write documentation
-- make the compiler output a warning if linking fails with --symbolFiles:on
-  (necessary?)
 
 
 version 0.9.0
diff --git a/web/news.txt b/web/news.txt
index 3588876f1..b415b9164 100755
--- a/web/news.txt
+++ b/web/news.txt
@@ -74,6 +74,9 @@ Compiler Additions
   are declared with the ``TaintedString`` string type. If the taint
   mode is turned on it is a distinct string type which helps to detect input
   validation errors.
+- The compiler now supports the compilation cache via ``--symbolFiles:on``. 
+  This potentially speeds up compilations by an order of magnitude, but is
+  still highly experimental!
 
 
 Library Additions