diff options
author | Andreas Rumpf <rumpf_a@web.de> | 2021-03-26 16:27:55 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-03-26 16:27:55 +0100 |
commit | 3e03f6733548fecb328214e0c8632fa4916992ed (patch) | |
tree | 1605cc1a34f2f94abd396a6f6d5d7cbd9e0d370a /lib | |
parent | add771b051128b5eef09b4d77015fbeacf994700 (diff) | |
download | Nim-3e03f6733548fecb328214e0c8632fa4916992ed.tar.gz |
cleaned up the internal documentation (#17524)
Diffstat (limited to 'lib')
-rw-r--r-- | lib/system/cellsets.nim | 35 | ||||
-rw-r--r-- | lib/system/gc.nim | 47 |
2 files changed, 81 insertions, 1 deletions
diff --git a/lib/system/cellsets.nim b/lib/system/cellsets.nim index 779f1a91f..a0f1fabf9 100644 --- a/lib/system/cellsets.nim +++ b/lib/system/cellsets.nim @@ -7,7 +7,40 @@ # distribution, for details about the copyright. # -# Efficient set of pointers for the GC (and repr) + +#[ + +Efficient set of pointers for the GC (and repr) +----------------------------------------------- + +The GC depends on an extremely efficient datastructure for storing a +set of pointers - this is called a `CellSet` in the source code. +Inserting, deleting and searching are done in constant time. However, +modifying a `CellSet` during traversal leads to undefined behaviour. + +All operations on a CellSet have to perform efficiently. Because a Cellset can +become huge a hash table alone is not suitable for this. + +We use a mixture of bitset and hash table for this. The hash table maps *pages* +to a page descriptor. The page descriptor contains a bit for any possible cell +address within this page. So including a cell is done as follows: + +- Find the page descriptor for the page the cell belongs to. +- Set the appropriate bit in the page descriptor indicating that the + cell points to the start of a memory block. + +Removing a cell is analogous - the bit has to be set to zero. +Single page descriptors are never deleted from the hash table. This is not +needed as the data structures needs to be rebuilt periodically anyway. + +Complete traversal is done in this way:: + + for each page descriptor d: + for each bit in d: + if bit == 1: + traverse the pointer belonging to this bit + +]# when defined(gcOrc) or defined(gcArc): type diff --git a/lib/system/gc.nim b/lib/system/gc.nim index 1f7164266..ae29f3466 100644 --- a/lib/system/gc.nim +++ b/lib/system/gc.nim @@ -12,6 +12,53 @@ # Refcounting + Mark&Sweep. Complex algorithms avoided. # Been there, done that, didn't work. +#[ + +A *cell* is anything that is traced by the GC +(sequences, refs, strings, closures). + +The basic algorithm is *Deferrent Reference Counting* with cycle detection. +References on the stack are not counted for better performance and easier C +code generation. + +Each cell has a header consisting of a RC and a pointer to its type +descriptor. However the program does not know about these, so they are placed at +negative offsets. In the GC code the type `PCell` denotes a pointer +decremented by the right offset, so that the header can be accessed easily. It +is extremely important that `pointer` is not confused with a `PCell`. + +In Nim the compiler cannot always know if a reference +is stored on the stack or not. This is caused by var parameters. +Consider this example: + +.. code-block:: Nim + proc setRef(r: var ref TNode) = + new(r) + + proc usage = + var + r: ref TNode + setRef(r) # here we should not update the reference counts, because + # r is on the stack + setRef(r.left) # here we should update the refcounts! + +We have to decide at runtime whether the reference is on the stack or not. +The generated code looks roughly like this: + +.. code-block:: C + void setref(TNode** ref) { + unsureAsgnRef(ref, newObj(TNode_TI, sizeof(TNode))) + } + void usage(void) { + setRef(&r) + setRef(&r->left) + } + +Note that for systems with a continuous stack (which most systems have) +the check whether the ref is on the stack is very cheap (only two +comparisons). +]# + {.push profiler:off.} const |