diff options
Diffstat (limited to 'doc/manual/threads.txt')
-rw-r--r-- | doc/manual/threads.txt | 221 |
1 files changed, 0 insertions, 221 deletions
diff --git a/doc/manual/threads.txt b/doc/manual/threads.txt deleted file mode 100644 index 53071b5a5..000000000 --- a/doc/manual/threads.txt +++ /dev/null @@ -1,221 +0,0 @@ -Threads -======= - -To enable thread support the ``--threads:on`` command line switch needs to -be used. The ``system`` module then contains several threading primitives. -See the `threads <threads.html>`_ and `channels <channels.html>`_ modules -for the low level thread API. There are also high level parallelism constructs -available. See `spawn <#parallel-spawn>`_ for further details. - -Nim's memory model for threads is quite different than that of other common -programming languages (C, Pascal, Java): Each thread has its own (garbage -collected) heap and sharing of memory is restricted to global variables. This -helps to prevent race conditions. GC efficiency is improved quite a lot, -because the GC never has to stop other threads and see what they reference. -Memory allocation requires no lock at all! This design easily scales to massive -multicore processors that are becoming the norm. - - -Thread pragma -------------- - -A proc that is executed as a new thread of execution should be marked by the -``thread`` pragma for reasons of readability. The compiler checks for -violations of the `no heap sharing restriction`:idx:\: This restriction implies -that it is invalid to construct a data structure that consists of memory -allocated from different (thread local) heaps. - -A thread proc is passed to ``createThread`` or ``spawn`` and invoked -indirectly; so the ``thread`` pragma implies ``procvar``. - - -GC safety ---------- - -We call a proc ``p`` `GC safe`:idx: when it doesn't access any global variable -that contains GC'ed memory (``string``, ``seq``, ``ref`` or a closure) either -directly or indirectly through a call to a GC unsafe proc. - -The `gcsafe`:idx: annotation can be used to mark a proc to be gcsafe, -otherwise this property is inferred by the compiler. Note that ``noSideEffect`` -implies ``gcsafe``. The only way to create a thread is via ``spawn`` or -``createThread``. ``spawn`` is usually the preferable method. Either way -the invoked proc must not use ``var`` parameters nor must any of its parameters -contain a ``ref`` or ``closure`` type. This enforces -the *no heap sharing restriction*. - -Routines that are imported from C are always assumed to be ``gcsafe``. -To disable the GC-safety checking the ``--threadAnalysis:off`` command line -switch can be used. This is a temporary workaround to ease the porting effort -from old code to the new threading model. - -To override the compiler's gcsafety analysis a ``{.gcsafe.}`` pragma block can -be used: - -.. code-block:: nim - - var - someGlobal: string = "some string here" - perThread {.threadvar.}: string - - proc setPerThread() = - {.gcsafe.}: - deepCopy(perThread, someGlobal) - - -Future directions: - -- A shared GC'ed heap might be provided. - - -Threadvar pragma ----------------- - -A global variable can be marked with the ``threadvar`` pragma; it is -a `thread-local`:idx: variable then: - -.. code-block:: nim - var checkpoints* {.threadvar.}: seq[string] - -Due to implementation restrictions thread local variables cannot be -initialized within the ``var`` section. (Every thread local variable needs to -be replicated at thread creation.) - - -Threads and exceptions ----------------------- - -The interaction between threads and exceptions is simple: A *handled* exception -in one thread cannot affect any other thread. However, an *unhandled* exception -in one thread terminates the whole *process*! - - - -Parallel & Spawn -================ - -Nim has two flavors of parallelism: -1) `Structured`:idx: parallelism via the ``parallel`` statement. -2) `Unstructured`:idx: parallelism via the standalone ``spawn`` statement. - -Nim has a builtin thread pool that can be used for CPU intensive tasks. For -IO intensive tasks the ``async`` and ``await`` features should be -used instead. Both parallel and spawn need the `threadpool <threadpool.html>`_ -module to work. - -Somewhat confusingly, ``spawn`` is also used in the ``parallel`` statement -with slightly different semantics. ``spawn`` always takes a call expression of -the form ``f(a, ...)``. Let ``T`` be ``f``'s return type. If ``T`` is ``void`` -then ``spawn``'s return type is also ``void`` otherwise it is ``FlowVar[T]``. - -Within a ``parallel`` section sometimes the ``FlowVar[T]`` is eliminated -to ``T``. This happens when ``T`` does not contain any GC'ed memory. -The compiler can ensure the location in ``location = spawn f(...)`` is not -read prematurely within a ``parallel`` section and so there is no need for -the overhead of an indirection via ``FlowVar[T]`` to ensure correctness. - -**Note**: Currently exceptions are not propagated between ``spawn``'ed tasks! - - -Spawn statement ---------------- - -`spawn`:idx: can be used to pass a task to the thread pool: - -.. code-block:: nim - import threadpool - - proc processLine(line: string) = - discard "do some heavy lifting here" - - for x in lines("myinput.txt"): - spawn processLine(x) - sync() - -For reasons of type safety and implementation simplicity the expression -that ``spawn`` takes is restricted: - -* It must be a call expression ``f(a, ...)``. -* ``f`` must be ``gcsafe``. -* ``f`` must not have the calling convention ``closure``. -* ``f``'s parameters may not be of type ``var``. - This means one has to use raw ``ptr``'s for data passing reminding the - programmer to be careful. -* ``ref`` parameters are deeply copied which is a subtle semantic change and - can cause performance problems but ensures memory safety. This deep copy - is performed via ``system.deepCopy`` and so can be overridden. -* For *safe* data exchange between ``f`` and the caller a global ``TChannel`` - needs to be used. However, since spawn can return a result, often no further - communication is required. - - -``spawn`` executes the passed expression on the thread pool and returns -a `data flow variable`:idx: ``FlowVar[T]`` that can be read from. The reading -with the ``^`` operator is **blocking**. However, one can use ``awaitAny`` to -wait on multiple flow variables at the same time: - -.. code-block:: nim - import threadpool, ... - - # wait until 2 out of 3 servers received the update: - proc main = - var responses = newSeq[FlowVarBase](3) - for i in 0..2: - responses[i] = spawn tellServer(Update, "key", "value") - var index = awaitAny(responses) - assert index >= 0 - responses.del(index) - discard awaitAny(responses) - -Data flow variables ensure that no data races -are possible. Due to technical limitations not every type ``T`` is possible in -a data flow variable: ``T`` has to be of the type ``ref``, ``string``, ``seq`` -or of a type that doesn't contain a type that is garbage collected. This -restriction is not hard to work-around in practice. - - - -Parallel statement ------------------- - -Example: - -.. code-block:: nim - # Compute PI in an inefficient way - import strutils, math, threadpool - - proc term(k: float): float = 4 * math.pow(-1, k) / (2*k + 1) - - proc pi(n: int): float = - var ch = newSeq[float](n+1) - parallel: - for k in 0..ch.high: - ch[k] = spawn term(float(k)) - for k in 0..ch.high: - result += ch[k] - - echo formatFloat(pi(5000)) - - -The parallel statement is the preferred mechanism to introduce parallelism -in a Nim program. A subset of the Nim language is valid within a -``parallel`` section. This subset is checked to be free of data races at -compile time. A sophisticated `disjoint checker`:idx: ensures that no data -races are possible even though shared memory is extensively supported! - -The subset is in fact the full language with the following -restrictions / changes: - -* ``spawn`` within a ``parallel`` section has special semantics. -* Every location of the form ``a[i]`` and ``a[i..j]`` and ``dest`` where - ``dest`` is part of the pattern ``dest = spawn f(...)`` has to be - provably disjoint. This is called the *disjoint check*. -* Every other complex location ``loc`` that is used in a spawned - proc (``spawn f(loc)``) has to be immutable for the duration of - the ``parallel`` section. This is called the *immutability check*. Currently - it is not specified what exactly "complex location" means. We need to make - this an optimization! -* Every array access has to be provably within bounds. This is called - the *bounds check*. -* Slices are optimized so that no copy is performed. This optimization is not - yet performed for ordinary slices outside of a ``parallel`` section. |