========================================================== Parallel & Spawn ========================================================== Nim has two flavors of parallelism: 1) `Structured`:idx parallelism via the ``parallel`` statement. 2) `Unstructured`:idx: parallelism via the standalone ``spawn`` statement. Both need the `threadpool `_ module to work. Somewhat confusingly, ``spawn`` is also used in the ``parallel`` statement with slightly different semantics. ``spawn`` always takes a call expression of the form ``f(a, ...)``. Let ``T`` be ``f``'s return type. If ``T`` is ``void`` then ``spawn``'s return type is also ``void``. Within a ``parallel`` section ``spawn``'s return type is ``T``, otherwise it is ``FlowVar[T]``. The compiler can ensure the location in ``location = spawn f(...)`` is not read prematurely within a ``parallel`` section and so there is no need for the overhead of an indirection via ``FlowVar[T]`` to ensure correctness. Spawn statement =============== A standalone ``spawn`` statement is a simple construct. It executes the passed expression on the thread pool and returns a `data flow variable`:idx: ``FlowVar[T]`` that can be read from. The reading with the ``^`` operator is **blocking**. However, one can use ``blockUntilAny`` to wait on multiple flow variables at the same time: .. code-block:: nim import std/threadpool, ... # wait until 2 out of 3 servers received the update: proc main = var responses = newSeq[FlowVarBase](3) for i in 0..2: responses[i] = spawn tellServer(Update, "key", "value") var index = blockUntilAny(responses) assert index >= 0 responses.del(index) discard blockUntilAny(responses) Data flow variables ensure that no data races are possible. Due to technical limitations not every type ``T`` is possible in a data flow variable: ``T`` has to be of the type ``ref``, ``string``, ``seq`` or of a type that doesn't contain a type that is garbage collected. This restriction will be removed in the future. Parallel statement ================== Example: .. code-block:: nim # Compute PI in an inefficient way import std/[strutils, math, threadpool] proc term(k: float): float = 4 * math.pow(-1, k) / (2*k + 1) proc pi(n: int): float = var ch = newSeq[float](n+1) parallel: for k in 0..ch.high: ch[k] = spawn term(float(k)) for k in 0..ch.high: result += ch[k] echo formatFloat(pi(5000)) The parallel statement is the preferred mechanism to introduce parallelism in a Nim program. A subset of the Nim language is valid within a ``parallel`` section. This subset is checked to be free of data races at compile time. A sophisticated `disjoint checker`:idx: ensures that no data races are possible even though shared memory is extensively supported! The subset is in fact the full language with the following restrictions / changes: * ``spawn`` within a ``parallel`` section has special semantics. * Every location of the form ``a[i]`` and ``a[i..j]`` and ``dest`` where ``dest`` is part of the pattern ``dest = spawn f(...)`` has to be provably disjoint. This is called the *disjoint check*. * Every other complex location ``loc`` that is used in a spawned proc (``spawn f(loc)``) has to be immutable for the duration of the ``parallel`` section. This is called the *immutability check*. Currently it is not specified what exactly "complex location" means. We need to make this an optimization! * Every array access has to be provably within bounds. This is called the *bounds check*. * Slices are optimized so that no copy is performed. This optimization is not yet performed for ordinary slices outside of a ``parallel`` section.