diff options
Diffstat (limited to 'doc/manual/procs.txt')
-rw-r--r-- | doc/manual/procs.txt | 554 |
1 files changed, 554 insertions, 0 deletions
diff --git a/doc/manual/procs.txt b/doc/manual/procs.txt new file mode 100644 index 000000000..66f9ad58f --- /dev/null +++ b/doc/manual/procs.txt @@ -0,0 +1,554 @@ +Procedures +========== + +What most programming languages call `methods`:idx: or `functions`:idx: are +called `procedures`:idx: in Nim (which is the correct terminology). A +procedure declaration defines an identifier and associates it with a block +of code. +A procedure may call itself recursively. A parameter may be given a default +value that is used if the caller does not provide a value for this parameter. + +If the proc declaration has no body, it is a `forward`:idx: declaration. If +the proc returns a value, the procedure body can access an implicitly declared +variable named `result`:idx: that represents the return value. Procs can be +overloaded. The overloading resolution algorithm tries to find the proc that is +the best match for the arguments. Example: + +.. code-block:: nim + + proc toLower(c: Char): Char = # toLower for characters + if c in {'A'..'Z'}: + result = chr(ord(c) + (ord('a') - ord('A'))) + else: + result = c + + proc toLower(s: string): string = # toLower for strings + result = newString(len(s)) + for i in 0..len(s) - 1: + result[i] = toLower(s[i]) # calls toLower for characters; no recursion! + +Calling a procedure can be done in many different ways: + +.. code-block:: nim + proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ... + + # call with positional arguments # parameter bindings: + callme(0, 1, "abc", '\t', true) # (x=0, y=1, s="abc", c='\t', b=true) + # call with named and positional arguments: + callme(y=1, x=0, "abd", '\t') # (x=0, y=1, s="abd", c='\t', b=false) + # call with named arguments (order is not relevant): + callme(c='\t', y=1, x=0) # (x=0, y=1, s="", c='\t', b=false) + # call as a command statement: no () needed: + callme 0, 1, "abc", '\t' + + +A procedure cannot modify its parameters (unless the parameters have the type +`var`). + +`Operators`:idx: are procedures with a special operator symbol as identifier: + +.. code-block:: nim + proc `$` (x: int): string = + # converts an integer to a string; this is a prefix operator. + result = intToStr(x) + +Operators with one parameter are prefix operators, operators with two +parameters are infix operators. (However, the parser distinguishes these from +the operator's position within an expression.) There is no way to declare +postfix operators: all postfix operators are built-in and handled by the +grammar explicitly. + +Any operator can be called like an ordinary proc with the '`opr`' +notation. (Thus an operator can have more than two parameters): + +.. code-block:: nim + proc `*+` (a, b, c: int): int = + # Multiply and add + result = a * b + c + + assert `*+`(3, 4, 6) == `*`(a, `+`(b, c)) + + +Method call syntax +------------------ + +For object oriented programming, the syntax ``obj.method(args)`` can be used +instead of ``method(obj, args)``. The parentheses can be omitted if there are no +remaining arguments: ``obj.len`` (instead of ``len(obj)``). + +This method call syntax is not restricted to objects, it can be used +to supply any type of first argument for procedures: + +.. code-block:: nim + + echo("abc".len) # is the same as echo(len("abc")) + echo("abc".toUpper()) + echo({'a', 'b', 'c'}.card) + stdout.writeln("Hallo") # the same as writeln(stdout, "Hallo") + +Another way to look at the method call syntax is that it provides the missing +postfix notation. + + +Properties +---------- +Nim has no need for *get-properties*: Ordinary get-procedures that are called +with the *method call syntax* achieve the same. But setting a value is +different; for this a special setter syntax is needed: + +.. code-block:: nim + + type + TSocket* = object of TObject + FHost: int # cannot be accessed from the outside of the module + # the `F` prefix is a convention to avoid clashes since + # the accessors are named `host` + + proc `host=`*(s: var TSocket, value: int) {.inline.} = + ## setter of hostAddr + s.FHost = value + + proc host*(s: TSocket): int {.inline.} = + ## getter of hostAddr + s.FHost + + var + s: TSocket + s.host = 34 # same as `host=`(s, 34) + + +Command invocation syntax +------------------------- + +Routines can be invoked without the ``()`` if the call is syntatically +a statement. This command invocation syntax also works for +expressions, but then only a single argument may follow. This restriction +means ``echo f 1, f 2`` is parsed as ``echo(f(1), f(2))`` and not as +``echo(f(1, f(2)))``. The method call syntax may be used to provide one +more argument in this case: + +.. code-block:: nim + proc optarg(x:int, y:int = 0):int = x + y + proc singlearg(x:int):int = 20*x + + echo optarg 1, " ", singlearg 2 # prints "1 40" + + let fail = optarg 1, optarg 8 # Wrong. Too many arguments for a command call + let x = optarg(1, optarg 8) # traditional procedure call with 2 arguments + let y = 1.optarg optarg 8 # same thing as above, w/o the parenthesis + assert x == y + +The command invocation syntax also can't have complex expressions as arguments. +For example: (`anonymous procs`_), ``if``, ``case`` or ``try``. The (`do +notation`_) is limited, but usable for a single proc (see the example in the +corresponding section). Function calls with no arguments still needs () to +distinguish between a call and the function itself as a first class value. + + +Closures +-------- + +Procedures can appear at the top level in a module as well as inside other +scopes, in which case they are called nested procs. A nested proc can access +local variables from its enclosing scope and if it does so it becomes a +closure. Any captured variables are stored in a hidden additional argument +to the closure (its environment) and they are accessed by reference by both +the closure and its enclosing scope (i.e. any modifications made to them are +visible in both places). The closure environment may be allocated on the heap +or on the stack if the compiler determines that this would be safe. + + +Anonymous Procs +--------------- + +Procs can also be treated as expressions, in which case it's allowed to omit +the proc's name. + +.. code-block:: nim + var cities = @["Frankfurt", "Tokyo", "New York"] + + cities.sort(proc (x,y: string): int = + cmp(x.len, y.len)) + + +Procs as expressions can appear both as nested procs and inside top level +executable code. + + +Do notation +----------- + +As a special more convenient notation, proc expressions involved in procedure +calls can use the ``do`` keyword: + +.. code-block:: nim + sort(cities) do (x,y: string) -> int: + cmp(x.len, y.len) + # Less parenthesis using the method plus command syntax: + cities = cities.map do (x:string) -> string: + "City of " & x + +``do`` is written after the parentheses enclosing the regular proc params. +The proc expression represented by the do block is appended to them. + +More than one ``do`` block can appear in a single call: + +.. code-block:: nim + proc performWithUndo(task: proc(), undo: proc()) = ... + + performWithUndo do: + # multiple-line block of code + # to perform the task + do: + # code to undo it + +For compatibility with ``stmt`` templates and macros, the ``do`` keyword can be +omitted if the supplied proc doesn't have any parameters and return value. +The compatibility works in the other direction too as the ``do`` syntax can be +used with macros and templates expecting ``stmt`` blocks. + + +Nonoverloadable builtins +------------------------ + +The following builtin procs cannot be overloaded for reasons of implementation +simplicity (they require specialized semantic checking):: + + defined, definedInScope, compiles, low, high, sizeOf, + is, of, echo, shallowCopy, getAst, spawn + +Thus they act more like keywords than like ordinary identifiers; unlike a +keyword however, a redefinition may `shadow`:idx: the definition in +the ``system`` module. + + +Var parameters +-------------- +The type of a parameter may be prefixed with the ``var`` keyword: + +.. code-block:: nim + proc divmod(a, b: int; res, remainder: var int) = + res = a div b + remainder = a mod b + + var + x, y: int + + divmod(8, 5, x, y) # modifies x and y + assert x == 1 + assert y == 3 + +In the example, ``res`` and ``remainder`` are `var parameters`. +Var parameters can be modified by the procedure and the changes are +visible to the caller. The argument passed to a var parameter has to be +an l-value. Var parameters are implemented as hidden pointers. The +above example is equivalent to: + +.. code-block:: nim + proc divmod(a, b: int; res, remainder: ptr int) = + res[] = a div b + remainder[] = a mod b + + var + x, y: int + divmod(8, 5, addr(x), addr(y)) + assert x == 1 + assert y == 3 + +In the examples, var parameters or pointers are used to provide two +return values. This can be done in a cleaner way by returning a tuple: + +.. code-block:: nim + proc divmod(a, b: int): tuple[res, remainder: int] = + (a div b, a mod b) + + var t = divmod(8, 5) + + assert t.res == 1 + assert t.remainder == 3 + +One can use `tuple unpacking`:idx: to access the tuple's fields: + +.. code-block:: nim + var (x, y) = divmod(8, 5) # tuple unpacking + assert x == 1 + assert y == 3 + + +Var return type +--------------- + +A proc, converter or iterator may return a ``var`` type which means that the +returned value is an l-value and can be modified by the caller: + +.. code-block:: nim + var g = 0 + + proc WriteAccessToG(): var int = + result = g + + WriteAccessToG() = 6 + assert g == 6 + +It is a compile time error if the implicitly introduced pointer could be +used to access a location beyond its lifetime: + +.. code-block:: nim + proc WriteAccessToG(): var int = + var g = 0 + result = g # Error! + +For iterators, a component of a tuple return type can have a ``var`` type too: + +.. code-block:: nim + iterator mpairs(a: var seq[string]): tuple[key: int, val: var string] = + for i in 0..a.high: + yield (i, a[i]) + +In the standard library every name of a routine that returns a ``var`` type +starts with the prefix ``m`` per convention. + + +Overloading of the subscript operator +------------------------------------- + +The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded. + + +Multi-methods +============= + +Procedures always use static dispatch. Multi-methods use dynamic +dispatch. + +.. code-block:: nim + type + TExpr = object ## abstract base class for an expression + TLiteral = object of TExpr + x: int + TPlusExpr = object of TExpr + a, b: ref TExpr + + method eval(e: ref TExpr): int = + # override this base method + quit "to override!" + + method eval(e: ref TLiteral): int = return e.x + + method eval(e: ref TPlusExpr): int = + # watch out: relies on dynamic binding + result = eval(e.a) + eval(e.b) + + proc newLit(x: int): ref TLiteral = + new(result) + result.x = x + + proc newPlus(a, b: ref TExpr): ref TPlusExpr = + new(result) + result.a = a + result.b = b + + echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4))) + +In the example the constructors ``newLit`` and ``newPlus`` are procs +because they should use static binding, but ``eval`` is a method because it +requires dynamic binding. + +In a multi-method all parameters that have an object type are used for the +dispatching: + +.. code-block:: nim + type + TThing = object + TUnit = object of TThing + x: int + + method collide(a, b: TThing) {.inline.} = + quit "to override!" + + method collide(a: TThing, b: TUnit) {.inline.} = + echo "1" + + method collide(a: TUnit, b: TThing) {.inline.} = + echo "2" + + var + a, b: TUnit + collide(a, b) # output: 2 + + +Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over +collide 1 because the resolution works from left to right. +In the example ``TUnit, TThing`` is preferred over ``TThing, TUnit``. + +**Performance note**: Nim does not produce a virtual method table, but +generates dispatch trees. This avoids the expensive indirect branch for method +calls and enables inlining. However, other optimizations like compile time +evaluation or dead code elimination do not work with methods. + + +Iterators and the for statement +=============================== + +The `for`:idx: statement is an abstract mechanism to iterate over the elements +of a container. It relies on an `iterator`:idx: to do so. Like ``while`` +statements, ``for`` statements open an `implicit block`:idx:, so that they +can be left with a ``break`` statement. + +The ``for`` loop declares iteration variables - their scope reaches until the +end of the loop body. The iteration variables' types are inferred by the +return type of the iterator. + +An iterator is similar to a procedure, except that it can be called in the +context of a ``for`` loop. Iterators provide a way to specify the iteration over +an abstract type. A key role in the execution of a ``for`` loop plays the +``yield`` statement in the called iterator. Whenever a ``yield`` statement is +reached the data is bound to the ``for`` loop variables and control continues +in the body of the ``for`` loop. The iterator's local variables and execution +state are automatically saved between calls. Example: + +.. code-block:: nim + # this definition exists in the system module + iterator items*(a: string): char {.inline.} = + var i = 0 + while i < len(a): + yield a[i] + inc(i) + + for ch in items("hello world"): # `ch` is an iteration variable + echo(ch) + +The compiler generates code as if the programmer would have written this: + +.. code-block:: nim + var i = 0 + while i < len(a): + var ch = a[i] + echo(ch) + inc(i) + +If the iterator yields a tuple, there can be as many iteration variables +as there are components in the tuple. The i'th iteration variable's type is +the type of the i'th component. In other words, implicit tuple unpacking in a +for loop context is supported. + +Implict items/pairs invocations +------------------------------- + +If the for loop expression ``e`` does not denote an iterator and the for loop +has exactly 1 variable, the for loop expression is rewritten to ``items(e)``; +ie. an ``items`` iterator is implicitly invoked: + +.. code-block:: nim + for x in [1,2,3]: echo x + +If the for loop has exactly 2 variables, a ``pairs`` iterator is implicitly +invoked. + +Symbol lookup of the identifiers ``items``/``pairs`` is performed after +the rewriting step, so that all overloadings of ``items``/``pairs`` are taken +into account. + + +First class iterators +--------------------- + +There are 2 kinds of iterators in Nim: *inline* and *closure* iterators. +An `inline iterator`:idx: is an iterator that's always inlined by the compiler +leading to zero overhead for the abstraction, but may result in a heavy +increase in code size. Inline iterators are second class citizens; +They can be passed as parameters only to other inlining code facilities like +templates, macros and other inline iterators. + +In contrast to that, a `closure iterator`:idx: can be passed around more freely: + +.. code-block:: nim + iterator count0(): int {.closure.} = + yield 0 + + iterator count2(): int {.closure.} = + var x = 1 + yield x + inc x + yield x + + proc invoke(iter: iterator(): int {.closure.}) = + for x in iter(): echo x + + invoke(count0) + invoke(count2) + +Closure iterators have other restrictions than inline iterators: + +1. ``yield`` in a closure iterator can not occur in a ``try`` statement. +2. For now, a closure iterator cannot be evaluated at compile time. +3. ``return`` is allowed in a closure iterator (but rarely useful). +4. Both inline and closure iterators cannot be recursive. + +Iterators that are neither marked ``{.closure.}`` nor ``{.inline.}`` explicitly +default to being inline, but that this may change in future versions of the +implementation. + +The ``iterator`` type is always of the calling convention ``closure`` +implicitly; the following example shows how to use iterators to implement +a `collaborative tasking`:idx: system: + +.. code-block:: nim + # simple tasking: + type + TTask = iterator (ticker: int) + + iterator a1(ticker: int) {.closure.} = + echo "a1: A" + yield + echo "a1: B" + yield + echo "a1: C" + yield + echo "a1: D" + + iterator a2(ticker: int) {.closure.} = + echo "a2: A" + yield + echo "a2: B" + yield + echo "a2: C" + + proc runTasks(t: varargs[TTask]) = + var ticker = 0 + while true: + let x = t[ticker mod t.len] + if finished(x): break + x(ticker) + inc ticker + + runTasks(a1, a2) + +The builtin ``system.finished`` can be used to determine if an iterator has +finished its operation; no exception is raised on an attempt to invoke an +iterator that has already finished its work. + +Closure iterators are *resumable functions* and so one has to provide the +arguments to every call. To get around this limitation one can capture +parameters of an outer factory proc: + +.. code-block:: nim + proc mycount(a, b: int): iterator (): int = + result = iterator (): int = + var x = a + while x <= b: + yield x + inc x + + let foo = mycount(1, 4) + + for f in foo(): + echo f + +Implicit return type +-------------------- + +Since inline interators must always produce values that will be consumed in +a for loop, the compiler will implicity use the ``auto`` return type if no +type is given by the user. In contrast, since closure iterators can be used +as a collaborative tasking system, ``void`` is a valid return type for them. |