Procedures
==========

What most programming languages call `methods`:idx: or `functions`:idx: are
called `procedures`:idx: in Nim (which is the correct terminology). A procedure
declaration consists of an identifier, zero or more formal parameters, a return
value type and a block of code. Formal parameters are declared as a list of
identifiers separated by either comma or semicolon. A parameter is given a type
by ``: typename``. The type applies to all parameters immediately before it,
until either the beginning of the parameter list, a semicolon separator or an
already typed parameter, is reached. The semicolon can be used to make
separation of types and subsequent identifiers more distinct.

.. code-block:: nim
  # Using only commas
  proc foo(a, b: int, c, d: bool): int

  # Using semicolon for visual distinction
  proc foo(a, b: int; c, d: bool): int

  # Will fail: a is untyped since ';' stops type propagation.
  proc foo(a; b: int; c, d: bool): int

A parameter may be declared with a default value which is used if the caller
does not provide a value for the argument.

.. code-block:: nim
  # b is optional with 47 as its default value
  proc foo(a: int, b: int = 47): int
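
As a minimal sketch of how such a default is picked up at the call site (the
proc name ``scale`` and its body are hypothetical and only serve as an
illustration):

.. code-block:: nim

  # `scale` is a made-up example proc; b falls back to 47 when omitted
  proc scale(a: int, b: int = 47): int = a * b

  assert scale(2) == 94          # b takes its default value
  assert scale(2, 3) == 6        # b given positionally
  assert scale(2, b = 10) == 20  # b given by name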

Parameters can be declared mutable by using the type modifier ``var``, which
allows the proc to modify those arguments.

.. code-block:: nim
  # "returning" a value to the caller through the 2nd argument
  # Notice that the function uses no actual return value at all (i.e. void)
  proc foo(inp: int, outp: var int) =
    outp = inp + 47

If the proc declaration has no body, it is a `forward`:idx: declaration. If the
proc returns a value, the procedure body can access an implicitly declared
variable named `result`:idx: that represents the return value. Procs can be
overloaded. The overloading resolution algorithm determines which proc is the
best match for the arguments. Example:

.. code-block:: nim

  proc toLower(c: char): char = # toLower for characters
    if c in {'A'..'Z'}:
      result = chr(ord(c) + (ord('a') - ord('A')))
    else:
      result = c

  proc toLower(s: string): string = # toLower for strings
    result = newString(len(s))
    for i in 0..len(s) - 1:
      result[i] = toLower(s[i]) # calls toLower for characters; no recursion!
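
A forward declaration is mainly useful when two procs refer to each other.
The following is a small sketch under that assumption; the names ``isEven``
and ``isOdd`` are hypothetical and the procs are only meant for non-negative
arguments:

.. code-block:: nim

  # Forward declaration: no body, so isOdd can be used before it is defined.
  proc isOdd(n: int): bool

  proc isEven(n: int): bool =
    if n == 0: true
    else: isOdd(n - 1)

  # The actual definition of the forward-declared proc.
  proc isOdd(n: int): bool =
    if n == 0: false
    else: isEven(n - 1)

  assert isEven(10)
  assert not isOdd(10)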

Calling a procedure can be done in many different ways:

.. code-block:: nim
  proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ...

  # call with positional arguments # parameter bindings:
  callme(0, 1, "abc", '\t', true)  # (x=0, y=1, s="abc", c='\t', b=true)
  # call with named and positional arguments:
  callme(y=1, x=0, "abd", '\t')    # (x=0, y=1, s="abd", c='\t', b=false)
  # call with named arguments (order is not relevant):
  callme(c='\t', y=1, x=0)         # (x=0, y=1, s="", c='\t', b=false)
  # call as a command statement: no () needed:
  callme 0, 1, "abc", '\t'

A procedure may call itself recursively.


`Operators`:idx: are procedures with a special operator symbol as identifier:

.. code-block:: nim
  proc `$` (x: int): string =
    # converts an integer to a string; this is a prefix operator.
    result = intToStr(x)

Operators with one parameter are prefix operators, operators with two
parameters are infix operators. (However, the parser decides whether an operator
is prefix or infix from its position within an expression.) There is no way to declare
postfix operators: all postfix operators are built-in and handled by the
grammar explicitly.
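
The following sketch illustrates the prefix/infix distinction with a made-up
operator symbol ``+++`` (an assumption for illustration only, not part of the
standard library). The same symbol is overloaded once with one parameter and
once with two:

.. code-block:: nim

  proc `+++`(x: int): int = x + 1         # one parameter: used as a prefix operator
  proc `+++`(x, y: int): int = x + y + 1  # two parameters: used as an infix operator

  let p = +++ 5    # prefix position
  let q = 2 +++ 3  # infix position
  assert p == 6
  assert q == 6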

Any operator can be called like an ordinary proc with the '`opr`'
notation. (Thus an operator can have more than two parameters):

.. code-block:: nim
  proc `*+` (a, b, c: int): int =
    # Multiply and add
    result = a * b + c

  assert `*+`(3, 4, 6) == `+`(`*`(3, 4), 6)


Export marker
-------------

If a declared symbol is marked with an `asterisk`:idx: it is exported from the
current module:

.. code-block:: nim

  proc exportedEcho*(s: string) = echo s
  proc `*`*(a: string; b: int): string =
    result = newStringOfCap(a.len * b)
    for i in 1..b: result.add a

  var exportedVar*: int
  const exportedConst* = 78
  type
    ExportedType* = object
      exportedField*: int


Method call syntax
------------------

For object oriented programming, the syntax ``obj.method(args)`` can be used 
instead of ``method(obj, args)``. The parentheses can be omitted if there are no
remaining arguments: ``obj.len`` (instead of ``len(obj)``).

This method call syntax is not restricted to objects; it can be used
to supply any type of first argument for procedures:

.. code-block:: nim
  
  echo("abc".len) # is the same as echo(len("abc"))
  echo("abc".toUpper())
  echo({'a', 'b', 'c'}.card)
  stdout.writeln("Hallo") # the same as writeln(stdout, "Hallo")

Another way to look at the method call syntax is that it provides the missing
postfix notation.
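
The same works for user-defined procs. In this sketch the procs ``twice`` and
``between`` are hypothetical helpers, chosen only to show that any first
argument can be moved in front of the dot:

.. code-block:: nim

  proc twice(x: int): int = 2 * x
  proc between(x, lo, hi: int): bool = lo <= x and x <= hi

  let n = 21
  assert n.twice == 42              # same as twice(n)
  assert n.twice() == twice(n)      # the empty parentheses are optional
  assert n.between(1, 100)          # same as between(n, 1, 100)
  assert not n.twice.between(1, 40) # calls can be chained left to right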

See also: `Limitations of the method call syntax`_.


Properties
----------
Nim has no need for *get-properties*: Ordinary get-procedures that are called
with the *method call syntax* achieve the same. But setting a value is 
different; for this a special setter syntax is needed:

.. code-block:: nim
  
  type
    Socket* = ref object of RootObj
      FHost: int # cannot be accessed from the outside of the module
                 # the `F` prefix is a convention to avoid clashes since
                 # the accessors are named `host`

  proc `host=`*(s: var Socket, value: int) {.inline.} =
    ## setter of hostAddr
    s.FHost = value
  
  proc host*(s: Socket): int {.inline.} =
    ## getter of hostAddr
    s.FHost

  var s: Socket
  new s
  s.host = 34  # same as `host=`(s, 34)


Command invocation syntax
-------------------------

Routines can be invoked without the ``()`` if the call is syntactically
a statement. This command invocation syntax also works for
expressions, but then only a single argument may follow. This restriction
means ``echo f 1, f 2`` is parsed as ``echo(f(1), f(2))`` and not as
``echo(f(1, f(2)))``. The method call syntax may be used to provide one
more argument in this case:

.. code-block:: nim
  proc optarg(x: int, y: int = 0): int = x + y
  proc singlearg(x: int): int = 20*x
  
  echo optarg 1, " ", singlearg 2  # prints "1 40"
  
  let fail = optarg 1, optarg 8   # Wrong. Too many arguments for a command call
  let x = optarg(1, optarg 8)  # traditional procedure call with 2 arguments
  let y = 1.optarg optarg 8    # same thing as above, w/o the parentheses
  assert x == y

The command invocation syntax also can't have complex expressions as arguments,
for example `anonymous procs`_, ``if``, ``case`` or ``try``. The `do
notation`_ is limited, but usable for a single proc (see the example in the
corresponding section). Function calls with no arguments still need ``()`` to
distinguish between a call and the function itself as a first class value.


Closures
--------

Procedures can appear at the top level in a module as well as inside other
scopes, in which case they are called nested procs. A nested proc can access
local variables from its enclosing scope and if it does so it becomes a
closure. Any captured variables are stored in a hidden additional argument
to the closure (its environment) and they are accessed by reference by both
the closure and its enclosing scope (i.e. any modifications made to them are
visible in both places). The closure environment may be allocated on the heap
or on the stack if the compiler determines that this would be safe.
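
A minimal sketch of both aspects, using hypothetical names (``demo``,
``bump``, ``makeCounter``): a nested proc that modifies a captured local, and
a closure that outlives the scope that created it:

.. code-block:: nim

  proc demo(): int =
    var total = 0
    proc bump(by: int) =  # nested proc; captures `total`, so it is a closure
      total += by
    bump(2)
    bump(3)
    result = total        # the modifications are visible in the enclosing scope

  assert demo() == 5

  proc makeCounter(): proc (): int =
    var count = 0                # captured; lives on in the closure environment
    result = proc (): int =
      inc count
      count

  let next = makeCounter()
  assert next() == 1
  assert next() == 2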


Anonymous Procs
---------------

Procs can also be treated as expressions, in which case it's allowed to omit
the proc's name.

.. code-block:: nim
  import algorithm  # provides sort

  var cities = @["Frankfurt", "Tokyo", "New York"]

  cities.sort(proc (x,y: string): int =
      cmp(x.len, y.len))


Procs as expressions can appear both as nested procs and inside top level 
executable code.


Do notation
-----------

**Note:** The future of the ``do`` notation is uncertain.

As a special more convenient notation, proc expressions involved in procedure
calls can use the ``do`` keyword:

.. code-block:: nim
  sort(cities) do (x,y: string) -> int:
    cmp(x.len, y.len)
  # Fewer parentheses using the method call plus command syntax:
  cities = cities.map do (x:string) -> string:  
    "City of " & x

``do`` is written after the parentheses enclosing the regular proc params. 
The proc expression represented by the do block is appended to them.

More than one ``do`` block can appear in a single call:

.. code-block:: nim
  proc performWithUndo(task: proc(), undo: proc()) = ...

  performWithUndo do:
    # multiple-line block of code
    # to perform the task
  do:
    # code to undo it


Nonoverloadable builtins
------------------------

The following builtin procs cannot be overloaded for reasons of implementation
simplicity (they require specialized semantic checking)::

  declared, defined, definedInScope, compiles, low, high, sizeOf, 
  is, of, shallowCopy, getAst, astToStr, spawn, procCall

Thus they act more like keywords than like ordinary identifiers; unlike a 
keyword however, a redefinition may `shadow`:idx: the definition in 
the ``system`` module. From this list the following should not be written in dot
notation ``x.f`` since ``x`` cannot be type checked before it gets passed
to ``f``::

  declared, defined, definedInScope, compiles, getAst, astToStr


Var parameters
--------------
The type of a parameter may be prefixed with the ``var`` keyword:

.. code-block:: nim
  proc divmod(a, b: int; res, remainder: var int) =
    res = a div b
    remainder = a mod b

  var
    x, y: int

  divmod(8, 5, x, y) # modifies x and y
  assert x == 1
  assert y == 3

In the example, ``res`` and ``remainder`` are `var parameters`.
Var parameters can be modified by the procedure and the changes are
visible to the caller. The argument passed to a var parameter has to be
an l-value. Var parameters are implemented as hidden pointers. The
above example is equivalent to:

.. code-block:: nim
  proc divmod(a, b: int; res, remainder: ptr int) =
    res[] = a div b
    remainder[] = a mod b

  var
    x, y: int
  divmod(8, 5, addr(x), addr(y))
  assert x == 1
  assert y == 3

In the examples, var parameters or pointers are used to provide two
return values. This can be done in a cleaner way by returning a tuple:

.. code-block:: nim
  proc divmod(a, b: int): tuple[res, remainder: int] =
    (a div b, a mod b)

  var t = divmod(8, 5)

  assert t.res == 1
  assert t.remainder == 3

One can use `tuple unpacking`:idx: to access the tuple's fields:

.. code-block:: nim
  var (x, y) = divmod(8, 5) # tuple unpacking
  assert x == 1
  assert y == 3


**Note**: ``var`` parameters are never necessary for efficient parameter
passing. Since non-var parameters cannot be modified, the compiler is always
free to pass arguments by reference if it considers it can speed up execution.


Var return type
---------------

A proc, converter or iterator may return a ``var`` type which means that the
returned value is an l-value and can be modified by the caller:

.. code-block:: nim
  var g = 0

  proc WriteAccessToG(): var int =
    result = g
  
  WriteAccessToG() = 6
  assert g == 6

It is a compile time error if the implicitly introduced pointer could be 
used to access a location beyond its lifetime:

.. code-block:: nim
  proc WriteAccessToG(): var int =
    var g = 0
    result = g # Error!

For iterators, a component of a tuple return type can have a ``var`` type too: 

.. code-block:: nim
  iterator mpairs(a: var seq[string]): tuple[key: int, val: var string] =
    for i in 0..a.high:
      yield (i, a[i])

In the standard library every name of a routine that returns a ``var`` type
starts with the prefix ``m`` by convention.
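
As a hedged sketch of how such routines are typically used: the ``Stack``
type and its ``top`` accessor below are hypothetical, while ``mpairs`` is
assumed to be the iterator for sequences provided by the ``system`` module:

.. code-block:: nim

  type Stack = object
    data: seq[int]

  # hypothetical accessor: returns the last element as an l-value
  proc top(s: var Stack): var int =
    result = s.data[s.data.high]

  var st = Stack(data: @[1, 2, 3])
  top(st) = 10   # assign through the returned l-value
  top(st) += 5   # or modify it in place
  assert st.data == @[1, 2, 15]

  var names = @["a", "b"]
  for i, val in mpairs(names):
    val.add($i)  # `val` is a var string, so the seq itself is modified
  assert names == @["a0", "b1"]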


Overloading of the subscript operator
-------------------------------------

The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded.
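
The same operator symbols can also be defined for user-defined types. The
following sketch assumes a hypothetical ``Grid`` type that stores its cells
row by row; ``[]`` reads a cell and ``[]=`` writes one:

.. code-block:: nim

  type Grid = object
    width: int
    cells: seq[int]

  proc `[]`(g: Grid; x, y: int): int =
    g.cells[y * g.width + x]

  proc `[]=`(g: var Grid; x, y: int; value: int) =
    g.cells[y * g.width + x] = value

  var g = Grid(width: 2, cells: @[0, 0, 0, 0])
  g[1, 0] = 7           # same as `[]=`(g, 1, 0, 7)
  assert g[1, 0] == 7   # same as `[]`(g, 1, 0)
  assert g.cells == @[0, 7, 0, 0]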


Multi-methods
=============

Procedures always use static dispatch. Multi-methods use dynamic
dispatch.

.. code-block:: nim
  type
    Expression = ref object of RootObj ## abstract base class for an expression
    Literal = ref object of Expression
      x: int
    PlusExpr = ref object of Expression
      a, b: Expression
  
  method eval(e: Expression): int =
    # override this base method
    quit "to override!"
  
  method eval(e: Literal): int = return e.x
  
  method eval(e: PlusExpr): int =
    # watch out: relies on dynamic binding
    result = eval(e.a) + eval(e.b)
  
  proc newLit(x: int): Literal =
    new(result)
    result.x = x
  
  proc newPlus(a, b: Expression): PlusExpr =
    new(result)
    result.a = a
    result.b = b

  echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4)))
  
In the example the constructors ``newLit`` and ``newPlus`` are procs
because they should use static binding, but ``eval`` is a method because it
requires dynamic binding.
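
The difference between the two kinds of binding can be made visible by
overloading a proc and a method for the same types. This is only a sketch:
the names ``Animal``, ``Dog``, ``describeStatic`` and ``describeDynamic`` are
hypothetical, and the ``{.base.}`` pragma is an assumption about more recent
compiler versions, which expect it on the base method:

.. code-block:: nim

  type
    Animal = ref object of RootObj
    Dog = ref object of Animal

  proc describeStatic(a: Animal): string = "animal"
  proc describeStatic(d: Dog): string = "dog"

  # {.base.} is assumed to be available; older compilers may not need it
  method describeDynamic(a: Animal): string {.base.} = "animal"
  method describeDynamic(d: Dog): string = "dog"

  let pet: Animal = Dog()
  assert describeStatic(pet) == "animal"  # selected from the static type Animal
  assert describeDynamic(pet) == "dog"    # selected from the runtime type Dog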

In a multi-method all parameters that have an object type are used for the
dispatching:

.. code-block:: nim
  type
    Thing = ref object of RootObj
    Unit = ref object of Thing
      x: int
      
  method collide(a, b: Thing) {.inline.} =
    quit "to override!"
    
  method collide(a: Thing, b: Unit) {.inline.} =
    echo "1"
  
  method collide(a: Unit, b: Thing) {.inline.} =
    echo "2"
  
  var a, b: Unit
  new a
  new b
  collide(a, b) # output: 2


Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over 
collide 1 because the resolution works from left to right. 
In the example ``Unit, Thing`` is preferred over ``Thing, Unit``.

**Performance note**: Nim does not produce a virtual method table, but
generates dispatch trees. This avoids the expensive indirect branch for method
calls and enables inlining. However, other optimizations like compile time
evaluation or dead code elimination do not work with methods.


Iterators and the for statement
===============================

The `for`:idx: statement is an abstract mechanism to iterate over the elements
of a container. It relies on an `iterator`:idx: to do so. Like ``while``
statements, ``for`` statements open an `implicit block`:idx:, so that they
can be left with a ``break`` statement. 

The ``for`` loop declares iteration variables - their scope reaches until the
end of the loop body. The iteration variables' types are inferred by the
return type of the iterator.

An iterator is similar to a procedure, except that it can be called in the
context of a ``for`` loop. Iterators provide a way to specify the iteration over
an abstract type. The ``yield`` statement in the called iterator plays a key
role in the execution of a ``for`` loop. Whenever a ``yield`` statement is
reached the data is bound to the ``for`` loop variables and control continues
in the body of the ``for`` loop. The iterator's local variables and execution
state are automatically saved between calls. Example:

.. code-block:: nim
  # this definition exists in the system module
  iterator items*(a: string): char {.inline.} =
    var i = 0
    while i < len(a):
      yield a[i]
      inc(i)

  for ch in items("hello world"): # `ch` is an iteration variable
    echo(ch)

The compiler generates code as if the programmer had written this:

.. code-block:: nim
  var i = 0
  while i < len(a):
    var ch = a[i]
    echo(ch)
    inc(i)

If the iterator yields a tuple, there can be as many iteration variables
as there are components in the tuple. The i'th iteration variable's type is
the type of the i'th component. In other words, implicit tuple unpacking in a 
for loop context is supported.
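
A small sketch of this unpacking, using a hypothetical iterator ``indexed``
that yields a two-component tuple:

.. code-block:: nim

  iterator indexed(s: string): tuple[idx: int, ch: char] =
    var i = 0
    while i < s.len:
      yield (i, s[i])
      inc i

  var seen: seq[string] = @[]
  for idx, ch in indexed("ab"):  # idx is an int, ch is a char
    seen.add($idx & ":" & $ch)
  assert seen == @["0:a", "1:b"]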

Implicit items/pairs invocations
--------------------------------

If the for loop expression ``e`` does not denote an iterator and the for loop
has exactly 1 variable, the for loop expression is rewritten to ``items(e)``;
i.e. an ``items`` iterator is implicitly invoked:

.. code-block:: nim
  for x in [1,2,3]: echo x
  
If the for loop has exactly 2 variables, a ``pairs`` iterator is implicitly
invoked.

Symbol lookup of the identifiers ``items``/``pairs`` is performed after 
the rewriting step, so that all overloads of ``items``/``pairs`` are taken
into account.
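
Because user-defined overloads take part in this lookup, a custom type can be
made usable in a ``for`` loop by providing its own ``items`` and ``pairs``.
The type ``Countdown`` in this sketch is hypothetical:

.. code-block:: nim

  type Countdown = object
    start: int

  iterator items(c: Countdown): int =
    var i = c.start
    while i > 0:
      yield i
      dec i

  iterator pairs(c: Countdown): tuple[step, value: int] =
    var step = 0
    for v in items(c):
      yield (step, v)
      inc step

  var total = 0
  for v in Countdown(start: 3):        # one variable: rewritten to items(...)
    total += v
  assert total == 6

  var steps: seq[int] = @[]
  for step, v in Countdown(start: 3):  # two variables: rewritten to pairs(...)
    steps.add(step * 10 + v)
  assert steps == @[3, 12, 21]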


First class iterators
---------------------

There are 2 kinds of iterators in Nim: *inline* and *closure* iterators.
An `inline iterator`:idx: is an iterator that's always inlined by the compiler 
leading to zero overhead for the abstraction, but may result in a heavy
increase in code size. Inline iterators are second-class citizens;
they can be passed as parameters only to other inlining code facilities like
templates, macros and other inline iterators.

In contrast to that, a `closure iterator`:idx: can be passed around more freely:

.. code-block:: nim
  iterator count0(): int {.closure.} =
    yield 0
   
  iterator count2(): int {.closure.} =
    var x = 1
    yield x
    inc x
    yield x

  proc invoke(iter: iterator(): int {.closure.}) =
    for x in iter(): echo x

  invoke(count0)
  invoke(count2)

Closure iterators have different restrictions than inline iterators:

1. ``yield`` in a closure iterator cannot occur in a ``try`` statement.
2. For now, a closure iterator cannot be evaluated at compile time.
3. ``return`` is allowed in a closure iterator (but rarely useful) and ends 
   iteration.
4. Neither inline nor closure iterators can be recursive.

Iterators that are neither marked ``{.closure.}`` nor ``{.inline.}`` explicitly
default to being inline, but this may change in future versions of the
implementation.

The ``iterator`` type is always of the calling convention ``closure`` 
implicitly; the following example shows how to use iterators to implement
a `collaborative tasking`:idx: system:

.. code-block:: nim
  # simple tasking:
  type
    Task = iterator (ticker: int)

  iterator a1(ticker: int) {.closure.} =
    echo "a1: A"
    yield
    echo "a1: B"
    yield
    echo "a1: C"
    yield
    echo "a1: D"

  iterator a2(ticker: int) {.closure.} =
    echo "a2: A"
    yield
    echo "a2: B"
    yield
    echo "a2: C"

  proc runTasks(t: varargs[Task]) =
    var ticker = 0
    while true:
      let x = t[ticker mod t.len]
      if finished(x): break
      x(ticker)
      inc ticker

  runTasks(a1, a2)

The builtin ``system.finished`` can be used to determine if an iterator has
finished its operation; no exception is raised on an attempt to invoke an
iterator that has already finished its work.

Note that ``system.finished`` is error prone to use because it only returns
``true`` one iteration after the iterator has finished:

.. code-block:: nim
  iterator mycount(a, b: int): int {.closure.} =
    var x = a
    while x <= b:
      yield x
      inc x

  var c = mycount # instantiate the iterator
  while not finished(c):
    echo c(1, 3)

  # Produces:
  # 1
  # 2
  # 3
  # 0

Instead this code has to be used:

.. code-block:: nim
  var c = mycount # instantiate the iterator
  while true:
    let value = c(1, 3)
    if finished(c): break # and discard 'value'!
    echo value

It helps to think that the iterator actually returns a
pair ``(value, done)`` and ``finished`` is used to access the hidden ``done``
field.


Closure iterators are *resumable functions* and so one has to provide the
arguments to every call. To get around this limitation one can capture
parameters of an outer factory proc:

.. code-block:: nim
  proc mycount(a, b: int): iterator (): int =
    result = iterator (): int =
      var x = a
      while x <= b:
        yield x
        inc x

  let foo = mycount(1, 4)

  for f in foo():
    echo f

..
  Implicit return type
  --------------------

  Since inline iterators must always produce values that will be consumed in
  a for loop, the compiler will implicitly use the ``auto`` return type if no
  type is given by the user. In contrast, since closure iterators can be used
  as a collaborative tasking system, ``void`` is a valid return type for them.


Converters
==========

A converter is like an ordinary proc except that it enhances 
the "implicitly convertible" type relation (see `Convertible relation`_):

.. code-block:: nim
  # bad style ahead: Nim is not C.
  converter toBool(x: int): bool = x != 0

  if 4:
    echo "compiles"


A converter can also be explicitly invoked for improved readability. Note that
implicit converter chaining is not supported: If there is a converter from
type A to type B and from type B to type C the implicit conversion from A to C
is not provided.
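
The following sketch contrasts implicit and explicit invocation. The type
``Celsius``, the converter ``toCelsius`` and the proc ``describe`` are
hypothetical names used only for illustration:

.. code-block:: nim

  type Celsius = object
    degrees: float

  converter toCelsius(x: float): Celsius = Celsius(degrees: x)

  proc describe(t: Celsius): string =
    if t.degrees < 0.0: "freezing" else: "not freezing"

  assert describe(-3.5) == "freezing"                 # converter applied implicitly
  assert describe(toCelsius(20.0)) == "not freezing"  # converter invoked explicitly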