diff --git a/doc/tut2.txt b/doc/tut2.txt
new file mode 100644
index 000000000..dc14aabc0
--- /dev/null
+++ b/doc/tut2.txt
@@ -0,0 +1,718 @@
+=============================
+The Nimrod Tutorial (Part II)
+=============================
+
+:Author: Andreas Rumpf
+:Version: |nimrodversion|
+
+.. contents::
+
+
+Introduction
+============
+
+  "With great power comes great responsibility." -- Spider-man
+
+This document is a tutorial for the advanced constructs of the *Nimrod* 
+programming language.
+
+
+Pragmas
+=======
+Pragmas are Nimrod's method to give the compiler additional information or
+commands without introducing a massive number of new keywords. Pragmas are
+processed during semantic checking. Pragmas are enclosed in the
+special ``{.`` and ``.}`` curly dot brackets. This tutorial does not cover
+pragmas. See the `manual <manual.html>`_ or `user guide <nimrodc.html>`_ for 
+a description of the available pragmas.
+
+
+Object Oriented Programming
+===========================
+
+While Nimrod's support for object oriented programming (OOP) is minimalistic, 
+powerful OOP techniques can be used. OOP is seen as *one* way to design a 
+program, not *the only* way. Often a procedural approach leads to simpler
+and more efficient code.
+
+
+Objects
+-------
+
+Like tuples, objects are a means to pack different values together in a
+structured way. However, objects provide many features that tuples do not:
+They provide inheritance and information hiding. Because objects encapsulate
+data, the ``()`` tuple constructor cannot be used to construct objects. So
+the order of the object's fields is not as important as it is for tuples. The
+programmer should provide a proc to initialize the object (this is called
+a *constructor*).
+
+Objects have access to their type at runtime. There is an
+``is`` operator that can be used to check the object's type:
+
+.. code-block:: nimrod
+
+  type
+    TPerson = object of TObject
+      name*: string  # the * means that `name` is accessible from other modules
+      age: int       # no * means that the field is hidden from other modules
+
+    TStudent = object of TPerson # TStudent inherits from TPerson
+      id: int                    # with an id field
+
+  var
+    student: TStudent
+    person: TPerson
+  assert(student is TStudent) # is true
+
+Object fields that should be visible from outside the defining module have to
+be marked by ``*``. In contrast to tuples, different object types are
+never *equivalent*. New object types can only be defined within a type
+section.
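+
+A constructor proc of the kind described above might look like the following
+minimal sketch. The ``initPerson`` name and the sample values are hypothetical
+and only serve as illustration:
+
+.. code-block:: nimrod
+
+  type
+    TPerson = object of TObject
+      name*: string
+      age: int
+
+  proc initPerson(p: var TPerson, name: string, age: int) =
+    # hypothetical constructor: fills in all fields of `p`
+    p.name = name
+    p.age = age
+
+  var
+    person: TPerson
+  initPerson(person, "Anna", 30)
+  assert(person.age == 30) # `age` is visible here, inside the defining module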
+
+Inheritance is done with the ``object of`` syntax. Multiple inheritance is
+currently not supported. If an object type has no suitable ancestor, ``TObject``
+should be used as its ancestor, but this is only a convention.
+
+Note that aggregation (*has-a* relation) is often preferable to inheritance
+(*is-a* relation) for simple code reuse. Since objects are value types in
+Nimrod, aggregation is as efficient as inheritance.
+
+
+Mutually recursive types
+------------------------
+
+Objects, tuples and references can model quite complex data structures which
+depend on each other; such types are called *mutually recursive types*. In Nimrod 
+these types need to be declared within a single type section. Anything else 
+would require arbitrary symbol lookahead which slows down compilation.
+
+Example:
+
+.. code-block:: nimrod
+  type
+    PNode = ref TNode # a traced reference to a TNode
+    TNode = object
+      le, ri: PNode   # left and right subtrees
+      sym: ref TSym   # leaves contain a reference to a TSym
+
+    TSym = object     # a symbol
+      name: string    # the symbol's name
+      line: int       # the line the symbol was declared in
+      code: PNode     # the symbol's abstract syntax tree
+
+
+Type conversions
+----------------
+Nimrod distinguishes between `type casts`:idx: and `type conversions`:idx:.
+Casts are done with the ``cast`` operator and force the compiler to 
+interpret a bit pattern to be of another type. 
+
+Type conversions are a much more polite way to convert a type into another: 
+They preserve the abstract *value*, not necessarily the *bit-pattern*. If a
+type conversion is not possible, the compiler complains or an exception is
+raised. 
+
+The syntax for type conversions is ``destination_type(expression_to_convert)``
+(like an ordinary call): 
+
+.. code-block:: nimrod
+  proc getID(x: TPerson): int = 
+    return TStudent(x).id
+  
+The ``EInvalidObjectConversion`` exception is raised if ``x`` is not a 
+``TStudent``.
+
+
+Object variants
+---------------
+An object hierarchy is often overkill in situations where simple
+`variant`:idx: types suffice.
+
+An example:
+
+.. code-block:: nimrod
+
+  # This is an example how an abstract syntax tree could be modelled in Nimrod
+  type
+    TNodeKind = enum  # the different node types
+      nkInt,          # a leaf with an integer value
+      nkFloat,        # a leaf with a float value
+      nkString,       # a leaf with a string value
+      nkAdd,          # an addition
+      nkSub,          # a subtraction
+      nkIf            # an if statement
+    PNode = ref TNode
+    TNode = object
+      case kind: TNodeKind  # the ``kind`` field is the discriminator
+      of nkInt: intVal: int
+      of nkFloat: floatVal: float
+      of nkString: strVal: string
+      of nkAdd, nkSub:
+        leftOp, rightOp: PNode
+      of nkIf:
+        condition, thenPart, elsePart: PNode
+
+  var
+    n: PNode
+  new(n)  # creates a new node
+  n.kind = nkFloat
+  n.floatVal = 0.0 # valid, because ``n.kind==nkFloat``
+
+  # the following statement raises an `EInvalidField` exception, because
+  # n.kind's value does not fit:
+  n.strVal = ""
+
+As can be seen from the example, an advantage over an object hierarchy is that
+no conversion between different object types is needed. Yet, access to invalid
+object fields raises an exception.
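+
+Code that processes such a node typically branches on the discriminator before
+touching a field. The following is a minimal sketch under that assumption; the
+``eval`` and ``newInt`` procs and the reduced ``TNodeKind`` are hypothetical
+and not part of any library:
+
+.. code-block:: nimrod
+
+  type
+    TNodeKind = enum
+      nkInt, nkAdd, nkSub
+    PNode = ref TNode
+    TNode = object
+      case kind: TNodeKind
+      of nkInt: intVal: int
+      of nkAdd, nkSub:
+        leftOp, rightOp: PNode
+
+  proc newInt(v: int): PNode =
+    # hypothetical helper that creates an integer leaf
+    new(result)
+    result.kind = nkInt
+    result.intVal = v
+
+  proc eval(n: PNode): int =
+    # only access the fields that are valid for the current ``kind``
+    case n.kind
+    of nkInt: result = n.intVal
+    of nkAdd: result = eval(n.leftOp) + eval(n.rightOp)
+    of nkSub: result = eval(n.leftOp) - eval(n.rightOp)
+
+  var
+    root: PNode
+  new(root)
+  root.kind = nkAdd
+  root.leftOp = newInt(40)
+  root.rightOp = newInt(2)
+  echo(eval(root))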
+
+
+Methods
+-------
+In ordinary object oriented languages, procedures (also called *methods*) are 
+bound to a class. This has disadvantages: 
+
+* Adding a method to a class the programmer has no control over is 
+  impossible or needs ugly workarounds.
+* Often it is unclear where a procedure belongs: Is
+  ``join`` a string method or an array method? Should the complex 
+  ``vertexCover`` algorithm really be a method of the ``graph`` class?
+
+Nimrod avoids these problems by not distinguishing between methods and 
+procedures. Methods are just ordinary procedures. However, there is a special 
+syntactic sugar for calling procedures: The syntax ``obj.method(args)`` can be
+used instead of ``method(obj, args)``. If there are no remaining arguments, the
+parentheses can be omitted: ``obj.len`` (instead of ``len(obj)``).
+
+This `method call syntax`:idx: is not restricted to objects, it can be used 
+for any type: 
+
+.. code-block:: nimrod
+  
+  echo("abc".len) # is the same as echo(len("abc"))
+  echo("abc".toUpper())
+  echo({'a', 'b', 'c'}.card)
+  stdout.writeln("Hallo") # the same as writeln(stdout, "Hallo")
+
+If it gives you warm fuzzy feelings, you can even write ``1.`+`(2)`` instead of
+``1 + 2`` and claim that Nimrod is a pure object oriented language. (That 
+would not even be lying: *pure OO* has no meaning anyway. :-)
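+
+The method call syntax also works for procedures you define yourself. A small
+sketch; the ``twice`` proc is a hypothetical example:
+
+.. code-block:: nimrod
+
+  proc twice(x: int): int =
+    return x * 2
+
+  var
+    n = 21
+  echo(n.twice)   # is the same as echo(twice(n))
+  echo(n.twice()) # the parentheses are optional here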
+
+
+Properties
+----------
+As the above example shows, Nimrod has no need for *get-properties*:  
+Ordinary get-procedures that are called with the *method call syntax* achieve 
+the same. But setting a value is different; for this a special setter syntax 
+is needed:
+
+.. code-block:: nimrod
+  
+  type
+    TSocket* = object of TObject
+      FHost: int # cannot be accessed from the outside of the module
+                 # the `F` prefix is a convention to avoid clashes since
+                 # the accessors are named `host`
+                 
+  proc `host=`*(s: var TSocket, value: int) {.inline.} = 
+    ## setter of host
+    s.FHost = value
+  
+  proc host*(s: TSocket): int {.inline.} =
+    ## getter of host
+    return s.FHost
+    
+  var 
+    s: TSocket
+  s.host = 34  # same as `host=`(s, 34)
+
+(The example also shows ``inline`` procedures.)
+
+
+The ``[]`` array access operator can be overloaded to provide 
+`array properties`:idx:\ :
+
+.. code-block:: nimrod
+  type
+    TVector* = object
+      x, y, z: float
+
+  proc `[]=`* (v: var TVector, i: int, value: float) =
+    # setter
+    case i
+    of 0: v.x = value
+    of 1: v.y = value
+    of 2: v.z = value
+    else: assert(false)
+
+  proc `[]`* (v: TVector, i: int): float = 
+    # getter
+    case i
+    of 0: result = v.x
+    of 1: result = v.y
+    of 2: result = v.z
+    else: assert(false)
+
+The example is silly, since a vector is better modelled by a tuple which 
+already provides ``v[]`` access.
+
+
+Dynamic binding
+---------------
+In Nimrod procedural types are used to implement dynamic binding. The following
+example also shows some more conventions: The ``self`` or ``this`` object 
+is named ``my`` (because it is shorter than the alternatives), each class 
+provides a constructor, etc.
+
+.. code-block:: nimrod
+  type
+    TFigure = object of TObject    # abstract base class:
+      draw: proc (my: var TFigure) # concrete classes implement this proc
+    
+  proc init(f: var TFigure) = 
+    f.draw = nil
+  
+  type
+    TCircle = object of TFigure
+      radius: int
+    
+  proc drawCircle(my: var TCircle) = echo("o " & $my.radius)
+  
+  proc init(my: var TCircle) = 
+    init(TFigure(my)) # call base constructor
+    my.radius = 5
+    my.draw = drawCircle
+
+  type
+    TRectangle = object of TFigure
+      width, height: int
+  
+  proc drawRectangle(my: var TRectangle) = echo("[]")
+  
+  proc init(my: var TRectangle) = 
+    init(TFigure(my)) # call base constructor
+    my.width = 5
+    my.height = 10
+    my.draw = drawRectangle
+
+  # now use these classes:
+  var
+    r: TRectangle
+    c: TCircle
+  init(r)
+  init(c)
+  r.draw(r)
+  c.draw(c) 
+
+The last line shows the syntactical difference between static and dynamic 
+binding: The ``r.draw(r)`` dynamic call refers to ``r`` twice. This difference
+is not necessarily bad. But if you want to eliminate the somewhat redundant
+``r``, it can be done by using *closures*: 
+
+.. code-block:: nimrod
+  type
+    TFigure = object of TObject    # abstract base class:
+      draw: proc () {.closure.}    # concrete classes implement this proc
+    
+  proc init(f: var TFigure) = 
+    f.draw = nil
+  
+  type
+    TCircle = object of TFigure
+      radius: int
+  
+  proc init(me: var TCircle) = 
+    init(TFigure(me)) # call base constructor
+    me.radius = 5
+    me.draw = lambda () = 
+      echo("o " & $me.radius)
+
+  type
+    TRectangle = object of TFigure
+      width, height: int
+  
+  proc init(me: var TRectangle) = 
+    init(TFigure(me)) # call base constructor
+    me.width = 5
+    me.height = 10
+    me.draw = lambda () =
+      echo("[]")
+
+  # now use these classes:
+  var
+    r: TRectangle
+    c: TCircle
+  init(r)
+  init(c)
+  r.draw()
+  c.draw() 
+
+The example also introduces `lambda`:idx: expressions: A ``lambda`` expression
+defines a new proc with the ``closure`` calling convention on the fly.
+
+`Version 0.7.4: Closures and lambda expressions are not implemented.`:red:
+
+
+Exceptions
+==========
+
+In Nimrod `exceptions`:idx: are objects. By convention, exception types are 
+prefixed with an 'E', not 'T'. The ``system`` module defines an exception 
+hierarchy that you should stick to. Reusing an existing exception type is
+often better than defining a new exception type: It avoids a proliferation of
+types. 
+
+Exceptions should be allocated on the heap because their lifetime is unknown.
+
+A convention is that exceptions should be raised in *exceptional* cases: 
+For example, if a file cannot be opened, this should not raise an exception 
+since this is quite common (the file may have been deleted).
+
+
+Raise statement
+---------------
+Raising an exception is done with the ``raise`` statement: 
+
+.. code-block:: nimrod
+  var
+    e: ref EOS
+  new(e)
+  e.msg = "the request to the OS failed"
+  raise e
+
+If the ``raise`` keyword is not followed by an expression, the last exception 
+is *re-raised*. 
+
+
+Try statement
+-------------
+
+The `try`:idx: statement handles exceptions: 
+
+.. code-block:: nimrod
+  # reads the first two lines of a text file that should contain numbers
+  # and tries to add them
+  var
+    f: TFile
+  if openFile(f, "numbers.txt"):
+    try:
+      var a = readLine(f)
+      var b = readLine(f)
+      echo("sum: " & $(parseInt(a) + parseInt(b)))
+    except EOverflow:
+      echo("overflow!")
+    except EInvalidValue:
+      echo("could not convert string to integer")
+    except EIO:
+      echo("IO error!")
+    except:
+      echo("Unknown exception!")
+      # reraise the unknown exception:
+      raise
+    finally:
+      closeFile(f)
+
+The statements after the ``try`` are executed in order until an exception is 
+raised. Then the appropriate ``except`` part is executed. 
+
+The empty ``except`` part is executed if there is an exception that is
+not explicitly listed. It is similar to an ``else`` part in ``if`` 
+statements.
+
+If there is a ``finally`` part, it is always executed after the
+exception handlers.
+
+The exception is *consumed* in an ``except`` part. If an exception is not
+handled, it is propagated through the call stack. This means that often
+the rest of the procedure - that is not within a ``finally`` clause -
+is not executed (if an exception occurs).
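+
+The interplay of ``raise``, ``except`` and ``finally`` can be sketched as
+follows. The ``mustBePositive`` proc and its message are hypothetical; the
+sketch assumes that ``EInvalidValue`` can be allocated and raised in the same
+way as ``EOS`` in the ``raise`` example above:
+
+.. code-block:: nimrod
+
+  proc mustBePositive(x: int) =
+    # hypothetical proc that raises for invalid input
+    if x <= 0:
+      var
+        e: ref EInvalidValue
+      new(e)
+      e.msg = "value must be positive"
+      raise e
+    echo("value accepted") # skipped if the exception was raised
+
+  try:
+    mustBePositive(-1)
+  except EInvalidValue:
+    echo("invalid value!") # the exception is consumed here
+  finally:
+    echo("cleanup")        # executed in any case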
+
+
+Generics
+========
+
+`Version 0.7.4: Complex generic types like in the example do not work.`:red:
+
+`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types 
+with `type parameters`:idx:. They are most useful for efficient type safe
+containers: 
+
+.. code-block:: nimrod
+  type
+    TBinaryTree[T] = object      # TBinaryTree is a generic type
+                                 # with generic param ``T``
+      le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil
+      data: T                    # the data stored in a node
+    PBinaryTree*[T] = ref TBinaryTree[T] # type that is exported
+
+  proc newNode*[T](data: T): PBinaryTree[T] = 
+    # constructor for a node
+    new(result)
+    result.data = data
+
+  proc add*[T](root: var PBinaryTree[T], n: PBinaryTree[T]) =
+    # insert a node into the tree
+    if root == nil:
+      root = n
+    else:
+      var it = root
+      while it != nil:
+        # compare the data items; uses the generic ``cmp`` proc that works for
+        # any type that has a ``==`` and ``<`` operator
+        var c = cmp(it.data, n.data) 
+        if c < 0:
+          if it.le == nil:
+            it.le = n
+            return
+          it = it.le
+        else:
+          if it.ri == nil:
+            it.ri = n
+            return
+          it = it.ri
+
+  proc add*[T](root: var PBinaryTree[T], data: T) = 
+    # convenience proc:
+    add(root, newNode(data))
+
+  iterator preorder*[T](root: PBinaryTree[T]): T =
+    # Preorder traversal of a binary tree.
+    # Since recursive iterators are not yet implemented, 
+    # this uses an explicit stack (which is more efficient anyway):
+    var stack: seq[PBinaryTree[T]] = @[root]
+    while stack.len > 0:
+      var n = stack[stack.len-1]
+      setLen(stack, stack.len-1) # pop `n` off the stack
+      while n != nil:
+        yield n.data
+        add(stack, n.ri)  # push right subtree onto the stack
+        n = n.le          # and follow the left pointer
+      
+  var
+    root: PBinaryTree[string] # instantiate a PBinaryTree with ``string``
+  add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and ``add``
+  add(root, "world")          # instantiates the second ``add`` proc
+  for str in preorder(root):
+    stdout.writeln(str)
+
+The example shows a generic binary tree. Depending on context, the brackets are 
+used either to introduce type parameters or to instantiate a generic proc, 
+iterator or type. As the example shows, generics work with overloading: The
+best match of ``add`` is used. The built-in ``add`` procedure for sequences
+is not hidden and used in the ``preorder`` iterator. 
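+
+Generic procs do not need a generic container type. As a smaller sketch (the
+``largest`` proc is hypothetical), a proc can be parametrized over any type
+that provides a ``<`` operator:
+
+.. code-block:: nimrod
+
+  proc largest[T](a, b: T): T =
+    # instantiated separately for every type it is called with
+    if a < b: result = b
+    else: result = a
+
+  echo(largest(3, 7))         # instantiates ``largest`` with ``int``
+  echo(largest("abc", "abd")) # instantiates ``largest`` with ``string``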
+
+
+Templates
+=========
+
+Templates are a simple substitution mechanism that operates on Nimrod's 
+abstract syntax trees. Templates are processed in the semantic pass of the 
+compiler. They integrate well with the rest of the language and share none 
+of the flaws of C's preprocessor macros. However, they may lead to code that is harder 
+to understand and maintain. So one should use them sparingly. 
+
+To *invoke* a template, call it like a procedure.
+
+Example:
+
+.. code-block:: nimrod
+  template `!=` (a, b: expr): expr =
+    # this definition exists in the System module
+    not (a == b)
+
+  assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))
+
+The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact 
+templates: This has the benefit that if you overload the ``==`` operator, 
+the ``!=`` operator is available automatically and does the right thing.
+
+``a > b`` is transformed into ``b < a``.
+``a in b`` is transformed into ``contains(b, a)``. 
+``notin`` and ``isnot`` have the obvious meanings.
+
+Templates are especially useful for lazy evaluation purposes. Consider a
+simple proc for logging: 
+
+.. code-block:: nimrod
+  const
+    debug = True
+    
+  proc log(msg: string) {.inline.} = 
+    if debug:
+      stdout.writeln(msg)
+  
+  var
+    x = 4
+  log("x has the value: " & $x)
+
+This code has a shortcoming: If ``debug`` is set to false someday, the quite
+expensive ``$`` and ``&`` operations are still performed! (The argument 
+evaluation for procedures is said to be *eager*).
+
+Turning the ``log`` proc into a template solves this problem in an elegant way:
+
+.. code-block:: nimrod
+  const
+    debug = True
+    
+  template log(msg: expr): stmt = 
+    if debug:
+      stdout.writeln(msg)
+  
+  var
+    x = 4
+  log("x has the value: " & $x)
+
+The "types" of templates can be the symbols ``expr`` (stands for *expression*), 
+``stmt`` (stands for *statement*) or ``typedesc`` (stands for *type 
+description*). These are not real types; they just help the compiler parse the code.
+
+The template body does not open a new scope. To open a new scope
+use a ``block`` statement:
+
+.. code-block:: nimrod
+  template declareInScope(x: expr, t: typeDesc): stmt = 
+    var x: t
+    
+  template declareInNewScope(x: expr, t: typeDesc): stmt = 
+    # open a new scope:
+    block: 
+      var x: t
+
+  declareInScope(a, int)
+  a = 42  # works, `a` is known here
+  
+  declareInNewScope(b, int)
+  b = 42  # does not work, `b` is unknown
+
+
+Macros
+======
+
+If the template mechanism scares you, you will be pleased to hear that 
+templates are not really necessary: Macros can do anything that templates can
+do and much more. Macros are harder to write than templates and even harder 
+to get right :-). Now that you have been warned, let's see what a macro *is*.
+
+Macros enable advanced compile-time code transformations, but they
+cannot change Nimrod's syntax. However, this is no real restriction because
+Nimrod's syntax is flexible enough anyway. 
+
+`Macros`:idx: can be used to implement `domain specific languages`:idx:. 
+
+To write macros, one needs to know how the Nimrod concrete syntax is converted
+to an abstract syntax tree (AST). (Unfortunately the AST is not documented yet.)
+
+There are two ways to invoke a macro:
+
+(1) invoking a macro like a procedure call (`expression macros`:idx:)
+(2) invoking a macro with the special ``macrostmt`` syntax (`statement macros`:idx:)
+
+
+Expression Macros
+-----------------
+
+The following example implements a powerful ``debug`` command that accepts a
+variable number of arguments (this cannot be done with templates):
+
+.. code-block:: nimrod
+  # to work with Nimrod syntax trees, we need an API that is defined in the
+  # ``macros`` module:
+  import macros
+
+  macro debug(n: expr): stmt =
+    # `n` is a Nimrod AST that contains the whole macro expression
+    # this macro returns a list of statements:
+    result = newNimNode(nnkStmtList, n)
+    # iterate over any argument that is passed to this macro:
+    for i in 1..n.len-1:
+      # add a call to the statement list that writes the expression;
+      # `toStrLit` converts an AST to its string representation:
+      result.add(newCall("write", newIdentNode("stdout"), toStrLit(n[i])))
+      # add a call to the statement list that writes ": "
+      result.add(newCall("write", newIdentNode("stdout"), newStrLitNode(": ")))
+      # add a call to the statement list that writes the expression's value:
+      result.add(newCall("writeln", newIdentNode("stdout"), n[i]))
+
+  var
+    a: array[0..10, int]
+    x = "some string"
+  a[0] = 42
+  a[1] = 45
+
+  debug(a[0], a[1], x)
+
+The macro call expands to:
+
+.. code-block:: nimrod
+  write(stdout, "a[0]")
+  write(stdout, ": ")
+  writeln(stdout, a[0])
+
+  write(stdout, "a[1]")
+  write(stdout, ": ")
+  writeln(stdout, a[1])
+
+  write(stdout, "x")
+  write(stdout, ": ")
+  writeln(stdout, x)
+
+
+Let's return to the dynamic binding ``r.draw(r)`` notational "problem". Apart 
+from closures, there is another "solution": Define an infix ``!`` macro 
+operator which hides it: 
+
+.. code-block:: nimrod
+
+  macro `!` (n: expr): expr = 
+    result = newNimNode(nnkCall, n)
+    var dot = newNimNode(nnkDotExpr, n)
+    dot.add(n[1])    # obj
+    if n[2].kind == nnkCall:
+      # transforms ``obj!method(arg1, arg2, ...)`` to
+      # ``(obj.method)(obj, arg1, arg2, ...)``
+      dot.add(n[2][0]) # method
+      result.add(dot)
+      result.add(n[1]) # obj
+      for i in 1..n[2].len-1:
+        result.add(n[2][i])
+    else:
+      # transforms ``obj!method`` to
+      # ``(obj.method)(obj)``
+      dot.add(n[2]) # method
+      result.add(dot)
+      result.add(n[1]) # obj
+  
+  r!draw(a, b, c) # will be transformed into ``r.draw(r, a, b, c)``
+
+Great! 20 lines of complex code to save a few keystrokes! Obviously, this is
+exactly what you should not do! (But it makes a cool example.)
+
+
+Statement Macros
+----------------
+
+Statement macros are defined just as expression macros. However, they are
+invoked by an expression following a colon.
+
+The following example outlines a macro that generates a lexical analyser from
+regular expressions:
+
+.. code-block:: nimrod
+
+  macro case_token(n: stmt): stmt =
+    # creates a lexical analyser from regular expressions
+    # ... (implementation is an exercise for the reader :-)
+    nil
+
+  case_token: # this colon tells the parser it is a macro statement
+  of r"[A-Za-z_]+[A-Za-z_0-9]*":
+    return tkIdentifier
+  of r"[0-9]+":
+    return tkInteger
+  of r"[\+\-\*\?]+":
+    return tkOperator
+  else:
+    return tkUnknown
+
+