========================
Nimrod Tutorial (Part I)
========================

:Author: Andreas Rumpf
:Version: |nimrodversion|

.. contents::

Introduction
============

This document is a tutorial for the programming language *Nimrod*.
This tutorial assumes that you are familiar with basic programming concepts
like variables, types or statements but is kept very basic. The manual
contains many more examples of the advanced language features.


The first program
=================

We start the tour with a modified "hello world" program:

.. code-block:: Nimrod

  # This is a comment
  echo("What's your name? ")
  var name: string = readLine(stdin)
  echo("Hi, ", name, "!")


Save this code to the file "greetings.nim". Now compile and run it::

  nimrod compile --run greetings.nim

With the ``--run`` switch Nimrod executes the file automatically
after compilation. You can give your program command line arguments by
appending them after the filename::

  nimrod compile --run greetings.nim arg1 arg2

Commonly used commands and switches have abbreviations, so you can also use::

  nimrod c -r greetings.nim

To compile a release version use::

  nimrod c -d:release greetings.nim

By default the Nimrod compiler generates a large number of runtime checks
aiming for your debugging pleasure. With ``-d:release`` these checks are
turned off and optimizations are turned on.

Though it should be pretty obvious what the program does, I will explain the
syntax: statements which are not indented are executed when the program
starts. Indentation is Nimrod's way of grouping statements. Indentation is
done with spaces only; tabulators are not allowed.

String literals are enclosed in double quotes. The ``var`` statement declares
a new variable named ``name`` of type ``string`` with the value that is
returned by the ``readLine`` procedure. Since the compiler knows that
``readLine`` returns a string, you can leave out the type in the declaration
(this is called `local type inference`:idx:). So this will work too:

.. code-block:: Nimrod

  var name = readLine(stdin)

Note that this is basically the only form of type inference that exists in
Nimrod: it is a good compromise between brevity and readability.

The "hello world" program contains several identifiers that are already
known to the compiler: ``echo``, ``readLine``, etc. These built-ins are
declared in the system_ module which is implicitly imported by any other
module.


Lexical elements
================

Let us look at Nimrod's lexical elements in more detail: like other
programming languages Nimrod consists of (string) literals, identifiers,
keywords, comments, operators, and other punctuation marks.


String and character literals
-----------------------------

String literals are enclosed in double quotes; character literals in single
quotes. Special characters are escaped with ``\``: ``\n`` means newline,
``\t`` means tabulator, etc. There are also *raw* string literals:

.. code-block:: Nimrod

  r"C:\program files\nim"

In raw literals the backslash is not an escape character.

The third and last way to write string literals are *long string literals*.
They are written with three quotes: ``""" ... """``; they can span over
multiple lines and the ``\`` is not an escape character either. They are very
useful for embedding HTML code templates for example.


Comments
--------

`Comments`:idx: start anywhere outside a string or character literal with the
hash character ``#``. Documentation comments start with ``##``.
Multiline comments need to be aligned at the same column:

.. code-block:: nimrod

  i = 0     # This is a single comment over multiple lines belonging to the
            # assignment statement.
  # This is a new comment belonging to the current block, but to no particular
  # statement.
  i = i + 1 # This is a new comment that is NOT
  echo(i)   # continued here, because this comment refers to the echo statement


The alignment requirement does not hold if the preceding comment piece ends in
a backslash:

.. code-block:: nimrod

  type
    TMyObject {.final, pure, acyclic.} = object  # comment continues: \
      # we have lots of space here to comment 'TMyObject'.
      # This line belongs to the comment as it's properly aligned.

Comments are tokens; they are only allowed at certain places in the input file
as they belong to the syntax tree! This feature enables perfect
source-to-source transformations (such as pretty-printing) and simpler
documentation generators. A nice side-effect is that the human reader of the
code always knows exactly which code snippet the comment refers to.
Since comments are a proper part of the syntax, watch their indentation:

.. code-block::

  echo("Hello!")
  # comment has the same indentation as above statement -> fine
  echo("Hi!")
    # comment has not the correct indentation level -> syntax error!

**Note**: To comment out a large piece of code, it is often better to use a
``when false:`` statement.


Numbers
-------

Numerical literals are written as in most other languages. As a special twist,
underscores are allowed for better readability: ``1_000_000`` (one million).
A number that contains a dot (or 'e' or 'E') is a floating point literal:
``1.0e9`` (one billion). Hexadecimal literals are prefixed with ``0x``,
binary literals with ``0b`` and octal literals with ``0o``. A leading zero
alone does not produce an octal.


The var statement
=================

The var statement declares a new local or global variable:

.. code-block::

  var x, y: int # declares x and y to have the type ``int``

Indentation can be used after the ``var`` keyword to list a whole section of
variables:

.. code-block::

  var
    x, y: int
    # a comment can occur here too
    a, b, c: string


The assignment statement
========================

The assignment statement assigns a new value to a variable or more generally
to a storage location:

.. code-block::

  var x = "abc" # introduces a new variable `x` and assigns a value to it
  x = "xyz"     # assigns a new value to `x`

``=`` is the *assignment operator*.
The assignment operator cannot be overloaded, overwritten or forbidden, but
this might change in a future version of Nimrod.


Constants
=========

`Constants`:idx: are symbols which are bound to a value. The constant's value
cannot change. The compiler must be able to evaluate the expression in a
constant declaration at compile time:

.. code-block:: nimrod

  const x = "abc" # the constant x contains the string "abc"

Indentation can be used after the ``const`` keyword to list a whole section of
constants:

.. code-block::

  const
    x = 1
    # a comment can occur here too
    y = 2
    z = y + 5 # computations are possible


The let statement
=================

The ``let`` statement works like the ``var`` statement but the declared
symbols are *single assignment* variables: After the initialization their
value cannot change:

.. code-block::

  let x = "abc" # introduces a new variable `x` and binds a value to it
  x = "xyz"     # Illegal: assignment to `x`

The difference between ``let`` and ``const`` is: ``let`` introduces a variable
that can not be re-assigned, ``const`` means "enforce compile time evaluation
and put it into a data section":

.. code-block::

  const input = readline(stdin) # Error: constant expression expected

.. code-block::

  let input = readline(stdin)   # works


Control flow statements
=======================

The greetings program consists of 3 statements that are executed sequentially.
Only the most primitive programs can get away with that: branching and looping
are needed too.


If statement
------------

The if statement is one way to branch the control flow:

.. code-block:: nimrod

  let name = readLine(stdin)
  if name == "":
    echo("Poor soul, you lost your name?")
  elif name == "name":
    echo("Very funny, your name is name.")
  else:
    echo("Hi, ", name, "!")

There can be zero or more elif parts, and the else part is optional. The
keyword ``elif`` is short for ``else if``, and is useful to avoid excessive
indentation. (The ``""`` is the empty string. It contains no characters.)

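
The remark that the elif and else parts are optional can be tried out with a
small sketch. The names ``temperature`` and ``verdict`` and the sample value
are made up for this illustration and do not come from the tutorial's own
examples:

.. code-block:: nimrod

  let temperature = 31     # hypothetical sample value
  var verdict = "mild"     # stays unchanged if no branch below is taken

  # a lone ``if``: no ``elif`` part and no ``else`` part
  if temperature > 30:
    verdict = "hot"

  # one ``elif`` part and still no ``else``; for 31 neither branch is taken
  if temperature < 0:
    verdict = "freezing"
  elif temperature < 15:
    verdict = "cold"

  echo(verdict)

When no condition holds and there is no else part, the if statement simply
does nothing and execution continues with the next statement.
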

Case statement
--------------

Another way to branch is provided by the case statement. A case statement is
a multi-branch:

.. code-block:: nimrod

  let name = readLine(stdin)
  case name
  of "":
    echo("Poor soul, you lost your name?")
  of "name":
    echo("Very funny, your name is name.")
  of "Dave", "Frank":
    echo("Cool name!")
  else:
    echo("Hi, ", name, "!")

As can be seen, a comma separated list of values is also allowed for an
``of`` branch.

The case statement can deal with integers, other ordinal types and strings.
(What an ordinal type is will be explained soon.)
For integers or other ordinal types value ranges are also possible:

.. code-block:: nimrod

  # this statement will be explained later:
  from strutils import parseInt

  echo("A number please: ")
  let n = parseInt(readLine(stdin))
  case n
  of 0..2, 4..7: echo("The number is in the set: {0, 1, 2, 4, 5, 6, 7}")
  of 3, 8: echo("The number is 3 or 8")

However, the above code does not compile: the reason is that you have to cover
every value that ``n`` may contain, but the code only handles the values
``0..8``. Since it is not very practical to list every other possible integer
(though it is possible thanks to the range notation), we fix this by telling
the compiler that for every other value nothing should be done:

.. code-block:: nimrod

  ...
  case n
  of 0..2, 4..7: echo("The number is in the set: {0, 1, 2, 4, 5, 6, 7}")
  of 3, 8: echo("The number is 3 or 8")
  else: nil

The ``nil`` statement is a *do nothing* statement. The compiler knows that a
case statement with an else part cannot fail and thus the error disappears.
Note that it is impossible to cover all possible string values: that is why
there is no such check for string cases.

In general the case statement is used for subrange types or enumerations where
it is of great help that the compiler checks that you covered every possible
value.


While statement
---------------

The while statement is a simple looping construct:

.. code-block:: nimrod

  echo("What's your name? ")
  var name = readLine(stdin)
  while name == "":
    echo("Please tell me your name: ")
    name = readLine(stdin)
    # no ``var``, because we do not declare a new variable here

The example uses a while loop to keep asking the user for his name, as long
as he types in nothing (only presses RETURN).


For statement
-------------

The `for`:idx: statement is a construct to loop over every element an
*iterator* provides. The example uses the built-in ``countup`` iterator:

.. code-block:: nimrod

  echo("Counting to ten: ")
  for i in countup(1, 10):
    echo($i)

The built-in ``$`` operator turns an integer (``int``) and many other types
into a string. The variable ``i`` is implicitly declared by the ``for`` loop
and has the type ``int``, because that is what ``countup`` returns. ``i`` runs
through the values 1, 2, .., 10. Each value is ``echo``-ed. This code does
the same:

.. code-block:: nimrod

  echo("Counting to 10: ")
  var i = 1
  while i <= 10:
    echo($i)
    inc(i) # increment i by 1

Counting down can be achieved as easily (but is less often needed):

.. code-block:: nimrod

  echo("Counting down from 10 to 1: ")
  for i in countdown(10, 1):
    echo($i)

Since counting up occurs so often in programs, Nimrod also has a ``..``
iterator that does the same:

.. code-block:: nimrod

  for i in 1..10:
    ...


Scopes and the block statement
------------------------------

Control flow statements have a feature not covered yet: they open a
new scope. This means that in the following example, ``x`` is not accessible
outside the loop:

.. code-block:: nimrod

  while false:
    var x = "hi"
  echo(x) # does not work

A while (for) statement introduces an implicit block. Identifiers
are only visible within the block in which they have been declared. The
``block`` statement can be used to open a new block explicitly:

.. code-block:: nimrod

  block myblock:
    var x = "hi"
  echo(x) # does not work either

The block's *label* (``myblock`` in the example) is optional.


Break statement
---------------

A block can be left prematurely with a ``break`` statement.
The break statement can leave a ``while``, ``for``, or a ``block`` statement.
It leaves the innermost construct, unless a label of a block is given:

.. code-block:: nimrod

  block myblock:
    echo("entering block")
    while true:
      echo("looping")
      break # leaves the loop, but not the block
    echo("still in block")

  block myblock2:
    echo("entering block")
    while true:
      echo("looping")
      break myblock2 # leaves the block (and the loop)
    echo("still in block")


Continue statement
------------------

Like in many other programming languages, a ``continue`` statement starts
the next iteration immediately:

.. code-block:: nimrod

  while true:
    let x = readLine(stdin)
    if x == "": continue
    echo(x)


When statement
--------------

Example:

.. code-block:: nimrod

  when system.hostOS == "windows":
    echo("running on Windows!")
  elif system.hostOS == "linux":
    echo("running on Linux!")
  elif system.hostOS == "macosx":
    echo("running on Mac OS X!")
  else:
    echo("unknown operating system")

The `when`:idx: statement is almost identical to the ``if`` statement with
some differences:

* Each condition has to be a constant expression since it is evaluated by the
  compiler.
* The statements within a branch do not open a new scope.
* The compiler checks the semantics and produces code *only* for the
  statements that belong to the first condition that evaluates to ``true``.

The ``when`` statement is useful for writing platform specific code, similar
to the ``#ifdef`` construct in the C programming language.

**Note**: To comment out a large piece of code, it is often better to use a
``when false:`` statement than to use real comments. This way nesting is
possible.


Statements and indentation
==========================

Now that we covered the basic control flow statements, let's return to Nimrod
indentation rules.

In Nimrod there is a distinction between *simple statements* and *complex
statements*.
*Simple statements* cannot contain other statements:
Assignment, procedure calls or the ``return`` statement belong to the simple
statements. *Complex statements* like ``if``, ``when``, ``for``, ``while`` can
contain other statements. To avoid ambiguities, complex statements always have
to be indented, but single simple statements do not:

.. code-block:: nimrod

  # no indentation needed for single assignment statement:
  if x: x = false

  # indentation needed for nested if statement:
  if x:
    if y:
      y = false
    else:
      y = true

  # indentation needed, because two statements follow the condition:
  if x:
    x = false
    y = false

*Expressions* are parts of a statement which usually result in a value. The
condition in an if statement is an example of an expression. Expressions can
contain indentation at certain places for better readability:

.. code-block:: nimrod

  if thisIsaLongCondition() and
      thisIsAnotherLongCondition(1,
         2, 3, 4):
    x = true

As a rule of thumb, indentation within expressions is allowed after operators,
after an open parenthesis and after commas.


Procedures
==========

To define new commands like ``echo`` and ``readLine`` in the examples, the
concept of a `procedure` is needed. (Some languages call them *methods* or
*functions*.) In Nimrod new procedures are defined with the ``proc`` keyword:

.. code-block:: nimrod

  proc yes(question: string): bool =
    echo(question, " (y/n)")
    while true:
      case readLine(stdin)
      of "y", "Y", "yes", "Yes": return true
      of "n", "N", "no", "No": return false
      else: echo("Please be clear: yes or no")

  if yes("Should I delete all your important files?"):
    echo("I'm sorry Dave, I'm afraid I can't do that.")
  else:
    echo("I think you know what the problem is just as well as I do.")

This example shows a procedure named ``yes`` that asks the user a ``question``
and returns true if he answered "yes" (or something similar) and returns
false if he answered "no" (or something similar). A ``return`` statement
leaves the procedure (and therefore the while loop) immediately.

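
Because ``yes`` reads from the keyboard, it is awkward to experiment with.
The following is a minimal sketch of the same pattern, a ``return`` inside an
endless ``while`` loop, that needs no input. The procedure ``firstMultiple``
is a hypothetical helper invented for this illustration; it is not part of
the standard library:

.. code-block:: nimrod

  # Hypothetical helper: returns the first multiple of ``n`` that is
  # greater than or equal to ``start``. Assumes that n > 0.
  proc firstMultiple(n, start: int): int =
    var i = start
    while true:
      if i mod n == 0:
        return i   # leaves the loop and the procedure at the same time
      inc(i)

  echo(firstMultiple(7, 30))

Starting at 30, the loop tests 30, 31, 32, 33, 34 and finally 35, which is
divisible by 7, so the call returns 35 and the loop never runs again.
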
The ``(question: string): bool`` syntax describes that the procedure expects
a parameter named ``question`` of type ``string`` and returns a value of type
``bool``. ``bool`` is a built-in type: the only valid values for ``bool`` are
``true`` and ``false``.
The conditions in if or while statements should be of the type ``bool``.

Some terminology: in the example ``question`` is called a (formal)
*parameter*, ``"Should I..."`` is called an *argument* that is passed to this
parameter.


Result variable
---------------

A procedure that returns a value has an implicit ``result`` variable that
represents the return value. A ``return`` statement with no expression is a
shorthand for ``return result``. So all three code snippets are equivalent:

.. code-block:: nimrod

  return 42

.. code-block:: nimrod

  result = 42
  return

.. code-block:: nimrod

  result = 42
  return result


Parameters
----------

Parameters are constant in the procedure body. Their value cannot be changed
because this allows the compiler to implement parameter passing in the most
efficient way. If the procedure needs to modify the argument for the
caller, a ``var`` parameter can be used:

.. code-block:: nimrod

  proc divmod(a, b: int; res, remainder: var int) =
    res = a div b        # integer division
    remainder = a mod b  # integer modulo operation

  var
    x, y: int
  divmod(8, 5, x, y) # modifies x and y
  echo(x)
  echo(y)

In the example, ``res`` and ``remainder`` are `var parameters`.
Var parameters can be modified by the procedure and the changes are
visible to the caller. Note that the above example would better make use of
a tuple as a return value instead of using var parameters.


Discard statement
-----------------

To call a procedure that returns a value just for its side effects while
ignoring its return value, a discard statement **has** to be used. Nimrod does
not allow silently throwing away a return value:

.. code-block:: nimrod

  discard yes("May I ask a pointless question?")


The return value can be ignored implicitly if the called proc/iterator has
been declared with the ``discardable`` pragma:

.. code-block:: nimrod

  proc p(x, y: int): int {.discardable.} =
    return x + y

  p(3, 4) # now valid


Named arguments
---------------

Often a procedure has many parameters and it is not clear in which order the
parameters appear. This is especially true for procedures that construct a
complex data type. Therefore the arguments to a procedure can be named, so
that it is clear which argument belongs to which parameter:

.. code-block:: nimrod

  proc createWindow(x, y, width, height: int; title: string;
                    show: bool): Window =
     ...

  var w = createWindow(show = true, title = "My Application",
                       x = 0, y = 0, height = 600, width = 800)

Now that we use named arguments to call ``createWindow`` the argument order
does not matter anymore. Mixing named arguments with ordered arguments is
also possible, but not very readable:

.. code-block:: nimrod

  var w = createWindow(0, 0, title = "My Application",
                       height = 600, width = 800, true)

The compiler checks that each parameter receives exactly one argument.


Default values
--------------

To make the ``createWindow`` proc easier to use it should provide `default
values`; these are values that are used as arguments if the caller does not
specify them:

.. code-block:: nimrod

  proc createWindow(x = 0, y = 0, width = 500, height = 700,
                    title = "unknown",
                    show = true): Window =
     ...

  var w = createWindow(title = "My Application", height = 600, width = 800)

Now the call to ``createWindow`` only needs to set the values that differ
from the defaults.

Note that type inference works for parameters with default values; there is
no need to write ``title: string = "unknown"``, for example.


Overloaded procedures
---------------------

Nimrod provides the ability to overload procedures similar to C++:

.. code-block:: nimrod

  proc toString(x: int): string = ...
  proc toString(x: bool): string =
    if x: return "true"
    else: return "false"

  echo(toString(13))   # calls the toString(x: int) proc
  echo(toString(true)) # calls the toString(x: bool) proc

(Note that ``toString`` is usually the ``$`` operator in Nimrod.)
The compiler chooses the most appropriate proc for the ``toString`` calls. How
this overloading resolution algorithm works exactly is not discussed here
(it will be specified in the manual soon).
However, it does not lead to nasty surprises and is based on a quite simple
unification algorithm. Ambiguous calls are reported as errors.


Operators
---------

The Nimrod library makes heavy use of overloading - one reason for this is
that each operator like ``+`` is just an overloaded proc. The parser lets you
use operators in `infix notation` (``a + b``) or `prefix notation` (``+ a``).
An infix operator always receives two arguments, a prefix operator always one.
Postfix operators are not possible, because this would be ambiguous: does
``a @ @ b`` mean ``(a) @ (@b)`` or ``(a@) @ (b)``? It always means
``(a) @ (@b)``, because there are no postfix operators in Nimrod.

Apart from a few built-in keyword operators such as ``and``, ``or``, ``not``,
operators always consist of these characters:
``+  -  *  \  /  <  >  =  @  $  ~  &  %  !  ?  ^  .  |``

User defined operators are allowed. Nothing stops you from defining your own
``@!?+~`` operator, but readability can suffer.

The operator's precedence is determined by its first character. The details
can be found in the manual.

To define a new operator enclose the operator in "``":

.. code-block:: nimrod

  proc `$` (x: myDataType): string = ...
  # now the $ operator also works with myDataType, overloading resolution
  # ensures that $ works for built-in types just like before

The "``" notation can also be used to call an operator just like any other
procedure:

.. code-block:: nimrod

  if `==`( `+`(3, 4), 7): echo("True")


Forward declarations
--------------------

Every variable, procedure, etc. needs to be declared before it can be used.
(The reason for this is compilation efficiency.)
However, this cannot be done for mutually recursive procedures:

.. code-block:: nimrod

  # forward declaration:
  proc even(n: int): bool

  # both procs assume n >= 0; the recursion ends when n reaches 0
  proc odd(n: int): bool =
    if n == 0: return false
    else: return even(n-1)

  proc even(n: int): bool =
    if n == 0: return true
    else: return odd(n-1)

Here ``odd`` depends on ``even`` and vice versa. Thus ``even`` needs to be
introduced to the compiler before it is completely defined. The syntax for
such a `forward declaration`:idx: is simple: just omit the ``=`` and the
procedure's body.

Later versions of the language may get rid of the need for forward
declarations.


Iterators
=========

Let's return to the boring counting example:

.. code-block:: nimrod

  echo("Counting to ten: ")
  for i in countup(1, 10):
    echo($i)

Can a ``countup`` proc be written that supports this loop? Let's try:

.. code-block:: nimrod

  proc countup(a, b: int): int =
    var res = a
    while res <= b:
      return res
      inc(res)

However, this does not work. The problem is that the procedure should not
only ``return``, but return and **continue** after an iteration has
finished. This *return and continue* is called a `yield` statement. Now
the only thing left to do is to replace the ``proc`` keyword by ``iterator``
and there it is - our first iterator:

.. code-block:: nimrod

  iterator countup(a, b: int): int =
    var res = a
    while res <= b:
      yield res
      inc(res)

Iterators look very similar to procedures, but there are several
important differences:

* Iterators can only be called from for loops.
* Iterators cannot contain a ``return`` statement and procs cannot contain a
  ``yield`` statement.
* Iterators have no implicit ``result`` variable.
* Iterators do not support recursion.
* Iterators cannot be forward declared, because the compiler must be able
  to inline an iterator. (This restriction will be gone in a
  future version of the compiler.)

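
The interplay between an iterator and a procedure can be tried out with a
small sketch. Both names below, ``countupBy`` and ``sumOf``, are hypothetical
and were chosen for this illustration only. The iterator uses ``yield`` and is
invoked from a ``for`` loop, while the procedure that consumes it uses its
implicit ``result`` variable:

.. code-block:: nimrod

  # Hypothetical iterator: yields a, a + step, a + 2*step, ... up to b.
  # Assumes that step > 0.
  iterator countupBy(a, b, step: int): int =
    var res = a
    while res <= b:
      yield res        # hand the value to the loop body, then continue here
      inc(res, step)

  # Hypothetical procedure that adds up everything the iterator yields.
  proc sumOf(a, b, step: int): int =
    result = 0
    for i in countupBy(a, b, step):   # iterators are called from for loops
      result = result + i

  echo(sumOf(1, 10, 3))

For the arguments shown, the iterator yields 1, 4, 7 and 10, so the sum is 22.
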
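
However, you can also use a ``closure`` iterator to get a different set of
restrictions. See the material on first class iterators for details.

  "Der Mensch ist doch ein Augentier -- schöne Dinge wünsch ich mir."
  ("Man is a visual creature after all -- I wish for beautiful things.")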