author | PhiLho <PhiLho@GMX.net> | 2010-05-21 16:03:21 +0200 |
---|---|---|
committer | PhiLho <PhiLho@GMX.net> | 2010-05-21 16:03:21 +0200 |
commit | 286e5958d662038fdc852861ecd07c89256495ab (patch) | |
tree | cea500e695b874f47358602c6433955963a9a7bc /doc | |
parent | 227b76c34259cf406131d27fb8e0cc88530e38f7 (diff) | |
download | Nim-286e5958d662038fdc852861ecd07c89256495ab.tar.gz |
Integrating my changes, mostly minor/cosmetic fixes; plus a big Windows lib update
Diffstat (limited to 'doc')
-rwxr-xr-x | doc/abstypes.txt | 80 |
-rwxr-xr-x | doc/effects.txt | 10 |
-rwxr-xr-x | doc/endb.txt | 2 |
-rwxr-xr-x | doc/lib.txt | 78 |
-rwxr-xr-x | doc/manual.txt | 5529 |
-rwxr-xr-x | doc/nimrodc.txt | 478 |
-rwxr-xr-x | doc/rst.txt | 22 |
7 files changed, 3099 insertions, 3100 deletions
diff --git a/doc/abstypes.txt b/doc/abstypes.txt index 44c3bb0c9..c5827745a 100755 --- a/doc/abstypes.txt +++ b/doc/abstypes.txt @@ -10,21 +10,21 @@ a `base type`:idx:. Use case 1: SQL strings ----------------------- -An SQL statement that is passed from Nimrod to an SQL database might be +An SQL statement that is passed from Nimrod to an SQL database might be modelled as a string. However, using string templates and filling in the -values is vulnerable to the famous `SQL injection attack`:idx:\: +values is vulnerable to the famous `SQL injection attack`:idx:\: .. code-block:: nimrod proc query(db: TDbHandle, statement: TSQL) = ... var username: string - + db.query("SELECT FROM users WHERE name = '$1'" % username) - # Horrible security whole, but the compiler does not mind! - -This can be avoided by distinguishing strings that contain SQL from strings -that don't. Abstract types provide a means to introduce a new string type + # Horrible security hole, but the compiler does not mind! + +This can be avoided by distinguishing strings that contain SQL from strings +that don't. Abstract types provide a means to introduce a new string type ``TSQL`` that is incompatible with ``string``: .. code-block:: nimrod @@ -35,26 +35,26 @@ that don't. Abstract types provide a means to introduce a new string type var username: string - + db.query("SELECT FROM users WHERE name = '$1'" % username) # Error at compile time: `query` expects an SQL string! - - + + It is an essential property of abstract types that they **do not** imply a subtype relation between the abtract type and its base type. Explict type -conversions from ``string`` to ``TSQL`` are allowed: +conversions from ``string`` to ``TSQL`` are allowed: .. code-block:: nimrod - proc properQuote(s: string): TSQL = + proc properQuote(s: string): TSQL = # quotes a string properly for an SQL statement ... 
- - proc `%` (frmt: TSQL, values: openarray[string]): TSQL = + + proc `%` (frmt: TSQL, values: openarray[string]): TSQL = # quote each argument: var v = values.each(properQuote) # we need a temporary type for the type conversion :-( type TStrSeq = seq[string] - # call strutils.`%`: + # call strutils.`%`: result = TSQL(string(frmt) % TStrSeq(v)) db.query("SELECT FROM users WHERE name = $1".TSQL % username) @@ -68,43 +68,43 @@ for nice looking ``TSQL`` string literals. Use case 2: Money ----------------- Different currencies should not be mixed in monetary calculations. Abstract -types are a perfect tool to model different currencies: +types are a perfect tool to model different currencies: .. code-block:: nimrod type TDollar = abstract int TEuro = abstract int - + var d: TDollar e: TEuro - - echo d + 12 + + echo d + 12 # Error: cannot add a number with no unit with a ``TDollar`` -Unfortunetaly, ``d + 12.TDollar`` is not allowed either, +Unfortunetaly, ``d + 12.TDollar`` is not allowed either, because ``+`` is defined for ``int`` (among others), not for ``TDollar``. So -we define our own ``+`` for dollars: +we define our own ``+`` for dollars: -.. code-block:: - proc `+` (x, y: TDollar): TDollar = +.. code-block:: + proc `+` (x, y: TDollar): TDollar = result = TDollar(int(x) + int(y)) It does not make sense to multiply a dollar with a dollar, but with a number without unit; and the same holds for division: -.. code-block:: - proc `*` (x: TDollar, y: int): TDollar = +.. code-block:: + proc `*` (x: TDollar, y: int): TDollar = result = TDollar(int(x) * y) - proc `*` (x: int, y: TDollar): TDollar = + proc `*` (x: int, y: TDollar): TDollar = result = TDollar(x * int(y)) - + proc `div` ... -This quickly gets tedious. The implementations are trivial and the compiler +This quickly gets tedious. 
The implementations are trivial and the compiler should not generate all this code only to optimize it away later - after all -``+`` for dollars should produce the same binary code as ``+`` for ints. +``+`` for dollars should produce the same binary code as ``+`` for ints. The pragma ``borrow`` has been designed to solve this problem; in principle it generates the trivial implementation for us: @@ -113,40 +113,40 @@ it generates the trivial implementation for us: proc `*` (x: int, y: TDollar): TDollar {.borrow.} proc `div` (x: TDollar, y: int): TDollar {.borrow.} -The ``borrow`` pragma makes the compiler use the same implementation as +The ``borrow`` pragma makes the compiler to use the same implementation as the proc that deals with the abstract type's base type, so no code is -generated. +generated. But it seems we still have to repeat all this boilerplate code for -the ``TEuro`` currency. Fortunately, Nimrod has a template mechanism: +the ``TEuro`` currency. Fortunately, Nimrod has a template mechanism: .. 
code-block:: nimrod template Additive(typ: typeDesc): stmt = proc `+` *(x, y: typ): typ {.borrow.} proc `-` *(x, y: typ): typ {.borrow.} - + # unary operators: proc `+` *(x: typ): typ {.borrow.} proc `-` *(x: typ): typ {.borrow.} - - template Multiplicative(typ, base: typeDesc): stmt = + + template Multiplicative(typ, base: typeDesc): stmt = proc `*` *(x: typ, y: base): typ {.borrow.} proc `*` *(x: base, y: typ): typ {.borrow.} proc `div` *(x: typ, y: base): typ {.borrow.} proc `mod` *(x: typ, y: base): typ {.borrow.} - - template Comparable(typ: typeDesc): stmt = + + template Comparable(typ: typeDesc): stmt = proc `<` * (x, y: typ): bool {.borrow.} proc `<=` * (x, y: typ): bool {.borrow.} proc `==` * (x, y: typ): bool {.borrow.} - - template DefineCurrency(typ, base: expr): stmt = + + template DefineCurrency(typ, base: expr): stmt = type typ* = abstract base Additive(typ) Multiplicative(typ, base) Comparable(typ) - + DefineCurrency(TDollar, int) DefineCurrency(TEuro, int) diff --git a/doc/effects.txt b/doc/effects.txt index 85de1ffdf..8084ae17a 100755 --- a/doc/effects.txt +++ b/doc/effects.txt @@ -9,7 +9,7 @@ explicit like in Haskell? The idea is that side effects and partial evaluation belong together: Iff a proc is side effect free and all its argument are evaluable at compile time, it can be evaluated by the compiler. However, really -difficult is the ``newString`` proc: If it is simply wrapped, it +difficult is the ``newString`` proc: If it is simply wrapped, it should not be evaluated at compile time! On other occasions it can and should be evaluted: @@ -20,22 +20,22 @@ and should be evaluted: result[i] = toUpper(s[i]) No, it really can always be evaluated. The code generator should transform -``s = "\0\0\0..."`` back into ``s = newString(...)``. +``s = "\0\0\0..."`` back into ``s = newString(...)``. -``new`` cannot be evaluated at compile time either. +``new`` cannot be evaluated at compile time either. 
Raise statement =============== -It is impractical to consider ``raise`` a statement with side effects. +It is impractical to consider ``raise`` as a statement with side effects. Solution ======== Being side effect free does not suffice for compile time evaluation. However, -the evaluator can attempt to evaluate at compile time. +the evaluator can attempt to evaluate at compile time. diff --git a/doc/endb.txt b/doc/endb.txt index e2be59c50..3cc20cdb8 100755 --- a/doc/endb.txt +++ b/doc/endb.txt @@ -23,7 +23,7 @@ available for the debugger. If you start your program the debugger will immediately show a prompt on the console. You can now enter a command. The next sections -deal with the possible commands. As usual for Nimrod for all commands +deal with the possible commands. As usual in Nimrod in all commands underscores and case do not matter. Optional components of a command are listed in brackets ``[...]`` here. diff --git a/doc/lib.txt b/doc/lib.txt index a56aa5bf4..fa29ff6ff 100755 --- a/doc/lib.txt +++ b/doc/lib.txt @@ -12,7 +12,7 @@ Nimrod Standard Library Though the Nimrod Standard Library is still evolving, it is already quite usable. It is divided into *pure libraries*, *impure libraries* and *wrappers*. -Pure libraries do not depend on any external ``*.dll`` or ``lib*.so`` binary +Pure libraries do not depend on any external ``*.dll`` or ``lib*.so`` binary while impure libraries do. A wrapper is an impure library that is a very low-level interface to a C library. @@ -31,7 +31,7 @@ Core implicitly by the compiler. Do not import it directly. It relies on compiler magic to work. -* `macros <macros.html>`_ +* `macros <macros.html>`_ Contains the AST API and documentation of Nimrod for writing macros. 
@@ -39,8 +39,8 @@ String handling --------------- * `strutils <strutils.html>`_ - This module contains common string handling operations like converting a - string into uppercase, splitting a string into substrings, searching for + This module contains common string handling operations like changing + case of a string, splitting a string into substrings, searching for substrings, replacing substrings. * `parseutils <parseutils.html>`_ @@ -52,7 +52,7 @@ String handling style-insensitive mode. An efficient string substitution operator ``%`` for the string table is also provided. -* `unicode <unicode.html>`_ +* `unicode <unicode.html>`_ This module provides support to handle the Unicode UTF-8 encoding. * `re <re.html>`_ @@ -67,11 +67,11 @@ String handling Ropes can represent very long strings efficiently; especially concatenation is done in O(1) instead of O(n). -* `unidecode <unidecode.html>`_ +* `unidecode <unidecode.html>`_ This module provides Unicode to ASCII transliterations: It finds the sequence of ASCII characters that is the closest approximation to the Unicode string. - + Generic Operating System Services --------------------------------- @@ -97,8 +97,8 @@ Generic Operating System Services may provide other implementations for this standard stream interface. * `terminal <terminal.html>`_ - This module contains a few procedures to control the *terminal* - (also called *console*). The implementation simply uses ANSI escape + This module contains a few procedures to control the *terminal* + (also called *console*). The implementation simply uses ANSI escape sequences and does not depend on any other module. @@ -117,20 +117,20 @@ Internet Protocols and Support ------------------------------ * `cgi <cgi.html>`_ - This module implements helpers for CGI applictions. + This module implements helpers for CGI applictions. * `sockets <sockets.html>`_ This module implements a simple portable type-safe sockets layer. 
* `browsers <browsers.html>`_ - This module implements procs for opening URLs with the user's default + This module implements procs for opening URLs with the user's default browser. - + * `httpserver <httpserver.html>`_ This module implements a simple HTTP server. - + * `httpclient <httpclient.html>`_ - This module implements a simple HTTP client. + This module implements a simple HTTP client. Parsers @@ -149,14 +149,14 @@ Parsers as in the Nimrod programming language. * `parsexml <parsexml.html>`_ - The ``parsexml`` module implements a simple high performance XML/HTML parser. + The ``parsexml`` module implements a simple high performance XML/HTML parser. The only encoding that is supported is UTF-8. The parser has been designed - to be somewhat error correcting, so that even some "wild HTML" found on the - web can be parsed with it. + to be somewhat error correcting, so that even some "wild HTML" found on the + Web can be parsed with it. -* `parsecsv <parsecsv.html>`_ +* `parsecsv <parsecsv.html>`_ The ``parsecsv`` module implements a simple high performance CSV parser. - + * `parsesql <parsesql.html>`_ The ``parsesql`` module implements a simple high performance SQL parser. @@ -168,20 +168,20 @@ Parsers XML Processing -------------- -* `xmldom <xmldom.html>`_ +* `xmldom <xmldom.html>`_ This module implements the XML DOM Level 2. * `xmldomparser <xmldomparser.html>`_ This module parses an XML Document into a XML DOM Document representation. * `xmltree <xmltree.html>`_ - A simple XML tree. More efficient and simpler than the DOM. It also + A simple XML tree. More efficient and simpler than the DOM. It also contains a macro for XML/HTML code generation. -* `xmlparser <xmlparser.html>`_ +* `xmlparser <xmlparser.html>`_ This module parses an XML document and creates its XML tree representation. - -* `htmlparser <htmlparser.html>`_ + +* `htmlparser <htmlparser.html>`_ This module parses an HTML document and creates its XML tree representation. 
@@ -199,8 +199,8 @@ Cryptography and Hashing Multimedia support ------------------ -* `colors <colors.html>`_ - This module implements color handling for Nimrod. It is used by +* `colors <colors.html>`_ + This module implements color handling for Nimrod. It is used by the ``graphics`` module. @@ -211,7 +211,7 @@ Impure libraries * `graphics <graphics.html>`_ This module implements graphical output for Nimrod; the current implementation uses SDL but the interface is meant to support multiple - backends some day. + backends some day. * `dialogs <dialogs.html>`_ This module implements portable dialogs for Nimrod; the implementation @@ -220,10 +220,10 @@ Impure libraries * `zipfiles <zipfiles.html>`_ This module implements a zip archive creator/reader/modifier. - + * `web <web.html>`_ This module contains simple high-level procedures for dealing with the - web like loading the contents of a web page from an URL. + Web like loading the contents of a Web page from an URL. Database support @@ -232,13 +232,13 @@ Database support * `db_postgres <db_postgres.html>`_ A higher level PostgreSQL database wrapper. The same interface is implemented for other databases too. - + * `db_mysql <db_mysql.html>`_ - A higher level mySQL database wrapper. The same interface is implemented + A higher level MySQL database wrapper. The same interface is implemented for other databases too. - + * `db_sqlite <db_sqlite.html>`_ - A higher level mySQL database wrapper. The same interface is implemented + A higher level SQLite database wrapper. The same interface is implemented for other databases too. @@ -255,7 +255,7 @@ not contained in the distribution. You can then find them on the website. Contains a wrapper for the Win32 API. * `mysql <mysql.html>`_ Contains a wrapper for the mySQL API. -* `sqlite3 <sqlite3.html>`_ +* `sqlite3 <sqlite3.html>`_ Contains a wrapper for SQLite 3 API. * `libcurl <libcurl.html>`_ Contains a wrapper for the libcurl library. 
@@ -317,14 +317,14 @@ not contained in the distribution. You can then find them on the website. Part of the wrapper for Lua. * `lauxlib <lauxlib.html>`_ Part of the wrapper for Lua. -* `tcl <tcl.html>`_ +* `tcl <tcl.html>`_ Wrapper for the TCL programming language. * `python <python.html>`_ Wrapper for the Python programming language. * `odbcsql <odbcsql.html>`_ interface to the ODBC driver. * `zlib <zlib.html>`_ - Wrapper for the zlib library. + Wrapper for the zlib library. * `sdl <sdl.html>`_ Part of the wrapper for SDL. * `sdl_gfx <sdl_gfx.html>`_ @@ -379,7 +379,7 @@ not contained in the distribution. You can then find them on the website. Part of the wrapper for X11. * `libzip <libzip.html>`_ Interface to the `lib zip <http://www.nih.at/libzip/index.html>`_ library by - Dieter Baron and Thomas Klausner. -* `iup <iup.html>`_ + Dieter Baron and Thomas Klausner. +* `iup <iup.html>`_ Wrapper of the IUP GUI library. - + diff --git a/doc/manual.txt b/doc/manual.txt index 3f2f9c404..490901b64 100755 --- a/doc/manual.txt +++ b/doc/manual.txt @@ -1,2765 +1,2764 @@ -============= -Nimrod Manual -============= - -:Author: Andreas Rumpf -:Version: |nimrodversion| - -.. contents:: - - - "Complexity" seems to be a lot like "energy": you can transfer it from the end - user to one/some of the other players, but the total amount seems to remain - pretty much constant for a given task. -- Ran - -About this document -=================== - -**Note**: This document is a draft! Several of Nimrod's features need more -precise wording. This manual will evolve into a proper specification some -day. - -This document describes the lexis, the syntax, and the semantics of Nimrod. - -The language constructs are explained using an extended BNF, in -which ``(a)*`` means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and -``(a)?`` means an optional *a*; an alternative spelling for optional parts is -``[a]``. The ``|`` symbol is used to mark alternatives -and has the lowest precedence. 
Parentheses may be used to group elements. -Non-terminals start with a lowercase letter, abstract terminal symbols are in -UPPERCASE. Verbatim terminal symbols (including keywords) are quoted -with ``'``. An example:: - - ifStmt ::= 'if' expr ':' stmts ('elif' expr ':' stmts)* ['else' stmts] - -Other parts of Nimrod - like scoping rules or runtime semantics are only -described in an informal manner. The reason is that formal semantics are -difficult to write and understand. However, there is only one Nimrod -implementation, so one may consider it as the formal specification; -especially since the compiler's code is pretty clean (well, some parts of it). - - -Definitions -=========== - -A Nimrod program specifies a computation that acts on a memory consisting of -components called `locations`:idx:. A variable is basically a name for a -location. Each variable and location is of a certain `type`:idx:. The -variable's type is called `static type`:idx:, the location's type is called -`dynamic type`:idx:. If the static type is not the same as the dynamic type, -it is a super-type or subtype of the dynamic type. - -An `identifier`:idx: is a symbol declared as a name for a variable, type, -procedure, etc. The region of the program over which a declaration applies is -called the `scope`:idx: of the declaration. Scopes can be nested. The meaning -of an identifier is determined by the smallest enclosing scope in which the -identifier is declared. - -An expression specifies a computation that produces a value or location. -Expressions that produce locations are called `l-values`:idx:. An l-value -can denote either a location or the value the location contains, depending on -the context. Expressions whose values can be determined statically are called -`constant expressions`:idx:; they are never l-values. - -A `static error`:idx: is an error that the implementation detects before -program execution. Unless explicitly classified, an error is a static error. 
- -A `checked runtime error`:idx: is an error that the implementation detects -and reports at runtime. The method for reporting such errors is via *raising -exceptions*. However, the implementation provides a means to disable these -runtime checks. See the section pragmas_ for details. - -An `unchecked runtime error`:idx: is an error that is not guaranteed to be -detected, and can cause the subsequent behavior of the computation to -be arbitrary. Unchecked runtime errors cannot occur if only `safe`:idx: -language features are used. - - -Lexical Analysis -================ - -Encoding --------- - -All Nimrod source files are in the UTF-8 encoding (or its ASCII subset). Other -encodings are not supported. Any of the standard platform line termination -sequences can be used - the Unix form using ASCII LF (linefeed), the Windows -form using the ASCII sequence CR LF (return followed by linefeed), or the old -Macintosh form using the ASCII CR (return) character. All of these forms can be -used equally, regardless of platform. - - -Indentation ------------ - -Nimrod's standard grammar describes an `indentation sensitive`:idx: language. -This means that all the control structures are recognized by indentation. -Indentation consists only of spaces; tabulators are not allowed. - -The terminals ``IND`` (indentation), ``DED`` (dedentation) and ``SAD`` -(same indentation) are generated by the scanner, denoting an indentation. - -These terminals are only generated for lines that are not empty. - -The parser and the scanner communicate over a stack which indentation terminal -should be generated: the stack consists of integers counting the spaces. The -stack is initialized with a zero on its top. The scanner reads from the stack: -If the current indentation token consists of more spaces than the entry at the -top of the stack, a ``IND`` token is generated, else if it consists of the same -number of spaces, a ``SAD`` token is generated. 
If it consists of fewer spaces, -a ``DED`` token is generated for any item on the stack that is greater than the -current. These items are later popped from the stack by the parser. At the end -of the file, a ``DED`` token is generated for each number remaining on the -stack that is larger than zero. - -Because the grammar contains some optional ``IND`` tokens, the scanner cannot -push new indentation levels. This has to be done by the parser. The symbol -``indPush`` indicates that an ``IND`` token is expected; the current number of -leading spaces is pushed onto the stack by the parser. The symbol ``indPop`` -denotes that the parser pops an item from the indentation stack. No token is -consumed by ``indPop``. - - -Comments --------- - -`Comments`:idx: start anywhere outside a string or character literal with the -hash character ``#``. -Comments consist of a concatenation of `comment pieces`:idx:. A comment piece -starts with ``#`` and runs until the end of the line. The end of line characters -belong to the piece. If the next line only consists of a comment piece which is -aligned to the preceding one, it does not start a new comment: - -.. code-block:: nimrod - - i = 0 # This is a single comment over multiple lines belonging to the - # assignment statement. The scanner merges these two pieces. - # This is a new comment belonging to the current block, but to no particular - # statement. - i = i + 1 # This a new comment that is NOT - echo(i) # continued here, because this comment refers to the echo statement - -Comments are tokens; they are only allowed at certain places in the input file -as they belong to the syntax tree! This feature enables perfect source-to-source -transformations (such as pretty-printing) and superior documentation generators. -A nice side-effect is that the human reader of the code always knows exactly -which code snippet the comment refers to. 
- - -Identifiers & Keywords ----------------------- - -`Identifiers`:idx: in Nimrod can be any string of letters, digits -and underscores, beginning with a letter. Two immediate following -underscores ``__`` are not allowed:: - - letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff' - digit ::= '0'..'9' - IDENTIFIER ::= letter ( ['_'] letter | digit )* - -The following `keywords`:idx: are reserved and cannot be used as identifiers: - -.. code-block:: nimrod - :file: ../data/keywords.txt - -Some keywords are unused; they are reserved for future developments of the -language. - -Nimrod is a `style-insensitive`:idx: language. This means that it is not -case-sensitive and even underscores are ignored: -**type** is a reserved word, and so is **TYPE** or **T_Y_P_E**. The idea behind -this is that this allows programmers to use their own preferred spelling style -and libraries written by different programmers cannot use incompatible -conventions. A Nimrod-aware editor or IDE can show the identifiers as -preferred. Another advantage is that it frees the programmer from remembering -the exact spelling of an identifier. 
- - -String literals ---------------- - -`String literals`:idx: can be delimited by matching double quotes, and can -contain the following `escape sequences`:idx:\ : - -================== =================================================== - Escape sequence Meaning -================== =================================================== - ``\n`` `newline`:idx: - ``\r``, ``\c`` `carriage return`:idx: - ``\l`` `line feed`:idx: - ``\f`` `form feed`:idx: - ``\t`` `tabulator`:idx: - ``\v`` `vertical tabulator`:idx: - ``\\`` `backslash`:idx: - ``\"`` `quotation mark`:idx: - ``\'`` `apostrophe`:idx: - ``\d+`` `character with decimal value d`:idx:; - all decimal digits directly - following are used for the character - ``\a`` `alert`:idx: - ``\b`` `backspace`:idx: - ``\e`` `escape`:idx: `[ESC]`:idx: - ``\xHH`` `character with hex value HH`:idx:; - exactly two hex digits are allowed -================== =================================================== - - -Strings in Nimrod may contain any 8-bit value, except embedded zeros. - - -Triple quoted string literals ------------------------------ - -String literals can also be delimited by three double quotes -``"""`` ... ``"""``. -Literals in this form may run for several lines, may contain ``"`` and do not -interpret any escape sequences. -For convenience, when the opening ``"""`` is immediately followed by a newline, -the newline is not included in the string. The ending of the string literal is -defined by the pattern ``"""[^"]``, so this: - -.. code-block:: nimrod - """"long string within quotes"""" - -Produces:: - - "long string within quotes" - - -Raw string literals -------------------- - -There are also `raw string literals` that are preceded with the letter ``r`` -(or ``R``) and are delimited by matching double quotes (just like ordinary -string literals) and do not interpret the escape sequences. This is especially -convenient for regular expressions or Windows paths: - -.. 
code-block:: nimrod - - var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab - -To produce a single ``"`` within a raw string literal, it has to be doubled: - -.. code-block:: nimrod - - r"a""b" - -Produces:: - - a"b - -``r""""`` is not possible with this notation, because the three leading -quotes introduce a triple quoted string literal. - - -Generalized raw string literals -------------------------------- - -The construct ``identifier"string literal"`` (without whitespace between the -identifier and the opening quotation mark) is a -`generalized raw string literal`:idx:. It is a shortcut for the construct -``identifier(r"string literal")``, so it denotes a procedure call with a -raw string literal as its only argument. Generalized raw string literals -are especially convenient for embedding mini languages directly into Nimrod -(for example regular expressions). - -The construct ``identifier"""string literal"""`` exists too. It is a shortcut -for ``identifier("""string literal""")``. - - -Character literals ------------------- - -Character literals are enclosed in single quotes ``''`` and can contain the -same escape sequences as strings - with one exception: ``\n`` is not allowed -as it may be wider than one character (often it is the pair CR/LF for example). -A character is not an Unicode character but a single byte. The reason for this -is efficiency: for the overwhelming majority of use-cases, the resulting -programs will still handle UTF-8 properly as UTF-8 was specially designed for -this. -Another reason is that Nimrod can thus support ``array[char, int]`` or -``set[char]`` efficiently as many algorithms rely on this feature. 
- - -Numerical constants -------------------- - -`Numerical constants`:idx: are of a single type and have the form:: - - hexdigit ::= digit | 'A'..'F' | 'a'..'f' - octdigit ::= '0'..'7' - bindigit ::= '0'..'1' - INT_LIT ::= digit ( ['_'] digit )* - | '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )* - | '0o' octdigit ( ['_'] octdigit )* - | '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )* - - INT8_LIT ::= INT_LIT '\'' ('i' | 'I' ) '8' - INT16_LIT ::= INT_LIT '\'' ('i' | 'I' ) '16' - INT32_LIT ::= INT_LIT '\'' ('i' | 'I' ) '32' - INT64_LIT ::= INT_LIT '\'' ('i' | 'I' ) '64' - - exponent ::= ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )* - FLOAT_LIT ::= digit (['_'] digit)* ('.' (['_'] digit)* [exponent] |exponent) - FLOAT32_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '32' - FLOAT64_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '64' - - -As can be seen in the productions, numerical constants can contain underscores -for readability. Integer and floating point literals may be given in decimal (no -prefix), binary (prefix ``0b``), octal (prefix ``0o``) and hexadecimal -(prefix ``0x``) notation. - -There exists a literal for each numerical type that is -defined. The suffix starting with an apostrophe ('\'') is called a -`type suffix`:idx:. Literals without a type suffix are of the type ``int``, -unless the literal contains a dot or ``E|e`` in which case it is of -type ``float``. - -The type suffixes are: - -================= ========================= - Type Suffix Resulting type of literal -================= ========================= - ``'i8`` int8 - ``'i16`` int16 - ``'i32`` int32 - ``'i64`` int64 - ``'f32`` float32 - ``'f64`` float64 -================= ========================= - -Floating point literals may also be in binary, octal or hexadecimal -notation: -``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64`` -is approximately 1.72826e35 according to the IEEE floating point standard. 
- - - -Other tokens ------------- - -The following strings denote other tokens:: - - ( ) { } [ ] , ; [. .] {. .} (. .) - : = ^ .. ` - -`..`:tok: takes precedence over other tokens that contain a dot: `{..}`:tok: are -the three tokens `{`:tok:, `..`:tok:, `}`:tok: and not the two tokens -`{.`:tok:, `.}`:tok:. - -In Nimrod one can define his own operators. An `operator`:idx: is any -combination of the following characters that is not listed above:: - - + - * / < > - = @ $ ~ & % - ! ? ^ . | \ - -These keywords are also operators: -``and or not xor shl shr div mod in notin is isnot``. - - -Syntax -====== - -This section lists Nimrod's standard syntax in ENBF. How the parser receives -indentation tokens is already described in the `Lexical Analysis`_ section. - -Nimrod allows user-definable operators. -Binary operators have 8 different levels of precedence. For user-defined -operators, the precedence depends on the first character the operator consists -of. All binary operators are left-associative. - -================ ============================================== ================== =============== -Precedence level Operators First characters Terminal symbol -================ ============================================== ================== =============== - 7 (highest) ``$`` OP7 - 6 ``* / div mod shl shr %`` ``* % \ /`` OP6 - 5 ``+ -`` ``+ ~ |`` OP5 - 4 ``&`` ``&`` OP4 - 3 ``== <= < >= > != in not_in is isnot`` ``= < > !`` OP3 - 2 ``and`` OP2 - 1 ``or xor`` OP1 - 0 (lowest) ``? @ ^ ` : .`` OP0 -================ ============================================== ================== =============== - - -The grammar's start symbol is ``module``. - -.. include:: grammar.txt - :literal: - - - -Semantics -========= - -Constants ---------- - -`Constants`:idx: are symbols which are bound to a value. The constant's value -cannot change. The compiler must be able to evaluate the expression in a -constant declaration at compile time. 
- -Nimrod contains a sophisticated compile-time evaluator, so procedures which -have no side-effect can be used in constant expressions too: - -.. code-block:: nimrod - import strutils - const - constEval = contains("abc", 'b') # computed at compile time! - - -Types ------ - -All expressions have a `type`:idx: which is known at compile time. Nimrod -is statically typed. One can declare new types, which is in essence defining -an identifier that can be used to denote this custom type. - -These are the major type classes: - -* ordinal types (consist of integer, bool, character, enumeration - (and subranges thereof) types) -* floating point types -* string type -* structured types -* reference (pointer) type -* procedural type -* generic type - - -Ordinal types -~~~~~~~~~~~~~ -`Ordinal types`:idx: have the following characteristics: - -- Ordinal types are countable and ordered. This property allows - the operation of functions as ``Inc``, ``Ord``, ``Dec`` on ordinal types to - be defined. -- Ordinal values have a smallest possible value. Trying to count further - down than the smallest value gives a checked runtime or static error. -- Ordinal values have a largest possible value. Trying to count further - than the largest value gives a checked runtime or static error. - -Integers, bool, characters and enumeration types (and subranges of these -types) belong to ordinal types. - - -Pre-defined integer types -~~~~~~~~~~~~~~~~~~~~~~~~~ -These integer types are pre-defined: - -``int`` - the generic signed integer type; its size is platform dependent - (the compiler chooses the processor's fastest integer type). - This type should be used in general. An integer literal that has no type - suffix is of this type. - -intXX - additional signed integer types of XX bits use this naming scheme - (example: int16 is a 16 bit wide integer). - The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``. - Literals of these types have the suffix 'iXX. 
- - -There are no `unsigned integer`:idx: types, only `unsigned operations`:idx: -that treat their arguments as unsigned. Unsigned operations all wrap around; -they cannot lead to over- or underflow errors. Unsigned operations use the -``%`` suffix as convention: - -====================== ====================================================== -operation meaning -====================== ====================================================== -``a +% b`` unsigned integer addition -``a -% b`` unsigned integer subtraction -``a *% b`` unsigned integer multiplication -``a /% b`` unsigned integer division -``a %% b`` unsigned integer modulo operation -``a <% b`` treat ``a`` and ``b`` as unsigned and compare -``a <=% b`` treat ``a`` and ``b`` as unsigned and compare -``ze(a)`` extends the bits of ``a`` with zeros until it has the - width of the ``int`` type -``toU8(a)`` treats ``a`` as unsigned and converts it to an - unsigned integer of 8 bits (but still the - ``int8`` type) -``toU16(a)`` treats ``a`` as unsigned and converts it to an - unsigned integer of 16 bits (but still the - ``int16`` type) -``toU32(a)`` treats ``a`` as unsigned and converts it to an - unsigned integer of 32 bits (but still the - ``int32`` type) -====================== ====================================================== - -`Automatic type conversion`:idx: is performed in expressions where different -kinds of integer types are used: the smaller type is converted to the larger. -For further details, see `Convertible relation`_. - - -Pre-defined floating point types -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The following floating point types are pre-defined: - -``float`` - the generic floating point type; its size is platform dependent - (the compiler chooses the processor's fastest floating point type). - This type should be used in general. - -floatXX - an implementation may define additional floating point types of XX bits using - this naming scheme (example: float64 is a 64 bit wide float). 
The current - implementation supports ``float32`` and ``float64``. Literals of these types - have the suffix 'fXX. - - -Automatic type conversion in expressions with different kinds -of floating point types is performed: See `Convertible relation`_ for further -details. Arithmetic performed on floating point types follows the IEEE -standard. Integer types are not converted to floating point types automatically -and vice versa. - -The IEEE standard defines five types of floating-point exceptions: - -* Invalid: operations with mathematically invalid operands, - for example 0.0/0.0, sqrt(-1.0), and log(-37.8). -* Division by zero: divisor is zero and dividend is a finite nonzero number, - for example 1.0/0.0. -* Overflow: operation produces a result that exceeds the range of the exponent, - for example MAXDOUBLE+0.0000000000001e308. -* Underflow: operation produces a result that is too small to be represented - as a normal number, for example, MINDOUBLE * MINDOUBLE. -* Inexact: operation produces a result that cannot be represented with infinite - precision, for example, 2.0 / 3.0, log(1.1) and 0.1 in input. - -The IEEE exceptions are either ignored at runtime or mapped to the -Nimrod exceptions: `EFloatInvalidOp`:idx:, `EFloatDivByZero`:idx:, -`EFloatOverflow`:idx:, `EFloatUnderflow`:idx:, and `EFloatInexact`:idx:\. -These exceptions inherit from the `EFloatingPoint`:idx: base class. - -Nimrod provides the pragmas `NaNChecks`:idx: and `InfChecks`:idx:\ to control -whether the IEEE exceptions are ignored or trap a Nimrod exception: - -.. code-block:: nimrod - {.NanChecks: on, InfChecks: on.} - var a = 1.0 - var b = 0.0 - echo b / b # raises EFloatInvalidOp - echo a / b # raises EFloatOverflow - -In the current implementation ``EFloatDivByZero`` and ``EFloatInexact`` are -never raised. ``EFloatOverflow`` is raised instead of ``EFloatDivByZero``. -There is also a `floatChecks`:idx: pragma that is a short-cut for the -combination of ``NaNChecks`` and ``InfChecks`` pragmas.
``floatChecks`` are -turned off as default. - -The only operations that are affected by the ``floatChecks`` pragma are -the ``+``, ``-``, ``*``, ``/`` operators for floating point types. - - -Boolean type -~~~~~~~~~~~~ -The `boolean`:idx: type is named ``bool`` in Nimrod and can be one of the two -pre-defined values ``true`` and ``false``. Conditions in while, -if, elif, when statements need to be of type bool. - -This condition holds:: - - ord(false) == 0 and ord(true) == 1 - -The operators ``not, and, or, xor, <, <=, >, >=, !=, ==`` are defined -for the bool type. The ``and`` and ``or`` operators perform short-cut -evaluation. Example: - -.. code-block:: nimrod - - while p != nil and p.name != "xyz": - # p.name is not evaluated if p == nil - p = p.next - - -The size of the bool type is one byte. - - -Character type -~~~~~~~~~~~~~~ -The `character type`:idx: is named ``char`` in Nimrod. Its size is one byte. -Thus it cannot represent an UTF-8 character, but a part of it. -The reason for this is efficiency: for the overwhelming majority of use-cases, -the resulting programs will still handle UTF-8 properly as UTF-8 was specially -designed for this. -Another reason is that Nimrod can support ``array[char, int]`` or -``set[char]`` efficiently as many algorithms rely on this feature. The -`TRune` type is used for Unicode characters, it can represent any Unicode -character. ``TRune`` is declared in the ``unicode`` module. - - - -Enumeration types -~~~~~~~~~~~~~~~~~ -`Enumeration`:idx: types define a new type whose values consist of the ones -specified. The values are ordered. Example: - -.. code-block:: nimrod - - type - TDirection = enum - north, east, south, west - - -Now the following holds:: - - ord(north) == 0 - ord(east) == 1 - ord(south) == 2 - ord(west) == 3 - -Thus, north < east < south < west. The comparison operators can be used -with enumeration types. 
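The ordering of enumeration values can be checked with a short sketch. It is written under the assumption that present-day Nim (the later name of Nimrod) behaves as described above; the variable ``d`` is illustrative only.

.. code-block:: nimrod
  type
    TDirection = enum
      north, east, south, west

  var d = south
  assert ord(d) == 2             # values are numbered from 0 in declaration order
  assert north < d and d < west  # comparison follows the declaration order
  assert succ(d) == west         # the successor is the next declared value
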
- -For better interfacing to other programming languages, the fields of enum -types can be assigned an explicit ordinal value. However, the ordinal values -have to be in ascending order. A field whose ordinal value is not -explicitly given is assigned the value of the previous field + 1. - -An explicit ordered enum can have *holes*: - -.. code-block:: nimrod - type - TTokenType = enum - a = 2, b = 4, c = 89 # holes are valid - -However, it is then not an ordinal anymore, so it is not possible to use these -enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ`` -and ``pred`` are not available for them either. - - -Subrange types -~~~~~~~~~~~~~~ -A `subrange`:idx: type is a range of values from an ordinal type (the base -type). To define a subrange type, one must specify its limiting values: the -lowest and highest value of the type: - -.. code-block:: nimrod - type - TSubrange = range[0..5] - - -``TSubrange`` is a subrange of an integer which can only hold the values 0 -to 5. Assigning any other value to a variable of type ``TSubrange`` is a -checked runtime error (or static error if it can be statically -determined). Assignments from the base type to one of its subrange types -(and vice versa) are allowed. - -A subrange type has the same size as its base type (``int`` in the example). - - -String type -~~~~~~~~~~~ -All string literals are of the type `string`:idx:. A string in Nimrod is very -similar to a sequence of characters. However, strings in Nimrod are both -zero-terminated and have a length field. One can retrieve the length with the -builtin ``len`` procedure; the length never counts the terminating zero. -The assignment operator for strings always copies the string. -The ``&`` operator concatenates strings. - -Strings are compared by their lexicographical order. All comparison operators -are available. Strings can be indexed like arrays (lower bound is 0). Unlike -arrays, they can be used in case statements: - -.. 
code-block:: nimrod - - case paramStr(i) - of "-v": incl(options, optVerbose) - of "-h", "-?": incl(options, optHelp) - else: write(stdout, "invalid command line option!\n") - -Per convention, all strings are UTF-8 strings, but this is not enforced. For -example, when reading strings from binary files, they are merely a sequence of -bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the -i-th *unichar*. The iterator ``runes`` from the ``unicode`` -module can be used for iteration over all Unicode characters. - - -Structured types -~~~~~~~~~~~~~~~~ -A variable of a `structured type`:idx: can hold multiple values at the same -time. Structured types can be nested to unlimited levels. Arrays, sequences, -tuples, objects and sets belong to the structured types. - -Array and sequence types -~~~~~~~~~~~~~~~~~~~~~~~~ -`Arrays`:idx: are a homogeneous type, meaning that each element in the array -has the same type. Arrays always have a fixed length which is specified at -compile time (except for open arrays). They can be indexed by any ordinal type. -A parameter ``A`` may be an *open array*, in which case it is indexed by -integers from 0 to ``len(A)-1``. An array expression may be constructed by the -array constructor ``[]``. - -`Sequences`:idx: are similar to arrays but of dynamic length which may change -during runtime (like strings). A sequence ``S`` is always indexed by integers -from 0 to ``len(S)-1`` and its bounds are checked. Sequences can be -constructed by the array constructor ``[]`` in conjunction with the array to -sequence operator ``@``. Another way to allocate space for a sequence is to -call the built-in ``newSeq`` procedure. - -A sequence may be passed to a parameter that is of type *open array*. - -Example: - -.. 
code-block:: nimrod - - type - TIntArray = array[0..5, int] # an array that is indexed with 0..5 - TIntSeq = seq[int] # a sequence of integers - var - x: TIntArray - y: TIntSeq - x = [1, 2, 3, 4, 5, 6] # [] is the array constructor - y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence - -The lower bound of an array or sequence may be received by the built-in proc -``low()``, the higher bound by ``high()``. The length may be -received by ``len()``. ``low()`` for a sequence or an open array always returns -0, as this is the first valid index. -One can append elements to a sequence with the ``add()`` proc or the ``&`` -operator, and remove (and get) the last element of a sequence with the -``pop()`` proc. - -The notation ``x[i]`` can be used to access the i-th element of ``x``. - -Arrays are always bounds checked (at compile-time or at runtime). These -checks can be disabled via pragmas or invoking the compiler with the -``--boundChecks:off`` command line switch. - -An open array is also a means to implement passing a variable number of -arguments to a procedure. The compiler converts the list of arguments -to an array automatically: - -.. code-block:: nimrod - proc myWriteln(f: TFile, a: openarray[string]) = - for s in items(a): - write(f, s) - write(f, "\n") - - myWriteln(stdout, "abc", "def", "xyz") - # is transformed by the compiler to: - myWriteln(stdout, ["abc", "def", "xyz"]) - -This transformation is only done if the openarray parameter is the -last parameter in the procedure header. The current implementation does not -support nested open arrays. - - -Tuples and object types -~~~~~~~~~~~~~~~~~~~~~~~ -A variable of a `tuple`:idx: or `object`:idx: type is a heterogeneous storage -container. -A tuple or object defines various named *fields* of a type. A tuple also -defines an *order* of the fields. Tuples are meant for heterogeneous storage -types with no overhead and few abstraction possibilities. 
The constructor ``()`` -can be used to construct tuples. The order of the fields in the constructor -must match the order of the tuple's definition. Different tuple-types are -*equivalent* if they specify the same fields of the same type in the same -order. - -The assignment operator for tuples copies each component. -The default assignment operator for objects copies each component. Overloading -of the assignment operator for objects is not possible, but this may change in -future versions of the compiler. - -.. code-block:: nimrod - - type - TPerson = tuple[name: string, age: int] # type representing a person - # a person consists of a name - # and an age - var - person: TPerson - person = (name: "Peter", age: 30) - # the same, but less readable: - person = ("Peter", 30) - -The implementation aligns the fields for best access performance. The alignment -is compatible with the way the C compiler does it. - -Objects provide many features that tuples do not. Objects provide inheritance -and information hiding. Objects have access to their type at runtime, so that -the ``is`` operator can be used to determine the object's type. - -.. code-block:: nimrod - - type - TPerson = object - name*: string # the * means that `name` is accessible from other modules - age: int # no * means that the field is hidden - - TStudent = object of TPerson # a student is a person - id: int # with an id field - - var - student: TStudent - person: TPerson - assert(student is TStudent) # is true - -Object fields that should be visible from outside the defining module have to -be marked by ``*``. In contrast to tuples, different object types are -never *equivalent*. - - -Object variants -~~~~~~~~~~~~~~~ -Often an object hierarchy is overkill in certain situations where simple -`variant`:idx: types are needed. - -An example: - -.. 
code-block:: nimrod - - # This is an example how an abstract syntax tree could be modelled in Nimrod - type - TNodeKind = enum # the different node types - nkInt, # a leaf with an integer value - nkFloat, # a leaf with a float value - nkString, # a leaf with a string value - nkAdd, # an addition - nkSub, # a subtraction - nkIf # an if statement - PNode = ref TNode - TNode = object - case kind: TNodeKind # the ``kind`` field is the discriminator - of nkInt: intVal: int - of nkFloat: floatVal: float - of nkString: strVal: string - of nkAdd, nkSub: - leftOp, rightOp: PNode - of nkIf: - condition, thenPart, elsePart: PNode - - var - n: PNode - new(n) # creates a new node - n.kind = nkFloat - n.floatVal = 0.0 # valid, because ``n.kind==nkFloat``, so that it fits - - # the following statement raises an `EInvalidField` exception, because - # n.kind's value does not fit: - n.strVal = "" - -As can be seen from the example, an advantage over an object hierarchy is that -no casting between different object types is needed. Yet, access to invalid -object fields raises an exception. - - -Set type -~~~~~~~~ -The `set type`:idx: models the mathematical notion of a set. The set's -basetype can only be an ordinal type. The reason is that sets are implemented -as high performance bit vectors. - -Sets can be constructed via the set constructor: ``{}`` is the empty set. The -empty set is type compatible with any special set type. The constructor -can also be used to include elements (and ranges of elements) in the set: - -.. 
code-block:: nimrod - - {'a'..'z', '0'..'9'} # This constructs a set that contains the - # letters from 'a' to 'z' and the digits - # from '0' to '9' - -These operations are supported by sets: - -================== ======================================================== -operation meaning -================== ======================================================== -``A + B`` union of two sets -``A * B`` intersection of two sets -``A - B`` difference of two sets (A without B's elements) -``A == B`` set equality -``A <= B`` subset relation (A is subset of B or equal to B) -``A < B`` strong subset relation (A is a real subset of B) -``e in A`` set membership (A contains element e) -``A -+- B`` symmetric set difference (= (A - B) + (B - A)) -``card(A)`` the cardinality of A (number of elements in A) -``incl(A, elem)`` same as A = A + {elem} -``excl(A, elem)`` same as A = A - {elem} -================== ======================================================== - - -Reference and pointer types -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -References (similar to `pointers`:idx: in other programming languages) are a -way to introduce many-to-one relationships. This means different references can -point to and modify the same location in memory. - -Nimrod distinguishes between `traced`:idx: and `untraced`:idx: references. -Untraced references are also called *pointers*. Traced references point to -objects of a garbage collected heap, untraced references point to -manually allocated objects or to objects somewhere else in memory. Thus -untraced references are *unsafe*. However, for certain low-level operations -(accessing the hardware) untraced references are unavoidable. - -Traced references are declared with the **ref** keyword, untraced references -are declared with the **ptr** keyword. - -The ``^`` operator can be used to dereference a reference, the ``addr`` procedure -returns the address of an item. An address is always an untraced reference.
-Thus the usage of ``addr`` is an *unsafe* feature. - -The ``.`` (access a tuple/object field operator) -and ``[]`` (array/string/sequence index operator) operators perform implicit -dereferencing operations for reference types: - -.. code-block:: nimrod - - type - PNode = ref TNode - TNode = object - le, ri: PNode - data: int - - var - n: PNode - new(n) - n.data = 9 # no need to write n^ .data - -To allocate a new traced object, the built-in procedure ``new`` has to be used. -To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and -``realloc`` can be used. The documentation of the system module contains -further information. - -If a reference points to *nothing*, it has the value ``nil``. - -Special care has to be taken if an untraced object contains traced objects like -traced references, strings or sequences: in order to free everything properly, -the built-in procedure ``GCunref`` has to be called before freeing the -untraced memory manually! - -.. XXX finalizers for traced objects - -Procedural type -~~~~~~~~~~~~~~~ -A `procedural type`:idx: is internally a pointer to a procedure. ``nil`` is -an allowed value for variables of a procedural type. Nimrod uses procedural -types to achieve `functional`:idx: programming techniques. - -Example: - -.. code-block:: nimrod - - type - TCallback = proc (x: int) {.cdecl.} - - proc printItem(x: int) = ... - - proc forEach(c: TCallback) = - ... - - forEach(printItem) # this will NOT work because calling conventions differ - -A subtle issue with procedural types is that the calling convention of the -procedure influences the type compatibility: procedural types are only -compatible if they have the same calling convention. - -Nimrod supports these `calling conventions`:idx:, which are all incompatible to -each other: - -`stdcall`:idx: - This is the stdcall convention as specified by Microsoft. The generated C - procedure is declared with the ``__stdcall`` keyword.
- -`cdecl`:idx: - The cdecl convention means that a procedure shall use the same convention - as the C compiler. Under Windows the generated C procedure is declared with - the ``__cdecl`` keyword. - -`safecall`:idx: - This is the safecall convention as specified by Microsoft. The generated C - procedure is declared with the ``__safecall`` keyword. The word *safe* - refers to the fact that all hardware registers shall be pushed to the - hardware stack. - -`inline`:idx: - The inline convention means that the caller should not call the procedure, - but inline its code directly. Note that Nimrod does not inline, but leaves - this to the C compiler. Thus it generates ``__inline`` procedures. This is - only a hint for the compiler: it may completely ignore it and - it may inline procedures that are not marked as ``inline``. - -`fastcall`:idx: - Fastcall means different things to different C compilers. One gets whatever - the C ``__fastcall`` means. - -`nimcall`:idx: - Nimcall is the default convention used for Nimrod procedures. It is the - same as ``fastcall``, but only for C compilers that support ``fastcall``. - -`closure`:idx: - indicates that the procedure expects a context, a closure that needs - to be passed to the procedure. The calling convention ``nimcall`` is - compatible to ``closure``. - -`syscall`:idx: - The syscall convention is the same as ``__syscall`` in C. It is used for - interrupts. - -`noconv`:idx: - The generated C code will not have any explicit calling convention and thus - use the C compiler's default calling convention. This is needed because - Nimrod's default calling convention for procedures is ``fastcall`` to - improve speed. - -Most calling conventions exist only for the Windows 32-bit platform. - -Assigning/passing a procedure to a procedural variable is only allowed if one -of the following conditions holds: -1) The procedure that is accessed resides in the current module.
-2) The procedure is marked with the ``procvar`` pragma (see `procvar pragma`_). -3) The procedure has a calling convention that differs from ``nimcall``. -4) The procedure is anonymous. - -The rules' purpose is to prevent the case that extending a non-``procvar`` -procedure with default parameters breaks client code. - - -Distinct type -~~~~~~~~~~~~~ - -A distinct type is a new type derived from a `base type`:idx: that is -incompatible with its base type. In particular, it is an essential property -of a distinct type that it **does not** imply a subtype relation between it -and its base type. Explicit type conversions from a distinct type to its -base type and vice versa are allowed. - -A distinct type can be used to model different physical `units`:idx: with a -numerical base type, for example. The following example models currencies. - -Different currencies should not be mixed in monetary calculations. Distinct -types are a perfect tool to model different currencies: - -.. code-block:: nimrod - type - TDollar = distinct int - TEuro = distinct int - - var - d: TDollar - e: TEuro - - echo d + 12 - # Error: cannot add a number with no unit and a ``TDollar`` - -Unfortunately, ``d + 12.TDollar`` is not allowed either, -because ``+`` is defined for ``int`` (among others), not for ``TDollar``. So -a ``+`` for dollars needs to be defined: - -.. code-block:: - proc `+` (x, y: TDollar): TDollar = - result = TDollar(int(x) + int(y)) - -It does not make sense to multiply a dollar with a dollar, but with a -number without unit; and the same holds for division: - -.. code-block:: - proc `*` (x: TDollar, y: int): TDollar = - result = TDollar(int(x) * y) - - proc `*` (x: int, y: TDollar): TDollar = - result = TDollar(x * int(y)) - - proc `div` ... - -This quickly gets tedious. The implementations are trivial and the compiler -should not generate all this code only to optimize it away later - after all -``+`` for dollars should produce the same binary code as ``+`` for ints.
-The pragma ``borrow`` has been designed to solve this problem; in principle -it generates the above trivial implementations: - -.. code-block:: nimrod - proc `*` (x: TDollar, y: int): TDollar {.borrow.} - proc `*` (x: int, y: TDollar): TDollar {.borrow.} - proc `div` (x: TDollar, y: int): TDollar {.borrow.} - -The ``borrow`` pragma makes the compiler use the same implementation as -the proc that deals with the distinct type's base type, so no code is -generated. - -But it seems all this boilerplate code needs to be repeated for the ``TEuro`` -currency. This can be solved with templates_. - -.. code-block:: nimrod - template Additive(typ: typeDesc): stmt = - proc `+` *(x, y: typ): typ {.borrow.} - proc `-` *(x, y: typ): typ {.borrow.} - - # unary operators: - proc `+` *(x: typ): typ {.borrow.} - proc `-` *(x: typ): typ {.borrow.} - - template Multiplicative(typ, base: typeDesc): stmt = - proc `*` *(x: typ, y: base): typ {.borrow.} - proc `*` *(x: base, y: typ): typ {.borrow.} - proc `div` *(x: typ, y: base): typ {.borrow.} - proc `mod` *(x: typ, y: base): typ {.borrow.} - - template Comparable(typ: typeDesc): stmt = - proc `<` * (x, y: typ): bool {.borrow.} - proc `<=` * (x, y: typ): bool {.borrow.} - proc `==` * (x, y: typ): bool {.borrow.} - - template DefineCurrency(typ, base: expr): stmt = - type - typ* = distinct base - Additive(typ) - Multiplicative(typ, base) - Comparable(typ) - - DefineCurrency(TDollar, int) - DefineCurrency(TEuro, int) - - - -Type relations --------------- - -The following section defines several relations on types that are needed to -describe the type checking done by the compiler. - - -Type equality -~~~~~~~~~~~~~ -Nimrod uses structural type equivalence for most types. Only for objects, -enumerations and distinct types name equivalence is used. The following -algorithm (in pseudo-code) determines type equality: - -.. 
code-block:: nimrod - proc typeEqualsAux(a, b: PType, - s: var set[PType * PType]): bool = - if (a,b) in s: return true - incl(s, (a,b)) - if a.kind == b.kind: - case a.kind - of int, intXX, float, floatXX, char, string, cstring, pointer, bool, nil: - # leaf type: kinds identical; nothing more to check - result = true - of ref, ptr, var, set, seq, openarray: - result = typeEqualsAux(a.baseType, b.baseType, s) - of range: - result = typeEqualsAux(a.baseType, b.baseType, s) and - (a.rangeA == b.rangeA) and (a.rangeB == b.rangeB) - of array: - result = typeEqualsAux(a.baseType, b.baseType, s) and - typeEqualsAux(a.indexType, b.indexType, s) - of tuple: - if a.tupleLen == b.tupleLen: - for i in 0..a.tupleLen-1: - if not typeEqualsAux(a[i], b[i], s): return false - result = true - of object, enum, distinct: - result = a == b - of proc: - result = typeEqualsAux(a.parameterTuple, b.parameterTuple, s) and - typeEqualsAux(a.resultType, b.resultType, s) and - a.callingConvention == b.callingConvention - - proc typeEquals(a, b: PType): bool = - var s: set[PType * PType] = {} - result = typeEqualsAux(a, b, s) - -Since types are graphs which can have cycles, the above algorithm needs an -auxiliary set ``s`` to detect this case. - - -Subtype relation -~~~~~~~~~~~~~~~~ -If object ``a`` inherits from ``b``, ``a`` is a subtype of ``b``. This subtype -relation is extended to the types ``var``, ``ref``, ``ptr``: - -.. code-block:: nimrod - proc isSubtype(a, b: PType): bool = - if a.kind == b.kind: - case a.kind - of object: - var aa = a.baseType - while aa != nil and aa != b: aa = aa.baseType - result = aa == b - of var, ref, ptr: - result = isSubtype(a.baseType, b.baseType) - -.. XXX nil is a special value! - - -Convertible relation -~~~~~~~~~~~~~~~~~~~~ -A type ``a`` is **implicitly** convertible to type ``b`` iff the following -algorithm returns true: - -.. code-block:: nimrod - # XXX range types? 
- proc isImplicitlyConvertible(a, b: PType): bool = - case a.kind - of proc: - if b.kind == proc: - var x = a.parameterTuple - var y = b.parameterTuple - if x.tupleLen == y.tupleLen: - for i in 0.. x.tupleLen-1: - if not isSubtype(x[i], y[i]): return false - result = isSubtype(b.resultType, a.resultType) - of int8: result = b.kind in {int16, int32, int64, int} - of int16: result = b.kind in {int32, int64, int} - of int32: result = b.kind in {int64, int} - of float: result = b.kind in {float32, float64} - of float32: result = b.kind in {float64, float} - of float64: result = b.kind in {float32, float} - of seq: - result = b.kind == openArray and typeEquals(a.baseType, b.baseType) - of array: - result = b.kind == openArray and typeEquals(a.baseType, b.baseType) - if a.baseType == char and a.indexType.rangeA == 0: - result = b.kind == cstring - of cstring, ptr: - result = b.kind == pointer - of string: - result = b.kind == cstring - -A type ``a`` is **explicitly** convertible to type ``b`` iff the following -algorithm returns true: - -.. code-block:: nimrod - proc isIntegralType(t: PType): bool = - result = isOrdinal(t) or t.kind in {float, float32, float64} - - proc isExplicitlyConvertible(a, b: PType): bool = - if isImplicitlyConvertible(a, b): return true - if isIntegralType(a) and isIntegralType(b): return true - if isSubtype(a, b) or isSubtype(b, a): return true - if a.kind == distinct and typeEquals(a.baseType, b): return true - if b.kind == distinct and typeEquals(b.baseType, a): return true - return false - - -Assignment compatibility -~~~~~~~~~~~~~~~~~~~~~~~~ - -An expression ``b`` can be assigned to an expression ``a`` iff ``a`` is an -`l-value` and ``isImplicitlyConvertible(b.typ, a.typ)`` holds. - - -Overloading resolution -~~~~~~~~~~~~~~~~~~~~~~ - -To be written. - - -Statements and expressions --------------------------- -Nimrod uses the common statement/expression paradigm: `Statements`:idx: do not -produce a value in contrast to expressions.
Call expressions are statements. -If the called procedure returns a value, it is not a valid statement -as statements do not produce values. To evaluate an expression for -side-effects and throw its value away, one can use the ``discard`` statement. - -Statements are separated into `simple statements`:idx: and -`complex statements`:idx:. -Simple statements are statements that cannot contain other statements like -assignments, calls or the ``return`` statement; complex statements can -contain other statements. To avoid the `dangling else problem`:idx:, complex -statements always have to be indented:: - - simpleStmt ::= returnStmt - | yieldStmt - | discardStmt - | raiseStmt - | breakStmt - | continueStmt - | pragma - | importStmt - | fromStmt - | includeStmt - | exprStmt - complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt - | blockStmt | asmStmt - | procDecl | iteratorDecl | macroDecl | templateDecl - | constSection | typeSection | whenStmt | varSection - - - -Discard statement -~~~~~~~~~~~~~~~~~ - -Syntax:: - - discardStmt ::= 'discard' expr - -Example: - -.. code-block:: nimrod - - discard proc_call("arg1", "arg2") # discard the return value of `proc_call` - -The `discard`:idx: statement evaluates its expression for side-effects and -throws the expression's resulting value away. If the expression has no -side-effects, this generates a static error. Ignoring the return value of a -procedure without using a discard statement is a static error too. - - -Var statement -~~~~~~~~~~~~~ - -Syntax:: - - colonOrEquals ::= ':' typeDesc ['=' expr] | '=' expr - varField ::= symbol ['*'] [pragma] - varPart ::= symbol (comma symbol)* [comma] colonOrEquals [COMMENT | IND COMMENT] - varSection ::= 'var' (varPart - | indPush (COMMENT|varPart) - (SAD (COMMENT|varPart))* DED indPop) - - -`Var`:idx: statements declare new local and global variables and -initialize them. A comma separated list of variables can be used to specify -variables of the same type: - -.. 
code-block:: nimrod - - var - a: int = 0 - x, y, z: int - -If an initializer is given the type can be omitted: the variable is then of the -same type as the initializing expression. Variables are always initialized -with a default value if there is no initializing expression. The default -value depends on the type and is always a zero in binary. - -============================ ============================================== -Type default value -============================ ============================================== -any integer type 0 -any float 0.0 -char '\\0' -bool false -ref or pointer type nil -procedural type nil -sequence nil (*not* ``@[]``) -string nil (*not* "") -tuple[x: A, y: B, ...] (default(A), default(B), ...) - (analogous for objects) -array[0..., T] [default(T), ...] -range[T] default(T); this may be out of the valid range -T = enum cast[T](0); this may be an invalid value -============================ ============================================== - - -Const section -~~~~~~~~~~~~~ - -Syntax:: - - colonAndEquals ::= [':' typeDesc] '=' expr - - constDecl ::= symbol ['*'] [pragma] colonAndEquals [COMMENT | IND COMMENT] - | COMMENT - constSection ::= 'const' indPush constDecl (SAD constDecl)* DED indPop - - -Example: - -.. code-block:: nimrod - - const - MyFilename = "/home/my/file.txt" - debugMode: bool = false - -The `const`:idx: section declares symbolic constants. A symbolic constant is -a name for a constant expression. Symbolic constants only allow read-access. - - -If statement -~~~~~~~~~~~~ - -Syntax:: - - ifStmt ::= 'if' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] - -Example: - -.. 
code-block:: nimrod - - var name = readLine(stdin) - - if name == "Andreas": - echo("What a nice name!") - elif name == "": - echo("Don't you have a name?") - else: - echo("Boring name...") - -The `if`:idx: statement is a simple way to make a branch in the control flow: -The expression after the keyword ``if`` is evaluated, if it is true -the corresponding statements after the ``:`` are executed. Otherwise -the expression after the ``elif`` is evaluated (if there is an -``elif`` branch), if it is true the corresponding statements after -the ``:`` are executed. This goes on until the last ``elif``. If all -conditions fail, the ``else`` part is executed. If there is no ``else`` -part, execution continues with the statement after the ``if`` statement. - - -Case statement -~~~~~~~~~~~~~~ - -Syntax:: - - caseStmt ::= 'case' expr [':'] ('of' sliceExprList ':' stmt)* - ('elif' expr ':' stmt)* - ['else' ':' stmt] - -Example: - -.. code-block:: nimrod - - case readline(stdin) - of "delete-everything", "restart-computer": - echo("permission denied") - of "go-for-a-walk": echo("please yourself") - else: echo("unknown command") - -The `case`:idx: statement is similar to the if statement, but it represents -a multi-branch selection. The expression after the keyword ``case`` is -evaluated and if its value is in a *slicelist* the corresponding statements -(after the ``of`` keyword) are executed. If the value is not in any -given *slicelist* the ``else`` part is executed. If there is no ``else`` -part and not all possible values that ``expr`` can hold occur in a -``slicelist``, a static error occurs. This holds only for expressions of -ordinal types. -If the expression is not of an ordinal type, and no ``else`` part is -given, control passes after the ``case`` statement. - -To suppress the static error in the ordinal case an ``else`` part with a ``nil`` -statement can be used. 
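The exhaustiveness rule can be sketched as follows. This is only an illustration: the procs ``describe`` and ``isVowel`` and the subrange ``range[0..3]`` are made-up names, not part of any library. In ``describe`` every value of the ordinal type appears in some slice list, so no ``else`` part is needed; in ``isVowel`` the ``else`` part covers all remaining characters:

.. code-block:: nimrod

  proc describe(n: range[0..3]): string =
    # all four possible values are listed, so the compiler accepts
    # this ``case`` statement without an ``else`` part
    case n
    of 0: result = "zero"
    of 1, 2: result = "small"
    of 3: result = "three"

  proc isVowel(c: char): bool =
    # ``char`` has 256 possible values; the ``else`` part handles the rest
    case c
    of 'a', 'e', 'i', 'o', 'u': result = true
    else: result = false

Removing the ``of 3`` branch from ``describe`` would be expected to trigger the static error described above, because the value 3 would no longer be covered.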
- - -When statement -~~~~~~~~~~~~~~ - -Syntax:: - - whenStmt ::= 'when' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] - -Example: - -.. code-block:: nimrod - - when sizeof(int) == 2: - echo("running on a 16 bit system!") - elif sizeof(int) == 4: - echo("running on a 32 bit system!") - elif sizeof(int) == 8: - echo("running on a 64 bit system!") - else: - echo("cannot happen!") - -The `when`:idx: statement is almost identical to the ``if`` statement with some -exceptions: - -* Each ``expr`` has to be a constant expression (of type ``bool``). -* The statements do not open a new scope. -* The statements that belong to the expression that evaluated to true are - translated by the compiler, the other statements are not checked for - semantics! However, each ``expr`` is checked for semantics. - -The ``when`` statement enables conditional compilation techniques. As -a special syntactic extension, the ``when`` construct is also available -within ``object`` definitions. - - -Raise statement -~~~~~~~~~~~~~~~ - -Syntax:: - - raiseStmt ::= 'raise' [expr] - -Example: - -.. code-block:: nimrod - raise newEOS("operating system failed") - -Apart from built-in operations like array indexing, memory allocation, etc. -the ``raise`` statement is the only way to raise an exception. - -.. XXX document this better! - -If no exception name is given, the current exception is `re-raised`:idx:. The -`ENoExceptionToReraise`:idx: exception is raised if there is no exception to -re-raise. It follows that the ``raise`` statement *always* raises an -exception. - - -Try statement -~~~~~~~~~~~~~ - -Syntax:: - - qualifiedIdent ::= symbol ['.' symbol] - exceptList ::= [qualifiedIdent (comma qualifiedIdent)* [comma]] - tryStmt ::= 'try' ':' stmt - ('except' exceptList ':' stmt)* - ['finally' ':' stmt] - -Example: - -.. 
code-block:: nimrod - # read the first two lines of a text file that should contain numbers - # and tries to add them - var - f: TFile - if open(f, "numbers.txt"): - try: - var a = readLine(f) - var b = readLine(f) - echo("sum: " & $(parseInt(a) + parseInt(b))) - except EOverflow: - echo("overflow!") - except EInvalidValue: - echo("could not convert string to integer") - except EIO: - echo("IO error!") - except: - echo("Unknown exception!") - finally: - close(f) - -The statements after the `try`:idx: are executed in sequential order unless -an exception ``e`` is raised. If the exception type of ``e`` matches any -of the list ``exceptlist`` the corresponding statements are executed. -The statements following the ``except`` clauses are called -`exception handlers`:idx:. - -The empty `except`:idx: clause is executed if there is an exception that is -in no list. It is similar to an ``else`` clause in ``if`` statements. - -If there is a `finally`:idx: clause, it is always executed after the -exception handlers. - -The exception is *consumed* in an exception handler. However, an -exception handler may raise another exception. If the exception is not -handled, it is propagated through the call stack. This means that often -the rest of the procedure - that is not within a ``finally`` clause - -is not executed (if an exception occurs). - - -Return statement -~~~~~~~~~~~~~~~~ - -Syntax:: - - returnStmt ::= 'return' [expr] - -Example: - -.. code-block:: nimrod - return 40+2 - -The `return`:idx: statement ends the execution of the current procedure. -It is only allowed in procedures. If there is an ``expr``, this is syntactic -sugar for: - -.. code-block:: nimrod - result = expr - return result - -``return`` without an expression is a short notation for ``return result`` if -the proc has a return type. The `result`:idx: variable is always the return -value of the procedure. It is automatically declared by the compiler. 
As all -variables, ``result`` is initialized to (binary) zero: - -.. code-block:: nimrod - proc returnZero(): int = - # implicitly returns 0 - - -Yield statement -~~~~~~~~~~~~~~~ - -Syntax:: - - yieldStmt ::= 'yield' expr - -Example: - -.. code-block:: nimrod - yield (1, 2, 3) - -The `yield`:idx: statement is used instead of the ``return`` statement in -iterators. It is only valid in iterators. Execution is returned to the body -of the for loop that called the iterator. Yield does not end the iteration -process, but execution is passed back to the iterator if the next iteration -starts. See the section about iterators (`Iterators and the for statement`_) -for further information. - - -Block statement -~~~~~~~~~~~~~~~ - -Syntax:: - - blockStmt ::= 'block' [symbol] ':' stmt - -Example: - -.. code-block:: nimrod - var found = false - block myblock: - for i in 0..3: - for j in 0..3: - if a[j][i] == 7: - found = true - break myblock # leave the block, in this case both for-loops - echo(found) - -The block statement is a means to group statements to a (named) `block`:idx:. -Inside the block, the ``break`` statement is allowed to leave the block -immediately. A ``break`` statement can contain a name of a surrounding -block to specify which block is to leave. - - -Break statement -~~~~~~~~~~~~~~~ - -Syntax:: - - breakStmt ::= 'break' [symbol] - -Example: - -.. code-block:: nimrod - break - -The `break`:idx: statement is used to leave a block immediately. If ``symbol`` -is given, it is the name of the enclosing block that is to leave. If it is -absent, the innermost block is left. - - -While statement -~~~~~~~~~~~~~~~ - -Syntax:: - - whileStmt ::= 'while' expr ':' stmt - -Example: - -.. code-block:: nimrod - echo("Please tell me your password: \n") - var pw = readLine(stdin) - while pw != "12345": - echo("Wrong password! Next try: \n") - pw = readLine(stdin) - - -The `while`:idx: statement is executed until the ``expr`` evaluates to false. -Endless loops are no error. 
``while`` statements open an `implicit block`, -so that they can be left with a ``break`` statement. - - -Continue statement -~~~~~~~~~~~~~~~~~~ - -Syntax:: - - continueStmt ::= 'continue' - -A `continue`:idx: statement leads to the immediate next iteration of the -surrounding loop construct. It is only allowed within a loop. A continue -statement is syntactic sugar for a nested block: - -.. code-block:: nimrod - while expr1: - stmt1 - continue - stmt2 - -Is equivalent to: - -.. code-block:: nimrod - while expr1: - block myBlockName: - stmt1 - break myBlockName - stmt2 - - -Assembler statement -~~~~~~~~~~~~~~~~~~~ -Syntax:: - - asmStmt ::= 'asm' [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT) - -The direct embedding of `assembler`:idx: code into Nimrod code is supported -by the unsafe ``asm`` statement. Identifiers in the assembler code that refer to -Nimrod identifiers shall be enclosed in a special character which can be -specified in the statement's pragmas. The default special character is ``'`'``. - - -If expression -~~~~~~~~~~~~~ - -An `if expression` is almost like an if statement, but it is an expression. -Example: - -.. code-block:: nimrod - p(if x > 8: 9 else: 10) - -An if expression always results in a value, so the ``else`` part is -required. ``Elif`` parts are also allowed (but unlikely to be good -style). - - -Type conversions -~~~~~~~~~~~~~~~~ -Syntactically a `type conversion` is like a procedure call, but a -type name replaces the procedure name. A type conversion is always -safe in the sense that a failure to convert a type to another -results in an exception (if it cannot be determined statically). - - -Type casts -~~~~~~~~~~ -Example: - -.. code-block:: nimrod - cast[int](x) - -Type casts are a crude mechanism to interpret the bit pattern of -an expression as if it would be of another type. Type casts are -only needed for low-level programming and are inherently unsafe. 
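The difference between a type conversion and a type cast can be sketched as follows. The variable names are illustrative, and the bit pattern mentioned in the comment assumes that ``float64`` is stored as an IEEE 754 double:

.. code-block:: nimrod

  var code = 65
  var letter = char(code)     # type conversion: the value is converted,
                              # so 65 becomes the character 'A'

  var f: float64 = 1.0
  var bits = cast[int64](f)   # type cast: the 8 bytes of ``f`` are
                              # reinterpreted as an integer
  # ``bits`` now holds 0x3FF0000000000000; the numeric value 1.0
  # is not preserved by the cast

The conversion keeps the meaning of the value and is checked, whereas the cast only keeps the bits, which is why it is reserved for low-level code.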
- - -The addr operator -~~~~~~~~~~~~~~~~~ -The `addr` operator returns the address of an l-value. If the -type of the location is ``T``, the `addr` operator result is -of the type ``ptr T``. Taking the address of an object that resides -on the stack is **unsafe**, as the pointer may live longer than the -object on the stack and can thus reference a non-existing object. - - -Procedures -~~~~~~~~~~ -What most programming languages call `methods`:idx: or `functions`:idx: are -called `procedures`:idx: in Nimrod (which is the correct terminology). A -procedure declaration defines an identifier and associates it with a block -of code. -A procedure may call itself recursively. A parameter may be given a default -value that is used if the caller does not provide a value for this parameter. -The syntax is:: - - param ::= symbol (comma symbol)* (':' typeDesc ['=' expr] | '=' expr) - paramList ::= ['(' [param (comma param)*] optPar ')'] [':' typeDesc] - - genericParam ::= symbol [':' typeDesc] ['=' expr] - genericParams ::= '[' genericParam (comma genericParam)* optPar ']' - - routineDecl ::= symbol ['*'] [genericParams] paramList [pragma] ['=' stmt] - procDecl ::= 'proc' routineDecl - - -If the ``= stmt`` part is missing, it is a `forward`:idx: declaration. If -the proc returns a value, the procedure body can access an implicitly declared -variable named `result`:idx: that represents the return value. Procs can be -overloaded. The overloading resolution algorithm tries to find the proc that is -the best match for the arguments. Example: - -.. code-block:: nimrod - - proc toLower(c: Char): Char = # toLower for characters - if c in {'A'..'Z'}: - result = chr(ord(c) + (ord('a') - ord('A'))) - else: - result = c - - proc toLower(s: string): string = # toLower for strings - result = newString(len(s)) - for i in 0..len(s) - 1: - result[i] = toLower(s[i]) # calls toLower for characters; no recursion! - -Calling a procedure can be done in many different ways: - -.. 
code-block:: nimrod - proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ... - - # call with positional arguments # parameter bindings: - callme(0, 1, "abc", '\t', true) # (x=0, y=1, s="abc", c='\t', b=true) - # call with named and positional arguments: - callme(y=1, x=0, "abd", '\t') # (x=0, y=1, s="abd", c='\t', b=false) - # call with named arguments (order is not relevant): - callme(c='\t', y=1, x=0) # (x=0, y=1, s="", c='\t', b=false) - # call as a command statement: no () needed: - callme 0, 1, "abc", '\t' - - -A procedure cannot modify its parameters (unless the parameters have the type -`var`). - -`Operators`:idx: are procedures with a special operator symbol as identifier: - -.. code-block:: nimrod - proc `$` (x: int): string = - # converts an integer to a string; this is a prefix operator. - return intToStr(x) - -Operators with one parameter are prefix operators, operators with two -parameters are infix operators. (However, the parser distinguishes these from -the operator's position within an expression.) There is no way to declare -postfix operators: all postfix operators are built-in and handled by the -grammar explicitly. - -Any operator can be called like an ordinary proc with the '`opr`' -notation. (Thus an operator can have more than two parameters): - -.. code-block:: nimrod - proc `*+` (a, b, c: int): int = - # Multiply and add - return a * b + c - - # 3 * 4 + 6, written with the operators called like ordinary procs: - assert `*+`(3, 4, 6) == `+`(`*`(3, 4), 6) - - - -Var parameters -~~~~~~~~~~~~~~ -The type of a parameter may be prefixed with the ``var`` keyword: - -.. code-block:: nimrod - proc divmod(a, b: int, - res, remainder: var int) = - res = a div b - remainder = a mod b - - var - x, y: int - - divmod(8, 5, x, y) # modifies x and y - assert x == 1 - assert y == 3 - -In the example, ``res`` and ``remainder`` are `var parameters`. -Var parameters can be modified by the procedure and the changes are -visible to the caller. The argument passed to a var parameter has to be -an l-value. 
Var parameters are implemented as hidden pointers. The -above example is equivalent to: - -.. code-block:: nimrod - proc divmod(a, b: int, - res, remainder: ptr int) = - res^ = a div b - remainder^ = a mod b - - var - x, y: int - divmod(8, 5, addr(x), addr(y)) - assert x == 1 - assert y == 3 - -In the examples, var parameters or pointers are used to provide two -return values. This can be done in a cleaner way by returning a tuple: - -.. code-block:: nimrod - proc divmod(a, b: int): tuple[res, remainder: int] = - return (a div b, a mod b) - - var t = divmod(8, 5) - assert t.res == 1 - assert t.remainder == 3 - -One can use `tuple unpacking`:idx: to access the tuple's fields: - -.. code-block:: nimrod - var (x, y) = divmod(8, 5) # tuple unpacking - assert x == 1 - assert y == 3 - - -Overloading of the subscript operator -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded. -Overloading support is only possible if the first parameter has no type that -already supports the built-in ``[]`` notation. Currently the compiler -does not check this. XXX Multiple indexes - - -Multi-methods -~~~~~~~~~~~~~ - -Procedures always use static dispatch. `Multi-methods`:idx: use dynamic -dispatch. - -.. code-block:: nimrod - type - TExpr = object ## abstract base class for an expression - TLiteral = object of TExpr - x: int - TPlusExpr = object of TExpr - a, b: ref TExpr - - method eval(e: ref TExpr): int = - # override this base method - quit "to override!" 
- - method eval(e: ref TLiteral): int = return e.x - - method eval(e: ref TPlusExpr): int = - # watch out: relies on dynamic binding - return eval(e.a) + eval(e.b) - - proc newLit(x: int): ref TLiteral = - new(result) - result.x = x - - proc newPlus(a, b: ref TExpr): ref TPlusExpr = - new(result) - result.a = a - result.b = b - - echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4))) - -In the example the constructors ``newLit`` and ``newPlus`` are procs -because they should use static binding, but ``eval`` is a method because it -requires dynamic binding. - -In a multi-method all parameters that have an object type are used for the -dispatching: - -.. code-block:: nimrod - type - TThing = object - TUnit = object of TThing - x: int - - method collide(a, b: TThing) {.inline.} = - quit "to override!" - - method collide(a: TThing, b: TUnit) {.inline.} = - echo "1" - - method collide(a: TUnit, b: TThing) {.inline.} = - echo "2" - - var - a, b: TUnit - collide(a, b) # output: 2 - - -Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over -collide 1 because the resolution works from left to right. -In the example ``TUnit, TThing`` is preferred over ``TThing, TUnit``. - -**Performance note**: Nimrod does not produce a virtual method table, but -generates dispatch trees. This avoids the expensive indirect branch for method -calls and enables inlining. However, other optimizations like compile time -evaluation or dead code elimination do not work with methods. - - -Iterators and the for statement -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Syntax:: - - forStmt ::= 'for' symbol (comma symbol)* [comma] 'in' expr ['..' 
expr] ':' stmt - - param ::= symbol (comma symbol)* [comma] ':' typeDesc - paramList ::= ['(' [param (comma param)* [comma]] ')'] [':' typeDesc] - - genericParam ::= symbol [':' typeDesc] - genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' - - iteratorDecl ::= 'iterator' symbol ['*'] [genericParams] paramList [pragma] - ['=' stmt] - -The `for`:idx: statement is an abstract mechanism to iterate over the elements -of a container. It relies on an `iterator`:idx: to do so. Like ``while`` -statements, ``for`` statements open an `implicit block`:idx:, so that they -can be left with a ``break`` statement. The ``for`` loop declares -iteration variables (``ch`` in the example below); their scope reaches until the -end of the loop body. The iteration variables' types are inferred from the -return type of the iterator. - -An iterator is similar to a procedure, except that it is always called in the -context of a ``for`` loop. Iterators provide a way to specify the iteration over -an abstract type. The ``yield`` statement in the called iterator plays a key -role in the execution of a ``for`` loop. Whenever a ``yield`` statement is -reached the data is bound to the ``for`` loop variables and control continues -in the body of the ``for`` loop. The iterator's local variables and execution -state are automatically saved between calls. Example: - -.. code-block:: nimrod - # this definition exists in the system module - iterator items*(a: string): char {.inline.} = - var i = 0 - while i < len(a): - yield a[i] - inc(i) - - for ch in items("hello world"): # `ch` is an iteration variable - echo(ch) - -The compiler generates code as if the programmer had written this: - -.. code-block:: nimrod - var i = 0 - while i < len(a): - var ch = a[i] - echo(ch) - inc(i) - -The current implementation always inlines the iterator code leading to zero -overhead for the abstraction. But this may increase the code size. 
Later -versions of the compiler will only inline iterators which have the calling -convention ``inline``. - -If the iterator yields a tuple, there have to be as many iteration variables -as there are components in the tuple. The i'th iteration variable's type is -the one of the i'th component. - - -Type sections -~~~~~~~~~~~~~ - -Syntax:: - - typeDef ::= typeDesc | objectDef | enumDef - - genericParam ::= symbol [':' typeDesc] - genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' - - typeDecl ::= COMMENT - | symbol ['*'] [genericParams] ['=' typeDef] [COMMENT|IND COMMENT] - - typeSection ::= 'type' indPush typeDecl (SAD typeDecl)* DED indPop - - -Example: - -.. code-block:: nimrod - type # example demonstrates mutually recursive types - PNode = ref TNode # a traced pointer to a TNode - TNode = object - le, ri: PNode # left and right subtrees - sym: ref TSym # leaves contain a reference to a TSym - - TSym = object # a symbol - name: string # the symbol's name - line: int # the line the symbol was declared in - code: PNode # the symbol's abstract syntax tree - -A `type`:idx: section begins with the ``type`` keyword. It contains multiple -type definitions. A type definition binds a type to a name. Type definitions -can be recursive or even mutually recursive. Mutually recursive types are only -possible within a single ``type`` section. - - -Generics -~~~~~~~~ - -Example: - -.. 
code-block:: nimrod - type - TBinaryTree[T] = object # TBinaryTree is a generic type - # with generic param ``T`` - le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil - data: T # the data stored in a node - PBinaryTree[T] = ref TBinaryTree[T] # a shorthand for notational convenience - - proc newNode[T](data: T): PBinaryTree[T] = # constructor for a node - new(result) - result.data = data - - proc add[T](root: var PBinaryTree[T], n: PBinaryTree[T]) = - if root == nil: - root = n - else: - var it = root - while it != nil: - var c = cmp(it.data, n.data) # compare the data items; uses - # the generic ``cmp`` proc that works for - # any type that has a ``==`` and ``<`` - # operator - if c < 0: - if it.le == nil: - it.le = n - return - it = it.le - else: - if it.ri == nil: - it.ri = n - return - it = it.ri - - iterator inorder[T](root: PBinaryTree[T]): T = - # inorder traversal of a binary tree - # recursive iterators are not yet implemented, so this does not work in - # the current compiler! - if root.le != nil: yield inorder(root.le) - yield root.data - if root.ri != nil: yield inorder(root.ri) - - var - root: PBinaryTree[string] # instantiate a PBinaryTree with the type string - add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and - add(root, newNode("world")) # ``add`` - for str in inorder(root): - writeln(stdout, str) - -`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types with -`type parameters`:idx:. Depending on context, the brackets are used either to -introduce type parameters or to instantiate a generic proc, iterator or type. - - -Templates -~~~~~~~~~ - -A `template`:idx: is a simple form of a macro: It is a simple substitution -mechanism that operates on Nimrod's abstract syntax trees. It is processed in -the semantic pass of the compiler. - -The syntax to *invoke* a template is the same as calling a procedure. - -Example: - -.. 
code-block:: nimrod - template `!=` (a, b: expr): expr = - # this definition exists in the System module - not (a == b) - - assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6)) - -The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact -templates: - -| ``a > b`` is transformed into ``b < a``. -| ``a in b`` is transformed into ``contains(b, a)``. -| ``notin`` and ``isnot`` have the obvious meanings. - -The "types" of templates can be the symbols ``expr`` (stands for *expression*), -``stmt`` (stands for *statement*) or ``typedesc`` (stands for *type -description*). These are no real types, they just help the compiler parsing. -Real types can be used too; this implies that expressions are expected. -However, for parameter type checking the arguments are semantically checked -before being passed to the template. Other arguments are not semantically -checked before being passed to the template. - -The template body does not open a new scope. To open a new scope a ``block`` -statement can be used: - -.. code-block:: nimrod - template declareInScope(x: expr, t: typeDesc): stmt = - var x: t - - template declareInNewScope(x: expr, t: typeDesc): stmt = - # open a new scope: - block: - var x: t - - declareInScope(a, int) - a = 42 # works, `a` is known here - - declareInNewScope(b, int) - b = 42 # does not work, `b` is unknown - - -If there is a ``stmt`` parameter it should be the last in the template -declaration, because statements are passed to a template via a -special ``:`` syntax: - -.. code-block:: nimrod - - template withFile(f, fn, mode: expr, actions: stmt): stmt = - block: - var f: TFile - if open(f, fn, mode): - try: - actions - finally: - close(f) - else: - quit("cannot open: " & fn) - - withFile(txt, "ttempl3.txt", fmWrite): - txt.writeln("line 1") - txt.writeln("line 2") - -In the example the two ``writeln`` statements are bound to the ``actions`` -parameter. - -**Note:** Symbol binding rules in templates might change! 
- -Symbol binding within templates happens after template instantiation: - -.. code-block:: nimrod - # Module A - var - lastId = 0 - - template genId*: expr = - inc(lastId) - lastId - -.. code-block:: nimrod - # Module B - import A - - echo genId() # Error: undeclared identifier: 'lastId' - -Exporting a template is often a leaky abstraction. However, to compensate for -this case, the ``bind`` operator can be used: All identifiers within a ``bind`` -context are bound early (i.e. when the template is parsed). -The affected identifiers are then always bound early even if the other -occurrences are in no ``bind`` context: - -.. code-block:: nimrod - # Module A - var - lastId = 0 - - template genId*: expr = - inc(bind lastId) - lastId - -.. code-block:: nimrod - # Module B - import A - - echo genId() # Works - - -**Style note**: For code readability, it is the best idea to use the least -powerful programming construct that still suffices. So the "check list" is: - -(1) Use an ordinary proc/iterator, if possible. -(2) Else: Use a generic proc/iterator, if possible. -(3) Else: Use a template, if possible. -(4) Else: Use a macro. - - -Macros ------- - -`Macros`:idx: are the most powerful feature of Nimrod. They can be used -to implement `domain specific languages`:idx:. - -While macros enable advanced compile-time code transformations, they -cannot change Nimrod's syntax. However, this is no real restriction because -Nimrod's syntax is flexible enough anyway. - -To write macros, one needs to know how the Nimrod concrete syntax is converted -to an abstract syntax tree. - -There are two ways to invoke a macro: -(1) invoking a macro like a procedure call (`expression macros`) -(2) invoking a macro with the special ``macrostmt`` syntax (`statement macros`) - - -Expression Macros -~~~~~~~~~~~~~~~~~ - -The following example implements a powerful ``debug`` command that accepts a -variable number of arguments: - -.. 
code-block:: nimrod - # to work with Nimrod syntax trees, we need an API that is defined in the - # ``macros`` module: - import macros - - macro debug(n: expr): stmt = - # `n` is a Nimrod AST that contains the whole macro invocation - # this macro returns a list of statements: - result = newNimNode(nnkStmtList, n) - # iterate over any argument that is passed to this macro: - for i in 1..n.len-1: - # add a call to the statement list that writes the expression; - # `toStrLit` converts an AST to its string representation: - add(result, newCall("write", newIdentNode("stdout"), toStrLit(n[i]))) - # add a call to the statement list that writes ": " - add(result, newCall("write", newIdentNode("stdout"), newStrLitNode(": "))) - # add a call to the statement list that writes the expressions value: - add(result, newCall("writeln", newIdentNode("stdout"), n[i])) - - var - a: array [0..10, int] - x = "some string" - a[0] = 42 - a[1] = 45 - - debug(a[0], a[1], x) - -The macro call expands to: - -.. code-block:: nimrod - write(stdout, "a[0]") - write(stdout, ": ") - writeln(stdout, a[0]) - - write(stdout, "a[1]") - write(stdout, ": ") - writeln(stdout, a[1]) - - write(stdout, "x") - write(stdout, ": ") - writeln(stdout, x) - - -Statement Macros -~~~~~~~~~~~~~~~~ - -Statement macros are defined just as expression macros. However, they are -invoked by an expression following a colon:: - - exprStmt ::= lowestExpr ['=' expr | [expr (comma expr)* [comma]] [macroStmt]] - macroStmt ::= ':' [stmt] ('of' [sliceExprList] ':' stmt - | 'elif' expr ':' stmt - | 'except' exceptList ':' stmt )* - ['else' ':' stmt] - -The following example outlines a macro that generates a lexical analyzer from -regular expressions: - -.. code-block:: nimrod - import macros - - macro case_token(n: stmt): stmt = - # creates a lexical analyzer from regular expressions - # ... 
(implementation is an exercise for the reader :-) - nil - - case_token: # this colon tells the parser it is a macro statement - of r"[A-Za-z_]+[A-Za-z_0-9]*": - return tkIdentifier - of r"[0-9]+": - return tkInteger - of r"[\+\-\*\?]+": - return tkOperator - else: - return tkUnknown - - - -Modules -------- -Nimrod supports splitting a program into pieces by a `module`:idx: concept. -Each module needs to be in its own file and has its own `namespace`:idx:. -Modules enable `information hiding`:idx: and `separate compilation`:idx:. -A module may gain access to symbols of another module by the `import`:idx: -statement. `Recursive module dependencies`:idx: are allowed, but slightly -subtle. Only top-level symbols that are marked with an asterisk (``*``) are -exported. - -The algorithm for compiling modules is: - -- compile the whole module as usual, following import statements recursively -- if there is a cycle only import the already parsed symbols (that are - exported); if an unknown identifier occurs then abort - -This is best illustrated by an example: - -.. code-block:: nimrod - # Module A - type - T1* = int # Module A exports the type ``T1`` - import B # the compiler starts parsing B - - proc main() = - var i = p(3) # works because B has been parsed completely here - - main() - - -.. code-block:: nimrod - # Module B - import A # A is not parsed here! Only the already known symbols - # of A are imported. - - proc p*(x: A.T1): A.T1 = - # this works because the compiler has already - # added T1 to A's interface symbol table - return x + 1 - - -Scope rules ------------ -Identifiers are valid from the point of their declaration until the end of -the block in which the declaration occurred. The range where the identifier -is known is the `scope`:idx: of the identifier. The exact scope of an -identifier depends on the way it was declared. 
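A small sketch of these rules, using a nested ``block`` statement (the proc name ``scopeDemo`` is made up for illustration):

.. code-block:: nimrod

  proc scopeDemo(): int =
    var x = 1
    block:
      var x = 10           # a new ``x``; it hides the outer one in this block
      result = x           # refers to the inner ``x``, so result is 10
    result = result + x    # the inner ``x`` is gone; this is the outer ``x``

The inner declaration is only known until the end of its block, so the last line adds the outer ``x`` and the proc returns 11.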
- -Block scope -~~~~~~~~~~~ -The *scope* of a variable declared in the declaration part of a block -is valid from the point of declaration until the end of the block. If a -block contains a second block, in which the identifier is redeclared, -then inside this block, the second declaration will be valid. Upon -leaving the inner block, the first declaration is valid again. An -identifier cannot be redefined in the same block, except if valid for -procedure or iterator overloading purposes. - - -Tuple or object scope -~~~~~~~~~~~~~~~~~~~~~ -The field identifiers inside a tuple or object definition are valid in the -following places: - -* To the end of the tuple/object definition. -* Field designators of a variable of the given tuple/object type. -* In all descendant types of the object type. - -Module scope -~~~~~~~~~~~~ -All identifiers of a module are valid from the point of declaration until -the end of the module. Identifiers from indirectly dependent modules are *not* -available. The `system`:idx: module is automatically imported in every other -module. - -If a module imports an identifier by two different modules, each occurrence of -the identifier has to be qualified, unless it is an overloaded procedure or -iterator in which case the overloading resolution takes place: - -.. code-block:: nimrod - # Module A - var x*: string - -.. code-block:: nimrod - # Module B - var x*: int - -.. code-block:: nimrod - # Module C - import A, B - write(stdout, x) # error: x is ambiguous - write(stdout, A.x) # no error: qualifier used - - var x = 4 - write(stdout, x) # not ambiguous: uses the module C's x - - -Messages -======== - -The Nimrod compiler emits different kinds of messages: `hint`:idx:, -`warning`:idx:, and `error`:idx: messages. An *error* message is emitted if -the compiler encounters any static error. - - -Pragmas -======= - -Syntax:: - - colonExpr ::= expr [':' expr] - colonExprList ::= [colonExpr (comma colonExpr)* [comma]] - - pragma ::= '{.' 
optInd (colonExpr [comma])* [SAD] ('.}' | '}') - -Pragmas are Nimrod's method to give the compiler additional information/ -commands without introducing a massive number of new keywords. Pragmas are -processed on the fly during semantic checking. Pragmas are enclosed in the -special ``{.`` and ``.}`` curly brackets. Pragmas are also often used as a -first implementation to play with a language feature before a nicer syntax -to access the feature becomes available. - - -noSideEffect pragma -------------------- -The `noSideEffect`:idx: pragma is used to mark a proc/iterator to have no side -effects. This means that the proc/iterator only changes locations that are -reachable from its parameters and the return value only depends on the -arguments. If none of its parameters have the type ``var T`` -or ``ref T`` or ``ptr T`` this means no locations are modified. It is a static -error to mark a proc/iterator to have no side effect if the compiler cannot -verify this. - -**Future directions**: ``func`` may become a keyword and syntactic sugar for a -proc with no side effects: - -.. code-block:: nimrod - func `+` (x, y: int): int - - -procvar pragma --------------- -The `procvar`:idx: pragma is used to mark a proc that it can be passed to a -procedural variable. - - -compileTime pragma ------------------- -The `compileTime`:idx: pragma is used to mark a proc to be used at compile -time only. No code will be generated for it. Compile time procs are useful -as helpers for macros. - - -noReturn pragma ---------------- -The `noreturn`:idx: pragma is used to mark a proc that it never returns. - - -Acyclic pragma --------------- -The `acyclic`:idx: pragma can be used for object types to mark them as acyclic -even though they seem to be cyclic. This is an **optimization** for the garbage -collector to not consider objects of this type as part of a cycle: - -.. 
code-block:: nimrod - type - PNode = ref TNode - TNode {.acyclic, final.} = object - left, right: PNode - data: string - -In the example a tree structure is declared with the ``TNode`` type. Note that -the type definition is recursive and the GC has to assume that objects of -this type may form a cyclic graph. The ``acyclic`` pragma passes the -information that this cannot happen to the GC. If the programmer uses the -``acyclic`` pragma for data types that are in reality cyclic, the GC may leak -memory, but nothing worse happens. - -**Future directions**: The ``acyclic`` pragma may become a property of a -``ref`` type: - -.. code-block:: nimrod - type - PNode = acyclic ref TNode - TNode = object - left, right: PNode - data: string - - -Final pragma ------------- -The `final`:idx: pragma can be used for an object type to specify that it -cannot be inherited from. - - -Pure pragma ------------ -The `pure`:idx: pragma serves two completely different purposes: -1) To mark a procedure that Nimrod should not generate any exit statements like - ``return result;`` in the generated code. This is useful for procs that only - consist of an assembler statement. -2) To mark an object type so that its type field should be omitted. This is - necessary for binary compatibility with other compiled languages. - - -error pragma ------------- -The `error`:idx: pragma is used to make the compiler output an error message -with the given content. Compilation currently aborts after an error, but this -may be changed in later versions. - - -fatal pragma ------------- -The `fatal`:idx: pragma is used to make the compiler output an error message -with the given content. In contrast to the ``error`` pragma, compilation -is guaranteed to be aborted by this pragma. - -warning pragma --------------- -The `warning`:idx: pragma is used to make the compiler output a warning message -with the given content. Compilation continues after the warning. 
- -hint pragma ------------ -The `hint`:idx: pragma is used to make the compiler output a hint message with -the given content. Compilation continues after the hint. - - -compilation option pragmas --------------------------- -The listed pragmas here can be used to override the code generation options -for a section of code. - -The implementation currently provides the following possible options (various -others may be added later). - -=============== =============== ============================================ -pragma allowed values description -=============== =============== ============================================ -checks on|off Turns the code generation for all runtime - checks on or off. -boundChecks on|off Turns the code generation for array bound - checks on or off. -overflowChecks on|off Turns the code generation for over- or - underflow checks on or off. -nilChecks on|off Turns the code generation for nil pointer - checks on or off. -assertions on|off Turns the code generation for assertions - on or off. -warnings on|off Turns the warning messages of the compiler - on or off. -hints on|off Turns the hint messages of the compiler - on or off. -optimization none|speed|size Optimize the code for speed or size, or - disable optimization. -callconv cdecl|... Specifies the default calling convention for - all procedures (and procedure types) that - follow. -=============== =============== ============================================ - -Example: - -.. code-block:: nimrod - {.checks: off, optimization: speed.} - # compile without runtime checks and optimize for speed - - -push and pop pragmas --------------------- -The `push/pop`:idx: pragmas are very similar to the option directive, -but are used to override the settings temporarily. Example: - -.. code-block:: nimrod - {.push checks: off.} - # compile this section without runtime checks as it is - # speed critical - # ... some code ... 
- {.pop.} # restore old settings - - -Register pragma ---------------- -The `register`:idx: pragma is for variables only. It declares the variable as -``register``, giving the compiler a hint that the variable should be placed -in a hardware register for faster access. C compilers usually ignore this -though and for good reasons: Often they do a better job without it anyway. - -In highly specific cases (a dispatch loop of a bytecode interpreter for -example) it may provide benefits, though. - - -DeadCodeElim pragma -------------------- -The `deadCodeElim`:idx: pragma only applies to whole modules: It tells the -compiler to activate (or deactivate) dead code elimination for the module the -pragma appears in. - -The ``--deadCodeElim:on`` command line switch has the same effect as marking -every module with ``{.deadCodeElim: on.}``. However, for some modules such as -the GTK wrapper it makes sense to *always* turn on dead code elimination - -no matter if it is globally active or not. - -Example: - -.. code-block:: nimrod - {.deadCodeElim: on.} - - -Disabling certain messages --------------------------- -Nimrod generates some warnings and hints ("line too long") that may annoy the -user. A mechanism for disabling certain messages is provided: Each hint -and warning message contains a symbol in brackets. This is the message's -identifier that can be used to enable or disable it: - -.. code-block:: Nimrod - {.warning[LineTooLong]: off.} # turn off warning about too long lines - -This is often better than disabling all warnings at once. - - -Foreign function interface -========================== - -Nimrod's `FFI`:idx: (foreign function interface) is extensive and only the -parts that scale to other future backends (like the LLVM/EcmaScript backends) -are documented here. - - -Importc pragma --------------- -The `importc`:idx: pragma provides a means to import a proc or a variable -from C. The optional argument is a string containing the C identifier.
If -the argument is missing, the C name is the Nimrod identifier *exactly as -spelled*: - -.. code-block:: - proc printf(formatstr: cstring) {.importc: "printf", varargs.} - -Note that this pragma is somewhat of a misnomer: Other backends will provide -the same feature under the same name. - - -Exportc pragma --------------- -The `exportc`:idx: pragma provides a means to export a type, a variable, or a -procedure to C. The optional argument is a string containing the C identifier. -If the argument is missing, the C name is the Nimrod -identifier *exactly as spelled*: - -.. code-block:: Nimrod - proc callme(formatstr: cstring) {.exportc: "callMe", varargs.} - -Note that this pragma is somewhat of a misnomer: Other backends will provide -the same feature under the same name. - - -Varargs pragma --------------- -The `varargs`:idx: pragma can be applied to procedures only (and procedure -types). It tells Nimrod that the proc can take a variable number of parameters -after the last specified parameter. Nimrod string values will be converted to C -strings automatically: - -.. code-block:: Nimrod - proc printf(formatstr: cstring) {.nodecl, varargs.} - - printf("hallo %s", "world") # "world" will be passed as C string - - -Dynlib pragma -------------- -With the `dynlib`:idx: pragma a procedure can be imported from -a dynamic library (``.dll`` files for Windows, ``lib*.so`` files for UNIX). The -non-optional argument has to be the name of the dynamic library: - -.. code-block:: Nimrod - proc gtk_image_new(): PGtkWidget {.cdecl, dynlib: "libgtk-x11-2.0.so", importc.} - -In general, importing a dynamic library does not require any special linker -options or linking with import libraries. This also implies that no *devel* -packages need to be installed. - -The ``dynlib`` import mechanism supports a versioning scheme: - -.. 
code-block:: nimrod - proc Tcl_Eval(interp: pTcl_Interp, script: cstring): int {.cdecl, - importc, dynlib: "libtcl(|8.5|8.4|8.3).so.(1|0)".} - -At runtime the dynamic library is searched for (in this order):: - - libtcl.so.1 - libtcl.so.0 - libtcl8.5.so.1 - libtcl8.5.so.0 - libtcl8.4.so.1 - libtcl8.4.so.0 - libtcl8.3.so.1 - libtcl8.3.so.0 - -The ``dynlib`` pragma supports not only constant strings as argument but also -string expressions in general: - -.. code-block:: nimrod - import os - - proc getDllName: string = - result = "mylib.dll" - if ExistsFile(result): return - result = "mylib2.dll" - if ExistsFile(result): return - quit("could not load dynamic library") - - proc myImport(s: cstring) {.cdecl, importc, dynlib: getDllName().} - -**Note**: Patterns like ``libtcl(|8.5|8.4).so`` are only supported in constant -strings, because they are precompiled. +============= +Nimrod Manual +============= + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + + + "Complexity" seems to be a lot like "energy": you can transfer it from the end + user to one/some of the other players, but the total amount seems to remain + pretty much constant for a given task. -- Ran + +About this document +=================== + +**Note**: This document is a draft! Several of Nimrod's features need more +precise wording. This manual will evolve into a proper specification some +day. + +This document describes the lexis, the syntax, and the semantics of Nimrod. + +The language constructs are explained using an extended BNF, in +which ``(a)*`` means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and +``(a)?`` means an optional *a*; an alternative spelling for optional parts is +``[a]``. The ``|`` symbol is used to mark alternatives +and has the lowest precedence. Parentheses may be used to group elements. +Non-terminals start with a lowercase letter, abstract terminal symbols are in +UPPERCASE. Verbatim terminal symbols (including keywords) are quoted +with ``'``. 
An example:: + + ifStmt ::= 'if' expr ':' stmts ('elif' expr ':' stmts)* ['else' stmts] + +Other parts of Nimrod - like scoping rules or runtime semantics are only +described in an informal manner. The reason is that formal semantics are +difficult to write and understand. However, there is only one Nimrod +implementation, so one may consider it as the formal specification; +especially since the compiler's code is pretty clean (well, some parts of it). + + +Definitions +=========== + +A Nimrod program specifies a computation that acts on a memory consisting of +components called `locations`:idx:. A variable is basically a name for a +location. Each variable and location is of a certain `type`:idx:. The +variable's type is called `static type`:idx:, the location's type is called +`dynamic type`:idx:. If the static type is not the same as the dynamic type, +it is a super-type or subtype of the dynamic type. + +An `identifier`:idx: is a symbol declared as a name for a variable, type, +procedure, etc. The region of the program over which a declaration applies is +called the `scope`:idx: of the declaration. Scopes can be nested. The meaning +of an identifier is determined by the smallest enclosing scope in which the +identifier is declared. + +An expression specifies a computation that produces a value or location. +Expressions that produce locations are called `l-values`:idx:. An l-value +can denote either a location or the value the location contains, depending on +the context. Expressions whose values can be determined statically are called +`constant expressions`:idx:; they are never l-values. + +A `static error`:idx: is an error that the implementation detects before +program execution. Unless explicitly classified, an error is a static error. + +A `checked runtime error`:idx: is an error that the implementation detects +and reports at runtime. The method for reporting such errors is via *raising +exceptions*. 
However, the implementation provides a means to disable these +runtime checks. See the section pragmas_ for details. + +An `unchecked runtime error`:idx: is an error that is not guaranteed to be +detected, and can cause the subsequent behavior of the computation to +be arbitrary. Unchecked runtime errors cannot occur if only `safe`:idx: +language features are used. + + +Lexical Analysis +================ + +Encoding +-------- + +All Nimrod source files are in the UTF-8 encoding (or its ASCII subset). Other +encodings are not supported. Any of the standard platform line termination +sequences can be used - the Unix form using ASCII LF (linefeed), the Windows +form using the ASCII sequence CR LF (return followed by linefeed), or the old +Macintosh form using the ASCII CR (return) character. All of these forms can be +used equally, regardless of platform. + + +Indentation +----------- + +Nimrod's standard grammar describes an `indentation sensitive`:idx: language. +This means that all the control structures are recognized by indentation. +Indentation consists only of spaces; tabulators are not allowed. + +The terminals ``IND`` (indentation), ``DED`` (dedentation) and ``SAD`` +(same indentation) are generated by the scanner, denoting an indentation. + +These terminals are only generated for lines that are not empty. + +The parser and the scanner communicate over a stack which indentation terminal +should be generated: the stack consists of integers counting the spaces. The +stack is initialized with a zero on its top. The scanner reads from the stack: +If the current indentation token consists of more spaces than the entry at the +top of the stack, a ``IND`` token is generated, else if it consists of the same +number of spaces, a ``SAD`` token is generated. If it consists of fewer spaces, +a ``DED`` token is generated for any item on the stack that is greater than the +current. These items are later popped from the stack by the parser. 
At the end +of the file, a ``DED`` token is generated for each number remaining on the +stack that is larger than zero. + +Because the grammar contains some optional ``IND`` tokens, the scanner cannot +push new indentation levels. This has to be done by the parser. The symbol +``indPush`` indicates that an ``IND`` token is expected; the current number of +leading spaces is pushed onto the stack by the parser. The symbol ``indPop`` +denotes that the parser pops an item from the indentation stack. No token is +consumed by ``indPop``. + + +Comments +-------- + +`Comments`:idx: start anywhere outside a string or character literal with the +hash character ``#``. +Comments consist of a concatenation of `comment pieces`:idx:. A comment piece +starts with ``#`` and runs until the end of the line. The end of line characters +belong to the piece. If the next line only consists of a comment piece which is +aligned to the preceding one, it does not start a new comment: + +.. code-block:: nimrod + + i = 0 # This is a single comment over multiple lines belonging to the + # assignment statement. The scanner merges these two pieces. + # This is a new comment belonging to the current block, but to no particular + # statement. + i = i + 1 # This a new comment that is NOT + echo(i) # continued here, because this comment refers to the echo statement + +Comments are tokens; they are only allowed at certain places in the input file +as they belong to the syntax tree! This feature enables perfect source-to-source +transformations (such as pretty-printing) and superior documentation generators. +A nice side-effect is that the human reader of the code always knows exactly +which code snippet the comment refers to. + + +Identifiers & Keywords +---------------------- + +`Identifiers`:idx: in Nimrod can be any string of letters, digits +and underscores, beginning with a letter. 
Two immediate following +underscores ``__`` are not allowed:: + + letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff' + digit ::= '0'..'9' + IDENTIFIER ::= letter ( ['_'] letter | digit )* + +The following `keywords`:idx: are reserved and cannot be used as identifiers: + +.. code-block:: nimrod + :file: ../data/keywords.txt + +Some keywords are unused; they are reserved for future developments of the +language. + +Nimrod is a `style-insensitive`:idx: language. This means that it is not +case-sensitive and even underscores are ignored: +**type** is a reserved word, and so is **TYPE** or **T_Y_P_E**. The idea behind +this is that this allows programmers to use their own preferred spelling style +and libraries written by different programmers cannot use incompatible +conventions. A Nimrod-aware editor or IDE can show the identifiers as +preferred. Another advantage is that it frees the programmer from remembering +the exact spelling of an identifier. + + +String literals +--------------- + +`String literals`:idx: can be delimited by matching double quotes, and can +contain the following `escape sequences`:idx:\ : + +================== =================================================== + Escape sequence Meaning +================== =================================================== + ``\n`` `newline`:idx: + ``\r``, ``\c`` `carriage return`:idx: + ``\l`` `line feed`:idx: + ``\f`` `form feed`:idx: + ``\t`` `tabulator`:idx: + ``\v`` `vertical tabulator`:idx: + ``\\`` `backslash`:idx: + ``\"`` `quotation mark`:idx: + ``\'`` `apostrophe`:idx: + ``\d+`` `character with decimal value d`:idx:; + all decimal digits directly + following are used for the character + ``\a`` `alert`:idx: + ``\b`` `backspace`:idx: + ``\e`` `escape`:idx: `[ESC]`:idx: + ``\xHH`` `character with hex value HH`:idx:; + exactly two hex digits are allowed +================== =================================================== + + +Strings in Nimrod may contain any 8-bit value, except embedded zeros. 
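The escape sequences in the table above can be tried out with a short sketch like the following. It is an illustration only, not part of the manual text, and it assumes nothing beyond the built-in ``len`` and ``echo``:

.. code-block:: nimrod
  # each escape sequence stands for exactly one character
  echo(len("a\tb"))      # 3: 'a', tabulator, 'b'
  echo(len("\x41\65"))   # 2: hex 41 and decimal 65 both denote 'A'
  echo("say \"hi\"")     # say "hi"

Note that ``\65`` consumes all decimal digits that directly follow the backslash, so a digit placed right after it would become part of the character code.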
+ + +Triple quoted string literals +----------------------------- + +String literals can also be delimited by three double quotes +``"""`` ... ``"""``. +Literals in this form may run for several lines, may contain ``"`` and do not +interpret any escape sequences. +For convenience, when the opening ``"""`` is immediately followed by a newline, +the newline is not included in the string. The ending of the string literal is +defined by the pattern ``"""[^"]``, so this: + +.. code-block:: nimrod + """"long string within quotes"""" + +Produces:: + + "long string within quotes" + + +Raw string literals +------------------- + +There are also `raw string literals` that are preceded with the letter ``r`` +(or ``R``) and are delimited by matching double quotes (just like ordinary +string literals) and do not interpret the escape sequences. This is especially +convenient for regular expressions or Windows paths: + +.. code-block:: nimrod + + var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab + +To produce a single ``"`` within a raw string literal, it has to be doubled: + +.. code-block:: nimrod + + r"a""b" + +Produces:: + + a"b + +``r""""`` is not possible with this notation, because the three leading +quotes introduce a triple quoted string literal. + + +Generalized raw string literals +------------------------------- + +The construct ``identifier"string literal"`` (without whitespace between the +identifier and the opening quotation mark) is a +`generalized raw string literal`:idx:. It is a shortcut for the construct +``identifier(r"string literal")``, so it denotes a procedure call with a +raw string literal as its only argument. Generalized raw string literals +are especially convenient for embedding mini languages directly into Nimrod +(for example regular expressions). + +The construct ``identifier"""string literal"""`` exists too. It is a shortcut +for ``identifier("""string literal""")``. 
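The raw and generalized raw string forms described above might be exercised as in the following sketch. The proc ``tag`` is a hypothetical helper introduced only for this illustration; it is not part of the standard library:

.. code-block:: nimrod
  # ``tag`` is a hypothetical helper, used only for illustration
  proc tag(s: string): string =
    result = "<" & s & ">"

  echo(r"C:\texts\text.txt")   # raw: the backslashes are kept as written
  echo(r"a""b")                # a"b
  echo(tag"\d+")               # same as tag(r"\d+"); prints <\d+>
  echo(tag"""a "b" c""")       # same as tag("""a "b" c"""); prints <a "b" c>

Because no whitespace is allowed between the identifier and the opening quotation mark, ``tag "x"`` with a space would be an ordinary call with an ordinary string literal, and escape sequences would be interpreted again.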
+ + +Character literals +------------------ + +Character literals are enclosed in single quotes ``''`` and can contain the +same escape sequences as strings - with one exception: ``\n`` is not allowed +as it may be wider than one character (often it is the pair CR/LF for example). +A character is not a Unicode character but a single byte. The reason for this +is efficiency: for the overwhelming majority of use-cases, the resulting +programs will still handle UTF-8 properly as UTF-8 was specially designed for +this. +Another reason is that Nimrod can thus support ``array[char, int]`` or +``set[char]`` efficiently as many algorithms rely on this feature. + + +Numerical constants +------------------- + +`Numerical constants`:idx: are of a single type and have the form:: + + hexdigit ::= digit | 'A'..'F' | 'a'..'f' + octdigit ::= '0'..'7' + bindigit ::= '0'..'1' + INT_LIT ::= digit ( ['_'] digit )* + | '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )* + | '0o' octdigit ( ['_'] octdigit )* + | '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )* + + INT8_LIT ::= INT_LIT '\'' ('i' | 'I' ) '8' + INT16_LIT ::= INT_LIT '\'' ('i' | 'I' ) '16' + INT32_LIT ::= INT_LIT '\'' ('i' | 'I' ) '32' + INT64_LIT ::= INT_LIT '\'' ('i' | 'I' ) '64' + + exponent ::= ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )* + FLOAT_LIT ::= digit (['_'] digit)* ('.' (['_'] digit)* [exponent] |exponent) + FLOAT32_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '32' + FLOAT64_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '64' + + +As can be seen in the productions, numerical constants can contain underscores +for readability. Integer and floating point literals may be given in decimal (no +prefix), binary (prefix ``0b``), octal (prefix ``0o``) and hexadecimal +(prefix ``0x``) notation. + +There exists a literal for each numerical type that is +defined. The suffix starting with an apostrophe ('\'') is called a +`type suffix`:idx:.
Literals without a type suffix are of the type ``int``, +unless the literal contains a dot or ``E|e`` in which case it is of +type ``float``. + +The type suffixes are: + +================= ========================= + Type Suffix Resulting type of literal +================= ========================= + ``'i8`` int8 + ``'i16`` int16 + ``'i32`` int32 + ``'i64`` int64 + ``'f32`` float32 + ``'f64`` float64 +================= ========================= + +Floating point literals may also be in binary, octal or hexadecimal +notation: +``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64`` +is approximately 1.72826e35 according to the IEEE floating point standard. + + + +Other tokens +------------ + +The following strings denote other tokens:: + + ( ) { } [ ] , ; [. .] {. .} (. .) + : = ^ .. ` + +`..`:tok: takes precedence over other tokens that contain a dot: `{..}`:tok: are +the three tokens `{`:tok:, `..`:tok:, `}`:tok: and not the two tokens +`{.`:tok:, `.}`:tok:. + +In Nimrod one can define one's own operators. An `operator`:idx: is any +combination of the following characters that is not listed above:: + + + - * / < > + = @ $ ~ & % + ! ? ^ . | \ + +These keywords are also operators: +``and or not xor shl shr div mod in notin is isnot``. + + +Syntax +====== + +This section lists Nimrod's standard syntax in EBNF. How the parser receives +indentation tokens is already described in the `Lexical Analysis`_ section. + +Nimrod allows user-definable operators. +Binary operators have 8 different levels of precedence. For user-defined +operators, the precedence depends on the first character the operator consists +of. All binary operators are left-associative.
+ +================ ============================================== ================== =============== +Precedence level Operators First characters Terminal symbol +================ ============================================== ================== =============== + 7 (highest) ``$`` OP7 + 6 ``* / div mod shl shr %`` ``* % \ /`` OP6 + 5 ``+ -`` ``+ ~ |`` OP5 + 4 ``&`` ``&`` OP4 + 3 ``== <= < >= > != in not_in is isnot`` ``= < > !`` OP3 + 2 ``and`` OP2 + 1 ``or xor`` OP1 + 0 (lowest) ``? @ ^ ` : .`` OP0 +================ ============================================== ================== =============== + + +The grammar's start symbol is ``module``. + +.. include:: grammar.txt + :literal: + + + +Semantics +========= + +Constants +--------- + +`Constants`:idx: are symbols which are bound to a value. The constant's value +cannot change. The compiler must be able to evaluate the expression in a +constant declaration at compile time. + +Nimrod contains a sophisticated compile-time evaluator, so procedures which +have no side-effect can be used in constant expressions too: + +.. code-block:: nimrod + import strutils + const + constEval = contains("abc", 'b') # computed at compile time! + + +Types +----- + +All expressions have a `type`:idx: which is known at compile time. Nimrod +is statically typed. One can declare new types, which is in essence defining +an identifier that can be used to denote this custom type. + +These are the major type classes: + +* ordinal types (consist of integer, bool, character, enumeration + (and subranges thereof) types) +* floating point types +* string type +* structured types +* reference (pointer) type +* procedural type +* generic type + + +Ordinal types +~~~~~~~~~~~~~ +`Ordinal types`:idx: have the following characteristics: + +- Ordinal types are countable and ordered. This property allows + the operation of functions as ``Inc``, ``Ord``, ``Dec`` on ordinal types to + be defined. +- Ordinal values have a smallest possible value. 
Trying to count further + down than the smallest value gives a checked runtime or static error. +- Ordinal values have a largest possible value. Trying to count further + than the largest value gives a checked runtime or static error. + +Integers, bool, characters and enumeration types (and subranges of these +types) belong to ordinal types. + + +Pre-defined integer types +~~~~~~~~~~~~~~~~~~~~~~~~~ +These integer types are pre-defined: + +``int`` + the generic signed integer type; its size is platform dependent + (the compiler chooses the processor's fastest integer type). + This type should be used in general. An integer literal that has no type + suffix is of this type. + +intXX + additional signed integer types of XX bits use this naming scheme + (example: int16 is a 16 bit wide integer). + The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``. + Literals of these types have the suffix 'iXX. + + +There are no `unsigned integer`:idx: types, only `unsigned operations`:idx: +that treat their arguments as unsigned. Unsigned operations all wrap around; +they cannot lead to over- or underflow errors. 
Unsigned operations use the +``%`` suffix as convention: + +====================== ====================================================== +operation meaning +====================== ====================================================== +``a +% b`` unsigned integer addition +``a -% b`` unsigned integer subtraction +``a *% b`` unsigned integer multiplication +``a /% b`` unsigned integer division +``a %% b`` unsigned integer modulo operation +``a <% b`` treat ``a`` and ``b`` as unsigned and compare +``a <=% b`` treat ``a`` and ``b`` as unsigned and compare +``ze(a)`` extends the bits of ``a`` with zeros until it has the + width of the ``int`` type +``toU8(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 8 bits (but still the + ``int8`` type) +``toU16(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 16 bits (but still the + ``int16`` type) +``toU32(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 32 bits (but still the + ``int32`` type) +====================== ====================================================== + +`Automatic type conversion`:idx: is performed in expressions where different +kinds of integer types are used: the smaller type is converted to the larger. +For further details, see `Convertible relation`_. + + +Pre-defined floating point types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following floating point types are pre-defined: + +``float`` + the generic floating point type; its size is platform dependent + (the compiler chooses the processor's fastest floating point type). + This type should be used in general. + +floatXX + an implementation may define additional floating point types of XX bits using + this naming scheme (example: float64 is a 64 bit wide float). The current + implementation supports ``float32`` and ``float64``. Literals of these types + have the suffix 'fXX. 
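As a hedged illustration of the unsigned operations and the type-suffixed literals described above, consider the following sketch. The variable names are arbitrary and only the operators and suffixes named in this section are used:

.. code-block:: nimrod
  var
    a = -1            # all bits set when viewed as an unsigned number
    b = 1
  echo(a < b)         # true: ordinary signed comparison
  echo(a <% b)        # false: viewed as unsigned, -1 is the largest value
  echo(a +% b)        # 0: unsigned addition wraps around without an error

  var f = 1.5'f32     # a float32 literal
  var g = 2'f64       # an integer literal turned into a float64 by its suffix
  echo(f)
  echo(g)

The signed and the unsigned comparison disagree here because both operate on the same bit pattern and differ only in how they interpret it.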
+ + +Automatic type conversion in expressions with different kinds +of floating point types is performed: See `Convertible relation`_ for further +details. Arithmetic performed on floating point types follows the IEEE +standard. Integer types are not converted to floating point types automatically +and vice versa. + +The IEEE standard defines five types of floating-point exceptions: + +* Invalid: operations with mathematically invalid operands, + for example 0.0/0.0, sqrt(-1.0), and log(-37.8). +* Division by zero: divisor is zero and dividend is a finite nonzero number, + for example 1.0/0.0. +* Overflow: operation produces a result that exceeds the range of the exponent, + for example MAXDOUBLE+0.0000000000001e308. +* Underflow: operation produces a result that is too small to be represented + as a normal number, for example, MINDOUBLE * MINDOUBLE. +* Inexact: operation produces a result that cannot be represented with infinite + precision, for example, 2.0 / 3.0, log(1.1) and 0.1 in input. + +The IEEE exceptions are either ignored at runtime or mapped to the +Nimrod exceptions: `EFloatInvalidOp`:idx:, `EFloatDivByZero`:idx:, +`EFloatOverflow`:idx:, `EFloatUnderflow`:idx:, and `EFloatInexact`:idx:\. +These exceptions inherit from the `EFloatingPoint`:idx: base class. + +Nimrod provides the pragmas `NaNChecks`:idx: and `InfChecks`:idx:\ to control +whether the IEEE exceptions are ignored or raise a Nimrod exception: + +.. code-block:: nimrod + {.NanChecks: on, InfChecks: on.} + var a = 1.0 + var b = 0.0 + echo b / b # raises EFloatInvalidOp + echo a / b # raises EFloatOverflow + +In the current implementation ``EFloatDivByZero`` and ``EFloatInexact`` are +never raised. ``EFloatOverflow`` is raised instead of ``EFloatDivByZero``. +There is also a `floatChecks`:idx: pragma that is a short-cut for the +combination of ``NaNChecks`` and ``InfChecks`` pragmas. ``floatChecks`` is +turned off by default.
+ +The only operations that are affected by the ``floatChecks`` pragma are +the ``+``, ``-``, ``*``, ``/`` operators for floating point types. + + +Boolean type +~~~~~~~~~~~~ +The `boolean`:idx: type is named ``bool`` in Nimrod and can be one of the two +pre-defined values ``true`` and ``false``. Conditions in while, +if, elif, when statements need to be of type bool. + +This condition holds:: + + ord(false) == 0 and ord(true) == 1 + +The operators ``not, and, or, xor, <, <=, >, >=, !=, ==`` are defined +for the bool type. The ``and`` and ``or`` operators perform short-cut +evaluation. Example: + +.. code-block:: nimrod + + while p != nil and p.name != "xyz": + # p.name is not evaluated if p == nil + p = p.next + + +The size of the bool type is one byte. + + +Character type +~~~~~~~~~~~~~~ +The `character type`:idx: is named ``char`` in Nimrod. Its size is one byte. +Thus it cannot represent a UTF-8 character, only a part of one. +The reason for this is efficiency: for the overwhelming majority of use-cases, +the resulting programs will still handle UTF-8 properly as UTF-8 was specially +designed for this. +Another reason is that Nimrod can support ``array[char, int]`` or +``set[char]`` efficiently as many algorithms rely on this feature. The +`TRune` type is used for Unicode characters; it can represent any Unicode +character. ``TRune`` is declared in the ``unicode`` module. + + + +Enumeration types +~~~~~~~~~~~~~~~~~ +`Enumeration`:idx: types define a new type whose values consist of the ones +specified. The values are ordered. Example: + +.. code-block:: nimrod + + type + TDirection = enum + north, east, south, west + + +Now the following holds:: + + ord(north) == 0 + ord(east) == 1 + ord(south) == 2 + ord(west) == 3 + +Thus, north < east < south < west. The comparison operators can be used +with enumeration types. + +For better interfacing to other programming languages, the fields of enum +types can be assigned an explicit ordinal value.
However, the ordinal values +have to be in ascending order. A field whose ordinal value is not +explicitly given is assigned the value of the previous field + 1. + +An explicitly ordered enum can have *holes*: + +.. code-block:: nimrod + type + TTokenType = enum + a = 2, b = 4, c = 89 # holes are valid + +However, it is then not an ordinal anymore, so it is not possible to use these +enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ`` +and ``pred`` are not available for them either. + + +Subrange types +~~~~~~~~~~~~~~ +A `subrange`:idx: type is a range of values from an ordinal type (the base +type). To define a subrange type, one must specify its limiting values: the +lowest and highest value of the type: + +.. code-block:: nimrod + type + TSubrange = range[0..5] + + +``TSubrange`` is a subrange of an integer which can only hold the values 0 +to 5. Assigning any other value to a variable of type ``TSubrange`` is a +checked runtime error (or static error if it can be statically +determined). Assignments from the base type to one of its subrange types +(and vice versa) are allowed. + +A subrange type has the same size as its base type (``int`` in the example). + + +String type +~~~~~~~~~~~ +All string literals are of the type `string`:idx:. A string in Nimrod is very +similar to a sequence of characters. However, strings in Nimrod are both +zero-terminated and have a length field. One can retrieve the length with the +builtin ``len`` procedure; the length never counts the terminating zero. +The assignment operator for strings always copies the string. +The ``&`` operator concatenates strings. + +Strings are compared by their lexicographical order. All comparison operators +are available. Strings can be indexed like arrays (lower bound is 0). Unlike +arrays, they can be used in case statements: + +..
code-block:: nimrod + + case paramStr(i) + of "-v": incl(options, optVerbose) + of "-h", "-?": incl(options, optHelp) + else: write(stdout, "invalid command line option!\n") + +Per convention, all strings are UTF-8 strings, but this is not enforced. For +example, when reading strings from binary files, they are merely a sequence of +bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the +i-th *unichar*. The iterator ``runes`` from the ``unicode`` +module can be used for iteration over all Unicode characters. + + +Structured types +~~~~~~~~~~~~~~~~ +A variable of a `structured type`:idx: can hold multiple values at the same +time. Structured types can be nested to unlimited levels. Arrays, sequences, +tuples, objects and sets belong to the structured types. + +Array and sequence types +~~~~~~~~~~~~~~~~~~~~~~~~ +`Arrays`:idx: are a homogeneous type, meaning that each element in the array +has the same type. Arrays always have a fixed length which is specified at +compile time (except for open arrays). They can be indexed by any ordinal type. +A parameter ``A`` may be an *open array*, in which case it is indexed by +integers from 0 to ``len(A)-1``. An array expression may be constructed by the +array constructor ``[]``. + +`Sequences`:idx: are similar to arrays but of dynamic length which may change +during runtime (like strings). A sequence ``S`` is always indexed by integers +from 0 to ``len(S)-1`` and its bounds are checked. Sequences can be +constructed by the array constructor ``[]`` in conjunction with the array to +sequence operator ``@``. Another way to allocate space for a sequence is to +call the built-in ``newSeq`` procedure. + +A sequence may be passed to a parameter that is of type *open array*. + +Example: + +.. 
code-block:: nimrod + + type + TIntArray = array[0..5, int] # an array that is indexed with 0..5 + TIntSeq = seq[int] # a sequence of integers + var + x: TIntArray + y: TIntSeq + x = [1, 2, 3, 4, 5, 6] # [] is the array constructor + y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence + +The lower bound of an array or sequence may be received by the built-in proc +``low()``, the higher bound by ``high()``. The length may be +received by ``len()``. ``low()`` for a sequence or an open array always returns +0, as this is the first valid index. +One can append elements to a sequence with the ``add()`` proc or the ``&`` +operator, and remove (and get) the last element of a sequence with the +``pop()`` proc. + +The notation ``x[i]`` can be used to access the i-th element of ``x``. + +Arrays are always bounds checked (at compile-time or at runtime). These +checks can be disabled via pragmas or invoking the compiler with the +``--boundChecks:off`` command line switch. + +An open array is also a means to implement passing a variable number of +arguments to a procedure. The compiler converts the list of arguments +to an array automatically: + +.. code-block:: nimrod + proc myWriteln(f: TFile, a: openarray[string]) = + for s in items(a): + write(f, s) + write(f, "\n") + + myWriteln(stdout, "abc", "def", "xyz") + # is transformed by the compiler to: + myWriteln(stdout, ["abc", "def", "xyz"]) + +This transformation is only done if the openarray parameter is the +last parameter in the procedure header. The current implementation does not +support nested open arrays. + + +Tuples and object types +~~~~~~~~~~~~~~~~~~~~~~~ +A variable of a `tuple`:idx: or `object`:idx: type is a heterogeneous storage +container. +A tuple or object defines various named *fields* of a type. A tuple also +defines an *order* of the fields. Tuples are meant for heterogeneous storage +types with no overhead and few abstraction possibilities. 
The constructor ``()`` +can be used to construct tuples. The order of the fields in the constructor +must match the order of the tuple's definition. Different tuple-types are +*equivalent* if they specify the same fields of the same type in the same +order. + +The assignment operator for tuples copies each component. +The default assignment operator for objects copies each component. Overloading +of the assignment operator for objects is not possible, but this may change in +future versions of the compiler. + +.. code-block:: nimrod + + type + TPerson = tuple[name: string, age: int] # type representing a person + # a person consists of a name + # and an age + var + person: TPerson + person = (name: "Peter", age: 30) + # the same, but less readable: + person = ("Peter", 30) + +The implementation aligns the fields for best access performance. The alignment +is compatible with the way the C compiler does it. + +Objects provide many features that tuples do not. Objects provide inheritance +and information hiding. Objects have access to their type at runtime, so that +the ``is`` operator can be used to determine the object's type. + +.. code-block:: nimrod + + type + TPerson = object + name*: string # the * means that `name` is accessible from other modules + age: int # no * means that the field is hidden + + TStudent = object of TPerson # a student is a person + id: int # with an id field + + var + student: TStudent + person: TPerson + assert(student is TStudent) # is true + +Object fields that should be visible from outside the defining module have to +be marked by ``*``. In contrast to tuples, different object types are +never *equivalent*. + + +Object variants +~~~~~~~~~~~~~~~ +Often an object hierarchy is overkill in certain situations where simple +`variant`:idx: types are needed. + +An example: + +..
code-block:: nimrod + + # This is an example of how an abstract syntax tree could be modelled in Nimrod + type + TNodeKind = enum # the different node types + nkInt, # a leaf with an integer value + nkFloat, # a leaf with a float value + nkString, # a leaf with a string value + nkAdd, # an addition + nkSub, # a subtraction + nkIf # an if statement + PNode = ref TNode + TNode = object + case kind: TNodeKind # the ``kind`` field is the discriminator + of nkInt: intVal: int + of nkFloat: floatVal: float + of nkString: strVal: string + of nkAdd, nkSub: + leftOp, rightOp: PNode + of nkIf: + condition, thenPart, elsePart: PNode + + var + n: PNode + new(n) # creates a new node + n.kind = nkFloat + n.floatVal = 0.0 # valid, because ``n.kind==nkFloat``, so that it fits + + # the following statement raises an `EInvalidField` exception, because + # n.kind's value does not fit: + n.strVal = "" + +As can be seen from the example, an advantage over an object hierarchy is that +no casting between different object types is needed. Yet, access to invalid +object fields raises an exception. + + +Set type +~~~~~~~~ +The `set type`:idx: models the mathematical notion of a set. The set's +basetype can only be an ordinal type. The reason is that sets are implemented +as high performance bit vectors. + +Sets can be constructed via the set constructor: ``{}`` is the empty set. The +empty set is type compatible with any special set type. The constructor +can also be used to include elements (and ranges of elements) in the set: + +..
code-block:: nimrod + + {'a'..'z', '0'..'9'} # This constructs a set that contains the + # letters from 'a' to 'z' and the digits + # from '0' to '9' + +These operations are supported by sets: + +================== ======================================================== +operation meaning +================== ======================================================== +``A + B`` union of two sets +``A * B`` intersection of two sets +``A - B`` difference of two sets (A without B's elements) +``A == B`` set equality +``A <= B`` subset relation (A is subset of B or equal to B) +``A < B`` strong subset relation (A is a real subset of B) +``e in A`` set membership (A contains element e) +``A -+- B`` symmetric set difference (= (A - B) + (B - A)) +``card(A)`` the cardinality of A (number of elements in A) +``incl(A, elem)`` same as A = A + {elem} +``excl(A, elem)`` same as A = A - {elem} +================== ======================================================== + + +Reference and pointer types +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +References (similar to `pointers`:idx: in other programming languages) are a +way to introduce many-to-one relationships. This means different references can +point to and modify the same location in memory. + +Nimrod distinguishes between `traced`:idx: and `untraced`:idx: references. +Untraced references are also called *pointers*. Traced references point to +objects of a garbage collected heap, untraced references point to +manually allocated objects or to objects somewhere else in memory. Thus +untraced references are *unsafe*. However, for certain low-level operations +(accessing the hardware) untraced references are unavoidable. + +Traced references are declared with the **ref** keyword, untraced references +are declared with the **ptr** keyword. + +The ``^`` operator can be used to dereference a reference, the ``addr`` procedure +returns the address of an item. An address is always an untraced reference.
+Thus the usage of ``addr`` is an *unsafe* feature. + +The ``.`` (access a tuple/object field operator) +and ``[]`` (array/string/sequence index operator) operators perform implicit +dereferencing operations for reference types: + +.. code-block:: nimrod + + type + PNode = ref TNode + TNode = object + le, ri: PNode + data: int + + var + n: PNode + new(n) + n.data = 9 # no need to write n^ .data + +To allocate a new traced object, the built-in procedure ``new`` has to be used. +To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and +``realloc`` can be used. The documentation of the system module contains +further information. + +If a reference points to *nothing*, it has the value ``nil``. + +Special care has to be taken if an untraced object contains traced objects like +traced references, strings or sequences: in order to free everything properly, +the built-in procedure ``GCunref`` has to be called before freeing the +untraced memory manually! + +.. XXX finalizers for traced objects + +Procedural type +~~~~~~~~~~~~~~~ +A `procedural type`:idx: is internally a pointer to a procedure. ``nil`` is +an allowed value for variables of a procedural type. Nimrod uses procedural +types to achieve `functional`:idx: programming techniques. + +Example: + +.. code-block:: nimrod + + type + TCallback = proc (x: int) {.cdecl.} + + proc printItem(x: Int) = ... + + proc forEach(c: TCallback) = + ... + + forEach(printItem) # this will NOT work because calling conventions differ + +A subtle issue with procedural types is that the calling convention of the +procedure influences the type compatibility: procedural types are only +compatible if they have the same calling convention. + +Nimrod supports these `calling conventions`:idx:, which are all incompatible with +each other: + +`stdcall`:idx: + This is the stdcall convention as specified by Microsoft. The generated C + procedure is declared with the ``__stdcall`` keyword.
+ +`cdecl`:idx: + The cdecl convention means that a procedure shall use the same convention + as the C compiler. Under Windows the generated C procedure is declared with + the ``__cdecl`` keyword. + +`safecall`:idx: + This is the safecall convention as specified by Microsoft. The generated C + procedure is declared with the ``__safecall`` keyword. The word *safe* + refers to the fact that all hardware registers shall be pushed to the + hardware stack. + +`inline`:idx: + The inline convention means that the caller should not call the procedure, + but inline its code directly. Note that Nimrod does not inline, but leaves + this to the C compiler. Thus it generates ``__inline`` procedures. This is + only a hint for the compiler: it may completely ignore it and + it may inline procedures that are not marked as ``inline``. + +`fastcall`:idx: + Fastcall means different things to different C compilers. One gets whatever + the C ``__fastcall`` means. + +`nimcall`:idx: + Nimcall is the default convention used for Nimrod procedures. It is the + same as ``fastcall``, but only for C compilers that support ``fastcall``. + +`closure`:idx: + indicates that the procedure expects a context, a closure that needs + to be passed to the procedure. The calling convention ``nimcall`` is + compatible with ``closure``. + +`syscall`:idx: + The syscall convention is the same as ``__syscall`` in C. It is used for + interrupts. + +`noconv`:idx: + The generated C code will not have any explicit calling convention and thus + use the C compiler's default calling convention. This is needed because + Nimrod's default calling convention for procedures is ``fastcall`` to + improve speed. + +Most calling conventions exist only for the Windows 32-bit platform. + +Assigning/passing a procedure to a procedural variable is only allowed if one +of the following conditions holds: +1) The procedure that is accessed resides in the current module.
+2) The procedure is marked with the ``procvar`` pragma (see `procvar pragma`_). +3) The procedure has a calling convention that differs from ``nimcall``. +4) The procedure is anonymous. + +The rules' purpose is to prevent the case that extending a non-``procvar`` +procedure with default parameters breaks client code. + + +Distinct type +~~~~~~~~~~~~~ + +A distinct type is a new type derived from a `base type`:idx: that is +incompatible with its base type. In particular, it is an essential property +of a distinct type that it **does not** imply a subtype relation between it +and its base type. Explicit type conversions from a distinct type to its +base type and vice versa are allowed. + +A distinct type can be used to model different physical `units`:idx: with a +numerical base type, for example. The following example models currencies. + +Different currencies should not be mixed in monetary calculations. Distinct +types are a perfect tool to model different currencies: + +.. code-block:: nimrod + type + TDollar = distinct int + TEuro = distinct int + + var + d: TDollar + e: TEuro + + echo d + 12 + # Error: cannot add a number with no unit and a ``TDollar`` + +Unfortunately, ``d + 12.TDollar`` is not allowed either, +because ``+`` is defined for ``int`` (among others), not for ``TDollar``. So +a ``+`` for dollars needs to be defined: + +.. code-block:: + proc `+` (x, y: TDollar): TDollar = + result = TDollar(int(x) + int(y)) + +It does not make sense to multiply a dollar with a dollar, but with a +number without unit; and the same holds for division: + +.. code-block:: + proc `*` (x: TDollar, y: int): TDollar = + result = TDollar(int(x) * y) + + proc `*` (x: int, y: TDollar): TDollar = + result = TDollar(x * int(y)) + + proc `div` ... + +This quickly gets tedious. The implementations are trivial and the compiler +should not generate all this code only to optimize it away later - after all +``+`` for dollars should produce the same binary code as ``+`` for ints.
+The pragma ``borrow`` has been designed to solve this problem; in principle +it generates the above trivial implementations: + +.. code-block:: nimrod + proc `*` (x: TDollar, y: int): TDollar {.borrow.} + proc `*` (x: int, y: TDollar): TDollar {.borrow.} + proc `div` (x: TDollar, y: int): TDollar {.borrow.} + +The ``borrow`` pragma makes the compiler use the same implementation as +the proc that deals with the distinct type's base type, so no code is +generated. + +But it seems all this boilerplate code needs to be repeated for the ``TEuro`` +currency. This can be solved with templates_. + +.. code-block:: nimrod + template Additive(typ: typeDesc): stmt = + proc `+` *(x, y: typ): typ {.borrow.} + proc `-` *(x, y: typ): typ {.borrow.} + + # unary operators: + proc `+` *(x: typ): typ {.borrow.} + proc `-` *(x: typ): typ {.borrow.} + + template Multiplicative(typ, base: typeDesc): stmt = + proc `*` *(x: typ, y: base): typ {.borrow.} + proc `*` *(x: base, y: typ): typ {.borrow.} + proc `div` *(x: typ, y: base): typ {.borrow.} + proc `mod` *(x: typ, y: base): typ {.borrow.} + + template Comparable(typ: typeDesc): stmt = + proc `<` * (x, y: typ): bool {.borrow.} + proc `<=` * (x, y: typ): bool {.borrow.} + proc `==` * (x, y: typ): bool {.borrow.} + + template DefineCurrency(typ, base: expr): stmt = + type + typ* = distinct base + Additive(typ) + Multiplicative(typ, base) + Comparable(typ) + + DefineCurrency(TDollar, int) + DefineCurrency(TEuro, int) + + + +Type relations +-------------- + +The following section defines several relations on types that are needed to +describe the type checking done by the compiler. + + +Type equality +~~~~~~~~~~~~~ +Nimrod uses structural type equivalence for most types. Only for objects, +enumerations and distinct types name equivalence is used. The following +algorithm (in pseudo-code) determines type equality: + +.. 
code-block:: nimrod + proc typeEqualsAux(a, b: PType, + s: var set[PType * PType]): bool = + if (a,b) in s: return true + incl(s, (a,b)) + if a.kind == b.kind: + case a.kind + of int, intXX, float, floatXX, char, string, cstring, pointer, bool, nil: + # leaf type: kinds identical; nothing more to check + result = true + of ref, ptr, var, set, seq, openarray: + result = typeEqualsAux(a.baseType, b.baseType, s) + of range: + result = typeEqualsAux(a.baseType, b.baseType, s) and + (a.rangeA == b.rangeA) and (a.rangeB == b.rangeB) + of array: + result = typeEqualsAux(a.baseType, b.baseType, s) and + typeEqualsAux(a.indexType, b.indexType, s) + of tuple: + if a.tupleLen == b.tupleLen: + for i in 0..a.tupleLen-1: + if not typeEqualsAux(a[i], b[i], s): return false + result = true + of object, enum, distinct: + result = a == b + of proc: + result = typeEqualsAux(a.parameterTuple, b.parameterTuple, s) and + typeEqualsAux(a.resultType, b.resultType, s) and + a.callingConvention == b.callingConvention + + proc typeEquals(a, b: PType): bool = + var s: set[PType * PType] = {} + result = typeEqualsAux(a, b, s) + +Since types are graphs which can have cycles, the above algorithm needs an +auxiliary set ``s`` to detect this case. + + +Subtype relation +~~~~~~~~~~~~~~~~ +If object ``a`` inherits from ``b``, ``a`` is a subtype of ``b``. This subtype +relation is extended to the types ``var``, ``ref``, ``ptr``: + +.. code-block:: nimrod + proc isSubtype(a, b: PType): bool = + if a.kind == b.kind: + case a.kind + of object: + var aa = a.baseType + while aa != nil and aa != b: aa = aa.baseType + result = aa == b + of var, ref, ptr: + result = isSubtype(a.baseType, b.baseType) + +.. XXX nil is a special value! + + +Convertible relation +~~~~~~~~~~~~~~~~~~~~ +A type ``a`` is **implicitly** convertible to type ``b`` iff the following +algorithm returns true: + +.. code-block:: nimrod + # XXX range types? 
+ proc isImplicitlyConvertible(a, b: PType): bool = + case a.kind + of proc: + if b.kind == proc: + var x = a.parameterTuple + var y = b.parameterTuple + if x.tupleLen == y.tupleLen: + for i in 0.. x.tupleLen-1: + if not isSubtype(x[i], y[i]): return false + result = isSubtype(b.resultType, a.resultType) + of int8: result = b.kind in {int16, int32, int64, int} + of int16: result = b.kind in {int32, int64, int} + of int32: result = b.kind in {int64, int} + of float: result = b.kind in {float32, float64} + of float32: result = b.kind in {float64, float} + of float64: result = b.kind in {float32, float} + of seq: + result = b.kind == openArray and typeEquals(a.baseType, b.baseType) + of array: + result = b.kind == openArray and typeEquals(a.baseType, b.baseType) + if a.baseType == char and a.indexType.rangeA == 0: + result = b.kind == cstring + of cstring, ptr: + result = b.kind == pointer + of string: + result = b.kind == cstring + +A type ``a`` is **explicitly** convertible to type ``b`` iff the following +algorithm returns true: + +.. code-block:: nimrod + proc isIntegralType(t: PType): bool = + result = isOrdinal(t) or t.kind in {float, float32, float64} + + proc isExplicitlyConvertible(a, b: PType): bool = + if isImplicitlyConvertible(a, b): return true + if isIntegralType(a) and isIntegralType(b): return true + if isSubtype(a, b) or isSubtype(b, a): return true + if a.kind == distinct and typeEquals(a.baseType, b): return true + if b.kind == distinct and typeEquals(b.baseType, a): return true + return false + + +Assignment compatibility +~~~~~~~~~~~~~~~~~~~~~~~~ + +An expression ``b`` can be assigned to an expression ``a`` iff ``a`` is an +`l-value` and ``isImplicitlyConvertible(b.typ, a.typ)`` holds. + + +Overloading resolution +~~~~~~~~~~~~~~~~~~~~~~ + +To be written. + + +Statements and expressions +-------------------------- +Nimrod uses the common statement/expression paradigm: `Statements`:idx: do not +produce a value in contrast to expressions.
Call expressions are statements. +If the called procedure returns a value, it is not a valid statement +as statements do not produce values. To evaluate an expression for +side-effects and throw its value away, one can use the ``discard`` statement. + +Statements are separated into `simple statements`:idx: and +`complex statements`:idx:. +Simple statements are statements that cannot contain other statements like +assignments, calls or the ``return`` statement; complex statements can +contain other statements. To avoid the `dangling else problem`:idx:, complex +statements always have to be indented:: + + simpleStmt ::= returnStmt + | yieldStmt + | discardStmt + | raiseStmt + | breakStmt + | continueStmt + | pragma + | importStmt + | fromStmt + | includeStmt + | exprStmt + complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt + | blockStmt | asmStmt + | procDecl | iteratorDecl | macroDecl | templateDecl + | constSection | typeSection | whenStmt | varSection + + + +Discard statement +~~~~~~~~~~~~~~~~~ + +Syntax:: + + discardStmt ::= 'discard' expr + +Example: + +.. code-block:: nimrod + + discard proc_call("arg1", "arg2") # discard the return value of `proc_call` + +The `discard`:idx: statement evaluates its expression for side-effects and +throws the expression's resulting value away. If the expression has no +side-effects, this generates a static error. Ignoring the return value of a +procedure without using a discard statement is a static error too. + + +Var statement +~~~~~~~~~~~~~ + +Syntax:: + + colonOrEquals ::= ':' typeDesc ['=' expr] | '=' expr + varField ::= symbol ['*'] [pragma] + varPart ::= symbol (comma symbol)* [comma] colonOrEquals [COMMENT | IND COMMENT] + varSection ::= 'var' (varPart + | indPush (COMMENT|varPart) + (SAD (COMMENT|varPart))* DED indPop) + + +`Var`:idx: statements declare new local and global variables and +initialize them. A comma separated list of variables can be used to specify +variables of the same type: + +..
code-block:: nimrod + + var + a: int = 0 + x, y, z: int + +If an initializer is given the type can be omitted: the variable is then of the +same type as the initializing expression. Variables are always initialized +with a default value if there is no initializing expression. The default +value depends on the type and is always a zero in binary. + +============================ ============================================== +Type default value +============================ ============================================== +any integer type 0 +any float 0.0 +char '\\0' +bool false +ref or pointer type nil +procedural type nil +sequence nil (*not* ``@[]``) +string nil (*not* "") +tuple[x: A, y: B, ...] (default(A), default(B), ...) + (analogous for objects) +array[0..., T] [default(T), ...] +range[T] default(T); this may be out of the valid range +T = enum cast[T](0); this may be an invalid value +============================ ============================================== + + +Const section +~~~~~~~~~~~~~ + +Syntax:: + + colonAndEquals ::= [':' typeDesc] '=' expr + + constDecl ::= symbol ['*'] [pragma] colonAndEquals [COMMENT | IND COMMENT] + | COMMENT + constSection ::= 'const' indPush constDecl (SAD constDecl)* DED indPop + + +Example: + +.. code-block:: nimrod + + const + MyFilename = "/home/my/file.txt" + debugMode: bool = false + +The `const`:idx: section declares symbolic constants. A symbolic constant is +a name for a constant expression. Symbolic constants only allow read-access. + + +If statement +~~~~~~~~~~~~ + +Syntax:: + + ifStmt ::= 'if' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] + +Example: + +.. 
code-block:: nimrod + + var name = readLine(stdin) + + if name == "Andreas": + echo("What a nice name!") + elif name == "": + echo("Don't you have a name?") + else: + echo("Boring name...") + +The `if`:idx: statement is a simple way to make a branch in the control flow: +The expression after the keyword ``if`` is evaluated, if it is true +the corresponding statements after the ``:`` are executed. Otherwise +the expression after the ``elif`` is evaluated (if there is an +``elif`` branch), if it is true the corresponding statements after +the ``:`` are executed. This goes on until the last ``elif``. If all +conditions fail, the ``else`` part is executed. If there is no ``else`` +part, execution continues with the statement after the ``if`` statement. + + +Case statement +~~~~~~~~~~~~~~ + +Syntax:: + + caseStmt ::= 'case' expr [':'] ('of' sliceExprList ':' stmt)* + ('elif' expr ':' stmt)* + ['else' ':' stmt] + +Example: + +.. code-block:: nimrod + + case readline(stdin) + of "delete-everything", "restart-computer": + echo("permission denied") + of "go-for-a-walk": echo("please yourself") + else: echo("unknown command") + +The `case`:idx: statement is similar to the if statement, but it represents +a multi-branch selection. The expression after the keyword ``case`` is +evaluated and if its value is in a *slicelist* the corresponding statements +(after the ``of`` keyword) are executed. If the value is not in any +given *slicelist* the ``else`` part is executed. If there is no ``else`` +part and not all possible values that ``expr`` can hold occur in a +``slicelist``, a static error occurs. This holds only for expressions of +ordinal types. +If the expression is not of an ordinal type, and no ``else`` part is +given, control passes after the ``case`` statement. + +To suppress the static error in the ordinal case an ``else`` part with a ``nil`` +statement can be used. 
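+
+The exhaustiveness rule for ordinal types can be sketched with a small
+example. This is only an illustration: the names ``TColor``, ``describe`` and
+``classify`` are hypothetical and do not appear elsewhere in this manual.
+
+.. code-block:: nimrod
+  type
+    TColor = enum
+      red, green, blue
+
+  # every value of TColor occurs in a branch, so no ``else`` part is needed
+  proc describe(c: TColor): string =
+    case c
+    of red: result = "warm"
+    of green, blue: result = "cool"
+
+  # ``int`` has values that are not listed, so an ``else`` part is required
+  proc classify(x: int): string =
+    case x
+    of 0: result = "zero"
+    of 1..9: result = "digit"
+    else: result = "other"
+
+Removing the ``of green, blue`` branch from ``describe`` would be expected to
+produce the static error described above, while ``classify`` would fail to
+compile without its ``else`` part.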
+ + +When statement +~~~~~~~~~~~~~~ + +Syntax:: + + whenStmt ::= 'when' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] + +Example: + +.. code-block:: nimrod + + when sizeof(int) == 2: + echo("running on a 16 bit system!") + elif sizeof(int) == 4: + echo("running on a 32 bit system!") + elif sizeof(int) == 8: + echo("running on a 64 bit system!") + else: + echo("cannot happen!") + +The `when`:idx: statement is almost identical to the ``if`` statement with some +exceptions: + +* Each ``expr`` has to be a constant expression (of type ``bool``). +* The statements do not open a new scope. +* The statements that belong to the expression that evaluated to true are + translated by the compiler, the other statements are not checked for + semantics! However, each ``expr`` is checked for semantics. + +The ``when`` statement enables conditional compilation techniques. As +a special syntactic extension, the ``when`` construct is also available +within ``object`` definitions. + + +Raise statement +~~~~~~~~~~~~~~~ + +Syntax:: + + raiseStmt ::= 'raise' [expr] + +Example: + +.. code-block:: nimrod + raise newEOS("operating system failed") + +Apart from built-in operations like array indexing, memory allocation, etc. +the ``raise`` statement is the only way to raise an exception. + +.. XXX document this better! + +If no exception name is given, the current exception is `re-raised`:idx:. The +`ENoExceptionToReraise`:idx: exception is raised if there is no exception to +re-raise. It follows that the ``raise`` statement *always* raises an +exception. + + +Try statement +~~~~~~~~~~~~~ + +Syntax:: + + qualifiedIdent ::= symbol ['.' symbol] + exceptList ::= [qualifiedIdent (comma qualifiedIdent)* [comma]] + tryStmt ::= 'try' ':' stmt + ('except' exceptList ':' stmt)* + ['finally' ':' stmt] + +Example: + +.. 
code-block:: nimrod + # read the first two lines of a text file that should contain numbers + # and tries to add them + var + f: TFile + if open(f, "numbers.txt"): + try: + var a = readLine(f) + var b = readLine(f) + echo("sum: " & $(parseInt(a) + parseInt(b))) + except EOverflow: + echo("overflow!") + except EInvalidValue: + echo("could not convert string to integer") + except EIO: + echo("IO error!") + except: + echo("Unknown exception!") + finally: + close(f) + +The statements after the `try`:idx: are executed in sequential order unless +an exception ``e`` is raised. If the exception type of ``e`` matches any +of the list ``exceptlist`` the corresponding statements are executed. +The statements following the ``except`` clauses are called +`exception handlers`:idx:. + +The empty `except`:idx: clause is executed if there is an exception that is +in no list. It is similar to an ``else`` clause in ``if`` statements. + +If there is a `finally`:idx: clause, it is always executed after the +exception handlers. + +The exception is *consumed* in an exception handler. However, an +exception handler may raise another exception. If the exception is not +handled, it is propagated through the call stack. This means that often +the rest of the procedure - that is not within a ``finally`` clause - +is not executed (if an exception occurs). + + +Return statement +~~~~~~~~~~~~~~~~ + +Syntax:: + + returnStmt ::= 'return' [expr] + +Example: + +.. code-block:: nimrod + return 40+2 + +The `return`:idx: statement ends the execution of the current procedure. +It is only allowed in procedures. If there is an ``expr``, this is syntactic +sugar for: + +.. code-block:: nimrod + result = expr + return result + +``return`` without an expression is a short notation for ``return result`` if +the proc has a return type. The `result`:idx: variable is always the return +value of the procedure. It is automatically declared by the compiler. 
As all +variables, ``result`` is initialized to (binary) zero: + +.. code-block:: nimrod + proc returnZero(): int = + # implicitly returns 0 + + +Yield statement +~~~~~~~~~~~~~~~ + +Syntax:: + + yieldStmt ::= 'yield' expr + +Example: + +.. code-block:: nimrod + yield (1, 2, 3) + +The `yield`:idx: statement is used instead of the ``return`` statement in +iterators. It is only valid in iterators. Execution is returned to the body +of the for loop that called the iterator. Yield does not end the iteration +process, but execution is passed back to the iterator if the next iteration +starts. See the section about iterators (`Iterators and the for statement`_) +for further information. + + +Block statement +~~~~~~~~~~~~~~~ + +Syntax:: + + blockStmt ::= 'block' [symbol] ':' stmt + +Example: + +.. code-block:: nimrod + var found = false + block myblock: + for i in 0..3: + for j in 0..3: + if a[j][i] == 7: + found = true + break myblock # leave the block, in this case both for-loops + echo(found) + +The block statement is a means to group statements to a (named) `block`:idx:. +Inside the block, the ``break`` statement is allowed to leave the block +immediately. A ``break`` statement can contain a name of a surrounding +block to specify which block is to leave. + + +Break statement +~~~~~~~~~~~~~~~ + +Syntax:: + + breakStmt ::= 'break' [symbol] + +Example: + +.. code-block:: nimrod + break + +The `break`:idx: statement is used to leave a block immediately. If ``symbol`` +is given, it is the name of the enclosing block that is to leave. If it is +absent, the innermost block is left. + + +While statement +~~~~~~~~~~~~~~~ + +Syntax:: + + whileStmt ::= 'while' expr ':' stmt + +Example: + +.. code-block:: nimrod + echo("Please tell me your password: \n") + var pw = readLine(stdin) + while pw != "12345": + echo("Wrong password! Next try: \n") + pw = readLine(stdin) + + +The `while`:idx: statement is executed until the ``expr`` evaluates to false. +Endless loops are no error. 
``while`` statements open an `implicit block`, +so that they can be left with a ``break`` statement. + + +Continue statement +~~~~~~~~~~~~~~~~~~ + +Syntax:: + + continueStmt ::= 'continue' + +A `continue`:idx: statement leads to the immediate next iteration of the +surrounding loop construct. It is only allowed within a loop. A continue +statement is syntactic sugar for a nested block: + +.. code-block:: nimrod + while expr1: + stmt1 + continue + stmt2 + +Is equivalent to: + +.. code-block:: nimrod + while expr1: + block myBlockName: + stmt1 + break myBlockName + stmt2 + + +Assembler statement +~~~~~~~~~~~~~~~~~~~ +Syntax:: + + asmStmt ::= 'asm' [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT) + +The direct embedding of `assembler`:idx: code into Nimrod code is supported +by the unsafe ``asm`` statement. Identifiers in the assembler code that refer to +Nimrod identifiers shall be enclosed in a special character which can be +specified in the statement's pragmas. The default special character is ``'`'``. + + +If expression +~~~~~~~~~~~~~ + +An `if expression` is almost like an if statement, but it is an expression. +Example: + +.. code-block:: nimrod + p(if x > 8: 9 else: 10) + +An if expression always results in a value, so the ``else`` part is +required. ``Elif`` parts are also allowed (but unlikely to be good +style). + + +Type conversions +~~~~~~~~~~~~~~~~ +Syntactically a `type conversion` is like a procedure call, but a +type name replaces the procedure name. A type conversion is always +safe in the sense that a failure to convert a type to another +results in an exception (if it cannot be determined statically). + + +Type casts +~~~~~~~~~~ +Example: + +.. code-block:: nimrod + cast[int](x) + +Type casts are a crude mechanism to interpret the bit pattern of +an expression as if it would be of another type. Type casts are +only needed for low-level programming and are inherently unsafe. 
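+
+As a minimal sketch of the difference between a type conversion and a type
+cast (the variable names are illustrative only, and it is an assumption here
+that ``float`` accepts an ``int`` argument as a type conversion):
+
+.. code-block:: nimrod
+  var n = 200
+  var f = float(n)             # type conversion: the value itself is converted
+  var a = cast[int](addr(n))   # type cast: the pointer's bits are read as an int
+  echo(f)
+  echo(a)                      # some address; the value differs between runs
+
+The conversion yields a well-defined value, whereas the cast merely
+reinterprets an existing bit pattern and is therefore unsafe.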
+ + +The addr operator +~~~~~~~~~~~~~~~~~ +The `addr` operator returns the address of an l-value. If the +type of the location is ``T``, the `addr` operator result is +of the type ``ptr T``. Taking the address of an object that resides +on the stack is **unsafe**, as the pointer may live longer than the +object on the stack and can thus reference a non-existing object. + + +Procedures +~~~~~~~~~~ +What most programming languages call `methods`:idx: or `functions`:idx: are +called `procedures`:idx: in Nimrod (which is the correct terminology). A +procedure declaration defines an identifier and associates it with a block +of code. +A procedure may call itself recursively. A parameter may be given a default +value that is used if the caller does not provide a value for this parameter. +The syntax is:: + + param ::= symbol (comma symbol)* (':' typeDesc ['=' expr] | '=' expr) + paramList ::= ['(' [param (comma param)*] [SAD] ')'] [':' typeDesc] + + genericParam ::= symbol [':' typeDesc] ['=' expr] + genericParams ::= '[' genericParam (comma genericParam)* [SAD] ']' + + procDecl ::= 'proc' symbol ['*'] [genericParams] paramList [pragma] + ['=' stmt] + +If the ``= stmt`` part is missing, it is a `forward`:idx: declaration. If +the proc returns a value, the procedure body can access an implicitly declared +variable named `result`:idx: that represents the return value. Procs can be +overloaded. The overloading resolution algorithm tries to find the proc that is +the best match for the arguments. Example: + +.. code-block:: nimrod + + proc toLower(c: Char): Char = # toLower for characters + if c in {'A'..'Z'}: + result = chr(ord(c) + (ord('a') - ord('A'))) + else: + result = c + + proc toLower(s: string): string = # toLower for strings + result = newString(len(s)) + for i in 0..len(s) - 1: + result[i] = toLower(s[i]) # calls toLower for characters; no recursion! + +Calling a procedure can be done in many different ways: + +.. 
code-block:: nimrod + proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ... + + # call with positional arguments # parameter bindings: + callme(0, 1, "abc", '\t', true) # (x=0, y=1, s="abc", c='\t', b=true) + # call with named and positional arguments: + callme(y=1, x=0, "abd", '\t') # (x=0, y=1, s="abd", c='\t', b=false) + # call with named arguments (order is not relevant): + callme(c='\t', y=1, x=0) # (x=0, y=1, s="", c='\t', b=false) + # call as a command statement: no () needed: + callme 0, 1, "abc", '\t' + + +A procedure cannot modify its parameters (unless the parameters have the type +`var`). + +`Operators`:idx: are procedures with a special operator symbol as identifier: + +.. code-block:: nimrod + proc `$` (x: int): string = + # converts an integer to a string; this is a prefix operator. + return intToStr(x) + +Operators with one parameter are prefix operators, operators with two +parameters are infix operators. (However, the parser distinguishes these by +the operator's position within an expression.) There is no way to declare +postfix operators: all postfix operators are built-in and handled by the +grammar explicitly. + +Any operator can be called like an ordinary proc with the '`opr`' +notation. (Thus an operator can have more than two parameters): + +.. code-block:: nimrod + proc `*+` (a, b, c: int): int = + # Multiply and add + return a * b + c + + assert `*+`(3, 4, 6) == `+`(`*`(3, 4), 6) + + + +Var parameters +~~~~~~~~~~~~~~ +The type of a parameter may be prefixed with the ``var`` keyword: + +.. code-block:: nimrod + proc divmod(a, b: int, + res, remainder: var int) = + res = a div b + remainder = a mod b + + var + x, y: int + + divmod(8, 5, x, y) # modifies x and y + assert x == 1 + assert y == 3 + +In the example, ``res`` and ``remainder`` are `var parameters`. +Var parameters can be modified by the procedure and the changes are +visible to the caller. The argument passed to a var parameter has to be +an l-value. 
Var parameters are implemented as hidden pointers. The +above example is equivalent to: + +.. code-block:: nimrod + proc divmod(a, b: int, + res, remainder: ptr int) = + res^ = a div b + remainder^ = a mod b + + var + x, y: int + divmod(8, 5, addr(x), addr(y)) + assert x == 1 + assert y == 3 + +In the examples, var parameters or pointers are used to provide two +return values. This can be done in a cleaner way by returning a tuple: + +.. code-block:: nimrod + proc divmod(a, b: int): tuple[res, remainder: int] = + return (a div b, a mod b) + + var t = divmod(8, 5) + assert t.res == 1 + assert t.remainder == 3 + +One can use `tuple unpacking`:idx: to access the tuple's fields: + +.. code-block:: nimrod + var (x, y) = divmod(8, 5) # tuple unpacking + assert x == 1 + assert y == 3 + + +Overloading of the subscript operator +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded. +Overloading support is only possible if the first parameter has no type that +already supports the built-in ``[]`` notation. Currently the compiler +does not check this. XXX Multiple indexes + + +Multi-methods +~~~~~~~~~~~~~ + +Procedures always use static dispatch. `Multi-methods`:idx: use dynamic +dispatch. + +.. code-block:: nimrod + type + TExpr = object ## abstract base class for an expression + TLiteral = object of TExpr + x: int + TPlusExpr = object of TExpr + a, b: ref TExpr + + method eval(e: ref TExpr): int = + # override this base method + quit "to override!" 
+ + method eval(e: ref TLiteral): int = return e.x + + method eval(e: ref TPlusExpr): int = + # watch out: relies on dynamic binding + return eval(e.a) + eval(e.b) + + proc newLit(x: int): ref TLiteral = + new(result) + result.x = x + + proc newPlus(a, b: ref TExpr): ref TPlusExpr = + new(result) + result.a = a + result.b = b + + echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4))) + +In the example the constructors ``newLit`` and ``newPlus`` are procs +because they should use static binding, but ``eval`` is a method because it +requires dynamic binding. + +In a multi-method all parameters that have an object type are used for the +dispatching: + +.. code-block:: nimrod + type + TThing = object + TUnit = object of TThing + x: int + + method collide(a, b: TThing) {.inline.} = + quit "to override!" + + method collide(a: TThing, b: TUnit) {.inline.} = + echo "1" + + method collide(a: TUnit, b: TThing) {.inline.} = + echo "2" + + var + a, b: TUnit + collide(a, b) # output: 2 + + +Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over +collide 1 because the resolution works from left to right. +In the example ``TUnit, TThing`` is preferred over ``TThing, TUnit``. + +**Performance note**: Nimrod does not produce a virtual method table, but +generates dispatch trees. This avoids the expensive indirect branch for method +calls and enables inlining. However, other optimizations like compile time +evaluation or dead code elimination do not work with methods. + + +Iterators and the for statement +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Syntax:: + + forStmt ::= 'for' symbol (comma symbol)* [comma] 'in' expr ['..' 
expr] ':' stmt + + param ::= symbol (comma symbol)* [comma] ':' typeDesc + paramList ::= ['(' [param (comma param)* [comma]] ')'] [':' typeDesc] + + genericParam ::= symbol [':' typeDesc] + genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' + + iteratorDecl ::= 'iterator' symbol ['*'] [genericParams] paramList [pragma] + ['=' stmt] + +The `for`:idx: statement is an abstract mechanism to iterate over the elements +of a container. It relies on an `iterator`:idx: to do so. Like ``while`` +statements, ``for`` statements open an `implicit block`:idx:, so that they +can be left with a ``break`` statement. The ``for`` loop declares +iteration variables (``x`` in the example) - their scope reaches until the +end of the loop body. The iteration variables' types are inferred by the +return type of the iterator. + +An iterator is similar to a procedure, except that it is always called in the +context of a ``for`` loop. Iterators provide a way to specify the iteration over +an abstract type. A key role in the execution of a ``for`` loop plays the +``yield`` statement in the called iterator. Whenever a ``yield`` statement is +reached the data is bound to the ``for`` loop variables and control continues +in the body of the ``for`` loop. The iterator's local variables and execution +state are automatically saved between calls. Example: + +.. code-block:: nimrod + # this definition exists in the system module + iterator items*(a: string): char {.inline.} = + var i = 0 + while i < len(a): + yield a[i] + inc(i) + + for ch in items("hello world"): # `ch` is an iteration variable + echo(ch) + +The compiler generates code as if the programmer would have written this: + +.. code-block:: nimrod + var i = 0 + while i < len(a): + var ch = a[i] + echo(ch) + inc(i) + +The current implementation always inlines the iterator code leading to zero +overhead for the abstraction. But this may increase the code size. 
Later +versions of the compiler will only inline iterators which have the calling +convention ``inline``. + +If the iterator yields a tuple, there have to be as many iteration variables +as there are components in the tuple. The i'th iteration variable's type is +the one of the i'th component. + + +Type sections +~~~~~~~~~~~~~ + +Syntax:: + + typeDef ::= typeDesc | objectDef | enumDef + + genericParam ::= symbol [':' typeDesc] + genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' + + typeDecl ::= COMMENT + | symbol ['*'] [genericParams] ['=' typeDef] [COMMENT|IND COMMENT] + + typeSection ::= 'type' indPush typeDecl (SAD typeDecl)* DED indPop + + +Example: + +.. code-block:: nimrod + type # example demonstrates mutually recursive types + PNode = ref TNode # a traced pointer to a TNode + TNode = object + le, ri: PNode # left and right subtrees + sym: ref TSym # leaves contain a reference to a TSym + + TSym = object # a symbol + name: string # the symbol's name + line: int # the line the symbol was declared in + code: PNode # the symbol's abstract syntax tree + +A `type`:idx: section begins with the ``type`` keyword. It contains multiple +type definitions. A type definition binds a type to a name. Type definitions +can be recursive or even mutually recursive. Mutually recursive types are only +possible within a single ``type`` section. + + +Generics +~~~~~~~~ + +Example: + +.. 
code-block:: nimrod + type + TBinaryTree[T] = object # TBinaryTree is a generic type with + # generic param ``T`` + le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil + data: T # the data stored in a node + PBinaryTree[T] = ref TBinaryTree[T] # a shorthand for notational convenience + + proc newNode[T](data: T): PBinaryTree[T] = # constructor for a node + new(result) + result.data = data + + proc add[T](root: var PBinaryTree[T], n: PBinaryTree[T]) = + if root == nil: + root = n + else: + var it = root + while it != nil: + var c = cmp(it.data, n.data) # compare the data items; uses + # the generic ``cmp`` proc that works for + # any type that has a ``==`` and ``<`` + # operator + if c < 0: + if it.le == nil: + it.le = n + return + it = it.le + else: + if it.ri == nil: + it.ri = n + return + it = it.ri + + iterator inorder[T](root: PBinaryTree[T]): T = + # inorder traversal of a binary tree + # recursive iterators are not yet implemented, so this does not work in + # the current compiler! + if root.le != nil: yield inorder(root.le) + yield root.data + if root.ri != nil: yield inorder(root.ri) + + var + root: PBinaryTree[string] # instantiate a PBinaryTree with the type string + add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and + add(root, newNode("world")) # ``add`` + for str in inorder(root): + writeln(stdout, str) + +`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types with +`type parameters`:idx:. Depending on context, the brackets are used either to +introduce type parameters or to instantiate a generic proc, iterator or type. + + +Templates +~~~~~~~~~ + +A `template`:idx: is a simple form of a macro: It is a simple substitution +mechanism that operates on Nimrod's abstract syntax trees. It is processed in +the semantic pass of the compiler. + +The syntax to *invoke* a template is the same as calling a procedure. + +Example: + +.. 
code-block:: nimrod + template `!=` (a, b: expr): expr = + # this definition exists in the System module + not (a == b) + + assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6)) + +The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact +templates: + +| ``a > b`` is transformed into ``b < a``. +| ``a in b`` is transformed into ``contains(b, a)``. +| ``notin`` and ``isnot`` have the obvious meanings. + +The "types" of templates can be the symbols ``expr`` (stands for *expression*), +``stmt`` (stands for *statement*) or ``typedesc`` (stands for *type +description*). These are no real types, they just help the compiler parsing. +Real types can be used too; this implies that expressions are expected. +However, for parameter type checking the arguments are semantically checked +before being passed to the template. Other arguments are not semantically +checked before being passed to the template. + +The template body does not open a new scope. To open a new scope a ``block`` +statement can be used: + +.. code-block:: nimrod + template declareInScope(x: expr, t: typeDesc): stmt = + var x: t + + template declareInNewScope(x: expr, t: typeDesc): stmt = + # open a new scope: + block: + var x: t + + declareInScope(a, int) + a = 42 # works, `a` is known here + + declareInNewScope(b, int) + b = 42 # does not work, `b` is unknown + + +If there is a ``stmt`` parameter it should be the last in the template +declaration, because statements are passed to a template via a +special ``:`` syntax: + +.. code-block:: nimrod + + template withFile(f, fn, mode: expr, actions: stmt): stmt = + block: + var f: TFile + if open(f, fn, mode): + try: + actions + finally: + close(f) + else: + quit("cannot open: " & fn) + + withFile(txt, "ttempl3.txt", fmWrite): + txt.writeln("line 1") + txt.writeln("line 2") + +In the example the two ``writeln`` statements are bound to the ``actions`` +parameter. + +**Note:** Symbol binding rules in templates might change! 
+ +Symbol binding within templates happens after template instantiation: + +.. code-block:: nimrod + # Module A + var + lastId = 0 + + template genId*: expr = + inc(lastId) + lastId + +.. code-block:: nimrod + # Module B + import A + + echo genId() # Error: undeclared identifier: 'lastId' + +Exporting a template is often a leaky abstraction. However, to compensate for +this case, the ``bind`` operator can be used: All identifiers within a ``bind`` +context are bound early (i.e. when the template is parsed). +The affected identifiers are then always bound early even if the other +occurrences are in no ``bind`` context: + +.. code-block:: nimrod + # Module A + var + lastId = 0 + + template genId*: expr = + inc(bind lastId) + lastId + +.. code-block:: nimrod + # Module B + import A + + echo genId() # Works + + +**Style note**: For code readability, it is the best idea to use the least +powerful programming construct that still suffices. So the "check list" is: + +(1) Use an ordinary proc/iterator, if possible. +(2) Else: Use a generic proc/iterator, if possible. +(3) Else: Use a template, if possible. +(4) Else: Use a macro. + + +Macros +------ + +`Macros`:idx: are the most powerful feature of Nimrod. They can be used +to implement `domain specific languages`:idx:. + +While macros enable advanced compile-time code transformations, they +cannot change Nimrod's syntax. However, this is no real restriction because +Nimrod's syntax is flexible enough anyway. + +To write macros, one needs to know how the Nimrod concrete syntax is converted +to an abstract syntax tree. + +There are two ways to invoke a macro: +(1) invoking a macro like a procedure call (`expression macros`) +(2) invoking a macro with the special ``macrostmt`` syntax (`statement macros`) + + +Expression Macros +~~~~~~~~~~~~~~~~~ + +The following example implements a powerful ``debug`` command that accepts a +variable number of arguments: + +.. 
code-block:: nimrod + # to work with Nimrod syntax trees, we need an API that is defined in the + # ``macros`` module: + import macros + + macro debug(n: expr): stmt = + # `n` is a Nimrod AST that contains the whole macro invocation + # this macro returns a list of statements: + result = newNimNode(nnkStmtList, n) + # iterate over any argument that is passed to this macro: + for i in 1..n.len-1: + # add a call to the statement list that writes the expression; + # `toStrLit` converts an AST to its string representation: + add(result, newCall("write", newIdentNode("stdout"), toStrLit(n[i]))) + # add a call to the statement list that writes ": " + add(result, newCall("write", newIdentNode("stdout"), newStrLitNode(": "))) + # add a call to the statement list that writes the expressions value: + add(result, newCall("writeln", newIdentNode("stdout"), n[i])) + + var + a: array [0..10, int] + x = "some string" + a[0] = 42 + a[1] = 45 + + debug(a[0], a[1], x) + +The macro call expands to: + +.. code-block:: nimrod + write(stdout, "a[0]") + write(stdout, ": ") + writeln(stdout, a[0]) + + write(stdout, "a[1]") + write(stdout, ": ") + writeln(stdout, a[1]) + + write(stdout, "x") + write(stdout, ": ") + writeln(stdout, x) + + +Statement Macros +~~~~~~~~~~~~~~~~ + +Statement macros are defined just as expression macros. However, they are +invoked by an expression following a colon:: + + exprStmt ::= lowestExpr ['=' expr | [expr (comma expr)* [comma]] [macroStmt]] + macroStmt ::= ':' [stmt] ('of' [sliceExprList] ':' stmt + | 'elif' expr ':' stmt + | 'except' exceptList ':' stmt )* + ['else' ':' stmt] + +The following example outlines a macro that generates a lexical analyzer from +regular expressions: + +.. code-block:: nimrod + import macros + + macro case_token(n: stmt): stmt = + # creates a lexical analyzer from regular expressions + # ... 
(implementation is an exercise for the reader :-) + nil + + case_token: # this colon tells the parser it is a macro statement + of r"[A-Za-z_]+[A-Za-z_0-9]*": + return tkIdentifier + of r"[0-9]+": + return tkInteger + of r"[\+\-\*\?]+": + return tkOperator + else: + return tkUnknown + + + +Modules +------- +Nimrod supports splitting a program into pieces by a `module`:idx: concept. +Each module needs to be in its own file and has its own `namespace`:idx:. +Modules enable `information hiding`:idx: and `separate compilation`:idx:. +A module may gain access to symbols of another module by the `import`:idx: +statement. `Recursive module dependencies`:idx: are allowed, but slightly +subtle. Only top-level symbols that are marked with an asterisk (``*``) are +exported. + +The algorithm for compiling modules is: + +- compile the whole module as usual, following import statements recursively +- if there is a cycle only import the already parsed symbols (that are + exported); if an unknown identifier occurs then abort + +This is best illustrated by an example: + +.. code-block:: nimrod + # Module A + type + T1* = int # Module A exports the type ``T1`` + import B # the compiler starts parsing B + + proc main() = + var i = p(3) # works because B has been parsed completely here + + main() + + +.. code-block:: nimrod + # Module B + import A # A is not parsed here! Only the already known symbols + # of A are imported. + + proc p*(x: A.T1): A.T1 = + # this works because the compiler has already + # added T1 to A's interface symbol table + return x + 1 + + +Scope rules +----------- +Identifiers are valid from the point of their declaration until the end of +the block in which the declaration occurred. The range where the identifier +is known is the `scope`:idx: of the identifier. The exact scope of an +identifier depends on the way it was declared. 
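+
+The general rule can be sketched as follows; this is only an illustration,
+and the names ``x`` and ``y`` are made up for the example:
+
+.. code-block:: nimrod
+  var x = 1        # `x` is known from here until the end of the module
+  block:
+    var y = 2      # `y` is known only until the end of this block
+    echo(x + y)    # both identifiers are in scope here
+  # using `y` here would be a compile-time error: it is out of scope
+  echo(x)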
+ +Block scope +~~~~~~~~~~~ +The *scope* of a variable declared in the declaration part of a block +is valid from the point of declaration until the end of the block. If a +block contains a second block, in which the identifier is redeclared, +then inside this block, the second declaration will be valid. Upon +leaving the inner block, the first declaration is valid again. An +identifier cannot be redefined in the same block, except if valid for +procedure or iterator overloading purposes. + + +Tuple or object scope +~~~~~~~~~~~~~~~~~~~~~ +The field identifiers inside a tuple or object definition are valid in the +following places: + +* To the end of the tuple/object definition. +* Field designators of a variable of the given tuple/object type. +* In all descendant types of the object type. + +Module scope +~~~~~~~~~~~~ +All identifiers of a module are valid from the point of declaration until +the end of the module. Identifiers from indirectly dependent modules are *not* +available. The `system`:idx: module is automatically imported in every other +module. + +If a module imports an identifier by two different modules, each occurrence of +the identifier has to be qualified, unless it is an overloaded procedure or +iterator in which case the overloading resolution takes place: + +.. code-block:: nimrod + # Module A + var x*: string + +.. code-block:: nimrod + # Module B + var x*: int + +.. code-block:: nimrod + # Module C + import A, B + write(stdout, x) # error: x is ambiguous + write(stdout, A.x) # no error: qualifier used + + var x = 4 + write(stdout, x) # not ambiguous: uses the module C's x + + +Messages +======== + +The Nimrod compiler emits different kinds of messages: `hint`:idx:, +`warning`:idx:, and `error`:idx: messages. An *error* message is emitted if +the compiler encounters any static error. + + +Pragmas +======= + +Syntax:: + + colonExpr ::= expr [':' expr] + colonExprList ::= [colonExpr (comma colonExpr)* [comma]] + + pragma ::= '{.' 
optInd (colonExpr [comma])* [SAD] ('.}' | '}') + +Pragmas are Nimrod's method to give the compiler additional information / +commands without introducing a massive number of new keywords. Pragmas are +processed on the fly during semantic checking. Pragmas are enclosed in the +special ``{.`` and ``.}`` curly brackets. Pragmas are also often used as a +first implementation to play with a language feature before a nicer syntax +to access the feature becomes available. + + +noSideEffect pragma +------------------- +The `noSideEffect`:idx: pragma is used to mark a proc/iterator to have no side +effects. This means that the proc/iterator only changes locations that are +reachable from its parameters and the return value only depends on the +arguments. If none of its parameters have the type ``var T`` +or ``ref T`` or ``ptr T`` this means no locations are modified. It is a static +error to mark a proc/iterator to have no side effect if the compiler cannot +verify this. + +**Future directions**: ``func`` may become a keyword and syntactic sugar for a +proc with no side effects: + +.. code-block:: nimrod + func `+` (x, y: int): int + + +procvar pragma +-------------- +The `procvar`:idx: pragma is used to mark a proc that it can be passed to a +procedural variable. + + +compileTime pragma +------------------ +The `compileTime`:idx: pragma is used to mark a proc to be used at compile +time only. No code will be generated for it. Compile time procs are useful +as helpers for macros. + + +noReturn pragma +--------------- +The `noreturn`:idx: pragma is used to mark a proc that it never returns. + + +Acyclic pragma +-------------- +The `acyclic`:idx: pragma can be used for object types to mark them as acyclic +even though they seem to be cyclic. This is an **optimization** for the garbage +collector to not consider objects of this type as part of a cycle: + +.. 
code-block:: nimrod + type + PNode = ref TNode + TNode {.acyclic, final.} = object + left, right: PNode + data: string + +In the example a tree structure is declared with the ``TNode`` type. Note that +the type definition is recursive and the GC has to assume that objects of +this type may form a cyclic graph. The ``acyclic`` pragma passes the +information that this cannot happen to the GC. If the programmer uses the +``acyclic`` pragma for data types that are in reality cyclic, the GC may leak +memory, but nothing worse happens. + +**Future directions**: The ``acyclic`` pragma may become a property of a +``ref`` type: + +.. code-block:: nimrod + type + PNode = acyclic ref TNode + TNode = object + left, right: PNode + data: string + + +Final pragma +------------ +The `final`:idx: pragma can be used for an object type to specify that it +cannot be inherited from. + + +Pure pragma +----------- +The `pure`:idx: pragma serves two completely different purposes: +1) To mark a procedure that Nimrod should not generate any exit statements like + ``return result;`` in the generated code. This is useful for procs that only + consist of an assembler statement. +2) To mark an object type so that its type field should be omitted. This is + necessary for binary compatibility with other compiled languages. + + +error pragma +------------ +The `error`:idx: pragma is used to make the compiler output an error message +with the given content. Compilation currently aborts after an error, but this +may be changed in later versions. + + +fatal pragma +------------ +The `fatal`:idx: pragma is used to make the compiler output an error message +with the given content. In contrast to the ``error`` pragma, compilation +is guaranteed to be aborted by this pragma. + +warning pragma +-------------- +The `warning`:idx: pragma is used to make the compiler output a warning message +with the given content. Compilation continues after the warning. 
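+
+A minimal sketch of how the ``fatal`` and ``warning`` pragmas might be
+combined with a ``when`` statement. The constant ``wordSize`` and the message
+texts are invented for this illustration, and it is assumed that the message
+is passed as a string literal after the colon:
+
+.. code-block:: nimrod
+  const wordSize = sizeof(int)   # illustrative helper constant
+
+  when wordSize < 4:
+    # compilation is guaranteed to stop here
+    {.fatal: "this module requires at least a 32 bit target".}
+  elif wordSize == 4:
+    # compilation continues after the message
+    {.warning: "this module is only lightly tested on 32 bit targets".}
+
+Only the pragma in the branch selected at compile time is processed, so on
+other targets no message is emitted at all.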
+ +hint pragma +----------- +The `hint`:idx: pragma is used to make the compiler output a hint message with +the given content. Compilation continues after the hint. + + +compilation option pragmas +-------------------------- +The listed pragmas here can be used to override the code generation options +for a section of code. + +The implementation currently provides the following possible options (various +others may be added later). + +=============== =============== ============================================ +pragma allowed values description +=============== =============== ============================================ +checks on|off Turns the code generation for all runtime + checks on or off. +boundChecks on|off Turns the code generation for array bound + checks on or off. +overflowChecks on|off Turns the code generation for over- or + underflow checks on or off. +nilChecks on|off Turns the code generation for nil pointer + checks on or off. +assertions on|off Turns the code generation for assertions + on or off. +warnings on|off Turns the warning messages of the compiler + on or off. +hints on|off Turns the hint messages of the compiler + on or off. +optimization none|speed|size Optimize the code for speed or size, or + disable optimization. +callconv cdecl|... Specifies the default calling convention for + all procedures (and procedure types) that + follow. +=============== =============== ============================================ + +Example: + +.. code-block:: nimrod + {.checks: off, optimization: speed.} + # compile without runtime checks and optimize for speed + + +push and pop pragmas +-------------------- +The `push/pop`:idx: pragmas are very similar to the option directive, +but are used to override the settings temporarily. Example: + +.. code-block:: nimrod + {.push checks: off.} + # compile this section without runtime checks as it is + # speed critical + # ... some code ... 
+ {.pop.} # restore old settings + + +Register pragma +--------------- +The `register`:idx: pragma is for variables only. It declares the variable as +``register``, giving the compiler a hint that the variable should be placed +in a hardware register for faster access. C compilers usually ignore this +though and for good reasons: Often they do a better job without it anyway. + +In highly specific cases (a dispatch loop of a bytecode interpreter for +example) it may provide benefits, though. + + +DeadCodeElim pragma +------------------- +The `deadCodeElim`:idx: pragma only applies to whole modules: It tells the +compiler to activate (or deactivate) dead code elimination for the module the +pragma appears in. + +The ``--deadCodeElim:on`` command line switch has the same effect as marking +every module with ``{.deadCodeElim: on.}``. However, for some modules such as +the GTK wrapper it makes sense to *always* turn on dead code elimination - +no matter if it is globally active or not. + +Example: + +.. code-block:: nimrod + {.deadCodeElim: on.} + + +Disabling certain messages +-------------------------- +Nimrod generates some warnings and hints ("line too long") that may annoy the +user. A mechanism for disabling certain messages is provided: Each hint +and warning message contains a symbol in brackets. This is the message's +identifier that can be used to enable or disable it: + +.. code-block:: Nimrod + {.warning[LineTooLong]: off.} # turn off warning about too long lines + +This is often better than disabling all warnings at once. + + +Foreign function interface +========================== + +Nimrod's `FFI`:idx: (foreign function interface) is extensive and only the +parts that scale to other future backends (like the LLVM/EcmaScript backends) +are documented here. + + +Importc pragma +-------------- +The `importc`:idx: pragma provides a means to import a proc or a variable +from C. The optional argument is a string containing the C identifier. 
If +the argument is missing, the C name is the Nimrod identifier *exactly as +spelled*: + +.. code-block:: + proc printf(formatstr: cstring) {.importc: "printf", varargs.} + +Note that this pragma is somewhat of a misnomer: Other backends will provide +the same feature under the same name. + + +Exportc pragma +-------------- +The `exportc`:idx: pragma provides a means to export a type, a variable, or a +procedure to C. The optional argument is a string containing the C identifier. +If the argument is missing, the C name is the Nimrod +identifier *exactly as spelled*: + +.. code-block:: Nimrod + proc callme(formatstr: cstring) {.exportc: "callMe", varargs.} + +Note that this pragma is somewhat of a misnomer: Other backends will provide +the same feature under the same name. + + +Varargs pragma +-------------- +The `varargs`:idx: pragma can be applied to procedures only (and procedure +types). It tells Nimrod that the proc can take a variable number of parameters +after the last specified parameter. Nimrod string values will be converted to C +strings automatically: + +.. code-block:: Nimrod + proc printf(formatstr: cstring) {.nodecl, varargs.} + + printf("hallo %s", "world") # "world" will be passed as C string + + +Dynlib pragma +------------- +With the `dynlib`:idx: pragma a procedure can be imported from +a dynamic library (``.dll`` files for Windows, ``lib*.so`` files for UNIX). The +non-optional argument has to be the name of the dynamic library: + +.. code-block:: Nimrod + proc gtk_image_new(): PGtkWidget {.cdecl, dynlib: "libgtk-x11-2.0.so", importc.} + +In general, importing a dynamic library does not require any special linker +options or linking with import libraries. This also implies that no *devel* +packages need to be installed. + +The ``dynlib`` import mechanism supports a versioning scheme: + +.. 
code-block:: nimrod + proc Tcl_Eval(interp: pTcl_Interp, script: cstring): int {.cdecl, + importc, dynlib: "libtcl(|8.5|8.4|8.3).so.(1|0)".} + +At runtime the dynamic library is searched for (in this order):: + + libtcl.so.1 + libtcl.so.0 + libtcl8.5.so.1 + libtcl8.5.so.0 + libtcl8.4.so.1 + libtcl8.4.so.0 + libtcl8.3.so.1 + libtcl8.3.so.0 + +The ``dynlib`` pragma supports not only constant strings as argument but also +string expressions in general: + +.. code-block:: nimrod + import os + + proc getDllName: string = + result = "mylib.dll" + if ExistsFile(result): return + result = "mylib2.dll" + if ExistsFile(result): return + quit("could not load dynamic library") + + proc myImport(s: cstring) {.cdecl, importc, dynlib: getDllName().} + +**Note**: Patterns like ``libtcl(|8.5|8.4).so`` are only supported in constant +strings, because they are precompiled. diff --git a/doc/nimrodc.txt b/doc/nimrodc.txt index 79ce06ad1..cdad6efa8 100755 --- a/doc/nimrodc.txt +++ b/doc/nimrodc.txt @@ -1,239 +1,239 @@ -=================================== - Nimrod Compiler User Guide -=================================== - -:Author: Andreas Rumpf -:Version: |nimrodversion| - -.. contents:: - - "Look at you, hacker. A pathetic creature of meat and bone, panting and - sweating as you run through my corridors. How can you challenge a perfect, - immortal machine?" - - -Introduction -============ - -This document describes the usage of the *Nimrod compiler* -on the different supported platforms. It is not a definition of the Nimrod -programming language (therefore is the `manual <manual>`_). - -Nimrod is free software; it is licensed under the -`GNU General Public License <gpl.html>`_. - - -Compiler Usage -============== - -Command line switches ---------------------- -Basis command line switches are: - -.. include:: ../data/basicopt.txt - -Advanced command line switches are: - -.. 
include:: ../data/advopt.txt - - -Configuration file ------------------- -The default configuration file is ``nimrod.cfg``. The ``nimrod`` executable -looks for it in the following directories (in this order): - -1. ``/home/$user/.config/nimrod.cfg`` (UNIX) or ``%APPDATA%/nimrod.cfg`` (Windows) -2. ``$nimrod/config/nimrod.cfg`` (UNIX, Windows) -3. ``/etc/nimrod.cfg`` (UNIX) - -The search stops as soon as a configuration file has been found. The reading -of ``nimrod.cfg`` can be suppressed by the ``--skipCfg`` command line option. - -**Note:** The *project file name* is the name of the ``.nim`` file that is -passed as a command line argument to the compiler. - -Configuration settings can be overwritten in a project specific -configuration file that is read automatically. This specific file has to -be in the same directory as the project and be of the same name, except -that its extension should be ``.cfg``. - -Command line settings have priority over configuration file settings. - - -Generated C code directory --------------------------- -The generated files that Nimrod produces all go into a subdirectory called -``nimcache`` in your project directory. This makes it easy to delete all -generated files. - -However, the generated C code is not platform independant. C code generated for -Linux does not compile on Windows, for instance. The comment on top of the -C file lists the OS, CPU and CC the file has been compiled for. - - -Additional Features -=================== - -This section describes Nimrod's additional features that are not listed in the -Nimrod manual. Some of the features here only make sense for the C code -generator and are subject to change. - - -NoDecl pragma -------------- -The `noDecl`:idx: pragma can be applied to almost any symbol (variable, proc, -type, etc.) and is sometimes useful for interoperability with C: -It tells Nimrod that it should not generate a declaration for the symbol in -the C code. For example: - -.. 
code-block:: Nimrod - var - EACCES {.importc, noDecl.}: cint # pretend EACCES was a variable, as - # Nimrod does not know its value - -However, the ``header`` pragma is often the better alternative. - -**Note**: This will not work for the LLVM backend. - - -Header pragma -------------- -The `header`:idx: pragma is very similar to the ``noDecl`` pragma: It can be -applied to almost any symbol and specifies that it should not be declared -and instead the generated code should contain an ``#include``: - -.. code-block:: Nimrod - type - PFile {.importc: "FILE*", header: "<stdio.h>".} = distinct pointer - # import C's FILE* type; Nimrod will treat it as a new pointer type - -The ``header`` pragma always expects a string constant. The string contant -contains the header file: As usual for C, a system header file is enclosed -in angle brackets: ``<>``. If no angle brackets are given, Nimrod -encloses the header file in ``""`` in the generated C code. - -**Note**: This will not work for the LLVM backend. - - -LineDir option --------------- -The `lineDir`:idx: option can be turned on or off. If turned on the -generated C code contains ``#line`` directives. This may be helpful for -debugging with GDB. - - -StackTrace option ------------------ -If the `stackTrace`:idx: option is turned on, the generated C contains code to -ensure that proper stack traces are given if the program crashes or an -uncaught exception is raised. - - -LineTrace option ----------------- -The `lineTrace`:idx: option implies the ``stackTrace`` option. If turned on, -the generated C contains code to ensure that proper stack traces with line -number information are given if the program crashes or an uncaught exception -is raised. - -Debugger option ---------------- -The `debugger`:idx: option enables or disables the *Embedded Nimrod Debugger*. -See the documentation of endb_ for further information. 
- - -Breakpoint pragma ------------------ -The *breakpoint* pragma was specially added for the sake of debugging with -ENDB. See the documentation of `endb <endb.html>`_ for further information. - - -Volatile pragma ---------------- -The `volatile`:idx: pragma is for variables only. It declares the variable as -``volatile``, whatever that means in C/C++. - -**Note**: This pragma will not exist for the LLVM backend. - - -Debugging with Nimrod -===================== - -Nimrod comes with its own *Embedded Nimrod Debugger*. See -the documentation of endb_ for further information. - - -Optimizing for Nimrod -===================== - -Nimrod has no separate optimizer, but the C code that is produced is very -efficient. Most C compilers have excellent optimizers, so usually it is -not needed to optimize one's code. Nimrod has been designed to encourage -efficient code: The most readable code in Nimrod is often the most efficient -too. - -However, sometimes one has to optimize. Do it in the following order: - -1. switch off the embedded debugger (it is **slow**!) -2. turn on the optimizer and turn off runtime checks -3. profile your code to find where the bottlenecks are -4. try to find a better algorithm -5. do low-level optimizations - -This section can only help you with the last item. - - -Optimizing string handling --------------------------- - -String assignments are sometimes expensive in Nimrod: They are required to -copy the whole string. However, the compiler is often smart enough to not copy -strings. Due to the argument passing semantics, strings are never copied when -passed to subroutines. The compiler does not copy strings that are a result from -a procedure call, because the callee returns a new string anyway. -Thus it is efficient to do: - -.. code-block:: Nimrod - var s = procA() # assignment will not copy the string; procA allocates a new - # string already - -However it is not efficient to do: - -.. 
code-block:: Nimrod - var s = varA # assignment has to copy the whole string into a new buffer! - -.. - String case statements are optimized too. A hashing scheme is used for them - if several different string constants are used. This is likely to be more - efficient than any hand-coded scheme. - - -.. - The ECMAScript code generator - ============================= - - Note: As of version 0.7.0 the ECMAScript code generator is not maintained any - longer. Help if you are interested. - - Note: I use the term `ECMAScript`:idx: here instead of `JavaScript`:idx:, - since it is the proper term. - - The ECMAScript code generator is experimental! - - Nimrod targets ECMAScript 1.5 which is supported by any widely used browser. - Since ECMAScript does not have a portable means to include another module, - Nimrod just generates a long ``.js`` file. - - Features or modules that the ECMAScript platform does not support are not - available. This includes: - - * manual memory management (``alloc``, etc.) - * casting and other unsafe operations (``cast`` operator, ``zeroMem``, etc.) - * file management - * most modules of the Standard library - * proper 64 bit integer arithmetic - * proper unsigned integer arithmetic - - However, the modules `strutils`:idx:, `math`:idx:, and `times`:idx: are - available! To access the DOM, use the `dom`:idx: module that is only - available for the ECMAScript platform. +=================================== + Nimrod Compiler User Guide +=================================== + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + + "Look at you, hacker. A pathetic creature of meat and bone, panting and + sweating as you run through my corridors. How can you challenge a perfect, + immortal machine?" + + +Introduction +============ + +This document describes the usage of the *Nimrod compiler* +on the different supported platforms. It is not a definition of the Nimrod +programming language (therefore is the `manual <manual>`_). 
+ +Nimrod is free software; it is licensed under the +`GNU General Public License <gpl.html>`_. + + +Compiler Usage +============== + +Command line switches +--------------------- +Basis command line switches are: + +.. include:: ../data/basicopt.txt + +Advanced command line switches are: + +.. include:: ../data/advopt.txt + + +Configuration file +------------------ +The default configuration file is ``nimrod.cfg``. The ``nimrod`` executable +looks for it in the following directories (in this order): + +1. ``/home/$user/.config/nimrod.cfg`` (UNIX) or ``%APPDATA%/nimrod.cfg`` (Windows) +2. ``$nimrod/config/nimrod.cfg`` (UNIX), ``%NIMROD%/config/nimrod.cfg`` (Windows) +3. ``/etc/nimrod.cfg`` (UNIX) + +The search stops as soon as a configuration file has been found. The reading +of ``nimrod.cfg`` can be suppressed by the ``--skipCfg`` command line option. + +**Note:** The *project file name* is the name of the ``.nim`` file that is +passed as a command line argument to the compiler. + +Configuration settings can be overwritten individually in a project specific +configuration file that is read automatically. This specific file has to +be in the same directory as the project and be of the same name, except +that its extension should be ``.cfg``. + +Command line settings have priority over configuration file settings. + + +Generated C code directory +-------------------------- +The generated files that Nimrod produces all go into a subdirectory called +``nimcache`` in your project directory. This makes it easy to delete all +generated files. + +However, the generated C code is not platform independant. C code generated for +Linux does not compile on Windows, for instance. The comment on top of the +C file lists the OS, CPU and CC the file has been compiled for. + + +Additional Features +=================== + +This section describes Nimrod's additional features that are not listed in the +Nimrod manual. 
Some of the features here only make sense for the C code +generator and are subject to change. + + +NoDecl pragma +------------- +The `noDecl`:idx: pragma can be applied to almost any symbol (variable, proc, +type, etc.) and is sometimes useful for interoperability with C: +It tells Nimrod that it should not generate a declaration for the symbol in +the C code. For example: + +.. code-block:: Nimrod + var + EACCES {.importc, noDecl.}: cint # pretend EACCES was a variable, as + # Nimrod does not know its value + +However, the ``header`` pragma is often the better alternative. + +**Note**: This will not work for the LLVM backend. + + +Header pragma +------------- +The `header`:idx: pragma is very similar to the ``noDecl`` pragma: It can be +applied to almost any symbol and specifies that it should not be declared +and instead the generated code should contain an ``#include``: + +.. code-block:: Nimrod + type + PFile {.importc: "FILE*", header: "<stdio.h>".} = distinct pointer + # import C's FILE* type; Nimrod will treat it as a new pointer type + +The ``header`` pragma always expects a string constant. The string contant +contains the header file: As usual for C, a system header file is enclosed +in angle brackets: ``<>``. If no angle brackets are given, Nimrod +encloses the header file in ``""`` in the generated C code. + +**Note**: This will not work for the LLVM backend. + + +LineDir option +-------------- +The `lineDir`:idx: option can be turned on or off. If turned on the +generated C code contains ``#line`` directives. This may be helpful for +debugging with GDB. + + +StackTrace option +----------------- +If the `stackTrace`:idx: option is turned on, the generated C contains code to +ensure that proper stack traces are given if the program crashes or an +uncaught exception is raised. + + +LineTrace option +---------------- +The `lineTrace`:idx: option implies the ``stackTrace`` option. 
If turned on, +the generated C contains code to ensure that proper stack traces with line +number information are given if the program crashes or an uncaught exception +is raised. + +Debugger option +--------------- +The `debugger`:idx: option enables or disables the *Embedded Nimrod Debugger*. +See the documentation of endb_ for further information. + + +Breakpoint pragma +----------------- +The *breakpoint* pragma was specially added for the sake of debugging with +ENDB. See the documentation of `endb <endb.html>`_ for further information. + + +Volatile pragma +--------------- +The `volatile`:idx: pragma is for variables only. It declares the variable as +``volatile``, whatever that means in C/C++. + +**Note**: This pragma will not exist for the LLVM backend. + + +Debugging with Nimrod +===================== + +Nimrod comes with its own *Embedded Nimrod Debugger*. See +the documentation of endb_ for further information. + + +Optimizing for Nimrod +===================== + +Nimrod has no separate optimizer, but the C code that is produced is very +efficient. Most C compilers have excellent optimizers, so usually it is +not needed to optimize one's code. Nimrod has been designed to encourage +efficient code: The most readable code in Nimrod is often the most efficient +too. + +However, sometimes one has to optimize. Do it in the following order: + +1. switch off the embedded debugger (it is **slow**!) +2. turn on the optimizer and turn off runtime checks +3. profile your code to find where the bottlenecks are +4. try to find a better algorithm +5. do low-level optimizations + +This section can only help you with the last item. + + +Optimizing string handling +-------------------------- + +String assignments are sometimes expensive in Nimrod: They are required to +copy the whole string. However, the compiler is often smart enough to not copy +strings. Due to the argument passing semantics, strings are never copied when +passed to subroutines. 
The compiler does not copy strings that are a result from +a procedure call, because the callee returns a new string anyway. +Thus it is efficient to do: + +.. code-block:: Nimrod + var s = procA() # assignment will not copy the string; procA allocates a new + # string already + +However it is not efficient to do: + +.. code-block:: Nimrod + var s = varA # assignment has to copy the whole string into a new buffer! + +.. + String case statements are optimized too. A hashing scheme is used for them + if several different string constants are used. This is likely to be more + efficient than any hand-coded scheme. + + +.. + The ECMAScript code generator + ============================= + + Note: As of version 0.7.0 the ECMAScript code generator is not maintained any + longer. Help if you are interested. + + Note: I use the term `ECMAScript`:idx: here instead of `JavaScript`:idx:, + since it is the proper term. + + The ECMAScript code generator is experimental! + + Nimrod targets ECMAScript 1.5 which is supported by any widely used browser. + Since ECMAScript does not have a portable means to include another module, + Nimrod just generates a long ``.js`` file. + + Features or modules that the ECMAScript platform does not support are not + available. This includes: + + * manual memory management (``alloc``, etc.) + * casting and other unsafe operations (``cast`` operator, ``zeroMem``, etc.) + * file management + * most modules of the Standard library + * proper 64 bit integer arithmetic + * proper unsigned integer arithmetic + + However, the modules `strutils`:idx:, `math`:idx:, and `times`:idx: are + available! To access the DOM, use the `dom`:idx: module that is only + available for the ECMAScript platform. diff --git a/doc/rst.txt b/doc/rst.txt index 79d0eb9c4..4199598d1 100755 --- a/doc/rst.txt +++ b/doc/rst.txt @@ -5,16 +5,16 @@ :Author: Andreas Rumpf :Version: |nimrodversion| -.. contents:: +.. 
contents:: Introduction ============ -This document describes the subset of `Docutils`_' `reStructuredText`_ as it +This document describes the subset of `Docutils`_' `reStructuredText`_ as it has been implemented in the Nimrod compiler for generating documentation. -Elements of |rst| that are not listed here have not been implemented. +Elements of |rst| that are not listed here have not been implemented. Unfortunately, the specification of |rst| is quite vague, so Nimrod is not as -compatible to the original implementation as one would like. +compatible to the original implementation as one would like. Even though Nimrod's |rst| parser does not parse all constructs, it is pretty usable. The missing features can easily be circumvented. An indication of this @@ -26,13 +26,13 @@ Docutils' parser.) Inline elements =============== -Ordinary text may contain *inline elements*. +Ordinary text may contain *inline elements*. Bullet lists ============ -*Bullet lists* look like this:: +*Bullet lists* look like this:: * Item 1 * Item 2 that @@ -60,8 +60,8 @@ Enumerated lists *Enumerated lists* -Defintion lists -=============== +Definition lists +================ Save this code to the file "greeting.nim". Now compile and run it: @@ -77,14 +77,14 @@ appending them after the filename that is to be compiled and run: Tables ====== -Nimrod only implements simple tables of the form:: +Nimrod only implements simple tables of the form:: ================== =============== =================== header 1 header 2 header n ================== =============== =================== Cell 1 Cell 2 Cell 3 Cell 4 Cell 5; any Cell 6 - cell that is + cell that is not in column 1 may span over multiple lines @@ -97,7 +97,7 @@ header 1 header 2 header n ================== =============== =================== Cell 1 Cell 2 Cell 3 Cell 4 Cell 5; any Cell 6 - cell that is + cell that is not in column 1 may span over multiple lines |