diff options
author | Zahary Karadjov <zahary@gmail.com> | 2012-09-27 22:28:51 +0300 |
---|---|---|
committer | Zahary Karadjov <zahary@gmail.com> | 2012-09-27 22:44:52 +0300 |
commit | 04c5f58ce4a5c41e1b3ec758479bd58c6b2f33ae (patch) | |
tree | 577472a20082595ceacf7b6b63cfad79775ccef8 /doc | |
parent | 5538727945d634272f9de03b626bc2ff4e6df873 (diff) | |
download | Nim-04c5f58ce4a5c41e1b3ec758479bd58c6b2f33ae.tar.gz |
normalized the line endings in the manual
Diffstat (limited to 'doc')
-rwxr-xr-x | doc/manual.txt | 7370 |
1 files changed, 3685 insertions, 3685 deletions
diff --git a/doc/manual.txt b/doc/manual.txt index 028ccb671..e9eb3871a 100755 --- a/doc/manual.txt +++ b/doc/manual.txt @@ -1,140 +1,140 @@ -============= -Nimrod Manual -============= - -:Author: Andreas Rumpf -:Version: |nimrodversion| - -.. contents:: - - - "Complexity" seems to be a lot like "energy": you can transfer it from the end - user to one/some of the other players, but the total amount seems to remain - pretty much constant for a given task. -- Ran - -About this document -=================== - -**Note**: This document is a draft! Several of Nimrod's features need more -precise wording. This manual will evolve into a proper specification some -day. - -This document describes the lexis, the syntax, and the semantics of Nimrod. - -The language constructs are explained using an extended BNF, in -which ``(a)*`` means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and -``(a)?`` means an optional *a*; an alternative spelling for optional parts is -``[a]``. The ``|`` symbol is used to mark alternatives -and has the lowest precedence. Parentheses may be used to group elements. -Non-terminals start with a lowercase letter, abstract terminal symbols are in -UPPERCASE. Verbatim terminal symbols (including keywords) are quoted -with ``'``. An example:: - - ifStmt ::= 'if' expr ':' stmts ('elif' expr ':' stmts)* ['else' stmts] - -Other parts of Nimrod - like scoping rules or runtime semantics are only -described in an informal manner for now. - - -Definitions -=========== - -A Nimrod program specifies a computation that acts on a memory consisting of -components called `locations`:idx:. A variable is basically a name for a -location. Each variable and location is of a certain `type`:idx:. The -variable's type is called `static type`:idx:, the location's type is called -`dynamic type`:idx:. If the static type is not the same as the dynamic type, -it is a super-type or subtype of the dynamic type. - -An `identifier`:idx: is a symbol declared as a name for a variable, type, -procedure, etc. The region of the program over which a declaration applies is -called the `scope`:idx: of the declaration. Scopes can be nested. The meaning -of an identifier is determined by the smallest enclosing scope in which the -identifier is declared. - -An expression specifies a computation that produces a value or location. -Expressions that produce locations are called `l-values`:idx:. An l-value -can denote either a location or the value the location contains, depending on -the context. Expressions whose values can be determined statically are called -`constant expressions`:idx:; they are never l-values. - -A `static error`:idx: is an error that the implementation detects before -program execution. Unless explicitly classified, an error is a static error. - -A `checked runtime error`:idx: is an error that the implementation detects -and reports at runtime. The method for reporting such errors is via *raising -exceptions*. However, the implementation provides a means to disable these -runtime checks. See the section pragmas_ for details. - -An `unchecked runtime error`:idx: is an error that is not guaranteed to be -detected, and can cause the subsequent behavior of the computation to -be arbitrary. Unchecked runtime errors cannot occur if only `safe`:idx: -language features are used. - - -Lexical Analysis -================ - -Encoding --------- - -All Nimrod source files are in the UTF-8 encoding (or its ASCII subset). Other -encodings are not supported. Any of the standard platform line termination -sequences can be used - the Unix form using ASCII LF (linefeed), the Windows -form using the ASCII sequence CR LF (return followed by linefeed), or the old -Macintosh form using the ASCII CR (return) character. All of these forms can be -used equally, regardless of platform. - - -Indentation ------------ - -Nimrod's standard grammar describes an `indentation sensitive`:idx: language. -This means that all the control structures are recognized by indentation. -Indentation consists only of spaces; tabulators are not allowed. - -The terminals ``IND`` (indentation), ``DED`` (dedentation) and ``SAD`` -(same indentation) are generated by the scanner, denoting an indentation. - -These terminals are only generated for lines that are not empty. - -The parser and the scanner communicate over a stack which indentation terminal -should be generated: the stack consists of integers counting the spaces. The -stack is initialized with a zero on its top. The scanner reads from the stack: -If the current indentation token consists of more spaces than the entry at the -top of the stack, a ``IND`` token is generated, else if it consists of the same -number of spaces, a ``SAD`` token is generated. If it consists of fewer spaces, -a ``DED`` token is generated for any item on the stack that is greater than the -current. These items are later popped from the stack by the parser. At the end -of the file, a ``DED`` token is generated for each number remaining on the -stack that is larger than zero. - -Because the grammar contains some optional ``IND`` tokens, the scanner cannot -push new indentation levels. This has to be done by the parser. The symbol -``indPush`` indicates that an ``IND`` token is expected; the current number of -leading spaces is pushed onto the stack by the parser. The symbol ``indPop`` -denotes that the parser pops an item from the indentation stack. No token is -consumed by ``indPop``. - - -Comments --------- - -`Comments`:idx: start anywhere outside a string or character literal with the -hash character ``#``. -Comments consist of a concatenation of `comment pieces`:idx:. A comment piece -starts with ``#`` and runs until the end of the line. The end of line characters -belong to the piece. If the next line only consists of a comment piece which is -aligned to the preceding one, it does not start a new comment: - -.. code-block:: nimrod - - i = 0 # This is a single comment over multiple lines belonging to the - # assignment statement. The scanner merges these two pieces. - # This is a new comment belonging to the current block, but to no particular - # statement. - i = i + 1 # This a new comment that is NOT - echo(i) # continued here, because this comment refers to the echo statement +============= +Nimrod Manual +============= + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + + + "Complexity" seems to be a lot like "energy": you can transfer it from the end + user to one/some of the other players, but the total amount seems to remain + pretty much constant for a given task. -- Ran + +About this document +=================== + +**Note**: This document is a draft! Several of Nimrod's features need more +precise wording. This manual will evolve into a proper specification some +day. + +This document describes the lexis, the syntax, and the semantics of Nimrod. + +The language constructs are explained using an extended BNF, in +which ``(a)*`` means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and +``(a)?`` means an optional *a*; an alternative spelling for optional parts is +``[a]``. The ``|`` symbol is used to mark alternatives +and has the lowest precedence. Parentheses may be used to group elements. +Non-terminals start with a lowercase letter, abstract terminal symbols are in +UPPERCASE. Verbatim terminal symbols (including keywords) are quoted +with ``'``. An example:: + + ifStmt ::= 'if' expr ':' stmts ('elif' expr ':' stmts)* ['else' stmts] + +Other parts of Nimrod - like scoping rules or runtime semantics are only +described in an informal manner for now. + + +Definitions +=========== + +A Nimrod program specifies a computation that acts on a memory consisting of +components called `locations`:idx:. A variable is basically a name for a +location. Each variable and location is of a certain `type`:idx:. The +variable's type is called `static type`:idx:, the location's type is called +`dynamic type`:idx:. If the static type is not the same as the dynamic type, +it is a super-type or subtype of the dynamic type. + +An `identifier`:idx: is a symbol declared as a name for a variable, type, +procedure, etc. The region of the program over which a declaration applies is +called the `scope`:idx: of the declaration. Scopes can be nested. The meaning +of an identifier is determined by the smallest enclosing scope in which the +identifier is declared. + +An expression specifies a computation that produces a value or location. +Expressions that produce locations are called `l-values`:idx:. An l-value +can denote either a location or the value the location contains, depending on +the context. Expressions whose values can be determined statically are called +`constant expressions`:idx:; they are never l-values. + +A `static error`:idx: is an error that the implementation detects before +program execution. Unless explicitly classified, an error is a static error. + +A `checked runtime error`:idx: is an error that the implementation detects +and reports at runtime. The method for reporting such errors is via *raising +exceptions*. However, the implementation provides a means to disable these +runtime checks. See the section pragmas_ for details. + +An `unchecked runtime error`:idx: is an error that is not guaranteed to be +detected, and can cause the subsequent behavior of the computation to +be arbitrary. Unchecked runtime errors cannot occur if only `safe`:idx: +language features are used. + + +Lexical Analysis +================ + +Encoding +-------- + +All Nimrod source files are in the UTF-8 encoding (or its ASCII subset). Other +encodings are not supported. Any of the standard platform line termination +sequences can be used - the Unix form using ASCII LF (linefeed), the Windows +form using the ASCII sequence CR LF (return followed by linefeed), or the old +Macintosh form using the ASCII CR (return) character. All of these forms can be +used equally, regardless of platform. + + +Indentation +----------- + +Nimrod's standard grammar describes an `indentation sensitive`:idx: language. +This means that all the control structures are recognized by indentation. +Indentation consists only of spaces; tabulators are not allowed. + +The terminals ``IND`` (indentation), ``DED`` (dedentation) and ``SAD`` +(same indentation) are generated by the scanner, denoting an indentation. + +These terminals are only generated for lines that are not empty. + +The parser and the scanner communicate over a stack which indentation terminal +should be generated: the stack consists of integers counting the spaces. The +stack is initialized with a zero on its top. The scanner reads from the stack: +If the current indentation token consists of more spaces than the entry at the +top of the stack, a ``IND`` token is generated, else if it consists of the same +number of spaces, a ``SAD`` token is generated. If it consists of fewer spaces, +a ``DED`` token is generated for any item on the stack that is greater than the +current. These items are later popped from the stack by the parser. At the end +of the file, a ``DED`` token is generated for each number remaining on the +stack that is larger than zero. + +Because the grammar contains some optional ``IND`` tokens, the scanner cannot +push new indentation levels. This has to be done by the parser. The symbol +``indPush`` indicates that an ``IND`` token is expected; the current number of +leading spaces is pushed onto the stack by the parser. The symbol ``indPop`` +denotes that the parser pops an item from the indentation stack. No token is +consumed by ``indPop``. + + +Comments +-------- + +`Comments`:idx: start anywhere outside a string or character literal with the +hash character ``#``. +Comments consist of a concatenation of `comment pieces`:idx:. A comment piece +starts with ``#`` and runs until the end of the line. The end of line characters +belong to the piece. If the next line only consists of a comment piece which is +aligned to the preceding one, it does not start a new comment: + +.. code-block:: nimrod + + i = 0 # This is a single comment over multiple lines belonging to the + # assignment statement. The scanner merges these two pieces. + # This is a new comment belonging to the current block, but to no particular + # statement. + i = i + 1 # This a new comment that is NOT + echo(i) # continued here, because this comment refers to the echo statement The alignment requirement does not hold if the preceding comment piece ends in @@ -145,184 +145,184 @@ a backslash (followed by optional whitespace): TMyObject {.final, pure, acyclic.} = object # comment continues: \ # we have lots of space here to comment 'TMyObject'. # This line belongs to the comment as it's properly aligned. - -Comments are tokens; they are only allowed at certain places in the input file -as they belong to the syntax tree! This feature enables perfect source-to-source -transformations (such as pretty-printing) and superior documentation generators. -A nice side-effect is that the human reader of the code always knows exactly -which code snippet the comment refers to. - - -Identifiers & Keywords ----------------------- - -`Identifiers`:idx: in Nimrod can be any string of letters, digits -and underscores, beginning with a letter. Two immediate following -underscores ``__`` are not allowed:: - - letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff' - digit ::= '0'..'9' - IDENTIFIER ::= letter ( ['_'] (letter | digit) )* - + +Comments are tokens; they are only allowed at certain places in the input file +as they belong to the syntax tree! This feature enables perfect source-to-source +transformations (such as pretty-printing) and superior documentation generators. +A nice side-effect is that the human reader of the code always knows exactly +which code snippet the comment refers to. + + +Identifiers & Keywords +---------------------- + +`Identifiers`:idx: in Nimrod can be any string of letters, digits +and underscores, beginning with a letter. Two immediate following +underscores ``__`` are not allowed:: + + letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff' + digit ::= '0'..'9' + IDENTIFIER ::= letter ( ['_'] (letter | digit) )* + Currently any unicode character with an ordinal value > 127 (non ASCII) is classified as a ``letter`` and may thus be part of an identifier but later versions of the language may assign some Unicode characters to belong to the operator characters instead. -The following `keywords`:idx: are reserved and cannot be used as identifiers: - -.. code-block:: nimrod - :file: keywords.txt - -Some keywords are unused; they are reserved for future developments of the -language. - -Nimrod is a `style-insensitive`:idx: language. This means that it is not -case-sensitive and even underscores are ignored: -**type** is a reserved word, and so is **TYPE** or **T_Y_P_E**. The idea behind -this is that this allows programmers to use their own preferred spelling style -and libraries written by different programmers cannot use incompatible -conventions. A Nimrod-aware editor or IDE can show the identifiers as -preferred. Another advantage is that it frees the programmer from remembering -the exact spelling of an identifier. - - -String literals ---------------- - -Terminal symbol in the grammar: ``STR_LIT``. - -`String literals`:idx: can be delimited by matching double quotes, and can -contain the following `escape sequences`:idx:\ : - -================== =================================================== - Escape sequence Meaning -================== =================================================== - ``\n`` `newline`:idx: - ``\r``, ``\c`` `carriage return`:idx: - ``\l`` `line feed`:idx: - ``\f`` `form feed`:idx: - ``\t`` `tabulator`:idx: - ``\v`` `vertical tabulator`:idx: - ``\\`` `backslash`:idx: - ``\"`` `quotation mark`:idx: - ``\'`` `apostrophe`:idx: - ``\`` '0'..'9'+ `character with decimal value d`:idx:; - all decimal digits directly - following are used for the character - ``\a`` `alert`:idx: - ``\b`` `backspace`:idx: - ``\e`` `escape`:idx: `[ESC]`:idx: - ``\x`` HH `character with hex value HH`:idx:; - exactly two hex digits are allowed -================== =================================================== - - -Strings in Nimrod may contain any 8-bit value, even embedded zeros. However -some operations may interpret the first binary zero as a terminator. - - -Triple quoted string literals ------------------------------ - -Terminal symbol in the grammar: ``TRIPLESTR_LIT``. - -String literals can also be delimited by three double quotes -``"""`` ... ``"""``. -Literals in this form may run for several lines, may contain ``"`` and do not -interpret any escape sequences. -For convenience, when the opening ``"""`` is immediately followed by a newline, -the newline is not included in the string. The ending of the string literal is -defined by the pattern ``"""[^"]``, so this: - -.. code-block:: nimrod - """"long string within quotes"""" - -Produces:: - - "long string within quotes" - - -Raw string literals -------------------- - -Terminal symbol in the grammar: ``RSTR_LIT``. - -There are also `raw string literals`:idx: that are preceded with the -letter ``r`` (or ``R``) and are delimited by matching double quotes (just -like ordinary string literals) and do not interpret the escape sequences. -This is especially convenient for regular expressions or Windows paths: - -.. code-block:: nimrod - - var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab - -To produce a single ``"`` within a raw string literal, it has to be doubled: - -.. code-block:: nimrod - - r"a""b" - -Produces:: - - a"b - -``r""""`` is not possible with this notation, because the three leading -quotes introduce a triple quoted string literal. ``r"""`` is the same -as ``"""`` since triple quoted string literals do not interpret escape -sequences either. - - -Generalized raw string literals -------------------------------- - -Terminal symbols in the grammar: ``GENERALIZED_STR_LIT``, -``GENERALIZED_TRIPLESTR_LIT``. - -The construct ``identifier"string literal"`` (without whitespace between the -identifier and the opening quotation mark) is a -`generalized raw string literal`:idx:. It is a shortcut for the construct -``identifier(r"string literal")``, so it denotes a procedure call with a -raw string literal as its only argument. Generalized raw string literals -are especially convenient for embedding mini languages directly into Nimrod -(for example regular expressions). - -The construct ``identifier"""string literal"""`` exists too. It is a shortcut -for ``identifier("""string literal""")``. - - -Character literals ------------------- - -Character literals are enclosed in single quotes ``''`` and can contain the -same escape sequences as strings - with one exception: ``\n`` is not allowed -as it may be wider than one character (often it is the pair CR/LF for example). -A character is not an Unicode character but a single byte. The reason for this -is efficiency: for the overwhelming majority of use-cases, the resulting -programs will still handle UTF-8 properly as UTF-8 was specially designed for -this. -Another reason is that Nimrod can thus support ``array[char, int]`` or -``set[char]`` efficiently as many algorithms rely on this feature. - - -Numerical constants -------------------- - -`Numerical constants`:idx: are of a single type and have the form:: - - hexdigit ::= digit | 'A'..'F' | 'a'..'f' - octdigit ::= '0'..'7' - bindigit ::= '0'..'1' +The following `keywords`:idx: are reserved and cannot be used as identifiers: + +.. code-block:: nimrod + :file: keywords.txt + +Some keywords are unused; they are reserved for future developments of the +language. + +Nimrod is a `style-insensitive`:idx: language. This means that it is not +case-sensitive and even underscores are ignored: +**type** is a reserved word, and so is **TYPE** or **T_Y_P_E**. The idea behind +this is that this allows programmers to use their own preferred spelling style +and libraries written by different programmers cannot use incompatible +conventions. A Nimrod-aware editor or IDE can show the identifiers as +preferred. Another advantage is that it frees the programmer from remembering +the exact spelling of an identifier. + + +String literals +--------------- + +Terminal symbol in the grammar: ``STR_LIT``. + +`String literals`:idx: can be delimited by matching double quotes, and can +contain the following `escape sequences`:idx:\ : + +================== =================================================== + Escape sequence Meaning +================== =================================================== + ``\n`` `newline`:idx: + ``\r``, ``\c`` `carriage return`:idx: + ``\l`` `line feed`:idx: + ``\f`` `form feed`:idx: + ``\t`` `tabulator`:idx: + ``\v`` `vertical tabulator`:idx: + ``\\`` `backslash`:idx: + ``\"`` `quotation mark`:idx: + ``\'`` `apostrophe`:idx: + ``\`` '0'..'9'+ `character with decimal value d`:idx:; + all decimal digits directly + following are used for the character + ``\a`` `alert`:idx: + ``\b`` `backspace`:idx: + ``\e`` `escape`:idx: `[ESC]`:idx: + ``\x`` HH `character with hex value HH`:idx:; + exactly two hex digits are allowed +================== =================================================== + + +Strings in Nimrod may contain any 8-bit value, even embedded zeros. However +some operations may interpret the first binary zero as a terminator. + + +Triple quoted string literals +----------------------------- + +Terminal symbol in the grammar: ``TRIPLESTR_LIT``. + +String literals can also be delimited by three double quotes +``"""`` ... ``"""``. +Literals in this form may run for several lines, may contain ``"`` and do not +interpret any escape sequences. +For convenience, when the opening ``"""`` is immediately followed by a newline, +the newline is not included in the string. The ending of the string literal is +defined by the pattern ``"""[^"]``, so this: + +.. code-block:: nimrod + """"long string within quotes"""" + +Produces:: + + "long string within quotes" + + +Raw string literals +------------------- + +Terminal symbol in the grammar: ``RSTR_LIT``. + +There are also `raw string literals`:idx: that are preceded with the +letter ``r`` (or ``R``) and are delimited by matching double quotes (just +like ordinary string literals) and do not interpret the escape sequences. +This is especially convenient for regular expressions or Windows paths: + +.. code-block:: nimrod + + var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab + +To produce a single ``"`` within a raw string literal, it has to be doubled: + +.. code-block:: nimrod + + r"a""b" + +Produces:: + + a"b + +``r""""`` is not possible with this notation, because the three leading +quotes introduce a triple quoted string literal. ``r"""`` is the same +as ``"""`` since triple quoted string literals do not interpret escape +sequences either. + + +Generalized raw string literals +------------------------------- + +Terminal symbols in the grammar: ``GENERALIZED_STR_LIT``, +``GENERALIZED_TRIPLESTR_LIT``. + +The construct ``identifier"string literal"`` (without whitespace between the +identifier and the opening quotation mark) is a +`generalized raw string literal`:idx:. It is a shortcut for the construct +``identifier(r"string literal")``, so it denotes a procedure call with a +raw string literal as its only argument. Generalized raw string literals +are especially convenient for embedding mini languages directly into Nimrod +(for example regular expressions). + +The construct ``identifier"""string literal"""`` exists too. It is a shortcut +for ``identifier("""string literal""")``. + + +Character literals +------------------ + +Character literals are enclosed in single quotes ``''`` and can contain the +same escape sequences as strings - with one exception: ``\n`` is not allowed +as it may be wider than one character (often it is the pair CR/LF for example). +A character is not an Unicode character but a single byte. The reason for this +is efficiency: for the overwhelming majority of use-cases, the resulting +programs will still handle UTF-8 properly as UTF-8 was specially designed for +this. +Another reason is that Nimrod can thus support ``array[char, int]`` or +``set[char]`` efficiently as many algorithms rely on this feature. + + +Numerical constants +------------------- + +`Numerical constants`:idx: are of a single type and have the form:: + + hexdigit ::= digit | 'A'..'F' | 'a'..'f' + octdigit ::= '0'..'7' + bindigit ::= '0'..'1' HEX_LIT ::= '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )* DEC_LIT ::= digit ( ['_'] digit )* OCT_LIT ::= '0o' octdigit ( ['_'] octdigit )* BIN_LIT ::= '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )* - INT_LIT ::= HEX_LIT - | DEC_LIT - | OCT_LIT + INT_LIT ::= HEX_LIT + | DEC_LIT + | OCT_LIT | BIN_LIT - + INT8_LIT ::= INT_LIT ['\''] ('i' | 'I') '8' INT16_LIT ::= INT_LIT ['\''] ('i' | 'I') '16' INT32_LIT ::= INT_LIT ['\''] ('i' | 'I') '32' @@ -333,219 +333,219 @@ Numerical constants UINT16_LIT ::= INT_LIT ['\''] ('u' | 'U') '16' UINT32_LIT ::= INT_LIT ['\''] ('u' | 'U') '32' UINT64_LIT ::= INT_LIT ['\''] ('u' | 'U') '64' - - exponent ::= ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )* - FLOAT_LIT ::= digit (['_'] digit)* ('.' (['_'] digit)* [exponent] |exponent) + + exponent ::= ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )* + FLOAT_LIT ::= digit (['_'] digit)* ('.' (['_'] digit)* [exponent] |exponent) FLOAT32_LIT ::= HEX_LIT '\'' ('f'|'F') '32' | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] ('f'|'F') '32' FLOAT64_LIT ::= HEX_LIT '\'' ('f'|'F') '64' | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] ('f'|'F') '64' - - -As can be seen in the productions, numerical constants can contain underscores -for readability. Integer and floating point literals may be given in decimal (no -prefix), binary (prefix ``0b``), octal (prefix ``0o``) and hexadecimal -(prefix ``0x``) notation. - -There exists a literal for each numerical type that is -defined. The suffix starting with an apostrophe ('\'') is called a -`type suffix`:idx:. Literals without a type suffix are of the type ``int``, -unless the literal contains a dot or ``E|e`` in which case it is of + + +As can be seen in the productions, numerical constants can contain underscores +for readability. Integer and floating point literals may be given in decimal (no +prefix), binary (prefix ``0b``), octal (prefix ``0o``) and hexadecimal +(prefix ``0x``) notation. + +There exists a literal for each numerical type that is +defined. The suffix starting with an apostrophe ('\'') is called a +`type suffix`:idx:. Literals without a type suffix are of the type ``int``, +unless the literal contains a dot or ``E|e`` in which case it is of type ``float``. For notational convenience the apostrophe of a type suffix is optional if it is not ambiguous (only hexadecimal floating point literals with a type suffix can be ambiguous). - - -The type suffixes are: - -================= ========================= - Type Suffix Resulting type of literal -================= ========================= - ``'i8`` int8 - ``'i16`` int16 - ``'i32`` int32 + + +The type suffixes are: + +================= ========================= + Type Suffix Resulting type of literal +================= ========================= + ``'i8`` int8 + ``'i16`` int16 + ``'i32`` int32 ``'i64`` int64 - ``'u`` uint - ``'u8`` uint8 - ``'u16`` uint16 - ``'u32`` uint32 - ``'u64`` uint64 - ``'f32`` float32 - ``'f64`` float64 -================= ========================= - -Floating point literals may also be in binary, octal or hexadecimal -notation: -``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64`` -is approximately 1.72826e35 according to the IEEE floating point standard. - - -Operators ---------- - -In Nimrod one can define his own operators. An `operator`:idx: is any -combination of the following characters:: - - = + - * / < > - @ $ ~ & % | - ! ? ^ . : \ - -These keywords are also operators: -``and or not xor shl shr div mod in notin is isnot of``. - -`=`:tok:, `:`:tok:, `::`:tok: are not available as general operators; they -are used for other notational purposes. - -``*:`` is as a special case the two tokens `*`:tok: and `:`:tok: -(to support ``var v*: T``). - - -Other tokens ------------- - -The following strings denote other tokens:: - - ` ( ) { } [ ] , ; [. .] {. .} (. .) - - -The `slice`:idx: operator `..`:tok: takes precedence over other tokens that -contain a dot: `{..}`:tok: are the three tokens `{`:tok:, `..`:tok:, `}`:tok: -and not the two tokens `{.`:tok:, `.}`:tok:. - - -Syntax -====== - -This section lists Nimrod's standard syntax in ENBF. How the parser receives -indentation tokens is already described in the `Lexical Analysis`_ section. - -Nimrod allows user-definable operators. -Binary operators have 10 different levels of precedence. - -Relevant character ------------------- - -An operator symbol's *relevant character* is its first -character unless the first character is ``\`` and its length is greater than 1 -then it is the second character. - -This rule allows to escape operator symbols with ``\`` and keeps the operator's -precedence and associativity; this is useful for meta programming. - - -Associativity -------------- - -All binary operators are left-associative, except binary operators whose -relevant char is ``^``. - -Precedence ----------- - -For operators that are not keywords the precedence is determined by the -following rules: - -If the operator ends with ``=`` and its relevant character is none of -``<``, ``>``, ``!``, ``=``, ``~``, ``?``, it is an *assignment operator* which -has the lowest precedence. - -If the operator's relevant character is ``@`` it is a `sigil-like`:idx: -operator which binds stronger than a ``primarySuffix``: ``@x.abc`` is parsed -as ``(@x).abc`` whereas ``$x.abc`` is parsed as ``$(x.abc)``. - - -Otherwise precedence is determined by the relevant character. - -================ =============================================== ================== =============== -Precedence level Operators Relevant character Terminal symbol -================ =============================================== ================== =============== - 9 (highest) ``$ ^`` OP9 - 8 ``* / div mod shl shr %`` ``* % \ /`` OP8 - 7 ``+ -`` ``+ ~ |`` OP7 - 6 ``&`` ``&`` OP6 - 5 ``..`` ``.`` OP5 - 4 ``== <= < >= > != in not_in is isnot not of`` ``= < > !`` OP4 - 3 ``and`` OP3 - 2 ``or xor`` OP2 - 1 ``@ : ?`` OP1 - 0 (lowest) *assignment operator* (like ``+=``, ``*=``) OP0 -================ =============================================== ================== =============== - - -The grammar's start symbol is ``module``. - -.. include:: grammar.txt - :literal: - - - -Semantics -========= - -Types ------ - -All expressions have a `type`:idx: which is known at compile time. Nimrod -is statically typed. One can declare new types, which is in essence defining -an identifier that can be used to denote this custom type. - -These are the major type classes: - -* ordinal types (consist of integer, bool, character, enumeration - (and subranges thereof) types) -* floating point types -* string type -* structured types -* reference (pointer) type -* procedural type -* generic type - - -Ordinal types -~~~~~~~~~~~~~ -`Ordinal types`:idx: have the following characteristics: - -- Ordinal types are countable and ordered. This property allows - the operation of functions as ``Inc``, ``Ord``, ``Dec`` on ordinal types to - be defined. -- Ordinal values have a smallest possible value. Trying to count further - down than the smallest value gives a checked runtime or static error. -- Ordinal values have a largest possible value. Trying to count further - than the largest value gives a checked runtime or static error. - -Integers, bool, characters and enumeration types (and subranges of these + ``'u`` uint + ``'u8`` uint8 + ``'u16`` uint16 + ``'u32`` uint32 + ``'u64`` uint64 + ``'f32`` float32 + ``'f64`` float64 +================= ========================= + +Floating point literals may also be in binary, octal or hexadecimal +notation: +``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64`` +is approximately 1.72826e35 according to the IEEE floating point standard. + + +Operators +--------- + +In Nimrod one can define his own operators. An `operator`:idx: is any +combination of the following characters:: + + = + - * / < > + @ $ ~ & % | + ! ? ^ . : \ + +These keywords are also operators: +``and or not xor shl shr div mod in notin is isnot of``. + +`=`:tok:, `:`:tok:, `::`:tok: are not available as general operators; they +are used for other notational purposes. + +``*:`` is as a special case the two tokens `*`:tok: and `:`:tok: +(to support ``var v*: T``). + + +Other tokens +------------ + +The following strings denote other tokens:: + + ` ( ) { } [ ] , ; [. .] {. .} (. .) + + +The `slice`:idx: operator `..`:tok: takes precedence over other tokens that +contain a dot: `{..}`:tok: are the three tokens `{`:tok:, `..`:tok:, `}`:tok: +and not the two tokens `{.`:tok:, `.}`:tok:. + + +Syntax +====== + +This section lists Nimrod's standard syntax in ENBF. How the parser receives +indentation tokens is already described in the `Lexical Analysis`_ section. + +Nimrod allows user-definable operators. +Binary operators have 10 different levels of precedence. + +Relevant character +------------------ + +An operator symbol's *relevant character* is its first +character unless the first character is ``\`` and its length is greater than 1 +then it is the second character. + +This rule allows to escape operator symbols with ``\`` and keeps the operator's +precedence and associativity; this is useful for meta programming. + + +Associativity +------------- + +All binary operators are left-associative, except binary operators whose +relevant char is ``^``. + +Precedence +---------- + +For operators that are not keywords the precedence is determined by the +following rules: + +If the operator ends with ``=`` and its relevant character is none of +``<``, ``>``, ``!``, ``=``, ``~``, ``?``, it is an *assignment operator* which +has the lowest precedence. + +If the operator's relevant character is ``@`` it is a `sigil-like`:idx: +operator which binds stronger than a ``primarySuffix``: ``@x.abc`` is parsed +as ``(@x).abc`` whereas ``$x.abc`` is parsed as ``$(x.abc)``. + + +Otherwise precedence is determined by the relevant character. + +================ =============================================== ================== =============== +Precedence level Operators Relevant character Terminal symbol +================ =============================================== ================== =============== + 9 (highest) ``$ ^`` OP9 + 8 ``* / div mod shl shr %`` ``* % \ /`` OP8 + 7 ``+ -`` ``+ ~ |`` OP7 + 6 ``&`` ``&`` OP6 + 5 ``..`` ``.`` OP5 + 4 ``== <= < >= > != in not_in is isnot not of`` ``= < > !`` OP4 + 3 ``and`` OP3 + 2 ``or xor`` OP2 + 1 ``@ : ?`` OP1 + 0 (lowest) *assignment operator* (like ``+=``, ``*=``) OP0 +================ =============================================== ================== =============== + + +The grammar's start symbol is ``module``. + +.. include:: grammar.txt + :literal: + + + +Semantics +========= + +Types +----- + +All expressions have a `type`:idx: which is known at compile time. Nimrod +is statically typed. One can declare new types, which is in essence defining +an identifier that can be used to denote this custom type. + +These are the major type classes: + +* ordinal types (consist of integer, bool, character, enumeration + (and subranges thereof) types) +* floating point types +* string type +* structured types +* reference (pointer) type +* procedural type +* generic type + + +Ordinal types +~~~~~~~~~~~~~ +`Ordinal types`:idx: have the following characteristics: + +- Ordinal types are countable and ordered. This property allows + the operation of functions as ``Inc``, ``Ord``, ``Dec`` on ordinal types to + be defined. +- Ordinal values have a smallest possible value. Trying to count further + down than the smallest value gives a checked runtime or static error. +- Ordinal values have a largest possible value. Trying to count further + than the largest value gives a checked runtime or static error. + +Integers, bool, characters and enumeration types (and subranges of these types) belong to ordinal types. For reasons of simplicity of implementation -the types ``uint`` and ``uint64`` are no ordinal types. - - -Pre-defined integer types -~~~~~~~~~~~~~~~~~~~~~~~~~ -These integer types are pre-defined: - -``int`` - the generic signed integer type; its size is platform dependent and has the - same size as a pointer. This type should be used in general. An integer +the types ``uint`` and ``uint64`` are no ordinal types. + + +Pre-defined integer types +~~~~~~~~~~~~~~~~~~~~~~~~~ +These integer types are pre-defined: + +``int`` + the generic signed integer type; its size is platform dependent and has the + same size as a pointer. This type should be used in general. An integer literal that has no type suffix is of this type. - -intXX - additional signed integer types of XX bits use this naming scheme - (example: int16 is a 16 bit wide integer). - The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``. - Literals of these types have the suffix 'iXX. - -``uint`` + +intXX + additional signed integer types of XX bits use this naming scheme + (example: int16 is a 16 bit wide integer). + The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``. + Literals of these types have the suffix 'iXX. + +``uint`` the generic `unsigned integer`:idx: type; its size is platform dependent and has the same size as a pointer. An integer literal with the type - suffix ``'u`` is of this type. + suffix ``'u`` is of this type. -uintXX - additional signed integer types of XX bits use this naming scheme - (example: uint16 is a 16 bit wide unsigned integer). +uintXX + additional signed integer types of XX bits use this naming scheme + (example: uint16 is a 16 bit wide unsigned integer). The current implementation supports ``uint8``, ``uint16``, ``uint32``, ``uint64``. Literals of these types have the suffix 'uXX. Unsigned operations all wrap around; they cannot lead to over- or - underflow errors. - + underflow errors. + In addition to the usual arithmetic operators for signed and unsigned integers (``+ - *`` etc.) there are also operators that formally work on *signed* @@ -554,32 +554,32 @@ for backwards compatibility with older versions of the language that lacked unsigned integer types. These unsigned operations for signed integers use the ``%`` suffix as convention: - -====================== ====================================================== -operation meaning -====================== ====================================================== -``a +% b`` unsigned integer addition -``a -% b`` unsigned integer subtraction -``a *% b`` unsigned integer multiplication -``a /% b`` unsigned integer division -``a %% b`` unsigned integer modulo operation -``a <% b`` treat ``a`` and ``b`` as unsigned and compare -``a <=% b`` treat ``a`` and ``b`` as unsigned and compare -``ze(a)`` extends the bits of ``a`` with zeros until it has the - width of the ``int`` type -``toU8(a)`` treats ``a`` as unsigned and converts it to an - unsigned integer of 8 bits (but still the - ``int8`` type) -``toU16(a)`` treats ``a`` as unsigned and converts it to an - unsigned integer of 16 bits (but still the - ``int16`` type) -``toU32(a)`` treats ``a`` as unsigned and converts it to an - unsigned integer of 32 bits (but still the - ``int32`` type) -====================== ====================================================== - -`Automatic type conversion`:idx: is performed in expressions where different -kinds of integer types are used: the smaller type is converted to the larger. + +====================== ====================================================== +operation meaning +====================== ====================================================== +``a +% b`` unsigned integer addition +``a -% b`` unsigned integer subtraction +``a *% b`` unsigned integer multiplication +``a /% b`` unsigned integer division +``a %% b`` unsigned integer modulo operation +``a <% b`` treat ``a`` and ``b`` as unsigned and compare +``a <=% b`` treat ``a`` and ``b`` as unsigned and compare +``ze(a)`` extends the bits of ``a`` with zeros until it has the + width of the ``int`` type +``toU8(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 8 bits (but still the + ``int8`` type) +``toU16(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 16 bits (but still the + ``int16`` type) +``toU32(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 32 bits (but still the + ``int32`` type) +====================== ====================================================== + +`Automatic type conversion`:idx: is performed in expressions where different +kinds of integer types are used: the smaller type is converted to the larger. A `narrowing type conversion`:idx: converts a larger to a smaller type (for example ``int32 -> int16``. A `widening type conversion`:idx: converts a @@ -598,26 +598,26 @@ if the literal's value fits this smaller type and such a conversion is less expensive than other implicit conversions, so ``myInt16 + 34`` produces an ``int16`` result. -For further details, see `Convertible relation`_. - - -Subrange types -~~~~~~~~~~~~~~ -A `subrange`:idx: type is a range of values from an ordinal type (the base -type). To define a subrange type, one must specify it's limiting values: the -lowest and highest value of the type: - -.. code-block:: nimrod - type - TSubrange = range[0..5] - - -``TSubrange`` is a subrange of an integer which can only hold the values 0 -to 5. Assigning any other value to a variable of type ``TSubrange`` is a -checked runtime error (or static error if it can be statically -determined). Assignments from the base type to one of its subrange types -(and vice versa) are allowed. - +For further details, see `Convertible relation`_. + + +Subrange types +~~~~~~~~~~~~~~ +A `subrange`:idx: type is a range of values from an ordinal type (the base +type). To define a subrange type, one must specify it's limiting values: the +lowest and highest value of the type: + +.. code-block:: nimrod + type + TSubrange = range[0..5] + + +``TSubrange`` is a subrange of an integer which can only hold the values 0 +to 5. Assigning any other value to a variable of type ``TSubrange`` is a +checked runtime error (or static error if it can be statically +determined). Assignments from the base type to one of its subrange types +(and vice versa) are allowed. + A subrange type has the same size as its base type (``int`` in the example). Nimrod requires `interval arithmetic`:idx: for subrange types over a set @@ -639,283 +639,283 @@ This means that the following code is accepted: of 9: echo "C" of 10: echo "D" # note: no ``else`` required as (x and 3) + 7 has the type: range[7..10] - - -Pre-defined floating point types -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The following floating point types are pre-defined: - -``float`` - the generic floating point type; its size is platform dependent - (the compiler chooses the processor's fastest floating point type). - This type should be used in general. - -floatXX - an implementation may define additional floating point types of XX bits using - this naming scheme (example: float64 is a 64 bit wide float). The current - implementation supports ``float32`` and ``float64``. Literals of these types - have the suffix 'fXX. - - -Automatic type conversion in expressions with different kinds -of floating point types is performed: See `Convertible relation`_ for further -details. Arithmetic performed on floating point types follows the IEEE -standard. Integer types are not converted to floating point types automatically -and vice versa. - -The IEEE standard defines five types of floating-point exceptions: - -* Invalid: operations with mathematically invalid operands, - for example 0.0/0.0, sqrt(-1.0), and log(-37.8). -* Division by zero: divisor is zero and dividend is a finite nonzero number, - for example 1.0/0.0. -* Overflow: operation produces a result that exceeds the range of the exponent, - for example MAXDOUBLE+0.0000000000001e308. -* Underflow: operation produces a result that is too small to be represented - as a normal number, for example, MINDOUBLE * MINDOUBLE. -* Inexact: operation produces a result that cannot be represented with infinite - precision, for example, 2.0 / 3.0, log(1.1) and 0.1 in input. - -The IEEE exceptions are either ignored at runtime or mapped to the -Nimrod exceptions: `EFloatInvalidOp`:idx:, `EFloatDivByZero`:idx:, -`EFloatOverflow`:idx:, `EFloatUnderflow`:idx:, and `EFloatInexact`:idx:. -These exceptions inherit from the `EFloatingPoint`:idx: base class. - -Nimrod provides the pragmas `NaNChecks`:idx: and `InfChecks`:idx: to control -whether the IEEE exceptions are ignored or trap a Nimrod exception: - -.. code-block:: nimrod - {.NanChecks: on, InfChecks: on.} - var a = 1.0 - var b = 0.0 - echo b / b # raises EFloatInvalidOp - echo a / b # raises EFloatOverflow - -In the current implementation ``EFloatDivByZero`` and ``EFloatInexact`` are -never raised. ``EFloatOverflow`` is raised instead of ``EFloatDivByZero``. -There is also a `floatChecks`:idx: pragma that is a short-cut for the -combination of ``NaNChecks`` and ``InfChecks`` pragmas. ``floatChecks`` are -turned off as default. - -The only operations that are affected by the ``floatChecks`` pragma are -the ``+``, ``-``, ``*``, ``/`` operators for floating point types. - - -Boolean type -~~~~~~~~~~~~ -The `boolean`:idx: type is named `bool`:idx: in Nimrod and can be one of the two -pre-defined values ``true`` and ``false``. Conditions in while, -if, elif, when statements need to be of type bool. - -This condition holds:: - - ord(false) == 0 and ord(true) == 1 - -The operators ``not, and, or, xor, <, <=, >, >=, !=, ==`` are defined -for the bool type. The ``and`` and ``or`` operators perform short-cut -evaluation. Example: - -.. code-block:: nimrod - - while p != nil and p.name != "xyz": - # p.name is not evaluated if p == nil - p = p.next - - -The size of the bool type is one byte. - - -Character type -~~~~~~~~~~~~~~ -The `character type`:idx: is named ``char`` in Nimrod. Its size is one byte. -Thus it cannot represent an UTF-8 character, but a part of it. -The reason for this is efficiency: for the overwhelming majority of use-cases, -the resulting programs will still handle UTF-8 properly as UTF-8 was specially -designed for this. -Another reason is that Nimrod can support ``array[char, int]`` or -``set[char]`` efficiently as many algorithms rely on this feature. The -`TRune` type is used for Unicode characters, it can represent any Unicode -character. ``TRune`` is declared in the ``unicode`` module. - - - -Enumeration types -~~~~~~~~~~~~~~~~~ -`Enumeration`:idx: types define a new type whose values consist of the ones -specified. The values are ordered. Example: - -.. code-block:: nimrod - - type - TDirection = enum - north, east, south, west - - -Now the following holds:: - - ord(north) == 0 - ord(east) == 1 - ord(south) == 2 - ord(west) == 3 - -Thus, north < east < south < west. The comparison operators can be used -with enumeration types. - -For better interfacing to other programming languages, the fields of enum -types can be assigned an explicit ordinal value. However, the ordinal values -have to be in ascending order. A field whose ordinal value is not -explicitly given is assigned the value of the previous field + 1. - -An explicit ordered enum can have *holes*: - -.. code-block:: nimrod - type - TTokenType = enum - a = 2, b = 4, c = 89 # holes are valid - -However, it is then not an ordinal anymore, so it is not possible to use these -enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ`` -and ``pred`` are not available for them either. - - -The compiler supports the built-in stringify operator ``$`` for enumerations. -The stringify's result can be controlled by explicitely giving the string -values to use: - -.. code-block:: nimrod - - type - TMyEnum = enum - valueA = (0, "my value A"), - valueB = "value B", - valueC = 2, - valueD = (3, "abc") - -As can be seen from the example, it is possible to both specify a field's -ordinal value and its string value by using a tuple. It is also -possible to only specify one of them. + + +Pre-defined floating point types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following floating point types are pre-defined: + +``float`` + the generic floating point type; its size is platform dependent + (the compiler chooses the processor's fastest floating point type). + This type should be used in general. + +floatXX + an implementation may define additional floating point types of XX bits using + this naming scheme (example: float64 is a 64 bit wide float). The current + implementation supports ``float32`` and ``float64``. Literals of these types + have the suffix 'fXX. + + +Automatic type conversion in expressions with different kinds +of floating point types is performed: See `Convertible relation`_ for further +details. Arithmetic performed on floating point types follows the IEEE +standard. Integer types are not converted to floating point types automatically +and vice versa. + +The IEEE standard defines five types of floating-point exceptions: + +* Invalid: operations with mathematically invalid operands, + for example 0.0/0.0, sqrt(-1.0), and log(-37.8). +* Division by zero: divisor is zero and dividend is a finite nonzero number, + for example 1.0/0.0. +* Overflow: operation produces a result that exceeds the range of the exponent, + for example MAXDOUBLE+0.0000000000001e308. +* Underflow: operation produces a result that is too small to be represented + as a normal number, for example, MINDOUBLE * MINDOUBLE. +* Inexact: operation produces a result that cannot be represented with infinite + precision, for example, 2.0 / 3.0, log(1.1) and 0.1 in input. + +The IEEE exceptions are either ignored at runtime or mapped to the +Nimrod exceptions: `EFloatInvalidOp`:idx:, `EFloatDivByZero`:idx:, +`EFloatOverflow`:idx:, `EFloatUnderflow`:idx:, and `EFloatInexact`:idx:. +These exceptions inherit from the `EFloatingPoint`:idx: base class. + +Nimrod provides the pragmas `NaNChecks`:idx: and `InfChecks`:idx: to control +whether the IEEE exceptions are ignored or trap a Nimrod exception: + +.. code-block:: nimrod + {.NanChecks: on, InfChecks: on.} + var a = 1.0 + var b = 0.0 + echo b / b # raises EFloatInvalidOp + echo a / b # raises EFloatOverflow + +In the current implementation ``EFloatDivByZero`` and ``EFloatInexact`` are +never raised. ``EFloatOverflow`` is raised instead of ``EFloatDivByZero``. +There is also a `floatChecks`:idx: pragma that is a short-cut for the +combination of ``NaNChecks`` and ``InfChecks`` pragmas. ``floatChecks`` are +turned off as default. + +The only operations that are affected by the ``floatChecks`` pragma are +the ``+``, ``-``, ``*``, ``/`` operators for floating point types. + + +Boolean type +~~~~~~~~~~~~ +The `boolean`:idx: type is named `bool`:idx: in Nimrod and can be one of the two +pre-defined values ``true`` and ``false``. Conditions in while, +if, elif, when statements need to be of type bool. + +This condition holds:: + + ord(false) == 0 and ord(true) == 1 + +The operators ``not, and, or, xor, <, <=, >, >=, !=, ==`` are defined +for the bool type. The ``and`` and ``or`` operators perform short-cut +evaluation. Example: + +.. code-block:: nimrod + + while p != nil and p.name != "xyz": + # p.name is not evaluated if p == nil + p = p.next + + +The size of the bool type is one byte. + + +Character type +~~~~~~~~~~~~~~ +The `character type`:idx: is named ``char`` in Nimrod. Its size is one byte. +Thus it cannot represent an UTF-8 character, but a part of it. +The reason for this is efficiency: for the overwhelming majority of use-cases, +the resulting programs will still handle UTF-8 properly as UTF-8 was specially +designed for this. +Another reason is that Nimrod can support ``array[char, int]`` or +``set[char]`` efficiently as many algorithms rely on this feature. The +`TRune` type is used for Unicode characters, it can represent any Unicode +character. ``TRune`` is declared in the ``unicode`` module. + + + +Enumeration types +~~~~~~~~~~~~~~~~~ +`Enumeration`:idx: types define a new type whose values consist of the ones +specified. The values are ordered. Example: + +.. code-block:: nimrod + + type + TDirection = enum + north, east, south, west + + +Now the following holds:: + + ord(north) == 0 + ord(east) == 1 + ord(south) == 2 + ord(west) == 3 + +Thus, north < east < south < west. The comparison operators can be used +with enumeration types. + +For better interfacing to other programming languages, the fields of enum +types can be assigned an explicit ordinal value. However, the ordinal values +have to be in ascending order. A field whose ordinal value is not +explicitly given is assigned the value of the previous field + 1. + +An explicit ordered enum can have *holes*: + +.. code-block:: nimrod + type + TTokenType = enum + a = 2, b = 4, c = 89 # holes are valid + +However, it is then not an ordinal anymore, so it is not possible to use these +enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ`` +and ``pred`` are not available for them either. + + +The compiler supports the built-in stringify operator ``$`` for enumerations. +The stringify's result can be controlled by explicitely giving the string +values to use: + +.. code-block:: nimrod + + type + TMyEnum = enum + valueA = (0, "my value A"), + valueB = "value B", + valueC = 2, + valueD = (3, "abc") + +As can be seen from the example, it is possible to both specify a field's +ordinal value and its string value by using a tuple. It is also +possible to only specify one of them. An enum can be marked with the ``pure`` pragma so that it's fields are not added to the current scope, so they always need to be accessed via ``TMyEnum.value``: -.. code-block:: nimrod - - type - TMyEnum {.pure.} = enum +.. code-block:: nimrod + + type + TMyEnum {.pure.} = enum valueA, valueB, valueC, valueD echo valueA # error: Unknown identifier - echo TMyEnum.valueA # works - - -String type -~~~~~~~~~~~ -All string literals are of the type `string`:idx:. A string in Nimrod is very -similar to a sequence of characters. However, strings in Nimrod are both -zero-terminated and have a length field. One can retrieve the length with the -builtin ``len`` procedure; the length never counts the terminating zero. -The assignment operator for strings always copies the string. -The ``&`` operator concatenates strings. - -Strings are compared by their lexicographical order. All comparison operators -are available. Strings can be indexed like arrays (lower bound is 0). Unlike -arrays, they can be used in case statements: - -.. code-block:: nimrod - - case paramStr(i) - of "-v": incl(options, optVerbose) - of "-h", "-?": incl(options, optHelp) - else: write(stdout, "invalid command line option!\n") - -Per convention, all strings are UTF-8 strings, but this is not enforced. For -example, when reading strings from binary files, they are merely a sequence of -bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the -i-th *unichar*. The iterator ``runes`` from the ``unicode`` -module can be used for iteration over all Unicode characters. - - -CString type -~~~~~~~~~~~~ -The `cstring`:idx: type represents a pointer to a zero-terminated char array -compatible to the type ``char*`` in Ansi C. Its primary purpose lies in easy -interfacing with C. The index operation ``s[i]`` means the i-th *char* of -``s``; however no bounds checking for ``cstring`` is performed making the -index operation unsafe. - -A Nimrod ``string`` is implicitely convertible -to ``cstring`` for convenience. If a Nimrod string is passed to a C-style -variadic proc, it is implicitely converted to ``cstring`` too: - -.. code-block:: nimrod - proc printf(formatstr: cstring) {.importc: "printf", varargs, - header: "<stdio.h>".} - - printf("This works %s", "as expected") - -Even though the conversion is implicit, it is not *safe*: The garbage collector -does not consider a ``cstring`` to be a root and may collect the underlying -memory. However in practice this almost never happens as the GC considers -stack roots conservatively. One can use the builtin procs ``GC_ref`` and -``GC_unref`` to keep the string data alive for the rare cases where it does -not work. - - -Structured types -~~~~~~~~~~~~~~~~ -A variable of a `structured type`:idx: can hold multiple values at the same -time. Structured types can be nested to unlimited levels. Arrays, sequences, -tuples, objects and sets belong to the structured types. - -Array and sequence types -~~~~~~~~~~~~~~~~~~~~~~~~ -`Arrays`:idx: are a homogeneous type, meaning that each element in the array -has the same type. Arrays always have a fixed length which is specified at -compile time (except for open arrays). They can be indexed by any ordinal type. -A parameter ``A`` may be an *open array*, in which case it is indexed by -integers from 0 to ``len(A)-1``. An array expression may be constructed by the -array constructor ``[]``. - -`Sequences`:idx: are similar to arrays but of dynamic length which may change -during runtime (like strings). A sequence ``S`` is always indexed by integers -from 0 to ``len(S)-1`` and its bounds are checked. Sequences can be -constructed by the array constructor ``[]`` in conjunction with the array to -sequence operator ``@``. Another way to allocate space for a sequence is to -call the built-in ``newSeq`` procedure. - -A sequence may be passed to a parameter that is of type *open array*. - -Example: - -.. code-block:: nimrod - - type - TIntArray = array[0..5, int] # an array that is indexed with 0..5 - TIntSeq = seq[int] # a sequence of integers - var - x: TIntArray - y: TIntSeq - x = [1, 2, 3, 4, 5, 6] # [] is the array constructor - y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence - -The lower bound of an array or sequence may be received by the built-in proc -``low()``, the higher bound by ``high()``. The length may be -received by ``len()``. ``low()`` for a sequence or an open array always returns -0, as this is the first valid index. -One can append elements to a sequence with the ``add()`` proc or the ``&`` -operator, and remove (and get) the last element of a sequence with the -``pop()`` proc. - -The notation ``x[i]`` can be used to access the i-th element of ``x``. - -Arrays are always bounds checked (at compile-time or at runtime). These -checks can be disabled via pragmas or invoking the compiler with the -``--boundChecks:off`` command line switch. + echo TMyEnum.valueA # works + + +String type +~~~~~~~~~~~ +All string literals are of the type `string`:idx:. A string in Nimrod is very +similar to a sequence of characters. However, strings in Nimrod are both +zero-terminated and have a length field. One can retrieve the length with the +builtin ``len`` procedure; the length never counts the terminating zero. +The assignment operator for strings always copies the string. +The ``&`` operator concatenates strings. + +Strings are compared by their lexicographical order. All comparison operators +are available. Strings can be indexed like arrays (lower bound is 0). Unlike +arrays, they can be used in case statements: + +.. code-block:: nimrod + + case paramStr(i) + of "-v": incl(options, optVerbose) + of "-h", "-?": incl(options, optHelp) + else: write(stdout, "invalid command line option!\n") + +Per convention, all strings are UTF-8 strings, but this is not enforced. For +example, when reading strings from binary files, they are merely a sequence of +bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the +i-th *unichar*. The iterator ``runes`` from the ``unicode`` +module can be used for iteration over all Unicode characters. + + +CString type +~~~~~~~~~~~~ +The `cstring`:idx: type represents a pointer to a zero-terminated char array +compatible to the type ``char*`` in Ansi C. Its primary purpose lies in easy +interfacing with C. The index operation ``s[i]`` means the i-th *char* of +``s``; however no bounds checking for ``cstring`` is performed making the +index operation unsafe. + +A Nimrod ``string`` is implicitely convertible +to ``cstring`` for convenience. If a Nimrod string is passed to a C-style +variadic proc, it is implicitely converted to ``cstring`` too: + +.. code-block:: nimrod + proc printf(formatstr: cstring) {.importc: "printf", varargs, + header: "<stdio.h>".} + + printf("This works %s", "as expected") + +Even though the conversion is implicit, it is not *safe*: The garbage collector +does not consider a ``cstring`` to be a root and may collect the underlying +memory. However in practice this almost never happens as the GC considers +stack roots conservatively. One can use the builtin procs ``GC_ref`` and +``GC_unref`` to keep the string data alive for the rare cases where it does +not work. + + +Structured types +~~~~~~~~~~~~~~~~ +A variable of a `structured type`:idx: can hold multiple values at the same +time. Structured types can be nested to unlimited levels. Arrays, sequences, +tuples, objects and sets belong to the structured types. + +Array and sequence types +~~~~~~~~~~~~~~~~~~~~~~~~ +`Arrays`:idx: are a homogeneous type, meaning that each element in the array +has the same type. Arrays always have a fixed length which is specified at +compile time (except for open arrays). They can be indexed by any ordinal type. +A parameter ``A`` may be an *open array*, in which case it is indexed by +integers from 0 to ``len(A)-1``. An array expression may be constructed by the +array constructor ``[]``. + +`Sequences`:idx: are similar to arrays but of dynamic length which may change +during runtime (like strings). A sequence ``S`` is always indexed by integers +from 0 to ``len(S)-1`` and its bounds are checked. Sequences can be +constructed by the array constructor ``[]`` in conjunction with the array to +sequence operator ``@``. Another way to allocate space for a sequence is to +call the built-in ``newSeq`` procedure. + +A sequence may be passed to a parameter that is of type *open array*. + +Example: + +.. code-block:: nimrod + + type + TIntArray = array[0..5, int] # an array that is indexed with 0..5 + TIntSeq = seq[int] # a sequence of integers + var + x: TIntArray + y: TIntSeq + x = [1, 2, 3, 4, 5, 6] # [] is the array constructor + y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence + +The lower bound of an array or sequence may be received by the built-in proc +``low()``, the higher bound by ``high()``. The length may be +received by ``len()``. ``low()`` for a sequence or an open array always returns +0, as this is the first valid index. +One can append elements to a sequence with the ``add()`` proc or the ``&`` +operator, and remove (and get) the last element of a sequence with the +``pop()`` proc. + +The notation ``x[i]`` can be used to access the i-th element of ``x``. + +Arrays are always bounds checked (at compile-time or at runtime). These +checks can be disabled via pragmas or invoking the compiler with the +``--boundChecks:off`` command line switch. The current implementation does not support nested open arrays. - + Varargs ~~~~~~~ @@ -951,1430 +951,1430 @@ type conversions in this context: In this example ``$`` is applied to any argument that is passed to the parameter ``a``. (Note that ``$`` applied to strings is a nop.) - - -Tuples and object types -~~~~~~~~~~~~~~~~~~~~~~~ -A variable of a `tuple`:idx: or `object`:idx: type is a heterogeneous storage -container. -A tuple or object defines various named *fields* of a type. A tuple also -defines an *order* of the fields. Tuples are meant for heterogeneous storage -types with no overhead and few abstraction possibilities. The constructor ``()`` -can be used to construct tuples. The order of the fields in the constructor -must match the order of the tuple's definition. Different tuple-types are -*equivalent* if they specify the same fields of the same type in the same -order. - -The assignment operator for tuples copies each component. -The default assignment operator for objects copies each component. Overloading + + +Tuples and object types +~~~~~~~~~~~~~~~~~~~~~~~ +A variable of a `tuple`:idx: or `object`:idx: type is a heterogeneous storage +container. +A tuple or object defines various named *fields* of a type. A tuple also +defines an *order* of the fields. Tuples are meant for heterogeneous storage +types with no overhead and few abstraction possibilities. The constructor ``()`` +can be used to construct tuples. The order of the fields in the constructor +must match the order of the tuple's definition. Different tuple-types are +*equivalent* if they specify the same fields of the same type in the same +order. + +The assignment operator for tuples copies each component. +The default assignment operator for objects copies each component. Overloading of the assignment operator for objects is not possible, but this will change -in future versions of the compiler. - -.. code-block:: nimrod - - type - TPerson = tuple[name: string, age: int] # type representing a person - # a person consists of a name - # and an age - var - person: TPerson - person = (name: "Peter", age: 30) - # the same, but less readable: - person = ("Peter", 30) - -The implementation aligns the fields for best access performance. The alignment +in future versions of the compiler. + +.. code-block:: nimrod + + type + TPerson = tuple[name: string, age: int] # type representing a person + # a person consists of a name + # and an age + var + person: TPerson + person = (name: "Peter", age: 30) + # the same, but less readable: + person = ("Peter", 30) + +The implementation aligns the fields for best access performance. The alignment is compatible with the way the C compiler does it. For consistency with ``object`` declarations, tuples in a ``type`` section can also be defined with indentation instead of ``[]``: - -.. code-block:: nimrod - - type - TPerson = tuple # type representing a person + +.. code-block:: nimrod + + type + TPerson = tuple # type representing a person name: string # a person consists of a name - age: natural # and an age - -Objects provide many features that tuples do not. Object provide inheritance -and information hiding. Objects have access to their type at runtime, so that -the ``of`` operator can be used to determine the object's type. - -.. code-block:: nimrod - - type - TPerson {.inheritable.} = object - name*: string # the * means that `name` is accessible from other modules - age: int # no * means that the field is hidden - - TStudent = object of TPerson # a student is a person - id: int # with an id field - - var - student: TStudent - person: TPerson - assert(student of TStudent) # is true - -Object fields that should be visible from outside the defining module, have to -be marked by ``*``. In contrast to tuples, different object types are + age: natural # and an age + +Objects provide many features that tuples do not. Object provide inheritance +and information hiding. Objects have access to their type at runtime, so that +the ``of`` operator can be used to determine the object's type. + +.. code-block:: nimrod + + type + TPerson {.inheritable.} = object + name*: string # the * means that `name` is accessible from other modules + age: int # no * means that the field is hidden + + TStudent = object of TPerson # a student is a person + id: int # with an id field + + var + student: TStudent + person: TPerson + assert(student of TStudent) # is true + +Object fields that should be visible from outside the defining module, have to +be marked by ``*``. In contrast to tuples, different object types are never *equivalent*. Objects that have no ancestor are implicitely ``final`` and thus have no hidden type field. One can use the ``inheritable`` pragma to introduce new object roots apart from ``system.TObject``. - - -Object variants -~~~~~~~~~~~~~~~ -Often an object hierarchy is overkill in certain situations where simple -`variant`:idx: types are needed. - -An example: - -.. code-block:: nimrod - - # This is an example how an abstract syntax tree could be modelled in Nimrod - type - TNodeKind = enum # the different node types - nkInt, # a leaf with an integer value - nkFloat, # a leaf with a float value - nkString, # a leaf with a string value - nkAdd, # an addition - nkSub, # a subtraction - nkIf # an if statement - PNode = ref TNode - TNode = object - case kind: TNodeKind # the ``kind`` field is the discriminator - of nkInt: intVal: int - of nkFloat: floatVal: float - of nkString: strVal: string - of nkAdd, nkSub: - leftOp, rightOp: PNode - of nkIf: - condition, thenPart, elsePart: PNode - - var - n: PNode - new(n) # creates a new node - n.kind = nkFloat - n.floatVal = 0.0 # valid, because ``n.kind==nkFloat``, so that it fits - - # the following statement raises an `EInvalidField` exception, because - # n.kind's value does not fit: - n.strVal = "" - -As can been seen from the example, an advantage to an object hierarchy is that -no casting between different object types is needed. Yet, access to invalid -object fields raises an exception. + + +Object variants +~~~~~~~~~~~~~~~ +Often an object hierarchy is overkill in certain situations where simple +`variant`:idx: types are needed. + +An example: + +.. code-block:: nimrod + + # This is an example how an abstract syntax tree could be modelled in Nimrod + type + TNodeKind = enum # the different node types + nkInt, # a leaf with an integer value + nkFloat, # a leaf with a float value + nkString, # a leaf with a string value + nkAdd, # an addition + nkSub, # a subtraction + nkIf # an if statement + PNode = ref TNode + TNode = object + case kind: TNodeKind # the ``kind`` field is the discriminator + of nkInt: intVal: int + of nkFloat: floatVal: float + of nkString: strVal: string + of nkAdd, nkSub: + leftOp, rightOp: PNode + of nkIf: + condition, thenPart, elsePart: PNode + + var + n: PNode + new(n) # creates a new node + n.kind = nkFloat + n.floatVal = 0.0 # valid, because ``n.kind==nkFloat``, so that it fits + + # the following statement raises an `EInvalidField` exception, because + # n.kind's value does not fit: + n.strVal = "" + +As can been seen from the example, an advantage to an object hierarchy is that +no casting between different object types is needed. Yet, access to invalid +object fields raises an exception. The syntax of ``case`` in an object declaration follows closely the syntax of -the ``case`` statement: The branches in a ``case`` section may be indented too. - - -Set type -~~~~~~~~ -The `set type`:idx: models the mathematical notion of a set. The set's -basetype can only be an ordinal type. The reason is that sets are implemented -as high performance bit vectors. - -Sets can be constructed via the set constructor: ``{}`` is the empty set. The -empty set is type compatible with any special set type. The constructor -can also be used to include elements (and ranges of elements) in the set: - -.. code-block:: nimrod - - {'a'..'z', '0'..'9'} # This constructs a set that contains the - # letters from 'a' to 'z' and the digits - # from '0' to '9' - -These operations are supported by sets: - -================== ======================================================== -operation meaning -================== ======================================================== -``A + B`` union of two sets -``A * B`` intersection of two sets -``A - B`` difference of two sets (A without B's elements) -``A == B`` set equality -``A <= B`` subset relation (A is subset of B or equal to B) -``A < B`` strong subset relation (A is a real subset of B) -``e in A`` set membership (A contains element e) -``A -+- B`` symmetric set difference (= (A - B) + (B - A)) -``card(A)`` the cardinality of A (number of elements in A) -``incl(A, elem)`` same as A = A + {elem} -``excl(A, elem)`` same as A = A - {elem} -================== ======================================================== - - -Reference and pointer types -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -References (similar to `pointers`:idx: in other programming languages) are a -way to introduce many-to-one relationships. This means different references can -point to and modify the same location in memory (also called `aliasing`:idx:). - -Nimrod distinguishes between `traced`:idx: and `untraced`:idx: references. -Untraced references are also called *pointers*. Traced references point to -objects of a garbage collected heap, untraced references point to -manually allocated objects or to objects somewhere else in memory. Thus -untraced references are *unsafe*. However for certain low-level operations -(accessing the hardware) untraced references are unavoidable. - -Traced references are declared with the **ref** keyword, untraced references -are declared with the **ptr** keyword. - -An empty subscript ``[]`` notation can be used to derefer a reference, -the ``addr`` procedure returns the address of an item. An address is always -an untraced reference. -Thus the usage of ``addr`` is an *unsafe* feature. - -The ``.`` (access a tuple/object field operator) -and ``[]`` (array/string/sequence index operator) operators perform implicit -dereferencing operations for reference types: - -.. code-block:: nimrod - - type - PNode = ref TNode - TNode = object - le, ri: PNode - data: int - - var - n: PNode - new(n) - n.data = 9 - # no need to write n[].data; in fact n[].data is highly discouraged! +the ``case`` statement: The branches in a ``case`` section may be indented too. + + +Set type +~~~~~~~~ +The `set type`:idx: models the mathematical notion of a set. The set's +basetype can only be an ordinal type. The reason is that sets are implemented +as high performance bit vectors. + +Sets can be constructed via the set constructor: ``{}`` is the empty set. The +empty set is type compatible with any special set type. The constructor +can also be used to include elements (and ranges of elements) in the set: + +.. code-block:: nimrod + + {'a'..'z', '0'..'9'} # This constructs a set that contains the + # letters from 'a' to 'z' and the digits + # from '0' to '9' + +These operations are supported by sets: + +================== ======================================================== +operation meaning +================== ======================================================== +``A + B`` union of two sets +``A * B`` intersection of two sets +``A - B`` difference of two sets (A without B's elements) +``A == B`` set equality +``A <= B`` subset relation (A is subset of B or equal to B) +``A < B`` strong subset relation (A is a real subset of B) +``e in A`` set membership (A contains element e) +``A -+- B`` symmetric set difference (= (A - B) + (B - A)) +``card(A)`` the cardinality of A (number of elements in A) +``incl(A, elem)`` same as A = A + {elem} +``excl(A, elem)`` same as A = A - {elem} +================== ======================================================== + + +Reference and pointer types +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +References (similar to `pointers`:idx: in other programming languages) are a +way to introduce many-to-one relationships. This means different references can +point to and modify the same location in memory (also called `aliasing`:idx:). + +Nimrod distinguishes between `traced`:idx: and `untraced`:idx: references. +Untraced references are also called *pointers*. Traced references point to +objects of a garbage collected heap, untraced references point to +manually allocated objects or to objects somewhere else in memory. Thus +untraced references are *unsafe*. However for certain low-level operations +(accessing the hardware) untraced references are unavoidable. + +Traced references are declared with the **ref** keyword, untraced references +are declared with the **ptr** keyword. + +An empty subscript ``[]`` notation can be used to derefer a reference, +the ``addr`` procedure returns the address of an item. An address is always +an untraced reference. +Thus the usage of ``addr`` is an *unsafe* feature. + +The ``.`` (access a tuple/object field operator) +and ``[]`` (array/string/sequence index operator) operators perform implicit +dereferencing operations for reference types: + +.. code-block:: nimrod + + type + PNode = ref TNode + TNode = object + le, ri: PNode + data: int + + var + n: PNode + new(n) + n.data = 9 + # no need to write n[].data; in fact n[].data is highly discouraged! As a syntactical extension ``object`` types can be anonymous if declared in a type section via the ``ref object`` or ``ptr object`` notations. This feature is useful if an object should only gain reference semantics: -.. code-block:: nimrod - - type - Node = ref object - le, ri: Node - data: int - - -To allocate a new traced object, the built-in procedure ``new`` has to be used. -To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and -``realloc`` can be used. The documentation of the system module contains -further information. - -If a reference points to *nothing*, it has the value ``nil``. - -Special care has to be taken if an untraced object contains traced objects like -traced references, strings or sequences: in order to free everything properly, -the built-in procedure ``GCunref`` has to be called before freeing the untraced -memory manually: - -.. code-block:: nimrod - type - TData = tuple[x, y: int, s: string] - - # allocate memory for TData on the heap: - var d = cast[ptr TData](alloc0(sizeof(TData))) - - # create a new string on the garbage collected heap: - d.s = "abc" - - # tell the GC that the string is not needed anymore: - GCunref(d.s) - - # free the memory: - dealloc(d) - -Without the ``GCunref`` call the memory allocated for the ``d.s`` string would -never be freed. The example also demonstrates two important features for low -level programming: the ``sizeof`` proc returns the size of a type or value -in bytes. The ``cast`` operator can circumvent the type system: the compiler -is forced to treat the result of the ``alloc0`` call (which returns an untyped -pointer) as if it would have the type ``ptr TData``. Casting should only be -done if it is unavoidable: it breaks type safety and bugs can lead to -mysterious crashes. - -**Note**: The example only works because the memory is initialized to zero -(``alloc0`` instead of ``alloc`` does this): ``d.s`` is thus initialized to -``nil`` which the string assignment can handle. You need to know low level -details like this when mixing garbage collected data with unmanaged memory. - -.. XXX finalizers for traced objects - - -Procedural type -~~~~~~~~~~~~~~~ -A `procedural type`:idx: is internally a pointer to a procedure. ``nil`` is -an allowed value for variables of a procedural type. Nimrod uses procedural -types to achieve `functional`:idx: programming techniques. - -Examples: - -.. code-block:: nimrod - - type - TCallback = proc (x: int) {.cdecl.} - - proc printItem(x: Int) = ... - - proc forEach(c: TCallback) = - ... - - forEach(printItem) # this will NOT work because calling conventions differ - - -.. code-block:: nimrod - - type - TOnMouseMove = proc (x, y: int) {.closure.} - - proc onMouseMove(mouseX, mouseY: int) = - # has default calling convention - echo "x: ", mouseX, " y: ", mouseY - - proc setOnMouseMove(mouseMoveEvent: TOnMouseMove) = nil - - # ok, 'onMouseMove' has the default calling convention, which is compatible - # to 'closure': - setOnMouseMove(onMouseMove) - - -A subtle issue with procedural types is that the calling convention of the -procedure influences the type compatibility: procedural types are only -compatible if they have the same calling convention. As a special extension, -a procedure of the calling convention ``nimcall`` can be passed to a parameter -that expects a proc of the calling convention ``closure``. - -Nimrod supports these `calling conventions`:idx:\: - -`nimcall`:idx: - is the default convention used for a Nimrod **proc**. It is the - same as ``fastcall``, but only for C compilers that support ``fastcall``. - +.. code-block:: nimrod + + type + Node = ref object + le, ri: Node + data: int + + +To allocate a new traced object, the built-in procedure ``new`` has to be used. +To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and +``realloc`` can be used. The documentation of the system module contains +further information. + +If a reference points to *nothing*, it has the value ``nil``. + +Special care has to be taken if an untraced object contains traced objects like +traced references, strings or sequences: in order to free everything properly, +the built-in procedure ``GCunref`` has to be called before freeing the untraced +memory manually: + +.. code-block:: nimrod + type + TData = tuple[x, y: int, s: string] + + # allocate memory for TData on the heap: + var d = cast[ptr TData](alloc0(sizeof(TData))) + + # create a new string on the garbage collected heap: + d.s = "abc" + + # tell the GC that the string is not needed anymore: + GCunref(d.s) + + # free the memory: + dealloc(d) + +Without the ``GCunref`` call the memory allocated for the ``d.s`` string would +never be freed. The example also demonstrates two important features for low +level programming: the ``sizeof`` proc returns the size of a type or value +in bytes. The ``cast`` operator can circumvent the type system: the compiler +is forced to treat the result of the ``alloc0`` call (which returns an untyped +pointer) as if it would have the type ``ptr TData``. Casting should only be +done if it is unavoidable: it breaks type safety and bugs can lead to +mysterious crashes. + +**Note**: The example only works because the memory is initialized to zero +(``alloc0`` instead of ``alloc`` does this): ``d.s`` is thus initialized to +``nil`` which the string assignment can handle. You need to know low level +details like this when mixing garbage collected data with unmanaged memory. + +.. XXX finalizers for traced objects + + +Procedural type +~~~~~~~~~~~~~~~ +A `procedural type`:idx: is internally a pointer to a procedure. ``nil`` is +an allowed value for variables of a procedural type. Nimrod uses procedural +types to achieve `functional`:idx: programming techniques. + +Examples: + +.. code-block:: nimrod + + type + TCallback = proc (x: int) {.cdecl.} + + proc printItem(x: Int) = ... + + proc forEach(c: TCallback) = + ... + + forEach(printItem) # this will NOT work because calling conventions differ + + +.. code-block:: nimrod + + type + TOnMouseMove = proc (x, y: int) {.closure.} + + proc onMouseMove(mouseX, mouseY: int) = + # has default calling convention + echo "x: ", mouseX, " y: ", mouseY + + proc setOnMouseMove(mouseMoveEvent: TOnMouseMove) = nil + + # ok, 'onMouseMove' has the default calling convention, which is compatible + # to 'closure': + setOnMouseMove(onMouseMove) + + +A subtle issue with procedural types is that the calling convention of the +procedure influences the type compatibility: procedural types are only +compatible if they have the same calling convention. As a special extension, +a procedure of the calling convention ``nimcall`` can be passed to a parameter +that expects a proc of the calling convention ``closure``. + +Nimrod supports these `calling conventions`:idx:\: + +`nimcall`:idx: + is the default convention used for a Nimrod **proc**. It is the + same as ``fastcall``, but only for C compilers that support ``fastcall``. + `closure`:idx: is the default calling convention for a **procedural type** that lacks any pragma annotations. It indicates that the procedure has a hidden implicit parameter (an *environment*). Proc vars that have the calling convention ``closure`` take up two machine words: One for the proc pointer - and another one for the pointer to implicitely passed environment. - -`stdcall`:idx: - This the stdcall convention as specified by Microsoft. The generated C - procedure is declared with the ``__stdcall`` keyword. - -`cdecl`:idx: - The cdecl convention means that a procedure shall use the same convention - as the C compiler. Under windows the generated C procedure is declared with - the ``__cdecl`` keyword. - -`safecall`:idx: - This is the safecall convention as specified by Microsoft. The generated C - procedure is declared with the ``__safecall`` keyword. The word *safe* - refers to the fact that all hardware registers shall be pushed to the - hardware stack. - -`inline`:idx: - The inline convention means the the caller should not call the procedure, - but inline its code directly. Note that Nimrod does not inline, but leaves - this to the C compiler; it generates ``__inline`` procedures. This is - only a hint for the compiler: it may completely ignore it and - it may inline procedures that are not marked as ``inline``. - -`fastcall`:idx: - Fastcall means different things to different C compilers. One gets whatever - the C ``__fastcall`` means. - -`syscall`:idx: - The syscall convention is the same as ``__syscall`` in C. It is used for - interrupts. - -`noconv`:idx: - The generated C code will not have any explicit calling convention and thus - use the C compiler's default calling convention. This is needed because - Nimrod's default calling convention for procedures is ``fastcall`` to - improve speed. - -Most calling conventions exist only for the Windows 32-bit platform. - -Assigning/passing a procedure to a procedural variable is only allowed if one -of the following conditions hold: -1) The procedure that is accessed resists in the current module. -2) The procedure is marked with the ``procvar`` pragma (see `procvar pragma`_). -3) The procedure has a calling convention that differs from ``nimcall``. -4) The procedure is anonymous. - -The rules' purpose is to prevent the case that extending a non-``procvar`` -procedure with default parameters breaks client code. - -The default calling convention is ``nimcall``, unless it is an inner proc ( -a proc inside of a proc). For an inner proc an analysis is performed wether it -accesses its environment. If it does so, it has the calling convention -``closure``, otherwise it has the calling convention ``nimcall``. - - -Distinct type -~~~~~~~~~~~~~ - -A distinct type is new type derived from a `base type`:idx: that is -incompatible with its base type. In particular, it is an essential property -of a distinct type that it **does not** imply a subtype relation between it -and its base type. Explicit type conversions from a distinct type to its -base type and vice versa are allowed. - -A distinct type can be used to model different physical `units`:idx: with a -numerical base type, for example. The following example models currencies. - -Different currencies should not be mixed in monetary calculations. Distinct -types are a perfect tool to model different currencies: - -.. code-block:: nimrod - type - TDollar = distinct int - TEuro = distinct int - - var - d: TDollar - e: TEuro - - echo d + 12 - # Error: cannot add a number with no unit and a ``TDollar`` - -Unfortunately, ``d + 12.TDollar`` is not allowed either, -because ``+`` is defined for ``int`` (among others), not for ``TDollar``. So -a ``+`` for dollars needs to be defined: - -.. code-block:: - proc `+` (x, y: TDollar): TDollar = - result = TDollar(int(x) + int(y)) - -It does not make sense to multiply a dollar with a dollar, but with a -number without unit; and the same holds for division: - -.. code-block:: - proc `*` (x: TDollar, y: int): TDollar = - result = TDollar(int(x) * y) - - proc `*` (x: int, y: TDollar): TDollar = - result = TDollar(x * int(y)) - - proc `div` ... - -This quickly gets tedious. The implementations are trivial and the compiler -should not generate all this code only to optimize it away later - after all -``+`` for dollars should produce the same binary code as ``+`` for ints. -The pragma ``borrow`` has been designed to solve this problem; in principle -it generates the above trivial implementations: - -.. code-block:: nimrod - proc `*` (x: TDollar, y: int): TDollar {.borrow.} - proc `*` (x: int, y: TDollar): TDollar {.borrow.} - proc `div` (x: TDollar, y: int): TDollar {.borrow.} - -The ``borrow`` pragma makes the compiler use the same implementation as -the proc that deals with the distinct type's base type, so no code is -generated. - -But it seems all this boilerplate code needs to be repeated for the ``TEuro`` -currency. This can be solved with templates_. - -.. code-block:: nimrod - template Additive(typ: typeDesc): stmt = - proc `+` *(x, y: typ): typ {.borrow.} - proc `-` *(x, y: typ): typ {.borrow.} - - # unary operators: - proc `+` *(x: typ): typ {.borrow.} - proc `-` *(x: typ): typ {.borrow.} - - template Multiplicative(typ, base: typeDesc): stmt = - proc `*` *(x: typ, y: base): typ {.borrow.} - proc `*` *(x: base, y: typ): typ {.borrow.} - proc `div` *(x: typ, y: base): typ {.borrow.} - proc `mod` *(x: typ, y: base): typ {.borrow.} - - template Comparable(typ: typeDesc): stmt = - proc `<` * (x, y: typ): bool {.borrow.} - proc `<=` * (x, y: typ): bool {.borrow.} - proc `==` * (x, y: typ): bool {.borrow.} - - template DefineCurrency(typ, base: expr): stmt = - type - typ* = distinct base - Additive(typ) - Multiplicative(typ, base) - Comparable(typ) - - DefineCurrency(TDollar, int) - DefineCurrency(TEuro, int) - - -Void type -~~~~~~~~~ - -The `void`:idx: type denotes the absense of any type. Parameters of -type ``void`` are treated as non-existent, ``void`` as a return type means that -the procedure does not return a value: - -.. code-block:: nimrod - proc nothing(x, y: void): void = - echo "ha" - - nothing() # writes "ha" to stdout - -The ``void`` type is particularly useful for generic code: - -.. code-block:: nimrod - proc callProc[T](p: proc (x: T), x: T) = - when T is void: - p() - else: - p(x) - - proc intProc(x: int) = nil - proc emptyProc() = nil - - callProc[int](intProc, 12) - callProc[void](emptyProc) - -However, a ``void`` type cannot be inferred in generic code: - -.. code-block:: nimrod - callProc(emptyProc) - # Error: type mismatch: got (proc ()) - # but expected one of: - # callProc(p: proc (T), x: T) - -The ``void`` type is only valid for parameters and return types; other symbols -cannot have the type ``void``. - - -Type relations --------------- - -The following section defines several relations on types that are needed to -describe the type checking done by the compiler. - - -Type equality -~~~~~~~~~~~~~ -Nimrod uses structural type equivalence for most types. Only for objects, -enumerations and distinct types name equivalence is used. The following -algorithm (in pseudo-code) determines type equality: - -.. code-block:: nimrod - proc typeEqualsAux(a, b: PType, - s: var set[PType * PType]): bool = - if (a,b) in s: return true - incl(s, (a,b)) - if a.kind == b.kind: - case a.kind - of int, intXX, float, floatXX, char, string, cstring, pointer, - bool, nil, void: - # leaf type: kinds identical; nothing more to check - result = true - of ref, ptr, var, set, seq, openarray: - result = typeEqualsAux(a.baseType, b.baseType, s) - of range: - result = typeEqualsAux(a.baseType, b.baseType, s) and - (a.rangeA == b.rangeA) and (a.rangeB == b.rangeB) - of array: - result = typeEqualsAux(a.baseType, b.baseType, s) and - typeEqualsAux(a.indexType, b.indexType, s) - of tuple: - if a.tupleLen == b.tupleLen: - for i in 0..a.tupleLen-1: - if not typeEqualsAux(a[i], b[i], s): return false - result = true - of object, enum, distinct: - result = a == b - of proc: - result = typeEqualsAux(a.parameterTuple, b.parameterTuple, s) and - typeEqualsAux(a.resultType, b.resultType, s) and - a.callingConvention == b.callingConvention - - proc typeEquals(a, b: PType): bool = - var s: set[PType * PType] = {} - result = typeEqualsAux(a, b, s) - -Since types are graphs which can have cycles, the above algorithm needs an -auxiliary set ``s`` to detect this case. - - -Type equality modulo type distinction -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The following algorithm (in pseudo-code) determines whether two types -are equal with no respect to ``distinct`` types. For brevity the cycle check -with an auxiliary set ``s`` is omitted: - -.. code-block:: nimrod - proc typeEqualsOrDistinct(a, b: PType): bool = - if a.kind == b.kind: - case a.kind - of int, intXX, float, floatXX, char, string, cstring, pointer, - bool, nil, void: - # leaf type: kinds identical; nothing more to check - result = true - of ref, ptr, var, set, seq, openarray: - result = typeEqualsOrDistinct(a.baseType, b.baseType) - of range: - result = typeEqualsOrDistinct(a.baseType, b.baseType) and - (a.rangeA == b.rangeA) and (a.rangeB == b.rangeB) - of array: - result = typeEqualsOrDistinct(a.baseType, b.baseType) and - typeEqualsOrDistinct(a.indexType, b.indexType) - of tuple: - if a.tupleLen == b.tupleLen: - for i in 0..a.tupleLen-1: - if not typeEqualsOrDistinct(a[i], b[i]): return false - result = true - of distinct: - result = typeEqualsOrDistinct(a.baseType, b.baseType) - of object, enum: - result = a == b - of proc: - result = typeEqualsOrDistinct(a.parameterTuple, b.parameterTuple) and - typeEqualsOrDistinct(a.resultType, b.resultType) and - a.callingConvention == b.callingConvention - elif a.kind == distinct: - result = typeEqualsOrDistinct(a.baseType, b) - elif b.kind == distinct: - result = typeEqualsOrDistinct(a, b.baseType) - - -Subtype relation -~~~~~~~~~~~~~~~~ -If object ``a`` inherits from ``b``, ``a`` is a subtype of ``b``. This subtype -relation is extended to the types ``var``, ``ref``, ``ptr``: - -.. code-block:: nimrod - proc isSubtype(a, b: PType): bool = - if a.kind == b.kind: - case a.kind - of object: - var aa = a.baseType - while aa != nil and aa != b: aa = aa.baseType - result = aa == b - of var, ref, ptr: - result = isSubtype(a.baseType, b.baseType) - -.. XXX nil is a special value! - - -Convertible relation -~~~~~~~~~~~~~~~~~~~~ -A type ``a`` is **implicitly** convertible to type ``b`` iff the following -algorithm returns true: - -.. code-block:: nimrod - # XXX range types? - proc isImplicitlyConvertible(a, b: PType): bool = - case a.kind + and another one for the pointer to implicitely passed environment. + +`stdcall`:idx: + This the stdcall convention as specified by Microsoft. The generated C + procedure is declared with the ``__stdcall`` keyword. + +`cdecl`:idx: + The cdecl convention means that a procedure shall use the same convention + as the C compiler. Under windows the generated C procedure is declared with + the ``__cdecl`` keyword. + +`safecall`:idx: + This is the safecall convention as specified by Microsoft. The generated C + procedure is declared with the ``__safecall`` keyword. The word *safe* + refers to the fact that all hardware registers shall be pushed to the + hardware stack. + +`inline`:idx: + The inline convention means the the caller should not call the procedure, + but inline its code directly. Note that Nimrod does not inline, but leaves + this to the C compiler; it generates ``__inline`` procedures. This is + only a hint for the compiler: it may completely ignore it and + it may inline procedures that are not marked as ``inline``. + +`fastcall`:idx: + Fastcall means different things to different C compilers. One gets whatever + the C ``__fastcall`` means. + +`syscall`:idx: + The syscall convention is the same as ``__syscall`` in C. It is used for + interrupts. + +`noconv`:idx: + The generated C code will not have any explicit calling convention and thus + use the C compiler's default calling convention. This is needed because + Nimrod's default calling convention for procedures is ``fastcall`` to + improve speed. + +Most calling conventions exist only for the Windows 32-bit platform. + +Assigning/passing a procedure to a procedural variable is only allowed if one +of the following conditions hold: +1) The procedure that is accessed resists in the current module. +2) The procedure is marked with the ``procvar`` pragma (see `procvar pragma`_). +3) The procedure has a calling convention that differs from ``nimcall``. +4) The procedure is anonymous. + +The rules' purpose is to prevent the case that extending a non-``procvar`` +procedure with default parameters breaks client code. + +The default calling convention is ``nimcall``, unless it is an inner proc ( +a proc inside of a proc). For an inner proc an analysis is performed wether it +accesses its environment. If it does so, it has the calling convention +``closure``, otherwise it has the calling convention ``nimcall``. + + +Distinct type +~~~~~~~~~~~~~ + +A distinct type is new type derived from a `base type`:idx: that is +incompatible with its base type. In particular, it is an essential property +of a distinct type that it **does not** imply a subtype relation between it +and its base type. Explicit type conversions from a distinct type to its +base type and vice versa are allowed. + +A distinct type can be used to model different physical `units`:idx: with a +numerical base type, for example. The following example models currencies. + +Different currencies should not be mixed in monetary calculations. Distinct +types are a perfect tool to model different currencies: + +.. code-block:: nimrod + type + TDollar = distinct int + TEuro = distinct int + + var + d: TDollar + e: TEuro + + echo d + 12 + # Error: cannot add a number with no unit and a ``TDollar`` + +Unfortunately, ``d + 12.TDollar`` is not allowed either, +because ``+`` is defined for ``int`` (among others), not for ``TDollar``. So +a ``+`` for dollars needs to be defined: + +.. code-block:: + proc `+` (x, y: TDollar): TDollar = + result = TDollar(int(x) + int(y)) + +It does not make sense to multiply a dollar with a dollar, but with a +number without unit; and the same holds for division: + +.. code-block:: + proc `*` (x: TDollar, y: int): TDollar = + result = TDollar(int(x) * y) + + proc `*` (x: int, y: TDollar): TDollar = + result = TDollar(x * int(y)) + + proc `div` ... + +This quickly gets tedious. The implementations are trivial and the compiler +should not generate all this code only to optimize it away later - after all +``+`` for dollars should produce the same binary code as ``+`` for ints. +The pragma ``borrow`` has been designed to solve this problem; in principle +it generates the above trivial implementations: + +.. code-block:: nimrod + proc `*` (x: TDollar, y: int): TDollar {.borrow.} + proc `*` (x: int, y: TDollar): TDollar {.borrow.} + proc `div` (x: TDollar, y: int): TDollar {.borrow.} + +The ``borrow`` pragma makes the compiler use the same implementation as +the proc that deals with the distinct type's base type, so no code is +generated. + +But it seems all this boilerplate code needs to be repeated for the ``TEuro`` +currency. This can be solved with templates_. + +.. code-block:: nimrod + template Additive(typ: typeDesc): stmt = + proc `+` *(x, y: typ): typ {.borrow.} + proc `-` *(x, y: typ): typ {.borrow.} + + # unary operators: + proc `+` *(x: typ): typ {.borrow.} + proc `-` *(x: typ): typ {.borrow.} + + template Multiplicative(typ, base: typeDesc): stmt = + proc `*` *(x: typ, y: base): typ {.borrow.} + proc `*` *(x: base, y: typ): typ {.borrow.} + proc `div` *(x: typ, y: base): typ {.borrow.} + proc `mod` *(x: typ, y: base): typ {.borrow.} + + template Comparable(typ: typeDesc): stmt = + proc `<` * (x, y: typ): bool {.borrow.} + proc `<=` * (x, y: typ): bool {.borrow.} + proc `==` * (x, y: typ): bool {.borrow.} + + template DefineCurrency(typ, base: expr): stmt = + type + typ* = distinct base + Additive(typ) + Multiplicative(typ, base) + Comparable(typ) + + DefineCurrency(TDollar, int) + DefineCurrency(TEuro, int) + + +Void type +~~~~~~~~~ + +The `void`:idx: type denotes the absense of any type. Parameters of +type ``void`` are treated as non-existent, ``void`` as a return type means that +the procedure does not return a value: + +.. code-block:: nimrod + proc nothing(x, y: void): void = + echo "ha" + + nothing() # writes "ha" to stdout + +The ``void`` type is particularly useful for generic code: + +.. code-block:: nimrod + proc callProc[T](p: proc (x: T), x: T) = + when T is void: + p() + else: + p(x) + + proc intProc(x: int) = nil + proc emptyProc() = nil + + callProc[int](intProc, 12) + callProc[void](emptyProc) + +However, a ``void`` type cannot be inferred in generic code: + +.. code-block:: nimrod + callProc(emptyProc) + # Error: type mismatch: got (proc ()) + # but expected one of: + # callProc(p: proc (T), x: T) + +The ``void`` type is only valid for parameters and return types; other symbols +cannot have the type ``void``. + + +Type relations +-------------- + +The following section defines several relations on types that are needed to +describe the type checking done by the compiler. + + +Type equality +~~~~~~~~~~~~~ +Nimrod uses structural type equivalence for most types. Only for objects, +enumerations and distinct types name equivalence is used. The following +algorithm (in pseudo-code) determines type equality: + +.. code-block:: nimrod + proc typeEqualsAux(a, b: PType, + s: var set[PType * PType]): bool = + if (a,b) in s: return true + incl(s, (a,b)) + if a.kind == b.kind: + case a.kind + of int, intXX, float, floatXX, char, string, cstring, pointer, + bool, nil, void: + # leaf type: kinds identical; nothing more to check + result = true + of ref, ptr, var, set, seq, openarray: + result = typeEqualsAux(a.baseType, b.baseType, s) + of range: + result = typeEqualsAux(a.baseType, b.baseType, s) and + (a.rangeA == b.rangeA) and (a.rangeB == b.rangeB) + of array: + result = typeEqualsAux(a.baseType, b.baseType, s) and + typeEqualsAux(a.indexType, b.indexType, s) + of tuple: + if a.tupleLen == b.tupleLen: + for i in 0..a.tupleLen-1: + if not typeEqualsAux(a[i], b[i], s): return false + result = true + of object, enum, distinct: + result = a == b + of proc: + result = typeEqualsAux(a.parameterTuple, b.parameterTuple, s) and + typeEqualsAux(a.resultType, b.resultType, s) and + a.callingConvention == b.callingConvention + + proc typeEquals(a, b: PType): bool = + var s: set[PType * PType] = {} + result = typeEqualsAux(a, b, s) + +Since types are graphs which can have cycles, the above algorithm needs an +auxiliary set ``s`` to detect this case. + + +Type equality modulo type distinction +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following algorithm (in pseudo-code) determines whether two types +are equal with no respect to ``distinct`` types. For brevity the cycle check +with an auxiliary set ``s`` is omitted: + +.. code-block:: nimrod + proc typeEqualsOrDistinct(a, b: PType): bool = + if a.kind == b.kind: + case a.kind + of int, intXX, float, floatXX, char, string, cstring, pointer, + bool, nil, void: + # leaf type: kinds identical; nothing more to check + result = true + of ref, ptr, var, set, seq, openarray: + result = typeEqualsOrDistinct(a.baseType, b.baseType) + of range: + result = typeEqualsOrDistinct(a.baseType, b.baseType) and + (a.rangeA == b.rangeA) and (a.rangeB == b.rangeB) + of array: + result = typeEqualsOrDistinct(a.baseType, b.baseType) and + typeEqualsOrDistinct(a.indexType, b.indexType) + of tuple: + if a.tupleLen == b.tupleLen: + for i in 0..a.tupleLen-1: + if not typeEqualsOrDistinct(a[i], b[i]): return false + result = true + of distinct: + result = typeEqualsOrDistinct(a.baseType, b.baseType) + of object, enum: + result = a == b + of proc: + result = typeEqualsOrDistinct(a.parameterTuple, b.parameterTuple) and + typeEqualsOrDistinct(a.resultType, b.resultType) and + a.callingConvention == b.callingConvention + elif a.kind == distinct: + result = typeEqualsOrDistinct(a.baseType, b) + elif b.kind == distinct: + result = typeEqualsOrDistinct(a, b.baseType) + + +Subtype relation +~~~~~~~~~~~~~~~~ +If object ``a`` inherits from ``b``, ``a`` is a subtype of ``b``. This subtype +relation is extended to the types ``var``, ``ref``, ``ptr``: + +.. code-block:: nimrod + proc isSubtype(a, b: PType): bool = + if a.kind == b.kind: + case a.kind + of object: + var aa = a.baseType + while aa != nil and aa != b: aa = aa.baseType + result = aa == b + of var, ref, ptr: + result = isSubtype(a.baseType, b.baseType) + +.. XXX nil is a special value! + + +Convertible relation +~~~~~~~~~~~~~~~~~~~~ +A type ``a`` is **implicitly** convertible to type ``b`` iff the following +algorithm returns true: + +.. code-block:: nimrod + # XXX range types? + proc isImplicitlyConvertible(a, b: PType): bool = + case a.kind of int: result = b in {int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, float, float32, float64} - of int8: result = b in {int16, int32, int64, int} - of int16: result = b in {int32, int64, int} - of int32: result = b in {int64, int} + of int8: result = b in {int16, int32, int64, int} + of int16: result = b in {int32, int64, int} + of int32: result = b in {int64, int} of uint: result = b in {uint32, uint64} of uint8: result = b in {uint16, uint32, uint64} of uint16: result = b in {uint32, uint64} of uint32: result = b in {uint64} - of float: result = b in {float32, float64} - of float32: result = b in {float64, float} - of float64: result = b in {float32, float} - of seq: - result = b == openArray and typeEquals(a.baseType, b.baseType) - of array: - result = b == openArray and typeEquals(a.baseType, b.baseType) - if a.baseType == char and a.indexType.rangeA == 0: - result = b = cstring - of cstring, ptr: - result = b == pointer - of string: - result = b == cstring - -A type ``a`` is **explicitly** convertible to type ``b`` iff the following -algorithm returns true: - -.. code-block:: nimrod - proc isIntegralType(t: PType): bool = - result = isOrdinal(t) or t.kind in {float, float32, float64} - - proc isExplicitlyConvertible(a, b: PType): bool = - if isImplicitlyConvertible(a, b): return true - if typeEqualsOrDistinct(a, b): return true - if isIntegralType(a) and isIntegralType(b): return true - if isSubtype(a, b) or isSubtype(b, a): return true - return false - -The convertible relation can be relaxed by a user-defined type -`converter`:idx:. - -.. code-block:: nimrod - converter toInt(x: char): int = result = ord(x) - - var - x: int - chr: char = 'a' - - # implicit conversion magic happens here - x = chr - echo x # => 97 - # you can use the explicit form too - x = chr.toInt - echo x # => 97 - -The type conversion ``T(a)`` is an L-value if ``a`` is an L-value and -``typeEqualsOrDistinct(T, type(a))`` holds. - - -Assignment compatibility -~~~~~~~~~~~~~~~~~~~~~~~~ - -An expression ``b`` can be assigned to an expression ``a`` iff ``a`` is an -`l-value` and ``isImplicitlyConvertible(b.typ, a.typ)`` holds. - - -Overloading resolution -~~~~~~~~~~~~~~~~~~~~~~ - -To be written. - - -Statements and expressions --------------------------- -Nimrod uses the common statement/expression paradigm: `Statements`:idx: do not -produce a value in contrast to expressions. Call expressions are statements. -If the called procedure returns a value, it is not a valid statement -as statements do not produce values. To evaluate an expression for -side-effects and throw its value away, one can use the ``discard`` statement. - -Statements are separated into `simple statements`:idx: and -`complex statements`:idx:. -Simple statements are statements that cannot contain other statements like -assignments, calls or the ``return`` statement; complex statements can -contain other statements. To avoid the `dangling else problem`:idx:, complex -statements always have to be intended:: - - simpleStmt ::= returnStmt - | yieldStmt - | discardStmt - | raiseStmt - | breakStmt - | continueStmt - | pragma - | importStmt - | fromStmt - | includeStmt - | exprStmt - complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt - | blockStmt | asmStmt - | procDecl | iteratorDecl | macroDecl | templateDecl - | constSection | letSection - | typeSection | whenStmt | varSection - - - -Discard statement -~~~~~~~~~~~~~~~~~ - -Syntax:: - - discardStmt ::= 'discard' expr - -Example: - -.. code-block:: nimrod - proc p(x, y: int): int = - return x + y - - discard p(3, 4) # discard the return value of `p` - -The `discard`:idx: statement evaluates its expression for side-effects and -throws the expression's resulting value away. - -Ignoring the return value of a procedure without using a discard statement is -a static error. - -The return value can be ignored implicitely if the called proc/iterator has -been declared with the `discardable`:idx: pragma: - -.. code-block:: nimrod - proc p(x, y: int): int {.discardable.} = - return x + y - - p(3, 4) # now valid - - -Var statement -~~~~~~~~~~~~~ - -Syntax:: - - colonOrEquals ::= ':' typeDesc ['=' expr] | '=' expr - varField ::= symbol ['*'] [pragma] - varPart ::= symbol (comma symbol)* [comma] colonOrEquals [COMMENT | IND COMMENT] - varSection ::= 'var' (varPart - | indPush (COMMENT|varPart) - (SAD (COMMENT|varPart))* DED indPop) - - -`Var`:idx: statements declare new local and global variables and -initialize them. A comma separated list of variables can be used to specify -variables of the same type: - -.. code-block:: nimrod - - var - a: int = 0 - x, y, z: int - -If an initializer is given the type can be omitted: the variable is then of the -same type as the initializing expression. Variables are always initialized -with a default value if there is no initializing expression. The default -value depends on the type and is always a zero in binary. - -============================ ============================================== -Type default value -============================ ============================================== -any integer type 0 -any float 0.0 -char '\\0' -bool false -ref or pointer type nil -procedural type nil -sequence nil (*not* ``@[]``) -string nil (*not* "") -tuple[x: A, y: B, ...] (default(A), default(B), ...) - (analogous for objects) -array[0..., T] [default(T), ...] -range[T] default(T); this may be out of the valid range -T = enum cast[T](0); this may be an invalid value -============================ ============================================== - - -The implicit initialization can be avoided for optimization reasons with the -`noinit`:idx: pragma: - -.. code-block:: nimrod - var - a {.noInit.}: array [0..1023, char] - -If a proc is annotated with the ``noinit`` pragma this refers to its implicit -``result`` variable: - -.. code-block:: nimrod - proc returnUndefinedValue: int {.noinit.} = nil - - -let statement -~~~~~~~~~~~~~ - -A `Let`:idx: statement declares new local and global `single assignment`:idx: -variables and binds a value to them. The syntax is the of the ``var`` -statement, except that the keyword ``var`` is replaced by the keyword ``let``. -Let variables are not l-values and can thus not be passed to ``var`` parameters -nor can their address be taken. They cannot be assigned new values. - -For let variables the same pragmas are available as for ordinary variables. - - -Const section -~~~~~~~~~~~~~ - -Syntax:: - - colonAndEquals ::= [':' typeDesc] '=' expr - - constDecl ::= symbol ['*'] [pragma] colonAndEquals [COMMENT | IND COMMENT] - | COMMENT - constSection ::= 'const' indPush constDecl (SAD constDecl)* DED indPop - -`Constants`:idx: are symbols which are bound to a value. The constant's value -cannot change. The compiler must be able to evaluate the expression in a -constant declaration at compile time. - -Nimrod contains a sophisticated compile-time evaluator, so procedures which -have no side-effect can be used in constant expressions too: - -.. code-block:: nimrod - import strutils - const - constEval = contains("abc", 'b') # computed at compile time! - - -The rules for compile-time computability are: - -1. Literals are compile-time computable. -2. Type conversions are compile-time computable. -3. Procedure calls of the form ``p(X)`` are compile-time computable if - ``p`` is a proc without side-effects (see the `noSideEffect pragma`_ - for details) and if ``X`` is a (possibly empty) list of compile-time - computable arguments. - - -Constants cannot be of type ``ptr``, ``ref``, ``var`` or ``object``, nor can -they contain such a type. - - -Static statement/expression -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Syntax:: - staticExpr ::= 'static' '(' optInd expr optPar ')' - staticStmt ::= 'static' ':' stmt - -A `static`:idx: statement/expression can be used to enforce compile -time evaluation explicitely. Enforced compile time evaluation can even evaluate -code that has side effects: - -.. code-block:: - - static: - echo "echo at compile time" - + of float: result = b in {float32, float64} + of float32: result = b in {float64, float} + of float64: result = b in {float32, float} + of seq: + result = b == openArray and typeEquals(a.baseType, b.baseType) + of array: + result = b == openArray and typeEquals(a.baseType, b.baseType) + if a.baseType == char and a.indexType.rangeA == 0: + result = b = cstring + of cstring, ptr: + result = b == pointer + of string: + result = b == cstring + +A type ``a`` is **explicitly** convertible to type ``b`` iff the following +algorithm returns true: + +.. code-block:: nimrod + proc isIntegralType(t: PType): bool = + result = isOrdinal(t) or t.kind in {float, float32, float64} + + proc isExplicitlyConvertible(a, b: PType): bool = + if isImplicitlyConvertible(a, b): return true + if typeEqualsOrDistinct(a, b): return true + if isIntegralType(a) and isIntegralType(b): return true + if isSubtype(a, b) or isSubtype(b, a): return true + return false + +The convertible relation can be relaxed by a user-defined type +`converter`:idx:. + +.. code-block:: nimrod + converter toInt(x: char): int = result = ord(x) + + var + x: int + chr: char = 'a' + + # implicit conversion magic happens here + x = chr + echo x # => 97 + # you can use the explicit form too + x = chr.toInt + echo x # => 97 + +The type conversion ``T(a)`` is an L-value if ``a`` is an L-value and +``typeEqualsOrDistinct(T, type(a))`` holds. + + +Assignment compatibility +~~~~~~~~~~~~~~~~~~~~~~~~ + +An expression ``b`` can be assigned to an expression ``a`` iff ``a`` is an +`l-value` and ``isImplicitlyConvertible(b.typ, a.typ)`` holds. + + +Overloading resolution +~~~~~~~~~~~~~~~~~~~~~~ + +To be written. + + +Statements and expressions +-------------------------- +Nimrod uses the common statement/expression paradigm: `Statements`:idx: do not +produce a value in contrast to expressions. Call expressions are statements. +If the called procedure returns a value, it is not a valid statement +as statements do not produce values. To evaluate an expression for +side-effects and throw its value away, one can use the ``discard`` statement. + +Statements are separated into `simple statements`:idx: and +`complex statements`:idx:. +Simple statements are statements that cannot contain other statements like +assignments, calls or the ``return`` statement; complex statements can +contain other statements. To avoid the `dangling else problem`:idx:, complex +statements always have to be intended:: + + simpleStmt ::= returnStmt + | yieldStmt + | discardStmt + | raiseStmt + | breakStmt + | continueStmt + | pragma + | importStmt + | fromStmt + | includeStmt + | exprStmt + complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt + | blockStmt | asmStmt + | procDecl | iteratorDecl | macroDecl | templateDecl + | constSection | letSection + | typeSection | whenStmt | varSection + + + +Discard statement +~~~~~~~~~~~~~~~~~ + +Syntax:: + + discardStmt ::= 'discard' expr + +Example: + +.. code-block:: nimrod + proc p(x, y: int): int = + return x + y + + discard p(3, 4) # discard the return value of `p` + +The `discard`:idx: statement evaluates its expression for side-effects and +throws the expression's resulting value away. + +Ignoring the return value of a procedure without using a discard statement is +a static error. + +The return value can be ignored implicitely if the called proc/iterator has +been declared with the `discardable`:idx: pragma: + +.. code-block:: nimrod + proc p(x, y: int): int {.discardable.} = + return x + y + + p(3, 4) # now valid + + +Var statement +~~~~~~~~~~~~~ + +Syntax:: + + colonOrEquals ::= ':' typeDesc ['=' expr] | '=' expr + varField ::= symbol ['*'] [pragma] + varPart ::= symbol (comma symbol)* [comma] colonOrEquals [COMMENT | IND COMMENT] + varSection ::= 'var' (varPart + | indPush (COMMENT|varPart) + (SAD (COMMENT|varPart))* DED indPop) + + +`Var`:idx: statements declare new local and global variables and +initialize them. A comma separated list of variables can be used to specify +variables of the same type: + +.. code-block:: nimrod + + var + a: int = 0 + x, y, z: int + +If an initializer is given the type can be omitted: the variable is then of the +same type as the initializing expression. Variables are always initialized +with a default value if there is no initializing expression. The default +value depends on the type and is always a zero in binary. + +============================ ============================================== +Type default value +============================ ============================================== +any integer type 0 +any float 0.0 +char '\\0' +bool false +ref or pointer type nil +procedural type nil +sequence nil (*not* ``@[]``) +string nil (*not* "") +tuple[x: A, y: B, ...] (default(A), default(B), ...) + (analogous for objects) +array[0..., T] [default(T), ...] +range[T] default(T); this may be out of the valid range +T = enum cast[T](0); this may be an invalid value +============================ ============================================== + + +The implicit initialization can be avoided for optimization reasons with the +`noinit`:idx: pragma: + +.. code-block:: nimrod + var + a {.noInit.}: array [0..1023, char] + +If a proc is annotated with the ``noinit`` pragma this refers to its implicit +``result`` variable: + +.. code-block:: nimrod + proc returnUndefinedValue: int {.noinit.} = nil + + +let statement +~~~~~~~~~~~~~ + +A `Let`:idx: statement declares new local and global `single assignment`:idx: +variables and binds a value to them. The syntax is the of the ``var`` +statement, except that the keyword ``var`` is replaced by the keyword ``let``. +Let variables are not l-values and can thus not be passed to ``var`` parameters +nor can their address be taken. They cannot be assigned new values. + +For let variables the same pragmas are available as for ordinary variables. + + +Const section +~~~~~~~~~~~~~ + +Syntax:: + + colonAndEquals ::= [':' typeDesc] '=' expr + + constDecl ::= symbol ['*'] [pragma] colonAndEquals [COMMENT | IND COMMENT] + | COMMENT + constSection ::= 'const' indPush constDecl (SAD constDecl)* DED indPop + +`Constants`:idx: are symbols which are bound to a value. The constant's value +cannot change. The compiler must be able to evaluate the expression in a +constant declaration at compile time. + +Nimrod contains a sophisticated compile-time evaluator, so procedures which +have no side-effect can be used in constant expressions too: + +.. code-block:: nimrod + import strutils + const + constEval = contains("abc", 'b') # computed at compile time! + + +The rules for compile-time computability are: + +1. Literals are compile-time computable. +2. Type conversions are compile-time computable. +3. Procedure calls of the form ``p(X)`` are compile-time computable if + ``p`` is a proc without side-effects (see the `noSideEffect pragma`_ + for details) and if ``X`` is a (possibly empty) list of compile-time + computable arguments. + + +Constants cannot be of type ``ptr``, ``ref``, ``var`` or ``object``, nor can +they contain such a type. + + +Static statement/expression +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Syntax:: + staticExpr ::= 'static' '(' optInd expr optPar ')' + staticStmt ::= 'static' ':' stmt + +A `static`:idx: statement/expression can be used to enforce compile +time evaluation explicitely. Enforced compile time evaluation can even evaluate +code that has side effects: + +.. code-block:: + + static: + echo "echo at compile time" + It's a static error if the compiler cannot perform the evaluation at compile -time. - -The current implementation poses some restrictions for compile time -evaluation: Code which contains ``cast`` or makes use of the foreign function -interface cannot be evaluated at compile time. Later versions of Nimrod will -support the FFI at compile time. - - -If statement -~~~~~~~~~~~~ - -Syntax:: - - ifStmt ::= 'if' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] - -Example: - -.. code-block:: nimrod - - var name = readLine(stdin) - - if name == "Andreas": - echo("What a nice name!") - elif name == "": - echo("Don't you have a name?") - else: - echo("Boring name...") - -The `if`:idx: statement is a simple way to make a branch in the control flow: -The expression after the keyword ``if`` is evaluated, if it is true -the corresponding statements after the ``:`` are executed. Otherwise -the expression after the ``elif`` is evaluated (if there is an -``elif`` branch), if it is true the corresponding statements after -the ``:`` are executed. This goes on until the last ``elif``. If all -conditions fail, the ``else`` part is executed. If there is no ``else`` -part, execution continues with the statement after the ``if`` statement. - - -Case statement -~~~~~~~~~~~~~~ - -Syntax:: - - caseStmt ::= 'case' expr [':'] ('of' sliceExprList ':' stmt)* - ('elif' expr ':' stmt)* - ['else' ':' stmt] - -Example: - -.. code-block:: nimrod - - case readline(stdin) - of "delete-everything", "restart-computer": - echo("permission denied") - of "go-for-a-walk": echo("please yourself") +time. + +The current implementation poses some restrictions for compile time +evaluation: Code which contains ``cast`` or makes use of the foreign function +interface cannot be evaluated at compile time. Later versions of Nimrod will +support the FFI at compile time. + + +If statement +~~~~~~~~~~~~ + +Syntax:: + + ifStmt ::= 'if' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] + +Example: + +.. code-block:: nimrod + + var name = readLine(stdin) + + if name == "Andreas": + echo("What a nice name!") + elif name == "": + echo("Don't you have a name?") + else: + echo("Boring name...") + +The `if`:idx: statement is a simple way to make a branch in the control flow: +The expression after the keyword ``if`` is evaluated, if it is true +the corresponding statements after the ``:`` are executed. Otherwise +the expression after the ``elif`` is evaluated (if there is an +``elif`` branch), if it is true the corresponding statements after +the ``:`` are executed. This goes on until the last ``elif``. If all +conditions fail, the ``else`` part is executed. If there is no ``else`` +part, execution continues with the statement after the ``if`` statement. + + +Case statement +~~~~~~~~~~~~~~ + +Syntax:: + + caseStmt ::= 'case' expr [':'] ('of' sliceExprList ':' stmt)* + ('elif' expr ':' stmt)* + ['else' ':' stmt] + +Example: + +.. code-block:: nimrod + + case readline(stdin) + of "delete-everything", "restart-computer": + echo("permission denied") + of "go-for-a-walk": echo("please yourself") else: echo("unknown command") # indentation of the branches is also allowed; and so is an optional colon # after the selecting expression: - case readline(stdin): - of "delete-everything", "restart-computer": - echo("permission denied") - of "go-for-a-walk": echo("please yourself") + case readline(stdin): + of "delete-everything", "restart-computer": + echo("permission denied") + of "go-for-a-walk": echo("please yourself") else: echo("unknown command") - - -The `case`:idx: statement is similar to the if statement, but it represents -a multi-branch selection. The expression after the keyword ``case`` is -evaluated and if its value is in a *slicelist* the corresponding statements -(after the ``of`` keyword) are executed. If the value is not in any -given *slicelist* the ``else`` part is executed. If there is no ``else`` -part and not all possible values that ``expr`` can hold occur in a -``slicelist``, a static error occurs. This holds only for expressions of + + +The `case`:idx: statement is similar to the if statement, but it represents +a multi-branch selection. The expression after the keyword ``case`` is +evaluated and if its value is in a *slicelist* the corresponding statements +(after the ``of`` keyword) are executed. If the value is not in any +given *slicelist* the ``else`` part is executed. If there is no ``else`` +part and not all possible values that ``expr`` can hold occur in a +``slicelist``, a static error occurs. This holds only for expressions of ordinal types. "All possible values" of ``expr`` are determined by ``expr``'s type. - -If the expression is not of an ordinal type, and no ``else`` part is -given, control passes after the ``case`` statement. - -To suppress the static error in the ordinal case an ``else`` part with a ``nil`` -statement can be used. - -As a special semantic extension, an expression in an ``of`` branch of a case -statement may evaluate to a set constructor; the set is then expanded into -a list of its elements: - -.. code-block:: nimrod - const - SymChars: set[char] = {'a'..'z', 'A'..'Z', '\x80'..'\xFF'} - - proc classify(s: string) = - case s[0] - of SymChars, '_': echo "an identifier" - of '0'..'9': echo "a number" - else: echo "other" - - # is equivalent to: - proc classify(s: string) = - case s[0] - of 'a'..'z', 'A'..'Z', '\x80'..'\xFF', '_': echo "an identifier" - of '0'..'9': echo "a number" - else: echo "other" - - -When statement -~~~~~~~~~~~~~~ - -Syntax:: - - whenStmt ::= 'when' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] - -Example: - -.. code-block:: nimrod - - when sizeof(int) == 2: - echo("running on a 16 bit system!") - elif sizeof(int) == 4: - echo("running on a 32 bit system!") - elif sizeof(int) == 8: - echo("running on a 64 bit system!") - else: - echo("cannot happen!") - -The `when`:idx: statement is almost identical to the ``if`` statement with some -exceptions: - -* Each ``expr`` has to be a constant expression (of type ``bool``). -* The statements do not open a new scope. -* The statements that belong to the expression that evaluated to true are - translated by the compiler, the other statements are not checked for - semantics! However, each ``expr`` is checked for semantics. - -The ``when`` statement enables conditional compilation techniques. As -a special syntactic extension, the ``when`` construct is also available -within ``object`` definitions. - - -Raise statement -~~~~~~~~~~~~~~~ - -Syntax:: - - raiseStmt ::= 'raise' [expr] - -Example: - -.. code-block:: nimrod - raise newEOS("operating system failed") - -Apart from built-in operations like array indexing, memory allocation, etc. -the ``raise`` statement is the only way to raise an exception. - -.. XXX document this better! - -If no exception name is given, the current exception is `re-raised`:idx:. The -`ENoExceptionToReraise`:idx: exception is raised if there is no exception to -re-raise. It follows that the ``raise`` statement *always* raises an -exception. - - -Try statement -~~~~~~~~~~~~~ - -Syntax:: - - qualifiedIdent ::= symbol ['.' symbol] - exceptList ::= [qualifiedIdent (comma qualifiedIdent)* [comma]] - tryStmt ::= 'try' ':' stmt - ('except' exceptList ':' stmt)* - ['finally' ':' stmt] - -Example: - -.. code-block:: nimrod - # read the first two lines of a text file that should contain numbers - # and tries to add them - var - f: TFile - if open(f, "numbers.txt"): - try: - var a = readLine(f) - var b = readLine(f) - echo("sum: " & $(parseInt(a) + parseInt(b))) - except EOverflow: - echo("overflow!") - except EInvalidValue: - echo("could not convert string to integer") - except EIO: - echo("IO error!") - except: - echo("Unknown exception!") - finally: - close(f) - - - -The statements after the `try`:idx: are executed in sequential order unless -an exception ``e`` is raised. If the exception type of ``e`` matches any -of the list ``exceptlist`` the corresponding statements are executed. -The statements following the ``except`` clauses are called -`exception handlers`:idx:. - -The empty `except`:idx: clause is executed if there is an exception that is -in no list. It is similar to an ``else`` clause in ``if`` statements. - -If there is a `finally`:idx: clause, it is always executed after the -exception handlers. - -The exception is *consumed* in an exception handler. However, an -exception handler may raise another exception. If the exception is not -handled, it is propagated through the call stack. This means that often -the rest of the procedure - that is not within a ``finally`` clause - -is not executed (if an exception occurs). - - -Except and finally statements + +If the expression is not of an ordinal type, and no ``else`` part is +given, control passes after the ``case`` statement. + +To suppress the static error in the ordinal case an ``else`` part with a ``nil`` +statement can be used. + +As a special semantic extension, an expression in an ``of`` branch of a case +statement may evaluate to a set constructor; the set is then expanded into +a list of its elements: + +.. code-block:: nimrod + const + SymChars: set[char] = {'a'..'z', 'A'..'Z', '\x80'..'\xFF'} + + proc classify(s: string) = + case s[0] + of SymChars, '_': echo "an identifier" + of '0'..'9': echo "a number" + else: echo "other" + + # is equivalent to: + proc classify(s: string) = + case s[0] + of 'a'..'z', 'A'..'Z', '\x80'..'\xFF', '_': echo "an identifier" + of '0'..'9': echo "a number" + else: echo "other" + + +When statement +~~~~~~~~~~~~~~ + +Syntax:: + + whenStmt ::= 'when' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt] + +Example: + +.. code-block:: nimrod + + when sizeof(int) == 2: + echo("running on a 16 bit system!") + elif sizeof(int) == 4: + echo("running on a 32 bit system!") + elif sizeof(int) == 8: + echo("running on a 64 bit system!") + else: + echo("cannot happen!") + +The `when`:idx: statement is almost identical to the ``if`` statement with some +exceptions: + +* Each ``expr`` has to be a constant expression (of type ``bool``). +* The statements do not open a new scope. +* The statements that belong to the expression that evaluated to true are + translated by the compiler, the other statements are not checked for + semantics! However, each ``expr`` is checked for semantics. + +The ``when`` statement enables conditional compilation techniques. As +a special syntactic extension, the ``when`` construct is also available +within ``object`` definitions. + + +Raise statement +~~~~~~~~~~~~~~~ + +Syntax:: + + raiseStmt ::= 'raise' [expr] + +Example: + +.. code-block:: nimrod + raise newEOS("operating system failed") + +Apart from built-in operations like array indexing, memory allocation, etc. +the ``raise`` statement is the only way to raise an exception. + +.. XXX document this better! + +If no exception name is given, the current exception is `re-raised`:idx:. The +`ENoExceptionToReraise`:idx: exception is raised if there is no exception to +re-raise. It follows that the ``raise`` statement *always* raises an +exception. + + +Try statement +~~~~~~~~~~~~~ + +Syntax:: + + qualifiedIdent ::= symbol ['.' symbol] + exceptList ::= [qualifiedIdent (comma qualifiedIdent)* [comma]] + tryStmt ::= 'try' ':' stmt + ('except' exceptList ':' stmt)* + ['finally' ':' stmt] + +Example: + +.. code-block:: nimrod + # read the first two lines of a text file that should contain numbers + # and tries to add them + var + f: TFile + if open(f, "numbers.txt"): + try: + var a = readLine(f) + var b = readLine(f) + echo("sum: " & $(parseInt(a) + parseInt(b))) + except EOverflow: + echo("overflow!") + except EInvalidValue: + echo("could not convert string to integer") + except EIO: + echo("IO error!") + except: + echo("Unknown exception!") + finally: + close(f) + + + +The statements after the `try`:idx: are executed in sequential order unless +an exception ``e`` is raised. If the exception type of ``e`` matches any +of the list ``exceptlist`` the corresponding statements are executed. +The statements following the ``except`` clauses are called +`exception handlers`:idx:. + +The empty `except`:idx: clause is executed if there is an exception that is +in no list. It is similar to an ``else`` clause in ``if`` statements. + +If there is a `finally`:idx: clause, it is always executed after the +exception handlers. + +The exception is *consumed* in an exception handler. However, an +exception handler may raise another exception. If the exception is not +handled, it is propagated through the call stack. This means that often +the rest of the procedure - that is not within a ``finally`` clause - +is not executed (if an exception occurs). + + +Except and finally statements ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -`except`:idx: and `finally`:idx: can also be used as a stand-alone statements. -Any statements following them in the current block will be considered to be -in an implicit try block: - -.. code-block:: nimrod - var f = open("numbers.txt") - finally: close(f) - ... - - -Return statement -~~~~~~~~~~~~~~~~ - -Syntax:: - - returnStmt ::= 'return' [expr] - -Example: - -.. code-block:: nimrod - return 40+2 - -The `return`:idx: statement ends the execution of the current procedure. -It is only allowed in procedures. If there is an ``expr``, this is syntactic -sugar for: - -.. code-block:: nimrod - result = expr - return result - - -``return`` without an expression is a short notation for ``return result`` if -the proc has a return type. The `result`:idx: variable is always the return -value of the procedure. It is automatically declared by the compiler. As all -variables, ``result`` is initialized to (binary) zero: - -.. code-block:: nimrod - proc returnZero(): int = - # implicitly returns 0 - - -Yield statement -~~~~~~~~~~~~~~~ - -Syntax:: - - yieldStmt ::= 'yield' expr - -Example: - -.. code-block:: nimrod - yield (1, 2, 3) - -The `yield`:idx: statement is used instead of the ``return`` statement in -iterators. It is only valid in iterators. Execution is returned to the body -of the for loop that called the iterator. Yield does not end the iteration -process, but execution is passed back to the iterator if the next iteration -starts. See the section about iterators (`Iterators and the for statement`_) -for further information. - - -Block statement -~~~~~~~~~~~~~~~ - -Syntax:: - - blockStmt ::= 'block' [symbol] ':' stmt - -Example: - -.. code-block:: nimrod - var found = false - block myblock: - for i in 0..3: - for j in 0..3: - if a[j][i] == 7: - found = true - break myblock # leave the block, in this case both for-loops - echo(found) - -The block statement is a means to group statements to a (named) `block`:idx:. -Inside the block, the ``break`` statement is allowed to leave the block -immediately. A ``break`` statement can contain a name of a surrounding -block to specify which block is to leave. - - -Break statement -~~~~~~~~~~~~~~~ - -Syntax:: - - breakStmt ::= 'break' [symbol] - -Example: - -.. code-block:: nimrod - break - -The `break`:idx: statement is used to leave a block immediately. If ``symbol`` -is given, it is the name of the enclosing block that is to leave. If it is -absent, the innermost block is left. - - -While statement -~~~~~~~~~~~~~~~ - -Syntax:: - - whileStmt ::= 'while' expr ':' stmt - -Example: - -.. code-block:: nimrod - echo("Please tell me your password: \n") - var pw = readLine(stdin) - while pw != "12345": - echo("Wrong password! Next try: \n") - pw = readLine(stdin) - - -The `while`:idx: statement is executed until the ``expr`` evaluates to false. -Endless loops are no error. ``while`` statements open an `implicit block`, -so that they can be left with a ``break`` statement. - - -Continue statement -~~~~~~~~~~~~~~~~~~ - -Syntax:: - - continueStmt ::= 'continue' - -A `continue`:idx: statement leads to the immediate next iteration of the -surrounding loop construct. It is only allowed within a loop. A continue -statement is syntactic sugar for a nested block: - -.. code-block:: nimrod - while expr1: - stmt1 - continue - stmt2 - -Is equivalent to: - -.. code-block:: nimrod - while expr1: - block myBlockName: - stmt1 - break myBlockName - stmt2 - - -Assembler statement -~~~~~~~~~~~~~~~~~~~ -Syntax:: - - asmStmt ::= 'asm' [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT) - -The direct embedding of `assembler`:idx: code into Nimrod code is supported -by the unsafe ``asm`` statement. Identifiers in the assembler code that refer to -Nimrod identifiers shall be enclosed in a special character which can be -specified in the statement's pragmas. The default special character is ``'`'``: - -.. code-block:: nimrod - proc addInt(a, b: int): int {.noStackFrame.} = - # a in eax, and b in edx - asm """ - mov eax, `a` - add eax, `b` - jno theEnd - call `raiseOverflow` - theEnd: - """ - -If expression -~~~~~~~~~~~~~ - -An `if expression` is almost like an if statement, but it is an expression. -Example: - -.. code-block:: nimrod - var y = if x > 8: 9 else: 10 - -An if expression always results in a value, so the ``else`` part is -required. ``Elif`` parts are also allowed (but unlikely to be good -style). - - -Table constructor -~~~~~~~~~~~~~~~~~ - -A `table constructor`:idx: is syntactic sugar for an array constructor: - -.. code-block:: nimrod - {"key1": "value1", "key2": "value2"} - - # is the same as: - [("key1", "value1"), ("key2", "value2")] - - -The empty table can be written ``{:}`` (in contrast to the empty set -which is ``{}``) which is thus another way to write as the empty array -constructor ``[]``. This slightly unusal way of supporting tables -has lots of advantages: - -* The order of the (key,value)-pairs is preserved, thus it is easy to - support ordered dicts with for example ``{key: val}.newOrderedTable``. -* A table literal can be put into a ``const`` section and the compiler - can easily put it into the executable's data section just like it can - for arrays and the generated data section requires a minimal amount - of memory. -* Every table implementation is treated equal syntactically. -* Apart from the minimal syntactic sugar the language core does not need to - know about tables. - - -Type conversions -~~~~~~~~~~~~~~~~ -Syntactically a `type conversion` is like a procedure call, but a -type name replaces the procedure name. A type conversion is always -safe in the sense that a failure to convert a type to another -results in an exception (if it cannot be determined statically). - - -Type casts -~~~~~~~~~~ -Example: - - -.. code-block:: nimrod - cast[int](x) - -Type casts are a crude mechanism to interpret the bit pattern of -an expression as if it would be of another type. Type casts are -only needed for low-level programming and are inherently unsafe. - - -The addr operator -~~~~~~~~~~~~~~~~~ -The `addr` operator returns the address of an l-value. If the -type of the location is ``T``, the `addr` operator result is -of the type ``ptr T``. Taking the address of an object that resides -on the stack is **unsafe**, as the pointer may live longer than the -object on the stack and can thus reference a non-existing object. - - -Procedures -~~~~~~~~~~ -What most programming languages call `methods`:idx: or `functions`:idx: are -called `procedures`:idx: in Nimrod (which is the correct terminology). A -procedure declaration defines an identifier and associates it with a block -of code. -A procedure may call itself recursively. A parameter may be given a default -value that is used if the caller does not provide a value for this parameter. -The syntax is:: - - param ::= symbol (comma symbol)* (':' typeDesc ['=' expr] | '=' expr) - paramList ::= ['(' [param (comma param)*] [SAD] ')'] [':' typeDesc] - - genericParam ::= symbol [':' typeDesc] ['=' expr] - genericParams ::= '[' genericParam (comma genericParam)* [SAD] ']' - - procDecl ::= 'proc' symbol ['*'] [genericParams] paramList [pragma] - ['=' stmt] - -If the ``= stmt`` part is missing, it is a `forward`:idx: declaration. If -the proc returns a value, the procedure body can access an implicitly declared -variable named `result`:idx: that represents the return value. Procs can be -overloaded. The overloading resolution algorithm tries to find the proc that is -the best match for the arguments. Example: - -.. code-block:: nimrod - - proc toLower(c: Char): Char = # toLower for characters - if c in {'A'..'Z'}: - result = chr(ord(c) + (ord('a') - ord('A'))) - else: - result = c - - proc toLower(s: string): string = # toLower for strings - result = newString(len(s)) - for i in 0..len(s) - 1: - result[i] = toLower(s[i]) # calls toLower for characters; no recursion! - -Calling a procedure can be done in many different ways: - -.. code-block:: nimrod - proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ... - - # call with positional arguments # parameter bindings: - callme(0, 1, "abc", '\t', true) # (x=0, y=1, s="abc", c='\t', b=true) - # call with named and positional arguments: - callme(y=1, x=0, "abd", '\t') # (x=0, y=1, s="abd", c='\t', b=false) - # call with named arguments (order is not relevant): - callme(c='\t', y=1, x=0) # (x=0, y=1, s="", c='\t', b=false) - # call as a command statement: no () needed: - callme 0, 1, "abc", '\t' - - -A procedure cannot modify its parameters (unless the parameters have the type -`var`). - -`Operators`:idx: are procedures with a special operator symbol as identifier: - -.. code-block:: nimrod - proc `$` (x: int): string = - # converts an integer to a string; this is a prefix operator. - return intToStr(x) - -Operators with one parameter are prefix operators, operators with two -parameters are infix operators. (However, the parser distinguishes these from -the operator's position within an expression.) There is no way to declare -postfix operators: all postfix operators are built-in and handled by the -grammar explicitly. - -Any operator can be called like an ordinary proc with the '`opr`' -notation. (Thus an operator can have more than two parameters): - -.. code-block:: nimrod - proc `*+` (a, b, c: int): int = - # Multiply and add - return a * b + c - - assert `*+`(3, 4, 6) == `*`(a, `+`(b, c)) - + +`except`:idx: and `finally`:idx: can also be used as a stand-alone statements. +Any statements following them in the current block will be considered to be +in an implicit try block: + +.. code-block:: nimrod + var f = open("numbers.txt") + finally: close(f) + ... + + +Return statement +~~~~~~~~~~~~~~~~ + +Syntax:: + + returnStmt ::= 'return' [expr] + +Example: + +.. code-block:: nimrod + return 40+2 + +The `return`:idx: statement ends the execution of the current procedure. +It is only allowed in procedures. If there is an ``expr``, this is syntactic +sugar for: + +.. code-block:: nimrod + result = expr + return result + + +``return`` without an expression is a short notation for ``return result`` if +the proc has a return type. The `result`:idx: variable is always the return +value of the procedure. It is automatically declared by the compiler. As all +variables, ``result`` is initialized to (binary) zero: + +.. code-block:: nimrod + proc returnZero(): int = + # implicitly returns 0 + + +Yield statement +~~~~~~~~~~~~~~~ + +Syntax:: + + yieldStmt ::= 'yield' expr + +Example: + +.. code-block:: nimrod + yield (1, 2, 3) + +The `yield`:idx: statement is used instead of the ``return`` statement in +iterators. It is only valid in iterators. Execution is returned to the body +of the for loop that called the iterator. Yield does not end the iteration +process, but execution is passed back to the iterator if the next iteration +starts. See the section about iterators (`Iterators and the for statement`_) +for further information. + + +Block statement +~~~~~~~~~~~~~~~ + +Syntax:: + + blockStmt ::= 'block' [symbol] ':' stmt + +Example: + +.. code-block:: nimrod + var found = false + block myblock: + for i in 0..3: + for j in 0..3: + if a[j][i] == 7: + found = true + break myblock # leave the block, in this case both for-loops + echo(found) + +The block statement is a means to group statements to a (named) `block`:idx:. +Inside the block, the ``break`` statement is allowed to leave the block +immediately. A ``break`` statement can contain a name of a surrounding +block to specify which block is to leave. + + +Break statement +~~~~~~~~~~~~~~~ + +Syntax:: + + breakStmt ::= 'break' [symbol] + +Example: + +.. code-block:: nimrod + break + +The `break`:idx: statement is used to leave a block immediately. If ``symbol`` +is given, it is the name of the enclosing block that is to leave. If it is +absent, the innermost block is left. + + +While statement +~~~~~~~~~~~~~~~ + +Syntax:: + + whileStmt ::= 'while' expr ':' stmt + +Example: + +.. code-block:: nimrod + echo("Please tell me your password: \n") + var pw = readLine(stdin) + while pw != "12345": + echo("Wrong password! Next try: \n") + pw = readLine(stdin) + + +The `while`:idx: statement is executed until the ``expr`` evaluates to false. +Endless loops are no error. ``while`` statements open an `implicit block`, +so that they can be left with a ``break`` statement. + + +Continue statement +~~~~~~~~~~~~~~~~~~ + +Syntax:: + + continueStmt ::= 'continue' + +A `continue`:idx: statement leads to the immediate next iteration of the +surrounding loop construct. It is only allowed within a loop. A continue +statement is syntactic sugar for a nested block: + +.. code-block:: nimrod + while expr1: + stmt1 + continue + stmt2 + +Is equivalent to: + +.. code-block:: nimrod + while expr1: + block myBlockName: + stmt1 + break myBlockName + stmt2 + + +Assembler statement +~~~~~~~~~~~~~~~~~~~ +Syntax:: + + asmStmt ::= 'asm' [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT) + +The direct embedding of `assembler`:idx: code into Nimrod code is supported +by the unsafe ``asm`` statement. Identifiers in the assembler code that refer to +Nimrod identifiers shall be enclosed in a special character which can be +specified in the statement's pragmas. The default special character is ``'`'``: + +.. code-block:: nimrod + proc addInt(a, b: int): int {.noStackFrame.} = + # a in eax, and b in edx + asm """ + mov eax, `a` + add eax, `b` + jno theEnd + call `raiseOverflow` + theEnd: + """ + +If expression +~~~~~~~~~~~~~ + +An `if expression` is almost like an if statement, but it is an expression. +Example: + +.. code-block:: nimrod + var y = if x > 8: 9 else: 10 + +An if expression always results in a value, so the ``else`` part is +required. ``Elif`` parts are also allowed (but unlikely to be good +style). + + +Table constructor +~~~~~~~~~~~~~~~~~ + +A `table constructor`:idx: is syntactic sugar for an array constructor: + +.. code-block:: nimrod + {"key1": "value1", "key2": "value2"} + + # is the same as: + [("key1", "value1"), ("key2", "value2")] + + +The empty table can be written ``{:}`` (in contrast to the empty set +which is ``{}``) which is thus another way to write as the empty array +constructor ``[]``. This slightly unusal way of supporting tables +has lots of advantages: + +* The order of the (key,value)-pairs is preserved, thus it is easy to + support ordered dicts with for example ``{key: val}.newOrderedTable``. +* A table literal can be put into a ``const`` section and the compiler + can easily put it into the executable's data section just like it can + for arrays and the generated data section requires a minimal amount + of memory. +* Every table implementation is treated equal syntactically. +* Apart from the minimal syntactic sugar the language core does not need to + know about tables. + + +Type conversions +~~~~~~~~~~~~~~~~ +Syntactically a `type conversion` is like a procedure call, but a +type name replaces the procedure name. A type conversion is always +safe in the sense that a failure to convert a type to another +results in an exception (if it cannot be determined statically). + + +Type casts +~~~~~~~~~~ +Example: + + +.. code-block:: nimrod + cast[int](x) + +Type casts are a crude mechanism to interpret the bit pattern of +an expression as if it would be of another type. Type casts are +only needed for low-level programming and are inherently unsafe. + + +The addr operator +~~~~~~~~~~~~~~~~~ +The `addr` operator returns the address of an l-value. If the +type of the location is ``T``, the `addr` operator result is +of the type ``ptr T``. Taking the address of an object that resides +on the stack is **unsafe**, as the pointer may live longer than the +object on the stack and can thus reference a non-existing object. + + +Procedures +~~~~~~~~~~ +What most programming languages call `methods`:idx: or `functions`:idx: are +called `procedures`:idx: in Nimrod (which is the correct terminology). A +procedure declaration defines an identifier and associates it with a block +of code. +A procedure may call itself recursively. A parameter may be given a default +value that is used if the caller does not provide a value for this parameter. +The syntax is:: + + param ::= symbol (comma symbol)* (':' typeDesc ['=' expr] | '=' expr) + paramList ::= ['(' [param (comma param)*] [SAD] ')'] [':' typeDesc] + + genericParam ::= symbol [':' typeDesc] ['=' expr] + genericParams ::= '[' genericParam (comma genericParam)* [SAD] ']' + + procDecl ::= 'proc' symbol ['*'] [genericParams] paramList [pragma] + ['=' stmt] + +If the ``= stmt`` part is missing, it is a `forward`:idx: declaration. If +the proc returns a value, the procedure body can access an implicitly declared +variable named `result`:idx: that represents the return value. Procs can be +overloaded. The overloading resolution algorithm tries to find the proc that is +the best match for the arguments. Example: + +.. code-block:: nimrod + + proc toLower(c: Char): Char = # toLower for characters + if c in {'A'..'Z'}: + result = chr(ord(c) + (ord('a') - ord('A'))) + else: + result = c + + proc toLower(s: string): string = # toLower for strings + result = newString(len(s)) + for i in 0..len(s) - 1: + result[i] = toLower(s[i]) # calls toLower for characters; no recursion! + +Calling a procedure can be done in many different ways: + +.. code-block:: nimrod + proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ... + + # call with positional arguments # parameter bindings: + callme(0, 1, "abc", '\t', true) # (x=0, y=1, s="abc", c='\t', b=true) + # call with named and positional arguments: + callme(y=1, x=0, "abd", '\t') # (x=0, y=1, s="abd", c='\t', b=false) + # call with named arguments (order is not relevant): + callme(c='\t', y=1, x=0) # (x=0, y=1, s="", c='\t', b=false) + # call as a command statement: no () needed: + callme 0, 1, "abc", '\t' + + +A procedure cannot modify its parameters (unless the parameters have the type +`var`). + +`Operators`:idx: are procedures with a special operator symbol as identifier: + +.. code-block:: nimrod + proc `$` (x: int): string = + # converts an integer to a string; this is a prefix operator. + return intToStr(x) + +Operators with one parameter are prefix operators, operators with two +parameters are infix operators. (However, the parser distinguishes these from +the operator's position within an expression.) There is no way to declare +postfix operators: all postfix operators are built-in and handled by the +grammar explicitly. + +Any operator can be called like an ordinary proc with the '`opr`' +notation. (Thus an operator can have more than two parameters): + +.. code-block:: nimrod + proc `*+` (a, b, c: int): int = + # Multiply and add + return a * b + c + + assert `*+`(3, 4, 6) == `*`(a, `+`(b, c)) + Nonoverloadable builtins ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2383,491 +2383,491 @@ The following builtin procs cannot be overloaded for reasons of implementation simplicity (they require specialized semantic checking):: defined, definedInScope, compiles, low, high, sizeOf, - is, of, echo, shallowCopy, getAst + is, of, echo, shallowCopy, getAst Thus they act more like keywords than like ordinary identifiers; unlike a keyword however, a redefinition may `shadow`:id: the definition in -the ``system`` module. - - -Var parameters -~~~~~~~~~~~~~~ -The type of a parameter may be prefixed with the ``var`` keyword: - -.. code-block:: nimrod - proc divmod(a, b: int; res, remainder: var int) = - res = a div b - remainder = a mod b - - var - x, y: int - - divmod(8, 5, x, y) # modifies x and y - assert x == 1 - assert y == 3 - -In the example, ``res`` and ``remainder`` are `var parameters`. -Var parameters can be modified by the procedure and the changes are -visible to the caller. The argument passed to a var parameter has to be -an l-value. Var parameters are implemented as hidden pointers. The -above example is equivalent to: - -.. code-block:: nimrod - proc divmod(a, b: int; res, remainder: ptr int) = - res[] = a div b - remainder[] = a mod b - - var - x, y: int - divmod(8, 5, addr(x), addr(y)) - assert x == 1 - assert y == 3 - -In the examples, var parameters or pointers are used to provide two -return values. This can be done in a cleaner way by returning a tuple: - -.. code-block:: nimrod - proc divmod(a, b: int): tuple[res, remainder: int] = - return (a div b, a mod b) - - var t = divmod(8, 5) - assert t.res == 1 - assert t.remainder = 3 - -One can use `tuple unpacking`:idx: to access the tuple's fields: - -.. code-block:: nimrod - var (x, y) = divmod(8, 5) # tuple unpacking - assert x == 1 - assert y == 3 - - -Var return type -~~~~~~~~~~~~~~~ - -A proc, converter or iterator may return a ``var`` type which means that the -returned value is an l-value and can be modified by the caller: - -.. code-block:: nimrod - var g = 0 - - proc WriteAccessToG(): var int = - result = g - - WriteAccessToG() = 6 - assert g == 6 - -It is a compile time error if the implicitely introduced pointer could be -used to access a location beyond its lifetime: - -.. code-block:: nimrod - proc WriteAccessToG(): var int = - var g = 0 - result = g # Error! - -For iterators, a component of a tuple return type can have a ``var`` type too: - -.. code-block:: nimrod - iterator mpairs(a: var seq[string]): tuple[key: int, val: var string] = - for i in 0..a.high: - yield (i, a[i]) - -In the standard library every name of a routine that returns a ``var`` type -starts with the prefix ``m`` per convention. - - -Overloading of the subscript operator -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded. -Overloading support is only possible if the first parameter has no type that -already supports the built-in ``[]`` notation. Currently the compiler -does not check this restriction. - - -Multi-methods -~~~~~~~~~~~~~ - -Procedures always use static dispatch. `Multi-methods`:idx: use dynamic -dispatch. - -.. code-block:: nimrod - type - TExpr = object ## abstract base class for an expression - TLiteral = object of TExpr - x: int - TPlusExpr = object of TExpr - a, b: ref TExpr - - method eval(e: ref TExpr): int = - # override this base method - quit "to override!" - - method eval(e: ref TLiteral): int = return e.x - - method eval(e: ref TPlusExpr): int = - # watch out: relies on dynamic binding - return eval(e.a) + eval(e.b) - - proc newLit(x: int): ref TLiteral = - new(result) - result.x = x - - proc newPlus(a, b: ref TExpr): ref TPlusExpr = - new(result) - result.a = a - result.b = b - - echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4))) - -In the example the constructors ``newLit`` and ``newPlus`` are procs -because they should use static binding, but ``eval`` is a method because it -requires dynamic binding. - -In a multi-method all parameters that have an object type are used for the -dispatching: - -.. code-block:: nimrod - type - TThing = object - TUnit = object of TThing - x: int - - method collide(a, b: TThing) {.inline.} = - quit "to override!" - - method collide(a: TThing, b: TUnit) {.inline.} = - echo "1" - - method collide(a: TUnit, b: TThing) {.inline.} = - echo "2" - - var - a, b: TUnit - collide(a, b) # output: 2 - - -Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over -collide 1 because the resolution works from left to right. -In the example ``TUnit, TThing`` is prefered over ``TThing, TUnit``. - -**Performance note**: Nimrod does not produce a virtual method table, but -generates dispatch trees. This avoids the expensive indirect branch for method -calls and enables inlining. However, other optimizations like compile time -evaluation or dead code elimination do not work with methods. - - -Iterators and the for statement -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Syntax:: - - forStmt ::= 'for' symbol (comma symbol)* [comma] 'in' expr ':' stmt - - param ::= symbol (comma symbol)* [comma] ':' typeDesc - paramList ::= ['(' [param (comma param)* [comma]] ')'] [':' typeDesc] - - genericParam ::= symbol [':' typeDesc] - genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' - - iteratorDecl ::= 'iterator' symbol ['*'] [genericParams] paramList [pragma] - ['=' stmt] - -The `for`:idx: statement is an abstract mechanism to iterate over the elements -of a container. It relies on an `iterator`:idx: to do so. Like ``while`` -statements, ``for`` statements open an `implicit block`:idx:, so that they -can be left with a ``break`` statement. The ``for`` loop declares - -iteration variables (``x`` in the example) - their scope reaches until the -end of the loop body. The iteration variables' types are inferred by the -return type of the iterator. - -An iterator is similar to a procedure, except that it is always called in the -context of a ``for`` loop. Iterators provide a way to specify the iteration over -an abstract type. A key role in the execution of a ``for`` loop plays the -``yield`` statement in the called iterator. Whenever a ``yield`` statement is -reached the data is bound to the ``for`` loop variables and control continues -in the body of the ``for`` loop. The iterator's local variables and execution -state are automatically saved between calls. Example: - -.. code-block:: nimrod - # this definition exists in the system module - iterator items*(a: string): char {.inline.} = - var i = 0 - while i < len(a): - yield a[i] - inc(i) - - for ch in items("hello world"): # `ch` is an iteration variable - echo(ch) - -The compiler generates code as if the programmer would have written this: - -.. code-block:: nimrod - var i = 0 - while i < len(a): - var ch = a[i] - echo(ch) - inc(i) - -The current implementation always inlines the iterator code leading to zero -overhead for the abstraction. But this may increase the code size. Later -versions of the compiler will only inline iterators which have the calling -convention ``inline``. - -If the iterator yields a tuple, there have to be as many iteration variables -as there are components in the tuple. The i'th iteration variable's type is -the type of the i'th component. - - -Implict items/pairs invokations -+++++++++++++++++++++++++++++++ - -If the for loop expression ``e`` does not denote an iterator and the for loop -has exactly 1 variable, the for loop expression is rewritten to ``items(e)``; -ie. an ``items`` iterator is implicitely invoked: - -.. code-block:: nimrod - for x in [1,2,3]: echo x - -If the for loop has exactly 2 variables, a ``pairs`` iterator is implicitely -invoked. - -Symbol lookup of the identifiers ``items``/``pairs`` is performed after -the rewriting step, so that all overloadings of ``items``/``pairs`` are taken -into account. - - -Type sections -~~~~~~~~~~~~~ - -Syntax:: - - typeDef ::= typeDesc | objectDef | enumDef - - genericParam ::= symbol [':' typeDesc] - genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' - - typeDecl ::= COMMENT - | symbol ['*'] [genericParams] ['=' typeDef] [COMMENT|IND COMMENT] - - typeSection ::= 'type' indPush typeDecl (SAD typeDecl)* DED indPop - - -Example: - -.. code-block:: nimrod - type # example demonstrates mutually recursive types - PNode = ref TNode # a traced pointer to a TNode - TNode = object - le, ri: PNode # left and right subtrees - sym: ref TSym # leaves contain a reference to a TSym - - TSym = object # a symbol - name: string # the symbol's name - line: int # the line the symbol was declared in - code: PNode # the symbol's abstract syntax tree - -A `type`:idx: section begins with the ``type`` keyword. It contains multiple -type definitions. A type definition binds a type to a name. Type definitions -can be recursive or even mutually recursive. Mutually recursive types are only -possible within a single ``type`` section. - - -Generics --------- - -Example: - -.. code-block:: nimrod - type - TBinaryTree[T] = object # TBinaryTree is a generic type with - # with generic param ``T`` - le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil - data: T # the data stored in a node - PBinaryTree[T] = ref TBinaryTree[T] # a shorthand for notational convenience - - proc newNode[T](data: T): PBinaryTree[T] = # constructor for a node - new(result) - result.data = data - - proc add[T](root: var PBinaryTree[T], n: PBinaryTree[T]) = - if root == nil: - root = n - else: - var it = root - while it != nil: - var c = cmp(it.data, n.data) # compare the data items; uses - # the generic ``cmp`` proc that works for - # any type that has a ``==`` and ``<`` - # operator - if c < 0: - if it.le == nil: - it.le = n - return - it = it.le - else: - if it.ri == nil: - it.ri = n - return - it = it.ri - - iterator inorder[T](root: PBinaryTree[T]): T = - # inorder traversal of a binary tree - # recursive iterators are not yet implemented, so this does not work in - # the current compiler! - if root.le != nil: yield inorder(root.le) - yield root.data - if root.ri != nil: yield inorder(root.ri) - - var - root: PBinaryTree[string] # instantiate a PBinaryTree with the type string - add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and - add(root, newNode("world")) # ``add`` - for str in inorder(root): - writeln(stdout, str) - -`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types with -`type parameters`:idx:. Depending on context, the brackets are used either to -introduce type parameters or to instantiate a generic proc, iterator or type. - - -Is operator -~~~~~~~~~~~ - -The `is`:idx: operator checks for type equivalence at compile time. It is -therefore very useful for type specialization within generic code: - -.. code-block:: nimrod - type - TTable[TKey, TValue] = object - keys: seq[TKey] - values: seq[TValue] - when not (TKey is string): # nil value for strings used for optimization - deletedKeys: seq[bool] - - -Type operator -~~~~~~~~~~~~~ - -The `type`:idx: (in many other languages called `typeof`:idx:) operator can -be used to get the type of an expression: - -.. code-block:: nimrod - var x = 0 - var y: type(x) # y has type int - -If ``type`` is used to determine the result type of a proc/iterator/converter -call ``c(X)`` (where ``X`` stands for a possibly empty list of arguments), the -interpretation where ``c`` is an iterator is preferred over the -other interpretations: - -.. code-block:: nimrod - import strutils - - # strutils contains both a ``split`` proc and iterator, but since an - # an iterator is the preferred interpretation, `y` has the type ``string``: - var y: type("a b c".split) - - -Type constraints -~~~~~~~~~~~~~~~~ - -`Type constraints`:idx: can be used to restrict the instantiation of a generic -type parameter. Only the specified types are valid for instantiation: - -.. code-block:: nimrod - proc onlyIntOrString[T: int|string](x, y: T) = nil - - onlyIntOrString(450, 616) # valid - onlyIntOrString(5.0, 0.0) # type mismatch - onlyIntOrString("xy", 50) # invalid as 'T' cannot be both at the same time - -Apart from ordinary types, type constraints can also be of the -following *type classes*: - -================== =================================================== -type class matches -================== =================================================== -``object`` any object type -``tuple`` any tuple type - -``enum`` any enumeration -``proc`` any proc type -``ref`` any ``ref`` type -``ptr`` any ``ptr`` type -``var`` any ``var`` type -``distinct`` any distinct type -``array`` any array type -``set`` any set type -``seq`` any seq type -================== =================================================== - -The following example is taken directly from the system module: - -.. code-block:: nimrod - proc `==`*[T: tuple](x, y: T): bool = - ## generic ``==`` operator for tuples that is lifted from the components - ## of `x` and `y`. - for a, b in fields(x, y): - if a != b: return false - return true - - -Symbol lookup in generics -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Symbols in generics are looked up in two different contexts: Both the context -at definition and the context at instantiation are considered for any symbol -occuring in a generic: - -.. code-block:: nimrod - type - TIndex = distinct int - - proc `==` (a, b: TIndex): bool {.borrow.} - - var a = (0, 0.TIndex) - var b = (0, 0.TIndex) - - echo a == b # works! - -In the example the generic ``==`` for tuples (as defined in the system module) -uses the ``==`` operators of the tuple's components. However, the ``==`` for -the ``TIndex`` type is defined *after* the ``==`` for tuples; yet the example -compiles as the instantiation takes the currently defined symbols into account -too. - - -Templates ---------- - -A `template`:idx: is a simple form of a macro: It is a simple substitution -mechanism that operates on Nimrod's abstract syntax trees. It is processed in -the semantic pass of the compiler. - -The syntax to *invoke* a template is the same as calling a procedure. - -Example: - -.. code-block:: nimrod - template `!=` (a, b: expr): expr = - # this definition exists in the System module - not (a == b) - - assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6)) - -The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact -templates: - -| ``a > b`` is transformed into ``b < a``. -| ``a in b`` is transformed into ``contains(b, a)``. -| ``notin`` and ``isnot`` have the obvious meanings. - -The "types" of templates can be the symbols ``expr`` (stands for *expression*), -``stmt`` (stands for *statement*) or ``typedesc`` (stands for *type -description*). These are no real types, they just help the compiler parsing. -Real types can be used too; this implies that expressions are expected. +the ``system`` module. + + +Var parameters +~~~~~~~~~~~~~~ +The type of a parameter may be prefixed with the ``var`` keyword: + +.. code-block:: nimrod + proc divmod(a, b: int; res, remainder: var int) = + res = a div b + remainder = a mod b + + var + x, y: int + + divmod(8, 5, x, y) # modifies x and y + assert x == 1 + assert y == 3 + +In the example, ``res`` and ``remainder`` are `var parameters`. +Var parameters can be modified by the procedure and the changes are +visible to the caller. The argument passed to a var parameter has to be +an l-value. Var parameters are implemented as hidden pointers. The +above example is equivalent to: + +.. code-block:: nimrod + proc divmod(a, b: int; res, remainder: ptr int) = + res[] = a div b + remainder[] = a mod b + + var + x, y: int + divmod(8, 5, addr(x), addr(y)) + assert x == 1 + assert y == 3 + +In the examples, var parameters or pointers are used to provide two +return values. This can be done in a cleaner way by returning a tuple: + +.. code-block:: nimrod + proc divmod(a, b: int): tuple[res, remainder: int] = + return (a div b, a mod b) + + var t = divmod(8, 5) + assert t.res == 1 + assert t.remainder = 3 + +One can use `tuple unpacking`:idx: to access the tuple's fields: + +.. code-block:: nimrod + var (x, y) = divmod(8, 5) # tuple unpacking + assert x == 1 + assert y == 3 + + +Var return type +~~~~~~~~~~~~~~~ + +A proc, converter or iterator may return a ``var`` type which means that the +returned value is an l-value and can be modified by the caller: + +.. code-block:: nimrod + var g = 0 + + proc WriteAccessToG(): var int = + result = g + + WriteAccessToG() = 6 + assert g == 6 + +It is a compile time error if the implicitely introduced pointer could be +used to access a location beyond its lifetime: + +.. code-block:: nimrod + proc WriteAccessToG(): var int = + var g = 0 + result = g # Error! + +For iterators, a component of a tuple return type can have a ``var`` type too: + +.. code-block:: nimrod + iterator mpairs(a: var seq[string]): tuple[key: int, val: var string] = + for i in 0..a.high: + yield (i, a[i]) + +In the standard library every name of a routine that returns a ``var`` type +starts with the prefix ``m`` per convention. + + +Overloading of the subscript operator +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded. +Overloading support is only possible if the first parameter has no type that +already supports the built-in ``[]`` notation. Currently the compiler +does not check this restriction. + + +Multi-methods +~~~~~~~~~~~~~ + +Procedures always use static dispatch. `Multi-methods`:idx: use dynamic +dispatch. + +.. code-block:: nimrod + type + TExpr = object ## abstract base class for an expression + TLiteral = object of TExpr + x: int + TPlusExpr = object of TExpr + a, b: ref TExpr + + method eval(e: ref TExpr): int = + # override this base method + quit "to override!" + + method eval(e: ref TLiteral): int = return e.x + + method eval(e: ref TPlusExpr): int = + # watch out: relies on dynamic binding + return eval(e.a) + eval(e.b) + + proc newLit(x: int): ref TLiteral = + new(result) + result.x = x + + proc newPlus(a, b: ref TExpr): ref TPlusExpr = + new(result) + result.a = a + result.b = b + + echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4))) + +In the example the constructors ``newLit`` and ``newPlus`` are procs +because they should use static binding, but ``eval`` is a method because it +requires dynamic binding. + +In a multi-method all parameters that have an object type are used for the +dispatching: + +.. code-block:: nimrod + type + TThing = object + TUnit = object of TThing + x: int + + method collide(a, b: TThing) {.inline.} = + quit "to override!" + + method collide(a: TThing, b: TUnit) {.inline.} = + echo "1" + + method collide(a: TUnit, b: TThing) {.inline.} = + echo "2" + + var + a, b: TUnit + collide(a, b) # output: 2 + + +Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over +collide 1 because the resolution works from left to right. +In the example ``TUnit, TThing`` is prefered over ``TThing, TUnit``. + +**Performance note**: Nimrod does not produce a virtual method table, but +generates dispatch trees. This avoids the expensive indirect branch for method +calls and enables inlining. However, other optimizations like compile time +evaluation or dead code elimination do not work with methods. + + +Iterators and the for statement +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Syntax:: + + forStmt ::= 'for' symbol (comma symbol)* [comma] 'in' expr ':' stmt + + param ::= symbol (comma symbol)* [comma] ':' typeDesc + paramList ::= ['(' [param (comma param)* [comma]] ')'] [':' typeDesc] + + genericParam ::= symbol [':' typeDesc] + genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' + + iteratorDecl ::= 'iterator' symbol ['*'] [genericParams] paramList [pragma] + ['=' stmt] + +The `for`:idx: statement is an abstract mechanism to iterate over the elements +of a container. It relies on an `iterator`:idx: to do so. Like ``while`` +statements, ``for`` statements open an `implicit block`:idx:, so that they +can be left with a ``break`` statement. The ``for`` loop declares + +iteration variables (``x`` in the example) - their scope reaches until the +end of the loop body. The iteration variables' types are inferred by the +return type of the iterator. + +An iterator is similar to a procedure, except that it is always called in the +context of a ``for`` loop. Iterators provide a way to specify the iteration over +an abstract type. A key role in the execution of a ``for`` loop plays the +``yield`` statement in the called iterator. Whenever a ``yield`` statement is +reached the data is bound to the ``for`` loop variables and control continues +in the body of the ``for`` loop. The iterator's local variables and execution +state are automatically saved between calls. Example: + +.. code-block:: nimrod + # this definition exists in the system module + iterator items*(a: string): char {.inline.} = + var i = 0 + while i < len(a): + yield a[i] + inc(i) + + for ch in items("hello world"): # `ch` is an iteration variable + echo(ch) + +The compiler generates code as if the programmer would have written this: + +.. code-block:: nimrod + var i = 0 + while i < len(a): + var ch = a[i] + echo(ch) + inc(i) + +The current implementation always inlines the iterator code leading to zero +overhead for the abstraction. But this may increase the code size. Later +versions of the compiler will only inline iterators which have the calling +convention ``inline``. + +If the iterator yields a tuple, there have to be as many iteration variables +as there are components in the tuple. The i'th iteration variable's type is +the type of the i'th component. + + +Implict items/pairs invokations ++++++++++++++++++++++++++++++++ + +If the for loop expression ``e`` does not denote an iterator and the for loop +has exactly 1 variable, the for loop expression is rewritten to ``items(e)``; +ie. an ``items`` iterator is implicitely invoked: + +.. code-block:: nimrod + for x in [1,2,3]: echo x + +If the for loop has exactly 2 variables, a ``pairs`` iterator is implicitely +invoked. + +Symbol lookup of the identifiers ``items``/``pairs`` is performed after +the rewriting step, so that all overloadings of ``items``/``pairs`` are taken +into account. + + +Type sections +~~~~~~~~~~~~~ + +Syntax:: + + typeDef ::= typeDesc | objectDef | enumDef + + genericParam ::= symbol [':' typeDesc] + genericParams ::= '[' genericParam (comma genericParam)* [comma] ']' + + typeDecl ::= COMMENT + | symbol ['*'] [genericParams] ['=' typeDef] [COMMENT|IND COMMENT] + + typeSection ::= 'type' indPush typeDecl (SAD typeDecl)* DED indPop + + +Example: + +.. code-block:: nimrod + type # example demonstrates mutually recursive types + PNode = ref TNode # a traced pointer to a TNode + TNode = object + le, ri: PNode # left and right subtrees + sym: ref TSym # leaves contain a reference to a TSym + + TSym = object # a symbol + name: string # the symbol's name + line: int # the line the symbol was declared in + code: PNode # the symbol's abstract syntax tree + +A `type`:idx: section begins with the ``type`` keyword. It contains multiple +type definitions. A type definition binds a type to a name. Type definitions +can be recursive or even mutually recursive. Mutually recursive types are only +possible within a single ``type`` section. + + +Generics +-------- + +Example: + +.. code-block:: nimrod + type + TBinaryTree[T] = object # TBinaryTree is a generic type with + # with generic param ``T`` + le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil + data: T # the data stored in a node + PBinaryTree[T] = ref TBinaryTree[T] # a shorthand for notational convenience + + proc newNode[T](data: T): PBinaryTree[T] = # constructor for a node + new(result) + result.data = data + + proc add[T](root: var PBinaryTree[T], n: PBinaryTree[T]) = + if root == nil: + root = n + else: + var it = root + while it != nil: + var c = cmp(it.data, n.data) # compare the data items; uses + # the generic ``cmp`` proc that works for + # any type that has a ``==`` and ``<`` + # operator + if c < 0: + if it.le == nil: + it.le = n + return + it = it.le + else: + if it.ri == nil: + it.ri = n + return + it = it.ri + + iterator inorder[T](root: PBinaryTree[T]): T = + # inorder traversal of a binary tree + # recursive iterators are not yet implemented, so this does not work in + # the current compiler! + if root.le != nil: yield inorder(root.le) + yield root.data + if root.ri != nil: yield inorder(root.ri) + + var + root: PBinaryTree[string] # instantiate a PBinaryTree with the type string + add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and + add(root, newNode("world")) # ``add`` + for str in inorder(root): + writeln(stdout, str) + +`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types with +`type parameters`:idx:. Depending on context, the brackets are used either to +introduce type parameters or to instantiate a generic proc, iterator or type. + + +Is operator +~~~~~~~~~~~ + +The `is`:idx: operator checks for type equivalence at compile time. It is +therefore very useful for type specialization within generic code: + +.. code-block:: nimrod + type + TTable[TKey, TValue] = object + keys: seq[TKey] + values: seq[TValue] + when not (TKey is string): # nil value for strings used for optimization + deletedKeys: seq[bool] + + +Type operator +~~~~~~~~~~~~~ + +The `type`:idx: (in many other languages called `typeof`:idx:) operator can +be used to get the type of an expression: + +.. code-block:: nimrod + var x = 0 + var y: type(x) # y has type int + +If ``type`` is used to determine the result type of a proc/iterator/converter +call ``c(X)`` (where ``X`` stands for a possibly empty list of arguments), the +interpretation where ``c`` is an iterator is preferred over the +other interpretations: + +.. code-block:: nimrod + import strutils + + # strutils contains both a ``split`` proc and iterator, but since an + # an iterator is the preferred interpretation, `y` has the type ``string``: + var y: type("a b c".split) + + +Type constraints +~~~~~~~~~~~~~~~~ + +`Type constraints`:idx: can be used to restrict the instantiation of a generic +type parameter. Only the specified types are valid for instantiation: + +.. code-block:: nimrod + proc onlyIntOrString[T: int|string](x, y: T) = nil + + onlyIntOrString(450, 616) # valid + onlyIntOrString(5.0, 0.0) # type mismatch + onlyIntOrString("xy", 50) # invalid as 'T' cannot be both at the same time + +Apart from ordinary types, type constraints can also be of the +following *type classes*: + +================== =================================================== +type class matches +================== =================================================== +``object`` any object type +``tuple`` any tuple type + +``enum`` any enumeration +``proc`` any proc type +``ref`` any ``ref`` type +``ptr`` any ``ptr`` type +``var`` any ``var`` type +``distinct`` any distinct type +``array`` any array type +``set`` any set type +``seq`` any seq type +================== =================================================== + +The following example is taken directly from the system module: + +.. code-block:: nimrod + proc `==`*[T: tuple](x, y: T): bool = + ## generic ``==`` operator for tuples that is lifted from the components + ## of `x` and `y`. + for a, b in fields(x, y): + if a != b: return false + return true + + +Symbol lookup in generics +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Symbols in generics are looked up in two different contexts: Both the context +at definition and the context at instantiation are considered for any symbol +occuring in a generic: + +.. code-block:: nimrod + type + TIndex = distinct int + + proc `==` (a, b: TIndex): bool {.borrow.} + + var a = (0, 0.TIndex) + var b = (0, 0.TIndex) + + echo a == b # works! + +In the example the generic ``==`` for tuples (as defined in the system module) +uses the ``==`` operators of the tuple's components. However, the ``==`` for +the ``TIndex`` type is defined *after* the ``==`` for tuples; yet the example +compiles as the instantiation takes the currently defined symbols into account +too. + + +Templates +--------- + +A `template`:idx: is a simple form of a macro: It is a simple substitution +mechanism that operates on Nimrod's abstract syntax trees. It is processed in +the semantic pass of the compiler. + +The syntax to *invoke* a template is the same as calling a procedure. + +Example: + +.. code-block:: nimrod + template `!=` (a, b: expr): expr = + # this definition exists in the System module + not (a == b) + + assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6)) + +The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact +templates: + +| ``a > b`` is transformed into ``b < a``. +| ``a in b`` is transformed into ``contains(b, a)``. +| ``notin`` and ``isnot`` have the obvious meanings. + +The "types" of templates can be the symbols ``expr`` (stands for *expression*), +``stmt`` (stands for *statement*) or ``typedesc`` (stands for *type +description*). These are no real types, they just help the compiler parsing. +Real types can be used too; this implies that expressions are expected. Ordinary vs immediate templates @@ -2880,8 +2880,8 @@ So ordinary templates cannot receive undeclared identifiers: .. code-block:: nimrod - template declareInt(x: expr) = - var x: int + template declareInt(x: expr) = + var x: int declareInt(x) # error: unknown identifier: 'x' @@ -2891,133 +2891,133 @@ receive undeclared identifiers: .. code-block:: nimrod - template declareInt(x: expr) {.immediate.} = - var x: int + template declareInt(x: expr) {.immediate.} = + var x: int declareInt(x) # valid Scoping in templates -~~~~~~~~~~~~~~~~~~~~ - -The template body does not open a new scope. To open a new scope a ``block`` -statement can be used: - -.. code-block:: nimrod - template declareInScope(x: expr, t: typeDesc): stmt {.immediate.} = - var x: t - - template declareInNewScope(x: expr, t: typeDesc): stmt {.immediate.} = - # open a new scope: - block: - var x: t - - declareInScope(a, int) - a = 42 # works, `a` is known here - - declareInNewScope(b, int) - b = 42 # does not work, `b` is unknown +~~~~~~~~~~~~~~~~~~~~ + +The template body does not open a new scope. To open a new scope a ``block`` +statement can be used: + +.. code-block:: nimrod + template declareInScope(x: expr, t: typeDesc): stmt {.immediate.} = + var x: t + + template declareInNewScope(x: expr, t: typeDesc): stmt {.immediate.} = + # open a new scope: + block: + var x: t + + declareInScope(a, int) + a = 42 # works, `a` is known here + + declareInNewScope(b, int) + b = 42 # does not work, `b` is unknown Passing a code block to a template -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If there is a ``stmt`` parameter it should be the last in the template -declaration, because statements are passed to a template via a -special ``:`` syntax: - -.. code-block:: nimrod - - template withFile(f, fn, mode: expr, actions: stmt): stmt {.immediate.} = - block: - var f: TFile - if open(f, fn, mode): - try: - actions - finally: - close(f) - else: - quit("cannot open: " & fn) - - withFile(txt, "ttempl3.txt", fmWrite): - txt.writeln("line 1") - txt.writeln("line 2") - -In the example the two ``writeln`` statements are bound to the ``actions`` -parameter. - -**Note:** The symbol binding rules for templates might change! - -Symbol binding within templates happens after template instantation: - -.. code-block:: nimrod - # Module A - var - lastId = 0 - - template genId*: expr = - inc(lastId) - lastId - -.. code-block:: nimrod - # Module B - import A - - echo genId() # Error: undeclared identifier: 'lastId' - - -Bind statement -~~~~~~~~~~~~~~ - -Syntax:: - - bindStmt ::= 'bind' IDENT (comma IDENT)* - -Exporting a template is a often a leaky abstraction as it can depend on -symbols that are not visible from a client module. However, to compensate for -this case, a `bind`:idx: statement can be used: It declares all identifiers -that should be bound early (i.e. when the template is parsed): - -.. code-block:: nimrod - # Module A - var - lastId = 0 - - template genId*: expr = - bind lastId - inc(lastId) - lastId - -.. code-block:: nimrod - # Module B - import A - - echo genId() # Works - -A ``bind`` statement can also be used in generics for the same purpose. - - -Identifier construction -~~~~~~~~~~~~~~~~~~~~~~~ - -In templates identifiers can be constructed with the backticks notation: - -.. code-block:: nimrod - - template typedef(name: expr, typ: typeDesc) {.immediate.} = - type - `T name`* {.inject.} = typ - `P name`* {.inject.} = ref `T name` - - typedef(myint, int) - var x: PMyInt - -In the example ``name`` is instantiated with ``myint``, so \`T name\` becomes -``Tmyint``. - +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If there is a ``stmt`` parameter it should be the last in the template +declaration, because statements are passed to a template via a +special ``:`` syntax: + +.. code-block:: nimrod + + template withFile(f, fn, mode: expr, actions: stmt): stmt {.immediate.} = + block: + var f: TFile + if open(f, fn, mode): + try: + actions + finally: + close(f) + else: + quit("cannot open: " & fn) + + withFile(txt, "ttempl3.txt", fmWrite): + txt.writeln("line 1") + txt.writeln("line 2") + +In the example the two ``writeln`` statements are bound to the ``actions`` +parameter. + +**Note:** The symbol binding rules for templates might change! + +Symbol binding within templates happens after template instantation: + +.. code-block:: nimrod + # Module A + var + lastId = 0 + + template genId*: expr = + inc(lastId) + lastId + +.. code-block:: nimrod + # Module B + import A + + echo genId() # Error: undeclared identifier: 'lastId' + + +Bind statement +~~~~~~~~~~~~~~ + +Syntax:: + + bindStmt ::= 'bind' IDENT (comma IDENT)* + +Exporting a template is a often a leaky abstraction as it can depend on +symbols that are not visible from a client module. However, to compensate for +this case, a `bind`:idx: statement can be used: It declares all identifiers +that should be bound early (i.e. when the template is parsed): + +.. code-block:: nimrod + # Module A + var + lastId = 0 + + template genId*: expr = + bind lastId + inc(lastId) + lastId + +.. code-block:: nimrod + # Module B + import A + + echo genId() # Works + +A ``bind`` statement can also be used in generics for the same purpose. + + +Identifier construction +~~~~~~~~~~~~~~~~~~~~~~~ + +In templates identifiers can be constructed with the backticks notation: + +.. code-block:: nimrod + + template typedef(name: expr, typ: typeDesc) {.immediate.} = + type + `T name`* {.inject.} = typ + `P name`* {.inject.} = ref `T name` + + typedef(myint, int) + var x: PMyInt + +In the example ``name`` is instantiated with ``myint``, so \`T name\` becomes +``Tmyint``. + Lookup rules for template parameters -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A parameter ``p`` in a template is even substituted in the expression ``x.p``. Thus template arguments can be used as field names and a global symbol can be @@ -3058,7 +3058,7 @@ But the global symbol can properly be captured by a ``bind`` statement: Hygiene in templates -~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~ Per default templates are `hygienic`:idx:\: Local identifiers declared in a template cannot be accessed in the instantiation context: @@ -3087,14 +3087,14 @@ is ``gensym`` and for ``proc``, ``iterator``, ``converter``, ``template``, template parameter, it is an inject'ed symbol: .. code-block:: nimrod - template withFile(f, fn, mode: expr, actions: stmt): stmt {.immediate.} = - block: + template withFile(f, fn, mode: expr, actions: stmt): stmt {.immediate.} = + block: var f: TFile # since 'f' is a template param, it's injected implicitely ... - - withFile(txt, "ttempl3.txt", fmWrite): - txt.writeln("line 1") - txt.writeln("line 2") + + withFile(txt, "ttempl3.txt", fmWrite): + txt.writeln("line 1") + txt.writeln("line 2") The ``inject`` and ``gensym`` pragmas are second class annotations; they have @@ -3111,73 +3111,73 @@ To get rid of hygiene in templates, one can use the `dirty`:idx: pragma for a template. ``inject`` and ``gensym`` have no effect in ``dirty`` templates. - -Macros ------- - -A `macro`:idx: is a special kind of low level template. Macros can be used + +Macros +------ + +A `macro`:idx: is a special kind of low level template. Macros can be used to implement `domain specific languages`:idx:. Like templates, macros come in -the 2 flavors *immediate* and *ordinary*. - -While macros enable advanced compile-time code transformations, they -cannot change Nimrod's syntax. However, this is no real restriction because -Nimrod's syntax is flexible enough anyway. - -To write macros, one needs to know how the Nimrod concrete syntax is converted -to an abstract syntax tree. - -There are two ways to invoke a macro: -(1) invoking a macro like a procedure call (`expression macros`) -(2) invoking a macro with the special ``macrostmt`` syntax (`statement macros`) - - -Expression Macros -~~~~~~~~~~~~~~~~~ - -The following example implements a powerful ``debug`` command that accepts a -variable number of arguments: - -.. code-block:: nimrod - # to work with Nimrod syntax trees, we need an API that is defined in the - # ``macros`` module: - import macros - - macro debug(n: varargs[expr]): stmt = - # `n` is a Nimrod AST that contains the whole macro invocation - # this macro returns a list of statements: - result = newNimNode(nnkStmtList, n) - # iterate over any argument that is passed to this macro: - for i in 0..n.len-1: - # add a call to the statement list that writes the expression; - # `toStrLit` converts an AST to its string representation: - add(result, newCall("write", newIdentNode("stdout"), toStrLit(n[i]))) - # add a call to the statement list that writes ": " - add(result, newCall("write", newIdentNode("stdout"), newStrLitNode(": "))) - # add a call to the statement list that writes the expressions value: - add(result, newCall("writeln", newIdentNode("stdout"), n[i])) - - var - a: array [0..10, int] - x = "some string" - a[0] = 42 - a[1] = 45 - - debug(a[0], a[1], x) - -The macro call expands to: - -.. code-block:: nimrod - write(stdout, "a[0]") - write(stdout, ": ") - writeln(stdout, a[0]) - - write(stdout, "a[1]") - write(stdout, ": ") - writeln(stdout, a[1]) - - write(stdout, "x") - write(stdout, ": ") - writeln(stdout, x) +the 2 flavors *immediate* and *ordinary*. + +While macros enable advanced compile-time code transformations, they +cannot change Nimrod's syntax. However, this is no real restriction because +Nimrod's syntax is flexible enough anyway. + +To write macros, one needs to know how the Nimrod concrete syntax is converted +to an abstract syntax tree. + +There are two ways to invoke a macro: +(1) invoking a macro like a procedure call (`expression macros`) +(2) invoking a macro with the special ``macrostmt`` syntax (`statement macros`) + + +Expression Macros +~~~~~~~~~~~~~~~~~ + +The following example implements a powerful ``debug`` command that accepts a +variable number of arguments: + +.. code-block:: nimrod + # to work with Nimrod syntax trees, we need an API that is defined in the + # ``macros`` module: + import macros + + macro debug(n: varargs[expr]): stmt = + # `n` is a Nimrod AST that contains the whole macro invocation + # this macro returns a list of statements: + result = newNimNode(nnkStmtList, n) + # iterate over any argument that is passed to this macro: + for i in 0..n.len-1: + # add a call to the statement list that writes the expression; + # `toStrLit` converts an AST to its string representation: + add(result, newCall("write", newIdentNode("stdout"), toStrLit(n[i]))) + # add a call to the statement list that writes ": " + add(result, newCall("write", newIdentNode("stdout"), newStrLitNode(": "))) + # add a call to the statement list that writes the expressions value: + add(result, newCall("writeln", newIdentNode("stdout"), n[i])) + + var + a: array [0..10, int] + x = "some string" + a[0] = 42 + a[1] = 45 + + debug(a[0], a[1], x) + +The macro call expands to: + +.. code-block:: nimrod + write(stdout, "a[0]") + write(stdout, ": ") + writeln(stdout, a[0]) + + write(stdout, "a[1]") + write(stdout, ": ") + writeln(stdout, a[1]) + + write(stdout, "x") + write(stdout, ": ") + writeln(stdout, x) Arguments that are passed to a ``varargs`` parameter are wrapped in an array @@ -3186,7 +3186,7 @@ children. BindSym -~~~~~~~ +~~~~~~~ The above ``debug`` macro relies on the fact that ``write``, ``writeln`` and ``stdout`` are declared in the system module and thus visible in the @@ -3194,91 +3194,91 @@ instantiating context. There is a way to use bound identifiers (aka `symbols`:idx) instead of using unbound identifiers. The ``bindSym`` builtin can be used for that: -.. code-block:: nimrod - import macros - - macro debug(n: varargs[expr]): stmt = - result = newNimNode(nnkStmtList, n) - for i in 0..n.len-1: - # we can bind symbols in scope via 'bindSym': - add(result, newCall(bindSym"write", bindSym"stdout", toStrLit(n[i]))) - add(result, newCall(bindSym"write", bindSym"stdout", newStrLitNode(": "))) - add(result, newCall(bindSym"writeln", bindSym"stdout", n[i])) - - var - a: array [0..10, int] - x = "some string" - a[0] = 42 - a[1] = 45 - - debug(a[0], a[1], x) - -The macro call expands to: - -.. code-block:: nimrod - write(stdout, "a[0]") - write(stdout, ": ") - writeln(stdout, a[0]) - - write(stdout, "a[1]") - write(stdout, ": ") - writeln(stdout, a[1]) - - write(stdout, "x") - write(stdout, ": ") - writeln(stdout, x) +.. code-block:: nimrod + import macros + + macro debug(n: varargs[expr]): stmt = + result = newNimNode(nnkStmtList, n) + for i in 0..n.len-1: + # we can bind symbols in scope via 'bindSym': + add(result, newCall(bindSym"write", bindSym"stdout", toStrLit(n[i]))) + add(result, newCall(bindSym"write", bindSym"stdout", newStrLitNode(": "))) + add(result, newCall(bindSym"writeln", bindSym"stdout", n[i])) + + var + a: array [0..10, int] + x = "some string" + a[0] = 42 + a[1] = 45 + + debug(a[0], a[1], x) + +The macro call expands to: + +.. code-block:: nimrod + write(stdout, "a[0]") + write(stdout, ": ") + writeln(stdout, a[0]) + + write(stdout, "a[1]") + write(stdout, ": ") + writeln(stdout, a[1]) + + write(stdout, "x") + write(stdout, ": ") + writeln(stdout, x) However, the symbols ``write``, ``writeln`` and ``stdout`` are already bound and are not looked up again. As the example shows, ``bindSym`` does work with overloaded symbols implicitely. - - -Statement Macros -~~~~~~~~~~~~~~~~ - -Statement macros are defined just as expression macros. However, they are -invoked by an expression following a colon:: - - exprStmt ::= lowestExpr ['=' expr | [expr (comma expr)* [comma]] [macroStmt]] - macroStmt ::= ':' [stmt] ('of' [sliceExprList] ':' stmt - | 'elif' expr ':' stmt - | 'except' exceptList ':' stmt )* - ['else' ':' stmt] - -The following example outlines a macro that generates a lexical analyzer from -regular expressions: - - -.. code-block:: nimrod - import macros - - macro case_token(n: stmt): stmt = - # creates a lexical analyzer from regular expressions - # ... (implementation is an exercise for the reader :-) - nil - - case_token: # this colon tells the parser it is a macro statement - of r"[A-Za-z_]+[A-Za-z_0-9]*": - return tkIdentifier - of r"0-9+": - return tkInteger - of r"[\+\-\*\?]+": - return tkOperator - else: - return tkUnknown - - -**Style note**: For code readability, it is the best idea to use the least -powerful programming construct that still suffices. So the "check list" is: - -(1) Use an ordinary proc/iterator, if possible. -(2) Else: Use a generic proc/iterator, if possible. -(3) Else: Use a template, if possible. -(4) Else: Use a macro. - - -Macros as pragmas -~~~~~~~~~~~~~~~~~ + + +Statement Macros +~~~~~~~~~~~~~~~~ + +Statement macros are defined just as expression macros. However, they are +invoked by an expression following a colon:: + + exprStmt ::= lowestExpr ['=' expr | [expr (comma expr)* [comma]] [macroStmt]] + macroStmt ::= ':' [stmt] ('of' [sliceExprList] ':' stmt + | 'elif' expr ':' stmt + | 'except' exceptList ':' stmt )* + ['else' ':' stmt] + +The following example outlines a macro that generates a lexical analyzer from +regular expressions: + + +.. code-block:: nimrod + import macros + + macro case_token(n: stmt): stmt = + # creates a lexical analyzer from regular expressions + # ... (implementation is an exercise for the reader :-) + nil + + case_token: # this colon tells the parser it is a macro statement + of r"[A-Za-z_]+[A-Za-z_0-9]*": + return tkIdentifier + of r"0-9+": + return tkInteger + of r"[\+\-\*\?]+": + return tkOperator + else: + return tkUnknown + + +**Style note**: For code readability, it is the best idea to use the least +powerful programming construct that still suffices. So the "check list" is: + +(1) Use an ordinary proc/iterator, if possible. +(2) Else: Use a generic proc/iterator, if possible. +(3) Else: Use a template, if possible. +(4) Else: Use a macro. + + +Macros as pragmas +~~~~~~~~~~~~~~~~~ Whole routines (procs, iterators etc.) can also be passed to a template or a macro via the pragma notation: @@ -3295,517 +3295,517 @@ This is a simple syntactic transformation into: m: proc p() = nil - - -Modules -------- -Nimrod supports splitting a program into pieces by a `module`:idx: concept. -Each module needs to be in its own file and has its own `namespace`:idx:. -Modules enable `information hiding`:idx: and `separate compilation`:idx:. -A module may gain access to symbols of another module by the `import`:idx: -statement. `Recursive module dependencies`:idx: are allowed, but slightly -subtle. Only top-level symbols that are marked with an asterisk (``*``) are -exported. - -The algorithm for compiling modules is: - -- compile the whole module as usual, following import statements recursively - -- if there is a cycle only import the already parsed symbols (that are - exported); if an unknown identifier occurs then abort - -This is best illustrated by an example: - -.. code-block:: nimrod - # Module A - type - T1* = int # Module A exports the type ``T1`` - import B # the compiler starts parsing B - - proc main() = - var i = p(3) # works because B has been parsed completely here - - main() - - -.. code-block:: nimrod - # Module B - import A # A is not parsed here! Only the already known symbols - # of A are imported. - - proc p*(x: A.T1): A.T1 = - # this works because the compiler has already - # added T1 to A's interface symbol table - return x + 1 - - -Scope rules ------------ -Identifiers are valid from the point of their declaration until the end of -the block in which the declaration occurred. The range where the identifier -is known is the `scope`:idx: of the identifier. The exact scope of an -identifier depends on the way it was declared. - -Block scope -~~~~~~~~~~~ -The *scope* of a variable declared in the declaration part of a block -is valid from the point of declaration until the end of the block. If a -block contains a second block, in which the identifier is redeclared, -then inside this block, the second declaration will be valid. Upon -leaving the inner block, the first declaration is valid again. An -identifier cannot be redefined in the same block, except if valid for -procedure or iterator overloading purposes. - - -Tuple or object scope -~~~~~~~~~~~~~~~~~~~~~ -The field identifiers inside a tuple or object definition are valid in the -following places: - -* To the end of the tuple/object definition. -* Field designators of a variable of the given tuple/object type. -* In all descendant types of the object type. - -Module scope -~~~~~~~~~~~~ -All identifiers of a module are valid from the point of declaration until -the end of the module. Identifiers from indirectly dependent modules are *not* -available. The `system`:idx: module is automatically imported in every other -module. - -If a module imports an identifier by two different modules, each occurrence of -the identifier has to be qualified, unless it is an overloaded procedure or -iterator in which case the overloading resolution takes place: - -.. code-block:: nimrod - # Module A - var x*: string - -.. code-block:: nimrod - # Module B - var x*: int - -.. code-block:: nimrod - # Module C - import A, B - write(stdout, x) # error: x is ambiguous - write(stdout, A.x) # no error: qualifier used - - var x = 4 - write(stdout, x) # not ambiguous: uses the module C's x - - -Compiler Messages -================= - -The Nimrod compiler emits different kinds of messages: `hint`:idx:, -`warning`:idx:, and `error`:idx: messages. An *error* message is emitted if -the compiler encounters any static error. - - -Pragmas -======= - -Syntax:: - - colonExpr ::= expr [':' expr] - colonExprList ::= [colonExpr (comma colonExpr)* [comma]] - - pragma ::= '{.' optInd (colonExpr [comma])* [SAD] ('.}' | '}') - -Pragmas are Nimrod's method to give the compiler additional information / -commands without introducing a massive number of new keywords. Pragmas are -processed on the fly during semantic checking. Pragmas are enclosed in the -special ``{.`` and ``.}`` curly brackets. Pragmas are also often used as a -first implementation to play with a language feature before a nicer syntax -to access the feature becomes available. - - -noSideEffect pragma -------------------- -The `noSideEffect`:idx: pragma is used to mark a proc/iterator to have no side -effects. This means that the proc/iterator only changes locations that are -reachable from its parameters and the return value only depends on the -arguments. If none of its parameters have the type ``var T`` -or ``ref T`` or ``ptr T`` this means no locations are modified. It is a static -error to mark a proc/iterator to have no side effect if the compiler cannot -verify this. + + +Modules +------- +Nimrod supports splitting a program into pieces by a `module`:idx: concept. +Each module needs to be in its own file and has its own `namespace`:idx:. +Modules enable `information hiding`:idx: and `separate compilation`:idx:. +A module may gain access to symbols of another module by the `import`:idx: +statement. `Recursive module dependencies`:idx: are allowed, but slightly +subtle. Only top-level symbols that are marked with an asterisk (``*``) are +exported. + +The algorithm for compiling modules is: + +- compile the whole module as usual, following import statements recursively + +- if there is a cycle only import the already parsed symbols (that are + exported); if an unknown identifier occurs then abort + +This is best illustrated by an example: + +.. code-block:: nimrod + # Module A + type + T1* = int # Module A exports the type ``T1`` + import B # the compiler starts parsing B + + proc main() = + var i = p(3) # works because B has been parsed completely here + + main() + + +.. code-block:: nimrod + # Module B + import A # A is not parsed here! Only the already known symbols + # of A are imported. + + proc p*(x: A.T1): A.T1 = + # this works because the compiler has already + # added T1 to A's interface symbol table + return x + 1 + + +Scope rules +----------- +Identifiers are valid from the point of their declaration until the end of +the block in which the declaration occurred. The range where the identifier +is known is the `scope`:idx: of the identifier. The exact scope of an +identifier depends on the way it was declared. + +Block scope +~~~~~~~~~~~ +The *scope* of a variable declared in the declaration part of a block +is valid from the point of declaration until the end of the block. If a +block contains a second block, in which the identifier is redeclared, +then inside this block, the second declaration will be valid. Upon +leaving the inner block, the first declaration is valid again. An +identifier cannot be redefined in the same block, except if valid for +procedure or iterator overloading purposes. + + +Tuple or object scope +~~~~~~~~~~~~~~~~~~~~~ +The field identifiers inside a tuple or object definition are valid in the +following places: + +* To the end of the tuple/object definition. +* Field designators of a variable of the given tuple/object type. +* In all descendant types of the object type. + +Module scope +~~~~~~~~~~~~ +All identifiers of a module are valid from the point of declaration until +the end of the module. Identifiers from indirectly dependent modules are *not* +available. The `system`:idx: module is automatically imported in every other +module. + +If a module imports an identifier by two different modules, each occurrence of +the identifier has to be qualified, unless it is an overloaded procedure or +iterator in which case the overloading resolution takes place: + +.. code-block:: nimrod + # Module A + var x*: string + +.. code-block:: nimrod + # Module B + var x*: int + +.. code-block:: nimrod + # Module C + import A, B + write(stdout, x) # error: x is ambiguous + write(stdout, A.x) # no error: qualifier used + + var x = 4 + write(stdout, x) # not ambiguous: uses the module C's x + + +Compiler Messages +================= + +The Nimrod compiler emits different kinds of messages: `hint`:idx:, +`warning`:idx:, and `error`:idx: messages. An *error* message is emitted if +the compiler encounters any static error. + + +Pragmas +======= + +Syntax:: + + colonExpr ::= expr [':' expr] + colonExprList ::= [colonExpr (comma colonExpr)* [comma]] + + pragma ::= '{.' optInd (colonExpr [comma])* [SAD] ('.}' | '}') + +Pragmas are Nimrod's method to give the compiler additional information / +commands without introducing a massive number of new keywords. Pragmas are +processed on the fly during semantic checking. Pragmas are enclosed in the +special ``{.`` and ``.}`` curly brackets. Pragmas are also often used as a +first implementation to play with a language feature before a nicer syntax +to access the feature becomes available. + + +noSideEffect pragma +------------------- +The `noSideEffect`:idx: pragma is used to mark a proc/iterator to have no side +effects. This means that the proc/iterator only changes locations that are +reachable from its parameters and the return value only depends on the +arguments. If none of its parameters have the type ``var T`` +or ``ref T`` or ``ptr T`` this means no locations are modified. It is a static +error to mark a proc/iterator to have no side effect if the compiler cannot +verify this. As a special semantic rule, the built-in ``debugEcho`` pretends to be free of side effects, so that it can be used for debugging routines marked as ``noSideEffect``. - -**Future directions**: ``func`` may become a keyword and syntactic sugar for a -proc with no side effects: - -.. code-block:: nimrod - func `+` (x, y: int): int - - -procvar pragma --------------- -The `procvar`:idx: pragma is used to mark a proc that it can be passed to a -procedural variable. - - - -compileTime pragma ------------------- -The `compileTime`:idx: pragma is used to mark a proc to be used at compile -time only. No code will be generated for it. Compile time procs are useful -as helpers for macros. - - -noReturn pragma ---------------- -The `noreturn`:idx: pragma is used to mark a proc that never returns. - - -Acyclic pragma --------------- -The `acyclic`:idx: pragma can be used for object types to mark them as acyclic -even though they seem to be cyclic. This is an **optimization** for the garbage -collector to not consider objects of this type as part of a cycle: - -.. code-block:: nimrod - type - PNode = ref TNode - TNode {.acyclic, final.} = object - left, right: PNode - data: string - -In the example a tree structure is declared with the ``TNode`` type. Note that -the type definition is recursive and the GC has to assume that objects of -this type may form a cyclic graph. The ``acyclic`` pragma passes the -information that this cannot happen to the GC. If the programmer uses the -``acyclic`` pragma for data types that are in reality cyclic, the GC may leak -memory, but nothing worse happens. - -**Future directions**: The ``acyclic`` pragma may become a property of a -``ref`` type: - -.. code-block:: nimrod - type - PNode = acyclic ref TNode - TNode = object - left, right: PNode - data: string - - -Final pragma ------------- -The `final`:idx: pragma can be used for an object type to specify that it -cannot be inherited from. - - -shallow pragma --------------- -The `shallow`:idx: pragma affects the semantics of a type: The compiler is -allowed to make a shallow copy. This can cause serious semantic issues and -break memory safety! However, it can speed up assignments considerably, -because the semantics of Nimrod require deep copying of sequences and strings. -This can be expensive, especially if sequences are used to build a tree -structure: - -.. code-block:: nimrod - type - TNodeKind = enum nkLeaf, nkInner - TNode {.final, shallow.} = object - case kind: TNodeKind - of nkLeaf: - strVal: string - of nkInner: - children: seq[TNode] - - -Pure pragma ------------ -An object type can be marked with the `pure`:idx: pragma so that its type -field which is used for runtime type identification is omitted. This is -necessary for binary compatibility with other compiled languages. - - -NoStackFrame pragma -------------------- -A proc can be marked with the `noStackFrame`:idx: pragma to tell the compiler -it should not generate a stack frame for the proc. There are also no exit -statements like ``return result;`` generated. This is useful for procs that -only consist of an assembler statement. - - -error pragma ------------- -The `error`:idx: pragma is used to make the compiler output an error message -with the given content. Compilation does not necessarily abort after an error -though. - -The ``error`` pragma can also be used to -annotate a symbol (like an iterator or proc). The *usage* of the symbol then -triggers a compile-time error. This is especially useful to rule out that some -operation is valid due to overloading and type conversions: - -.. code-block:: nimrod - ## check that underlying int values are compared and not the pointers: - proc `==`(x, y: ptr int): bool {.error.} - - -fatal pragma ------------- -The `fatal`:idx: pragma is used to make the compiler output an error message -with the given content. In contrast to the ``error`` pragma, compilation -is guaranteed to be aborted by this pragma. - -warning pragma --------------- -The `warning`:idx: pragma is used to make the compiler output a warning message -with the given content. Compilation continues after the warning. - -hint pragma ------------ -The `hint`:idx: pragma is used to make the compiler output a hint message with -the given content. Compilation continues after the hint. - -line pragma ------------ -The `line`:idx: pragma can be used to affect line information of the annotated -statement as seen in stack backtraces: - -.. code-block:: nimrod - - template myassert*(cond: expr, msg = "") = - if not cond: - # change run-time line information of the 'raise' statement: - {.line: InstantiationInfo().}: - raise newException(EAssertionFailed, msg) - -If the ``line`` pragma is used with a parameter, the parameter needs be a -``tuple[filename: string, line: int]``. If it is used without a parameter, -``system.InstantiationInfo()`` is used. - - -linearScanEnd pragma --------------------- -The `linearScanEnd`:idx: pragma can be used to tell the compiler how to -compile a Nimrod `case`:idx: statement. Syntactially it has to be used as a -statement: - -.. code-block:: nimrod - case myInt - of 0: - echo "most common case" - of 1: - {.linearScanEnd.} - echo "second most common case" - of 2: echo "unlikely: use branch table" - else: echo "unlikely too: use branch table for ", myInt - -In the example, the case branches ``0`` and ``1`` are much more common than -the other cases. Therefore the generated assembler code should test for these -values first, so that the CPU's branch predictor has a good chance to succeed -(avoiding an expensive CPU pipeline stall). The other cases might be put into a -jump table for O(1) overhead, but at the cost of a (very likely) pipeline -stall. - -The ``linearScanEnd`` pragma should be put into the last branch that should be -tested against via linear scanning. If put into the last branch of the -whole ``case`` statement, the whole ``case`` statement uses linear scanning. - - -unroll pragma -------------- -The `unroll`:idx: pragma can be used to tell the compiler that it should unroll -a `for`:idx: or `while`:idx: loop for runtime efficiency: - -.. code-block:: nimrod - proc searchChar(s: string, c: char): int = - for i in 0 .. s.high: - {.unroll: 4.} - if s[i] == c: return i - result = -1 - -In the above example, the search loop is unrolled by a factor 4. The unroll -factor can be left out too; the compiler then chooses an appropriate unroll -factor. - -**Note**: Currently the compiler recognizes but ignores this pragma. + +**Future directions**: ``func`` may become a keyword and syntactic sugar for a +proc with no side effects: + +.. code-block:: nimrod + func `+` (x, y: int): int + + +procvar pragma +-------------- +The `procvar`:idx: pragma is used to mark a proc that it can be passed to a +procedural variable. + + + +compileTime pragma +------------------ +The `compileTime`:idx: pragma is used to mark a proc to be used at compile +time only. No code will be generated for it. Compile time procs are useful +as helpers for macros. + + +noReturn pragma +--------------- +The `noreturn`:idx: pragma is used to mark a proc that never returns. + + +Acyclic pragma +-------------- +The `acyclic`:idx: pragma can be used for object types to mark them as acyclic +even though they seem to be cyclic. This is an **optimization** for the garbage +collector to not consider objects of this type as part of a cycle: + +.. code-block:: nimrod + type + PNode = ref TNode + TNode {.acyclic, final.} = object + left, right: PNode + data: string + +In the example a tree structure is declared with the ``TNode`` type. Note that +the type definition is recursive and the GC has to assume that objects of +this type may form a cyclic graph. The ``acyclic`` pragma passes the +information that this cannot happen to the GC. If the programmer uses the +``acyclic`` pragma for data types that are in reality cyclic, the GC may leak +memory, but nothing worse happens. + +**Future directions**: The ``acyclic`` pragma may become a property of a +``ref`` type: + +.. code-block:: nimrod + type + PNode = acyclic ref TNode + TNode = object + left, right: PNode + data: string + + +Final pragma +------------ +The `final`:idx: pragma can be used for an object type to specify that it +cannot be inherited from. + + +shallow pragma +-------------- +The `shallow`:idx: pragma affects the semantics of a type: The compiler is +allowed to make a shallow copy. This can cause serious semantic issues and +break memory safety! However, it can speed up assignments considerably, +because the semantics of Nimrod require deep copying of sequences and strings. +This can be expensive, especially if sequences are used to build a tree +structure: + +.. code-block:: nimrod + type + TNodeKind = enum nkLeaf, nkInner + TNode {.final, shallow.} = object + case kind: TNodeKind + of nkLeaf: + strVal: string + of nkInner: + children: seq[TNode] + + +Pure pragma +----------- +An object type can be marked with the `pure`:idx: pragma so that its type +field which is used for runtime type identification is omitted. This is +necessary for binary compatibility with other compiled languages. + + +NoStackFrame pragma +------------------- +A proc can be marked with the `noStackFrame`:idx: pragma to tell the compiler +it should not generate a stack frame for the proc. There are also no exit +statements like ``return result;`` generated. This is useful for procs that +only consist of an assembler statement. + + +error pragma +------------ +The `error`:idx: pragma is used to make the compiler output an error message +with the given content. Compilation does not necessarily abort after an error +though. + +The ``error`` pragma can also be used to +annotate a symbol (like an iterator or proc). The *usage* of the symbol then +triggers a compile-time error. This is especially useful to rule out that some +operation is valid due to overloading and type conversions: + +.. code-block:: nimrod + ## check that underlying int values are compared and not the pointers: + proc `==`(x, y: ptr int): bool {.error.} + + +fatal pragma +------------ +The `fatal`:idx: pragma is used to make the compiler output an error message +with the given content. In contrast to the ``error`` pragma, compilation +is guaranteed to be aborted by this pragma. + +warning pragma +-------------- +The `warning`:idx: pragma is used to make the compiler output a warning message +with the given content. Compilation continues after the warning. + +hint pragma +----------- +The `hint`:idx: pragma is used to make the compiler output a hint message with +the given content. Compilation continues after the hint. + +line pragma +----------- +The `line`:idx: pragma can be used to affect line information of the annotated +statement as seen in stack backtraces: + +.. code-block:: nimrod + + template myassert*(cond: expr, msg = "") = + if not cond: + # change run-time line information of the 'raise' statement: + {.line: InstantiationInfo().}: + raise newException(EAssertionFailed, msg) + +If the ``line`` pragma is used with a parameter, the parameter needs be a +``tuple[filename: string, line: int]``. If it is used without a parameter, +``system.InstantiationInfo()`` is used. + + +linearScanEnd pragma +-------------------- +The `linearScanEnd`:idx: pragma can be used to tell the compiler how to +compile a Nimrod `case`:idx: statement. Syntactially it has to be used as a +statement: + +.. code-block:: nimrod + case myInt + of 0: + echo "most common case" + of 1: + {.linearScanEnd.} + echo "second most common case" + of 2: echo "unlikely: use branch table" + else: echo "unlikely too: use branch table for ", myInt + +In the example, the case branches ``0`` and ``1`` are much more common than +the other cases. Therefore the generated assembler code should test for these +values first, so that the CPU's branch predictor has a good chance to succeed +(avoiding an expensive CPU pipeline stall). The other cases might be put into a +jump table for O(1) overhead, but at the cost of a (very likely) pipeline +stall. + +The ``linearScanEnd`` pragma should be put into the last branch that should be +tested against via linear scanning. If put into the last branch of the +whole ``case`` statement, the whole ``case`` statement uses linear scanning. + + +unroll pragma +------------- +The `unroll`:idx: pragma can be used to tell the compiler that it should unroll +a `for`:idx: or `while`:idx: loop for runtime efficiency: + +.. code-block:: nimrod + proc searchChar(s: string, c: char): int = + for i in 0 .. s.high: + {.unroll: 4.} + if s[i] == c: return i + result = -1 + +In the above example, the search loop is unrolled by a factor 4. The unroll +factor can be left out too; the compiler then chooses an appropriate unroll +factor. + +**Note**: Currently the compiler recognizes but ignores this pragma. immediate pragma ---------------- See `Ordinary vs immediate templates`_. - - -compilation option pragmas --------------------------- -The listed pragmas here can be used to override the code generation options -for a section of code. - -The implementation currently provides the following possible options (various -others may be added later). - -=============== =============== ============================================ -pragma allowed values description -=============== =============== ============================================ -checks on|off Turns the code generation for all runtime - checks on or off. -boundChecks on|off Turns the code generation for array bound - checks on or off. -overflowChecks on|off Turns the code generation for over- or - underflow checks on or off. -nilChecks on|off Turns the code generation for nil pointer - checks on or off. -assertions on|off Turns the code generation for assertions - on or off. -warnings on|off Turns the warning messages of the compiler - on or off. -hints on|off Turns the hint messages of the compiler - on or off. -optimization none|speed|size Optimize the code for speed or size, or + + +compilation option pragmas +-------------------------- +The listed pragmas here can be used to override the code generation options +for a section of code. + +The implementation currently provides the following possible options (various +others may be added later). + +=============== =============== ============================================ +pragma allowed values description +=============== =============== ============================================ +checks on|off Turns the code generation for all runtime + checks on or off. +boundChecks on|off Turns the code generation for array bound + checks on or off. +overflowChecks on|off Turns the code generation for over- or + underflow checks on or off. +nilChecks on|off Turns the code generation for nil pointer + checks on or off. +assertions on|off Turns the code generation for assertions + on or off. +warnings on|off Turns the warning messages of the compiler + on or off. +hints on|off Turns the hint messages of the compiler + on or off. +optimization none|speed|size Optimize the code for speed or size, or disable optimization. patterns on|off Turns the term rewriting templates/macros - on or off. -callconv cdecl|... Specifies the default calling convention for - all procedures (and procedure types) that - follow. -=============== =============== ============================================ - -Example: - -.. code-block:: nimrod - {.checks: off, optimization: speed.} - # compile without runtime checks and optimize for speed - - -push and pop pragmas --------------------- -The `push/pop`:idx: pragmas are very similar to the option directive, -but are used to override the settings temporarily. Example: - -.. code-block:: nimrod - {.push checks: off.} - # compile this section without runtime checks as it is - # speed critical - # ... some code ... - {.pop.} # restore old settings - - -register pragma ---------------- -The `register`:idx: pragma is for variables only. It declares the variable as -``register``, giving the compiler a hint that the variable should be placed -in a hardware register for faster access. C compilers usually ignore this -though and for good reasons: Often they do a better job without it anyway. - -In highly specific cases (a dispatch loop of an bytecode interpreter for -example) it may provide benefits, though. - - -global pragma -------------- -The `global`:idx: pragma can be applied to a variable within a proc to instruct -the compiler to store it in a global location and initialize it once at program -startup. - -.. code-block:: nimrod - proc isHexNumber(s: string): bool = - var pattern {.global.} = re"[0-9a-fA-F]+" - result = s.match(pattern) - -When used within a generic proc, a separate unique global variable will be -created for each instantiation of the proc. The order of initialization of -the created global variables within a module is not defined, but all of them -will be initialized after any top-level variables in their originating module -and before any variable in a module that imports it. - -DeadCodeElim pragma -------------------- -The `deadCodeElim`:idx: pragma only applies to whole modules: It tells the -compiler to activate (or deactivate) dead code elimination for the module the -pragma appers in. - -The ``--deadCodeElim:on`` command line switch has the same effect as marking -every module with ``{.deadCodeElim:on}``. However, for some modules such as -the GTK wrapper it makes sense to *always* turn on dead code elimination - -no matter if it is globally active or not. - -Example: - -.. code-block:: nimrod - {.deadCodeElim: on.} - - -Pragma pragma -------------- - -The `pragma`:idx: pragma can be used to declare user defined pragmas. This is -useful because Nimrod's templates and macros do not affect pragmas. User -defined pragmas are in a different module-wide scope than all other symbols. -They cannot be imported from a module. - -Example: - -.. code-block:: nimrod - when appType == "lib": - {.pragma: rtl, exportc, dynlib, cdecl.} - else: - {.pragma: rtl, importc, dynlib: "client.dll", cdecl.} - - proc p*(a, b: int): int {.rtl.} = - return a+b - -In the example a new pragma named ``rtl`` is introduced that either imports -a symbol from a dynamic library or exports the symbol for dynamic library -generation. - - -Disabling certain messages --------------------------- -Nimrod generates some warnings and hints ("line too long") that may annoy the -user. A mechanism for disabling certain messages is provided: Each hint -and warning message contains a symbol in brackets. This is the message's -identifier that can be used to enable or disable it: - -.. code-block:: Nimrod - {.warning[LineTooLong]: off.} # turn off warning about too long lines - -This is often better than disabling all warnings at once. - - -Foreign function interface -========================== - -Nimrod's `FFI`:idx: (foreign function interface) is extensive and only the -parts that scale to other future backends (like the LLVM/EcmaScript backends) -are documented here. - - -Importc pragma --------------- -The `importc`:idx: pragma provides a means to import a proc or a variable -from C. The optional argument is a string containing the C identifier. If -the argument is missing, the C name is the Nimrod identifier *exactly as -spelled*: - -.. code-block:: - proc printf(formatstr: cstring) {.importc: "printf", varargs.} - -Note that this pragma is somewhat of a misnomer: Other backends will provide -the same feature under the same name. - - -Exportc pragma --------------- -The `exportc`:idx: pragma provides a means to export a type, a variable, or a -procedure to C. The optional argument is a string containing the C identifier. -If the argument is missing, the C name is the Nimrod -identifier *exactly as spelled*: - -.. code-block:: Nimrod - proc callme(formatstr: cstring) {.exportc: "callMe", varargs.} - -Note that this pragma is somewhat of a misnomer: Other backends will provide -the same feature under the same name. + on or off. +callconv cdecl|... Specifies the default calling convention for + all procedures (and procedure types) that + follow. +=============== =============== ============================================ + +Example: + +.. code-block:: nimrod + {.checks: off, optimization: speed.} + # compile without runtime checks and optimize for speed + + +push and pop pragmas +-------------------- +The `push/pop`:idx: pragmas are very similar to the option directive, +but are used to override the settings temporarily. Example: + +.. code-block:: nimrod + {.push checks: off.} + # compile this section without runtime checks as it is + # speed critical + # ... some code ... + {.pop.} # restore old settings + + +register pragma +--------------- +The `register`:idx: pragma is for variables only. It declares the variable as +``register``, giving the compiler a hint that the variable should be placed +in a hardware register for faster access. C compilers usually ignore this +though and for good reasons: Often they do a better job without it anyway. + +In highly specific cases (a dispatch loop of an bytecode interpreter for +example) it may provide benefits, though. + + +global pragma +------------- +The `global`:idx: pragma can be applied to a variable within a proc to instruct +the compiler to store it in a global location and initialize it once at program +startup. + +.. code-block:: nimrod + proc isHexNumber(s: string): bool = + var pattern {.global.} = re"[0-9a-fA-F]+" + result = s.match(pattern) + +When used within a generic proc, a separate unique global variable will be +created for each instantiation of the proc. The order of initialization of +the created global variables within a module is not defined, but all of them +will be initialized after any top-level variables in their originating module +and before any variable in a module that imports it. + +DeadCodeElim pragma +------------------- +The `deadCodeElim`:idx: pragma only applies to whole modules: It tells the +compiler to activate (or deactivate) dead code elimination for the module the +pragma appers in. + +The ``--deadCodeElim:on`` command line switch has the same effect as marking +every module with ``{.deadCodeElim:on}``. However, for some modules such as +the GTK wrapper it makes sense to *always* turn on dead code elimination - +no matter if it is globally active or not. + +Example: + +.. code-block:: nimrod + {.deadCodeElim: on.} + + +Pragma pragma +------------- + +The `pragma`:idx: pragma can be used to declare user defined pragmas. This is +useful because Nimrod's templates and macros do not affect pragmas. User +defined pragmas are in a different module-wide scope than all other symbols. +They cannot be imported from a module. + +Example: + +.. code-block:: nimrod + when appType == "lib": + {.pragma: rtl, exportc, dynlib, cdecl.} + else: + {.pragma: rtl, importc, dynlib: "client.dll", cdecl.} + + proc p*(a, b: int): int {.rtl.} = + return a+b + +In the example a new pragma named ``rtl`` is introduced that either imports +a symbol from a dynamic library or exports the symbol for dynamic library +generation. + + +Disabling certain messages +-------------------------- +Nimrod generates some warnings and hints ("line too long") that may annoy the +user. A mechanism for disabling certain messages is provided: Each hint +and warning message contains a symbol in brackets. This is the message's +identifier that can be used to enable or disable it: + +.. code-block:: Nimrod + {.warning[LineTooLong]: off.} # turn off warning about too long lines + +This is often better than disabling all warnings at once. + + +Foreign function interface +========================== + +Nimrod's `FFI`:idx: (foreign function interface) is extensive and only the +parts that scale to other future backends (like the LLVM/EcmaScript backends) +are documented here. + + +Importc pragma +-------------- +The `importc`:idx: pragma provides a means to import a proc or a variable +from C. The optional argument is a string containing the C identifier. If +the argument is missing, the C name is the Nimrod identifier *exactly as +spelled*: + +.. code-block:: + proc printf(formatstr: cstring) {.importc: "printf", varargs.} + +Note that this pragma is somewhat of a misnomer: Other backends will provide +the same feature under the same name. + + +Exportc pragma +-------------- +The `exportc`:idx: pragma provides a means to export a type, a variable, or a +procedure to C. The optional argument is a string containing the C identifier. +If the argument is missing, the C name is the Nimrod +identifier *exactly as spelled*: + +.. code-block:: Nimrod + proc callme(formatstr: cstring) {.exportc: "callMe", varargs.} + +Note that this pragma is somewhat of a misnomer: Other backends will provide +the same feature under the same name. Bycopy pragma @@ -3825,232 +3825,232 @@ Byref pragma The `byref`:idx: pragma can be applied to an object or tuple type and instructs the compiler to pass the type by reference (hidden pointer) to procs. - - -Varargs pragma --------------- -The `varargs`:idx: pragma can be applied to procedures only (and procedure -types). It tells Nimrod that the proc can take a variable number of parameters -after the last specified parameter. Nimrod string values will be converted to C -strings automatically: - -.. code-block:: Nimrod - proc printf(formatstr: cstring) {.nodecl, varargs.} - - printf("hallo %s", "world") # "world" will be passed as C string - - -Dynlib pragma for import ------------------------- -With the `dynlib`:idx: pragma a procedure or a variable can be imported from + + +Varargs pragma +-------------- +The `varargs`:idx: pragma can be applied to procedures only (and procedure +types). It tells Nimrod that the proc can take a variable number of parameters +after the last specified parameter. Nimrod string values will be converted to C +strings automatically: + +.. code-block:: Nimrod + proc printf(formatstr: cstring) {.nodecl, varargs.} + + printf("hallo %s", "world") # "world" will be passed as C string + + +Dynlib pragma for import +------------------------ +With the `dynlib`:idx: pragma a procedure or a variable can be imported from a dynamic library (``.dll`` files for Windows, ``lib*.so`` files for UNIX). -The non-optional argument has to be the name of the dynamic library: - -.. code-block:: Nimrod - proc gtk_image_new(): PGtkWidget {. - cdecl, dynlib: "libgtk-x11-2.0.so", importc.} - -In general, importing a dynamic library does not require any special linker -options or linking with import libraries. This also implies that no *devel* -packages need to be installed. - -The ``dynlib`` import mechanism supports a versioning scheme: - -.. code-block:: nimrod - proc Tcl_Eval(interp: pTcl_Interp, script: cstring): int {.cdecl, - importc, dynlib: "libtcl(|8.5|8.4|8.3).so.(1|0)".} - -At runtime the dynamic library is searched for (in this order):: - - libtcl.so.1 - libtcl.so.0 - libtcl8.5.so.1 - libtcl8.5.so.0 - libtcl8.4.so.1 - libtcl8.4.so.0 - libtcl8.3.so.1 - libtcl8.3.so.0 - -The ``dynlib`` pragma supports not only constant strings as argument but also -string expressions in general: - -.. code-block:: nimrod - import os - - proc getDllName: string = - result = "mylib.dll" - if ExistsFile(result): return - result = "mylib2.dll" - if ExistsFile(result): return - quit("could not load dynamic library") - - proc myImport(s: cstring) {.cdecl, importc, dynlib: getDllName().} - -**Note**: Patterns like ``libtcl(|8.5|8.4).so`` are only supported in constant -strings, because they are precompiled. - -**Note**: Passing variables to the ``dynlib`` pragma will fail at runtime -because of order of initialization problems. - - -Dynlib pragma for export ------------------------- - -With the ``dynlib`` pragma a procedure can also be exported to -a dynamic library. The pragma then has no argument and has to be used in -conjunction with the ``exportc`` pragma: - -.. code-block:: Nimrod - proc exportme(): int {.cdecl, exportc, dynlib.} - -This is only useful if the program is compiled as a dynamic library via the -``--app:lib`` command line option. - - -Threads -======= - -Even though Nimrod's `thread`:idx: support and semantics are preliminary, -they should be quite usable already. To enable thread support -the ``--threads:on`` command line switch needs to be used. The ``system`` -module then contains several threading primitives. -See the `threads <threads.html>`_ and `channels <channels.html>`_ modules -for the thread API. - -Nimrod's memory model for threads is quite different than that of other common -programming languages (C, Pascal, Java): Each thread has its own (garbage -collected) heap and sharing of memory is restricted to global variables. This -helps to prevent race conditions. GC efficiency is improved quite a lot, -because the GC never has to stop other threads and see what they reference. -Memory allocation requires no lock at all! This design easily scales to massive -multicore processors that will become the norm in the future. - - - -Thread pragma -------------- - -A proc that is executed as a new thread of execution should be marked by the -`thread pragma`:idx:. The compiler checks procedures marked as ``thread`` for -violations of the `no heap sharing restriction`:idx:\: This restriction implies -that it is invalid to construct a data structure that consists of memory -allocated from different (thread local) heaps. - -Since the semantic checking of threads requires whole program analysis, -it is quite expensive and can be turned off with ``--threadanalysis:off`` to -improve compile times. - -A thread proc is passed to ``createThread`` and invoked indirectly; so the -``thread`` pragma implies ``procvar``. - - -Threadvar pragma ----------------- - -A global variable can be marked with the `threadvar`:idx: pragma; it is -a `thead-local`:idx: variable then: - -.. code-block:: nimrod - var checkpoints* {.threadvar.}: seq[string] = @[] - - -Actor model ------------ - -**Caution**: This section is already outdated! XXX - -Nimrod supports the `actor model`:idx: of concurrency natively: - -.. code-block:: nimrod - type - TMsgKind = enum - mLine, mEof - TMsg = object {.pure, final.} - case k: TMsgKind - of mEof: nil - of mLine: data: string - - var - thr: TThread[TMsg] - printedLines = 0 - m: TMsg - - proc print() {.thread.} = - while true: - var x = recv[TMsg]() - if x.k == mEof: break - echo x.data - discard atomicInc(printedLines) - - createThread(thr, print) - - var input = open("readme.txt") - while not endOfFile(input): - m.data = input.readLine() - thr.send(m) - close(input) - m.k = mEof - thr.send(m) - joinThread(thr) - - echo printedLines - -In the actor model threads communicate only over sending messages (`send`:idx: -and `recv`:idx: built-ins), not by sharing memory. Every thread has -an `inbox`:idx: that keeps incoming messages until the thread requests a new -message via the ``recv`` operation. The inbox is an unlimited FIFO queue. - -In the above example the ``print`` thread also communicates with its -parent thread over the ``printedLines`` global variable. In general, it is -highly advisable to only read from globals, but not to write to them. In fact -a write to a global that contains GC'ed memory is always wrong, because it -violates the *no heap sharing restriction*: - -.. code-block:: nimrod - var - global: string - t: TThread[string] - - proc horrible() {.thread.} = - global = "string in thread local heap!" - - createThread(t, horrible) - joinThread(t) - -For the above code the compiler produces "Warning: write to foreign heap". This -warning might become an error message in future versions of the compiler. - -Creating a thread is an expensive operation, because a new stack and heap needs -to be created for the thread. It is therefore highly advisable that a thread -handles a large amount of work. Nimrod prefers *coarse grained* -over *fine grained* concurrency. - - -Threads and exceptions ----------------------- - -The interaction between threads and exceptions is simple: A *handled* exception -in one thread cannot affect any other thread. However, an *unhandled* -exception in one thread terminates the whole *process*! - -Taint mode -========== - -The Nimrod compiler and most parts of the standard library support -a `taint mode`:idx:. Input strings are declared with the `TaintedString`:idx: -string type declared in the ``system`` module. - -If the taint mode is turned on (via the ``--taintMode:on`` command line -option) it is a distinct string type which helps to detect input -validation errors: - -.. code-block:: nimrod - echo "your name: " - var name: TaintedString = stdin.readline - # it is safe here to output the name without any input validation, so - # we simply convert `name` to string to make the compiler happy: - echo "hi, ", name.string - -If the taint mode is turned off, ``TaintedString`` is simply an alias for -``string``. - +The non-optional argument has to be the name of the dynamic library: + +.. code-block:: Nimrod + proc gtk_image_new(): PGtkWidget {. + cdecl, dynlib: "libgtk-x11-2.0.so", importc.} + +In general, importing a dynamic library does not require any special linker +options or linking with import libraries. This also implies that no *devel* +packages need to be installed. + +The ``dynlib`` import mechanism supports a versioning scheme: + +.. code-block:: nimrod + proc Tcl_Eval(interp: pTcl_Interp, script: cstring): int {.cdecl, + importc, dynlib: "libtcl(|8.5|8.4|8.3).so.(1|0)".} + +At runtime the dynamic library is searched for (in this order):: + + libtcl.so.1 + libtcl.so.0 + libtcl8.5.so.1 + libtcl8.5.so.0 + libtcl8.4.so.1 + libtcl8.4.so.0 + libtcl8.3.so.1 + libtcl8.3.so.0 + +The ``dynlib`` pragma supports not only constant strings as argument but also +string expressions in general: + +.. code-block:: nimrod + import os + + proc getDllName: string = + result = "mylib.dll" + if ExistsFile(result): return + result = "mylib2.dll" + if ExistsFile(result): return + quit("could not load dynamic library") + + proc myImport(s: cstring) {.cdecl, importc, dynlib: getDllName().} + +**Note**: Patterns like ``libtcl(|8.5|8.4).so`` are only supported in constant +strings, because they are precompiled. + +**Note**: Passing variables to the ``dynlib`` pragma will fail at runtime +because of order of initialization problems. + + +Dynlib pragma for export +------------------------ + +With the ``dynlib`` pragma a procedure can also be exported to +a dynamic library. The pragma then has no argument and has to be used in +conjunction with the ``exportc`` pragma: + +.. code-block:: Nimrod + proc exportme(): int {.cdecl, exportc, dynlib.} + +This is only useful if the program is compiled as a dynamic library via the +``--app:lib`` command line option. + + +Threads +======= + +Even though Nimrod's `thread`:idx: support and semantics are preliminary, +they should be quite usable already. To enable thread support +the ``--threads:on`` command line switch needs to be used. The ``system`` +module then contains several threading primitives. +See the `threads <threads.html>`_ and `channels <channels.html>`_ modules +for the thread API. + +Nimrod's memory model for threads is quite different than that of other common +programming languages (C, Pascal, Java): Each thread has its own (garbage +collected) heap and sharing of memory is restricted to global variables. This +helps to prevent race conditions. GC efficiency is improved quite a lot, +because the GC never has to stop other threads and see what they reference. +Memory allocation requires no lock at all! This design easily scales to massive +multicore processors that will become the norm in the future. + + + +Thread pragma +------------- + +A proc that is executed as a new thread of execution should be marked by the +`thread pragma`:idx:. The compiler checks procedures marked as ``thread`` for +violations of the `no heap sharing restriction`:idx:\: This restriction implies +that it is invalid to construct a data structure that consists of memory +allocated from different (thread local) heaps. + +Since the semantic checking of threads requires whole program analysis, +it is quite expensive and can be turned off with ``--threadanalysis:off`` to +improve compile times. + +A thread proc is passed to ``createThread`` and invoked indirectly; so the +``thread`` pragma implies ``procvar``. + + +Threadvar pragma +---------------- + +A global variable can be marked with the `threadvar`:idx: pragma; it is +a `thead-local`:idx: variable then: + +.. code-block:: nimrod + var checkpoints* {.threadvar.}: seq[string] = @[] + + +Actor model +----------- + +**Caution**: This section is already outdated! XXX + +Nimrod supports the `actor model`:idx: of concurrency natively: + +.. code-block:: nimrod + type + TMsgKind = enum + mLine, mEof + TMsg = object {.pure, final.} + case k: TMsgKind + of mEof: nil + of mLine: data: string + + var + thr: TThread[TMsg] + printedLines = 0 + m: TMsg + + proc print() {.thread.} = + while true: + var x = recv[TMsg]() + if x.k == mEof: break + echo x.data + discard atomicInc(printedLines) + + createThread(thr, print) + + var input = open("readme.txt") + while not endOfFile(input): + m.data = input.readLine() + thr.send(m) + close(input) + m.k = mEof + thr.send(m) + joinThread(thr) + + echo printedLines + +In the actor model threads communicate only over sending messages (`send`:idx: +and `recv`:idx: built-ins), not by sharing memory. Every thread has +an `inbox`:idx: that keeps incoming messages until the thread requests a new +message via the ``recv`` operation. The inbox is an unlimited FIFO queue. + +In the above example the ``print`` thread also communicates with its +parent thread over the ``printedLines`` global variable. In general, it is +highly advisable to only read from globals, but not to write to them. In fact +a write to a global that contains GC'ed memory is always wrong, because it +violates the *no heap sharing restriction*: + +.. code-block:: nimrod + var + global: string + t: TThread[string] + + proc horrible() {.thread.} = + global = "string in thread local heap!" + + createThread(t, horrible) + joinThread(t) + +For the above code the compiler produces "Warning: write to foreign heap". This +warning might become an error message in future versions of the compiler. + +Creating a thread is an expensive operation, because a new stack and heap needs +to be created for the thread. It is therefore highly advisable that a thread +handles a large amount of work. Nimrod prefers *coarse grained* +over *fine grained* concurrency. + + +Threads and exceptions +---------------------- + +The interaction between threads and exceptions is simple: A *handled* exception +in one thread cannot affect any other thread. However, an *unhandled* +exception in one thread terminates the whole *process*! + +Taint mode +========== + +The Nimrod compiler and most parts of the standard library support +a `taint mode`:idx:. Input strings are declared with the `TaintedString`:idx: +string type declared in the ``system`` module. + +If the taint mode is turned on (via the ``--taintMode:on`` command line +option) it is a distinct string type which helps to detect input +validation errors: + +.. code-block:: nimrod + echo "your name: " + var name: TaintedString = stdin.readline + # it is safe here to output the name without any input validation, so + # we simply convert `name` to string to make the compiler happy: + echo "hi, ", name.string + +If the taint mode is turned off, ``TaintedString`` is simply an alias for +``string``. + |