author | Andreas Rumpf <andreas@andi> | 2008-06-22 16:14:11 +0200 |
---|---|---|
committer | Andreas Rumpf <andreas@andi> | 2008-06-22 16:14:11 +0200 |
commit | 405b86068e6a3d39970b9129ceec0a9108464b28 (patch) | |
tree | c0449946f54baae6ea88baf453157ddd7faa8f86 /doc | |
Initial import
Diffstat (limited to 'doc')
-rwxr-xr-x | doc/docs.txt | 26 |
-rwxr-xr-x | doc/endb.txt | 174 |
-rwxr-xr-x | doc/filelist.txt | 44 |
-rwxr-xr-x | doc/grammar.txt | 186 |
-rwxr-xr-x | doc/html/empty.txt | 1 |
-rwxr-xr-x | doc/intern.txt | 575 |
-rwxr-xr-x | doc/lib.txt | 48 |
-rwxr-xr-x | doc/manual.txt | 1742 |
-rwxr-xr-x | doc/nimdoc.css | 295 |
-rwxr-xr-x | doc/nimrodc.txt | 241 |
-rwxr-xr-x | doc/overview.txt | 9 |
-rwxr-xr-x | doc/posix.txt | 220 |
-rwxr-xr-x | doc/readme.txt | 11 |
-rwxr-xr-x | doc/regexprs.txt | 296 |
-rwxr-xr-x | doc/rst.txt | 111 |
-rwxr-xr-x | doc/spec.txt | 1297 |
-rwxr-xr-x | doc/theindex.txt | 1436 |
-rwxr-xr-x | doc/tutorial.txt | 215 |
18 files changed, 6927 insertions, 0 deletions
diff --git a/doc/docs.txt b/doc/docs.txt new file mode 100755 index 000000000..f58a16f48 --- /dev/null +++ b/doc/docs.txt @@ -0,0 +1,26 @@ + "Incorrect documentation is often worse than no documentation." + -- Bertrand Meyer + +The documentation consists of several documents: + +- | `Nimrod manual <manual.html>`_ + | Read this to get to know the Nimrod programming system. + +- | `User guide for the Nimrod Compiler <nimrodc.html>`_ + | The user guide lists command line arguments, Nimrodc's special features, etc. + +- | `User guide for the Embedded Nimrod Debugger <endb.html>`_ + | This document describes how to use the Embedded debugger. The embedded + debugger currently has no GUI. Please help! + +- | `Nimrod library documentation <lib.html>`_ + | This document describes Nimrod's standard library. + +- | `Nimrod internal documentation <intern.html>`_ + | The internal documentation describes how the compiler is implemented. Read + this if you want to hack the compiler or develop advanced macros. + +- | `Index <theindex.html>`_ + | The generated index. Often the quickest way to find the piece of + information you need. + diff --git a/doc/endb.txt b/doc/endb.txt new file mode 100755 index 000000000..b69a6ec6d --- /dev/null +++ b/doc/endb.txt @@ -0,0 +1,174 @@ +=========================================== + Embedded Nimrod Debugger User Guide +=========================================== + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + +Nimrod comes with a platform-independent debugger - +the `Embedded Nimrod Debugger`:idx: (`ENDB`:idx:). The debugger is +*embedded* into your executable if it has been +compiled with the ``--debugger:on`` command line option. +This also defines the conditional symbol ``ENDB`` for you. + +Note: You must not compile your program with the ``--app:gui`` +command line option because then there is no console +available for the debugger. 
+ +If you start your program the debugger will immediately show +a prompt on the console. You can now enter a command. The next sections +deal with the possible commands. As usual for Nimrod for all commands +underscores and case do not matter. Optional components of a command +are listed in brackets ``[...]`` here. + + +General Commands +================ + +``h``, ``help`` + Display a quick reference of the possible commands. + +``q``, ``quit`` + Quit the debugger and the program. + +<ENTER> + (Without any typed command) repeat the previous debugger command. + If there is no previous command, ``step_into`` is assumed. + +Executing Commands +================== + +``s``, ``step_into`` + Single step, stepping into routine calls. + +``n``, ``step_over`` + Single step, without stepping into routine calls. + +``f``, ``skip_current`` + Continue execution until the current routine finishes. + +``c``, ``continue`` + Continue execution until the next breakpoint. + +``i``, ``ignore`` + Continue execution, ignore all breakpoints. This is effectively quitting + the debugger and runs the program until it finishes. + + +Breakpoint Commands +=================== + +``b``, ``setbreak`` <identifier> [fromline [toline]] [file] + Set a new breakpoint named 'identifier' for the given file + and line numbers. If no file is given, the current execution point's + filename is used. If the filename has no extension, ``.nim`` is + appended for your convenience. + If no line numbers are given, the current execution point's + line is used. If both ``fromline`` and ``toline`` are given the + breakpoint contains a line number range. Some examples if it is still + unclear: + + * ``b br1 12 15 thallo`` creates a breakpoint named ``br1`` that + will be triggered if the instruction pointer reaches one of the + lines 12-15 in the file ``thallo.nim``. 
+ * ``b br1 12 thallo`` creates a breakpoint named ``br1`` that + will be triggered if the instruction pointer reaches the + line 12 in the file ``thallo.nim``. + * ``b br1 12`` creates a breakpoint named ``br1`` that + will be triggered if the instruction pointer reaches the + line 12 in the current file. + * ``b br1`` creates a breakpoint named ``br1`` that + will be triggered if the instruction pointer reaches the + current line in the current file again. + +``breakpoints`` + Display the entire breakpoint list. + +``disable`` <identifier> + Disable a breakpoint. It remains disabled until you turn it on again + with the ``enable`` command. + +``enable`` <identifier> + Enable a breakpoint. + +Often it happens when debugging that you keep retyping the breakpoints again +and again because they are lost when you restart your program. This is not +necessary: A special pragma has been defined for this: + + +The ``{.breakpoint.}`` pragma +----------------------------- + +The `breakpoint`:idx: pragma is syntactically a statement. It can be used +to mark the *following line* as a breakpoint: + +.. code-block:: Nimrod + write("1") + {.breakpoint: "before_write_2".} + write("2") + +The name of the breakpoint here is ``before_write_2``. Of course the +breakpoint's name is optional - the compiler will generate one for you +if you leave it out. + +Code for the ``breakpoint`` pragma is only generated if the debugger +is turned on, so you don't need to remove it from your source code after +debugging. + + +Data Display Commands +===================== + +``e``, ``eval`` <exp> + Evaluate the expression <exp>. Note that ENDB has no full-blown expression + evaluator built-in. So expressions are limited: + + * To display global variables prefix their names with their + owning module: ``nim1.globalVar`` + * To display local variables or parameters just type in + their name: ``localVar``. 
If you want to inspect variables that are not + in the current stack frame, use the ``up`` or ``down`` command. + + Unfortunately, only inspecting variables is possible at the moment. Maybe + a future version will implement a full-blown Nimrod expression evaluator, + but this is not easy to do and would bloat the debugger's code. + + Since displaying the whole data structures is often not needed and + painfully slow, the debugger uses a *maximal display depth* concept for + displaying. + + You can alter the *maximal display depth* with the ``maxdisplay`` + command. + +``maxdisplay`` <natural> + Sets the maximal display depth to the given integer value. A value of 0 + means there is no maximal display depth. Default is 3. + +``o``, ``out`` <filename> <exp> + Evaluate the expression <exp> and store its string representation into a + file named <filename>. If the file does not exist, it will be created, + otherwise it will be opened for appending. + +``w``, ``where`` + Display the current execution point. + +``u``, ``up`` + Go up in the call stack. + +``d``, ``down`` + Go down in the call stack. + +``stackframe`` [file] + Displays the content of the current stack frame in ``stdout`` or + appends it to the file, depending on whether a file is given. + +``callstack`` + Display the entire call stack (but not its content). + +``l``, ``locals`` + Display the available local variables in the current stack frame. + +``g``, ``globals`` + Display all the global variables that are available for inspection. 
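Editorial sketch, not part of the commit: the guide above states that underscores and case do not matter in debugger command names. One way such a comparison could be written in C is shown below. The name `endb_cmd_equal` is hypothetical and the code is not taken from ENDB's sources; it only illustrates the stated rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (an assumption, not ENDB's real code): compares two
   command names while ignoring letter case and underscores. */
static bool endb_cmd_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    for (;;) {
        while (*a == '_') a++;   /* underscores carry no meaning */
        while (*b == '_') b++;
        if (*a == '\0' || *b == '\0')
            return *a == *b;     /* equal only if both names are exhausted */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
}
```

Under this rule `step_into`, `StepInto` and `STEPINTO` all name the same command, while the short form `s` stays a separate alias that a command table would have to list explicitly.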
diff --git a/doc/filelist.txt b/doc/filelist.txt new file mode 100755 index 000000000..cdb06cb9c --- /dev/null +++ b/doc/filelist.txt @@ -0,0 +1,44 @@ +Short description of Nimrod's modules +------------------------------------- + +============== ========================================================== +Module Description +============== ========================================================== +lexbase buffer handling of the lexical analyser +scanner lexical analyser + +ast type definitions of the abstract syntax tree (AST) and + node constructors +astalgo algorithms for containers of AST nodes; converting the + AST to YAML; the symbol table +trees a few algorithms for nodes; this module is less important +types module for traversing type graphs; also contains several + helpers for dealing with types + +sigmatch contains the matching algorithm that is used for proc + calls +semexprs contains the semantic checking phase for expressions +semstmts contains the semantic checking phase for statements +semtypes contains the semantic checking phase for types + +idents implements a general mapping from identifiers to an internal + representation (``PIdent``) that is used, so that a simple + pointer comparison suffices to say whether two Nimrod + identifiers are equivalent + +ropes implements long strings represented as trees for + lazy evaluation; used mainly by the code generators + +ccgobj contains type definitions needed for C code generation + and some helpers +ccgmangl contains the name mangler for converting Nimrod + identifiers to their C counterparts +ccgutils contains helpers for the C code generator +ccgtemps contains the handling of temporary variables for the + C code generator +ccgtypes the generator for C types +ccgstmts the generator for statements +ccgexprs the generator for expressions +extccomp this module calls the C compiler and linker; interesting + if you want to add support for a new C compiler +============== 
========================================================== diff --git a/doc/grammar.txt b/doc/grammar.txt new file mode 100755 index 000000000..6bbf2c3e7 --- /dev/null +++ b/doc/grammar.txt @@ -0,0 +1,186 @@ +module ::= ([COMMENT] [SAD] stmt)* + +optComma ::= [ ',' ] [COMMENT] [IND] +operator ::= OP0 | OR | XOR | AND | OP3 | OP4 | OP5 | IS | ISNOT | IN | NOTIN + | OP6 | DIV | MOD | SHL | SHR | OP7 | NOT + +prefixOperator ::= OP0 | OP3 | OP4 | OP5 | OP6 | OP7 | NOT + +optInd ::= [COMMENT] [IND] + + +lowestExpr ::= orExpr ( OP0 optInd orExpr )* +orExpr ::= andExpr ( OR | XOR optInd andExpr )* +andExpr ::= cmpExpr ( AND optInd cmpExpr )* +cmpExpr ::= ampExpr ( OP3 | IS | ISNOT | IN | NOTIN optInd ampExpr )* +ampExpr ::= plusExpr ( OP4 optInd plusExpr )* +plusExpr ::= mulExpr ( OP5 optInd mulExpr )* +mulExpr ::= dollarExpr ( OP6 | DIV | MOD | SHL | SHR optInd dollarExpr )* +dollarExpr ::= primary ( OP7 optInd primary )* + +namedTypeOrExpr ::= + DOTDOT [expr] + | expr [EQUALS (expr [DOTDOT expr] | typeDescK | DOTDOT [expr] ) + | DOTDOT [expr]] + | typeDescK + +castExpr ::= CAST BRACKET_LE optInd typeDesc BRACKERT_RI + PAR_LE optInd expr PAR_RI +addrExpr ::= ADDR PAR_LE optInd expr PAR_RI +symbol ::= ACC (KEYWORD | IDENT | operator | PAR_LE PAR_RI + | BRACKET_LE BRACKET_RI) ACC | IDENT +accExpr ::= KEYWORD | IDENT | operator [DOT KEYWORD | IDENT | operator] + paramList +primary ::= ( prefixOperator optInd )* ( IDENT | literal | ACC accExpr ACC + | castExpr | addrExpr ) ( + DOT optInd symbol + #| CURLY_LE namedTypeDescList CURLY_RI + | PAR_LE optInd + namedExprList + PAR_RI + | BRACKET_LE optInd + (namedTypeOrExpr optComma)* + BRACKET_RI + | CIRCUM + | pragma )* + +literal ::= INT_LIT | INT8_LIT | INT16_LIT | INT32_LIT | INT64_LIT + | FLOAT_LIT | FLOAT32_LIT | FLOAT64_LIT + | STR_LIT | RSTR_LIT | TRIPLESTR_LIT + | CHAR_LIT | RCHAR_LIT + | NIL + | BRACKET_LE optInd (expr [COLON expr] optComma )* BRACKET_RI # []-Constructor + | CURLY_LE optInd (expr [DOTDOT expr] optComma 
)* CURLY_RI # {}-Constructor + | PAR_LE optInd (expr [COLON expr] optComma )* PAR_RI # ()-Constructor + + +exprList ::= ( expr optComma )* + +namedExpr ::= expr [EQUALS expr] # actually this is symbol EQUALS expr|expr +namedExprList ::= ( namedExpr optComma )* + +exprOrSlice ::= expr [ DOTDOT expr ] +sliceList ::= ( exprOrSlice optComma )+ + +anonymousProc ::= LAMBDA paramList [pragma] EQUALS stmt +expr ::= lowestExpr + | anonymousProc + | IF expr COLON expr + (ELIF expr COLON expr)* + ELSE COLON expr + +namedTypeDesc ::= typeDescK | expr [EQUALS (typeDescK | expr)] +namedTypeDescList ::= ( namedTypeDesc optComma )* + +qualifiedIdent ::= symbol [ DOT symbol ] + +typeDescK ::= VAR typeDesc + | REF typeDesc + | PTR typeDesc + | TYPE expr + | PROC paramList [pragma] + +typeDesc ::= typeDescK | primary + +optSemicolon ::= [SEMICOLON] + +macroStmt ::= COLON [stmt] (OF [sliceList] COLON stmt + | ELIF expr COLON stmt + | EXCEPT exceptList COLON stmt )* + [ELSE COLON stmt] + +simpleStmt ::= returnStmt + | yieldStmt + | discardStmt + | raiseStmt + | breakStmt + | continueStmt + | pragma + | importStmt + | fromStmt + | includeStmt + | exprStmt +complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt + | blockStmt | asmStmt + | procDecl | iteratorDecl | macroDecl | templateDecl + | constSection | typeSection | whenStmt | varSection + +indPush ::= IND # push +stmt ::= simpleStmt [SAD] + | indPush (complexStmt | simpleStmt) + ([SAD] (complexStmt | simpleStmt) )* + DED + +exprStmt ::= lowestExpr [EQUALS expr | (expr optComma)* [macroStmt]] +returnStmt ::= RETURN [expr] +yieldStmt ::= YIELD expr +discardStmt ::= DISCARD expr +raiseStmt ::= RAISE [expr] +breakStmt ::= BREAK [symbol] +continueStmt ::= CONTINUE +ifStmt ::= IF expr COLON stmt (ELIF expr COLON stmt)* [ELSE COLON stmt] +whenStmt ::= WHEN expr COLON stmt (ELIF expr COLON stmt)* [ELSE COLON stmt] +caseStmt ::= CASE expr (OF sliceList COLON stmt)* + (ELIF expr COLON stmt)* + [ELSE COLON stmt] +whileStmt ::= 
WHILE expr COLON stmt +forStmt ::= FOR (symbol optComma)+ IN expr [DOTDOT expr] COLON stmt +exceptList ::= (qualifiedIdent optComma)* + +tryStmt ::= TRY COLON stmt + (EXCEPT exceptList COLON stmt)* + [FINALLY COLON stmt] +asmStmt ::= ASM [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT) +blockStmt ::= BLOCK [symbol] COLON stmt +importStmt ::= IMPORT ((symbol | STR_LIT | RSTR_LIT | TRIPLESTR_LIT) [AS symbol] optComma)+ +includeStmt ::= INCLUDE ((symbol | STR_LIT | RSTR_LIT | TRIPLESTR_LIT) optComma)+ +fromStmt ::= FROM (symbol | STR_LIT | RSTR_LIT | TRIPLESTR_LIT) IMPORT (symbol optComma)+ + +pragma ::= CURLYDOT_LE (expr [COLON expr] optComma)+ (CURLYDOT_RI | CURLY_RI) + +paramList ::= [PAR_LE ((symbol optComma)+ COLON typeDesc optComma)* PAR_RI] [COLON typeDesc] + +genericParams ::= BRACKET_LE (symbol [EQUALS typeDesc] )* BRACKET_RI + +procDecl ::= PROC symbol ["*"] [genericParams] + paramList [pragma] + [EQUALS stmt] +macroDecl ::= MACRO symbol ["*"] [genericParams] paramList [pragma] + [EQUALS stmt] +iteratorDecl ::= ITERATOR symbol ["*"] [genericParams] paramList [pragma] + [EQUALS stmt] +templateDecl ::= TEMPLATE symbol ["*"] [genericParams] paramList [pragma] + [EQUALS stmt] + +colonAndEquals ::= [COLON typeDesc] EQUALS expr + +constDecl ::= symbol ["*"] [pragma] colonAndEquals [COMMENT | IND COMMENT] + | COMMENT +constSection ::= CONST indPush constDecl (SAD constDecl)* DED +typeDef ::= typeDesc | recordDef | objectDef | enumDef + +recordIdentPart ::= + (symbol ["*" | "-"] [pragma] optComma)+ COLON typeDesc [COMMENT | IND COMMENT] + +recordWhen ::= WHEN expr COLON [COMMENT] recordPart + (ELIF expr COLON [COMMENT] recordPart)* + [ELSE COLON [COMMENT] recordPart] +recordCase ::= CASE expr COLON typeDesc [COMMENT] + (OF sliceList COLON [COMMENT] recordPart)* + [ELSE COLON [COMMENT] recordPart] + +recordPart ::= recordWhen | recordCase | recordIdentPart + | indPush recordPart (SAD recordPart)* DED +recordDef ::= RECORD [pragma] recordPart + +objectDef ::= OBJECT 
[pragma] [OF typeDesc] recordPart +enumDef ::= ENUM [OF typeDesc] (symbol [EQUALS expr] optComma [COMMENT | IND COMMENT])+ + +typeDecl ::= COMMENT + | symbol ["*"] [genericParams] [EQUALS typeDef] [COMMENT | IND COMMENT] + +typeSection ::= TYPE indPush typeDecl (SAD typeDecl)* DED + +colonOrEquals ::= COLON typeDesc [EQUALS expr] | EQUALS expr +varPart ::= (symbol ["*" | "-"] [pragma] optComma)+ colonOrEquals [COMMENT | IND COMMENT] +varSection ::= VAR (varPart | indPush (COMMENT|varPart) (SAD (COMMENT|varPart))* DED) diff --git a/doc/html/empty.txt b/doc/html/empty.txt new file mode 100755 index 000000000..20f9a91e3 --- /dev/null +++ b/doc/html/empty.txt @@ -0,0 +1 @@ +This file keeps several tools from deleting this subdirectory. diff --git a/doc/intern.txt b/doc/intern.txt new file mode 100755 index 000000000..4f1a7a15f --- /dev/null +++ b/doc/intern.txt @@ -0,0 +1,575 @@ +========================================= + Internals of the Nimrod Compiler +========================================= + + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + + +Directory structure +=================== + +The Nimrod project's directory structure is: + +============ ============================================== +Path Purpose +============ ============================================== +``bin`` binary files go into here +``nim`` Pascal sources of the Nimrod compiler; this + should be modified, not the Nimrod version in + ``rod``! +``rod`` Nimrod sources of the Nimrod compiler; + automatically generated from the Pascal + version +``data`` data files that are used for generating source + code go into here +``doc`` the documentation lives here; it is a bunch of + reStructuredText files +``dist`` download packages as zip archives go into here +``config`` configuration files for Nimrod go into here +``lib`` the Nimrod library lives here; ``rod`` depends + on it! 
+``web`` website of Nimrod; generated by ``genweb.py`` + from the ``*.txt`` and ``*.tmpl`` files +``koch`` the Koch Build System (written for Nimrod) +``obj`` generated ``*.obj`` files go into here +============ ============================================== + + +Bootstrapping the compiler +========================== + +The compiler is written in a subset of Pascal with special annotations so +that it can be translated to Nimrod code automatically. This conversion is +done by Nimrod itself via the undocumented ``boot`` command. Thus both Nimrod +and Free Pascal can compile the Nimrod compiler. + +Requirements for bootstrapping: + +- Free Pascal (I used version 2.2); it may not be needed +- Python (should work with 2.4 or higher) and the code generator *cog* + (included in this distribution!) + +- C compiler -- one of: + + * win32-lcc + * Borland C++ (tested with 5.5) + * Microsoft C++ + * Digital Mars C++ + * Watcom C++ (currently broken; a fix is welcome!) + * GCC + * Intel C++ + * Pelles C + * llvm-gcc + +| Compiling the compiler is a simple matter of running: +| ``koch.py boot`` +| Or you can compile by hand; this is not difficult. + +If you want to debug the compiler, use the command:: + + koch.py boot --debugger:on + +The ``koch.py`` script is Nimrod's maintenance script: Everything that has +been automated is accessible with it. It is a replacement for make and shell +scripting with the advantage that it is more portable and is easier to read. + + +Coding standards +================ + +The compiler is written in a subset of Pascal with special annotations so +that it can be translated to Nimrod code automatically. As a general rule, +Pascal code that does not translate to Nimrod automatically is forbidden. 
+ + +Porting to new platforms +======================== + +Porting Nimrod to a new architecture is pretty easy, since C is the most +portable programming language (within certain limits) and Nimrod generates +C code, porting the code generator is not necessary. + +POSIX-compliant systems on conventional hardware are usually pretty easy to +port: Add the platform to ``platform`` (if it is not already listed there), +check that the OS, System modules work and recompile Nimrod. + +The only case where things aren't as easy is when the garbage +collector needs some assembler tweaking to work. The standard +version of the GC uses C's ``setjmp`` function to store all registers +on the hardware stack. It may be that the new platform needs to +replace this generic code by some assembler code. + + +Runtime type information +======================== + +*Runtime type information* (RTTI) is needed for several aspects of the Nimrod +programming language: + +Garbage collection + The most important reason for RTTI. Generating + traversal procedures produces bigger code and is likely to be slower on + modern hardware as dynamic procedure binding is hard to predict. + +Complex assignments + Sequences and strings are implemented as + pointers to resizeable buffers, but Nimrod requires copying for + assignments. Apart from RTTI the compiler could generate copy procedures + for any type that needs one. However, this would make the code bigger and + the RTTI is likely already there for the GC. + +We already knew the type information as a graph in the compiler. +Thus we need to serialize this graph as RTTI for C code generation. +Look at the files ``lib/typeinfo.nim``, ``lib/hti.nim`` for more information. + +However, generating type information proved to be difficult and the format +wastes memory. Variant records make problems too. We use a mix of iterator +procedures and constant data structures: + +.. 
code-block:: Nimrod + type + TNimTypeSlot {.export.} = record + offset: int + typ: int + name: CString + TSlotIterator = proc (obj: pointer, field: int): ptr TNimTypeSlot + TNimType {.export.} = record + Kind: TNimTypeKind + baseType, indexType: int + size, len: int + slots: TSlotIterator # instead of: ptr array [0..10_000, TNimTypeSlot] + +This is not easy to understand either. Best is to use just the ``rodgen`` +module and store type information as string constants. + +After some thought I came to the conclusion that this is again premature +optimization. We should just construct the type graph at runtime. In the init +section new types should be constructed and registered: + +.. code-block:: Nimrod + type + TSlotTriple = record + offset: int + typ: PRTL_Type + name: Cstring + PSlots = ptr TSlots + TSlots = record + case kind + of linear: + fields: array [TSlotTriple] + of nested: + discriminant: TSlotTriple + otherSlots: array [discriminant, PSlots] + + TTypeKind = enum ... + RTL_Type = record + size: int + base: PRTL_Type + case kind + of tyArray, tySequence: + elemSize: int + of tyRecord, tyObject, tyEnum: + slots: PSlots + + +The Garbage Collector +===================== + +Introduction +------------ + +We use the term *cell* here to refer to everything that is traced +(sequences, refs, strings). +This section describes how the new GC works. The old algorithms +all had the same problem: they were too complex to get right. This one +tries to find the right compromise. + +The basic algorithm is *Deferred reference counting* with cycle detection. +References in the stack are not counted for better performance and easier C +code generation. The GC starts by traversing the hardware stack and increments +the reference count (RC) of every cell that it encounters. After the GC has +done its work the stack is traversed again and the RC of every cell +that it encounters is decremented again. Thus no marking bits in the RC are +needed. 
Between these stack traversals the GC has a completely accurate view of +the RCs. + +Each cell has a header consisting of a RC and a pointer to its type +descriptor. However the program does not know about these, so they are placed at +negative offsets. In the GC code the type ``PCell`` denotes a pointer +decremented by the right offset, so that the header can be accessed easily. It +is extremely important that ``pointer`` is not confused with a ``PCell`` +as this would lead to memory corruption. + + +When to trigger a collection +---------------------------- + +Since there are really two different garbage collectors (reference counting +and mark and sweep) we use two different heuristics for when to run the passes. +The RC-GC pass is fairly cheap: Thus we use an additive increase (7 pages) +for the RC_Threshold and a multiplicative increase for the CycleThreshold. + + +The AT and ZCT sets +------------------- + +The GC maintains two sets throughout the lifetime of +the program (plus two temporary ones). The AT (*any table*) simply contains +every cell. The ZCT (*zero count table*) contains every cell whose RC is +zero. This is used to reclaim most cells fast. + +The ZCT contains redundant information -- the AT alone would suffice. +However, traversing the AT to check whether each RC is zero would touch every living +cell in the heap! That's why the ZCT is updated whenever a RC drops to zero. +The ZCT is not updated when a RC is incremented from zero to one, as +this would be too costly. + + +The CellSet data structure +-------------------------- + +The AT and ZCT depend on an extremely efficient data structure for storing a +set of pointers - this is called a ``PCellSet`` in the source code. +Inserting, deleting and searching are done in constant time. However, +modifying a ``PCellSet`` during traversal leads to undefined behaviour. + +.. 
code-block:: Nimrod + type + PCellSet # hidden + + proc allocCellSet: PCellSet # make a new set + proc deallocCellSet(s: PCellSet) # empty the set and free its memory + proc incl(s: PCellSet, elem: PCell) # include an element + proc excl(s: PCellSet, elem: PCell) # exclude an element + + proc `in`(elem: PCell, s: PCellSet): bool + + iterator elements(s: PCellSet): (elem: PCell) + + +All the operations have to be performed efficiently. Because a CellSet can +become huge (the AT contains every allocated cell!) a hash table is not +suitable for this. + +We use a mixture of bitset and patricia tree for this. One node in the +patricia tree contains a bitset that describes a page of the operating system +(not always, but that doesn't matter). +So including a cell is done as follows: + +- Find the page descriptor for the page the cell belongs to. +- Set the appropriate bit in the page descriptor indicating that the + cell points to the start of a memory block. + +Removing a cell is analogous - the bit has to be set to zero. +Single page descriptors are never deleted from the tree. Typically a page +descriptor is only 19 words big, so it does not waste much by not deleting +it. Apart from that the AT and ZCT are rebuilt frequently, so removing a +single page descriptor from the tree is never necessary. + +Complete traversal is done like so:: + + for each page descriptor d: + for each bit in d: + if bit == 1: + traverse the pointer belonging to this bit + + +Further complications +--------------------- + +In Nimrod the compiler cannot always know if a reference +is stored on the stack or not. This is caused by var parameters. +Consider this example: + +.. code-block:: Nimrod + proc setRef(r: var ref TNode) = + new(r) + + proc usage = + var + r: ref TNode + setRef(r) # here we should not update the reference counts, because + # r is on the stack + setRef(r.left) # here we should update the refcounts! 
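Editorial sketch, not part of the commit: the two `setRef` calls in the example differ only in where the target slot lives. As a rough illustration of how a runtime could tell the cases apart, the C below assumes a contiguous stack whose bounds are known and skips reference-count updates for slots inside it. `Cell`, `StackBounds`, `slot_on_stack` and `unsure_assign` are hypothetical names chosen for this sketch; the real runtime may differ, and ZCT bookkeeping is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Cell { long rc; } Cell;   /* only the reference count matters here */

/* Assumed: the stack occupies the half-open address range [low, high). */
typedef struct { uintptr_t low, high; } StackBounds;

/* The membership test costs two comparisons on a contiguous stack. */
static bool slot_on_stack(const StackBounds *s, const void *slot) {
    uintptr_t a = (uintptr_t)slot;
    return a >= s->low && a < s->high;
}

/* Assign src to *dest; touch reference counts only for non-stack slots. */
static void unsure_assign(const StackBounds *s, Cell **dest, Cell *src) {
    assert(s != NULL && dest != NULL);
    if (!slot_on_stack(s, dest)) {
        if (src != NULL) src->rc++;        /* new referent gains a counted ref */
        if (*dest != NULL) (*dest)->rc--;  /* old referent loses one */
    }
    *dest = src;
}
```

In a test, a local array can stand in for the stack region: assigning through a slot inside the array leaves the count untouched, while assigning through any other slot updates it.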
+ +Though it would be possible to produce code updating the refcounts (if +necessary) before and after the call to ``setRef``, it is a complex task to +do so in the code generator. So we don't and instead decide at runtime +whether the reference is on the stack or not. The generated code looks +roughly like this: + +.. code-block:: C + void setRef(TNode** ref) { + unsureAsgnRef(ref, newObj(TNode_TI, sizeof(TNode))); + } + void usage(void) { + setRef(&r); + setRef(&r->left); + } + +Note that for systems with a contiguous stack (which most systems have) +the check whether the ref is on the stack is very cheap (only two +comparisons). Another advantage of this scheme is that the code produced is +a tiny bit smaller. + + +The algorithm in pseudo-code +---------------------------- +Now we come to the nitty-gritty. The algorithm works in several phases. + +Phase 1 - Consider references from stack +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +:: + + for each pointer p in the stack: incRef(p) + +This is necessary because references in the hardware stack are not traced for +better performance. After Phase 1 the RCs are accurate. + + +Phase 2 - Free the ZCT +~~~~~~~~~~~~~~~~~~~~~~ +This is how things used to (not) work:: + + for p in elements(ZCT): + if RC(p) == 0: + call finalizer of p + for c in children(p): decRef(c) # free its children recursively + # if necessary; the children's RC >= 1, BUT they may still be in the ZCT! + free(p) + else: + remove p from the ZCT + +Instead we do it this way. Note that the recursion is gone too! +:: + + newZCT = nil + for p in elements(ZCT): + if RC(p) == 0: + call finalizer of p + for c in children(p): + assert(RC(c) > 0) + dec(RC(c)) + if RC(c) == 0: + if newZCT == nil: newZCT = allocCellSet() + incl(newZCT, c) + free(p) + else: + # nothing to do! We will use the newZCT + + deallocCellSet(ZCT) + ZCT = newZCT + +This phase is repeated until enough memory is available or the ZCT is nil. 
+If still not enough memory is available the cycle detector gets its chance +to do something. + + +Phase 3 - Cycle detection +~~~~~~~~~~~~~~~~~~~~~~~~~ +Cycle detection works by subtracting internal reference counts:: + + newAT = allocCellSet() + + for y in elements(AT): + # pretend that y is dead: + for c in children(y): + dec(RC(c)) + # note that this should not be done recursively as we have all needed + # pointers in the AT! This makes it more efficient too! + + proc restore(y: PCell) = + # unfortunately, the recursion here cannot be eliminated easily + if y not_in newAT: + incl(newAT, y) + for c in children(y): + inc(RC(c)) # restore proper reference counts! + restore(c) + + for y in elements(AT) with rc > 0: + restore(y) + + for y in elements(AT) with rc == 0: + free(y) # if pretending worked, it was part of a cycle + + AT = newAT + + +Phase 4 - Ignore references from stack again +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +:: + + for each pointer p in the stack: + dec(RC(p)) + if RC(p) == 0: incl(ZCT, p) + +Now the RCs correctly discard any references from the stack. One can also see +this as a temporary marking operation. Things that are referenced from stack +are marked during the GC's operation and now have to be unmarked. + + + +The compiler's architecture +=========================== + +Nimrod uses the classic compiler architecture: A scanner feeds tokens to a +parser. The parser builds a syntax tree that is used by the code generator. +This syntax tree is the interface between the parser and the code generator. +It is essential to understand most of the compiler's code. + +In order to compile Nimrod correctly, type-checking has to be separated from +parsing. Otherwise generics would not work. Code generation is done for a +whole module only after it has been checked for semantics. + +.. include:: filelist.txt + +The first command line argument selects the backend. Thus the backend is +responsible for calling the parser and semantic checker. 
However, when +compiling ``import`` or ``include`` statements, the semantic checker needs to +call the backend; this is done by embedding a PBackend into a TContext. + + +The syntax tree +--------------- +The syntax tree consists of nodes which may have an arbitrary number of +children. Types and symbols are represented by other nodes, because they +may contain cycles. The AST changes its shape after semantic checking. This +is needed to make life easier for the code generators. See the "ast" module +for the type definitions. + +We use the notation ``nodeKind(fields, [sons])`` for describing +nodes. ``nodeKind[sons]`` is a short-cut for ``nodeKind([sons])``. +XXX: Description of the language's syntax and the corresponding trees. + + +How the RTL is compiled +======================= + +The system module contains the part of the RTL which needs support by +compiler magic (and the stuff that needs to be in it because the spec +says so). The C code generator generates the C code for it just like any other +module. However, calls to some procedures like ``addInt`` are inserted by +the CCG. Therefore the module ``magicsys`` contains a table +(``compilerprocs``) with all symbols that are marked as ``compilerproc``. + + +How separate compilation will work +================================== + +Soon compiling from scratch every module that's needed will become too slow as +programs grow. For easier cleaning all generated files are generated in the +directory: ``$base/rod_gen``. This cannot be changed. The generated C files +get the names of the modules they result from. A compiled Nimrod module has the +extension ``.rod`` and is a binary file. The format may change from release +to release. The rod-file is mostly a binary representation of the parse trees. + +Nimrod currently compiles any module into its own C file. Some things like +type-information, common string literals, common constant sets need to be +shared though. 
We deal with this problem by writing the shared data +in the main C file. Only "headers" are generated in the other modules. However, +each precompiled Nimrod module lists the shared data it depends on. The same +holds for procedures that have to be generated from generics. + +A big problem is that the data must get the same name each time it is compiled. + +The C compiler is only called for the C files that changed after the last +compilation (or if their object file does not exist anymore). To work +reliably, in the header comment of the C file these things are listed, so +that the C compiler is called again should they change: + +* Nimrod's Version +* the target CC +* the target OS +* the target CPU + +The version is questionable: If the resulting C file is the same, it does not +matter that Nimrod's version has increased. We do it anyway to be on the safe +side. + + +Generation of dynamic link libraries +==================================== + +Generation of dynamic link libraries or shared libraries is not difficult; the +underlying C compiler already does all the hard work for us. The problem is the +common runtime library, especially the memory manager. Note that Borland's +Delphi had exactly the same problem. The workaround is to not link the GC with +the Dll and provide an extra runtime dll that needs to be initialized. + + + +How to implement closures +========================= + +A closure is a record of a proc pointer and a context ref. The context ref +points to a garbage collected record that contains the needed variables. +An example: + +.. 
code-block:: Nimrod + + type + TListRec = record + data: string + next: ref TListRec + + proc forEach(head: ref TListRec, visitor: proc (s: string) {.closure.}) = + var it = head + while it != nil: + visitor(it.data) + it = it.next + + proc sayHello() = + var L = new List(["hallo", "Andreas"]) + var temp = "jup\xff" + forEach(L, lambda(s: string) = + io.write(temp) + io.write(s) + ) + + +This should become the following in C: + +.. code-block:: C + typedef struct ... /* List type */ + + typedef struct closure { + void (*prc)(string, void*); + void* cl_data; + } closure; + + typedef struct Tcl_data { + string temp; // all accessed variables are put in here! + } Tcl_data; + + void forEach(TListRec* head, const closure visitor) { + TListRec* it = head; + while (it != NIM_NULL) { + visitor.prc(it->data, visitor.cl_data); + it = it->next; + } + } + + void printStr(string s, void* cl_data) { + Tcl_data* x = (Tcl_data*) cl_data; + io_write(x->temp); + io_write(s); + } + + void sayHello() { + Tcl_data* data = new(...); + asgnRef(&data->temp, "jup\xff"); + ... + + closure cl; + cl.prc = printStr; + cl.cl_data = data; + forEach(L, cl); + } + + +What about nested closures? - There's not much difference: Just put all used +variables in the data record. diff --git a/doc/lib.txt b/doc/lib.txt new file mode 100755 index 000000000..4e4f9caa0 --- /dev/null +++ b/doc/lib.txt @@ -0,0 +1,48 @@ +======================= +Nimrod Standard Library +======================= + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +Though the Nimrod Standard Library is still evolving, it is already quite +usable. It is divided into basic libraries that contain modules that virtually +every program will need and advanced libraries which are more heavyweight. +Advanced libraries are in the ``lib/base`` directory. + +Basic libraries +=============== + +* `System <system.html>`_ + Basic procs and operators that every program needs. It also provides IO + facilities for reading and writing text and binary files. 
It is imported + implicitly by the compiler. Do not import it directly. It relies on compiler + magic to work. + +* `Strutils <strutils.html>`_ + This module contains common string handling operations like converting a + string into uppercase, splitting a string into substrings, searching for + substrings, replacing substrings. + +* `OS <os.html>`_ + Basic operating system facilities like retrieving environment variables, + reading command line arguments, working with directories, running shell + commands, etc. This module is -- like any other basic library -- + platform independent. + +* `Math <math.html>`_ + Mathematical operations like cosine, square root. + +* `Complex <complex.html>`_ + This module implements complex numbers and their mathematical operations. + +* `Times <times.html>`_ + The ``times`` module contains basic support for working with time. + + +Advanced libraries +================== + +* `Regexprs <regexprs.html>`_ + This module contains procedures and operators for handling regular + expressions. diff --git a/doc/manual.txt b/doc/manual.txt new file mode 100755 index 000000000..8debb92a5 --- /dev/null +++ b/doc/manual.txt @@ -0,0 +1,1742 @@ +============= +Nimrod Manual +============= + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + + +About this document +=================== + +This document describes the lexis, the syntax, and the semantics of Nimrod. + +The language constructs are explained using an extended BNF, in +which ``(a)*`` means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and +``(a)?`` means an optional *a*; an alternative spelling for optional parts is +``[a]``. The ``|`` symbol is used to mark alternatives +and has the lowest precedence. Parentheses may be used to group elements. +Non-terminals are in lowercase, terminal symbols (including keywords) are in +UPPERCASE. 
An example:: + + if_stmt ::= IF expr COLON stmts (ELIF expr COLON stmts)* [ELSE stmts] + +Other parts of Nimrod - like scoping rules or runtime semantics are only +described in an informal manner. The reason is that formal semantics are +difficult to write and understand. However, there is only one Nimrod +implementation, so one may consider it as the formal specification; +especially since the compiler's code is pretty clean (well, some parts of it). + + +Definitions +=========== + +A Nimrod program specifies a computation that acts on a memory consisting of +components called `locations`:idx:. A variable is basically a name for a +location. Each variable and location is of a certain `type`:idx:. The +variable's type is called `static type`:idx:, the location's type is called +`dynamic type`:idx:. If the static type is not the same as the dynamic type, +it is a supertype of the dynamic type. + +An `identifier`:idx: is a symbol declared as a name for a variable, type, +procedure, etc. The region of the program over which a declaration applies is +called the `scope`:idx: of the declaration. Scopes can be nested. The meaning +of an identifier is determined by the smallest enclosing scope in which the +identifier is declared. + +An expression specifies a computation that produces a value or location. +Expressions that produce locations are called `l-values`:idx:. An l-value +can denote either a location or the value the location contains, depending on +the context. Expressions whose values can be determined statically are called +`constant expressions`:idx:; they are never l-values. + +A `static error`:idx: is an error that the implementation detects before +program execution. Unless explicitly classified, an error is a static error. + +A `checked runtime error`:idx: is an error that the implementation detects +and reports at runtime. The method for reporting such errors is via *raising +exceptions*. 
However, the implementation provides a means to disable these +runtime checks. See the section pragmas_ for details. + +An `unchecked runtime error`:idx: is an error that is not guaranteed to be +detected, and can cause the subsequent behavior of the computation to +be arbitrary. Unchecked runtime errors cannot occur if only `safe`:idx: +language features are used. + + +Lexical Analysis +================ + +Encoding +-------- + +All Nimrod source files are in the UTF-8 encoding (or its ASCII subset). Other +encodings are not supported. Any of the standard platform line termination +sequences can be used - the Unix form using ASCII LF (linefeed), the Windows +form using the ASCII sequence CR LF (return followed by linefeed), or the old +Macintosh form using the ASCII CR (return) character. All of these forms can be +used equally, regardless of platform. + + +Indentation +----------- + +Nimrod's standard grammar describes an `indentation sensitive`:idx: language. +This means that all the control structures are recognized by indentation. +Indentation consists only of spaces; tabulators are not allowed. + +The terminals ``IND`` (indentation), ``DED`` (dedentation) and ``SAD`` +(same indentation) are generated by the scanner, denoting an indentation. + +These terminals are only generated for lines that are not empty or contain +only whitespace and comments. + +The parser and the scanner communicate over a stack which indentation terminal +should be generated: The stack consists of integers counting the spaces. The +stack is initialized with a zero on its top. The scanner reads from the stack: +If the current indentation token consists of more spaces than the entry at the +top of the stack, a ``IND`` token is generated, else if it consists of the same +number of spaces, a ``SAD`` token is generated. If it consists of fewer spaces, +a ``DED`` token is generated for any item on the stack that is greater than the +current. 
These items are then popped from the stack by the scanner. At the end +of the file, a ``DED`` token is generated for each number remaining on the +stack that is larger than zero. + +Because the grammar contains some optional ``IND`` tokens, the scanner cannot +push new indentation levels. This has to be done by the parser. The symbol +``indPush`` indicates that an ``IND`` token is expected; the current number of +leading spaces is pushed onto the stack by the parser. + +Comments +-------- + +`Comments`:idx: start anywhere outside a string or character literal with the +hash character ``#``. +Comments consist of a concatenation of `comment pieces`:idx:. A comment piece +starts with ``#`` and runs until the end of the line. The end of line characters +belong to the piece. If the next line only consists of a comment piece which is +aligned to the preceding one, it does not start a new comment: + +.. code-block:: nimrod + + i = 0 # This is a single comment over multiple lines belonging to the + # assignment statement. The scanner merges these two pieces. + # This is a new comment belonging to the current block, but to no particular + # statement. + i = i + 1 # This a new comment that is NOT + echo(i) # continued here, because this comment refers to the echo statement + +Comments are tokens; they are only allowed at certain places in the input file +as they belong to the syntax tree! This feature enables perfect source-to-source +transformations (such as pretty-printing) and superior documentation generators. +A side-effect is that the human reader of the code always knows exactly which +code snippet the comment refers to. + + +Identifiers & Keywords +---------------------- + +`Identifiers`:idx: in Nimrod can be any string of letters, digits +and underscores, beginning with a letter. 
Two immediately following +underscores ``__`` are not allowed:: + + letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff' + digit ::= '0'..'9' + IDENTIFIER ::= letter ( ['_'] letter | digit )* + +The following `keywords`:idx: are reserved and cannot be used as identifiers: + +.. code-block:: nimrod + :file: ../data/keywords.txt + +Some keywords are unused; they are reserved for future developments of the +language. + +Nimrod is a `style-insensitive`:idx: language. This means that it is not +case-sensitive and even underscores are ignored: +**type** is a reserved word, and so is **TYPE** or **T_Y_P_E**. The idea behind +this is that this allows programmers to use their own preferred spelling style +and libraries written by different programmers cannot use incompatible +conventions. The editors or IDE can show the identifiers as preferred. Another +advantage is that it frees the programmer from remembering the spelling of an +identifier. + + +Literal strings +--------------- + +`Literal strings`:idx: can be delimited by matching double quotes, and can +contain the following `escape sequences`:idx:\ : + +================== =================================================== + Escape sequence Meaning +================== =================================================== + ``\n`` `newline`:idx: + ``\r`` `carriage return`:idx: + ``\l`` `line feed`:idx: + ``\f`` `form feed`:idx: + ``\t`` `tabulator`:idx: + ``\v`` `vertical tabulator`:idx: + ``\\`` `backslash`:idx: + ``\"`` `quotation mark`:idx: + ``\'`` `apostrophe`:idx: + ``\d+`` `character with decimal value d`:idx:; + all decimal digits directly + following are used for the + character + ``\a`` `alert`:idx: + ``\b`` `backspace`:idx: + ``\e`` `escape`:idx: `[ESC]`:idx: + ``\xHH`` `character with hex value HH`:idx:; + exactly two hex digits are allowed +================== =================================================== + + +Strings in Nimrod may contain any 8-bit value, except embedded zeros +which are not allowed for 
compatibility with `C`:idx:. + +Literal strings can also be delimited by three double quotes +``"""`` ... ``"""``. +Literals in this form may run for several lines, may contain ``"`` and do not +interpret any escape sequences. +For convenience, when the opening ``"""`` is immediately +followed by a newline, the newline is not included in the string. +There are also `raw string literals` that are preceded with the letter ``r`` +(or ``R``) and are delimited by matching double quotes (just like ordinary +string literals) and do not interpret the escape sequences. This is especially +convenient for regular expressions or Windows paths: + +.. code-block:: nimrod + + var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab + + +Literal characters +------------------ + +Character literals are enclosed in single quotes ``''`` and can contain the +same escape sequences as strings - with one exception: ``\n`` is not allowed +as it may be wider than one character (often it is the pair CR/LF for example). +A character is not a Unicode character but a single byte. The reason for this +is efficiency: For the overwhelming majority of use-cases, the resulting +programs will still handle UTF-8 properly as UTF-8 was specially designed for +this. +Another reason is that Nimrod should support ``array[char, int]`` or +``set[char]`` efficiently as many algorithms rely on this feature. 
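The claim that byte-oriented code still handles UTF-8 properly can be illustrated in C, the compiler's target language. The following is only a sketch under stated assumptions: ``count_fields`` and ``count_code_points`` are hypothetical helpers, not part of the Nimrod library. Splitting on an ASCII delimiter works byte by byte because every byte of a multi-byte UTF-8 sequence has its high bit set and so can never be mistaken for an ASCII character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the fields of a delimiter-separated
   string by comparing single bytes; UTF-8 is never decoded.
   The delimiter is assumed to be an ASCII character. */
static size_t count_fields(const char *s, char delim) {
    assert(s != NULL);
    size_t fields = 1;
    for (; *s != '\0'; ++s) {
        if (*s == delim) ++fields;
    }
    return fields;
}

/* Hypothetical helper: counts Unicode code points by skipping UTF-8
   continuation bytes, which all have the bit pattern 10xxxxxx. */
static size_t count_code_points(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

Only code that needs to reason about whole characters, like the second helper, has to look at more than one byte at a time.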
+ + +Numerical constants +------------------- + +`Numerical constants`:idx: are of a single type and have the form:: + + hexdigit ::= digit | 'A'..'F' | 'a'..'f' + octdigit ::= '0'..'7' + bindigit ::= '0'..'1' + INT_LIT ::= digit ( ['_'] digit )* + | '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )* + | '0o' octdigit ( ['_'] octdigit )* + | '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )* + + INT8_LIT ::= INT_LIT '\'' ('i' | 'I' ) '8' + INT16_LIT ::= INT_LIT '\'' ('i' | 'I' ) '16' + INT32_LIT ::= INT_LIT '\'' ('i' | 'I' ) '32' + INT64_LIT ::= INT_LIT '\'' ('i' | 'I' ) '64' + + exponent ::= ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )* + FLOAT_LIT ::= digit (['_'] digit)* ('.' (['_'] digit)* [exponent] |exponent) + FLOAT32_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '32' + FLOAT64_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '64' + + +As can be seen in the productions, numerical constants can contain underscores +for readability. Integer and floating point literals may be given in decimal (no +prefix), binary (prefix ``0b``), octal (prefix ``0o``) and +hexadecimal (prefix ``0x``) notation. + +There exists a literal for each numerical type that is +defined. The suffix starting with an apostrophe ('\'') is called a +`type suffix`:idx:. Literals without a type suffix are of the type ``int``, +unless the literal contains a dot or an ``E`` in which case it is of +type ``float``. + +The following table specifies type suffixes: + +================= ========================= + Type Suffix Resulting type of literal +================= ========================= + ``'i8`` int8 + ``'i16`` int16 + ``'i32`` int32 + ``'i64`` int64 + ``'f32`` float32 + ``'f64`` float64 +================= ========================= + +Floating point literals may also be in binary, octal or hexadecimal +notation: +``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64`` +is approximately 1.72826e35 according to the IEEE floating point standard. 
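To make the binary floating point literal above more concrete, the following C sketch reads the digits (sign bit, 11 exponent bits, 52 mantissa bits) into a 64-bit integer and reinterprets that bit pattern as a double. The helper names are hypothetical and this is not the scanner's actual code; it also assumes that ``double`` is an IEEE 754 binary64 value with the same byte order as ``uint64_t``, which holds on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads the digits of a binary literal (without
   the "0B" prefix and type suffix) into an integer, ignoring the '_'
   separators. At most 64 digits are meaningful. */
static uint64_t parse_bin_digits(const char *digits) {
    assert(digits != NULL);
    uint64_t bits = 0;
    for (; *digits != '\0'; ++digits) {
        if (*digits == '_') continue;
        assert(*digits == '0' || *digits == '1');
        bits = (bits << 1) | (uint64_t)(*digits - '0');
    }
    return bits;
}

/* Reinterprets a 64-bit pattern as a double without converting the
   numeric value (memcpy avoids aliasing problems). */
static double bits_to_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

In the literal from the text the exponent field ``10001110100`` is 1140, so the value is scaled by 2 to the power 1140 - 1023 = 117, which is where the magnitude of roughly 1.7e35 comes from.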
+ + + +Other tokens +------------ + +The following strings denote other tokens:: + + ( ) { } [ ] , ; [. .] {. .} (. .) + : = ^ .. ` + +`..`:tok: takes precedence over other tokens that contain a dot: `{..}`:tok: are +the three tokens `{`:tok:, `..`:tok:, `}`:tok: and not the two tokens +`{.`:tok:, `.}`:tok:. + +In Nimrod one can define one's own operators. An `operator`:idx: is any +combination of the following characters that are not listed above:: + + + - * / < > + = @ $ ~ & % + ! ? ^ . | + +These keywords are also operators: +``and or not xor shl shr div mod in notin is isnot``. + + +Syntax +====== + +This section lists Nimrod's standard syntax in EBNF. How the parser receives +indentation tokens is already described in the Lexical Analysis section. + +Nimrod allows user-definable operators. +Binary operators have 8 different levels of precedence. For user-defined +operators, the precedence depends on the first character the operator consists +of. All binary operators are left-associative. + +================ ============================================== ================== =============== +Precedence level Operators First characters Terminal symbol +================ ============================================== ================== =============== + 7 (highest) ``$`` OP7 + 6 ``* / div mod shl shr %`` ``* % \ /`` OP6 + 5 ``+ -`` ``+ ~ |`` OP5 + 4 ``&`` ``&`` OP4 + 3 ``== <= < >= > != in not_in is isnot`` ``= < > !`` OP3 + 2 ``and`` OP2 + 1 ``or xor`` OP1 + 0 (lowest) ``? @ ^ ` : .`` OP0 +================ ============================================== ================== =============== + + +The grammar's start symbol is ``module``. The grammar is LL(1) and therefore +not ambiguous. + +.. include:: grammar.txt + :literal: + + + +Semantics +========= + +Constants +--------- + +`Constants`:idx: are symbols which are bound to a value. The constant's value +cannot change. The compiler must be able to evaluate the expression in a +constant declaration at compile time. 
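The requirement that a constant's expression be evaluable at compile time has a rough counterpart in C, sketched below as an analogy rather than as a description of how Nimrod implements constants. The names ``SECONDS_PER_MINUTE``, ``SECONDS_PER_DAY`` and ``hour_flags`` are made up for this illustration; the sketch assumes a C11 compiler for ``static_assert``.

```c
#include <assert.h>

/* enum members must be integer constant expressions, so the C
   compiler has to evaluate these while compiling. */
enum {
    SECONDS_PER_MINUTE = 60,
    SECONDS_PER_DAY = 24 * 60 * SECONDS_PER_MINUTE
};

/* Checked during compilation (C11), not at run time. */
static_assert(SECONDS_PER_DAY == 86400, "evaluated at compile time");

/* A constant expression may size a static array; a value that is
   only known at run time may not. */
static char hour_flags[SECONDS_PER_DAY / 3600];
```

An expression that depends on run-time input could not be used in any of these three places, which mirrors the restriction stated above.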
+ +.. + Nimrod contains a sophisticated + compile-time evaluator, so procedures declared with the ``{.noSideEffect.}`` + pragma can be used in constant expressions: + + .. code-block:: nimrod + + from strutils import findSubStr + const + x = findSubStr('a', "hallo") # x is 1; this is computed at compile time! + + +Types +----- + +All expressions have a `type`:idx: which is known at compile time. Thus Nimrod +is statically typed. One can declare new types, which is in +essence defining an identifier that can be used to denote this custom type. + +These are the major type classes: + +* ordinal types (consist of integer, bool, character, enumeration + (and subranges thereof) types) +* floating point types +* string type +* structured types +* reference (pointer) type +* procedural type +* generic type + + +Ordinal types +~~~~~~~~~~~~~ +`Ordinal types`:idx: have the following characteristics: + +- Ordinal types are countable and ordered. This property allows + the operation of functions such as ``Inc``, ``Ord``, ``Dec`` on ordinal types to + be defined. +- Ordinal values have a smallest possible value. Trying to count farther + down than the smallest value gives a checked runtime or static error. +- Ordinal values have a largest possible value. Trying to count farther + than the largest value gives a checked runtime or static error. + +Integers, bool, characters and enumeration types (and subranges of these +types) belong to ordinal types. + + +Pre-defined numerical types +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +These integer types are pre-defined: + +``int`` + the generic signed integer type; its size is platform dependent + (the compiler chooses the processor's fastest integer type); + this type should be used in general. An integer literal that has no type + suffix is of this type. + +intXX + additional signed integer types of XX bits use this naming scheme + (example: int16 is a 16 bit wide integer). + The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``. 
+ Literals of these types have the suffix 'iXX. + + +There are no `unsigned integer`:idx: types, only `unsigned operations`:idx: +that treat their arguments as unsigned. Unsigned operations all wrap around; +they may not lead to over- or underflow errors. Unsigned operations use the +``%`` postfix as convention: + +====================== ====================================================== +operation meaning +====================== ====================================================== +``a +% b`` unsigned integer addition +``a -% b`` unsigned integer subtraction +``a *% b`` unsigned integer multiplication +``a /% b`` unsigned integer division +``a %% b`` unsigned integer modulo operation +``a <% b`` treat ``a`` and ``b`` as unsigned and compare +``a <=% b`` treat ``a`` and ``b`` as unsigned and compare +``ze(a)`` extends the bits of ``a`` with zeros until it has the + width of the ``int`` type +``toU8(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 8 bits (but still the + ``int8`` type) +``toU16(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 16 bits (but still the + ``int16`` type) +``toU32(a)`` treats ``a`` as unsigned and converts it to an + unsigned integer of 32 bits (but still the + ``int32`` type) +====================== ====================================================== + +The following floating point types are pre-defined: + +``float`` + the generic floating point type; its size is platform dependent + (the compiler chooses the processor's fastest floating point type); + this type should be used in general + +floatXX + an implementation may define additional floating point types of XX bits using + this naming scheme (example: float64 is a 64 bit wide float). The current + implementation supports ``float32`` and ``float64``. Literals of these types + have the suffix 'fXX. + +`Automatic type conversion`:idx: in expressions where different kinds +of integer types are used is performed. 
However, if the type conversion +loses information, the `EInvalidValue`:idx: exception is raised. Certain cases +of the conversion error are detected at compile time. + +Automatic type conversion in expressions with different kinds +of floating point types is performed: The smaller type is +converted to the larger. Arithmetic performed on floating point types +follows the IEEE standard. Only the ``int`` type is converted to a floating +point type automatically, other integer types are not. + + +Boolean type +~~~~~~~~~~~~ +The `boolean`:idx: type is named ``bool`` in Nimrod and can be one of the two +pre-defined values ``true`` and ``false``. Conditions in while, +if, elif, when statements need to be of type bool. + +This condition holds:: + + ord(false) == 0 and ord(true) == 1 + +The operators ``not, and, or, xor, implies, <, <=, >, >=, !=, ==`` are defined +for the bool type. The ``and`` and ``or`` operators perform short-cut +evaluation. Example: + +.. code-block:: nimrod + + while p != nil and p.name != "xyz": + # p.name is not evaluated if p == nil + p = p.next + + +The size of the bool type is one byte. + + +Character type +~~~~~~~~~~~~~~ +The `character type`:idx: is named ``char`` in Nimrod. Its size is one byte. +Thus it cannot represent a UTF-8 character, but a part of it. +The reason for this is efficiency: For the overwhelming majority of use-cases, +the resulting programs will still handle UTF-8 properly as UTF-8 was specially +designed for this. +Another reason is that Nimrod can support ``array[char, int]`` or +``set[char]`` efficiently as many algorithms rely on this feature. The +`TUniChar` type is used for Unicode characters, it can represent any Unicode +character. ``TUniChar`` is declared in the ``unicode`` standard module. + + + +Enumeration types +~~~~~~~~~~~~~~~~~ +`Enumeration`:idx: types define a new type whose values consist only of the ones +specified. +The values are ordered by the order in enum's declaration. Example: + +.. 
code-block:: nimrod + + type + TDirection = enum + north, east, south, west + + +Now the following holds:: + + ord(north) == 0 + ord(east) == 1 + ord(south) == 2 + ord(west) == 3 + +Thus, north < east < south < west. The comparison operators can be used +with enumeration types. + +For better interfacing to other programming languages, the fields of enum +types can be assigned an explicit ordinal value. However, the ordinal values +have to be in ascending order. A field whose ordinal value is not +explicitly given is assigned the value of the previous field + 1. + +An explicitly ordered enum can have *holes*: + +.. code-block:: nimrod + type + TTokenType = enum + a = 2, b = 4, c = 89 # holes are valid + +However, it is then not an ordinal anymore, so it is not possible to use these +enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ`` +and ``pred`` are not available for them either. + + +Subrange types +~~~~~~~~~~~~~~ +A `subrange`:idx: type is a range of values from an ordinal type (the host +type). To define a subrange type, one must specify its limiting values: the +highest and lowest value of the type: + +.. code-block:: nimrod + type + TSubrange = range[0..5] + + +``TSubrange`` is a subrange of an integer which can only hold the values 0 +to 5. Assigning any other value to a variable of type ``TSubrange`` is a +checked runtime error (or static error if it can be statically +determined). Assignments from the base type to one of its subrange types +(and vice versa) are allowed. + +A subrange type has the same size as its base type (``int`` in the example). + + +String type +~~~~~~~~~~~ +All string literals are of the type `string`:idx:. A string in Nimrod is very +similar to a sequence of characters. However, strings in Nimrod both are +zero-terminated and have a length field. One can retrieve the length with the +builtin ``len`` procedure; the length never counts the terminating zero. 
+The assignment operator for strings always copies the string. + +Strings are compared by their lexicographical order. All comparison operators +are available. Strings can be indexed like arrays (lower bound is 0). Unlike +arrays, they can be used in case statements: + +.. code-block:: nimrod + + case paramStr(i) + of "-v": incl(options, optVerbose) + of "-h", "-?": incl(options, optHelp) + else: write(stdout, "invalid command line option!\n") + +Per convention, all strings are UTF-8 strings, but this is not enforced. For +example, when reading strings from binary files, they are merely a sequence of +bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the +i-th *unichar*. The iterator ``unichars`` from the ``unicode`` standard +module can be used for iteration over all unicode characters. + + +Structured types +~~~~~~~~~~~~~~~~ +A variable of a `structured type`:idx: can hold multiple values at the same +time. Structured types can be nested to unlimited levels. Arrays, sequences, +records, objects and sets belong to the structured types. + +Array and sequence types +~~~~~~~~~~~~~~~~~~~~~~~~ +`Arrays`:idx: are a homogeneous type, meaning that each element in the array +has the same type. Arrays always have a fixed length which is specified at +compile time (except for open arrays). They can be indexed by any ordinal type. +A parameter ``A`` may be an *open array*, in which case it is indexed by +integers from 0 to ``len(A)-1``. + +`Sequences`:idx: are similar to arrays but of dynamic length which may change +during runtime (like strings). A sequence ``S`` is always indexed by integers +from 0 to ``len(S)-1`` and its bounds are checked. Sequences can also be +constructed by the array constructor ``[]``. + +A sequence may be passed to a parameter that is of type *open array*, but +not to a multi-dimensional open array, because it is impossible to do so in an +efficient manner. 
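How a sequence and an open array parameter might look after translation to C can be sketched as follows. This is an illustration under assumptions, not the runtime's actual layout: the ``IntSeq`` record and the helpers ``seq_add``, ``seq_get`` and ``sum_open_array`` are hypothetical names. The sequence carries its own length, reads are checked against the range 0 to ``len-1``, and an open array is modelled as a pointer plus a length so that both fixed arrays and sequences can be passed to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout: a growable sequence of ints. */
typedef struct {
    size_t len;   /* number of elements in use */
    size_t cap;   /* number of elements allocated */
    int *data;
} IntSeq;

/* Appends x, growing the buffer when it is full.
   Returns false if the allocation fails. */
static bool seq_add(IntSeq *s, int x) {
    assert(s != NULL);
    if (s->len == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 4;
        int *p = realloc(s->data, new_cap * sizeof *p);
        if (p == NULL) return false;
        s->data = p;
        s->cap = new_cap;
    }
    s->data[s->len++] = x;
    return true;
}

/* Bounds-checked read: valid indices are 0 .. len-1. */
static bool seq_get(const IntSeq *s, size_t i, int *out) {
    assert(s != NULL && out != NULL);
    if (i >= s->len) return false;
    *out = s->data[i];
    return true;
}

/* An "open array" parameter modelled as pointer plus length. */
static long sum_open_array(const int *a, size_t len) {
    long total = 0;
    for (size_t i = 0; i < len; ++i) total += a[i];
    return total;
}
```

Under this model a one-dimensional sequence can be handed to ``sum_open_array`` without copying. A sequence of sequences would consist of separately allocated buffers instead of one contiguous block, which is one plausible reading of why the multi-dimensional case cannot be supported efficiently.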
+ +An array expression may be constructed by the array constructor ``[]``. +A constructed array is assignment compatible to a sequence. + +Example: + +.. code-block:: nimrod + + type + TIntArray = array[0..5, int] # an array that is indexed with 0..5 + TIntSeq = seq[int] # a sequence of integers + var + x: TIntArray + y: TIntSeq + x = [1, 2, 3, 4, 5, 6] # [] this is the array constructor that is compatible + # with arrays, open arrays and + y = [1, 2, 3, 4, 5, 6] # sequences + +The lower bound of an array may be obtained by the built-in proc +``low()``, the higher bound by ``high()``. The length may be +obtained by ``len()``. + +Arrays are always bounds checked (at compile-time or at runtime). These +checks can be disabled via pragmas or invoking the compiler with the +``--bound_checks:off`` command line switch. + + +Tuples, record and object types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A variable of a `record`:idx: or `object`:idx: type is a heterogeneous storage +container. +A record or object defines various named *fields* of a type. The assignment +operator for records and objects always copies the whole record/object. The +constructor ``()`` can be used to initialize records/objects. A field may +be given a default value. Fields with default values do not have to be listed +in a record construction, all other fields have to be listed. + +.. code-block:: nimrod + + type + TPerson = record # type representing a person + name: string # a person consists of a name + age: int = 30 # and an age whose default value is 30 + + var + person: TPerson + person = (name: "Peter") # person.age is its default value (30) + +The implementation aligns the fields for best access performance. The alignment +is done in a way that is compatible with the way the C compiler does it. + +The difference between records and objects is that objects allow inheritance. +Objects have access to their type at runtime, so that the ``is`` operator +can be used to determine the object's type. 
Assignment from an object to its +parents' object leads to a static or runtime error (the +`EInvalidObjectAssignment`:idx: exception is raised). + +.. code-block:: nimrod + + type + TPerson = object + name: string + age: int + + TStudent = object of TPerson # a student is a person + id: int # with an id field + + var + student: TStudent + person: TPerson + student = (name: "Peter", age: 89, id: 3) + person = (name: "Mary", age: 17) + assert(student is TStudent) # is true + person = student # this is an error; person has no storage for id. + + +Set type +~~~~~~~~ +The `set type`:idx: models the mathematical notion of a set. The set's +basetype can only be an ordinal type. The reason is that sets are implemented +as bit vectors. Sets are designed for high performance computing. + +Note: The sets module can be used for sets of other types. + +Sets can be constructed via the set constructor: ``{}`` is the empty set. The +empty set is type compatible with any special set type. The constructor +can also be used to include elements (and ranges of elements) in the set: + +.. 
code-block:: nimrod + + {'a'..'z', '0'..'9'} # This constructs a set that contains the + # letters from 'a' to 'z' and the digits + # from '0' to '9' + +These operations are supported by sets: + +================== ======================================================== +operation meaning +================== ======================================================== +``A + B`` union of two sets +``A * B`` intersection of two sets +``A - B`` difference of two sets (A without B's elements) +``A == B`` set equality +``A <= B`` subset relation (A is subset of B or equal to B) +``A < B`` strong subset relation (A is a real subset of B) +``e in A`` set membership (A contains element e) +``A -+- B`` symmetric set difference (= (A - B) + (B - A)) +``card(A)`` the cardinality of A (number of elements in A) +``incl(A, elem)`` same as A = A + {elem}, but may be faster +``excl(A, elem)`` same as A = A - {elem}, but may be faster +================== ======================================================== + +Reference type +~~~~~~~~~~~~~~ +References (similar to `pointers`:idx: in other programming languages) are a +way to introduce many-to-one relationships. This means different references can +point to and modify the same location in memory. References should be used +sparingly in a program. They are only needed for constructing graphs. + +Nimrod distinguishes between `traced`:idx: and `untraced`:idx: references. +Untraced references are also called *pointers*. The difference between them is +that traced references are garbage collected, untraced are not. Thus untraced +references are *unsafe*. However for certain low-level operations (accessing +the hardware) untraced references are unavoidable. + +Traced references are declared with the **ref** keyword, untraced references +are declared with the **ptr** keyword. + +The ``^`` operator can be used to dereference a reference, the ``addr`` procedure +returns the address of an item. An address is always an untraced reference. 
+Thus the usage of ``addr`` is an *unsafe* feature. + +The ``.`` (access a record field operator) and ``[]`` (array/string/sequence +index operator) operators perform implicit dereferencing operations for +reference types: + +.. code-block:: nimrod + + type + PNode = ref TNode + TNode = record + le, ri: PNode + data: int + + var + n: PNode + new(n) + n.data = 9 # no need to write n^.data + +To allocate a new traced object, the built-in procedure ``new`` has to be used. +To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and +``realloc`` can be used. The documentation of the system module contains +further information. + +Special care has to be taken if an untraced object contains traced objects like +traced references, strings or sequences: In order to free everything properly, +the built-in procedure ``finalize`` has to be called before freeing the +untraced memory manually! + +.. XXX finalizers for traced objects + +Procedural type +~~~~~~~~~~~~~~~ +A `procedural type`:idx: is internally a pointer to a procedure. ``nil`` is +an allowed value for variables of a procedural type. Nimrod uses procedural +types to achieve `functional`:idx: programming techniques. Dynamic dispatch +for OOP constructs can also be implemented with procedural types. + +Example: + +.. code-block:: nimrod + + type + TCallback = proc (x: int) {.cdecl.} + + proc printItem(x: Int) = ... + + proc forEach(c: TCallback) = + ... + + forEach(printItem) # this will NOT work because calling conventions differ + +A subtle issue with procedural types is that the calling convention of the +procedure influences the type compatibility: Procedural types are only compatible +if they have the same calling convention. + +Nimrod supports these `calling conventions`:idx:, which are all incompatible with +each other: + +`stdcall`:idx: + This is the stdcall convention as specified by Microsoft. The generated C + procedure is declared with the ``__stdcall`` keyword. 
+ +`cdecl`:idx: + The cdecl convention means that a procedure shall use the same convention + as the C compiler. Under Windows the generated C procedure is declared with + the ``__cdecl`` keyword. + +`safecall`:idx: + This is the safecall convention as specified by Microsoft. The generated C + procedure is declared with the ``__safecall`` keyword. The word *safe* + refers to the fact that all hardware registers shall be pushed to the + hardware stack. + +`inline`:idx: + The inline convention means that the caller should not call the procedure, + but inline its code directly. Note that Nimrod does not inline, but leaves + this to the C compiler. Thus it generates ``__inline`` procedures. This is + only a hint for the compiler: It may completely ignore it and + it may inline procedures that are not marked as ``inline``. + +`fastcall`:idx: + Fastcall means different things to different C compilers. One gets whatever + the C ``__fastcall`` means. + +`nimcall`:idx: + Nimcall is the default convention used for Nimrod procedures. It is the + same as ``fastcall``, but only for C compilers that support ``fastcall``. + +`closure`:idx: + Indicates that the procedure expects a context, a closure that needs + to be passed to the procedure. The implementation is the + same as ``cdecl``, but with a hidden pointer parameter (the + *closure*). The hidden parameter is always the last one. + +`syscall`:idx: + The syscall convention is the same as ``__syscall`` in C. It is used for + interrupts. + +`noconv`:idx: + The generated C code will not have any explicit calling convention and thus + use the C compiler's default calling convention. This is needed because + Nimrod's default calling convention for procedures is ``fastcall`` to + improve speed. This is unlikely to be needed by the user. + +Most calling conventions exist only for the Windows 32-bit platform. 
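As a sketch of the compatibility rule described above, the ``forEach``
example should type-check once ``printItem`` is declared with the same calling
convention as ``TCallback``. This assumes that a pragma may follow the parameter
list, as the ``procDecl`` grammar in this manual suggests. The procedure bodies
are illustrative only and have not been checked against a compiler:

.. code-block:: nimrod

  type
    TCallback = proc (x: int) {.cdecl.}

  proc printItem(x: int) {.cdecl.} =   # same convention as TCallback
    echo($x)

  proc forEach(c: TCallback) =
    c(1)                               # illustrative body: call the callback once

  forEach(printItem)  # accepted: both sides use the cdecl convention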
+ + + +Statements +---------- +Nimrod uses the common statement/expression paradigm: `Statements`:idx: do not +produce a value in contrast to expressions. Call expressions are statements. +If the called procedure returns a value, the call is not a valid statement +as statements do not produce values. To evaluate an expression for its +side-effects and throw its value away, one can use the ``discard`` +statement. + +Statements are separated into `simple statements`:idx: and +`complex statements`:idx:. +Simple statements are statements that cannot contain other statements, like +assignments, calls or the ``return`` statement; complex statements can +contain other statements. To avoid the `dangling else problem`:idx:, complex +statements always have to be indented:: + + simpleStmt ::= returnStmt + | yieldStmt + | discardStmt + | raiseStmt + | breakStmt + | continueStmt + | pragma + | importStmt + | fromStmt + | includeStmt + | exprStmt + complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt + | blockStmt | asmStmt + | procDecl | iteratorDecl | macroDecl | templateDecl + | constDecl | typeDecl | whenStmt | varStmt + + + +Discard statement +~~~~~~~~~~~~~~~~~ + +Syntax:: + + discardStmt ::= DISCARD expr + +Example: + +.. code-block:: nimrod + + discard proc_call("arg1", "arg2") # discard the return value of `proc_call` + +The `discard`:idx: statement evaluates its expression for side-effects and +throws the expression's resulting value away. If the expression has no +side-effects, this generates a static error. Ignoring the return value of a +procedure without using a discard statement is not allowed. + + +Var statement +~~~~~~~~~~~~~ + +Syntax:: + + colonOrEquals ::= COLON typeDesc [EQUALS expr] | EQUALS expr + varPart ::= (symbol ["*" | "-"] [pragma] optComma)+ colonOrEquals [COMMENT] + varStmt ::= VAR (varPart | indPush varPart (SAD varPart)* DED) + +`Var`:idx: statements declare new local and global variables and +initialize them. 
A comma seperated list of variables can be used to specify +variables of the same type: + +.. code-block:: nimrod + + var + a: int = 0 + x, y, z: int + +If an initializer is given the type can be omitted: The variable is of the +same type as the initializing expression. Variables are always initialized +with a default value if there is no initializing expression. The default +value depends on the type and is always a zero in binary. + +============================ ============================================== +Type default value +============================ ============================================== +any integer type 0 +any float 0.0 +char '\0' +bool false +ref or pointer type nil +procedural type nil +sequence nil +string nil (**not** "") +tuple[A, B, ...] (default(A), default(B), ...) + (analogous for objects and records) +array[0..., T] [default(T), ...] +range[T] default(T); this may be out of the valid range +T = enum cast[T](0); this may be an invalid value +============================ ============================================== + + +Const section +~~~~~~~~~~~~~ + +Syntax:: + + colonAndEquals ::= [COLON typeDesc] EQUALS expr + constDecl ::= CONST + indPush + symbol ["*"] [pragma] colonAndEquals + (SAD symbol ["*"] [pragma] colonAndEquals)* + DED + +Example: + +.. code-block:: nimrod + + const + MyFilename = "/home/my/file.txt" + debugMode: bool = false + +The `const`:idx: section declares symbolic constants. A symbolic constant is +a name for a constant expression. Symbolic constants only allow read-access. + + +If statement +~~~~~~~~~~~~ + +Syntax:: + + ifStmt ::= IF expr COLON stmt (ELIF expr COLON stmt)* [ELSE COLON stmt] + +Example: + +.. 
code-block:: nimrod + + var name = readLine(stdin) + + if name == "Andreas": + echo("What a nice name!") + elif name == "": + echo("Don't you have a name?") + else: + echo("Boring name...") + +The `if`:idx: statement is a simple way to make a branch in the control flow: +The expression after the keyword ``if`` is evaluated; if it is true +the corresponding statements after the ``:`` are executed. Otherwise +the expression after the ``elif`` is evaluated (if there is an +``elif`` branch); if it is true the corresponding statements after +the ``:`` are executed. This goes on until the last ``elif``. If all +conditions fail, the ``else`` part is executed. If there is no ``else`` +part, execution continues with the statement after the ``if`` statement. + + +Case statement +~~~~~~~~~~~~~~ + +Syntax:: + + caseStmt ::= CASE expr (OF sliceList COLON stmt)* + (ELIF expr COLON stmt)* + [ELSE COLON stmt] + +Example: + +.. code-block:: nimrod + + case readline(stdin) + of "delete-everything", "restart-computer": + echo("permission denied") + of "go-for-a-walk": echo("please yourself") + else: echo("unknown command") + +The `case`:idx: statement is similar to the if statement, but it represents +a multi-branch selection. The expression after the keyword ``case`` is +evaluated and if its value is in a *vallist* the corresponding statements +(after the ``of`` keyword) are executed. If the value is in no given *vallist* +the ``else`` part is executed. If there is no ``else`` part and not all +possible values that ``expr`` can hold occur in a ``vallist``, a static +error is given. This holds only for expressions of ordinal types. +If the expression is not of an ordinal type, and no ``else`` part is +given, control just passes after the ``case`` statement. + +To suppress the static error in the ordinal case the programmer needs +to write an ``else`` part with a ``nil`` statement. 
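A minimal sketch of the ordinal case described above, using only constructs
that appear elsewhere in this manual (the variable ``c`` and the messages are
made up for illustration). Since ``char`` is an ordinal type and the two ``of``
branches do not cover every character, the ``else`` part with a ``nil``
statement is what suppresses the static error:

.. code-block:: nimrod

  var c = 'x'
  case c
  of 'a'..'z': echo("lower case letter")
  of '0'..'9': echo("digit")
  else: nil    # all other characters are deliberately ignored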
+ + +When statement +~~~~~~~~~~~~~~ + +Syntax:: + + whenStmt ::= WHEN expr COLON stmt (ELIF expr COLON stmt)* [ELSE COLON stmt] + +Example: + +.. code-block:: nimrod + + when sizeof(int) == 2: + echo("running on a 16 bit system!") + elif sizeof(int) == 4: + echo("running on a 32 bit system!") + elif sizeof(int) == 8: + echo("running on a 64 bit system!") + else: + echo("cannot happen!") + +The `when`:idx: statement is almost identical to the ``if`` statement with some +exceptions: + +* Each ``expr`` has to be a constant expression (of type ``bool``). +* The statements do not open a new scope if they introduce new identifiers. +* The statements that belong to the expression that evaluated to true are + translated by the compiler, the other statements are not checked for + syntax or semantics at all! This holds also for any ``expr`` coming + after the expression that evaluated to true. + +The ``when`` statement enables conditional compilation techniques. As +a special syntatic extension, the ``when`` construct is also available +within ``record`` or ``object`` definitions. + + +Raise statement +~~~~~~~~~~~~~~~ + +Syntax:: + + raiseStmt ::= RAISE [expr] + +Example: + +.. code-block:: nimrod + raise EOS("operating system failed") + +Apart from built-in operations like array indexing, memory allocation, etc. +the ``raise`` statement is the only way to raise an exception. The +identifier has to be the name of a previously declared exception. A +comma followed by an expression may follow; the expression must be of type +``string`` or ``cstring``; this is an error message that can be extracted +with the `getCurrentExceptionMsg`:idx: procedure in the module ``system``. + +If no exception name is given, the current exception is `re-raised`:idx:. The +`ENoExceptionToReraise`:idx: exception is raised if there is no exception to +re-raise. It follows that the ``raise`` statement *always* raises an +exception. 
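The re-raise form can be sketched as follows. The example borrows ``EValue``,
``parseInt`` and ``readLine`` from the ``try`` example in the next section and
assumes they are available in the same way there; it is an illustration of the
rule above, not tested code:

.. code-block:: nimrod

  try:
    var a = readLine(stdin)
    echo($parseInt(a))
  except EValue:
    echo("not a number")
    raise    # no expression given: the current exception is re-raised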
+ + +Try statement +~~~~~~~~~~~~~ + +Syntax:: + + exceptList ::= (qualifiedIdent optComma)* + tryStmt ::= TRY COLON stmt + (EXCEPT exceptList COLON stmt)* + [FINALLY COLON stmt] + +Example: + +.. code-block:: nimrod + # read the first two lines of a text file that should contain numbers + # and tries to add them + var + f: TFile + if openFile(f, "numbers.txt"): + try: + var a = readLine(f) + var b = readLine(f) + echo("sum: " & $(parseInt(a) + parseInt(b))) + except EOverflow: + echo("overflow!") + except EValue: + echo("could not convert string to integer") + except EIO: + echo("IO error!") + finally: + closeFile(f) + +The statements after the `try`:idx: are executed in sequential order unless +an exception ``e`` is raised. If the exception type of ``e`` matches any +of the list ``exceptlist`` the corresponding statements are executed. +The statements following the ``except`` clauses are called +`exception handlers`:idx:. + +The empty `except`:idx: clause is executed if there is an exception that is +in no list. It is similiar to an ``else`` clause in ``if`` statements. + +If there is a `finally`:idx: clause, it is always executed after the +exception handlers. + +The exception is *consumed* in an exception handler. However, an +exception handler may raise another exception. If the exception is not +handled, it is propagated through the call stack. This means that often +the rest of the procedure - that is not within a ``finally`` clause - +is not executed (if an exception occurs). + + +Return statement +~~~~~~~~~~~~~~~~ + +Syntax:: + + returnStmt ::= RETURN [expr] + +Example: + +.. code-block:: nimrod + return 40+2 + +The `return`:idx: statement ends the execution of the current procedure. +It is only allowed in procedures. If there is an ``expr``, this is syntactic +sugar for: + +.. code-block:: nimrod + result = expr + return + +The `result`:idx: variable is always the return value of the procedure. It is +automatically declared by the compiler. 
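To make the equivalence concrete, the two procedures below should behave
identically according to the rule above. The names ``answer`` and ``answer2``
are invented for this sketch:

.. code-block:: nimrod

  proc answer(): int =
    return 40 + 2      # shorthand form

  proc answer2(): int =
    result = 40 + 2    # what the shorthand stands for
    return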
+ + +Yield statement +~~~~~~~~~~~~~~~ + +Syntax:: + + yieldStmt ::= YIELD expr + +Example: + +.. code-block:: nimrod + yield (1, 2, 3) + +The `yield`:idx: statement is used instead of the ``return`` statement in +iterators. It is only valid in iterators. Execution is returned to the body +of the for loop that called the iterator. Yield does not end the iteration +process, but execution is passed back to the iterator when the next iteration +starts. See the section about iterators (`Iterators and the for statement`_) +for further information. + + +Block statement +~~~~~~~~~~~~~~~ + +Syntax:: + + blockStmt ::= BLOCK [symbol] COLON stmt + +Example: + +.. code-block:: nimrod + var found = false + block myblock: + for i in 0..3: + for j in 0..3: + if a[j][i] == 7: + found = true + break myblock # leave the block, in this case both for-loops + echo(found) + +The block statement is a means to group statements into a (named) `block`:idx:. +Inside the block, the ``break`` statement is allowed to leave the block +immediately. A ``break`` statement can contain the name of a surrounding +block to specify which block is to be left. + + +Break statement +~~~~~~~~~~~~~~~ + +Syntax:: + + breakStmt ::= BREAK [symbol] + +Example: + +.. code-block:: nimrod + break + +The `break`:idx: statement is used to leave a block immediately. If ``symbol`` +is given, it is the name of the enclosing block that is to be left. If it is +absent, the innermost block is left. + + +While statement +~~~~~~~~~~~~~~~ + +Syntax:: + + whileStmt ::= WHILE expr COLON stmt + +Example: + +.. code-block:: nimrod + echo("Please tell me your password: \n") + var pw = readLine(stdin) + while pw != "12345": + echo("Wrong password! Next try: \n") + pw = readLine(stdin) + + +The `while`:idx: statement is executed until the ``expr`` evaluates to false. +Endless loops are no error. ``while`` statements open an `implicit block`, +so that they can be left with a ``break`` statement. 
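The implicit block of a ``while`` loop can be sketched like this: the loop
condition is always true, and the unnamed ``break`` is the only way out. The
counter ``i`` and the limit ``3`` are arbitrary choices for the illustration:

.. code-block:: nimrod

  var i = 0
  while true:            # an endless loop is not an error
    if i == 3: break     # leaves the implicit block opened by ``while``
    echo($i)
    inc(i)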
+ + +Continue statement +~~~~~~~~~~~~~~~~~~ + +Syntax:: + + continueStmt ::= CONTINUE + +A `continue`:idx: statement leads to the immediate next iteration of the +surrounding loop construct. It is only allowed within a loop. A continue +statement is syntactic sugar for a nested block: + +.. code-block:: nimrod + while expr1: + stmt1 + continue + stmt2 + + # is equivalent to: + while expr1: + block myBlockName: + stmt1 + break myBlockName + stmt2 + + +Assembler statement +~~~~~~~~~~~~~~~~~~~ +Syntax:: + + asmStmt ::= ASM [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT) + +The direct embedding of `assembler`:idx: code into Nimrod code is supported +by the unsafe ``asm`` statement. Identifiers in the assembler code that refer to +Nimrod identifiers shall be enclosed in a special character which can be +specified in the statement's pragmas. The default special character is ``'`'``. + + +Procedures +~~~~~~~~~~ +What most programming languages call `methods`:idx: or `functions`:idx: are +called `procedures`:idx: in Nimrod (which is the correct terminology). A +procedure declaration defines an identifier and associates it with a block +of code. A procedure may call itself recursively. The syntax is:: + + paramList ::= [PAR_LE ((symbol optComma)+ COLON typeDesc optComma)* PAR_RI] + [COLON typeDesc] + genericParams ::= BRACKET_LE (symbol [EQUALS typeDesc] )* BRACKET_RI + + procDecl ::= PROC symbol ["*"] [genericParams] paramList [pragma] + [EQUALS stmt] + +If the ``EQUALS stmt`` part is missing, it is a `forward`:idx: declaration. If +the proc returns a value, the procedure body can access an implicitly declared +variable named `result`:idx: that represents the return value. Procs can be +overloaded. The overloading resolution algorithm tries to find the proc that is +the best match for the arguments. A parameter may be given a default value that +is used if the caller does not provide a value for this parameter. Example: + +.. 
code-block:: nimrod + + proc toLower(c: Char): Char = # toLower for characters + if c in {'A'..'Z'}: + result = chr(ord(c) + (ord('a') - ord('A'))) + else: + result = c + + proc toLower(s: string): string = # toLower for strings + result = newString(len(s)) + for i in 0..len(s) - 1: + result[i] = toLower(s[i]) # calls toLower for characters; no recursion! + +`Operators`:idx: are procedures with a special operator symbol as identifier: + +.. code-block:: nimrod + proc `$` (x: int): string = # converts an integer to a string; + # since it has one parameter this is a prefix + # operator. With two parameters it would be + # an infix operator. + return intToStr(x) + +Calling a procedure can be done in many different ways: + +.. code-block:: nimrod + proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ... + + # call with positional arguments# parameter bindings: + callme(0, 1, "abc", '\t', true) # (x=0, y=1, s="abc", c='\t', b=true) + # call with named and positional arguments: + callme(y=1, x=0, "abd", '\t') # (x=0, y=1, s="abd", c='\t', b=false) + # call with named arguments (order is not relevant): + callme(c='\t', y=1, x=0) # (x=0, y=1, s="", c='\t', b=false) + # call as a command statement: no () or , needed: + callme 0 1 "abc" '\t' + + +Iterators and the for statement +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Syntax:: + + forStmt ::= FOR (symbol optComma)+ IN expr [DOTDOT expr] COLON stmt + + paramList ::= [PAR_LE ((symbol optComma)+ COLON typeDesc optComma)* PAR_RI] + [COLON typeDesc] + genericParams ::= BRACKET_LE (symbol [EQUALS typeDesc] )* BRACKET_RI + + iteratorDecl ::= ITERATOR symbol ["*"] [genericParams] paramList [pragma] + [EQUALS stmt] + +The `for`:idx: statement is an abstract mechanism to iterate over the elements +of a container. It relies on an `iterator`:idx: to do so. Like ``while`` +statements, ``for`` statements open an `implicit block`:idx:, so that they +can be leaved with a ``break`` statement. 
The ``for`` loop declares +iteration variables (``x`` in the example) - their scope reaches until the +end of the loop body. The iteration variables' types are inferred by the +return type of the iterator. + +An iterator is similar to a procedure, except that it is always called in the +context of a ``for`` loop. Iterators provide a way to specify the iteration over +an abstract type. A key role in the execution of a ``for`` loop plays the +``yield`` statement in the called iterator. Whenever a ``yield`` statement is +reached the data is bound to the ``for`` loop variables and control continues +in the body of the ``for`` loop. The iterator's local variables and execution +state are automatically saved between calls. Example: + +.. code-block:: nimrod + # this definition exists in the system module + iterator items*(a: string): char {.inline.} = + var i = 0 + while i < len(a): + yield a[i] + inc(i) + + for ch in items("hello world"): # `ch` is an iteration variable + echo(ch) + +The compiler generates code as if the programmer would have written this: + +.. code-block:: nimrod + var i = 0 + while i < len(a): + var ch = a[i] + echo(ch) + inc(i) + +The current implementation always inlines the iterator code leading to zero +overhead for the abstraction. But this may increase the code size. Later +versions of the compiler will only inline iterators which have the calling +convention ``inline``. + +If the iterator yields a tuple, there have to be as many iteration variables +as there are components in the tuple. The i'th iteration variable's type is +the one of the i'th component. + + +Type sections +~~~~~~~~~~~~~ + +Syntax:: + + typeDef ::= typeDesc | recordDef | objectDef | enumDef + genericParams ::= BRACKET_LE (symbol [EQUALS typeDesc] )* BRACKET_RI + + typeDecl ::= TYPE + indPush + symbol ["*"] [genericParams] [EQUALS typeDef] + (SAD symbol ["*"] [genericParams] [EQUALS typeDef])* + DED + +Example: + +.. 
code-block:: nimrod + type # example demonstrates mutually recursive types + PNode = ref TNode # a traced pointer to a TNode + TNode = record + le, ri: PNode # left and right subtrees + sym: ref TSym # leaves contain a reference to a TSym + + TSym = record # a symbol + name: string # the symbol's name + line: int # the line the symbol was declared in + code: PNode # the symbol's abstract syntax tree + +A `type`:idx: section begins with the ``type`` keyword. It contains multiple +type definitions. A type definition binds a type to a name. Type definitions +can be recursive or even mutually recursive. Mutually recursive types are only +possible within a single ``type`` section. + + +Generics +~~~~~~~~ + +Example: + +.. code-block:: nimrod + type + TBinaryTree[T] = record # TBinaryTree is a generic type with + # generic param ``T`` + le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil + data: T # the data stored in a node + PBinaryTree[T] = ref TBinaryTree[T] # a shorthand for notational convenience + + proc newNode[T](data: T): PBinaryTree[T] = # constructor for a node + new(result) + result.data = data + + proc add[T](root: var PBinaryTree[T], n: PBinaryTree[T]) = + if root == nil: + root = n + else: + var it = root + while it != nil: + var c = cmp(it.data, n.data) # compare the data items; uses + # the generic ``cmp`` proc that works for + # any type that has a ``==`` and ``<`` + # operator + if c < 0: + if it.le == nil: + it.le = n + return + it = it.le + else: + if it.ri == nil: + it.ri = n + return + it = it.ri + + iterator inorder[T](root: PBinaryTree[T]): T = + # inorder traversal of a binary tree + # recursive iterators are not yet implemented, so this does not work in + # the current compiler! 
+ if root.le != nil: + yield inorder(root.le) + yield root.data + if root.ri != nil: + yield inorder(root.ri) + + var + root: PBinaryTree[string] # instantiate a PBinaryTree with the type string + add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and + add(root, newNode("world")) # ``add`` + for str in inorder(root): + writeln(stdout, str) + +`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types with +`type parameters`:idx:. Depending on context, the brackets are used either to +introduce type parameters or to instantiate a generic proc, iterator or type. + + +Templates +~~~~~~~~~ + +A `template`:idx: is a simple form of a macro. It operates on parse trees and is +processed in the semantic pass of the compiler. So they integrate well with the +rest of the language and share none of C's preprocessor macros flaws. However, +they may lead to code that is harder to understand and maintain. So one ought +to use them sparingly. The usage of ordinary procs, iterators or generics is +preferred to the usage of templates. + +Example: + +.. code-block:: nimrod + template `!=` (a, b: expr): expr = + # this definition exists in the System module + not (a == b) + + writeln(5 != 6) # the compiler rewrites that to: writeln(not (5 == 6)) + + +Macros +~~~~~~ + +`Macros`:idx: are the most powerful feature of Nimrod. They should be used +only to implement `domain specific languages`:idx:. They may lead to code +that is harder to understand and maintain. So one ought to use them sparingly. +The usage of ordinary procs, iterators or generics is preferred to the usage of +macros. + + +Modules +------- +Nimrod supports splitting a program into pieces by a `module`:idx: concept. +Modules make separate compilation possible. Each module needs to be in its +own file. Modules enable `information hiding`:idx: and +`separate compilation`:idx:. A module may gain access to symbols of another +module by the `import`:idx: statement. 
`Recursive module dependencies`:idx: are +allowed, but slightly subtle. + +The algorithm for compiling modules is: + +- Compile the whole module as usual, following import statements recursively +- If there is a cycle, only import the already parsed symbols (that are + exported); if an unknown identifier occurs then abort + +This is best illustrated by an example: + +.. code-block:: nimrod + # Module A + type + T1* = int + import B # the compiler starts parsing B + + proc main() = + var i = p(3) # works because B has been parsed completely here + + main() + + + # Module B + import A # A is not parsed here! Only the already known symbols + # of A are imported here. + + proc p*(x: A.T1): A.T1 # this works because the compiler has already + # added T1 to A's interface symbol table + + proc p(x: A.T1): A.T1 = return x + 1 + + +Scope rules +----------- +Identifiers are valid from the point of their declaration until the end of +the block in which the declaration occurred. The range where the identifier +is known is the `scope`:idx: of the identifier. The exact scope of an +identifier depends on the way it was declared. + +Block scope +~~~~~~~~~~~ +The *scope* of a variable declared in the declaration part of a block +is valid from the point of declaration until the end of the block. If a +block contains a second block, in which the identifier is redeclared, +then inside this block, the second declaration will be valid. Upon +leaving the inner block, the first declaration is valid again. An +identifier cannot be redefined in the same block, except if valid for +procedure or iterator overloading purposes. + + +Record or object scope +~~~~~~~~~~~~~~~~~~~~~~ +The field identifiers inside a record or object definition are valid in the +following places: + +* To the end of the record definition +* Field designators of a variable of the given record type. +* In all descendant types of the object type. 
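The last two rules can be sketched with the ``TPerson``/``TStudent`` types
used earlier in this manual. The field values are arbitrary, and the snippet
only illustrates the scope rules as stated; it has not been compiled:

.. code-block:: nimrod

  type
    TPerson = object
      name: string
    TStudent = object of TPerson   # a descendant type of TPerson
      id: int

  var s: TStudent
  s.name = "Peter"   # field designator: ``name`` is visible in the descendant
  s.id = 3           # field designator for TStudent's own field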
+ +Module scope +~~~~~~~~~~~~ +All identifiers in the interface part of a module are valid from the point of +declaration, until the end of the module. Furthermore, the identifiers are +known in other modules that import the module. Identifiers from indirectly +dependent modules are *not* available. The `system`:idx: module is automatically +imported into all other modules. + +If a module imports an identifier from two different modules, +each occurrence of the identifier has to be qualified, unless it is an +overloaded procedure or iterator in which case the overloading +resolution takes place: + +.. code-block:: nimrod + # Module A + var x*: string + + # Module B + var x*: int + + # Module C + import A, B + write(stdout, x) # error: x is ambiguous + write(stdout, A.x) # no error: qualifier used + + var x = 4 + write(stdout, x) # not ambiguous: uses the module C's x + + +Messages +======== + +The Nimrod compiler emits different kinds of messages: `hint`:idx:, +`warning`:idx:, and `error`:idx: messages. An *error* message is emitted if +the compiler encounters any static error. + +Pragmas +======= + +Syntax:: + + pragma ::= CURLYDOT_LE (expr [COLON expr] optComma)+ (CURLYDOT_RI | CURLY_RI) + +Pragmas are Nimrod's method to give the compiler additional information/ +commands without introducing a massive number of new keywords. Pragmas are +processed on the fly during parsing. Pragmas are always enclosed in the +special ``{.`` and ``.}`` curly brackets. + + +define pragma +------------- +The `define`:idx: pragma defines a conditional symbol. This symbol may only be +used in other pragmas and in the ``defined`` expression and not in ordinary +Nimrod source code. The conditional symbols go into a special symbol table. +The compiler defines the target processor and the target operating +system as conditional symbols. + + +undef pragma +------------ +The `undef`:idx: pragma is the counterpart to the define pragma. It undefines a +conditional symbol. 
+ + +error pragma +------------ +The `error`:idx: pragma is used to make the compiler output an error message +with the given content. Compilation currently aborts after an error, but this +may be changed in later versions. + + +fatal pragma +------------ +The `fatal`:idx: pragma is used to make the compiler output an error message +with the given content. In contrast to the ``error`` pragma, compilation +is guaranteed to be aborted by this pragma. + +warning pragma +-------------- +The `warning`:idx: pragma is used to make the compiler output a warning message +with the given content. Compilation continues after the warning. + +hint pragma +----------- +The `hint`:idx: pragma is used to make the compiler output a hint message with +the given content. Compilation continues after the hint. + + +compilation option pragmas +-------------------------- +The listed pragmas here can be used to override the code generation options +for a section of code. +:: + + "{." pragma: val {pragma: val} ".}" + + +The implementation currently provides the following possible options (later +various others may be added). + +=============== =============== ============================================ +pragma allowed values description +=============== =============== ============================================ +checks on|off Turns the code generation for all runtime + checks on or off. +bound_checks on|off Turns the code generation for array bound + checks on or off. +overflow_checks on|off Turns the code generation for over- or + underflow checks on or off. +nil_checks on|off Turns the code generation for nil pointer + checks on or off. +assertions on|off Turns the code generation for assertions + on or off. +warnings on|off Turns the warning messages of the compiler + on or off. +hints on|off Turns the hint messages of the compiler + on or off. +optimization none|speed|size Optimize the code for speed or size, or + disable optimization. 
For non-optimizing + compilers this option has no effect. + Neverless they must parse it properly. +callconv cdecl|... Specifies the default calling convention for + all procedures (and procedure types) that + follow. +=============== =============== ============================================ + +Example: + +.. code-block:: nimrod + {.checks: off, optimization: speed.} + # compile without runtime checks and optimize for speed + + +push and pop pragmas +-------------------- +The `push/pop`:idx: pragmas are very similar to the option directive, +but are used to override the settings temporarily. Example: + +.. code-block:: nimrod + {.push checks: off.} + # compile this section without runtime checks as it is + # speed critical + # ... some code ... + {.pop.} # restore old settings diff --git a/doc/nimdoc.css b/doc/nimdoc.css new file mode 100755 index 000000000..6154f0b2e --- /dev/null +++ b/doc/nimdoc.css @@ -0,0 +1,295 @@ +/* +:Author: David Goodger +:Contact: goodger@python.org +:Date: $Date: 2006-05-21 22:44:42 +0200 (Sun, 21 May 2006) $ +:Revision: $Revision: 4564 $ +:Copyright: This stylesheet has been placed in the public domain. + +Default cascading style sheet for the HTML output of Docutils. + +See http://docutils.sf.net/docs/howto/html-stylesheets.html for how to +customize this style sheet. +*/ + +/* + Modified for the Nimrod Documenation by + Andreas Rumpf +*/ + +/* used to remove borders from tables and images */ +.borderless, table.borderless td, table.borderless th { + border: 0 } + +table.borderless td, table.borderless th { + /* Override padding for "table.docutils td" with "! important". + The right padding separates the table cells. */ + padding: 0 0.5em 0 0 ! important } + +.first { + /* Override more specific margin styles with "! important". */ + margin-top: 0 ! important } + +.last, .with-subtitle { + margin-bottom: 0 ! 
important } + +.hidden { + display: none } + +a.toc-backref { + text-decoration: none ; + color: black } + +blockquote.epigraph { + margin: 2em 5em ; } + +dl.docutils dd { + margin-bottom: 0.5em } + +/* Uncomment (and remove this text!) to get bold-faced definition list terms +dl.docutils dt { + font-weight: bold } +*/ + +div.abstract { + margin: 2em 5em } + +div.abstract p.topic-title { + font-weight: bold ; + text-align: center } + +div.admonition, div.attention, div.caution, div.danger, div.error, +div.hint, div.important, div.note, div.tip, div.warning { + margin: 2em ; + border: medium outset ; + padding: 1em } + +div.admonition p.admonition-title, div.hint p.admonition-title, +div.important p.admonition-title, div.note p.admonition-title, +div.tip p.admonition-title { + font-weight: bold ; + font-family: sans-serif } + +div.attention p.admonition-title, div.caution p.admonition-title, +div.danger p.admonition-title, div.error p.admonition-title, +div.warning p.admonition-title { + color: red ; + font-weight: bold ; + font-family: sans-serif } + +/* Uncomment (and remove this text!) to get reduced vertical space in + compound paragraphs. 
+div.compound .compound-first, div.compound .compound-middle { + margin-bottom: 0.5em } + +div.compound .compound-last, div.compound .compound-middle { + margin-top: 0.5em } +*/ + +div.dedication { + margin: 2em 5em ; + text-align: center ; + font-style: italic } + +div.dedication p.topic-title { + font-weight: bold ; + font-style: normal } + +div.figure { + margin-left: 2em ; + margin-right: 2em } + +div.footer, div.header { + clear: both; + font-size: smaller } + +div.line-block { + display: block ; + margin-top: 1em ; + margin-bottom: 1em } + +div.line-block div.line-block { + margin-top: 0 ; + margin-bottom: 0 ; + margin-left: 1.5em } + +div.sidebar { + margin-left: 1em ; + border: medium outset ; + padding: 1em ; + background-color: #ffffee ; + width: 40% ; + float: right ; + clear: right } + +div.sidebar p.rubric { + font-family: sans-serif ; + font-size: medium } + +div.system-messages { + margin: 5em } + +div.system-messages h1 { + color: red } + +div.system-message { + border: medium outset ; + padding: 1em } + +div.system-message p.system-message-title { + color: red ; + font-weight: bold } + +div.topic { + margin: 2em; +} + +h1.section-subtitle, h2.section-subtitle, h3.section-subtitle, +h4.section-subtitle, h5.section-subtitle, h6.section-subtitle { + margin-top: 0.4em } + +h1.title { text-align: center } +h2.subtitle { text-align: center } +hr.docutils { width: 75% } +img.align-left { clear: left } +img.align-right { clear: right } + +ol.simple, ul.simple { + margin-bottom: 1em } + +ol.arabic { + list-style: decimal } + +ol.loweralpha { + list-style: lower-alpha } + +ol.upperalpha { + list-style: upper-alpha } + +ol.lowerroman { + list-style: lower-roman } + +ol.upperroman { + list-style: upper-roman } + +p.attribution { + text-align: right ; + margin-left: 50% } + +p.caption { + font-style: italic } + +p.credits { + font-style: italic ; + font-size: smaller } + +p.label { + white-space: nowrap } + +p.rubric { + font-weight: bold ; + font-size: larger 
; + color: maroon ; + text-align: center } + +p.sidebar-title { + font-family: sans-serif ; + font-weight: bold ; + font-size: larger } + +p.sidebar-subtitle { + font-family: sans-serif ; + font-weight: bold } + +p.topic-title { + font-weight: bold } + +pre.address { + margin-bottom: 0 ; + margin-top: 0 ; + font-family: serif ; + font-size: 100% } + +pre, span.pre { + background-color:#F9F9F9; + border:1px dotted #2F6FAB; + color:black; +} + +pre {padding:1em;} + +pre.literal-block, pre.doctest-block { + margin-left: 2em ; + margin-right: 2em } + +span.classifier { + font-family: sans-serif ; + font-style: oblique } + +span.classifier-delimiter { + font-family: sans-serif ; + font-weight: bold } + +span.interpreted { + font-family: sans-serif } + +span.option { + white-space: nowrap } + +span.pre { white-space: pre } + +span.problematic { + color: red } + +span.section-subtitle { + /* font-size relative to parent (h1..h6 element) */ + font-size: 80% } + +table.citation { + border-left: solid 1px gray; + margin-left: 1px } + +table.docinfo { + margin: 2em 4em } + +table.docutils { + margin-top: 0.5em ; + margin-bottom: 0.5em } + +table.footnote { + border-left: solid 1px black; + margin-left: 1px } + +table.docutils td, table.docutils th, +table.docinfo td, table.docinfo th { + padding-left: 0.5em ; + padding-right: 0.5em ; + vertical-align: top } + +table.docutils th.field-name, table.docinfo th.docinfo-name { + font-weight: bold ; + text-align: left ; + white-space: nowrap ; + padding-left: 0 } + +h1 tt.docutils, h2 tt.docutils, h3 tt.docutils, +h4 tt.docutils, h5 tt.docutils, h6 tt.docutils { + font-size: 100% } + +ul.auto-toc { + list-style-type: none } + +a.reference { + color: #E00000; + font-weight:bold; +} + +a.reference:hover { + color: #E00000; + background-color: #ffff00; + display: margin; + font-weight:bold; +} + +div.topic ul { + list-style-type: none; +} diff --git a/doc/nimrodc.txt b/doc/nimrodc.txt new file mode 100755 index 000000000..72e2d205f --- 
/dev/null +++ b/doc/nimrodc.txt @@ -0,0 +1,241 @@ +=================================== + Nimrod Compiler User Guide +=================================== + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + +Introduction +============ + +This document describes the usage of the *Nimrod compiler* +on the different supported platforms. It is not a definition of the Nimrod +programming system (the Nimrod manual serves that purpose). + +Nimrod is free software; it is licensed under the +`GNU General Public License <gpl.html>`_. + + +Compiler Usage +============== + +Command line switches +--------------------- +Basic command line switches are: + +.. include:: ../data/basicopt.txt + +Advanced command line switches are: + +.. include:: ../data/advopt.txt + + +Configuration file +------------------ +The ``nimrod`` executable loads the configuration file ``config/nimrod.cfg`` +unless this is suppressed by the ``--skip_cfg`` command line option. +Configuration settings can be overridden in a project specific +configuration file that is read automatically. This specific file has to +be in the same directory as the project and be of the same name, except +that its extension should be ``.cfg``. + +Command line settings have priority over configuration file settings. + + +Nimrod's directory structure +---------------------------- +The generated files that Nimrod produces all go into a subdirectory called +``rod_gen``. This makes it easy to write a script that deletes all generated +files. For example, the generated C code for the module ``path/modA.nim`` +will become ``path/rod_gen/modA.c``. + +However, the generated C code is not platform independent! C code generated for +Linux does not compile on Windows, for instance. The comment on top of the +C file lists the OS, CPU and CC the file has been compiled for. + +The library lies in ``lib``. Directly in the library directory are essential +Nimrod modules like the ``system`` and ``os`` modules. 
Under ``lib/base`` +are additional specialized libraries or interfaces to foreign libraries which +are included in the standard distribution. The ``lib/extra`` directory is +initially empty. Third party libraries should go there. In the default +configuration the compiler always searches for libraries in ``lib``, +``lib/base`` and ``lib/extra``. + + +Additional Features +=================== + +This section describes Nimrod's additional features that are not listed in the +Nimrod manual. + +New Pragmas and Options +----------------------- + +Because Nimrod generates C code it needs some "red tape" to work properly. +Thus lots of options and pragmas for tweaking the generated C code are +available. + +No_decl Pragma +~~~~~~~~~~~~~~ +The `no_decl`:idx: pragma can be applied to almost any symbol (variable, proc, +type, etc.) and is one of the most important for interoperability with C: +It tells Nimrod that it should not generate a declaration for the symbol in +the C code. Thus it makes the following possible, for example: + +.. code-block:: Nimrod + var + EOF {.import: "EOF", no_decl.}: cint # pretend EOF was a variable, as + # Nimrod does not know its value + +Varargs Pragma +~~~~~~~~~~~~~~ +The `varargs`:idx: pragma can be applied to procedures only. It tells Nimrod +that the proc can take a variable number of parameters after the last +specified parameter. Nimrod string values will be converted to C +strings automatically: + +.. code-block:: Nimrod + proc printf(formatstr: cstring) {.nodecl, varargs.} + + printf("hallo %s", "world") # "world" will be passed as C string + + +Header Pragma +~~~~~~~~~~~~~ +The `header`:idx: pragma is very similar to the ``no_decl`` pragma: It can be +applied to almost any symbol and specifies that not only it should not be +declared but also that it leads to the inclusion of a given header file: + +.. 
code-block:: Nimrod + type + PFile {.import: "FILE*", header: "<stdio.h>".} = pointer + # import C's FILE* type; Nimrod will treat it as a new pointer type + +The ``header`` pragma always expects a string constant. The string constant +names the header file: As usual for C, a system header file is enclosed +in angle brackets: ``<>``. If no angle brackets are given, Nimrod +encloses the header file in ``""`` in the generated C code. + + +No_static Pragma +~~~~~~~~~~~~~~~~ +The `no_static`:idx: pragma can be applied to almost any symbol and specifies +that it shall not be declared ``static`` in the generated C code. Note that +symbols in the interface part of a module never get declared ``static``, so +only in special cases is this pragma necessary. + + +Line_dir Option +~~~~~~~~~~~~~~~ +The `line_dir`:idx: option can be turned on or off. If on, the generated C code +contains ``#line`` directives. + + +Stack_trace Option +~~~~~~~~~~~~~~~~~~ +If the `stack_trace`:idx: option is turned on, the generated C contains code to +ensure that proper stack traces are given if the program crashes or an +uncaught exception is raised. + + +Line_trace Option +~~~~~~~~~~~~~~~~~ +The `line_trace`:idx: option implies the ``stack_trace`` option. If turned on, +the generated C contains code to ensure that proper stack traces with line +number information are given if the program crashes or an uncaught exception +is raised. + +Debugger Option +~~~~~~~~~~~~~~~ +The `debugger`:idx: option enables or disables the *Embedded Nimrod Debugger*. +See the documentation of endb_ for further information. + + +Breakpoint Pragma +~~~~~~~~~~~~~~~~~ +The *breakpoint* pragma was specially added for the sake of debugging with +ENDB. See the documentation of `endb <endb.html>`_ for further information. + + +Volatile Pragma +~~~~~~~~~~~~~~~ +The `volatile`:idx: pragma is for variables only. It declares the variable as +``volatile``, whatever that means in C/C++. 
+ +Register Pragma +~~~~~~~~~~~~~~~ +The `register`:idx: pragma is for variables only. It declares the variable as +``register``, giving the compiler a hint that the variable should be placed +in a hardware register for faster access. C compilers usually ignore this +though and for good reason: Often they do a better job without it anyway. + +In highly specific cases (a dispatch loop of interpreters for example) it +may provide benefits, though. + + +Disabling certain messages +-------------------------- +Nimrod generates some warnings and hints ("line too long") that may annoy the +user. Thus a mechanism for disabling certain messages is provided: Each hint +and warning message contains a symbol in brackets. This is the message's +identifier that can be used to enable or disable it: + +.. code-block:: Nimrod + {.warning[LineTooLong]: off.} # turn off warning about too long lines + +This is often better than disabling all warnings at once. + + +Debugging with Nimrod +===================== + +Nimrod comes with its own *Embedded Nimrod Debugger*. See +the documentation of endb_ for further information. + + +Optimizing for Nimrod +===================== + +Nimrod has no separate optimizer, but the C code that is produced is very +efficient. Most C compilers have excellent optimizers, so usually it is +not needed to optimize one's code. Nimrod has been designed to encourage +efficient code: The most readable code in Nimrod is often the most efficient +too. + +However, sometimes one has to optimize. Do it in the following order: + +1. switch off the embedded debugger (it is **slow**!) +2. turn on the optimizer and turn off runtime checks +3. profile your code to find where the bottlenecks are +4. try to find a better algorithm +5. do low-level optimizations + +This section can only help you with the last item. Note that rewriting parts +of your program in C is *never* necessary to speed up your program, because +everything that can be done in C can be done in Nimrod. 
Rewriting parts in +assembler *might*. + +Optimizing string handling +-------------------------- + +String assignments are sometimes expensive in Nimrod: They are required to +copy the whole string. However, the compiler is often smart enough to not copy +strings. Due to the argument passing semantics, strings are never copied when +passed to subroutines. The compiler does not copy strings that are returned by +a routine, because a routine returns a new string anyway. Thus it is efficient +to do: + +.. code-block:: Nimrod + var s = procA() # assignment will not copy the string; procA allocates a new + # string anyway + +However it is not efficient to do: + +.. code-block:: Nimrod + var s = varA # assignment has to copy the whole string into a new buffer! + +String case statements are optimized too. A hashing scheme is used for them +if several different string constants are used. This is likely to be more +efficient than any hand-coded scheme. diff --git a/doc/overview.txt b/doc/overview.txt new file mode 100755 index 000000000..242039086 --- /dev/null +++ b/doc/overview.txt @@ -0,0 +1,9 @@ +============================= +Nimrod Documentation Overview +============================= + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. 
include:: ../doc/docs.txt + diff --git a/doc/posix.txt b/doc/posix.txt new file mode 100755 index 000000000..e71b08f53 --- /dev/null +++ b/doc/posix.txt @@ -0,0 +1,220 @@ +Function POSIX Description +access Tests for file accessibility +alarm Schedules an alarm +asctime Converts a time structure to a string +cfgetispeed Reads terminal input baud rate +cfgetospeed Reads terminal output baud rate +cfsetispeed Sets terminal input baud rate +cfsetospeed Sets terminal output baud rate +chdir Changes current working directory +chmod Changes file mode +chown Changes owner and/or group of a file +close Closes a file +closedir Ends directory read operation +creat Creates a new file or rewrites an existing one +ctermid Generates terminal pathname +cuserid Gets user name +dup Duplicates an open file descriptor +dup2 Duplicates an open file descriptor +execl Executes a file +execle Executes a file +execlp Executes a file +execv Executes a file +execve Executes a file +execvp Executes a file +_exit Terminates a process +fcntl Manipulates an open file descriptor +fdopen Opens a stream on a file descriptor +fork Creates a process +fpathconf Gets configuration variable for an open file +fstat Gets file status +getcwd Gets current working directory +getegid Gets effective group ID +getenv Gets environment variable +geteuid Gets effective user ID +getgid Gets real group ID +getgrgid Reads groups database based on group ID +getgrnam Reads groups database based on group name +getgroups Gets supplementary group IDs +getlogin Gets user name +getpgrp Gets process group ID +getpid Gets process ID +getppid Gets parent process ID +getpwnam Reads user database based on user name +getpwuid Reads user database based on user ID +getuid Gets real user ID +isatty Determines if a file descriptor is associated with a terminal +kill Sends a kill signal to a process +link Creates a link to a file +longjmp Restores the calling environment +lseek Repositions read/write file offset +mkdir Makes a 
directory +mkfifo Makes a FIFO special file +open Opens a file +opendir Opens a directory +pathconf Gets configuration variables for a path +pause Suspends a process execution +pipe Creates an interprocess channel +read Reads from a file +readdir Reads a directory +rename Renames a file +rewinddir Resets the readdir() pointer +rmdir Removes a directory +setgid Sets group ID +setjmp Saves the calling environment for use by longjmp() +setlocale Sets or queries a program's locale +setpgid Sets a process group ID for job control +setuid Sets the user ID +sigaction Examines and changes signal action +sigaddset Adds a signal to a signal set +sigdelset Removes a signal from a signal set +sigemptyset Creates an empty signal set +sigfillset Creates a full set of signals +sigismember Tests a signal for a selected member +siglongjmp Goes to and restores signal mask +sigpending Examines pending signals +sigprocmask Examines and changes blocked signals +sigsetjmp Saves state for siglongjmp() +sigsuspend Waits for a signal +sleep Delays process execution +stat Gets information about a file +sysconf Gets system configuration information +tcdrain Waits for all output to be transmitted to the terminal +tcflow Suspends/restarts terminal output +tcflush Discards terminal data +tcgetattr Gets terminal attributes +tcgetpgrp Gets foreground process group ID +tcsendbreak Sends a break to a terminal +tcsetattr Sets terminal attributes +tcsetpgrp Sets foreground process group ID +time Determines the current calendar time +times Gets process times +ttyname Determines a terminal pathname +tzset Sets the timezone from environment variables +umask Sets the file creation mask +uname Gets system name +unlink Removes a directory entry +utime Sets file access and modification times +waitpid Waits for process termination +write Writes to a file + + +POSIX.1b function calls Function POSIX Description +aio_cancel Tries to cancel an asynchronous operation +aio_error Retrieves the error status for an 
asynchronous operation +aio_read Asynchronously reads from a file +aio_return Retrieves the return status for an asynchronous operation +aio_suspend Waits for an asynchronous operation to complete +aio_write Asynchronously writes to a file +clock_getres Gets resolution of a POSIX.1b clock +clock_gettime Gets the time according to a particular POSIX.1b clock +clock_settime Sets the time according to a particular POSIX.1b clock +fdatasync Synchronizes at least the data part of a file with the underlying media +fsync Synchronizes a file with the underlying media +kill, sigqueue Sends signals to a process +lio_listio Performs a list of I/O operations, synchronously or asynchronously +mlock Locks a range of memory +mlockall Locks the entire memory space down +mmap Maps a shared memory object (or possibly another file) into process's address space +mprotect Changes memory protection on a mapped area +mq_close Terminates access to a POSIX.1b message queue +mq_getattr Gets POSIX.1b message queue attributes +mq_notify Registers a request to be notified when a message arrives on an empty message queue +mq_open Creates/accesses a POSIX.1b message queue +mq_receive Receives a message from a POSIX.1b message queue +mq_send Sends a message on a POSIX.1b message queue +mq_setattr Sets a subset of POSIX.1b message queue attributes +msync Makes a mapping consistent with the underlying object +munlock Unlocks a range of memory +munlockall Unlocks the entire address space +munmap Undo mapping established by mmap +nanosleep Pauses execution for a number of nanoseconds +sched_get_priority_max Gets maximum priority value for a scheduler +sched_get_priority_min Gets minimum priority value for a scheduler +sched_getparam Retrieves scheduling parameters for a particular process +sched_getscheduler Retrieves scheduling algorithm for a particular purpose +sched_rr_get_interval Gets the SCHED_RR interval for the named process +sched_setparam Sets scheduling parameters for a process 
+sched_setscheduler Sets scheduling algorithm/parameters for a process +sched_yield Yields the processor +sem_close Terminates access to a POSIX.1b semaphore +sem_destroy De-initializes a POSIX.1b unnamed semaphore +sem_getvalue Gets the value of a POSIX.1b semaphore +sem_open Creates/accesses a POSIX.1b named semaphore +sem_post Posts (signals) a POSIX.1b named or unnamed semaphore +sem_unlink Destroys a POSIX.1b named semaphore +sem_wait, sem_trywait Waits on a POSIX.1b named or unnamed semaphore +shm_open Creates/accesses a POSIX.1b shared memory object +shm_unlink Destroys a POSIX.1b shared memory object +sigwaitinfo, sigtimedwait Synchronously awaits signal arrival; avoids calling handler +timer_create Creates a POSIX.1b timer based on a particular clock +timer_delete Deletes a POSIX.1b timer +timer_gettime Time remaining on a POSIX.1b timer before expiration +timer_settime Sets expiration time/interval for a POSIX.1b timer +wait, waitpid Retrieves status of a terminated process and cleans up corpse + + +POSIX.1c function calls Function POSIX Description +pthread_atfork Declares procedures to be called before and after a fork +pthread_attr_destroy Destroys a thread attribute object +pthread_attr_getdetachstate Obtains the setting of the detached state of a thread +pthread_attr_getinheritsched Obtains the setting of the scheduling inheritance of a thread +pthread_attr_getschedparam Obtains the parameters associated with the scheduling policy attribute of a thread +pthread_attr_getschedpolicy Obtains the setting of the scheduling policy of a thread +pthread_attr_getscope Obtains the setting of the scheduling scope of a thread +pthread_attr_getstackaddr Obtains the stack address of a thread +pthread_attr_getstacksize Obtains the stack size of a thread +pthread_attr_init Initializes a thread attribute object +pthread_attr_setdetachstate Adjusts the detached state of a thread +pthread_attr_setinheritsched Adjusts the scheduling inheritance of a thread 
+pthread_attr_setschedparam Adjusts the parameters associated with the scheduling policy of a thread +pthread_attr_setschedpolicy Adjusts the scheduling policy of a thread +pthread_attr_setscope Adjusts the scheduling scope of a thread +pthread_attr_setstackaddr Adjusts the stack address of a thread +pthread_attr_setstacksize Adjusts the stack size of a thread +pthread_cancel Cancels the specific thread +pthread_cleanup_pop Removes the routine from the top of a thread's cleanup stack, and if execute is nonzero, runs it +pthread_cleanup_push Places a routine on the top of a thread's cleanup stack +pthread_condattr_destroy Destroys a condition variable attribute object +pthread_condattr_getpshared Obtains the process-shared setting of a condition variable attribute object +pthread_condattr_init Initializes a condition variable attribute object +pthread_condattr_setpshared Sets the process-shared attribute in a condition variable attribute object to either PTHREAD_PROCESS_SHARED or PTHREAD_PROCESS_PRIVATE +pthread_cond_broadcast Unblocks all threads that are waiting on a condition variable +pthread_cond_destroy Destroys a condition variable +pthread_cond_init Initializes a condition variable with the attributes specified in the specified condition variable attribute object +pthread_cond_signal Unblocks at least one thread waiting on a condition variable +pthread_cond_timedwait Atomically unlocks the specified mutex, and places the calling thread into a wait state +pthread_cond_wait Atomically unlocks the specified mutex, and places the calling thread into a wait state +pthread_create Creates a thread with the attributes specified in attr +pthread_detach Marks a thread's internal data structures for deletion +pthread_equal Compares one thread handle to another thread handle +pthread_exit Terminates the calling thread +pthread_getschedparam Obtains both scheduling policy and scheduling parameters of an existing thread +pthread_getspecific Obtains the thread 
specific data value associated with the specific key in the calling thread +pthread_join Causes the calling thread to wait for the specific thread’s termination +pthread_key_create Generates a unique thread-specific key that's visible to all threads in a process +pthread_key_delete Deletes a thread specific key +pthread_kill Delivers a signal to the specified thread +pthread_mutexattr_destroy Destroys a mutex attribute object +pthread_mutexattr_getprioceiling Obtains the priority ceiling of a mutex attribute object +pthread_mutexattr_getprotocol Obtains protocol of a mutex attribute object +pthread_mutexattr_getpshared Obtains a process-shared setting of a mutex attribute object +pthread_mutexattr_init Initializes a mutex attribute object +pthread_mutexattr_setprioceiling Sets the priority ceiling attribute of a mutex attribute object +pthread_mutexattr_setprotocol Sets the protocol attribute of a mutex attribute object +pthread_mutexattr_setpshared Sets the process-shared attribute of a mutex attribute object to either PTHREAD_PROCESS_SHARED or PTHREAD_PROCESS_PRIVATE +pthread_mutex_destroy Destroys a mutex +pthread_mutex_init Initializes a mutex with the attributes specified in the specified mutex attribute object +pthread_mutex_lock Locks an unlocked mutex +pthread_mutex_trylock Tries to lock a mutex without blocking +pthread_mutex_unlock Unlocks a mutex +pthread_once Ensures that init_routine will run just once regardless of how many threads in the process call it +pthread_self Obtains a thread handle of a calling thread +pthread_setcancelstate Sets a thread's cancelability state +pthread_setcanceltype Sets a thread's cancelability type +pthread_setschedparam Adjusts the scheduling policy and scheduling parameters of an existing thread +pthread_setspecific Sets the thread-specific data value associated with the specific key in the calling thread +pthread_sigmask Examines or changes the calling thread's signal mask +pthread_testcancel Requests that any pending cancellation 
request be delivered to the calling thread + + diff --git a/doc/readme.txt b/doc/readme.txt new file mode 100755 index 000000000..02d7477d4 --- /dev/null +++ b/doc/readme.txt @@ -0,0 +1,11 @@ +============================= +Nimrod's documentation system +============================= + +This folder contains Nimrod's documentation. The documentation +is written in a format called *reStructuredText*, a markup language that reads +like ASCII and can be converted to HTML, TeX and other formats automatically! + +Unfortunately reStructuredText does not allow source code to be colorized in the +HTML page. Therefore a postprocessor runs over the generated HTML code, looking +for Nimrod code fragments and colorizing them. diff --git a/doc/regexprs.txt b/doc/regexprs.txt new file mode 100755 index 000000000..572a39260 --- /dev/null +++ b/doc/regexprs.txt @@ -0,0 +1,296 @@ +Licence of the PCRE library +=========================== + +PCRE is a library of functions to support regular expressions whose +syntax and semantics are as close as possible to those of the Perl 5 +language. + +| Written by Philip Hazel +| Copyright (c) 1997-2005 University of Cambridge + +---------------------------------------------------------------------- + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +* Neither the name of the University of Cambridge nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +Regular expression syntax and semantics +======================================= + +As the regular expressions supported by this module are enormous, +the reader is referred to http://perldoc.perl.org/perlre.html for the +full documentation of Perl's regular expressions. + +Because the backslash ``\`` is a meta character both in the Nimrod +programming language and in regular expressions, it is strongly +recommended that one uses the *raw* strings of Nimrod, so that +backslashes are interpreted by the regular expression engine:: + + r"\S" # matches any character that is not whitespace + +A regular expression is a pattern that is matched against a subject string +from left to right. Most characters stand for themselves in a pattern, and +match the corresponding characters in the subject. As a trivial example, +the pattern:: + + The quick brown fox + +matches a portion of a subject string that is identical to itself. +The power of regular expressions comes from the ability to include +alternatives and repetitions in the pattern. These are encoded in +the pattern by the use of metacharacters, which do not stand for +themselves but instead are interpreted in some special way. 
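The points above about raw strings, literal patterns, alternatives and repetitions can be tried out in a few lines. This is only a sketch: it uses Python's built-in ``re`` module as a stand-in, which is not PCRE and not the Nimrod module described in this file, although the basic features shown here behave the same way. The sample strings are made up for illustration.

```python
import re

# A raw string keeps the backslash, so the regex engine receives
# the two characters "\" and "S" unchanged.
raw_length = len(r"\S")
same_as_escaped = (r"\S" == "\\S")

# \S matches any character that is not whitespace.
non_space = re.findall(r"\S", "a b\tc")

# Ordinary characters stand for themselves: the pattern matches the
# portion of the subject that is identical to it.
literal = re.search(r"The quick brown fox", "Look: The quick brown fox jumps")

# Metacharacters add alternatives (|) and repetitions (*).
pattern = re.compile(r"The (very )*(quick|slow) brown fox")
repeated = pattern.fullmatch("The very very slow brown fox")
rejected = pattern.fullmatch("The fast brown fox")
```

Here ``repeated`` is a match object whose second group holds the chosen alternative, while ``rejected`` is ``None`` because neither alternative fits.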
+ +There are two different sets of metacharacters: those that are recognized +anywhere in the pattern except within square brackets, and those that are +recognized in square brackets. Outside square brackets, the metacharacters +are as follows: + +============== ============================================================ +meta character meaning +============== ============================================================ +``\`` general escape character with several uses +``^`` assert start of string (or line, in multiline mode) +``$`` assert end of string (or line, in multiline mode) +``.`` match any character except newline (by default) +``[`` start character class definition +``|`` start of alternative branch +``(`` start subpattern +``)`` end subpattern +``?`` extends the meaning of ``(`` + also 0 or 1 quantifier + also quantifier minimizer +``*`` 0 or more quantifier +``+`` 1 or more quantifier + also "possessive quantifier" +``{`` start min/max quantifier +============== ============================================================ + + +Part of a pattern that is in square brackets is called a "character class". +In a character class the only metacharacters are: + +============== ============================================================ +meta character meaning +============== ============================================================ +``\`` general escape character +``^`` negate the class, but only if the first character +``-`` indicates character range +``[`` POSIX character class (only if followed by POSIX syntax) +``]`` terminates the character class +============== ============================================================ + + +The following sections describe the use of each of the metacharacters. + + +Backslash +--------- +The `backslash`:idx: character has several uses. Firstly, if it is followed +by a non-alphanumeric character, it takes away any special meaning that +character may have. 
This use of backslash as an escape character applies +both inside and outside character classes. + +For example, if you want to match a ``*`` character, you write ``\*`` in +the pattern. This escaping action applies whether or not the following +character would otherwise be interpreted as a metacharacter, so it is always +safe to precede a non-alphanumeric with backslash to specify that it stands +for itself. In particular, if you want to match a backslash, you write ``\\``. + + +Non-printing characters +----------------------- +A second use of backslash provides a way of encoding non-printing characters +in patterns in a visible manner. There is no restriction on the appearance of +non-printing characters, apart from the binary zero that terminates a pattern, +but when a pattern is being prepared by text editing, it is usually easier to +use one of the following escape sequences than the binary character it +represents:: + +============== ============================================================ +character meaning +============== ============================================================ +``\a`` alarm, that is, the BEL character (hex 07) +``\e`` escape (hex 1B) +``\f`` formfeed (hex 0C) +``\n`` newline (hex 0A) +``\r`` carriage return (hex 0D) +``\t`` tab (hex 09) +``\ddd`` character with octal code ddd, or backreference +``\xhh`` character with hex code hh +============== ============================================================ + +After ``\x``, from zero to two hexadecimal digits are read (letters can be in +upper or lower case). In UTF-8 mode, any number of hexadecimal digits may +appear between ``\x{`` and ``}``, but the value of the character code must be +less than 2**31 (that is, the maximum hexadecimal value is 7FFFFFFF). If +characters other than hexadecimal digits appear between ``\x{`` and ``}``, or +if there is no terminating ``}``, this form of escape is not recognized. 
+Instead, the initial ``\x`` will be interpreted as a basic hexadecimal escape, +with no following digits, giving a character whose value is zero. + +After ``\0`` up to two further octal digits are read. In both cases, if there +are fewer than two digits, just those that are present are used. Thus the +sequence ``\0\x\07`` specifies two binary zeros followed by a BEL character +(code value 7). Make sure you supply two digits after the initial zero if +the pattern character that follows is itself an octal digit. + +The handling of a backslash followed by a digit other than 0 is complicated. +Outside a character class, PCRE reads it and any following digits as a +decimal number. If the number is less than 10, or if there have been at least +that many previous capturing left parentheses in the expression, the entire +sequence is taken as a back reference. A description of how this works is +given later, following the discussion of parenthesized subpatterns. + +Inside a character class, or if the decimal number is greater than 9 and +there have not been that many capturing subpatterns, PCRE re-reads up to +three octal digits following the backslash, and generates a single byte +from the least significant 8 bits of the value. Any subsequent digits stand +for themselves. 
For example: + +============== ============================================================ +example meaning +============== ============================================================ +``\040`` is another way of writing a space +``\40`` is the same, provided there are fewer than 40 previous + capturing subpatterns +``\7`` is always a back reference +``\11`` might be a back reference, or another way of writing a tab +``\011`` is always a tab +``\0113`` is a tab followed by the character "3" +``\113`` might be a back reference, otherwise the character with + octal code 113 +``\377`` might be a back reference, otherwise the byte consisting + entirely of 1 bits +``\81`` is either a back reference, or a binary zero followed by + the two characters "8" and "1" +============== ============================================================ + +Note that octal values of 100 or greater must not be introduced by a leading +zero, because no more than three octal digits are ever read. + +All the sequences that define a single byte value or a single UTF-8 character +(in UTF-8 mode) can be used both inside and outside character classes. In +addition, inside a character class, the sequence ``\b`` is interpreted as the +backspace character (hex 08), and the sequence ``\X`` is interpreted as the +character "X". Outside a character class, these sequences have different +meanings (see below). + +Generic character types +----------------------- +The third use of backslash is for specifying `generic character types`:idx:. 
+The following are always recognized: + +============== ============================================================ +character type meaning +============== ============================================================ +``\d`` any decimal digit +``\D`` any character that is not a decimal digit +``\s`` any whitespace character +``\S`` any character that is not a whitespace character +``\w`` any "word" character +``\W`` any "non-word" character +============== ============================================================ + +Each pair of escape sequences partitions the complete set of characters into +two disjoint sets. Any given character matches one, and only one, of each pair. + +These character type sequences can appear both inside and outside character +classes. They each match one character of the appropriate type. If the +current matching point is at the end of the subject string, all of them fail, +since there is no character to match. + +For compatibility with Perl, ``\s`` does not match the VT character (code 11). +This makes it different from the POSIX "space" class. The ``\s`` characters +are HT (9), LF (10), FF (12), CR (13), and space (32). + +A "word" character is an underscore or any character less than 256 that is +a letter or digit. The definition of letters and digits is controlled by +PCRE's low-valued character tables, and may vary if locale-specific matching +is taking place (see "Locale support" in the pcreapi page). For example, +in the "fr_FR" (French) locale, some character codes greater than 128 are +used for accented letters, and these are matched by ``\w``. + +In UTF-8 mode, characters with values greater than 128 never match ``\d``, +``\s``, or ``\w``, and always match ``\D``, ``\S``, and ``\W``. This is true +even when Unicode character property support is available. + +Simple assertions +----------------- +The fourth use of backslash is for certain `simple assertions`:idx:.
An +assertion specifies a condition that has to be met at a particular point in +a match, without consuming any characters from the subject string. The use of +subpatterns for more complicated assertions is described below. The +backslashed assertions are:: + +============== ============================================================ +assertion meaning +============== ============================================================ +``\b`` matches at a word boundary +``\B`` matches when not at a word boundary +``\A`` matches at start of subject +``\Z`` matches at end of subject or before newline at end +``\z`` matches at end of subject +``\G`` matches at first matching position in subject +============== ============================================================ + +These assertions may not appear in character classes (but note that ``\b`` +has a different meaning, namely the backspace character, inside a character +class). + +A word boundary is a position in the subject string where the current +character and the previous character do not both match ``\w`` or ``\W`` (i.e. +one matches ``\w`` and the other matches ``\W``), or the start or end of the +string if the first or last character matches ``\w``, respectively. + +The ``\A``, ``\Z``, and ``\z`` assertions differ from the traditional +circumflex and dollar in that they only ever match at the very start and +end of the subject string, whatever options are set. +The difference between ``\Z`` and ``\z`` is that ``\Z`` matches before +a newline that is the last character of the string as well as at the end +of the string, whereas ``\z`` matches only at the end. + +.. + Regular expressions in Nimrod itself! + ------------------------------------- + + 'a' -- matches the character a + 'a'-'z' -- range operator '-' + 'A' | 'B' -- alternative operator | + * 'a' -- prefix * is needed + + 'a' -- prefix + is needed + ? 'a' -- prefix ? is needed + letter -- character classes with real names! 
+ letters + white + whites + any -- any character + () -- are Nimrod syntax + ! 'a'-'z' + + -- concatentation via proc call: + + re('A' 'Z' word * ) diff --git a/doc/rst.txt b/doc/rst.txt new file mode 100755 index 000000000..c4f3805b3 --- /dev/null +++ b/doc/rst.txt @@ -0,0 +1,111 @@ +=========================================================================== + Nimrod's implementation of |rst| +=========================================================================== + +:Author: Andreas Rumpf +:Version: |nimrodversion| + +.. contents:: + +Introduction +============ + +This document describes the subset of `Docutils`_' `reStructuredText`_ as it +has been implemented in the Nimrod compiler for generating documentation. +Elements of |rst| that are not listed here have not been implemented. +Unfortunately, the specification of |rst| is quite vague, so Nimrod is not as +compatible to the original implementation as one would like. + +Even though Nimrod's |rst| parser does not parse all constructs, it is pretty +usable. The missing features can easily be circumvented. An indication of this +fact is that Nimrod's +*whole* documentation itself (including this document) is +processed by Nimrod's |rst| parser. (Which is an order of magnitude faster than +Docutils' parser.) + + +Inline elements +=============== + +Ordinary text may contain *inline elements*. + + +Bullet lists +============ + +*Bullet lists* look like this:: + + * Item 1 + * Item 2 that + spans over multiple lines + * Item 3 + * Item 4 + - bullet lists may nest + - item 3b + - valid bullet characters are ``+``, ``*`` and ``-`` + +This results in: +* Item 1 +* Item 2 that + spans over multiple lines +* Item 3 +* Item 4 + - bullet lists may nest + - item 3b + - valid bullet characters are ``+``, ``*`` and ``-`` + + +Enumerated lists +================ + +*Enumerated lists* + + +Defintion lists +=============== + +Save this code to the file "greeting.nim". 
Now compile and run it: + + ``nimrod run greeting.nim`` + +As you see, with the ``run`` command Nimrod executes the file automatically +after compilation. You can even give your program command line arguments by +appending them after the filename that is to be compiled and run: + + ``nimrod run greeting.nim arg1 arg2`` + + +Tables +====== + +Nimrod only implements simple tables of the form:: + + ================== =============== =================== + header 1 header 2 header n + ================== =============== =================== + Cell 1 Cell 2 Cell 3 + Cell 4 Cell 5; any Cell 6 + cell that is + not in column 1 + may span over + multiple lines + Cell 7 Cell 8 Cell 9 + ================== =============== =================== + +This results in: +================== =============== =================== +header 1 header 2 header n +================== =============== =================== +Cell 1 Cell 2 Cell 3 +Cell 4 Cell 5; any Cell 6 + cell that is + not in column 1 + may span over + multiple lines +Cell 7 Cell 8 Cell 9 +================== =============== =================== + + +.. |rst| replace:: reStructuredText +.. _reStructuredText: http://docutils.sourceforge.net/rst.html#reference-documentation +.. _docutils: http://docutils.sourceforge.net/ diff --git a/doc/spec.txt b/doc/spec.txt new file mode 100755 index 000000000..3bad06e97 --- /dev/null +++ b/doc/spec.txt @@ -0,0 +1,1297 @@ +==================== +Nimrod Specification +==================== + +:Author: Andreas Rumpf + +.. contents:: + + +About this document +=================== + +This document describes the lexis, the syntax, and the semantics of Nimrod. +However, this is only a first draft. Some parts need to be more precise, +features may be added to the language, etc. + +The language constructs are explained using an extended BNF, in +which ``(a)*`` means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and +``(a)?`` means an optional *a*; an alternative spelling for optional parts is +``[a]``. 
The ``|`` symbol is used to mark alternatives +and has the lowest precedence. Parentheses may be used to group elements. +Non-terminals are in lowercase, terminal symbols (including keywords) are in +UPPERCASE. An example:: + + if_stmt ::= IF expr COLON stmts (ELIF expr COLON stmts)* [ELSE stmts] + + +Definitions +=========== + +The following definitions are the same as their counterparts in the +specification of the Modula-3 programming language. + +A Nimrod program specifies a computation that acts on a sequence of digital +components called `locations`. A variable is a set of locations that +represents a mathematical value according to a convention determined by the +variable's *type*. If a value can be represented by some variable of type +``T``, then we say that the value is a *member* of ``T`` and ``T`` *contains* +the value. + +An *identifier* is a symbol declared as a name for a variable, type, +procedure, etc. The region of the program over which a declaration applies is +called the *scope* of the declaration. Scopes can be nested. The meaning of an +identifier is determined by the smallest enclosing scope in which the +identifier is declared. + +An expression specifies a computation that produces a value or variable. +Expressions that produce variables are called `designators`. A designator +can denote either a variable or the value of that variable, depending on +the context. Some designators are *readonly*, which means that they cannot +be used in contexts that might change the value of the variable. A +designator that is not readonly is called *writable*. Expressions whose +values can be determined statically are called *constant expressions*; +they are never designators. + +A `static error` is an error that the implementation must detect before +program execution. Violations of the language definition are static +errors unless they are explicitly classified as runtime errors.
+ +A `checked runtime error` is an error that the implementation must detect +and report at runtime. The method for reporting such errors is via *raising +exceptions*. However, an implementation may provide a means to disable these +runtime checks. See the section *pragmas* for details. + +An `unchecked runtime error` is an error that is not guaranteed to be +detected, and can cause the subsequent behavior of the computation to +be arbitrary. Unchecked runtime errors cannot occur if only `safe` +language features are used. + + +Lexical Analysis +================ + +Indentation +----------- + +Nimrod's standard grammar describes an `indentation sensitive` language. +This means that all the control structures are recognized by the indentation. +Indentation consists only of spaces; tabulators are not allowed. + +The terminals ``IND`` (indentation), ``DED`` (dedentation) and ``SAD`` +(same indentation) are generated by the scanner, denoting an indentation. + +These terminals are only generated for *logical lines*, i.e. not for an empty +line and not for a line with only whitespace or comments. + +The parser and the scanner use a stack to decide which indentation terminal +should be generated: The stack consists of integers counting the spaces. The +stack is initialized with a zero on its top. The scanner reads from the stack: +If the current indentation token consists of more spaces than the entry at the +top of the stack, an ``IND`` token is generated, else if it consists of the same +number of spaces, a ``SAD`` token is generated. If it consists of fewer spaces, +a ``DED`` token is generated for any item on the stack that is greater than the +current. These items are then popped from the stack by the scanner. At the end +of the file, a ``DED`` token is generated for each number remaining on the +stack that is larger than zero.
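The stack-based rule above can be sketched as a small function. This is a simplified illustration in Python, not the compiler's scanner: the function name ``indentation_tokens`` is hypothetical, and for brevity the sketch pushes new indentation levels itself, a step the actual design leaves to the parser.

```python
def indentation_tokens(lines):
    """Return the IND/SAD/DED tokens produced for the logical lines in `lines`."""
    stack = [0]  # indentation levels; initialised with a zero on top
    tokens = []
    for line in lines:
        stripped = line.strip()
        # Only logical lines produce tokens: skip empty and comment-only lines.
        if not stripped or stripped.startswith("#"):
            continue
        leading = line[: len(line) - len(line.lstrip())]
        if "\t" in leading:
            raise ValueError("tabulators are not allowed for indentation")
        spaces = len(leading)
        if spaces > stack[-1]:
            tokens.append("IND")
            stack.append(spaces)  # simplification: done by the parser in the text
        elif spaces == stack[-1]:
            tokens.append("SAD")
        else:
            # One DED for every stack entry greater than the current indentation.
            while stack[-1] > spaces:
                tokens.append("DED")
                stack.pop()
    # At end of file, one DED for each remaining entry larger than zero.
    while stack[-1] > 0:
        tokens.append("DED")
        stack.pop()
    return tokens


source = [
    "if expr:",
    "  stmt1",
    "  stmt2",
    "elif expr2:",
    "  stmts3",
    "else:",
    "  stmt4",
]
print(indentation_tokens(source))
# ['SAD', 'IND', 'SAD', 'DED', 'IND', 'DED', 'IND', 'DED']
```

The first ``SAD`` comes from the opening line, whose indentation equals the initial zero on the stack; the final ``DED`` is the one generated at the end of the file.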
+ +Because the grammar contains some optional ``IND`` tokens, the scanner cannot +push new indentation levels. This has to be done by the parser. The symbol +``IND_PUSH`` indicates that the ``IND`` token should be pushed onto the stack +by the parser. + +An example of how this works:: + + if_stmt ::= IF expr COLON stmts (ELIF expr COLON stmts)* (ELSE stmts)? + stmts ::= IND_PUSH (stmt [SAD])+ DED | stmt [SAD] + + if expr0: + stmt1 # would be valid because, SAD is not generated any longer! + + if expr0: + stmt1 # scanner: IND; parser pushes 2 onto the stack + if expr3: stmt5 # DED; ... SAD eaten by stmt5 + else: stmt6 # + if expr4: stmt7 + if expr: + stmt1 # scanner: IND; the parser pushes 2 onto the stack + stmt2 # scanner: SAD (because indentation is 2) + elif expr2: # scanner: DED (because indentation is 0) + stmts3 # scanner: IND; the parser pushes 2 onto the stack + else: # scanner: DED; + stmt4 # scanner: IND; the parser pushes 2 onto the stack + # scanner generates 1 DED, because end of file and 1 item on stack + + + +Identifiers & Keywords +---------------------- + +`Identifiers` in Nimrod can be any string of letters, digits +and underscores, beginning with a letter. Two immediately following +underscores ``__`` are not allowed:: + + letter ::= 'A'..'Z' | 'a'..'z' + digit ::= '0'..'9' + IDENTIFIER ::= letter ( ['_'] letter | digit )* + +The following `keywords` are reserved and cannot be used as identifiers:: + + ${keywords} + +Some keywords are unused; they are reserved for future developments of the +language. + +Nimrod is a `style-insensitive` language. This means that it is not +case-sensitive and even underscores are ignored: +**type** is a reserved word, and so is **TYPE** or **T_Y_P_E**. The idea behind +this is that it allows programmers to use their own preferred spelling style. +Editors can show the identifiers as configured.
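Style insensitivity can be pictured as comparing identifiers after a normalization step. The following Python sketch assumes exactly the rule stated above (case is ignored and underscores are dropped); the helper names are hypothetical and do not come from the compiler.

```python
def normalize_identifier(name):
    """Return the spelling-independent form: no underscores, all lowercase."""
    return name.replace("_", "").lower()


def same_identifier(a, b):
    """True if two spellings denote the same identifier under this rule."""
    return normalize_identifier(a) == normalize_identifier(b)


print(same_identifier("type", "TYPE"))
print(same_identifier("type", "T_Y_P_E"))
print(same_identifier("type", "typ"))
```

Under this rule the first two comparisons hold and the third does not, which is why **TYPE** and **T_Y_P_E** are reserved along with **type**.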
+ + +Literal strings +--------------- + +`Literal strings` can be delimited by matching double quotes, and can contain +the following `escape sequences`: + +================== ================================== + Escape sequence Meaning +================== ================================== + ``\n`` `newline` + ``\r`` `carriage return` + ``\l`` `line feed` + ``\f`` `form feed` + ``\t`` `tabulator` + ``\v`` `vertical tabulator` + ``\\`` `backslash` + ``\"`` `quotation mark` + ``\'`` `apostrophe` + ``\y`` `character with number y` + ``\a`` `alert` + ``\b`` `backspace` + ``\e`` `escape` `[ESC]` +================== ================================== + + +Strings in Nimrod may contain any 8-bit value, except embedded zeros, +which are not allowed for compatibility with `C`. + +Literal strings can also be delimited by three double quotes +``"""`` ... ``"""``. +Literals in this form may run for several lines, may contain ``"`` and do not +interpret any escape sequences. +For convenience, when the opening ``"""`` is immediately +followed by a newline, the newline is not included in the string. +`Raw string literals` are preceded with the letter ``r`` (or ``R``) +and are delimited by matching double quotes (just like ordinary string +literals) and do not interpret the escape sequences. + + +Literal characters +------------------ + +Character literals are enclosed in single quotes ``''`` and can contain the +same escape sequences as strings - with one exception: ``\n`` is not allowed +as it may be wider than one character (often it is the pair CR/LF for example).
+ + +Numerical constants +------------------- + +`Numerical constants` are of a single type and have the form:: + + hexdigit ::= digit | 'A'..'F' | 'a'..'f' + octdigit ::= '0'..'7' + bindigit ::= '0'..'1' + INT_LIT ::= digit ( ['_'] digit )* + | '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )* + | '0o' octdigit ( ['_'] octdigit )* + | '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )* + + INT8_LIT ::= INT_LIT '\'' ('i' | 'I' ) '8' + INT16_LIT ::= INT_LIT '\'' ('i' | 'I' ) '16' + INT32_LIT ::= INT_LIT '\'' ('i' | 'I' ) '32' + INT64_LIT ::= INT_LIT '\'' ('i' | 'I' ) '64' + + exponent ::= ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )* + FLOAT_LIT ::= digit (['_'] digit)* ('.' (['_'] digit)* [exponent] |exponent) + FLOAT32_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '32' + FLOAT64_LIT ::= ( FLOAT_LIT | INT_LIT ) '\'' ('f' | 'F') '64' + + +As can be seen in the productions, numerical constants can contain underscores +for readability. Integer and floating point literals may be given in decimal (no +prefix), binary (prefix ``0b``), octal (prefix ``0o``) and +hexadecimal (prefix ``0x``) notation. + +There exists a literal for each numerical type that is +defined. The suffix starting with an apostrophe ('\'') is called a +`type suffix`. Literals without a type suffix are of the type ``int``, unless +the literal contains a dot or an ``E`` in which case it is of type ``float``. + +The following table specifies type suffixes: + +================= ========================= + Type Suffix Resulting type of literal +================= ========================= + ``'i8`` int8 + ``'i16`` int16 + ``'i32`` int32 + ``'i64`` int64 + ``'f32`` float32 + ``'f64`` float64 +================= ========================= + +Floating point literals may also be in binary, octal or hexadecimal +notation: +``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64`` +is approximately 1.72826e35 according to the IEEE floating point standard.
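The binary floating point example can be checked by hand. The sketch below is written in Python and rests on one assumption that the text only implies: the binary digits are read as the raw 64-bit IEEE 754 pattern (1 sign bit, 11 exponent bits, 52 mantissa bits), with underscores acting purely as separators. The function name ``decode_binary_f64`` is hypothetical.

```python
import struct


def decode_binary_f64(literal):
    """Decode a 0b... literal with an 'f64 suffix as an IEEE 754 double."""
    digits, suffix = literal.split("'")
    if suffix.lower() != "f64":
        raise ValueError("this sketch only handles the 'f64 suffix")
    if digits[:2].lower() != "0b":
        raise ValueError("expected a 0b or 0B prefix")
    # Underscores are only there for readability; drop them before parsing.
    bits = int(digits[2:].replace("_", ""), 2)
    # Reinterpret the 64-bit integer as a big-endian double.
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


value = decode_binary_f64(
    "0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64"
)
print(value)
```

The exponent field ``10001110100`` is 1140, so the unbiased exponent is 1140 - 1023 = 117, and the mantissa contributes a factor of roughly 1.04; the product is about 1.728e35, in line with the figure quoted above.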
+ + + + +Comments +-------- + +`Comments` start anywhere outside a string with the hash character ``#``. +Comments run until the end of the line. Comments are tokens; they are only +allowed at certain places in the input file as they belong to the syntax. This +is essential for performing correct source-to-source transformations or +documentation generators. + + +Other tokens +------------ + +The following strings denote other tokens:: + + ( ) { } [ ] , ; [. .] {. .} (. .) + : = ^ .. ` + +``..`` takes precedence over other tokens that contain a dot: ``{..}`` are the +three tokens ``{``, ``..``, ``}`` and not the two tokens ``{.``, ``.}``. + +In Nimrod one can define one's own operators. An `operator` is any +combination of the following characters that are not listed above:: + + + - * / < > + = @ $ ~ & % + ! ? ^ . | + +These keywords are also operators: +``and or not xor shl shr div mod in notin is isnot``. + + +Syntax +====== + +This section lists Nimrod's standard syntax in EBNF. How the parser receives +indentation tokens is already described in the Lexical Analysis section. + +Nimrod allows user-definable operators. +Binary operators have 8 different levels of precedence. For user-defined +operators, the precedence depends on the first character the operator consists +of. All binary operators are left-associative. + +================ ============================================== ================== =============== +Precedence level Operators First characters Terminal symbol +================ ============================================== ================== =============== + 7 (highest) ``$`` OP7 + 6 ``* / div mod shl shr %`` ``* % \ /`` OP6 + 5 ``+ -`` ``+ ~ |`` OP5 + 4 ``&`` ``&`` OP4 + 3 ``== <= < >= > != in not_in is isnot`` ``= < > !`` OP3 + 2 ``and`` OP2 + 1 ``or xor`` OP1 + 0 (lowest) ``? @ ^ ` : .`` OP0 +================ ============================================== ================== =============== + + +The grammar's start symbol is ``module``.
The grammar is LL(1) and therefore +not ambiguous. + +.. include:: grammar.txt + :literal: + + + +Semantics +========= + +Constants +--------- + +Constants are symbols which are bound to a value. The constant's value +cannot change. The compiler must be able to evaluate the expression in a +constant declaration at compile time. This means that most of the functions in +the runtime library cannot be used in a constant declaration. + +Operators such as ``+, -, *, /, not, and, or, div, mod`` and the procedures +``addr, ord, chr, sizeof, trunc, round, frac, odd, abs`` can be used, however. +An implementation may restrict the usage of ``addr`` in constant expressions. + + +Types +----- + +All expressions have a type which is known at compile time. Thus Nimrod is +statically typed. One can declare new types, which is in +essence defining an identifier that can be used to denote this custom type +when declaring variables further in the source code. + +These are the major type classes: + +* ordinal types (consist of integer, bool, character, enumeration + (and subranges thereof) types) +* floating point types +* string type +* structured types +* reference (pointer) type +* procedural type + + +Ordinal types +~~~~~~~~~~~~~ +Ordinal types have the following characteristics: + +- Ordinal types are countable and ordered. This property allows + the operation of functions such as Inc, Ord, Dec on ordinal types to be defined. +- Ordinal values have a smallest possible value. Trying to count farther + down than the smallest value gives a checked runtime or static error. +- Ordinal values have a largest possible value. Trying to count farther + than the largest value gives a checked runtime or static error. + +Signed integers, bool, characters and enumeration types (and subranges of these +types) belong to ordinal types. Unsigned integer types are special in the way +that over- and underflows generate no errors, but wrap around.
+ +Pre-defined numerical types +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +These integer types are pre-defined: + +``int`` + the generic signed integer type; its size is platform dependant + (a compiler should choose the processor's fastest integer type) + this type should be used in general. An integer literal that has no type + suffix is of this type. + +intXX + an implementation may define additional signed integer types + of XX bits using this naming scheme (example: int16 is a 16 bit wide integer). + The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``. + Literals of these types have the suffix 'iXX. + + +There are no unsigned integer types, only *unsigned operations* that treat their +arguments as unsigned. Unsigned operations all wrap around; they may not lead to +over- or underflow errors. + +The following floating point types are pre-defined: + +``float`` + the generic floating point type; its size is platform dependant + (a compiler should choose the processor's fastest floating point type) + this type should be used in general + +floatXX + an implementation may define additional floating point types of XX bits using + this naming scheme (example: float64 is a 64 bit wide float). The current + implementation supports ``float32`` and ``float64``. Literals of these types + have the suffix 'fXX. + +Automatic type conversion in expressions where different kinds +of integer types are used is performed. However, if the type conversion +loses information, the `EConvertError` exception is raised. An implementation +may detect certain cases of the convert error at compile time. + +Automatic type conversion in expressions with different kinds +of floating point types is performed: The smaller type is +converted to the larger. Arithmetic performed on floating point types +follows the IEEE standard. + + +Boolean type +~~~~~~~~~~~~ +The boolean type is named ``bool`` in Nimrod and can be one of the two +pre-defined values ``true`` and ``false``. 
Conditions in while, +if, elif, when statements need to be of type bool. + +This condition should hold:: + + ord(false) == 0 and ord(true) == 1 + +The operators ``not, and, or, xor`` are defined for the bool type. +The ``and`` and ``or`` operators perform short-cut evaluation. +Example:: + + while p != nil and p.name != "xyz": + # p.name is not evaluated if p == nil + p = p.next + + +The size of the bool type is implementation-dependent, typically it is +one byte. + + +Character type +~~~~~~~~~~~~~~ +The character type is named ``char`` in Nimrod and uses the platform's +native encoding. Thus on nearly every platform its size is one byte due to +the popular UTF-8 encoding. +Character literals are enclosed in single quotes ``''``. + +.. Note:: For platform-independent character handling, there is the ``encoding`` + standard module. + + +Enumeration types +~~~~~~~~~~~~~~~~~ +Enumeration types define a new type whose values consist only of the ones +specified. +The values are ordered by the order in the enum's declaration. Example:: + + type + TDirection = enum + north, east, south, west + + +Now the following holds:: + + ord(north) == 0 + ord(east) == 1 + ord(south) == 2 + ord(west) == 3 + +Thus, north < east < south < west. The comparison operators can be used +with enumeration types. + +An implementation should store enumeration types with the minimal number of +bytes required for the particular enum, unless efficiency would be affected by +doing so. + +For better interfacing to other programming languages, the fields of enum +types can be assigned an explicit ordinal value. However, the ordinal values +have to be in ascending order appropriately. A field whose ordinal value +is not explicitly given gets the value of the previous field + 1.
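The numbering rule for enum fields can be sketched as follows. This Python illustration is not taken from the compiler: the function name ``assign_ordinals`` is hypothetical, the first field is assumed to start at 0 when it has no explicit value (as in the ``TDirection`` example), and "ascending" is assumed to mean strictly ascending.

```python
def assign_ordinals(fields):
    """Map field names to ordinals.

    `fields` is a list of (name, explicit_value_or_None) pairs in
    declaration order.
    """
    ordinals = {}
    previous = None
    for name, explicit in fields:
        if explicit is None:
            # No explicit value: previous field + 1 (0 for the first field).
            value = 0 if previous is None else previous + 1
        else:
            value = explicit
        # Assumption: explicit values must be strictly ascending.
        if previous is not None and value <= previous:
            raise ValueError("ordinal values must be in ascending order")
        ordinals[name] = value
        previous = value
    return ordinals


print(assign_ordinals([("north", None), ("east", None), ("south", None), ("west", None)]))
print(assign_ordinals([("a", 2), ("b", None), ("c", 89)]))
```

In the second call, ``b`` has no explicit value and therefore receives 3, the value of ``a`` plus one.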
+ +An explicit ordered enum can have *holes*:: + + type + TTokenType = enum + a = 2, b = 4, c = 89 # holes are valid + +However, it is then not an ordinal anymore, so it is not possible to use these +enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ`` +and ``pred`` are not available for them. + + +Subrange types +~~~~~~~~~~~~~~ +A subrange type is a range of values from an ordinal type (the host type). +To define a subrange type, one must specify its limiting values: the highest +and lowest value of the type:: + + type + TSubrange = range[0..5] + + +``TSubrange`` is a subrange of an integer which can only hold the values 0 +to 5. Assigning another value to a variable of type ``TSubrange`` is a +checked runtime error (or static error if it can be statically +determined). Assignments from the base type to one of its subrange types +(and vice versa) are allowed. + +An implementation should give it the same size as its base type. + + +String type +~~~~~~~~~~~ +All string literals are of the type string. A string in Nimrod is very +similar to a sequence of characters. However, strings in Nimrod both are +zero-terminated and have a length field. One can retrieve the length with the +builtin ``length`` procedure; the length never counts the terminating zero. +The assignment operator for strings always copies the string. + +Strings are compared by their lexicographical order. All comparison operators +are available. Strings can be indexed like arrays (lower bound is 0). Unlike +arrays, they can be used in case statements:: + + case paramStr(i) + of "-v": incl(options, optVerbose) + of "-h", "-?": incl(options, optHelp) + else: write(stdout, "invalid command line option!\n") + + +Structured types +~~~~~~~~~~~~~~~~ +A variable of a structured type can hold multiple values at the same time. +Structured types can be nested to unlimited levels. Arrays, sequences, records, +objects and sets belong to the structured types.
+ +Array type +~~~~~~~~~~ +Arrays are a homogenous type, meaning that each element in the array has the +same type. Arrays always have a fixed length which is specified at compile time +(except for open arrays). They can be indexed by any ordinal type. A parameter +may leave out the index type in the declaration making it an +*open array*. An open array ``A`` is always indexed by integers from 0 to +``length(A)-1``. + +A sequence may be passed to a parameter that is of type *open array*, but +not to a multi-dimensional open array, because it is impossible to do so in an +efficient manner. + +An array expression may be constructed by the array constructor ``[]``. +A constructed array is assignment compatible to a sequence. + +Example:: + + type + TIntArray = array[0..5, int] + var + x: TIntArray + x = [1, 2, 3, 4, 5, 6] # this is the array constructor + +The lower bound of an array may be received by the built-in proc +``low()``, the higher bound by ``high()``. The length may be +received by ``length()``. + +Arrays are always bounds checked (at compile-time or at runtime). An +implementation may provide a means to disable these checks. + + +Sequence type +~~~~~~~~~~~~~ +Sequences are similar to arrays but of dynamic length which may change +during runtime (like strings). A sequence ``S`` is always indexed by integers +from 0 to ``length(S)-1`` and its bounds are checked. Sequences can also be +constructed by the array constructor ``[]``. + + +Record and object types +~~~~~~~~~~~~~~~~~~~~~~~ +A variable of a record or object type is a heterogenous storage container. +A record or object defines various named *fields* of a type. The assignment +operator for records and objects always copies the whole record/object. The +constructor ``[]`` can be used to initialize records/objects. A field may +be given a default value. Fields with default values do not have to be listed +in a record construction, all other fields have to be listed. 
+::
+
+  type
+    TPerson = record   # type representing a person
+      name: string     # a person consists of a name
+      age: int = 30    # and an age whose default value is 30
+
+  var
+    person: TPerson
+  person = (name: "Peter") # person.age is its default value (30)
+
+An implementation may align or even reorder the fields for best access
+performance. The alignment may be specified with the `align`
+pragma. If an alignment is specified the compiler shall not reorder the fields.
+
+The difference between records and objects is that objects allow inheritance.
+Objects have access to their type at runtime, so that the ``is`` operator
+can be used to determine the object's type. Assigning an object to a variable
+of its parent's object type leads to a static or runtime error (the
+`EInvalidObjectAssignment` exception is raised).
+::
+
+  type
+    TPerson = object
+      name: string
+      age: int
+
+    TStudent = object of TPerson # a student is a person
+      id: int                    # with an id field
+
+  var
+    student: TStudent
+    person: TPerson
+  student = (name: "Peter", age: 89, id: 3)
+  person = (name: "Mary", age: 17)
+  assert(student is TStudent) # is true
+  person = student # this is an error; person has no storage for id.
+
+
+Set type
+~~~~~~~~
+The `set type` models the mathematical notion of a set. The set's
+basetype can only be an ordinal type. The reason is that sets are implemented
+as bit vectors. Sets are designed for high performance computing.
+
+.. Note:: The sets module can be used for sets of other types.
+
+Sets can be constructed via the set constructor: ``{}`` is the empty set. The
+empty set is type compatible with any special set type.
The constructor
+can also be used to include elements (and ranges of elements) in the set::
+
+  {'a'..'z', '0'..'9'} # This constructs a set that contains the
+                       # letters from 'a' to 'z' and the digits
+                       # from '0' to '9'
+
+These operations are supported by sets:
+
+==================    ========================================================
+operation             meaning
+==================    ========================================================
+ A + B                union of two sets
+ A * B                intersection of two sets
+ A - B                difference of two sets (A without B's elements)
+ A == B               set equality
+ A <= B               subset relation (A is subset of B or equal to B)
+ A < B                strong subset relation (A is a real subset of B)
+ e in A               set membership (A contains element e)
+ A >< B               symmetric set difference (= (A - B) + (B - A))
+ card(A)              the cardinality of A (number of elements in A)
+ incl(A, elem)        same as A = A + {elem}, but may be faster
+ excl(A, elem)        same as A = A - {elem}, but may be faster
+==================    ========================================================
+
+Reference type
+~~~~~~~~~~~~~~
+References (similar to `pointers` in other programming languages) are a way to
+introduce many-to-one relationships. This means different references can point
+to and modify the same location in memory. References should be used sparingly
+in a program. They are only needed for constructing graphs.
+
+Nimrod distinguishes between *traced* and *untraced* references. Untraced
+references are also called `pointers`. The difference between them is that
+traced references are garbage collected, untraced are not. Thus untraced
+references are *unsafe*. However, for certain low-level operations (accessing
+the hardware) untraced references are unavoidable.
+
+Traced references are declared with the **ref** keyword, untraced references
+are declared with the **ptr** keyword.
+
+The ``^`` operator can be used to dereference a reference, the ``addr`` procedure
+returns the address of an item.
An address is always an untraced reference.
+Thus the usage of ``addr`` is an *unsafe* feature.
+
+The ``.`` (access a record field operator) and ``[]`` (array/string/sequence
+index operator) operators perform implicit dereferencing operations for
+reference types::
+
+  type
+    PNode = ref TNode
+    TNode = record
+      le, ri: PNode
+      data: int
+
+  var
+    n: PNode
+  new(n)
+  n.data = 9 # no need to write n^.data
+
+As can be seen from the example, reference types are the only types that can be
+used in *implicit forward declarations*: TNode may be used before it is
+defined, because only a reference to it is needed.
+
+To allocate a new traced object, the built-in procedure ``new`` has to be used.
+To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and
+``realloc`` can be used. The documentation of the system module contains
+further information.
+
+Special care has to be taken if an untraced object contains traced objects like
+traced references, strings or sequences: In order to free everything properly,
+the built-in procedure ``finalize`` has to be called before freeing the
+untraced memory manually!
+
+.. XXX finalizers for traced objects
+
+Procedural type
+~~~~~~~~~~~~~~~
+A procedural type is internally a pointer to a procedure. Thus ``nil`` is an
+allowed value for variables of a procedural type. Nimrod uses procedural types
+to achieve `functional` programming techniques. Dynamic dispatch for OOP
+constructs can also be implemented with procedural types.
+
+Example::
+
+  type
+    TCallback = proc (x: int) {.cdecl.}
+
+  proc printItem(x: Int) = ...
+
+  proc forEach(c: TCallback) =
+    ...
+
+  forEach(printItem) # this will NOT work because calling conventions differ
+
+A subtle issue with procedural types is that the calling convention of the
+procedure influences the type compatibility: Procedural types are only compatible
+if they have the same calling convention.
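A sketch of how the example above could be made to type check: giving ``printItem`` the same calling convention as ``TCallback`` makes the two procedural types compatible. Placing the pragma after the parameter list is an assumption about the syntax, and the procedure bodies are illustrative only.

```nim
type
  TCallback = proc (x: int) {.cdecl.}

# assumption: the calling convention pragma follows the parameter list
proc printItem(x: int) {.cdecl.} =
  write(stdout, "item\n")

proc forEach(c: TCallback) =
  # call the callback twice; a real forEach would walk over a container
  c(1)
  c(2)

forEach(printItem)  # compatible now: both sides use the cdecl convention
```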
+
+Although a Nimrod implementation may provide additional calling conventions,
+the following shall always exist:
+
+``cdecl``
+    is used for interfacing with C; indicates that a proc shall
+    use the same calling convention as the C compiler.
+
+``inline``
+    indicates that the caller of the proc should not call it,
+    but rather inline its code in place for improved efficiency. Note that
+    this is only a hint for the compiler: It may completely ignore it and
+    it may inline procedures that are not marked as ``inline``.
+
+``closure``
+    indicates that the procedure expects a context, a `closure` that needs
+    to be passed to the procedure.
+
+
+Statements
+----------
+Nimrod uses the common statement/expression paradigm: Statements do not
+produce a value in contrast to expressions. Call expressions are statements.
+If the called procedure returns a value, it is not a valid statement
+as statements do not produce values. To evaluate an expression for
+side-effects and throw its value away, one can use the ``discard``
+statement.
+
+Statements are separated into `simple statements` and `complex statements`.
+Simple statements are statements that cannot contain other statements, like
+assignments, calls or the ``return`` statement; complex statements can
+contain other statements. To avoid the `dangling else problem`, complex
+statements always have to be indented::
+
+  XXX
+
+
+
+Discard statement
+~~~~~~~~~~~~~~~~~
+
+Syntax::
+
+  discard_stmt ::= discard expr
+
+The `discard` statement evaluates its expression for side-effects and throws
+the expression's resulting value away. If the expression has no side-effects,
+this shall generate at least a warning from the compiler.
+
+
+Var statement
+~~~~~~~~~~~~~
+
+Syntax::
+
+  varlist ::= identlist [asgn_opr expr doc]
+  var_section ::= var varlist
+                | var INDENT(x > I[-1]) PUSH(x) varlist
+                  { INDENT(x) varlist } POP
+
+`Var` statements simply declare new local and global variables and may
+initialize them.
A comma separated list of variables can be used to specify
+variables of the same type::
+
+  var
+    a: int = 0
+    x, y, z: int
+
+However, an initializer is not allowed for such a list as its semantics
+would be ambiguous in some cases. If an initializer is given the type
+can be omitted: The variable is of the same type as the initializing
+expression.
+
+
+If statement
+~~~~~~~~~~~~
+
+Syntax::
+
+  if_stmt ::= PUSH(x = I[-1]) if expr ":" stmts
+              { INDENT(x) elif expr ":" stmts }
+              [ INDENT(x) else ":" stmts ]
+              POP
+
+The `if` statement is a simple way to make a branch in the control flow:
+The expression after the keyword ``if`` is evaluated; if it is true
+the corresponding statements after the ``:`` are executed. Otherwise
+the expression after the ``elif`` is evaluated (if there is an
+``elif`` branch); if it is true the corresponding statements after
+the ``:`` are executed. This goes on until the last ``elif``. If all
+conditions fail, the ``else`` part is executed. If there is no ``else``
+part, execution continues with the statement after the ``if`` statement.
+
+
+Case statement
+~~~~~~~~~~~~~~
+
+Syntax::
+
+  case_stmt ::= PUSH(I[-1]) case expr
+                INDENT(x>=I[-1]) of vallist ":" stmts
+                { INDENT(x) of vallist ":" stmts }
+                { INDENT(x) elif expr ":" stmts }
+                [ INDENT(x) else ":" stmts ]
+                POP
+
+The `case` statement is similar to the if statement, but it represents
+a multi-branch selection. The expression after the keyword ``case`` is
+evaluated and if its value is in a *vallist* the corresponding statements
+(after the ``of`` keyword) are executed. If the value is in no given *vallist*
+the ``else`` part is executed. If there is no ``else`` part and not all
+possible values that ``expr`` can hold occur in a ``vallist``, a static
+error shall be given. This holds only for expressions of ordinal types.
+If the expression is not of an ordinal type, and no ``else`` part is
+given, control just passes after the ``case`` statement.
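The two situations described above can be sketched as follows. The enum ``TDirection`` and the procedures ``axis`` and ``isHelpFlag`` are hypothetical names introduced only for this illustration: the first ``case`` lists every value of an ordinal type and therefore needs no ``else`` part, the second selects on a string and uses an ``else`` part for all remaining values.

```nim
type
  TDirection = enum          # hypothetical enum used only for this sketch
    north, east, south, west

proc axis(d: TDirection): string =
  # every value of TDirection occurs in a vallist, so no else part is needed
  case d
  of north, south: return "vertical"
  of east, west: return "horizontal"

proc isHelpFlag(s: string): bool =
  # strings are not ordinal; the else part covers all remaining values
  case s
  of "-h", "-?": return true
  else: return false

assert(axis(south) == "vertical")
assert(isHelpFlag("-?"))
assert(not isHelpFlag("-v"))
```

Removing one of the ``of`` branches from ``axis`` would leave some values of ``TDirection`` uncovered, which is the case in which the static error is required.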
+
+To suppress the static error in the ordinal case the programmer needs
+to write an ``else`` part with a ``nil`` statement.
+
+
+When statement
+~~~~~~~~~~~~~~
+
+Syntax::
+
+  when_stmt ::= PUSH(x=I[-1]) when expr ":" stmts
+                { INDENT(x) elif expr ":" stmts }
+                [ INDENT(x) else ":" stmts ]
+                POP
+
+The `when` statement is almost identical to the ``if`` statement with some
+exceptions:
+
+* Each ``expr`` has to be a constant expression of type ``bool``.
+* The statements do not open a new scope if they introduce new identifiers.
+* The statements that belong to the expression that evaluated to true are
+  translated by the compiler, the other statements are not checked for
+  syntax or semantics at all! This holds also for any ``expr`` coming
+  after the expression that evaluated to true.
+
+The ``when`` statement enables conditional compilation techniques. As
+a special syntactic extension, the ``when`` construct is also available
+within ``record`` or ``object`` definitions.
+
+
+Raise statement
+~~~~~~~~~~~~~~~
+
+Syntax::
+
+  raise_stmt ::= raise [qualified_identifier [comma expr]]
+
+Apart from built-in operations like array indexing, memory allocation, etc.,
+the ``raise`` statement is the only way to raise an exception. The
+identifier has to be the name of a previously declared exception. A
+comma followed by an expression may follow; the expression must be of type
+``string`` or ``cstring``; this is an error message that can be extracted
+with the `getCurrentExceptionMsg` procedure in the module ``system``.
+
+If no exception name is given, the current exception is `re-raised`. The
+`ENoExceptionToReraise` exception is raised if there is no exception to
+re-raise. It follows that the ``raise`` statement *always* raises an
+exception.
+
+
+Try statement
+~~~~~~~~~~~~~
+
+Syntax::
+
+  try_stmt ::= PUSH(x=I[-1]) try ":" stmts
+               { INDENT(x) except exceptlist ":" stmts }
+               [ INDENT(x) except ":" stmts ]
+               [ INDENT(x) finally ":" stmts ]
+               POP
+
+The statements after the ``try`` are executed in sequential order unless
+an exception ``e`` is raised. If the exception type of ``e`` matches any
+type listed in ``exceptlist`` the corresponding statements are executed.
+The statements following the ``except`` clauses are called
+`exception handlers`.
+
+The empty ``except`` clause is executed if there is an exception that is
+in no list. It is similar to an ``else`` clause in ``if`` statements.
+
+If there is a ``finally`` clause given, it is always executed after the
+exception handlers.
+
+The exception is *consumed* in an exception handler. However, an
+exception handler may raise another exception. If the exception is not
+handled, it is propagated through the call stack. This means that often
+the rest of the procedure - that is not within a ``finally`` clause -
+is not executed (if an exception occurs).
+
+
+Block statement
+~~~~~~~~~~~~~~~
+
+Syntax::
+
+  block_stmt ::= block [IDENTIFIER] ":" stmts
+
+The block statement is a means to group statements into a (named) `block`.
+Inside the block, the ``break`` statement is allowed to leave the block
+immediately. A ``break`` statement can contain the name of a surrounding
+block to specify which block to leave.
+
+
+Break statement
+~~~~~~~~~~~~~~~
+
+Syntax::
+
+  break_stmt ::= break [IDENTIFIER]
+
+The break statement is used to leave a block immediately. If ``IDENTIFIER``
+is given, it is the name of the enclosing block that is to be left. If it is
+absent, the innermost block is left.
+
+
+While statement
+~~~~~~~~~~~~~~~
+
+Syntax::
+
+  while_stmt ::= WHILE expr COLON stmts
+
+The `while` statement is executed until the ``expr`` evaluates to false.
+Endless loops are no error.
``while`` statements open an `implicit block`,
+so that they can be aborted by a ``break`` statement.
+
+
+For statement & iterators
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Syntax::
+
+  for_stmt ::= PUSH(x=I[-1]) for exprlist in expr [".." expr] ":" stmts
+               POP
+
+The `for` statement is an abstract mechanism to iterate over the elements
+of a container. It relies on an *iterator* to do so. Like ``while``
+statements, ``for`` statements open an `implicit block`, so that they
+can be aborted by a ``break`` statement.
+
+XXX
+
+
+
+Assembler statement
+~~~~~~~~~~~~~~~~~~~
+Syntax::
+
+  asm_stmt ::= asm [CHAR_LITERAL] STRING_LITERAL
+
+The direct embedding of assembler code into Nimrod code is supported by the
+unsafe ``asm`` statement. Identifiers in the assembler code that refer to
+Nimrod identifiers shall be enclosed in a special character which can be
+specified right after the ``asm`` keyword. The default special character is
+``'!'``. An implementation does not need to support the assembler statement;
+it may instead give a static error if it encounters one.
+
+
+Modules
+-------
+Nimrod supports splitting a program into pieces by a module concept. Modules
+make separate compilation possible. Each module needs to be in its own file.
+Modules consist of an interface and an implementation section. The interface
+section lists the symbols that can be imported from other modules. Thus modules
+enable `information hiding`. The interface section may not contain any
+code that is executable. This means that only the headers of procedures can
+appear in the interface. A module may gain access to symbols of another module
+by the `import` statement. Recursive module dependencies are allowed, but
+slightly subtle.
+
+The algorithm for compiling modules is:
+
+- Compile the whole module as usual, following import statements recursively.
+- If we have a cycle, only import the already parsed symbols (in the interface
+  of course); if an unknown identifier occurs then abort.
+
+This is best illustrated by an example::
+
+  # Module A
+  interface
+  type
+    T1 = int
+  import B # the compiler starts parsing B
+
+  implementation
+
+  proc main() =
+    var i = p(3) # works because B has been parsed completely here
+
+  main()
+
+
+  # Module B
+  interface
+  import A  # A is not parsed here! Only the already known symbols
+            # of A are imported here.
+
+  proc p(x: A.T1): A.T1 # this works because the compiler has already
+                        # added T1 to A's interface symbol table
+
+  implementation
+
+  proc p(x: A.T1): A.T1 = return x + 1
+
+
+Scope rules
+-----------
+Identifiers are valid from the point of their declaration until the end of
+the block in which the declaration occurred. The range where the identifier
+is known is the `scope` of the identifier. The exact scope of an identifier
+depends on the way it was declared.
+
+Block scope
+~~~~~~~~~~~
+The *scope* of a variable declared in the declaration part of a block
+is valid from the point of declaration until the end of the block. If a
+block contains a second block, in which the identifier is redeclared,
+then inside this block, the second declaration will be valid. Upon
+leaving the inner block, the first declaration is valid again. An
+identifier cannot be redefined in the same block, except if valid for
+procedure or iterator overloading purposes.
+
+
+Record or object scope
+~~~~~~~~~~~~~~~~~~~~~~
+The field identifiers inside a record or object definition are valid in the
+following places:
+
+* To the end of the record definition.
+* Field designators of a variable of the given record type.
+* In all descendant types of the object type.
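The block scope rule stated earlier in this section can be sketched with a nested ``block`` statement. The variable name ``x`` and its values are arbitrary choices for the illustration.

```nim
var x = 1          # outer declaration

block:
  var x = 2        # redeclaration in the inner block hides the outer x
  assert(x == 2)   # inside the block, the second declaration is valid

assert(x == 1)     # after leaving the block, the first declaration is valid again
```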
+
+Module scope
+~~~~~~~~~~~~
+All identifiers in the interface part of a module are valid from the point of
+declaration, until the end of the module. Furthermore, the identifiers are
+known in other modules that import the module. Identifiers from indirectly
+dependent modules are *not* available. Identifiers declared in the
+implementation part of a module are valid from the point of declaration to
+the end of the module. The ``system`` module is automatically imported in
+all other modules.
+
+If a module imports an identifier from two different modules,
+each occurrence of the identifier has to be qualified, unless it is an
+overloaded procedure or iterator in which case the overloading
+resolution takes place::
+
+  # Module A
+  interface
+  var x: string
+
+  # Module B
+  interface
+  var x: int
+
+  # Module C
+  import A, B, io
+  write(stdout, x) # error: x is ambiguous
+  write(stdout, A.x) # no error: qualifier used
+
+
+Messages
+========
+
+A Nimrod compiler has to emit different kinds of messages: `hint`,
+`warning`, and `error` messages. `errors` have to be emitted if the compiler
+encounters any static errors. If and when the other two message kinds are
+emitted is not specified, unless a message is requested with the
+``hint`` or ``warning`` pragma.
+
+Pragmas
+=======
+Pragmas are Nimrod's method to give the compiler additional information/
+commands without introducing a massive number of new keywords. Pragmas are
+processed on the fly during parsing. Pragmas are always enclosed in the
+special ``{.`` and ``.}`` curly brackets. There are a number of pragmas
+that a compiler has to process; a compiler may define additional pragmas
+not specified here.
+
+
+define pragma
+-------------
+The `define` pragma defines a conditional symbol. This symbol may only be
+used in other pragmas and in the ``defined`` expression and not in ordinary
+Nimrod source code. The conditional symbols go into a special symbol table.
+The compiler shall define the target processor and the target operating
+system as conditional symbols. See `Annex A <XXX>`_ for a list of specified
+processors and operating systems. The syntax of the define pragma is::
+
+  define_pragma ::= curlydot_le "define" colon IDENTIFIER curlydot_ri
+
+
+undef pragma
+------------
+The `undef` pragma is the counterpart to the define pragma. It undefines a
+conditional symbol. Syntax::
+
+  undef_pragma ::= curlydot_le "undef" colon IDENTIFIER curlydot_ri
+
+
+error pragma
+------------
+The `error` pragma is used to make the compiler output an error message with
+the given content. Compilation may abort after an error (or not). Syntax::
+
+  error_pragma ::= curlydot_le "error" colon STRING_LITERAL curlydot_ri
+
+
+fatal pragma
+------------
+The `fatal` pragma is used to make the compiler output an error message with
+the given content. In contrast to the ``error`` pragma, compilation
+is guaranteed to be aborted by this pragma. Syntax::
+
+  fatal_pragma ::= curlydot_le "fatal" colon STRING_LITERAL curlydot_ri
+
+
+warning pragma
+--------------
+The `warning` pragma is used to make the compiler output a warning message with
+the given content. Compilation continues after the warning. Syntax::
+
+  warning_pragma ::= curlydot_le "warning" colon STRING_LITERAL curlydot_ri
+
+
+hint pragma
+-----------
+The `hint` pragma is used to make the compiler output a hint message with
+the given content. Compilation continues after the hint. Syntax::
+
+
+  hint_pragma ::= curlydot_le "hint" colon STRING_LITERAL curlydot_ri
+
+
+
+compilation option pragmas
+--------------------------
+The pragmas listed here can be used to override the code generation options
+for a section of code.
+::
+
+  "{." pragma: val {pragma: val} ".}"
+
+
+An implementation should provide at least the following possible options (it can
+add various others). If an implementation does not recognize the option, a
+warning shall be given to the user.
+
+===============  ===============  ============================================
+pragma           allowed values   description
+===============  ===============  ============================================
+checks           on|off           Turns the code generation for all runtime
+                                  checks on or off.
+bound_checks     on|off           Turns the code generation for array bound
+                                  checks on or off.
+overflow_checks  on|off           Turns the code generation for over- or
+                                  underflow checks on or off.
+nil_checks       on|off           Turns the code generation for nil pointer
+                                  checks on or off.
+assertions       on|off           Turns the code generation for assertions
+                                  on or off.
+warnings         on|off           Turns the warning messages of the compiler
+                                  on or off.
+hints            on|off           Turns the hint messages of the compiler
+                                  on or off.
+optimization     none|speed|size  Optimize the code for speed or size, or
+                                  disable optimization. For non-optimizing
+                                  compilers this option has no effect.
+                                  Nevertheless they must parse it properly.
+callconv         cdecl|...        Specifies the default calling convention for
+                                  all procedures (and procedure types) that
+                                  follow.
+===============  ===============  ============================================
+
+Example::
+
+  {.checks: off, optimization: speed.}
+  # compile without runtime checks and optimize for speed
+
+
+push and pop pragmas
+--------------------
+The push/pop pragmas are very similar to the option directive,
+but are used to override the settings temporarily. Example::
+
+  {.push checks: off.}
+  # compile this section without runtime checks as it is
+  # speed critical
+  # ... some code ...
+  {.pop.} # restore old settings
+
+
+Annex A: List of conditional symbols
+====================================
+
+``posix`` is defined on any POSIX compatible operating system.
+ ++----------------------------+-----------------+ +| Operating System | Symbols | ++============================+=================+ +| AIX | aix, posix | ++----------------------------+-----------------+ +| Compaq Tru64 UNIX | tru64, posix | ++----------------------------+-----------------+ +| Digital UNIX | tru64, posix | ++----------------------------+-----------------+ +| OSF/1 | tru64, posix | ++----------------------------+-----------------+ +| FreeBSD | freebsd, | +| | posix, | +| | bsd | ++----------------------------+-----------------+ +| GNU/Linux | linux, | +| | posix | ++----------------------------+-----------------+ +| HP-UX | hpux, | +| | posix | ++----------------------------+-----------------+ +| Irix | irix, | +| | posix | ++----------------------------+-----------------+ +| MacOS X | macosx, | +| | posix | ++----------------------------+-----------------+ +| NetBSD | netbsd, | +| | posix, | +| | bsd | ++----------------------------+-----------------+ +| OpenBSD | openbsd, | +| | posix, | +| | bsd | ++----------------------------+-----------------+ +| Solaris | solaris, | +| | posix | ++----------------------------+-----------------+ +| Windows (all variants) | windows | ++----------------------------+-----------------+ + + ++----------------------------+-----------------+ +| Processor | Symbols | ++============================+=================+ +| Compaq Alpha | alpha | ++----------------------------+-----------------+ +| HP Precision Architecture | hppa | ++----------------------------+-----------------+ +| INTEL x86 | x86 | ++----------------------------+-----------------+ +| AMD/INTEL x86 64bit | x86_64, | +| | amd64 | ++----------------------------+-----------------+ +| MIPS RISC | mips | ++----------------------------+-----------------+ +| IBM Power PC | powerpc | ++----------------------------+-----------------+ +| SPARC | sparc | ++----------------------------+-----------------+ +| MicroSPARC | sparc | 
++----------------------------+-----------------+ +| UltraSPARC | sparc | ++----------------------------+-----------------+ + +On targets for 16, 32 or 64 bit processors the symbols ``cpu16``, ``cpu32`` +or ``cpu64`` shall be defined respectively. On little endian machines the +symbol ``little_endian`` and on big endian ones the symbol ``big_endian`` +are defined. diff --git a/doc/theindex.txt b/doc/theindex.txt new file mode 100755 index 000000000..034f07a9f --- /dev/null +++ b/doc/theindex.txt @@ -0,0 +1,1436 @@ + +===== +Index +===== + +.. index:: + + + `!=`:idx: + `system.html#235 <system.html#235>`_ + + `$`:idx: + * `system.html#326 <system.html#326>`_ + * `system.html#327 <system.html#327>`_ + * `system.html#328 <system.html#328>`_ + * `system.html#329 <system.html#329>`_ + * `system.html#330 <system.html#330>`_ + * `system.html#331 <system.html#331>`_ + * `system.html#332 <system.html#332>`_ + * `times.html#109 <times.html#109>`_ + * `times.html#110 <times.html#110>`_ + + `%`:idx: + * `strutils.html#128 <strutils.html#128>`_ + * `strutils.html#129 <strutils.html#129>`_ + + `%%`:idx: + * `system.html#318 <system.html#318>`_ + * `system.html#319 <system.html#319>`_ + + `&`:idx: + * `system.html#245 <system.html#245>`_ + * `system.html#246 <system.html#246>`_ + * `system.html#247 <system.html#247>`_ + * `system.html#248 <system.html#248>`_ + * `system.html#349 <system.html#349>`_ + * `system.html#350 <system.html#350>`_ + * `system.html#351 <system.html#351>`_ + * `system.html#352 <system.html#352>`_ + + `*`:idx: + * `system.html#159 <system.html#159>`_ + * `system.html#178 <system.html#178>`_ + * `system.html#196 <system.html#196>`_ + * `system.html#207 <system.html#207>`_ + * `complex.html#107 <complex.html#107>`_ + + `*%`:idx: + * `system.html#314 <system.html#314>`_ + * `system.html#315 <system.html#315>`_ + + `+`:idx: + * `system.html#154 <system.html#154>`_ + * `system.html#157 <system.html#157>`_ + * `system.html#173 <system.html#173>`_ + * 
`system.html#176 <system.html#176>`_ + * `system.html#192 <system.html#192>`_ + * `system.html#194 <system.html#194>`_ + * `system.html#208 <system.html#208>`_ + * `complex.html#103 <complex.html#103>`_ + + `+%`:idx: + * `system.html#310 <system.html#310>`_ + * `system.html#311 <system.html#311>`_ + + `-`:idx: + * `system.html#155 <system.html#155>`_ + * `system.html#158 <system.html#158>`_ + * `system.html#174 <system.html#174>`_ + * `system.html#177 <system.html#177>`_ + * `system.html#193 <system.html#193>`_ + * `system.html#195 <system.html#195>`_ + * `system.html#209 <system.html#209>`_ + * `complex.html#104 <complex.html#104>`_ + * `complex.html#105 <complex.html#105>`_ + * `times.html#113 <times.html#113>`_ + + `-%`:idx: + * `system.html#312 <system.html#312>`_ + * `system.html#313 <system.html#313>`_ + + `-+-`:idx: + `system.html#210 <system.html#210>`_ + + `/`:idx: + * `system.html#197 <system.html#197>`_ + * `os.html#117 <os.html#117>`_ + * `complex.html#106 <complex.html#106>`_ + + `/%`:idx: + * `system.html#316 <system.html#316>`_ + * `system.html#317 <system.html#317>`_ + + `/../`:idx: + `os.html#121 <os.html#121>`_ + + `<`:idx: + * `system.html#169 <system.html#169>`_ + * `system.html#188 <system.html#188>`_ + * `system.html#200 <system.html#200>`_ + * `system.html#227 <system.html#227>`_ + * `system.html#228 <system.html#228>`_ + * `system.html#229 <system.html#229>`_ + * `system.html#230 <system.html#230>`_ + * `system.html#231 <system.html#231>`_ + * `system.html#232 <system.html#232>`_ + * `system.html#233 <system.html#233>`_ + * `system.html#234 <system.html#234>`_ + + `<%`:idx: + * `system.html#322 <system.html#322>`_ + * `system.html#323 <system.html#323>`_ + + `<=`:idx: + * `system.html#168 <system.html#168>`_ + * `system.html#187 <system.html#187>`_ + * `system.html#199 <system.html#199>`_ + * `system.html#220 <system.html#220>`_ + * `system.html#221 <system.html#221>`_ + * `system.html#222 <system.html#222>`_ + * `system.html#223 
<system.html#223>`_ + * `system.html#224 <system.html#224>`_ + * `system.html#225 <system.html#225>`_ + * `system.html#226 <system.html#226>`_ + + `<=%`:idx: + * `system.html#320 <system.html#320>`_ + * `system.html#321 <system.html#321>`_ + + `==`:idx: + * `system.html#167 <system.html#167>`_ + * `system.html#186 <system.html#186>`_ + * `system.html#198 <system.html#198>`_ + * `system.html#211 <system.html#211>`_ + * `system.html#212 <system.html#212>`_ + * `system.html#213 <system.html#213>`_ + * `system.html#214 <system.html#214>`_ + * `system.html#215 <system.html#215>`_ + * `system.html#216 <system.html#216>`_ + * `system.html#217 <system.html#217>`_ + * `system.html#218 <system.html#218>`_ + * `system.html#219 <system.html#219>`_ + * `system.html#353 <system.html#353>`_ + * `complex.html#102 <complex.html#102>`_ + + `>`:idx: + `system.html#237 <system.html#237>`_ + + `>%`:idx: + `system.html#325 <system.html#325>`_ + + `>=`:idx: + `system.html#236 <system.html#236>`_ + + `>=%`:idx: + `system.html#324 <system.html#324>`_ + + `[ESC]`:idx: + `manual.html#134 <manual.html#134>`_ + + `abs`:idx: + * `system.html#170 <system.html#170>`_ + * `system.html#189 <system.html#189>`_ + * `system.html#201 <system.html#201>`_ + * `complex.html#108 <complex.html#108>`_ + + `add`:idx: + * `system.html#249 <system.html#249>`_ + * `system.html#250 <system.html#250>`_ + * `system.html#251 <system.html#251>`_ + * `system.html#252 <system.html#252>`_ + * `system.html#253 <system.html#253>`_ + + `addQuitProc`:idx: + `system.html#287 <system.html#287>`_ + + `alert`:idx: + `manual.html#131 <manual.html#131>`_ + + `allCharsInSet`:idx: + `strutils.html#133 <strutils.html#133>`_ + + `alloc`:idx: + `system.html#296 <system.html#296>`_ + + `alloc0`:idx: + `system.html#297 <system.html#297>`_ + + `AltSep`:idx: + `os.html#104 <os.html#104>`_ + + `and`:idx: + * `system.html#164 <system.html#164>`_ + * `system.html#183 <system.html#183>`_ + * `system.html#204 <system.html#204>`_ + + 
`apostrophe`:idx: + `manual.html#129 <manual.html#129>`_ + + `AppendFileExt`:idx: + `os.html#127 <os.html#127>`_ + + `arccos`:idx: + `math.html#110 <math.html#110>`_ + + `arcsin`:idx: + `math.html#111 <math.html#111>`_ + + `arctan`:idx: + `math.html#112 <math.html#112>`_ + + `arctan2`:idx: + `math.html#113 <math.html#113>`_ + + `array`:idx: + `system.html#108 <system.html#108>`_ + + `Arrays`:idx: + `manual.html#153 <manual.html#153>`_ + + `assembler`:idx: + `manual.html#198 <manual.html#198>`_ + + `assert`:idx: + `system.html#301 <system.html#301>`_ + + `Automatic type conversion`:idx: + `manual.html#145 <manual.html#145>`_ + + `backslash`:idx: + * `manual.html#127 <manual.html#127>`_ + * `regexprs.html#101 <regexprs.html#101>`_ + + `backspace`:idx: + `manual.html#132 <manual.html#132>`_ + + `BiggestFloat`:idx: + `system.html#257 <system.html#257>`_ + + `BiggestInt`:idx: + `system.html#256 <system.html#256>`_ + + `block`:idx: + `manual.html#194 <manual.html#194>`_ + + `boolean`:idx: + `manual.html#147 <manual.html#147>`_ + + `break`:idx: + `manual.html#195 <manual.html#195>`_ + + `breakpoint`:idx: + `endb.html#103 <endb.html#103>`_ + + `Byte`:idx: + `system.html#113 <system.html#113>`_ + + `C`:idx: + `manual.html#136 <manual.html#136>`_ + + `calling conventions`:idx: + `manual.html#164 <manual.html#164>`_ + + `capitalize`:idx: + `strutils.html#110 <strutils.html#110>`_ + + `card`:idx: + `system.html#151 <system.html#151>`_ + + `carriage return`:idx: + `manual.html#122 <manual.html#122>`_ + + `case`:idx: + `manual.html#182 <manual.html#182>`_ + + `cchar`:idx: + `system.html#258 <system.html#258>`_ + + `cdecl`:idx: + `manual.html#166 <manual.html#166>`_ + + `cdouble`:idx: + `system.html#265 <system.html#265>`_ + + `cfloat`:idx: + `system.html#264 <system.html#264>`_ + + `ChangeFileExt`:idx: + `os.html#128 <os.html#128>`_ + + `character type`:idx: + `manual.html#148 <manual.html#148>`_ + + `character with decimal value d`:idx: + `manual.html#130 <manual.html#130>`_ + 
+ `character with hex value HH`:idx: + `manual.html#135 <manual.html#135>`_ + + `checked runtime error`:idx: + `manual.html#110 <manual.html#110>`_ + + `chr`:idx: + `system.html#153 <system.html#153>`_ + + `cint`:idx: + `system.html#261 <system.html#261>`_ + + `classify`:idx: + `math.html#123 <math.html#123>`_ + + `clong`:idx: + `system.html#262 <system.html#262>`_ + + `clongdouble`:idx: + `system.html#266 <system.html#266>`_ + + `clonglong`:idx: + `system.html#263 <system.html#263>`_ + + `CloseFile`:idx: + `system.html#367 <system.html#367>`_ + + `closure`:idx: + `manual.html#171 <manual.html#171>`_ + + `cmp`:idx: + * `system.html#243 <system.html#243>`_ + * `system.html#244 <system.html#244>`_ + + `cmpIgnoreCase`:idx: + `strutils.html#119 <strutils.html#119>`_ + + `cmpIgnoreStyle`:idx: + `strutils.html#120 <strutils.html#120>`_ + + `cmpPaths`:idx: + `os.html#126 <os.html#126>`_ + + `comment pieces`:idx: + `manual.html#115 <manual.html#115>`_ + + `Comments`:idx: + `manual.html#114 <manual.html#114>`_ + + `CompileDate`:idx: + `system.html#275 <system.html#275>`_ + + `CompileTime`:idx: + `system.html#276 <system.html#276>`_ + + `complex statements`:idx: + `manual.html#176 <manual.html#176>`_ + + `const`:idx: + `manual.html#180 <manual.html#180>`_ + + `constant expressions`:idx: + `manual.html#108 <manual.html#108>`_ + + `Constants`:idx: + `manual.html#140 <manual.html#140>`_ + + `continue`:idx: + `manual.html#197 <manual.html#197>`_ + + `copy`:idx: + * `system.html#288 <system.html#288>`_ + * `system.html#289 <system.html#289>`_ + + `copyFile`:idx: + `os.html#131 <os.html#131>`_ + + `copyMem`:idx: + `system.html#293 <system.html#293>`_ + + `cos`:idx: + `math.html#114 <math.html#114>`_ + + `cosh`:idx: + `math.html#115 <math.html#115>`_ + + `countBits`:idx: + `math.html#103 <math.html#103>`_ + + `countdown`:idx: + `system.html#341 <system.html#341>`_ + + `countup`:idx: + `system.html#342 <system.html#342>`_ + + `cpuEndian`:idx: + `system.html#281 <system.html#281>`_ + 
+ `createDir`:idx: + `os.html#135 <os.html#135>`_ + + `cschar`:idx: + `system.html#259 <system.html#259>`_ + + `cshort`:idx: + `system.html#260 <system.html#260>`_ + + `cstringArray`:idx: + `system.html#267 <system.html#267>`_ + + `CurDir`:idx: + `os.html#101 <os.html#101>`_ + + `dangling else problem`:idx: + `manual.html#177 <manual.html#177>`_ + + `dbgLineHook`:idx: + `system.html#337 <system.html#337>`_ + + `dealloc`:idx: + `system.html#299 <system.html#299>`_ + + `debugger`:idx: + `nimrodc.html#108 <nimrodc.html#108>`_ + + `dec`:idx: + `system.html#144 <system.html#144>`_ + + `define`:idx: + `manual.html#224 <manual.html#224>`_ + + `defined`:idx: + `system.html#101 <system.html#101>`_ + + `deleteStr`:idx: + `strutils.html#115 <strutils.html#115>`_ + + `DirSep`:idx: + `os.html#103 <os.html#103>`_ + + `discard`:idx: + `manual.html#178 <manual.html#178>`_ + + `div`:idx: + * `system.html#160 <system.html#160>`_ + * `system.html#179 <system.html#179>`_ + + `domain specific languages`:idx: + `manual.html#213 <manual.html#213>`_ + + `dynamic type`:idx: + `manual.html#104 <manual.html#104>`_ + + `EAccessViolation`:idx: + `system.html#128 <system.html#128>`_ + + `EArithmetic`:idx: + `system.html#125 <system.html#125>`_ + + `EAssertionFailed`:idx: + `system.html#129 <system.html#129>`_ + + `EAsynch`:idx: + `system.html#119 <system.html#119>`_ + + `E_Base`:idx: + `system.html#118 <system.html#118>`_ + + `echo`:idx: + `system.html#380 <system.html#380>`_ + + `EControlC`:idx: + `system.html#130 <system.html#130>`_ + + `EDivByZero`:idx: + `system.html#126 <system.html#126>`_ + + `EInvalidIndex`:idx: + `system.html#133 <system.html#133>`_ + + `EInvalidObjectAssignment`:idx: + * `manual.html#157 <manual.html#157>`_ + * `system.html#137 <system.html#137>`_ + + `EInvalidObjectConversion`:idx: + `system.html#138 <system.html#138>`_ + + `EInvalidRegEx`:idx: + `regexprs.html#104 <regexprs.html#104>`_ + + `EInvalidValue`:idx: + * `manual.html#146 <manual.html#146>`_ + * 
`system.html#131 <system.html#131>`_ + + `EIO`:idx: + `system.html#122 <system.html#122>`_ + + `Embedded Nimrod Debugger`:idx: + `endb.html#101 <endb.html#101>`_ + + `ENDB`:idx: + `endb.html#102 <endb.html#102>`_ + + `EndOfFile`:idx: + `system.html#368 <system.html#368>`_ + + `endsWith`:idx: + `strutils.html#132 <strutils.html#132>`_ + + `ENoExceptionToReraise`:idx: + * `manual.html#186 <manual.html#186>`_ + * `system.html#136 <system.html#136>`_ + + `Enumeration`:idx: + `manual.html#149 <manual.html#149>`_ + + `EOS`:idx: + `system.html#123 <system.html#123>`_ + + `EOutOfMemory`:idx: + `system.html#132 <system.html#132>`_ + + `EOutOfRange`:idx: + `system.html#134 <system.html#134>`_ + + `EOverflow`:idx: + `system.html#127 <system.html#127>`_ + + `equalMem`:idx: + `system.html#295 <system.html#295>`_ + + `ERessourceExhausted`:idx: + `system.html#124 <system.html#124>`_ + + `error`:idx: + * `manual.html#223 <manual.html#223>`_ + * `manual.html#226 <manual.html#226>`_ + + `escape`:idx: + `manual.html#133 <manual.html#133>`_ + + `escape sequences`:idx: + `manual.html#120 <manual.html#120>`_ + + `EStackOverflow`:idx: + `system.html#135 <system.html#135>`_ + + `ESynch`:idx: + `system.html#120 <system.html#120>`_ + + `ESystem`:idx: + `system.html#121 <system.html#121>`_ + + `except`:idx: + `manual.html#189 <manual.html#189>`_ + + `exception handlers`:idx: + `manual.html#188 <manual.html#188>`_ + + `excl`:idx: + `system.html#150 <system.html#150>`_ + + `executeProcess`:idx: + `os.html#129 <os.html#129>`_ + + `executeShellCommand`:idx: + `os.html#130 <os.html#130>`_ + + `existsDir`:idx: + `os.html#136 <os.html#136>`_ + + `existsEnv`:idx: + `os.html#140 <os.html#140>`_ + + `ExistsFile`:idx: + `os.html#115 <os.html#115>`_ + + `exp`:idx: + `math.html#108 <math.html#108>`_ + + `expandFilename`:idx: + `os.html#114 <os.html#114>`_ + + `extractDir`:idx: + `os.html#124 <os.html#124>`_ + + `extractFilename`:idx: + `os.html#125 <os.html#125>`_ + + `ExtSep`:idx: + `os.html#107 
<os.html#107>`_ + + `fastcall`:idx: + `manual.html#169 <manual.html#169>`_ + + `fatal`:idx: + `manual.html#227 <manual.html#227>`_ + + `FileSystemCaseSensitive`:idx: + `os.html#106 <os.html#106>`_ + + `finally`:idx: + `manual.html#190 <manual.html#190>`_ + + `find`:idx: + * `regexprs.html#109 <regexprs.html#109>`_ + * `regexprs.html#110 <regexprs.html#110>`_ + + `findSubStr`:idx: + * `strutils.html#112 <strutils.html#112>`_ + * `strutils.html#113 <strutils.html#113>`_ + + `FlushFile`:idx: + `system.html#370 <system.html#370>`_ + + `for`:idx: + `manual.html#205 <manual.html#205>`_ + + `form feed`:idx: + `manual.html#124 <manual.html#124>`_ + + `forward`:idx: + `manual.html#202 <manual.html#202>`_ + + `frexp`:idx: + `math.html#109 <math.html#109>`_ + + `functional`:idx: + `manual.html#163 <manual.html#163>`_ + + `funtions`:idx: + `manual.html#200 <manual.html#200>`_ + + `GC_disable`:idx: + `system.html#354 <system.html#354>`_ + + `GC_disableMarkAndSweep`:idx: + `system.html#360 <system.html#360>`_ + + `GC_enable`:idx: + `system.html#355 <system.html#355>`_ + + `GC_enableMarkAndSweep`:idx: + `system.html#359 <system.html#359>`_ + + `GC_fullCollect`:idx: + `system.html#356 <system.html#356>`_ + + `GC_setStrategy`:idx: + `system.html#358 <system.html#358>`_ + + `generic character types`:idx: + `regexprs.html#102 <regexprs.html#102>`_ + + `Generics`:idx: + `manual.html#209 <manual.html#209>`_ + + `getApplicationDir`:idx: + `os.html#108 <os.html#108>`_ + + `getApplicationFilename`:idx: + `os.html#109 <os.html#109>`_ + + `getClockStr`:idx: + `times.html#112 <times.html#112>`_ + + `getConfigDir`:idx: + `os.html#113 <os.html#113>`_ + + `getCurrentDir`:idx: + `os.html#110 <os.html#110>`_ + + `getCurrentExceptionMsg`:idx: + * `manual.html#184 <manual.html#184>`_ + * `system.html#334 <system.html#334>`_ + + `getDateStr`:idx: + `times.html#111 <times.html#111>`_ + + `getEnv`:idx: + `os.html#139 <os.html#139>`_ + + `getFilePos`:idx: + `system.html#389 <system.html#389>`_ + + 
`getFileSize`:idx: + `system.html#381 <system.html#381>`_ + + `getFreeMem`:idx: + `system.html#339 <system.html#339>`_ + + `getGMTime`:idx: + `times.html#107 <times.html#107>`_ + + `getHomeDir`:idx: + `os.html#112 <os.html#112>`_ + + `getLastModificationTime`:idx: + `os.html#137 <os.html#137>`_ + + `getLocalTime`:idx: + `times.html#106 <times.html#106>`_ + + `getOccupiedMem`:idx: + `system.html#338 <system.html#338>`_ + + `getRefcount`:idx: + `system.html#333 <system.html#333>`_ + + `getStartMilsecs`:idx: + `times.html#114 <times.html#114>`_ + + `getTime`:idx: + `times.html#105 <times.html#105>`_ + + `getTotalMem`:idx: + `system.html#340 <system.html#340>`_ + + `header`:idx: + `nimrodc.html#103 <nimrodc.html#103>`_ + + `high`:idx: + `system.html#105 <system.html#105>`_ + + `hint`:idx: + * `manual.html#221 <manual.html#221>`_ + * `manual.html#229 <manual.html#229>`_ + + `hypot`:idx: + `math.html#116 <math.html#116>`_ + + `identifier`:idx: + `manual.html#105 <manual.html#105>`_ + + `Identifiers`:idx: + `manual.html#116 <manual.html#116>`_ + + `if`:idx: + `manual.html#181 <manual.html#181>`_ + + `implicit block`:idx: + `manual.html#207 <manual.html#207>`_ + + `import`:idx: + `manual.html#217 <manual.html#217>`_ + + `in`:idx: + `system.html#239 <system.html#239>`_ + + `inc`:idx: + `system.html#143 <system.html#143>`_ + + `incl`:idx: + `system.html#149 <system.html#149>`_ + + `indentation sensitive`:idx: + `manual.html#113 <manual.html#113>`_ + + `inf`:idx: + `system.html#335 <system.html#335>`_ + + `information hiding`:idx: + `manual.html#215 <manual.html#215>`_ + + `inline`:idx: + `manual.html#168 <manual.html#168>`_ + + `in_Operator`:idx: + * `strutils.html#121 <strutils.html#121>`_ + * `strutils.html#122 <strutils.html#122>`_ + + `in_Operator`:idx: + `system.html#238 <system.html#238>`_ + + `intToStr`:idx: + `strutils.html#124 <strutils.html#124>`_ + + `is`:idx: + `system.html#241 <system.html#241>`_ + + `is_not`:idx: + `system.html#242 <system.html#242>`_ + + 
`isPowerOfTwo`:idx: + `math.html#102 <math.html#102>`_ + + `items`:idx: + * `system.html#343 <system.html#343>`_ + * `system.html#344 <system.html#344>`_ + * `system.html#345 <system.html#345>`_ + * `system.html#346 <system.html#346>`_ + * `system.html#347 <system.html#347>`_ + * `system.html#348 <system.html#348>`_ + + `iterator`:idx: + `manual.html#206 <manual.html#206>`_ + + `iterOverEnvironment`:idx: + `os.html#144 <os.html#144>`_ + + `JoinPath`:idx: + * `os.html#116 <os.html#116>`_ + * `os.html#118 <os.html#118>`_ + + `keywords`:idx: + `manual.html#117 <manual.html#117>`_ + + `l-values`:idx: + `manual.html#107 <manual.html#107>`_ + + `len`:idx: + * `system.html#145 <system.html#145>`_ + * `system.html#146 <system.html#146>`_ + * `system.html#147 <system.html#147>`_ + * `system.html#148 <system.html#148>`_ + + `line feed`:idx: + `manual.html#123 <manual.html#123>`_ + + `line_dir`:idx: + `nimrodc.html#105 <nimrodc.html#105>`_ + + `lines`:idx: + `system.html#390 <system.html#390>`_ + + `line_trace`:idx: + `nimrodc.html#107 <nimrodc.html#107>`_ + + `Literal strings`:idx: + `manual.html#119 <manual.html#119>`_ + + `ln`:idx: + `math.html#107 <math.html#107>`_ + + `locations`:idx: + `manual.html#101 <manual.html#101>`_ + + `log10`:idx: + `math.html#117 <math.html#117>`_ + + `low`:idx: + `system.html#106 <system.html#106>`_ + + `Macros`:idx: + `manual.html#212 <manual.html#212>`_ + + `match`:idx: + * `regexprs.html#106 <regexprs.html#106>`_ + * `regexprs.html#107 <regexprs.html#107>`_ + + `matchLen`:idx: + `regexprs.html#108 <regexprs.html#108>`_ + + `max`:idx: + * `system.html#172 <system.html#172>`_ + * `system.html#191 <system.html#191>`_ + * `system.html#203 <system.html#203>`_ + + `MaxSubpatterns`:idx: + `regexprs.html#105 <regexprs.html#105>`_ + + `methods`:idx: + `manual.html#199 <manual.html#199>`_ + + `min`:idx: + * `system.html#171 <system.html#171>`_ + * `system.html#190 <system.html#190>`_ + * `system.html#202 <system.html#202>`_ + + `mod`:idx: + * 
`system.html#161 <system.html#161>`_ + * `system.html#180 <system.html#180>`_ + + `module`:idx: + `manual.html#214 <manual.html#214>`_ + + `moveFile`:idx: + `os.html#132 <os.html#132>`_ + + `moveMem`:idx: + `system.html#294 <system.html#294>`_ + + `nan`:idx: + `system.html#336 <system.html#336>`_ + + `Natural`:idx: + `system.html#114 <system.html#114>`_ + + `new`:idx: + * `system.html#103 <system.html#103>`_ + * `system.html#104 <system.html#104>`_ + + `newline`:idx: + `manual.html#121 <manual.html#121>`_ + + `newString`:idx: + `system.html#291 <system.html#291>`_ + + `nextPowerOfTwo`:idx: + `math.html#101 <math.html#101>`_ + + `nimcall`:idx: + `manual.html#170 <manual.html#170>`_ + + `NimrodMajor`:idx: + `system.html#278 <system.html#278>`_ + + `NimrodMinor`:idx: + `system.html#279 <system.html#279>`_ + + `NimrodPatch`:idx: + `system.html#280 <system.html#280>`_ + + `NimrodVersion`:idx: + `system.html#277 <system.html#277>`_ + + `nl`:idx: + `strutils.html#104 <strutils.html#104>`_ + + `noconv`:idx: + `manual.html#173 <manual.html#173>`_ + + `no_decl`:idx: + `nimrodc.html#101 <nimrodc.html#101>`_ + + `normalize`:idx: + `strutils.html#111 <strutils.html#111>`_ + + `no_static`:idx: + `nimrodc.html#104 <nimrodc.html#104>`_ + + `not`:idx: + * `system.html#102 <system.html#102>`_ + * `system.html#156 <system.html#156>`_ + * `system.html#175 <system.html#175>`_ + + `not_in`:idx: + `system.html#240 <system.html#240>`_ + + `Numerical constants`:idx: + `manual.html#137 <manual.html#137>`_ + + `object`:idx: + `manual.html#156 <manual.html#156>`_ + + `openarray`:idx: + `system.html#109 <system.html#109>`_ + + `OpenFile`:idx: + `system.html#366 <system.html#366>`_ + + `operator`:idx: + `manual.html#139 <manual.html#139>`_ + + `Operators`:idx: + `manual.html#204 <manual.html#204>`_ + + `or`:idx: + * `system.html#165 <system.html#165>`_ + * `system.html#184 <system.html#184>`_ + * `system.html#205 <system.html#205>`_ + + `ord`:idx: + `system.html#152 <system.html#152>`_ + + 
`Ordinal types`:idx: + `manual.html#142 <manual.html#142>`_ + + `paramCount`:idx: + `os.html#141 <os.html#141>`_ + + `paramStr`:idx: + `os.html#142 <os.html#142>`_ + + `ParDir`:idx: + `os.html#102 <os.html#102>`_ + + `parentDir`:idx: + `os.html#120 <os.html#120>`_ + + `ParseFloat`:idx: + `strutils.html#126 <strutils.html#126>`_ + + `ParseInt`:idx: + `strutils.html#125 <strutils.html#125>`_ + + `PathSep`:idx: + `os.html#105 <os.html#105>`_ + + `PFloat32`:idx: + `system.html#269 <system.html#269>`_ + + `PFloat64`:idx: + `system.html#270 <system.html#270>`_ + + `PInt32`:idx: + `system.html#272 <system.html#272>`_ + + `PInt64`:idx: + `system.html#271 <system.html#271>`_ + + `PObject`:idx: + `system.html#117 <system.html#117>`_ + + `pointers`:idx: + `manual.html#159 <manual.html#159>`_ + + `Positive`:idx: + `system.html#115 <system.html#115>`_ + + `pow`:idx: + `math.html#121 <math.html#121>`_ + + `pred`:idx: + `system.html#142 <system.html#142>`_ + + `procedural type`:idx: + `manual.html#162 <manual.html#162>`_ + + `procedures`:idx: + `manual.html#201 <manual.html#201>`_ + + `push/pop`:idx: + `manual.html#230 <manual.html#230>`_ + + `putEnv`:idx: + `os.html#138 <os.html#138>`_ + + `quit`:idx: + `system.html#286 <system.html#286>`_ + + `QuitFailure`:idx: + `system.html#274 <system.html#274>`_ + + `QuitSuccess`:idx: + `system.html#273 <system.html#273>`_ + + `quotation mark`:idx: + `manual.html#128 <manual.html#128>`_ + + `random`:idx: + `math.html#104 <math.html#104>`_ + + `randomize`:idx: + `math.html#105 <math.html#105>`_ + + `range`:idx: + `system.html#107 <system.html#107>`_ + + `re-raised`:idx: + `manual.html#185 <manual.html#185>`_ + + `readBuffer`:idx: + `system.html#384 <system.html#384>`_ + + `ReadBytes`:idx: + `system.html#382 <system.html#382>`_ + + `readChar`:idx: + `system.html#369 <system.html#369>`_ + + `ReadChars`:idx: + `system.html#383 <system.html#383>`_ + + `readFile`:idx: + `system.html#371 <system.html#371>`_ + + `readLine`:idx: + `system.html#378 
<system.html#378>`_ + + `realloc`:idx: + `system.html#298 <system.html#298>`_ + + `record`:idx: + `manual.html#155 <manual.html#155>`_ + + `Recursive module dependancies`:idx: + `manual.html#218 <manual.html#218>`_ + + `register`:idx: + `nimrodc.html#110 <nimrodc.html#110>`_ + + `removeDir`:idx: + `os.html#134 <os.html#134>`_ + + `removeFile`:idx: + `os.html#133 <os.html#133>`_ + + `repeatChar`:idx: + `strutils.html#130 <strutils.html#130>`_ + + `replaceStr`:idx: + `strutils.html#114 <strutils.html#114>`_ + + `repr`:idx: + `system.html#254 <system.html#254>`_ + + `result`:idx: + * `manual.html#192 <manual.html#192>`_ + * `manual.html#203 <manual.html#203>`_ + + `return`:idx: + `manual.html#191 <manual.html#191>`_ + + `safe`:idx: + `manual.html#112 <manual.html#112>`_ + + `safecall`:idx: + `manual.html#167 <manual.html#167>`_ + + `sameFile`:idx: + `os.html#143 <os.html#143>`_ + + `scope`:idx: + * `manual.html#106 <manual.html#106>`_ + * `manual.html#219 <manual.html#219>`_ + + `separate compilation`:idx: + `manual.html#216 <manual.html#216>`_ + + `seq`:idx: + `system.html#110 <system.html#110>`_ + + `Sequences`:idx: + `manual.html#154 <manual.html#154>`_ + + `set`:idx: + `system.html#111 <system.html#111>`_ + + `set type`:idx: + `manual.html#158 <manual.html#158>`_ + + `setCurrentDir`:idx: + `os.html#111 <os.html#111>`_ + + `setFilePos`:idx: + `system.html#388 <system.html#388>`_ + + `setLen`:idx: + * `system.html#290 <system.html#290>`_ + * `system.html#300 <system.html#300>`_ + + `shl`:idx: + * `system.html#163 <system.html#163>`_ + * `system.html#182 <system.html#182>`_ + + `shr`:idx: + * `system.html#162 <system.html#162>`_ + * `system.html#181 <system.html#181>`_ + + `simple assertions`:idx: + `regexprs.html#103 <regexprs.html#103>`_ + + `simple statements`:idx: + `manual.html#175 <manual.html#175>`_ + + `sinh`:idx: + `math.html#118 <math.html#118>`_ + + `sizeof`:idx: + `system.html#140 <system.html#140>`_ + + `split`:idx: + `strutils.html#117 
<strutils.html#117>`_ + + `SplitFilename`:idx: + `os.html#123 <os.html#123>`_ + + `SplitPath`:idx: + `os.html#119 <os.html#119>`_ + + `splitSeq`:idx: + `strutils.html#118 <strutils.html#118>`_ + + `sqrt`:idx: + * `math.html#106 <math.html#106>`_ + * `complex.html#109 <complex.html#109>`_ + + `stack_trace`:idx: + `nimrodc.html#106 <nimrodc.html#106>`_ + + `startsWith`:idx: + `strutils.html#131 <strutils.html#131>`_ + + `Statements`:idx: + `manual.html#174 <manual.html#174>`_ + + `static error`:idx: + `manual.html#109 <manual.html#109>`_ + + `static type`:idx: + `manual.html#103 <manual.html#103>`_ + + `stdcall`:idx: + `manual.html#165 <manual.html#165>`_ + + `stderr`:idx: + `system.html#365 <system.html#365>`_ + + `stdin`:idx: + `system.html#363 <system.html#363>`_ + + `stdout`:idx: + `system.html#364 <system.html#364>`_ + + `string`:idx: + `manual.html#151 <manual.html#151>`_ + + `strip`:idx: + `strutils.html#105 <strutils.html#105>`_ + + `strStart`:idx: + `strutils.html#103 <strutils.html#103>`_ + + `structured type`:idx: + `manual.html#152 <manual.html#152>`_ + + `style-insensitive`:idx: + `manual.html#118 <manual.html#118>`_ + + `subrange`:idx: + `manual.html#150 <manual.html#150>`_ + + `succ`:idx: + `system.html#141 <system.html#141>`_ + + `swap`:idx: + `system.html#302 <system.html#302>`_ + + `syscall`:idx: + `manual.html#172 <manual.html#172>`_ + + `system`:idx: + `manual.html#220 <manual.html#220>`_ + + `tabulator`:idx: + `manual.html#125 <manual.html#125>`_ + + `TAddress`:idx: + `system.html#255 <system.html#255>`_ + + `tan`:idx: + `math.html#119 <math.html#119>`_ + + `tanh`:idx: + `math.html#120 <math.html#120>`_ + + `TCharSet`:idx: + `strutils.html#101 <strutils.html#101>`_ + + `TComplex`:idx: + `complex.html#101 <complex.html#101>`_ + + `template`:idx: + `manual.html#211 <manual.html#211>`_ + + `TEndian`:idx: + `system.html#268 <system.html#268>`_ + + `TFile`:idx: + `system.html#361 <system.html#361>`_ + + `TFileMode`:idx: + `system.html#362 
<system.html#362>`_ + + `TFloatClass`:idx: + `math.html#122 <math.html#122>`_ + + `TGC_Strategy`:idx: + `system.html#357 <system.html#357>`_ + + `TimeInfoToTime`:idx: + `times.html#108 <times.html#108>`_ + + `TMonth`:idx: + `times.html#101 <times.html#101>`_ + + `toBiggestFloat`:idx: + `system.html#283 <system.html#283>`_ + + `toBiggestInt`:idx: + `system.html#285 <system.html#285>`_ + + `toBin`:idx: + `strutils.html#135 <strutils.html#135>`_ + + `TObject`:idx: + `system.html#116 <system.html#116>`_ + + `toFloat`:idx: + `system.html#282 <system.html#282>`_ + + `toHex`:idx: + `strutils.html#123 <strutils.html#123>`_ + + `toInt`:idx: + `system.html#284 <system.html#284>`_ + + `toLower`:idx: + * `strutils.html#106 <strutils.html#106>`_ + * `strutils.html#107 <strutils.html#107>`_ + + `toOct`:idx: + `strutils.html#134 <strutils.html#134>`_ + + `toOctal`:idx: + `strutils.html#116 <strutils.html#116>`_ + + `toString`:idx: + `strutils.html#127 <strutils.html#127>`_ + + `toU16`:idx: + `system.html#308 <system.html#308>`_ + + `toU32`:idx: + `system.html#309 <system.html#309>`_ + + `toU8`:idx: + `system.html#307 <system.html#307>`_ + + `toUpper`:idx: + * `strutils.html#108 <strutils.html#108>`_ + * `strutils.html#109 <strutils.html#109>`_ + + `traced`:idx: + `manual.html#160 <manual.html#160>`_ + + `TResult`:idx: + `system.html#139 <system.html#139>`_ + + `try`:idx: + `manual.html#187 <manual.html#187>`_ + + `TTime`:idx: + `times.html#103 <times.html#103>`_ + + `TTimeInfo`:idx: + `times.html#104 <times.html#104>`_ + + `tuple`:idx: + `system.html#112 <system.html#112>`_ + + `TWeekDay`:idx: + `times.html#102 <times.html#102>`_ + + `type`:idx: + * `manual.html#102 <manual.html#102>`_ + * `manual.html#141 <manual.html#141>`_ + * `manual.html#208 <manual.html#208>`_ + + `type parameters`:idx: + `manual.html#210 <manual.html#210>`_ + + `type suffix`:idx: + `manual.html#138 <manual.html#138>`_ + + `unchecked runtime error`:idx: + `manual.html#111 <manual.html#111>`_ + + 
`undef`:idx: + `manual.html#225 <manual.html#225>`_ + + `UnixToNativePath`:idx: + `os.html#122 <os.html#122>`_ + + `unsigned integer`:idx: + `manual.html#143 <manual.html#143>`_ + + `unsigned operations`:idx: + `manual.html#144 <manual.html#144>`_ + + `untraced`:idx: + `manual.html#161 <manual.html#161>`_ + + `Var`:idx: + `manual.html#179 <manual.html#179>`_ + + `varargs`:idx: + `nimrodc.html#102 <nimrodc.html#102>`_ + + `vertical tabulator`:idx: + `manual.html#126 <manual.html#126>`_ + + `volatile`:idx: + `nimrodc.html#109 <nimrodc.html#109>`_ + + `walkFiles`:idx: + `os.html#145 <os.html#145>`_ + + `warning`:idx: + * `manual.html#222 <manual.html#222>`_ + * `manual.html#228 <manual.html#228>`_ + + `when`:idx: + `manual.html#183 <manual.html#183>`_ + + `while`:idx: + `manual.html#196 <manual.html#196>`_ + + `Whitespace`:idx: + `strutils.html#102 <strutils.html#102>`_ + + `write`:idx: + * `system.html#372 <system.html#372>`_ + * `system.html#373 <system.html#373>`_ + * `system.html#374 <system.html#374>`_ + * `system.html#375 <system.html#375>`_ + * `system.html#376 <system.html#376>`_ + * `system.html#377 <system.html#377>`_ + + `writeBuffer`:idx: + `system.html#387 <system.html#387>`_ + + `writeBytes`:idx: + `system.html#385 <system.html#385>`_ + + `writeChars`:idx: + `system.html#386 <system.html#386>`_ + + `writeln`:idx: + `system.html#379 <system.html#379>`_ + + `xor`:idx: + * `system.html#166 <system.html#166>`_ + * `system.html#185 <system.html#185>`_ + * `system.html#206 <system.html#206>`_ + + `yield`:idx: + `manual.html#193 <manual.html#193>`_ + + `ze`:idx: + * `system.html#303 <system.html#303>`_ + * `system.html#304 <system.html#304>`_ + * `system.html#306 <system.html#306>`_ + + `ze64`:idx: + `system.html#305 <system.html#305>`_ + + `zeroMem`:idx: + `system.html#292 <system.html#292>`_ \ No newline at end of file diff --git a/doc/tutorial.txt b/doc/tutorial.txt new file mode 100755 index 000000000..f37b116c4 --- /dev/null +++ b/doc/tutorial.txt @@ -0,0 
+1,215 @@ +=========================================== +Tutorial of the Nimrod Programming Language +=========================================== + +:Author: Andreas Rumpf + +Motivation +========== + +Why yet another programming language? + +Look at the trends behind all the new programming languages: + +* They try to be dynamic: Dynamic typing, dynamic method binding, etc. + In my opinion most of what the dynamic features buy could be achieved + with static means in a more efficient and *understandable* way. + +* They depend on big runtime environments which you need to + ship with your program as each new version of these may break compatibility + in subtle ways or you use recently added features - thus forcing your + users to update their runtime environment. Compiled programs where the + executable contains all needed code are simply the better solution. + +* They are unsuitable for systems programming: Do you really want to + write an operating system, a device driver or an interpreter in a language + that is just-in-time compiled (or interpreted)? + + +So what is lacking are *good* systems programming languages. Nimrod is such a +language. It offers the following features: + +* It is readable: It reads from left to right (unlike the C-syntax + languages). +* It is strongly and statically typed: This enables the compiler to find + more errors. Static typing also makes programs more *readable*. +* It is compiled. (Currently this is done via compilation to C.) +* It is garbage collected. Big systems need garbage collection. Manual + memory management is also supported through *untraced pointers*. +* It scales because high level features are also available: It has built-in + bit sets, strings, enumerations, objects, arrays and dynamically resizeable + arrays (called *sequences*).
+* It has high performance: The current implementation compiles to C + and uses a Deutsch-Bobrow garbage collector together with Christopher's + partial mark-sweep garbage collector leading to excellent execution + speed and a small memory footprint. +* It has real modules with proper interfaces and supports separate + compilation. +* It is portable: It compiles to C and platform specific features have + been separated and documented. So even if your platform is not supported, + porting should be easy. +* It is flexible: Although primarily a procedural language, generic, + functional and object-oriented programming are also supported. +* It is easy to learn, easy to use and leads to elegant programs. +* You can link an embedded debugger to your program (ENDB). ENDB is + very easy to use - there is no need to clutter your code with + ``echo`` statements for proper debugging. + + +Introduction +============ + +This document is a tutorial for the programming language *Nimrod*. It should +be a readable quick tour through the language instead of a dry specification +(which can be found `here <manual.html>`_). This tutorial assumes that +the reader already knows some other programming language such as Pascal. Thus +it is detailed in cases where Nimrod differs from other programming languages +and kept short where Nimrod is more or less the same. + + +A quick tour through the language +================================= + +The first program +----------------- + +We start the tour with a modified "hello world" program: + +.. code-block:: Nimrod + # This is a comment + # Standard IO-routines are always accessible + write(stdout, "What's your name? ") + var name: string = readLine(stdin) + write(stdout, "Hi, " & name & "!\n") + + +Save this code to the file "greeting.nim". Now compile and run it:: + + nimrod run greeting.nim + +As you see, with the ``run`` command Nimrod executes the file automatically +after compilation.
You can even give your program command line arguments by +appending them after the filename that is to be compiled and run:: + + nimrod run greeting.nim arg1 arg2 + +Though it should be pretty obvious what the program does, I will explain the +syntax: Statements which are not indented are executed when the program +starts. Indentation is Nimrod's way of grouping statements. String literals +are enclosed in double quotes. The ``var`` statement declares a new variable +named ``name`` of type ``string`` with the value that is returned by the +``readLine`` procedure. Since the compiler knows that ``readLine`` returns +a string, you can leave out the type in the declaration. So this will work too: + +.. code-block:: Nimrod + var name = readLine(stdin) + +Note that this is the only form of type inference that exists in Nimrod; +it yields a good compromise between brevity and readability. + +The ``&`` operator concatenates strings. ``\n`` stands for the +new line character(s). On some operating systems ``\n`` is represented by +*two* characters: Carriage Return and Linefeed. That is why +*character literals* cannot contain ``\n``. But since Nimrod handles strings +so well, this is a nonissue. + +The "hello world" program contains several identifiers that are already +known to the compiler: ``write``, ``stdout``, ``readLine``, etc. These +built-in items are declared in the system_ module which is implicitly +imported by any other module. + + +Lexical elements +---------------- + +Let us look into Nimrod's lexical elements in more detail: Like other +programming languages Nimrod consists of identifiers, keywords, comments, +operators, and other punctuation marks. Case is *insignificant* in Nimrod and +even underscores are ignored: ``This_is_an_identifier`` is the same +identifier as ``ThisIsAnIdentifier``. This feature enables one to use other +people's code without bothering about a naming convention that one does not +like.
+ +String literals are enclosed in double quotes, character literals in single +quotes. There also exist *raw* string and character literals: + +.. code-block:: Nimrod + r"C:\program files\nim" + +In raw literals the backslash is not an escape character, so they fit +the principle *what you see is what you get*. *Long string literals* +are also available (``""" ... """``); they can span multiple lines +and the ``\`` is not an escape character either. They are very useful +for embedding SQL code templates for example. + +Comments start with ``#`` and run till the end of the line. (Well, this is not +quite true, but you should read the manual for a proper explanation.) + +... XXX number literals + + +The usual statements - if, while, for, case +------------------------------------------- + +In Nimrod indentation is used to group statements. +An example showing the most common statement types: + +.. code-block:: Nimrod + var name = readLine(stdin) + + if name == "Andreas": + echo("What a nice name!") + elif name == "": + echo("Don't you have a name?") + else: + echo("Boring name...") + + for i in 0..len(name)-1: + if name[i] == 'm': + echo("hey, there is an *m* in your name!") + + echo("Please give your password: \n") + var pw = readLine(stdin) + + while pw != "12345": + echo("Wrong password! Next try: \n") + pw = readLine(stdin) + + echo("""Login complete! + What do you want to do? + delete-everything + restart-computer + go-for-a-walk + """) + + case readLine(stdin) + of "delete-everything", "restart-computer": + echo("permission denied") + of "go-for-a-walk": echo("please yourself") + else: echo("unknown command") + + +.. + Types + ----- + + Nimrod has a rich type system. This tutorial only gives a few examples. Read + the `manual <manual.html>`_ for further information: + + .. code-block:: Nimrod + type + TMyRecord = record + x, y: int + + + Procedures + ---------- + + Procedures are subroutines. They are declared in this way: + + ..
code-block:: Nimrod + proc findSubStr(sub: string, + + +.. _strutils: strutils.html +.. _system: system.html |