author | Grzegorz Adam Hankiewicz <gradha@imap.cc> | 2013-11-15 23:04:21 +0100 |
---|---|---|
committer | Grzegorz Adam Hankiewicz <gradha@imap.cc> | 2013-11-16 20:45:57 +0100 |
commit | 38eb67de835db986105a7340def055cad704f697 (patch) | |
tree | 7586babecfb3bd9b25ae3ff9d293d3e5bec32a6b | |
parent | 9061b8961eb5cefd31eba9314a5932f35f524eac (diff) | |
download | Nim-38eb67de835db986105a7340def055cad704f697.tar.gz |
Expands tutorial macro section with step by step guide.
-rw-r--r-- | doc/tut2.txt | 269 |
1 files changed, 260 insertions, 9 deletions
diff --git a/doc/tut2.txt b/doc/tut2.txt
index e1e36bfc4..4d8d0be15 100644
--- a/doc/tut2.txt
+++ b/doc/tut2.txt
@@ -699,15 +699,22 @@ once.
 Macros
 ======
 
-Macros enable advanced compile-time code transformations, but they
-cannot change Nimrod's syntax. However, this is no real restriction because
-Nimrod's syntax is flexible enough anyway.
-
-To write a macro, one needs to know how the Nimrod concrete syntax is converted
-to an abstract syntax tree (AST). The AST is documented in the
-`macros <macros.html>`_ module.
-
-There are two ways to invoke a macro:
+Macros enable advanced compile-time code transformations, but they cannot
+change Nimrod's syntax. However, this is no real restriction because Nimrod's
+syntax is flexible enough anyway. Macros have to be implemented in pure Nimrod
+code if `foreign function interface (FFI)
+<manual.html#foreign-function-interface>`_ is not enabled in the compiler, but
+other than that restriction (which at some point in the future will go away)
+you can write any kind of Nimrod code and the compiler will run it at compile
+time.
+
+There are two ways to write a macro, either *generating* Nimrod source code and
+letting the compiler parse it, or creating manually an abstract syntax tree
+(AST) which you feed to the compiler. In order to build the AST one needs to
+know how the Nimrod concrete syntax is converted to an abstract syntax tree
+(AST). The AST is documented in the `macros <macros.html>`_ module.
+
+Once your macro is finished, there are two ways to invoke it:
 (1) invoking a macro like a procedure call (`expression macros`:idx:)
 (2) invoking a macro with the special ``macrostmt`` syntax
     (`statement macros`:idx:)
@@ -796,3 +803,247 @@ Term rewriting macros
 Term rewriting macros can be used to enhance the compilation process
 with user defined optimizations; see this `document <trmacros.html>`_ for
 further information.
+
+
+Building your first macro
+-------------------------
+
+To give a footstart to writing macros we will show now how to turn your typical
+dynamic code into something that compiles statically. For the exercise we will
+use the following snippet of code as the starting point:
+
+.. code-block:: nimrod
+
+  import strutils, tables
+
+  proc readCfgAtRuntime(cfgFilename: string): TTable[string, string] =
+    let
+      inputString = readFile(cfgFilename)
+    var
+      rawLines = split(inputString, {char(0x0a), char(0x0d)})
+      source = ""
+
+    result = initTable[string, string]()
+    for line in rawLines:
+      var chunks = split(line, ',')
+      if chunks.len != 2:
+        quit("Input needs comma split values, got: " & line)
+      result[chunks[0]] = chunks[1]
+
+    if result.len < 1: quit("Input file empty!")
+
+  let info = readCfgAtRuntime("data.cfg")
+
+  when isMainModule:
+    echo info["licenseOwner"]
+    echo info["licenseKey"]
+    echo info["version"]
+
+Presumably this snippet of code could be used in a commercial software, reading
+a configuration file to display information about the person who bought the
+software. This external file would be generated by an online web shopping cart
+to be included along the program containing the license information::
+
+  version,1.1
+  licenseOwner,Hyori Lee
+  licenseKey,M1Tl3PjBWO2CC48m
+
+The ``readCfgAtRuntime`` proc will open the given filename and return a
+``TTable`` from the `tables module <tables.html>`_. The parsing of the file is
+done (without much care for handling invalid data or corner cases) using the
+``split`` proc from the `strutils module <strutils.html>`_. There are many
+things which can fail; mind the purpose is explaining how to make this run at
+compile time, not how to properly implement a DRM scheme.
+
+The reimplementation of this code as a compile time proc will allow us to get
+rid of the ``data.cfg`` file we would need to distribute along the binary, plus
+if the information is really constant, it doesn't make from a logical point of
+view to have it *mutable* in a global variable, it would be better if it was a
+constant. Finally, and likely the most valuable feature, we can implement some
+verification at compile time. You could think if this as a better *unit
+testing*, since it is impossible to obtain a binary unless everything is
+correct, preventing you to ship to users a broken program which won't start
+because a small critical file is missing or its contents changed by mistake to
+something invalid.
+
+
+Generating source code
+++++++++++++++++++++++
+
+Our first attempt will start by modifying the program to generate a compile
+time string with the *generated source code*, which we then pass to the
+``parseStmt`` proc from the `macros module <macros.html>`_. Here is the
+modified source code implementing the macro:
+
+.. code-block:: nimrod
+  import macros, strutils
+
+  macro readCfgAndBuildSource(cfgFilename: string): stmt =
+    let
+      inputString = slurp(cfgFilename.strVal)
+    var
+      rawLines = split(inputString, {char(0x0a), char(0x0d)})
+      source = ""
+
+    for line in rawLines:
+      var chunks = split(line, ',')
+      if chunks.len != 2:
+        error("Input needs comma split values, got: " & line)
+      source &= "const cfg" & chunks[0] & "= \"" & chunks[1] & "\"\n"
+
+    if source.len < 1: error("Input file empty!")
+    result = parseStmt(source)
+
+  readCfgAndBuildSource("data.cfg")
+
+  when isMainModule:
+    echo cfglicenseOwner
+    echo cfglicenseKey
+    echo cfgversion
+
+The good news is not much has changed! First, we need to change the handling of
+the input parameter. In the dynamic version the ``readCfgAtRuntime`` proc
+receives a string parameter. However, in the macro version it is also declared
+as string, but this is the *outside* interface of the macro. When the macro is
+run, it actually gets a ``PNimrodNode`` object instead of a string, and we have
+to call the ``strVal`` proc from the `macros module <macros.html>`_ to obtain
+the string being passed in to the macro.
+
+Second, we cannot use the ``readFile`` proc from the `system module
+<system.html>`_ due to FFI restriction at compile time. If we try to use this
+proc, or any other which depends on FFI, the compiler will error with the
+message ``cannot evaluate`` and a dump of the macro's source code, along with a
+stack trace where the compiler reached before bailing out. We can get around
+this limitation by using the ``slurp`` proc from the `system module
+<system.html>`_, which was precisely made for compilation time (just like
+``gorge`` which executes an external program and captures its output).
+
+The interesting thing is that our macro does not return a runtime ``TTable``
+object. Instead, it builds up Nimrod source code into the ``source`` variable.
+For each line of the configuration file a ``const`` variable will be generated.
+To avoid conflicts we prefix these variables with ``cfg``. In essence, what the
+compiler is doing is replacing the line calling the macro with the following
+snippet of code:
+
+.. code-block:: nimrod
+  const cfgversion= "1.1"
+  const cfglicenseOwner= "Hyori Lee"
+  const cfglicenseKey= "M1Tl3PjBWO2CC48m"
+
+You can verify this yourself adding the line ``echo source`` somewhere at the
+end of the macro and compiling the program. Another difference is that instead
+of calling the usual ``quit`` proc to abort (which we could still call) this
+version calls the ``error`` proc. The ``error`` proc has the same behavior as
+``quit`` but will dump also the source and file line information where the
+error happened, making it easier for the programmer to find where compilation
+failed. In this situation it would point to the line invoking the macro, but
+**not** the line of ``data.cfg`` we are processing, that's something the macro
+itself would need to control.
+
+
+Generating AST by hand
+++++++++++++++++++++++
+
+To generate an AST we would need to intimately know the structures used by the
+Nimrod compiler exposed in the `macros module <macros.html>`_, which at first
+look seems a daunting task. But we can use a helper shortcut the ``dumpTree``
+macro, which is used as a statement macro instead of an expression macro.
+Since we know that we want to generate a bunch of ``const`` symbols we can
+create the following source file and compile it to see what the compiler
+*expects* from us:
+
+.. code-block:: nimrod
+  import macros
+
+  dumpTree:
+    const cfgversion: string = "1.1"
+    const cfglicenseOwner= "Hyori Lee"
+    const cfglicenseKey= "M1Tl3PjBWO2CC48m"
+
+During compilation of the source code we should see the following lines in the
+output (again, since this is a macro, compilation is enough, you don't have to
+run any binary)::
+
+  StmtList
+    ConstSection
+      ConstDef
+        Ident !"cfgversion"
+        Ident !"string"
+        StrLit 1.1
+    ConstSection
+      ConstDef
+        Ident !"cfglicenseOwner"
+        Empty
+        StrLit Hyori Lee
+    ConstSection
+      ConstDef
+        Ident !"cfglicenseKey"
+        Empty
+        StrLit M1Tl3PjBWO2CC48m
+
+With this output we have a better idea of what kind of input the compiler
+expects. We need to generate a list of statements. For each constant the source
+code generates a ``ConstSection`` and a ``ConstDef``. If we were to move all
+the constants to a single ``const`` block we would see only a single
+``ConstSection`` with three children.
+
+Maybe you didn't notice, but in the ``dumpTree`` example the first constant
+explicitly specifies the type of the constant. That's why in the tree output
+the two last constants have their second child ``Empty`` but the first has a
+string identifier. So basically a ``const`` definition is made up from an
+identifier, optionally a type (can be an *empty* node) and the value. Armed
+with this knowledge, let's look at the finished version of the AST building
+macro:
+
+.. code-block:: nimrod
+  import macros, strutils
+
+  macro readCfgAndBuildAST(cfgFilename: string): stmt =
+    let
+      inputString = slurp(cfgFilename.strVal)
+    var
+      rawLines = split(inputString, {char(0x0a), char(0x0d)})
+
+    result = newNimNode(nnkStmtList)
+    for line in rawLines:
+      var chunks = split(line, ',')
+      if chunks.len != 2:
+        error("Input needs comma split values, got: " & line)
+      var
+        section = newNimNode(nnkConstSection)
+        constDef = newNimNode(nnkConstDef)
+      constDef.add(newIdentNode("cfg" & chunks[0]))
+      constDef.add(newEmptyNode())
+      constDef.add(newStrLitNode(chunks[1]))
+      section.add(constDef)
+      result.add(section)
+
+    if result.len < 1: error("Input file empty!")
+
+  readCfgAndBuildAST("data.cfg")
+
+  when isMainModule:
+    echo cfglicenseOwner
+    echo cfglicenseKey
+    echo cfgversion
+
+Since we are building on the previous example generating source code, we will
+only mention the differences to it. Instead of creating a temporary ``string``
+variable and writing into it source code as if it were written *by hand*, we
+use the ``result`` variable directly and create a statement list node
+(``nnkStmtList``) which will hold our children.
+
+For each input line we have to create a constant definition (``nnkConstDef``)
+and wrap it inside a constant section (``nnkConstSection``). Once these
+variables are created, we fill them hierarchichally like the previous AST dump
+tree showed: the constant definition is a child of the section definition, and
+the constant definition has an identifier node, an empty node (we let the
+compiler figure out the type), and a string literal with the value.
+
+A last tip when writing a macro: if you are not sure the AST you are building
+looks ok, you may be tempted to use the ``dumpTree`` macro. But you can't use
+it *inside* the macro you are writting/debugging. Instead ``echo`` the string
+generated by ``treeRepr``. If at the end of the this example you add ``echo
+treeRepr(result)`` you should get the same output as using the ``dumpTree``
+macro, but of course you can call that at any point of the macro where you
+might be having troubles.
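
Note for readers on a current Nim compiler: the AST-building approach added by this patch can be sketched as below. This is a minimal sketch under stated assumptions, not the code from the commit. It assumes today's `macros` module names (`untyped`, `static[string]`, `newStmtList`, `newConstStmt`, `newLit`) in place of the 2013 Nimrod API shown in the diff, the macro name `cfgConsts` is hypothetical, and the configuration text is passed inline rather than read with `slurp` so that the example does not need a `data.cfg` file.

```nim
import std/[macros, strutils]

# Hypothetical macro name; assumes a current Nim compiler, not 2013 Nimrod.
macro cfgConsts(cfgText: static[string]): untyped =
  ## Turns each "key,value" line into `const cfg<key> = "<value>"`.
  result = newStmtList()
  for line in cfgText.splitLines():
    if line.len == 0: continue            # tolerate blank or trailing lines
    let chunks = line.split(',')
    if chunks.len != 2:
      error("Input needs comma split values, got: " & line)
    # newConstStmt builds the ConstSection/ConstDef pair with an empty
    # type node, matching the tree printed by dumpTree in the patch.
    result.add newConstStmt(ident("cfg" & chunks[0]), newLit(chunks[1]))
  if result.len < 1: error("Input is empty!")

# Inline text stands in for the contents of data.cfg (an assumption).
cfgConsts("version,1.1\nlicenseOwner,Hyori Lee\nlicenseKey,M1Tl3PjBWO2CC48m\n")

when isMainModule:
  echo cfglicenseOwner
  echo cfglicenseKey
  echo cfgversion
```

Passing the text as `static[string]` means the macro body receives a plain string, so no `strVal` call is needed. Replacing the inline literal with `staticRead("data.cfg")`, which to my knowledge is the current name for `slurp`, should restore the file-based behaviour the tutorial describes.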