========================================================= Term rewriting macros for Nimrod ========================================================= :Author: Andreas Rumpf Term rewriting macros are macros or templates that have not only a *name* but also a *pattern* that is searched for after the semantic checking phase of the compiler: This means they provide an easy way to enhance the compilation pipeline with user defined optimizations: .. code-block:: nimrod template optMul{`*`(a, 2)}(a: int): int = a+a let x = 3 echo x * 2 The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the curlies is the pattern to match against. The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns if they are written in infix notation, so to match verbatim against ``*`` the ordinary function call syntax needs to be used. Unfortunately optimizations are hard to get right and even the tiny example is **wrong**: .. code-block:: nimrod template optMul{`*`(a, 2)}(a: int): int = a+a proc f(): int = echo "side effect!" result = 55 echo f() * 2 We cannot duplicate 'a' if it denotes an expression that has a side effect! Fortunately Nimrod supports side effect analysis: .. code-block:: nimrod template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a proc f(): int = echo "side effect!" result = 55 echo f() * 2 # not optimized ;-) So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We cannot really do that however as the following code only swaps arguments blindly: .. code-block:: nimrod template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a What optimizers really need to do is a *canonicalization*: .. code-block:: nimrod template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a The ``int{lit}`` parameter pattern matches against an expression of type ``int``, but only if it's a literal. Parameter constraints ===================== The parameter constraint expression can use the operators ``|`` (or), ``&`` (and) and ``~`` (not) and the following predicates: =================== ===================================================== Predicate Meaning =================== ===================================================== ``atom`` The matching node has no children. ``lit`` The matching node is a literal like "abc", 12. ``sym`` The matching node must be a symbol (a bound identifier). ``ident`` The matching node must be an identifier (an unbound identifier). ``call`` The matching AST must be a call/apply expression. ``lvalue`` The matching AST must be an lvalue. ``sideeffect`` The matching AST must have a side effect. ``nosideeffect`` The matching AST must have no side effect. ``param`` A symbol which is a parameter. ``genericparam`` A symbol which is a generic parameter. ``module`` A symbol which is a module. ``type`` A symbol which is a type. ``var`` A symbol which is a variable. ``let`` A symbol which is a ``let`` variable. ``const`` A symbol which is a constant. ``result`` The special ``result`` variable. ``proc`` A symbol which is a proc. ``method`` A symbol which is a method. ``iterator`` A symbol which is an iterator. ``converter`` A symbol which is a converter. ``macro`` A symbol which is a macro. ``template`` A symbol which is a template. ``field`` A symbol which is a field in a tuple or an object. ``enumfield`` A symbol which is a field in an enumeration. ``forvar`` A for loop variable. ``label`` A label (used in ``block`` statements). ``nk*`` The matching AST must have the specified kind. (Example: ``nkIfStmt`` denotes an ``if`` statement.) ``alias`` States that the marked parameter needs to alias with *some* other parameter. ``noalias`` States that *every* other parameter must not alias with the marked parameter. =================== ===================================================== The ``alias`` and ``noalias`` predicates refer not only to the matching AST, but also to every other bound parameter; syntactially they need to occur after the ordinary AST predicates: .. code-block:: nimrod template ex{a = b + c}(a: int{noalias}, b, c: int) = # this transformation is only valid if 'b' and 'c' do not alias 'a': a = b inc a, b Pattern operators ================= The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns if they are written in infix notation. The ``|`` operator ------------------ The ``|`` operator if used as infix operator creates an ordered choice: .. code-block:: nimrod template t{0|1}(): expr = 3 let a = 1 # outputs 3: echo a The matching is performed after the compiler performed some optimizations like constant folding, so the following does not work: .. code-block:: nimrod template t{0|1}(): expr = 3 # outputs 1: echo 1 The reason is that the compiler already transformed the 1 into "1" for the ``echo`` statement. However, a term rewriting macro should not change the semantics anyway. In fact they can be deactived with the ``--patterns:off`` command line option or temporarily with the ``patterns`` pragma. The ``{}`` operator ------------------- A pattern expression can be bound to a pattern parameter via the ``expr{param}`` notation: .. code-block:: nimrod template t{(0|1|2){x}}(x: expr): expr = x+1 let a = 1 # outputs 2: echo a The ``~`` operator ------------------ The ``~`` operator is the **not** operator in patterns: .. code-block:: nimrod template t{x = (~x){y} and (~x){z}}(x, y, z: bool): stmt = x = y if x: x = z var a = false b = true c = false a = b and c echo a The ``*`` operator ------------------ The ``*`` operator can *flatten* a nested binary expression like ``a & b & c`` to ``&(a, b, c)``: .. code-block:: nimrod var calls = 0 proc `&&`(s: varargs[string]): string = result = s[0] for i in 1..len(s)-1: result.add s[i] inc calls template optConc{ `&&` * a }(a: string): expr = &&a let space = " " echo "my" && (space & "awe" && "some " ) && "concat" # check that it's been optimized properly: doAssert calls == 1 The second operator of `*` must be a parameter; it is used to gather all the arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"`` is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``) which is flattened into a call expression; thus the invocation of ``optConc`` produces: .. code-block:: nimrod `&&`("my", space & "awe", "some ", "concat") The ``**`` operator ------------------- The ``**`` is much like the ``*`` operator, except that it gathers not only all the arguments, but also the matched operators in reverse polish notation: .. code-block:: nimrod import macros type TMatrix = object dummy: int proc `*`(a, b: TMatrix): TMatrix = nil proc `+`(a, b: TMatrix): TMatrix = nil proc `-`(a, b: TMatrix): TMatrix = nil proc `$`(a: TMatrix): string = result = $a.dummy proc mat21(): TMatrix = result.dummy = 21 macro optM{ (`+`|`-`|`*`) ** a }(a: TMatrix): expr = echo treeRepr(a) result = newCall(bindSym"mat21") var x, y, z: TMatrix echo x + y * z - x This passes the expression ``x + y * z - x`` to the ``optM`` macro as an ``nnkArgList`` node containing:: Arglist Sym "x" Sym "y" Sym "z" Sym "*" Sym "+" Sym "x" Sym "-" (Which is the reverse polish notation of ``x + y * z - x``.) Parameters ========== Parameters in a pattern are type checked in the matching process. If a parameter is of the type ``varargs`` it is treated specially and it can match 0 or more arguments in the AST to be matched against: .. code-block:: nimrod template optWrite{ write(f, x) ((write|writeln){w})(f, y) }(x, y: varargs[expr], f: TFile, w: expr) = w(f, x, y)