summary refs log blame commit diff stats
path: root/doc/manual/trmacros.txt
blob: 446896e286556a446a653588458f91a0142d5831 (plain) (tree)
> 2014-10-02 02:33:59 +0200 manual split up into multiple files; documented the new concurrency system' href='/ahoang/Nim/commit/doc/manual/trmacros.txt?h=devel&id=2c1f3f75f5d37db810113cf37e1b38c3b7b09ee7'>2c1f3f75f ^
1
2
3
4
5
6
7
8
9
10




                                                                              
                                                                         



                                                     
 




                                                                      
                                                                            




                                                                              
             


                                               
 


                       
 






                                                                           
 


                       
 

                                  


                                                                             
 





                                                                             
 
                                                          



                                                              
                                                                   






                                         
                                                                            






                                                                              
                                                                    
























                                                                            
                                                                       
                                                                             
                                                                        




                                                                              
                                                                       
                         
                                                                            
                                                                               




































                                                                              
                                                                             
                                                                





                                                                                
         
















                                                            
 











                                                                              
                  



                   
 















                                                                             

                                                                             















                                                                             
                   

                




                                                 

                     
                                                     


                                    
                     
 
                    


















                                                                     
                                                                      





                                                                             
                                
                                            
              
 























                                                                         
                                                            








                                                                             
                                                                            































                                                                             
                                                                  


                                                   
                                                       


                                                                         

                    
 
                                                                   


                                                                           

                               
 
              


                                                                         
Term rewriting macros
=====================

Term rewriting macros are macros or templates that have not only
a *name* but also a *pattern* that is searched for after the semantic checking
phase of the compiler: This means they provide an easy way to enhance the
compilation pipeline with user defined optimizations:

.. code-block:: nim
  template optMul{`*`(a, 2)}(a: int): int = a+a

  let x = 3
  echo x * 2

The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the
curlies is the pattern to match against. The operators ``*``,  ``**``,
``|``, ``~`` have a special meaning in patterns if they are written in infix
notation, so to match verbatim against ``*`` the ordinary function call syntax
needs to be used.


Unfortunately optimizations are hard to get right and even the tiny example
is **wrong**:

.. code-block:: nim
  template optMul{`*`(a, 2)}(a: int): int = a+a

  proc f(): int =
    echo "side effect!"
    result = 55

  echo f() * 2

We cannot duplicate 'a' if it denotes an expression that has a side effect!
Fortunately Nim supports side effect analysis:

.. code-block:: nim
  template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a

  proc f(): int =
    echo "side effect!"
    result = 55

  echo f() * 2 # not optimized ;-)

You can make one overload matching with a constraint and one without, and the
one with a constraint will have precedence, and so you can handle both cases
differently.

So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We
cannot really do that however as the following code only swaps arguments
blindly:

.. code-block:: nim
  template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a

What optimizers really need to do is a *canonicalization*:

.. code-block:: nim
  template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a

The ``int{lit}`` parameter pattern matches against an expression of
type ``int``, but only if it's a literal.



Parameter constraints
---------------------

The `parameter constraint`:idx: expression can use the operators ``|`` (or),
``&`` (and) and ``~`` (not) and the following predicates:

===================      =====================================================
Predicate                Meaning
===================      =====================================================
``atom``                 The matching node has no children.
``lit``                  The matching node is a literal like "abc", 12.
``sym``                  The matching node must be a symbol (a bound
                         identifier).
``ident``                The matching node must be an identifier (an unbound
                         identifier).
``call``                 The matching AST must be a call/apply expression.
``lvalue``               The matching AST must be an lvalue.
``sideeffect``           The matching AST must have a side effect.
``nosideeffect``         The matching AST must have no side effect.
``param``                A symbol which is a parameter.
``genericparam``         A symbol which is a generic parameter.
``module``               A symbol which is a module.
``type``                 A symbol which is a type.
``var``                  A symbol which is a variable.
``let``                  A symbol which is a ``let`` variable.
``const``                A symbol which is a constant.
``result``               The special ``result`` variable.
``proc``                 A symbol which is a proc.
``method``               A symbol which is a method.
``iterator``             A symbol which is an iterator.
``converter``            A symbol which is a converter.
``macro``                A symbol which is a macro.
``template``             A symbol which is a template.
``field``                A symbol which is a field in a tuple or an object.
``enumfield``            A symbol which is a field in an enumeration.
``forvar``               A for loop variable.
``label``                A label (used in ``block`` statements).
``nk*``                  The matching AST must have the specified kind.
                         (Example: ``nkIfStmt`` denotes an ``if`` statement.)
``alias``                States that the marked parameter needs to alias
                         with *some* other parameter.
``noalias``              States that *every* other parameter must not alias
                         with the marked parameter.
===================      =====================================================

Predicates that share their name with a keyword have to be escaped with
backticks: `` `const` ``.
The ``alias`` and ``noalias`` predicates refer not only to the matching AST,
but also to every other bound parameter; syntactically they need to occur after
the ordinary AST predicates:

.. code-block:: nim
  template ex{a = b + c}(a: int{noalias}, b, c: int) =
    # this transformation is only valid if 'b' and 'c' do not alias 'a':
    a = b
    inc a, c


Pattern operators
-----------------

The operators ``*``,  ``**``, ``|``, ``~`` have a special meaning in patterns
if they are written in infix notation.


The ``|`` operator
~~~~~~~~~~~~~~~~~~

The ``|`` operator if used as infix operator creates an ordered choice:

.. code-block:: nim
  template t{0|1}(): expr = 3
  let a = 1
  # outputs 3:
  echo a

The matching is performed after the compiler performed some optimizations like
constant folding, so the following does not work:

.. code-block:: nim
  template t{0|1}(): expr = 3
  # outputs 1:
  echo 1

The reason is that the compiler already transformed the 1 into "1" for
the ``echo`` statement. However, a term rewriting macro should not change the
semantics anyway. In fact they can be deactivated with the ``--patterns:off``
command line option or temporarily with the ``patterns`` pragma.


The ``{}`` operator
~~~~~~~~~~~~~~~~~~~

A pattern expression can be bound to a pattern parameter via the ``expr{param}``
notation:

.. code-block:: nim
  template t{(0|1|2){x}}(x: expr): expr = x+1
  let a = 1
  # outputs 2:
  echo a


The ``~`` operator
~~~~~~~~~~~~~~~~~~

The ``~`` operator is the **not** operator in patterns:

.. code-block:: nim
  template t{x = (~x){y} and (~x){z}}(x, y, z: bool): stmt =
    x = y
    if x: x = z

  var
    a = false
    b = true
    c = false
  a = b and c
  echo a


The ``*`` operator
~~~~~~~~~~~~~~~~~~

The ``*`` operator can *flatten* a nested binary expression like ``a & b & c``
to ``&(a, b, c)``:

.. code-block:: nim
  var
    calls = 0

  proc `&&`(s: varargs[string]): string =
    result = s[0]
    for i in 1..len(s)-1: result.add s[i]
    inc calls

  template optConc{ `&&` * a }(a: string): expr = &&a

  let space = " "
  echo "my" && (space & "awe" && "some " ) && "concat"

  # check that it's been optimized properly:
  doAssert calls == 1


The second operator of `*` must be a parameter; it is used to gather all the
arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"``
is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``)
which is flattened into a call expression; thus the invocation of ``optConc``
produces:

.. code-block:: nim
   `&&`("my", space & "awe", "some ", "concat")


The ``**`` operator
~~~~~~~~~~~~~~~~~~~

The ``**`` is much like the ``*`` operator, except that it gathers not only
all the arguments, but also the matched operators in reverse polish notation:

.. code-block:: nim
  import macros

  type
    Matrix = object
      dummy: int

  proc `*`(a, b: Matrix): Matrix = discard
  proc `+`(a, b: Matrix): Matrix = discard
  proc `-`(a, b: Matrix): Matrix = discard
  proc `$`(a: Matrix): string = result = $a.dummy
  proc mat21(): Matrix =
    result.dummy = 21

  macro optM{ (`+`|`-`|`*`) ** a }(a: Matrix): expr =
    echo treeRepr(a)
    result = newCall(bindSym"mat21")

  var x, y, z: Matrix

  echo x + y * z - x

This passes the expression ``x + y * z - x`` to the ``optM`` macro as
an ``nnkArgList`` node containing::

  Arglist
    Sym "x"
    Sym "y"
    Sym "z"
    Sym "*"
    Sym "+"
    Sym "x"
    Sym "-"

(Which is the reverse polish notation of ``x + y * z - x``.)


Parameters
----------

Parameters in a pattern are type checked in the matching process. If a
parameter is of the type ``varargs`` it is treated specially and it can match
0 or more arguments in the AST to be matched against:

.. code-block:: nim
  template optWrite{
    write(f, x)
    ((write|writeLine){w})(f, y)
  }(x, y: varargs[expr], f: File, w: expr) =
    w(f, x, y)



Example: Partial evaluation
---------------------------

The following example shows how some simple partial evaluation can be
implemented with term rewriting:

.. code-block:: nim
  proc p(x, y: int; cond: bool): int =
    result = if cond: x + y else: x - y

  template optP1{p(x, y, true)}(x, y: expr): expr = x + y
  template optP2{p(x, y, false)}(x, y: expr): expr = x - y


Example: Hoisting
-----------------

The following example shows how some form of hoisting can be implemented:

.. code-block:: nim
  import pegs

  template optPeg{peg(pattern)}(pattern: string{lit}): Peg =
    var gl {.global, gensym.} = peg(pattern)
    gl

  for i in 0 .. 3:
    echo match("(a b c)", peg"'(' @ ')'")
    echo match("W_HI_Le", peg"\y 'while'")

The ``optPeg`` template optimizes the case of a peg constructor with a string
literal, so that the pattern will only be parsed once at program startup and
stored in a global ``gl`` which is then re-used. This optimization is called
hoisting because it is comparable to classical loop hoisting.


AST based overloading
=====================

Parameter constraints can also be used for ordinary routine parameters; these
constraints affect ordinary overloading resolution then:

.. code-block:: nim
  proc optLit(a: string{lit|`const`}) =
    echo "string literal"
  proc optLit(a: string) =
    echo "no string literal"

  const
    constant = "abc"

  var
    variable = "xyz"

  optLit("literal")
  optLit(constant)
  optLit(variable)

However, the constraints ``alias`` and ``noalias`` are not available in
ordinary routines.


Move optimization
-----------------

The ``call`` constraint is particularly useful to implement a move
optimization for types that have copying semantics:

.. code-block:: nim
  proc `[]=`*(t: var Table, key: string, val: string) =
    ## puts a (key, value)-pair into `t`. The semantics of string require
    ## a copy here:
    let idx = findInsertionPosition(key)
    t[idx].key = key
    t[idx].val = val

  proc `[]=`*(t: var Table, key: string{call}, val: string{call}) =
    ## puts a (key, value)-pair into `t`. Optimized version that knows that
    ## the strings are unique and thus don't need to be copied:
    let idx = findInsertionPosition(key)
    shallowCopy t[idx].key, key
    shallowCopy t[idx].val, val

  var t: Table
  # overloading resolution ensures that the optimized []= is called here:
  t[f()] = g()