\input bkmacs
\photo{Trombone players produce different pitches partly
by varying the length of a tube.}{\pagetag{\trombone}\pspicture{4in}{trombone}{trombone}{}}
\chapter{Variables}
\chaptag{\variables}

A {\it \idx{variable}\/} is a connection between a name and a
\justidx{naming a value} value.\footnt{The term ``variable'' is used by
computer scientists to mean several subtly different things.  For example,
some people use ``variable'' to mean just a holder for a value, without a
name.  But what we said is what {\it we\/} mean by ``variable.''} That
sounds simple enough, but some complexities arise in practice.  To avoid
confusion later, we'll spend some time now looking at the idea of
``variable'' in more detail.

The name {\it variable\/} comes from algebra.  Many people are introduced to
variables in high school algebra classes, where the emphasis is on solving
equations.  ``If $x^{\kern0.5pt 3}-8=0$, what is the value of $x$?'' In
problems like these, although we call $x$ a variable, it's really a {\it
\bkidx{named}{constant}!\/} In this particular problem, $x$ has the
value~2.  In any such problem, at first we don't know the value of $x$, but
we understand that it does have some particular value, and that value isn't
going to change in the middle of the problem.

In functional programming, what we mean by ``variable'' is like a named
constant in mathematics.  Since a variable is the connection between a name
and a value, a formal parameter in a procedure definition isn't a variable;
it's just a name.  But when we invoke the procedure with a particular
argument, that name is associated with a value, and a variable is created.
If we invoke the procedure again, a {\it new\/} variable is created, perhaps
with a different value.
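
For instance, here is a small sketch of our own (the procedure {\tt double}
is made up for this illustration; it isn't defined elsewhere in the book).
Each invocation creates its own variable named {\tt x}:

{\prgex%
(define (double x)
  (+ x x))

(double 3)       ;; creates a variable named x with the value 3; returns 6
(double 10)      ;; creates a new variable named x with the value 10; returns 20
}

\noindent The second invocation doesn't change anything about the first one;
each has its own {\tt x}.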

There are two possible sources of confusion about this.  One is that you may
have programmed before in a programming language like BASIC or Pascal, in
which a variable often {\it does\/} get a new value, even after it's already
had a previous value assigned to it.  Programs in those languages tend to be
full of things like ``{\tt X~=~X~+~1}.'' Back in Chapter \functions\ we told
you that this book is about something called
``\bkidx{functional}{programming},'' but we haven't yet explained exactly
what that means.  (Of course we {\it have\/} introduced a lot of functions,
and that is an important part of it.)  Part of what we mean by functional
programming is that once a variable exists, we aren't going to {\it
change\/} the value of that variable.  

The other possible source of confusion is that in Scheme, unlike the
situation in algebra, we may have more than one variable with the same name
at the same time.  That's because we may invoke one procedure, and the body
of that procedure may invoke another procedure, and each of them might use the
same formal parameter name.  There might be one variable named {\tt x} with
the value 7, and another variable named {\tt x} with the value 51, at the
same time.  The pitfall to avoid is thinking ``{\tt x} has changed its value
from 7 to 51.''

As an analogy, imagine that you are at a party along with Mick Jagger, Mick
Wilson, Mick Avory, and Mick Dolenz.  If you're having a conversation with
one of them, the name ``Mick'' means a particular person to you.  If you
notice someone else talking with a different Mick, you wouldn't think ``Mick
has become a different person.'' Instead, you'd think ``there are several
people here all with the name Mick.''

\subhd{How Little People Do Variables}

\justidx{little people} You can understand variables in terms of the
little-people model.  A variable, in this model, is the association in the
little person's mind between a formal parameter (name) and the actual
argument (value) she was given.  When we want to know {\tt (square~5)}, we
hire Srini and tell him his argument is 5.  Srini therefore substitutes 5
for {\tt x} in the body of {\tt square}.  Later, when we want to know the
square of 6, we hire Samantha and tell her that her argument is 6.  Srini and
Samantha have two different variables, both named {\tt x}.

\pagetag{\srini}
% \picture{2.3in}{Srini and Samantha}
\pspicture{2.3in}{srini}{srini}{\TrimBoundingBox{8pt}}

Srini and Samantha do their work separately, one after the other.  But
in a more complicated example, there could even be more than
one variable named {\tt x} at the same time:

{\prgex%
(define (square x) (* x x))

(define (\ufun{hypotenuse} x y)
  (sqrt (+ (square x) (square y))))

> (hypotenuse 3 4)
5
}

\noindent Consider the situation when we've hired Hortense to evaluate that
expression.  Hortense associates the name {\tt x} with the value 3 (and also
the name {\tt y} with the value 4, but we're going to pay attention to {\tt
x}).  She has to compute two {\tt square}s.  She hires Solomon to compute
{\tt (square~3)}.  Solomon associates the name {\tt x} with the value 3.
This happens to be the same as Hortense's value, but it's still a separate
variable that could have had a different value---as we see when Hortense
hires Sheba to compute {\tt (square~4)}.  Now, simultaneously, Hortense
thinks {\tt x} is 3 and Sheba thinks {\tt x} is 4.

\pagetag{\sheba}
% \picture{2.3in}{Hortense and Sheba}
\pspicture{2.3in}{hortense}{hortense}{\TrimBoundingBox{8pt}}

(Remember that we said a variable is a connection between a name and a
value.  So {\tt x} isn't a variable!  The association of the name {\tt x}
with the value 5 is a variable.  The reason we're being so fussy about this
terminology is that it helps clarify the case in which several variables
have the same name.  But in practice people are generally sloppy about this
fine point; we can usually get away with saying ``{\tt x} is a variable''
when we mean ``there is some variable whose name is {\tt x}.'')

Another important point about the way little people do variables is that
they can't read each other's minds.  In particular, they don't know about
the values of the local variables that belong to the little people who
hired them.  For example, the following attempt to compute the value 10
won't work:

{\prgex%
(define (f x)
  (g 6))

(define (g y)
  (+ x y))

> (f 4)
ERROR -- VARIABLE X IS UNBOUND.
}

\noindent We hire Franz to compute {\tt (f 4)}.  He associates {\tt x} with
4 and evaluates {\tt (g~6)} by hiring Gloria.  Gloria associates {\tt y}
with 6, but she doesn't have any value for {\tt x}, so she's in trouble.
The solution is for Franz to tell Gloria that {\tt x} is {\tt 4}:

{\prgex%
(define (f x)
  (g x 6))

(define (g x y)
  (+ x y))

> (f 4)
10
}

\subhd{Global and Local Variables}

Until now, we've been using two very different kinds of naming.  We have
names for procedures, which are created permanently by {\tt define} and are
usable throughout our programs; and we have names for procedure arguments,
which are associated with values temporarily when we call a
procedure and are usable only inside that procedure.

These two kinds of naming seem to be different in every way.  One is for
procedures, one for data; the one for procedures makes a permanent, global
name, while the one for data makes a temporary, local name.  That picture
does reflect the way that procedures and other data are {\it usually\/} used,
but we'll see that really there is only one kind of naming.  The
boundaries can be crossed:  Procedures can be arguments to other
procedures, and any kind of data
can have a permanent, global name.  Right now we'll look at that last
point, about global variables.

Just as we've been using {\tt define} to associate names with procedures
globally, we can also use it for other kinds of data:

{\prgex%
> (define pi 3.141592654)

> (+ pi 5)
8.141592654

> (define song '(I am the walrus))

> (last song)
WALRUS
}

Once defined, a global variable can be used anywhere, just as a defined
procedure can be used anywhere.  (In fact, defining a procedure creates a
variable whose value is the procedure.  Just as {\tt pi} is the name of a
variable whose value is 3.141592654, {\tt last} is the name of a variable
whose value is a primitive procedure.  We'll come back to this
point in Chapter~\lambchop.)  When the name of a global variable
appears in an expression, the corresponding value must be substituted, just
as actual argument values are substituted for formal parameters.
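
As a rough sketch of both ideas (the names {\tt dozen}, {\tt dozens}, and
{\tt plus} are invented for this example), a global variable can be used
inside a procedure body, and {\tt define} can give a second global name to
an existing procedure:

{\prgex%
(define dozen 12)              ;; a global variable whose value is a number

(define (dozens n)             ;; n is local; dozen is the global variable
  (* n dozen))

(define plus +)                ;; a global variable whose value is a procedure

(dozens 3)                     ;; returns 36
(plus 2 3)                     ;; returns 5
}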

When a little person is hired to carry out a compound procedure, his or her
first step is to substitute actual argument values for formal parameters in
the body.  The same little person substitutes values for global variable
names also.  (What if there is a global variable whose name happens to be
used as a formal parameter in this procedure?  Scheme's rule is that the
formal parameter takes precedence, but even though Scheme knows what to do,
conflicts like this make your program harder to read.)
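
Here is a small illustration of that rule, under the assumption that nothing
else in your program uses the made-up names {\tt rate}, {\tt cost}, and
{\tt cost-at}:

{\prgex%
(define rate 5)                  ;; a global variable

(define (cost hours)             ;; here rate means the global variable
  (* rate hours))

(define (cost-at rate hours)     ;; here rate is a formal parameter
  (* rate hours))

(cost 3)                         ;; returns 15
(cost-at 2 3)                    ;; returns 6, not 15
}

\noindent Inside {\tt cost-at}, the local variable named {\tt rate} is the
one that counts; the global {\tt rate} still has the value 5 afterward.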

How does this little person know what values to substitute for global
variable names?  What makes a variable ``global'' in the little-people model
is that {\it every\/} little person knows its value.  You can imagine that
there's a big chalkboard, with all the global definitions written on it, that
all the little people can see.
\justidx{chalkboard model}
\justidx{model, chalkboard}
If you prefer, you could imagine that whenever a global variable is defined,
the {\tt define} specialist climbs up a huge ladder, picks up a megaphone,
and yells something like ``Now hear this!  {\tt Pi} is 3.141592654!''

The association of a formal parameter (a name) with an actual argument (a
value) is called a {\it \bkidx{local}{variable}.}

It's awkward to have to say ``Harry associates the value 7 with the name
{\tt foo}'' all the time.  Most of the time we just say ``{\tt foo} has the
value 7,'' paying no attention to whether this association is in some
particular little person's head or if everybody knows it.

\subhd{The Truth about Substitution}

We said earlier in a footnote that Scheme doesn't actually do all the
copying and substituting we've been talking about.  What actually happens is
\justidx{substitution}
more like our model of global variables, in which there is a chalkboard
somewhere that associates names with values---except that instead of making
a new copy of every expression with values substituted for names, Scheme
works with the original expression and looks up the value for each
name at the moment when that value is needed.  To make local variables work,
there are several chalkboards:\ a global one and one for each little person.

The fully detailed model of variables using several chalkboards is what many
people find hardest about learning Scheme.  That's why we've chosen to use
the simpler \idx{substitution model}.\footnt{The reason that all of our
examples work with the substitution model is that this book uses only
functional programming, in the sense that we never change the value of a
variable.  If we started doing the {\tt X~=~X~+~1} style of programming, we
would need the more complicated \idx{chalkboard model}.}

\subhd{\ttpmb{Let}}

We're going to write a procedure that solves quadratic equations.  (We know
this is the prototypical boring programming problem, but it illustrates
clearly the point we're about to make.)

We'll use the \idx{quadratic formula} that you learned in high school
algebra class:

$$ ax^2+bx+c=0 \quad {\rm when} \quad x = {-b \pm \sqrt{b^2-4ac} \over 2a} $$

{\prgex%
(define (roots a b c)
  (se (/ (+ (- b) (sqrt (- (* b b) (* 4 a c))))
         (* 2 a))
      (/ (- (- b) (sqrt (- (* b b) (* 4 a c))))
         (* 2 a))))
}

Since there are two possible solutions, we return a sentence containing two
numbers.  This procedure works fine,\footnt{That is, it works if the equation
has real roots, or if your version of Scheme has complex numbers.  Also, the
limited precision with which computers can represent irrational numbers can
make this particular algorithm give wrong answers in practice even though
it's correct in theory.} but it does have the disadvantage of repeating a
lot of the work.  It computes the square root part of the formula twice.
We'd like to avoid that inefficiency.

One thing we can do is to compute the square root and use that as the
actual argument to a helper procedure that does the rest of the job:

{\prgex%
(define (roots a b c)
  (roots1 a b c (sqrt (- (* b b) (* 4 a c)))))

(define (roots1 a b c discriminant)
  (se (/ (+ (- b) discriminant) (* 2 a))
      (/ (- (- b) discriminant) (* 2 a))))
}

\noindent This version evaluates the square root only once.  The resulting
value is used as the argument named {\tt discriminant} in {\tt roots1}.

We've solved the problem we posed for ourselves initially:\ avoiding the
redundant computation of the square-root part of the formula, which we've
been calling the discriminant.  (Strictly speaking, the discriminant is the
quantity $b^2-4ac$ under the square root sign, but the shorter name is
convenient here.)  The cost, though, is that we had to define an auxiliary
procedure {\tt roots1} that doesn't make much sense on its own.  (That is,
you'd never invoke {\tt roots1} for its own sake; only {\tt roots} uses it.)

Scheme provides a notation to express a computation of this kind more
\pagetag{\splet}
conveniently.  It's called \ttidx{let}:

{\prgex%
(define (roots a b c)
  (let ((discriminant (sqrt (- (* b b) (* 4 a c)))))
    (se (/ (+ (- b) discriminant) (* 2 a))
        (/ (- (- b) discriminant) (* 2 a)))))
}

\noindent Our new program is just an abbreviation for the previous version:
In effect, it creates a temporary procedure just like {\tt roots1}, but
without a name, and invokes it with the specified argument value.  But the
{\tt let} notation rearranges things so that we can say, in the right order,
``let the variable {\tt discriminant} have the value {\tt (sqrt\ellipsis)}\
and, using that variable, compute the body.''

{\tt Let} is a \bkidx{special}{form} that takes two arguments.  The first is
a sequence of name-value pairs enclosed in parentheses.  (In this example,
there is only one name-value pair.)  The second argument, the {\it body\/}
of the {\tt let}, is the expression to evaluate.
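
To see the same pattern on a smaller scale, here is a sketch of our own (the
names {\tt square-of-sum}, {\tt square-of-sum-long}, and
{\tt square-of-sum-helper} are hypothetical).  The first definition uses
{\tt let}; the other two spell out the helper-procedure version that it
abbreviates:

{\prgex%
(define (square-of-sum a b)
  (let ((sum (+ a b)))             ;; one name-value pair
    (* sum sum)))                  ;; the body

;; the same computation with a named helper procedure
(define (square-of-sum-helper sum)
  (* sum sum))

(define (square-of-sum-long a b)
  (square-of-sum-helper (+ a b)))

(square-of-sum 3 4)                ;; returns 49
(square-of-sum-long 3 4)           ;; returns 49
}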

{\advance\medskipamount by -3pt

Now that we have this notation, we can use it with more than one name-value
connection to eliminate even more redundant computation:

{\prgex%
(define (\ufun{roots} a b c)
  (let ((discriminant (sqrt (- (* b b) (* 4 a c))))
        (minus-b (- b))
        (two-a (* 2 a)))
    (se (/ (+ minus-b discriminant) two-a)
        (/ (- minus-b discriminant) two-a))))
}

\noindent In this example, the first argument to {\tt let} includes three
name-value pairs.  It's as if we'd defined and invoked a procedure like
the following:

{\prgex%
(define (roots1 discriminant minus-b two-a) ...)
}

Like {\tt cond}, {\tt let} uses parentheses both with the usual meaning
(invoking a procedure) and to group sub-arguments that belong together.  This
\justidx{parentheses, for {\ttfont let} variables}
grouping happens in two ways.  Parentheses are used to group a name and the
expression that provides its value.  Also, an additional pair of parentheses
surrounds the entire collection of name-value pairs.


\subhd{Pitfalls}

\pit If you've programmed before in other languages, you may be accustomed
to a style of programming in which you {\it change\/} the value of a
variable by assigning it a new value.  You may be tempted to write

{\prgex%
> (define x (+ x 3))                         ;; no-no
}

\noindent Although some versions of Scheme do allow such redefinitions, so
that you can correct errors in your procedures, they're not strictly legal.
A definition is meant to be {\it permanent\/} in functional programming.
(Scheme does include other mechanisms for non-functional programming, but
we're not studying them in this book because once you allow reassignment you
need a more complex model of the evaluation process.)

\pit When you create more than one temporary variable at once using {\tt
let}, all of the expressions that provide the values are computed before any
of the variables are created.  Therefore, you can't have one expression
depend on another:

{\prgex%
> (let ((a (+ 4 7))                          ;; wrong!
        (b (* a 5)))
    (+ a b))
}

\noindent Don't think that {\tt a} gets the value 11 and therefore {\tt b}
gets the value 55.  That {\tt let} expression is equivalent to defining a
helper procedure

{\prgex%
(define (helper a b)
  (+ a b))
}

\noindent and then invoking it:

{\prgex%
(helper (+ 4 7) (* a 5))
}

\noindent The argument expressions, as always, are evaluated {\it before\/}
the function is invoked.  The expression {\tt (*~a~5)} will be evaluated
using the {\it global\/} value of {\tt a}, if there is one.  If not, an
error will result.  If you want to use {\tt a} in computing {\tt b}, you
must say

{\prgex%
> (let ((a (+ 4 7)))
    (let ((b (* a 5)))
      (+ a b)))
66
}
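
To make the global case concrete, here is a sketch that assumes you've
deliberately defined a global variable named {\tt a} (the value 100 is
arbitrary, chosen just for this illustration).  The single {\tt let} then
runs without an error, but probably not with the result you intended:

{\prgex%
(define a 100)                 ;; a global variable named a

(let ((a (+ 4 7))              ;; the local a will be 11
      (b (* a 5)))             ;; this a is the global one, so b is 500
  (+ a b))                     ;; returns 511, not 66
}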

\pit {\tt Let}'s notation is tricky because, like {\tt cond}, it uses
parentheses that don't mean procedure invocation.  Don't teach yourself magic
formulas like ``two open parentheses before the {\tt let} variable and three
close parentheses at the end of its value.'' Instead, think about the
overall structure:

{\prgex%
(let {\rm{}variables} {\rm{}body})
}

\noindent {\tt Let} takes exactly two arguments.  The first argument to {\tt
let} is one or more name-value groupings, all in parentheses:

{\prgex%
((name1 value1) (name2 value2) (name3 value3) \ellipsis)
}

\noindent Each {\tt name} is a single word; each {\tt value} can be any
expression, usually a procedure invocation.  If it's a procedure invocation,
then parentheses are used with their usual meaning.

The second argument to {\tt let} is the expression to be evaluated using
those variables.

Now put all the pieces together:

{\prgex%
(let \pmbig{((}name1 (fn1 arg1)\pmbig{)}
     \pmbig{ (}name2 (fn2 arg2)\pmbig{)}
     \pmbig{ (}name3 (fn3 arg3)\pmbig{))}
  body)
}

} % medskipamount

\esubhd{Boring Exercises}

{\exercise
The following procedure does some redundant computation.

{\prgex%
(define (\ufun{gertrude} wd)
  (se (if (vowel? (first wd)) 'an 'a)
      wd
      'is
      (if (vowel? (first wd)) 'an 'a)
      wd
      'is
      (if (vowel? (first wd)) 'an 'a)
      wd))

> (gertrude 'rose)
(A ROSE IS A ROSE IS A ROSE)

> (gertrude 'iguana)
(AN IGUANA IS AN IGUANA IS AN IGUANA)
}

\noindent Use {\tt let} to avoid the redundant work.
}

\solution
Here are two possible solutions:

{\prgex%
(define (gertrude wd)
  (let ((article (if (vowel? (first wd)) 'an 'a)))
    (se article wd 'is article wd 'is article wd)))

(define (gertrude wd)
  (let ((phrase (se (if (vowel? (first wd)) 'an 'a) wd)))
    (se phrase 'is phrase 'is phrase)))
}
@

{\exercise
Put in the missing parentheses:

{\prgex%
> (let pi 3.14159
       pie 'lemon meringue
    se 'pi is pi 'but pie is pie)
(PI IS 3.14159 BUT PIE IS LEMON MERINGUE)
}}

\solution
{\prgex%
(let ((pi 3.14159)
      (pie '(lemon meringue)))
  (se '(pi is) pi '(but pie is) pie))
}
@

\esubhd{Real Exercises}

{\exercise
The following program doesn't work.  Why not?  Fix it.

{\prgex%
(define (\ufun{superlative} adjective word)
  (se (word adjective 'est) word))
}

\noindent It's supposed to work like this:

{\prgex%
> (superlative 'dumb 'exercise)
(DUMBEST EXERCISE)
}}

\solution
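The body of {\tt superlative} tries to invoke the {\tt word} function, but
since {\tt word} is also a formal parameter of {\tt superlative}, inside the
body that name refers to the argument value (here, the word {\tt exercise})
rather than to the {\tt word} function.  Renaming the formal parameter fixes
the problem: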

{\prgex%
(define (superlative adjective wd)
  (se (word adjective 'est) wd))
}
@

{\exercise
What does this procedure do?  Explain how it manages to work.

{\prgex%
(define (\ufun{sum-square} a b)
  (let ((+ *)
        (* +))
    (* (+ a a) (+ b b))))
}}

\solution
This function temporarily makes {\tt +} stand for the multiplication function
and {\tt *} stand for the addition function, and then evaluates the
expression {\tt (*~(+~a~a)~(+~b~b))}.  Since {\tt +} is the multiplication
function, {\tt (+~a~a)} and {\tt (+~b~b)} compute the squares of {\tt a} and
{\tt b}, and since {\tt *} is addition, we add the squares together.

This procedure does on purpose what the one in the previous
exercise does by accident.
@

\bye