<P>
<A NAME="santa"></A>
<P><CENTER><IMG SRC="../ss-pics/santa.jpg" ALT="figure: santa"></CENTER>
<HTML>
<HEAD>
<TITLE>Simply Scheme: Introducing Computer Science ch 17: Lists</TITLE>
</HEAD>
<BODY>
<HR>
<CITE>Simply Scheme:</CITE>
<CITE>Introducing Computer Science</CITE> 2/e Copyright (C) 1999 MIT
<H2>Chapter 17</H2>
<H1>Lists</H1>
<TABLE width="100%"><TR><TD>
<IMG SRC="../simply.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><CITE><A HREF="http://ccrma.stanford.edu/~matt">Matthew
Wright</A><BR>University of California, Santa Barbara</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/ssch17.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../ss-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="part5.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../ssch18/trees.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="http://mitpress.mit.edu/0262082810">MIT
Press web page for <CITE>Simply Scheme</CITE></A>
</TABLE></TABLE>
<HR>
<P>Suppose we're using Scheme to model an ice cream shop. We'll certainly need
to know all the flavors that are available:
<P><PRE>(vanilla ginger strawberry lychee raspberry mocha)
</PRE>
<P>For example, here's a procedure that models the behavior of
the salesperson when you place an order:
<P><PRE>(define (<A NAME="g1"></A>order flavor)
(if (member? flavor
'(vanilla ginger strawberry lychee raspberry mocha))
'(coming right up!)
(se '(sorry we have no) flavor)))
</PRE>
<P>But what happens if we want to sell a flavor like "root beer fudge ripple"
or "ultra chocolate"? We can't just put those words into a sentence of
flavors, or our program will think that each word is a separate flavor.
Beer ice cream doesn't sound very appealing.
<P>What we need is a way to express a collection of items, each of which is
itself a collection, like this:
<P><PRE>(vanilla (ultra chocolate) (heath bar crunch) ginger (cherry garcia))
</PRE>
<P>This is meant to represent five flavors, two of which are named by
single words, and the other three of which are named by sentences.
<P>Luckily for us, Scheme provides exactly this capability. The data structure
we're using in this example is called a <EM>list.</EM> The difference
between a sentence and a list is that the elements of a sentence must be
words, whereas the elements of a list can be anything at all: words, <CODE>#t</CODE>, procedures, or other lists. (A list that's an element of another list
is called a <EM>sublist.</EM> We'll use the name <EM>structured</EM>
<A NAME="g2"></A>
<A NAME="g3"></A>
list for a list that includes sublists.)
<P>Another way to think about the difference between sentences and lists is that
the definition of "list" is self-referential, because a list can include
lists as elements. The definition of "sentence" is not self-referential,
because the elements of a sentence must be words. We'll see that the
self-referential nature of recursive procedures is vitally important in
coping with lists.
<P>Another example in which lists could be helpful is the pattern matcher. We
used sentences to hold <CODE>known-values</CODE> databases, such as this one:
<P><PRE>(FRONT YOUR MOTHER ! BACK SHOULD KNOW !)
</PRE>
<P>This would be both easier for you to read and easier for programs
to manipulate if we used list structure to indicate the grouping instead of
exclamation points:
<P><PRE>((FRONT (YOUR MOTHER)) (BACK (SHOULD KNOW)))
</PRE>
<P>We remarked when we introduced sentences that they're a feature we added to
Scheme just for the sake of this book. Lists, by contrast, are at the core
of what Lisp has been about from its beginning. (In fact the name "Lisp"
stands for "LISt Processing.")
<P><H2>Selectors and Constructors</H2>
<P>When we introduced words and sentences we had to provide ways to take them
apart, such as <CODE>first</CODE>, and ways to put them together, such as <CODE>sentence</CODE>. Now we'll tell you about the selectors and
constructors for lists.
<P>
<P>The function to select the first element of a list is called
<A NAME="g4"></A><CODE>car</CODE>.<A NAME="text1" HREF="lists.html#ft1">[1]</A> The function to select the
portion of a list containing all but the first element is called
<A NAME="g5"></A><CODE>cdr</CODE>, which is pronounced "could-er." These are analogous to <CODE>first</CODE> and <CODE>butfirst</CODE> for words and sentences.
<P>Of course, we can't extract pieces of a list that's empty, so we need a
predicate that will check for an empty list. It's called <A NAME="g6"></A><CODE>null?</CODE> and it
returns <CODE>#t</CODE> for the empty list, <CODE>#f</CODE> for anything else. This is
the list equivalent of <CODE>empty?</CODE> for words and sentences.
<P>There are two constructors for lists. The function <A NAME="g7"></A><CODE>list</CODE> takes
any number of arguments and returns a list with those arguments as its
elements.
<P><PRE>> (list (+ 2 3) 'squash (= 2 2) (list 4 5) remainder 'zucchini)
(5 SQUASH #T (4 5) #<PROCEDURE> ZUCCHINI)
</PRE>
<P>The other constructor, <A NAME="g8"></A><CODE>cons</CODE>, is used when you already have
a list and you want to add one new element. <CODE>Cons</CODE> takes two arguments,
an element and a list (in that order), and returns a new list whose <CODE>car</CODE> is the first argument and whose <CODE>cdr</CODE> is the second.
<P><PRE>> (cons 'for '(no one))
(FOR NO ONE)
> (cons 'julia '())
(JULIA)
</PRE>
<P>There is also a function that combines the elements of two or more lists
into a larger list:
<P><PRE>> (append '(get back) '(the word))
(GET BACK THE WORD)
</PRE>
<P>It's important that you understand how <CODE>list</CODE>, <CODE>cons</CODE>,
and <A NAME="g9"></A><CODE>append</CODE> differ from each other:
<P><PRE>> (list '(i am) '(the walrus))
((I AM) (THE WALRUS))
> (cons '(i am) '(the walrus))
((I AM) THE WALRUS)
> (append '(i am) '(the walrus))
(I AM THE WALRUS)
</PRE>
<P>When <CODE>list</CODE> is invoked with two arguments, it considers them to be two
proposed elements for a new two-element list. <CODE>List</CODE> doesn't care
whether the arguments are themselves lists, words, or anything else; it just
creates a new list whose elements are the arguments. In this case, it ends
up with a list of two lists.
<P>
<CODE>Cons</CODE> requires that its second argument be a list.<A NAME="text2" HREF="lists.html#ft2">[2]</A> <CODE>Cons</CODE> will extend that list to form a new list, one element
longer than the original; the first element of the resulting list comes from
the first argument to <CODE>cons</CODE>. In other words, when you pass <CODE>cons</CODE>
two arguments, you get back a list whose <CODE>car</CODE> is the first argument to
<CODE>cons</CODE> and whose <CODE>cdr</CODE> is the second argument.
<P>Thus, in this example, the three elements of the returned list consist
of the first argument as one single element, followed by <EM>the elements
of</EM> the second argument (in this case, two words). (You may be wondering
why anyone would want to use such a strange constructor instead of <CODE>list</CODE>. The answer has to do with recursive procedures, but hang on for a few
paragraphs and we'll show you an example, which will help more than any
explanation we could give in English.)
<P>Finally, <CODE>append</CODE> of two arguments uses the elements of <EM>both</EM>
arguments as elements of its return value.
<P>Pictorially, <CODE>list</CODE> creates a list whose elements are the arguments:
<P><CENTER><IMG SRC="../ss-pics/list.jpg" ALT="figure: list"></CENTER>
<P><CODE>Cons</CODE> creates an extension of its second argument with
one new element:
<P><CENTER><IMG SRC="../ss-pics/cons.jpg" ALT="figure: cons"></CENTER>
<P><CODE>Append</CODE> creates a list whose elements are the <EM>elements
of</EM> the arguments, which must be lists:
<P><CENTER><IMG SRC="../ss-pics/append.jpg" ALT="figure: append"></CENTER>
<P><H2>Programming with Lists</H2>
<P><A NAME="praise"></A>
<PRE>(define (<A NAME="g10"></A>praise flavors)
(if (null? flavors)
'()
(cons (se (car flavors) '(is delicious))
(praise (cdr flavors)))))
> (praise '(ginger (ultra chocolate) lychee (rum raisin)))
((GINGER IS DELICIOUS) (ULTRA CHOCOLATE IS DELICIOUS)
(LYCHEE IS DELICIOUS) (RUM RAISIN IS DELICIOUS))
</PRE>
<P>In this example our result is a <EM>list of sentences.</EM> That is,
the result is a list that includes smaller lists as elements, but each of
these smaller lists is a sentence, in which only words are allowed. That's
why we used the constructor <CODE>cons</CODE> for the overall list, but <CODE>se</CODE>
for each sentence within the list.
<P>This is the example worth a thousand words that we promised, to show why <CODE>cons</CODE> is useful. <CODE>List</CODE> wouldn't work in this situation. You can
use <CODE>list</CODE> only when you know exactly how many elements will be in your
complete list. Here, we are writing a procedure that works for any number of
elements, so we recursively build up the list, one element at a time.
<P>In the following example we take advantage of structured lists to produce a
translation dictionary. The entire dictionary is a list; each element of
the dictionary, a single translation, is a two-element list; and in some
cases a translation may involve a phrase rather than a single word, so we
can get three deep in lists.
<P><PRE>(define (<A NAME="g11"></A>translate wd)
(lookup wd '((window fenetre) (book livre) (computer ordinateur)
(house maison) (closed ferme) (pate pate) (liver foie)
(faith foi) (weekend (fin de semaine))
((practical joke) attrape) (pal copain))))
(define (<A NAME="g12"></A>lookup wd dictionary)
(cond ((null? dictionary) '(parlez-vous anglais?))
((equal? wd (car (car dictionary)))
(car (cdr (car dictionary))))
(else (lookup wd (cdr dictionary)))))
> (translate 'computer)
ORDINATEUR
> (translate '(practical joke))
ATTRAPE
> (translate 'recursion)
(PARLEZ-VOUS ANGLAIS?)
</PRE>
<P>By the way, this example will help us explain why those ridiculous names
<CODE>car</CODE> and <CODE>cdr</CODE> haven't died out. In this not-so-hard program we
find ourselves saying
<P><PRE>(car (cdr (car dictionary)))
</PRE>
<P>to refer to the French part of the first translation in the
dictionary. Let's go through that slowly. <CODE>(Car dictionary)</CODE> gives us
the first element of the dictionary, one English-French pairing. <CODE>Cdr</CODE>
of that first element is a one-element list, that is, all but the English word
that's the first element of the pairing. What we want isn't the one-element
list but rather its only element, the French word, which is its <CODE>car</CODE>.
<P>This <CODE>car</CODE> of <CODE>cdr</CODE> of <CODE>car</CODE> business is pretty lengthy and
<A NAME="g13"></A>
awkward. But Scheme gives us a way to say it succinctly:
<P><PRE>(cadar dictionary)
</PRE>
<P>In general, we're allowed to use names like <CODE>cddadr</CODE> up to
four deep in <CODE>A</CODE>s and <CODE>D</CODE>s. That one means
<A NAME="cadr"></A>
<P><PRE>(cdr (cdr (car (cdr something))))
</PRE>
<P>or in other words, take the <CODE>cdr</CODE> of the <CODE>cdr</CODE> of the <CODE>car</CODE> of the <CODE>cdr</CODE> of its argument. Notice that the order of letters
<CODE>A</CODE> and <CODE>D</CODE> follows the order in which you'd write the procedure
names, but (as always) the procedure that's invoked first is the one on
the right. Don't make the mistake of reading <CODE>cadr</CODE> as meaning
"first take the <CODE>car</CODE> and then take the <CODE>cdr</CODE>." It means "take
the <CODE>car</CODE> of the <CODE>cdr</CODE>."
<P>The most commonly used of these abbreviations are <A NAME="g14"></A><CODE>cadr</CODE>, which selects
the second element of a list; <CODE>caddr</CODE>, which selects the third element;
and <CODE>cadddr</CODE>, which selects the fourth.
<P><H2>The Truth about Sentences</H2>
<P>You've probably noticed that it's hard to distinguish between a sentence
(which <EM>must</EM> be made up of words) and a list that <EM>happens</EM> to
have words as its elements.
<P>The fact is, sentences <EM>are</EM> lists. You could take <CODE>car</CODE> of a
sentence, for example, and it'd work fine. Sentences are an
<A NAME="g15"></A><A NAME="g16"></A>abstract data type represented by lists. We created the sentence
<A NAME="g17"></A>
ADT by writing special selectors and constructors that provide a
different way of using the same underlying machinery—a different
interface, a different metaphor, a different point of view.
<P>How does our sentence point of view differ from the built-in Scheme point of
view using lists? There are three differences:
<P><P><TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">A sentence can contain only words, not sublists.
</TABLE><TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Sentence selectors are symmetrical front-to-back.
</TABLE><TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Sentences and words have the same selectors.
</TABLE><P>
All of these differences fit a common theme: Words and sentences
are meant to represent English text. The three differences reflect three
characteristics of English text: First, text is made of sequences of words,
not complicated structures with sublists. Second, in manipulating text (for
example, finding the plural of a noun) we need to look at the end of a word
or sentence as often as at the beginning. Third, since words and sentences
work together so closely, it makes sense to use the same tools with both. By
contrast, from Scheme's ordinary point of view, an English sentence is just
one particular case of a much more general data structure, whereas a
symbol<A NAME="text3" HREF="lists.html#ft3">[3]</A> is something entirely
different.
<P>The constructors and selectors for sentences reflect these three
differences. For example, it so happens that Scheme represents lists in a
way that makes it easy to find the first element, but harder to find the
last one. That's reflected in the fact that there are no primitive
selectors for lists equivalent to <CODE>last</CODE> and <CODE>butlast</CODE> for
sentences. But we want <CODE>last</CODE> and <CODE>butlast</CODE> to be a part of the
sentence package, so we have to write them in terms of the "real" Scheme
list selectors. (In the versions presented here, we are ignoring the issue
of applying the selectors to words.)
<P><PRE>(define (<A NAME="g18"></A>first sent) ;;; just for sentences
(car sent))
(define (<A NAME="g19"></A>last sent)
(if (null? (cdr sent))
(car sent)
(last (cdr sent))))
(define (<A NAME="g20"></A>butfirst sent)
(cdr sent))
(define (<A NAME="g21"></A>butlast sent)
(if (null? (cdr sent))
'()
(cons (car sent) (butlast (cdr sent)))))
</PRE>
<P>If you look "behind the curtain" at the implementation, <CODE>last</CODE> is a lot more complicated than <CODE>first</CODE>. But from the point of
view of a sentence user, they're equally simple.
<P>In Chapter 16 we used the pattern matcher's known-values database to
introduce the idea of abstract data types. In that example, the most
important contribution of the ADT was to isolate the details of the
implementation, so that the higher-level procedures could invoke <CODE>lookup</CODE> and <CODE>add</CODE> without the clutter of looking for exclamation
points. We did hint, though, that the ADT represents a shift in how the
programmer thinks about the sentences that are used to represent databases;
we don't take the acronym of a database, even though the database <EM>is</EM>
a sentence and so it would be possible to apply the <CODE>acronym</CODE> procedure
to it. Now, in thinking about sentences, this idea of shift in viewpoint is
more central. Although sentences are represented as lists, they behave much
like words, which are represented quite differently.<A NAME="text4" HREF="lists.html#ft4">[4]</A> Our sentence mechanism highlights the <EM>uses</EM> of
sentences, rather than the implementation.
<P><H2>Higher-Order Functions</H2>
<P>The <A NAME="g22"></A><A NAME="g23"></A>higher-order functions that we've used until now work only for
words and sentences. But the <EM>idea</EM> of higher-order functions applies
perfectly well to structured lists. The official list versions of <CODE>every</CODE>, <CODE>keep</CODE>, and <CODE>accumulate</CODE> are called <CODE>map</CODE>, <CODE>filter</CODE>,
and <CODE>reduce</CODE>.
<P><CODE>Map</CODE> takes two arguments, a function and a list, and returns a list
<A NAME="g24"></A>
containing the result of applying the function to each element of the list.
<A NAME="map"></A>
<P><PRE>> (map square '(9 8 7 6))
(81 64 49 36)
> (map (lambda (x) (se x x)) '(rocky raccoon))
((ROCKY ROCKY) (RACCOON RACCOON))
> (every (lambda (x) (se x x)) '(rocky raccoon))
(ROCKY ROCKY RACCOON RACCOON)
> (map car '((john lennon) (paul mccartney)
(george harrison) (ringo starr)))
(JOHN PAUL GEORGE RINGO)
> (map even? '(9 8 7 6))
(#F #T #F #T)
> (map (lambda (x) (word x x)) 'rain)
ERROR - INVALID ARGUMENT TO MAP: RAIN
</PRE>
<P>The word "map" may seem strange for this function, but it comes
from the mathematical study of functions, in which they talk about a <EM>mapping</EM> of the domain into the range. In this terminology, one talks
about "mapping a function over a set" (a set of argument values, that is),
and Lispians have taken over the same vocabulary, except that we talk about
mapping over lists instead of mapping over sets. In any case, <CODE>map</CODE> is
a genuine Scheme primitive, so it's the official grownup way to talk about
an <CODE>every</CODE>-like higher-order function, and you'd better learn to like it.
<P><CODE>Filter</CODE> also takes a function and a list as arguments; it returns a
<A NAME="filter"></A>
<A NAME="g25"></A>
list containing only those elements of the argument list for which the
function returns a true value. This is the same as <CODE>keep</CODE>, except that
the elements of the argument list may be sublists, and their structure is
preserved in the result.
<P><PRE>> (filter (lambda (flavor) (member? 'swirl flavor))
'((rum raisin) (root beer swirl) (rocky road) (fudge swirl)))
((ROOT BEER SWIRL) (FUDGE SWIRL))
> (filter word? '((ultra chocolate) ginger lychee (raspberry sherbet)))
(GINGER LYCHEE)
</PRE>
<P><PRE>> (filter (lambda (nums) (= (car nums) (cadr nums)))
'((2 3) (4 4) (5 6) (7 8) (9 9)))
((4 4) (9 9))
</PRE>
<P><CODE>Filter</CODE> probably makes sense to you as a name; the metaphor
of the air filter that allows air through but doesn't allow dirt, and so on,
evokes something that passes some data and blocks other data. The only
problem with the name is that it doesn't tell you whether the elements for
which the predicate function returns <CODE>#t</CODE> are filtered in or filtered
out. But you're already used to <CODE>keep</CODE>, and <CODE>filter</CODE> works
the same way. <CODE>Filter</CODE> is not a standard Scheme primitive, but it's a
universal convention; everyone defines it the same way we do.
<P><CODE>Reduce</CODE> is just like <CODE>accumulate</CODE> except that it works only on
<A NAME="reduce"></A>
<A NAME="g26"></A>
lists, not on words. Neither is a built-in Scheme primitive; both names are
seen in the literature. (The name "reduce" is official in the languages
APL and Common Lisp, which do include this higher-order function as a primitive.)
<P><PRE>> (reduce * '(4 5 6))
120
> (reduce (lambda (list1 list2) (list (+ (car list1) (car list2))
(+ (cadr list1) (cadr list2))))
'((1 2) (30 40) (500 600)))
(531 642)
</PRE>
<P><H2>Other Primitives for Lists</H2>
<P>The <A NAME="g27"></A><CODE>list?</CODE> predicate returns <CODE>#t</CODE> if its argument is a list, <CODE>#f</CODE> otherwise.
<P>The predicate <CODE>equal?</CODE>, which we've discussed earlier as applied to
words and sentences, also works for structured lists.
<P>The predicate <CODE>member?</CODE>, which we used in one of the
examples above, isn't a true Scheme primitive, but part of the word and
sentence package. (You can tell because it "takes apart" a word to look
at its letters separately, something that Scheme doesn't ordinarily do.)
Scheme does have a <A NAME="g28"></A><CODE>member</CODE> primitive without the question mark that's
like <CODE>member?</CODE> except for two differences: Its second argument must be
a list (but can be a structured list); and instead of returning <CODE>#t</CODE> it
returns the portion of the argument list starting with the element equal to
the first argument. This will be clearer with an example:
<P>
<PRE>> (member 'd '(a b c d e f g))
(D E F G)
> (member 'h '(a b c d e f g))
#F
</PRE>
<P>This is the main example in Scheme of the semipredicate
idea that we mentioned earlier in passing. It doesn't have a question mark
in its name because it returns values other than <CODE>#t</CODE> and <CODE>#f</CODE>,
but it works as a predicate because any non-<CODE>#f</CODE> value is considered
true.
<P>The only word-and-sentence functions that we haven't already mentioned are
<CODE>item</CODE> and <CODE>count</CODE>. The list equivalent of <CODE>item</CODE> is called
<CODE><A NAME="g29"></A><CODE>list-ref</CODE></CODE> (short for "reference"); it's different in that it
counts items from zero instead of from one and takes its arguments in the
other order:
<P><PRE>> (list-ref '(happiness is a warm gun) 3)
WARM
</PRE>
<P>The list equivalent of <CODE>count</CODE> is called <A NAME="g30"></A><CODE>length</CODE>, and
it's exactly the same except that it doesn't work on words.
<P>
<P><H2>Association Lists</H2>
<P><A NAME="g31"></A>
<A NAME="g32"></A>
<A NAME="g33"></A>
<P>An example earlier in this chapter was about translating from English to
French. This involved searching for an entry in a list by comparing the
first element of each entry with the information we were looking for. A
list of names and corresponding values is called an <EM>association
list,</EM> or an <EM>a-list.</EM> The Scheme primitive <CODE>assoc</CODE> looks up a
name in an a-list:
<P><PRE>> (assoc 'george
'((john lennon) (paul mccartney)
(george harrison) (ringo starr)))
(GEORGE HARRISON)
> (assoc 'x '((i 1) (v 5) (x 10) (l 50) (c 100) (d 500) (m 1000)))
(X 10)
> (assoc 'ringo '((mick jagger) (keith richards) (brian jones)
(charlie watts) (bill wyman)))
#F
</PRE>
<P>
<PRE>(define dictionary
'((window fenetre) (book livre) (computer ordinateur)
(house maison) (closed ferme) (pate pate) (liver foie)
(faith foi) (weekend (fin de semaine))
((practical joke) attrape) (pal copain)))
(define (<A NAME="g34"></A>translate wd)
(let ((record (assoc wd dictionary)))
(if record
(cadr record)
'(parlez-vous anglais?))))
</PRE>
<P><CODE>Assoc</CODE> returns <CODE>#f</CODE> if it can't find the entry you're
looking for in your association list. Our <CODE>translate</CODE> procedure
checks for that possibility before using <CODE>cadr</CODE> to extract the French
translation, which is the second element of an entry.
<P><H2>Functions That Take Variable Numbers of Arguments</H2>
<P><A NAME="g35"></A>
<A NAME="g36"></A>
<P>In the beginning of this book we told you about some Scheme procedures that
can take any number of arguments, but you haven't yet learned how to write
such procedures for yourself, because Scheme's mechanism for writing these
procedures requires the use of lists.
<P>Here's a procedure that takes one or more numbers as arguments and returns
true if these numbers are in increasing order:
<P><PRE>(define (<A NAME="g37"></A>increasing? number . rest-of-numbers)
(cond ((null? rest-of-numbers) #t)
((> (car rest-of-numbers) number)
(apply increasing? rest-of-numbers))
(else #f)))
> (increasing? 4 12 82)
#T
> (increasing? 12 4 82 107)
#F
</PRE>
<P>The first novelty to notice in this program is the dot in the first line.
In listing the formal parameters of a procedure, you can use a dot just
before the last parameter to mean that that parameter (<CODE>rest-of-numbers</CODE>
in this case) represents any number of arguments, including zero. The value
that will be associated with this parameter when the procedure is invoked
will be a list whose elements are the actual argument values.
<P>In this example, you must invoke <CODE>increasing?</CODE> with at least one
argument; that argument will be associated with the parameter <CODE>number</CODE>.
If there are no more arguments, <CODE>rest-of-numbers</CODE> will be the empty
list. But if there are more arguments, <CODE>rest-of-numbers</CODE> will be a list
of their values. (In fact, these two cases are the same: <CODE>Rest-of-numbers</CODE> will be a list of all the remaining arguments, and if there
are no such arguments, <CODE>rest-of-numbers</CODE> is a list with no elements.)
<P>The other novelty in this example is the procedure <A NAME="g38"></A><CODE>apply</CODE>. It takes
two arguments, a procedure and a list. <CODE>Apply</CODE> invokes the given
procedure with the elements of the given list as its arguments, and returns
whatever value the procedure returns. Therefore, the following two
expressions are equivalent:
<P><PRE>(+ 3 4 5)
(apply + '(3 4 5))
</PRE>
<P>We use <CODE>apply</CODE> in <CODE>increasing?</CODE> because we don't know how
many arguments we'll need in its recursive invocation. We can't just say
<P><PRE>(increasing? rest-of-numbers)
</PRE>
<P>because that would give <CODE>increasing?</CODE> a list as its single
argument, and it doesn't take lists as arguments—it takes numbers. We
want <EM>the numbers in the list</EM> to be the arguments.
<P>We've used the name <CODE>rest-of-numbers</CODE> as the formal parameter to suggest
"the rest of the arguments," but that's not just an idea we made up. A
parameter that follows a dot and therefore represents a variable number of
arguments is called a <EM><A NAME="g39"></A><A NAME="g40"></A>rest parameter.</EM>
<P>Here's a table showing the values of <CODE>number</CODE> and <CODE>rest-of-numbers</CODE>
in the recursive invocations of <CODE>increasing?</CODE> for the example
<P><PRE>(increasing? 3 5 8 20 6 43 72)
number rest-of-numbers
3 (5 8 20 6 43 72)
5 (8 20 6 43 72)
8 (20 6 43 72)
20 (6 43 72) (returns false at this point)
</PRE>
<P>In the <CODE>increasing?</CODE> example we've used one formal parameter
before the dot, but you may use any number of such parameters, including zero.
The number of formal parameters before the dot determines the <EM>minimum</EM> number of arguments that must be used when your procedure is
invoked. There can be only one formal parameter <EM>after</EM> the dot.
<P><H2>Recursion on Arbitrary Structured Lists</H2>
<P>Let's pretend we've stored this entire book in a gigantic Scheme list
structure. It's a list of chapters. Each chapter is a list of sections.
Each section is a list of paragraphs. Each paragraph is a list of
sentences, which are themselves lists of words.
<P>Now we want to know how many times the word "mathematicians" appears in the
book. We could do it the incredibly boring way:
<P><PRE>(define (appearances-in-book wd book)
(reduce + (map (lambda (chapter) (appearances-in-chapter wd chapter))
book)))
(define (appearances-in-chapter wd chapter)
(reduce + (map (lambda (section) (appearances-in-section wd section))
chapter)))
(define (appearances-in-section wd section)
(reduce + (map (lambda (paragraph)
(appearances-in-paragraph wd paragraph))
section)))
(define (appearances-in-paragraph wd paragraph)
(reduce + (map (lambda (sent) (appearances-in-sentence wd sent))
paragraph)))
(define (appearances-in-sentence given-word sent)
(length (filter (lambda (sent-word) (equal? sent-word given-word))
sent)))
</PRE>
<P>but that <EM>would</EM> be incredibly boring.
<P>What we're going to do is similar to the reasoning we used in developing the
idea of recursion in Chapter 11. There, we wrote a family of
procedures named <CODE>downup1</CODE>, <CODE>downup2</CODE>, and so on; we then noticed
that most of these procedures looked almost identical, and "collapsed"
them into a single recursive procedure. In the same spirit, notice that all
the <CODE>appearances-in-</CODE> procedures are very similar. We can make them
even more similar by rewriting the last one:
<P><PRE>(define (appearances-in-sentence wd sent)
(reduce + (map (lambda (wd2) (appearances-in-word wd wd2))
sent)))
(define (appearances-in-word wd wd2)
(if (equal? wd wd2) 1 0))
</PRE>
<P>Now, just as before, we want to write a single procedure
that combines all of these.
<P>What's the base case? Books, chapters, sections, paragraphs, and sentences
are all lists of smaller units. It's only when we get down to individual
words that we have to do something different:
<P><PRE>(define (deep-appearances wd structure)
(if (word? structure)
(if (equal? structure wd) 1 0)
(reduce +
(map (lambda (sublist) (deep-appearances wd sublist))
structure))))
> (deep-appearances
'the
'(((the man) in ((the) moon)) ate (the) potstickers))
3
> (deep-appearances 'n '(lambda (n) (if (= n 0) 1 (* n (f (- n 1))))))
4
> (deep-appearances 'mathematicians the-book-structure)
7
</PRE>
<P>This is quite different from the recursive situations we've seen
before. What looks like a recursive call from <CODE>deep-appearances</CODE> to
itself is actually inside an anonymous procedure that will be called
<EM>repeatedly</EM> by <CODE>map</CODE>. <CODE>Deep-appearances</CODE> doesn't just call
itself once in the recursive case; it uses <CODE>map</CODE> to call itself for each
element of <CODE>structure</CODE>. Each of those calls returns a number; <CODE>map</CODE>
returns a list of those numbers. What we want is the sum of those numbers,
and that's what <CODE>reduce</CODE> will give us.
<P>This explains why <CODE>deep-appearances</CODE> must accept words as well as lists
as the <CODE>structure</CODE> argument. Consider a case like
<P><PRE>(deep-appearances 'foo '((a) b))
</PRE>
<P>Since <CODE>structure</CODE> has two elements, <CODE>map</CODE> will call
<CODE>deep-appearances</CODE> twice. One of these calls uses the list <CODE>(a)</CODE> as
the second argument, but the other call uses the word <CODE>b</CODE> as the second
argument.
<P>Of course, if <CODE>structure</CODE> is a word, we can't make recursive calls for
its elements; that's why words are the base case for this recursion. What
should <CODE>deep-appearances</CODE> return for a word? If it's the word we're
looking for, that counts as one appearance. If not, it counts as no
appearances.
<P>You're accustomed to seeing the empty list as the base case in a recursive
list processing procedure. Also, you're accustomed to thinking of the base
case as the end of a <EM>complete</EM> problem; you've gone through all of
the elements of a list, and there are no more elements to find. In most
problems, there is only one recursive invocation that turns out to be a base
case. But in using <CODE>deep-appearances</CODE>, there are <EM>many</EM>
invocations for base cases—one for every word in the list structure.
Reaching a base case doesn't mean that we've reached the end of the entire
structure! You might want to trace a short example to help you understand
the sequence of events.
<P>Although there's no official name for a structure made of lists of lists of
… of lists, there <EM>is</EM> a common convention for naming
procedures that deal with these structures; that's why we've called this
procedure <CODE>deep-appearances</CODE>. The word "deep" indicates that this
procedure is just like a procedure to look for the number of appearances of
a word in a list, except that it looks "all the way down" into the
sub-sub-⋅⋅⋅-sublists instead of just looking at the elements of the top-level
list.
<P>This version of <CODE>deep-appearances</CODE>, in which higher-order procedures are
used to deal with the sublists of a list, is a common programming
style. But for some problems, there's another way to organize the same
basic program without higher-order procedures. This other organization
leads to very compact, but rather tricky, programs. It's also a widely used
style, so we want you to be able to recognize it.
<P>Here's the idea. We deal with the base case—words—just as before. But
for lists we do what we often do in trying to simplify a list problem: We
divide the list into its first element (its <CODE>car</CODE>) and all the rest of
its elements (its <CODE>cdr</CODE>). But in this case, the resulting program is a
little tricky. Ordinarily, a recursive program for lists makes a recursive
call for the <CODE>cdr</CODE>, which is a list of the same kind as the whole
argument, but does something non-recursive for the <CODE>car</CODE>, which is just
one element of that list. This time, the <CODE>car</CODE> of the kind of structured
list-of-lists we're exploring may itself be a list-of-lists! So we make a
recursive call for it, as well:
<P><PRE>(define (<A NAME="g41"></A>deep-appearances wd structure)
(cond ((equal? wd structure) 1) ; base case: desired word
((word? structure) 0) ; base case: other word
((null? structure) 0) ; base case: empty list
(else (+ (deep-appearances wd (car structure))
(deep-appearances wd (cdr structure))))))
</PRE>
<P>This procedure has two different kinds of base case. The first
two <CODE>cond</CODE> clauses are similar to the base case in the previous version
of <CODE>deep-appearances</CODE>; they deal with a "structure" consisting of a
single word. If the structure is the word we're looking for, then the word
appears once in it. If the structure is some other word, then the word
appears zero times. The third clause is more like the base case of an
ordinary list recursion; it deals with an empty list, in which case the word
appears zero times in it. (This still may not be the end of the entire
structure used as the argument to the top-level invocation, but may instead
be merely the end of a sublist within that structure.)
<P>If we reach the <CODE>else</CODE> clause, then the structure is neither a word
nor an empty list. It must, therefore, be a non-empty list, with a <CODE>car</CODE>
and a <CODE>cdr</CODE>. The number of appearances in the entire structure
of the word we're looking for is equal to the number of appearances in the
<CODE>car</CODE> plus the number in the <CODE>cdr</CODE>.
<P>In <CODE>deep-appearances</CODE> the desired result is a single number. What if we
want to build a new list-of-lists structure? Having used <CODE>car</CODE> and <CODE>cdr</CODE> to disassemble a structure, we can use <CODE>cons</CODE> to build a new one.
For example, we'll translate our entire book into Pig Latin:
<P><PRE>(define (<A NAME="g42"></A>deep-pigl structure)
(cond ((word? structure) (pigl structure))
((null? structure) '())
(else (cons (deep-pigl (car structure))
(deep-pigl (cdr structure))))))
> (deep-pigl '((this is (a structure of (words)) with)
(a (peculiar) shape)))
((ISTHAY ISAY (AAY UCTURESTRAY OFAY (ORDSWAY)) ITHWAY)
(AAY (ECULIARPAY) APESHAY))
</PRE>
<P>Compare <CODE>deep-pigl</CODE> with an <CODE>every</CODE>-pattern list recursion
such as <CODE>praise</CODE> on page <A HREF="lists.html#praise">there</A>. Both look like
<P><PRE>(cons (<EM>something</EM> (car argument)) (<EM>something</EM> (cdr argument)))
</PRE>
<P>And yet these procedures are profoundly different. <CODE>Praise</CODE>
is a simple left-to-right walk through the elements of a sequence;
<CODE>deep-pigl</CODE> dives in and out of sublists. The difference is a result
of the fact that <CODE>praise</CODE> does one recursive call, for the <CODE>cdr</CODE>,
while <CODE>deep-pigl</CODE> does two, for the <CODE>car</CODE> as well as the <CODE>cdr</CODE>.
The pattern exhibited by <CODE>deep-pigl</CODE> is called <CODE>car</CODE>-<CODE>cdr</CODE>
recursion. (Another name for it is "tree recursion," for a reason we'll
see in the next chapter.)
<P><H2>Pitfalls</H2>
<P>Just as we mentioned about the names <CODE>word</CODE> and <CODE>sentence</CODE>,
resist the temptation to use <CODE>list</CODE> as a formal parameter. We use
<CODE>lst</CODE> instead, but other alternatives are capital <CODE>L</CODE> or <CODE>seq</CODE>
(for "sequence").
<P>The list constructor <CODE>cons</CODE> does not treat its two arguments
equivalently. The second one must be the list you're trying to extend.
There is no equally easy way to extend a list on the right (although you can
put the new element into a one-element list and use <CODE>append</CODE>). If you
get the arguments backward, you're likely to get funny-looking results that
aren't lists, such as
<P><PRE>((3 . 2) . 1)
</PRE>
<P>The result you get when you <CODE>cons</CODE> onto something that isn't a
list is called a <EM>pair.</EM> It's sometimes called a "dotted pair"
because of what it looks like when printed:
<P><PRE>> (cons 'a 'b)
(A . B)
</PRE>
<P>It's just the printed representation that's dotted, however; the
dot isn't part of the pair any more than the parentheses around a list are
elements of the list. Lists are made of pairs; that's why <CODE>cons</CODE> can
construct lists. But we're not going to talk about any pairs that <EM>aren't</EM> part of lists, so you don't have to think about them at all,
except to know that if dots appear in your results you're <CODE>cons</CODE>ing
backward.
<P>Don't get confused between lists and sentences. Sentences have no
internal structure; the good aspect of this is that it's hard to make
mistakes about building the structure, but the bad aspect is that you might
need such a structure. You can have lists whose elements are sentences, but
it's confusing if you think of the same structure sometimes as a list and
sometimes as a sentence.
<P>In reading someone else's program, it's easy not to notice that a
procedure is making two recursive calls instead of just one. If you notice
only the recursive call for the <CODE>cdr</CODE>, you might think you're looking at
a sequential recursion.
<P>If you're writing a procedure whose argument is a list-of-lists, it may
feel funny to let it also accept a word as the argument value. People
therefore sometimes insist on a list as the argument, leading to an overly
complicated base case. If your base case test says
<P><PRE>(word? (car structure))
</PRE>
<P>then think about whether you'd have a better-organized program
if the base case were
<P><PRE>(word? structure)
</PRE>
<P>Remember that in a deep-structure recursion you may need two base
cases, one for reaching an element that isn't a sublist, and the other for
an empty list, with no elements at all. (Our <CODE>deep-appearances</CODE>
procedure is an example.) Don't forget the empty-list case.
<P><H2>Boring Exercises</H2>
<P><B>17.1</B> What will Scheme print in response to each of the following expressions?
Try to figure it out in your head before you try it on the computer.
<P><PRE>> (car '(Rod Chris Colin Hugh Paul))
> (cadr '(Rod Chris Colin Hugh Paul))
> (cdr '(Rod Chris Colin Hugh Paul))
> (car 'Rod)
> (cons '(Rod Argent) '(Chris White))
> (append '(Rod Argent) '(Chris White))
> (list '(Rod Argent) '(Chris White))
> (caadr '((Rod Argent) (Chris White)
(Colin Blunstone) (Hugh Grundy) (Paul Atkinson)))
> (assoc 'Colin '((Rod Argent) (Chris White)
(Colin Blunstone) (Hugh Grundy) (Paul Atkinson)))
> (assoc 'Argent '((Rod Argent) (Chris White)
(Colin Blunstone) (Hugh Grundy) (Paul Atkinson)))
</PRE>
<P><B>17.2</B> For each of the following examples, write a procedure of two arguments
that, when applied to the sample arguments, returns the sample result.
Your procedures may not include any quoted data.
<P><PRE>> (f1 '(a b c) '(d e f))
((B C D))
> (f2 '(a b c) '(d e f))
((B C) E)
> (f3 '(a b c) '(d e f))
(A B C A B C)
> (f4 '(a b c) '(d e f))
((A D) (B C E F))
</PRE>
<P>
<B>17.3</B> Describe the value returned by this invocation of <CODE>map</CODE>:
<P><PRE>> (map (lambda (x) (lambda (y) (+ x y))) '(1 2 3 4))
</PRE>
<P>
<H2>Real Exercises</H2>
<P><B>17.4</B> Describe the result of calling the following procedure with a list as its
argument. (See if you can figure it out before you try it.)
<P><PRE>(define (<A NAME="g43"></A>mystery lst)
(mystery-helper lst '()))
(define (mystery-helper lst other)
(if (null? lst)
other
(mystery-helper (cdr lst) (cons (car lst) other))))
</PRE>
<P>
<B>17.5</B> Here's a procedure that takes two numbers as arguments and returns
whichever number is larger:
<P><PRE>(define (<A NAME="g44"></A>max2 a b)
(if (> b a) b a))
</PRE>
<P>Use <CODE>max2</CODE> to implement <CODE>max</CODE>, a procedure that takes
one or more numeric arguments and returns the largest of them.
<P>
<B>17.6</B> Implement <CODE>append</CODE> using <CODE>car</CODE>, <CODE>cdr</CODE>, and <CODE>cons</CODE>.
(Note: The built-in <CODE>append</CODE> can take any number of arguments.
First write a version that accepts only two arguments. Then,
optionally, try to write a version that takes any number.)
<P>
<B>17.7</B> <CODE>Append</CODE> may remind you of <CODE>sentence</CODE>. They're similar, except that
<CODE>append</CODE> works only with lists as arguments, whereas <CODE>sentence</CODE> will
accept words as well as lists. Implement <CODE><A NAME="g45"></A>sentence</CODE> using <CODE>append</CODE>. (Note: The built-in <CODE>sentence</CODE> can take any number of
arguments. First write a version that accepts only two
arguments. Then, optionally, try to write a version that takes any
number. Also, you don't have to worry about the error checking that the
real <CODE>sentence</CODE> does.)
<P>
<B>17.8</B> Write <CODE>member</CODE>.
<P>
<B>17.9</B> Write <CODE>list-ref</CODE>.
<P>
<B>17.10</B> Write <CODE>length</CODE>.
<P>
<B>17.11</B> Write <CODE><A NAME="g46"></A>before-in-list?</CODE>, which takes a list and two elements of
the list. It should return <CODE>#t</CODE> if the second argument appears in the
list argument before the third argument:
<P><PRE>> (before-in-list? '(back in the ussr) 'in 'ussr)
#T
> (before-in-list? '(back in the ussr) 'the 'back)
#F
</PRE>
<P>The procedure should also return <CODE>#f</CODE> if either of the supposed elements
doesn't appear at all.
<P>
<B>17.12</B> Write a procedure called <CODE><A NAME="g47"></A>flatten</CODE> that takes as its argument a
list, possibly including sublists, but whose ultimate building blocks are
words (not Booleans or procedures). It should return a sentence containing
all the words of the list, in the order in which they appear in the original:
<P><PRE>> (flatten '(((a b) c (d e)) (f g) ((((h))) (i j) k)))
(A B C D E F G H I J K)
</PRE>
<P>
<B>17.13</B> Here is a procedure that counts the number of words anywhere within a
structured list:
<P><PRE>(define (deep-count lst)
(cond ((null? lst) 0)
((word? (car lst)) (+ 1 (deep-count (cdr lst))))
(else (+ (deep-count (car lst))
(deep-count (cdr lst))))))
</PRE>
<P>Although this procedure works, it's more complicated than
necessary. Simplify it.
<P>
<B>17.14</B> Write a procedure <CODE><A NAME="g48"></A>branch</CODE> that takes as arguments a list of
numbers and a nested list structure. It should be the list-of-lists equivalent
of <CODE>item</CODE>, like this:
<P><PRE>> (branch '(3) '((a b) (c d) (e f) (g h)))
(E F)
> (branch '(3 2) '((a b) (c d) (e f) (g h)))
F
> (branch '(2 3 1 2) '((a b) ((c d) (e f) ((g h) (i j)) k) (l m)))
H
</PRE>
<P>In the last example above, the second element of the list is
<P><PRE>((C D) (E F) ((G H) (I J)) K)
</PRE>
<P>The third element of that smaller
list is <CODE>((G H) (I J))</CODE>; the first element of that is <CODE>(G H)</CODE>; and
the second element of <EM>that</EM> is just <CODE>H</CODE>.
<P>
<P>
<B>17.15</B> Modify the pattern matcher to represent the <CODE>known-values</CODE> database as a
list of two-element lists, as we suggested at the beginning of this chapter.
<P>
<B>17.16</B> Write a predicate <CODE><A NAME="g49"></A>valid-infix?</CODE> that takes a list as argument
and returns <CODE>#t</CODE> if and only if the list is a legitimate infix
arithmetic expression (alternating operands and operators, with
parentheses—that is, sublists—allowed for grouping).
<P><PRE>> (valid-infix? '(4 + 3 * (5 - 2)))
#T
> (valid-infix? '(4 + 3 * (5 2)))
#F
</PRE>
<P>
<HR>
<A NAME="ft1" HREF="lists.html#text1">[1]</A> Don't even try to figure out a sensible reason for
this name. It's a leftover bit of history from the first computer on which
Lisp was implemented. It stands for "contents of address register" (at
least that's what all the books say, although it's really the address <EM>portion</EM> of the accumulator register). <CODE>Cdr</CODE>, coming up in the next
sentence, stands for "contents of decrement register." The names seem
silly in the Lisp context, but that's because the Lisp people used these
register components in ways the computer designers didn't intend. Anyway,
this is all very interesting to history buffs but irrelevant to our
purposes. We're just showing off that one of us is actually old enough to
remember these antique computers first-hand.<P>
<A NAME="ft2" HREF="lists.html#text2">[2]</A> This is
not the whole story. See the "pitfalls" section for a slightly expanded
version.<P>
<A NAME="ft3" HREF="lists.html#text3">[3]</A> As we said in Chapter 5, "symbol" is the official name
for words that are neither strings nor numbers.<P>
<A NAME="ft4" HREF="lists.html#text4">[4]</A> We implemented
words by combining three data types that are primitive in Scheme: strings,
symbols, and numbers.<P>
<P><A HREF="../ss-toc2.html">(back to Table of Contents)</A><P>
<A HREF="part5.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../ssch18/trees.html"><STRONG>NEXT</STRONG></A>
<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>,
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>