Simply Scheme: Introducing Computer Science, Chapter 17: Lists

Suppose we're using Scheme to model an ice cream shop. We'll certainly need to know all the flavors that are available:

For example, here's a procedure that models the behavior of the salesperson when you place an order:
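
The flavor list and the procedure are not reproduced above. What follows is a minimal sketch, assuming a six-flavor menu chosen purely for illustration. Since member? and se belong to the word-and-sentence package rather than to standard Scheme, small list-only stand-ins are defined here so that the sketch runs on its own.

```scheme
;; Stand-ins for the package's member? and se, covering only what this
;; example needs (the real versions also handle words specially).
(define (member? x lst) (if (member x lst) #t #f))
(define (se . args)
  (apply append (map (lambda (a) (if (list? a) a (list a))) args)))

;; Illustrative menu; the particular flavors are an assumption.
(define flavors '(vanilla ginger strawberry lychee raspberry mocha))

(define (order flavor)
  (if (member? flavor flavors)
      '(coming right up!)
      (se '(sorry we have no) flavor)))

(order 'mocha)      ; => (coming right up!)
(order 'pistachio)  ; => (sorry we have no pistachio)
```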

But what happens if we want to sell a flavor like "root beer fudge ripple" or "ultra chocolate"? We can't just put those words into a sentence of flavors, or our program will think that each word is a separate flavor. Beer ice cream doesn't sound very appealing.

What we need is a way to express a collection of items, each of which is itself a collection, like this:
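
As a sketch of such a collection (the multi-word flavor names other than those quoted above are illustrative assumptions), a quoted list can hold single words and sublists side by side:

```scheme
;; Five flavors: two single words and three multi-word names kept as sublists.
(define flavors
  '(vanilla (root beer fudge ripple) (ultra chocolate) ginger (mint chip)))

(length flavors)  ; => 5, not 10: each sublist counts as one element
```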

This is meant to represent five flavors, two of which are named by single words, and the other three of which are named by sentences.

Luckily for us, Scheme provides exactly this capability. The data structure we're using in this example is called a list. The difference between a sentence and a list is that the elements of a sentence must be words, whereas the elements of a list can be anything at all: words, #t, procedures, or other lists. (A list that's an element of another list is called a sublist. We'll use the name structured list for a list that includes sublists.)

Another way to think about the difference between sentences and lists is that the definition of "list" is self-referential, because a list can include lists as elements. The definition of "sentence" is not self-referential, because the elements of a sentence must be words. We'll see that the self-referential nature of recursive procedures is vitally important in coping with lists.

Another example in which lists could be helpful is the pattern matcher. We used sentences to hold known-values databases, such as this one:

This would be both easier for you to read and easier for programs to manipulate if we used list structure to indicate the grouping instead of exclamation points:
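
The two databases are not reproduced above. As an illustrative sketch (the names front and back and their values are assumed for the example), the flat form ends each entry with an exclamation point, while the list form groups each name with its value:

```scheme
;; Flat sentence form: a name, its value words, then ! to end the entry.
(define flat-db '(front your mother ! back should know !))

;; Structured form: each entry is a two-element list (name value).
(define list-db '((front (your mother)) (back (should know))))

(length flat-db)  ; => 8
(length list-db)  ; => 2
(car list-db)     ; => (front (your mother))
```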

We remarked when we introduced sentences that they're a feature we added to Scheme just for the sake of this book. Lists, by contrast, are at the core of what Lisp has been about from its beginning. (In fact the name "Lisp" stands for "LISt Processing.")

Selectors and Constructors

When we introduced words and sentences we had to provide ways to take them apart, such as first, and ways to put them together, such as sentence. Now we'll tell you about the selectors and constructors for lists.

The function to select the first element of a list is called car.[1] The function to select the portion of a list containing all but the first element is called cdr, which is pronounced "could-er." These are analogous to first and butfirst for words and sentences.

Of course, we can't extract pieces of a list that's empty, so we need a predicate that will check for an empty list. It's called null? and it returns #t for the empty list, #f for anything else. This is the list equivalent of empty? for words and sentences.
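
A short sketch of the selectors and the empty-list predicate, using an illustrative flavor list:

```scheme
(define flavors '(vanilla (ultra chocolate) ginger))

(car flavors)            ; => vanilla
(cdr flavors)            ; => ((ultra chocolate) ginger)
(car (cdr flavors))      ; => (ultra chocolate)
(null? flavors)          ; => #f
(null? '())              ; => #t
(null? (cdr '(ginger)))  ; => #t, the cdr of a one-element list is empty
```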

There are two constructors for lists. The function list takes any number of arguments and returns a list with those arguments as its elements.
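
For instance, with illustrative arguments:

```scheme
(list 'vanilla 'ginger)            ; => (vanilla ginger)
(list (+ 2 3) 'squash (= 2 2))     ; => (5 squash #t)
(list '(ultra chocolate) 'ginger)  ; => ((ultra chocolate) ginger)
(list)                             ; => ()
```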

The other constructor, cons, is used when you already have a list and you want to add one new element. Cons takes two arguments, an element and a list (in that order), and returns a new list whose car is the first argument and whose cdr is the second.
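
A few illustrative calls, including a check that car and cdr recover the two arguments:

```scheme
(cons 'for '(no one))                ; => (for no one)
(cons '(ultra chocolate) '(ginger))  ; => ((ultra chocolate) ginger)
(cons 'vanilla '())                  ; => (vanilla)

(define extended (cons 'for '(no one)))
(car extended)  ; => for
(cdr extended)  ; => (no one)
```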

There is also a function that combines the elements of two or more lists into a larger list:
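
That function is append. A brief sketch with illustrative arguments:

```scheme
(append '(get back) '(the word))  ; => (get back the word)
(append '(a) '(b c) '(d e f))     ; => (a b c d e f)
```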

It's important that you understand how list, cons, and append differ from each other:
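
One way to see the difference is to give all three the same pair of arguments (the particular words are only an illustration):

```scheme
(list   '(i am) '(the walrus))  ; => ((i am) (the walrus))  two elements, both lists
(cons   '(i am) '(the walrus))  ; => ((i am) the walrus)    three elements
(append '(i am) '(the walrus))  ; => (i am the walrus)      four elements
```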

When list is invoked with two arguments, it considers them to be two proposed elements for a new two-element list. List doesn't care whether the arguments are themselves lists, words, or anything else; it just creates a new list whose elements are the arguments. In this case, it ends up with a list of two lists.

Cons requires that its second argument be a list.[2] Cons will extend that list to form a new list, one element longer than the original; the first element of the resulting list comes from the first argument to cons. In other words, when you pass cons two arguments, you get back a list whose car is the first argument to cons and whose cdr is the second argument.

Thus, in this example, the three elements of the returned list consist of the first argument as one single element, followed by the elements of the second argument (in this case, two words). (You may be wondering why anyone would want to use such a strange constructor instead of list. The answer has to do with recursive procedures, but hang on for a few paragraphs and we'll show you an example, which will help more than any explanation we could give in English.)

Finally, append of two arguments uses the elements of both arguments as elements of its return value.

Append creates a list whose elements are the elements of the arguments, which must be lists:
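
A small sketch with illustrative data shows that append flattens only one level: sublists inside the arguments stay intact as elements of the result.

```scheme
(append '(a (b c)) '((d) e))           ; => (a (b c) (d) e)
(length (append '(a (b c)) '((d) e)))  ; => 4
(append '(vanilla) '())                ; => (vanilla)
```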

Programming with Lists
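
The example this section discusses is not reproduced here. A sketch consistent with the description that follows, assuming a hypothetical procedure named praise and a list-only stand-in for se, could look like this:

```scheme
;; Stand-in for the package's se, enough for this example: combines words
;; and lists of words into one flat list.
(define (se . args)
  (apply append (map (lambda (a) (if (list? a) a (list a))) args)))

;; Build the result one element at a time: cons a new sentence onto the
;; list produced by the recursive call.
(define (praise flavors)
  (if (null? flavors)
      '()
      (cons (se (car flavors) '(is delicious))
            (praise (cdr flavors)))))

(praise '(ginger (ultra chocolate) lychee))
; => ((ginger is delicious) (ultra chocolate is delicious) (lychee is delicious))
```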

In this example our result is a list of sentences. That is, the result is a list that includes smaller lists as elements, but each of these smaller lists is a sentence, in which only words are allowed. That's why we used the constructor cons for the overall list, but se for each sentence within the list.

This is the example worth a thousand words that we promised, to show why cons is useful. List wouldn't work in this situation. You can use list only when you know exactly how many elements will be in your complete list. Here, we are writing a procedure that works for any number of elements, so we recursively build up the list, one element at a time.

In the following example we take advantage of structured lists to produce a translation dictionary. The entire dictionary is a list; each element of the dictionary, a single translation, is a two-element list; and in some cases a translation may involve a phrase rather than a single word, so we can get three deep in lists.
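
The program itself is not reproduced here. A minimal sketch, assuming a small illustrative dictionary and hypothetical names french-dictionary, lookup, and translate, might be:

```scheme
;; Each entry is (english french); either part may itself be a list
;; when a phrase is needed.
(define french-dictionary
  '((window fenetre) (book livre) (house maison)
    (weekend (fin de semaine)) ((practical joke) attrape)))

;; Walk down the dictionary, comparing the English part of each entry.
(define (lookup wd dictionary)
  (cond ((null? dictionary) '(parlez-vous anglais?))
        ((equal? wd (car (car dictionary)))
         (car (cdr (car dictionary))))
        (else (lookup wd (cdr dictionary)))))

(define (translate wd) (lookup wd french-dictionary))

(translate 'book)              ; => livre
(translate 'weekend)           ; => (fin de semaine)
(translate '(practical joke))  ; => attrape
(translate 'cat)               ; => (parlez-vous anglais?)
```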

By the way, this example will help us explain why those ridiculous names car and cdr haven't died out. In this not-so-hard program we find ourselves saying (car (cdr (car dictionary))) to refer to the French part of the first translation in the dictionary. Let's go through that slowly. (Car dictionary) gives us the first element of the dictionary, one English-French pairing. Cdr of that first element is a one-element list, that is, all but the English word that's the first element of the pairing. What we want isn't the one-element list but rather its only element, the French word, which is its car.

This car of cdr of car business is pretty lengthy and awkward. But Scheme gives us a way to say it succinctly:
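
The abbreviation for car of cdr of car is cadar. A quick sketch with an illustrative two-entry dictionary shows that the long and short forms agree (in some R7RS implementations these four-letter names may need the cxr library to be imported):

```scheme
(define dictionary '((window fenetre) (book livre)))

(car (cdr (car dictionary)))  ; => fenetre
(cadar dictionary)            ; => fenetre
```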

In general, we're allowed to use names like cddadr up to four deep in As and Ds. That one means (cdr (cdr (car (cdr something)))) or in other words, take the cdr of the cdr of the car of the cdr of its argument. Notice that the order of letters A and D follows the order in which you'd write the procedure names, but (as always) the procedure that's invoked first is the one on the right. Don't make the mistake of reading cadr as meaning "first take the car and then take the cdr." It means "take the car of the cdr."

The most commonly used of these abbreviations are cadr, which selects the second element of a list; caddr, which selects the third element; and cadddr, which selects the fourth.
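
A brief sketch of these abbreviations on illustrative data:

```scheme
(define names '(john paul george ringo))

(cadr names)    ; => paul    (car of the cdr)
(caddr names)   ; => george  (car of the cdr of the cdr)
(cadddr names)  ; => ringo

;; cddadr reads right to left: cdr, then car, then cdr, then cdr.
(cddadr '(a (b c d e) f))  ; => (d e)
```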

The Truth about Sentences

You've probably noticed that it's hard to distinguish between a sentence (which must be made up of words) and a list that happens to have words as its elements.

The fact is, sentences are lists. You could take car of a sentence, for example, and it'd work fine. Sentences are an abstract data type represented by lists. We created the sentence ADT by writing special selectors and constructors that provide a different way of using the same underlying machinery—a different interface, a different metaphor, a different point of view.

How does our sentence point of view differ from the built-in Scheme point of view using lists? There are three differences:

• A sentence can contain only words, not sublists.

• Sentence selectors are symmetrical front-to-back.

• Sentences and words have the same selectors.

All of these differences fit a common theme: Words and sentences are meant to represent English text. The three differences reflect three characteristics of English text: First, text is made of sequences of words, not complicated structures with sublists. Second, in manipulating text (for example, finding the plural of a noun) we need to look at the end of a word or sentence as often as at the beginning. Third, since words and sentences work together so closely, it makes sense to use the same tools with both. By contrast, from Scheme's ordinary point of view, an English sentence is just one particular case of a much more general data structure, whereas a symbol[3] is something entirely different.

The constructors and selectors for sentences reflect these three differences. For example, it so happens that Scheme represents lists in a way that makes it easy to find the first element, but harder to find the last one. That's reflected in the fact that there are no primitive selectors for lists equivalent to last and butlast for sentences. But we want last and butlast to be a part of the sentence package, so we have to write them in terms of the "real" Scheme list selectors. (In the versions presented here, we are ignoring the issue of applying the selectors to words.)
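
The versions referred to are not reproduced here. One plausible sketch, assuming nonempty sentences and ignoring words as stated, writes both selectors with car, cdr, cons, and null? only (these definitions shadow any procedures of the same names that an implementation may already provide):

```scheme
;; The last element is the car of the final one-element tail.
(define (last sent)
  (if (null? (cdr sent))
      (car sent)
      (last (cdr sent))))

;; Rebuild the sentence, stopping just before the final element.
(define (butlast sent)
  (if (null? (cdr sent))
      '()
      (cons (car sent) (butlast (cdr sent)))))

(last '(here comes the sun))     ; => sun
(butlast '(here comes the sun))  ; => (here comes the)
```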

If you look "behind the curtain" at the implementation, last is a lot more complicated than first. But from the point of view of a sentence user, they're equally simple.

In Chapter 16 we used the pattern matcher's known-values database to introduce the idea of abstract data types. In that example, the most important contribution of the ADT was to isolate the details of the implementation, so that the higher-level procedures could invoke lookup and add without the clutter of looking for exclamation points. We did hint, though, that the ADT represents a shift in how the programmer thinks about the sentences that are used to represent databases; we don't take the acronym of a database, even though the database is a sentence and so it would be possible to apply the acronym procedure to it. Now, in thinking about sentences, this idea of shift in viewpoint is more central. Although sentences are represented as lists, they behave much like words, which are represented quite differently.[4] Our sentence mechanism highlights the uses of sentences, rather than the implementation.

Higher-Order Functions

The higher-order functions that we've used until now work only for words and sentences. But the idea of higher-order functions applies perfectly well to structured lists. The official list versions of every, keep, and accumulate are called map, filter, and reduce.

Map takes two arguments, a function and a list, and returns a list containing the result of applying the function to each element of the list.
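
A few illustrative calls; note that the elements may themselves be lists:

```scheme
(map (lambda (x) (* x x)) '(1 2 3 4))        ; => (1 4 9 16)
(map car '((john lennon) (paul mccartney)))  ; => (john paul)
(map length '((a b) () (c d e)))             ; => (2 0 3)
```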

The word "map" may seem strange for this function, but it comes from the mathematical study of functions, in which they talk about a mapping of the domain into the range. In this terminology, one talks about "mapping a function over a set" (a set of argument values, that is), and Lispians have taken over the same vocabulary, except that we talk about mapping over lists instead of mapping over sets. In any case, map is a genuine Scheme primitive, so it's the official grownup way to talk about an every-like higher-order function, and you'd better learn to like it.

Filter also takes a function and a list as arguments; it returns a list containing only those elements of the argument list for which the function returns a true value. This is the same as keep, except that the elements of the argument list may be sublists, and their structure is preserved in the result.

Filter probably makes sense to you as a name; the metaphor of the air filter that allows air through but doesn't allow dirt, and so on, evokes something that passes some data and blocks other data. The only problem with the name is that it doesn't tell you whether the elements for which the predicate function returns #t are filtered in or filtered out. But you're already used to keep, and filter works the same way. Filter is not a standard Scheme primitive, but it's a universal convention; everyone defines it the same way we do.
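
Since filter is not guaranteed to be built in, here is one conventional way to define it, offered as a sketch rather than as the definition any particular package uses (it shadows a built-in filter if one exists):

```scheme
;; Keep the elements for which pred returns a true value, in order.
(define (filter pred lst)
  (cond ((null? lst) '())
        ((pred (car lst)) (cons (car lst) (filter pred (cdr lst))))
        (else (filter pred (cdr lst)))))

(filter even? '(1 2 3 4 5 6))  ; => (2 4 6)
(filter list? '(vanilla (ultra chocolate) ginger (mint chip)))
; => ((ultra chocolate) (mint chip))
```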

Reduce is just like accumulate except that it works only on lists, not on words. Neither is a built-in Scheme primitive; both names are seen in the literature. (The name "reduce" is official in the languages APL and Common Lisp, which do include this higher-order function as a primitive.)
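
As a rough sketch only, here is a two-argument reduce that assumes a nonempty list and combines elements from right to left. Both assumptions are ours, and the definition shadows any built-in reduce, which in some implementations takes different arguments.

```scheme
;; Combine the first element with the reduction of the rest.
;; Assumes lst has at least one element.
(define (reduce combiner lst)
  (if (null? (cdr lst))
      (car lst)
      (combiner (car lst) (reduce combiner (cdr lst)))))

(reduce + '(1 2 3 4))           ; => 10
(reduce max '(23 54 45 85 65))  ; => 85
(reduce append '((a b) (c (d e)) (f)))  ; => (a b c (d e) f)
```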

Other Primitives for Lists

The predicate equal?, which we've discussed earlier as applied to words and sentences, also works for structured lists.
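
For instance, with illustrative lists:

```scheme
(equal? '(a (b c) d) '(a (b c) d))          ; => #t
(equal? '(a (b c) d) '(a b c d))            ; => #f, the grouping differs
(equal? '(a (b c)) (list 'a (list 'b 'c)))  ; => #t
```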

The predicate member?, which we used in one of the examples above, isn't a true Scheme primitive, but part of the word and sentence package. (You can tell because it "takes apart" a word to look at its letters separately, something that Scheme doesn't ordinarily do.) Scheme does have a member primitive without the question mark that's like member? except for two differences: Its second argument must be a list (but can be a structured list); and instead of returning #t it returns the portion of the argument list starting with the element equal to the first argument. This will be clearer with an example:
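
An illustrative sketch of member, including a search for a sublist in a structured list:

```scheme
(member 'd '(a b c d e f g))  ; => (d e f g)
(member 'h '(a b c d e f g))  ; => #f
(member '(b c) '(a (b c) d))  ; => ((b c) d)
```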

This is the main example in Scheme of the semipredicate idea that we mentioned earlier in passing. It doesn't have a question mark in its name because it returns values other than #t and #f, but it works as a predicate because any non-#f value is considered true.

The only word-and-sentence functions that we haven't already mentioned are item and count. The list equivalent of item is called list-ref (short for "reference"); it's different in that it counts items from zero instead of from one and takes its arguments in the other order:
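
For example, with an illustrative list (the list comes first, then the zero-based index):

```scheme
(list-ref '(vanilla ginger strawberry lychee) 0)  ; => vanilla
(list-ref '(vanilla ginger strawberry lychee) 2)  ; => strawberry
(list-ref '(a (b c) d) 1)                         ; => (b c)
```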

The list equivalent of count is called length, and it's exactly the same except that it doesn't work on words.
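
A short sketch; note that a sublist counts as a single element:

```scheme
(length '(vanilla ginger strawberry))         ; => 3
(length '(vanilla (ultra chocolate) ginger))  ; => 3
(length '())                                  ; => 0
```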

Association Lists

An example earlier in this chapter was about translating from English to French. This involved searching for an entry in a list by comparing the first element of each entry with the information we were looking for. A list of names and corresponding values is called an association list, or an a-list. The Scheme primitive assoc looks up a name in an a-list:
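
For instance, with a small illustrative a-list:

```scheme
(define dictionary '((window fenetre) (book livre) (house maison)))

(assoc 'book dictionary)  ; => (book livre), the whole entry
(assoc 'cat dictionary)   ; => #f
```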

Assoc returns #f if it can't find the entry you're looking for in your association list. Our translate procedure checks for that possibility before using cadr to extract the French translation, which is the second element of an entry.
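
The procedure is not reproduced here. A sketch of a translate procedure along the lines just described, with an illustrative dictionary and a hypothetical local name record, could be:

```scheme
(define dictionary
  '((window fenetre) (book livre) (weekend (fin de semaine))))

(define (translate wd)
  (let ((record (assoc wd dictionary)))
    (if record
        (cadr record)               ; second element: the French part
        '(parlez-vous anglais?))))  ; assoc returned #f

(translate 'book)     ; => livre
(translate 'weekend)  ; => (fin de semaine)
(translate 'cat)      ; => (parlez-vous anglais?)
```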

Functions That Take Variable Numbers of Arguments

