|
The language that we use in this book isn't exactly standard Scheme. We've provided several extensions that may seem unusual to an experienced Scheme programmer. This may make the book feel weird at first, but there's a pedagogic reason for each extension.
Along with our slightly strange version of Scheme, our book has a slightly unusual order of topics. Several ideas that are introduced very early in the typical Scheme-based text are delayed in ours, most notably recursion. Quite a few people have looked at our table of contents, noted some particular big idea of computer science, and remarked, "I can't believe you wait so long before getting to such and such!"
In this preface for instructors, we describe and explain the unusual elements of our approach. Other teaching issues, including the timing and ordering of topics, are discussed in the Instructor's Manual.
The chapter named "Lists" in this book is Chapter 17, about halfway through the book. But really we use lists much earlier than that, almost from the beginning.
Teachers of Lisp have always had trouble deciding when and how to introduce lists. The advantage of an early introduction is that students can then write interesting symbolic programs instead of boring numeric ones. The disadvantage is that students must struggle with the complexity of the implementation, such as the asymmetry between the two ends of a list, while still also struggling with the idea of composition of functions and Lisp's prefix notation.
We prefer to have it both ways. We want to spare beginning students the risk of accidentally constructing ill-formed lists such as
((((() . D) . C) . B) . A)
but we also want to write natural-language programs from the
beginning of the book. Our solution is to borrow from Logo the idea of a
sentence abstract data type.[1] Sentences are
guaranteed to be flat, proper lists, and they appear to be symmetrical to
the user of the abstraction. (That is, it's as easy to ask for the last
word of a sentence as to ask for the first word.) The sentence
constructor accepts either a word or a sentence in any argument position.
We defer structured lists until we have higher-order functions and
recursion, the tools we need to be able to use the structure
effectively.[2] A
structured list can be understood as a tree, and Lisp programmers generally
use that understanding implicitly. After introducing car
-cdr
recursion, we present an explicit abstract data type for trees, without
reference to its implementation. Then we make the connection between these
formal trees and the name "tree recursion" used for structured lists
generally. But Chapter 18 can be omitted, if the instructor finds
the tree ADT unnecessary, and the reader of Chapter 17 will still
be able to use structured lists.
We haven't said what a word is. Scheme includes separate data types
for characters, symbols, strings, and numbers. We want to be able to
dissect words into letters, just as we can dissect sentences into words, so
that we can write programs like plural
and pig-latin
. Orthodox
Scheme style would use strings for such purposes, but we want a sentence to
look (like this)
and not ("like" "this")
. We've arranged that
in most contexts symbols, strings, and numbers can be used interchangeably;
our readers never see Scheme characters at all.[3]
Although a word made of letters is represented internally as a symbol, while
a word made of digits is represented as a number, above the abstraction line
they're both words. (A word that standard Scheme won't accept as a symbol
nor as a number is represented as a string.)
There is an efficiency cost to treating both words and sentences as abstract aggregates, since it's slow to disassemble a sentence from right to left and slow to disassemble a word in either direction. Many simple procedures that seem linear actually behave quadratically. Luckily, words aren't usually very long, and the applications we undertake in the early chapters don't use large amounts of data in any form. We write our large projects as efficiently as we can without making the programs unreadable, but we generally don't make a fuss about it. Near the end of the book we discuss explicitly the efficient use of data structures.
Even though computers represent numbers internally in many different ways (fixed point, bignum, floating point, exact rational, complex), when people visit mathland, they expect to meet numbers there, and they expect that all the numbers will understand how to add, subtract, multiply, and divide with each other. (The exception is dividing by zero, but that's because of the inherent rules of mathematics, not because of the separation of numbers into categories by representation format.)
We feel the same way about visiting textland. We expect to meet English
text there. It takes the form of words and sentences. The operations that
text understands include first
, last
, butfirst
, and
butlast
to divide the text into its component parts. You can't divide an
empty word or sentence into parts, but it's just as natural to divide a word
into letters as to divide a sentence into words. (The ideas of mathland and
textland, as well as the details of the word and sentence procedures, come
from Logo.)
Some people who are accustomed to Scheme's view of data types consider
first
to be badly "overloaded"; they feel that a procedure that selects an
element from a list shouldn't also extract a letter from a symbol. Some of
them would prefer that we use car
for lists, use substring
for
strings, and not disassemble symbols at all. Others want us to define
word-first
and sentence-first
.
To us, word-first
and sentence-first
sound no less awkward than
fixnum-+
and bignum-+
. Everyone agrees that it's reasonable to
overload the name +
because the purposes are so similar. Our students
find it just as reasonable that first
works for words as well as for
sentences; they don't get confused by this.
As for the inviolability of symbols—the wall between names and data—we
are following an older Lisp tradition, in which it was commonplace to
explode
symbols and to construct new names within a program. Practically
speaking, all that prevents us from representing words as strings is that
Scheme requires quotation marks around them. But in any case, the
abstraction we're presenting is that the data we're dissecting are neither
strings nor symbols, but words.
Scheme relies on procedure invocation as virtually its only control
mechanism. In order to write interesting programs, a Scheme user must
understand at least one of two hard ideas: recursion or procedure as object
(in order to use higher-order procedures). We believe that higher-order
procedures are easier to learn, especially because we begin in Chapter
8 by applying them only to named procedures. Using a named procedure
as an argument to another procedure is the way to use procedures as objects
that's least upsetting to a beginner. After the reader is comfortable with
higher-order procedures, we introduce lambda
; after that we introduce
recursion. We do the tic-tac-toe example with higher-order procedures and
lambda
, but not recursion.
In this edition, however, we have made the necessary minor revisions so that an instructor who prefers to begin with recursion can assign Part IV before Part III.
When we get to recursion, we begin with an example of embedded recursion. Many books begin with the simplest possible recursive procedure, which turns out to be a simple sequential recursion, or even a tail recursion. We feel that starting with such examples allows students to invent the "go back" model of recursion as looping.
One of the most unusual characteristics of this book is that there is no
assignment to variables in it. The reason we avoid set!
is that the
environment model of evaluation is very hard for most students. We use a
pure substitution model throughout most of the book. (With the background
they get from this book, students should be ready for the environment model
when they see a rigorous presentation, as they will, for example, in Chapter
3 of SICP.)
As the last topic in the book, we do introduce a form of mutation, namely
vector-set!
. Mutation of vectors is less problematic than mutation of
lists, because lists naturally share storage. You really have to go out of
your way to get two pointers to the same vector.[4] Mutation of data
structures is less problematic than assignment to variables because it
separates the issue of mutation from the issues of binding and scope. Using
vectors raises no new questions about the evaluation process, so we present
mutation without reference to any formal model of evaluation. We
acknowledge that we're on thin ice here, but it seems to work for our
students.
In effect, our model of mutation is the "shoebox" model that you'd find in a mainstream programming language text. Before we get to mutation, we use input/output programming to introduce the ideas of effect and sequence; assigning a value to a vector element introduces the important idea of state. We use the sequential model to write two more or less practical programs, a spreadsheet and a database system. A more traditional approach to assignment in Scheme would be to build an object-oriented language extension, but the use of local state variables would definitely force us to pay attention to environments.
[1] Speaking of abstraction, even though that's the name of Part V, we do make an occasion in each of the earlier parts to talk about abstraction as examples come up.
[2] Even then, we take lists as a primitive data type. We don't teach about pairs or improper lists, except as a potential pitfall.
[3] Scheme's primitive
I/O facility gives you the choice of expressions or characters. Instead of
using read-char
, we invent read-line
, which reads a line as a
sentence, and read-string
, which returns the line as one long word.
[4] We don't talk about
eq?
at all. We're careful to write our programs in such a way that
the issue of identity doesn't arise for the reader.
Brian Harvey,
bh@cs.berkeley.edu