|
There are two schools of thought about teaching computer science. We might caricature the two views this way:
• | The conservative view: Computer programs have become too large and complex to encompass in a human mind. Therefore, the job of computer science education is to teach people how to discipline their work in such a way that 500 mediocre programmers can join together and produce a program that correctly meets its specification. |
---|
• | The radical view: Computer programs have become too large and complex to encompass in a human mind. Therefore, the job of computer science education is to teach people how to expand their minds so that the programs can fit, by learning to think in a vocabulary of larger, more powerful, more flexible ideas than the obvious ones. Each unit of programming thought must have a big payoff in the capabilities of the program. |
---|
Of course nobody would admit to endorsing the first approach as we've
described it. Yet many introductory programming courses seem to spend half
their time on obscure rules of the programming language (semicolons go
between the instructions in Pascal, but after each
instruction in C) and the other half on stylistic commandments (thou shalt
comment each procedure with its preconditions and postconditions; thou shalt
not use goto
). In an article that was not intended as a
caricature, the noted computer scientist Edsger Dijkstra argues that
beginning computer science students should not be allowed to use
computers, lest they learn to debug their programs interactively instead of
writing programs that can be proven correct by formal methods before
testing.[1]
If you are about to be a student in an introductory computer science course, you may already be an experienced programmer of your home computer, or instead you may have only a vague idea of what you're getting into. Perhaps you suspect that programming a computer is like programming a VCR: entering endless obscure numeric codes. Even if you're already a computer programmer, you may not yet have a clear idea of what computer science means. In either case, what we want to do in this book is put our best foot forward—introduce you to some new ideas, get you excited, rather than mold you into a disciplined soldier of the programming army.
In order to understand the big ideas, though, we'll also have to expend some effort on technical details; studying computer science without writing computer programs is like trying to study German grammar without learning any of the words in the language. But we'll try to keep the ideas in view while struggling with the details, and we hope you'll remember them too.
We said that our approach to teaching computer science emphasizes big ideas. Our explanation of symbolic programming in the following paragraphs is in part just an illustration of that approach. But we chose this particular example for another reason also. Scheme, the programming language used in this book, is an unusual choice for an introductory computer science course. You may wonder why we didn't use a more traditional language, such as Pascal, Modula-2, or C. Our discussion of symbolic programming is the beginning of an answer to that question.
Originally computers were about numbers. Scientists used them to solve equations; businesses used them to compute the payroll and the inventory. We were rescued from this boring state of affairs mainly by researchers in artificial intelligence—people who wanted to get computers to think more nearly the way people do, about ideas in general rather than just numbers.
What does it mean to represent ideas in a computer? Here's a simple example: We want to teach the computer to answer the question, "Was so-and-so a Beatle?" We can't quite ask the question in English; in this book we interact with the computer using Scheme. Our interactions will look like this:
You type: (beatle? 'paul)
Computer replies: #t
(computerese for "true")
You type: (beatle? 'elvis)
Computer replies: #f
("false")
Here's the program that does the job:
(define (beatle? person) (member? person '(john paul george ringo)))
If you examine this program with a (metaphoric) magnifying glass, you'll find that it's really still full of numbers. In fact, each letter or punctuation character is represented in the computer by its own unique number.[2]But the point of the example is that you don't have to know that! When you see
(john paul george ringo)
you don't have to worry about the numbers that represent the letters inside the computer; all you have to know is that you're seeing a sentence made up of four words. Our programming language hides the underlying mechanism and lets us think in terms more appropriate to the problem we're trying to solve. That hiding of details is called abstraction, one of the big ideas in this book.
Programming with words and sentences is an example of symbolic programming. In 1960 John McCarthy invented the Lisp programming language to handle symbolic computations like this one. Our programming language, Scheme, is a modern dialect of Lisp.
Symbolic programming is one aspect of the reason why we like to teach computer science using Scheme instead of a more traditional language. More generally, Lisp (and therefore Scheme) was designed to support what we've called the radical view of computer science. In this view, computer science is about tools for expressing ideas. Symbolic programming allows the computer to express ideas; other aspects of Lisp's design help the programmer express ideas conveniently. Sometimes that goal comes in conflict with the conservative computer scientist's goal of protection against errors.
Here's an example. We want to tell our computer, "To square a number, multiply it by itself." In Scheme we can say
(define (square num) (* num num))
The asterisk represents multiplication, and is followed by the
two operands—in this case, both the same number. This short program works
for any number, of course, as we can see in the following dialogue. (The
lines with >
in front are the ones you type.)
> (square 4) 16 > (square 3.14) 9.8596 > (square -0.3) 0.09
But the proponents of the 500-mediocre-programmer
school[3] think this straightforward approach
is sinful. "What!" they cry. "You haven't said whether num
is a whole number or a number with a decimal fraction!" They're
afraid that you might write the square
program with whole
numbers in mind, and then apply it to a decimal fraction by
mistake. If you're on a team with 499 other programmers, it's easy
to have failures of communication so that one programmer uses
another's program in unintended ways.
To avoid that danger, they want you to write these two separate programs:
function SquareOfWholeNumber(num: integer): integer; begin SquareOfWholeNumber := num * num end; function SquareOfDecimalNumber(num: real): real; begin SquareOfDecimalNumber := num * num end;
Isn't this silly? Why do they pick this particular distinction (whole numbers and decimals) to worry about? Why not positive and negative numbers, for example? Why not odd and even numbers?
That two-separate-program example is written in the Pascal language. Pascal was designed by Niklaus Wirth, one of the leaders of the structured programming school, specifically to force programming students to write programs that fit conservative ideas about programming style and technique; you can't write a program in Pascal at all unless you write it in the approved style. Naturally, this language has been very popular with school teachers.[4] That's why, as we write this in 1993, the overwhelming majority of introductory computer science classes are taught using Pascal, even though no professional programmer would be caught dead using it.[5]
For fourteen years after the introduction of Pascal in 1970, its hegemony in computer science education was essentially unchallenged. But in 1984, two professors at the Massachusetts Institute of Technology and a programmer at Bolt, Beranek and Newman (a commercial research lab) published the Scheme-based Structure and Interpretation of Computer Programs (Harold Abelson and Gerald Jay Sussman with Julie Sussman, MIT Press/McGraw-Hill). That ground-breaking text brought the artificial intelligence approach to a wide audience for the first time. We (Brian and Matt) have been teaching their course together for several years. Each time, we learn something new.
The only trouble with SICP is that it was written for MIT students, all of whom love science and are quite comfortable with formal mathematics. Also, most of the students who use SICP at MIT have already learned to program computers before they begin. As a result, many other schools have found the book too challenging for a beginning course. We believe that everyone who is seriously interested in computer science must read SICP eventually. Our book is a prequel; it's meant to teach you what you need to know in order to read that book successfully.[6] Generally speaking, our primary goal in Parts I-V has been preparation for SICP, while the focus of Part VI is to connect the course with the kinds of programming used in "real world" application programs like spreadsheets and databases. (These are the last example and the last project in the book.)
This book is intended as an introduction to computer programming and to computer science for two kinds of students.
For those whose main interest is in some other field, we provide a self-contained, one-semester experience with computer programming in a language with a minimum of complicated notation, so that students can quickly come in contact with high-level ideas about algorithms, functions, and recursion. The book ends with the implementation of a spreadsheet program and a database program, so it complements a computer application course in which the commercial versions of such programs are used.
For those who intend to continue the study of computer science but who have no prior programming experience, we offer a preparatory course, less intense than a traditional CS 1 but not limited to programming technique; we give the flavor of computer science ideas that will be studied in more depth later in the curriculum. We also include an extensive discussion of recursion, which is a stumbling block for many beginning students.
The course at Berkeley for which we wrote this book includes both categories of students. About 90% of the first-year students who intend to major in computer science have already had a programming course in high school, and most of them begin with SICP. The other 10% are advised to take this course first. But many of the students in this course aren't computer science majors. A few other departments (business administration and architecture are the main ones) have a specific computer course requirement, and all students must meet a broader "quantitative reasoning" requirement; our course satisfies these requirements. Finally, some students come just out of curiosity about computers.
We assume that you have never programmed a computer. On the other hand, we do assume that you can use a computer; we don't talk about how to turn it on, how to edit text, and so on, because those details are too different from one computer model to another. If you've never used a computer before, you may wish to spend a few days with a book written specifically for your machine that will introduce you to its operation. It won't take more than a few days, because you don't have to be an expert before you read our book. As long as you can start up the Scheme interpreter and correct your typing mistakes, you're ready.
We assume that you're not a mathematics lover. (If you are, you might be ready to read SICP right away.) The earlier example about squaring a number is about as advanced as we get. And of course you don't have to do any arithmetic at all; computers are good at that. You'll learn how to tell the computer to do arithmetic, but that's no harder than using a pocket calculator. Most of our programming examples are concerned with words and sentences rather than with numbers. A typical example is to get Scheme to figure out the plural form of a noun. Usually that means putting an "s" on the end, but not quite always. (What's the plural of "French fry"?)
Do the exercises! Whenever we teach programming, we always get students who say, "When I read the book it all makes sense, but on the exams, when you ask me to write a program, I never know where to start." Computer science is two things: a bunch of big ideas, as we've been saying, and also a skill. You can't learn the skill by watching.
Do the exercises on a computer! It's not good enough to solve the exercises on paper, even if you feel sure your solution is correct. Maybe it's 99% correct but there's some little detail you've overlooked. When you run such a program, you won't get 99% of the answer you wanted. By trying the exercise on the computer, you get unambiguous feedback. If your program is correct, you get the response you expected. If not, not.
Don't feel bad if you don't get things right the first time. Even the most experienced programmers have to debug their programs—that is, fix the parts that don't work. In fact, an important part of what you'll learn from the exercises is the process of debugging your solutions. It would be too bad if all of your programs in this course worked the first time, because that would let you avoid the practice in debugging that you'll certainly need when you write more complicated programs later. Also, don't be afraid or ashamed to ask for help if you get stuck. That, too, is part of the working style of professional programmers.
In some of the chapters, we've divided the exercises into two categories, "boring" and "real." The boring exercises ask you to work through examples mechanically, to make sure you understand the rules. The real exercises ask you to invent something, usually a small computer program, but sometimes an explanation of some situation that we present. (In some chapters, the exercises are just labeled "exercises," which means that they're all considered "real.") We don't intend that the boring exercises be handed in; the idea is for you to do as many of them as you need to make sure you understand the mechanics of whatever topic you're learning.
Occasionally we introduce some idea with a simplified explanation, saving the whole truth for later. We warn you when we do this. Also, we sometimes write preliminary, partial, or incorrect example programs, and we always flag these with a comment like
(define (something foo baz) ;; first version )
When we introduce technical terms, we sometimes mention the origin of the word, if it's not obvious, to help prevent the terminology from seeming arbitrary.
This book starts easy but gets harder, in two different ways. One is that we spend some time teaching you the basics of Scheme before we get to two hard big ideas, namely, function as object and recursion. The earlier chapters are short and simple. You may get the idea that the whole book will be trivial. You'll change your mind in Parts III and IV.
The other kind of difficulty in the book is that it includes long programming examples and projects. ("Examples" are programs we write and describe; "projects" are programs we ask you to write.) Writing a long program is quite different from writing a short one. Each small piece may be easy, but fitting them together and remembering all of them at once is a challenge. The examples and projects get longer as the book progresses, but even the first example, tic-tac-toe, is much longer and more complex than anything that comes before it.
As the text explains more fully later, in this book we use some extensions to the standard Scheme language—features that we implemented ourselves, as Scheme programs. If you are using this book in a course, your instructor will provide our programs for you, and you don't have to worry about it. But if you're reading the book on your own, you'll need to follow the instructions in Appendix A.
There are several reference documents at the end of the book. If you don't understand a technical term in the text, try the Glossary for a short definition, or the General Index to find the more complete explanation in the text. If you've forgotten how to use a particular Scheme primitive procedure, look in the Alphabetical Table of Scheme Primitives, or in the General Index. If you've forgotten the name of the relevant primitive, refer to the inside back cover, where all the primitive procedures are listed by category. Some of our example programs make reference to procedures that were defined earlier, either in another example or in an exercise. If you're reading an example program and it refers to some procedure that's defined elsewhere, you can find that other procedure in the Index of Defined Procedures.
[1] "On the Cruelty of Really Teaching Computer Science," Communications of the ACM, vol. 32, no. 12, December, 1989.
[2] The left parenthesis is 40, for example, and the letter
d
is 100. If it were a capital D
it would be 68.
[3] Their own names for their approach are structured programming and software engineering.
[4] Of course, your teacher isn't an uptight authoritarian, or you wouldn't be using our book!
[5] Okay, we're exaggerating. But even Professor Wirth himself has found Pascal so restrictive that he had to design more flexible languages—although not flexible enough—called Modula and Oberon.
[6] As the ideas pioneered by SICP have spread, we are starting to see other intellectually respectable introductions to computer science that are meant as alternatives to SICP. In particular, we should acknowledge Scheme and the Art of Programming (George Springer and Daniel P. Friedman, MIT Press/McGraw-Hill, 1989) as a recognized classic. We believe our book will serve as preparation for theirs, too.
Brian Harvey,
bh@cs.berkeley.edu