<HTML>
<HEAD>
<TITLE>Computer Science Logo Style vol 3 ch 3: Algorithms and Data Structures</TITLE>
</HEAD>
<BODY>
<CITE>Computer Science Logo Style</CITE> volume 3:
<CITE>Beyond Programming</CITE> 2/e Copyright (C) 1997 MIT
<H1>Algorithms and Data Structures</H1>

<TABLE width="100%"><TR><TD>
<IMG SRC="../csls3.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/v3ch03.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../v3-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="../v3ch2/v3ch2.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v3ch4/v3ch4.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-3">MIT
Press web page for <CITE>Computer Science Logo Style</CITE></A>
</TABLE></TABLE>

<HR><P>Program file for this chapter: <A HREF="algs.lg"><CODE>algs</CODE></A>

<P>What's wrong with this procedure?

<P><PRE>to process.sentence :sent
output fput (process.word first :sent) (process.sentence bf :sent)
end
</PRE>

<P>If you said "It's a recursive procedure without a stop rule,"
you've solved a simple problem in <EM>analysis of algorithms.</EM> This branch
of computer science is concerned with the <EM>correctness</EM> and the
<EM>efficiency</EM> of programs.

<P>The field is called analysis of <EM>algorithms</EM> rather than analysis of
<EM>programs</EM> because the emphasis is on the meaning of a program (how
you might express the program in English) and not on the details of its
expression in a particular language. For example, the error in the procedure

<P><PRE>to process.sentence :sent
if emptyp :sent [output []]
output fput (process.word first :sent) (<U>process.snetence</U> bf :sent)
end
</PRE>

<P>is just as fatal as the error in the first example, but there isn't
much of theoretical interest to say about a misspelled procedure name. On
the other hand, another branch of computer science,
<EM>program verification,</EM> is concerned
with developing techniques to ensure that a
computer program really does correctly express the algorithm it is meant to
express.
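<P>Here, for reference, is a version with both bugs repaired--the stop rule
restored and the recursive call spelled correctly. (Like the versions above,
this sketch assumes that a <CODE>process.word</CODE> procedure is defined
elsewhere.)

<P><PRE>to process.sentence :sent
if emptyp :sent [output []]
output fput (process.word first :sent) (process.sentence bf :sent)
end
</PRE>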
<P>The examples I've given so far are atypical, it's worth noting, in that you
were able to see the errors in the procedures without having any real idea
of what they do! Without seeing <CODE>process.word</CODE> you can tell that this
program is doing something or other to each word of a sentence, but you
don't know whether the words are being translated to another language,
converted from upper case to lower case letters, translated into a string of
phoneme codes to be sent to a speech synthesizer, or what. (Even what I just
said about doing something to the words of a sentence is an inference based
on the names of procedures and variables, which might not really reflect the
results of the program.) More interesting problems in analysis of
algorithms have to do with the specific properties of particular algorithms.

<P>In this chapter I discuss algorithms along with <EM>data structures,</EM> the
different ways in which information can be represented in a computer program,
because these two aspects of a program interact strongly. That is, the
choices you make as a programmer about data representation have a profound
effect on the algorithms you can use to solve your problem. Similarly, once
you have chosen an algorithm, that choice determines the particular kinds of
information your program will need to do its work. Algorithms and data
structures are the central concerns of software engineering, the overall
name for the study of how to turn a problem statement into a working program
in a way that uses both the computer and the programming staff effectively.

<P><H2>Local Optimization vs. Efficient Algorithms</H2>

<P>When you're trying to make a computer program run as fast as possible,
the most obvious place to start is with the details of the instructions
in the program. But that's generally <EM>not</EM> the most effective
approach. Rethinking the big picture can give much more dramatic
improvements. To show you what I mean, I'll give some examples of both.
Later I'll talk about <EM>order of growth,</EM> a more formal way to
understand this same idea.

<P>Consider the following procedure that implements
the <EM>quadratic formula</EM>

<P><CENTER><IMG SRC="qformula.gif"></CENTER>

<P>This formula gives the two values of <EM>x</EM> that solve the equation
<P><CENTER><EM>ax</EM><SUP><SMALL>2</SMALL></SUP> + <EM>bx</EM> + <EM>c</EM> = 0</CENTER><P>

<P><PRE>to quadratic :a :b :c
localmake "x1 (-:b + sqrt (:b*:b - 4*:a*:c))/(2*:a)
localmake "x2 (-:b - sqrt (:b*:b - 4*:a*:c))/(2*:a)
print (sentence [The solutions are] :x1 "and :x2)
end
</PRE>

<P>Before we talk about the efficiency of this program, is it correct?
This is not a simple yes-or-no question. The procedure gives the correct
results for those quadratic equations that have two real solutions. For
example, the equation 2<EM>x</EM><SUP><SMALL>2</SMALL></SUP>+5<EM>x</EM>-3 = 0 has the solutions <EM>x</EM> = −3 and ½; the instruction <CODE>quadratic 2 5 -3</CODE> will print those solutions.
Some quadratic equations, like <EM>x</EM><SUP><SMALL>2</SMALL></SUP>-8<EM>x</EM>+16 = 0, have only one solution; for
these equations, <CODE>quadratic</CODE> prints the same solution twice, which is
not exactly incorrect but still a little embarrassing. Other equations,
like <EM>x</EM><SUP><SMALL>2</SMALL></SUP>+1 = 0, have solutions that are complex numbers. Depending on our
purposes, we might want <CODE>quadratic</CODE> to print the solutions <EM>i</EM> and −<EM>i</EM>
for this equation, or we might want it to print "This equation has no real
solutions." But since most versions of Logo do not provide complex
arithmetic, what will really happen is that we'll get the error message

<P><PRE>sqrt doesn't like -1 as input.
</PRE>

<P>If <CODE>quadratic</CODE> is used as part of a larger project, getting
the Logo error message means that the program dies altogether in this case
without a chance to recover. If we have several equations to
solve, it would be better if the program could continue to the remaining
equations even if no solution is found for one of them. A program that
operates correctly for the kinds of inputs that the programmer had in mind,
but blows up when given unanticipated inputs, is said not to be <EM>
robust;</EM> a robust program should do something appropriate even if there
are errors in its input data.
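<P>One way to make <CODE>quadratic</CODE> robust along these lines--a sketch,
with message wording of my own choosing--is to test the discriminant before
handing it to <CODE>sqrt</CODE>:

<P><PRE>to quadratic :a :b :c
; report gracefully instead of letting sqrt complain
if (:b*:b - 4*:a*:c) < 0 [print [This equation has no real solutions.] stop]
localmake "x1 (-:b + sqrt (:b*:b - 4*:a*:c))/(2*:a)
localmake "x2 (-:b - sqrt (:b*:b - 4*:a*:c))/(2*:a)
print (sentence [The solutions are] :x1 "and :x2)
end
</PRE>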
<P>But my real reason for displaying this example is to discuss its efficiency.
It computes the expression

<P><PRE>sqrt (:b*:b - 4*:a*:c)
</PRE>

<P>twice. Multiplication is slower than addition on most computers;
this expression involves three multiplications as well as the even slower
square root extraction. The program would be faster if it were written this
way:

<P><PRE>to quadratic :a :b :c
localmake "sqrt sqrt (:b*:b - 4*:a*:c)
localmake "x1 (-:b + :sqrt)/(2*:a)
localmake "x2 (-:b - :sqrt)/(2*:a)
print (sentence [The solutions are] :x1 "and :x2)
end
</PRE>

<P>This kind of change to a program is called
<EM>common subexpression elimination.</EM> It's a
pretty easy way to speed up a program;
so easy, in fact, that some "optimizing compilers" for large computers do
it automatically. In other words, an optimizing compiler for Logo would
treat the first version of <CODE>quadratic</CODE> as if it were written like the
second version. (As far as I know, nobody has actually written such a
compiler for Logo.)

<P>Common subexpression elimination is an example of <EM>local</EM> optimization.
This means that we improved the program by paying attention to one small
piece of it at a time. (A less elegant name for local optimization is "code
bumming.") Is it worth the effort? It depends how many times this procedure
will be run. When I say that multiplication is slow, I mean that it takes
a few millionths of a second. If you are writing a <CODE>quadratic</CODE>
procedure to do the dozen problems in your high school algebra homework, the
extra time you spend thinking up the second version and typing it into the
editor probably outweighs the saving of time in running the procedure. But
if you are trying to predict the weather and need to solve tens of thousands
of equations, the saving may be significant.

<P>If you want to write locally optimized
Logo programs, it's important to know that <CODE>first</CODE>, <CODE>butfirst</CODE>, and
<CODE>fput</CODE> take a constant amount of time regardless of the length of the
input list, whereas <CODE>last</CODE>, <CODE>butlast</CODE>, and <CODE>lput</CODE> take an amount
of time proportional to the length of the list. If you add <EM>n</EM> items to a
list, one by one, using <CODE>fput</CODE>, the length of time required is <EM>n</EM> times
the constant amount of time for each item. But if you add the same <EM>n</EM>
items using <CODE>lput</CODE>, if the first item takes <EM>t</EM> microseconds, the last
takes <EM>nt</EM>. On the average, each <CODE>lput</CODE> takes something like <EM>nt</EM>/2
microseconds, so the total time is <EM>n</EM><SUP><SMALL>2</SMALL></SUP><EM>t</EM>/2. The gain in efficiency from
using <CODE>fput</CODE> instead of <CODE>lput</CODE> isn't significant if your list has
only a few items, but the gain gets more and more significant as the size of
the list increases. Suppose you want to create a list of the numbers from 1
to 1000. One straightforward way would be

<P><PRE>print cascade 1000 [lput # ?] []
</PRE>
<P>On my home computer this instruction takes about 26
seconds.<SUP>*</SUP> If you
just use <CODE>fput</CODE> in place of <CODE>lput</CODE> the list will come out in the
wrong order, but it's possible to reverse the order of that list to get the
same result as the first version:

<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>The actual time depends not only on what model of
computer you have but also on how much memory and what else is in it.</SMALL></BLOCKQUOTE></SMALL><P><PRE>print reverse cascade 1000 [fput # ?] []
</PRE>

<P>You might think this would be slower, because it has the extra
<CODE>reverse</CODE> step. But in fact this instruction took about 12 seconds.
(It's important that the <CODE>reverse</CODE> tool in the Berkeley Logo
library uses <CODE>first</CODE>
and <CODE>fput</CODE> to do its work. If we used the more obvious

<P><PRE>to reverse :list
if emptyp :list [output []]
output lput first :list reverse bf :list
end
</PRE>

<P>then the elimination of <CODE>lput</CODE> in the <CODE>cascade</CODE> template
would be offset by the <CODE>lput</CODE> in the reversal.)
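<P>Here, for concreteness, is a sketch of how a reversal can be built entirely
from <CODE>first</CODE> and <CODE>fput</CODE>, so that it runs in time proportional to
the length of the list; the library version differs in detail, and the helper
name <CODE>reverse1</CODE> is my own:

<P><PRE>to reverse :list
output reverse1 :list []
end

to reverse1 :in :out
; move members one by one; each fput takes constant time
if emptyp :in [output :out]
output reverse1 (butfirst :in) (fput first :in :out)
end
</PRE>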
<P>At the other extreme, the broadest possible <EM>global</EM> optimization of a
program is to think of an entirely new algorithm based on a different way of
thinking about the original problem. As an example, in Chapter 2 there are
three different programs to solve the Simplex lock problem. The first
program, <CODE>lock</CODE>, works by enumerating explicitly all the different
patterns of possible combinations. That algorithm is not fundamentally
recursive; although my Logo implementation includes recursive subprocedures,
the number of combinations for a five-button lock is not determined on the
basis of the number for a four-button lock. (The program does compute the
number of combinations that use four out of the five available buttons, but
that isn't the same thing.) The second program, <CODE>simplex</CODE>, is based on
a mathematical argument relating the number of combinations to a fairly
simple <EM>recursive function</EM>--that is, a mathematical function with an
inductive definition. Because the function is computationally simple, the
intellectual energy I invested in doing the mathematics paid off with a
significantly faster program. The third version, <CODE>simp</CODE>, uses a closed
form formula that solves the problem in no time!

<P>To illustrate more sharply the differences among these three programs, I tried each
of them on a ten-button version of the Simplex lock. To compute <CODE>
lock 10</CODE> took 260 seconds on my computer; <CODE>simplex 10</CODE> took 80 seconds; and
<CODE>simp 10</CODE> took less than half a second. For this size lock, understanding the
mathematical idea of a recursive function sped up the program by a factor of
three, and using the mathematical technique called generating functions
achieved an additional speedup by a factor of almost 200! What's important
to understand about this example is that it wasn't better <EM>programming</EM>
skill that made the difference, but greater knowledge of <EM>mathematics.</EM>
(In the next section, though, you'll learn a programming trick that can
sometimes achieve similar speedups.)

<P>Many "trick" math problems involve a similar shift in thinking about the
fundamental algorithm to be used. For example, in Chapter 2 we computed the
probability of picking a matching pair of socks out of a drawer containing
six brown and four blue socks this way:
<P><CENTER><IMG SRC="math3.gif" ALT="math display"></CENTER><P>
Suppose we ask this question: Out of the same drawer of six browns and four
blues, we pick <EM>three</EM> socks at random; what is the probability that
at least two of the socks match?

<P>The number of triples of socks in which at least two are brown is the number
in which all three are brown, <IMG SRC="6choose3.gif">, plus the number in which two are
brown and one is blue, <IMG SRC="6c2x4c1.gif">. The number in which
at least two are blue is, similarly, <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/4c3+4c2x6c1.gif">. The total number of triples is of course <IMG SRC="10choose3.gif">.
So the probability of a matching pair within the chosen triple is
<P><CENTER><IMG SRC="math4.gif" ALT="math display"></CENTER><P>
which is 100% probability.
We could modify the <CODE>socks</CODE> procedure to get the same result by listing
all the possible triples and then filtering all the ones containing a
matching pair, using a filter like

<P><PRE>to haspair :triple
output or (memberp first :triple butfirst :triple) ~
          (equalp (item 2 :triple) last :triple)
end
</PRE>

<P>But the problem becomes entirely trivial if you notice that there
are only two possible colors, so obviously there is no way that three
randomly chosen socks can have three distinct colors! Of course there has
to be a matching pair within any possible triple. A problem like this, that
invites a messy arithmetic solution but is obvious if properly understood, is
the mathematician's idea of a joke. (Did you get it?)

<P><H2>Memoization</H2>

<P>Some efficiency tricks are applicable to a number of different problems and
become part of the "toolkit" of any professional programmer. One example
is relevant to many inductively defined functions; the trick is to have the
program remember the result of each invocation of the function. For example,
in Chapter 2 we defined the number of terms in a
multinomial expansion to be

<P><PRE>to t :n :k
if equalp :k 0 [output 1]
if equalp :n 0 [output 0]
output (t :n :k-1)+(t :n-1 :k)
end
</PRE>

<P>What happens when we compute <CODE>t 4 7</CODE>?
<P><CENTER><IMG SRC="math5.gif" ALT="math display"></CENTER><P>
Many calculations are performed repeatedly. In the chart above I've
underlined three places where <EM>t</EM>(3,5) is computed. Each of those in turn
involves repeated computation of <EM>t</EM>(3,4) and so on.

<P>Here is a version of <CODE>t</CODE> that uses property lists to remember all the
values it's already computed. This version will calculate <EM>t</EM>(3,5) only the
first time it's needed; a second request for <EM>t</EM>(3,5) will instantly output
the remembered value.

<P><PRE>to t :n :k
localmake "result gprop :n :k
if not emptyp :result [output :result]
make "result realt :n :k
pprop :n :k :result
output :result
end

to realt :n :k
if equalp :k 0 [output 1]
if equalp :n 0 [output 0]
output (t :n :k-1)+(t :n-1 :k)
end
</PRE>

<P>Computing <CODE>t 4 7</CODE> isn't really a big enough problem to
show off how much faster this version is; even the original program takes
only 2 ½ seconds on my home computer. But the amount of time
needed grows quickly as you try larger inputs.
Using the original procedure
from Chapter 2, my computer took just under five <EM>minutes</EM> to
compute <CODE>t 8 10</CODE>; the program shown here took less than two <EM>
seconds.</EM> This program computed <CODE>t 12 12</CODE> in about three seconds;
I estimate that the original procedure would take five <EM>hours</EM> to
solve that problem! (As you can imagine, I didn't try it.)

<P>The memoized version of <CODE>t</CODE> has the same structure as the original.
That is, in order to compute <CODE>t 4 7</CODE>, the program must first carry out
the two subtasks <CODE>t 4 6</CODE> and <CODE>t 3 7</CODE>. The only difference between
the original version and the memoized version is that in the latter,
whenever a second invocation is made with inputs that have already been
seen, the result is output immediately without repeating that subtask. Any
recursive function can be memoized in the same way.<SUP>*</SUP>

<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>I've used
property lists of numbers to hold the remembered values of the <CODE>t</CODE>
function. If I wanted to use the same technique for some other function in
the same workspace, I'd have to find a way to keep the values for the two
functions from getting in each other's way. For example, if <EM>t</EM>(4,7)=120
and some other function <EM>h</EM>(4,7)=83, then I might store the value

<P><PRE>[t 120 h 83]
</PRE>

<P>on the appropriate property list.</SMALL></BLOCKQUOTE></SMALL>
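<P>Another way to keep two functions' values apart--a sketch of my own,
different in detail from the scheme just described--is to fold the function's
name into the property name, so that <EM>t</EM>(4,7) is stored under the property
<CODE>t7</CODE> and <EM>h</EM>(4,7) under <CODE>h7</CODE>:

<P><PRE>to memo.get :name :n :k
; outputs the empty list if no value has been stored yet
output gprop :n (word :name :k)
end

to memo.put :name :n :k :value
pprop :n (word :name :k) :value
end
</PRE>

<P>With these helpers, the memoized <CODE>t</CODE> would say
<CODE>localmake "result memo.get "t :n :k</CODE> and
<CODE>memo.put "t :n :k :result</CODE> instead of using <CODE>gprop</CODE> and
<CODE>pprop</CODE> directly.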
<P>Sometimes, though, the same general idea--remembering the results
of past computations--can be used more effectively by changing the
program structure so that just the right subproblems are solved
as they're needed to work toward the overall solution. Rearranging
the program structure isn't called memoization, but we're still using
the same idea. For example,
the Simplex lock function
<P><CENTER><IMG SRC="math6.gif" ALT="math display"></CENTER><P>
from Chapter 2
is a combination (sorry about the pun) of values of two functions. It isn't
exactly helpful to remember every possible value of <IMG SRC="nchoosei.gif"> because
each value is used only once. But the calculation of <EM>f</EM>(<EM>n</EM>) uses the entire
<EM>n</EM>th row of Pascal's Triangle, and it's easy to compute that if we remember
the row above it. The values of <EM>f</EM>(<EM>i</EM>) are used repeatedly, so it makes
sense to keep a list of them. So my plan is to have two lists,
the first of which is a list of values of <EM>f</EM>(<EM>i</EM>) and the second a row of
Pascal's Triangle:

<P><PRE>round   ?1                  ?2

  0     [1]                 [1 1]
  1     [1 1]               [1 2 1]
  2     [3 1 1]             [1 3 3 1]
  3     [13 3 1 1]          [1 4 6 4 1]
  4     [75 13 3 1 1]       [1 5 10 10 5 1]
  5     [541 75 13 3 1 1]   [1 6 15 20 15 6 1]
</PRE>

<P>The solution to the problem is twice the first member of the last
value of <CODE>?1</CODE>.

<P>Instead of starting with a request for <EM>f</EM>(5) and carrying out subtasks
as needed, the new program will begin with <EM>f</EM>(0) and will work its way
up to larger input values until the desired result is found. This technique
is called <EM>dynamic programming:</EM>

<P><PRE>to simplex :buttons
output 2 * first (cascade :buttons
                          [fput (sumprods bf ?2 ?1) ?1] [1]
                          [fput 1 nextrow ?2] [1 1])
end

to sumprods :a :b
output reduce "sum (map "product :a :b)
end

to nextrow :combs
if emptyp butfirst :combs [output :combs]
output fput (sum first :combs first butfirst :combs) ~
       nextrow butfirst :combs
end
</PRE>

<P>I tried both versions of <CODE>simplex</CODE> for a 12-button lock.
The version in Chapter 2 took about 5 ½ minutes to get the
answer (which is that there are about 56 billion combinations); this
version took about one second, comparable to the closed form <CODE>simp</CODE>
procedure.

<P>If you just read this program with no prior idea of what algorithm it's using, it
must be hard to see how it reflects the original problem. But if you think
of it as a quasi-memoization of the earlier version it should make sense to you.

<P><H2>Sorting Algorithms</H2>

<P>Every textbook on algorithms uses sorting as one of its main examples.
There are several reasons for this. First of all, sorting is one of the
most useful things to do with a computer, in a wide variety of settings.
There are many different known sorting algorithms, ranging from obvious to
subtle. These algorithms have been analyzed with great care, so a lot is
known about their behavior. (What does that mean? It means we can answer
questions like "How long does this algorithm take, on the average?"
"How long does it take, at worst?" "If the things we're sorting are
mostly in order to begin with, does that make it faster?") And all this
effort pays off, in that the cleverest algorithms really are much faster
than the more obvious ones.

<P>The problem we want to solve is to take a list in unknown order and
rearrange it to get a new list of the same members in some standard order.
This might be alphabetical order if the members of the list are words, or
size order if they're numbers, or something else. For the most part, the
exact ordering relation isn't very important. As long as we have a way to
compare two items and find out which comes first (or that they're equal,
sometimes) it doesn't matter what the details of the comparison are. To
make things simple, in this chapter I'll assume we're always sorting numbers.

<P>Because the length of time per comparison depends on the nature of the
things being compared, and because that length isn't really part of what
distinguishes one sorting algorithm from another, analyses of the time taken
by a sorting program are usually expressed not in terms of seconds but in
terms of number of comparisons. This measure also eliminates the effect of
one computer being faster than another. To help make such measurements,
we'll compare two numbers using this procedure:

<P><PRE>to lessthanp :a :b
if not namep "comparisons [make "comparisons 0]
make "comparisons :comparisons+1
output :a < :b
end
</PRE>

<P>Of course, if we want to use <CODE>></CODE> or <CODE>=</CODE> comparisons in a
sorting algorithm, we should write analogous procedures for those. But in
fact I'll only need <CODE>lessthanp</CODE> for the algorithms I'm going to show you.
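<P>For instance, a counting version of the <CODE>=</CODE> test, analogous to
<CODE>lessthanp</CODE> and mentioned again later in this chapter, might look like
this sketch:

<P><PRE>to equaltop :a :b
if not namep "comparisons [make "comparisons 0]
make "comparisons :comparisons+1
output equalp :a :b
end
</PRE>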
<P>After trying out a sort program, we can find out how many comparisons it
made using this convenient little tool:

<P><PRE>to howmany
print :comparisons
ern "comparisons
end
</PRE>

<P>After telling us the number of comparisons, this procedure erases
the counter variable to prepare for the next experiment.

<P>If you haven't studied sort algorithms before, it will be a good exercise
for you to invent one yourself before you continue. Your procedure <CODE>
sort</CODE> should take a list of numbers as input, and should output a list of
the same numbers in order from smallest to largest.

<P><PRE>? <U>show sort [5 20 3 5 18 9]</U>
[3 5 5 9 18 20]
</PRE>

<P>Notice that it's allowable for two (or more) equal numbers to
appear in the input.

<P>So that we can compare different algorithms fairly, we should try them on
the same input data. You can make a list of 100 random numbers this way:

<P><PRE>make "list cascade 100 [fput random 100 ?] []
</PRE>

<P>You should try out both your sort procedures and mine on your
random list. In case you want to try your algorithm on my data, to
compare the exact numbers of comparisons needed, here is the list I used:

<P><PRE>[11 41 50 66 41 61 73 38 2 94 43 55 24 1 77 77 13 2 93 35
 43 69 9 46 88 20 43 73 11 74 69 33 28 4 5 1 15 17 13 94
 88 42 12 31 67 42 30 30 13 91 31 8 55 6 31 84 57 50 50 31
 36 52 5 12 10 19 69 0 9 81 62 14 39 54 45 72 18 47 48 35
 76 44 77 34 75 52 61 86 34 44 64 53 25 39 4 55 55 54 53 64]
</PRE>

<P>Notice in passing that this is a list of 100 random numbers, but
not a list of the first 100 numbers in random order. Some numbers, like 43,
appear more than once in the list, while others don't appear at all. This
is perfectly realistic for numbers that occur in real life, although of
course some situations give rise to lists of <EM>unique</EM> items.

<P><H2>Sorting by Selection</H2>

<P>Although there are many known sorting algorithms, most fall into two
main groups. There are the ones that order the input items one at a time
and there are the ones that divide the problem into roughly equal-sized
smaller problems. I'll show you one of each. Within a group, the
differences between algorithms have to do with details of exactly how the
problem is divided into pieces, what's where in computer memory, and so on.
But these details are generally much less important than the basic division
between the two categories. If you took my advice and wrote your own sort
procedure, and if you hadn't studied sorting before, the one you wrote is
almost certainly in the first category.

<P>My sample algorithm in the first group is a <EM>selection</EM> sort.
Expressed in words, the algorithm is this: First find the smallest number,
then find the next smallest, and so on. This idea can be put in recursive
form; the output from the procedure should be a list whose first member is
the smallest number and whose remaining elements are the sorted version of
the other numbers. I'll show you two versions of this algorithm: first a
straightforward but inefficient one, and then a version that's improved in
speed but not quite so obvious.
Here's the first version:

<P><PRE>to ssort :list
if emptyp :list [output []]
localmake "smallest reduce "min :list
output fput :smallest (ssort remove.once :smallest :list)
end

to remove.once :item :list
if equalp :item first :list [output butfirst :list]
output fput first :list (remove.once :item butfirst :list)
end
</PRE>

<P>In this version of <CODE>ssort</CODE>, we start by finding the
smallest number in the list. Then we remove that number from the
list, sort what's left, and put the smallest number back at the front
of the sorted list. The only slight complication is that I had to
write my own variant of <CODE>remove</CODE> that, unlike the standard Berkeley
Logo library version, removes only one copy of the chosen number from
the list, just in case the same number appears more than once.

<P>By using <CODE>min</CODE> to find the smallest number, I've interfered with
my goal of counting the number of comparisons, but I didn't worry about
that because I'm about to rewrite <CODE>ssort</CODE> anyway. The problem is
that this version goes through the list of numbers twice for each
invocation, first to find the smallest number and then again to remove
that number from the list that will be used as input to the recursive
call. The program will be much faster if it does the finding and the
removing all at once. The resulting procedure is a little harder to
read, but it should help if you remember that it's trying to do the same
job as the original version.

<P><PRE>to ssort :list
if emptyp :list [output []]
output ssort1 (first :list) (butfirst :list) []
end

to ssort1 :min :in :out
if emptyp :in [output fput :min ssort :out]
if lessthanp :min (first :in) ~
   [output ssort1 :min (butfirst :in) (fput first :in :out)]
output ssort1 (first :in) (butfirst :in) (fput :min :out)
end
</PRE>

<P><CODE>Ssort</CODE> is invoked once for each time a smallest number must
be found. For each of those iterations, <CODE>ssort1</CODE> is invoked once for
each member of the still-unsorted list; the numbers in the list are moved
from <CODE>:in</CODE> to <CODE>:out</CODE> except that the smallest-so-far is singled out
in <CODE>:min</CODE>.

<P>Suppose we try out <CODE>ssort</CODE> on our list of 100 numbers. How many
comparisons will be needed? To find the smallest of 100 numbers we have to
make 99 comparisons; the smallest-so-far must be compared against each of
the remaining ones. To find the next smallest requires 98 comparisons, and
so on. Finally we have two numbers remaining to be sorted and it takes one
comparison to get them in order. The total number of comparisons is
<P><CENTER>99+98+97+ ··· +2+1 = 4950</CENTER><P>
It makes no difference what order the numbers were in to begin with, or
whether some of them are equal, or anything else about the input data. It
takes 4950 comparisons for <CODE>ssort</CODE> to sort 100 numbers, period. You can
try out the program on various lists of 100 numbers to make sure I'm right.

<P>In general, if we want to sort a list of length <EM>n</EM> with <CODE>ssort</CODE> the
number of comparisons required is the sum of the integers from 1 to <EM>n</EM>−1.
It turns out that there is a closed form definition for this sum:
<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/sumints.gif"></CENTER><P>

<P>Selection sort uses these three steps:

<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Pull out the smallest value.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Sort the other values.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Put the smallest one at the front.

<P></TABLE><P>

<P>It's the first of those steps that does the comparisons. A similar
but different algorithm is <EM>insertion sort,</EM> which defers the
comparisons until the last step:

<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Pull out any old value (such as the <CODE>first</CODE>).
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Sort the other values.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Put the chosen one where it belongs, in order.

<P></TABLE><P>

<P>Try writing a procedure <CODE>isort</CODE> to implement this
algorithm. How many comparisons does it require? You'll find that
for this algorithm the answer depends on the input data. What is the
smallest possible number of comparisons? The largest number? What
kinds of input data give rise to these extreme values?
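<P>If you want to check your version against one of mine afterward, here is a
sketch that follows the three steps above; the helper name
<CODE>insert.sorted</CODE> is my own invention:

<P><PRE>to isort :list
if emptyp :list [output []]
output insert.sorted (first :list) (isort butfirst :list)
end

to insert.sorted :value :sorted
; put :value where it belongs within the already-sorted :sorted
if emptyp :sorted [output (list :value)]
if lessthanp :value first :sorted [output fput :value :sorted]
output fput first :sorted (insert.sorted :value butfirst :sorted)
end
</PRE>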
<P><H2>Sorting by Partition</H2>

<P>There is one fundamental insight behind all methods for sorting with fewer
comparisons: Two small sorting jobs are faster than one large one.
Specifically, suppose we have 100 numbers to sort. Using <CODE>ssort</CODE>
requires 4950 comparisons. Instead, suppose we split up the 100 numbers
into two groups of 50. If we use <CODE>ssort</CODE> on each group, each will
require 1225 comparisons; the two groups together require twice that, or
2450 comparisons. That's about half as many
comparisons as the straight <CODE>ssort</CODE> of 100 numbers.

<P>But this calculation underestimates how much time we can save using this
insight, because the same reasoning applies to each of those groups of 50.
We can split each into two groups of 25. Then how many comparisons will be
required altogether?

<P>The basic idea we'll use is to pick some number that we think is likely to
be a median value for the entire list; that is, we'd like half the numbers
to be less than this partition value and half to be greater. There are many
possible ways to choose this partition value; I'm going to take the average
of the first and last numbers of the (not yet sorted!) input. Then we run
through the input list, dividing it into two smaller pieces by comparing
each number against the partition value. (Notice that <CODE>ssort</CODE> compares
pairs of numbers within the input list; the partition sort compares one
number from the list against another number that might or might not itself
be in the list.) We use the same technique recursively to sort each of the
two sublists, then append the results.

<P>Note the similarities and differences between this selection sort algorithm:

<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Pull out the smallest value.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Sort the other values.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Put the smallest one at the front.

<P></TABLE><P>

<P>and the following <EM>partition sort</EM> algorithm:

<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Divide the input into the small-value half and the large-value half.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Sort each half separately.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Put the sorted small values in front of the sorted large ones.

<P></TABLE><P>
<P>Again, I'll write this program more than once, first in an overly simple version
and then more realistically. Here's the simple one:

<P><PRE>to psort :list
if (count :list) < 2 [output :list]
localmake "split guess.middle.value :list
output sentence psort filter [? < :split] :list ~
                psort filter [not (? < :split)] :list
end

to guess.middle.value :list
output ((first :list) + (last :list)) / 2
end
</PRE>

<P>To minimize the number of comparisons, we want to split the input list
into two equal-size halves. It's important to note that this program
is only guessing about which value of <CODE>:split</CODE> would achieve that
balanced splitting. You may wonder why I didn't take the average of
all the numbers in the list. There are two reasons. One is that that
would add a lot of time to the sorting process; we'd have to look at
every number to find the average, then look at every number again to do
the actual splitting. But a more interesting reason is that the average
isn't quite what we want anyway. Suppose we are asked to sort this
list of numbers:

<P><PRE>[3 2 1000 5 1 4]
</PRE>

<P>The average of these numbers is about 169. But if we use
that value as the split point, we'll divide the list into these two
pieces:

<P><PRE>[3 2 5 1 4] and [1000]
</PRE>

<P>Not a very even division! To divide this list of six values
into two equal pieces, we'd need a split point of 3 ½. In
general, what we want is the <EM>median</EM> value rather than the
<EM>average</EM> value. And if you think about it, you pretty much have
to have the numbers already sorted in order to find the median.<SUP>*</SUP> So we just try to make a good guess that
doesn't take long to compute.

<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>That
isn't quite true; there is a clever algorithm that can find the median
after only a partial sorting of the values. But it's true enough that
we can't use a sorting algorithm whose first step requires that we've
already done some sorting.</SMALL></BLOCKQUOTE></SMALL><P>Just as in the case of selection sort, one problem with this simple program
is that it goes through the input list twice, once for each of the calls to
<CODE>filter</CODE>. And the call to <CODE>count</CODE> in the end test adds a third walk
through the entire list. Here's a version that fixes those inefficiencies:

<P><PRE>to psort :list
if emptyp :list [output []]
if emptyp butfirst :list [output :list]
localmake "split ((first :list) + (last :list)) / 2
output psort1 :split :list [] []
end

to psort1 :split :in :low :high
if emptyp :in [output sentence (psort :low) (psort :high)]
if lessthanp first :in :split ~
   [output psort1 :split (butfirst :in) (fput first :in :low) :high]
output psort1 :split (butfirst :in) :low (fput first :in :high)
end
</PRE>

<P>This version of <CODE>psort</CODE> has one good attribute and one bad
attribute. The good attribute is that it's very cleanly organized. You can
see that it's the job of <CODE>psort</CODE> to choose the partition value and the
job of <CODE>psort1</CODE> to divide the list in two parts based on comparisons
with that value. The bad attribute is that it doesn't <EM>quite</EM> work as
it stands.

<P>As for any recursive procedure, it's important that the input get smaller
for each recursive invocation. If not, <CODE>psort</CODE> could end up invoking
itself over and over with the same input.
It's very easy to see that <CODE>
ssort</CODE> avoids that danger, because the input list is shorter by exactly one
member for each recursive invocation. But <CODE>psort</CODE> divides its input
into two pieces, each of which ought to be about half the length of the
original if we're lucky. We're lucky if the partition value we choose is,
in fact, the median value of the input list. If we're less lucky, the two
parts might be imbalanced, say ¼ of the members below the partition and
¾ of them above it. Can we be <EM>so</EM> unlucky that <EM>all</EM> of
the input numbers are on the same side of the partition? See if you can
think of a case in which this will happen.

<P>The partition value is chosen as the average of two members of the input
list. If those two members (the first and last) are unequal, then one of
them must be less than the partition value and the other greater. So there
will be at least one number on each side of the partition. But if the two
averaged numbers are equal, then the partition value will be equal to both
of them. Each of them will be put in the <CODE>high</CODE> bucket. If all the
other numbers in the list are also greater than or equal to the partition,
then they'll all end up in the <CODE>high</CODE> bucket, and nothing will be
accomplished in the recursion step. The simplest failing example is trying
to sort a list of numbers all of which are equal, like

<P><PRE>? <U>show psort [4 4 4 4 4]</U>
</PRE>

<P>We could take various approaches to eliminating this bug. See how
many ways you can think of, and decide which you like best, before reading
further.

<P>Since the problem has to do with the choice of partition value, you could
imagine using a more complicated means to select that value. For example,
you could start with the first member of the input list, then look through
the list for another member not equal to the first. When you find one,
average the two of them and that's the partition value. If you get all the
way to the end of the list without finding two unequal members, declare the
list sorted and output it as is. The trouble with this technique is that
many extra comparisons are needed to find the partition value. Those
comparisons don't really help in ordering the input, but they do add to the
time taken just as much as the "real" comparisons done later.

<P>Another approach is to say that since the problem only arises if the first
and last input members are equal, we should treat that situation as a
special case. That is, we'll add an instruction like

<P><PRE>if equalp first :list last :list [...]
</PRE>

<P>Again, this approach adds a comparison that doesn't really help to
sort the file, although it's better than the first idea because it only adds
one extra comparison per invocation instead of perhaps several.

<P>A more straightforward approach that might seem to make the program more
efficient, rather than less, is to divide the list into <EM>three</EM>
buckets, <CODE>low</CODE>, <CODE>high</CODE>, and <CODE>equal</CODE>. This way, the problem gets
shorter faster, since the <CODE>equal</CODE> bucket doesn't have to be sorted
recursively; it's already in order. The trouble is that it takes two
comparisons, one for equality and one for <CODE>lessthanp</CODE>, to know how to
direct each list member into a three-way split.
Some computers can compare
numbers once and branch in different directions for less, equal, or greater;
one programming language, Fortran, includes that kind of three-way branching
through an "arithmetic IF" statement that accepts different instructions
for the cases of a given quantity being less than, equal to, or greater than
zero. But in Logo we'd have to say

<P><PRE>if lessthanp first :in :split [...]
if equaltop first :in :split [...]
</PRE>

<P>with two comparisons for each list member. (I'm imagining that
<CODE>equaltop</CODE> would keep track of the number of comparisons just as
<CODE>lessthanp</CODE> does.)

<P>What I chose was to do the first <CODE>lessthanp</CODE> test for the list in <CODE>
psort</CODE> instead of <CODE>psort1</CODE>, and use it to ensure that either the first
or the last member of the list starts out the <CODE>low</CODE> bucket.

<P><PRE>to psort :list
if emptyp :list [output []]
if emptyp butfirst :list [output :list]
localmake "split ((first :list) + (last :list)) / 2
if lessthanp first :list :split ~
   [output psort1 :split (butfirst :list) (list first :list) []]
output psort1 :split (butlast :list) (list last :list) []
end
</PRE>

<P><CODE>Psort1</CODE> is unchanged.

<P>How many comparisons should <CODE>psort</CODE> require to sort 100 numbers? Unlike
<CODE>ssort</CODE>, its exact performance depends on the particular list of numbers
given as input. But we can get a general idea. The first step is to divide
the 100 numbers into two buckets, using 100 comparisons against the
partition value. The second step divides each of the buckets in two again.
We can't say, in general, how big each bucket is; but we do know that each
of the original 100 numbers has to be in one bucket or the other. So each
of 100 numbers will participate in a comparison in this second round also.
The same argument applies to the third round, and so on. Each round
involves 100 comparisons. (This isn't quite true toward the end of the
process. When a bucket only contains one number, it is considered sorted
without doing any comparisons. So as the buckets get smaller, eventually
some of the numbers "drop out" of the comparison process, while others are
still not in their final position.)

<P>Each round involves 100 comparisons, more or less. How many rounds are
there? This is where the original ordering of the input data makes itself
most strongly felt. If we're lucky, each round divides each bucket exactly
in half. In that case the size of the buckets decreases uniformly:

<P><PRE>round   size

  1     100
  2      50
  3      25
  4      12,13
  5       6,7
  6       3,4
  7       1,2
</PRE>

<P>There is no round 8, because by then all of the buckets would be
of length 1 and there is no work left to do. So if all goes well we'd
expect to make about 700 comparisons, really a little less because by round
7 some of the numbers are already in length-1 buckets. Maybe 650?

<P>What is the <EM>worst</EM> case? That would be if each round divides the
numbers into buckets as unevenly as possible, with one number in one bucket
and all the rest in the other. In that case it'll take 99 rounds to get all
the numbers into length-1 buckets. You may be tempted to estimate 9900
comparisons for that situation, but in fact it isn't quite so bad, because
at each round one number falls into a length-1 bucket and drops out of the
sorting process. So the first round requires 100 comparisons, but the
second round only 99, the third 98, and so on.
This situation is very much
like the way <CODE>ssort</CODE> works, and so we'd expect about 5000 comparisons.

<P>Now try some experiments. Try <CODE>psort</CODE> on your random list, then try to
find input lists that give the best and worst possible results.

<P><CODE>Psort</CODE> required 725 comparisons for my random list. That's somewhat
more than we predicted for the best case, but not too much more. <CODE>
Psort</CODE> seems to have done pretty well with this list. The simplest
worst-case input is one in which all the numbers are the same; I said

<P><PRE>make "bad cascade 100 [fput 20 ?] []
</PRE>

<P>to make such a list. <CODE>Psort</CODE> required 5049 comparisons to
sort this list, slightly <EM>worse</EM> than <CODE>ssort</CODE> at 4950 comparisons.

<P>What would a best-case input look like? It would divide evenly at each
stage; that is, the median value at each stage would be the average of the
first and last values. The simplest list that should meet that criterion is
a list of all the numbers from 1 to 100 in order:

<P><PRE>make "inorder cascade 100 [lput # ?] []
</PRE>

<P>(Or you could use the <CODE>reverse</CODE> trick discussed earlier, but
for only 100 numbers it didn't seem worth the extra typing to me.) Using
<CODE>psort</CODE> to sort this list should require, we said, somewhere around 650
to 700 comparisons. In fact it took 734 comparisons when I tried it,
slightly <EM>more</EM> than my randomly ordered list (725 comparisons).

<P>Even 734 comparisons isn't terrible by any means, but when an algorithm
performs worse than expected, a true algorithm lover wants to know why.
Test cases like these can uncover either inefficiencies in the fundamental
algorithm or else ways in which the actual computer program doesn't live up
to the algorithm as described in theoretical language. If we could "tune
up" this program to sort <CODE>:inorder</CODE> in fewer than 700 comparisons, the
change might well improve the program's performance for any input. See if
you can figure out what the problem is before reading further. You can try
having <CODE>psort</CODE> print out its inputs each time it's called, as a way to
help gather information.

<P>Here's a very large hint. I tried using the original version of <CODE>
psort</CODE>, before fixing the bug about the recursion sometimes leaving all the
numbers in one basket, and it sorted <CODE>:inorder</CODE> in only 672 comparisons.
(I knew the bug wouldn't make trouble in this case because none of the
numbers in this particular input list are equal.) Can you devise a better
<CODE>psort</CODE> that both works all the time and performs optimally for the
best-case input?

<P>This partition sorting scheme is essentially similar to a very well-known
algorithm named quicksort, invented by C. A. R. Hoare.
Quicksort includes many improvements over this algorithm, not primarily in
reducing the number of comparisons but in decreasing the overhead time by,
for example, exchanging pairs of input items in their original memory
locations instead of making copies of sublists. Quicksort also switches to
a more straightforward <CODE>ssort</CODE>-like algorithm for very small input lists,
because the benefit of halving the problem is outweighed by the greater
complexity. (In fact, for a two-item input, <CODE>ssort</CODE> makes one
comparison and <CODE>psort</CODE> two.)
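<P>To make that last idea concrete, here is a sketch of a hybrid sort that
falls back on <CODE>ssort</CODE> for short inputs; the name <CODE>hsort</CODE> and the
cutoff of 8 are arbitrary choices of mine, and the <CODE>count</CODE> test itself
adds a little overhead that a serious implementation would avoid:

<P><PRE>to hsort :list
if (count :list) < 8 [output ssort :list]   ; small input: selection sort
localmake "split ((first :list) + (last :list)) / 2
if lessthanp first :list :split ~
   [output hsort1 :split (butfirst :list) (list first :list) []]
output hsort1 :split (butlast :list) (list last :list) []
end

to hsort1 :split :in :low :high
if emptyp :in [output sentence (hsort :low) (hsort :high)]
if lessthanp first :in :split ~
   [output hsort1 :split (butfirst :in) (fput first :in :low) :high]
output hsort1 :split (butfirst :in) :low (fput first :in :high)
end
</PRE>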
<P>Here's the partition sort algorithm again:

<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Divide the input into the small-value half and the large-value half.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Sort each half separately.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Put the sorted small values in front of the sorted large ones.

<P></TABLE><P>

<P>The idea of cutting the problem in half is also used in
the following algorithm, called <EM>mergesort:</EM>

<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Divide the input arbitrarily into two equal size pieces.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Sort each half separately.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><EM>Merge</EM> the two sorted pieces by comparing values.

<P></TABLE><P>

<P>In a way, mergesort is to partition sort as insertion sort
is to selection sort. Both insertion sort and mergesort defer the
comparisons of numbers until the final step of the algorithm.

<P>There is one important way in which mergesort is better than partition
sort. Since mergesort doesn't care how the input values are separated, we
can ensure that the two pieces are each exactly half the size of the input.
Therefore, the number of comparisons needed is always as small as possible;
there are no bad inputs. Nevertheless, quicksort is more widely used than
mergesort, because the very best implementations of quicksort seem to
require less overhead time, for the average input, than the best
implementations of mergesort.

<P>If you want to write a mergesort program, the easiest way to divide a
list into two equal pieces is to select every other member, so the
odd-position members are in one half and the even-position members
in the other half.
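<P>Here is one possible sketch along those lines; the names <CODE>msort</CODE>,
<CODE>merge</CODE>, <CODE>odds</CODE>, and <CODE>evens</CODE> are inventions of mine, not
library procedures:

<P><PRE>to msort :list
if emptyp :list [output []]
if emptyp butfirst :list [output :list]
output merge (msort odds :list) (msort evens :list)
end

to odds :list
; members in positions 1, 3, 5, ...
if emptyp :list [output []]
output fput (first :list) (evens butfirst :list)
end

to evens :list
; members in positions 2, 4, 6, ...
if emptyp :list [output []]
output odds butfirst :list
end

to merge :a :b
; interleave two already-sorted lists into one sorted list
if emptyp :a [output :b]
if emptyp :b [output :a]
if lessthanp (first :a) (first :b) ~
   [output fput (first :a) (merge butfirst :a :b)]
output fput (first :b) (merge :a butfirst :b)
end
</PRE>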
<P><H2>Order of Growth</H2>

<P>I've mentioned that the complete quicksort algorithm includes several
optimization strategies to improve upon the partition sort I've
presented here. How important are these strategies? How much does
overhead contribute to the cost of a program? I did some experiments
to investigate this question.

<P>First I timed <CODE>psort</CODE> and <CODE>ssort</CODE> with inputs of length 300.
Here are the results:

<P><CENTER><TABLE>
<TR><TH>program<TH> <TH>comparisons<TH>time<TH>comparisons per second
<TR><TD align="center"><CODE>psort</CODE><TD><TD align="right">2940 <TD align="right">29 seconds<TD align="right">100
<TR><TD align="center"><CODE>ssort</CODE><TD><TD align="right">44850 <TD align="right">313 seconds<TD align="right">143
</TABLE></CENTER>

<P><CODE>Ssort</CODE> seems to have much less overhead, since it can do
more comparisons per second than <CODE>psort</CODE>. Nevertheless, <CODE>psort</CODE>
always seems to be faster, for every size input I tried. The number
of comparisons outweighs the overhead. (By the way, these numbers don't
measure how fast the computer can compare two numbers! A typical computer
could perform more than a million comparisons per second, if only
comparisons were involved. Most of the time in these experiments is
actually going into the process by which the Logo interpreter figures out
what the instructions in the program mean. Because Berkeley Logo is
an interpreter, this figuring-out process happens every time an instruction
is executed. By contrast, I tried <CODE>ssort</CODE> on a list of length 300 in
Object Logo, a compiled version in which each instruction is figured out
only once, and the total time was 3.6 seconds.)

<P>I wanted to give local optimization the best possible chance to win the
race, and so I decided to make selection sort as fast as I could, and
partition sort as slow as I could. I modified <CODE>ssort</CODE> to use the
Logo primitive <CODE>lessp</CODE> for comparison instead of doing the extra
bookkeeping of <CODE>lessthanp</CODE>, and I replaced <CODE>psort</CODE> with this
implementation:

<P><PRE>to slowsort :list
if (count :list) < 2 [output :list]
localmake "split (reduce "sum :list)/(count :list)
output (sentence slowsort filter [? < :split] :list
                 filter [? = :split] :list
                 slowsort filter [? > :split] :list)
end
</PRE>

<P>This version examines every member of the input list six times
on each recursive call! (<CODE>Count</CODE> is invoked twice; <CODE>reduce</CODE>
looks at every list member once; and <CODE>filter</CODE> is called three times
to do the actual partition.) Under these conditions I was able to get
<CODE>ssort</CODE> to win the race, but only for very small inputs:

<P><CENTER><TABLE>
<TR><TH>program<TH> <TH>20 numbers<TH> <TH>100 numbers<TH> <TH>300 numbers
<TR><TD><CODE>slowsort</CODE><TD><TD align="right">2.7 seconds<TD><TD align="right">18 seconds<TD><TD align="right">63 seconds
<TR><TD><CODE>ssort</CODE><TD><TD align="right">1.2 seconds<TD><TD align="right">20 seconds<TD><TD align="right">182 seconds
</TABLE></CENTER>

<P><CODE>Ssort</CODE> wins when sorting 20 numbers, but both programs
take less than three seconds. For 100 numbers, <CODE>slowsort</CODE> is
already winning the race, and its lead grows as the input list grows.
This is a common pattern: For small amounts of data, when the program is
fast enough no matter how you write it, local optimization can win the race,
but once the problem is large enough so that you actually care about
efficiency, choosing a better overall algorithm is always more important.
(Of course the very best results will come from choosing a good algorithm
<EM>and</EM> optimizing the program locally.)

<P>What does "a better algorithm" actually mean? How do we measure the
quality of an algorithm? We've made a good start by counting the number
of comparisons required for our two sorting algorithms, but there is a
formal notation that can make the issues clearer.

<P>Earlier I said that for a list of <EM>n</EM> numbers, <CODE>ssort</CODE> makes
<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/sumints.gif"></CENTER><P>
comparisons. But in a sense this tells us more than we want to
know, like saying that a certain program took 1,243,825 microseconds instead
of saying that it took somewhat over a second. The important thing to say
about <CODE>ssort</CODE> is that the number of comparisons is roughly proportional
to <EM>n</EM><SUP><SMALL>2</SMALL></SUP>; that is, doubling the size of the input list will quadruple the
time needed to sort the list. More formally, we say that the time required
for selection sorting is O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>), pronounced "big-oh of <EM>n</EM><SUP><SMALL>2</SMALL></SUP>" or
"order <EM>n</EM><SUP><SMALL>2</SMALL></SUP>." This is an abbreviation for the statement, "for large enough <EM>n</EM>,
the time is bounded by <EM>n</EM><SUP><SMALL>2</SMALL></SUP> times a constant."
The part about "for large enough <EM>n</EM>" is important because the running
time for some algorithm might, for example, involve a large constant setup
time. For small <EM>n</EM> that setup time might contribute more to the overall
time required than the part of the algorithm proportional<SUP>*</SUP> to <EM>n</EM><SUP><SMALL>2</SMALL></SUP>, but
once <EM>n</EM> becomes large enough, the <EM>n</EM><SUP><SMALL>2</SMALL></SUP> part will overtake any constant.

<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Strictly
speaking, the fact that an algorithm's time requirement is O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) doesn't
mean that it's even approximately proportional to <EM>n</EM><SUP><SMALL>2</SMALL></SUP>, because O(...)
only establishes an upper bound. The time requirement could be proportional
to <EM>n</EM>, which would be better than <EM>n</EM><SUP><SMALL>2</SMALL></SUP>, and still be O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>). But
usually people use O(...) notation to mean that no smaller
order of growth would work, even though there's an official notation
with that meaning, Θ(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>), pronounced "big theta."</SMALL></BLOCKQUOTE></SMALL><P>I'd like to have a similar representation, O(something), for the
number of comparisons used by <CODE>psort</CODE> in the typical case. We said that
for 100 numbers, each round of sorting involves about 100 comparisons, and
the number of rounds required is the number of times you have to divide 100
by 2 before you get down to 1, namely between 6 and 7. The number of
comparisons expected is the product of these numbers. In the general case, the
first number is just <EM>n</EM>. But what is the general formula for the
number of times you have to divide <EM>n</EM> by 2 to get 1? The answer is log<SUB><SMALL>2</SMALL></SUB> <EM>n</EM>. For example, if we had 128 numbers in the list instead of 100, we would
require exactly 7 rounds (in the best case) because 2<SUP><SMALL>7</SMALL></SUP> = 128 and so
log<SUB><SMALL>2</SMALL></SUB> 128 = 7. (By the way, log<SUB><SMALL>2</SMALL></SUB> 100 ≅ 6.65, so the
theoretical best case for 100 numbers is 665 comparisons.)

<P>In general, all the obvious sorting algorithms are O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) and all the
clever ones are O(<EM>n</EM> log <EM>n</EM>).<SUP>*</SUP> (I don't
have to say O(<EM>n</EM> log<SUB><SMALL>2</SMALL></SUB> <EM>n</EM>) because the difference between logarithms to
different bases is just multiplication by a constant factor, which doesn't
count in O(...) notation, just as I don't have to worry about the fact
that the formula for <CODE>ssort</CODE> comparisons is nearer <EM>n</EM><SUP><SMALL>2</SMALL></SUP>/2 than <EM>n</EM><SUP><SMALL>2</SMALL></SUP>.)
By the way, I haven't really <EM>proven</EM> that <CODE>psort</CODE> is O(<EM>n</EM> log <EM>n</EM>)
in the <EM>typical</EM> case, only that it is in the best case. It's much
harder to prove things about the typical (or average) performance of any
sorting algorithm, because what is an "average" input list? For some
algorithms there is no proven formula for the average run time, but only the
results of experiments with many randomly chosen lists.
+ +<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Don Knuth has written an O(<EM>n</EM><SUP><SMALL>3</SMALL></SUP>) + +sort program, just as an example of especially bad programming.</SMALL></BLOCKQUOTE></SMALL><P>An O(<EM>n</EM>) algorithm is called <EM>linear</EM> and an O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) one <EM> +quadratic.</EM> O(<EM>n</EM> log <EM>n</EM>) is in between those, better than quadratic but not +as good as linear. What other orders of growth are commonly found? As one +example, the pre-memoized procedures for the <CODE>t</CODE> and <CODE>simplex</CODE> +functions in Chapter 2 have time requirements that are O(2<SUP><SMALL><EM>n</EM></SMALL></SUP>); these are +called <EM>exponential</EM> algorithms. This means that just adding one +to <EM>n</EM> makes the program take twice as long! The experimental results I gave +earlier agree with this formula: <CODE>simplex 10</CODE> took 80 seconds, while +<CODE>simplex 12</CODE> took 5 ½ minutes, about four times as long. +<CODE>Simplex 16</CODE> would take over an hour. (Start with 80 seconds, and +double it six times.) The memoized versions in this chapter are O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) +(can you prove that?), which is much more manageable. But for some +<EM>really</EM> hard problems there is no known way to make them any +faster than O(2<SUP><SMALL><EM>n</EM></SMALL></SUP>); +problems like that are called <EM>intractable,</EM> while ones that +are merely polynomial--O(<EM>n</EM><SUP><SMALL><EM>i</EM></SMALL></SUP>) for any constant <EM>i</EM>--are called <EM> +tractable.</EM> + +<P><H2>Data Structures</H2> + + +<P>One of the reasons computers are so useful is that they can contain +a tremendous amount of information. An important part of writing +a large computer program is figuring out how to organize the information +used in the program. Most of the time, a program doesn't have to +deal with a bunch of unrelated facts, like "Tomatoes are red" +and "7 times 8 is 56." Instead there will be some kind of uniformity, +or natural grouping, in the information a program uses. + +<P>We'll see, too, that the data structures you choose for your program affect +the algorithms available to you. Organizing your data cleverly can reduce +the execution time of your procedures substantially. + +<P><H2>Data Structures in Real Life</H2> + +<P>Why should there be different kinds of organization? It might help +to look at some analogies with data structures outside of computers. +First of all, think about your personal address book. It probably +has specific pages reserved for each letter of the alphabet. Within +a letter, the names and addresses of people starting with that letter +are arranged in no particular order; you just add a new person in +the first free space on the page. + +<P>Now think about the printed telephone directory for your city. In +this case the names are not arranged randomly within each letter; +they're in strict alphabetical order. In computer science terminology, +the printed directory is a <EM>sorted list,</EM> while your personal +directory is a <EM>hash table.</EM> + +<P>Obviously, if the entries in the printed directory weren't in order +it would take much too long to find an address. You can get away +with random ordering within each letter in your personal directory +because you know a small enough number of people that it doesn't take +too long to look through each one. 
But you probably do know enough +people to make the separate page for each letter worthwhile, even +though the page for Q may be practically empty. By using separate +pages for each letter, with unused slots on each page reserved +for expansion, you are <EM>spending space</EM> to <EM>buy time.</EM> +That is, your address book is bigger than it would be if it were just +one long list, but looking up a number is faster this way. This +tradeoff between time and space is typical of computer +programming problems also. + +<P>Why don't you keep your personal directory in strict alphabetical order, +like the printed phone book? If you did that, looking up a number would be +even faster. The problem is that <EM>adding</EM> a new number would be +terribly painful; you'd have to erase all the names on the page below where +the new name belongs and rewrite each of them one line below where it +was.<SUP>*</SUP> In this case there is a tradeoff between <EM> +storage time</EM> and <EM>retrieval time;</EM> you pay a small +price in retrieval time to avoid a large price in storage time. This, too, +is a common aspect of data structure design in computer programs. + +<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Why doesn't the phone company have to do that whenever they get +a new customer? The answer is that they maintain the directory information +in two forms. The printed directory is the <EM>external</EM> representation, +used only for looking up information; inside their computers they use an +<EM>internal</EM> representation that allows for easier insertion of new +entries.</SMALL></BLOCKQUOTE></SMALL><P>Other kinds of real-life data structures also have computer analogues. +If your desk looks like mine, with millions of little slips of paper +all over the place, it is what computer scientists call a <EM> +heap.</EM><SUP>*</SUP> +This might be an appropriate data structure for those cases in +which the program must deal with a large mass of unrelated facts. +On the other hand, in a large business office there will be a <EM> +hierarchical</EM> filing system. A file cabinet labeled "Personnel +Records" might contain a drawer labeled "Inactive A-H"; that drawer +would contain a file folder for each former employee whose name starts +with an appropriate letter. This kind of hierarchy might be represented +as a <EM>tree</EM> in a computer program. + +<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Unfortunately, there are two things called a "heap" in +computer science. I'm thinking of the storage allocation heap, not the +special tree structure used in the "heapsort" algorithm.</SMALL></BLOCKQUOTE></SMALL><P><H2>Trees</H2> + +<P>We've used the idea of trees before. In Volume 1, the program to solve +pitcher pouring problems was designed in terms of a tree of possible +solution steps, although that tree was not represented as actual data +in the program. In Volume 2, I wrote <CODE>tree.map</CODE> as an example of a +higher order function operating on trees in which the leaf nodes are +words and the branch nodes are phrases. In this chapter we'll work +toward a general representation of trees as an abstract data type. + +<P>Here is a hierarchical grouping of some of the world's cities: + +<P><CENTER><IMG SRC="worldtree.gif" ALT="figure: worldtree"></CENTER> + +<P>Recall that a diagram like this is called a tree because it resembles a real +tree turned upside-down. 
Each place where a word or phrase appears in the +tree is called a <EM>node.</EM> At the top of the diagram is the <EM> +root node</EM> (<CODE>world</CODE>). The lines between nodes are called <EM> +branches.</EM> The cities, which do not have branches extending below them, +are called <EM>leaf nodes.</EM> The in-between nodes, the countries +and provinces, are called <EM>branch nodes.</EM> (The root node is +also considered a branch node since it, too, has branches below it.) This +tree tells us that Toronto is in Ontario, which is in Canada, which is in the +world. + +<P>A tree is a very general data structure because its shape is very flexible. +For example, in the part of this tree that represents Canada I've included +a level of tree structure, representing the provinces, that isn't +included in the subtree that represents France. As we'll see later, some +algorithms deal with restricted categories of trees. For example, a <EM> + +binary</EM> tree is a tree with at most two branches below each branch node. + +<P>So far this data structure is just a graphic representation on paper. There +are many ways in which a tree can be implemented in a computer program. +Let's say that I want to represent the tree of cities in a computer so that +I can ask questions from this data base. That is, I'd like to write a +program that will allow interactions like + +<P><PRE>? <U>show locate "Montreal</U> +[world Canada Quebec Montreal] +</PRE> + +<P>Let's pick a particular way to represent a tree in Logo. (I +should warn you that later in the chapter I'm going to invent different +representations, but I want to start with a simple one to meet our +immediate needs. So what you're about to see is not the final version of +these procedures.) Here's +the way I'm going to do it: Each branch node will be represented as a Logo +variable whose name is the name of the node, containing as its value a list +of the names of the nodes immediately below it. For example, this tree will +include a variable named <CODE>France</CODE> with the value + +<P><PRE>[Paris Dijon Avignon] +</PRE> + +<P>A leaf node is just a word that appears in a node list but isn't +the name of a variable. For a branch node, <CODE>thing</CODE> of the node's name +will provide a list of the names of its children. I can set up the tree +with this procedure: + +<P><PRE>to worldtree +make "world [France China Canada] +make "France [Paris Dijon Avignon] +make "China [Beijing Shanghai Guangzhou Suzhou] +make "Canada [Ontario Quebec Manitoba] +make "Ontario [Toronto Ottawa Windsor] +make "Quebec [Montreal Lachine] +make "Manitoba [Winnipeg] +end +</PRE> + +<P>In principle, <CODE>locate</CODE> is a lot like <CODE>filter</CODE>, in the sense that +we're looking through a data structure for something that meets a given +condition. But the implementation is a bit trickier than looking through +a sequential list, because each invocation gives rise to several recursive +invocations (one per child), not merely one recursive invocation as usual. +The program will be easier to understand if we introduce the term +<EM>forest,</EM> which means a list of trees. 
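<P>Before reading the program, it may help to see what these conventions look
like at the Logo prompt.  (This transcript is my illustration, assuming
<CODE>worldtree</CODE> has already been run; it isn't part of the original
text.)  The forest below a branch node is just the <CODE>thing</CODE> of its
name, and <CODE>namep</CODE> distinguishes branch nodes from leaves:

<P><PRE>? <U>show thing "world</U>
[France China Canada]
? <U>show namep "France</U>
true
? <U>show namep "Paris</U>
false
</PRE>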
+ +<P><PRE>to locate :city +output locate1 :city "world +end + +to locate1 :city :subtree +if equalp :city :subtree [output (list :city)] +if not namep :subtree [output []] +localmake "lower locate.in.forest :city (thing :subtree) +if emptyp :lower [output []] +output fput :subtree :lower +end + +to locate.in.forest :city :forest +if emptyp :forest [output []] +localmake "child locate1 :city first :forest +if not emptyp :child [output :child] +output locate.in.forest :city butfirst :forest +end + +? <U>show locate "Shanghai</U> +[world China Shanghai] +? <U>show locate "Montreal</U> +[world Canada Quebec Montreal] +</PRE> + +<P>Once we've set up this data base, we can write procedures to ask it other +kinds of questions, too. + +<P><PRE>to cities :tree +if not namep :tree [output (list :tree)] +output map.se [cities ?] thing :tree +end + +? <U>show cities "France</U> +[Paris Dijon Avignon] +? <U>show cities "Canada</U> +[Toronto Ottawa Windsor Montreal Lachine Winnipeg] +</PRE> + +<P><H2>Improving the Data Representation</H2> + +<P>There's a problem with the representation I've chosen for this tree. +Suppose we want to expand the data base to include the city of Quebec. +This city is in the province of Quebec, so all we have to do is add the name +to the appropriate list: + +<P><PRE>make "Quebec [Montreal Quebec Lachine] +</PRE> + +<P>If you try this, however, you'll find that <CODE>locate</CODE> and <CODE> +cities</CODE> will no longer work. They'll both be trapped in infinite loops. + +<P>The problem with this program can't be fixed just by changing the program. +It's really a problem in the way I decided to represent a tree. I said, "a +leaf node is just a word that appears in a node list but isn't the name of a +variable." But that means there's no way to allow a leaf node with the same +name as a branch node. To solve the problem I have to rethink the +conventions I use to represent a tree. + +<P>Being lazy, I'd like to change as little as possible in the program, so I'm +going to try to find a new representation as similar as possible to the old +one. Here's my idea: In my mind I associate a <EM>level</EM> with each +node in the tree. The node <CODE>world</CODE> is at level 1, <CODE>France</CODE> and +<CODE>Canada</CODE> at level 2, and so on. The names of the variables used to hold +the contents of a node will be formed from the node name and the level: +<CODE>world1</CODE>, <CODE>France2</CODE>, <CODE>Ontario3</CODE>, and so on. This solves the +problem because the node for Quebec province will be a branch node by virtue +of the variable <CODE>Quebec3</CODE>, but the node for Quebec city will be a leaf +node because there will be no <CODE>Quebec4</CODE> variable. + +<P>As it turns out, though, I have to change the program quite a bit to make +this work. Several procedures must be modified to take the level number +as an additional input. Also, since the variable that holds the information +about a place is no longer exactly named with the place's name, <CODE>cities</CODE> +has some extra work to do, just to find the node whose cities we want to +know. It can almost use <CODE>locate</CODE> for this purpose, but with a slight +wrinkle: If we ask for the cities in Quebec, we mean Quebec province, not +Quebec city. So we need a variant of <CODE>locate</CODE> that finds the node +highest up in the tree with the desired place name. 
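<P>One mechanical detail before the revised program: the level-numbered
variable names are built by gluing a node's name to its level with the
<CODE>word</CODE> primitive.  (A quick check of my own, not from the original
text:)

<P><PRE>? <U>show word "Quebec 3</U>
Quebec3
</PRE>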
I gave subprocedure +<CODE>locate1</CODE> an extra input, named <CODE>highest</CODE>, that's <CODE>true</CODE> if we want +the highest matching tree node (when called from <CODE>cities</CODE>) or <CODE>false</CODE> +if we want a matching leaf node (when called from <CODE>locate</CODE>). + +<P><PRE>to worldtree +make "world1 [France China Canada] +make "France2 [Paris Dijon Avignon] +make "China2 [Beijing Shanghai Guangzhou Suzhou] +make "Canada2 [Ontario Quebec Manitoba] +make "Ontario3 [Toronto Ottawa Windsor] +make "Quebec3 [Montreal Quebec Lachine] +make "Manitoba3 [Winnipeg] +end + +to locate :city +output locate1 :city "world 1 "false +end + +to locate1 :city :subtree :level :highest +localmake "name (word :subtree :level) +if and :highest equalp :city :subtree [output (list :city)] +if not namep :name ~ + [ifelse equalp :city :subtree + [output (list :city)] + [output []]] +localmake "lower locate.in.forest :city (thing :name) :level+1 :highest +if emptyp :lower [output []] +output fput :subtree :lower +end + +to locate.in.forest :city :forest :level :highest +if emptyp :forest [output []] +localmake "child locate1 :city first :forest :level :highest +if not emptyp :child [output :child] +output locate.in.forest :city butfirst :forest :level :highest +end + +to cities :tree +localmake "path locate1 :tree "world 1 "true +if emptyp :path [output []] +output cities1 :tree count :path +end + +to cities1 :tree :level +localmake "name (word :tree :level) +if not namep :name [output (list :tree)] +output map.se [(cities1 ? :level+1)] thing :name +end + +? <U>show locate "Quebec</U> +[world Canada Quebec Quebec] +? <U>show cities "Canada</U> +[Toronto Ottawa Windsor Montreal Quebec Lachine Winnipeg] +</PRE> + +<P>This new version solves the Quebec problem. But I'm still not satisfied. +I'd like to add the United States to the data base. This is a country whose +name is more than one word. How can I represent it in the tree structure? +The most natural thing would be to use a list: <CODE>[United States]</CODE>. +Unfortunately, a list can't be the name of a variable in Logo. Besides, +now that I've actually written the program using this representation I see +what a kludge it is! + +<P><H2>Trees as an Abstract Data Type</H2> + +<P>My next idea for representing a tree is to abandon the use of a separate +variable for each node; instead I'll put the entire tree in one big list. +A node will be a list whose <CODE>first</CODE> is the datum at that node and whose +<CODE>butfirst</CODE> is a list of children of the node. So the entire tree will be +represented by a list like this: + +<P><PRE>[world [France ...] [[United States] ...] [China ...] [Canada ...]] +</PRE> + +<P>The datum at each node can be either a word or a list. + +<P>But this time I'm going to be smarter than before. I'm going to recognize +that the program I'm writing should logically be separated into two parts: +the part that implements the <EM>tree</EM> data type, and the part that uses +a tree to implement the data base of cities. If I write procedures like +<CODE>locate</CODE> and <CODE>cities</CODE> in terms of general-purpose tree subprocedures +like <CODE>leafp</CODE>, a predicate that tells whether its input node is a leaf +node, then I can change my mind about the implementation of trees (as I've +done twice already) without changing that part of the program at all. + +<P>I'll start by implementing the abstract data type. 
I've decided that a tree
will be represented as a list with the datum of the root node as its <CODE>first</CODE> and the subtrees in the <CODE>butfirst</CODE>.  To make this work I need
<EM>selector</EM> procedures:

<P><PRE>to datum :node
output first :node
end

to children :node
output butfirst :node
end
</PRE>

<P>and a <EM>constructor</EM> procedure to build a node out of its
pieces:

<P><PRE>to tree :datum :children
output fput :datum :children
end
</PRE>

<P>Selectors and constructors are the main procedures needed to define any data
structure, but there are usually some others that can be useful.  For the
tree, the main missing one is <CODE>leafp</CODE>.

<P><PRE>to leafp :node
output emptyp children :node
end
</PRE>

<P>Now I can use these tools to write the data base procedures.

<P><PRE>to locate :city
output locate1 :city :world "false
end

to locate1 :city :subtree :wanttree
if and :wanttree (equalp :city datum :subtree) [output :subtree]
if leafp :subtree ~
   [ifelse equalp :city datum :subtree
           [output (list :city)]
           [output []]]
localmake "lower locate.in.forest :city (children :subtree) :wanttree
if emptyp :lower [output []]
output ifelse :wanttree [:lower] [fput (datum :subtree) :lower]
end

to locate.in.forest :city :forest :wanttree
if emptyp :forest [output []]
localmake "child locate1 :city first :forest :wanttree
if not emptyp :child [output :child]
output locate.in.forest :city butfirst :forest :wanttree
end

to cities :name
output cities1 (finddatum :name :world)
end

to cities1 :subtree
if leafp :subtree [output (list datum :subtree)]
output map.se [cities1 ?] children :subtree
end

to finddatum :datum :tree
output locate1 :datum :tree "true
end
</PRE>

<P>Once again, <CODE>cities</CODE> depends on a variant of <CODE>locate</CODE> that
outputs the subtree below a given name, instead of the usual <CODE>locate</CODE>
output, which is a list of the names on the path from <CODE>world</CODE> down to
a city.  But instead of having <CODE>cities</CODE> call <CODE>locate1</CODE> directly,
I decided that it would be more elegant to provide a procedure <CODE>finddatum</CODE> that takes a datum and a tree as inputs, whose output is the
subtree below that datum.

<P>In <CODE>cities1</CODE>, the expression

<P><PRE>(list datum :subtree)
</PRE>

<P>turns out to be equivalent to just <CODE>:subtree</CODE> for the case of
a leaf node.  (It is only for leaf nodes that the expression is evaluated.)
By adhering to the principle of data abstraction I'm making the program work
a little harder than it really has to.  The advantage, again, is that this
version of <CODE>cities</CODE> will continue working even if we change the
underlying representation of trees.  The efficiency cost is quite low;
changing the expression to <CODE>:subtree</CODE> is a local optimization comparable
to the common subexpression elimination we considered early in the chapter.

<P>I also have to revise the procedure to set up the tree.  It's going to
involve many nested invocations of <CODE>tree</CODE>, like this:

<P><PRE>to worldtree
make "world tree "world
                (list (tree "France
                            (list (tree "Paris [])
                                  (tree "Dijon [])
                                  (tree "Avignon []) ) )
                      (tree "China
                            (list ...
</PRE>

<P>and so on.  I can shorten this procedure somewhat by inventing
an abbreviation for making a subtree all of whose children are leaves.

<P><PRE>to leaf :datum
output tree :datum []
end

to leaves :leaves
output map [leaf ?]
:leaves +end + +to worldtree +make "world + tree "world + (list (tree "France leaves [Paris Dijon Avignon]) + (tree "China leaves [Beijing Shanghai Guangzhou Suzhou]) + (tree [United States] + (list (tree [New York] + leaves [[New York] Albany Rochester + Armonk] ) + (tree "Massachusetts + leaves [Boston Cambridge Sudbury + Maynard] ) + (tree "California + leaves [[San Francisco] Berkeley + [Palo Alto] Pasadena] ) + (tree "Washington + leaves [Seattle Olympia] ) ) ) + (tree "Canada + (list (tree "Ontario + leaves [Toronto Ottawa Windsor] ) + (tree "Quebec + leaves [Montreal Quebec Lachine] ) + (tree "Manitoba leaves [Winnipeg]) ) ) ) +end + +? <U>worldtree</U> +? <U>show cities [United States]</U> +[[New York] Albany Rochester Armonk Boston Cambridge Sudbury Maynard + [San Francisco] Berkeley [Palo Alto] Pasadena Seattle Olympia] +? <U>show locate [Palo Alto]</U> +[world [United States] California [Palo Alto]] +</PRE> + +<P><H2>Tree Modification</H2> + + +<P>So far, so good. But the procedure <CODE>worldtree</CODE> just above is very +error-prone because of its high degree of nesting. In the earlier versions +I could create the tree a piece at a time instead of all at once. In a +practical data base system, people should be able to add new information at +any time, not just ask questions about the initial information. That is, +I'd like to be able to say + +<P><PRE>addchild :world (tree "Spain leaves [Madrid Barcelona Orense]) +</PRE> + +<P>to add a subtree to the world tree. New nodes should be possible +not only at the top of the tree but anywhere within it: + +<P><PRE>addchild (finddatum "Canada :world) (tree [Nova Scotia] leaves [Halifax]) +</PRE> + +<P>Most versions of Logo do not provide tools to add a new member to an +existing list. We could write a program that would make a new copy of the +entire tree, adding the new node at the right place in the copy, so that +<CODE>addchild</CODE> would take the form + +<P><PRE>make "world newcopy :world ... +</PRE> + +<P>but there are two objections to that. First, it would be quite +slow, especially for a large tree. Second, it would work only if I refrain +from assigning subtrees as the values of variables other than <CODE>world</CODE>. +That is, I'd like to be able to say + +<P><PRE>? <U>make "Canada finddatum "Canada :world</U> +? <U>addchild :Canada (tree [Nova Scotia] leaves [Halifax])</U> +? <U>show locate "Halifax</U> +[world Canada [Nova Scotia] Halifax] +</PRE> + +<P>Even though I've added the new node to the tree <CODE>:Canada</CODE>, it +should also be part of <CODE>:world</CODE> (which is where <CODE>locate</CODE> looks) +because <CODE>:Canada</CODE> is a subtree of <CODE>:world</CODE>. Similarly, I'd like to +be able to add a node to the Canadian subtree of <CODE>:world</CODE> and have it +also be part of <CODE>:Canada</CODE>. That wouldn't be true if <CODE>addchild</CODE> +makes a copy of its tree input instead of modifying the existing tree. + +<P>I'm going to solve this problem two ways. In the Doctor project in +Volume 2 of this series, you learned that Berkeley Logo does include +primitive procedures that allow the modification of an existing list +structure. I think the most elegant solution to the <CODE>addchild</CODE> problem +is the one that takes advantage of that feature. But I'll also show a +solution that works in any version of Logo, to make the point that list +mutation isn't absolutely necessary; we can achieve the same goals, with +the same efficiency, by other means. 
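<P>To make the copying alternative concrete, here is a rough sketch of what
it might look like, written in terms of the tree abstract data type.
(<CODE>addchild.copy</CODE> is a hypothetical name of mine, and this version
finds the parent by its datum, which assumes node names are unique; it is
not part of the book's program.)

<P><PRE>to addchild.copy :name :child :tree
; output a NEW tree, like :tree except that :child has been
; added below the node whose datum is :name
; (if :name never matches, the output is just a copy of :tree)
if equalp :name datum :tree ~
   [output tree datum :tree (lput :child children :tree)]
output tree datum :tree map [addchild.copy :name :child ?] children :tree
end

? <U>make "world addchild.copy "Canada (tree [Nova Scotia] leaves [Halifax]) :world</U>
</PRE>

<P>Every node of the tree is rebuilt, so the work done is proportional to the
size of the whole tree; worse, a variable such as <CODE>:Canada</CODE> would
still hold the old, unchanged subtree.  Those are exactly the two objections
above.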
First here's the list mutation +version of <CODE>addchild</CODE>: + +<P><PRE>to addchild :tree :child +.setbf :tree (fput :child butfirst :tree) +end + +? <U>make "GB leaf [Great Britain]</U> +? <U>addchild :world :GB</U> +? <U>addchild :GB tree "England leaves [London Liverpool Oxford]</U> +? <U>addchild :GB tree "Scotland leaves [Edinburgh Glasgow]</U> +? <U>addchild :GB tree "Wales leaves [Abergavenny]</U> +? <U>show locate "Glasgow</U> +[world [Great Britain] Scotland Glasgow] +</PRE> + +<P>Just as <CODE>tree</CODE> is a constructor for the tree data type, and <CODE> +children</CODE> is a selector, <CODE>addchild</CODE> is called a <EM>mutator</EM> for +this data type. Notice, by the way, that <CODE>:GB</CODE>, which was originally +built as a leaf node, can be turned into a branch node by adding children to +it; <CODE>addchild</CODE> is not limited to nodes that already have children. + +<P>The solution using <CODE>.setbf</CODE> is elegant because I didn't have to change +any of the procedures I had already written; this version of <CODE>addchild</CODE> +works with the same tree implementation I had already set up. But suppose +we didn't have <CODE>.setbf</CODE> or didn't want to use it. (As we'll see shortly, +using list mutators does open up some possible pitfalls.) We can write a +mutator for the tree abstract data type even if we don't have mutators for +the underlying Logo lists! The secret is to take advantage of variables, +whose values can be changed--mutated--in any version of Logo. + +<P>To make this work I'm going to go back to a tree representation somewhat +like the one I started with, in which each node is represented by a separate +variable. But to avoid the problems I had earlier about Quebec and San +Francisco, the variable name won't be the datum it contains. Instead I'll +use a <EM>generated symbol,</EM> an arbitrary name made up by the program. +(This should sound familiar. I used the same trick in the Doctor project, +and there too it was an alternative to list mutation.) + +<P>This is where my use of data abstraction pays off. Look how little I have +to change: + +<P><PRE>to tree :datum :children +localmake "node gensym +make :node fput :datum :children +output :node +end + +to datum :node +output first thing :node +end + +to children :node +output butfirst thing :node +end + +to addchild :tree :child +make :tree lput :child thing :tree +end +</PRE> + +<P>That's it! <CODE>Leafp</CODE>, <CODE>finddatum</CODE>, <CODE>locate</CODE>, <CODE> +cities</CODE>, and <CODE>worldtree</CODE> all work perfectly without modification, +even though I've made a profound change in the actual representation of +trees. (Try printing <CODE>:world</CODE> in each version.) + +<P><CODE>Addchild</CODE> is only one of the possible ways in which I might want to +modify a tree structure. If we were studying trees more fully, I'd create +tool procedures to delete a node, to move a subtree from one location in the +tree to another, to insert a new node between a parent and its children, and +so on. There are also many more questions one might want to ask about a +tree. How many nodes does it have? What is its maximum depth? + +<P>I've mentioned general tree manipulation tools. There are also still some +unresolved issues about the particular use of trees in the city data base. +For example, although the problem of Quebec city and Quebec province is +under control, what if I want the data base to include Cambridge, England +as well as Cambridge, Massachusetts? 
What should <CODE>locate</CODE> do if given +<CODE>Cambridge</CODE> as input? + +<P>But instead of pursuing this example further I want to develop another +example, in which trees are used for a very different purpose. In the city +data base, the tree represents a <EM>hierarchy</EM> (Sudbury is <EM>part of</EM> +Massachusetts); in the example below, a tree is used to represent an <EM> +ordering</EM> of data, as in the sorting algorithms discussed earlier. + +<P><H2>Searching Algorithms and Trees</H2> + +<P>Forget about trees for a moment and consider the general problem of <EM> +searching</EM> through a data base for some piece of information. For +example, suppose you want a program to find the city corresponding to a +particular telephone area code. That is, I want to be able to say + +<P><PRE>? <U>print listcity 415</U> +San Francisco +</PRE> + +<P>The most straightforward way to do this is to have a list +containing code-city pairs, like this: + +<P><PRE>make "codelist [[202 Washington] [206 Seattle] [212 New York] + [213 Los Angeles] [215 Philadelphia] [303 Denver] [305 Miami] + [313 Detroit] [314 St. Louis] [401 Providence] [404 Atlanta] + [408 Sunnyvale] [414 Milwaukee] [415 San Francisco] [504 New Orleans] + [608 Madison] [612 St. Paul] [613 Kingston] [614 Columbus] + [615 Nashville] [617 Boston] [702 Las Vegas] [704 Charlotte] + [712 Sioux City] [714 Anaheim] [716 Rochester] [717 Scranton] + [801 Salt Lake City] [804 Newport News] [805 Ventura] [808 Honolulu]] +</PRE> + +<P>This is a list of lists. Each sublist contains one pairing of an +area code with a city. We can search for a particular area code by going +through the list member by member, comparing the first word with the desired +area code. (By the way, in reality a single area code can contain more than +one city, and vice versa, but never mind that; I'm trying to keep this +simple.) In accordance with the idea of data abstraction, we'll start with +procedures to extract the area code and city from a pair. + +<P><PRE>to areacode :pair +output first :pair +end + +to city :pair +output butfirst :pair +end +</PRE> + +<P>The city is the <CODE>butfirst</CODE> rather than the <CODE>last</CODE> to +accommodate cities with names of more than one word. + +<P>The iteration tool <CODE>find</CODE> does exactly what's needed, going through a +list member by member until it finds one that matches some criterion: + +<P><PRE>to listcity :code +output city find [equalp :code areacode ?] :codelist +end +</PRE> + +<P>Time for a little analysis of algorithms. What is the time + +behavior of this <EM>linear</EM> search algorithm as the data base gets +bigger? As for the case of the sorting algorithms, I'll concentrate on the +number of comparisons. How many times is <CODE>equalp</CODE> invoked? The best +case is if we're looking for the first code in the list, 202. In this case +only one comparison is made. The worst case is if we're looking for the +last code in the list, 808, or if we're looking for an area code that isn't +in the list at all. This requires 31 comparisons. (That's how many +code-city pairs happen to be in this particular data base.) On the average +this algorithm requires a number of comparisons half way between these +extremes, or 16. As we increase the size of the data base, the number of +comparisons required grows proportionally; this algorithm is O(<EM>n</EM>). + +<P>The area codes in <CODE>:codelist</CODE> are in ascending order. 
If you were +looking through the list yourself, you wouldn't look at every entry one +after another; you'd take advantage of the ordering by starting around the +middle and moving forward or backward depending on whether the area code you + +found was too low or too high. That's called a <EM>binary</EM> search +algorithm. <CODE>Listcity</CODE>, though, doesn't take +advantage of the ordering in the list; the pairs could just as well be +jumbled up and <CODE>listcity</CODE> would be equally happy. + +<P>Binary search works by starting with the median-value area code in the data +base. If that's the one we want, we're finished; otherwise we take the +higher or lower half of the remaining codes and examine the median value of +that subset. One way we could implement that algorithm would be to use a +binary tree to represent the code-city pairs: + +<P> + +<P><CENTER><IMG SRC="codetree.gif" ALT="figure: codetree"></CENTER> + +<P> + +<P>(In this picture I've just shown the area codes, but of course the +actual data structure will have a code-city pair at each node.) + +<P>We could construct the tree from scratch, using <CODE>tree</CODE> and <CODE>leaves</CODE> +as I did earlier, but since we already have the pairs in the correct sorted +order it's easier to let the computer do it for us: + +<P><PRE>to balance :list +if emptyp :list [output []] +if emptyp butfirst :list [output leaf first :list] +output balance1 (int (count :list)/2) :list [] +end + +to balance1 :count :in :out +if equalp :count 0 ~ + [output tree (first :in) (list balance reverse :out + balance butfirst :in )] +output balance1 (:count-1) (butfirst :in) (fput first :in :out) +end +</PRE> + +<P>In this program I'm using the trick of building <CODE>:out</CODE> using +<CODE>fput</CODE> instead of <CODE>lput</CODE> and then using <CODE>reverse</CODE> to get the +left branch back in ascending order, to make the construction a little +faster. + +<P><PRE>make "codetree balance :codelist +</PRE> + +<P>will generate the tree. + +<P>Now we're ready to write <CODE>treecity</CODE>, the tree-based search program +analogous to <CODE>listcity</CODE> for the linear list. + +<P><PRE>to treecity :code +output city treecity1 :code :codetree +end + +to treecity1 :code :tree +if emptyp :tree [output [000 no city]] +localmake "datum datum :tree +if :code = areacode :datum [output :datum] +if :code < areacode :datum [output treecity1 :code lowbranch :tree] +output treecity1 :code highbranch :tree +end + +to lowbranch :tree +if leafp :tree [output []] +output first children :tree +end + +to highbranch :tree +if leafp :tree [output []] +output last children :tree +end + +? <U>print treecity 415</U> +San Francisco +</PRE> + +<P><CODE>Treecity</CODE> gives the same output as <CODE>listcity</CODE>, but it's +faster. At worst it makes two comparisons (for equal and less than) at each +level of the tree. This tree is five levels deep, so if the input area code +is in the tree at all we'll make at most 9 comparisons, compared to 31 for +<CODE>listcity</CODE>. The average number is less. (How much less?) The +difference is even more striking for larger data bases; if the number of +pairs is 1000, the worst-case times are 1000 for <CODE>listcity</CODE> and 19 for +<CODE>treecity</CODE>. + +<P>In general the depth of the tree is the number of times you can divide the +number of pairs by two. This is like the situation we met in analyzing the +partition sort algorithm; the depth of the tree is the log base two of the +number of pairs. 
This is an O(log <EM>n</EM>) algorithm.

<P>The procedures <CODE>lowbranch</CODE> and <CODE>highbranch</CODE> are data abstraction at
work.  I could have used <CODE>first children :tree</CODE> directly in <CODE>treecity1</CODE>, but this way I can change my mind about the representation of
the tree if necessary.  In fact, in a practical program I would want to use
a representation that allows a node to have a right child (a <CODE>highbranch</CODE>) without having a left child.  Also, <CODE>lowbranch</CODE> and <CODE>highbranch</CODE> are written robustly; they give an output, instead of
causing an error, if applied to a leaf node.  (That happens if you ask for
the city of an area code that's not in the data base.)  I haven't
consistently been writing robust programs in this chapter, but I thought I'd
give an example here.

<P>The efficiency of the tree searching procedure depends on the fact
that the tree is <EM>balanced.</EM>  In other words, the left branch
of each node is the same size as its right branch.  I cheated slightly
by starting with 31 area codes, a number that allows for a perfectly
balanced tree.  Suppose I had started with 35 area codes.  How should
I have built the tree?  What would happen to the efficiency of the
program?  What is the maximum possible number of trials necessary
to find a node in that tree?  What other numbers of nodes allow for
perfect balance?

<P>There are two important ideas to take away from this example.  The first is
that the choice of data representation can have a substantial effect on the
efficiency of a program.  The second, as I mentioned at the beginning of
this section, is that we have used the same kind of data structure, a tree,
for two very different purposes: first to represent a <EM>hierarchy</EM>
(Sudbury is <EM>part of</EM> Massachusetts) and then to represent an <EM>ordering</EM> (313 is <EM>before</EM> 608).

<P><H2>Logo's Underlying Data Structures</H2>

<P>An abstract data type, such as the tree type we've been discussing, must be
implemented using some lower-level means for data aggregation, that is, for
grouping separate things into a combined whole.  In Berkeley Logo, there are
two main built-in means for aggregation: lists and arrays.  (Every dialect
of Logo has lists, but not all have arrays.)  Words can be thought of as a
third data aggregate, in which the grouped elements are letters, digits,
and punctuation characters, but we don't ordinarily use words as the basis
for abstract data types.

<P>Logo was designed to be used primarily by people whose main interest is in
something other than computer programming, and so a design goal was to keep
to a minimum the extent to which the Logo programmer must know about how
Logo itself works internally.  Even in these books, which <EM>are</EM> focused
on computing itself, we've gotten this far without looking very deeply into
how Logo's data structures actually work.  But for the purposes of this
chapter it will be useful to take that step.

<P>Essentially all computers today are divided into a <EM>processor</EM>
and a <EM>memory.</EM>  (The exceptions are experimental "parallel
processing" machines in which many small sub-processors and sub-memories
are interconnected, sometimes combining both capabilities within a single
integrated circuit.)  Roughly speaking, the processor contains the
circuitry that implements hardware primitive procedures such as arithmetic
operations.  (Not every Logo primitive is a hardware primitive.)
The +memory holds the information used in carrying out a program, including +the program itself and whatever data it uses. The memory is divided into +millions of small pieces, each of which can hold a single value (a number, +a letter, and so on).<SUP>*</SUP> Each small piece has an <EM> +address,</EM> which is a number used to select a particular piece; +the metaphor is the street address of each house on a long street. The +hardware primitive procedures include a <CODE>load</CODE> operation that takes +a memory address as its input and finds the value in the specified piece +of memory, and a <CODE>store</CODE> command that takes an address and a value +as inputs and puts the given value into the chosen piece of memory. + +<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>I'm being vague about what a "value" is, and +in fact most computer memories can be divided into pieces of different +sizes. In most current computers, +a <EM>byte</EM> is a piece that can hold any of 256 different +values, while a <EM>word</EM> is a piece that can hold any of +about four billion different values. But these details aren't important +for my present purposes, and I'm going to ignore them. I'll talk as if +memories were simply word-addressable.</SMALL></BLOCKQUOTE></SMALL><P>With this background we can understand how lists and arrays differ. To +be specific, suppose that we have a collection of five numbers to store +in memory. If we use an array, that means that Logo finds five <EM> +consecutive</EM> pieces of memory to hold the five values, like this: + +<P><CENTER><IMG SRC="array.gif" ALT="figure: array"></CENTER> + +<P>If instead we use a list, Logo finds the memory for each of the +five values as that value is computed. The five memory slots might not +be consecutive, so each memory slot must contain not only the desired +value but also a <EM>pointer</EM> to the next slot. That is, each +slot must contain an additional number, the address of the next slot. +(What I'm calling a "slot" must therefore be big enough to hold two +numbers, and in fact Logo uses what we call a <EM>pair</EM> for +each, essentially an array of length two.) Since we don't care about +the actual numeric values of the addresses, but only about the pairs +to which they point, we generally use arrows to represent pointers +pictorially: + +<P><CENTER><IMG SRC="list.gif" ALT="figure: list"></CENTER> + +<P>The last pair of the list has a <EM>null pointer,</EM> a +special value to indicate that there is no next pair following it, indicated +by the diagonal line. + +<P>Why do we bother to provide two aggregation mechanisms? Why don't language +designers just pick the best one? Clearly the answer will be that each is +"the best" for different purposes. In the following paragraphs I'll +compare several characteristics of lists and arrays. + +<P>One advantage of arrays that should be obvious from these pictures is that a +list uses twice as much memory to hold the same number of values. But this +is not generally an important difference for modern computers; unless your +problem is really enormous, you don't have to worry about running out of +memory. + +<P>The most important advantage of arrays is that they are +<EM>random access.</EM> This means that each member of an array +can be found just as quickly as any other member, regardless of its +position within the array. 
If the program knows the address at which +the array begins, and it wants to find the <EM>n</EM>th member of the array, +only two operations are needed: First add <EM>n</EM> to the array's address, +then <CODE>load</CODE> the value from the resulting address. This takes a +constant (and very short) amount of time, O(1). By contrast, in a list +there is no simple arithmetic relationship between the address of the list's +first member and the address of its <EM>n</EM>th member. To find the <EM>n</EM>th member, +the program must <CODE>load</CODE> the pointer from the first pair, then use that +address to <CODE>load</CODE> the pointer from the second pair, and so on, <EM>n</EM> times. +The number of operations needed is O(<EM>n</EM>). + +<P>On the other hand, the most important advantage of lists is +<EM>dynamic allocation.</EM> This means that the programmer does not +have to decide ahead of time on how big the data aggregate will become. (We +saw an example recently when we wanted to add a child to a node of an +existing tree.) Consider the five-member aggregates shown earlier, and +suppose we want to add a sixth member. If we've used a list, we can say, +for example, + +<P><PRE>make "newlist fput "A :oldlist +</PRE> + +<P>and all Logo has to do is find one new pair: + +<P><CENTER><IMG SRC="growlist.gif" ALT="figure: growlist"></CENTER> + +<P>By contrast, once an array has been created we can't expand it, +because the new address would have to be adjacent to the old addresses, and +that piece of memory might already be used for something else. To make an +array bigger, you have to allocate a complete new array and copy the old +values into it. + +<P>Remember that arrays sacrifice efficient expansion in order to get +efficient random access. From the standpoint of program speed, one is not +absolutely better than the other; it depends on the nature of the problem +you're trying to solve. That's why it's best if a language offers both +structures, as Berkeley Logo does. For the very common +case of <CODE>foreach</CODE>-like iteration through an aggregate, neither random +access nor dynamic allocation is really necessary. For a data base that can +grow during the running of a program, the flexibility of dynamic allocation +is valuable. For many sorting algorithms, on the other hand, it's important +to be able to jump around in the aggregate and so random access is useful. +(A programmer using arrays can partially overcome the lack of dynamic +allocation by preallocating a too-large array and leaving some of it empty +at first. But if the order of members in the array matters, it may turn out +that the "holes" in the array aren't where they're needed, and so the +program still has to copy blocks of members to make room. Also, such +programs have occasional embarrassing failures because what the programmer +thought was an extravagantly large array turns out not quite large enough +for some special use of the program, or because a malicious user deliberately +"overruns" the length of the array in order to evade a program restriction.) + +<P>The solitaire program in Volume 2 of this series illustrates the different +advantages of lists and arrays. As in any card game, a solitaire player +distributes the cards into several small groups, each of which varies in +size as the play continues. 
For example, a typical step is to deal a card +from the <EM>hand</EM> onto the <EM>pile,</EM> each of which is represented as +a list: + +<P><PRE>make "pile fput (first :hand) :pile +make "hand butfirst :hand +</PRE> + +<P>(The actual solitaire program uses somewhat different +instructions to accomplish the same effect, with a <CODE>deal</CODE> procedure that +outputs the next available card after removing it from the hand.) + +<P>On the other hand (no pun intended), <EM>shuffling</EM> the deck is easier +when an array is used to hold the card values, because the shuffling +algorithm requires random jumping around in the deck, but does not change +the total number of cards, so that random access is more important than +dynamic allocation. Therefore, the program starts each round of play with +the deck in the form of an array of 52 cards. It shuffles the deck in array +form, and then copies the members of the array into a list, which is used +for the rest of that round. The advantage of an array for shuffling, and +the advantage of a list for dealing cards, together outweigh the time +spent copying the array into the list. + +<P>An important consequence of dynamic allocation is that lists are +<EM>sharable</EM> data structures. In the example above, <CODE> +:oldlist</CODE> contains five pairs and <CODE>:newlist</CODE> contains six, but the +total number of pairs used is six, not 11, because most of the same pairs +are part of both lists. That's why the <CODE>fput</CODE> operation takes O(1) +time, unaffected by the length of the list, as do <CODE>first</CODE> and <CODE> +butfirst</CODE>. (Now that you know how lists are constructed using pairs +and pointers, you should be able to understand something I've mentioned +earlier without explanation: <CODE>Lput</CODE>, <CODE>last</CODE>, and <CODE>butlast</CODE> +require O(<EM>n</EM>) time. It's much faster to operate near the beginning of a +list than near the end.) Arrays are not sharable; each array occupies its +own block of consecutive memory addresses. + +<P>When a data structure is both sharable and mutable, it's possible to get +into some very mysterious, hard-to-detect bugs. Suppose we do this: + +<P><PRE>? <U>make "one [Ice cream is delicious.]</U> +? <U>make "two fput "Spinach butfirst butfirst :one</U> +</PRE> + +<P> + +<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/spinach1.gif" ALT="figure: spinach1"></CENTER> + +<P>Then suppose you decide you don't like spinach. You might say + +<P><PRE>? <U>.setfirst butfirst butfirst :two "disgusting.</U> +? <U>print :two</U> +Spinach is disgusting. +</PRE> + +<P><CENTER><IMG SRC="spinach2.gif" ALT="figure: spinach2"></CENTER> + +<P>But you haven't taken into account that <CODE>:one</CODE> and <CODE>:two</CODE> +share memory, so this instruction has an unintended result: + +<P><PRE>? <U>print :one</U> +Ice cream is disgusting. +</PRE> + +<P>This is definitely a bug! + +<P>It's the combination of mutation and sharing that causes trouble. Arrays +are mutable but not sharable; the result of using <CODE>setitem</CODE> to change +a member of an array is easily predictable. The trouble with list mutation +is that you may change other lists besides the one you think you're changing. +That's why Berkeley Logo's <CODE>.setfirst</CODE> and <CODE>.setbf</CODE> primitives have +names starting with periods, the Logo convention for "experts only!" 
It's
also why many other Logo dialects don't allow list mutation at
all.<SUP>*</SUP>

<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Also, this helps to explain the importance of property lists
in Logo.  Property lists are a safe form of mutable list, because they are
not sharable; that's why the <CODE>plist</CODE> primitive outputs a newly-allocated
<EM>copy</EM> of the property list.</SMALL></BLOCKQUOTE></SMALL><P>In order to explain how something like this might happen, I've
had to tell you more about Logo's storage allocation techniques than is
ordinarily taught.  One of the design goals of Logo was that the programmer
shouldn't have to think in terms of boxes and arrows as we're doing now.
But if you mutate sharable lists, you'd better understand the boxes
and arrows.  It's not entirely obvious, you see, when two
lists <EM>do</EM> share storage.  Suppose the example had been backwards:

<P><PRE>? <U>make "one [Delicious, that's ice cream.]</U>
? <U>make "two lput "spinach. butlast butlast :one</U>
? <U>print :two</U>
Delicious, that's spinach.
? <U>.setfirst :two "Disgusting,</U>
? <U>print :two</U>
Disgusting, that's spinach.
? <U>print :one</U>
Delicious, that's ice cream.
</PRE>

<P>In this case the other list is <EM>not</EM> mysteriously modified
because when <CODE>lput</CODE> is used, rather than <CODE>fput</CODE> as in the previous
example, the two lists do <EM>not</EM> share memory.  That's a consequence of
the front-to-back direction of the arrows connecting the boxes; it's
possible to have two arrows pointing <EM>into</EM> a box, but only one arrow
can <EM>leave</EM> a box.  You can't do this:

<P><CENTER><IMG SRC="spinach3.gif" ALT="figure: spinach3"></CENTER>

<P>

<P>The combination of mutation and sharing, although tricky, is not all bad.
The same mutual dependence that made a mess of <CODE>:one</CODE> and <CODE>:two</CODE> in
the example above was desirable and helpful in the case of <CODE>:world</CODE>
and <CODE>:Canada</CODE> earlier.  That's why Lisp has always included mutable
lists, and why versions of Logo intended for more expert users have also
chosen to allow list mutation.

<P>Don't think that without mutable lists you are completely safe from mutual
dependence.  Any language in which it's possible to give a datum a <EM>name</EM> allows the programmer to set up the equivalent of sharable data,
just as I did in the final version of the tree of cities.  As far as the
Logo interpreter is concerned, the value of the variable <CODE>world</CODE> is some
generated symbol like <CODE>G47</CODE>.  That value is immune to changes in other
data structures.  But if we think of <CODE>:world</CODE> as containing,
effectively, the entire tree whose root node is called <CODE>G47</CODE>, then
changes to other variables do, in effect, change the value of <CODE>world</CODE>.
What is true is that without mutable lists you can't easily set up a mutual
dependence <EM>by accident;</EM> you have to intend it.

<P>By the way, by drawing the box and pointer diagrams with the actual data
inside the boxes, I may have given you the impression that each member of a
list or array must be a single number or letter, so that the value will fit
in one memory address.  Actually, each member can be a pointer to anything.
+For example, here's a picture of an array that includes lists, + +<P><PRE>{[A B C] D [E F]} +</PRE> + +<P><CENTER><IMG SRC="bigarray.gif" ALT="figure: bigarray"></CENTER> + +<P> +<TABLE width="100%"><TR><TD><A HREF="../v3-toc2.html">(back to Table of Contents)</A> +<TD align="right"><A HREF="../v3ch2/v3ch2.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v3ch4/v3ch4.html"><STRONG>NEXT</STRONG></A> +</TABLE> + +<P><H2>Program Listing</H2> + +<P><P> +<P><PRE> +;;; Algorithms and Data Structures + +;; Local optimization of quadratic formula + +to quadratic :a :b :c +localmake "root sqrt (:b * :b-4 * :a * :c) +localmake "x1 (-:b+:root)/(2 * :a) +localmake "x2 (-:b-:root)/(2 * :a) +print (sentence [The solutions are] :x1 "and :x2) +end + +;; Memoization of T function + +to t :n :k +localmake "result gprop :n :k +if not emptyp :result [output :result] +make "result realt :n :k +pprop :n :k :result +output :result +end + +to realt :n :k +if equalp :k 0 [output 1] +if equalp :n 0 [output 0] +output (t :n :k-1) + (t :n-1 :k) +end + +;; Speedup of Simplex function + +to simplex :buttons +output 2 * first (cascade :buttons + [fput (sumprods butfirst ?2 ?1) ?1] [1] + [fput 1 nextrow ?2] [1 1]) +end + +to sumprods :a :b +output reduce "sum (map "product :a :b) +end + +to nextrow :combs +if emptyp butfirst :combs [output :combs] +output fput (sum first :combs first butfirst :combs) ~ + nextrow butfirst :combs +end + +;; Sorting -- selection sort + +to ssort :list +if emptyp :list [output []] +output ssort1 (first :list) (butfirst :list) [] +end + +to ssort1 :min :in :out +if emptyp :in [output fput :min ssort :out] +if lessthanp :min (first :in) ~ + [output ssort1 :min (butfirst :in) (fput first :in :out)] +output ssort1 (first :in) (butfirst :in) (fput :min :out) +end + +;; Sorting -- partition sort + +to psort :list +if emptyp :list [output []] +if emptyp butfirst :list [output :list] +localmake "split ((first :list) + (last :list)) / 2 +if lessthanp first :list :split ~ + [output psort1 :split (butfirst :list) (list first :list) []] +output psort1 :split (butlast :list) (list last :list) [] +end + +to psort1 :split :in :low :high +if emptyp :in [output sentence (psort :low) (psort :high)] +if lessthanp first :in :split ~ + [output psort1 :split (butfirst :in) (fput first :in :low) :high] +output psort1 :split (butfirst :in) :low (fput first :in :high) +end + +;; Sorting -- count comparisons + +to lessthanp :a :b +if not namep "comparisons [make "comparisons 0] +make "comparisons :comparisons+1 +output :a < :b +end + +to howmany +print :comparisons +ern "comparisons +end + +;; Abstract Data Type for Trees: Constructor + +to tree :datum :children +output fput :datum :children +end + +;; Tree ADT: Selectors + +to datum :node +output first :node +end + +to children :node +output butfirst :node +end + +;; Tree ADT: Mutator + +to addchild :tree :child +.setbf :tree (fput :child butfirst :tree) +end + +;; Tree ADT: other procedures + +to leaf :datum +output tree :datum [] +end + +to leaves :leaves +output map [leaf ?] 
:leaves
end

to leafp :node
output emptyp children :node
end

;; The World tree

to worldtree
make "world ~
     tree "world
          (list (tree "France leaves [Paris Dijon Avignon])
                (tree "China leaves [Beijing Shanghai Guangzhou Suzhou])
                (tree [United States]
                      (list (tree [New York]
                                  leaves [[New York] Albany Rochester
                                          Armonk] )
                            (tree "Massachusetts
                                  leaves [Boston Cambridge Sudbury
                                          Maynard] )
                            (tree "California
                                  leaves [[San Francisco] Berkeley
                                          [Palo Alto] Pasadena] )
                            (tree "Washington
                                  leaves [Seattle Olympia] ) ) )
                (tree "Canada
                      (list (tree "Ontario
                                  leaves [Toronto Ottawa Windsor] )
                            (tree "Quebec
                                  leaves [Montreal Quebec Lachine] )
                            (tree "Manitoba leaves [Winnipeg]) ) ) )
end

to locate :city
output locate1 :city :world "false
end

to locate1 :city :subtree :wanttree
if and :wanttree (equalp :city datum :subtree) [output :subtree]
if leafp :subtree ~
   [ifelse equalp :city datum :subtree
           [output (list :city)]
           [output []]]
localmake "lower locate.in.forest :city (children :subtree) :wanttree
if emptyp :lower [output []]
output ifelse :wanttree [:lower] [fput (datum :subtree) :lower]
end

to locate.in.forest :city :forest :wanttree
if emptyp :forest [output []]
localmake "child locate1 :city first :forest :wanttree
if not emptyp :child [output :child]
output locate.in.forest :city butfirst :forest :wanttree
end

to cities :name
output cities1 (finddatum :name :world)
end

to cities1 :subtree
if leafp :subtree [output (list datum :subtree)]
output map.se [cities1 ?] children :subtree
end

to finddatum :datum :tree
output locate1 :datum :tree "true
end

;; Area code/city pairs ADT

to areacode :pair
output first :pair
end

to city :pair
output butfirst :pair
end

;; Area code linear search

make "codelist [[202 Washington] [206 Seattle] [212 New York]
                [213 Los Angeles] [215 Philadelphia] [303 Denver]
                [305 Miami] [313 Detroit] [314 St. Louis]
                [401 Providence] [404 Atlanta] [408 Sunnyvale]
                [414 Milwaukee] [415 San Francisco] [504 New Orleans]
                [608 Madison] [612 St. Paul] [613 Kingston]
                [614 Columbus] [615 Nashville] [617 Boston]
                [702 Las Vegas] [704 Charlotte]
                [712 Sioux City] [714 Anaheim] [716 Rochester]
                [717 Scranton] [801 Salt Lake City] [804 Newport News]
                [805 Ventura] [808 Honolulu]]

to listcity :code
output city find [equalp :code areacode ?]
:codelist
end

;; Area code binary tree search

to balance :list
if emptyp :list [output []]
if emptyp butfirst :list [output leaf first :list]
output balance1 (int (count :list)/2) :list []
end

to balance1 :count :in :out
if equalp :count 0 ~
   [output tree (first :in) (list balance reverse :out
                                  balance butfirst :in)]
output balance1 (:count-1) (butfirst :in) (fput first :in :out)
end

to treecity :code
output city treecity1 :code :codetree
end

to treecity1 :code :tree
if emptyp :tree [output [000 no city]]
localmake "datum datum :tree
if :code = areacode :datum [output :datum]
if :code < areacode :datum [output treecity1 :code lowbranch :tree]
output treecity1 :code highbranch :tree
end

to lowbranch :tree
if leafp :tree [output []]
output first children :tree
end

to highbranch :tree
if leafp :tree [output []]
output last children :tree
end
</PRE><P>


<P>

<P><A HREF="../v3-toc2.html">(back to Table of Contents)</A>
<P><A HREF="../v3ch2/v3ch2.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v3ch4/v3ch4.html"><STRONG>NEXT</STRONG></A>

<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>,
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>