author    elioat <elioat@tilde.institute>  2023-08-23 07:52:19 -0400
committer elioat <elioat@tilde.institute>  2023-08-23 07:52:19 -0400
commit    562a9a52d599d9a05f871404050968a5fd282640 (patch)
tree      7d3305c1252c043bfe246ccc7deff0056aa6b5ab /js/games/nluqo.github.io/~bh/v3ch3/v3ch3.html
parent    5d012c6c011a9dedf7d0a098e456206244eb5a0f (diff)
download  tour-562a9a52d599d9a05f871404050968a5fd282640.tar.gz
*
Diffstat (limited to 'js/games/nluqo.github.io/~bh/v3ch3/v3ch3.html')
-rw-r--r--  js/games/nluqo.github.io/~bh/v3ch3/v3ch3.html  2465
1 files changed, 2465 insertions, 0 deletions
diff --git a/js/games/nluqo.github.io/~bh/v3ch3/v3ch3.html b/js/games/nluqo.github.io/~bh/v3ch3/v3ch3.html
new file mode 100644
index 0000000..da1e1ce
--- /dev/null
+++ b/js/games/nluqo.github.io/~bh/v3ch3/v3ch3.html
@@ -0,0 +1,2465 @@
+<HTML>
+<HEAD>
+<TITLE>Computer Science Logo Style vol 3 ch 3: Algorithms and Data Structures</TITLE>
+</HEAD>
+<BODY>
+<CITE>Computer Science Logo Style</CITE> volume 3:
+<CITE>Beyond Programming</CITE> 2/e Copyright (C) 1997 MIT
+<H1>Algorithms and Data Structures</H1>
+
+<TABLE width="100%"><TR><TD>
+<IMG SRC="../csls3.jpg" ALT="cover photo">
+<TD><TABLE>
+<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
+Harvey</A><BR>University of California, Berkeley</CITE>
+<TR><TD align="right"><BR>
+<TR><TD align="right"><A HREF="../pdf/v3ch03.pdf">Download PDF version</A>
+<TR><TD align="right"><A HREF="../v3-toc2.html">Back to Table of Contents</A>
+<TR><TD align="right"><A HREF="../v3ch2/v3ch2.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v3ch4/v3ch4.html"><STRONG>NEXT</STRONG></A>
+<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-3">MIT
+Press web page for <CITE>Computer Science Logo Style</CITE></A>
+</TABLE></TABLE>
+
+<HR><P>Program file for this chapter: <A HREF="algs.lg"><CODE>algs</CODE></A>
+
+<P>What's wrong with this procedure?
+
+<P><PRE>to process.sentence :sent
+output fput (process.word first :sent) (process.sentence bf :sent)
+end
+</PRE>
+
+<P>If you said &quot;It's a recursive procedure without a stop rule,&quot;
+you've solved a simple problem in <EM>analysis of algorithms.</EM>  This branch
+of computer science is concerned with the <EM>correctness</EM> and the
+<EM>efficiency</EM> of programs.
+
+<P>The field is called analysis of <EM>algorithms</EM> rather than analysis of
+<EM>programs</EM> because the emphasis is on the meaning of a program (how
+you might express the program in English) and not on the details of its
+expression in a particular language.  For example, the error in the procedure
+
+<P><PRE>to process.sentence :sent
+if emptyp :sent [output []]
+output fput (process.word first :sent) (<U>process.snetence</U> bf :sent)
+end
+</PRE>
+
+<P>is just as fatal as the error in the first example, but there isn't
+much of theoretical interest to say about a misspelled procedure name.  On
+the other hand, another branch of computer science,
+<EM>program verification,</EM> is concerned
+with developing techniques to ensure that a
+computer program really does correctly express the algorithm it is meant to
+express.
+
+<P>The examples I've given so far are atypical, it's worth noting, in that you
+were able to see the errors in the procedures without having any real idea
+of what they do!  Without seeing <CODE>process.word</CODE> you can tell that this
+program is doing something or other to each word of a sentence, but you
+don't know whether the words are being translated to another language,
+converted from upper case to lower case letters, translated into a string of
+phoneme codes to be sent to a speech synthesizer, or what.  (Even what I just
+said about doing something to the words of a sentence is an inference based
+on the names of procedures and variables, which might not really reflect the
+results of the program.)  More interesting problems in analysis of
+algorithms have to do with the specific properties of particular algorithms.
+
+<P>In this chapter I discuss algorithms along with <EM>data structures,</EM> the
+different ways in which information can be represented in a computer program,
+because these two aspects of a program interact strongly.  That is, the
+choices you make as a programmer about data representation have a profound
+effect on the algorithms you can use to solve your problem.  Similarly, once
+you have chosen an algorithm, that choice determines the particular kinds of
+information your program will need to do its work.  Algorithms and data
+structures are the central concerns of software engineering, the overall
+name for the study of how to turn a problem statement into a working program
+in a way that uses both the computer and the programming staff effectively.
+
+<P><H2>Local Optimization vs. Efficient Algorithms</H2>
+
+<P>When you're trying to make a computer program run as fast as possible,
+the most obvious place to start is with the details of the instructions
+in the program.  But that's generally <EM>not</EM> the most effective
+approach.  Rethinking the big picture can give much more dramatic
+improvements.  To show you what I mean, I'll give some examples of both.
+Later I'll talk about <EM>order of growth,</EM> a more formal way to
+understand this same idea.
+
+<P>Consider the following procedure that implements
+the <EM>quadratic formula</EM>
+
+<P><CENTER><IMG SRC="qformula.gif"></CENTER>
+
+<P>This formula gives the two values of <EM>x</EM> that solve the equation
+<P><CENTER><EM>ax</EM><SUP><SMALL>2</SMALL></SUP> + <EM>bx</EM> + <EM>c</EM> = 0</CENTER><P>
+
+<P><PRE>to quadratic :a :b :c
+localmake &quot;x1 (-:b + sqrt (:b*:b - 4*:a*:c))/(2*:a)
+localmake &quot;x2 (-:b - sqrt (:b*:b - 4*:a*:c))/(2*:a)
+print (sentence [The solutions are] :x1 &quot;and :x2)
+end
+</PRE>
+
+<P>Before we talk about the efficiency of this program, is it correct?
+This is not a simple yes-or-no question.  The procedure gives the correct
+results for those quadratic equations that have two real solutions.  For
+example, the equation 2<EM>x</EM><SUP><SMALL>2</SMALL></SUP>+5<EM>x</EM>-3 = 0 has the solutions <EM>x</EM> = &minus;3 and &frac12;; the instruction <CODE>quadratic 2 5 -3</CODE> will print those solutions.
+Some quadratic equations, like <EM>x</EM><SUP><SMALL>2</SMALL></SUP>-8<EM>x</EM>+16 = 0, have only one solution; for
+these equations, <CODE>quadratic</CODE> prints the same solution twice, which is
+not exactly incorrect but still a little embarrassing.  Other equations,
+like <EM>x</EM><SUP><SMALL>2</SMALL></SUP>+1 = 0, have solutions that are complex numbers.  Depending on our
+purposes, we might want <CODE>quadratic</CODE> to print the solutions <EM>i</EM> and &minus;<EM>i</EM>
+for this equation, or we might want it to print &quot;This equation has no real
+solutions.&quot; But since most versions of Logo do not provide complex
+arithmetic, what will really happen is that we'll get the error message
+
+<P><PRE>sqrt doesn't like -1 as input.
+</PRE>
+
+<P>If <CODE>quadratic</CODE> is used as part of a larger project, getting
+the Logo error message means that the program dies altogether in this case
+without a chance to recover.  If we have several equations to
+solve, it would be better if the program could continue to the remaining
+equations even if no solution is found for one of them.  A program that
+operates correctly for the kinds of inputs that the programmer had in mind,
+but blows up when given unanticipated inputs, is said not to be <EM>
+robust;</EM> a robust program should do something appropriate even if there
+are errors in its input data.
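+
+<P>One simple way to make <CODE>quadratic</CODE> more robust is to test the
+discriminant before taking its square root.  Here's a sketch of that idea
+(the exact wording of the message is arbitrary):
+
+<P><PRE>to quadratic :a :b :c
+if (:b*:b - 4*:a*:c) &lt; 0 [print [This equation has no real solutions.] stop]
+localmake &quot;x1 (-:b + sqrt (:b*:b - 4*:a*:c))/(2*:a)
+localmake &quot;x2 (-:b - sqrt (:b*:b - 4*:a*:c))/(2*:a)
+print (sentence [The solutions are] :x1 &quot;and :x2)
+end
+</PRE>
+
+<P>Now an equation with no real solutions prints a message instead of
+killing the whole program.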
+
+<P>But my real reason for displaying this example is to discuss its efficiency.
+It computes the expression
+
+<P><PRE>sqrt (:b*:b - 4*:a*:c)
+</PRE>
+
+<P>twice.  Multiplication is slower than addition on most computers;
+this expression involves three multiplications as well as the even slower
+square root extraction.  The program would be faster if it were written this
+way:
+
+<P><PRE>to quadratic :a :b :c
+localmake &quot;sqrt sqrt (:b*:b - 4*:a*:c)
+localmake &quot;x1 (-:b + :sqrt)/(2*:a)
+localmake &quot;x2 (-:b - :sqrt)/(2*:a)
+print (sentence [The solutions are] :x1 &quot;and :x2)
+end
+</PRE>
+
+<P>This kind of change to a program is called
+<EM>common subexpression elimination.</EM>  It's a
+pretty easy way to speed up a program;
+so easy, in fact, that some &quot;optimizing compilers&quot; for large computers do
+it automatically.  In other words, an optimizing compiler for Logo would
+treat the first version of <CODE>quadratic</CODE> as if it were written like the
+second version.  (As far as I know, nobody has actually written such a
+compiler for Logo.)
+
+<P>Common subexpression elimination is an example of <EM>local</EM> optimization.
+This means that we improved the program by paying attention to one small
+piece of it at a time.  (A less elegant name for local optimization is &quot;code
+bumming.&quot;)  Is it worth the effort?  It depends how many times this procedure
+will be run.  When I say that multiplication is slow, I mean that it takes
+a few millionths of a second.  If you are writing a <CODE>quadratic</CODE>
+procedure to do the dozen problems in your high school algebra homework, the
+extra time you spend thinking up the second version and typing it into the
+editor probably outweighs the saving of time in running the procedure.  But
+if you are trying to predict the weather and need to solve tens of thousands
+of equations, the saving may be significant.
+
+<P>If you want to write locally optimized
+Logo programs, it's important to know that <CODE>first</CODE>, <CODE>butfirst</CODE>, and
+<CODE>fput</CODE> take a constant amount of time regardless of the length of the
+input list, whereas <CODE>last</CODE>, <CODE>butlast</CODE>, and <CODE>lput</CODE> take an amount
+of time proportional to the length of the list.  If you add <EM>n</EM> items to a
+list, one by one, using <CODE>fput</CODE>, the length of time required is <EM>n</EM> times
+the constant amount of time for each item.  But if you add the same <EM>n</EM>
+items using <CODE>lput</CODE>, if the first item takes <EM>t</EM> microseconds, the last
+takes <EM>nt</EM>.  On the average, each <CODE>lput</CODE> takes something like <EM>nt</EM>/2
+microseconds, so the total time is <EM>n</EM><SUP><SMALL>2</SMALL></SUP><EM>t</EM>/2.  The gain in efficiency from
+using <CODE>fput</CODE> instead of <CODE>lput</CODE> isn't significant if your list has
+only a few items, but the gain gets more and more significant as the size of
+the list increases.  Suppose you want to create a list of the numbers from 1
+to 1000.  One straightforward way would be
+
+<P><PRE>print cascade 1000 [lput # ?] []
+</PRE>
+
+<P>On my home computer this instruction takes about 26
+seconds.<SUP>*</SUP>  If you
+just use <CODE>fput</CODE> in place of <CODE>lput</CODE> the list will come out in the
+wrong order, but it's possible to reverse the order of that list to get the
+same result as the first version:
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>The actual time depends not only on what model of
+computer you have but also on how much memory and what else is in it.</SMALL></BLOCKQUOTE></SMALL><P><PRE>print reverse cascade 1000 [fput # ?] []
+</PRE>
+
+<P>You might think this would be slower, because it has the extra
+<CODE>reverse</CODE> step.  But in fact this instruction took about 12 seconds.
+(It's important that the <CODE>reverse</CODE> tool in the Berkeley Logo
+library uses <CODE>first</CODE>
+and <CODE>fput</CODE> to do its work.  If we used the more obvious
+
+<P><PRE>to reverse :list
+if emptyp :list [output []]
+output lput first :list reverse bf :list
+end
+</PRE>
+
+<P>then the elimination of <CODE>lput</CODE> in the <CODE>cascade</CODE> template
+would be offset by the <CODE>lput</CODE> in the reversal.)
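+
+<P>An <CODE>fput</CODE>-based reversal might look like this (a simplified
+sketch, not the library's actual code):
+
+<P><PRE>to reverse :list
+output reverse1 :list []
+end
+
+to reverse1 :in :out
+if emptyp :in [output :out]
+output reverse1 (butfirst :in) (fput first :in :out)
+end
+</PRE>
+
+<P>Each <CODE>fput</CODE> takes constant time, so the entire reversal takes
+time proportional to the length of the list.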
+
+<P>
+At the other extreme, the broadest possible <EM>global</EM> optimization of a
+program is to think of an entirely new algorithm based on a different way of
+thinking about the original problem.  As an example, in Chapter 2 there are
+three different programs to solve the Simplex lock problem.  The first
+program, <CODE>lock</CODE>, works by enumerating explicitly all the different
+patterns of possible combinations.  That algorithm is not fundamentally
+recursive; although my Logo implementation includes recursive subprocedures,
+the number of combinations for a five-button lock is not determined on the
+basis of the number for a four-button lock.  (The program does compute the
+number of combinations that use four out of the five available buttons, but
+that isn't the same thing.)  The second program, <CODE>simplex</CODE>, is based on
+a mathematical argument relating the number of combinations to a fairly
+simple <EM>recursive function</EM>--that is, a mathematical function with an
+inductive definition.  Because the function is computationally simple, the
+intellectual energy I invested in doing the mathematics paid off with a
+significantly faster program.  The third version, <CODE>simp</CODE>, uses a closed
+form formula that solves the problem in no time!
+
+<P>To illustrate more sharply the differences among these three programs, I tried each
+of them on a ten-button version of the Simplex lock.  To compute <CODE>
+lock 10</CODE> took 260 seconds on my computer; <CODE>simplex 10</CODE> took 80 seconds; and
+<CODE>simp 10</CODE> took less than half a second.  For this size lock, understanding the
+mathematical idea of a recursive function sped up the program by a factor of
+three, and using the mathematical technique called generating functions
+achieved an additional speedup by a factor of almost 200!  What's important
+to understand about this example is that it wasn't better <EM>programming</EM>
+skill that made the difference, but greater knowledge of <EM>mathematics.</EM>
+(In the next section, though, you'll learn a programming trick that can
+sometimes achieve similar speedups.)
+
+<P>Many &quot;trick&quot; math problems involve a similar shift in thinking about the
+fundamental algorithm to be used.  For example, in Chapter 2 we computed the
+probability of picking a matching pair of socks out of a drawer containing
+six brown and four blue socks this way:
+<P><CENTER><IMG SRC="math3.gif" ALT="math display"></CENTER><P>
+Suppose we ask this question: Out of the same drawer of six browns and four
+blues, we pick <EM>three</EM> socks at random; what is the probability that
+at least two of the socks match?
+
+<P>The number of triples of socks in which at least two are brown is the number
+in which all three are brown, <IMG SRC="6choose3.gif">, plus the number in which two are
+brown and one is blue, <IMG SRC="6c2x4c1.gif">.  The number in which
+at least two are blue is, similarly, <IMG SRC="4c3+4c2x6c1.gif">.  The total number of triples is of course <IMG SRC="10choose3.gif">.
+So the probability of a matching pair within the chosen triple is
+<P><CENTER><IMG SRC="math4.gif" ALT="math display"></CENTER><P>
+which is 100% probability.
+We could modify the <CODE>socks</CODE> procedure to get the same result by listing
+all the possible triples and then filtering all the ones containing a
+matching pair, using a filter like
+
+<P><PRE>to haspair :triple
+output or (memberp first :triple butfirst :triple) ~
+          (equalp (item 2 :triple) last :triple)
+end
+</PRE>
+
+<P>But the problem becomes entirely trivial if you notice that there
+are only two possible colors, so obviously there is no way that three
+randomly chosen socks can have three distinct colors!  Of course there has
+to be a matching pair within any possible triple.  A problem like this, that
+invites a messy arithmetic solution but is obvious if properly understood, is
+the mathematician's idea of a joke.  (Did you get it?)
+
+<P><H2>Memoization</H2>
+
+
+<P>Some efficiency tricks are applicable to a number of different problems and
+become part of the &quot;toolkit&quot; of any professional programmer.  One example
+is relevant to many inductively defined functions; the trick is to have the
+program remember the result of each invocation of the function.  For example,
+in Chapter 2 we defined the number of terms in a
+multinomial expansion to be
+
+<P><PRE>to t :n :k
+if equalp :k 0 [output 1]
+if equalp :n 0 [output 0]
+output (t :n :k-1)+(t :n-1 :k)
+end
+</PRE>
+
+<P>
+
+<P>What happens when we compute <CODE>t 4 7</CODE>?
+<P><CENTER><IMG SRC="math5.gif" ALT="math display"></CENTER><P>
+Many calculations are performed repeatedly.  In the chart above I've
+underlined three places where <EM>t</EM>(3,5) is computed.  Each of those in turn
+involves repeated computation of <EM>t</EM>(3,4) and so on.  This computation took
+me about 18 seconds.
+
+<P>
+
+<P>Here is a version of <CODE>t</CODE> that uses property lists to remember all the
+values it's already computed.  This version will calculate <EM>t</EM>(3,5) only the
+first time it's needed; a second request for <EM>t</EM>(3,5) will instantly output
+the remembered value.
+
+<P><PRE>to t :n :k
+localmake &quot;result gprop :n :k
+if not emptyp :result [output :result]
+make &quot;result realt :n :k
+pprop :n :k :result
+output :result
+end
+
+to realt :n :k
+if equalp :k 0 [output 1]
+if equalp :n 0 [output 0]
+output (t :n :k-1)+(t :n-1 :k)
+end
+</PRE>
+
+<P>Computing <CODE>t 4 7</CODE> isn't really a big enough problem to
+show off how much faster this version is; even the original program takes
+only 2 &frac12; seconds on my home computer.  But the amount of time
+needed grows quickly as you try larger inputs.  Using the original procedure
+from Chapter 2, my computer took just under five <EM>minutes</EM> to
+compute <CODE>t 8 10</CODE>; the program shown here took less than two <EM>
+seconds.</EM>  This program computed <CODE>t 12 12</CODE> in about three seconds;
+I estimate that the original procedure would take five <EM>hours</EM> to
+solve that problem!  (As you can imagine, I didn't try it.)
+
+<P>The memoized version of <CODE>t</CODE> has the same structure as the original.
+That is, in order to compute <CODE>t 4 7</CODE>, the program must first carry out
+the two subtasks <CODE>t 4 6</CODE> and <CODE>t 3 7</CODE>.  The only difference between
+the original version and the memoized version is that in the latter,
+whenever a second invocation is made with inputs that have already been
+seen, the result is output immediately without repeating that subtask.  Any
+recursive function can be memoized in the same way.<SUP>*</SUP>
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>I've used
+property lists of numbers to hold the remembered values of the <CODE>t</CODE>
+function.  If I wanted to use the same technique for some other function in
+the same workspace, I'd have to find a way to keep the values for the two
+functions from getting in each other's way.  For example, if <EM>t</EM>(4,7)=120
+and some other function <EM>h</EM>(4,7)=83, then I might store the value
+
+<P><PRE>[t 120 h 83]
+</PRE>
+
+<P>on the appropriate property list.</SMALL></BLOCKQUOTE></SMALL><P>Sometimes, though, the same general idea--remembering the results
+of past computations--can be used more effectively by changing the
+program structure so that just the right subproblems are solved
+as they're needed to work toward the overall solution.  Rearranging
+the program structure isn't called memoization, but we're still using
+the same idea.  For example,
+the Simplex lock function
+<P><CENTER><IMG SRC="math6.gif" ALT="math display"></CENTER><P>
+from Chapter 2
+is a combination (sorry about the pun) of values of two functions.  It isn't
+exactly helpful to remember every possible value of <IMG SRC="nchoosei.gif"> because
+each value is used only once.  But the calculation of <EM>f</EM>(<EM>n</EM>) uses the entire
+<EM>n</EM>th row of Pascal's Triangle, and it's easy to compute that if we remember
+the row above it.  The values of <EM>f</EM>(<EM>i</EM>) are used repeatedly, so it makes
+sense to keep a list of them.  So my plan is to have two lists,
+the first of which is a list of values of <EM>f</EM>(<EM>i</EM>) and the second a row of
+Pascal's Triangle:
+
+<P><PRE>round               ?1           ?2
+
+  0                [1]          [1 1]
+  1              [1 1]         [1 2 1]
+  2            [3 1 1]        [1 3 3 1]
+  3         [13 3 1 1]       [1 4 6 4 1]
+  4      [75 13 3 1 1]     [1 5 10 10 5 1]
+  5  [541 75 13 3 1 1]    [1 6 15 20 15 6 1]
+</PRE>
+
+<P>The solution to the problem is twice the first member of the last
+value of <CODE>?1</CODE>.
+
+<P>Instead of starting with a request for <EM>f</EM>(5) and carrying out subtasks
+as needed, the new program will begin with <EM>f</EM>(0) and will work its way
+up to larger input values until the desired result is found.  This technique
+is called <EM>dynamic programming:</EM>
+
+<P><PRE>to simplex :buttons
+output 2 * first (cascade :buttons
+                          [fput (sumprods bf ?2 ?1) ?1] [1]
+                          [fput 1 nextrow ?2] [1 1])
+end
+
+to sumprods :a :b
+output reduce &quot;sum (map &quot;product :a :b)
+end
+
+to nextrow :combs
+if emptyp butfirst :combs [output :combs]
+output fput (sum first :combs first butfirst :combs) ~
+            nextrow butfirst :combs
+end
+</PRE>
+
+<P>I tried both versions of <CODE>simplex</CODE> for a 12-button lock.
+The version in Chapter 2 took about 5 &frac12; minutes to get the
+answer (which is that there are about 56 billion combinations); this
+version took about one second, comparable to the closed form <CODE>simp</CODE>
+procedure.
+
+<P>If you just read this program with no prior idea of what algorithm it's using, it
+must be hard to see how it reflects the original problem.  But if you think
+of it as a quasi-memoization of the earlier version it should make sense to you.
+
+<P><H2>Sorting Algorithms</H2>
+
+<P>Every textbook on algorithms uses sorting as one of its main examples.
+There are several reasons for this.  First of all, sorting is one of the
+most useful things to do with a computer, in a wide variety of settings.
+There are many different known sorting algorithms, ranging from obvious to
+subtle.  These algorithms have been analyzed with great care, so a lot is
+known about their behavior.  (What does that mean?  It means we can answer
+questions like &quot;How long does this algorithm take, on the average?&quot;
+&quot;How long does it take, at worst?&quot; &quot;If the things we're sorting are
+mostly in order to begin with, does that make it faster?&quot;)  And all this
+effort pays off, in that the cleverest algorithms really are much faster
+than the more obvious ones.
+
+<P>The problem we want to solve is to take a list in unknown order and
+rearrange it to get a new list of the same members in some standard order.
+This might be alphabetical order if the members of the list are words, or
+size order if they're numbers, or something else.  For the most part, the
+exact ordering relation isn't very important.  As long as we have a way to
+compare two items and find out which comes first (or that they're equal,
+sometimes) it doesn't matter what the details of the comparison are.  To
+make things simple, in this chapter I'll assume we're always sorting numbers.
+
+<P>Because the length of time per comparison depends on the nature of the
+things being compared, and because that length isn't really part of what
+distinguishes one sorting algorithm from another, analyses of the time taken
+by a sorting program are usually expressed not in terms of seconds but in
+terms of number of comparisons.  This measure also eliminates the effect of
+one computer being faster than another.  To help make such measurements,
+we'll compare two numbers using this procedure:
+
+<P><PRE>to lessthanp :a :b
+if not namep &quot;comparisons [make &quot;comparisons 0]
+make &quot;comparisons :comparisons+1
+output :a &lt; :b
+end
+</PRE>
+
+<P>Of course, if we want to use <CODE>&gt;</CODE> or <CODE>=</CODE> comparisons in a
+sorting algorithm, we should write analogous procedures for those.  But in
+fact I'll only need <CODE>lessthanp</CODE> for the algorithms I'm going to show you.
+After trying out a sort program, we can find out how many comparisons it
+made using this convenient little tool:
+
+<P><PRE>to howmany
+print :comparisons
+ern &quot;comparisons
+end
+</PRE>
+
+<P>After telling us the number of comparisons, this procedure erases
+the counter variable to prepare for the next experiment.
+
+<P>If you haven't studied sort algorithms before, it will be a good exercise
+for you to invent one yourself before you continue.  Your procedure <CODE>
+sort</CODE> should take a list of numbers as input, and should output a list of
+the same numbers in order from smallest to largest.
+
+<P><PRE>? <U>show sort [5 20 3 5 18 9]</U>
+[3 5 5 9 18 20]
+</PRE>
+
+<P>Notice that it's allowable for two (or more) equal numbers to
+appear in the input.
+
+<P>So that we can compare different algorithms fairly, we should try them on
+the same input data.  You can make a list of 100 random numbers this way:
+
+<P><PRE>make &quot;list cascade 100 [fput random 100 ?] []
+</PRE>
+
+<P>You should try out both your sort procedures and mine on your
+random list.  In case you want to try your algorithm on my data, to
+compare the exact numbers of comparisons needed, here is the list I used:
+
+<P><PRE>[11 41 50 66 41 61 73 38  2 94 43 55 24  1 77 77 13  2 93 35
+ 43 69  9 46 88 20 43 73 11 74 69 33 28  4  5  1 15 17 13 94
+ 88 42 12 31 67 42 30 30 13 91 31  8 55  6 31 84 57 50 50 31
+ 36 52  5 12 10 19 69  0  9 81 62 14 39 54 45 72 18 47 48 35
+ 76 44 77 34 75 52 61 86 34 44 64 53 25 39  4 55 55 54 53 64]
+</PRE>
+
+<P>Notice in passing that this is a list of 100 random numbers, but
+not a list of the first 100 numbers in random order.  Some numbers, like 43,
+appear more than once in the list, while others don't appear at all.  This
+is perfectly realistic for numbers that occur in real life, although of
+course some situations give rise to lists of <EM>unique</EM> items.
+
+<P><H2>Sorting by Selection</H2>
+
+
+<P>Although there are many known sorting algorithms, most fall into two
+main groups.  There are the ones that order the input items one at a time
+and there are the ones that divide the problem into roughly equal-sized
+smaller problems.  I'll show you one of each.  Within a group, the
+differences between algorithms have to do with details of exactly how the
+problem is divided into pieces, what's where in computer memory, and so on.
+But these details are generally much less important than the basic division
+between the two categories.  If you took my advice and wrote your own sort
+procedure, and if you hadn't studied sorting before, the one you wrote is
+almost certainly in the first category.
+
+<P>My sample algorithm in the first group is a <EM>selection</EM> sort.
+Expressed in words, the algorithm is this:  First find the smallest number,
+then find the next smallest, and so on.  This idea can be put in recursive
+form; the output from the procedure should be a list whose first member is
+the smallest number and whose remaining elements are the sorted version of
+the other numbers.  I'll show you two versions of this algorithm: first a
+straightforward but inefficient one, and then a version that's improved in
+speed but not quite so obvious.  Here's the first version:
+
+<P><PRE>to ssort :list
+if emptyp :list [output []]
+localmake &quot;smallest reduce &quot;min :list
+output fput :smallest (ssort remove.once :smallest :list)
+end
+
+to remove.once :item :list
+if equalp :item first :list [output butfirst :list]
+output fput first :list (remove.once :item butfirst :list)
+end
+</PRE>
+
+<P>In this version of <CODE>ssort</CODE>, we start by finding the
+smallest number in the list.  Then we remove that number from the
+list, sort what's left, and put the smallest number back at the front
+of the sorted list.  The only slight complication is that I had to
+write my own variant of <CODE>remove</CODE> that, unlike the standard Berkeley
+Logo library version, removes only one copy of the chosen number from
+the list, just in case the same number appears more than once.
+
+<P>By using <CODE>min</CODE> to find the smallest number, I've interfered with
+my goal of counting the number of comparisons, but I didn't worry about
+that because I'm about to rewrite <CODE>ssort</CODE> anyway.  The problem is
+that this version goes through the list of numbers twice for each
+invocation, first to find the smallest number and then again to remove
+that number from the list that will be used as input to the recursive
+call.  The program will be much faster if it does the finding and the
+removing all at once.  The resulting procedure is a little harder to
+read, but it should help if you remember that it's trying to do the same
+job as the original version.
+
+<P><PRE>to ssort :list
+if emptyp :list [output []]
+output ssort1 (first :list) (butfirst :list) []
+end
+
+to ssort1 :min :in :out
+if emptyp :in [output fput :min ssort :out]
+if lessthanp :min (first :in) ~
+   [output ssort1 :min (butfirst :in) (fput first :in :out)]
+output ssort1 (first :in) (butfirst :in) (fput :min :out)
+end
+</PRE>
+
+<P><CODE>Ssort</CODE> is invoked once for each time a smallest number must
+be found.  For each of those iterations, <CODE>ssort1</CODE> is invoked once for
+each member of the still-unsorted list; the numbers in the list are moved
+from <CODE>:in</CODE> to <CODE>:out</CODE> except that the smallest-so-far is singled out
+in <CODE>:min</CODE>.
+
+<P>Suppose we try out <CODE>ssort</CODE> on our list of 100 numbers.  How many
+comparisons will be needed?  To find the smallest of 100 numbers we have to
+make 99 comparisons; the smallest-so-far must be compared against each of
+the remaining ones.  To find the next smallest requires 98 comparisons, and
+so on.  Finally we have two numbers remaining to be sorted and it takes one
+comparison to get them in order.  The total number of comparisons is
+<P><CENTER>99+98+97+ &middot;&middot;&middot; +2+1 = 4950</CENTER><P>
+It makes no difference what order the numbers were in to begin with, or
+whether some of them are equal, or anything else about the input data.  It
+takes 4950 comparisons for <CODE>ssort</CODE> to sort 100 numbers, period.  You can
+try out the program on various lists of 100 numbers to make sure I'm right.
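+
+<P>Here's what such an experiment might look like, using the
+<CODE>howmany</CODE> tool from earlier:
+
+<P><PRE>? <U>make &quot;sorted ssort :list</U>
+? <U>howmany</U>
+4950
+</PRE>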
+
+<P>In general, if we want to sort a list of length <EM>n</EM> with <CODE>ssort</CODE> the
+number of comparisons required is the sum of the integers from 1 to <EM>n</EM>&minus;1.
+It turns out that there is a closed form definition for this sum:
+<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/sumints.gif"></CENTER><P>
+
+<P>Selection sort uses these three steps:
+
+<P><P>
+<TABLE><TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Pull out the smallest value.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Sort the other values.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Put the smallest one at the front.
+
+<P></TABLE><P>
+
+<P>It's the first of those steps that does the comparisons.  A similar
+but different algorithm is <EM>insertion sort,</EM> which defers the
+comparisons until the last step:
+
+<P><P>
+<TABLE><TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Pull out any old value (such as the <CODE>first</CODE>).
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Sort the other values.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Put the chosen one where it belongs, in order.
+
+<P></TABLE><P>
+
+<P>Try writing a procedure <CODE>isort</CODE> to implement this
+algorithm.  How many comparisons does it require?  You'll find that
+for this algorithm the answer depends on the input data.  What is the
+smallest possible number of comparisons?  The largest number?  What
+kinds of input data give rise to these extreme values?
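+
+<P>In case you'd like something to compare your solution against, here is
+one possible <CODE>isort</CODE>, with a helper procedure that inserts a
+single value into an already-sorted list:
+
+<P><PRE>to isort :list
+if emptyp :list [output []]
+output insert first :list isort butfirst :list
+end
+
+to insert :value :sorted
+if emptyp :sorted [output (list :value)]
+if lessthanp :value first :sorted [output fput :value :sorted]
+output fput first :sorted insert :value butfirst :sorted
+end
+</PRE>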
+
+<P><H2>Sorting by Partition</H2>
+
+<P>There is one fundamental insight behind all methods for sorting with fewer
+comparisons:  Two small sorting jobs are faster than one large one.
+Specifically, suppose we have 100 numbers to sort.  Using <CODE>ssort</CODE>
+requires 4950 comparisons.  Instead, suppose we split up the 100 numbers
+into two groups of 50.  If we use <CODE>ssort</CODE> on each group, each will
+require 1225 comparisons; the two groups together require twice that, or
+2450 comparisons.  That's about half as many
+comparisons as the straight <CODE>ssort</CODE> of 100 numbers.
+
+<P>But this calculation underestimates how much time we can save using this
+insight, because the same reasoning applies to each of those groups of 50.
+We can split each into two groups of 25.  Then how many comparisons will be
+required altogether?
+
+<P>The basic idea we'll use is to pick some number that we think is likely to
+be a median value for the entire list; that is, we'd like half the numbers
+to be less than this partition value and half to be greater.  There are many
+possible ways to choose this partition value; I'm going to take the average
+of the first and last numbers of the (not yet sorted!) input.  Then we run
+through the input list, dividing it into two smaller pieces by comparing
+each number against the partition value.  (Notice that <CODE>ssort</CODE> compares
+pairs of numbers within the input list; the partition sort compares one
+number from the list against another number that might or might not itself
+be in the list.)  We use the same technique recursively to sort each of the
+two sublists, then append the results.
+
+<P>Note the similarities and differences between this selection sort algorithm:
+
+<P><P>
+<TABLE><TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Pull out the smallest value.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Sort the other values.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Put the smallest one at the front.
+
+<P></TABLE><P>
+
+<P>and the following <EM>partition sort</EM> algorithm:
+
+<P><P>
+<TABLE><TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Divide the input into the small-value half and the large-value half.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Sort each half separately.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Put the sorted small values in front of the sorted large ones.
+
+<P></TABLE><P>
+
+<P>Again, I'll write this program more than once, first in an overly simple version
+and then more realistically.  Here's the simple one:
+
+<P><PRE>to psort :list
+if (count :list) &lt; 2 [output :list]
+localmake &quot;split guess.middle.value :list
+output sentence psort filter [? &lt; :split] :list ~
+                psort filter [not (? &lt; :split)] :list
+end
+
+to guess.middle.value :list
+output ((first :list) + (last :list)) / 2
+end
+</PRE>
+
+<P>To minimize the number of comparisons, we want to split the input list
+into two equal-size halves.  It's important to note that this program
+is only guessing about which value of <CODE>:split</CODE> would achieve that
+balanced splitting.  You may wonder why I didn't take the average of
+all the numbers in the list.  There are two reasons.  One is that that
+would add a lot of time to the sorting process; we'd have to look at
+every number to find the average, then look at every number again to do
+the actual splitting.  But a more interesting reason is that the average
+isn't quite what we want anyway.  Suppose we are asked to sort this
+list of numbers:
+
+<P><PRE>[3 2 1000 5 1 4]
+</PRE>
+
+<P>The average of these numbers is about 169.  But if we use
+that value as the split point, we'll divide the list into these two
+pieces:
+
+<P><PRE>[3 2 5 1 4]  and  [1000]
+</PRE>
+
+<P>Not a very even division!  To divide this list of six values
+into two equal pieces, we'd need a split point of 3 &frac12;.  In
+general, what we want is the <EM>median</EM> value rather than the
+<EM>average</EM> value.  And if you think about it, you pretty much have
+to have the numbers already sorted in order to find the median.<SUP>*</SUP>  So we just try to make a good guess that
+doesn't take long to compute.
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>That
+isn't quite true; there is a clever algorithm that can find the median
+after only a partial sorting of the values.  But it's true enough that
+we can't use a sorting algorithm whose first step requires that we've
+already done some sorting.</SMALL></BLOCKQUOTE></SMALL><P>Just as in the case of selection sort, one problem with this simple program
+is that it goes through the input list twice, once for each of the calls to
+<CODE>filter</CODE>.  And the call to <CODE>count</CODE> in the end test adds a third walk
+through the entire list.  Here's a version that fixes those inefficiencies:
+
+<P><PRE>to psort :list
+if emptyp :list [output []]
+if emptyp butfirst :list [output :list]
+localmake &quot;split ((first :list) + (last :list)) / 2
+output psort1 :split :list [] []
+end
+
+to psort1 :split :in :low :high
+if emptyp :in [output sentence (psort :low) (psort :high)]
+if lessthanp first :in :split ~
+   [output psort1 :split (butfirst :in) (fput first :in :low) :high]
+output psort1 :split (butfirst :in) :low (fput first :in :high)
+end
+</PRE>
+
+<P>This version of <CODE>psort</CODE> has one good attribute and one bad
+attribute.  The good attribute is that it's very cleanly organized.  You can
+see that it's the job of <CODE>psort</CODE> to choose the partition value and the
+job of <CODE>psort1</CODE> to divide the list in two parts based on comparisons
+with that value.  The bad attribute is that it doesn't <EM>quite</EM> work as
+it stands.
+
+<P>As for any recursive procedure, it's important that the input get smaller
+for each recursive invocation.  If not, <CODE>psort</CODE> could end up invoking
+itself over and over with the same input.  It's very easy to see that <CODE>
+ssort</CODE> avoids that danger, because the input list is shorter by exactly one
+member for each recursive invocation.  But <CODE>psort</CODE> divides its input
+into two pieces, each of which ought to be about half the length of the
+original if we're lucky.  We're lucky if the partition value we choose is,
+in fact, the median value of the input list.  If we're less lucky, the two
+parts might be imbalanced, say &frac14; of the members below the partition and
+&frac34; of them above it.  Can we be <EM>so</EM> unlucky that <EM>all</EM> of
+the input numbers are on the same side of the partition?  See if you can
+think of a case in which this will happen.
+
+<P>The partition value is chosen as the average of two members of the input
+list.  If those two members (the first and last) are unequal, then one of
+them must be less than the partition value and the other greater.  So there
+will be at least one number on each side of the partition.  But if the two
+averaged numbers are equal, then the partition value will be equal to both
+of them.  Each of them will be put in the <CODE>high</CODE> bucket.  If all the
+other numbers in the list are also greater than or equal to the partition,
+then they'll all end up in the <CODE>high</CODE> bucket, and nothing will be
+accomplished in the recursion step.  The simplest failing example is trying
+to sort a list of numbers all of which are equal, like
+
+<P><PRE>? <U>show psort [4 4 4 4 4]</U>
+</PRE>
+
+<P>We could take various approaches to eliminating this bug.  See how
+many ways you can think of, and decide which you like best, before reading
+further.
+
+<P>Since the problem has to do with the choice of partition value, you could
+imagine using a more complicated means to select that value.  For example,
+you could start with the first member of the input list, then look through
+the list for another member not equal to the first.  When you find one,
+average the two of them and that's the partition value.  If you get all the
+way to the end of the list without finding two unequal members, declare the
+list sorted and output it as is.  The trouble with this technique is that
+many extra comparisons are needed to find the partition value.  Those
+comparisons don't really help in ordering the input, but they do add to the
+time taken just as much as the &quot;real&quot; comparisons done later.
+
+<P>Another approach is to say that since the problem only arises if the first
+and last input members are equal, we should treat that situation as a
+special case.  That is, we'll add an instruction like
+
+<P><PRE>if equalp first :list last :list [...]
+</PRE>
+
+<P>Again, this approach adds a comparison that doesn't really help to
+sort the file, although it's better than the first idea because it only adds
+one extra comparison per invocation instead of perhaps several.
+
+<P>A more straightforward approach that might seem to make the program more
+efficient, rather than less, is to divide the list into <EM>three</EM>
+buckets, <CODE>low</CODE>, <CODE>high</CODE>, and <CODE>equal</CODE>.  This way, the problem gets
+shorter faster, since the <CODE>equal</CODE> bucket doesn't have to be sorted
+recursively; it's already in order.  The trouble is that it takes two
+comparisons, one for equality and one for <CODE>lessthanp</CODE>, to know how to
+direct each list member into a three-way split.  Some computers can compare
+numbers once and branch in different directions for less, equal, or greater;
+one programming language, Fortran, includes that kind of three-way branching
+through an &quot;arithmetic IF&quot; statement that accepts different instructions
+for the cases of a given quantity being less than, equal to, or greater than
+zero.  But in Logo we'd have to say
+
+<P><PRE>if lessthanp first :in :split [...]
+if equaltop first :in :split [...]
+</PRE>
+
+<P>with two comparisons for each list member.  (I'm imagining that
+<CODE>equaltop</CODE> would keep track of the number of comparisons just as
+<CODE>lessthanp</CODE> does.)
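+
+<P>Such an <CODE>equaltop</CODE> would follow the same pattern as
+<CODE>lessthanp</CODE>:
+
+<P><PRE>to equaltop :a :b
+if not namep &quot;comparisons [make &quot;comparisons 0]
+make &quot;comparisons :comparisons+1
+output :a = :b
+end
+</PRE>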
+
+<P>What I chose was to do the first <CODE>lessthanp</CODE> test for the list in <CODE>
+psort</CODE> instead of <CODE>psort1</CODE>, and use it to ensure that either the first
+or the last member of the list starts out the <CODE>low</CODE> bucket.
+
+<P><PRE>to psort :list
+if emptyp :list [output []]
+if emptyp butfirst :list [output :list]
+localmake &quot;split ((first :list) + (last :list)) / 2
+if lessthanp first :list :split ~
+   [output psort1 :split (butfirst :list) (list first :list) []]
+output psort1 :split (butlast :list) (list last :list) []
+end
+</PRE>
+
+<P><CODE>Psort1</CODE> is unchanged.
+
+<P>How many comparisons should <CODE>psort</CODE> require to sort 100 numbers?  Unlike
+<CODE>ssort</CODE>, its exact performance depends on the particular list of numbers
+given as input.  But we can get a general idea.  The first step is to divide
+the 100 numbers into two buckets, using 100 comparisons against the
+partition value.  The second step divides each of the buckets in two again.
+We can't say, in general, how big each bucket is; but we do know that each
+of the original 100 numbers has to be in one bucket or the other.  So each
+of 100 numbers will participate in a comparison in this second round also.
+The same argument applies to the third round, and so on.  Each round
+involves 100 comparisons.  (This isn't quite true toward the end of the
+process.  When a bucket only contains one number, it is considered sorted
+without doing any comparisons.  So as the buckets get smaller, eventually
+some of the numbers &quot;drop out&quot; of the comparison process, while others are
+still not in their final position.)
+
+<P>Each round involves 100 comparisons, more or less.  How many rounds are
+there?  This is where the original ordering of the input data makes itself
+most strongly felt.  If we're lucky, each round divides each bucket exactly
+in half.  In that case the size of the buckets decreases uniformly:
+
+<P><PRE>round    size
+
+  1       100
+  2        50
+  3        25
+  4        12,13
+  5         6,7
+  6         3,4
+  7         1,2
+</PRE>
+
+<P>There is no round 8, because by then all of the buckets would be
+of length 1 and there is no work left to do.  So if all goes well we'd
+expect to make about 700 comparisons, really a little less because by round
+7 some of the numbers are already in length-1 buckets.  Maybe 650?
+
+<P>What is the <EM>worst</EM> case?  That would be if each round divides the
+numbers into buckets as unevenly as possible, with one number in one bucket
+and all the rest in the other.  In that case it'll take 99 rounds to get all
+the numbers into length-1 buckets.  You may be tempted to estimate 9900
+comparisons for that situation, but in fact it isn't quite so bad, because
+at each round one number falls into a length-1 bucket and drops out of the
+sorting process.  So the first round requires 100 comparisons, but the
+second round only 99, the third 98, and so on.  This situation is very much
+like the way <CODE>ssort</CODE> works, and so we'd expect about 5000 comparisons.
+
+<P>Now try some experiments.  Try <CODE>psort</CODE> on your random list, then try to
+find input lists that give the best and worst possible results.
+
+<P><CODE>Psort</CODE> required 725 comparisons for my random list.  That's somewhat
+more than we predicted for the best case, but not too much more.  <CODE>
+Psort</CODE> seems to have done pretty well with this list.  The simplest
+worst-case input is one in which all the numbers are the same; I said
+
+<P><PRE>make &quot;bad cascade 100 [fput 20 ?] []
+</PRE>
+
+<P>to make such a list.  <CODE>Psort</CODE> required 5049 comparisons to
+sort this list, slightly <EM>worse</EM> than <CODE>ssort</CODE> at 4950 comparisons.
+
+<P>What would a best-case input look like?  It would divide evenly at each
+stage; that is, the median value at each stage would be the average of the
+first and last values.  The simplest list that should meet that criterion is
+a list of all the numbers from 1 to 100 in order:
+
+<P><PRE>make &quot;inorder cascade 100 [lput # ?] []
+</PRE>
+
+<P>(Or you could use the <CODE>reverse</CODE> trick discussed earlier, but
+for only 100 numbers it didn't seem worth the extra typing to me.)  Using
+<CODE>psort</CODE> to sort this list should require, we said, somewhere around 650
+to 700 comparisons.  In fact it took 734 comparisons when I tried it,
+slightly <EM>more</EM> than my randomly ordered list (725 comparisons).
+
+<P>Even 734 comparisons isn't terrible by any means, but when an algorithm
+performs worse than expected, a true algorithm lover wants to know why.
+Test cases like these can uncover either inefficiencies in the fundamental
+algorithm or else ways in which the actual computer program doesn't live up
+to the algorithm as described in theoretical language.  If we could &quot;tune
+up&quot; this program to sort <CODE>:inorder</CODE> in fewer than 700 comparisons, the
+change might well improve the program's performance for any input.  See if
+you can figure out what the problem is before reading further.  You can try
+having <CODE>psort</CODE> print out its inputs each time it's called, as a way to
+help gather information.
+
+<P>Here's a very large hint.  I tried using the original version of <CODE>
+psort</CODE>, before fixing the bug about the recursion sometimes leaving all the
+numbers in one basket, and it sorted <CODE>:inorder</CODE> in only 672 comparisons.
+(I knew the bug wouldn't make trouble in this case because none of the
+numbers in this particular input list are equal.)  Can you devise a better
+<CODE>psort</CODE> that both works all the time and performs optimally for the
+best-case input?
+
+<P>This partition sorting scheme is essentially similar to a very well-known
+algorithm named quicksort, invented by C. A. R. Hoare.
+Quicksort includes many improvements over this algorithm, not primarily in
+reducing the number of comparisons but in decreasing the overhead time by,
+for example, exchanging pairs of input items in their original memory
+locations instead of making copies of sublists.  Quicksort also switches to
+a more straightforward <CODE>ssort</CODE>-like algorithm for very small input lists,
+because the benefit of halving the problem is outweighed by the greater
+complexity.  (In fact, for a two-item input, <CODE>ssort</CODE> makes one
+comparison and <CODE>psort</CODE> two.)
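+
+<P>To give the flavor of the in-place approach, here is a rough sketch using
+Logo arrays (the procedure names are mine; a real quicksort adds many more
+refinements):
+
+<P><PRE>to quick :v :lo :hi
+; sort array :v in place between positions :lo and :hi
+if not (:lo &lt; :hi) [stop]
+localmake &quot;p partition :v :lo :hi
+quick :v :lo (:p - 1)
+quick :v (:p + 1) :hi
+end
+
+to partition :v :lo :hi
+; move items less than the pivot leftward; output the pivot's position
+localmake &quot;pivot item :hi :v
+localmake &quot;i :lo
+for [j :lo :hi-1] [if (item :j :v) &lt; :pivot [swap :v :i :j  make &quot;i :i+1]]
+swap :v :i :hi
+output :i
+end
+
+to swap :v :i :j
+localmake &quot;temp item :i :v
+setitem :i :v item :j :v
+setitem :j :v :temp
+end
+</PRE>
+
+<P>Invoking <CODE>quick :v 1 count :v</CODE> sorts the array <CODE>:v</CODE>
+without ever making copies of sublists.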
+
+<P>Here's the partition sort algorithm again:
+
+<P><P>
+<TABLE><TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Divide the input into the small-value half and the large-value half.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Sort each half separately.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Put the sorted small values in front of the sorted large ones.
+
+<P></TABLE><P>
+
+<P>The idea of cutting the problem in half is also used in
+the following algorithm, called <EM>mergesort:</EM>
+
+<P><P>
+<TABLE><TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Divide the input arbitrarily into two equal size pieces.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Sort each half separately.
+<TR><TH align="right" valign="top">&bull;<TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top"><EM>Merge</EM> the two sorted pieces by comparing values.
+
+<P></TABLE><P>
+
+<P>In a way, mergesort is to partition sort as insertion sort
+is to selection sort.  Both insertion sort and mergesort defer the
+comparisons of numbers until the final step of the algorithm.
+
+<P>There is one important way in which mergesort is better than partition
+sort.  Since mergesort doesn't care how the input values are separated, we
+can ensure that the two pieces are each exactly half the size of the input.
+Therefore, the number of comparisons needed is always as small as possible;
+there are no bad inputs.  Nevertheless, quicksort is more widely used than
+mergesort, because the very best implementations of quicksort seem to
+require less overhead time, for the average input, than the best
+implementations of mergesort.
+
+<P>If you want to write a mergesort program, the easiest way to divide a
+list into two equal pieces is to select every other member, so the
+odd-position members are in one half and the even-position members
+in the other half.
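+
+<P>Here's a sketch of that plan (the procedure names are mine):
+
+<P><PRE>to msort :list
+if emptyp :list [output []]
+if emptyp butfirst :list [output :list]
+output merge (msort odds :list) (msort evens :list)
+end
+
+to odds :list
+if emptyp :list [output []]
+output fput first :list evens butfirst :list
+end
+
+to evens :list
+if emptyp :list [output []]
+output odds butfirst :list
+end
+
+to merge :a :b
+if emptyp :a [output :b]
+if emptyp :b [output :a]
+if lessthanp first :a first :b ~
+   [output fput first :a merge butfirst :a :b]
+output fput first :b merge :a butfirst :b
+end
+</PRE>
+
+<P>Because <CODE>odds</CODE> and <CODE>evens</CODE> split the list exactly in
+half (within one member), the recursion is always balanced, no matter how
+the input is ordered.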
+
+<P><H2>Order of Growth</H2>
+
+<P>I've mentioned that the complete quicksort algorithm includes several
+optimization strategies to improve upon the partition sort I've
+presented here.  How important are these strategies?  How much does
+overhead contribute to the cost of a program?  I did some experiments
+to investigate this question.
+
+<P>First I timed <CODE>psort</CODE> and <CODE>ssort</CODE> with inputs of length 300.
+Here are the results:
+
+<P><CENTER><TABLE>
+<TR><TH>program<TH>&nbsp;&nbsp;&nbsp;&nbsp;<TH>comparisons<TH>time<TH>comparisons per second
+<TR><TD align="center"><CODE>psort</CODE><TD><TD align="right">2940&nbsp;&nbsp;&nbsp;&nbsp;<TD align="right">29 seconds<TD align="right">100
+<TR><TD align="center"><CODE>ssort</CODE><TD><TD align="right">44850&nbsp;&nbsp;&nbsp;&nbsp;<TD align="right">313 seconds<TD align="right">143
+</TABLE></CENTER>
+
+<P><CODE>Ssort</CODE> seems to have much less overhead, since it can do
+more comparisons per second than <CODE>psort</CODE>.  Nevertheless, <CODE>psort</CODE>
+always seems to be faster, for every size input I tried.  The number
+of comparisons outweighs the overhead.  (By the way, these numbers don't
+measure how fast the computer can compare two numbers!  A typical computer
+could perform more than a million comparisons per second, if only
+comparisons were involved.  Most of the time in these experiments is
+actually going into the process by which the Logo interpreter figures out
+what the instructions in the program mean.  Because Berkeley Logo is
+an interpreter, this figuring-out process happens every time an instruction
+is executed.  By contrast, I tried <CODE>ssort</CODE> on a list of length 300 in
+Object Logo, a compiled version in which each instruction is figured out
+only once, and the total time was 3.6 seconds.)
+
+<P>I wanted to give local optimization the best possible chance to win the
+race, and so I decided to make selection sort as fast as I could, and
+partition sort as slow as I could.  I modified <CODE>ssort</CODE> to use the
+Logo primitive <CODE>lessp</CODE> for comparison instead of doing the extra
+bookkeeping of <CODE>lessthanp</CODE>, and I replaced <CODE>psort</CODE> with this
+implementation:
+
+<P><PRE>to slowsort :list
+if (count :list) &lt; 2 [output :list]
+localmake &quot;split (reduce &quot;sum :list)/(count :list)
+output (sentence slowsort filter [? &lt; :split] :list
+                 filter [? = :split] :list
+                 slowsort filter [? &gt; :split] :list)
+end
+</PRE>
+
+<P>This version examines every member of the input list six times
+on each recursive call!  (<CODE>Count</CODE> is invoked twice; <CODE>reduce</CODE>
+looks at every list member once; and <CODE>filter</CODE> is called three times
+to do the actual partition.)  Under these conditions I was able to get
+<CODE>ssort</CODE> to win the race, but only for very small inputs:
+
+<P><CENTER><TABLE>
+<TR><TH>program<TH>&nbsp;&nbsp;&nbsp;<TH>20 numbers<TH>&nbsp;&nbsp;&nbsp;<TH>100 numbers<TH>&nbsp;&nbsp;&nbsp;<TH>300 numbers
+<TR><TD><CODE>slowsort</CODE><TD><TD align="right">2.7 seconds<TD><TD align="right">18 seconds<TD><TD align="right">63 seconds
+<TR><TD><CODE>ssort</CODE><TD><TD align="right">1.2 seconds<TD><TD align="right">20 seconds<TD><TD align="right">182 seconds
+</TABLE></CENTER>
+
+<P><CODE>Ssort</CODE> wins when sorting 20 numbers, but both programs
+take less than three seconds.  For 100 numbers, <CODE>slowsort</CODE> is
+already winning the race, and its lead grows as the input list grows.
+This is a common pattern:  For small amounts of data, when the program is
+fast enough no matter how you write it, local optimization can win the race,
+but once the problem is large enough so that you actually care about
+efficiency, choosing a better overall algorithm is always more important.
+(Of course the very best results will come from choosing a good algorithm
+<EM>and</EM> optimizing the program locally.)
+
+<P>What does &quot;a better algorithm&quot; actually mean?  How do we measure the
+quality of an algorithm?  We've made a good start by counting the number
+of comparisons required for our two sorting algorithms, but there is a
+formal notation that can make the issues clearer.
+
+<P>Earlier I said that for a list of <EM>n</EM> numbers, <CODE>ssort</CODE> makes
+<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/sumints.gif"></CENTER><P>
+comparisons.  But in a sense this tells us more than we want to
+know, like saying that a certain program took 1,243,825 microseconds instead
+of saying that it took somewhat over a second.  The important thing to say
+about <CODE>ssort</CODE> is that the number of comparisons is roughly proportional
+to <EM>n</EM><SUP><SMALL>2</SMALL></SUP>; that is, doubling the size of the input list will quadruple the
+time needed to sort the list.  More formally, we say that the time required
+for selection sorting is O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>), pronounced &quot;big-oh of <EM>n</EM><SUP><SMALL>2</SMALL></SUP>&quot; or
+&quot;order <EM>n</EM><SUP><SMALL>2</SMALL></SUP>.&quot;  This is an abbreviation for the statement, &quot;for large enough <EM>n</EM>,
+the time is bounded by <EM>n</EM><SUP><SMALL>2</SMALL></SUP> times a constant.&quot;
+The part about &quot;for large enough <EM>n</EM>&quot; is important because the running
+time for some algorithm might, for example, involve a large constant setup
+time.  For small <EM>n</EM> that setup time might contribute more to the overall
+time required than the part of the algorithm proportional<SUP>*</SUP> to <EM>n</EM><SUP><SMALL>2</SMALL></SUP>, but
+once <EM>n</EM> becomes large enough, the <EM>n</EM><SUP><SMALL>2</SMALL></SUP> part will overtake any constant.
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Strictly
+speaking, the fact that an algorithm's time requirement is O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) doesn't
+mean that it's even approximately proportional to <EM>n</EM><SUP><SMALL>2</SMALL></SUP>, because O(...)
+only establishes an upper bound.  The time requirement could be proportional
+to <EM>n</EM>, which would be better than <EM>n</EM><SUP><SMALL>2</SMALL></SUP>, and still be O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>).  But
+usually people use O(...) notation to mean that no smaller
+order of growth would work, even though there's an official notation
+with that meaning, &Theta;(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>), pronounced &quot;big theta.&quot;</SMALL></BLOCKQUOTE></SMALL><P>I'd like to have a similar representation, O(something), for the
+number of comparisons used by <CODE>psort</CODE> in the typical case.  We said that
+for 100 numbers, each round of sorting involves about 100 comparisons, and
+the number of rounds required is the number of times you have to divide 100
+by 2 before you get down to 1, namely between 6 and 7.  The number of
+comparisons expected is the product of these numbers.  In the general case,
+the first number is just <EM>n</EM>.  But what is the general formula for the
+number of times you have to divide <EM>n</EM> by 2 to get 1?  The answer is log<SUB><SMALL>2</SMALL></SUB>&nbsp;<EM>n</EM>.  For example, if we had 128 numbers in the list instead of 100, we would
+require exactly 7 rounds (in the best case) because 2<SUP><SMALL>7</SMALL></SUP>&nbsp;=&nbsp;128 and so
+log<SUB><SMALL>2</SMALL></SUB>&nbsp;128&nbsp;=&nbsp;7.  (By the way, log<SUB><SMALL>2</SMALL></SUB>&nbsp;100&nbsp;&cong;&nbsp;6.65, so the
+theoretical best case for 100 numbers is 665 comparisons.)
+
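+<P>One way to make that formula concrete is a little counting procedure
+that outputs the number of times its input can be halved before reaching 1
+(the integer part of log<SUB><SMALL>2</SMALL></SUB>&nbsp;<EM>n</EM>).  This
+is just an illustrative sketch, not part of the chapter's program file, and
+<CODE>halvings</CODE> is a made-up name:
+
+<P><PRE>to halvings :n
+if :n &lt; 2 [output 0]
+output 1 + halvings (:n / 2)
+end
+
+? <U>print halvings 128</U>
+7
+? <U>print halvings 100</U>
+6
+</PRE>
+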
+<P>In general, all the obvious sorting algorithms are O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) and all the
+clever ones are O(<EM>n</EM>&nbsp;log&nbsp;<EM>n</EM>).<SUP>*</SUP>  (I don't
+have to say  O(<EM>n</EM>&nbsp;log<SUB><SMALL>2</SMALL></SUB>&nbsp;<EM>n</EM>) because the difference between logarithms to
+different bases is just multiplication by a constant factor, which doesn't
+count in O(...) notation, just as I don't have to worry about the fact
+that the formula for <CODE>ssort</CODE> comparisons is nearer <EM>n</EM><SUP><SMALL>2</SMALL></SUP>/2 than <EM>n</EM><SUP><SMALL>2</SMALL></SUP>.)
+By the way, I haven't really <EM>proven</EM> that <CODE>psort</CODE> is O(<EM>n</EM>&nbsp;log&nbsp;<EM>n</EM>)
+in the <EM>typical</EM> case, only that it is in the best case.  It's much
+harder to prove things about the typical (or average) performance of any
+sorting algorithm, because what is an &quot;average&quot; input list?  For some
+algorithms there is no proven formula for the average run time, but only the
+results of experiments with many randomly chosen lists.
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Don Knuth has written an O(<EM>n</EM><SUP><SMALL>3</SMALL></SUP>)
+sort program, just as an example of especially bad programming.</SMALL></BLOCKQUOTE></SMALL><P>An O(<EM>n</EM>) algorithm is called <EM>linear</EM> and an O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>) one <EM>
+quadratic.</EM>  O(<EM>n</EM>&nbsp;log&nbsp;<EM>n</EM>) is in between those, better than quadratic but not
+as good as linear.  What other orders of growth are commonly found?  As one
+example, the pre-memoized procedures for the <CODE>t</CODE> and <CODE>simplex</CODE>
+functions in Chapter 2 have time requirements that are O(2<SUP><SMALL><EM>n</EM></SMALL></SUP>); these are
+called <EM>exponential</EM> algorithms.  This means that just adding one
+to <EM>n</EM> makes the program take twice as long!  The experimental results I gave
+earlier agree with this formula: <CODE>simplex 10</CODE> took 80 seconds, while
+<CODE>simplex 12</CODE> took 5&nbsp;&frac12; minutes, about four times as long.
+<CODE>Simplex 16</CODE> would take over an hour.  (Start with 80 seconds, and
+double it six times.)  The memoized versions in this chapter are O(<EM>n</EM><SUP><SMALL>2</SMALL></SUP>)
+(can you prove that?), which is much more manageable.  But for some
+<EM>really</EM> hard problems there is no known way to make them any
+faster than O(2<SUP><SMALL><EM>n</EM></SMALL></SUP>);
+problems like that are called <EM>intractable,</EM> while ones that
+are merely polynomial--O(<EM>n</EM><SUP><SMALL><EM>i</EM></SMALL></SUP>) for any constant <EM>i</EM>--are called <EM>
+tractable.</EM>
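+
+<P>You can check the &quot;double it six times&quot; arithmetic with a
+throwaway estimator for a pure O(2<SUP><SMALL><EM>n</EM></SMALL></SUP>)
+algorithm.  (A sketch for illustration only; <CODE>doubletime</CODE> is a
+made-up name, and <CODE>power</CODE> is Berkeley Logo's exponentiation
+operation.)
+
+<P><PRE>to doubletime :knowntime :n :m
+output :knowntime * power 2 (:m - :n)
+end
+
+? <U>print doubletime 80 10 16</U>
+5120
+</PRE>
+
+<P>That's 5120 seconds, a bit under an hour and a half.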
+
+<P><H2>Data Structures</H2>
+
+
+<P>One of the reasons computers are so useful is that they can contain
+a tremendous amount of information.  An important part of writing
+a large computer program is figuring out how to organize the information
+used in the program.  Most of the time, a program doesn't have to
+deal with a bunch of unrelated facts, like &quot;Tomatoes are red&quot;
+and &quot;7 times 8 is 56.&quot;  Instead there will be some kind of uniformity,
+or natural grouping, in the information a program uses.
+
+<P>We'll see, too, that the data structures you choose for your program affect
+the algorithms available to you.  Organizing your data cleverly can reduce
+the execution time of your procedures substantially.
+
+<P><H2>Data Structures in Real Life</H2>
+
+<P>Why should there be different kinds of organization?  It might help
+to look at some analogies with data structures outside of computers.
+First of all, think about your personal address book.  It probably
+has specific pages reserved for each letter of the alphabet.  Within
+a letter, the names and addresses of people starting with that letter
+are arranged in no particular order; you just add a new person in
+the first free space on the page.
+
+<P>Now think about the printed telephone directory for your city.  In
+this case the names are not arranged randomly within each letter;
+they're in strict alphabetical order.  In computer science terminology,
+the printed directory is a <EM>sorted list,</EM> while your personal
+directory is a <EM>hash table.</EM>
+
+<P>Obviously, if the entries in the printed directory weren't in order
+it would take much too long to find an address.  You can get away
+with random ordering within each letter in your personal directory
+because you know a small enough number of people that it doesn't take
+too long to look through each one.  But you probably do know enough
+people to make the separate page for each letter worthwhile, even
+though the page for Q may be practically empty.  By using separate
+pages for each letter, with unused slots on each page reserved
+for expansion, you are <EM>spending space</EM> to <EM>buy time.</EM>
+That is, your address book is bigger than it would be if it were just
+one long list, but looking up a number is faster this way.  This
+tradeoff between time and space is typical of computer
+programming problems also.
+
+<P>Why don't you keep your personal directory in strict alphabetical order,
+like the printed phone book?  If you did that, looking up a number would be
+even faster.  The problem is that <EM>adding</EM> a new number would be
+terribly painful; you'd have to erase all the names on the page below where
+the new name belongs and rewrite each of them one line below where it
+was.<SUP>*</SUP>  In this case there is a tradeoff between <EM>
+storage time</EM> and <EM>retrieval time;</EM> you pay a small
+price in retrieval time to avoid a large price in storage time.  This, too,
+is a common aspect of data structure design in computer programs.
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Why doesn't the phone company have to do that whenever they get
+a new customer?  The answer is that they maintain the directory information
+in two forms.  The printed directory is the <EM>external</EM> representation,
+used only for looking up information; inside their computers they use an
+<EM>internal</EM> representation that allows for easier insertion of new
+entries.</SMALL></BLOCKQUOTE></SMALL><P>Other kinds of real-life data structures also have computer analogues.
+If your desk looks like mine, with millions of little slips of paper
+all over the place, it is what computer scientists call a <EM>
+heap.</EM><SUP>*</SUP>
+This might be an appropriate data structure for those cases in
+which the program must deal with a large mass of unrelated facts.
+On the other hand, in a large business office there will be a <EM>
+hierarchical</EM> filing system.  A file cabinet labeled &quot;Personnel
+Records&quot; might contain a drawer labeled &quot;Inactive A-H&quot;; that drawer
+would contain a file folder for each former employee whose name starts
+with an appropriate letter.  This kind of hierarchy might be represented
+as a <EM>tree</EM> in a computer program.
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Unfortunately, there are two things called a &quot;heap&quot; in
+computer science.  I'm thinking of the storage allocation heap, not the
+special tree structure used in the &quot;heapsort&quot; algorithm.</SMALL></BLOCKQUOTE></SMALL><P><H2>Trees</H2>
+
+<P>We've used the idea of trees before.  In Volume 1, the program to solve
+pitcher pouring problems was designed in terms of a tree of possible
+solution steps, although that tree was not represented as actual data
+in the program.  In Volume 2, I wrote <CODE>tree.map</CODE> as an example of a
+higher order function operating on trees in which the leaf nodes are
+words and the branch nodes are phrases.  In this chapter we'll work
+toward a general representation of trees as an abstract data type.
+
+<P>Here is a hierarchical grouping of some of the world's cities:
+
+<P><CENTER><IMG SRC="worldtree.gif" ALT="figure: worldtree"></CENTER>
+
+<P>Recall that a diagram like this is called a tree because it resembles a real
+tree turned upside-down.  Each place where a word or phrase appears in the
+tree is called a <EM>node.</EM> At the top of the diagram is the <EM>
+root node</EM> (<CODE>world</CODE>). The lines between nodes are called <EM>
+branches.</EM> The cities, which do not have branches extending below them,
+are called <EM>leaf nodes.</EM> The in-between nodes, the countries
+and provinces, are called <EM>branch nodes.</EM> (The root node is
+also considered a branch node since it, too, has branches below it.)  This
+tree tells us that Toronto is in Ontario, which is in Canada, which is in the
+world.
+
+<P>A tree is a very general data structure because its shape is very flexible.
+For example, in the part of this tree that represents Canada I've included
+a level of tree structure, representing the provinces, that isn't
+included in the subtree that represents France.  As we'll see later, some
+algorithms deal with restricted categories of trees.  For example, a
+<EM>binary</EM> tree is a tree with at most two branches below each branch node.
+
+<P>So far this data structure is just a graphic representation on paper.  There
+are many ways in which a tree can be implemented in a computer program.
+Let's say that I want to represent the tree of cities in a computer so that
+I can ask questions from this data base.  That is, I'd like to write a
+program that will allow interactions like
+
+<P><PRE>? <U>show locate &quot;Montreal</U>
+[world Canada Quebec Montreal]
+</PRE>
+
+<P>Let's pick a particular way to represent a tree in Logo.  (I
+should warn you that later in the chapter I'm going to invent different
+representations, but I want to start with a simple one to meet our
+immediate needs.  So what you're about to see is not the final version of
+these procedures.)  Here's
+the way I'm going to do it:  Each branch node will be represented as a Logo
+variable whose name is the name of the node, containing as its value a list
+of the names of the nodes immediately below it.  For example, this tree will
+include a variable named <CODE>France</CODE> with the value
+
+<P><PRE>[Paris Dijon Avignon]
+</PRE>
+
+<P>A leaf node is just a word that appears in a node list but isn't
+the name of a variable.  For a branch node, <CODE>thing</CODE> of the node's name
+will provide a list of the names of its children.  I can set up the tree
+with this procedure:
+
+<P><PRE>to worldtree
+make &quot;world [France China Canada]
+make &quot;France [Paris Dijon Avignon]
+make &quot;China [Beijing Shanghai Guangzhou Suzhou]
+make &quot;Canada [Ontario Quebec Manitoba]
+make &quot;Ontario [Toronto Ottawa Windsor]
+make &quot;Quebec [Montreal Lachine]
+make &quot;Manitoba [Winnipeg]
+end
+</PRE>
+
+<P>In principle, <CODE>locate</CODE> is a lot like <CODE>filter</CODE>, in the sense that
+we're looking through a data structure for something that meets a given
+condition.  But the implementation is a bit trickier than looking through
+a sequential list, because each invocation gives rise to several recursive
+invocations (one per child), not merely one recursive invocation as usual.
+The program will be easier to understand if we introduce the term
+<EM>forest,</EM> which means a list of trees.
+
+<P><PRE>to locate :city
+output locate1 :city &quot;world
+end
+
+to locate1 :city :subtree
+if equalp :city :subtree [output (list :city)]
+if not namep :subtree [output []]
+localmake &quot;lower locate.in.forest :city (thing :subtree)
+if emptyp :lower [output []]
+output fput :subtree :lower
+end
+
+to locate.in.forest :city :forest
+if emptyp :forest [output []]
+localmake &quot;child locate1 :city first :forest
+if not emptyp :child [output :child]
+output locate.in.forest :city butfirst :forest
+end
+
+? <U>show locate &quot;Shanghai</U>
+[world China Shanghai]
+? <U>show locate &quot;Montreal</U>
+[world Canada Quebec Montreal]
+</PRE>
+
+<P>Once we've set up this data base, we can write procedures to ask it other
+kinds of questions, too.
+
+<P><PRE>to cities :tree
+if not namep :tree [output (list :tree)]
+output map.se [cities ?] thing :tree
+end
+
+? <U>show cities &quot;France</U>
+[Paris Dijon Avignon]
+? <U>show cities &quot;Canada</U>
+[Toronto Ottawa Windsor Montreal Lachine Winnipeg]
+</PRE>
+
+<P><H2>Improving the Data Representation</H2>
+
+<P>There's a problem with the representation I've chosen for this tree.
+Suppose we want to expand the data base to include the city of Quebec.
+This city is in the province of Quebec, so all we have to do is add the name
+to the appropriate list:
+
+<P><PRE>make &quot;Quebec [Montreal Quebec Lachine]
+</PRE>
+
+<P>If you try this, however, you'll find that <CODE>locate</CODE> and
+<CODE>cities</CODE> will no longer work.  They'll both be trapped in
+infinite loops:  the word <CODE>Quebec</CODE> in the node list is also the
+name of a variable, so both programs treat Quebec city as a branch node
+whose children include <CODE>Quebec</CODE> itself, and they descend into
+that node endlessly.
+
+<P>The problem with this program can't be fixed just by changing the program.
+It's really a problem in the way I decided to represent a tree.  I said, &quot;a
+leaf node is just a word that appears in a node list but isn't the name of a
+variable.&quot;  But that means there's no way to allow a leaf node with the same
+name as a branch node.  To solve the problem I have to rethink the
+conventions I use to represent a tree.
+
+<P>Being lazy, I'd like to change as little as possible in the program, so I'm
+going to try to find a new representation as similar as possible to the old
+one.  Here's my idea: In my mind I associate a <EM>level</EM> with each
+node in the tree.  The node <CODE>world</CODE> is at level 1, <CODE>France</CODE> and
+<CODE>Canada</CODE> at level 2, and so on.  The names of the variables used to hold
+the contents of a node will be formed from the node name and the level:
+<CODE>world1</CODE>, <CODE>France2</CODE>, <CODE>Ontario3</CODE>, and so on.  This solves the
+problem because the node for Quebec province will be a branch node by virtue
+of the variable <CODE>Quebec3</CODE>, but the node for Quebec city will be a leaf
+node because there will be no <CODE>Quebec4</CODE> variable.
+
+<P>As it turns out, though, I have to change the program quite a bit to make
+this work.  Several procedures must be modified to take the level number
+as an additional input.  Also, since the variable that holds the information
+about a place is no longer exactly named with the place's name, <CODE>cities</CODE>
+has some extra work to do, just to find the node whose cities we want to
+know.  It can almost use <CODE>locate</CODE> for this purpose, but with a slight
+wrinkle:  If we ask for the cities in Quebec, we mean Quebec province, not
+Quebec city.  So we need a variant of <CODE>locate</CODE> that finds the node
+highest up in the tree with the desired place name.  I gave subprocedure
+<CODE>locate1</CODE> an extra input, named <CODE>highest</CODE>, that's <CODE>true</CODE> if we want
+the highest matching tree node (when called from <CODE>cities</CODE>) or <CODE>false</CODE>
+if we want a matching leaf node (when called from <CODE>locate</CODE>).
+
+<P><PRE>to worldtree
+make &quot;world1 [France China Canada]
+make &quot;France2 [Paris Dijon Avignon]
+make &quot;China2 [Beijing Shanghai Guangzhou Suzhou]
+make &quot;Canada2 [Ontario Quebec Manitoba]
+make &quot;Ontario3 [Toronto Ottawa Windsor]
+make &quot;Quebec3 [Montreal Quebec Lachine]
+make &quot;Manitoba3 [Winnipeg]
+end
+
+to locate :city
+output locate1 :city &quot;world 1 &quot;false
+end
+
+to locate1 :city :subtree :level :highest
+localmake &quot;name (word :subtree :level)
+if and :highest equalp :city :subtree [output (list :city)]
+if not namep :name ~
+   [ifelse equalp :city :subtree
+           [output (list :city)]
+           [output []]]
+localmake &quot;lower locate.in.forest :city (thing :name) :level+1 :highest
+if emptyp :lower [output []]
+output fput :subtree :lower
+end
+
+to locate.in.forest :city :forest :level :highest
+if emptyp :forest [output []]
+localmake &quot;child locate1 :city first :forest :level :highest
+if not emptyp :child [output :child]
+output locate.in.forest :city butfirst :forest :level :highest
+end
+
+to cities :tree
+localmake &quot;path locate1 :tree &quot;world 1 &quot;true
+if emptyp :path [output []]
+output cities1 :tree count :path
+end
+
+to cities1 :tree :level
+localmake &quot;name (word :tree :level)
+if not namep :name [output (list :tree)]
+output map.se [(cities1 ? :level+1)] thing :name
+end
+
+? <U>show locate &quot;Quebec</U>
+[world Canada Quebec Quebec]
+? <U>show cities &quot;Canada</U>
+[Toronto Ottawa Windsor Montreal Quebec Lachine Winnipeg]
+</PRE>
+
+<P>This new version solves the Quebec problem.  But I'm still not satisfied.
+I'd like to add the United States to the data base.  This is a country whose
+name is more than one word.  How can I represent it in the tree structure?
+The most natural thing would be to use a list: <CODE>[United States]</CODE>.
+Unfortunately, a list can't be the name of a variable in Logo.  Besides,
+now that I've actually written the program using this representation I see
+what a kludge it is!
+
+<P><H2>Trees as an Abstract Data Type</H2>
+
+<P>My next idea for representing a tree is to abandon the use of a separate
+variable for each node; instead I'll put the entire tree in one big list.
+A node will be a list whose <CODE>first</CODE> is the datum at that node and whose
+<CODE>butfirst</CODE> is a list of children of the node.  So the entire tree will be
+represented by a list like this:
+
+<P><PRE>[world [France ...] [[United States] ...] [China ...] [Canada ...]]
+</PRE>
+
+<P>The datum at each node can be either a word or a list.
+
+<P>But this time I'm going to be smarter than before.  I'm going to recognize
+that the program I'm writing should logically be separated into two parts:
+the part that implements the <EM>tree</EM> data type, and the part that uses
+a tree to implement the data base of cities.  If I write procedures like
+<CODE>locate</CODE> and <CODE>cities</CODE> in terms of general-purpose tree subprocedures
+like <CODE>leafp</CODE>, a predicate that tells whether its input node is a leaf
+node, then I can change my mind about the implementation of trees (as I've
+done twice already) without changing that part of the program at all.
+
+<P>I'll start by implementing the abstract data type.  I've decided that a tree
+will be represented as a list with the datum of the root node as its <CODE>
+first</CODE> and the subtrees in the <CODE>butfirst</CODE>.  To make this work I need
+<EM>selector</EM> procedures:
+
+<P><PRE>to datum :node
+output first :node
+end
+
+to children :node
+output butfirst :node
+end
+</PRE>
+
+<P>and a <EM>constructor</EM> procedure to build a node out of its
+pieces:
+
+<P><PRE>to tree :datum :children
+output fput :datum :children
+end
+</PRE>
+
+<P>Selectors and constructors are the main procedures needed to define any data
+structure, but there are usually some others that can be useful.  For the
+tree, the main missing one is <CODE>leafp</CODE>.
+
+<P><PRE>to leafp :node
+output emptyp children :node
+end
+</PRE>
+
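+<P>A quick session (hypothetical, with outputs shown for this list
+representation) shows the pieces fitting together:
+
+<P><PRE>? <U>show tree &quot;France [[Paris] [Dijon]]</U>
+[France [Paris] [Dijon]]
+? <U>show datum tree &quot;France [[Paris] [Dijon]]</U>
+France
+? <U>show leafp tree &quot;Paris []</U>
+true
+</PRE>
+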
+<P>Now I can use these tools to write the data base procedures.
+
+<P><PRE>to locate :city
+output locate1 :city :world &quot;false
+end
+
+to locate1 :city :subtree :wanttree
+if and :wanttree (equalp :city datum :subtree) [output :subtree]
+if leafp :subtree ~
+   [ifelse equalp :city datum :subtree
+           [output (list :city)]
+           [output []]]
+localmake &quot;lower locate.in.forest :city (children :subtree) :wanttree
+if emptyp :lower [output []]
+output ifelse :wanttree [:lower] [fput (datum :subtree) :lower]
+end
+
+to locate.in.forest :city :forest :wanttree
+if emptyp :forest [output []]
+localmake &quot;child locate1 :city first :forest :wanttree
+if not emptyp :child [output :child]
+output locate.in.forest :city butfirst :forest :wanttree
+end
+
+to cities :name
+output cities1 (finddatum :name :world)
+end
+
+to cities1 :subtree
+if leafp :subtree [output (list datum :subtree)]
+output map.se [cities1 ?] children :subtree
+end
+
+to finddatum :datum :tree
+output locate1 :name :tree &quot;true
+end
+</PRE>
+
+<P>Once again, <CODE>cities</CODE> depends on a variant of <CODE>locate</CODE> that
+outputs the subtree below a given name, instead of the usual <CODE>locate</CODE>
+output, which is a list of the names on the path from <CODE>world</CODE> down to
+a city.  But instead of having <CODE>cities</CODE> call <CODE>locate1</CODE> directly,
+I decided that it would be more elegant to provide a procedure <CODE>
+finddatum</CODE> that takes a datum and a tree as inputs, whose output is the
+subtree below that datum.
+
+<P>In <CODE>cities1</CODE>, the expression
+
+<P><PRE>(list datum :subtree)
+</PRE>
+
+<P>turns out to be equivalent to just <CODE>:subtree</CODE> for the case of
+a leaf node.  (It is only for leaf nodes that the expression is evaluated.)
+By adhering to the principle of data abstraction I'm making the program work
+a little harder than it really has to.  The advantage, again, is that this
+version of <CODE>cities</CODE> will continue working even if we change the
+underlying representation of trees.  The efficiency cost is quite low;
+changing the expression to <CODE>:subtree</CODE> is a local optimization comparable
+to the common subexpression elimination we considered early in the chapter.
+
+<P>I also have to revise the procedure to set up the tree.  It's going to
+involve many nested invocations of <CODE>tree</CODE>, like this:
+
+<P><PRE>to worldtree
+make &quot;world tree &quot;world
+                 (list (tree &quot;France
+                             (list (tree &quot;Paris [])
+                                   (tree &quot;Dijon [])
+                                   (tree &quot;Avignon []) ) )
+                       (tree &quot;China
+                             (list ...
+</PRE>
+
+<P>and so on.  I can shorten this procedure somewhat by inventing
+an abbreviation for making a subtree all of whose children are leaves.
+
+<P><PRE>to leaf :datum
+output tree :datum []
+end
+
+to leaves :leaves
+output map [leaf ?] :leaves
+end
+
+to worldtree
+make &quot;world
+     tree &quot;world
+          (list (tree &quot;France leaves [Paris Dijon Avignon])
+                (tree &quot;China leaves [Beijing Shanghai Guangzhou Suzhou])
+                (tree [United States]
+                      (list (tree [New York]
+                                  leaves [[New York] Albany Rochester
+                                          Armonk] )
+                            (tree &quot;Massachusetts
+                                  leaves [Boston Cambridge Sudbury
+                                          Maynard] )
+                            (tree &quot;California
+                                  leaves [[San Francisco] Berkeley
+                                          [Palo Alto] Pasadena] )
+                            (tree &quot;Washington
+                                  leaves [Seattle Olympia] ) ) )
+                (tree &quot;Canada
+                      (list (tree &quot;Ontario
+                                  leaves [Toronto Ottawa Windsor] )
+                            (tree &quot;Quebec
+                                  leaves [Montreal Quebec Lachine] )
+                            (tree &quot;Manitoba leaves [Winnipeg]) ) ) )
+end
+
+? <U>worldtree</U>
+? <U>show cities [United States]</U>
+[[New York] Albany Rochester Armonk Boston Cambridge Sudbury Maynard
+ [San Francisco] Berkeley [Palo Alto] Pasadena Seattle Olympia]
+? <U>show locate [Palo Alto]</U>
+[world [United States] California [Palo Alto]]
+</PRE>
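+
+<P>And since <CODE>finddatum</CODE> outputs a subtree rather than a path,
+we can use it to inspect any node directly.  (A hypothetical session; the
+output shown is for this list representation of trees.)
+
+<P><PRE>? <U>show finddatum &quot;Quebec :world</U>
+[Quebec [Montreal] [Quebec] [Lachine]]
+</PRE>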
+
+<P><H2>Tree Modification</H2>
+
+
+<P>So far, so good.  But the procedure <CODE>worldtree</CODE> just above is very
+error-prone because of its high degree of nesting.  In the earlier versions
+I could create the tree a piece at a time instead of all at once.  In a
+practical data base system, people should be able to add new information at
+any time, not just ask questions about the initial information.  That is,
+I'd like to be able to say
+
+<P><PRE>addchild :world (tree &quot;Spain leaves [Madrid Barcelona Orense])
+</PRE>
+
+<P>to add a subtree to the world tree.  New nodes should be possible
+not only at the top of the tree but anywhere within it:
+
+<P><PRE>addchild (finddatum &quot;Canada :world) (tree [Nova Scotia] leaves [Halifax])
+</PRE>
+
+<P>Most versions of Logo do not provide tools to add a new member to an
+existing list.  We could write a program that would make a new copy of the
+entire tree, adding the new node at the right place in the copy, so that
+<CODE>addchild</CODE> would take the form
+
+<P><PRE>make &quot;world newcopy :world ...
+</PRE>
+
+<P>but there are two objections to that.  First, it would be quite
+slow, especially for a large tree.  Second, it would work only if I refrain
+from assigning subtrees as the values of variables other than <CODE>world</CODE>.
+That is, I'd like to be able to say
+
+<P><PRE>? <U>make &quot;Canada finddatum &quot;Canada :world</U>
+? <U>addchild :Canada (tree [Nova Scotia] leaves [Halifax])</U>
+? <U>show locate &quot;Halifax</U>
+[world Canada [Nova Scotia] Halifax]
+</PRE>
+
+<P>Even though I've added the new node to the tree <CODE>:Canada</CODE>, it
+should also be part of <CODE>:world</CODE> (which is where <CODE>locate</CODE> looks)
+because <CODE>:Canada</CODE> is a subtree of <CODE>:world</CODE>.  Similarly, I'd like to
+be able to add a node to the Canadian subtree of <CODE>:world</CODE> and have it
+also be part of <CODE>:Canada</CODE>.  That wouldn't be true if <CODE>addchild</CODE>
+makes a copy of its tree input instead of modifying the existing tree.
+
+<P>I'm going to solve this problem two ways.  In the Doctor project in
+Volume 2 of this series, you learned that Berkeley Logo does include
+primitive procedures that allow the modification of an existing list
+structure.  I think the most elegant solution to the <CODE>addchild</CODE> problem
+is the one that takes advantage of that feature.  But I'll also show a
+solution that works in any version of Logo, to make the point that list
+mutation isn't absolutely necessary; we can achieve the same goals, with
+the same efficiency, by other means.  First here's the list mutation
+version of <CODE>addchild</CODE>:
+
+<P><PRE>to addchild :tree :child
+.setbf :tree (fput :child butfirst :tree)
+end
+
+? <U>make &quot;GB leaf [Great Britain]</U>
+? <U>addchild :world :GB</U>
+? <U>addchild :GB tree &quot;England leaves [London Liverpool Oxford]</U>
+? <U>addchild :GB tree &quot;Scotland leaves [Edinburgh Glasgow]</U>
+? <U>addchild :GB tree &quot;Wales leaves [Abergavenny]</U>
+? <U>show locate &quot;Glasgow</U>
+[world [Great Britain] Scotland Glasgow]
+</PRE>
+
+<P>Just as <CODE>tree</CODE> is a constructor for the tree data type, and <CODE>
+children</CODE> is a selector, <CODE>addchild</CODE> is called a <EM>mutator</EM> for
+this data type.  Notice, by the way, that <CODE>:GB</CODE>, which was originally
+built as a leaf node, can be turned into a branch node by adding children to
+it; <CODE>addchild</CODE> is not limited to nodes that already have children.
+
+<P>The solution using <CODE>.setbf</CODE> is elegant because I didn't have to change
+any of the procedures I had already written; this version of <CODE>addchild</CODE>
+works with the same tree implementation I had already set up.  But suppose
+we didn't have <CODE>.setbf</CODE> or didn't want to use it.  (As we'll see shortly,
+using list mutators does open up some possible pitfalls.)  We can write a
+mutator for the tree abstract data type even if we don't have mutators for
+the underlying Logo lists!  The secret is to take advantage of variables,
+whose values can be changed--mutated--in any version of Logo.
+
+<P>To make this work I'm going to go back to a tree representation somewhat
+like the one I started with, in which each node is represented by a separate
+variable.  But to avoid the problems I had earlier about Quebec and San
+Francisco, the variable name won't be the datum it contains.  Instead I'll
+use a <EM>generated symbol,</EM> an arbitrary name made up by the program.
+(This should sound familiar.  I used the same trick in the Doctor project,
+and there too it was an alternative to list mutation.)
+
+<P>This is where my use of data abstraction pays off.  Look how little I have
+to change:
+
+<P><PRE>to tree :datum :children
+localmake &quot;node gensym
+make :node fput :datum :children
+output :node
+end
+
+to datum :node
+output first thing :node
+end
+
+to children :node
+output butfirst thing :node
+end
+
+to addchild :tree :child
+make :tree lput :child thing :tree
+end
+</PRE>
+
+<P>That's it!  <CODE>Leafp</CODE>, <CODE>finddatum</CODE>, <CODE>locate</CODE>, <CODE>
+cities</CODE>, and <CODE>worldtree</CODE> all work perfectly without modification,
+even though I've made a profound change in the actual representation of
+trees.  (Try printing <CODE>:world</CODE> in each version.)
+
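+<P>(Berkeley Logo supplies <CODE>gensym</CODE> in its library; each
+invocation outputs a new word such as <CODE>g1</CODE>, <CODE>g2</CODE>, and
+so on.  If your dialect lacks it, a minimal version is easy to write.  This
+sketch, and the counter variable <CODE>gensym.count</CODE>, are mine, not
+part of the chapter's program file.)
+
+<P><PRE>to gensym
+if not namep &quot;gensym.count [make &quot;gensym.count 0]
+make &quot;gensym.count :gensym.count+1
+output word &quot;g :gensym.count
+end
+</PRE>
+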
+<P><CODE>Addchild</CODE> is only one of the possible ways in which I might want to
+modify a tree structure.  If we were studying trees more fully, I'd create
+tool procedures to delete a node, to move a subtree from one location in the
+tree to another, to insert a new node between a parent and its children, and
+so on.  There are also many more questions one might want to ask about a
+tree.  How many nodes does it have?  What is its maximum depth?
+
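+<P>Those last two questions can be answered using only the abstract data
+type's selectors, so the same procedures work in either representation.
+(Sketches; <CODE>treesize</CODE>, <CODE>treedepth</CODE>, and
+<CODE>max2</CODE> are made-up names, not part of the chapter's program
+file.)
+
+<P><PRE>to treesize :tree
+if leafp :tree [output 1]
+output 1 + reduce &quot;sum map [treesize ?] children :tree
+end
+
+to max2 :a :b
+output ifelse :a &gt; :b [:a] [:b]
+end
+
+to treedepth :tree
+if leafp :tree [output 1]
+output 1 + reduce &quot;max2 map [treedepth ?] children :tree
+end
+
+? <U>print treedepth :world</U>
+4
+</PRE>
+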
+<P>I've mentioned general tree manipulation tools.  There are also still some
+unresolved issues about the particular use of trees in the city data base.
+For example, although the problem of Quebec city and Quebec province is
+under control, what if I want the data base to include Cambridge, England
+as well as Cambridge, Massachusetts?  What should <CODE>locate</CODE> do if given
+<CODE>Cambridge</CODE> as input?
+
+<P>But instead of pursuing this example further I want to develop another
+example, in which trees are used for a very different purpose.  In the city
+data base, the tree represents a <EM>hierarchy</EM> (Sudbury is <EM>part of</EM>
+Massachusetts); in the example below, a tree is used to represent an <EM>
+ordering</EM> of data, as in the sorting algorithms discussed earlier.
+
+<P><H2>Searching Algorithms and Trees</H2>
+
+<P>Forget about trees for a moment and consider the general problem of <EM>
+searching</EM> through a data base for some piece of information.  For
+example, suppose you want a program to find the city corresponding to a
+particular telephone area code.  That is, I want to be able to say
+
+<P><PRE>? <U>print listcity 415</U>
+San Francisco
+</PRE>
+
+<P>The most straightforward way to do this is to have a list
+containing code-city pairs, like this:
+
+<P><PRE>make &quot;codelist [[202 Washington] [206 Seattle] [212 New York]
+   [213 Los Angeles] [215 Philadelphia] [303 Denver] [305 Miami]
+   [313 Detroit] [314 St. Louis] [401 Providence] [404 Atlanta]
+   [408 Sunnyvale] [414 Milwaukee] [415 San Francisco] [504 New Orleans]
+   [608 Madison] [612 St. Paul] [613 Kingston] [614 Columbus]
+   [615 Nashville] [617 Boston] [702 Las Vegas] [704 Charlotte]
+   [712 Sioux City] [714 Anaheim] [716 Rochester] [717 Scranton]
+   [801 Salt Lake City] [804 Newport News] [805 Ventura] [808 Honolulu]]
+</PRE>
+
+<P>This is a list of lists.  Each sublist contains one pairing of an
+area code with a city.  We can search for a particular area code by going
+through the list member by member, comparing the first word with the desired
+area code.  (By the way, in reality a single area code can contain more than
+one city, and vice versa, but never mind that; I'm trying to keep this
+simple.)  In accordance with the idea of data abstraction, we'll start with
+procedures to extract the area code and city from a pair.
+
+<P><PRE>to areacode :pair
+output first :pair
+end
+
+to city :pair
+output butfirst :pair
+end
+</PRE>
+
+<P>The city is the <CODE>butfirst</CODE> rather than the <CODE>last</CODE> to
+accommodate cities with names of more than one word.
+
+<P>The iteration tool <CODE>find</CODE> does exactly what's needed, going through a
+list member by member until it finds one that matches some criterion:
+
+<P><PRE>to listcity :code
+output city find [equalp :code areacode ?] :codelist
+end
+</PRE>
+
+<P>Time for a little analysis of algorithms.  What is the time
+behavior of this <EM>linear</EM> search algorithm as the data base gets
+bigger?  As for the case of the sorting algorithms, I'll concentrate on the
+number of comparisons.  How many times is <CODE>equalp</CODE> invoked?  The best
+case is if we're looking for the first code in the list, 202.  In this case
+only one comparison is made.  The worst case is if we're looking for the
+last code in the list, 808, or if we're looking for an area code that isn't
+in the list at all.  This requires 31 comparisons.  (That's how many
+code-city pairs happen to be in this particular data base.)  On the average
+this algorithm requires a number of comparisons halfway between these
+extremes, or 16.  As we increase the size of the data base, the number of
+comparisons required grows proportionally; this algorithm is O(<EM>n</EM>).
+
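+<P>You can verify these counts empirically with the same trick used for
+<CODE>lessthanp</CODE> in the sorting examples: a version of the comparison
+that counts its invocations.  (A sketch; <CODE>eq.count</CODE> is a made-up
+name.)
+
+<P><PRE>to eq.count :a :b
+if not namep &quot;comparisons [make &quot;comparisons 0]
+make &quot;comparisons :comparisons+1
+output equalp :a :b
+end
+
+to listcity :code
+output city find [eq.count :code areacode ?] :codelist
+end
+
+? <U>print listcity 808</U>
+Honolulu
+? <U>howmany</U>
+31
+</PRE>
+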
+<P>The area codes in <CODE>:codelist</CODE> are in ascending order.  If you were
+looking through the list yourself, you wouldn't look at every entry one
+after another; you'd take advantage of the ordering by starting around the
+middle and moving forward or backward depending on whether the area code
+you found was too low or too high.  That's called a <EM>binary</EM> search
+algorithm.  <CODE>Listcity</CODE>, though, doesn't take
+advantage of the ordering in the list; the pairs could just as well be
+jumbled up and <CODE>listcity</CODE> would be equally happy.
+
+<P>Binary search works by starting with the median-value area code in the data
+base.  If that's the one we want, we're finished; otherwise we take the
+higher or lower half of the remaining codes and examine the median value of
+that subset.  One way we could implement that algorithm would be to use a
+binary tree to represent the code-city pairs:
+
+<P>
+
+<P><CENTER><IMG SRC="codetree.gif" ALT="figure: codetree"></CENTER>
+
+<P>
+
+<P>(In this picture I've just shown the area codes, but of course the
+actual data structure will have a code-city pair at each node.)
+
+<P>We could construct the tree from scratch, using <CODE>tree</CODE> and <CODE>leaves</CODE>
+as I did earlier, but since we already have the pairs in the correct sorted
+order it's easier to let the computer do it for us:
+
+<P><PRE>to balance :list
+if emptyp :list [output []]
+if emptyp butfirst :list [output leaf first :list]
+output balance1 (int (count :list)/2) :list []
+end
+
+to balance1 :count :in :out
+if equalp :count 0 ~
+   [output tree (first :in) (list balance reverse :out
+                                  balance butfirst :in )]
+output balance1 (:count-1) (butfirst :in) (fput first :in :out)
+end
+</PRE>
+
+<P>In this program I'm using the trick of building <CODE>:out</CODE> using
+<CODE>fput</CODE> instead of <CODE>lput</CODE> and then using <CODE>reverse</CODE> to get the
+left branch back in ascending order, to make the construction a little
+faster.
+
+<P><PRE>make &quot;codetree balance :codelist
+</PRE>
+
+<P>will generate the tree.
+
+<P>Now we're ready to write <CODE>treecity</CODE>, the tree-based search program
+analogous to <CODE>listcity</CODE> for the linear list.
+
+<P><PRE>to treecity :code
+output city treecity1 :code :codetree
+end
+
+to treecity1 :code :tree
+if emptyp :tree [output [000 no city]]
+localmake &quot;datum datum :tree
+if :code = areacode :datum [output :datum]
+if :code &lt; areacode :datum [output treecity1 :code lowbranch :tree]
+output treecity1 :code highbranch :tree
+end
+
+to lowbranch :tree
+if leafp :tree [output []]
+output first children :tree
+end
+
+to highbranch :tree
+if leafp :tree [output []]
+output last children :tree
+end
+
+? <U>print treecity 415</U>
+San Francisco
+</PRE>
+
+<P><CODE>Treecity</CODE> gives the same output as <CODE>listcity</CODE>, but it's
+faster.  At worst it makes two comparisons (for equal and less than) at each
+level of the tree.  This tree is five levels deep, so if the input area code
+is in the tree at all we'll make at most 9 comparisons, compared to 31 for
+<CODE>listcity</CODE>.  The average number is less.  (How much less?)  The
+difference is even more striking for larger data bases; if the number of
+pairs is 1000, the worst-case times are 1000 for <CODE>listcity</CODE> and 19 for
+<CODE>treecity</CODE>.
+
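+<P>The counting trick can answer the &quot;how much less?&quot; question
+experimentally.  Here is a counting version of <CODE>treecity1</CODE>; it's
+a sketch, with <CODE>lt.count</CODE> a made-up name and
+<CODE>eq.count</CODE> the counting <CODE>equalp</CODE> from the linear
+search discussion.
+
+<P><PRE>to lt.count :a :b
+if not namep &quot;comparisons [make &quot;comparisons 0]
+make &quot;comparisons :comparisons+1
+output :a &lt; :b
+end
+
+to treecity1 :code :tree
+if emptyp :tree [output [000 no city]]
+localmake &quot;datum datum :tree
+if eq.count :code areacode :datum [output :datum]
+if lt.count :code areacode :datum [output treecity1 :code lowbranch :tree]
+output treecity1 :code highbranch :tree
+end
+
+? <U>print treecity 808</U>
+Honolulu
+? <U>howmany</U>
+9
+</PRE>
+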
+<P>In general the depth of the tree is the number of times you can divide the
+number of pairs by two.  This is like the situation we met in analyzing the
+partition sort algorithm; the depth of the tree is the log base two of the
+number of pairs.  This is an O(log&nbsp;<EM>n</EM>) algorithm.
+
+<P>The procedures <CODE>lowbranch</CODE> and <CODE>highbranch</CODE> are data abstraction at
+work.  I could have used <CODE>first children :tree</CODE> directly in <CODE>
+treecity1</CODE>, but this way I can change my mind about the representation of
+the tree if necessary.  In fact, in a practical program I would want to use
+a representation that allows a node to have a right child (a <CODE>
+highbranch</CODE>) without having a left child.  Also, <CODE>lowbranch</CODE> and <CODE>
+highbranch</CODE> are written robustly; they give an output, instead of
+causing an error, if applied to a leaf node.  (That happens if you ask for
+the city of an area code that's not in the data base.)  I haven't
+consistently been writing robust programs in this chapter, but I thought I'd
+give an example here.
+
+<P>The efficiency of the tree searching procedure depends on the fact
+that the tree is <EM>balanced.</EM>  In other words, the left branch
+of each node is the same size as its right branch.  I cheated slightly
+by starting with 31 area codes, a number that allows for a perfectly
+balanced tree.  Suppose I had started with 35 area codes.  How should
+I have built the tree?  What would happen to the efficiency of the
+program?  What is the maximum possible number of trials necessary
+to find a node in that tree?  What other numbers of nodes allow for
+perfect balance?
+
+<P>There are two important ideas to take away from this example.  The first is
+that the choice of data representation can have a substantial effect on the
+efficiency of a program.  The second, as I mentioned at the beginning of
+this section, is that we have used the same kind of data structure, a tree,
+for two very different purposes: first to represent a <EM>hierarchy</EM>
+(Sudbury is <EM>part of</EM> Massachusetts) and then to represent an <EM>
+ordering</EM> (313 is <EM>before</EM> 608).
+
+<P><H2>Logo's Underlying Data Structures</H2>
+
+<P>An abstract data type, such as the tree type we've been discussing, must be
+implemented using some lower-level means for data aggregation, that is, for
+grouping separate things into a combined whole.  In Berkeley Logo, there are
+two main built-in means for aggregation: lists and arrays.  (Every dialect
+of Logo has lists, but not all have arrays.)  Words can be thought of as a
+third data aggregate, in which the grouped elements are letters, digits,
+and punctuation characters, but we don't ordinarily use words as the basis
+for abstract data types.
+
+<P>Logo was designed to be used primarily by people whose main interest is in
+something other than computer programming, and so a design goal was to keep
+to a minimum the extent to which the Logo programmer must know about how
+Logo itself works internally.  Even in these books, which <EM>are</EM> focused
+on computing itself, we've gotten this far without looking very deeply into
+how Logo's data structures actually work.  But for the purposes of this
+chapter it will be useful to take that step.
+
+<P>Essentially all computers today are divided into a <EM>processor</EM>
+and a <EM>memory.</EM>  (The exceptions are experimental &quot;parallel
+processing&quot; machines in which many small sub-processors and sub-memories
+are interconnected, sometimes combining both capabilities within a single
+integrated circuit.)  Roughly speaking, the processor contains the
+circuitry that implements hardware primitive procedures such as arithmetic
+operations.  (Not every Logo primitive is a hardware primitive.)  The
+memory holds the information used in carrying out a program, including
+the program itself and whatever data it uses.  The memory is divided into
+millions of small pieces, each of which can hold a single value (a number,
+a letter, and so on).<SUP>*</SUP> Each small piece has an <EM>
+address,</EM> which is a number used to select a particular piece;
+the metaphor is the street address of each house on a long street.  The
+hardware primitive procedures include a <CODE>load</CODE> operation that takes
+a memory address as its input and finds the value in the specified piece
+of memory, and a <CODE>store</CODE> command that takes an address and a value
+as inputs and puts the given value into the chosen piece of memory.
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>I'm being vague about what a &quot;value&quot; is, and
+in fact most computer memories can be divided into pieces of different
+sizes.  In most current computers,
+a <EM>byte</EM> is a piece that can hold any of 256 different
+values, while a <EM>word</EM> is a piece that can hold any of
+about four billion different values.  But these details aren't important
+for my present purposes, and I'm going to ignore them.  I'll talk as if
+memories were simply word-addressable.</SMALL></BLOCKQUOTE></SMALL><P>With this background we can understand how lists and arrays differ.  To
+be specific, suppose that we have a collection of five numbers to store
+in memory.  If we use an array, that means that Logo finds five <EM>
+consecutive</EM> pieces of memory to hold the five values, like this:
+
+<P><CENTER><IMG SRC="array.gif" ALT="figure: array"></CENTER>
+
+<P>If instead we use a list, Logo finds the memory for each of the
+five values as that value is computed.  The five memory slots might not
+be consecutive, so each memory slot must contain not only the desired
+value but also a <EM>pointer</EM> to the next slot.  That is, each
+slot must contain an additional number, the address of the next slot.
+(What I'm calling a &quot;slot&quot; must therefore be big enough to hold two
+numbers, and in fact Logo uses what we call a <EM>pair</EM> for
+each, essentially an array of length two.)  Since we don't care about
+the actual numeric values of the addresses, but only about the pairs
+to which they point, we generally use arrows to represent pointers
+pictorially:
+
+<P><CENTER><IMG SRC="list.gif" ALT="figure: list"></CENTER>
+
+<P>The last pair of the list has a <EM>null pointer,</EM> a
+special value to indicate that there is no next pair following it, indicated
+by the diagonal line.
+
+<P>Why do we bother to provide two aggregation mechanisms?  Why don't language
+designers just pick the best one?  Clearly the answer will be that each is
+&quot;the best&quot; for different purposes.  In the following paragraphs I'll
+compare several characteristics of lists and arrays.
+
+<P>One advantage of arrays that should be obvious from these pictures is that a
+list uses twice as much memory to hold the same number of values.  But this
+is not generally an important difference for modern computers; unless your
+problem is really enormous, you don't have to worry about running out of
+memory.
+
+<P>The most important advantage of arrays is that they are
+<EM>random access.</EM>  This means that each member of an array
+can be found just as quickly as any other member, regardless of its
+position within the array.  If the program knows the address at which
+the array begins, and it wants to find the <EM>n</EM>th member of the array,
+only two operations are needed:  First add <EM>n</EM> to the array's address,
+then <CODE>load</CODE> the value from the resulting address.  This takes a
+constant (and very short) amount of time, O(1).  By contrast, in a list
+there is no simple arithmetic relationship between the address of the list's
+first member and the address of its <EM>n</EM>th member.  To find the <EM>n</EM>th member,
+the program must <CODE>load</CODE> the pointer from the first pair, then use that
+address to <CODE>load</CODE> the pointer from the second pair, and so on, <EM>n</EM> times.
+The number of operations needed is O(<EM>n</EM>).
+
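+<P>The contrast is easy to express in Logo itself.  For an array,
+<CODE>item :n :array</CODE> is one address computation plus one
+<CODE>load</CODE>.  For a list, finding the <EM>n</EM>th member means
+chasing pointers, which is essentially what this sketch does
+(<CODE>nth</CODE> is a made-up name; the primitive <CODE>item</CODE> does
+the equivalent work when given a list):
+
+<P><PRE>to nth :n :list
+if equalp :n 1 [output first :list]
+output nth :n-1 butfirst :list
+end
+</PRE>
+
+<P>Each <CODE>butfirst</CODE> follows one pointer to the next pair, so the
+time <CODE>nth</CODE> takes grows in proportion to <EM>n</EM>.
+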
+<P>On the other hand, the most important advantage of lists is
+<EM>dynamic allocation.</EM>  This means that the programmer does not
+have to decide ahead of time on how big the data aggregate will become.  (We
+saw an example recently when we wanted to add a child to a node of an
+existing tree.)  Consider the five-member aggregates shown earlier, and
+suppose we want to add a sixth member.  If we've used a list, we can say,
+for example,
+
+<P><PRE>make &quot;newlist fput &quot;A :oldlist
+</PRE>
+
+<P>and all Logo has to do is find one new pair:
+
+<P><CENTER><IMG SRC="growlist.gif" ALT="figure: growlist"></CENTER>
+
+<P>By contrast, once an array has been created we can't expand it,
+because the new address would have to be adjacent to the old addresses, and
+that piece of memory might already be used for something else.  To make an
+array bigger, you have to allocate a complete new array and copy the old
+values into it.
+
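+<P>Here is what &quot;allocate a complete new array and copy&quot; amounts
+to, as a sketch.  (<CODE>arraygrow</CODE> is a made-up name;
+<CODE>array</CODE>, <CODE>item</CODE>, <CODE>setitem</CODE>, and
+<CODE>for</CODE> are standard Berkeley Logo procedures.)  Even adding room
+for one more member costs O(<EM>n</EM>) time:
+
+<P><PRE>to arraygrow :old :size
+localmake &quot;new array :size
+localmake &quot;n count :old
+for [i 1 :n] [setitem :i :new item :i :old]
+output :new
+end
+</PRE>
+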
+<P>Remember that arrays sacrifice efficient expansion in order to get
+efficient random access.  From the standpoint of program speed, one is not
+absolutely better than the other; it depends on the nature of the problem
+you're trying to solve.  That's why it's best if a language offers both
+structures, as Berkeley Logo does.  For the very common
+case of <CODE>foreach</CODE>-like iteration through an aggregate, neither random
+access nor dynamic allocation is really necessary.  For a data base that can
+grow during the running of a program, the flexibility of dynamic allocation
+is valuable.  For many sorting algorithms, on the other hand, it's important
+to be able to jump around in the aggregate and so random access is useful.
+(A programmer using arrays can partially overcome the lack of dynamic
+allocation by preallocating a too-large array and leaving some of it empty
+at first.  But if the order of members in the array matters, it may turn out
+that the &quot;holes&quot; in the array aren't where they're needed, and so the
+program still has to copy blocks of members to make room.  Also, such
+programs have occasional embarrassing failures because what the programmer
+thought was an extravagantly large array turns out not quite large enough
+for some special use of the program, or because a malicious user deliberately
+&quot;overruns&quot; the length of the array in order to evade a program restriction.)
+
+<P>The solitaire program in Volume 2 of this series illustrates the different
+advantages of lists and arrays.  As in any card game, a solitaire player
+distributes the cards into several small groups, each of which varies in
+size as the play continues.  For example, a typical step is to deal a card
+from the <EM>hand</EM> onto the <EM>pile,</EM> each of which is represented as
+a list:
+
+<P><PRE>make &quot;pile fput (first :hand) :pile
+make &quot;hand butfirst :hand
+</PRE>
+
+<P>(The actual solitaire program uses somewhat different
+instructions to accomplish the same effect, with a <CODE>deal</CODE> procedure that
+outputs the next available card after removing it from the hand.)
+
+<P>On the other hand (no pun intended), <EM>shuffling</EM> the deck is easier
+when an array is used to hold the card values, because the shuffling
+algorithm requires random jumping around in the deck, but does not change
+the total number of cards, so that random access is more important than
+dynamic allocation.  Therefore, the program starts each round of play with
+the deck in the form of an array of 52 cards.  It shuffles the deck in array
+form, and then copies the members of the array into a list, which is used
+for the rest of that round.  The advantage of an array for shuffling, and
+the advantage of a list for dealing cards, together outweigh the time
+spent copying the array into the list.
+
+<P>An important consequence of dynamic allocation is that lists are
+<EM>sharable</EM> data structures.  In the example above, <CODE>
+:oldlist</CODE> contains five pairs and <CODE>:newlist</CODE> contains six, but the
+total number of pairs used is six, not 11, because most of the same pairs
+are part of both lists.  That's why the <CODE>fput</CODE> operation takes O(1)
+time, unaffected by the length of the list, as do <CODE>first</CODE> and <CODE>
+butfirst</CODE>.  (Now that you know how lists are constructed using pairs
+and pointers, you should be able to understand something I've mentioned
+earlier without explanation:  <CODE>Lput</CODE>, <CODE>last</CODE>, and <CODE>butlast</CODE>
+require O(<EM>n</EM>) time.  It's much faster to operate near the beginning of a
+list than near the end.)  Arrays are not sharable; each array occupies its
+own block of consecutive memory addresses.
+
+<P>When a data structure is both sharable and mutable, it's possible to get
+into some very mysterious, hard-to-detect bugs.  Suppose we do this:
+
+<P><PRE>? <U>make &quot;one [Ice cream is delicious.]</U>
+? <U>make &quot;two fput &quot;Spinach butfirst butfirst :one</U>
+</PRE>
+
+<P>
+
+<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch3/spinach1.gif" ALT="figure: spinach1"></CENTER>
+
+<P>Then suppose you decide you don't like spinach.  You might say
+
+<P><PRE>? <U>.setfirst butfirst butfirst :two &quot;disgusting.</U>
+? <U>print :two</U>
+Spinach is disgusting.
+</PRE>
+
+<P><CENTER><IMG SRC="spinach2.gif" ALT="figure: spinach2"></CENTER>
+
+<P>But you haven't taken into account that <CODE>:one</CODE> and <CODE>:two</CODE>
+share memory, so this instruction has an unintended result:
+
+<P><PRE>? <U>print :one</U>
+Ice cream is disgusting.
+</PRE>
+
+<P>This is definitely a bug!
+
+<P>It's the combination of mutation and sharing that causes trouble.  Arrays
+are mutable but not sharable; the result of using <CODE>setitem</CODE> to change
+a member of an array is easily predictable.  The trouble with list mutation
+is that you may change other lists besides the one you think you're changing.
+That's why Berkeley Logo's <CODE>.setfirst</CODE> and <CODE>.setbf</CODE> primitives have
+names starting with periods, the Logo convention for &quot;experts only!&quot;  It's
+also why many other Logo dialects don't allow list mutation at
+all.<SUP>*</SUP>
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>Also, this helps to explain the importance of property lists
+in Logo.  Property lists are a safe form of mutable list, because they are
+not sharable; that's why the <CODE>plist</CODE> primitive outputs a newly-allocated
+<EM>copy</EM> of the property list.</SMALL></BLOCKQUOTE></SMALL><P>In order to explain to you about how something like this might happen, I've
+had to tell you more about Logo's storage allocation techniques than is
+ordinarily taught.  One of the design goals of Logo was that the programmer
+shouldn't have to think in terms of boxes and arrows as we're doing now.
+But if you mutate sharable lists, you'd better understand the boxes
+and arrows.  It's not entirely obvious, you see, when two
+lists <EM>do</EM> share storage.  Suppose the example had been backwards:
+
+<P><PRE>? <U>make &quot;one [Delicious, that's ice cream.]</U>
+? <U>make &quot;two lput &quot;spinach. butlast butlast :one</U>
+? <U>print :two</U>
+Delicious, that's spinach.
+? <U>.setfirst :two &quot;Disgusting,</U>
+? <U>print :two</U>
+Disgusting, that's spinach.
+? <U>print :one</U>
+Delicious, that's ice cream.
+</PRE>
+
+<P>In this case the other list is <EM>not</EM> mysteriously modified
+because when <CODE>lput</CODE> is used, rather than <CODE>fput</CODE> as in the previous
+example, the two lists do <EM>not</EM> share memory.  That's a consequence of
+the front-to-back direction of the arrows connecting the boxes; it's
+possible to have two arrows pointing <EM>into</EM> a box, but only one arrow
+can <EM>leave</EM> a box.  You can't do this:
+
+<P><CENTER><IMG SRC="spinach3.gif" ALT="figure: spinach3"></CENTER>
+
+<P>
+
+<P>The combination of mutation and sharing, although tricky, is not all bad.
+The same mutual dependence that made a mess of <CODE>:one</CODE> and <CODE>:two</CODE> in
+the example above was desirable and helpful in the case of <CODE>:world</CODE>
+and <CODE>:Canada</CODE> earlier.  That's why Lisp has always included mutable
+lists, and why versions of Logo intended for more expert users have also
+chosen to allow list mutation.
+
+<P>Don't think that without mutable lists you are completely safe from mutual
+dependence.  Any language in which it's possible to give a datum a <EM>
+name</EM> allows the programmer to set up the equivalent of sharable data,
+just as I did in the final version of the tree of cities.  As far as the
+Logo interpreter is concerned, the value of the variable <CODE>world</CODE> is some
+generated symbol like <CODE>G47</CODE>.  That value is immune to changes in other
+data structures.  But if we think of <CODE>:world</CODE> as containing,
+effectively, the entire tree whose root node is called <CODE>G47</CODE>, then
+changes to other variables do, in effect, change the value of <CODE>world</CODE>.
+What is true is that without mutable lists you can't easily set up a mutual
+dependence <EM>by accident;</EM> you have to intend it.
+
+<P>By the way, by drawing the box and pointer diagrams with the actual data
+inside the boxes, I may have given you the impression that each member of a
+list or array must be a single number or letter, so that the value will fit
+in one memory address.  Actually, each member can be a pointer to anything.
+For example, here's a picture of an array that includes lists,
+
+<P><PRE>{[A B C] D [E F]}
+</PRE>
+
+<P><CENTER><IMG SRC="bigarray.gif" ALT="figure: bigarray"></CENTER>
+
+<P>
+<TABLE width="100%"><TR><TD><A HREF="../v3-toc2.html">(back to Table of Contents)</A>
+<TD align="right"><A HREF="../v3ch2/v3ch2.html"><STRONG>BACK</STRONG></A>
+chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v3ch4/v3ch4.html"><STRONG>NEXT</STRONG></A>
+</TABLE>
+
+<P><H2>Program Listing</H2>
+
+<P><P>
+<P><PRE>
+;;; Algorithms and Data Structures
+
+;; Local optimization of quadratic formula
+
+to quadratic :a :b :c
+localmake "root sqrt (:b * :b-4 * :a * :c)
+localmake "x1 (-:b+:root)/(2 * :a)
+localmake "x2 (-:b-:root)/(2 * :a)
+print (sentence [The solutions are] :x1 "and :x2)
+end
+
+;; Memoization of T function
+
+to t :n :k
+localmake "result gprop :n :k
+if not emptyp :result [output :result]
+make "result realt :n :k
+pprop :n :k :result
+output :result
+end
+
+to realt :n :k
+if equalp :k 0 [output 1]
+if equalp :n 0 [output 0]
+output (t :n :k-1) + (t :n-1 :k)
+end
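+
+; Illustrative call (added; not part of the original listing):
+;   print t 2 2     prints 3, since T(2,2) = T(2,1) + T(1,2) = 2 + 1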
+
+;; Speedup of Simplex function
+
+to simplex :buttons
+output 2 * first (cascade :buttons
+                          [fput (sumprods butfirst ?2 ?1) ?1] [1]
+                          [fput 1 nextrow ?2] [1 1])
+end
+
+to sumprods :a :b
+output reduce "sum (map "product :a :b)
+end
+
+to nextrow :combs
+if emptyp butfirst :combs [output :combs]
+output fput (sum first :combs first butfirst :combs) ~
+            nextrow butfirst :combs
+end
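+
+; Illustrative call (added; not part of the original listing):
+;   print simplex 5     prints 1082, the number of combinations
+;                       for a five-button Simplex lock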
+
+;; Sorting -- selection sort
+
+to ssort :list
+if emptyp :list [output []]
+output ssort1 (first :list) (butfirst :list) []
+end
+
+to ssort1 :min :in :out
+if emptyp :in [output fput :min ssort :out]
+if lessthanp :min (first :in) ~
+   [output ssort1 :min (butfirst :in) (fput first :in :out)]
+output ssort1 (first :in) (butfirst :in) (fput :min :out)
+end
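+
+; Illustrative call (added; not part of the original listing):
+;   show ssort [5 3 8 1]     prints [1 3 5 8]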
+
+;; Sorting -- partition sort
+
+to psort :list
+if emptyp :list [output []]
+if emptyp butfirst :list [output :list]
+localmake "split ((first :list) + (last :list)) / 2
+if lessthanp first :list :split ~
+   [output psort1 :split (butfirst :list) (list first :list) []]
+output psort1 :split (butlast :list) (list last :list) []
+end
+
+to psort1 :split :in :low :high
+if emptyp :in [output sentence (psort :low) (psort :high)]
+if lessthanp first :in :split ~
+   [output psort1 :split (butfirst :in) (fput first :in :low) :high]
+output psort1 :split (butfirst :in) :low (fput first :in :high)
+end
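+
+; Illustrative call (added; not part of the original listing):
+;   show psort [5 3 8 1]     prints [1 3 5 8]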
+
+;; Sorting -- count comparisons
+
+to lessthanp :a :b
+if not namep "comparisons [make "comparisons 0]
+make "comparisons :comparisons+1
+output :a < :b
+end
+
+to howmany
+print :comparisons
+ern "comparisons
+end
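+
+; Illustrative session (added; not part of the original listing):
+;   ? show ssort [5 3 8 1]
+;   [1 3 5 8]
+;   ? howmany
+;   6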
+
+;; Abstract Data Type for Trees: Constructor
+
+to tree :datum :children
+output fput :datum :children
+end
+
+;; Tree ADT: Selectors
+
+to datum :node
+output first :node
+end
+
+to children :node
+output butfirst :node
+end
+
+;; Tree ADT: Mutator
+
+to addchild :tree :child
+.setbf :tree (fput :child butfirst :tree)
+end
+
+;; Tree ADT: other procedures
+
+to leaf :datum
+output tree :datum []
+end
+
+to leaves :leaves
+output map [leaf ?] :leaves
+end
+
+to leafp :node
+output emptyp children :node
+end
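+
+; Illustrative session (added; not part of the original listing):
+;   ? make "t tree "root leaves [a b]
+;   ? show datum :t
+;   root
+;   ? show map [datum ?] children :t
+;   [a b]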
+
+;; The World tree
+
+to worldtree
+make "world ~
+     tree "world ~
+          (list (tree "France leaves [Paris Dijon Avignon])
+                (tree "China leaves [Beijing Shanghai Guangzhou Suzhou])
+                (tree [United States]
+                      (list (tree [New York]
+                                  leaves [[New York] Albany Rochester
+                                          Armonk] )
+                            (tree "Massachusetts
+                                  leaves [Boston Cambridge Sudbury
+                                          Maynard] )
+                            (tree "California
+                                  leaves [[San Francisco] Berkeley
+                                          [Palo Alto] Pasadena] )
+                            (tree "Washington
+                                  leaves [Seattle Olympia] ) ) )
+                (tree "Canada
+                      (list (tree "Ontario
+                                  leaves [Toronto Ottawa Windsor] )
+                            (tree "Quebec
+                                  leaves [Montreal Quebec Lachine] )
+                            (tree "Manitoba leaves [Winnipeg]) ) ) )
+end
+
+to locate :city
+output locate1 :city :world "false
+end
+
+to locate1 :city :subtree :wanttree
+if and :wanttree (equalp :city datum :subtree) [output :subtree]
+if leafp :subtree ~
+   [ifelse equalp :city datum :subtree
+           [output (list :city)]
+           [output []]]
+localmake "lower locate.in.forest :city (children :subtree) :wanttree
+if emptyp :lower [output []]
+output ifelse :wanttree [:lower] [fput (datum :subtree) :lower]
+end
+
+to locate.in.forest :city :forest :wanttree
+if emptyp :forest [output []]
+localmake "child locate1 :city first :forest :wanttree
+if not emptyp :child [output :child]
+output locate.in.forest :city butfirst :forest :wanttree
+end
+
+to cities :name
+output cities1 (finddatum :name :world)
+end
+
+to cities1 :subtree
+if leafp :subtree [output (list datum :subtree)]
+output map.se [cities1 ?] children :subtree
+end
+
+to finddatum :datum :tree
+output locate1 :datum :tree "true
+end
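+
+; Illustrative calls (added; not part of the original listing),
+; after running WORLDTREE:
+;   show locate "Montreal     prints [world Canada Quebec Montreal]
+;   show cities "Canada       prints
+;       [Toronto Ottawa Windsor Montreal Quebec Lachine Winnipeg]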
+
+;; Area code/city pairs ADT
+
+to areacode :pair
+output first :pair
+end
+
+to city :pair
+output butfirst :pair
+end
+
+;; Area code linear search
+
+make "codelist [[202 Washington] [206 Seattle] [212 New York]
+                [213 Los Angeles] [215 Philadelphia] [303 Denver]
+                [305 Miami] [313 Detroit] [314 St. Louis]
+                [401 Providence] [404 Atlanta] [408 Sunnyvale]
+                [414 Milwaukee] [415 San Francisco] [504 New Orleans]
+                [608 Madison] [612 St. Paul] [613 Kingston]
+                [614 Columbus] [615 Nashville] [617 Boston]
+                [702 Las Vegas] [704 Charlotte]
+                [712 Sioux City] [714 Anaheim] [716 Rochester]
+                [717 Scranton] [801 Salt Lake City] [804 Newport News]
+                [805 Ventura] [808 Honolulu]]
+
+to listcity :code
+output city find [equalp :code areacode ?] :codelist
+end
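+
+; Illustrative call (added; not part of the original listing):
+;   print listcity 415     prints San Francisco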
+
+;; Area code binary tree search
+
+to balance :list
+if emptyp :list [output []]
+if emptyp butfirst :list [output leaf first :list]
+output balance1 (int (count :list)/2) :list []
+end
+
+to balance1 :count :in :out
+if equalp :count 0 ~
+   [output tree (first :in) (list balance reverse :out
+                                  balance butfirst :in)]
+output balance1 (:count-1) (butfirst :in) (fput first :in :out)
+end
+
+to treecity :code
+output city treecity1 :code :codetree
+end
+
+to treecity1 :code :tree
+if emptyp :tree [output [0 no city]]
+localmake "datum datum :tree
+if :code = areacode :datum [output :datum]
+if :code < areacode :datum [output treecity1 :code lowbranch :tree]
+output treecity1 :code highbranch :tree
+end
+
+to lowbranch :tree
+if leafp :tree [output []]
+output first children :tree
+end
+
+to highbranch :tree
+if leafp :tree [output []]
+output last children :tree
+end
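+
+; Illustrative session (added; not part of the original listing):
+;   ? make "codetree balance :codelist
+;   ? print treecity 415
+;   San Francisco
+;   ? print treecity 999
+;   no city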
+</PRE><P>
+
+
+<P>
+
+<P><A HREF="../v3-toc2.html">(back to Table of Contents)</A>
+<P><A HREF="../v3ch2/v3ch2.html"><STRONG>BACK</STRONG></A>
+chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v3ch4/v3ch4.html"><STRONG>NEXT</STRONG></A>
+
+<P>
+<ADDRESS>
+<A HREF="../index.html">Brian Harvey</A>, 
+<CODE>bh@cs.berkeley.edu</CODE>
+</ADDRESS>
+</BODY>
+</HTML>