*

author: elioat <elioat@tilde.institute> 2023-08-23 07:52:19 -0400
committer: elioat <elioat@tilde.institute> 2023-08-23 07:52:19 -0400
commit: 562a9a52d599d9a05f871404050968a5fd282640 (patch)
tree: 7d3305c1252c043bfe246ccc7deff0056aa6b5ab /js/games/nluqo.github.io/~bh/ssch18/trees.html
parent: 5d012c6c011a9dedf7d0a098e456206244eb5a0f (diff)
download: tour-562a9a52d599d9a05f871404050968a5fd282640.tar.gz
1 files changed, 848 insertions, 0 deletions
diff --git a/js/games/nluqo.github.io/~bh/ssch18/trees.html b/js/games/nluqo.github.io/~bh/ssch18/trees.html
new file mode 100644
index 0000000..3282cb2
--- /dev/null
+++ b/js/games/nluqo.github.io/~bh/ssch18/trees.html
@@ -0,0 +1,848 @@
+<P>
+
+<P><A NAME="mondrian"></A><CENTER><IMG SRC="../ss-pics/mondrian.jpg" ALT="figure: mondrian"></CENTER><P><CENTER><EM>Apple Tree in Blossom,</EM> Piet Mondrian
+(1912)
+</CENTER><P>
+
+<HTML>
+<HEAD>
+<TITLE>Simply Scheme: Introducing Computer Science ch 18: Trees</TITLE>
+</HEAD>
+<BODY>
+<HR>
+<CITE>Simply Scheme:</CITE>
+<CITE>Introducing Computer Science</CITE> 2/e Copyright (C) 1999 MIT
+<H2>Chapter 18</H2>
+<H1>Trees</H1>
+
+<TABLE width="100%"><TR><TD>
+<IMG SRC="../simply.jpg" ALT="cover photo">
+<TD><TABLE>
+<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
+Harvey</A><BR>University of California, Berkeley</CITE>
+<TR><TD align="right"><CITE><A HREF="http://ccrma.stanford.edu/~matt">Matthew
+Wright</A><BR>University of California, Santa Barbara</CITE>
+<TR><TD align="right"><BR>
+<TR><TD align="right"><A HREF="../pdf/ssch18.pdf">Download PDF version</A>
+<TR><TD align="right"><A HREF="../ss-toc2.html">Back to Table of Contents</A>
+<TR><TD align="right"><A HREF="../ssch17/lists.html"><STRONG>BACK</STRONG></A>
+chapter thread <A HREF="../ssch19/implement-hof.html"><STRONG>NEXT</STRONG></A>
+<TR><TD align="right"><A HREF="http://mitpress.mit.edu/0262082810">MIT
+Press web page for <CITE>Simply Scheme</CITE></A>
+</TABLE></TABLE>
+
+<HR>
+
+
+<P>The big advantage of full-featured lists over sentences is their ability to
+represent <EM>structure</EM> in our data by means of sublists.
+In this chapter we'll look at examples in which we use lists and sublists to
+represent two-dimensional information structures.  The kinds of structures
+we'll consider are called <EM>trees</EM> because they resemble trees in
+nature:
+
+<P><P><CENTER><IMG SRC="../ss-pics/firsttree.jpg" ALT="figure: firsttree">
+&nbsp;&nbsp;&nbsp;&nbsp;<IMG SRC="../ss-pics/tree.jpg" ALT="photo: tree">
+</CENTER>
+
+
+<P>
+<P>The components of a tree are called <EM>nodes.</EM> At the
+top is the <EM><A NAME="g1"></A><A NAME="g2"></A>root node</EM> of the tree; in the interior of the
+diagram there are <EM><A NAME="g3"></A><A NAME="g4"></A>branch nodes;</EM> at the bottom are the <EM><A NAME="g5"></A><A NAME="g6"></A>leaf nodes,</EM> from which no further branches extend.
+
+<P>We're going to begin by considering a tree as an abstract data type, without
+thinking about how lists are used to represent trees.  For example, we'll
+construct trees using a procedure named <CODE>make-node</CODE>, as if that were a
+Scheme primitive.  About halfway through the chapter, we'll explore the
+relationship between trees and lists.
+
+<P><H2>Example: The World</H2>
+
+<P>Here is a tree that represents the world:
+
+<P><CENTER><IMG SRC="../ss-pics/world.jpg" ALT="figure: world"></CENTER>
+
+<P>Each node in the tree represents some region of the world.  Consider the
+node labeled &quot;Great Britain.&quot;  There are two parts to this node:  The
+obvious part is the label itself, the name &quot;Great Britain.&quot;  But
+the regions of the world that are included within Great Britain&mdash;that is,
+the nodes that are attached beneath Great Britain in the figure&mdash;are also part
+of this node.
+
+<P>We say that every node has a <EM>datum</EM> and zero or more <EM>children.</EM>  For the moment, let's just say that the datum can be
+either a word or a sentence.  The children, if any, are themselves trees.
+Notice that this definition is recursive&mdash;a tree is made up of trees.
+(What's the base case?)
+
+<P>This family metaphor is also part of the terminology of
+trees.<A NAME="text1" HREF="trees.html#ft1">[1]</A> We say that a node is the <EM>parent</EM> of
+another node, or that two nodes are <EM>siblings.</EM>  In more advanced
+treatments, you even hear things like &quot;grandparent&quot; and &quot;cousin,&quot; but we
+won't get into that.
+
+<P>What happens when you prune an actual tree by cutting off a branch?
+The cut-off part is essentially a tree in itself, with a smaller trunk
+and fewer branches.  The metaphor isn't perfect because the cut-off part
+doesn't have roots, but still, we can stick the end in the ground and
+hope that the cut-off end will take root as a new tree.
+
+<P>It's the same with a country in our example; each country is a branch node of
+the entire world tree, but also a tree in itself.  Depending on how you
+think about it, Great Britain can be either a component of the entire world
+or a collection of smaller locations.  So the branch node that represents
+Great Britain is the root node of a <EM>subtree</EM> of the entire tree.
+
+<P><CENTER><IMG SRC="../ss-pics/britain.jpg" ALT="figure: britain"></CENTER>
+
+<P>What is a node?  It might seem natural to think of a node as being just the
+information in one of the circles in the diagram&mdash;that is, to think of a
+node as including only its datum.  In that way of thinking, each node would
+be separate from every other node, just as the words in a sentence are all
+separate elements.  However, it will be more useful to think of a node as a
+structure that includes everything below that circle also: the datum and the
+children.  So when we think of the node for Great Britain, we're thinking
+not only of the name &quot;Great Britain,&quot; but also of everything <EM>in</EM> Great Britain.  From this perspective, the root node of a tree
+includes the entire tree.  We might as well say that the node <EM>is</EM> the
+tree.
+
+<P>The constructor for a tree is actually the constructor for one node,
+its root node.  Our constructor for trees is therefore called
+<CODE><A NAME="g7"></A><CODE>make-node</CODE></CODE>.  It takes two arguments: the datum and a (possibly
+empty) list of children.  As the following example shows, constructing what
+we think of as one tree requires the construction of many such nodes.
+
+<P><PRE>(define world-tree                           ;; painful-to-type version
+  (make-node
+   'world
+   (list (make-node
+          'italy
+          (list (make-node 'venezia '())
+                (make-node 'riomaggiore '())
+                (make-node 'firenze '())
+                (make-node 'roma '())))
+         (make-node
+          '(united states)
+          (list (make-node 'california
+                           (list (make-node 'berkeley '())
+                                 (make-node '(san francisco) '())
+                                 (make-node 'gilroy '())))
+                (make-node 'massachusetts
+                           (list (make-node 'cambridge '())
+                                 (make-node 'amherst '())
+                                 (make-node 'sudbury '())))))))) 
+</PRE>
+
+<P>You'll notice that we haven't defined all of the places shown
+in the figure.  That's because we got tired of doing all this typing;
+we're going to invent some abbreviations later.  For now, we'll take time
+out to show you the selectors for trees.
+<A NAME="g8"></A>
+<A NAME="g9"></A>
+
+<P><PRE>&gt; (datum world-tree)
+WORLD
+
+&gt; (datum (car (children world-tree)))
+ITALY
+
+&gt; (datum (car (children (cadr (children world-tree)))))
+CALIFORNIA
+
+&gt; (datum (car (children (car (children
+			      (cadr (children world-tree)))))))
+BERKELEY
+</PRE>
+
+<P><CODE>Datum</CODE> of a tree node returns the datum of that node.
+<CODE>Children</CODE> of a node returns a list of the children of the node.
+(A list of trees is called a <EM>forest.</EM>)
+
+<P>Here are some abbreviations to help us construct the world tree with less
+typing.  Unlike <CODE>make-node</CODE>, <CODE>datum</CODE>, and <CODE>children</CODE>, which are
+intended to work on trees in general, these abbreviations were designed
+with the world tree specifically in mind:
+
+<P><PRE>(define (<A NAME="g10"></A>leaf datum)
+  (make-node datum '()))
+
+(define (<A NAME="g11"></A>cities name-list)
+  (map leaf name-list))
+</PRE>
+
+<P>With these abbreviations the world tree is somewhat easier to
+define:
+
+<P><PRE>(define world-tree
+  (make-node
+   'world
+   (list (make-node
+          'italy
+          (cities '(venezia riomaggiore firenze roma)))
+         (make-node
+          '(united states)
+          (list (make-node
+                 'california
+                 (cities '(berkeley (san francisco) gilroy)))
+                (make-node
+                 'massachusetts
+                 (cities '(cambridge amherst sudbury)))
+                (make-node 'ohio (cities '(kent)))))
+         (make-node 'zimbabwe (cities '(harare hwange)))
+         (make-node 'china
+		    (cities '(beijing shanghai guangzhou suzhou)))
+         (make-node
+          '(great britain)
+          (list 
+           (make-node 'england (cities '(liverpool)))
+           (make-node 'scotland
+		      (cities '(edinburgh glasgow (gretna green))))
+           (make-node 'wales (cities '(abergavenny)))))
+         (make-node
+          'australia
+          (list
+           (make-node 'victoria (cities '(melbourne)))
+           (make-node '(new south wales) (cities '(sydney)))
+           (make-node 'queensland
+		      (cities '(cairns (port douglas))))))
+         (make-node 'honduras (cities '(tegucigalpa))))))
+</PRE>
+
+<P>
+<P><H2>How Big Is My Tree?</H2>
+
+<P>Now that we have the tree, how many cities are there in our world?
+
+<P><PRE>(define (<A NAME="g12"></A>count-leaves tree)
+  (if (leaf? tree)
+      1
+      (reduce + (map count-leaves (children tree)))))
+
+(define (<A NAME="g13"></A>leaf? node)
+  (null? (children node)))
+
+&gt; (count-leaves world-tree)
+27
+</PRE>
+
+<P>At first glance, this may seem like a simple case of recursion, with
+<CODE>count-leaves</CODE> calling <CODE>count-leaves</CODE>.  But since what looks like a
+single recursive call is really a call to <CODE>map</CODE>, it is equivalent to
+<EM>several</EM> recursive calls, one for each child of the given tree node.
+
+<P><H2>Mutual Recursion</H2>
+
+In Chapter 14 we wrote recursive
+procedures that were equivalent to using higher-order functions.  Let's do
+the same for <CODE>count-leaves</CODE>.
+
+<P><PRE>(define (<A NAME="g14"></A>count-leaves tree)
+  (if (leaf? tree)
+      1
+      (count-leaves-in-forest (children tree))))
+
+(define (<A NAME="g15"></A>count-leaves-in-forest forest)
+  (if (null? forest)
+      0
+      (+ (count-leaves (car forest))
+         (count-leaves-in-forest (cdr forest)))))
+</PRE>
+
+<P>Note that <CODE>count-leaves</CODE> calls <CODE>count-leaves-in-forest</CODE>,
+and <CODE>count-leaves-</CODE><CODE>in-</CODE><CODE>forest</CODE> calls
+<CODE>count-leaves</CODE>.  This pattern is called <EM><A NAME="g16"></A><A NAME="g17"></A>mutual recursion.</EM>
+
+<P>Mutual recursion is often a useful technique for dealing with trees.  In the
+typical recursion we've seen before this chapter, we've moved sequentially
+through a list or sentence, with each recursive call taking us one step to
+the right.  In the following paragraphs we present three different models to
+help you think about how the shape of a tree gives rise to a mutual recursion.
+
+<P>In the first model, we're going to think of <CODE>count-leaves</CODE> as an
+initialization procedure, and <CODE>count-leaves-in-forest</CODE> as its helper
+procedure.  Suppose we want to count the leaves of a tree.  Unless the
+argument is a very shallow<A NAME="text2" HREF="trees.html#ft2">[2]</A> tree, this will involve counting the leaves of all
+of the children of that tree.  What we want is a straightforward sequential
+recursion over the list of children.  But we're given the wrong argument: 
+the tree itself, not its list of children.  So we need an initialization
+procedure, <CODE>count-leaves</CODE>, whose job is to extract the list of children
+and invoke a helper procedure, <CODE>count-leaves-in-forest</CODE>, with that list
+as argument.
+
+<P>The helper procedure follows the usual sequential list pattern:  Do
+something to the <CODE>car</CODE> of the list, and recursively handle the <CODE>cdr</CODE>
+of the list.  Now, what do we have to do to the <CODE>car</CODE>?  In the usual
+sequential recursion, the <CODE>car</CODE> of the list is something simple, such as
+a word.  What's special about trees is that here the <CODE>car</CODE> is itself a
+tree, just like the entire data structure we started with.  Therefore, we
+must invoke a procedure whose domain is trees: <CODE>count-leaves</CODE>.
+
+<P>This model is built on two ideas.  One is the idea of the domain of a
+function; the reason we need two procedures is that we need one that takes a
+tree as its argument and one that takes a list of trees as its argument.
+The other idea is the leap of faith; we assume that the invocation of <CODE>count-leaves</CODE> within <CODE>count-leaves-in-forest</CODE> will correctly handle each
+child without tracing the exact sequence of events.
+
+<P>The second model is easier to state but less rigorous.  Because of the
+two-dimensional nature of trees, in order to visit every node we have to be
+able to move in two different directions.  From a given node we have to be
+able to move <EM>down</EM> to its children, but from each child we must be
+able to move <EM>across</EM> to its next sibling.
+
+<P>The job of <CODE>count-leaves-in-forest</CODE> is to move from left to right
+through a list of children.  (It does this using the more familiar kind of
+recursion, in which it invokes itself directly.)  The downward motion
+happens in <CODE>count-leaves</CODE>, which moves down one level by invoking <CODE>children</CODE>.  How does the program move down more than one level?  At each
+level, <CODE>count-leaves</CODE> is invoked recursively from <CODE>count-leaves-in-forest</CODE>.
+
+<P>The third model is also based on the two-dimensional nature of trees.
+Imagine for a moment that each node in the tree has at most one child.
+In that case, <CODE>count-leaves</CODE> could move from the root down to the single
+leaf with a structure very similar to the actual procedure, but carrying out
+a sequential recursion:
+
+<P><PRE>(define (count-leaf tree)
+  (if (leaf? tree)
+      1
+      (count-leaf (child tree))))
+</PRE>
+
+<P>The trouble with this, of course, is that at each downward
+step there isn't a single &quot;next&quot; node.  Instead of a single path from the
+root to the leaf, there are multiple paths from the root to many leaves.
+To make our idea of downward motion through sequential recursion work in a
+real tree, at each level we must &quot;clone&quot; <CODE>count-leaves</CODE> as many times
+as there are children.  <CODE>Count-leaves-in-forest</CODE> is the factory that
+manufactures the clones.  It hires one <CODE>count-leaves</CODE> little person for
+each child and accumulates their results.
+
+<P>The key point in recursion on trees is that each child of a tree is itself a
+perfectly good tree.  This recursiveness in the nature of trees gives rise
+to a very recursive structure for programs that use trees.  The reason we
+say &quot;very&quot; recursive is that each invocation of <CODE>count-leaves</CODE> causes
+not just one but several recursive invocations, one for each child, by way
+of <CODE>count-leaves-in-forest</CODE>.
+
+<P>In fact, we use the name <EM><A NAME="g18"></A><A NAME="g19"></A>tree recursion</EM> for any situation
+in which a procedure invocation results in more than one recursive call,
+even if there isn't an argument that's a tree.  The computation of Fibonacci
+numbers from Chapter 13 is an example of a tree recursion with no
+tree.  The <CODE>car</CODE>-<CODE>cdr</CODE> recursions in Chapter 17 are also
+tree recursions; any structured list-of-lists has a somewhat tree-like,
+two-dimensional character even though it doesn't use the formal mechanisms
+we're exploring in this chapter.
+The <CODE>cdr</CODE> recursion is a &quot;horizontal&quot; one, moving from one element
+to another within the same list; the <CODE>car</CODE> recursion is a &quot;vertical&quot;
+one, exploring a sublist of the given list.
+
+<P>
+<P><H2>Searching for a Datum in the Tree</H2>
+
+<P>Procedures that explore trees aren't always as simple as <CODE>count-leaves</CODE>.
+We started with that example because we could write it using higher-order
+functions, so that you'd understand the structure of the problem before we
+had to take on the complexity of mutual recursion.  But many tree problems
+don't quite fit our higher-order functions.
+
+<P>For example, let's write a predicate <CODE>in-tree?</CODE> that takes the name of
+a place and a tree as arguments and tells whether or not that place is in the
+tree.  It <EM>is</EM> possible to make it work with <CODE>filter</CODE>:
+
+<P><PRE>(define (<A NAME="g20"></A>in-tree? place tree)
+  (or (equal? place (datum tree))
+      (not (null? (filter (lambda (subtree) (in-tree? place subtree))
+			  (children tree))))))
+</PRE>
+
+<P>This awkward construction, however, also performs
+unnecessary computation.  If the place we're looking for happens to be in the
+first child of a node, <CODE>filter</CODE> will nevertheless look in all the other
+children as well.  We can do better by replacing the use of <CODE>filter</CODE>
+with a mutual recursion:
+
+<P><PRE>(define (<A NAME="g21"></A>in-tree? place tree)
+  (or (equal? place (datum tree))
+      (in-forest? place (children tree))))
+
+(define (<A NAME="g22"></A>in-forest? place forest)
+  (if (null? forest)
+      #f
+      (or (in-tree? place (car forest))
+	  (in-forest? place (cdr forest)))))
+
+&gt; (in-tree? 'abergavenny world-tree)
+#T
+
+&gt; (in-tree? 'abbenay world-tree)
+#F
+
+&gt; (in-tree? 'venezia (cadr (children world-tree)))
+#F
+</PRE>
+
+<P>Although any mutual recursion is a little tricky to read, the structure
+of this program does fit the way we'd describe the algorithm in
+English.  A place is in a tree if one of two conditions holds: the place is
+the datum at the root of the tree, or the place is (recursively) in one of
+the child trees of this tree.  That's what <CODE>in-tree?</CODE> says.  As for
+<CODE>in-forest?</CODE>, it says that a place is in one of a group of trees if
+the place is in the first tree, or if it's in one of the remaining trees.
+
+<P><H2>Locating a Datum in the Tree</H2>
+
+<P>Our next project is similar to the previous one, but a little more
+intricate.  We'd like to be able to locate a city and find out all of
+the larger regions that enclose the city.  For example, we want to say
+
+<P><PRE>&gt; (locate 'berkeley world-tree)
+(WORLD (UNITED STATES) CALIFORNIA BERKELEY)
+</PRE>
+
+<P>Instead of just getting a yes-or-no answer about whether a city is
+in the tree, we now want to find out <EM>where</EM> it is.
+
+<P>The algorithm is recursive:  To look for Berkeley within the world, we need
+to be able to look for Berkeley within any subtree.  The <CODE>world</CODE> node
+has several children (countries).  <CODE>Locate</CODE> recursively asks each of
+those children to find a path to Berkeley.  All but one of the children
+return <CODE>#f</CODE>, because they can't find Berkeley within their territory.
+But the <CODE>(united states)</CODE> node returns
+
+<P><PRE>((UNITED STATES) CALIFORNIA BERKELEY)
+</PRE>
+
+<P>To make a complete path, we just prepend the name of the current
+node, <CODE>world</CODE>, to this path.  What happens when <CODE>locate</CODE> tries to
+look for Berkeley in Australia?  Since all of Australia's children return
+<CODE>#f</CODE>, there is no path to Berkeley from Australia, so <CODE>locate</CODE>
+returns <CODE>#f</CODE>.
+
+<P><PRE>(define (<A NAME="g23"></A>locate city tree)
+  (if (equal? city (datum tree))
+      (list city)
+      (let ((subpath (locate-in-forest city (children tree))))
+        (if subpath
+            (cons (datum tree) subpath)
+            #f))))
+
+(define (<A NAME="g24"></A>locate-in-forest city forest)
+  (if (null? forest)
+      #f
+      (or (locate city (car forest))
+	  (locate-in-forest city (cdr forest)))))
+</PRE>
+
+<P>Compare the structure of <CODE>locate</CODE> with that of <CODE>in-tree?</CODE>.  The
+helper procedures <CODE>in-forest?</CODE> and <CODE>locate-in-forest</CODE> are almost
+identical.  The main procedures look different, because <CODE>locate</CODE> has a
+harder job, but both of them check for two possibilities:  The city might be
+the datum of the argument node, or it might belong to one of the child trees.
+
+<P><H2>Representing Trees as Lists</H2>
+
+<P>We've done a lot with trees, but we haven't yet talked about the way Scheme
+stores trees internally.  How do <CODE>make-node</CODE>, <CODE>datum</CODE>, and <CODE>children</CODE> work?  It turns out to be very convenient to represent trees in
+terms of lists.
+
+<P><PRE>(define (<A NAME="g25"></A>make-node datum children)
+  (cons datum children))
+
+(define (<A NAME="g26"></A>datum node)
+  (car node))
+
+(define (<A NAME="g27"></A>children node)
+  (cdr node))
+</PRE>
+
+<P>In other words, a tree is a list whose first element is the datum
+and whose remaining elements are subtrees.
+
+<P>
+<PRE>&gt; world-tree
+(WORLD
+   (ITALY (VENEZIA) (RIOMAGGIORE) (FIRENZE) (ROMA))
+   ((UNITED STATES)
+    (CALIFORNIA (BERKELEY) ((SAN FRANCISCO)) (GILROY))
+    (MASSACHUSETTS (CAMBRIDGE) (AMHERST) (SUDBURY))
+    (OHIO (KENT)))
+   (ZIMBABWE (HARARE) (HWANGE))
+   (CHINA (BEIJING) (SHANGHAI) (GUANGSZHOU) (SUZHOW))
+   ((GREAT BRITAIN)
+    (ENGLAND (LIVERPOOL))
+    (SCOTLAND (EDINBURGH) (GLASGOW) ((GRETNA GREEN)))
+    (WALES (ABERGAVENNY)))
+   (AUSTRALIA
+    (VICTORIA (MELBOURNE))
+    ((NEW SOUTH WALES) (SYDNEY))
+    (QUEENSLAND (CAIRNS) ((PORT DOUGLAS))))
+   (HONDURAS (TEGUCIGALPA)))
+</PRE>
+
+<P><PRE>&gt; (car (children world-tree))
+(ITALY (VENEZIA) (RIOMAGGIORE) (FIRENZE) (ROMA))
+</PRE>
+
+<P>Ordinarily, however, we're not going to print out trees in their entirety.
+As in the <CODE>locate</CODE> example, we'll extract just some subset of the
+information and put it in a more readable form.
+
+<P><H2>Abstract Data Types</H2>
+
+<P>The procedures <CODE>make-node</CODE>, <CODE>datum</CODE>, and <CODE>children</CODE> define an
+<A NAME="g28"></A><A NAME="g29"></A>abstract data type for trees.  Using this ADT,
+<A NAME="g30"></A>
+<A NAME="g31"></A>
+we were able to write several useful procedures to manipulate trees before
+pinning down exactly how a tree is represented as a Scheme list.
+
+<P>Although it would be possible to refer to the parts of a node by using <CODE>car</CODE> and <CODE>cdr</CODE> directly, your programs will be more readable if you use
+the ADT-specific selectors and constructors.  Consider this example:
+
+<P><PRE>(in-tree? 'venezia (caddr world-tree))
+</PRE>
+
+<P>What does <CODE>caddr</CODE> mean in this context?  Is the <CODE>caddr</CODE> of
+a tree a datum?  A child?  A forest?  Of course you could work it out by
+careful reasoning, but the form in which we presented this example
+originally was much clearer:
+
+<P><PRE>(in-tree? 'venezia (cadr (children world-tree)))
+</PRE>
+
+<P>Even better would be
+
+<P><PRE>(in-tree? 'venezia (list-ref (children world-tree) 1))
+</PRE>
+
+<P>Using the appropriate selectors and constructors is called <EM>respecting</EM> the data abstraction.  Failing to use the appropriate
+selectors and constructors is called a <EM>data abstraction
+violation.</EM>
+
+<P>Since we wrote the selectors and constructor for trees ourselves, we
+could have defined them to use some different representation:
+
+<P><PRE>(define (make-node datum children)
+  (list 'the 'node 'with 'datum datum 'and 'children children))
+
+(define (datum node) (list-ref node 4))
+
+(define (children node) (list-ref node 7))
+</PRE>
+
+<P><PRE>&gt; (make-node 'italy (cities '(venezia riomaggiore firenze roma)))
+(THE NODE WITH DATUM ITALY AND CHILDREN
+     ((THE NODE WITH DATUM VENEZIA AND CHILDREN ())
+      (THE NODE WITH DATUM RIOMAGGIORE AND CHILDREN ())
+      (THE NODE WITH DATUM FIRENZE AND CHILDREN ())
+      (THE NODE WITH DATUM ROMA AND CHILDREN ())))
+</PRE>
+
+<P>You might expect that this change in the representation would
+require changes to all the procedures we wrote earlier, such as <CODE>count-leaves</CODE>.  But in fact, those procedures would continue to work
+perfectly because they don't see the representation.  (They respect the data
+abstraction.)  As long as <CODE>datum</CODE> and <CODE>children</CODE> find the right
+information, it doesn't matter how the trees are stored.  All that matters
+is that the constructors and selectors have to be compatible with each other.
+
+<P>On the other hand, the example in this section in which we violated the data
+abstraction by using <CODE>caddr</CODE> to find the second child of <CODE>world-tree</CODE> would fail if we changed the representation.  Many cases like
+this one, in which formerly working programs failed after a change in
+representation, led programmers to use such moralistic terms as
+&quot;respecting&quot; and &quot;violating&quot; data abstractions.<A NAME="text3" HREF="trees.html#ft3">[3]</A>
+
+<P> 
+<H2>An Advanced Example: Parsing Arithmetic Expressions</H2>
+
+<P>Consider the notation for arithmetic expressions.  Scheme uses <EM><A NAME="g32"></A>
+<A NAME="g33"></A>
+prefix</EM> notation: <CODE>(+ 3 4)</CODE>.  By contrast, people who aren't Scheme
+programmers generally represent arithmetic computations using an <EM>infix</EM> notation, in which the function symbol goes between two
+arguments: 3+4.
+
+<P>Our goal in this section is to translate an infix arithmetic expression into
+a tree representing the computation.  This translation process is called
+<EM>parsing</EM> the expression.  For example, we'll turn the expression
+
+<P><P><CENTER>4+3&times;7-5/(3+4)+6</CENTER><P>
+
+into the tree
+
+<P><CENTER><IMG SRC="../ss-pics/parse0.jpg" ALT="figure: parse0"></CENTER>
+
+<P>The point of using a tree is that it's going to be very easy to perform the
+computation once we have it in tree form.  In the original infix form, it's
+hard to know what to do first, because there are <EM>precedence</EM> rules
+that determine an implicit grouping:  Multiplication and division come
+before addition and subtraction; operations with the same precedence are
+done from left to right.  Our sample expression is equivalent to
+
+<P><P><CENTER>(((4 + (3 &times; 7)) - (5 / (3+4))) + 6)</CENTER><P>
+
+In the tree representation, it's easy to see that the operations
+nearer the leaves are done first; the root node is the last operation,
+because it depends on the results of lower-level operations.
+
+<P>Our program will take as its argument an infix arithmetic expression in the
+form of a list:
+
+<P><PRE>&gt; (parse '(4 + 3 * 7 - 5 / (3 + 4) + 6))
+</PRE>
+
+<P>Each element of the list must be one of three things: a number;
+one of the four symbols <CODE>+</CODE>, <CODE>-</CODE>, <CODE>*</CODE>, or <CODE>/</CODE>; or a sublist
+(such as the three-element list <CODE>(3 + 4)</CODE> in this example) satisfying
+the same rule.  (You can imagine that we're implementing a pocket
+calculator.  If we were implementing a computer programming language, then
+we'd also accept variable names as operands.  But we're not bothering with
+that complication because it doesn't really affect the part of the problem
+about turning the expression into a tree.)
+
+<P>What makes this problem tricky is that we can't put the list elements into
+the tree as soon as we see them.  For example, the first three elements of
+our sample list are <CODE>4</CODE>, <CODE>+</CODE>, and <CODE>3</CODE>.  It's tempting to build a
+subtree of those three elements:
+
+<P><CENTER><IMG SRC="../ss-pics/fourplus.jpg" ALT="figure: fourplus"></CENTER>
+
+<P>But if you compare this picture with the earlier picture of the
+correct tree, you'll see that the second argument to this <CODE>+</CODE> invocation
+isn't the number <CODE>3</CODE>, but rather the subexpression <CODE>3 * 7</CODE>.
+
+<P>By this reasoning you might think that we have to examine the entire
+expression before we can start building the tree.  But in fact we can
+sometimes build a subtree with confidence.  For example, when we see the
+minus sign in our sample expression, we can tell that the subexpression
+<CODE>3 * 7</CODE> that comes before it is complete, because <CODE>*</CODE> has higher
+precedence than <CODE>-</CODE> does.
+
+<P>Here's the plan.  The program will examine its argument from left to right.
+Since the program can't finish processing each list element right away, it
+has to maintain information about the elements that have been examined but
+not entirely processed.  It's going to be easier to maintain that
+information in two parts: one list for still-pending operations and another
+for still-pending operands.  Here are the first steps in parsing our sample
+expression; the program examines the elements of the argument, putting
+numbers onto the operand list and operation symbols onto the operation
+list:<A NAME="text4" HREF="trees.html#ft4">[4]</A>
+
+<P>
+<CENTER><IMG SRC="../ss-pics/parse1.jpg" ALT="figure: parse1"></CENTER>
+
+<P>At this point, the program is looking at the <CODE>*</CODE> operator in the infix
+expression.  If this newly seen operator had lower precedence than the <CODE>+</CODE> that's already at the head of the list of operations, then it would be
+time to carry out the <CODE>+</CODE> operation by creating a tree with <CODE>+</CODE> at
+the root and the first two operands in the list as its children.  Instead,
+since <CODE>*</CODE> has higher precedence than <CODE>+</CODE>, the program isn't ready to
+create a subtree but must instead add the <CODE>*</CODE> to its operation list.
+
+<P><CENTER><IMG SRC="../ss-pics/parse2.jpg" ALT="figure: parse2"></CENTER>
+
+<P>This time, the newly seen <CODE>-</CODE> operation has lower precedence than the
+<CODE>*</CODE> at the head of the operation list.  Therefore, it's time for the
+program to <EM>handle</EM> the <CODE>*</CODE> operator, by making a subtree
+containing that operator and the first two elements of the operand list.
+This new subtree becomes the new first element of the operand list.
+
+<P><CENTER><IMG SRC="../ss-pics/parse3.jpg" ALT="figure: parse3"></CENTER>
+
+<P>Because the program decided to handle the waiting <CODE>*</CODE> operator, it still
+hasn't moved the <CODE>-</CODE> operator from the infix expression to the operator
+list.  Now the program must compare <CODE>-</CODE> with the <CODE>+</CODE> at the head of
+the list.  These two operators have the same precedence.  Since we want to
+carry out same-precedence operators from left to right, it's time to handle
+the <CODE>+</CODE> operator.
+
+<P><CENTER><IMG SRC="../ss-pics/parse4.jpg" ALT="figure: parse4"></CENTER>
+
+<P>Finally the program can move the <CODE>-</CODE> operator onto the operator list.
+The next several steps are similar to ones we've already seen.
+
+<P><CENTER><IMG SRC="../ss-pics/parse5.jpg" ALT="figure: parse5"></CENTER>
+
+
+<P>This is a new situation: The first unseen element of the infix expression is
+neither a number nor an operator, but a sublist.  We recursively <CODE>parse</CODE>
+this subexpression, adding the resulting tree to the operand list.
+
+<P><CENTER><IMG SRC="../ss-pics/parse6.jpg" ALT="figure: parse6"></CENTER>
+
+<P>Then we proceed as before, processing the <CODE>/</CODE> because it has higher
+precedence than the <CODE>+</CODE>, then the <CODE>-</CODE> because it has the same
+priority as the <CODE>+</CODE>, and so on.
+
+<P><CENTER><IMG SRC="../ss-pics/parse7.jpg" ALT="figure: parse7"></CENTER>
+
+<P>Once the program has examined every element of the infix expression, the
+operators remaining on the operator list must be handled.  In this case
+there is only one such operator.  Once the operators have all been handled,
+there should be one element remaining on the operand list; that element is
+the desired tree for the entire original expression.
+
+<P><CENTER><IMG SRC="../ss-pics/parse8.jpg" ALT="figure: parse8"></CENTER>
+
+<P>The following program implements this algorithm.  It works only for correctly
+formed infix expressions; if given an argument like <CODE>(3 + *)</CODE>, it'll give
+an incorrect result or a Scheme error.
+
+<P><PRE>(define (<A NAME="g34"></A>parse expr)
+  (parse-helper expr '() '()))
+
+(define (parse-helper expr operators operands)
+  (cond ((null? expr)
+	 (if (null? operators)
+	     (car operands)
+	     (handle-op '() operators operands)))
+	((number? (car expr))
+	 (parse-helper (cdr expr)
+		       operators
+		       (cons (make-node (car expr) '()) operands)))
+	((list? (car expr))
+	 (parse-helper (cdr expr)
+		       operators
+		       (cons (parse (car expr)) operands)))
+	(else (if (or (null? operators)
+		      (&gt; (precedence (car expr))
+			 (precedence (car operators))))
+		  (parse-helper (cdr expr)
+				(cons (car expr) operators)
+				operands)
+		  (handle-op expr operators operands)))))
+
+(define (handle-op expr operators operands)
+  (parse-helper expr
+		(cdr operators)
+		(cons (make-node (car operators)
+				 (list (cadr operands) (car operands)))
+		      (cddr operands))))
+
+(define (precedence oper)
+  (if (member? oper '(+ -)) 1 2))
+</PRE>
+
+<P>We promised that after building the tree it would be easy to compute the
+value of the expression.  Here is the program to do that:
+
+<P><PRE>(define (<A NAME="g35"></A>compute tree)
+  (if (number? (datum tree))
+      (datum tree)
+      ((function-named-by (datum tree))
+         (compute (car (children tree)))
+         (compute (cadr (children tree))))))
+
+(define (function-named-by oper)
+  (cond ((equal? oper '+) +)
+	((equal? oper '-) -)
+	((equal? oper '*) *)
+	((equal? oper '/) /)
+	(else (error &quot;no such operator as&quot; oper))))
+
+&gt; (compute (parse '(4 + 3 * 7 - 5 / (3 + 4) + 6)))
+30.285714285714
+</PRE>
+
+<P><H2>Pitfalls</H2>
+
+<P>A leaf node is a perfectly good actual argument to a tree procedure,
+even though the picture of a leaf node doesn't look treeish because there
+aren't any branches.  A common mistake is to make the base case of the
+recursion be a node whose children are leaves, instead of a node that's a
+leaf itself.
+
+<P>
+<P>The value returned by <CODE>children</CODE> is not a tree, but a forest.  It's
+therefore not a suitable actual argument to a procedure that expects a tree.
+
+<P><H2>Exercises</H2>
+
+<P><B>18.1</B>&nbsp;&nbsp;What does
+
+<P><PRE>((SAN FRANCISCO))
+</PRE>
+
+<P>mean in the printout of <CODE>world-tree</CODE>?  Why two sets of
+parentheses?
+
+
+<P>
+<P>
+<B>18.2</B>&nbsp;&nbsp;Suppose we change the definition of the tree constructor so that it
+uses <CODE>list</CODE> instead of <CODE>cons</CODE>:
+
+<P><PRE>(define (make-node datum children)
+  (list datum children))
+</PRE>
+
+<P>How do we have to change the selectors so that everything still works?
+
+
+<P>
+<B>18.3</B>&nbsp;&nbsp;Write <CODE><A NAME="g36"></A>depth,</CODE> a procedure that takes a tree as argument and
+returns the largest number of nodes connected through parent-child links.
+That is, a leaf node has depth 1; a tree in which all the children of the
+root node are leaves has depth 2.  Our world tree has depth 4 (because the
+longest path from the root to a leaf is, for example, world, country, state,
+city).
+
+
+<P>
+<B>18.4</B>&nbsp;&nbsp;Write <CODE><A NAME="g37"></A>count-nodes</CODE>, a procedure that takes a tree as argument
+and returns the total number of nodes in the tree.  (Earlier we counted the
+number of <EM>leaf</EM> nodes.)
+
+
+<P>
+<B>18.5</B>&nbsp;&nbsp;Write <CODE><A NAME="g38"></A>prune</CODE>, a procedure that takes a tree as argument and
+returns a copy of the tree, but with all the leaf nodes of the original tree
+removed.  (If the argument to <CODE>prune</CODE> is a one-node tree, in which the
+root node has no children, then <CODE>prune</CODE> should return <CODE>#f</CODE> because
+the result of removing the root node wouldn't be a tree.)
+
+
+<P>
+
+<B>18.6</B>&nbsp;&nbsp;Write a program <CODE><A NAME="g39"></A>parse-scheme</CODE> that parses a Scheme arithmetic
+expression into the same kind of tree that <CODE>parse</CODE> produces for infix
+expressions.  Assume that all procedure invocations in the Scheme expression
+have two arguments.
+
+<P>The resulting tree should be a valid argument to <CODE>compute</CODE>:
+
+<P><PRE>&gt; (compute (parse-scheme '(* (+ 4 3) 2)))
+14
+</PRE>
+
+<P>(You can solve this problem without the restriction to
+two-argument invocations if you rewrite <CODE>compute</CODE> so that it doesn't
+assume every branch node has two children.)
+
+
+<P>
+
+<HR>
+<A NAME="ft1" HREF="trees.html#text1">[1]</A> Contrariwise, the tree metaphor is also part of the
+terminology of families.<P>
+<A NAME="ft2" HREF="trees.html#text2">[2]</A> You probably think of trees as being short
+or tall.  But since our trees are upside-down, the convention is to call
+them shallow or deep.<P>
+<A NAME="ft3" HREF="trees.html#text3">[3]</A> Another example
+of a data abstraction violation is in Chapter 16.  When <CODE>match</CODE> creates
+an empty known-values database, we didn't use a constructor.  Instead, we
+merely used a quoted empty sentence:
+
+<P><PRE>(define (match pattern sent)
+  (match-using-known-values pattern sent '()))
+</PRE><P>
+<A NAME="ft4" HREF="trees.html#text4">[4]</A> Actually, as we'll see shortly, the elements of the operand
+list are trees, so what we put in the operand list is a one-node tree whose
+datum is the number.<P>
+<P><A HREF="../ss-toc2.html">(back to Table of Contents)</A><P>
+<A HREF="../ssch17/lists.html"><STRONG>BACK</STRONG></A>
+chapter thread <A HREF="../ssch19/implement-hof.html"><STRONG>NEXT</STRONG></A>
+
+<P>
+<ADDRESS>
+<A HREF="../index.html">Brian Harvey</A>, 
+<CODE>bh@cs.berkeley.edu</CODE>
+</ADDRESS>
+</BODY>
+</HTML>
author	elioat <elioat@tilde.institute>	2023-08-23 07:52:19 -0400
committer	elioat <elioat@tilde.institute>	2023-08-23 07:52:19 -0400
commit	562a9a52d599d9a05f871404050968a5fd282640 (patch)
tree	7d3305c1252c043bfe246ccc7deff0056aa6b5ab /js/games/nluqo.github.io/~bh/ssch18/trees.html
parent	5d012c6c011a9dedf7d0a098e456206244eb5a0f (diff)
download	tour-562a9a52d599d9a05f871404050968a5fd282640.tar.gz