<HTML>
<HEAD>
<TITLE>Computer Science Logo Style vol 1 ch 11: Recursive Operations</TITLE>
</HEAD>
<BODY>
<CITE>Computer Science Logo Style</CITE> volume 1:
<CITE>Symbolic Computing</CITE> 2/e Copyright (C) 1997 MIT
<H1>Recursive Operations</H1>
<TABLE width="100%"><TR><TD>
<IMG SRC="../csls1.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/v1ch11.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../v1-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="../v1ch10/v1ch10.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v1ch12/v1ch12.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-1">MIT
Press web page for Computer Science Logo Style</A>
</TABLE></TABLE>
<HR>
<P>
So far, the recursive procedures we've seen have all been commands, not
operations. Remember that an operation is a procedure that has an <EM>
output</EM> rather than an <EM>effect.</EM> In other words, an operation
computes some value that is then used as the input to some other procedure.
In the instruction
<P>
<PRE>print first "hello
</PRE>
<P><CODE>print</CODE> is a command, because it does something: It prints its
input (whatever that may be) on the screen. But <CODE>first</CODE> is an
operation, because it computes something: With the word <CODE>hello</CODE> as
input, it computes the letter <CODE>h</CODE>, which is the first letter of
the input.
<P><H2>A Simple Substitution Cipher</H2>
<P>
I'm going to write a program to produce secret messages. The program
will take an ordinary English sentence (in the form of a Logo list)
and change each letter into some other letter. For example, we can
decide to replace the letter E with the letter J every time it occurs
in the message. The program will need two inputs: the message
and the correspondence between letters. The latter will take the form
of a word of 26 letters, representing the coded versions of the 26
letters in alphabetical order. For example, the word
<P>
<PRE>qwertyuiopasdfghjklzxcvbnm
</PRE>
<P>indicates that the letter A in the original text will be represented
by Q in the secret version, B will be represented by W, and so on.
<P>In order to encipher a sentence, we must go through it word by word.
(Strictly speaking, what we're doing is called a <EM>cipher</EM> rather
than a <EM>code</EM> because the latter is a system that substitutes
something for an entire word at a time, like a foreign language,
whereas we're substituting for a single letter at a time, like the
Elvish alphabet in <EM>The Lord of the Rings.</EM>) In order to encipher
a word we must go through it letter by letter. So I'll begin by
writing a procedure to translate a single letter to its coded form.
<P>
<PRE>to codelet :letter :code
output codematch :letter "abcdefghijklmnopqrstuvwxyz :code
end
to codematch :letter :clear :code
if emptyp :clear [output :letter] ; punctuation character
if equalp :letter first :clear [output first :code]
output codematch :letter butfirst :clear butfirst :code
end
</PRE>
<P><CODE>Codelet</CODE> is an operation that takes two inputs. The first input
must be a single-letter word, and the second must be a code, that is,
a word with the 26 letters of the alphabet rearranged. The output
from <CODE>codelet</CODE> is the enciphered version of the input letter. (If
the first input is a character other than a letter, such as a punctuation
mark, then the output is the same as that input.)
<P><CODE>Codelet</CODE> itself is a very simple procedure. It simply passes its
two inputs on to a subprocedure, <CODE>codematch</CODE>, along with another
input that is the alphabet in normal order. The idea is that <CODE>
codematch</CODE> will compare the input letter to each of the letters in the
regular alphabet; when it finds a match, it will output the letter in
the corresponding position in the scrambled alphabet. Be sure you
understand the use of the <CODE>output</CODE> command in <CODE>codelet</CODE>; it
says that whatever <CODE>codematch</CODE> outputs should become the output
from <CODE>codelet</CODE> as well.
<P>The job of <CODE>codematch</CODE> is to go through the alphabet, letter by
letter, looking for the particular letter we're trying to encode. The
primary tool that Logo provides for looking at a single letter in a
word is <CODE>first</CODE>. So <CODE>codematch</CODE> uses <CODE>first</CODE> to compare
its input letter with the first letter of the input alphabet:
<P><PRE>if equalp :letter first :clear ...
</PRE>
<P>If the first input to <CODE>codematch</CODE> is the letter A, then <CODE>
equalp</CODE> will output <CODE>true</CODE> and <CODE>codematch</CODE> will output the
first letter of <CODE>:code</CODE> (Q in the example I gave earlier). But
suppose the first input isn't an A. Then <CODE>codematch</CODE> has to solve
a smaller subproblem: Find the input letter in the remaining 25
letters of the alphabet. Finding a smaller, similar subproblem means
that we can use a recursive solution. <CODE>Codematch</CODE> invokes itself,
but for its second and third inputs it uses the <CODE>butfirst</CODE>s of the
original inputs because the first letter of the alphabet (A) and its
corresponding coded letter (Q) have already been rejected.
<P>
Here is a trace of an example of <CODE>codematch</CODE> at work, to help you
understand what's going on.
<P><PRE>codematch "e "abcdefghijklmnopqrstuvwxyz "qwertyuiopasdfghjklzxcvbnm
codematch "e "bcdefghijklmnopqrstuvwxyz "wertyuiopasdfghjklzxcvbnm
codematch "e "cdefghijklmnopqrstuvwxyz "ertyuiopasdfghjklzxcvbnm
codematch "e "defghijklmnopqrstuvwxyz "rtyuiopasdfghjklzxcvbnm
codematch "e "efghijklmnopqrstuvwxyz "tyuiopasdfghjklzxcvbnm
codematch outputs "t
codematch outputs "t
codematch outputs "t
codematch outputs "t
codematch outputs "t
</PRE>
<P>The fifth, innermost invocation of <CODE>codematch</CODE> succeeds
at matching its first input (the letter E) with the first letter of
its second input. That invocation therefore outputs the first letter
of its third input, the letter T. Each of the higher-level
invocations outputs the same thing in turn.
<P>The pattern of doing something to the <CODE>first</CODE> of an input, then
invoking the same procedure recursively with the <CODE>butfirst</CODE> as the
new input, is a familiar one from recursive commands. If we only
wanted to translate single letters, we could have written <CODE>
codelet</CODE> and <CODE>codematch</CODE> as commands, like this:
<P><PRE>to codelet :letter :code ;; command version
codematch :letter "abcdefghijklmnopqrstuvwxyz :code
end
to codematch :letter :clear :code ;; command version
if emptyp :clear [print :letter stop]
if equalp :letter first :clear [print first :code stop]
codematch :letter butfirst :clear butfirst :code
end
</PRE>
<P>You may find this version a little easier to understand,
because it's more like the recursive commands we've examined in the
past. But making <CODE>codelet</CODE> an operation is a much stronger
technique. Instead of being required to print the computed code
letter, we can make that letter part of a larger computation. In
fact, we have to do that in order to encipher a complete word. Each
word is made up of letters, and the task of <CODE>codeword</CODE> will be to go
through a word, letter by letter, using each letter as input to <CODE>
codelet</CODE>. The letters output by <CODE>codelet</CODE> must be combined into a
new word, which will be the output from <CODE>codeword</CODE>.
<P>We could write <CODE>codeword</CODE> using the higher order function <CODE>map</CODE>:
<P><PRE>to codeword :word :code ;; using higher order function
output map [codelet ? :code] :word
end
</PRE>
<P>But to help you learn how to write recursive operations, in this
chapter we'll avoid higher order functions. (As it turns out, <CODE>map</CODE>
itself is a recursive operation, written using the techniques of this
chapter.)
<P>Recall the structure of a previous procedure that went through a word
letter by letter:
<P><PRE>to one.per.line :word
if emptyp :word [stop]
print first :word
one.per.line butfirst :word
end
</PRE>
<P>Compare this to the structure of the recursive <CODE>codeword</CODE>:
<P><PRE>to codeword :word :code
if emptyp :word [output "]
output word (codelet first :word :code) (codeword butfirst :word :code)
end
</PRE>
<P>There are many similarities. Both procedures have a
stop rule that tests for an empty input. Both do something to the <CODE>
first</CODE> of the input (either <CODE>print</CODE> or <CODE>codelet</CODE>), and each
invokes itself recursively on the <CODE>butfirst</CODE> of the input.
(<CODE>Codeword</CODE> has an extra input for the code letters, but that
doesn't really change the structure of the procedure. If that's
confusing to you, you could temporarily pretend that <CODE>code</CODE> is a
global variable and eliminate it as an input.)
<P>The differences have to do with the fact that <CODE>codeword</CODE> is an
operation instead of a command. The stop rule invokes <CODE>output</CODE>
rather than <CODE>stop</CODE> and must therefore specify what is to be
output when the stop condition is met. (In this case, when the input
word is empty, the output is also the empty word.) But the main thing
is that the action step (the <CODE>print</CODE> in <CODE>one.per.line</CODE>) and
the recursive call (the <CODE>one.per.line</CODE> instruction) are not two
separate instructions in <CODE>codeword</CODE>. Instead they are
expressions (the two in parentheses) that are combined by <CODE>word</CODE>
to form the complete output. Here's a picture:
<P><CENTER><IMG SRC="codelet.gif" ALT="figure: codelet"></CENTER>
<P>Remember what you learned in Chapter 2 about the way in
which Logo instructions are evaluated. Consider the <CODE>output</CODE>
instruction in <CODE>codeword</CODE>. Before <CODE>output</CODE> can be invoked,
Logo must evaluate its input. That input comes from the output from
<CODE>word</CODE>. Before <CODE>word</CODE> can be invoked, Logo must evaluate <EM>
its</EM> inputs. There are two of them. The first input to <CODE>word</CODE>
is the expression
<P><PRE>codelet first :word :code
</PRE>
<P>This expression computes the coded version of the first
letter of the word we want to translate. The second input to <CODE>
word</CODE> is the expression
<P><PRE>codeword butfirst :word :code
</PRE>
<P>This expression invokes <CODE>codeword</CODE> recursively, solving
the smaller subproblem of translating a smaller word, one with the
first letter removed. When both of these computations are complete,
<CODE>word</CODE> can combine the results to form the translation of the
complete input word. Then <CODE>output</CODE> can output that result.
<P>Here's an example of how <CODE>codeword</CODE> is used.
<P><PRE>? <U>print codeword "hello "qwertyuiopasdfghjklzxcvbnm</U>
itssg
</PRE>
<P>Notice that we have to say <CODE>print</CODE>, not just start the
instruction line with <CODE>codeword</CODE>; a complete instruction
must have a command. Suppose you had the idea of saving all that
typing by changing the <CODE>output</CODE> instruction in <CODE>codeword</CODE> to a
<CODE>print</CODE>. What would happen? The answer is that <CODE>codeword</CODE>
wouldn't be able to invoke itself recursively as an operation. (If
you don't understand that, try it!) Also, it's generally a better idea
to write an operation when you have to compute some result.
That way, you aren't committed to printing the result; you can use it
as part of a larger computation.
<P>For example, right now I'd like to write a procedure <CODE>code</CODE> that
translates an entire sentence into code. Like <CODE>codeword</CODE>, it will
be an operation with two inputs, the second of which is a code (a word
of 26 scrambled letters). The difference is that the first input will
be a sentence instead of a word and the output will also be a sentence.
<P>»Write <CODE>code</CODE> using a higher order function. Then see if you can
write an equivalent recursive version.
<P>Just as <CODE>codeword</CODE> works by splitting up the word into letters,
<CODE>code</CODE> will work by splitting up a sentence into words. The
structure will be very similar. Here it is:
<P><PRE>to code :sent :code
if emptyp :sent [output []]
output sentence (codeword first :sent :code) (code butfirst :sent :code)
end
</PRE>
<P>The main differences are that <CODE>code</CODE> outputs the empty
list, instead of the empty word, for an empty input and that <CODE>sentence</CODE>
is used as the combining operation instead of <CODE>word</CODE>.
Here's an example of <CODE>code</CODE> at work.
<P><PRE>? <U>print code [meet at midnight, under the dock.] ~</U>
<U>"qwertyuiopasdfghjklzxcvbnm</U>
dttz qz dorfouiz, xfrtk zit rgea.
</PRE>
<P><H2>More Procedure Patterns</H2>
<P><CODE>Code</CODE> and <CODE>codeword</CODE> are examples of a very common pattern in
recursive operations: They are like using <CODE>map</CODE> with a particular
function. Here is the pattern that they fit.
<P><PRE><A NAME="map">to <U>procedure</U> :input</A>
if emptyp :input [output :input]
output <U>combiner</U> (<U>something</U> first :input) (<U>procedure</U> butfirst :input)
end
</PRE>
<P>The <EM>combiner</EM> is often <CODE>word</CODE> or <CODE>sentence</CODE>,
although others are possible. In fact, when working with lists, the
most common combiner is not <CODE>sentence</CODE> but another operation that
we haven't used before, <CODE>fput</CODE> (First PUT). <CODE>Fput</CODE> takes two
inputs. The first can be any datum, but the second must be a list.
The output from <CODE>fput</CODE> is a list that is equal to the second
input, except that the first input is inserted as a new first member.
In other words the output from <CODE>fput</CODE> is a list whose <CODE>first</CODE>
is the first input and whose <CODE>butfirst</CODE> is the second input.
<P><PRE>? <U>show sentence [hee hee hee] [ho ho ho]</U>
[hee hee hee ho ho ho]
? <U>show fput [hee hee hee] [ho ho ho]</U>
[[hee hee hee] ho ho ho]
</PRE>
<P><CODE>Fput</CODE> is a good combiner because the two things we want to
combine are the <CODE>first</CODE> and the <CODE>butfirst</CODE> of a list, except that
each has been modified in some way. But the <EM>shape</EM> of the final
result (a list of so many members) should be the same as the shape of the
input, and that's what <CODE>fput</CODE> ensures.
<P>When you're working with sentences--lists of words rather than lists
of lists--<CODE>sentence</CODE> and <CODE>fput</CODE> will work equally well
as the combiner. For example, <CODE>code</CODE> could have been written
using <CODE>fput</CODE> instead of <CODE>sentence</CODE>. Not until some of
the later examples, when we use tree-structured lists,
will the choice really be important.
<P>»<CODE>Fput</CODE> is actually a "more primitive" operation than <CODE>sentence</CODE>,
in the sense that the Logo interpreter actually constructs lists by
doing the internal equivalent of <CODE>fput</CODE>. As an exercise, you
might like to try writing your own versions of list combiners like
<CODE>sentence</CODE> and <CODE>list</CODE> out of <CODE>fput</CODE>, <CODE>first</CODE>, and
<CODE>butfirst</CODE>. You should also be able to write <CODE>last</CODE> and <CODE>
butlast</CODE> using only those three building blocks. (Actually you'll
also need <CODE>if</CODE>, <CODE>emptyp</CODE>, <CODE>wordp</CODE>, and <CODE>output</CODE>,
but you won't need any other primitive combiners.) Give your versions
distinct names, such as <CODE>my.sentence</CODE>, since Logo won't let you
redefine primitives.
<P>»Another "less primitive" primitive is <CODE>lput</CODE>, an operation that
takes two inputs. As for <CODE>fput</CODE>, the first can be any datum but
the second must be a list. The output from <CODE>lput</CODE> is a list whose
<CODE>last</CODE> is the first input and whose <CODE>butlast</CODE> is the second.
Write <CODE>my-lput</CODE> using <CODE>fput</CODE> and the selectors <CODE>first</CODE> and
<CODE>butfirst</CODE>.
<P>It may seem silly to learn a recursive pattern for problems that can be
solved using <CODE>map</CODE>. But sometimes we run into a problem that's <EM>
almost</EM> like a <CODE>map</CODE>, but not exactly. For example, how would you
write the following operation:
<P><PRE>? <U>show pairup [now here after glow worm hole]</U>
[nowhere hereafter afterglow glowworm wormhole]
</PRE>
<P>Instead of the usual <CODE>map</CODE>-like situation in which each word
in the result is a function of one word of the input, this time each word of
the result is a function of <EM>two</EM> neighboring input words. <CODE>Map</CODE>
won't solve this problem, but the <CODE>map</CODE>-like recursion pattern will.
<P><PRE>to pairup :words
if emptyp butfirst :words [output []]
output (sentence (word first :words first butfirst :words)
(pairup butfirst :words))
end
</PRE>
<P>Compare this procedure with the <A HREF="recops.html#map">general pattern</A>;
look for similarities and differences.
<P>»One difference is in the test for the base case. Why is the version
in <CODE>pairup</CODE> different from the one in the pattern?
<P>»Write an operation that interchanges pairs of words in a sentence,
like this:
<P><PRE>? <U>show swap [the rain in spain stays mainly on the plain]</U>
[rain the spain in mainly stays the on plain]
</PRE>
<P>Don't forget to think about that leftover word in an odd-length
sentence!
<P><H2>The <CODE>Filter</CODE> Pattern</H2>
<P>In Chapter 5 we saw this example:
<P><PRE>? <U>show filter "numberp [76 trombones, 4 calling birds, and 8 days]</U>
[76 4 8]
</PRE>
<P>To write a recursive operation <CODE>numbers</CODE> with the same result,
we must handle three cases: the base case, in which the input is empty; the
case in which the first word of the input is a number; and the case in which
the first word isn't a number.
<P><PRE>to numbers :sent
if emptyp :sent [output []]
if numberp first :sent ~
[output sentence first :sent numbers butfirst :sent]
output numbers butfirst :sent
end
? <U>show numbers [76 trombones, 4 calling birds, and 8 days]</U>
[76 4 8]
</PRE>
<P>Here's the general <CODE>filter</CODE> pattern:
<P><PRE>to <U>procedure</U> :input
if emptyp :input [output :input]
if <U>predicate</U> first :input ~
[output <U>combiner</U> first :input <U>procedure</U> butfirst :input]
output <U>procedure</U> butfirst :input
end
</PRE>
<P>As in the case of the <CODE>map</CODE> pattern, this one is most useful
in situations for which the higher order function won't quite do.
<P>»Write an operation that looks for two equal words next to each other
in a sentence, and outputs a sentence with one of them removed:
<P><PRE>? <U>show unique [Paris in the the spring is a joy joy to behold.]</U>
Paris in the spring is a joy to behold.
</PRE>
<P>What does your procedure do with <EM>three</EM> consecutive
equal words? What should it do?
<P><H2>The <CODE>Reduce</CODE> Pattern</H2>
<P>Other examples from Chapter 5 introduced the <CODE>reduce</CODE> higher
order function.
<P><PRE>? <U>show reduce "word [C S L S]</U>
CSLS
? <U>show reduce "sum [3 4 5 6]</U>
18
</PRE>
<P>Recursive operations equivalent to these examples are very much
like the <CODE>map</CODE> pattern except that the combiner function is applied to
the members of the input directly, rather than to some function of the
members of the input:
<P><PRE>to wordify :sentence
if emptyp :sentence [output "]
output word (first :sentence) (wordify butfirst :sentence)
end
to addup :numbers
if emptyp :numbers [output 0]
output sum (first :numbers) (addup butfirst :numbers)
end
</PRE>
<P>What are the differences between these two examples? There are two: the
combiner used and the value output in the base case. Here is the pattern:
<P><PRE>to <U>procedure</U> :input
if emptyp :input [output <U>identity</U>]
output <U>combiner</U> (first :input) (<U>procedure</U> butfirst :input)
end
</PRE>
<P>The identity in this pattern depends on the combiner; it's
the value that, when combined with something else, gives that something else
unchanged as the result. Thus, zero is the identity for <CODE>sum</CODE>, but the
identity for <CODE>product</CODE> would be one.
<P>»Write a <CODE>multiply</CODE> operation that takes a list of numbers as its
input and returns the product of all the numbers.
<P>»You can make your <CODE>multiply</CODE> procedure more efficient, in some
situations, by having it notice when one of the numbers in the input list is
zero. In that case, you can output zero as the overall result without
looking at any more numbers. The resulting procedure will, in a sense,
combine aspects of the <CODE>filter</CODE> and <CODE>reduce</CODE> patterns.
<P><CODE>Addup</CODE> is one example of an important sub-category of <CODE>reduce</CODE>-like
procedures in which the "combining"
operation is arithmetic, usually <CODE>sum</CODE>. The simplest example is a
procedure equivalent to the primitive <CODE>count</CODE>, which counts the
members of a list or the letters of a word:
<P><PRE>to length :thing
if emptyp :thing [output 0]
output 1+length butfirst :thing
end
</PRE>
<P>In this procedure, as usual, we can see the reduction of a
problem to a smaller subproblem. The length of any word or list is
one more than the length of its <CODE>butfirst</CODE>. Eventually this
process of shortening the input will reduce it to emptiness; the
length of an empty word or list is zero.
<P>Although <CODE>count</CODE> is a primitive, there are more complicated
counting situations in which not every member should be counted. For
example, here is a procedure to count the number of vowels in a word:
<P><PRE>to vowelcount :word
if emptyp :word [output 0]
if vowelp first :word [output 1+vowelcount butfirst :word]
output vowelcount butfirst :word
end
to vowelp :letter
output memberp :letter [a e i o u]
end
</PRE>
<P>(Actually, my predicate <CODE>vowelp</CODE> is somewhat
oversimplified. The letter Y is a vowel in certain positions in the
word, and even some other letters can sometimes play the role of a
vowel. But this isn't a book on linguistics!)
<P>You can see the similarities between <CODE>vowelcount</CODE> and <CODE>
length</CODE>. The difference is that, in effect, <CODE>length</CODE> uses a
predicate that is always <CODE>true</CODE>, so it always carries out the
instruction inside the <CODE>if</CODE>. Here's the pattern:
<P><PRE>to <U>procedure</U> :input
if emptyp :input [output 0]
if <U>predicate</U> first :input [output 1+<U>procedure</U> butfirst :input]
output <U>procedure</U> butfirst :input
end
</PRE>
<P>»Try writing a procedure that will accept as
input a word like <CODE>21,997.00</CODE> and output the number of digits
before the decimal point. (In this case the correct output is 5.)
Don't assume that there <EM>is</EM> a decimal point; your program
shouldn't blow up no matter what word it gets as input.
<P>»Another counting problem is to output the position of a member in a
list. This operation is the inverse to <CODE>item</CODE>, a Logo primitive,
which outputs the member at a given position
number. What I'm asking you to write is <CODE>index</CODE>, which works like
this:
<P><PRE>? <U>print index "seven [four score and seven years ago]</U>
4
? <U>print index "v "aardvark</U>
5
</PRE>
<P><H2>The <CODE>Find</CODE> Pattern</H2>
<P>
A variation of the <CODE>filter</CODE> pattern
is for <EM>selection</EM> operations: ones that pick a single element out of a
list. The general idea looks like this:
<P><PRE>to <U>procedure</U> :input
if emptyp :input [output :input]
if <U>predicate</U> first :input [output <U>something</U> first :input]
output <U>procedure</U> butfirst :input
end
</PRE>
<P>There will generally be extra inputs to these procedures, to
indicate the basis for selection. For example, here is a program
that translates English words into French.
<P><PRE>to french :word
output lookup :word [[book livre] [computer ordinateur] [window fenetre]]
end
to lookup :word :dictionary
if emptyp :dictionary [output "]
if equalp :word first first :dictionary [output last first :dictionary]
output lookup :word butfirst :dictionary
end
? <U>print french "computer</U>
ordinateur
</PRE>
<P>The expression
<P><PRE>first first :dictionary
</PRE>
<P>selects the English word from the first word-pair in the list.
Similarly,
<P><PRE>last first :dictionary
</PRE>
<P>selects the French version of the same word. (Of course, in
reality, the word list in <CODE>french</CODE> would be much longer than the
three word-pairs I've shown.)
<P><CODE>Codematch</CODE>, in the example that started this chapter, follows
the same pattern of selection. The only difference is that there are
two inputs that are <CODE>butfirst</CODE>ed in parallel.
<P>
Somewhat similar to the selection pattern is one for a recursive
<EM>predicate;</EM> the difference is that instead of
<P><PRE>output <U>something</U> first :input
</PRE>
<P>for a successful match, the procedure simply says
<P><PRE>output "true
</PRE>
<P>in that case. This pattern is followed by predicates that
ask a question like "Does any member of the input do X?" For
example, suppose that instead of counting the vowels in a word, we
just want to know whether or not there is a vowel. Then we're asking
the question "Is any letter in this word a vowel?" Here's how to
find out.
<P><PRE>to hasvowelp :word
if emptyp :word [output "false]
if vowelp first :word [output "true]
output hasvowelp butfirst :word
end
</PRE>
<P>A more realistic example is also somewhat more cluttered with extra
inputs and sometimes extra end tests. Here's a procedure that takes
two words as input. It outputs <CODE>true</CODE> if the first word comes
before the second in the dictionary.
<P><PRE>to sort.beforep :word1 :word2
if emptyp :word1 [output "true]
if emptyp :word2 [output "false]
if (ascii first :word1) < (ascii first :word2) [output "true]
if (ascii first :word1) > (ascii first :word2) [output "false]
output sort.beforep butfirst :word1 butfirst :word2
end
</PRE>
<P>The procedure will end on one of the <CODE>emptyp</CODE> tests if
one of the input words is the beginning of the other, like <CODE>now</CODE>
and <CODE>nowhere</CODE>. Otherwise, the procedure ends when two letters are
unequal. The recursion step is followed when the beginning letters
are equal. (The operation <CODE>ascii</CODE> takes a one-character word as
input, and outputs the numeric value for that character in the computer's
coding system, which is called the American Standard Code for Information
Interchange.)
<P>A combination of the translation kind of operation and the selection
kind is an operation that selects not one but several members of the
input. For example, you sometimes want to examine the words in a
sentence in various ways but have trouble because the sentence
includes punctuation as part of some words. But the punctuation isn't
<EM>really</EM> part of the word. In Chapter 4, for instance, I
defined a predicate <CODE>about.computersp</CODE> and gave this example of
its use:
<P><PRE>? <U>print about.computersp [this book is about programming]</U>
true
</PRE>
<P>But if the example were part of a larger program, carrying
on a conversation with a person, the person would probably have ended
the sentence with a period. The last word would then have been
<CODE>programming.</CODE> (including the period). That word, which is
different from <CODE>programming</CODE> without the period, isn't in the
procedure's list of relevant words, so it would have output <CODE>false</CODE>.
The solution is to write a procedure that strips the punctuation from
each word of a sentence. Of course that's a straightforward case of
the translation pattern, applying a subprocedure to each word of the
sentence:
<P><PRE>to strip :sent
if emptyp :sent [output []]
output sentence (strip.word first :sent) (strip butfirst :sent)
end
</PRE>
<P><CODE>Strip.word</CODE>, though, is more interesting. It must
select only the letters from a word.
<P><PRE>to strip.word :word
if emptyp :word [output "]
if letterp first :word ~
[output word (first :word) (strip.word butfirst :word)]
output strip.word butfirst :word
end
to letterp :char
output or (inrangep (ascii :char) (ascii "A) (ascii "Z)) ~
(inrangep (ascii :char) (ascii "a) (ascii "z))
end
to inrangep :this :low :high
output and (:this > (:low-1)) (:this < (:high+1))
end
</PRE>
<P><CODE>Strip.word</CODE> is like the translation pattern in the use
of the combining operation <CODE>word</CODE> in the middle instruction line.
But it's also like the selection pattern in that there are two
different choices of output, depending on the result of the predicate
<CODE>letterp</CODE>.
<P>»You might want to rewrite <CODE>about.computersp</CODE> so that it uses <CODE>
strip</CODE>. Consider an initialization procedure.
<P>
<H2>Numerical Operations: The <CODE>Cascade</CODE> Pattern</H2>
<P>Certain mathematical functions are defined in terms of recursive
calculations. It used to be that computers were used <EM>only</EM> for
numerical computation. They're now much more versatile, as you've
already seen, but sometimes the old numerical work is still important.
<P>The classic example in this category is the <EM>factorial</EM>
function. The factorial of a positive integer is the product of all
the integers from 1 up to that number. The factorial of 5 is
represented as
5!
so
<P><CENTER>5! = 1 × 2 × 3 × 4 × 5</CENTER>
<P>We can use <CODE>cascade</CODE> to carry out this computation:
<P><PRE>to fact :n ;; cascade version
output cascade :n [? * #] 1
end
? <U>print fact 5</U>
120
</PRE>
<P>In this example I'm using a feature of <CODE>cascade</CODE> that we
haven't seen before. The template (the second input to <CODE>cascade</CODE>) may
include a number sign (<CODE>#</CODE>) character, which represents the number of
times the template has been repeated. That is, it represents 1 the first
time, 2 the second time, and so on.
<P>Here is a recursive version of <CODE>fact</CODE> that takes one input, a
positive integer, and outputs the factorial function of that number.
The input can also be zero; the rule is that 0!=1.
<P><PRE>to fact :n
if :n=0 [output 1]
output :n * fact :n-1
end
</PRE>
<P>This procedure works because
<P><CENTER>5! = 5 × 4!</CENTER>
<P>That's
another version of reducing a problem to a smaller subproblem.
<P>»Chapter 5 gives the following example:
<P><PRE>to power :base :exponent
output cascade :exponent [? * :base] 1
end
</PRE>
<P>Write a version of <CODE>power</CODE> using recursion instead of using
<CODE>cascade</CODE>.
<P>Another classic example, slightly more complicated, is the
Fibonacci sequence. Each number
in the sequence is the sum of the two previous
numbers; the first two numbers are 1. So the sequence starts
<P><CENTER>1, 1, 2, 3, 4, 5, 13, …</CENTER>
<P>A formal definition of the sequence looks
like this:
<P><CENTER><IMG SRC="fibonacci.gif" ALT="math display"></CENTER>
<P>Here's an operation <CODE>fib</CODE>
that takes a number <EM>n</EM> as input and outputs <EM>F<SUB>n</SUB></EM>.
<P><PRE>to fib :n
if :n<2 [output 1]
output (fib :n-1)+(fib :n-2)
end
</PRE>
<P>That procedure will work, but it's quite seriously
inefficient. The problem is that it ends up computing the same
numbers over and over again. To see why, here's a trace of what
happens when you ask for <CODE>fib 4</CODE>:
<P><PRE>fib 4
fib 3
fib 2
fib 1
fib 0
fib 1
fib 2
fib 1
fib 0
</PRE>
<P>Do you see the problem? <CODE>fib 2</CODE> is computed twice, once
because <CODE>fib 4</CODE> needs it directly and once because <CODE>fib 4</CODE>
needs <CODE>fib 3</CODE> and <CODE>fib 3</CODE> needs <CODE>fib 2</CODE>. Similarly, <CODE>
fib 1</CODE> is computed three times. As the input to <CODE>fib</CODE> gets
bigger, this problem gets worse and worse.
<P>It turns out that a much faster way to handle this problem is to
compute a <EM>list</EM> of all the Fibonacci numbers up to the one we
want. Then each computation can take advantage of the work already
done. Here's what I mean:
<P><PRE>to fiblist :n
if :n<2 [output [1 1]]
output newfib fiblist :n-1
end
to newfib :list
output fput (sum first :list first butfirst :list) :list
end
? <U>print fiblist 5</U>
8 5 3 2 1 1
</PRE>
<P>We can then define a faster <CODE>fib</CODE> in terms of <CODE>
fiblist</CODE>:
<P><PRE>to fib :n
output first fiblist :n
end
</PRE>
<P>Convince yourself that the two versions of <CODE>fib</CODE> give
the same outputs but that the second version is much faster. I'm
purposely not going through a detailed explanation of this example;
you should use the analytical techniques you learned in Chapter 8.
What problem is <CODE>fiblist</CODE> trying to solve? What is the smaller
subproblem?
<P>The hallmark of numerical recursion is something like <CODE>:n-1</CODE> in the
recursion step. Sometimes this kind of recursion is combined with the <CODE>
butfirst</CODE> style we've seen in most of the earlier examples. Logo has a
primitive operation called <CODE>item</CODE>, which takes two inputs. The first is
a positive integer, and the second is a list. The output from <CODE>item</CODE> is
the <EM>n</EM>th member of the list if the first input is <EM>n.</EM> (Earlier
I suggested that you write <CODE>index</CODE>, the opposite of <CODE>item</CODE>.) If Logo
didn't include <CODE>item</CODE>, here's how you could write it:
<P><PRE>to item :n :list
if equalp :n 1 [output first :list]
output item :n-1 butfirst :list
end
</PRE>
<P><H2>Pig Latin</H2>
<P>When I was growing up, every kid learned a not-very-secret "secret"
language called Pig Latin. When I became a teacher, I was surprised
to find out that kids apparently didn't learn it any more. But
more recently it seems to have come back into vogue. Translating
a sentence into Pig Latin is an interesting programming problem, so
I'm going to teach it to you.
<P>Here's how it works. For each word take any consonants that are at
the beginning (up to the first vowel) and move them to the
end. Then add "ay" at the end. So "hello" becomes "ellohay";
"through" becomes "oughthray"; "aardvark" just becomes
"aardvarkay." (Pig Latin is meant to be spoken, not written. You're
supposed to practice so that you can do it and understand it really
fast.)
<P>By now you can write in your sleep the operation <CODE>
piglatin</CODE>, which
takes a sentence and outputs its translation into Pig Latin by going
through the sentence applying a subprocedure <CODE>plword</CODE> to each
word. (It's just like <CODE>code</CODE>, only different.)
It's <CODE>plword</CODE> that is the tricky part. The stop rule is
pretty straightforward:
<P><PRE>if vowelp first :word [output word :word "ay]
</PRE>
<P>If the first letter <EM>isn't</EM> a vowel, what we want to
do is move that letter to the end and try again. Here's the complete
procedure.
<P><PRE>to plword :word
if vowelp first :word [output word :word "ay]
output plword word butfirst :word first :word
end
</PRE>
<P>What makes this tricky is that the recursion step doesn't
seem to make the problem smaller. The word is still the same length
after we move the first letter to the end. This would look more like
all the other examples if the recursion step were
<P><PRE>output plword butfirst :word
</PRE>
<P>That would make the procedure easier to understand.
Unfortunately it would also give the wrong answer. What you have to
see is that there <EM>is</EM> something that is getting smaller about
the word, namely the "distance" from the beginning of the word to
the first vowel. Trace through a couple of examples to clarify this
for yourself.
<P>By the way, this will work better if you modify <CODE>vowelp</CODE> (which we
defined earlier) so that <CODE>y</CODE> is considered a vowel. You'll then
get the wrong answer for a few strange words like <CODE>yarn</CODE>, but on
the other hand, if you consider <CODE>y</CODE> a consonant, you'll get no
answer at all for words like <CODE>try</CODE> in which <CODE>y</CODE> is the only
vowel! (Try it. Do you understand what goes wrong?)
<P>»Some people learned a different dialect of Pig Latin. According to
them, if the word starts with a vowel in the first place, you should
add "way" at the end instead of just "ay." Modify <CODE>plword</CODE> so
that it speaks that dialect. (I think the idea is that some words
simply sound better with that rule.) Hint: You'll want an
initialization procedure.
<P>»The top-level procedure <CODE>piglatin</CODE>, which you wrote yourself, is a
good candidate for careful thought about punctuation. You don't want
to see
<P><PRE>? <U>print piglatin [what is he doing?]</U>
atwhay isway ehay oing?day
</PRE>
<P>A good first attempt would be to modify <CODE>piglatin</CODE> to
use <CODE>strip</CODE>, to get rid of the punctuation altogether. But even
better would be to remove the punctuation from each word, translate it
to Pig Latin, then put the punctuation back! Then we could get
<P><PRE>atwhay isway ehay oingday?
</PRE>
<P>That's the right thing to do for punctuation at the end of a
word, like a period or a comma. On the other hand, the apostrophe
inside a word like <CODE>isn't</CODE> should be treated just like a letter.
<P>The project I'm proposing to you is a pretty tricky one. Here's a
hint. Write an operation <CODE>endpunct</CODE> that takes a word as input
and outputs a <EM>list</EM> of two words, first the "real" word
full of letters, then any punctuation that might be at the end. (The
second word will be empty if there is no such punctuation.) Then your
new <CODE>plword</CODE> can be an initialization procedure that invokes a
subprocedure with <CODE>endpunct</CODE>'s output as its input.
<P><H2>A Mini-project: Spelling Numbers</H2>
<P>Write a procedure <CODE>number.name</CODE> that takes a positive integer
input, and outputs a sentence containing that number spelled out in words:
<P><PRE>? <U>print number.name 5513345</U>
five million five hundred thirteen thousand three hundred forty five
? <U>print number.name (fact 20)</U>
two quintillion four hundred thirty two quadrillion nine hundred two
trillion eight billion one hundred seventy six million six hundred
forty thousand
</PRE>
<P>There are some special cases you will need to consider:
<UL>
<LI>Numbers in which some particular digit is zero.
<LI>Numbers like 1,000,529 in which an entire group of three digits is zero.
<LI>Numbers in the teens.
</UL>
<P>Here are two hints. First, split the number into groups of three digits,
going from right to left. Also, use the sentence
<P><PRE>[thousand million billion trillion quadrillion quintillion
sextillion septillion octillion nonillion decillion]
</PRE>
<P>You can write this bottom-up or top-down. To work bottom-up, pick a subtask
and get that working before you tackle the overall structure of the
problem. For example, write a procedure that returns the word <CODE>fifteen</CODE>
given the argument <CODE>15</CODE>.
<P>To work top-down, start by writing <CODE>number.name</CODE>, freely assuming the
existence of whatever helper procedures you like. You can begin debugging
by writing <EM>stub</EM> procedures that fit into the overall program but
don't really do their job correctly. For example, as an intermediate stage
you might end up with a program that works like this:
<P><PRE>? <U>print number.name 1428425</U> ;; intermediate version
1 million 428 thousand 425
</PRE>
<P><H2>Advanced Recursion: <CODE>Subsets</CODE></H2>
<P>We've seen that recursive operations can do the same jobs as higher order
functions, and we've seen that recursive operations can do jobs that are
similar to the higher order function patterns but not quite the same. Now
we'll see that recursive operations can do jobs that are quite outside the
bounds of any of the higher order functions in Chapter 5.
<P>I'd like to write an operation <CODE>subsets</CODE> that takes a word as input.
Its output will be a sentence containing all the words that can be made
using letters from the input word, in the same order, but not necessarily
using all of them. For example, the word <CODE>lit</CODE> counts as a subset of
the word <CODE>lights</CODE>, but <CODE>hit</CODE> doesn't count because the letters are
in the wrong order. (Of course the procedure won't know which words are
real English words, so <CODE>iht</CODE>, which has the same letters as <CODE>hit</CODE>
in the right order, <EM>does</EM> count.)
<P>»How many subsets does <CODE>lights</CODE> have? Write them all down if
you're not sure. (Or perhaps you'd prefer to count the subsets of a shorter
word, such as <CODE>word</CODE>, instead.)
<P>A problem that follows the <CODE>map</CODE> pattern is one in which the size of the
output is the same as the size of the input, because each member of the
input gives rise to one member of the output. A problem that follows the
<CODE>filter</CODE> pattern is one in which the output is smaller than the input,
because only some of the members are selected. And the <CODE>reduce</CODE> pattern
collapses all of the members of the input into one single result. The
<CODE>subsets</CODE> problem is quite different from any of these; its output will
be much <EM>larger</EM> than its input.
<P>If we can't rely on known patterns, we'll have to go back to first
principles. In Chapter 8 you learned to write recursive procedures by
looking for a smaller, similar subproblem within the problem we're trying to
solve. What is a smaller subproblem that's similar to finding the subsets
of <CODE>lights</CODE>? How about finding the subsets of its butfirst? This idea
is the same one that's often worked for us before. So imagine that we've
already found all the subsets of <CODE>ights</CODE>.
<P>Some of the subsets of <CODE>lights</CODE> <EM>are</EM> subsets of <CODE>ights</CODE>.
Which ones aren't? The missing subsets are the ones that start with the
letter L. What's more, the other letters in such a subset form a subset of
<CODE>ights</CODE>. For example, the word <CODE>lits</CODE> consists of the letter L
followed by <CODE>its</CODE>, which is a subset of <CODE>ights</CODE>.
<P><PRE>to subsets :word ;; incomplete
local "smaller
make "smaller subsets butfirst :word
output sentence :smaller (map [word (first :word) ?] :smaller)
end
</PRE>
<P>This procedure reflects the idea I've just tried to explain. The
subsets of a given word can be divided into two groups: the subsets of its
butfirst, and those same subsets with the first letter of the word stuck on
the front.
<P>The procedure lacks a base case. It's tempting to say that if the input is
an empty word, then the output should be an empty sentence. But that isn't
quite right, because every word is a subset of itself, so in particular the
empty word is a subset (the only subset) of itself. We must output a
sentence containing an empty word. That's a little tricky to type, but we
can represent a quoted empty word as <CODE>"</CODE> and so a sentence containing an
empty word is <CODE>(sentence ")</CODE>.
<P><PRE>to subsets :word
if emptyp :word [output (sentence ")]
local "smaller
make "smaller subsets butfirst :word
output sentence :smaller (map [word (first :word) ?] :smaller)
end
</PRE>
<P>Why did I use the local variable <CODE>smaller</CODE> and a <CODE>make</CODE>
instruction? It wasn't strictly necessary; I could have said
<P><PRE>output sentence (subsets butfirst :word) ~
(map [word (first :word) ?] (subsets butfirst :word))
</PRE>
<P>The trouble is that this would have told Logo to compute the
smaller similar subproblem twice instead of just once. It may seem that
that would make the program take twice as long, but in fact the problem is
worse than that, because each smaller subproblem has a smaller subproblem of
its own, and those would be computed four times--twice for each of the two
computations of the first smaller subproblem! As in the case of the
Fibonacci sequence we studied earlier, avoiding the duplicated computation
makes an enormous difference.
<P>Problems like this one, in which the size of the output grows extremely
quickly for small changes in the size of the input, tend to be harder to
program than most. Here are a couple of examples. Like <CODE>subsets</CODE>, each
of these has a fairly short procedure definition that hides a very complex
computation.
<P>»On telephone dials, most digits have letters associated
with them. In the United States, for example, the digit 5 is associated with
the letters J, K, and L. (The correspondence is somewhat different in
other countries.) You can use these letters to spell out words to make your
phone number easier to remember. For example, many years ago I had the
phone number 492-6824, which spells I-WANT-BH. Write a procedure that takes
a number as its input, and outputs a sentence of all the words that that
number can represent. You may want to test the program using numbers of
fewer than seven digits!
<P>»In the game of Boggle<SUP><SMALL>TM</SMALL></SUP>, the object is to find words by connecting
neighboring letters in a four by four array of letters. For example, the
array
<P><PRE>BEZO
URND
AKAJ
WEOE
</PRE>
<P>contains the words ZEBRA, DONE, and DARK, but not RADAR, because
each letter can be used only once. Write a predicate procedure that takes a
word and an array of letters (in the form of a sentence with one word for
each row) as inputs, and outputs <CODE>true</CODE> if and only if the given word
can be found in the given array.
<P><PRE>? <U>print findword "zebra [bezo urnd akaj weoe]</U>
true
? <U>print findword "radar [bezo urnd akaj weoe]</U>
false
</PRE>
<P>
<H2>A Word about Tail Recursion</H2>
<P>What I want to talk about in the rest of this chapter isn't really
very important, so you can skip it if you want. But <EM>some</EM>
people think it's important, so this is for those people.
<P>Every procedure invocation takes up a certain amount of computer
memory, while the procedure remains active, to hold things like local
variables. Since a recursive procedure can invoke itself many times,
recursion is a fairly "expensive" technique to allow in a
programming language. It turns out that if the only
recursion step in a procedure is the very last thing the procedure
does, the interpreter can handle that procedure in a special way that
uses memory more efficiently. You can then use as many levels of
recursive invocation as you want without running out of space.
Such a procedure is called <EM>tail recursive.</EM>
It doesn't make any difference to you as a programmer; it's just a
matter of what's happening inside the Logo interpreter.
<P>A tail recursive command is very easy to recognize; the last
instruction is an invocation of the same procedure. Tail recursive
commands are quite common; here are a couple of examples we've seen
before.
<P><PRE>to one.per.line :thing
if emptyp :thing [stop]
print first :thing
one.per.line butfirst :thing
end
to poly :size :angle
forward :size
right :angle
poly :size :angle
end
</PRE>
<P>The thing is, many people are confused about what constitutes a tail
recursive operation. It <EM>isn't</EM> one that is invoked
recursively on the last instruction line! Instead, the rule is that
the recursive invocation must be used <EM>directly</EM> as the input
to <CODE>output</CODE>, not as part of a larger computation. For example,
this is a tail recursive operation:
<P><PRE>to lookup :word :dictionary
if emptyp :dictionary [output "]
if equalp :word first first :dictionary [output last first :dictionary]
output lookup :word butfirst :dictionary
end
</PRE>
<P>But this is <EM>not</EM> tail recursive:
<P><PRE>to length :thing
if emptyp :thing [output 0]
output 1+length butfirst :thing
end
</PRE>
<P>It's that <CODE>1+</CODE> that makes the difference.
<P>It's sometimes possible to change a nontail recursive operation into
a tail recursive one by tricky programming. For example, look again
at <CODE>fact</CODE>:
<P><PRE>to fact :n
if :n=0 [output 1]
output :n * fact :n-1
end
</PRE>
<P>This is not tail recursive because the input to the final
<CODE>output</CODE> comes from the multiplication, not directly from <CODE>
fact</CODE>. But here is a tail recursive version:
<P><PRE>to fact :n
output fact1 :n 1
end
to fact1 :n :product
if :n=0 [output :product]
output fact1 (:n-1) (:n*:product)
end
</PRE>
<P>Indeed, this version can, in principle, compute the
factorial of larger numbers than the simpler version without running
out of memory. In practice, though, the largest number that most
computers can understand is less than the factorial of 70, and any
computer will allow 70 levels of recursion without difficulty. In
fact, not every Logo interpreter bothers to recognize
tail recursive operations. It's a small point; I only mention it
because some people <EM>both</EM> make a big fuss about tail recursion
<EM>and</EM> misunderstand what it means!
<P><A HREF="../v1-toc2.html">(back to Table of Contents)</A>
<P><A HREF="../v1ch10/v1ch10.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v1ch12/v1ch12.html"><STRONG>NEXT</STRONG></A>
<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>,
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>