Simply Scheme: Introducing Computer Science, Chapter 14: Common Patterns in Recursive Procedures

There are two ideas about how to solve programming problems.[1] One idea is that programmers work mostly by recognizing categories of problems that come up repeatedly and remembering the solution that worked last time; therefore, programming students should learn a lot of program patterns, or templates, and fill in the blanks for each specific problem. Another idea is that there are a few powerful principles in programming, and that if a learner understands the principles, they can be applied to any problem, even one that doesn't fit a familiar pattern.

Research suggests that an expert programmer, like an expert at any skill, does work mainly by recognizing patterns. Nevertheless, we lean toward the powerful-principle idea. The expert's memory is not full of arbitrary patterns; it's full of meaningful patterns, because the expert has gone through the process of struggling to reason out how each procedure works and how to write new procedures.

Still, we think it's worth pointing out a few patterns that are so common that you'll have seen several examples of each before you finish this book. Once you learn these patterns, you can write similar procedures almost automatically. But there's an irony in learning patterns: In Scheme, once you've identified a pattern, you can write a general-purpose procedure that handles all such cases without writing individual procedures for each situation. Then you don't have to use the pattern any more! Chapter 8 presents several general pattern-handling procedures, called higher-order procedures. In this chapter we'll consider the patterns corresponding to those higher-order procedures, and we'll use the names of those procedures to name the patterns.

What's the point of learning patterns if you can use higher-order procedures instead? There are at least two points. The first, as you'll see very soon, is that some problems almost follow one of the patterns; in that case, you can't use the corresponding higher-order procedure, which works only for problems that exactly follow the pattern. But you can use your understanding of the pattern to help with these related problems. The second p

import algorithm

doAssert product[int](newSeq[seq[int]]()) == newSeq[seq[int]](), "empty input"
doAssert product[int](@[newSeq[int](), @[], @[]]) == newSeq[seq[int]](), "bit more empty input"
doAssert product(@[@[1,2]]) == @[@[1,2]], "a simple case of one element"
doAssert product(@[@[1,2], @[3,4]]) == @[@[2,4],@[1,4],@[2,3],@[1,3]], "two elements"
doAssert product(@[@[1,2], @[3,4], @[5,6]]) == @[@[2,4,6],@[1,4,6],@[2,3,6],@[1,3,6], @[2,4,5],@[1,4,5],@[2,3,5],@[1,3,5]], "three elements"
doAssert product(@[@[1,2], @[]]) == newSeq[seq[int]](), "two elements, but one empty"
doAssert lowerBound([1,2,4], 3, system.cmp[int]) == 2
doAssert lowerBound([1,2,2,3], 4, system.cmp[int]) == 4
doAssert lowerBound([1,2,3,10], 11) == 4

What should the procedure return if sent is empty? In that case, there is no first number in the sentence, so it should return no-number:

What if the first word of the sentence is a number? The program should return just that number, ignoring the rest of the sentence:

What if the first word of the sentence isn't a number? The procedure must make a recursive call for the butfirst, and whatever that recursive call returns is the answer. So the else clause does not have to be changed.
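Putting the three cases together gives something like the following sketch. The procedure name first-number is an assumption made here for illustration, since the text above doesn't name the procedure. The code also assumes that the book's Simply Scheme library (simply.scm) is loaded, so that first, butfirst, and empty? work on sentences.

```scheme
;; Assumes simply.scm is loaded (first, butfirst, empty? on sentences).
;; The name first-number is an assumption for illustration.
(define (first-number sent)
  (cond ((empty? sent) 'no-number)               ; ran out of words: no number found
        ((number? (first sent)) (first sent))    ; found a number: return just that one
        (else (first-number (butfirst sent)))))  ; otherwise keep looking in the rest
```

Under those assumptions, (first-number '(it was 20 years ago today)) should return 20, and (first-number '(good day sunshine)) should return no-number.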

After filling in the blank in the keep pattern, we solved this problem by focusing on the details of the procedure definition. We examined each piece of the definition to decide what changes were necessary. Instead, we could have focused on the behavior of the procedure. We would have found two ways in which the program didn't do what it was supposed to do: For an argument sentence containing numbers, it would return all of the numbers instead of just one of them. For a sentence without numbers, it would return the empty sentence instead of no-number. We would then have finished the job by debugging the procedure to fix each of these problems. The final result would have been the same.

Problems That Don't Follow Patterns

We want to write the procedure sent-before?, which takes two sentences as arguments and returns #t if the first comes alphabetically before the second. The general idea is to compare the sentences word by word. If the first words are different, then whichever is alphabetically earlier determines which sentence comes before the other. If the first words are equal, we go on to compare the second words.[5]

Does this problem follow any of the patterns we've seen? It's not an every, because the result isn't a sentence in which each word is a transformed version of a word in the arguments. It's not a keep, because the result isn't a subset of the words in the arguments. And it's not exactly an accumulate. We do end up with a single true or false result, rather than a sentence full of results. But in a typical accumulate problem, every word of the argument contributes to the solution. In this case only one word from each sentence determines the overall result.

On the other hand, this problem does have something in common with the keep pattern: We know that on each invocation there will be three possibilities. We might reach a base case (an empty sentence); if not, the first words of the argument sentences might or might not be relevant to the solution.

We'll have a structure similar to the usual keep pattern, except that there's no se involved; if we find unequal words, the problem is solved without further recursion. Also, we have two arguments, and either of them might be empty.
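The structure just described might be sketched as follows. The argument names sent1 and sent2 are the ones used later in this chapter, and before? is the Simply Scheme primitive that compares two words alphabetically. The code assumes simply.scm is loaded. Returning #t when the first sentence runs out (including the case where both sentences are empty) is an assumption about the intended behavior.

```scheme
;; Assumes simply.scm is loaded (first, butfirst, empty?, before?).
(define (sent-before? sent1 sent2)
  (cond ((empty? sent1) #t)                          ; first sentence ran out first (assumed #t)
        ((empty? sent2) #f)                          ; second sentence ran out first
        ((before? (first sent1) (first sent2)) #t)   ; first words differ: sent1 is earlier
        ((before? (first sent2) (first sent1)) #f)   ; first words differ: sent2 is earlier
        (else (sent-before? (butfirst sent1)         ; first words equal: compare the rest
                            (butfirst sent2)))))
```

For example, (sent-before? '(hold me tight) '(sun king)) should return #t, because "hold" comes before "sun" and no further recursion is needed, while (sent-before? '(lovely rita) '(love you to)) should return #f, because "love" comes before "lovely".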

Although thinking about the keep pattern helped us to work out this solution, the result really doesn't look much like a keep. We had to invent most of the details by thinking about this particular problem, not by thinking about the pattern.

In the next chapter we'll look at examples of recursive procedures that are quite different from any of these patterns. Remember, the patterns are a shortcut for many common problems, but don't learn the shortcut at the expense of the general technique.

Pitfalls

How do you test for the base case? Most of the examples in this chapter have used empty?, and it's easy to fall into the habit of using that test without thinking. But, for example, if the argument is a number, that's probably the wrong test. Even when the argument is a sentence or a non-numeric word, it may not be empty in the base case, as in the Pig Latin example.

A serious pitfall is failing to recognize a situation in which you need an extra variable and therefore need a helper procedure. If at each step you need the entire original argument as well as the argument that's getting closer to the base case, you probably need a helper procedure. For example, write a procedure pairs that takes a word as argument and returns a sentence of all possible two-letter words made of letters from the argument word, allowing duplicates, like this:
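For instance, (pairs 'toy) should return the nine words tt, to, ty, ot, oo, oy, yt, yo, and yy. Here is one possible solution, offered as a sketch rather than the definitive answer. The helper names pairs-helper and prepend-every, and the parameter names remaining and whole, are hypothetical. The code assumes simply.scm is loaded. The point to notice is that the helper carries the entire original word along as an extra argument while the other argument shrinks toward the base case.

```scheme
;; Assumes simply.scm is loaded (first, butfirst, empty?, se, word).
;; pairs-helper and prepend-every are hypothetical helper names.
(define (pairs wd)
  (pairs-helper wd wd))

;; remaining shrinks on each call; whole is always the original word.
(define (pairs-helper remaining whole)
  (if (empty? remaining)
      '()
      (se (prepend-every (first remaining) whole)
          (pairs-helper (butfirst remaining) whole))))

;; Builds a sentence of two-letter words: letter followed by each letter of wd.
(define (prepend-every letter wd)
  (if (empty? wd)
      '()
      (se (word letter (first wd))
          (prepend-every letter (butfirst wd)))))
```

Without the second argument, the recursive call would see only the shrinking word, and pairs such as "yt" (whose second letter comes earlier in the word than its first) could not be generated.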

A simple pitfall, when using a helper procedure, is to write a recursive call in the helper that calls the main procedure instead of calling the helper. (For example, what would have happened if we'd had every-nth-helper invoke every-nth instead of invoking itself?)

Some recursive procedures with more than one argument require more than one base case. But some don't. One pitfall is to leave out a necessary base case; another is to include something that looks like a base case but doesn't fit the structure of the program.

For example, the reason sent-before? needs two base cases is that on each recursive call, both sent1 and sent2 get smaller. Either sentence might run out first, and the procedure should return different values in those two cases.

On the other hand, Exercise 11.7 asked you to write a procedure that has two arguments but needs only one base case:
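A procedure of that kind might look like the following sketch. The name copies and the parameter name num are assumptions made here, while wd is the name used in the discussion that follows. The code assumes simply.scm is loaded for se.

```scheme
;; Assumes simply.scm is loaded (se).
;; The names copies and num are assumptions for illustration.
(define (copies num wd)
  (if (= num 0)                        ; the only base case: nothing left to copy
      '()
      (se wd (copies (- num 1) wd))))  ; num shrinks; wd is passed along unchanged
```

Under these assumptions, (copies 3 'spam) should return the sentence (spam spam spam).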

In this example, the wd argument doesn't get smaller from one invocation to the next. It would be silly to test for (empty? wd).

A noteworthy intermediate case is every-nth-helper. It does have two cond clauses that check for two different arguments reaching their smallest allowable values, but the remaining clause isn't a base case. If remaining has the value 1, the procedure still invokes itself recursively.
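To make this concrete, here is a sketch of how every-nth and its helper might be written, assuming simply.scm is loaded. The parameter name interval is an assumption; remaining is the name used in the paragraph above. Notice that the clause testing remaining still contains a recursive call, so only the empty? clause is a true base case.

```scheme
;; Assumes simply.scm is loaded (first, butfirst, empty?, se).
;; The parameter name interval is an assumption for illustration.
(define (every-nth n sent)
  (every-nth-helper n n sent))

(define (every-nth-helper interval remaining sent)
  (cond ((empty? sent) '())                       ; the only real base case
        ((= remaining 1)                          ; smallest value, but NOT a base case:
         (se (first sent)                         ; keep this word, reset the counter,
             (every-nth-helper interval           ; and recur on the rest
                               interval
                               (butfirst sent))))
        (else (every-nth-helper interval          ; skip this word and count down
                                (- remaining 1)
                                (butfirst sent)))))
```

With these definitions, (every-nth 3 '(with a little help from my friends)) should return (little my).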

The only general principle we can offer is that you have to think about what base cases are appropriate, not just routinely copy whatever worked last time.

Exercises

Classify each of these problems as a pattern (every, keep, or accumulate), if possible, and then write the procedure recursively. In some cases we've given an example of invoking the procedure we want you to write, instead of describing it.

(It's okay if your solution removes the other MORNING instead, as long as it removes only one of them.)

(It's okay if your procedure returns (DI OB LA DA) instead, as long as it removes all but one instance of each duplicated word.)

14.5 [8.7] Write a procedure letter-count that takes a sentence as its argument and returns the total number of letters in the sentence:

14.7 Write differences, which takes a sentence of numbers as its argument and returns a sentence containing the differences between adjacent elements. (The length of the returned sentence is one less than that of the argument.)

14.8 Write expand, which takes a sentence as its argument. It returns a sentence similar to the argument, except that if a number appears in the argument, then the return value contains that many copies of the following word:

14.9 Write a procedure called location that takes two arguments, a word and a sentence. It should return a number indicating where in the sentence that word can be found. If the word isn't in the sentence, return #f. If the word appears more than once, return the location of the first appearance.

14.10 Write the procedure count-adjacent-duplicates that takes a sentence as an argument and returns the number of words in the sentence that are immediately followed by the same word:

14.11 Write the procedure remove-adjacent-duplicates that takes a sentence as argument and returns the same sentence but with any word that's immediately followed by the same word removed:

14.12 Write a procedure progressive-squares? that takes a sentence of numbers as its argument. It should return #t if each number (other than the first) is the square of the number before it:

14.13 What does the pigl procedure from Chapter 11 do if you invoke it with a word like "frzzmlpt" that has no vowels? Fix it so that it returns "frzzmlptay."

14.14 Write a predicate same-shape? that takes two sentences as arguments. It should return #t if two conditions are met: The two sentences must have the same number of words, and each word of the first sentence must have the same number of letters as the word in the corresponding position in the second sentence.

14.15 Write merge, a procedure that takes two sentences of numbers as arguments. Each sentence must consist of numbers in increasing order. Merge should return a single sentence containing all of the numbers, in order. (We'll use this in the next chapter as part of a sorting algorithm.)

14.16 Write a procedure syllables that takes a word as its argument and returns the number of syllables in the word, counted according to the following rule: the number of syllables is the number of vowels, except that a group of consecutive vowels counts as one. For example, in the word "soaring," the group "oa" represents one syllable and the vowel "i" represents a second one.

Be sure to choose test cases that expose likely failures of your procedure. For example, what if the word ends with a vowel? What if it ends with two vowels in a row? What if it has more than two consecutive vowels?

(Of course this rule isn't good enough. It doesn't deal with things like silent "e"s that don't create a syllable ("like"), consecutive vowels that don't form a diphthong ("cooperate"), letters like "y" that are vowels only sometimes, etc. If you get bored, see whether you can teach the program to recognize some of these special cases.)

[2] If you've read Chapter 8, you know that you could implement square-sent and pigl-sent without recursion, using the every higher-order function. But try using every to implement letter-pairs; you'll find that you can't quite make it work.

[3] Of course, if your version of Scheme has −∞, you can use it as the return value for an empty sentence, instead of changing the pattern.

[4] The higher-order function version is more self-documenting and easier to write. The recursive version, however, is slightly more efficient, because it avoids building up a sentence as an intermediate value only to discard it in the final result. If we were writing this program for our own use, we'd probably choose the higher-order function version; but if we were dealing with sentences of length 10,000 instead of length 10, we'd pay more attention to efficiency.

[5] Dictionaries use a different ordering rule, in which the sentences are treated as if they were single words, with the spaces removed. By the dictionary rule, "a c" is treated as if it were "ac" and comes after "ab"; by our rule, "a c" comes before "ab" because we compare the first words ("a" and "ab").
