<HTML>
<HEAD>
<TITLE>Computer Science Logo Style vol 3 ch 2: Discrete Mathematics</TITLE>
</HEAD>
<BODY>
<CITE>Computer Science Logo Style</CITE> volume 3:
<CITE>Beyond Programming</CITE> 2/e Copyright (C) 1997 MIT
<H1>Discrete Mathematics</H1>
<TABLE width="100%"><TR><TD>
<IMG SRC="../csls3.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/v3ch02.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../v3-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="../v3ch1/v3ch1.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v3ch3/v3ch3.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-3">MIT
Press web page for <CITE>Computer Science Logo Style</CITE></A>
</TABLE></TABLE>
<HR><P>Program file for this chapter: <A HREF="math.lg"><CODE>math</CODE></A>
<P>Computer scientists often use mathematics as a tool in their work, but the
mathematical problems that arise in computer science are of a special kind.
Consider these examples:
<P>Suppose you have a nondeterministic FSM with five states and you want to
convert it to a deterministic one. What is the largest number of states
that might be required for the new machine? Well, each state of the new
machine corresponds to some <EM>combination</EM> of states of the old one,
because the conversion works by finding multiple transitions (from some state
via the same input character to multiple states) and creating a new state
that combines all those resulting states. How many such combinations are
possible? In other words, how many <EM>subsets</EM> does a five-element set
have? The answer is 2<SUP><SMALL>5</SMALL></SUP> or 32 states. (31, really, because one of those
is the empty subset and that will never be used.)
<P>Suppose you want to write a program to translate telephone numbers into
letters. A telephone number has seven digits, each of which corresponds to
three letters. How many different strings of letters are possible? Well,
there are three choices for the first digit, times three for the second, and
so on...
<P>These are typical of the kinds of mathematics problems that a computer
scientist confronts in that they are <EM>counting</EM> problems--ones that
involve integers. Another relevant kind of problem is the <EM>logic</EM>
problem, in which the values under consideration are just <CODE>true</CODE> and
<CODE>false</CODE>. These areas of mathematics are quite different from what is
studied in the usual high school and college math courses: algebra,
geometry, trig, calculus, differential equations. Those courses deal with
<EM>measurement</EM> problems, in which the answer can be any number,
including a fraction or an irrational number. This conventional mathematics
curriculum, studying <EM>continuous</EM> functions, is dictated by the needs
of physics and the physics-based engineering subjects. Computer scientists
need a different kind of math, called <EM>discrete</EM> mathematics.
("Discrete" is the opposite of "continuous" and is not the same word as
its homonym "discreet" meaning tactful.)
<P><H2>Propositional Logic</H2>
<P>You already know that what in Logo is called an <EM>operation</EM> is the
computer programming version of a mathematical <EM>function.</EM> The inputs
and outputs of Logo operations may be numbers, or they may be other kinds of
words or lists. In ordinary algebra the functions we use have numeric
values. Certain Logo operations are identical to the ones used in algebra:
<CODE>sin :x</CODE> is sin(<EM>x</EM>) and <CODE>sqrt :x</CODE> is √<EM>x</EM>. On the other
hand, there is nothing in ordinary school mathematics quite like <CODE>first</CODE>
or <CODE>sentence</CODE>. You may have been taught to use the word "function"
only when you see a notation like <EM>f</EM>(<EM>x</EM>), but in fact the ordinary
arithmetic operations are functions, too. The addition in <EM>a</EM>+<EM>b</EM> is a
function with two arguments, just like <CODE>sum :a :b</CODE> in Logo.
<P>
<P>In Logo there are also operations whose inputs and outputs are the words
<CODE>true</CODE> and <CODE>false</CODE>. The primitive operations in this category are
<CODE>and</CODE>, <CODE>or</CODE>, and <CODE>not</CODE>. Just as algebra deals with numeric
functions, <EM>logic</EM> is the branch of mathematics that deals with these
<EM>truth-valued</EM> functions. Instead of numbers, these functions combine
<EM>propositions</EM>: statements that may be true or false. A Logo
expression like <CODE>:x=0</CODE> represents a proposition. "Abraham Lincoln was
the King of England" is a proposition; it happens to be false, but it's a
perfectly valid one because it asserts something that's either true or
false. "It will rain in Boston tomorrow" is a proposition whose truth
value we don't know yet. "Chinese food is better than French food" is an
example of a sentence whose validity as a proposition is open to question.
If I say that, I'm expressing my personal taste, not an objective statement
that could be proven true or false.<SUP>*</SUP>
<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>On the other hand, "The
Beatles are better than Led Zeppelin" is a perfectly valid, obviously true
proposition.</SMALL></BLOCKQUOTE></SMALL><P>Logical functions combine <EM>simple</EM> propositions into <EM>compound</EM>
propositions. For example, "Either Abraham Lincoln was the King of England
or he was the President of the United States" is a
compound proposition.
It's true even though one of the simple propositions within it is false.
Just as in algebra we use letters like <EM>x</EM> to represent numbers and
expressions like <EM>x</EM>+<EM>y</EM> to indicate the use of functions to combine numbers,
in logic we use letters to represent propositions and there are function
symbols for the logical functions. If <EM>p</EM> is the proposition "Abraham
Lincoln was the King of England," and <EM>q</EM> is the proposition "Abraham
Lincoln was the President of the United States," then the expression <EM>p</EM>∨<EM>q</EM> represents the compound proposition above. The symbol ∨ represents
the <EM>or</EM> function; ∧
represents <EM>and</EM>; ¬ and ∼
are alternative representations for <EM>not.</EM> The symbol → represents
"implies"; it turns out that <EM>p</EM>→<EM>q</EM> is equivalent to ¬<EM>p</EM>∨<EM>q</EM>;
in other words, the value of the function is true if either the "if" part
is <EM>false</EM> or the "then" part is true. An example of the former is
the classic "If wishes were horses then beggars would ride."
<P>(Don't confuse the → function with the Logo <CODE>if</CODE> command. The
latter isn't a function (an operation), but a command. It tells Logo to
take some <EM>action</EM> if a given condition is met. The operation
<P><PRE>to implies :p :q
output or (not :p) :q
end
</PRE>
<P>is the Logo equivalent of the → function in logic.)
<P>The most important use of logic in mathematics is in understanding the idea
of <EM>proof.</EM> What is a valid reason for claiming that some proposition
has been proven true? Many people come across the idea of proof for the
first and last time in high school geometry. We are asked to prove some
proposition like "the sum of the interior angles of a triangle is
180°." For each step in the proof we must give a <EM>reason</EM>
such as "things equal to the same thing are equal to each other."
<P>In logic there are certain rules that allow us to infer one proposition from
one or more previously known propositions. These rules correspond
roughly to the "reasons" in a geometry proof. They are called
<EM>rules of inference.</EM> You use rules of
inference informally all the time, whenever
you try to convince someone of something by reasoning. "Is Jay here?"
"Yes." "How do you know?" "I saw his car in the driveway, and if his car
is here, he must be here too."
<P>Suppose we use the letter <EM>p</EM> to represent the proposition "Jay's car is
here" and the letter <EM>q</EM> to represent "Jay is here." Then the reasoning
quoted in the last paragraph says "<EM>p</EM> is true and <EM>p</EM>→<EM>q</EM> is true, so
<EM>q</EM> must be true." ("<EM>p</EM>→<EM>q</EM>" is the proposition "If Jay's car is
here, he must be here too.") The fact that <EM>p</EM> and <EM>p</EM>→<EM>q</EM> allow us to
infer <EM>q</EM> is a rule of inference. (Of course the rule doesn't
tell us about the truth of its component propositions. We have to determine
that by some means outside of logic, such as observation of the world.
I had to <EM>see</EM> Jay's car in the driveway to know that <EM>p</EM> is true.)
<P><H2>An Inference System</H2>
<P>What does all this have to do with computer science? One application of
logic is in <EM>inference systems</EM>: programs that deduce propositions
from other ones. Such systems are important both in business applications
where large data bases are used and in artificial intelligence programs that
try to answer questions based on information implied by some text but not
explicit in the text.
<P>In this section I'll show you a special-purpose inference system that solves
logic problems. Logic problems are the ones in which you're given
certain propositions and asked to deduce others. Mr. Smith lives next to
the carpenter; John likes classical music; who lives in the yellow house?
Here is a typical problem taken from <EM>Mind Benders,</EM> Book B-2,
by Anita Harnadek.
<P><BLOCKQUOTE>
<P><A NAME="harnadek">A cub reporter
interviewed four people. He was very careless, however.
Each statement he wrote was half right and half wrong. He went back and
interviewed the people again. And again, each statement he wrote was half
right and half wrong. From the information below, can you straighten out
the mess?
<P>The first names were Jane, Larry, Opal, and Perry. The last names were
Irving, King, Mendle, and Nathan. The ages were 32, 38, 45, and 55. The
occupations were drafter, pilot, police sergeant, and test car driver.
<P>On the first interview, he wrote these statements, one from each person:
<P><P>
<P><TABLE><TR><TH align="right" valign="top">1.<TD> <TD valign="top"> Jane: "My name is Irving, and I'm 45."
<TR><TH align="right" valign="top">2.<TD> <TD valign="top"> King: "I'm Perry and I drive test cars."
<TR><TH align="right" valign="top">3.<TD> <TD valign="top"> Larry: "I'm a police sergeant and I'm 45."
<TR><TH align="right" valign="top">4.<TD> <TD valign="top"> Nathan: "I'm a drafter, and I'm 38."
<P></TABLE>
<P>On the second interview, he wrote these statements, one from each person:
<P><P>
<P><TABLE><TR><TH align="right" valign="top">5.<TD> <TD valign="top"> Mendle: "I'm a pilot, and my name is Larry."
<TR><TH align="right" valign="top">6.<TD> <TD valign="top"> Jane: "I'm a pilot, and I'm 45."
<TR><TH align="right" valign="top">7.<TD> <TD valign="top"> Opal: "I'm 55 and I drive test cars."
<TR><TH align="right" valign="top">8.<TD> <TD valign="top"> Nathan: "I'm 38 and I drive test cars."
<P></TABLE></BLOCKQUOTE>
<P>
<CENTER><IMG SRC="grid.gif" ALT="figure: grid"></CENTER>
<P>The chart provided with the problem is a guide to its solution. Each square
in the chart represents a proposition. For example, the box where the
"Larry" row meets the "pilot" column represents the proposition "Larry
is the pilot." In solving the problem, you can put marks in the boxes to
indicate what you know about the propositions. The status of a proposition
need not be only true or false. Initially the status of each proposition is
<EM>unknown</EM>; we have no idea whether it's true or false. The structure
of this particular problem also allows the status of a proposition to be that
it is linked with another proposition in an <EM>exclusive-or</EM>
relationship; that is, if one of the linked propositions turns out to be
true, then the other must be false, and vice versa. You can use whatever
notation you find convenient to express these possibilities. After
experimenting with T and F and with check marks and crosses, I found that
circles for true propositions and crosses for false ones made it easiest for
me to see quickly the pattern of known truths. For the linked propositions,
I used the statement numbers (1 to 8) in the boxes; two boxes with the same
number represent linked statements.
<P>You should probably solve this problem by hand before we go on to discuss
the computer solution. Stop reading now and work on the problem if you want
to do it without any hints from me.
<P>Let me introduce a little new terminology to help in the following
discussion. I'll call something like "last name" or "occupation" a
<EM>category</EM>; something like "Mendle" or "pilot" I'll call an
<EM>individual.</EM> As I'm using this terminology, "Mendle" and "pilot"
are two different individuals even if they turn out, when we solve the
problem, to be the same person.
<P>It's important that each group of four statements contains one from each
person, because the names of the speakers include first and last names.
That is, from the first group of statements we know that Jane, King, Larry,
and Nathan are four distinct people. This falsifies such propositions as
"Jane is King." After noting the first group of statements my chart looks
like this:
<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/grid2.gif" ALT="figure: grid2"></CENTER>
<P>From the second group of statements we learn that Mendle, Jane, Opal, and
Nathan are all distinct people. When I marked an X in the "Jane is
Mendle" box, I noticed that all but one last name for Jane had been
eliminated. I therefore put a circle in the "Jane is Irving" box. This
illustrates a special rule of inference for problems of this kind: If, for
a given individual <EM>x</EM>, all but one proposition "<EM>x</EM> is <EM>y</EM>" have been
falsified for a certain <EM>category</EM> of <EM>y</EM> individuals, then the
remaining proposition in that category must be true. (The reason I'm going
through this analysis is that the rules of inference I discover while
working the problem by hand are going to end up in the design of the
computer program.) I'm going to call this the <EM>elimination</EM> rule.
<P>Since Jane is Irving, nobody else can be Irving. This falsifies three
propositions whose status was formerly unknown: "Larry is Irving,"
"Opal is Irving," and "Perry is Irving." (The truth of "Jane is Irving"
would also falsify "Jane is King" and so on, but we already knew those to
be false.) The general rule is that if "<EM>x</EM> is <EM>y</EM>" is true then "<EM>x</EM> is
<EM>z</EM>" must be false for all other <EM>z</EM> in the same category as <EM>y</EM>, and
likewise "<EM>w</EM> is <EM>y</EM>" is false for all other <EM>w</EM> in the same category as
<EM>x</EM>. I'll call this the <EM>uniqueness</EM> rule.
<P>The proposition "Jane is Irving" was linked with the proposition
"Jane is 45" earlier. That latter proposition must therefore be false.
This is another rule of inference, the <EM>link falsification</EM> rule.
There is also a <EM>link verification</EM> rule that comes into play when a
linked proposition is found to be false; the other linked proposition must
then be true.
<P>Since we know that "Jane is 45" is false, and that "Jane is Irving" is
true, it follows that "Irving is 45" must be false. The general rule is
that if "<EM>x</EM> is <EM>y</EM>" is true and "<EM>x</EM> is <EM>z</EM>" is false, then "<EM>y</EM> is
<EM>z</EM>" must also be false. Similarly, if "<EM>x</EM> is <EM>y</EM>" is true and "<EM>x</EM> is
<EM>z</EM>" is true, then "<EM>y</EM> is <EM>z</EM>" must be true. I'll call these the
<EM>transitive</EM> rules.
<P>You should be following along with your own copy of the chart. By the
elimination rule, "Opal is King" must be true. Then, by the uniqueness
rule, "Perry is King" is false. Then, by the link verification rule,
"King is test driver" must be true. By the transitive rule, "Opal is
test driver" is true also. Here is my chart after making all possible
deductions from the fact that Mendle, Jane, Opal, and Nathan are distinct
people:
<P><CENTER><IMG SRC="grid3.gif" ALT="figure: grid3"></CENTER>
<P>Looking at statement number 5, we see that "Mendle is pilot" is linked to
"Mendle is Larry." But we already know that the latter is true, so the
former must be false. We can just put an X in that box and not bother
marking the number 5 anywhere. The same is true for statements 6 and 7;
here's my chart after statement 7:
<P>
<CENTER><IMG SRC="grid4.gif" ALT="figure: grid4"></CENTER>
<P>I haven't marked anything in the
age vs. occupation section of the chart, even though I can deduce some
propositions for that section. For example, I know that "Jane is pilot"
is true and that "Jane is 45" is false. By the transitive rule, "pilot
is 45" must be false. I just haven't bothered making <EM>all</EM> possible
deductions; I can always fill in that section of the chart if I get to the
end of the problem and still don't know who's who.
<P>Before dealing with the final statement we know all the pairings of first
and last names, but we don't know any ages and we only know half the jobs.
The last statement lets loose a flurry of deductions. The critical one is
that if Nathan is 38, he or she (Perry is an ambiguous first name) can't be
the drafter because of the link from statement 4. Here is my final chart:
<P><CENTER><IMG SRC="grid5.gif" ALT="figure: grid5"></CENTER>
<P>Once it was clear that I was in the final steps of the solution, I
didn't bother maintaining the age vs. last name section. The results can
be found by reading just the three sections at the top; for example,
Jane is Irving, the pilot, and 55.
<P><H2>Problems with Ordering</H2>
<P>The elimination, uniqueness, and transitive rules are useful in just about
all logic problems, but the link falsification and link verification rules
depend on this specific problem's gimmick that we are given pairs of
propositions and told that exactly one of them is true. To solve other
problems, a more flexible kind of linkage is needed; we must be able to say
"if <EM>p</EM> is true, then <EM>q</EM> is true also" or "if <EM>p</EM> is false, then <EM>q</EM>
must be true," and so on.
<P>A very common kind of linkage is an <EM>ordering,</EM> in which we are
told a sequence of events, or a row of houses on a street, for example.
In the cub reporter problem, the ages form an ordering, but aren't used
as such--that is, the problem doesn't include clues such as "the test
driver is younger than Jane." Suppose we did see that clue; what could
we conclude from it? Certain propositions would be settled for sure:
<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">The test driver must not be Jane.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">The test driver isn't the oldest age (55).
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Jane isn't the youngest age (32).
</TABLE><P>
<P>Other propositions would be linked by <EM>implications:</EM>
<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Jane is 38, <STRONG>then</STRONG> the driver is 32.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Jane is 45, <STRONG>then</STRONG> the driver is 32 or 38.
</TABLE><P>
<P>and so on. (Actually, the second of these isn't directly
representable on the chart, because there's no notation for a single
proposition with two alternatives, like "32 or 38." So instead
we'd represent this using propositions about what <EM>can't</EM> be
true:
<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Jane is 45, <STRONG>then</STRONG> the driver is not 55.
</TABLE><P>
<P>We already know that the driver isn't Jane, so there's no need
to record the implication that if Jane is 45 the driver isn't 45.)
<P>Here is another problem, called "Forgetful Footes,"
by Diane C. Baldwin.
It is taken from <EM>The Dell Book of Logic Problems #4.</EM>
<P><BLOCKQUOTE>
<P><A NAME="baldwin">The forgetful Foote</A> family
fairly flew from their flat up Fleet Street
to the freeway for a fling in Florida. Before they passed five
intersections, Felix and the other four Footes found they each had forgotten
something (including the food) and were forced back to their flat to fetch
it. Each turnaround was made at a different street intersection, one of
them at Field Street. From the following clues, can you figure out who
forgot what and in what order, and find the order of the intersections on
Fleet Street from the Foote's flat to the freeway?
<P><P>
<P><TABLE><TR><TH align="right" valign="top">1.<TD> <TD valign="top"> The second turnaround began at the street following Flag Street and
the street before Fred had to return to the flat.
<TR><TH align="right" valign="top">2.<TD> <TD valign="top"> The men were responsible for the return for the film, the
turnaround at Fig Street, and the fifth trip back.
<TR><TH align="right" valign="top">3.<TD> <TD valign="top"> Fork Street followed the one where they returned to fetch the
flashlight and preceded the one where a woman had them make their first
turnaround.
<TR><TH align="right" valign="top">4.<TD> <TD valign="top"> The final trip back didn't begin at Frond Street, nor was it the
one to fetch the fan.
<TR><TH align="right" valign="top">5.<TD> <TD valign="top"> Frank's forgetfulness turned them back one block and one return
trip following Francine's.
<TR><TH align="right" valign="top">6.<TD> <TD valign="top"> One woman was the third to forget, while the other woman turned
them back at Flag Street.
<TR><TH align="right" valign="top">7.<TD> <TD valign="top"> They returned for the fiddle just before the trip back that began
at Frond Street and just following the one requested by Flo.
<P></TABLE></BLOCKQUOTE><P>
<P>This problem includes <EM>two</EM> orderings. The order of the intersections
with cross streets is not necessarily the same as the time order of the
return trips. That is, the Footes might have gone four blocks before making
their first turnaround, then gone only one block before the second return,
and so on. In making a chart for this problem, I used the numbers 1-5 to
represent the order of streets, and the ordinals 1st-5th to represent the
time order of return trips. Clue 1, then, gives us this series of
implications:
<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Flag Street is 1, <STRONG>then</STRONG> 2nd is 2.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Flag Street is 2, <STRONG>then</STRONG> 2nd is 3.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Flag Street is 3, <STRONG>then</STRONG> 2nd is 4.
<P>
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> 2nd is 2, <STRONG>then</STRONG> Flag Street is 1.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> 2nd is 3, <STRONG>then</STRONG> Flag Street is 2.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> 2nd is 4, <STRONG>then</STRONG> Flag Street is 3.
<P>
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> 2nd is 2, <STRONG>then</STRONG> Fred is 3.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> 2nd is 3, <STRONG>then</STRONG> Fred is 4.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> 2nd is 4, <STRONG>then</STRONG> Fred is 5.
<P>
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Fred is 3, <STRONG>then</STRONG> 2nd is 2.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Fred is 3, <STRONG>then</STRONG> 2nd is 3.
<TR><TH align="right" valign="top">•<TD> <TD valign="top"><STRONG>If</STRONG> Fred is 5, <STRONG>then</STRONG> 2nd is 4.
<P></TABLE><P>
<P>It may seem that there are some missing from this list. What if
Flag Street is 4? But that can't happen (and I so marked the chart) because
then 2nd would have to be 5, and that would make Fred 6, which is too far.
But notice that we do have to include both directions of implication between
any two propositions. The clue tells us that if Flag Street is 1, then 2nd
is 2, and also that if 2nd is 2, then Flag Street is 1.
<P>See if you can solve this problem, and then we'll talk about the
Logo-based inference system.
<P><H2>Data Structure</H2>
<P>In addition to the status of the "<EM>x</EM> is <EM>y</EM>" propositions, the program
needs to keep track of the categories, like "first name," and the
individuals within each category. This information is needed for the sake
of the elimination and uniqueness rules. I chose to have a global variable
named <CODE>categories</CODE> containing (for the cub reporter problem) the list
<P><PRE>[first last age job]
</PRE>
<P>Each of the elements of that list is the name of a variable
containing the individuals in that category; for example, <CODE>:age</CODE> is
the list
<P><PRE>[32 38 45 55]
</PRE>
<P>The status of the "is" propositions is kept in property lists associated
with the names of individuals. For example, to indicate that the proposition
"Jane is King" is false, the program carries out these two instructions:
<P><PRE>pprop "Jane "King "false
pprop "King "Jane "false
</PRE>
<P>Both are necessary because we can't predict whether we're going
to be figuring out something about Jane or something about King when we
next need this information. Actually, the fact that the proposition is
stored twice reflects another inference rule, one that's so obvious we
don't think about it at all when solving these problems without using a
computer: the <EM>symmetric</EM> rule says that if <EM>x</EM> is <EM>y</EM>, then
<EM>y</EM> is <EM>x</EM>.
<P>If a given property's value is the empty list, that means that the truth of
the proposition is unknown. (Remember that Logo property lists give the
empty list as the value of a property that you haven't defined explicitly,
so I don't have to predefine all possible properties.) The word <CODE>true</CODE>
means that the proposition is known to be true; the word <CODE>false</CODE> means
that it's known to be false. If the proposition is linked to other
propositions by implication, then the value of the property is a list
containing four-element implication lists. The first member of each
implication is <CODE>true</CODE> or <CODE>false</CODE> to indicate which value of this
proposition implies something about the other proposition. The next two
members are names of individuals, indicating which other proposition is
determined by this one. The fourth member is again <CODE>true</CODE> or <CODE>
false</CODE>, indicating whether that other proposition is implied to be true
or implied to be false.
<P>For example, the
statement "My name is Irving, and I'm 45" attributed to Jane is stored as
the following two implications:
<P><P>
<TABLE><TR><TH align="right" valign="top">•<TD> <TD valign="top">Under <CODE>Jane</CODE> and <CODE>Irving</CODE>, the list <CODE>[true Jane 45 false]</CODE>.
<TR><TH align="right" valign="top">•<TD> <TD valign="top">Under <CODE>Jane</CODE> and <CODE>45</CODE>, the list <CODE>[true Jane Irving false]</CODE>.
</TABLE><P>
<P>Although the data structure as described so far
contains all the necessary information, it
turned out to be convenient to include some redundant forms of the same
information. For example, the elimination rule and the uniqueness rule
require the program to find all the <EM>peers</EM> of an individual--the
other individuals in the same category. It would be possible to start with
<CODE>:categories</CODE> and search for the known individual in a category list,
but it was easier to give each individual a <CODE>category</CODE> property whose
value is the name of the category to which it belongs.
<P>Similarly, the transitive rule needs to know all of the propositions
involving a particular individual that are known to be true, and all
those that are known to be false. This
information is implicit in the properties already described, but to simplify
the program each individual has a <CODE>true</CODE> property whose value is a list
of the other individuals known to be equal to this one, and a <CODE>false</CODE>
property whose value is a list of the others known to be different from
this one. For example, after
processing statement 7 we know that Jane is Irving and that Jane is the
pilot, but we don't know Jane's age. Therefore the property list for <CODE>
Jane</CODE> has a property whose name is <CODE>true</CODE> and whose value is the list
<P><PRE>[Irving pilot]
</PRE>
<P>and one whose name is <CODE>false</CODE> and whose value is
<P><PRE>[King Mendle Nathan drafter sergeant driver]
</PRE>
<P>Finally, it turned out that at the end of the program, in order to print out
the solutions, it was convenient to have, for each individual, properties
with a category as the name and an individual in that category as the value.
For example, since Jane's last name is Irving, <CODE>Jane</CODE> has a property
whose name is <CODE>last</CODE> and whose value is <CODE>Irving</CODE>.
<P><H2>Program Structure: Recording Simple Propositions</H2>
<P>I wanted to enter the problem as a series of assertions, each represented by
a Logo instruction. That is, I wrote this procedure:
<P><PRE>to cub.reporter
cleanup
category "first [Jane Larry Opal Perry]
category "last [Irving King Mendle Nathan]
category "age [32 38 45 55]
category "job [drafter pilot sergeant driver]
differ [Jane King Larry Nathan]
says "Jane "Irving 45
says "King "Perry "driver
says "Larry "sergeant 45
says "Nathan "drafter 38
differ [Mendle Jane Opal Nathan]
says "Mendle "pilot "Larry
says "Jane "pilot 45
says "Opal 55 "driver
says "Nathan 38 "driver
print []
solution
end
</PRE>
<P>(The first instruction erases any old property lists left over
from a previous logic problem; the last prints out the results.) I made <CODE>
category</CODE>, <CODE>differ</CODE>, and <CODE>says</CODE> print out their inputs when invoked,
so that you can watch the progress of the solution while it's being computed.
<P>The procedures <CODE>differ</CODE> and <CODE>says</CODE> were designed to reflect the
terms in which this problem is presented, rather than the internal workings
of the inference system. That is, if you compare the problem statement with
this procedure, it's easy to see which instructions represent which clues,
even if you have no idea how the program will actually solve the problem!
Our task is to implement <CODE>differ</CODE> and <CODE>says</CODE> using propositional
logic.
<P>There's an important difference between <CODE>differ</CODE> and <CODE>says</CODE>. <CODE>
Differ</CODE> gives us direct information about the truth or falsehood of some
simple propositions; specifically, we learn that several "<EM>x</EM> is <EM>y</EM>"
propositions are false. <CODE>Says</CODE>, on the other hand, gives us no
direct information about simple propositions; it tells us about
implications linking two such propositions.
<P>First let's consider how <CODE>differ</CODE> works. The instruction
<P>
<P><PRE>differ [Jane King Larry Nathan]
</PRE>
<P>tells us that Jane is not King, Jane is not Larry, Jane is
not Nathan, King is not Larry, King is not Nathan, and Larry is not Nathan.
In effect, <CODE>differ</CODE> will carry out the instructions
<P><PRE>falsify "Jane "King
falsify "Jane "Larry
falsify "Jane "Nathan
falsify "King "Larry
falsify "King "Nathan
falsify "Larry "Nathan
</PRE>
<P>There's no need to invoke <CODE>falsify</CODE> for the six cases with
inputs in the opposite order (such as the proposition that King is not Jane)
because, as we'll see, <CODE>falsify</CODE> knows about the symmetric rule and
will record those automatically. By the way, some of these six are
unnecessary because the two individuals have the same category and therefore
couldn't possibly be the same, such as Jane and Larry, both first names.
But <CODE>falsify</CODE> will catch these cases and return without doing anything.
<P>Here's how <CODE>differ</CODE> actually carries out all those <CODE>falsify</CODE>
instructions:
<P><PRE>to differ :list
print (list "differ :list)
foreach :list [differ1 ? ?rest]
end
to differ1 :a :them
foreach :them [falsify :a ?]
end
</PRE>
<P><CODE>Falsify</CODE>, and a similar procedure <CODE>verify</CODE> to assert the truth
of a proposition, are implemented in terms of the central procedure
about propositions, <CODE>settruth</CODE>:
<P><PRE>to verify :a :b
settruth :a :b "true
end
to falsify :a :b
settruth :a :b "false
end
</PRE>
<P><CODE>Settruth</CODE> takes three inputs, two individuals and a truth
value. It records the truth or falsehood of the proposition that <CODE>:a</CODE>
is <CODE>:b</CODE>, and uses the rules of inference I've described to derive
other propositions when possible.
<P><PRE>to settruth :a :b :truth.value
if equalp (gprop :a "category) (gprop :b "category) [stop]
localmake "oldvalue get :a :b
if equalp :oldvalue :truth.value [stop]
if equalp :oldvalue (not :truth.value) ~
[(throw "error (sentence [inconsistency in settruth]
:a :b :truth.value))]
print (list :a :b "-> :truth.value)
store :a :b :truth.value
settruth1 :a :b :truth.value
settruth1 :b :a :truth.value
if not emptyp :oldvalue ~
[foreach (filter [equalp first ? :truth.value] :oldvalue)
[apply "settruth butfirst ?]]
end
to settruth1 :a :b :truth.value
apply (word "find not :truth.value) (list :a :b)
foreach (gprop :a "true) [settruth ? :b :truth.value]
if :truth.value [foreach (gprop :a "false) [falsify ? :b]
pprop :a (gprop :b "category) :b]
pprop :a :truth.value (fput :b gprop :a :truth.value)
end
</PRE>
<P>If the two individuals are in the same category, <CODE>settruth</CODE>
does nothing. This is to allow for cases such as the example
<P><PRE>falsify "Jane "Larry
</PRE>
<P>as explained earlier. If the individuals are in different
categories, the next step is to find what information was previously
available about this proposition. If it was already known to be true
or false, and the truth value input in this call agrees with what was
known, then the program has merely generated some redundant information
and <CODE>settruth</CODE> stops. If the known value disagrees with the input
value, then something is wrong; the program has proven two contradictory
propositions. Generally this means that some clue has been represented
incorrectly in the program.
<P>If the proposition was not already known to be true or false, then we
have some new information. After notifying the user by printing a
message, <CODE>settruth</CODE>
must store that fact. (Procedures <CODE>get</CODE> and <CODE>store</CODE> are an
interface to the property list primitives that implement the symmetric
rule.) The next step is to make any possible inferences using the
elimination, uniqueness, and transitive rules; this is done separately
for "<CODE>:a</CODE> is <CODE>:b</CODE>" and for "<CODE>:b</CODE> is <CODE>:a</CODE>" by two
calls to the subprocedure <CODE>settruth1</CODE>.
<P><CODE>Settruth1</CODE> invokes either <CODE>findtrue</CODE> or <CODE>findfalse</CODE>
depending on the truth value. Because of the <CODE>not</CODE> in the
first instruction of <CODE>settruth1</CODE>, <CODE>findtrue</CODE> is called when we are
falsifying a proposition, and vice versa. This makes sense because of
the tasks of these procedures. If we are falsifying a proposition,
then the elimination rule comes into play, and we try to find a true
proposition. If we are verifying a proposition, then the uniqueness
rule lets us find several false propositions on the same row or
column of the chart. The next instructions in <CODE>settruth1</CODE> implement
the transitive rule, looking at other individuals known to be the same
as, or different from, <CODE>:a</CODE>. The last two instructions maintain
the redundant information used for printing the results and for carrying
out the transitive rule.
<P>The final instruction in <CODE>settruth</CODE> deals with implications. If we
already knew that <EM>p</EM>→<EM>q</EM> and now we're learning that <EM>p</EM> is true,
we can infer that <EM>q</EM> is true. (This is another rule of inference,
which I'll call the <EM>implication</EM> rule.)
<P><H2>Program Structure: Recording Implications</H2>
<P>Speaking of implications, we can now explore how the <CODE>says</CODE> procedure
produces and records implications. The instruction
<P><PRE>says "Jane "Irving 45
</PRE>
<P>establishes a relationship between the two propositions
"Jane is Irving" and "Jane is 45." The relationship is that
exactly one of them must be true; the name for this is <EM>exclusive or.</EM>
<P><PRE>to says :who :what1 :what2
print (list "says :who :what1 :what2)
xor :who :what1 :who :what2
end
to xor :who1 :what1 :who2 :what2
implies :who1 :what1 "true :who2 :what2 "false
implies :who1 :what1 "false :who2 :what2 "true
end
</PRE>
<P><CODE>Says</CODE> is a special case of <CODE>xor</CODE> (a common abbreviation
for exclusive or) in which the two propositions are about a common
individual, Jane in the example. The <CODE>xor</CODE> procedure establishes
two implications, <EM>p</EM> → ¬  <EM>q</EM> and
¬  <EM>p</EM> → <EM>q</EM>. In other words,
if we learn that Jane is Irving, then we can infer that Jane is not 45;
if we learn that Jane is not Irving, we can infer that Jane is 45.
(These two implications are <EM>not</EM> equivalent to each other; in
general, one
might be true and the other false. But the exclusive or relationship
means that both are true.)
<P>The <CODE>implies</CODE> procedure takes six inputs. The first three are for
one proposition and the others for the second proposition. For each
proposition, two inputs are individuals (call them <EM>x</EM> and <EM>y</EM>) and
the third is <CODE>true</CODE> or <CODE>false</CODE>. If <CODE>true</CODE>, then the
proposition is "<EM>x</EM> is <EM>y</EM>"; if <CODE>false</CODE>, the proposition is
"<EM>x</EM> is not <EM>y</EM>."
<P>Yet another rule of inference comes into play here, the <EM>
contrapositive</EM> rule. It says
that if <EM>p</EM>→<EM>q</EM> is true, then so is ¬<EM>q</EM> → ¬<EM>p</EM>.
Although these two implications are mathematically equivalent,
we must enter both of them into our data structure because we might
happen to discover that ¬<EM>q</EM> is true, and we'll look for relevant
implications filed under <EM>q</EM> rather than under <EM>p</EM>.
<P><PRE>to implies :who1 :what1 :truth1 :who2 :what2 :truth2
implies1 :who1 :what1 :truth1 :who2 :what2 :truth2
implies1 :who2 :what2 (not :truth2) :who1 :what1 (not :truth1)
end
</PRE>
<P>The first call to <CODE>implies1</CODE> stores the implication
in the form given to us; the second call stores its contrapositive.
<P><PRE>to implies1 :who1 :what1 :truth1 :who2 :what2 :truth2
localmake "old1 get :who1 :what1
if equalp :old1 :truth1 [settruth :who2 :what2 :truth2 stop]
if equalp :old1 (not :truth1) [stop]
if memberp (list :truth1 :who2 :what2 (not :truth2)) :old1 ~
[settruth :who1 :what1 (not :truth1) stop]
if memberp (list :truth1 :what2 :who2 (not :truth2)) :old1 ~
[settruth :who1 :what1 (not :truth1) stop]
store :who1 :what1 ~
fput (list :truth1 :who2 :what2 :truth2) :old1
end
</PRE>
<P>The first three instructions in <CODE>implies1</CODE> check for the
case in which we are told that <EM>p</EM>→<EM>q</EM> but we already know
either <EM>p</EM> or ¬<EM>p</EM>. If we already know <EM>p</EM>, then we can
derive that <EM>q</EM> is true. There's no need to store the implication,
because it is already serving its purpose, which is to allow us to
infer <EM>q</EM>. If we already know ¬<EM>p</EM>, then we can forget about
this implication, because it isn't going to do us any good. We can't
infer anything about <EM>q</EM> from this situation.
<P>The next two instructions check for the situation in which we
are told <EM>p</EM>→<EM>q</EM> but we already know that <EM>p</EM>→¬<EM>q</EM>.
In that case, <EM>p</EM> must not be true, because if it were, we could
infer two contradictory conclusions. Therefore, we can falsify
the proposition <EM>p</EM>.
<P>Finally, if none of these conditions is met, we add this implication
to the (possibly empty) list of already known implications about <EM>p</EM>.
<P><H2>Using Implications to Represent Orderings</H2>
<P>For another example of implications at work, here's the Logo program
for the Foote problem:
<P><PRE>to foote.family
cleanup
category "when [1st 2nd 3rd 4th 5th]
category "name [Felix Fred Frank Francine Flo]
category "street [Field Flag Fig Fork Frond]
category "item [food film flashlight fan fiddle]
category "position [1 2 3 4 5]
print [Clue 1]
justbefore "Flag "2nd :position
justbefore "2nd "Fred :position
print [Clue 2]
male [film Fig 5th]
print [Clue 3]
justbefore "flashlight "Fork :position
justbefore "Fork "1st :position
female [1st]
print [Clue 4]
falsify "5th "Frond
falsify "5th "fan
print [Clue 5]
justbefore "Francine "Frank :position
justbefore "Francine "Frank :when
print [Clue 6]
female [3rd Flag]
print [Clue 7]
justbefore "fiddle "Frond :when
justbefore "Flo "fiddle :when
print []
solution
end
</PRE>
<P>
I've included <CODE>print</CODE> instructions to let the user know when the
program reaches each of the seven numbered clues. Clue 4 tells us
directly that two possible pairings of individuals are false, and
the program reflects this with two invocations of <CODE>falsify</CODE>. But
the other clues either tell us about the sex of the family members
or tell us that certain individuals are next to each other in an
ordering. The procedures <CODE>male</CODE>, <CODE>female</CODE>, and <CODE>justbefore</CODE>
reflect the language of the problem in the same way that <CODE>says</CODE>
reflected the language of the other problem.
<P><PRE>to male :stuff
differ sentence :stuff [Francine Flo]
end
to female :stuff
differ sentence :stuff [Felix Fred Frank]
end
</PRE>
<P>Needless to say, this implementation of <CODE>male</CODE> and <CODE>female</CODE>
works only for this specific logic problem! It's tempting to try to
use the more general category mechanism this way:
<P><PRE>category "sex [male female]
verify "Francine "female
verify "Flo "female
verify "Felix "male
verify "Fred "male
verify "Frank "male
</PRE>
<P>but unfortunately the structure of this inference system
requires that each individual can only match one individual in
another category; once we've verified that Francine is female,
the uniqueness rule will deduce that Flo isn't female!
<P>On the other hand, I've tried to write <CODE>justbefore</CODE> in a way
that will work for future problems.
<P><PRE>to justbefore :this :that :lineup
falsify :this :that
falsify :this last :lineup
falsify :that first :lineup
justbefore1 :this :that :lineup
end
to justbefore1 :this :that :slotlist
if emptyp butfirst :slotlist [stop]
equiv :this (first :slotlist) :that (first butfirst :slotlist)
justbefore1 :this :that (butfirst :slotlist)
end
to equiv :who1 :what1 :who2 :what2
implies :who1 :what1 "true :who2 :what2 "true
implies :who2 :what2 "true :who1 :what1 "true
end
</PRE>
<P>The input named <CODE>lineup</CODE> is a list of individuals from
an ordering, in the correct order. In this problem, it will be the list
<P><PRE>[1 2 3 4 5]
</PRE>
<P>if the clue is about street position, or
<P><PRE>[1st 2nd 3rd 4th 5th]
</PRE>
<P>if the clue is about the order of events in time. As I explained
when talking about orderings earlier, the instruction
<P><PRE>justbefore "Flag "2nd :position
</PRE>
<P>results in falsifying some propositions:
<P><PRE>falsify "Flag "2nd
falsify "Flag 5
falsify "2nd 1
</PRE>
<P>and also asserts some implications:
<P><PRE>implies "Flag 1 "true "2nd 2 "true
implies "2nd 2 "true "Flag 1 "true
implies "Flag 2 "true "2nd 3 "true
implies "2nd 3 "true "Flag 2 "true
implies "Flag 3 "true "2nd 4 "true
implies "2nd 4 "true "Flag 3 "true
implies "Flag 4 "true "2nd 5 "true
implies "2nd 5 "true "Flag 4 "true
</PRE>
<P><H2>Backtracking</H2>
<P>This inference system can solve many logic problems, but sometimes it
runs into trouble. I've already mentioned, while discussing the <CODE>male</CODE>
and <CODE>female</CODE> procedures, that the program insists that each individual
in a category must appear exactly once in the solution. Suppose you have
a problem about five people living in five houses; they have five distinct
first names, five last names, five occupations--but two of the houses
are yellow. It's sometimes possible to get around this, but the technique
is a little awkward. You can set up a category like this:
<P><PRE>category "color [yellow1 yellow2 blue brown green]
</PRE>
<P>then you find <EM>one</EM> clue that mentions a yellow house
and use <CODE>yellow1</CODE> for that one:
<P><PRE>verify "Fred "yellow1
</PRE>
<P>but for any other mention of something being yellow in the
problem, you represent that by saying that the individual is <EM>not</EM> one
of the other colors:
<P><PRE>differ "architect [blue brown green]
</PRE>
<P>You never explicitly mention <CODE>yellow2</CODE> except in setting
up the category.
<P>That technique works if you know that yellow is the color that appears
twice. What if the problem tells you only that there are five houses
and four colors? Or what if some individuals are <EM>not</EM> in the
solution? For example, a problem might discuss the activities of five
people, each doing something on a different day of the week, but you
don't know until you solve the problem which of the seven weekdays
are involved.
<P>An entirely different approach to solving logic problems by computer
is <EM>backtracking.</EM> Instead of starting from the clues
and making deductions, a program can start by making arbitrary guesses
about who goes with what, and then use the clues to look for contradictions.
If a contradiction is found, the program systematically makes a different
guess until every possible arrangement has been tried. (Presumably before
every possibility has been tried, one of them will <EM>not</EM> lead to a
contradiction, and that's the solution.)
<P>In practice, backtracking is a more flexible technique than the inference
system I wrote, because it's easy to let a backtracking program try
multiple appearances of an individual if the problem allows that. But I
thought the inference system would be more interesting to study, for two
reasons. First, the inference system more closely models the way people
solve logic problems. In the cub reporter problem, with four people
to keep straight, there are (4!)<SUP><SMALL>3</SMALL></SUP> or 13,824 possible solutions. In
the Foote problem, with five people, there are (5!)<SUP><SMALL>4</SMALL></SUP> or just over
200 million possibilities. A computer can try them all, but a person
couldn't.<SUP>*</SUP>
(If multiple appearances were allowed, the numbers would be
even higher.) Second, backtracking doesn't work at all unless the problem
deals with a finite set of individuals, as in a logic problem. Inference
systems can be generalized to deal with potentially unbounded problems.
<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>People do sometimes use a combination of inference and
backtracking. If, by inference, you've established that Jane must be
either the pilot or the drafter, but you can't settle which, you might
decide to <EM>assume</EM> that Jane is the pilot and hope that you can then
infer either a complete solution to the problem or a contradiction. In the
latter case, you'd know that Jane must be the drafter.</SMALL></BLOCKQUOTE></SMALL><P><H2>Generalized Inference Systems and Predicate Logic</H2>
<P>The rules of inference in this program are specially designed for problems
like the ones we've just solved. The implication
rule is applicable to any propositional logic situation, but the ones
based on categories, such as the uniqueness and elimination rules, are not.
The idea of a "category" as we've used it
isn't a general principle of logic; instead, that idea should really be
expressed as a series of propositions. For example, to say "there is a
category called `last name' whose members are Irving, King, Mendle, and
Nathan" is really to make several statements of the form "Jane has exactly
one last name," or, in terms of the basic "<EM>x</EM> is <EM>y</EM>" propositions,
<P><CENTER><TABLE><TR><TD>(Jane is Irving) ∨ (Jane is King)
<TR><TD> ∨ (Jane is Mendle) ∨ (Jane is Nathan)
</TABLE></CENTER><P>
and so on. If we wanted to solve this problem in a general
inference system we'd assert the truth of all those propositions at the
beginning. Then if the program discovers that Jane is Irving, it would
have the two propositions
<P><CENTER>¬<STRONG>(</STRONG>(Jane is Irving)
∧ (Jane is King)<STRONG>)</STRONG></CENTER><P>
and
<P><CENTER>(Jane is Irving)</CENTER><P>
and from these it could infer
<P><CENTER>¬(Jane is King)</CENTER><P>
using the standard inference rules of propositional logic.
<P>The "and so on" just above includes quite a large number of propositions.
And yet this small problem concerns a mere 16 individual names divided into
four categories. For a larger problem it would be nearly impossible to list
all the relevant propositions, and for a problem involving an infinite set
of individuals, such as the integers, it would be literally impossible. What
would make the representation of a problem easier is if we could use, in a
formal system, the same kind of language I used in describing (in English)
the inference rules earlier: "for all other <EM>z</EM> in the same category as
<EM>y</EM>..." There are two parts to such a formal notation. First, in addition
to the <EM>propositional</EM> variables like <EM>p</EM> used in propositional logic,
we need variables like <EM>x</EM> and <EM>y</EM> that can represent objects in the system
we want to describe. Second, we need a notation for "for all." The formal
system including these additions to propositional logic is called <EM>
predicate</EM> logic. The name is like that used in Logo to refer to
operations with <CODE>true</CODE> or <CODE>false</CODE> outputs because
the statements in predicate logic
involve truth-valued functions of objects analogous to the
Logo predicates. For example, the formula we've been representing
informally as "<EM>x</EM> is <EM>y</EM>" is represented formally using the
predicate function is(<EM>x</EM>,<EM>y</EM>). This
is much like the Logo expression
<P><PRE>equalp :x :y
</PRE>
<P>A predicate function of two arguments ("inputs" in Logo) is also
called a <EM>relation</EM> in mathematics. Algebraic relations include ones
like = (equal) and < (less than).
<P>We are almost ready to show how the uniqueness rule can be expressed as a
formula in predicate logic. If you're not accustomed to mathematical
formalism, this formula is a little scary--perhaps the
scariest thing in this book. But I want you to see it so that
you'll appreciate the fact that just <EM>one</EM> formula of predicate logic
can sum up a rule that would require <EM>many</EM> formulas in propositional
logic. I'm going to introduce a new relation called
"isa" that's true if its first argument is a member of its second
argument. The first must be an individual and the second a category. For
example, isa(pilot,job) is true. And the symbol
∀ means "for all." The uniqueness rule says that if <EM>x</EM> is <EM>y</EM>
then <EM>x</EM> can't also be <EM>z</EM> for any <EM>z</EM> in the same category as <EM>y</EM>. Here's
how to say that formally:
<P><CENTER>is(<EM>x</EM>,<EM>y</EM>) → ∀<EM>z</EM>
<STRONG>(</STRONG>(isa(<EM>y</EM>,<EM>a</EM>) ∧ isa(<EM>z</EM>,<EM>a</EM>)
∧ ¬(<EM>y</EM>=<EM>z</EM>)<STRONG>)</STRONG></CENTER><P>
<P>To indicate the linking of two propositions in the general inference system,
no special rules of inference are required. To say "the reporter wrote
down these two statements; one is true and the other false" is just to say
<EM>p</EM>⊗<EM>q</EM>; I'm using the symbol ⊗ to
represent the <EM>exclusive or</EM> function. This formula is equivalent to
<P><CENTER>(<EM>p</EM>∨<EM>q</EM>) ∧ ¬(<EM>p</EM>∧<EM>q</EM>)</CENTER><P>
A general inference system will know that if it's been told <EM>p</EM>⊗<EM>q</EM>
and then later it learns, say, <EM>p</EM> then it can infer ¬<EM>q</EM>.
<P>The cub reporter problem is simpler than some of its type in that
the only relevant relation among individuals is "is," which is an
<EM>equivalence</EM> relation. This means that if Jane is Irving then also
Irving is Jane (the technical name for a relation with that property is
<EM>symmetric</EM>), and also that if Jane is Irving, and Irving is the pilot,
then Jane is the pilot (so the relation is <EM>transitive</EM>). It's
relatively easy to work out all the implications of a proposition about
equivalence relations.
<P>
By contrast, the Foote problem contains <EM>ordering</EM> relations like "is
later than" that are transitive but not symmetric. To handle such problems
the inference system must have a way to represent "<EM>x</EM> is the last."
Other problems contain relations like "lives in the house next to" that
are symmetric but not transitive. A statement like "Mr. Smith lives in the
house at the end" has to be represented formally as something like "there
is only one person <EM>x</EM> such that <EM>x</EM> lives in the house next to Mr. Smith."
<P>One reason a general inference system is harder to program than the
special-purpose one I've written in this chapter is that my system makes all
possible inferences from any newly verified or falsified proposition. This
is possible only because there is a finite, fairly small number of such
inferences. Once you introduce variables and predicates, the number of
possible inferences is potentially infinite. A general inference system must
take care not to infer infinitely many useless results. One solution is to
defer the making of inferences until the user of the system asks a question,
and then infer only what's needed to answer that particular question. But
it isn't always easy to know exactly what's needed.
<P>An inference system like the one I'm vaguely describing is a central part of
the programming language Prolog. In that language, you program not by
issuing instructions that tell the computer what to do, but rather by making
<EM>assertions</EM> that some proposition is true. You can then ask questions
like "for what values of <EM>x</EM> is this formula true?"
<P>
<H2>Logic and Computer Hardware</H2>
<P>Besides inference systems, another area in which logic is important in
computer science is in the design of the computers themselves. In a
computer, information is represented as electrical signals flowing through
wires. (These days, a "wire" is likely to be a microscopic conducting
region on a silicon chip rather than a visible strand of metal, but the
principle is the same.) In almost all computers, each wire may be carrying
one of two voltage levels at any moment. (It is the restriction to two
possible voltages that makes them <EM>binary</EM> computers. It would be
possible to build <EM>ternary</EM> computer circuits using three
voltages, but I know of no practical application of that idea.) A computer
is built out of small circuit elements called <EM>gates</EM> that combine or
rearrange binary signals in various ways. Perhaps the simplest example of a
gate is an <EM>inverter</EM>; it has one input signal and provides one output
signal that is the opposite of the input. That is, if you have a high
voltage coming into the inverter you get a low voltage out, and vice versa.
<P>The voltages inside the computer can be thought of as representing numbers
(zeros and ones) or truth values (false and true). From now on I'll use the
symbols 0 and 1 to represent the voltages, but you can mentally replace 0
with <CODE>false</CODE> and 1 with <CODE>true</CODE> to see how what I'm saying here ties
in with what's gone before.
<P>Suppose you have a gate with two input wires and one output. What are the
possibilities for how that output is determined by the inputs? Each of the
two inputs can have two possible values; that means that a gate has four
different possible input configurations. For each of those four, the gate
can output 0 or 1. As you can see from the chart below, this means that
there are 16 possible kinds of two-input gates:
<P><CENTER><TABLE>
<TR><TH align="left">input <EM>p</EM>:<TH> 0
<TH> 0<TH> 1<TH> 1
<TR><TH align="left">input <EM>q</EM>:<TH> 0
<TH> 1<TH> 0<TH> 1
<TR><TD>
<TR><TH align="left">outputs:<TD> 0
<TD> 0<TD> 0<TD> 0
<TR><TH><TD> 0<TD> 0<TD> 0<TD> 1<TD> and (<EM>p</EM>∧<EM>q</EM>)
<TR><TH><TD> 0<TD> 0<TD> 1<TD> 0
<TR><TH><TD> 0<TD> 0<TD> 1<TD> 1
<TR><TH><TD> 0<TD> 1<TD> 0<TD> 0
<TR><TH><TD> 0<TD> 1<TD> 0<TD> 1
<TR><TH><TD> 0<TD> 1<TD> 1<TD> 0<TD> exclusive or (<EM>p</EM>⊗<EM>q</EM>)
<TR><TH><TD> 0<TD> 1<TD> 1<TD> 1<TD> or (<EM>p</EM>∨<EM>q</EM>)
<TR><TH><TD> 1<TD> 0<TD> 0<TD> 0<TD> nor (not-or, ¬(<EM>p</EM>∨<EM>q</EM>))
<TR><TH><TD> 1<TD> 0<TD> 0<TD> 1<TD> equivalence (<EM>p</EM>↔<EM>q</EM>)
<TR><TH><TD> 1<TD> 0<TD> 1<TD> 0
<TR><TH><TD> 1<TD> 0<TD> 1<TD> 1
<TR><TH><TD> 1<TD> 1<TD> 0<TD> 0
<TR><TH><TD> 1<TD> 1<TD> 0<TD> 1<TD> implies (<EM>p</EM>→<EM>q</EM>)
<TR><TH><TD> 1<TD> 1<TD> 1<TD> 0<TD> nand (not-and, ¬(<EM>p</EM>∧<EM>q</EM>))
<TR><TH><TD> 1<TD> 1<TD> 1<TD> 1
</TABLE></CENTER>
<P>The table indicates, for example, that a gate whose output is 1
only when both inputs are 1 is an <EM>and</EM> gate, implementing the usual
logical and operation. The 16 possibilities include all the standard logic
functions as well as several less obviously useful ones. Two gate types
called <EM>nand</EM> and <EM>nor</EM> represent functions rarely used in
mathematical logic but common in computer design because it is sometimes
helpful to have the opposite of the signal you're logically interested in.
<P>Numbers other than 0 and 1 can't be represented as a single signal in a
single wire. That's why there isn't a "plus gate" along with the and
gates, or gates, and so on; if both inputs to a plus gate were 1, the output
would have to be 2. To add two zero-or-one numbers we need a more
complicated device with <EM>two</EM> output wires, one of which is the
"carry" to the next binary digit:
<P><CENTER><TABLE>
<TR><TH align="left"><EM>A</EM> input:<TH> 0
<TH> 0<TH> 1<TH> 1
<TR><TH align="left"><EM>B</EM> input:<TH> 0
<TH> 1<TH> 0<TH> 1
<TR><TD>
<TR><TH align="left">sum out:<TD> 0
<TD> 1<TD> 1<TD> 0
<TR><TH align="left">carry out:<TD> 0
<TD> 0<TD> 0<TD> 1
</TABLE></CENTER>
<P>These sum and carry outputs can be defined in terms of logical
operations:
<P>
<P><CENTER><TABLE><TR><TD align="right">Sum<TD> =
<TD> <EM>A</EM>⊗<EM>B</EM>
<TR><TD align="right">Carry<TD> =
<TD> <EM>A</EM>∧<EM>B</EM>
</TABLE></CENTER><P>
<P>Exclusive or gates are, in fact, not generally used as basic
hardware components, so this device is traditionally represented in terms of
and gates, or gates, and inverters:
<P><CENTER><IMG SRC="half-adder.gif" ALT="figure: half-adder"></CENTER>
<P>The device we've just built is called a <EM>half-adder</EM> for
reasons that should become clear in a moment.
<P>To represent numbers larger than 1 we have to use more than one signal wire.
Each signal represents a binary digit, or <EM>bit,</EM> that is implicitly
multiplied by a power of 2 just as in the ordinary decimal representation of
numbers each digit is implicitly multiplied by a power of 10. For example,
if we have three signal wires for a number, they have multipliers of 1, 2,
and 4; with these signals we can represent the eight numbers from 0 (all
signals 0) to 7 (all signals 1). When we want to add two such three-bit
numbers, the sum for all but the rightmost bit can involve a carry <EM>
in</EM> as well as a carry out. The circuit for each bit position must have
<EM>three</EM> inputs, including one for the carry from the next bit over as
well as the two external inputs. The outputs are found using these formulas:
<P>
<P><CENTER><TABLE><TR><TD align="right">Sum<TD> =
<TD> (<EM>A</EM>⊗<EM>B</EM>)⊗CarryIn
<TR><TD align="right">CarryOut<TD> =
<TD> (<EM>A</EM>∧<EM>B</EM>)∨
<STRONG>(</STRONG>(<EM>A</EM>⊗<EM>B</EM>)∧CarryIn<STRONG>)</STRONG>
</TABLE></CENTER><P>
<P>This circuit can be built using two half-adders:
<P><CENTER><IMG SRC="full-adder.gif" ALT="figure: full-adder"></CENTER>
<P>To add two three-bit numbers we connect three adders together this
way:
<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/three-adders.gif" ALT="figure: three-adders"></CENTER>
<P>The carry out signal from the leftmost adder (the one representing
the largest power of 2) is the <EM>overflow</EM> signal; if it's true, the
sum didn't fit in the number of bits provided. Many computers use this
signal to <EM>interrupt</EM> the execution of their programs so that people
don't end up seeing incorrect results.
<P>The phrase "computer logic" is widely used, even by non-experts, to refer
to the inner workings of a computer. Many people, though, think that the
phrase describes the <EM>personality</EM> of the computer, which they imagine
to be like that of Mr. Spock. "Computers may be able to play chess, but
they can't write poetry, because that isn't logical." Here you've seen the
real meaning of the phrase: Just as a Logo program has procedures defined in
terms of subprocedures and ultimately in terms of primitive procedures, the
capabilities of a computer itself are built out of smaller pieces, and the
primitive hardware components compute logical, rather than arithmetic,
functions. (For a computer to exhibit Mr. Spock's sense of purpose,
understanding of cause and effect, drive for self-preservation, loyalty to
his species and his government, and so on, would be no less miraculous than
for it to write a love poem or throw a temper tantrum. Later we'll discuss
the efforts of artificial intelligence researchers to produce such miracles.)
<P><H2>Combinatorics</H2>
<P>Earlier I listed the 16 possible logical functions of two logical arguments.
I could have figured out that there are 16 without actually listing them
this way: If a function has two arguments, and each argument has two
possible values, that makes 2<SUP><SMALL>2</SMALL></SUP> possible combinations of argument values.
A logical function is determined by its result for each of those four
possible argument combinations. (The four are <EM>f</EM>(0,0), <EM>f</EM>(0,1), <EM>f</EM>(1,0),
and <EM>f</EM>(1,1).) There are two possible results for <EM>f</EM>(0,0), two for
<EM>f</EM>(0,1), and so on; the number of possible functions is the product of all
these twos, 2<SUP><SMALL>2<SUP><SMALL>2</SMALL></SUP></SMALL></SUP> or 16.
<P>The mathematics of counting how things can be combined is called
<EM>combinatorics.</EM> The problem we've just done illustrates a fundamental
rule of combinatorics, namely that the number of possibilities for a choice
with several components is the product of the number of possibilities for
each component choice. This may be easier to understand with an example in
which not all the relevant numbers are 2. Here's a classic: Suppose you
have a group of four men and three women, and you want to form a committee
of one man and one woman chosen from this group. How many such committees
are possible? There are 4 choices for the male member of the committee and
3 choices for the female member; that means 4×3 or 12 committees.
<P>(The multiplication rule only works if the component choices are <EM>
independent</EM>; that is, the possible outcomes of one choice can't be
affected by the outcome of any of the others. For example, if our
committees are to have a chairperson who can be of either sex and then two
other members, one man and one woman, you can't say "there are 7 choices
for the chairperson, times 4 for the man, times 3 for the woman" because
after choosing the chairperson the number of choices for the other members
is changed depending on the sex of the chair. There are two correct ways
to solve this problem. One is to say "The chairperson is either male or
female. If male, there are 4 possible chairs, times 3 possible other male
members, times 3 possible female members, for 36 possible committees. If
female, there are 3 possible chairs, times 4 possible male members, times 2
possible other female members, for 24 committees. The total is 36+24 or
60 committees." The second solution is to pick the chairperson <EM>last</EM>;
then you say "There are 4 possible male members, times 3 possible females,
times 5 possible chairs (7 people in the original group minus 2 already
chosen)." Apart from arithmetic errors, almost all the mistakes people make
in solving combinatorics problems come from forgetting that events have to
be independent to allow you to multiply the choices.)
<P>Let's return to logical functions for a moment. Suppose we experiment with
a ternary logic in which the possible values are yes, no, and maybe. (You
can represent these using the numbers 0, 1, and 2.) How many three-argument
ternary logic functions are there? Is it 3<SUP><SMALL>(3<SUP><SMALL>3</SMALL></SUP>)</SMALL></SUP> or is it (3<SUP><SMALL>3</SMALL></SUP>)<SUP><SMALL>3</SMALL></SUP>?
(It doesn't matter which way you group the twos for the binary logic version
because the two groupings have the same value, 16, but this isn't true with
threes.) How many two-argument ternary logic functions are there?
2<SUP><SMALL>(3<SUP><SMALL>2</SMALL></SUP>)</SMALL></SUP>? 3<SUP><SMALL>(2<SUP><SMALL>3</SMALL></SUP>)</SMALL></SUP>? (3<SUP><SMALL>3</SMALL></SUP>)<SUP><SMALL>2</SMALL></SUP>? How many three-argument binary logic
functions? The virtue of these problems is that you can check your own
answers by listing all the possible functions as I did for the two-argument
binary logic functions.
<P>Most people are introduced to combinatorics
by way of <EM>probability,</EM>
the mathematics of gambling. Given a number of equally likely possible
situations, some of which win a bet and the rest of which lose it, the
probability of winning is a ratio:
<P><CENTER><IMG SRC="probability.gif"></CENTER><P>
The role of combinatorics is to help in computing the numerator and
denominator of that fraction.
<P>
For example, suppose you have six brown socks and four blue socks in a
drawer, and you pull out two socks without looking. What is the probability
of getting a matching pair? I'm going to give each sock a name like
<EM>brown</EM><SUB><SMALL>3</SMALL></SUB> for the third brown sock so that we can talk about individual socks
even if they're the same color. How many possible pairs of socks are there?
That is, given the set
<P><CENTER>{<EM>brown</EM><SUB><SMALL>1</SMALL></SUB>,
<EM>brown</EM><SUB><SMALL>2</SMALL></SUB>, ...,
<EM>brown</EM><SUB><SMALL>6</SMALL></SUB>,
<EM>blue</EM><SUB><SMALL>1</SMALL></SUB>, ...
<EM>blue</EM><SUB><SMALL>4</SMALL></SUB>}</CENTER><P>
containing ten socks, how many two-sock subsets are there?
<P>The first step in answering this question is to notice that we have 10
choices for the first sock and then 9 choices for the second. So there are
10×9 ways to make the two choices. (There is a subtle point here
that some textbooks don't bother explaining. The choice of the second sock
is <EM>not</EM> independent of the first choice, since we can't choose the
same sock twice. However, the <EM>number</EM> of second choices is always 9,
even though the particular nine available socks depend on the first choice.
So we can get away with multiplying in this example.)
<P>This isn't quite the answer we want, though. We've shown that there are 90
<EM>ordered pairs</EM> of socks. But once we get the socks out of the
drawer, it doesn't matter which we picked first. In other words, if we say
there are 90 possible choices, we are counting
{<EM>brown</EM><SUB><SMALL>2</SMALL></SUB>, <EM>brown</EM><SUB><SMALL>5</SMALL></SUB>} and {<EM>brown</EM><SUB><SMALL>5</SMALL></SUB>, <EM>brown</EM><SUB><SMALL>2</SMALL></SUB>}
as two different choices. Since we don't care which sock came out of the
drawer first, we are really counting every pair twice, so there are only
90/2 or 45 possible pairs.
<P>An ordered subset of a set is called a <EM>permutation;</EM>
a subset without
a particular order is a <EM>combination.</EM> We say that there are 90
permutations of ten things taken two at a time; there are 45 combinations of
ten things taken two at a time.
<P>It turns out that combinations are what matters most of the time; it's
relatively rare for permutations to turn up in a math problem. An exception
is the device wrongly called a "combination lock"; to open one, you must
know a particular permutation of the possible numbers. My high school
locker "combination" was 18-24-14. If I tried the same numbers in a
different order, like 24-18-14, the lock wouldn't open. (Actually it's
misleading for me to use this example because the same number can appear
twice in most locks of this type, so the "combination" is not a subset of
the available numbers. If there are 50 numbers available, the total number
of possibilities is not 50×49×48 but rather 50×50×50. These lock-opening patterns are therefore neither permutations nor
combinations but something else that we might call "permutations with
repetition allowed.")
<P>We haven't finished solving the sock problem. How many of the possible
pairs of socks are matching pairs? One way to find out would be to list all
the possible pairs and actually count how many of them match. This is the
sort of thing computers do well. First we have to write a procedure that
takes as input a list and a number, and outputs a list of lists, each of
which is a subset of the input list whose length is the input number. That
is, we want to take
<P><PRE>combs [brown1 brown2 ... brown6 blue1 ... blue4] 2
</PRE>
<P>and this should output a list of pairs:
<P><PRE>[[brown1 brown2] [brown1 brown3] ... [brown4 blue3] ... [blue3 blue4]]
</PRE>
<P>This is a fairly tricky program to write. Try it before you read
further. Can you reduce the problem to a smaller, similar problem? Don't
forget that we want combinations, not permutations, so the output can't have
two sublists containing the same elements.
<P>To make sure that each combination appears in the result in only one order,
we can decide explicitly what that order will be. The most convenient thing
is to say that the elements will appear in each sublist in the same order in
which they appear in the original list. That is, since the input list has
<CODE>brown2</CODE> before <CODE>blue3</CODE>, the output will contain the list
<P><PRE>[brown2 blue3]
</PRE>
<P>but not the list
<P><PRE>[blue3 brown2]
</PRE>
<P>It follows that the very first element of the input list, <CODE>
brown1</CODE>, can only appear as the first element of any output sublist. In
other words, there are two kinds of sublists: ones with <CODE>brown1</CODE> as
their first element and ones that don't include <CODE>brown1</CODE> at all. This
is a way to divide the problem into smaller pieces.
<P>If we are looking for <EM>n</EM>-element subsets, the first kind consists of <CODE>
brown1</CODE> stuck in front of a smaller subset of <EM>n</EM>−1 elements chosen from the
remainder (the <CODE>butfirst</CODE>) of the input list. The second kind of subset
is an <EM>n</EM>-element subset of the <CODE>butfirst</CODE>. We can collect all of each
kind by a recursive invocation of the procedure we're going to write, then
append the two collections and output the result. So the procedure will
look like this:
<P><PRE>to combs :list :howmany
... <stop rule> ...
output sentence (map [fput first :list ?]
combs (butfirst :list) (:howmany-1)) ~
(combs (butfirst :list) :howmany)
end
</PRE>
<P>By now you've had a lot of experience writing
recursive procedures,
but I'm going over this one in detail for two reasons: It's tricky and it's a
model for solving many other combinatorial problems. What makes it tricky
is a combination of two things. One is that it's recursive twice; that is,
there are two recursive invocations of <CODE>combs</CODE> within <CODE>combs</CODE>. This
makes the control structure very different from the
<P><PRE>to blah :list
output fput (<CODE><EM>something</EM></CODE> first :list) (blah butfirst :list)
end
</PRE>
<P>sort of recursion that may be more familiar. The second tricky
part is that there are two input variables, each of which may be made smaller
(by <CODE>butfirst</CODE>ing or by subtracting 1) but need not be in a particular
recursive call.
<P>One implication of these complicating factors is that we need <EM>two</EM>
stop rules. It may be obvious that we need one for the situation of
counting <CODE>:howmany</CODE> down to zero, but we also need one for <CODE>:list</CODE>
getting too small. Ordinarily this latter would be an <CODE>emptyp</CODE> test, but
in fact any list whose length is less than <CODE>:howmany</CODE> is too small, not
just the empty list. Here is the finished procedure:
<P><PRE>to combs :list :howmany
if equalp :howmany 0 [output [[]]]
if equalp :howmany count :list [output (list :list)]
output sentence (map [fput first :list ?]
combs (butfirst :list) (:howmany-1)) ~
(combs (butfirst :list) :howmany)
end
? <U>show combs [a b c d e] 3</U>
[[a b c] [a b d] [a b e] [a c d] [a c e] [a d e]
[b c d] [b c e] [b d e] [c d e]]
</PRE>
<P>Now we can use <CODE>combs</CODE> on the sock problem. (Note: The procedure <CODE>
socks</CODE> shown here is not the one in the program file; there will be a
modified version a few paragraphs down the road.)
<P><PRE>to socks :list
localmake "total combs :list 2
localmake "matching filter [equalp butlast first ? butlast last ?] ~
:total
print (sentence [There are] count :total [possible pairs of socks.])
print (sentence [Of these,] count :matching [are matching pairs.])
print sentence [Probability of match =] ~
word (100*(count :matching)/(count :total)) "%
end
? <U>socks [brown1 brown2 brown3 brown4 brown5 brown6</U>
<U>blue1 blue2 blue3 blue4]</U>
There are 45 possible pairs of socks.
Of these, 21 are matching pairs.
Probability of match = 46.6666667%
</PRE>
<P>The answer is that the probability of a matching pair is just
under half. The template used in the invocation of <CODE>filter</CODE> in <CODE>
socks</CODE> depends on the fact that two socks match if their names are equal
except for the last character, such as <CODE>brown3</CODE> and <CODE>brown4</CODE>.
<P>I've numbered the socks because it's easier for us to talk about how the
program works (and about how the underlying mathematics works, too) if we
can identify an individual sock. But it's worth noting that the program
doesn't really need individual sock names. We could instead use the list
<P><PRE>[brown brown brown brown brown brown blue blue blue blue]
</PRE>
<P>and change the <CODE>filter</CODE> template to
<P><PRE>[equalp first ? last ?]
</PRE>
<P>The program will generate some <CODE>[brown brown]</CODE> pairs, some
<CODE>[brown blue]</CODE> pairs, and some <CODE>[blue blue]</CODE> pairs. The number of
pairs will still be 45 and the number of matching pairs will still be 21.
<P>Having come to that realization, we can make the "user interface" a little
smoother by having <CODE>socks</CODE> accept an input list like
<P><PRE>[6 brown 4 blue]
</PRE>
<P>and expand that into the desired list of ten socks itself. Here is
the final program:
<P><PRE>to socks :list
localmake "total combs (expand :list) 2
localmake "matching filter [equalp first ? last ?] :total
print (sentence [There are] count :total [possible pairs of socks.])
print (sentence [Of these,] count :matching [are matching pairs.])
print sentence [Probability of match =] ~
word (100*(count :matching)/(count :total)) "%
end
to expand :list
if emptyp :list [output []]
if numberp first :list ~
[output cascade (first :list)
[fput first butfirst :list ?]
(expand butfirst butfirst :list)]
output fput first :list expand butfirst :list
end
? <U>socks [6 brown 4 blue]</U>
There are 45 possible pairs of socks.
Of these, 21 are matching pairs.
Probability of match = 46.6666667%
</PRE>
<P>My reason for presenting this refinement of the program is that it offers a
concrete opportunity for reflection on how you can tell which differences are
important in a combinatorics problem. In discussing the first version of
the program, I said that the two lists
<P><PRE>[brown2 brown5] and [brown5 brown2]
</PRE>
<P>represent the same pair of socks, so both shouldn't be included in
the list of lists output by <CODE>combs</CODE>. Now I'm saying that several lists
that look identical like
<P><PRE>[brown brown]
</PRE>
<P>represent <EM>different</EM> pairs of socks and must all be counted.
It would be a mistake to say, "There are three possibilities: brown-brown,
brown-blue, and blue-blue. So the probability of a match is 2/3." It's
true that there are three <EM>kinds</EM> of pairs of socks, but the three
kinds are not equally represented in the list of 45 possible pairs.
<P><H2>Inductive and Closed-Form Definition</H2>
<P>The usual approach to problems like the one about the socks is not to
enumerate the actual combinations, but rather to compute the <EM>number</EM>
of combinations directly. There are formulas for both number of
combinations and number of permutations. Usually the latter is derived
first because it's easier to understand.
<P>With 10 socks in the drawer, the number of two-sock permutations is
10×9. If we'd wanted three socks for a visiting extraterrestrial
friend, the number of permutations would be 10×9×8. In
general, if we have <EM>n</EM> things and we want to select <EM>r</EM> of them, the number
of permutations is
<P><CENTER><IMG SRC="math11.gif" ALT="math display"></CENTER><P>
Mathematicians don't like messy formulas full of dots, so this is usually
abbreviated using the factorial function. The notation "<EM>n</EM>!" is
pronounced "<EM>n</EM> factorial" and represents the product of all the integers
from 1 to <EM>n</EM>. Using this notation we can write
<P><CENTER><IMG SRC="math12.gif" ALT="math display"></CENTER><P>
This is an elegant formula, but you should resist the temptation to use it
as the basis for a computer program. If you write
<P><PRE>to perms :n :r
output (fact :n)/(fact (:n-:r))
end
to fact :n
output cascade :n [# * ?] 1
end
</PRE>
<P>then you're doing more multiplications than necessary, plus an
unnecessary division. Instead, go back to the earlier version in which <EM>r</EM>
terms are multiplied:
<P><PRE>to perms :n :r
if :r=0 [output 1]
output :n * perms :n-1 :r-1
end
</PRE>
<P>The set of all permutations of <EM>n</EM> things taken <EM>r</EM> at a time includes
several rearrangements of each <EM>combination</EM> of <EM>n</EM> things <EM>r</EM> at a
time. How many rearrangements of each? Each combination is a set of <EM>r</EM>
things, so the number of possible orderings of those <EM>r</EM> things is the
number of permutations of <EM>r</EM> things <EM>r</EM> at a time, <EM><SUB><SMALL>r</SMALL></SUB>P<SUB><SMALL>r</SMALL></SUB></EM> or <EM>r</EM>!.
<EM><SUB><SMALL>n</SMALL></SUB>P<SUB><SMALL>r</SMALL></SUB></EM> is greater than <EM><SUB><SMALL>n</SMALL></SUB>C<SUB><SMALL>r</SMALL></SUB></EM>, the number of combinations of <EM>n</EM>
things taken <EM>r</EM> at a time, by this factor. In other words, if each
combination corresponds to <EM>r</EM>! permutations, then the number of
permutations is <EM>r</EM>! times the number of combinations. So we have
<P><CENTER><IMG SRC="math13.gif" ALT="math display"></CENTER><P>
The notation <IMG SRC="nchooser.gif"> is much more commonly used in mathematics and
computer science texts than <EM><SUB><SMALL>n</SMALL></SUB>C<SUB><SMALL>r</SMALL></SUB></EM>. It's pronounced "<EM>n</EM> choose <EM>r</EM>."
<P>The traditional way to do the sock problem is this: The total number of
possible pairs of socks is <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/10choose2.gif">. The number of matching pairs is
equal to the number of brown pairs plus the number of blue pairs. The
number of brown pairs is the number of combinations of 6 brown socks chosen
2 at a time, or <IMG SRC="6choose2.gif">. Similarly, the number of pairs of blue socks
is <IMG SRC="4choose2.gif">. So
<P><CENTER><IMG SRC="math14.gif" ALT="math display"></CENTER><P>
which is the same answer we got by enumerating and testing all the possible
pairs.
<P>A formula like
<P><CENTER><IMG SRC="math15.gif" ALT="math display"></CENTER><P>
defines a mathematical function in terms of other, more elementary functions.
It is comparable to a Logo procedure defined in terms of primitives, like
<P><PRE>to second :thing
output first butfirst :thing
end
</PRE>
<P>The "primitives" of mathematics are addition, subtraction, and
so on, along with a few more advanced ones like the trigonometric and
exponential functions. These are called "elementary functions" and a
formula that defines some new function in terms of those is called a
<EM>closed form definition.</EM>
<P>
<P>The function <IMG SRC="nchooser.gif"> could also be defined in a different way based on
the ideas in the <CODE>combs</CODE> program we used to enumerate combinations. The
combinations fall into two categories, those that include the first element
and those that don't. So the number of combinations is the sum of the
numbers in each category:
<P><CENTER><IMG SRC="math16.gif" ALT="math display"></CENTER><P>
This is called an <EM>inductive definition.</EM> It is analogous to a
recursive procedure in Logo.
<P>
<P>These two formulas provide alternative definitions for the <EM>same</EM>
function, just as two Logo procedures can employ different algorithms but
have the same input-output behavior. How do we know that the two
definitions of <IMG SRC="nchooser.gif"> really do define the same function? Each
definition was derived from the fundamental definition of "the number of
combinations of <EM>n</EM> things taken <EM>r</EM> at a time" by different arguments.
If those arguments are correct, the two versions must define the same
function because there is just one correct number of combinations. It is
also possible to prove the two definitions equivalent by algebraic
manipulation; start with the closed form definition and see if it does, in
fact, obey the requirements of the inductive definition. For example, if
<EM>r</EM>=0 we have
<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/math17.gif" ALT="math display"></CENTER><P>
(It may not be obvious that 0! should be equal to 1, but mathematicians
define the factorial function that way so that the formula <EM>n</EM>! = <EM>n</EM>·(<EM>n</EM>−1)!
remains true when <EM>n</EM>=1.) See if you can verify the other two parts of the
inductive definition. Here's a hint:
<P><CENTER><IMG SRC="math18.gif" ALT="math display"></CENTER><P>
<P>Why would anyone be interested in an inductive definition when the closed
form definition is mathematically simpler and also generally faster to
compute? There are two reasons. First, some functions don't have closed
form definitions in terms of elementary functions. For those functions,
there is no choice but to use an inductive definition. Second, sometimes
when you start with a non-formal definition of a function in terms of its
purpose, like "the number of combinations..." for <IMG SRC="nchooser.gif">, it may be
easier to see how to translate that into an inductive definition as a first
step, even if it later turns out that there is also a less obvious closed
form. In fact, that's what I did in presenting the idea of combinations. I
found it more straightforward to understand the inductive definition because
it made sense to think about the actual combinations and not merely how many
of them there are. (In fact there is a mathematical technique called <EM>
generating functions</EM> that can sometimes be used to transform an
inductive definition into a closed form definition, but that technique
requires calculus and is beyond the scope of this book.)
<P><H2>Pascal's Triangle</H2>
<P>In addition to closed form and inductive definitions, it's often helpful to
present a sort of partial definition of a function in the form of a
table of values. (For
logic functions, with only a finite number of possible values
for the arguments of the function, such a table is actually a complete
definition.) Partial definitions in a table of values can be particularly
useful when the function displays some regularity that allows values outside
the table to be computed easily based on the values in the table. For
example, the sine function is <EM>periodic</EM>; its values repeat in cycles
of 360 degrees. If you need to know the sine of 380 degrees, you can look
up the sine of 20 degrees and that's the answer. For functions of two
variables, like addition and multiplication, these function tables are often
presented as square arrays of numbers. In elementary school you learned the
addition and multiplication tables for numbers up to 10, along with
algorithms for reducing the addition and multiplication of larger numbers to
a sequence of operations on single digits.
<P>The function <IMG SRC="nchooser.gif"> is a function of two variables, so it would
ordinarily be presented as a square table like the multiplication table,
except for the fact that this particular function is meaningfully defined
only when <EM>r</EM>≤<EM>n</EM>. (There are no combinations of three things taken five
at a time, for example, so <IMG SRC="3choose5.gif"> is 0.) So instead of a square
format like this:
<P><CENTER><TABLE>
<TR><TH><SUB><SMALL><EM>r</EM></SMALL></SUB>\<SUP><SMALL><EM>n</EM></SMALL></SUP>
<TH> 0<TH> 1<TH> 2<TH> 3<TH> 4<TH> 5
<TR><TH>0<TD align="center"> 1<TD align="center"> 1<TD align="center"> 1
<TD align="center"> 1<TD align="center"> 1<TD align="center"> 1
<TR><TH>1<TD align="center"> 0<TD align="center"> 1<TD align="center"> 2
<TD align="center"> 3<TD align="center"> 4<TD align="center"> 5
<TR><TH>2<TD align="center"> 0<TD align="center"> 0<TD align="center"> 1
<TD align="center"> 3<TD align="center"> 6<TD align="center"> 10
<TR><TH>3<TD align="center"> 0<TD align="center"> 0<TD align="center"> 0
<TD align="center"> 1<TD align="center"> 4<TD align="center"> 10
<TR><TH>4<TD align="center"> 0<TD align="center"> 0<TD align="center"> 0
<TD align="center"> 0<TD align="center"> 1<TD align="center"> 5
<TR><TH>5<TD align="center"> 0<TD align="center"> 0<TD align="center"> 0
<TD align="center"> 0<TD align="center"> 0<TD align="center"> 1
</TABLE></CENTER><P>
this function is traditionally presented in a triangular form called
"Pascal's Triangle" after Blaise Pascal (1623-1662), who invented the
mathematical theory of probability along with Pierre de Fermat (1601-1665).
Pascal didn't invent the triangle, but he did pioneer its use in
combinatorics. Each row of the triangle contains the nonzero values of
<IMG SRC="nchooser.gif"> for a particular <EM>n</EM>:
<P><CENTER><TABLE>
<TR><TD align="center">1
<TR><TD align="center">1 1
<TR><TD align="center">1 2 1
<TR><TD align="center">1 3 3 1
<TR><TD align="center">1 4 6 4 1
<TR><TD align="center">1 5 10 10 5 1
</TABLE></CENTER><P>
Pascal's Triangle is often introduced in algebra because the numbers in row
<EM>n</EM> (counting from zero) are the <EM>binomial coefficients,</EM> the constant
factors in the terms in the expansion of (<EM>a</EM>+<EM>b</EM>)<SUP><SMALL><EM>n</EM></SMALL></SUP>. For example,
<P><CENTER><IMG SRC="math21.gif" ALT="math display"></CENTER><P>
<P>Do you see why the binomial coefficients are related to combinations? An
expression like (<EM>a</EM>+<EM>b</EM>)<SUP><SMALL>4</SMALL></SUP> is a sum of products of four <EM>A</EM>s and <EM>B</EM>s. (How
many such products? Each term involves four choices between <EM>A</EM> and <EM>B</EM>;
there are 2 ways to make each choice, and the choices are independent, so
there are 2<SUP><SMALL>4</SMALL></SUP> possible products.) These products are combined into terms
based on the fact that some are equal to each other, such as <EM>aaab</EM> and
<EM>abaa</EM>, both of which contribute to the <EM>a</EM><SUP><SMALL>3</SMALL></SUP><EM>b</EM> term. How many arrangements
of three <EM>A</EM>s and one <EM>B</EM> are there? That's like asking how many ways there
are to choose one slot for a <EM>B</EM> out of four possible slots, which is
<IMG SRC="4choose1.gif">.
<P>Can you predict what the coefficients will be in the expansion of
(<EM>a</EM>+<EM>b</EM>+<EM>c</EM>)<SUP><SMALL>4</SMALL></SUP>? For example, what is the coefficient of <EM>a</EM><EM>b</EM><SUP><SMALL>2</SMALL></SUP><EM>c</EM>? Try
to multiply it out and see if your formula is right.
<P>Everyone is taught in school that each number in Pascal's Triangle, except
for the 1s at the ends, is the sum of the two numbers above it. But this is
usually presented as a piece of magic with no explanation. It's not obvious
how that fact is connected to the formula expressing <IMG SRC="nchooser.gif"> in terms
of factorials. But the technique I used in writing the <CODE>combs</CODE> procedure
to enumerate the actual combinations explains how Pascal's Triangle works.
The set of all combinations of <EM>n</EM> things taken <EM>r</EM> at a time can be divided
into those combinations that include the first of the <EM>n</EM> things and those
that don't. How many of the former are there? Each such combination must
be completed by adjoining to that first thing <EM>r</EM>−1 out of the remaining
<EM>n</EM>−1 available things, so there are <IMG SRC="n-1chooser-1.gif"> such combinations.
The second category, those not containing the first thing in the list,
requires us to choose <EM>r</EM> things out of the remaining <EM>n</EM>−1, so there are
<IMG SRC="n-1chooser.gif"> of them. So <IMG SRC="nchooser.gif"> must be the sum of those two
numbers, which are indeed the ones above it in the triangle.
<P>Thinking about the triangle may also help you to understand why <CODE>combs</CODE>
needs two stop rules; each row contains <EM>two</EM> numbers, the ones (pun)
at each end, that can't be computed as the sum of two other numbers.
<P><H2>Simulation</H2>
<P>Yet another approach to solving the sock problem would be the
experimental method: Load a drawer with six brown and four blue socks, pull out pairs
of socks a few thousand times, and see how many of the pairs match. The
actual experiment would be time-consuming and rather boring, but we can
<EM>simulate</EM> the experiment with a computer program. The idea is to use
random numbers to represent the random choice of a sock.
<P><PRE>to socktest
localmake "first ~
pick [brown brown brown brown brown brown blue blue blue blue]
localmake "second ~
pick (if equalp :first "brown
[[brown brown brown brown brown blue blue blue blue]]
[[brown brown brown brown brown brown blue blue blue]] )
output equalp :first :second
end
</PRE>
<P><CODE>Socktest</CODE> is a predicate that simulates one trial of picking
a pair of socks and outputs <CODE>true</CODE> if the socks match. Notice how the
available choices for the second sock depend on which color sock was chosen
first. (It's a little unaesthetic that this particular selection of six
brown and four blue socks is built into the program, with three slightly
different lists explicitly present inside <CODE>socktest</CODE>. It would be both
more elegant and more flexible if <CODE>socktest</CODE> could take a list like
<CODE>[6 brown 4 blue]</CODE> as input, like <CODE>socks</CODE>, and compute the list of
possibilities for the second sock itself. But right now I'm more interested
in showing how a simulation works than in programming style; you can make
that change yourself if you like.)
<P>What we want to do is invoke <CODE>socktest</CODE> repeatedly and keep track of how
many times the output is <CODE>true</CODE>. That can be done with an instruction
like
<P><PRE>print (cascade 1000 [? + if socktest [1] [0]] 0) / 10
</PRE>
<P>I divide by 10 so that the result will be expressed as a percent
probability. (If I made 100 trials instead of 1000 the output from <CODE>
cascade</CODE> would already be a percentage.) Your results will depend on the
random number generator of your computer. I tried it three times and got
results of 50.1%, 50.8%, and 45.5%. I then did 10,000 trials at once
with a result of 48.78%. The result expected on theoretical grounds was
46 <SUP><SMALL>2</SMALL></SUP>⁄<SUB><SMALL>3</SMALL></SUB>%.
<P>Simulation is generally much slower than either of the techniques we used
earlier (enumeration of possibilities and direct computation of the number
of possibilities), and it gives results that are only approximately correct.
So why would anyone want to use this method? For a simple problem like
this, you probably wouldn't. But some combinatorics problems are too
complicated to be captured by a simple formula. For example, what is the
probability of winning a game of solitaire? (To make this a sensible
question, you'd have to decide on a particular set of strategy rules to
determine which card to play next when there are several possibilities. The
rule could be "play the higher ranking card" or "choose a card at
random," for example.) In principle this question could be answered
exactly, since there are only a finite number of ways a deck of cards can be
arranged and we could analyze each of them. But in practice the most
reasonable approach is probably to write a solitaire simulator and have it
play out a few thousand randomly ordered hands.
<P>Solitaire is a rather complicated game; even a simulator for it would be
quite a large project. A more manageable one, if you'd like something to
program, would be a craps simulator. Remember that the 11 possible results
of rolling two dice (2 to 12) are not equally likely! You have to simulate
each die separately.
<P><H2>The Simplex Lock Problem</H2>
<P>This is a picture of a Simplex lock, so called because it's
manufactured by Simplex Security Systems, Inc. It is a five-button mechanical
(i.e., no electricity) combination lock with an unusual set of possible
combinations. As an example of a challenging problem in combinatorics, I'd
like you to figure out how many possible combinations there are.
<P><CENTER><IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/lock.gif" ALT="figure: lock"></CENTER>
<P>What makes this lock unusual is that a combination can include more than one
button pushed at the same time. For example, one possible combination is
"2, then 1 and 4 at the same time, then 3." Here are the precise rules:
<P><P>
<TABLE><TR><TH align="right" valign="top">1.<TD> <TD valign="top"> Each button may be used at most once. For example, "2, then 2 and 3 at
the same time" is not allowed.
<TR><TH align="right" valign="top">2.<TD> <TD valign="top"> Each push may include any number of buttons, from one to five. For
example, one legal combination is "hit all five buttons at once with your
fist." (But hitting all five buttons can't be part of a larger combination
because of rule 1.)
<P></TABLE><P>
<P>It follows from these rules that there can be at most five
distinct pushes. (Do you see why?) The rules also allow for the null
combination, in which you don't have to push any buttons at all.
<P>When working on this problem, don't forget that when two or more buttons are
pushed at the same time, their order doesn't matter. That is, you shouldn't
count "2 and 3 together, then 5" and "3 and 2 together, then 5" as two
distinct combinations. (For this reason, the Simplex lock <EM>is</EM>
entitled, at least in part, to the name "combination lock"!)
<P>Try to figure out how many combinations there are before reading further.
You can enumerate all the possibilities or you can derive a formula for the
number of possibilities. You might want to start with a smaller number of
buttons. (As a slight hint, when you buy one of these locks, the box it
comes in says "thousands of combinations.")
<P>I first attacked this problem by trying to enumerate all the possible
combinations, but that turns out to be quite messy. The trouble is that it
isn't obvious how to <EM>order</EM> the combinations, so it's hard to be sure
you haven't missed any. Here is how I finally decided to do it. First of
all, divide the possible combinations into six categories depending on how
many buttons (zero to five) they use. There is exactly one combination
using zero buttons, and there are five using one button each. After that it
gets tricky because there are different <EM>patterns</EM> of simultaneous
pushes within each category. For example, for combinations using two
buttons there are two patterns: the one in which they're pressed together
(<CODE>x-x</CODE>) and the one in which they're pressed separately (<CODE>x x</CODE>).
(I'm introducing a notation for patterns in which hyphens connect buttons
that are pressed together and spaces connect the separate pushes.) How many
distinct combinations are there in each of those patterns? Figure it out
before reading on.
<P>In the <CODE>x x</CODE> pattern there are <SUB><SMALL>5</SMALL></SUB><EM>P</EM><SUB><SMALL>2</SMALL></SUB> "combinations" because the
order in which you push the buttons matters. In the <CODE>x-x</CODE> pattern there
are only <IMG SRC="5choose2.gif"> combinations because the two buttons are pushed
together; <CODE>1-4</CODE> and <CODE>4-1</CODE> are the same combination. Altogether
there are 20+10 or 30 combinations usng two of the five buttons.
<P>Beyond this point it gets harder to keep track of the different patterns.
Among the three-button patterns are <CODE>x-x x</CODE>, <CODE>x x x</CODE>, and <CODE>
x-x-x</CODE>. How many more are there? How many four-button patterns? You might,
at this point, like to see if you can finish enumerating all the
possibilities for the five-button lock.
<P>My solution is to notice that in a three-button pattern, for example, there
are two slots between the <CODE>x</CODE>s, and each slot has a space or a hyphen.
If I think of those slots as binary digits, with 0 for space and 1 for
hyphen, then each pattern corresponds to a 2-bit number. There are four
such numbers, 00 to 11 (or 0 to 3 in ordinary decimal notation).
<P><TABLE><TR><TH>number<TH> <TH>pattern
<TR><TD align="center">
<PRE>
00
01
10
11
</PRE><TH> <TD align="center">
<PRE>
x x x
x x-x
x-x x
x-x-x
</PRE></TABLE>
<P>Similarly, there are eight four-button patterns:
<P><TABLE><TR><TH>number<TH> <TH>pattern
<TR><TD align="center">
<PRE>
000
001
010
011
100
101
110
111
</PRE><TH> <TD align="center">
<PRE>
x x x x
x x x-x
x x-x x
x x-x-x
x-x x x
x-x x-x
x-x-x x
x-x-x-x
</PRE></TABLE>
<P>And there are 16 five-button patterns, from 0000 to 1111.
<P>How many combinations are there within each pattern? There are two
different ways to go about calculating that number. To be specific, let's
consider four-button patterns. The way I chose to do the calculation was to
start with the idea that there are <SUB><SMALL>5</SMALL></SUB><EM>P</EM><SUB><SMALL>4</SMALL></SUB> ways to choose four buttons in
order. For the <CODE>x x x x</CODE> pattern, this is the answer. For the other
patterns, this number (120) has to be divided by various factors to account
for the fact that the order is <EM>partially</EM> immaterial, just as in
deriving the formula for combinations from the formula for permutations we
divided by <EM>r</EM>! because the order is completely immaterial. Consider the
pattern <CODE>x-x x x</CODE>. In this pattern the order of the first two numbers
is immaterial, but the choice of the first two numbers as a pair matters,
and so does the order of the last two numbers. So <CODE>1-2 3 4</CODE> is the same
combination as <CODE>2-1 3 4</CODE> but different from <CODE>1-3 2 4</CODE> or <CODE>
1-2 4 3</CODE>. That means the number 120 is too big by a factor of 2, because
every significant choice of combination is represented twice. For this
pattern the number of different combinations is 60. Of course the same
argument applies to the patterns <CODE>x x-x x</CODE> and <CODE>x x x-x</CODE>.
<P>What about <CODE>x-x x-x</CODE>? In this pattern there are two pairs of positions
within which order doesn't matter. Each combination appears <EM>four</EM>
times in the list of 120; <CODE>1-2 3-4</CODE> is the same as <CODE>1-2 4-3</CODE>,
<CODE>2-1 3-4</CODE>, and <CODE>2-1 4-3</CODE>. So there are 30 significantly different
combinations in this pattern. What about <CODE>x-x-x x</CODE>? In this pattern,
the order of the first three numbers is irrelevant; this means that there
are 3! or 6 appearances of each combination in the 120, so there are 20
significantly different combinations in this pattern. The general rule is
that for each group of <EM>m</EM> consecutive hyphens in the pattern you must
divide by (<EM>m</EM>+1)! to eliminate duplicates.
<P>(My approach was to start with permutations and then divide out redundant
ones. Another approach would be to build up the pattern using combinations.
The pattern <CODE>x-x x x</CODE> contains three groups of numbers representing
three "pushes": a group of two and two groups of one. Since this is a
five-button lock, for the first group of two there are <IMG SRC="5choose2.gif"> choices.
(Order doesn't matter within a group.) For the second group there are only
three buttons remaining from which we can choose, so there are <IMG SRC="3choose1.gif">
choices for that button. Finally, there are <IMG SRC="2choose1.gif"> choices for the
fourth button (the third group). This makes 10×3×2 possible
combinations for this pattern, the same as the 60 we computed the other way.
For the <CODE>x-x x-x</CODE> pattern this method gives <IMG SRC="5c2x3c2.gif">
or 30 combinations.)
<P>Having worked all this out, I was ready to write a computer program to count
the total number of combinations. The trickiest part was deciding how to
deal with the binary numbers that represent the patterns. In the end I used
plain old numbers. The expression
<P><PRE>remainder :number 2
</PRE>
<P>yields the rightmost bit of a number, and then <CODE>:number/2</CODE>
gives all but the rightmost bit (with a little extra effort for odd numbers).
To help you read the program, here is a description of the most important
procedures:
<P>
<TABLE><TR><TH align="right" valign="top"><CODE>lock 5</CODE><TD> <TD valign="top">outputs the total number of combinations
for the 5-button lock.
<TR><TH align="right" valign="top"><CODE>lock1 5 4</CODE><TD> <TD valign="top">outputs the number of combinations that use
4 out of the 5 buttons.
<TR><TH align="right" valign="top"><CODE>lock2 120 5 1</CODE><TD> <TD valign="top">outputs the number of combinations for the
4-button pattern corresponding to the binary
form of the number 5 (101 or x-x x-x). The
120 is <SUB><SMALL>5</SMALL></SUB><EM>P</EM><SUB><SMALL>4</SMALL></SUB> and the 1 is always used as
the third input except in recursive calls.
<P></TABLE>
<P>Here is the program:
<P><PRE>to lock :buttons
output cascade :buttons [? + lock1 :buttons #] 1
end
to lock1 :total :buttons
localmake "perms perms :total :buttons
output cascade (twoto (:buttons-1)) ~
[? + lock2 :perms #-1 1] ~
0
end
to twoto :power
output cascade :power [2 * ?] 1
end
to lock2 :perms :links :factor
if equalp :links 0 [output :perms/(fact :factor)]
if equalp (remainder :links 2) 0 ~
[output lock2 :perms/(fact :factor) :links/2 1]
output lock2 :perms (:links-1)/2 :factor+1
end
</PRE>
<P>One slight subtlety is that in <CODE>lock</CODE> the third input to
<CODE>cascade</CODE> is 1 rather than 0 to include the one 0-button combination
that would not otherwise be added in.
<P><H2>An Inductive Solution</H2>
<P>When I wrote that program, I was pleased with myself for managing to turn
such a messy solution into executable form, but I wasn't satisfied with
the underlying approach. I wanted something mathematically more elegant.
<P>What made it possible for me to find the approach I wanted was the chance
discovery that the number of combinations that use all five buttons (541)
is half of the total number of combinations (1082). Could this possibly be
a coincidence, or would that have to be true for any number of buttons?
To see that it has to be true, I used an idea from another branch of
mathematics, <EM>set theory.</EM> A <EM>set</EM> is any collection of things,
in no particular order. One can speak of the set of all the fingers on my
left hand, or the set of all the integers, or the set of all the universities
in cities named Cambridge. Much of the interesting part of set theory has
to do with the properties of infinite sets; for example, it turns out that
the set of all the integers is the same size as the set of all the rational
numbers, but both of these are smaller than the set of irrational numbers.
What does it mean for one infinite set to be the same size as, or to be
larger than, another? The same definition works equally well for finite
sets: Two sets are the same size if they can be placed in
<EM>one-to-one correspondence.</EM> This means that you must exhibit a way to
pair the elements of one set with the elements of the other so that each
element of one has exactly one partner in the other. (A set is larger than
another if they aren't the same size, but a subset of the first is the same
size as the second.)
<P>To prove that my observation about the lock combinations has to be true
regardless of the number of buttons, I have to exhibit a one-to-one
correspondence between two sets: the set of all combinations using all
the buttons of an <EM>n</EM>-button lock and the set of all combinations using
fewer than <EM>n</EM> of the buttons. But that's easy. Starting with a
combination that uses all the buttons, just eliminate the last push (one or
more buttons pushed at the same time) to get a combination using fewer than
all the buttons. For example, for a five-button lock, the five-button
combination <CODE>2 3-4 1-5</CODE> is paired with the three-button combination
<CODE>2 3-4</CODE>. (We have to eliminate the last <EM>push</EM> and not merely
the last <EM>button</EM> for two reasons. First, if we always eliminated
exactly one button, we'd always get a four-button combination, and we want
to pair five-button combinations with all the fewer-than-five ones.
Second, which is the "last" button if the last push involves more than
one? Remember, <CODE>2 3-4 1-5</CODE> is the <EM>same</EM> combination as
<CODE>2 3-4 5-1</CODE>. But writing this combination in two different forms
seems to pair it with two different smaller ones. The rules of one-to-one
correspondence say that each element of a set must have exactly one partner
in the other set.)
<P>To show that the correspondence works in both directions, start with a
combination that doesn't use all the buttons; its partner is formed by
adding one push at the end that contains all the missing buttons.
For example, if we start with <CODE>1 2-5</CODE> then its partner is
<CODE>1 2-5 3-4</CODE>.
<P>I've just proved that the number of all-<EM>n</EM> combinations must be equal to
the number of fewer-than-<EM>n</EM> combinations. So it's not a coincidence that
541 is half of 1082. In order to be able to talk about these numbers more
succinctly, I want to define
<P><CENTER><EM>f</EM>(<EM>n</EM>) = number of <EM>n</EM>-button
combinations of an <EM>n</EM>-button lock</CENTER><P>
We've just proved that it's also true that
<P><CENTER><EM>f</EM>(<EM>n</EM>) = number of fewer-than-<EM>n</EM>-button
combinations</CENTER><P>
Now, what does "fewer than <EM>n</EM> buttons" mean? Well, there are
combinations using no buttons, one button, two buttons, and so on up to <EM>n</EM>−1
buttons. Let's define
<P><CENTER><EM>g</EM>(<EM>n</EM>,<EM>i</EM>) = number of <EM>i</EM>-button
combinations in an <EM>n</EM>-button lock</CENTER><P>
So we can formalize the phrase "fewer than <EM>n</EM>" by saying
<P><CENTER><EM>f</EM>(<EM>n</EM>) =
<EM>g</EM>(<EM>n</EM>,0)+<EM>g</EM>(<EM>n</EM>,1)+<EM>g</EM>(<EM>n</EM>,2)+ ··· +<EM>g</EM>(<EM>n</EM>,<EM>n</EM>−1)</CENTER><P>
Instead of using those dots in the middle, mathematicians have another
notation for a sum of several terms like this.
<P><CENTER><IMG SRC="math26.gif" ALT="math display"></CENTER><P>
If you haven't seen this notation before, the Σ (<EM>sigma</EM>)
symbol is the Greek letter <EM>S,</EM> and is used to represent a <EM>S</EM>um.
It works a little like the <CODE>for</CODE> iteration tool; the variable below the
Σ (in this case, <EM>i</EM>) takes on values from the lower limit (0) to the
upper limit (<EM>n</EM>−1), and for each of those values the expression following the
Σ is added into the sum. The Logo equivalent would be
<P><PRE>for [i 0 [:n-1]] [make "sum :sum + (g :n :i)]
</PRE>
<P>The Σ-expression is pronounced "the sum from <EM>i</EM> equals
zero to <EM>n</EM> minus one of <EM>g</EM> of <EM>n</EM> comma <EM>i</EM>."
<P>So far, what I've done is like what I did before: dividing the set of all
possible combinations into subsets based on the number of buttons used in
each combination. This is like the definition of <CODE>lock</CODE> in terms of
<CODE>lock1</CODE>. The next step is to see if we can find a formula for <EM>g</EM>(<EM>n</EM>,<EM>i</EM>).
How many 3-button combinations, for example, can we make using a 5-button
lock? (That's <EM>g</EM>(5,3).) There are many different ways in which I might
try to derive a formula, but I think it will be helpful at this point to
step back and consider my overall goal. I started this line of reasoning
because I'm trying to express the solution for the five-button lock in terms
of easier solutions for smaller numbers of buttons. That is, I'm looking
for an inductive definition of <EM>f</EM>(<EM>n</EM>) in terms of values of <EM>f</EM> for smaller
arguments. I'd like to end up with a formula like
<P><CENTER><EM>f</EM>(<EM>n</EM>) = ... <EM>f</EM>(0) ... <EM>f</EM>(1)
... <EM>f</EM>(<EM>n</EM>−1) ...</CENTER><P>
but I don't yet know exactly what form it will take. So far I've written
a formula for <EM>f</EM>(<EM>n</EM>) in terms of <EM>g</EM>(<EM>n</EM>,<EM>i</EM>) for values of <EM>i</EM> less than <EM>n</EM>.
It would be great, therefore, if I could express <EM>g</EM>(<EM>n</EM>,<EM>i</EM>) in terms of
<EM>f</EM>(<EM>i</EM>); that would give me exactly what I want.
<P>To put that last sentence into words, it would be great if I could express
the number of <EM>i</EM>-button combinations of an <EM>n</EM>-button lock in terms of the
number of <EM>i</EM>-button combinations of an <EM>i</EM>-button lock. For example, can I
express the number of combinations using 3 out of 5 buttons in terms of the
number of combinations of 3 out of 3 buttons? Yes, I can. The latter is
the number of rearrangements of three buttons once we've selected the three
buttons. If we start with five buttons, there are <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/5choose3.gif"> possible
sets of three buttons to choose. For each of those <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/5choose3.gif"> sets of
three buttons, there are <EM>f</EM>(3) ways to arrange those three buttons in a
combination.
<P>
<P><P><CENTER><IMG SRC="math28.gif" ALT="math display"></CENTER><P>
It may not be obvious why this is so. Suppose you list all the 3-button
combinations of a 3-button lock. There are 13 of them, consisting of the
numbers from 1 to 3 in various orders and with various groups connected
by hyphens. Those 13 combinations are also some of the 3-button combinations
of a 5-button lock, namely, the ones in which the particular three buttons
we chose are 1, 2, and 3. If instead we choose a different set of three
(out of five) buttons, that gives rise to a different set of 13 combinations.
For example, if we choose the buttons 2, 3, and 4, we can take the original
13 combinations and change all the 1s to 2s, all the 2s to 3s, and all the
3s to 4s:
<P><PRE>buttons: 1,2,3 2,3,4 1,4,5
1 2 3 2 3 4 1 4 5
1 3 2 2 4 3 1 5 4
2 1 3 3 2 4 4 1 5
2 3 1 3 4 2 4 5 1
3 1 2 4 2 3 5 1 4
3 2 1 4 3 2 5 4 1
1 2-3 2 3-4 1 4-5
2 1-3 3 2-4 4 1-5
3 1-2 4 2-3 5 1-4
1-2 3 2-3 4 1-4 5
1-3 2 2-4 3 1-5 4
2-3 1 3-4 2 4-5 1
1-2-3 2-3-4 1-4-5
</PRE>
<P>This table has a column for each of three possible combinations of
five numbers three at a time. The table could be extended to have a column
for <EM>every</EM> such combination of numbers, and then it would contain all
the lock combinations using three out of five buttons. The total number of
entries in the extended table is therefore <EM>g</EM>(5,3); the table has <EM>f</EM>(3)
rows and <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/5choose3.gif"> columns. So
<P><CENTER><IMG SRC="math29.gif" ALT="math display"></CENTER><P>
which is a particular case of the general formula above.
<P>We now have a formula for <EM>f</EM>(<EM>n</EM>) in terms of all the <EM>g</EM>(<EM>n</EM>,<EM>i</EM>) and a formula
for <EM>g</EM>(<EM>n</EM>,<EM>i</EM>) in terms of <EM>f</EM>(<EM>i</EM>). Combining these we have
<P><CENTER><IMG SRC="math30.gif" ALT="math display"></CENTER><P>
or
<P><CENTER><IMG SRC="math31.gif" ALT="math display"></CENTER><P>
Like any inductive definition, this one needs a special rule for the
smallest case, from which all the others are computed:
<P><CENTER><EM>f</EM>(0) = 1</CENTER><P>
<P>The total number of combinations for an <EM>n</EM>-button lock is 2×<EM>f</EM>(<EM>n</EM>).
I find this much more elegant than my original solution. (So why didn't I
just show you this one to begin with? Because I never would have figured
this one out had I not first done the enumeration of cases. I want you to
see how a combinatorics problem is solved, not just what the beautiful
solution looks like.) This formula can also be turned into a computer
program:
<P><PRE>to simplex :buttons
output 2 * f :buttons
end
to f :n
if equalp :n 0 [output 1]
output cascade :n [? + ((choose :n (#-1)) * f (#-1))] 0
end
to choose :n :r
output (perms :n :r)/(fact :r)
end
</PRE>
<P>This program is faster as well as simpler than the other; on my
home computer, <CODE>lock 5</CODE> takes about 4 seconds, <CODE>simplex 5</CODE> about 2 seconds.
<P>The <CODE>simplex</CODE> function has no exact closed form equivalent, but it turns
out that there is (amazingly!) a closed form definition that, when rounded
to the nearest integer, gives the desired value:
<P><PRE>to simp :n
output round (fact :n)/(power (ln 2) (:n+1))
end
</PRE>
<P>The <CODE>ln</CODE> function, a Logo primitive, computes the "natural
logarithm" of its input; <CODE>ln 2</CODE> has the approximate value 0.69314718056.
The <CODE>power</CODE> function of two inputs takes
the first input to the power of the second input. <CODE>Fact</CODE> is the
factorial function as defined earlier in this chapter.
<P>Another related programming problem is to list the actual combinations,
rather than merely count them. Probably the simplest way to do that is
to use an approach similar to the one I used in the <CODE>combs</CODE> procedure
that lists combinations of members of a list: First use recursion to
find the lock combinations using only the <CODE>butfirst</CODE> of the available
buttons, then find the ways in which the <CODE>first</CODE> button can be added
to each of them.
<P><H2>Multinomial Coefficients</H2>
<P>Earlier, in talking about Pascal's Triangle, I showed how binomial
coefficients are related to combinations and asked you to think about
<EM>trinomial</EM> coefficients. What, for example, is the coefficient of
<EM>a</EM><EM>b</EM><SUP><SMALL>2</SMALL></SUP><EM>c</EM> in the expansion of (<EM>a</EM>+<EM>b</EM>+<EM>c</EM>)<SUP><SMALL>4</SMALL></SUP>?
<P>The expansion is a sum of products; each of those products contains four
variables (<EM>aaaa</EM>, <EM>aaab</EM>, etc.). The ones that contribute to the <EM>a</EM><EM>b</EM><SUP><SMALL>2</SMALL></SUP><EM>c</EM>
term are the ones with one <EM>A</EM>, two <EM>B</EM>s, and one <EM>c</EM>; these include
<EM>abbc</EM>, <EM>bcab</EM>, <EM>cabb</EM>, and so on. Out of the four slots for variables in
one of those products, how many ways can we choose a slot for one <EM>A</EM>? The
answer is <IMG SRC="4choose1.gif">. Having chosen one, we are left with three slots
and we want to choose two of them for <EM>B</EM>s. There are <IMG SRC="3choose2.gif"> ways to
do that. Then we have one slot left, just enough for the one <EM>c</EM>, which
makes a trivial contribution of <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/1choose1.gif"> to the overall number of
possibilities. The total is <IMG SRC="4c1x3c2x1c1.gif"> or 4×3×1 or 12, and that is the coefficient of
<EM>a</EM><EM>b</EM><SUP><SMALL>2</SMALL></SUP><EM>c</EM>. Similarly, the coefficient of <EM>ac</EM><SUP><SMALL>3</SMALL></SUP> is <IMG SRC="4c1x3c3.gif"> or 4.
<P>The same sort of argument can be used for even more complicated cases. In
the expansion of (<EM>a</EM>+<EM>b</EM>+<EM>c</EM>+<EM>d</EM>+<EM>e</EM>)<SUP><SMALL>14</SMALL></SUP> what is the coefficient of <EM>a</EM><SUP><SMALL>2</SMALL></SUP><EM>b</EM><SUP><SMALL>3</SMALL></SUP><EM>c</EM><EM>d</EM><SUP><SMALL>5</SMALL></SUP><EM>e</EM><SUP><SMALL>3</SMALL></SUP>?
It's <P><CENTER><IMG SRC="math33.gif" ALT="math display"></CENTER><P>
<P>Here is a harder question: How many terms are there in, say, (<EM>a</EM>+<EM>b</EM>+<EM>c</EM>+<EM>d</EM>)<SUP><SMALL>7</SMALL></SUP>?
It's easy to see that there are 4<SUP><SMALL>7</SMALL></SUP> products of four variables, but after
the ones that are equal to each other have been combined into terms, how
many distinct terms are there?
<P>Like the Simplex lock problem, this one can probably be solved most easily
by reducing the problem to a smaller subproblem--in other words, by an
inductive definition. This problem also has something in common with the
earlier problem of listing all the combinations of a given size from a
given list, as we did in the <CODE>combs</CODE> procedure. Try to solve the
problem before reading further. (It's hard to say how another person will
find a problem, but I think this one is easier than the Simplex one.)
<P>In order to be able to express the original problem in terms of a smaller
version, we have to generalize it. I posed a specific problem, about the
seventh power of the sum of four variables. I'd like to be able to give
the answer to that problem the name <EM>t</EM>(4,7) and try to find a way to
express that in terms of, let's say, <EM>t</EM>(4,6). So I'm going to define
the function <EM>t</EM> as
<P><CENTER><EM>t</EM>(<EM>n</EM>,<EM>k</EM>) = number of terms
in (<EM>a</EM><SUB><SMALL>1</SMALL></SUB>+<EM>a</EM><SUB><SMALL>2</SMALL></SUB>+
···
+<EM>a</EM><SUB><SMALL><EM>n</EM></SMALL></SUB>)<SUP><SMALL><EM>k</EM></SMALL></SUP></CENTER><P>
(If this were a "straight" math book I'd cheerfully recycle the name <EM>f</EM>
for the function, even though we had a different <EM>f</EM> in the last section,
but I'm anticipating wanting to write a Logo program for this problem and I
can't have two procedures named <CODE>f</CODE> in the same workspace.)
<P>In writing <CODE>combs</CODE> I used the trick of dividing all possible
combinations into two groups: those including the first member of the list
and those not including that member. A similar trick will be useful here;
we can divide all the terms in an expansion into two groups. One group will
contain those terms that include the first variable (<EM>a</EM><SUB><SMALL>1</SMALL></SUB>) and the other
will contain the rest. For example, in the original problem, <EM>a</EM><SUP><SMALL>2</SMALL></SUP><EM>b</EM><SUP><SMALL>3</SMALL></SUP><EM>d</EM><SUP><SMALL>2</SMALL></SUP> is
a term in the first group, while <EM>c</EM><SUP><SMALL>7</SMALL></SUP> is a term in the second group.
<P>A term in the first group can be divided by <EM>a</EM><SUB><SMALL>1</SMALL></SUB>; the result must be a term
in the expansion of (<EM>a</EM><SUB><SMALL>1</SMALL></SUB>+<EM>a</EM><SUB><SMALL>2</SMALL></SUB>+ ··· +<EM>a</EM><SUB><SMALL><EM>n</EM></SMALL></SUB>)<SUP><SMALL><EM>k</EM>−1</SMALL></SUP>. How many such terms are
there? There are <EM>t</EM>(<EM>n</EM>,<EM>k</EM>−1) of them. So that's how many terms there are in
the first group.
<P>A term in the second group is a product of <EM>k</EM> variables <EM>not</EM>
including <EM>a</EM><SUB><SMALL>1</SMALL></SUB>. That means that such a term is also part of the expansion
of (<EM>a</EM><SUB><SMALL>2</SMALL></SUB>+ ··· +<EM>a</EM><SUB><SMALL><EM>n</EM></SMALL></SUB>)<SUP><SMALL><EM>k</EM></SMALL></SUP>. How many such terms are there? There are
<EM>t</EM>(<EM>n</EM>−1,<EM>k</EM>) of them. Notice the difference between the two groups. In the
first case, we associate a term with a similar term in the expansion of an
expression involving a <EM>smaller power</EM> of the <EM>same number</EM> of
variables. In the second case, we associate a term with an equal term in
the expansion of an expression involving <EM>fewer variables</EM> taken to
the <EM>same power.</EM>
<P>Combining these two results, we see that
<P><CENTER><EM>t</EM>(<EM>n</EM>,<EM>k</EM>) =
<EM>t</EM>(<EM>n</EM>,<EM>k</EM>−1)+<EM>t</EM>(<EM>n</EM>−1,<EM>k</EM>)</CENTER><P>
Since this is a function of two variables, it needs two "stop rules," just
like the function <IMG SRC="nchooser.gif">. Picking these limiting cases seems much
simpler than inventing the induction rule, but even so, it may repay some
attention. For the rule
<P><CENTER><IMG SRC="math36.gif" ALT="math display"></CENTER><P>
we ended up considering the limiting cases <EM>r</EM>=0 and <EM>r</EM>=<EM>n</EM>. I didn't say
anything about it at the time because I didn't want to get distracted, but
it's not obvious why there is the asymmetry between the two variables in
those limiting cases. That is, why didn't I pick <EM>r</EM>=0 and <EM>n</EM>=0 as the
limiting cases? That would be more like what you're accustomed to in
recursive Logo procedures, where stop rules almost always involve a
comparison of something with zero or with the empty list.
<P>The funny limiting cases for <IMG SRC="nchooser.gif"> (and the corresponding funny stop
rules in <CODE>combs</CODE>) are related to the fact that this function is
meaningful only when <EM>n</EM>≥<EM>r</EM>. The two arguments can't be chosen
independently. If we didn't have the <EM>r</EM>=<EM>n</EM> limiting case, the inductive
formula would have us compute
<P><CENTER><IMG SRC="math37.gif" ALT="math display"></CENTER><P>
If we define <IMG SRC="https://people.eecs.berkeley.edu/~bh/v3ch2/4choose5.gif"> as zero, this equation does turn out to be true,
but it isn't a very sensible way to compute <IMG SRC="5choose5.gif">.
<P>In the case of the function <EM>t</EM>, the two arguments <EM>are</EM> independent.
Both <EM>t</EM>(4,7) and <EM>t</EM>(7,4) are sensible things to ask for. Therefore, we
should use the more obvious limiting cases <EM>n</EM>=0 and <EM>k</EM>=0. The trouble is
that it's not obvious what the value of <EM>t</EM>(0,<EM>k</EM>) or <EM>t</EM>(<EM>n</EM>,0) should be. The
first of these, <EM>t</EM>(0,<EM>k</EM>), represents the number of terms in the expansion of
()<SUP><SMALL><EM>k</EM></SMALL></SUP>--nothing to the <EM>k</EM>th power! That seems meaningless. On the other
hand, <EM>t</EM>(<EM>n</EM>,0) represents the number of terms in (∑ <EM>a</EM><SUB><SMALL><EM>i</EM></SMALL></SUB>)<SUP><SMALL>0</SMALL></SUP>, which is 1.
Anything to the zeroth power is 1. Does "1" count as a term? It doesn't
have any variables in it.
<P>
<P>One solution would be to take as limiting cases <EM>n</EM>=1 and <EM>k</EM>=1. It's much
easier to see what those values should be. (<EM>a</EM><SUB><SMALL>1</SMALL></SUB>)<SUP><SMALL><EM>k</EM></SMALL></SUP> has one term, so
<EM>t</EM>(1,<EM>k</EM>)=1. And (<EM>a</EM><SUB><SMALL>1</SMALL></SUB>+ ··· +<EM>a</EM><SUB><SMALL><EM>n</EM></SMALL></SUB>)<SUP><SMALL>1</SMALL></SUP> has <EM>n</EM> terms, so <EM>t</EM>(<EM>n</EM>,1)=<EM>n</EM>. We
could, then, define the function <EM>t</EM> as
<P><CENTER><IMG SRC="math38.gif" ALT="math display"></CENTER><P>
But it is possible to figure out appropriate values for zero arguments by
working backwards from the cases we already understand. For example, we
know that <EM>t</EM>(2,1) must equal 2. But
<P><CENTER><EM>t</EM>(2,1) = <EM>t</EM>(2,0)+<EM>t</EM>(1,1) = <EM>t</EM>(2,0)+1</CENTER><P>
It follows that <EM>t</EM>(2,0) must be 1. So it's reasonable to guess that
<EM>t</EM>(<EM>n</EM>,0)=1 will work in general. Similarly, we know that <EM>t</EM>(1,2)=1, but
<P><CENTER><EM>t</EM>(1,2) = <EM>t</EM>(1,1)+<EM>t</EM>(0,2) = 1+<EM>t</EM>(2,0)</CENTER><P>
Therefore <EM>t</EM>(0,2) must be 0. We can define <EM>t</EM> as
<P><CENTER><IMG SRC="math41.gif" ALT="math display"></CENTER><P>
<P>What is <EM>t</EM>(0,0)? The definition above contradicts itself about this. The
answer should be 0 because <EM>n</EM>=0 but also 1 because <EM>k</EM>=0. This reminds me
of the similar problem about powers of integers. What is 0<SUP><SMALL>0</SMALL></SUP>? In general
0<SUP><SMALL><EM>x</EM></SMALL></SUP>=0 but <EM>x</EM><SUP><SMALL>0</SMALL></SUP>=1, for nonzero <EM>x</EM>. There is really no "right" answer,
but mathematicians have adopted the convention that 0<SUP><SMALL>0</SMALL></SUP>=1. To make our
definition of <EM>t</EM> truly correct I have to choose a convention for <EM>t</EM>(0,0)
and modify the definition to reflect it:
<P><CENTER><IMG SRC="math42.gif" ALT="math display"></CENTER><P>
<P>
<P>It's straightforward to translate this mathematical function definition into
a Logo procedure:
<P><PRE>to t :n :k
if equalp :k 0 [output 1]
if equalp :n 0 [output 0]
output (t :n :k-1)+(t :n-1 :k)
end
</PRE>
<P>Using this function we can compute <CODE>t 4 7</CODE> and find that the
answer to the original problem is 120.
<P>(Can you write a program to display the actual expansion? That is, it
should print something like
<P><PRE>(A+B+C+D)^7 =
1 A^7 +
7 A^6 B +
21 A^5 B^2 +
...
</PRE>
<P>There are two parts to this problem. One is to figure out the
combinations of variables in the 120 terms, which can be done with a
procedure like <CODE>combs</CODE>, and the other is to figure out the coefficients,
which I discussed at the beginning of this section.)
<P>I was introduced to this problem as a student teacher in a high school
probability class. The teacher gave "how many terms are there in the
expansion of (<EM>a</EM>+<EM>b</EM>+<EM>c</EM>+<EM>d</EM>)<SUP><SMALL>7</SMALL></SUP>" as a quiz problem, and nobody answered it.
In the ensuing class discussion, it turned out that she meant the students
to answer the much easier question of how many products of seven variables
there are. As I noted earlier, the answer to <EM>that</EM> question is just
4<SUP><SMALL>7</SMALL></SUP>. But all the students interpreted the question as meaning the harder
one we've been exploring here. I took the problem home that evening and
reached the point we've reached in this chapter. I didn't think I could get
a better answer than that until my housemate taught me about
generating functions. It turns out that there <EM>is</EM> a
closed form definition for
this function:
<P><CENTER><IMG SRC="math43.gif" ALT="math display"></CENTER><P>
This definition is faster to compute as well as more beautiful.
<P>Once armed with the formula, it wasn't hard to invent a way to demonstrate
that it must be correct without going through the inductive definition and
the use of calculus. The trick is that we must be choosing <EM>n</EM>−1 somethings
out of a possible <EM>k</EM>+<EM>n</EM>−1 for each term. What does a term look like?
Ignoring the constant coefficient, it is the product of <EM>k</EM> (seven, in the
specific problem given) variables, some of which may be equal. Furthermore,
when the terms are written in the usual way, the variables come in
alphabetical order. A term like <EM>a</EM><SUP><SMALL>2</SMALL></SUP><EM>b</EM><SUP><SMALL>4</SMALL></SUP><EM>d</EM> represents <EM>aabbbbd</EM>; there won't
be a different term with the same letters in another order. In general, the
<EM>k</EM> variables will be some number (zero or more) of <EM>a</EM><SUB><SMALL>1</SMALL></SUB>s, then some number
of <EM>a</EM><SUB><SMALL>2</SMALL></SUB>s, and so on.
<P>Now comes the trick. Suppose we write the string of variables with a
"wall" for each change to the next letter. So instead of <EM>aabbbbd</EM> I want
to write <EM>aa</EM>|<EM>bbbb</EM>||<EM>d</EM>. (There are two walls before the
final <EM>d</EM> to reflect the fact that we skipped over <EM>c</EM>.) In this notation
there are always exactly <EM>n</EM>−1 walls. (That's why I chose to put the walls
in; remember, we're looking for <EM>n</EM>−1 of something.) The term includes <EM>k</EM>
variables and <EM>n</EM>−1 walls, for a total of <EM>k</EM>+<EM>n</EM>−1 symbols.
<P>Once the walls are there, it really is no longer necessary to preserve the
individual variable letters. The sample term we've been using could just as
well be written <EM>xx</EM>|<EM>xxxx</EM>||<EM>x</EM>. What comes before the first
wall is the first variable letter, and so on. So ||<EM>xxx</EM>|<EM>xxxx</EM> represents <EM>c</EM><SUP><SMALL>3</SMALL></SUP><EM>d</EM><SUP><SMALL>4</SMALL></SUP>. But now we're finished. We have found a way to
represent each possible term as a string of <EM>k</EM> copies of the letter <EM>x</EM>
interspersed with <EM>n</EM>−1 walls. How many such arrangements are there? How
many ways are there to choose <EM>n</EM>−1 positions for walls in a string of
<EM>k</EM>+<EM>n</EM>−1 symbols?
<P>Earlier, in talking about the difference between closed form and inductive
definitions, I suggested that the an inductive definition might be much
easier to discover even if a closed form definition also exists. This is a
clear example. If I'd given the demonstration just above, with <EM>x</EM>s and
walls, without first showing you the more roundabout way I really discovered
the definition, you'd rightly complain about rabbits out of hats.
<P>
<TABLE width="100%"><TR><TD><A HREF="../v3-toc2.html">(back to Table of Contents)</A>
<TD align="right"><A HREF="../v3ch1/v3ch1.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v3ch3/v3ch3.html"><STRONG>NEXT</STRONG></A>
</TABLE>
<P><H2>Program Listing</H2>
<P><P>
<P><PRE>
;;; Logic problem inference system
;; Establish categories
to category :category.name :members
print (list "category :category.name :members)
if not namep "categories [make "categories []]
make "categories lput :category.name :categories
make :category.name :members
foreach :members [pprop ? "category :category.name]
end
;; Verify and falsify matches
to verify :a :b
settruth :a :b "true
end
to falsify :a :b
settruth :a :b "false
end
to settruth :a :b :truth.value
if equalp (gprop :a "category) (gprop :b "category) [stop]
localmake "oldvalue get :a :b
if equalp :oldvalue :truth.value [stop]
if equalp :oldvalue (not :truth.value) ~
[(throw "error (sentence [inconsistency in settruth]
:a :b :truth.value))]
print (list :a :b "-> :truth.value)
store :a :b :truth.value
settruth1 :a :b :truth.value
settruth1 :b :a :truth.value
if not emptyp :oldvalue ~
[foreach (filter [equalp first ? :truth.value] :oldvalue)
[apply "settruth butfirst ?]]
end
to settruth1 :a :b :truth.value
apply (word "find not :truth.value) (list :a :b)
foreach (gprop :a "true) [settruth ? :b :truth.value]
if :truth.value [foreach (gprop :a "false) [falsify ? :b]
pprop :a (gprop :b "category) :b]
pprop :a :truth.value (fput :b gprop :a :truth.value)
end
to findfalse :a :b
foreach (filter [not equalp get ? :b "true] peers :a) ~
[falsify ? :b]
end
to findtrue :a :b
if equalp (count peers :a) (1+falses :a :b) ~
[verify (find [not equalp get ? :b "false] peers :a)
:b]
end
to falses :a :b
output count filter [equalp "false get ? :b] peers :a
end
to peers :a
output thing gprop :a "category
end
;; Common types of clues
to differ :list
print (list "differ :list)
foreach :list [differ1 ? ?rest]
end
to differ1 :a :them
foreach :them [falsify :a ?]
end
to justbefore :this :that :lineup
falsify :this :that
falsify :this last :lineup
falsify :that first :lineup
justbefore1 :this :that :lineup
end
to justbefore1 :this :that :slotlist
if emptyp butfirst :slotlist [stop]
equiv :this (first :slotlist) :that (first butfirst :slotlist)
justbefore1 :this :that (butfirst :slotlist)
end
;; Remember conditional linkages
to implies :who1 :what1 :truth1 :who2 :what2 :truth2
implies1 :who1 :what1 :truth1 :who2 :what2 :truth2
implies1 :who2 :what2 (not :truth2) :who1 :what1 (not :truth1)
end
to implies1 :who1 :what1 :truth1 :who2 :what2 :truth2
localmake "old1 get :who1 :what1
if equalp :old1 :truth1 [settruth :who2 :what2 :truth2 stop]
if equalp :old1 (not :truth1) [stop]
if memberp (list :truth1 :who2 :what2 (not :truth2)) :old1 ~
[settruth :who1 :what1 (not :truth1) stop]
if memberp (list :truth1 :what2 :who2 (not :truth2)) :old1 ~
[settruth :who1 :what1 (not :truth1) stop]
store :who1 :what1 ~
fput (list :truth1 :who2 :what2 :truth2) :old1
end
to equiv :who1 :what1 :who2 :what2
implies :who1 :what1 "true :who2 :what2 "true
implies :who2 :what2 "true :who1 :what1 "true
end
to xor :who1 :what1 :who2 :what2
implies :who1 :what1 "true :who2 :what2 "false
implies :who1 :what1 "false :who2 :what2 "true
end
;; Interface to property list mechanism
to get :a :b
output gprop :a :b
end
to store :a :b :val
pprop :a :b :val
pprop :b :a :val
end
;; Print the solution
to solution
foreach thing first :categories [solve1 ? butfirst :categories]
end
to solve1 :who :order
type :who
foreach :order [type "| | type gprop :who ?]
print []
end
;; Get rid of old problem data
to cleanup
if not namep "categories [stop]
ern :categories
ern "categories
erpls
end
;; Anita Harnadek's problem
to cub.reporter
cleanup
category "first [Jane Larry Opal Perry]
category "last [Irving King Mendle Nathan]
category "age [32 38 45 55]
category "job [drafter pilot sergeant driver]
differ [Jane King Larry Nathan]
says "Jane "Irving 45
says "King "Perry "driver
says "Larry "sergeant 45
says "Nathan "drafter 38
differ [Mendle Jane Opal Nathan]
says "Mendle "pilot "Larry
says "Jane "pilot 45
says "Opal 55 "driver
says "Nathan 38 "driver
print []
solution
end
to says :who :what1 :what2
print (list "says :who :what1 :what2)
xor :who :what1 :who :what2
end
;; Diane Baldwin's problem
to foote.family
cleanup
category "when [1st 2nd 3rd 4th 5th]
category "name [Felix Fred Frank Francine Flo]
category "street [Field Flag Fig Fork Frond]
category "item [food film flashlight fan fiddle]
category "position [1 2 3 4 5]
print [Clue 1]
justbefore "Flag "2nd :position
justbefore "2nd "Fred :position
print [Clue 2]
male [film Fig 5th]
print [Clue 3]
justbefore "flashlight "Fork :position
justbefore "Fork "1st :position
female [1st]
print [Clue 4]
falsify "5th "Frond
falsify "5th "fan
print [Clue 5]
justbefore "Francine "Frank :position
justbefore "Francine "Frank :when
print [Clue 6]
female [3rd Flag]
print [Clue 7]
justbefore "fiddle "Frond :when
justbefore "Flo "fiddle :when
print []
solution
end
to male :stuff
differ sentence :stuff [Francine Flo]
end
to female :stuff
differ sentence :stuff [Felix Fred Frank]
end
;;; Combinatorics toolkit
to combs :list :howmany
if equalp :howmany 0 [output [[]]]
if equalp :howmany count :list [output (list :list)]
output sentence (map [fput first :list ?]
combs (butfirst :list) (:howmany-1)) ~
(combs (butfirst :list) :howmany)
end
to fact :n
output cascade :n [# * ?] 1
end
to perms :n :r
if equalp :r 0 [output 1]
output :n * perms :n-1 :r-1
end
to choose :n :r
output (perms :n :r)/(fact :r)
end
;; The socks problem
to socks :list
localmake "total combs (expand :list) 2
localmake "matching filter [equalp first ? last ?] :total
print (sentence [there are] count :total [possible pairs of socks.])
print (sentence [of these,] count :matching [are matching pairs.])
print sentence [probability of match =] ~
word (100 * (count :matching)/(count :total)) "%
end
to expand :list
if emptyp :list [output []]
if numberp first :list ~
[output cascade (first :list)
[fput first butfirst :list ?]
(expand butfirst butfirst :list)]
output fput first :list expand butfirst :list
end
to socktest
localmake "first pick [brown brown brown brown brown brown
blue blue blue blue]
localmake "second ~
pick (ifelse equalp :first "brown ~
[[brown brown brown brown brown
blue blue blue blue]] ~
[[brown brown brown brown brown brown
blue blue blue]])
output equalp :first :second
end
;; The Simplex lock problem
to lock :buttons
output cascade :buttons [? + lock1 :buttons #] 1
end
to lock1 :total :buttons
local "perms
make "perms perms :total :buttons
output cascade (twoto (:buttons-1)) [? + lock2 :perms #-1 1] 0
end
to lock2 :perms :links :factor
if equalp :links 0 [output :perms/(fact :factor)]
if equalp (remainder :links 2) 0 ~
[output lock2 :perms/(fact :factor) :links/2 1]
output lock2 :perms (:links-1)/2 :factor+1
end
to twoto :power
output cascade :power [2 * ?] 1
end
to simplex :buttons
output 2 * f :buttons
end
to f :n
if equalp :n 0 [output 1]
output cascade :n [? + ((choose :n (#-1)) * f (#-1))] 0
end
to simp :n
output round (fact :n)/(power (ln 2) (:n+1))
end
;; The multinomial expansion problem
to t :n :k
if equalp :k 0 [output 1]
if equalp :n 0 [output 0]
output (t :n :k-1)+(t :n-1 :k)
end
</PRE><P>
<P><A HREF="../v3-toc2.html">(back to Table of Contents)</A>
<P><A HREF="../v3ch1/v3ch1.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v3ch3/v3ch3.html"><STRONG>NEXT</STRONG></A>
<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>,
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>