diff options
author | elioat <elioat@tilde.institute> | 2023-08-23 07:52:19 -0400 |
---|---|---|
committer | elioat <elioat@tilde.institute> | 2023-08-23 07:52:19 -0400 |
commit | 562a9a52d599d9a05f871404050968a5fd282640 (patch) | |
tree | 7d3305c1252c043bfe246ccc7deff0056aa6b5ab /js/games/nluqo.github.io/~bh/v1ch12 | |
parent | 5d012c6c011a9dedf7d0a098e456206244eb5a0f (diff) | |
download | tour-562a9a52d599d9a05f871404050968a5fd282640.tar.gz |
*
Diffstat (limited to 'js/games/nluqo.github.io/~bh/v1ch12')
-rw-r--r-- | js/games/nluqo.github.io/~bh/v1ch12/compose.gif | bin | 0 -> 2084 bytes | |||
-rw-r--r-- | js/games/nluqo.github.io/~bh/v1ch12/playfair.html | 715 | ||||
-rw-r--r-- | js/games/nluqo.github.io/~bh/v1ch12/v1ch12.html | 715 |
3 files changed, 1430 insertions, 0 deletions
diff --git a/js/games/nluqo.github.io/~bh/v1ch12/compose.gif b/js/games/nluqo.github.io/~bh/v1ch12/compose.gif new file mode 100644 index 0000000..29c7a7d --- /dev/null +++ b/js/games/nluqo.github.io/~bh/v1ch12/compose.gif Binary files differdiff --git a/js/games/nluqo.github.io/~bh/v1ch12/playfair.html b/js/games/nluqo.github.io/~bh/v1ch12/playfair.html new file mode 100644 index 0000000..827b018 --- /dev/null +++ b/js/games/nluqo.github.io/~bh/v1ch12/playfair.html @@ -0,0 +1,715 @@ +<HTML> +<HEAD> +<TITLE>Computer Science Logo Style vol 1 ch 12: Example: Playfair Cipher</TITLE> +</HEAD> +<BODY> +<CITE>Computer Science Logo Style</CITE> volume 1: +<CITE>Symbolic Computing</CITE> 2/e Copyright (C) 1997 MIT +<H1>Example: Playfair Cipher</H1> + +<TABLE width="100%"><TR><TD> +<IMG SRC="../csls1.jpg" ALT="cover photo"> +<TD><TABLE> +<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian +Harvey</A><BR>University of California, Berkeley</CITE> +<TR><TD align="right"><BR> +<TR><TD align="right"><A HREF="../pdf/v1ch12.pdf">Download PDF version</A> +<TR><TD align="right"><A HREF="../v1-toc2.html">Back to Table of Contents</A> +<TR><TD align="right"><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A> +<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-1">MIT +Press web page for Computer Science Logo Style</A> +</TABLE></TABLE> + +<HR><P>Program file for this chapter: <A HREF="https://people.eecs.berkeley.edu/~bh/v1ch12/playfair.lg"><CODE>playfair</CODE></A> + +<P>This project investigates a cipher that is somewhat more complicated than +the simple substitution cipher of Chapter 11. In the +Playfair cipher, there is not a single translation of each letter +of the alphabet; that is, you don't just decide that every B will be turned +into an F. Instead, <EM>pairs</EM> of letters are translated into other +pairs of letters. + +<P>Here is how it works. To start, pick a <EM>keyword</EM> that does not +contain any letter more than once. For example, I'll pick the word +<CODE>keyword</CODE>. Now write the letters of that word in the first squares of a +five by five matrix: + +<P> +<CENTER><TABLE rules="all" frame="box" border="3" noshade> +<TR><TD align="center" height=30 width=30>K +<TD align="center" height=30 width=30>E +<TD align="center" height=30 width=30>Y +<TD align="center" height=30 width=30>W +<TD align="center" height=30 width=30>O +<TR><TD align="center" height=30 width=30>R +<TD align="center" height=30 width=30>D +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TR><TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TR><TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TR><TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +</TABLE></CENTER> + +<P>Then finish filling up the remaining squares of the matrix with the +remaining letters of the alphabet, in alphabetical order. Since there are +26 letters and only 25 squares, we assign I and J to the same square. + +<P> +<CENTER><TABLE rules="all" frame="box" border="3" noshade> +<TR><TD align="center" height=30 width=30>K +<TD align="center" height=30 width=30>E +<TD align="center" height=30 width=30>Y +<TD align="center" height=30 width=30>W +<TD align="center" height=30 width=30>O +<TR><TD align="center" height=30 width=30>R +<TD align="center" height=30 width=30>D +<TD align="center" height=30 width=30>A +<TD align="center" height=30 width=30>B +<TD align="center" height=30 width=30>C +<TR><TD align="center" height=30 width=30>F +<TD align="center" height=30 width=30>G +<TD align="center" height=30 width=30>H +<TD align="center" height=30 width=30>I J +<TD align="center" height=30 width=30>L +<TR><TD align="center" height=30 width=30>M +<TD align="center" height=30 width=30>N +<TD align="center" height=30 width=30>P +<TD align="center" height=30 width=30>Q +<TD align="center" height=30 width=30>S +<TR><TD align="center" height=30 width=30>T +<TD align="center" height=30 width=30>U +<TD align="center" height=30 width=30>V +<TD align="center" height=30 width=30>X +<TD align="center" height=30 width=30>Z +</TABLE></CENTER> + +<P>(Actually, when choosing the keyword, besides making sure that no +letter appears twice you must make sure that I and J do not both appear. +For example, <CODE>juice</CODE> wouldn't do as a keyword.) + +<P>To encipher a message, divide it into pairs of letters. Pay no attention to +punctuation or to spaces between words. For example, the sentence "Why, +don't you?" becomes + +<P><PRE>WH YD ON TY OU +</PRE> + +<P>Now, find each pair of letters in the matrix you made earlier. +Most pairs of letters will form two corners of a smaller square or rectangle +within the matrix. For example, in my matrix, the first pair of letters +(<CODE>WH</CODE>) are at two corners of a two-by-three rectangle also containing +<CODE>Y</CODE>, <CODE>A</CODE>, <CODE>B</CODE>, and <CODE>IJ</CODE>. The enciphering of the pair +<CODE>WH</CODE> is the pair at the two other corners of this rectangle, namely <CODE> +YI</CODE>. (I could also have chosen <CODE>YJ</CODE>, in this case.) It's important to +be consistent about the order of the new pair: the one that comes first is +the one on the same <EM>row</EM> as the first of the original pair. In this +case, <CODE>Y</CODE> is on the same row as <CODE>W</CODE>. We can continue to translate +the remaining pairs of letters in the same way, ending up with + +<P><PRE>YI EA ES VK EZ +</PRE> + +<P>Notice that the letter <CODE>Y</CODE> turned into <CODE>E</CODE> in the second +pair of letters, but it turned into <CODE>K</CODE> in the fourth pair. + +<P>Part of the strategy for keeping a code secret is to hide even the <EM> +kind</EM> of code being used. Pairs of letters, to a cryptographer, are a +dead giveaway that a Playfair cipher was used, so it's traditional to insert +irrelevant spacing and punctuation in the actual written version of the +message, like this: + +<P><PRE>Yie ae, svkez. +</PRE> + +<P>Of course the recipient of the message, knowing how the message +was encoded, ignores this spacing and punctuation. + +<P>As an illustration of some of the special cases that complicate this scheme, +consider the message, "Come to the window." First we divide it up into +pairs: + +<P><PRE>CO ME TO TH EW IN DO W +</PRE> + +<P>The first problem is that the message has an odd number of +letters. To solve this problem we simply add an extra letter at the end, +generally <CODE>Q</CODE>. In this example, the final <CODE>W</CODE> becomes a pair <CODE> +WQ</CODE>. + +<P>If you look up the first pair of letters, <CODE>CO</CODE>, in my matrix, you'll +find that they do not determine a rectangle, because they are in the same +column. (Strictly speaking, they <EM>do</EM> determine a one-by-two +rectangle, but the two diagonals are the same, so that <CODE>CO</CODE> would be +encoded as <CODE>CO</CODE> if we followed the usual rule.) For two letters in the +same column, the rule is to replace each letter by the one below it, so <CODE> +CO</CODE> becomes <CODE>LC</CODE>. (If one of the letters is at the end of the column, +it is replaced by the top letter. So, for example, <CODE>OZ</CODE> would become +<CODE>CO</CODE>.) Similarly, for two letters in the same row, each is replaced by +the letter to its right. We can now translate the entire message: + +<P><PRE>LC NK ZK VF YO GQ CE BX +</PRE> + +<P>The pair <CODE>EW</CODE>, on a single row, has become <CODE>YO</CODE>; the final +pair <CODE>WQ</CODE>, on a single column, has become <CODE>BX</CODE>. + +<P>The final exceptional case is the one in which the same letter appears twice +in a pair. For example, the phrase "the big wheel" divides into + +<P><PRE>TH EB IG WH EE LQ +</PRE> + +<P>The pair <CODE>EE</CODE> is treated specially. It could be translated +into <CODE>YY</CODE> (treating it as two letters in the same row) or into <CODE>DD</CODE> +(if you think of it as two letters in the same column). Instead, though, +the rule is to break up the pair by inserting a <CODE>Q</CODE> between the two +letters. This changes all the pairings after that one in the message. The +new version is + +<P><PRE>TH EB IG WH EQ EL +</PRE> + +<P>This version can now be translated into + +<P><PRE>VF WD LH YJ WN OG +</PRE> + +<P>(Notice that I chose to translate <CODE>WH</CODE> into <CODE>YJ</CODE> instead of into +<CODE>YI</CODE>. You should use some of each when coding a message. A cipher with +no <CODE>J</CODE>s at all, or one with a simple pattern of <CODE>I</CODE> and <CODE>J</CODE> +alternating, is another giveaway that the Playfair cipher was used.) + +<P>What about the frequencies of letters in a Playfair-encoded message? You +can't simply say that the most common letters are likely to represent <CODE> +E</CODE> or <CODE>T</CODE> or <CODE>A</CODE>, because a letter doesn't represent a single letter +that way. But it is still possible to say that a common letter in the coded +version is likely to <EM>be on the same row</EM> as one of the frequent +letters in English. For example, here is a well-known text +in Playfair-coded form: + +<P><PRE>ZK DW KC SE XM ZK DW VF RV LQ VF WN ED MZ LW QE GY VF KD XF MP WC GO +BF MU GY QF UG ZK NZ IM GK FK GY ZS GQ LN DP AB BM CK OQ KL EZ KF DH +YK ZN LK FK EU YK FK KZ RY YD FT PC HD GQ MZ CP YD KL KF EZ CI ON DP +AC WK QS SY QL UN DU RU GY NS +</PRE> + +<P>The most commonly occurring letters in this coded text are <CODE>K</CODE> +(19 times), <CODE>F</CODE> (12 times), <CODE>D</CODE> and <CODE>Z</CODE> (tied at 11), and <CODE> +Y</CODE> (10 times). <CODE>K</CODE> is on the same row as both <CODE>E</CODE> and <CODE>O</CODE>, and +can also represent <CODE>T</CODE> in the same-column case. <CODE>Y</CODE> is also on the +same row. <CODE>F</CODE> can represent <CODE>I</CODE> (especially in the common pair <CODE> +IT</CODE>); <CODE>D</CODE> can represent <CODE>A</CODE>; <CODE>Z</CODE> can represent <CODE>T</CODE>. Of all +the letters that might represent <CODE>E</CODE>, why should <CODE>K</CODE> and <CODE>Y</CODE> be +the popular ones? The answer is that they have common letters in their +columns as well. In order for <CODE>W</CODE> to represent <CODE>E</CODE>, for example, +the other letter of the (cleartext) pair must be <CODE>B</CODE>, <CODE>I</CODE>, <CODE>J</CODE>, +<CODE>Q</CODE>, or <CODE>X</CODE>. Of these, only <CODE>I</CODE> is particularly common, and +<CODE>Q</CODE> and <CODE>X</CODE> are downright rare. + +<P>If you were trying to break a Playfair cipher, one approach you might take +would be to count the frequencies of <EM>pairs</EM> of letters. For example, +in the message above, the only pairs that occur more than twice are <CODE> +GY</CODE>, four times, and <CODE>FK</CODE>, <CODE>VF</CODE>, and <CODE>ZK</CODE>, three times each. +It's a good guess that each of these corresponds to a commonly occurring +pair of letters in English text. In fact, as it turns out, <CODE>GY</CODE> +corresponds to <CODE>HE</CODE>, which is not only a word by itself but also part of +<CODE>the</CODE>, <CODE>them</CODE>, <CODE>then</CODE>, and so on. <CODE>VF</CODE> corresponds to <CODE> +TH</CODE>, an extremely common pair; <CODE>ZK</CODE> corresponds to <CODE>TO</CODE>, which is +again a word in itself as well as a constituent of many other words. The +other pair that occurs three times in the text, <CODE>FK</CODE>, corresponds to +<CODE>RT</CODE>. This is not such a common English pair, although it does come up +in words like <CODE>worth</CODE>. But it turns out that in the particular sample +text I'm using, this pair of letters comes up mostly as parts of two words, +as in the combination <CODE>or to</CODE>. + +<P> + +<P>If you want to know more about how to break a Playfair cipher, you can see +an example in <EM>Have His Carcase</EM><EM>,</EM> a mystery novel by +Dorothy L. Sayers. In this project, I'm less ambitious: the +program merely enciphers a message, given the keyword and the cleartext as +inputs. The first input to <CODE>playfair</CODE> must be a word, the keyword. The +second input must be a list of words, the text. The keyword must meet the +criterion of no duplicated letters, and the cleartext input must contain +only words of letters, without punctuation. Here is an example: + +<P><PRE>? <U>print playfair "keyword [come to the window]</U> +lcnkzkvfyogqcebx +</PRE> + +<P><CODE>Playfair</CODE> is an operation whose output is a single word +containing the enciphered letters of the original text. + +<P><H2>Data Redundancy</H2> + + + +<P>In writing this program, the first question I thought about was how to +represent in a Logo program the matrix of letters used in the coding +process. The most natural structure is a two-dimensional array--that is, +an array with five members, each of which is an array of five +letters.<SUP>*</SUP> So if +the keyword is <CODE>keyword</CODE> then the program will, in effect, do this: + +<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>In the tic-tac-toe program, I used a one-dimensional array +to represent the board, even though a tic-tac-toe board is drawn in two +dimensions. I could have used an array of three arrays of three numbers +each, but that wouldn't really have fit with the way that program labels the +board. In tic-tac-toe, the nine squares are named 1 to 9. You ask to move +in square 8, for example, not in row 3, column 2. But in the Playfair +program, the row and column numbers are going to be very important.</SMALL></BLOCKQUOTE></SMALL><P><PRE>make "matrix {{k e y w o} {r d a b c} {f g h i l} + {m n p q s} {t u v x z}} +</PRE> + +<P>The position of a letter in the matrix is represented as a list of +two numbers, the row and the column. The Berkeley Logo procedure library +includes an operation <CODE>mditem</CODE> that takes such a list as an input, along +with a multi-dimensional array, and outputs the desired member: + +<P><PRE>to letter :rowcol +output mditem :rowcol :matrix +end +</PRE> + +<P>(The actual procedure listed at the end of this section includes a +slight complication to deal with the case of <CODE>I</CODE> and <CODE>J</CODE>, but that's +not important right now.) + +<P>The Playfair process goes like this: The program is given two letters. It +finds each letter in the matrix, determines each letter's row and column +numbers, then rearranges those numbers to make new row and column numbers, +then looks in the matrix again to find the corresponding letters. For +example, suppose we are given the keyword <CODE>keyword</CODE> and the letters <CODE> +T</CODE> and <CODE>A</CODE>. The first step is to translate <CODE>T</CODE> into the row and +column list <CODE>[5 1]</CODE>, and to translate <CODE>A</CODE> into <CODE>[2 3]</CODE>. Then +the program must combine the row of one letter with the column of the other, +giving the new lists <CODE>[5 3]</CODE> and <CODE>[2 1]</CODE>. Finally, the <CODE>letter</CODE> +procedure shown above will find the letters <CODE>V</CODE> and <CODE>R</CODE> in the +matrix. + +<P><CODE>Letter</CODE> handles the last step of the translation process, but what about +the first step? We need the inverse operation of <CODE>letter</CODE>, one that +takes a letter as input and provides its row and column. + +<P>It would be possible to write a <CODE>row.and.column</CODE> procedure that would +examine each letter in the matrix until it located the desired letter. But +that procedure would be both slow and complicated. Instead, I decided +to keep <EM>redundant</EM> information about the matrix in the form of 26 +variables, one for each letter, each of which contains the coordinates of +that letter. That is, the variables take the form + +<P><PRE>make "a [2 3] +make "w [1 4] +make "z [5 5] +</PRE> + +<P>and so on. (As in the case of the variable named <CODE>matrix</CODE> +above, these <CODE>make</CODE> instructions are just illustrative. The actual +program does not contain explicit data for this particular matrix, using +this particular keyword!) + +<P>The letter variables contain the same information as the variable <CODE> +matrix</CODE>. Strictly speaking, they are not needed. By creating the redundant +variables for the letters, I've made a <EM>space/time tradeoff;</EM> +the extra variables take up room in the computer's memory, but the program +runs faster. One of the recurring concerns of a professional programmer is +deciding which way to make such tradeoffs. It depends on the amounts of +space and time required and the amounts available. In this case, the extra +space required is really quite small, compared to the memory of a modern +computer, so the decision is clear-cut. For larger programming problems it +is sometimes harder to decide. + +<P> +<H2>Composition of Functions</H2> + + + +<P>Earlier I showed a <CODE>make</CODE> instruction to put a particular coding matrix +into the variable <CODE>matrix</CODE>. How does the program create a matrix for +any keyword given as input? Here are two of the relevant procedures: + +<P><PRE>to playfair :keyword :message +local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z] +<U>setkeyword jtoi lowercase :keyword</U> +output encode (reduce "word :message) +end + +to setkeyword :word +<U>make "matrix reorder word :word (remove :word "abcdefghiklmnopqrstuvwxyz)</U> +make "j :i +end +</PRE> + +<P> +<P>The keyword that is provided by the user as one of the inputs to the +toplevel procedure <CODE>playfair</CODE> goes through several stages as it is +transformed into a matrix. + +<P> +<CENTER><IMG SRC="compose.gif" ALT="figure: compose"></CENTER> + + +<P>This <EM>dataflow</EM> diagram is very similar to a plumbing +diagram from Chapter 2 turned on its side. The format is a little +different to put somewhat more emphasis on the inputs and outputs, so you +can follow the "flow" of information through the arrows. + +<P>In English, here's what the diagram tells us. The keyword given by the user +must be converted to lower case letters. (I could have chosen to use capital +letters instead; the goal is to have some uniform convention.) If the +keyword happens to contain a <CODE>J</CODE>, it will be represented within the +program as an <CODE>I</CODE> instead. Then, to make the matrix, we combine (with +<CODE>word</CODE>) two words: the keyword and the result of removing the keyword's +letters from the alphabet (leaving out <CODE>J</CODE>). Finally, that combined +word must be rearranged into a five-by-five square. + +<P> +The advantage of a view such as this one is that each of the small boxes +in the diagram has a relatively simple task. Indeed, <CODE>lowercase</CODE> and +<CODE>word</CODE> are primitive operations in Berkeley Logo. <CODE>Jtoi</CODE> is trivial: + +<P><PRE>to jtoi :word +output map [ifelse equalp ? "j ["i] [?]] :word +end +</PRE> + +<P><CODE>Remove</CODE> is a straightforward recursive operation that +outputs the result of removing one group of letters from another group +of letters. + +<P><PRE>to remove :letters :string +if emptyp :string [output "] +if memberp first :string :letters [output remove :letters bf :string] +output word first :string remove :letters bf :string +end +</PRE> + +<P>The job of <CODE>reorder</CODE> is somewhat messier. It must keep track +of what row and column it's up to, so <CODE>reorder</CODE> is just an +initialization procedure for the recursive helper <CODE>reorder1</CODE> that does +the real work. <CODE>Reorder</CODE> also creates the two-dimensional Logo array +to provide another input to its helper procedure. + +<P><PRE>to reorder :string +output reorder1 :string (mdarray [5 5]) 1 1 +end + +to reorder1 :string :array :row :column +if :row=6 [output :array] +if :column=6 [output reorder1 :string :array :row+1 1] +mdsetitem (list :row :column) :array first :string +make first :string (list :row :column) +output reorder1 (butfirst :string) :array :row :column+1 +end +</PRE> + +<P>If I were filling in a matrix by hand, instead of writing a computer +program, I'd use a very different approach. I'd handle one letter at a +time. First I'd go through the keyword a letter at a time, stuffing each +letter into the next available slot in the matrix. (If necessary, I'd +convert upper to lower case and <CODE>J</CODE> to <CODE>I</CODE> in the process.) +Then I'd go through the alphabet a letter at a time, saying "If this letter +isn't in the keyword, then stuff it into the matrix." + +<P> +Many people would find it natural to use that same technique in writing +a computer program, also: + +<P><PRE>to playfair :keyword :message ;; sequential version +local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z] +<U>make "matrix mdarray [5 5]</U> +<U>local [row column]</U> +<U>make "row 1</U> +<U>make "column 1</U> +<U>foreach :keyword [stuff jtoi lowercase ?]</U> +<U>foreach "abcdefghiklmnopqrstuvwxyz ~</U> + <U>[if not memberp ? jtoi :keyword [stuff ?]]</U> +make "j :i +output encode (reduce "word :message) +end + +to stuff :letter +mdsetitem (list :row :column) :matrix :letter +make :letter (list :row :column) +make "column :column+1 +if :column=6 [make "row :row+1 make "column 1] +end +</PRE> + +<P>In this version, the first <CODE>foreach</CODE> instruction handles the letters of +the keyword. The second <CODE>foreach</CODE> instruction handles the rest of the +alphabet. The <CODE>not memberp</CODE> test handles the removal of the keyword +letters from the alphabet. + +<P>My intent in writing this alternate version was to model my idea of how the +problem would be solved without a computer, processing one letter at a time. +So, for example, in the template + +<P><PRE>[stuff jtoi lowercase ?] +</PRE> + +<P>it's worth noting that the operations <CODE>jtoi</CODE> and <CODE> +lowercase</CODE> are being applied to single-letter inputs, even though those +operations were designed to accept words of any length as a unit. I +cheated, though, by applying <CODE>jtoi</CODE> to the entire keyword in the +second <CODE>foreach</CODE> instruction. I was trying to make the program more +readable; the honest version would be + +<P><PRE>foreach "abcdefghiklmnopqrstuvwxyz ~ + [if (ifelse equalp ? "i + [not (or (memberp "i :keyword) + (memberp "j :keyword))] + [not memberp ? :keyword]) + [stuff ?]] +</PRE> + +<P>Why am I subjecting you to this? My point is that what may seem to be the +most natural way to think about a problem--in this case, handling one +letter at a time--may not be the easiest, most elegant, or most efficient +programming solution. + +<P>What makes the dataflow-structured version of <CODE>playfair</CODE> possible is the +use of <EM>operations</EM> in Logo, and the <EM>composition</EM> of these +operations by using the output from one as the input to another. This is an +important technique, but one that doesn't seem to come naturally to everyone. +If you're not accustomed to writing operations, I think it really pays to +train yourself into that habit. + +<P><H2>Conversational Front End</H2> + + + +<P>It's inconvenient to type a long message into the computer in the form of an +input to a procedure. Another approach would be a <EM>conversational front +end.</EM> This is a procedure that reads the cleartext message using <CODE> +readlist</CODE>, perhaps accepting the message over several lines. It's not hard +to write: + +<P><PRE>to encode.big.message +local [keyword cleartext] +print [Welcome to the Playfair enciphering program.] +print [What keyword would you like to use?] +make "keyword first readlist +print [Now please enter your message, using as many lines as needed.] +print [When you're done, enter a line containing only a period (.).] +make "cleartext [] +read.big.message +print [Here is the enciphered version:] +print [] +print playfair :keyword :cleartext +end + +to read.big.message +local "line +make "line readlist +if equalp :line [.] [stop] +make "cleartext sentence :cleartext :line +read.big.message +end +</PRE> + +<P>Such a top-level procedure may be justified in a project like this, in which +a very large block of text may be used as a datum. But don't get carried +away. Programming languages that don't emphasize composition of functions +encourage this sort of programming style, to the point where the part of the +program that prompts the user and reads the data gets to be longer than the +part that does the actual computation. This preoccupation with verbose +conversation between the program and the user is sometimes justified by the +idea of "good human engineering," but I don't think that's necessarily +true. To take an extreme case, consider the standard elementary school Logo +procedure to draw a square: + +<P><PRE>to square :size +repeat 4 [forward :size right 90] +end +</PRE> + +<P>Compare that to this "human engineered" version: + +<P><PRE>to square +local "size +print [Brian's square program copyright 1985] +print [What size square would you like me to draw?] +make "size first readlist +repeat 4 [forward :size right 90] +print [Thank you, please come again.] +end +</PRE> + +<P>Not only is the first version (in my opinion) much more pleasant +to use, but it is also more powerful and flexible. The second version can +be used <EM>only</EM> as a top-level program, carrying on a conversation with +a human user. The first version can be run at top level, but it can also be +used as a subprocedure of a more complicated drawing program. If it's used +at top level, some person types in a number, the size, as the input to <CODE> +square</CODE> on the instruction line. If it's used inside another procedure, +that procedure can <EM>compute</EM> the input. + +<P><H2>Further Explorations</H2> + +<P>I haven't described the part of the program that actually transforms +the message: the procedure <CODE>encode</CODE> and its subprocedures. Read +the listing at the end of the chapter, then answer these questions: + +<P>»Why does <CODE>encode</CODE> need two base cases? + +<P>»What purpose is served by the four invocations of <CODE>thing</CODE> at the +beginning of procedure <CODE>paircode</CODE>? + +<P>Of course this program can be improved in many ways. + +<P>»One straightforward improvement to this program would be to +"bulletproof" it so that it doesn't die with a Logo error message if, for +example, the user provides a bad keyword. (Instead, the program should give +its own message, making it clear what the problem is. It's better for the +user to see + +<P><PRE>Keywords may not have any letter repeated. +</PRE> + +<P>than + +<P><PRE>t has no value in paircode +</PRE> + +<P>after making that mistake.) Also, what if the cleartext input +contains words with characters other than letters? The program should just +ignore those characters and process the letters in the words correctly. + +<P>»Another fairly straightforward improvement would be to take the one +long word output by <CODE>playfair</CODE> and turn it into a list of words with +spacing and punctuation thrown in at random. The goal is to have the result +look more or less like an actual paragraph of English text, except for the +scrambled letters. + +<P>Another direction would be to work on deciphering a Playfair-coded +message. There are two problems here: the easy one, in which you know what +the keyword is, and the hard one, in which you know only that a Playfair +cipher was used. + +<P>»The procedure <CODE>playfair</CODE> itself will almost work in +the first case. It would work perfectly were it not for the special cases +of letters in the same row and column. It's a simple modification to handle +those cases correctly. An interesting extension would be to try to restore +the original spacing by using a dictionary to guess where words end. + +<P>»The much harder problem is to try to guess the keyword. I mentioned +earlier some ideas about the approaches you'd have to take, such as +exploring the frequencies of use of pairs of letters. If you want more +advice, you'll have to study books on cryptography. + +<P> +<TABLE width="100%"><TR><TD><A HREF="../v1-toc2.html">(back to Table of Contents)</A> +<TD align="right"><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A> +</TABLE> + +<P><H2>Program Listing</H2> + +<PRE> +to playfair :keyword :message +local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z] +setkeyword jtoi lowercase :keyword +output encode (reduce "word :message) +end + +;; Prepare the code array + +to setkeyword :word +make "matrix reorder word :word (remove :word "abcdefghiklmnopqrstuvwxyz) +make "j :i +end + +to remove :letters :string +if emptyp :string [output "] +if memberp first :string :letters [output remove :letters bf :string] +output word first :string remove :letters bf :string +end + +to reorder :string +output reorder1 :string (mdarray [5 5]) 1 1 +end + +to reorder1 :string :array :row :column +if :row=6 [output :array] +if :column=6 [output reorder1 :string :array :row+1 1] +mdsetitem (list :row :column) :array first :string +make first :string (list :row :column) +output reorder1 (butfirst :string) :array :row :column+1 +end + +;; Encode the message + +to encode :message +if emptyp :message [output "] +if emptyp butfirst :message [output paircode first :message "q] +if equalp (jtoi first :message) (jtoi first butfirst :message) ~ + [output word (paircode first :message "q) (encode butfirst :message)] +output word (paircode first :message first butfirst :message) ~ + (encode butfirst butfirst :message) +end + +to paircode :one :two +local [row1 column1 row2 column2] +make "row1 first thing :one +make "column1 last thing :one +make "row2 first thing :two +make "column2 last thing :two +if :row1 = :row2 ~ + [output letters (list :row1 rotate (:column1+1)) ~ + (list :row1 rotate (:column2+1))] +if :column1 = :column2 ~ + [output letters (list rotate (:row1+1) :column1) ~ + (list rotate (:row2+1) :column1)] +output letters (list :row1 :column2) (list :row2 :column1) +end + +to rotate :index +output ifelse :index = 6 [1] [:index] +end + +to letters :one :two +output word letter :one letter :two +end + +to letter :rowcol +output itoj mditem :rowcol :matrix +end + +;; I and J conversion + +to jtoi :word +output map [ifelse equalp ? "j ["i] [?]] :word +end + +to itoj :letter +if :letter = "i [if (random 3) = 0 [output "j]] +output :letter +end +</PRE> + +<P><A HREF="../v1-toc2.html">(back to Table of Contents)</A> +<P><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A> + +<P> +<ADDRESS> +<A HREF="../index.html">Brian Harvey</A>, +<CODE>bh@cs.berkeley.edu</CODE> +</ADDRESS> +</BODY> +</HTML> diff --git a/js/games/nluqo.github.io/~bh/v1ch12/v1ch12.html b/js/games/nluqo.github.io/~bh/v1ch12/v1ch12.html new file mode 100644 index 0000000..827b018 --- /dev/null +++ b/js/games/nluqo.github.io/~bh/v1ch12/v1ch12.html @@ -0,0 +1,715 @@ +<HTML> +<HEAD> +<TITLE>Computer Science Logo Style vol 1 ch 12: Example: Playfair Cipher</TITLE> +</HEAD> +<BODY> +<CITE>Computer Science Logo Style</CITE> volume 1: +<CITE>Symbolic Computing</CITE> 2/e Copyright (C) 1997 MIT +<H1>Example: Playfair Cipher</H1> + +<TABLE width="100%"><TR><TD> +<IMG SRC="../csls1.jpg" ALT="cover photo"> +<TD><TABLE> +<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian +Harvey</A><BR>University of California, Berkeley</CITE> +<TR><TD align="right"><BR> +<TR><TD align="right"><A HREF="../pdf/v1ch12.pdf">Download PDF version</A> +<TR><TD align="right"><A HREF="../v1-toc2.html">Back to Table of Contents</A> +<TR><TD align="right"><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A> +<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-1">MIT +Press web page for Computer Science Logo Style</A> +</TABLE></TABLE> + +<HR><P>Program file for this chapter: <A HREF="https://people.eecs.berkeley.edu/~bh/v1ch12/playfair.lg"><CODE>playfair</CODE></A> + +<P>This project investigates a cipher that is somewhat more complicated than +the simple substitution cipher of Chapter 11. In the +Playfair cipher, there is not a single translation of each letter +of the alphabet; that is, you don't just decide that every B will be turned +into an F. Instead, <EM>pairs</EM> of letters are translated into other +pairs of letters. + +<P>Here is how it works. To start, pick a <EM>keyword</EM> that does not +contain any letter more than once. For example, I'll pick the word +<CODE>keyword</CODE>. Now write the letters of that word in the first squares of a +five by five matrix: + +<P> +<CENTER><TABLE rules="all" frame="box" border="3" noshade> +<TR><TD align="center" height=30 width=30>K +<TD align="center" height=30 width=30>E +<TD align="center" height=30 width=30>Y +<TD align="center" height=30 width=30>W +<TD align="center" height=30 width=30>O +<TR><TD align="center" height=30 width=30>R +<TD align="center" height=30 width=30>D +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TR><TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TR><TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TR><TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +<TD align="center" height=30 width=30> +</TABLE></CENTER> + +<P>Then finish filling up the remaining squares of the matrix with the +remaining letters of the alphabet, in alphabetical order. Since there are +26 letters and only 25 squares, we assign I and J to the same square. + +<P> +<CENTER><TABLE rules="all" frame="box" border="3" noshade> +<TR><TD align="center" height=30 width=30>K +<TD align="center" height=30 width=30>E +<TD align="center" height=30 width=30>Y +<TD align="center" height=30 width=30>W +<TD align="center" height=30 width=30>O +<TR><TD align="center" height=30 width=30>R +<TD align="center" height=30 width=30>D +<TD align="center" height=30 width=30>A +<TD align="center" height=30 width=30>B +<TD align="center" height=30 width=30>C +<TR><TD align="center" height=30 width=30>F +<TD align="center" height=30 width=30>G +<TD align="center" height=30 width=30>H +<TD align="center" height=30 width=30>I J +<TD align="center" height=30 width=30>L +<TR><TD align="center" height=30 width=30>M +<TD align="center" height=30 width=30>N +<TD align="center" height=30 width=30>P +<TD align="center" height=30 width=30>Q +<TD align="center" height=30 width=30>S +<TR><TD align="center" height=30 width=30>T +<TD align="center" height=30 width=30>U +<TD align="center" height=30 width=30>V +<TD align="center" height=30 width=30>X +<TD align="center" height=30 width=30>Z +</TABLE></CENTER> + +<P>(Actually, when choosing the keyword, besides making sure that no +letter appears twice you must make sure that I and J do not both appear. +For example, <CODE>juice</CODE> wouldn't do as a keyword.) + +<P>To encipher a message, divide it into pairs of letters. Pay no attention to +punctuation or to spaces between words. For example, the sentence "Why, +don't you?" becomes + +<P><PRE>WH YD ON TY OU +</PRE> + +<P>Now, find each pair of letters in the matrix you made earlier. +Most pairs of letters will form two corners of a smaller square or rectangle +within the matrix. For example, in my matrix, the first pair of letters +(<CODE>WH</CODE>) are at two corners of a two-by-three rectangle also containing +<CODE>Y</CODE>, <CODE>A</CODE>, <CODE>B</CODE>, and <CODE>IJ</CODE>. The enciphering of the pair +<CODE>WH</CODE> is the pair at the two other corners of this rectangle, namely <CODE> +YI</CODE>. (I could also have chosen <CODE>YJ</CODE>, in this case.) It's important to +be consistent about the order of the new pair: the one that comes first is +the one on the same <EM>row</EM> as the first of the original pair. In this +case, <CODE>Y</CODE> is on the same row as <CODE>W</CODE>. We can continue to translate +the remaining pairs of letters in the same way, ending up with + +<P><PRE>YI EA ES VK EZ +</PRE> + +<P>Notice that the letter <CODE>Y</CODE> turned into <CODE>E</CODE> in the second +pair of letters, but it turned into <CODE>K</CODE> in the fourth pair. + +<P>Part of the strategy for keeping a code secret is to hide even the <EM> +kind</EM> of code being used. Pairs of letters, to a cryptographer, are a +dead giveaway that a Playfair cipher was used, so it's traditional to insert +irrelevant spacing and punctuation in the actual written version of the +message, like this: + +<P><PRE>Yie ae, svkez. +</PRE> + +<P>Of course the recipient of the message, knowing how the message +was encoded, ignores this spacing and punctuation. + +<P>As an illustration of some of the special cases that complicate this scheme, +consider the message, "Come to the window." First we divide it up into +pairs: + +<P><PRE>CO ME TO TH EW IN DO W +</PRE> + +<P>The first problem is that the message has an odd number of +letters. To solve this problem we simply add an extra letter at the end, +generally <CODE>Q</CODE>. In this example, the final <CODE>W</CODE> becomes a pair <CODE> +WQ</CODE>. + +<P>If you look up the first pair of letters, <CODE>CO</CODE>, in my matrix, you'll +find that they do not determine a rectangle, because they are in the same +column. (Strictly speaking, they <EM>do</EM> determine a one-by-two +rectangle, but the two diagonals are the same, so that <CODE>CO</CODE> would be +encoded as <CODE>CO</CODE> if we followed the usual rule.) For two letters in the +same column, the rule is to replace each letter by the one below it, so <CODE> +CO</CODE> becomes <CODE>LC</CODE>. (If one of the letters is at the end of the column, +it is replaced by the top letter. So, for example, <CODE>OZ</CODE> would become +<CODE>CO</CODE>.) Similarly, for two letters in the same row, each is replaced by +the letter to its right. We can now translate the entire message: + +<P><PRE>LC NK ZK VF YO GQ CE BX +</PRE> + +<P>The pair <CODE>EW</CODE>, on a single row, has become <CODE>YO</CODE>; the final +pair <CODE>WQ</CODE>, on a single column, has become <CODE>BX</CODE>. + +<P>The final exceptional case is the one in which the same letter appears twice +in a pair. For example, the phrase "the big wheel" divides into + +<P><PRE>TH EB IG WH EE LQ +</PRE> + +<P>The pair <CODE>EE</CODE> is treated specially. It could be translated +into <CODE>YY</CODE> (treating it as two letters in the same row) or into <CODE>DD</CODE> +(if you think of it as two letters in the same column). Instead, though, +the rule is to break up the pair by inserting a <CODE>Q</CODE> between the two +letters. This changes all the pairings after that one in the message. The +new version is + +<P><PRE>TH EB IG WH EQ EL +</PRE> + +<P>This version can now be translated into + +<P><PRE>VF WD LH YJ WN OG +</PRE> + +<P>(Notice that I chose to translate <CODE>WH</CODE> into <CODE>YJ</CODE> instead of into +<CODE>YI</CODE>. You should use some of each when coding a message. A cipher with +no <CODE>J</CODE>s at all, or one with a simple pattern of <CODE>I</CODE> and <CODE>J</CODE> +alternating, is another giveaway that the Playfair cipher was used.) + +<P>What about the frequencies of letters in a Playfair-encoded message? You +can't simply say that the most common letters are likely to represent <CODE> +E</CODE> or <CODE>T</CODE> or <CODE>A</CODE>, because a letter doesn't represent a single letter +that way. But it is still possible to say that a common letter in the coded +version is likely to <EM>be on the same row</EM> as one of the frequent +letters in English. For example, here is a well-known text +in Playfair-coded form: + +<P><PRE>ZK DW KC SE XM ZK DW VF RV LQ VF WN ED MZ LW QE GY VF KD XF MP WC GO +BF MU GY QF UG ZK NZ IM GK FK GY ZS GQ LN DP AB BM CK OQ KL EZ KF DH +YK ZN LK FK EU YK FK KZ RY YD FT PC HD GQ MZ CP YD KL KF EZ CI ON DP +AC WK QS SY QL UN DU RU GY NS +</PRE> + +<P>The most commonly occurring letters in this coded text are <CODE>K</CODE> +(19 times), <CODE>F</CODE> (12 times), <CODE>D</CODE> and <CODE>Z</CODE> (tied at 11), and <CODE> +Y</CODE> (10 times). <CODE>K</CODE> is on the same row as both <CODE>E</CODE> and <CODE>O</CODE>, and +can also represent <CODE>T</CODE> in the same-column case. <CODE>Y</CODE> is also on the +same row. <CODE>F</CODE> can represent <CODE>I</CODE> (especially in the common pair <CODE> +IT</CODE>); <CODE>D</CODE> can represent <CODE>A</CODE>; <CODE>Z</CODE> can represent <CODE>T</CODE>. Of all +the letters that might represent <CODE>E</CODE>, why should <CODE>K</CODE> and <CODE>Y</CODE> be +the popular ones? The answer is that they have common letters in their +columns as well. In order for <CODE>W</CODE> to represent <CODE>E</CODE>, for example, +the other letter of the (cleartext) pair must be <CODE>B</CODE>, <CODE>I</CODE>, <CODE>J</CODE>, +<CODE>Q</CODE>, or <CODE>X</CODE>. Of these, only <CODE>I</CODE> is particularly common, and +<CODE>Q</CODE> and <CODE>X</CODE> are downright rare. + +<P>If you were trying to break a Playfair cipher, one approach you might take +would be to count the frequencies of <EM>pairs</EM> of letters. For example, +in the message above, the only pairs that occur more than twice are <CODE> +GY</CODE>, four times, and <CODE>FK</CODE>, <CODE>VF</CODE>, and <CODE>ZK</CODE>, three times each. +It's a good guess that each of these corresponds to a commonly occurring +pair of letters in English text. In fact, as it turns out, <CODE>GY</CODE> +corresponds to <CODE>HE</CODE>, which is not only a word by itself but also part of +<CODE>the</CODE>, <CODE>them</CODE>, <CODE>then</CODE>, and so on. <CODE>VF</CODE> corresponds to <CODE> +TH</CODE>, an extremely common pair; <CODE>ZK</CODE> corresponds to <CODE>TO</CODE>, which is +again a word in itself as well as a constituent of many other words. The +other pair that occurs three times in the text, <CODE>FK</CODE>, corresponds to +<CODE>RT</CODE>. This is not such a common English pair, although it does come up +in words like <CODE>worth</CODE>. But it turns out that in the particular sample +text I'm using, this pair of letters comes up mostly as parts of two words, +as in the combination <CODE>or to</CODE>. + +<P> + +<P>If you want to know more about how to break a Playfair cipher, you can see +an example in <EM>Have His Carcase</EM><EM>,</EM> a mystery novel by +Dorothy L. Sayers. In this project, I'm less ambitious: the +program merely enciphers a message, given the keyword and the cleartext as +inputs. The first input to <CODE>playfair</CODE> must be a word, the keyword. The +second input must be a list of words, the text. The keyword must meet the +criterion of no duplicated letters, and the cleartext input must contain +only words of letters, without punctuation. Here is an example: + +<P><PRE>? <U>print playfair "keyword [come to the window]</U> +lcnkzkvfyogqcebx +</PRE> + +<P><CODE>Playfair</CODE> is an operation whose output is a single word +containing the enciphered letters of the original text. + +<P><H2>Data Redundancy</H2> + + + +<P>In writing this program, the first question I thought about was how to +represent in a Logo program the matrix of letters used in the coding +process. The most natural structure is a two-dimensional array--that is, +an array with five members, each of which is an array of five +letters.<SUP>*</SUP> So if +the keyword is <CODE>keyword</CODE> then the program will, in effect, do this: + +<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>In the tic-tac-toe program, I used a one-dimensional array +to represent the board, even though a tic-tac-toe board is drawn in two +dimensions. I could have used an array of three arrays of three numbers +each, but that wouldn't really have fit with the way that program labels the +board. In tic-tac-toe, the nine squares are named 1 to 9. You ask to move +in square 8, for example, not in row 3, column 2. But in the Playfair +program, the row and column numbers are going to be very important.</SMALL></BLOCKQUOTE></SMALL><P><PRE>make "matrix {{k e y w o} {r d a b c} {f g h i l} + {m n p q s} {t u v x z}} +</PRE> + +<P>The position of a letter in the matrix is represented as a list of +two numbers, the row and the column. The Berkeley Logo procedure library +includes an operation <CODE>mditem</CODE> that takes such a list as an input, along +with a multi-dimensional array, and outputs the desired member: + +<P><PRE>to letter :rowcol +output mditem :rowcol :matrix +end +</PRE> + +<P>(The actual procedure listed at the end of this section includes a +slight complication to deal with the case of <CODE>I</CODE> and <CODE>J</CODE>, but that's +not important right now.) + +<P>The Playfair process goes like this: The program is given two letters. It +finds each letter in the matrix, determines each letter's row and column +numbers, then rearranges those numbers to make new row and column numbers, +then looks in the matrix again to find the corresponding letters. For +example, suppose we are given the keyword <CODE>keyword</CODE> and the letters <CODE> +T</CODE> and <CODE>A</CODE>. The first step is to translate <CODE>T</CODE> into the row and +column list <CODE>[5 1]</CODE>, and to translate <CODE>A</CODE> into <CODE>[2 3]</CODE>. Then +the program must combine the row of one letter with the column of the other, +giving the new lists <CODE>[5 3]</CODE> and <CODE>[2 1]</CODE>. Finally, the <CODE>letter</CODE> +procedure shown above will find the letters <CODE>V</CODE> and <CODE>R</CODE> in the +matrix. + +<P><CODE>Letter</CODE> handles the last step of the translation process, but what about +the first step? We need the inverse operation of <CODE>letter</CODE>, one that +takes a letter as input and provides its row and column. + +<P>It would be possible to write a <CODE>row.and.column</CODE> procedure that would +examine each letter in the matrix until it located the desired letter. But +that procedure would be both slow and complicated. Instead, I decided +to keep <EM>redundant</EM> information about the matrix in the form of 26 +variables, one for each letter, each of which contains the coordinates of +that letter. That is, the variables take the form + +<P><PRE>make "a [2 3] +make "w [1 4] +make "z [5 5] +</PRE> + +<P>and so on. (As in the case of the variable named <CODE>matrix</CODE> +above, these <CODE>make</CODE> instructions are just illustrative. The actual +program does not contain explicit data for this particular matrix, using +this particular keyword!) + +<P>The letter variables contain the same information as the variable <CODE> +matrix</CODE>. Strictly speaking, they are not needed. By creating the redundant +variables for the letters, I've made a <EM>space/time tradeoff;</EM> +the extra variables take up room in the computer's memory, but the program +runs faster. One of the recurring concerns of a professional programmer is +deciding which way to make such tradeoffs. It depends on the amounts of +space and time required and the amounts available. In this case, the extra +space required is really quite small, compared to the memory of a modern +computer, so the decision is clear-cut. For larger programming problems it +is sometimes harder to decide. + +<P> +<H2>Composition of Functions</H2> + + + +<P>Earlier I showed a <CODE>make</CODE> instruction to put a particular coding matrix +into the variable <CODE>matrix</CODE>. How does the program create a matrix for +any keyword given as input? Here are two of the relevant procedures: + +<P><PRE>to playfair :keyword :message +local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z] +<U>setkeyword jtoi lowercase :keyword</U> +output encode (reduce "word :message) +end + +to setkeyword :word +<U>make "matrix reorder word :word (remove :word "abcdefghiklmnopqrstuvwxyz)</U> +make "j :i +end +</PRE> + +<P> +<P>The keyword that is provided by the user as one of the inputs to the +toplevel procedure <CODE>playfair</CODE> goes through several stages as it is +transformed into a matrix. + +<P> +<CENTER><IMG SRC="compose.gif" ALT="figure: compose"></CENTER> + + +<P>This <EM>dataflow</EM> diagram is very similar to a plumbing +diagram from Chapter 2 turned on its side. The format is a little +different to put somewhat more emphasis on the inputs and outputs, so you +can follow the "flow" of information through the arrows. + +<P>In English, here's what the diagram tells us. The keyword given by the user +must be converted to lower case letters. (I could have chosen to use capital +letters instead; the goal is to have some uniform convention.) If the +keyword happens to contain a <CODE>J</CODE>, it will be represented within the +program as an <CODE>I</CODE> instead. Then, to make the matrix, we combine (with +<CODE>word</CODE>) two words: the keyword and the result of removing the keyword's +letters from the alphabet (leaving out <CODE>J</CODE>). Finally, that combined +word must be rearranged into a five-by-five square. + +<P> +The advantage of a view such as this one is that each of the small boxes +in the diagram has a relatively simple task. Indeed, <CODE>lowercase</CODE> and +<CODE>word</CODE> are primitive operations in Berkeley Logo. <CODE>Jtoi</CODE> is trivial: + +<P><PRE>to jtoi :word +output map [ifelse equalp ? "j ["i] [?]] :word +end +</PRE> + +<P><CODE>Remove</CODE> is a straightforward recursive operation that +outputs the result of removing one group of letters from another group +of letters. + +<P><PRE>to remove :letters :string +if emptyp :string [output "] +if memberp first :string :letters [output remove :letters bf :string] +output word first :string remove :letters bf :string +end +</PRE> + +<P>The job of <CODE>reorder</CODE> is somewhat messier. It must keep track +of what row and column it's up to, so <CODE>reorder</CODE> is just an +initialization procedure for the recursive helper <CODE>reorder1</CODE> that does +the real work. <CODE>Reorder</CODE> also creates the two-dimensional Logo array +to provide another input to its helper procedure. + +<P><PRE>to reorder :string +output reorder1 :string (mdarray [5 5]) 1 1 +end + +to reorder1 :string :array :row :column +if :row=6 [output :array] +if :column=6 [output reorder1 :string :array :row+1 1] +mdsetitem (list :row :column) :array first :string +make first :string (list :row :column) +output reorder1 (butfirst :string) :array :row :column+1 +end +</PRE> + +<P>If I were filling in a matrix by hand, instead of writing a computer +program, I'd use a very different approach. I'd handle one letter at a +time. First I'd go through the keyword a letter at a time, stuffing each +letter into the next available slot in the matrix. (If necessary, I'd +convert upper to lower case and <CODE>J</CODE> to <CODE>I</CODE> in the process.) +Then I'd go through the alphabet a letter at a time, saying "If this letter +isn't in the keyword, then stuff it into the matrix." + +<P> +Many people would find it natural to use that same technique in writing +a computer program, also: + +<P><PRE>to playfair :keyword :message ;; sequential version +local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z] +<U>make "matrix mdarray [5 5]</U> +<U>local [row column]</U> +<U>make "row 1</U> +<U>make "column 1</U> +<U>foreach :keyword [stuff jtoi lowercase ?]</U> +<U>foreach "abcdefghiklmnopqrstuvwxyz ~</U> + <U>[if not memberp ? jtoi :keyword [stuff ?]]</U> +make "j :i +output encode (reduce "word :message) +end + +to stuff :letter +mdsetitem (list :row :column) :matrix :letter +make :letter (list :row :column) +make "column :column+1 +if :column=6 [make "row :row+1 make "column 1] +end +</PRE> + +<P>In this version, the first <CODE>foreach</CODE> instruction handles the letters of +the keyword. The second <CODE>foreach</CODE> instruction handles the rest of the +alphabet. The <CODE>not memberp</CODE> test handles the removal of the keyword +letters from the alphabet. + +<P>My intent in writing this alternate version was to model my idea of how the +problem would be solved without a computer, processing one letter at a time. +So, for example, in the template + +<P><PRE>[stuff jtoi lowercase ?] +</PRE> + +<P>it's worth noting that the operations <CODE>jtoi</CODE> and <CODE> +lowercase</CODE> are being applied to single-letter inputs, even though those +operations were designed to accept words of any length as a unit. I +cheated, though, by applying <CODE>jtoi</CODE> to the entire keyword in the +second <CODE>foreach</CODE> instruction. I was trying to make the program more +readable; the honest version would be + +<P><PRE>foreach "abcdefghiklmnopqrstuvwxyz ~ + [if (ifelse equalp ? "i + [not (or (memberp "i :keyword) + (memberp "j :keyword))] + [not memberp ? :keyword]) + [stuff ?]] +</PRE> + +<P>Why am I subjecting you to this? My point is that what may seem to be the +most natural way to think about a problem--in this case, handling one +letter at a time--may not be the easiest, most elegant, or most efficient +programming solution. + +<P>What makes the dataflow-structured version of <CODE>playfair</CODE> possible is the +use of <EM>operations</EM> in Logo, and the <EM>composition</EM> of these +operations by using the output from one as the input to another. This is an +important technique, but one that doesn't seem to come naturally to everyone. +If you're not accustomed to writing operations, I think it really pays to +train yourself into that habit. + +<P><H2>Conversational Front End</H2> + + + +<P>It's inconvenient to type a long message into the computer in the form of an +input to a procedure. Another approach would be a <EM>conversational front +end.</EM> This is a procedure that reads the cleartext message using <CODE> +readlist</CODE>, perhaps accepting the message over several lines. It's not hard +to write: + +<P><PRE>to encode.big.message +local [keyword cleartext] +print [Welcome to the Playfair enciphering program.] +print [What keyword would you like to use?] +make "keyword first readlist +print [Now please enter your message, using as many lines as needed.] +print [When you're done, enter a line containing only a period (.).] +make "cleartext [] +read.big.message +print [Here is the enciphered version:] +print [] +print playfair :keyword :cleartext +end + +to read.big.message +local "line +make "line readlist +if equalp :line [.] [stop] +make "cleartext sentence :cleartext :line +read.big.message +end +</PRE> + +<P>Such a top-level procedure may be justified in a project like this, in which +a very large block of text may be used as a datum. But don't get carried +away. Programming languages that don't emphasize composition of functions +encourage this sort of programming style, to the point where the part of the +program that prompts the user and reads the data gets to be longer than the +part that does the actual computation. This preoccupation with verbose +conversation between the program and the user is sometimes justified by the +idea of "good human engineering," but I don't think that's necessarily +true. To take an extreme case, consider the standard elementary school Logo +procedure to draw a square: + +<P><PRE>to square :size +repeat 4 [forward :size right 90] +end +</PRE> + +<P>Compare that to this "human engineered" version: + +<P><PRE>to square +local "size +print [Brian's square program copyright 1985] +print [What size square would you like me to draw?] +make "size first readlist +repeat 4 [forward :size right 90] +print [Thank you, please come again.] +end +</PRE> + +<P>Not only is the first version (in my opinion) much more pleasant +to use, but it is also more powerful and flexible. The second version can +be used <EM>only</EM> as a top-level program, carrying on a conversation with +a human user. The first version can be run at top level, but it can also be +used as a subprocedure of a more complicated drawing program. If it's used +at top level, some person types in a number, the size, as the input to <CODE> +square</CODE> on the instruction line. If it's used inside another procedure, +that procedure can <EM>compute</EM> the input. + +<P><H2>Further Explorations</H2> + +<P>I haven't described the part of the program that actually transforms +the message: the procedure <CODE>encode</CODE> and its subprocedures. Read +the listing at the end of the chapter, then answer these questions: + +<P>»Why does <CODE>encode</CODE> need two base cases? + +<P>»What purpose is served by the four invocations of <CODE>thing</CODE> at the +beginning of procedure <CODE>paircode</CODE>? + +<P>Of course this program can be improved in many ways. + +<P>»One straightforward improvement to this program would be to +"bulletproof" it so that it doesn't die with a Logo error message if, for +example, the user provides a bad keyword. (Instead, the program should give +its own message, making it clear what the problem is. It's better for the +user to see + +<P><PRE>Keywords may not have any letter repeated. +</PRE> + +<P>than + +<P><PRE>t has no value in paircode +</PRE> + +<P>after making that mistake.) Also, what if the cleartext input +contains words with characters other than letters? The program should just +ignore those characters and process the letters in the words correctly. + +<P>»Another fairly straightforward improvement would be to take the one +long word output by <CODE>playfair</CODE> and turn it into a list of words with +spacing and punctuation thrown in at random. The goal is to have the result +look more or less like an actual paragraph of English text, except for the +scrambled letters. + +<P>Another direction would be to work on deciphering a Playfair-coded +message. There are two problems here: the easy one, in which you know what +the keyword is, and the hard one, in which you know only that a Playfair +cipher was used. + +<P>»The procedure <CODE>playfair</CODE> itself will almost work in +the first case. It would work perfectly were it not for the special cases +of letters in the same row and column. It's a simple modification to handle +those cases correctly. An interesting extension would be to try to restore +the original spacing by using a dictionary to guess where words end. + +<P>»The much harder problem is to try to guess the keyword. I mentioned +earlier some ideas about the approaches you'd have to take, such as +exploring the frequencies of use of pairs of letters. If you want more +advice, you'll have to study books on cryptography. + +<P> +<TABLE width="100%"><TR><TD><A HREF="../v1-toc2.html">(back to Table of Contents)</A> +<TD align="right"><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A> +</TABLE> + +<P><H2>Program Listing</H2> + +<PRE> +to playfair :keyword :message +local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z] +setkeyword jtoi lowercase :keyword +output encode (reduce "word :message) +end + +;; Prepare the code array + +to setkeyword :word +make "matrix reorder word :word (remove :word "abcdefghiklmnopqrstuvwxyz) +make "j :i +end + +to remove :letters :string +if emptyp :string [output "] +if memberp first :string :letters [output remove :letters bf :string] +output word first :string remove :letters bf :string +end + +to reorder :string +output reorder1 :string (mdarray [5 5]) 1 1 +end + +to reorder1 :string :array :row :column +if :row=6 [output :array] +if :column=6 [output reorder1 :string :array :row+1 1] +mdsetitem (list :row :column) :array first :string +make first :string (list :row :column) +output reorder1 (butfirst :string) :array :row :column+1 +end + +;; Encode the message + +to encode :message +if emptyp :message [output "] +if emptyp butfirst :message [output paircode first :message "q] +if equalp (jtoi first :message) (jtoi first butfirst :message) ~ + [output word (paircode first :message "q) (encode butfirst :message)] +output word (paircode first :message first butfirst :message) ~ + (encode butfirst butfirst :message) +end + +to paircode :one :two +local [row1 column1 row2 column2] +make "row1 first thing :one +make "column1 last thing :one +make "row2 first thing :two +make "column2 last thing :two +if :row1 = :row2 ~ + [output letters (list :row1 rotate (:column1+1)) ~ + (list :row1 rotate (:column2+1))] +if :column1 = :column2 ~ + [output letters (list rotate (:row1+1) :column1) ~ + (list rotate (:row2+1) :column1)] +output letters (list :row1 :column2) (list :row2 :column1) +end + +to rotate :index +output ifelse :index = 6 [1] [:index] +end + +to letters :one :two +output word letter :one letter :two +end + +to letter :rowcol +output itoj mditem :rowcol :matrix +end + +;; I and J conversion + +to jtoi :word +output map [ifelse equalp ? "j ["i] [?]] :word +end + +to itoj :letter +if :letter = "i [if (random 3) = 0 [output "j]] +output :letter +end +</PRE> + +<P><A HREF="../v1-toc2.html">(back to Table of Contents)</A> +<P><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A> +chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A> + +<P> +<ADDRESS> +<A HREF="../index.html">Brian Harvey</A>, +<CODE>bh@cs.berkeley.edu</CODE> +</ADDRESS> +</BODY> +</HTML> |