1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
|
<HTML>
<HEAD>
<TITLE>Computer Science Logo Style vol 1 ch 12: Example: Playfair Cipher</TITLE>
</HEAD>
<BODY>
<CITE>Computer Science Logo Style</CITE> volume 1:
<CITE>Symbolic Computing</CITE> 2/e Copyright (C) 1997 MIT
<H1>Example: Playfair Cipher</H1>
<TABLE width="100%"><TR><TD>
<IMG SRC="../csls1.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/v1ch12.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../v1-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-1">MIT
Press web page for Computer Science Logo Style</A>
</TABLE></TABLE>
<HR><P>Program file for this chapter: <A HREF="https://people.eecs.berkeley.edu/~bh/v1ch12/playfair.lg"><CODE>playfair</CODE></A>
<P>This project investigates a cipher that is somewhat more complicated than
the simple substitution cipher of Chapter 11. In the
Playfair cipher, there is not a single translation of each letter
of the alphabet; that is, you don't just decide that every B will be turned
into an F. Instead, <EM>pairs</EM> of letters are translated into other
pairs of letters.
<P>Here is how it works. To start, pick a <EM>keyword</EM> that does not
contain any letter more than once. For example, I'll pick the word
<CODE>keyword</CODE>. Now write the letters of that word in the first squares of a
five by five matrix:
<P>
<CENTER><TABLE rules="all" frame="box" border="3" noshade>
<TR><TD align="center" height=30 width=30>K
<TD align="center" height=30 width=30>E
<TD align="center" height=30 width=30>Y
<TD align="center" height=30 width=30>W
<TD align="center" height=30 width=30>O
<TR><TD align="center" height=30 width=30>R
<TD align="center" height=30 width=30>D
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TR><TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TR><TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TR><TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
<TD align="center" height=30 width=30>
</TABLE></CENTER>
<P>Then finish filling up the remaining squares of the matrix with the
remaining letters of the alphabet, in alphabetical order. Since there are
26 letters and only 25 squares, we assign I and J to the same square.
<P>
<CENTER><TABLE rules="all" frame="box" border="3" noshade>
<TR><TD align="center" height=30 width=30>K
<TD align="center" height=30 width=30>E
<TD align="center" height=30 width=30>Y
<TD align="center" height=30 width=30>W
<TD align="center" height=30 width=30>O
<TR><TD align="center" height=30 width=30>R
<TD align="center" height=30 width=30>D
<TD align="center" height=30 width=30>A
<TD align="center" height=30 width=30>B
<TD align="center" height=30 width=30>C
<TR><TD align="center" height=30 width=30>F
<TD align="center" height=30 width=30>G
<TD align="center" height=30 width=30>H
<TD align="center" height=30 width=30>I J
<TD align="center" height=30 width=30>L
<TR><TD align="center" height=30 width=30>M
<TD align="center" height=30 width=30>N
<TD align="center" height=30 width=30>P
<TD align="center" height=30 width=30>Q
<TD align="center" height=30 width=30>S
<TR><TD align="center" height=30 width=30>T
<TD align="center" height=30 width=30>U
<TD align="center" height=30 width=30>V
<TD align="center" height=30 width=30>X
<TD align="center" height=30 width=30>Z
</TABLE></CENTER>
<P>(Actually, when choosing the keyword, besides making sure that no
letter appears twice you must make sure that I and J do not both appear.
For example, <CODE>juice</CODE> wouldn't do as a keyword.)
<P>To encipher a message, divide it into pairs of letters. Pay no attention to
punctuation or to spaces between words. For example, the sentence "Why,
don't you?" becomes
<P><PRE>WH YD ON TY OU
</PRE>
<P>Now, find each pair of letters in the matrix you made earlier.
Most pairs of letters will form two corners of a smaller square or rectangle
within the matrix. For example, in my matrix, the first pair of letters
(<CODE>WH</CODE>) are at two corners of a two-by-three rectangle also containing
<CODE>Y</CODE>, <CODE>A</CODE>, <CODE>B</CODE>, and <CODE>IJ</CODE>. The enciphering of the pair
<CODE>WH</CODE> is the pair at the two other corners of this rectangle, namely <CODE>
YI</CODE>. (I could also have chosen <CODE>YJ</CODE>, in this case.) It's important to
be consistent about the order of the new pair: the one that comes first is
the one on the same <EM>row</EM> as the first of the original pair. In this
case, <CODE>Y</CODE> is on the same row as <CODE>W</CODE>. We can continue to translate
the remaining pairs of letters in the same way, ending up with
<P><PRE>YI EA ES VK EZ
</PRE>
<P>Notice that the letter <CODE>Y</CODE> turned into <CODE>E</CODE> in the second
pair of letters, but it turned into <CODE>K</CODE> in the fourth pair.
<P>Part of the strategy for keeping a code secret is to hide even the <EM>
kind</EM> of code being used. Pairs of letters, to a cryptographer, are a
dead giveaway that a Playfair cipher was used, so it's traditional to insert
irrelevant spacing and punctuation in the actual written version of the
message, like this:
<P><PRE>Yie ae, svkez.
</PRE>
<P>Of course the recipient of the message, knowing how the message
was encoded, ignores this spacing and punctuation.
<P>As an illustration of some of the special cases that complicate this scheme,
consider the message, "Come to the window." First we divide it up into
pairs:
<P><PRE>CO ME TO TH EW IN DO W
</PRE>
<P>The first problem is that the message has an odd number of
letters. To solve this problem we simply add an extra letter at the end,
generally <CODE>Q</CODE>. In this example, the final <CODE>W</CODE> becomes a pair <CODE>
WQ</CODE>.
<P>If you look up the first pair of letters, <CODE>CO</CODE>, in my matrix, you'll
find that they do not determine a rectangle, because they are in the same
column. (Strictly speaking, they <EM>do</EM> determine a one-by-two
rectangle, but the two diagonals are the same, so that <CODE>CO</CODE> would be
encoded as <CODE>CO</CODE> if we followed the usual rule.) For two letters in the
same column, the rule is to replace each letter by the one below it, so <CODE>
CO</CODE> becomes <CODE>LC</CODE>. (If one of the letters is at the end of the column,
it is replaced by the top letter. So, for example, <CODE>OZ</CODE> would become
<CODE>CO</CODE>.) Similarly, for two letters in the same row, each is replaced by
the letter to its right. We can now translate the entire message:
<P><PRE>LC NK ZK VF YO GQ CE BX
</PRE>
<P>The pair <CODE>EW</CODE>, on a single row, has become <CODE>YO</CODE>; the final
pair <CODE>WQ</CODE>, on a single column, has become <CODE>BX</CODE>.
<P>The final exceptional case is the one in which the same letter appears twice
in a pair. For example, the phrase "the big wheel" divides into
<P><PRE>TH EB IG WH EE LQ
</PRE>
<P>The pair <CODE>EE</CODE> is treated specially. It could be translated
into <CODE>YY</CODE> (treating it as two letters in the same row) or into <CODE>DD</CODE>
(if you think of it as two letters in the same column). Instead, though,
the rule is to break up the pair by inserting a <CODE>Q</CODE> between the two
letters. This changes all the pairings after that one in the message. The
new version is
<P><PRE>TH EB IG WH EQ EL
</PRE>
<P>This version can now be translated into
<P><PRE>VF WD LH YJ WN OG
</PRE>
<P>(Notice that I chose to translate <CODE>WH</CODE> into <CODE>YJ</CODE> instead of into
<CODE>YI</CODE>. You should use some of each when coding a message. A cipher with
no <CODE>J</CODE>s at all, or one with a simple pattern of <CODE>I</CODE> and <CODE>J</CODE>
alternating, is another giveaway that the Playfair cipher was used.)
<P>What about the frequencies of letters in a Playfair-encoded message? You
can't simply say that the most common letters are likely to represent <CODE>
E</CODE> or <CODE>T</CODE> or <CODE>A</CODE>, because a letter doesn't represent a single letter
that way. But it is still possible to say that a common letter in the coded
version is likely to <EM>be on the same row</EM> as one of the frequent
letters in English. For example, here is a well-known text
in Playfair-coded form:
<P><PRE>ZK DW KC SE XM ZK DW VF RV LQ VF WN ED MZ LW QE GY VF KD XF MP WC GO
BF MU GY QF UG ZK NZ IM GK FK GY ZS GQ LN DP AB BM CK OQ KL EZ KF DH
YK ZN LK FK EU YK FK KZ RY YD FT PC HD GQ MZ CP YD KL KF EZ CI ON DP
AC WK QS SY QL UN DU RU GY NS
</PRE>
<P>The most commonly occurring letters in this coded text are <CODE>K</CODE>
(19 times), <CODE>F</CODE> (12 times), <CODE>D</CODE> and <CODE>Z</CODE> (tied at 11), and <CODE>
Y</CODE> (10 times). <CODE>K</CODE> is on the same row as both <CODE>E</CODE> and <CODE>O</CODE>, and
can also represent <CODE>T</CODE> in the same-column case. <CODE>Y</CODE> is also on the
same row. <CODE>F</CODE> can represent <CODE>I</CODE> (especially in the common pair <CODE>
IT</CODE>); <CODE>D</CODE> can represent <CODE>A</CODE>; <CODE>Z</CODE> can represent <CODE>T</CODE>. Of all
the letters that might represent <CODE>E</CODE>, why should <CODE>K</CODE> and <CODE>Y</CODE> be
the popular ones? The answer is that they have common letters in their
columns as well. In order for <CODE>W</CODE> to represent <CODE>E</CODE>, for example,
the other letter of the (cleartext) pair must be <CODE>B</CODE>, <CODE>I</CODE>, <CODE>J</CODE>,
<CODE>Q</CODE>, or <CODE>X</CODE>. Of these, only <CODE>I</CODE> is particularly common, and
<CODE>Q</CODE> and <CODE>X</CODE> are downright rare.
<P>If you were trying to break a Playfair cipher, one approach you might take
would be to count the frequencies of <EM>pairs</EM> of letters. For example,
in the message above, the only pairs that occur more than twice are <CODE>
GY</CODE>, four times, and <CODE>FK</CODE>, <CODE>VF</CODE>, and <CODE>ZK</CODE>, three times each.
It's a good guess that each of these corresponds to a commonly occurring
pair of letters in English text. In fact, as it turns out, <CODE>GY</CODE>
corresponds to <CODE>HE</CODE>, which is not only a word by itself but also part of
<CODE>the</CODE>, <CODE>them</CODE>, <CODE>then</CODE>, and so on. <CODE>VF</CODE> corresponds to <CODE>
TH</CODE>, an extremely common pair; <CODE>ZK</CODE> corresponds to <CODE>TO</CODE>, which is
again a word in itself as well as a constituent of many other words. The
other pair that occurs three times in the text, <CODE>FK</CODE>, corresponds to
<CODE>RT</CODE>. This is not such a common English pair, although it does come up
in words like <CODE>worth</CODE>. But it turns out that in the particular sample
text I'm using, this pair of letters comes up mostly as parts of two words,
as in the combination <CODE>or to</CODE>.
<P>
<P>If you want to know more about how to break a Playfair cipher, you can see
an example in <EM>Have His Carcase</EM><EM>,</EM> a mystery novel by
Dorothy L. Sayers. In this project, I'm less ambitious: the
program merely enciphers a message, given the keyword and the cleartext as
inputs. The first input to <CODE>playfair</CODE> must be a word, the keyword. The
second input must be a list of words, the text. The keyword must meet the
criterion of no duplicated letters, and the cleartext input must contain
only words of letters, without punctuation. Here is an example:
<P><PRE>? <U>print playfair "keyword [come to the window]</U>
lcnkzkvfyogqcebx
</PRE>
<P><CODE>Playfair</CODE> is an operation whose output is a single word
containing the enciphered letters of the original text.
<P><H2>Data Redundancy</H2>
<P>In writing this program, the first question I thought about was how to
represent in a Logo program the matrix of letters used in the coding
process. The most natural structure is a two-dimensional array--that is,
an array with five members, each of which is an array of five
letters.<SUP>*</SUP> So if
the keyword is <CODE>keyword</CODE> then the program will, in effect, do this:
<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>In the tic-tac-toe program, I used a one-dimensional array
to represent the board, even though a tic-tac-toe board is drawn in two
dimensions. I could have used an array of three arrays of three numbers
each, but that wouldn't really have fit with the way that program labels the
board. In tic-tac-toe, the nine squares are named 1 to 9. You ask to move
in square 8, for example, not in row 3, column 2. But in the Playfair
program, the row and column numbers are going to be very important.</SMALL></BLOCKQUOTE></SMALL><P><PRE>make "matrix {{k e y w o} {r d a b c} {f g h i l}
{m n p q s} {t u v x z}}
</PRE>
<P>The position of a letter in the matrix is represented as a list of
two numbers, the row and the column. The Berkeley Logo procedure library
includes an operation <CODE>mditem</CODE> that takes such a list as an input, along
with a multi-dimensional array, and outputs the desired member:
<P><PRE>to letter :rowcol
output mditem :rowcol :matrix
end
</PRE>
<P>(The actual procedure listed at the end of this section includes a
slight complication to deal with the case of <CODE>I</CODE> and <CODE>J</CODE>, but that's
not important right now.)
<P>The Playfair process goes like this: The program is given two letters. It
finds each letter in the matrix, determines each letter's row and column
numbers, then rearranges those numbers to make new row and column numbers,
then looks in the matrix again to find the corresponding letters. For
example, suppose we are given the keyword <CODE>keyword</CODE> and the letters <CODE>
T</CODE> and <CODE>A</CODE>. The first step is to translate <CODE>T</CODE> into the row and
column list <CODE>[5 1]</CODE>, and to translate <CODE>A</CODE> into <CODE>[2 3]</CODE>. Then
the program must combine the row of one letter with the column of the other,
giving the new lists <CODE>[5 3]</CODE> and <CODE>[2 1]</CODE>. Finally, the <CODE>letter</CODE>
procedure shown above will find the letters <CODE>V</CODE> and <CODE>R</CODE> in the
matrix.
<P><CODE>Letter</CODE> handles the last step of the translation process, but what about
the first step? We need the inverse operation of <CODE>letter</CODE>, one that
takes a letter as input and provides its row and column.
<P>It would be possible to write a <CODE>row.and.column</CODE> procedure that would
examine each letter in the matrix until it located the desired letter. But
that procedure would be both slow and complicated. Instead, I decided
to keep <EM>redundant</EM> information about the matrix in the form of 26
variables, one for each letter, each of which contains the coordinates of
that letter. That is, the variables take the form
<P><PRE>make "a [2 3]
make "w [1 4]
make "z [5 5]
</PRE>
<P>and so on. (As in the case of the variable named <CODE>matrix</CODE>
above, these <CODE>make</CODE> instructions are just illustrative. The actual
program does not contain explicit data for this particular matrix, using
this particular keyword!)
<P>The letter variables contain the same information as the variable <CODE>
matrix</CODE>. Strictly speaking, they are not needed. By creating the redundant
variables for the letters, I've made a <EM>space/time tradeoff;</EM>
the extra variables take up room in the computer's memory, but the program
runs faster. One of the recurring concerns of a professional programmer is
deciding which way to make such tradeoffs. It depends on the amounts of
space and time required and the amounts available. In this case, the extra
space required is really quite small, compared to the memory of a modern
computer, so the decision is clear-cut. For larger programming problems it
is sometimes harder to decide.
<P>
<H2>Composition of Functions</H2>
<P>Earlier I showed a <CODE>make</CODE> instruction to put a particular coding matrix
into the variable <CODE>matrix</CODE>. How does the program create a matrix for
any keyword given as input? Here are two of the relevant procedures:
<P><PRE>to playfair :keyword :message
local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z]
<U>setkeyword jtoi lowercase :keyword</U>
output encode (reduce "word :message)
end
to setkeyword :word
<U>make "matrix reorder word :word (remove :word "abcdefghiklmnopqrstuvwxyz)</U>
make "j :i
end
</PRE>
<P>
<P>The keyword that is provided by the user as one of the inputs to the
toplevel procedure <CODE>playfair</CODE> goes through several stages as it is
transformed into a matrix.
<P>
<CENTER><IMG SRC="compose.gif" ALT="figure: compose"></CENTER>
<P>This <EM>dataflow</EM> diagram is very similar to a plumbing
diagram from Chapter 2 turned on its side. The format is a little
different to put somewhat more emphasis on the inputs and outputs, so you
can follow the "flow" of information through the arrows.
<P>In English, here's what the diagram tells us. The keyword given by the user
must be converted to lower case letters. (I could have chosen to use capital
letters instead; the goal is to have some uniform convention.) If the
keyword happens to contain a <CODE>J</CODE>, it will be represented within the
program as an <CODE>I</CODE> instead. Then, to make the matrix, we combine (with
<CODE>word</CODE>) two words: the keyword and the result of removing the keyword's
letters from the alphabet (leaving out <CODE>J</CODE>). Finally, that combined
word must be rearranged into a five-by-five square.
<P>
The advantage of a view such as this one is that each of the small boxes
in the diagram has a relatively simple task. Indeed, <CODE>lowercase</CODE> and
<CODE>word</CODE> are primitive operations in Berkeley Logo. <CODE>Jtoi</CODE> is trivial:
<P><PRE>to jtoi :word
output map [ifelse equalp ? "j ["i] [?]] :word
end
</PRE>
<P><CODE>Remove</CODE> is a straightforward recursive operation that
outputs the result of removing one group of letters from another group
of letters.
<P><PRE>to remove :letters :string
if emptyp :string [output "]
if memberp first :string :letters [output remove :letters bf :string]
output word first :string remove :letters bf :string
end
</PRE>
<P>The job of <CODE>reorder</CODE> is somewhat messier. It must keep track
of what row and column it's up to, so <CODE>reorder</CODE> is just an
initialization procedure for the recursive helper <CODE>reorder1</CODE> that does
the real work. <CODE>Reorder</CODE> also creates the two-dimensional Logo array
to provide another input to its helper procedure.
<P><PRE>to reorder :string
output reorder1 :string (mdarray [5 5]) 1 1
end
to reorder1 :string :array :row :column
if :row=6 [output :array]
if :column=6 [output reorder1 :string :array :row+1 1]
mdsetitem (list :row :column) :array first :string
make first :string (list :row :column)
output reorder1 (butfirst :string) :array :row :column+1
end
</PRE>
<P>If I were filling in a matrix by hand, instead of writing a computer
program, I'd use a very different approach. I'd handle one letter at a
time. First I'd go through the keyword a letter at a time, stuffing each
letter into the next available slot in the matrix. (If necessary, I'd
convert upper to lower case and <CODE>J</CODE> to <CODE>I</CODE> in the process.)
Then I'd go through the alphabet a letter at a time, saying "If this letter
isn't in the keyword, then stuff it into the matrix."
<P>
Many people would find it natural to use that same technique in writing
a computer program, also:
<P><PRE>to playfair :keyword :message ;; sequential version
local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z]
<U>make "matrix mdarray [5 5]</U>
<U>local [row column]</U>
<U>make "row 1</U>
<U>make "column 1</U>
<U>foreach :keyword [stuff jtoi lowercase ?]</U>
<U>foreach "abcdefghiklmnopqrstuvwxyz ~</U>
<U>[if not memberp ? jtoi :keyword [stuff ?]]</U>
make "j :i
output encode (reduce "word :message)
end
to stuff :letter
mdsetitem (list :row :column) :matrix :letter
make :letter (list :row :column)
make "column :column+1
if :column=6 [make "row :row+1 make "column 1]
end
</PRE>
<P>In this version, the first <CODE>foreach</CODE> instruction handles the letters of
the keyword. The second <CODE>foreach</CODE> instruction handles the rest of the
alphabet. The <CODE>not memberp</CODE> test handles the removal of the keyword
letters from the alphabet.
<P>My intent in writing this alternate version was to model my idea of how the
problem would be solved without a computer, processing one letter at a time.
So, for example, in the template
<P><PRE>[stuff jtoi lowercase ?]
</PRE>
<P>it's worth noting that the operations <CODE>jtoi</CODE> and <CODE>
lowercase</CODE> are being applied to single-letter inputs, even though those
operations were designed to accept words of any length as a unit. I
cheated, though, by applying <CODE>jtoi</CODE> to the entire keyword in the
second <CODE>foreach</CODE> instruction. I was trying to make the program more
readable; the honest version would be
<P><PRE>foreach "abcdefghiklmnopqrstuvwxyz ~
[if (ifelse equalp ? "i
[not (or (memberp "i :keyword)
(memberp "j :keyword))]
[not memberp ? :keyword])
[stuff ?]]
</PRE>
<P>Why am I subjecting you to this? My point is that what may seem to be the
most natural way to think about a problem--in this case, handling one
letter at a time--may not be the easiest, most elegant, or most efficient
programming solution.
<P>What makes the dataflow-structured version of <CODE>playfair</CODE> possible is the
use of <EM>operations</EM> in Logo, and the <EM>composition</EM> of these
operations by using the output from one as the input to another. This is an
important technique, but one that doesn't seem to come naturally to everyone.
If you're not accustomed to writing operations, I think it really pays to
train yourself into that habit.
<P><H2>Conversational Front End</H2>
<P>It's inconvenient to type a long message into the computer in the form of an
input to a procedure. Another approach would be a <EM>conversational front
end.</EM> This is a procedure that reads the cleartext message using <CODE>
readlist</CODE>, perhaps accepting the message over several lines. It's not hard
to write:
<P><PRE>to encode.big.message
local [keyword cleartext]
print [Welcome to the Playfair enciphering program.]
print [What keyword would you like to use?]
make "keyword first readlist
print [Now please enter your message, using as many lines as needed.]
print [When you're done, enter a line containing only a period (.).]
make "cleartext []
read.big.message
print [Here is the enciphered version:]
print []
print playfair :keyword :cleartext
end
to read.big.message
local "line
make "line readlist
if equalp :line [.] [stop]
make "cleartext sentence :cleartext :line
read.big.message
end
</PRE>
<P>Such a top-level procedure may be justified in a project like this, in which
a very large block of text may be used as a datum. But don't get carried
away. Programming languages that don't emphasize composition of functions
encourage this sort of programming style, to the point where the part of the
program that prompts the user and reads the data gets to be longer than the
part that does the actual computation. This preoccupation with verbose
conversation between the program and the user is sometimes justified by the
idea of "good human engineering," but I don't think that's necessarily
true. To take an extreme case, consider the standard elementary school Logo
procedure to draw a square:
<P><PRE>to square :size
repeat 4 [forward :size right 90]
end
</PRE>
<P>Compare that to this "human engineered" version:
<P><PRE>to square
local "size
print [Brian's square program copyright 1985]
print [What size square would you like me to draw?]
make "size first readlist
repeat 4 [forward :size right 90]
print [Thank you, please come again.]
end
</PRE>
<P>Not only is the first version (in my opinion) much more pleasant
to use, but it is also more powerful and flexible. The second version can
be used <EM>only</EM> as a top-level program, carrying on a conversation with
a human user. The first version can be run at top level, but it can also be
used as a subprocedure of a more complicated drawing program. If it's used
at top level, some person types in a number, the size, as the input to <CODE>
square</CODE> on the instruction line. If it's used inside another procedure,
that procedure can <EM>compute</EM> the input.
<P><H2>Further Explorations</H2>
<P>I haven't described the part of the program that actually transforms
the message: the procedure <CODE>encode</CODE> and its subprocedures. Read
the listing at the end of the chapter, then answer these questions:
<P>»Why does <CODE>encode</CODE> need two base cases?
<P>»What purpose is served by the four invocations of <CODE>thing</CODE> at the
beginning of procedure <CODE>paircode</CODE>?
<P>Of course this program can be improved in many ways.
<P>»One straightforward improvement to this program would be to
"bulletproof" it so that it doesn't die with a Logo error message if, for
example, the user provides a bad keyword. (Instead, the program should give
its own message, making it clear what the problem is. It's better for the
user to see
<P><PRE>Keywords may not have any letter repeated.
</PRE>
<P>than
<P><PRE>t has no value in paircode
</PRE>
<P>after making that mistake.) Also, what if the cleartext input
contains words with characters other than letters? The program should just
ignore those characters and process the letters in the words correctly.
<P>»Another fairly straightforward improvement would be to take the one
long word output by <CODE>playfair</CODE> and turn it into a list of words with
spacing and punctuation thrown in at random. The goal is to have the result
look more or less like an actual paragraph of English text, except for the
scrambled letters.
<P>Another direction would be to work on deciphering a Playfair-coded
message. There are two problems here: the easy one, in which you know what
the keyword is, and the hard one, in which you know only that a Playfair
cipher was used.
<P>»The procedure <CODE>playfair</CODE> itself will almost work in
the first case. It would work perfectly were it not for the special cases
of letters in the same row and column. It's a simple modification to handle
those cases correctly. An interesting extension would be to try to restore
the original spacing by using a dictionary to guess where words end.
<P>»The much harder problem is to try to guess the keyword. I mentioned
earlier some ideas about the approaches you'd have to take, such as
exploring the frequencies of use of pairs of letters. If you want more
advice, you'll have to study books on cryptography.
<P>
<TABLE width="100%"><TR><TD><A HREF="../v1-toc2.html">(back to Table of Contents)</A>
<TD align="right"><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A>
</TABLE>
<P><H2>Program Listing</H2>
<PRE>
to playfair :keyword :message
local [matrix a b c d e f g h i j k l m n o p q r s t u v w x y z]
setkeyword jtoi lowercase :keyword
output encode (reduce "word :message)
end
;; Prepare the code array
to setkeyword :word
make "matrix reorder word :word (remove :word "abcdefghiklmnopqrstuvwxyz)
make "j :i
end
to remove :letters :string
if emptyp :string [output "]
if memberp first :string :letters [output remove :letters bf :string]
output word first :string remove :letters bf :string
end
to reorder :string
output reorder1 :string (mdarray [5 5]) 1 1
end
to reorder1 :string :array :row :column
if :row=6 [output :array]
if :column=6 [output reorder1 :string :array :row+1 1]
mdsetitem (list :row :column) :array first :string
make first :string (list :row :column)
output reorder1 (butfirst :string) :array :row :column+1
end
;; Encode the message
to encode :message
if emptyp :message [output "]
if emptyp butfirst :message [output paircode first :message "q]
if equalp (jtoi first :message) (jtoi first butfirst :message) ~
[output word (paircode first :message "q) (encode butfirst :message)]
output word (paircode first :message first butfirst :message) ~
(encode butfirst butfirst :message)
end
to paircode :one :two
local [row1 column1 row2 column2]
make "row1 first thing :one
make "column1 last thing :one
make "row2 first thing :two
make "column2 last thing :two
if :row1 = :row2 ~
[output letters (list :row1 rotate (:column1+1)) ~
(list :row1 rotate (:column2+1))]
if :column1 = :column2 ~
[output letters (list rotate (:row1+1) :column1) ~
(list rotate (:row2+1) :column1)]
output letters (list :row1 :column2) (list :row2 :column1)
end
to rotate :index
output ifelse :index = 6 [1] [:index]
end
to letters :one :two
output word letter :one letter :two
end
to letter :rowcol
output itoj mditem :rowcol :matrix
end
;; I and J conversion
to jtoi :word
output map [ifelse equalp ? "j ["i] [?]] :word
end
to itoj :letter
if :letter = "i [if (random 3) = 0 [output "j]]
output :letter
end
</PRE>
<P><A HREF="../v1-toc2.html">(back to Table of Contents)</A>
<P><A HREF="../v1ch11/v1ch11.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../v1ch13/v1ch13.html"><STRONG>NEXT</STRONG></A>
<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>,
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>
|