about summary refs log tree commit diff stats
path: root/509bezier.mu
diff options
context:
space:
mode:
authorKartik K. Agaram <vc@akkartik.com>2021-07-11 09:00:23 -0700
committerKartik K. Agaram <vc@akkartik.com>2021-07-11 09:16:46 -0700
commit5234ba60f5e0ddd8387b3a3748395dab74f9c888 (patch)
tree3770a39fd388eca52442ece30882fe4dbfa11bf6 /509bezier.mu
parent598fe88ebb6e231ead40039b2db4f388fb789f91 (diff)
downloadmu-5234ba60f5e0ddd8387b3a3748395dab74f9c888.tar.gz
.
Diffstat (limited to '509bezier.mu')
0 files changed, 0 insertions, 0 deletions
'n87' href='#n87'>87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415
<P><HTML>
<HEAD>
<TITLE>Computer Science Logo Style vol 2 ch 11: Example: Cryptographer's Helper</TITLE>
</HEAD>
<BODY>
<CITE>Computer Science Logo Style</CITE> volume 2:
<CITE>Advanced Techniques</CITE> 2/e Copyright (C) 1997 MIT
<H1>Example: Cryptographer's Helper</H1>

<TABLE width="100%"><TR><TD>
<IMG SRC="../csls2.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/v2ch11.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../v2-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="../v2ch10/v2ch10.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v2ch12/v2ch12.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-2">MIT
Press web page for <CITE>Computer Science Logo Style</CITE></A>
</TABLE></TABLE>

<HR><P>Program file for this chapter: <A HREF="crypto.lg"><CODE>crypto</CODE></A>


<P>

<P>A <EM>cryptogram</EM> is a kind of word puzzle, like a crossword puzzle.
Instead of definitions, though, a cryptogram gives you the actual
words of a quotation, but with each letter replaced with a different
letter.  For example, each letter A in the original text might be
replaced with an F.  Here is a sample cryptogram:

<P>
LB RA, BT YBL LB RA: LJGL CQ LJA FUAQLCBY: KJALJAT 'LCQ YBRXAT CY
LJA DCYP LB QUSSAT LJA QXCYWQ GYP GTTBKQ BS BULTGWABUQ SBTLUYA, BT
LB LGHA GTDQ GWGCYQL G QAG BS LTBURXAQ, GYP RM BIIBQCYW AYP LJAD?

<P>

<P>The punctuation marks and the spaces between words are the
same in this cryptogram as they are in the original (&quot;clear&quot;) text.


<P>


A cryptogram is a kind of secret code.  The formal name for this particular
kind of code is a <EM>simple substitution cipher.</EM>  Strictly speaking,
a <EM>code</EM> is a method of disguising a message that uses a dictionary
of arbitrarily chosen replacements for each possible word.  A foreign
language is like a code.  A <EM>cipher</EM> is a method in which a uniform
algorithm or formula is used to translate each word.  A <EM>substitution</EM>
cipher is one in which every letter (or sometimes every pair of letters,
or some such grouping) is replaced by a disguised equivalent.  A <EM>
simple</EM> substitution cipher is one in which each letter has a single
equivalent replacement, which is used throughout the message.  (A
more complicated substitution cipher might be something like this:
the first letter A in the message is replaced with F, the second A
is replaced with G, the third with H, and so on.)

<P>Years ago, Arthur Conan Doyle and Edgar Allen Poe were able to write
mystery stories in which simple substitution ciphers were used by
characters who really wanted to keep a message secret.  Today, partly
because of those stories, too many people know how to &quot;break&quot; such
ciphers for them to be of practical use.  Instead, these ciphers are
used as word puzzles.

<P>The technique used for decoding a cryptogram depends on the fact that
some letters are more common than others.  The letter A is much more
common in English words than the letter Z.  If, in a cryptogram, the
letter F occurs many times, it's more likely to represent a letter
like A in the original text than a letter like Z.

<P>The most commonly used letter in English is E, by a wide margin. 
T is in second place, with A and O nearly tied for third.  I, N, and
R are also very commonly used.  These rankings apply to <EM>large</EM>
texts.  In the usual short cryptogram, the most frequent letter doesn't
necessarily represent E.  But the letter that represents E will probably
be among the two or three most frequent.

<P>Before reading further, you might want to try to solve the cryptogram
shown above.  Make a chart of the number of times each letter appears,
then use that information to make guesses about which letter is which.
As you're working on it, make a note of what other kinds of information
are helpful to you.

<P>This project is a program to help you solve cryptograms.  The program
doesn't solve the puzzle all by itself; it doesn't know enough about
English vocabulary.  But it does some of the more boring parts of
the job automatically, and can make good guesses about some of the
letters.

<P>The top-level procedure is <CODE>crypto</CODE> .  It takes one input, a list
whose members are the words of the cryptogram.  Since these lists are long
and easy to make mistakes in, you'll probably find it easier to type the
cryptogram into the Logo editor rather than directly at a question mark
prompt.  You might make the list be the value of a variable, then use that
variable as the input to <CODE>crypto</CODE>.  (The program file for this
project includes four such variables, named <CODE>cgram1</CODE> through <CODE>
cgram4</CODE>, with sample cryptograms.)

<P><CODE>Crypto</CODE> begins by going through the coded text, letter by letter. 
It keeps count of how often each letter is used.  You can keep track
of this counting process because the program draws a <EM>histogram</EM>
on the screen as it goes.  A histogram is a chart like the one
at the top of the next page.

<HR>

<P>
<PRE>           L
 B         L
AB         L
AB         L
AB         L
AB         L
AB         L    Q
AB         L    Q       Y
AB    G    L    Q  T    Y
AB    G    L    Q  T    Y
AB    G    L    Q  T    Y
ABC   G    L    Q  T    Y
ABC   G  J L    Q  T    Y
ABC   G  J L    Q  TU   Y
ABC   G  J L    QRSTU   Y
ABC   G  J L   PQRSTU W Y
ABCD  G  J L   PQRSTU WXY
ABCD  G IJKL   PQRSTU WXY
ABCD FGHIJKLM  PQRSTU WXY
</PRE>

<P>  Histogram

<HR>

<P><PRE><U>A-17-E</U> <U>B-18- </U> C-08-  D-03-  E
F-01-  G-11-A H-01-  I-02-  J-07-H
K-02-  <U>L-19-T</U> M-01-  N      O
P-04-  Q-13-  R-05-  S-05-  T-11-
U-06-  V      W-04-  X-03-  Y-12-
Z
      <U>A</U>BCD<U>E</U>FG<U>H</U>IJKLMNOPQRS<U>T</U>UVWXYZ

<U>LB RA, BT YBL LB RA: LJGT CQ LJA</U>
T   E,      T T   E: THAT    THE
<U>FUAQLCBY: KJALJAT 'LCQ YBRXAT CY LJA</U>
  E T   :  HETHE  'T       E     THE
<U>DCYP LB QUSSAT LJA QXCYWQ GYP GTTBKQ</U>
     T      E  THE        A   A
<U>BS BULTGWABUQ SBTLUYA, BT LB LGHA</U>
     T A E       T  E,    T  TA E
<U>GTDQ GWGCYQL G QAG BS LTBURXAQ, GYP</U>
A    A A   T A  EA    T     E , A
<U>RM BIIBQCYW AYP LJAD?</U>
            E   THE ?
</PRE>

<P>  Screen display

<HR>

<P>A histogram is a kind of graph, but it's different from
the <EM>continuous</EM> graphs you use in algebra.  Histograms are used
to show quantities of <EM>discrete</EM> things, like letters of the alphabet.

<P>The main reason the program draws the histogram is that it needs to
know the frequencies of occurrence of the letters for later use. 
When I first wrote the program, it counted the letters without printing
anything on the screen.  Since this counting is a fairly slow process,
it got boring waiting for the program to finish.  The histogram display
is a sort of video thumb-twiddling to keep you occupied while the
program is creating an invisible histogram inside itself.

<P>By the way, since there are only 24 lines on the screen, the top part
of the histogram may be invisible if the cryptogram is long enough
to use some letters more than 24 times.

<P>The shape of this histogram is pretty typical.  A few letters are
used many times, while most letters are clumped down near the bottom.
In this case, A, B, and L stand out.  You might guess that they represent
the most commonly used letters: E, T, and either A or O.  But you
need more information to be able to guess which is which.

<P>


After it finishes counting letters, the program presents a screen
display like the one shown above.
The information provided in this display comes in three
parts.  At the top is an alphabetical list of the letters in the cryptogram.
For each letter, the program displays the number of times that letter
occurs in the enciphered text.  For example, the letter P occurs four
times.  The letter that occurs most frequently is highlighted by
showing it in reverse video characters, represented here with
underlined characters.  In
this example, the most frequently used letter is L, with 19 occurrences.
Letters with occurrence counts within two of the maximum are also
highlighted.  In the example, A with 17 and B with 18 are highlighted.
If a letter does not occur in the cryptogram at all, no count is given.
In the example, there is no E in the enciphered text.

<P>The top part of the display shows one more piece of information: if
either the program or the person using it has made a guess as to the
letter that a letter represents, that guess is shown after the frequency
count.  For example, here the program has guessed that the letter
L in the cryptogram represents the letter T in the clear text.
(You can't tell from the display that this guess was made by the program
rather than by the person using it.  I just happen to know that that's
what happened in this example!)

<P>The next section of the display is a single line showing all the letters
of the alphabet.  In this line, a letter is highlighted if a guess
has been made linking some letter in the cryptogram with that letter
in the clear text.  In other words, this line shows the linkages in
the reverse direction from what is shown in the top section of the
display.  For example, I just mentioned that L in the cryptogram corresponds
to T in the clear text.  In the top part of the display, we can find
L in alphabetical order, and see that it has a T linked to it.  But
in the middle part of the display, we find <EM>T</EM>, not L, in alphabetical
order, and discover that <EM>something</EM> is linked to it.  (It turns
out that we don't usually have to know which letter corresponds to
T.)

<P>Here is the purpose of that middle section of the display:  Suppose
I am looking at the second word of the cryptogram, RA.  We've already
guessed that A represents E, so this word represents something-E.
Suppose I guess that this word is actually HE.  This just happens
to be the first two-letter word I think of that ends in E.  So I'd
like to try letting R represent H.  Now I look in the middle section
of the display, and I see that H is already highlighted.  So some
other letter, not R, already represents H.  I have to try a different
guess.

<P>The most important part of the display is the bottom section.  Here,
lines of cryptogram alternate with their translation into clear text,
based on the guesses we've made so far.  The cryptogram lines are
highlighted, just to make it easy to tell which lines are which. 
The program ensures that each word entirely fits on a single line;
there is no wrapping to a new line within a single word.

<P>There is room on the screen for eight pairs of lines.  If the cryptogram
is too big to fit in this space, only a portion of it will be visible
at any time.  In a few paragraphs I'll talk about moving to another
section of the text.

<P>The program itself is very limited in its ability to guess letters.
For the most part, you have to do the guessing yourself when you use
it.  There are three guessing rules in the program:

<P>
<P><P>

<OL><LI> The most frequently occurring single-letter word is taken to represent
A.
<LI> Another single-letter word, if there is one, is taken to represent
I.
<LI> The most frequently occurring three-letter word is taken to represent
THE, but only if its last letter is one of the ones highlighted in
the top part of the display.

<P></OL>
<P>

<P>In the example, the only single-letter word in the cryptogram
is G, in the next-to-last line.  The program, following rule 1, has
guessed that G represents A.  Rule 2 did not apply, because there
is no second single-letter word.  The most frequently used three-letter
word is LJA, which occurs three times.  The last letter of that word,
A, is highlighted in the top section because it occurs 17 times. 
Therefore, the program guesses that L represents T, J represents H,
and A represents E.

<P>Of course you understand that these rules are not infallible; they're
just guesses.  (A fancy name for a rule that works most of the time
is a <EM>heuristic.</EM>  A rule that works all the time is called an
<EM>algorithm.</EM>)  For example, the three-letter word GYP appears
twice in the cryptogram, only once less often than LJA.  Maybe GYP
is really THE.  However, the appearance of the word THAT in the translation
of the first line is a pretty persuasive confirmation that the program's
rules have worked out correctly in this case.

<P>If you didn't solve the cryptogram on your own, at my first invitation,
you might want to take another look at it, based on the partial solution
you now have available.  Are these four letters (A, E, I, and T) enough
to let you guess the rest?  It's a quotation you'll probably recognize.

<P>Once this display is on the screen, you can make further guesses by
typing to the program.  For example, suppose you decide that the last
word of the cryptogram, LJAD, represents THEM.  Then you want to guess
that D represents M.  To do that, type the letters D and M in that
order.  Don't use the RETURN key.  Your typing will not be echoed
on the screen.  Instead, three things will happen.  First, the entry
in the top section of the display that originally said

<P><PRE>D-03-
</PRE>

<P>will be changed to say

<P><PRE>D-03-M
</PRE>

<P>Second, the letter M will be highlighted in the alphabet
in the second section of the display.  Finally, the program will type
an M underneath every D in the cryptogram text.

<P>If you change your mind about a guess, you can just enter a new guess
about the same cryptogram letter.  For example, if you decide that
LJAD is really THEY instead of THEM, you could just type D and Y.
Alternatively, if you decide a guess was wrong but you don't have
a new guess, type the cryptogram letter (D in this example) and then
the space bar.

<P>If you guess that D represents M, and then later you guess that R also
represents M, the program will complain at you by beeping or by flashing
the screen, depending on what your computer can do.  If you meant
that R should represent M <EM>instead of</EM> D representing M, you must
first undo the latter guess by typing D, space bar, R, and M.

<P>The process of redisplaying the clear text translation of the cryptogram
after each guess takes a fairly long time, since the program has to
look up each letter individually.  Therefore, the program is written
so that you don't have to wait for this redisplay to finish before
guessing another letter representation.  As soon as you type any key
on the keyboard, the program stops retyping the clear text.  Whatever
key you typed is taken as the first letter of a two-letter guess command.

<P>If the cryptogram is too long to fit on the screen, there are three
other things you can type to change which part of the text is
visible.  Typing a plus sign (+) eliminates the first four lines of
the displayed text (that is, four lines of cryptogram and four corresponding
lines of cleartext) and brings in four new lines at the end.  Typing
a minus sign (-) moves backwards, eliminating the four lines nearest
the bottom of the screen and bringing back four earlier lines at the
top.  These <EM>windowing</EM> commands have no effect if you are already
seeing the end of the text (for +) or the beginning of the text (for
-).

<P>The third command provided for long cryptograms is the atsign
(@) character.  This is most useful after you've figured out all of
the letter correspondences.  It clears the screen and displays only
the clear text, without the letter frequencies, table of correspondences,
or the enciphered text.  This display allows 23 lines of clear text
to fit on the screen instead of only eight.  If you don't have the
solution exactly right, you can type any character to return to the
three-part display and continue guessing.

<P>The program never stops; even after you have made guesses
for all the letters, you might find an error and change your mind
about a guess.  When you're done, you stop the program with control-C
or command-period or whatever your computer requires.

<P>In the complete listing at the end of this chapter, there are a few
cryptograms for you to practice with.  They are excerpted from one
of my favorite books, <EM>Compulsory Miseducation</EM> by
Paul Goodman.

<P><H2>Program Structure</H2>

<P>There are about 50 procedures in this program.  These procedures can be
roughly divided into several purposes:

<P>
<UL>
<LI>initialization
<LI>frequency counting and displaying the histogram
<LI>guessing letters automatically
<LI>reading user commands
<LI>keeping track of guesses
<LI>top section of display (frequencies)
<LI>middle section of display (alphabet)
<LI>bottom section of display (cryptogram text and cleartext)
<LI>windowing and full-text displays
<LI>data abstraction and other helper procedures
</UL>

<P>The diagram on the next page shows superprocedure/subprocedure relationships within
the main categories.  (Helper procedures aren't shown, to make the
diagram more readable.)  The bottom half of the diagram has the
procedures that are concerned primarily with presenting information
on the screen.  <CODE>Redisplay</CODE>, near the center of the diagram, is
called whenever the entire screen display must be redrawn: when the
initialization part of the program is finished, and whenever the user
chooses a new portion of the text to display.  When the display
changes slightly, because a new guess is made, procedures such as
<CODE>fixtop</CODE>, <CODE>light</CODE>, and <CODE>dark</CODE> are used instead of redrawing
everything.

<P>

<CENTER><IMG SRC="cryptoflow.gif" ALT="figure: cryptoflow"></CENTER>


<P><CODE>Bind</CODE> is the most important procedure, because it records and displays
each new guess.  As the diagram shows, it invokes several subprocedures to
update the display; more importantly, it changes the values of several
variables to keep track of the new guess.  There is also a similar procedure
<CODE>qbind</CODE> that's used when a guess is made by the program rather than by
the user.  (The &quot;Q&quot; stands for either &quot;quick&quot; or &quot;quiet,&quot; since this
version never has to undo an old guess, omits some error checking, and can't
beep because there are no errors in automatic guesses.)  If you ignore
initialization and displaying information, the entire structure of the
program is that <CODE>crypto</CODE> calls <CODE>parseloop</CODE>, which repeatedly calls
<CODE>parsekey</CODE>, which calls <CODE>bind</CODE> to record a guess.

<P>Unfortunately, it's not so easy in practice to divide up the
procedures into groups, with a single purpose for each group.  Several
procedures carry out two tasks at once.  For example, <CODE>light</CODE> and
<CODE>dark</CODE> have those names because they switch individual letters
between normal and inverse video in the alphabet display in the middle
part of the screen.  But those procedures also set variables to remember
that a particular cleartext letter has or hasn't been guessed, so they
are also carrying out part of <CODE>bind</CODE>'s job, keeping track of guesses.

<P><H2>Guided Tour of Global Variables</H2>

<P><CODE>Crypto</CODE> uses many global variables to hold the information it needs.
This includes information about individual letters, about
words, and about the text as a whole.

<P>There are several sets of 26 variables, one for each letter of the
alphabet.  For these variables, the last letter of the variable name
is the letter about which the variable holds information.  In the
table that follows, the italic <EM>x</EM> in each name represents
any letter.

<P><TABLE><TR><TH align="right" valign="top"><EM>x</EM><TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">Cleartext letter that is guessed to match <EM>x</EM>
in the cryptogram.
<TR><TH align="right" valign="top"><CODE>bound</CODE><EM>x</EM><TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top"><CODE>True</CODE> if <EM>x</EM> appears in the
<EM>cleartext</EM> as guessed so far; <CODE>false</CODE> otherwise.
<TR><TH align="right" valign="top"><CODE>cnt</CODE><EM>x</EM><TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Count of how many times <EM>x</EM> appears in the cryptogram.
<TR><TH align="right" valign="top"><CODE>posn</CODE><EM>x</EM><TD>&nbsp;&nbsp;&nbsp;&nbsp;<TD valign="top">Screen cursor position where the frequency count
and guess for <EM>x</EM> is shown in the top part of the display.
</TABLE>

<P>These variables are set up initially by <CODE>initvars</CODE>, except
for the <CODE>posn</CODE> variables, which are set by <CODE>showrow</CODE>.  The
variables with single-letter names start out with a space character as their
value.  This choice allows <CODE>showclear</CODE> to use <CODE>thing :letter</CODE>
as the thing to type for every letter in the cryptogram.
If no guess has been made for a letter, it will be displayed as a
blank space in the partially-decoded version of the text.

<P>Here are the variables that have to do with <EM>words</EM> in the cryptogram
text.  These variables are needed for the part of the program that
automatically makes guesses, by looking for words that might represent
A, I, and THE in the cleartext.  In the following variable names,
<EM>y</EM> represents either a one-letter word or a three-letter word
in the cryptogram text.

<P><TABLE><TR><TH align="right" valign="top"><CODE>count.single</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The number of occurrences of the most frequent
one-letter word.
<TR><TH align="right" valign="top"><CODE>count.triple</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The number of occurrences of the most frequent
three-letter word.
<TR><TH align="right" valign="top"><CODE>list.single</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">List of one-letter words in the cryptogram text.
<TR><TH align="right" valign="top"><CODE>list.triple</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">List of three-letter words in the cryptogram text.
<TR><TH align="right" valign="top"><CODE>max.single</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The most frequent one-letter word in the cryptogram text.
<TR><TH align="right" valign="top"><CODE>max.triple</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The most frequent three-letter word in the cryptogram text.
<TR><TH align="right" valign="top"><CODE>single</CODE><EM>y</EM>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The number of occurrences of the one-letter word <EM>y.</EM>
<TR><TH align="right" valign="top"><CODE>triple</CODE><EM>y</EM>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The number of occurrences of the three-letter word <EM>y.</EM>
</TABLE>

<P>These variables are used only during
the initial histogram counting, to keep track of which one-letter
word and which three-letter word are the most frequent in each category.
Once the most frequently occurring words have been determined, the
actual count is no longer important.

<P>Finally, there are some variables that contain information about
the text as a whole:

<P><TABLE><TR><TH align="right" valign="top"><CODE>fulltext</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The complete cryptogram text.
<TR><TH align="right" valign="top"><CODE>text</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The part of the cryptogram that is displayed
on the screen right now.
<TR><TH align="right" valign="top"><CODE>moretext</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The part of the text that should be displayed
after a <CODE>+</CODE> command.
<TR><TH align="right" valign="top"><CODE>textstack</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">A list of old values of <CODE>text</CODE>, to be restored
if the <CODE>-</CODE> command is used.
<TR><TH align="right" valign="top"><CODE>maxcount</CODE>
<TD>&nbsp;&nbsp;&nbsp;&nbsp;
<TD valign="top">The number of occurrences of the most frequently used
letter.
</TABLE>

<P><CODE>:Maxcount</CODE> is used to know which letters should be highlighted
in the top section of the display.  <CODE>:Text</CODE> is used by <CODE>showcode</CODE> and
<CODE>showclear</CODE> to maintain the bottom section of the display.  <CODE>
Fulltext</CODE>, <CODE>moretext</CODE>, and <CODE>textstack</CODE> are part of the windowing
feature.  At first, <CODE>text</CODE> is equal to <CODE>fulltext</CODE>, and <CODE>
textstack</CODE> is empty.  <CODE>Moretext</CODE> contains the portion of the text
starting on the fifth line that is displayed, providing there is some text
at the end of the cryptogram that didn't fit on the screen.  If the end of
the text is visible, then <CODE>moretext</CODE> is empty.  Here is what happens if
you type the plus sign:

<P><PRE>to moretext
if emptyp :moretext [beep stop]
push &quot;textstack :text
make &quot;text :moretext
redisplay &quot;true
end
</PRE>

<P>If <CODE>:moretext</CODE> is empty, there is no more text to display,
and the procedure stops with a complaint.  Otherwise, we want
to remember what is now in <CODE>:text</CODE> in case of a later <CODE>-</CODE> command, and
we want to change the value of <CODE>text</CODE> to the version starting four lines
later that is already in <CODE>:moretext</CODE>.

<P>In the solitaire project, I used a lot of <CODE>local</CODE> instructions in the
top-level procedures to avoid creating global variables.  In this project,
I didn't bother.  There's no good reason why I was lazier here than there;
you can decide for yourself whether you think it's worth the effort.

<P><H2>What's In a Name?</H2>

<P>In revising this program for the second edition, I was struck by the ways
in which bad choices of procedure or variable names had made it needlessly
hard to read.  Changing names was one of the three main ways in which I
changed the program.  (The other two were an increased use of data
abstraction and the introduction of iteration tools to eliminate some
helper procedures.)

<P>I'll start with a simple example.  As I've mentioned, when I first wrote
the program it didn't draw the histogram on the screen during the initial
counting of letter frequencies.  Since the top part of the screen display
is primarily a presentation of those frequencies, I thought of that top
part as the program's &quot;histogram&quot; even though it doesn't have the form
of a real histogram.  That's why, in the first edition, the procedures
that maintain the top part of the display were called <CODE>showhist</CODE>,
<CODE>fixhist</CODE>, and so on; when I added the <CODE>histogram</CODE> and <CODE>histlet</CODE>
procedures that draw the real histogram, it was hard to keep track of
which &quot;<CODE>hist</CODE>&quot; names were part of the initial histogram and which
were part of the letter frequency chart at the top of the program's normal
screen display.  I've now changed <CODE>showhist</CODE> to <CODE>showtop</CODE>,
<CODE>fixhist</CODE> to <CODE>fixtop</CODE>, and so on.  The procedures with <CODE>hist</CODE>
in their names are about the real histogram, and the ones with <CODE>top</CODE>
in their names are about the frequency chart.

<P>Here's another example.  In several parts of the program, I had to
determine whether a character found in the cryptogram text is a letter
or a punctuation mark.  The most straightforward way to do this would
be an explicit check against all the letters in the alphabet:

<P><PRE>to letterp :char
output memberp :char &quot;ABCDEFGHIJKLMNOPQRSTUVWXYZ
end
</PRE>

<P>But comparing the character against each of the 26 letters
would be quite slow.  Instead, I took advantage of the fact that there
happen to be variables in the program named after each letter.  That is,
there's a variable <CODE>A</CODE>, a variable <CODE>B</CODE>, and so on, but there
aren't variables named after punctuation characters.  Therefore, I could use
the Logo primitive <CODE>namep</CODE> to see whether or not the character I'm
considering is a variable name, and if so, it must be a letter.  The
first edition version of <CODE>crypto</CODE> is full of instructions
of the form

<P><PRE>if namep :char ...
</PRE>

<P>This is clever and efficient, but not at all self-documenting.
Someone reading the program would have no way to tell that I'm using
<CODE>namep</CODE> to find out whether a character is a letter.  The solution
was to add an instruction to the initialization in <CODE>crypto</CODE>:

<P><PRE>copydef &quot;letterp &quot;namep
</PRE>

<P>The <CODE>copydef</CODE> primitive is used to give a new name to
an existing procedure.  (The old name continues to work.)  The existing
procedure can be either primitive or user-defined.  The new name is not
saved by the <CODE>save</CODE> command; that's why <CODE>crypto</CODE> performs the
<CODE>copydef</CODE> instruction each time.

<P>Probably the worst example of bad naming was in the <CODE>tally</CODE> procedure.
This procedure has a complicated job; it must keep track of the most
common one-letter and three-letter words, in preparation for the program's
attempts to make automatic guesses for A, I, and THE.  Here is the version
in the first edition:

<P><PRE>to tally :type :word
local &quot;this
make &quot;this word :type :word
if not memberp :word list. :type ~
   [setlist. :type fput :word list. :type   make :this 0]
make :this sum 1 thing :this
make &quot;this thing :this
if :this &gt; (count. :type) ~
   [setcount. :type :this  make (word &quot;max. :type) :word]
end
</PRE>

<P>The input named <CODE>type</CODE> is either the word <CODE>single</CODE> or
the word <CODE>triple</CODE>.  One thing that makes this procedure hard to read
is the local variable named <CODE>this</CODE>.  What a vague name!  This what?
Is it this word, or this letter, or this word length, or this guess?  To
make things worse, partway through the procedure I recycled the same
name to hold a different value.  At first, <CODE>:this</CODE> is a word that
will be used as the name of a variable, counting the number of times
a given word appears.  For example, if the word YBL appears in the
cryptogram, then <CODE>tally</CODE> will create a variable named <CODE>tripleybl</CODE>
whose value will be the number of times that YBL occurs in the text.
The value of <CODE>this</CODE> will be the word <CODE>tripleybl</CODE>, so the
expression <CODE>thing :this</CODE> represents the actual number.  Then,
near the end of the procedure, I used the instruction

<P><PRE>make &quot;this thing :this
</PRE>

<P>From then on, <CODE>:this</CODE> is the number itself, not the
variable name!  It's really hard to read a procedure in which the
same name is used to mean different things in different instructions.

<P>Here's the new version:

<P><PRE>to tally :type :word
localmake &quot;countvar word :type :word
if not memberp :word list. :type ~
   [setlist. :type fput :word list. :type   make :countvar 0]
localmake &quot;count (thing :countvar)+1
make :countvar :count
if :count &gt; (count. :type) ~
   [setcount. :type :count   setmax. :type :word]
end
</PRE>

<P>The name <CODE>this</CODE> is gone.  Instead, I've first created
a local variable named <CODE>countvar</CODE> whose value is the name of the
count variable.  Then I create another local variable named <CODE>count</CODE>
that contains the actual count.  These names are much more descriptive
of the purposes of the two variables.

<P>Another change in the new version is a more consistent use of
data abstraction.  The original version used the constructor
<CODE>setlist.</CODE> and the selector <CODE>list.</CODE> to refer to the
list of all known cryptogram words of the appropriate length (the
variable <CODE>list.single</CODE> or <CODE>list.triple</CODE>), but
used the instruction

<P><PRE>make (word &quot;max. :type) :word
</PRE>

<P>to construct the variable containing the most frequently
appearing word of that length.  The new version uses a constructor
named <CODE>setmax.</CODE> that's analogous to the <CODE>setlist.</CODE> constructor.

<P>Rethinking the names of procedures can reorganize your ideas about how
to group the procedures into categories.  For example, in the first
edition I was upset about the fact that <CODE>historgram</CODE>, whose job
is to count letter frequencies and draw the histogram of those counts,
also invokes prepare.guess, whose job is to count <EM>word</EM> frequencies
in preparation for automatic guessing.

<P><CENTER><IMG SRC="histflow.gif" ALT="figure: histflow"></CENTER>

<P>The reason for this mixture of tasks is efficiency.  To prepare
the histogram, the program must extract the letters (omitting punctuation)
from each word of the text, and count them.  To prepare for guessing
words, the program must extract the letters from each word, and count
the occurrences of the letters-only words.  I could have done these
things separately:

<P><PRE>to histogram :text
foreach :text [foreach (filter &quot;letterp ?) &quot;histlet]
end

to count.words :text
foreach :text [prepare.guess (filter &quot;letterp ?)]
end
</PRE>

<P>But it seemed better to scan the words of the text just
once, and extract the letters from each word just once:

<P><PRE>to histogram :text
foreach :text [localmake &quot;word filter &quot;letterp ?
               foreach :word &quot;histlet
               prepare.guess :word]
end
</PRE>

<P>But the punch line of this story is that I could avoid the
confusing jump between boxes--the feeling of mixing two tasks--merely
by changing the name of the <CODE>histogram</CODE> procedure to something
neutral like <CODE>preprocess</CODE>.  Then the structure would be

<P><CENTER><IMG SRC="preprocess.gif" ALT="figure: preprocess"></CENTER>

<P>Now we have one initialization procedure that includes invocations
for two separate kinds of preprocessing.  It's not
really the program structure that is inappropriate, but only using the
name <CODE>histogram</CODE> for a procedure whose job includes more than the
creation of the histogram.

<P>

<P><H2>Flag Variables</H2>


<P>Procedure <CODE>redisplay</CODE> has the job of redrawing the entire screen when
there is a major change to what should be shown, like moving to a different
window in the cryptogram text.

<P><PRE>to redisplay :flag
cleartext
showtop
alphabet
showcode :text
if :flag [showclear :text]
end
</PRE>

<P>The input to <CODE>redisplay</CODE> is a <EM>flag variable.</EM>  It must
have the value <CODE>true</CODE> or <CODE>false</CODE>.  (The name comes from the flags on
mailboxes, which are either up or down to indicate whether or not there is
mail in the box.)  It's there because <CODE>redisplay</CODE>
has two slightly different
jobs to do at two different points in the program.  First, <CODE>redisplay</CODE>
is invoked by <CODE>crypto</CODE>, the top-level procedure, to draw the screen
initially.  At this time, no letters have been guessed yet.  Therefore, it
is not necessary to invoke <CODE>showclear</CODE> (which indicates
the guessed letters in the bottom part of the display).
<CODE>Crypto</CODE> executes the instruction

<P><PRE>redisplay &quot;false
</PRE>

<P>to avoid that unnecessary work.  <CODE>Redisplay</CODE> is also invoked
by <CODE>moretext</CODE>, <CODE>lesstext</CODE>, and <CODE>showclear</CODE>.  Each of these
procedures uses the instruction

<P><PRE>redisplay &quot;true
</PRE>

<P>to include <CODE>showcode</CODE>.  If the flag variable
weren't used, there would have to be two different versions
of <CODE>redisplay</CODE>.

<P>I used the latter technique in the procedures <CODE>bind</CODE> and <CODE>
qbind</CODE>.  These could also have been one procedure with a flag variable
input.  The advantage of the technique used in <CODE>redisplay</CODE> is that it
makes the program easier to read by reducing the number of procedures, and
keeping similar purposes together.  The advantage of using two procedures is
that it's a little faster, because you don't have to test the flag variable
with <CODE>if</CODE>.

<P>A flag variable is somewhat analogous to a <EM>predicate,</EM> a
procedure that always outputs <CODE>true</CODE> or <CODE>false</CODE>.  The
advantage of using these particular values for flag variables is that
they're easy to test; you can say

<P><PRE>if :flag [do.something]
</PRE>

<P>whereas, if you used some other pair of values like <CODE>yes</CODE> and
<CODE>no</CODE>, you'd have to say

<P><PRE>if equalp :flag &quot;yes [do.something]
</PRE>

<P>Some people like to give flag variables names ending with <CODE>p</CODE>,
as in the convention for predicates.  (The special variable <CODE>redefp</CODE>
that controls redefinition of primitives in some versions of Logo,
including Berkeley Logo, is an
example.)  I'm somewhat uncomfortable with that practice because to me it
raises a confusion about whether a particular word is the name of a variable
or the name of a procedure.  I'd rather put <CODE>flag</CODE> in the names of flag
variables.

<P>The 26 <CODE>bound</CODE><EM>x</EM> variables in this program are also flag
variables; each is <CODE>true</CODE> if the corresponding letter has been
guessed as the cleartext half of a binding.  They don't have &quot;flag&quot;
in their names, but their names aren't used directly in most of the
program anyway.  Instead they are hidden behind data abstraction procedures.
<CODE>Setbound</CODE> and <CODE>setunbound</CODE> are used to set any such variable
<CODE>true</CODE> or <CODE>false</CODE>, respectively; the selector <CODE>boundp</CODE> alerts
you by the P in its name that it's a predicate.

<P><H2>Iteration Over Letters</H2>

<P>One of the ways in which I simplified the program for this edition was
to replace some recursive helper procedures with invocations of <CODE>
foreach</CODE>.  At several points in the program, some action must be taken
for each letter in a word, or for each word in the text.

<P>Another kind of iteration problem that was not so easily solved by
the standard higher order procedures in Berkeley Logo was one in which
some action must be taken, not for each letter in a word, but for
each letter in the alphabet, or for some subset of the alphabet, as
in the case of <CODE>showrow</CODE>, which displays one row of the top part
of the screen, with information about five consecutive letters.
Of course these problems could be solved with instructions like

<P><PRE>foreach &quot;ABCDEFGHIJKLMNOPQRSTUVWXYZ [...]
</PRE>

<P>but that seemed unaesthetic to me.  I wanted to be able to
specify the starting and ending letters, as in this example:

<P><PRE>to alphabet
setcursor [6 6]
forletters &quot;A &quot;Z [ifelse boundp ? [invtype ?] [type ?]]
end
</PRE>

<P>(The job of <CODE>alphabet</CODE> is to generate the middle
part of the screen display, which is all of the letters of the
alphabet, in order, with each letter in inverse video if that
letter has been guessed as part of the cleartext.)

<P>The difficulty in implementing <CODE>forletters</CODE> is to get from one
letter to the next.  How does a program know that the letter after
A is B?  Here is my solution:

<P><PRE>to forletters :from :to :action
for [lettercode [ascii :from] [ascii :to]] ~
    [apply :action (list char :lettercode)]
end
</PRE>

<P>The operation <CODE>ascii</CODE>
takes a letter (or other character)
as input.  Its output is the number that represents that letter in
the computer's memory.  Most computers use the same numbers to represent
characters; this standard representation is called ASCII, for
American Standard Code for Information Interchange.  (It's
pronounced &quot;ask E.&quot;)  By using <CODE>ascii</CODE> to translate the starting
and ending letters into numeric codes, I've transformed the problem
into one that can be solved using the standard <CODE>for</CODE> tool that
allows an action to be carried out for each number in a given range.

<P>But in the template input to <CODE>forletters</CODE>, I want the question mark
to represent a letter, not its numeric code.
<CODE>Char</CODE> is the inverse operation to <CODE>ascii</CODE>.  Given
a number that is part
of the ASCII sequence, <CODE>char</CODE> outputs the character that that number
represents.  For example:

<P><PRE>?<U>print ascii &quot;A</U>
65
?<U>print char 65</U>
A
</PRE>

<P><CODE>Forletters</CODE> applies the template input to the character
corresponding to the number in the <CODE>lettercode</CODE> variable controlled by
the <CODE>for</CODE>.

<P>Adding 1 to an ASCII code to get the code for the next letter depends
on the fact that the numbers representing the letters are in sequence.
Fortunately, this is true of ASCII.  A is 65, B is 66, C is 67, and
so on.  Not all computer representations for characters have this
property.  The code that was used in the days of punched cards had
the slash (/) character in between R and S!

<P>By the way, the lower case letters have different ASCII codes from the
capitals.  In this program I've used the primitive operation
<CODE>uppercase</CODE> to translate every character that the program reads
into upper case, just to be sure that each letter has only one
representation.

<P><H2>Computed Variable Names</H2>





<P>

<P>Another programming technique that is heavily used in this project
is the use of <CODE>word</CODE> to compute variable names dynamically.  Ordinarily,
you assign a value to a variable named <CODE>var</CODE> with an instruction like

<P><PRE>make &quot;var 87
</PRE>

<P>and you look at the value of the variable with the expression

<P><PRE>:var
</PRE>

<P>But in this project, there are variables for each letter,
with names like <CODE>posna</CODE>, <CODE>posnb</CODE>, <CODE>posnc</CODE>,
and so on.  To assign a value to
these variables, the program doesn't use 26 separate instructions
like

<P><PRE>make &quot;posna [0 0]
</PRE>

<P>(Each of these variables contains a list of screen coordinates for
use with <CODE>setcursor</CODE> to find the corresponding letter in the top part of
the display.)  Instead, the procedure <CODE>showrow</CODE>, which draws that
section of the display, contains the instruction

<P><PRE>forletters :from :to [setposn ? cursor   onetop ?]
</PRE>

<P><CODE>Setposn</CODE> is a data abstraction procedure:

<P><PRE>to setposn :letter :thing
make (word &quot;posn :letter) :thing
end
</PRE>

<P>When the variable <CODE>letter</CODE> contains the letter <CODE>a</CODE>,
the <CODE>make</CODE> instruction
has the same effect as if it were

<P><PRE>make &quot;posna :thing
</PRE>

<P>Similarly, the dots notation (<CODE>:posna</CODE>) isn't used to examine the values
of these variables.  Instead, <CODE>thing</CODE> is invoked explicitly:

<P><PRE>to posn :letter
output thing (word &quot;posn :letter)
end
</PRE>

<P>Another point to consider is that I could have used a different approach
altogether, instead of using <CODE>word</CODE> to piece together a variable name.
For instance, I could have used property lists:

<P><PRE>to setposn :letter :thing
pprop &quot;posn :letter :thing
end

to posn :letter
output gprop &quot;posn :letter
end
</PRE>

<P>As it happens, I first wrote this project in Atari 800 Logo, which
didn't have property list primitives.  So the question didn't arise for
me.  In a version of Logo that does support property lists, I see no
<EM>stylistic</EM> reason to prefer one approach over the other.  It's
entirely a question of which is more efficient.  Which is faster, searching
through a list of 26 times 2 members (times 2 because each property has a
name and a value) or concatenating strings with <CODE>word</CODE> to generate the
name of a variable that can then be examined quickly?  I'd have to
experiment to find out.  Alternatively, instead of using <CODE>posn</CODE> as the
name of a property list and the letters as names of properties, I could
reverse the two roles.  That would give me more lists, but shorter lists.

<P>What <EM>is</EM> a stylistic issue is that using procedures like <CODE>posn</CODE>
and <CODE>setposn</CODE> to isolate the storage mechanism from the rest of the
program makes the latter easier to read.

<P>

<P><H2>Further Explorations</H2>

<P>I have three suggestions about how to extend this project.  The first
is to put in more rules by which the program can make guesses automatically.
For example, a three-letter word that isn't THE might be AND.  Sequences
of letters within a word can also be tallied; TH is a common two-letter
sequence, for example.  A double letter in the cryptogram is more
likely to represent OO than HH.

<P>If you have many rules in the program, there will be situations in
which two rules lead to contradictory guesses.  One solution is just
to try the most reliable rule first, and ignore a new guess if it
conflicts with an old one.  (<CODE>Qbind</CODE> applies this strategy by means
of the instruction

<P><PRE>if letterp thing :from [stop]
</PRE>

<P>which avoids adding a guess to the data base if the cryptogram
letter is already bound to a cleartext letter.)

<P>Another solution would be to let the rules &quot;vote&quot; about guesses.
If the program had many rules, it might happen that three rules suggest
that F represents E, while two rules suggest that W represents E.
In this case, three rules outvote two rules, and the program would
guess that F represents E.

<P>The second direction for exploration in this program is to try to
make it more efficient.  For example, every time you make a guess,
<CODE>showclear</CODE> is invoked to redisplay the partially decoded text.  Much
of this redisplay is unnecessary, since most of the guesses haven't
changed.  How can you avoid the necessity to examine every letter
of the cryptogram text?  One possibility would be to keep a list,
for every letter in the text, of the screen positions in which that
letter appears.  Then when a new guess is made, the program could
just type the corresponding cleartext letter at exactly those positions.
The cost of this technique would be a lot of storage space for the
lists of positions, plus a slower version of <CODE>showcode</CODE>, which would
have to create these position lists.

<P>The third direction for further exploration is to find out about more
complicated ciphers.  For example, suppose you started with a simple
substitution cipher, but every time the letter A appeared in the cleartext
you shifted the corresponding cryptogram letters by one.  That is,
if E is initially represented by R, the first time an A appears you'd
start using S to represent E.  The second time A appears you'd switch
to T representing E.  And so on.  The effect of this technique would
be that a particular cleartext letter is no longer represented by
a single cryptogram letter all the way through.  Therefore, you can't
just count the frequencies of the cryptogram letters and assume that
frequently-used letters represent E and T.  How could you possibly
decipher such a message?

<P>
<TABLE width="100%"><TR><TD><A HREF="../v2-toc2.html">(back to Table of Contents)</A>
<TD align="right"><A HREF="../v2ch10/v2ch10.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v2ch12/v2ch12.html"><STRONG>NEXT</STRONG></A>
</TABLE>

<P><H2>Program Listing</H2>

<P><P>
<P><PRE>
to crypto :text
make "text map "uppercase :text
make "fulltext :text
make "moretext []
make "textstack []
if not procedurep "letterp [copydef "letterp "namep]
forletters "A "Z "initvars
make "maxcount 0
initcount "single
initcount "triple
cleartext
histogram :text
redisplay "false
if or guess.single guess.triple [showclear :text]
parseloop
end

;; Initialization

to initcount :type
setlist. :type []
setcount. :type 0
end

to initvars :letter
setcnt :letter 0
make :letter "| |
setunbound :letter
end

;; Histogram

to histogram :text
foreach :text [localmake "word filter "letterp ?
               foreach :word "histlet
               prepare.guess :word]
end

to histlet :letter
localmake "cnt 1+cnt :letter
setcursor list (index :letter) (nonneg 24-:cnt)
type :letter
setcnt :letter :cnt
if :maxcount < :cnt [make "maxcount :cnt]
end

;; Guessing letters

to prepare.guess :word
if equalp count :word 1 [tally "single :word]
if equalp count :word 3 [tally "triple :word]
end

to tally :type :word
localmake "countvar word :type :word
if not memberp :word list. :type ~
   [setlist. :type fput :word list. :type   make :countvar 0]
localmake "count (thing :countvar)+1
make :countvar :count
if :count > (count. :type) ~
   [setcount. :type :count   setmax. :type :word]
end

to guess.single
if emptyp (list. "single) [output "false]
if emptyp butfirst (list. "single) ~
   [qbind first (list. "single) "A  output "true]
qbind (max. "single) "A
qbind (ifelse equalp first (list. "single) (max. "single)
              [last (list. "single)]
              [first (list. "single)]) ~
      "I
output "true
end

to guess.triple
if emptyp (list. "triple) [output "false]
if :maxcount < (3+cnt last (max. "triple))	 ~
   [qbind first (max. "triple) "T
    qbind first butfirst (max. "triple) "H
    qbind last (max. "triple) "E
    output "true]
output "false
end

;; Keyboard commands

to parseloop
forever [parsekey uppercase readchar]
end

to parsekey :char
if :char = "@ [fullclear stop]
if :char = "+ [moretext stop]
if :char = "- [lesstext stop]
if not letterp :char [beep stop]
bind :char uppercase readchar
end

;; Keeping track of guesses

to bind :from :to
if not equalp :to "| | [if not letterp :to [beep stop]
                        if boundp :to [beep stop]]
if letterp thing :from [dark thing :from]
make :from :to
fixtop :from
if letterp :to [light :to]
showclear :text
end

to qbind :from :to
if letterp thing :from [stop]
make :from :to
fixtop :from
light :to
end

;; Maintaining the display

to redisplay :flag
cleartext
showtop
alphabet
showcode :text
if :flag [showclear :text]
end

;; Top section of display (letter counts and guesses)

to showtop
setcursor [0 0]
showrow "A "E
showrow "F "J
showrow "K "O
showrow "P "T
showrow "U "Y
showrow "Z "Z
end

to showrow :from :to
forletters :from :to [setposn ? cursor   onetop ?]
print []
end

to onetop :letter
localmake "count cnt :letter
if :count = 0 [type word :letter "|      | stop]
localmake "text (word :letter "- twocol :count "- thing :letter)
ifelse :maxcount < :count+3 [invtype :text] [type :text]
type "| |
end

to twocol :number
if :number > 9 [output :number]
output word 0 :number
end

to fixtop :letter
setcursor posn :letter
onetop :letter
end

;; Middle section of display (guessed cleartext letters)

to alphabet
setcursor [6 6]
forletters "A "Z [ifelse boundp ? [invtype ?] [type ?]]
end

to light :letter
setcursor list 6+(index :letter) 6
invtype :letter
setbound :letter
end

to dark :letter
setcursor list 6+(index :letter) 6
type :letter
setunbound :letter
end

;; Bottom section of display (coded text)

to showcode :text
make "moretext []
showcode1 8 0 :text
end

to showcode1 :row :col :text
if emptyp :text [make "moretext [] stop]
if :row > 22 [stop]
if and equalp :row 16 equalp :col 0 [make "moretext :text]
if (:col+count first :text) > 37 [showcode1 :row+2 0 :text stop]
codeword :row :col first :text
showcode1 :row (:col+1+count first :text) butfirst :text
end

to codeword :row :col :word
setcursor list :col :row
invtype :word
end

;; Bottom section of display (cleartext)

to showclear :text
showclear1 8 0 :text 2
end

to showclear1 :row :col :text :delta
if emptyp :text [stop]
if :row > 23 [stop]
if keyp [stop]
if (:col+count first :text) > 37 ~
   [showclear1 :row+:delta 0 :text :delta stop]
clearword :row :col first :text
showclear1 :row (:col+1+count first :text) butfirst :text :delta
end

to clearword :row :col :word
setcursor list :col :row+1
foreach :word [ifelse letterp ? [type thing ?] [type ?]]
end

;; Windowing commands

to fullclear
cleartext
showclear1 0 0 :fulltext 1
print []
invtype [type any char to redisplay]
ignore readchar
redisplay "true
end

to moretext
if emptyp :moretext [beep stop]
push "textstack :text
make "text :moretext
redisplay "true
end

to lesstext
if emptyp :textstack [beep stop]
make "text pop "textstack
redisplay "true
end

;; Iteration tool for letters

to forletters :from :to :action
for [lettercode [ascii :from] [ascii :to]] ~
    [apply :action (list char :lettercode)]
end

;; Data abstraction (constructors and selectors)

to setbound :letter
make word "bound :letter "true
end

to setunbound :letter
make word "bound :letter "false
end

to boundp :letter
output thing word "bound :letter
end

to setcnt :letter :thing
make (word "cnt :letter) :thing
end

to cnt :letter
output thing (word "cnt :letter)
end

to setposn :letter :thing
make (word "posn :letter) :thing
end

to posn :letter
output thing (word "posn :letter)
end

to setcount. :word :thing
make (word "count. :word) :thing
end

to count. :word
output thing (word "count. :word)
end

to setlist. :word :thing
make (word "list. :word) :thing
end

to list. :word
output thing (word "list. :word)
end

to setmax. :word :thing
make (word "max. :word) :thing
end

to max. :word
output thing (word "max. :word)
end

;; Miscellaneous helpers

to index :letter
output (ascii :letter)-(ascii "A)
end

to beep
tone 440 15
end

to invtype :text
type standout :text
end

to nonneg :number
output ifelse :number < 0 [0] [:number]
end

;; Sample cryptograms

make "cgram1 [Dzynufqyjulli, jpqhq ok yr hoxpj qnzeujory qceqwj xhrtoyx
   zw oyjr u trhjptpolq trhln. oynqqn, rzh qceqkkogq eryeqhy tojp
   whrvlqfk rd qnzeujory uj whqkqyj kofwli fquyk jpuj jpq |xhrty-zwk| nr
   yrj pugq kzep u trhln. u nqeqyj qnzeujory uofk uj, whqwuhqk drh, u
   frhq trhjptpolq dzjzhq, tojp u noddqhqyj erffzyoji kwohoj, noddqhqyj
   reezwujoryk, uyn frhq hqul zjoloji jpuy ujjuoyoyx kjujzk uyn kuluhi.]

make "cgram2 [Lvo vfkp lfzj md opaxflimn iz lm gitokflo fnp zlkonblvon f
   hmalv'z inilifliuo, fnp fl lvo zfyo liyo lm zoo lm il lvfl vo jnmwz
   wvfl iz noxozzfkh lm xmco wilv lvo mnbminb fxliuilioz fnp xaglako md
   zmxiolh, zm lvfl viz inilifliuo xfn to kogoufnl. il iz ftzakp lm
   lvinj lvfl lviz lfzj xfn to fxxmycgizvop th zm yaxv zillinb in f tms
   dfxinb dkmnl, yfnicagflinb zhytmgz fl lvo pikoxlimn md pizlfnl
   fpyinizlkflmkz. lviz iz kflvok f wfh lm kobiyonl fnp tkfinwfzv.]

make "cgram3 [Pcodl hbdcx qxdrdlh yihcodr, hbd rzbiier gxd lih ziyqdhdlh
   hi hdgzb gwhbdlhcz echdxgzf, xdgnclp gr g ydglr ia ecudxghcil gln
   zwehcoghcil. gln c niwuh hbgh yirh ia wr jbi rdxciwref xdgn gln jxchd
   hbd dlpecrb eglpwgpd dodx edgxldn ch uf hbd xiwhd ia "xwl, rqih, xwl"
   hi rcegr ygxldx.]

make "cgram4 [Jw btn xnsgsyp ejke gfebbcg, dtyjbn fbccsksg, ryu fbccsksg
   nswcsfpsu pes usgjns, wnssuba, ryu wtptns bw pes qbtyk, pesns zbtcu
   ls yb knrujyk, yb psgpjyk svfsxp rg r psrfejyk aspebu, ryu yb
   lcrfilbrnu dtykcsg. jy wrfp, zs rns ksppjyk cbfigpsx gfesutcjyk ryu
   knrujyk pb pes xbjyp bw pbnptns.]
</PRE><P>


<P>

<P><A HREF="../v2-toc2.html">(back to Table of Contents)</A>
<P><A HREF="../v2ch10/v2ch10.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="https://people.eecs.berkeley.edu/~bh/v2ch12/v2ch12.html"><STRONG>NEXT</STRONG></A>

<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>, 
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>