
author: elioat <elioat@tilde.institute> 2023-08-23 07:52:19 -0400
committer: elioat <elioat@tilde.institute> 2023-08-23 07:52:19 -0400
commit: 562a9a52d599d9a05f871404050968a5fd282640 (patch)
tree: 7d3305c1252c043bfe246ccc7deff0056aa6b5ab /js/games/nluqo.github.io/~bh/v2ch1/files.html
parent: 5d012c6c011a9dedf7d0a098e456206244eb5a0f (diff)
download: tour-562a9a52d599d9a05f871404050968a5fd282640.tar.gz
1 files changed, 777 insertions, 0 deletions
diff --git a/js/games/nluqo.github.io/~bh/v2ch1/files.html b/js/games/nluqo.github.io/~bh/v2ch1/files.html
new file mode 100644
index 0000000..2cfa1ec
--- /dev/null
+++ b/js/games/nluqo.github.io/~bh/v2ch1/files.html
@@ -0,0 +1,777 @@
+
+<P><HTML>
+<HEAD>
+<TITLE>Computer Science Logo Style vol 2 ch 1: Data Files</TITLE>
+</HEAD>
+<BODY>
+<CITE>Computer Science Logo Style</CITE> volume 2:
+<CITE>Advanced Techniques</CITE> 2/e Copyright (C) 1997 MIT
+<H1>Data Files</H1>
+
+<TABLE width="100%"><TR><TD>
+<IMG SRC="../csls2.jpg" ALT="cover photo">
+<TD><TABLE>
+<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
+Harvey</A><BR>University of California, Berkeley</CITE>
+<TR><TD align="right"><BR>
+<TR><TD align="right"><A HREF="../pdf/v2ch01.pdf">Download PDF version</A>
+<TR><TD align="right"><A HREF="../v2-toc2.html">Back to Table of Contents</A>
+<TR><TD align="right"><A HREF="../v2ch0/ack.html"><STRONG>BACK</STRONG></A>
+chapter thread <A HREF="../v2ch2/v2ch2.html"><STRONG>NEXT</STRONG></A>
+<TR><TD align="right"><A HREF="https://mitpress.mit.edu/books/computer-science-logo-style-second-edition-volume-2">MIT
+Press web page for <CITE>Computer Science Logo Style</CITE></A>
+</TABLE></TABLE>
+
+<HR><P>Program file for this chapter: <A HREF="format.lg"><CODE>format</CODE></A>
+
+
+<P>The programming techniques that you learned in the first volume of this
+series are all you need to express any computation.  That is, given any
+question that a computer program can answer, you can write the program in
+Logo using those techniques.  Also, those techniques can be used, with few
+if any changes in notation, in any implementation of Logo.  However, saying
+that a problem can be solved using certain tools doesn't mean that it can be
+solved in the most convenient way.  In this volume the overall goal is to
+expand your repertoire of Logo techniques, so that you'll find it easier to
+deal with more difficult problems.  Some of the techniques here are unique
+to Berkeley Logo; others exist in other dialects, but in significantly
+different forms.
+
+<P>Probably the most glaring omission in the first volume is that we made no
+provision for saving information from one session to the next.  (You do know
+how to save a Logo workspace, but that's too all-or-nothing to be very
+useful.  You'd like to be able to save specific kinds of information, and
+perhaps use that information in some program outside of Logo.)  In this
+chapter we'll explore the use of <EM>data files</EM> in Logo
+programs.
+
+<P>There isn't much in the way of truly new ideas here.  There are a few new
+primitives and a few grubby details about how files are named in your
+particular computer, but for the most part you won't have to change the way
+you think about the programming process.  My plan for this chapter is to
+give a quick summary of the vocabulary you'll need, and spend most of the
+chapter on a practical programming project that will show you the sort of
+thing you can accomplish easily in Logo.
+
+<P><H2>Reader and Writer</H2>
+
+<P>We've been reading and writing data all along.  We've been reading
+from the keyboard, with operations like <CODE>readlist</CODE> and <CODE>
+readchar</CODE>, and we've been writing to your screen, with commands
+like <CODE>print</CODE> and <CODE>type</CODE>.
+
+<P>The goal now is to read and write the same data, but from and to other
+devices.  This includes files on a hard disk or a diskette,
+but also things like printers or TV cameras if you have them.  The
+same procedures that read the keyboard and write the screen can be used for
+these other devices as well.  The trick is to divert the attention of those
+procedures to someplace else.
+
+<P>The part of the Logo interpreter that reads characters for <CODE>readlist</CODE>
+and <CODE>readchar</CODE> is called the <EM>reader;</EM> the part that handles
+<CODE>print</CODE> and its friends is the <EM>writer.</EM> The commands
+<CODE>setread</CODE> and <CODE>setwrite</CODE> tell the reader and the writer,
+respectively, what file or device to use.  The input to either command is
+the name of a file or device.  The format of that name will vary from one
+operating system to another, so you should look it up in your computer's
+reference manual.  Generally it will be the same format that you (I assume)
+have already been using as input to the <CODE>save</CODE> and <CODE>load</CODE> commands.
+
+<P>If you invoke <CODE>setread</CODE> with the empty list as input, it tells the
+reader to read from the keyboard.  If you give <CODE>setwrite</CODE> the
+empty list as input, it tells the writer to write to the screen.  In
+other words the empty list &quot;turns off&quot; whatever file or device you
+may have been using and returns to Logo's usual style of
+interaction.
+
+<P>You can switch the attention of the reader or the writer among several files
+in rotation without &quot;losing your place&quot; in each one.  Before you can
+use a file as input to <CODE>setread</CODE> or <CODE>setwrite</CODE>, you must
+<EM>open</EM> it.  You do this with the
+<CODE>openread</CODE> or <CODE>openwrite</CODE> command.<SUP>*</SUP>  Once a file is opened, you can <CODE>setread</CODE> or <CODE>
+setwrite</CODE> to it, read or write some data, then switch to a different file
+for a while, and then continue where you left off.  When you're finished
+using the file, you must <CODE>close</CODE> it.
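+
+<P>As a minimal sketch of that sequence, the two procedures below write one
+line into a file and then read it back.  The names <CODE>savenote</CODE> and
+<CODE>shownote</CODE> and the file name <CODE>notes</CODE> are made up for this
+illustration; use whatever file name format your system expects.
+
+<P><PRE>to savenote :file
+openwrite :file                  ; start a new, empty file
+setwrite :file                   ; the writer now writes into the file
+print [Brian Harvey 555-2368]
+setwrite []                      ; back to the screen
+close :file
+end
+
+to shownote :file
+openread :file
+setread :file                    ; the reader now reads from the file
+print readlist                   ; the writer still writes to the screen
+setread []                       ; back to the keyboard
+close :file
+end
+
+? <U>savenote &quot;notes</U>
+? <U>shownote &quot;notes</U>
+Brian Harvey 555-2368
+</PRE>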
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP><CODE>Openwrite</CODE> creates
+a new, empty file, replacing any file that might previously have existed
+with the same name.  Berkeley Logo also provides <CODE>openupdate</CODE>, which
+opens an existing file for both reading and writing simultaneously, and <CODE>
+openappend</CODE>, which opens an existing file for writing, putting the newly
+written data after the old contents of the file.  I won't use those in this
+book, though.</SMALL></BLOCKQUOTE></SMALL><P>Some operating systems allow access to devices like printers using the same
+programming interface that works for files.  In those systems,
+you can <CODE>setwrite</CODE> to a printer just as you can to a disk file.  The
+format of the input to <CODE>setwrite</CODE> may be different (a device name
+instead of a file name), but there is no conceptual difference.
+
+<P>
+
+<H2>End of File</H2>
+
+<P>When reading information from a file, the problem arises of what
+happens when there is no more left to read.  How does a program
+know it's reached the end of the file?
+
+<P>Berkeley Logo provides two ways to answer this question.  If the structure
+of your program makes it convenient to test for the end of the file <EM>
+before</EM> attempting to read more information from the file, you can use the
+predicate <CODE>eofp</CODE>, which takes no inputs, and returns <CODE>true</CODE> if the
+file currently being read is at its end.  (If Logo is reading from the
+keyboard, then <CODE>eofp</CODE> always returns <CODE>false</CODE>.)
+
+<P>In some cases it may be more convenient to try to read from the file, and
+then later test whether there was really any information available to read.
+To make this possible, the reading operations output an empty datum
+when there is nothing left to read, but of the opposite type from
+their usual output.  In other words <CODE>readlist</CODE>, which usually
+outputs a list, outputs an empty <EM>word</EM> to indicate the end of a
+file.  <CODE>Readchar</CODE>, which normally outputs a word, outputs an empty
+<EM>list</EM> when there are no more characters to be read.  You can
+use <CODE>wordp</CODE> or <CODE>listp</CODE>, therefore, to check for the end of the
+file.
+
+<P>Here's an example.  <CODE>Extract</CODE> is a command that takes two inputs, a
+word and a filename.  Its effect is to print every line in that file
+that contains the chosen word.  For example, you might have a file in
+which each line contains someone's name and telephone number; you
+could use this procedure to find a particular person in the file.
+
+<P><PRE>to extract :word :file
+openread :file
+setread :file
+extract1 :word
+setread []
+close :file
+end
+
+to extract1 :word
+local &quot;line
+if eofp [stop]
+make &quot;line readlist
+if memberp :word :line [print :line]
+extract1 :word
+end
+
+? <U>extract &quot;brian &quot;phonelist</U>
+Brian Harvey 555-2368
+Brian Silverman 555-5274
+</PRE>
+
+<P>Notice that the program restores reading from the keyboard
+when it's done reading the file.  In the example I'm assuming that
+<CODE>phonelist</CODE> is the name of a file you've created earlier,
+with a Logo program or with your favorite text editor outside
+of Logo.
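+
+<P>The read-first, test-later style described earlier would work here too.
+The following alternative <CODE>extract1</CODE> is only a sketch of that
+approach: it reads a line unconditionally and then uses <CODE>wordp</CODE> to
+notice the empty word that <CODE>readlist</CODE> outputs at the end of the
+file.  An empty line in the file is read as an empty <EM>list,</EM> so it
+doesn't end the search early.
+
+<P><PRE>to extract1 :word
+local &quot;line
+make &quot;line readlist
+if wordp :line [stop]            ; a word instead of a list means end of file
+if memberp :word :line [print :line]
+extract1 :word
+end
+</PRE>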
+
+<P><H2>Case Sensitivity</H2>
+
+<P>In this example, I used the word <CODE>brian</CODE>, in all lower case
+letters, as the input to <CODE>extract</CODE>, whereas the data file contained
+the word <CODE>Brian</CODE> with an initial upper case or
+capital letter.  You can control whether or not Logo considers
+those two words equal by changing the value of the variable
+<CODE>caseignoredp</CODE>.  If this variable has the value <CODE>true</CODE>, as it does
+by default, then <CODE>equalp</CODE> and <CODE>memberp</CODE> consider upper and lower
+case letters the same.  But if you say
+
+<P><PRE>make &quot;caseignoredp &quot;false
+</PRE>
+
+<P>then upper and lower case letters will not be equal.  (This
+variable does <EM>not</EM> affect Logo's understanding of the names of
+procedures and variables, in which case is always ignored.  The words
+<CODE>print</CODE> and <CODE>PRINT</CODE> always name the same procedure, for example.)
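+
+<P>You can see the effect directly by comparing the two spellings before and
+after changing the variable.  (The last instruction restores the default.)
+
+<P><PRE>? <U>print equalp &quot;brian &quot;Brian</U>
+true
+? <U>make &quot;caseignoredp &quot;false</U>
+? <U>print equalp &quot;brian &quot;Brian</U>
+false
+? <U>make &quot;caseignoredp &quot;true</U>
+</PRE>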
+
+<P><H2>Dribble Files</H2>
+
+<P>Not everything Logo prints goes through the writer.  Error messages and
+trace output always go to the screen, not into a file.  The idea is that
+even when you're using files, you're still programming interactively, and
+those messages are part of the programming process rather than
+part of the result of your program.
+
+<P>Sometimes, though, you want to capture in a file <EM>everything</EM>
+that happens while you're using Logo.  Some programming teachers, for
+instance, like to look over their students' shoulders but can't look
+at everyone at once.  If you record everything you do, your teacher
+can print out the record, take it home, and study it overnight.  The
+formal name for this kind of record is a <EM>transcript file,</EM> but
+it's more popularly known as a <EM>dribble file.</EM>  (The metaphor is
+that there's a leak in the pipe between the computer and the screen
+and some of the data dribbles out into the file.)
+
+<P>The <CODE>dribble</CODE> command takes a file name as input and starts
+dribbling into that file.  The <CODE>nodribble</CODE> command, with no input,
+turns off dribbling.  Information is sent to the dribble file <EM>
+in addition to</EM> being printed on your screen, or written in a file by
+the writer.  Compare this with the effect of <CODE>setwrite</CODE>, which
+tells Logo to print into a file <EM>instead of</EM> onto the screen.
+
+<P>If you want to keep a transcript of a programming session, remember
+that much of your interaction with Logo happens in the Logo editor
+and that that kind of interaction can't be recorded in a dribble
+file.  So you might want to make it a habit to <CODE>po</CODE> the procedures
+you've edited, each time you leave the editor.
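+
+<P>A recorded session might therefore follow a pattern like this one.  The
+file name <CODE>session</CODE> is just an example, and <CODE>extract</CODE>
+stands for whichever procedure you happen to be editing.
+
+<P><PRE>dribble &quot;session      ; start recording into the file named session
+edit &quot;extract         ; what happens inside the editor is not recorded
+po &quot;extract           ; so print the edited definition into the transcript
+nodribble             ; stop recording
+</PRE>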
+
+<P><H2>A Text Formatter</H2>
+
+<P>Okay, it's time for the practical project I promised you.  Probably
+the most useful &quot;real&quot; program you can find for a home computer is a
+word processor.  There are
+two parts to a word processing package: a text editor and a
+formatter.  The editor is the part of the system that lets you type in
+your document, correct errors, and make additions and deletions
+later.  The formatter is the part that takes what you type and turns
+it into beautiful printed pages with even margins and so on.
+(In most word processors, these two parts are integrated, so that every
+character you type makes an immediate change in the beautifully formatted
+document.  But in principle the two tasks are separable.)
+
+<P>I'm going to write a text formatter.  I assume that you have some way
+to put text into a file.  (In some versions of Logo the same
+editor that you use for procedures can also edit text files.
+Otherwise you probably have a separate program that edits files, or
+else you can write one in Logo!)  The formatter will read lines from a
+file, fill and justify paragraphs, and print the result.  (To <EM>
+fill</EM> text means to fit as many words as possible into each printed
+line.  To <EM>justify</EM> the text is to insert extra spaces between
+words so that both margins line up.) You can see how the
+formatter will work by examining the example that follows.
+I've shown both what's in the file and what my program prints.
+
+<HR>
+
+<P>Formatter input file:
+
+<P><PRE><SMALL>
+When I wrote the first edition of this book in 1984, I said that the study of
+computer programming was intellectually rewarding for young children in
+elementary school, and for computer science majors in college, but that high
+school students and adults studying on their own generally had an
+intellectually barren diet, full of technical details of some particular
+computer brand.
+
+At about the same time I wrote those words, the College Board was introducing
+an Advanced Placement exam in computer science.  Since then, the AP course has
+become popular, and similar official or semi-official computer science
+curricula have been adopted in other countries as well.  Meanwhile, the
+computers available to ordinary people have become large enough and powerful
+enough to run serious programming languages, breaking the monopoly of BASIC.
+* nofill
+I think that there shall never exist
+a poem as lovely as a tree-structured list.
+* yesfill
+So, the good news is that intellectually serious computer science is within
+the reach of just about everyone.  The bad news is that the curricula tend to
+be imitations of what is taught to beginning undergraduate computer science
+majors, and I think that's too rigid a starting point for independent
+learners, and especially for teenagers.
+
+See, the wonderful thing about computer programming is that it's fun, perhaps
+not for everyone, but for very many people.  There aren't many mathematical
+activities that appeal so spontaneously.  Kids get caught up in the
+excitement of programming, in the same way that other kids (or maybe the
+same ones) get caught up in acting, in sports, in journalism (provided the
+paper isn't run by teachers), or in ham radio.  If schools get too serious
+about computer science, that spontaneous excitement can be lost.  I once
+heard a high school teacher say proudly that kids used to hang out in his
+computer lab at all hours, but since they introduced the computer science
+curriculum, the kids don't want to program so much because they've learned
+that programming is just a means to the end of understanding the
+curriculum.  No!  The ideas of computer science are a means to the end of
+getting computers to do what you want.
+*skip 4
+*make "nofilltab 15
+*nofill
+Computer
+Science
+Apprenticeship
+*yesfill
+*make "spacing 2
+My goal in this series of books is to make the goals and methods of a serious
+computer scientist accessible, at an introductory level, to people who are
+interested in computer programming but are not computer science majors.  If
+you're an adult or teenaged hobbyist, or a teacher who wants to use the
+computer as an educational tool, you're definitely part of this audience.
+I've taught these ideas to teachers and to high school students.  What I enjoy
+most is teaching high school freshmen who bring a love of programming into the
+class with them--the ones who are always tugging at my arm to tell me what they
+found in the latest Byte.
+</SMALL></PRE><P>
+
+<HR>
+
+<P>Formatted output:
+
+<PRE><SMALL><SMALL><SMALL><SMALL>
+
+
+
+
+
+
+            When  I wrote the first edition of this book in 1984, I said
+       that   the  study  of  computer  programming  was  intellectually
+       rewarding  for  young  children  in  elementary  school,  and for
+       computer science majors in college, but that high school students
+       and  adults studying on their own generally had an intellectually
+       barren  diet,  full  of  technical  details  of  some  particular
+       computer brand.
+
+            At  about  the  same  time  I wrote those words, the College
+       Board  was  introducing  an  Advanced  Placement exam in computer
+       science.  Since  then,  the  AP  course  has  become popular, and
+       similar official or semi-official computer science curricula have
+       been adopted in other countries as well. Meanwhile, the computers
+       available  to  ordinary  people  have  become  large  enough  and
+       powerful  enough  to  run serious programming languages, breaking
+       the monopoly of BASIC.
+
+       I think that there shall never exist
+       a poem as lovely as a tree-structured list.
+
+            So,  the  good  news is that intellectually serious computer
+       science  is within the reach of just about everyone. The bad news
+       is  that the curricula tend to be imitations of what is taught to
+       beginning  undergraduate  computer  science  majors,  and I think
+       that's  too  rigid a starting point for independent learners, and
+       especially for teenagers.
+
+            See,  the wonderful thing about computer programming is that
+       it's  fun,  perhaps  not  for everyone, but for very many people.
+       There   aren't   many  mathematical  activities  that  appeal  so
+       spontaneously.   Kids   get   caught  up  in  the  excitement  of
+       programming,  in  the same way that other kids (or maybe the same
+       ones) get caught up in acting, in sports, in journalism (provided
+       the paper isn't run by teachers), or in ham radio. If schools get
+       too  serious  about computer science, that spontaneous excitement
+       can  be lost. I once heard a high school teacher say proudly that
+       kids used to hang out in his computer lab at all hours, but since
+       they  introduced  the computer science curriculum, the kids don't
+       want  to program so much because they've learned that programming
+       is  just  a means to the end of understanding the curriculum. No!
+       The  ideas  of computer science are a means to the end of getting
+       computers to do what you want.
+
+
+
+
+
+                      Computer
+                      Science
+                      Apprenticeship
+
+            My  goal  in  this  series of books is to make the goals and
+
+       methods  of  a  serious  computer  scientist  accessible,  at  an
+
+
+
+
+
+
+<CENTER><HR width="50%"></CENTER>
+
+
+
+
+
+       introductory  level,  to  people  who  are interested in computer
+
+       programming  but  are  not  computer science majors. If you're an
+
+       adult  or  teenaged  hobbyist,  or a teacher who wants to use the
+
+       computer  as  an educational tool, you're definitely part of this
+
+       audience.  I've taught these ideas to teachers and to high school
+
+       students.  What I enjoy most is teaching high school freshmen who
+
+       bring  a  love  of programming into the class with them--the ones
+
+       who  are  always  tugging at my arm to tell me what they found in
+
+       the latest Byte.
+
+</SMALL></SMALL></SMALL></SMALL></PRE>
+<HR>
+
+<P>For the most part the formatter just copies words from one file to
+another, filling and justifying as it goes.  A blank line in the file
+indicates a break between paragraphs.  The program skips a line
+between paragraphs and indents the first line of the new paragraph.
+It's possible to control the formatter's work by including <EM>
+formatting commands</EM> in the file.  These are the lines that start
+with asterisks in the example.  For example, the line that says
+
+<P><PRE>* nofill
+</PRE>
+
+<P>means, &quot;From now on, stop filling paragraphs.  Instead,
+each line in the input file should be one line in the printed result.&quot; The
+<CODE>yesfill</CODE> command returns to normal paragraph style.<SUP>*</SUP>
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>I'd
+have liked to call the command <CODE>fill</CODE>, as it would be in a
+commercial word processing program, but unfortunately that's the name
+of a primitive procedure in Logo.</SMALL></BLOCKQUOTE></SMALL><P>To run the program, invoke the <CODE>format</CODE> command.  This command
+takes two inputs: the name of a file to read and the name of a file
+to write.  The latter might be the name of the printer if your operating
+system allows it.
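+
+<P>For instance, assuming the sample text shown earlier has been saved in a
+file named <CODE>preface</CODE> (a made-up name, as is the name of the output
+file), the formatted version could be produced with
+
+<P><PRE>? <U>format &quot;preface &quot;preface.out</U>
+</PRE>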
+
+<P><H2>Page Geometry</H2>
+
+<P>The program uses several global variables to determine the layout of a
+printed page.  Vertical measurements are in lines (6 per inch for
+most computer printers); horizontal measurements are in characters (10 per
+inch is common, although there is more variation in this unit).  The program
+assumes fixed-width characters; a more professional program would handle
+variable-width character fonts, but the added complexity wouldn't help you
+learn the things I'm most interested in now.
+
+<P><CENTER><IMG SRC="formatter.jpg" ALT="figure: formatter"></CENTER>
+
+<P><P>
+
+<TABLE>
+<TR><TH align="left">pageheight<TD>&nbsp;&nbsp;&nbsp;Height of the entire sheet of paper, including margins.
+<TR><TH align="left">topmar<TD>&nbsp;&nbsp;&nbsp;Number of lines of margin at the top of each page.
+<TR><TH align="left">lines<TD>&nbsp;&nbsp;&nbsp;Number of lines to be printed on each page.
+<TR><TH align="left">parskip<TD>&nbsp;&nbsp;&nbsp;Number of blank lines between paragraphs.
+<TR><TH align="left">spacing<TD>&nbsp;&nbsp;&nbsp;1 for single spaced printing, 2 for double spaced, etc.
+<TR><TH align="left">leftmar<TD>&nbsp;&nbsp;&nbsp;Number of characters of margin at the left of the page.
+<TR><TH align="left">width<TD>&nbsp;&nbsp;&nbsp;Number of characters to print on each line.
+<TR><TH align="left">filltab<TD>&nbsp;&nbsp;&nbsp;Number of characters to indent the first line of a paragraph.
+<TR><TH align="left">nofilltab<TD>&nbsp;&nbsp;&nbsp;Number of characters to indent each nofill line.
+</TABLE>
+
+<P>The formatter recognizes formatting commands, in the file it's
+reading, to change the values of these variables.  By a strange
+coincidence these formatting commands look similar to the Logo
+commands to set a variable.  In the sample file, for instance, the
+formatting command
+
+<P><PRE>*make &quot;spacing 2
+</PRE>
+
+<P>is used to start double spacing.
+
+<P><H2>The Program</H2>
+
+<P>Here are the procedures that make up the formatter.
+
+<P><PRE>
+to format :from :to
+openread :from
+openwrite :to
+setread :from
+setwrite :to
+init.vars
+loop
+setread []
+setwrite []
+close :from
+close :to
+end
+
+to init.vars
+make "pageheight 66
+make "topmar 6
+make "lines 54
+make "leftmar 7
+make "width 65
+make "filltab 5
+make "nofilltab 0
+make "parskip 1
+make "spacing 1
+make "started "false
+make "filling "true
+make "printed 0
+make "inline []
+end
+
+to <A NAME="loop">loop</A>
+forever [if process nextword [stop]]
+end
+
+;; Add a word to the output file, starting a new line if necessary
+
+to <A NAME="process">process</A> :word
+if listp :word [output "true]
+if not :started [start]
+if (:linecount+1+count :word) > :width [putline]
+addword :word
+output "false
+end
+
+to <A NAME="addword">addword :word</A>
+if not emptyp :line [make "linecount :linecount+1]
+make "line lput :word :line
+make "linecount :linecount+count :word
+end
+
+to <A NAME="putline">putline</A>
+repeat :leftmar+:indent [type "| |]
+putwords :line ((count :line)-1) (:width-:linecount)
+newline
+skip :spacing
+end
+
+to <A  NAME="putwords">putwords</A> :line :spaces :filler
+local "perword
+if emptyp :line [stop]
+type first :line
+make "perword ifelse :spaces > 0 [int ((:filler+:spaces-1)/:spaces)] [0]
+if :filler > 0 [repeat :perword [type "| |]]
+type "| |
+putwords (butfirst :line) (:spaces-1) (:filler-:perword)
+end
+
+;; Get the next input word, reading a new line if necessary
+
+to <A NAME="nextword">nextword</A>
+if not emptyp :inline [output extract.word]
+if not :filling [break]
+make "inline readword
+if listp :inline [break output []]
+if emptyp :inline [break output nextword]
+if equalp first :inline "|*| ~
+   [run butfirst :inline
+    make "inline "]
+make "inline skipspaces :inline
+output nextword
+end
+
+to <A NAME="extract.word">extract.word</A>
+local "result
+make "result firstword :inline
+make "inline skipfirst :inline
+output :result
+end
+
+to firstword :word
+if emptyp :word [output "]
+if equalp first :word "| | [output "]
+output word (first :word) (firstword butfirst :word)
+end
+
+to skipfirst :word
+if emptyp :word [output "]
+if equalp first :word "| | [output skipspaces :word]
+output skipfirst butfirst :word
+end
+
+to skipspaces :word
+if emptyp :word [output "]
+if equalp first :word "| | [output skipspaces butfirst :word]
+output :word
+end
+
+;; Formatting helpers
+
+to start
+make "started "true
+repeat :topmar [print []]
+newindent
+end
+
+to newindent
+newline
+make "indent ifelse :filling [:filltab] [:nofilltab]
+make "linecount :indent
+end
+
+to newline
+make "line []
+make "indent 0
+make "linecount 0
+end
+
+to <A NAME="break">break</A>
+if emptyp :line [stop]
+make "linecount :width
+putline
+newindent
+if :filling [skip :parskip]
+end
+
+;; Formatting commands to be invoked by the user
+
+to <A NAME="skip">skip</A> :howmany
+break
+repeat :howmany [print []]
+make "printed :printed+:howmany
+if :printed &lt; :lines [stop]
+repeat :pageheight-:printed [print []]
+make "printed 0
+end
+
+to nofill
+break
+make "filling "false
+newindent
+end
+
+to yesfill
+break
+if not :filling [skip :parskip]
+make "filling "true
+newindent
+end
+</PRE>
+
+<P>To help you understand this program, you should start by imagining that the
+text file contains one big paragraph with no formatting commands.  For each
+word in the file, <A HREF="files.html#loop"><CODE>loop</CODE></A>
+invokes <A HREF="files.html#nextword"><CODE>nextword</CODE></A> to read the word and
+<CODE>process</CODE> to process it.  Just take <CODE>nextword</CODE> on faith for now
+and look at <A HREF="files.html#process"><CODE>process</CODE></A>.
+The third and fourth instruction lines are the
+interesting ones.  The third line asks whether adding this word to the
+partially filled print line will overflow its width.  If so, <CODE>process</CODE>
+invokes <A HREF="files.html#putline"><CODE>putline</CODE></A> to print
+that line and start a new one.  Then, in
+either case, <CODE>process</CODE> invokes <A HREF="files.html#addword"><CODE>addword</CODE></A>
+to add the word to the print
+line it's accumulating.  <CODE>Addword</CODE> puts the word at the end of the line
+and also adds its length to <CODE>:linecount</CODE>, the number of characters in
+the line.  If this isn't the first word of a new line, then it must also add
+another character to <CODE>:linecount</CODE> to take account of the space between
+words.
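+
+<P>To check the arithmetic, here is a small operation that computes in one
+pass the same total that <CODE>addword</CODE> builds up a word at a time: the
+letters of every word plus one space between each pair of neighbors.
+<CODE>Linelength</CODE> is not part of the formatter; it's a hypothetical
+helper for experimenting, and it ignores the indentation that the real
+program includes in <CODE>:linecount</CODE>.
+
+<P><PRE>to linelength :line
+if emptyp :line [output 0]
+if emptyp butfirst :line [output count first :line]   ; last word: no space
+output (count first :line) + 1 + linelength butfirst :line
+end
+
+? <U>print linelength [the quick brown fox]</U>
+19
+</PRE>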
+
+<P><A HREF="files.html#putline"><CODE>Putline</CODE></A> is
+essentially just a fancy <CODE>print</CODE> command.  The
+complication comes in because the program is trying to justify the line by
+adding spaces where needed between words.  To do this, it has to <CODE>type</CODE>
+the line a word at a time; that's the task of
+<A HREF="files.html#putwords"><CODE>putwords</CODE></A>.  In that
+procedure, <CODE>:spaces</CODE> is the number of spaces between words not yet
+printed; in other words it's the number of positions into which extra spaces
+can be shoved.  (The idea is to spread out the necessary spaces as evenly as
+possible.) <CODE>:Filler</CODE> is the total number of extra spaces we need to
+insert; <CODE>:perword</CODE> is the number that should be inserted after the
+particular word we're typing right now.  (When I started writing <CODE>
+putline</CODE> and <CODE>putwords</CODE>, I thought that I could just calculate
+<CODE>:perword</CODE> once for each line.  But if the number of extra spaces we want
+to insert is not a multiple of the number of positions available, then the
+number of extra spaces may not be equal for every word in the line.)
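+
+<P>The following sketch isolates that calculation.  <CODE>Fillers</CODE> is a
+made-up operation, not part of the formatter; it uses the same formula as
+<CODE>putwords</CODE> to output the number of extra spaces that would go into
+each gap, from left to right.  With three gaps and five extra spaces, the
+first two gaps get two spaces each and the last gets one.
+
+<P><PRE>to fillers :spaces :filler
+local &quot;perword
+if :spaces = 0 [output []]       ; no gaps left to fill
+make &quot;perword int ((:filler+:spaces-1)/:spaces)
+output fput :perword fillers (:spaces-1) (:filler-:perword)
+end
+
+? <U>print fillers 3 5</U>
+2 2 1
+</PRE>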
+
+<P>That's pretty much the whole story about the printing part of the
+program.  The reading part is handled by
+<A HREF="files.html#nextword"><CODE>nextword</CODE></A>.  It reads a
+line at a time into the variable
+<CODE>inline</CODE>.  <CODE>Nextword</CODE> uses the Logo
+primitive <CODE>readword</CODE> to read a line, rather than the usual
+<CODE>readlist</CODE>, to avoid Logo's usual special handling of parentheses and
+brackets.  <CODE>Readword</CODE> outputs a word containing all of the characters on
+the line that it reads, even if the line includes spaces, which would
+ordinarily separate words.  Therefore, the program must divide the
+long word output by <CODE>readword</CODE> into ordinary words; that's the job of
+<A HREF="files.html#extract.word"><CODE>extract.word</CODE></A>
+and its subprocedures <CODE>firstword</CODE>,
+<CODE>skipfirst</CODE>, and <CODE>skipspaces</CODE>.
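+
+<P>Once the formatter's procedures are loaded, you can try the two main
+helpers by hand.  The vertical bars let you type a single word containing
+spaces, much like what <CODE>readword</CODE> outputs for a line of text.
+
+<P><PRE>? <U>print firstword &quot;|hello   world|</U>
+hello
+? <U>print skipfirst &quot;|hello   world|</U>
+world
+</PRE>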
+
+<P>Each time <A HREF="files.html#nextword"><CODE>nextword</CODE></A>
+is invoked, it removes one word from the line and
+outputs that word.  When <CODE>:inline</CODE> is empty, <CODE>nextword</CODE> reads a new
+line from the file.  There are four possibilities: First, the end of the
+file may be reached.  <CODE>Listp</CODE> tests for this; if so, <CODE>nextword</CODE>
+outputs an empty list.  Second, the new line can be empty, indicating a
+paragraph break.  In this case <CODE>nextword</CODE> invokes
+<A HREF="files.html#break"><CODE>break</CODE></A> and reads
+another line.  Third, the new line can be a formatting command, starting
+with an asterisk.  In this case <CODE>nextword</CODE> just <CODE>run</CODE>s the line,
+minus the asterisk, and reads another line.  Fourth, the line can be an
+ordinary text line, in which case <CODE>nextword</CODE> goes back to extracting
+words from the line.
+
+<P>In most programming languages, most of the effort in writing a
+formatter like this would be in recognizing and evaluating the
+formatting commands.  I hope you appreciate how much Logo's ability to
+<CODE>run</CODE> instructions found in a file simplifies this task!  The danger
+in this technique is that an invalid instruction in the input file will
+crash the formatting program, giving a Logo error message.  (This is
+especially bad because after the error message we are left with a
+half-written output file still open.)  I'd like to &quot;catch&quot; errors
+while running the user's instructions; you'll see how to do that in
+Chapter 3.
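+
+<P>Without anticipating that chapter's explanation, here is one possible
+sketch.  It assumes Berkeley Logo's <CODE>catch</CODE> command with the
+special tag <CODE>error</CODE>; the name <CODE>safe.run</CODE> is invented for
+this illustration.  A formatting command that causes an error is simply
+abandoned, and the formatter carries on with the next line of the file.
+
+<P><PRE>to safe.run :command
+; run the formatting command; if it signals an error, give up on it quietly
+catch &quot;error [run :command]
+end
+</PRE>
+
+<P>With this definition, <CODE>nextword</CODE> would say
+<CODE>safe.run butfirst :inline</CODE> where it now says
+<CODE>run butfirst :inline</CODE>.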
+
+<P>The rest of the program is just a bunch of detail.  The
+<A HREF="files.html#skip"><CODE>skip</CODE></A>
+command is written to be used both by the formatting program itself
+and as a formatting command, as in the example I showed earlier.  As
+an exercise in understanding program structure, notice that <CODE>skip</CODE>
+invokes <A HREF="files.html#break"><CODE>break</CODE></A>
+and <CODE>break</CODE> invokes <CODE>skip</CODE>; then explain
+why they don't just keep invoking each other forever, like a recursive
+procedure without a stop rule.
+
+<P>Another slightly tricky part to understand is the variable
+<CODE>started</CODE> and the procedure <CODE>start</CODE>.
+<CODE>Start</CODE> is invoked by
+<CODE>process</CODE>, but only once, before processing the very first word of
+the text.  Ensuring the &quot;only once&quot; is the sole purpose of
+<CODE>started</CODE>, a variable that initially contains <CODE>false</CODE> and is
+changed to <CODE>true</CODE> by <CODE>start</CODE>.  Instead, why don't I just
+invoke <CODE>start</CODE> from <CODE>format</CODE> before calling <CODE>loop</CODE>?  The
+answer is that this technique allows the file to start with an
+instruction like
+
+<P><PRE>*make &quot;topmar 10
+</PRE>
+
+<P>Any such instructions will be evaluated <EM>before</EM>
+processing the first text word.  If <CODE>start</CODE> were invoked by
+<CODE>format</CODE>, the top margin would be skipped before this instruction had a
+chance to set <CODE>:topmar</CODE>.
+
+<P><H2>Improving the Formatter</H2>
+
+<P>Actually, using <CODE>make</CODE> as a formatting command is a little
+schlock--not what I'd call good &quot;human engineering.&quot; If you wanted
+to make a million dollars selling this program, you'd add several
+little procedures like this:
+
+<P><PRE>to topmar :lines
+make &quot;topmar :lines
+end
+</PRE>
+
+<P>Like <CODE>nofill</CODE> and <CODE>yesfill</CODE>, these procedures would be
+used only as formatting commands, not as part of the formatter itself.
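+
+<P>Two more in the same pattern might look like this.  The procedure names are
+my own choice, matching the names of the variables they set.  (Logo keeps
+procedure names and variable names separate, so there's no conflict.)
+
+<P><PRE>to leftmar :chars
+make &quot;leftmar :chars
+end
+
+to spacing :howmany
+make &quot;spacing :howmany
+end
+</PRE>
+
+<P>The input file could then say <CODE>*spacing 2</CODE> instead of
+<CODE>*make &quot;spacing 2</CODE>.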
+
+<P>The program leaves out a lot of things you'd like to be able to do.
+You should be able to number pages automatically in the top or bottom
+margins.  (That's a pretty easy modification; most of the work would
+be in <CODE>skip</CODE>.) You'd like to be able to center lines on the page
+for chapter headings.  If your printer can underline or use different
+type faces, you'll want a way to control those things with formatting
+commands.<SUP>*</SUP>
+
+<P><SMALL><BLOCKQUOTE><SMALL><SUP>*</SUP>If you're <EM>really</EM> ambitious, you could try
+teaching the program about footnotes!</SMALL></BLOCKQUOTE></SMALL>
+
+<P>Still, this is a usable program carrying out a real task.  It takes
+19 Logo procedures averaging 7 lines each.  This would be a
+much harder project in most languages.  What makes it so manageable in
+Logo?  First, <EM>modularity.</EM>  A small procedure for each task makes
+the overall program easier to understand than it would be if it were
+all in one piece.  Second, Logo's data types, words and lists, are
+well suited to this problem.  Third, Logo's control mechanisms, especially
+recursive operations and <CODE>run</CODE>, have the needed flexibility.
+
+<P>
+
+<P><A HREF="../v2-toc2.html">(back to Table of Contents)</A>
+<P><A HREF="../v2ch0/ack.html"><STRONG>BACK</STRONG></A>
+chapter thread <A HREF="../v2ch2/v2ch2.html"><STRONG>NEXT</STRONG></A>
+
+<P>
+<ADDRESS>
+<A HREF="../index.html">Brian Harvey</A>, 
+<CODE>bh@cs.berkeley.edu</CODE>
+</ADDRESS>
+</BODY>
+</HTML>