about summary refs log blame commit diff stats
path: root/js/games/nluqo.github.io/~bh/ssch25/database
blob: fb6e0fadd8a64743e711eb490e96a3ce1482cc59 (plain) (tree)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880















































































































































































































































































































































































































































































































































































































































































































































































































































































































                                                                                                                                                                                                                                            
<P>

<P><HTML>
<HEAD>
<TITLE>Simply Scheme:Project: A Database Program</TITLE>
</HEAD>
<BODY>
<CITE>Simply Scheme</CITE>:
<CITE>Introducing Computer Science</CITE> 2/e Copyright (C) 1999 MIT
<H1>Project: A Database Program</H1>

<TABLE width="100%"><TR><TD>
<IMG SRC="../simply.jpg" ALT="cover photo">
<TD><TABLE>
<TR><TD align="right"><CITE><A HREF="http://www.cs.berkeley.edu/~bh/">Brian
Harvey</A><BR>University of California, Berkeley</CITE>
<TR><TD align="right"><CITE><A HREF="http://ccrma.stanford.edu/~matt">Matthew
Wright</A><BR>University of California, Santa Barbara</CITE>
<TR><TD align="right"><BR>
<TR><TD align="right"><A HREF="../pdf/ssch25.pdf">Download PDF version</A>
<TR><TD align="right"><A HREF="../ss-toc2.html">Back to Table of Contents</A>
<TR><TD align="right"><A HREF="spread-implement.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../ssch26/part7.html"><STRONG>NEXT</STRONG></A>
<TR><TD align="right"><A HREF="http://mitpress.mit.edu/0262082810">MIT
Press web page for <CITE>Simply Scheme</CITE></A>
</TABLE></TABLE>

<HR>

<P>A <EM>database</EM> is a large file with lots of related data in it.
For example, you might have a database of your local Chinese restaurants,
listing their names, their addresses, and how good their potstickers are,
like this:<A NAME="text0" HREF="database#ft0">[0]</A>

<P><PRE>Name:           Address:                  City:           Potstickers:

Cal's           1866 Euclid               Berkeley        nondescript
Hunan           924 Sansome               San Francisco   none
Mary Chung's    464 Massachusetts Avenue  Cambridge       great
Shin Shin       1715 Solano Avenue        Berkeley        awesome
TC Garden       2507 Hearst Avenue        Berkeley        doughy
Yet Wah         2140 Clement              San Francisco   fantastic
</PRE>

<P>There are six <EM>records</EM> in this database, one for each restaurant.
Each record contains four pieces of information; we say that the
database has four <EM>fields.</EM>

<P>A <EM>database program</EM> is a program that can create, store, modify, and
examine databases.  At the very least, a database program must let you
create new databases, enter records, and save the database to a file.  More
sophisticated operations involve sorting the records in a database by a
particular field, printing out the contents of the database, counting the
number of records that satisfy a certain condition, taking statistics such as
averages, and so on.

<P>There are many commercial database programs available; our version will have
some of the flavor of more sophisticated programs while leaving out a lot of
the details.

<P><H2>A Sample Session with Our Database</H2>

<P>Most database programs come with their own programming language built in.
Our database program will use Scheme itself as the language; you will be able
to perform database commands by invoking Scheme procedures at the Scheme
prompt.  Here is a sample session with our program:

<P><PRE>&gt; (load &quot;database.scm&quot;)
#F
&gt; (new-db &quot;albums&quot; '(artist title year brian-likes?))
CREATED
</PRE>

<P>First we loaded the database program, then we created a new
database called <CODE>&quot;albums&quot;</CODE><A NAME="text1" HREF="database#ft1">[1]</A> with four fields.<A NAME="text2" HREF="database#ft2">[2]</A>
Let's enter some data:

<P><PRE>&gt; (insert)
Value for ARTIST--&gt; (the beatles)
Value for TITLE--&gt; &quot;A Hard Day's Night"
Value for YEAR--&gt; 1964
Value for BRIAN-LIKES?--&gt; #t
Insert another? yes
Value for ARTIST--&gt; (the zombies)
Value for TITLE--&gt; &quot;Odessey and Oracle"
Value for YEAR--&gt; 1967
Value for BRIAN-LIKES?--&gt; #t
Insert another? y
Value for ARTIST--&gt; (frank zappa)
Value for TITLE--&gt; &quot;Hot Rats"
Value for YEAR--&gt; 1970
Value for BRIAN-LIKES?--&gt; #f
Insert another? y
Value for ARTIST--&gt; (the beatles)
Value for TITLE--&gt; &quot;Rubber Soul"
Value for YEAR--&gt; 1965
Value for BRIAN-LIKES?--&gt; #t
Insert another? no
INSERTED
</PRE>

<P>(We used strings for the album titles but sentences for the
artists, partly because one of the titles has an apostrophe in it, but
mainly just to demonstrate that fields can contain any data type.)

<P>At this point we start demonstrating features that aren't actually in the
version of the program that we've provided.  You will implement these
features in this project.  We're showing them now as if the project were
finished to convey the overall flavor of how the program should work.

<P>We can print out the information in a database, and count the number of
records:<A NAME="text3" HREF="database#ft3">[3]</A>

<P><PRE>&gt; (list-db)
RECORD 1
ARTIST: (THE BEATLES)
TITLE: Rubber Soul
YEAR: 1965
BRIAN-LIKES?: #T

RECORD 2
ARTIST: (FRANK ZAPPA)
TITLE: Hot Rats
YEAR: 1970
BRIAN-LIKES?: #F

RECORD 3
ARTIST: (THE ZOMBIES)
TITLE: Odessey and Oracle
YEAR: 1967
BRIAN-LIKES?: #T

RECORD 4
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night
YEAR: 1964
BRIAN-LIKES?: #T

LISTED
&gt; (count-db)
4
</PRE>

<P>We can insert new records into the database later on:

<P><PRE>&gt; (insert)
Value for ARTIST--&gt; (the bill frisell band)
Value for TITLE--&gt; &quot;Where in the World?"
Value for YEAR--&gt; 1991
Value for BRIAN-LIKES?--&gt; #f
Insert another? no
INSERTED
</PRE>

<P>We can sort the records of the database, basing the sorting order on a
particular field:

<P><PRE>&gt; (sort-on 'year)
YEAR

&gt; (list-db)
RECORD 1
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night
YEAR: 1964
BRIAN-LIKES?: #T

RECORD 2
ARTIST: (THE BEATLES)
TITLE: Rubber Soul
YEAR: 1965
BRIAN-LIKES?: #T

RECORD 3
ARTIST: (THE ZOMBIES)
TITLE: Odessey and Oracle
YEAR: 1967
BRIAN-LIKES?: #T

RECORD 4
ARTIST: (FRANK ZAPPA)
TITLE: Hot Rats
YEAR: 1970
BRIAN-LIKES?: #F

RECORD 5
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F

LISTED
</PRE>

<P>We can change the information in a record:

<P><PRE>&gt; (edit-record 1)
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night
YEAR: 1964
BRIAN-LIKES?: #T

Edit which field?  title
New value for TITLE--&gt; &quot;A Hard Day's Night (original soundtrack)"
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night (original soundtrack)
YEAR: 1964
BRIAN-LIKES?: #T

Edit which field?  #f
EDITED
</PRE>

<P>(The <CODE>edit-record</CODE> procedure takes a record number as its
argument.  In this case, we wanted the first record.  Also, the way you stop
editing a record is by entering <CODE>#f</CODE> as the field name.)

<P>Finally, we can save a database to a file and retrieve it later:

<P><PRE>&gt; (save-db)
SAVED

&gt; (load-db &quot;albums&quot;)
LOADED
</PRE>

<P><H2>How Databases Are Stored Internally</H2>

<P>Our program will store a database as a vector of three elements: the file
name associated with the database, a list of the names of the fields of the
database, and a list of records in the database.

<P>Each record of the database is itself a vector, containing values for
the various fields.  (So the length of a record vector depends on the number
of fields in the database.)

<P>Why is each record a vector, but the collection of records a
list?  Records have to be vectors because they are mutable; the <CODE>edit</CODE>
command lets you change the value of a field for a record.  But there is no
command to replace an entire record with a new one, so the list of records
doesn't have to be mutable.

<P>An advantage of storing the records in a list instead of a vector is that
it's easy to insert new records.  If you've got a new record and a list of
the old records, you simply <CODE>cons</CODE> the new record onto the old ones, and
that's the new list.  You need to mutate the vector that represents the
entire database to contain this new list instead of the old one, but you
don't need to mutate the list itself.

<P>Here's the <CODE>albums</CODE> database we created, as it looks to Scheme:

<P><PRE>#(&quot;albums"
  (ARTIST TITLE YEAR BRIAN-LIKES?)
  (#((THE BEATLES) &quot;A Hard Day's Night (original soundtrack)&quot; 1964 #T)
   #((THE BEATLES) &quot;Rubber Soul&quot; 1965 #T)
   #((THE ZOMBIES) &quot;Odessey and Oracle&quot; 1967 #T)
   #((FRANK ZAPPA) &quot;Hot Rats&quot; 1970 #F)
   #((THE BILL FRISELL BAND) &quot;Where in the World?&quot; 1991 #F)))
</PRE>

<P>We'll treat databases as an abstract data type; here is how we implement it:

<P><PRE>;;; The database ADT: a filename, list of fields and list of records

(define (<A NAME="g1"></A>make-db filename fields records)
  (vector filename fields records))

(define (<A NAME="g2"></A>db-filename db)
  (vector-ref db 0))

(define (<A NAME="g3"></A>db-set-filename! db filename)
  (vector-set! db 0 filename))

(define (<A NAME="g4"></A>db-fields db)
  (vector-ref db 1))

(define (<A NAME="g5"></A>db-set-fields! db fields)
  (vector-set! db 1 fields))

(define (<A NAME="g6"></A>db-records db)
  (vector-ref db 2))

(define (<A NAME="g7"></A>db-set-records! db records)
  (vector-set! db 2 records))
</PRE>

<P><H2>The Current Database</H2>

<P>The database program works on one database at a time.  Every command
implicitly refers to the current database.  Since the program might
switch to a new database, it has to keep the current database in a
vector that it can mutate if necessary.  For now, the current
database is the only state information that the program keeps, so
it's stored in a vector of length one.  If there is no current
database (for example, when you start the database program), the
value <CODE>#f</CODE> is stored in this vector:

<P><PRE>(define current-state (vector #f))

(define (<A NAME="g8"></A>no-db?)
  (not (vector-ref current-state 0)))

(define (<A NAME="g9"></A>current-db)
  (if (no-db?)
      (error &quot;No current database!&quot;)
      (vector-ref current-state 0)))

(define (<A NAME="g10"></A>set-current-db! db)
  (vector-set! current-state 0 db))

(define (<A NAME="g11"></A>current-fields)
  (db-fields (current-db)))
</PRE>

<P><H2>Implementing the Database Program Commands</H2>

<P>Once we have the basic structure of the database program, the work consists
of inventing the various database operations.  Here is the <CODE>new-db</CODE>
procedure:

<P><PRE>(define (<A NAME="g12"></A>new-db filename fields)
  (set-current-db! (make-db filename fields '()))
  'created)
</PRE>

<P>(Remember that when you first create a database there are no
records in it.)

<P>Here's the <CODE>insert</CODE> procedure:

<P><PRE>(define (<A NAME="g13"></A>insert)
  (let ((new-record (get-record)))
    (db-insert new-record (current-db)))
  (if (ask &quot;Insert another? &quot;)
      (insert)
      'inserted))

(define (<A NAME="g14"></A>db-insert record db)
  (db-set-records! db (cons record (db-records db))))

(define (<A NAME="g15"></A>get-record)
  (get-record-loop 0
		   (make-vector (length (current-fields)))
		   (current-fields)))

(define (<A NAME="g16"></A>get-record-loop which-field record fields)
  (if (null? fields)
      record
      (begin (display &quot;Value for &quot;)
	     (display (car fields))
	     (display &quot;--&gt; &quot;)
	     (vector-set! record which-field (read))
	     (get-record-loop (+ which-field 1) record (cdr fields)))))

(define (<A NAME="g17"></A>ask question)
  (display question)
  (let ((answer (read)))
    (cond ((equal? (first answer) 'y) #t)
	  ((equal? (first answer) 'n) #f)
	  (else (show &quot;Please type Y or N.&quot;)
		(ask question)))))
</PRE>

<P><H2>Additions to the Program</H2>

<P>The database program we've shown so far has the structure of a more
sophisticated program, but it's missing almost every feature you'd want it
to have.  Some of the following additions are ones that we've demonstrated,
but for which we haven't provided an implementation; others are introduced
here for the first time.

<P>In all of these additions, think about possible error conditions and how to
handle them.  Try to find a balance between failing even on errors that are
very likely to occur and having an entirely safe program that has more error
checking than actual content.

<H3><CODE>Count-db</CODE></H3>

<P>Implement the <CODE><A NAME="g18"></A>count-db</CODE> procedure.  It should take no arguments,
and it should return the number of records in the current database.

<H3><CODE>List-db</CODE></H3>

<P>Implement the <CODE><A NAME="g19"></A>list-db</CODE> procedure.  It should take no arguments,
and it should print the current database in the format shown earlier.

<H3><CODE>Edit-record</CODE></H3>

<P>Implement <CODE><A NAME="g20"></A>edit-record,</CODE> which takes a number between one and the
number of records in the current database as its argument.  It should allow
the user to interactively edit the given record of the current database, as
shown earlier.

<H3><CODE>Save-db</CODE> and </CODE>Load-db</CODE></H3>

<P>Write <CODE><A NAME="g21"></A>save-db</CODE> and <CODE><A NAME="g22"></A>load-db.</CODE> <CODE>Save-db</CODE> should
take no arguments and should save the current database into a file with the
name that was given when the database was created.  Make sure to save the
field names as well as the information in the records.

<P><CODE>Load-db</CODE> should take one argument, the filename of the database you
want to load.  It should replace the current database with the one in the
specified file.  (Needless to say, it should expect files to be in the
format that <CODE>save-db</CODE> creates.)

<P>In order to save information to a file in a form that Scheme will be able to
read back later, you will need to use the <CODE>write</CODE> procedure instead of
<CODE>display</CODE> or <CODE>show</CODE>, as discussed in Chapter 22.

<H3><CODE>Clear-current-db!</CODE></H3>

<P>The <CODE>new-db</CODE> and <CODE>load-db</CODE> procedures change the current database.
<CODE>New-db</CODE> creates a new, blank database, while <CODE>load-db</CODE> reads in an
old database from a file.  In both cases, the program just throws out
the current database.  If you forgot to save it, you could lose a lot of
work.

<P>Write a procedure <CODE><A NAME="g23"></A>clear-current-db!</CODE> that clears the current
database.  If there is no current database, <CODE>clear-current-db!</CODE> should
do nothing.  Otherwise, it should ask the user whether to save the
database, and if so it should call <CODE>save-db</CODE>.

<P>Modify <CODE>new-db</CODE> and <CODE>load-db</CODE> to invoke <CODE>clear-current-db!</CODE>.

<H3><CODE>Get</CODE></H3>

<P>Many of the kinds of things that you would want to do to a database involve
looking up the information in a record by the field name.  For example, the
user might want to list only the artists and titles of the album
database, or sort it by year, or list only the albums that Brian likes.

<P>But this isn't totally straightforward, since a record doesn't contain any
information about names of fields.  It doesn't make sense to ask what value
the <CODE>price</CODE> field has in the record

<P><PRE>#(SPROCKET 15 23 17 2)
</PRE>

<P>without knowing the names of the fields of the current database
and their order.

<P>Write a procedure <CODE><A NAME="g24"></A>get</CODE> that takes two arguments, a field name
and a record, and returns the given field of the given record.  It should
work by looking up the field name in the list of field names of the current
database.

<P><PRE>&gt; (get 'title '#((the zombies) &quot;Odessey and Oracle&quot; 1967 #t))
&quot;Odessey and Oracle"
</PRE>

<P><CODE>Get</CODE> can be thought of as a selector for the record data type.  To
continue the implementation of a record ADT, write a constructor <CODE>blank-record</CODE> that takes no arguments and returns a record with no values in
its fields.  (Why doesn't <CODE>blank-record</CODE> need any arguments?)  Finally,
write the mutator <CODE>record-set!</CODE> that takes three arguments: a field
name, a record, and a new value for the corresponding field.

<P>Modify the rest of the database program to use this ADT instead of directly
manipulating the records as vectors.

<H3><CODE>Sort</CODE></H3>

<P>Write a <CODE><A NAME="g25"></A>sort</CODE> command that takes a predicate as its argument and
sorts the database according to that predicate.  The predicate should take
two records as arguments and return <CODE>#t</CODE> if the first record belongs
before the second one, or <CODE>#f</CODE> otherwise.  Here's an example:

<P><PRE>&gt; (sort (lambda (r1 r2) (before? (get 'title r1) (get 'title r2))))
SORTED

&gt; (list-db)
RECORD 1
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night (original soundtrack)
YEAR: 1964
BRIAN-LIKES?: #T

RECORD 2
ARTIST: (FRANK ZAPPA)
TITLE: Hot Rats
YEAR: 1970
BRIAN-LIKES?: #F

RECORD 3
ARTIST: (THE ZOMBIES)
TITLE: Odessey and Oracle
YEAR: 1967
BRIAN-LIKES?: #T

RECORD 4
ARTIST: (THE BEATLES)
TITLE: Rubber Soul
YEAR: 1965
BRIAN-LIKES?: #T

RECORD 5
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F

LISTED

&gt; (sort (lambda (r1 r2) (&lt; (get 'year r1) (get 'year r2))))
SORTED

&gt; (list-db)
RECORD 1
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night (original soundtrack)
YEAR: 1964
BRIAN-LIKES?: #T

RECORD 2
ARTIST: (THE BEATLES)
TITLE: Rubber Soul
YEAR: 1965
BRIAN-LIKES?: #T

RECORD 3
ARTIST: (THE ZOMBIES)
TITLE: Odessey and Oracle
YEAR: 1967
BRIAN-LIKES?: #T

RECORD 4
ARTIST: (FRANK ZAPPA)
TITLE: Hot Rats
YEAR: 1970
BRIAN-LIKES?: #F

RECORD 5
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F

LISTED
</PRE>

<P>Note: Don't invent a sorting algorithm for this problem.  You can
just use one of the sorting procedures from Chapter 15 and modify it
slightly to sort a list of records instead of a sentence of words.

<H3><CODE>Sort-on-by</CODE></H3>

<P>Although <CODE>sort</CODE> is a very general-purpose tool, the way that you have
to specify how to sort the database is cumbersome.  Write a procedure
<CODE>sort-on-by</CODE> that takes two arguments, the name of a field and a
predicate.  It should invoke <CODE>sort</CODE> with an appropriate predicate to
achieve the desired sort.  For example, you could say

<P><PRE>(sort-on-by 'title before?)
</PRE>

<P>and

<P><PRE>(sort-on-by 'year &lt;)
</PRE>

<P>instead of the two <CODE>sort</CODE> examples we showed earlier.

<H3><CODE>Generic-before?</CODE></H3>

<P>The next improvement is to eliminate the need to specify a predicate
explicitly.  Write a procedure <CODE><A NAME="g26"></A>generic-before?</CODE> that takes two
arguments of any types and returns <CODE>#t</CODE> if the first comes before the
second.  The meaning of &quot;before&quot; depends on the types of the arguments:

<P>If the arguments are numbers, <CODE>generic-before?</CODE> should use <CODE>&lt;</CODE>.
If the arguments are words that aren't numbers, then <CODE>generic-before?</CODE> 
should use <CODE>before?</CODE> to make the comparison.

<P>What if the arguments are lists?  For example, suppose you want to sort on
the <CODE>artist</CODE> field in the albums example.  The way to compare two lists
is element by element, just as in the <CODE>sent-before?</CODE> procedure in
Chapter 14.

<P><PRE>&gt; (generic-before? '(magical mystery tour) '(yellow submarine))
#T
&gt; (generic-before? '(is that you?) '(before we were born))
#F
&gt; (generic-before? '(bass desires) '(bass desires second sight))
#T
</PRE>

<P>But <CODE>generic-before?</CODE> should also work for structured lists:

<P><PRE>&gt; (generic-before? '(norwegian wood (this bird has flown))
		   '(norwegian wood (tastes so good)))
#F
</PRE>

<P>What if the two arguments are of different types?  If you're comparing a
number and a non-numeric word, compare them alphabetically.  If you're
comparing a word to a list, treat the word as a one-word list, like
this:

<P><PRE>&gt; (generic-before? '(in line) 'rambler)
#T

&gt; (generic-before? '(news for lulu) 'cobra)
#F
</PRE>

<H3><CODE>Sort-on</CODE></H3>

<P>Now write <CODE><A NAME="g27"></A>sort-on</CODE>, which takes the name of a field as its
argument and sorts the current database on that field, using <CODE>generic-before?</CODE> as the comparison predicate.

<H3><CODE>Add-field</CODE></H3>

<P>Sometimes you discover that you don't have enough fields in your database.
Write a procedure <CODE><A NAME="g28"></A>add-field</CODE> that takes two arguments: the
name of a new field and an initial value for that field.  <CODE>Add-field</CODE> 
should modify the current database to include the new field.  Any existing
records in the database should be given the indicated initial value for the
field.  Here's an example:

<P><PRE>&gt; (add-field 'category 'rock)
ADDED

&gt; (edit-record 5)
CATEGORY: ROCK
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F

Edit which field?  category
New value for CATEGORY--&gt; jazz
CATEGORY: JAZZ
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F

Edit which field?  #f
EDITED

&gt; (list-db)
RECORD 1
CATEGORY: ROCK
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night (original soundtrack)
YEAR: 1964
BRIAN-LIKES?: #T

RECORD 2
CATEGORY: ROCK
ARTIST: (THE BEATLES)
TITLE: Rubber Soul
YEAR: 1965
BRIAN-LIKES?: #T

RECORD 3
CATEGORY: ROCK
ARTIST: (THE ZOMBIES)
TITLE: Odessey and Oracle
YEAR: 1967
BRIAN-LIKES?: #T

RECORD 4
CATEGORY: ROCK
ARTIST: (FRANK ZAPPA)
TITLE: Hot Rats
YEAR: 1970
BRIAN-LIKES?: #F

RECORD 5
CATEGORY: JAZZ
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F

LISTED
</PRE>

<P>If you like, you can write <CODE>add-field</CODE> so that it will accept either
one or two arguments.  If given only one argument, it should use <CODE>#f</CODE>
as the default field value.

<P>Note: We said earlier that each record is a vector but the
collection of records is a list because we generally want to mutate fields
in a record, but not add new ones, whereas we generally want to add new
records, but not replace existing ones completely.  This problem is an
exception; we're asking you to add an element to a vector.  To do this,
you'll have to create a new, longer vector for each record.  Your program
will probably run slowly as a result.  This is okay because adding fields to
an existing database is very unusual.  Commercial database programs are also
slow at this; now you know why.

<P>You can't solve this problem in a way that respects the current version of
the record ADT.  Think about trying to turn the record

<P><PRE>#((THE BEATLES) &quot;Rubber Soul&quot; 1965 #T)
</PRE>

<P>into the record

<P><PRE>#(ROCK (THE BEATLES) &quot;Rubber Soul&quot; 1965 #T)
</PRE>

<P>It seems simple enough: Make a new record of the correct size, and fill in
all the values of the old fields from the old record.  But does this happen
before or after you change the list of current fields in the database?  If
before, you can't call <CODE>blank-record</CODE> to create a new record of the
correct size.  If after, you can't call <CODE>get</CODE> to extract the field values
of the old record, because the field names have changed.

<P>There are (at least) three solutions to this dilemma.  One is to abandon the
record ADT, since it's turning out to be more trouble than it's worth.
Using the underlying vector tools, it would be easy to transform old-field
records into new-field records.

<P>The second solution is to create another constructor for records, <CODE>adjoin-field</CODE>.  <CODE>Adjoin-field</CODE> would take a record and a new
field value, and would be analogous to <CODE>cons</CODE>.

<P>The last solution is the most complicated, but perhaps the most elegant.
The reason our ADT doesn't work is that <CODE>get</CODE>, <CODE>record-set!</CODE>, and
<CODE>blank-record</CODE> don't just get information from their arguments; they
also examine the current fields of the database.  You could write a new ADT
implementation in which each procedure took a list of fields as an extra
argument. Then <CODE>get</CODE>, <CODE>record-set!</CODE>, and <CODE>blank-record</CODE> could be
implemented in this style:

<P><PRE>(define (get fieldname record)
  (get-with-these-fields fieldname record (current-fields)))
</PRE>

<P><CODE>Add-field</CODE> could use the underlying ADT, and the rest of the
program could continue to use the existing version.

<P>We've put a lot of effort into figuring out how to design this small part of
the overall project.  Earlier we showed you examples in which using an ADT
made it <EM>easy</EM> to modify a program; those were realistic, but it's
also realistic that sometimes making an ADT work can add to the effort.

<H3><CODE>Save-selection</CODE></H3>

<P>Write a <CODE>save-selection</CODE> procedure that's similar to <CODE>save-db</CODE> but
saves only the currently selected records.  It should take a file name as
its argument.

<H3><CODE>Merge-db</CODE></H3>

<P>One of the most powerful operations on a database is to merge it with
another database.  Write a procedure <CODE><A NAME="g29"></A>merge-db</CODE> that takes two
arguments: the file name of another database and the name of a field that
the given database has in common with the current database.  Both databases
(the current one and the one specified) must already be sorted by the given
field.

<P>The effect of the <CODE>merge-db</CODE> command is to add fields from the specified 
database to the records of the current database.  For example, suppose you
had a database called <CODE>&quot;bands&quot;</CODE> with the following information:

<P><PRE>Artist:                   Members:
(rush)                    (geddy alex neil)
(the beatles)             (john paul george ringo)
(the bill frisell band)   (bill hank kermit joey)     
(the zombies)             (rod chris colin hugh paul)
</PRE>

<P>You should be able to do the following (assuming <CODE>&quot;albums&quot;</CODE> is
the current database):

<P><PRE>&gt; (sort-on 'artist)
ARTIST

&gt; (merge-db &quot;bands&quot; 'artist)
MERGED

&gt; (list-db)
RECORD 1
CATEGORY: ROCK
ARTIST: (FRANK ZAPPA)
TITLE: Hot Rats
YEAR: 1970
BRIAN-LIKES?: #F
MEMBERS: #F

RECORD 2
CATEGORY: ROCK
ARTIST: (THE BEATLES)
TITLE: A Hard Day's Night (original soundtrack)
YEAR: 1964
BRIAN-LIKES?: #T
MEMBERS: (JOHN PAUL GEORGE RINGO)

RECORD 3
CATEGORY: ROCK
ARTIST: (THE BEATLES)
TITLE: Rubber Soul
YEAR: 1965
BRIAN-LIKES?: #T
MEMBERS: (JOHN PAUL GEORGE RINGO)

RECORD 4
CATEGORY: JAZZ
ARTIST: (THE BILL FRISELL BAND)
TITLE: Where in the World?
YEAR: 1991
BRIAN-LIKES?: #F
MEMBERS: (BILL HANK KERMIT JOEY)

RECORD 5
CATEGORY: ROCK
ARTIST: (THE ZOMBIES)
TITLE: Odessey and Oracle
YEAR: 1967
BRIAN-LIKES?: #T
MEMBERS: (ROD CHRIS COLIN HUGH PAUL)

LISTED
</PRE>

<P>Since there was no entry for Frank Zappa in the <CODE>&quot;bands&quot;</CODE> database, the <CODE>members</CODE> field was given the default value <CODE>#f</CODE>.  If there are two or more records in the specified database with the
same value for the given field, just take the information from the first one.

<P>(By the way, this problem has a lot in common with the <CODE>join</CODE> procedure
from Exercise <A HREF="../ssch22/files.html#join">22.8</A>.)

<P>This is a complicated problem, so we're giving a hint:

<P><PRE>(define (merge-db other-name common-field)
  (let ((other-db (read-db-from-disk other-name))
	(original-fields (current-fields)))
    (set-current-fields! (merge-fields original-fields
 				       (db-fields other-db)))
    (set-current-records! (merge-db-helper original-fields
					   (current-records)
					   (db-fields other-db)
					   (db-records other-db)
					   common-field))))
</PRE>

<P>This procedure shows one possible overall structure
for the program.  You can fill in the structure by writing the
necessary subprocedures.  (If you prefer to start with a different
overall design, that's fine too.)  Our earlier suggestion about
writing a <CODE>get-with-these-fields</CODE> procedure is relevant here also.

<P>
<H2>Extra Work for Hotshots</H2>

<P>Compare your program to a real database program and see if you can add some
of its features.  For example, many database programs have primitive
facilities for averaging the value of a field over certain records.  Another
feature you might want to try to implement is two-dimensional printing,
in which each column is a field and each row is a record.

<P>
<HR>
<A NAME="ft0" HREF="database#text0">[0]</A> [online-only footnote] Alas, two of these
restaurants are no more.  If you're in the Cal's vicinity, you'll have to
settle for TC Garden; if you're in the Shin Shin vicinity, try Kirin,
1767 Solano Ave.<P>
<A NAME="ft1" HREF="database#text1">[1]</A> The double-quote marks are
necessary because <CODE>albums</CODE> will be used as a filename when we save the
database to a file.<P>
<A NAME="ft2" HREF="database#text2">[2]</A> We don't need a
<CODE>matt-likes?</CODE> field because Matt likes all the albums in this database.<P>
<A NAME="ft3" HREF="database#text3">[3]</A> Readers who are old enough to remember the days before
compact discs may be disturbed by the ambiguity of the word &quot;record,&quot;
which could mean either a database record or a phonograph record.  Luckily,
in our example it doesn't matter, because each database record represents
a phonograph record.  But we intend the word &quot;record&quot; to mean a database
record; we'll say &quot;album&quot; if we mean the musical kind.<P>
<P><A HREF="../ss-toc2.html">(back to Table of Contents)</A><P>
<A HREF="spread-implement.html"><STRONG>BACK</STRONG></A>
chapter thread <A HREF="../ssch26/part7.html"><STRONG>NEXT</STRONG></A>

<P>
<ADDRESS>
<A HREF="../index.html">Brian Harvey</A>, 
<CODE>bh@cs.berkeley.edu</CODE>
</ADDRESS>
</BODY>
</HTML>