author Thomas E. Dickey <dickey@invisible-island.net> 1998-02-19 10:57:28 -0500
committer Thomas E. Dickey <dickey@invisible-island.net> 1998-02-19 10:57:28 -0500
commit 899516a7c8880df05e30bbbed72ca1d3cb7a4f00
tree 14b895432dc4e84686c36bdeee4c689706af5361 /docs
parent c82d2a4041724afe1dce249c78c4f034ca6a8d69
snapshot of project "lynx", label v2-7-1ac-0_115
Diffstat (limited to 'docs')
-rw-r--r-- docs/IBMPC-charsets.announce 171
-rw-r--r-- docs/README.chartrans 15
2 files changed, 52 insertions, 134 deletions
diff --git a/docs/IBMPC-charsets.announce b/docs/IBMPC-charsets.announce
index 40d2854c..870abe5b 100644
--- a/docs/IBMPC-charsets.announce
+++ b/docs/IBMPC-charsets.announce
@@ -1,92 +1,69 @@
-Mike Brown (mike@hyperreal.com)
--------------------------------
 
 Summary
 =======
-This document describes peculiarities in the way MS-DOS handles character
-sets and provides instructions on how to activate different character sets
-that are ISO-8859 compliant.  This is primarily of utility to people who
-will be using Lynx on a remote UNIX or VMS system via an MS-DOS based
-terminal program.
+This document is primarily for people who will be using Lynx
+on a remote UNIX or VMS system via an MS-DOS based terminal program.
 
 
 General Information
 ===================
-Lynx comes with built-in translation tables to map the 8-bit character codes
-or ISO-8859-x character entities coming in from an HTML document to their
-equivalent codes, where possible, for various character sets, including some
-that are not quite the same as ISO-8859-x.  The translations supported as of
-the 09-02-96 Lynx2-6 code include:
-        "ISO Latin 1         " (ISO-8859-1)
-        "ISO Latin 2         " (ISO-8859-2)
-        "Other ISO Latin     "
-        "DEC Multinational   "
-        "IBM PC character set" (CP 437, standard for US)
-        "IBM PC codepage 850 " (ISO-8859-1, but see below!)
-        "Macintosh (8 bit)   "
-        "NeXT character set  "
-        "KOI8-R character set"
-        "Chinese             "
-        "Japanese (EUC)      "
-        "Japanese (SJIS)     "
-        "Korean              "
-        "Taipei (Big5)       "
-        "7 bit approximations"
+Lynx comes with built-in translation tables to map the 8-bit character codes or
+character entities coming in from an HTML document to their equivalent codes,
+where possible, for various character sets.  You should choose display
+character set in Lynx Options Menu according to your font installed locally. 
+Please contact lynx-dev mailing list if you want any new codepage not listed
+there.
 
-Under ideal conditions, when using Lynx through a system that displays one 
-of these character sets, selecting the appropriate character set in the Lynx 
-options will ensure proper display of all characters one might encounter in 
-HTML documents.
-
-Note that all points of the connection between the display at your end and 
-Lynx at the remote end must be 8-bit clean.  If the high bit is being 
-stripped at any point in between, the only character set you can use 
-(effectively) in Lynx will be "7 bit approximations".  More on that later.
+Note that all points of the connection between the display at your end and Lynx
+at the remote end must be 8-bit clean.  If the high bit is being stripped at
+any point in between, the only character set you can use (effectively) in Lynx
+will be "7 bit approximations".  More on that later.
 
 
 MS-DOS character set weirdness
 ==============================
-MS-DOS uses a bass-ackwards character set in which half the normal 
-characters have been replaced by pseudo-graphic line and box-drawing 
-characters, and in which almost all of the international characters are 
-mapped to nonstandard numbers.  It also contains Greek letters.
-
-Further confusing matters, there is more than one MS-DOS character set. 
-The character sets are referred to as "codepages," each of which has a
-unique number.  IBM PCs and compatibles come with one hardware-based
-default codepage and a keyboard to match.  In the US market the hardware
-codepage is 437.  PCs destined for other regions of the world often have a
-different default codepage which contains characters for other languages
-and keyboards.  Under MS-DOS, one can load different codepages into memory
-and use one of them instead of the hardware default.
-
-If you are using Lynx through an MS-DOS based terminal program or telnet 
-client, you should use the "IBM PC character set" in Lynx.  I believe this 
-was written with codepage 437 in mind.  [ what about console displays for a 
-PC-based UNIX?  what about DOSLynx?  I don't know! ]  Also, the Windows
-font "Terminal" is nearly the same as codepage 437.
+MS-DOS uses a bass-ackwards character set in which half the normal characters
+have been replaced by pseudo-graphic line and box-drawing characters, and in
+which almost all of the international characters are mapped to nonstandard
+numbers.  It also contains Greek letters.
+
+Further confusing matters, there is more than one MS-DOS character set.  The
+character sets are referred to as "codepages," each of which has a unique
+number.  IBM PCs and compatibles come with one hardware-based default codepage
+and a keyboard to match.  In the US market the hardware codepage is 437.  PCs
+destined for other regions of the world often have a different default codepage
+which contains characters for other languages and keyboards.  Under MS-DOS, one
+can load different codepages into memory and use one of them instead of the
+hardware default.
+
+If you are using Lynx through an MS-DOS based terminal program or telnet
+client, you should use an appropriate DOS codepage in Lynx and you need not any
+translation within terminal program (this is different from old-style behavior
+and works better because of superior Lynx translation support).
 
 Check your display by accessing Martin Ramsch's ISO-8859-1 table
 (iso8859-1.html in the Lynx distribution's test subdirectory).
 
-Ramsch's table describes each entity and shows examples of each.  It should 
-be immediately obvious that you are either seeing what you are supposed to, 
-or you're not.  If you see box and line-drawing characters and mismatched 
-letters and so on, you are likely displaying 7 bit data, not 8.  Ensure that 
-all points of your connection are 8-bit clean:
+Ramsch's table describes each entity and shows examples of each.  It should be
+immediately obvious that you are either seeing what you are supposed to, or
+you're not.  If you see box and line-drawing characters and mismatched letters
+and so on, you are likely displaying 7 bit data, not 8.  Ensure that all points
+of your connection are 8-bit clean:
 
 	On any remote UNIX systems you must pass through, do 
 		'stty cs8 -istrip' or 'stty pass8'.  'stty -a' should list
 		your settings.
 	On any remote VMS systems, do 'set terminal /eightbit'.
 	Make sure your terminal program or telnet client is not filtering
-		8-bit data.  Note: Procomm for DOS has a confusing "Use 7 bit
-		or 8 bit ANSI" setting -- this has to do with ANSI sequences.
-		If set to 8 bit, some 8-bit character sequences, including
-		those passed by Lynx as well as those which are for your
-		terminal type (vt100, etc.) will be processed by Procomm as
-		ANSI screen control codes and will most likely result in a
-		garbled display.  Set it to 7 bit.
+		8-bit data.  You may found the choice between "VT-100 strict"
+		and "VT-100 relaxed" emulation mode - use relaxed.
+		Note:  Procomm for DOS has a confusing "Use 7 bit or 8 bit
+		ANSI" setting -- this has to do with ANSI sequences.  If set to
+		8 bit, some 8-bit character sequences, including those passed
+		by Lynx as well as those which are for your terminal type
+		(vt100, etc.) will be processed by Procomm as ANSI screen
+		control codes and will most likely result in a garbled display. 
+		Set it to 7 bit.
 	If going through a dialup terminal server, you may have to set the 
 		terminal server itself to pass 8 bit data.  How to do this
 		varies with the make of the server, and in some cases only a
@@ -94,63 +71,3 @@ all points of your connection are 8-bit clean:
 		to do that.
 	SLIP or PPP connections should already be 8-bit clean.
 
-
-Displaying true ISO-8859-1 under MS-DOS
-=======================================
-
-Since there are apparently no ISO-8859-1 EGA/VGA soft fonts (I looked) and
-since such fonts tend to cause problems when switching video modes, the
-next-best alternative is to use MS-DOS 5/6's international codepage
-feature.  I'm fuzzy on the why-how-wherefores, but it works great if you
-do it like this:
-
-        In your config.sys, add a line to make codepage switching possible:
-                devicehigh=c:\dos\display.sys con=(ega,437,1)
-
-	This loads the display driver.  437 is the codepage supported by my 
-	hardware.  Check your MS-DOS documentation and help screens for 
-	more info on what these things do.
-
-        In your autoexec.bat, add lines to load the IBM OEM ISO-Latin1 
-	character set from the ega.cpi collection and switch over to it:
-                mode con cp prep=((850) c:\dos\ega.cpi)
-                mode con cp sel=850
-
-Note that the codepage 850 in ega.cpi is IBM/Microsoft's ISO-Latin1,
-which, although it contains all the right characters, does *not* map them
-to the standard numbers as per ISO-8859-1, and it still preserves some of
-the pseudo-graphic characters.  If you run Procomm for DOS (or just about
-any other application), you'll see that some of the line-drawing
-characters in the title screen and on the dialing/help menus appear as
-international letters.  There's no way around this. 
-
-Once you are using codepage 850, you've still got the problem of the 
-characters being mapped to the wrong numbers.  For example, if Lynx sends 
-your terminal a code for a middle dot, you'll see something other than a 
-middle dot -- maybe an upper-left box-corner (regular codepage) or an A with 
-an accent mark (codepage 850).  There are two possible remedies:
-
-	1. If using a terminal program like Procomm, use its Translation Table
-	to process incoming characters.  On my slow 286, even with a speedy
-	screen driver (nansi or nnansi.sys) installed, this results in a
-	slight (20%) slowdown in the screen write time.  If you still want to
-	give it a try, I found a set of translation tables for ISO-8859-1 ->
-	IBM CP 850 for Procomm and Qmodem in the SimTel archives at:
-		http://oak.oakland.edu:8080/SimTel/msdos/modem/xlate.zip
-
-	2. Have Lynx do the work for you.  I used the information in xlate.zip
-	to create a Lynx character set for codepage 850.  Select it via the
-	'o'ptions menu when running Lynx, and save the choice in your .lynxrc
-	file.
-
-There is another option.  There are actually ISO-8859 compliant codepages
-available at:
-		ftp://ftp.informatik.uni-erlangen.de/pub/doc/ISO/charsets/
-		ftp://nic.funet.fi/pub/doc/charsets/
-
-as part of Kosta Kosis' free ISOCP collection.  You have to use a custom
-keyboard driver (supplied) and you may find that sacrificing all of the
-pseudo-graphic characters may make your terminal program (and many other
-DOS applications) look rather ugly, but at least no translations will be
-necessary -- ISO-8859-[1,2] data received will appear on screen exactly as
-it should with the Lynx "ISO Latin" character sets selected. 
diff --git a/docs/README.chartrans b/docs/README.chartrans
index f77a1115..77ee6c61 100644
--- a/docs/README.chartrans
+++ b/docs/README.chartrans
@@ -1,8 +1,8 @@
 Lynx CHARTRANS
 
- Features (in addition to those which Lynx already has):
+ Features (in addition to those which Lynx 2.7.1 already has):
  
-- Can (attempt to) translate from any document charset to any display
+ - Can (attempt to) translate from any document charset to any display
    character set, *IF* the document charset is known by a translation 
    table (compiled in at installation).
 
@@ -23,7 +23,7 @@ Lynx CHARTRANS
    i18n RFC 2070 and W3C HTML 4.0 drafts.  A link can suggest the target's
    charset in this way.
 
--  Support for ACCEPT-CHARSET attribute of FORM tags.
+ - Support for ACCEPT-CHARSET attribute of FORM tags.
 
  - EXPERIMENTAL, currently enabled only for Linux console: 
    can (attempt to) automatically switch terminal mode and load new
@@ -42,9 +42,9 @@ Additions/changes to user interface:
  - new command line flags:
    -assume_charset=...  assume this as charset for documents that don't
                         specify a charset parameter in HTTP headers
-   -assume_unknown_charset=...  in case a charset parameter is not recognized 
-   -assume_local_charset=... assume this as charset of local file: docs 
-   also available as ASSUME_CHARSET etc. in lynx.cfg
+   -assume_local_charset=...  assume this as charset of local file
+   -assume_unrec_charset=...  in case a charset parameter is not recognized;
+   docs also available as ASSUME_CHARSET etc. in lynx.cfg
    In "Advanced User" mode, ASSUME_CHARSET can be changed during a session
    from the Options Screen.
 
@@ -62,6 +62,7 @@ Additions/changes to user interface:
       additional effect for characters that can't be translated.
    (Try the "Transparent" Display Character Set for more "rawness".)
 
+
 Requirements:  same as for Lynx in general :)
 
 The chartrans code is now merged with Wayne Buttle's changes for
@@ -101,7 +102,7 @@ HOW TO INSTALL:
     What's supposed to happen (in addition to the usual things when
     building Lynx): in the new subdirectory src/chrtrans, make should
     first compile the auxiliary program `makeuctb', then invoke that
-    program to create xxxxx_yyy.h files from the provided xxxxx.yyy
+    program to create xxxxx_yyy.h files from the provided xxxxx_yyy.tab
     translation table files.  (See README.* files in src/chrtrans for
     more info.)