author Thomas E. Dickey <dickey@invisible-island.net> 1998-02-19 10:57:28 -0500
committer Thomas E. Dickey <dickey@invisible-island.net> 1998-02-19 10:57:28 -0500
commit 899516a7c8880df05e30bbbed72ca1d3cb7a4f00
tree 14b895432dc4e84686c36bdeee4c689706af5361 /docs/IBMPC-charsets.announce
parent c82d2a4041724afe1dce249c78c4f034ca6a8d69
snapshot of project "lynx", label v2-7-1ac-0_115
Diffstat (limited to 'docs/IBMPC-charsets.announce')
-rw-r--r-- docs/IBMPC-charsets.announce 171
1 files changed, 44 insertions, 127 deletions
diff --git a/docs/IBMPC-charsets.announce b/docs/IBMPC-charsets.announce
index 40d2854c..870abe5b 100644
--- a/docs/IBMPC-charsets.announce
+++ b/docs/IBMPC-charsets.announce
@@ -1,92 +1,69 @@
-Mike Brown (mike@hyperreal.com)
--------------------------------
 
 Summary
 =======
-This document describes peculiarities in the way MS-DOS handles character
-sets and provides instructions on how to activate different character sets
-that are ISO-8859 compliant.  This is primarily of utility to people who
-will be using Lynx on a remote UNIX or VMS system via an MS-DOS based
-terminal program.
+This document is primarily for people who will be using Lynx
+on a remote UNIX or VMS system via an MS-DOS based terminal program.
 
 
 General Information
 ===================
-Lynx comes with built-in translation tables to map the 8-bit character codes
-or ISO-8859-x character entities coming in from an HTML document to their
-equivalent codes, where possible, for various character sets, including some
-that are not quite the same as ISO-8859-x.  The translations supported as of
-the 09-02-96 Lynx2-6 code include:
-        "ISO Latin 1         " (ISO-8859-1)
-        "ISO Latin 2         " (ISO-8859-2)
-        "Other ISO Latin     "
-        "DEC Multinational   "
-        "IBM PC character set" (CP 437, standard for US)
-        "IBM PC codepage 850 " (ISO-8859-1, but see below!)
-        "Macintosh (8 bit)   "
-        "NeXT character set  "
-        "KOI8-R character set"
-        "Chinese             "
-        "Japanese (EUC)      "
-        "Japanese (SJIS)     "
-        "Korean              "
-        "Taipei (Big5)       "
-        "7 bit approximations"
+Lynx comes with built-in translation tables to map the 8-bit character codes or
+character entities coming in from an HTML document to their equivalent codes,
+where possible, for various character sets.  Choose the display character
+set in the Lynx Options Menu to match the font installed locally.  Please
+contact the lynx-dev mailing list if you need a codepage that is not listed
+there.
 
-Under ideal conditions, when using Lynx through a system that displays one 
-of these character sets, selecting the appropriate character set in the Lynx 
-options will ensure proper display of all characters one might encounter in 
-HTML documents.
-
-Note that all points of the connection between the display at your end and 
-Lynx at the remote end must be 8-bit clean.  If the high bit is being 
-stripped at any point in between, the only character set you can use 
-(effectively) in Lynx will be "7 bit approximations".  More on that later.
+Note that all points of the connection between the display at your end and Lynx
+at the remote end must be 8-bit clean.  If the high bit is being stripped at
+any point in between, the only character set you can use (effectively) in Lynx
+will be "7 bit approximations".  More on that later.
 
 
 MS-DOS character set weirdness
 ==============================
-MS-DOS uses a bass-ackwards character set in which half the normal 
-characters have been replaced by pseudo-graphic line and box-drawing 
-characters, and in which almost all of the international characters are 
-mapped to nonstandard numbers.  It also contains Greek letters.
-
-Further confusing matters, there is more than one MS-DOS character set. 
-The character sets are referred to as "codepages," each of which has a
-unique number.  IBM PCs and compatibles come with one hardware-based
-default codepage and a keyboard to match.  In the US market the hardware
-codepage is 437.  PCs destined for other regions of the world often have a
-different default codepage which contains characters for other languages
-and keyboards.  Under MS-DOS, one can load different codepages into memory
-and use one of them instead of the hardware default.
-
-If you are using Lynx through an MS-DOS based terminal program or telnet 
-client, you should use the "IBM PC character set" in Lynx.  I believe this 
-was written with codepage 437 in mind.  [ what about console displays for a 
-PC-based UNIX?  what about DOSLynx?  I don't know! ]  Also, the Windows
-font "Terminal" is nearly the same as codepage 437.
+MS-DOS uses a bass-ackwards character set in which half the normal characters
+have been replaced by pseudo-graphic line and box-drawing characters, and in
+which almost all of the international characters are mapped to nonstandard
+numbers.  It also contains Greek letters.
+
+Further confusing matters, there is more than one MS-DOS character set.  The
+character sets are referred to as "codepages," each of which has a unique
+number.  IBM PCs and compatibles come with one hardware-based default codepage
+and a keyboard to match.  In the US market the hardware codepage is 437.  PCs
+destined for other regions of the world often have a different default codepage
+which contains characters for other languages and keyboards.  Under MS-DOS, one
+can load different codepages into memory and use one of them instead of the
+hardware default.
+
+If you are using Lynx through an MS-DOS based terminal program or telnet
+client, select the appropriate DOS codepage in Lynx; you do not need any
+translation within the terminal program (this differs from the old behavior
+and works better because of Lynx's superior translation support).
 
 Check your display by accessing Martin Ramsch's ISO-8859-1 table
 (iso8859-1.html in the Lynx distribution's test subdirectory).
 
-Ramsch's table describes each entity and shows examples of each.  It should 
-be immediately obvious that you are either seeing what you are supposed to, 
-or you're not.  If you see box and line-drawing characters and mismatched 
-letters and so on, you are likely displaying 7 bit data, not 8.  Ensure that 
-all points of your connection are 8-bit clean:
+Ramsch's table describes each entity and shows examples of each.  It should be
+immediately obvious that you are either seeing what you are supposed to, or
+you're not.  If you see box and line-drawing characters and mismatched letters
+and so on, you are likely displaying 7 bit data, not 8.  Ensure that all points
+of your connection are 8-bit clean:
 
 	On any remote UNIX systems you must pass through, do 
 		'stty cs8 -istrip' or 'stty pass8'.  'stty -a' should list
 		your settings.
 	On any remote VMS systems, do 'set terminal /eightbit'.
 	Make sure your terminal program or telnet client is not filtering
-		8-bit data.  Note: Procomm for DOS has a confusing "Use 7 bit
-		or 8 bit ANSI" setting -- this has to do with ANSI sequences.
-		If set to 8 bit, some 8-bit character sequences, including
-		those passed by Lynx as well as those which are for your
-		terminal type (vt100, etc.) will be processed by Procomm as
-		ANSI screen control codes and will most likely result in a
-		garbled display.  Set it to 7 bit.
+		8-bit data.  You may find a choice between "VT-100 strict"
+		and "VT-100 relaxed" emulation modes; use relaxed.
+		Note:  Procomm for DOS has a confusing "Use 7 bit or 8 bit
+		ANSI" setting -- this has to do with ANSI sequences.  If set to
+		8 bit, some 8-bit character sequences, including those passed
+		by Lynx as well as those which are for your terminal type
+		(vt100, etc.) will be processed by Procomm as ANSI screen
+		control codes and will most likely result in a garbled display. 
+		Set it to 7 bit.
 	If going through a dialup terminal server, you may have to set the 
 		terminal server itself to pass 8 bit data.  How to do this
 		varies with the make of the server, and in some cases only a
@@ -94,63 +71,3 @@ all points of your connection are 8-bit clean:
 		to do that.
 	SLIP or PPP connections should already be 8-bit clean.
 
-
-Displaying true ISO-8859-1 under MS-DOS
-=======================================
-
-Since there are apparently no ISO-8859-1 EGA/VGA soft fonts (I looked) and
-since such fonts tend to cause problems when switching video modes, the
-next-best alternative is to use MS-DOS 5/6's international codepage
-feature.  I'm fuzzy on the why-how-wherefores, but it works great if you
-do it like this:
-
-        In your config.sys, add a line to make codepage switching possible:
-                devicehigh=c:\dos\display.sys con=(ega,437,1)
-
-	This loads the display driver.  437 is the codepage supported by my 
-	hardware.  Check your MS-DOS documentation and help screens for 
-	more info on what these things do.
-
-        In your autoexec.bat, add lines to load the IBM OEM ISO-Latin1 
-	character set from the ega.cpi collection and switch over to it:
-                mode con cp prep=((850) c:\dos\ega.cpi)
-                mode con cp sel=850
-
-Note that the codepage 850 in ega.cpi is IBM/Microsoft's ISO-Latin1,
-which, although it contains all the right characters, does *not* map them
-to the standard numbers as per ISO-8859-1, and it still preserves some of
-the pseudo-graphic characters.  If you run Procomm for DOS (or just about
-any other application), you'll see that some of the line-drawing
-characters in the title screen and on the dialing/help menus appear as
-international letters.  There's no way around this. 
-
-Once you are using codepage 850, you've still got the problem of the 
-characters being mapped to the wrong numbers.  For example, if Lynx sends 
-your terminal a code for a middle dot, you'll see something other than a 
-middle dot -- maybe an upper-left box-corner (regular codepage) or an A with 
-an accent mark (codepage 850).  There are two possible remedies:
-
-	1. If using a terminal program like Procomm, use its Translation Table
-	to process incoming characters.  On my slow 286, even with a speedy
-	screen driver (nansi or nnansi.sys) installed, this results in a
-	slight (20%) slowdown in the screen write time.  If you still want to
-	give it a try, I found a set of translation tables for ISO-8859-1 ->
-	IBM CP 850 for Procomm and Qmodem in the SimTel archives at:
-		http://oak.oakland.edu:8080/SimTel/msdos/modem/xlate.zip
-
-	2. Have Lynx do the work for you.  I used the information in xlate.zip
-	to create a Lynx character set for codepage 850.  Select it via the
-	'o'ptions menu when running Lynx, and save the choice in your .lynxrc
-	file.
-
-There is another option.  There are actually ISO-8859 compliant codepages
-available at:
-		ftp://ftp.informatik.uni-erlangen.de/pub/doc/ISO/charsets/
-		ftp://nic.funet.fi/pub/doc/charsets/
-
-as part of Kosta Kosis' free ISOCP collection.  You have to use a custom
-keyboard driver (supplied) and you may find that sacrificing all of the
-pseudo-graphic characters may make your terminal program (and many other
-DOS applications) look rather ugly, but at least no translations will be
-necessary -- ISO-8859-[1,2] data received will appear on screen exactly as
-it should with the Lynx "ISO Latin" character sets selected.
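
Reviewer's illustration (not part of the commit): the middle-dot example in
the removed text, and the "8-bit clean" requirement kept in the new text, can
be reproduced with a short sketch.  It assumes Python's standard "latin-1",
"cp437" and "cp850" codecs as stand-ins for the character sets discussed; the
function names are hypothetical, and this is not how Lynx implements its
translation tables.

```python
# Sketch only: shows why one byte displays differently per codepage.
# Function names are hypothetical; the codecs are Python's built-in ones.

MIDDLE_DOT = "\u00b7"  # middle dot, byte 0xB7 in ISO-8859-1


def render(raw: bytes, codec: str) -> str:
    """Show what a terminal using `codec` would display for `raw`."""
    return raw.decode(codec)


def to_codepage(text: str, codec: str) -> bytes:
    """Translate text into the bytes a given codepage expects."""
    return text.encode(codec, errors="replace")


def strip_high_bit(raw: bytes) -> bytes:
    """Simulate a link that is not 8-bit clean (bit 7 cleared)."""
    return bytes(b & 0x7F for b in raw)


if __name__ == "__main__":
    iso_byte = MIDDLE_DOT.encode("latin-1")      # b'\xb7'
    print(render(iso_byte, "latin-1"))           # middle dot, as intended
    print(render(iso_byte, "cp437"))             # a box-drawing corner
    print(render(iso_byte, "cp850"))             # A with grave accent
    print(to_codepage(MIDDLE_DOT, "cp850"))      # b'\xfa', correct for CP 850
    print(strip_high_bit("\u00e9".encode("latin-1")))  # b'i' instead of e-acute
```

Sending the untranslated ISO-8859-1 byte to a DOS codepage display gives the
wrong glyph, while translating to the display's codepage first (what the new
text asks Lynx to do) gives the right one.  The last line shows why a stripped
high bit leaves only 7-bit approximations usable.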