Diffstat (limited to 'docs/README.chartrans')
-rw-r--r--  docs/README.chartrans  164
1 file changed, 164 insertions, 0 deletions
diff --git a/docs/README.chartrans b/docs/README.chartrans
new file mode 100644
index 00000000..a13ef361
--- /dev/null
+++ b/docs/README.chartrans
@@ -0,0 +1,164 @@
+Lynx CHARTRANS
+
+ Features (in addition to those which Lynx 2.7.1 already has):
+
+ - Can (attempt to) translate from any document charset to any display
+   character set, *IF* the document charset is known by a translation
+   table (compiled in at installation).
+
+ - New method to define character sets: used for input charset as well
+   as display character set, translation tables compiled in from
+   separate files (one per charset).  One table is designated as default
+   and can be used for fallback translation to 7-bit replacements for
+   display.
+
+ - New method for specifying translations of SGML entities.
+
+ - Unicode (UTF-8) support: can (attempt to) decode and translate UTF-8 to
+   display character set, or pass through UTF to display (if terminal
+   or console understands UTF-8).  [raw display of UTF only tested with Slang
+   so far, does not always position everything correctly on screen]
+
+ - Support for CHARSET attribute on A tag (and sometimes LINK), as in HTML
+   i18n RFC 2070 and W3C HTML 4.0 drafts.  A link can suggest the target's
+   charset in this way.
+
+ - Support for ACCEPT-CHARSET attribute of FORM tags.
+
+ - EXPERIMENTAL, currently enabled only for Linux console:
+   can (attempt to) automatically switch terminal mode and load new
+   code pages on change of display character set.
+
+ - Some minor changes: invalid characters were sometimes displayed in a hex
+   notation Uxxxx (this helps debugging, and I also regard it as at least
+   no worse than showing the wrong char without warning); now they are not
+   displayed, to reduce garbage.
+
+Additions/changes to user interface:
+
+ - many new Display Character Sets are available on O)ptions screen.
+   (One can use arrow keys, HOME, END etc. for cycling through the list
+   or use selection from popup box, as for other options.)
+
+ - new command line flags:
+   -assume_charset=...  assume this as charset for documents that don't
+                        specify a charset parameter in HTTP headers
+   -assume_local_charset=...  assume this as charset of local file
+   -assume_unrec_charset=...  in case a charset parameter is not recognized.
+   These are also available (and documented) as ASSUME_CHARSET etc. in
+   lynx.cfg.  In "Advanced User" mode, ASSUME_CHARSET can be changed during
+   a session from the Options Screen.
+
+ - The "Raw" toggle (from -raw flag, '@' key, or Options screen)
+   o  toggles the assumption "Default remote charset is same as Display
+      Character Set" on or off.
+      Toggling of the assumed charset is between Display Character Set and
+      the specified ASSUME_CHARSET or, if they are the same, between the
+      specified ASSUME_CHARSET and ISO-8859-1.
+   o  The default for raw mode now depends on the Display Character Set as
+      well as on the specified ASSUME_CHARSET value.
+   o  should work as before for CJK charsets (turning CJK-mode on or off).
+   o  If the effective ASSUME_CHARSET and the Display Character Set are
+      unchanged from the ISO-8859-1 default, toggling "Raw" may have some
+      additional effect for characters that can't be translated.
+   (Try the "Transparent" Display Character Set for more "rawness".)
+
+
+Requirements:  same as for Lynx in general :)
+
+The chartrans code is now merged with Wayne Buttle's changes for
+32-bit MS Windows and DOS/DJGPP, with Thomas Dickey's and Jim Spath's
+emerging auto-configure mechanism, and with BUGFIXES from Foteos
+Macrides.  See the accompanying file CHANGES for the current
+status.
+
+
+A warning:
+In some cases undisplayable bytes may still get sent to the terminal,
+where they are interpreted as control chars.  There is no protection
+against this if strange things are defined in the table files.
+
+
+HOW TO INSTALL:
+
+(4) before compiling:
+
+    Check top level makefile or Makefile and userdefs.h as usual.
+
+    NOTE that there is a new "#define" in userdefs.h for MAX_CHARSETS
+    near the end (in "Section 3.").
+
+(5) Building Lynx:
+
+    Compiling the chartrans code is now integrated into the normal
+    installation procedures for UNIX (configure script) and other
+    platforms.
+
+    What's supposed to happen (in addition to the usual things when
+    building Lynx): in the new subdirectory src/chrtrans, make should
+    first compile the auxiliary program `makeuctb', then invoke that
+    program to create xxxxx_yyy.h files from the provided xxxxx_yyy.tab
+    translation table files.  (See README.* files in src/chrtrans for
+    more info.)
+
+    If all goes well, just invoking make from the top-level Lynx dir
+    as usual should do everything automatically.  If not, the makefiles
+    may need some tweaking... or:
+
+(6) Some things to look at if compilation fails:
+
+    In src/chrtrans/UCkd.h there is a typedef for an unsigned 16bit
+    numeric type which may need to be changed for your system.
+    See comment near top there.
+
+    For recompiling Lynx, `make clean' should not be necessary if only
+    files in src/chrtrans have been changed.  On the other hand, a
+    top-level `make clean' may not propagate to the src/chrtrans directory
+    (depending on how things are going with auto-config); you may have to
+    cd to that directory and `make clean' there to really clean up.
+
+(7) To customize (add/change translation tables etc.):
+
+     See README.* files in src/chrtrans.
+     Make the necessary changes there, then recompile.
+     (A general `make clean' should not be necessary, but make sure
+     the ...uni.h file in src/chrtrans gets regenerated.)
+
+     Note that definitions of new character entities (e.g., if you want
+     Lynx to recognize &Zcaron;) are not covered by these table files;
+     they have to be listed in entities.h.
+
+     _If you are on a Linux system_ and using Lynx on the console (i.e.
+     not xterm, not a dialup *into* the Linux box), you can compile
+     with -DEXP_CHARTRANS_AUTOSWITCH.  This is very useful for testing
+     the various Display Character Sets: Lynx will try to automatically
+     change the console state.  You need to have the Linux kbd package
+     installed, with a working `setfont' command executable by the user,
+     and the right font files - check the source in src/UCAuto.c for
+     the files used and/or to change them!
+     NOTE that with this enabled,
+     - Lynx currently will not clean up the console state at exit; it will
+       probably be left as set for the last Display Character Set you used.
+     - Loading a font is global across _all_ virtual text consoles, so
+       using Lynx (compiled with this flag) may change the appearance of
+       text on other consoles (if that text contains characters
+       beyond US-ASCII).
+
+(8) Some suggested Web pages for testing:
+
+    <URL:  http://www.tezcat.com/~kweide/lynx-chartrans/test/>
+
+    <URL:  http://www.isoc.org:8080/>,
+      especially
+    <URL:  http://www.isoc.org:8080/liste_ml.htm>.
+
+    <URL:  http://www.accentsoft.com/un/un-all.htm>
+
+(9) Please report bugs, unexpected behavior, etc.
+    to <lynx-dev@nongnu.org>.
+
+    Suggestions for improvement would be welcome, as well as
+    contributed translation tables (for stuff that is not available
+    at ftp://dkuug.dk or ftp://ftp.unicode.org).
+
+KW  1997-11-06
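
Note on the "Raw" toggle described in the README above: the rule for which
charset is assumed can be read as a small decision function.  The sketch below
is only one reading of that paragraph, not code from Lynx; the function name
assumed_charset and its parameters are hypothetical, and charset names are
compared as exact strings for simplicity.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not a Lynx function): pick the charset assumed for
 * documents that do not declare one, following the README's description of
 * the "Raw" toggle.
 */
const char *assumed_charset(bool raw, const char *display_charset,
                            const char *assume_charset)
{
    if (raw) {
        /* Raw on: "Default remote charset is same as Display Character Set". */
        return display_charset;
    }
    if (strcmp(display_charset, assume_charset) == 0) {
        /* Both settings agree, so the toggle falls back to ISO-8859-1. */
        return "iso-8859-1";
    }
    /* Raw off: use the specified ASSUME_CHARSET. */
    return assume_charset;
}
```

With a koi8-r display and ASSUME_CHARSET=iso-8859-2, this reading yields
koi8-r with Raw on and iso-8859-2 with Raw off; if both are koi8-r, Raw off
yields iso-8859-1.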