Diffstat (limited to 'src/chrtrans')
-rw-r--r-- | src/chrtrans/README.format | 12
-rw-r--r-- | src/chrtrans/README.tables | 18
-rw-r--r-- | src/chrtrans/def7_uni.tbl | 7
3 files changed, 20 insertions, 17 deletions
diff --git a/src/chrtrans/README.format b/src/chrtrans/README.format
index 4ced0a14..5e8d029c 100644
--- a/src/chrtrans/README.format
+++ b/src/chrtrans/README.format
@@ -26,7 +26,7 @@ b) directives:
 start with a keyword which may be abbreviated to one letter (first letter
 must be capitalized), followed by space and a value.
 Currently recognized:
-
+
 OptionName
 The name under which this should appear on the O)ptions screen
 in the list for Display Character Set
@@ -53,7 +53,7 @@ c) character translation definitions:
 0x41 U+0041 U+0391 ...
 and are used for "forward" translation (mapping this charset to Unicode)
- AS WELL AS "back" translation (mapping Unicodes to an 8-bit
+ AS WELL AS "back" translation (mapping Unicodes to an 8-bit
 [incl. 7-bit ASCII] code).
 For the "forward" direction, only the first Unicode is used; for
@@ -63,7 +63,7 @@ c) character translation definitions:
 The above example line would tell the chartrans mechanism:
 "For this charset, code position 65 [hex 0x41] contains Unicode
 U+0041 (LATIN CAPITAL LETTER A). For translation of Unicodes to
- this charset, use byte value 65 [hex 0x41] for U+0041 (LATIN CAPITAL
+ this charset, use byte value 65 [hex 0x41] for U+0041 (LATIN CAPITAL
 LETTER A) as well as for U+0391 (GREEK CAPITAL LETTER ALPHA)."
 [Note that for bytes in the ASCII range 0x00-0x7F, the forward translations
@@ -89,10 +89,10 @@ d) string replacement definitions:
 U+00cd:I'
- which would mean "Replace Unicode U+00cd (LATIN CAPITAL LETTER I WITH
+ which would mean "Replace Unicode U+00cd (LATIN CAPITAL LETTER I WITH
 ACUTE" with the string (consisting of two character) I' (if no other
 translation is available)." Please note that replacement definitions
- in certnain charset table will override ones from Default table.
+ in certain charset table will override ones from the Default table.
 Note that everything after the ':' is currently taken VERBATIM, so
 careful with trailing blanks etc.
@@ -111,7 +111,7 @@ d) string replacement definitions:
 Motivation:
-- It is an extention of the format already in use for Linux (kernel,
+- It is an extension of the format already in use for Linux (kernel,
 kbd package), those files can be used with some minimal editing.
 - It is easy to convert Unicode tables for other charsets, as they
diff --git a/src/chrtrans/README.tables b/src/chrtrans/README.tables
index be6dac6a..b9311729 100644
--- a/src/chrtrans/README.tables
+++ b/src/chrtrans/README.tables
@@ -1,12 +1,12 @@
-The translation table files in this directory are _examples only_.
-They were collected from several sources (among them ftp://ftp.unicode.org,
-Linux kbd package, ftp://dkuug.dk/) and are believed to be correct
-in their mappings, but not checked in detail. The Unicode/UCS2 values
-for some of the RFC 1345 Mnemonic codes are out of date, a cleanup and
-update would be needed for serious use.
+The translation table files in this directory were collected from
+several sources (among them ftp://ftp.unicode.org, Linux kbd package,
+ftp://dkuug.dk/) and are believed to be correct in their mappings,
+but not checked in detail. The Unicode/UCS2 values
+for some of the RFC 1345 Mnemonic codes are out of date,
+a cleanup and update would be needed for serious use.
 More translation files can be easily provided (and new character entities
-added to entities.h), this set is just to test whether the system works
+added to entities.h), this set is just to test whether the system works
 in principle (and also how it behaves with incomplete data...)
 See the file README.format for a brief explanation of what's in the
@@ -27,7 +27,7 @@ charset known to Lynx) you currently have to manually edit UCdomap.c, in
 two places:
 a) Near the top, you will find a bunch of lines (some may be commented out)
-
+
 #include "<fn>.h"
 Add or comment out as you wish. But it is probably safest to leave the
@@ -44,7 +44,7 @@ did under a)...)
 [The <something> is derived from the charset's MIME name. if in doubt,
 check the last lines of the corresponding ...uni.h file.]
 c) To let make automatically notice when you have changed one of the
- table files, and automatically regenerate the *uni.h file(s),
+ table files, and automatically regenerate the *uni.h file(s),
 you also have to add any new tables to both src/Makefile *and*
 src/chrtrans/Makefile. Or, for auto-config, the equivalent files named
 makefile.in before running ./configure, or makefile after running
diff --git a/src/chrtrans/def7_uni.tbl b/src/chrtrans/def7_uni.tbl
index 66a63f76..01b86d7a 100644
--- a/src/chrtrans/def7_uni.tbl
+++ b/src/chrtrans/def7_uni.tbl
@@ -1754,6 +1754,7 @@ U+266e:Mx
 U+266f:#
 0x58 U+2713 U+2717 # check marks -> x
 U+2720:-X
+# CJK area:
 0x20 U+3000 # ideographic space
 U+3001:,_
 U+3002:._
@@ -2014,6 +2015,10 @@ U+3229:10c
 U+327f:KSC
 U+33c2:am
 U+33d8:pm
+# end of CJK area (up to U+e000).
+
+# Characters in Private Use Area (e000-f8ff) do not have unassigned numbers.
+
 U+fb00:ff
 U+fb01:fi
 U+fb02:fl
@@ -2209,8 +2214,6 @@ U+009a:SC
 #U+009e:PM
 #U+009f:AC
-# Characters in Private Use Area (e000-f8ff) do not have ussigned numbers.
-
 # Let's try to show a question mark for character that cannot
 # be shown. U+fffd is used for invalid characters.
 # It works, but let's stick with UHHH representatiion. - FM
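
The two table line kinds quoted in the README.format hunks above (character translation definitions such as "0x41 U+0041 U+0391" and string replacement definitions such as "U+00cd:I'") can be illustrated with a small parser. This is only a sketch in Python, not the table compiler Lynx actually uses. The name parse_table is hypothetical, the first-definition-wins handling of repeated entries is an assumption, and directives and any other line syntax are skipped rather than interpreted.

```python
import re

# A character translation line: one byte code, then one or more Unicodes,
# optionally followed by a '#' comment (as in the def7_uni.tbl hunks).
CHAR_DEF = re.compile(
    r"^0x([0-9A-Fa-f]{1,2})((?:\s+U\+[0-9A-Fa-f]{4})+)\s*(?:#.*)?$"
)
# A string replacement line: everything after ':' is taken verbatim.
REPL_DEF = re.compile(r"^U\+([0-9A-Fa-f]{4}):(.*)$")


def parse_table(lines):
    """Hypothetical helper: return (forward, back, replacements) dicts."""
    forward = {}       # byte value -> first Unicode listed for it
    back = {}          # Unicode -> byte value
    replacements = {}  # Unicode -> replacement string
    for raw in lines:
        line = raw.rstrip("\n")
        # Check replacements first so that "U+266f:#" keeps '#' as its string.
        m = REPL_DEF.match(line)
        if m:
            replacements[int(m.group(1), 16)] = m.group(2)
            continue
        m = CHAR_DEF.match(line)
        if m:
            byte = int(m.group(1), 16)
            codes = [int(tok[2:], 16) for tok in m.group(2).split()]
            # Forward direction: only the first Unicode is used.
            # Assumption: an earlier definition for the same byte wins.
            forward.setdefault(byte, codes[0])
            # Back direction: every listed Unicode maps to this byte.
            for cp in codes:
                back.setdefault(cp, byte)
        # Anything else (comments, directives, blank lines) is ignored here.
    return forward, back, replacements


if __name__ == "__main__":
    sample = ["0x41 U+0041 U+0391", "U+00cd:I'", "#U+009e:PM"]
    fwd, bck, rep = parse_table(sample)
    print(fwd, bck, rep)
```

Because the replacement text is kept verbatim, trailing blanks on such a line would end up in the replacement string, which is the caveat the README raises.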