<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- X-URL: http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html -->
<!-- Date: Tue, 28 Dec 2004 20:24:09 GMT -->
<!-- Last-Modified: Mon, 15 May 2000 09:37:37 GMT -->
<HTML>
<HEAD>
<TITLE>Martin Ramsch - iso8859-1 table</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<BASE HREF="http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html">
</HEAD>

<BODY> 

<H1 ALIGN=center>iso8859-1 table</H1> 

<PRE>
Description                               Code            Entity name   
===================================       ============    ==============
quotation mark                            &amp;#34;  --> &#34;    &amp;quot;   --> &quot;
ampersand                                 &amp;#38;  --> &#38;    &amp;amp;    --> &amp;
less-than sign                            &amp;#60;  --> &#60;    &amp;lt;     --> &lt;
greater-than sign                         &amp;#62;  --> &#62;    &amp;gt;     --> &gt;

Description                          Char Code            Entity name   
===================================  ==== ============    ==============
non-breaking space                        &amp;#160; --> &#160;    &amp;nbsp;   --> &nbsp;
inverted exclamation                 ¡    &amp;#161; --> &#161;    &amp;iexcl;  --> &iexcl;
cent sign                            ¢    &amp;#162; --> &#162;    &amp;cent;   --> &cent;
pound sterling                       £    &amp;#163; --> &#163;    &amp;pound;  --> &pound;
general currency sign                ¤    &amp;#164; --> &#164;    &amp;curren; --> &curren;
yen sign                             ¥    &amp;#165; --> &#165;    &amp;yen;    --> &yen;
broken vertical bar                  ¦    &amp;#166; --> &#166;    &amp;brvbar; --> &brvbar;
                                             Non-standard &amp;brkbar; --> &brkbar;
section sign                         §    &amp;#167; --> &#167;    &amp;sect;   --> &sect;
umlaut (dieresis)                    ¨    &amp;#168; --> &#168;    &amp;uml;    --> &uml;
                                             Non-standard &amp;die;    --> &die;
copyright                            ©    &amp;#169; --> &#169;    &amp;copy;   --> &copy;
feminine ordinal                     ª    &amp;#170; --> &#170;    &amp;ordf;   --> &ordf;
left angle quote, guillemotleft      «    &amp;#171; --> &#171;    &amp;laquo;  --> &laquo;
not sign                             ¬    &amp;#172; --> &#172;    &amp;not;    --> &not;
soft hyphen                          ­    &amp;#173; --> &#173;    &amp;shy;    --> &shy;
registered trademark                 ®    &amp;#174; --> &#174;    &amp;reg;    --> &reg;
macron accent                        ¯    &amp;#175; --> &#175;    &amp;macr;   --> &macr;
                                             Non-standard &amp;hibar;  --> &hibar;
degree sign                          °    &amp;#176; --> &#176;    &amp;deg;    --> &deg;
plus or minus                        ±    &amp;#177; --> &#177;    &amp;plusmn; --> &plusmn;
superscript two                      ²    &amp;#178; --> &#178;    &amp;sup2;   --> &sup2;
superscript three                    ³    &amp;#179; --> &#179;    &amp;sup3;   --> &sup3;
acute accent                         ´    &amp;#180; --> &#180;    &amp;acute;  --> &acute;
micro sign                           µ    &amp;#181; --> &#181;    &amp;micro;  --> &micro;
paragraph sign                       ¶    &amp;#182; --> &#182;    &amp;para;   --> &para;
middle dot                           ·    &amp;#183; --> &#183;    &amp;middot; --> &middot;
cedilla                              ¸    &amp;#184; --> &#184;    &amp;cedil;  --> &cedil;
superscript one                      ¹    &amp;#185; --> &#185;    &amp;sup1;   --> &sup1;
masculine ordinal                    º    &amp;#186; --> &#186;    &amp;ordm;   --> &ordm;
right angle quote, guillemotright    »    &amp;#187; --> &#187;    &amp;raquo;  --> &raquo;
fraction one-fourth                  ¼    &amp;#188; --> &#188;    &amp;frac14; --> &frac14;
fraction one-half                    ½    &amp;#189; --> &#189;    &amp;frac12; --> &frac12;
fraction three-fourths               ¾    &amp;#190; --> &#190;    &amp;frac34; --> &frac34;
inverted question mark               ¿    &amp;#191; --> &#191;    &amp;iquest; --> &iquest;
capital A, grave accent              À    &amp;#192; --> &#192;    &amp;Agrave; --> &Agrave;
capital A, acute accent              Á    &amp;#193; --> &#193;    &amp;Aacute; --> &Aacute;
capital A, circumflex accent         Â    &amp;#194; --> &#194;    &amp;Acirc;  --> &Acirc;
capital A, tilde                     Ã    &amp;#195; --> &#195;    &amp;Atilde; --> &Atilde;
capital A, dieresis or umlaut mark   Ä    &amp;#196; --> &#196;    &amp;Auml;   --> &Auml;
capital A, ring                      Å    &amp;#197; --> &#197;    &amp;Aring;  --> &Aring;
capital AE diphthong (ligature)      Æ    &amp;#198; --> &#198;    &amp;AElig;  --> &AElig;
capital C, cedilla                   Ç    &amp;#199; --> &#199;    &amp;Ccedil; --> &Ccedil;
capital E, grave accent              È    &amp;#200; --> &#200;    &amp;Egrave; --> &Egrave;
capital E, acute accent              É    &amp;#201; --> &#201;    &amp;Eacute; --> &Eacute;
capital E, circumflex accent         Ê    &amp;#202; --> &#202;    &amp;Ecirc;  --> &Ecirc;
capital E, dieresis or umlaut mark   Ë    &amp;#203; --> &#203;    &amp;Euml;   --> &Euml;
capital I, grave accent              Ì    &amp;#204; --> &#204;    &amp;Igrave; --> &Igrave;
capital I, acute accent              Í    &amp;#205; --> &#205;    &amp;Iacute; --> &Iacute;
capital I, circumflex accent         Î    &amp;#206; --> &#206;    &amp;Icirc;  --> &Icirc;
capital I, dieresis or umlaut mark   Ï    &amp;#207; --> &#207;    &amp;Iuml;   --> &Iuml;
capital Eth, Icelandic               Ð    &amp;#208; --> &#208;    &amp;ETH;    --> &ETH;
                                             Non-standard &amp;Dstrok; --> &Dstrok;
capital N, tilde                     Ñ    &amp;#209; --> &#209;    &amp;Ntilde; --> &Ntilde;
capital O, grave accent              Ò    &amp;#210; --> &#210;    &amp;Ograve; --> &Ograve;
capital O, acute accent              Ó    &amp;#211; --> &#211;    &amp;Oacute; --> &Oacute;
capital O, circumflex accent         Ô    &amp;#212; --> &#212;    &amp;Ocirc;  --> &Ocirc;
capital O, tilde                     Õ    &amp;#213; --> &#213;    &amp;Otilde; --> &Otilde;
capital O, dieresis or umlaut mark   Ö    &amp;#214; --> &#214;    &amp;Ouml;   --> &Ouml;
multiply sign                        ×    &amp;#215; --> &#215;    &amp;times;  --> &times;
capital O, slash                     Ø    &amp;#216; --> &#216;    &amp;Oslash; --> &Oslash;
capital U, grave accent              Ù    &amp;#217; --> &#217;    &amp;Ugrave; --> &Ugrave;
capital U, acute accent              Ú    &amp;#218; --> &#218;    &amp;Uacute; --> &Uacute;
capital U, circumflex accent         Û    &amp;#219; --> &#219;    &amp;Ucirc;  --> &Ucirc;
capital U, dieresis or umlaut mark   Ü    &amp;#220; --> &#220;    &amp;Uuml;   --> &Uuml;
capital Y, acute accent              Ý    &amp;#221; --> &#221;    &amp;Yacute; --> &Yacute;
capital THORN, Icelandic             Þ    &amp;#222; --> &#222;    &amp;THORN;  --> &THORN;
small sharp s, German (sz ligature)  ß    &amp;#223; --> &#223;    &amp;szlig;  --> &szlig;
small a, grave accent                à    &amp;#224; --> &#224;    &amp;agrave; --> &agrave;
small a, acute accent                á    &amp;#225; --> &#225;    &amp;aacute; --> &aacute;
small a, circumflex accent           â    &amp;#226; --> &#226;    &amp;acirc;  --> &acirc;
small a, tilde                       ã    &amp;#227; --> &#227;    &amp;atilde; --> &atilde;
small a, dieresis or umlaut mark     ä    &amp;#228; --> &#228;    &amp;auml;   --> &auml;
small a, ring                        å    &amp;#229; --> &#229;    &amp;aring;  --> &aring;
small ae diphthong (ligature)        æ    &amp;#230; --> &#230;    &amp;aelig;  --> &aelig;
small c, cedilla                     ç    &amp;#231; --> &#231;    &amp;ccedil; --> &ccedil;
small e, grave accent                è    &amp;#232; --> &#232;    &amp;egrave; --> &egrave;
small e, acute accent                é    &amp;#233; --> &#233;    &amp;eacute; --> &eacute;
small e, circumflex accent           ê    &amp;#234; --> &#234;    &amp;ecirc;  --> &ecirc;
small e, dieresis or umlaut mark     ë    &amp;#235; --> &#235;    &amp;euml;   --> &euml;
small i, grave accent                ì    &amp;#236; --> &#236;    &amp;igrave; --> &igrave;
small i, acute accent                í    &amp;#237; --> &#237;    &amp;iacute; --> &iacute;
small i, circumflex accent           î    &amp;#238; --> &#238;    &amp;icirc;  --> &icirc;
small i, dieresis or umlaut mark     ï    &amp;#239; --> &#239;    &amp;iuml;   --> &iuml;
small eth, Icelandic                 ð    &amp;#240; --> &#240;    &amp;eth;    --> &eth;
small n, tilde                       ñ    &amp;#241; --> &#241;    &amp;ntilde; --> &ntilde;
small o, grave accent                ò    &amp;#242; --> &#242;    &amp;ograve; --> &ograve;
small o, acute accent                ó    &amp;#243; --> &#243;    &amp;oacute; --> &oacute;
small o, circumflex accent           ô    &amp;#244; --> &#244;    &amp;ocirc;  --> &ocirc;
small o, tilde                       õ    &amp;#245; --> &#245;    &amp;otilde; --> &otilde;
small o, dieresis or umlaut mark     ö    &amp;#246; --> &#246;    &amp;ouml;   --> &ouml;
division sign                        ÷    &amp;#247; --> &#247;    &amp;divide; --> &divide;
small o, slash                       ø    &amp;#248; --> &#248;    &amp;oslash; --> &oslash;
small u, grave accent                ù    &amp;#249; --> &#249;    &amp;ugrave; --> &ugrave;
small u, acute accent                ú    &amp;#250; --> &#250;    &amp;uacute; --> &uacute;
small u, circumflex accent           û    &amp;#251; --> &#251;    &amp;ucirc;  --> &ucirc;
small u, dieresis or umlaut mark     ü    &amp;#252; --> &#252;    &amp;uuml;   --> &uuml;
small y, acute accent                ý    &amp;#253; --> &#253;    &amp;yacute; --> &yacute;
small thorn, Icelandic               þ    &amp;#254; --> &#254;    &amp;thorn;  --> &thorn;
small y, dieresis or umlaut mark     ÿ    &amp;#255; --> &#255;    &amp;yuml;   --> &yuml;
</PRE>
<!-- removed: second /PRE, a hack for HotJava 1.0 preBeta 1 -->
<HR>

<STRONG>How to read</STRONG> this table.  The columns are
<DL COMPACT>
<DT>1st:<DD>textual <EM>description</EM> of the character
<DT>2nd:<DD>character inserted directly into the HTML page as <EM>one
            byte</EM>
<DT>3rd:<DD>character written as <EM>numeric HTML entity</EM>, in the
            format:<BR>"how it looks literally" <CODE>--&gt;</CODE>
            "what your browser does with it"
<DT>4th:<DD>character written as <EM>symbolic HTML entity</EM>, in the
            format:<BR>"how it looks literally" <CODE>--&gt;</CODE>
            "what your browser does with it"
</DL>

So, for example, if you see something like "<CODE>&amp;divide; --&gt;
&amp;divide;</CODE>" in the 4th column, your browser
does not know the entity name "divide" and simply displays it
literally.
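<P>
As a small illustration of the three notations, here is a sketch using
the division sign from the table (the arithmetic around it is invented
for the example). In a browser that reads the page as ISO 8859-1 and
knows the entity name, these three lines of HTML source should render
identically:

<PRE>
&lt;!-- the character itself, typed directly as the single byte 247 --&gt;
6 ÷ 3 = 2
&lt;!-- the same character as a numeric entity --&gt;
6 &amp;#247; 3 = 2
&lt;!-- the same character as a symbolic entity --&gt;
6 &amp;divide; 3 = 2
</PRE>

The numeric form does not depend on the browser knowing any entity
name, which is why the 3rd column is the safer comparison when the
4th column shows a name literally.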

<P>
<STRONG>This table</STRONG> grew out of an overview of the "ISO
Latin-1 Character Set" related to the Hyper-G Text Format
(<A HREF="http://www.hyperwave.de/HTFdoc">HTF</A>).

The entity names <CODE>&amp;brkbar;</CODE> and <CODE>&amp;Dstrok;</CODE>
seem to be unique to HTF.

The entity name <CODE>&amp;hibar;</CODE> was supported by X Mosaic
but seems to have been replaced by <CODE>&amp;macr;</CODE>.

The entity names <CODE>&amp;uml;</CODE> and <CODE>&amp;die;</CODE> should
be equivalent.

<P><STRONG>The standards stuff:</STRONG>
The 
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html-spec/">HTML 2.0 Standard</A>
includes a section on
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_9.html#SEC99">Character Entity Sets</A>
and an overview of the
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_13.html#SEC106">HTML Coded Character Set</A>
(The entity names are derived from <A HREF="http://www.ucc.ie/info/net/isolat1.html">ISO 8879</A>).
<BR>

Or have a look at the
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html3/latin1.html">Latin-1 Character Entities</A>
as listed in a draft for the
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html3/CoverPage.html">HTML 3.0 specification</A>.
<BR>

The
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_59.html">Appendix II</A>
of CERN's
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_1.html">HTML+ Discussion Document</A>
contains a
<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_table.ps">table</A>
(in PostScript format) of the proposed character entities for HTML+ and their
corresponding character codes for Unicode and the Adobe Latin-1 &amp; Symbol
character sets.
<P>

<STRONG>Please note</STRONG> that there is nothing wrong with using
characters of ISO Latin-1 above 127: the normal transmission protocol
for the WWW,
<A HREF="http://www.w3.org/pub/WWW/Protocols/rfc1945/rfc1945">HTTP/1.0</A>,
uses 8-bit ISO Latin-1 as its default encoding.
(Thanks to Roman 
Czyborra for pointing this out!)
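<P>
A page that contains such 8-bit characters can also state its encoding
explicitly rather than rely on the default. The following is only a
minimal sketch, modelled on the META declaration in this page's own
header; the title and body text are invented for the example:

<PRE>
&lt;HTML&gt;
&lt;HEAD&gt;
&lt;TITLE&gt;Latin-1 example&lt;/TITLE&gt;
&lt;!-- declares that the bytes of this page are ISO 8859-1 --&gt;
&lt;META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"&gt;
&lt;/HEAD&gt;
&lt;BODY&gt;
&lt;!-- typed directly: each accented letter is one byte above 127 --&gt;
Grüße aus München
&lt;/BODY&gt;
&lt;/HTML&gt;
</PRE>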
<P>

<STRONG>Other information:</STRONG>
<UL>

<LI><STRONG>Kevin J. Brewer</STRONG> has written two very good pages on the subject:
  <UL>
   <LI><A HREF="http://www.bbsinc.com/iso8859.html">ASCII - ISO 8859-1 (Latin-1) with HTML 3.0 Entities Table</A> and
   <LI><A HREF="http://www.bbsinc.com/iso8879.html">ISO 8879 Entities Gopher Menu</A>
  </UL>

<LI>The excellent overview of the series of
    <A HREF="http://czyborra.com/charsets/iso8859.html">ISO 8859
    character sets</A> compiled by Roman Czyborra.

<LI>Also have a look at Alan Flavell's page of
    <A HREF="http://ppewww.ph.gla.ac.uk/%7Eflavell/iso8859/iso8859-pointers.html">pointers
    to information about ISO8859</A>. It's very well written!

<LI>Also possibly of interest is the
    <A HREF="ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/FAQ-ISO-8859-1">ISO 
     8859-1 FAQ</A> by Michael Gschwind
    (<A HREF="mailto:mike@vlsivie.tuwien.ac.at">mike@vlsivie.tuwien.ac.at</A>),
    part of his page on
    <A HREF="http://www.vlsivie.tuwien.ac.at/mike/i18n.html">Internationalization</A>.

<LI>For users of X11R5 on SunOS systems: the
    <A HREF="Compose.txt">table of the compose combinations</A>
    (also coded <A HREF="Compose.html">with entities</A> where possible).
     It's taken from the MIT X sources in
     <CODE>server/ddx/sun/Compose.list</CODE>.

<LI>Finally you could have a look at
    <A HREF="ftp://ds.internic.net/rfc/rfc1345.txt">RFC 1345: 
     Character Mnemonics &amp; Character Sets</A>
     by K. Simonsen (06/11/92, 103 pages, approx. 240 kbyte).

</UL>


<HR>

<ADDRESS><A HREF="http://ramsch.home.pages.de/">Martin Ramsch</A>, 16.02.1994, 07.01.1996, 01.07.1996, 1998-10-09, 2000-05-15</ADDRESS>

</BODY>
</HTML>