Character Sets
A character set is a list of characters that may appear in a document, and a character encoding is a way of storing these characters on a computer as bits.
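The distinction between a character and its encoded bits can be made concrete with a short sketch. Python is used here only for illustration (the page itself deals with HTML), and the helper name `encode_all` is hypothetical. The same character is stored as different byte sequences depending on the encoding chosen:

```python
# Illustrative sketch: one character, several encodings.
# `encode_all` is a hypothetical helper, not part of any library.

def encode_all(text, encodings=("iso-8859-1", "utf-8", "utf-16-be")):
    """Return a dict mapping each encoding name to the hex form of the bytes."""
    return {name: text.encode(name).hex() for name in encodings}


if __name__ == "__main__":
    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    for name, hex_bytes in encode_all("\u00e9").items():
        print(name, hex_bytes)
```

The character is identical in all three cases; only the stored bytes change, which is why a document has to say which encoding it uses.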
Whenever you develop HTML documents you should declare the encoding you are using, e.g.
<META HTTP-EQUIV="Content-type" CONTENT="text/html; charset=iso-8859-1">
Common encodings include the ISO-8859-x series, SHIFT-JIS and EUC-JP for Japanese, and the various Unicode encodings (UTF-7, UTF-8, UTF-16 etc.). But the most common encoding used in HTML documents is ISO Latin-1 (otherwise known as ISO-8859-1), as this is the encoding used in the HTML 4.0 and XHTML 1.0 specifications developed by the World Wide Web Consortium (W3C).
Windows also uses a range of proprietary encodings (e.g. Windows-1252 Western European) which are similar to some of the popular encodings (most notably ISO Latin-1), but there are a few significant incompatibilities. For example, Windows-1252 assigns printable characters, such as curly quotation marks, to code positions 128-159, whereas in ISO Latin-1 these code positions hold no printable characters.
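The incompatibility in code positions 128-159 can be seen by decoding the same bytes both ways. The following is a minimal sketch in Python, assuming its built-in `iso-8859-1` and `windows-1252` codecs. The function name `decode_both` and the sample bytes are illustrative only:

```python
# Illustrative sketch: the same bytes read as ISO Latin-1 and as Windows-1252.
# `decode_both` is a hypothetical helper, not part of any library.

def decode_both(raw):
    """Decode raw bytes as ISO-8859-1 and as Windows-1252."""
    return raw.decode("iso-8859-1"), raw.decode("windows-1252")


if __name__ == "__main__":
    # 0x93 and 0x94 are curly double quotes in Windows-1252.
    raw = bytes([0x93, 0x48, 0x69, 0x94])
    latin1, cp1252 = decode_both(raw)
    print([hex(ord(c)) for c in latin1])   # code points when read as Latin-1
    print([hex(ord(c)) for c in cp1252])   # code points when read as Windows-1252
```

Read as Windows-1252, the outer bytes become U+201C and U+201D (curly quotes). Read as ISO Latin-1, they map to code points 0x93 and 0x94, which are not printable characters.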
Character references
Character references allow web authors to refer to characters using either a name (a character entity reference) or a number (a numeric character reference).
Character entity references
Character entity references allow you to use a simple, memorable name instead of a number to refer to a character. The benefits are:
The disadvantages are:
The first character entity references were introduced with HTML 3.2 for ISO Latin-1 characters. In HTML 4, the list was extended to include symbols, mathematical symbols and Greek letters plus markup-significant and internationalisation characters.
The syntax for a character entity reference is an ampersand (&) followed by the name of the entity, followed by a semi-colon (;) :
<P>Special offer on &laquo;War &amp; Peace&raquo;. Price only &pound;0.99.</P>
Special offer on «War & Peace». Price only £0.99.
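The ampersand-name-semicolon syntax can also be checked outside a browser. This sketch uses Python's standard `html` module, which knows the HTML entity names. The sample string and the helper name `expand_entities` are assumptions made for illustration:

```python
# Illustrative sketch: expanding character entity references with the
# Python standard library. `expand_entities` is a hypothetical wrapper.
from html import unescape
from html.entities import name2codepoint


def expand_entities(markup):
    """Replace character references in `markup` with the characters they name."""
    return unescape(markup)


if __name__ == "__main__":
    source = "Special offer on &laquo;War &amp; Peace&raquo;. Price only &pound;0.99."
    print(expand_entities(source))
    # Each entity name maps to a code position in the document character set.
    for name in ("laquo", "amp", "raquo", "pound"):
        print(name, name2codepoint[name])
```

The name-to-number table shows that an entity name is only a memorable alias for a code position, for example `pound` for position 163.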
Numeric character references
Numeric character references use a number to refer to a character in the document character set. The number can be either a decimal or hexadecimal number. The benefits are:
The disadvantages are:
The syntax for numeric character references is an ampersand and a hash mark (&#), followed by a number in decimal, or the letter "x" and a number in hexadecimal, followed by a semi-colon (;) :
<P>&#8240; or &#x2030; displays the "per mille sign".</P>
‰ or ‰ displays the "per mille sign".
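The decimal and hexadecimal forms are two spellings of the same code position. A small Python sketch, with the hypothetical helper `numeric_references`, builds both forms for a character and confirms that they decode to the same result:

```python
# Illustrative sketch: building decimal and hexadecimal numeric character
# references. `numeric_references` is a hypothetical helper.
from html import unescape


def numeric_references(char):
    """Return the (decimal, hexadecimal) numeric references for one character."""
    code = ord(char)
    return f"&#{code};", f"&#x{code:X};"


if __name__ == "__main__":
    decimal_ref, hex_ref = numeric_references("\u2030")  # per mille sign
    print(decimal_ref, hex_ref)
    # Both references expand to the same character.
    print(unescape(decimal_ref) == unescape(hex_ref))
```

For the per mille sign this gives 8240 in decimal and 2030 in hexadecimal, the two numbers used in the example above.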
© 2000 cloford.com. All rights reserved.