SIST-TP CEN/TR 14381:2003
Information technology - Character repertoire and coding transformations - European fallback rules
Multilingual fallbacks for European characters, applicable in a multilingual pan-European environment. Harmonises the work of all bodies dealing with standardised fallbacks.
Informacijska tehnologija – Nabor znakov in kodne pretvorbe – Evropska pravila za njihov nadomestni prikaz
SLOVENIAN STANDARD
1 October 2003
Informacijska tehnologija – Nabor znakov in kodne pretvorbe – Evropska pravila za njihov nadomestni prikaz
Information technology - Character repertoire and coding transformations - European fallback rules
This Slovenian standard is identical to: CEN/TR 14381:2003
ICS: 35.040 Character sets and information coding
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.
TECHNICAL REPORT
CEN/TR 14381
RAPPORT TECHNIQUE
TECHNISCHER BERICHT
April 2003
ICS 35.040
English version
Information technology – Character repertoire and coding
transformations – European fallback rules
This Technical Report was approved by CEN on 2 June 2002. It has been drawn up by the Technical Committee CEN/TC 304.
CEN members are the national standards bodies of Austria, Belgium, Czech Republic, Denmark, Finland, France, Germany, Greece,
Hungary, Iceland, Ireland, Italy, Luxembourg, Malta, Netherlands, Norway, Portugal, Slovakia, Spain, Sweden, Switzerland and United
Kingdom.
EUROPEAN COMMITTEE FOR STANDARDIZATION
COMITÉ EUROPÉEN DE NORMALISATION
EUROPÄISCHES KOMITEE FÜR NORMUNG
Management Centre: rue de Stassart, 36 B-1050 Brussels
© 2003 CEN All rights of exploitation in any form and by any means reserved worldwide for CEN national Members.
Ref. No. CEN/TR 14381:2003 E
Contents
Foreword
0 Introduction
0.1 Rationale for the provision of fallback rules
0.2 Basic concepts
0.3 Requirements
0.4 Satisfying the requirements
1. Scope and field of application
1.1 Scope
1.2 Field of application
2. Normative references
3. Definitions and abbreviations
3.1 Basic definitions
3.2 Other definitions
3.3 Abbreviations
4 Specification of the general fallback rules
Annex 1 The list of fallback specification per character
Annex II Examples of fallback representation of text in different languages and scripts
II.1 Multilingual original text
II.2 Multilingual text with fallbacks
Annex III Notes on fallback for Latin, Greek and Cyrillic characters
III.1 Fallback from extended Latin characters
III.2 Fallback from Greek characters to Latin characters
III.3. Fallback with a one-to-many transliteration
III.4 Fallback with a one-to-many transcription
III.5. Restoring Greek text from Latin script fallback text
III.6. Fallback with a one-to-one transliteration
III.7 Fallback from Cyrillic characters to Latin characters
III.8. Fallback with a one-to-many transliteration
III.9 Restoring Cyrillic text from Latin script fallback text
III.10 Fallback with a one-to-one transliteration
Bibliography
Foreword
This document (CEN/TR 14381:2003) has been prepared by Technical Committee CEN/TC 304
"Information and communications technologies - European localization requirements", the
secretariat of which is held by IST.
The text of this technical report was written with the intent that it be published as a European
pre-Norm (ENV). In light of the various formal and informal comments received on the document
(some of which were received only after the ballot had closed), the TC resolved to turn this
document into a CEN report as a recorded example of an attempt to formulate Europe-wide
fallback rules. It is evident that any fallback scheme will need to be very carefully laid out and
explained if it is to become acceptable to users and industry.
Resolutions no 7 of the 16th Meeting and no 4 of the 18th Meeting of CEN/TC 304 refer to this
technical report:
Res. 7/16th. TC304 acknowledges that the Fallback project team has completed its
contracted work. Although the proposed draft has received the sufficient support
to be forwarded to CEN/BT for final adoption as an ENV, the nature of the
comments received is such that it is decided to publish it as a CEN Report with
editorial comments added by the secretary and reviewed by TC members and
observers before final publication. Unanimous.
Res. 4/18th. TC304 accepts the Fallback document in N978 to be presented to
CEN BT for adoption as a CR with the following text on Greek letters added in
the foreword: “The method of performing fallback from Greek letters into Latin
letters is especially seen as posing problems to Greek users and its use is not
advised”. Unanimous.
This technical report is intended to facilitate cross border communications and data exchange and to
ensure that European cultural requirements are safeguarded in the increasingly interconnected world
of today. It provides rules for fallback for multilingual European texts into the invariant set of
ISO/IEC 646. These rules come into effect if data from different languages must be represented by
equipment and systems that do not support the presentation of all the characters in the different
language repertoires.
This technical report does not intend to influence, let alone substitute itself for, national standards
or customs in this field. Nevertheless, national standards have the opportunity to adapt this
Technical Report by declaring a formalized set of deviation rules (»delta«) if they so wish.
This document does not cancel or replace any other technical report or standard.
There is no known identical national technical report or standard in Europe.
According to the CEN/CENELEC Common Rules the following countries are bound to announce
the existence of this Technical Report: Austria, Belgium, Czech Republic, Denmark, Finland,
France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Luxembourg, Malta, Netherlands,
Norway, Portugal, Slovakia, Spain, Sweden, Switzerland and United Kingdom.
0 Introduction
0.1 Rationale for the provision of fallback rules
Users who are trying to write text in a language which is not their mother tongue (native language)
often wish to write that text using a character repertoire which does not contain all the letters
needed for that language, especially those with diacritic marks. A method of character substitution
would be useful for such users.
Although computers can process larger repertoires of graphic characters than ever before, there
are cases where it is not possible to render all the characters of a processing repertoire on an
output device, because not all the characters in the processing repertoire are available in the
output repertoire. To cater for these situations, a widely applicable standard method of character
substitution (fallback) is required which allows an approximate rendition of the unsupported
characters of the processing repertoire.
Examples of key applications are:
a) a multilingual information service offered across Europe, where personal or business documents
come from different countries and are presented in a standardised rendition using MES-2
characters which cannot be properly represented by the information service; and
b) search engines on the World Wide Web which use "fuzzy" search techniques based on search
terms that have diacritical marks removed and that use common substitutions for less
frequently used letters of the Latin alphabet. Examples of the latter are eth (ð), thorn (Þ),
æ, œ and the German sharp s (ß).
The provision of a single set of fallback rules with a collection of fallback representations for
MES-2 will enable such services to be improved and make the applications easier to use for the
human end user. A standard set of substitutions would help such applications avoid confusion. The
same applies to the other two scripts represented by MES-2, Cyrillic and Greek.
The justification for preparing a technical report for these purposes is that the concept of
representing characters without their diacritical marks is not useful for scripts originating outside
Europe. Furthermore, standardisation bodies in Europe that wish to specify national schemes for
fallback may modify the scheme given in this technical report for a limited set of characters and
promulgate national standards for fallbacks. The Greek and Cyrillic fallback representations
specified in this technical report should be used with caution, since transliteration into Latin
script depends on the target language. Local standards or local best practice should be
referenced where they exist.
This European fallback specification can be used as a default in all relevant situations. It can be
used as the basis for national standards with local preferences being used for specific substitutions
defined by a particular nation. It is expected that national standards for fallback will be registered in
the international cultural registry as part of national locales. Well known local solutions will also be
documented in addition to the default values.
0.2 Basic concepts
This technical report specifies how a source stream of coded characters from a processing repertoire is
represented in a target stream of an output repertoire. The worst case that is covered by the
substitutions defined in this technical report is where the processing repertoire is MES-2 and the
output repertoire is the invariant repertoire of ISO/IEC 646. The coding of the processing repertoire
and the coding of the output repertoire are outside the scope of this TR.
Characters in the source stream that occur in the output repertoire are transferred directly to the
target stream without substitution. Characters in the source stream that do not occur in the output
repertoire are subject to substitution.
There are two types of substitution. In the first type, the target characters are represented in a way
that prevents the reverse transformation of the target stream into the source stream, because
information is lost. A very common example is the presentation of Latin small letter e with acute (é)
as Latin small letter e (e). This type of substitution, in which a letter with a diacritical mark is
represented by the same letter without the diacritical mark, is known as accent dropping. The
second type introduces special symbols that preserve the information about the original graphic
symbol, enabling the character stream to be transformed back to the original encoding. An example
is the use of SGML entity references (e.g. &eacute; in the case above). This second type of
substitution is outside the scope of this TR.
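The difference between the two types can be illustrated with a minimal Python sketch. It is not part of the report. It assumes that Unicode canonical decomposition is an acceptable way to perform accent dropping, and it uses the HTML 4 entity names shipped with Python's `html.entities` module as a stand-in for the SGML entity set. The helper names `drop_accent` and `to_entity` are hypothetical.

```python
import html
import unicodedata
from html.entities import codepoint2name


def drop_accent(ch: str) -> str:
    """Lossy substitution: decompose the character and discard combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def to_entity(ch: str) -> str:
    """Reversible substitution: replace the character with a named entity reference."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else ch


E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
E_GRAVE = "\u00e8"  # LATIN SMALL LETTER E WITH GRAVE

# Accent dropping maps both letters to "e", so the original cannot be restored.
lossy = (drop_accent(E_ACUTE), drop_accent(E_GRAVE))

# The entity form keeps the letters distinct and can be decoded again.
reversible = (to_entity(E_ACUTE), to_entity(E_GRAVE))
restored = html.unescape(reversible[0])
```

Both accented letters collapse to the same target character under accent dropping, which is exactly the loss of information described above, while the entity form round-trips.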
The substitution with loss of information can take several forms, but two main classes are always
recognised as basic:
- one-to-many, when one graphic character of the source stream is substituted with more than one
graphic character from the output repertoire in the target stream. An example is Latin capital
letter Æ presented as AE. This class is recommended for general use.
- one-to-one, when one graphic character of the source stream is substituted with exactly one
graphic character of the output repertoire in the target stream. This type of presentation is
required in applications where the number of characters in the data entries or fields (e.g. in
databases or application forms) is fixed. Accent dropping is a type of one-to-one substitution. It is
anticipated that this class will have minority application; it should be discouraged and used only
when there are strong technical reasons for doing so.
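The two classes can be contrasted with a short Python sketch. It is an illustration under stated assumptions, not the report's specification. Only the pair Æ to AE comes from the text above. The single-character choices in `ONE_TO_ONE` are hypothetical, and plain ASCII stands in for the output repertoire to keep the example short.

```python
import unicodedata

# Illustrative entries only; Table 1 of the report is the authoritative source.
ONE_TO_MANY = {"Æ": "AE", "æ": "ae"}  # "Æ" -> "AE" is the example given in the text
ONE_TO_ONE = {"Æ": "A", "æ": "a"}     # hypothetical single-character choices


def drop_accents(ch: str) -> str:
    """Accent dropping: decompose and keep only the non-combining characters."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fallback(text: str, one_to_one: bool = False) -> str:
    """Apply the chosen class of substitution character by character."""
    table = ONE_TO_ONE if one_to_one else ONE_TO_MANY
    out = []
    for ch in text:
        if ch.isascii():
            # Simplification: ASCII stands in for the output repertoire here.
            out.append(ch)
        else:
            out.append(table.get(ch) or drop_accents(ch))
    return "".join(out)


SAMPLE = "\u00c6sop's caf\u00e9"  # "Æsop's café"
many = fallback(SAMPLE)                   # "AEsop's cafe"
one = fallback(SAMPLE, one_to_one=True)   # "Asop's cafe"
```

The one-to-one result has the same number of characters as the source, which is the property fixed-width fields depend on; the one-to-many result is longer but closer to common practice.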
0.3 Requirements
It is desirable that a standard fallback specification has a very large field of application so that it
may be used across a wide range of platforms. To achieve this, a fallback specification is needed
for a processing repertoire that is a superset of a large proportion of existing processing repertoires.
Also, a fallback repertoire is needed which is a subset of a large proportion of existing output
repertoires.
The characters contained in the collection Multilingual European Subset No. 2 (MES-2), specified in
CEN CWA 13873:2000, have a wide usage across Europe. In many cases, MES-2 will become the
processing repertoire of choice. MES-2 is a large repertoire for which there will be a need for a
fallback specification.
0.4 Satisfying the requirements
MES-2 has been designed to be a superset of a wide range of processing repertoires for commercial
and administrative applications in office environments across the EEA.
It is a valid assumption that the minimum output repertoire that is implemented in computer
systems is the invariant repertoire of ISO/IEC 646. Therefore a substitution with one or more
characters from the invariant repertoire of ISO/IEC 646 will always be able to be rendered on an
output device. The invariant repertoire of ISO/IEC 646 does not have letters with diacritic marks
nor letters from national alphabets used in Europe.
This technical report satisfies the requirements by providing a fallback specification that can
represent each character of MES-2 (and six additional characters). The additional six characters
are 048C, 048D, 048E, 048F, 04EC and 04ED.
1. Scope and field of application
1.1 Scope
This technical report specifies the representation of the characters of the collection Multilingual
European Subset No.2, MES-2, and six additional characters with one or more characters of the
invariant repertoire of ISO/IEC 646 (83 graphic characters). Where a character is not available in
the invariant set of ISO/IEC 646, a fallback representation for rendering is specified.
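As a rough check of the figure of 83 graphic characters, the invariant repertoire can be sketched in Python. The sketch assumes that the invariant repertoire consists of SPACE plus the graphic positions 2/1 to 7/14, minus the twelve positions that ISO/IEC 646 leaves open to national or application variation. The names `VARIANT_POSITIONS`, `INVARIANT` and `needs_fallback` are hypothetical.

```python
# Positions assumed to be open to national or application-specific variation:
# number sign, dollar sign, commercial at, the three bracket/backslash positions,
# circumflex, grave accent, the curly brackets, vertical line and tilde.
VARIANT_POSITIONS = set('#$@[\\]^`{|}~')

# SPACE (0x20) through TILDE (0x7E), minus the variant positions.
INVARIANT = {chr(cp) for cp in range(0x20, 0x7F)} - VARIANT_POSITIONS


def needs_fallback(ch: str) -> bool:
    """True if the character lies outside the assumed invariant repertoire."""
    return ch not in INVARIANT


count = len(INVARIANT)
```

Under this assumption a character such as COMMERCIAL AT (0040) requires a fallback even though it is an ordinary ASCII character, which is why Table 1 starts at 0020 rather than at 00A0.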
1.2 Field of application
The fallback rules given here are only intended for data in more than one European language, i.e.
for use in pan-European applications. They are not meant to influence, let alone replace existing
national standards or practices.
2. Normative references
2.1 ISO/IEC 10646-1:2000, Information technology - Universal Multiple-Octet Coded Character
Set (UCS) - Part 1: Architecture and Basic Multilingual Plane.
2.2 ISO/IEC 646:1991, Information technology - ISO 7-bit coded character set for information
interchange.
3. Definitions and abbreviations
3.1 Basic definitions
For the purposes of this technical report the basic definitions of ISO/IEC 10646-1 section 4 apply.
The following are reproduced here for ease of reference:
3.1.1 character: A member of a set of elements used for organisation, control, or representation of
data.
3.1.2 coded character: A character together with its coded representation.
3.1.3 coded character set: A set of unambiguous rules that establishes a character set and the
relationship between the characters of the set and their coded representation.
3.1.4 repertoire: A specified set of characters that are represented in a coded character set.
3.1.5 presentation: To present: the process of writing, printing, or displaying of a graphic symbol.
3.1.6 graphic symbol: The visual representation of a graphic character or of a composite sequence.
3.1.7 graphic character: A character, other than a control function, that has a visual representation,
normally hand-written, printed, or displayed.
3.2 Other definitions
Also, the following definitions apply:
3.2.1 diacritical mark: Any of a number of recurring graphical structures placed over, under, next
to, or through a basic letter which does not significantly modify the shape of the basic letter itself
and which in combination with that basic letter is a graphic character of the identified repertoire of
MES-2.
3.2.2 letter with diacritical mark: A letter which is constructed as the combination of a basic letter
and a diacritical mark.
3.2.3 basic letter: A letter that is one of the letters of the repertoire of the IRV of ISO 646.
3.2.4 character stream: A series of coded characters in sequence, sometimes referenced as a stream.
3.2.5 character classification: The characters of the repertoire are classified into letters,
digits and special characters.
3.2.6 processing repertoire: This is the repertoire used by an application for processing character
based information.
3.2.7 output repertoire: This is the repertoire used by a computer system for external representation
of character based information.
3.3 Abbreviations
The following abbreviations apply:
3.3.1 MES-2: Multilingual European Subset No. 2
4 Specification of the general fallback rules
This technical report provides two mappings for the substitution of characters outside the ISO/IEC
646 Invariant Subset. The first mapping is the one-to-many (1:many) fallback and the second is the
one-to-one (1:1) fallback. The one-to-one mapping should be avoided, and used only when technical
difficulties prevent the use of the one-to-many fallback.
Common practice is used as the basis for determining fallbacks, where possible. Both fallback
methods drop accents (diacritical marks). Common practice in several European countries leads to
fallbacks to digraphs for several accented and modified vowels used in Nordic countries and in
German-speaking countries (aa, ae, oe, ue, etc). One character may default to several characters,
especially in the case of symbols, where fallbacks are sometimes based on the name of the
character.
Fallbacks for the letters of the Greek and Cyrillic alphabets are based on the international
romanisation standards ISO 843 and ISO 9 where possible, amended where necessary to allow
fallbacks to be limited to the Invariant Subset of ISO/IEC 646.
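A table-driven one-to-many fallback of the kind described above can be sketched in Python with `str.translate`. The pairings below are assumptions for illustration: the text names the digraphs aa, ae, oe and ue but this sketch chooses which letter maps to which, and the two symbol entries are hypothetical name-based fallbacks. Greek and Cyrillic romanisation according to ISO 843 and ISO 9 is not covered.

```python
# Hypothetical excerpt of a one-to-many table; Table 1 of the report is authoritative.
DIGRAPHS = {
    "å": "aa",
    "ä": "ae",
    "æ": "ae",
    "ö": "oe",
    "ø": "oe",
    "ü": "ue",
}
SYMBOLS = {"©": "(c)", "½": "1/2"}  # hypothetical fallbacks for symbols


def build_table() -> dict:
    """Build a translation table covering small and capital letters."""
    mapping = dict(SYMBOLS)
    for small, digraph in DIGRAPHS.items():
        mapping[small] = digraph
        # Capital letters fall back to capital digraphs, e.g. "Æ" -> "AE".
        mapping[small.upper()] = digraph.upper()
    return str.maketrans(mapping)


TABLE = build_table()


def one_to_many(text: str) -> str:
    """Replace every character found in the table; leave all others untouched."""
    return text.translate(TABLE)


result = one_to_many("Århus, Köln, Øst © ½")
```

`str.maketrans` accepts replacement strings of any length, so a single source character can expand to several target characters without a manual loop.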
Annex 1 The list of fallback specification per character
Table 1 GENERAL FALLBACK TABLE
Column 1 contains the ISO/IEC 10646-1 code for the source character
Column 2 contains a glyph
Column 3 contains the ISO/IEC 10646-1 character name
Column 4 contains a glyph or glyphs for the one-to-many fallback
Column 5 contains a glyph for the one-to-one fallback
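One row of the table can be modelled as a small record, sketched here in Python. The class name `FallbackRow` and its field names are hypothetical. The sample row uses the Æ to AE fallback mentioned in clause 0.2, while the one-to-one value "A" is an assumption. The name check relies on Python's `unicodedata` module, whose Unicode character names are assumed to coincide with the ISO/IEC 10646-1 names used in column 3.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class FallbackRow:
    code: str         # column 1: ISO/IEC 10646-1 code in hexadecimal
    name: str         # column 3: character name
    one_to_many: str  # column 4: one-to-many fallback
    one_to_one: str   # column 5: one-to-one fallback

    @property
    def glyph(self) -> str:
        """Column 2: the source character itself, derived from the code."""
        return chr(int(self.code, 16))


def name_matches(row: FallbackRow) -> bool:
    """Check that the stated name agrees with the name known for the code."""
    return unicodedata.name(row.glyph, "") == row.name


# Sample row; the one-to-one value is a hypothetical choice.
ROW = FallbackRow("00C6", "LATIN CAPITAL LETTER AE", "AE", "A")
ok = name_matches(ROW)
```

A check of this kind is useful when a table has been re-keyed or extracted from a document, because a code that has drifted onto the wrong name is detected immediately.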
CHARACTERS AND FALLBACKS MES-2
SOURCE CHARACTER: Code, Glyph, Character name
FALLBACK CHARACTERS (One-to-many): glyph(s)
FALLBACK CHARACTER (One-to-one): glyph
0020 SPACE
0021 EXCLAMATION MARK
0022 QUOTATION MARK
0023 NUMBER SIGN
0024 DOLLAR SIGN
0025 PERCENT SIGN
0026 AMPERSAND
0027 APOSTROPHE
0028 LEFT PARENTHESIS
0029 RIGHT PARENTHESIS
002A ASTERISK
002B PLUS SIGN
002C COMMA
002D HYPHEN-MINUS
002E FULL STOP
002F SOLIDUS
0030 DIGIT ZERO
0031 DIGIT ONE
0032 DIGIT TWO
0033 DIGIT THREE
0034 DIGIT FOUR
0035 DIGIT FIVE
0036 DIGIT SIX
0037 DIGIT SEVEN
0038 DIGIT EIGHT
0039 DIGIT NINE
003A COLON
003B SEMICOLON
003C LESS-THAN SIGN
003D EQUALS SIGN
003E GREATER-THAN SIGN
003F QUESTION MARK
0040 COMMERCIAL AT
0041 LATIN CAPITAL LETTER A
0042 LATIN CAPITAL LETTER B
0043 LATIN CAPITAL LETTER C
0044 LATIN CAPITAL LETTER D
0045 LATIN CAPITAL LETTER E
0046 LATIN CAPITAL LETTER F
0047 LATIN CAPITAL LETTER G
0048 LATIN CAPITAL LETTER H
0049 LATIN CAPITAL LETTER I
004A LATIN CAPITAL LETTER J
004B LATIN CAPITAL LETTER K
004C LATIN CAPITAL LETTER L
004D LATIN CAPITAL LETTER M
004E LATIN CAPITAL LETTER N
004F LATIN CAPITAL LETTER O
0050 LATIN CAPITAL LETTER P
0051 LATIN CAPITAL LETTER Q
0052 LATIN CAPITAL LETTER R
0053 LATIN CAPITAL LETTER S
0054 LATIN CAPITAL LETTER T
0055 LATIN CAPITAL LETTER U
0056 LATIN CAPITAL LETTER V
0057 LATIN CAPITAL LETTER W
0058 LATIN CAPITAL LETTER X
0059 LATIN CAPITAL LETTER Y
005A LATIN CAPITAL LETTER Z
005B LEFT SQUARE BRACKET
005C REVERSE SOLIDUS
005D RIGHT SQUARE BRACKET
005E CIRCUMFLEX ACCENT
005F LOW LINE
0060 GRAVE ACCENT
0061 LATIN SMALL LETTER A
0062 LATIN SMALL LETTER B
0063 LATIN SMALL LETTER C
0064 LATIN SMALL LETTER D
0065 LATIN SMALL LETTER E
0066 LATIN SMALL LETTER F
0067 LATIN SMALL LETTER G
0068 LATIN SMALL LETTER H
0069 LATIN SMALL LETTER I
006A LATIN SMALL LETTER J
006B LATIN SMALL LETTER K
006C LATIN SMALL LETTER L
006D LATIN SMALL LETTER M
006E LATIN SMALL LETTER N
006F LATIN SMALL LETTER O
0070 LATIN SMALL LETTER P
0071 LATIN SMALL LETTER Q
0072 LATIN SMALL LETTER R
0073 LATIN SMALL LETTER S
0074 LATIN SMALL LETTER T
0075 LATIN SMALL LETTER U
0076 LATIN SMALL LETTER V
0077 LATIN SMALL LETTER W
0078 LATIN SMALL LETTER X
0079 LATIN SMALL LETTER Y
007A LATIN SMALL LETTER Z
007B LEFT CURLY BRACKET
007C VERTICAL LINE
007D RIGHT CURLY BRACKET
007E TILDE
00A0 NO-BREAK SPACE
00A1 INVERTED EXCLAMATION MARK
00A2 CENT SIGN
00A3 POUND SIGN
00A4 CURRENCY SIGN
00A5 YEN SIGN
00A6 BROKEN BAR
00A7 SECTION SIGN
00A8 DIAERESIS
00A9 COPYRIGHT SIGN
00AA FEMININE ORDINAL INDICATOR
00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
00AC NOT SIGN
00AD SOFT HYPHEN
00AE REGISTERED SIGN
00AF MACRON
00B0 DEGREE SIGN
00B1 PLUS-MINUS SIGN
00B2 SUPERSCRIPT TWO
00B3 SUPERSCRIPT THREE
00B4 ACUTE ACCENT
00B5 MICRO SIGN
00B6 PILCROW SIGN
00B7 MIDDLE DOT
00B8 CEDILLA
00B9 SUPERSCRIPT ONE
00BA MASCULINE ORDINAL INDICATOR
00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
00BC VULGAR FRACTION ONE QUARTER
00BD VULGAR FRACTION ONE HALF
00BE VULGAR FRACTION THREE QUARTERS
00BF INVERTED QUESTION MARK
00C0 LATIN CAPITAL LETTER A WITH GRAVE
00C1 LATIN CAPITAL LETTER A WITH ACUTE
00C2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX
00C3 LATIN CAPITAL LETTER A WITH TILDE
00C4 LATIN CAPITAL LETTER A WITH DIAERESIS
00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
00C6 LATIN CAPITAL LETTER AE
00C7 LATIN CAPITAL LETTER C WITH CEDILLA
00C8 LATIN CAPITAL LETTER E WITH GRAVE
00C9 LATIN CAPITAL LETTER E WITH ACUTE
00CA LATIN CAPITAL LETTER E WITH CIRCUMFLEX
00CB LATIN CAPITAL LETTER E WITH DIAERESIS
00CC LATIN CAPITAL LETTER I WITH GRAVE
00CD LATIN CAPITAL LETTER I WITH ACUTE
00CE LATIN CAPITAL LETTER I WITH CIRCUMFLEX
00CF LATIN CAPITAL LETTER I WITH DIAERESIS
00D0 LATIN CAPITAL LETTER ETH
00D1 LATIN CAPITAL LETTER N WITH TILDE
00D2 LATIN CAPITAL LETTER O WITH GRAVE
00D3 LATIN CAPITAL LETTER O WITH ACUTE
00D4 LATIN CAPITAL LETTER O WITH CIRCUMFLEX
00D5 LATIN CAPITAL LETTER O WITH TILDE
00D6 LATIN CAPITAL LETTER O WITH DIAERESIS
00D7 MULTIPLICATION SIGN
00D8 LATIN CAPITAL LETTER O WITH STROKE
00D9 LATIN CAPITAL LETTER U WITH GRAVE
00DA LATIN CAPITAL LETTER U WITH ACUTE
00DB LATIN CAPITAL LETTER U WITH CIRCUMFLEX
00DC LATIN CAPITAL LETTER U WITH DIAERESIS
00DD LATIN CAPITAL LETTER Y WITH ACUTE
00DE LATIN CAPITAL LETTER THORN
00DF LATIN SMALL LETTER SHARP S
00E0 LATIN SMALL LETTER A WITH GRAVE
00E1 LATIN SMALL LETTER A WITH ACUTE
00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
00E3 LATIN SMALL LETTER A WITH TILDE
00E4 LATIN SMALL LETTER A WITH DIAERESIS
00E5 LATIN SMALL LETTER A WITH RING ABOVE
00E6 LATIN SMALL LETTER AE
00E7 LATIN SMALL LETTER C WITH CEDILLA
00E8 LATIN SMALL LETTER E WITH GRAVE
00E9 LATIN SMALL LETTER E WITH ACUTE
00EA LATIN SMALL LETTER E WITH CIRCUMFLEX
00EB LATIN SMALL LETTER E WITH DIAERESIS
00EC LATIN SMALL LETTER I WITH GRAVE
00ED LATIN SMALL LETTER I WITH ACUTE
00EE LATIN SMALL LETTER I WITH CIRCUMFLEX
00EF LATIN SMALL LETTER I WITH DIAERESIS
00F0 LATIN SMALL LETTER ETH
00F1 LATIN SMALL LETTER N WITH TILDE
00F2 LATIN SMALL LETTER O WITH GRAVE
00F3 LATIN SMALL LETTER O WITH ACUTE
00F4 LATIN SMALL LETTER O WITH CIRCUMFLEX
00F5 LATIN SMALL LETTER O WITH TILDE
00F6 LATIN SMALL LETTER O WITH DIAERESIS
00F7 DIVISION SIGN
00F8 LATIN SMALL LETTER O WITH STROKE
00F9 LATIN SMALL LETTER U WITH GRAVE
00FA LATIN SMALL LETTER U WITH ACUTE
00FB LATIN SMALL LETTER U WITH CIRCUMFLEX
00FC LATIN SMALL LETTER U WITH DIAERESIS
00FD LATIN SMALL LETTER Y WITH ACUTE
00FE LATIN SMALL LETTER THORN
00FF LATIN SMALL LETTER Y WITH DIAERESIS
0100 LATIN CAPITAL LETTER A WITH MACRON
0101 LATIN SMALL LETTER A WITH MACRON
0102 LATIN CAPITAL LETTER A WITH BREVE
0103 LATIN SMALL LETTER A WITH BREVE
0104 LATIN CAPITAL LETTER A WITH OGONEK
0105 LATIN SMALL LETTER A WITH OGONEK
0106 LATIN CAPITAL LETTER C WITH ACUTE
0107 LATIN SMALL LETTER C WITH ACUTE
0108 LATIN CAPITAL LETTER C WITH CIRCUMFLEX
0109 LATIN SMALL LETTER C WITH CIRCUMFLEX
010A LATIN CAPITAL LETTER C WITH DOT ABOVE
010B LATIN SMALL LETTER C WITH DOT ABOVE
010C LATIN CAPITAL LETTER C WITH CARON
010D LATIN SMALL LETTER C WITH CARON
010E LATIN CAPITAL LETTER D WITH CARON
010F LATIN SMALL LETTER D WITH CARON
0110 LATIN CAPITAL LETTER D WITH STROKE
0111 LATIN SMALL LETTER D WITH STROKE
0112 LATIN CAPITAL LETTER E WITH MACRON
0113 LATIN SMALL LETTER E WITH MACRON
0114 LATIN CAPITAL LETTER E WITH BREVE
0115 LATIN SMALL LETTER E WITH BREVE
0116 LATIN CAPITAL LETTER E WITH DOT ABOVE
0117 LATIN SMALL LETTER E WITH DOT ABOVE
0118 LATIN CAPITAL LETTER E WITH OGONEK
0119 LATIN SMALL LETTER E WITH OGONEK
011A LATIN CAPITAL LETTER E WITH CARON
011B LATIN SMALL LETTER E WITH CARON
011C LATIN CAPITAL LETTER G WITH CIRCUMFLEX
011D LATIN SMALL LETTER G WITH CIRCUMFLEX
011E LATIN CAPITAL LETTER G WITH BREVE
011F LATIN SMALL LETTER G WITH BREVE
0120 LATIN CAPITAL LETTER G WITH DOT ABOVE
0121 LATIN SMALL LETTER G WITH DOT ABOVE
0122 LATIN CAPITAL LETTER G WITH CEDILLA
0123 LATIN SMALL LETTER G WITH CEDILLA
0124 LATIN CAPITAL LETTER H WITH CIRCUMFLEX
0125 LATIN SMALL LETTER H WITH CIRCUMFLEX
0126 LATIN CAPITAL LETTER H WITH STROKE
0127 LATIN SMALL LETTER H WITH STROKE
0128 LATIN CAPITAL LETTER I WITH TILDE
0129 LATIN SMALL LETTER I WITH TILDE
012A LATIN CAPITAL LETTER I WITH MACRON
012B LATIN SMALL LETTER I WITH MACRON
012C LATIN CAPITAL LETTER I WITH BREVE
012D LATIN SMALL LETTER I WITH BREVE
012E LATIN CAPITAL LETTER I WITH OGONEK
012F LATIN SMALL LETTER I WITH OGONEK
0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
0131 LATIN SMALL LETTER DOTLESS I
0132 LATIN CAPITAL LIGATURE IJ
0133 LATIN SMALL LIGATURE IJ
0134 LATIN CAPITAL LETTER J WITH CIRCUMFLEX
0135 LATIN SMALL LETTER J WITH CIRCUMFLEX
0136 LATIN CAPITAL LETTER K WITH CEDILLA
0137 LATIN SMALL LETTER K WITH CEDILLA
0138 LATIN SMALL LETTER KRA
0139 LATIN CAPITAL LETTER L WITH ACUTE
013A LATIN SMALL LETTER L WITH ACUTE
013B LATIN CAPITAL LETTER L WITH CEDILLA
013C LATIN SMALL LETTER L WITH CEDILLA
013D LATIN CAPITAL LETTER L WITH CARON
013E LATIN SMALL LETTER L WITH CARON
013F LATIN CAPITAL LETTER L WITH MIDDLE DOT
0140 LATIN SMALL LETTER L WITH MIDDLE DOT
0141 LATIN CAPITAL LETTER L WITH STROKE
0142 LATIN SMALL LETTER L WITH STROKE
0143 LATIN CAPITAL LETTER N WITH ACUTE
0144 LATIN SMALL LETTER N WITH ACUTE
0145 LATIN CAPITAL LETTER N WITH CEDILLA
0146 LATIN SMALL LETTER N WITH CEDILLA
0147 LATIN CAPITAL LETTER N WITH CARON
0148 LATIN SMALL LETTER N WITH CARON
0149 LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
014A LATIN CAPITAL LETTER ENG
014B LATIN SMALL LETTER ENG
014C LATIN CAPITAL LETTER O WITH MACRON
014D LATIN SMALL LETTER O WITH MACRON
014E LATIN CAPITAL LETTER O WITH BREVE
014F LATIN SMALL LETTER O WITH BREVE
0150 LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
0151 LATIN SMALL LETTER O WITH DOUBLE ACUTE
0152 LATIN CAPITAL LIGATURE OE
0153 LATIN SMALL LIGATURE OE
0154 LATIN CAPITAL LETTER R WITH ACUTE
0155 LATIN SMALL LETTER R WITH ACUTE
0156 LATIN CAPITAL LETTER R WITH CEDILLA
0157 LATIN SMALL LETTER R WITH CEDILLA
0158 LATIN CAPITAL LETTER R WITH CARON
0159 LATIN SMALL LETTER R WITH CARON
015A LATIN CAPITAL LETTER S WITH ACUTE
015B LATIN SMALL LETTER S WITH ACUTE
015C LATIN CAPITAL LETTER S WITH CIRCUMFLEX
015D LATIN SMALL LETTER S WITH CIRCUMFLEX
015E LATIN CAPITAL LETTER S WITH CEDILLA
015F LATIN SMALL LETTER S WITH CEDILLA
0160 LATIN CAPITAL LETTER S WITH CARON
0161 LATIN SMALL LETTER S WITH CARON
0162 LATIN CAPITAL LETTER T WITH CEDILLA
0163 LATIN SMALL LETTER T WITH CEDILLA
0164 LATIN CAPITAL LETTER T WITH CARON
0165 LATIN SMALL LETTER T WITH CARON
0166 LATIN CAPITAL LETTER T WITH STROKE
0167 LATIN SMALL LETTER T WITH STROKE
0168 LATIN CAPITAL LETTER U WITH TILDE
0169 LATIN SMALL LETTER U WITH TILDE
016A LATIN CAPITAL LETTER U WITH MACRON
016B LATIN SMALL LETTER U WITH MACRON
016C LATIN CAPITAL LETTER U WITH BREVE
016D LATIN SMALL LETTER U WITH BREVE
016E LATIN CAPITAL LETTER U WITH RING ABOVE
016F LATIN SMALL LETTER U WITH RING ABOVE
0170 LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
0171 LATIN SMALL LETTER U WITH DOUBLE ACUTE
0172 LATIN CAPITAL LETTER U WITH OGONEK
0173 LATIN SMALL LETTER U WITH OGONEK
0174 LATIN CAPITAL LETTER W WITH CIRCUMFLEX
0175 LATIN SMALL LETTER W WITH CIRCUMFLEX
0176 LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
0177 LATIN SMALL LETTER Y WITH CIRCUMFLEX
0178 LATIN CAPITAL LETTER Y WITH DIAERESIS
0179 LATIN CAPITAL LETTER Z WITH ACUTE
017A LATIN SMALL LETTER Z WITH ACUTE
017B LATIN CAPITAL LETTER Z WITH DOT ABOVE
017C LATIN SMALL LETTER Z WITH DOT ABOVE
017D LATIN CAPITAL LETTER Z WITH CARON
017E LATIN SMALL LETTER Z WITH CARON
017F LATIN SMALL LETTER LONG S
018F LATIN CAPITAL LETTER SCHWA
0192 LATIN SMALL LETTER F WITH HOOK
01B7 LATIN CAPITAL LETTER EZH
01DE LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
01DF LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
01E0 LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
01E1 LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
01E2 LATIN CAPITAL LETTER AE WITH MACRON
01E3 LATIN SMALL LETTER AE WITH MACRON
01E4 LATIN CAPITAL LETTER G WITH STROKE
01E5 LATIN SMALL LETTER G WITH STROKE
01E6 LATIN CAPITAL LETTER G WITH CARON
01E7 LATIN SMALL LETTER G WITH CARON
01E8 LATIN CAPITAL LETTER K WITH CARON
01E9 LATIN SMALL LETTER K WITH CARON
01EA LATIN CAPITAL LETTER O WITH OGONEK
01EB LATIN SMALL LETTER O WITH OGONEK
01EC LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
01ED LATIN SMALL LETTER O WITH OGONEK AND MACRON
01EE LATIN CAPITAL LETTER EZH WITH CARON
01EF LATIN SMALL LETTER EZH WITH CARON
01FA LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
01FB LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
01FC LATIN CAPITAL LETTER AE WITH ACUTE
01FD LATIN SMALL LETTER AE WITH ACUTE
01FE LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
01FF LATIN SMALL LETTER O WITH STROKE AND ACUTE
0218 LATIN CAPITAL LETTER S WITH COMMA BELOW
0219 LATIN SMALL LETTER S WITH COMMA BELOW
021A LATIN CAPITAL LETTER T WITH COMMA BELOW
021B LATIN SMALL LETTER T WITH COMMA BELOW
021E LATIN CAPITAL LETTER H WITH CARON
021F LATIN SMALL LETTER H WITH CARON
0259 LATIN SMALL LETTER SCHWA
027C LATIN SMALL LETTER R WITH LONG LEG
0292 LATIN SMALL LETTER EZH
02BB MODIFIER LETTER TURNED COMMA
02BC MODIFIER LETTER APOSTROPHE
02BD MODIFIER LETTER REVERSED COMMA
02C6 MODIFIER LETTER CIRCUMFLEX ACCENT
02C7 CARON
02C9 MODIFIER LETTER MACRON
02D8 BREVE
02D9 DOT ABOVE
02DA RING ABOVE
02DB OGONEK
02DC SMALL TILDE
02DD DOUBLE ACUTE ACCENT
02EE MODIFIER LETTER DOUBLE APOSTROPHE
0374 GREEK NUMERAL SIGN
0375 GREEK LOWER NUMERAL SIGN
037A GREEK YPOGEGRAMMENI
037E GREEK QUESTION MARK
0384 GREEK TONOS
0385 GREEK DIALYTIKA TONOS
0386 GREEK CAPITAL LETTER ALPHA WITH TONOS
0387 GREEK ANO TELEIA
0388 GREEK CAPITAL LETTER EPSILON WITH TONOS
0389 GREEK CAPITAL LETTER ETA WITH TONOS
038A GREEK CAPITAL LETTER IOTA WITH TONOS
038C GREEK CAPITAL LETTER OMICRON WITH TONOS
038E GREEK CAPITAL LETTER UPSILON WITH TONOS
038F GREEK CAPITAL LETTER OMEGA WITH TONOS
0390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
0391 GREEK CAPITAL LETTER ALPHA
0392 GREEK CAPITAL LETTER BETA
0393 GREEK CAPITAL LETTER GAMMA
0394 GREEK CAPITAL LETTER DELTA
0395 GREEK CAPITAL LETTER EPSILON
0396 GREEK CAPITAL LETTER ZETA
0397 GREEK CAPITAL LETTER ETA
0398 GREEK CAPITAL LETTER THETA
0399 GREEK CAPITAL LETTER IOTA
039A GREEK CAPITAL LETTER KAPPA
039B GREEK CAPITAL LETTER LAMDA
039C GREEK CAPITAL LETTER MU
039D GREEK CAPITAL LETTER NU
039E GREEK CAPITAL LETTER XI
039F GREEK CAPITAL LETTER OMICRON
03A0 GREEK CAPITAL LETTER PI
03A1 GREEK CAPITAL LETTER RHO
03A3 GREEK CAPITAL LETTER SIGMA
03A4 GREEK CAPITAL LETTER TAU
03A5 GREEK CAPITAL LETTER UPSILON
03A6 GREEK CAPITAL LETTER PHI
03A7 GREEK CAPITAL LETTER CHI
03A8 GREEK CAPITAL LETTER PSI
03A9 GREEK CAPITAL LETTER OMEGA
03AA GREEK CAPITAL LETTER IOTA WITH DIALYTIKA
03AB GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
03AC GREEK SMALL LETTER ALPHA WITH TONOS
03AD GREEK SMALL LETTER EPSILON WITH TONOS
03AE GREEK SMALL LETTER ETA WITH TONOS
03AF GREEK SMALL LETTER IOTA WITH TONOS
03B0 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
03B1 GREEK SMALL LETTER ALPHA
03B2 GREEK SMALL LETTER BETA
03B3 GREEK SMALL LETTER GAMMA
03B4 GREEK SMALL LETTER DELTA
03B5 GREEK SMALL LETTER EPSILON
03B6 GREEK SMALL LETTER ZETA
03B7 GREEK SMALL LETTER ETA
03B8 GREEK SMALL LETTER THETA
03B9 GREEK SMALL LETTER IOTA
03BA GREEK SMALL LETTER KAPPA
03BB GREEK SMALL LETTER LAMDA
03BC GREEK SMALL LETTER MU
03BD GREEK SMALL LETTER NU
03BE GREEK SMALL LETTER XI
03BF GREEK SMALL LETTER OMICRON
03C0 GREEK SMALL LETTER PI
03C1 GREEK SMALL LETTER RHO
03C2 GREEK SMALL LETTER FINAL SIGMA
03C3 GREEK SMALL LETTER SIGMA
03C4 GREEK SMALL LETTER TAU
03C5 GREEK SMALL LETTER UPSILON
03C6 GREEK SMALL LETTER PHI
03C7 GREEK SMALL LETTER CHI
03C8 GREEK SMALL LETTER PSI
03C9 GREEK SMALL LETTER OMEGA
03CA GREEK SMALL LETTER IOTA WITH DIALYTIKA
03CB GREEK SMALL LETTER UPSILON WITH DIALYTIKA
03CC GREEK SMALL LETTER OMICRON WITH TONOS
03CD GREEK SMALL LETTER UPSILON WITH TONOS
03CE GREEK SMALL LETTER OMEGA WITH TONOS
03D7 GREEK KAI SYMBOL
03DA GREEK LETTER STIGMA
03DB GREEK SMALL LETTER STIGMA
03DC GREEK LETTER DIGAMMA
03DD GREEK SMALL LETTER DIGAMMA
03DE GREEK LETTER KOPPA
03DF GREEK SMALL LETTER KOPPA
03E0 GREEK LETTER SAMPI
03E1 GREEK SMALL LETTER SAMPI
0400 CYRILLIC CAPITAL LETTER IE WITH GRAVE
0401 CYRILLIC CAPITAL LETTER IO
0402 CYRILLIC CAPITAL LETTER DJE
0403 CYRILLIC CAPITAL LETTER GJE
0404 CYRILLIC CAPITAL LETTER UKRAINIAN IE
0405 CYRILLIC CAPITAL LETTER DZE
0406 CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
0407 CYRILLIC CAPITAL LETTER YI
0408 CYRILLIC CAPITAL LETTER JE
0409 CYRILLIC CAPITAL LETTER LJE
040A CYRILLIC CAPITAL LETTER NJE
040B CYRILLIC CAPITAL LETTER TSHE
040C CYRILLIC CAPITAL LETTER KJE
040D CYRILLIC CAPITAL LETTER I WITH GRAVE
040E CYRILLIC CAPITAL LETTER SHORT U
040F CYRILLIC CAPITAL LETTER DZHE
0410 CYRILLIC CAPITAL LETTER A
0411 CYRILLIC CAPITAL LETTER BE
0412 CYRILLIC CAPITAL LETTER VE
0413 CYRILLIC CAPITAL LETTER GHE
0414 CYRILLIC CAPITAL LETTER DE
0415 CYRILLIC CAPITAL LETTER IE
0416 CYRILLIC CAPITAL LETTER ZHE
0417 CYRILLIC CAPITAL LETTER ZE
0418 CYRILLIC CAPITAL LETTER I
0419 CYRILLIC CAPITAL LETTER SHORT I
041A CYRILLIC CAPITAL LETTER KA
041B CYRILLIC CAPITAL LETTER EL
041C CYRILLIC CAPITAL LETTER EM
041D CYRILLIC CAPITAL LETTER EN
041E CYRILLIC CAPITAL LETTER O
041F CYRILLIC CAPITAL LETTER PE
0420 CYRILLIC CAPITAL LETTER ER
0421 CYRILLIC CAPITAL LETTER ES
0422 CYRILLIC CAPITAL LETTER TE
0423 CYRILLIC CAPITAL LETTER U
0424 CYRILLIC CAPITAL LETTER EF
0425 CYRILLIC CAPITAL LETTER HA
0426 CYRILLIC CAPITAL LETTER TSE
0427 CYRILLIC CAPITAL LETTER CHE
0428 CYRILLIC CAPITAL LETTER SHA
0429 CYRILLIC CAPITAL LETTER SHCHA
042A CYRILLIC CAPITAL LETTER HARD SIGN
042B CYRILLIC CAPITAL LETTER YERU
042C CYRILLIC CAPITAL LETTER SOFT SIGN
042D CYRILLIC CAPITAL LETTER E
042E CYRILLIC CAPITAL LETTER YU
042F CYRILLIC CAPITAL LETTER YA
0430 CYRILLIC SMALL LETTER A
0431 CYRILLIC SMALL LETTER BE
0432 CYRILLIC SMALL LETTER VE
0433 CYRILLIC SMALL LETTER GHE
0434 CYRILLIC SMALL LETTER DE
0435 CYRILLIC SMALL LETTER IE
0436 CYRILLIC SMALL LETTER ZHE
0437 CYRILLIC SMALL LETTER ZE
0438 CYRILLIC SMALL LETTER I
0439 CYRILLIC SMALL LETTER SHORT I
043A CYRILLIC SMALL LETTER KA
043B CYRILLIC SMALL LETTER EL
043C CYRILLIC SMALL LETTER EM
043D CYRILLIC SMALL LETTER EN
043E CYRILLIC SMALL LETTER O
043F CYRILLIC SMALL LETTER PE
0440 CYRILLIC SMALL LETTER ER
0441 CYRILLIC SMALL LETTER ES
0442 CYRILLIC SMALL LETTER TE
0443 CYRILLIC SMALL LETTER U
0444 CYRILLIC SMALL LETTER EF
0445 CYRILLIC SMALL LETTER HA
0446 CYRILLIC SMALL LETTER TSE
0447 CYRILLIC SMALL LETTER CHE
0448 CYRILLIC SMALL LETTER SHA
0449 CYRILLIC SMALL LETTER SHCHA
044A CYRILLIC SMALL LETTER HARD SIGN
044B CYRILLIC SMALL LETTER YERU
044C CYRILLIC SMALL LETTER SOFT SIGN
044D CYRILLIC SMALL LETTER E
044E CYRILLIC SMALL LETTER YU
044F CYRILLIC SMALL LETTER YA
0450 CYRILLIC SMALL LETTER IE WITH GRAVE
0451 CYRILLIC SMALL LETTER IO
0452 CYRILLIC SMALL LETTER DJE
0453 CYRILLIC SMALL LETTER GJE
0454 CYRILLIC SMALL LETTER UKRAINIAN IE
0455 CYRILLIC SMALL LETTER DZE
0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
0457 CYRILLIC SMALL LETTER YI
0458 CYRILLIC SMALL LETTER JE
0459 CYRILLIC SMALL LETTER LJE
045A CYRILLIC SMALL LETTER NJE
045B CYRILLIC SMALL LETTER TSHE
045C CYRILLIC SMALL LETTER KJE
045D CYRILLIC SMALL LETTER I WITH GRAVE
045E CYRILLIC SMALL LETTER SHORT U
045F CYRILLIC SMALL LETTER DZHE
048C CYRILLIC CAPITAL LETTER SEMISOFT SIGN
048D CYRILLIC SMALL LETTER SEMISOFT SIGN
048E CYRILLIC CAPITAL LETTER ER WITH TICK
048F CYRILLIC SMALL LETTER ER WITH TICK
0490 CYRILLIC CAPITAL LETTER GHE WITH UPTURN
0491 CYRILLIC SMALL LETTER GHE WITH UPTURN
0492 CYRILLIC CAPITAL LETTER GHE WITH STROKE
0493 CYRILLIC SMALL LETTER GHE WITH STROKE
0494 CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
0495 CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
0496 CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
0497 CYRILLIC SMALL LETTER ZHE WITH DESCENDER
0498 CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
0499 CYRILLIC SMALL LETTER ZE WITH DESCENDER
049A CYRILLIC CAPITAL LETTER KA WITH DESCENDER
049B CYRILLIC SMALL LETTER KA WITH DESCENDER
049C CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
049D CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
049E CYRILLIC CAPITAL LETTER KA WITH STROKE
049F CYRILLIC SMALL LETTER KA WITH STROKE
04A0 CYRILLIC CAPITAL LETTER BASHKIR KA
04A1 CYRILLIC SMALL LETTER BASHKIR KA
04A2 CYRILLIC CAPITAL LETTER EN WITH DESCENDER
04A3 CYRILLIC SMALL LETTER EN WITH DESCENDER
04A4 CYRILLIC CAPITAL LIGATURE EN GHE
04A5 CYRILLIC SMALL LIGATURE EN GHE
04A6 CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
04A7 CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
04A8 CYRILLIC CAPITAL LETTER ABKHASIAN HA
04A9 CYRILLIC SMALL LETTER ABKHASIAN HA
04AA CYRILLIC CAPITAL LETTER ES WITH DESCENDER
04AB CYRILLIC SMALL LETTER ES WITH DESCENDER
04AC CYRILLIC CAPITAL LETTER TE WITH DESCENDER
04AD CYRILLIC SMALL LETTER TE WITH DESCENDER
04AE CYRILLIC CAPITAL LETTER STRAIGHT U
04AF CYRILLIC SMALL LETTER STRAIGHT U
04B0 CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
04B1 CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
04B2 CYRILLIC CAPITAL LETTER HA WITH DESCENDER
04B3 CYRILLIC SMALL LETTER HA WITH DESCENDER
04B4 CYRILLIC CAPITAL LIGATURE TE TSE
04B5 CYRILLIC SMALL LIGATURE TE TSE
04B6 CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
04B7 CYRILLIC SMALL LETTER CHE WITH DESCENDER
04B8 CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE
04B9 CYRILLIC SMALL LETTER CHE WITH VERTICAL STR
...







