ISO/IEC 2022:1994
Information technology — Character code structure and extension techniques
Cancels and replaces the third edition (1986). Specifies the structure of 8-bit codes and 7-bit codes which provide for the coding of character sets. The codes specified here are designed to be used for data that is processed sequentially in a forward direction. Use of these codes in strings of data which are processed in some other way, or which are included in data formatted for fixed-length record processing, may have undesirable results or may require additional special treatment to ensure correct interpretation.
Technologies de l'information — Structure de code de caractères et techniques d'extension
SLOVENIAN STANDARD (SLOVENSKI STANDARD)
1 June 1995
This Slovenian standard is identical to: ISO/IEC 2022:1994
ICS: 35.040 Character sets and information coding
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard, in whole or in part, is not permitted.
ISO/IEC
INTERNATIONAL STANDARD
Fourth edition
1994-12-01
Reference number
ISO/IEC 2022:1994(E)
Contents
Section 1 - General
1 Scope
2 Conformance
2.1 Types of conformance
2.2 Conformance of information interchange
2.3 Conformance of devices
2.3.1 Device description
2.3.2 Originating devices
2.3.3 Receiving devices
3 Normative references
4 Definitions
4.1 bit combination
4.2 byte
4.3 character
4.4 coded-character-data-element (CC-data-element)
4.5 coded character set; code
4.6 code extension
4.7 code table
4.8 combining character
4.9 control character
4.10 control function
4.11 to designate
4.12 device
4.13 escape sequence
4.14 Final Byte
4.15 graphic character
4.16 graphic symbol
4.17 Intermediate Byte
4.18 to invoke
4.19 repertoire
4.20 to represent
4.21 user
5 Notation, code tables and names
5.1 Notation
5.2 Code tables
5.3 Names of characters
Section 2 - Character sets and codes
6 Characters and character sets
© ISO/IEC 1994
All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the publisher.
ISO/IEC Copyright Office * Case Postale 56 * CH-1211 Genève 20 * Switzerland
Printed in Switzerland
6.1 Types of characters and character sets
6.2 Fixed coded characters
6.2.1 Character DELETE
6.2.2 Character ESCAPE
6.2.3 Character SPACE
6.3 Sets of coded graphic characters
6.3.1 Types of coded graphic character set
6.3.2 Contents of a coded graphic character set
6.3.3 Combination of graphic characters
6.3.4 Sources of coded graphic character sets
6.4 Sets of coded control functions
6.4.1 Types of coded control function set
6.4.2 Primary sets of coded control functions
6.4.3 Supplementary sets of coded control functions
6.4.4 Sources of coded control function sets
6.5 Coded single additional control functions
6.5.1 Standardized single control functions
6.5.2 Registered single control functions
6.5.3 Private control functions
6.5.4 Sources of coded single control functions
7 The elements of 8-bit and 7-bit codes
7.1 Summary of the elements
7.2 Character-set code elements
7.3 Invocation of character-set code elements
7.4 Coded code-identification functions
7.5 Unique coding of graphic characters
8 Structure of 8-bit codes
8.1 Code table layout for 8-bit codes
8.2 Elements and structure of the code
8.3 Invocation of graphic character sets by means of shift functions
8.3.1 LOCKING-SHIFT ZERO, .. ONE, .. TWO, and .. THREE
8.3.2 LOCKING SHIFT ONE RIGHT, .. TWO RIGHT, and .. THREE RIGHT
8.3.3 Shift status
8.3.4 Interactions of locking-shift functions
8.4 Invocation of single graphic characters by means of shift functions
8.5 Invocation of sets of control functions
8.5.1 Invocation of the C0 code element
8.5.2 Invocation of the C1 code element
9 Structure of 7-bit codes
9.1 Code table layout for 7-bit codes
9.2 Elements and structure of the code
9.3 Invocation of graphic character sets by means of shift functions
9.3.1 SHIFT-IN, SHIFT-OUT, LOCKING-SHIFT TWO, and LOCKING-SHIFT THREE
9.3.2 LOCKING SHIFT ONE RIGHT, TWO RIGHT, and THREE RIGHT
9.3.3 Shift status
9.3.4 Interactions of locking-shift functions
9.4 Invocation of single graphic characters by means of shift functions
9.5 Invocation of sets of control functions
9.5.1 Invocation of the C0 code element
9.5.2 Invocation of the C1 code element
10 Versions and levels of implementation
10.1 Versions
10.2 Identification of code structure facilities and character sets
10.3 Levels of implementation
10.3.1 8-bit codes
10.3.2 Qualification of levels for 8-bit codes
10.3.3 7-bit codes
11 Transformation between 8-bit and 7-bit codes
11.1 Transformation from 8-bit to 7-bit codes
11.2 Transformation from 7-bit to 8-bit codes
Section 3 - Code identification and escape sequences
12 Code-identification functions
12.1 Purposes of code-identification functions
12.2 Relationship to escape sequences
13 Structure and use of escape sequences
13.1 Structure of escape sequences
13.2 Types of escape sequences
13.2.1 Indication of type
13.2.2 Escape Sequences of types nF
13.2.3 Escape Sequences of type 4F
13.2.4 Summary
13.2.5 Notation of escape sequences
13.3 Specific meanings of escape sequences
13.3.1 Registration of Final Bytes
13.3.2 Final Bytes specified in this International Standard
13.3.3 Private use
14 Designation of sets of graphic characters and control functions
14.1 Designation functions
14.2 Designation of sets of control functions (CZD, C1D)
14.2.1 Purpose
14.2.2 Designation of C0
14.2.3 Designation of C1
14.3 Designation of sets of graphic characters (GnDm and GnDMm)
14.3.1 Purpose
14.3.2 Specifications
14.3.3 Size indication for multiple-byte sets
14.4 Dynamically redefinable character sets (DRCS)
14.4.1 Purpose
14.4.2 Specification
14.5 Identification of revisions of registered character sets (IRR)
14.5.1 Purpose
14.5.2 Specification
15 Code announcement and switching
15.1 Summary of functions provided
15.2 Announcement of code structure facilities (ACS)
15.2.1 Purpose
15.2.2 Specification
15.3 Data Delimiter for this Coding Method (CMD)
15.3.1 Purpose
15.3.2 Specification
15.4 Designation of Other Coding Systems (DOCS)
15.4.1 Purpose
15.4.2 Specification
ANNEXES
A - External references to character repertoires and their coding
B - The ISO International register of coded character sets to be used with escape sequences
C - Main differences between the 3rd edition (1986) and the present edition of this International Standard
D - Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the
specialised system for world-wide standardisation. National Bodies that are members of ISO or IEC participate in the
development of International Standards through technical committees established by the respective organisation to deal with
particular fields of mutual interest. Other international organisations, governmental and non-governmental, in liaison with ISO
and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee ISO/IEC JTC 1. Draft
International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an
International Standard requires approval by at least 75% of the national bodies casting a vote.
International Standard ISO/IEC 2022 was prepared by the European Association for the Standardization of Information and
Communication Systems, ECMA, (as ECMA-35) and was adopted, under a special “fast-track procedure”, by Joint Technical
Committee ISO/IEC JTC 1, Information technology, in parallel with its approval by national bodies of ISO and IEC.
This fourth edition cancels and replaces the third edition (ISO 2022:1986), of which it constitutes a technical revision (see also
the introduction).
Annex A forms an integral part of this International Standard. Annexes B, C and D are for information only.
Introduction
ECMA/TC1 participates very actively in the work of JTC1/SC2 (previously ISO/TC97/SC2) on code structure and code
extension, and contributed numerous technical papers to SC2/WG1, the group entrusted with the preparation of ISO 2022, the
International Standard for code extension techniques. ECMA published its first Standard ECMA-35 on the same subject in
1971. Three further editions in 1980, 1982 and 1985 reflected the progress achieved internationally, and the text of the 1985
edition was identical with that of the 1986 edition of ISO 2022.
The present edition of ISO/IEC 2022 is technically almost identical with the 1986 edition but is completely rearranged and
rewritten to make it more convenient to use as a reference document.
INTERNATIONAL STANDARD ISO/IEC 2022:1994 (E)
Information technology - Character code structure and extension techniques
Section 1 - General
1 Scope
This International Standard specifies the structure of 8-bit codes and 7-bit codes which provide for the coding of character sets.
The code elements used in the structure are common to both the 8-bit and 7-bit codes. The codes use a variety of techniques
for extending the capabilities of elementary 8-bit and 7-bit codes. Greater emphasis is given to 8-bit codes in this edition of the
Standard than in previous editions because they are now more widely used.
The use of common elements in the 8-bit and 7-bit code structure enables any specific conforming 8-bit code to be
transformed into an equivalent 7-bit code, and vice versa, in a simple and direct fashion.
ISO/IEC 4873 conforms to the 8-bit code structure specified here, and ISO/IEC 646 conforms to the 7-bit code structure
specified here.
Note - The coded character set specified in ISO/IEC 10646-1 has a different structure not in accordance with this International Standard.
The code structure facilities specified here include various means of extending the number of control functions and graphic
characters available in a code. They also include techniques to construct and formalize the definition of specific codes, and to
provide a coded identification of the structure and of the constituent elements of such specific codes.
Specific codes may also be identified by means of object identifiers in accordance with ISO 8824, Abstract Syntax Notation
One (ASN.1). The form of such object identifiers is specified in annex A.
Individual character sets and control functions intended for use with these 8-bit and 7-bit codes are assumed to be registered in
the ISO International Register of Coded Character Sets to be Used with Escape Sequences, in accordance with ISO 2375 (see
annex B). The register includes details to relate individual character sets and control functions with their coded representations,
and also with the associated coded identifications of such character sets.
The principles established in this International Standard may be utilized to form supplementary code structure facilities. For
example ISO/IEC 6429 has followed such a procedure to formulate some parameterized control functions.
The use of uniform code structure techniques for the 8-bit and 7-bit codes specified here has the advantage of:
- permitting uniform provision for code structure in the design of information processing systems,
- providing standardized methods of calling into use agreed sets of characters,
- allowing the interchange of data between environments that utilise 8-bit and 7-bit codes respectively,
- reducing the risk of conflict between systems required to inter-operate.
When two systems with different levels of implementation of code structure facilities are required to communicate with one
another, they may do so using the code structure facilities that they have in common.
The codes specified here are designed to be used for data that is processed sequentially in a forward direction. Use of these
codes in strings of data which are processed in some other way, or which are included in data formatted for fixed-length record
processing, may have undesirable results or may require additional special treatment to ensure correct interpretation.
Note - Since the previous edition (1986) of this International Standard the text has been completely rearranged and rewritten to make the Standard more
convenient to use as a reference document. It is now arranged in three main sections as follows:
1 General
2 Character Sets and Codes
3 Code Identification and Escape Sequences
2 Conformance
2.1 Types of conformance
Full conformance to a standard means that all of its requirements are met. Conformance will only have a unique meaning if the
standard contains no options. If there are options within the standard they must be clearly identified, and any claim of
conformance must include a statement that identifies those options that have been adopted.
This International Standard is of a different nature since it specifies a large number of facilities from which different selections
may be made to suit individual applications. These selections are not identified in this International Standard, but must be
identified at the time that a claim of conformance is made. Conformance to such an identified selection is known as limited
conformance.
The selection of facilities from this International Standard that are to be used in a particular application will generally be
included in a specification document, which states the adopted facilities and gives other details necessary to define fully one or
more specific codes. Such a specification is said to be in accordance with this International Standard (see 10.1).
2.2 Conformance of information interchange
A CC-data-element within coded information for interchange is in conformance with this International Standard if the coded
representations within that CC-data-element satisfy the following conditions:
a) they shall represent graphic characters, control functions, and code-identification functions in accordance with an identified
selection of the facilities specified in this International Standard (i.e. a version of this Standard, see 10.1);
b) when the code extension techniques specified in this International Standard are used, they shall be implemented by the
control functions and code-identification functions defined in this Standard with the meaning and coded representation
specified in this Standard;
c) no coded representation that is either reserved for registration and not assigned, or reserved for future use, shall be used;
d) no registered escape sequence shall be used with a meaning different from that defined by the registration.
2.3 Conformance of devices
A device is in conformance with this International Standard if it conforms to the requirements of 2.3.1, and either or both of
2.3.2 and 2.3.3 below. Any claim of conformance shall identify the document which contains the description specified in 2.3.1.
2.3.1 Device description
A device that conforms to this International Standard shall be the subject of a description that
a) identifies either directly, or by reference to a specification that is in accordance with this International Standard, the
selection of facilities from this Standard that it can utilize when originating or when receiving CC-data-elements;
b) identifies the means by which the user may supply the corresponding characters and functions, or may recognize them
when they are made available to the user, as specified in 2.3.2 and 2.3.3 respectively.
2.3.2 Originating devices
An originating device shall be capable of transmitting within a CC-data-element the coded representations of graphic
characters from one or more graphic character sets, and of an identified selection of control functions and code-identification
functions conforming to this International Standard.
Such a device shall allow the user to supply, from an appropriate set, characters or other indications which will implicitly or
explicitly determine the graphic characters, control functions, and code-identification functions whose coded representations
are to be transmitted.
2.3.3 Receiving devices
A receiving device shall be capable of receiving within a CC-data-element and interpreting the coded representations of
graphic characters from one or more graphic character sets, and an identified selection of control functions and code-
identification functions conforming to this International Standard.
Such a device shall make available to the user, from an appropriate set, characters or other indications which are implicitly or
explicitly determined by the graphic characters, control functions, and code-identification functions whose coded
representations are received.
3 Normative references
The following standards contain provisions which, through reference in this text, constitute provisions of this International
Standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to
agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent
editions of the standards listed below. Members of IEC and ISO maintain registers of currently valid standards.
ISO 2375:1985, Data processing - Procedure for registration of escape sequences.
ISO/IEC 6429:1992, Information technology - Control functions for coded character sets.
ISO 8824:1990, Information technology - Open Systems Interconnection - Specification of Abstract Syntax Notation One (ASN.1).
ISO 8825:1990, Information technology - Open Systems Interconnection - Specification of Basic Encoding Rules for Abstract Syntax Notation One (ASN.1).
ISO International Register of Coded Character Sets to be Used with Escape Sequences.
4 Definitions
For the purposes of this International Standard, the following definitions apply.
4.1 bit combination: An ordered set of bits used for the representation of characters.
4.2 byte: A bit string that is operated upon as a unit.
Note - Each bit has the value either ZERO or ONE.
4.3 character: A member of a set of elements used for the organization, control or representation of data.
4.4 coded-character-data-element (CC-data-element): An element of interchanged information that is specified to
consist of a sequence of coded representations of characters, in accordance with one or more identified standards for coded
character sets.
Notes
1 - In a communication environment in accordance with the Reference Model for Open Systems Interconnection of ISO 7498, a CC-data-element will form
all or part of the information that corresponds to the Presentation-Protocol-Data-Unit (PPDU) defined in that International Standard.
2 - When information interchange is accomplished by means of interchangeable media, a CC-data-element will form all or part of the information that
corresponds to the user data, and not that recorded during formatting and initialization.
4.5 coded character set; code: A set of unambiguous rules that establishes a character set and the one-to-one relationship
between the characters of the set and their bit combinations.
4.6 code extension: The techniques for the encoding of characters that are not included in the character set of a given code.
4.7 code table: A table showing the character allocated to each bit combination in a code.
4.8 combining character: A member of an identified subset of a coded character set, intended for combination with the
preceding or following graphic character, or with a sequence of combining characters preceded or followed by a non-
combining character.
4.9 control character: A control function the coded representation of which consists of a single bit combination.
4.10 control function: An action that affects the recording, processing, transmission or interpretation of data, and that has
a coded representation consisting of one or more bit combinations.
4.11 to designate: To identify a set of characters that are to be represented, in some cases immediately and in others on the
occurrence of a further control function, in a prescribed manner.
4.12 device: A component of information processing equipment which can transmit, and/or can receive, coded information
within CC-data-elements.
Note - It may be an input/output device in the conventional sense, or a process such as an application program or a gateway function.
4.13 escape sequence: A string of bit combinations that is used for control purposes in code extension procedures. The
first of these bit combinations represents the control function ESCAPE.
Note - In this International Standard ESCAPE is always referred to as a control character.
4.14 Final Byte: The bit combination that terminates an escape sequence or a control sequence.
4.15 graphic character: A character, other than a control function, that has a visual representation normally handwritten,
printed or displayed, and that has a coded representation consisting of one or more bit combinations.
4.16 graphic symbol: A visual representation of a graphic character or of a control function.
4.17 Intermediate Byte: A bit combination which may occur between that of the control character ESCAPE and the Final
Byte in an escape sequence.
4.18 to invoke: To cause a designated set of characters to be represented by the prescribed bit combinations whenever those
bit combinations occur.
4.19 repertoire: A specified set of characters that are each represented by one or more bit combinations of a coded
character set.
4.20 to represent:
a) To use a prescribed bit combination with the meaning of a character in a set of characters that has been designated and
invoked; or
b) To use an escape sequence with the meaning of an additional control function.
4.21 user: A person or other entity that invokes the services provided by a device.
Notes
1 - This entity may be a process such as an application program if the “device” is a code convertor or a gateway function, for example.
2 - The characters, as supplied by the user or made available to the user, may be in the form of codes local to the device, or of non-conventional visible
representations, provided that 2.3 above is satisfied.
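Definitions 4.13, 4.14 and 4.17 together describe an escape sequence as the ESCAPE character, followed by optional Intermediate Bytes, followed by a Final Byte. The following Python sketch shows one way such a sequence might be split into its parts. The function name split_escape_sequence is hypothetical, and the byte ranges it checks (Intermediate Bytes in 02/00 to 02/15, Final Byte in 03/00 to 07/14) are assumptions taken from the escape sequence structure of clause 13, which is not reproduced in this excerpt.

```python
from typing import Tuple

ESC = 0x1B  # bit combination 01/11


def split_escape_sequence(data: bytes) -> Tuple[bytes, int]:
    """Split one complete escape sequence into (intermediate_bytes, final_byte).

    Assumed ranges (clause 13, outside this excerpt):
    Intermediate Bytes 02/00 to 02/15, Final Byte 03/00 to 07/14.
    """
    if len(data) < 2 or data[0] != ESC:
        raise ValueError("expected ESC followed by at least one byte")
    *intermediates, final = data[1:]
    if not all(0x20 <= b <= 0x2F for b in intermediates):
        raise ValueError("Intermediate Bytes must lie in column 02")
    if not 0x30 <= final <= 0x7E:
        raise ValueError("Final Byte must lie in 03/00 to 07/14")
    return bytes(intermediates), final


# ESC 02/08 04/02: one Intermediate Byte, then the Final Byte
print(split_escape_sequence(b"\x1b\x28\x42"))  # (b'(', 66)
```

The sketch handles a single, already isolated sequence. Locating sequences inside a longer CC-data-element is left out.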
5 Notation, code tables and names
5.1 Notation
The bits of the bit combinations of the 8-bit code are identified by b8, b7, b6, b5, b4, b3, b2 and b1, where b8 is the highest-order,
or most-significant, bit and b1 is the lowest-order, or least-significant, bit.
The bits of the bit combinations of the 7-bit code are identified by b7, b6, b5, b4, b3, b2 and b1, where b7 is the highest-order, or
most-significant, bit and b1 is the lowest-order, or least-significant, bit.
The bit combinations may be interpreted to represent integers in binary notation, in the range 0 to 255 for the 8-bit code, and
in the range 0 to 127 for the 7-bit code, by attributing the following weights to the individual bits:
Bit:    b8  b7  b6  b5  b4  b3  b2  b1
Weight: 128 64  32  16  8   4   2   1
In this International Standard, the bit combinations are identified by notations of the form x/y, where x and y are numbers in
the range 00 to 15.
The correspondence between the notations of the form x/y and the bit combinations consisting of the bits b8 or b7 to b1 is as
follows:
- x for the 8-bit code is the number represented by b8, b7, b6 and b5 where these bits are given the weights 8, 4, 2 and 1
respectively;
- x for the 7-bit code is the number represented by b7, b6 and b5 where these bits are given the weights 4, 2 and 1
respectively;
- y is the number represented by b4, b3, b2 and b1 where these bits are given the weights 8, 4, 2 and 1 respectively.
The notations of the form x/y are the same as those used to identify code table positions, where x is the column number and y
the row number (see 5.2).
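The x/y notation amounts to splitting the integer value of a bit combination into its upper bits (column x) and its lower four bits (row y). A minimal Python sketch of the conversion in both directions follows. The function names to_xy and from_xy are hypothetical and are not defined by this International Standard.

```python
def to_xy(value: int, bits: int = 8) -> str:
    """Format an integer bit combination in column/row (x/y) notation."""
    if bits not in (7, 8):
        raise ValueError("bits must be 7 or 8")
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} is out of range for a {bits}-bit code")
    x = value >> 4    # b8..b5 (8-bit code) or b7..b5 (7-bit code): column
    y = value & 0x0F  # b4..b1: row
    return f"{x:02d}/{y:02d}"


def from_xy(notation: str) -> int:
    """Parse x/y notation back into the integer value of the bit combination."""
    x_text, y_text = notation.split("/")
    x, y = int(x_text), int(y_text)
    if not (0 <= x <= 15 and 0 <= y <= 15):
        raise ValueError(f"invalid position: {notation}")
    return x * 16 + y


print(to_xy(0x1B))       # 01/11
print(to_xy(0x7F, 7))    # 07/15
print(from_xy("02/00"))  # 32
```

Leading zeroes are kept in the output so that the result matches the column and row numbering used for the code tables.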
5.2 Code tables
An 8-bit code table consists of 256 positions arranged in 16 columns and 16 rows. The columns and rows are numbered 00 to
15 (see figure 1).
A 7-bit code table consists of 128 positions arranged in 8 columns and 16 rows. The columns are numbered 00 to 07 and the
rows 00 to 15 (see figure 1).
The code table positions are identified by notations of the form x/y, where x is the column number and y is the row number.
By convention, leading zeroes are included in the column and row numbers (e.g. 02/01).
The positions of the code table are in one-to-one correspondence with the bit combinations of the code. The notation of a code
table position, of the form x/y, is the same as that of the corresponding bit combination.
[Figure 1: layout of the 7-bit code table (columns 00 to 07, rows 00 to 15) and of the 8-bit code table (columns 00 to 15, rows 00 to 15)]
Figure 1 - Code tables
5.3 Names of characters
This International Standard assigns one name to each character. In addition, it specifies an acronym for each control character
and for the characters SPACE and DELETE. By convention, only capital letters, space and hyphen are used for writing the
names of the characters. For acronyms only capital letters and digits are used. It is intended that the acronyms and this
convention be retained in all translations of the text.
Section 2 - Character sets and codes
6 Characters and character sets
6.1 Types of characters and character sets
The structure of 8-bit and 7-bit codes specified by this International Standard makes use of the following types of characters,
character sets, and functions:
- fixed coded characters,
- sets of coded graphic characters,
- sets of coded control functions (or control characters),
- coded single additional control functions.
These components are specified respectively in 6.2 to 6.5 below.
The coded representations of the graphic characters and control functions are specified in relation to the 8-bit and 7-bit code
tables defined in 5.2 above. A coded representation for each type of component is specified within columns 00 to 07 of the
8-bit and 7-bit code tables. For some components an alternative coded representation is specified in columns 08 to 15 of the
8-bit code table, and is not applicable to any 7-bit code.
6.2 Fixed coded characters
6.2.1 Character DELETE
Name: DELETE Acronym: DEL Coded representation: 07/15
DEL was originally used to erase or obliterate an erroneous or unwanted character in punched tape. DEL may be used for
media-fill or time-fill. DEL characters may be inserted into, or removed from, a CC-data-element without affecting its
information content, but such action may affect the information layout and/or the control of equipment.
6.2.2 Character ESCAPE
Name: ESCAPE Acronym: ESC Coded representation: 01/11
ESCAPE is a control character used for code extension purposes. It causes the meaning of a limited number of the bit
combinations following it in a CC-data-element to be changed. These bit combinations, together with the preceding bit
combination that represents the ESC character, constitute an escape sequence.
Escape sequences provide the coded representations of code-identification functions and of some types of control functions.
The various uses of escape sequences are specified in clause 13. Code identification functions are specified in clauses 14 and
15.
6.2.3 Character SPACE
Name: SPACE Acronym: SP Coded representation: 02/00
SPACE is a graphic character. It has a visual representation consisting of the absence of a graphic symbol. It causes the active
position to be advanced by one character position.
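The three fixed coded characters can be written as integer constants derived from their x/y positions (value = 16 * x + y). The Python sketch below also illustrates the statement that DEL may be removed without affecting information content. The helper name strip_fill is hypothetical, and the sketch assumes that only 94-character sets are in use and that the data contains no escape sequences, so that 07/15 never forms part of another coded representation.

```python
ESC = 0x1B  # 01/11 -> 16 * 1 + 11
SP = 0x20   # 02/00 -> 16 * 2 + 0
DEL = 0x7F  # 07/15 -> 16 * 7 + 15


def strip_fill(data: bytes) -> bytes:
    """Drop DEL bytes used as media-fill or time-fill.

    Assumption: only 94-character sets are in use and no escape sequences
    are present, so every 07/15 byte in the data is a DELETE character.
    """
    return bytes(b for b in data if b != DEL)


print(strip_fill(b"AB\x7f\x7fC"))  # b'ABC'
```

As the text above notes, removing DEL can still change the information layout or the control of equipment, so a real implementation would apply this only where that is acceptable.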
6.3 Sets of coded graphic characters
6.3.1 Types of coded graphic character set
A graphic character shall have a coded representation comprising one or more 8-bit combinations (bytes) in an 8-bit code, and
one or more 7-bit combinations (bytes) in a 7-bit code. Within a coded graphic character set each character shall be
represented by the same number of such bit combinations.
The bit combinations used to represent the graphic characters in a set shall be either from the six adjacent columns numbered
02 to 07 of the code tables or from the six adjacent columns numbered 10 to 15 of the 8-bit code table.
The type of a coded graphic character set is defined by the maximum number of graphic characters that the set can contain.
The types of set specified here are illustrated in figure 2.
A coded graphic character set in which each character is represented by a single bit combination shall be one of the following:
- 94-character set, in positions 02/01 to 07/14, or 10/01 to 15/14
(i.e. all positions in columns 02 to 07 except 02/00 and 07/15, or all positions in columns 10 to 15 except 10/00 and 15/15);
- 96-character set, in positions 02/00 to 07/15, or 10/00 to 15/15
(i.e. all positions in columns 02 to 07, or in columns 10 to 15).
In a 94-character set no character shall be allocated to positions 02/00 and 07/15.
A coded graphic character set in which each character is represented by a sequence of n bit combinations, where n > 1, shall be
one of the following:
- 94^n-character set,
- 96^n-character set.
These sets are here referred to as multiple-byte sets.
A 94^n-character set shall consist of up to 94^n graphic characters each of which is represented by a sequence of n 8-bit or 7-bit
combinations, either all in the range 02/01 to 07/14 or all in the range 10/01 to 15/14. In a 94^n-character set no character shall
have a coded representation that includes the bit combination 02/00 or 07/15.
A 96^n-character set shall consist of up to 96^n graphic characters each of which is represented by a sequence of n 8-bit or 7-bit
combinations, either all in the range 02/00 to 07/15 or all in the range 10/00 to 15/15.
Note - The 8th bit (b8) of each byte in such an 8-bit multiple-byte representation is uniformly either ZERO or ONE.
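The rules above for single-byte and multiple-byte sets can be sketched in Python as a capacity calculation and a range check. The function names set_capacity, byte_range and is_valid_representation are hypothetical. The sketch only checks that every byte of a representation lies in one permitted range; it does not model which set is designated or invoked.

```python
def set_capacity(kind: int, n: int = 1) -> int:
    """Maximum number of characters in a 94^n- or 96^n-character set."""
    if kind not in (94, 96) or n < 1:
        raise ValueError("kind must be 94 or 96 and n at least 1")
    return kind ** n


def byte_range(kind: int, right: bool) -> range:
    """Bit combinations available to a 94- or 96-character set.

    right=False selects columns 02 to 07; right=True selects columns
    10 to 15, which exist only in the 8-bit code table.
    """
    base = 0x80 if right else 0x00
    if kind == 94:
        return range(base + 0x21, base + 0x7F)  # 02/01..07/14 or 10/01..15/14
    if kind == 96:
        return range(base + 0x20, base + 0x80)  # 02/00..07/15 or 10/00..15/15
    raise ValueError("kind must be 94 or 96")


def is_valid_representation(seq: bytes, kind: int) -> bool:
    """True if all bytes lie in the same permitted range (b8 uniform)."""
    if not seq:
        return False
    return any(
        all(b in byte_range(kind, right) for b in seq)
        for right in (False, True)
    )


print(set_capacity(94, 2))                          # 8836
print(is_valid_representation(b"\x30\x21", 94))     # True
print(is_valid_representation(b"\x30\xa1", 94))     # False
```

The last call returns False because the two bytes differ in b8, which the note above rules out for a multiple-byte representation.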
[Figure 2: diagrams of the code table areas (columns 02 to 07 and columns 10 to 15) occupied by a 94-character set, a 96-character set, a 94 x 94-character set and a 96 x 96-character set]
Figure 2 - Structure of sets of coded graphic characters
6.3.2 Contents of a coded graphic character set
Within a coded graphic character set either a unique graphic character shall be allocated to each of the (sequences of) bit
combinations that are specified for that set, or that bit combination (or sequence) shall be declared unused.
Any coded graphic character set shall not contain the characters SPACE or DELETE, or any control character (see 6.4).
However, characters other than SPACE and representing spaces of different sizes or usage may be assigned to any (sequences
of) bit combinations in any set of graphic characters.
6.3.3 Combination of graphic characters
Unless specifically defined otherwise, graphic characters shall not be combining characters, i.e. they shall not be intended for
combination with an adjacent graphic character.
Some graphic character sets may allow for the graphical representation of additional graphic symbols, such as accented letters,
by the imaging of two or more graphic characters as a single graphic symbol. Two combination methods are recognised in this
International Standard:
a) graphic characters that are non-combining characters may be combined by the use of the control character BACKSPACE
or CARRIAGE RETURN;
b) graphic characters that are specified to be combining characters may be used in conjunction with a non-combining graphic
character.
Sponsors of graphic character sets who apply for registration according to ISO 2375 are expected to identify any combining
characters that are in the set.
Notes
1 - A standard that defines a character set should specify which characters, if any, are combining characters, and how they may be used, since a registration
does not require such details to be stated.
2 - The graphic character set of ISO/IEC 646 allows for the first of the above methods for the imaging of accented characters.
3 - ISO/IEC 6429 specifies a third method for combining graphic characters, independent of the specification of the characters themselves, by the use of the
control function GRAPHIC CHARACTER COMBINATION (GCC).
6.3.4 Sources of coded graphic character sets
Sets of graphic characters and their coded representations are specified in other standards such as ISO/IEC 646 or
ISO/IEC 10367, and in national standards. Some of these sets, and some additional sets, are specified in the ISO International
Register of Coded Character Sets (see annex B).
Note - New and revised character sets may be added to the register when required.
Sets of graphic characters for private use may be defined by agreement between the interchange parties.
6.4 Sets of coded control functions
6.4.1 Types of coded control function set
A set of coded control functions shall contain up to 32 control functions (or control characters) allocated to two adjacent
columns of a code table.
Two types of coded control function set are defined as follows:
- primary set, in positions 00/00 to 01/15,
- supplementary set, in positions 08/00 to 09/15, or represented by escape sequences.
A primary set shall include the ESCAPE character. A supplementary set shall not include that character. These sets are
illustrated in figure 3.
Either a unique control function shall be allocated to each position or the position shall be declared unused.
[Figure: two panels, Primary set and Supplementary set]
Figure 3 - Structure of sets of coded control functions (or characters)
6.4.2 Primary sets of coded control functions
A control function in a primary set shall have a coded representation consisting of one 8-bit or 7-bit combination, i.e. it is a
control character.
A primary set of coded control functions shall include the control character ESCAPE in position 01/11.
If any control function from the primary set specified in ISO/IEC 6429 is included, it shall have the definition and the coded
representation specified therein. No transmission control characters, other than the ten specified in ISO/IEC 6429, shall be
included in a primary set of coded control functions.
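The position and ESCAPE rules of 6.4.1 and 6.4.2 can be sketched as a check on a control-set table. This is a minimal sketch under stated assumptions: the function `check_control_set` and the dictionary layout (byte value mapped to a control function name, with unused positions omitted) are hypothetical and serve only as illustration. The rules on transmission control characters are not modelled.

```python
ESC = 0x1B  # position 01/11


def check_control_set(kind: str, allocation: dict[int, str]) -> list[str]:
    """Return a list of rule violations for a hypothetical control-set table.

    kind is "primary" or "supplementary"; allocation maps a byte value to
    the name of the control function allocated to that position.
    """
    if kind == "primary":
        low, high = 0x00, 0x1F          # positions 00/00 to 01/15
    elif kind == "supplementary":
        low, high = 0x80, 0x9F          # positions 08/00 to 09/15
    else:
        raise ValueError("kind must be 'primary' or 'supplementary'")

    problems = []
    for pos in allocation:
        if not low <= pos <= high:
            problems.append(
                f"position {pos >> 4:02d}/{pos & 0x0F:02d} is outside the {kind} set"
            )
    if kind == "primary":
        if allocation.get(ESC) != "ESCAPE":
            problems.append("ESCAPE shall be allocated to position 01/11")
    elif "ESCAPE" in allocation.values():
        problems.append("a supplementary set shall not include ESCAPE")
    return problems
```

An empty list means that none of the modelled rules is violated; it does not establish conformance in the full sense of clause 2.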
6.4.3 Supplementary sets of coded control functions
A control function in a supplementary set shall have a coded representation consisting of one 8-bit or 7-bit combination when
the set is invoked in positions 08/00 to 09/15. It shall be represented by an escape sequence of type Fe (see 13.2) otherwise.
Note - The notation Fe indicates a bit combination in the range 04/00 to 05/15. The escape sequence consists of the two bit combinations ESC Fe (13.2.5).
A supplementary set of coded control functions shall not include the control character ESCAPE or any of the transmission
control functions of the primary set of ISO/IEC 6429.
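The two representations of a supplementary-set control function can be sketched as a pair of conversions. The sketch assumes the usual correspondence in which a position in column 08 maps to the same row in column 04, and column 09 to column 05; this agrees with the SS2 and SS3 entries of table 2 (08/14 and ESC 04/14, 08/15 and ESC 04/15). The function names are hypothetical.

```python
ESC = 0x1B  # control character ESCAPE, position 01/11


def c1_to_escape(byte: int) -> bytes:
    """Convert a single-byte C1 representation (08/00 to 09/15) to ESC Fe."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError("not in positions 08/00 to 09/15")
    return bytes([ESC, byte - 0x40])    # 08/yy -> 04/yy, 09/yy -> 05/yy


def escape_to_c1(seq: bytes) -> int:
    """Convert an ESC Fe sequence back to the single-byte C1 representation."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not an escape sequence of type Fe")
    return seq[1] + 0x40
```

For example, `c1_to_escape(0x8E)` yields ESC 04/14, the 7-bit form of SINGLE-SHIFT TWO.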
6.4.4 Sources of coded control function sets
Control functions for a wide variety of applications are specified in ISO/IEC 6429. A standardized primary set and
supplementary set are included (identified there as C0 and C1 sets). Sets of control functions are also registered in the ISO
International Register of Coded Character Sets (see annex B). Each set is registered either as a primary (C0) set only, or as a
supplementary (C1) set only.
Note - New and revised sets of coded control functions may be added to the register when required.
Sets of coded control functions for private use may be defined by agreement between the interchange parties.
6.5 Coded single additional control functions
A coded single additional control function shall be either:
- a standardized single control function, or
- a registered single control function, or
- a private control function.
Each such function shall be represented by an escape sequence (see clause 13).
6.5.1 Standardized single control functions
A standardized single control function shall have a permanently assigned meaning. Such a function shall be represented by an
escape sequence of type Fs (13.2.1). Each such function shall be registered, together with its coded representation, in the ISO
International Register of Coded Character Sets (see annex B).
Notes
1 - Any candidates for registration as standardized control functions must first be approved by ISO/IEC JTC1/SC2. If approval is granted the control
function is registered according to the procedure of ISO 2375. It will normally then be specified in a standard published by ISO or other recognised body.
2 - The notation Fs indicates a bit combination in the range 06/00 to 07/14. The escape sequence consists of the bit combinations ESC Fs (13.2.5).
6.5.2 Registered single control functions
A registered single control function shall have a permanently assigned meaning. Such a function shall be represented by an
escape sequence of type 3Ft (13.2.2). Each such function shall be registered, together with its coded representation, in the ISO
International Register of Coded Character Sets (see annex B).
Note - The notation Ft indicates a bit combination in the range 04/00 to 07/14. The escape sequence consists of the bit combinations ESC 02/03 .. Ft
(13.2.5).
6.5.3 Private control functions
Private control functions have no standardized meaning. They are for private use and may be defined by agreement between
the interchange parties. A private control function shall be represented by an escape sequence of type Fp or of type 3Fp
(13.2.2).
Note - The notation Fp indicates a bit combination in the range 03/00 to 03/15. The escape sequences consist respectively of the bit combinations ESC Fp
and ESC 02/03 . . Fp (13.2.5).
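The Final Byte ranges quoted in 6.4.3 and 6.5.1 to 6.5.3 can be sketched as a classifier. This is an illustrative sketch, not a complete parser: the function name `classify_escape` is hypothetical, and the range 02/00 to 02/15 used for Intermediate Bytes is an assumption taken from clause 13, which is not reproduced in this part of the text.

```python
ESC = 0x1B  # control character ESCAPE, position 01/11


def classify_escape(seq: bytes) -> str:
    """Classify an escape sequence by its Final Byte (and first Intermediate Byte)."""
    if len(seq) < 2 or seq[0] != ESC:
        raise ValueError("not an escape sequence")
    final = seq[-1]
    intermediates = seq[1:-1]

    if not intermediates:                      # two-byte form: ESC F
        if 0x30 <= final <= 0x3F:
            return "Fp"                        # private, 03/00 to 03/15
        if 0x40 <= final <= 0x5F:
            return "Fe"                        # supplementary set, 04/00 to 05/15
        if 0x60 <= final <= 0x7E:
            return "Fs"                        # standardized, 06/00 to 07/14
        raise ValueError("invalid Final Byte")

    # Assumption: Intermediate Bytes lie in 02/00 to 02/15 (see clause 13).
    if any(not 0x20 <= b <= 0x2F for b in intermediates):
        raise ValueError("invalid Intermediate Byte")
    if intermediates[0] == 0x23:               # ESC 02/03 .. F
        if 0x30 <= final <= 0x3F:
            return "3Fp"                       # private
        if 0x40 <= final <= 0x7E:
            return "3Ft"                       # registered, 04/00 to 07/14
        raise ValueError("invalid Final Byte")
    return "other nF"                          # types not covered by 6.5
```

For instance, ESC 06/14 (LOCKING-SHIFT TWO in table 2) is classified as type Fs.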
6.5.4 Sources of coded single control functions
Some standardised single control functions are specified elsewhere in this International Standard, see 7.3 and 15.3, and some
are specified in ISO/IEC 6429.
Registered control functions are found in the ISO International Register of Coded Character Sets (see annex B).
Private control functions are defined by agreement between the interchange parties.
7 The elements of 8-bit and 7-bit codes
7.1 Summary of the elements
An element of an 8-bit or a 7-bit code shall be either:
- a coded character-set (7.2),
- a coded single additional control function (6.5),
- a coded code-identification function (7.4).
These code elements are illustrated in figure 4.
[Figure: the code elements C0, C1, G0, G1, G2 and G3, the code identification functions, and the single additional control functions]
Figure 4 - Elements of a code
7.2 Character-set code elements
A character-set code element shall be an identified set of coded graphic characters, or of coded control functions (or
characters), together with an element name to indicate the relationship of the set to the structure of the code. When the element
is invoked, the corresponding set shall be represented in those columns of an 8-bit or 7-bit code table that are specified in
6.3.1, 6.4.2, or 6.4.3 for that type of set.
A character-set code element shall be one of those shown in table 1 below. The table shows the name of the element, the type
of coded character set that it comprises, and the column numbers of the 8-bit or 7-bit code tables into which it may be invoked.
Table 1 - Character-set code elements
Name  Column numbers        Type of coded character set
C0    00 and 01             Control functions (characters), primary set
C1    08 and 09 or ESC Fe   Control functions, supplementary set
G0    02 to 07              Graphic characters - 94-character or 94^n-character set
G1    02 to 07 or 10 to 15  Graphic characters - 94-character or 94^n-character or 96-character or 96^n-character set
G2    (as for G1)           (as for G1)
G3    (as for G1)           (as for G1)
Note - The identification of specific graphic character sets as the elements G0, G1, G2, and G3, and the identification of specific control function sets as the
elements C0 and C1, is referred to in this International Standard by the term "designation". Designation of sets may be achieved by the use of designation
functions (7.4) or by other methods (see 10.2).
7.3 Invocation of character-set code elements
The designation of a control character set as a C0 or C1 code element shall invoke that set.
The designation of a graphic character set as a G0, G1, G2, or G3 code element shall invoke that set if the code element
already has a shift status (8.3.3 and 9.3.3); otherwise the use of a corresponding shift function shall invoke that set. Shift
functions are control functions, and are specified in 8.3, 8.4, 9.3, and 9.4. They are listed in table 2 below.
Table 2 shows the name, acronym, and coded representation of each shift function. The entry in the "usage code" column
signifies whether the function is available for use in an 8-bit code or a 7-bit code as follows:
- 7    7-bit code only,
- 8    8-bit code only,
- 7/8  7-bit and 8-bit codes.
The entry in the "type" column signifies the allocation of the function to a particular code element as follows:
- C0  a member of the primary set of control functions,
- C1  a member of the supplementary set of control functions,
- Fs  a standardised single control function.
Table 2 - Shift functions
Name                       Acronym  Usage code  Type  Coded representation (bit combination)
SHIFT-IN                   SI       7           C0    00/15
SHIFT-OUT                  SO       7           C0    00/14
LOCKING-SHIFT ZERO         LS0      8           C0    00/15
LOCKING-SHIFT ONE          LS1      8           C0    00/14
LOCKING-SHIFT TWO          LS2      7/8         Fs    ESC 06/14
LOCKING-SHIFT THREE        LS3      7/8         Fs    ESC 06/15
SINGLE-SHIFT TWO           SS2      7/8         C1    ESC 04/14 or 08/14
SINGLE-SHIFT THREE         SS3      7/8         C1    ESC 04/15 or 08/15
LOCKING-SHIFT ONE RIGHT    LS1R     8           Fs    ESC 07/14
LOCKING-SHIFT TWO RIGHT    LS2R     8           Fs    ESC 07/13
LOCKING-SHIFT THREE RIGHT  LS3R     8           Fs    ESC 07/12
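The entries of table 2 can be sketched as a lookup table. This is a minimal sketch under stated assumptions: the names `SHIFT_FUNCTIONS` and `encode_shift` are hypothetical, and the choice of the single-byte form of SS2 and SS3 for an 8-bit code and the ESC Fe form for a 7-bit code is an assumption based on 6.4.3, not a rule stated in the table itself.

```python
ESC = b"\x1b"  # control character ESCAPE, position 01/11

# acronym -> (usage code, type, coded representations as listed in table 2)
SHIFT_FUNCTIONS = {
    "SI":   ("7",   "C0", [b"\x0f"]),                   # 00/15
    "SO":   ("7",   "C0", [b"\x0e"]),                   # 00/14
    "LS0":  ("8",   "C0", [b"\x0f"]),                   # 00/15
    "LS1":  ("8",   "C0", [b"\x0e"]),                   # 00/14
    "LS2":  ("7/8", "Fs", [ESC + b"\x6e"]),             # ESC 06/14
    "LS3":  ("7/8", "Fs", [ESC + b"\x6f"]),             # ESC 06/15
    "SS2":  ("7/8", "C1", [ESC + b"\x4e", b"\x8e"]),    # ESC 04/14 or 08/14
    "SS3":  ("7/8", "C1", [ESC + b"\x4f", b"\x8f"]),    # ESC 04/15 or 08/15
    "LS1R": ("8",   "Fs", [ESC + b"\x7e"]),             # ESC 07/14
    "LS2R": ("8",   "Fs", [ESC + b"\x7d"]),             # ESC 07/13
    "LS3R": ("8",   "Fs", [ESC + b"\x7c"]),             # ESC 07/12
}


def encode_shift(acronym: str, bits: int) -> bytes:
    """Return a coded representation of a shift function for a 7- or 8-bit code."""
    usage, _type, forms = SHIFT_FUNCTIONS[acronym]
    if str(bits) not in usage.split("/"):
        raise ValueError(f"{acronym} is not available in a {bits}-bit code")
    # Assumption: prefer the single-byte C1 form in an 8-bit code.
    if bits == 8 and len(forms) > 1:
        return forms[1]
    return forms[0]
```

The usage code column is what makes SI and LS0 distinct entries even though both are coded as 00/15: the first is available in a 7-bit code only and the second in an 8-bit code only.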
Notes
1 - The coded representations of LS2, LS3, SS2, SS3, LS1R, LS2R, and LS3R, are allocated in the ISO International Register of Coded Character Sets (see
annex B), and are repeated here for convenience.
2 - If a 7-bit single-byte representation of SS2 and SS3 is requ
...
Contents
Section 1 - General
1 Scope
2 Conformance
2.1 Types of conformance
2.2 Conformance of information interchange
2.3 Conformance of devices
2.3.1 Device description
2.3.2 Originating devices
2.3.3 Receiving devices
3 Normative references
4 Definitions
4.1 bit combination
4.2 byte
4.3 character
4.4 coded-character-data-element (CC-data-element)
4.5 coded character set; code
4.6 code extension
4.7 code table
4.8 combining character
4.9 control character
4.10 control function
4.11 to designate
4.12 device
4.13 escape sequence
4.14 Final Byte
4.15 graphic character
4.16 graphic symbol
4.17 Intermediate Byte
4.18 to invoke
4.19 repertoire
4.20 to represent
4.21 user
5 Notation, code tables and names
5.1 Notation
5.2 Code tables
5.3 Names of characters
Section 2 - Character sets and codes
6 Characters and character sets
© ISO/IEC 1994
All rights reserved. No part of this publication may be reproduced or utilized in any form or
by any means, electronic or mechanical, including photocopying and microfilm, without permission
in writing from the publisher.
ISO/IEC Copyright Office * Case Postale 56 * CH-1211 Genève 20 * Switzerland
Printed in Switzerland
6.1 Types of characters and character sets
6.2 Fixed coded characters
6.2.1 Character DELETE
6.2.2 Character ESCAPE
6.2.3 Character SPACE
6.3 Sets of coded graphic characters
6.3.1 Types of coded graphic character set
6.3.2 Contents of a coded graphic character set
6.3.3 Combination of graphic characters
6.3.4 Sources of coded graphic character sets
6.4 Sets of coded control functions
6.4.1 Types of coded control function set
6.4.2 Primary sets of coded control functions
6.4.3 Supplementary sets of coded control functions
6.4.4 Sources of coded control function sets
6.5 Coded single additional control functions
6.5.1 Standardized single control functions
6.5.2 Registered single control functions
6.5.3 Private control functions
6.5.4 Sources of coded single control functions
7 The elements of 8-bit and 7-bit codes
7.1 Summary of the elements
7.2 Character-set code elements
7.3 Invocation of character-set code elements
7.4 Coded code-identification functions
7.5 Unique coding of graphic characters
8 Structure of 8-bit codes
8.1 Code table layout for 8-bit codes
8.2 Elements and structure of the code
8.3 Invocation of graphic character sets by means of shift functions
8.3.1 LOCKING-SHIFT ZERO, . . ONE, . . TWO, and . . THREE
8.3.2 LOCKING SHIFT ONE RIGHT, . . TWO RIGHT , and . . THREE RIGHT
8.3.3 Shift status
8.3.4 Interactions of locking-shift functions
8.4 Invocation of single graphic characters by means of shift functions
8.5 Invocation of sets of control functions
8.5.1 Invocation of the C0 code element
8.5.2 Invocation of the C1 code element
9 Structure of 7-bit codes
9.1 Code table layout for 7-bit codes
9.2 Elements and structure of the code
9.3 Invocation of graphic character sets by means of shift functions
9.3.1 SHIFT-IN, SHIFT-OUT, LOCKING-SHIFT TWO, and LOCKING-SHIFT THREE
9.3.2 LOCKING SHIFT ONE RIGHT, TWO RIGHT, and THREE RIGHT
9.3.3 Shift status
9.3.4 Interactions of locking-shift functions
9.4 Invocation of single graphic characters by means of shift functions
9.5 Invocation of sets of control functions
9.5.1 Invocation of the C0 code element
9.5.2 Invocation of the C1 code element
10 Versions and levels of implementation
10.1 Versions
10.2 Identification of code structure facilities and character sets
10.3 Levels of implementation
10.3.1 8-bit codes
10.3.2 Qualification of levels for 8-bit codes
10.3.3 7-bit codes
11 Transformation between 8-bit and 7-bit codes
11.1 Transformation from 8-bit to 7-bit codes
11.2 Transformation from 7-bit to 8-bit codes
Section 3 - Code identification and escape sequences
12 Code-identification functions
12.1 Purposes of code-identification functions
12.2 Relationship to escape sequences
13 Structure and use of escape sequences
13.1 Structure of escape sequences
13.2 Types of escape sequences
13.2.1 Indication of type
13.2.2 Escape Sequences of types nF
13.2.3 Escape Sequences of type 4F
13.2.4 Summary
13.2.5 Notation of escape sequences
13.3 Specific meanings of escape sequences
13.3.1 Registration of Final Bytes
13.3.2 Final Bytes specified in this International Standard
13.3.3 Private use
14 Designation of sets of graphic characters and control functions
14.1 Designation functions
14.2 Designation of sets of control functions (CZD, C1D)
14.2.1 Purpose
14.2.2 Designation of C0
14.2.3 Designation of C1
14.3 Designation of sets of graphic characters (GnDm and GnDMm)
14.3.1 Purpose
14.3.2 Specifications
14.3.3 Size indication for multiple-byte sets
14.4 Dynamically redefinable character sets (DRCS)
14.4.1 Purpose
14.4.2 Specification
14.5 Identification of revisions of registered character sets (IRR)
14.5.1 Purpose
14.5.2 Specification
15 Code announcement and switching
15.1 Summary of functions provided
15.2 Announcement of code structure facilities (ACS)
15.2.1 Purpose
15.2.2 Specification
15.3 Data Delimiter for this Coding Method (CMD)
15.3.1 Purpose
15.3.2 Specification
15.4 Designation of Other Coding Systems (DOCS)
15.4.1 Purpose
15.4.2 Specification
ANNEXES
A - External references to character repertoires and their coding
B - The ISO International register of coded character sets to be used with escape sequences
C - Main differences between the 3rd edition (1986) and the present edition of this International Standard
D - Bibliography
Foreword
ISO (the International Organisation for Standardisation) and IEC (the International Electrotechnical Commission) form the
specialised system for world-wide standardisation. National Bodies that are members of ISO or IEC participate in the
development of International Standards through technical committees established by the respective organisation to deal with
particular fields of mutual interest. Other international organisations, governmental and non-governmental, in liaison with ISO
and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee ISO/IEC JTC 1. Draft
International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an
International Standard requires approval by at least 75% of the national bodies casting a vote.
International Standard ISO/IEC 2022 was prepared by the European Association for the Standardization of Information and
Communication Systems, ECMA, (as ECMA-35) and was adopted, under a special "fast-track procedure", by Joint Technical
Committee ISO/IEC JTC 1, Information technology, in parallel with its approval by national bodies of ISO and IEC.
This fourth edition cancels and replaces the third edition (ISO 2022:1986), of which it constitutes a technical revision (see also
the introduction).
Annex A forms an integral part of this International Standard. Annexes B, C and D are for information only.
Introduction
ECMA/TC1 participates very actively in the work of JTC1/SC2 (previously ISO/TC97/SC2) on code structure and code
extension, and contributed numerous technical papers to SC2/WG1, the group entrusted with the preparation of ISO 2022, the
International Standard for code extension techniques. ECMA published its first Standard ECMA-35 on the same subject in
1971. Three further editions in 1980, 1982 and 1985 reflected the progress achieved internationally, and the text of the 1985
edition was identical with that of the 1986 edition of ISO 2022.
The present edition of ISO/IEC 2022 is technically almost identical with the 1986 edition but is completely rearranged and
rewritten to make it more convenient to use as a reference document.
INTERNATIONAL STANDARD ISO/IEC 2022:1994 (E)
Information technology - Character code structure and extension techniques
Section 1 - General
1 Scope
This International Standard specifies the structure of 8-bit codes and 7-bit codes which provide for the coding of character sets.
The code elements used in the structure are common to both the 8-bit and 7-bit codes. The codes use a variety of techniques
for extending the capabilities of elementary 8-bit and 7-bit codes. Greater emphasis is given to 8-bit codes in this edition of the
Standard than in previous editions because they are now more widely used.
The use of common elements in the 8-bit and 7-bit code structure enables any specific conforming 8-bit code to be
transformed into an equivalent 7-bit code, and vice versa, in a simple and direct fashion.
ISO/IEC 4873 conforms to the 8-bit code structure specified here, and ISO/IEC 646 conforms to the 7-bit code structure
specified here.
Note - The coded character set specified in ISO/IEC 10646-1 has a different structure not in accordance with this International Standard.
The code structure facilities specified here include various means of extending the number of control functions and graphic
characters available in a code. They also include techniques to construct and formalize the definition of specific codes, and to
provide a coded identification of the structure and of the constituent elements of such specific codes.
Specific codes may also be identified by means of object identifiers in accordance with ISO 8824, Abstract Syntax Notation
One (ASN.1). The form of such object identifiers is specified in annex A.
Individual character sets and control functions intended for use with these 8-bit and 7-bit codes are assumed to be registered in
the ISO International Register of Coded Character Sets to be Used with Escape Sequences, in accordance with ISO 2375 (see
annex B). The register includes details to relate individual character sets and control functions with their coded representations,
and also with the associated coded identifications of such character sets.
The principles established in this International Standard may be utilized to form supplementary code structure facilities. For
example ISO/IEC 6429 has followed such a procedure to formulate some parameterized control functions.
The use of uniform code structure techniques for the 8-bit and 7-bit codes specified here has the advantage of:
- permitting uniform provision for code structure in the design of information processing systems,
- providing standardized methods of calling into use agreed sets of characters,
- allowing the interchange of data between environments that utilise 8-bit and 7-bit codes respectively,
- reducing the risk of conflict between systems required to inter-operate.
When two systems with different levels of implementation of code structure facilities are required to communicate with one
another, they may do so using the code structure facilities that they have in common.
The codes specified here are designed to be used for data that is processed sequentially in a forward direction. Use of these
codes in strings of data which are processed in some other way, or which are included in data formatted for fixed-length record
processing, may have undesirable results or may require additional special treatment to ensure correct interpretation.
Note - Since the previous edition (1986) of this International Standard the text has been completely rearranged and rewritten to make the Standard more
convenient to use as a reference document. It is now arranged in three main sections as follows:
1 General
2 Character Sets and Codes
3 Code Identification and Escape Sequences
2 Conformance
2.1 Types of conformance
Full conformance to a standard means that all of its requirements are met. Conformance will only have a unique meaning if the
standard contains no options. If there are options within the standard they must be clearly identified, and any claim of
conformance must include a statement that identifies those options that have been adopted.
This International Standard is of a different nature since it specifies a large number of facilities from which different selections
may be made to suit individual applications. These selections are not identified in this International Standard, but must be
identified at the time that a claim of conformance is made. Conformance to such an identified selection is known as limited
conformance.
The selection of facilities from this International Standard that are to be used in a particular application will generally be
included in a specification document, which states the adopted facilities and gives other details necessary to define fully one or
more specific codes. Such a specification is said to be in accordance with this International Standard (see 10.1).
2.2 Conformance of information interchange
A CC-data-element within coded information for interchange is in conformance with this International Standard if the coded
representations within that CC-data-element satisfy the following conditions:
a) they shall represent graphic characters, control functions, and code-identification functions in accordance with an identified
selection of the facilities specified in this International Standard (i.e. a version of this Standard, see 10.1);
b) when the code extension techniques specified in this International Standard are used, they shall be implemented by the
control functions and code-identification functions defined in this Standard with the meaning and coded representation
specified in this Standard;
c) no coded representation that is either reserved for registration and not assigned, or reserved for future use, shall be used;
d) no registered escape sequence shall be used with a meaning different from that defined by the registration.
2.3 Conformance of devices
A device is in conformance with this International Standard if it conforms to the requirements of 2.3.1, and either or both of
2.3.2 and 2.3.3 below. Any claim of conformance shall identify the document which contains the description specified in 2.3.1.
2.3.1 Device description
A device that conforms to this International Standard shall be the subject of a description that
a) identifies either directly, or by reference to a specification that is in accordance with this International Standard, the
selection of facilities from this Standard that it can utilize when originating or when receiving CC-data-elements;
b) identifies the means by which the user may supply the corresponding characters and functions, or may recognize them
when they are made available to the user, as specified in 2.3.2 and 2.3.3 respectively.
2.3.2 Originating devices
An originating device shall be capable of transmitting within a CC-data-element the coded representations of graphic
characters from one or more graphic character sets, and of an identified selection of control functions and code-identification
functions conforming to this International Standard.
Such a device shall allow the user to supply, from an appropriate set, characters or other indications which will implicitly or
explicitly determine the graphic characters, control functions, and code-identification functions whose coded representations
are to be transmitted.
2.3.3 Receiving devices
A receiving device shall be capable of receiving within a CC-data-element and interpreting the coded representations of
graphic characters from one or more graphic character sets, and an identified selection of control functions and code-
identification functions conforming to this International Standard.
Such a device shall make available to the user, from an appropriate set, characters or other indications which are implicitly or
explicitly determined by the graphic characters, control functions, and code-identification functions whose coded
representations are received.
3 Normative references
The following standards contain provisions which, through reference in this text, constitute provisions of this International
Standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to
agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent
editions of the standards listed below. Members of IEC and ISO maintain registers of currently valid standards.
ISO 2375:1985, Data processing - Procedure for registration of escape sequences.
ISO/IEC 6429:1992, Information technology - Control functions for coded character sets.
ISO 8824:1990, Information technology - Open Systems Interconnection - Specification of Abstract Syntax Notation One
(ASN.1).
ISO 8825:1990, Information technology - Open Systems Interconnection - Specification of Basic Encoding Rules for Abstract
Syntax Notation One (ASN.1).
ISO International Register of Coded Character Sets to be Used with Escape Sequences.
4 Definitions
For the purposes of this International Standard, the following definitions apply.
4.1 bit combination: An ordered set of bits used for the representation of characters.
4.2 byte: A bit string that is operated upon as a unit.
Note - Each bit has the value either ZERO or ONE.
4.3 character: A member of a set of elements used for the organization, control or representation of data.
4.4 coded-character-data-element (CC-data-element): An element of interchanged information that is specified to
consist of a sequence of coded representations of characters, in accordance with one or more identified standards for coded
character sets.
Notes
1 - In a communication environment in accordance with the Reference Model for Open Systems Interconnection of ISO 7498, a CC-data-element will form
all or part of the information that corresponds to the Presentation-Protocol-Data-Unit (PPDU) defined in that International Standard.
2 - When information interchange is accomplished by means of interchangeable media, a CC-data-element will form all or part of the information that
corresponds to the user data, and not that recorded during formatting and initialization.
4.5 coded character set; code: A set of unambiguous rules that establishes a character set and the one-to-one relationship
between the characters of the set and their bit combinations.
4.6 code extension: The techniques for the encoding of characters that are not included in the character set of a given code.
4.7 code table: A table showing the character allocated to each bit combination in a code.
4.8 combining character: A member of an identified subset of a coded character set, intended for combination with the
preceding or following graphic character, or with a sequence of combining characters preceded or followed by a non-
combining character.
4.9 control character: A control function the coded representation of which consists of a single bit combination.
4.10 control function: An action that affects the recording, processing, transmission or interpretation of data, and that has
a coded representation consisting of one or more bit combinations.
4.11 to designate: To identify a set of characters that are to be represented, in some cases immediately and in others on the
occurrence of a further control function, in a prescribed manner.
4.12 device: A component of information processing equipment which can transmit, and/or can receive, coded information
within CC-data-elements.
Note - It may be an input/output device in the conventional sense, or a process such as an application program or a gateway function.
4.13 escape sequence: A string of bit combinations that is used for control purposes in code extension procedures. The
first of these bit combinations represents the control function ESCAPE.
Note - In this International Standard ESCAPE is always referred to as a control character.
4.14 Final Byte: The bit combination that terminates an escape sequence or a control sequence.
4.15 graphic character: A character, other than a control function, that has a visual representation normally handwritten,
printed or displayed, and that has a coded representation consisting of one or more bit combinations.
4.16 graphic symbol: A visual representation of a graphic character or of a control function.
4.17 Intermediate Byte: A bit combination which may occur between that of the control character ESCAPE and the Final
Byte in an escape sequence.
4.18 to invoke: To cause a designated set of characters to be represented by the prescribed bit combinations whenever those
bit combinations occur.
4.19 repertoire: A specified set of characters that are each represented by one or more bit combinations of a coded
character set.
4.20 to represent:
a) To use a prescribed bit combination with the meaning of a character in a set of characters that has been designated and
invoked; or
b) To use an escape sequence with the meaning of an additional control function.
4.21 user: A person or other entity that invokes the services provided by a device.
1 - This entity may be a process such as an application program if the “device” is a code convertor or a gateway function, for example.
2 - The characters, as supplied by the user or made available to the user, may be in the form of codes local to the device, or of non-conventional visible
representations, provided that 2.3 above is satisfied.
5 Notation, code tables and names
5.1 Notation
The bits of the bit combinations of the 8-bit code are identified by b8, b7, b6, b5, b4, b3, b2 and b1, where b8 is the
highest-order, or most-significant, bit and b1 is the lowest-order, or least-significant, bit.
The bits of the bit combinations of the 7-bit code are identified by b7, b6, b5, b4, b3, b2 and b1, where b7 is the
highest-order, or most-significant, bit and b1 is the lowest-order, or least-significant, bit.
The bit combinations may be interpreted to represent integers in binary notation, in the range 0 to 255 for the 8-bit code, and
in the range 0 to 127 for the 7-bit code, by attributing the following weights to the individual bits:
Bit:     b8   b7   b6   b5   b4   b3   b2   b1
Weight:  128  64   32   16   8    4    2    1
In this International Standard, the bit combinations are identified by notations of the form x/y, where x and y are numbers in
the range 00 to 15.
The correspondence between the notations of the form x/y and the bit combinations consisting of the bits b8 or b7 to b1 is as
follows:
- x for the 8-bit code is the number represented by b8, b7, b6 and b5 where these bits are given the weights 8, 4, 2 and 1
respectively;
- x for the 7-bit code is the number represented by b7, b6 and b5 where these bits are given the weights 4, 2 and 1
respectively;
- y is the number represented by b4, b3, b2 and b1 where these bits are given the weights 8, 4, 2 and 1 respectively.
The notations of the form x/y are the same as those used to identify code table positions, where x is the column number and y
the row number (see 5.2).
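As an illustration of the x/y notation, the following minimal Python sketch converts between the integer value of a bit combination and its x/y form. The function names to_xy and from_xy are hypothetical and are not defined by this International Standard.

```python
def to_xy(value: int, bits: int = 8) -> str:
    """Return the x/y notation of a bit combination (hypothetical helper)."""
    if bits not in (7, 8):
        raise ValueError("only 7-bit and 8-bit codes are covered")
    if not 0 <= value < (1 << bits):  # 0..127 or 0..255
        raise ValueError("value out of range for the given code")
    x = value >> 4    # column: b8 b7 b6 b5 (or b7 b6 b5 in the 7-bit code)
    y = value & 0x0F  # row: b4 b3 b2 b1, weights 8, 4, 2 and 1
    return f"{x:02d}/{y:02d}"  # leading zeroes by convention, e.g. 02/01


def from_xy(notation: str) -> int:
    """Return the integer value of a bit combination given as x/y."""
    x_text, y_text = notation.split("/")
    x, y = int(x_text), int(y_text)
    if not (0 <= x <= 15 and 0 <= y <= 15):
        raise ValueError("x and y must be in the range 00 to 15")
    return x * 16 + y
```

For example, to_xy(0x1B) gives the position of ESCAPE and from_xy("02/00") gives the integer value of SPACE.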
5.2 Code tables
An 8-bit code table consists of 256 positions arranged in 16 columns and 16 rows. The columns and rows are numbered 00 to
15 (see figure 1).
A 7-bit code table consists of 128 positions arranged in 8 columns and 16 rows. The columns are numbered 00 to 07 and the
rows 00 to 15 (see figure 1).
The code table positions are identified by notations of the form x/y, where x is the column number and y is the row number.
By convention, leading zeroes are included in the column and row numbers (e.g. 02/01).
The positions of the code table are in one-to-one correspondence with the bit combinations of the code. The notation of a code
table position, of the form x/y, is the same as that of the corresponding bit combination.
[Figure 1: the 7-bit code table (columns 00 to 07, rows 00 to 15) and the 8-bit code table (columns 00 to 15, rows 00 to 15)]
Figure 1 - Code tables
5.3 Names of characters
This International Standard assigns one name to each character. In addition, it specifies an acronym for each control character
and for the characters SPACE and DELETE. By convention, only capital letters, space and hyphen are used for writing the
names of the characters. For acronyms only capital letters and digits are used. It is intended that the acronyms and this
convention be retained in all translations of the text.
Section 2 - Character sets and codes
6 Characters and character sets
6.1 Types of characters and character sets
The structure of 8-bit and 7-bit codes specified by this International Standard makes use of the following types of characters,
character sets, and functions:
- fixed coded characters,
- sets of coded graphic characters,
- sets of coded control functions (or control characters),
- coded single additional control functions.
These components are specified respectively in 6.2 to 6.5 below.
The coded representations of the graphic characters and control functions are specified in relation to the 8-bit and 7-bit code
tables defined in 5.2 above. A coded representation for each type of component is specified within columns 00 to 07 of the
8-bit and 7-bit code tables. For some components an alternative coded representation is specified in columns 08 to 15 of the
8-bit code table, and is not applicable to any 7-bit code.
6.2 Fixed coded characters
6.2.1 Character DELETE
Name: DELETE Acronym: DEL Coded representation: 07/15
DEL was originally used to erase or obliterate an erroneous or unwanted character in punched tape. DEL may be used for
media-fill or time-fill. DEL characters may be inserted into, or removed from, a CC-data-element without affecting its
information content, but such action may affect the information layout and/or the control of equipment.
6.2.2 Character ESCAPE
Name: ESCAPE Acronym: ESC Coded representation: 01/11
ESCAPE is a control character used for code extension purposes. It causes the meaning of a limited number of the bit
combinations following it in a CC-data-element to be changed. These bit combinations, together with the preceding bit
combination that represents the ESC character, constitute an escape sequence.
Escape sequences provide the coded representations of code-identification functions and of some types of control functions.
The various uses of escape sequences are specified in clause 13. Code identification functions are specified in clauses 14 and
15.
6.2.3 Character SPACE
Name: SPACE Acronym: SP Coded representation: 02/00
SPACE is a graphic character. It has a visual representation consisting of the absence of a graphic symbol. It causes the active
position to be advanced by one character position.
6.3 Sets of coded graphic characters
6.3.1 Types of coded graphic character set
A graphic character shall have a coded representation comprising one or more 8-bit combinations (bytes) in an 8-bit code, and
one or more 7-bit combinations (bytes) in a 7-bit code. Within a coded graphic character set each character shall be
represented by the same number of such bit combinations.
The bit combinations used to represent the graphic characters in a set shall be either from the six adjacent columns numbered
02 to 07 of the code tables or from the six adjacent columns numbered 10 to 15 of the 8-bit code table.
The type of a coded graphic character set is defined by the maximum number of graphic characters that the set can contain.
The types of set specified here are illustrated in figure 2.
A coded graphic character set in which each character is represented by a single bit combination shall be one of the following:
- 94-character set, in positions 02/01 to 07/14, or 10/01 to 15/14;
(i.e. all positions in columns 02 to 07 except 02/00 and 07/15, or
all positions in columns 10 to 15 except 10/00 and 15/15)
- 96-character set, in positions 02/00 to 07/15, or 10/00 to 15/15.
(i.e. all positions in columns 02 to 07, or in columns 10 to 15)
In a 94-character set no character shall be allocated to positions 02/00 and 07/15.
A coded graphic character set in which each character is represented by a sequence of n bit combinations, where n > 1, shall be
one of the following:
- 94ⁿ-character set,
- 96ⁿ-character set.
These sets are here referred to as multiple-byte sets.
A 94ⁿ-character set shall consist of up to 94ⁿ graphic characters each of which is represented by a sequence of n 8-bit or 7-bit
combinations, either all in the range 02/01 to 07/14 or all in the range 10/01 to 15/14. In a 94ⁿ-character set no character shall
have a coded representation that includes the bit combination 02/00 or 07/15.
A 96ⁿ-character set shall consist of up to 96ⁿ graphic characters each of which is represented by a sequence of n 8-bit or 7-bit
combinations, either all in the range 02/00 to 07/15 or all in the range 10/00 to 15/15.
Note - The 8th bit (b8) of each byte in such an 8-bit multiple-byte representation is uniformly either ZERO or ONE.
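The range rules of 6.3.1 can be sketched as a small validity check. This is an illustrative Python sketch only; the names in_set_range and is_valid_code are hypothetical, and the check covers nothing beyond the ranges and the uniform 8th bit stated above.

```python
def in_set_range(byte: int, size: int) -> bool:
    """True if the byte is a position usable by a 94- or 96-character set."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not an 8-bit combination")
    low = byte & 0x7F  # drop b8: columns 10 to 15 mirror columns 02 to 07
    if size == 96:
        return 0x20 <= low <= 0x7F  # 02/00 to 07/15 (or 10/00 to 15/15)
    if size == 94:
        return 0x21 <= low <= 0x7E  # 02/01 to 07/14 (or 10/01 to 15/14)
    raise ValueError("size must be 94 or 96")


def is_valid_code(seq: bytes, size: int, n: int) -> bool:
    """True if seq could represent one character of a size^n-character set."""
    if len(seq) != n:
        return False  # every character in a set uses the same number of bytes
    if not all(in_set_range(b, size) for b in seq):
        return False
    # b8 must be uniformly ZERO or uniformly ONE across the sequence
    return len({b & 0x80 for b in seq}) == 1
```

For instance, is_valid_code(b"\x30\x21", 94, 2) holds, whereas a sequence that mixes bytes from columns 02 to 07 with bytes from columns 10 to 15 is rejected.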
[Figure 2: panels for the 94-character set, the 96-character set, the 94 x 94-character set and the 96 x 96-character set, each shown in columns 02 to 07 and columns 10 to 15]
Figure 2 - Structure of sets of coded graphic characters
6.3.2 Contents of a coded graphic character set
Within a coded graphic character set either a unique graphic character shall be allocated to each of the (sequences of) bit
combinations that are specified for that set, or that bit combination (or sequence) shall be declared unused.
Any coded graphic character set shall not contain the characters SPACE or DELETE, or any control character (see 6.4).
However, characters other than SPACE and representing spaces of different sizes or usage may be assigned to any (sequences
of) bit combinations in any set of graphic characters.
6.3.3 Combination of graphic characters
Unless specifically defined otherwise, graphic characters shall not be combining characters, i.e. they shall not be intended for
combination with an adjacent graphic character.
Some graphic character sets may allow for the graphical representation of additional graphic symbols, such as accented letters,
by the imaging of two or more graphic characters as a single graphic symbol. Two combination methods are recognised in this
International Standard:
a) graphic characters that are non-combining characters may be combined by the use of the control character BACKSPACE
or CARRIAGE RETURN;
b) graphic characters that are specified to be combining characters may be used in conjunction with a non-combining graphic
character.
Sponsors of graphic character sets who apply for registration according to ISO 2375 are expected to identify any combining
characters that are in the set.
1 - A standard that defines a character set should specify which characters, if any, are combining characters, and how they may be
used, since a registration does not require such details to be stated.
2 - The graphic character set of ISO/IEC 646 allows for the first of the above methods for the imaging of accented characters.
3 - ISO/IEC 6429 specifies a third method for combining graphic characters, independent of the specification of the characters
themselves, by the use of the control function GRAPHIC CHARACTER COMBINATION (GCC).
6.3.4 Sources of coded graphic character sets
Sets of graphic characters and their coded representations are specified in other standards such as ISO/IEC 646 or
ISO/IEC 10367, and in national standards. Some of these sets, and some additional sets, are specified in the ISO International
Register of Coded Character Sets (see annex B).
Note - New and revised character sets may be added to the register when required.
Sets of graphic characters for private use may be defined by agreement between the interchange parties.
6.4 Sets of coded control functions
6.4.1 Types of coded control function set
A set of coded control functions shall contain up to 32 control functions (or control characters) allocated to two adjacent
columns of a code table.
Two types of coded control function set are defined as follows:
- primary set, in positions 00/00 to 01/15,
- supplementary set, in positions 08/00 to 09/15, or represented by escape sequences.
A primary set shall include the ESCAPE character. A supplementary set shall not include that character. These sets are
illustrated in figure 3.
Either a unique control function shall be allocated to each position or the position shall be declared unused.
[Figure 3: the primary set and the supplementary set, the latter alternatively represented by ESC Fe]
Figure 3 - Structure of sets of coded control functions (or characters)
6.4.2 Primary sets of coded control functions
A control function in a primary set shall have a coded representation consisting of one 8-bit or 7-bit combination, i.e. it is a
control character.
A primary set of coded control functions shall include the control character ESCAPE in position 01/11.
If any control function from the primary set specified in ISO/IEC 6429 is included, it shall have the definition and the coded
representation specified therein. No transmission control characters, other than the ten specified in ISO/IEC 6429, shall be
included in a primary set of coded control functions.
6.4.3 Supplementary sets of coded control functions
A control function in a supplementary set shall have a coded representation consisting of one 8-bit or 7-bit combination when
the set is invoked in positions 08/00 to 09/15. It shall be represented by an escape sequence of type Fe (see 13.2) otherwise.
Note - The notation Fe indicates a bit combination in the range 04/00 to 05/15. The escape sequence consists of the two bit combinations ESC Fe (13.2.5).
A supplementary set of coded control functions shall not include the control character ESCAPE or any of the transmission
control functions of the primary set of ISO/IEC 6429.
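The two representations of a supplementary-set control function can be related as in the following Python sketch. It assumes the usual correspondence in which a Final Byte Fe in column 04 or 05 stands for the control function at the same row of column 08 or 09 respectively (consistent with the SS2 and SS3 entries of table 2); the normative rules are in 13.2, which is not reproduced here. The function names are hypothetical.

```python
ESC = 0x1B  # ESCAPE, position 01/11


def c1_to_escape(byte: int) -> bytes:
    """Map a single-byte control in columns 08 to 09 to ESC Fe (assumed mapping)."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError("not a position in columns 08 to 09")
    return bytes([ESC, byte - 0x40])  # 08/yy -> 04/yy, 09/yy -> 05/yy


def escape_to_c1(seq: bytes) -> int:
    """Map a two-byte ESC Fe sequence back to the position in columns 08 to 09."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not an ESC Fe sequence")
    return seq[1] + 0x40
```

With this mapping, 08/14 corresponds to ESC 04/14 and 08/15 to ESC 04/15.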
6.4.4 Sources of coded control function sets
Control functions for a wide variety of applications are specified in ISO/IEC 6429. A standardized primary set and
supplementary set are included (identified there as C0 and C1 sets). Sets of control functions are also registered in the ISO
International Register of Coded Character Sets (see annex B). Each set is registered either as a primary (C0) set only, or as a
supplementary (C1) set only.
Note - New and revised sets of coded control functions may be added to the register when required.
Sets of coded control functions for private use may be defined by agreement between the interchange parties.
6.5 Coded single additional control functions
A coded single additional control function shall be either:
- a standardized single control function, or
- a registered single control function, or
- a private control function.
Each such function shall be represented by an escape sequence (see clause 13).
6.5.1 Standardized single control functions
A standardized single control function shall have a permanently assigned meaning. Such a function shall be represented by an
escape sequence of type Fs (13.2.1). Each such function shall be registered, together with its coded representation, in the ISO
International Register of Coded Character Sets (see annex B).
1 - Any candidates for registration as standardized control functions must first be approved by ISO/IEC JTC 1/SC 2. If approval is
granted the control function is registered according to the procedure of ISO 2375. It will normally then be specified in a standard
published by ISO or other recognised body.
2 - The notation Fs indicates a bit combination in the range 06/00 to 07/14. The escape sequence consists of the bit combinations ESC Fs (13.2.5).
6.5.2 Registered single control functions
A registered single control function shall have a permanently assigned meaning. Such a function shall be represented by an
escape sequence of type 3Ft (13.2.2). Each such function shall be registered, together with its coded representation, in the ISO
International Register of Coded Character Sets (see annex B).
Note - The notation Ft indicates a bit combination in the range 04/00 to 07/14. The escape sequence consists of the bit
combinations ESC 02/03 . . Ft (13.2.5).
6.5.3 Private control functions
Private control functions have no standardized meaning. They are for private use and may be defined by agreement between
the interchange parties. A private control function shall be represented by an escape sequence of type Fp or of type 3Fp
(13.2.2).
Note - The notation Fp indicates a bit combination in the range 03/00 to 03/15. The escape sequences consist respectively of the bit combinations ESC Fp
and ESC 02/03 . . Fp (13.2.5).
6.5.4 Sources of coded single control functions
Some standardised single control functions are specified elsewhere in this International Standard, see 7.3 and 15.3, and some
are specified in ISO/IEC 6429.
Registered control functions are found in the ISO International Register of Coded Character Sets (see annex B).
Private control functions are defined by agreement between the interchange parties.
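The Final Byte ranges quoted in the notes to 6.4.3 and 6.5.1 to 6.5.3 can be summarised in a small classifier. The Python sketch below is illustrative only: the name classify_escape is hypothetical, and it assumes (from clause 13, not reproduced here) that Intermediate Bytes lie in the range 02/00 to 02/15 and Final Bytes in the range 03/00 to 07/14.

```python
ESC = 0x1B  # ESCAPE, position 01/11


def classify_escape(seq: bytes) -> str:
    """Return 'Fp', 'Fe', 'Fs', '3Fp', '3Ft' or 'other' for an escape sequence."""
    if len(seq) < 2 or seq[0] != ESC:
        raise ValueError("not an escape sequence")
    *intermediates, final = seq[1:]
    # Assumed ranges: Intermediate Bytes 02/00-02/15, Final Byte 03/00-07/14
    if any(not 0x20 <= b <= 0x2F for b in intermediates):
        raise ValueError("invalid Intermediate Byte")
    if not 0x30 <= final <= 0x7E:
        raise ValueError("missing or invalid Final Byte")
    if not intermediates:
        if final <= 0x3F:
            return "Fp"   # 03/00 to 03/15: private control function
        if final <= 0x5F:
            return "Fe"   # 04/00 to 05/15: supplementary-set control function
        return "Fs"       # 06/00 to 07/14: standardized single control function
    if intermediates[0] == 0x23:  # first Intermediate Byte is 02/03
        return "3Fp" if final <= 0x3F else "3Ft"
    return "other"  # e.g. code-identification functions (clauses 14 and 15)
```

For example, ESC 06/14 (LS2 in table 2) is classified as type Fs and ESC 04/14 as type Fe.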
7 The elements of 8-bit and 7-bit codes
7.1 Summary of the elements
An element of an 8-bit or a 7-bit code shall be either:
- a coded character-set (7.2),
- a coded single additional control function (6.5),
- a coded code-identification function (7.4).
These code elements are illustrated in figure 4.
[Figure 4: the C0 and C1 code elements, the G0, G1, G2 and G3 code elements, code identification functions, and single additional control functions]
Figure 4 - Elements of a code
7.2 Character-set code elements
A character-set code element shall be an identified set of coded graphic characters, or of coded control functions (or
characters), together with an element name to indicate the relationship of the set to the structure of the code. When the element
is invoked, the corresponding set shall be represented in those columns of an 8-bit or 7-bit code table that are specified in
6.3.1, 6.4.2, or 6.4.3 for that type of set.
A character-set code element shall be one of those shown in table 1 below. The table shows the name of the element, the type
of coded character set that it comprises, and the column numbers of the 8-bit or 7-bit code tables into which it may be invoked.
Table 1 - Character-set code elements
Name  Column numbers         Type of coded character set
C0    00 and 01              Control functions (characters), primary set
C1    08 and 09 or ESC Fe    Control functions, supplementary set
G0    02 to 07               Graphic characters - 94-character or 94ⁿ-character set
G1    02 to 07 or 10 to 15   Graphic characters - 94-character or 94ⁿ-character or 96-character or 96ⁿ-character set
G2    (as for G1)            (as for G1)
G3    (as for G1)            (as for G1)
Note - The identification of specific graphic character sets as the elements G0, G1, G2, and G3, and the identification of specific
control function sets as the elements C0 and C1, is referred to in this International Standard by the term “designation”.
Designation of sets may be achieved by the use of designation functions (7.4) or by other methods (see 10.2).
7.3 Invocation of character-set code elements
The designation of a control character set as a C0 or C1 code element shall invoke that set.
The designation of a graphic character set as a G0, G1, G2, or G3 code element shall invoke that set if the code element
already has a shift status (8.3.3 and 9.3.3); otherwise the use of a corresponding shift function shall invoke that set. Shift
functions are control functions, and are specified in 8.3, 8.4, 9.3, and 9.4. They are listed in table 2 below.
Table 2 shows the name, acronym, and coded representation of each shift function. The entry in the “usage code” column
signifies whether the function is available for use in an 8-bit code or a 7-bit code as follows:
- 7 7-bit code only,
- 8 8-bit code only,
- 7/8 7-bit and 8-bit codes.
The entry in the “type” column signifies the allocation of the function to a particular code element as follows:
- C0 a member of the primary set of control functions,
- C1 a member of the supplementary set of control functions,
- Fs a standardised single control function.
Table 2 - Shift functions
Name                       Acronym  Usage code  Type  Coded representation (bit combination)
SHIFT-IN                   SI       7           C0    00/15
SHIFT-OUT                  SO       7           C0    00/14
LOCKING-SHIFT ZERO         LS0      8           C0    00/15
LOCKING-SHIFT ONE          LS1      8           C0    00/14
LOCKING-SHIFT TWO          LS2      7/8         Fs    ESC 06/14
LOCKING-SHIFT THREE        LS3      7/8         Fs    ESC 06/15
SINGLE-SHIFT TWO           SS2      7/8         C1    ESC 04/14 or 08/14
SINGLE-SHIFT THREE         SS3      7/8         C1    ESC 04/15 or 08/15
LOCKING-SHIFT ONE RIGHT    LS1R     8           Fs    ESC 07/14
LOCKING-SHIFT TWO RIGHT    LS2R     8           Fs    ESC 07/13
LOCKING-SHIFT THREE RIGHT  LS3R     8           Fs    ESC 07/12
Notes
1 - The coded representations of LS2, LS3, SS2, SS3, LS1R, LS2R, and LS3R, are allocated in the ISO International Register of Coded Character Sets (see
annex B), and are repeated here for convenience.
2 - If a 7-bit single-byte representation of SS2 and SS3 is required, it should be bit combination 01/09 and 01/13, respectively in the primary set of control
functions (see annex B of ISO/IEC 10538).
When any shift function from table 2 is required for use in an 8-bit or 7-bit code it shall be included in, or as, the appropriate
element of that code, in accordance with the “type” entry above.
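For implementers, the entries of table 2 can be held in a simple lookup structure. The Python sketch below transcribes the table; the names SHIFT_FUNCTIONS and shifts_for are hypothetical, and the sketch does not model shift status or invocation.

```python
ESC = b"\x1b"  # ESCAPE, position 01/11

# acronym: (usage code, type, coded representations), transcribed from table 2
SHIFT_FUNCTIONS = {
    "SI":   ("7",   "C0", (b"\x0f",)),                 # 00/15
    "SO":   ("7",   "C0", (b"\x0e",)),                 # 00/14
    "LS0":  ("8",   "C0", (b"\x0f",)),                 # 00/15
    "LS1":  ("8",   "C0", (b"\x0e",)),                 # 00/14
    "LS2":  ("7/8", "Fs", (ESC + b"\x6e",)),           # ESC 06/14
    "LS3":  ("7/8", "Fs", (ESC + b"\x6f",)),           # ESC 06/15
    "SS2":  ("7/8", "C1", (ESC + b"\x4e", b"\x8e")),   # ESC 04/14 or 08/14
    "SS3":  ("7/8", "C1", (ESC + b"\x4f", b"\x8f")),   # ESC 04/15 or 08/15
    "LS1R": ("8",   "Fs", (ESC + b"\x7e",)),           # ESC 07/14
    "LS2R": ("8",   "Fs", (ESC + b"\x7d",)),           # ESC 07/13
    "LS3R": ("8",   "Fs", (ESC + b"\x7c",)),           # ESC 07/12
}


def shifts_for(bits: int) -> list[str]:
    """Acronyms of the shift functions whose usage code allows the given code."""
    wanted = str(bits)
    return [
        name
        for name, (usage, _type, _reprs) in SHIFT_FUNCTIONS.items()
        if wanted in usage.split("/")
    ]
```

Note that the single-byte forms 08/14 and 08/15 of SS2 and SS3 exist only in an 8-bit code; in a 7-bit code only the escape-sequence forms listed in table 2 apply.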
7.4 Coded code-identification functions
The following types of coded code-identification functions are specified in this International Standard:
- designation of sets of control characters (14.2),
- designation of sets of graphic characters (14.3),
- identify revision number of character sets (14.5),
- announcement of code structure and facilities (15.2),
- code switching (15.4).
An associated control function is also specified:
- data delimiter (15.3).
These functions may be included as code elements in an 8-bit or 7-bit code when required. Alternative methods of providing
equivalent facilities may be specified in standards for information interchange (see 10.2).
7.5 Unique coding of graphic characters
The same character may be present in more than one of the sets of graphic characters
...









