Codes for the representation of names of languages — Part 6: Alpha-4 code for comprehensive coverage of language variants

ISO 639-6:2009 specifies a method for establishing four-letter language identifiers (alpha-4) and language reference names for language variants, together with a hierarchical framework for relating them to languages, language families and language groups. The alpha-4 language identifiers have been developed for use in a wide range of applications, especially in computer systems, where there is a potential need to cover the entire range of languages, language families and language groups as well as the language variants within each identified language. Alpha-4 language identifiers can support the quantity of known language variants and accommodate any future expansion. The hierarchical framework is based on linguistic and/or geolinguistic relationships and facilitates backward compatibility with other ISO 639 codes; within it, a comprehensive enumeration of language variants is possible, including living, extinct, ancient and constructed languages, whether major or minor. As a result, ISO 639-6:2009 caters for a very large number of languages and their variants. ISO 639-6:2009 does not apply to the registration of languages designed exclusively for machine use, such as computer-programming languages, or of reconstructed languages.

Codes pour la représentation des noms de langues — Partie 6: Code alpha-4 pour un traitement exhaustif des variantes linguistiques

Kode za predstavljanje imen jezikov - 6. del: Štiričrkovna koda za celovito predstavitev različic jezikov

General Information

Status: Withdrawn
Publication Date: 16-Nov-2009
Withdrawal Date: 16-Nov-2009
Current Stage: 9599 - Withdrawal of International Standard
Completion Date: 25-Nov-2014

Buy Standard

Standard: ISO 639-6:2009 - Codes for the representation of names of languages (English language, 16 pages)
Standard: ISO 639-6:2010 (English language, 21 pages)
Standard: ISO 639-6:2009 - Codes pour la représentation des noms de langues (French language, 17 pages)

Standards Content (Sample)

INTERNATIONAL STANDARD ISO 639-6
First edition
2009-12-01

Codes for the representation of names of
languages —
Part 6:
Alpha-4 code for comprehensive
coverage of language variants
Codes pour la représentation des noms de langues —
Partie 6: Code alpha-4 pour un traitement exhaustif des variantes
linguistiques

Reference number: ISO 639-6:2009(E)
© ISO 2009

COPYRIGHT PROTECTED DOCUMENT


©  ISO 2009
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Terms relating to alpha-4 language code
3.2 Terms relating to language and language variants
4 Alpha-4 language identifier
4.1 Form of alpha-4 language identifier
4.2 Syntax of the alpha-4 language identifier
4.3 Use of language codes in the ISO 639 series of standards
5 Language and language variants
5.1 Criteria for identifying language variants
5.2 Identifying languages
5.3 Identifying spoken language variants
5.4 Identifying written language variants
5.5 Identifying transcription
6 Structure
6.1 Model
6.2 Data categories used in this model
7 Extension coding and register
8 Administration of code assignments
Annex A (informative) Example data
Annex B (normative) Operation of the Registration Authority (ISO 639-6/RA) and the Registration Authorities Advisory Committee for ISO 639 (ISO 639/RA-JAC)
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 639-6 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 2, Terminographical and lexicographical working methods.
ISO 639 consists of the following parts, under the general title Codes for the representation of names of
languages:
⎯ Part 1: Alpha-2 code
⎯ Part 2: Alpha-3 code
⎯ Part 3: Alpha-3 code for comprehensive coverage of languages
⎯ Part 4: General principles of coding of the representation of names of languages and related entities, and
application guidelines
⎯ Part 5: Alpha-3 code for language families and groups
⎯ Part 6: Alpha-4 code for comprehensive coverage of language variants

Introduction
Within language-dependent resources, where use, re-use and interchange are of significant importance to
industry and academia alike, it is important to be able to fully document the data being captured within a
resource. One important element of these resources is the language itself. It is important therefore to be able
to identify the language of the resource as precisely and accurately as possible for the purposes of
interoperability and quality of information content. The ISO 639 series of standards has been developed with
these important goals in mind and this part of ISO 639 is concerned with capability for increased precision and
accuracy. Specifically, this part of ISO 639 is concerned with the identification and documentation of language
variants. Other parts of ISO 639 govern the identification of languages, language families and language
groups and establish general principles for managing and developing codes.
The terms and definitions in this part of ISO 639 have, where appropriate, been harmonized with ISO 639-3.
This part of ISO 639 provides an alpha-4 language identifier and a unique language reference name forming
the language code element for language variants. It also establishes a hierarchical framework that enables the
relationship between language variants, language families, language groups and languages to be shown.
The alpha-4 language identifiers and language reference names, standardized in accordance with this part of
ISO 639, are both complementary to, and compatible with, the alpha-2 and alpha-3 codes in other parts of the
ISO 639 series of standards.
The ISO 639 series of standards provides a coherent means for the identification of languages from the highly
generic to the highly specific. This part contributes to the objective of facilitating seamless transfer between
these standards so that users are able to combine and use language identifiers for language variants,
languages, macrolanguages, language groups and language families with minimal effort, whilst allowing a
stand-alone system for future development and extensibility.
The alpha-4 language identifiers and the language reference names are provided for use in a broad range of
applications, including terminology and lexicography, information and documentation (e.g. for education,
archival and retrieval processes, information services, cultural heritage and “bridging the digital divide”),
linguistics and information technology (search engines).
All language codes are to be regarded as open lists that can be extended and refined in accordance with the
registration procedures within each standard. The registration and maintenance procedures for this part of
ISO 639 have been adopted from ISO/IEC 11179-6.

1 Scope
This part of ISO 639 specifies a method for establishing four-letter language identifiers (alpha-4) and language
reference names for language variants and a hierarchical framework for relating them to languages, language
families and language groups. The alpha-4 language identifiers have been developed for use in a wide range
of applications, especially in computer systems, where there is a potential need to cover the entire range of
languages, language families and language groups as well as language variants within each identified
language. Alpha-4 language identifiers can support the quantity of known language variants and
accommodate any future expansion.
This part of ISO 639 provides a hierarchical framework, which facilitates backward compatibility with other
ISO 639 codes, based on linguistic and/or geolinguistic relationships, within which a comprehensive
enumeration of language variants is possible, including living, extinct, ancient and constructed languages,
whether major or minor. As a result, this part of ISO 639 caters for a very large number of languages and their
variants. This part of ISO 639 does not apply to the registration of languages designed exclusively for
machine use, such as computer-programming languages, or of reconstructed languages.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 639-1:2002, Codes for the representation of names of languages — Part 1: Alpha-2 code
ISO 639-2:1998, Codes for the representation of names of languages — Part 2: Alpha-3 code
ISO 639-4, Codes for the representation of names of languages — General principles of coding of the
representation of names of languages and related entities, and application guidelines
ISO/IEC 11179-1, Information technology — Metadata registries (MDR) — Part 1: Framework
ISO/IEC 11179-2, Information technology — Metadata registries (MDR) — Part 2: Classification
ISO/IEC 11179-3, Information technology — Metadata registries (MDR) — Part 3: Registry metamodel and
basic attributes
ISO/IEC 11179-4, Information technology — Metadata registries (MDR) — Part 4: Formulation of data
definitions
ISO/IEC 11179-5, Information technology — Metadata registries (MDR) — Part 5: Naming and identification
principles
ISO/IEC 11179-6:2005, Information technology — Metadata registries (MDR) — Part 6: Registration
3 Terms and definitions
For the purpose of this document, the terms and definitions in ISO 639-4 and the following apply.
3.1 Terms relating to alpha-4 language code
3.1.1
code
data transformed or represented in different forms according to a pre-established set of rules
[ISO 639-3:2007, definition 3.1]
3.1.2
code element
individual entry in a code table
[ISO 639-3:2007, definition 3.2]
3.1.3
language identifier
language symbol
unique string of three or four letters that represents a language variant (3.2.2)
NOTE In this part of ISO 639, three-letter language identifiers are those drawn from other parts of ISO 639. The
four-letter language identifiers, by contrast, identify variant linguistic entities not found in any of the other parts of ISO 639.
3.1.4
reference name
name
appellation
linguistic expression used to designate an individual concept
NOTE 1 In this part of ISO 639, the reference name is used to designate a language variant (3.2.2).
NOTE 2 The reference name may be that by which the language variant (3.2.2) is known in any one of many
languages.
NOTE 3 Adapted from ISO 639-3:2007, definition 3.4.
3.1.5
language code element
language (3.2.1) or language variant (3.2.2) entry in a code or code table consisting of categories of data
that are transformed or represented in different forms according to rules
NOTE 1 The language code element for this part of ISO 639 consists of an alpha-4 language identifier (3.1.3) and a
reference name (3.1.4) for language variants (3.2.2).
NOTE 2 The database for this part of ISO 639 includes the alpha-3 language identifiers (3.1.3) and their reference
names (3.1.4); these are not part of this part of ISO 639 but provide information on the hierarchical links of language
variants (3.2.2), languages (3.2.1), language families and language groups.
3.2 Terms relating to language and language variants
3.2.1
language
systematic use of sounds, characters, symbols or signs to express or communicate meaning or a message
3.2.2
language variant
uniquely identified use of language (3.2.1) based on language variation (3.2.3)
3.2.3
language variation
difference in the characteristics of individual languages (3.2.4)
3.2.4
individual language
language (3.2.1) that is distinguishable from other languages (3.2.1)
3.2.5
language documentation
information relating to the identification of a language (3.2.1)
4 Alpha-4 language identifier
4.1 Form of alpha-4 language identifier
A language identifier in accordance with this part of ISO 639 shall comprise a sequence of four letters from the
following set of 26 letters of the Latin alphabet in lower case: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u,
v, w, x, y, z. Use shall not be made of diacritical marks or modified characters.
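The form rule above can be checked mechanically. The following is a minimal sketch in Python, not part of the standard; the function name `is_alpha4` and the sample strings are illustrative placeholders, not registered identifiers.

```python
import re

# Four letters drawn from the 26 lower-case Latin letters a-z.
# Diacritics and modified characters fall outside this character class.
_ALPHA4 = re.compile(r"[a-z]{4}")


def is_alpha4(identifier: str) -> bool:
    """Return True if `identifier` has the four-letter form described in 4.1."""
    return _ALPHA4.fullmatch(identifier) is not None


# Placeholder strings only; none of these is claimed to be a registered identifier.
print(is_alpha4("abcd"))   # well-formed
print(is_alpha4("ABCD"))   # upper case is rejected
print(is_alpha4("abcé"))   # a letter with a diacritical mark is rejected
```

A check of this kind only confirms the form of a string; whether an identifier has actually been assigned is a question for the registry.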
Alpha-4 language identifiers serve only as a device to uniquely represent language variants.
This part of ISO 639 makes provision for a reference name which is intended for use within a metadata
registry. This reference name is selected from one of the names by which the language is or has been
identified; this shall not be interpreted to imply that a name from any particular language is considered to be
the preferred name.
As far as possible, an alpha-4 language identifier begins with the same letter as the corresponding reference
name and includes one or more of the subsequent letters of that reference name; an effort being made to
avoid pronounceable sequences. With thousands of languages and language variants and many similar
names, alpha-4 language identifiers that resemble the reference name cannot be provided in every case. No
alpha-4 language identifier shall have the same form as any other four-letter language reference name.
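One possible reading of the resemblance guideline above is sketched below in Python. The function `resembles` and the sample name are hypothetical, and the test is only an interpretation of "begins with the same letter and includes one or more of the subsequent letters"; the standard itself does not define an algorithm.

```python
def resembles(identifier: str, reference_name: str) -> bool:
    """Heuristic: same first letter, and at least one later letter of the
    identifier also occurs after the first letter of the reference name."""
    ident = identifier.lower()
    name = reference_name.lower()
    if not ident or not name or ident[0] != name[0]:
        return False
    rest_of_name = name[1:]
    return any(letter in rest_of_name for letter in ident[1:])


# "Example" is a made-up reference name used purely for illustration.
print(resembles("exmp", "Example"))
print(resembles("zzzz", "Example"))
```

Because the text notes that a resembling identifier cannot be provided in every case, a check like this could at most flag candidates for review rather than reject them.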
To maintain continuity and stability, alpha-4 language identifiers shall not change or be reused for a different
purpose. In accordance with ISO/IEC 11179-6, when a language identifier is retired or superseded, the
alpha-4 language identifier shall remain within the registry for this part of ISO 639 for backward compatibility.
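The stability rule above (no change, no reuse, retired identifiers stay in the registry) might be modelled as follows. This is a minimal in-memory sketch under stated assumptions: the class names, the status values and the `superseded_by` field are illustrative and are not taken from ISO/IEC 11179-6.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    reference_name: str
    status: str = "active"               # assumed values: "active", "retired", "superseded"
    superseded_by: Optional[str] = None  # identifier of the replacing entry, if any


@dataclass
class Registry:
    entries: Dict[str, Entry] = field(default_factory=dict)

    def register(self, identifier: str, reference_name: str) -> None:
        # An identifier is never reused, even after it has been retired.
        if identifier in self.entries:
            raise ValueError(f"{identifier!r} has already been assigned")
        self.entries[identifier] = Entry(reference_name)

    def retire(self, identifier: str, superseded_by: Optional[str] = None) -> None:
        # The entry is marked, not deleted, so older data can still be resolved.
        entry = self.entries[identifier]
        entry.status = "superseded" if superseded_by else "retired"
        entry.superseded_by = superseded_by
```

Keeping retired entries in the same table is what makes the reuse check in `register` sufficient on its own.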
When translating this International Standard to languages that are written using non-Latin scripts, alpha-4
language identifiers shall be formed using the Latin alphabet according to the principles of this part of ISO 639.
The alpha-4 language identifier, and the language reference name, shall be treated as language-independent.
Knowledge of the world's languages and language variants at any given time is never complete or perfect.
Additional language reference names and their alpha-4 language identifiers can be added when a language
variant is found which differs from the language variants previously identified. It is essential that certain criteria
be met before a new language code element can be accepted. These conditions conform to
ISO/IEC 11179 (all parts) and are documented in 6.2. The denotation of an existing language code element
can be revised. Existing reference names can also be revised. In addition, existing language code elements
can be retired or become superseded when it is apparent that they no longer reflect language variant
distinctions in current use. When making changes, remaining language code elements shall not be adversely
affected.
4.2 Syntax of the alpha-4 language identifier
Different types of language variants may be represented by an alpha-4 language identifier (e.g. living
languages, ancient languages, artificially constructed languages) in a variety of communication modes
including written, spoken or signed modes. The list of communication modes is not exhaustive and can be
augmented over time. Table A.1 gives more detail.
Alpha-4 language code elements are linked to code elements provided by other standards in the ISO 639
series.
4.3 Use of language codes in the ISO 639 series of standards
The ISO 639 series of standards is intended to provide a compatible and interoperable set of language
identifiers for language variants, languages and language groups and families. All parts of the series shall be
considered when determining the most applicable language code, and the language identifiers may be applied,
where applicable, in the order alpha-2, alpha-3, or alpha-4 moving from more generic to more specific forms of
identification. The links provided between the alpha-4 and alpha-3 and between the alpha-3 and alpha-2
representations facilitate interoperability between systems, although each system can be used alone. No
alpha-4 language identifier shall be assigned to language code elements that already have alpha-3
representations assigned to them in other parts of ISO 639. The Registration Authority of this part of ISO 639
is responsible, through representation on the ISO 639 Joint Advisory Committee (JAC), for determining the
appropriate assignment of a language code element where there might be debate regarding the nature of the
language or language variant being registered.
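The links between alpha-4, alpha-3 and alpha-2 representations suggest a simple fallback for systems that support only the more generic codes. The Python sketch below assumes two lookup tables; the table names and the function `generalise` are hypothetical, "xxaa" is a made-up placeholder rather than a registered identifier, and the sketch assumes each alpha-4 entry links directly to an alpha-3 code.

```python
from typing import Dict, List

# Illustrative link tables. "eng" -> "en" is the ISO 639-2 / ISO 639-1 pair for English;
# "xxaa" is a placeholder alpha-4 string, not a registered identifier.
ALPHA4_TO_ALPHA3: Dict[str, str] = {"xxaa": "eng"}
ALPHA3_TO_ALPHA2: Dict[str, str] = {"eng": "en"}


def generalise(identifier: str) -> List[str]:
    """List `identifier` followed by its progressively more generic equivalents."""
    chain = [identifier]
    if len(identifier) == 4 and identifier in ALPHA4_TO_ALPHA3:
        identifier = ALPHA4_TO_ALPHA3[identifier]
        chain.append(identifier)
    if len(identifier) == 3 and identifier in ALPHA3_TO_ALPHA2:
        chain.append(ALPHA3_TO_ALPHA2[identifier])
    return chain


print(generalise("xxaa"))  # ['xxaa', 'eng', 'en']
```

A consumer that understands only alpha-2 codes could take the last element of the returned list, while a more capable one could use the first.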
5 Language and language variants
5.1 Criteria for identifying language variants
No method of identification of a language is agreed by all or is appropriate for all purposes. As a result, there
can be disagreement, even among speakers or linguistic experts, as to whether two language variants
represent dialects of a single language or two distinct languages. For this part of ISO 639, judgements
regarding whether two language variants are considered to be the same or different languages are based on a
number of factors including linguistic similarity, intelligibility, a common literature and the views of speakers
concerning the relationship between language and identity. They can also be affected by current views and
local attitudes concerning the relationship between languages and ethnic identity and/or between languages
and speakers' religious affiliations. Such judgements cannot always be absolute, given the impossibility of
precisely defining degrees of linguistic similarity or inter-intelligibility and given the pressures of changing
political situations.
The following basic criteria should be followed.
⎯ Two related language variants are normally considered language variants of the same language if
speakers of each language variant have inherent understanding of the other language variant
(i.e. understanding based on knowledge of their own language variant without needing to learn the other
language variant) at a functional level.
⎯ Where spoken intelligibility between language variants is marginal, the existence of a common literature
or of a common ethnolinguistic identity with a central language variant that both understand can be strong
indicators that they should nevertheless be considered language variants of the same language.
⎯ Where there is enough intelligibility between language variants to enable communication, the existence of
well-established distinct ethnolinguistic identities can be a strong indicator that they should nevertheless
be considered to be different languages. Some of the distinctions made on this basis might not be
considered appropriate by some users or for certain applications.
These criteria shall be evaluated, where possible, according to open mediated discussion with identifiable
experts; the results of discussion shall be documented.
5.2 Identifying languages
For the purposes of international reference, a category of identified languages (or language units), as
recognized within the corpus of ISO 639 codes, shall be identified and represented as languages within this
part of ISO 639.
Any unresolved cases of identification or re-identification will be referred to the Joint Advisory Committee of
ISO 639.
5.3 Identifying spoken language variants
The application of alpha-4 language identifiers to spoken language variants within each language can be as
detailed as practicable, involving not only language variants (known popularly as “dialects”) but also
components (or “sub-dialects”) within each language variant.
The identification of language variants within a language is even more problematic than the identification of a
“language” itself. For practical purposes, it is assumed that boundaries exist between neighbouring language
variants or components of a spoken language, although there are most frequently gradual transitions among
them (of pronunciation and/or vocabulary and/or morphology). The most tangible boundaries among the
language variants and components of individual spoken languages are either geographic or ethnic, marked in
particular by intervening highlands or open water or other areas of low or non-existent population or by areas
where other languages predominate (although the relative dimensions of height, distance and/or population
can vary greatly).
5.4 Identifying written language variants
The alpha-4 language identifiers of this part of ISO 639, where applied to written language variants, shall also
denote the writing system, script and character set (where known) for use in identifying written and historical
language variants and orthographies.
NOTE Whilst written language variants in scripts are already included within the code specified in this part of ISO 639,
the framework also facilitates further extension to include transliteration and text to audio/audio to text representations.
5.5 Identifying transcription
The alpha-4 language identifiers of this part of ISO 639, where applied to code elements for transcription,
denote a transcription of a spoken language or language variant. System designers may find it useful to
further define these code elements by assigning an extension code from ISO 3166-1 or ISO 3166-2.
6 Structure
6.1 Model
6.1.1 General
The model for the code of this part of ISO 639 has been developed to be compatible with models being
developed in ISO/TC 37 in general. ISO/TC 37 standards for computational use of terminology, specifically
ISO 16642 and its combination with ISO 12620, emphasize the use of a metamodel in combination with
metadata identifiers, referred to as data categories. This part of ISO 639 provides a specific model for
language documentation and a list of metadata identifiers used within this model. These metadata identifiers
are described and documented within ISO 639-4. Further discussion regarding this model can be found in
References [15] and [16].
This model has also been developed in conformity with ISO/IEC 11179 (all parts) for the provision of a
language metadata registry as follows:
⎯ specified in accordance with ISO/IEC 11179-3;
⎯ defined in accordance with ISO/IEC 11179-4;
⎯ named in accordance with ISO/IEC 11179-5;
⎯ registered in accordance with ISO/IEC 11179-6;
and is intended to be fully compatible with the metadata registry specified in accordance with ISO 12620.
The identifiers and associated data shall be managed within a metadata registry in conformity with
ISO/IEC 11179-6.
6.1.2 Identification
Each language variant shall be provided with one language identifier. This language identifier is intended for
use as the unique data identifier (DI), to be used in combination with a version identifier (VI) and registration
authority identifier (RAI) for composing the international registration data identifier (IRDI).
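The composition of the three parts can be pictured with a small value type. This Python sketch is an illustration only: the class name, the sample values and in particular the ":" separator used when printing are assumptions, since the text above names the components but not a serialized form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRDI:
    rai: str  # registration authority identifier
    di: str   # data identifier: here, the language identifier
    vi: str   # version identifier

    def __str__(self) -> str:
        # The ":" separator is an assumption made for display only.
        return f"{self.rai}:{self.di}:{self.vi}"


# Placeholder values; none of them is a real registration.
print(IRDI(rai="RA-EXAMPLE", di="abcd", vi="1"))  # RA-EXAMPLE:abcd:1
```

Making the type immutable mirrors the requirement that identifiers do not change once assigned.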
6.1.3 Naming
The concepts of “language” and “language variant” are used in the model in a direct relationship with the
metamodel. Each language code element shall have the language reference names for the language variant
organized according to language sections. One or more names can be present in one or more languages that
identify the language variant. Since the model is hierarchical, every language code element includes a link
between the language variant and its parent. These links also provide a bridge between this part of ISO 639
and other codes in the ISO 639 series.
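A language code element with names grouped by language section and a link to its parent could be represented as shown below. This is a sketch under assumptions: the class and field names are hypothetical, language sections are assumed to be keyed by a language identifier, and "xxaa" and "xxab" are placeholder strings rather than registered identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodeElement:
    identifier: str
    parent: Optional[str]  # identifier of the parent node; None at the top of the hierarchy
    # Reference names grouped by language section (assumed to be keyed by language identifier).
    names: Dict[str, List[str]] = field(default_factory=dict)


def ancestors(identifier: str, table: Dict[str, CodeElement]) -> List[str]:
    """Follow parent links upwards and return the identifiers encountered, nearest first."""
    chain: List[str] = []
    seen = {identifier}
    current = table[identifier].parent
    while current is not None and current not in seen:  # guard against accidental cycles
        chain.append(current)
        seen.add(current)
        current = table[current].parent if current in table else None
    return chain


table = {
    "eng": CodeElement("eng", None, {"en": ["English"]}),
    "xxaa": CodeElement("xxaa", "eng", {"en": ["Hypothetical variant"]}),
    "xxab": CodeElement("xxab", "xxaa", {"en": ["Hypothetical sub-variant"]}),
}
print(ancestors("xxab", table))  # ['xxaa', 'eng']
```

Because the parent of an alpha-4 element may itself be an alpha-3 code from another part of ISO 639, the same walk also serves as the bridge to the other codes in the series.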
6.1.4 Representation
The alpha-4 language identifiers shall be regarded as representations in conformity with ISO/IEC 11179 (all
parts).
6.1.5 Language documentation
Language documentation shall be maintained in the registry in conformity with ISO/IEC 11179 (all parts).
6.2 Data categories used in this model
Table 1 comprises descriptors used for documenting and identifying each language variant, some of which are
deemed essential for purposes of registration.
A metadata registry for this part of ISO 639 shall be developed and maintained. The purpose of this metadata
registry is to support the ability to identify whether similar suitable items of metadata already exist and, if not,
to assist in the construction of a new description from similar existing language descriptions. The metadata
registry will allow similarities between registered items to be identified as a function of the identifiers used for
that item. Where similar or identical items are determined, appropriate decisions regarding the need for new
items or the harmonization of existing items shall be undertaken, with the decisions being fully documented
within the system and the status of the alpha-4 language identifiers being recorded accordingly.
These data categories are intended for capturing information that assists in the identification, the
determination of the provenance and monitoring of the quality of the registered language variants. They shall
be used to help avoid unnecessary variations when describing highly similar objects within the registry. The
development of the metadata registry might result in the addition of further data categories, and the
registration authority for this part of ISO 639 should be prepared to document such additions where it is
essential for interoperability within the
...

SLOVENIAN STANDARD
SIST ISO 639-6:2010
01-May-2010
Kode za predstavljanje imen jezikov - 6. del: Štiričrkovna koda za celovito predstavitev različic jezikov
Codes for the representation of names of languages - Part 6: Alpha-4 code for comprehensive coverage of language variants
Codes pour la représentation des noms de langues - Partie 6: Code alpha-4 pour un traitement exhaustif des variantes linguistiques
This Slovenian standard is identical to: ISO 639-6:2009
ICS: 01.140.20 Information sciences
SIST ISO 639-6:2010 en,fr
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.


INTERNATIONAL STANDARD ISO 639-6
First edition
2009-12-01

Codes for the representation of names of
languages —
Part 6:
Alpha-4 code for comprehensive
coverage of language variants
Codes pour la représentation des noms de langues —
Partie 6: Code alpha-4 pour un traitement exhaustif des variantes
linguistiques




Reference number: ISO 639-6:2009(E)
© ISO 2009

PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but
shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In
downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat
accepts no liability in this area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation
parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In
the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.


COPYRIGHT PROTECTED DOCUMENT


©  ISO 2009
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Terms relating to alpha-4 language code
3.2 Terms relating to language and language variants
4 Alpha-4 language identifier
4.1 Form of alpha-4 language identifier
4.2 Syntax of the alpha-4 language identifier
4.3 Use of language codes in the ISO 639 series of standards
5 Language and language variants
5.1 Criteria for identifying language variants
5.2 Identifying languages
5.3 Identifying spoken language variants
5.4 Identifying written language variants
5.5 Identifying transcription
6 Structure
6.1 Model
6.2 Data categories used in this model
7 Extension coding and register
8 Administration of code assignments
Annex A (informative) Example data
Annex B (normative) Operation of the Registration Authority (ISO 639-6/RA) and the Registration
Authorities Advisory Committee for ISO 639 (ISO 639/RA-JAC)
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 639-6 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 2, Terminographical and lexicographical working methods.
ISO 639 consists of the following parts, under the general title Codes for the representation of names of
languages:
⎯ Part 1: Alpha-2 code
⎯ Part 2: Alpha-3 code
⎯ Part 3: Alpha-3 code for comprehensive coverage of languages
⎯ Part 4: General principles of coding of the representation of names of languages and related entities, and
application guidelines
⎯ Part 5: Alpha-3 code for language families and groups
⎯ Part 6: Alpha-4 code for comprehensive coverage of language variants

Introduction
Within language-dependent resources, where use, re-use and interchange are of significant importance to
industry and academia alike, it is important to be able to fully document the data being captured within a
resource. One important element of these resources is the language itself. It is important therefore to be able
to identify the language of the resource as precisely and accurately as possible for the purposes of
interoperability and quality of information content. The ISO 639 series of standards has been developed with
these important goals in mind and this part of ISO 639 is concerned with capability for increased precision and
accuracy. Specifically, this part of ISO 639 is concerned with the identification and documentation of language
variants. Other parts of ISO 639 govern the identification of languages, language families and language
groups and establish general principles for managing and developing codes.
The terms and definitions in this part of ISO 639 have, where appropriate, been harmonized with ISO 639-3.
This part of ISO 639 provides an alpha-4 language identifier and a unique language reference name forming
the language code element for language variants. It also establishes a hierarchical framework that enables the
relationship between language variants, language families, language groups and languages to be shown.
The alpha-4 language identifiers and language reference names, standardized in accordance with this part of
ISO 639, are both complementary to, and compatible with, the alpha-2 and alpha-3 codes in other parts of the
ISO 639 series of standards.
The ISO 639 series of standards provides a coherent means for the identification of languages from the highly
generic to the highly specific. This part contributes to the objective of facilitating seamless transfer between
these standards so that users are able to combine and use language identifiers for language variants,
languages, macrolanguages, language groups and language families with minimal effort, whilst allowing a
stand-alone system for future development and extensibility.
The alpha-4 language identifiers and the language reference names are provided for use in a broad range of
applications, including terminology and lexicography, information and documentation (e.g. for education,
archival and retrieval processes, information services, cultural heritage and “bridging the digital divide”),
linguistics and information technology (search engines).
All language codes are to be regarded as open lists that can be extended and refined in accordance with the
registration procedures within each standard. The registration and maintenance procedures for this part of
ISO 639 have been adopted from ISO/IEC 11179-6.

INTERNATIONAL STANDARD ISO 639-6:2009(E)

Codes for the representation of names of languages —
Part 6:
Alpha-4 code for comprehensive coverage of language variants
1 Scope
This part of ISO 639 specifies a method for establishing four-letter language identifiers (alpha-4) and language
reference names for language variants and a hierarchical framework for relating them to languages, language
families and language groups. The alpha-4 language identifiers have been developed for use in a wide range
of applications, especially in computer systems, where there is a potential need to cover the entire range of
languages, language families and language groups as well as language variants within each identified
language. Alpha-4 language identifiers can support the quantity of known language variants and
accommodate any future expansion.
This part of ISO 639 provides a hierarchical framework, which facilitates backward compatibility with other
ISO 639 codes, based on linguistic and/or geolinguistic relationships, within which a comprehensive
enumeration of language variants is possible, including living, extinct, ancient and constructed languages,
whether major or minor. As a result, this part of ISO 639 caters for a very large number of languages and their
variants. This part of ISO 639 is not applicable to the registrations for languages designed exclusively for
machine use, such as computer-programming languages and reconstructed languages.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 639-1:2002, Codes for the representation of names of languages — Part 1: Alpha-2 code
ISO 639-2:1998, Codes for the representation of names of languages — Part 2: Alpha-3 code
ISO 639-4, Codes for the representation of names of languages — General principles of coding of the
representation of names of languages and related entities, and application guidelines
ISO/IEC 11179-1, Information technology — Metadata registries (MDR) — Part 1: Framework
ISO/IEC 11179-2, Information technology — Metadata registries (MDR) — Part 2: Classification
ISO/IEC 11179-3, Information technology — Metadata registries (MDR) — Part 3: Registry metamodel and
basic attributes
ISO/IEC 11179-4, Information technology — Metadata registries (MDR) — Part 4: Formulation of data
definitions
ISO/IEC 11179-5, Information technology — Metadata registries (MDR) — Part 5: Naming and identification
principles
ISO/IEC 11179-6:2005, Information technology — Metadata registries (MDR) — Part 6: Registration
3 Terms and definitions
For the purpose of this document, the terms and definitions in ISO 639-4 and the following apply.
3.1 Terms relating to alpha-4 language code
3.1.1
code
data transformed or represented in different forms according to a pre-established set of rules
[ISO 639-3:2007, definition 3.1]
3.1.2
code element
individual entry in a code table
[ISO 639-3:2007, definition 3.2]
3.1.3
language identifier
language symbol
unique string of three or four letters that represents a language variant (3.2.2)
NOTE In this part of ISO 639, three-letter language identifiers are those drawn from other parts of ISO 639. However,
the four-letter language identifiers are of variant linguistic entities not found in any of the other parts of ISO 639.
3.1.4
reference name
name
appellation
linguistic expression used to designate an individual concept
NOTE 1 In this part of ISO 639, the reference name is used to designate a language variant (3.2.2).
NOTE 2 The reference name may be that by which the language variant (3.2.2) is known in any one of many
languages.
NOTE 3 Adapted from ISO 639-3:2007, definition 3.4.
3.1.5
language code element
language (3.2.1) or language variant (3.2.2) entry in a code or code table consisting of categories of data
that are transformed or represented in different forms according to rules
NOTE 1 The language code element for this part of ISO 639 consists of an alpha-4 language identifier (3.1.3) and a
reference name (3.1.4) for language variants (3.2.2).
NOTE 2 The database for this part of ISO 639 includes the alpha-3 language identifiers (3.1.3) and their reference
names (3.1.4); these are not part of this part of ISO 639 but provide information on the hierarchical links of language
variants (3.2.2), languages (3.2.1), language families and language groups.
3.2 Terms relating to language and language variants
3.2.1
language
systematic use of sounds, characters, symbols or signs to express or communicate meaning or a message
3.2.2
language variant
uniquely identified use of language (3.2.1) based on language variation (3.2.3)
3.2.3
language variation
difference in the characteristics of individual languages (3.2.4)
3.2.4
individual language
language (3.2.1) that is distinguishable from other languages (3.2.1)
3.2.5
language documentation
information relating to the identification of a language (3.2.1)
4 Alpha-4 language identifier
4.1 Form of alpha-4 language identifier
A language identifier in accordance with this part of ISO 639 shall comprise a sequence of four letters from the
following set of 26 letters of the Latin alphabet in lower case: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u,
v, w, x, y, z. Use shall not be made of diacritical marks or modified characters.
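EXAMPLE (informative) The form requirement above can be checked mechanically. The following Python sketch is one possible check; the function name is_well_formed_alpha4 is an assumption made for illustration and does not come from this part of ISO 639. It tests form only and says nothing about whether an identifier has actually been registered.

```python
import re

# Exactly four letters, each one of the 26 lower-case Latin letters a-z (4.1).
# Diacritics, upper-case letters and digits fall outside the class and are rejected.
_ALPHA4_PATTERN = re.compile(r"[a-z]{4}")


def is_well_formed_alpha4(candidate: str) -> bool:
    """Return True if candidate has the form required of an alpha-4 identifier."""
    return _ALPHA4_PATTERN.fullmatch(candidate) is not None
```

A form check of this kind would normally run before any registry lookup, so that malformed input is rejected without consulting the registry.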
Alpha-4 language identifiers serve only as a device to uniquely represent language variants.
This part of ISO 639 makes provision for a reference name which is intended for use within a metadata
registry. This reference name is selected from one of the names by which the language is or has been
identified; this shall not be interpreted to imply that a name from any particular language is considered to be
the preferred name.
As far as possible, an alpha-4 language identifier begins with the same letter as the corresponding reference
name and includes one or more of the subsequent letters of that reference name; an effort being made to
avoid pronounceable sequences. With thousands of languages and language variants and many similar
names, alpha-4 language identifiers that resemble the reference name cannot be provided in every case. No
alpha-4 language identifier shall have the same form as any other four-letter language reference name.
To maintain continuity and stability, alpha-4 language identifiers shall not change or be reused for a different
purpose. In accordance with ISO/IEC 11179-6, when a language identifier is retired or superseded, the
alpha-4 language identifier shall remain within the registry for this part of ISO 639 for backward compatibility.
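EXAMPLE (informative) The stability rule above implies that a registry changes the status of an identifier rather than deleting it. The following Python sketch shows one way this might be modelled in memory. The class names, the status labels ("active", "retired", "superseded") and the identifiers used are all assumptions made for illustration; none of them is taken from this part of ISO 639 or from ISO/IEC 11179-6.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RegistryEntry:
    reference_name: str
    status: str = "active"  # hypothetical labels: "active", "retired", "superseded"
    superseded_by: Optional[str] = None


class IdentifierRegistry:
    """Toy in-memory registry in which identifiers are never deleted or reused."""

    def __init__(self) -> None:
        self._entries: Dict[str, RegistryEntry] = {}

    def register(self, identifier: str, reference_name: str) -> None:
        # Refuse any identifier that has ever been assigned, even if since retired.
        if identifier in self._entries:
            raise ValueError(f"{identifier!r} has already been assigned")
        self._entries[identifier] = RegistryEntry(reference_name)

    def retire(self, identifier: str, superseded_by: Optional[str] = None) -> None:
        # Only the status changes; the entry itself stays in the registry.
        entry = self._entries[identifier]
        entry.status = "superseded" if superseded_by else "retired"
        entry.superseded_by = superseded_by

    def lookup(self, identifier: str) -> RegistryEntry:
        return self._entries[identifier]
```

Because retired entries are kept, a lookup of an old identifier still resolves, which is what backward compatibility requires of systems holding older data.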
When translating this International Standard to languages that are written using non-Latin scripts, alpha-4
language identifiers shall be formed using the Latin alphabet according to the principles of this part of ISO 639.
The alpha-4 language identifier, and the language reference name, shall be treated as language-independent.
Knowledge of the world's languages and language variants at any given time is never complete or perfect.
Additional language reference names and their alpha-4 language identifiers can be added when a language
variant is found which differs from the language variants previously identified. It is essential that certain criteria
be met before a new language code element can be accepted. These conditions conform to
ISO/IEC 11179 (all parts) and are documented in 6.2. The denotation of an existing language code element
can be revised. Existing reference names can also be revised. In addition, existing language code elements
can be retired or become superseded when it is apparent that they no longer reflect language variant
distinctions in current use. When making changes, remaining language code elements shall not be adversely
affected.
4.2 Syntax of the alpha-4 language identifier
Different types of language variants may be represented by an alpha-4 language identifier (e.g. living
languages, ancient languages, artificially constructed languages) in a variety of communication modes
including written, spoken or signed modes. The list of communication modes is not exhaustive and can be
augmented over time. Table A.1 gives more detail.
Alpha-4 language code elements are linked to code elements provided by other standards in the ISO 639
series.
4.3 Use of language codes in the ISO 639 series of standards
The ISO 639 series of standards is intended to provide a compatible and interoperable set of language
identifiers for language variants, languages and language groups and families. All parts of the series shall be
considered when determining the most applicable language code, and the language identifiers may be applied,
where applicable, in the order alpha-2, alpha-3, or alpha-4 moving from more generic to more specific forms of
identification. The links provided between the alpha-4 and alpha-3 and between the alpha-3 and alpha-2
representations facilitate interoperability between systems, although each system can be used alone. No
alpha-4 language identifier shall be assigned to language code elements that already have alpha-3
representations assigned to them in other parts of ISO 639. The Registration Authority of this part of ISO 639
is responsible, through representation on the ISO 639 Joint Advisory Committee (JAC), for determining the
appropriate assignment of a language code element where there might be debate regarding the nature of the
language or language variant being registered.
5 Language and language variants
5.1 Criteria for identifying language variants
No method of identification of a language is agreed by all or is appropriate for all purposes. As a result, there
can be disagreement, even among speakers or linguistic experts, as to whether two language variants
represent dialects of a single language or two distinct languages. For this part of ISO 639, judgements
regarding whether two language variants are considered to be the same or different languages are based on a
number of factors including linguistic similarity, intelligibility, a common literature and the views of speakers
concerning the relationship between language and identity. They can also be affected by current views and
local attitudes concerning the relationship between languages and ethnic identity and/or between languages
and speakers' religious affiliations. Such judgements cannot always be absolute, given the impossibility of
precisely defining degrees of linguistic similarity or inter-intelligibility and given the pressures of changing
political situations.
The following basic criteria should be followed.
⎯ Two related language variants are normally considered language variants of the same language if
speakers of each language variant have inherent understanding of the other language variant
(i.e. understanding based on knowledge of their own language variant without needing to learn the other
language variant) at a functional level.
⎯ Where spoken intelligibility between language variants is marginal, the existence of a common literature
or of a common ethnolinguistic identity with a central language variant that both understand can be strong
indicators that they should nevertheless be considered language variants of the same language.
⎯ Where there is enough intelligibility between language variants to enable communication, the existence of
well-established distinct ethnolinguistic identities can be a strong indicator that they should nevertheless
be considered to be different languages. Some of the distinctions made on this basis might not be
considered appropriate by some users or for certain applications.
These criteria shall be evaluated, where possible, according to open mediated discussion with identifiable
experts; the results of discussion shall be documented.
5.2 Identifying languages
For the purposes of international reference, a category of identified languages (or language units), as
recognized within the corpus of ISO 639 codes, shall be identified and represented as languages within this
part of ISO 639.
Any unresolved cases of identification or re-identification will be referred to the Joint Advisory Committee of
ISO 639.
5.3 Identifying spoken language variants
The application of alpha-4 language identifiers to spoken language variants within each language can be as
detailed as practicable, involving not only language variants (known popularly as “dialects”) but also
components (or “sub-dialects”) within each language variant.
The identification of language variants within a language is even more problematic than the identification of a
“language” itself. For practical purposes, it is assumed that boundaries exist between neighbouring language
variants or components of a spoken language, although there are most frequently gradual transitions among
them (of pronunciation and/or vocabulary and/or morphology). The most tangible boundaries among the
language variants and components of individual spoken languages are either geographic or ethnic, marked in
particular by intervening highlands or open water or other areas of low or non-existent population or by areas
where other languages predominate (although the relative dimensions of height, distance and/or population
can vary greatly).
5.4 Identifying written language variants
The alpha-4 language identifiers of this part of ISO 639, where applied to written language variants, shall also
denote the writing system, script and character set (where known) for use in identifying written and historical
language variants and orthographies.
NOTE Whilst written language variants in scripts are already included within the code specified in this part of ISO 639,
the framework also facilitates further extension to include transliteration and text to audio/audio to text representations.
5.5 Identifying transcription
The alpha-4 language identifiers of this part of ISO 639, where applied to code elements for transcription,
denote a transcription of a spoken language or language variant. System designers may find it useful to
further define these code elements by assigning an extension code from ISO 3166-1 or ISO 3166-2.
6 Structure
6.1 Model
6.1.1 General
The model for the code of this part of ISO 639 has been developed to be compatible with models being
developed in ISO/TC 37 in general. ISO/TC 37 standards for computational use of terminology, specifically
ISO 16642 and its combination with ISO 12620, emphasize the use of a metamodel in combination with
metadata identifiers, referred to as data categories. This part of ISO 639 provides a specific model for
language documentation and a list of metadata identifiers used within this model. These metadata identifiers
are described and documented within ISO 639-4. Further discussion regarding this model can be found in
References [15] and [16].
This model has also been developed in conformity with ISO/IEC 11179 (all parts) for the provision of a
language metadata registry as follows:
⎯ specified in accordance with ISO/IEC 11179-3;
⎯ defined in accordance with ISO/IEC 11179-4;
⎯ named in accordance with ISO/IEC 11179-5;
⎯ registered in accordance with ISO/IEC 11179-6;
and is intended to be fully compatible with the metadata registry specified in accordance with ISO 12620.
The identifiers and associated data shall be managed within a metadata registry in conformity with
ISO/IEC 11179-6.
6.1.2 Identification
Each language variant shall be provided with one language identifier. This language identifier is intended for
use as the unique data identifier (DI), to be used in combination with a version identifier (VI) and registration
authority identifier (RAI) for composing the international registration data identifier (IRDI).
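EXAMPLE (informative) The composition described above can be pictured as a record of three components. The following Python sketch is illustrative only: the class name IRDI, the rendering with a colon separator, the order in which the components are written out and the sample values are all assumptions, not taken from this part of ISO 639 or from ISO/IEC 11179-6.

```python
from typing import NamedTuple


class IRDI(NamedTuple):
    """The three components named in 6.1.2."""

    rai: str  # registration authority identifier
    di: str   # data identifier: here, the language identifier
    vi: str   # version identifier

    def render(self, separator: str = ":") -> str:
        # Separator and component order are assumptions of this sketch.
        return separator.join((self.rai, self.di, self.vi))


# Invented placeholder values, not real registry data.
example = IRDI(rai="example-ra", di="xxaa", vi="1")
print(example.render())
```

Keeping the components separate, and joining them only for display, allows the version identifier to change while the data identifier stays fixed.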
6.1.3 Naming
The concepts of “language” and “language variant” are used in the model in a direct relationship with the
metamodel. Each language code element shall have the language reference names for the language variant
organized according to language sections. One or more names can be present in one or more languages that
identify the language variant. Since the model is hierarchical, every language code element includes a link
between the language variant and its parent. These links also provide a bridge between this part of ISO 639
and other codes in the ISO 639 series.
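EXAMPLE (informative) Because every language code element carries a link to its parent, the chain from a language variant up to its language and beyond can be recovered by following those links. The following Python sketch assumes the links are held in a simple mapping from identifier to parent identifier. The function name lineage and every identifier shown are invented placeholders, not entries from the registry.

```python
from typing import Dict, List


def lineage(identifier: str, parent_of: Dict[str, str]) -> List[str]:
    """Return identifier followed by its ancestors, most specific first."""
    chain = [identifier]
    seen = {identifier}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        # A well-formed hierarchy has no cycles; stop loudly if one is found.
        if parent in seen:
            raise ValueError(f"cycle detected at {parent!r}")
        chain.append(parent)
        seen.add(parent)
    return chain


# Invented placeholder data: two alpha-4 variants below one alpha-3 language.
links = {"xxab": "xxaa", "xxaa": "qaa"}
print(lineage("xxab", links))
```

A system that cannot process alpha-4 identifiers could walk such a chain until it reaches an alpha-3 or alpha-2 identifier it recognizes, which is the bridging role the links are intended to play.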
6.1.4 Representation
The alpha-4 language identifiers shall be regarded as representations in conformity with ISO/IEC 11179 (all
parts).
6.1.5 Language documentation
Language documentation shall be maintained in the registry in conformity with ISO/IEC 11179 (all parts).
6.2 Data categories used in this model
Table 1 comprises descriptors used for documenting and identifying each language variant, some of which are
deemed essential for purposes of registration.
A metadata registry for this part of ISO 639 shall be developed and maintained. The purpose of this metadata
registry is to support the ability to identify whether similar suitable items of metadata already exist and, if not,
to assist in the construction of a new description from similar existing language descriptions.
...

INTERNATIONAL STANDARD ISO 639-6
First edition
2009-12-01

Codes pour la représentation des noms
de langues —
Partie 6:
Code alpha-4 pour un traitement
exhaustif des variantes linguistiques
Codes for the representation of names of languages —
Part 6: Alpha-4 code for comprehensive coverage of language variants




Reference number: ISO 639-6:2009(F)
© ISO 2009

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The preparation of International Standards is normally entrusted to ISO
technical committees. Each member body interested in a subject has the right to be represented on the
technical committee established for that purpose. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives,
Part 2.
The main task of technical committees is to prepare International Standards. Draft International
Standards adopted by the technical committees are submitted to the member bodies for voting. Publication
as an International Standard requires approval by at least 75 % of the member bodies casting a
vote.
Attention is drawn to the fact that some of the elements of this document may be the subject of
intellectual property rights or similar rights. ISO shall not be held responsible for failing to
identify such rights or to give notice of their existence.
ISO 639-6 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 2, Terminographical and lexicographical working methods.
ISO 639 consists of the following parts, under the general title Codes for the representation of names of
languages:
⎯ Part 1: Alpha-2 code
⎯ Part 2: Alpha-3 code
⎯ Part 3: Alpha-3 code for comprehensive coverage of languages
⎯ Part 4: General principles of coding of the representation of names of languages and related entities,
and application guidelines
⎯ Part 5: Alpha-3 code for language families and groups
⎯ Part 6: Alpha-4 code for comprehensive coverage of language variants

Introduction
In language-dependent resources, where use, reuse and exchange are of significant importance in both
industry and academia, it is important to be able to document exhaustively the data entered in a given
resource. A key element of such resources is the language itself. It is therefore important to be able to
identify the language of the resource as exactly and as faithfully as possible in order to ensure
interoperability and the quality of the information content. The ISO 639 series of standards was developed
with these fundamental objectives in mind, and this part of ISO 639 aims to enable still greater precision
and accuracy. More specifically, this part of ISO 639 deals with the identification and documentation of
language variants. The other parts of ISO 639 govern the identification of languages, language families and
language groups, and establish the general principles for the management and development of the codes.
The terms and definitions in this part of ISO 639 have been harmonized with those of ISO 639-3 where this
was considered appropriate.
This part of ISO 639 provides an alpha-4 language identifier and a unique language reference name which
together form the language code element for a language variant. It also defines a hierarchical framework
for representing the relationships between language variants, language families, language groups and
languages.
The alpha-4 language identifiers and unique language reference names standardized in accordance with this
part of ISO 639 are both complementary to, and compatible with, the alpha-2 and alpha-3 codes standardized
in the other parts of the ISO 639 series.
The ISO 639 series of standards provides a consistent means of identifying languages, from the most generic
to the most specific. This part contributes to the goal of a seamless transition from one standard to
another, allowing users to combine and apply, with minimal effort, language identifiers for language
variants, languages, macrolanguages, language groups and language families, while having a self-contained
system that allows for future development and extension.
The alpha-4 language identifiers and language reference names provided can be used in a wide range of
applications, namely terminology and lexicography, information and documentation (for example in education,
archiving and information retrieval, in information services, for cultural heritage and for bridging the
"digital divide"), linguistics and information technology (search engines).
All language codes are to be regarded as open lists that can be extended and refined in accordance with the
registration procedures given in each standard. The registration and maintenance procedures of this part of
ISO 639 have been adopted from ISO/IEC 11179-6.

INTERNATIONAL STANDARD ISO 639-6:2009(F)

Codes for the representation of names of languages —
Part 6:
Alpha-4 code for comprehensive coverage of language
variants
1 Scope
This part of ISO 639 specifies a method for establishing four-letter language identifiers (alpha-4) and
language reference names for language variants, and a hierarchical framework for relating them to
languages, language families and language groups. The alpha-4 language identifiers have been developed for
use in a wide range of applications, especially in computer systems, where there is a potential need to
cover the entire range of languages, language families and language groups as well as the language variants
within each identified language. Alpha-4 language identifiers can support all known language variants and
accommodate future expansion.
This part of ISO 639 provides a hierarchical framework which facilitates backward compatibility with the
other ISO 639 codes. The framework is based on linguistic and/or geolinguistic relationships, within which
a comprehensive enumeration of language variants is possible, whether the languages are living, extinct,
ancient or constructed, and whether they are major or minor. As a result, this part of ISO 639 covers a
very large number of languages and language variants. This part of ISO 639 is not applicable to the
registration of languages designed exclusively for machine use, such as computer-programming languages,
or of reconstructed languages.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 639-1:2002, Codes for the representation of names of languages — Part 1: Alpha-2 code
ISO 639-2:1998, Codes for the representation of names of languages — Part 2: Alpha-3 code
ISO 639-4, Codes for the representation of names of languages — Part 4: General principles of coding of
the representation of names of languages and related entities, and application guidelines
ISO/IEC 11179-1, Information technology — Metadata registries (MDR) — Part 1: Framework
ISO/IEC 11179-2, Information technology — Metadata registries (MDR) — Part 2: Classification
ISO/IEC 11179-3, Information technology — Metadata registries (MDR) — Part 3: Registry metamodel
and basic attributes
ISO/IEC 11179-4, Information technology — Metadata registries (MDR) — Part 4: Formulation
of data definitions
ISO/IEC 11179-5, Information technology — Metadata registries (MDR) — Part 5: Naming and
identification principles
ISO/IEC 11179-6, Information technology — Metadata registries (MDR) — Part 6:
Registration
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 639-4 and the following
apply.
3.1 Terms relating to alpha-4 coding of languages
3.1.1
code
set of data transformed or represented in different forms according to a pre-established set of rules
[ISO 639-3:2007, definition 3.1]
3.1.2
code element
entry in a code table
[ISO 639-3:2007, definition 3.2]
3.1.3
language identifier
language symbol
unique string of three or four letters that represents a language variant (3.2.2)
NOTE In this part of ISO 639, three-letter language identifiers are those taken from the other parts of
ISO 639. Four-letter language identifiers, by contrast, denote language variants that are not found in any
of the other parts of ISO 639.
3.1.4
reference name
name
appellation
linguistic expression used to designate a particular concept
NOTE 1 In this part of ISO 639, the reference name is used to designate a language variant (3.2.2).
NOTE 2 The reference name may be the name by which a language variant (3.2.2) is known in many
other languages.
NOTE 3 Adapted from ISO 639-3:2007, definition 3.4.
3.1.5
language code element
entry for a language (3.2.1) or a language variant (3.2.2) in a code or code table, consisting of data
categories transformed or represented in different forms according to rules
NOTE 1 In this part of ISO 639, the language code element consists of an alpha-4 language identifier
(3.1.3) and a reference name (3.1.4) for language variants (3.2.2).
NOTE 2 The database of this part of ISO 639 includes alpha-3 language identifiers (3.1.3) and their
reference names (3.1.4); these do not form part of this part of ISO 639 but give information on the
hierarchical links between language variants (3.2.2), languages (3.2.1), language families and
language groups.
3.2 Terms relating to languages and language variants
3.2.1
language
systematic use of sounds, characters, symbols or signs to express or communicate meaning or
a message
3.2.2
language variant
uniquely identified use of a language (3.2.1) based on a linguistic variation (3.2.3)
3.2.3
linguistic variation
difference between the characteristics of individual languages (3.2.4)
3.2.4
individual language
language (3.2.1) that is distinct from other languages (3.2.1)
3.2.5
language documentation
information relating to the identification of a language (3.2.1)
4 Alpha-4 language identifier
4.1 Form of the alpha-4 language identifier
A language identifier conforming to this part of ISO 639 shall consist of a sequence of four letters, each
chosen from the following 26 lower-case letters of the Latin alphabet: a, b, c, d, e, f, g, h, i, j,
k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z. No diacritical marks or modified characters shall be
used.
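The form rule above can be checked mechanically. The following is a minimal sketch in Python, assuming a hypothetical helper named `is_alpha4_form` (not defined by this part of ISO 639). It tests only the shape of a string, not whether the identifier has actually been registered.

```python
import re

# Exactly four lower-case letters a-z; diacritics and modified characters are rejected.
_ALPHA4_FORM = re.compile(r"[a-z]{4}")


def is_alpha4_form(identifier: str) -> bool:
    """Return True if `identifier` has the alpha-4 form (hypothetical helper)."""
    return _ALPHA4_FORM.fullmatch(identifier) is not None
```

Using `fullmatch` rather than `match` means that longer strings which merely begin with four valid letters are rejected.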
Alpha-4 language identifiers serve only as a device for uniquely representing language
variants.
This part of ISO 639 provides a reference name intended for use within a metadata registry. This
reference name is chosen from among the names that identify or have identified the language; this shall
not be interpreted as implying that a name taken from a particular language is considered to be
the preferred name.
Wherever possible, an alpha-4 language identifier begins with the same letter as the corresponding
reference name of the language and includes one or more of the subsequent letters of that reference
name, an effort being made to avoid pronounceable sequences. Given the thousands of languages and
language variants that exist and the many similar names, alpha-4 language identifiers
resembling the reference name cannot be provided in every case. No alpha-4 language identifier
shall have the same form as another four-letter language reference name.
To ensure continuity and stability, alpha-4 language identifiers shall neither change nor be
reused for other purposes. In accordance with ISO/IEC 11179-6, when a language identifier is withdrawn or
replaced, the corresponding alpha-4 identifier shall remain in the registry of this part of ISO 639
to ensure backward compatibility.
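The stability rule can be illustrated with a small in-memory registry. This is only a sketch under stated assumptions: the class `Alpha4Registry`, its method names and the status labels are hypothetical and do not reproduce the ISO 639-6 database or the ISO/IEC 11179-6 registration procedures.

```python
class Alpha4Registry:
    """Hypothetical in-memory registry illustrating the no-reuse rule."""

    def __init__(self) -> None:
        # identifier -> [reference name, status]
        self._entries = {}

    def register(self, identifier: str, reference_name: str) -> None:
        # An identifier that was ever assigned, even if since retired, is never reused.
        if identifier in self._entries:
            raise ValueError(f"{identifier!r} is already assigned and cannot be reused")
        self._entries[identifier] = [reference_name, "active"]

    def retire(self, identifier: str) -> None:
        # Retirement only changes the status; the entry itself stays in the registry.
        self._entries[identifier][1] = "retired"

    def status(self, identifier: str) -> str:
        return self._entries[identifier][1]
```

Because a retired entry is kept rather than deleted, a later attempt to register the same identifier fails, which is what preserves backward compatibility for data that still carries the old identifier.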
In translations of this International Standard into languages written in non-Latin scripts, the
alpha-4 language identifiers shall be formed using the Latin alphabet according to the principles of this
part of ISO 639. The alpha-4 language identifier and the reference name of the language shall be treated
as language-independent.
Knowledge of the world's languages and language variants at any given time is never
perfect or complete. Reference names and the corresponding alpha-4 language identifiers may be
added whenever a language variant is discovered that differs from the variants already identified. Certain
criteria shall nevertheless be met before a new language code element can be accepted. These
conditions are in accordance with ISO/IEC 11179 (all parts) and are documented in 6.2. The denotation
of an existing language code element may be revised. Existing language reference names may
also be revised. Existing language code elements may also be withdrawn or replaced when it
becomes apparent that they no longer reflect the distinctions between language variants in current use.
Changes shall not affect the remaining language code elements.
4.2 Syntax of the alpha-4 language identifier
An alpha-4 language identifier can represent different types of language variant (for example
living languages, ancient languages, artificial languages) across a wide range of modes of
communication (written communication, oral communication or sign language). The list of modes of
communication is not exhaustive and may be extended at any time. For further details, see
Table A.1.
The alpha-4 language code elements have also been linked to the code elements of the other standards in the ISO 639 series.
4.3 Use of language code elements in the ISO 639 series of standards
The ISO 639 series of standards is intended to provide a compatible and interoperable set of language
identifiers for language variants, languages, language groups and language families. All
parts of the series shall be examined before determining the most suitable language code, and
language identifiers may be applied, where possible, in the order alpha-2, alpha-3, alpha-4,
moving from the most generic forms of identification to the most specific. The links between the
alpha-4 and alpha-3 representations and between the alpha-3 and alpha-2 representations facilitate
interoperability between systems, although each system can be used in isolation. No alpha-4 language
identifier shall be assigned to language code elements that already have an alpha-3 representation in other
parts of ISO 639. The Registration Authority for this part of ISO 639 is responsible, through
its representative on the ISO 639 Joint Advisory Committee, for determining the appropriate assignment
of a language code element when there is debate about the nature of the language or language variant to be registered.
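The generic-to-specific ordering described in 4.3 can be sketched as a walk over two link tables. This is an illustration only: the tables, the function name `generic_to_specific` and the identifiers `qq`, `qqq` and `qqqa` are hypothetical placeholders, not registered codes, and the real links are held in the database of this part of ISO 639.

```python
from typing import List

# Hypothetical link tables; the identifiers are placeholders, not registered codes.
ALPHA4_TO_ALPHA3 = {"qqqa": "qqq"}
ALPHA3_TO_ALPHA2 = {"qqq": "qq"}


def generic_to_specific(alpha4: str) -> List[str]:
    """List the linked identifiers from most generic (alpha-2) to most specific (alpha-4)."""
    chain = [alpha4]
    alpha3 = ALPHA4_TO_ALPHA3.get(alpha4)
    if alpha3 is not None:
        chain.insert(0, alpha3)
        alpha2 = ALPHA3_TO_ALPHA2.get(alpha3)
        if alpha2 is not None:
            chain.insert(0, alpha2)
    return chain
```

A system that cannot handle alpha-4 identifiers could fall back to an earlier element of the returned list, which is one way the links support interoperability while each code can still be used in isolation.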
5 Languages and language variants
5.1 Criteria for identifying language variants
No single method of identifying a language is generally agreed upon or suited to all
purposes. There can be disagreement, even among speakers or linguistic experts, as to
whether two language variants represent dialects of the same language or two distinct
languages. For the purposes of this part of ISO 639, judgements as to whether two
language variants are considered to be the same language or different languages are based
on a number of factors, including linguistic similarity, intelligibility, the existence of a common
literature and the views of speakers concerning the relationship between language and identity. They can
also be affected by current viewpoints and local attitudes concerning the relationship between languages
and ethnic identity and/or between languages and the religious affiliation of speakers. These judgements
cannot always be absolute, given, on the one hand, the impossibility of defining degrees of linguistic
similarity or mutual intelligibility precisely and, on the other hand, the pressures arising from changing
political situations.
The following basic criteria should be followed.
⎯ Two related language variants are normally considered to be variants
of the same language if speakers of each variant have, at a functional level, an inherent
understanding of the other variant (that is, an understanding based solely on knowledge of their
own language variant, without needing to learn the other variant).
⎯ Where spoken intelligibility between language variants is marginal, the existence of a common literature
or of a common ethnolinguistic identity, with a central language variant understood by
both, can be a strong indicator that they may nevertheless be considered
variants of the same language.
⎯ Where intelligibility between language variants is sufficient to enable communication,
the existence of well-established distinct ethnolinguistic identities can nevertheless be a strong
indicator that these variants are to be considered different languages. Some of the distinctions
made on this basis may not be considered appropriate by some users or
for some applications.
These criteria shall be evaluated, where possible, in open discussions mediated by identifiable
experts. The results of the discussion shall be documented.
5.2 Identification of languages
For the purposes of international reference, a category consisting of identified languages (or
linguistic units) recognized as such within the body of ISO 639 codes shall be identified and
represented as a language within the framework of this part of ISO 639.
Any unresolved case of identification or re-identification will be arbitrated by the ISO 639 Joint
Advisory Committee.
5.3 Identification of spoken language variants
The application of alpha-4 language identifiers to spoken language variants within each
language can be as detailed as possible, applying not only to language variants
(popularly known as "dialects") but also to components (or "sub-dialects")
of each variant.
Identifying language variants within a single language is even more problematic than
identifying a "language" itself. For practical purposes, it is accepted that boundaries exist between
neighbouring language variants or components of a spoken language even though, in most cases, the
transitions between them (in pronunciation, vocabulary and/or morphology) are gradual. The
most tangible boundaries between neighbouring language variants and components of individual spoken
languages are geographic or ethnic, and are marked in particular by mountainous areas or
bodies of water, by other areas of sparse or no population, or by regions where
other languages predominate (although the relative dimensions of height, distance and/or population
can vary considerably).
5.4 Identification of written language variants
When applied to written language variants, the alpha-4 language identifiers of this
part of ISO 639 shall also denote the writing system, namely the script and the usable character
set (where known), in order to identify written and historical language variants as well as
orthographies.
NOTE While variations in script are already included in the coding of this part of
ISO 639, its general framework also facilitates an extension to include
transliteration and text-to-audio and audio-to-text representations.
5.5 Identification of transcriptions
When applied to transcription code elements, the alpha-4 language identifiers of this part of
ISO 639 denote the transcription of a spoken language or language variant. System
designers might find it useful to qualify these code elements further by assigning them an extension code
from ISO 3166-1 or ISO 3166-2.
6 Structure
6.1 Model
6.1.1 General
The code model of this part of ISO 639 has been developed to be compatible with the
general models developed by ISO/TC 37. The ISO/TC 37 standards intended for the computer processing
of terminology, in particular ISO 16642 and its combination with ISO 12620, emphasize
the use of a metamodel linked to metadata identifiers called data
categories. This part of ISO 639 provides a specific model for the documentation of languages and
a list of the metadata identifiers used in that model. The metadata identifiers are described
and documented in ISO 639-4. A discussion of this model can be found in
References [15] and [16].
The model has also been developed in accordance with ISO/IEC 11179 (all parts) to enable
the development of a metadata registry on languages as follows:
⎯ specified in accordance with ISO/IEC 11179-3;
⎯ defined in accordance with ISO/IEC
...
