ISO/IEC 14651:2025
Information technology — International string ordering and comparison — Method for comparing character strings and description of the common template tailorable ordering
This document defines a reference comparison method. This method is applicable to two or more character strings to determine their collating order in a sorted list. The method can be applied to strings containing characters from the full repertoire of ISO/IEC 10646. This method is also applicable to subsets of that repertoire to produce ordering results valid (after tailoring) for a given set of languages for each script.

This method uses collation tables derived either from the Common Template Tables (CTT) referenced by this document or from one of their tailorings. The format of the Common Template Table is described using the Backus-Naur Form (BNF). The format is used normatively within this document.

This document also defines syntax elements to tailor these Common Template Tables used by the reference comparison method. Furthermore, it defines requirements for a declaration of the differences (delta) between a collation table and a given Common Template Table including the tailoring elements.

These Common Template Tables describe an order for all characters encoded in the current and past ISO/IEC 10646 editions, including amendments. They allow for a specification of a fully deterministic ordering. These tables enable the specification of a string ordering adapted to local ordering rules, without requiring an implementer to have knowledge of all the different scripts already encoded in the Universal Coded Character Set (UCS).

All these Common Template Tables have reference names which are related to a particular stage of development of the ISO/IEC 10646 Universal coded character set or a particular version of the Unicode Standard. These names and their relationship with ISO/IEC 10646 or the Unicode Standard repertoire are specified by an externally referenced document: Unicode Technical Standard, UTS #10, Unicode Collation Algorithm.

This document does not:
— mandate a specific comparison method; any equivalent method giving the same results is acceptable;
— mandate a specific format for describing or tailoring tables in a given implementation;
— mandate specific symbols to be used by implementations;
— mandate any specific internal format for intermediate keys used when comparing, nor for the table used. The use of numeric keys is not mandated either;
— mandate a context-dependent ordering;
— mandate any particular preparation of character strings prior to comparison.

NOTE 1 It is typical to prepare character strings prior to comparison even though this is not prescribed by this document (see Annex C).
NOTE 2 Annex D describes the problems that gave rise to this document, with their anticipated solutions.
Standards Content (Sample)
International Standard
ISO/IEC 14651
Seventh edition
2025-07
Information technology — International string ordering and comparison — Method for comparing character strings and description of the common template tailorable ordering
Technologies de l'information — Classement international et comparaison de chaînes de caractères — Méthode de comparaison de chaînes de caractères et description du modèle commun et adaptable d'ordre de classement
Reference number
© ISO/IEC 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2025 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and conventions
5 Conformance
6 String comparison
6.1 Preparation of character strings prior to comparison
6.2 Key building and comparison
6.2.1 Preliminary considerations
6.2.2 Reference ordering key formation
6.2.3 Reference comparison method for ordering character strings
6.2.4 Key ordering definition
6.3 Common Template Table: Formation and interpretation
6.3.1 General
6.3.2 BNF syntax rules for the Common Template Tables in Annex A
6.3.3 Well-formedness conditions
6.3.4 Interpretation of tailored tables
6.3.5 Evaluation of weight tables
6.3.6 Conditions for considering specific table equivalences
6.3.7 Conditions for results to be considered equivalent
6.4 Declaration of a delta
6.5 Names of the Common Template Tables and name declaration
Annex A (normative) Common Template Tables
Annex B (informative) Example tailoring deltas
Annex C (informative) Preparation
Annex D (informative) Tutorial on solutions brought by this document to problems of lexical ordering
Annex E (informative) Searching and fuzzy matches
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of document should be noted. This document was drafted in accordance with the editorial rules of the ISO/
IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of
any claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC
had not received notice of (a) patent(s) which may be required to implement this document. However,
implementers are cautioned that this may not represent the latest information, which may be obtained from
the patent database available at www.iso.org/patents and https://patents.iec.ch. ISO and IEC shall not be
held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www.iso.org/iso/foreword.html.
In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 2, Coded character sets.
This seventh edition cancels and replaces the sixth edition (ISO/IEC 14651:2020), which has been technically
revised.
The main changes are as follows:
— the Common Template Tables are now referenced externally, removing the need for frequent updates to
this document.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
Introduction
This document provides a method, applicable around the world, for ordering text data, and references
Common Template Tables which, when tailored, can meet a given language’s ordering requirements while
retaining reasonable ordering for other scripts.
Common Template Tables require some tailoring in different local environments. Conformance to this
document requires that all deviations from these templates, called “deltas”, be declared to document
resultant discrepancies.
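The idea of a declared delta can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the template order, the `(symbol, anchor)` delta format, and the names `apply_delta` and `declare_delta` are hypothetical and are not the tailoring syntax or the declaration format defined in 6.3 and 6.4 of this document.

```python
# Hypothetical template order for a handful of symbols (NOT taken from a
# Common Template Table).
TEMPLATE_ORDER = ["a", "å", "b", "c", "z"]

# Hypothetical delta: each pair (symbol, anchor) means
# "place symbol immediately after anchor".
DELTA = [("å", "z")]


def apply_delta(template, delta):
    """Return a tailored order; the template itself is left unchanged."""
    order = list(template)
    for symbol, anchor in delta:
        order.remove(symbol)
        order.insert(order.index(anchor) + 1, symbol)
    return order


def declare_delta(delta, template_name):
    """Produce a human-readable statement of every deviation from the template."""
    return [
        f"{template_name}: '{symbol}' moved to follow '{anchor}'"
        for symbol, anchor in delta
    ]


tailored = apply_delta(TEMPLATE_ORDER, DELTA)
declaration = declare_delta(DELTA, "toy-template")
```

Keeping the delta as data separate from the template means the same list both produces the tailored order and documents how that order departs from the template.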
This document describes a method to order text data independently of context.
ISO/IEC 30112 has specifications for ordering that informatively complement the specifications in this
document and indicates where additional information can be sought on ordering keywords defined in this
document.
International Standard ISO/IEC 14651:2025(en)
Information technology — International string ordering and
comparison — Method for comparing character strings and
description of the common template tailorable ordering
1 Scope
This document defines a reference comparison method. This method is applicable to two or more character
strings to determine their collating order in a sorted list. The method can be applied to strings containing
characters from the full repertoire of ISO/IEC 10646. This method is also applicable to subsets of that
repertoire to produce ordering results valid (after tailoring) for a given set of languages for each script.
This method uses collation tables derived either from the Common Template Tables (CTT) referenced by this
document or from one of their tailorings. The format of the Common Template Table is described using the
Backus-Naur Form (BNF). The format is used normatively within this document.
This document also defines syntax elements to tailor these Common Template Tables used by the reference
comparison method. Furthermore, it defines requirements for a declaration of the differences (delta)
between a collation table and a given Common Template Table including the tailoring elements.
These Common Template Tables describe an order for all characters encoded in the current and past
ISO/IEC 10646 editions, including amendments. They allow for a specification of a fully deterministic
ordering. These tables enable the specification of a string ordering adapted to local ordering rules, without
requiring an implementer to have knowledge of all the different scripts already encoded in the Universal
Coded Character Set (UCS).
All these Common Template Tables have reference names which are related to a particular stage of
development of the ISO/IEC 10646 Universal coded character set or a particular version of the Unicode
Standard. These names and their relationship with ISO/IEC 10646 or the Unicode Standard repertoire are
specified by an externally referenced document: Unicode Technical Standard, UTS #10, Unicode Collation
Algorithm.
This document does not:
— mandate a specific comparison method; any equivalent method giving the same results is acceptable;
— mandate a specific format for describing or tailoring tables in a given implementation;
— mandate specific symbols to be used by implementations;
— mandate any specific internal format for intermediate keys used when comparing, nor for the table used.
The use of numeric keys is not mandated either;
— mandate a context-dependent ordering;
— mandate any particular preparation of character strings prior to comparison.
NOTE 1 It is typical to do preparation of character strings prior to comparison even if it is not prescribed by
this document (see Annex C).
NOTE 2 Annex D describes the problems that gave rise to this document, with their anticipated solutions.
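As a rough illustration of table-driven comparison, the sketch below builds a multilevel key for each string from a small weight table and orders strings by comparing keys. It is a generic technique shown under stated assumptions, not the reference method of 6.2 (which is not reproduced in this excerpt): the weights in `TOY_TABLE` are invented, the three-level layout and the zero separator are choices made for this example, and the names `build_key` and `compare` are hypothetical.

```python
# Toy three-level weight table. The weights are invented for illustration;
# they are NOT taken from any Common Template Table.
TOY_TABLE = {
    "a": (1, 1, 1),
    "A": (1, 1, 2),
    "á": (1, 2, 1),
    "b": (2, 1, 1),
    "B": (2, 1, 2),
    "c": (3, 1, 1),
}

LEVELS = 3
LEVEL_SEPARATOR = 0  # lower than any weight above, so a prefix sorts first


def build_key(text, table=TOY_TABLE):
    """Concatenate all level-1 weights, then all level-2, then all level-3.

    Characters missing from the table raise KeyError in this sketch.
    """
    key = []
    for level in range(LEVELS):
        key.extend(table[ch][level] for ch in text)
        key.append(LEVEL_SEPARATOR)
    return tuple(key)


def compare(left, right, table=TOY_TABLE):
    """Return -1, 0 or 1 according to the order of the two keys."""
    k1, k2 = build_key(left, table), build_key(right, table)
    return (k1 > k2) - (k1 < k2)


words = ["b", "ab", "Ab", "áb", "a", "B"]
ordered = sorted(words, key=build_key)
```

Because only the resulting order matters, a comparison that walks both strings level by level without materialising keys would be equally valid, in line with the statement above that any equivalent method giving the same results is acceptable.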
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10646, Information technology — Universal coded character set (UCS)
Unicode Technical Standard. Unicode Technical Standard #10, Unicode Collation Algorithm: https://www.unicode.org/reports/tr10/
Common Template Tables. Unicode Technical Standard UTS #10, Unicode Collation Algorithm, Appendix B:
Synchronization with ISO/IEC 14651: https://www.unicode.org/reports
...
FINAL DRAFT
International Standard
ISO/IEC FDIS
ISO/IEC JTC 1/SC 2
Secretariat: JISC
Voting begins on: 2025-04-11
Voting terminates on: 2025-06-06
Information technology — International string ordering and comparison — Method for comparing character strings and description of the common template tailorable ordering
Technologies de l'information — Classement international et comparaison de chaînes de caractères — Méthode de comparaison de chaînes de caractères et description du modèle commun et adaptable d'ordre de classement
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number ISO/IEC FDIS 14651:2025(en)
© ISO/IEC 2025
FINAL DRAFT
ISO/IEC FDIS 14651:2025(en)
International
Standard
ISO/IEC FDIS
ISO/IEC JTC 1/SC 2
Information technology —
Secretariat: JISC
International string ordering
Voting begins on:
and comparison — Method for
comparing character strings and
Voting terminates on:
description of the common template
tailorable ordering
Technologies de l'information — Classement international
et comparaison de chaînes de caractères — Méthode de
comparaison de chaînes de caractères et description du modèle
commun et adaptable d'ordre de classement
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT,
WITH THEIR COMMENTS, NOTIFICATION OF ANY
RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE
AND TO PROVIDE SUPPOR TING DOCUMENTATION.
© ISO/IEC 2025
IN ADDITION TO THEIR EVALUATION AS
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
BEING ACCEPTABLE FOR INDUSTRIAL, TECHNO
LOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
INTERNATIONAL STANDARDS MAY ON OCCASION HAVE
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL
or ISO’s member body in the country of the requester.
TO BECOME STAN DARDS TO WHICH REFERENCE MAY BE
MADE IN NATIONAL REGULATIONS.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland Reference number
ISO/IEC FDIS 14651:2025(en) © ISO/IEC 2025
© ISO/IEC 2025 – All rights reserved
ii
ISO/IEC FDIS 14651:2025(en)
Contents Page
Foreword .iv
Introduction .v
1 Scope . 1
2 Normative references . 2
3 Terms and definitions . 2
4 Symbols and conventions . 3
5 Conformance . 3
6 String comparison . 4
6.1 Preparation of character strings prior to comparison .4
6.2 Key building and comparison .5
6.2.1 Preliminary considerations .5
6.2.2 Reference ordering key formation.6
6.2.3 Reference comparison method for ordering character strings .8
6.2.4 Key ordering definition .8
6.3 Common Template Table: Formation and interpretation .9
6.3.1 General .9
6.3.2 BNF syntax rules for the Common Template Tables in Annex A .10
6.3.3 Well-formedness conditions .11
6.3.4 Interpretation of tailored tables . 12
6.3.5 Evaluation of weight tables .14
6.3.6 Conditions for considering specific table equivalences.14
6.3.7 Conditions for results to be considered equivalent .14
6.4 Declaration of a delta .14
6.5 Names of the Common Template Tables and name declaration .16
Annex A (normative) Common Template Tables .18
Annex B (informative) Example tailoring deltas.20
Annex C (informative) Preparation .26
Annex D (informative) Tutorial on solutions brought by this document to problems of lexical
ordering .42
Annex E (informative) Searching and fuzzy matches .46
Bibliography .48
© ISO/IEC 2025 – All rights reserved
iii
ISO/IEC FDIS 14651:2025(en)
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of document should be noted. This document was drafted in accordance with the editorial rules of the ISO/
IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of
any claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC
had not received notice of (a) patent(s) which may be required to implement this document.
However, implementers are cautioned that this may not represent the latest information, which may be
obtained from the patent database available at www.iso.org/patents and https://patents.iec.ch. ISO and
IEC shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www.iso.org/iso/foreword.html.
In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 2, Coded character sets.
This seventh edition cancels and replaces the sixth edition (ISO/IEC 14651:2020), which has been technically
revised.
The main changes are as follows:
— the Common Template Tables are now referenced externally, removing the need for frequent update to
this document.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
Introduction
This document provides a method, applicable around the world, for ordering text data, and references
Common Template Tables which, when tailored, can meet a given language’s ordering requirements while
retaining reasonable ordering for other scripts.
Common Template Tables require some tailoring in different local environments. Conformance to this
document requires that all deviations from these templates, called “deltas”, be declared to document
resultant discrepancies.
This document describes a method to order text data independently of context.
ISO/IEC 30112 has specifications for ordering that informatively complement the specifications in this
document and indicates where additional information can be sought on ordering keywords defined in this
document.
FINAL DRAFT International Standard ISO/IEC FDIS 14651:2025(en)
Information technology — International string ordering and
comparison — Method for comparing character strings and
description of the common template tailorable ordering
1 Scope
This document defines a reference comparison method. This method is applicable to two or more character
strings to determine their collating order in a sorted list. The method can be applied to strings containing
characters from the full repertoire of ISO/IEC 10646. This method is also applicable to subsets of that
repertoire to produce ordering results valid (after tailoring) for a given set of languages for each script.
This method uses collation tables derived either from the Common Template Tables (CTT) referenced by this
document or from one of their tailoring. The format of the Common Template Table is described using the
Backus-Naur Form (BNF). The format is used normatively within this document.
This document also defines syntax elements to tailor these Common Template Tables used by the reference
comparison method. Furthermore, it defines requirements for a declaration of the differences (delta)
between a collation table and a given Common Template Table including the tailoring elements.
These Common Template Tables describe an order for all characters encoded in the current and past
ISO/IEC 10646 editions, including amendments. They allow for a specification of a fully deterministic
ordering. These tables enable the specification of a string ordering adapted to local ordering rules, without
requiring an implementer to have knowledge of all the different scripts already encoded in the Universal
Coded Character Set (UCS).
All these Common Template Tables have reference names which are related to a particular stage of
development of the ISO/IEC 10646 Universal coded character set or a particular version of the Unicode
Standard. These names and their relationship with ISO/IEC 10646 or the Unicode Standard repertoire are
specified by an externally referenced document:
...
ISO/IEC FDIS 14651:2025(en)
ISO/IEC JTC 1/SC 2
Secretariat: JISC
Date: 2025-03-27
Information technology — International string ordering and
comparison — Method for comparing character strings and
description of the common template tailorable ordering
Technologies de l'information — Classement international et comparaison de chaînes de
caractères — Méthode de comparaison de chaînes de caractères et description du modèle commun et adaptable
d'ordre de classement
Seventh edition, 2025
FDIS stage
© ISO/IEC 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication
may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying,
or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO
at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: + 41 22 749 01 11
E-mail: copyright@iso.org
Website: www.iso.org
Published in Switzerland.
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and conventions
5 Conformance
6 String comparison
Annex A (normative) Common Template Tables
Annex B (informative) Example tailoring deltas
Annex C (informative) Preparation
Annex D (informative) Tutorial on solutions brought by this document to problems of lexical
ordering
Annex E (informative) Searching and fuzzy matches
Bibliography
1 Scope
All these Common Template Tables have reference names which are related to a particular stage of
development of the ISO/IEC 10646 Universal coded character set or a particular version of the Unicode
Standard. These names and their relationship with ISO/IEC 10646 or the Unicode Standard repertoire are
specified by an externally referenced document: Unicode Technical Standard, UTS #10, Unicode Collation
Algorithm.
This document does not:
— mandate a specific comparison method; any equivalent method giving the same results is acceptable;
— mandate a specific format for describing or tailoring tables in a given implementation;
— mandate specific symbols to be used by implementations;
— mandate any specific internal format for intermediate keys used when comparing, nor for the table
used. The use of numeric keys is not mandated either;
— mandate a context-dependent ordering;
— mandate any particular preparation of character strings prior to comparison.
NOTE 1 It is typical to do preparation of character strings prior to comparison even if it is not prescribed by this
document (see Annex C).
NOTE 2 Annex D describes problems that gave rise to this document with their anticipated solutions.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10646, Information technology — Universal coded character set (UCS)
UNICODE TECHNICAL STANDARD. Unicode Technical Standard #10, Unicode Collation Algorithm:
https://www.unicode.org/reports/tr10/
COMMON TEMPLATE TABLES. Unicode Technical Standard UTS #10, Unicode Collation Algorithm, Appendix B:
Synchronization with ISO/IEC 14651:
https://www.unicode.org/reports/tr10/#Synch_ISO14651
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
character string
sequence of characters considered as a single object
Note 1 to entry: A character string to be ordered does not normally include the characters that delimit it, as for example
an “end of line” control character in a text file to be sorted.
3.2
collating symbol
symbol (3.12) used to specify weights assigned to a collating element (3.4)
3.3
collation table
weighting table
mapping from collating elements (3.4) to weighting elements (3.14)
3.4
collating element
sequence of one or more characters that are considered a single entity for ordering (3.7)
3.5
delta
list of the differences between a given collation table (3.3) and another one
Note 1 to entry: The given collation table, together with a given delta, forms a new collation table.
Note 2 to entry: Unless otherwise specified in this document, the term “delta” always refers to differences from a
Common Template Table as defined in this document.
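As an informal illustration of Note 1 to entry 3.5, a delta can be pictured as a set of overrides applied to an existing collation table. The Python sketch below is not the tailoring syntax defined by this document: the dictionary representation, the name `apply_delta` and every weight value are assumptions made for the example and are not taken from any Common Template Table.

```python
# Toy collation table: collating element -> weighting element
# (a tuple of weights, one per level). All values are invented.
BASE_TABLE = {
    "a": (10, 1, 1),
    "b": (11, 1, 1),
    "z": (35, 1, 1),
}

# Hypothetical delta: entries that are added to, or differ from, the base table.
DELTA = {
    "ä": (10, 2, 1),  # added entry
    "z": (36, 1, 1),  # changed entry
}

def apply_delta(table, delta):
    """Return a new collation table; the given table is left unmodified."""
    new_table = dict(table)
    new_table.update(delta)
    return new_table

TAILORED = apply_delta(BASE_TABLE, DELTA)
```

Keeping the base table unmodified makes it possible to list the delta separately from the table it applies to.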
3.6
level
collation level
sequence number for a subkey (3.11) in the series of subkeys forming a key
3.7
ordering
collation
process by which, given two strings, it is determined whether the first one is less than, equal to, or greater
than the second one
3.8
ordering key
sequence of subkeys (3.11) used to determine an order
3.9
preparation
collation preparation
process in which given character strings (3.1) are mapped to (other) character strings before the
calculation of the ordering key (3.8) for each of the strings
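This document does not prescribe any particular preparation. Purely as an illustration of entry 3.9, the Python sketch below maps a character string to another character string before any key is computed. The name `prepare`, the removal of a trailing line delimiter and the choice of Unicode normalization form NFC are assumptions made for the example.

```python
import unicodedata

def prepare(s):
    """Map a character string to another character string before key calculation."""
    # Drop trailing line delimiters, which are not normally part of the
    # string to be ordered (assumption: CR and LF are the delimiters).
    s = s.rstrip("\r\n")
    # Assumption: canonical composition (NFC) is the chosen preparation step.
    return unicodedata.normalize("NFC", s)

decomposed = "e\u0301te\u0301\n"  # "été" written with combining acute accents
prepared = prepare(decomposed)
```

After this step, the two accented letters are each represented by a single precomposed character, so a table lookup per character finds them directly.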
3.10
reference comparison method
method for establishing an order between two ordering keys (3.8)
Note 1 to entry: See Clause 6.
3.11
subkey
sequence of weights computed for a character string (3.1)
3.12
symbol
collating element (3.4)
3.13
weight
collation weight
positive integer value, used in subkeys (3.11), reflecting the relative order of collating elements
(3.4)
3.14
weighting element
list of a given number of weights sequentially ordered by level
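The relationship between the terms collation table (3.3), level (3.6), ordering key (3.8), subkey (3.11), weight (3.13) and weighting element (3.14) can be pictured with a small Python sketch. It is not the reference comparison method of Clause 6. It assumes that every character is its own collating element, that there are three levels, and that keys are compared subkey by subkey in level order; all weights are invented and the names `TABLE`, `ordering_key` and `compare` are hypothetical.

```python
# Toy collation table: collating element -> weighting element.
# A weighting element lists one weight per level (three levels here).
# All weights are invented; none comes from a Common Template Table.
TABLE = {
    "a": (10, 1, 1), "A": (10, 1, 2),
    "b": (11, 1, 1), "B": (11, 1, 2),
    "e": (14, 1, 1), "é": (14, 2, 1),
}
LEVELS = 3

def ordering_key(s):
    """Build the ordering key: one subkey (a tuple of weights) per level."""
    elements = [TABLE[ch] for ch in s]  # assumes one character per collating element
    return tuple(tuple(e[level] for e in elements) for level in range(LEVELS))

def compare(s1, s2):
    """Return -1, 0 or 1 as s1 orders before, equal to, or after s2."""
    k1, k2 = ordering_key(s1), ordering_key(s2)
    return (k1 > k2) - (k1 < k2)

words = ["Ba", "ab", "Ab", "ba"]
ordered = sorted(words, key=ordering_key)  # ['ab', 'Ab', 'ba', 'Ba']
```

Because the first subkey is compared first, a difference at level 1 outweighs any difference at a later level: "Ab" precedes "ba" even though its first letter has the larger level 3 weight.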
4 Symbols and conventions
Following ISO/IEC 10646, characters are referenced as UX where X stands for a series of one to eight
hexadecimal digits (where all the letters in the hexadecimal string are in upper case) and refers to the value
of that character in ISO/IEC 10646. This convention is used throughout this document.
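The UX convention can be checked mechanically. The Python sketch below is only an illustration: the names `UX_PATTERN`, `to_ux` and `from_ux` are hypothetical, and padding to at least four digits when formatting is an assumption, since the text above only requires one to eight upper-case hexadecimal digits.

```python
import re

# The letter U followed by one to eight upper-case hexadecimal digits.
UX_PATTERN = re.compile(r"U([0-9A-F]{1,8})")

def to_ux(ch, min_digits=4):
    """Format a character as UX; zero-padding to four digits is an assumption."""
    return "U" + format(ord(ch), "X").zfill(min_digits)

def from_ux(ref):
    """Return the character designated by a UX reference."""
    m = UX_PATTERN.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a UX reference: {ref!r}")
    return chr(int(m.group(1), 16))
```

For example, `to_ux("A")` gives `U0041`, and `from_ux("U0041")` gives back the letter A. A reference written with lower-case letters, such as `u0041`, is rejected.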
Any use of the term "The Common Template Table" in this document is applicable to all instances of Common
Template Tables referenced in Annex A. The term can also be abbreviated as CTT.
In the Comm
...