ISO 16642:2025
Management of terminology resources - Terminological markup framework
This document specifies a framework for representing data recorded in terminological data collections (TDCs). This framework includes a metamodel and methods for describing specific terminological markup languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for implementing constraints in a TML are defined, but not the specific constraints for individual TMLs. This document is designed to support the development and use of computer applications for terminological data and the exchange of such data between different applications. This document also defines the conditions that allow the data expressed in one TML to be mapped onto another TML.
Gestion des ressources terminologiques — Plate-forme pour le balisage de terminologies informatisées
Upravljanje terminoloških virov - Ogrodje za označevanje terminologije
General Information
- Status: Published
- Publication Date: 09-Dec-2025
- Technical Committee: ISO/TC 37/SC 3 - Management of terminology resources
- Drafting Committee: ISO/TC 37/SC 3 - Management of terminology resources
- Current Stage: 6060 - International Standard published
- Start Date: 10-Dec-2025
- Due Date: 23-Jun-2026
- Completion Date: 10-Dec-2025
Relations
- Effective Date: 24-Jun-2023
Overview
ISO 16642:2025 - Management of terminology resources - Terminological markup framework (TMF) - defines a modular framework for representing and exchanging terminological data collections (TDCs). The standard specifies a metamodel, methods for defining concrete terminological markup languages (TMLs), and mechanisms for expressing constraints and mapping data between different TMLs. TMLs are exemplified in XML in the document, but TMF is not restricted to a particular serialization format. The third edition (2025) updates the terms and definitions and the DatCatInfo PIDs, aligns the examples with ISO 30042:2019 (TBX), and broadens the scope beyond XML.
Key technical topics and requirements
- Metamodel and modular approach: TMF defines a shared, abstract metamodel plus data model extensions. This two-level design supports consistent analysis, design and interchange of terminological data.
- Data categories and DCRs: Uses formal data categories drawn from data category repositories (DCRs) such as DatCatInfo. Where possible, a TML should use data categories already documented in an existing DCR, together with the styles that specify how each data category is serialized.
- TML definition and serialization: Requirements for defining a terminological markup language (TML), including structural organization, expansion trees and serialization formats. XML serialization is shown as an example.
- Constraints and datatypes: Mechanisms to implement constraints on datatypes and to anchor data categories to specific vocabularies or persistent identifiers (PIDs).
- Interchange & interoperability: Conditions and mapping rules that allow data expressed in one TML to be translated or mapped onto another TML to enable reliable data exchange between applications.
- Language representation & annotations: Guidance on representing languages, language sections, annotations, brackets and other modeling constructs required for comprehensive terminological entries.
- Conformance: Requirements and example scenarios for verifying that terminological data conform to TMF.
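The mapping idea in the interchange bullet above can be illustrated with a minimal Python sketch. It assumes two hypothetical TMLs that use the same data categories and the same structure and differ only in their vocabulary (element names). The element names, the dictionaries `TML_A` and `TML_B`, and the function `map_tml` are invented for illustration and do not come from ISO 16642 or TBX; the actual interoperability conditions are those defined in the standard itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies: data category or component -> element name.
# Neither vocabulary is taken from ISO 16642 or from TBX.
TML_A = {"concept entry": "conceptEntry", "language section": "langSec", "term": "term"}
TML_B = {"concept entry": "entry", "language section": "lang", "term": "t"}


def map_tml(xml_text, source_vocab, target_vocab):
    """Rename every element from the source vocabulary to the target one.

    Raises KeyError if the document uses an element that has no
    counterpart in the shared set of data categories.
    """
    rename = {source_vocab[dc]: target_vocab[dc] for dc in source_vocab}
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = rename[element.tag]
    return ET.tostring(root, encoding="unicode")


sample = "<conceptEntry><langSec><term>metamodel</term></langSec></conceptEntry>"
mapped = map_tml(sample, TML_A, TML_B)
```

The lookup fails loudly on an unknown element, which reflects the assumption that a lossless mapping is only possible when both TMLs cover the same data categories.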
Practical applications and users
ISO 16642:2025 is designed to support the development, management and exchange of terminological data in a range of contexts:
- Terminology managers and terminologists building or curating TDCs
- Software vendors and developers of terminology management systems (TMS), termbases and lexicographic tools
- Localization and translation teams needing standardized term exchange (interoperability with TBX and other formats)
- NLP engineers, knowledge managers and ontology creators who integrate multilingual terminology into applications
- Standards organizations and corporate language governance teams seeking consistent metadata and data category usage
Related standards
- ISO 30042 (TermBase eXchange - TBX) - example interchange format aligned with terminological practice
- ISO 12620 / DatCatInfo - data category specifications and repositories referenced by TMF
- ISO 704 / ISO 1087 - foundational principles and vocabulary for terminology work
Use ISO 16642:2025 (TMF) to achieve consistent, interoperable terminological datasets, improve system-to-system exchange, and align implementations with recognized data category semantics for terminological resource management.
Frequently Asked Questions
ISO 16642:2025 is a standard published by the International Organization for Standardization (ISO). Its full title is "Management of terminology resources - Terminological markup framework". This standard covers: This document specifies a framework for representing data recorded in terminological data collections (TDCs). This framework includes a metamodel and methods for describing specific terminological markup languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for implementing constraints in a TML are defined, but not the specific constraints for individual TMLs. This document is designed to support the development and use of computer applications for terminological data and the exchange of such data between different applications. This document also defines the conditions that allow the data expressed in one TML to be mapped onto another TML.
ISO 16642:2025 is classified under the following ICS (International Classification for Standards) categories: 01.020 - Terminology (principles and coordination); 35.240.30 - IT applications in information, documentation and publishing. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO 16642:2025 has the following relationships with other standards: it is linked to ISO 16642:2017, the second edition, which it cancels and replaces. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
International Standard
ISO 16642
Third edition
2025-12
Management of terminology resources — Terminological markup framework
Gestion des ressources terminologiques — Plate-forme pour le balisage de terminologies informatisées
Reference number
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Modular approach
5 Generic model for describing terminological data
5.1 Principles
5.2 Metamodel
5.3 Example
6 Requirements for compliance to TMF
7 Interchange and interoperability
8 Representing languages
9 Defining a TML
9.1 Requirements
9.2 Defining interoperability conditions
10 Implementing an XML-serialized TML
10.1 General
10.2 Implementing the metamodel
10.3 Anchoring data categories
10.3.1 General
10.3.2 Styles and vocabulary
10.4 Constraints on datatypes
10.5 Implementing annotations
10.6 Implementing brackets
Annex A (informative) Conformance of terminological data to TMF: example scenario
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee
SC 3, Management of terminology resources.
This third edition cancels and replaces the second edition (ISO 16642:2017), which has been technically
revised.
The main changes are as follows:
— The scope is no longer restricted to representing terminological data in XML format. Terminological
markup languages can be serialized in any format, but they are exemplified in this document in XML.
— DatCatInfo PIDs have been updated.
— Terms and definitions have been updated according to the latest International Standards.
— Examples have been updated to reflect ISO 30042:2019.
— Annex A has been significantly revised in order to show two TMF-compliant XML-based serialization
examples of the same terminological data.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Terminological data are collected, managed and stored in a wide variety of systems, typically database
management systems, ranging from personal computer applications for individual users to large terminology
database systems operated by major companies and governmental agencies. Terminology databases
comprise various types of information, called “data categories”, and can adopt different structural models.
However, terminological data often need to be shared and reused in a number of applications, and this
sharing is facilitated when the data adhere to a common model. To facilitate co-operation and to prevent
duplicate work, it is important to develop standards and guidelines for creating and using terminological
data collections (TDCs) as well as for sharing and exchanging data.
This document presents a modular approach for analysing existing TDCs and designing new ones. It also
provides a framework for defining terminological markup languages (TMLs) that are interoperable.
This document refers to DatCatInfo, an example of an available data category repository (DCR). DatCatInfo
is an online database of information about the types of data that can be included in TDCs and other language
resources. It is available at www.datcatinfo.net.
International Standard ISO 16642:2025(en)
Management of terminology resources — Terminological
markup framework
1 Scope
This document specifies a framework for representing data recorded in terminological data collections
(TDCs). This framework includes a metamodel and methods for describing specific terminological markup
languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for
implementing constraints in a TML are defined, but not the specific constraints for individual TMLs.
This document is designed to support the development and use of computer applications for terminological
data and the exchange of such data between different applications. This document also defines the conditions
that allow the data expressed in one TML to be mapped onto another TML.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work and terminology science — Vocabulary
ISO 12616-1, Terminology work in support of multilingual communication — Part 1: Fundamentals of
translation-oriented terminography
ISO 26162 (all parts), Management of terminology resources — Terminology databases
ISO 30042, Management of terminology resources — TermBase eXchange (TBX)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
basic information unit
information unit (3.13) attached to a component (3.3) of the metamodel (3.15) that can be expressed by means
of a single data category (3.7)
EXAMPLE Term, note.
3.2
complementary information
CI
information supplementary to that described in concept entries (3.5) and shared across the terminological
data collection (3.23)
EXAMPLE Domain hierarchies, bibliographic references, references to text corpora.
3.3
component
elementary description unit of a metamodel (3.15) to which data categories (3.7) can be associated to form a
data model
3.4
compound information unit
information unit (3.13) attached to a component (3.3) of the metamodel (3.15) that can be expressed by means
of several grouped data categories (3.7)
EXAMPLE IDLangTgtDtyp, transacGrp.
3.5
concept entry
CE
terminological entry
entry
part of a terminological data collection (3.23) which contains the terminological data related to one concept
3.6
conceptual domain
permissible content of a data category (3.7)
EXAMPLE In a terminology database, the data category /part of speech/ can have a conceptual domain consisting
of the values /noun/, /verb/, /adjective/, /adverb/.
Note 1 to entry: The permissible content can be closed, as in the example, or subject to formal restrictions such as
dates, or free text such as the conceptual domain of /definition/. Although the latter type is not formally restricted, it
is nevertheless subject to adherence to the requirements of its data category specification (3.10), i.e. it contains a true
definition and not a note, example or some other piece of information.
[SOURCE: ISO 12620-1:2022, 3.1]
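As an illustration of the three kinds of conceptual domain mentioned in Note 1 (closed, formally restricted, free text), the following Python sketch checks a value against a closed picklist and against a date restriction. The picklist values are those of the example above; the function names and the choice of ISO 8601 calendar dates as the formal restriction are assumptions made for this sketch only.

```python
from datetime import date

# Closed conceptual domain of /part of speech/, as in the example in 3.6.
PART_OF_SPEECH = {"noun", "verb", "adjective", "adverb"}


def in_closed_domain(value, domain):
    """Return True if the value is one of the permissible values."""
    return value in domain


def is_iso_date(value):
    """Return True if the value parses as an ISO 8601 calendar date.

    This stands in for a 'formal restriction such as dates'; the exact
    date format is an assumption, not something the definition fixes.
    """
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True
```

A free-text conceptual domain such as that of /definition/ cannot be checked this way; adherence to its data category specification remains an editorial matter.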
3.7
data category
DC
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes (e.g.
/part of speech/).
3.8
data category repository
DCR
digital collection of data category specifications (3.10)
EXAMPLE DatCatInfo, a DCR for language resources (see Reference [6]).
Note 1 to entry: Data category repositories are used as references when specifying language resources.
[SOURCE: ISO 12620-1:2022, 3.6]
3.9
data category selection
DCS
DC selection
set of data category specifications (3.10) selected from a data category repository (3.8)
Note 1 to entry: A data category selection can represent the data categories (3.7) used within a research discipline or
a specific application or project.
[SOURCE: ISO 12620-1:2022, 3.7, modified — “chosen” replaced by “selected”; “DCS” made the second
preferred term.]
3.10
data category specification
DC specification
complete descriptive record of a data category (3.7)
[SOURCE: ISO 12620-1:2022, 3.5]
3.11
expansion tree
structured group of serialized elements that implement a level of the metamodel (3.15) in a given
terminological markup language (3.24)
3.12
global information
GI
technical and administrative information applying to an entire terminological data collection (3.23)
EXAMPLE Title of the terminological data collection, revision history, owner or copyright information.
3.13
information unit
IU
elementary piece of information attached to a component (3.3) of the metamodel (3.15)
3.14
language section
LS
part of a concept entry (3.5) containing information related to one language
3.15
metamodel
model that specifies one or more other models
[SOURCE: ISO/IEC TR 19583-24:2025, 3.2.19]
3.16
object language
language being described
3.17
PID
persistent identifier
unique identifier that ensures permanent access for a digital object by providing access to it independently
of its physical location or current ownership
[SOURCE: ISO 24619:2011, 3.2.4, modified — Note 1 to entry deleted.]
3.18
serialization
process of translating data structures or object states into a format that can be stored or transmitted and
reconstructed later
[SOURCE: ISO/TS 23494-1:2023, 3.16]
3.19
serialization format
data storage format for storing, transmitting, and reconstructing data structures or object states
[SOURCE: ISO/IEC TR 19583-24:2025, 3.2.41, modified — “object state” replaced by “object states”.]
3.20
style
specification for the implementation of a data category (3.7) in any serialization format (3.19)
3.21
term component section
TCS
part of a term section (3.22) containing linguistic information about the parts of a term
[SOURCE: ISO 26162-1:2019, 3.2.10, modified — “components” replaced by “parts”.]
3.22
term section
TS
part of a language section (3.14) containing information about a term
[SOURCE: ISO 26162-1:2019, 3.2.9]
3.23
terminological data collection
TDC
resource consisting of concept entries (3.5) with associated metadata and documentary information
EXAMPLE A TBX document instance, ISO 1087:2019.
3.24
terminological markup language
TML
serialization format (3.19) for representing a terminological data collection (3.23) conforming to the
constraints of the metamodel (3.15)
3.25
vocabulary
set of strings used to implement a data category (3.7) according to a style (3.20)
3.26
working language
language used to describe objects
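Read together, definitions 3.5, 3.14, 3.21 and 3.22 imply a containment chain from the terminological data collection down to the term component section. The Python sketch below records that chain as a simple parent table. Attaching global information (GI) and complementary information (CI) directly to the collection is an inference from definitions 3.12 and 3.2, and the table and function names are hypothetical.

```python
# Parent of each component, using the abbreviations from Clause 3.
# TDC > CE > LS > TS > TCS follows from the "part of" wording in the
# definitions; GI and CI at collection level is an inferred placement.
PARENT = {
    "GI": "TDC",
    "CI": "TDC",
    "CE": "TDC",
    "LS": "CE",
    "TS": "LS",
    "TCS": "TS",
}


def path_to_root(component):
    """Return the chain of components from the given one up to the TDC."""
    path = [component]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path
```

For example, `path_to_root("TS")` walks from a term section through its language section and concept entry up to the collection.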
4 Modular approach
Terminological markup framework (TMF) consists of two levels of abstraction:
— The first (and most abstract) level is the metamodel level. The metamodel level supports analysis, design
and exchange at a very general level, i.e. it is independent of any specific implementation or software. The
metamodel shall be shared by all TDCs that are compliant with TMF.
— The second level is the data model level, which adds the necessary data categories for representing
specific TDCs.
The implementation of a data model in any serialization format is called a TML. In this document, TMLs are
exemplified in XML format (see Reference [7]). TMLs can be described on the basis of a limited number of
characteristics, namely:
— the structural organization of the metamodel (i.e. the expansion trees) expressed by the TML;
— the specific data categories used by the TML and how they relate to the metamodel;
— the way in which these data categories can be serialized and anchored on the expansion trees of the TML,
e.g. the XML style of the data categories;
— the vocabularies used by the TML to express those various informational objects as, for example, XML
elements and attributes according to the corresponding XML styles.
Figure 1 represents the information required to fully specify a TML:
— the metamodel which describes the basic hierarchy of components to which any TML shall conform;
— a set of data category specifications from a data category repository (DCR), which can form the basis for
defining a data category selection (DCS) for the TML;
— the dialectal specification (dialect) which includes the various elements needed to represent a given
TML in a serialization format. These elements comprise expansion trees and data category instantiation
styles, together with their corresponding vocabularies.
Figure 1 — Various knowledge sources involved in the description of a TML
An example of a DCR providing sample data category specifications for language resources is available at
Reference [6]. Where possible, data categories documented in existing DCRs should be used for a TML. If
no suitable data category is available in existing DCRs, the implementers of the TML should propose the
creation of the required data category specification within existing or new DCRs.
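The three knowledge sources listed for Figure 1 (metamodel, data category selection, dialect) can be pictured as one record. The following Python sketch is only a way of organizing that information; the class names, field names and the sample element names are assumptions made here and are not defined by ISO 16642.

```python
from dataclasses import dataclass, field


@dataclass
class Style:
    """How one data category is instantiated in the serialization."""
    data_category: str  # e.g. "/definition/"
    kind: str           # hypothetical: "element" or "attribute"
    vocabulary: str     # string used for it in the serialized form


@dataclass
class Dialect:
    """Expansion trees plus styles with their vocabularies."""
    expansion_trees: dict[str, str]  # metamodel component -> element name
    styles: list[Style] = field(default_factory=list)


@dataclass
class TMLSpecification:
    metamodel: list[str]               # components shared by all TMLs
    data_category_selection: set[str]  # DCS drawn from a DCR
    dialect: Dialect

    def unanchored_categories(self):
        """Data categories in the DCS that have no style yet."""
        styled = {style.data_category for style in self.dialect.styles}
        return self.data_category_selection - styled


spec = TMLSpecification(
    metamodel=["TDC", "GI", "CI", "CE", "LS", "TS", "TCS"],
    data_category_selection={"/term/", "/definition/", "/part of speech/"},
    dialect=Dialect(
        expansion_trees={"CE": "conceptEntry", "LS": "langSec", "TS": "termSec"},
        styles=[
            Style("/term/", "element", "term"),
            Style("/definition/", "element", "definition"),
        ],
    ),
)
```

In this sample, `spec.unanchored_categories()` reports /part of speech/, which is selected but not yet given a style.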
5 Generic model for describing terminological data
5.1 Principles
This clause describes a class of XML document structures which can be used to represent a wide range of
terminological data formats and provides a framework for representing these document structures in XML.
Each type of document structure is described by means of a three-tiered information structure that
describes:
— a metamodel, which comprises a hierarchy of components;
— information units, which can be associated with each component of the metamodel;
— annotations, which can be used to qualify properties associated with a given information unit.
Information units can be basic or compound. A basic information unit encapsulates information that can be
expressed by means of a single data category. A compound information unit encapsulates information that is
expressed by means of several grouped data categories that, taken together, express a coherent information
unit.
EXAMPLE A compound information unit can be used to represent the fact that a transaction can be a combination
of a transaction type (such as modification), the person who performed it, and the date when it was performed.
Basic information units, whether they are directly attached to a component or placed within a compound
information unit, can take two non-exclusive types of value:
— an atomic value corresponding either to a simple type (in the sense of XML schemas) such as a number,
string, element of a picklist, or to a mixed content type in the case of annotated text;
— a reference to a component in order to express a relation between it and the current component.
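The distinction between basic and compound information units, and between atomic values and references, can be sketched with a few Python data classes. The class names, the data category names such as /transaction type/ and the identifier "c42" are illustrative assumptions, not names prescribed by this document.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ComponentRef:
    """A value that points at another component instead of holding data."""
    target_id: str


@dataclass
class BasicIU:
    """Information expressed by a single data category."""
    data_category: str
    value: Union[str, int, ComponentRef]


@dataclass
class CompoundIU:
    """Several grouped basic units that form one coherent unit."""
    name: str
    members: list[BasicIU]


def is_reference(unit):
    """Return True if the basic unit holds a reference rather than an atomic value."""
    return isinstance(unit.value, ComponentRef)


# The transaction example from the text: type, responsible person, date.
transaction = CompoundIU(
    "transaction",
    [
        BasicIU("/transaction type/", "modification"),
        BasicIU("/responsibility/", "ABC"),
        BasicIU("/date/", "2024-04-04"),
    ],
)

# A basic unit whose value relates the current component to another one.
related = BasicIU("/related concept/", ComponentRef("c42"))
```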
Information units can be abstractly represented as feature-value structures. For instance, the following
markup sample
XYZ
can be modelled as a basic information unit in the following feature-value structure:
[owner = XYZ]
In addition, the following TermBase eXchange (TBX) markup sample (see Reference [9])
modification
ABC
2024-04-04
can be modelled in a feature-value structure as shown in Figure 2.
Figure 2 — Feature-value structure
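To make the feature-value reading concrete, the sketch below converts a small piece of markup into nested Python dictionaries. The markup in `OWNER_SAMPLE` and `TRANSACTION_SAMPLE` is an assumed TBX-style rendering of the values shown above (XYZ, modification, ABC, 2024-04-04); it is not a verified copy of the samples in the standard, and the output is not a reproduction of Figure 2. The function name `to_feature_values` and the rule of using the `type` attribute as the feature name are likewise assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed TBX-style markup; element and attribute names are illustrative.
OWNER_SAMPLE = '<admin type="owner">XYZ</admin>'

TRANSACTION_SAMPLE = """
<transacGrp>
  <transac type="transactionType">modification</transac>
  <transacNote type="responsibility">ABC</transacNote>
  <date>2024-04-04</date>
</transacGrp>
"""


def to_feature_values(element):
    """Turn an element into a feature-value structure.

    A leaf element becomes {feature: value}, where the feature is the
    'type' attribute if present, otherwise the element name. An element
    with children becomes {element name: merged child structures}.
    """
    children = list(element)
    if not children:
        feature = element.get("type", element.tag)
        return {feature: (element.text or "").strip()}
    merged = {}
    for child in children:
        merged.update(to_feature_values(child))
    return {element.tag: merged}


owner = to_feature_values(ET.fromstring(OWNER_SAMPLE))
transaction = to_feature_values(ET.fromstring(TRANSACTION_SAMPLE.strip()))
```

Here `owner` is a flat structure for a basic information unit, while `transaction` nests three features under one group, which corresponds to a compound information unit.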
There can also be a need to associate additional information with the content of a data category; this is
achieved through annotations. A typical example is a definition in which the genus and/or diffe
...
SLOVENIAN STANDARD
01-June-2025
Upravljanje terminoloških virov - Ogrodje za označevanje terminologije
Management of terminology resources — Terminological markup framework
Title missing
This Slovenian standard is identical to: ISO/DIS 16642
ICS:
01.020 Terminology (principles and coordination)
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard, in whole or in part, is not permitted.
DRAFT International Standard
ISO/DIS 16642
ISO/TC 37/SC 3
Secretariat: DIN
Management of terminology resources — Terminological markup framework
Voting begins on: 2025-02-03
Voting terminates on: 2025-04-28
ICS: 01.020; 35.240.30
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENTS AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
This document is circulated as received from the committee secretariat.
Reference number ISO/DIS 16642:2025(en)
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Modular approach
5 Generic model for describing terminological data
5.1 Principles
5.2 The metamodel
5.3 Example
6 Requirements for compliance to TMF
7 Interchange and interoperability
8 Representing languages
9 Defining a TML
9.1 Steps
9.2 Defining interoperability conditions
10 Implementing an XML-serialized TML
10.1 General
10.2 Implementing the metamodel
10.3 Anchoring data categories
10.3.1 General
10.3.2 Styles and vocabulary
10.4 Constraints on datatypes
10.5 Implementing annotations
10.6 Implementing brackets
Annex A (informative) Conformance of terminological data to TMF: example scenario
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee
SC 3, Management of terminology resources.
This third edition cancels and replaces the second edition (ISO 16642:2017), which has been editorially and
technically revised.
The main changes are as follows:
— The scope is no longer restricted to representing terminological data in XML format. Terminological
markup languages can be serialized in any format, but they are exemplified in this document in XML.
— DatCatInfo PIDs have been updated.
— Terms and definitions have been updated according to the latest ISO standards.
— Examples have been updated to reflect ISO 30042:2019.
— Annex A has been significantly revised in order to show two TMF-compliant XML-based serialization
examples of the same terminological data.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
ISO/DIS 16642:2025(en)
Introduction
Terminological data are collected, managed and stored in a wide variety of systems, typically various
kinds of database management systems, ranging from personal computer applications for individual
users to large terminology database systems operated by major companies and governmental agencies.
Terminology databases comprise various types of information, called data categories, and can adopt
different structural models. However, terminological data often need to be shared and reused in a number
of applications, and this sharing is facilitated when the data adhere to a common model. To facilitate
cooperation and to prevent duplicate work, it is important to develop standards and guidelines for creating
and using terminological data collections (TDCs) as well as for sharing and exchanging data.
This document presents a modular approach for analyzing existing TDCs and designing new ones. It also
provides a framework for defining terminological markup languages (TMLs) that are interoperable.
This document makes reference to DatCatInfo, an example of an available data category repository (DCR).
DatCatInfo is an online database of information about the types of data that can be included in TDCs and
other language resources. It is available at www.datcatinfo.net.
DRAFT International Standard ISO/DIS 16642:2025(en)
Management of terminology resources — Terminological
markup framework
1 Scope
This document specifies a framework for representing data recorded in terminological data collections
(TDCs). This framework includes a metamodel and methods for describing specific terminological markup
languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for
implementing constraints in a TML are defined, but not the specific constraints for individual TMLs.
This document is designed to support the development and use of computer applications for terminological
data and the exchange of such data between different applications. This document also defines the conditions
that allow the data expressed in one TML to be mapped onto another TML.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work and terminology science — Vocabulary
ISO 12616-1, Terminology work in support of multilingual communication — Part 1: Fundamentals of
translation-oriented terminography
ISO 26162 (all parts), Management of terminology resources — Terminology databases
ISO 30042, Management of terminology resources — TermBase eXchange (TBX)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
basic information unit
information unit (3.13) attached to a component (3.3) of the metamodel that can be expressed by means of a
single data category (3.7)
3.2
complementary information
CI
information supplementary to that described in concept entries (3.5) and shared across the terminological
data collection (3.21)
EXAMPLE Domain hierarchies, institution descriptions, bibliographic references and references to text corpora
are typical examples of complementary information.
3.3
component
elementary description unit of a metamodel to which data categories (3.7) can be associated to form a data model
3.4
compound information unit
information unit (3.13) attached to a component (3.3) of the metamodel that is expressed by means of several
grouped data categories (3.7)
3.5
concept entry
CE
terminological entry
entry
part of a terminological data collection (3.21) which contains the terminological data related to one concept
[SOURCE: ISO 30042:2019, 3.5]
3.6
conceptual domain
permissible content of a data category (3.7)
EXAMPLE In a terminology database, the data category /part of speech/ can have a conceptual domain consisting
of the values /noun/, /verb/, /adjective/, /adverb/.
Note 1 to entry: The permissible content can be closed, as in the example, or subject to formal restrictions such as
dates, or free text such as the conceptual domain of /definition/. Although the latter type is not formally restricted,
it is nevertheless subject to adherence to the requirements of its data category specification, i.e., it contains a true
definition and not a note, example or some other piece of information.
[SOURCE: ISO 12620-1:2022, 3.1]
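As a non-normative illustration, the three kinds of permissible content named in Note 1 (closed, formally restricted, free text) could be checked along the following lines. This is a minimal Python sketch: the class `ConceptualDomain`, its fields and the helper `is_iso_date` are hypothetical names introduced here and are not defined by this document or by any DCR.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class ConceptualDomain:
    """Permissible content of a data category (hypothetical helper class)."""
    closed_values: Optional[FrozenSet[str]] = None  # closed set of values
    check: Optional[Callable[[str], bool]] = None   # formal restriction

    def permits(self, value: str) -> bool:
        if self.closed_values is not None:
            return value in self.closed_values
        if self.check is not None:
            return self.check(value)
        # Free text: not formally restricted; this sketch only rejects blanks.
        return bool(value.strip())


def is_iso_date(value: str) -> bool:
    """Formal restriction assumed for this sketch: an ISO 8601 calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


# Closed domain taken from the EXAMPLE for /part of speech/.
PART_OF_SPEECH = ConceptualDomain(
    closed_values=frozenset({"noun", "verb", "adjective", "adverb"})
)
# Formally restricted domain (dates) and free-text domain (/definition/).
DATE = ConceptualDomain(check=is_iso_date)
DEFINITION = ConceptualDomain()
```

A call such as `PART_OF_SPEECH.permits("noun")` succeeds, while a value outside the closed set is rejected. Whether free text is a true definition rather than a note cannot be checked mechanically and remains a matter for the data category specification.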
3.7
data category
DC
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes
(e.g., /part of speech/).
[SOURCE: ISO 30042:2019, 3.8, modified — The admitted term “DC” has been added.]
3.8
data category repository
DCR
digital collection of data category specifications (3.10)
EXAMPLE DatCatInfo, a DCR for language resources (see Reference [5]).
Note 1 to entry: Data category repositories are used as references when specifying language resources.
[SOURCE: ISO 12620-1:2022, 3.6]
3.9
data category selection
DC selection
DCS
set of data category specifications (3.10) selected from a data category repository (3.8)
Note 1 to entry: A data category selection can represent the data categories (3.7) used within a research discipline or
a specific application or project.
[SOURCE: ISO 12620-1:2022, 3.7]
3.10
data category specification
DC specification
complete descriptive record of a data category (3.7)
[SOURCE: ISO 12620-1:2022, 3.5]
3.11
expansion tree
structured group of serialized elements that implement a level of the metamodel in a given terminological
markup language (3.22)
3.12
global information
GI
technical and administrative information applying to an entire terminological data collection (3.21)
EXAMPLE Title of the terminological data collection, revision history, owner or copyright information.
3.13
information unit
IU
elementary piece of information attached to a component (3.3) of the metamodel
3.14
language section
LS
part of a concept entry (3.5) containing information related to one language
3.15
object language
language being described
3.16
persistent identifier
PID
unique Uniform Resource Identifier (URI) that assures permanent access to a digital object by providing
access to it independently of its physical location or current ownership
3.18
style
specification for the implementation of a data category (3.7) in any serialization format
3.19
term component section
TCS
part of a term section (3.20) containing linguistic information about the components (3.3) of a term
[SOURCE: ISO 26162-1:2019, 3.2.10]
3.20
term section
TS
part of a language section (3.14) containing information about a term
[SOURCE: ISO 26162-1:2019, 3.2.9]
3.21
terminological data collection
TDC
resource consisting of concept entries (3.5) with associated metadata and documentary information
3.22
terminological markup language
TML
serialization format for representing a terminological data collection (3.21) conforming to the constraints of
the metamodel
3.23
vocabulary
set of strings used to implement a data category (3.7) according to a style (3.18)
3.24
working language
language used to describe objects
4 Modular approach
The terminological markup framework (TMF) consists of two levels of abstraction. The first (and most abstract)
level is the metamodel level. The metamodel level supports analysis, design and exchange at a very general
level, i.e., it is independent of any specific implementation or software. The metamodel shall be shared by all
TDCs that are compliant with TMF. The second level is the data model level, which adds the necessary data
categories for representing specific TDCs.
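The two levels can be pictured with a short, non-normative Python sketch. The class structure stands for the metamodel level, using the component nesting implied by the definitions in Clause 3 (TDC, GI, CI, CE, LS, TS, TCS), while the keys stored in each `info` dictionary stand for the data categories added at the data model level. All class names, field names and data category keys below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """A metamodel component; `info` maps data category names to values."""
    info: Dict[str, str] = field(default_factory=dict)


@dataclass
class TermComponentSection(Component):
    pass


@dataclass
class TermSection(Component):
    term_components: List[TermComponentSection] = field(default_factory=list)


@dataclass
class LanguageSection(Component):
    terms: List[TermSection] = field(default_factory=list)


@dataclass
class ConceptEntry(Component):
    languages: List[LanguageSection] = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:
    global_information: Component = field(default_factory=Component)
    complementary_information: Component = field(default_factory=Component)
    entries: List[ConceptEntry] = field(default_factory=list)


def count_terms(tdc: TerminologicalDataCollection) -> int:
    """Count term sections across all concept entries and languages."""
    return sum(len(ls.terms) for ce in tdc.entries for ls in ce.languages)


# Data model level: the keys below are illustrative data categories.
entry = ConceptEntry(
    info={"subject field": "terminology management"},
    languages=[
        LanguageSection(
            info={"language": "en"},
            terms=[TermSection(info={"term": "concept entry",
                                     "part of speech": "noun"})],
        )
    ],
)
tdc = TerminologicalDataCollection(
    global_information=Component(info={"title": "Sample TDC"}),
    entries=[entry],
)
```

Keeping the component classes free of any fixed data category reflects the idea that the metamodel is shared, whereas the choice of data categories varies from one TDC to another.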
The implementation of a data model in any serialization format is called a terminological markup language
(TML). In this document, TMLs are exemplified in XML format (see Reference [6]). TMLs can be described on
the basis of a limited number of characteristics, namely:
— the structural organization of the metamodel (i.e., the expansion trees) expressed by the TML;
— the specific data categories used by the TML and how they relate to the metamodel;
— the way in which these data categories can be serialized and anchored on the expansion trees of the TML,
e.g., the XML style of the data categories;
— the vocabularies used by the TML to express those various informational objects as, for example, XML
elements and attributes according to the corresponding XML styles.
Figure 1 represents the information required to fully specify a TML:
— the metamodel which describes the basic hierarchy of components to which any TML shall conform;
— a set of data category specifications from a data category repository (DCR), which can form the basis for
defining a data category selection (DCS) for the TML;
— the dialectal specification (dialect) which includes the various elements needed to represent a given
TML in a serialization format. These elements comprise expansion trees and data category instantiation
styles, together with their corresponding vocabularies.
Figure 1 — Various knowledge sources involved in the description of a TML
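The role of the dialect can be illustrated with a small Python sketch that serializes the same term section data in two different XML dialects. It is a sketch under stated assumptions, not an implementation of any existing TML: the class `Dialect`, the element and attribute names (`termUnit`, `pos`, `TermSection`, `PartOfSpeech`) and the two style labels are invented for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict


@dataclass
class Dialect:
    """Hypothetical dialect: expansion tree names, styles and vocabulary."""
    expansion: Dict[str, str]   # metamodel component -> XML element name
    styles: Dict[str, str]      # data category -> "element" or "attribute"
    vocabulary: Dict[str, str]  # data category -> XML name implementing it

    def render(self, component: str, data: Dict[str, str]) -> ET.Element:
        """Serialize data category values attached to one component."""
        node = ET.Element(self.expansion[component])
        for category, value in data.items():
            name = self.vocabulary[category]
            if self.styles[category] == "attribute":
                node.set(name, value)
            else:
                ET.SubElement(node, name).text = value
        return node


# Two dialects over the same metamodel component (TS) and data categories.
DIALECT_A = Dialect(
    expansion={"TS": "termUnit"},
    styles={"term": "element", "part of speech": "attribute"},
    vocabulary={"term": "term", "part of speech": "pos"},
)
DIALECT_B = Dialect(
    expansion={"TS": "TermSection"},
    styles={"term": "element", "part of speech": "element"},
    vocabulary={"term": "Term", "part of speech": "PartOfSpeech"},
)

data = {"term": "metamodel", "part of speech": "noun"}
xml_a = ET.tostring(DIALECT_A.render("TS", data), encoding="unicode")
xml_b = ET.tostring(DIALECT_B.render("TS", data), encoding="unicode")
```

Here `xml_a` carries the part of speech as an attribute and `xml_b` carries it as a child element, yet both are produced from the same data categories attached to the same metamodel component.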
An example of a DCR providing sample data category specifications for language resources is available at
Reference [5]. Where possible, data categories documented in existing DCRs should be used for a TML. If
no suitable data category is available in existing DCRs, the implementers of the TML should propose the
creation of the required data category specification within existing or new DCRs.
5 Generic model for describing terminological data
5.1 Principles
This clause describes a class of XML document structures which can be used to represent a wide range of
terminological data formats and provides a framework for representing these document structures in XML.
Each type of document structure is described by means of a three-tiered information structure that
describes:
— a metamodel, which comprises a hierarchy of components;
— information units, which can be associated with each component of the metamodel;
— annotations, which can be used to qualify properties associated with a given information unit.
Information units can be basic or compound. A basic information unit encapsulates information that can be
expressed by means of a single data category. A compound information unit encapsulates information that is
expressed by means of several grouped data categories that, taken together, express a coherent information unit.
EXAMPLE A compound information unit can be used to represent the fact that a transaction can be a combination
of a transaction type (such as modification), the person who performed it, and the date when it was performed.
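The distinction between basic and compound information units can be sketched in Python as follows. The sketch is non-normative: the class names and the data category names used for the transaction (`transaction type`, `responsibility`, `date`) are assumptions chosen to mirror the EXAMPLE above, not names prescribed by this document.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class BasicInformationUnit:
    """Information expressed by means of a single data category."""
    data_category: str
    value: str


@dataclass(frozen=True)
class CompoundInformationUnit:
    """Several grouped data categories forming one coherent unit."""
    name: str
    members: Tuple[BasicInformationUnit, ...]

    def get(self, data_category: str) -> str:
        """Return the value of the first member with the given category."""
        for unit in self.members:
            if unit.data_category == data_category:
                return unit.value
        raise KeyError(data_category)


InformationUnit = Union[BasicInformationUnit, CompoundInformationUnit]

# A basic unit attached directly to a component.
part_of_speech = BasicInformationUnit("part of speech", "noun")

# The transaction from the EXAMPLE: type, person and date grouped together.
transaction = CompoundInformationUnit(
    name="transaction",
    members=(
        BasicInformationUnit("transaction type", "modification"),
        BasicInformationUnit("responsibility", "J. Doe"),
        BasicInformationUnit("date", "2025-12-09"),
    ),
)
```

Grouping the three values in one object keeps them together when the transaction is attached to a component, which is the property the compound information unit is meant to provide.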
Basic information units, whether they are directly attached to a component or are placed within a compound
information unit, can take two non-exclusive types of value:
— an atomic value corresponding either to a simple type (in the sense of XML schemas) such as a number,
string, element of a picklist, etc., or to a mixed content type in the case of a
...
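ISO 16642:2025 is a standard on the management of terminology resources; in particular, it specifies a framework for representing data recorded in terminological data collections (TDCs). The standard includes a metamodel and methods for describing specific terminological markup languages (TMLs), which are exemplified in XML (eXtensible Markup Language). The strength of the standard lies in its flexibility and generality. It defines the conditions that make data exchange between different TMLs possible, which promotes the development of a variety of computer applications and the use of terminological data. In addition, because the mechanisms for implementing constraints in a TML are clearly set out, developers can more easily accommodate specific language specifications. ISO 16642:2025 not only provides a common framework for recording and managing terminological data but also plays an important role in meeting the need for standardized terminology management across industries and fields. This increases the consistency of information about terminology resources and contributes to smoother specialized communication. In this respect, ISO 16642:2025 can be regarded as an extremely important standard in the field of terminology resource management.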
ISO 16642:2025 offers a comprehensive framework for the management of terminology resources, designed specifically for representing data in terminological data collections (TDCs). The standard focuses clearly on the development of a metamodel and on how specific terminological markup languages (TMLs) can be described. This is of central importance for professionals who work in terminology and deal daily with the exchange and development of terminological data. A prominent feature of the standard is the use of eXtensible Markup Language (XML) to exemplify terminological markup languages. This choice contributes substantially to the flexibility and accessibility of the standard, since XML is widely used in many applications and is well suited to representing structured data. The definition of mechanisms for implementing constraints in a TML, although not made concrete for specific TMLs, nevertheless gives application developers a valuable framework for integrating terminology into computer applications. Another essential aspect of ISO 16642:2025 is its clear statement of the conditions under which data expressed in one TML can be mapped onto another TML. This promotes not only interoperability between different terminology applications but also a uniform approach to terminology work. In summary, ISO 16642:2025 is a significant resource in the field of terminology management, efficiently supporting both the development and the exchange of terminological data. The relevance of the standard extends across various areas of application and ensures that terminology work rests on a solid, standardized foundation.
The standard ISO 16642:2025, entitled "Management of terminology resources - Terminological markup framework", offers an essential framework for representing data recorded in terminological data collections (TDCs). Its scope is particularly relevant for developers and users of computer applications that handle terminological data. Among the strengths of the standard is the definition of a robust metamodel that facilitates the description of specific terminological markup languages (TMLs). By relying on XML, the standard allows interchangeability and flexibility in the handling of terminological data. The possibility of implementing constraints in a TML, although not defined precisely for individual TMLs, establishes a structured framework that can be adapted to the needs of each application. In addition, ISO 16642:2025 stands out for its ability to facilitate the exchange of terminological data between different applications. This is particularly crucial in an environment where collaboration and interoperability are essential to the development of information technologies. The conditions defined for mapping data expressed in one TML onto another TML considerably increase the relevance of the standard in the field of terminology management. In summary, ISO 16642:2025 represents a fundamental milestone for the standardization and effective management of terminology resources. Its scope, strengths and relevance make it an indispensable tool in the service of technological innovation and the continuous improvement of terminology processes.
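The ISO 16642:2025 standard specifies a terminological markup framework for the management of terminology resources and provides a comprehensive structure for representing data recorded in terminological data collections (TDCs). The main strength of the standard is that it comprehensively defines a metamodel and methods for describing specific terminological markup languages (TMLs). For example, cases using XML (eXtensible Markup Language) are well explained in the document, which increases its practical applicability. The scope of the document defines the mechanisms for implementing constraints in a TML but does not include the specific constraints for individual TMLs. This provides the flexibility to meet the requirements of various TMLs and is an essential element in supporting the development of various computer applications and the exchange of terminological data. In addition, ISO 16642:2025 defines the conditions under which data expressed in one TML can be mapped onto another TML, highlighting the importance of data linking and integration. This makes it possible to maintain the consistency and interoperability of terminological data in corporate and research environments. In conclusion, the ISO 16642:2025 standard provides a systematic approach to the management and markup of terminology resources and is set to become an essential tool for terminology processing in today's digitalized information environment.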
The ISO 16642:2025 standard, titled "Management of terminology resources - Terminological markup framework," presents a comprehensive framework for representing data within terminological data collections (TDCs). Its well-defined scope emphasizes the relevance of a metamodel and facilitates the description of specific terminological markup languages (TMLs), with XML as a key exemplar. One of the standard's notable strengths lies in its structured approach to implementing a terminological markup framework. By outlining metamodels and methods for different TMLs, ISO 16642:2025 provides a robust foundation for the development and utilization of computer applications that manage terminological data. This is increasingly important in an era where efficient data exchange and interoperability between applications are paramount. Furthermore, the standard adeptly addresses the mechanisms necessary for implementing constraints within a TML, although it deliberately avoids specifying constraints for individual TMLs. This flexibility allows developers to tailor their implementations while adhering to a common framework, fostering a diverse range of applications in the domain of terminological resource management. The document also defines essential conditions for mapping data expressed in one TML onto another, which enhances the interoperability of terminology data across different systems. This mapping capability is critical for organizations that manage vast terminological resources and require seamless integration between various platforms. In summary, the ISO 16642:2025 standard articulates a well-structured framework for managing terminology resources through terminological markup, facilitating the development and exchange of terminological data across different applications. Its emphasis on metamodels, methods, and data mapping further underscores its strength and relevance in the continuously evolving landscape of terminological data management.













