Language resource management -- Semantic annotation framework (SemAF)

Gestion des ressources linguistiques -- Cadre d'annotation sémantique (SemAF)

Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 11. del: Merljive kvantitativne informacije (MQI)

General Information

Status
Published
Current Stage
5020 - FDIS ballot initiated: 2 months. Proof sent to secretariat
Start Date
10-May-2021
Completion Date
10-May-2021

Buy Standard

Draft
ISO/FDIS 24617-11 - Language resource management -- Semantic annotation framework (SemAF)
English language
21 pages
Draft
ISO/DIS 24617-11:2021 - COLOURS on PDF page 15
English language
29 pages
Draft
ISO/FDIS 24617-11 - Gestion des ressources linguistiques -- Cadre d'annotation sémantique (SemAF)
French language
22 pages

Standards Content (sample)

FINAL DRAFT INTERNATIONAL STANDARD
ISO/FDIS 24617-11
ISO/TC 37/SC 4
Secretariat: KATS
Voting begins on: 2021-05-10
Voting terminates on: 2021-07-05
Language resource management -- Semantic annotation framework (SemAF) -- Part 11: Measurable quantitative information (MQI)
Gestion des ressources linguistiques -- Cadre d'annotation sémantique (SemAF) -- Partie 11: Mesurer l'information quantitative (MQI)
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number ISO/FDIS 24617-11:2021(E)
© ISO 2021
COPYRIGHT PROTECTED DOCUMENT
© ISO 2021

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may

be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting

on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address

below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abstract specification of QML
    4.1 Overview
    4.2 Characteristics of QML
    4.3 Metamodel
    4.4 Abstract syntax of QML (QML_as)
    4.5 Concrete syntaxes of QML (QML_cs) and its subsets
5 XML-based concrete syntax of QML (QML_csx)
    5.1 General
    5.2 Tag names with ID prefixes
    5.3 Attribute specification of the root
    5.4 Attribute specification of the basic element types
    5.5 Attribute specification of the link types
    5.6 Illustrations of QML_csx
        5.6.1 General
        5.6.2 Sample data
        5.6.3 Procedure of annotation
6 TEI-based concrete syntax of QML (QML_cst)
    6.1 Concrete syntaxes of QML (QML_cst)
        6.1.1 Overall
        6.1.2 Tag names with ID prefixes
        6.1.3 Attribute specification of the basic element types
        6.1.4 Attribute specification of the two link types
    6.2 Illustrations of QML_cst
        6.2.1 Overall
        6.2.2 Sample data
        6.2.3 Illustrations of TEI-based Concrete Syntax
Annex A (informative) Illustrations of QML_csx with more samples
Annex B (informative) Informal statements of MQI
Annex C (informative) The representation of units
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards

bodies (ISO member bodies). The work of preparing International Standards is normally carried out

through ISO technical committees. Each member body interested in a subject for which a technical

committee has been established has the right to be represented on that committee. International

organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.

ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of

electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are

described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the

different types of ISO documents should be noted. This document was drafted in accordance with the

editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of

patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of

any patent rights identified during the development of the document will be in the Introduction and/or

on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not

constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and

expressions related to conformity assessment, as well as information about ISO's adherence to the

World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see

www.iso.org/iso/foreword.html.

This document was prepared by Technical Committee ISO/TC 37, Language and terminology,

Subcommittee SC 4, Language resource management.
A list of all parts in the ISO 24617 series can be found on the ISO website.

Any feedback or questions on this document should be directed to the user’s national standards body. A

complete listing of these bodies can be found at www.iso.org/members.html.
Introduction

Measurable quantitative information (MQI), such as ‘165 cm’ or ‘60 kg’ applied to the height or weight of a person named ‘John’, is very common in ordinary language. MQI describes one of the basic properties associated with the magnitude aspect of quantity. The main characteristic of MQI is that quantitative information is presented as measures expressed in terms of a pair <n, u>, consisting of a numerically expressed quantity n and a unit u, which is either basic or derived, and either normalized or conventionally used. Such information is even more abundant in scientific publications and technical reports, to the extent that it constitutes an essential part of the communicative segments of language in general. The processing of such information is thus required for any successful language resource management.

In the era of big data, demands from industry and academic communities for the precise acquisition of measurable quantitative information have increased. For example, business investment companies frequently need to aggregate various sorts of information covering net sales, gross profit, operating expenses, operating profit, interest expense, net profit before taxes, net income, etc., of the target companies from their annual reports. Fast-growing medical informatics research also needs to process large amounts of medical text to analyze the dose of medicine, the eligibility criteria of clinical trials, the phenotype characteristics of patients, the lab tests in clinical records, etc. [8]. All these demands, whether in industry or in medical research, require the accurate and consistent representation of measurable quantitative information for automated processing, computation, and exchange.

However, in the IR and NLP areas, there is no standardized way of representing measurable quantitative

information currently available. Each application system developed in industrial sectors has hitherto

used its own format to annotate measurable quantitative information. A flexible, interoperable and

standardized measurable quantitative information representation format for IR and NLP tasks to work

with many different application systems is called for.

This document aims to formulate a general annotation scheme that follows the principles of semantic annotation laid down in ISO 24617-6 and the basic requirements of ISO 24611, facilitates the processing of MQI in scientific and technical language, and is interoperable with other semantic annotation schemes, such as those of ISO 24617. The annotation scheme is designed to be interoperable with other parts of ISO 24617. It also utilizes various ISO standards on lexical resources and morpho-syntactic annotation frameworks. It aims at being compatible with other existing relevant standards.

NOTE ISO 24617-1 and ISO 24617-7, for instance, have proposed a way of annotating measures on time (durations or time amounts) and space (distances), respectively. ISO 24612 provides a pivotal form (graphic annotation framework) that makes all the annotation of temporal or spatial measures in these two annotation schemes interoperable.

QML is normalized at the abstract level, which allows various serialization formats, such as an XML-based representation, to represent annotated measurable quantitative information. The normalization of QI (quantitative information) annotation is stated at the abstract level of annotation, and the standoff annotation format is adopted at the concrete level of serialization.

Focusing on measurements in scientific and technological language, this document is expected to contribute to information retrieval (IR) [9], question answering (QA), text summarization (TS), and other natural language processing (NLP) applications [10].
Language resource management — Semantic annotation
framework (SemAF) —
Part 11:
Measurable Quantitative information (MQI)
1 Scope

This document covers the measurable or magnitudinal aspect of quantity so that it can focus on the

technical or practical use of measurements in IR (information retrieval), QA (question answering), TS

(text summarization), and other NLP (natural language processing) applications. It is applicable to the

domains of technology that carry more applicational relevance than some theoretical issues found in

the ordinary use of language.

NOTE ISO 24617-12 deals with more general and theoretical issues of quantification and quantitative

information.

This document also treats temporal durations that are discussed in ISO 24617-1, and spatial measures such as distances that are treated in ISO 24617-7, while making them interoperable with other measure types. It also accommodates the treatment of measures or amounts that are introduced in ISO 24617-6:2016, 8.3.
2 Normative references

The following documents are referred to in the text in such a way that some or all of their content

constitutes requirements of this document. For dated references, only the edition cited applies. For

undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 24612, Language resource management — Linguistic annotation framework (LAF)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

- ISO Online browsing platform: available at https://www.iso.org/obp
- IEC Electropedia: available at https://www.electropedia.org/
3.1
quantity
property of a measurable object referring to its magnitude or multitude

[SOURCE: ISO/IEC Guide 99:2007, 1.1, modified — Definition substantially redrafted, and Notes

removed.]
3.2
base quantity

quantity (3.1) in a conventionally chosen subset of a given system of quantities, where no quantity in

the subset can be expressed in terms of the other quantities within that subset

Note 1 to entry: Kinds of quantities include seven base quantities defined by the International System of

Quantities (ISQ).

[SOURCE: ISO/IEC Guide 99:2007, 1.4, modified — "no subset quantity" replaced with "no quantity in

the subset", "the others" replaced with "the other quantities within that subset", and Notes and Example

removed.]
3.3
derived quantity

quantity (3.1), in a system of quantities, defined in terms of the base quantities (3.2) of that system

EXAMPLE Speed is a derived quantity defined by length (distance) over time ($LT^{-1}$), where length (L) and time (T) are base quantities.
[SOURCE: ISO/IEC Guide 99:2007, 1.5, modified -- Example replaced.]
3.4
quantitative information
measurement associated with the quantity (3.1) of a measurable object
3.5
measurable quantitative information
MQI
quantitative information (3.4) that can be expressed in unitized numeric terms
3.6
measurable quantitative information markup language
markup language of measurable quantitative information
quantitative markup language
QML

specification language for the annotation of measurable quantitative information (3.5) extractable from

text or other medium types of language
3.7
measurement unit
unit of measurement
unit

scalar basis, defined and adopted by convention, of measuring objects by multiplying their quantitative

values expressed in real numbers

Note 1 to entry: The expressions that are used in measurement such as “metre”, “litre”, and “µmol/kg” are units

by the definition given above. The multitude expressions such as “bottles”, “boxes”, or “two” as in “two bottles of

milk”, “a box of apples”, and “two coffees” sometimes fail to be regarded as units, but they can also be if they are

accepted as units by convention or agreement in some communities. ISO 24617 SemAF Part 12: Quantification

treats such multitude expressions as genuine units.

[SOURCE: ISO/IEC Guide 99:2007, 1.9, modified -- Definition substantially redrafted, original Notes removed, new Note 1 to entry added.]
3.8
base unit
measurement unit (3.7) that is adopted by convention for a base quantity (3.2)

Note 1 to entry: There are seven base units chosen by the International System of Units (SI) associated with

seven ISQ base quantities to measure quantities, as shown in Table 1.
Table 1 — Base units
SI base unit (unit symbol)    Associated ISQ base quantity (base quantity symbol)
metre (m)                     length (L)
kilogram (kg)                 mass (M)
second (s)                    time (T)
ampere (A)                    electric current (I)
kelvin (K)                    thermodynamic temperature (Θ)
mole (mol)                    amount of substance (N)
candela (cd)                  luminous intensity (J)

[SOURCE: ISO/IEC Guide 99:2007, 1.10, modified -- Notes and Examples removed, new Note 1 to entry and Table 1 added.]
3.9
derived unit
measurement unit (3.7) for a derived quantity (3.3)

EXAMPLE The unit “newton” (N) is a derived unit for a derived quantity “force” (F), which is defined to be “mass times acceleration” ($MLT^{-2}$), where the quantity “acceleration” is a derived quantity defined by “velocity divided by time” ($VT^{-1}$) and “velocity” is defined by “length (distance) divided by time” ($LT^{-1}$).

Note 1 to entry: Table 2 illustrates some of the derived units.

[SOURCE: ISO/IEC Guide 99:2007, 1.11, modified — Examples removed, new Example and Note 1 to

entry added.]
Table 2 — derived units
Derived unit (unit symbol)                        Associated derived quantity
kilometre per minute (km/min)                     speed = length (L) / time (T)
gram per cubic metre (gram/m^3)                   density = mass (M) / volume (L^3)
kilogram metre per square second (kg x m/s^2)     force = mass (M) x length (L) / time (T^2)
lumen per square metre (lm/m^2)                   illuminance = luminous intensity (J) / area (L^2)
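
The derivations in 3.3 and 3.9 can be checked mechanically by treating each quantity as a set of exponents over the ISQ base quantities of Table 1. The following is a minimal sketch under that assumption; the helper names dim, times and over are illustrative only and are not defined by this document or by ISO/IEC Guide 99.

```python
# Hypothetical helpers: a dimension is a dict that maps ISQ base-quantity
# symbols (L, M, T, ...) to integer exponents. Zero exponents are dropped.

def dim(**exponents):
    """Build a dimension such as dim(L=1, T=-1), dropping zero exponents."""
    return {sym: exp for sym, exp in exponents.items() if exp != 0}

def times(a, b):
    """Multiply two dimensions by adding their exponents."""
    out = dict(a)
    for sym, exp in b.items():
        out[sym] = out.get(sym, 0) + exp
    return dim(**out)

def over(a, b):
    """Divide dimension a by dimension b by subtracting exponents."""
    return times(a, {sym: -exp for sym, exp in b.items()})

LENGTH, MASS, TIME = dim(L=1), dim(M=1), dim(T=1)

velocity = over(LENGTH, TIME)                   # L T^-1
acceleration = over(velocity, TIME)             # L T^-2
force = times(MASS, acceleration)               # M L T^-2
volume = times(LENGTH, times(LENGTH, LENGTH))   # L^3
density = over(MASS, volume)                    # M L^-3
```

Representing dimensions as exponent maps keeps the check independent of any particular unit spelling, so the same comparison works for "km/min" and "m/s".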
4 Abstract specification of QML
4.1 Overview

The quantitative markup language (QML) (3.6) is specified at two levels, abstract and concrete. Some

characteristics of QML are listed in 4.2. The overall structure of QML is represented by a metamodel, as

introduced in 4.3. The abstract syntax of QML as QML_as shall be a set-theoretic specification of QML in

conceptual terms that are independent of ways of representing the annotation (content) of measurable

quantitative information. The concrete syntax of QML as QML_cs shall be a specification of a set of

representation formats, based on QML_as, for the annotation of measurable quantitative information

in a computationally tractable way. The QML_as is introduced in 4.4, while QML_cs is presented in

4.5. Equivalent concrete syntaxes, including an XML-based concrete syntax QML_csx and a TEI-based

concrete syntax QML_cst, are described in Clause 5 and Clause 6, respectively.

NOTE There can be many equivalent concrete syntaxes defined on a single abstract syntax.

4.2 Characteristics of QML
QML shall have the following characteristics.

a) QML shall focus on the annotation of the measurable attributes of entities. For example, “BMI between 10-20 kg/m2”.

b) QML shall provide a way to annotate the relations of measures. For example, “age 40 or older” and “fpg>=100 mg/dl or a1c not less than 5,8 %”.

c) QML shall cover the complex uses of unitized numeric quantities. For example, “14,0 × 10^9” and “glycosylated haemoglobin (hba1c) <1,15 times the upper limit of normal”.

d) QML shall facilitate the identification of normalized numeric values and units as the measurable attributes of an associated entity.

NOTE QML does not specify ways of annotating the normalization (e.g. “millimoles per litre” is normalized

to “mmol/L”) or complete specification (e.g. “kg/m” is “kg/m2” for BMI) of units, which will be dealt with in

another part of ISO 24617 addressing automated implementation of MQI.
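
As an informal illustration of characteristics b) to d), the sketch below splits a short markable into an optional symbolic relator, a numeric value and a unit string. It is not part of QML: the function name parse_measure, the regular expression and the returned keys are assumptions made for this example, and the pattern only handles symbolic relators and numbers written with a decimal comma or a space as thousands separator. Verbal relators such as “not less than” are out of its reach.

```python
import re

# Assumed pattern: optional symbolic relator, a number (space-grouped
# thousands and a decimal comma or point are accepted), then a unit string.
_MEASURE = re.compile(
    r"(?P<relator><=|>=|<|>|=)?\s*"
    r"(?P<numeric>(?:\d{1,3}(?: \d{3})+|\d+)(?:[.,]\d+)?)\s*"
    r"(?P<unit>\S.*)?$"
)

def parse_measure(text):
    """Return relator, numeric value and unit found in text, or None."""
    match = _MEASURE.search(text)
    if match is None:
        return None
    raw = match.group("numeric").replace(" ", "").replace(",", ".")
    unit = (match.group("unit") or "").strip()
    return {
        "relator": match.group("relator"),
        "numeric": float(raw),
        "unit": unit or None,
    }

print(parse_measure("fpg>=100 mg/dl"))
print(parse_measure("1 950 metres"))
```

A real annotator would need a much richer grammar; the point here is only that the relator, the numeric value and the unit are separable pieces of one markable.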
4.3 Metamodel

The overall structure of measurable quantitative information is represented by the metamodel in

Figure 1.
Figure 1 — Metamodel of measurable quantitative information

This metamodel shall consist of seven class components, represented as square boxes in Figure 1:

a) source data as input to the annotation of MQI,
b) markables extracted from data sources,
c) three types of basic elements: entity, measure, and relator,
d) two types of links: measure link and comparison link.

The element “entity” shall be any object that has a measurable quantity, represented by “@quantity”, as one of its properties. The term “entity”, as used in this document, shall be a very general term that refers to any object: not just individual entities, but also their properties, such as the “height” of a building or the “speed” of a car, and also any kind of eventuality, such as states, processes or transitions.
EXAMPLE 1 We drove at more than 200 kilometres per hour on a German autobahn.

The speed expressed by “more than 200 kilometres per hour” is a quantitative property of a motion: the measure “more than 200 kilometres per hour” applies to the motion of driving mentioned in the example.

The element “measure” represents a measurable quantity of an entity in terms of three attributes:

quantity, unit, and type.
EXAMPLE 2 The height of Mt. Hall is 1 950 metres.

The measure shall consist of a quantity referred to by a numeric expression “1 950” and a unit “metre”.

It applies to the “height” quantity of the geographical object, named “Mt. Hall”.

The element “relator” which is associated with markables such as “equal to”, “greater than”, “<=”,

“between”, or “at least” has only a functional status of relating two or more measures.

EXAMPLE 3 One pound equals 16 ounces.
Here “equals” is a relator of identity between two measures, “one pound” and “16 ounces”.
EXAMPLE 4 1 foot is less than 1 metre, for it is exactly equal to 30,48 cm.

This example illustrates two types of links between measures: the relation of being “less than”, and that

of being an identity.

A link of the type “measure” shall relate a measure to the quantitative property of an entity. Such a link

is triggered by a measure element.

A link of the type “comparison” shall relate a measure to one or more other measures. Such a link is often triggered by a “relator” element.
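
The three basic element types and two link types described above can be pictured as plain data classes. The following is a sketch only: the class and field names (id, markable, related, trigger) and the type values "identity" and "equal" are assumptions chosen for readability, and the ID values follow the prefix convention of Table 3. EXAMPLE 2 and EXAMPLE 3 are used as data.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Entity:
    id: str
    markable: str
    type: str
    quantity: Optional[str] = None  # e.g. "height"

@dataclass
class Measure:
    id: str
    markable: str
    numeric: float
    unit: str
    type: str

@dataclass
class Relator:
    id: str
    markable: str
    type: str

@dataclass
class MeasureLink:
    """Relates a measure to the quantitative property of an entity."""
    id: str
    measure: Measure
    entity: Entity

@dataclass
class ComparisonLink:
    """Relates a measure to one or more other measures."""
    id: str
    measure: Measure
    related: List[Measure]
    trigger: Optional[Relator] = None

# EXAMPLE 2: The height of Mt. Hall is 1 950 metres.
mt_hall = Entity("x1", "Mt. Hall", "geographical", quantity="height")
height = Measure("me1", "1 950 metres", 1950.0, "metre", "length")
m_link = MeasureLink("mL1", height, mt_hall)

# EXAMPLE 3: One pound equals 16 ounces.
pound = Measure("me2", "One pound", 1.0, "pound", "mass")
ounces = Measure("me3", "16 ounces", 16.0, "ounce", "mass")
equals = Relator("c1", "equals", "identity")
c_link = ComparisonLink("cL1", pound, [ounces], trigger=equals)
```

Keeping links as separate objects that point at measures and entities mirrors the standoff style of annotation, where relations are stated outside the annotated elements themselves.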
4.4 Abstract syntax of QML (QML_as)

A markup language QML shall be a specification language for the annotation of MQI. The abstract syntax of QML shall specify an annotation scheme in set-theoretic terms based on a conceptual understanding of MQI. The abstract syntax QML_as is understood to be structured as a triple <B, R, @> such that

a) B is a set of three basic element types: entity, measure, and relator;
b) R is a set of two link types: measure and comparison types;

c) @ is a set of assignments that specify the list of attributes and their value types associated with

each of the basic element types in B and each of the link types in R.

Every element in B shall have at least one attribute, @type, and so shall every link. The values of @type are CDATA associated with each of the elements. For instance, the entity “mountain” is of the “geographical” type, and the entity named “John” is of the “person” type.

The values of @quantity for an entity are CDATA that may include values such as height, width, or

weight, and so on.

The assignment of measure shall have three attributes: @numeric, @unit, and @type. A possible value

of the attribute @numeric is a real number. A possible value of @unit is one of the units in a system

conventionally accepted such as one of the SI base units or derived units. A possible value of @type is

one of the quantities listed as ISQ base quantities or derived quantities, such as length, mass, voltage,

and so on.
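
The assignment set @ can be read as a table from element or link type to attribute names and value types. The sketch below encodes only the attributes named in this subclause and checks a candidate annotation against them. The dictionary keys "measureLink" and "comparisonLink", the function name check, and the decision to treat every listed attribute as required are assumptions of this example, not requirements of QML_as.

```python
from numbers import Real

# Attributes named in 4.4, per element or link type (key names are assumed).
ASSIGNMENTS = {
    "entity": {"type": str, "quantity": str},
    "measure": {"numeric": Real, "unit": str, "type": str},
    "relator": {"type": str},
    "measureLink": {"type": str},
    "comparisonLink": {"type": str},
}

def check(kind, attrs):
    """Return a list of problems; an empty list means attrs fit the sketch."""
    problems = []
    for name, value_type in ASSIGNMENTS[kind].items():
        if name not in attrs:
            problems.append(f"missing @{name}")
        elif not isinstance(attrs[name], value_type):
            problems.append(f"@{name} should be {value_type.__name__}")
    return problems

ok = check("measure", {"numeric": 1950, "unit": "metre", "type": "length"})
bad = check("measure", {"numeric": "1 950", "unit": "metre"})
```

Here ok is an empty list, while bad reports that @numeric is not a real number and that @type is missing.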
4.5 Concrete syntaxes of QML (QML_cs) and its subsets

An abstract syntax shall allow several semantically equivalent concrete syntaxes. QML_as likewise allows a set of equivalent concrete syntaxes of QML (QML_cs). This document introduces two kinds of concrete syntaxes, QML_csx and QML_cst, in Clause 5 and Clause 6, respectively.

The two concrete syntaxes, QML_csx and QML_cst, are both based on the abstract syntax QML_as, while adopting XML as their representation language. They shall comply with the requirement of standoff annotation in ISO 24612.

These two concrete syntaxes do, however, differ from each other in at least two aspects. Just like the

other Parts of ISO 24617 on semantic annotation, such as ISO 24617-1, ISO 24617-7, and ISO 24617-6,

QML_csx does not separate annotation content structures from their anchoring (referencing)

structures, although this separation is required by LAF for linguistic annotation.

In contrast, QML_cst is feature-structure-based. It shall follow LAF for the separation of the two structures, anchoring and content structures, in representing measurement information in feature structures. Furthermore, QML_cst, as specified in this document, shall adopt the names of XML elements and attributes with value type specifications from the TEI P5 Guidelines of the Text Encoding Initiative Consortium for the representation of MQI.
5 XML-based concrete syntax of QML (QML_csx)
5.1 General

The XML-based concrete syntax QML_csx is introduced in two steps. The first step is to list the tag

names and ID prefixes of QML_csx in 5.2. The second step is to specify the attribute assignments for the

XML root in 5.3, for each of the basic element types listed in 5.4, and for each of the link types listed in

5.5.

NOTE The root tag is introduced in XML to embed a list of XML elements into a single structure.

5.2 Tag names with ID prefixes

Corresponding to each of the basic element types and the link types for QML_csx, there is a unique tag

and a unique ID prefix, as shown in Table 3.
Table 3 — List of tags and ID prefixes of QML_csx
Tags                   ID prefixes   Comment
Root                   mqi           XML root tag
Basic element types
  Entity               x             object to which a measure applies
  Measure              me            unitized numeric quantities only
  Relator              c             triggers a link relating measures
Link types
  Measure link         mL            relates a measure to an entity and is triggered by a measure
  Comparison link      cL            relates a measure to one or more other measures

NOTE The attribute name for each ID in XML is xml:id and each of its values is an ID prefix followed by a

positive integer, e.g. .
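
The ID convention in the NOTE above (an ID prefix from Table 3 followed by a positive integer) is simple enough to sketch in code. The class name IdFactory, the function is_valid_id and the dictionary keys are hypothetical; only the prefixes themselves are taken from Table 3.

```python
import itertools
import re

# ID prefixes from Table 3; the dictionary keys are names assumed here.
ID_PREFIXES = {
    "root": "mqi",
    "entity": "x",
    "measure": "me",
    "relator": "c",
    "measureLink": "mL",
    "comparisonLink": "cL",
}

class IdFactory:
    """Hands out xml:id values as prefix + 1, prefix + 2, ... per prefix."""

    def __init__(self):
        self._counters = {p: itertools.count(1) for p in ID_PREFIXES.values()}

    def new_id(self, kind):
        prefix = ID_PREFIXES[kind]
        return f"{prefix}{next(self._counters[prefix])}"

# A known prefix followed by a positive integer without leading zeros.
_ID_RE = re.compile(
    "^(?:" + "|".join(map(re.escape, ID_PREFIXES.values())) + ")[1-9][0-9]*$"
)

def is_valid_id(value):
    return _ID_RE.match(value) is not None

ids = IdFactory()
first_measure = ids.new_id("measure")
second_measure = ids.new_id("measure")
first_entity = ids.new_id("entity")
```

Keeping one counter per prefix means that measure IDs and entity IDs are numbered independently, which matches the per-type prefixes of the table.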
5.3 Attribute specification of the root
List 1: A list of attributes for the root element in extended BNF (Backus-Naur form)
attributes = identifier, target, [lang], [mediumTyp
...

SLOVENIAN STANDARD
oSIST ISO/DIS 24617-11:2021
01-March-2021
Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 11. del: Merljive kvantitativne informacije (MQI)
Language resource management -- Semantic annotation framework (SemAF) - Part 11: Measurable Quantitative information (MQI)
Gestion des ressources linguistiques -- Cadre d'annotation sémantique - Partie 11: Mesurer l'information quantitative (MQI)
This Slovenian standard is identical to: ISO/DIS 24617-11
ICS:
01.020      Terminology (principles and coordination)
01.140.20   Information sciences
35.240.30   IT applications in information, documentation and publishing
oSIST ISO/DIS 24617-11:2021 en

2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

DRAFT INTERNATIONAL STANDARD
ISO/DIS 24617-11
ISO/TC 37/SC 4 Secretariat: KATS
Voting begins on: 2020-03-16
Voting terminates on: 2020-06-08
Language resource management — Semantic annotation
framework (SemAF) —
Part 11:
Measurable Quantitative information (MQI)
Gestion des ressources linguistiques — Cadre d'annotation sémantique —
Partie 11: Mesurer l'information quantitative (MQI)
ICS: 01.020
This document is circulated as received from the committee secretariat.

THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENT AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.

IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.

RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.

Reference number
ISO/DIS 24617-11:2020(E)
© ISO 2020
COPYRIGHT PROTECTED DOCUMENT
© ISO 2020

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Background and Motivations
5 Purposes and Requirements
6 Abstract Specification of SemAF-MQI
  6.1 Overview
  6.2 Characteristics of SemAF-MQI
  6.3 Metamodel
  6.4 Abstract syntax of QML (QML_as)
  6.5 Concrete Syntaxes of QML (QML_cs)
7 XML-based Concrete Syntax of QML (QML_csx)
  7.1 Overall
  7.2 Tag names with ID prefixes
  7.3 Attribute specification of the root
  7.4 Attribute specification of the basic element types
  7.5 Attribute specification of the link types and
  7.6 Illustrations of QML_csx
    7.6.1 Overall
    7.6.2 Sample data
    7.6.3 Procedure of annotation
8 TEI-based Concrete Syntax of QML (QML_cst)
  8.1 Concrete syntaxes of QML (QML_cst)
    8.1.1 Overall
    8.1.2 Tag names with ID prefixes
    8.1.3 Attribute specification of the basic element types
    8.1.4 Attribute specification of the two link types
  8.2 Illustrations of QML_cst
    8.2.1 Overall
    8.2.2 Sample data
    8.2.3 Illustrations of TEI-based Concrete Syntax
Annex A (informative) Illustrations of QML_csx with more samples
Annex B (informative) Informal statements of Measurable Quantitative Information
Annex C (informative) The representation of units
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see the ISO website (www.iso.org).

The committee responsible for this document is ISO/TC 37, Language and Terminology, Subcommittee SC 4, Language resource management.

ISO 24617 consists of the following parts under the general title Language resource management —

Semantic annotation framework (SemAF):
— Part 1: Time and events (TimeML)
— Part 2: Dialogue acts (DA)
— Part 3: Named entity
— Part 4: Semantic roles (SR)
— Part 5: Discourse structures (DS)
— Part 6: Principles of semantic annotation (SemAF Principles)
— Part 7: Spatial information
— Part 8: Semantic relations in discourse, core annotation schema (DR-core)
— Part 9: Reference annotation framework (RAF)
— Part 10: Visual information (VoxML)
— Part 11: Measurable quantitative information (MQI)
— Part 12: Quantification
— Part 13: Gestures
Introduction

Measurable quantitative information (MQI), such as ‘165 cm’ or ‘60 kg’ stated as the height or weight of a person such as ‘John’, is very common in ordinary language. MQI describes one of the basic properties of an object, namely the one associated with the magnitude aspect of quantity. Such information is even more abundant in scientific publications and technical reports, to the extent that it constitutes an essential part of communicative segments of language in general. The processing of such information is thus required for any successful language resource management.

This document, named ‘SemAF-MQI’, specifies a general annotation scheme that follows the principles of semantic annotation laid down in ISO 24617-6 and the basic requirements of ISO 24612 Linguistic annotation framework (LAF). The scheme is intended to facilitate the processing of MQI in scientific and technical language and to make it interoperable with other semantic annotation schemes, such as those of the other parts of ISO 24617.

NOTE 1 ISO 24617-1:2012 (E) TimeML and ISO 24617-7:2014 (E) Spatial information, for instance, have proposed ways of annotating measures of time (durations or time amounts) and of space (distances), respectively. Serious discussion of annotating measures as part of ISO 24617 was initiated at the 11th joint ACL-ISO/TC 37/SC 4/WG 2 Workshop on Interoperable Semantic Annotation (ISA-11) [1] and was continued at the ISA-13 [2], ISA-14 [3], and ISA-15 [4] workshops. ISO 24612:2012 (E) LAF provides a pivot format (GrAF, graph annotation format) that makes the annotations of temporal or spatial measures in these two annotation schemes interchangeable with the measure annotations in the new document SemAF-MQI.

Focusing on measurements in scientifico-technological language, SemAF-MQI as an ISO standard is expected to contribute to information retrieval (IR) [5], question answering (QA), text summarization (TS), and other natural language processing (NLP) applications [6].

NOTE 2 To enhance the readability of this document and to correct some obvious editorial errors, the following editorial changes were made to the earlier version of CD 24617-11 MQI, which had passed the CD ballot (2019-09-11 ~ 2019-11-06) with 100% approval and no comments.

• Each item in the Bibliography, as well as in Clause 2 Normative references, is now referred to in the main part of the current version of the document.
• Three of the illustrative examples in clause 7.6 Illustrations of QML_csx were moved to a newly created Annex A (informative), without any change of content, in order to lighten the burden of reading clause 7.6.
• Incorrect wordings or obvious typos were corrected.
• The black-and-white coloring of Figure 1 (Metamodel of QML) was changed to multiple colors to bring out each of the different components of the metamodel.
Language resource management — Semantic annotation
framework (SemAF) —
Part 11:
Measurable Quantitative information (MQI)
1 Scope

As one of the basic physical properties, quantity is associated with multitude (how many) and magnitude (how much). Focusing on the magnitudinal aspect of quantity, this document, henceforth named “SemAF-MQI”, aims at formulating a specification language for the construction of an annotation scheme for measurable quantitative information (MQI) in scientifico-technological language. The main characteristic of SemAF-MQI is that quantitative information is presented as measures, each expressed as a pair <n, u> consisting of a numerically expressed quantity n and a unit u, where the unit is either base or derived, and either normalized or conventionally used.

NOTE 1 MQI stands for “measurable quantitative information”, whereas SemAF-MQI refers to Part 11 of ISO 24617. [See 3.5 for the definition of MQI.]
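
The description of a measure as a pair of a numeric quantity n and a unit u can be made concrete with a small data structure. The following Python sketch is an illustration under stated assumptions and is not the representation defined by QML: the class name Measure, the function parse_measure, and the "number, space, unit" input layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """A measure as a pair of a numeric quantity n and a unit u."""

    n: float  # numerically expressed quantity
    u: str    # unit, kept as an unanalyzed string such as "cm" or "km/h"


def parse_measure(text: str) -> Measure:
    """Split a string such as '165 cm' into its numeric part and its unit.

    Assumption of this sketch: the number comes first and is separated
    from the unit by a single space.
    """
    number, _, unit = text.strip().partition(" ")
    if not unit:
        raise ValueError(f"no unit found in {text!r}")
    return Measure(float(number), unit.strip())
```

With this sketch, parse_measure("165 cm") yields a Measure whose n is 165.0 and whose u is "cm". Keeping the unit as an opaque string reflects the fact that a unit may be base or derived without the pair changing shape.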

The scope of SemAF-MQI is restricted to the measurable or magnitudinal aspect of quantity so that it can focus on the technical or practical use of measurements in IR (information retrieval), QA (question answering), TS (text summarization), and other NLP (natural language processing) applications. The scope is further restricted to domains of technology, which carry more practical relevance than the theoretical issues found in the ordinary use of language. The subsequent part of ISO 24617 (Part 12) deals with more general and theoretical issues of quantification and quantitative information.

NOTE 2 The scope of this document is intentionally restricted to the measurable or magnitudinal aspect of quantity so that SemAF-MQI focuses on the technical or practical use of measurements in IR, QA, TS, and other NLP applications. The scope is restricted to domains of technology that carry more practical relevance than theoretical issues found in the ordinary use of language. Fruit and meat, for instance, are sold at markets by weight and not by the piece. Furthermore, the subsequent part of ISO 24617 (Part 12) deals with more general and theoretical issues of quantification and plurals (e.g., “three apples”), including quantitative information that involves multitudinal aspects.

SemAF-MQI also treats temporal durations, which are discussed in Part 1 of ISO 24617 SemAF-Time (ISO-TimeML), and spatial measures such as distances, which are treated in Part 7 of ISO 24617 Spatial information (ISO-Space), while making them interoperable with other measure types. It also accommodates the treatment of measures or amounts that are introduced in ISO 24617-6 SemAF Principles (Clause 8.3).

NOTE 3 This document (Part 11) also treats temporal durations, which are discussed in Part 1 of ISO 24617 SemAF-Time (TimeML), and spatial measures such as distances, which are treated in Part 7 of ISO 24617 Spatial information, while making them interoperable with other measure types. It also accommodates the treatment of measures or amounts that are introduced in ISO 24617-6 SemAF Principles. Its scope thus covers temporal durations as treated in XML Schema and the TEI Guidelines.
2 Normative references

The following documents, in whole or in part, are normatively referenced in this document and are indispensable for its application. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 24612:2012, Language resource management — Linguistic annotation framework (LAF)


ISO 24617-1:2012, Language resource management — Semantic annotation framework (SemAF) — Part 1:

Time and events (SemAF-Time, ISO-TimeML)

ISO 24617-6:2016, Language resource management — Semantic annotation framework — Part 6:

Principles of semantic annotation (SemAF Principles)

ISO 24617-7:2014, Language resource management — Semantic annotation framework — Part 7: Spatial

information (ISOspace)

ISO/IEC 14977:1996, Information technology - Syntactic metalanguage - Extended BNF

ISO 80000-1:2009, Quantities and units — Part 1: General

NOTE 1 The following two documents are de facto standards to be followed by SemAF-MQI:

TEI P5: Guidelines for Electronic Text Encoding and Interchange, The TEI Consortium, 2019 [7].

XML Schema, Part 2: Datatypes, 2nd edition, W3C Recommendation, 28 October 2004 [8].

3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

— IEC Electropedia: available at http://www.electropedia.org/
— ISO Online browsing platform: available at https://www.iso.org/obp
3.1
quantity

property of a measurable object referring to its magnitude (how much) or multitude (how many)

Note 1 to entry: Compare with ISO 80000-1:2009, 3 Terms and Definitions, 3.1: property of a phenomenon, body, or substance, where the property has a magnitude that can be expressed by means of a number and a reference.

3.2
base quantity

quantity in a conventionally chosen subset of a given system of quantities, where no quantity in the subset can be expressed in terms of the other quantities within that subset

Note 1 to entry: Kinds of quantities include seven base quantities defined by the International System of Quantities (ISQ), as listed in Table 1.
Table 1 — ISQ base quantities
Base quantity                 Base quantity symbol
length                        L
mass                          M
time                          T
electric current              I
thermodynamic temperature     Θ
amount of substance           N
luminous intensity            J

Note 2 to entry: In ISO 80000-1:2009, 3 Terms and Definitions, symbols such as L and M, which are called base quantity symbols in this document, are called dimension symbols of quantity.
3.3
derived quantity

quantity, in a system of quantities, defined in terms of the base quantities of that system

EXAMPLE Speed is a derived quantity defined by length (distance) over time ($LT^{-1}$), where length (L) and time (T) are base quantities.
[SOURCE: ISO 80000-1:2009, 3 Terms and Definition, 3.5 derived quantity]
3.4
quantitative information
measure associated with the quantity (3.1) of a measurable object
3.5
measurable quantitative information
MQI
quantitative information (3.4) that can be expressed in unitized numeric terms
3.6
measurable quantitative information markup language
markup language of measurable quantitative information
QML

specification language for the annotation of measurable quantitative information (3.5) extractable from text or from language in other media
3.7
unit
unit of measurement
measurement unit

scalar basis, defined and adopted by convention, of measuring objects by multiplying their quantitative values expressed in real numbers

Note 1 to entry: Expressions used in measurement such as “meter”, “liter”, and “µmol/kg” are units by the definition given above. Multitude expressions such as “bottles”, “boxes”, or “two”, as in “two bottles of milk”, “a box of apples”, and “two coffees”, sometimes fail to be regarded as units, but they can be units if they are accepted as such by convention or agreement in some communities. ISO 24617 SemAF Part 12: Quantification treats such multitude expressions as genuine units.
Note 2 to entry: There are two major types of units, base and derived.

[Refer to ISO 80000-1:2009, 3 Terms and Definitions, 3.9 Unit, 3.10 Base unit, and 3.11 Derived unit.]

[SOURCE: Refer to: ISO 80000-1:2009, 3 Terms and Definitions, 3.9, real scalar quantity, defined and adopted by convention, with which any other quantity of the same kind can be compared to express the ratio of the second quantity to the first one as a number.]
3.8
base unit
measurement unit that is adopted by convention for a base quantity (3.2)

Note 1 to entry: There are seven base units chosen by the International System of Units (SI) associated with seven ISQ base quantities to measure quantities, as shown in Table 2.
Table 2 — base units
SI base unit (unit symbol)    Associated ISQ base quantity (base quantity dimension symbol)
meter (m)                     length (L)
kilogram (kg)                 mass (M)
second (s)                    time (T)
ampere (A)                    electric current (I)
kelvin (K)                    thermodynamic temperature (Θ)
mole (mol)                    amount of substance (N)
candela (cd)                  luminous intensity (J)

[SOURCE: ISO 80000-1:2009, 3 Terms and Definitions, 3.9 Unit, 3.10 Base unit, and 3.11 Derived unit.]

3.9
derived unit
measurement unit for a derived quantity

EXAMPLE The unit “newton” (N) is a derived unit for the derived quantity “force” (F), which is defined as “mass times acceleration” ($MLT^{-2}$), where the quantity “acceleration” is a derived quantity defined by “velocity divided by time” ($VT^{-1}$) and “velocity” is defined by “length (distance) divided by time” ($LT^{-1}$).

Note 1 to entry: Table 3 illustrates some of the derived units.

[Refer to ISO 80000-1:2009, 3 Terms and Definitions, 3.9 Unit, 3.10 Base unit, and 3.11 Derived unit.]

Table 3 — derived units
Derived unit (unit symbol)                        Associated derived quantity
kilometer per minute (km/min)                     speed = length (L) / time (T)
gram per cubic meter (gram/m³)                    density = mass (M) / volume ($L^{3}$)
kilogram meter per square second (kg x m/s²)      force = mass (M) x length (L) / time squared ($T^{2}$)
lumen per square meter (lm/m²)                    illuminance = luminous intensity (J) / area ($L^{2}$)
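
The way derived quantities are built from base quantities in 3.3 and 3.9 amounts to arithmetic on the exponents of the base quantity symbols. The following Python sketch illustrates that arithmetic only; it is not part of SemAF-MQI, and the names multiply and divide and the dictionary representation are assumptions made for this example.

```python
# Dimension symbols of the seven ISQ base quantities, as listed in Table 1.
BASE_SYMBOLS = ("L", "M", "T", "I", "Θ", "N", "J")


def multiply(a, b):
    """Multiply two quantities: the exponents of their dimensions are added."""
    total = {s: a.get(s, 0) + b.get(s, 0) for s in BASE_SYMBOLS}
    # Drop zero exponents so that equal dimensions compare equal as dicts.
    return {s: e for s, e in total.items() if e != 0}


def divide(a, b):
    """Divide two quantities: the exponents of the divisor are subtracted."""
    return multiply(a, {s: -e for s, e in b.items()})


LENGTH, MASS, TIME = {"L": 1}, {"M": 1}, {"T": 1}

velocity = divide(LENGTH, TIME)                      # L T^-1
acceleration = divide(velocity, TIME)                # L T^-2
force = multiply(MASS, acceleration)                 # M L T^-2
volume = multiply(LENGTH, multiply(LENGTH, LENGTH))  # L^3
density = divide(MASS, volume)                       # M L^-3
```

The results agree with the EXAMPLE in 3.9: force comes out with exponent 1 for M, 1 for L and -2 for T.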
4 Background and Motivations

Quantity exists as a multitude (e.g., “two watermelons”) or as a magnitude (e.g., “one kilogram of watermelon”). These two basic divisions of quantity reflect the principal distinction between continuity (continuum) and discontinuity, which are two ways of determining quantity. SemAF-MQI focuses only on measurement information in scientific and technical texts. Quantity is therefore regarded as a magnitude property in this document, which is consistent with ISO 80000-1:2009 Quantities and units.

As in ISO 80000-1:2009, the term “unit” is defined in relation to quantity and is used for a real scalar quantity, defined and adopted by convention, with which any other quantity of the same kind can be compared to express the ratio of the second quantity to the first one as a number. There are two types of units: base units and derived units.

This document treats complex derived units as unanalyzed wholes. It does not annotate their internal structures and components unless some special use case requires it. Nor does it require ways of converting one unit to another to be specified. The reasons are as follows:

1) Complex derived units such as “km/h” for speed ($LT^{-1}$) or “m/s²” for acceleration ($LT^{-2}$) are understood as they are in ordinary situations.

2) Certain domain-specific units cannot be decomposed during their conversion to other equivalent units. For example, Estimated Glomerular Filtration Rate (eGFR) in the medical domain frequently uses the unit “mL/min/1.73m²”. Kidney function can thus be classified into various stages depending on eGFR, where stage 1 is defined as “normal eGFR greater than or equal to 90 mL/min/1.73m²”. In some cases, the unit can be written as “mL/min/((173/100).m²)”. In all these cases, the “1.73” or “173/100” in the unit cannot be annotated separately for automatic conversion, since it combines with the other parts to form a complete unit.

3) Units can be converted automatically and effectively with a conversion table. For example, by directly using the equivalence of “1 mmol/l” to “18 mg/dl”, a computer can convert one unit into another in a single computation rather than converting each part of the unit and then computing the total value.

4) Incomplete units exist. During language processing, incomplete units need to be detected by various methods, for example by formulating specific rules or guidelines. Such rules could be designed to extend a unit into a more complete representation or to complete the missing parts of a derived unit according to clues such as contextual information or variable-specific default unit information.
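
The table-based conversion described in reason 3 can be sketched as follows. This Python example is illustrative and is not specified by SemAF-MQI: the names CONVERSION_TABLE and convert are hypothetical, and the only entry is the equivalence quoted above. A factor between an amount-of-substance concentration and a mass concentration depends on the substance measured, so a realistic table would also be keyed by substance.

```python
# Conversion factors keyed by (source unit, target unit). Each unit is an
# unanalyzed string, so "mmol/l" is never split into "mmol" and "l".
# The single entry is the equivalence quoted in the text (1 mmol/l = 18 mg/dl).
CONVERSION_TABLE = {
    ("mmol/l", "mg/dl"): 18.0,
}


def convert(value: float, source_unit: str, target_unit: str) -> float:
    """Convert a value with one table lookup and one multiplication or division."""
    if source_unit == target_unit:
        return value
    if (source_unit, target_unit) in CONVERSION_TABLE:
        return value * CONVERSION_TABLE[(source_unit, target_unit)]
    # Fall back to the inverse of a stored factor when only that is available.
    if (target_unit, source_unit) in CONVERSION_TABLE:
        return value / CONVERSION_TABLE[(target_unit, source_unit)]
    raise KeyError(f"no conversion from {source_unit!r} to {target_unit!r}")
```

Because the lookup uses the whole unit string, a unit such as “mL/min/1.73m²” could be added to the table without any attempt to interpret the “1.73” inside it.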

With the recent advent of artificial intelligence technologies, many applications in IR and NLP, such as question answering systems, automatic speech translation systems, and intelligent assistant systems, have been developed with the acquisition of meta information from unstructured texts as a core module. The texts processed by such systems usually contain a large amount of measurable quantitative information, which constitutes an essential portion of the meta information needed for information extraction, text understanding, and data analysis.

In the era of big data in particular, demands from industry and academic communities for the precise acquisition of measurable quantitative information have increased. For example, business investment companies frequently need to aggregate various sorts of information, covering net sales, gross profit, operating expenses, operating profit, interest expense, net profit before taxes, net income, etc., of target companies from their annual reports. The fast-growing field of medical informatics research also needs to process a large amount of medical texts
...

FINAL DRAFT
INTERNATIONAL STANDARD ISO/FDIS 24617-11
ISO/TC 37/SC 4
Secretariat: KATS
Voting begins on: 2021-05-10
Voting terminates on: 2021-07-05
Gestion des ressources linguistiques — Cadre d'annotation sémantique (SemAF) —
Partie 11: Informations quantitatives mesurables (MQI)
Language resource management — Semantic annotation framework
(SemAF) —
Part 11: Measurable quantitative information (MQI)
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.

IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.

Reference number
ISO/FDIS 24617-11:2021(F)
© ISO 2021
COPYRIGHT PROTECTED DOCUMENT
© ISO 2021

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.

ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abstract specification of QML
4.1 Overview
4.2 Characteristics of QML
4.3 Metamodel
4.4 Abstract syntax of QML (QML_as)
4.5 Concrete syntaxes of QML (QML_cs) and its subsets
5 XML-based concrete syntax of QML (QML_csx)
5.1 General
5.2 Tag names with ID prefixes
5.3 Attribute specification of the root element
5.4 Attribute specification of the basic element types
5.5 Attribute specification of the link types
5.6 Illustrations of QML_csx
5.6.1 General
5.6.2 Sample data
5.6.3 Annotation procedure
6 TEI-based concrete syntax of QML (QML_cst)
6.1 Concrete syntaxes of QML (QML_cst)
6.1.1 General
6.1.2 Tag names with ID prefixes
6.1.3 Attribute specification of the basic element types
6.1.4 Attribute specification of the two link types
6.2 Illustrations of QML_cst
6.2.1 General
6.2.2 Sample data
6.2.3 Illustrations of the TEI-based concrete syntax
Annex A (informative) Illustrations of QML_csx with more samples
Annex B (informative) Informal statements of MQI
Annex C (informative) Representation of units
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of intellectual property rights or similar rights. ISO shall not be held responsible for identifying any or all such rights or for giving notice of their existence. Details of any intellectual property rights or similar rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/brevets).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/fr/avant-propos.

This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee SC 4, Language resource management.

A list of all parts in the ISO 24617 series can be found on the ISO website.

Any feedback or questions on this document should be directed to the user's national standards body. A complete listing of these bodies can be found at www.iso.org/fr/members.html.
Introduction

Measurable quantitative information (MQI), such as "165 cm" or "60 kg" describing John's height or weight, is very common in ordinary language. MQI describes one of the basic properties associated with the quantitative aspect of a quantity. The main characteristic of the MQI standard is that quantitative information is presented as measures, each expressed as a pair <n, u> consisting of a numerically expressed quantity n and a unit u, which is a base unit or a derived unit, or else a standardized or conventionally used unit. Such information is far more abundant in scientific publications and technical reports, to the point that it forms an essential part of the communicative segments of language in general. Processing this information is therefore necessary for the successful management of language resources.

In the era of big data, demand from industry and academia for the precise acquisition of measurable quantitative information has grown. For example, investment companies frequently need to aggregate various types of information from the annual reports of target companies, covering net sales, gross margin, operating expenses, operating profit, interest expenses, net profit before tax, net income, etc. The rapidly growing field of medical informatics also needs to process large volumes of medical text in order to analyse drug dosage, clinical trial eligibility criteria, patients' phenotypic characteristics, laboratory tests in clinical records, etc. [8]. All these demands, whether from industry or from medical research, require a precise and consistent representation of measurable quantitative information to enable automated processing, computation, and exchange.

In information retrieval (IR) and natural language processing (NLP), however, there is currently no standardized way of representing measurable quantitative information. Each application system developed in industry has so far used its own format for annotating measurable quantitative information. A flexible, interoperable, and standardized representation format for measurable quantitative information is needed so that IR and NLP tasks can work across many different application systems.

This document aims to formulate a general annotation scheme that follows the principles of semantic annotation laid down in ISO 24617-6 in general and the basic requirements of ISO 24611. The scheme is intended to facilitate the processing of MQI in scientific and technical language and to make it interoperable with other semantic annotation schemes, such as those of ISO 24617. The annotation scheme is designed to be interoperable with the other parts of ISO 24617. It also draws on various ISO standards on lexical resources and morpho-syntactic annotation frameworks, and it aims to be compatible with other relevant existing standards.

NOTE ISO 24617-1 and ISO 24617-7, for example, proposed ways of annotating measures of time (durations or amounts of time) and of space (distances), respectively. ISO 24612 provides a pivot format (graph annotation framework) in which all annotations of temporal and spatial measures in these two annotation schemes can be expressed.

QML is standardized at an abstract level that allows for various serialization formats representing annotated measurable quantitative information, such as an XML-based representation. The standardization of the annotation of quantitative information (QI) is specified at the abstract level of annotation, and the standoff annotation format is adopted at the concrete level of serialization.

Focusing on measures in scientific and technological language, this document is intended to contribute to applications in information retrieval (IR) [9], question answering (QA), text summarization (TS), and other natural language processing (NLP) applications [10].
FINAL DRAFT INTERNATIONAL STANDARD ISO/FDIS 24617-11:2021(F)
Language resource management -- Semantic annotation framework (SemAF) --
Part 11:
Measurable quantitative information (MQI)
1 Scope

This document covers the measurable or quantitative aspect of quantities, so that it can focus on the technical or practical use of measures in IR (information retrieval), QA (question answering), TS (text summarization), and other NLP (natural language processing) applications. It applies to technological domains, which are of greater interest for applications than some of the theoretical issues that arise in the ordinary use of language.

NOTE ISO 24617-12 deals with more general and theoretical issues of quantification and quantitative information.

This document also treats temporal durations, which are addressed in ISO 24617-1, and spatial measures such as distances, which are treated in ISO 24617-7, while making them interoperable with other types of measures. It also accommodates the treatment of measures or amounts introduced in ISO 24617-6:2016, 8.3.
2 Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 24612, Language resource management -- Linguistic annotation framework (LAF)

3 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

- ISO Online browsing platform: available at https://www.iso.org/obp
- IEC Electropedia: available at https://www.electropedia.org/

3.1
quantity
property of a measurable object referring to its magnitude or multiplicity

[SOURCE: ISO/IEC Guide 99:2007, 1.1, modified. The definition has been substantially reworded and the notes have been removed.]
3.2
base quantity
quantity (3.1) in a conventionally chosen subset of a given system of quantities, such that no quantity in the subset can be expressed in terms of the other quantities in that subset

Note 1 to entry: The kinds of quantity include seven base quantities defined by the International System of Quantities (ISQ).

[SOURCE: ISO/IEC Guide 99:2007, 1.4, modified. The phrase "the others" has been replaced by "the other quantities in that subset", and the notes and the example have been removed.]

3.3
derived quantity
quantity (3.1), in a system of quantities, defined in terms of the base quantities (3.2) of that system

EXAMPLE Velocity is a derived quantity defined as length (distance) divided by time (LT⁻¹), where length (L) and time (T) are base quantities.

[SOURCE: ISO/IEC Guide 99:2007, 1.5, modified. The example has been replaced.]
3.4
quantitative information
measure associated with the quantity (3.1) of a measurable object

3.5
measurable quantitative information
MQI
quantitative information (3.4) that can be expressed in unified numerical terms

3.6
measurable quantitative information markup language
quantitative markup language
QML
specification language for the annotation of measurable quantitative information (3.5) that can be extracted from text or other types of language media

3.7
measurement unit
unit
scalar basis, defined and adopted by convention, for measuring objects by multiplying their quantitative values expressed as real numbers

Note 1 to entry: Expressions used in measurement such as "metre", "litre", and "µmol/kg" are units according to the definition given above. Expressions of multiplicity such as "bottles", "boxes", or "two", as in "two bottles of milk", "a box of apples", and "two coffees", are sometimes not considered units, but they can be if they are accepted as units by convention or agreement in certain communities. ISO 24617 SemAF Part 12: Quantification treats these expressions of multiplicity as genuine units.

[SOURCE: ISO/IEC Guide 99:2007, 1.9, modified. The definition has been substantially reworded, the original notes have been removed, and a new Note 1 to entry has been added.]

3.8
base unit
measurement unit (3.7) adopted by convention for a base quantity (3.2)

Note 1 to entry: The International System of Units (SI) defines seven base units, associated with the seven ISQ base quantities, for measuring quantities, as shown in Table 1.

Table 1 -- Base units

SI base unit (unit symbol)    Associated ISQ base quantity (base quantity symbol)
metre (m)                     length (L)
kilogram (kg)                 mass (M)
second (s)                    time (T)
ampere (A)                    electric current (I)
kelvin (K)                    thermodynamic temperature (Θ)
mole (mol)                    amount of substance (N)
candela (cd)                  luminous intensity (J)

[SOURCE: ISO/IEC Guide 99:2007, 1.10, modified. The notes and the examples have been removed, and a new Note 1 to entry and Table 1 have been added.]
3.9
derived unit
measurement unit (3.7) for a derived quantity (3.3)

EXAMPLE The unit "newton" (N) is a derived unit for the derived quantity "force" (F), which is defined as "mass multiplied by acceleration" (MLT⁻²), where the quantity "acceleration" is a derived quantity defined as "velocity divided by time" (VT⁻¹) and "velocity" is defined as "length (distance) divided by time" (LT⁻¹).

Note 1 to entry: Table 2 illustrates some derived units.

[SOURCE: ISO/IEC Guide 99:2007, 1.11, modified. The examples have been removed, and a new example and Note 1 to entry have been added.]

Table 2 -- Derived units

Derived unit (unit symbol)                        Associated derived quantity
kilometre per minute (km/min)                     velocity = length (L) / time (T)
gram per cubic metre (g/m³)                       density = mass (M) / volume (L³)
kilogram metre per second squared (kg × m/s²)     force = mass (M) × length (L) / time (T²)
lumen per square metre (lm/m²)                    illuminance = luminous intensity (J) / area (L²)
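
The way derived quantities are built from base quantities in Tables 1 and 2 can be pictured as arithmetic on exponents. The following Python sketch is illustrative only and is not part of QML: the class `Dim`, the helper `base`, and the tuple `SYMBOLS` are names assumed here for the example. Multiplying quantities adds exponents and dividing subtracts them.

```python
from dataclasses import dataclass

# ISQ base-quantity symbols, in a fixed (arbitrary) order chosen for this sketch.
SYMBOLS = ("L", "M", "T", "I", "Θ", "N", "J")


@dataclass(frozen=True)
class Dim:
    """Dimension of a quantity as integer exponents over the base quantities."""
    exps: tuple = (0,) * 7

    def __mul__(self, other):
        return Dim(tuple(a + b for a, b in zip(self.exps, other.exps)))

    def __truediv__(self, other):
        return Dim(tuple(a - b for a, b in zip(self.exps, other.exps)))

    def __pow__(self, k):
        return Dim(tuple(a * k for a in self.exps))

    def __str__(self):
        parts = []
        for sym, e in zip(SYMBOLS, self.exps):
            if e == 1:
                parts.append(sym)
            elif e != 0:
                parts.append(f"{sym}^{e}")
        return " ".join(parts) or "1"


def base(symbol):
    """Return the dimension of a single base quantity."""
    return Dim(tuple(1 if s == symbol else 0 for s in SYMBOLS))


L, M, T = base("L"), base("M"), base("T")

velocity = L / T               # length divided by time
acceleration = velocity / T    # velocity divided by time
force = M * acceleration       # mass multiplied by acceleration
density = M / L**3             # mass divided by volume

print(velocity, "|", force, "|", density)
```

Symbols are printed in the order of `SYMBOLS`, so force appears with L before M; the exponents are the same as in the EXAMPLE of 3.9.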

4 Abstract specification of QML
4.1 Overview

The quantitative markup language (QML) (3.6) is specified at two levels, abstract and concrete. Some characteristics of QML are listed in 4.2. The overall structure of QML is represented by a metamodel, presented in 4.3. The abstract syntax of QML, QML_as, shall be a set-theoretic specification of QML in conceptual terms that are independent of the ways in which the annotation (content) of measurable quantitative information is represented. The concrete syntax of QML, QML_cs, shall be a specification of a set of representation formats, based on QML_as, for annotating measurable quantitative information in a computationally tractable way. QML_as is presented in 4.4, and QML_cs in 4.5. The equivalent concrete syntaxes, namely an XML-based concrete syntax QML_csx and a TEI-based concrete syntax QML_cst, are described in Clause 5 and Clause 6, respectively.

NOTE Many equivalent concrete syntaxes can be defined on a single abstract syntax.
4.2 Characteristics of QML
QML shall have the following characteristics:

a) QML shall focus on annotating the measurable attributes of entities. For example, "BMI between 10-20 kg/m²";

b) QML shall allow the relations between measures to be annotated. For example, "age 40 or over" and "fpg >= 100 mg/dl or a1c not less than 5.8 %";

c) QML shall cover complex uses of unified numerical quantities. For example, "14.0 × 10⁹" and "glycosylated hemoglobin (hba1c) < 1.15 times the upper limit of normal";

d) QML shall facilitate the identification of standardized numerical units as a measurable attribute of an associated entity.

NOTE QML does not specify how to annotate the normalization of units (for example, "millimoles per litre" is normalized as "mmol/l") or their full specification (for example, "kg/m" is written "kg/m²" for BMI); these will be addressed in another part of ISO 24617 dealing with the automated implementation of MQI.
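
To make characteristic b) concrete, the sketch below pulls an attribute name, a relator, a number, and a unit out of an expression such as "fpg >= 100 mg/dl". It is a minimal illustration under stated assumptions, not a method prescribed by QML: the regular expression `PATTERN`, the function `parse_measure`, and the keys of the returned dictionary are all hypothetical, and only symbolic relators (`>=`, `<=`, `<`, `>`, `=`) are handled.

```python
import re

# Simplified pattern: attribute name, comparison relator, number, optional unit.
# This pattern is an assumption made for illustration; QML does not define it.
PATTERN = re.compile(
    r"(?P<attr>\w+)\s*"
    r"(?P<rel>>=|<=|<|>|=)\s*"
    r"(?P<num>\d+(?:[.,]\d+)?)\s*"
    r"(?P<unit>[A-Za-zµ%]+(?:/[A-Za-zµ]+)*)?"
)


def parse_measure(text):
    """Return attribute, relator, number, and unit found in text, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return {
        "attribute": match.group("attr"),
        "relator": match.group("rel"),
        # Accept a decimal comma as well as a decimal point.
        "num": float(match.group("num").replace(",", ".")),
        "unit": match.group("unit"),
    }


print(parse_measure("fpg >= 100 mg/dl"))
print(parse_measure("hba1c < 1,15"))
```

Verbal relators such as "not less than" or "or over", and units that follow a multiplier phrase, are outside what this pattern covers; handling them would need a lexicon of relator expressions.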

4.3 Metamodel

The overall structure of measurable quantitative information is represented by the metamodel in Figure 1.

Figure 1 -- Metamodel of measurable quantitative information

This metamodel shall consist of seven class components, represented by square boxes in Figure 1:

a) source data as input for the annotation of MQI;
b) markables extracted from the source data;
c) three types of basic elements: entity, measure, and relator;
d) two types of links: measure link and comparison link.

The "entity" element shall be any object that has a measurable quantity, represented by "@grandeur" (quantity), as one of its properties. "Entity", as used in this document, shall be a very general term that refers to any object: not only individual entities but also their properties, such as the "height" of a building or the "speed" of a car, as well as all sorts of eventualities such as states, processes, or transitions.

EXAMPLE 1 We drove at more than 200 kilometres per hour on a German motorway.

The speed expressed by "more than 200 kilometres per hour" is a quantitative property of a motion: here, the measure "more than 200 kilometres per hour" applies to the motion of driving mentioned in the example.

The "measure" element represents a measurable quantity of an entity by means of three attributes: quantity, unit, and type.

EXAMPLE 2 The height of Mount Hall is 1 950 metres.

The measure shall consist of a quantity denoted by the numerical expression "1 950" and a unit, "metre". It applies to the quantity "height" of the geographical object named "Mount Hall".
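
The roles of source data and markables in EXAMPLE 2 can be sketched as standoff annotation, in which markables point into the source text by character offsets and the text itself is left unchanged. The Python sketch below uses an English rendering of EXAMPLE 2; the identifiers `m1` to `m3`, the helper functions, and the dictionary keys are assumptions made for illustration and are not the tag or attribute names defined by QML.

```python
text = "The height of Mount Hall is 1 950 metres."


def span(source, phrase):
    """Return the (start, end) character offsets of the first occurrence of phrase."""
    start = source.index(phrase)
    return start, start + len(phrase)


# Markables are stored as offsets into the source text, which is left untouched.
markables = {
    "m1": span(text, "height"),        # quantity of the entity
    "m2": span(text, "Mount Hall"),    # the entity itself
    "m3": span(text, "1 950 metres"),  # the measure
}


def surface(markable_id):
    """Return the text covered by a markable."""
    start, end = markables[markable_id]
    return text[start:end]


# Split the measure markable into its numerical expression and its unit.
num_text, unit_text = surface("m3").rsplit(" ", 1)
measure = {
    "target": "m3",
    "num": int(num_text.replace(" ", "")),  # "1 950" uses a space as digit separator
    "unit": unit_text,
}
entity = {"target": "m2", "type": "geographic", "quantity": surface("m1")}

print(entity)
print(measure)
```

Keeping offsets rather than copies of the text is what lets several annotation layers refer to the same source data without interfering with one another.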

The "relator" element, which is associated with markables such as "equal to", "greater than", "<=", "between", or "at least", has only the functional status of relating two or more measures.
EXAMPLE 3 One pound is equal to 16 ounces.
Here there is an identity relator between two measures, "one pound" and "16 ounces".
EXAMPLE 4 1 ft is less than 1 metre, because it is exactly equal to 30.48 cm.

This example illustrates two types of links between measures: the relation of being "less than" and that of identity.
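
The relations in EXAMPLE 3 and EXAMPLE 4 can be checked mechanically by converting both measures to a common reference unit before comparing them. The sketch below is a hedged illustration and not part of QML: the conversion table `TO_REFERENCE`, the functions `normalize` and `compare`, and the relator labels are assumptions. The only conversion factors taken from the examples are 1 ft = 30.48 cm and 1 pound = 16 ounces; the factor 1 m = 100 cm follows from the SI prefix.

```python
from fractions import Fraction

# Conversion factors to a reference unit per quantity type (hypothetical table).
# Lengths are expressed in centimetres, masses in ounces.
TO_REFERENCE = {
    "cm": ("length", Fraction(1)),
    "m": ("length", Fraction(100)),
    "ft": ("length", Fraction("30.48")),
    "oz": ("mass", Fraction(1)),
    "lb": ("mass", Fraction(16)),
}


def normalize(num, unit):
    """Return (quantity type, value in the reference unit) for a measure <num, unit>."""
    qtype, factor = TO_REFERENCE[unit]
    # Going through str() keeps decimal literals such as 30.48 exact.
    return qtype, Fraction(str(num)) * factor


def compare(m1, m2):
    """Return the relator that holds between two measures of the same type."""
    type1, value1 = normalize(*m1)
    type2, value2 = normalize(*m2)
    if type1 != type2:
        raise ValueError(f"cannot compare {type1} with {type2}")
    if value1 == value2:
        return "equal to"
    return "less than" if value1 < value2 else "greater than"


print(compare((1, "lb"), (16, "oz")))      # EXAMPLE 3
print(compare((1, "ft"), (1, "m")))        # EXAMPLE 4, first relation
print(compare((1, "ft"), (30.48, "cm")))   # EXAMPLE 4, second relation
```

Exact rational arithmetic is used so that an identity such as 1 ft = 30.48 cm is not lost to floating-point rounding.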

A link of type "measure" shall relate a measure to the quantitative property of an entity. Such a link is triggered by a measure element.

A link of type "comparison" shall relate a measure to one or more other measures. Such a link is often triggered by a "comparison" element.
4.4 Abstract syntax of QML (QML_as)

QML shall be a specification language for the annotation of MQI. The abstract syntax of QML shall specify an annotation scheme in set-theoretic terms, based on a conceptual understanding of MQI. The abstract syntax QML_as is a triple <B, R, @> such that:

a) B is a set of three types of basic elements: entity, measure, and relator;

b) R is a set of two types of links: the measure and comparison types;

c) @ is a set of assignments that specify the list of attributes, and their associated value types, for each of the basic element types in B and each of the link types in R.

Each element of B shall have at least one attribute, @type, as shall each link. The values of @type are CDATA associated with each element. For example, the entity "mountain" is of type "geographic" and the entity named "John" is of type "person".

The values of @grandeur for an entity are CDATA, which can include values such as height, width, or weight.

The measure assignment shall have three attributes: @numérique (numeric), @unité (unit), and @type. A possible value of @numérique is a real number. A possible value of @unité is a unit from a conventionally accepted system, such as one of the SI base units or derived units. A possible value of @type is one of the quantities listed as ISQ base quantities or derived quantities, such as length, mass, or voltage.
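
The triple <B, R, @> can be pictured as a small data model: classes for the three basic element types in B, a class for the link types in R, and typed fields standing in for the attribute assignments in @. The Python sketch below is one possible reading under stated assumptions and is not a QML serialization: the class names, field names, and identifiers such as `e1` and `me1` are hypothetical, and both link types are folded into one `Link` class distinguished by its `type` field.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:            # basic element type in B
    id: str
    type: str            # e.g. "geographic", "person"
    quantity: str        # e.g. "height", "weight"


@dataclass
class Measure:           # basic element type in B
    id: str
    num: float           # a real number
    unit: str            # e.g. "m", "kg"
    type: str            # e.g. "length", "mass"


@dataclass
class Relator:           # basic element type in B
    id: str
    type: str            # e.g. "equal to", "less than"


@dataclass
class Link:              # link type in R: "measure" or "comparison"
    id: str
    type: str
    source: str                                       # id of a measure
    targets: List[str] = field(default_factory=list)  # ids of an entity or of measures
    relator: str = ""                                 # id of a relator, if any


@dataclass
class Annotation:
    elements: Dict[str, object] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add(self, element):
        """Register a basic element under its id."""
        self.elements[element.id] = element
        return element

    def dangling(self):
        """Return ids referenced by links but missing from the elements."""
        missing = []
        for link in self.links:
            refs = [link.source, *link.targets]
            if link.relator:
                refs.append(link.relator)
            missing.extend(r for r in refs if r not in self.elements)
        return missing


# EXAMPLE 2 of 4.3: a height measure linked to a geographic entity.
ann = Annotation()
ann.add(Entity("e1", "geographic", "height"))
ann.add(Measure("me1", 1950.0, "m", "length"))
ann.links.append(Link("l1", "measure", source="me1", targets=["e1"]))
print(ann.dangling())
```

The `dangling` check reflects the fact that links only make sense when every id they mention belongs to an element of B; a concrete syntax would typically enforce this through ID references.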
4.5 Concrete syntaxes of QML (QML_cs) and its subsets

An abstract syntax shall allow for several semantically equivalent concrete syntaxes. QML_as thus allows for a set of equivalent concrete syntaxes of QML (QML_cs). This document presents two concrete syntaxes, QML_csx and QML_cst, in Clause 5 and Clause 6, respectively.

Both concrete syntaxes, QML_csx and QML_cst, are based on the abstract syntax QML_as and adopt XML as their representation language. They shall conform to the standoff annotation requirement of ISO 24612.
These two concrete syntaxes differ from each other, however, in
...
