ISO 24617-7:2014
Language resource management - Semantic annotation framework - Part 7: Spatial information (ISOspace)
ISO 24617-7:2014 provides a framework for encoding a broad range not only of spatial information, but also of spatiotemporal information relating to motion as expressed in natural language texts. It includes references to locations, general spatial entities, spatial relations (involving topological, orientational, and metric values), dimensional information, motion events, and paths.
Gestion des ressources linguistiques — Cadre d'annotation sémantique — Partie 7: Information spatiale (ISOspace)
ISO 24617-7:2014 is a standard published by the International Organization for Standardization (ISO).
ISO 24617-7:2014 is classified under the following ICS (International Classification for Standards) categories: 01.020 - Terminology (principles and coordination). The ICS classification helps identify the subject area and facilitates finding related standards.
ISO 24617-7:2014 is related to the following standard: ISO 24617-7:2020. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
INTERNATIONAL STANDARD ISO 24617-7
First edition
2014-12-15
Language resource management —
Semantic annotation framework —
Part 7:
Spatial information (ISOspace)
Gestion des ressources linguistiques — Cadre d’annotation
sémantique —
Partie 7: Information spatiale (ISOspace)
Reference number
© ISO 2014
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO 2014 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 List of tags
5 Overview
6 Motivation and requirements
7 Specification of ISOspace for spatial annotation
7.1 Overview: annotation vs. representation
7.2 Abstract syntax for the ISOspace annotation structure
8 Representation of ISOspace-conformant annotations
8.1 XML-based concrete syntax: outline
8.1.1 Overview
8.1.2 Basic entities
8.1.3 Signals
8.1.4 Links
8.1.5 Root element
8.2 Conventions for tagging
8.2.1 Naming conventions
8.2.2 Convention for inline tagging extents
8.3 Basic entity tags
8.3.1 <place>
8.3.2
8.3.3
8.3.4
8.3.5 <event> for non-motion event
8.3.6
8.3.7
8.3.8
8.4 Link tags
8.4.1
8.4.2
8.4.3
8.4.4
8.5 Root tag: <isoSpace>
8.6 Summary
8.6.1 Identifier
8.6.2 Shared attributes
8.6.3 IDRef as value
Annex A (normative) Core annotation guidelines
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on
the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical Barriers
to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
ISO 24617 consists of the following parts, under the general title Language resource management —
Semantic annotation framework (SemAF):
— Part 1: Time and events (SemAF-Time, ISO-TimeML)
— Part 2: Dialogue acts
— Part 4: Semantic roles (SemAF-SR)
— Part 5: Discourse structures (SemAF-DS)
— Part 6: Principles of semantic annotation (SemAF-Basics)
— Part 7: Spatial information (ISOspace)
— Part 8: Semantic relations in discourse (SemAF-DRel)
Introduction
The automatic recognition of spatial information in natural language is currently attracting considerable
attention in the fields of computational linguistics and artificial intelligence. The development of
algorithms that exhibit “spatial awareness” promises to add needed functionality to NLP systems, from
named entity recognition to question-answering and text-based inference. However, in order for such
systems to reason spatially, they require the enrichment of textual data with the annotation of spatial
information in language. This involves a large range of linguistic constructions, including spatially
anchoring events, descriptions of objects in motion, viewer-relative descriptions of scenes, absolute
spatial descriptions of locations, and many other constructions.
This part of ISO 24617 was developed in collaboration with the ISOspace working group at Brandeis
University with the aim to provide an International Standard for the representation of spatial information
relating to locations, motions and non-motion events in language.
NOTE The ISOspace Working Group is headed by James Pustejovsky, jampesp@cs.brandeis.edu, Brandeis
University, Waltham, MA, U.S.A.
This part of ISO 24617 provides normative specifications and guidelines not only for spatial information,
but also for information content in motion and various other types of event in language.
The main parts of this part of ISO 24617 consist of the following:
a) Scope;
b) Normative references;
c) Terms and definitions;
d) List of tags or names of elements;
e) Overview;
f) Motivation and requirements;
g) Specification of the ISOspace annotation structure;
h) Representation of ISOspace-conformant annotations.
Clause 8 introduces an XML-based concrete syntax for representing spatial-related or motion-related
annotations based on the annotation structure of ISOspace that is presented in Clause 7 with a UML-
based metamodel.
A formal semantics for ISOspace will be provided as part of a future new work item within the semantic
annotation framework. This will be coordinated with the temporal semantics and specification of
ISO 24617-1 (SemAF-Time, ISO-TimeML), thereby producing a rich semantics that will be directly useable
by practitioners in computational linguistics and other communities (see Clause 6). The multilingual
extension of ISOspace will also be treated in a separate part of the ISO 24617- series in the near future.
NOTE Although the schema and DTD are not part of the present document as normative annexes, they will
both be found in a webpage relating to the ISOspace specification.
Normative Annex A is an integral part of ISO 24617 and provides core annotation guidelines.
INTERNATIONAL STANDARD ISO 24617-7:2014(E)
Language resource management — Semantic annotation
framework —
Part 7:
Spatial information (ISOspace)
1 Scope
This part of ISO 24617 provides a framework for encoding a broad range not only of spatial information,
but also of spatiotemporal information relating to motion as expressed in natural language texts. This
part of ISO 24617 includes references to locations, general spatial entities, spatial relations (involving
topological, orientational, and metric values), dimensional information, motion events, and paths.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO 24617-1, Language resource management — Semantic annotation framework (SemAF) — Part 1: Time
and events (SemAF-Time, ISO-TimeML)
ISO/IEC 14977, Information technology — Syntactic metalanguage — Extended BNF
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 24617-1 and the following apply.
3.1
document creation location
dcl
unique place or set of places associated with a document that represents the location (3.7) in which the
document was created
Note 1 to entry: Some collaboratively written documents, such as GoogleDoc 1) documents and chat logs, might
refer not only to a single location but also to a set of locations spread out across the world. Besides, for example,
the creation place of the Hebrew bible or the creation place of each of the books in it is uncertain. The attribute
@dcl will, therefore, have the value “false” which is to be understood to mean “unspecified”, while the value “true”
is to be understood to mean “specified”.
1) GoogleDoc is an example of a suitable product available commercially. This information is given for the
convenience of users of this document and does not constitute an endorsement by ISO of these products.
3.2
event
eventuality
something that can be said to obtain or hold true, to happen or to occur
Note 1 to entry: This is a very broad notion of event, also known in the literature as “eventuality” and includes all
kinds of actions, states, processes, etc. It is not to be confused with the narrower notion of event (as opposed to
the notion of “state”) as something that happens at a certain point in time (e.g. the clock striking two or waking
up) or during a short period of time (e.g. laughing). In ISO-TimeML, the term event is used in a broader sense and
is equivalent to the term eventuality.
[SOURCE: ISO 24617-1:2012]
3.3
event-path
path (3.13) or trajectory followed by a spatial entity (3.17) coincident with a motion-event (3.9)
3.4
extent
textual segment which is a string of character segments in the text to be annotated
EXAMPLE Tokens, words, and non-contiguous phrases (e.g. a complex verb like “look ... up”) are extents.
3.5
figure
spatial entity (3.17) that is considered to be the focal object, which is related to some reference object
3.6
ground
spatial entity (3.17) that acts as reference for a figure (3.5)
3.7
location
point or finite area that is positioned within a space (3.16)
3.8
measure
magnitude of a spatial dimension or relation
EXAMPLE Distance is a spatial relation.
3.9
motion
motion-event
action or process involving the translocation of a spatial object, transformation of some spatial property
of an object, or change in the conformation of an object
Note 1 to entry: A motion (3.9) in ISOspace is a particular kind of event (3.2).
3.10
motion-signal
adjunct
motion-adjunct
path (3.13) of motion and/or manner of motion information contributed by a particle or by a prepositional,
adverbial phrase, in conjunction with a motion (3.9)-related text
Note 1 to entry: This terminology is specific to ISOspace and is different from the general term “adjunct” which is
used to describe optional syntactic elements.
3.11
non-consuming tag
tag (3.19) that has no associated extent (3.4)
Note 1 to entry: The extent (3.4) of a non-consuming tag is a “null” string.
EXAMPLE In John ate an apple but Mary a pear, there are at least two ways of marking up the non-
consuming tag:
a) John ate_e1 an apple, but Mary ∅_e2 a pear;
b) 1)
2) (non-consuming tag)
3.12
orientation
orientation(al) relation
relation between a figure (3.5) and a ground (3.6) that expresses the spatial disposition or direction of a
spatial object within a frame of reference
3.13
path
location (3.7) that consists of a series of locations (3.7)
Note 1 to entry: A spatial object path is a location where the focus is on the potential for traversal or which
functions as a boundary. This includes common nouns like road, coastline, and river and proper names like Route
66 and Kancamagus Highway. Some nouns, such as valley, can be ambiguous: valley can be understood as a path in
“we walked down the valley” or as a place (3.14) in “we live in the valley”.
Note 2 to entry: A path might be represented as an undirected graph whose vertices are locations (3.7) and whose
edges signify continuity; that is to say, a path has no inherent directionality.
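The graph reading of a path in Note 2 can be illustrated with a minimal Python sketch. The class name PathGraph, its methods, and the place names are hypothetical and chosen only for illustration; the sketch assumes that "continuity" means direct adjacency of two locations.

```python
from collections import defaultdict


class PathGraph:
    """Undirected graph: vertices are locations, edges signify continuity."""

    def __init__(self):
        self.adjacent = defaultdict(set)

    def connect(self, a, b):
        # Store the edge in both directions, since a path has no
        # inherent directionality.
        self.adjacent[a].add(b)
        self.adjacent[b].add(a)

    def is_continuous(self, a, b):
        # True if the two locations are joined directly by an edge.
        return b in self.adjacent.get(a, set())


# Hypothetical locations along a road-like path.
route = PathGraph()
route.connect("Chicago", "St. Louis")
route.connect("St. Louis", "Tulsa")
print(route.is_continuous("St. Louis", "Chicago"))
```

Because each edge is stored symmetrically, querying the pair in either order gives the same answer, which is the sense in which the path is undirected.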
3.14
place
geographic or administrative entity that is situated at a location (3.7)
3.15
region
connected, non-empty point-set defined by a domain and its boundary points
Note 1 to entry: The term “region” as defined here does not refer to a political or administrative region such as
“the Canary Islands” or “Hong Kong, SAR”, where SAR is the acronym of “Special Administrative Region”.
3.16
space
dimensional extent in which objects and events (3.2) have a relative position and direction
3.17
spatial entity
object that is situated at a unique location (3.7) for some period of time, and typically has the potential
to undergo translocation
Note 1 to entry: A spatial entity can also be understood as an object that participates in a spatial relation. In John is
sitting in a car, both John and car could be understood as spatial entities or as being the figure (3.5) and the ground
(3.6), respectively, of the sitting-in situation.
3.18
spatial signal
segment or series of segments of a text that rebounds to orientational (3.12) or topological relations (3.20)
3.19
tag
element name
name associated with textual segments for annotation or for a relation between these segments
Note 1 to entry: The following are two kinds of tag for annotation:
a) extent tag, which is associated with textual segments referring to basic entities or signals;
b) link tag, for representing spatial relations.
3.20
topological relation
relation that expresses the connectedness or continuity of spaces (3.16)
4 List of tags
4.1 General
The tag in angled brackets stands for the name of an XML element. See 8.2.
4.2 Extent tags: Basic entities and signals
4.2.1
measure
extent tag representing some measure (3.8)
4.2.2
motion
extent tag representing a motion (3.9)
4.2.3
motionSignal
extent tag representing a motion-signal (3.10)
4.2.4
non-motion event
extent tag representing a non-motion event (3.2)
4.2.5
path
extent tag that represents a path (3.13)
4.2.6
place
extent tag that represents a place (3.14)
4.2.7
spatialEntity
extent tag that represents a spatial entity (3.17)
4.2.8
spatialSignal
extent tag that represents a spatial signal (3.18)
4.3 Link tags
4.3.1
mLink
linking tag that represents some measure (3.8)
4.3.2
moveLink
linking tag that represents a relation between a motion (3.9) and participant spatial entities (3.17)
4.3.3
oLink
linking tag that represents an orientation relation (3.12) between a figure (3.5) and a ground (3.6)
4.3.4
qsLink
linking tag that represents a topological relation (3.20)
NOTE The tag qsLink or <qsLink> stands for a qualitative spatial link.
4.4 Root element
4.4.1
isoSpace
root element in which all ISOspace tags are embedded
NOTE In ISOspace annotations, all of the extent and link tags listed above are embedded in the tag <isoSpace>.
5 Overview
Human languages impose diverse linguistic constructions for expressing concepts of space, of spatially-
anchored events, and of spatial configurations that relate in complex ways to the situations in which
they are used. One area that deserves further development regarding the connection between natural
language and formal representations of space is the automatic enrichment of textual data with spatial
annotations. There is a growing demand for such annotated data, particularly in the context of the
semantic web. Moreover, textual data routinely make reference to objects moving through space over
time. Integrating such information derived from textual sources into a geosensor data system can enhance
the overall spatiotemporal representation in changing and evolving situations, such as when tracking
objects through space with limited image data. It follows that verbal subjective descriptions of spatial
relations need to be translated into metrically meaningful positional information. A central research
question currently hindering progress in interpreting textual data is the lack of a clear separation of
the information that can be derived directly from linguistic interpretation and further information that
requires contextual interpretation. In order to avoid building incorrect deductions into the annotations
themselves, mark-up schemes should avoid over-annotating the text. Solutions to the language-space
mapping problem and its grounding in geospatial data are urgently required for this purpose.
There are many applications and tasks that would benefit from a robust spatial mark-up language, such
as ISOspace. These applications and tasks include the following:
a) creating a visualization of objects from a verbal description of a scene;
b) identifying the spatial relations associated with a sequence of processes and events from a news article;
c) determining an object location or tracking a moving object from a verbal description;
d) translating viewer-centric verbal descriptions into other relative descriptions or absolute coordinate
descriptions;
e) constructing a route given a route description;
f) constructing a spatial model of an interior or exterior space given a verbal description;
g) integrating spatial descriptions with information from other media.
The goal of ISOspace is not to provide a formalism that fully represents the complexity of spatial
language but rather to capture these complex constructions in text in order to provide an inventory
of how spatial information is presented in natural language. For example, many texts have no explicit
frame of spatio-temporal reference, thus, making it impossible to annotate such an unspecified frame
of reference. The interpretation of spatial prepositions, such as “on” in “a book on the desk” vs “a picture on
the wall”, requires a handbook of its own dealing with different senses or uses of spatial prepositions
beyond a set of annotation guidelines. Any detailed classification of motion verbs in English alone is
again beyond the scope of this International Standard.
All of the examples in the current version of this part of ISO 24617 are from English datasets. The specification
language proposed in this International Standard can be seen as a version of ISOspace for English only
and its applicability to other languages is still pending. A multilingual extension of ISOspace is necessary
if the document is to be verified, but this is expected to immediately follow preliminary rigorous work
on establishing the first edition of this part of ISO 24617 as an International Standard for spatial and
motion-related annotation.
6 Motivation and requirements
This International Standard aims to formulate the requirements for spatiotemporal annotation
standards and to develop the ISOspace standard to meet these requirements. It assumes ISO 24612 and
builds on previous work, including ISO 24617-1 and other spatial representations and calculi.
Natural language abounds with descriptions of motion. Our experience of our own motion, together
with our perception of motion in the world, have given human languages substantial means to verbally
express many different aspects of movement, including its temporal circumstances, spatial trajectory and
manner. In every language on earth, verbalizations of motion can specify changes in the spatial position
of an object over time. In addition to when and where the motion takes place, languages additionally
characterize how the motion takes place (e.g., its path, its manner, and how it was caused). In particular,
the path of motion involves conceptualizations of the various spatial relationships that an object can
have to other objects in the space in which it moves. An understanding of such spatial information in
natural language is necessary for many computational linguistics and artificial intelligence applications.
Any specification language for spatial information in language will need to support the following
computational tasks:
— identification of the appropriate topological configuration between two regions or objects (e.g.
containment, identity, disjointedness, connectedness, overlap, and closure over these relations,
when possible);
— identification of directional and orientational relations between objects and regions, including the
distinction between frames of reference;
— identification of metric properties of objects and metric values between regions and objects, when
possible (e.g. distance, height and width);
— identification of the motion of objects through time and a characterization of the nature of this movement;
— the provision of clear interoperable interfaces to existing representations and geo-databases (e.g.
GeoNames, ArcGIS, and Google Earth 2)).
NOTE 1 Texts are often completely unspecified for frames of reference (texts are, so to speak, “not situated”)
and it, therefore, appears that the annotation of a frame of reference cannot be provided for many texts.
2) GeoNames, ArcGIS, and Google Earth are examples of suitable products available commercially. This information
is given for the convenience of users of this document and does not constitute an endorsement by ISO of these
products.
NOTE 2 Measure expressions, such as 20 miles, have two attributes, numeric @value “20” and @unit “miles”,
but expressions like near and far have no unit specified. For these, the annotation scheme proposed in this
International Standard can only state that they are measure-related expressions with only the attribute @value
specified, say with “near” or “far”. As will be seen, many of the annotation cases are left underspecified.
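The split described in NOTE 2 can be sketched in Python as follows. The names Measure and parse_measure, and the rule that a leading number signals a unit-bearing expression, are assumptions made for illustration; they are not part of ISOspace.

```python
import re
from typing import NamedTuple, Optional


class Measure(NamedTuple):
    value: str            # corresponds to @value
    unit: Optional[str]   # corresponds to @unit; None when left unspecified


# A leading number followed by whitespace and a unit word or phrase.
_NUMERIC = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(\S.*?)\s*$")


def parse_measure(expression: str) -> Measure:
    match = _NUMERIC.match(expression)
    if match:
        return Measure(value=match.group(1), unit=match.group(2))
    # Expressions such as "near" or "far" carry a value but no unit.
    return Measure(value=expression.strip(), unit=None)


print(parse_measure("20 miles"))
print(parse_measure("near"))
```

Leaving the unit as None rather than guessing one mirrors the underspecification that the note describes.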
7 Specification of ISOspace for spatial annotation
7.1 Overview: annotation vs. representation
As with other areas of work on semantic annotation carried out by the ISO Working Group (ISO/TC
37/SC 4/WG 2), ISOspace draws a fundamental distinction between the concepts of annotation and
representation; ISO 24612 does likewise. The term “annotation” is used to refer to the process of
adding information to segments of language data or to refer to that information itself. This notion is
independent of the format in which this information is represented. The term “representation” is used to
refer to the format in which an annotation is rendered (for instance, in XML) independent of its content.
According to ISO 24612, annotations are the proper level of standardization, not representations. This
part of ISO 24617 therefore defines a specification language for annotating documents with information
about spatial entities and spatial relations at the level of annotations and then for representing these
annotations in a specific way, either with XML or with a predicate-logic-like format, as used in Annex A.
This language is called “ISOspace”.
However, the current version of ISOspace does not offer a formal specification of its annotation structure
with an abstract syntax and a formal semantics. This task will be taken up in a proposed work item, ISO
PWI 24617-9, which aims to achieve a full development of spatial semantics. Instead, ISOspace will simply
specify a concrete XML-based syntax in 8.2 and a set of core annotation guidelines in Annex A.
7.2 Abstract syntax for the ISOspace annotation structure
An abstract syntax provides a theoretical basis for deriving various versions of a concrete syntax. In
this part of ISO 24617, the abstract syntax of ISOspace is schematically represented by the UML-based
metamodel (Figure 1), which specifies an annotation structure for spatial information consisting of
two substructures: an entity structure and a link structure. The entity structure of ISOspace consists
of basic spatial entities that are anchored to textual fragments called “markables” or “extents”; the link
structure relates these spatial entities and assigns a specific relation-type to each relation.
[Figure 1: UML class diagram relating primaryData, markable, isoSpace, entity, signal, and link, with their subclasses; the event class is inherited from ISO-TimeML.]
Figure 1 — Schematic metamodel of ISOspace
As Figure 1 shows, the annotation structure of ISOspace consists of the following three classes of entities
and four types of links:
a) three major basic semantic entities: spatial entity, event, and signal with their respective subclasses:
1) spatial entity: location: place and path;
2) event: motion and non-motion event;
3) signal: spatial signal, motion-signal, and measure;
b) four types of links: qsLink, oLink, moveLink, and mLink.
NOTE In earlier versions of ISOspace, the basic entity, motion-signal, was treated as part of spatial signals.
However, in the current version, it is treated differently because of its specific function with respect to motions:
it provides additional information on either the path or the manner of motions. It is therefore called a motion
signal, a motion adjunct, or simply an adjunct so that it is not confused with spatial signals.
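The class inventory listed above can be written down as a small lookup table. This is only a sketch: the names PARENT, LINK_TYPES, and top_level_class are hypothetical, and the camelCase spellings are assumed from the tag list in Clause 4 and the metamodel labels.

```python
# Each subclass maps to its immediate parent class.
PARENT = {
    "location": "spatialEntity",
    "place": "location",
    "path": "location",
    "motion": "event",
    "nonMotionEvent": "event",
    "spatialSignal": "signal",
    "motionSignal": "signal",
    "measure": "signal",
}

# The four link types of the link structure.
LINK_TYPES = ("qsLink", "oLink", "moveLink", "mLink")


def top_level_class(name: str) -> str:
    """Walk up the hierarchy until a class with no parent is reached."""
    while name in PARENT:
        name = PARENT[name]
    return name


print(top_level_class("place"))
print(top_level_class("motionSignal"))
```

A flat parent map keeps the two-level nesting under spatial entity (location, then place and path) explicit without needing a class hierarchy.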
8 Representation of ISOspace-conformant annotations
8.1 XML-based concrete syntax: outline
8.1.1 Overview
The version of ISOspace’s concrete syntax in Clause 8 is an XML serialization of the spatial annotation
structure or of the abstract syntax informally presented in Clause 7 with a UML-based metamodel. The
concrete syntax of ISOspace consists of basic entities (8.1.2), signals (8.1.3), links (8.1.4), and the root
element (8.1.5).
8.1.2 Basic entities
There are five basic entity tags: <place>, <path>, <spatialEntity>, <motion>, and <event>.
NOTE The <place> and <path> tags are subclasses of a location, but the location itself is not introduced as forming
its own element in this XML-based ISOspace. Non-location spatial entities are tagged simply as <spatialEntity>.
There is no <nonMotionEvent> tag as such. The <event> tag is inherited from ISO 24617-1:2012(E) ISO-TimeML,
but is understood in ISOspace to stand for the class of non-motion events.
8.1.3 Signals
There are three signal tags: <spatialSignal>, <motionSignal>, and <measure>.
8.1.4 Links
There are four links: <qsLink>, <oLink>, <moveLink>, and <mLink>.
In 8.4, the four links specified in ISOspace’s link structure are represented by their respective XML tags.
8.1.5 Root element
Each bundle of XML elements forms a tree-like structure called an “XML document”. This part of ISO 24617
has a single element called a “root element” that encloses all the other elements in the document.
For each ISOspace document, its root element is called “<isoSpace>”.
EXAMPLE
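As a minimal sketch of a root element enclosing other elements, the following Python snippet builds such a document with the standard library. The attribute names (id, relType, figure, ground, trigger), the identifiers, and the value "IN" are assumptions made for illustration and should not be read as the normative attribute lists.

```python
import xml.etree.ElementTree as ET

# The root element encloses every other element in the document.
root = ET.Element("isoSpace")

# Hypothetical extent tags: a spatial entity, a place, and a spatial signal.
ET.SubElement(root, "spatialEntity", {"id": "se1"})
ET.SubElement(root, "place", {"id": "pl1"})
ET.SubElement(root, "spatialSignal", {"id": "ss1"})

# Hypothetical link tag; IDRef values carry the "#" prefix.
ET.SubElement(
    root,
    "qsLink",
    {"id": "qsl1", "relType": "IN", "figure": "#se1",
     "ground": "#pl1", "trigger": "#ss1"},
)

document = ET.tostring(root, encoding="unicode")
print(document)
```

Parsing the resulting string back with ET.fromstring yields a tree whose single top-level element is isoSpace.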
8.2 Conventions for tagging
8.2.1 Naming conventions
Naming conventions can be quite complex. Here are four basic guidelines:
a) This part of ISO 24617 follows medial capitalization, also called “CamelCase”, thus avoiding the use
of the hyphen “-” or the underscore “_” in concatenating more than two words.
EXAMPLE 1 , , or @relatedToEvent, instead of ,
, or @related_to_event.
b) This part of ISO 24617 also avoids the use of uppercase unless it is absolutely necessary (e.g. acronyms such
as “XML” and UML class names such as “Entity” as a class).
EXAMPLE 2 , , , instead of , , , or
.
NOTE 1 “ISO” in “ISOspace” is the Greek prefix iso-, meaning “equal”. “AF” is the acronym for “Annotation
Framework”.
NOTE 2 ISOspace refers to this part of ISO 24617 and to the specification language for the annotation
of motion, together with other types of event-related spatial information presented in the ISO document; the
<isoSpace> element refers to an XML tag for a concrete annotation of a textual fragment based on ISOspace.
c) This part of ISO 24617 therefore allows both lowerCamelCase and UpperCamelCase, although the XML
serialization of ISOspace adopts lowerCamelCase for the representation of element names and tags.
d) The values of the various ID attributes are specified as beginning with one or more lowercase
alphabetical characters, followed by an integer. This scheme is mandated by the syntax of XML.
EXAMPLE 3
,
NOTE 3 “pl23” is a valid XML ID, but “23” is not.
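The ID scheme in guideline d) can be checked with a short regular expression. This is a sketch under the assumption that "alphabetical characters" means ASCII a to z; the names ID_PATTERN and is_valid_id are hypothetical. Note that it tests the convention stated here, which is narrower than what XML itself accepts as an ID.

```python
import re

# One or more lowercase letters followed by an integer, e.g. "pl23".
ID_PATTERN = re.compile(r"[a-z]+[0-9]+")


def is_valid_id(value: str) -> bool:
    # fullmatch ensures that the whole string follows the scheme.
    return ID_PATTERN.fullmatch(value) is not None


for candidate in ("pl23", "23", "Pl23", "pl"):
    print(candidate, is_valid_id(candidate))
```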
Names for elements, attributes, and their values might be mentioned or listed in the documents. Where
this occurs, the following mentioning conventions are followed:
— element names are braced with a pair of angled brackets;
EXAMPLE 4 , , and
— attribute names are prefixed with @;
EXAMPLE 5 @value, @referencePt, and @frameType.
NOTE 4 @ is not part of attribute names.
— values of attributes are in double quotes.
EXAMPLE 6 birthPlace=”Boston” and xml:id=”e1”.
NOTE 5 Some attribute values might refer to an ID value that occurs somewhere in the annotation, that is
to say, an IDRef value. In cases such as this, the “#” symbol is prefixed to it.
EXAMPLE 7
8.2.2 Convention for inline tagging extents
For illustration, extents in a sample text are often inline tagged with their identifiers or some other tag
names. Here are some conventions for such tagging:
a) Style guides generally do not recommend boldface text for providing emphasis. Hence, the use of
boldface is discouraged.
EXAMPLE 1 Tsingtao beer is produced in Qingdao_tok7.
Boldface here is not recommended in actual tagging.
b) The end of each extent is marked with a unique ID in subscript.
EXAMPLE 2 Tsingtao beer is produced in Qingdao_tok7.
c) If an extent consists of more than one token, then it is enclosed by a pair of square brackets and an
ID is placed outside of the closing bracket.
EXAMPLE 3 John hopped_m [out of the room]_p.
d) If an extent is a non-contiguous sequence of more than one token, then each non-contiguous token
is bracketed and marked with an identical ID.
EXAMPLE 4 Mia [looked]_e1 me [up]_e1.
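Conventions b) to d) can be read as a small grammar for inline-tagged text. The Python sketch below collects extents by ID under one assumption that is not in the standard: subscript IDs are written in plain text with an underscore (`Qingdao_tok7`, `[looked]_e1`). The function name `collect_extents` is hypothetical.

```python
import re
from collections import defaultdict

# A bracketed multi-token extent or a single token, followed by "_" and an ID.
_EXTENT = re.compile(
    r"(?:\[(?P<multi>[^\]]+)\]|(?P<single>[^\W_]+))_(?P<id>[a-z]+[0-9]+)"
)


def collect_extents(tagged: str) -> dict:
    """Map each ID to the list of text segments tagged with it, in text order."""
    extents = defaultdict(list)
    for match in _EXTENT.finditer(tagged):
        segment = match.group("multi") or match.group("single")
        extents[match.group("id")].append(segment)
    return dict(extents)


print(collect_extents("Tsingtao beer is produced in Qingdao_tok7."))
print(collect_extents("Mia [looked]_e1 me [up]_e1."))
```

An ID that maps to more than one segment corresponds to a non-contiguous extent as described in d).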
10 © ISO 2014 – All rights reserved
8.3 Basic entity tags
8.3.1 <place>
The ISOspace <place> tag is inherited from Reference [40] with some additions and modifications. This
tag is used to annotate geographic entities like lakes and mountains, as well as administrative entities
like towns and counties. With the exception of implicit, non-consuming tags, a <place> tag in ISOspace
should be directly linked to an explicit span of text.
The syntax and definition for the <place> tag are set out below.
List 1 — List of attributes for the <place> tag
|postalCode|postBox|ppl|ppla|pplc|rgn|state|UTM) #IMPLIED >
The ATTLIST can specify that an attribute can be any of the following (this list is not exhaustive):
a) ID should start with an alphabetic character and may not contain spaces. An ID value should be unique
within the document. The name of the attribute @id may take a prefix “xml:” for XML documents.
b) IDRef should be a value that is used as an ID somewhere in the document or in the annotation.
c) CDATA is any parsed character data.
d) “#REQUIRED” is for required attributes, whereas “#IMPLIED” is for optional ones, allowing no value
to be specified.
NOTE 1 A value for the @latLong attribute will be provided automatically; it is therefore not
usually specified manually.
The attributes for the <place> tag are largely inherited from Reference [40]. For example:
— the value “mtn” stands for mountain;
— the value “mts” stands for mountain range;
— the value “ppl” stands for populated place;
— the value “ppla” stands for a capital of a sub-country (populated area), such as a state or a province;
— the value “pplc” stands for a capital of a country (populated place);
— the value “rgn” stands for a (non-political or non-administrative) region, such as a desert.
For places that have known latitude and longitude values, the @latLong attribute can be used to allow
for mapping to other resources such as Google Maps³⁾.
NOTE 2 For further details, see Reference[40] Table 1, ISO 3166-1:1999, Table 2 and Table 4, as well as other
parts of the manual as a whole.
Adopting standoff annotation, ISOspace requires an attribute @markable to refer to a markable in a
tokenized text or an extent in the given text. It also includes a Document Creation Location or @dcl
attribute, which is a special location that serves as the “narrative location”. If the document includes a
@dcl, it is generally specified at the beginning of the text, in rather the same way that a Document
Creation Time is specified in ISO-TimeML. If a place is the DCL, the special @dcl attribute is annotated
as “true” and all other location tags have the default @dcl value of “false”. The current set of
attributes is shown in List A.1 in A.3.1.2.
NOTE 3 The default value for the attribute @dcl is “false”. This means that a document creation location is
not specified.
NOTE 4 It is worth remembering that, by convention, tag names such as <place> and the values of
attributes are no longer represented in uppercase, but in lowercase (unless they are acronyms), while the name of
each attribute, such as latLong, is preceded by the prefix @ and is thus represented as @latLong. This convention
is adopted throughout the whole document.
The values for the @type attribute are identical to those for the SpatialML <PLACE> tag, although there
are exceptions, such as “vehicle”, which is a <spatialEntity> (spatial entity) in ISOspace and “road”, which
is a <path> in ISOspace. A <place> can take the form of a proper name (New York) or a nominal
(town); these are marked with the @form attribute as “nam” or “nom” respectively. For applications to
countries other than the U.S., ISOspace also adds “province” both as a value for the attribute @type and
as an attribute for the element <place>.
EXAMPLE Text: Tsingtao beer is produced in Qingdao_tok7.
province="Shandong" country="CN"/>
NOTE 5 For its value, the attribute @target can refer to a token in a tokenized text. However, in ISOspace, it has
been replaced with the attribute @markable, which takes as its value either a) an extent taken directly from the text, in order
to make examples more readable, or b) a token ID. It follows that, in the above example, the value “tok7” of
@markable can be replaced with “Qingdao”; the two can be used interchangeably.
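To make the role of @markable concrete, the following Python sketch builds a <place> element for the Qingdao example and resolves its @markable value to text. It is a sketch under stated assumptions rather than the annotation printed in the standard: the ID "pl1", the token table, the helper names `make_place` and `markable_text`, and the handling of a leading "#" on a token reference are all illustrative choices.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as xml:id.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical token table; only tok7 ("Qingdao") comes from the example text.
TOKENS = {"tok7": "Qingdao"}


def make_place(markable: str) -> ET.Element:
    """Build a <place> element for Qingdao with the given @markable value."""
    return ET.Element(
        "place",
        {
            XML_ID: "pl1",            # assumed ID, letters followed by an integer
            "markable": markable,     # token reference or literal extent
            "form": "nam",            # proper name
            "province": "Shandong",
            "country": "CN",
            "dcl": "false",           # default: not the document creation location
        },
    )


def markable_text(place: ET.Element, tokens: dict) -> str:
    """Resolve @markable to text, whether it is a token ID or a literal extent."""
    value = place.get("markable", "")
    return tokens.get(value.lstrip("#"), value)


by_token = make_place("#tok7")
by_extent = make_place("Qingdao")
print(ET.tostring(by_token, encoding="unicode"))
print(markable_text(by_token, TOKENS), markable_text(by_extent, TOKENS))
```

Both forms resolve to the same extent, which is the interchangeability that NOTE 5 describes.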
NOTE 6 Although this part of ISO 24617 describes the full ISOspace language, many of the example annotations
provided show the result of human annotation but do not include elements (e.g. attributes and/or attribute values)
that can b
...
SLOVENIAN STANDARD
1 September 2018
Upravljanje z jezikovnimi viri - Ogrodje za semantično označevanje (SemAF) - 7. del: Prostorske informacije (ISOspace)
Language resource management -- Semantic annotation framework -- Part 7: Spatial information (ISOspace)
Gestion des ressources linguistiques -- Cadre d'annotation sémantique -- Partie 7: Information spatiale (ISOspace)
This Slovenian standard is identical to: ISO 24617-7:2014
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
INTERNATIONAL STANDARD ISO 24617-7
First edition 2014-12-15
Language resource management — Semantic annotation framework — Part 7: Spatial information (ISOspace)
Gestion des ressources linguistiques — Cadre d’annotation sémantique — Partie 7: Information spatiale (ISOspace)
Reference number ISO 24617-7:2014(E)
© ISO 2014
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 List of tags
5 Overview
6 Motivation and requirements
7 Specification of ISOspace for spatial annotation
7.1 Overview: annotation vs. representation
7.2 Abstract syntax for the ISOspace annotation structure
8 Representation of ISOspace-conformant annotations
8.1 XML-based concrete syntax: outline
8.1.1 Overview
8.1.2 Basic entities
8.1.3 Signals
8.1.4 Links
8.1.5 Root element
8.2 Conventions for tagging
8.2.1 Naming conventions
8.2.2 Convention for inline tagging extents
8.3 Basic entity tags
8.3.1 <place>
8.3.2
8.3.3
8.3.4
8.3.5 <event> for non-motion event
8.3.6
8.3.7
8.3.8
8.4 Link tags
8.4.1
8.4.2
8.4.3
8.4.4
8.5 Root tag: <isoSpace>
8.6 Summary
8.6.1 Identifier
8.6.2 Shared attributes
8.6.3 IDRef as value
Annex A (normative) Core annotation guidelines
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on
the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical Barriers
to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
ISO 24617 consists of the following parts, under the general title Language resource management —
Semantic annotation framework (SemAF):
— Part 1: Time and events (SemAF-Time, ISO-TimeML)
— Part 2: Dialogue acts
— Part 4: Semantic roles (SemAF-SR)
— Part 5: Discourse structures (SemAF-DS)
— Part 6: Principles of semantic annotation (SemAF-Basics)
— Part 7: Spatial information (ISOspace)
— Part 8: Semantic relations in discourse (SemAF-DRel)
Introduction
The automatic recognition of spatial information in natural language is currently attracting considerable
attention in the fields of computational linguistics and artificial intelligence. The development of
algorithms that exhibit “spatial awareness” promises to add needed functionality to NLP systems, from
named entity recognition to question-answering and text-based inference. However, in order for such
systems to reason spatially, they require the enrichment of textual data with the annotation of spatial
information in language. This involves a large range of linguistic constructions, including spatially
anchoring events, descriptions of objects in motion, viewer-relative descriptions of scenes, absolute
spatial descriptions of locations, and many other constructions.
This part of ISO 24617 was developed in collaboration with the ISOspace working group at Brandeis
University with the aim of providing an International Standard for the representation of spatial information
relating to locations, motions and non-motion events in language.
NOTE The ISOspace Working Group is headed by James Pustejovsky, jampesp@cs.brandeis.edu, Brandeis
University, Waltham, MA, U.S.A.
This part of ISO 24617 provides normative specifications and guidelines not only for spatial information,
but also for information content in motion and various other types of event in language.
The main clauses of this part of ISO 24617 are the following:
a) Scope;
b) Normative references;
c) Terms and definitions;
d) List of tags or names of elements;
e) Overview;
f) Motivation and requirements;
g) Specification of the ISOspace annotation structure;
h) Representation of ISOspace-conformant annotations.
Clause 8 introduces an XML-based concrete syntax for representing spatial-related or motion-related
annotations based on the annotation structure of ISOspace that is presented in Clause 7 with a UML-
based metamodel.
A formal semantics for ISOspace will be provided as part of a future new work item within the semantic
annotation framework. This will be coordinated with the temporal semantics and specification of
ISO 24617-1 (SemAF-Time, ISO-TimeML), thereby producing a rich semantics that will be directly useable
by practitioners in computational linguistics and other communities (see Clause 6). The multilingual
extension of ISOspace will also be treated in a separate part of the ISO 24617 series in the near future.
NOTE Although the schema and DTD are not part of the present document as normative annexes, they will
both be found in a webpage relating to the ISOspace specification.
Normative Annex A is an integral part of this part of ISO 24617 and provides core annotation guidelines.
INTERNATIONAL STANDARD ISO 24617-7:2014(E)
Language resource management — Semantic annotation
framework —
Part 7:
Spatial information (ISOspace)
1 Scope
This part of ISO 24617 provides a framework for encoding a broad range not only of spatial information,
but also of spatiotemporal information relating to motion as expressed in natural language texts. This
part of ISO 24617 includes references to locations, general spatial entities, spatial relations (involving
topological, orientational, and metric values), dimensional information, motion events, and paths.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO 24617-1, Language resource management — Semantic annotation framework (SemAF) — Part 1: Time
and events (SemAF-Time, ISO-TimeML)
ISO/IEC 14977, Information technology — Syntactic metalanguage — Extended BNF
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 24617-1 and the following apply.
3.1
document creation location
dcl
unique place or set of places associated with a document that represents the location (3.7) in which the
document was created
Note 1 to entry: Some collaboratively written documents, such as GoogleDoc¹⁾ documents and chat logs, might
refer not only to a single location but also to a set of locations spread out across the world. Moreover, the place
of creation of a document can be uncertain, as it is for the Hebrew Bible and for each of the books in it. In such
cases, the attribute @dcl has the value “false”, which is to be understood to mean “unspecified”, whereas the
value “true” is to be understood to mean “specified”.
3.2
event
eventuality
something that can be said to obtain or hold true, to happen or to occur
Note 1 to entry: This is a very broad notion of event, also known in the literature as “eventuality” and includes all
kinds of actions, states, processes, etc. It is not to be confused with the narrower notion of event (as opposed to
the notion of “state”) as something that happens at a certain point in time (e.g. the clock striking two or waking
up) or during a short period of time (e.g. laughing). In ISO-TimeML, the term event is used in a broader sense and
is equivalent to the term eventuality.
[SOURCE: ISO 24617-1:2012]
1) GoogleDoc is an example of a suitable product available commercially. This information is given for the
convenience of users of this document and does not constitute an endorsement by ISO of these products.
3.3
event-path
path (3.13) or trajectory followed by a spatial entity (3.17) coincident with a motion-event (3.9)
3.4
extent
textual segment that is a string of character segments in the text to be annotated
EXAMPLE Tokens, words, and non-contiguous phrases (e.g. a complex verb like “look ... up”) are extents.
3.5
figure
spatial entity (3.17) that is considered to be the focal object, which is related to some reference object
3.6
ground
spatial entity (3.17) that acts as reference for a figure (3.5)
3.7
location
point or finite area that is positioned within a space (3.16)
3.8
measure
magnitude of a spatial dimension or relation
EXAMPLE Distance is a spatial relation.
3.9
motion
motion-event
action or process involving the translocation of a spatial object, transformation of some spatial property
of an object, or change in the conformation of an object
Note 1 to entry: A motion (3.9) in ISOspace is a particular kind of event (3.2).
3.10
motion-signal
adjunct
motion-adjunct
path (3.13) of motion and/or manner of motion information contributed by a particle or by a prepositional
or adverbial phrase, in conjunction with a motion (3.9)-related text
Note 1 to entry: This terminology is specific to ISOspace and is different from the general term “adjunct” which is
used to describe optional syntactic elements.
3.11
non-consuming tag
tag (3.19) that has no associated extent (3.4)
Note 1 to entry: The extent (3.4) of a non-consuming tag is a “null” string.
EXAMPLE In John ate an apple but Mary a pear, there are at least two ways of marking up the non-
consuming tag:
a) John ate_e1 an apple, but Mary ∅_e2 a pear;
b) 1)
2) (non-consuming tag)
3.12
orientation
orientation(al) relation
relation between a figure (3.5) and a ground (3.6) that expresses the spatial disposition or direction of a
spatial object within a frame of reference
3.13
path
location (3.7) that consists of a series of locations (3.7)
Note 1 to entry: A spatial object path is a location where the focus is on the potential for traversal or which
functions as a boundary. This includes common nouns like “road”, “coastline”, and “river” and proper names like Route
66 and Kancamagus Highway. Some nouns, such as “valley”, can be ambiguous: “valley” can be understood as a path in
“we walked down the valley” or as a place (3.14) in “we live in the valley”.
Note 2 to entry: A path might be represented as an undirected graph whose vertices are locations (3.7) and whose
edges signify continuity; that is to say, a path has no inherent directionality.
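The graph reading in Note 2 can be sketched in a few lines of Python. This is an illustration only: the function name `build_path` and the location names are hypothetical, and representing edges as adjacency sets is one possible choice among several.

```python
from collections import defaultdict


def build_path(locations):
    """Link consecutive locations in both directions (an undirected graph)."""
    graph = defaultdict(set)
    for a, b in zip(locations, locations[1:]):
        graph[a].add(b)  # edge a-b signifies continuity
        graph[b].add(a)  # stored symmetrically: no inherent direction
    return dict(graph)


stops = ["trailhead", "ridge", "summit"]  # hypothetical series of locations
forward = build_path(stops)
backward = build_path(list(reversed(stops)))
print(forward == backward)  # the same graph results in either order
```

Because each edge is stored symmetrically, listing the locations in reverse yields the same structure, which reflects the statement that a path has no inherent directionality.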
3.14
place
geographic or administrative entity that is situated at a location (3.7)
3.15
region
connected, non-empty point-set defined by a domain and its boundary points
Note 1 to entry: The term “region” as defined here does not refer to a political or administrative region such as
“the Canary Islands” or “Hong Kong, SAR”, where SAR is the acronym of “Special Administrative Region”.
3.16
space
dimensional extent in which objects and events (3.2) have a relative position and direction
3.17
spatial entity
object that is situated at a unique location (3.7) for some period of time, and typically has the potential
to undergo translocation
Note 1 to entry: A spatial entity can also be understood as an object that participates in a spatial relation. In “John is
sitting in a car”, both “John” and “car” could be understood as spatial entities or as being the figure (3.5) and the ground
(3.6), respectively, of the sitting-in situation.
3.18
spatial signal
segment or series of segments of a text that rebounds to orientational (3.12) or topological relations (3.20)
3.19
tag
element name
name associated with textual segments for annotation or for a relation between these segments
Note 1 to entry: The following are two kinds of tag for annotation:
a) extent tag, which is associated with textual segments referring to basic entities or signals;
b) link tag, for representing spatial relations.
3.20
topological relation
relation that expresses the connectedness or continuity of spaces (3.16)
4 List of tags
4.1 General
The tag in angled brackets stands for the name of an XML element. See 8.2.
4.2 Extent tags: Basic entities and signals
4.2.1
measure
extent tag representing some measure (3.8)
4.2.2
motion
extent tag representing a motion (3.9)
4.2.3
motionSignal
extent tag representing a motion-signal (3.10)
4.2.4
non-motion event
extent tag representing a non-motion event (3.9)
4.2.5
path
extent tag that represents a path (3.13)
4.2.6
place
extent tag that represents a place (3.14)
4.2.7
spatialEntity
extent tag that represents a spatial entity (3.17)
4.2.8
spatialSignal
extent tag that represents a spatial signal (3.18)
4.3 Link tags
4.3.1
mLink
linking tag that represents some measure (3.8)
4.3.2
moveLink
linking tag that represents a relation between a motion (3.9) and participant spatial entities (3.17)
4.3.3
oLink
linking tag that represents an orientation relation (3.12) between a figure (3.5) and a ground (3.6)
4.3.4
qsLink
linking tag that represents a topological relation (3.20)
NOTE The tag qsLink or <qsLink> stands for a qualitative spatial link.
4.4 Root element
4.4.1
isoSpace
root element in which all ISOspace tags are embedded
NOTE In ISOspace annotations, all of the extent and link tags listed above are embedded in the tag <isoSpace>.
5 Overview
Human languages impose diverse linguistic constructions for expressing concepts of space, of spatially-
anchored events, and of spatial configurations that relate in complex ways to the situations in which
they are used. One area that deserves further development regarding the connection between natural
language and formal representations of space is the automatic enrichment of textual data with spatial
annotations. There is a growing demand for such annotated data, particularly in the context of the
semantic web. Moreover, textual data routinely make reference to objects moving through space over
time. Integrating such information derived from textual sources into a geosensor data system can enhance
the overall spatiotemporal representation in changing and evolving situations, such as when tracking
objects through space with limited image data. It follows that verbal subjective descriptions of spatial
relations need to be translated into metrically meaningful positional information. A central
problem currently hindering progress in interpreting textual data is the lack of a clear separation between
the information that can be derived directly from linguistic interpretation and further information that
requires contextual interpretation. In order to avoid building incorrect deductions into the annotations
themselves, mark-up schemes should avoid over-annotating the text. Solutions to the language-space
mapping problem and its grounding in geospatial data are urgently required for this purpose.
There are many applications and tasks that would benefit from a robust spatial mark-up language, such
as ISOspace. These applications and tasks include the following:
a) creating a visualization of objects from a verbal description of a scene;
b) identifying the spatial relations associated with a sequence of processes and events from a news article;
c) determining an object location or tracking a moving object from a verbal description;
d) translating viewer-centric verbal descriptions into other relative descriptions or absolute coordinate
descriptions;
e) constructing a route given a route description;
f) constructing a spatial model of an interior or exterior space given a verbal description;
g) integrating spatial descriptions with information from other media.
The goal of ISOspace is not to provide a formalism that fully represents the complexity of spatial
language but rather to capture these complex constructions in text in order to provide an inventory
of how spatial information is presented in natural language. For example, many texts have no explicit
frame of spatio-temporal reference, which makes it impossible to annotate a frame
of reference for them. The interpretation of spatial prepositions, such as “on” in “a book on the desk” vs. “a picture on
the wall”, requires a handbook of its own dealing with different senses or uses of spatial prepositions
beyond a set of annotation guidelines. Any detailed classification of motion verbs in English alone is
again beyond the scope of this International Standard.
All of the examples in the current version of this part of ISO 24617 are from English datasets. The specification
language proposed in this International Standard can be seen as a version of ISOspace for English only
and its applicability to other languages is still pending. A multilingual extension of ISOspace is necessary
if the document is to be verified, but this is expected to immediately follow preliminary rigorous work
on establishing the first edition of this part of ISO 24617 as an International Standard for spatial and
motion-related annotation.
6 Motivation and requirements
This International Standard aims to formulate the requirements for spatiotemporal annotation
standards and to develop the ISOspace standard to meet these requirements. It assumes ISO 24612 and
builds on previous work, including ISO 24617-1 and other spatial representations and calculi.
Natural language abounds with descriptions of motion. Our experience of our own motion, together
with our perception of motion in the world, has given human languages substantial means to verbally
express many different aspects of movement, including its temporal circumstances, spatial trajectory and
manner. In every language on earth, verbalizations of motion can specify changes in the spatial position
of an object over time. In addition to when and where the motion takes place, languages also
characterize how the motion takes place (e.g., its path, its manner, and how it was caused). In particular,
the path of motion involves conceptualizations of the various spatial relationships that an object can
have to other objects in the space in which it moves. An understanding of such spatial information in
natural language is necessary for many computational linguistics and artificial intelligence applications.
Any specification language for spatial information in language will need to support the following
computational tasks:
— identification of the appropriate topological configuration between two regions or objects (e.g.
containment, identity, disjointedness, connectedness, overlap, and closure over these relations,
when possible);
— identification of directional and orientational relations between objects and regions, including the
distinction between frames of reference;
— identification of metric properties of objects and metric values between regions and objects, when
possible (e.g. distance, height and width);
— identification of the motion of objects through time and a characterization of the nature of this movement;
— the provision of clear interoperable interfaces to existing representations and geo-databases (e.g.
GeoNames, ArcGIS, and Google Earth²⁾).
NOTE 1 Texts are often completely unspecified for frames of reference (texts are, so to speak, “not situated”)
and it, therefore, appears that the annotation of a frame of reference cannot be provided for many texts.
2) GeoNames, ArcGIS, and Google Earth are examples of suitable products available commercially. This information
is given for the convenience of users of this document and does not constitute an endorsement by ISO of these
products.
NOTE 2 Measure expressions, such as “20 miles”, have two attributes, a numeric @value “20” and a @unit “miles”,
but expressions like “near” and “far” have no unit specified. The annotation scheme proposed in this International
Standard can only record that they are measure-related expressions, with just the attribute @value specified, say
as “near” or “far”. As will be seen, many of the annotation cases are left underspecified.
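The distinction drawn in NOTE 2 can be illustrated with a small Python sketch. The `Measure` type, the function name `annotate_measure`, and the regular expression are assumptions made for this illustration; the standard prescribes only the @value and @unit attributes.

```python
import re
from typing import NamedTuple, Optional


class Measure(NamedTuple):
    value: str            # corresponds to @value
    unit: Optional[str]   # corresponds to @unit; None when unspecified


# Assumed shape of a numeric measure: a number, optional space, a unit word.
_NUMERIC = re.compile(r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z]+)")


def annotate_measure(expression: str) -> Measure:
    """Split a numeric measure into value and unit; otherwise keep only the value."""
    text = expression.strip()
    match = _NUMERIC.fullmatch(text)
    if match:
        return Measure(match.group("value"), match.group("unit"))
    return Measure(text, None)  # e.g. "near" or "far": underspecified


print(annotate_measure("20 miles"))
print(annotate_measure("near"))
```

Expressions that do not fit the numeric pattern keep their full text as the value and carry no unit, mirroring the underspecification described above.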
7 Specification of ISOspace for spatial annotation
7.1 Overview: annotation vs. representation
As with other areas of work on semantic annotation carried out by the ISO Working Group (ISO/TC
37/SC 4/WG 2), ISOspace draws a fundamental distinction between the concepts of annotation and
representation; ISO 24612 does likewise. The term “annotation” is used to refer to the process of
adding information to segments of language data or to refer to that information itself. This notion is
independent of the format in which this information is represented. The term “representation” is used to
refer to the format in which an annotation is rendered (for instance, in XML) independent of its content.
According to ISO 24612, annotations are the proper level of standardization, not representations. This
part of ISO 24617 therefore defines a specification language for annotating documents with information
about spatial entities and spatial relations at the level of annotations and then for representing these
annotations in a specific way, either with XML or with a predicate-logic-like format, as used in Annex A.
This language is called “ISOspace”.
However, the current version of ISOspace does not offer a formal specification of its annotation structure
with an abstract syntax and a formal semantics. This task will be taken up in a proposed work item, ISO
PWI 24617-9, which aims to achieve a full development of spatial semantics. Instead, ISOspace will simply
specify a concrete XML-based syntax in 8.2 and a set of core annotation guidelines in Annex A.
7.2 Abstract syntax for the ISOspace annotation structure
An abstract syntax provides a theoretical basis for deriving various versions of a concrete syntax. In
this part of ISO 24617, the abstract syntax of ISOspace is schematically represented by the UML-based
metamodel (Figure 1), which specifies an annotation structure for spatial information consisting of
two substructures: an entity structure and a link structure. The entity structure of ISOspace consists
of basic spatial entities that are anchored to textual fragments called “markables” or “extents”; the link
structure relates these spatial entities and assigns a specific relation-type to each relation.
[Figure 1: UML class diagram, not reproduced here. Classes shown: primaryData, markable, isoSpace, entity, spatialEntity, location, place, path, event (inherited from ISO-TimeML), nonMotionEvent, motion, signal, spatialSignal, motionSignal, measure, link, qslink, olink, movelink, mlink. Attribute shown: #trigger : IDRef. Associations shown: contains, relates, isAnchoredTo, isLinkedTo, definesMap, mapsTo, triggersQSLink, triggersOLink, triggersMLink, triggersMoveLink, suppliesAdditionalLinkInfo, suppliesAdditionalMotionInfo.]
Figure 1 — Schematic metamodel of ISOspace
As Figure 1 shows, the annotation structure of ISOspace consists of the following three classes of entities
and four types of links:
a) three major basic semantic entities: spatial entity, event, and signal with their respective subclasses:
1) spatial entity: location: place and path;
2) event: motion and non-motion event;
3) signal: spatial signal, motion-signal, and measure;
b) four types of links: qsLink, oLink, moveLink, and mLink.
NOTE In earlier versions of ISOspace, the basic entity, motion-signal, was treated as part of spatial signals.
However, in the current version, it is treated differently because of its specific function with respect to motions: it
provides additional information on either the path or the manner of motions. It is therefore called a motion signal, a
motion adjunct, or simply an adjunct so that it is not confused with spatial signals.
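The inventory in items a) and b) can be written down as a simple lookup structure. The Python sketch below is illustrative only: the dictionary layout and the function name `category_of` are assumptions, and the member names follow the class labels used in the metamodel rather than any normative schema.

```python
# Grouping mirrors items a) and b); names follow the metamodel's class labels.
ENTITY_CLASSES = {
    "spatial entity": ("spatialEntity", "place", "path"),  # place and path are locations
    "event": ("motion", "nonMotionEvent"),
    "signal": ("spatialSignal", "motionSignal", "measure"),
}
LINK_TYPES = ("qsLink", "oLink", "moveLink", "mLink")


def category_of(name: str) -> str:
    """Return the top-level class of a name, or 'link' for a link type."""
    for category, members in ENTITY_CLASSES.items():
        if name in members:
            return category
    if name in LINK_TYPES:
        return "link"
    raise KeyError(f"unknown ISOspace class: {name}")


print(category_of("path"))
print(category_of("moveLink"))
```

A lookup of this kind can be useful when validating that an annotation uses only names from the entity and link structures.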
8 Representation of ISOspace-conformant annotations
8.1 XML-based concrete syntax: outline
8.1.1 Overview
The version of ISOspace’s concrete syntax in Clause 8 is an XML serialization of the spatial annotation
structure, or abstract syntax, informally presented in Clause 7 with a UML-based metamodel. The
concrete syntax of ISOspace consists of basic entities (8.1.2), signals (8.1.3), links (8.1.4), and the root
element (8.1.5).
8.1.2 Basic entities
There are five basic entity tags: <place>, <path>, <spatialEntity>, <motion>, and <event>.
NOTE The <place> and <path> tags are subclasses of a location, but the location itself is not introduced as forming
its own element in this XML-based ISOspace. Non-location spatial entities are tagged simply as <spatialEntity>.
There is no <nonMotionEvent> tag as such. The <event> tag is inherited from ISO 24617-1:2012(E) ISO-TimeML,
but is understood in ISOspace to stand for the class of non-motion events.
8.1.3 Signals
There are three signal tags: <spatialSignal>, <motionSignal>, and <measure>.
8.1.4 Links
There are four links: <qsLink>, <oLink>, <moveLink>, and <mLink>.
In 8.4, the four links specified in ISOspace’s link structure are represented by their respective XML tags.
8.1.5 Root element
Each bundle of XML elements forms a tree-like structure called an “XML document”. This part of ISO 24617
has a single element called a “root element” that encloses all the other elements in the document.
For each ISOspace document, its root element is called “<isoSpace>”.
EXAMPLE
8.2 Conventions for tagging
8.2.1 Naming conventions
Naming conventions can be quite complex. Here are four basic guidelines:
a) This part of ISO 24617 follows medial capitalization, also called “CamelCase”, thus avoiding the use
of the hyphen “-” or the underscore “_” in concatenating more than two words.
EXAMPLE 1 , , or @relatedToEvent, instead of ,
, or @related_to_event.
b) This part of ISO 24617 also avoids the use of uppercase unless it is absolutely necessary (e.g. acronyms such
as “XML” and UML class names such as “Entity” as a class).
EXAMPLE 2 , , , instead of , , , or
.
NOTE 1 “ISO” in “ISOspace” is the Greek prefix iso-, meaning “equal”. “AF” is the acronym for “Annotation
Framework”.
NOTE 2 ISOspace refers to this part of ISO 24617 and to the specification language for the annotation
of motion, together with other types of event-related spatial information presented in the ISO document; the
<isoSpace> element refers to an XML tag for a concrete annotation of a textual fragment based on ISOspace.
c) This part of ISO 24617 therefore allows both lowerCamelCase and UpperCamelCase, although the XML
serialization of ISOspace adopts lowerCamelCase for the representation of element names and tags.
d) The values of the various ID attributes are specified as beginning with one or more lowercase
alphabetical characters, followed by an integer. The leading letter is needed because XML syntax does not
allow an ID value to begin with a digit.
EXAMPLE 3
,
NOTE 3 “pl23” is a valid XML ID, but “23” is not.
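Guideline d) can be checked mechanically. The following is a minimal sketch, assuming that "lowercase alphabetical characters" means the ASCII letters a to z; the function name follows_id_convention is hypothetical.

```python
import re

# One or more lowercase ASCII letters followed by an integer, as in guideline d).
# Restricting the letters to ASCII is an assumption of this sketch.
ID_PATTERN = re.compile(r"[a-z]+[0-9]+")


def follows_id_convention(value):
    """Return True if `value` has the shape letters-then-integer, e.g. 'pl23'."""
    return ID_PATTERN.fullmatch(value) is not None
```

Under this check "pl23" is accepted, whereas "23" is rejected because it lacks the leading letters.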
Names for elements, attributes, and their values might be mentioned or listed in the documents. Where
this occurs, the following mentioning conventions are followed:
— element names are enclosed in a pair of angle brackets;
EXAMPLE 4 , , and
— attribute names are prefixed with @;
EXAMPLE 5 @value, @referencePt, and @frameType.
NOTE 4 @ is not part of attribute names.
— values of attributes are in double quotes.
EXAMPLE 6 birthPlace="Boston" and xml:id="e1".
NOTE 5 Some attribute values might refer to an ID value that occurs somewhere in the annotation, that is
to say, an IDRef value. In cases such as this, the “#” symbol is prefixed to it.
EXAMPLE 7
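The "#" convention of NOTE 5 lends itself to a simple consistency check: every value that begins with "#" should name an ID that occurs in the annotation. The sketch below illustrates this with Python's standard xml.etree.ElementTree module. The sample document and the attribute names trigger, figure, and ground are hypothetical and are not taken from the ISOspace DTD.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical annotation; the attribute names on <qsLink> are illustrative only.
SAMPLE = """<isoSpace>
  <place xml:id="pl1"/>
  <spatialSignal xml:id="s1"/>
  <qsLink xml:id="qsl1" trigger="#s1" figure="#se1" ground="#pl1"/>
</isoSpace>"""


def unresolved_idrefs(xml_text):
    """Return the sorted IDRef targets that match no xml:id in the document."""
    root = ET.fromstring(xml_text)
    defined = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    referenced = set()
    for el in root.iter():
        for name, value in el.attrib.items():
            # Any attribute value prefixed with "#" is read as an IDRef.
            if name != XML_ID and value.startswith("#"):
                referenced.add(value[1:])
    return sorted(referenced - defined)
```

In the sample, "#se1" points at no element, so it is the only target reported.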
8.2.2 Convention for inline tagging extents
For illustration, extents in a sample text are often inline tagged with their identifiers or some other tag
names. Here are some conventions for such tagging:
a) Style guides generally do not recommend boldface text for providing emphasis. Hence, the use of
boldface is discouraged.
EXAMPLE 1 Tsingtao beer is produced in Qingdao_tok7.
Boldface here is not recommended in actual tagging.
b) The end of each extent is marked with a unique ID in subscript.
EXAMPLE 2 Tsingtao beer is produced in Qingdao_tok7.
c) If an extent consists of more than one token, then it is enclosed by a pair of square brackets and an
ID is placed outside of the closing bracket.
EXAMPLE 3 John hopped_m [out of the room]_p.
d) If an extent is a non-contiguous sequence of more than one token, then each non-contiguous token
is bracketed and marked with an identical ID.
EXAMPLE 4 Mia [looked]_e1 me [up]_e1.
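Conventions b) to d) can be sketched as a small rendering routine. This is an illustration under stated assumptions: plain text has no subscripts, so the ID is attached with an underscore; the function name inline_tag and the token-index representation of extents are hypothetical; and each extent is assumed to contain at least one token.

```python
def inline_tag(tokens, extents):
    """Render tokens with inline extent IDs.

    tokens:  list of token strings.
    extents: dict mapping an ID to the (non-empty) list of token indices it covers.
    """
    open_at, close_at = {}, {}
    for ext_id, indices in extents.items():
        # Split the indices into runs of adjacent tokens.
        runs = []
        for i in sorted(indices):
            if runs and i == runs[-1][-1] + 1:
                runs[-1].append(i)
            else:
                runs.append([i])
        # Brackets are used for multi-token extents (c) and for every part
        # of a non-contiguous extent (d); a single token gets the ID only (b).
        bracket = len(runs) > 1 or len(runs[0]) > 1
        for run in runs:
            if bracket:
                open_at[run[0]] = "["
                close_at[run[-1]] = "]_" + ext_id
            else:
                close_at[run[-1]] = "_" + ext_id
    return " ".join(
        open_at.get(i, "") + tok + close_at.get(i, "")
        for i, tok in enumerate(tokens)
    )
```

Calling inline_tag(["Mia", "looked", "me", "up", "."], {"e1": [1, 3]}) marks both parts of the non-contiguous extent with the same ID.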
8.3 Basic entity tags
8.3.1 <place>
The ISOspace <place> tag is inherited from Reference[40] with some additions and modifications. This
tag is used to annotate geographic entities like lakes and mountains, as well as administrative entities
like towns and counties. With the exception of implicit, non-consuming tags, a <place> tag in ISOspace
should be directly linked to an explicit span of text.
The syntax and definition for the <place> tag are set out below.
List 1 — List of attributes for the <place> tag
|postalCode|postBox|ppl|ppla|pplc|rgn|state|UTM) #IMPLIED >
The ATTLIST can specify that an attribute can be any of the following (this list is not exhaustive):
a) ID should start with an alphabetic character and may not contain spaces. An ID value should be unique
within the document. The name of the attribute @id may take a prefix “xml:” for XML documents.
b) IDRef should be a value that is used as an ID somewhere in the document or in the annotation.
c) CDATA is character data, that is to say, any string of characters.
d) “#REQUIRED” is for required attributes, whereas “#IMPLIED” is for optional ones, allowing no value
to be specified.
NOTE 1 A value for the @latLong attribute will be provided automatically; it is therefore not
usual for it to be manually specified.
The attributes for the <place> tag are largely inherited from Reference[40]. For example:
— the value “mtn” stands for mountain;
— the value “mts” stands for mountain range;
— the value “ppl” stands for populated place;
— the value “ppla” stands for a capital of a sub-country (populated area), such as a state or a province;
— the value “pplc” stands for a capital of a country (populated place);
— the value “rgn” stands for a (non-political or non-administrative) region, such as a desert.
For places that have known latitude and longitude values, the @latLong attribute can be used to allow
for mapping to other resources such as Google Maps (see footnote 3).
NOTE 2 For further details, see Reference[40] Table 1, ISO 3166-1:1999, Table 2 and Table 4, as well as other
parts of the manual as a whole.
Adopting standoff annotation, ISOspace requires an attribute @markable to refer to a markable in a
tokenized text or an extent in the given text. It also includes a Document Creation Location or @dcl
attribute, which is a special location that serves as the “narrative location”. If the document includes a
@dcl, it is generally specified at the beginning of the text, in rather the same way that a Document
Creation Time is specified in ISO-TimeML. If a place is the DCL, the special @dcl attribute is annotated
as “true” and all other location tags have the default @dcl value of “false”. The current set of
attributes is shown in List A.1 in A.3.1.2.
NOTE 3 The default value for the attribute @dcl is “false”. This means that a document creation location is
not specified.
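The default behaviour of @dcl described above can be sketched as follows. The sample document and the function name document_creation_locations are hypothetical, and the code simply assumes that an absent @dcl is to be read as the default value "false".

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical annotation with one place marked as the document creation location.
SAMPLE = """<isoSpace>
  <place xml:id="pl1" dcl="true"/>
  <place xml:id="pl2"/>
  <place xml:id="pl3" dcl="false"/>
</isoSpace>"""


def document_creation_locations(xml_text):
    """Return the IDs of all <place> elements whose @dcl is "true"."""
    root = ET.fromstring(xml_text)
    return [
        place.get(XML_ID)
        for place in root.iter("place")
        # A missing @dcl is read as the default value "false".
        if place.get("dcl", "false") == "true"
    ]
```

An empty result corresponds to the case in which no document creation location is specified.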
NOTE 4 It is worth remembering that, by convention, the tag names such as and the value of each
attribute are no longer represented in uppercase, but in lowercase (unless they are acronyms), while the name of
each attributes such as latLong is followed by the prefix @, thus b
...
SLOVENIAN STANDARD
1 September 2018
Language resource management - Semantic annotation framework (SemAF) - Part 7: Spatial
information (ISOspace)
Gestion des ressources linguistiques -- Cadre d'annotation sémantique -- Partie 7:
Information spatiale (ISOspace)
This Slovenian standard is identical to: ISO 24617-7:2014
ICS:
01.020 Terminology (principles and coordination)
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of all or part of this standard is not permitted.
INTERNATIONAL STANDARD ISO 24617-7
First edition
2014-12-15
Language resource management —
Semantic annotation framework —
Part 7:
Spatial information (ISOspace)
Gestion des ressources linguistiques — Cadre d’annotation
sémantique —
Partie 7: Information spatiale (ISOspace)
© ISO 2014
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 List of tags
5 Overview
6 Motivation and requirements
7 Specification of ISOspace for spatial annotation
7.1 Overview: annotation vs. representation
7.2 Abstract syntax for the ISOspace annotation structure
8 Representation of ISOspace-conformant annotations
8.1 XML-based concrete syntax: outline
8.1.1 Overview
8.1.2 Basic entities
8.1.3 Signals
8.1.4 Links
8.1.5 Root element
8.2 Conventions for tagging
8.2.1 Naming conventions
8.2.2 Convention for inline tagging extents
8.3 Basic entity tags
8.3.1
8.3.2
8.3.3
8.3.4
8.3.5 for non-motion event
8.3.6
8.3.7
8.3.8
8.4 Link tags
8.4.1
8.4.2
8.4.3
8.4.4
8.5 Root tag:
8.6 Summary
8.6.1 Identifier
8.6.2 Shared attributes
8.6.3 IDRef as value
Annex A (normative) Core annotation guidelines
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on
the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical Barriers
to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
ISO 24617 consists of the following parts, under the general title Language resource management —
Semantic annotation framework (SemAF):
— Part 1: Time and events (SemAF-Time, ISO-TimeML)
— Part 2: Dialogue acts
— Part 4: Semantic roles (SemAF-SR)
— Part 5: Discourse structures (SemAF-DS)
— Part 6: Principles of semantic annotation (SemAF-Basics)
— Part 7: Spatial information (ISOspace)
— Part 8: Semantic relations in discourse (SemAF-DRel)
Introduction
The automatic recognition of spatial information in natural language is currently attracting considerable
attention in the fields of computational linguistics and artificial intelligence. The development of
algorithms that exhibit “spatial awareness” promises to add needed functionality to NLP systems, from
named entity recognition to question-answering and text-based inference. However, in order for such
systems to reason spatially, they require the enrichment of textual data with the annotation of spatial
information in language. This involves a large range of linguistic constructions, including spatially
anchoring events, descriptions of objects in motion, viewer-relative descriptions of scenes, absolute
spatial descriptions of locations, and many other constructions.
This part of ISO 24617 was developed in collaboration with the ISOspace working group at Brandeis
University with the aim of providing an International Standard for the representation of spatial information
relating to locations, motions and non-motion events in language.
NOTE The ISOspace Working Group is headed by James Pustejovsky, jampesp@cs.brandeis.edu, Brandeis
University, Waltham, MA, U.S.A.
This part of ISO 24617 provides normative specifications and guidelines not only for spatial information,
but also for information content in motion and various other types of event in language.
The main parts of this part of ISO 24617 consist of the following:
a) Scope;
b) Normative references;
c) Terms and definitions;
d) List of tags or names of elements;
e) Overview;
f) Motivation and requirements;
g) Specification of the ISOspace annotation structure;
h) Representation of ISOspace-conformant annotations.
Clause 8 introduces an XML-based concrete syntax for representing spatial-related or motion-related
annotations based on the annotation structure of ISOspace that is presented in Clause 7 with a UML-
based metamodel.
A formal semantics for ISOspace will be provided as part of a future new work item within the semantic
annotation framework. This will be coordinated with the temporal semantics and specification of
ISO 24617-1 (SemAF-Time, ISO-TimeML), thereby producing a rich semantics that will be directly useable
by practitioners in computational linguistics and other communities (see Clause 6). The multilingual
extension of ISOspace will also be treated in a separate part of the ISO 24617 series in the near future.
NOTE Although the schema and DTD are not part of the present document as normative annexes, they will
both be found in a webpage relating to the ISOspace specification.
Normative Annex A is an integral part of ISO 24617 and provides core annotation guidelines.
INTERNATIONAL STANDARD ISO 24617-7:2014(E)
Language resource management — Semantic annotation
framework —
Part 7:
Spatial information (ISOspace)
1 Scope
This part of ISO 24617 provides a framework for encoding a broad range not only of spatial information,
but also of spatiotemporal information relating to motion as expressed in natural language texts. This
part of ISO 24617 includes references to locations, general spatial entities, spatial relations (involving
topological, orientational, and metric values), dimensional information, motion events, and paths.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO 24617-1, Language resource management — Semantic annotation framework (SemAF) — Part 1: Time
and events (SemAF-Time, ISO-TimeML)
ISO/IEC 14977, Information technology — Syntactic metalanguage — Extended BNF
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 24617-1 and the following apply.
3.1
document creation location
dcl
unique place or set of places associated with a document that represents the location (3.7) in which the
document was created
Note 1 to entry: Some collaboratively written documents, such as GoogleDoc documents (see footnote 1) and chat
logs, might refer not only to a single location but also to a set of locations spread out across the world. In
other cases the creation place is uncertain, for example that of the Hebrew Bible or of each of the books in it.
The attribute @dcl will, therefore, have the value “false”, which is to be understood to mean “unspecified”,
while the value “true” is to be understood to mean “specified”.
3.2
event
eventuality
something that can be said to obtain or hold true, to happen or to occur
Note 1 to entry: This is a very broad notion of event, also known in the literature as “eventuality” and includes all
kinds of actions, states, processes, etc. It is not to be confused with the narrower notion of event (as opposed to
the notion of “state”) as something that happens at a certain point in time (e.g. the clock striking two or waking
up) or during a short period of time (e.g. laughing). In ISO-TimeML, the term event is used in a broader sense and
is equivalent to the term eventuality.
[SOURCE: ISO 24617-1:2012]
1) GoogleDoc is an example of a suitable product available commercially. This information is given for the
convenience of users of this document and does not constitute an endorsement by ISO of these products.
3.3
event-path
path (3.13) or trajectory followed by a spatial entity (3.17) coincident with a motion-event (3.9)
3.4
extent
textual segment, that is to say, a string of characters in the text to be annotated
EXAMPLE Tokens, words, and non-contiguous phrases (e.g. a complex verb like “look . up”) are extents.
3.5
figure
spatial entity (3.17) that is considered to be the focal object, which is related to some reference object
3.6
ground
spatial entity (3.17) that acts as reference for a figure (3.5)
3.7
location
point or finite area that is positioned within a space (3.16)
3.8
measure
magnitude of a spatial dimension or relation
EXAMPLE Distance is a spatial relation.
3.9
motion
motion-event
action or process involving the translocation of a spatial object, transformation of some spatial property
of an object, or change in the conformation of an object
Note 1 to entry: A motion (3.9) in ISOspace is a particular kind of event (3.2).
3.10
motion-signal
adjunct
motion-adjunct
path (3.13) of motion and/or manner of motion information contributed by a particle or by a prepositional,
adverbial phrase, in conjunction with a motion (3.9)-related text
Note 1 to entry: This terminology is specific to ISOspace and is different from the general term “adjunct” which is
used to describe optional syntactic elements.
3.11
non-consuming tag
tag (3.19) that has no associated extent (3.4)
Note 1 to entry: The extent (3.4) of a non-consuming tag is a “null” string.
EXAMPLE In John ate an apple but Mary a pear, there are at least two ways of marking up the non-
consuming tag:
a) John ate_e1 an apple, but Mary ∅_e2 a pear;
b) 1)
2) (non-consuming tag)
3.12
orientation
orientation(al) relation
relation between a figure (3.5) and a ground (3.6) that expresses the spatial disposition or direction of a
spatial object within a frame of reference
3.13
path
location (3.7) that consists of a series of locations (3.7)
Note 1 to entry: A spatial object path is a location where the focus is on the potential for traversal or which
functions as a boundary. This includes common nouns like road, coastline, and river and proper names like Route
66 and Kancamagus Highway. Some nouns, such as valley, can be ambiguous: valley can be understood as a path in we
walked down the valley or as a place (3.14) in we live in the valley.
Note 2 to entry: A path might be represented as an undirected graph whose vertices are locations (3.7) and whose
edges signify continuity; that is to say, a path has no inherent directionality.
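The graph reading in Note 2 can be sketched as follows, assuming that a path is given as an ordered list of location names. The function name path_graph and the adjacency-map representation are choices of this illustration, not part of ISOspace.

```python
from collections import defaultdict


def path_graph(locations):
    """Build an undirected adjacency map from a sequence of locations.

    Each pair of consecutive locations is joined by an edge signifying
    continuity; edges are stored in both directions, so no direction is implied.
    """
    adjacency = defaultdict(set)
    for a, b in zip(locations, locations[1:]):
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)
```

Because every edge is stored symmetrically, listing the same locations in reverse order yields the same graph, which reflects the absence of inherent directionality.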
3.14
place
geographic or administrative entity that is situated at a location (3.7)
3.15
region
connected, non-empty point-set defined by a domain and its boundary points
Note 1 to entry: The term “region” as defined here does not refer to a political or administrative region such as
“the Canary Islands” or “Hong Kong, SAR”, where SAR is the acronym of “Special Administrative Region”.
3.16
space
dimensional extent in which objects and events (3.2) have a relative position and direction
3.17
spatial entity
object that is situated at a unique location (3.7) for some period of time, and typically has the potential
to undergo translocation
Note 1 to entry: A spatial entity can also be understood as an object that participates in a spatial relation. In John is
sitting in a car, both John and car could be understood as spatial entities or as being the figure (3.5) and the ground
(3.6), respectively, of the sitting-in situation.
3.18
spatial signal
segment or series of segments of a text that contributes to orientational (3.12) or topological relations (3.20)
3.19
tag
element name
name associated with textual segments for annotation or for a relation between these segments
Note 1 to entry: The following are two kinds of tag for annotation:
a) extent tag, which is associated with textual segments referring to basic entities or signals;
b) link tag, for representing spatial relations.
3.20
topological relation
relation that expresses the connectedness or continuity of spaces (3.16)
4 List of tags
4.1 General
The tag in angled brackets stands for the name of an XML element. See 8.2.
4.2 Extent tags: Basic entities and signals
4.2.1
measure
extent tag representing some measure (3.8)
4.2.2
motion
extent tag representing a motion (3.9)
4.2.3
motionSignal
extent tag representing a motion-signal (3.10)
4.2.4
non-motion event
extent tag representing a non-motion event (3.9)
4.2.5
path
extent tag that represents a path (3.13)
4.2.6
place
extent tag that represents a place (3.14)
4.2.7
spatialEntity
extent tag that represents a spatial entity (3.17)
4.2.8
spatialSignal
extent tag that represents a spatial signal (3.18)
4.3 Link tags
4.3.1
mLink
linking tag that represents some measure (3.8)
4.3.2
moveLink
linking tag that represents a relation between a motion (3.9) and participant spatial entities (3.17)
4.3.3
oLink
linking tag that represents an orientation relation (3.12) between a figure (3.5) and a ground (3.6)
4.3.4
qsLink
linking tag that represents a topological relation (3.20)
NOTE The tag qsLink or <qsLink> stands for a qualitative spatial link.
4.4 Root element
4.4.1
isoSpace
root element in which all ISOspace tags are embedded
NOTE In ISOspace annotations, all of the extent and link tags listed above are embedded in the tag <isoSpace>.
5 Overview
Human languages impose diverse linguistic constructions for expressing concepts of space, of spatially-
anchored events, and of spatial configurations that relate in complex ways to the situations in which
they are used. One area that deserves further development regarding the connection between natural
language and formal representations of space is the automatic enrichment of textual data with spatial
annotations. There is a growing demand for such annotated data, particularly in the context of the
semantic web. Moreover, textual data routinely make reference to objects moving through space over
time. Integrating such information derived from textual sources into a geosensor data system can enhance
the overall spatiotemporal representation in changing and evolving situations, such as when tracking
objects through space with limited image data. It follows that verbal subjective descriptions of spatial
relations need to be translated into metrically meaningful positional information. A central problem
currently hindering progress in interpreting textual data is the lack of a clear separation between
the information that can be derived directly from linguistic interpretation and further information that
requires contextual interpretation. In order to avoid building incorrect deductions into the annotations
themselves, mark-up schemes should avoid over-annotating the text. Solutions to the language-space
mapping problem and its grounding in geospatial data are urgently required for this purpose.
There are many applications and tasks that would benefit from a robust spatial mark-up language, such
as ISOspace. These applications and tasks include the following:
a) creating a visualization of objects from a verbal description of a scene;
b) identifying the spatial relations associated with a sequence of processes and events from a news article;
c) determining an object location or tracking a moving object from a verbal description;
d) translating viewer-centric verbal descriptions into other relative descriptions or absolute coordinate
descriptions;
e) constructing a route given a route description;
f) constructing a spatial model of an interior or exterior space given a verbal description;
g) integrating spatial descriptions with information from other media.
The goal of ISOspace is not to provide a formalism that fully represents the complexity of spatial
language but rather to capture these complex constructions in text in order to provide an inventory
of how spatial information is presented in natural language. For example, many texts have no explicit
frame of spatio-temporal reference, which makes it impossible to annotate such an unspecified frame
of reference. The interpretation of spatial prepositions, such as on in a book on the desk vs a picture on
the wall requires a handbook of its own dealing with different senses or uses of spatial prepositions
beyond a set of annotation guidelines. Any detailed classification of motion verbs in English alone is
again beyond the scope of this International Standard.
All of the examples in the current version of this part of ISO 24617 are from English datasets. The specification
language proposed in this International Standard can be seen as a version of ISOspace for English only
and its applicability to other languages is still pending. A multilingual extension of ISOspace is necessary
if the document is to be verified, but this is expected to immediately follow preliminary rigorous work
on establishing the first edition of this part of ISO 24617 as an International Standard for spatial and
motion-related annotation.
6 Motivation and requirements
This International Standard aims to formulate the requirements for spatiotemporal annotation
standards and to develop the ISOspace standard to meet these requirements. It assumes ISO 24612 and
builds on previous work, including ISO 24617-1 and other spatial representations and calculi.
Natural language abounds with descriptions of motion. Our experience of our own motion, together
with our perception of motion in the world, has given human languages substantial means to verbally
express many different aspects of movement, including its temporal circumstances, spatial trajectory and
manner. In every language on earth, verbalizations of motion can specify changes in the spatial position
of an object over time. In addition to when and where the motion takes place, languages additionally
characterize how the motion takes place (e.g., its path, its manner, and how it was caused). In particular,
the path of motion involves conceptualizations of the various spatial relationships that an object can
have to other objects in the space in which it moves. An understanding of such spatial information in
natural language is necessary for many computational linguistics and artificial intelligence applications.
Any specification language for spatial information in language will need to support the following
computational tasks:
— identification of the appropriate topological configuration between two regions or objects (e.g.
containment, identity, disjointedness, connectedness, overlap, and closure over these relations,
when possible);
— identification of directional and orientational relations between objects and regions, including the
distinction between frames of reference;
— identification of metric properties of objects and metric values between regions and objects, when
possible (e.g. distance, height and width);
— identification of the motion of objects through time and a characterization of the nature of this movement;
— the provision of clear interoperable interfaces to existing representations and geo-databases (e.g.
GeoNames, ArcGIS, and Google Earth; see footnote 2).
NOTE 1 Texts are often completely unspecified for frames of reference (texts are, so to speak, “not situated”)
and it, therefore, appears that the annotation of a frame of reference cannot be provided for many texts.
2) GeoNames, ArcGIS, and Google Earth are examples of a suitable product available commercially. This information
is given for the convenience of users of this document and does not constitute an endorsement by ISO of these
products.
NOTE 2 Measure expressions, such as 20 miles, have two attributes, a numeric @value “20” and a @unit “miles”,
but expressions like near and far have no unit specified. For these, the annotation scheme proposed in this
International Standard can only state that they are measure-related expressions, with only the attribute
@value specified, say with “near” or “far”. As will be seen, many of the annotation cases are left underspecified.
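The distinction drawn in NOTE 2 can be sketched with a small data structure. The names Measure and parse_measure are hypothetical, and the whitespace-based splitting is a simplifying assumption of this illustration, not an ISOspace rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measure:
    value: str                   # corresponds to @value
    unit: Optional[str] = None   # corresponds to @unit; None when unspecified


def parse_measure(expression):
    """Split 'NUMBER UNIT' into value and unit; otherwise keep only a value."""
    parts = expression.split()
    # Treat the first token as numeric if it is digits with at most one dot.
    if len(parts) >= 2 and parts[0].replace(".", "", 1).isdigit():
        return Measure(value=parts[0], unit=" ".join(parts[1:]))
    return Measure(value=expression.strip())
```

With this sketch, "20 miles" yields a value and a unit, whereas "near" yields a value only, leaving the unit underspecified.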
7 Specification of ISOspace for spatial annotation
7.1 Overview: annotation vs. representation
As with other areas of work on semantic annotation carried out by the ISO Working Group (ISO/TC
37/SC 4/WG 2), ISOspace draws a fundamental distinction between the concepts of annotation and
representation; ISO 24612 does likewise. The term “annotation” is used to refer to the process of
adding information to segments of language data or to refer to that information itself. This notion is
independent of the format in which this information is represented. The term “representation” is used to
refer to the format in which an annotation is rendered (for instance, in XML) independent of its content.
According to ISO 24612, annotations are the proper level of standardization, not representations. This
part of ISO 24617 therefore defines a specification language for annotating documents with information
about spatial entities and spatial relations at the level of annotations and then for representing these
annotations in a specific way, either with XML or with a predicate-logic-like format, as used in Annex A.
This language is called “ISOspace”.
However, the current version of ISOspace does not offer a formal specification of its annotation structure
with an abstract syntax and a formal semantics. This task will be taken up in a proposed work item, ISO
PWI 24617-9, which aims to achieve a full development of spatial semantics. Instead, ISOspace will simply
specify a concrete XML-based syntax in 8.2 and a set of core annotation guidelines in Annex A.
7.2 Abstract syntax for the ISOspace annotation structure
An abstract syntax provides a theoretical basis for deriving various versions of a concrete syntax. In
this part of ISO 24617, the abstract syntax of ISOspace is schematically represented by the UML-based
metamodel (Figure 1), which specifies an annotation structure for spatial information consisting of
two substructures: an entity structure and a link structure. The entity structure of ISOspace consists
of basic spatial entities that are anchored to textual fragments called “markables” or “extents”; the link
structure relates these spatial entities and assigns a specific relation-type to each relation.
[Figure 1: UML-based metamodel diagram, not reproduced here. Classes shown: primaryData, markable, isoSpace,
entity, spatialEntity, location, place, path, event, nonMotionEvent, motion, signal, spatialSignal, motionSignal,
measure, and link (#trigger : IDRef) with qslink, olink, movelink, and mlink. Association labels: contains,
relates, isAnchoredTo, isLinkedTo, definesMap, mapsTo, triggersQSLink, triggersOLink, triggersMoveLink,
triggersMLink, suppliesAdditionalLinkInfo, and suppliesAdditionalMotionInfo. Annotation in the diagram:
"Inherited from ISO-TimeML".]
Figure 1 — Schematic metamodel of ISOspace
As Figure 1 shows, the annotation structure of ISOspace consists of the following three classes of entities
and four types of links:
a) three major basic semantic entities: spatial entity, event, and signal with their respective subclasses:
1) spatial entity: location: place and path;
2) event: motion and non-motion event;
3) signal: spatial signal, motion-signal, and measure;
b) four types of links: qsLink, oLink, moveLink, and mLink.
NOTE In earlier versions of ISOspace, the basic entity, motion-signal, was treated as part of spatial signals.
However, in the current version, it is treated separately because of its specific function with respect to
motions: it provides additional information on either the path or the manner of a motion. It is therefore
called a motion signal, a motion adjunct, or simply an adjunct, so that it is not confused with spatial signals.
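The two substructures described above can be pictured as a small data model. The following is a minimal sketch only, not part of ISOspace: the entity and link names come from the list above, while the Python class layout, the `markable` and `related` fields, and the consistency check are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Names taken from the list above; grouping them in flat sets is an assumption.
ENTITY_KINDS = {
    "spatialEntity", "place", "path",              # spatial entities
    "motion", "nonMotionEvent",                    # events
    "spatialSignal", "motionSignal", "measure",    # signals
}
LINK_KINDS = {"qsLink", "oLink", "moveLink", "mLink"}


@dataclass
class Entity:
    id: str
    kind: str
    markable: str  # hypothetical: ID of the extent the entity is anchored to

    def __post_init__(self):
        if self.kind not in ENTITY_KINDS:
            raise ValueError(f"unknown entity kind: {self.kind}")


@dataclass
class Link:
    id: str
    kind: str
    related: List[str]  # hypothetical: IDs of the entities the link relates

    def __post_init__(self):
        if self.kind not in LINK_KINDS:
            raise ValueError(f"unknown link kind: {self.kind}")


@dataclass
class AnnotationStructure:
    entities: Dict[str, Entity] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def add_link(self, link: Link) -> None:
        # A link may only relate entities already present in the entity structure.
        missing = [ref for ref in link.related if ref not in self.entities]
        if missing:
            raise KeyError(f"link {link.id} refers to unknown entities: {missing}")
        self.links.append(link)
```

Keeping entities in a dictionary keyed by ID makes the link check a simple membership test, which mirrors the idea that the link structure is defined over the entity structure.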
8 Representation of ISOspace-conformant annotations
8.1 XML-based concrete syntax: outline
8.1.1 Overview
The version of ISOspace’s concrete syntax in Clause 8 is an XML serialization of the spatial annotation
structure, that is, of the abstract syntax informally presented in Clause 7 with a UML-based metamodel. The
concrete syntax of ISOspace consists of basic entities (8.1.2), signals (8.1.3), links (8.1.4), and the root
element (8.1.5).
8.1.2 Basic entities
There are five basic entity tags: <place>, <path>, <spatialEntity>, <motion>, and <event>.
NOTE The <place> and <path> tags are subclasses of a location, but the location itself is not introduced as forming
its own element in this XML-based ISOspace. Non-location spatial entities are tagged simply as <spatialEntity>.
There is no <nonMotionEvent> tag as such. The <event> tag is inherited from ISO 24617-1:2012(E) ISO-TimeML,
but is understood in ISOspace to stand for the class of non-motion events.
8.1.3 Signals
There are three signal tags: <spatialSignal>, <motionSignal>, and <measure>.
8.1.4 Links
There are four links: <qsLink>, <oLink>, <moveLink>, and <mLink>.
In 8.4, the four links specified in ISOspace’s link structure are represented by their respective XML tags.
8.1.5 Root element
Each bundle of XML elements forms a tree-like structure called an “XML document”. Each such document
has a single element called a “root element” that encloses all the other elements in the document.
For each ISOspace document, its root element is called “<isoSpace>”.
EXAMPLE
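As a hedged illustration of a single root element enclosing entity, signal, and link elements, the sketch below builds a small tree with Python's standard `xml.etree.ElementTree` module. It is not the standard's own example. The root name `isoSpace` is taken from the label in the Figure 1 metamodel, and the child element names, the `id` attribute, and the `trigger` attribute with its `#`-prefixed value are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_document():
    # Hypothetical root element name, borrowed from the metamodel label.
    root = ET.Element("isoSpace")

    # Basic entities and a signal, each with an ID that starts with letters.
    ET.SubElement(root, "place", {"id": "pl1"})
    ET.SubElement(root, "spatialEntity", {"id": "se1"})
    ET.SubElement(root, "spatialSignal", {"id": "s1"})

    # A link whose trigger is an IDRef-style value pointing at the signal.
    ET.SubElement(root, "qsLink", {"id": "qsl1", "trigger": "#s1"})
    return root


root = build_document()
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because every other element is created as a child of the root, the serialized text is a single well-formed XML document.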
8.2 Conventions for tagging
8.2.1 Naming conventions
Naming conventions can be quite complex. Here are four basic guidelines:
a) This part of ISO 24617 follows medial capitalization, also called “CamelCase”, thus avoiding the use
of the hyphen “-” or the underscore “_” in concatenating two or more words.
EXAMPLE 1 , , or @relatedToEvent, instead of ,
, or @related_to_event.
b) This part of ISO 24617 also avoids the use of uppercase unless it is absolutely necessary (e.g. acronyms such
as “XML” and UML class names such as “Entity” as a class).
EXAMPLE 2 , , , instead of , , , or
.
NOTE 1 “ISO” in “ISOspace” is the Greek prefix iso-, meaning “equal”. “AF” is the acronym for “Annotation
Framework”.
NOTE 2 ISOspace refers to this part of ISO 24617 and to the specification language for the annotation
of motion, together with other types of event-related spatial information presented in the ISO document; the
element refers to an XML tag for a concrete annotation of a textual fragment based on ISOspace.
c) This part of ISO 24617 therefore allows both lowerCamelCase and UpperCamelCase, although the XML
serialization of ISOspace adopts lowerCamelCase for the representation of element names and tags.
d) The values of the various ID attributes are specified as beginning with one or more lowercase
alphabetical characters, followed by an integer. The initial letter is required by the syntax of XML, in
which an ID cannot begin with a digit.
EXAMPLE 3
,
NOTE 3 “pl23” is a valid XML ID, but “23” is not.
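The naming guidelines above lend themselves to simple mechanical checks. The sketch below is one possible reading, not a normative test: the regular expressions and the function names `is_valid_id` and `is_lower_camel_case` are assumptions, and the camel-case pattern deliberately rejects names with a run of capitals.

```python
import re

# One or more lowercase letters followed by an integer, as in guideline d).
ID_PATTERN = re.compile(r"^[a-z]+[0-9]+$")

# lowerCamelCase: a lowercase start, then capitalized word parts,
# with no hyphen or underscore (guidelines a) to c)).
LOWER_CAMEL_PATTERN = re.compile(r"^[a-z]+(?:[A-Z][a-z0-9]*)*$")


def is_valid_id(value: str) -> bool:
    """Return True if value looks like 'pl23': letters first, then digits."""
    return ID_PATTERN.match(value) is not None


def is_lower_camel_case(name: str) -> bool:
    """Return True for names such as 'relatedToEvent' or 'latLong'."""
    return LOWER_CAMEL_PATTERN.match(name) is not None


for candidate in ["pl23", "23", "e1"]:
    print(candidate, is_valid_id(candidate))

for candidate in ["relatedToEvent", "related_to_event", "latLong"]:
    print(candidate, is_lower_camel_case(candidate))
```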
Names for elements, attributes, and their values might be mentioned or listed in the documents. Where
this occurs, the following mentioning conventions are followed:
— element names are enclosed in a pair of angle brackets;
EXAMPLE 4 , , and
— attribute names are prefixed with @;
EXAMPLE 5 @value, @referencePt, and @frameType.
NOTE 4 @ is not part of attribute names.
— values of attributes are in double quotes.
EXAMPLE 6 birthPlace="Boston" and xml:id="e1".
NOTE 5 Some attribute values might refer to an ID value that occurs somewhere in the annotation, that is
to say, an IDRef value. In cases such as this, the “#” symbol is prefixed to it.
EXAMPLE 7
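The mentioning conventions and the “#” prefix for IDRef values can be sketched as a few helper functions. This is an illustrative sketch only: the function names and the dictionary used as an ID index are hypothetical and are not defined by ISOspace.

```python
def mention_element(name: str) -> str:
    """Element names are written in angle brackets, e.g. <place>."""
    return f"<{name}>"


def mention_attribute(name: str) -> str:
    """Attribute names are prefixed with @, which is not part of the name."""
    return f"@{name}"


def mention_value(value: str) -> str:
    """Attribute values are written in double quotes."""
    return f'"{value}"'


def resolve_idref(value: str, index: dict):
    """Strip the leading '#' from an IDRef value and look up the ID it names."""
    if not value.startswith("#"):
        raise ValueError(f"not an IDRef value: {value!r}")
    return index[value[1:]]


# Hypothetical index from ID values to the annotated text of each extent.
ids = {"pl1": "Qingdao", "e1": "produced"}
print(mention_element("place"), mention_attribute("latLong"), mention_value("Boston"))
print(resolve_idref("#pl1", ids))
```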
8.2.2 Convention for inline tagging extents
For illustration, extents in a sample text are often inline tagged with their identifiers or some other tag
names. Here are some conventions for such tagging:
a) Style guides generally do not recommend boldface text for providing emphasis. Hence, the use of
boldface is discouraged.
EXAMPLE 1 Tsingtao beer is produced in Qingdao_tok7.
Boldface here is not recommended in actual tagging.
b) The end of each extent is marked with a unique ID in subscript.
EXAMPLE 2 Tsingtao beer is produced in Qingdao_tok7.
c) If an extent consists of more than one token, then it is enclosed by a pair of square brackets and an
ID is placed outside of the closing bracket.
EXAMPLE 3 John hopped_m [out of the room]_p.
d) If an extent is a non-contiguous sequence of more than one token, then each non-contiguous token
is bracketed and marked with an identical ID.
EXAMPLE 4 Mia [looked]_e1 me [up]_e1.
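Conventions b) to d) can be applied mechanically to a tokenized sentence. The following is a minimal sketch under stated assumptions: the function `inline_tag` is hypothetical, extents are given as token index ranges with an exclusive end, extents do not overlap, and a subscript ID is rendered in plain text as an underscore followed by the ID.

```python
def inline_tag(tokens, extents):
    """Render tokens with extent IDs.

    extents is a list of (extent_id, spans); each span is (start, end) over
    token indices, with end exclusive. An extent with several spans is
    treated as non-contiguous.
    """
    opens, closes = {}, {}
    for extent_id, spans in extents:
        non_contiguous = len(spans) > 1
        for start, end in spans:
            if non_contiguous or end - start > 1:
                # Multi-token or non-contiguous parts are bracketed,
                # with the ID placed outside the closing bracket.
                opens[start] = "["
                closes[end - 1] = "]_" + extent_id
            else:
                # A single-token extent just gets its ID at the end.
                closes[start] = "_" + extent_id
    rendered = [
        opens.get(i, "") + token + closes.get(i, "")
        for i, token in enumerate(tokens)
    ]
    return " ".join(rendered)


print(inline_tag("Mia looked me up".split(), [("e1", [(1, 2), (3, 4)])]))
print(inline_tag("John hopped out of the room".split(),
                 [("m", [(1, 2)]), ("p", [(2, 6)])]))
```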
8.3 Basic entity tags
8.3.1 <place>
The ISOspace <place> tag is inherited from Reference[40] with some additions and modifications. This
tag is used to annotate geographic entities like lakes and mountains, as well as administrative entities
like towns and counties. With the exception of implicit, non-consuming tags, a <place> tag in ISOspace
should be directly linked to an explicit span of text.
The syntax and definition for the <place> tag are set out below.
List 1 — List of attributes for the <place> tag
|postalCode|postBox|ppl|ppla|pplc|rgn|state|UTM) #IMPLIED >
In an ATTLIST declaration, an attribute can be specified with any of the following (this list is not exhaustive):
a) ID should start with an alphabetic character and may not contain spaces. An ID value should be unique
within the document. The name of the attribute @id may take a prefix “xml:” for XML documents.
b) IDRef should be a value that is used as an ID somewhere in the document or in the annotation.
c) CDATA is any character data, that is, an arbitrary string value.
d) “#REQUIRED” is for required attributes, whereas “#IMPLIED” is for optional ones, allowing no value
to be specified.
NOTE 1 A value for the @latLong attribute will be provided automatically, and it is therefore not
usual for it to be manually specified.
The attributes for the tag are largely inherited from Reference[40]. For example:
— the value “mtn” stands for mountain;
— the value “mts” stands for mountain range;
— the value “ppl” stands for populated place;
— the value “ppla” stands for a capital of a sub-country (populated area), such as a state or a province;
— the value “pplc” stands for a capital of a country (populated place);
— the value “rgn” stands for a (non-political or non-administrative) region, such as a desert.
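The codes listed above can be kept in a small lookup table when processing annotations. This is a sketch only: it covers just the six codes glossed in this subclause, not the full value list of the attribute, and the names `PLACE_TYPE_CODES` and `describe_place_type` are hypothetical.

```python
# Only the codes explained above; the attribute's full value list is longer.
PLACE_TYPE_CODES = {
    "mtn": "mountain",
    "mts": "mountain range",
    "ppl": "populated place",
    "ppla": "capital of a sub-country, such as a state or a province",
    "pplc": "capital of a country",
    "rgn": "non-political or non-administrative region, such as a desert",
}


def describe_place_type(code: str):
    """Return the gloss for a known code, or None for codes not listed here."""
    return PLACE_TYPE_CODES.get(code)


print(describe_place_type("pplc"))
print(describe_place_type("state"))  # listed in the ATTLIST but not glossed here
```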
For places that have known latitude and longitude values, the @latLong attribute can be used to allow
for mapping to other resources such as Google Maps 3).
NOTE 2 For further details, see Reference[40] Table 1, ISO 3166-1:1999, Table 2 and Table 4, as well as other
parts of the manual as a whole.
Adopting standoff annotation, ISOspace requires an attribute @markable to refer to a markable in a
tokenized text or an extent in the given text. It also includes a Document Creation Location or @dcl
attribute, which is a special location that serves as the “narrative location”. If the document includes a
@dcl, it is generally specified at the beginning of the text, in much the same way that a Document
Creation Time is specified in ISO-TimeML. If a place is the DCL, the special @dcl attribute is annotated
as “true” and all other location tags have the default @dcl value of “false”. The current set of
attributes is shown in List A.1 in A.3.1.2.
NOTE 3 The default value for the attribute @dcl is “false”. This means that a document creation location is
not specified.
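The default behaviour of @dcl can be sketched as a lookup over the place elements of a document. This is an illustrative sketch under assumptions: the root element name `isoSpace`, the `id` attribute, and the function `document_creation_location` are hypothetical, and treating more than one “true” value as an error is this sketch's reading of the text rather than a stated rule.

```python
import xml.etree.ElementTree as ET


def document_creation_location(root):
    """Return the ID of the place marked dcl="true", or None if there is none."""
    dcl_ids = [
        place.get("id")
        for place in root.iter("place")
        # A missing dcl attribute falls back to the default value "false".
        if place.get("dcl", "false") == "true"
    ]
    if len(dcl_ids) > 1:
        raise ValueError(f"more than one document creation location: {dcl_ids}")
    return dcl_ids[0] if dcl_ids else None


# Hypothetical sample: only pl1 is marked as the document creation location.
SAMPLE = '<isoSpace><place id="pl1" dcl="true"/><place id="pl2"/></isoSpace>'
print(document_creation_location(ET.fromstring(SAMPLE)))
```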
NOTE 4 It is worth remembering that, by convention, the tag names such as and the value of each
attribute are no longer represented in uppercase, but in lowercase (unless they are acronyms), while the name of
each attribute such as latLong is preceded by the prefix @, thus being represented as @latLong. This convention
is
...











