Language resource management -- Semantic annotation framework (SemAF) - Part 2: Dialogue acts

This document provides a set of empirically and theoretically well-motivated concepts for dialogue annotation, a formal language for expressing dialogue annotations (the Dialogue Act Markup Language, DiAML), and a method for segmenting a dialogue into semantic units. This allows the manual or automatic annotation of dialogue segments with information about the communicative actions which the participants perform by their contributions to the dialogue. The annotation scheme specified in this document supports multidimensional annotation of spoken, written, and multimodal dialogues involving two or more participants. Dialogue units are viewed as having multiple communicative functions in different dimensions. The markup language DiAML has an XML-based representation format and a formal semantics which makes it possible to perform inferences with DiAML representations. This document also specifies data categories for dimensions of dialogue analysis, for communicative functions, for dialogue act qualifiers, and for relations between dialogue acts. Additionally, it provides mechanisms for customizing these sets of concepts, extending them with application-specific or domain-specific concepts and descriptions of semantic content, or selecting relevant coherent subsets of them. These mechanisms make the dialogue act concepts specified in this document useful not only for annotation but also for the recognition and generation of dialogue acts in interactive systems.

Gestion des ressources langagières -- Cadre d'annotation sémantique (SemAF) - Partie 2: Actes de dialogue

Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 2. del: Dialogi

General Information

Status: Published
Publication Date: 28-Apr-2021
Current Stage: 6060 - National Implementation/Publication (Adopted Project)
Start Date: 02-Apr-2021
Due Date: 07-Jun-2021
Completion Date: 29-Apr-2021

Relations

Standard
SIST ISO 24617-2:2021
English language
101 pages
Standard
ISO 24617-2:2020 - Language resource management — Semantic annotation framework (SemAF) — Part 2: Dialogue acts Released:12/2/2020
English language
95 pages

Standards Content (Sample)


SLOVENIAN STANDARD
1 June 2021
Replaces: SIST ISO 24617-2:2013
This Slovenian standard is identical to: ISO 24617-2:2020
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.

INTERNATIONAL STANDARD ISO 24617-2
Second edition
2020-12
Language resource management -- Semantic annotation framework (SemAF) - Part 2: Dialogue acts
Gestion des ressources langagières -- Cadre d'annotation sémantique (SemAF) - Partie 2: Actes de dialogue
© ISO 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Use cases
5 Basic concepts and metamodel
5.1 Dialogue acts
5.2 Dependence relations
5.3 Rhetorical relations
5.4 Qualifiers
5.5 Metamodel
6 Multifunctionality, multidimensionality and segmentation
6.1 Multifunctionality
6.2 Multidimensionality and dimensions
6.3 Segmentation
7 Specification of the annotation scheme
7.1 Overview
7.2 Dimensions
7.2.1 Overview
7.2.2 Task and Task Management
7.2.3 Auto-Feedback and Allo-Feedback
7.2.4 Turn Management
7.2.5 Time Management
7.2.6 Discourse Structuring
7.2.7 Social Obligations Management
7.2.8 Own- and Partner Communication Management
7.2.9 Contact Management
7.3 Communicative functions
7.3.1 Overview
7.3.2 General-purpose functions
7.3.3 Dimension-specific functions
7.3.4 Responsive communicative functions
7.4 Functional and feedback dependences
7.5 Qualifiers
8 The Dialogue Act Markup Language (DiAML)
8.1 Overview
8.2 Abstract syntax
8.3 Concrete syntax
8.4 Semantics
9 Extension and customization
9.1 Overview
9.2 Simplifying the annotation scheme: options and selections
9.3 Extending the annotation scheme: triple-layered plug-ins and interfaces
Annex A (normative) Formal specification of DiAML
Annex B (normative) DiAML-XML technical schema
Annex C (normative) Data categories for DiAML concepts
Annex D (informative) Plug-ins for semantic content and other enrichments
Annex E (informative) Annotation guidelines and examples
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 4, Language resource management.
This second edition cancels and replaces the first edition (ISO 24617-2:2012), which has been technically
revised.
The main changes compared to the previous edition are as follows:
— in 6.2, ‘reference segments’ are introduced to allow more accurate annotations of feedback
dependence relations;
— in 6.3, a more detailed way of annotating rhetorical relations between dialogue acts is made possible
by importing concepts from ISO 24617-8:2016 (DR-core);
— in 7.2, the Contact Management dimension, known from the DIT++ annotation scheme, and the Task
Management dimension, known from the DAMSL annotation scheme, have been added, along with a
few communicative functions specific for contact management;
— in 7.5 and Annex D, a possibility is introduced for importing elements from the W3C recommendation
EmotionML in order to add affective information to dialogue acts;
— in Clause 9 and Annex D, the mechanism of ‘triple-layered annotation scheme plug-in’ with ‘plug-in
interface’ is introduced; this mechanism allows the dialogue act annotation to be customized, using
application-specific concepts, and to be enriched with semantic content information.
A list of all parts in the ISO 24617 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Since its publication in 2012, ISO 24617-2 has been used in a number of annotation efforts as well as in
the development of language-based interactive systems. These experiences have brought to light
— that the standard allowed dialogue act annotations that are slightly inaccurate in some respects,
— that some applications would benefit from the availability of mechanisms for customizing the set of
concepts defined in the standard, and
— that certain use cases require the representation of functional dialogue act information to be
extended with semantic content information.
This second edition seeks to remedy the noted inaccuracies, and to provide mechanisms
a) for customizing the set of defined concepts, and
b) for extending the information types in dialogue act annotations.
The improved accuracy of this second edition concerns the annotation of semantic dependence relations
of dialogue acts and their scopes, and of rhetorical relations between dialogue acts. The mechanisms for
extending and customizing the standard for a specific application concern most notably the annotation
of information about the (domain-specific) semantic content of dialogue acts, the introduction of
application-specific dialogue act types, the addition of communicative functions for fine-grained
specification of feedback, and the annotation of speaker emotions.
This second edition is downward compatible with the original ISO 24617-2:2012 in the sense that
every annotation made with the original version is a valid annotation according to the second edition.
Existing annotations do not need to be revised in order to be compliant with this second edition.

INTERNATIONAL STANDARD ISO 24617-2:2020(E)
Language resource management — Semantic annotation
framework (SemAF) —
Part 2:
Dialogue acts
1 Scope
This document provides a set of empirically and theoretically well-motivated concepts for dialogue
annotation, a formal language for expressing dialogue annotations (the Dialogue Act Markup Language,
DiAML), and a method for segmenting a dialogue into semantic units. This allows the manual or
automatic annotation of dialogue segments with information about the communicative actions which
the participants perform by their contributions to the dialogue. The annotation scheme specified in
this document supports multidimensional annotation of spoken, written, and multimodal dialogues
involving two or more participants. Dialogue units are viewed as having multiple communicative
functions in different dimensions. The markup language DiAML has an XML-based representation format
and a formal semantics which makes it possible to perform inferences with DiAML representations.
This document also specifies data categories for dimensions of dialogue analysis, for communicative
functions, for dialogue act qualifiers, and for relations between dialogue acts. Additionally, it provides
mechanisms for customizing these sets of concepts, extending them with application-specific or
domain-specific concepts and descriptions of semantic content, or selecting relevant coherent subsets
of them. These mechanisms make the dialogue act concepts specified in this document useful not only
for annotation but also for the recognition and generation of dialogue acts in interactive systems.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
addressee
dialogue (3.5) participant (3.13) oriented to by the sender (3.20) in a manner to suggest that his/her
utterances (3.25) are particularly intended for this participant, and that some response is therefore
anticipated from this participant, more so than from the other participants
Note 1 to entry: This definition is a de facto standard in the linguistics literature.
[SOURCE: Reference [34], modified - ‘speaker' replaced by ‘sender', and use of ambiguous pronouns
avoided.]
3.2
allo-feedback act
feedback act (3.8) where the sender (3.20) elicits information about the addressee's (3.1) processing
of an utterance (3.25) that the sender contributed to the dialogue (3.5), or where the sender provides
information about his/her perception of the addressee's processing of an utterance that the sender
contributed to the dialogue
EXAMPLE 1. A: Now move up.
2. B: Slightly northeast you mean?
3. A: Slightly yeah
With utterance 3, A performs an allo-feedback act signalling that he/she thinks B understood utterance 1
correctly.
3.3
auto-feedback act
feedback act (3.8) where the sender (3.20) provides information about his/her own processing of an
utterance (3.25) contributed to the dialogue (3.5) by another participant (3.13)
EXAMPLE B's utterance in the example dialogue fragment in 3.2 signals that he/she is uncertain whether
he/she understood the previous utterance correctly.
3.4
communicative function
property of certain stretches of communicative behaviour, describing how the behaviour changes the
information state (3.12) of an understander of the behaviour
3.5
dialogue
exchange of utterances (3.25) between two or more persons or artificial agents
3.6
dialogue act
communicative activity of a dialogue (3.5) participant (3.13), interpreted as having a certain
communicative function (3.4) and semantic content (3.18)
Note 1 to entry: A dialogue act can additionally also have certain functional dependence relations (3.10), rhetorical
relations (3.17) and feedback dependence relations (3.9) with other units in a dialogue.
3.7
dimension
class of dialogue acts (3.6) that are concerned with a particular aspect of communication, corresponding
to a particular category of semantic content (3.18)
EXAMPLE (1) Dialogue acts advancing the task or activity that motivates the dialogue (the ‘Task' dimension);
(2) dialogue acts providing and eliciting feedback (the Auto- and Allo-Feedback dimensions); (3) dialogue acts for
allocating the speaker role (the Turn Management dimension).
3.8
feedback act
dialogue act (3.6) that provides or elicits information about the sender's (3.20) or the addressee's (3.1)
processing of something that was uttered in the dialogue (3.5)
Note 1 to entry: Two classes of feedback are distinguished: allo-feedback acts (3.2) and auto-feedback acts (3.3).
3.9
feedback dependence relation
relation between a feedback act (3.8) and the stretch of communicative behaviour the processing of
which the act provides or elicits information about
EXAMPLE In the example in 3.2, both the allo-feedback act expressed by utterance 3 and the auto-feedback
act expressed by utterance 2 have a feedback dependence relation to utterance 1.
Note 1 to entry: Feedback dependence relations are also used to relate self-corrections, partner corrections, and
other speech editing acts, which strictly speaking are not feedback acts, to the segments that they apply to.
3.10
functional dependence relation
relation between a dialogue act (3.6) with a responsive communicative function (3.16) and one or more
previous dialogue acts that it responds to
EXAMPLE The relation between an answer and the corresponding question, such as between utterance 3 and
utterance 2 in the example in 3.2; or the relation between the acceptance of an offer and the corresponding offer.
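The dependence relations defined in 3.9 and 3.10 can be pictured as links from one dialogue contribution to earlier ones. The following is a minimal Python sketch for the example in 3.2, not part of the annotation scheme: the class `DialogueAct`, its field names, the identifiers `u1` to `u3`, and the helper `dependents_of` are all hypothetical, and feedback dependence is simplified to point at whole utterances.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueAct:
    """Hypothetical record for one dialogue act (field names are assumptions)."""
    act_id: str
    sender: str
    text: str
    # Ids of earlier acts this act responds to (functional dependence, 3.10).
    functional_dependence: list = field(default_factory=list)
    # Ids of earlier stretches of behaviour whose processing is at issue
    # (feedback dependence, 3.9); simplified here to utterance ids.
    feedback_dependence: list = field(default_factory=list)


# The three utterances of the example in 3.2.
acts = {
    "u1": DialogueAct("u1", "A", "Now move up."),
    "u2": DialogueAct("u2", "B", "Slightly northeast you mean?",
                      feedback_dependence=["u1"]),
    "u3": DialogueAct("u3", "A", "Slightly yeah",
                      functional_dependence=["u2"],
                      feedback_dependence=["u1"]),
}


def dependents_of(target_id, acts):
    """Return the ids of all acts with any dependence relation to target_id."""
    return sorted(
        act.act_id
        for act in acts.values()
        if target_id in act.functional_dependence
        or target_id in act.feedback_dependence
    )
```

Keeping the two kinds of link in separate fields mirrors the distinction drawn in the definitions: utterance 3 answers utterance 2 while giving feedback about utterance 1.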
3.11
functional segment
minimal stretch of communicative behaviour that has one or more communicative functions (3.4)
Note 1 to entry: The condition of being ‘minimal' ensures that functional segments do not include material that
does not contribute to the expression of a communicative function that identifies the segment.
EXAMPLE The functional segment corresponding to the answer given by S in the following dialogue
fragment does not include the parts "Just a moment please" and ". let me see." but only the parts "the first train to
the airport on Sunday morning is" and "at 5:45".
1. U: What time is the first train to the airport on Sunday morning please?
2. S: Just a moment please. the first train to the airport on Sunday morning is . let me see. at 5:45.
Note 2 to entry: A consequence of this definition is that functional segments can be discontinuous, can overlap or
be embedded, and can contain parts from more than one turn.
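Because a functional segment can be discontinuous, one convenient representation is a list of character spans over the transcribed turn. The sketch below is an illustrative assumption rather than a representation prescribed by this document; the function names `locate_spans` and `segment_text` are hypothetical, and the turn text is copied from the example in this entry.

```python
# Turn 2 of the example, as transcribed above.
turn = ("Just a moment please. the first train to the airport on Sunday "
        "morning is . let me see. at 5:45.")

# The two parts that together form the answer's functional segment.
parts = ["the first train to the airport on Sunday morning is", "at 5:45"]


def locate_spans(text, parts):
    """Find each part in order and return a list of (start, end) offsets."""
    spans = []
    cursor = 0
    for part in parts:
        start = text.index(part, cursor)  # raises ValueError if absent
        end = start + len(part)
        spans.append((start, end))
        cursor = end
    return spans


def segment_text(text, spans):
    """Join the covered stretches, skipping the material between them."""
    return " ".join(text[start:end] for start, end in spans)


answer_spans = locate_spans(turn, parts)
```

The material between the two spans ("let me see") stays available in the turn and can be assigned to a different functional segment with its own communicative function.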
3.12
information state
context
totality of a dialogue (3.5) participant's (3.13) beliefs, assumptions, expectations, goals, preferences,
hopes, and other attitudes that may influence the participant's interpretation and generation of
communicative behaviour
3.13
participant
person or artificial agent involved in the exchange of utterances (3.25)
3.14
qualifier
predicate that can be associated with a communicative function (3.4)
EXAMPLE A: Would you like to have some coffee?
B: Only if you have it ready.
B's utterance accepts A's offer under a certain condition; this can be described by qualifying the communicative
function Accept Offer with the predicate ‘conditional'.
3.15
reference segment
stretch of communicative behaviour that a feedback dependence relation (3.9) refers to and that is not a
functional segment (3.11)
3.16
responsive communicative function
communicative function (3.4) of a dialogue act (3.6) that depends for its semantic content (3.18) on one or
more dialogue acts that it responds to
Note 1 to entry: See 5.2.
Note 2 to entry: The set of responsive communicative functions of the annotation scheme defined in this
document is listed in 7.3.4.
3.17
rhetorical relation
discourse relation
semantic or pragmatic relation between two dialogue acts (3.6) or their semantic contents (3.18)
Note 1 to entry: Relations such as elaboration, explanation, justification, cause, and concession have been studied
extensively in the analysis of (monologue) text, where they are often called ‘rhetorical relations' or ‘discourse
relations', and are mostly viewed either as relations between text segments or as relations between events or
propositions, described in text segments. Many of these relations also occur in dialogue (3.5).
EXAMPLE 1 In the following example, the statement in the second utterance provides a motivation for the
question in the first utterance:
A: Can you tell me what flights there are to Sydney on Saturday? I’d like to attend my mother's 80th birthday.
EXAMPLE 2 A rhetorical relation between the semantic contents of two dialogue acts occurs in the following,
where the content of B's statement mentions a cause for the content of A's statement:
A: I can never find these stupid remote controls.
B: That's because they don’t have a fixed location.
3.18
semantic content
information, situation, action, event, or objects that a stretch of communicative behaviour refers to
3.19
semantic content category
semantic content type
type of the semantic content (3.18) of a dialogue act (3.6)
EXAMPLE The various dimensions (3.7) defined in this document correspond to categories of semantic
content. In particular, the Task dimension corresponds to the category of task-specific actions and information;
the Auto- and Allo-Feedback dimensions correspond to the categories of information about the processing by
the current speaker or by the addressee, respectively, of something that was said before; the Turn Management
dimension corresponds to the category of information about the allocation of the speaker role, and so forth.
3.20
sender
dialogue (3.5) participant (3.13) who performs a dialogue act (3.6)
3.21
speaker
sender (3.20) of a dialogue act (3.6) in spoken form, possibly combining speech with nonverbal
communicative behaviour
Note 1 to entry: A dialogue (3.5) participant (3.13) can contribute to a dialogue without having the speaker role
(3.22), for example by nodding in agreement to what the other participant says. Therefore, the term ‘speaker' is
not synonymous with ‘participant who occupies speaker role'.
3.22
speaker role
role occupied by a participant (3.13) who has temporary control of a dialogue (3.5) and speaks for some
period of time
[SOURCE: DAMSL annotation scheme (see Reference [3]).]
3.23
speech act
act that a speaker (3.21) performs when producing an utterance (3.25)
Note 1 to entry: The notion ‘utterance’ in this definition is commonly interpreted as mentioned in Note 1 to entry
of 3.25.
[SOURCE: SIL Glossary of linguistic terms (https://glossary.sil.org/term/speech-act), modified - Added
Note 1 to entry.]
3.24
turn unit
stretch of communicative activity produced by one participant (3.13) who occupies the speaker role
(3.22), bounded by periods of inactivity of that sender (3.20) or by periods where another participant
occupies the speaker role
Note 1 to entry: The term ‘turn unit’ corresponds to one of the meanings of the often used term ‘turn’, which is
ambiguous between ‘turn unit’ and ‘right to speak’, as in “to have the turn” and “turn-taking”. The term ‘turn’ is
only used in this document when the context makes it clear in what sense the term is meant.
Note 2 to entry: The term ‘turn unit’ is also closely related to the term ‘turn construction unit’ (TCU), introduced
by Reference [51]. The TCU seems a rather intuitive and holistic notion, of which the usefulness has been the
subject of debate (see e.g. Reference [52]). The term is therefore avoided in this document.
Note 3 to entry: The term ‘turn unit’ is useful in the description of dialogue (3.5) behaviour, but is not of central
importance in this document, since dialogue acts (3.6) are not assumed to correspond to turn units.
3.25
utterance
anything said, written, keyed, signed, or otherwise expressed, possibly in multimodal form
Note 1 to entry: An utterance is part of a turn unit (3.24). In the literature, the term is commonly used in the sense
of ‘everything contributed by a sender (3.20) within a turn unit’.
Note 2 to entry: The term ‘utterance’ is useful in the description of dialogue (3.5) behaviour, but is not of central
importance in this standard, since dialogue acts (3.6) are not assumed to correspond to utterances, but rather to
the communicative behaviour in functional segments (3.11).
4 Use cases
The notion of a dialogue act plays a key role in the analysis of spoken and multimodal dialogue, as well
as in the design of spoken dialogue systems and embodied conversational agents. These applications
all depend on the availability of dialogue corpora, annotated with dialogue act information. The main
purpose of this document is to define a reference set of domain-independent basic concepts for dialogue
act annotation, and their use for representing such annotations, in the Dialogue Act Markup Language
(DiAML). The set of concepts defined in ISO 24617-2:2012 was based on the DIT++ taxonomy, which was
originally developed to serve a double purpose: the articulate functional description of communicative
activity in natural human dialogue, and a basis for the design of modules in dialogue systems. As part of
the ISO 24617 series, a focus came to lie on annotation. Still, like DIT++, this document has multiple use
cases, which can be grouped into four types:
— UC1: manual annotation of spoken, written, or multimodal human-human or human-computer
dialogue;
— UC2: automatic annotation of spoken, written, or multimodal human-human or human-computer
dialogue starting from transcriptions or recordings of communicative behaviour;
— UC3: recognition of dialogue acts in (multimodal) communicative user behaviour;
— UC4: generation of dialogue acts by the dialogue manager component of a dialogue system.
The different use cases bring different requirements and desiderata, as follows.
— UC1: manual annotation is costly and only feasible for limited amounts of data, but has the advantage
of producing annotations of the highest quality since expert annotators have a wealth of context
information, general world knowledge, and common-sense reasoning abilities to infer speaker
beliefs and intentions. Expert annotators are therefore able to make fine-grained annotations. In
order to support this use case, the annotation scheme should include fine-grained concepts.
— UC2: automatic annotation systems typically cannot characterize dialogue behaviour with the same
level of detail as expert human annotators, since they lack common world knowledge, and usually have
access to context information only as far as represented in the dialogue history. Automatic annotation
is therefore in general less fine-grained. To effectively support this use case, the annotation scheme
should contain more coarse-grained concepts than those needed for use case UC1.
— UC3: recognition of dialogue acts by an interactive system is almost the same as automatic dialogue
act annotation, except that in an interactive system the semantic contents of dialogue acts play a
prominent role. For a given application, it may be beneficial to define application-specific functions
for specific types of content. For effectively supporting this use case, it may be useful to extend the
annotation scheme with application-specific concepts.
— UC4: generation of dialogue acts in an interactive system concerns the decision how to continue
a dialogue, and this is the main task of a dialogue manager component. This task is typically
organized as a two-stage process: (1) decide on the communicative functions and semantic contents
of one or more possible dialogue acts; (2) decide on a realization in an appropriate form. In contrast
with human dialogue participants, who may be somewhat vague or unspecific about their beliefs
and intentions, a system’s dialogue manager typically works with precise beliefs and goals, and
generates, in stage (1), dialogue acts with fine-grained communicative functions, for example for
feedback acts, since the system may report a processing problem with great accuracy. This calls for
the annotation scheme to include very fine-grained functions.
The new elements in this second edition of this document were introduced for providing more effective
support for each of these use cases, in particular for the cases UC3 and UC4.
5 Basic concepts and metamodel
5.1 Dialogue acts
The term ‘dialogue act' is often used rather loosely in the sense of speech act used in dialogue. Indeed,
the idea of interpreting communicative behaviour in terms of actions, such as questions, promises, and
requests, goes back to speech act theory [9],[50]. But where speech act theory is primarily an
action-based approach to meaning within the philosophy of language, dialogue act theory is an
empirically-based approach to the computational modelling of linguistic and nonverbal communicative
behaviour in dialogue.
Dialogue acts offer a way of characterizing the meaning of communicative behaviour in terms of update
operations, to be applied to the information states of participants in the dialogue; this approach is
commonly known as the ‘information-state update’ or ‘context-change’ approach; see e.g. References
[12] and [56]. For instance, when an addressee understands the utterance “Do you know what time it
is?” as a question about the time, then the addressee’s information state is updated to contain (among
other things) the information that the speaker does not know what time it is and would like to know
that. If, by contrast, it is understood that the speaker is reproaching the addressee for being late, then
the addressee’s information state is updated to include (among other things) the information that the
speaker does know what time it is. Distinctions such as that between a question and a reproach concern
the communicative function of a dialogue act, which is one of its two main components. The other main
component is its semantic content, which describes the objects, properties, relations, situations, actions
or events that the dialogue act is about. The communicative function of a dialogue act specifies how an
addressee updates his/her information state with the information expressed in the semantic content when
he/she understands the dialogue act.
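The update view of meaning can be illustrated with the utterance "Do you know what time it is?". The following Python sketch is a deliberately simplified assumption, not the formal semantics of this document: an information state is modelled as a set of (attitude, content) pairs, and the attitude labels and the function `update_information_state` are hypothetical.

```python
def update_information_state(state, communicative_function, content):
    """Return a new information state after a dialogue act is understood.

    `state` is a frozenset of (attitude, content) pairs; the attitude
    labels used here are illustrative assumptions only.
    """
    updated = set(state)
    if communicative_function == "question":
        # Understood as a question: the speaker lacks and wants the information.
        updated.add(("speaker_does_not_know", content))
        updated.add(("speaker_wants_to_know", content))
    elif communicative_function == "reproach":
        # Understood as a reproach: the speaker already has the information.
        updated.add(("speaker_knows", content))
    else:
        raise ValueError(f"unsupported function: {communicative_function}")
    return frozenset(updated)


initial = frozenset()
as_question = update_information_state(initial, "question", "current time")
as_reproach = update_information_state(initial, "reproach", "current time")
```

The same semantic content ("current time") leads to different resulting states depending only on the communicative function, which is the distinction the paragraph above draws between a question and a reproach.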
This approach to the definition of communicative functions is strictly semantic, in contrast to
approaches based on linguistic form. For example, the behaviour of a speaker who repeats something
that was said by someone else may be characterized as a ‘repetition’ (which is a communicative function
in some annotation schemes); however, this only says something about the form of the behaviour
compared to the repeated behaviour, not about its function. A repetition often has a feedback function,
as in (1.2), but it can also have other functions, as in (1.4), where it is used as a confirmation in response
to a check question.
(1) 1. S: There are evening flights at seven-fifteen and eight-thirty.
2. C: Seven-fifteen and eight-thirty.
3. C: And that’s on Sunday too.
4. S: And that’s on Sunday too.
A form-related requirement for introducing a communicative function is however that there are
observable features of communicative (linguistic and/or nonverbal) behaviour which are indicative for
that function in the context in which the behaviour occurs. This requirement puts all communicative
functions on an empirical basis.
Dialogue act annotation is the marking up of stretches of dialogue with information about the dialogue
acts they contain. Spoken dialogues are traditionally segmented into turns, in the sense of ‘turn units’
as defined in 3.24. Such turns can be quite long and complex, and are therefore not the most useful units
of behaviour to assign communicative functions to. Communicative functions can be assigned more
accurately to smaller units that are functionally relevant. Such units are called functional segments, and
are defined as the minimal stretches of communicative behaviour that have one or more communicative
functions. Subclause 6.3 discusses dialogue segmentation.
Inherent to the notion of a dialogue act is that there is an agent who produces the dialogue act, called
the ‘sender’, and one or more agents who are addressed, called the ‘addressee(s)’. Dialogue studies often
focus on two-person dialogues, in which case the dialogue acts have only one addressee. Besides sender
and addressee(s), there may be various types of side-participants who are present but do not or only
marginally participate [29].
Dialogue act annotation is often limited to assigning communicative functions to dialogue segments,
which corresponds intuitively to indicating the type of communicative action that is performed. A
semantically more complete characterization additionally provides information about the category of
semantic content. The DAMSL annotation scheme distinguishes three categories of semantic content:
Task, Task Management, and Communication, which indicate whether the semantic content of the
dialogue act advances the task which underlies the dialogue, or discusses how to perform the task,
or concerns the communication process. The DIT++ scheme distinguishes a number of subcategories
of communication-related information, such as feedback information, turn allocation information,
and speech management information. The scheme in this document inherits the DIT++ categories of
semantic content, also called ‘dimensions’; see Clause 7.
Example (2) illustrates the use of the key attributes of a dialogue act in the DiAML-XML annotation of
a task-related yes-no question addressed by speaker ‘a’ to addressee ‘b’, expressed by the functional
segment ‘m1’.
(2) communicativeFunction="propositionalQuestion"/>
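The markup of example (2) is truncated in this copy. As a hedged illustration only, the key attributes named in the preceding paragraph (functional segment, sender, addressee, dimension, communicative function) could be assembled as in the following Python sketch. The element name `dialogueAct` and the attribute names `xml:id`, `target`, `sender`, `addressee`, and `dimension`, as well as the identifier `da1`, are assumptions made for the sketch; only `communicativeFunction="propositionalQuestion"` is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Namespace URI to which the reserved "xml" prefix is bound.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def make_dialogue_act(act_id, segment, sender, addressee, dimension, function):
    """Build one dialogue act element.

    All names except communicativeFunction are assumed for illustration.
    """
    act = ET.Element("dialogueAct")
    act.set(f"{{{XML_NS}}}id", act_id)           # serialized as xml:id
    act.set("target", f"#{segment}")              # functional segment
    act.set("sender", f"#{sender}")
    act.set("addressee", f"#{addressee}")
    act.set("dimension", dimension)
    act.set("communicativeFunction", function)
    return act


# A task-related yes-no question from 'a' to 'b', expressed by segment 'm1'.
act = make_dialogue_act("da1", "m1", "a", "b", "task", "propositionalQuestion")
print(ET.tostring(act, encoding="unicode"))
```

The sketch keeps participants and segments as `#`-prefixed references, so that the same identifiers can be reused by other dialogue acts in the same annotation.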
5.2 Dependence relations
Some types of dialogue acts are inherently dependent for their full meaning on one or more dialogue
acts earlier in the dialogue, which they respond to. This is for example the case for answers, whose
meaning is partly determined by the question that is being answered, and also for the acceptance or
rejection of offers, suggestions, requests, and apologies. This is illustrated in example (3), where the
meaning of the answer in turn 3 depends on whether it is an answer to the question in turn 1 or to the
one in turn 2.
(3) 1. B: Do you know who’s coming tonight?
2. B: Which of the project members do you think will be there?
3. A: I’m expecting Jan, Alex, Claudia, and David, and maybe Olga and Andrei.
As an answer to the question in 1, A’s answer says that nobody else is expected to come than the people
that are mentioned, but as an answer to the question in 2 it leaves open the possibility that other people
will come, who are not members of ‘the project’.
This kind of semantic dependence, which is due to the responsive character of some communicative
functions, is called a functional dependence relation. Marking up this relation between a dialogue act
with a responsive communicative function and its ‘antecedent’ dialogue acts allows the annotation
to not just indicate e.g. that an utterance has the function of an answer, but also to indicate to which
question it is an answer, as illustrated in (4). Subclause 7.3.4 lists the responsive communicative
functions defined in this document.
(4) a. B: Which of the project members do you think will be there?
A: I’m expecting Jan, Alex, Claudia, and David, and maybe Olga and Andrei.
b. communicativeFunction="setQuestion"/>
communicativeFunction="answer" functionalDependence="#da1"/>
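As a hedged illustration of how a functional dependence relation can be followed from a responsive dialogue act to its antecedent, the Python sketch below parses a small annotation and resolves each `functionalDependence` reference. The wrapper element `diaml`, the element name `dialogueAct`, the identifiers, and the `sender` and `addressee` attributes are assumptions made for the sketch; only the `communicativeFunction` and `functionalDependence` attributes are taken from example (4).

```python
import xml.etree.ElementTree as ET

# Key under which ElementTree stores the xml:id attribute after parsing.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical annotation of example (4a): B asks, A answers.
SAMPLE = """
<diaml>
  <dialogueAct xml:id="da1" sender="#b" addressee="#a"
               communicativeFunction="setQuestion"/>
  <dialogueAct xml:id="da2" sender="#a" addressee="#b"
               communicativeFunction="answer" functionalDependence="#da1"/>
</diaml>
"""


def antecedents(root):
    """Map the id of each responsive act to the function of its antecedent."""
    by_id = {el.get(XML_ID): el for el in root.iter("dialogueAct")}
    links = {}
    for act_id, el in by_id.items():
        ref = el.get("functionalDependence")
        if ref is not None:
            target = by_id[ref.lstrip("#")]       # "#da1" -> "da1"
            links[act_id] = target.get("communicativeFunction")
    return links


root = ET.fromstring(SAMPLE)
print(antecedents(root))
```

With the relation marked up explicitly, the answer `da2` is tied to the set question `da1` rather than to any other question in the preceding dialogue.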
The property of ‘responsiveness’ is related to what in the literature is called ‘backward-looking’. For
example, in DAMSL the communicative functions are divided over two categories: forward-looking and
backward-looking. Backward-looking functions are defined as those functions that indicate how the
current utterance relates to the previous discourse. These include not only answers and other dialogue
acts whose semantic content is co-determined by antecedent dialogue acts, but also feedback acts and
other acts concerned with speech editing.
Positive and negative feedback-providing acts depend for their interpretation also on what happened
earlier in the dialogue, but in a different way. They are concerned with the processing of what was said
before - such as its perception or its interpretation. This is illustrated by the examples in (5).
(5) 1. A: The flight on Tuesday would suit me really well.
B: Okay.
2. A: The flight on Tuesday would suit me really well.
B: On Tuesday?
In the first example, B indicates that he/she has correctly understood A’s remark; in the second, he/she
checks whether he/she heard (or remembers) correctly what A said. This relation between a positive or
negative feedback act and its ‘antecedent’ is called a feedback dependence relation.
A feedback dependence relation indicates one or more preceding dialogue acts if the feedback concerns
high-level processing, such as understanding, and it indicates a dialogue segment in the case of low-
level processing, such as hearing what was said. In the latter case, ISO 24617-2:2012 stipulated that the
feedback dependence relation should refer to the smallest functional segment containing the segment
that the feedback act is about. This way of annotating feedback dependence relations is not quite
accurate, since feedback about a stretch of communicative behaviour smaller than a functional segment
is not about the entire segment. For example, negative feedback that signals a problem in hearing
certain words may imply positive feedback about the rest of the segment. The same applies to
feedback-eliciting acts and to dialogue acts in the Own Communication Management (OCM) dimension or in the
Partner Communication Management (PCM) dimension. In particular, Self-Corrections and Partner
Corrections frequently refer to a single word or phrase which does not form a functional segment. To
make more accurate annotation possible, this second edition introduces a ‘reference segment’, as a
stretch of communicative behaviour that is t
...


INTERNATIONAL STANDARD ISO 24617-2
Second edition
2020-12
Language resource management —
Semantic annotation framework
(SemAF) —
Part 2:
Dialogue acts
Gestion des ressources langagières — Cadre d'annotation sémantique
(SemAF) —
Partie 2: Actes de dialogue
© ISO 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Use cases
5 Basic concepts and metamodel
5.1 Dialogue acts
5.2 Dependence relations
5.3 Rhetorical relations
5.4 Qualifiers
5.5 Metamodel
6 Multifunctionality, multidimensionality and segmentation
6.1 Multifunctionality
6.2 Multidimensionality and dimensions
6.3 Segmentation
7 Specification of the annotation scheme
7.1 Overview
7.2 Dimensions
7.2.1 Overview
7.2.2 Task and Task Management
7.2.3 Auto-Feedback and Allo-Feedback
7.2.4 Turn Management
7.2.5 Time Management
7.2.6 Discourse Structuring
7.2.7 Social Obligations Management
7.2.8 Own- and Partner Communication Management
7.2.9 Contact Management
7.3 Communicative functions
7.3.1 Overview
7.3.2 General-purpose functions
7.3.3 Dimension-specific functions
7.3.4 Responsive communicative functions
7.4 Functional and feedback dependences
7.5 Qualifiers
8 The Dialogue Act Markup Language (DiAML)
8.1 Overview
8.2 Abstract syntax
8.3 Concrete syntax
8.4 Semantics
9 Extension and customization
9.1 Overview
9.2 Simplifying the annotation scheme: options and selections
9.3 Extending the annotation scheme: triple-layered plug-ins and interfaces
Annex A (normative) Formal specification of DiAML
Annex B (normative) DiAML-XML technical schema
Annex C (normative) Data categories for DiAML concepts
Annex D (informative) Plug-ins for semantic content and other enrichments
Annex E (informative) Annotation guidelines and examples
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 4, Language resource management.
This second edition cancels and replaces the first edition (ISO 24617-2:2012), which has been technically
revised.
The main changes compared to the previous edition are as follows:
— in 5.2, ‘reference segments’ are introduced to allow more accurate annotations of feedback
dependence relations;
— in 5.3, a more detailed way of annotating rhetorical relations between dialogue acts is made possible
by importing concepts from ISO 24617-8:2016 (DR-core);
— in 7.2, the Contact Management dimension, known from the DIT++ annotation scheme, and the Task
Management dimension, known from the DAMSL annotation scheme, have been added, along with a
few communicative functions specific for contact management;
— in 7.5 and Annex D, a possibility is introduced for importing elements from the W3C recommendation
EmotionML in order to add affective information to dialogue acts;
— in Clause 9 and Annex D, the mechanism of ‘triple-layered annotation scheme plug-in’ with ‘plug-in
interface’ is introduced; this mechanism allows the dialogue act annotation to be customized, using
application-specific concepts, and to be enriched with semantic content information.
A list of all parts in the ISO 24617 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Since its publication in 2012, ISO 24617-2 has been used in a number of annotation efforts as well as in
the development of language-based interactive systems. These experiences have brought to light
— that the standard allowed dialogue act annotations that are slightly inaccurate in some respects,
— that some applications would benefit from the availability of mechanisms for customizing the set of
concepts defined in the standard, and
— that certain use cases require the representation of functional dialogue act information to be
extended with semantic content information.
This second edition seeks to remedy the noted inaccuracies, and to provide mechanisms
a) for customizing the set of defined concepts, and
b) for extending the information types in dialogue act annotations.
The improved accuracy of this second edition concerns the annotation of semantic dependence relations
of dialogue acts and their scopes, and of rhetorical relations between dialogue acts. The mechanisms for
extending and customizing the standard for a specific application concern most notably the annotation
of information about the (domain-specific) semantic content of dialogue acts, the introduction of
application-specific dialogue act types, the addition of communicative functions for fine-grained
specification of feedback, and the annotation of speaker emotions.
This second edition is downward compatible with the original ISO 24617-2:2012 in the sense that
every annotation made with the original version is a valid annotation according to the second edition.
Existing annotations do not need to be revised in order to be compliant with this second edition.

INTERNATIONAL STANDARD ISO 24617-2:2020(E)
Language resource management — Semantic annotation
framework (SemAF) —
Part 2:
Dialogue acts
1 Scope
This document provides a set of empirically and theoretically well-motivated concepts for dialogue
annotation, a formal language for expressing dialogue annotations (the Dialogue Act Markup Language,
DiAML), and a method for segmenting a dialogue into semantic units. This allows the manual or
automatic annotation of dialogue segments with information about the communicative actions which
the participants perform by their contributions to the dialogue. The annotation scheme specified in
this document supports multidimensional annotation of spoken, written, and multimodal dialogues
involving two or more participants. Dialogue units are viewed as having multiple communicative
functions in different dimensions. The markup language DiAML has an XML-based representation format
and a formal semantics which makes it possible to perform inferences with DiAML representations.
This document also specifies data categories for dimensions of dialogue analysis, for communicative
functions, for dialogue act qualifiers, and for relations between dialogue acts. Additionally, it provides
mechanisms for customizing these sets of concepts, extending them with application-specific or
domain-specific concepts and descriptions of semantic content, or selecting relevant coherent subsets
of them. These mechanisms make the dialogue act concepts specified in this document useful not only
for annotation but also for the recognition and generation of dialogue acts in interactive systems.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
addressee
dialogue (3.5) participant (3.13) oriented to by the sender (3.20) in a manner to suggest that his/her
utterances (3.25) are particularly intended for this participant, and that some response is therefore
anticipated from this participant, more so than from the other participants
Note 1 to entry: This definition is a de facto standard in the linguistics literature.
[SOURCE: Reference [34], modified - ‘speaker’ replaced by ‘sender’, and use of ambiguous pronouns
avoided.]
3.2
allo-feedback act
feedback act (3.8) where the sender (3.20) elicits information about the addressee's (3.1) processing
of an utterance (3.25) that the sender contributed to the dialogue (3.5), or where the sender provides
information about his/her perceived processing by the addressee of an utterance that the sender contributed
to the dialogue
EXAMPLE 1. A: Now move up.
2. B: Slightly northeast you mean?
3. A: Slightly yeah
With utterance 3, A performs an allo-feedback act signalling that he/she thinks B understood utterance 1
correctly.
3.3
auto-feedback act
feedback act (3.8) where the sender (3.20) provides information about his/her own processing of an
utterance (3.25) contributed to the dialogue (3.5) by another participant (3.13)
EXAMPLE B's utterance in the example dialogue fragment in 3.2 signals that he/she is uncertain whether
he/she understood the previous utterance correctly.
3.4
communicative function
property of certain stretches of communicative behaviour, describing how the behaviour changes the
information state (3.12) of an understander of the behaviour
3.5
dialogue
exchange of utterances (3.25) between two or more persons or artificial agents
3.6
dialogue act
communicative activity of a dialogue (3.5) participant (3.13), interpreted as having a certain
communicative function (3.4) and semantic content (3.18)
Note 1 to entry: A dialogue act can additionally also have certain functional dependence relations (3.10), rhetorical
relations (3.17) and feedback dependence relations (3.9) with other units in a dialogue.
3.7
dimension
class of dialogue acts (3.6) that are concerned with a particular aspect of communication, corresponding
to a particular category of semantic content (3.18)
EXAMPLE (1) Dialogue acts advancing the task or activity that motivates the dialogue (the ‘Task’ dimension);
(2) dialogue acts providing and eliciting feedback (the Auto- and Allo-Feedback dimensions); (3) dialogue acts for
allocating the speaker role (the Turn Management dimension).
3.8
feedback act
dialogue act (3.6) that provides or elicits information about the sender's (3.20) or the addressee's (3.1)
processing of something that was uttered in the dialogue (3.5)
Note 1 to entry: Two classes of feedback are distinguished: allo-feedback acts (3.2) and auto-feedback acts (3.3).
3.9
feedback dependence relation
relation between a feedback act (3.8) and the stretch of communicative behaviour the processing of
which the act provides or elicits information about
EXAMPLE In the example in 3.2, both the allo-feedback act expressed by utterance 3 and the auto-feedback
act expressed by utterance 2 have a feedback dependence relation to utterance 1.
Note 1 to entry: Feedback dependence relations are also used to relate self-corrections, partner corrections, and
other speech editing acts, which strictly speaking are not feedback acts, to the segments that they apply to.
3.10
functional dependence relation
relation between a dialogue act (3.6) with a responsive communicative function (3.16) and one or more
previous dialogue acts that it responds to
EXAMPLE The relation between an answer and the corresponding question, such as between utterance 3 and
utterance 2 in the example in 3.2; or the relation between the acceptance of an offer and the corresponding offer.
3.11
functional segment
minimal stretch of communicative behaviour that has one or more communicative functions (3.4)
Note 1 to entry: The condition of being ‘minimal’ ensures that functional segments do not include material that
does not contribute to the expression of a communicative function that identifies the segment.
EXAMPLE The functional segment corresponding to the answer given by S in the following dialogue
fragment does not include the parts “Just a moment please” and “. let me see.” but only the parts “the first train to
the airport on Sunday morning is” and “at 5:45”.
1. U: What time is the first train to the airport on Sunday morning please?
2. S: Just a moment please. the first train to the airport on Sunday morning is . let me see. at 5:45.
Note 2 to entry: A consequence of this definition is that functional segments can be discontinuous, can overlap or
be embedded, and can contain parts from more than one turn.
3.12
information state
context
totality of a dialogue (3.5) participant's (3.13) beliefs, assumptions, expectations, goals, preferences,
hopes, and other attitudes that may influence the participant's interpretation and generation of
communicative behaviour
3.13
participant
person or artificial agent involved in the exchange of utterances (3.25)
3.14
qualifier
predicate that can be associated with a communicative function (3.4)
EXAMPLE A: Would you like to have some coffee?
B: Only if you have it ready.
B's utterance accepts A's offer under a certain condition; this can be described by qualifying the communicative
function Accept Offer with the predicate ‘conditional’.
3.15
reference segment
stretch of communicative behaviour that a feedback dependence relation (3.9) refers to and that is not a
functional segment (3.11)
3.16
responsive communicative function
communicative function (3.4) of a dialogue act (3.6) that depends for its semantic content (3.18) on one or
more dialogue acts that it responds to
Note 1 to entry: See 5.2.
Note 2 to entry: The set of responsive communicative functions of the annotation scheme defined in this
document is listed in 7.3.4.
3.17
rhetorical relation
discourse relation
semantic or pragmatic relation between two dialogue acts (3.6) or their semantic contents (3.18)
Note 1 to entry: Relations such as elaboration, explanation, justification, cause, and concession have been studied
extensively in the analysis of (monologue) text, where they are often called ‘rhetorical relations’ or ‘discourse
relations’, and are mostly viewed either as relations between text segments or as relations between events or
propositions, described in text segments. Many of these relations also occur in dialogue (3.5).
EXAMPLE 1 In the following example, the statement in the second utterance provides a motivation for the
question in the first utterance:
A: Can you tell me what flights there are to Sydney on Saturday? I’d like to attend my mother's 80th birthday.
EXAMPLE 2 A rhetorical relation between the semantic contents of two dialogue acts occurs in the following,
where the content of B's statement mentions a cause for the content of A's statement:
A: I can never find these stupid remote controls.
B: That's because they don’t have a fixed location.
3.18
semantic content
information, situation, action, event, or objects that a stretch of communicative behaviour refers to
3.19
semantic content category
semantic content type
type of the semantic content (3.18) of a dialogue act (3.6)
EXAMPLE The various dimensions (3.7) defined in this document correspond to categories of semantic
content. In particular, the Task dimension corresponds to the category of task-specific actions and information;
the Auto- and Allo-Feedback dimensions correspond to the categories of information about the processing by
the current speaker or by the addressee, respectively, of something that was said before; the Turn Management
dimension corresponds to the category of information about the allocation of the speaker role, and so forth.
3.20
sender
dialogue (3.5) participant (3.13) who performs a dialogue act (3.6)
3.21
speaker
sender (3.20) of a dialogue act (3.6) in spoken form, possibly combining speech with nonverbal
communicative behaviour
Note 1 to entry: A dialogue (3.5) participant (3.13) can contribute to a dialogue without having the speaker role
(3.22), for example by nodding in agreement to what the other participant says. Therefore, the term ‘speaker’ is
not synonymous with ‘participant who occupies the speaker role’.
3.22
speaker role
role occupied by a participant (3.13) who has temporary control of a dialogue (3.5) and speaks for some
period of time
[SOURCE: DAMSL annotation scheme (see Reference [3]).]
3.23
speech act
act that a speaker (3.21) performs when producing an utterance (3.25)
Note 1 to entry: The notion ‘utterance’ in this definition is commonly interpreted as mentioned in Note 1 to entry
of 3.25.
[SOURCE: SIL Glossary of linguistic terms (https://glossary.sil.org/term/speech-act), modified - Added
Note 1 to entry.]
3.24
turn unit
stretch of communicative activity produced by one participant (3.13) who occupies the speaker role
(3.22), bounded by periods of inactivity of that sender (3.20) or by periods where another participant
occupies the speaker role
Note 1 to entry: The term ‘turn unit’ corresponds to one of the meanings of the often used term ‘turn’, which is
ambiguous between ‘turn unit’ and ‘right to speak’, as in “to have the turn” and “turn-taking”. The term ‘turn’ is
only used in this document when the context makes it clear in what sense the term is meant.
Note 2 to entry: The term ‘turn unit’ is also closely related to the term ‘turn construction unit’ (TCU), introduced
by Reference [51]. The TCU seems a rather intuitive and holistic notion, of which the usefulness has been the
subject of debate (see e.g. Reference [52]). The term is therefore avoided in this document.
Note 3 to entry: The term ‘turn unit’ is useful in the description of dialogue (3.5) behaviour, but is not of central
importance in this document, since dialogue acts (3.6) are not assumed to correspond to turn units.
3.25
utterance
anything said, written, keyed, signed, or otherwise expressed, possibly in multimodal form
Note 1 to entry: An utterance is part of a turn unit (3.24). In the literature, the term is commonly used in the sense
of ‘everything contributed by a sender (3.20) within a turn unit’.
Note 2 to entry: The term ‘utterance’ is useful in the description of dialogue (3.5) behaviour, but is not of central
importance in this standard, since dialogue acts (3.6) are not assumed to correspond to utterances, but rather to
the communicative behaviour in functional segments (3.11).
4 Use cases
The notion of a dialogue act plays a key role in the analysis of spoken and multimodal dialogue, as well
as in the design of spoken dialogue systems and embodied conversational agents. These applications
all depend on the availability of dialogue corpora, annotated with dialogue act information. The main
purpose of this document is to define a reference set of domain-independent basic concepts for dialogue
act annotation, and their use for representing such annotations, in the Dialogue Act Markup Language
(DiAML). The set of concepts defined in ISO 24617-2:2012 was based on the DIT++ taxonomy, which was
originally developed to serve a double purpose: the articulate functional description of communicative
activity in natural human dialogue, and a basis for the design of modules in dialogue systems. As part of
the ISO 24617 series, a focus came to lie on annotation. Still, like DIT++, this document has multiple use
cases, which can be grouped into four types:
— UC1: manual annotation of spoken, written, or multimodal human-human or human-computer
dialogue;
— UC2: automatic annotation of spoken, written, or multimodal human-human or human-computer
dialogue starting from transcriptions or recordings of communicative behaviour;
— UC3: recognition of dialogue acts in (multimodal) communicative user behaviour;
— UC4: generation of dialogue acts by the dialogue manager component of a dialogue system.
The different use cases bring different requirements and desiderata, as follows.
— UC1: manual annotation is costly and only feasible for limited amounts of data, but has the advantage
of producing annotations of the highest quality since expert annotators have a wealth of context
information, general world knowledge, and common-sense reasoning abilities to infer speaker
beliefs and intentions. Expert annotators are therefore able to make fine-grained annotations. In
order to support this use case, the annotation scheme should include fine-grained concepts.
— UC2: automatic annotation systems typically cannot characterize dialogue behaviour with the same
level of detail as expert human annotators, since they lack common world knowledge, and usually have
access to context information only as far as represented in the dialogue history. Automatic annotation
is therefore in general less fine-grained. To effectively support this use case, the annotation scheme
should contain more coarse-grained concepts than those needed for use case UC1.
— UC3: recognition of dialogue acts by an interactive system is almost the same as automatic dialogue
act annotation, except that in an interactive system the semantic contents of dialogue acts play a
prominent role. For a given application, it may be beneficial to define application-specific functions
for specific types of content. For effectively supporting this use case, it may be useful to extend the
annotation scheme with application-specific concepts.
— UC4: generation of dialogue acts in an interactive system concerns the decision how to continue
a dialogue, and this is the main task of a dialogue manager component. This task is typically
organized as a two-stage process: (1) decide on the communicative functions and semantic contents
of one or more possible dialogue acts; (2) decide on a realization in an appropriate form. In contrast
with human dialogue participants, who may be somewhat vague or unspecific about their beliefs
and intentions, a system’s dialogue manager typically works with precise beliefs and goals, and
generates, in stage (1), dialogue acts with fine-grained communicative functions, for example for
feedback acts, since the system may report a processing problem with great accuracy. This calls for
the annotation scheme to include very fine-grained functions.
The new elements in this second edition of this document were introduced for providing more effective
support for each of these use cases, in particular for the cases UC3 and UC4.
5 Basic concepts and metamodel
5.1 Dialogue acts
The term ‘dialogue act’ is often used rather loosely in the sense of speech act used in dialogue. Indeed,
the idea of interpreting communicative behaviour in terms of actions, such as questions, promises, and
requests, goes back to speech act theory [9],[50]. But where speech act theory is primarily an
action-based approach to meaning within the philosophy of language, dialogue act theory is an
empirically-based approach to the computational modelling of linguistic and nonverbal communicative
behaviour in dialogue.
Dialogue acts offer a way of characterizing the meaning of communicative behaviour in terms of update
operations, to be applied to the information states of participants in the dialogue; this approach is
commonly known as the ‘information-state update’ or ‘context-change’ approach; see e.g. References
[12] and [56]. For instance, when an addressee understands the utterance “Do you know what time it
is?” as a question about the time, then the addressee’s information state is updated to contain (among
other things) the information that the speaker does not know what time it is and would like to know
that. If, by contrast, it is understood that the speaker is reproaching the addressee for being late, then
the addressee’s information state is updated to include (among other things) the information that the
speaker does know what time it is. Distinctions such as that between a question and a reproach concern
the communicative function of a dialogue act, which is one of its two main components. The other main
component is its semantic content, which describes the objects, properties, relations, situations, actions
or events that the dialogue act is about. The communicative function of a dialogue act specifies how an
addressee updates his/her information state with the information expressed in the semantic content when
he/she understands the dialogue act.
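The information-state update view described above can be illustrated with a small Python sketch, given here only as an informal aid. It models an addressee's information state as a set of belief strings and shows how the same utterance, “Do you know what time it is?”, leads to different updates depending on the communicative function that the addressee assigns to it. The class `InformationState`, the function `update`, the labels `"question"` and `"reproach"`, and the belief strings are all assumptions made for the sketch; they are not part of DiAML or of the formal semantics specified in this document.

```python
from dataclasses import dataclass, field


@dataclass
class InformationState:
    """Toy stand-in for an addressee's information state: a set of beliefs."""
    beliefs: set = field(default_factory=set)


def update(state, function, content):
    """Return a new state after applying the update for the given function.

    The function labels and belief strings are illustrative assumptions.
    """
    new_beliefs = set(state.beliefs)
    if function == "question":
        new_beliefs.add(f"speaker does not know {content}")
        new_beliefs.add(f"speaker wants to know {content}")
    elif function == "reproach":
        new_beliefs.add(f"speaker knows {content}")
    else:
        raise ValueError(f"no update rule for {function!r}")
    return InformationState(new_beliefs)


start = InformationState()
as_question = update(start, "question", "what time it is")
as_reproach = update(start, "reproach", "what time it is")
print(sorted(as_question.beliefs))
print(sorted(as_reproach.beliefs))
```

The semantic content is the same in both calls; only the communicative function differs, and with it the resulting information state.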
This approach to the definition of communicative functions is strictly semantic, in contrast to
approaches based on linguistic form. For example, the behaviour of a speaker who repeats something
that was said by someone else may be characterized as a ‘repetition’ (which is a communicative function
in some annotation schemes); however, this only says something about the form of the behaviour
compared to the repeated behaviour, not about its function. A repetition often has a feedback function,
as in (1.2), but it can also have other functions, as in (1.4), where it is used as a confirmation in response
to a check question.
(1) 1. S: There are evening flights at seven-fifteen and eight-thirty.
2. C: Seven-fifteen and eight-thirty.
3. C: And that’s on Sunday too.
4. S: And that’s on Sunday too.
A form-related requirement for introducing a communicative function is, however, that there are
observable features of communicative (linguistic and/or nonverbal) behaviour which are indicative of
that function in the context in which the behaviour occurs. This requirement puts all communicative
functions on an empirical basis.
Dialogue act annotation is the marking up of stretches of dialogue with information about the dialogue
acts they contain. Spoken dialogues are traditionally segmented into turns, in the sense of ‘turn units’
as defined in 3.24. Such turns can be quite long and complex, and are therefore not the most useful units
of behaviour to assign communicative functions to. Communicative functions can be assigned more
accurately to smaller units that are functionally relevant. Such units are called functional segments, and
are defined as the minimal stretches of communicative behaviour that have one or more communicative
functions. Subclause 6.3 discusses dialogue segmentation.
Inherent to the notion of a dialogue act is that there is an agent who produces the dialogue act, called
the ‘sender’, and one or more agents who are addressed, called the ‘addressee(s)’. Dialogue studies often
focus on two-person dialogues, in which case the dialogue acts have only one addressee. Besides sender
and addressee(s), there may be various types of side-participants who are present but do not or only
marginally participate [29].
Dialogue act annotation is often limited to assigning communicative functions to dialogue segments,
which corresponds intuitively to indicating the type of communicative action that is performed. A
semantically more complete characterization additionally provides information about the category of
semantic content. The DAMSL annotation scheme distinguishes three categories of semantic content:
Task, Task Management, and Communication, which indicate whether the semantic content of the
dialogue act advances the task which underlies the dialogue, or discusses how to perform the task,
or concerns the communication process. The DIT++ scheme distinguishes a number of subcategories
of communication-related information, such as feedback information, turn allocation information,
and speech management information. The scheme in this document inherits the DIT++ categories of
semantic content, also called ‘dimensions’; see Clause 7.
Example (2) illustrates the use of the key attributes of a dialogue act in the DiAML-XML annotation of
a task-related yes-no question addressed by speaker ‘a’ to addressee ‘b’, expressed by the functional
segment ‘m1’.
(2) <dialogueAct target="#m1" sender="#a" addressee="#b" dimension="task"
communicativeFunction="propositionalQuestion"/>
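A representation of this kind can be read with any XML parser. The following sketch assumes a `dialogueAct` element whose attributes `target`, `sender`, `addressee`, and `dimension` carry the information described above; apart from `communicativeFunction`, these attribute names and values are assumptions made for illustration, and `strip_ref` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed DiAML-XML rendering of a task-related yes-no question from 'a' to 'b',
# expressed by functional segment 'm1'. Only communicativeFunction is taken
# verbatim from the example; the other attribute names are assumptions.
DIAML_ACT = (
    '<dialogueAct target="#m1" sender="#a" addressee="#b" '
    'dimension="task" communicativeFunction="propositionalQuestion"/>'
)


def strip_ref(value):
    """Turn a '#id' reference into the bare identifier."""
    return value[1:] if value.startswith("#") else value


act = ET.fromstring(DIAML_ACT)

# Collect the key attributes of the dialogue act in a plain dictionary.
summary = {
    "segment": strip_ref(act.get("target")),
    "sender": strip_ref(act.get("sender")),
    "addressee": strip_ref(act.get("addressee")),
    "dimension": act.get("dimension"),
    "function": act.get("communicativeFunction"),
}
```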
5.2 Dependence relations
Some types of dialogue acts are inherently dependent for their full meaning on one or more dialogue
acts earlier in the dialogue, which they respond to. This is for example the case for answers, whose
meaning is partly determined by the question that is being answered, and also for the acceptance or
rejection of offers, suggestions, requests, and apologies. This is illustrated in example (3), where the
meaning of the answer in turn 3 depends on whether it is an answer to the question in turn 1 or to the
one in turn 2.
(3) 1. B: Do you know who’s coming tonight?
2. B: Which of the project members do you think will be there?
3. A: I’m expecting Jan, Alex, Claudia, and David, and maybe Olga and Andrei.
As an answer to the question in 1, A’s answer says that nobody else is expected to come than the people
that are mentioned, but as an answer to the question in 2 it leaves open the possibility that other people
will come, who are not members of ‘the project’.
This kind of semantic dependence, which is due to the responsive character of some communicative
functions, is called a functional dependence relation. Marking up this relation between a dialogue act
with a responsive communicative function and its ‘antecedent’ dialogue acts allows the annotation
to not just indicate e.g. that an utterance has the function of an answer, but also to indicate to which
question it is an answer, as illustrated in (4). Subclause 7.3.4 lists the responsive communicative
functions defined in this document.
(4) a. B: Which of the project members do you think will be there?
A: I’m expecting Jan, Alex, Claudia, and David, and maybe Olga and Andrei.
b. <dialogueAct xml:id="da1" sender="#b" addressee="#a"
communicativeFunction="setQuestion"/>
<dialogueAct sender="#a" addressee="#b"
communicativeFunction="answer" functionalDependence="#da1"/>
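Following a functional dependence relation amounts to resolving the `#`-reference to the antecedent dialogue act. The sketch below shows one way of doing this. The wrapper element `diaml`, the identifier `da2`, the `sender` and `addressee` values, and the helper `antecedents` are assumptions made for illustration; only the attributes `communicativeFunction` and `functionalDependence` and the reference `#da1` come from example (4).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical wrapper element and identifiers, for illustration only.
DIAML = """
<diaml>
  <dialogueAct xml:id="da1" sender="#b" addressee="#a"
               communicativeFunction="setQuestion"/>
  <dialogueAct xml:id="da2" sender="#a" addressee="#b"
               communicativeFunction="answer" functionalDependence="#da1"/>
</diaml>
""".strip()

root = ET.fromstring(DIAML)

# Index all dialogue acts by their identifier.
acts = {el.get(XML_ID): el for el in root.iter("dialogueAct")}


def antecedents(act):
    """Return the dialogue acts that `act` functionally depends on."""
    refs = act.get("functionalDependence", "").split()
    return [acts[ref.lstrip("#")] for ref in refs]


answer = acts["da2"]
antecedent_functions = [a.get("communicativeFunction") for a in antecedents(answer)]
```

Splitting the attribute value on whitespace allows for more than one antecedent, in line with the reference to "antecedent dialogue acts" above; whether DiAML encodes multiple antecedents in this way is an assumption of the sketch.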
The property of ‘responsiveness’ is related to what in the literature is called ‘backward-looking’. For
example, in DAMSL the communicative functions are divided over two categories: forward-looking and
backward-looking. Backward-looking functions are defined as those functions that indicate how the
current utterance relates to the previous discourse. These include not only answers and other dialogue
acts whose semantic content is co-determined by antecedent dialogue acts, but also feedback acts and
other acts concerned with speech editing.
Positive and negative feedback-providing acts depend for their interpretation also on what happened
earlier in the dialogue, but in a different way. They are concerned with the processing of what was said
before, such as its perception or its interpretation. This is illustrated by the examples in (5).
(5) 1. A: The flight on Tuesday would suit me really well.
B: Okay.
2. A: The flight on Tuesday would suit me really well.
B: On Tuesday?
In the first example, B indicates that he/she has correctly understood A’s remark; in the second, he/she
checks whether he/she heard (or remembers) correctly what A said. This relation between a positive or
negative feedback act and its ‘antecedent’ is called a feedback dependence relation.
A feedback dependence relation indicates one or more preceding dialogue acts if the feedback concerns
high-level processing, such as understanding, and it indicates a dialogue segment in the case of low-
level processing, such as hearing what was said. In the latter case, ISO 24617-2:2012 stipulated that the
feedback dependence relation should refer to the smallest functional segment containing the segment
that the feedback act is about. This way of annotating feedback dependence relations is not quite
accurate, since feedback about a stretch of communicative behaviour smaller than a functional segment
is not about the entire segment. For example, negative feedback that signals a problem in hearing
certain words may imply positive feedback about the rest of the segment. The same holds for
feedback-eliciting acts and for dialogue acts in the Own Communication Management (OCM) dimension or
in the Partner Communication Management (PCM) dimension. In particular, Self-Corrections and Partner
Corrections frequently refer to a single word or phrase which does not form a functional segment. To
make more accurate annotation possible, this second edition introduces a ‘reference segment’, as a
stretch of communicative behaviour that is the object of a feedback dependence relation and that is not
a functional segment.
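The choice of antecedent for a feedback dependence relation can be sketched as follows. This is a simplified illustration under stated assumptions, not the DiAML representation: the classes `Segment` and `DialogueAct`, the function `feedback_antecedent`, the level names, the token offsets, and the `inform` function assigned to A's utterance are all hypothetical. The example uses the utterance from (5), "The flight on Tuesday would suit me really well", encoded as token positions 0 to 9.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Segment:
    """A stretch of communicative behaviour, given as token offsets (assumed encoding)."""
    seg_id: str
    start: int
    end: int          # exclusive
    functional: bool  # True: functional segment; False: reference segment


@dataclass(frozen=True)
class DialogueAct:
    act_id: str
    target: Segment
    function: str


def feedback_antecedent(
    level: str,
    act: DialogueAct,
    span: Optional[Tuple[int, int]] = None,
) -> Union[DialogueAct, Segment]:
    """Choose what a feedback act depends on (illustrative rule only)."""
    if level == "understanding":
        # High-level processing: the antecedent is a dialogue act.
        return act
    if level != "perception":
        raise ValueError(f"unknown processing level: {level}")
    # Low-level processing: the antecedent is a stretch of behaviour.
    whole = (act.target.start, act.target.end)
    if span is None or span == whole:
        return act.target
    start, end = span
    if not (act.target.start <= start < end <= act.target.end):
        raise ValueError("span must lie inside the functional segment")
    # A stretch smaller than the functional segment becomes a reference segment.
    return Segment(f"rs-{act.act_id}", start, end, functional=False)


fs1 = Segment("fs1", 0, 9, functional=True)
da1 = DialogueAct("da1", fs1, "inform")

okay_dep = feedback_antecedent("understanding", da1)            # B: "Okay."
tuesday_dep = feedback_antecedent("perception", da1, span=(2, 4))  # B: "On Tuesday?"
```

In this sketch, "Okay." depends on the dialogue act as a whole, whereas "On Tuesday?" depends on a reference segment covering only the two tokens "on Tuesday".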
5.3 Rhetorical relations
The possibility of annotating rhetorical relations between dialogue acts in ISO 24617-2:2012 was
limited in three respects:
a) no particular set of relations was specified;
b) there was no possibility to indicate the roles of the arguments;
c) it was not possible to distinguish between relations at the level of dialogue acts and relations at the
level of their semantic contents.
Since the publication of that edition, ISO 24617-8:2016 (DR-core) has been published, which defines an
annotation scheme for rhetorical relations. This second edition provides an option for annotating
rhetorical relations in dialogue in a more fine-grained manner by importing concepts of the DR-core
annotation scheme. See 6.3.
Dialogue acts may also be semantically and pragmatically related through other relations, known as
rhetorical relations or discourse relations, as in the examples shown in (6).
(6) 1. A: It ties you on in terms of the technology and the complexity that you want.
2. A: like for example voice recognition.
3. A: because you might need to power a microphone and other things.
4. A: So that’s one constraint there.
In this example 1), a sequence of four functional segments is contributed by the same partici
...
