ISO/IEC 18040:2019
INTERNATIONAL STANDARD
ISO/IEC 18040
First edition
2019-05
Information technology — Computer
graphics, image processing and
environmental data representation —
Live actor and entity representation in
mixed and augmented reality (MAR)
Technologies de l'information — Infographie, traitement de l'image
et représentation des données environnementales — Représentation
d'acteurs et d'entités réels en réalité mixte et augmentée (MAR)
Reference number
ISO/IEC 18040:2019(E)
© ISO/IEC 2019
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
3.2 Abbreviated terms
4 Concepts of LAE representation in MAR
4.1 Overview
4.2 Components
4.2.1 General
4.2.2 LAE capturer and sensor
4.2.3 LAE recognizer
4.2.4 LAE tracker
4.2.5 LAE spatial mapper
4.2.6 LAE event mapper
4.2.7 Renderer
4.2.8 Display and user interface
4.2.9 Scene representation
5 LAE capturer and sensor
5.1 Overview
5.2 Computational view
5.2.1 General
5.2.2 LAE capturer
5.2.3 LAE sensor
5.3 Informational view
6 Tracker and spatial mapper for an LAE
6.1 Overview
6.2 Computational view
6.3 Informational view
6.4 An example of LAE tracking and spatial mapping in MAR
7 Recognizer and event mapper for an LAE
7.1 Overview
7.2 Recognizer
7.3 Event mapper
7.4 Event execution
7.5 Examples of LAE recognizing and event mapping in MAR
8 Scene representation for an LAE
8.1 Overview
8.2 Scene description
9 Renderer
9.1 Overview
9.2 Computational view
9.3 Information view
10 Display and UI
11 Extensions to virtual actor and entity
12 System performance
13 Safety
14 Conformance
Annex A (informative) Use case examples
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that
are members of ISO or IEC participate in the development of International Standards through
technical committees established by the respective organization to deal with particular fields of
technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with ISO and IEC, also
take part in the work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www .iso .org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www .iso .org/patents) or the IEC
list of patent declarations received (see http: //patents .iec .ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www .iso
.org/iso/foreword .html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 24, Computer graphics, image processing and environmental data representation.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www .iso .org/members .html.
Introduction
This document defines the scope and key concepts of a representation model for a live actor and
entity (LAE) to be included in a mixed and augmented reality (MAR) world. The relevant terms and
their definitions, and a generalized system architecture, together serve as a reference model for MAR
applications, components, systems, services and specifications. It defines representing and rendering
an LAE in an MAR scene, and interaction interfaces between an LAE and objects in an MAR scene. It
defines a set of principles, concepts and functionalities for an LAE applicable to the complete range of
current and future MAR standards. This reference model establishes the set of required modules and
their minimum functions, the associated information content, and the information models that shall
be provided and/or be supported by a compliant MAR system. It includes (but is not limited to) the
following content:
— an introduction to the mixed and augmented reality standards domain and concepts;
— a representation model for including an LAE in an MAR scene;
— 3D modelling, rendering and simulation of an LAE in an MAR scene;
— attributes of an LAE in an MAR scene;
— sensing representation of an LAE in an MAR scene;
— representation of the interfaces for controlling an LAE in an MAR scene;
— functionalities and base components for controlling an LAE in an MAR scene;
— interactive interfaces between an LAE and an MAR scene;
— interface with other MAR components;
— relationship to other standards;
— use cases.
The objectives of this document are as follows:
— provide a reference model for LAE representation-based MAR applications;
— manage and control an LAE with its properties in an MAR environment;
— integrate an LAE into a 2D and/or 3D virtual scene in an MAR scene;
— achieve interaction of an LAE with a 2D and/or 3D virtual scene in an MAR scene;
— provide an exchange format necessary for transferring and storing data between LAE-based MAR
applications.
This document has the following document structure:
— Clause 4 describes the concepts of LAE-based systems represented in MAR.
— Clause 5 illustrates how a sensor captures an LAE in a physical world and a virtual world.
— Clause 6 describes mechanisms to track the position of an LAE and specifies the role of a spatial
mapper between physical space and the MAR space.
— Clause 7 describes mechanisms to recognize the behaviour of an LAE and specifies an association
or event between an MAR event of an LAE and the condition specified by the MAR content creator.
— Clause 8 describes a scene, which consists of a virtual scene, sensing data, a spatial scene, events,
targets and so on, for an LAE.
— Clause 9 describes how the MAR scene system renders the scene, LAE mapping, event and so on for
presentation output on a given display device.
— Clause 10 describes types of displays, including monitors, head mounted displays, projectors, haptic
devices and sound output devices for displaying an LAE in an MAR scene.
— Clause 11 identifies and describes virtual LAEs, such as a virtual 3D model (avatar) or a virtual model of a
real human, in an MAR system.
— Clause 12 makes statements regarding any system performance related issues of an LAE in MAR.
— Clause 13 makes statements regarding any operational safety related issues of an LAE in MAR.
— Clause 14 makes statements regarding any conformance related issues of an LAE in MAR.
— Annex A gives examples of representative LAE representation systems in MAR.
INTERNATIONAL STANDARD ISO/IEC 18040:2019(E)
Information technology — Computer graphics, image
processing and environmental data representation — Live
actor and entity representation in mixed and augmented
reality (MAR)
1 Scope
This document defines a reference model and base components for representing and controlling a
single LAE or multiple LAEs in an MAR scene. It defines concepts, a reference model, system framework,
functions and how to integrate a 2D/3D virtual world and LAEs, and their interfaces, in order to provide
MAR applications with interfaces of LAEs. It also defines an exchange format necessary for transferring
and storing LAE-related data between LAE-based MAR applications.
This document specifies the following functionalities:
a) definitions for an LAE in MAR;
b) representation of an LAE;
c) representation of properties of an LAE;
d) sensing of an LAE in a physical world;
e) integration of an LAE into a 2D/3D virtual scene;
f) interaction between an LAE and objects in a 2D/3D virtual scene;
g) transmission of information related to an LAE in an MAR scene.
This document defines a reference model for LAE representation-based MAR applications to represent
and to exchange data related to LAEs in a 2D/3D virtual scene in an MAR scene. It does not define
specific physical interfaces necessary for manipulating LAEs, that is, it does not define how specific
applications need to implement a specific LAE in an MAR scene, but rather defines common functional
interfaces for representing LAEs that can be used interchangeably between MAR applications.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 18039, Information technology — Computer graphics, image processing and environmental data
representation — Mixed and augmented reality (MAR) reference model
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 18039 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https: //www .iso .org/obp
— IEC Electropedia: available at http: //www .electropedia .org/
3.1.1
augmented object
object with augmentation
3.1.2
geographic coordinate system
coordinate system which is provided by sensor devices for defining the location of an LAE (3.1.4)
3.1.3
head mounted display
HMD
device which displays stereo views of virtual reality
Note 1 to entry: It has two small displays with lenses and semi-transparent mirrors which can adapt to the left
and right eyes.
3.1.4
live actor and entity
LAE
representation of a living physical or real object, such as a human being, animal or bird, in the MAR
content or system
Note 1 to entry: A live actor can be animated, moved and made to interact with virtual objects in an MAR scene
by capturing gestures with a camera. Entity refers to the 3D objects and entities that exist in MAR content.
3.1.5
LAE recognizer
MAR component that recognizes the output from an LAE capturer (3.1.6) and an LAE sensor, then
generates MAR events based on conditions indicated by the content creator
3.1.6
LAE capturer
MAR component that captures an LAE (3.1.4) in a virtual world and a physical world, which includes
depth cameras, general cameras, 360° cameras and so on
Note 1 to entry: LAE’s information can be processed by an LAE recognizer and an LAE tracker to extract
background or skeleton.
3.1.7
LAE tracker
MAR component (hardware and software) that analyses signals from LAE capturers (3.1.6) and sensors
and provides some characteristics of a tracked LAE (3.1.4) (for example position, orientation, amplitude,
profile)
3.1.8
physical camera coordinate system
coordinate system which is provided by a camera for capturing LAE(s) (3.1.4) in the physical world
3.1.9
physical coordinate system
coordinate system that enables locating an LAE (3.1.4) and is controlled by a geospatial coordinate
system sensing device
3.1.10
virtual actor and entity
VAE
virtual reality representation of an LAE (3.1.4)
Note 1 to entry: The virtual actor and entity is obtained by a 3D capturing technique and can be reconstructed,
transmitted or compressed in the MAR scene. A virtual actor and entity can be captured in one place or
transmitted to another place in real time using holography technology.
3.1.11
world coordinate system
universal system in computer graphics that allows model coordinate systems to interact with each other
3.2 Abbreviated terms
For the purposes of this document, the abbreviated terms given in ISO/IEC 18039 and the following apply.
DDR dance dance revolution
EID event identifier
FOV field of view
GNSS global navigation satellite system
LAE-MAR live actor and entity representation in mixed and augmented reality
RGB red, green, blue
SDK software development kit
SID sensor identifier
UI user interface
UTM universal transverse mercator
VR virtual reality
4 Concepts of LAE representation in MAR
4.1 Overview
As illustrated in ISO/IEC 18039, MAR represents a continuum which encompasses all domains or
systems that use a combination of reality (for example live video) and virtuality representations (for
example computer graphic objects or scenes) as the main presentation medium[1][2]. Figure 1 illustrates
MAR as defined by a mixture of reality and virtuality representations, excluding the purely real
environment and the purely virtual environment, from the viewpoint of LAE representation. The real
environment refers to the physical world environment where an LAE and physical objects are located.
The virtual environment commonly refers to virtual reality, that is, the computer-generated realistic
images and hypothetical world that replicate a real environment. Augmented reality refers to a view of
the real-world environment whose elements, including LAEs and objects, can be augmented by computer-
generated sensory input, and augmented virtuality is a virtual environment within which physical world
elements, including LAEs, can be mapped and interacted with. In Figure 1, an LAE wears an HMD device to see
the virtual world and interacts directly with virtual objects.
Figure 1 — Mixed and augmented reality (MAR)
This clause describes the concepts of LAE representation in an MAR scene based on the MAR reference
model (MAR-RM) of ISO/IEC 18039, which includes objectives, embedding, interaction and functions of
the system for representing an LAE in an MAR scene. In general, an actor is an individual who portrays
a character in a performance. In this case, an actor represents a human captured by a depth camera or
a general camera, which can then perform actions that are embedded into an MAR scene. A 3D object
that exists in an MAR scene and can interact with a live actor is called an entity. The entity can be
moved by, or interact with, an actor's motion via an event mapper. An LAE in this document is defined as
a representation of a physical living actor and an object in an MAR content or system. For example,
human beings, birds and animals are all represented as LAEs in an MAR scene.
Figure 2 shows examples of LAE representation in an MAR scene, consisting of a 2D virtual world and a
3D virtual world, which can be described as follows.
Figure 2 a) shows an LAE integrated into a 2D virtual world that is a real or virtual image. The LAE can
be captured from general camera and/or depth camera sensors. This subfigure shows a real-like action
where a man is captured by cameras in a green screen studio and is integrated as an actor into a 2D
virtual world image of the White House.
Figure 2 b) shows multiple LAEs integrated into a 3D virtual world. This scenario can be applicable in
various situations, such as news studios, education services, virtual surgical operations or games[3][4].
It supports an integrative combination application of 3D videoconferencing, reality-like communication
features, presentation/application sharing and 3D model display within a mixed environment.
Figure 2 c) shows an MAR scene constructed by integrating a 3D virtual world and a live actor[5]. The
live actor interacts with objects in the 3D virtual world by using a joystick or by motion captured by a
depth camera. An HMD device is used to display 360° 3D views, including real time and real-like action,
in the virtual world[6]. The figure shows a man in the studio wearing an HMD, through which he sees the
bow sling training field, and handling a joystick. By handling the joystick, he appears to be handling
the arrow and bow sling. As a result, he can shoot at objects in the virtual world.
Figure 2 d) shows a virtual actor and entity with representation in the physical world[7]. The virtual
LAE and the LAE can communicate and interact with each other, for example to have a natural face-to-face
conversation. When combined with MAR, this technology allows an LAE to see, hear and interact with a
virtual LAE in a 3D virtual space, just like a real presentation in physical space.
Figure 2 e) shows the LAE representation as a bird and a dog, from which it can be inferred that animals
and birds, as well as humans, can be represented as LAEs in an MAR scene.
a) An LAE integrated into a 2D video virtual world after chroma-keying[8]
b) Two LAEs integrated into a 3D virtual world
c) An LAE interacting with a virtual object in a 3D virtual world[5]
d) Virtual representation of an LAE in an MAR scene[7]
e) LAE representation as a bird and a dog
Figure 2 — Examples of LAE representation in an MAR scene
Once an LAE in the physical world is integrated into a 3D virtual world, its location, movements and
interactions should be represented precisely in the 3D virtual world. In an MAR application, an LAE
that needs to be embedded in a 3D virtual world shall be defined, and then information, such as the
LAE's location, actions and sensing data from a handheld device, shall be able to be transferred between
the physical world and the 3D virtual world, and between MAR applications. This document aims at
defining how an LAE can be managed and controlled with its properties in a 3D virtual world; how
an LAE can be embedded in a 3D virtual world; how an LAE can interact with virtual cameras, virtual
objects and AR content in a 3D virtual world; and how MAR application data can be exchanged in
heterogeneous computing environments.
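The exchange of such LAE-related data between MAR applications can be pictured with a small serialization sketch. The following Python snippet is informative only; the field names (lae_id, position, orientation, events) are illustrative assumptions and do not represent the exchange format defined by this document.

    import json

    # Hypothetical LAE exchange record; every field name here is an assumption
    # chosen only to illustrate transferring LAE data between applications.
    lae_record = {
        "lae_id": "actor-01",                    # assumed identifier
        "position": [1.2, 0.0, -3.5],            # metres, in MAR scene coordinates
        "orientation": [0.0, 0.0, 0.0, 1.0],     # quaternion (x, y, z, w)
        "events": [{"eid": 101, "timestamp": 0.0}],
    }
    payload = json.dumps(lae_record)             # serialize for transfer or storage
    restored = json.loads(payload)               # restore on the receiving application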
4.2 Components
4.2.1 General
An LAE in an MAR scene can be captured from the physical world, then represented in a 3D virtual
world, and can interact with cameras, objects and AR content in the 3D virtual world according to an
input of sensing information.
In order to provide a 3D virtual world with the capability of representing an LAE based on the MAR-
RM, the MAR system requires the following functions:
— sensing of an LAE in a physical world from input devices such as a (depth) camera;
— sensing of information for interaction from input sensors;
— recognizing and tracking an LAE in a physical world;
— recognizing and tracking events made by LAEs in a physical world;
— recognizing and tracking events captured by sensors in a physical world;
— representation of the physical properties of an LAE in a 3D virtual world;
— spatial control of an LAE in a 3D virtual world;
— an event interface between an LAE and a 3D virtual world;
— composite rendering of an LAE into a 3D virtual world.
Figure 3 — System framework for an LAE in an MAR scene
Conceptually, as shown in Figure 3, the system for implementing an MAR scene with LAEs includes
several components necessary for processing the representations and interactions of the LAEs to be
integrated into the 3D virtual world. This clause provides a conceptual overview of these components.
Each component is described in more detail in Clauses 5 to 10.
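To make the data flow between these components concrete, the following informative Python sketch chains the components of Figure 3 through one update step. The class and method names are illustrative assumptions, not interfaces defined by this document.

    # Informative sketch only: mirrors the component flow of Figure 3.
    class LAEPipeline:
        def __init__(self, capturer, sensor, tracker, recognizer,
                     spatial_mapper, event_mapper, scene, renderer, display):
            self.capturer, self.sensor = capturer, sensor
            self.tracker, self.recognizer = tracker, recognizer
            self.spatial_mapper, self.event_mapper = spatial_mapper, event_mapper
            self.scene, self.renderer, self.display = scene, renderer, display

        def step(self):
            frame = self.capturer.capture()                        # LAE capturer (4.2.2)
            readings = self.sensor.read()                          # LAE sensor (4.2.2)
            pose = self.tracker.track(frame)                       # LAE tracker (4.2.4)
            events = self.recognizer.recognize(frame, readings)    # LAE recognizer (4.2.3)
            placement = self.spatial_mapper.map_pose(pose)         # LAE spatial mapper (4.2.5)
            actions = [self.event_mapper.map_event(e) for e in events]  # event mapper (4.2.6)
            self.scene.update(placement, actions)                  # scene representation (4.2.9)
            output = self.renderer.render(self.scene)              # renderer (4.2.7)
            self.display.show(output)                              # display and UI (4.2.8)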
4.2.2 LAE capturer and sensor
An MAR scene with an LAE can receive a captured image, or a sequence of captured images, and sensing
data triggered by the LAE in the physical world. The data for the LAE representation can be obtained
from sensing devices/interfaces and is classified into two types: one for the LAE, and one for the
sensing data for the LAE’s actions. The data for the LAE can be used to perform both spatial mapping
and event mapping of the LAE into the 3D virtual space, and the sensing data can be used to perform
event mapping between the LAE and the 3D virtual environment.
The MAR system can use devices such as Web cameras and depth cameras. For Web cameras, the
sequence of images is captured and the images themselves are used as input. A depth camera is a
movement-sensing interface device for capturing the movement of an LAE. The device consists of an
RGB camera, 3D depth sensors, multi-array microphones and a motorized tilt. The device uses an IR
(infrared) camera and an IR laser projector to capture 3D depth information.
4.2.3 LAE recognizer
Recognition refers to finding and identifying the actions of an LAE in the captured image, or a sequence
of captured images, and the sensing data. When a sensor processes the output of an MAR component
and generates MAR events, the LAE recognizer identifies each event and matches the event with an
event ID from the database. The LAE recognizer analyzes signals of LAE motion or interaction from the
physical world by comparing them with a local or remote signal. Then, an event function is processed.
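As an informative sketch of this comparison step, the following Python function matches a captured motion signal against stored reference signals and returns the best-matching event identifier (EID). The similarity measure and threshold are illustrative assumptions.

    import numpy as np

    # Illustrative only: compare a captured motion signal with reference signals
    # and return the event ID (EID) of the closest match above a threshold.
    def recognize_event(signal, references, threshold=0.9):
        signal = np.asarray(signal, dtype=float)
        best_eid, best_score = None, threshold
        for eid, reference in references.items():
            reference = np.asarray(reference, dtype=float)
            score = np.dot(signal, reference) / (
                np.linalg.norm(signal) * np.linalg.norm(reference))
            if score > best_score:          # keep the closest match so far
                best_eid, best_score = eid, score
        return best_eid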
4.2.4 LAE tracker
Modules are required to pre-process the captured image, or a sequence of captured images, and the
sensing data obtained from the sensing devices and interfaces. Pre-processing examples include
chroma-keying, colour conversion and background removal. Tracking an LAE refers to finding the
location of the LAE in each image of the sequence, which can be implemented by an image processing
library.
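As an informative illustration of per-frame tracking, the following Python function locates an LAE as the centroid of a binary foreground mask produced by such pre-processing. The function name and the mask representation are assumptions.

    import numpy as np

    # Illustrative only: locate an LAE in one pre-processed frame by taking the
    # centroid of a binary foreground mask (e.g. after background removal).
    def track_lae(foreground_mask):
        ys, xs = np.nonzero(foreground_mask)
        if xs.size == 0:
            return None                                 # LAE not visible in this frame
        return float(xs.mean()), float(ys.mean())       # (x, y) in image coordinates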
4.2.5 LAE spatial mapper
An LAE in a physical world is embedded into a 3D virtual world in an MAR application. The spatial
mapper’s role is to support more natural movement of the LAE within the 3D virtual world. The LAE
spatial mapper provides spatial information, such as position, orientation and scale, between the
physical world space and the MAR scene space by applying transformations for calibration. The LAE
spatial mapper maps the physical space and the LAE into the MAR scene by supplying explicit mapping
information. The mapping information can be modelled by characterizing the translation process of a
sensor’s given information.
4.2.6 LAE event mapper
An LAE performs actions in the physical world that can trigger events. These actions can be reflected
in the 3D virtual world where an LAE is participating and interacting with virtual cameras, virtual
objects and AR content. The LAE event mapper’s role is to support the actions of the LAE. It creates
an association between an MAR event and the events which are identified or recognized by the LAE
recognizer.
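As an informative sketch, the association maintained by the event mapper can be pictured as a simple lookup from recognized event identifiers to scene actions; the identifiers and action names below are illustrative assumptions supplied by a content creator.

    # Illustrative only: content-creator-supplied association between recognized
    # event IDs (EIDs) and actions in the MAR scene.
    EVENT_MAP = {
        101: "virtual_camera.zoom_in",
        102: "virtual_object.start_animation",
    }

    def map_event(event_id, event_map=EVENT_MAP):
        return event_map.get(event_id)   # translated event for the MAR scene, or None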
4.2.7 Renderer
The results of spatial and event mappers for an LAE are transferred into the rendering module. This
module integrates the results and a 3D virtual scene and displays the final rendering result of the
integrated outcomes. The MAR scene can be specified by various capabilities of the renderer; thus, the
scene can be adapted, and simulation performance can be optimized.
4.2.8 Display and user interface
The final rendering result of the integrated outcomes of LAEs and an MAR scene can be displayed on a
variety of devices, such as monitors, head mounted displays, projectors, scent diffusers, haptic devices
and sound speakers. A user interface provides users with a way to modify the state of the MAR scene.
WebXR[9] is proposed to display a rendered scene on an HMD device[10].
4.2.9 Scene representation
The MAR scene describes all information related to LAEs in the MAR environment. This information
consists of sensing data, spatial scene, events, targets and so on. The MAR scene observes the spatiality
of physical and virtual objects and has at least one physical object and one virtual object.
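A minimal, informative sketch of this scene information as a data structure is given below; the field names are illustrative assumptions rather than a normative scene description.

    from dataclasses import dataclass, field

    # Illustrative only: field names mirror the information listed above.
    @dataclass
    class MARScene:
        physical_objects: list = field(default_factory=list)   # at least one expected
        virtual_objects: list = field(default_factory=list)    # at least one expected
        sensing_data: dict = field(default_factory=dict)
        events: list = field(default_factory=list)
        targets: list = field(default_factory=list)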
Table 1 summarizes the inputs and outputs for the components of an LAE-MAR system based on the
discussion so far.
Table 1 — Attributes of an LAE in each component
| Component | Input | Output |
| LAE capturer/sensor | Physical world signal | Sensor data related to representation of an LAE |
| LAE recognizer | Raw or processed signals representing the LAE (provided by sensors) and target object specification data (reference target to be recognized) | At least one event acknowledging the recognition |
| LAE tracker | Sensing data related to the representation of an LAE | Instantaneous values of the characteristics (pose, orientation, volume and so on) of the recognized target signals |
| LAE spatial mapper | Sensor identifier and sensed spatial information | Calibrated spatial information for an LAE in a given MAR scene |
| LAE event mapper | MAR event identifier and event information | Translated event identifier for an LAE in a given MAR scene |
| Renderer | MAR scene data | Synchronized rendering output (for example visual frame, stereo sound signals, motor commands and so on) |
| Display/UI | Render scene data/user actions | Display output/response of user actions |
5 LAE capturer and sensor
5.1 Overview
An LAE in a physical world can be captured from hardware and (optionally) software sensors that are
able to measure any kind of physical property. As referenced in ISO/IEC 18039, two types of sensors,
“capturer” and “sensor”, are used to embed the LAE into the virtual world and to perform an interaction
between the LAE and objects in the virtual world. The most common “capturer” sensors are video and/
or depth cameras which capture a physical world as a (depth) video containing live actors and entities.
The video is used to extract the LAE and its actions to be processed by the recognizer and tracker
components. The actions, especially, can affect interaction between the LAE in the physical world and
objects in the virtual world. The target physical object can generate physical or nonphysical data which
can be captured from a “sensor”. The (non) physical data can be used to detect, recognize and track the
target physical object to be augmented, and to process the interaction. The sensing data is input into
the recognizer and tracker components.
5.2 Computational view
5.2.1 General
An LAE capturer/sensor module defines the functionalities of components and their interfaces for
sensing an LAE. It specifies the services and protocols that each component exposes to the environment.
This module provides two types of sensors, “capturer” and “sensor”, for sensing an LAE.
5.2.2 LAE capturer
Various types of cameras, including general cameras, depth cameras and 360° cameras, can be used to
capture an LAE. Figure 4 shows an LAE capturer that captures the physical world, including an LAE,
as a video, depth image, and skeleton which is used in an MAR scene. An LAE can be extracted in a
pre-processing step for the LAE tracker and/or recognizer by using video processing methods such as
background removal, filtering and chroma-keying. An extracted LAE can not only be embedded into a
virtual world but also be used to identify an MAR event, that is, be used as input to the LAE tracker and/
or recognizer.
Figure 4 — LAE capturer
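As an informative illustration of the chroma-keying pre-processing mentioned above, the following Python function marks pixels that differ sufficiently from a key colour as LAE foreground; the key colour and tolerance values are illustrative assumptions.

    import numpy as np

    # Illustrative only: simple chroma-key pre-processing that returns a boolean
    # mask which is True where the LAE (non-background) is present.
    def extract_lae_mask(rgb_frame, key_rgb=(0, 255, 0), tolerance=60):
        diff = rgb_frame.astype(int) - np.array(key_rgb, dtype=int)
        distance = np.sqrt((diff ** 2).sum(axis=-1))
        return distance > tolerance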
5.2.3 LAE sensor
An LAE sensor can measure various physical properties and interpret and convert the observations
into digital signals related to the LAE. Figure 5 shows a sensor capturing actions related to LAE activities.
The captured data can be used only to compute the context in the LAE tracker and LAE recognizer, or it
can be used both to compute the context and to contribute to the composition of the scene, depending on
the nature of the physical property or the type of sensor device[11]. There are many types of sensors that
can be used to control virtual objects, virtual cameras and augmented objects by an LAE in an MAR
environment. These sensors can generate different results depending on their properties, such as position,
direction, geographic coordinate system, time, motion and so on. In particular, the output of sensors can
be filtered and regenerated as high-quality data.
Figure 5 — Sensor capturing actions related to LAE activities
As an example, Figure 6 shows how a smartphone can provide data from a gyroscope sensor,
accelerometer, digital compass, ambient light sensor, proximity sensor, barometer, Hall effect sensor,
magnetometer, pedometer and so on. A gyroscope sensor is used to measure the angular velocity, the
rotational angle per unit of time, to detect the rotation of the phone. An accelerometer sensor is
used to measure acceleration forces and dynamics to sense movement or vibration of the phone. A
digital compass sensor is used to detect the geographic coordinate system, direction and navigation
information of the phone.
Figure 6 — Examples of sensors for LAE activities
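As an informative sketch of how gyroscope output can be used, the following Python function integrates an angular velocity reading over a time step to update an estimated rotation angle; real devices typically fuse gyroscope, accelerometer and magnetometer data for a stable result.

    # Illustrative only: integrate a gyroscope's angular velocity (degrees per
    # second) over a time step dt (seconds) to update an estimated angle.
    def integrate_gyro(angle_deg, angular_velocity_dps, dt_seconds):
        return angle_deg + angular_velocity_dps * dt_seconds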
An HMD sensor is also important for an LAE to be represented naturally in the virtual world. While an
LAE is wearing an HMD device, they can see the real-like scenes of the virtual world and interact with
virtual objects by using gestures in front of depth or leap motion devices. There are three types of HMD
devices, each of them providing different sensor information for an LAE:
— First, a PC HMD is a desktop peripheral that acts as an external monitor. It provides the deepest
and most immersive VR experience. It can track position and orientation smoothly, allowing for
easy calculation of the user's position, then generate real time position movement in virtual space.
Most PC HMDs come with input devices, such as a camera for tracking position or a joystick for
interacting with objects in the virtual world.
— Second, a mobile HMD is a device that can be connected to a smartphone in order to visualize the
virtual reality and can be customized using a built-in SDK. Mobile HMDs can provide orientation
and head-tracking sensing data. This type is supported only for smartphones.
— Third, there are other, lower-cost devices that can view the stereo scenes of virtual reality, such as
a drop-in phone viewer. Drop-in phone viewers are supported by many smartphones using simple
stereo rendering and accelerometer tracking. They can only track orientation and, therefore, do not
provide as smooth or immersive a virtual reality experience as the first two.
Head tracking allows an LAE to feel present in a 360° scene due to sensors, such as gyroscope,
accelerometer and magnetometer.
5.3 Informational view
Sensors receive various types of signals in the physical world as inputs and obtain sensing data related
to the representation of an LAE in an MAR scene.
The input and output of sensors related to an LAE in an MAR scene are:
— input: physical world signals and/or device signals;
— output: sensor data related to representation of the LAE.
Physical sensor devices have a set of capabilities and parameters. Based on their properties, sensor
devices generate wide-ranging types of sensing data, including camera intrinsic parameters (for
example focal length, field of view (FOV), gain, frequency range); camera extrinsic parameters (for
example position and orientation); resolution; sampling rate; skeleton; mono, stereo or spatial audio;
and 2D, 3D (colour and depth) or multi-view video. To provide the sensing data in a universally
consistent way, sensor output consists of a defined set of elements.
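A hypothetical container for such sensor output is sketched below in Python; the element names (sid, timestamp, data_type, value) are assumptions for illustration only.

    from dataclasses import dataclass
    from typing import Any

    # Hypothetical sketch only: the element names below are assumptions.
    @dataclass
    class SensorOutput:
        sid: str            # sensor identifier (SID)
        timestamp: float
        data_type: str      # for example "skeleton", "depth", "orientation"
        value: Any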
In order to composite a 3D virtual space and an LAE, consideration needs to be paid to the LAE sensing
devices themselves. This document can use the following types of sensing devices: general cameras,
such as Web cameras, and depth cameras. For general cameras, the sequence of RGB images is captured
and the images themselves are used as the sensing data. Depth cameras provide the following: an RGB
colour image, a 3D depth image, multi-array microphones and a motorized tilt. The captured image
and/or the depth image can be used to embed the LAE into the 3D virtual space. Table 2 summarizes
the inputs and outputs of the LAE capturer and LAE sensor.
Table 2 — Input/output of LAE capturer and LAE sensor
LAE capture:
| Capture type | Devices to be used | Input | Output |
| Video | RGB camera, depth camera | Physical world including LAEs | Image/video (gray, color, depth); camera information |
| 360° video | 360° camera | Physical world including LAEs | 360° image/video; camera information |
| Stereo video | Stereo camera | Physical world including LAEs | Stereo image/video; camera information |
| Skeleton | Depth camera | Physical world including LAEs | Skeleton; camera information |
| Audio | Microphone | Physical world | Audio signal (spatial or stereo); audio information |

LAE sensor:
| Sensor type | Devices to be used | Input | Output |
| Geographic coordinate system sensor | | Location, geostationary satellite information | Electromagnetic waves, coordinate value, latitude, longitude, height |
| Gyro sensor | | Rotation motion, changes of orientation | Electromagnetic waves, coordinate value, moving directions |
| Smartphone | Accelerometer, gyroscope, magnetometer, barometer, proximity, light sensor, touch sensor | | Electromagnetic waves, coordinate value, camera, touch, acceleration, accelerometer, motion |
| Wii remote™/joystick | | User input button function | Electromagnetic waves, coordinate value, distance depth, force |
| ... | | | |