Information technology — Coded representation of immersive media — Part 4: MPEG-I immersive audio

This document specifies technology that supports real-time interactive rendering of an immersive virtual or augmented reality audio presentation while allowing the user six-degrees-of-freedom (6DoF) movement within the audio scene. It defines metadata to support this rendering and a bitstream syntax that enables efficient storage and streaming of immersive audio content.

Technologies de l'information — Représentation codée de média immersifs — Partie 4: Audio immersif MPEG-I

General Information

Status: Not Published
Publication Date: 02-Nov-2025
Current Stage: 6060 - International Standard published
Start Date: 03-Nov-2025
Due Date: 10-Nov-2025
Completion Date: 03-Nov-2025
Ref Project
Standard
ISO/IEC 23090-4:2025 - Information technology — Coded representation of immersive media — Part 4: MPEG-I immersive audio
Released: 3.11.2025
English language
625 pages

Standards Content (Sample)


International Standard
ISO/IEC 23090-4
First edition
2025-11
Information technology — Coded representation of immersive media — Part 4: MPEG-I immersive audio
Technologies de l'information — Représentation codée de média immersifs — Partie 4: Audio immersif MPEG-I
Reference number
© ISO/IEC 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2025 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
3.2 Mnemonics
3.3 Abbreviated terms
4 Overview
5 MPEG-I immersive audio transport
5.1 Overview
5.2 Definitions
5.3 MHAS syntax
5.3.1 Audio stream
5.3.2 Audio stream packet
5.4 Semantics
6 MPEG-I Immersive audio renderer
6.1 Definitions
6.2 Syntax
6.2.1 General
6.2.2 Generic codebook
6.2.3 Directivity payloads syntax
6.2.4 Diffraction payload syntax
6.2.5 Voxel payload syntax
6.2.6 Early reflection payload syntax
6.2.7 Portal payload syntax
6.2.8 Reverberation payload syntax
6.2.9 Audio plus payload syntax
6.2.10 Dispersion payload syntax
6.2.11 Scene plus payload syntax
6.2.12 Airflow payload syntax
6.2.13 Granular payload syntax
6.2.14 RasterMap payload syntax
6.2.15 Support elements
6.3 Data structure
6.3.1 General
6.3.2 Renderer payloads data structure
6.3.3 Generic codebook
6.4 Renderer framework
6.4.1 Control workflow
6.4.2 Rendering workflow
6.5 Geometry data decompression
6.5.1 General
6.5.2 Metadata extraction
6.5.3 Geometry
6.5.4 Materials
6.6 Renderer stages
6.6.1 Effect activator
6.6.2 Acoustic environment assignment
6.6.3 Granular synthesis
6.6.4 Reverberation
6.6.5 Portals
6.6.6 Early reflections
6.6.7 Airflow simulation
6.6.8 DiscoverSESS
6.6.9 Occlusion
6.6.10 Diffraction
6.6.11 Voxel-based occlusion and diffraction
6.6.12 Multi-Path voxel-based diffraction with RasterMaps
6.6.13 Voxel-based early reflections
6.6.14 Metadata culling
6.6.15 Heterogeneous extent
6.6.16 Directivity
6.6.17 Distance
6.6.18 Directional focus
6.6.19 Consolidation of render items
6.6.20 Equalizer (EQ)
6.6.21 Low-complexity early reflections (LC-ERs)
6.6.22 Fade
6.6.23 Single point higher order ambisonics (SP-HOA)
6.6.24 Homogeneous extent
6.6.25 Panner
6.6.26 Multi-point higher order ambisonics (MP-HOA)
6.6.27 Low-complexity MP-HOA
6.7 Spatializer
6.7.1 Binaural spatializer
6.7.2 Adaptive loudspeaker rendering
6.8 Limiter
6.8.1 General
6.8.2 Data elements and variables
6.8.3 Description
6.9 Interface for audio utilization information
6.9.1 General
6.9.2 Syntax and semantics of an interface for renderer audio utilization
Annex A (normative) Tables and additional algorithm details
A.1 Panner default output positions
A.2 Adaptive loudspeaker rendering calibration guide
A.3 RIR analysis: loudspeaker source directivity factor
A.4 Default acoustic environment presets
A.5 VR filter design initialization vector
...
