ISO/IEC 23003-4:2020
Information technology — MPEG audio technologies — Part 4: Dynamic range control
This document specifies technology for loudness and dynamic range control. It is applicable to most MPEG audio technologies. It offers flexible solutions to efficiently support the widespread demand for technologies such as loudness normalization and dynamic range compression for various playback scenarios.
Technologies de l'information — Technologies audio MPEG — Partie 4: Contrôle de gamme dynamique
INTERNATIONAL STANDARD ISO/IEC 23003-4
Second edition
2020-06
Information technology — MPEG audio technologies — Part 4: Dynamic range control
Technologies de l'information — Technologies audio MPEG — Partie 4: Contrôle de gamme dynamique
Reference number ISO/IEC 23003-4:2020(E)
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO/IEC 2020 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions and mnemonics
3.1 Terms and definitions
3.2 Mnemonics
4 Symbols (and abbreviated terms)
5 Technical overview
6 DRC decoder
6.1 DRC decoder configuration
6.1.1 Overview
6.1.2 Description of logical blocks
6.1.3 Derivation of peak and loudness values
6.2 Dynamic DRC gain payload
6.3 DRC set selection
6.3.1 Overview
6.3.2 Pre-selection based on Signal Properties and Decoder Configuration
6.3.3 Selection based on requests
6.3.4 Final selection
6.3.5 Applying multiple DRC sets
6.3.6 Album mode
6.3.7 Ducking
6.3.8 Precedence
6.4 Time domain DRC application
6.4.1 Overview
6.4.2 Framing
6.4.3 Time resolution
6.4.4 Time alignment
6.4.5 Decoding
6.4.6 Gain modifications and interpolation
6.4.7 Spline interpolation
6.4.8 Look-ahead in decoder
6.4.9 Node reservoir
6.4.10 Applying the compression
6.4.11 Dynamic equalization
6.4.12 Multi-band DRC filter bank
6.5 Sub-band domain DRC
6.6 Generation of DRC gain values at the decoder
6.6.1 Overview
6.6.2 Description of logical blocks
6.6.3 Algorithmic details
6.6.4 Combining parametric and non-parametric DRCs
6.7 Loudness equalization support
6.8 Equalization tool
6.8.1 Overview
6.8.2 EQ payloads
6.8.3 EQ filter elements
6.8.4 EQ set selection
6.8.5 Application of EQ set
6.9 Complexity management
6.9.1 General
6.9.2 DRC and downmixing complexity estimation
6.9.3 EQ complexity estimation
6.10 Loudness normalization
6.10.1 Overview
6.10.2 Loudness normalization based on target loudness
6.11 DRC in streaming scenarios
6.11.1 DRC configuration
6.11.2 Error handling
6.12 DRC configuration changes during active processing
7 Syntax
7.1 Syntax of DRC payload
7.2 Syntax of DRC gain payload
7.3 Syntax of static DRC payload
7.4 Syntax of DRC gain sequence
7.5 Syntax of parametric DRC tool
7.6 Syntax of equalization tools
8 Reference software
8.1 Reference software structure
8.1.1 General
8.2 Bitstream decoding software
8.2.1 General
8.2.2 MPEG-D DRC decoding software
9 Conformance
9.1 General
9.2 Conformance testing
9.2.1 Conformance test data and test procedure
9.2.2 Naming conventions
9.2.3 File format definitions
9.3 Encoder Conformance for MPEG-D DRC bitstreams
9.3.1 Characteristics and test procedure
9.3.2 Configuration payload
9.3.3 Interface payload
9.3.4 Frame Payload
9.3.5 Requirements depending on profiles and levels
9.4 Decoder conformance test categories and conditions
9.4.1 General
9.4.2 Conformance test categories
9.4.3 Conformance test conditions
Annex A (normative) Tables
Annex B (normative) External Interface to DRC tool
Annex C (informative) Audio codec specific information
Annex D (informative) DRC gain generation and encoding
Annex E (informative) DRC set selection and adjustment at decoder
Annex F (informative) Loudness normalization
Annex G (informative) Peak limiter
Annex H (informative) Equalization
Annex I (normative) Profiles and levels
Annex J (informative) Reference software disclaimer
Annex K (informative) Reference software
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the
IEC list of patent declarations received (see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT),
see www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology,
Subcommittee SC 29, Coding of audio, picture, multimedia, and hypermedia.
This second edition cancels and replaces the first edition (ISO/IEC 23003-4:2015), which has been
technically revised. It also incorporates the Amendments ISO/IEC 23003-4:2015/Amd.1:2017 and
ISO/IEC 23003-4:2015/Amd.2:2017. The main changes compared to the previous edition are as follows:
— Amendments to the previous edition that include enhancements, definitions of profiles and levels,
reference software, and conformance are integrated.
A list of all parts in the ISO/IEC 23003 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body.
A complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Consumer audio systems and devices are used in a large variety of configurations and acoustical
environments. For many of these scenarios, the audio reproduction quality can be improved by
appropriate control of content dynamics and loudness.
This document provides a universal dynamic range control tool that supports loudness normalization.
The DRC tool offers a bitrate efficient representation of dynamically compressed versions of an audio
signal. This is achieved by adding a low-bitrate DRC metadata stream to the audio signal. The DRC tool
includes dedicated sections for clipping prevention, ducking, and for generating a fade-in and fade-out
to supplement the main dynamic range compression functionality. The DRC effects available at the DRC
decoder are generated at the DRC encoder side. At the DRC decoder side, the audio signal may be played
back without applying the DRC tool, or an appropriate DRC tool effect is selected and applied based on
the given playback scenario.
Loudness normalization is fully integrated with DRC and peak control to avoid clipping. A metadata-
controlled equalization tool is provided to compensate for playback scenarios that impact the spectral
balance, such as downmix or DRC. Furthermore, the DRC tool supports metadata-based loudness
equalization to compensate for the effect of playback level changes on the spectral balance.
The International Organization for Standardization (ISO) and International Electrotechnical
Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may
involve the use of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of these patent rights.
The holders of these patent rights have assured ISO and IEC that they are willing to negotiate licences
under reasonable and non-discriminatory terms and conditions with applicants throughout the world.
In this respect, the statements of the holders of these patent rights are registered with ISO and IEC.
Information may be obtained from the patent database available at www.iso.org/patents.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.
INTERNATIONAL STANDARD ISO/IEC 23003-4:2020(E)
Information technology — MPEG audio technologies —
Part 4:
Dynamic range control
1 Scope
This document specifies technology for loudness and dynamic range control. It is applicable to most
MPEG audio technologies. It offers flexible solutions to efficiently support the widespread demand for
technologies such as loudness normalization and dynamic range compression for various playback
scenarios.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12, Information technology — Coding of audio-visual objects — Part 12: ISO base media
file format
ISO/IEC 14496-26:2010, Information technology — Coding of audio-visual objects — Part 26: Audio
Conformance
ISO/IEC 23008-3:2019, Information technology — High efficiency coding and media delivery in
heterogeneous environments — Part 3: 3D audio
ISO/IEC 23091-3, Information technology — Coding-independent code points — Part 3: Audio
3 Terms, definitions and mnemonics
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-12 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1.1
DRC sequence
series of DRC gain values that can be applied to one or more audio channels
3.1.2
DRC set
defined set of DRC sequences that produce a desired effect if applied to the audio signal
3.1.3
album
collection of audio recordings that are mastered in a consistent way
Note 1 to entry: For example, a collection of songs released on a Compact Disc traditionally belongs in this
category.
3.1.4
conformance test bitstream
bitstream used for testing the conformance of MPEG-D DRC compliant audio decoders
3.1.5
conformance test case
conformance test category and a combination of one or more conformance test conditions for which a
conformance test sequence is provided
3.1.6
conformance test condition
condition which applies to properties of a conformance test sequence in order to test a certain
functionality of the MPEG-D DRC decoder
3.1.7
conformance test criteria
one or more conformance test tools and corresponding parameters applied to verify the conformance
for a certain conformance test sequence
3.1.8
conformance test sequence
set of a conformance test bitstream, a decoder setting, an input audio file and a corresponding reference
file
3.1.9
decoder input parameters
input parameters that are supplied to an MPEG-D DRC decoder in addition to a conformance test
bitstream, a decoder interface bitstream and an input audio file
3.1.10
decoder setting
combination of a decoder interface bitstream and decoder input parameters that are supplied to an
MPEG-D DRC decoder
3.1.11
input DRC set selection parameters
input parameter set for testing of a DRC gain decoder instance
Note 1 to entry: This parameter set is solely used for conformance testing in the context of the DRC gain decoder
conformance test category (DrcGainDec).
3.1.12
reference audio file
decoded counterpart of a conformance test bitstream, a decoder setting and an input audio file
3.1.13
reference DRC set selection parameters
decoded counterpart of a conformance test bitstream and a decoder setting fed to the DRC set selection
process
Note 1 to entry: This parameter set is an intermediate result of an MPEG-D DRC compliant decoder
implementation solely used for conformance testing in the context of the DRC selection process test category
(DrcSelProc).
3.1.14
reference file
reference audio file or reference DRC set selection parameters
3.2 Mnemonics
bslbf bit string, left bit first, where “left” is the order in which bit strings are written in the
ISO/IEC 14496 series
NOTE Bit strings are written as a string of 1s and 0s within single quote marks, for
example '1000 0001'. Blanks within a bit string are for ease of reading and have no
significance.
byte_align() number of bits to fill for byte alignment at the offset of n bits:
byte_align(n) = 8 · ceil(n/8) − n
uimsbf unsigned integer, most significant bit first
vlclbf variable length code, left bit first, where “left” refers to the order in which the
variable length codes are written
bit(n) a bit string with n bits in the same format as bslbf
unsigned int(n) an unsigned integer with n bits in the same format as uimsbf
signed int(n) a signed integer with n bits, most significant bit first
mod modulo operator: (x mod y) = x − y · floor(x/y)
sizeof(x) size operator that returns the bit size of a field x
TRUE/FALSE values of Boolean data type, which correspond to numerical 1 and 0, respectively
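The two arithmetic mnemonics above can be checked numerically. The following is a minimal Python sketch and not part of the standard: the function names simply mirror the mnemonics, and integer arithmetic stands in for ceil and floor, which is equivalent for integer n and positive integer y.

```python
def byte_align(n: int) -> int:
    """Number of fill bits needed to reach the next byte boundary at bit offset n."""
    # 8 * ceil(n / 8) - n, using the identity ceil(n / 8) == -(-n // 8) for integers.
    return 8 * -(-n // 8) - n


def mod(x: int, y: int) -> int:
    """Modulo as defined in 3.2: x - y * floor(x / y)."""
    # Python's // operator floors toward negative infinity, matching floor().
    return x - y * (x // y)
```

With this definition the result of mod takes the sign of y, so a negative x with a positive y yields a non-negative remainder.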
4 Symbols
a_i filter coefficient
b band index of DRC filter bank (starting at 0)
b_i filter coefficient
deltaTmin smallest permitted DRC gain sample interval in units of the audio sample interval
f_c cross-over frequency in Hz
f_c,norm cross-over frequency expressed as fraction of the audio sample rate
f_c,norm,SB(s) cross-over frequency of audio decoder sub-band s expressed as fraction of the audio sample rate
NOTE The cross-over frequency is the upper band edge frequency of the sub-band.
f_s audio sample rate in Hz
NOTE If an audio decoder is present, it is the sample rate of the decoded time-domain audio signal.
M_DRC DRC frame size in units of the audio sample interval 1/f_s
N_DRC maximum permitted number of DRC samples per DRC frame
NOTE Identical to the number of intervals with a duration of deltaTmin per DRC frame.
N_Codec codec frame size in units of the audio sample interval 1/f_s
π ratio of a circle’s circumference to its diameter
s audio decoder sub-band index (starting at 0)
z complex variable of the z-transform
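As a worked illustration of how some of these symbols relate, the sketch below derives the maximum number of DRC samples per frame from the DRC frame size and deltaTmin, following the NOTE that it equals the number of deltaTmin intervals per DRC frame. It also normalizes a cross-over frequency by the audio sample rate. The function names are hypothetical, the requirement that the frame size be a whole multiple of deltaTmin is an assumption of this sketch, and the numeric values in the usage note are illustrative only.

```python
def max_drc_samples_per_frame(m_drc: int, delta_t_min: int) -> int:
    """N_DRC: number of deltaTmin intervals that fit in one DRC frame of m_drc samples."""
    if delta_t_min <= 0 or m_drc % delta_t_min != 0:
        # Assumption of this sketch: the frame holds a whole number of intervals.
        raise ValueError("frame size must be a positive multiple of deltaTmin")
    return m_drc // delta_t_min


def normalized_crossover(f_c_hz: float, f_s_hz: float) -> float:
    """f_c,norm: cross-over frequency as a fraction of the audio sample rate."""
    return f_c_hz / f_s_hz
```

For example, an assumed frame of 1024 samples with deltaTmin = 32 gives 32 intervals per frame, and an assumed 1200 Hz cross-over at 48 kHz corresponds to a normalized value of 0.025.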
5 Technical overview
The technology described in this document is called the “DRC tool”. It provides efficient control of
dynamic range, loudness, and clipping based on metadata generated at the encoder. The decoder can
choose to selectively apply the metadata to the audio signal to achieve a desired result. Metadata for
dynamic range compression consists of encoded time-varying gain values that can be applied to the
audio signal. Hence, the main blocks of the DRC tool include a DRC gain encoder, a DRC gain decoder, a
DRC gain modification block, and a DRC gain application block. These blocks are exercised on a frame-
by-frame basis during audio processing. In addition to encoded time-varying gain values, the DRC gain
decoder can also receive parametric DRC metadata for generation of time-varying gain values at the
decoder. Various DRC configurations can be conveyed in a separate bitstream element, such as
configurations for a downmix or combined DRCs. Based on the playback scenario and the applicable DRC
configurations, the DRC set selection block decides which DRC gains to apply to the audio signal.
Moreover, the DRC tool supports loudness normalization based on loudness metadata.
A typical system for loudness and dynamic range control in the time domain is shown in Figure 1. A
more complex system including downmixer and peak limiter is shown in Figure 2. The decoder part of
the DRC tool is driven by metadata that efficiently represents the DRC gain samples and parameters for
interpolation. The gain samples can be updated as fast as necessary to accurately represent gain
changes down to at least 1 ms update intervals. In the following, the decoder part of the DRC tool is
referred to as “DRC decoder”, which includes everything except the audio decoder and associated
bitstream de-multiplexing.
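To make the idea of applying time-varying gain values concrete, the following Python sketch scales audio samples by a gain curve defined at a few sample positions (nodes). It is a simplified illustration under stated assumptions, not the decoding process of this document: the gains are assumed to be given in dB, the curve between nodes is linearly interpolated (the document specifies its own interpolation in 6.4.6 and 6.4.7), and all function and parameter names are hypothetical.

```python
def gain_db_at(t, positions, gains_db):
    """Linearly interpolate a gain in dB at sample index t between gain nodes.

    positions must be strictly increasing; outside the node range the
    nearest node value is held (an assumption of this sketch).
    """
    if t <= positions[0]:
        return gains_db[0]
    if t >= positions[-1]:
        return gains_db[-1]
    for i in range(len(positions) - 1):
        t0, t1 = positions[i], positions[i + 1]
        if t0 <= t <= t1:
            g0, g1 = gains_db[i], gains_db[i + 1]
            return g0 + (g1 - g0) * (t - t0) / (t1 - t0)
    raise ValueError("positions must be sorted in increasing order")


def apply_gains(samples, positions, gains_db):
    """Multiply each sample by the linear gain 10^(g_dB / 20) at its index."""
    return [x * 10.0 ** (gain_db_at(t, positions, gains_db) / 20.0)
            for t, x in enumerate(samples)]
```

With nodes at sample 0 (0 dB) and sample 4 (-20 dB), a constant input of 1.0 is left unchanged at the first sample and attenuated to one tenth of its amplitude at the last.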
Figure 1 — Block diagram of a typical system with audio decoder and DRC tool modules to
achieve loudness normalization (LN) and dynamic range control
Figure 2 — Block diagram of a more complex system including downmixer and peak limiter
(TD = time-domain, SD = subband-domain)
The DRC tool provides support for loudness equalization, sometimes called “loudness compensation”,
that can be applied to compensate for the effect of the playback level on the spectral balance. For this
purpose, time-varying loudness information can be recovered from DRC gain sequences to dynamically
control the compensation module. While the compensation module is out of scope, the interface
describes in which frequency ranges the loudness information should be applied.
A flexible tool for generic metadata-controlled equalization is provided. The tool can be used to reach
the desired spectral balance of the reproduced audio signal depending on a wide variety of playback
scenarios, such as downmix, DRC, or playback room size. It can operate in the sub-band domain of an
audio decoder and in the time domain.
The DRC tool is specified in Clause 6. The tool may be subject to profiles and levels that shall be in
accordance with Annex I. The bitstream field decoding of the DRC tool shall be in accordance with
Annex A. If an interface for external parameter control of the DRC tool is used, it shall conform to
Annex B.
6 DRC decoder
6.1 DRC decoder configuration
6.1.1 Overview
The DRC configuration information can be received in-stream using the static payloads uniDrcConfig()
and loudnessInfoSet() described below, or it can be delivered by a higher layer, such as in ISO/IEC
14496-12 (see Table 1). The basic decoding process of the static information is virtually the same. The
difference consists mainly in a few syntax changes and reduced field sizes to increase the bit rate
efficiency of the in-stream configuration. The syntax of the in-stream static payload is given in 7.3. The
associated metadata encoding is given in A.6. The static DRC payload is evaluated once at the beginning
of the decoding process and it is monitored subsequently. For static DRC payload changes during
playback, see 6.12.
Table 1 — Overview of configuration (setup) and separate metadata track in ISO/IEC 14496-12
Track | Sample entry code | Setup (in sample entry) | Track reference | Sample format
Audio track | As specified for the audio codec in use (unchanged) | DRCInstructions box using negative values for drcLocation | "adrc" referring to the metadata tracks carrying gain values | As specified for the audio codec in use (unchanged)
Metadata track | "unid" | (none) | (none) | Each sample is a uniDrcGain() payload
The static payload is divided into several logical blocks:
— channelLayout();
— downmixInstructions(), downmixInstructionsV1();
— drcCoefficientsBasic(), drcCoefficientsUniDrc(), drcCoefficientsUniDrcV1();
— drcInstructionsBasic(), drcInstructionsUniDrc(), drcInstructionsUniDrcV1();
— loudnessInfo(), loudnessInfoV1();
— drcCoefficientsParametricDrc();
— parametricDrcInstructions();
— loudEqInstructions();
— eqCoefficients();
— eqInstructions().
Except for the channelLayout(), drcCoefficientsParametricDrc(), and eqCoefficients(), multiple
instances of a logical block can appear. The DRC decoder combines the information of the matching
instances of the logical blocks for a given playback scenario. Matching instances are found by matching
several identifiers (labels) contained in the blocks.
From the static payload, the decoder can also extract information about the effect of a particular DRC
and various associated loudness information, if present. If multiple DRCs are available, this information
can be used to select a particular DRC based on target criteria for dynamics and loudness (see 6.3).
uniDrcConfig() contains all blocks except for the loudnessInfo() blocks which are bundled in
loudnessInfoSet(). The last part of the uniDrcConfig() payload can include future extension payloads. In
the event that a uniDrcConfigExtType value is received that is not equal to UNIDRCCONFEXT_TERM, the
DRC tool parser shall read and discard the bits (otherBit) of the extension payload. Similarly, the last
part of the loudnessInfoSet() payload can include future extension payloads. In the event that a
loudnessInfoSetExtType value is received that is not equal to UNIDRCLOUDEXT_TERM, the DRC tool
parser shall read and discard the bits (otherBit) of the extension payload. Each extension payload type
in uniDrcConfig() or loudnessInfoSet() shall not appear more than once in the bitstream if not stated
otherwise. An extension payload of type UNIDRCCONFEXT_V1 shall precede an extension payload of
type UNIDRCCONFEXT_PARAM_DRC in the bitstream if both payloads are present. For
ISO/IEC 14496-12, configuration extension payloads are provided according to Table 76.
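The extension handling rules in the paragraph above can be sketched as a small check over the sequence of extension types found in uniDrcConfig(). This is a minimal Python illustration under stated assumptions: extension types are represented by their symbolic names as strings (the numeric codes and payload field widths are defined in Clause 7 and Annex A, not here), the function name is hypothetical, and the "if not stated otherwise" exception to the uniqueness rule is not modelled.

```python
TERM = "UNIDRCCONFEXT_TERM"
V1 = "UNIDRCCONFEXT_V1"
PARAM_DRC = "UNIDRCCONFEXT_PARAM_DRC"
KNOWN_TYPES = {V1, PARAM_DRC}  # types this hypothetical parser understands


def process_config_extensions(ext_types):
    """Return (handled, discarded) extension types, stopping at the terminator.

    Raises ValueError if a type appears twice or if UNIDRCCONFEXT_V1
    appears after UNIDRCCONFEXT_PARAM_DRC.
    """
    seen, handled, discarded = [], [], []
    for ext_type in ext_types:
        if ext_type == TERM:
            break
        if ext_type in seen:
            raise ValueError(f"{ext_type} appears more than once")
        if ext_type == V1 and PARAM_DRC in seen:
            raise ValueError("UNIDRCCONFEXT_V1 must precede UNIDRCCONFEXT_PARAM_DRC")
        seen.append(ext_type)
        # Unknown (future) extension types are read and discarded, not rejected.
        (handled if ext_type in KNOWN_TYPES else discarded).append(ext_type)
    return handled, discarded
```

A real parser would additionally consume the bits (otherBit) of each discarded payload so that parsing can continue at the next extension.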
The top level fields of uniDrcConfig() include the audio sample rate, which is a fundamental parameter
for the decoding process (if not present, the audio sample rate is inherited from the employed audio
codec). Moreover, the top level fields of uniDrcConfig() include the number of instances of each of the
logical blocks, except for the channelLayout() block which appears only once. The top level fields of
loudnessInfoSet() only include the number of loudnessInfo() blocks. The logical blocks are described in
the following.
6.1.2 Description of logical blocks
6.1.2.1 channelLayout()
The channelLayout() block includes the channel count of the audio signal in the base layout. It may also
include the base layout unless it is specified elsewhere. For use cases where the base audio signal
represents objects or other audio content, the base channel count represents the total number of base
content channels. The base channel count value shall serve as the value of baseChannelCount for
parsing the downmixInstructions(), downmixInstructionsV1(), drcInstructionsUniDrc(),
drcInstructionsUniDrcV1() and eqInstructions() payloads as specified in Clause 7.
6.1.2.2 downmixInstructions() and downmixInstructionsV1()
This block includes a unique non-zero downmix identifier (downmixId) that can be used externally to
refer to this downmix. The targetChannelCount specifies the number of channels after downmixing to
the target layout. It may also contain downmix coefficients, unless they are specified elsewhere. For use
cases where the base audio signal represents objects or other audio content, the downmixId can be used
to refer to a specific target channel configuration of a present rendering engine. In contrast to
downmixInstructions(), the downmixInstructionsV1() payload includes an offset for all downmix
coefficients and the coefficient decoding does not depend on the LFE channel assignment. The
downmixInstructions() box for ISO/IEC 14496-12 contains the corresponding metadata of either one of
the in-stream payloads as indicated by the version parameter of the box.
6.1.2.3 drcCoefficientsBasic(), drcCoefficientsUniDrc(), and drcCoefficientsUniDrcV1()
A drcCoefficients block describes all available DRC gain sequences in one location. The block can have
the basic format or the uniDrc format. The basic format, drcCoefficientsBasic(), contains a subset of
information included in drcCoefficientsUniDrc() that can be used to describe DRCs other than the ones
specified in this document. drcCoefficientsUniDrc() contains for each sequence several indicators on
how it is encoded, the time resolution, time alignment, the number of DRC sub-bands and
corresponding crossover frequencies and DRC characteristics. The crossover frequencies shall increase
with increasing band index. Alternatively, explicit indices in a decoder sub-band domain can be
specified for the assignment of DRC sub-bands. The sub-band indices shall also increase with increasing
band index. If the DRC gains are applied in the time-domain by using the multi-band DRC filter bank
specified in 6.4.12, explicit index signalling is not allowed. The index of the DRC characteristic indicates
which compression characteristic was used to produce the gain sequence. The DRC location describes
where these gain sequences can be found in the bitstream. The DRC gain sequences in that location are
inherently enumerated according to their order of appearance starting with 1.
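Two of the constraints in this paragraph lend themselves to a short check: crossover frequencies increase with the band index, and gain sequences in a location are numbered by order of appearance starting with 1. The Python sketch below illustrates both. The function names are hypothetical, and reading "shall increase" as strictly increasing is an assumption of this sketch.

```python
def crossovers_increase(crossover_freqs):
    """True if each crossover frequency is greater than the one for the previous band."""
    return all(lo < hi for lo, hi in zip(crossover_freqs, crossover_freqs[1:]))


def enumerate_gain_sequences(sequences):
    """Map gain sequences to their implicit index, starting with 1 in order of appearance."""
    return {index: seq for index, seq in enumerate(sequences, start=1)}
```

The same monotonicity check applies unchanged to explicit decoder sub-band indices, which are also required to increase with the band index.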
The DRC location field encoding depends on the audio codec. A codec specification may include this
specification, and use values 1 to 4 to refer to codec-specific locations as indicated in Table 2. For
example, for AAC (ISO/IEC 14496-3), the codec-specific values of the DRC location field are encoded as
shown in Table 3.
Table 2 — Encoding of drcLocation for in-stream payload
drcLocation n | Payload
0 | reserved
1 | Location 1 (Codec-specific use)
2 | Location 2 (Codec-specific use)
3 | Location 3 (Codec-specific use)
4 | Location 4 (Codec-specific use)
n > 4 | reserved
Table 3 — Codec-specific encoding of drcLocation for MPEG-4 Audio
drcLocation n | Payload
1 | uniDrc() (defined in Clause 7)
2 | dyn_rng_sgn[i] / dyn_rng_ctl[i] in dynamic_range_info() (defined in ISO/IEC 14496-3:2009 subpart 4)
3 | compression_value in MPEG4_ancillary_data() (defined in ISO/IEC 14496-3:2009/AMD 4:2013)
4 | reserved
The DRC frame size can optionally be specified. It shall be provided if the DRC frame size deviates from
the default size specified in 6.4.2. If not specified, the default frame size is used.
The in-stream drcCoefficient syntax is given in Table 65, Table 67 and Table 68. The syntax for the
corresponding block for ISO/IEC 14496-12 (ISO base media file format) is shown in Table 66 and
Table 69. The corresponding blocks carry essentially the same information. Values that are identically
included in both blocks are coded the same way except for drcLocation.
In ISO base media file format (see ISO/IEC 14496-12), for each codec that can be carried in MP4 files
and that also carries DRC information, there is a specific definition of how the location is coded, using
the DRC_location field (see Table 4). A negative value of DRC_location indicates that a DRC payload is in
an associated meta-data track. That track is the n-th linked via a track reference of type "adrc" (audio
DRC) from the audio track, where n = abs(DRC_location), and the sample-entry type in the meta-data
track indicates in which format the coefficients are stored. Table 3 defines the specific entries of the
drcLocation field for AAC. Some example use cases are discussed in C.10.
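The interpretation of the location value in the ISO base media file format, as described above and listed in Table 4, can be summarized as a small classifier. This Python sketch is illustrative only: the function name and the returned strings are hypothetical, and the meaning of the codec-specific values 1 to 4 depends on the codec in use.

```python
def classify_drc_location(n: int) -> str:
    """Describe where the DRC payload is located for a given drcLocation value n."""
    if n < 0:
        # The payload is in the |n|-th metadata track linked via an "adrc" track reference.
        return f"metadata track {abs(n)} linked via 'adrc'"
    if 1 <= n <= 4:
        return f"codec-specific location {n}"
    # 0 and all values above 4 are reserved.
    return "reserved"
```

For instance, a value of -2 refers to the second metadata track linked from the audio track, while 0 and 5 are both reserved.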
If the uniDrc() payload is stored in a separate track in the ISO base media file format
(ISO/IEC 14496-12), then the track is a metadata track with the sample entry identifier "unid" (uniDrc),
with no required boxes added to the sample entry. The time synchronization with the linked audio track
is the same as if the payload was in-stream.
Table 4 — Encoding of drcLocation for ISO/IEC 14496-12
drcLocation n | Payload
n < 0 | DRC payload located in |n|-th linked meta-data track
0 | reserved
1 | Location 1 (Codec-specific use)
2 | Location 2 (Codec-specific use)
3 | Location 3 (Codec-specific use)
4 | Location 4 (Codec-specific use)
n > 4 | reserved
The drcCoefficientsUniDrcV1() payload is defined in Table 68. It contains the same information as
drcCoefficientsUniDrc() except for the assignment of DRC gain sequences to gain sets and the optional
specification of a number of parametric DRC characteristics. The drcCoefficientsUniDrc() payload
assigns gain sequences in order of transmission. In contrast, the drcCoefficientsUniDrcV1() payload
maps a gain sequence by index to gainSets. The latter permits referring to the same gain sequence for
multiple DRC bands, which is not possible when using drcCoefficientsUniDrc(). If a
drcCoefficientsUniDrcV1() payload is present, any drcCoefficientsUniDrc() payload for the same
location is ignored.
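The precedence rule at the end of this paragraph can be sketched as a filter over the coefficient payloads that were received. In this Python illustration, the class, its field names, and the use of an integer version label (0 for drcCoefficientsUniDrc(), 1 for drcCoefficientsUniDrcV1()) are assumptions made for the example and are not taken from the syntax in Clause 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrcCoefficients:
    version: int       # 0 = drcCoefficientsUniDrc(), 1 = drcCoefficientsUniDrcV1() (sketch labels)
    drc_location: int  # value of the drcLocation field


def effective_coefficients(payloads):
    """Drop version-0 payloads whose location is also covered by a V1 payload."""
    v1_locations = {p.drc_location for p in payloads if p.version == 1}
    return [p for p in payloads
            if p.version == 1 or p.drc_location not in v1_locations]
```

A version-0 payload for a location without a V1 counterpart is kept, so both formats can coexist as long as they describe different locations.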
The drcCoefficientsUniDrcV1() payload can also include information about dynamic equalization filters
if the field shapingFiltersPresent==1. There can be a number of filters that are indexed in order of
appearance. The DRC sets defined in drcInstructionsUniDrcV1() can refer to specific filters using their
indices (see 6.4.11).
6.1.2.4 drcInstructionsBasic(), drcInstructionsUniDrc(), and drcInstructionsUniDrcV1()
A drcInstructions block includes information about one specific DRC set that can be applied to achieve a
desired effect. This block can have the basic format or the uniDrc format. The basic format,
drcInstructionsBasic(), contains a subset of information included in drcInstructionsUniDrc() that can be
used to describe DRCs other than the ones specified in this document. The information included in
drcInstructionsUniDrc() consists mainly of pre-defined description elements such as the DRC set effect
and the DRC gain sequences that are applied. The drcSetEffect field contains several effect bits as listed
in Table A.45. Multi
...







