Information technology — Coding of audio-visual objects — Part 26: Audio conformance

This document specifies how tests can be designed to verify whether compressed data and decoders meet requirements specified by ISO/IEC 14496-3. Encoders are not addressed specifically. An ISO/IEC 14496 encoder generates compressed data compliant with the syntactic and semantic bitstream payload requirements specified in ISO/IEC 14496-3. This document summarises the requirements, cross references them to characteristics, and defines how conformance with them can be tested. Guidelines are given on constructing tests to verify decoder conformance. Some examples of compressed data implemented according to these guidelines are provided as an electronic annex to this document usually together with their uncompressed counterparts (reference waveforms).

Technologies de l'information — Codage des objets audiovisuels — Partie 26: Conformité audio

General Information

Status
Published
Publication Date
07-Nov-2024
Current Stage
6060 - International Standard published
Start Date
08-Nov-2024
Due Date
15-Jan-2025
Completion Date
08-Nov-2024

Standard
ISO/IEC 14496-26:2024 - Information technology — Coding of audio-visual objects — Part 26: Audio conformance Released:11/8/2024
English language
253 pages

Standards Content (Sample)


International Standard
ISO/IEC 14496-26
Second edition
2024-11
Information technology — Coding of audio-visual objects — Part 26: Audio conformance
Technologies de l'information — Codage des objets audiovisuels — Partie 26: Conformité audio
Reference number
© ISO/IEC 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2024 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Conformance Data
3.2 Conformance Tools
3.3 Conformance Test Sequences
3.4 Compressed Data
3.5 Reference Waveforms
4 Conformance Points
5 Profiles
6 Conformance data
6.1 File name conventions
6.2 Content
7 Audio Object Types
7.1 General
7.2 Null
7.3 AAC-based scalable configurations
7.4 AAC (main, LC, ER LC, SSR, LTP, ER LTP, ER LD, scalable, ER scalable)
7.5 TwinVQ and ER_TwinVQ
7.6 ER BSAC
7.7 CELP
7.8 ER CELP
7.9 HVXC
7.10 ER HVXC
7.11 ER HILN and ER Parametric
7.12 TTSI
7.13 General MIDI
7.14 Wavetable Synthesis
7.15 Algorithmic Synthesis and AudioFX
7.16 Main Synthetic
7.17 SBR
7.18 PS (Parametric Stereo)
7.19 SSC (SinuSoidal Coding)
7.20 DST (Lossless coding of oversampled audio)
7.21 Layer-3
7.22 ALS (Audio lossless coding)
7.23 SLS (Scalable Lossless Coding)
7.24 Layer-1 and Layer 2
7.25 Low Delay SBR
8 Audio EP tool
8.1 Compressed data
8.2 Decoders
9 Audio Composition
9.1 AudioBIFS v1
9.2 Advanced Audio BIFS nodes
9.3 AudioBIFS v3 Nodes
10 MPEG-4 audio transport stream
10.1 General
10.2 Compressed Data
10.3 Decoders
11 Upstream
11.1 Compressed data
11.2 Decoders
12 Conformance test sequence assignment to profiles and levels
12.1 Overview
12.2 Audio
12.3 Systems
Annex A (informative) Complexity measurement criteria and tool for level definitions of algorithmic synthesis and AudioFX Object Type
Annex B (informative) Test bitstreams for the CELP object type

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of
any claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC
had not received notice of (a) patent(s) which may be required to implement this document. However,
implementers are cautioned that this may not represent the latest information, which may be obtained
from the patent database available at www.iso.org/patents and https://patents.iec.ch. ISO and IEC shall
not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This second edition cancels and replaces the first edition (ISO/IEC 14496-26:2010), which has been
technically revised. It also incorporates the Amendments ISO/IEC 14496-26:2010/Amd 2:2010,
ISO/IEC 14496-26:2010/Amd 3:2014, ISO/IEC 14496-26:2010/Amd 4:2016, ISO/IEC 14496-
26:2010/Amd 5:2018 and the Technical Corrigenda ISO/IEC 14496-26:2010/Cor 2:2011, ISO/IEC
14496-26:2010/Cor 3:2011, ISO/IEC 14496-26:2010/Cor 4:2011, ISO/IEC 14496-26:2010/Cor 5:2012,
ISO/IEC 14496-26:2010/Cor 6:2013, ISO/IEC 14496-26:2010/Cor 7:2013 and ISO/IEC 14496-
26:2010/Cor 8:2015.
The main changes are as follows:
- Introduced additional BSAC conformance bitstreams to assist in implementing Terrestrial-DMB
products
- Correction according to bitstreams and wave files for BSAC
- Correction according to channel_pair_element()
- Correction for MPEG-4 ALS floating point bitstreams
- Correction for MPEG-4 SLS test sequences
- Correction for BSAC and HE-AAC V2 profile
- Correction for ER AAC test sequences
- New conformance bitstreams for Low Delay AAC V2 profile
- Correction for AAC block length parameter and test sequences
- Additional multichannel conformance data for AAC and HE-AAC
- Additional levels of ALS simple profile and SBR enhancements
- Additional levels of MPEG-4 ALS simple profile supporting high-resolution audio

A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
Introduction
ISO/IEC 14496-3 specifies coded representations of audio information. ISO/IEC 14496-3 allows for large
flexibility, achieving suitability of ISO/IEC 14496 for many different applications. The flexibility is obtained by
including parameters in the bitstream that define the characteristics of coded bitstreams. Examples are the audio
sampling frequency, bitrate parameters, synchronisation timestamps, the association of bitstreams and synthetic
objects within objects.
Characteristics of compressed data and decoders are defined for ISO/IEC 14496-3. The compressed data
characteristics define the subset of the standard that is exploited in the compressed data. Examples are the
applied values or range of the sampling rate and bitrate parameters. Decoder characteristics define the properties
and capabilities of the applied decoding process. An example of a property is the applied arithmetic accuracy. The
capabilities of a decoder specify which compressed data the decoder can decode and reconstruct, by defining the
subset of the standard that may be exploited in the decodable compressed data. Compressed data can be decoded
by a decoder if the characteristics of the compressed data are within the subset of the standard specified by the
decoder capabilities.
The tests described in this document can be used for various purposes such as:
• manufacturers of encoders, and their customers, can use the tests to verify whether the encoder produces
bitstreams compliant with ISO/IEC 14496-3.
• manufacturers of decoders and their customers can use the tests to verify whether the decoder meets the
requirements specified in ISO/IEC 14496-3 for the claimed decoder capabilities.
• manufacturers and customers of terminals supporting interactive, broadcast and local sessions over a
multitude of transport protocols and networks, can use the tests to verify whether the claimed functionalities
are compliant with ISO/IEC 14496-6.
• manufacturers of test equipment, and their customers, can use the tests to verify compliance with ISO/IEC
14496-3.
International Standard ISO/IEC 14496-26:2024(en)

Information technology — Coding of audio-visual objects — Part 26: Audio conformance
1 Scope
This document specifies how tests can be designed to verify whether compressed data and decoders meet
requirements specified by ISO/IEC 14496-3. Encoders are not addressed specifically. An ISO/IEC 14496 encoder
generates compressed data compliant with the syntactic and semantic bitstream payload requirements specified
in ISO/IEC 14496-3.
This document summarises the requirements, cross references them to characteristics, and defines how
conformance with them can be tested. Guidelines are given on constructing tests to verify decoder conformance.
Some examples of compressed data implemented according to these guidelines are provided as an electronic
annex to this document usually together with their uncompressed counterparts (reference waveforms).
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are indispensable
for its application. For dated references, only the edition cited applies. For undated references, the latest edition of
the referenced document (including any amendments) applies.
ISO/IEC 11172-3, Information technology — Coding of moving pictures and associated audio for digital storage
media at up to about 1,5 Mbit/s — Part 3: Audio
ISO/IEC 11172-4, Information technology — Coding of moving pictures and associated audio for digital storage
media at up to about 1,5 Mbit/s — Part 4: Compliance testing
ISO/IEC 13818-3, Information technology — Generic coding of moving pictures and associated audio information —
Part 3: Audio
ISO/IEC 13818-4, Information technology — Generic coding of moving pictures and associated audio information —
Part 4: Conformance testing
ISO/IEC 13818-7, Information technology — Generic coding of moving pictures and associated audio information —
Part 7: Advanced Audio Coding (AAC)
ISO/IEC 14496-1, Information technology — Coding of audio-visual objects — Part 1: Systems
ISO/IEC 14496-3, Information technology — Coding of audio-visual objects — Part 3: Audio
ISO/IEC 14496-3:2019, Information technology — Coding of audio-visual objects — Part 3: Audio
ISO/IEC 14496-11, Information technology — Coding of audio-visual objects — Part 11: Scene description and
application engine
ISO/IEC 23003-1, Information technology — MPEG audio technologies — Part 1: MPEG Surround
ISO/IEC 23003-2:2018, Information technology — MPEG audio technologies — Part 2: Spatial Audio Object Coding
(SAOC)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-1, ISO/IEC 14496-3 and the
following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
conformance data
conformance test sequences and conformance tools
3.2
conformance tools
tools which are provided within an electronic annex of this document to check certain conformance criteria
3.3
conformance test sequences
set of compressed data and its reference waveforms
Note 1 to entry: The conformance test sequences are provided as examples at:
https://standards.iso.org/iso-iec/14496/-26/ed-2/en/.
3.4
compressed data
encoded data according to ISO/IEC 14496-3
3.5
reference waveforms
decoded counterparts of the compressed data
4 Conformance points
All audio decoders except the LATM-based decoders are part of the MPEG-4 framework. Table 1 gives an overview
about the interfaces that have to be provided to test the audio decoders using the MPEG-4 System.
Table 1 — Conformance points
conformance point / interface | data flow direction | description / reference
AudioSpecificConfig | in | audio related decoder specific information, see ISO/IEC 14496-3:2019 (1.6.2.1 AudioSpecificConfig)
audio access units | in | audio related bitstream payload, see ISO/IEC 14496-1:2004 (7.1.2.3 Access Units (AU))
BIFS/AudioSource node | in | see ISO/IEC 14496-11:2005 (7.2.2.15 Audio Source)
private test info | in | to control some elements which are usually generated by random number generators
audio composition units | out | see ISO/IEC 14496-1:2004 (7.2.8 Composition Units (CU))

Figure 1 gives an overview about the test bench (MPEG-4 System), the system under test (Audio decoder), and the
interfaces between them. Figure 2 gives a more detailed view on the audio decoder, consisting of error protection
(EP) decoder and audio core decoder.
Figure 1 — Audio conformance points

Figure 2 — Audio decoder structure

Clause 7 describes:
- The conformance criteria of the audio core decoder.
- The conformance criteria of the compressed data not requiring the EP decoder (epConfig == 0 || epConfig == 1).
- The properties of the examples of compressed data with (epConfig == 0 || epConfig == 1).

Clause 8 describes:
- The conformance criteria of the EP decoder.
- The conformance criteria of the compressed data requiring the EP decoder (epConfig == 2 || epConfig == 3).
- The properties of the examples of compressed data with (epConfig == 2 || epConfig == 3).
Compressed data with different epConfig settings are available referring to the same reference waveforms. Here,
the output of a conforming decoder shall be equal, independently of the used epConfig setting.
For some of the compressed data containing scalable configurations, conformance points are defined at the PCM
output of the decoder for m layers being decoded from an n-layer input, where m is an integer in the range 0 (base
layer conformance) to n-1. The reference PCM decoder output signals corresponding to these conformance points
are listed in the respective conformance tables.
5 Profiles
ISO/IEC 14496-3 defines several profiles and several levels within each profile. Conformance is always tested
against a certain level within a certain profile. Audio profiles always comprise a set of audio object types.
Nevertheless the conformance criteria as described within this document are based on audio object types. The
assignment of object types to profiles as well as the level definitions can be found in ISO/IEC 14496-3. The
conformance of a certain level within a certain profile is fulfilled, if the conformance of each object type belonging
to this profile is fulfilled. The assignment of the provided test sequences to profiles and levels can be found in
Clause 12.
6 Conformance data
6.1 File name conventions
For all conformance test sequences, the file name convention given in Table 2 is used.
Table 2 — File name conventions
object type name / tool name | File Name (compressed) | File Name (uncompressed)
AdvancedAudioBIFS - perceptual approach | aabper | -- not applicable --
AdvancedAudioBIFS - physical approach | aabphy | -- not applicable --
AudioBIFS | ab_ | ab_
AudioBIFS v3 | ABv3_ | -- not applicable --
AAC scalable | ac | ac[_lay]
AAC LC | al_ | al_[_cut_boost][_level][_]
AAC main | am_ | am_[_cut_boost][_level][_]
AAC LTP | ap_ | ap_
AAC SSR | as_ | as_[_]
CELP | ce | ce[_lay]
ER AAC scalable | er_ac_ep[] | er_ac[_lay]
ER AAC LD | er_ad__ep[] | er_ad_
ER AAC LC | er_al__ep[] | er_al_
ER AAC LTP | er_ap__ep[] | er_ap_
SBR (+AAC LC) | al_sbr___[_fsaac][_sig] | al_sbr____[_fsaac][_sig][_]
SBR (+AAC LC with 960 samples per frame) | al960_sbr___[_fsaac][_sig] | al960_sbr____[_fsaac][_sig][_]
PS (+SBR+AAC LC) | al_sbr_ps_ | al_sbr_ps_[_]
SSC | ssc__[_sig] | ssc___[_sig][_]
DST | dst__[_sig] | dst___[_sig][_]
Layer-3 | l3_ | l3_
ER BSAC | er_bs__ep[] | er_bs_[_lay]
ER CELP | er_ce_ep[] | er_ce[_lay]
ER HILN | er_hi_ep[] | er_hi[_lay][_s][_p
ER HVXC | er_hv_ep[] | er_hv[_lay]_
ER Parametric | er_pa_ep[] | er_pa[_lay]_
ER Twin VQ | er_tv_ep[] | er_tv[_lay]
HVXC | hv | hv[_lay]_ref
Algorithmic Synthesis and Audio FX | sy | sy
TTSI | tts | tts
TwinVQ | tv | tv[_lay]
ALS | als__ | als__
SLS | sls__ | sls__
Layer-1 | l1_ | l1_
Layer-2 | l2_ | l2_
ER AAC ELD | er_eld__ep[] | er_eld_
LD MPEG Surround (+ER AAC LD) | er_ad_ldmps_ | er_ad_ldmps_
LD MPEG Surround (+ER AAC ELD) | er_eld_ldmps_ | er_eld_ldmps_

can be 16 or 24 and indicates the bit resolution of the coded wavefile
indicates the channel for multi-channel sequences (f - number of the front channel,
b- number of the back channel, s - number of the side channel, l - number of the
LSF channel).
indicates the coder used to encode the content (ce – CELP, sa – Structured Audio, pcm – PCM)
refers to a certain audio coder setup. It is most likely a number, but may also contain characters.
refers to the decoder delay, it can become "ld" (low delay) or "nd" (normal delay).
can be 0, 1, 2 or 3, depending on epConfig (defined in AudioSpecificConfig).
is required if (epConfig==2 || epConfig==3). It refers to a certain error protection setup.
sampling frequency (08, 11, 12, 16, 22, 24, 32, 44, 48, 64, 88 or 96).
_level refers to the level with regard to DRC.
_cut_boost refers to the cut and boost factors with regard to DRC.
_lay is required for any scalable configuration. It marks the highest layer of the scalable
configuration used for decoding (starting with 0 for the core layer).
_p is a number referring to the decoder configuration with regard to the pitch factor.
_ref is a number referring to the decoder configuration with regard to delay mode, speed and pitch
change.
_s is a number referring to the decoder configuration with regard to the speed factor.
indicates the SBR module mainly targeted by the test sequence. Possible values are “e” for testing the
envelope adjuster, “s” for testing sine addition, “gh” for testing time-grid transitions in combination with changes
of SBR header data, “i” for testing inverse filtering, “qmf” for testing the QMF implementation, “cm” and “gen” for
testing various channel modes, “sig” for testing SBR signalling, “twi” for QMF identification, “sr” for testing various
combinations of sampling rates, and “esbr” for testing SBR enhancements.

is the abbreviation of one of the AudioBIFS v3 node names.
corresponds to the number of channels present in the conformance test sequence. It is either a single
integer, in which case it refers to the number of main audio channels, or two integers separated by a ‘.’, in which
case the first integer equals the number of main audio channels, while the second number equals the number of
low frequency enhancement channels.
fsaac corresponds to the sampling rate of the underlying AAC-LC data. If it is omitted, it is half the sampling
rate given as output sampling rate.
is an integer describing the kind of signalling used according to Table 3. If this value is omitted, backwards
compatible explicit signalling of SBR is used.
Table 3 – File name conventions for SBR signaling
sig Signalling method used
0 Implicit signalling of SBR
1 Hierarchical explicit signalling of SBR
2 Backwards compatible explicit signalling of SBR

is either "hq" or "lp" for the high quality or the low power version of the SBR decoding algorithm
respectively.
is either “bl” or ”ur” for the baseline or the unrestricted version of the parametric stereo decoding
algorithm respectively.
With respect to file extensions, the rules shown in Table 4 are applied.
Table 4 – File extensions
Compressed MPEG-4 file format .mp4
Compressed native MPEG-1/2 Audio storage format .mpg
Compressed Audio data interchange format .adif
Compressed Audio data transport stream .adts
Compressed AudioSyncStream .ass
Compressed EPAudioSyncStream .ess
Compressed AudioPointerStream .aps
Uncompressed HILN Conformance Test Parameters .ctp
Uncompressed WAVE format (uncompressed PCM format) .wav
Uncompressed TTSI decoded text and control digits .txt

6.2 Content
The test set includes a set of sine sweeps, a set of musical/speech test sequences and a set of noise-like test
sequences. The supplied sine sweeps with an amplitude of -20dB relative to full scale have an absolute amplitude
of +/- 0.1.
7 Audio object types
7.1 General
This clause lists all audio object types. It starts with a general description, which may be related to more than one
object type.
This clause contains general descriptions for conformance testing on compressed data and decoders. Unless
explicitly restricted, these descriptions are related to all object types.
7.1.1 Compressed data
7.1.1.1 Characteristics
Characteristics of compressed data specify the constraints that are applied by the encoder in generating the
compressed data. These syntactic and semantic constraints may, for example, restrict the range or the values of
parameters that are encoded directly or indirectly in the compressed data. The constraints applied to a given
compressed data may or may not be known a priori.
Decoder relevant compressed data may consist of the following parts:
- decoder specific information (AudioSpecificConfig)
- BIFS/AudioSource node (field information)
- audio access units (establishing the bitstream payload)
7.1.1.1.1 ESC instance configuration
In case of epConfig=1, each instance of each sensitivity category belonging to one frame is stored separately
within a single access unit, i.e. there exist as many elementary streams as instances defined within a frame as
shown in Table 6.
Note: In case of epConfig=3, the mapping between EP classes and ESC instances is signaled by the data element
directMapping. In case of directMapping=1, the restrictions regarding the ESC instance configuration apply
accordingly to the EP class configuration.
Table 5 gives an overview about the valid configurations:
Table 5 — Number of ESC instances that build a frame in case of epConfig==1
Audio object type | number of ESC instances to build a frame
ER AAC | see Table 6
ER Twin VQ | non-scalable or base layer: 2; any enhancement layer: 2
ER BSAC | base layer: 2; any large-step enhancement layer: 1
ER CELP | base layer: 5; any enhancement layer: 1
ER HVXC | 2 kbit/s, non-scalable or base layer: 4; 4 kbit/s, non-scalable: 5; any enhancement layer: 3
ER HILN | base layer: 5; any enhancement/extension layer: 1
ER Parametric | PARAmode==0,1 base layer: 5; PARAmode==2,3 base layer: 15; any enhancement/extension layer: 1
Table 6 — Number of ESC instances that build elements/layers of an ER AAC frame in the case of epConfig==1
element / layer | aacScalefactorDataResilienceFlag = 0 | aacScalefactorDataResilienceFlag = 1
single channel element (SCE) / mono layer | 3 | 4
channel pair element (CPE) / stereo layer | 7 | 9
extension payload (EPL) | 2 | 2
Depending on the value of the data element channelConfiguration, an AAC frame may cover several instances of
SCE, CPE or EPL. This leads to the following valid configurations presented in Table 7.
Table 7 — Number of ESC instances that build an ER AAC frame/layer in the case of epConfig==1
aacScalefactorDataResilienceFlag
AOT 0 1
17 19 20 23 channelConfiguration main payload N extension payloads
x x x x 1 3 4
x x x x 2 7 9
x x x 3 3+7 4+9
x x x 4 3+7+3 4+9+4 +2*N
x x x 5 3+7+7 4+9+9
x x x 6 3+7+7+3 4+9+9+4
x x x 7 3+7+7+7+3 4+9+9+9+4
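The sums in Table 7 can be reproduced from the per-element counts in Table 6. The following is a minimal sketch under stated assumptions: the function name er_aac_esc_instances and its parameter names are hypothetical, the "+2*N" term is read as applying to every channelConfiguration row, and the AOT applicability columns of Table 7 are not modelled.

```c
#include <assert.h>

/* Number of ESC instances that build one ER AAC frame/layer for epConfig==1.
 * Returns -1 for a channelConfiguration that is not listed in Table 7.
 * Hypothetical helper; the element order per configuration follows the sums
 * printed in Table 7 (for example 3+7+7+3 for channelConfiguration 6). */
static int er_aac_esc_instances(int channel_configuration,
                                int scalefactor_data_resilience_flag,
                                int num_extension_payloads)
{
    /* Per-element counts taken from Table 6. */
    const int sce = scalefactor_data_resilience_flag ? 4 : 3;
    const int cpe = scalefactor_data_resilience_flag ? 9 : 7;
    const int epl = 2;
    int main_payload;

    assert(num_extension_payloads >= 0);

    switch (channel_configuration) {
    case 1: main_payload = sce; break;
    case 2: main_payload = cpe; break;
    case 3: main_payload = sce + cpe; break;
    case 4: main_payload = sce + cpe + sce; break;
    case 5: main_payload = sce + cpe + cpe; break;
    case 6: main_payload = sce + cpe + cpe + sce; break;
    case 7: main_payload = sce + cpe + cpe + cpe + sce; break;
    default: return -1;
    }
    return main_payload + epl * num_extension_payloads;
}
```

For instance, channelConfiguration 6 with aacScalefactorDataResilienceFlag equal to 0 and no extension payload gives 3+7+7+3 = 20 instances.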
7.1.1.2 Test procedure
All compressed data shall meet the syntactic and semantic requirements specified in ISO/IEC 14496-3. For each
audio object type a set of semantic tests to be performed on the compressed data is described. Verifying that
the syntax is correct is straightforward and is therefore not defined hereinafter. In the description of the semantic
tests it is assumed that the tested compressed data contains no errors due to transmission or other causes. For
each test the condition or conditions that shall be satisfied are given, as well as the prerequisites or conditions in
which the test can be applied.
7.1.2 Decoders
7.1.2.1 Characteristics
The decoder characteristics are defined by the profiles and levels being tested.
7.1.2.2 Test procedure
To test audio decoders, ISO/IEC JTC 1/SC 29/WG 11 supplies a number of test sequences. Supplied sequences
cover all profile decoders. For a supplied test sequence, testing can be done by comparing the output of a decoder
under test with a reference output also supplied by ISO/IEC JTC 1/SC 29/WG 11. In cases where the decoder
under test is followed by additional operations (e.g. quantizing a signal to a 16 bit output signal) the conformance
point is prior to such additional operations, i.e. it is permitted to use the actual decoder output (e.g. with more
than 16 bit) for conformance testing.
Measurements are carried out relative to full scale where the output signals of the decoders are normalized to be
in the range between -1.0 and +1.0.
The following subclauses define a set of test methods. A particular test method for a certain test sequence is
specified in the object type specific subclauses.
For elements producing output that cannot be tested with the methods described below, specific conformance
testing procedures are described in the object type specific subclauses.
7.1.2.2.1 RMS/LSB measurement
To fulfill the “RMS/LSB Measurement” test at an accuracy level of “K bit”, an ISO/IEC 14496-3 decoder shall
provide an output waveform such that the RMS level of the difference signal between the output of the decoder
under test and the supplied reference output is less than $2^{-(K-1)}/\sqrt{12}$. In addition, the difference signal shall
have a maximum absolute value of at most $2^{-(K-2)}$ relative to full-scale. The “RMS/LSB Measurement” test shall be
carried out for an accuracy level of K=16 bit unless a different accuracy level is explicitly stated.
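The two limits of the “RMS/LSB Measurement” test can be evaluated numerically as sketched below. This is an illustration only: the function names rms_limit, peak_limit and passes_rms_lsb are hypothetical and do not refer to the verification software supplied with this document. The factor 1/sqrt(12) is the RMS value of an error uniformly distributed over one quantization step, and $2^{-(K-1)}$ is the step size of a K bit signal in the range between -1.0 and +1.0.

```c
#include <assert.h>
#include <math.h>

/* RMS limit (exclusive) for the difference signal at an accuracy level of
 * K bit, relative to full scale: 2^-(K-1) / sqrt(12). */
static double rms_limit(int k_bits)
{
    assert(k_bits >= 2);
    return ldexp(1.0, -(k_bits - 1)) / sqrt(12.0);
}

/* Limit (inclusive) for the absolute value of any difference sample,
 * relative to full scale: 2^-(K-2). */
static double peak_limit(int k_bits)
{
    assert(k_bits >= 2);
    return ldexp(1.0, -(k_bits - 2));
}

/* Returns 1 if the measured RMS level and maximum absolute difference
 * satisfy both limits for the given accuracy level, 0 otherwise. */
static int passes_rms_lsb(double rms, double max_abs_diff, int k_bits)
{
    return rms < rms_limit(k_bits) && max_abs_diff <= peak_limit(k_bits);
}
```

For the default accuracy level K=16 the limits are approximately 8.81e-6 for the RMS level and exactly 2^-14 for the maximum absolute difference.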
7.1.2.2.1.1 Calculation of RMS
For the calculation of the RMS level, all measurements are carried out relative to full scale where the output
signals of the decoder and supplied test sequences are normalized to be in the range between -1.0 and +1.0.
The supplied reference waveforms have a precision (P) of 24 bits, where the most significant bit (MSB) will be
labeled bit 0 and the least-significant bit (LSB) will be labeled bit 23. The most significant bit (bit 0) represents the
value of –1, the second most significant bit (bit 1) represents the value of +1/2, etc.
value of bit 0 (MSB) $= -\frac{1}{2^{0}} = -1$
value of bit 1 $= \frac{1}{2^{1}} = \frac{1}{2}$
value of bit 2 $= \frac{1}{2^{2}} = \frac{1}{4}$
$\vdots$
value of bit 23 (LSB) $= \frac{1}{2^{23}} = \frac{1}{8\,388\,608}$
The output waveform of the decoder under test is required to be in the same format. In the case that the output of
the decoder has a precision of P' bits and if P' is smaller than 24, then the output is extended to 24 bits by setting
bit P’ through bit 23 to zero. In the next step, the difference (diff) of the samples of these signals has to be
calculated. Every channel of a multichannel waveform shall be tested. The total number of samples for each
channel is N.
$\mathrm{diff}(n) = \text{output signal of decoder under test}(n) - \text{supplied test sequence}(n)$, for $n = 1$ to $N$
The values of all difference samples shall be squared, summed, divided by N and then the square-root shall be
calculated. This calculation finally gives the RMS level.
$$\mathrm{rms} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \mathrm{diff}(n)^{2}}$$
This test only verifies the computational accuracy of an implementation.
Software is provided for performing this verification procedure.
7.1.2.2.2 Segmental SNR
This criterion is designed to test decoders decoding the object types CELP, ER CELP, HVXC, ER HVXC, TwinVQ, ER
TwinVQ and ER HILN.
Definition:
$x_a(i)$ : $i$-th sample of reference output signal (normalized in a range between –1.0 and 1.0).
$x_b(i)$ : $i$-th sample of output signal of a decoder under test (normalized in a range between –1.0 and 1.0).
$L$ : the length of segment
$N$ : the total number of segments
$SS(k)$ : SNR of $k$-th segment
$SSNR$ : segmental SNR
$$SS(k) = \log_{10}\left(1 + \frac{\sum_{i=0}^{L-1} x_a(k \times L + i)^{2}}{10^{-13} L + \sum_{i=0}^{L-1} \left(x_a(k \times L + i) - x_b(k \times L + i)\right)^{2}}\right)$$
$$SSNR = 10 \times \log_{10}\left(10^{\sum_{k=0}^{N-1} SS(k)/N} - 1.0\right)$$
7.1.2.2.3 Frequency domain criterion based on cepstrum analysis
This criterion is designed to test decoders decoding the object types CELP, ER CELP, TwinVQ, ER TwinVQ and ER
HILN.
The cepstrum analysis procedure is defined by means of the functions lpc2cepstrum
and calculate_lpc provided in pseudo C code below.

#include <math.h>   /* cos, log */
#include <stdio.h>  /* printf */
#include <stdlib.h> /* malloc, free, exit */

#define LPC_ORDER 16       /* LPC order */
#define CEPSTRUM_ORDER 32  /* Cepstrum order */
#define BW 0.0125F         /* Bandwidth scalefactor */

void lpc2cepstrum (float lpc_coef[], /* in:  LPC coefficients (a-parameters), indices 0..LPC_ORDER */
                   float C[])        /* out: LPC cepstrum, indices 1..CEPSTRUM_ORDER */
{
    float ss;
    int i, m;

    /* it is assumed that lpc_coef[0] is 1 ! */

    /* Until the final loop, C[m] holds m times the m-th cepstral coefficient. */
    C[1] = -lpc_coef[1];
    for (m = 2; m <= LPC_ORDER; m++)
    {
        ss = -lpc_coef[m] * m;
        for (i = 1; i < m; i++)
        {
            ss -= lpc_coef[i] * C[m-i];
        }
        C[m] = ss;
    }
    /* Beyond the LPC order only the recursive term remains. */
    for (m = LPC_ORDER + 1; m <= CEPSTRUM_ORDER; m++)
    {
        ss = 0.0F;
        for (i = 1; i <= LPC_ORDER; i++)
        {
            ss -= lpc_coef[i] * C[m-i];
        }
        C[m] = ss;
    }
    /* Remove the factor m to obtain the cepstral coefficients. */
    for (m = 2; m <= CEPSTRUM_ORDER; m++)
    {
        C[m] /= m;
    }
}
void calculate_lpc (float *in,       /* in:  input PCM audio data (windowed in place) */
                    int frame_size,  /* in:  analysis frame length in samples */
                    float *lpc_coef) /* out: LPC coefficients */
{
    int ip;
    float wvpowfr, cor[LPC_ORDER + 1];
    float wlag[LPC_ORDER + 1];
    float *wdw;

    wdw = (float*) malloc (sizeof (float) * frame_size);
    if (wdw == NULL)
    {
        printf ("Memory allocation error in calculate_lpc.\n");
        exit (1);
    }

    /* Apply a Hamming window to the analysis frame. */
    hamwdw (wdw, frame_size);
    for (ip = 0; ip < frame_size; ip++)
    {
        in[ip] *= wdw[ip];
    }

    /* Autocorrelation of the windowed frame. */
    sigcor (in, frame_size, &wvpowfr, cor, LPC_ORDER);

    /* Apply a lag window to the autocorrelation coefficients. */
    lagwdw (wlag, LPC_ORDER, BW);
    for (ip = 1; ip <= LPC_ORDER; ip++)
    {
        cor[ip] *= wlag[ip];
    }

    /* Convert the autocorrelation coefficients to LPC coefficients. */
    corref (LPC_ORDER, cor, lpc_coef);

    free (wdw);
}
void hamwdw (float wdw[],
             int n)
{
    int i;
    float d, pi = 3.141592653589793F;

    d = (float) (2.0 * pi/n);
    for (i = 0; i < n; i++)
    {
        wdw[i] = (float) (0.54 - 0.46 * cos (d * i));
    }
}
void lagwdw (float wdw[],
             int n,
             float h)
{
    int i;
    float pi = 3.141592653589793F;
    float a, b, w;

    a = (float) (log (0.5) * 0.5 / log (cos (0.5 * pi * h)));
    a = (float) ((int) a);
    w = 1.0F;
    b = a;
    wdw[0] = 1.0F;
    for (i = 1; i <= n; i++)
    {
        b += 1.0F;
        w *= a / b;
        wdw[i] = w;
        a -= 1.0F;
    }
...
