ISO/IEC 13818-7:2004
Information technology - Generic coding of moving pictures and associated audio information - Part 7: Advanced Audio Coding (AAC)
ISO/IEC 13818-7:2004 specifies MPEG-2 Advanced Audio Coding (AAC), a multi-channel audio coding standard that delivers higher quality than is achievable when requiring MPEG-1 backwards compatibility. It provides ITU-R "indistinguishable" quality at a data rate of 320 kbit/s for five full-bandwidth channel audio signals. In comparison to ISO/IEC 13818-7:2003, ISO/IEC 13818-7:2004 supplements information on how to transmit and obtain MPEG-4 SBR data as part of the MPEG-2 AAC access unit.
Technologies de l'information — Codage générique des images animées et du son associé — Partie 7: Codage du son avancé (AAC)
Frequently Asked Questions
ISO/IEC 13818-7:2004 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Generic coding of moving pictures and associated audio information - Part 7: Advanced Audio Coding (AAC)". It specifies MPEG-2 Advanced Audio Coding (AAC); the abstract above summarises its scope.
ISO/IEC 13818-7:2004 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 13818-7:2004 is related to other standards as follows: it is linked to ISO/IEC 13818-7:2004/Cor 1:2005, ISO/IEC 13818-7:2003/Amd 1:2004, ISO/IEC 13818-7:2006 and ISO/IEC 13818-7:2003, and it is corrected by ISO/IEC 13818-7:2004/Cor 1:2005. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 13818-7
Third edition
2004-10-15
Information technology - Generic coding of moving pictures and associated audio information -
Part 7: Advanced Audio Coding (AAC)
© ISO/IEC 2004
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
1 Scope
1.1 MPEG-2 AAC Tools Overview
2 Normative References
3 Terms and Definitions
4 Symbols and Abbreviations
4.1 Arithmetic Operators
4.2 Logical Operators
4.3 Relational Operators
4.4 Bitwise Operators
4.5 Assignment
4.6 Mnemonics
4.7 Constants
5 Method of Describing Bitstream Syntax
6 Syntax
6.1 Audio Data Interchange Format, ADIF
6.2 Audio Data Transport Stream, ADTS
6.3 Raw Data
7 Profiles and Profile Interoperability
7.1 Profiles
7.2 Profile Interoperability
8 General Information
8.1 Audio Data Interchange Format (ADIF) and Audio Data Transport Stream (ADTS)
8.2 Decoding of Raw Data
8.3 Decoding of a single_channel_element() (SCE), a channel_pair_element() (CPE) or an individual_channel_stream() (ICS)
8.4 Low Frequency Enhancement Channel (LFE)
8.5 program_config_element() (PCE)
8.6 Data Stream Element (DSE)
8.7 Fill Element (FIL)
8.8 Decoding of extension_payload()
8.9 Tables
8.10 Figures
9 Noiseless Coding
9.1 Tool Description
9.2 Definitions
9.3 Decoding Process
9.4 Tables
10 Quantization
10.1 Tool Description
10.2 Definitions
10.3 Decoding Process
11 Scalefactors
11.1 Tool Description
11.2 Definitions
11.3 Decoding Process
12 Joint Coding
12.1 M/S Stereo
12.2 Intensity Stereo
12.3 Coupling Channel
13 Prediction
13.1 Tool Description
13.2 Definitions
13.3 Decoding Process
13.4 Diagrams
14 Temporal Noise Shaping (TNS)
14.1 Tool Description
14.2 Definitions
14.3 Decoding Process
15 Filterbank and Block Switching
15.1 Tool Description
15.2 Definitions
15.3 Decoding Process
16 Gain Control
16.1 Tool Description
16.2 Definitions
16.3 Decoding Process
16.4 Diagrams
16.5 Tables
Annex A (normative) Huffman Codebook Tables
Annex B (informative) Information on Unused Codebooks
Annex C (informative) Encoder
C.1 Psychoacoustic Model
C.2 Gain Control
C.3 Filterbank and Block Switching
C.4 Prediction
C.5 Temporal Noise Shaping (TNS)
C.6 Joint Coding
C.7 Quantization
C.8 Noiseless Coding
C.9 Features of AAC dynamic range control
Annex D (informative) Patent Holders
D.1 List of Patent Holders
Annex E (informative) Registration Procedure
E.1 Procedure for the Request of a Registered Identifier (RID)
E.2 Responsibilities of the Registration Authority
E.3 Contact Information of the Registration Authority
E.4 Responsibilities of Parties Requesting a RID
E.5 Appeal procedure for Denied Applications
Annex F (informative) Registration Application Form
Annex G (informative) Registration Authority
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
ISO/IEC 13818-7 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This third edition cancels and replaces the second edition (ISO/IEC 13818-7:2003), which has been
technically revised. It also incorporates the Amendment ISO/IEC 13818-7:2003/Amd.1:2004.
ISO/IEC 13818 consists of the following parts, under the general title Information technology — Generic
coding of moving pictures and associated audio information:
— Part 1: Systems
— Part 2: Video
— Part 3: Audio
— Part 4: Conformance testing
— Part 5: Software simulation
— Part 6: Extensions for DSM-CC
— Part 7: Advanced Audio Coding (AAC)
— Part 9: Extension for real time interface for systems decoders
— Part 10: Conformance extensions for Digital Storage Media Command and Control (DSM-CC)
— Part 11: IPMP on MPEG-2 systems
Introduction
The standardization body ISO/IEC JTC 1/SC 29/WG 11, also known as the Moving Picture Experts Group
(MPEG), was established in 1988 to specify digital video and audio coding schemes at low data rates. MPEG
completed its first phase of audio specifications (MPEG-1) in November 1992, ISO/IEC 11172-3. In its second
phase of development, the MPEG Audio subgroup defined a multichannel extension to MPEG-1 audio that is
backwards compatible with existing MPEG-1 systems (MPEG-2 BC) and defined an audio coding standard at
lower sampling frequencies than MPEG-1, ISO/IEC 13818-3.
The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC)
draw attention to the fact that it is claimed that compliance with this document may involve the use of patents.
The ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured the ISO and IEC that he is willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect,
the statement of the holder of this patent right is registered with the ISO and IEC. Information may be obtained
from the companies listed in Annex D.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights other than those identified in Annex D. ISO and IEC shall not be held responsible for identifying any or
all such patent rights.
INTERNATIONAL STANDARD ISO/IEC 13818-7:2004(E)
Information technology — Generic coding of moving pictures
and associated audio information —
Part 7:
Advanced Audio Coding (AAC)
1 Scope
This International Standard describes MPEG-2 Advanced Audio Coding (AAC) [1], the MPEG-2
audio standard that is not backwards compatible with MPEG-1. It is a higher quality
multichannel standard than is achievable while requiring MPEG-1 backwards compatibility.
MPEG-2 AAC allows for ITU-R 'indistinguishable' quality according to [2] at a data rate
of 320 kbit/s for five full-bandwidth audio channels.
The AAC decoding process makes use of a number of required tools and a number of
optional tools. Table 1 lists the tools and their status as required or optional. Required tools
are mandatory in any possible profile. Optional tools may not be required in some profiles.
Table 1 – AAC decoder tools
Tool Name                          Required / Optional
Bitstream Formatter                Required
Noiseless Decoding                 Required
Inverse quantization               Required
Rescaling                          Required
M/S                                Optional
Prediction                         Optional
Intensity                          Optional
Dependently switched coupling      Optional
TNS                                Optional
Filterbank / block switching       Required
Gain control                       Optional
Independently switched coupling    Optional
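Table 1 can also be held as data, for example to check which tools every decoder must implement. The sketch below is illustrative only: the names AAC_DECODER_TOOLS, required_tools and optional_tools are assumptions made for this example and are not defined by the standard.

```python
# Table 1 as a mapping from tool name to "is it required?".
# All identifiers here are illustrative, not part of the standard.
AAC_DECODER_TOOLS = {
    "Bitstream Formatter": True,
    "Noiseless Decoding": True,
    "Inverse quantization": True,
    "Rescaling": True,
    "M/S": False,
    "Prediction": False,
    "Intensity": False,
    "Dependently switched coupling": False,
    "TNS": False,
    "Filterbank / block switching": True,
    "Gain control": False,
    "Independently switched coupling": False,
}


def required_tools(tools=AAC_DECODER_TOOLS):
    """Return the tools that are mandatory in any profile, in table order."""
    return [name for name, required in tools.items() if required]


def optional_tools(tools=AAC_DECODER_TOOLS):
    """Return the tools that some profiles may omit, in table order."""
    return [name for name, required in tools.items() if not required]
```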
1.1 MPEG-2 AAC Tools Overview
The basic structure of the MPEG-2 AAC system is shown in Figure 1 and Figure 2. As is
shown in Table 1, there are both required and optional tools in the decoder. The data flow
in this diagram is from left to right, top to bottom. The functions of the decoder are to find
the description of the quantized audio spectra in the bitstream, decode the quantized values
and other reconstruction information, reconstruct the quantized spectra, process the
reconstructed spectra through whatever tools are active in the bitstream in order to arrive at
the actual signal spectra as described by the input bitstream, and finally convert the
frequency domain spectra to the time domain, with or without an optional gain control tool.
After the initial reconstruction and scaling of the spectra, many optional tools can
modify one or more of the spectra in order to provide more efficient coding. Each optional
tool that operates in the spectral domain retains the option to "pass through": whenever a
spectral operation is omitted, the spectra at its input are passed directly through the
tool without modification.
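The pass-through behaviour can be sketched as a chain of tools in which an inactive tool simply hands its input on. This is a minimal illustration, not the decoder structure of the standard: run_spectral_tools, the stand-in tools double and negate, and the tool names in TOOLS are all hypothetical.

```python
def run_spectral_tools(spectra, tools, active):
    """Apply each spectral tool in order; inactive tools pass data through.

    spectra: list of floats (spectral coefficients of one channel)
    tools:   ordered list of (name, function) pairs
    active:  set of tool names signalled as active in the bitstream
    """
    for name, tool in tools:
        if name in active:
            spectra = tool(spectra)
        # Otherwise the spectra are handed to the next tool unmodified.
    return spectra


# Stand-in tools for illustration only; real tools are far more involved.
def double(spectra):
    return [2.0 * x for x in spectra]


def negate(spectra):
    return [-x for x in spectra]


TOOLS = [("tool_a", double), ("tool_b", negate)]
```

With no tool active, the output equals the input, which is the pass-through case described in the text.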
The input to the bitstream demultiplexer tool is the MPEG-2 AAC bitstream. The
demultiplexer splits the MPEG-2 AAC data stream into the parts belonging to each tool
and provides each tool with the bitstream information related to it.
The outputs from the bitstream demultiplexer tool are:
• The sectioning information for the noiselessly coded spectra
• The noiselessly coded spectra
• The M/S decision information (optional)
• The predictor state information (optional)
• The intensity stereo control information and coupling channel control information
(both optional)
• The temporal noise shaping (TNS) information (optional)
• The filterbank control information
• The gain control information (optional)
The noiseless decoding tool takes information from the bitstream demultiplexer, parses that
information, decodes the Huffman coded data, and reconstructs the quantized spectra and
the Huffman and DPCM coded scalefactors.
The inputs to the noiseless decoding tool are:
• The sectioning information for the noiselessly coded spectra
• The noiselessly coded spectra
The outputs of the Noiseless Decoding tool are:
• The decoded integer representation of the scalefactors
• The quantized values for the spectra
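DPCM coding of the scalefactors means that differences between successive values are transmitted rather than the values themselves. The sketch below shows only that differential step, assuming the Huffman stage has already produced signed integer differences. The function name dpcm_decode and the start_value parameter are assumptions for this example; the actual start value and any index offsets are defined in clause 11, which is not part of this excerpt.

```python
def dpcm_decode(start_value, deltas):
    """Rebuild absolute values from a start value and successive differences.

    start_value: value preceding the first difference (assumed to be known)
    deltas:      signed integer differences, already Huffman-decoded
    """
    values = []
    current = start_value
    for delta in deltas:
        current += delta  # each value is the previous one plus its difference
        values.append(current)
    return values
```

For example, a start value of 100 with differences 0, 2, -1, 3 yields 100, 102, 101, 104.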
The inverse quantizer tool takes the quantized values for the spectra, and converts the
integer values to the non-scaled, reconstructed spectra. This quantizer is a non-uniform
quantizer.
The input to the Inverse Quantizer tool is:
• The quantized values for the spectra
The output of the inverse quantizer tool is:
• The un-scaled, inversely quantized spectra
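As an illustration of a non-uniform inverse quantizer, the sketch below uses the power law sign(q) * |q|^(4/3) that is commonly documented for AAC. The normative definition is in clause 10, which is not reproduced in this excerpt, so the exponent should be read as an assumption here; inverse_quantize is a name chosen for this example.

```python
def inverse_quantize(quantized):
    """Map integer quantized values to un-scaled spectral values.

    Applies sign(q) * |q| ** (4/3), so larger magnitudes are spaced
    further apart than smaller ones (non-uniform quantization).
    """
    out = []
    for q in quantized:
        magnitude = abs(q) ** (4.0 / 3.0)
        out.append(magnitude if q >= 0 else -magnitude)
    return out
```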
The rescaling tool converts the integer representation of the scalefactors to the actual
values, and multiplies the un-scaled inversely quantized spectra by the relevant
scalefactors.
The inputs to the rescaling tool are:
• The decoded integer representation of the scalefactors
• The un-scaled, inversely quantized spectra
The output from the rescaling tool is:
• The scaled, inversely quantized spectra
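The two steps of the rescaling tool, converting an integer scalefactor to a gain and multiplying a band of spectra by it, can be sketched as follows. The gain law 2^(0.25 * (sf - 100)) is the one usually quoted for AAC and is an assumption here, since clause 11 is not part of this excerpt. SF_OFFSET, scalefactor_gain, rescale and the band_offsets layout are names invented for this example.

```python
SF_OFFSET = 100  # assumed offset; the normative value is given in clause 11


def scalefactor_gain(sf):
    """Convert an integer scalefactor to a linear gain (assumed gain law)."""
    return 2.0 ** (0.25 * (sf - SF_OFFSET))


def rescale(unscaled, band_offsets, scalefactors):
    """Multiply each scalefactor band of the spectrum by its gain.

    band_offsets: start index of every band, followed by one final end index
    scalefactors: one integer scalefactor per band
    """
    scaled = list(unscaled)
    for band, sf in enumerate(scalefactors):
        gain = scalefactor_gain(sf)
        for i in range(band_offsets[band], band_offsets[band + 1]):
            scaled[i] = unscaled[i] * gain
    return scaled
```

Under this gain law, raising a scalefactor by 4 doubles the amplitude of its band.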
The M/S tool converts spectra pairs from Mid/Side to Left/Right under control of the M/S
decision information in order to improve coding efficiency.
The inputs to the M/S tool are:
• The M/S decision information
• The scaled, inversely quantized spectra related to pairs of channels
The output from the M/S tool is:
• The scaled, inversely quantized spectra related to pairs of channels, after M/S
decoding
Note: The scaled, inversely quantized spectra of individually coded channels are not processed by the M/S block, rather they are passed
directly through the block without modification. If the M/S block is not active, all spectra are passed through this block unmodified.
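M/S decoding with per-band control can be sketched as below. It assumes the usual reconstruction left = mid + side and right = mid - side; the normative process is in subclause 12.1, which is not reproduced here. The function ms_to_lr, the ms_used flags and the band_offsets layout are hypothetical names for this example.

```python
def ms_to_lr(first, second, band_offsets, ms_used):
    """Convert a channel pair from Mid/Side to Left/Right where signalled.

    first, second: spectra of the channel pair (mid/side where M/S is used,
                   left/right elsewhere)
    band_offsets:  start index of every band, followed by one final end index
    ms_used:       one flag per band; False means pass-through
    """
    left = list(first)
    right = list(second)
    for band, used in enumerate(ms_used):
        if not used:
            continue  # band is already left/right, pass it through
        for i in range(band_offsets[band], band_offsets[band + 1]):
            left[i] = first[i] + second[i]
            right[i] = first[i] - second[i]
    return left, right
```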
The prediction tool reverses the prediction process carried out at the encoder. This
prediction process re-inserts the redundancy that was extracted by the prediction tool at the
encoder, under the control of the predictor state information. This tool is implemented as a
second order backward adaptive predictor.
The inputs to the prediction tool are:
• The predictor state information
• The scaled, inversely quantized spectra
The output from the prediction tool is:
• The scaled, inversely quantized spectra, after prediction is applied.
Note: If the prediction is disabled, the scaled, inversely quantized spectra are passed directly through the block without modification.
The intensity stereo tool implements intensity stereo decoding on pairs of spectra.
The inputs to the intensity stereo tool are:
• The inversely quantized spectra
• The intensity stereo control information
The output from the intensity stereo tool is:
• The inversely quantized spectra after intensity channel decoding.
Note: The scaled, inversely quantized spectra of individually coded channels are passed directly through this tool without modification, if
intensity stereo is not indicated. The intensity stereo tool and M/S tool are arranged so that the operation of M/S and intensity stereo are
mutually exclusive on any given scalefactor band and group of one pair of spectra.
The coupling tool for dependently switched coupling channels adds the relevant data from
dependently switched coupling channels to the spectra, as directed by the coupling control
information.
The inputs to the coupling tool are:
• The inversely quantized spectra
• The coupling control information
The output from the coupling tool is:
• The inversely quantized spectra coupled with the dependently switched coupling
channels.
Note: The scaled, inversely quantized spectra are passed directly through this tool without modification, if coupling is not indicated.
Depending on the coupling control information, dependently switched coupling channels might either be coupled before or after the TNS
processing.
The coupling tool for independently switched coupling channels adds the relevant data from
independently switched coupling channels to the time signal, as directed by the coupling
control information.
The inputs to the coupling tool are:
• The time signal as output by the filterbank
• The coupling control information
The output from the coupling tool is:
• The time signal coupled with the independently switched coupling channels.
Note: The time signal is passed directly through this tool without modification, if coupling is not indicated.
The temporal noise shaping (TNS) tool implements a control of the fine time structure of the
coding noise. In the encoder, the TNS process has flattened the temporal envelope of the
signal to which it has been applied. In the decoder, the inverse process is used to restore
the actual temporal envelope(s), under control of the TNS information. This is done by
applying a filtering process to parts of the spectral data.
The inputs to the TNS tool are:
• The inversely quantized spectra
• The TNS information
The output from the TNS block is:
• The inversely quantized spectra
Note: If this block is disabled, the inversely quantized spectra are passed through without modification.
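The "inverse process" in the decoder can be pictured as an all-pole filter run along the frequency axis, undoing a matching FIR filter applied in the encoder. The sketch below shows only that filter pair and is an assumption-laden simplification: coefficient decoding, filter direction and the band range to which the filter applies are specified in clause 14 and are omitted. tns_synthesis and tns_analysis are names chosen for this example.

```python
def tns_synthesis(spectrum, coeffs):
    """Decoder side, all-pole filter: y[n] = x[n] - sum_k a[k] * y[n-k]."""
    out = []
    for n, x in enumerate(spectrum):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def tns_analysis(spectrum, coeffs):
    """Encoder side, matching FIR filter: e[n] = y[n] + sum_k a[k] * y[n-k]."""
    out = []
    for n, y in enumerate(spectrum):
        acc = y
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a * spectrum[n - k]
        out.append(acc)
    return out
```

Running tns_analysis and then tns_synthesis with the same coefficients returns the original spectrum up to rounding; with an empty coefficient list both functions pass the spectrum through unchanged.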
The filterbank / block switching tool applies the inverse of the frequency mapping that was
carried out in the encoder. An inverse modified discrete cosine transform (IMDCT) is used
for the filterbank tool. The IMDCT can be configured to support either one set of 128 or
1024, or four sets of 32 or 256 spectral coefficients.
The inputs to the filterbank tool are:
• The inversely quantized spectra
• The filterbank control information
The output(s) from the filterbank tool is (are):
• The time domain reconstructed audio signal(s).
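How an IMDCT turns blocks of spectral coefficients back into a time signal can be shown with a toy example. The sketch uses the textbook MDCT/IMDCT pair with a sine window and 50 % overlap-add; the normative transform, window shapes and block switching are defined in clause 15, which is not part of this excerpt. All function names and the tiny block size are assumptions for illustration, and the direct summation is far slower than a real implementation.

```python
import math


def sine_window(n_coef):
    """Sine window of length 2 * n_coef; satisfies w[n]^2 + w[n + n_coef]^2 = 1."""
    length = 2 * n_coef
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Forward MDCT: 2 * n_coef time samples -> n_coef coefficients."""
    n_coef = len(block) // 2
    n0 = 0.5 + n_coef / 2.0
    return [
        sum(block[n] * math.cos(math.pi / n_coef * (n + n0) * (k + 0.5))
            for n in range(2 * n_coef))
        for k in range(n_coef)
    ]


def imdct(coefs):
    """Inverse MDCT: n_coef coefficients -> 2 * n_coef aliased time samples."""
    n_coef = len(coefs)
    n0 = 0.5 + n_coef / 2.0
    return [
        (2.0 / n_coef)
        * sum(coefs[k] * math.cos(math.pi / n_coef * (n + n0) * (k + 0.5))
              for k in range(n_coef))
        for n in range(2 * n_coef)
    ]


def analyse_synthesise(signal, n_coef):
    """Windowed MDCT, IMDCT and windowed overlap-add over half-overlapped blocks."""
    window = sine_window(n_coef)
    output = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coef + 1, n_coef):
        block = [signal[start + n] * window[n] for n in range(2 * n_coef)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_coef):
            # Aliasing from neighbouring blocks cancels in the overlap-add.
            output[start + n] += rebuilt[n] * window[n]
    return output
```

Each single block returned by imdct contains time domain aliasing; only samples covered by two overlapping blocks are reconstructed, so the first and last half block of the signal are not recovered in this sketch.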
When present, the gain control tool applies a separate time domain gain control to each of
4 frequency bands that have been created by the gain control PQF filterbank in the
encoder. Then, it assembles the 4 frequency bands and reconstructs the time waveform
through the gain control tool’s filterbank.
The inputs to the gain control tool are:
• The time domain reconstructed audio signal(s)
• The gain control information
The output(s) from the gain control tool is (are):
• The time domain reconstructed audio signal(s)
If the gain control tool is not active, the time domain reconstructed audio signal(s) are
passed directly from the filterbank tool to the output of the decoder. This tool is used for the
scalable sampling rate (SSR) profile only.
[Figure 1: block diagram, not reproduced in this text. Legend: data and control paths. Blocks: input time signal; AAC gain control; psychoacoustic model; window length decision and block switching; filterbank; threshold calculation; spectral processing (TNS, intensity, prediction, M/S); scaling; quantization and noiseless coding (quantization, Huffman coding); bitstream formatter; coded audio stream.]
Figure 1 – MPEG-2 AAC Encoder Block Diagram
[Figure 2: block diagram, not reproduced in this text. Legend: data and control paths. Blocks: coded audio stream; bitstream deformatter; noiseless decoding and inverse quantization (Huffman decoding, inverse quantization); rescaling; spectral processing (M/S, prediction, intensity, dependently switched coupling, TNS, dependently switched coupling); block switching and filterbank; AAC gain control; independently switched coupling; output time signal.]
Figure 2 – MPEG-2 AAC Decoder Block Diagram
2 Normative References
The following referenced documents are indispensable for the application of this document.
For dated references, only the edition cited applies. For undated references, the latest
edition of the referenced document (including any amendments) applies.
ISO/IEC 11172-3, Information technology — Coding of moving pictures and associated
audio for digital storage media at up to about 1,5 Mbit/s — Part 3: Audio
ISO/IEC 13818-1, Information technology — Generic coding of moving pictures and
associated audio information — Part 1: Systems
ISO/IEC 13818-3, Information technology — Generic coding of moving pictures and
associated audio information — Part 3: Audio
ISO/IEC 14496-3, Information technology — Coding of audio-visual objects — Part 3: Audio
3 Terms and Definitions
For the purposes of this part of ISO/IEC 13818, the following definitions apply:
3.1
access unit
in the case of compressed audio an access unit is an audio access unit.
3.2
alias
mirrored signal component resulting from sampling.
3.3
analysis filterbank
filterbank in the encoder that transforms a broadband PCM audio signal into a set of
spectral coefficients.
3.4
ancillary data
part of the bitstream that might be used for transmission of ancillary data.
3.5
audio access unit
for AAC, an audio access unit is defined as the smallest part of the encoded bitstream
which can be decoded by itself, where decoded means "fully reconstructed sound".
Typically this is a segment of the encoded bitstream starting after the end of the byte
containing the last bit of one ID_END id_syn_ele() through the end of the byte containing
the last bit of the next ID_END id_syn_ele().
3.6
audio buffer
a buffer in the system target decoder (see ISO/IEC 13818-1) for storage of compressed
audio data.
3.7
Bark
the Bark is the standard unit corresponding to one critical band width of human hearing.
3.8
backward compatibility
a newer coding standard is backward compatible with an older coding standard if decoders
designed to operate with the older coding standard are able to continue to operate by
decoding all or part of a bitstream produced according to the newer coding standard.
3.9
bitrate
the rate at which the compressed bitstream is delivered to the input of a decoder.
3.10
bitstream; stream
an ordered series of bits that forms the coded representation of the data.
3.11
bitstream verifier
a process by which it is possible to test and verify that all the requirements specified in this
part of ISO/IEC 13818 are met by the bitstream.
3.12
block companding
normalising of the digital representation of an audio signal within a certain time period.
3.13
byte aligned
a bit in a coded bitstream is byte-aligned if its position is a multiple of 8 bits from either the
first bit in the stream for the Audio Data Interchange Format (see subclause 6.1) or the first
bit in the syncword for the Audio Data Transport Stream Format (see subclause 6.2).
3.14
Byte
sequence of 8 bits.
3.15
centre channel
an audio presentation channel used to stabilise the central component of the frontal stereo
image.
3.16
channel
a sequence of data representing an audio signal intended to be reproduced at one listening
position.
3.17
coded audio bitstream
a coded representation of an audio signal.
3.18
coded representation
a data element as represented in its encoded form.
3.19
compression
reduction in the number of bits used to represent an item of data.
3.20
constant bitrate
operation where the bitrate is constant from start to finish of the coded bitstream.
3.21
CRC
the Cyclic Redundancy Check to verify the correctness of data.
3.22
critical band
this unit of bandwidth represents the standard unit of bandwidth expressed in human
auditory terms, corresponding to a fixed length on the human cochlea. It is approximately
equal to 100 Hz at low frequencies and 1/3 octave at higher frequencies, above
approximately 700 Hz.
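Taken literally, the definition above gives a simple piecewise rule for the width of a critical band. The sketch below encodes only that rule of thumb: a constant 100 Hz up to 700 Hz and the width of a 1/3-octave band (from f * 2^(-1/6) to f * 2^(1/6)) above it. The function name is an assumption for this example, and the abrupt switch at 700 Hz is a consequence of the simplification, not a property of hearing.

```python
def approx_critical_bandwidth_hz(centre_hz):
    """Rough critical bandwidth in Hz around a given centre frequency.

    Up to 700 Hz: about 100 Hz.
    Above 700 Hz: about one third of an octave around the centre frequency.
    """
    if centre_hz <= 700.0:
        return 100.0
    upper = centre_hz * 2.0 ** (1.0 / 6.0)
    lower = centre_hz * 2.0 ** (-1.0 / 6.0)
    return upper - lower
```

At a centre frequency of 1000 Hz this rule gives a bandwidth of roughly 232 Hz.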
3.23
data element
an item of data as represented before encoding and after decoding.
3.24
decoded stream
the decoded reconstruction of a compressed bitstream.
3.25
decoder
an embodiment of a decoding process.
3.26
decoding (process)
the process defined in this part of ISO/IEC 13818 that reads an input coded bitstream and
outputs decoded audio samples.
3.27
digital storage media; DSM
a digital storage or transmission device or system.
3.28
discrete cosine transform; DCT
either the forward discrete cosine transform or the inverse discrete cosine transform. The
DCT is an invertible, discrete orthogonal transformation.
3.29
downmix
a matrixing of n channels to obtain fewer than n channels.
3.30
editing
the process by which one or more coded bitstreams are manipulated to produce a new
coded bitstream. Conforming edited bitstreams must meet the requirements defined in this
part of ISO/IEC 13818.
3.31
encoder
an embodiment of an encoding process.
3.32
encoding (process)
a process, not specified in ISO/IEC 13818, that reads a stream of input audio samples and
produces a valid coded bitstream as defined in this part of ISO/IEC 13818.
3.33
entropy coding
variable length lossless coding of the digital representation of a signal to reduce statistical
redundancy.
3.34
FFT
Fast Fourier Transformation. A fast algorithm for performing a discrete Fourier transform
(an orthogonal transform).
3.35
filterbank
a set of band-pass filters covering the entire audio frequency range.
3.36
flag
a variable which can take one of only the two values defined in this specification.
3.37
forward compatibility
a newer coding standard is forward compatible with an older coding standard if decoders
designed to operate with the newer coding standard are able to decode bitstreams of the
older coding standard.
3.38
frame
a part of the audio signal that corresponds to audio PCM samples from an audio access
unit.
3.39
Fs
sampling frequency.
3.40
Hann window
a time function applied sample-by-sample to a block of audio samples before Fourier
transformation.
3.41
Huffman coding
a specific method for entropy coding.
3.42
hybrid filterbank
a serial combination of subband filterbank and MDCT.
3.43
IDCT
Inverse Discrete Cosine Transform.
3.44
IMDCT
Inverse Modified Discrete Cosine Transform.
3.45
intensity stereo
a method of exploiting stereo irrelevance or redundancy in stereophonic audio programmes
based on retaining at high frequencies only the energy envelope of the right and left
channels.
3.46
joint stereo coding
any method that exploits stereophonic irrelevance or stereophonic redundancy.
3.47
joint stereo mode
a mode of the audio coding algorithm using joint stereo coding.
3.48
low frequency enhancement (LFE) channel
a limited bandwidth channel for low frequency audio effects in a multichannel system.
3.49
main audio channels
all channels represented by either single_channel_element()'s (see subclause 8.2.1) or
channel_pair_element()'s (see subclause 8.2.1).
3.50
Mapping
conversion of an audio signal from time to frequency domain by subband filtering and/or by
MDCT.
3.51
Masking
a property of the human auditory system by which an audio signal cannot be perceived in
the presence of another audio signal.
3.52
masking threshold
a function in frequency and time below which an audio signal cannot be perceived by the
human auditory system.
3.53
modified discrete cosine transform (MDCT)
a transform which has the property of time domain aliasing cancellation. An analytical
expression for the MDCT can be found in subclause C.3.1.2.
3.54
M/S stereo
a method of removing imaging artefacts as well as exploiting stereo irrelevance or
redundancy in stereophonic audio programmes based on coding the sum and difference
signal instead of the left and right channels.
3.55
Multichannel
a combination of audio channels used to create a spatial sound field.
3.56
Multilingual
a presentation of dialogue in more than one language.
3.57
non-tonal component
a noise-like component of an audio signal.
3.58
NCC
Number of Considered Channels. The number of channels represented by the elements
SCE, independently switched CCE and CPE, i.e. once the number of SCEs plus once the
number of independently switched CCEs plus twice the number of CPEs. With respect to
the naming conventions of the MPEG-AAC decoders and bitstreams, NCC=A+I. This
number is used to derive the required decoder input buffer size (see subclause 8.2.2).
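The counting rule for NCC can be written out directly. The function and parameter names below are assumptions for this example, and the channel layout in the note is a hypothetical program, not one taken from the standard.

```python
def number_of_considered_channels(num_sce, num_independent_cce, num_cpe):
    """NCC: SCEs and independently switched CCEs count once, CPEs count twice."""
    return num_sce + num_independent_cce + 2 * num_cpe
```

For instance, a program made of one SCE and two CPEs gives NCC = 1 + 0 + 2 * 2 = 5.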
3.59
Nyquist sampling
sampling at or above twice the maximum bandwidth of a signal.
3.60
Padding
a method to adjust the average length of an audio frame in time to the duration of the
corresponding PCM samples, by conditionally adding a slot to the audio frame.
3.61
Parameter
a variable within the syntax of this specification which may take one of a range of values. A
variable which can take one of only two values is a flag or indicator and not a parameter.
3.62
Parser
functional stage of a decoder which extracts from a coded bitstream a series of bits
representing coded elements.
3.63
polyphase filterbank
a set of equal bandwidth filters with special phase interrelationships, allowing for an efficient
implementation of the filterbank.
3.64
prediction error
the difference between the actual value of a sample or data element and its predictor.
3.65
Prediction
the use of a predictor to provide an estimate of the sample value or data element currently
being decoded.
3.66
predictor
a linear combination of previously decoded sample values or data elements.
3.67
presentation channel
an audio channel at the output of the decoder.
3.68
presentation unit
In the case of compressed audio a decoded audio access unit.
3.69
Program
a set of main audio channels, coupling_channel_element()'s (see subclause 8.2.1),
lfe_channel_element()'s (see subclause 8.2.1), and associated data streams intended to be
decoded and played back simultaneously. A program may be defined by default (see
subclause 8.5.2.1 and subclause 8.5.2.3) or specifically by a program_config_element()
(see subclause 8.5.2.2). A given single_channel_element() (see subclause 8.2.1),
channel_pair_element() (see subclause 8.2.1), coupling_channel_element(),
lfe_channel_element() or data channel may accompany one or more programs in any given
bitstream.
3.70
psychoacoustic model
a mathematical model of the masking behaviour of the human auditory system.
3.71
random access
the process of beginning to read and decode the coded bitstream at an arbitrary point.
3.72
Reserved
the term "reserved" when used in the clauses defining the coded bitstream indicates that
the value may be used in the future for ISO/IEC defined extensions.
3.73
Sampling Frequency (Fs)
defines the rate in Hertz which is used to digitise an audio signal during the sampling
process.
3.74
Scalefactor
factor by which a set of values is scaled before quantization.
3.75
scalefactor band
a set of spectral coefficients which are scaled by one scalefactor.
3.76
scalefactor index
a numerical code for a scalefactor.
3.77
side information
information in the bitstream necessary for controlling the decoder.
3.78
spectral coefficients
discrete frequency domain data output from the analysis filterbank.
3.79
spreading function
a function that describes the frequency spread of masking effects.
3.80
stereo-irrelevant
a portion of a stereophonic audio signal which does not contribute to spatial perception.
3.81
stuffing (bits); stuffing (bytes)
code-words that may be inserted at particular locations in the coded bitstream that are
discarded in the decoding process. Their purpose is to increase the bitrate of the stream
which would otherwise be lower than the desired bitrate.
3.82
surround channel
an audio presentation channel added to the front channels (L and R or L, R, and C) to
enhance the spatial perception.
3.83
Syncword
a 12-bit code embedded in the audio bitstream that identifies the start of an adts_frame()
(see subclause 6.2, Table 5).
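Searching a byte buffer for a 12-bit syncword can be sketched as follows. The value 0xFFF (twelve 1 bits) is the widely known ADTS sync pattern, but Table 5 is not part of this excerpt, so treat the constant as an assumption. ADTS_SYNCWORD and find_syncwords are names invented for this example, and the search only looks at byte-aligned positions.

```python
ADTS_SYNCWORD = 0xFFF  # assumed value: twelve 1 bits


def find_syncwords(data):
    """Return byte offsets at which a byte-aligned 12-bit syncword candidate starts.

    data: bytes or bytearray holding part of an ADTS stream
    """
    offsets = []
    for i in range(len(data) - 1):
        # Full first byte plus the upper four bits of the next byte.
        candidate = (data[i] << 4) | (data[i + 1] >> 4)
        if candidate == ADTS_SYNCWORD:
            offsets.append(i)
    return offsets
```

A match is only a candidate: the same bit pattern can occur inside coded audio data, so a real parser would also check the header fields that follow before accepting a frame start.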
3.84
synthesis filterbank
filterbank in the decoder that reconstructs a PCM audio signal from subband samples.
3.85
tonal component
a sinusoid-like component of an audio signal.
3.86
variable bitrate
operation where the bitrate varies with time during the decoding of a coded bitstream.
3.87
variable length coding
a reversible procedure for coding that assigns shorter code-words to frequent symbols and
longer code-words to l
...







