Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio (ISO/IEC 11172-3:1993)

Informationstechnik - Codierung von bewegten Bildern und damit verbundenen Tonsignalen für digitale Speichermedien bis zu 1,5 Mbit/s - Teil 3: Audio (ISO/IEC 11172-3:1993)

Technologies de l'information - Codage de l'image animée et du son associé pour les supports de stockage numérique jusqu'à environ 1,5 Mbit/s - Partie 3: Audio (ISO/IEC 11172-3:1993)


General Information

Status
Withdrawn
Publication Date
23-Feb-1995
Withdrawal Date
08-Jun-2005
Current Stage
9960 - Withdrawal effective - Withdrawal
Start Date
09-Jun-2005
Completion Date
09-Jun-2005

Standard

EN ISO/IEC 11172-3:1997

English language
152 pages

Frequently Asked Questions

EN ISO/IEC 11172-3:1995 is a standard published by the European Committee for Standardization (CEN). Its full title is "Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio (ISO/IEC 11172-3:1993)".

EN ISO/IEC 11172-3:1995 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

EN ISO/IEC 11172-3:1995 has inter-standard links to the following standards: EN 62481-2:2014, EN 61937-2:2007, EN 62766-2-1:2017, EN 62516-1:2009, EN 62107:2001, EN 61937-2:2003, EN 61937:2000, EN 62676-2-2:2014, EN 61834-11:2008, EN 62298-4:2005, EN 60774-5:2004. Reviewing these relationships helps ensure you are using the most current and applicable version of the standard.


Standards Content (Sample)


SLOVENIAN STANDARD
01-December-1997
Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio (ISO/IEC 11172-3:1993)
This Slovenian standard is identical to: EN ISO/IEC 11172-3:1995
ICS:
35.040 Character sets and information coding
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.

INTERNATIONAL STANDARD
ISO/IEC 11172-3
First edition
1993-08-01
Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s -
Part 3:
Audio
Reference number
ISO/IEC 11172-3:1993(E)
Contents
Introduction
Section 1: General
1.1 Scope
1.2 Normative references
Section 2: Technical elements
2.1 Definitions
2.2 Symbols and abbreviations
2.3 Method of describing bitstream syntax
2.4 Requirements
Annexes
A Diagrams
B Tables
C The encoding process
D Psychoacoustic models
E Bit sensitivity to errors
F Error concealment
G Joint stereo coding
H List of patent holders
© ISO/IEC 1993
All rights reserved. No part of this publication may be reproduced or utilized in any form or by
any means, electronic or mechanical, including photocopying and microfilm, without
permission in writing from the publisher.
ISO/IEC Copyright Office, Case Postale 56, CH-1211 Genève 20, Switzerland
Printed in Switzerland.
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.
International Standard ISO/IEC 11172-3 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Sub-committee SC 29, Coded representation of audio, picture, multimedia and hypermedia information.
ISO/IEC 11172 consists of the following parts, under the general title Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s:
- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Compliance testing
Annexes A and B form an integral part of this part of ISO/IEC 11172. Annexes C, D, E, F, G and H are for information only.
Introduction
Note: Readers interested in an overview of MPEG Audio should read this Introduction and then proceed to
annex A (Diagrams) and annex C (The encoding process) before reading the normative clauses 1 and 2.
To aid in the understanding of the specification of the stored compressed bitstream and its decoding, a
sequence of encoding, storage and decoding is described.
0.1 Encoding
The encoder processes the digital audio signal and produces the compressed bitstream for storage. The
encoder algorithm is not standardized, and may use various means for encoding such as estimation of the
auditory masking threshold, quantization, and scaling. However, the encoder output must be such that a
decoder conforming to the specifications of clause 2.4 will produce audio suitable for the intended
application.
[Figure 1 block labels: PCM audio samples (32 / 44,1 / 48 kHz), mapping, quantizer and coding, psychoacoustic model, frame packing, encoded bitstream, ancillary data, ISO/IEC 11172-3 encoder]
Figure 1 -- Sketch of the basic structure of an encoder
Figure 1 illustrates the basic structure of an audio encoder. Input audio samples are fed into the encoder. The
mapping creates a filtered and subsampled representation of the input audio stream. The mapped samples
may be called either subband samples (as in Layer I or II, see below) or transformed subband samples (as in
Layer III). A psychoacoustic model creates a set of data to control the quantizer and coding. These data are
different depending on the actual coder implementation. One possibility is to use an estimation of the
masking threshold to do this quantizer control. The quantizer and coding block creates a set of coding
symbols from the mapped input samples. Again, this block can depend on the encoding system. The block
'frame packing' assembles the actual bitstream from the output data of the other blocks, and adds other
information (e.g. error correction) if necessary.
There are four different modes possible: single channel, dual channel (two independent audio signals coded
within one bitstream), stereo (left and right signals of a stereo pair coded within one bitstream), and Joint
Stereo (left and right signals of a stereo pair coded within one bitstream with the stereo irrelevancy and
redundancy exploited).
0.2 Layers
Depending on the application, different layers of the coding system with increasing encoder complexity and
performance can be used. An ISO/IEC 11172-3 Audio Layer N decoder is able to decode bitstream data
which has been encoded in Layer N and all layers below N.
Layer I
This layer contains the basic mapping of the digital audio input into 32 subbands, fixed segmentation to
format the data into blocks, a psychoacoustic model to determine the adaptive bit allocation, and
quantization using block companding and formatting. The theoretical minimum encoding/decoding delay for
Layer I is about 19 ms.
Layer II
This layer provides additional coding of bit allocation, scalefactors and samples. Different framing is used.
The theoretical minimum encoding/decoding delay for Layer II is about 35 ms.
Layer III
This layer introduces increased frequency resolution based on a hybrid filterbank. It adds a different
(nonuniform) quantizer, adaptive segmentation and entropy coding of the quantized values. The theoretical
minimum encoding/decoding delay for Layer III is about 59 ms.
Joint Stereo coding can be added as an additional feature to any of the layers.
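The layer compatibility rule and the quoted minimum delays can be captured in a few lines. This is only an illustrative sketch: the names MIN_DELAY_MS and can_decode are hypothetical and do not come from the standard.

```python
# Theoretical minimum encoding/decoding delays quoted in 0.2, in milliseconds.
MIN_DELAY_MS = {1: 19, 2: 35, 3: 59}


def can_decode(decoder_layer: int, stream_layer: int) -> bool:
    """Return True if a Layer `decoder_layer` decoder can decode a
    bitstream encoded in Layer `stream_layer`.

    Rule from 0.2: a Layer N decoder decodes Layer N and all layers below N.
    """
    if decoder_layer not in MIN_DELAY_MS or stream_layer not in MIN_DELAY_MS:
        raise ValueError("layer must be 1, 2 or 3")
    return stream_layer <= decoder_layer


if __name__ == "__main__":
    for decoder in (1, 2, 3):
        supported = [s for s in (1, 2, 3) if can_decode(decoder, s)]
        print(f"Layer {decoder} decoder handles layers {supported}")
```

Under this rule a Layer III decoder accepts all three layers, while a Layer I decoder accepts only Layer I.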
0.3 Storage
Various streams of encoded video, encoded audio, synchronization data, systems data and auxiliary data may
be stored together on a storage medium. Editing of the audio will be easier if the edit point is constrained to
coincide with an addressable point.
Access to storage may involve remote access over a communication system. Access is assumed to be
controlled by a functional unit other than the audio decoder itself. This control unit accepts user commands,
reads and interprets data base structure information, reads the stored information from the media,
demultiplexes non-audio information and passes the stored audio bitstream to the audio decoder at the
required rate.
0.4 Decoding
The decoder accepts the compressed audio bitstream in the syntax defined in 2.4.1, decodes the data elements
according to 2.4.2, and uses the information to produce digital audio output according to 2.4.3.
[Figure 2 block labels: encoded bitstream, frame unpacking, reconstruction, inverse mapping, PCM audio samples (32 / 44,1 / 48 kHz), ancillary data, ISO/IEC 11172-3 decoder]
Figure 2 -- Sketch of the basic structure of a decoder
Figure 2 illustrates the basic structure of an audio decoder. Bitstream data is fed into the decoder. The
bitstream unpacking and decoding block does error detection if error-check is applied in the encoder (see
2.4.2.4). The bitstream data are unpacked to recover the various pieces of information. The
reconstruction block reconstructs the quantized version of the set of mapped samples. The inverse
mapping transforms these mapped samples back into uniform PCM.
Section 1: General
1.1 Scope
This part of ISO/IEC 11172 specifies the coded representation of high quality audio for storage media and
the method for decoding of high quality audio signals. The input of the encoder and the output of the decoder
are compatible with existing PCM standards such as standard Compact Disc and Digital Audio Tape.
This part of ISO/IEC 11172 is intended for application to digital storage media providing a total
continuous transfer rate of about 1,5 Mbit/s for both audio and video bitstreams, such as CD, DAT and
magnetic hard disc. The storage media may either be connected directly to the decoder, or via other means
such as communication lines and the ISO/IEC 11172 multiplexed stream defined in ISO/IEC 11172-1.
This part of ISO/IEC 11172 is intended for sampling rates of 32 kHz, 44,1 kHz, and 48 kHz.
1.2 Normative references
The following International Standards contain provisions which, through reference in this text, constitute
provisions of this part of ISO/IEC 11172. At the time of publication, the editions indicated were valid.
All standards are subject to revision, and parties to agreements based on this part of ISO/IEC 11172 are
encouraged to investigate the possibility of applying the most recent editions of the standards indicated
below. Members of IEC and ISO maintain registers of currently valid International Standards.
ISO/IEC 11172-1:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 1: Systems.
ISO/IEC 11172-2:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 2: Video.
CCIR Recommendation 601-2 Encoding parameters of digital television for studios.
CCIR Report 624-4 Characteristics of systems for monochrome and colour television.
CCIR Recommendation 648 Recording of audio signals.
CCIR Report 955-2 Sound broadcasting by satellite for portable and mobile receivers, including Annex IV
Summary description of Advanced Digital System II.
CCITT Recommendation J.17 Pre-emphasis used on Sound-Programme Circuits.
IEEE Draft Standard P1180/D2 1990 Specification for the implementation of 8x 8 inverse discrete cosine
transform.
IEC publication 908:1987 CD Digital Audio System.

Section 2: Technical elements
2.1 Definitions
For the purposes of ISO/IEC 11172, the following definitions apply. If specific to a part, this is noted in
square brackets.
2.1.1 ac coefficient [video]: Any DCT coefficient for which the frequency in one or both dimensions
is non-zero.
2.1.2 access unit [system]: In the case of compressed audio an access unit is an audio access unit. In
the case of compressed video an access unit is the coded representation of a picture.
2.1.3 adaptive segmentation [audio]: A subdivision of the digital representation of an audio signal
in variable segments of time.
2.1.4 adaptive bit allocation [audio]: The assignment of bits to subbands in a time and frequency
varying fashion according to a psychoacoustic model.
2.1.5 adaptive noise allocation [audio]: The assignment of coding noise to frequency bands in a
time and frequency varying fashion according to a psychoacoustic model.
2.1.6 alias [audio]: Mirrored signal component resulting from sub-Nyquist sampling.
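To make the "mirrored signal component" concrete, the sketch below computes where a pure tone lands after sampling at a rate below twice its frequency. The function name alias_frequency is hypothetical, and an ideal sampler with no anti-aliasing filter is assumed.

```python
def alias_frequency(tone_hz: float, sampling_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling.

    The spectrum repeats every `sampling_rate_hz` and is mirrored about
    half the sampling rate, so the result always lies in [0, fs/2].
    """
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    folded = tone_hz % sampling_rate_hz              # position within one period
    return min(folded, sampling_rate_hz - folded)    # mirror about fs/2


if __name__ == "__main__":
    # A 20 kHz tone sampled at 32 kHz is above fs/2 = 16 kHz and is mirrored.
    print(alias_frequency(20000.0, 32000.0))
    # A 10 kHz tone is below fs/2 and is unchanged.
    print(alias_frequency(10000.0, 32000.0))
```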
2.1.7 analysis filterbank [audio]: Filterbank in the encoder that transforms a broadband PCM audio
signal into a set of subsampled subband samples.
2.1.8 audio access unit [audio]: For Layers I and II an audio access unit is defined as the smallest
part of the encoded bitstream which can be decoded by itself, where decoded means "fully reconstructed
sound". For Layer III an audio access unit is part of the bitstream that is decodable with the use of
previously acquired main information.
2.1.9 audio buffer [audio]: A buffer in the system target decoder for storage of compressed audio data.
2.1.10 audio sequence [audio]: A non-interrupted series of audio frames in which the following
parameters are not changed:
- ID
- Layer
- Sampling Frequency
- For Layer I and II: Bitrate index
2.1.11 backward motion vector [video]: A motion vector that is used for motion compensation
from a reference picture at a later time in display order.
2.1.12 Bark [audio]: Unit of critical band rate. The Bark scale is a non-linear mapping of the frequency
scale over the audio range closely corresponding with the frequency selectivity of the human ear across the
band.
2.1.13 bidirectionally predictive-coded picture; B-picture [video]: A picture that is coded
using motion compensated prediction from a past and/or future reference picture.
2.1.14 bitrate: The rate at which the compressed bitstream is delivered from the storage medium to the
input of a decoder.
2.1.15 block companding [audio]: Normalizing of the digital representation of an audio signal
within a certain time period.
2.1.16 block [video]: An 8-row by 8-column orthogonal block of pels.
2.1.17 bound [audio]: The lowest subband in which intensity stereo coding is used.
2.1.18 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8-bits
from the first bit in the stream.
2.1.19 byte: Sequence of 8-bits.
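The byte-alignment test in 2.1.18 reduces to a modulo check on the bit position. The helper names below are hypothetical, and bit positions are assumed to be counted from 0 at the first bit in the stream.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """True if `bit_position` (0 = first bit in the stream) is a multiple of 8."""
    if bit_position < 0:
        raise ValueError("bit position must be non-negative")
    return bit_position % 8 == 0


def bits_to_next_byte_boundary(bit_position: int) -> int:
    """Number of bits to skip to reach the next byte-aligned position (0 if aligned)."""
    if bit_position < 0:
        raise ValueError("bit position must be non-negative")
    return (-bit_position) % 8


if __name__ == "__main__":
    for pos in (0, 13, 16):
        print(pos, is_byte_aligned(pos), bits_to_next_byte_boundary(pos))
```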
2.1.20 channel: A digital medium that stores or transports an ISO/IEC 11172 stream.
2.1.21 channel [audio]: The left and right channels of a stereo signal.
2.1.22 chrominance (component) [video]: A matrix, block or single pel representing one of the
two colour difference signals related to the primary colours in the manner defined in CCIR Rec 601. The
symbols used for the colour difference signals are Cr and Cb.
2.1.23 coded audio bitstream [audio]: A coded representation of an audio signal as specified in this
part of ISO/IEC 11172.
2.1.24 coded video bitstream [video]: A coded representation of a series of one or more pictures as
specified in ISO/IEC 11172-2.
2.1.25 coded order [video]: The order in which the pictures are stored and decoded. This order is not
necessarily the same as the display order.
2.1.26 coded representation: A data element as represented in its encoded form.
2.1.27 coding parameters [video]: The set of user-definable parameters that characterize a coded video
bitstream. Bitstreams are characterised by coding parameters. Decoders are characterised by the bitstreams
that they are capable of decoding.
2.1.28 component [video]: A matrix, block or single pel from one of the three matrices (luminance
and two chrominance) that make up a picture.
2.1.29 compression: Reduction in the number of bits used to represent an item of data.
2.1.30 constant bitrate coded video [video]: A compressed video bitstream with a constant
average bitrate.
2.1.31 constant bitrate: Operation where the bitrate is constant from start to finish of the compressed
bitstream.
2.1.32 constrained parameters [video]: The values of the set of coding parameters defined in
2.4.3.2 of ISO/IEC 11172-2.
2.1.33 constrained system parameter stream (CSPS) [system]: An ISO/IEC 11172
multiplexed stream for which the constraints defined in 2.4.6 of ISO/IEC 11172-1 apply.
2.1.34 CRC: Cyclic redundancy code.
2.1.35 critical band rate [audio]: Psychoacoustic function of frequency. At a given audible
frequency it is proportional to the number of critical bands below that frequency. The units of the critical
band rate scale are Barks.
2.1.36 critical band [audio]: Psychoacoustic measure in the spectral domain which corresponds to the
frequency selectivity of the human ear. This selectivity is expressed in Bark.
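As an illustration of how non-linear the critical band rate is, the sketch below uses a widely quoted closed-form approximation attributed to Zwicker and Terhardt. This formula is an outside assumption and is not taken from this standard; the function name hz_to_bark is hypothetical.

```python
import math


def hz_to_bark(frequency_hz: float) -> float:
    """Approximate critical band rate in Bark for a frequency in Hz.

    Uses the Zwicker and Terhardt approximation (an assumption, not part
    of ISO/IEC 11172-3):
        z = 13 * atan(0.00076 * f) + 3.5 * atan((f / 7500)^2)
    """
    if frequency_hz < 0:
        raise ValueError("frequency must be non-negative")
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))


if __name__ == "__main__":
    for f in (100.0, 1000.0, 4000.0, 16000.0):
        print(f"{f:8.0f} Hz -> {hz_to_bark(f):5.2f} Bark")
```

Equal steps in Bark cover far more hertz at high frequencies than at low ones, which is the behaviour the definitions of critical band and critical band rate describe.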
2.1.37 data element: An item of data as represented before encoding and after decoding.
2.1.38 dc-coefficient [video]: The DCT coefficient for which the frequency is zero in both
dimensions.
2.1.39 dc-coded picture; D-picture [video]: A picture that is coded using only information from
itself. Of the DCT coefficients in the coded representation, only the dc-coefficients are present.
2.1.40 DCT coefficient: The amplitude of a specific cosine basis function.
2.1.41 decoded stream: The decoded reconstruction of a compressed bitstream.
2.1.42 decoder input buffer [video]: The first-in first-out (FIFO) buffer specified in the video
buffering verifier.
2.1.43 decoder input rate [video]: The data rate specified in the video buffering verifier and encoded
in the coded video bitstream.
2.1.44 decoder: An embodiment of a decoding process.
2.1.45 decoding (process): The process defined in ISO/IEC 11172 that reads an input coded bitstream
and produces decoded pictures or audio samples.
2.1.46 decoding time-stamp; DTS [system]: A field that may be present in a packet header that
indicates the time that an access unit is decoded in the system target decoder.
2.1.47 de-emphasis [audio]: Filtering applied to an audio signal after storage or transmission to undo
a linear distortion due to emphasis.
2.1.48 dequantization [video]: The process of rescaling the quantized DCT coefficients after their
representation in the bitstream has been decoded and before they are presented to the inverse DCT.
2.1.49 digital storage media; DSM: A digital storage or transmission device or system.
2.1.50 discrete cosine transform; DCT [video]: Either the forward discrete cosine transform or the
inverse discrete cosine transform. The DCT is an invertible, discrete orthogonal transformation. The
inverse DCT is defined in annex A of ISO/IEC 11172-2.
2.1.51 display order [video]: The order in which the decoded pictures should be displayed. Normally
this is the same order in which they were presented at the input of the encoder.
2.1.52 dual channel mode [audio]: A mode, where two audio channels with independent programme
contents (e.g. bilingual) are encoded within one bitstream. The coding process is the same as for the stereo
mode.
2.1.53 editing: The process by which one or more compressed bitstreams are manipulated to produce a
new compressed bitstream. Conforming edited bitstreams must meet the requirements defined in this
ISO/IEC 11172.
2.1.54 elementary stream [system]: A generic term for one of the coded video, coded audio or other
coded bitstreams.
2.1.55 emphasis [audio]: Filtering applied to improve the signal-to-noise ratio at high frequencies.
2.1.56 encoder: An embodiment of an encoding process.
2.1.57 encoding (process): A process, not specified in ISO/IEC 11172, that reads a stream of input
pictures or audio samples and produces a valid coded bitstream as defined in ISO/IEC 11172.
2.1.58 entropy coding: Variable length lossless coding of the digital representation of a signal to
reduce redundancy.
2.1.59 fast forward playback [video]: The process of displaying a sequence, or parts of a sequence,
of pictures in display-order faster than real-time.
2.1.60 FFT: Fast Fourier Transformation. A fast algorithm for performing a discrete Fourier transform
(an orthogonal transform).
2.1.61 filterbank [audio]: A set of band-pass filters covering the entire audio frequency range.
2.1.62 fixed segmentation [audio]: A subdivision of the digital representation of an audio signal
into fixed segments of time.
2.1.63 forbidden: The term "forbidden" when used in the clauses defining the coded bitstream indicates
that the value shall never be used. This is usually to avoid emulation of start codes.
2.1.64 forced updating [video]: The process by which macroblocks are intra-coded from time-to-time
to ensure that mismatch errors between the inverse DCT processes in encoders and decoders cannot build up
excessively.
2.1.65 forward motion vector [video]: A motion vector that is used for motion compensation from
a reference picture at an earlier time in display order.
2.1.66 frame [audio]: A part of the audio signal that corresponds to audio PCM samples from an
Audio Access Unit.
2.1.67 free format [audio]: Any bitrate other than the defined bitrates that is less than the maximum
valid bitrate for each layer.
2.1.68 future reference picture [video]: The future reference picture is the reference picture that
occurs at a later time than the current picture in display order.
2.1.69 granules [Layer II] [audio]: The set of 3 consecutive subband samples from all 32 subbands
that are considered together before quantization. They correspond to 96 PCM samples.
2.1.70 granules [Layer III] [audio]: 576 frequency lines that carry their own side information.
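The Layer II figure of 96 PCM samples in 2.1.69 follows from 3 subband samples in each of 32 subbands. The sketch assumes, beyond what the definition states, a critically sampled filterbank in which each subband sample stands for 32 input samples. It also converts a granule to a duration at the three supported sampling rates. All names are hypothetical.

```python
SUBBANDS = 32              # number of subbands, from 2.1.69
SAMPLES_PER_SUBBAND = 3    # consecutive subband samples in a Layer II granule


def layer2_granule_pcm_samples() -> int:
    """PCM samples represented by one Layer II granule (3 x 32)."""
    return SUBBANDS * SAMPLES_PER_SUBBAND


def duration_ms(pcm_samples: int, sampling_rate_hz: int) -> float:
    """Time span in milliseconds covered by `pcm_samples` at the given rate."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return 1000.0 * pcm_samples / sampling_rate_hz


if __name__ == "__main__":
    n = layer2_granule_pcm_samples()
    for fs in (32000, 44100, 48000):
        print(f"{n} PCM samples at {fs} Hz last {duration_ms(n, fs):.3f} ms")
```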
2.1.71 group of pictures [video]: A series of one or more coded pictures intended to assist random
access. The group of pictures is one of the layers in the coding syntax defined in ISO/IEC 11172-2.
2.1.72 Hann window [audio]: A time function applied sample-by-sample to a block of audio samples
before Fourier transformation.
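A minimal sketch of a Hann window applied sample by sample to a block follows. It uses the common periodic form 0.5 * (1 - cos(2*pi*i/N)), which is an assumption; any scaling constant the standard's psychoacoustic models may apply is not reproduced here, and the function names are hypothetical.

```python
import math


def hann_window(length: int) -> list[float]:
    """Periodic Hann window: w[i] = 0.5 * (1 - cos(2*pi*i / length))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / length))
            for i in range(length)]


def apply_window(block: list[float]) -> list[float]:
    """Multiply each sample of `block` by the matching window coefficient."""
    window = hann_window(len(block))
    return [sample * w for sample, w in zip(block, window)]


if __name__ == "__main__":
    # A constant block makes the window shape directly visible.
    print(apply_window([1.0] * 8))
```

The window tapers the block towards zero at its start, which reduces spectral leakage in the Fourier transform that follows.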
2.1.73 Huffman coding: A specific method for entropy coding.
2.1.74 hybrid filterbank [audio]: A serial combination of subband filterbank and MDCT.
2.1.75 IMDCT [audio]: Inverse Modified Discrete Cosine Transform.
2.1.76 intensity stereo [audio]: A method of exploiting stereo irrelevance or redundancy in
stereophonic audio programmes based on retaining at high frequencies only the energy envelope of the right
and left channels.
2.1.77 interlace [video]: The property of conventional television pictures where alternating lines of
the picture represent different instances in time.
2.1.78 intra coding [video]: Coding of a macroblock or picture that uses information only from that
macroblock or picture.
2.1.79 intra-coded picture; I-picture [video]: A picture coded using information only from itself.
2.1.80 ISO/IEC 11172 (multiplexed) stream [system]: A bitstream composed of zero or more
elementary streams combined in the manner defined in ISO/IEC 11172-1.
2.1.81 joint stereo coding [audio]: Any method that exploits stereophonic irrelevance or
stereophonic redundancy.
2.1.82 joint stereo mode [audio]: A mode of the audio coding algorithm using joint stereo coding.
2.1.83 layer [audio]: One of the levels in the coding hierarchy of the audio system defined in this part
of ISO/IEC 11172.
2.1.84 layer [video and systems]: One of the levels in the data hierarchy of the video and system
specifications defined in ISO/IEC 11172-1 and ISO/IEC 11172-2.
2.1.85 luminance (component) [video]: A matrix, block or single pel representing a monochrome
representation of the signal and related to the primary colours in the manner defined in CCIR Rec 601. The
symbol used for luminance is Y.
2.1.86 macroblock [video]: The four 8 by 8 blocks of luminance data and the two corresponding 8 by
8 blocks of chrominance data coming from a 16 by 16 section of the luminance component of the picture.
Macroblock is sometimes used to refer to the pel data and sometimes to the coded representation of the pel
values and other data elements defined in the macroblock layer of the syntax defined in ISO/IEC 11172-2.
The usage is clear from the context.
2.1.87 mapping [audio]: Conversion of an audio signal from time to frequency domain by subband
filtering and/or by MDCT.
2.1.88 masking [audio]: A property of the human auditory system by which an audio signal cannot be
perceived in the presence of another audio signal.
2.1.89 masking threshold [audio]: A function in frequency and time below which an audio signal
cannot be perceived by the human auditory system.
2.1.90 MDCT [audio]: Modified Discrete Cosine Transform.
2.1.91 motion compensation [video]: The use of motion vectors to improve the efficiency of the
prediction of pel values. The prediction uses motion vectors to provide offsets into the past and/or future
reference pictures containing previously decoded pel values that are used to form the prediction error signal.
2.1.92 motion estimation [video]: The process of estimating motion vectors during the encoding
process.
2.1.93 motion vector [video]: A two-dimensional vector used for motion compensation that provides
an offset from the coordinate position in the current picture to the coordinates in a reference picture.
2.1.94 MS stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic
audio programmes based on coding the sum and difference signal instead of the left and right channels.
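The sum and difference idea behind MS stereo can be sketched as a lossless round trip. The factor of 1/2 used here is an illustrative assumption; the normalization actually used by the standard is specified elsewhere in the document and is not reproduced. The function names are hypothetical.

```python
def ms_encode(left: list[float], right: list[float]) -> tuple[list[float], list[float]]:
    """Convert left/right samples to mid (sum) and side (difference) signals."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid: list[float], side: list[float]) -> tuple[list[float], list[float]]:
    """Recover left/right samples from mid and side signals."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    left = [1.0, 0.5, -0.25]
    right = [1.0, -0.5, 0.75]
    mid, side = ms_encode(left, right)
    print(mid, side)              # side is zero wherever left == right
    print(ms_decode(mid, side))   # round trip returns the original channels
```

When the two channels are highly correlated the side signal stays close to zero, which is what makes coding it cheaper than coding left and right separately.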
2.1.95 non-intra coding [video]: Coding of a macroblock or picture that uses information both from
itself and from macroblocks and pictures occurring at other times.
2.1.96 non-tonal component [audio]: A noise-like component of an audio signal.
2.1.97 Nyquist sampling: Sampling at or above twice the maximum bandwidth of a signal.
2.1.98 pack [system]: A pack consists of a pack header followed by one or more packets. It is a layer
in the system coding syntax described in ISO/IEC 11172-1.
2.1.99 packet data [system]: Contiguous bytes of data from an elementary stream present in a packet.
2.1.100 packet header [system]: The data structure used to convey information about the elementary
stream data contained in the packet data.
2.1.101 packet [system]: A packet consists of a header followed by a number of contiguous bytes
from an elementary data stream. It is a layer in the system coding syntax described in ISO/IEC 11172-1.
2.1.102 padding [audio]: A method to adjust the average length in time of an audio frame to the
duration of the corresponding PCM samples, by conditionally adding a slot to the audio frame.
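Padding is needed whenever a frame's ideal length is not a whole number of slots. The sketch below shows one simple remainder-accumulation scheme, not necessarily the exact procedure the standard prescribes. It assumes, beyond this excerpt, a Layer II frame of 1152 PCM samples, and uses the one-byte slot given for Layers II and III in 2.1.133. The function name and parameters are hypothetical.

```python
def frame_sizes_in_slots(bitrate_bps: int, sampling_rate_hz: int, n_frames: int,
                         samples_per_frame: int = 1152, slot_bits: int = 8) -> list[int]:
    """Return the length in slots of each of `n_frames` consecutive frames.

    The ideal frame length is samples_per_frame * bitrate / (slot_bits * fs)
    slots. The fractional part is accumulated in integer arithmetic, and one
    extra slot (the padding slot) is added whenever a whole slot is owed.
    """
    numerator = samples_per_frame * bitrate_bps
    denominator = slot_bits * sampling_rate_hz
    base_slots, remainder = divmod(numerator, denominator)

    sizes = []
    shortfall = 0  # accumulated fractional slots, in units of 1/denominator
    for _ in range(n_frames):
        shortfall += remainder
        if shortfall >= denominator:
            shortfall -= denominator
            sizes.append(base_slots + 1)   # padded frame
        else:
            sizes.append(base_slots)       # unpadded frame
    return sizes


if __name__ == "__main__":
    sizes = frame_sizes_in_slots(128000, 44100, 49)
    print(sorted(set(sizes)), sum(sizes))
```

At 128 kbit/s and 44,1 kHz the ideal frame length under these assumptions is about 417,96 slots, so most frames carry the extra slot. At 48 kHz the ideal length is exactly 384 slots and no padding is needed.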
2.1.103 past reference picture [video]: The past reference picture is the reference picture that occurs
at an earlier time than the current picture in display order.
2.1.104 pel aspect ratio [video]: The ratio of the nominal vertical height of pel on the display to its
nominal horizontal width.
2.1.105 pel [video]: Picture element.
2.1.106 picture period [video]: The reciprocal of the picture rate.
2.1.107 picture rate [video]: The nominal rate at which pictures should be output from the decoding
process.
2.1.108 picture [video]: Source, coded or reconstructed image data. A source or reconstructed picture
consists of three rectangular matrices of 8-bit numbers representing the luminance and two chrominance
signals. The Picture layer is one of the layers in the coding syntax defined in ISO/IEC 11172-2. Note that
the term "picture" is always used in ISO/IEC 11172 in preference to the terms field or frame.
2.1.109 polyphase filterbank [audio]: A set of equal bandwidth filters with special phase
interrelationships, allowing for an efficient implementation of the filterbank.
2.1.110 prediction [video]: The use of a predictor to provide an estimate of the pel value or data
element currently being decoded.
2.1.111 predictive-coded picture; P-picture [video]: A picture that is coded using motion
compensated prediction from the past reference picture.
2.1.112 prediction error [video]: The difference between the actual value of a pel or data element and
its predictor.
2.1.113 predictor [video]: A linear combination of previously decoded pel values or data elements.
2.1.114 presentation time-stamp; PTS [system]: A field that may be present in a packet header
that indicates the time that a presentation unit is presented in the system target decoder.
2.1.115 presentation unit; PU [system]: A decoded audio access unit or a decoded picture.
2.1.116 psychoacoustic model [audio]: A mathematical model of the masking behaviour of the
human auditory system.
2.1.117 quantization matrix [video]: A set of sixty-four 8-bit values used by the dequantizer.
2.1.118 quantized DCT coefficients [video]: DCT coefficients before dequantization. A variable
length coded representation of quantized DCT coefficients is stored as part of the compressed video
bitstream.
2.1.119 quantizer scalefactor [video]: A data element represented in the bitstream and used by the
decoding process to scale the dequantization.
2.1.120 random access: The process of beginning to read and decode the coded bitstream at an arbitrary
point.
ISO/IEC 11172-3: 1993 (E)
© ISO/IEC
2.1.121 reference picture [video]: Reference pictures are the nearest adjacent I- or P-pictures to the
current picture in display order.
2.1.122 reorder buffer [video]: A buffer in the system target decoder for storage of a reconstructed I-
picture or a reconstructed P-picture.
2.1.123 requantization [audio]: Decoding of coded subband samples in order to recover the original
quantized values.
2.1.124 reserved: The term "reserved" when used in the clauses defining the coded bitstream indicates
that the value may be used in the future for ISO/IEC defined extensions.
2.1.125 reverse playback [video]: The process of displaying the picture sequence in the reverse of
display order.
2.1.126 scalefactor band [audio]: A set of frequency lines in Layer III which are scaled by one
scalefactor.
2.1.127 scalefactor index [audio]: A numerical code for a scalefactor.
2.1.128 scalefactor [audio]: Factor by which a set of values is scaled before quantization.
2.1.129 sequence header [video]: A block of data in the coded bitstream containing the coded
representation of a number of data elements.
2.1.130 side information: Information in the bitstream necess
2.1.131 skipped macroblock [video]: A macroblock for which no data are stored.
2.1.132 slice [video]: A series of macroblocks. It is one of the layers of the coding syntax defined in
ISO/IEC 11172-2.
2.1.133 slot [audio]: In Layers II and III, a slot is one byte.
2.1.134 source stream: A single non-multiplexed stream of samples before compression coding.
2.1.135 spreading function [audio]: A function that describes the frequency spread of masking.
2.1.136 start codes [system and video]: 32-bit codes embedded in the coded bitstream that are
unique. They are used for several purposes including identifying some of the layers in the coding syntax.
2.1.137 STD input buffer [system]: A first-in first-out buffer at the input of the system target
decoder for storage of compressed data from elementary streams before decoding.
2.1.138 stereo mode [audio]: Mode, where two audio channels which form a stereo pair (left and
right) are encoded within one bitstream. The coding process is the same as for the dual channel mode.
2.1.139 stuffing (bits); stuffing (bytes) : Code-words that may be inserted into the compressed
bitstream that are discarded in the decoding process. Their purpose is to increase the bitrate of the stream.
2.1.140 subband [audio]: Subdivision of the audio frequency band.
2.1.141 subband filterbank [audio]: A set of band filters covering the entire audio frequency range.
In this part of ISO/IEC 11172 the subband filterbank is a polyphase filterbank.
2.1.142 subband samples [audio]: The subband filterbank within the audio encoder creates a filtered
and subsampled representation of the input audio stream. The filtered samples are called subband samples.
From 384 time-consecutive input audio samples, 12 time-consecutive subband samples are generated within
each of the 32 subbands.
2.1.143 syncword [audio]: A 12-bit code embedded in the audio bitstream that identifies the start of a
frame.
2.1.144 synthesis filterbank [audio]: Filterbank in the decoder that reconstructs a PCM audio
signal from subband samples.
2.1.145 system header [system]: The system header is a data structure defined in ISO/IEC 11172-1
that carries information summarising the system characteristics of the ISO/IEC 11172 multiplexed stream.
2.1.146 system target decoder; STD [system]: A hypothetical reference model of a decoding
process used to describe the semantics of an ISO/IEC 11172 multiplexed bitstream.
2.1.147 time-stamp [system]: A term that indicates the time of an event.
2.1.148 triplet [audio]: A set of 3 consecutive subband samples from one subband. A triplet from
each of the 32 subb
2.1.149 tonal component [audio]: A sinusoid-like component of an audio signal.
2.1.150 variable bitrate: Operation where the bitrate varies with time during the decoding of a
compressed bitstream.
2.1.151 variable length coding; VLC: A reversible procedure for coding that assigns shorter code-
words to frequent events and longer code-words to less frequent events.
2.1.152 video buffering verifier; VBV [video]: A hypothetical decoder that is conceptually
connected to the output of the encoder. Its purpose is to provide a constraint on the variability of the data
rate that
2.1.153 video sequence [video]: A series of one or more groups of pictures. It is one of the layers of
the coding syntax defined in ISO/IEC 11172-2.
2.1.154 zig-zag scanning order [video]: A specific sequential ordering of the DCT coefficients from
(approximately) the lowest spatial frequency to the highest.
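As an illustration of the variable length coding idea in 2.1.151, the following is a minimal sketch that builds a prefix code with the classic Huffman construction, so that frequent symbols receive shorter code-words. It is not the code table of any layer of this standard; the function names build_vlc, vlc_encode and vlc_decode and the sample message are assumptions made for this example only.

```python
import heapq
from collections import Counter


def build_vlc(symbols):
    """Build a prefix code (Huffman construction) from an iterable of symbols.

    Returns a dict mapping each symbol to a code-word made of '0'/'1' characters.
    """
    counts = Counter(symbols)
    if len(counts) == 1:
        # Degenerate case: a single symbol still needs a one-bit code-word.
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tie breaker, {symbol: partial code-word}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, left = heapq.heappop(heap)
        n1, _, right = heapq.heappop(heap)
        # Prefix the two least frequent groups with '0' and '1' and merge them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


def vlc_encode(symbols, table):
    """Concatenate the code-words of all symbols."""
    return "".join(table[s] for s in symbols)


def vlc_decode(bits, table):
    """Reverse the coding; works because no code-word is a prefix of another."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


message = list("aaaaaaabbbc")
table = build_vlc(message)
encoded = vlc_encode(message, table)
decoded = vlc_decode(encoded, table)
```

The round trip from message to encoded and back to decoded shows the reversibility required by the definition; the most frequent symbol ends up with the shortest code-word.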
2.2 Symbols and abbreviations
The mathematical operators used to describe this International Standard are similar to those used in the C
programming language. However, integer division with truncation and rounding are specifically defined.
The bitwise operators are defined assuming twos-complement representation of integers. Numbering and
counting loops generally begin from zero.
2.2.1 Arithmetic operators
+          Addition.
-          Subtraction (as a binary operator) or negation (as a unary operator).
++         Increment.
--         Decrement.
*          Multiplication.
^          Power.
/          Integer division with truncation of the result toward zero. For example, 7/4 and -7/-4 are
           truncated to 1 and -7/4 and 7/-4 are truncated to -1.
//         Integer division with rounding to the nearest integer. Half-integer values are rounded away
           from zero unless otherwise specified. For example 3//2 is rounded to 2, and -3//2 is rounded
           to -2.
DIV        Integer division with truncation of the result towards -∞.
| |        Absolute value.
           | x | = x     when x > 0
           | x | = 0     when x == 0
           | x | = -x    when x < 0
%          Modulus operator. Defined only for positive numbers.
Sign( )    Sign(x) =  1    x > 0
                      0    x == 0
                     -1    x < 0
NINT( )    Nearest integer operator. Returns the nearest integer value to the real-valued argument. Half-
           integer values are rounded away from zero.
sin        Sine.
cos        Cosine.
exp        Exponential.
√          Square root.
log10      Logarithm to base ten.
loge       Logarithm to base e.
log2       Logarithm to base 2.
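The three integer divisions and the nearest integer operator described above differ only in how they treat negative and half-integer results, which is easy to get wrong in an implementation. The following sketch shows one possible reading in Python; the function names div_trunc, div_round, div_floor, nint and sign are assumptions of this example and do not appear in the standard. Note that Python's own // operator rounds toward minus infinity, so it matches only the third form.

```python
import math


def div_trunc(a, b):
    """Integer division with the result truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q


def div_round(a, b):
    """Integer division rounded to the nearest integer, halves away from zero."""
    q = (2 * abs(a) + abs(b)) // (2 * abs(b))
    return q if (a >= 0) == (b > 0) else -q


def div_floor(a, b):
    """Integer division with the result truncated toward minus infinity."""
    return a // b  # Python's // already floors


def nint(x):
    """Nearest integer to a real value, halves rounded away from zero."""
    magnitude = int(math.floor(abs(x) + 0.5))
    return magnitude if x >= 0 else -magnitude


def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)
```

With these definitions, div_trunc reproduces the examples given for truncating division (7 and 4 in any sign combination), and div_round reproduces the examples given for rounding division (3 and 2).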
2.2.2 Logical operators
|| Logical OR.
&& Logical AND.
! Logical NOT.
2.2.3 Relational operators
> Greater than.
>= Greater than or equal to.
< Less than.
<= Less than or equal to.
== Equal to.
!= Not equal to.
max [,...,] the maximum value in the argument list.
min [,...,] the minimum value in the argument list.
2.2.4 Bitwise operators
A twos complement number representation is assumed where the bitwise operators are used.
& AND
| OR
>> Shift right with sign extension.
<< Shift left with zero fill.
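Because the shift operators are defined on twos-complement values, their effect on negative numbers depends on the word width in use. The sketch below shows the two shifts on a fixed-width word. The 16-bit default width and the helper names to_signed, shift_right and shift_left are assumptions of this example; the text above does not fix a word width.

```python
def to_signed(value, bits=16):
    """Interpret the low `bits` bits of value as a twos-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def shift_right(value, n, bits=16):
    """Shift right with sign extension on a `bits`-wide word."""
    # Python's >> on a negative int is already an arithmetic shift.
    return to_signed(value, bits) >> n


def shift_left(value, n, bits=16):
    """Shift left with zero fill; bits moved past the word width are dropped."""
    return to_signed(value << n, bits)
```

For example, the 16-bit pattern 0xFFF8 represents -8, so shifting it right by one position with sign extension gives -4 rather than a large positive value.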
2.2.5 Assignment
= Assignment operator.
2.2.6 Mnemonics
The following mnemonics are defined to describe the different data types used in the coded bit-stream.
bslbf           Bit string, left bit first, where "left" is the order in which bit strings are written in
                ISO/IEC 11172. Bit strings are written as a string of 1s and 0s within single quote
                marks, e.g. '1000 0001'. Blanks within a bit string are for ease of reading and have no
                significance.
ch              Channel. If ch has the value 0, the left channel of a stereo signal or the first of two
                independent signals is indicated. (Audio)
nch             Number of channels; equal to 1 for single-channel mode, 2 in other modes. (Audio)
gr              Granule of 3 * 32 subband samples in audio Layer II, 18 * 32 subband samples in
                audio Layer III. (Audio)
main_data       The main_data portion of the bitstream contains the scalefactors, Huffman encoded
                data, and ancillary information. (Audio)
main_data_beg   The location in the bitstream of the beginning of the main_data for the frame. The
                location is equal to the ending location of the previous frame's main_data plus one bit.
                It is calculated from the main_data_end value of the previous frame. (Audio)
part2_length    The number of main_data bits used for scalefactors. (Audio)
rpchof          Remainder polynomial coefficients, highest order first. (Audio)
sb              Subband. (Audio)
sblimit         The number of the lowest sub-b
scfsi           Scalefactor selection information. (Audio)
switch_point_l  Number of scalefactor band (long block scalefactor band) from which point on window
                switching is used. (Audio)
switch_point_s  Number of scalefactor band (short block scalefactor band) from which point on window
                switching is used. (Audio)
uimsbf          Unsigned integer, most significant bit first.
vlclbf          Variable length code, left bit first, where "left" refers to the order in which the VLC
                codes are written.
window          Number of the actual time slot in case of block_type==2, 0 <= window <= 2. (Audio)
The byte order of multi-byte words is most significant byte first.
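The uimsbf and bslbf mnemonics both describe fields that are read from the bitstream starting with the leftmost (most significant) bit. The following is a minimal sketch of a reader for such fields; the class name BitReader, its method names and the sample bytes are assumptions of this example, not part of the standard.

```python
class BitReader:
    """Reads fields from a byte string, most significant bit first."""

    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0  # current position, in bits from the start of the data

    def read_bit(self):
        """Return the next bit (0 or 1) and advance by one position."""
        if self.pos >= 8 * len(self.data):
            raise EOFError("not enough bits left")
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_uimsbf(self, nbits):
        """Unsigned integer, most significant bit first."""
        value = 0
        for _ in range(nbits):
            value = (value << 1) | self.read_bit()
        return value

    def read_bslbf(self, nbits):
        """Bit string, left bit first, returned as a string of '0' and '1'."""
        return "".join(str(self.read_bit()) for _ in range(nbits))


reader = BitReader(bytes([0b10000001, 0xAB]))
first = reader.read_bslbf(8)    # the bit string '1000 0001' from the text
high = reader.read_uimsbf(4)    # upper four bits of 0xAB
low = reader.read_uimsbf(4)     # lower four bits of 0xAB
word = BitReader(b"\x12\x34").read_uimsbf(16)  # most significant byte first
```

Reading a 16-bit field across two bytes in this way yields the most significant byte first, consistent with the byte order stated above.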
2.2.7 Constants
π 3,14159265358...
e 2,71828182845...
2.3 Method of describing bitstream syntax
The bitstream retrieved by the decoder is described in 2.4.1. Each data item in the bitstream is in bold type.
It is described by its name, its length in bits, and a mnemonic for its type and order of transmission.
The action caused by a decoded data element in a bitstream depends on the value of that data element and
on data elements previously decoded. The decoding of the data elements and definition of the state variables
used in their decoding are described in 2.4.2. The following constructs are used to express the conditions
when data elements are present, and are in normal type:
Note this syntax uses the 'C'-code convention that a variable or expression evaluating to a non-zero value is
equivalent to a condition that is true.
while ( condition ) {        If the condition is true, then the group of data elements occurs next
    data_element             in the data stream. This repeats until the condition is not true.
    ...
}
do {
    data_element             The data element always occurs at least once.
    ...
} while ( condition )        The data element is repeated until the condition is not true.
if ( condition ) {           If the condition is true, then the first group of data elements occurs
    data_element             next in the data stream.
    ...
}
else {                       If the condition is not true, then the second group of data elements
    data_element             occurs next in the data stream.
    ...
}
for (expr1; expr2; expr3) {  expr1 is an expression specifying the initialization of the loop. Normally it
    data_element             specifies the initial state of the counter. expr2 is a condition specifying a test
    ...                      made before each iteration of the loop. The loop terminates when the condition
}                            is not true. expr3 is an expression that is performed at the end of each iteration
                             of the loop, normally it increments a counter.
Note that the most common usage of this construct is as follows:
for ( i = 0; i < n; i++) {   The group of data elements occurs n times. Conditional constructs
    data_element             within the group of data elements may depend on the value of the
    ...                      loop control variable i, which is set to zero for the first occurrence,
}                            incremented to one for the second occurrence, and so forth.
As noted, the group of data elements may contain nested conditional constructs. For compactness, the { }
may be omitted when only one data element follows.
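To show how a syntax description written with these constructs maps onto parsing code, the sketch below parses a small made-up syntax: a 4-bit count, followed count times by a 1-bit flag and, only when the flag is non-zero, a 3-bit value. This layout, the field names count, flag and value, and the helper names read_bits and parse_example are hypothetical and are not taken from the standard.

```python
def read_bits(data, pos, nbits):
    """Return (value, new_pos), reading nbits MSB-first from bit offset pos."""
    value = 0
    for i in range(pos, pos + nbits):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value, pos + nbits


def parse_example(data):
    """Parse the hypothetical syntax:

        count                          4 bits
        for (i = 0; i < count; i++) {
            flag[i]                    1 bit
            if (flag[i])
                value[i]               3 bits
        }
    """
    pos = 0
    count, pos = read_bits(data, pos, 4)
    flags, values = [], []
    for i in range(count):           # the group occurs count times
        flag, pos = read_bits(data, pos, 1)
        flags.append(flag)
        if flag:                     # a non-zero value counts as true
            value, pos = read_bits(data, pos, 3)
            values.append(value)
        else:
            values.append(None)      # value[i] is absent from the stream
    return {"count": count, "flag": flags, "value": values, "bits_used": pos}


# Bits: 0010 (count = 2), 1 101 (flag, value = 5), 0 (flag only).
result = parse_example(bytes([0x2D, 0x00]))
```

The number of bits consumed depends on the decoded flag values, which is the behaviour the conditional constructs are meant to express.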
data_element [ ]             data_element [ ] is an array of data. The number of data elements is indicated by
                             the context.
data_element [n]             data_element [n] is the n+1th element of an array of data.
data_element [m][n]          data_element [m][n] is the m+1,n+1 th element of a two-dimensional array of
                             data.
data_element [l][m][n]       data_element [l][m][n] is the l+1,m+1,n+1 th element of a three-dimensional
                             array of data.
data_element [m..n]          is the inclusive range of bits between bit m and bit n in the data_element.
While the syntax is expressed in procedural terms, it should not be assumed that 2.4.3 implements a
satisfactory decoding procedure. In particular, it defines a correct and error-free in
...
