EN ISO/IEC 11172-1:1995
Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 1: Systems (ISO/IEC 11172-1:1993)
Informationstechnik - Codierung von bewegten Bildern und damit verbundenen Tonsignalen für digitale Speichermedien bis zu 1,5 Mbit/s - Teil 1: Systeme (ISO/IEC 11172-1:1993)
Technologies de l'information - Codage de l'image animée et du son associé pour les supports de stockage numérique jusqu'à environ 1,5 Mbit/s - Partie 1: Systèmes (ISO/IEC 11172-1:1993)
General Information
- Status: Withdrawn
- Publication Date: 23-Feb-1995
- Withdrawal Date: 08-Jun-2005
- Technical Committee: CEN/SS F12 - Information processing systems
- Drafting Committee: CEN/SS F12 - Information processing systems
- Current Stage: 9960 - Withdrawal effective - Withdrawal
- Start Date: 09-Jun-2005
- Completion Date: 09-Jun-2005
Relations
- Effective Date: 09-Feb-2026
Frequently Asked Questions
EN ISO/IEC 11172-1:1995 is a standard published by the European Committee for Standardization (CEN). Its full title is "Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 1: Systems (ISO/IEC 11172-1:1993)".
EN ISO/IEC 11172-1:1995 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
EN ISO/IEC 11172-1:1995 has the following relationships with other standards: it has an inter-standard link to EN 62481-2:2014. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
SLOVENSKI STANDARD
01-december-1997
Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 1: Systems (ISO/IEC 11172-1:1993)
Informationstechnik - Codierung von bewegten Bildern und damit verbundenen
Tonsignalen für digitale Speichermedien bis zu 1,5 Mbit/s - Teil 1: Systeme (ISO/IEC
11172-1:1993)
Technologies de l'information - Codage de l'image animée et du son associé pour les
supports de stockage numérique jusqu'à environ 1,5 Mbit/s - Partie 1: Systèmes
(ISO/IEC 11172-1:1993)
This Slovenian standard is identical to: EN ISO/IEC 11172-1:1995
ICS:
35.040 Nabori znakov in kodiranje informacij (Character sets and information coding)
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard in whole or in part is not permitted.
INTERNATIONAL STANDARD ISO/IEC 11172-1
First edition
1993-08-01
Information technology - Coding of
moving pictures and associated audio for
digital storage media at up to about
1,5 Mbit/s -
Part 1:
Systems
Technologies de l'information - Codage de l'image animée et du son
associé pour les supports de stockage numérique jusqu'à environ
1,5 Mbit/s -
Partie 1: Systèmes
Reference number
ISO/IEC 11172-1:1993(E)
Contents
Foreword
Introduction
Section 1: General
1.1 Scope
1.2 Normative references
Section 2: Technical elements
2.1 Definitions
2.2 Symbols and abbreviations
2.3 Method of describing bit stream syntax
2.4 Requirements
Annexes
A Description of the system coding layer
B List of patent holders
© ISO/IEC 1993
All rights reserved. No part of this publication may be reproduced or utilized in any form or by
any means, electronic or mechanical, including photocopying and microfilm, without
permission in writing from the publisher.
ISO/IEC Copyright Office • Case Postale 56 • CH-1211 Genève 20 • Switzerland
Printed in Switzerland.
Foreword
ISO (the International Organization for Standardization) and IEC (the International
Electrotechnical Commission) form the specialized system for
worldwide standardization. National bodies that are members of ISO or
IEC participate in the development of International Standards through
technical committees established by the respective organization to deal
with particular fields of technical activity. ISO and IEC technical committees
collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC,
also take part in the work.
In the field of information technology, ISO and IEC have established a joint
technical committee, ISO/IEC JTC 1. Draft International Standards adopted
by the joint technical committee are circulated to national bodies for voting.
Publication as an International Standard requires approval by at least
75 % of the national bodies casting a vote.
International Standard ISO/IEC 11172-1 was prepared by Joint Technical
Committee ISO/IEC JTC 1, Information technology, Sub-Committee SC 29,
Coded representation of audio, picture, multimedia and hypermedia information.
ISO/IEC 11172 consists of the following Parts, under the general title
Information technology - Coding of moving pictures and associated audio
for digital storage media at up to about 1,5 Mbit/s:
- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Compliance testing
Annexes A and B of this part of ISO/IEC 11172 are for information only.
Introduction
Note -- Readers interested in an overview of the MPEG Systems layer should read this Introduction and then
proceed to annex A, before returning to clauses 1 and 2. Since the system target decoder concept is
referred to throughout both the normative and informative clauses of this part of ISO/IEC 11172, it may
also be useful to refer to clause 2.4, and particularly 2.4.2, where the system target decoder is described.
The Systems specification addresses the problem of combining one or more data streams from the video and
audio parts of this International Standard with timing information to form a single stream. Once combined
into a single stream, the data are in a form well suited to digital storage or transmission. The syntactical
and semantic rules imposed by this Systems specification enable synchronized playback without overflow or
underflow of decoder buffers under a wide range of stream retrieval or receipt conditions. The scope of
syntactical and semantic rules set forth in the Systems specification differs: the syntactical rules apply to
Systems layer coding only, and do not extend to the compression layer coding of the video and audio
specifications; by contrast, the semantic rules apply to the combined stream in its entirety.
The Systems specification does not specify the architecture or implementation of encoders or decoders.
However, bitstream properties do impose functional and performance requirements on encoders and decoders.
For instance, encoders must meet minimum clock tolerance requirements. Notwithstanding this and other
requirements, a considerable degree of freedom exists in the design and implementation of encoders and
decoders.
A prototypical audio/video decoder system is depicted in figure 1 to illustrate the function of an ISO/IEC
11172 decoder. The architecture is not unique -- System Decoder functions including decoder timing control
might equally well be distributed among elementary stream decoders and the Medium Specific Decoder -- but
this figure is useful for discussion. The prototypical decoder design does not imply any normative
requirement for the design of an ISO/IEC 11172 decoder. Indeed non-audio/video data is also allowed, but
not shown.
Figure 1 -- Prototypical ISO/IEC 11172 decoder
The prototypical ISO/IEC 11172 decoder shown in figure 1 is composed of System, Video, and Audio
decoders conforming to Parts 1, 2, and 3, respectively, of ISO/IEC 11172. In this decoder the multiplexed
coded representation of one or more audio and/or video streams is assumed to be stored on a digital storage
medium (DSM), or network, in some medium-specific format. The medium-specific format is not governed
by this International Standard, nor is the medium-specific decoding part of the prototypical ISO/IEC 11172
decoder.
The prototypical decoder accepts as input an ISO/IEC 11172 multiplexed stream and relies on a System
Decoder to extract timing information from the stream. The System Decoder demultiplexes the stream, and
the elementary streams so produced serve as inputs to Video and Audio decoders, whose outputs are decoded
video and audio signals. Included in the design, but not shown in the figure, is the flow of timing
information among the System Decoder, the Video and Audio Decoders, and the Medium Specific Decoder.
The Video and Audio Decoders are synchronized with each other and with the DSM using this timing
information.
ISO/IEC 11172 multiplexed streams are constructed in two layers: a system layer and a compression layer.
The input stream to the System Decoder has a system layer wrapped about a compression layer. Input
streams to the Video and Audio decoders have only the compression layer.
Operations performed by the System Decoder either apply to the entire ISO/IEC 11172 multiplexed stream
(“multiplex-wide operations”), or to individual elementary streams (“stream-specific operations”). The
ISO/IEC 11172 system layer is divided into two sub-layers, one for multiplex-wide operations (the pack
layer), and one for stream-specific operations (the packet layer).
0.1 Multiplex-wide operations (pack layer)
Multiplex-wide operations include the coordination of data retrieval off the DSM, the adjustment of clocks,
and the management of buffers. The tasks are intimately related. If the rate of data delivery off the DSM is
controllable, then DSM delivery may be adjusted so that decoder buffers neither overflow nor underflow;
but if the DSM rate is not controllable, then elementary stream decoders must slave their timing to the
DSM to avoid overflow or underflow.
ISO/IEC 11172 multiplexed streams are composed of packs whose headers facilitate the above tasks. Pack
headers specify intended times at which each byte is to enter the system decoder from the DSM, and this
target arrival schedule serves as a reference for clock correction and buffer management. The schedule need
not be followed exactly by decoders, but they must compensate for deviations about it.
An additional multiplex-wide operation is a decoder’s ability to establish what resources are required to
decode an ISO/IEC 11172 multiplexed stream. The first pack of each ISO/IEC 11172 multiplexed stream
conveys parameters to assist decoders in this task. Included, for example, are the stream’s maximum data
rate and the highest number of simultaneous video channels.
0.2 Individual stream operations (packet layer)
The principal stream-specific operations are 1) demultiplexing, and 2) synchronizing playback of multiple
elementary streams. These topics are discussed next.
0.2.1 Demultiplexing
On encoding, ISO/IEC 11172 multiplexed streams are formed by multiplexing elementary streams.
Elementary streams may include private, reserved, and padding streams in addition to ISO/IEC 11172 audio
and video streams. The streams are temporally subdivided into packets, and the packets are serialized. A
packet contains coded bytes from one and only one elementary stream.
Both fixed and variable packet lengths are allowed subject to constraints in 2.4.3.3 and in 2.4.5 and 2.4.6.
On decoding, demultiplexing is required to reconstitute elementary streams from the ISO/IEC 11172
multiplexed stream. This is made possible by stream_id codes in packet headers.
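The demultiplexing step can be sketched as follows. This is a minimal illustration, not the header syntax of clause 2.4: it assumes the packets have already been parsed, and the `Packet` type, its field names and the identifier values `AUDIO` and `VIDEO` are hypothetical. Packet data is simply grouped by stream identifier in arrival order.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical, already-parsed packet: a stream identifier plus packet data."""
    stream_id: int
    data: bytes


def demultiplex(packets: Iterable[Packet]) -> dict[int, bytes]:
    """Concatenate packet data per stream_id, preserving arrival order."""
    streams: dict[int, bytearray] = defaultdict(bytearray)
    for packet in packets:
        # A packet carries bytes from one and only one elementary stream.
        streams[packet.stream_id] += packet.data
    return {stream_id: bytes(buf) for stream_id, buf in streams.items()}


# Illustrative identifiers only; real values come from the packet header syntax.
AUDIO, VIDEO = 1, 2
multiplexed = [
    Packet(VIDEO, b"v0"),
    Packet(AUDIO, b"a0"),
    Packet(VIDEO, b"v1"),
    Packet(AUDIO, b"a1"),
]
elementary = demultiplex(multiplexed)
```

Because each packet belongs to exactly one elementary stream, grouping needs no state beyond one buffer per identifier.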
0.2.2 Synchronization
Synchronization among multiple streams is effected with presentation time stamps in the ISO/IEC 11172
multiplexed stream. The time stamps are in units of 90 kHz. Playback of N streams is synchronized by
adjusting the playback of all streams to a master time base rather than by adjusting the playback of one
stream to match that of another. The master time base may be one of the N decoders’ clocks, the DSM or
channel clock, or it may be some external clock.
Because presentation time-stamps apply to the decoding of individual elementary streams, they reside in the
packet layer. End-to-end synchronization occurs when encoders record time-stamps at capture time, when
the time stamps propagate with associated coded data to decoders, and when decoders use those time-stamps
to schedule presentations.
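As a rough illustration of scheduling against a master time base, the sketch below converts 90 kHz time-stamp ticks to seconds and computes how far each stream's next presentation unit is from the master clock. The function names and the sample tick values are assumptions made for the example, and any wrap-around of the time-stamp counter is ignored.

```python
from fractions import Fraction

TIMESTAMP_HZ = 90_000  # time stamps are expressed in units of a 90 kHz clock


def ticks_to_seconds(ticks: int) -> Fraction:
    """Convert a count of 90 kHz ticks to exact seconds."""
    return Fraction(ticks, TIMESTAMP_HZ)


def presentation_offset(pts_ticks: int, master_ticks: int) -> Fraction:
    """Seconds until (positive) or since (negative) a unit is due on the master time base."""
    return ticks_to_seconds(pts_ticks - master_ticks)


# Hypothetical values: every stream is compared with the same master clock,
# never with another stream.
master_now = 171_000
video_offset = presentation_offset(180_000, master_now)  # Fraction(1, 10)
audio_offset = presentation_offset(183_600, master_now)  # Fraction(7, 50)
```

Using `Fraction` keeps the conversion exact, which avoids accumulating floating-point error when offsets are compared.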
Synchronization is also possible with DSM timing time stamps in the multiplexed data stream.
0.2.3 Relation to compression layer
The packet layer is independent of the compression layer in some senses, but not in all. It is independent in
the sense that packets need not start at compression layer start codes, as defined in parts 2 and 3. For
example, a video packet may start at any byte in the video stream. However, time stamps encoded in
packet headers apply to presentation times of compression layer constructs (namely, presentation units).
0.3 System reference decoder
Part 1 of ISO/IEC 11172 employs a “system target decoder” (STD) to provide a formalism for timing and
buffering relationships. Because the STD is parameterized in terms of fields defined in ISO/IEC 11172 (for
example, buffer sizes), each ISO/IEC 11172 multiplexed stream leads to its own parameterization of the
STD. It is up to encoders to ensure that bitstreams they produce will play in normal speed, forward play on
corresponding STDs. Physical decoders may assume that a stream plays properly on its STD; the physical
decoder must compensate for ways in which its design differs from that of the STD.
INTERNATIONAL STANDARD ISO/IEC 11172-1:1993 (E)
Information technology - Coding of moving
pictures and associated audio for digital storage
media at up to about 1,5 Mbit/s -
Part 1:
Systems
Section 1: General
1.1 Scope
This part of ISO/IEC 11172 specifies the system layer of the coding. It was developed principally to
support the combination of the video and audio coding methods defined in ISO/IEC 11172-2 and ISO/IEC
11172-3. The system layer supports five basic functions:
a) the synchronization of multiple compressed streams on playback,
b) the interleaving of multiple compressed streams into a single stream,
c) the initialization of buffering for playback start up,
d) continuous buffer management, and
e) time identification.
An ISO/IEC 11172 multiplexed bit stream is constructed in two layers: the outermost layer is the system
layer, and the innermost is the compression layer. The system layer provides the functions necessary for
using one or more compressed data streams in a system. The video and audio parts of this specification
define the compression coding layer for audio and video data. Coding of other types of data is not defined by
the specification, but is supported by the system layer provided that the other types of data adhere to the
constraints defined in clause 2.4.
1.2 Normative references
The following International Standards contain provisions which, through reference in this text, constitute
provisions of this part of ISO/IEC 11172. At the time of publication, the editions indicated were valid. All
standards are subject to revision, and parties to agreements based on this part of ISO 11172 are encouraged
to investigate the possibility of applying the most recent editions of the standards indicated below.
Members of IEC and ISO maintain registers of currently valid International Standards.
ISO/IEC 11172-2:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 2: Video.
ISO/IEC 11172-3:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 3: Audio.
CCIR Recommendation 601-2 Encoding parameters of digital television for studios.
CCIR Report 624-4 Characteristics of systems for monochrome and colour television.
CCIR Recommendation 648 Recording of audio signals.
CCIR Report 955-2 Sound broadcasting by satellite for portable and mobile receivers, including Annex IV
Summary description of Advanced Digital System II.
CCITT Recommendation J.17 Pre-emphasis used on Sound-Programme Circuits.
IEEE Draft Standard P1180/D2 1990 Specification for the implementation of 8x8 inverse discrete cosine
transform.
IEC publication 908:1987 CD Digital Audio System.
Section 2: Technical elements
2.1 Definitions
For the purposes of ISO/IEC 11172, the following definitions apply. If specific to a part, this is noted in
square brackets.
2.1.1 ac coefficient [video]: Any DCT coefficient for which the frequency in one or both dimensions
is non-zero.
2.1.2 access unit [system]: In the case of compressed audio an access unit is an audio access unit. In
the case of compressed video an access unit is the coded representation of a picture.
2.1.3 adaptive segmentation [audio]: A subdivision of the digital representation of an audio signal
in variable segments of time.
2.1.4 adaptive bit allocation [audio]: The assignment of bits to subbands in a time and frequency
varying fashion according to a psychoacoustic model.
2.1.5 adaptive noise allocation [audio]: The assignment of coding noise to frequency bands in a
time and frequency varying fashion according to a psychoacoustic model.
2.1.6 alias [audio]: Mirrored signal component resulting from sub-Nyquist sampling.
2.1.7 analysis filterbank [audio]: Filterbank in the encoder that transforms a broadband PCM audio
signal into a set of subsampled subband samples.
2.1.8 audio access unit [audio]: For Layers I and II an audio access unit is defined as the smallest
part of the encoded bitstream which can be decoded by itself, where decoded means “fully reconstructed
sound”. For Layer III an audio access unit is part of the bitstream that is decodable with the use of
previously acquired main information.
2.1.9 audio buffer [audio]: A buffer in the system target decoder for storage of compressed audio data.
2.1.10 audio sequence [audio]: A non-interrupted series of audio frames in which the following
parameters are not changed:
- ID
- Layer
- Sampling Frequency
- For Layer I and II: Bitrate index
2.1.11 backward motion vector [video]: A motion vector that is used for motion compensation
from a reference picture at a later time in display order.
2.1.12 Bark [audio]: Unit of critical band rate. The Bark scale is a non-linear mapping of the frequency
scale over the audio range closely corresponding with the frequency selectivity of the human ear across the
band.
2.1.13 bidirectionally predictive-coded picture; B-picture [video]: A picture that is coded
using motion compensated prediction from a past and/or future reference picture.
2.1.14 bitrate: The rate at which the compressed bitstream is delivered from the storage medium to the
input of a decoder.
2.1.15 block companding [audio]: Normalizing of the digital representation of an audio signal
within a certain time period.
2.1.16 block [video]: An 8-row by 8-column orthogonal block of pels.
2.1.17 bound [audio]: The lowest subband in which intensity stereo coding is used.
2.1.18 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8 bits
from the first bit in the stream.
2.1.19 byte: Sequence of 8 bits.
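The byte-alignment definition translates directly into a modulo test. The sketch below assumes bit positions are counted from zero at the first bit of the stream; both helper names are illustrative and are not taken from the standard.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """True if the bit lies a multiple of 8 bits from the first bit of the stream."""
    if bit_position < 0:
        raise ValueError("bit_position must be non-negative")
    return bit_position % 8 == 0


def bits_to_next_alignment(bit_position: int) -> int:
    """Number of bits to skip to reach the next byte-aligned position (0 if already aligned)."""
    if bit_position < 0:
        raise ValueError("bit_position must be non-negative")
    return (-bit_position) % 8
```

For instance, bit position 13 is not aligned, and skipping 3 bits reaches the aligned position 16.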
2.1.20 channel: A digital medium that stores or transports an ISO/IEC 11172 stream.
2.1.21 channel [audio]: The left and right channels of a stereo signal.
2.1.22 chrominance (component) [video]: A matrix, block or single pel representing one of the
two colour difference signals related to the primary colours in the manner defined in CCIR Rec 601. The
symbols used for the colour difference signals are Cr and Cb.
2.1.23 coded audio bitstream [audio]: A coded representation of an audio signal as specified in
ISO/IEC 11172-3.
2.1.24 coded video bitstream [video]: A coded representation of a series of one or more pictures as
specified in ISO/IEC 11172-2.
2.1.25 coded order [video]: The order in which the pictures are stored and decoded. This order is not
necessarily the same as the display order.
2.1.26 coded representation: A data element as represented in its encoded form.
2.1.27 coding parameters [video]: The set of user-definable parameters that characterize a coded video
bitstream. Bitstreams are characterised by coding parameters. Decoders are characterised by the bitstreams
that they are capable of decoding.
2.1.28 component [video]: A matrix, block or single pel from one of the three matrices (luminance
and two chrominance) that make up a picture.
2.1.29 compression: Reduction in the number of bits used to represent an item of data.
2.1.30 constant bitrate coded video [video]: A compressed video bitstream with a constant
average bitrate.
2.1.31 constant bitrate: Operation where the bitrate is constant from start to finish of the compressed
bitstream.
2.1.32 constrained parameters [video]: The values of the set of coding parameters defined in
2.4.3.2 of ISO/IEC 11172-2.
2.1.33 constrained system parameter stream (CSPS) [system]: An ISO/IEC 11172
multiplexed stream for which the constraints defined in 2.4.6 of this part of ISO/IEC 11172 apply.
2.1.34 CRC: Cyclic redundancy code.
2.1.35 critical band rate [audio]: Psychoacoustic function of frequency. At a given audible
frequency it is proportional to the number of critical bands below that frequency. The units of the critical
band rate scale are Barks.
2.1.36 critical band [audio]: Psychoacoustic measure in the spectral domain which corresponds to the
frequency selectivity of the human ear. This selectivity is expressed in Bark.
2.1.37 data element: An item of data as represented before encoding and after decoding.
2.1.38 dc-coefficient [video]: The DCT coefficient for which the frequency is zero in both
dimensions.
2.1.39 dc-coded picture; D-picture [video]: A picture that is coded using only information from
itself. Of the DCT coefficients in the coded representation, only the dc-coefficients are present.
2.1.40 DCT coefficient: The amplitude of a specific cosine basis function.
2.1.41 decoded stream: The decoded reconstruction of a compressed bitstream.
2.1.42 decoder input buffer [video]: The first-in first-out (FIFO) buffer specified in the video
buffering verifier.
2.1.43 decoder input rate [video]: The data rate specified in the video buffering verifier and encoded
in the coded video bitstream.
2.1.44 decoder: An embodiment of a decoding process.
2.1.45 decoding (process): The process defined in ISO/IEC 11172 that reads an input coded bitstream
and produces decoded pictures or audio samples.
2.1.46 decoding time-stamp; DTS [system]: A field that may be present in a packet header that
indicates the time that an access unit is decoded in the system target decoder.
2.1.47 de-emphasis [audio]: Filtering applied to an audio signal after storage or transmission to
undo a linear distortion due to emphasis.
2.1.48 dequantization [video]: The process of rescaling the quantized DCT coefficients after their
representation in the bitstream has been decoded and before they are presented to the inverse DCT.
2.1.49 digital storage media; DSM: A digital storage or transmission device or system.
2.1.50 discrete cosine transform; DCT [video]: Either the forward discrete cosine transform or the
inverse discrete cosine transform. The DCT is an invertible, discrete orthogonal transformation. The
inverse DCT is defined in annex A of ISO/IEC 11172-2.
2.1.51 display order [video]: The order in which the decoded pictures should be displayed. Normally
this is the same order in which they were presented at the input of the encoder.
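2.1.52 dual channel mode [audio]: A mode, where two audio channels with independent programme
contents (e.g. bilingual) are encoded within one bitstream. The coding process is the same as for the stereo
mode.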
2.1.53 editing: The process by which one or more compressed bitstreams are manipulated to produce a
new compressed bitstream. Conforming edited bitstreams must meet the requirements defined in ISO/IEC
11172.
2.1.54 elementary stream [system]: A generic term for one of the coded video, coded audio or other
coded bitstreams.
2.1.55 emphasis [audio]: Filtering applied to an audio signal before storage or transmission to
improve the signal-to-noise ratio at high frequencies.
2.1.56 encoder: An embodiment of an encoding process.
2.1.57 encoding (process): A process, not specified in ISO/IEC 11172, that reads a stream of input
pictures or audio samples and produces a valid coded bitstream as defined in ISO/IEC 11172.
2.1.58 entropy coding: Variable length lossless coding of the digital representation of a signal to
reduce redundancy.
2.1.59 fast forward playback [video]: The process of displaying a sequence, or parts of a sequence,
of pictures in display-order faster than real-time.
2.1.60 FFT: Fast Fourier Transformation. A fast algorithm for performing a discrete Fourier transform
(an orthogonal transform).
2.1.61 filterbank [audio]: A set of band-pass filters covering the entire audio frequency range.
2.1.62 fixed segmentation [audio]: A subdivision of the digital representation of an audio signal
into fixed segments of time.
2.1.63 forbidden: The term “forbidden” when used in the clauses defining the coded bitstream indicates
that the value shall never be used. This is usually to avoid emulation of start codes.
2.1.64 forced updating [video]: The process by which macroblocks are intra-coded from time-to-time
to ensure that mismatch errors between the inverse DCT processes in encoders and decoders cannot build up
excessively.
2.1.65 forward motion vector [video]: A motion vector that is used for motion compensation from
a reference picture at an earlier time in display order.
2.1.66 frame [audio]: A part of the audio signal that corresponds to audio PCM samples from an
Audio Access Unit.
2.1.67 free format [audio]: Any bitrate other than the defined bitrates that is less than the maximum
valid bitrate for each layer.
2.1.68 future reference picture [video]: The future reference picture is the reference picture that
occurs at a later time than the current picture in display order.
2.1.69 granules [Layer II] [audio]: The set of 3 consecutive subband samples from all 32 subbands
that are considered together before quantization.
2.1.70 granules [Layer III] [audio]: 576 frequency lines that carry their own side information.
2.1.71 group of pictures [video]: A series of one or more coded pictures intended to assist random
access. The group of pictures is one of the layers in the coding syntax defined in ISO/IEC 11172-2.
2.1.72 Hann window [audio]: A time function applied sample-by-sample to a block of audio samples
before Fourier transformation.
2.1.73 Huffman coding: A specific method for entropy coding.
2.1.74 hybrid filterbank [audio]: A serial combination of subband filterbank and MDCT.
2.1.75 IMDCT [audio]: Inverse Modified Discrete Cosine Transform.
2.1.76 intensity stereo [audio]: A method of exploiting stereo irrelevance or redundancy in
stereophonic audio programmes based on retaining at high frequencies only the energy envelope of the right
and left channels.
2.1.77 interlace [video]: The property of conventional television pictures where alternating lines of
the picture represent different instances in time.
2.1.78 intra coding [video]: Coding of a macroblock or picture that uses information only from that
macroblock or picture.
2.1.79 intra-coded picture; I-picture [video]: A picture coded using information only from itself.
2.1.80 ISO/IEC 11172 (multiplexed) stream [system]: A bitstream composed of zero or more
elementary streams combined in the manner defined in this part of ISO/IEC 11172.
2.1.81 joint stereo coding [audio]: Any method that exploits stereophonic irrelevance or
stereophonic redundancy.
2.1.82 joint stereo mode [audio]: A mode of the audio coding algorithm using joint stereo coding.
2.1.83 layer [audio]: One of the levels in the coding hierarchy of the audio system defined in ISO/IEC
11172-3.
2.1.84 layer [video and systems]: One of the levels in the data hierarchy of the video and system
specifications defined in this part of ISO/IEC 11172 and ISO/IEC 11172-2.
2.1.85 luminance (component) [video]: A matrix, block or single pel representing a monochrome
representation of the signal and related to the primary colours in the manner defined in CCIR Rec 601. The
symbol used for luminance is Y.
2.1.86 macroblock [video]: The four 8 by 8 blocks of luminance data and the two corresponding 8 by
8 blocks of chrominance data coming from a 16 by 16 section of the luminance component of the picture.
Macroblock is sometimes used to refer to the pel data and sometimes to the coded representation of the pel
values and other data elements defined in the macroblock layer of the syntax defined in ISO/IEC 11172-2.
The usage is clear from the context.
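The arithmetic implied by the block and macroblock definitions can be made concrete with a small sketch. The 352 by 240 picture size is a hypothetical input chosen for illustration, and rounding partial macroblocks up to a whole macroblock is an assumption of this example, not something stated in the definitions above.

```python
BLOCK_SIZE = 8            # a block is 8 rows by 8 columns of pels
MACROBLOCK_SIZE = 16      # a macroblock covers a 16 by 16 luminance section
LUMINANCE_BLOCKS = 4      # four 8 by 8 luminance blocks per macroblock
CHROMINANCE_BLOCKS = 2    # two corresponding 8 by 8 chrominance blocks


def samples_per_macroblock() -> int:
    """Total luminance and chrominance samples carried by one macroblock."""
    return (LUMINANCE_BLOCKS + CHROMINANCE_BLOCKS) * BLOCK_SIZE * BLOCK_SIZE


def macroblock_grid(width: int, height: int) -> tuple[int, int]:
    """Macroblock columns and rows covering a luminance picture of the given size.

    Assumption: a partial macroblock at the right or bottom edge counts as a whole one.
    """
    columns = -(-width // MACROBLOCK_SIZE)  # ceiling division
    rows = -(-height // MACROBLOCK_SIZE)
    return columns, rows


columns, rows = macroblock_grid(352, 240)   # hypothetical picture size
total_macroblocks = columns * rows
```

With these inputs the grid is 22 columns by 15 rows, 330 macroblocks in total, and each macroblock carries 6 x 64 = 384 samples.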
2.1.87 mapping [audio]: Conversion of an audio Signal from time to frequency domain by subband
filtering and/or by MDCT.
2.1.88 masking [ audio] : A property of the human auditory System by which an audio Signal cannot be
perceived in the presence of another au dio Signal .
2.1.89 masking threshold [audio]: A function in frequency and time below which an audio Signal
cannot be perceived by the human auditory System.
2.1.90 MDCT [ audio]: Modified Discrete Cosine Transform.
2.1.91 motion compensation [Video]: The use of motion vectors to improve the efficiency of the
prediction of pel values. The prediction uses motion vectors to provide offsets into the past and/or future
reference pictures containing previously decoded pel values that are used to ferm the prediction error Signal.
2.1.92 motion estimation [Video]: The process of estimating motion vectors during the encoding
process.
2.1.93 motion vector [Video]: A two- dimensional that provides
vector used for motion compensation
an offset from the coordinate position in the current pi cture to the coordinates in a reference pi .cture.
2.1.94 MS stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic
audio programmes based on coding the sum and difference signal instead of the left and right channels.
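The sum and difference idea in 2.1.94 can be sketched as follows. The definition does not state a scaling; the 1/sqrt(2) normalisation used here is an assumption chosen so that the same factor inverts the transform, and the function names are hypothetical.

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)  # assumed normalisation, not taken from 2.1.94


def ms_encode(left, right):
    """Convert left/right sample lists into sum (M) and difference (S) signals."""
    mid = [(l + r) * _SCALE for l, r in zip(left, right)]
    side = [(l - r) * _SCALE for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right from the sum and difference signals."""
    left = [(m + s) * _SCALE for m, s in zip(mid, side)]
    right = [(m - s) * _SCALE for m, s in zip(mid, side)]
    return left, right


left_in, right_in = [1.0, 0.5, -0.25], [1.0, -0.5, -0.25]
mid, side = ms_encode(left_in, right_in)
left_out, right_out = ms_decode(mid, side)
```

Where the two channels are identical the difference signal is zero, which is the redundancy the method exploits.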
2.1.95 non-intra coding [Video]: Coding of a macroblock or picture that uses information both from
itself and from macroblocks and pictures occurring at other times.
2.1.96 non-tonal component [audio]: A noise-like component of an audio signal.
2.1.97 Nyquist sampling: Sampling at or above twice the maximum bandwidth of a signal.
2.1.98 pack [System]: A pack consists of a pack header followed by one or more packets. It is a layer
in the system coding syntax described in this part of ISO/IEC 11172.
2.1.99 packet data [System]: Contiguous bytes of data from an elementary stream present in a packet.
2.1.100 packet header [System]: The data structure used to convey information about the elementary
stream data contained in the packet data.
2.1.101 packet [System]: A packet consists of a header followed by a number of contiguous bytes
from an elementary data stream. It is a layer in the system coding syntax described in this part of ISO/IEC
11172.
2.1.102 padding [audio]: A method to adjust the average length in time of an audio frame to the
duration of the corresponding PCM samples, by conditionally adding a slot to the audio frame.
2.1.103 past reference picture [Video]: The past reference picture is the reference picture that occurs
at an earlier time than the current picture in display order.
2.1.104 pel aspect ratio [Video]: The ratio of the nominal vertical height of pel on the display to its
nominal horizontal width.
2.1.105 pel [Video]: Picture element.
2.1.106 picture period [Video]: The reciprocal of the picture rate.
2.1.107 picture rate [Video]: The nominal rate at which pictures should be output from the decoding
process.
2.1.108 picture [Video]: Source, coded or reconstructed image data. A source or reconstructed picture
consists of three rectangular matrices of 8-bit numbers representing the luminance and two chrominance
signals. The Picture layer is one of the layers in the coding syntax defined in ISO/IEC 11172-2. Note that
the term “picture” is always used in ISO/IEC 11172 in preference to the terms field or frame.
2.1.109 polyphase filterbank [audio]: A set of equal bandwidth filters with special phase
interrelationships, allowing for an efficient implementation of the filterbank.
2.1.110 prediction [Video]: The use of a predictor to provide an estimate of the pel value or data
element currently being decoded.
2.1.111 predictive-coded picture; P-picture [Video]: A picture that is coded using motion
compensated prediction from the past reference picture.
2.1.112 prediction error [Video]: The difference between the actual value of a pel or data element and
its predictor.
2.1.113 predictor [Video]: A linear combination of previously decoded pel values or data elements.
2.1.114 presentation time-stamp; PTS [System]: A field that may be present in a packet header
that indicates the time that a presentation unit is presented in the system target decoder.
2.1.115 presentation unit; PU [System]: A decoded audio access unit or a decoded picture.
2.1.116 psychoacoustic model [audio]: A mathematical model of the masking behaviour of the
human auditory system.
2.1.117 quantization matrix [Video]: A set of sixty-four 8-bit values used by the dequantizer.
2.1.118 quantized DCT coefficients [Video]: DCT coefficients before dequantization. A variable
length coded representation of quantized DCT coefficients is stored as part of the compressed video
bitstream.
2.1.119 quantizer scalefactor [Video]: A data element represented in the bitstream and used by the
decoding process to scale the dequantization.
2.1.120 random access: The process of beginning to read and decode the coded bitstream at an arbitrary
point.
2.1.121 reference picture [Video]: Reference pictures are the nearest adjacent I- or P-pictures to the
current picture in display order.
2.1.122 reorder buffer [Video]: A buffer in the system target decoder for storage of a reconstructed
I-picture or a reconstructed P-picture.
2.1.123 requantization [audio]: Decoding of coded subband samples in order to recover the original
quantized values.
2.1.124 reserved: The term “reserved” when used in the clauses defining the coded bitstream indicates
that the value may be used in the future for ISO/IEC defined extensions.
2.1.125 reverse playback [Video]: The process of displaying the picture sequence in the reverse of
display order.
2.1.126 scalefactor band [audio]: A set of frequency lines in Layer III which are scaled by one
scalefactor.
2.1.127 scalefactor index [audio]: A numerical code for a scalefactor.
2.1.128 scalefactor [audio]: Factor by which a set of values is scaled before quantization.
2.1.129 sequence header [Video]: A block of data in the coded bitstream containing the
representation of a number of data elements.
2.1.130 side information: Information in the bitstream necessary for controlling the decoder.
2.1.131 skipped macroblock [Video]: A macroblock for which no data are stored.
2.1.132 slice [Video]: A series of macroblocks. It is one of the layers of the coding syntax defined in
ISO/IEC 11172-2.
2.1.133 slot [audio]: A slot is an elementary part in the bitstream. In Layer I a slot equals four bytes,
in Layers II and III one byte.
2.1.134 source stream: A single non-multiplexed stream of samples before compression coding.
2.1.135 spreading function [audio]: A function that describes the frequency spread of masking.
2.1.136 start codes [System and Video]: 32-bit codes embedded in that coded bitstream that are
unique. They are used for several purposes including identifying some of the layers in the coding syntax.
2.1.137 STD input buffer [System]: A first-in first-out buffer at the input of the system target
decoder for storage of compressed data from elementary streams before decoding.
2.1.138 stereo mode [audio]: Mode, where two audio channels which form a stereo pair (left and
right) are encoded within one bitstream. The coding process is the same as for the dual channel mode.
2.1.139 stuffing (bits); stuffing (bytes): Code-words that may be inserted into the compressed
bitstream that are discarded in the decoding process. Their purpose is to increase the bitrate of the stream.
2.1.140 subband [audio]: Subdivision of the audio frequency band.
2.1.141 subband filterbank [audio]: A set of band filters covering the entire audio frequency range.
In ISO/IEC 11172-3 the subband filterbank is a polyphase filterbank.
2.1.142 subband samples [audio]: The subband filterbank within the audio encoder creates a filtered
and subsampled representation of the input audio stream. The filtered samples are called subband samples.
From 384 time-consecutive input audio samples, 12 time-consecutive subband samples are generated within
each of the 32 subbands.
2.1.143 syncword [audio]: A 12-bit code embedded in the audio bitstream that identifies the start of a
frame.
2.1.144 synthesis filterbank [audio]: Filterbank in the decoder that reconstructs a PCM audio
signal from subband samples.
2.1.145 system header [System]: The system header is a data structure defined in this part of
ISO/IEC 11172 that carries information summarising the system characteristics of the ISO/IEC 11172
multiplexed stream.
2.1.146 System target decoder; STD [System]: A hypothetical reference model of a decoding
process used to describe the semantics of an ISO/IEC 11172 multiplexed bitstream.
2.1.147 time-stamp [System]: A term that indicates the time of an event.
2.1.148 triplet [audio]: A set of 3 consecutive subband samples from one subband. A triplet from
each of the 32 subbands forms a granule.
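The counts given in 2.1.142 and 2.1.148 fit together arithmetically: 384 input samples give 12 samples in each of 32 subbands, that is four triplets per subband and therefore four granules of 3 * 32 samples. A minimal sketch of that grouping follows; the nested-list layout and the function name are assumptions for illustration only.

```python
NUM_SUBBANDS = 32   # from 2.1.142
TRIPLET_SIZE = 3    # from 2.1.148


def split_into_granules(subband_samples):
    """Group subband samples into granules.

    subband_samples[sb] holds the time-consecutive samples of subband sb.
    Each returned granule holds one triplet per subband.
    """
    if len(subband_samples) != NUM_SUBBANDS:
        raise ValueError("expected one sample list per subband")
    count = len(subband_samples[0])
    if count % TRIPLET_SIZE or any(len(band) != count for band in subband_samples):
        raise ValueError("each subband needs the same multiple of 3 samples")
    return [[band[i:i + TRIPLET_SIZE] for band in subband_samples]
            for i in range(0, count, TRIPLET_SIZE)]


# 12 samples in each of 32 subbands, as produced from 384 input samples.
samples = [[sb * 100 + t for t in range(12)] for sb in range(NUM_SUBBANDS)]
granules = split_into_granules(samples)
```

With this input, `granules` has four entries, and `granules[1][5]` is the second triplet of subband 5.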
2.1.149 tonal component [audio]: A sinusoid-like component of an audio signal.
2.1.150 variable bitrate: Operation where the bitrate varies with time during the decoding of a
compressed bitstream.
2.1.151 variable length coding; VLC: A reversible procedure for coding that assigns shorter code-
words to frequent events and longer code-words to less frequent events.
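To make 2.1.151 concrete, the sketch below uses a small prefix-free code in which the most frequent event gets the shortest code-word. The table is invented for illustration and is not one of the VLC tables of ISO/IEC 11172; the function names are likewise hypothetical.

```python
# Hypothetical code table: "a" is assumed to be the most frequent event.
CODE_TABLE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def vlc_encode(events):
    """Concatenate the code-word of each event into one bit string."""
    return "".join(CODE_TABLE[event] for event in events)


def vlc_decode(bits):
    """Invert vlc_encode. Works because no code-word is a prefix of another."""
    inverse = {word: event for event, word in CODE_TABLE.items()}
    events, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            events.append(inverse[word])
            word = ""
    if word:
        raise ValueError("bit string ends inside a code-word")
    return events


encoded = vlc_encode(["a", "a", "b", "a", "c"])
decoded = vlc_decode(encoded)
```

The five events occupy 8 bits here, against 10 bits for a fixed 2-bit code, and decoding returns the original sequence, which is the reversibility the definition requires.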
2.1.152 video buffering verifier; VBV [Video]: A hypothetical decoder that is conceptually
connected to the output of the encoder. Its purpose is to provide a constraint on the variability of the data
rate that an encoder or editing process may produce.
2.1.153 video sequence [Video]: A series of one or more groups of pictures. It is one of the layers of
the coding syntax defined in ISO/IEC 11172-2.
2.1.154 zig-zag scanning order [Video]: A specific sequential ordering of the DCT coefficients from
(approximately) the lowest spatial frequency to the highest.
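The normative scan is specified in ISO/IEC 11172-2, not here. As a hedged sketch of the idea in 2.1.154, the following generates the conventional zig-zag traversal of an 8 by 8 block by walking its anti-diagonals in alternating directions; the function name and the (row, column) convention are assumptions.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + column == s, top-right first.
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked from bottom-left up to top-right,
        # odd diagonals from top-right down to bottom-left.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


scan = zigzag_order()
```

The scan starts at the DC position (0, 0), continues with (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), and ends at (7, 7), so low spatial frequencies are visited before high ones.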
2.2 Symbols and abbreviations
The mathematical operators used to describe this International Standard are similar to those used in the C
programming language. However, integer division with truncation and rounding are specifically defined.
The bitwise operators are defined assuming twos-complement representation of integers. Numbering and
counting loops generally begin from zero.
2.2.1 Arithmetic operators
+        Addition.
-        Subtraction (as a binary operator) or negation (as a unary operator).
++       Increment.
--       Decrement.
*        Multiplication.
^        Power.
/        Integer division with truncation of the result toward zero. For example, 7/4 and -7/-4 are
         truncated to 1 and -7/4 and 7/-4 are truncated to -1.
//       Integer division with rounding to the nearest integer. Half-integer values are rounded away
         from zero unless otherwise specified. For example 3//2 is rounded to 2, and -3//2 is rounded
         to -2.
DIV      Integer division with truncation of the result towards -∞.
| |      Absolute value.  |x| = x   when x > 0
                          |x| = 0   when x == 0
                          |x| = -x  when x < 0
%        Modulus operator. Defined only for positive numbers.
Sign( )  Sign(x) =  1  when x > 0
                    0  when x == 0
                   -1  when x < 0
NINT( )  Nearest integer operator. Returns the nearest integer value to the real-valued argument. Half-
         integer values are rounded away from zero.
sin      Sine.
cos      Cosine.
exp      Exponential.
√        Square root.
log10    Logarithm to base ten.
loge     Logarithm to base e.
log2     Logarithm to base 2.
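Because the three division operators of 2.2.1 differ only in how they treat the fractional part, a short sketch may help when porting the notation. The Python function names below are hypothetical. Note that Python's own `//` truncates towards minus infinity, so it corresponds to DIV and not to the `//` of this clause.

```python
import math


def sign(x):
    """Sign(x): 1 when x > 0, 0 when x == 0, -1 when x < 0."""
    return (x > 0) - (x < 0)


def div_trunc(a, b):
    """The '/' operator: integer division truncated toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def div_round(a, b):
    """The '//' operator: nearest integer, half-integers away from zero."""
    quotient = (2 * abs(a) + abs(b)) // (2 * abs(b))
    return quotient if (a < 0) == (b < 0) else -quotient


def div_floor(a, b):
    """The DIV operator: truncation towards minus infinity."""
    return a // b


def nint(x):
    """NINT: nearest integer, half-integers rounded away from zero."""
    return sign(x) * int(math.floor(abs(x) + 0.5))


examples = {
    "7/4": div_trunc(7, 4), "-7/4": div_trunc(-7, 4),
    "3//2": div_round(3, 2), "-3//2": div_round(-3, 2),
    "-7 DIV 4": div_floor(-7, 4),
}
```

The `examples` dictionary reproduces the values quoted in the clause: 1, -1, 2 and -2, with `-7 DIV 4` giving -2 where `-7/4` gives -1.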
2.2.2 Logical operators
||       Logical OR.
&&       Logical AND.
2.2.3 Relational operators
>        Greater than.
>=       Greater than or equal to.
<        Less than.
<=       Less than or equal to.
==       Equal to.
!=       Not equal to.
max [,...,]  the maximum value in the argument list.
min [,...,]  the minimum value in the argument list.
2.2.4 Bitwise operators
A twos complement number representation is assumed where the bitwise operators are used.
&        AND.
|        OR.
>>       Shift right with sign extension.
<<       Shift left with zero fill.
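The two shift operators of 2.2.4 can be sketched on fixed-width bit patterns as follows. The 32-bit width and the function names are assumptions for the example; the clause itself fixes only the twos complement representation, sign extension on right shifts and zero fill on left shifts.

```python
WIDTH = 32                 # assumed word width for the illustration
MASK = (1 << WIDTH) - 1


def to_signed(pattern):
    """Interpret a WIDTH-bit pattern as a twos complement integer."""
    pattern &= MASK
    return pattern - (1 << WIDTH) if pattern >> (WIDTH - 1) else pattern


def shift_right(pattern, n):
    """'>>': shift right by n bits, copying the sign bit into vacated positions."""
    # Python's >> on a negative int already extends the sign.
    return (to_signed(pattern) >> n) & MASK


def shift_left(pattern, n):
    """'<<': shift left by n bits, filling with zeros and dropping overflow."""
    return (pattern << n) & MASK


negative = shift_right(0xFFFFFFF8, 1)   # -8 >> 1
positive = shift_right(0x00000008, 1)   #  8 >> 1
left = shift_left(0x80000001, 1)
```

Shifting the pattern for -8 right by one yields the pattern for -4 (0xFFFFFFFC), whereas a logical shift would have produced a large positive value.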
2.2.5 Assignment
=        Assignment operator.
2.2.6 Mnemonics
The following mnemonics are defined to describe the different data types used in the coded bit-stream.
bslbf           Bit string, left bit first, where “left” is the order in which bit strings are written in
                ISO/IEC 11172. Bit strings are written as a string of 1s and 0s within single quote
                marks, e.g. '1000 0001'. Blanks within a bit string are for ease of reading and have no
                significance.
ch              Channel. If ch has the value 0, the left channel of a stereo signal or the first of two
                independent signals is indicated. (Audio)
nch             Number of channels; equal to 1 for single_channel mode, 2 in other modes. (Audio)
gr              Granule of 3 * 32 subband samples in audio Layer II, 18 * 32 sub-band samples in
                audio Layer III. (Audio)
main_data       The main data portion of the bitstream contains the scalefactors, Huffman encoded
                data, and ancillary information. (Audio)
main_data_beg   The location in the bitstream of the beginning of the main_data for the frame. The
                location is equal to the ending location of the previous frame’s main_data plus one bit.
                It is calculated from the main_data_end value of the previous frame. (Audio)
part2_length    The number of main_data bits used for scalefactors. (Audio)
rpchof          Remainder polynomial coefficients, highest order first. (Audio)
sb              Subband. (Audio)
sblimit         The number of the lowest sub-band for which no bits are allocated. (Audio)
scfsi           Scalefactor selection information. (Audio)
switch_point_l  Number of scalefactor band (long block scalefactor band) from which point on window
                switching is used. (Audio)
switch_point_s  Number of scalefactor band (short block scalefactor band) from which
...



