ISO/IEC TR 13818-5:2005
Information technology - Generic coding of moving pictures and associated audio information - Part 5: Software simulation
ISO/IEC TR 13818-5:2005 provides a C language software simulation of an encoder and decoder for Part 1 (Systems), Part 2 (Video), Part 3 (Audio), Part 7 (AAC) and Part 11 (IPMP) of ISO/IEC 13818.
Technologies de l'information — Codage générique des images animées et des informations sonores associées — Partie 5: Simulation de logiciel
Overview
ISO/IEC TR 13818-5:2005 (Part 5: Software simulation) provides a C language reference implementation of MPEG-2 (ISO/IEC 13818) encoder and decoder functions. The Technical Report supplies software simulations for the Systems (Part 1), Video (Part 2), Audio (Part 3), Advanced Audio Coding (Part 7, AAC) and IPMP on MPEG-2 systems (Part 11) components. It is a practical developer resource that documents architecture, core components and usage of MPEG-2 reference software and includes an electronic annex with the actual code.
Keywords: ISO/IEC TR 13818-5:2005, MPEG-2 software simulation, C reference implementation, encoder decoder, IPMP, AAC
Key Topics
- Scope and purpose: A C-language encoder/decoder simulation for multiple MPEG-2 parts to aid implementation, testing and education.
- Systems simulation: Reference for stream packaging, access unit definitions, and system-level behavior.
- Video simulation: Simulation of MPEG-2 video coding features (frame/field structures, prediction modes, DCT-based blocks, bitstream verification).
- Audio simulation: Reference implementations for Layers I/II/III and AAC elements, including filterbanks, bit allocation concepts and access unit handling.
- IPMP reference software: Architecture and core components for IPMP (Intellectual Property Management and Protection) on MPEG-2 systems with guidance for usage.
- Normative and informative material: Terms, definitions, symbols and normative references. Annex A (normative) is the electronic annex containing the software; Annex B (informative) lists patent holders (patent considerations apply).
- Conformance support: Helps developers produce bitstreams compatible with ISO/IEC 13818 Parts and supports bitstream verification and conformance testing.
Applications
Who uses ISO/IEC TR 13818-5:2005:
- Codec developers building or validating MPEG-2 encoders and decoders.
- Systems integrators implementing MPEG-2 Systems for broadcasting, storage or streaming.
- Test and QA teams running conformance and interoperability tests against a reference implementation.
- Researchers and educators studying video/audio coding algorithms and system integration using a practical C reference.
- Product teams implementing AAC or IPMP features who need a known-good software simulation.
Practical uses include prototype development, interoperability debugging, reference for implementation detail, and building compliance test suites.
Related Standards
- ISO/IEC 13818-1 (Systems), 13818-2 (Video), 13818-3 (Audio), 13818-4 (Conformance), 13818-7 (AAC), 13818-11 (IPMP)
- ITU-T Rec. H.262 / ISO/IEC 13818-2 (video equivalence)
- See Annex B of the report for patent-holder listings and licensing considerations.
This Technical Report is an authoritative reference for anyone implementing or testing MPEG-2 functionality and seeking a C-language software simulation of encoder/decoder behavior.
Frequently Asked Questions
ISO/IEC TR 13818-5:2005 is a Technical Report published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its full title is "Information technology - Generic coding of moving pictures and associated audio information - Part 5: Software simulation". It provides a C language software simulation of an encoder and decoder for Part 1 (Systems), Part 2 (Video), Part 3 (Audio), Part 7 (AAC) and Part 11 (IPMP) of ISO/IEC 13818.
ISO/IEC TR 13818-5:2005 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC TR 13818-5:2005 is linked to the following documents: ISO/IEC TR 13818-5:1997, ISO/IEC TR 13818-5:1997/Amd 1:1999, ISO/IEC TR 13818-5:1997/Amd 1:1999/Cor 1:2003, ISO/IEC TR 13818-5:1997/Amd 1:1999/Cor 2:2004 and ISO/IEC TR 13818-5:1997/Amd 2:2005. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
TECHNICAL REPORT
ISO/IEC TR 13818-5
Second edition
2005-10-15
Information technology — Generic coding
of moving pictures and associated audio
information —
Part 5:
Software simulation
Technologies de l'information — Codage générique des images
animées et des informations sonores associées —
Partie 5: Simulation de logiciel
© ISO/IEC 2005
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviations
5 Systems simulation
6 Video simulation
7 Audio simulation
7.1 Layer 1, Layer 2 and Layer 3
7.2 AAC
8 MPEG-2 IPMP Reference Software
8.1 Architecture
8.2 Core Components
8.3 Usage of the Reference Software
Annex A (normative) Electronic annex containing software
Annex B (informative) List of patent holders
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
In exceptional circumstances, the joint technical committee may propose the publication of a Technical Report
of one of the following types:
— type 1, when the required support cannot be obtained for the publication of an International Standard,
despite repeated efforts;
— type 2, when the subject is still under technical development or where for any other reason there is the
future but not immediate possibility of an agreement on an International Standard;
— type 3, when the joint technical committee has collected data of a different kind from that which is
normally published as an International Standard (“state of the art”, for example).
Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether
they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to
be reviewed until the data they provide are considered to be no longer valid or useful.
ISO/IEC 13818-5, which is a Technical Report of type 3, was prepared by Joint Technical Committee
ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and
hypermedia information.
This second edition cancels and replaces the first edition (ISO/IEC 13818-5:1997), which has been technically
revised. It also incorporates the Amendments ISO/IEC TR 13818-5:1997/Amd.1:1999 and
ISO/IEC TR 13818-5:1997/Amd.2:2005, and the Technical Corrigenda
ISO/IEC TR 13818-5:1997/Amd.1:1999/Cor.1:2003 and ISO/IEC TR 13818-5:1997/Amd.1:1999/Cor.2:2004.
ISO/IEC 13818 consists of the following parts, under the general title Information technology — Generic
coding of moving pictures and associated audio information:
Part 1: Systems
Part 2: Video
Part 3: Audio
Part 4: Conformance testing
Part 5: Software simulation [Technical Report]
Part 6: Extensions for DSM-CC
Part 7: Advanced Audio Coding (AAC)
Part 9: Extension for real time interface for systems decoders
Part 10: Conformance extensions for Digital Storage Media Command and Control (DSM-CC)
Part 11: IPMP on MPEG-2 systems
Introduction
This Part of ISO/IEC 13818 was developed in response to the growing need for a generic coding method of
moving pictures and of associated sound for various applications such as digital storage media, television
broadcasting and communication. The use of this specification means that motion video can be manipulated
as a form of computer data and can be stored on various storage media, transmitted and received over
existing and future networks and distributed on existing and future broadcasting channels.
The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC)
draw attention to the fact that it is claimed that compliance with this document may involve the use of patents.
The ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured the ISO and IEC that he is willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect,
the statement of the holder of this patent right is registered with the ISO and IEC. Information may be obtained
from the companies listed in Annex B.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights other than those identified in Annex B. ISO and IEC shall not be held responsible for identifying any or
all such patent rights.
TECHNICAL REPORT ISO/IEC TR 13818-5:2005(E)
Information technology — Generic coding of moving pictures
and associated audio information —
Part 5:
Software simulation
1 Scope
This Technical Report provides a C language software simulation of an encoder and decoder for Part 1
(Systems), Part 2 (Video), Part 3 (Audio), Part 7 (AAC) and Part 11 (IPMP) of ISO/IEC 13818.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 639 (all parts), Code for the representation of names of languages
ISO 8859-1, Information processing - 8-bit single-byte coded graphic character sets - Part 1: Latin alphabet
No. 1
ISO/IEC 10918-1:1994, Information technology - Digital compression and coding of continuous-tone still
images: Requirements and guidelines (See also ITU-T Rec. T.81.)
ISO/IEC 11172-1:1993, Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 1: Systems
ISO/IEC 11172-2:1993, Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 2: Video
ISO/IEC 11172-3:1993, Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 3: Audio
ISO/IEC 11172-4:1995, Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 4: Compliance testing
ISO/IEC 11172-5:1998, Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 5: Software simulation
ISO/IEC 11172-6, Information technology - Coding of moving pictures and associated audio for digital storage
media at up to about 1,5 Mbit/s - Part 6: Specification for implementation of Inverse Discrete Cosine
Transform
ITU-T Rec. H.222.0 (2000) | ISO/IEC 13818-1:2000, Information technology - Generic coding of moving
pictures and associated audio information : Systems
ITU-T Rec. H.262 (2000) | ISO/IEC 13818-2:2000, Information technology - Generic coding of moving pictures
and associated audio information : Video (See also ITU-T Rec. H.262.)
ISO/IEC 13818-3:1998, Information technology - Generic coding of moving pictures and associated audio
information - Part 3: Audio
ISO/IEC 13818-4:2004, Information technology - Generic coding of moving pictures and associated audio
information - Part 4: Conformance testing
ISO/IEC 13818-7:2004, Information technology – Generic coding of moving pictures and associated audio
information - Part 7: Advanced Audio Coding (AAC)
ISO/IEC 13818-11:2004, Information technology – Generic coding of moving pictures and associated audio
information – Part 11: IPMP on MPEG-2 systems
3 Terms and definitions
For the purposes of this document, the following definitions apply.
3.1 16x8 prediction [video]: A prediction mode similar to field-based prediction but where the predicted
block size is 16x8 luminance samples.
3.2 AC coefficient [video]: Any DCT coefficient for which the frequency in one or both dimensions is non-
zero.
3.3 access unit [systems]: A coded representation of a presentation unit. In the case of audio, an access
unit is the coded representation of an audio frame.
In the case of video, an access unit includes all the coded data for a picture, and any stuffing that follows it, up
to but not including the start of the next access unit. If a picture is not preceded by a group_start_code or a
sequence_header_code, the access unit begins with the picture start code. If a picture is preceded by a
group_start_code and/or a sequence_header_code, the access unit begins with the first byte of the first of
these start codes. If it is the last picture preceding a sequence_end_code in the bitstream all bytes between
the last byte of the coded picture and the sequence_end_code (including the sequence_end_code) belong to
the access unit.
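The start-code rule in 3.3 can be sketched in C as follows. This is a minimal illustration, not code from the electronic annex: the function names are hypothetical, the start code values (0x00, 0xB3, 0xB7, 0xB8 after a 00 00 01 prefix) are assumed from ISO/IEC 13818-2, and the sketch only locates where a video access unit begins.

```c
#include <stddef.h>
#include <stdint.h>

/* Start code values assumed from ISO/IEC 13818-2 (not listed in this report). */
enum {
    PICTURE_START_CODE   = 0x00,
    SEQUENCE_HEADER_CODE = 0xB3,
    SEQUENCE_END_CODE    = 0xB7,
    GROUP_START_CODE     = 0xB8
};

/* Returns the offset of the next 00 00 01 prefix at or after pos,
 * or len if there is none. The code value is then buf[offset + 3]. */
size_t next_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (size_t i = pos; i + 3 < len; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
            return i;
    }
    return len;
}

/* Returns the offset where the access unit of the next picture begins:
 * the first sequence or group header that precedes the picture if there is
 * one, otherwise the picture start code itself. Returns len if no picture
 * start code is found. */
size_t access_unit_start(const uint8_t *buf, size_t len, size_t pos)
{
    size_t first_header = len;  /* len means "no header seen yet" */
    size_t i = next_start_code(buf, len, pos);

    while (i < len) {
        uint8_t code = buf[i + 3];
        if (code == PICTURE_START_CODE)
            return (first_header < len) ? first_header : i;
        if ((code == SEQUENCE_HEADER_CODE || code == GROUP_START_CODE) &&
            first_header == len)
            first_header = i;
        i = next_start_code(buf, len, i + 3);
    }
    return len;
}
```

Other start codes, such as extension or user data codes, are skipped without resetting the remembered header. The sequence_end_code rule at the end of 3.3 is not handled here.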
3.4 adaptive bit allocation [audio]: The assignment of bits to subbands in a time and frequency varying
fashion according to a psychoacoustic model.
3.5 adaptive multichannel prediction [audio]: A method of multichannel data reduction exploiting statistical
inter-channel dependencies.
3.6 adaptive noise allocation [audio]: The assignment of coding noise to frequency bands in a time and
frequency varying fashion according to a psychoacoustic model.
3.7 adaptive segmentation [audio]: A subdivision of the digital representation of an audio signal in variable
segments of time.
3.8 alias [audio]: Mirrored signal component resulting from sub-Nyquist sampling.
3.9 analysis filterbank [audio]: Filterbank in the encoder that transforms a broadband PCM audio signal
into a set of subsampled subband samples.
3.10 ancillary data [audio]: Part of the bitstream that might be used for transmission of ancillary data.
3.11 audio access unit [audio]: For Layers I and II, an audio access unit is defined as the smallest part of
the encoded bitstream which can be decoded by itself, where decoded means "fully reconstructed sound". For
Layer III, an audio access unit is part of the bitstream that is decodable with the use of previously acquired
main information.
3.12 audio buffer [audio]: A buffer in the system target decoder for storage of compressed audio data.
3.13 audio sequence [audio]: A non-interrupted series of audio frames (base frames plus optional
extension frames) in which the following parameters are not changed:
- ID
- Layer
- Sampling Frequency
For Layer I and II, a decoder is not required to support a continuously variable bitrate (change in the bitrate
index) of the base stream. Such a relaxation of requirements does not apply to the extension stream.
3.14 B-field picture [video]: A field structure B-Picture.
3.15 B-frame picture [video]: A frame structure B-Picture.
3.16 B-picture; bidirectionally predictive-coded picture [video]: A picture that is coded using motion
compensated prediction from past and/or future reference fields or frames.
3.17 backward compatibility: A newer coding standard is backward compatible with an older coding
standard if decoders designed to operate with the older coding standard are able to continue to operate by
decoding all or part of a bitstream produced according to the newer coding standard.
3.18 backward motion vector [video]: A motion vector that is used for motion compensation from a
reference frame or reference field at a later time in display order.
3.19 backward prediction [video]: Prediction from the future reference frame (field).
3.20 Bark [audio]: Unit of critical band rate. The Bark scale is a non-linear mapping of the frequency scale
over the audio range closely corresponding with the frequency selectivity of the human ear across the band.
3.21 base layer [video]: First, independently decodable layer of a scalable hierarchy.
3.22 big picture [video]: A coded picture that would cause VBV buffer underflow as defined in C.7 of Annex C
of ISO/IEC 13818-2. Big pictures can only occur in sequences where low_delay is equal to 1. “Skipped
picture” is a term that is sometimes used to describe the same concept.
3.23 bitrate [audio]: The rate at which the compressed bitstream is delivered to the input of a decoder.
3.24 bitstream; stream: An ordered series of bits that forms the coded representation of the data.
3.25 bitstream verifier [video]: A process by which it is possible to test and verify that all the requirements
specified in ISO/IEC 13818-2 are met by the bitstream.
3.26 block [video]: An 8-row by 8-column matrix of samples, or 64 DCT coefficients (source, quantised or
dequantised).
3.27 block companding [audio]: Normalising of the digital representation of an audio signal within a certain
time period.
3.28 bottom field [video]: One of two fields that comprise a frame. Each line of a bottom field is spatially
located immediately below the corresponding line of the top field.
3.29 bound [audio]: The lowest subband in which intensity stereo coding is used.
3.30 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8 bits from the first
bit in the stream.
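As a small illustration of 3.30, the alignment test reduces to a remainder check on the bit position. The function names below are hypothetical, and positions are counted in bits from the first bit in the stream, starting at zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the bit at bit_position (0 = first bit in the stream) is byte-aligned. */
bool is_byte_aligned(uint64_t bit_position)
{
    return (bit_position % 8u) == 0;
}

/* Number of bits to skip from bit_position to reach the next byte-aligned
 * position; 0 if the position is already aligned. */
uint64_t bits_to_next_alignment(uint64_t bit_position)
{
    return (8u - bit_position % 8u) % 8u;
}
```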
3.31 byte: Sequence of 8 bits.
3.32 centre channel [audio]: An audio presentation channel used to stabilise the central component of the
frontal stereo image.
3.33 channel [audio]: A sequence of data representing an audio signal being transported.
3.34 chroma simulcast [video]: A type of scalability (which is a subset of SNR scalability) where the
enhancement layer(s) contain only coded refinement data for the DC coefficients, and all the data for the
AC coefficients, of the chrominance components.
3.35 chrominance format [video]: Defines the number of chrominance blocks in a macroblock.
3.36 chrominance component [video]: A matrix, block or single sample representing one of the two colour
difference signals related to the primary colours in the manner defined in the bitstream. The symbols used for
the chrominance signals are Cr and Cb.
3.37 coded audio bitstream [audio]: A coded representation of an audio signal as specified in part 3 of
ISO/IEC 13818.
3.38 coded B-frame [video]: A B-frame picture or a pair of B-field pictures.
3.39 coded frame [video]: A coded frame is a coded I-frame, a coded P-frame or a coded B-frame.
3.40 coded I-frame [video]: An I-frame picture or a pair of field pictures, where the first field picture is an I-
picture and the second field picture is an I-picture or a P-picture.
3.41 coded order [video]: The order in which the pictures are transmitted and decoded. This order is not
necessarily the same as the display order.
3.42 coded P-frame [video]: A P-frame picture or a pair of P-field pictures.
3.43 coded picture [video]: A coded picture is made of a picture header, the optional extensions
immediately following it, and the following picture data. A coded picture may be a coded frame or a coded
field.
3.44 coded representation: A data element as represented in its encoded form.
3.45 coded video bitstream [video]: A coded representation of a series of one or more pictures as defined
in ISO/IEC 13818-2.
3.46 coding parameters [video]: The set of user-definable parameters that characterise a coded bitstream.
Bitstreams are characterised by coding parameters. Decoders are characterised by the bitstreams that they
are capable of decoding.
3.47 component [video]: A matrix, block or single sample from one of the three matrices (luminance and
two chrominance) that make up a picture.
3.48 compression: Reduction in the number of bits used to represent an item of data.
3.49 constant bitrate: Operation where the bitrate is constant from start to finish of the coded bitstream.
3.50 constrained parameters [video]: The values of the set of coding parameters defined in 2.4.3.2 of
ISO/IEC 11172-2.
3.51 constrained system parameter stream; CSPS [systems]: A Program Stream for which the
constraints defined in 2.7.9 of ISO/IEC 13818-1 apply.
3.52 CRC: The Cyclic Redundancy Check to verify the correctness of data.
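The definition above does not name a particular CRC. As one illustration, the sketch below computes the 32-bit variant commonly referred to as CRC-32/MPEG-2 (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no bit reflection, no final XOR). The choice of this variant and the function name are assumptions made for the example; the audio parts use a different, 16-bit check.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32/MPEG-2 over len bytes, most significant bit first. */
uint32_t crc32_mpeg2(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80000000u)
                crc = (crc << 1) ^ 0x04C11DB7u;
            else
                crc <<= 1;
        }
    }
    return crc;
}
```

Because there is no final XOR, running the function over a message followed by its own CRC (most significant byte first) yields zero, which is how a receiver can verify the data in a single pass.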
3.53 critical band [audio]: Psychoacoustic measure in the spectral domain which corresponds to the
frequency selectivity of the human ear. This selectivity is expressed in Bark.
3.54 critical band rate [audio]: Psychoacoustic function of frequency. At a given audible frequency, it is
proportional to the number of critical bands below that frequency. The units of the critical band rate scale are
Barks.
3.55 data element: An item of data as represented before encoding and after decoding.
3.56 data partitioning [video]: A method for dividing a bitstream into two separate bitstreams for error
resilience purposes. The two bitstreams have to be recombined before decoding.
3.57 DC coefficient [video]: The DCT coefficient for which the frequency is zero in both dimensions.
3.58 DCT coefficient [video]: The amplitude of a specific cosine basis function.
3.59 de-emphasis [audio]: Filtering applied to an audio signal after storage or transmission to undo a linear
distortion due to emphasis.
3.60 decoded stream: The decoded reconstruction of a compressed bitstream.
3.61 decoder input buffer [video]: The first-in first-out (FIFO) buffer specified in the video buffering verifier.
3.62 decoder: An embodiment of a decoding process.
3.63 decoder sub-loop [video]: Stages within an encoder which produce numerically identical results to the
decode process described in ISO/IEC 13818-2 clause 7. Encoders capable of producing more than just I-
pictures embed a decoder sub-loop to create temporal predictions and to model the behaviour of downstream
decoders.
3.64 decoding (process): The process defined in ISO/IEC 13818 parts 1, 2 and 3 that reads an input coded
bitstream and outputs decoded pictures or audio samples.
3.65 decoding time-stamp; DTS [systems]: A field that may be present in a PES packet header that
indicates the time that an access unit is decoded in the system target decoder.
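A sketch of how such a time stamp might be packed and unpacked is shown below. The layout is an assumption taken from ISO/IEC 13818-1 rather than from this report: a 33-bit value in units of a 90 kHz clock, carried in five bytes as a 4-bit prefix followed by 3, 15 and 15 time stamp bits, each group followed by a marker bit set to 1. The function names are hypothetical.

```c
#include <stdint.h>

/* Packs the low 33 bits of ts into out[0..4]. prefix is the 4-bit code that
 * precedes the time stamp in the PES header (assumed: 0x1 for a DTS). */
void pack_timestamp(uint8_t out[5], uint8_t prefix, uint64_t ts)
{
    out[0] = (uint8_t)(((unsigned)prefix << 4) | (((ts >> 30) & 0x07) << 1) | 1);
    out[1] = (uint8_t)((ts >> 22) & 0xFF);
    out[2] = (uint8_t)((((ts >> 15) & 0x7F) << 1) | 1);
    out[3] = (uint8_t)((ts >> 7) & 0xFF);
    out[4] = (uint8_t)(((ts & 0x7F) << 1) | 1);
}

/* Recovers the 33-bit time stamp, ignoring the prefix and marker bits. */
uint64_t unpack_timestamp(const uint8_t in[5])
{
    return ((uint64_t)((in[0] >> 1) & 0x07) << 30) |
           ((uint64_t)in[1] << 22) |
           ((uint64_t)(in[2] >> 1) << 15) |
           ((uint64_t)in[3] << 7) |
           ((uint64_t)(in[4] >> 1));
}
```

With a 90 kHz clock, a value of 90000 corresponds to one second.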
3.66 dequantisation: The process of rescaling the quantised DCT coefficients after their representation in
the bitstream has been decoded and before they are presented to the inverse DCT.
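The rescaling step can be sketched as below. The arithmetic follows the general form of the inverse quantisation in ISO/IEC 13818-2 as an assumption; it is not quoted from this report. The names are hypothetical. The sketch covers AC coefficients only and omits the separate intra DC rule and mismatch control.

```c
/* Limits v to the range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Rescales one quantised AC coefficient.
 *   qf              quantised coefficient decoded from the bitstream
 *   w               weighting matrix entry for this coefficient position
 *   quantiser_scale scale factor in effect for the macroblock
 *   intra           non-zero for intra blocks
 * For non-intra blocks the sign of qf is added before scaling. The division
 * truncates toward zero, and the result is saturated to [-2048, 2047]. */
int dequantise_ac(int qf, int w, int quantiser_scale, int intra)
{
    int k = intra ? 0 : ((qf > 0) - (qf < 0));
    int f = ((2 * qf + k) * w * quantiser_scale) / 32;
    return clamp_int(f, -2048, 2047);
}
```

For example, with w = 16 and quantiser_scale = 2, a quantised value of 3 rescales to 6 in an intra block and to 7 in a non-intra block.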
3.67 digital storage media; DSM: A digital storage or transmission device or system.
3.68 discrete cosine transform; DCT: Either the forward discrete cosine transform or the inverse discrete
cosine transform. The DCT is an invertible, discrete orthogonal transformation.
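The invertibility mentioned in 3.68 can be illustrated with a direct, unoptimised 8x8 forward and inverse DCT. This is a floating-point sketch with hypothetical function names, assuming the usual orthonormal scaling (a factor of 1/4 and 1/sqrt(2) on the zero-frequency terms). It is not the reference implementation and makes no claim about the accuracy requirements for conforming decoders. The cosine values are tabulated so the example does not depend on the maths library.

```c
/* cos(k*pi/16) for k = 0..8. */
static const double COS_PI16[9] = {
    1.0,
    0.98078528040323044, 0.92387953251128674, 0.83146961230254524,
    0.70710678118654752, 0.55557023301960222, 0.38268343236508977,
    0.19509032201612827, 0.0
};

/* cos(n*pi/16) for a non-negative integer n, using the symmetries of cosine. */
static double cos_pi16(int n)
{
    n %= 32;                               /* period 2*pi is 32 steps of pi/16 */
    if (n > 16) n = 32 - n;                /* cos(2*pi - x) =  cos(x) */
    if (n > 8) return -COS_PI16[16 - n];   /* cos(pi - x)   = -cos(x) */
    return COS_PI16[n];
}

/* Normalisation factor: 1/sqrt(2) for frequency index 0, otherwise 1. */
static double norm(int u)
{
    return (u == 0) ? 0.70710678118654752 : 1.0;
}

/* Forward DCT: samples in[y][x] to coefficients out[v][u]. */
void forward_dct_8x8(double in[8][8], double out[8][8])
{
    for (int v = 0; v < 8; v++) {
        for (int u = 0; u < 8; u++) {
            double sum = 0.0;
            for (int y = 0; y < 8; y++)
                for (int x = 0; x < 8; x++)
                    sum += in[y][x] * cos_pi16((2 * x + 1) * u)
                                    * cos_pi16((2 * y + 1) * v);
            out[v][u] = 0.25 * norm(u) * norm(v) * sum;
        }
    }
}

/* Inverse DCT: coefficients in[v][u] back to samples out[y][x]. */
void inverse_dct_8x8(double in[8][8], double out[8][8])
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            double sum = 0.0;
            for (int v = 0; v < 8; v++)
                for (int u = 0; u < 8; u++)
                    sum += norm(u) * norm(v) * in[v][u]
                         * cos_pi16((2 * x + 1) * u)
                         * cos_pi16((2 * y + 1) * v);
            out[y][x] = 0.25 * sum;
        }
    }
}
```

With this scaling, a block in which every sample equals 100 transforms to a DC coefficient of 800 with all AC coefficients equal to zero up to rounding, which matches the definitions of DC and AC coefficients in 3.57 and 3.2.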
3.69 display aspect ratio [video]: The ratio height/width (in SI units) of the intended display.
3.70 display order [video]: The order in which the decoded pictures are displayed. Normally this is the
same order in which they were presented at the input of the encoder.
3.71 display process [video]: The (non-normative) process by which reconstructed frames are displayed.
3.72 downmix [audio]: A matrixing of n channels to obtain less than n channels.
3.73 drift [video]: Accumulation of mismatch between the reconstructed output produced by the
hypothetical decoder sub-loop embedded within an encoder (see definition of "decoder sub-loop") and the
reconstructed outputs produced by a (downstream) decoder.
3.74 DSM-CC: digital storage media command and control.
3.75 dual channel mode [audio]: A mode, where two audio channels with independent programme
contents (e.g. bilingual) are encoded within one bitstream. The coding process is the same as for the stereo
mode.
3.76 dual-prime prediction [video]: A prediction mode in which two forward field-based predictions are
averaged. The predicted block size is 16x16 luminance samples. Dual-prime prediction is only used in
interlaced P-pictures.
3.77 dynamic crosstalk [audio]: A method of multichannel data reduction in which stereo-irrelevant signal
components are copied to another channel.
3.78 dynamic transmission channel switching [audio]: A method of multichannel data reduction by
allocating the most orthogonal signal components to the transmission channels.
3.79 editing: The process by which one or more coded bitstreams are manipulated to produce a new coded
bitstream. Conforming edited bitstreams must meet the requirements defined in parts 1, 2, and 3 of ISO/IEC
13818.
3.80 Elementary Stream Clock Reference; ESCR [systems]: A time stamp in the PES Stream from which
decoders of PES streams may derive timing.
3.81 elementary stream; ES [systems]: A generic term for one of the coded video, coded audio or other
coded bitstreams in PES packets. One elementary stream is carried in a sequence of PES packets with one
and only one stream_id.
3.82 emphasis [audio]: Filtering applied to an audio signal before storage or transmission to improve the
signal-to-noise ratio at high frequencies.
3.83 encoder: An embodiment of an encoding process.
3.84 encoding (process): A process, not specified in ISO/IEC 13818, that reads a stream of input pictures
or audio samples and produces a valid coded bitstream as defined in parts 1, 2, and 3 of ISO/IEC 13818.
3.85 enhancement layer [video]: A relative reference to a layer (above the base layer) in a scalable
hierarchy. For all forms of scalability, its decoding process can be described by reference to the lower layer
decoding process and the appropriate additional decoding process for the enhancement layer itself.
3.86 entitlement control message; ECM [systems]: Entitlement Control Messages are private conditional
access information which specify control words and possibly other, typically stream-specific, scrambling
and/or control parameters.
3.87 entitlement management message; EMM [systems]: Entitlement Management Messages are private
conditional access information which specify the authorisation levels or the services of specific decoders.
They may be addressed to single decoders or groups of decoders.
3.88 entropy coding: Variable length lossless coding of the digital representation of a signal to reduce
redundancy.
3.89 event [systems]: An event is defined as a collection of elementary streams with a common time base,
an associated start time, and an associated end time.
3.90 evil bitstreams: Bitstreams orthogonal to reality.
3.91 extension bitstream [audio]: Information contained in an optional additional bit stream related to the
audio base bit stream at the system level, to support bit rates beyond those defined in ISO/IEC 11172-3. The
optional extension bit stream contains the remainder of the multichannel and multilingual data.
3.92 fast reverse playback [video]: The process of displaying the picture sequence in the reverse of
display order faster than real-time.
3.93 fast forward playback [video]: The process of displaying a sequence, or parts of a sequence, of
pictures in display-order faster than real-time.
3.94 FFT: Fast Fourier Transformation. A fast algorithm for performing a discrete Fourier transform (an
orthogonal transform).
3.95 field [video]: For an interlaced video signal, a “field” is the assembly of alternate lines of a frame.
Therefore an interlaced frame is composed of two fields, a top field and a bottom field.
3.96 field period [video]: The reciprocal of twice the frame rate.
3.97 field picture; field structure picture [video]: A field structure picture is a coded picture in which
picture_structure is equal to "Top field" or "Bottom field".
3.98 field-based prediction [video]: A prediction mode using only one field of the reference frame. The
predicted block size is 16x16 luminance samples. Field-based prediction is not used in progressive frames.
3.99 filterbank [audio]: A set of band-pass filters covering the entire audio frequency range.
3.100 fixed segmentation [audio]: A subdivision of the digital representation of an audio signal into fixed
segments of time.
3.101 flag: A variable which can take one of only the two values defined in this specification.
3.102 FLC: Fixed Length Code.
3.103 forbidden: The term "forbidden", when used in the clauses defining the coded bitstream, indicates that
the value shall never be used. This is usually to avoid emulation of start codes.
3.104 forced updating [video]: The process by which macroblocks are intra-coded from time-to-time to
ensure that mismatch errors between the inverse DCT processes in encoders and decoders cannot build up
excessively.
3.105 forward compatibility: A newer coding standard is forward compatible with an older coding standard if
decoders designed to operate with the newer coding standard are able to decode bitstreams of the older
coding standard.
3.106 forward motion vector [video]: A motion vector that is used for motion compensation from a
reference frame or reference field at an earlier time in display order.
3.107 forward prediction [video]: Prediction from the past reference frame (field).
3.108 frame [audio]: A part of the audio bit stream that corresponds to audio PCM samples from an Audio
Access Unit.
3.109 frame [video]: A frame contains lines of spatial information of a video signal. For progressive video,
these lines contain samples starting from one time instant and continuing through successive lines to the
bottom of the frame. For interlaced video a frame consists of two fields, a top field and a bottom field. One of
these fields may be temporally located one field period later than the other.
3.110 frame period [video]: The reciprocal of the frame rate.
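The field period (3.96) and the frame period (3.110) both follow directly from the frame rate. The sketch below keeps them as exact fractions of a second, which suits rates such as 30000/1001 Hz. The type `rational_t` and both function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper type: a duration in seconds as an exact fraction. */
typedef struct {
    long num;
    long den;
} rational_t;

/* Frame period = 1 / frame_rate, with frame_rate = rate_num / rate_den. */
rational_t frame_period(long rate_num, long rate_den)
{
    assert(rate_num > 0 && rate_den > 0);
    rational_t p = { rate_den, rate_num };
    return p;
}

/* Field period = 1 / (2 * frame_rate). */
rational_t field_period(long rate_num, long rate_den)
{
    assert(rate_num > 0 && rate_den > 0);
    rational_t p = { rate_den, 2 * rate_num };
    return p;
}
```

For a 25 Hz frame rate this gives a frame period of 1/25 s and a field period of 1/50 s. The fractions are not reduced to lowest terms.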
3.111 frame picture; frame structure picture [video]: A frame structure picture is a coded picture with
picture_structure equal to "Frame".
3.112 frame rate [video]: The rate at which frames are output from the decoding process.
3.113 frame reordering [video]: The process of reordering the reconstructed frames when the coded order
is different from the display order. Frame reordering occurs when B-frames are present in a bitstream. There
is no frame reordering when decoding low delay bitstreams.
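Frame reordering as described in 3.113 can be sketched as follows. The sketch assumes frame pictures given as a string of picture types ('I', 'P', 'B') in coded order, and assumes that each reference frame is held back until the next reference frame arrives while B-frames are output immediately. The function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the coded-order index of the picture that
 * is displayed at position k, or -1 if k is out of range. */
int coded_index_of_displayed(const char *coded_types, int k)
{
    assert(coded_types != NULL);
    int held = -1;  /* reference frame waiting to be output, if any */
    int out = 0;    /* number of frames output so far */
    int n = (int)strlen(coded_types);

    for (int i = 0; i < n; i++) {
        int emit;
        if (coded_types[i] == 'B') {
            emit = i;        /* B-frames are output as soon as decoded */
        } else {
            emit = held;     /* a new reference releases the held one */
            held = i;
        }
        if (emit >= 0) {
            if (out == k)
                return emit;
            out++;
        }
    }
    /* The last reference frame is output at the end of the sequence. */
    if (held >= 0 && out == k)
        return held;
    return -1;
}
```

With the coded order "IPBBPBB" the display order in coded indices is 0, 2, 3, 1, 5, 6, 4. Without B-frames, as in a low delay bitstream, the two orders are identical.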
3.114 frame-based prediction [video]: A prediction mode using both fields of the reference frame.
3.115 free format [audio]: Any bitrate other than the defined bitrates that is less than the maximum valid
bitrate for each layer.
3.116 future reference frame (field) [video]: A future reference frame (field) is a reference frame (field) that
occurs at a later time than the current picture in display order.
3.117 granules [Layer II] [audio]: The set of 3 consecutive subband samples from all 32 subbands that are
considered together before quantisation. They correspond to 96 PCM samples.
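The arithmetic behind the Layer II granule size in 3.117 can be written out as a short sketch. The constant and function names are hypothetical.

```c
#include <assert.h>

enum {
    SUBBANDS = 32,                   /* subbands of the filterbank */
    SAMPLES_PER_SUBBAND_GRANULE = 3  /* consecutive samples per subband */
};

/* PCM samples represented by one Layer II granule: 3 * 32 = 96. */
int layer2_granule_pcm_samples(void)
{
    return SAMPLES_PER_SUBBAND_GRANULE * SUBBANDS;
}

/* Number of whole Layer II granules contained in a block of PCM samples. */
int layer2_granules_in(int pcm_samples)
{
    assert(pcm_samples >= 0);
    return pcm_samples / layer2_granule_pcm_samples();
}
```

Assuming a Layer II frame of 1152 PCM samples per channel, `layer2_granules_in(1152)` yields 12 granules.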
3.118 granules [Layer III] [audio]: 576 frequency lines that carry their own side information.
3.119 group of pictures [video]: A notion defined only in ISO/IEC 11172-2 (MPEG-1 Video). In ISO/IEC
13818-2, a similar functionality can be achieved by means of inserting group of pictures headers.
3.120 Hann window [audio]: A time function applied sample-by-sample to a block of audio samples before
Fourier transformation.
3.121 header: A block of data in the coded bitstream containing the coded representation of a number of data
elements pertaining to the coded data that follow the header in the bitstream.
3.122 Huffman coding: A specific method for entropy coding.
3.123 hybrid filterbank [audio]: A serial combination of subband filterbank and MDCT.
3.124 hybrid scalability [video]: Hybrid scalability is the combination of two (or more) types of scalability.
3.125 I-field picture [video]: A field structure I-Picture.
3.126 I-frame picture [video]: A frame structure I-Picture.
3.127 I-picture; intra-coded picture [video]: A picture coded using information only from itself.
3.128 IDCT: Inverse Discrete Cosine Transform.
3.129 IMDCT [audio]: Inverse Modified Discrete Cosine Transform.
3.130 intensity stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic
audio programmes based on retaining at high frequencies only the energy envelope of the right and left
channels.
3.131 interlace [video]: The property of conventional television frames where alternating lines of the frame
represent different instances in time. In an interlaced frame, one of the fields is meant to be displayed first.
This field is called the first field. The first field can be the top field or the bottom field of the frame.
3.132 intra coding [video]: Coding of a macroblock or picture that uses information only from that
macroblock or picture.
3.133 ITU-T Rec. H.222.0 | ISO/IEC 13818 (multiplexed) stream [systems]: A bitstream composed of 0 or
more elementary streams combined in the manner defined in ITU-T Rec. H.222.0 | ISO/IEC 13818-1.
3.134 joint stereo coding [audio]: Any method that exploits stereophonic irrelevance or stereophonic
redundancy.
3.135 joint stereo mode [audio]: A mode of the audio coding algorithm using joint stereo coding.
3.136 layer [audio]: One of the levels in the coding hierarchy of the audio system defined in ISO/IEC
13818-3.
3.137 layer [systems]: One of the levels in the data hierarchy of the video and system specifications defined
in ISO/IEC 13818 parts 1 and 2.
3.138 layer [video]: In a scalable hierarchy, denotes one of the ordered set of bitstreams and (the result
of) its associated decoding process (implicitly including decoding of all layers below this layer).
3.139 layer bitstream [video]: A single bitstream associated to a specific layer (always used in conjunction
with layer qualifiers, e.g. "enhancement layer bitstream").
3.140 level [video]: A defined set of constraints on the values which may be taken by the parameters of this
specification within a particular profile. A profile may contain one or more levels. In a different context, level is
the absolute value of a non-zero coefficient (see “run”).
3.141 LFE [audio]: Low Frequency Enhancement channel. A limited bandwidth channel for low frequency
audio effects in a multichannel system.
3.142 low frequency enhancement channel [audio]: A limited bandwidth channel for low frequency audio
effects in a multichannel system.
3.143 lower layer [video]: A relative reference to the layer immediately below a given enhancement layer
(implicitly including decoding of all layers below this enhancement layer).
3.144 luminance component [video]: A matrix, block or single sample representing a monochrome
representation of the signal and related to the primary colours in the manner defined in the bitstream. The
symbol used for luminance is Y.
3.145 macroblock [video]: The four 8 by 8 blocks of luminance data and the two (for 4:2:0 chrominance
format), four (for 4:2:2 chrominance format) or eight (for 4:4:4 chrominance format) corresponding 8 by 8
blocks of chrominance data coming from a 16 by 16 section of the luminance component of the picture.
Macroblock is sometimes used to refer to the sample data and sometimes to the coded representation of the
sample values and other data elements defined in the macroblock header of the syntax defined in this part of
this specification. The usage is clear from the context.
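The block counts in 3.145 can be summarised in a small sketch. The enum `chroma_format_t` and its members are illustrative labels for this example only and do not reproduce the coded values of any syntax element.

```c
#include <assert.h>

/* Illustrative labels for the three chrominance formats. */
typedef enum { CHROMA_420, CHROMA_422, CHROMA_444 } chroma_format_t;

/* Number of 8 by 8 chrominance blocks in one macroblock. */
int chroma_blocks_per_macroblock(chroma_format_t fmt)
{
    switch (fmt) {
    case CHROMA_420: return 2;
    case CHROMA_422: return 4;
    case CHROMA_444: return 8;
    }
    return -1;  /* unknown format */
}

/* Total 8 by 8 blocks: four luminance blocks plus the chrominance blocks. */
int blocks_per_macroblock(chroma_format_t fmt)
{
    int chroma = chroma_blocks_per_macroblock(fmt);
    assert(chroma > 0);
    return 4 + chroma;
}
```

This gives 6, 8 and 12 blocks per macroblock for the 4:2:0, 4:2:2 and 4:4:4 formats respectively.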
3.146 mapping [audio]: Conversion of an audio signal from time to frequency domain by subband filtering
and/or by MDCT.
3.147 masking [audio]: A property of the human auditory system by which an audio signal cannot be
perceived in the presence of another audio signal.
3.148 masking threshold [audio]: A function in frequency and time below which an audio signal cannot be
perceived by the human auditory system.
3.149 Mbit [video]: 1 000 000 bits.
3.150 MCP [video]: Motion Compensated Predictor.
3.151 MDCT [audio]: Modified Discrete Cosine Transform which corresponds to the Time Domain Aliasing
Cancellation Filter Bank.
3.152 mismatch [video]: Numerical discrepancy between the data reconstructed from the same coded
bitstream by two decoding processes. With the exception of IDCT, the specification of ISO/IEC 13818-2
defines the decoding process absolutely unambiguously. Therefore, if both decoding processes are
implemented according to ISO/IEC 13818-2, mismatch can only be caused by different
implementations of IDCT.
3.153 motion compensation [video]: The use of motion vectors to improve the efficiency of the prediction of
sample values. The prediction uses motion vectors to provide offsets into the past and/or future reference
frames or reference fields containing previously decoded sample values that are used to form the prediction
error.
3.154 motion estimation [video]: The process of estimating motion vectors during the encoding process.
3.155 motion vector [video]: A two-dimensional vector used for motion compensation that provides an offset
from the coordinate position in the current picture or field to the coordinates in a reference frame or reference
field.
3.156 MS stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic audio
programmes based on coding the sum and difference signal instead of the left and right channels.
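The sum and difference coding in 3.156 can be sketched as below. The scaling by 1/2 is an illustrative assumption chosen so that the round trip is exact in this example; an actual codec may normalise differently. All type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical sample pairs for the two representations. */
typedef struct { double mid; double side; } ms_pair_t;
typedef struct { double left; double right; } lr_pair_t;

/* Encoder side: replace left/right by scaled sum and difference. */
ms_pair_t ms_encode(double left, double right)
{
    ms_pair_t p = { (left + right) / 2.0, (left - right) / 2.0 };
    return p;
}

/* Decoder side: recover left/right from the sum and difference signals. */
lr_pair_t ms_decode(double mid, double side)
{
    lr_pair_t p = { mid + side, mid - side };
    return p;
}
```

When the two channels are similar, the side signal stays close to zero and can be coded with few bits, which is where the redundancy reduction comes from.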
3.157 multichannel [audio]: A combination of audio channels used to create a spatial sound field.
3.158 multilingual [audio]: A presentation of dialogue in more than one language.
3.159 NIT [systems]: Network Information Table as defined in table 2-23 of ISO/IEC 13818-1.
3.160 non-intra coding [video]: Coding of a macroblock or picture that uses information both from itself and
from macroblocks and pictures occurring at other times.
3.161 non-tonal component [audio]: A noise-like component of an audio signal.
3.162 Nyquist sampling: Sampling at or above twice the maximum bandwidth of a signal.
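The condition in 3.162 reduces to a single comparison, shown here as a minimal sketch with a hypothetical function name.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if the sampling rate is at or above twice the maximum
 * bandwidth of the signal (both in Hz). */
bool satisfies_nyquist(double sample_rate_hz, double bandwidth_hz)
{
    assert(sample_rate_hz > 0.0 && bandwidth_hz >= 0.0);
    return sample_rate_hz >= 2.0 * bandwidth_hz;
}
```

For example, a 48 kHz sampling rate satisfies the condition for a 20 kHz bandwidth, whereas 32 kHz does not.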
3.163 opposite parity [video]: The opposite parity of top is bottom, and vice versa.
3.164 P-field picture [video]: A field structure P-Picture.
3.165 P-frame picture [video]: A frame structure P-Picture.
3.166 P-picture; predictive-coded picture [video]: A picture that is coded using motion compensated
prediction from past reference fields or frames.
3.167 pack [systems]: A pack consists of a pack header followed by zero or more packets. It is a layer in
the system coding syntax described in 2.5.3.3 on page 51 of ISO/IEC 13818-1.
3.168 packet [systems]: A packet consists of a header followed by a number of contiguous bytes from an
elementary data stream. It is a layer in the system coding syntax described in 2.4.3 of ISO/IEC 13818-1.
3.169 packet data [systems]: Contiguous bytes of data from an elementary stream present in a packet.
3.170 packet identifier; PID [systems]: A unique integer value used to associate elementary streams of a
program in a single or multi-program Transport Stream as described in 2.4.3 of ISO/IEC 13818-1.
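As an illustration of 3.170, the sketch below reads the PID from the header of a Transport Stream packet. It assumes the usual layout in which the packet starts with the sync byte 0x47 and the 13-bit PID occupies the low five bits of the second byte and all of the third byte. The function name is hypothetical and is not taken from the reference software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TS_SYNC_BYTE 0x47

/* Hypothetical helper: returns the 13-bit PID of a Transport Stream
 * packet, or -1 if the packet does not start with the sync byte.
 * pkt must point to at least the first three header bytes. */
int ts_packet_pid(const uint8_t *pkt)
{
    assert(pkt != NULL);
    if (pkt[0] != TS_SYNC_BYTE)
        return -1;
    return ((pkt[1] & 0x1F) << 8) | pkt[2];
}
```

A demultiplexer can use the returned value to route each packet to the elementary stream or table it belongs to.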
3.171 padding [audio]: A method to adjust the average length of an audio frame in time to the duration of the
corresponding PCM samples, by conditionally adding a slot to the audio frame.
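The need for padding (3.171) shows up when the frame length is computed. The sketch below assumes Layer II, with 1152 PCM samples per frame and one-byte slots, so that the frame length in bytes is 144 times the bit rate divided by the sampling frequency, plus one slot when padding is applied. The function name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: Layer II frame length in bytes.
 * 1152 samples per frame / 8 bits per byte gives the factor 144. */
long layer2_frame_bytes(long bitrate_bps, long sample_rate_hz, int padding)
{
    assert(bitrate_bps > 0 && sample_rate_hz > 0);
    return 144L * bitrate_bps / sample_rate_hz + (padding ? 1 : 0);
}
```

At 128 kbit/s and 48 kHz the division is exact (384 bytes), so no padding is needed. At 44,1 kHz the exact value lies between 417 and 418 bytes, so the encoder has to alternate between the two lengths to keep the average frame duration aligned with the PCM samples.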
3.172 parameter: A variable within the syntax of this specification which may take one of a range of values.
A variable which can take one of only two values is a flag or indicator and not a parameter.
3.173 parity (of field) [video]: The parity of a field can be top or bottom.
3.174 parser: Functional stage of a decoder which extracts from a coded bitstream series of bits representing
coded elements (FLC or VLC).
3.175 past reference frame (field) [video]: A past reference frame (field) is a reference frame (field) that
occurs at an earlier time than the current picture in display order.
3.176 PAT [systems]: Program Association Table as defined in clause 2.4.4.3 of ISO/IEC 13818-1.
3.177 payload [systems]: Payload refers to the bytes which follow the header bytes in a packet. For
example, the payload of a Transport Stream packet includes the PES_packet_header and its
PES_packet_data_bytes, or pointer_field and PSI sections, or private data; but a PES_packet_payload
consists of only PES_packet_data_bytes. The Transport Stream packet header and adaptation fields are not
payload.
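The statement in 3.177 that the Transport Stream packet header and adaptation fields are not payload can be sketched as an offset calculation. The sketch assumes a 188-byte packet with a 4-byte header, a two-bit adaptation_field_control in bits 5 and 4 of the fourth byte, and an adaptation field that begins with its own length byte. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the byte offset of the payload within a
 * Transport Stream packet, or -1 if the packet carries no payload or is
 * malformed. pkt must point to at least the first five bytes. */
int ts_payload_offset(const uint8_t *pkt)
{
    assert(pkt != NULL);
    if (pkt[0] != 0x47)
        return -1;                    /* missing sync byte */

    int afc = (pkt[3] >> 4) & 0x3;    /* adaptation_field_control */
    if ((afc & 0x1) == 0)
        return -1;                    /* no payload in this packet */

    int offset = 4;                   /* skip the packet header */
    if (afc & 0x2)
        offset += 1 + pkt[4];         /* skip length byte and adaptation field */

    return offset < 188 ? offset : -1;
}
```

The bytes from the returned offset to the end of the packet are the payload in the sense of this definition, for example the start of a PES packet or a pointer_field followed by PSI sections.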
3.178 PES [systems]: An abbreviation for Packetized Elementary Stream.
3.179 PES packet [systems]: The data structure used to carry elementary stream data. It consists of a PES
packet header followed by PES packet payload and is described in 2.4.3.6 and 2.4.3.7 of ISO/IEC 13818-1.
3.180 PES packet header [systems]: The leading fields in a PES packet up to and not including the
PES_packet_data_byte fields, where the stream is not a padding stream. In the case of a padding stream the
PES packet header is similarly defined as the leading fields in a PES packet up to and not including
padding_byte fields.
3.181 PES Stream [systems]: A PES Stream consists of PES packets, all of whose payloads consist of data
from a single elementary stream, and all of which have the same stream_id. Specific semantic constraints
apply.
3.182 picture [video]: Source, coded or reconstructed image data. A source or reconstructed picture consists
of three rectangular matrices of 8-bit numbers representing the luminance and two chrominance signals. A
“coded picture” is defined in ISO/IEC 13818-2. For progressive video, a picture is identical to a frame, while for
interlaced video, a picture can refer to a frame, or the top field or the bottom field of the frame depending on
the context.
3.183 picture data [video]: In the VBV operations, picture data is defined as all the bits of the coded picture,
all the header(s) and user data immediately preceding it if any (including any stuffing between them) and all
the stuffing following it, up to (but not including) the next start code, except in the case where the next start
code is an end of sequence code, in which case it is included in the picture data.
3.184 polyphase filterbank [audio]: A set of equal bandwidth filters with special phase interrelationships,
allowing for an efficient implementation of the filterbank.
3.185 prediction [audio]: The use of a predictor to provide an estimate of the subband sample in one
channel from the subband samples in other channels.
3.186 prediction error: The difference between the actual value of a sample or data element
...







