ISO/IEC 13818-1:2013
Information technology - Generic coding of moving pictures and associated audio information - Part 1: Systems
ISO/IEC 13818-1:2013 specifies the system layer of the coding. It was developed principally to support the combination of the video and audio coding methods defined in ISO/IEC 13818-2 and ISO/IEC 13818-3. The system layer supports six basic functions: the synchronization of multiple compressed streams on decoding; the interleaving of multiple compressed streams into a single stream; the initialization of buffering for decoding start up; continuous buffer management; time identification; multiplexing and signalling of various components in a system stream.
Technologies de l'information — Codage générique des images animées et du son associé — Partie 1: Systèmes
ISO/IEC 13818-1:2013 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Generic coding of moving pictures and associated audio information - Part 1: Systems".
ISO/IEC 13818-1:2013 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 13818-1:2013 is linked to the following standards: ISO/IEC 13818-1:2013/Amd 1:2014, ISO/IEC 13818-1:2013/Amd 2:2014, ISO/IEC 13818-1:2013/Amd 3:2014, ISO/IEC 13818-1:2013/Amd 4:2014, ISO/IEC 13818-1:2013/FDAmd 5, ISO/IEC 13818-1:2015, ISO/IEC 13818-1:2007/Cor 1:2008, ISO/IEC 13818-1:2007, ISO/IEC 13818-1:2007/Amd 1:2007, ISO/IEC 13818-1:2007/Amd 5:2011, ISO/IEC 13818-1:2007/Amd 4:2009, ISO/IEC 13818-1:2007/Amd 3:2009, ISO/IEC 13818-1:2007/Amd 2:2008, ISO/IEC 13818-1:2007/Amd 6:2011. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
INTERNATIONAL ISO/IEC
STANDARD 13818-1
Fourth edition
2013-06-15
Information technology — Generic coding
of moving pictures and associated audio
information: Systems
Technologies de l'information — Codage générique des images
animées et du son associé: Systèmes
Reference number
© ISO/IEC 2013
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission.
Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
Case postale 56 CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
CONTENTS
1.1 Scope
1.2 Normative references
2.1 Definitions
2.2 Symbols and abbreviations
2.3 Method of describing bit stream syntax
2.4 Transport stream bitstream requirements
2.5 Program stream bitstream requirements
2.6 Program and program element descriptors
2.7 Restrictions on the multiplexed stream semantics
2.8 Compatibility with ISO/IEC 11172
2.9 Registration of copyright identifiers
2.10 Registration of private data format
2.11 Carriage of ISO/IEC 14496 data
2.12 Carriage of metadata
2.13 Carriage of ISO 15938 data
2.14 Carriage of Rec. ITU-T H.264 | ISO/IEC 14496-10 video
2.15 Carriage of ISO/IEC 14496-17 text streams
2.16 Carriage of auxiliary video streams
Annex A – CRC decoder model
A.1 CRC decoder model
Annex B – Digital Storage Medium Command and Control (DSM-CC)
B.1 Introduction
B.2 General elements
B.3 Technical elements
Annex C – Program-specific information
C.1 Explanation of program-specific information in transport streams
C.2 Introduction
C.3 Functional mechanism
C.4 The mapping of sections into transport stream packets
C.5 Repetition rates and random access
C.6 What is a program?
C.7 Allocation of program_number
C.8 Usage of PSI in a typical system
C.9 The relationships of PSI structures
C.10 Bandwidth utilization and signal acquisition time
Annex D – Systems timing model and application implications of this Recommendation | International Standard
D.1 Introduction
Annex E – Data transmission applications
E.1 General considerations
E.2 Suggestion
Annex F – Graphics of syntax for this Recommendation | International Standard
F.1 Introduction
Annex G – General information
G.1 General information
Annex H – Private data
H.1 Private data
Annex I – Systems conformance and real-time interface
I.1 Systems conformance and real-time interface
Annex J – Interfacing jitter-inducing networks to MPEG-2 decoders
J.1 Introduction
J.2 Network compliance models
J.3 Network specification for jitter smoothing
J.4 Example decoder implementations
Annex K – Splicing transport streams
K.1 Introduction
K.2 The different types of splicing point
K.3 Decoder behaviour on splices
Annex L – Registration procedure (see 2.9)
L.1 Procedure for the request of a Registered Identifier (RID)
L.2 Responsibilities of the Registration Authority
L.3 Responsibilities of parties requesting an RID
L.4 Appeal procedure for denied applications
Annex M – Registration application form (see 2.9)
M.1 Contact information of organization requesting a Registered Identifier (RID)
M.2 Statement of an intention to apply the assigned RID
M.3 Date of intended implementation of the RID
M.4 Authorized representative
M.5 For official use only of the Registration Authority
Annex N – Registration Authority diagram of administration structure (see 2.9)
Annex O – Registration procedure (see 2.10)
O.1 Procedure for the request of an RID
O.2 Responsibilities of the Registration Authority
O.3 Contact information for the Registration Authority
O.4 Responsibilities of parties requesting an RID
O.5 Appeal procedure for denied applications
Annex P – Registration application form
P.1 Contact information of organization requesting an RID
P.2 Request for a specific RID
P.3 Short description of RID that is in use and date system that was implemented
P.4 Statement of an intention to apply the assigned RID
P.5 Date of intended implementation of the RID
P.6 Authorized representative
P.7 For official use of the Registration Authority
Annex Q – T-STD and P-STD buffer models for ISO/IEC 13818-7 ADTS
Q.1 Introduction
Q.2 Leak rate from Transport Buffer
Q.3 Buffer size
Q.4 Conclusion
Annex R – Carriage of ISO/IEC 14496 scenes in Rec. ITU-T H.222.0 | ISO/IEC 13818-1
R.1 Content access procedure for ISO/IEC 14496 program components within a program stream
R.2 Content access procedure for ISO/IEC 14496 program components within a transport stream
Annex S – Carriage of JPEG 2000 part 1 video over MPEG-2 transport streams
S.1 Introduction
S.2 J2K video access unit, J2K video elementary stream, J2K video sequence and J2K still picture
S.3 Elementary stream header (elsm) and mapping to PES packets
S.4 J2K transport constraints
S.5 Interpretation of flags in adaptation and PES headers for J2K video elementary streams
S.6 T-STD extension for J2K video elementary streams
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form
the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the
development of International Standards through technical committees established by the respective organization to deal
with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest.
Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International Standards
adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International
Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO
and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 13818-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee
SC 29, Coding of audio, picture, multimedia and hypermedia information.
This fourth edition cancels and replaces the third edition (ISO/IEC 13818-1:2007), which has been technically revised. It
also incorporates the Amendments ISO/IEC 13818-1:2007/Amd1:2007, ISO/IEC 13818-1:2007/Amd2:2008, ISO/IEC
13818-1:2007/Amd3:2009, ISO/IEC 13818-1:2007/Amd3:2009/Cor1:2011, ISO/IEC 13818-1:2007/Amd4:2009,
ISO/IEC 13818-1:2007/Amd5:2011, ISO/IEC 13818-1:2007/Amd6:2011 and the Technical Corrigenda
ISO/IEC 13818-1:2007/Cor1:2008, ISO/IEC 13818-1:2007/Cor2:2009, and ISO/IEC 13818-1:2007/Cor3:2011.
ISO/IEC 13818 consists of the following parts, under the general title Information technology — Generic coding of
moving pictures and associated audio information:
— Part 1: Systems
— Part 2: Video
— Part 3: Audio
— Part 4: Conformance testing
— Part 5: Software simulation [Technical Report]
— Part 6: Extensions for DSM-CC
— Part 7: Advanced Audio Coding (AAC)
— Part 9: Extension for real time interface for systems decoders
— Part 10: Conformance extensions for Digital Storage Media Command and Control (DSM-CC)
— Part 11: IPMP on MPEG-2 systems
Introduction
The systems part of this Recommendation | International Standard addresses the combining of one or more elementary
streams of video and audio, as well as other data, into single or multiple streams which are suitable for storage or
transmission. Systems coding follows the syntactical and semantic rules imposed by this Specification and provides
information to enable synchronized decoding without overflow or underflow of decoder buffers over a wide range of
retrieval or receipt conditions.
System coding shall be specified in two forms: the transport stream and the program stream. Each is optimized for a
different set of applications. Both the transport stream and program stream defined in this Recommendation |
International Standard provide coding syntax which is necessary and sufficient to synchronize the decoding and
presentation of the video and audio information, while ensuring that data buffers in the decoders do not overflow or
underflow. Information is coded in the syntax using time stamps concerning the decoding and presentation of coded
audio and visual data and time stamps concerning the delivery of the data stream itself. Both stream definitions are
packet-oriented multiplexes.
The basic multiplexing approach for single video and audio elementary streams is illustrated in Figure Intro. 1. The video
and audio data is encoded as described in Rec. ITU-T H.262 | ISO/IEC 13818-2 and ISO/IEC 13818-3. The resulting
compressed elementary streams are packetized to produce PES packets. Information needed to use PES packets
independently of either transport streams or program streams may be added when PES packets are formed. This
information is not needed and need not be added when PES packets are further combined with system level information
to form transport streams or program streams. This systems standard covers those processes to the right of the vertical
dashed line.
[Figure H.222.0(12)_F01. Block labels: Video data, Video encoder, Packetizer, Video PES; Audio data, Audio encoder, Packetizer, Audio PES; PS Mux, Program stream; TS Mux, Transport stream; Extent of systems specification.]
Figure Intro. 1 – Simplified overview of the scope of this Recommendation | International Standard
The program stream is analogous and similar to the ISO/IEC 11172 systems layer. It results from combining one or more
streams of PES packets, which have a common time base, into a single stream.
For applications that require the elementary streams that comprise a single program to be in separate streams that are not
multiplexed, the elementary streams can also be encoded as separate program streams, one per elementary stream, with a
common time base. In this case the values encoded in the SCR fields of the various streams shall be consistent.
Like the single program stream, all elementary streams can be decoded with synchronization.
The program stream is designed for use in relatively error-free environments and is suitable for applications which may
involve software processing of system information such as interactive multi-media applications. Program stream packets
may be of variable and relatively great length.
The transport stream combines one or more programs with one or more independent time bases into a single stream. PES
packets made up of elementary streams that form a program share a common timebase. The transport stream is designed
for use in environments where errors are likely, such as storage or transmission in lossy or noisy media. Transport stream
packets are 188 bytes in length.
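The fixed packet length makes it straightforward to cut a captured transport stream into packets. The following is a minimal sketch, not a conformant parser. The 188-byte length comes from the text above; the sync byte value 0x47 is an assumption taken from the normative packet syntax, which this introduction does not reproduce, and the function name is hypothetical.

```python
TS_PACKET_SIZE = 188  # packet length stated in the introduction
SYNC_BYTE = 0x47      # assumption: first byte of every packet per the normative syntax


def split_ts_packets(data: bytes) -> list:
    """Split an aligned byte string into 188-byte transport stream packets."""
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("length is not a multiple of 188 bytes")
    packets = [data[i:i + TS_PACKET_SIZE]
               for i in range(0, len(data), TS_PACKET_SIZE)]
    for index, packet in enumerate(packets):
        # A missing sync byte usually means the input is misaligned or corrupted.
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"packet {index} does not start with the sync byte")
    return packets
```

A real receiver would typically search for the sync byte to regain alignment after errors rather than reject the input outright.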
Program and transport streams are designed for different applications and their definitions do not strictly follow a layered
model. It is possible and reasonable to convert from one to the other; however, one is not a subset or superset of the
other. In particular, extracting the contents of a program from a transport stream and creating a valid program stream is
possible and is accomplished through the common interchange format of PES packets, but not all of the fields needed in a
program stream are contained within the transport stream; some must be derived. The transport stream may be used to
span a range of layers in a layered model, and is designed for efficiency and ease of implementation in high bandwidth
applications.
The scope of syntactical and semantic rules set forth in the systems specification differs: the syntactical rules apply to
systems layer coding only, and do not extend to the compression layer coding of the video and audio specifications; by
contrast, the semantic rules apply to the combined stream in its entirety.
The systems specification does not specify the architecture or implementation of encoders or decoders, nor those of
multiplexors or demultiplexors. However, bit stream properties do impose functional and performance requirements on
encoders, decoders, multiplexors and demultiplexors. For instance, encoders must meet minimum clock tolerance
requirements. Notwithstanding this and other requirements, a considerable degree of freedom exists in the design and
implementation of encoders, decoders, multiplexors, and demultiplexors.
Intro. 1 Transport stream
The transport stream is a stream definition which is tailored for communicating or storing one or more programs of coded
data according to Rec. ITU-T H.262 | ISO/IEC 13818-2 and ISO/IEC 13818-3 and other data in environments in which
significant errors may occur. Such errors may be manifested as bit value errors or loss of packets.
Transport streams may be either fixed or variable rate. In either case the constituent elementary streams may either be
fixed or variable rate. The syntax and semantic constraints on the stream are identical in each of these cases. The
transport stream rate is defined by the values and locations of program clock reference (PCR) fields, which in general are
separate PCR fields for each program.
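As an illustration of how PCR values and locations determine the rate, the sketch below estimates the byte rate between two PCRs of the same program. It assumes a nominal 27 MHz system clock and a PCR that wraps at 2^33 x 300 ticks; both figures come from the normative timing clauses rather than from this introduction, and the function and parameter names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000   # assumption: nominal system clock frequency
PCR_MODULUS = (1 << 33) * 300  # assumption: PCR wraps after 2^33 * 300 ticks


def transport_rate_bytes_per_s(pcr_a: int, byte_index_a: int,
                               pcr_b: int, byte_index_b: int) -> float:
    """Estimate the byte rate between two PCRs, each given in 27 MHz ticks.

    byte_index_a and byte_index_b are the positions in the stream of the
    bytes that carry the two PCR samples.
    """
    ticks = (pcr_b - pcr_a) % PCR_MODULUS  # tolerate a single wrap-around
    if ticks == 0:
        raise ValueError("PCR values are identical; rate is undefined")
    return (byte_index_b - byte_index_a) * SYSTEM_CLOCK_HZ / ticks
```

For example, two PCRs one second apart (27 000 000 ticks) with 188 000 bytes between them give a rate of 188 000 bytes per second.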
There are some difficulties with constructing and delivering a transport stream containing multiple programs with
independent time bases such that the overall bit rate is variable. Refer to 2.4.2.2.
The transport stream may be constructed by any method that results in a valid stream. It is possible to construct transport
streams containing one or more programs from elementary coded data streams, from program streams, or from other
transport streams which may themselves contain one or more programs.
The transport stream is designed in such a way that several operations on a transport stream are possible with minimum
effort. Among these are:
1) Retrieve the coded data from one program within the transport stream, decode it and present the decoded
results as shown in Figure Intro. 2.
2) Extract the transport stream packets from one program within the transport stream and produce as output a
different transport stream with only that one program as shown in Figure Intro. 3.
3) Extract the transport stream packets of one or more programs from one or more transport streams and
produce as output a different transport stream (not illustrated).
4) Extract the contents of one program from the transport stream and produce as output a program stream
containing that one program as shown in Figure Intro. 4.
5) Take a program stream, convert it into a transport stream to carry it over a lossy environment, and then
recover a valid, and in certain cases, identical program stream.
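Operation 2) can be sketched as a filter on the packet identifier (PID). The PID field layout used here (13 bits spanning the second and third header bytes) is an assumption taken from the normative packet syntax, and the function names are hypothetical. A real extraction would also have to carry, and usually rewrite, the program-specific information so that the output is a valid single-program stream; that step is not shown.

```python
def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID from a transport stream packet header."""
    # Assumption: the PID is the low 5 bits of byte 1 followed by all of byte 2.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def extract_pids(packets, wanted_pids):
    """Keep only the packets whose PID is in wanted_pids, preserving order."""
    wanted = set(wanted_pids)
    return [packet for packet in packets if packet_pid(packet) in wanted]
```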
Figure Intro. 2 and Figure Intro. 3 illustrate prototypical demultiplexing and decoding systems which take as input a
transport stream. Figure Intro. 2 illustrates the first case, where a transport stream is directly demultiplexed and decoded.
Transport streams are constructed in two layers:
– a system layer; and
– a compression layer.
The input stream to the transport stream decoder has a system layer wrapped about a compression layer. Input streams to
the video and audio decoders have only the compression layer.
Operations performed by the prototypical decoder which accepts transport streams either apply to the entire transport
stream ("multiplex-wide operations"), or to individual elementary streams ("stream-specific operations"). The transport
stream system layer is divided into two sub-layers, one for multiplex-wide operations (the transport stream packet layer),
and one for stream-specific operations (the PES packet layer).
A prototypical decoder for transport streams, including audio and video, is also depicted in Figure Intro. 2 to illustrate the
function of a decoder. The architecture is not unique – some system decoder functions, such as decoder timing control,
might equally well be distributed among elementary stream decoders and the channel-specific decoder – but this figure is
useful for discussion. Likewise, indication of errors detected by the channel-specific decoder to the individual audio and
video decoders may be performed in various ways and such communication paths are not shown in the diagram. The
prototypical decoder design does not imply any normative requirement for the design of a transport stream decoder.
Indeed non-audio/video data is also allowed, but not shown.
[Figure H.222.0(12)_F02. Block labels: Channel, Channel-specific decoder, Transport stream demultiplex and decoder, Clock control, Video decoder, Decoded video, Audio decoder, Decoded audio. Input: Transport stream containing one or multiple programs.]
Figure Intro. 2 – Prototypical transport demultiplexing and decoding example
Figure Intro. 3 illustrates the second case, where a transport stream containing multiple programs is converted into a
transport stream containing a single program. In this case the re-multiplexing operation may necessitate the correction of
program clock reference (PCR) values to account for changes in the PCR locations in the bit stream.
[Figure H.222.0(12)_F03. Block labels: Channel, Channel-specific decoder, Transport stream demultiplex and decoder. Input: Transport stream containing multiple programs. Output: Transport stream with single program.]
Figure Intro. 3 – Prototypical transport multiplexing example
Figure Intro. 4 illustrates a case in which a multi-program transport stream is first demultiplexed and then converted into
a program stream.
Figures Intro. 3 and Intro. 4 indicate that it is possible and reasonable to convert between different types and
configurations of transport streams. There are specific fields defined in the transport stream and program stream syntax
which facilitate the conversions illustrated. There is no requirement that specific implementations of demultiplexors or
decoders include all of these functions.
[Figure H.222.0(12)_F04. Block labels: Channel, Channel-specific decoder, Transport stream demultiplex and program stream multiplexor. Input: Transport stream containing multiple programs. Output: Program stream.]
Figure Intro. 4 – Prototypical transport stream to program stream conversion
Intro. 2 Program stream
The program stream is a stream definition which is tailored for communicating or storing one program of coded data and
other data in environments where errors are very unlikely, and where processing of system coding, e.g., by software, is a
major consideration.
Program streams may be either fixed or variable rate. In either case, the constituent elementary streams may be either
fixed or variable rate. The syntax and semantic constraints on the stream are identical in each case. The program stream
rate is defined by the values and locations of the system clock reference (SCR) and mux_rate fields.
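To show how the SCR and mux_rate fields together fix a delivery schedule, the sketch below computes the intended arrival time of a byte that follows an SCR. It assumes that the SCR is counted in 27 MHz ticks and that mux_rate is expressed in units of 50 bytes per second; both details come from the normative program stream clauses rather than from this introduction, and the function name is hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000  # assumption: nominal system clock frequency
MUX_RATE_UNIT = 50            # assumption: mux_rate counts units of 50 bytes/s


def byte_arrival_time(scr_ticks: int, mux_rate_field: int, byte_offset: int) -> float:
    """Intended arrival time, in seconds, of a byte located byte_offset
    bytes after the byte that carries the SCR."""
    rate = mux_rate_field * MUX_RATE_UNIT  # bytes per second
    if rate <= 0:
        raise ValueError("mux_rate must be positive")
    return scr_ticks / SYSTEM_CLOCK_HZ + byte_offset / rate
```

With an SCR of 27 000 000 ticks (one second) and a mux_rate field of 20 000 (one million bytes per second), the byte one million bytes later is due two seconds into the time base.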
A prototypical audio/video program stream decoder system is depicted in Figure Intro. 5. The architecture is not unique –
system decoder functions including decoder timing control might equally well be distributed among elementary stream
decoders and the channel-specific decoder – but this figure is useful for discussion. The prototypical decoder design does
not imply any normative requirement for the design of a program stream decoder. Indeed non-audio/video data is also
allowed, but not shown.
[Figure H.222.0(12)_F05. Block labels: Channel, Channel-specific decoder, Program stream decoder, Clock control, Video decoder, Decoded video, Audio decoder, Decoded audio. Input: Program stream.]
Figure Intro. 5 – Prototypical decoder for program streams
The prototypical decoder for program streams shown in Figure Intro. 5 is composed of system, video and audio decoders
conforming to Parts 1, 2 and 3, respectively, of ISO/IEC 13818. In this decoder, the multiplexed coded representation of
one or more audio and/or video streams is assumed to be stored or communicated on some channel in some
channel-specific format. The channel-specific format is not governed by this Recommendation | International Standard, nor is the
channel-specific decoding part of the prototypical decoder.
The prototypical decoder accepts as input a program stream and relies on a program stream decoder to extract timing
information from the stream. The program stream decoder demultiplexes the stream, and the elementary streams so
produced serve as inputs to video and audio decoders, whose outputs are decoded video and audio signals. Included in
the design, but not shown in the figure, is the flow of timing information among the program stream decoder, the video
and audio decoders, and the channel-specific decoder. The video and audio decoders are synchronized with each other
and with the channel using this timing information.
Program streams are constructed in two layers: a system layer and a compression layer. The input stream to the program
stream decoder has a system layer wrapped about a compression layer. Input streams to the video and audio decoders
have only the compression layer.
Operations performed by the prototypical decoder either apply to the entire program stream ("multiplex-wide
operations"), or to individual elementary streams ("stream-specific operations"). The program stream system layer is
divided into two sub-layers, one for multiplex-wide operations (the pack layer), and one for stream-specific operations
(the PES packet layer).
Intro. 3 Conversion between transport stream and program stream
It may be possible and reasonable to convert between transport streams and program streams by means of PES packets.
This results from the specification of transport stream and program stream as embodied in 2.4.1 and 2.5.1 of the
normative requirements of this Recommendation | International Standard. PES packets may, with some constraints, be
mapped directly from the payload of one multiplexed bit stream into the payload of another multiplexed bit stream. It is
possible to identify the correct order of PES packets in a program to assist with this if the
program_packet_sequence_counter is present in all PES packets.
Certain other information necessary for conversion, e.g., the relationship between elementary streams, is available in
tables and headers in both streams. Such data, if available, shall be correct in any stream before and after conversion.
Intro. 4 Packetized elementary stream
Transport streams and program streams are each logically constructed from PES packets, as indicated in the syntax
definitions in 2.4.3.6. PES packets shall be used to convert between transport streams and program streams; in some
cases the PES packets need not be modified when performing such conversions. PES packets may be much larger than
the size of a transport stream packet.
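Because a PES packet may be much larger than a transport stream packet, it is generally spread over several of them. The arithmetic can be sketched as follows, assuming a 4-byte transport packet header and no adaptation field in any packet; the header size is taken from the normative packet syntax, and the function name is hypothetical.

```python
TS_PACKET_SIZE = 188  # packet length stated in the introduction
TS_HEADER_SIZE = 4    # assumption: fixed header, no adaptation field present


def ts_packets_needed(pes_length: int) -> int:
    """Number of transport stream packets needed to carry one PES packet."""
    if pes_length < 0:
        raise ValueError("PES packet length cannot be negative")
    payload_per_packet = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes
    # Ceiling division without floating point.
    return -(-pes_length // payload_per_packet)
```

A 2048-byte PES packet, for instance, needs 12 transport stream packets under these assumptions, since 11 packets carry only 2024 bytes. In practice the unused space in the last packet is filled by stuffing, which this sketch does not model.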
A continuous sequence of PES packets of one elementary stream with one stream ID may be used to construct a PES
stream. When PES packets are used to form a PES stream, they shall include elementary stream clock reference (ESCR)
fields and elementary stream rate (ES_Rate) fields, with constraints as defined in 2.4.3.8. The PES stream data shall be
contiguous bytes from the elementary stream in their original order. PES streams do not contain some necessary system
information which is contained in program streams and transport streams. Examples include the information in the pack
header, system header, program stream map, program stream directory, program map table, and elements of the transport
stream packet syntax.
The PES stream is a logical construct that may be useful within implementations of this Recommendation | International
Standard; however, it is not defined as a stream for interchange and interoperability. Applications requiring streams
containing only one elementary stream can use program streams or transport streams which each contain only one
elementary stream. These streams contain all of the necessary system information. Multiple program streams or transport
streams, each containing a single elementary stream, can be constructed with a common time base and therefore carry a
complete program, i.e., with audio and video.
Intro. 5 Timing model
Systems, video and audio all have a timing model in which the end-to-end delay from the signal input to an encoder to
the signal output from a decoder is a constant. This delay is the sum of encoding, encoder buffering, multiplexing,
communication or storage, demultiplexing, decoder buffering, decoding, and presentation delays. As part of this timing
model all video pictures and audio samples are presented exactly once, unless specifically coded to the contrary, and the
inter-picture interval and audio sample rate are the same at the decoder as at the encoder. The system stream coding
contains timing information which can be used to implement systems which embody constant end-to-end delay. It is
possible to implement decoders which do not follow this model exactly; however, in such cases it is the decoder's
responsibility to perform in an acceptable manner. The timing is embodied in the normative specifications of this
Recommendation | International Standard, which must be adhered to by all valid bit streams, regardless of the means of
creating them.
All timing is defined in terms of a common system clock, referred to as a system time clock (STC). In the program
stream this clock may have an exactly specified ratio to the video or audio sample clocks, or it may have an operating
frequency which differs slightly from the exact ratio while still providing precise end-to-end timing and clock recovery.
In the transport stream the system clock frequency is constrained to have the exactly specified ratio to the audio and
video sample clocks at all times; the effect of this constraint is to simplify sample rate recovery in decoders.
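A small sketch may help to relate clock samples to wall-clock time. It assumes a 27 MHz system clock, a clock reference coded as a 33-bit base (in units of 1/90 000 s) plus an extension in the range 0 to 299, and presentation time stamps counted in 90 kHz units. These details come from the normative clauses rather than from this introduction, and the function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000  # assumption: nominal system clock frequency
TIME_STAMP_HZ = 90_000        # assumption: resolution of the base and of PTS/DTS


def clock_reference_ticks(base: int, extension: int) -> int:
    """Combine base and extension into a count of 27 MHz ticks."""
    if not 0 <= base < (1 << 33):
        raise ValueError("base must fit in 33 bits")
    if not 0 <= extension < 300:
        raise ValueError("extension must be in the range 0..299")
    return base * 300 + extension


def clock_reference_seconds(base: int, extension: int) -> float:
    """Clock reference expressed in seconds of the time base."""
    return clock_reference_ticks(base, extension) / SYSTEM_CLOCK_HZ


def time_stamp_seconds(time_stamp: int) -> float:
    """Presentation or decoding time stamp expressed in seconds."""
    return time_stamp / TIME_STAMP_HZ
```

Under these assumptions a base of 90 000 with extension 0 corresponds to exactly one second, the same instant as a time stamp of 90 000.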
Intro. 6 Conditional access
Encryption and scrambling for conditional access to programs encoded in the program and transport streams is supported
by the system data stream definitions. Conditional access mechanisms are not specified here. The stream definitions are
designed so that implementation of practical conditional access systems is reasonable, and there are some syntactical
elements specified which provide specific support for such systems.
Intro. 7 Multiplex-wide operations
Multiplex-wide operations include the coordination of data retrieval from the channel, the adjustment of clocks, and the
management of buffers. The tasks are intimately related. If the rate of data delivery from the channel is controllable, then
data delivery may be adjusted so that decoder buffers neither overflow nor underflow; but if the data rate is not
controllable, then elementary stream decoders must slave their timing to the data received from the channel to avoid
overflow or underflow.
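The overflow and underflow conditions can be illustrated with a toy occupancy check. This is not the normative system target decoder model; the event format, the function name and the capacity are hypothetical and serve only to show the bookkeeping.

```python
def check_buffer(events, capacity: int) -> str:
    """Walk through buffer events and report the first violation.

    events must already be in time order; each is (time, delta_bytes), where
    a positive delta is data arriving from the channel and a negative delta
    is data removed by the decoder.
    """
    occupancy = 0
    for time, delta in events:
        occupancy += delta
        if occupancy > capacity:
            return f"overflow at t={time}"
        if occupancy < 0:
            return f"underflow at t={time}"
    return "ok"
```

With a controllable channel, a sender could use such a check to delay delivery when an overflow is predicted; with an uncontrollable channel, the decoder has to adapt its own timing instead.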
Program streams are composed of packs whose headers facilitate the above tasks. Pack headers specify intended times at
which each byte is to enter the program stream decoder from the channel, and this target arrival schedule serves as a
reference for clock correction and buffer management. The schedule need not be followed exactly by decoders, but they
must compensate for deviations about it.
Similarly, transport streams are composed of transport stream packets with headers containing information which
specifies the times at which each byte is intended to enter a transport stream decoder from the channel. This schedule
provides exactly the same function as that which is specified in the program stream.
An additional multiplex-wide operation is a decoder's ability to establish what resources are required to decode a
transport stream or program stream. The first pack of each program stream conveys parameters to assist decoders in this
task. Included, for example, are the stream's maximum data rate and the highest number of simultaneous video channels.
The transport stream likewise contains globally useful information.
The transport stream and program stream each contain information which identifies the pertinent characteristics of, and
relationships between, the elementary streams which constitute each program. Such information may include the
language spoken in audio channels, as well as the relationship between video streams when multi-layer video coding is
implemented.
© ISO/IEC 2013 – All rights reserved
Intro. 8 Individual stream operations (PES packet layer)
The principal stream-specific operations are:
1) demultiplexing; and
2) synchronizing playback of multiple elementary streams.
Intro. 8.1 Demultiplexing
On encoding, program streams are formed by multiplexing elementary streams, and transport streams are formed by
multiplexing elementary streams, program streams, or the contents of other transport streams. Elementary streams may
include private, reserved, and padding streams in addition to audio and video streams. The streams are temporally
subdivided into packets, and the packets are serialized. A PES packet contains coded bytes from one and only one
elementary stream.
In the program stream both fixed and variable packet lengths are allowed subject to constraints as specified in 2.5.1
and 2.5.2. For transport streams the packet length is 188 bytes. Both fixed and variable PES packet lengths are allowed,
and will be relatively long in most applications.
On decoding, demultiplexing is required to reconstitute elementary streams from the multiplexed program stream or
transport stream. Stream_id codes in program stream packet headers, and packet ID codes in the transport stream make
this possible.
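As a minimal sketch of the transport stream case, the following groups 188-byte packets by packet ID. The header layout used here (sync byte 0x47, a 13-bit PID in the low five bits of the second byte plus the third byte) is recalled from the transport packet syntax and should be checked against 2.4.3; the function names are hypothetical.

```python
from collections import defaultdict

TS_PACKET_SIZE = 188   # transport stream packet length in bytes
SYNC_BYTE = 0x47       # assumed value of the first header byte


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit packet ID of one aligned transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not an aligned transport stream packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demultiplex(stream: bytes) -> dict:
    """Group whole packets by PID, preserving their order within each PID."""
    by_pid = defaultdict(list)
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        by_pid[packet_pid(packet)].append(packet)
    return dict(by_pid)
```

The sketch assumes the input is already packet-aligned and ignores adaptation fields and PES reassembly, which a real demultiplexer would have to handle.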
Intro. 8.2 Synchronization
Synchronization among multiple elementary streams is accomplished with presentation time stamps (PTSs) in the
program stream and transport streams. Time stamps are generally in units of 90 kHz, but the system clock reference
(SCR), the program clock reference (PCR) and the optional elementary stream clock reference (ESCR) have extensions
with a resolution of 27 MHz. Decoding of N elementary streams is synchronized by adjusting the decoding of streams to
a common master time base rather than by adjusting the decoding of one stream to match that of another. The master time
base may be one of the N decoders' clocks, the data source's clock, or it may be some external clock.
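The relationship between the two resolutions can be sketched as follows. The combination rule (base multiplied by 300, plus the extension) is recalled from the normative clauses rather than stated in this introduction, and all function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000   # resolution of the clock reference extensions
BASE_CLOCK_HZ = 90_000         # resolution of PTS values and clock reference bases
EXT_PER_BASE = SYSTEM_CLOCK_HZ // BASE_CLOCK_HZ   # 300 extension ticks per base tick


def pts_to_seconds(pts_ticks: int) -> float:
    """Convert a presentation time stamp in 90 kHz ticks to seconds."""
    return pts_ticks / BASE_CLOCK_HZ


def clock_reference_to_27mhz(base: int, extension: int) -> int:
    """Combine a 90 kHz base and a 27 MHz extension into 27 MHz ticks."""
    if not 0 <= extension < EXT_PER_BASE:
        raise ValueError("extension must be in the range 0..299")
    return base * EXT_PER_BASE + extension


def presentation_delay_seconds(pts_ticks: int, stc_27mhz: int) -> float:
    """Time remaining until presentation, measured on the master time base."""
    return pts_ticks / BASE_CLOCK_HZ - stc_27mhz / SYSTEM_CLOCK_HZ
```

Each elementary stream decoder would compare its own PTS values against the same master time base value, which is what keeps the streams aligned with one another.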
Each program in a transport stream, which may contain multiple programs, may have its own time base. The time bases
of different programs within a transport stream may be different.
Because PTSs apply to the decoding of individual elementary streams, they reside in the PES packet layer of both the
transport streams and program streams. End-to-end synchronization occurs when encoders save time stamps at capture
time, when the time stamps propagate with associated coded data to decoders, and when decoders use those time stamps
to schedule presentations.
Synchronization of a decoding system with a channel is achieved through the use of the SCR in the program stream and
by its analogue, the PCR, in the transport stream. The SCR and PCR are time stamps encoding the timing of the bit
stream itself, and are derived from the same time base used for the audio and video PTS values from the same program.
Since each program may have its own time base, there are separate PCR fields for each program in a transport stream
containing multiple programs. In some cases it may be possible for programs to share PCR fields. Refer to 2.4.4,
program-specific information (PSI), for the method of identifying which PCR is associated with a program. A program
shall have one and only one PCR time base associated with it.
Intro. 8.3 Relation to compression layer
The PES packet layer is independent of the compression layer in some senses, but not in all. It is independent in the sense
that PES packet payloads need not start at compression layer start codes, as defined in Parts 2 and 3 of ISO/IEC 13818.
For example, video start codes may occur anywhere within the payload of a PES packet, and start codes may be split by a
PES packet header. However, time stamps encoded in PES packet headers apply to presentation times of compression
layer constructs (namely, presentation units). In addition, when the elementary stream data conforms to Rec. ITU-T
H.262 | ISO/IEC 13818-2 or ISO/IEC 13818-3, the PES_packet_data_bytes shall be byte aligned to the bytes of this
Recommendation | International Standard.
Intro. 9 System reference decoder
Part 1 of ISO/IEC 13818 employs a "system target decoder" (STD), one for transport streams (refer to 2.4.2) referred to
as "transport system target decoder" (T-STD) and one for program streams (refer to 2.5.2) referred to as "program system
target decoder" (P-STD), to provide a formalism for timing and buffering relationships. Because the STD is
parameterized in terms of Rec. ITU-T H.222.0 | ISO/IEC 138
...
INTERNATIONAL STANDARD
ISO/IEC 13818-1
Fourth edition
2013-06-15
Information technology — Generic coding
of moving pictures and associated audio
information: Systems — Part 1
Technologies de l'information — Codage générique des images
animées et du son associé: Systèmes — Partie 1
© ISO/IEC 2013
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission.
Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
Case postale 56 CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
CONTENTS
1.1 Scope
1.2 Normative references
2.1 Definitions
2.2 Symbols and abbreviations
2.3 Method of describing bit stream syntax
2.4 Transport stream bitstream requirements
2.5 Program stream bitstream requirements
2.6 Program and program element descriptors
2.7 Restrictions on the multiplexed stream semantics
2.8 Compatibility with ISO/IEC 11172
2.9 Registration of copyright identifiers
2.10 Registration of private data format
2.11 Carriage of ISO/IEC 14496 data
2.12 Carriage of metadata
2.13 Carriage of ISO 15938 data
2.14 Carriage of Rec. ITU-T H.264 | ISO/IEC 14496-10 video
2.15 Carriage of ISO/IEC 14496-17 text streams
2.16 Carriage of auxiliary video streams
Annex A – CRC decoder model
A.1 CRC decoder model
Annex B – Digital Storage Medium Command and Control (DSM-CC)
B.1 Introduction
B.2 General elements
B.3 Technical elements
Annex C – Program-specific information
C.1 Explanation of program-specific information in transport streams
C.2 Introduction
C.3 Functional mechanism
C.4 The mapping of sections into transport stream packets
C.5 Repetition rates and random access
C.6 What is a program?
C.7 Allocation of program_number
C.8 Usage of PSI in a typical system
C.9 The relationships of PSI structures
C.10 Bandwidth utilization and signal acquisition time
Annex D – Systems timing model and application implications of this Recommendation | International Standard
D.1 Introduction
Annex E – Data transmission applications
E.1 General considerations
E.2 Suggestion
Annex F – Graphics of syntax for this Recommendation | International Standard
F.1 Introduction
Annex G – General information
G.1 General information
Annex H – Private data
H.1 Private data
Annex I – Systems conformance and real-time interface
I.1 Systems conformance and real-time interface
Annex J – Interfacing jitter-inducing networks to MPEG-2 decoders
J.1 Introduction
J.2 Network compliance models
J.3 Network specification for jitter smoothing
J.4 Example decoder implementations
Annex K – Splicing transport streams
K.1 Introduction
K.2 The different types of splicing point
K.3 Decoder behaviour on splices
Annex L – Registration procedure (see 2.9)
L.1 Procedure for the request of a Registered Identifier (RID)
L.2 Responsibilities of the Registration Authority
L.3 Responsibilities of parties requesting an RID
L.4 Appeal procedure for denied applications
Annex M – Registration application form (see 2.9)
M.1 Contact information of organization requesting a Registered Identifier (RID)
M.2 Statement of an intention to apply the assigned RID
M.3 Date of intended implementation of the RID
M.4 Authorized representative
M.5 For official use only of the Registration Authority
Annex N – Registration Authority diagram of administration structure (see 2.9)
Annex O – Registration procedure (see 2.10)
O.1 Procedure for the request of an RID
O.2 Responsibilities of the Registration Authority
O.3 Contact information for the Registration Authority
O.4 Responsibilities of parties requesting an RID
O.5 Appeal procedure for denied applications
Annex P – Registration application form
P.1 Contact information of organization requesting an RID
P.2 Request for a specific RID
P.3 Short description of RID that is in use and date system that was implemented
P.4 Statement of an intention to apply the assigned RID
P.5 Date of intended implementation of the RID
P.6 Authorized representative
P.7 For official use of the Registration Authority
Annex Q – T-STD and P-STD buffer models for ISO/IEC 13818-7 ADTS
Q.1 Introduction
Q.2 Leak rate from Transport Buffer
Q.3 Buffer size
Q.4 Conclusion
Annex R – Carriage of ISO/IEC 14496 scenes in Rec. ITU-T H.222.0 | ISO/IEC 13818-1
R.1 Content access procedure for ISO/IEC 14496 program components within a program stream
R.2 Content access procedure for ISO/IEC 14496 program components within a transport stream
Annex S – Carriage of JPEG 2000 part 1 video over MPEG-2 transport streams
S.1 Introduction
S.2 J2K video access unit, J2K video elementary stream, J2K video sequence and J2K still picture
S.3 Elementary stream header (elsm) and mapping to PES packets
S.4 J2K transport constraints
S.5 Interpretation of flags in adaptation and PES headers for J2K video elementary streams
S.6 T-STD extension for J2K video elementary streams
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form
the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the
development of International Standards through technical committees established by the respective organization to deal
with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest.
Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International Standards
adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International
Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO
and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 13818-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee
SC 29, Coding of audio, picture, multimedia and hypermedia information.
This fourth edition cancels and replaces the third edition (ISO/IEC 13818-1:2007), which has been technically revised. It
also incorporates the Amendments ISO/IEC 13818-1:2007/Amd1:2007, ISO/IEC 13818-1:2007/Amd2:2008, ISO/IEC
13818-1:2007/Amd3:2009, ISO/IEC 13818-1:2007/Amd3:2009/Cor1:2011, ISO/IEC 13818-1:2007/Amd4:2009,
ISO/IEC 13818-1:2007/Amd5:2011, ISO/IEC 13818-1:2007/Amd6:2011 and the Technical Corrigenda
ISO/IEC 13818-1:2007/Cor1:2008, ISO/IEC 13818-1:2007/Cor2:2009, and ISO/IEC 13818-1:2007/Cor3:2011.
ISO/IEC 13818 consists of the following parts, under the general title Information technology — Generic coding of
moving pictures and associated audio information:
— Part 1: Systems
— Part 2: Video
— Part 3: Audio
— Part 4: Conformance testing
— Part 5: Software simulation [Technical Report]
— Part 6: Extensions for DSM-CC
— Part 7: Advanced Audio Coding (AAC)
— Part 9: Extension for real time interface for systems decoders
— Part 10: Conformance extensions for Digital Storage Media Command and Control (DSM-CC)
— Part 11: IPMP on MPEG-2 systems
Introduction
The systems part of this Recommendation | International Standard addresses the combining of one or more elementary
streams of video and audio, as well as other data, into single or multiple streams which are suitable for storage or
transmission. Systems coding follows the syntactical and semantic rules imposed by this Specification and provides
information to enable synchronized decoding without overflow or underflow of decoder buffers over a wide range of retrieval or receipt conditions.
System coding shall be specified in two forms: the transport stream and the program stream. Each is optimized for a
different set of applications. Both the transport stream and program stream defined in this Recommendation |
International Standard provide coding syntax which is necessary and sufficient to synchronize the decoding and
presentation of the video and audio information, while ensuring that data buffers in the decoders do not overflow or
underflow. Information is coded in the syntax using time stamps concerning the decoding and presentation of coded
audio and visual data and time stamps concerning the delivery of the data stream itself. Both stream definitions are
packet-oriented multiplexes.
The basic multiplexing approach for single video and audio elementary streams is illustrated in Figure Intro. 1. The video
and audio data is encoded as described in Rec. ITU-T H.262 | ISO/IEC 13818-2 and ISO/IEC 13818-3. The resulting
compressed elementary streams are packetized to produce PES packets. Information needed to use PES packets
independently of either transport streams or program streams may be added when PES packets are formed. This
information is not needed and need not be added when PES packets are further combined with system level information
to form transport streams or program streams. This systems standard covers those processes to the right of the vertical
dashed line.
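[Figure Intro. 1: block diagram. Video data passes through a video encoder and a packetizer to form the video PES; audio data passes through an audio encoder and a packetizer to form the audio PES. Both PES feed a PS mux, which produces the program stream, and a TS mux, which produces the transport stream. The extent of the systems specification is marked.]
Figure Intro. 1 – Simplified overview of the scope of this Recommendation | International Standard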
The program stream is analogous and similar to the ISO/IEC 11172 systems layer. It results from combining one or more
streams of PES packets, which have a common time base, into a single stream.
For applications that require the elementary streams that comprise a single program to be in separate streams that are not
multiplexed, the elementary streams can also be encoded as separate program streams, one per elementary stream, with a
common time base. In this case the values encoded in the SCR fields of the various streams shall be consistent.
As with a single program stream, all of the elementary streams can then be decoded with synchronization.
The program stream is designed for use in relatively error-free environments and is suitable for applications which may
involve software processing of system information such as interactive multi-media applications. Program stream packets
may be of variable and relatively great length.
The transport stream combines one or more programs with one or more independent time bases into a single stream. PES
packets made up of elementary streams that form a program share a common time base. The transport stream is designed
for use in environments where errors are likely, such as storage or transmission in lossy or noisy media. Transport stream
packets are 188 bytes in length.
Program and transport streams are designed for different applications and their definitions do not strictly follow a layered
model. It is possible and reasonable to convert from one to the other; however, one is not a subset or superset of the
other. In particular, extracting the contents of a program from a transport stream and creating a valid program stream is
possible and is accomplished through the common interchange format of PES packets, but not all of the fields needed in a
program stream are contained within the transport stream; some must be derived. The transport stream may be used to
span a range of layers in a layered model, and is designed for efficiency and ease of implementation in high bandwidth
applications.
The scope of syntactical and semantic rules set forth in the systems specification differs: the syntactical rules apply to
systems layer coding only, and do not extend to the compression layer coding of the video and audio specifications; by
contrast, the semantic rules apply to the combined stream in its entirety.
The systems specification does not specify the architecture or implementation of encoders or decoders, nor those of
multiplexors or demultiplexors. However, bit stream properties do impose functional and performance requirements on
encoders, decoders, multiplexors and demultiplexors. For instance, encoders must meet minimum clock tolerance
requirements. Notwithstanding this and other requirements, a considerable degree of freedom exists in the design and
implementation of encoders, decoders, multiplexors, and demultiplexors.
Intro. 1 Transport stream
The transport stream is a stream definition which is tailored for communicating or storing one or more programs of coded
data according to Rec. ITU-T H.262 | ISO/IEC 13818-2 and ISO/IEC 13818-3 and other data in environments in which
significant errors may occur. Such errors may be manifested as bit value errors or loss of packets.
Transport streams may be either fixed or variable rate. In either case the constituent elementary streams may either be
fixed or variable rate. The syntax and semantic constraints on the stream are identical in each of these cases. The
transport stream rate is defined by the values and locations of program clock reference (PCR) fields; in general there are
separate PCR fields for each program.
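A minimal sketch of how a rate follows from the values and locations of two PCR fields of the same program is given below. It assumes a constant rate between the two fields, a 27 MHz system clock, and a PCR that wraps at 2^33 × 300 ticks; the function name and the byte-index convention are illustrative only and are not taken from this text.

```python
SYSTEM_CLOCK_HZ = 27_000_000
PCR_MODULUS = (1 << 33) * 300   # assumed wrap point: 33-bit base times 300


def transport_rate_bps(byte_index_a: int, pcr_a: int,
                       byte_index_b: int, pcr_b: int) -> float:
    """Estimate the bit rate between two PCR-carrying bytes of one program.

    byte_index_* are positions of the PCR fields in the stream, in bytes.
    pcr_* are the PCR values at those positions, in 27 MHz ticks.
    """
    elapsed_ticks = (pcr_b - pcr_a) % PCR_MODULUS   # tolerates one wrap-around
    if elapsed_ticks == 0:
        raise ValueError("PCR values must differ")
    bytes_between = byte_index_b - byte_index_a
    return bytes_between * 8 * SYSTEM_CLOCK_HZ / elapsed_ticks
```

For example, two PCRs that are 1000 packets (188 000 bytes) and 2 700 000 ticks (0.1 s) apart imply a rate of 15.04 Mbit/s.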
There are some difficulties with constructing and delivering a transport stream containing multiple programs with
independent time bases such that the overall bit rate is variable. Refer to 2.4.2.2.
The transport stream may be constructed by any method that results in a valid stream. It is possible to construct transport
streams containing one or more programs from elementary coded data streams, from program streams, or from other
transport streams which may themselves contain one or more programs.
The transport stream is designed in such a way that several operations on a transport stream are possible with minimum
effort. Among these are:
1) Retrieve the coded data from one program within the transport stream, decode it and present the decoded
results as shown in Figure Intro. 2.
2) Extract the transport stream packets from one program within the transport stream and produce as output a
different transport stream with only that one program as shown in Figure Intro. 3.
3) Extract the transport stream packets of one or more programs from one or more transport streams and
produce as output a different transport stream (not illustrated).
4) Extract the contents of one program from the transport stream and produce as output a program stream
containing that one program as shown in Figure Intro. 4.
5) Take a program stream, convert it into a transport stream to carry it over a lossy environment, and then
recover a valid, and in certain cases, identical program stream.
Figure Intro. 2 and Figure Intro. 3 illustrate prototypical demultiplexing and decoding systems which take as input a
transport stream. Figure Intro. 2 illustrates the first case, where a transport stream is directly demultiplexed and decoded.
Transport streams are constructed in two layers:
– a system layer; and
– a compression layer.
The input stream to the transport stream decoder has a system layer wrapped about a compression layer. Input streams to
the video and audio decoders have only the compression layer.
Operations performed by the prototypical decoder which accepts transport streams either apply to the entire transport
stream ("multiplex-wide operations"), or to individual elementary streams ("stream-specific operations"). The transport
stream system layer is divided into two sub-layers, one for multiplex-wide operations (the transport stream packet layer),
and one for stream-specific operations (the PES packet layer).
A prototypical decoder for transport streams, including audio and video, is also depicted in Figure Intro. 2 to illustrate the
function of a decoder. The architecture is not unique – some system decoder functions, such as decoder timing control,
might equally well be distributed among elementary stream decoders and the channel-specific decoder – but this figure is
useful for discussion. Likewise, indication of errors detected by the channel-specific decoder to the individual audio and
video decoders may be performed in various ways and such communication paths are not shown in the diagram. The
prototypical decoder design does not imply any normative requirement for the design of a transport stream decoder.
Indeed non-audio/video data is also allowed, but not shown.
[Figure Intro. 2: block diagram. A transport stream containing one or multiple programs arrives from the channel, passes through a channel-specific decoder and then a transport stream demultiplex and decoder with clock control, which feeds a video decoder (decoded video) and an audio decoder (decoded audio).]
Figure Intro. 2 – Prototypical transport demultiplexing and decoding example
Figure Intro. 3 illustrates the second case, where a transport stream containing multiple programs is converted into a
transport stream containing a single program. In this case the re-multiplexing operation may necessitate the correction of
program clock reference (PCR) values to account for changes in the PCR locations in the bit stream.
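The PCR correction mentioned above can be sketched as follows, assuming the re-multiplexer knows by how many bytes a PCR-carrying packet has been displaced at a known constant output rate. The function names and the displacement model are assumptions for illustration; practical re-multiplexers must also account for their own processing delay.

```python
SYSTEM_CLOCK_HZ = 27_000_000
PCR_MODULUS = (1 << 33) * 300   # assumed wrap point: 33-bit base times 300


def delay_ticks(extra_bytes: int, rate_bps: int) -> int:
    """27 MHz ticks taken to send extra_bytes at rate_bps (negative if earlier)."""
    return round(extra_bytes * 8 * SYSTEM_CLOCK_HZ / rate_bps)


def restamp_pcr(pcr: int, extra_delay_ticks: int) -> int:
    """Shift a PCR by the change in its delivery time, with wrap-around."""
    return (pcr + extra_delay_ticks) % PCR_MODULUS


def split_pcr(pcr: int) -> tuple:
    """Split a 27 MHz PCR value into (90 kHz base, 27 MHz extension)."""
    return divmod(pcr, 300)
```

A packet that leaves one packet time (188 bytes) later than its original position at 15.04 Mbit/s needs its PCR advanced by 2700 ticks, that is, 100 microseconds.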
[Figure Intro. 3: block diagram. A transport stream containing multiple programs arrives from the channel, passes through a channel-specific decoder and then a transport stream demultiplex and decoder, which outputs a transport stream with a single program.]
Figure Intro. 3 – Prototypical transport multiplexing example
Figure Intro. 4 illustrates a case in which a multi-program transport stream is first demultiplexed and then converted into
a program stream.
Figures Intro. 3 and Intro. 4 indicate that it is possible and reasonable to convert between different types and
configurations of transport streams. There are specific fields defined in the transport stream and program stream syntax
which facilitate the conversions illustrated. There is no requirement that specific implementations of demultiplexors or
decoders include all of these functions.
[Figure Intro. 4: block diagram. A transport stream containing multiple programs arrives from the channel, passes through a channel-specific decoder and then a transport stream demultiplex and program stream multiplexor, which outputs a program stream.]
Figure Intro. 4 – Prototypical transport stream to program stream conversion
Intro. 2 Program stream
The program stream is a stream definition which is tailored for communicating or storing one program of coded data and
other data in environments where errors are very unlikely, and where processing of system coding, e.g., by software, is a
major consideration.
Program streams may be either fixed or variable rate. In either case, the constituent elementary streams may be either
fixed or variable rate. The syntax and semantic constraints on the stream are identical in each case. The program stream
rate is defined by the values and locations of the system clock reference (SCR) and mux_rate fields.
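As a hedged illustration of how these two fields determine a delivery schedule, the sketch below computes the intended arrival time of a byte that follows an SCR. It assumes, from recollection of the normative program stream clauses rather than from this introduction, that the mux rate field is expressed in units of 50 bytes per second and that the SCR is counted in 27 MHz ticks; the function and parameter names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000
MUX_RATE_UNIT = 50   # assumed unit of the mux rate field, in bytes per second


def byte_arrival_time(scr_27mhz: int, bytes_after_scr: int,
                      program_mux_rate: int) -> float:
    """Intended arrival time in seconds of a byte following an SCR.

    scr_27mhz: SCR value of the most recent pack header, in 27 MHz ticks.
    bytes_after_scr: number of bytes between the SCR and the byte of interest.
    program_mux_rate: value of the mux rate field in that pack header.
    """
    if program_mux_rate <= 0:
        raise ValueError("program_mux_rate must be positive")
    bytes_per_second = program_mux_rate * MUX_RATE_UNIT
    return scr_27mhz / SYSTEM_CLOCK_HZ + bytes_after_scr / bytes_per_second
```

With an SCR of 27 000 000 ticks and a mux rate value of 1000 (50 000 bytes per second), the byte 50 000 positions later is scheduled to arrive at 2.0 s.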
A prototypical audio/video program stream decoder system is depicted in Figure Intro. 5. The architecture is not unique –
system decoder functions including decoder timing control might equally well be distributed among elementary stream
decoders and the channel-specific decoder – but this figure is useful for discussion. The prototypical decoder design does
not imply any normative requirement for the design of a program stream decoder. Indeed non-audio/video data is also
allowed, but not shown.
[Figure Intro. 5: block diagram. A program stream arrives from the channel, passes through a channel-specific decoder and then a program stream decoder with clock control, which feeds a video decoder (decoded video) and an audio decoder (decoded audio).]
Figure Intro. 5 – Prototypical decoder for program streams
The prototypical decoder for program streams shown in Figure Intro. 5 is composed of system, video and audio decoders
conforming to Parts 1, 2 and 3, respectively, of ISO/IEC 13818. In this decoder, the multiplexed coded representation of
one or more audio and/or video streams is assumed to be stored or communicated on some channel in some channel-
specific format. The channel-specific format is not governed by this Recommendation | International Standard, nor is the
channel-specific decoding part of the prototypical decoder.
The prototypical decoder accepts as input a program stream and relies on a program stream decoder to extract timing
information from the stream. The program stream decoder demultiplexes the stream, and the elementary streams so
produced serve as inputs to video and audio decoders, whose outputs are decoded video and audio signals. Included in
the design, but not shown in the figure, is the flow of timing information among the program stream decoder, the video
and audio decoders, and the channel-specific decoder. The video and audio decoders are synchronized with each other
and with the channel using this timing information.
Program streams are constructed in two layers: a system layer and a compression layer. The input stream to the program
stream decoder has a system layer wrapped about a compression layer. Input streams to the video and audio decoders
have only the compression layer.
Operations performed by the prototypical decoder either apply to the entire program stream ("multiplex-wide
operations"), or to individual elementary streams ("stream-specific operations"). The program stream system layer is
divided into two sub-layers, one for multiplex-wide operations (the pack layer), and one for stream-specific operations
(the PES packet layer).
Intro. 3 Conversion between transport stream and program stream
It may be possible and reasonable to convert between transport streams and program streams by means of PES packets.
This results from the specification of transport stream and program stream as embodied in 2.4.1 and 2.5.1 of the
normative requirements of this Recommendation | International Standard. PES packets may, with some constraints, be
mapped directly from the payload of one multiplexed bit stream into the payload of another multiplexed bit stream. It is
possible to identify the correct order of PES packets in a program to assist with this if the
program_packet_sequence_counter is present in all PES packets.
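A converter could use the counter to confirm that no PES packets were lost or reordered, as in the minimal sketch below. It assumes the counter is a 7-bit field that increments by one per PES packet and wraps to zero; that width is recalled from the PES packet syntax and should be checked against 2.4.3. The function name is hypothetical.

```python
COUNTER_MODULUS = 128   # assumed: 7-bit counter, values 0..127


def missing_packets(previous: int, current: int) -> int:
    """Number of PES packets apparently missing between two counter values.

    Returns 0 when current directly follows previous, including across a wrap.
    """
    return (current - previous - 1) % COUNTER_MODULUS
```

A gap of exactly 128 packets cannot be distinguished from no gap at all, so the check is only meaningful for short losses.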
Certain other information necessary for conversion, e.g., the relationship between elementary streams, is available in
tables and headers in both streams. Such data, if available, shall be correct in any stream before and after conversion.
Intro. 4 Packetized elementary stream
Transport streams and program streams are each logically constructed from PES packets, as indicated in the syntax
definitions in 2.4.3.6. PES packets shall be used to convert between transport streams and program streams; in some
cases the PES packets need not be modified when performing such conversions. PES packets may be much larger than
the size of a transport stream packet.
A continuous sequence of PES packets of one elementary stream with one stream ID may be used to construct a PES
stream. When PES packets are used to form a PES stream, they shall include elementary stream clock reference (ESCR)
fields and elementary stream rate (ES_Rate) fields, with constraints as defined in 2.4.3.8. The PES stream data shall be
contiguous bytes from the elementary stream in their original order. PES streams do not contain some necessary system
information which is contained in program streams and transport streams. Examples include the information in the pack
header, system header, program stream map, program stream directory, program map table, and elements of the transport
stream packet syntax.
The PES stream is a logical construct that may be useful within implementations of this Recommendation | International
Standard; however, it is not defined as a stream for interchange and interoperability. Applications requiring streams
containing only one elementary stream can use program streams or transport streams which each contain only one
elementary stream. These streams contain all of the necessary system information. Multiple program streams or transport
streams, each containing a single elementary stream, can be constructed with a common time base and therefore carry a
complete program, i.e., with audio and video.
Intro. 5 Timing model
Systems, video and audio all have a timing model in which the end-to-end delay from the signal input to an encoder to
the signal output from a decoder is a constant. This delay is the sum of encoding, encoder buffering, multiplexing,
communication or storage, demultiplexing, decoder buffering, decoding, and presentation delays. As part of this timing
model all video pictures and audio samples are presented exactly once, unless specifically coded to the contrary, and the
inter-picture interval and audio sample rate are the same at the decoder as at the encoder. The system stream coding
contains timing information which can be used to implement systems which embody constant end-to-end delay. It is
possible to implement decoders which do not follow this model exactly; however, in such cases it is the decoder's
responsibility to perform in an acceptable manner. The timing is embodied in the normative specifications of this
Recommendation | International Standard, which must be adhered to by all valid bit streams, regardless of the means of
creating them.
All timing is defined in terms of a common system clock, referred to as a system time clock (STC). In the program
stream this clock may have an exactly specified ratio to the video or audio sample clocks, or it may have an operating
frequency which differs slightly from the exact ratio while still providing precise end-to-end timing and clock recovery.
In the transport stream the system clock frequency is constrained to have the exactly specified ratio to the audio and
video sample clocks at all times; the effect of this constraint is to simplify sample rate recovery in decoders.
Intro. 6 Conditional access
Encryption and scrambling for conditional access to programs encoded in the program and transport streams are supported
by the system data stream definitions. Conditional access mechanisms are not specified here. The stream definitions are
designed so that implementation of practical conditional access systems is reasonable, and there are some syntactical
elements specified which provide specific support for such systems.
Intro. 7 Multiplex-wide operations
Multiplex-wide operations include the coordination of data retrieval from the channel, the adjustment of clocks, and the
management of buffers. These tasks are intimately related. If the rate of data delivery from the channel is controllable, then
data delivery may be adjusted so that decoder buffers neither overflow nor underflow; but if the data rate is not
controllable, then elementary stream decoders must slave their timing to the data received from the channel to avoid
overflow or underflow.
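The buffer constraint described above can be illustrated with a minimal sketch. It is not the normative buffer model of this Recommendation | International Standard; the function name `check_buffer`, the event representation and the single-buffer simplification are assumptions made purely for illustration.

```python
def check_buffer(events, capacity):
    """Replay (time, delta_bytes) events against one decoder buffer.

    Hypothetical helper, not part of the standard. Positive deltas model
    bytes arriving from the channel; negative deltas model bytes removed
    by the elementary stream decoder. Returns None if occupancy stays in
    [0, capacity], otherwise (time, "overflow") or (time, "underflow")
    for the first violation.
    """
    occupancy = 0
    # Process in time order; at equal times apply arrivals before removals.
    for time, delta in sorted(events, key=lambda e: (e[0], -e[1])):
        occupancy += delta
        if occupancy > capacity:
            return (time, "overflow")
        if occupancy < 0:
            return (time, "underflow")
    return None


# Example schedule: two arrivals, one removal, one further arrival.
schedule = [(0.0, 1000), (0.1, 1000), (0.2, -1500), (0.3, 1000)]
result_large = check_buffer(schedule, capacity=2000)
result_small = check_buffer(schedule, capacity=1500)
```

With a controllable channel, a violation reported by such a check would be resolved by adjusting the delivery rate; with an uncontrollable channel, the decoder timing has to follow the received data instead.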
Program streams are composed of packs whose headers facilitate the above tasks. Pack headers specify intended times at
which each byte is to enter the program stream decoder from the channel, and this target arrival schedule serves as a
reference for clock correction and buffer management. The schedule need not be followed exactly by decoders, but they
must compensate for deviations from it.
Similarly, transport streams are composed of transport stream packets with headers containing information which
specifies the times at which each byte is intended to enter a transport stream decoder from the channel. This schedule
provides exactly the same function as that which is specified in the program stream.
An additional multiplex-wide operation is a decoder's ability to establish what resources are required to decode a
transport stream or program stream. The first pack of each program stream conveys parameters to assist decoders in this
task. Included, for example, are the stream's maximum data rate and the highest number of simultaneous video channels.
The transport stream likewise contains globally useful information.
The transport stream and program stream each contain information which identifies the pertinent characteristics of, and
relationships between, the elementary streams which constitute each program. Such information may include the
language spoken in audio channels, as well as the relationship between video streams when multi-layer video coding is
implemented.
© ISO/IEC 2013 – All rights reserved
Intro. 8 Individual stream operations (PES packet layer)
The principal stream-specific operations are:
1) demultiplexing; and
2) synchronizing playback of multiple elementary streams.
Intro. 8.1 Demultiplexing
On encoding, program streams are formed by multiplexing elementary streams, and transport streams are formed by
multiplexing elementary streams, program streams, or the contents of other transport streams. Elementary streams may
include private, reserved, and padding streams in addition to audio and video streams. The streams are temporally
subdivided into packets, and the packets are serialized. A PES packet contains coded bytes from one and only one
elementary stream.
In the program stream both fixed and variable packet lengths are allowed subject to constraints as specified in 2.5.1
and 2.5.2. For transport streams the packet length is 188 bytes. Both fixed and variable PES packet lengths are allowed,
and PES packets will be relatively long in most applications.
On decoding, demultiplexing is required to reconstitute elementary streams from the multiplexed program stream or
transport stream. Stream_id codes in program stream packet headers, and packet ID codes in the transport stream make
this possible.
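As an illustration of the first demultiplexing step for transport streams, the following minimal sketch groups 188-byte transport stream packets by their packet ID. The sync byte value 0x47 and the position of the 13-bit packet ID within the packet header are taken from the normative transport packet syntax rather than from this introduction. The function names `group_packets_by_pid` and `make_packet` are hypothetical, and the sketch does not reassemble PES packets or handle adaptation fields.

```python
from collections import defaultdict

TS_PACKET_SIZE = 188  # transport stream packet length in bytes
SYNC_BYTE = 0x47      # first byte of every transport stream packet


def group_packets_by_pid(data: bytes) -> dict:
    """Split a packet-aligned buffer into lists of packets keyed by PID.

    Hypothetical helper: assumes `data` starts on a packet boundary and
    ignores any trailing partial packet.
    """
    packets = defaultdict(list)
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {offset}")
        # The PID is the low 5 bits of byte 1 followed by all 8 bits of byte 2.
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        packets[pid].append(packet)
    return dict(packets)


def make_packet(pid: int) -> bytes:
    """Build a zero-filled packet for testing (illustrative only)."""
    # 0x10: payload only, no adaptation field, continuity counter 0.
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


stream = make_packet(0x100) + make_packet(0x101) + make_packet(0x100)
groups = group_packets_by_pid(stream)
```

A complete demultiplexer would then use the program-specific information to decide which packet IDs belong to which program before reconstituting each elementary stream.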
Intro. 8.2 Synchronization
Synchronization among multiple elementary streams is accomplished with presentation time stamps (PTSs) in the
program streams and transport streams. Time stamps are generally expressed in units of a 90 kHz clock, but the system clock reference
(SCR), the program clock reference (PCR) and the optional elementary stream clock reference (ESCR) have extensions
with a resolution of 27 MHz. Decoding of N elementary streams is synchronized by adjusting the decoding of streams to
a common master time base rather than by adjusting the decoding of one stream to match that of another. The master time
base may be one of the N decoders' clocks, the data source's clock, or it may be some external clock.
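The relationship between the 90 kHz time stamp units and the 27 MHz extensions can be sketched as follows. The ratio of 300 follows directly from the two frequencies. The combination of a base and an extension as base × 300 + extension reflects the normative clock reference definitions rather than this introduction, and the function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000   # resolution of the clock reference extensions
TIMESTAMP_CLOCK_HZ = 90_000    # unit of PTS values and clock reference bases
TICKS_PER_BASE_UNIT = SYSTEM_CLOCK_HZ // TIMESTAMP_CLOCK_HZ  # 300


def clock_reference_to_27mhz(base: int, extension: int) -> int:
    """Combine a 90 kHz base and a 27 MHz extension into 27 MHz ticks."""
    if not 0 <= extension < TICKS_PER_BASE_UNIT:
        raise ValueError("extension must be in the range 0..299")
    return base * TICKS_PER_BASE_UNIT + extension


def pts_to_seconds(pts: int) -> float:
    """Convert a presentation time stamp in 90 kHz units to seconds."""
    return pts / TIMESTAMP_CLOCK_HZ


one_second = clock_reference_to_27mhz(base=90_000, extension=0)
two_seconds = pts_to_seconds(180_000)
```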
Each program in a transport stream, which may contain multiple programs, may have its own time base. The time bases
of different programs within a transport stream may be different.
Because PTSs apply to the decoding of individual elementary streams, they reside in the PES packet layer of both the
transport streams and program streams. End-to-end synchronization occurs when encoders save time stamps at capture
time, when the time stamps propagate with associated coded data to decoders, and when decoders use those time stamps
to schedule presentations.
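A decoder that schedules presentations from time stamps has to compare each PTS with its recovered system time clock. The following minimal sketch shows one way to do so, assuming the clock has been reduced to 90 kHz units and that PTS values are 33-bit counters that wrap around (the field width comes from the normative PES packet syntax, not from this introduction). The function name `ticks_until_presentation` is hypothetical.

```python
PTS_MODULUS = 1 << 33  # PTS values are 33-bit counters and wrap around


def ticks_until_presentation(pts: int, stc_90khz: int) -> int:
    """Return the signed distance from the clock to the PTS in 90 kHz ticks.

    A positive result means the presentation unit is due in the future;
    a negative result means it is already late. The shorter way around
    the 33-bit circle is chosen so that wrap-around is handled.
    """
    diff = (pts - stc_90khz) % PTS_MODULUS
    if diff >= PTS_MODULUS // 2:
        diff -= PTS_MODULUS
    return diff


early = ticks_until_presentation(pts=9_000, stc_90khz=0)
late = ticks_until_presentation(pts=0, stc_90khz=9_000)
across_wrap = ticks_until_presentation(pts=100, stc_90khz=PTS_MODULUS - 100)
```

This treats any distance of less than half the counter range as unambiguous, which is a design assumption of the sketch rather than a requirement stated in the text above.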
Synchronization of a decoding system with a channel is achieved through the use of the SCR in the program stream and
by its analogue, the PCR, in the transport stream. The SCR and PCR are time stamps encoding the timing of the bit
stream itself, and are derived from the same time base used for the audio and video PTS values from the same program.
Since each program may have its own time base, there are separate PCR fields for each program in a transport stream
containing multiple programs. In some cases it may be possible for programs to share PCR fields. Refer to 2.4.4,
program-specific information (PSI), for the method of identifying which PCR is associated with a program. A program
shall have one and only one PCR time base associated with it.
Intro. 8.3 Relation to compression layer
The PES packet layer is independent of the compression layer in some senses, but not in all. It is independent in the sense
that PES packet payloads need not start at compression layer start codes, as defined in Parts 2 and 3 of ISO/IEC 13818.
For example, video start codes may occur anywhere within the payload of a PES packet, and start codes may be split by a
PES packet header. However, time stamps encoded in PES packet headers apply to presentation times of compression
layer constructs (namely, presentation units). In addition, when the elementary stream data conforms to
Rec. ITU-T H.262 | ISO/IEC 13818-2 or ISO/IEC 13818-3, the PES_packet_data_bytes shall be byte aligned to the bytes of this
Recommendation | International Standard.
Intro. 9 System reference decoder
Part 1 of ISO/IEC 13818 employs a "system target decoder" (STD), one for transport streams (refer to 2.4.2) referred to
as "transport system target decoder" (T-STD) and one for program streams (refer to 2.5.2) referred to as "program system
target decoder" (P-STD), to provide a formalism for timing and buffering relationships. Because the STD is
parameterized in terms of Rec. ITU-T H
...









