Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding - Amendment 2: MVC extensions for inclusion of depth maps

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé — Amendement 2: Extensions du codage vidéo multivues pour l'inclusion de cartes de profondeur

General Information

Status: Withdrawn
Publication Date: 18-Sep-2013
Withdrawal Date: 18-Sep-2013
Current Stage: 9599 - Withdrawal of International Standard
Start Date: 27-Aug-2014
Completion Date: 30-Oct-2025
Ref Project

Relations

Standard
ISO/IEC 14496-10:2012/Amd 2:2013 - MVC extensions for inclusion of depth maps
English language
84 pages

Frequently Asked Questions

ISO/IEC 14496-10:2012/Amd 2:2013 is an amendment to an International Standard published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its full title is "Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding - Amendment 2: MVC extensions for inclusion of depth maps".

ISO/IEC 14496-10:2012/Amd 2:2013 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 14496-10:2012/Amd 2:2013 is linked to the following standards: ISO/IEC 14496-10:2012, which it amends, and ISO/IEC 14496-10:2014. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 14496-10
Seventh edition 2012-05-01
AMENDMENT 2 2013-09-15
Information technology — Coding of
audio-visual objects —
Part 10:
Advanced Video Coding
AMENDMENT 2: MVC extensions for
inclusion of depth maps
Technologies de l'information — Codage des objets audiovisuels —
Partie 10: Codage visuel avancé
AMENDEMENT 2: Extensions du codage vidéo multivues pour
l'inclusion de cartes de profondeur

Reference number
ISO/IEC 14496-10:2012/Amd.2:2013(E)

©  ISO 2013
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission.
Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO/IEC 2013 – All rights reserved

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 2 to ISO/IEC 14496-10:2012 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.

Information technology — Coding of audio-visual objects —
Part 10:
Advanced Video Coding
AMENDMENT 2: MVC extensions for inclusion of depth maps
In 0.6, add the following paragraph after the paragraph that starts with "Multiview video coding":
An extension of multiview video coding that additionally supports the inclusion of depth maps is specified in Annex I,
allowing the construction of bitstreams that represent multiple views with corresponding depth views. In a similar
manner as with the multiview video coding specified in Annex H, bitstreams encoded as specified in Annex I may also
contain sub-bitstreams that conform to this Specification.

In 0.7, add the following paragraph after the paragraph that starts with "Annex H specifies":
Annex I specifies MVC extensions for inclusion of depth maps, referred to as multiview video coding with depth
(MVCD). The reader is referred to Annex I for the entire decoding process for MVCD, which is specified there with
references being made to clauses 2-9 and Annexes A-E and Annex H. Subclause I.10 specifies one profile for MVCD
(Multiview and Depth).
In Clause 2, add the following additional normative reference:
– ISO 12232:2006, Photography – Digital still cameras – Determination of exposure index, ISO speed
ratings, standard output sensitivity, and recommended exposure index.
In Clause 4, add the following additional abbreviation:
MVCD Multiview Video Coding with Depth
In 7.3.1, replace the syntax table with:

nal_unit( NumBytesInNALunit ) {                                         C    Descriptor
  forbidden_zero_bit                                                    All  f(1)
  nal_ref_idc                                                           All  u(2)
  nal_unit_type                                                         All  u(5)
  NumBytesInRBSP = 0
  nalUnitHeaderBytes = 1
  if( nal_unit_type = = 14 | | nal_unit_type = = 20 | |
      nal_unit_type = = 21 ) {
    svc_extension_flag                                                  All  u(1)
    if( svc_extension_flag )
      nal_unit_header_svc_extension( ) /* specified in Annex G */       All
    else
      nal_unit_header_mvc_extension( ) /* specified in Annex H */       All
    nalUnitHeaderBytes += 3
  }
  for( i = nalUnitHeaderBytes; i < NumBytesInNALunit; i++ ) {
    if( i + 2 < NumBytesInNALunit && next_bits( 24 ) = = 0x000003 ) {
      rbsp_byte[ NumBytesInRBSP++ ]                                     All  b(8)
      rbsp_byte[ NumBytesInRBSP++ ]                                     All  b(8)
      i += 2
      emulation_prevention_three_byte /* equal to 0x03 */               All  f(8)
    } else
      rbsp_byte[ NumBytesInRBSP++ ]                                     All  b(8)
  }
}
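
As an informal illustration of the nal_unit( ) table above (not part of the amendment text), the following Python sketch reads the one-byte header, accounts for the three extra header bytes used by nal_unit_type 14, 20 and 21, and drops each emulation_prevention_three_byte while collecting the RBSP. The function name parse_nal_unit and the dictionary layout are assumptions made for this example; the extension headers of Annex G and Annex H are skipped rather than parsed.

```python
def parse_nal_unit(nal: bytes) -> dict:
    """Split one NAL unit into header fields and RBSP bytes (illustrative sketch)."""
    if not nal:
        raise ValueError("empty NAL unit")
    first = nal[0]
    fields = {
        "forbidden_zero_bit": first >> 7,   # f(1)
        "nal_ref_idc": (first >> 5) & 0x3,  # u(2)
        "nal_unit_type": first & 0x1F,      # u(5)
    }
    header_bytes = 1
    if fields["nal_unit_type"] in (14, 20, 21):
        if len(nal) < 4:
            raise ValueError("truncated NAL unit header extension")
        # svc_extension_flag is the first bit after the one-byte header.
        fields["svc_extension_flag"] = nal[1] >> 7
        header_bytes += 3  # nalUnitHeaderBytes += 3
    rbsp = bytearray()
    i = header_bytes
    while i < len(nal):
        # On 0x000003: keep the two zero bytes, drop the emulation prevention byte.
        if i + 2 < len(nal) and nal[i:i + 3] == b"\x00\x00\x03":
            rbsp += nal[i:i + 2]
            i += 3
        else:
            rbsp.append(nal[i])
            i += 1
    fields["rbsp"] = bytes(rbsp)
    return fields
```

A caller would pass one NAL unit at a time, with any byte stream start code prefix already removed.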
In 7.3.2.1.1, replace the syntax table with:
seq_parameter_set_data( ) {                                                     C  Descriptor
  profile_idc                                                                   0  u(8)
  constraint_set0_flag                                                          0  u(1)
  constraint_set1_flag                                                          0  u(1)
  constraint_set2_flag                                                          0  u(1)
  constraint_set3_flag                                                          0  u(1)
  constraint_set4_flag                                                          0  u(1)
  constraint_set5_flag                                                          0  u(1)
  reserved_zero_2bits /* equal to 0 */                                          0  u(2)
  level_idc                                                                     0  u(8)
  seq_parameter_set_id                                                          0  ue(v)
  if( profile_idc = = 100 | | profile_idc = = 110 | |
      profile_idc = = 122 | | profile_idc = = 244 | | profile_idc = = 44 | |
      profile_idc = = 83 | | profile_idc = = 86 | | profile_idc = = 118 | |
      profile_idc = = 128 | | profile_idc = = 138 ) {
    chroma_format_idc                                                           0  ue(v)
    if( chroma_format_idc = = 3 )
      separate_colour_plane_flag                                                0  u(1)
    bit_depth_luma_minus8                                                       0  ue(v)
    bit_depth_chroma_minus8                                                     0  ue(v)
    qpprime_y_zero_transform_bypass_flag                                        0  u(1)
    seq_scaling_matrix_present_flag                                             0  u(1)
    if( seq_scaling_matrix_present_flag )
      for( i = 0; i < ( ( chroma_format_idc != 3 ) ? 8 : 12 ); i++ ) {
        seq_scaling_list_present_flag[ i ]                                      0  u(1)
        if( seq_scaling_list_present_flag[ i ] )
          if( i < 6 )
            scaling_list( ScalingList4x4[ i ], 16,                              0
                          UseDefaultScalingMatrix4x4Flag[ i ] )
          else
            scaling_list( ScalingList8x8[ i − 6 ], 64,                          0
                          UseDefaultScalingMatrix8x8Flag[ i − 6 ] )
      }
  }
  log2_max_frame_num_minus4                                                     0  ue(v)
  pic_order_cnt_type                                                            0  ue(v)
  if( pic_order_cnt_type = = 0 )
    log2_max_pic_order_cnt_lsb_minus4                                           0  ue(v)
  else if( pic_order_cnt_type = = 1 ) {
    delta_pic_order_always_zero_flag                                            0  u(1)
    offset_for_non_ref_pic                                                      0  se(v)
    offset_for_top_to_bottom_field                                              0  se(v)
    num_ref_frames_in_pic_order_cnt_cycle                                       0  ue(v)
    for( i = 0; i < num_ref_frames_in_pic_order_cnt_cycle; i++ )
      offset_for_ref_frame[ i ]                                                 0  se(v)
  }
  max_num_ref_frames                                                            0  ue(v)
  gaps_in_frame_num_value_allowed_flag                                          0  u(1)
  pic_width_in_mbs_minus1                                                       0  ue(v)
  pic_height_in_map_units_minus1                                                0  ue(v)
  frame_mbs_only_flag                                                           0  u(1)
  if( !frame_mbs_only_flag )
    mb_adaptive_frame_field_flag                                                0  u(1)
  direct_8x8_inference_flag                                                     0  u(1)
  frame_cropping_flag                                                           0  u(1)
  if( frame_cropping_flag ) {
    frame_crop_left_offset                                                      0  ue(v)
    frame_crop_right_offset                                                     0  ue(v)
    frame_crop_top_offset                                                       0  ue(v)
    frame_crop_bottom_offset                                                    0  ue(v)
  }
  vui_parameters_present_flag                                                   0  u(1)
  if( vui_parameters_present_flag )
    vui_parameters( )                                                           0
}
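
The descriptors u(n), ue(v) and se(v) in the table above can be illustrated with a small bit reader. The Python sketch below is informal and not part of the amendment text: BitReader and parse_sps_start are names invented for this example, the input is assumed to be RBSP data with emulation prevention bytes already removed, and only the leading fields up to chroma_format_idc are read. The ue(v) and se(v) methods follow the usual Exp-Golomb mapping of this coding family.

```python
class BitReader:
    """Minimal MSB-first bit reader over RBSP bytes (hypothetical helper)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n: int) -> int:
        """Unsigned integer using n bits."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self) -> int:
        """Unsigned Exp-Golomb code."""
        leading_zero_bits = 0
        while self.u(1) == 0:
            leading_zero_bits += 1
        return (1 << leading_zero_bits) - 1 + self.u(leading_zero_bits)

    def se(self) -> int:
        """Signed Exp-Golomb code: odd code numbers map to positive values."""
        code_num = self.ue()
        magnitude = (code_num + 1) >> 1
        return magnitude if code_num & 1 else -magnitude


# profile_idc values for which chroma_format_idc is present in the table above.
PROFILES_WITH_CHROMA_INFO = {100, 110, 122, 244, 44, 83, 86, 118, 128, 138}


def parse_sps_start(rbsp: bytes) -> dict:
    """Read the first fields of seq_parameter_set_data( ), nothing more."""
    r = BitReader(rbsp)
    sps = {"profile_idc": r.u(8)}
    sps["constraint_set_flags"] = [r.u(1) for _ in range(6)]
    r.u(2)  # reserved_zero_2bits
    sps["level_idc"] = r.u(8)
    sps["seq_parameter_set_id"] = r.ue()
    if sps["profile_idc"] in PROFILES_WITH_CHROMA_INFO:
        sps["chroma_format_idc"] = r.ue()
    return sps
```

The sketch stops before the bit depth and scaling list fields; a complete parser would continue field by field in table order.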
In 7.3.2.1.3, replace the syntax table with:
subset_seq_parameter_set_rbsp( ) {                                              C  Descriptor
  seq_parameter_set_data( )                                                     0
  if( profile_idc = = 83 | | profile_idc = = 86 ) {
    seq_parameter_set_svc_extension( ) /* specified in Annex G */               0
    svc_vui_parameters_present_flag                                             0  u(1)
    if( svc_vui_parameters_present_flag = = 1 )
      svc_vui_parameters_extension( ) /* specified in Annex G */                0
  } else if( profile_idc = = 118 | | profile_idc = = 128 ) {
    bit_equal_to_one /* equal to 1 */                                           0  f(1)
    seq_parameter_set_mvc_extension( ) /* specified in Annex H */               0
    mvc_vui_parameters_present_flag                                             0  u(1)
    if( mvc_vui_parameters_present_flag = = 1 )
      mvc_vui_parameters_extension( ) /* specified in Annex H */                0
  } else if( profile_idc = = 138 ) {
    bit_equal_to_one /* equal to 1 */                                           0  f(1)
    seq_parameter_set_mvcd_extension( ) /* specified in Annex I */              0
  }
  additional_extension2_flag                                                    0  u(1)
  if( additional_extension2_flag = = 1 )
    while( more_rbsp_data( ) )
      additional_extension2_data_flag                                           0  u(1)
  rbsp_trailing_bits( )                                                         0
}
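
The profile-dependent branch in the table above amounts to a lookup on profile_idc. The following Python sketch is informal and lists, in table order, the syntax elements that follow seq_parameter_set_data( ) unconditionally within each branch; the function name is an assumption made for this example.

```python
def subset_sps_branch(profile_idc: int) -> tuple:
    """Elements that follow seq_parameter_set_data( ) for a given profile_idc."""
    if profile_idc in (83, 86):
        # SVC profiles, Annex G
        return ("seq_parameter_set_svc_extension",
                "svc_vui_parameters_present_flag")
    if profile_idc in (118, 128):
        # MVC profiles, Annex H
        return ("bit_equal_to_one",
                "seq_parameter_set_mvc_extension",
                "mvc_vui_parameters_present_flag")
    if profile_idc == 138:
        # MVCD profile, Annex I
        return ("bit_equal_to_one", "seq_parameter_set_mvcd_extension")
    return ()
```

In every case additional_extension2_flag and rbsp_trailing_bits( ) follow, as the table shows.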
In 7.3.3, replace the syntax table with:
slice_header( ) {                                                               C  Descriptor
  first_mb_in_slice                                                             2  ue(v)
  slice_type                                                                    2  ue(v)
  pic_parameter_set_id                                                          2  ue(v)
  if( separate_colour_plane_flag = = 1 )
    colour_plane_id                                                             2  u(2)
  frame_num                                                                     2  u(v)
  if( !frame_mbs_only_flag ) {
    field_pic_flag                                                              2  u(1)
    if( field_pic_flag )
      bottom_field_flag                                                         2  u(1)
  }
  if( IdrPicFlag )
    idr_pic_id                                                                  2  ue(v)
  if( pic_order_cnt_type = = 0 ) {
    pic_order_cnt_lsb                                                           2  u(v)
    if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
      delta_pic_order_cnt_bottom                                                2  se(v)
  }
  if( pic_order_cnt_type = = 1 && !delta_pic_order_always_zero_flag ) {
    delta_pic_order_cnt[ 0 ]                                                    2  se(v)
    if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
      delta_pic_order_cnt[ 1 ]                                                  2  se(v)
  }
  if( redundant_pic_cnt_present_flag )
    redundant_pic_cnt                                                           2  ue(v)
  if( slice_type = = B )
    direct_spatial_mv_pred_flag                                                 2  u(1)
  if( slice_type = = P | | slice_type = = SP | | slice_type = = B ) {
    num_ref_idx_active_override_flag                                            2  u(1)
    if( num_ref_idx_active_override_flag ) {
      num_ref_idx_l0_active_minus1                                              2  ue(v)
      if( slice_type = = B )
        num_ref_idx_l1_active_minus1                                            2  ue(v)
    }
  }
  if( nal_unit_type = = 20 | | nal_unit_type = = 21 )
    ref_pic_list_mvc_modification( ) /* specified in Annex H */                 2
  else
    ref_pic_list_modification( )                                                2
  if( ( weighted_pred_flag && ( slice_type = = P | | slice_type = = SP ) ) | |
      ( weighted_bipred_idc = = 1 && slice_type = = B ) )
    pred_weight_table( )                                                        2
  if( nal_ref_idc != 0 )
    dec_ref_pic_marking( )                                                      2
  if( entropy_coding_mode_flag && slice_type != I && slice_type != SI )
    cabac_init_idc                                                              2  ue(v)
  slice_qp_delta                                                                2  se(v)
  if( slice_type = = SP | | slice_type = = SI ) {
    if( slice_type = = SP )
      sp_for_switch_flag                                                        2  u(1)
    slice_qs_delta                                                              2  se(v)
  }
  if( deblocking_filter_control_present_flag ) {
    disable_deblocking_filter_idc                                               2  ue(v)
    if( disable_deblocking_filter_idc != 1 ) {
      slice_alpha_c0_offset_div2                                                2  se(v)
      slice_beta_offset_div2                                                    2  se(v)
    }
  }
  if( num_slice_groups_minus1 > 0 &&
      slice_group_map_type >= 3 && slice_group_map_type <= 5)
    slice_group_change_cycle                                                    2  u(v)
}
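
Two of the conditions in the slice_header( ) table above lend themselves to a short illustration: the choice of reference picture list modification syntax, which now also covers nal_unit_type 21, and the presence of pred_weight_table( ). The Python sketch below is informal; the function names and the use of strings for slice types are assumptions made for this example.

```python
def ref_pic_list_syntax(nal_unit_type: int) -> str:
    """Reference picture list modification structure used by a slice header."""
    if nal_unit_type in (20, 21):
        return "ref_pic_list_mvc_modification"  # specified in Annex H
    return "ref_pic_list_modification"


def has_pred_weight_table(slice_type: str, weighted_pred_flag: int,
                          weighted_bipred_idc: int) -> bool:
    """True when pred_weight_table( ) is present in the slice header."""
    explicit_p = bool(weighted_pred_flag) and slice_type in ("P", "SP")
    explicit_b = weighted_bipred_idc == 1 and slice_type == "B"
    return explicit_p or explicit_b
```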
Replace Table 7-1 with:
nal_unit_type  Content of NAL unit and RBSP syntax structure C        NAL unit type class
                                                                      Annex A  Annex G and H    Annex I
0              Unspecified                                            non-VCL  non-VCL          non-VCL
1              Coded slice of a non-IDR picture              2, 3, 4  VCL      VCL              VCL
               slice_layer_without_partitioning_rbsp( )
2              Coded slice data partition A                  2        VCL      not applicable   not applicable
               slice_data_partition_a_layer_rbsp( )
3              Coded slice data partition B                  3        VCL      not applicable   not applicable
               slice_data_partition_b_layer_rbsp( )
4              Coded slice data partition C                  4        VCL      not applicable   not applicable
               slice_data_partition_c_layer_rbsp( )
5              Coded slice of an IDR picture                 2, 3     VCL      VCL              VCL
               slice_layer_without_partitioning_rbsp( )
6              Supplemental enhancement information (SEI)    5        non-VCL  non-VCL          non-VCL
               sei_rbsp( )
7              Sequence parameter set                        0        non-VCL  non-VCL          non-VCL
               seq_parameter_set_rbsp( )
8              Picture parameter set                         1        non-VCL  non-VCL          non-VCL
               pic_parameter_set_rbsp( )
9              Access unit delimiter                         6        non-VCL  non-VCL          non-VCL
               access_unit_delimiter_rbsp( )
10             End of sequence                               7        non-VCL  non-VCL          non-VCL
               end_of_seq_rbsp( )
11             End of stream                                 8        non-VCL  non-VCL          non-VCL
               end_of_stream_rbsp( )
12             Filler data                                   9        non-VCL  non-VCL          non-VCL
               filler_data_rbsp( )
13             Sequence parameter set extension              10       non-VCL  non-VCL          non-VCL
               seq_parameter_set_extension_rbsp( )
14             Prefix NAL unit                               2        non-VCL  suffix dependent suffix dependent
               prefix_nal_unit_rbsp( )
15             Subset sequence parameter set                 0        non-VCL  non-VCL          non-VCL
               subset_seq_parameter_set_rbsp( )
16..18         Reserved                                               non-VCL  non-VCL          non-VCL
19             Coded slice of an auxiliary coded picture     2, 3, 4  non-VCL  non-VCL          non-VCL
               without partitioning
               slice_layer_without_partitioning_rbsp( )
20             Coded slice extension                         2, 3, 4  non-VCL  VCL              VCL
               slice_layer_extension_rbsp( )
21             Coded slice extension for depth view          2, 3, 4  non-VCL  non-VCL          VCL
               components /* specified in Annex I */
               slice_layer_extension_rbsp( )
               /* specified in Annex I */
22..23         Reserved                                               non-VCL  non-VCL          VCL
24..31         Unspecified                                            non-VCL  non-VCL          non-VCL
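
The NAL unit type class columns of Table 7-1 can be expressed as a small lookup. The Python sketch below is informal and transcribes the table as printed above; the function name and the labels "A", "GH" and "I" for the three class columns are assumptions made for this example.

```python
def nal_unit_type_class(nal_unit_type: int, annex: str) -> str:
    """NAL unit type class per Table 7-1; annex is 'A', 'GH' or 'I'."""
    if not 0 <= nal_unit_type <= 31:
        raise ValueError("nal_unit_type is a 5-bit value")
    if annex not in ("A", "GH", "I"):
        raise ValueError("annex must be 'A', 'GH' or 'I'")
    if nal_unit_type in (1, 5):
        return "VCL"
    if nal_unit_type in (2, 3, 4):
        # Data partitioning applies only to the Annex A column.
        return "VCL" if annex == "A" else "not applicable"
    if nal_unit_type == 14:
        return "non-VCL" if annex == "A" else "suffix dependent"
    if nal_unit_type == 20:
        return "non-VCL" if annex == "A" else "VCL"
    if nal_unit_type in (21, 22, 23):
        # Depth view slices and the reserved values 22..23 are VCL only for Annex I.
        return "VCL" if annex == "I" else "non-VCL"
    return "non-VCL"
```

For example, a decoder applying the Annex H process treats nal_unit_type 21 as non-VCL, whereas the Annex I process treats it as VCL.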

In 7.4.1, make the following changes:
Replace the following:
svc_extension_flag indicates whether a nal_unit_header_svc_extension( ) or nal_unit_header_mvc_extension( ) will
follow next in the syntax structure.
with:
svc_extension_flag indicates whether a nal_unit_header_svc_extension( ) or nal_unit_header_mvc_extension( ) will
follow next in the syntax structure. When nal_unit_type is equal to 21, svc_extension_flag shall be equal to 0 and the
semantics of svc_extension_flag equal to 1 are reserved for future specification by ITU-T | ISO/IEC.

Add the following paragraph after the semantics of svc_extension_flag just before the semantics of
rbsp_byte[ i ].
The value of svc_extension_flag shall be equal to 0 for coded video sequences conforming to one or more profiles
specified in Annex I. Decoders conforming to one or more profiles specified in Annex I shall ignore (remove from the
bitstream and discard) NAL units for which nal_unit_type is equal to 14, 20, or 21 and svc_extension_flag is equal to 1.
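
The discard rule in the paragraph above can be sketched as a filter over parsed NAL unit headers. This Python fragment is informal; the function names and the representation of a NAL unit as a (nal_unit_type, svc_extension_flag) pair are assumptions made for this example.

```python
def annex_i_decoder_ignores(nal_unit_type: int, svc_extension_flag: int) -> bool:
    """True when an Annex I decoder removes the NAL unit and discards it."""
    return nal_unit_type in (14, 20, 21) and svc_extension_flag == 1


def filter_for_annex_i(nal_units):
    """Keep only the (nal_unit_type, svc_extension_flag) pairs that are not ignored."""
    return [(t, f) for (t, f) in nal_units if not annex_i_decoder_ignores(t, f)]
```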

In 7.4.2.1.1, replace the following:
chroma_format_idc specifies the chroma sampling relative to the luma sampling as specified in clause 6.2. The value of
chroma_format_idc shall be in the range of 0 to 3, inclusive. When chroma_format_idc is not present, it shall be inferred
to be equal to 1 (4:2:0 chroma format).
with
chroma_format_idc specifies the chroma sampling relative to the luma sampling as specified in clause 6.2. The value of
chroma_format_idc shall be in the range of 0 to 3, inclusive. When chroma_format_idc is not present and profile_idc is
not equal to 138, chroma_format_idc shall be inferred to be equal to 1 (4:2:0 chroma format). When chroma_format_idc
is not present and profile_idc is equal to 138, chroma_format_idc shall be inferred to be equal to 0 (4:0:0 chroma format),
otherwise, it shall be inferred to be equal to 1 (4:2:0 chroma format).
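
The replacement inference rule for an absent chroma_format_idc reduces to a single test on profile_idc. The Python sketch below is informal; the function name and the small name table are assumptions made for this example.

```python
# Chroma format names for the two values the inference rule can produce.
CHROMA_FORMAT_NAMES = {0: "4:0:0", 1: "4:2:0"}


def inferred_chroma_format_idc(profile_idc: int) -> int:
    """Value inferred for chroma_format_idc when it is not present."""
    return 0 if profile_idc == 138 else 1
```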

In 7.4.2.1.3, replace the following:
additional_extension2_data_flag may have any value. It shall not affect the conformance to profiles specified in
Annex A, G, or H.
with
additional_extension2_data_flag may have any value. It shall not affect the conformance to profiles specified in
Annex A, G, H, or I.
Replace Annex C with:
Annex C
Hypothetical reference decoder

(This annex forms an integral part of this Recommendation | International Standard)
This annex specifies the hypothetical reference decoder (HRD) and its use to check bitstream and decoder conformance.
Two types of bitstreams are subject to HRD conformance checking for this Recommendation | International Standard.
The first such type of bitstream, called Type I bitstream, is a NAL unit stream containing only the VCL NAL units and
filler data NAL units for all access units in the bitstream. The second type of bitstream, called a Type II bitstream,
contains, in addition to the VCL NAL units and filler data NAL units for all access units in the bitstream, at least one of
the following:
– additional non-VCL NAL units other than filler data NAL units,
– all leading_zero_8bits, zero_byte, start_code_prefix_one_3bytes, and trailing_zero_8bits syntax elements that form
a byte stream from the NAL unit stream (as specified in Annex B).
Figure C-1 shows the types of bitstream conformance points checked by the HRD.
[Figure labels: VCL NAL units; filler data NAL units; non-VCL NAL units other than filler data NAL units; byte stream format encapsulation (see Annex B)]
Figure C-1 – Structure of byte streams and NAL unit streams for HRD conformance checks
The syntax elements of non-VCL NAL units (or their default values for some of the syntax elements), required for the
HRD, are specified in the semantics subclauses of clause 7, Annexes D and E, and subclauses G.7, G.13, G.14,
H.7, H.13, H.14, I.7, I.13, and I.14.
Two types of HRD parameter sets (NAL HRD parameters and VCL HRD parameters) are used. The HRD parameter sets
are signalled as follows:
– When the coded video sequence conforms to one or more of the profiles specified in Annex A and the decoding
process specified in clauses 2-9 is applied, the HRD parameter sets are signalled through video usability information
as specified in subclauses E.1 and E.2, which is part of the sequence parameter set syntax structure.
– When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding
process specified in Annex G is applied, the HRD parameter sets are signalled through the SVC video usability
information extension as specified in subclauses G.14.1 and G.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 1 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or more of
the profiles specified in Annex G, the signalling of the applicable HRD parameter sets depends on whether the decoding
process specified in clauses 2-9 or the decoding process specified in Annex G is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding
process specified in Annex H is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclauses H.14.1 and H.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 2 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or
more of the profiles specified in Annex H, the signalling of the applicable HRD parameter sets depends on whether the
decoding process specified in clauses 2-9 or the decoding process specified in Annex H is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex I and the decoding
process specified in Annex I is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclause I.14, which is part of the subset sequence parameter set syntax
structure.
NOTE 3 – For coded video sequences that conform to one or more of the profiles specified in Annex A, one or more of the
profiles specified in Annex H and one or more of the profiles specified in Annex I, the signalling of the applicable HRD
parameter sets depends on whether the decoding process specified in clauses 2-9, the decoding process specified in
Annex H or the decoding process specified in Annex I is applied.
All sequence parameter sets and picture parameter sets referred to in the VCL NAL units, and corresponding buffering
period and picture timing SEI messages shall be conveyed to the HRD, in a timely manner, either in the bitstream (by
non-VCL NAL units), or by other means not specified in this Recommendation | International Standard.
In Annexes C, D, and E and subclauses G.12, G.13, G.14, H.12, H.13, H.14, I.12, I.13, and I.14, the specification for
"presence" of non-VCL NAL units is also satisfied when those NAL units (or just some of them) are conveyed to
decoders (or to the HRD) by other means not specified by this Recommendation | International Standard. For the purpose
of counting bits, only the appropriate bits that are actually present in the bitstream are counted.
NOTE 4 – As an example, synchronization of a non-VCL NAL unit, conveyed by means other than presence in the bitstream, with
the NAL units that are present in the bitstream, can be achieved by indicating two points in the bitstream, between which the
non-VCL NAL unit would have been present in the bitstream, had the encoder decided to convey it in the bitstream.
When the content of a non-VCL NAL unit is conveyed for the application by some means other than presence within the
bitstream, the representation of the content of the non-VCL NAL unit is not required to use the same syntax specified in
this annex.
NOTE 5 – When HRD information is contained within the bitstream, it is possible to verify the conformance of a bitstream to the
requirements of this subclause based solely on information contained in the bitstream. When the HRD information is not present in
the bitstream, as is the case for all "stand-alone" Type I bitstreams, conformance can only be verified when the HRD data is
supplied by some other means not specified in this Recommendation | International Standard.
The HRD contains a coded picture buffer (CPB), an instantaneous decoding process, a decoded picture buffer (DPB), and
output cropping as shown in Figure C-2.
[Figure labels: hypothetical stream scheduler (HSS); Type I or Type II bitstream; coded picture buffer (CPB); access units; decoding process (instantaneous); reference pictures; pictures; decoded picture buffer (DPB); pictures; output cropping; output cropped pictures]
Figure C-2 – HRD buffer model
The CPB size (number of bits) is CpbSize[ SchedSelIdx ]. The DPB size (number of frame buffers) is
Max( 1, max_dec_frame_buffering ). When the coded video sequence conforms to one or more of the profiles specified
in Annex H and the decoding process specified in Annex H is applied, the DPB size is specified in units of view
components. When the coded video sequence conforms to one or more of the profiles specified in Annex I and the
decoding process specified in Annex I is applied, the DPB is operated separately for texture view components and depth
view components and the terms texture DPB and depth DPB are used, respectively. The texture DPB size is specified in
units of texture view components and the depth DPB size is specified in units of depth view components.
The HRD operates as follows. Data associated with access units that flow into the CPB according to a specified arrival
schedule are delivered by the HSS. The data associated with each access unit are removed and decoded instantaneously
by the instantaneous decoding process at CPB removal times. Each decoded picture is placed in the DPB at its CPB
removal time unless it is output at its CPB removal time and is a non-reference picture. When a picture is placed in the
DPB it is removed from the DPB at the later of the DPB output time or the time that it is marked as "unused for
reference".
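
The DPB rules in the two paragraphs above can be sketched informally in Python. The function names are assumptions made for this example, times are plain numbers in arbitrary units, and None stands for a picture that is never marked as "unused for reference" because it is a non-reference picture.

```python
from typing import Optional


def dpb_size(max_dec_frame_buffering: int) -> int:
    """DPB size in frame buffers: Max( 1, max_dec_frame_buffering )."""
    return max(1, max_dec_frame_buffering)


def stored_in_dpb(is_reference: bool, output_time: int,
                  cpb_removal_time: int) -> bool:
    """A picture skips the DPB only if it is a non-reference picture
    that is output at its CPB removal time."""
    return is_reference or output_time != cpb_removal_time


def dpb_removal_time(output_time: int,
                     unused_for_reference_time: Optional[int]) -> int:
    """Later of the DPB output time and the time of marking as
    "unused for reference"."""
    if unused_for_reference_time is None:
        return output_time
    return max(output_time, unused_for_reference_time)
```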
For each picture in the bitstream, the variable OutputFlag for the decoded picture and, when applicable, the reference
base picture is set as follows:
– If the coded video sequence containing the picture conforms to one or more of the profiles specified in Annex A and
the decoding process specified in clauses 2-9 is applied, OutputFlag is set equal to 1.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex G and the decoding process specified in Annex G is applied, the following applies:
– For a reference base picture, OutputFlag is set equal to 0.
– For a decoded picture, OutputFlag is set equal to the value of the output_flag syntax element of the target layer
representation.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex H and the decoding process specified in Annex H is applied, the following applies:
– For the decoded view components of the target output views, OutputFlag is set equal to 1.
– For the decoded view components of other views, OutputFlag is set equal to 0.
– Otherwise (the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex I and the decoding process specified in Annex I is applied), the following applies:
– For the decoded texture view components and corresponding depth view components with same VOIdx of the
target output views, OutputFlag is set equal to 1.
– For the decoded texture view components and corresponding depth view components with same VOIdx of
other views, OutputFlag is set equal to 0.
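
The OutputFlag rules listed above can be summarised as one function of the applied decoding process. The Python sketch below is informal; the function name, the string labels for the decoding processes and the keyword parameters are assumptions made for this example.

```python
def output_flag(process: str, *, is_reference_base_picture: bool = False,
                output_flag_syntax: int = 1,
                in_target_output_view: bool = True) -> int:
    """OutputFlag for a decoded picture or view component.

    process is one of 'clauses 2-9', 'Annex G', 'Annex H', 'Annex I'.
    """
    if process == "clauses 2-9":
        return 1
    if process == "Annex G":
        # Reference base pictures are never output; decoded pictures follow
        # the output_flag syntax element of the target layer representation.
        return 0 if is_reference_base_picture else output_flag_syntax
    if process in ("Annex H", "Annex I"):
        # For Annex I the same value applies to a texture view component and
        # the corresponding depth view component with the same VOIdx.
        return 1 if in_target_output_view else 0
    raise ValueError("unknown decoding process: " + process)
```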
The operation of the CPB is specified in subclause C.1. The instantaneous decoder operation is specified in clauses 2-9
(for coded video sequences conforming to one or more of the profiles specified in Annex A) and in Annex G (for coded
video sequences conforming to one or more of the profiles specified in Annex G) and in Annex H (for coded video
sequences conforming to one or more of the profiles specified in Annex H) and in Annex I (for coded video sequences
conforming to one or more of the profiles specified in Annex I). The operation of the DPB is specified in subclause C.2.
The output cropping is specified in subclause C.2.2.
NOTE 6 – Coded video sequences that conform to both, one or more of the profiles specified in Annex A and one or more of the
profiles specified in Annex G, can be decoded either by the decoding process specified in clauses 2-9 or by the decoding process
specified in Annex G. The decoding result and the HRD operation may be dependent on which of the decoding processes is
applied.
NOTE 7 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and one or more of the
profiles specified in Annex H can be decoded either by the decoding process specified in clauses 2-9 or by the decoding process
specified in Annex H. The decoding result and the HRD operation may be dependent on which of the decoding processes is
applied.
NOTE 8 – Coded video sequences that conform to one or more of the profiles specified in Annex A, one or more of the profiles
specified in Annex H and one or more of the profiles specified in Annex I, can be decoded either by the decoding process specified
in clauses 2-9, by the decoding process specified in Annex H or by the decoding process specified in Annex I. The decoding result
and the HRD operation may be dependent on which of the decoding processes is applied.
HSS and HRD information concerning the number of enumerated delivery schedules and their associated bit rates and
buffer sizes is specified in subclauses E.1.1, E.1.2, E.2.1, E.2.2, G.14.1, G.14.2, H.14.1, H.14.2 and I.14. The HRD is
initialised as specified by the buffering period SEI message as specified in subclauses D.1.1 and D.2.1. The removal
timing of access units from the CPB and output timing from the DPB are specified in the picture timing SEI message as
specified in subclauses D.1.2 and D.2.2. All timing information relating to a specific access unit shall arrive prior to the
CPB removal time of the access unit.
When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding process
specified in Annex G is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in scalable nesting
SEI messages and are associated with values of DQId in the range of ( ( DQIdMax >> 4) << 4 ) to
( ( ( DQIdMax >> 4 ) << 4 ) + 15 ), inclusive, the last of these buffering period SEI messages in decoding order
is the buffering period SEI message that initialises the HRD. Let hrdDQId be the largest value of
16 * sei_dependency_id[ i ] + sei_quality_id[ i ] that is associated with the scalable nesting SEI message
containing the buffering period SEI message that initialises the HRD, let hrdDId and hrdQId be equal to
hrdDQId >> 4 and hrdDQId & 15, respectively, and let hrdTId be the value of sei_temporal_id that is
associated with the scalable nesting SEI message containing the buffering period SEI message that initialises
the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in scalable nesting SEI messages
associated with values of sei_dependency_id[ i ], sei_quality_id[ i ], and sei_temporal_id equal to hrdDId,
hrdQId, and hrdTId, respectively.
(c) The HRD parameters that are used for conformance checking are the HRD parameters included in the SVC
video usability information extension of the active SVC sequence parameter set that are associated with values
of vui_ext_dependency_id[ i ], vui_ext_quality_id[ i ], and vui_ext_temporal_id[ i ] equal to hrdDId, hrdQId,
and hrdTId, respectively. For the specification in this annex, num_units_in_tick, time_scale,
fixed_frame_rate_flag, nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag,
low_delay_hrd_flag, and pic_struct_present_flag are substituted with the values of
vui_ext_num_units_in_tick[ i ], vui_ext_time_scale[ i ], vui_ext_fixed_frame_rate_flag[ i ],
vui_ext_nal_hrd_parameters_present_flag[ i ], vui_ext_vcl_hrd_parameters_present_flag[ i ],
vui_ext_low_delay_hrd_flag[ i ], and vui_ext_pic_struct_present_flag[ i ], respectively, with i being the value
for which vui_ext_dependency_id[ i ], vui_ext_quality_id[ i ], and vui_ext_temporal_id[ i ] are equal to hrdDId,
hrdQId, and hrdTId, respectively.
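As an informative illustration, the integer arithmetic of item (a) above can be sketched in Python. The function names and the list-of-pairs input are assumptions made for this example only; they are not syntax elements or processes of this Recommendation | International Standard.

```python
def dqid_window(dq_id_max):
    """DQId values from ( ( DQIdMax >> 4 ) << 4 ) to ( ( DQIdMax >> 4 ) << 4 ) + 15, inclusive."""
    base = (dq_id_max >> 4) << 4
    return range(base, base + 16)


def split_hrd_dqid(nesting_pairs):
    """Derive (hrdDQId, hrdDId, hrdQId).

    nesting_pairs is a hypothetical container holding the
    (sei_dependency_id[ i ], sei_quality_id[ i ]) pairs of the scalable nesting
    SEI message that contains the initialising buffering period SEI message.
    """
    hrd_dqid = max(16 * dep + qual for dep, qual in nesting_pairs)
    return hrd_dqid, hrd_dqid >> 4, hrd_dqid & 15
```

For example, with DQIdMax equal to 37 the window covers DQId values 32 to 47, and the pairs (1, 0) and (2, 3) give hrdDQId equal to 35, hrdDId equal to 2 and hrdQId equal to 3.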

When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding process
specified in Annex H is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in MVC scalable
nesting SEI messages, the buffering period SEI message that is associated with the operation point being
decoded is the buffering period SEI message that initialises the HRD. Let hrdVId[ i ] be equal to
sei_op_view_id[ i ] for all i in the range of 0 to num_view_components_op_minus1, inclusive, and let hrdTId
be the value of sei_op_temporal_id, that are associated with the MVC scalable nesting SEI message containing
the buffering period SEI message that initialises the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in MVC scalable nesting SEI
messages associated with values of sei_op_view_id[ i ] equal to hrdVId[ i ] for all i in the range of 0 to
num_view_components_op_minus1, inclusive, and sei_temporal_id equal to hrdTId.
(c) The HRD parameters that are used for conformance checking are the HRD parameters included in the MVC
video usability information extension of the active MVC sequence parameter set that are associated with values
of vui_mvc_view_id[ i ][ j ] for all j in the range of 0 to vui_mvc_num_target_output_views_minus1[ i ],
inclusive, equal to hrdVId[ j ], and the value of vui_mvc_temporal_id[ i ] equal to hrdTId. For the specification
in this annex, num_units_in_tick, time_scale, fixed_frame_rate_flag, nal_hrd_parameters_present_flag,
vcl_hrd_parameters_present_flag, low_delay_hrd_flag, and pic_struct_present_flag are substituted with the
values of vui_mvc_num_units_in_tick[ i ], vui_mvc_time_scale[ i ], vui_mvc_fixed_frame_rate_flag[ i ],
vui_mvc_nal_hrd_parameters_present_flag[ i ], vui_mvc_vcl_hrd_parameters_present_flag[ i ],
vui_mvc_low_delay_hrd_flag[ i ], and vui_mvc_pic_struct_present_flag[ i ], respectively, with i being the
value for which vui_mvc_view_id[ i ] is equal to hrdVId[ j ] for all j in the range of 0 to
vui_mvc_num_target_output_views_minus1[ i ], inclusive, and vui_mvc_temporal_id[ i ] equal to hrdTId.

When the coded video sequence conforms to one or more of the profiles specified in Annex I and the decoding process
specified in Annex I is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in MVCD scalable
nesting SEI messages, the buffering period SEI message that is associated with the operation point being
decoded is the buffering period SEI message that initialises the HRD. Let hrdVId[ i ] be equal to
sei_op_view_id[ i ] for all i in the range of 0 to num_view_components_op_minus1, inclusive, and let hrdTId
be the value of sei_op_temporal_id, that are associated with the MVCD scalable nesting SEI message
containing the buffering period SEI message that initialises the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in MVCD scalable nesting SEI
messages associated with values of sei_op_view_id[ i ] equal to hrdVId[ i ] for all i in the range of 0 to
num_view_components_op_minus1, inclusive, and sei_temporal_id equal to hrdTId.
(c) The HRD parameter sets that are used for conformance checking are the HRD parameter sets, included in the
MVC video usability information extension of the active MVCD sequence parameter set, that are associated
with values of vui_mvc_view_id[ i ][ j ] for all j in the range of 0 to
vui_mvc_num_target_output_views_minus1[ i ], inclusive, equal to hrdVId[ j ], and the value of
vui_mvc_temporal_id[ i ] equal to hrdTId. For the specification in this annex, num_units_in_tick, time_scale,
fixed_frame_rate_flag, nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag,
low_delay_hrd_flag, and pic_struct_present_flag are substituted with the values of
vui_mvc_num_units_in_tick[ i ], vui_mvc_time_scale[ i ], vui_mvc_fixed_frame_rate_flag[ i ],
vui_mvc_nal_hrd_parameters_present_flag[ i ], vui_mvc_vcl_hrd_parameters_present_flag[ i ],
vui_mvc_low_delay_hrd_flag[ i ], and vui_mvc_pic_struct_present_flag[ i ], respectively, with i being the
value for which vui_mvc_view_id[ i ] is equal to hrdVId[ j ] for all j in the range of 0 to
vui_mvc_num_target_output_views_minus1[ i ], inclusive, and vui_mvc_temporal_id[ i ] equal to hrdTId.

The HRD is used to check conformance of bitstreams and decoders as specified in subclauses C.3 and C.4, respectively.
NOTE 9 – While conformance is guaranteed under the assumption that all frame-rates and clocks used to generate the bitstream
match exactly the values signalled in the bitstream, in a real system each of these may vary from the signalled or specified value.
All the arithmetic in this annex is done with real values, so that no rounding errors can propagate. For example, the
number of bits in a CPB just prior to or after removal of an access unit is not necessarily an integer.
The variable $t_c$ is derived as follows and is called a clock tick:
$t_c = \text{num\_units\_in\_tick} \div \text{time\_scale}$ (C-1)
The following is specified for expressing the constraints in this annex:
– Let access unit n be the n-th access unit in decoding order with the first access unit being access unit 0.
– Let picture n be the primary coded picture or the decoded primary picture of access unit n.
C.1 Operation of coded picture buffer (CPB)
The specifications in this subclause apply independently to each set of CPB parameters that is present and to both the
Type I and Type II conformance points shown in Figure C-1.
C.1.1 Timing of bitstream arrival
The HRD may be initialised at any one of the buffering period SEI messages. Prior to initialisation, the CPB is empty.
NOTE – After initialisation, the HRD is not initialised again by subsequent buffering period SEI messages.
Each access unit is referred to as access unit n, where the number n identifies the particular access unit. The access unit
that is associated with the buffering period SEI message that initialises the CPB is referred to as access unit 0. The value
of n is incremented by 1 for each subsequent access unit in decoding order.
The time at which the first bit of access unit n begins to enter the CPB is referred to as the initial arrival time $t_{ai}( n )$.
The initial arrival time of access units is derived as follows:
– If the access unit is access unit 0, $t_{ai}( 0 ) = 0$,
– Otherwise (the access unit is access unit n with n > 0), the following applies:
  – If cbr_flag[ SchedSelIdx ] is equal to 1, the initial arrival time for access unit n, is equal to the final arrival time
    (which is derived below) of access unit n − 1, i.e.,
    $t_{ai}( n ) = t_{af}( n - 1 )$ (C-2)
  – Otherwise (cbr_flag[ SchedSelIdx ] is equal to 0), the initial arrival time for access unit n is derived by
    $t_{ai}( n ) = \mathrm{Max}( t_{af}( n - 1 ), t_{ai,earliest}( n ) )$ (C-3)
    where $t_{ai,earliest}( n )$ is derived as follows:
    – If access unit n is not the first access unit of a subsequent buffering period, $t_{ai,earliest}( n )$ is derived as
      $t_{ai,earliest}( n ) = t_{r,n}( n ) - ( \text{initial\_cpb\_removal\_delay}[ \text{SchedSelIdx} ] + \text{initial\_cpb\_removal\_delay\_offset}[ \text{SchedSelIdx} ] ) \div 90000$ (C-4)
      with $t_{r,n}( n )$ being the nominal removal time of access unit n from the CPB as specified in subclause C.1.2
      and initial_cpb_removal_delay[ SchedSelIdx ] and initial_cpb_removal_delay_offset[ SchedSelIdx ]
      being specified in the previous buffering period SEI message.
    – Otherwise (access unit n is the first access unit of a subsequent buffering period), $t_{ai,earliest}( n )$ is derived as
      $t_{ai,earliest}( n ) = t_{r,n}( n ) - ( \text{initial\_cpb\_removal\_delay}[ \text{SchedSelIdx} ] \div 90000 )$ (C-5)
      with initial_cpb_removal_delay[ SchedSelIdx ] being specified in the buffering period SEI message
      associated with access unit n.
The final arrival time for access unit n is derived by
$t_{af}( n ) = t_{ai}( n ) + b( n ) \div \text{BitRate}[ \text{SchedSelIdx} ]$ (C-6)
where b( n ) is the size in bits of access unit n, counting the bits of the VCL NAL units and the filler data NAL units for
the Type I conformance point or all bits of the Type II bitstream for the Type II conformance point, where the Type I and
Type II conformance points are as shown in Figure C-1.
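As an informative illustration, the arrival-time derivation of Equations C-2 to C-6 can be sketched in Python as follows. The function name, the argument layout, and the precomputed `delay_90k` list (the delay term of Equation C-4 or C-5 for each access unit, in units of a 90 kHz clock) are assumptions made for this example; a single fixed BitRate[ SchedSelIdx ] is also assumed.

```python
def arrival_times(sizes_bits, nominal_removal, bit_rate, cbr_flag, delay_90k):
    """Return the lists (t_ai, t_af), in seconds, for access units 0..N-1.

    sizes_bits[n]      -- b( n ), size of access unit n in bits
    nominal_removal[n] -- nominal removal time of access unit n, in seconds
    bit_rate           -- BitRate[ SchedSelIdx ] in bits per second (assumed constant)
    cbr_flag           -- cbr_flag[ SchedSelIdx ]
    delay_90k[n]       -- hypothetical precomputed delay term of (C-4) or (C-5)
    """
    t_ai, t_af = [], []
    for n, size in enumerate(sizes_bits):
        if n == 0:
            ai = 0.0                                               # access unit 0
        elif cbr_flag:
            ai = t_af[n - 1]                                       # (C-2)
        else:
            earliest = nominal_removal[n] - delay_90k[n] / 90000   # (C-4) or (C-5)
            ai = max(t_af[n - 1], earliest)                        # (C-3)
        t_ai.append(ai)
        t_af.append(ai + size / bit_rate)                          # (C-6)
    return t_ai, t_af
```

With cbr_flag equal to 1 the access units arrive back to back; with cbr_flag equal to 0 an access unit may start arriving later than the end of the previous one, which leaves a gap in delivery.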
The values of SchedSelIdx, BitRate[ SchedSelIdx ], and CpbSize[ SchedSelIdx ] are constrained as follows:
– If the content of the active sequence parameter sets for access unit n and access unit n − 1 differ, the HSS selects a
value SchedSelIdx1 of SchedSelIdx from among the values of SchedSelIdx provided in the active sequence
parameter set for access unit n that results in a BitRate[ SchedSelIdx1 ] or CpbSize[ SchedSelIdx1 ] for access
unit n. The value of BitRate[ SchedSelIdx1 ] or CpbSize[ SchedSelIdx1 ] may differ from the value of
BitRate[ SchedSelIdx0 ] or CpbSize[ SchedSelIdx0 ] for the value SchedSelIdx0 of SchedSelIdx that was in use for
access unit n − 1.
– Otherwise, the HSS continues to operate with the previous values of SchedSelIdx, BitRate[ SchedSelIdx ] and
CpbSize[ SchedSelIdx ].
When the HSS selects values of BitRate[ SchedSelIdx ] or CpbSize[ SchedSelIdx ] that differ from those of the previous
access unit, the following applies:
– the variable BitRate[ SchedSelIdx ] comes into effect at time $t_{ai}( n )$
– the variable CpbSize[ SchedSelIdx ] comes into effect as follows:
  – If the new value of CpbSize[ SchedSelIdx ] exceeds the old CPB size, it comes into effect at time $t_{ai}( n )$,
  – Otherwise, the new value of CpbSize[ SchedSelIdx ] comes into effect at the time $t_r( n )$.
C.1.2 Timing of coded picture removal
When an access unit n is the access unit with n equal to 0 (the access unit that initialises the HRD), the nominal removal
time of the access unit from the CPB is specified by
$t_{r,n}( 0 ) = \text{initial\_cpb\_removal\_delay}[ \text{SchedSelIdx} ] \div 90000$ (C-7)
When an access unit n is the first access unit of a buffering period that does not initialise the HRD, the nominal removal
time of the access unit from the CPB is specified by
$t_{r,n}( n ) = t_{r,n}( n_b ) + t_c * \text{cpb\_removal\_delay}( n )$ (C-8)
where $t_{r,n}( n_b )$ is the nominal removal time of the first access unit of the previous buffering period and
cpb_removal_delay( n ) is the value of cpb_removal_delay specified in the picture timing SEI message associated with
access unit n.
The nominal removal time $t_{r,n}( n )$ of an access unit n that is not the first access unit of a buffering period is given by
$t_{r,n}( n ) = t_{r,n}( n_b ) + t_c * \text{cpb\_removal\_delay}( n )$ (C-9)
where $t_{r,n}( n_b )$ is the nominal removal time of the first access unit of the current buffering period and
cpb_removal_delay( n ) is the value of cpb_removal_delay specified in the picture timing SEI message associated with
access unit n.
The removal time of access unit n is specified as follows:
– If low_delay_hrd_flag is equal to 0 or $t_{r,n}( n ) \geq t_{af}( n )$, the removal time of access unit n is specified by
$t_r( n ) = t_{r,n}( n )$ (C-10)
– Otherwise (low_delay_hrd_flag is equal to 1 and $t_{r,n}( n ) < t_{af}( n )$), the removal time of access unit n is specified by
$t_r( n ) = t_{r,n}( n ) + t_c * \mathrm{Ceil}( ( t_{af}( n ) - t_{r,n}( n ) ) \div t_c )$ (C-11)
NOTE – The latter case indicates that the size of access unit n, b( n ), is so large that it prevents removal at the nominal removal
time.
When an access unit n is the first access unit of a buffering period, $n_b$ is set equal to n at the removal time $t_r( n )$ of the
access unit n.
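As an informative illustration, the removal-time derivation of this subclause (Equations C-1 and C-7 to C-11) can be sketched in Python as follows. The function names and argument names are assumptions made for this example; the caller is assumed to track the nominal removal time of the first access unit of the relevant buffering period.

```python
import math


def clock_tick(num_units_in_tick, time_scale):
    """Clock tick t_c in seconds, as in (C-1)."""
    return num_units_in_tick / time_scale


def nominal_removal_time(n, initial_cpb_removal_delay, t_rn_nb, t_c, cpb_removal_delay):
    """Nominal removal time of access unit n, in seconds.

    t_rn_nb is the nominal removal time of the first access unit of the previous
    buffering period (C-8) or of the current buffering period (C-9).
    """
    if n == 0:
        return initial_cpb_removal_delay / 90000   # (C-7)
    return t_rn_nb + t_c * cpb_removal_delay       # (C-8) and (C-9)


def removal_time(t_rn, t_af, t_c, low_delay_hrd_flag):
    """Removal time t_r( n ), in seconds."""
    if not low_delay_hrd_flag or t_rn >= t_af:
        return t_rn                                          # (C-10)
    return t_rn + t_c * math.ceil((t_af - t_rn) / t_c)       # (C-11)
```

In the low-delay case, an access unit whose last bit arrives after its nominal removal time is removed at the first later instant that lies a whole number of clock ticks after the nominal removal time.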
C.2 Operation of the decoded picture buffer (DPB)
The decoded picture buffer contains frame buffers. When a coded video sequence conforming to one or more of the
profiles specified in Annex A is decoded by applying the decoding process specified in clauses 2-9, each of the frame
buffers may contain a decoded frame, a decoded complementary field pair or a single (non-paired) decoded field that is
marked as "used for reference" (reference pictures) or is held for future output (reordered or delayed pictures). When a
coded video sequence conforming to one or more of the profiles specified in Annex G is decoded by applying the
decoding process specified in Annex G, each frame buffer may contain a decoded frame, a decoded complementary field
...


DRAFT AMENDMENT ISO/IEC 14496-10:2012/DAM 2
ISO/IEC JTC 1 Secretariat: ANSI

Voting begins on Voting terminates on
2012-10-09 2013-01-09
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION • МЕЖДУНАРОДНАЯ ОРГАНИЗАЦИЯ ПО СТАНДАРТИЗАЦИИ • ORGANISATION INTERNATIONALE DE NORMALISATION
INTERNATIONAL ELECTROTECHNICAL COMMISSION • МЕЖДУНАРОДНАЯ ЭЛЕКТРОТЕХНИЧЕСКАЯ КОММИСИЯ • COMMISSION ÉLECTROTECHNIQUE INTERNATIONALE

Information technology — Coding of audio-visual objects —
Part 10:
Advanced Video Coding
AMENDMENT 2
Technologies de l'information — Codage des objets audiovisuels —
Partie 10: Codage visuel avancé
AMENDEMENT 2
ICS 35.040
To expedite distribution, this document is circulated as received from the committee
secretariat. ISO Central Secretariat work of editing and text composition will be undertaken at
publication stage.

THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENT AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE
REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES,
DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME
STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH
THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
© International Organization for Standardization, 2012
© International Electrotechnical Commission, 2012

Copyright notice
This ISO document is a Draft International Standard and is copyright-protected by ISO. Except as permitted
under the applicable laws of the user's country, neither this ISO draft nor any extract from it may be
reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic,
photocopying, recording or otherwise, without prior written permission being secured.
Requests for permission to reproduce should be addressed to either ISO at the address below or ISO's
member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Reproduction may be subject to royalty payments or a licensing agreement.
Violators may be prosecuted.
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 2 to ISO/IEC 14496-10:2012 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Information technology — Coding of audio-visual objects —
Part 10: Advanced Video Coding, AMENDMENT 2: MVC
extensions for inclusion of depth maps
AMENDMENT 2
In 0.6, add the following paragraph after the paragraph that starts with “Multiview video coding”:
An extension of multiview video coding that also supports the inclusion of depth maps is specified in Annex I, allowing
the construction of bitstreams that represent multiple views with corresponding depth views. Similar to multiview video
coding, bitstreams that include multiple depth views may also contain sub-bitstreams that conform to this specification.
For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the
bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. For view bitstream
scalability, i.e., the presence of a sub-bitstream with fewer views than those included in the bitstream, NAL units are
removed from the bitstream when deriving the sub-bitstream. In this case, inter-view prediction, i.e., the prediction of one
view signal by data of another view signal, is typically used for efficient coding.

In 0.7, add the following paragraph after the paragraph that starts with “Annex H specifies”:
Annex I specifies MVC extensions for inclusion of depth maps, referred to as 3D Video Coding (3DVC). The reader is
referred to Annex I for the entire decoding process for 3DVC, which is specified there with references being made to
clauses 2-9 and Annexes A-E and Annex H. Subclause I.10 specifies one profile for 3DVC (Multiview and Depth).

In 7.3.1, replace the syntax table with:

nal_unit( NumBytesInNALunit ) {                                       C    Descriptor
  forbidden_zero_bit                                                  All  f(1)
  nal_ref_idc                                                         All  u(2)
  nal_unit_type                                                       All  u(5)
  NumBytesInRBSP = 0
  nalUnitHeaderBytes = 1
  if( nal_unit_type = = 14 | | nal_unit_type = = 20 | |
      nal_unit_type = = 21 ) {
    svc_extension_flag                                                All  u(1)
    if( !svc_extension_flag | | nal_unit_type = = 21 )
      nal_unit_header_mvc_extension( ) /* specified in Annex H */     All
    else
      nal_unit_header_svc_extension( ) /* specified in Annex G */     All
    nalUnitHeaderBytes += 3
  }
  for( i = nalUnitHeaderBytes; i < NumBytesInNALunit; i++ ) {
    if( i + 2 < NumBytesInNALunit && next_bits( 24 ) = = 0x000003 ) {
      rbsp_byte[ NumBytesInRBSP++ ]                                   All  b(8)
      rbsp_byte[ NumBytesInRBSP++ ]                                   All  b(8)
      i += 2
      emulation_prevention_three_byte /* equal to 0x03 */             All  f(8)
    } else
      rbsp_byte[ NumBytesInRBSP++ ]                                   All  b(8)
  }
}
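As an informative illustration, the header parsing and the removal of emulation prevention bytes described by the nal_unit( ) syntax can be sketched in Python as follows. The function name and the tuple it returns are assumptions made for this example. The extension header itself (the 23 bits following svc_extension_flag) is skipped rather than parsed.

```python
def parse_nal_unit(nal):
    """Return (nal_ref_idc, nal_unit_type, svc_extension_flag, rbsp) for one NAL unit.

    nal is a bytes object holding exactly one NAL unit (no start code prefix).
    svc_extension_flag is None when it is not present.
    """
    first = nal[0]
    nal_ref_idc = (first >> 5) & 0x3        # u(2), after forbidden_zero_bit
    nal_unit_type = first & 0x1F            # u(5)
    header_bytes = 1                        # nalUnitHeaderBytes
    svc_extension_flag = None
    if nal_unit_type in (14, 20, 21):
        svc_extension_flag = nal[1] >> 7    # first bit after the one-byte header
        header_bytes += 3                   # flag plus extension header, not parsed here
    rbsp = bytearray()
    i = header_bytes
    while i < len(nal):
        if i + 2 < len(nal) and nal[i:i + 3] == b"\x00\x00\x03":
            rbsp += nal[i:i + 2]            # keep the two zero bytes
            i += 3                          # drop emulation_prevention_three_byte
        else:
            rbsp.append(nal[i])
            i += 1
    return nal_ref_idc, nal_unit_type, svc_extension_flag, bytes(rbsp)
```

For example, the six bytes 0x65 0x00 0x00 0x03 0x01 0xAA yield nal_ref_idc equal to 3, nal_unit_type equal to 5 and the RBSP bytes 0x00 0x00 0x01 0xAA.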
In 7.3.2.1.3, replace the syntax table with:

subset_seq_parameter_set_rbsp( ) {                                    C    Descriptor
  seq_parameter_set_data( )                                           0
  if( profile_idc = = 83 | | profile_idc = = 86 ) {
    seq_parameter_set_svc_extension( ) /* specified in Annex G */     0
    svc_vui_parameters_present_flag                                   0    u(1)
    if( svc_vui_parameters_present_flag = = 1 )
      svc_vui_parameters_extension( ) /* specified in Annex G */      0
  } else if( profile_idc = = 118 | | profile_idc = = 128 ) {
    bit_equal_to_one /* equal to 1 */                                 0    f(1)
    seq_parameter_set_mvc_extension( ) /* specified in Annex H */     0
    mvc_vui_parameters_present_flag                                   0    u(1)
    if( mvc_vui_parameters_present_flag = = 1 )
      mvc_vui_parameters_extension( ) /* specified in Annex H */      0
  }
  if( profile_idc = = 138 ) {
    bit_equal_to_one /* equal to 1 */                                 0    f(1)
    seq_parameter_set_mvc_extension( ) /* specified in Annex H */     0
    seq_parameter_set_3dvc_extension( )                               0
  }
  additional_extension3_flag                                          0    u(1)
  if( additional_extension3_flag = = 1 )
    while( more_rbsp_data( ) )
      additional_extension3_data_flag                                 0    u(1)
  rbsp_trailing_bits( )                                               0
}
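As an informative illustration, the profile-dependent branching of the subset_seq_parameter_set_rbsp( ) syntax can be sketched in Python as follows. The function name is an assumption made for this example, and the function only lists the unconditional elements of each branch in order; it does not parse any bits, and the conditionally present VUI extensions and additional_extension3_data_flag bits are omitted.

```python
def subset_sps_elements(profile_idc):
    """List, in order, the syntax structures and elements always present for profile_idc."""
    parts = ["seq_parameter_set_data"]
    if profile_idc in (83, 86):
        parts += ["seq_parameter_set_svc_extension",
                  "svc_vui_parameters_present_flag"]
    elif profile_idc in (118, 128):
        parts += ["bit_equal_to_one",
                  "seq_parameter_set_mvc_extension",
                  "mvc_vui_parameters_present_flag"]
    if profile_idc == 138:
        parts += ["bit_equal_to_one",
                  "seq_parameter_set_mvc_extension",
                  "seq_parameter_set_3dvc_extension"]
    parts += ["additional_extension3_flag", "rbsp_trailing_bits"]
    return parts
```

With profile_idc equal to 138 the list contains both the MVC extension and the 3DVC extension, which mirrors the way the syntax table reuses the Annex H structure for the texture views.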
In 7.3.3, replace the syntax table with:

slice_header( ) {                                                     C    Descriptor
  first_mb_in_slice                                                   2    ue(v)
  slice_type                                                          2    ue(v)
  pic_parameter_set_id                                                2    ue(v)
  if( separate_colour_plane_flag = = 1 )
    colour_plane_id                                                   2    u(2)
  frame_num                                                           2    u(v)
  if( !frame_mbs_only_flag ) {
    field_pic_flag                                                    2    u(1)
    if( field_pic_flag )
      bottom_field_flag                                               2    u(1)
  }
  if( IdrPicFlag )
    idr_pic_id                                                        2    ue(v)
  if( pic_order_cnt_type = = 0 ) {
    pic_order_cnt_lsb                                                 2    u(v)
    if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
      delta_pic_order_cnt_bottom                                      2    se(v)
  }
  if( pic_order_cnt_type = = 1 && !delta_pic_order_always_zero_flag ) {
    delta_pic_order_cnt[ 0 ]                                          2    se(v)
    if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
      delta_pic_order_cnt[ 1 ]                                        2    se(v)
  }
  if( redundant_pic_cnt_present_flag )
    redundant_pic_cnt                                                 2    ue(v)
  if( slice_type = = B )
    direct_spatial_mv_pred_flag                                       2    u(1)
  if( slice_type = = P | | slice_type = = SP | | slice_type = = B ) {
    num_ref_idx_active_override_flag                                  2    u(1)
    if( num_ref_idx_active_override_flag ) {
      num_ref_idx_l0_active_minus1                                    2    ue(v)
      if( slice_type = = B )
        num_ref_idx_l1_active_minus1                                  2    ue(v)
    }
  }
  if( nal_unit_type = = 20 | | nal_unit_type = = 21 )
    ref_pic_list_mvc_modification( ) /* specified in Annex H */       2
  else
    ref_pic_list_modification( )                                      2
  if( ( weighted_pred_flag && ( slice_type = = P | | slice_type = = SP ) ) | |
      ( weighted_bipred_idc = = 1 && slice_type = = B ) )
    pred_weight_table( )                                              2
  if( nal_ref_idc != 0 )
    dec_ref_pic_marking( )                                            2
  if( entropy_coding_mode_flag && slice_type != I && slice_type != SI )
    cabac_init_idc                                                    2    ue(v)
  slice_qp_delta                                                      2    se(v)
  if( slice_type = = SP | | slice_type = = SI ) {
    if( slice_type = = SP )
      sp_for_switch_flag                                              2    u(1)
    slice_qs_delta                                                    2    se(v)
  }
  if( deblocking_filter_control_present_flag ) {
    disable_deblocking_filter_idc                                     2    ue(v)
    if( disable_deblocking_filter_idc != 1 ) {
      slice_alpha_c0_offset_div2                                      2    se(v)
      slice_beta_offset_div2                                          2    se(v)
    }
  }
  if( num_slice_groups_minus1 > 0 &&
      slice_group_map_type >= 3 && slice_group_map_type <= 5 )
    slice_group_change_cycle                                          2    u(v)
}
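As an informative illustration, the u(n), ue(v) and se(v) descriptors used in the slice_header( ) syntax can be read with a small bit reader such as the Python sketch below. The class name and its methods are assumptions made for this example; the reader operates on RBSP bytes, i.e. after emulation prevention bytes have been removed, and follows the Exp-Golomb parsing process of the base specification.

```python
class BitReader:
    """Minimal most-significant-bit-first reader over RBSP bytes (illustrative only)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n):
        """Unsigned integer using n bits."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self):
        """Unsigned Exp-Golomb code: codeNum = 2^leadingZeroBits - 1 + suffix."""
        leading_zero_bits = 0
        while self.u(1) == 0:
            leading_zero_bits += 1
        return (1 << leading_zero_bits) - 1 + self.u(leading_zero_bits)

    def se(self):
        """Signed Exp-Golomb code: codeNum k maps to +Ceil( k / 2 ) for odd k, -( k / 2 ) for even k."""
        k = self.ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)
```

For example, reading three ue(v) values from the bytes 0xA6 0x40 returns 0, 1 and 2, which is how the first three slice header syntax elements (first_mb_in_slice, slice_type, pic_parameter_set_id) would be obtained.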
Replace Table 7-1 with:
nal_unit_type | Content of NAL unit and RBSP syntax structure | C | Annex A NAL unit type class | Annex G and Annex H NAL unit type class | Annex I NAL unit type class
0 | Unspecified | | non-VCL | non-VCL | non-VCL
1 | Coded slice of a non-IDR picture; slice_layer_without_partitioning_rbsp( ) | 2, 3, 4 | VCL | VCL | VCL
2 | Coded slice data partition A; slice_data_partition_a_layer_rbsp( ) | 2 | VCL | not applicable | not applicable
3 | Coded slice data partition B; slice_data_partition_b_layer_rbsp( ) | 3 | VCL | not applicable | not applicable
4 | Coded slice data partition C; slice_data_partition_c_layer_rbsp( ) | 4 | VCL | not applicable | not applicable
5 | Coded slice of an IDR picture; slice_layer_without_partitioning_rbsp( ) | 2, 3 | VCL | VCL | VCL
6 | Supplemental enhancement information (SEI); sei_rbsp( ) | 5 | non-VCL | non-VCL | non-VCL
7 | Sequence parameter set; seq_parameter_set_rbsp( ) | 0 | non-VCL | non-VCL | non-VCL
8 | Picture parameter set; pic_parameter_set_rbsp( ) | 1 | non-VCL | non-VCL | non-VCL
9 | Access unit delimiter; access_unit_delimiter_rbsp( ) | 6 | non-VCL | non-VCL | non-VCL
10 | End of sequence; end_of_seq_rbsp( ) | 7 | non-VCL | non-VCL | non-VCL
11 | End of stream; end_of_stream_rbsp( ) | 8 | non-VCL | non-VCL | non-VCL
12 | Filler data; filler_data_rbsp( ) | 9 | non-VCL | non-VCL | non-VCL
13 | Sequence parameter set extension; seq_parameter_set_extension_rbsp( ) | 10 | non-VCL | non-VCL | non-VCL
14 | Prefix NAL unit; prefix_nal_unit_rbsp( ) | 2 | non-VCL | suffix dependent | suffix dependent
15 | Subset sequence parameter set; subset_seq_parameter_set_rbsp( ) | 0 | non-VCL | non-VCL | non-VCL
16..18 | Reserved | | non-VCL | non-VCL | non-VCL
19 | Coded slice of an auxiliary coded picture without partitioning; slice_layer_without_partitioning_rbsp( ) | 2, 3, 4 | non-VCL | non-VCL | non-VCL
20 | Coded slice extension; slice_layer_extension_rbsp( ) | 2, 3, 4 | non-VCL | VCL | VCL
21 | Coded slice extension for depth view components /* specified in Annex I */; slice_layer_extension_rbsp( ) /* specified in Annex I */ | 2, 3, 4 | non-VCL | VCL | VCL
22..23 | Reserved | | non-VCL | non-VCL | VCL
24..31 | Unspecified | | non-VCL | non-VCL | non-VCL

In 7.4.1, add the following paragraph in the semantics of svc_extension_flag just before the semantics of
rbsp_byte[ i ].
The value of svc_extension_flag shall be equal to 0 for coded video sequences conforming to one or more profiles
specified in Annex I. Decoders conforming to one or more profiles specified in Annex I shall ignore (remove from the
bitstream and discard) NAL units for which nal_unit_type is equal to 14, 20, or 21 and for which svc_extension_flag is
equal to 1.
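As an informative illustration, the discarding rule above can be sketched in Python as follows. The function name is an assumption made for this example; the sketch inspects only the first two bytes of a NAL unit and assumes, as in the nal_unit( ) syntax of 7.3.1, that svc_extension_flag is the first bit following the one-byte NAL unit header.

```python
def annex_i_decoder_keeps(nal):
    """Return False for NAL units that a decoder conforming to an Annex I profile ignores.

    nal is a bytes object holding one NAL unit (no start code prefix).
    """
    nal_unit_type = nal[0] & 0x1F
    if nal_unit_type in (14, 20, 21):
        svc_extension_flag = nal[1] >> 7
        return svc_extension_flag == 0
    return True  # other NAL unit types are not affected by this rule
```

A sub-bitstream filter for such a decoder could apply this predicate to every NAL unit before any further parsing.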
In 7.4.2.1.3, make the following changes:
Replace the text that follows “inclusive.” in the semantics of chroma_format_idc with:
When chroma_format_idc is not present, it shall be inferred to be equal to 0 (4:0:0 chroma format) when profile_idc is
equal to 138; otherwise, it shall be inferred to be equal to 1 (4:2:0 chroma format).

Substitute each occurrence of “additional_extension2_flag” with “additional_extension3_flag”.
Replace the semantics of additional_extension2_data_flag with the following:
additional_extension3_data_flag may have any value. It shall not affect the conformance to profiles specified in
Annex A, G, H, or I.
Replace Annex C with:
Annex C
Hypothetical reference decoder

(This annex forms an integral part of this Recommendation | International Standard)
This annex specifies the hypothetical reference decoder (HRD) and its use to check bitstream and decoder conformance.
Two types of bitstreams are subject to HRD conformance checking for this Recommendation | International Standard.
The first such type of bitstream, called a Type I bitstream, is a NAL unit stream containing only the VCL NAL units and
filler data NAL units for all access units in the bitstream. The second type of bitstream, called a Type II bitstream,
contains, in addition to the VCL NAL units and filler data NAL units for all access units in the bitstream, at least one of
the following:
– additional non-VCL NAL units other than filler data NAL units,
– all leading_zero_8bits, zero_byte, start_code_prefix_one_3bytes, and trailing_zero_8bits syntax elements that form
a byte stream from the NAL unit stream (as specified in Annex B).
Figure C-1 shows the types of bitstream conformance points checked by the HRD.
[Figure C-1: diagram relating VCL NAL units, filler data NAL units, non-VCL NAL units other than filler data NAL units, and byte stream format encapsulation (see Annex B) to the Type I and Type II conformance points]
Figure C-1 – Structure of byte streams and NAL unit streams for HRD conformance checks
The syntax elements of non-VCL NAL units (or their default values for some of the syntax elements), required for the
HRD, are specified in the semantic subclauses of clause 7, Annexes D and E, and subclauses G.7, G.13, G.14, H.7, H.13,
H.14, I.7, I.13, and I.14.
Two types of HRD parameter sets (NAL HRD parameters and VCL HRD parameters) are used. The HRD parameter sets
are signalled as follows:
– When the coded video sequence conforms to one or more of the profiles specified in Annex A and the decoding
process specified in clauses 2-9 is applied, the HRD parameter sets are signalled through video usability information
as specified in subclauses E.1 and E.2, which is part of the sequence parameter set syntax structure.
– When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding
process specified in Annex G is applied, the HRD parameter sets are signalled through the SVC video usability
information extension as specified in subclauses G.14.1 and G.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 1 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one
or more of the profiles specified in Annex G, the signalling of the applicable HRD parameter sets depends on
whether the decoding process specified in clauses 2-9 or the decoding process specified in Annex G is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding
process specified in Annex H is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclauses H.14.1 and H.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 2 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one
or more of the profiles specified in Annex H, the signalling of the applicable HRD parameter sets depends on
whether the decoding process specified in clauses 2-9 or the decoding process specified in Annex H is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex I and the decoding
process specified in Annex I is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclause I.14, which is part of the subset sequence parameter set syntax
structure.
NOTE 3 – For coded video sequences that conform to one or more of the profiles specified in Annex A, one or more
of the profiles specified in Annex H and one or more of the profiles specified in Annex I, the signalling of the
applicable HRD parameter sets depends on whether the decoding process specified in clauses 2-9, the decoding
process specified in Annex H or the decoding process specified in Annex I is applied.
All sequence parameter sets and picture parameter sets referred to in the VCL NAL units, and corresponding buffering
period and picture timing SEI messages shall be conveyed to the HRD, in a timely manner, either in the bitstream (by
non-VCL NAL units), or by other means not specified in this Recommendation | International Standard.
In Annexes C, D, and E and subclauses G.12, G.13, G.14, H.12, H.13, H.14, I.12, I.13, and I.14, the specification for
"presence" of non-VCL NAL units is also satisfied when those NAL units (or just some of them) are conveyed to
decoders (or to the HRD) by other means not specified by this Recommendation | International Standard. For the purpose
of counting bits, only the appropriate bits that are actually present in the bitstream are counted.
NOTE 3 – As an example, synchronization of a non-VCL NAL unit, conveyed by means other than presence in the
bitstream, with the NAL units that are present in the bitstream, can be achieved by indicating two points in the
bitstream, between which the non-VCL NAL unit would have been present in the bitstream, had the encoder decided
to convey it in the bitstream.
When the content of a non-VCL NAL unit is conveyed for the application by some means other than presence within the
bitstream, the representation of the content of the non-VCL NAL unit is not required to use the same syntax specified in
this annex.
NOTE 4 – When HRD information is contained within the bitstream, it is possible to verify the conformance of a
bitstream to the requirements of this subclause based solely on information contained in the bitstream. When the HRD
information is not present in the bitstream, as is the case for all "stand-alone" Type I bitstreams, conformance can
only be verified when the HRD data is supplied by some other means not specified in this Recommendation |
International Standard.
The HRD contains a coded picture buffer (CPB), an instantaneous decoding process, a decoded picture buffer (DPB), and
output cropping as shown in Figure C-2.
8 © ISO/IEC 2012 – All rights reserved

ISO/IEC 14496-10:2012/DAM 2
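[Figure: block diagram. The hypothetical stream scheduler (HSS) feeds a Type I or Type II bitstream into the coded
picture buffer (CPB); access units pass to the instantaneous decoding process; decoded pictures enter the decoded
picture buffer (DPB), which returns reference pictures to the decoding process; pictures leaving the DPB pass through
output cropping to give output cropped pictures.]
Figure C-2 – HRD buffer model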
The CPB size (number of bits) is CpbSize[ SchedSelIdx ]. The DPB size (number of frame buffers) is
Max( 1, max_dec_frame_buffering ). When the coded video sequence conforms to one or more of the profiles specified
in Annex H and the decoding process specified in Annex H is applied, the DPB size is specified in units of view
components. When the coded video sequence conforms to one or more of the profiles specified in Annex I and the
decoding process specified in Annex I is applied, the DPB size is specified in units of texture view components and depth
view components.
The HRD operates as follows. Data associated with access units that flow into the CPB according to a specified arrival
schedule are delivered by the HSS. The data associated with each access unit are removed and decoded instantaneously
by the instantaneous decoding process at CPB removal times. Each decoded picture is placed in the DPB at its CPB
removal time unless it is output at its CPB removal time and is a non-reference picture. When a picture is placed in the
DPB it is removed from the DPB at the later of the DPB output time or the time that it is marked as "unused for
reference".
For each picture in the bitstream, the variable OutputFlag for the decoded picture and, when applicable, the reference
base picture is set as follows:
– If the coded video sequence containing the picture conforms to one or more of the profiles specified in Annex A and
the decoding process specified in clauses 2-9 is applied, OutputFlag is set equal to 1.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex G and the decoding process specified in Annex G is applied, the following applies:
– For a reference base picture, OutputFlag is set equal to 0.
– For a decoded picture, OutputFlag is set equal to the value of the output_flag syntax element of the target layer
representation.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex H and the decoding process specified in Annex H is applied, the following applies:
– For the decoded view components of the target output views, OutputFlag is set equal to 1.
– For the decoded view components of other views, OutputFlag is set equal to 0.
– Otherwise (the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex I and the decoding process specified in Annex I is applied), the following applies:
– For the decoded texture view components and depth view components with the same VOIdx of the target output
views, OutputFlag is set equal to 1.
– For the decoded texture view components and depth view components with the same VOIdx of other views,
OutputFlag is set equal to 0.
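The OutputFlag rules listed above can be sketched as a small function. This is only an illustration: the function name, the string labels for the decoding processes, and the parameter names are assumptions made for the example and do not appear in this Recommendation | International Standard.

```python
from typing import Optional


def derive_output_flag(decoding_process: str,
                       is_reference_base_picture: bool = False,
                       output_flag_syntax_element: Optional[int] = None,
                       in_target_output_view: bool = False) -> int:
    """Return OutputFlag for one decoded picture or view component.

    decoding_process is one of the hypothetical labels
    "clauses 2-9", "Annex G", "Annex H", "Annex I".
    """
    if decoding_process == "clauses 2-9":
        # Annex A profiles: every decoded picture is output.
        return 1
    if decoding_process == "Annex G":
        if is_reference_base_picture:
            return 0
        if output_flag_syntax_element is None:
            raise ValueError("output_flag of the target layer representation is required")
        return output_flag_syntax_element
    if decoding_process in ("Annex H", "Annex I"):
        # View components (texture and depth for Annex I) of target output views only.
        return 1 if in_target_output_view else 0
    raise ValueError("unknown decoding process label")
```

For example, `derive_output_flag("Annex I", in_target_output_view=True)` gives 1, while a view component of a non-target view gives 0.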
The operation of the CPB is specified in subclause C.1. The instantaneous decoder operation is specified in clauses 2-9
(for coded video sequences conforming to one or more of the profiles specified in Annex A), in Annex G (for coded
video sequences conforming to one or more of the profiles specified in Annex G), in Annex H (for coded video
sequences conforming to one or more of the profiles specified in Annex H), and in Annex I (for coded video sequences
conforming to one or more of the profiles specified in Annex I). The operation of the DPB is specified in subclause C.2.
The output cropping is specified in subclause C.2.2.
NOTE 5 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or
more of the profiles specified in Annex G can be decoded either by the decoding process specified in clauses 2-9 or
by the decoding process specified in Annex G. The decoding result and the HRD operation may depend on
which of the decoding processes is applied.
NOTE 6 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or
more of the profiles specified in Annex H can be decoded either by the decoding process specified in clauses 2-9 or
by the decoding process specified in Annex H. The decoding result and the HRD operation may depend on
which of the decoding processes is applied.
NOTE 7 – Coded video sequences that conform to one or more of the profiles specified in Annex A, one or more of
the profiles specified in Annex H, and one or more of the profiles specified in Annex I can be decoded by the
decoding process specified in clauses 2-9, by the decoding process specified in Annex H, or by the decoding process
specified in Annex I. The decoding result and the HRD operation may depend on which of the decoding
processes is applied.
HSS and HRD information concerning the number of enumerated delivery schedules and their associated bit rates and
buffer sizes is specified in subclauses E.1.1, E.1.2, E.2.1, E.2.2, G.14.1, G.14.2, H.14.1, H.14.2 and I.14. The HRD is
initialised as specified by the buffering period SEI message as specified in subclauses D.1.1 and D.2.1. The removal
timing of access units from the CPB and output timing from the DPB are specified in the picture timing SEI message as
specified in subclauses D.1.2 and D.2.2. All timing information relating to a specific access unit shall arrive prior to the
CPB removal time of the access unit.
When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding process
specified in Annex G is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in scalable nesting
SEI messages and are associated with values of DQId in the range of ( ( DQIdMax >> 4) << 4 ) to
( ( ( DQIdMax >> 4 ) << 4 ) + 15 ), inclusive, the last of these buffering period SEI messages in decoding order
is the buffering period SEI message that initialises the HRD. Let hrdDQId be the largest value of
16 * sei_dependency_id[ i ] + sei_quality_id[ i ] that is associated with the scalable nesting SEI message
containing the buffering period SEI message that initialises the HRD, let hrdDId and hrdQId be equal to
hrdDQId >> 4 and hrdDQId & 15, respectively, and let hrdTId be the value of sei_temporal_id that is
associated with the scalable nesting SEI message containing the buffering period SEI message that initialises
the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in scalable nesting SEI messages
associated with values of sei_dependency_id[ i ], sei_quality_id[ i ], and sei_temporal_id equal to hrdDId,
hrdQId, and hrdTId, respectively.
(c) The HRD parameter sets that are used for conformance checking are the HRD parameter sets, included in the
SVC video usability information extension of the active SVC sequence parameter set, that are associated with
values of vui_ext_dependency_id[ i ], vui_ext_quality_id[ i ], and vui_ext_temporal_id[ i ] equal to hrdDId,
hrdQId, and hrdTId, respectively. For the specification in this annex, num_units_in_tick, time_scale,
fixed_frame_rate_flag, nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag,
low_delay_hrd_flag, and pic_struct_present_flag are substituted with the values of
vui_ext_num_units_in_tick[ i ], vui_ext_time_scale[ i ], vui_ext_fixed_frame_rate_flag[ i ],
vui_ext_nal_hrd_parameters_present_flag[ i ], vui_ext_vcl_hrd_parameters_present_flag[ i ],
vui_ext_low_delay_hrd_flag[ i ], and vui_ext_pic_struct_present_flag[ i ], respectively, with i being the value
for which vui_ext_dependency_id[ i ], vui_ext_quality_id[ i ], and vui_ext_temporal_id[ i ] are equal to hrdDId,
hrdQId, and hrdTId, respectively.
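The DQId arithmetic in item (a) of the Annex G case above can be sketched as follows. The function names and the representation of the SEI values as (sei_dependency_id, sei_quality_id) pairs are assumptions made for illustration only.

```python
from typing import Iterable, Tuple


def dqid_range(dqid_max: int) -> range:
    """DQId values from ((DQIdMax >> 4) << 4) to that value plus 15, inclusive."""
    low = (dqid_max >> 4) << 4
    return range(low, low + 16)


def derive_hrd_ids(pairs: Iterable[Tuple[int, int]]) -> Tuple[int, int, int]:
    """Return (hrdDQId, hrdDId, hrdQId) for the scalable nesting SEI message
    that contains the initialising buffering period SEI message.

    pairs holds the (sei_dependency_id[i], sei_quality_id[i]) values of that message.
    """
    hrd_dqid = max(16 * dep_id + qual_id for dep_id, qual_id in pairs)
    return hrd_dqid, hrd_dqid >> 4, hrd_dqid & 15
```

With DQIdMax equal to 37, the range covers 32 to 47, and the pairs (2, 1) and (2, 5) give hrdDQId 37, hrdDId 2, and hrdQId 5.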
When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding process
specified in Annex H is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in MVC scalable
nesting SEI messages, the buffering period SEI message that is associated with the operation point being
decoded is the buffering period SEI message that initialises the HRD. Let hrdVId[ i ] be equal to
sei_op_view_id[ i ] for all i in the range of 0 to num_view_components_op_minus1, inclusive, and let hrdTId
be the value of sei_op_temporal_id, that are associated with the MVC scalable nesting SEI message containing
the buffering period SEI message that initialises the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in MVC scalable nesting SEI
messages associated with values of sei_op_view_id[ i ] equal to hrdVId[ i ] for all i in the range of 0 to
num_view_components_op_minus1, inclusive, and sei_temporal_id equal to hrdTId.
(c) The HRD parameters that are used for conformance checking are the HRD parameters, included in the MVC
video usability information extension of the active MVC sequence parameter set, that are associated with values
of vui_mvc_view_id[ i ][ j ] for all j in the range of 0 to vui_mvc_num_target_output_views_minus1[ i ],
inclusive, equal to hrdVId[ j ], and the value of vui_mvc_temporal_id[ i ] equal to hrdTId. For the specification
in this annex, num_units_in_tick, time_scale, fixed_frame_rate_flag, nal_hrd_parameters_present_flag,
vcl_hrd_parameters_present_flag, low_delay_hrd_flag, and pic_struct_present_flag are substituted with the
values of vui_mvc_num_units_in_tick[ i ], vui_mvc_time_scale[ i ], vui_mvc_fixed_frame_rate_flag[ i ],
vui_mvc_nal_hrd_parameters_present_flag[ i ], vui_mvc_vcl_hrd_parameters_present_flag[ i ],
vui_mvc_low_delay_hrd_flag[ i ], and vui_mvc_pic_struct_present_flag[ i ], respectively, with i being the
value for which vui_mvc_view_id[ i ][ j ] is equal to hrdVId[ j ] for all j in the range of 0 to
vui_mvc_num_target_output_views_minus1[ i ], inclusive, and vui_mvc_temporal_id[ i ] is equal to hrdTId.
When the coded video sequence conforms to one or more of the profiles specified in Annex I and the decoding process
specified in Annex I is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in 3DVC scalable
nesting SEI messages, the buffering period SEI message that is associated with the operation point being
decoded is the buffering period SEI message that initialises the HRD. Let hrdVId[ i ] be equal to
sei_op_view_id[ i ] for all i in the range of 0 to num_view_components_op_minus1, inclusive, and let hrdTId
be the value of sei_op_temporal_id, that are associated with the 3DVC scalable nesting SEI message containing
the buffering period SEI message that initialises the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in 3DVC scalable nesting SEI
messages associated with values of sei_op_view_id[ i ] equal to hrdVId[ i ] for all i in the range of 0 to
num_view_components_op_minus1, inclusive, and sei_temporal_id equal to hrdTId.
(c) The HRD parameter sets that are used for conformance checking are the HRD parameter sets, included in the
MVC video usability information extension of the active 3DVC sequence parameter set, that are associated with
values of vui_mvc_view_id[ i ][ j ] for all j in the range of 0 to vui_mvc_num_target_output_views_minus1[ i ],
inclusive, equal to hrdVId[ j ], and the value of vui_mvc_temporal_id[ i ] equal to hrdTId. For the specification
in this annex, num_units_in_tick, time_scale, fixed_frame_rate_flag, nal_hrd_parameters_present_flag,
vcl_hrd_parameters_present_flag, low_delay_hrd_flag, and pic_struct_present_flag are substituted with the
values of vui_mvc_num_units_in_tick[ i ], vui_mvc_time_scale[ i ], vui_mvc_fixed_frame_rate_flag[ i ],
vui_mvc_nal_hrd_parameters_present_flag[ i ], vui_mvc_vcl_hrd_parameters_present_flag[ i ],
vui_mvc_low_delay_hrd_flag[ i ], and vui_mvc_pic_struct_present_flag[ i ], respectively, with i being the
value for which vui_mvc_view_id[ i ][ j ] is equal to hrdVId[ j ] for all j in the range of 0 to
vui_mvc_num_target_output_views_minus1[ i ], inclusive, and vui_mvc_temporal_id[ i ] is equal to hrdTId.
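Item (c) of the Annex H and Annex I cases selects the index i of the MVC VUI entry whose view identifiers and temporal identifier match hrdVId and hrdTId. A minimal sketch of that lookup is given below; the `VuiEntry` type and the function name are hypothetical, and the sketch assumes that a match requires the two view identifier lists to be identical in length and order.

```python
from typing import List, NamedTuple, Optional, Sequence


class VuiEntry(NamedTuple):
    """Hypothetical container for one MVC VUI operation point (index i)."""
    temporal_id: int          # vui_mvc_temporal_id[ i ]
    view_ids: Sequence[int]   # vui_mvc_view_id[ i ][ j ] for all j


def select_vui_index(entries: List[VuiEntry],
                     hrd_vid: Sequence[int],
                     hrd_tid: int) -> Optional[int]:
    """Return the first index i whose entry matches hrdVId and hrdTId, else None."""
    for i, entry in enumerate(entries):
        if entry.temporal_id == hrd_tid and list(entry.view_ids) == list(hrd_vid):
            return i
    return None
```

The HRD-related values (for example vui_mvc_num_units_in_tick[ i ] and vui_mvc_time_scale[ i ]) would then be read at the returned index.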
The HRD is used to check conformance of bitstreams and decoders as specified in subclauses C.3 and C.4, respectively.
NOTE 7 – While conformance is guaranteed under the assumption that all frame-rates and clocks used to generate the
bitstream match exactly the values signalled in the bitstream, in a real system each of these may vary from the
signalled or specified value.
All the arithmetic in this annex is done with real values, so that no rounding errors can propagate. For example, the
number of bits in a CPB just prior to or after removal of an access unit is not necessarily an integer.
The variable t_c is derived as follows and is called a clock tick:
t_c = num_units_in_tick ÷ time_scale (C-1)
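Because all arithmetic in this annex uses real values, a sketch of the clock tick of equation (C-1) can use exact rational numbers instead of floating point. The function name and the numeric values in the usage note are illustrative assumptions.

```python
from fractions import Fraction


def clock_tick(num_units_in_tick: int, time_scale: int) -> Fraction:
    """Clock tick in seconds, computed exactly as num_units_in_tick / time_scale."""
    if time_scale <= 0 or num_units_in_tick <= 0:
        raise ValueError("num_units_in_tick and time_scale must be positive")
    return Fraction(num_units_in_tick, time_scale)
```

For instance, `clock_tick(1001, 60000)` yields the exact value 1001/60000 of a second, with no rounding error to propagate into later timing computations.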
The following is specified for expressing the constraints in this annex:
– Let access unit n be the n-th access unit in decoding order with the first access unit being access unit 0.
– Let picture n be the primary coded picture or the decoded primary picture of access unit n.
C.1 Operation of coded picture buffer (CPB)
The specifications in this subclause apply independently to each set of CPB parameters that is present and to both the
Type I and Type II conformance points shown in Figure C-1.
C.1.1 Timing of bitstream arrival
The HRD may be initialised at any one of the buffering period SEI messages. Prior to initialisation, the CPB is empty.
NOTE – After initialisation, the HRD is not initialised again by subsequent buffering period SEI messages.
Each access unit is referred to as access unit n, where the number n identifies the particular access unit. The access unit
that is associated with the buffering period SEI message that initialises the CPB is referred to as access unit 0. The value
of n is incremented by 1 for each subsequent access unit in decoding order.
The time at which the first bit of access unit n begins to enter the CPB is referred to as the initial arrival time t_{ai}( n ).
The initial arrival time of access units is derived as follows:
– If the access unit is access unit 0, t_{ai}( 0 ) = 0,
– Otherwise (the access unit is access unit n with n > 0), the following applies:
– If cbr_flag[ SchedSelIdx ] is equal to 1, the initial arrival time for access unit n, is equal to the final arrival time
(which is derived below) of access unit n − 1, i.e.,
t_{ai}( n ) = t_{af}( n − 1 ) (C-2)
– Otherwise (cbr_flag[ SchedSelIdx ] is equal to 0), the initial arrival time for access unit n is derived by
t_{ai}( n ) = Max( t_{af}( n − 1 ), t_{ai,earliest}( n ) ) (C-3)
where t_{ai,earliest}( n ) is derived as follows:
– If access unit n is not the first access unit of a subsequent buffering period, t_{ai,earliest}( n ) is derived as
t_{ai,earliest}( n ) = t_{r,n}( n ) − ( initial_cpb_removal_delay[ SchedSelIdx ] +
initial_cpb_removal_delay_offset[ SchedSelIdx ] ) ÷ 90000 (C-4)
with t_{r,n}( n ) being the nominal removal time of access unit n from the CPB as specified in subclause C.1.2
and initial_cpb_removal_delay[ SchedSelIdx ] and initial_cpb_removal_delay_offset[ SchedSelIdx ]
being specified in the previous buffering period SEI message.
– Otherwise (access unit n is the first access unit of a subsequent buffering period), t_{ai,earliest}( n ) is derived as
t_{ai,earliest}( n ) = t_{r,n}( n ) − ( initial_cpb_removal_delay[ SchedSelIdx ] ÷ 90000 ) (C-5)
with initial_cpb_removal_delay[ SchedSelIdx ] being specified in the buffering period SEI message
associated with access unit n.
The final arrival time for access unit n is derived by
t_{af}( n ) = t_{ai}( n ) + b( n ) ÷ BitRate[ SchedSelIdx ] (C-6)
where b( n ) is the size in bits of access unit n, counting the bits of the VCL NAL units and the filler data NAL units for
the Type I conformance point or all bits of the Type II bitstream for the Type II conformance point, where the Type I and
Type II conformance points are as shown in Figure C-1.
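The arrival-time derivation of equations (C-2), (C-3), (C-4), and (C-6) can be sketched as below. The sketch is deliberately simplified: it assumes a single buffering period (so equation (C-5) is never needed), a constant BitRate[ SchedSelIdx ], and nominal removal times supplied by the caller. The function and parameter names are illustrative, not taken from this annex.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def arrival_times(sizes_bits: Sequence[int],
                  nominal_removal_times: Sequence[Fraction],
                  bit_rate: int,
                  cbr_flag: int,
                  initial_cpb_removal_delay: int,
                  initial_cpb_removal_delay_offset: int
                  ) -> Tuple[List[Fraction], List[Fraction]]:
    """Return (initial arrival times, final arrival times) in seconds.

    Assumes one buffering period starting at access unit 0.
    """
    initial: List[Fraction] = []
    final: List[Fraction] = []
    for n, size in enumerate(sizes_bits):
        if n == 0:
            start = Fraction(0)
        elif cbr_flag == 1:
            start = final[n - 1]                                   # (C-2)
        else:
            earliest = nominal_removal_times[n] - Fraction(
                initial_cpb_removal_delay + initial_cpb_removal_delay_offset,
                90000)                                             # (C-4)
            start = max(final[n - 1], earliest)                    # (C-3)
        initial.append(start)
        final.append(start + Fraction(size, bit_rate))             # (C-6)
    return initial, final
```

In the non-CBR case the scheduler may idle: an access unit does not start arriving before its earliest permitted time, even if the previous access unit has already been delivered in full.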
The values of SchedSelIdx, BitRate[ SchedSelIdx ], and CpbSize[ SchedSelIdx ] are constrained as follows:
– If the content of the active sequence parameter sets for access unit n and access unit n − 1 differ, the HSS selects a
value SchedSelIdx1 of SchedSelIdx from among the values of SchedSelIdx provided in the active sequence
parameter set for access unit n that results in a BitRate[ SchedSelIdx1 ] or CpbSize[ SchedSelIdx1 ] for access
unit n. The value of BitRate[ SchedSelIdx1 ] or CpbSize[ SchedSelIdx1 ] may differ from the value of
BitRate[ SchedSelIdx0 ] or CpbSize[ SchedSelIdx0 ] for the value SchedSelIdx0 of SchedSelIdx that was in use for
access unit n − 1.
– Otherwise, the HSS continues to operate with the previous values of SchedSelIdx, BitRate[ SchedSelIdx ] and
CpbSize[ SchedSelIdx ].
When the HSS selects values of BitRate[ SchedSelIdx ] or CpbSize[ SchedSelIdx ] that differ from those of the previous
access unit, the following applies:
– the variable BitRate[ SchedSelIdx ] comes into effect at time t_{ai}( n )
– the variable CpbSize[ SchedSelIdx ] comes into effect as follows:
– If the new value of CpbSize[ SchedSelIdx ] exceeds the old CPB size, it comes into effect at time t_{ai}( n ),
– Otherwise, the new value of CpbSize[ SchedSelIdx ] comes into effect at the time t_r( n ).
C.1.2 Timing of coded picture removal
When an access unit n is the access unit with n equal to 0 (the access unit that initialises the HRD), the nominal removal
time of the access unit from the CPB is specified by
t_{r,n}( 0 ) = initial_cpb_removal_delay[ SchedSelIdx ] ÷ 90000 (C-7)
When an access unit n is the first access unit of a buffering period that does not initialise the HRD, the nominal removal
time of the access unit from the CPB is specified by
t_{r,n}( n ) = t_{r,n}( n_b ) + t_c * cpb_removal_delay( n ) (C-8)
where t_{r,n}( n_b ) is the nominal removal time of the first access unit of the previous buffering period and
cpb_removal_delay( n ) is the value of cpb_removal_delay specified in the picture timing SEI message associated with
access unit n.
The nominal removal time t_{r,n}( n ) of an access unit n that is not the first access unit of a buffering period is given by
t_{r,n}( n ) = t_{r,n}( n_b ) + t_c * cpb_removal_delay( n ) (C-9)
where t_{r,n}( n_b ) is the nominal removal time of the first access unit of the current buffering period and
cpb_removal_delay( n ) is the value of cpb_removal_delay specified in the picture timing SEI message associated with
access unit n.
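Equations (C-7), (C-8), and (C-9) share one form: the nominal removal time of the reference access unit of a buffering period, plus a whole number of clock ticks. A minimal sketch under that reading follows. The function name, the list layout (index 0 of the delay list is unused), and the set of buffering period start indices are assumptions for illustration.

```python
from fractions import Fraction
from typing import List, Sequence, Set


def nominal_removal_times(initial_cpb_removal_delay: int,
                          clock_tick: Fraction,
                          cpb_removal_delays: Sequence[int],
                          buffering_period_starts: Set[int]) -> List[Fraction]:
    """Return the nominal CPB removal time of each access unit, in seconds.

    cpb_removal_delays[n] is the cpb_removal_delay of access unit n (n >= 1).
    buffering_period_starts holds every n > 0 that begins a buffering period.
    """
    times = [Fraction(initial_cpb_removal_delay, 90000)]          # (C-7)
    first_of_period = 0  # index of the reference access unit (n_b)
    for n in range(1, len(cpb_removal_delays)):
        # (C-8) when n starts a buffering period (reference is the previous
        # period's first access unit), otherwise (C-9).
        times.append(times[first_of_period] + clock_tick * cpb_removal_delays[n])
        if n in buffering_period_starts:
            first_of_period = n  # later access units are timed relative to n
    return times
```

Updating the reference index only after the time of access unit n has been computed is what makes the first access unit of a new buffering period refer back to the previous period.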
The removal time of access unit n is specified as follows:
– If low_delay_hrd_flag is equal to 0 or t_{r,n}( n ) >= t_{af}( n ), the removal time of access unit n is specified by
t_r( n ) = t_{r,n}( n ) (C-10)
– Otherwise (low_delay_hrd_flag is equal to 1 and t_{r,n}( n ) < t_{af}( n )), the removal time of access unit n is specified by
t_r( n ) = t_{r,n}( n ) + t_c * Ceil( ( t_{af}( n ) − t_{r,n}( n ) ) ÷ t_c ) (C-11)
NOTE – The latter case indicates that the size of access unit n, b( n ), is so large that it prevents removal at the
nominal removal time.
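The choice between equations (C-10) and (C-11) can be sketched as follows, again with exact rational arithmetic. The function and parameter names are illustrative assumptions.

```python
import math
from fractions import Fraction


def removal_time(nominal_removal_time: Fraction,
                 final_arrival_time: Fraction,
                 clock_tick: Fraction,
                 low_delay_hrd_flag: int) -> Fraction:
    """Return the CPB removal time of one access unit, in seconds."""
    if low_delay_hrd_flag == 0 or nominal_removal_time >= final_arrival_time:
        return nominal_removal_time                                # (C-10)
    # The access unit has not fully arrived: delay removal by the smallest
    # whole number of clock ticks that reaches the final arrival time.
    ticks = math.ceil((final_arrival_time - nominal_removal_time) / clock_tick)
    return nominal_removal_time + clock_tick * ticks               # (C-11)
```

With a nominal removal time of 1/5 s, a final arrival time of 27/100 s, and a clock tick of 1/25 s, the difference is 1.75 ticks, so removal is postponed by 2 ticks to 7/25 s.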
When an access unit n is the first access unit of a buffering period, n_b is set equal to n at the removal time t_r( n ) of the
access unit n.
C.2 Operation of the decoded picture buffer (DPB)
The decoded picture buffer contains frame buffers. When a coded video sequence conforming to one or more of the
profiles specified in Annex A is decoded by applying the decoding process specified in clauses 2-9, each of the frame
buffers may contain a decoded frame, a decoded complementary field pair or a single (non-paired) decoded field that is
marked as "used for reference" (reference pictures) or is held for future output (reordered or delayed pictures). When a
coded video sequence conforming to one or more of the profiles specified in Annex G is decoded by applying the
decoding process specified in Annex G, each frame buffer may contain a decoded frame, a decoded complementary field
pair, a single (non-paired) decoded field, a decoded reference base frame, a decoded reference base complementary field
pair or a single (non-paired) decoded reference base field that is marked as "used for reference" (reference pictures) or is
held for future output (reordered or delayed pictures). When a coded video sequence conforming to one or more of the
profiles specified in Annex H is decoded by applying the decoding process specified in Annex H, each of the frame
buffers may contain a decoded frame view component, a decoded complementary field view component pair, or a single
(non-paired) decoded field view component that is marked as "used for reference" (reference pictures) or is held for
future output (reordered or delayed pictures) or is held as reference for inter-view prediction (inter-view only reference
components). When a coded video sequence conforming to one or more of the profiles specified in Annex I is decoded
by applying the decoding process specified in Annex I, each of the frame buffers may contain a decoded texture frame
view component, a decoded depth frame view component, a decoded complementary texture field view component pair,
or a single (non-paired) decoded texture field view component that is marked as "used for reference" (reference pictures)
or is held for future output (reordered or delayed pictures) or is held as reference for inter-view prediction (inter-view
only reference components).
Prior to initialisation, the DPB is empty (the DPB fullness is set to zero). The following steps specified in this subclause
all happen instantaneously at t_r( n ) and in the order listed. When the decoding process specified in Annex H or Annex I
is applied, the view components of the current primary coded picture are processed by applying the ordered steps to each
view component in increasing order of the associated view order index VOIdx. During the invocation of the process for a
particular texture view, only the texture view components of the particular view are considered, and during the invocation
of the process for a particular depth view, only the depth view components of the particular view are considered.
Inside each view component of the current primary coded picture, the depth view component, if present, is processed
after the texture view component within the same view component.
...
