ISO/IEC 14496-15:2004/FDAM 3
Technologies de l'information — Codage des objets audiovisuels — Partie 15: Format de fichier de codage vidéo avancé (AVC) — Amendement 3: Support de format de fichier pour codage vidéo multivues
ISO/IEC 14496-15:2004/FDAM 3 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Coding of audio-visual objects - Part 15: Advanced Video Coding (AVC) file format - Amendment 3: File format support for Multiview Video Coding".
ISO/IEC 14496-15:2004/FDAM 3 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 14496-15:2004/FDAM 3 has the following relationships with other standards: It is inter standard links to ISO/IEC 14496-15:2004, ISO/IEC 14496-15:2010; is excused to ISO/IEC 14496-15:2004. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
FINAL DRAFT AMENDMENT
ISO/IEC 14496-15:2004/FDAM 3
ISO/IEC JTC 1
Secretariat: ANSI
Voting begins on: 2009-10-16
Voting terminates on: 2009-12-16
Information technology — Coding of audio-visual objects —
Part 15: Advanced Video Coding (AVC) file format
AMENDMENT 3: File format support for Multiview Video Coding
Technologies de l'information — Codage des objets audiovisuels —
Partie 15: Format de fichier de codage vidéo avancé (AVC)
AMENDEMENT 3: Support de format de fichier pour codage vidéo multivues
Please see the administrative notes on page iii
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number
© ISO/IEC 2009
Copyright notice
This ISO document is a Draft International Standard and is copyright-protected by ISO. Except as permitted
under the applicable laws of the user's country, neither this ISO draft nor any extract from it may be
reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic,
photocopying, recording or otherwise, without prior written permission being secured.
Requests for permission to reproduce should be addressed to either ISO at the address below or ISO's
member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Reproduction may be subject to royalty payments or a licensing agreement.
Violators may be prosecuted.
ii © ISO/IEC 2009 – All rights reserved
In accordance with the provisions of Council Resolution 21/1986, this document is circulated in the
English language only.
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 3 to ISO/IEC 14496-15:2004 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Information technology — Coding of audio-visual objects —
Part 15:
Advanced Video Coding (AVC) file format
AMENDMENT 3: File format support for Multiview Video Coding
In the Introduction, replace:
This International Standard defines the storage for both plain AVC and SVC video streams, where ‘plain AVC’
refers to the main part of ISO/IEC 14496-10, not including Annex G (Scalable Video Coding), and SVC refers
to ISO/IEC 14496-10 when the techniques in Annex G (Scalable Video Coding) are in use. Specific
techniques are introduced for the handling of scalable streams, enabling their use, and assisting the extraction
of subsets of scalable streams.
with:
This International Standard defines the storage for plain AVC, SVC, and MVC video streams, where ‘plain
AVC’ refers to the main part of ISO/IEC 14496-10, excluding Annex G (Scalable Video Coding) and Annex H
(Multiview Video Coding); SVC refers to ISO/IEC 14496-10 when the techniques in Annex G (Scalable Video
Coding) are in use, and MVC refers to ISO/IEC 14496-10 when the techniques in Annex H (Multiview Video
Coding) are in use. Specific techniques are introduced for handling of scalable and multiview streams,
enabling their use, and assisting the extraction of subsets of scalable and multiview streams.
In Clause 1, Scope, replace:
The file format for storage of SVC content, as defined in Annexes A-E, uses the existing capabilities of the
ISO base media file format and the AVC file format. In addition, the following new extensions to support SVC-
specific features are specified:
with:
The file format for storage of SVC content, as defined in Annexes A-E, and the file format for storage of MVC
content as defined in Annexes B-F, use the existing capabilities of the ISO base media file format and the
AVC file format. In addition, the following new extensions to support SVC-specific features are specified:
and replace:
• AVC Compatibility: A provision for storing an SVC bitstream in an AVC compatible manner, such
that the AVC compatible base layer can be used by any existing AVC file format compliant reader.
with:
• AVC Compatibility: A provision for storing an SVC or MVC bitstream in an AVC compatible manner,
such that the AVC compatible base layer can be used by any existing AVC file format compliant
reader.
In 3.2, replace:
AVC Advanced Video Coding. Where contrasted with SVC in this International Standard, this term
refers to the main part of ISO/IEC 14496-10, not including Annex G (Scalable Video Coding)
with:
AVC Advanced Video Coding. Where contrasted with SVC or MVC in this International Standard, this
term refers to the main part of ISO/IEC 14496-10, including neither Annex G (Scalable Video
Coding) nor Annex H (Multiview Video Coding)
and add the following row to 3.2 before the row ‘NAL:’
MVC Multiview Video Coding. Refers to ISO/IEC 14496-10 when the techniques in Annex H (Multiview
Video Coding) are in use
Replace all of 5.3.4.1.1 with:
Box Types: ‘avc1’, ‘avc2’, ‘avcC’, ‘m4ds’, ‘btrt’
Container: Sample Table Box (‘stbl’)
Mandatory: An ‘avc1’ or ‘avc2’ sample entry is mandatory
Quantity: One or more sample entries may be present
An AVC visual sample entry shall contain an AVC Configuration Box, as defined below. This includes an
AVCDecoderConfigurationRecord, as defined in 5.2.4.1.
An optional MPEG4BitRateBox may be present in the AVC visual sample entry to signal the bit rate
information of the AVC video stream. Extension descriptors that should be inserted into the Elementary
Stream Descriptor, when used in MPEG-4, may also be present.
Multiple sample descriptions may be used, as permitted by the ISO Base Media File Format specification, to
indicate sections of video that use different configurations or parameter sets.
The sample entry name ‘avc1’ may only be used when the entire stream is a compliant and usable AVC
stream as viewed by an AVC decoder operating under the configuration (including profile and level) given in
the AVCConfigurationBox. The file format specific structures that resemble NAL units (see Annex B) may be
present but must not be used to access the AVC base data; that is, the AVC data must not be contained in
Aggregators (though they may be included within the bytes referenced by the additional_bytes field) nor
referenced by Extractors.
The sample entry name ‘avc2’ may only be used when Extractors or Aggregators (Annex B) are required to be
supported, and an appropriate Toolset is required (for example, as indicated by the file-type brands). This
sample entry type indicates that, in order to form the intended AVC stream, Extractors must be replaced with
the data they are referencing, and Aggregators must be examined for contained NAL Units. Tier grouping
may be present.
Add to the end of 5.3.4.1.2:
class AVC2SampleEntry() extends VisualSampleEntry (‘avc2’){
    AVCConfigurationBox avcconfig;
    MPEG4BitRateBox bitrate;               // optional
    MPEG4ExtensionDescriptorsBox descr;    // optional
    extra_boxes boxes;                     // optional
}
Replace A.2.1 with the following:
A.2.1
Aggregator
Aggregators are in-stream structures using a NAL unit header including a NAL unit header extension, with a
NAL unit type equal to 30. Aggregators are used to group NAL units belonging to the same sample.
Replace A.2.6 with the following:
A.2.6
Extractor
Extractors are in-stream structures using a NAL unit header including a NAL unit header extension, with a
NAL unit type equal to 31. Extractors contain instructions on how to extract data from other tracks. Logically
an Extractor can be seen as a ‘link’. While accessing a track containing Extractors, the Extractor is replaced
by the data it is referencing.
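The two definitions above distinguish Aggregators and Extractors only by their NAL unit type (30 and 31). A minimal Python sketch of that check follows. It assumes the standard ISO/IEC 14496-10 layout in which nal_unit_type occupies the low five bits of the first NAL header byte; the function names are illustrative and do not come from this amendment.

```python
AGGREGATOR_NAL_TYPE = 30  # NAL unit type of Aggregators (A.2.1)
EXTRACTOR_NAL_TYPE = 31   # NAL unit type of Extractors (A.2.6)


def nal_unit_type(nal: bytes) -> int:
    """Return the 5-bit nal_unit_type from the first NAL header byte."""
    if not nal:
        raise ValueError("empty NAL unit")
    return nal[0] & 0x1F


def classify(nal: bytes) -> str:
    """Label a NAL unit as an in-stream file format structure or as other data.

    Hypothetical helper for illustration only.
    """
    kind = nal_unit_type(nal)
    if kind == AGGREGATOR_NAL_TYPE:
        return "aggregator"
    if kind == EXTRACTOR_NAL_TYPE:
        return "extractor"
    return "other"
```

A reader of an ‘avc2’ track could use such a check to decide when an Extractor has to be replaced by the data it references.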
Replace A.2.17 with the following:
A.2.17
SVC VCL NAL unit
SVC VCL NAL units follow the definitions in ISO/IEC 14496-10 Annex G for NAL units of type 20 and 14; they
are NAL units with type 20, and NAL units with type 14 when the immediately following NAL units are AVC
VCL NAL units. SVC VCL NAL units do not affect the decoding process of a legacy AVC decoder.
In A.6.3.1.1
change the paragraph:
The sample entry name ‘avc1’ may only be used when the entire stream is a compliant and usable AVC
stream as viewed by an AVC decoder operating under the configuration (including profile and level) given in
the AVCConfigurationBox. The file format specific structures that resemble NAL units may be present but
must not be used to access the AVC base data; that is, the AVC data must not be contained in Aggregators
(though they may be included within the bytes referenced by the additional_bytes field) nor referenced by
Extractors. The sample entry name ‘avc2’ indicates that, in order to form the intended AVC stream, Extractors
must be replaced with the data they are referencing, and Aggregators must be examined for contained NAL
Units. Extractors or aggregators may be used for SVC VCL NAL units in ‘avc1’, ‘avc2’ or ‘svc1’ tracks.
as follows:
Extractors or aggregators may be used for SVC VCL NAL units in ‘avc1’, ‘avc2’ or ‘svc1’ tracks. The
‘extra_boxes’ in an ‘avc2’ sample entry may be an SVCConfigurationBox, ScalabilityInformationSEIBox,
SVCPriorityAssignmentBox or other extension boxes.
remove this row from the Table:
‘avc2’ | AVC and SVC Configurations | An SVC track with both AVC NAL units and SVC NAL units; Extractors may be present and used to reference both AVC and SVC NAL units; Aggregators may be present to contain and reference both AVC and SVC NAL units; Tier grouping may be present.
and add to A.6.3.1.1 before the paragraph “The following table…”:
The parameter sets required to decode a NAL unit that is present in the sample data of a video stream, either
directly or by reference from an Extractor, shall be present in the decoder configuration of that video stream or
in the associated parameter set stream (if used).
In A.6.3.1.2 replace:
class AVC2SampleEntry() extends VisualSampleEntry (‘avc2’){
    AVCConfigurationBox avcconfig;
    SVCConfigurationBox svcconfig;               // optional
    MPEG4BitRateBox bitrate;                     // optional
    MPEG4ExtensionDescriptorsBox descr;          // optional
    ScalabilityInformationSEIBox scalability;    // optional
    SVCPriorityAssignmentBox method;             // optional
}
with:
class AVC2SVCSampleEntry() extends VisualSampleEntry (‘avc2’){
    AVCConfigurationBox avcconfig;
    SVCConfigurationBox svcconfig;               // optional
    MPEG4BitRateBox bitrate;                     // optional
    MPEG4ExtensionDescriptorsBox descr;          // optional
    ScalabilityInformationSEIBox scalability;    // optional
    SVCPriorityAssignmentBox method;             // optional
}
In the title of Annex B, replace:
SVC-file-format-specific in-stream structures
with:
In-stream structures specific to SVC and MVC file formats
In B.1, Introduction, replace:
Aggregators and Extractors use the NAL unit structure. These structures are seen as NAL units in the context
of the sample structure.
with:
Aggregators and Extractors use the NAL unit syntax. These structures are seen as NAL units in the context of
the sample structure.
In B.2.1, Definition (Aggregators), replace:
Aggregators are used to group NAL units belonging to the same sample. Aggregators use the same NAL unit
header as SVC VCL NAL units of type 20 but with a different value of NAL unit type.
Aggregators can both aggregate, by inclusion, NAL units within them (within the size indicated by their length)
and also aggregate, by reference, NAL units that follow them (within the area indicated by the additional_bytes
field within them), When the stream is scanned by an AVC file reader, only the included NAL units are seen as
“within” the aggregator; this permits, for example, an AVC file reader to skip a whole set of un-needed SVC
VCL NAL units. Similarly if AVC NAL units are aggregated by reference, the AVC reader will not skip them
and they remain in-stream for that reader.
with:
Aggregators are used to group NAL units belonging to the same sample. Aggregators use the same NAL unit
header as SVC VCL NAL units or MVC VCL NAL units, but with a different value of NAL unit type. When the
svc_extension_flag of the NAL unit syntax (specified in 7.3.1 of ISO/IEC 14496-10) of an aggregator is equal
to 1, the NAL unit header of SVC VCL NAL units is used for the aggregator. Otherwise, the NAL unit header of
MVC VCL NAL units is used for the aggregator.
Aggregators can both aggregate, by inclusion, NAL units within them (within the size indicated by their length)
and also aggregate, by reference, NAL units that follow them (within the area indicated by the additional_bytes
field within them). When the stream is scanned by an AVC file reader, only the included NAL units are seen as
“within” the aggregator. This permits an AVC file reader to skip a whole set of un-needed SVC VCL NAL units
or MVC VCL NAL units when they are aggregated by inclusion. This also permits an AVC reader not to skip
AVC NAL units but let them remain in-stream when they are aggregated by reference.
Aggregators can be used to group AVC base view NAL units. If these Aggregators are used in an ‘avc1’ track
then an aggregator shall not use inclusion but reference of AVC base view NAL units (the length of the
Aggregator includes only its header and the NAL units referenced by the Aggregator are specified by
additional_bytes).
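The difference between aggregation by inclusion and by reference can be illustrated from the point of view of a reader that does not understand Aggregators. The Python sketch below rests on several assumptions not stated in this amendment: NAL units are stored with a 4-byte big-endian length prefix, the reader simply drops top-level NAL units of type 30 or 31, and the Aggregator payload is treated as opaque placeholder bytes rather than parsed. All function names are hypothetical.

```python
AGGREGATOR = 30  # NAL unit type of Aggregators
EXTRACTOR = 31   # NAL unit type of Extractors


def split_nal_units(sample: bytes, length_size: int = 4) -> list:
    """Split a sample into its top-level length-prefixed NAL units."""
    units, pos = [], 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated length field")
        size = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if size == 0 or pos + size > len(sample):
            raise ValueError("invalid NAL unit length")
        units.append(sample[pos:pos + size])
        pos += size
    return units


def avc_reader_view(sample: bytes, length_size: int = 4) -> list:
    """NAL unit types left after dropping top-level Aggregators and Extractors."""
    types = [unit[0] & 0x1F for unit in split_nal_units(sample, length_size)]
    return [t for t in types if t not in (AGGREGATOR, EXTRACTOR)]


def prefixed(nal: bytes, length_size: int = 4) -> bytes:
    """Prepend the length field to one NAL unit (test helper)."""
    return len(nal).to_bytes(length_size, "big") + nal


# By inclusion: the aggregated data sits inside the Aggregator's length,
# so the whole group is skipped together with the Aggregator.
by_inclusion = prefixed(bytes([0x7E]) + bytes(10)) + prefixed(bytes([0x41, 0x9A]))

# By reference: the Aggregator's length covers only its own fields, so the
# referenced NAL unit (type 5 here) stays visible in-stream.
by_reference = prefixed(bytes([0x7E]) + bytes(7)) + prefixed(bytes([0x65, 0x88]))

print(avc_reader_view(by_inclusion))  # [1]
print(avc_reader_view(by_reference))  # [5]
```

This is why an ‘avc1’ track may only reference AVC base view NAL units: data placed inside the Aggregator's length would vanish for such a reader.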
In B.2.3, Semantics (Aggregators), replace:
NALUnitHeader(): The NAL unit structure as specified in ISO/IEC 14496-10 Annex G for NAL units of
type 20:
nal_unit_type shall be set to the aggregator NAL unit type (type 30).
forbidden_zero_bit, reserved_one_bit, and reserved_three_2bits shall be set as specified in
ISO/IEC 14496-10 Annex G.
Other fields (nal_ref_idc, idr_flag, priority_id, no_inter_layer_pred_flag,
dependency_id, quality_id, temporal_id, use_ref_base_pic_flag,
discardable_flag, and output_flag) shall be set as specified in B.4.
with:
NALUnitHeader(): the first four bytes of SVC and MVC VCL NAL units.
nal_unit_type shall be set to the aggregator NAL unit type (type 30).
For an aggregator including or referencing SVC NAL units, the following shall apply.
forbidden_zero_bit and reserved_three_2bits shall be set as specified in
ISO/IEC 14496-10.
Other fields (nal_ref_idc, idr_flag, priority_id, no_inter_layer_pred_flag,
dependency_id, quality_id, temporal_id, use_ref_base_pic_flag,
discardable_flag, and output_flag) shall be set as specified in B.4.
For an aggregator including or referencing MVC NAL units, the following shall apply.
forbidden_zero_bit and reserved_one_bit shall be set as specified in ISO/IEC 14496-10.
Other fields (nal_ref_idc, non_idr_flag, priority_id, view_id, temporal_id,
anchor_pic_flag, and inter_view_flag) shall be set as specified in B.5.
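As an illustration of the four-byte header referred to above, the following Python sketch unpacks the fields named in the semantics. The bit widths and ordering are an assumption based on the NAL unit header SVC and MVC extensions of ISO/IEC 14496-10 and are not given in this amendment, so they should be checked against that specification. The function name is hypothetical.

```python
def parse_nal_header_ext(header: bytes) -> dict:
    """Unpack a 4-byte NAL unit header with SVC or MVC header extension."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    first = header[0]
    ext = int.from_bytes(header[1:4], "big")  # 24 extension bits
    fields = {
        "forbidden_zero_bit": first >> 7,
        "nal_ref_idc": (first >> 5) & 0x3,
        "nal_unit_type": first & 0x1F,
        "svc_extension_flag": ext >> 23,
    }
    if fields["svc_extension_flag"] == 1:
        # SVC layout (assumed): 1+6+1+3+4+3+1+1+1+2 bits after the flag
        fields.update(
            idr_flag=(ext >> 22) & 0x1,
            priority_id=(ext >> 16) & 0x3F,
            no_inter_layer_pred_flag=(ext >> 15) & 0x1,
            dependency_id=(ext >> 12) & 0x7,
            quality_id=(ext >> 8) & 0xF,
            temporal_id=(ext >> 5) & 0x7,
            use_ref_base_pic_flag=(ext >> 4) & 0x1,
            discardable_flag=(ext >> 3) & 0x1,
            output_flag=(ext >> 2) & 0x1,
            reserved_three_2bits=ext & 0x3,
        )
    else:
        # MVC layout (assumed): 1+6+10+3+1+1+1 bits after the flag
        fields.update(
            non_idr_flag=(ext >> 22) & 0x1,
            priority_id=(ext >> 16) & 0x3F,
            view_id=(ext >> 6) & 0x3FF,
            temporal_id=(ext >> 3) & 0x7,
            anchor_pic_flag=(ext >> 2) & 0x1,
            inter_view_flag=(ext >> 1) & 0x1,
            reserved_one_bit=ext & 0x1,
        )
    return fields
```

Branching on svc_extension_flag mirrors the rule in B.2.1: a value of 1 selects the SVC header form, any other value the MVC form.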
In B.3.1 change the first sentence as follows:
This Subclause describes Extractors, which enable compact formation of tracks that extract, by reference,
NAL unit data from other tracks.
In B.3.3, Semantics (Extractor), replace:
NALUnitHeader(): The NAL unit structure as specified in ISO/IEC 14496-10 Annex G for NAL units of
type 20:
nal_unit_type shall be set to the extractor NAL unit type (type 31).
forbidden_zero_bit, reserved_one_bit, and reserved_three_2bits shall be set as specified in
ISO/IEC 14496-10 Annex G.
Other fields (nal_ref_idc, idr_flag, priority_id, no_inter_layer_pred_flag,
dependency_id, quality_id, temporal_id, use_ref_base_pic_flag,
discardable_flag, and output_flag) shall be set as specified in B.4.
with:
NALUnitHeader(): the first four bytes of SVC and MVC VCL NAL units.
nal_unit_type shall be set to the extractor NAL unit type (type 31).
For an extractor referencing SVC NAL units, the following shall apply.
forbidden_zero_bit and reserved_three_2bits shall be set as specified in
ISO/IEC 14496-10.
Other fields (nal_ref_idc, idr_flag, priority_id, no_inter_layer_pred_flag,
dependency_id, quality_id, temporal_id, use_ref_base_pic_flag,
discardable_flag, and output_flag) shall be set as specified in B.4.
For an extractor referencing MVC NAL units, the following shall apply.
forbidden_zero_bit and reserved_one_bit shall be set as specified in ISO/IEC 14496-10.
Other fields (nal_ref_idc, non_idr_flag, priority_id, view_id, temporal_id,
anchor_pic_flag, and inter_view_flag) shall be set as specified in B.5.
In the title of Annex B.4, replace:
NAL unit header values
with:
NAL unit header values for SVC
In B.4, replace the following paragraph:
Aggregators can be used to group AVC base layer NAL units. If these Aggregators are used in an ‘avc1’ track
then an aggregator shall not use inclusion but reference of AVC base layer NAL units (the length of the
Aggregator includes only its header and the NAL units referenced by the Aggregator are specified by
additional_bytes).
with:
The fields below shall take the following values:
After B.4, add B.5:
B.5 NAL unit header values for MVC
Both Aggregators and Extractors use NAL unit headers with the NAL unit header extension. The NAL units
extracted by an extractor or aggregated by an aggregator are all those NAL units that are referenced or
included by recursively inspecting the contents of aggregator or extractor NAL units.
The fields nal_ref_idc, non_idr_flag, priority_id, view_id, temporal_id,
anchor_pic_flag, and inter_view_flag shall take the following values:
nal_ref_idc shall be set to the highest value of the field in all the aggregated or extracted NAL units.
non_idr_flag shall be set to the lowest value of the field in all the aggregated or extracted NAL units.
priority_id and temporal_id shall be set to the lowest values of the fields, respectively, in all the
aggregated or extracted NAL units.
view_id shall be set to the view_id value of the VCL NAL unit with the lowest view order index among all
the aggregated or extracted VCL NAL units.
anchor_pic_flag and inter_view_flag shall be set to the highest value of the fields, respectively,
in all the aggregated or extracted VCL NAL units.
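The rules of B.5 amount to taking a minimum or maximum of each field over the aggregated or extracted NAL units. The Python sketch below shows one way to express them. It assumes the header fields of each NAL unit have already been parsed into dictionaries, that every unit carries all listed fields plus a hypothetical is_vcl marker, and that the mapping from view_id to view order index is supplied by the caller; none of these names come from this amendment.

```python
def mvc_aggregate_header(units: list, view_order_index: dict) -> dict:
    """Derive Aggregator/Extractor header values for MVC following B.5.

    units: parsed header fields of all aggregated or extracted NAL units.
    view_order_index: maps view_id to its view order index (assumed input).
    """
    if not units:
        raise ValueError("no NAL units given")
    vcl = [u for u in units if u["is_vcl"]]
    if not vcl:
        raise ValueError("view_id and the two flags need a VCL NAL unit")
    # VCL NAL unit whose view has the lowest view order index
    first_view = min(vcl, key=lambda u: view_order_index[u["view_id"]])
    return {
        "nal_ref_idc": max(u["nal_ref_idc"] for u in units),
        "non_idr_flag": min(u["non_idr_flag"] for u in units),
        "priority_id": min(u["priority_id"] for u in units),
        "temporal_id": min(u["temporal_id"] for u in units),
        "view_id": first_view["view_id"],
        "anchor_pic_flag": max(u["anchor_pic_flag"] for u in vcl),
        "inter_view_flag": max(u["inter_view_flag"] for u in vcl),
    }
```

Note that view_id is chosen by view order index rather than by numeric value, so the view with the smallest view_id is not necessarily the one selected.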
In the title of Annex C, replace:
SVC sample group definitions
with:
SVC and MVC sample group definitions
In C.1, Introduction, replace:
The following sample groups may be used in an SVC track to document the structure of the SVC stream and
to ease obtaining information of subsets of the stream and extraction of any of the subsets.
There are a number of boxes, defined below, which may occur in the sample group description, namely the
Scalable Group Entry.
Each Scalable Group Entry documents a subset of the SVC stream. Each of the subsets is associated with a
tier and may contain one or more operating points. These entries are defined using a grouping type of ‘scif’.
NOTE For each tier, there may be more than one ScalableGroupEntry in the SampleGroupDescriptionBox of
grouping type ‘scif’. Only one of those entries is the primary definition of the tier.
Though the Scalable Group Entries are contained in the SampleGroupDescription box, the grouping is not a
true sample grouping as each sample may be associated with more than one scalable group, as these groups
are used to describe sections of the samples – the NAL units. As a result, it is possible that there may not be a
SampleToGroup box of the grouping type 'scif', unless it happens that a group does, in fact, describe a
whole sample. Even if a SampleToGroup box of the grouping type 'scif' is present, the information is not
needed for extraction of NAL units of tiers; instead, the map groups must always document the ‘pattern’ of
NAL units within the samples and provide the NAL-unit-to-tier mapping information that may be needed for
extraction of NAL units.
with:
The following sample groups may be used in an SVC or MVC track to document the structure of the SVC or
MVC stream and to ease obtaining information of subsets of the stream and extraction of any of the subsets.
If views from the same MVC bitstream are stored in multiple MVC tracks and one or more of these tracks
contain multiple views, sample group entries and map groups can be used for these tracks containing multiple
views.
There are a number of boxes, defined below, which may occur in the sample group description, namely the
Scalable Group Entry for an SVC stream or the Multiview Group Entry for an MVC stream.
Each Scalable Group Entry or Multiview Group Entry documents a subset of the SVC stream or the MVC
stream, respectively. Each of the subsets is associated with a tier and may contain one or more operating
points. A grouping type of ‘scif’ or ‘mvif’ is used to define Scalable Group Entries or Multiview Group
Entries, respectively.
For each tier, there may be more than one Scalable Group Entry or Multiview Group Entry in the
SampleGroupDescriptionBox of grouping type ‘scif’ or ‘mvif’, respectively. Only one of those entries is the
primary definition of the tier.
Though the Scalable and Multiview Group Entries are contained in the SampleGroupDescription box, the
grouping is not a true sample grouping as each sample may be associated with more than one tier, as these
groups are used to describe sections of the samples – the NAL units. As a result, it is possible that there may
not be a SampleToGroup box of the grouping type 'scif' or ‘mvif’, unless it happens that a group does, in
fact, describe a whole sample. Even if a SampleToGroup box of the grouping type 'scif' or ‘mvif’ is present,
the information is not needed for extraction of NAL units of tiers; instead, the map groups must always
document the ‘pattern’ of NAL units within the samples and provide the NAL-unit-to-tier mapping information
that may be needed for extraction of NAL units.
A multiview group specifies an MVC operating point and is therefore associated with the target output views of
the MVC operating point. The Multiview Group box, defined in F.8.3, is used to specify a multiview group.
Many of the boxes used to characterize SVC and MVC tiers are also used to characterize MVC operating
points and can therefore be contained in the Multiview Group box too.
In C.2.1.1, Definition (Tier information box), replace:
Box Types: ‘tiri’
Container: ScalableGroupEntry
Mandatory: Yes
Quantity: Zero or One // depends on primary_definition
The tier information box provides information about the profile, level, frame size, discardability, and frame-rate
of a tier.
with:
Box Type: ‘tiri’
Container: ScalableGroupEntry or MultiviewGroupEntry or MultiviewGroupBox
Mandatory: Yes
Quantity: Zero or One // depends on primary_definition
The tier information box provides information about the profile, level, frame size, discardability, and frame-rate
of a covered bitstream subset. If the Tier Information box is included in a Scalable Group entry or a Multiview
Group entry, the covered bitstream subset consists of the tier and tiers it depends upon. If the Tier Information
box is included in a Multiview Group box, the covered bitstream subset consists of the target output views of
the multiview group and all the views required for decoding the target output views.
In C.2.1.3, Semantics (Tier information box), replace:
tierID gives the identifier of the tier.
profileIndication contains the profile_idc as defined in ISO/IEC 14496-10, when the parameter
applies to the bitstream subset consisting of the tier and all the dependent tiers.
profile_compatibility is a byte defined exactly the same as the byte which occurs between the
profile_idc and level_idc in a sequence parameter set or a subset sequence parameter set, as defined
in ISO/IEC 14496-10 Annex G, when the parameters apply to the bitstream subset consisting of the
tier and all the dependent tiers.
levelIndication contains the level_idc as defined in ISO/IEC 14496-10, when the parameter applies
to the bitstream subset consisting of the tier and all the dependent tiers.
The profile, profile compatibility flags and level indicated by the fields profileIndication,
profile_compatibility, and levelIndication specifies an interoperability point with which
the bitstream obtained from the particular tier and all the dependent tiers is compatible.
visualWidth gives the value of the width of the coded picture or coded sub-picture in luma pixels of the
representation of this tier in the SVC stream. A coded sub-picture consists of a proper subset of
coded slices of a coded picture. A tier may consist of only sub-pictures. In this case, the tier is referred
to as a sub-picture tier. A sub-picture tier may represent a region-of-interest part of the region
represented by the entire stream.
NOTE The tier representation of a sub-picture tier might not be a valid SVC stream. One example is as
follows. An AVC bitstream is encoded using two slice groups. The first slice group includes the macroblocks
representing a region-of-interest and is coded without referring to slices in the other slice group for inter
prediction over all the access units. The slices of the first slice group in each access unit then form a sub-picture
and a sub-picture tier can be specified to include all the sub-pictures over all the access units.
visualHeight gives the value of the height of the coded picture or coded sub-pictures in luma pixels of
the representation this tier in the SVC stream.
discardable takes one of the following values; the value 02 is reserved.
00 this tier does not contain NAL units with discardable_flag equal to 1.
01 this tier contains both NAL units with discardable_flag equal to 1 and discardable_flag equal to 0.
03 all NAL units in this tier are with discardable_flag equal to 1.
constantFrameRate specifies if the frame rate of this tier is constant. A value of 0 denotes a non-
constant frame rate, a value of 1 denotes a constant frame rate and a value of 2 denotes that it is not
clear whether the frame rate is constant. A value of 3 is reserved.
frameRate gives the frame rate when the bitstream corresponding to this tier and all the lower tiers that
this tier depends on is decoded. If constantFrameRate has a value of 0 or 2 then frameRate
gives the average frame rate. If constantFrameRate has a value of 1 then frameRate gives the
constant frame rate. frameRate equal to 0 indicates an unspecified frame rate.
with:
tierID gives the identifier of the tier, when the Tier Information box is included in a Scalable Group entry
or a Multiview Group entry. Otherwise, the semantics of tierID are unspecified, and in this case,
tierID must be set to the reserved value 0.
profileIndication contains the profile_idc as defined in ISO/IEC 14496-10, when the parameter
applies to the covered bitstream subset.
profile_compatibility is a byte defined exactly the same as the byte which occurs between the
profile_idc and level_idc in a sequence parameter set or a subset sequence parameter set, as defined
in ISO/IEC 14496-10 Annex G or Annex H, when the parameters apply to the covered bitstream
subset.
levelIndication contains the level_idc as defined in ISO/IEC 14496-10, when the parameter applies
to the covered bitstream subset. If the Tier Information Box is included in a Multiview Group Entry,
levelIndication shall be valid when all the views of the covered bitstream subset are target
output views. If the Tier Information Box is included in a Multiview Group Box, levelIndication
shall be valid when the views specified by the respective multiview group are the target output views.
If levelIndication is equal to 0 for an MVC stream, the level that applies to the covered bitstream
subset and operating with all the views being target output views is unspecified.
The profile, profile compatibility flags and level indicated by the fields profileIndication,
profile_compatibility, and levelIndication specify an interoperability point with which
the covered bitstream subset, and, for MVC, operating with the target output views as specified in the
semantics of levelIndication, is compatible.
visualWidth gives the value of the width of the coded picture (of an SVC stream), coded sub-picture
(of an SVC stream), or coded view component (of an MVC stream) in luma pixels of the
representation of this tier in the stream or any view component of the covered bitstream subset. A
coded sub-picture consists of a proper subset of coded slices of a coded picture. A tier may consist of
only sub-pictures. In this case, the tier is referred to as a sub-picture tier. A sub-picture tier may
represent a region-of-interest part of the region represented by the entire stream.
NOTE The tier representation of a sub-picture tier might not be a valid stream. One example is as follows. An AVC
bitstream is encoded using two slice groups. The first slice group includes the macroblocks representing a region-of-
interest and is coded without referring to slices in the other slice group for inter prediction over all the access units. The
slices of the first slice group in each access unit then form a sub-picture and a sub-picture tier can be specified to include
all the sub-pictures over all the access units.
visualHeight gives the value of the height of the coded picture (of an SVC stream), coded sub-picture
(of an SVC stream), or coded view component (of an MVC stream) in luma pixels of the
representation of this tier in the stream or any view component of the covered bitstream subset.
© ISO/IEC 2009 – All rights reserved 9
discardable takes one of the following values; the value 02 is reserved.
00 this tier does not contain NAL units with discardable_flag (for SVC) equal to 1 or inter_view_flag
(for MVC) equal to 0.
01 this tier contains both NAL units with discardable_flag (for SVC) equal to 1 or inter_view_flag (for
MVC) equal to 0 and discardable_flag (for SVC) equal to 0 or inter_view_flag (for MVC) equal to 1.
03 all NAL units in this tier are with discardable_flag (for SVC) equal to 1 or inter_view_flag (for MVC)
equal to 0.
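As an informal illustration of the value assignment above, the following Python sketch derives discardable from the flags of the NAL units mapped to a tier. The NalUnit record and both function names are hypothetical and are not part of the file format; only the mapping of flag combinations to the values 0, 1 and 3 is taken from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class NalUnit:
    """Hypothetical record holding only the flags relevant here."""
    discardable_flag: Optional[int] = None  # used for SVC streams
    inter_view_flag: Optional[int] = None   # used for MVC streams


def is_discardable(nal: NalUnit, is_mvc: bool) -> bool:
    # SVC: discardable_flag equal to 1; MVC: inter_view_flag equal to 0.
    if is_mvc:
        return nal.inter_view_flag == 0
    return nal.discardable_flag == 1


def discardable_value(nal_units: Iterable[NalUnit], is_mvc: bool) -> int:
    """Return 0, 1 or 3; the value 2 is reserved and never returned."""
    flags = [is_discardable(n, is_mvc) for n in nal_units]
    if not flags:
        # An empty tier is not covered by the text; rejecting it is a
        # choice made by this sketch.
        raise ValueError("a tier with no NAL units is outside this sketch")
    if all(flags):
        return 3  # every NAL unit is discardable
    if any(flags):
        return 1  # mixture of discardable and non-discardable NAL units
    return 0      # no discardable NAL units
```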
constantFrameRate specifies if the frame rate of this tier is constant. A value of 0 denotes a non-
constant frame rate, a value of 1 denotes a constant frame rate and a value of 2 denotes that it is not
clear whether the frame rate is constant. A value of 3 is reserved.
frameRate gives the frame rate, in frames per second rounded to the closest integer using the Round
function specified in ISO/IEC 14496-10, when the bitstream corresponding to this tier and all the lower
tiers that this tier depends on is decoded. If constantFrameRate has a value of 0 or 2 then
frameRate gives the average frame rate. If constantFrameRate has a value of 1 then
frameRate gives the constant frame rate. frameRate equal to 0 indicates an unspecified frame rate.
For SVC streams, decoded frames, complementary field pairs and non-paired fields are regarded as
frames when deriving the value of frameRate. For MVC streams, decoded view components of any
single view only are regarded as frames when deriving the value of frameRate, regardless of the
total number of the views, since all output views are required to have simultaneous view components.
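The rounding in the frameRate semantics can be sketched in Python as follows. The sketch assumes the Round function of ISO/IEC 14496-10 is Sign(x) * Floor(Abs(x) + 0.5), which differs from Python's built-in round() for values ending in .5. The function names, and the choice to return 0 when no usable count or duration is available, are assumptions of this sketch.

```python
import math


def h264_round(x: float) -> int:
    """Round(x) = Sign(x) * Floor(Abs(x) + 0.5)."""
    sign = (x > 0) - (x < 0)
    return sign * math.floor(abs(x) + 0.5)


def frame_rate_field(frame_count: int, duration_seconds: float) -> int:
    """Average frames per second, rounded to the closest integer.

    For SVC, frame_count would count decoded frames, complementary field
    pairs and non-paired fields. For MVC, it would count the view
    components of one single view only.
    """
    if frame_count <= 0 or duration_seconds <= 0:
        return 0  # frameRate equal to 0 indicates an unspecified frame rate
    return h264_round(frame_count / duration_seconds)
```

For example, 2997 frames over 100 seconds gives 30, and 25 frames over 2 seconds gives 13, where Python's round(12.5) would give 12.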
In C.2.2.1, Definition (Tier bit rate box), replace:
Box Type: ‘tibr’
Container: ScalableGroupEntry
Mandatory: No
Quantity: Zero or One
The tier bit rate box provides information about the bit rate values of a tier. Two sets of information are
provided: for the tier representation, including all the tiers on which the current tier depends, and for the tier
alone. Similarly, for each set of information, three values are supplied:
a) the lowest long-term average bit rate that this tier could deliver. Let maxDid be the greatest
dependency_id for all NAL units of the tier, and minQid be the least quality_id for all the NAL units of
the tier and having dependency_id equal to maxDid. The following NAL units of this tier are not
considered in calculating this bit rate value: those having dependency_id equal to maxDid and
quality_id greater than minQid.
with:
Box Type: ‘tibr’
Container: ScalableGroupEntry or MultiviewGroupEntry or MultiviewGroupBox
Mandatory: No
Quantity: Zero or One
When included in a Scalable Group entry or a Multiview Group entry, the tier bit rate box provides information
about the bit rate values of a tier. Two sets of information are provided: for the tier representation, including all
the tiers on which the current tier depends, and for the tier alone. Similarly, for each set of information, the
following values are supplied:
a) for SVC streams, the lowest long-term average bit rate that this tier could deliver. Let maxDid be the
greatest dependency_id for all NAL units of the tier, and minQid be the least quality_id for all the NAL
units of the tier and having dependency_id equal to maxDid. The following NAL units of this tier are
not considered in calculating this bit rate value: those having dependency_id equal to maxDid and
quality_id greater than minQid. For MVC streams, the lowest long-term average bit rate that this tier
could deliver is equal to the long-term average bit rate of the tier, when all NAL units of the tier are
considered.
At the end of C.2.2.1 insert:
When included in a Multiview Group box, the tier bit rate box provides information about the bit rate values of
the covered bitstream subset consisting of the target output views indicated by the multiview group and all the
views required for decoding of the target output views. The maximum and long-term average bit rate for the
covered bitstream subset are provided.
In C.2.2.3, Semantics (Tier bit rate box), make the following modifications:
Replace:
baseBitRate gives the lowest long-term average bit rate in bits/second of the stream made from this tier
and the tiers it depends upon over the entire stream, when the Tier Bit Rate box is included in a
Scalable Group entry or a Multiview Group entry.
For SVC streams, baseBitRate is derived as follows. Let maxDid be the greatest dependency_id for
all NAL units of the tier, and minQid be the least quality_id for all NAL units of the tier and having
dependency_id equal to maxDid. The NAL units that are taken into account when calculating this
bit rate value are as follows: 1) all NAL units of the tier except for those having dependency_id
equal to maxDid and quality_id greater than minQid; 2) all NAL units of the lower tiers the current
tier depends on.
For MVC streams, baseBitRate shall be equal to avgBitRate.
maxBitRate gives the maximum bit rate in bits/second of the stream containing all NAL units mapped to
this tier and the tiers it depends upon (when the Tier Bit Rate box is included in a Scalable Group
entry or a Multiview Group entry) or the covered bitstream subset (when the Tier Bit Rate box is
included in a Multiview Group box), over any window of one second. All NAL units in this tier and the
lower tiers this tier depends on are taken into account (when the Tier Bit Rate box is included in a
Scalable Group entry or a Multiview Group entry).
avgBitRate gives the long-term average bit rate in bits/second of the stream containing all NAL units
mapped to this tier and the tiers it depends upon (when the Tier Bit Rate box is included in a Scalable
Group entry or a Multiview Group entry) or the covered bitstream subset (when the Tier Bit Rate box is
included in a Multiview Group box), averaged over the entire stream. All NAL units in this tier and the
lower tiers this tier depends on are taken into account.
tierBaseBitRate, tierMaxBitRate, and tierAvgBitRate are unspecified when the Tier Bit Rate
box is included in a Multiview Group box. Otherwise, tierBaseBitRate, tierMaxBitRate, and
tierAvgBitRate are specified as follows.
tierBaseBitRate gives the lowest long-term average bit rate in bits/second of the stream made from
only this tier over the entire stream. For SVC streams, the set of NAL units that are taken into account
when calculating this bit rate value is the same as for baseBitRate but excluding all NAL units of the
dependent lower tiers. For MVC streams, tierBaseBitRate shall be equal to tierAvgBitRate.
tierMaxBitRate gives the maximum bit rate in bits/second that is provided by only this tier over any
window of one second. All NAL units mapped to this tier are taken into account. All NAL units of the
dependent lower tiers are not considered.
tierAvgBitRate gives the long-term average bit rate in bits/second that is provided by only this tier,
averaged over the entire stream. All NAL units mapped to this tier are taken into account. All NAL
units of the dependent lower tiers are not considered.
with:
baseBitRate gives the lowest long-term average bit rate in bits/second of the stream made from this tier
and the lower tiers this tier depends on over the entire stream.
For SVC streams, baseBitRate is derived as follows. Let maxDid be the greatest dependency_id for
all NAL units of the tier, and minQid be the least quality_id for all NAL units of the tier and having
dependency_id equal to maxDid. The NAL units that are taken into account when calculating this
bit rate value are as follows: 1) all NAL units of the tier except for those having dependency_id
equal to maxDid and quality_id greater than minQid; 2) all NAL units of the lower tiers the current
tier depends on.
For MVC streams, baseBitRate shall be equal to avgBitRate.
maxBitRate gives the maximum bit rate in bits/second of the stream containing all NAL units mapped to
this tier and the lower tiers this tier depends on, over any window of one second. All NAL units in this
tier and the lower tiers this tier depends on are taken into account.
avgBitRate gives the long-term average bit rate in bits/second of the stream containing all NAL units
mapped to this tier and the lower tiers this tier depends on, averaged over the entire stream. All NAL
units in this tier and the lower tiers this tier depends on are taken into account.
tierBaseBitRate gives the lowest long-term average bit rate in bits/second of the stream made from
only this tier over the entire stream. For SVC streams, the set of NAL units that are taken into account
when calculating this bit rate value is the same as for baseBitRate but excluding all NAL units of the
lower tiers this tier depends on. For MVC streams, tierBaseBitRate shall be equal to
tierAvgBitRate.
tierMaxBitRate gives the maximum bit rate in bits/second that is provided by only this tier over any
window of one second. All NAL units mapped to this tier are taken into account. All NAL units of the
lower tiers this tier depends on are not considered.
tierAvgBitRate gives the long-term average bit rate in bits/second that is provided by only this tier,
averaged over the entire stream. All NAL units mapped to this tier are taken into account. All NAL
units of the lower tiers this tier depends on are not considered.
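As an informal illustration of the difference between the maximum and the long-term average bit rates described above, the following Python sketch computes both from a list of timed samples. The (time, size) sample layout, the function names, and the use of half-open one-second windows are assumptions of this sketch; the text does not say which timestamps define the window. Passing only the NAL units of one tier yields the tier-only values; also passing those of the lower tiers it depends on yields the values for the tier representation.

```python
from typing import List, Tuple

# (time in seconds, size in bytes); hypothetical layout for this sketch
Sample = Tuple[float, int]


def avg_bit_rate(samples: List[Sample], duration_seconds: float) -> int:
    """Long-term average in bits/second over the entire stream."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    total_bits = 8 * sum(size for _, size in samples)
    return int(total_bits / duration_seconds)


def max_bit_rate(samples: List[Sample]) -> int:
    """Largest number of bits in any half-open window [t, t + 1 s).

    Only windows that start at a sample time are tried, which is enough
    to find the maximum when each sample is treated as a point in time.
    """
    ordered = sorted(samples)
    best = 0
    window_bits = 0
    end = 0
    for start, (t0, _) in enumerate(ordered):
        # Grow the window to cover every sample with time < t0 + 1.
        while end < len(ordered) and ordered[end][0] < t0 + 1.0:
            window_bits += 8 * ordered[end][1]
            end += 1
        best = max(best, window_bits)
        # Slide the window start past the current sample.
        window_bits -= 8 * ordered[start][1]
    return best
```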
Replace:
tierBaseBitRate gives the lowest long-term average bit rate in bits/second of the stream made from
only this tier over the entire stream. The set of NAL units that are taken into account when calculating
this bit rate value is the same as for baseBitRate but excluding all NAL units of the dependent lower
tiers.
with:
tierBaseBitRate gives the lowest long-term average bit rate in bits/second of the stream made from
only this tier over the entire stream. For SVC streams, the set of NAL units that are taken into account
when calculating this bit rate value is the same as for baseBitRate but excluding all NAL units of the
lower tiers this tier depends on. For MVC streams, tierBaseBitRate shall be equal to
tierAvgBitRate.
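The SVC selection rule for tierBaseBitRate can be sketched in Python as follows. The SvcNal record and the function names are hypothetical; the sketch only mirrors the maxDid/minQid rule stated in the semantics of baseBitRate. For baseBitRate itself, all NAL units of the lower tiers the current tier depends on would be added to the selection. For MVC streams no selection is needed, since tierBaseBitRate is equal to tierAvgBitRate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SvcNal:
    """Hypothetical record: one NAL unit mapped to the current tier."""
    dependency_id: int
    quality_id: int
    size_bytes: int


def tier_base_nal_units(tier_nals: List[SvcNal]) -> List[SvcNal]:
    """NAL units of the tier that count toward tierBaseBitRate (SVC)."""
    if not tier_nals:
        return []
    max_did = max(n.dependency_id for n in tier_nals)
    min_qid = min(n.quality_id for n in tier_nals
                  if n.dependency_id == max_did)
    # Drop only units with dependency_id == maxDid and quality_id > minQid.
    return [n for n in tier_nals
            if not (n.dependency_id == max_did and n.quality_id > min_qid)]


def tier_base_bit_rate(tier_nals: List[SvcNal], duration_seconds: float) -> int:
    """Bits/second contributed by the selected NAL units of this tier."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    bits = 8 * sum(n.size_bytes for n in tier_base_nal_units(tier_nals))
    return int(bits / duration_seconds)
```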
In the title of C.2.3, SVC priority range, replace:
SVC priority range
with:
Priority range
In C.2.3.1, Definition (SVC priority range), replace:
Box Types: ‘svpr’
Container: ScalableGroupEntry
Mandatory: Yes
Quantity: Exactly One
with:
Box Type: ‘svpr’
Container: ScalableGroupEntry or MultiviewGroupEntry
Mandatory: Yes
Quantity: Exactly One
NOTE – this box was previously called SVCPriorityRangeBox.
In C.2.3.2 replace:
class SVCPriorityRangeBox extends Box(‘svpr’) {
unsigned int(2) reserved1 = 0;
unsigned int(6) min_priorityId;
unsigned int(2) reserved2 = 0;
unsigned int(6) max_priorityId;
}
with:
cla
...
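The replacement syntax is cut off in this excerpt, so the following Python sketch reads only the SVCPriorityRangeBox layout quoted above for replacement: two bytes, each holding 2 reserved bits followed by a 6-bit priority identifier. It assumes the compact box header of the ISO base media file format (a 32-bit size followed by a four-character type), and the function name is hypothetical.

```python
import struct
from typing import Tuple


def parse_svpr(box: bytes) -> Tuple[int, int]:
    """Return (min_priorityId, max_priorityId) from a complete 'svpr' box."""
    if len(box) < 10:
        raise ValueError("box too short for an 'svpr' payload")
    size, box_type = struct.unpack(">I4s", box[:8])
    if box_type != b"svpr":
        raise ValueError("not an 'svpr' box")
    if size != 10:
        # Large-size headers and other lengths are not handled here.
        raise ValueError("this sketch handles only the plain 10-byte form")
    first, second = box[8], box[9]
    # Mask off the 2 reserved high bits, keeping the 6-bit identifier.
    return first & 0x3F, second & 0x3F
```

A box built as struct.pack(">I4s", 10, b"svpr") + bytes([0x05, 0x3F]) would be read as a priority range of 5 to 63.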







