Information technology — Coding of audio-visual objects — Part 10: Advanced video coding

This document specifies advanced video coding for coding of audio-visual objects.

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé

General Information

Status: Published
Publication Date: 10-Jul-2025

ICS: 35.040.40 - Coding of audio, video, multimedia and hypermedia information

Technical Committee: ISO/IEC JTC 1/SC 29 - Coding of audio, picture, multimedia and hypermedia information
Drafting Committee: ISO/IEC JTC 1/SC 29 - Coding of audio, picture, multimedia and hypermedia information

Current Stage: 6060 - International Standard published
Start Date: 11-Jul-2025
Due Date: 24-Jul-2025
Completion Date: 11-Jul-2025

Relations

Revises: ISO/IEC 14496-10:2022 - Information technology — Coding of audio-visual objects — Part 10: Advanced video coding
Effective Date: 12-Aug-2023

Overview

ISO/IEC 14496-10:2025 - "Information technology - Coding of audio-visual objects - Part 10: Advanced video coding" is the eleventh edition of the international standard that specifies advanced video coding for coding audio‑visual objects. The document defines bitstream formats, syntax and semantics, decoding processes, and normative annexes (profiles/levels, byte‑stream, SEI, VUI) to ensure interoperable, high-efficiency video compression and delivery.

Keywords: ISO/IEC 14496-10:2025, advanced video coding standard, video compression, bitstream format

Key Topics and Technical Requirements

The standard covers the full decoder model and coding tools required for compliant implementations. Major technical topics include:

Bitstream and NAL unit formats - definitions for raw byte sequence payloads and network abstraction layers.
Syntax and semantics - detailed tables and descriptions for slice headers, macroblock layers, and RBSP structures.
Decoding process - procedures for NAL parsing, slice decoding, picture order count, and reference picture management.
Prediction methods - comprehensive intra and inter prediction processes (4x4, 8x8, 16x16 modes and chroma prediction).
Transform and quantization - transform coefficient decoding, inverse scanning, scaling, and transform‑bypass cases.
Entropy coding - support and parsing processes for CAVLC and CABAC.
Deblocking filter - normative filtering process applied to reconstructed pictures.
Support for advanced use cases - normative annexes for scalable video coding, multiview video coding, multiview & depth video, and enhanced non‑base view coding.
Profiles and levels; SEI and VUI - constraints, capabilities and supplemental metadata to aid interoperability and usability.
Reference materials - hypothetical reference decoder and byte stream guidance for implementers.
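
As a rough illustration of the bitstream and NAL unit topics listed above, the following Python sketch splits a byte stream on three-byte start code prefixes and decodes the one-byte NAL unit header into its 1-bit, 2-bit and 5-bit fields. The function and class names are invented for this example and are not taken from the standard. The splitter is deliberately simplified: it leaves emulation prevention bytes in place and does not validate the stream.

```python
from typing import Iterator, NamedTuple


class NalHeader(NamedTuple):
    """Fields of the first byte of a NAL unit (names follow the syntax elements)."""
    forbidden_zero_bit: int
    nal_ref_idc: int
    nal_unit_type: int


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first NAL unit byte into its 1-bit, 2-bit and 5-bit fields."""
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


def split_byte_stream(data: bytes) -> Iterator[bytes]:
    """Yield the NAL units found after each 0x000001 start code prefix.

    Simplification (assumption of this sketch): zero bytes that precede the
    next start code are stripped, and emulation prevention bytes are kept.
    """
    start = data.find(b"\x00\x00\x01")
    while start != -1:
        begin = start + 3
        nxt = data.find(b"\x00\x00\x01", begin)
        end = len(data) if nxt == -1 else nxt
        nal = data[begin:end].rstrip(b"\x00")
        if nal:
            yield nal
        start = nxt


# Hand-made sample stream: three tiny NAL units with made-up payload bytes.
sample = (
    b"\x00\x00\x00\x01\x67\x42\x00\x1e"
    b"\x00\x00\x01\x68\xce"
    b"\x00\x00\x00\x01\x65\x88"
)
nal_units = list(split_byte_stream(sample))
nal_types = [parse_nal_header(n[0]).nal_unit_type for n in nal_units]
```

For the sample bytes above, the header values 0x67, 0x68 and 0x65 decode to NAL unit types 7, 8 and 5. A production parser would also remove emulation prevention bytes before interpreting the payload.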

Practical Applications and Who Uses This Standard

ISO/IEC 14496-10:2025 is designed for anyone implementing or using standardized video compression technologies, including:

Codec developers and software vendors building encoders/decoders.
ASIC/SoC and GPU vendors integrating hardware video codecs.
Streaming platforms and content delivery networks ensuring compliant bitstreams.
Broadcast and OTT service providers targeting efficient delivery and interoperability.
Multimedia application developers (conferencing, surveillance, immersive media).
Standards bodies, test labs, and interoperability/QA teams performing compliance testing.
Researchers and educators studying video coding techniques (entropy coding, prediction, multiview).

Keywords: video codec implementation, streaming, broadcasting, hardware encoder

Related Standards

Relevant companion standards and materials often used with ISO/IEC 14496-10:2025 include:

Byte-stream and container specifications for transport (annexed byte stream format)
Metadata and signalling standards (SEI, VUI)
Other ISO/IEC MPEG parts addressing systems, audio, and file formats

Adopting ISO/IEC 14496-10:2025 ensures consistent decoding behavior, efficient compression, and interoperable delivery across devices and services.

Standard

ISO/IEC 14496-10:2025 - Information technology — Coding of audio-visual objects — Part 10: Advanced video coding Released:11. 07. 2025

English language

997 pages

Frequently Asked Questions

What is ISO/IEC 14496-10:2025?

ISO/IEC 14496-10:2025 is an International Standard published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its full title is "Information technology — Coding of audio-visual objects — Part 10: Advanced video coding". It specifies advanced video coding for coding of audio-visual objects.

What is the scope of ISO/IEC 14496-10:2025?

This document specifies advanced video coding for coding of audio-visual objects.

What ICS categories does ISO/IEC 14496-10:2025 belong to?

ISO/IEC 14496-10:2025 is classified under the following ICS (International Classification for Standards) categories: 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

What standards are related to ISO/IEC 14496-10:2025?

ISO/IEC 14496-10:2025 has the following relationship with other standards: it revises ISO/IEC 14496-10:2022. Understanding this relationship helps ensure you are using the most current and applicable version of the standard.

How can I access ISO/IEC 14496-10:2025?

ISO/IEC 14496-10:2025 is available in PDF format for immediate download after purchase. The document can be added to your cart and obtained through the secure checkout process. Digital delivery ensures instant access to the complete standard document.

Standards Content (Sample)

International Standard
ISO/IEC 14496-10
Eleventh edition
2025-07
Information technology — Coding of audio-visual objects — Part 10: Advanced video coding
Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé
Reference number
© ISO/IEC 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2025 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 General terms related to advanced video coding
3.2 Terms related to scalable video coding (Annex F)
3.3 Terms related to multiview video coding (Annex G)
3.4 Terms related to multiview and depth video coding (Annex H)
3.5 Terms related to multiview and depth video with enhanced non-base view coding (Annex I)
4 Abbreviated terms
5 Conventions
5.1 Arithmetic operators
5.2 Logical operators
5.3 Relational operators
5.4 Bit-wise operators
5.5 Assignment operators
5.6 Range notation
5.7 Mathematical functions
5.8 Order of operation precedence
5.9 Variables, syntax elements, and tables
5.10 Text description of logical operations
5.11 Processes
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships
6.1 Bitstream formats
6.2 Source, decoded, and output picture formats
6.3 Spatial subdivision of pictures and slices
6.4 Inverse scanning processes and derivation processes for neighbours
6.4.1 Inverse macroblock scanning process
6.4.2 Inverse macroblock partition and sub-macroblock partition scanning process
6.4.3 Inverse 4x4 luma block scanning process
6.4.4 Inverse 4x4 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.5 Inverse 8x8 luma block scanning process
6.4.6 Inverse 8x8 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.7 Inverse 4x4 chroma block scanning process
6.4.8 Derivation process of the availability for macroblock addresses
6.4.9 Derivation process for neighbouring macroblock addresses and their availability
6.4.10 Derivation process for neighbouring macroblock addresses and their availability in MBAFF frames
6.4.11 Derivation processes for neighbouring macroblocks, blocks, and partitions
6.4.12 Derivation process for neighbouring locations
6.4.13 Derivation processes for block and partition indices
7 Syntax and semantics
7.1 Method of specifying syntax in tabular form
7.2 Specification of syntax functions, categories, and descriptors
7.3 Syntax in tabular form
7.3.1 NAL unit syntax
7.3.2 Raw byte sequence payloads and RBSP trailing bits syntax
7.3.3 Slice header syntax
7.3.4 Slice data syntax
7.3.5 Macroblock layer syntax
7.4 Semantics
7.4.1 NAL unit semantics
7.4.2 Raw byte sequence payloads and RBSP trailing bits semantics
7.4.3 Slice header semantics
7.4.4 Slice data semantics
7.4.5 Macroblock layer semantics
8 Decoding process
8.1 NAL unit decoding process
8.2 Slice decoding process
8.2.1 Decoding process for picture order count
8.2.2 Decoding process for macroblock to slice group map
8.2.3 Decoding process for slice data partitions
8.2.4 Decoding process for reference picture lists construction
8.2.5 Decoded reference picture marking process
8.3 Intra prediction process
8.3.1 Intra_4x4 prediction process for luma samples
8.3.2 Intra_8x8 prediction process for luma samples
8.3.3 Intra_16x16 prediction process for luma samples
8.3.4 Intra prediction process for chroma samples
8.3.5 Sample construction process for I_PCM macroblocks
8.4 Inter prediction process
8.4.1 Derivation process for motion vector components and reference indices
8.4.2 Decoding process for Inter prediction samples
8.4.3 Derivation process for prediction weights
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process
8.5.1 Specification of transform decoding process for 4x4 luma residual blocks
8.5.2 Specification of transform decoding process for luma samples of Intra_16x16 macroblock prediction mode
8.5.3 Specification of transform decoding process for 8x8 luma residual blocks
8.5.4 Specification of transform decoding process for chroma samples
8.5.5 Specification of transform decoding process for chroma samples with ChromaArrayType equal to 3
8.5.6 Inverse scanning process for 4x4 transform coefficients and scaling lists
8.5.7 Inverse scanning process for 8x8 transform coefficients and scaling lists
8.5.8 Derivation process for chroma quantization parameters
8.5.9 Derivation process for scaling functions
8.5.10 Scaling and transformation process for DC transform coefficients for Intra_16x16 macroblock type
8.5.11 Scaling and transformation process for chroma DC transform coefficients
8.5.12 Scaling and transformation process for residual 4x4 blocks
8.5.13 Scaling and transformation process for residual 8x8 blocks
8.5.14 Picture construction process prior to deblocking filter process
8.5.15 Intra residual transform-bypass decoding process
8.6 Decoding process for P macroblocks in SP slices or SI macroblocks
8.6.1 SP decoding process for non-switching pictures
8.6.2 SP and SI slice decoding process for switching pictures
8.7 Deblocking filter process
8.7.1 Filtering process for block edges
8.7.2 Filtering process for a set of samples across a horizontal or vertical block edge
9 Parsing process
9.1 Parsing process for Exp-Golomb codes
9.1.1 Mapping process for signed Exp-Golomb codes
9.1.2 Mapping process for coded block pattern
9.2 CAVLC parsing process for transform coefficient levels
9.2.1 Parsing process for total number of non-zero transform coefficient levels and number of trailing ones
9.2.2 Parsing process for level information
9.2.3 Parsing process for run information
9.2.4 Combining level and run information
9.3 CABAC parsing process for slice data
9.3.1 Initialization process
9.3.2 Binarization process
9.3.3 Decoding process flow
9.3.4 Arithmetic encoding process
Annex A (normative) Profiles and levels
Annex B (normative) Byte stream format
Annex C (normative) Hypothetical reference decoder
Annex D (normative) Supplemental enhancement information
Annex E (normative) Video usability information
Annex F (normative) Scalable video coding
Annex G (normative) Multiview video coding
Annex H (normative) Multiview and depth video coding
Annex I (normative) Multiview and depth video with enhanced non-base view coding
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of
any claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC
had received notice of (a) patent(s) which may be required to implement this document. However,
implementers are cautioned that this may not represent the latest information, which may be obtained
from the patent database available at www.iso.org/patents and https://patents.iec.ch. ISO and IEC shall
not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT),
see www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration
with ITU-T (as ITU-T H.264).
This eleventh edition cancels and replaces the tenth edition (ISO/IEC 14496-10:2022), which has been
technically revised.
The main changes are as follows:
— the addition of support for neural-network post-filter characteristics, neural-network post-filter
activation, phase indication SEI messages specified in Rec. ITU-T H.274 | ISO/IEC 23002-7, and
additional colour type identifiers.
A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
Introduction
0.1 Prologue
As the costs for both processing power and memory have reduced, network support for coded video
data has diversified, and advances in video coding technology have progressed, the need has arisen for
an industry standard for compressed video representation with substantially increased coding
efficiency and enhanced robustness to network environments. Toward these ends the ITU-T Video
Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) formed a Joint
Video Team (JVT) in 2001 for development of a new Recommendation | International Standard. The
standard has since been maintained and enhanced jointly by VCEG and MPEG.
0.2 Purpose
This Recommendation | International Standard was developed in response to the growing need for
higher compression of moving pictures for various applications such as videoconferencing, digital
storage media, television broadcasting, internet streaming, and communication. It is also designed to
enable the use of the coded video representation in a flexible manner for a wide variety of network
environments. The use of this Recommendation | International Standard allows motion video to be
manipulated as a form of computer data and to be stored on various storage media, transmitted and
received over existing and future networks and distributed on existing and future broadcasting
channels.
0.3 Applications
This Recommendation | International Standard is designed to cover a broad range of applications for
video content including but not limited to the following:
- CATV: cable TV on optical networks, copper, etc.
- DBS: direct broadcast satellite video services.
- DSL: digital subscriber line video services.
- DTTB: digital terrestrial television broadcasting.
- ISM: interactive storage media (optical disks, etc.).
- MMM: multimedia mailing.
- MSPN: multimedia services over packet networks.
- RTC: real-time conversational services (videoconferencing, videophone, etc.).
- RVS: remote video surveillance.
- SSM: serial storage media (digital VTR, etc.).
0.4 Publication and versions of this document
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 1 refers to the first approved version of this
Recommendation | International Standard.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 2 refers to the integrated text containing the corrections
specified in the first technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 3 refers to the integrated text containing both the first
technical corrigendum (2004) and the first amendment, which is referred to as the "Fidelity range
extensions".
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 4 refers to the integrated text containing the first
technical corrigendum (2004), the first amendment (the "Fidelity range extensions"), and an additional
technical corrigendum (2005).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 5 refers to the integrated version 4 text with its
specification of the High 4:4:4 profile removed.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 6 refers to the integrated version 5 text after its
amendment to support additional colour space indicators.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 7 refers to the integrated version 6 text after its
amendment to define five new profiles intended primarily for professional applications (the
High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra, CAVLC 4:4:4 Intra, and High 4:4:4 Predictive profiles)
and two new types of supplemental enhancement information (SEI) messages (the post-filter hint SEI
message and the tone mapping information SEI message).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 8 refers to the integrated version 7 text after its
amendment to specify scalable video coding in three profiles (Scalable Baseline, Scalable High, and
Scalable High Intra profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 9 refers to the integrated version 8 text after applying the
corrections specified in a third technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 10 refers to the integrated version 9 text after its
amendment to specify a profile for multiview video coding (the Multiview High profile) and to define
additional SEI messages.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 11 refers to the integrated version 10 text after its
amendment to define a new profile (the Constrained Baseline profile) intended primarily to enable
implementation of decoders supporting only the common subset of capabilities supported in various
previously-specified profiles.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 12 refers to the integrated version 11 text after its
amendment to define a new profile (the Stereo High profile) for two-view video coding with support of
interlaced coding tools and to specify an additional SEI message specified as the frame packing
arrangement SEI message. The changes for versions 11 and 12 were processed as a single amendment
in the ISO/IEC approval process.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 13 refers to the integrated version 12 text with various
minor corrections and clarifications as specified in a fourth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 14 refers to the integrated version 13 text after its
amendment to define a new level (Level 5.2) supporting higher processing rates in terms of maximum
macroblocks per second and a new profile (the Progressive High profile) to enable implementation of
decoders supporting only the frame coding tools of the previously-specified High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 15 refers to the integrated version 14 text with
miscellaneous corrections and clarifications as specified in a fifth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 16 refers to the integrated version 15 text after its
amendment to define three new profiles intended primarily for communication applications (the
Constrained High, Scalable Constrained Baseline, and Scalable Constrained High profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 17 refers to the integrated version 16 text after its
amendment to define additional supplemental enhancement information (SEI) message data, including
the multiview view position SEI message, the display orientation SEI message, and two additional frame
packing arrangement type indication values for the frame packing arrangement SEI message (the 2D
content and tiled arrangement type indication values).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 18 refers to the integrated version 17 text after its
amendment to specify the coding of depth signals, including the specification of an additional profile,
the Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 19 refers to the integrated version 18 text after
incorporating a correction to the sub-bitstream extraction process for multiview video coding.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 20 refers to the integrated version 19 text after its
amendment to specify the combined coding of video view and depth enhancement, including the
specification of an additional profile, the Enhanced Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 21 refers to the integrated version 20 text after its
amendment to specify additional colorimetry identifiers and an additional model type in the tone
mapping information SEI message.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 22 refers to the integrated version 21 text after its
amendment to specify multi-resolution frame-compatible (MFC) enhancement for stereoscopic video
coding, including the specification of an additional profile, the MFC High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 23 refers to the integrated version 22 text after its
amendment to specify multi-resolution frame-compatible (MFC) stereoscopic video with depth maps,
including the specification of an additional profile, the MFC Depth High profile, and the mastering
display colour volume SEI message, additional colour-related video usability information codepoint
identifiers, and miscellaneous minor corrections and clarifications.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 24 refers to the integrated version 23 text after its
amendment to specify additional levels of decoder capability supporting larger picture sizes (Levels 6,
6.1, and 6.2), the green metadata SEI message, the alternative depth information SEI message,
additional colour-related video usability information codepoint identifiers, and miscellaneous minor
corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 25 refers to the integrated version 24 text after its
amendment to specify the Progressive High 10 profile; support for additional colour-related indicators,
including the hybrid log-gamma transfer characteristics indication, the alternative transfer
characteristics SEI message, the ICTCP colour matrix transformation, chromaticity-derived constant
luminance and non-constant luminance colour matrix coefficients, the colour remapping information
SEI message, and miscellaneous minor corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 26 refers to the integrated version 25 text after its
amendment to specify additional SEI messages for ambient viewing environment, content light level
information, content colour volume, equirectangular projection, cubemap projection, sphere rotation,
region-wise packing, omnidirectional viewport, SEI manifest, and SEI prefix indication, and
miscellaneous minor corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 27 refers to the integrated version 26 text after its
amendment to specify additional SEI messages for annotated regions (through referencing to Rec.
ITU-T H.274 | ISO/IEC 23002-7) and shutter interval information, and miscellaneous minor
corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 28 (the current document) refers to the integrated
version 27 text after its amendment to specify additional SEI messages for neural-network post-
filter characteristics, neural-network post-filter activation, and phase indication (through
referencing to Rec. ITU-T H.274 | ISO/IEC 23002-7), additional colour type identifiers, and
miscellaneous minor corrections and clarifications.
This edition corresponds in technical content to the fifteenth edition in ITU-T (approved in August
2024).
0.5 Profiles and levels
This document is designed to be generic in the sense that it serves a wide range of applications, bit
rates, resolutions, qualities, and services. Applications should cover, among other things, digital storage
media, television broadcasting and real-time communications. In the course of creating this document,
various requirements from typical applications have been considered, necessary algorithmic elements
have been developed, and these have been integrated into a single syntax. Hence, this document will
facilitate video data interchange among different applications.
Considering the practicality of implementing the full syntax of this document, however, a limited
number of subsets of the syntax are also stipulated by means of "profiles" and "levels". These and other
related terms are formally defined in Clause 3.
A "profile" is a subset of the entire bitstream syntax that is specified by this document. Within the
bounds imposed by the syntax of a given profile it is still possible to require a very large variation in the
performance of encoders and decoders depending upon the values taken by syntax elements in the
bitstream such as the specified size of the decoded pictures. In many applications, it is currently neither
practical nor economic to implement a decoder capable of dealing with all hypothetical uses of the
syntax within a particular profile.
In order to deal with this problem, "levels" are specified within each profile. A level is a specified set of
constraints imposed on values of the syntax elements in the bitstream. These constraints may be simple
limits on values. Alternatively they may take the form of constraints on arithmetic combinations of
values (e.g., picture width multiplied by picture height multiplied by number of pictures decoded per
second).
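
To make the idea of a constraint on an arithmetic combination of values concrete, the sketch below computes the picture size in 16x16 macroblocks and the resulting macroblock rate, then compares both against a pair of limits. The function names are invented for this example, and the limit values used in the usage lines are illustrative placeholders supplied by the caller. The actual per-level limits are tabulated in Annex A and should be taken from there.

```python
import math
from typing import Tuple


def macroblock_load(width: int, height: int, fps: float) -> Tuple[int, float]:
    """Return (macroblocks per picture, macroblocks per second).

    Pictures are covered by whole 16x16 macroblocks, so each dimension is
    rounded up to the next multiple of 16 luma samples.
    """
    mbs_wide = math.ceil(width / 16)
    mbs_high = math.ceil(height / 16)
    frame_size_mbs = mbs_wide * mbs_high
    return frame_size_mbs, frame_size_mbs * fps


def fits_limits(width: int, height: int, fps: float,
                max_frame_size_mbs: int, max_mbs_per_second: int) -> bool:
    """Check both the per-picture limit and the per-second limit."""
    frame_size_mbs, mbs_per_second = macroblock_load(width, height, fps)
    return (frame_size_mbs <= max_frame_size_mbs
            and mbs_per_second <= max_mbs_per_second)


# Placeholder limits for illustration only; consult Annex A for real values.
EXAMPLE_MAX_FRAME_SIZE = 8192
EXAMPLE_MAX_MB_RATE = 245760

size_1080p, rate_1080p30 = macroblock_load(1920, 1080, 30)
ok_30 = fits_limits(1920, 1080, 30, EXAMPLE_MAX_FRAME_SIZE, EXAMPLE_MAX_MB_RATE)
ok_60 = fits_limits(1920, 1080, 60, EXAMPLE_MAX_FRAME_SIZE, EXAMPLE_MAX_MB_RATE)
```

With these placeholder limits, a 1920x1080 picture occupies 120 x 68 = 8160 macroblocks, so 30 pictures per second stays within the rate limit while 60 pictures per second does not. The same picture size can therefore pass or fail a level depending only on the picture rate.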
Coded video content conforming to this document uses a common syntax. In order to achieve a subset
of the complete syntax, flags, parameters, and other syntax elements are included in the bitstream that
signal the presence or absence of syntactic elements that occur later in the bitstream.
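
The following sketch shows one way a parser might handle a flag that signals whether a later syntax element is present. It pairs a small most-significant-bit-first bit reader with an unsigned Exp-Golomb decoder of the kind covered by Clause 9.1. The element names extra_present_flag, extra_value and always_present are hypothetical and do not appear in the standard; only the general pattern is intended.

```python
class BitReader:
    """Reads bits from a byte string, most significant bit first."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_bits(self, n: int) -> int:
        """Read n bits as an unsigned integer (n may be zero)."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def read_ue(self) -> int:
        """Decode an unsigned Exp-Golomb code.

        Count leading zero bits up to the first 1 bit, then read that many
        further bits; the value is 2**zeros - 1 plus those bits.
        """
        leading_zeros = 0
        while self.read_bits(1) == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.read_bits(leading_zeros)


def parse_example(data: bytes) -> dict:
    """Parse a made-up structure whose second element is optional."""
    reader = BitReader(data)
    out = {"extra_present_flag": reader.read_bits(1)}
    if out["extra_present_flag"]:
        # Only read when the earlier flag says the element was coded.
        out["extra_value"] = reader.read_ue()
    out["always_present"] = reader.read_ue()
    return out


with_extra = parse_example(bytes([0x91, 0x00]))  # bits: 1 00100 010 ...
without_extra = parse_example(bytes([0x30]))     # bits: 0 011 ...
```

In the first call the flag is 1, so the optional element is read before the mandatory one; in the second call the flag is 0 and the reader moves straight to the mandatory element. Reading past the end of the data raises an IndexError in this sketch, where a real decoder would report a bitstream error.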
0.6 Overview of the design characteristics
0.6.1 General
The coded representation specified in the syntax is designed to enable a high compression capability for
a desired image quality. With the exception of the transform bypass mode of operation for lossless
coding in the High 4:4:4 Intra, CAVLC 4:4:4 Intra, and High 4:4:4 Predictive profiles, and the I_PCM
mode of operation in all profiles, the algorithm is typically not lossless, as the exact source sample
values are typically not preserved through the encoding and decoding processes. A number of
techniques may be used to achieve highly efficient compression. Encoding algorithms (not specified in
this document) may select between inter and intra coding for block-shaped regions of each picture.
Inter coding uses motion vectors for block-based inter prediction to exploit temporal statistical
dependencies between different pictures. Intra coding uses various spatial prediction modes to exploit
spatial statistical dependencies in the source signal for a single picture. Motion vectors and intra
prediction modes may be specified for a variety of block sizes in the picture. The prediction residual is
then further compressed using a transform to remove spatial correlation inside the transform block
before it is quantized, producing an irreversible process that typically discards less important visual
information while forming a close approximation to the source samples. Finally, the motion vectors or
intra prediction modes are combined with the quantized transform coefficient information and encoded
using either variable length coding or arithmetic coding.
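
The irreversible step described above can be illustrated with a toy example. The sketch below forms a prediction residual, quantizes it with a uniform step size, and reconstructs the samples. It is not the normative process: it omits the transform and the entropy coding entirely, and the function names and sample values are assumptions made for illustration.

```python
from typing import List


def quantize(residual: int, step: int) -> int:
    """Uniform quantization with rounding to the nearest level."""
    magnitude = (abs(residual) + step // 2) // step
    return magnitude if residual >= 0 else -magnitude


def encode_block(source: List[int], prediction: List[int], step: int) -> List[int]:
    """Quantize the difference between source and predicted samples."""
    return [quantize(s - p, step) for s, p in zip(source, prediction)]


def decode_block(levels: List[int], prediction: List[int], step: int) -> List[int]:
    """Rescale the levels and add them back onto the prediction."""
    return [p + level * step for level, p in zip(levels, prediction)]


source = [52, 55, 61, 66]
prediction = [50, 50, 60, 60]

levels = encode_block(source, prediction, step=4)
reconstructed = decode_block(levels, prediction, step=4)
errors = [r - s for r, s in zip(reconstructed, source)]

# With a step size of 1 nothing is discarded in this toy model.
lossless = decode_block(encode_block(source, prediction, 1), prediction, 1)
```

The reconstructed samples are close to the source but not identical, and the error per sample is bounded by half the step size. A larger step gives smaller level values to code at the cost of a larger reconstruction error.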
Scalable video coding is specified in Annex F allowing the construction of bitstreams that contain sub-
bitstreams that conform to this document. For temporal bitstream scalability, i.e., the presence of a sub-
bitstream with a smaller temporal sampling rate than the bitstream, complete access units are removed
from the bitstream when deriving the sub-bitstream. In this case, high-level syntax and inter prediction
reference pictures in the bitstream are constructed accordingly. For spatial and quality bitstream
scalability, i.e., the presence of a sub-bitstream with lower spatial resolution or quality than the
bitstream, NAL units are removed from the bitstream when deriving the sub-bitstream. In this case,
inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality signal by data of the
lower spatial resolution or quality signal, is typically used for efficient coding. Otherwise, the coding
algorithm as described in the previous paragraph is used.
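Temporal sub-bitstream derivation, as described above, can be sketched informally as a filter over complete access units. The AccessUnit type, its fields, and the layer assignment below are hypothetical stand-ins chosen for illustration; the actual extraction process is the one specified in Annex F.

```python
from dataclasses import dataclass


@dataclass
class AccessUnit:
    """Hypothetical stand-in for one complete access unit."""
    index: int        # illustrative picture index
    temporal_id: int  # assumed temporal layer identifier (0 = base layer)


def extract_temporal_sub_bitstream(access_units, max_temporal_id):
    """Keep only access units at or below the target temporal layer."""
    return [au for au in access_units if au.temporal_id <= max_temporal_id]


# Assumed dyadic hierarchy over eight access units.
layers = [0, 2, 1, 2, 0, 2, 1, 2]
stream = [AccessUnit(i, t) for i, t in enumerate(layers)]

half_rate = extract_temporal_sub_bitstream(stream, 1)
quarter_rate = extract_temporal_sub_bitstream(stream, 0)
```

In this assumed hierarchy, dropping layer 2 leaves four of the eight access units and dropping layers 1 and 2 leaves two. The sketch only works if no retained access unit uses a removed one as an inter prediction reference, which is why the reference pictures have to be constructed accordingly.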
Multiview video coding is specified in Annex G allowing the construction of bitstreams that represent
multiple views. Similar to scalable video coding, bitstreams that represent multiple views may also
contain sub-bitstreams that conform to this document. For temporal bitstream scalability, i.e., the
presence of a sub-bitstream with a smaller temporal sampling rate than the bitstream, complete access
units are removed from the bitstream when deriving the sub-bitstream. In this case, high-level syntax
and inter prediction reference pictures in the bitstream are constructed accordingly. For view bitstream
scalability, i.e., the presence of a sub-bitstream with fewer views than the bitstream, NAL units are
removed from the bitstream when deriving the sub-bitstream. In this case, inter-view prediction, i.e.,
the prediction of one view signal by data of another view signal, is typically used for efficient coding.
Otherwise, the coding algorithm as described in the previous paragraph is used.
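Because of inter-view prediction, a sub-bitstream with fewer views generally has to retain every view that the target views depend on. The following is a minimal sketch of that idea, not the extraction process of Annex G. The dependency table, the tuple representation of NAL units, and the function names are all assumptions for illustration.

```python
def views_required(target_views, depends_on):
    """Return the target views plus every view they depend on, transitively."""
    required = set()
    pending = list(target_views)
    while pending:
        view = pending.pop()
        if view in required:
            continue
        required.add(view)
        pending.extend(depends_on.get(view, ()))
    return required


def extract_views(nal_units, target_views, depends_on):
    """Drop NAL units whose view is not needed to decode the target views."""
    keep = views_required(target_views, depends_on)
    return [nal for nal in nal_units if nal[0] in keep]


# Hypothetical inter-view dependencies: view 1 predicts from view 0,
# view 2 predicts from view 1. View 0 is the base view.
depends_on = {0: [], 1: [0], 2: [1]}

# Each NAL unit is modelled as (view_id, payload_label); assumed layout.
nal_units = [(0, "a"), (1, "b"), (2, "c"), (0, "d"), (1, "e"), (2, "f")]

base_only = extract_views(nal_units, [0], depends_on)
two_views = extract_views(nal_units, [1], depends_on)
```

Requesting only view 1 still keeps the NAL units of view 0, because view 1 is predicted from it under the assumed dependencies.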
An extension of multiview video coding that additionally supports the inclusion of depth maps is
specified in Annex H, allowing the construction of bitstreams that represent multiple views with
corresponding depth views. In a similar manner as with the multiview video coding specified in
Annex G, bitstreams encoded as specified in Annex H may also contain sub-bitstreams that conform to
this document.
A multiview video coding extension with depth information is specified in Annex I. Sub-bitstreams
consisting of a texture base view conform to this document, sub-bitstreams consisting of multiple
texture views may also conform to Annex G of this document, and sub-bitstreams consisting of one or
more texture views and one or more depth views may also conform to Annex H of this document.
Enhanced texture view coding that utilizes the associated depth views and decoding processes for
depth views are specified for this extension.
Rec. ITU-T H.274 | ISO/IEC 23002-7 specifies the syntax and semantics of supplemental enhancement
information (SEI) messages that do not affect the conformance specifications in Annex C and subclauses
F.8, G.8, H.8, and I.8. Among these SEI messages, those for which the syntax and semantics are not
specified in this document may be used together with this document. SEI messages for which the syntax
and semantics are specified in this document may always be used together with this document. For
an SEI message specified in Rec. ITU-T H.274 | ISO/IEC 23002-7 to be usable together with this
document, its SEI payload type value needs to be specified in this document. For example, the SEI
payload type value 202 (for the annotated regions SEI message) is specified in subclause D.1.1, while
the syntax and semantics of that SEI message are specified in Rec. ITU-T H.274 | ISO/IEC 23002-7.
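The following informal sketch shows how a decoder might use the payload type value to decide how an SEI message is handled. It assumes that the input is an RBSP with emulation prevention already removed, and that the payload type and payload size are each coded as a run of 0xFF bytes followed by one final byte. The handler table and all function names are hypothetical.

```python
def read_sei_header(rbsp, pos=0):
    """Read payload type and payload size starting at byte offset pos.

    Assumed coding: each value is a run of 0xFF bytes (each adding 255)
    followed by one final byte below 0xFF.
    """
    def read_value(p):
        value = 0
        while rbsp[p] == 0xFF:
            value += 255
            p += 1
        return value + rbsp[p], p + 1

    payload_type, pos = read_value(pos)
    payload_size, pos = read_value(pos)
    return payload_type, payload_size, pos


# Hypothetical registry of payload types that this decoder interprets.
HANDLERS = {202: "annotated regions (syntax in Rec. ITU-T H.274 | ISO/IEC 23002-7)"}

# Assumed input: payload type 202, payload size 3, then three payload bytes.
data = bytes([202, 3, 0x11, 0x22, 0x33])
payload_type, payload_size, start = read_sei_header(data)
payload = data[start:start + payload_size]
handler = HANDLERS.get(payload_type, "unrecognized; skip payload_size bytes")
```

Because the payload size is read before the payload itself, a decoder that does not recognize a payload type can skip the message without knowing its syntax.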
0.6.2 Predictive coding
Because of the conflicting requirements of random access and highly efficient compression, two main
coding types are specified. Intra coding is done without reference to other pictures. Intra coding may
provide access points to the coded sequence where decoding can begin and continue correctly, but
typically also shows only moderate compression efficiency. Inter coding (predictive or bi-predictive) is
more efficient, using inter prediction of each block of sample values from some previously decoded
picture selected by the encoder. In contrast to some other video coding standards, pictures coded using
bi-predictive inter prediction may also be used as references for inter coding of other pictures.
The application of the three coding types (intra, predictive, and bi-predictive) to pictures in a sequence is flexible, and the order of the
decoding process is generally not the same as the order of the source picture capture process in the
encoder or the output order from the decoder for display. The choice is left to the encoder and will
depend on the requirements of the application. The decoding order is specified such that the decoding
of pictures that use inter-picture prediction follows later in decoding order than other pictures that are
referenced in the decoding process.
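The constraint on decoding order can be illustrated with a small check that every picture follows all of the pictures it references. The picture names, the reference table, and the function name are hypothetical and chosen only to show how decoding order can differ from output order.

```python
def is_valid_decoding_order(decoding_order, references):
    """Check that every picture follows all pictures it references."""
    decoded = set()
    for picture in decoding_order:
        if any(ref not in decoded for ref in references.get(picture, ())):
            return False
        decoded.add(picture)
    return True


# Hypothetical group of pictures, named by coding type and output position.
# B1 and B2 predict from I0 and P3; P3 predicts from I0.
references = {
    "I0": [],
    "B1": ["I0", "P3"],
    "B2": ["I0", "P3"],
    "P3": ["I0"],
}

output_order = ["I0", "B1", "B2", "P3"]
decoding_order = ["I0", "P3", "B1", "B2"]

ok_decoding = is_valid_decoding_order(decoding_order, references)  # True
ok_output = is_valid_decoding_order(output_order, references)      # False
```

In this assumed example, P3 has to be decoded before B1 and B2 even though it is output after them, so the output order itself is not a valid decoding order.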
0.6.3 Coding of progressive and interlaced video
This document specifies a syntax and decoding process for video that originated in either progressive-
scan or interlaced-scan form, which may be mixed together in the same sequence. The two fields of an
interlaced frame are separated in capture time while the two fields of a progressive frame share the
same capture time. Each field may be coded separately or the two fields may be coded together as a
frame. Progressive frames are typically coded as a frame. For interlaced video, the encoder can choose
between frame coding and field coding. Frame coding or field coding can be adaptively selected on a
picture-by-picture basis and also on a more localized basis within a coded frame. Frame coding is
typically preferred when the video scene contains significant detail with limited motion. Field coding
typically works better when there is fast picture-to-picture motion.
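As an informal illustration of the relationship between a frame and its two fields, the sketch below separates a frame into alternating rows and interleaves them again. It assumes the top field holds rows 0, 2, 4, ... and the bottom field holds rows 1, 3, 5, .... The function names and sample values are hypothetical.

```python
def split_into_fields(frame):
    """Return (top_field, bottom_field) from a list of sample rows."""
    top_field = frame[0::2]     # rows 0, 2, 4, ...
    bottom_field = frame[1::2]  # rows 1, 3, 5, ...
    return top_field, bottom_field


def merge_fields(top_field, bottom_field):
    """Interleave two fields of equal height back into one frame."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


# Hypothetical 4-row frame; each row is a list of luma sample values.
frame = [[10, 10], [20, 20], [30, 30], [40, 40]]
top, bottom = split_into_fields(frame)
restored = merge_fields(top, bottom)
```

For interlaced content with fast motion, adjacent rows of the frame come from different capture times, so each field on its own tends to be smoother and easier to predict than the interleaved frame.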
0.6.4 Picture partitioning into macroblocks and smaller partitions
As in previous video coding Recommendations and International Standards, a macroblock, consisting of
a 16x16 block of luma samples and two corresponding blocks of chroma samples, is used as the basic
processing unit of the video decoding process.
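A short sketch of the resulting arithmetic follows, assuming a progressive frame picture whose dimensions are rounded up to whole macroblocks. The chroma block sizes listed for each chroma format, along with all names, are assumptions for illustration rather than text taken from this subclause.

```python
# Assumed chroma block dimensions (width, height) per macroblock
# for common chroma formats.
CHROMA_BLOCK_SIZE = {
    "4:2:0": (8, 8),
    "4:2:2": (8, 16),
    "4:4:4": (16, 16),
}

MB_SIZE = 16  # luma samples per macroblock side


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks covering a luma picture."""
    columns = (width + MB_SIZE - 1) // MB_SIZE  # round up
    rows = (height + MB_SIZE - 1) // MB_SIZE    # round up
    return columns, rows


def samples_per_macroblock(chroma_format):
    """Total luma plus chroma samples carried by one macroblock."""
    chroma_w, chroma_h = CHROMA_BLOCK_SIZE[chroma_format]
    return MB_SIZE * MB_SIZE + 2 * chroma_w * chroma_h


columns, rows = macroblock_grid(1920, 1080)
total = columns * rows
```

For a 1920x1080 luma picture this gives a grid of 120 by 68 macroblocks, 8160 in total, with the last macroblock row extending beyond the 1080 source rows. Under the assumed 4:2:0 sizes each macroblock carries 384 samples.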
A macroblock can be further partitioned for inter prediction. The selection of the size of inter prediction
partitions is a result of a trade-off between the coding gain provided by using motion compensation
with smaller blocks and the quantity of data needed to represent the data for motion compensation. In
this document the inter prediction process can form segmentations for motion representation as small
as 4x4 luma samples in size, using motion vector accuracy of one-quarter of the luma sample grid
spacing displace
...
