ISO/IEC 14496-2:2001
Information technology — Coding of audio-visual objects — Part 2: Visual
Technologies de l'information — Codage des objets audiovisuels — Partie 2: Codage visuel
INTERNATIONAL STANDARD
ISO/IEC 14496-2
Second edition
2001-12-01
Information technology — Coding of audio-visual objects — Part 2: Visual
Technologies de l'information — Codage des objets audiovisuels — Partie 2: Codage visuel
Reference number
ISO/IEC 14496-2:2001(E)
© ISO/IEC 2001
PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not
be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this
file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this
area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters
were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event
that a problem relating to it is found, please inform the Central Secretariat at the address given below.
© ISO/IEC 2001
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body
in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland
Contents
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations and symbols
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bitwise operators
4.5 Conditional operators
4.6 Assignment
4.7 Mnemonics
4.8 Constants
5 Conventions
5.1 Method of describing bitstream syntax
5.2 Definition of functions
5.2.1 Definition of next_bits() function
5.2.2 Definition of bytealigned() function
5.2.3 Definition of nextbits_bytealigned() function
5.2.4 Definition of next_start_code() function
5.2.5 Definition of next_resync_marker() function
5.2.6 Definition of transparent_mb() function
5.2.7 Definition of transparent_block() function
5.2.8 Definition of byte_align_for_upstream() function
5.3 Reserved, forbidden and marker_bit
5.4 Arithmetic precision
6 Visual bitstream syntax and semantics
6.1 Structure of coded visual data
6.1.1 Visual object sequence
6.1.2 Visual object
6.1.3 Video object
6.1.4 Mesh object
6.1.5 FBA object
6.1.6 3D Mesh Object
6.2 Visual bitstream syntax
6.2.1 Start codes
6.2.2 Visual Object Sequence and Visual Object
6.2.3 Video Object Layer
6.2.4 Group of Video Object Plane
6.2.5 Video Object Plane and Video Plane with Short Header
6.2.6 Macroblock
6.2.7 Block
6.2.8 Still Texture Object
6.2.9 Mesh Object
6.2.10 FBA Object
6.2.11 3D Mesh Object
6.2.12 Upstream message
6.3 Visual bitstream semantics
6.3.1 Semantic rules for higher syntactic structures
6.3.2 Visual Object Sequence and Visual Object
6.3.3 Video Object Layer
6.3.4 Group of Video Object Plane
6.3.5 Video Object Plane and Video Plane with Short Header
6.3.6 Macroblock related
6.3.7 Block related
6.3.8 Still texture object
6.3.9 Mesh object
6.3.10 FBA object
6.3.11 3D Mesh Object
6.3.12 Upstream message
7 The visual decoding process
7.1 Video decoding process
7.2 Higher syntactic structures
7.3 VOP reconstruction
7.4 Texture decoding
7.4.1 Variable length decoding
7.4.2 Inverse scan
7.4.3 DC and AC prediction for intra macroblocks
7.4.4 Inverse quantisation
7.4.5 Inverse DCT
7.4.6 Upsampling of the Inverse DCT output for Reduced Resolution VOP
7.5 Shape decoding
7.5.1 Higher syntactic structures
7.5.2 Macroblock decoding
7.5.3 Arithmetic decoding
7.5.4 Spatial scalable binary shape decoding
7.5.5 Grayscale Shape Decoding
7.5.6 Multiple Auxiliary Component Decoding
7.6 Motion compensation decoding
7.6.1 Padding process
7.6.2 Sample interpolation for non-integer motion vectors
7.6.3 General motion vector decoding process
7.6.4 Unrestricted motion compensation
7.6.5 Vector decoding processing and motion-compensation in progressive P- and S(GMC)-VOP
7.6.6 Overlapped motion compensation
7.6.7 Temporal prediction structure
7.6.8 Vector decoding process of non-scalable progressive B-VOPs
7.6.9 Motion compensation in non-scalable progressive B-VOPs
7.6.10 Motion Compensation Decoding of Reduced Resolution VOP
7.7 Interlaced video decoding
7.7.1 Field DCT and DC and AC Prediction
7.7.2 Motion compensation
7.8 Sprite decoding
7.8.1 Higher syntactic structures
7.8.2 Sprite Reconstruction
7.8.3 Low-latency sprite reconstruction
7.8.4 Sprite reference point decoding
7.8.5 Warping
7.8.6 Sample reconstruction
7.8.7 GMC decoding
7.9 Generalized scalable decoding
7.9.1 Temporal scalability
7.9.2 Spatial scalability
7.10 Still texture object decoding
7.10.1 Decoding of the DC subband
7.10.2 ZeroTree Decoding of the Higher Bands
7.10.3 Inverse Quantisation
7.10.4 Still Texture Error Resilience
7.10.5 Wavelet Tiling
7.10.6 Scalable binary shape object decoding
7.11 Mesh object decoding
7.11.1 Mesh geometry decoding
7.11.2 Decoding of mesh motion vectors
7.12 FBA object decoding
7.12.1 Frame based face object decoding
7.12.2 DCT based face object decoding
7.12.3 Decoding of the viseme parameter fap 1
7.12.4 Decoding of the viseme parameter fap 2
7.12.5 Fap masking
7.12.6 Frame Based Body Decoding
7.12.7 DCT based body object decoding
7.13 3D Mesh Object Decoding
7.13.1 Start codes and bit stuffing
7.13.2 The Topological Surgery decoding process
7.13.3 The Forest Split decoding process
7.13.4 Header decoder
7.13.5 partition type
7.13.6 Vertex Graph Decoder
7.13.7 Triangle Tree Decoder
7.13.8 Triangle Data Decoder
7.13.9 Forest Split decoder
7.13.10 Arithmetic decoder
7.14 NEWPRED mode decoding
7.14.1 Decoder Definition
7.14.2 Upstream message
7.15 Output of the decoding process
7.15.1 Video data
7.15.2 2D Mesh data
7.15.3 Face animation parameter data
8 Visual-Systems Composition Issues
8.1 Temporal Scalability Composition
8.2 Sprite Composition
8.3 Mesh Object Composition
8.4 Spatial Scalability composition
9 Profiles and Levels
9.1 Visual Object Types
9.2 Visual Profiles
9.3 Visual Profiles@Levels
9.3.1 Natural Visual
9.3.2 Synthetic Visual
9.3.3 Synthetic/Natural Hybrid Visual
Annex A (normative) Coding transforms
A.1 Discrete cosine transform for video texture
A.2 Discrete wavelet transform for still texture
A.2.1 Adding the mean
A.2.2 Wavelet filter
A.2.3 Symmetric extension
A.2.4 Decomposition level
A.2.5 Shape adaptive wavelet filtering and symmetric extension
A.3 Shape-Adaptive DCT (SA-DCT)
A.3.1 Definition of Forward SA-DCT
A.3.2 Definition of Inverse SA-DCT
A.4 SA-DCT with DC Separation and ΔDC Correction (ΔDC-SA-DCT)
A.4.1 Definition of Forward ΔDC-SA-DCT
A.4.2 Definition of Inverse ΔDC-SA-DCT
Annex B (normative) Variable length codes and arithmetic decoding
B.1 Variable length codes
B.1.1 Macroblock type
B.1.2 Macroblock pattern
B.1.3 Motion vector
B.1.4 DCT coefficients
B.1.5 Shape Coding
B.1.6 Sprite Coding
B.1.7 DCT based facial object decoding
B.1.8 Shape decoding for still texture object
B.2 Arithmetic Decoding
B.2.1 Arithmetic decoding for still texture object
B.2.2 Arithmetic decoding for shape decoding
B.2.3 FBA Object Decoding
Annex C (normative) Face and body object decoding
...