Information technology - Scalable compression and coding of continuous-tone still images - Part 8: Lossless and near-lossless coding

ISO/IEC 18477-8:2016 specifies a coding format, referred to as JPEG XT, which is designed primarily for continuous-tone photographic content.

Technologies de l'information — Compression échelonnable et codage d'images plates en ton continu — Partie 8: Titre manque

General Information

Status
Withdrawn
Publication Date
13-Oct-2016
Withdrawal Date
13-Oct-2016
Current Stage
9599 - Withdrawal of International Standard
Start Date
29-May-2020
Completion Date
30-Oct-2025
Ref Project

Relations

Standard
ISO/IEC 18477-8:2016 - Information technology -- Scalable compression and coding of continuous-tone still images
English language
60 pages

Frequently Asked Questions

ISO/IEC 18477-8:2016 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Scalable compression and coding of continuous-tone still images - Part 8: Lossless and near-lossless coding". It specifies a coding format, referred to as JPEG XT, which is designed primarily for continuous-tone photographic content.

ISO/IEC 18477-8:2016 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.30 - Coding of graphical and photographical information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 18477-8:2016 has the following relationships with other standards: it is linked to ISO/IEC 18477-8:2020. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 18477-8
First edition
2016-10-15
Information technology — Scalable
compression and coding of
continuous-tone still images —
Part 8:
Lossless and near-lossless coding
Technologies de l’information — Compression échelonnable et codage
d’images plates en ton continu
© ISO/IEC 2016, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
ii © ISO/IEC 2016 – All rights reserved

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions, symbols and abbreviated terms
3.1 Terms and definitions
3.2 Symbols
3.3 Abbreviated terms
4 Conventions
4.1 Conformance language
4.2 Operators
4.2.1 Arithmetic operators
4.2.2 Logical operators
4.2.3 Relational operators
4.2.4 Precedence order of operators
4.2.5 Mathematical functions
5 General
5.1 General definitions
5.2 Overview of ISO/IEC 18477-8
5.3 Profiles
5.4 Encoder requirements
5.5 Decoder requirements
Annex A (normative) Encoding and decoding process
Annex B (normative) Boxes
Annex C (normative) Multi-component decorrelation transformation
Annex D (normative) Entropy coding of residual data in the DCT-bypass and large range mode
Annex E (normative) Discrete cosine transformation
Annex F (normative) Component upsampling
Annex G (normative) Quantization and noise shaping for the DCT-bypass process
Annex H (normative) Profiles
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity assessment,
as well as information about ISO’s adherence to the World Trade Organization (WTO) principles in the
Technical Barriers to Trade (TBT) see the following URL: www.iso.org/iso/foreword.html.
The committee responsible for this document is ISO/IEC JTC 1, Information technology, SC29, Coding of
audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO 18477 series, published under the general title Information technology —
Scalable compression and coding of continuous-tone still images, can be found on the ISO website.

Introduction
This document specifies a coded codestream format for storage of continuous-tone high and low
dynamic range photographic content. JPEG XT part 8 is a scalable lossy to lossless image coding system
supporting multiple component images consisting of integer samples between 8- and 16-bit resolution,
or floating point samples of 16-bit resolution. It is by itself an extension of ISO/IEC 18477-6 and
ISO/IEC 18477-7, which specify intermediate range and high-dynamic range image decoding algorithms.
Both of these are based on the box-based file format specified in ISO/IEC 18477-3, which is again an
extension of ISO/IEC 18477-1; the codestream is composed in such a way that legacy applications
conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1 are able to reconstruct a lossy, low dynamic range,
8 bits per sample version of the image.
Today, the most widely used digital photography format, a minimal implementation of JPEG (specified
in Rec. ITU-T T.81 | ISO/IEC 10918-1), uses a bit depth of 8; each of the three channels that together
compose an image pixel is represented by 8 bits, providing 256 representable values per channel.
For more demanding applications, it is not uncommon to use a bit depth of 16, providing 65 536
representable values to describe each channel within a pixel, resulting in over 2.8 × 10^14 representable
colour values. In some less common scenarios, even greater bit depths are used, requiring a
floating-point sample representation.
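
The counts quoted in this paragraph follow directly from the bit depth. A minimal sketch of the arithmetic, assuming three channels per pixel (the function names are illustrative and not part of the standard):

```python
def values_per_channel(bit_depth: int) -> int:
    """Number of distinct integer values one channel of this bit depth can hold."""
    return 2 ** bit_depth


def colours_per_pixel(bit_depth: int, channels: int = 3) -> int:
    """Number of distinct colour values for a pixel built from `channels` channels."""
    return values_per_channel(bit_depth) ** channels


for depth in (8, 16):
    # Print bit depth, values per channel and colours per three-channel pixel.
    print(depth, values_per_channel(depth), colours_per_pixel(depth))
```

For a bit depth of 16 the pixel count is 65 536 cubed, which is where the figure of roughly 2.8 × 10^14 comes from.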
Most common photo and image formats use an 8-bit or 16-bit unsigned integer value to represent some
function of the intensity of each colour channel. While it might be theoretically possible to agree on
one method for assigning specific numerical values to real world colours, doing so is not practical.
Since any specific device has its own limited range for colour reproduction, the device’s range may be a
small portion of the agreed-upon universal colour range. As a result, such an approach is an extremely
inefficient use of the available numerical values, especially when using only 8 bits (or 256 unique
values) per channel. To represent pixel values as efficiently as possible, devices use a numeric encoding
optimized for their own range of possible colours or gamut.
This part of JPEG XT is primarily designed to encode intermediate or high dynamic image sample values
without loss, or with a precisely controllable bounded loss using the tools defined in ISO/IEC 18477-1
and some minimal extensions of those tools. The goal is to provide a backwards compatible coding
specification that allows legacy applications and existing toolchains to continue to operate on
codestreams conforming to this document.
JPEG XT has been designed to be backwards compatible to legacy applications while at the same time
having a small coding complexity; JPEG XT uses, whenever possible, functional blocks of Rec. ITU-T T.81
| ISO/IEC 10918-1 to extend the functionality of the legacy JPEG Coding System. It is optimized for
storage and transmission of intermediate and high dynamic range and wide colour gamut 8- to 16-bit
integer or 16-bit floating point images while also enabling low-complexity encoder and decoder
implementations.
This document is an extension of ISO/IEC 18477-1, a compression system for continuous tone digital
still images which is backwards compatible with Rec. ITU-T T.81 | ISO/IEC 10918-1. That is, legacy
applications conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1 will be able to reconstruct streams
generated by an encoder conforming to this document, though will possibly not be able to reconstruct
such streams in full dynamic range, full quality or without loss.
This document is itself based on ISO/IEC 18477-3 that defines a box-based file format similar to
other JPEG standards. It also contains elements of ISO/IEC 18477-6 and ISO/IEC 18477-7. The aim
of this document is to provide a migration path for legacy applications to support lossless coding of
intermediate and high dynamic range images, that is images that are either represented by sample
values requiring 8- to 16-bit precision, or even using 16-bit floating point sample resolution. While Rec.
ITU-T T.81 | ISO/IEC 10918-1 already defines a lossless mode for integer samples, images encoded in
this mode cannot be decoded by applications only supporting the lossy 8-bit-mode; the coding engine
for lossless coding in Rec. ITU-T T.81 | ISO/IEC 10918-1 is completely different from the lossy coding
mode. Unlike the legacy standard, this document defines a lossless scalable coding engine supporting
all bit depths between 8 and 16 bits per sample, including 16-bit floating point samples, while also
staying compatible with legacy applications. Such applications will continue to work, but will only be able
to reconstruct a lossy 8-bit standard low dynamic range (LDR) version of the full image contained in
the codestream. The parts of ISO/IEC 18477 specify a coded file format, referred to as JPEG XT, which is
designed primarily for storage and interchange of continuous-tone photographic content.

INTERNATIONAL STANDARD ISO/IEC 18477-8:2016(E)
Information technology — Scalable compression and
coding of continuous-tone still images —
Part 8:
Lossless and near-lossless coding
1 Scope
This document specifies a coding format, referred to as JPEG XT, which is designed primarily for
continuous-tone photographic content.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 18477-1:2015, Information technology — Scalable compression and coding of continuous-tone still
images — Part 1: Scalable compression and coding of continuous-tone still images
ISO/IEC 18477-3:2015, Information technology — Scalable compression and coding of continuous-tone still
images — Part 3: Box file format
ISO/IEC 18477-6:2016, Information technology — Scalable compression and coding of continuous-tone
still images — Part 6: IDR Integer Coding
ISO/IEC 18477-7:2016, Information technology — Scalable compression and coding of continuous-tone still
images — Part 7: HDR Floating-Point Coding
ITU-T T.81 | ISO/IEC 10918-1, Information technology — Digital compression and coding of continuous
tone still images — Requirements and guidelines
ITU-R BT.601, Studio encoding parameters of digital television for standard 4:3 and wide screen 16:9
aspect ratios
3 Terms, definitions, symbols and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the following definitions apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— IEC Electropedia: available at http://www.electropedia.org/
— ISO Online browsing platform: available at http://www.iso.org/obp
3.1.1
AC coefficient
any DCT coefficient for which the frequency is not zero in at least one dimension
3.1.2
ASCII encoding
encoding of text characters and text strings according to ISO/IEC 10646
3.1.3
base decoding path
process of decoding legacy codestream and refinement data to the base image, jointly with all further
steps until residual data is added to the values obtained from the residual codestream
3.1.4
base image
collection of sample values obtained by entropy decoding the DCT coefficients of the legacy codestream
and the refinement codestream, and inversely DCT transforming them jointly
3.1.5
block
8×8 array of samples or an 8×8 array of DCT coefficient values of one component
3.1.6
box
structured collection of data describing the image or the image decoding process embedded into one or
multiple APP marker segments
Note 1 to entry: See ISO/IEC 18477-3:2015, Annex B for the definition of boxes.
3.1.7
byte
group of 8 bits
3.1.8
coding
encoding or decoding
3.1.9
coding process
general reference to an encoding process, a decoding process, or both
3.1.10
compression
reduction in the number of bits used to represent source image data
3.1.11
component
two-dimensional array of samples having the same designation in the output or display device
Note 1 to entry: An image typically consists of several components, e.g. red, green and blue.
3.1.12
continuous-tone image
image whose components have more than one bit per sample
3.1.13
DC coefficient
DCT coefficient for which the frequency is zero in both dimensions
3.1.14
DCT coefficient
amplitude of a specific cosine basis function – may refer to an original DCT coefficient, to a quantized
DCT coefficient, or to a dequantized DCT coefficient
3.1.15
decoder
embodiment of a decoding process
3.1.16
decoding process
process which takes as its input compressed image data and outputs a continuous-tone image
3.1.17
dequantization
inverse procedure to quantization by which the decoder recovers a representation of the DCT
coefficients
3.1.18
discrete cosine transform
DCT
either the forward discrete cosine transform or the inverse discrete cosine transform
3.1.19
downsampling
procedure by which the spatial resolution of a component is reduced
3.1.20
encoder
embodiment of an encoding process
3.1.21
encoding process
process which takes as its input a continuous-tone image and outputs compressed image data
3.1.22
entropy decoder
embodiment of an entropy decoding procedure
3.1.23
entropy decoding
lossless procedure which recovers the sequence of symbols from the sequence of bits produced by the
entropy encoder
3.1.24
entropy encoder
embodiment of an entropy encoding procedure
3.1.25
entropy encoding
lossless procedure which converts a sequence of input symbols into a sequence of bits such that the
average number of bits per symbol approaches the entropy of the input symbols
3.1.26
extension image
residual image
sample values as reconstructed by inverse quantization and inverse DCT transformation applied to the
entropy-decoded coefficients described by the residual scan and residual refinement scans
[SOURCE: ISO/IEC 18477-6:2016, 3.1.54]
3.1.27
fixed point discrete cosine transformation
implementation of the discrete cosine transformation based on fixed point arithmetic following the
specifications in Annex E
3.1.28
forward DCT bypass
transformation that takes an 8×8 sample block and prepares it for entropy coding without applying a
discrete cosine transformation
3.1.29
forward fixed point DCT
transformation of an 8×8 sample block from the spatial domain to the frequency domain using the fixed
point arithmetic as specified in Annex E
3.1.30
forward integer DCT
transformation of an 8×8 sample block from the spatial domain to the frequency domain using the
integer approximation of the discrete cosine transformation as specified in Annex E
3.1.31
inverse DCT bypass
transformation that takes an 8×8 sample block as generated by entropy decoding and level-shifts it
without applying a discrete cosine transformation
3.1.32
inverse fixed point DCT
transformation of an 8×8 sample block from the frequency domain to the spatial domain using the fixed
point arithmetic as specified in Annex E
3.1.33
inverse integer DCT
the transformation of an 8×8 sample block from the frequency domain to the spatial domain using the
integer approximation of the discrete cosine transformation as specified in Annex E
3.1.34
frequency
two-dimensional index into the two-dimensional array of DCT coefficients
[SOURCE: ISO/IEC 10918-1:1994, 3.1.61]
3.1.35
high dynamic range
HDR
image or image data comprised of more than eight bits per sample
3.1.36
Huffman encoding
entropy encoding procedure which assigns a variable length code to each input symbol
3.1.37
intermediate dynamic range
image or image data comprised of more than eight bits per sample
3.1.38
joint photographic experts group
JPEG
informal name of the committee which created this document
Note 1 to entry: The “joint” comes from the ITU-T and ISO/IEC collaboration.
3.1.39
legacy codestream
collection of markers and syntax elements defined by Rec. ITU-T T.81 | ISO/IEC 10918-1, bar any syntax
elements defined by the ISO/IEC 18477 family of standards, i.e., the legacy codestream consists of the
collection of all markers except those APP markers that describe JPEG XT boxes by the syntax defined
in ISO/IEC 18477-3:2015, Annex A
3.1.40
legacy decoder
embodiment of a decoding process conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1, confined to the
lossy DCT process and the baseline, sequential or progressive modes, decoding at most four components
to eight bits per component
3.1.41
lossless
encoding and decoding processes and procedures in which the output of the decoding procedure(s) is
identical to the input to the encoding procedure(s)
3.1.42
lossless coding
mode of operation which refers to any one of the coding processes defined in ISO/IEC 18477-8 in which
all of the procedures are lossless
3.1.43
lossy
encoding and decoding processes which are not lossless
3.1.44
low-dynamic range
LDR
image or image data comprised of data with no more than eight bits per sample
3.1.45
marker
two-byte code in which the first byte is hexadecimal FF and the second byte is a value between 1 and
hexadecimal FE
3.1.46
marker segment
marker together with its associated set of parameters
3.1.47
noise shaping
signal processing technique that removes quantization noise from the low frequency components and
injects it into the high frequency domain where it can be removed by filtering
3.1.48
pixel
collection of sample values in the spatial image domain that all have the same sample coordinates, e.g. a
pixel may consist of three samples describing its red, green and blue values
3.1.49
point transformation
application of a location independent global function to reconstructed sample values in the spatial domain
3.1.50
precision
number of bits allocated to a particular sample or DCT coefficient
3.1.51
procedure
set of steps which accomplishes one of the tasks which comprise an encoding or decoding process
3.1.52
quantization value
integer value used in the quantization procedure
3.1.53
quantize
act of performing the quantization procedure for a DCT coefficient
3.1.54
residual decoding path
collection of operations applied to the entropy coded data contained in the residual data box and
residual refinement scan boxes up to the point where this data is merged with the legacy data to form
the final output image
3.1.55
residual image
sample values as reconstructed by inverse quantization and inverse DCT transformation applied to the
entropy-decoded coefficients described by the residual scan and residual refinement scans
3.1.56
residual scan
additional pass over the image data invisible to legacy decoders which provides additive and/or
multiplicative correction data of the legacy scans to allow reproduction of high-dynamic range or wide
colour gamut data
3.1.57
refinement scan
additional pass over the image data invisible to legacy decoders which provides additional least
significant bits to extend the precision of the DCT transformed coefficients
Note 1 to entry: Refinement scans can be either applied in the legacy or residual decoding path.
3.1.58
sample
one element in the two-dimensional image array which comprises a component
3.1.59
sample grid
common coordinate system for all samples of an image
Note 1 to entry: The samples at the top left edge of the image have the coordinates (0,0), the first coordinate
increases towards the right, the second towards the bottom.
3.1.60
superbox
box that carries other boxes as payload data
3.1.61
sub box
box that is contained as payload data within a superbox
3.1.62
uniform quantization
procedure by which DCT coefficients are linearly scaled in order to achieve compression
3.1.63
upsampling
procedure by which the spatial resolution of a component is increased
3.2 Symbols
X          width of the sample grid in positions
Y          height of the sample grid in positions
Nf         number of components in an image
s_{i,x}    subsampling factor of component i in horizontal direction
s_{i,y}    subsampling factor of component i in vertical direction
H_i        subsampling indicator of component i in the frame header
V_i        subsampling indicator of component i in the frame header
v_{x,y}    sample value at the sample grid position x,y
R_h        additional number of DCT coefficients bits represented by refinement scans in the base image,
           8+R_h is the number of non-fractional bits (i.e. bits in front of the “binary dot”) of the output of
           the inverse DCT process in the base image
R_r        additional number of DCT coefficients bits represented by refinement scans in the residual, P+R_h
           is the number of non-fractional bits (i.e. bits in front of the “binary dot”) of the output of the
           inverse DCT process in the residual image where P is the bitdepth indicated in the frame header
           of the residual codestream
R_b        additional bits in the HDR image. 8+R_b is the sample precision of the reconstructed HDR image
3.3 Abbreviated terms
ASCII American Standard Code for Information Interchange
LSB least significant bit
MSB most significant bit
TMO tone mapping operator
DCT discrete cosine transformation
FCT fixed point multi-component transformation
ICT irreversible multi-component transformation
RCT reversible multi component transformation
4 Conventions
4.1 Conformance language
The keyword “reserved” indicates a provision that is not specified at this time, shall not be used, and
may be specified in the future. The keyword “forbidden” indicates “reserved” and in addition indicates
that the provision will never be specified in the future.
4.2 Operators
NOTE Many of the operators used in this document are similar to those used in the C programming language.
4.2.1 Arithmetic operators
+       addition
−       subtraction (as a binary operator) or negation (as a unary prefix operator)
*       multiplication
/       division without truncation or rounding
smod    x smod a is the unique value y between −⌈(a−1)/2⌉ and ⌊(a−1)/2⌋
        for which y+N*a = x with a suitable integer N
umod    x umod a is the unique value y between 0 and a−1 for which
        y+N*a = x with a suitable integer N
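
The two modulo operators can be sketched in a few lines. This is an illustration rather than normative text: it assumes a is a positive integer and that, for even a, the signed range carries the extra value on the negative side, as in two's complement arithmetic. The function names simply mirror the operator names.

```python
def umod(x: int, a: int) -> int:
    """Unique y in [0, a-1] such that y + N*a == x for some integer N."""
    # For a > 0, Python's % already yields a non-negative remainder.
    return x % a


def smod(x: int, a: int) -> int:
    """Unique y in [-ceil((a-1)/2), floor((a-1)/2)] such that y + N*a == x."""
    upper = (a - 1) // 2      # floor((a-1)/2)
    y = x % a                 # first reduce into [0, a-1]
    # Values above the upper bound wrap around to the negative half.
    return y - a if y > upper else y


print(umod(-1, 8))            # remainder in the unsigned range
print(smod(32768, 65536))     # wraps to the most negative 16-bit value
```

With a = 65536, smod maps any integer onto the signed 16-bit range, which is the typical use of such an operator in wrap-around arithmetic.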
4.2.2 Logical operators
|| logical OR
&& logical AND
! logical NOT
∈ x ∈ {A, B} is defined as (x == A || x == B)
∉ x ∉ {A, B} is defined as (x != A && x != B)
4.2.3 Relational operators
> greater than
>= greater than or equal to
< less than
<= less than or equal to
== equal to
!= not equal to
4.2.4 Precedence order of operators
Operators are listed below in descending order of precedence. If several operators appear in the same
line, they have equal precedence. When several operators of equal precedence appear at the same level
in an expression, evaluation proceeds according to the associativity of the operator either from right to
left or from left to right.
Operators       Type of operation           Associativity
(), [ ], .      expression                  left to right
−               unary negation
*, /            multiplication              left to right
umod, smod      modulo (remainder)          left to right
+, −            addition and subtraction    left to right
<, >, <=, >=    relational                  left to right
4.2.5 Mathematical functions
⌈x⌉                 ceil of x: returns the smallest integer that is greater than or equal to x
⌊x⌋                 floor of x: returns the largest integer that is less than or equal to x
|x|                 absolute value, is −x for x < 0, otherwise x
sign(x)             sign of x, 0 if x is zero, +1 if x is positive, −1 if x is negative
clamp(x,min,max)    clamps x to the range [min,max]: returns min if x < min, max if x > max or otherwise x
x^a                 raises the value of x to the power of a: x is a non-negative real number, a is a real
                    number; x^a is equal to exp(a*log(x)) where exp is the exponential function and
                    log() the natural logarithm; if x is 0 and a is positive, x^a is defined to be 0
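
As a cross-check of these definitions, the sketch below expresses them in Python. The names sign, clamp and power are illustrative; ceil and floor map onto the standard library. Raising an error for cases the text leaves undefined (negative x, or x equal to 0 with a non-positive exponent) is an assumption of this sketch.

```python
import math


def sign(x: float) -> int:
    """0 if x is zero, +1 if x is positive, -1 if x is negative."""
    return (x > 0) - (x < 0)


def clamp(x, lo, hi):
    """Return lo if x < lo, hi if x > hi, otherwise x."""
    return lo if x < lo else hi if x > hi else x


def power(x: float, a: float) -> float:
    """x raised to a, computed as exp(a*log(x)) for non-negative real x."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        if a > 0:
            return 0.0  # defined to be 0 for positive exponents
        raise ValueError("0 raised to a non-positive power is left undefined here")
    return math.exp(a * math.log(x))


print(math.ceil(-1.5), math.floor(-1.5))   # ceil and floor of a negative value
print(sign(-3), clamp(300, 0, 255), power(0, 2))
```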
5 General
5.1 General definitions
Clause 5 gives an informative overview of the elements specified in this document. It also introduces
many of the terms which are defined in Clause 3. These terms are printed in italics upon first usage in
Clause 5.
There are three elements specified in this document:
a) An encoder is an embodiment of an encoding process. An encoder takes as input digital source image
data and encoder specifications and, by means of a specified set of procedures, generates as output
a codestream.
b) A decoder is an embodiment of a decoding process. A decoder takes as input a codestream and, by
means of a specified set of procedures, generates as output digital reconstructed image data.
c) The codestream is a compressed image data representation, which includes all necessary data to
allow a (full or approximate) reconstruction of the sample values of a digital image. Additional data
might be required that define the interpretation of the sample data, such as colour space or the
spatial dimensions of the samples.
5.2 Overview of ISO/IEC 18477-8
This document allows near-lossless and lossless coding of high and intermediate dynamic range
photographic images in a way that is backwards compatible to Rec. ITU-T T.81 | ISO/IEC 10918-1.
Decoders compliant to the latter standard will be able to parse codestreams conforming to this
document correctly, albeit in less precision, with a limited dynamic range, and potential loss in sample
bit precision.
This document includes multiple tools to reach the above functionality, defined in Annex B and the
following annexes. A short overview of these coding tools is given in Clause 5.
The syntax of an ISO/IEC 18477-8 compliant codestream is specified in ISO/IEC 18477-3, that is, this
document uses a syntax element denoted as “box” to annotate its syntactical elements. The definition of
the box syntax element is not repeated here (refer to ISO/IEC 18477-3). Additional boxes besides those
already specified in ISO/IEC 18477-3 are defined in Annex B. In addition, this document also reuses
boxes defined in ISO/IEC 18477-6:2016, Annex B and ISO/IEC 18477-7:2016, Annex B.
Figure 1 — Overview of the decoding process
To allow lossless and near-lossless coding, this document provides a stricter definition of two elements
of the reconstruction process defined in Rec. ITU-T T.81 | ISO/IEC 10918-1 and ISO/IEC 18477-1. The
DCT process, only loosely defined in an implementation-agnostic way in Rec. ITU-T T.81 | ISO/IEC 10918-1,
is replaced by strictly defined algorithms (specified in Annex E) that a conforming decoder shall
follow. Not following these steps will compromise lossless reconstruction. Annex C replaces the ICT
transformation by a precise fixed-point implementation denoted as FCT, which operates entirely on integer
samples and thus allows a fully reproducible transformation that conforming decoders shall follow as
well. Again, deviating from the specifications of the FCT in Annex C will compromise lossless
coding. The DCT operations in Annex E and the FCT in Annex C are fully backwards compatible to the
DCT in Rec. ITU-T T.81 | ISO/IEC 10918-1 and the ICT in ISO/IEC 18477-1 and approximate them within
the error bounds of Rec. ITU-T T.83 | ISO/IEC 10918-2. Thus, a possible implementation choice for the
ISO/IEC 18477 family of standards is to always use the DCT and/or FCT as specified here, and not to
provide a second implementation based on floating point or other technology.
Lossless coding can be achieved by two alternative mechanisms. First, by applying the Integer DCT
specified in Annex E, and replacing the base transformation by an identity transformation. This coding
mode can only be applied to bit precisions of 8 bits per sample, or up to 12 bits per sample in the
presence of refinement scans already defined in ISO/IEC 18477-6. Residual scans are not required if
the Integer DCT is deployed.
Second, by replacing the DCT by the Fixed Point DCT and by selecting the FCT, the scaled identity
transformation or an integer-based free-form transformation as base transformation. The Fixed Point
DCT is specified in Annex E. The FCT and modifications to free-form integer transformations are
defined in Annex C. In this case, FCT and the Fixed Point DCT create an additional coding error that is
precisely defined by the coding procedure. This coding error is then corrected by an additive residual
scan. While residual scans were already defined in ISO/IEC 18477-6, the Residual DCT Specification box
specified in Annex B allows users to bypass the DCT in the residual image completely and thus avoids
additional complexity not required for lossless coding. Residual data using the DCT bypass mode are
entropy coded in a new scan type denoted as residual scan defined in Annex D. It is closely related
to the regular (baseline, extended or progressive) Huffman scan types specified in Rec. ITU-T T.81 |
ISO/IEC 10918-1. Since all coefficients are now error residuals in the spatial domain, the natural
distinction between DC and AC coefficients no longer applies. This means that the special role the DC
coefficient has in the Rec. ITU-T T.81 | ISO/IEC 10918-1 decoding procedure is no longer justified. The
residual scan type thus extends the AC decoding procedure to the top-left coefficient of an 8×8 block
while keeping everything else unchanged. It is thus only a very minor modification of the decoding
process specified in Rec. ITU-T T.81 | ISO/IEC 10918-1 that improves the coding efficiency of DCT-
bypassed coded error residuals.
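The idea of extending the AC procedure to the top-left coefficient can be illustrated with a minimal sketch. It is not the normative Annex D procedure: it only shows the run/size symbol formation used for AC coefficients in Rec. ITU-T T.81 | ISO/IEC 10918-1, applied from index 1 (regular scan) or from index 0 (residual scan). The function name `run_size_symbols`, the `start` parameter and the assumption that the 64 values are already in scan order are hypothetical and chosen for illustration only; Huffman coding of the symbols is omitted.

```python
def run_size_symbols(coeffs, start):
    """Form T.81-style (run, size, value) symbols for coeffs[start:].

    start=1 mimics a regular AC scan (the DC coefficient is handled elsewhere);
    start=0 mimics the residual scan idea, where the top-left value is
    treated like any other coefficient. Hypothetical helper, not Annex D.
    """
    symbols = []
    run = 0  # number of zeros seen since the last non-zero value
    for value in coeffs[start:]:
        if value == 0:
            run += 1
            continue
        while run > 15:
            symbols.append((15, 0, None))  # ZRL: a run of 16 zeros
            run -= 16
        size = abs(value).bit_length()     # magnitude category (SSSS)
        symbols.append((run, size, value))
        run = 0
    if run > 0:
        symbols.append((0, 0, None))       # EOB: only zeros remain
    return symbols


# One 8x8 block of spatial-domain residuals, assumed to be in scan order.
block = [3, 0, 0, -2] + [0] * 60
regular_ac = run_size_symbols(block, start=1)
residual_scan = run_size_symbols(block, start=0)
```

With `start=0`, the first value of the block enters the same run/size coding as all other values, so no separate DC prediction step is needed in this sketch.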
If the DCT is bypassed, quantization in the residual domain may cause ringing and stair-casing artifacts.
Such artifacts can be eliminated by the Noise Shaping algorithm specified in Annex G.
To decorrelate the components of the additive error residuals, Annex C specifies an additional lossless
component decorrelation transformation denoted as the RCT. It is related, but not identical, to the RCT
of Rec. ITU-T T.800 | ISO/IEC 15444-1.
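For orientation, the following sketch shows the reversible colour transformation of ISO/IEC 15444-1 (JPEG 2000), to which the RCT of this document is related. It is not the RCT of Annex C, which differs in its details; the function names are hypothetical. The point of the sketch is that integer shifts and differences alone give an exactly invertible decorrelation.

```python
def rct_forward(r, g, b):
    """JPEG 2000 style reversible colour transform (integer only)."""
    y = (r + 2 * g + b) >> 2   # floor of the weighted average
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward."""
    g = y - ((cb + cr) >> 2)   # >> floors for negative sums as well
    r = cr + g
    b = cb + g
    return r, g, b


y, cb, cr = rct_forward(200, 100, 50)
restored = rct_inverse(y, cb, cr)
```

The inverse recovers the green sample first because the floor term in `y` depends only on the two difference channels.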
The decoding procedure of this document is otherwise closely related to that of ISO/IEC 18477-6:
Legacy codestream and refinement scans in the Refinement box form the DCT coefficients of the base
image. The image is dequantized and then processed by either the Fixed Point DCT or the Integer DCT. A
multi-component decorrelation transformation, the Identity, the scaled identity, the FCT or a free-form
integer transformation follows. The output of this process is optionally processed by a non-linear point
transformation selected by the Base Non-linear Point Transformation box. The output of this process is
called the precursor image.
Image reconstruction proceeds with the entropy decoding of the residual image, if present,
encapsulated by the Residual Data box and the Residual Refinement box, both of which are also specified
in ISO/IEC 18477-6:2016, Annex B. Data is dequantized and either processed through the Integer DCT,
or the DCT bypass process specified in Annex E. The DCT bypass process requires residual data to be
encoded in the residual scan type of Annex D. Output of this data is then linearly scaled to range, if
required, and inversely decorrelated either by the RCT or an identity transformation. The error residuals
are finally added to the precursor image, and modulo arithmetic is used in the addition to ensure a
final reconstruction value in range. If the encoded data is floating point, the lossless conversion from
floating point to integer specified in ISO/IEC 18477-7:2016, Annex D completes decoding, otherwise the
sum of residual and precursor image is already the final output.
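The role of modulo arithmetic in the final addition can be sketched as follows. This is an illustration under the assumption that the modulus is 2 to the power of the output bit depth; the normative definition is given in Annex A, and the function names are hypothetical.

```python
def encode_residual(original, precursor, bits):
    """Residual an encoder could store so that the decoder recovers `original`."""
    return (original - precursor) % (1 << bits)


def reconstruct(precursor, residual, bits):
    """Add the residual to the precursor sample, wrapping around modulo 2**bits."""
    return (precursor + residual) % (1 << bits)


# 8-bit example: the precursor overshoots the original by a wide margin.
residual = encode_residual(original=10, precursor=250, bits=8)
sample = reconstruct(precursor=250, residual=residual, bits=8)
```

Because of the wrap-around, the residual fits into the same number of bits as the samples themselves, and the reconstructed value always lies in range.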
For the detailed specification of the decoding and merging process, see Annex A.
5.3 Profiles
The profiles define the implementation of a particular technology within the functional blocks of
Figure 1. Profiles are defined in Annex H.
5.4 Encoder requirements
There is no requirement in this document that any encoder shall support all profiles. An encoder is
only required to meet the compliance tests and to generate the codestream according to the syntax and
to limit the coding parameters to those valid within the profile it conforms to. Profiles are defined in
Annex H.
5.5 Decoder requirements
A decoding process converts compressed image data to reconstructed image data. It may follow the
decoding operation specified in this document and ISO/IEC 18477-1 to generate an LDR image from
the legacy codestream, and it shall follow the operations in this document to decode an IDR or an
HDR image from the data in the full file. The decoder shall parse the codestream syntax to extract the
parameters, the residual image and the base image. The parameters shall be used to merge the residual
image with the base image into the reconstructed IDR image.
In order to comply with this document, a decoder
a) may convert a codestream conforming to this document without considering the information
in any box into a low dynamic range image;
b) shall convert a conforming codestream within the profile it claims to be conforming to into an
intermediate dynamic range image to exactly the same sample values as the reference decoder
specified in ISO/IEC 18477-5¹⁾. Additional details on reference testing and allowable error bounds
are specified in ISO/IEC 18477-4¹⁾.
¹⁾ Under preparation.
Annex A
(normative)
Encoding and decoding process
A.1 Decoding process (normative)
The decoding process relies on a layered approach to extend the capabilities of the Rec. ITU-T T.81 |
ISO/IEC 10918-1 process. The encoder decomposes an IDR or HDR image into a base layer, which consists
of a tone-mapped version of the IDR/HDR image, and a residual layer. In addition to the residual layer,
the codestream includes a description of an approximate inverse tone mapping operation that allows
the decoder to reconstruct from the LDR image an approximate IDR/HDR image denoted as precursor
image. The errors of this approximation process, if any, are corrected by the residual codestream
included in the residual data box and residual refinement box (see Annex B). Both the description of the
tone mapping and the residual image are included in boxes invisible to legacy decoders. Such decoders
will thus only see the tone mapped LDR image. While the base image complies with ISO/IEC 18477-1 and
thus supports only the 8-bit baseline, extended or progressive Huffman modes, the residual
image may optionally be encoded in the 12-bit Huffman or progressive modes and may optionally use
the residual scan type of Annex D bypassing the inverse DCT.
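The layered decomposition described above can be sketched with a toy example. It assumes 12-bit input samples, an 8-bit base layer, a tone mapping that is a plain right shift and an inverse tone mapping given as a 256-entry lookup table. All of these choices, and all names in the code, are hypothetical; real encoders use far more elaborate tone mapping, and the normative steps are those listed in this annex.

```python
SHIFT = 4  # hypothetical: 12-bit IDR samples, 8-bit base layer


def tone_map(sample):
    """Toy tone mapping: keep the 8 most significant bits."""
    return sample >> SHIFT


# Approximate inverse tone mapping, stored as a lookup table over base values.
INVERSE_LUT = [v << SHIFT for v in range(256)]


def encode(samples):
    """Split samples into a base layer and a residual layer."""
    base = [tone_map(s) for s in samples]
    residual = [s - INVERSE_LUT[b] for s, b in zip(samples, base)]
    return base, residual


def decode(base, residual):
    """Precursor image from the lookup table, corrected by the residual."""
    return [INVERSE_LUT[b] + r for b, r in zip(base, residual)]


samples = [0, 17, 2048, 4095]
base, residual = encode(samples)
decoded = decode(base, residual)
```

A legacy decoder in this sketch would see only `base`, while a full decoder combines `base`, the lookup table and `residual` to recover the input exactly.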
Figure A.1 illustrates the functionality of a compliant decoder.
Figure A.1 — High-level overview of the decoding process of a compliant decoder
Bold lines carry three components (or one for greyscale). Round boxes implement point-transformations,
square boxes (except B1, B1a, B5, B5a) multiplications by 3×3 matrices. Letters denote signal names.
To reconstruct an intermediate dynamic range image from an LDR image and a residual image,
decoders shall follow these steps (see Figure A.1).
— In steps B1 and B1a, decode the base DCT coefficients from legacy codestream and the refinement
codestream if a Refinement Data box is present. Refinement coding is specified in Annex D.
Otherwise, the decoding process of Rec. ITU-T T.81 | ISO/IEC 10918-1 is used unaltered.
— In step B1b, apply Inverse Quantization as specified in Rec. ITU-T T.81 | ISO/IEC 10918-1.
— In step B1c, apply Inverse Discrete Cosine Transformation. The inverse DCT process is either
the Integer DCT or the Fixed Point DCT of Annex E, and the DCT process is selected by the DCT
Specification box specified in Annex B.
— In step B2, the upsampling process specified in Annex F shall be followed to generate samples for all
positions on the sample grid.
— In step B3 the linear transformation, as selected by the Base Transformation box defined in
Annex B, shall be applied to inversely decorrelate the image components. Table B.1 defines which
transformation to pick. The output of this block consists of either one or three samples per grid
point O_i, depending on the number of components in the legacy codestream. The output of this
transformation is rounded to integers and clipped to [0,255].
— In step B4, a non-linear point transformation shall be applied to each of the output components
O_i. This process is selected according to the Base non-linear Point Transformation subbox of the
Merging Specification box, implementing the L_i Luts of Annex C and following the specifications of
ISO/IEC 18477-3:2015, Annex C. The outputs of this process are the predicted high dynamic range
samples J_i. As above, i=1..Nf.
— Also in step B4, an integer colour transformation is applied to the input values J_i resulting in the
output pixel values H_i. The transformation is selected by the Colour Transformation subbox of the
Merging Specification box, which selects one of the transformations defined in Annex C. If Nf equals
1, no transformation is performed.
— In steps B5 and B5a, the residual image shall be reconstructed from the data contained in the
Residual Codestream box and the Residual Refinement box. The codestream contained in this box
either follows the specifications defined in Rec. ITU-T T.81 | ISO/IEC 10918-1 or is encoded in the
residual scan specified in Annex D. If a Residual Refinement box is present, the precision of the
samples of the residual codestream shall be extended by refinement coding as specified in Anne
...
