ISO/IEC 18477-7:2017
Information technology — Scalable compression and coding of continuous-tone still images — Part 7: HDR Floating-Point Coding
ISO/IEC 18477-7:2017 specifies a coding format, referred to as JPEG XT, which is designed primarily for continuous-tone photographic content.
Technologies de l'information — Compression échelonnable et codage d'images plates en ton continu — Partie 7: Codage de la virgule flottante en HDR
INTERNATIONAL STANDARD ISO/IEC 18477-7
Second edition
2017-05
Information technology — Scalable
compression and coding of
continuous-tone still images —
Part 7:
HDR Floating-Point Coding
Technologies de l’information — Compression échelonnable et codage
d’images plates en ton continu —
Partie 7: Codage de la virgule flottante en HDR
Reference number
© ISO/IEC 2017
© ISO/IEC 2017, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
ii © ISO/IEC 2017 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions, symbols and abbreviated terms
4 Conventions
4.1 Conformance language
4.2 Operators
4.2.1 Arithmetic operators
4.2.2 Logical operators
4.2.3 Relational operators
4.2.4 Precedence order of operators
4.2.5 Mathematical functions
5 General
5.1 Overview
5.2 High-level overview on JPEG XT ISO/IEC 18477-7 (informative)
5.3 Profiles
5.4 Encoder requirements
5.5 Decoder requirements
Annex A (normative) Encoding and decoding process
Annex B (normative) Boxes
Annex C (normative) Multi-component decorrelation
Annex D (normative) Half-exponential output transformation
Annex E (normative) Profiles
Annex F (informative) Implementation guidelines
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO’s adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see the following
URL: www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/IEC JTC 1, Information technology, SC 29,
Coding of audio, picture, multimedia and hypermedia information.
This second edition cancels and replaces the first edition (ISO/IEC 18477-7:2016), of which it constitutes a
minor revision. The changes compared to the previous edition are as follows:
— a definition has been added for the term “horizontal subsampling factor” as 3.1.31;
— notes to entry have been added to terms throughout Clause 3;
— text in F.3 and F.4.2 has been modified;
— minor editorial changes.
A list of all the parts in the ISO/IEC 18477 series can be found on the ISO website.
Introduction
This document specifies a coded codestream format for storage of continuous-tone high and low
dynamic range photographic content. JPEG XT part 7 is a scalable image coding system supporting
multiple component images consisting of floating-point samples. It is by itself an extension of the coding
tools defined in ISO/IEC 18477-1 and the box-based format defined in ISO/IEC 18477-3; the codestream
is composed in such a way that legacy applications conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1
are able to reconstruct a lower quality, low dynamic range, eight bits per sample version of the image.
This standard low dynamic range image is typically constructed at the encoder side by tone mapping
from the high dynamic range image; while the LDR image is always present, this document does not define a
process that generates this image.
Today, the most widely used digital photography format, a minimal implementation of JPEG (specified
in Rec. ITU-T T.81 | ISO/IEC 10918-1), uses a bit depth of 8; each of the three channels that together
compose an image pixel is represented by eight bits, providing 256 representable values per channel.
If the dynamic range of the input scene is too large, however, an integer sample representation is no
longer applicable and sample values need to be specified in floating-point. These values typically are, or
are proportional to, physical radiance values of three primaries. These primaries may be device-specific
physical colours, or may be the basis of the CIE XYZ colourspace.
JPEG XT is primarily designed to provide coded data containing high dynamic range and wide colour
gamut content while simultaneously providing eight bits per sample low dynamic range images using tools
defined in ISO/IEC 18477-1. The goal is to provide a backwards compatible coding specification that
allows legacy applications and existing tool chains to continue to operate on codestreams conforming
to this document.
JPEG XT has been designed to be backwards compatible to legacy applications while at the same time
having a small coding complexity; JPEG XT uses, whenever possible, functional blocks of Rec. ITU-T T.81
| ISO/IEC 10918-1 to extend the functionality of the legacy JPEG Coding System. It is optimized for
storage and transmission of high dynamic range and wide colour gamut floating-point images while
also enabling low-complexity encoder and decoder implementations.
This document is an extension of ISO/IEC 18477-1, a compression system for continuous-tone digital
still images which is backwards compatible with Rec. ITU-T T.81 | ISO/IEC 10918-1. That is, legacy
applications conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1 will be able to reconstruct streams
generated by an encoder conforming to this document, though will possibly not be able to reconstruct
such streams in full dynamic range, full quality or other features defined in this document.
This document is itself based on ISO/IEC 18477-3, which defines a box-based file format similar to
other JPEG standards. The aim of this document is to provide a migration path for legacy applications
to support, potentially in a limited way, lossless coding and coding of high dynamic range images
consisting of samples represented in floating-point. Existing tools depending on the existing standards
will continue to work, but will only be able to reconstruct a lossy and/or a low dynamic range version of
the image contained in the codestream. This document specifies a coded file format, referred to as JPEG
XT, which is designed primarily for storage and interchange of continuous-tone photographic content.
INTERNATIONAL STANDARD ISO/IEC 18477-7:2017(E)
Information technology — Scalable compression and
coding of continuous-tone still images —
Part 7:
HDR Floating-Point Coding
1 Scope
This document specifies a coding format, referred to as JPEG XT, which is designed primarily for
continuous-tone photographic content.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 18477-1:2015, Information technology — Scalable compression and coding of continuous-tone still
images — Part 1: Scalable compression and coding of continuous-tone still images
ISO/IEC 18477-2, Information technology — Scalable compression and coding of continuous-tone still
images — Part 2: Coding of high dynamic range images
ISO/IEC 18477-3:2015, Information technology — Scalable compression and coding of continuous-tone still
images — Part 3: Box file format
ISO/IEC 18477-6:2016, Information technology — Scalable compression and coding of continuous-tone
still images — Part 6: IDR Integer Coding
ISO/IEC/IEEE 60559, Information technology — Microprocessor Systems — Floating-Point arithmetic
Rec. ITU-T T.81 | ISO/IEC 10918-1:1994, Information technology — Digital compression and coding of
continuous-tone still images — Requirements and guidelines
Rec. ITU-R BT.601, Studio encoding parameters of digital television for standard 4:3 and wide screen 16:9
aspect ratios
3 Terms, definitions, symbols and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— IEC Electropedia: available at http://www.electropedia.org/
— ISO Online browsing platform: available at http://www.iso.org/obp
3.1.1
ASCII encoding
encoding of text characters and text strings according to ISO/IEC 10646
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.2
base decoding path
process of decoding legacy codestream and refinement data to the base image, jointly with all further
steps until residual data is added to the values obtained from the residual codestream
Note 1 to entry: See ISO/IEC 18477-6.
3.1.3
base image
collection of sample values obtained by entropy decoding the DCT coefficients of the legacy codestream
and the refinement codestream, and inversely DCT transforming them jointly
Note 1 to entry: See ISO/IEC 18477-6.
3.1.4
binary decision
choice between two alternatives
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.5
bitstream
partially encoded or decoded sequence of bits comprising an entropy-coded segment
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.6
block
8 × 8 array of samples or an 8 × 8 array of DCT coefficient values of one component
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.7
box
structured collection of data describing the image or the image decoding process embedded into one or
multiple APP marker segments
Note 1 to entry: See ISO/IEC 18477-3:2015, Annex B.
3.1.8
byte
group of 8 bits
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.9
coder
embodiment of a coding process
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.10
coding
encoding or decoding
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.11
coding model
procedure used to convert input data into symbols to be coded
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.12
(coding) process
general term for referring to an encoding process, a decoding process, or both
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.13
compression
reduction in the number of bits used to represent source image data
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.14
component
two-dimensional array of samples having the same designation in the output or display device
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
Note 2 to entry: An image typically consists of several components, e.g. red, green and blue.
3.1.15
continuous-tone image
image whose components have more than one bit per sample
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.16
data unit
8 × 8 block of samples of one component in DCT-based processes; a sample in lossless processes
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.17
decoder
embodiment of a decoding process
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.18
decoding process
process which takes as its input compressed image data and outputs a continuous-tone image
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.19
dequantization
inverse procedure to quantization by which the decoder recovers a representation of the DCT
coefficients
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.20
discrete cosine transform
DCT
either the forward discrete cosine transform or the inverse discrete cosine transform
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.21
downsampling
procedure by which the spatial resolution of a component is reduced
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.22
encoder
embodiment of an encoding process
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.23
encoding process
process which takes as its input a continuous-tone image and outputs compressed image data
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.24
entropy-coded (data) segment
independently decodable sequence of entropy encoded bytes of compressed image data
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.25
entropy decoder
embodiment of an entropy decoding procedure
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.26
entropy decoding
lossless procedure which recovers the sequence of symbols from the sequence of bits produced by the
entropy encoder
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.27
entropy encoder
embodiment of an entropy encoding procedure
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.28
entropy encoding
lossless procedure which converts a sequence of input symbols into a sequence of bits such that the
average number of bits per symbol approaches the entropy of the input symbols
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.29
grayscale image
continuous-tone image that has only one component
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.30
high dynamic range
image or image data comprised of samples using a floating-point representation
3.1.31
horizontal subsampling factor
relative number of horizontal data units of a particular component with respect to the number of
horizontal data units in the other components in the frame
Note 1 to entry: See ISO/IEC 18477-1.
3.1.32
Huffman decoder
embodiment of a Huffman decoding procedure
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.33
Huffman decoding
entropy decoding procedure which recovers the symbol from each variable length code produced by
the Huffman encoder
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.34
Huffman encoder
embodiment of a Huffman encoding procedure
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.35
Huffman encoding
entropy encoding procedure which assigns a variable length code to each input symbol
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.36
joint photographic experts group
JPEG
informal name of the committee which created this document
Note 1 to entry: The “joint” comes from the ITU-T and ISO/IEC collaboration.
3.1.37
legacy codestream
collection of markers and syntax elements defined by Rec. ITU-T T.81 | ISO/IEC 10918-1, excluding any syntax
elements defined by the ISO/IEC 18477 family of standards
EXAMPLE The legacy codestream consists of the collection of all markers except those APP markers that
describe JPEG XT boxes by the syntax defined in ISO/IEC 18477-3, Annex A.
Note 1 to entry: See ISO/IEC 18477-6.
3.1.38
legacy decoder
embodiment of a decoding process conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1, confined to the
lossy DCT process and the baseline, sequential or progressive modes, decoding at most four components
to eight bits per component
Note 1 to entry: See ISO/IEC 18477-6.
3.1.39
lossless
descriptive term for encoding and decoding processes and procedures in which the output of the
decoding procedure(s) is identical to the input to the encoding procedure(s)
Note 1 to entry: See ISO/IEC 18477-8.
3.1.40
lossless coding
mode of operation which refers to any one of the coding processes defined in this document in which all
of the procedures are lossless
Note 1 to entry: See ISO/IEC 18477-8.
3.1.41
lossy
descriptive term for encoding and decoding processes which are not lossless
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.42
low dynamic range
image or image data comprised of data with no more than eight bits per sample
Note 1 to entry: See ISO/IEC 18477-6.
3.1.43
marker
two-byte code in which the first byte is hexadecimal FF and the second byte is a value between 1 and
hexadecimal FE
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.44
marker segment
marker together with its associated set of parameters
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.45
pixel
collection of sample values in the spatial image domain all having the same sample coordinates
Note 1 to entry: A pixel may consist of three samples describing its red, green and blue value.
Note 2 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.46
precision
number of bits allocated to a particular sample or DCT coefficient
3.1.47
procedure
set of steps which accomplishes one of the tasks which comprise an encoding or decoding process
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.48
quantization value
integer value used in the quantization procedure
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.49
quantize
act of performing the quantization procedure for a DCT coefficient
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.50
residual decoding path
collection of operations applied to the entropy coded data contained in the residual data box and
residual refinement scan boxes up to the point where this data is merged with the base image to form
the final output image
Note 1 to entry: See ISO/IEC 18477-6.
3.1.51
residual image
sample values as reconstructed by inverse quantization and inverse DCT transformation applied to the
entropy-decoded coefficients described by the residual scan and residual refinement scans
Note 1 to entry: See ISO/IEC 18477-6.
3.1.52
residual scan
additional pass over the image data invisible to legacy decoders which provides additive and/or
multiplicative correction data of the legacy scans to allow reproduction of high dynamic range or wide
colour gamut data
Note 1 to entry: See ISO/IEC 18477-6.
3.1.53
refinement scan
additional pass over the image data invisible to legacy decoders which provides additional least
significant bits to extend the precision of the DCT transformed coefficients
Note 1 to entry: See ISO/IEC 18477-6.
3.1.54
sample
one element in the two-dimensional image array which comprises a component
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.55
sample grid
common coordinate system for all samples of an image
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
Note 2 to entry: The sample at the top-left corner of the image has the coordinates (0,0); the first coordinate
increases towards the right, the second towards the bottom.
3.1.56
scan
single pass through the data for one or more of the components in an image
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.57
scan header
marker segment that contains a start-of-scan marker and associated scan parameters that are coded at
the beginning of a scan
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.58
table specification data
coded representation from which the tables used in the encoder and decoder are generated and their
destinations specified
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.59
(uniform) quantization
procedure by which DCT coefficients are linearly scaled in order to achieve compression
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.60
upsampling
procedure by which the spatial resolution of a component is increased
Note 1 to entry: See ISO/IEC 18477-1.
3.1.61
vertical subsampling factor
relative number of vertical data units of a particular component with respect to the number of vertical
data units in the other components in the frame
Note 1 to entry: See ISO/IEC 18477-1.
3.1.62
zero byte
0x00 byte
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.1.63
zig-zag sequence
specific sequential ordering of the DCT coefficients from (approximately) lowest spatial frequency
to highest
Note 1 to entry: See Rec. ITU-T T.81 | ISO/IEC 10918-1.
3.2 Symbols
X width of the sample grid in positions
Y height of the sample grid in positions
Nf number of components in an image
s_{i,x} subsampling factor of component i in horizontal direction
s_{i,y} subsampling factor of component i in vertical direction
H_i subsampling indicator of component i in the frame header
V_i subsampling indicator of component i in the frame header
v_{x,y} sample value at the sample grid position x,y
R_h additional number of DCT coefficient bits represented by refinement scans in the legacy decoding path; 8 + R_h is the number of non-fractional bits (i.e. bits in front of the “binary dot”) of the output of the inverse DCT process in the legacy decoding path
R_r additional number of DCT coefficient bits represented by refinement scans in the residual decoding path; p + R_r is the number of non-fractional bits of the output of the inverse DCT process in the residual decoding path, where p is the frame-precision of the residual image as recorded in the frame header of the residual codestream
R_b additional bits in the HDR image; 8 + R_b is the sample precision of the reconstructed HDR image
3.3 Abbreviated terms
ASCII American Standard Code for Information Interchange
LSB Least Significant Bit
MSB Most Significant Bit
HDR High Dynamic Range
LDR Low Dynamic Range
TMO Tone Mapping Operator
DCT Discrete Cosine Transformation
4 Conventions
4.1 Conformance language
This document consists of normative and informative text.
Normative text is that text which expresses mandatory requirements. The word “shall” is used to
express mandatory requirements strictly to be followed in order to conform to this document and
from which no deviation is permitted. A conforming implementation is one that fulfils all mandatory
requirements.
Informative text is text that is potentially helpful to the user, but not indispensable and can be removed,
changed or added editorially without affecting interoperability. All text in this document is normative,
with the following exceptions: the Introduction, any parts of the text that are explicitly labelled as
“informative”, and statements appearing with the preamble “NOTE” and behaviour described using the
word “should”. The word “should” is used to describe behaviour that is encouraged but is not required
for conformance to this document.
The keywords “may” and “need not” indicate a course of action that is permissible in a conforming
implementation.
The keyword “reserved” indicates a provision that is not specified at this time, shall not be used, and
may be specified in the future. The keyword “forbidden” indicates “reserved” and in addition indicates
that the provision will never be specified in the future.
4.2 Operators
NOTE Many of the operators used in this document are similar to those used in the C programming language.
4.2.1 Arithmetic operators
+ addition
- subtraction (as a binary operator) or negation (as a unary prefix operator)
* multiplication
/ division without truncation or rounding
4.2.2 Logical operators
|| Logical OR
&& Logical AND
! Logical NOT
∈ x ∈ {A, B} is defined as (x == A || x == B)
∉ x ∉ {A, B} is defined as (x != A && x != B)
4.2.3 Relational operators
> greater than
>= greater than or equal to
< less than
<= less than or equal to
== equal to
!= not equal to
4.2.4 Precedence order of operators
Operators are listed below in descending order of precedence. If several operators appear in the same
line, they have equal precedence. When several operators of equal precedence appear at the same level
in an expression, evaluation proceeds according to the associativity of the operator either from right to
left or from left to right.
Operators Type of operation Associativity
(), [ ], . Expression Left to Right
− Unary negation
*, / Multiplication Left to Right
+, − Addition and Subtraction Left to Right
<, >, <=, >= Relational Left to Right
4.2.5 Mathematical functions
⌈x⌉ Ceil of x. Returns the smallest integer that is greater than or equal to x.
⌊x⌋ Floor of x. Returns the largest integer that is less than or equal to x.
|x| Absolute value, is −x for x < 0; otherwise, x.
sign(x) Sign of x, 0 if x is zero, +1 if x is positive, −1 if x is negative.
clamp(x, min, max) Clamps x to the range [min, max]: returns min if x < min, max if x > max, or otherwise x.
x^a Raises the value of x to the power of a. x is a non-negative real number, a is a real number. x^a is equal to exp(a · log(x)) where exp is the exponential function and log() the natural logarithm. If x is 0 and a is positive, x^a is defined to be 0.
log(x) Natural logarithm to the base of e. log(x) is undefined for non-positive values of x.
exp(x) The exponential function of x. exp(x) equals e^x where e is the Euler number. Furthermore, exp(log(x)) = x for all positive numbers x.
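The definitions above map directly onto a few lines of code. The following is an informal sketch in Python, not a normative implementation; the function name power and the decision to raise an error where the text leaves x^a undefined (negative x, or x equal to 0 with non-positive a) are assumptions made for illustration.

```python
import math


def sign(x: float) -> int:
    """0 if x is zero, +1 if x is positive, -1 if x is negative."""
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 0


def clamp(x, lo, hi):
    """Return lo if x < lo, hi if x > hi, otherwise x."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def power(x: float, a: float) -> float:
    """x^a computed as exp(a * log(x)) for non-negative x."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        if a > 0:
            return 0.0  # defined to be 0 when x is 0 and a is positive
        # Assumption: the text gives no value for this case, so reject it.
        raise ValueError("x^a is not defined here for x == 0 and a <= 0")
    return math.exp(a * math.log(x))


if __name__ == "__main__":
    print(math.ceil(1.2), math.floor(-1.2))  # ceil and floor as defined above
    print(sign(-2.5), clamp(300, 0, 255), power(0.0, 2.0))
```

The ceil and floor functions are taken from the Python standard library, since math.ceil and math.floor already match the definitions given here.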
5 General
5.1 Overview
The purpose of this clause is to give an informative overview of the elements specified in this document.
Another purpose is to introduce many of the terms which are defined in Clause 3. These terms are
printed in italics upon first usage in this clause.
The following are three elements specified in this document.
a) An encoder is an embodiment of an encoding process. An encoder takes as input digital source image
data and encoder specifications and, by means of a specified set of procedures, generates as output
a codestream.
b) A decoder is an embodiment of a decoding process. A decoder takes as input a codestream and, by
means of a specified set of procedures, generates as output digital reconstructed image data.
c) The codestream is a compressed image data representation which includes all necessary data to
allow a (full or approximate) reconstruction of the sample values of a digital image. Additional data
might be required that define the interpretation of the sample data, such as colour space or the
spatial dimensions of the samples.
5.2 High-level overview on JPEG XT ISO/IEC 18477-7 (informative)
This document allows lossy coding of high dynamic range photographic images in a way that is
backwards compatible with Rec. ITU-T T.81 | ISO/IEC 10918-1. Decoders compliant with the latter standard
will be able to parse codestreams conforming to this document correctly, albeit only in 8-bit sample
precision with a limited dynamic range and potentially a limited colour gamut.
This document includes multiple tools to reach the above functionality, defined in Annex B and the
annexes that follow it. A short overview of these coding tools is given in this subclause; an in-depth
specification of the algorithm is given in Annex A.
The syntax of an ISO/IEC 18477-7 compliant codestream is specified in ISO/IEC 18477-3; that is, this
document uses a syntax element denoted as “box” to annotate its syntactical elements. The definition
of the box syntax element is not repeated here, and readers are referred to ISO/IEC 18477-3 for
further details. Additional boxes besides those already specified in ISO/IEC 18477-3 are defined in
ISO/IEC 18477-6:2016, Annex B and in Annex B of this document. In addition, ISO/IEC 18477-6 defines
coding tools such as Refinement Coding, specified in ISO/IEC 18477-6:2016, Annex D, and the residual
codestream carried by the Residual Data box.
The Refinement Scan, specified in ISO/IEC 18477-6:2016, Annex D, increases the bit precision of the
DCT coefficients, i.e. it operates in the DCT domain. The mechanism is very similar to that of the
successive approximation scans specified in Rec. ITU-T T.81 | ISO/IEC 10918-1:1994, Annex G. A
legacy baseline, extended or progressive Huffman scan as defined in Annex G of the legacy standard
defines the 12 most significant bits of the DCT coefficients. These initial scans are represented in
the legacy codestream and are visible to any ISO/IEC 18477-1 compliant decoder. Refinement scans
then decode into up to four additional least significant bits, in the same way that successive
approximation scans within Rec. ITU-T T.81 | ISO/IEC 10918-1 decode the least significant bits of a
progressive scan pattern. The only difference between Refinement scans and successive approximation
scans is that, in the latter case, the number of least significant bits is annotated in the scan header
of the legacy codestream, whereas Refinement scans are hidden from legacy applications and do not alter
the scan header of the legacy codestream. Their number is indicated in the Refinement Specification box
within the Merging Specification box and not in the legacy codestream.
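The bit arithmetic behind adding least significant bits to a coefficient can be illustrated on non-negative magnitudes. This is a simplified sketch only: the function names are hypothetical, and sign handling, entropy coding and the actual scan syntax of ISO/IEC 18477-6:2016, Annex D are not modelled.

```python
def split_magnitude(magnitude: int, refinement_bits: int) -> tuple[int, int]:
    """Split a coefficient magnitude into its most significant part
    (as carried by the legacy scans) and its least significant bits
    (as carried by refinement scans)."""
    if not 0 <= refinement_bits <= 4:
        raise ValueError("refinement scans add at most four bits")
    if magnitude < 0:
        raise ValueError("this sketch handles non-negative magnitudes only")
    msb = magnitude >> refinement_bits
    lsb = magnitude & ((1 << refinement_bits) - 1)
    return msb, lsb


def merge_magnitude(msb: int, lsb: int, refinement_bits: int) -> int:
    """Recombine the legacy part with the refinement bits."""
    if not 0 <= lsb < (1 << refinement_bits):
        raise ValueError("lsb does not fit into the given number of bits")
    return (msb << refinement_bits) | lsb


if __name__ == "__main__":
    full = 0xB6D5                      # arbitrary 16-bit example magnitude
    msb, lsb = split_magnitude(full, 4)
    print(hex(msb), hex(lsb))          # 12 legacy bits and 4 refinement bits
    print(merge_magnitude(msb, lsb, 4) == full)
```

A legacy decoder in this picture sees only the most significant part, which is why it reconstructs a lower-precision image from the same codestream.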
While Refinement Scans extend the bit precision within the DCT domain by up to four bits, Residual
Scans extend the sample precision in the spatial (image) domain. Although the entropy coded data of
Residual Scans is hidden from legacy applications in the Residual Data box, its decoding process
is identical to that of the legacy data: a baseline, extended or progressive Huffman scan decodes the
data in the Residual Data box to DCT coefficients, and inverse quantization and the inverse Discrete
Cosine Transformation (DCT) then compute the residual image data from these coefficients. An image
merging process, defined in Annex A, computes the final HDR output image from the precursor image
reconstructed from the legacy codestream and the residual image.
This merging process is depicted in Figure 1. It first performs chroma upsampling to reconstruct a
single sample on each point of the sample grid of the base image. Chroma upsampling is specified in
ISO/IEC 18477-1:2015, Annex A. It then converts the colour space of the base image, first from YCbCr
into the Rec. ITU-R BT.601 colourspace. Optionally, the luma signal of the LDR image is computed from
the BT.601 colourspace and used as pre-scaling of the chroma signals of the residual. The BT.601 image
is optionally transformed again into a wider colourspace. For practical reasons, the transformations
from YCbCr to BT.601 and from BT.601 into an output colourspace are combined into a single linear
transformation matrix. This linear transformation is followed by a nonlinear point transformation
acting separately on each of the output channels sample by sample. This point transformation can
either be seen as an approximate inverse tone mapping operation, or as inverse gamma correction that
transforms the nonlinear sample values of the LDR image into sample values proportional to physical
radiance. It can be specified either by a parametric curve or by an explicit lookup table. The output
of this decoding path is transformed again by an optional colour transformation into the final HDR
colourspace. The result is the precursor image, a rough approximation of the final HDR image that is
already in the correct HDR output colour space.
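The combination of the two colour transformations into a single matrix, followed by a per-channel table lookup, can be sketched as below. All names are hypothetical. The YCbCr matrix is the widely used full-range matrix with BT.601 coefficients, the chroma offset of 128 assumes 8-bit data without refinement scans, and the gamma 2.2 table is only a stand-in for whatever curve or table the file actually signals.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]

# Full-range YCbCr -> RGB with BT.601 coefficients (assumed for illustration).
YCBCR_TO_RGB: Matrix = [
    [1.0, 0.0, 1.402],
    [1.0, -0.344136, -0.714136],
    [1.0, 1.772, 0.0],
]

# Hypothetical output-colourspace matrix; identity means "stay in BT.601 RGB".
IDENTITY: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Stand-in lookup table: inverse gamma 2.2 on 8-bit input (an assumption).
GAMMA_LUT: List[float] = [(i / 255.0) ** 2.2 for i in range(256)]


def combine(ycbcr_to_601: Matrix, to_output: Matrix) -> Matrix:
    """Single matrix equivalent to applying ycbcr_to_601 first, then to_output."""
    return [[sum(to_output[i][k] * ycbcr_to_601[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


def decode_base_sample(ycbcr: Sequence[float], matrix: Matrix,
                       lut: Sequence[float]) -> List[float]:
    """Apply the combined matrix, then the point transformation per channel."""
    y, cb, cr = ycbcr
    centred = [y, cb - 128.0, cr - 128.0]  # remove the chroma offset
    linear = [sum(matrix[i][k] * centred[k] for k in range(3)) for i in range(3)]
    out = []
    for value in linear:
        index = max(0, min(len(lut) - 1, math.floor(value + 0.5)))
        out.append(lut[index])
    return out
```

Because both transformations are linear, multiplying the matrices once per image is cheaper than applying two matrices per sample, which is the practical reason the text gives for combining them.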
Processing continues with the decoding of the residual image. DCT coefficients of the residual image are
decoded from the information in the Residual Data box, and their bit precision is extended by additional
refinement scans decoded from the data in the Residual Refinement box. Processing proceeds with
inverse quantization and inverse DCT transformation. The output undergoes chroma upsampling to
generate a single sample per sample grid coordinate. The next processing step performs a nonlinear
point transformation on each of the reconstructed channels, separately for each sample, resulting in
an error image in a YCbCr type of colour space. Chroma samples are then optionally pre-scaled by a
factor that is computed from the luminance value of the LDR image by a nonlinear pre-scaling map.
Optionally, a post-scaling factor is computed from the luminance of the residual image through a
nonlinear post-scaling map. The post-scaling factor scales a linear HDR image from limited dynamic
range to full dynamic range in a later processing stage. Residual samples then undergo an inverse linear
decorrelation transformation to map the sample values from the intermediate YCbCr colourspace
into the target colourspace. This transformation is typically identical to the transformation matrix in
the base decoding path, but does not need to be. After inverse decorrelation, the residual samples are
mapped again by the residual nonlinear point transformation. The result of this operation is the
residual image.
To form the final output image, sample values of the precursor image and the residual image are
added together, plus an offset to make the residual image symmetric around zero. The result of this
is scaled by the post-scaling factor if the nonlinear post-scaling map is present. Results are then
clamped to the range of the intermediate range output image and finally processed through the output
conversion. The output conversion either converts samples from integer to floating-point by a half-
exponential map, or uses a table lookup or parametric curve for this conversion.
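The arithmetic of this final merging step, up to but not including the output conversion, can be sketched for a single sample as follows. The function name, the parameter names and the 16-bit default for the intermediate range are assumptions made for this example; the normative values and the output conversion itself are defined in Annex A.

```python
from typing import Optional


def merge_sample(precursor: float, residual: float, offset: float,
                 post_scale: Optional[float] = None,
                 out_max: float = 65535) -> float:
    """Add precursor, residual and offset, optionally scale, then clamp."""
    value = precursor + residual + offset
    if post_scale is not None:  # only when a post-scaling map is present
        value *= post_scale
    return max(0, min(out_max, value))
```

For example, with a hypothetical offset of −32768 that centres a 16-bit residual around zero, `merge_sample(1000, 40000, -32768)` yields 8232; the result would then be passed to the output conversion.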
The detailed specification of the decoding and merging process is found in Annex A.
Figure 1 — Overview on the decoding process
5.3 Profiles
The profiles define the implementation of a particular technology within the functional blocks of
Figure 1. The profiles are described in Annex E.
5.4 Encoder requirements
There is no requirement in this document that any encoder shall support all profiles. An encoder is
only required to meet the compliance tests and to generate the codestream according to the syntax and
to limit the coding parameters to those valid within the profile it conforms to. Profiles are defined in
Annex E.
5.5 Decoder requirements
A decoding process converts compressed image data to reconstructed image data. It may follow the
decoding operation specified in the Recommendation | International Standard and ISO/IEC 18477-1 to
generate an LDR image from the legacy codestream, and it shall follow the operations in this document
to decode an HDR image from the data in the full file. The decoder shall parse the codestream syntax to
extract the parameters, the residual image and the base image. The parameters shall be used to merge
the residual image with the base image into the reconstructed HDR image.
In order to comply with this document, a decoder
a) may convert a codestream conforming to this document into a low dynamic range image without
considering the information in any box, and
b) shall convert a conforming codestream within the profile to which it claims conformance into a high
dynamic range image.
Annex A
(normative)
Encoding and decoding process
A.1 General
In this annex and all of its subclauses, the flow charts and tables are normative only in the sense that
they are defining an output that alternative implementations shall duplicate.
A.2 Decoding process (normative)
The decoding process relies on a layered approach to extend JPEG’s capabilities. The encoder decomposes
an HDR image into a base layer, which consists of a tone mapped version of the HDR image, and an HDR
residual layer. The combination of the data in the residual codestream and the legacy codestream allows
the reconstruction of the full HDR image. Both the metadata describing the merging process and the
residual image are included in boxes invisible to legacy decoders. Such decoders will thus only see the
tone mapped LDR image. While the base image complies with ISO/IEC 18477-1 and thus supports only the
8-bit baseline, extended or progressive Huffman modes, the residual image may optionally
be encoded in the 12-bit Huffman or progressive modes.
Figure A.1 illustrates the functionality of a compliant decoder.
NOTE Bold lines carry three (or one, for grey scale) components, thin lines scalar data. Round boxes
implement point-transformations, square boxes (except B1, B1a, B5, B5a) multiplications by 3 × 3 matrices.
Letters denote signal names.
Figure A.1 — High-level overview of the decoding process of a compliant decoder
This subclause specifies the process that decoders shall follow to reconstruct an intermediate dynamic
range image from an LDR image and a residual image. This process, which is an extension of the process
defined in ISO/IEC 18477-6, consists of the following steps (see also Figure A.1).
— In steps B1 and B1a, reconstruct the base data from the legacy codestream and, if a Refinement Data
box is present, the refinement codestream. Refinement coding is specified in
ISO/IEC 18477-6:2016, Annex D.
— In steps B1 and B1a, apply the Inverse Quantization and Inverse Discrete Cosine Transformation as
in Rec. ITU-T T.81 | ISO/IEC 10918-1.
— In step B2, the upsampling process specified in ISO/IEC 18477-1:2015, Annex A shall be followed to
generate samples for all positions on the sample grid.
— In step B3, the linear transformation selected by the Base Transformation box defined in
ISO/IEC 18477-6:2016, Annex B and Annex C shall be applied to inversely decorrelate the image
components. ISO/IEC 18477-6:2016, Table C.1 defines which transformation to pick. The output of
this block consists of either one or three samples per grid point $O_i$, depending on the number of
components in the legacy codestream. The output of this transformation is rounded to integers
and clipped to $[0, 2^{R_h+8}-1]$ where $R_h$ is the number of refinement scans in the b
...







