Information technology — Scalable compression and coding of continuous-tone still images — Part 1: Core coding system specification

This document specifies a coding format, referred to as JPEG XT, which is designed primarily for continuous-tone photographic content. This document defines the core coding system, which forms the basis for the entire ISO/IEC 18477 series.

Technologies de l'information — Compression échelonnable et codage d'images plates en ton continu — Partie 1: Spécification du système de codage de noyau

General Information

Status: Published
Publication Date: 30-May-2024
Current Stage: 6060 - International Standard published
Start Date: 31-May-2024
Due Date: 09-Aug-2024
Completion Date: 31-May-2024
Ref Project

Relations

Standard
ISO/IEC 18477-1:2024 - Information technology — Scalable compression and coding of continuous-tone still images — Part 1: Core coding system specification. Released: 31.05.2024
English language
17 pages

Standards Content (Sample)


International Standard
ISO/IEC 18477-1
Third edition
2024-05
Information technology — Scalable compression and coding of continuous-tone still images —
Part 1: Core coding system specification
Technologies de l'information — Compression échelonnable et codage d'images plates en ton continu —
Partie 1: Spécification du système de codage de noyau
Reference number
© ISO/IEC 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2024 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviated terms
4.1 Symbols
4.2 Abbreviated terms
5 Conventions
5.1 Conformance language
5.2 Operators
5.2.1 Arithmetic operators
5.2.2 Assignment operators
5.2.3 Precedence order of operators
5.2.4 Mathematical functions
6 General
6.1 General definitions
6.2 Functional overview on the decoding process
6.3 Encoder requirements
6.4 Decoder requirements
Annex A (normative) Component subsampling and expansion of subsampling
Annex B (normative) Codestream syntax
Annex C (normative) Multi-component decorrelation
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of any
claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC had not
received notice of (a) patent(s) which may be required to implement this document. However, implementers
are cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents and https://patents.iec.ch. ISO and IEC shall not be held
responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www.iso.org/iso/foreword.html.
In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This third edition cancels and replaces the second edition (ISO/IEC 18477-1:2020), which has been
technically revised.
The main changes are as follows:
— the marker ID for the component decorrelation control marker was corrected.
A list of all parts in the ISO/IEC 18477 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

Introduction
This document specifies a coded codestream format for storage of continuous-tone photographic content.
JPEG XT is a scalable image coding system that builds on the legacy Rec. ITU-T T.81 | ISO/IEC 10918-1 coding
system, also known as JPEG, but extends it in a backwards compatible way. This document specifies the
commonly deployed components of the JPEG coding system. Additional parts of the ISO/IEC 18477 series
extend on this baseline.
JPEG XT has been designed to be backwards compatible to legacy applications while at the same time
having a small coding complexity; JPEG XT uses, whenever possible, functional blocks of Rec. ITU-T T.81 |
ISO/IEC 10918-1, Rec. ITU-T T.86 | ISO/IEC 10918-4 and Rec. ITU-T T.871 | ISO/IEC 10918-5 to extend the
functionality of the legacy JPEG coding system. It is optimized for good image quality and compression
efficiency while also enabling low-complexity encoding and decoding implementations.

International Standard ISO/IEC 18477-1:2024(en)
Information technology — Scalable compression and coding
of continuous-tone still images —
Part 1:
Core coding system specification
1 Scope
This document specifies a coding format, referred to as JPEG XT, which is designed primarily for continuous-
tone photographic content. This document defines the core coding system, which forms the basis for the
entire ISO/IEC 18477 series.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
Rec. ITU-T T.81 | ISO/IEC 10918-1:1994, Information technology — Digital compression and coding of
continuous-tone still images — Part 1: Requirements and guidelines
Rec. ITU-T T.86 | ISO/IEC 10918-4, Information technology — Digital compression and coding of continuous-
tone still images — Part 4: Registration of JPEG profiles, SPIFF profiles, SPIFF tags, SPIFF colour spaces, APPn
markers, SPIFF compression types and Registration Authorities (REGAUT)
Rec. ITU-T T.871 | ISO/IEC 10918-5, Information technology — Digital compression and coding of continuous-
tone still images — Part 5: JPEG File Interchange Format (JFIF)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
bitstream
partially encoded or decoded sequence of bits comprising an entropy-coded segment
3.2
block
8×8 array of samples or an 8×8 array of DCT coefficient values of one component
3.3
byte
group of 8 bits
3.4
coder
embodiment of a coding process
3.5
coding
encoding or decoding
3.6
compression
reduction in the number of bits used to represent source image data
3.7
component
two-dimensional array of samples having the same designation in the output or display device
Note 1 to entry: An image typically consists of several components, e.g. red, green and blue.
3.8
continuous-tone image
image whose components have more than one bit per sample
3.9
discrete cosine transform
DCT
either the forward discrete cosine transform or the inverse discrete cosine transform
3.10
downsampling
procedure by which the spatial resolution of a component is reduced
3.11
entropy-coded data segment
independently decodable sequence of entropy encoded bytes of compressed image data
3.12
marker
two-byte code in which the first byte is hexadecimal FF and the second byte is a value between 1 and
hexadecimal FE
3.13
marker segment
marker and associated set of parameters
3.14
precision
number of bits allocated to a particular sample or DCT coefficient
3.15
procedure
set of steps which accomplishes one of the tasks which comprise an encoding or decoding process
3.16
sample
one element in the two-dimensional array which comprises a component
3.17
sample grid
common coordinate system for all samples of an image with the samples at the top left edge of the image
having the coordinates (0, 0), the first coordinate increases towards the right, the second to the bottom

3.18
scan
single pass through the data for one or more of the components in an image
3.19
scan header
marker segment that contains a start-of-scan marker and associated scan parameters that are coded at the
beginning of a scan
3.20
upsampling
procedure by which the spatial resolution of a component is increased
3.21
vertical sampling factor
relative number of vertical data units of a particular component with respect to the number of vertical data
units in the other components in the frame
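The marker definition in 3.12 can be checked mechanically. The following is a minimal Python sketch, not part of this document; the function names is_marker and find_markers are hypothetical and chosen only for illustration. It tests byte pairs against the definition and does not interpret marker segments or entropy-coded data.

```python
def is_marker(first: int, second: int) -> bool:
    """Return True if the two bytes form a marker in the sense of 3.12:
    first byte hexadecimal FF, second byte between 1 and hexadecimal FE."""
    return first == 0xFF and 0x01 <= second <= 0xFE


def find_markers(data: bytes) -> list:
    """Return the offsets of all byte pairs in data that satisfy 3.12.

    This is a plain scan over adjacent byte pairs; it makes no attempt
    to parse segment lengths or skip segment payloads.
    """
    return [i for i in range(len(data) - 1) if is_marker(data[i], data[i + 1])]
```

With this sketch, the pair FF D8 counts as a marker, while FF 00 and FF FF do not, because their second byte lies outside the range given in 3.12.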
4 Symbols and abbreviated terms
4.1 Symbols
X          width of the sample grid in positions
Y          height of the sample grid in positions
Nf         number of components in an image
s_{i,x}    subsampling factor of component i in horizontal direction
s_{i,y}    subsampling factor of component i in vertical direction
H_i        subsampling indicator of component i in the frame header
V_i        subsampling indicator of component i in the frame header
v_{x,y}    sample value at the sample grid position x, y
4.2 Abbreviated terms
ASCII American Standard Code for Information Interchange
DC lowpass
AC highpass
LSB least significant bit
MSB most significant bit
DCT discrete cosine transformation
JPEG joint photographic experts group

5 Conventions
5.1 Conformance language
The keyword "reserved" indicates a provision that is not specified at this time, shall not be used, and may
be specified in the future. The keyword "forbidden" indicates "reserved" and in addition indicates that the
provision will never be specified in the future.
5.2 Operators
NOTE Many of the operators used in this document are similar to those used in the C programming language.
5.2.1 Arithmetic operators
+ addition
− subtraction (as a binary operator) or negation (as a unary prefix operator)
× multiplication
/ division without truncation or rounding
5.2.2 Assignment operators
= assignment operator
5.2.3 Precedence order of operators
Operators are listed in descending order of precedence. If several operators appear in the same line,
they have equal precedence. When several operators of equal precedence appear at the same level in an
expression, evaluation proceeds according to the associativity of the operator either from right to left or
from left to right.
Operators        Type of operation           Associativity
(), [ ], .       expression                  left to right
−                unary negation
×, /             multiplication              left to right
+, −             addition and subtraction    left to right
<, >, <=, >=     relational                  left to right
5.2.4 Mathematical functions
⌈x⌉ Ceiling of x. Returns the smallest integer that is greater than or equal to x.
⌊x⌋ Floor of x. Returns the largest integer that is less than or equal to x.
|x| Absolute value, is –x for x < 0, otherwise x.
sign(x) Sign of x, zero if x is zero, +1 if x is positive, –1 if x is negative.
clamp(x, min, max) Clamps x to the range [min, max]: returns min if x < min, max if x > max or otherwise x.
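The mathematical functions of 5.2.4 map directly onto short helper routines. The following Python sketch is illustrative only and not part of this document. The ceiling, floor and absolute value come from the Python standard library, while sign and clamp are written out because the standard library has no direct equivalent; the parameter names lo and hi are assumptions standing in for min and max.

```python
import math


def sign(x):
    """Sign of x: 0 if x is zero, +1 if x is positive, -1 if x is negative."""
    return (x > 0) - (x < 0)


def clamp(x, lo, hi):
    """Clamp x to the range [lo, hi]: lo if x < lo, hi if x > hi, otherwise x."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


# Ceiling, floor and absolute value are available as math.ceil, math.floor and abs.
ceiling_example = math.ceil(7 / 2)    # smallest integer >= 3.5
floor_example = math.floor(-0.5)      # largest integer <= -0.5
absolute_example = abs(-3)
```

A typical use of clamp in an image decoder would be limiting a reconstructed value to the valid sample range, for example clamp(v, 0, 255) for 8-bit samples.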

6 General
6.1 General definitions
The purpose of this clause is to give an informative overview of the elements specified in this document.
There are three elements specified in this document:
a) An encoder is an embodiment of an encoding process. An encoder takes as input digital source image
data and encoder specifications, and by means of a specified set of procedures generates as output a
codestream.
b) A decoder is an embodiment of a decoding process. A decoder takes as input a codestream, and by means
of a specified set of procedures generates as output digital reconstructed image data.
c) The codestream is a compressed image data representation which includes all necessary data to allow
a (full or approximate) reconstruction of the sample values of a digital image. Additional data might be
required that define the interpretation of the sample data, such as the spatial dimensions of the samples.
6.2 Functional overview on the decoding process
The high-level algorithm for decoding is as follows: The samples are first reconstructed following the
decoder specifications defined in Rec. ITU-T T.81 | ISO/IEC 10918-1. If the resulting component arrays are
subsampled, they are upsampled on a common sample grid following the specifications in Annex A. Following
that, the output data is processed by an inverse decorrelation transformation. If the data is already in an
RGB type colour space, e.g. RGB with ITU-R Rec. BT.601 primaries, this transformation will be the identity
transformation. Otherwise, the ICT is used to transform the data into RGB. The inverse decorrelation
transformation is defined in Annex C, and the markers that are required to select the transformation are
defined in Annex B.
6.3 Encoder requirements
An encoding process converts source image data to compressed image data. This includes first obtaining
a low dynamic range image, and representing it by a coding process specified in Rec. ITU-T T.81 |
ISO/IEC 10918-1:1994, Annex F or Annex G.
In order to comply with this document, an encoder shall satisfy at least one of the following two requirements.
An encoder shall, with appropriate accuracy, convert source image data to compressed image data which
comply with the codestream format syntax specified in Annex B for the encoding process(es) embodied by
the encoder. A limited accuracy sufficient to match the error bounds specified in the compliance tests is
acceptable.
There is no requirement in this document that any encoder which embodies one of the encoding processes
specified here shall be able to operate for all ranges of the parameters which are allowed for that process. An
encoder is only required to meet the compliance tests and to generate the compressed data format according
to Annex B for those parameter values which it does use.
6.4 Decoder requirements
A decoding process converts compressed image data to reconstructed image data. For that, it has to follow
the decoding operation specified in Rec. ITU-T T.81 | ISO/IEC 10918-1 with sufficient accuracy, using either
the baseline, sequential or progressive scan process defined in Rec. ITU-T T.81 | ISO/IEC 10918-1:1994,
Annex F or Annex G. This process generates sample values on a sample grid, which are then converted
into a digital image by following the upsampling specifications in Annex A of this document and the multi-
component decorrelation (ICT) process in Annex C of this document.

Annex A
(normative)
Component subsampling and expansion of subsampling
A.1 General
In this annex, the flowcharts and tables are normative only in the sense that they are defining an output that
implementations shall duplicate.
A.2 Component dimensions and subsampling factors
An image is defined to consist of Nf components, each of which is identified by a unique identifier C_i defined
in the frame header of the codestream format specified in Annex B. The number of components Nf shall
be either 1 or 3. A component consists of a rectangular array of samples, x_i samples wide and y_i samples high. The
component dimensions are derived from the image dimensions X and Y, also parameters recorded in the
frame header. These two parameters define a sample grid of X grid points wide and Y grid points high,
where the left topmost grid coordinate is (0, 0) and coordinates increase from left to right and from top
to bottom. However, the dimensions of the component do not need to coincide with the dimensions of the
image. For each component, two subsampling factors s_{i,x} and s_{i,y} define the spacing between sample points
of component i relative to the sample grid and the size of the component array. If X and Y are the dimensions
of the sample grid, the size of component i with subsampling factors s_{i,x} and s_{i,y} is
x_i = ⌈X / s_{i,x}⌉ and y_i = ⌈Y / s_{i,y}⌉
Upsampling by interpolation from surrounding samples as specified in Annex A then generates sample
values on all grid points of the sample grid.
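The size computation above is a ceiling division in each direction. The following Python sketch is illustrative only and not part of this document; the function name component_size and its parameter names are hypothetical. It uses integer arithmetic so that no floating-point rounding is involved.

```python
def component_size(grid_width, grid_height, s_x, s_y):
    """Return (width, height) of a component on a sample grid of
    grid_width by grid_height positions, given its horizontal and
    vertical subsampling factors s_x and s_y.

    Each dimension is the ceiling of the grid dimension divided by
    the corresponding subsampling factor.
    """
    if s_x < 1 or s_y < 1:
        raise ValueError("subsampling factors must be positive integers")
    width = (grid_width + s_x - 1) // s_x     # integer form of ceil(X / s_x)
    height = (grid_height + s_y - 1) // s_y   # integer form of ceil(Y / s_y)
    return width, height
```

For example, a 17 by 9 sample grid with both subsampling factors equal to 2 gives a component of 9 by 5 samples, because the odd grid dimensions are rounded up rather than truncated.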
The subsampling factors s_{i,x} and s_{i,y} are not directly represented in the binary codestream or any of its
markers, but shall be derived from the parameters H_i and V_i recorded in the frame header. If Nf equals 1,
i.e. the image consists of a single component, H_1 and V_1 shall be 1, and s_{1,x} and s_{1,y} are both 1. If Nf equals 3,
Table A.1 defines the relation between H_i, V_i and s_{i,x} and s_{i,y}. No other combinations of H_i and V_i than those
listed in Table A.1 shall be used.
Table A.1 — Sampling values
H_1  V_1  H_2  V_2  H_3  V_3  s_{1,x}  s_{1,y}  s_{2,x}  s_{2,y}  s_{3,x}  s_{3,y}
1    1    1    1    1    1    1        1        1        1        1        1
2    2    2    1    2    1    1        1        1        2        1        2
2    2    1    2    1    2    1        1        2        1        2        1
2    2    1    1    1    1    1        1        2        2        2        2
All other values reserved for ITU/ISO purposes.
NOTE Rec. ITU-T T.81 | ISO/IEC 10918-1 allowed other component arrangements and relations between grid
positions and sample positions that are not valid in this document. However, the definitions given here are special
cases of the more general relations provided in Rec. ITU-T T.81 | ISO/IEC 10918-1 and both definitions agree whenever
both are defined.
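Because only the combinations listed in Table A.1 are permitted, a decoder can derive the subsampling factors with a plain table lookup. The following Python sketch is illustrative only and not part of this document; the names TABLE_A1 and subsampling_factors are hypothetical, and the input is assumed to be the list of (H, V) pairs already parsed from the frame header.

```python
# Rows of Table A.1, keyed by (H_1, V_1, H_2, V_2, H_3, V_3).
# Each value lists the (horizontal, vertical) subsampling factors
# for components 1, 2 and 3.
TABLE_A1 = {
    (1, 1, 1, 1, 1, 1): ((1, 1), (1, 1), (1, 1)),
    (2, 2, 2, 1, 2, 1): ((1, 1), (1, 2), (1, 2)),
    (2, 2, 1, 2, 1, 2): ((1, 1), (2, 1), (2, 1)),
    (2, 2, 1, 1, 1, 1): ((1, 1), (2, 2), (2, 2)),
}


def subsampling_factors(indicators):
    """Map frame-header indicators [(H_1, V_1), ...] to per-component
    (horizontal, vertical) subsampling factors.

    Raises ValueError for component counts other than 1 or 3 and for
    indicator combinations not listed in Table A.1.
    """
    if len(indicators) == 1:
        if tuple(indicators[0]) != (1, 1):
            raise ValueError("a single component requires H = V = 1")
        return ((1, 1),)
    if len(indicators) != 3:
        raise ValueError("the number of components shall be 1 or 3")
    key = tuple(value for pair in indicators for value in pair)
    if key not in TABLE_A1:
        raise ValueError("indicator combination not listed in Table A.1")
    return TABLE_A1[key]
```

Rejecting unlisted combinations, rather than computing factors from a general formula, keeps the sketch within the restricted set of arrangements that this annex allows.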
A.3 Expansion of subsampled components
Whenever the subsampling factors s_{i,x} and s_{i,y} are not both 1, interpolation is used to populate all grid
positions of the image sample grid. The following bilinear interpolation algorithm can be used to provide
sample values at all sampling grid positions. Readers should be aware that the algorithm described here will
also change the sample values at sampling grid positions whose values are represented in the codestream.
This may have the effect of a continuous loss of precision of the subsampled components over multiple
compression-decompression cycles.
A.4 Bilinear expansion of subsampled components
Upsampling is performed in two steps. First, upsampling in the vertical direction if s_{i,y} is 2, generating an
intermediate image. Second, upsampling in the horizontal direction if s_{i,x} is 2, generating the final output
image from the intermediate im
...
