Information technology - JPEG 2000 image coding system: Secure JPEG 2000 - Part 8: - Amendment 1: File format security

Technologies de l'information — Système de codage d'images JPEG 2000: JPEG 2000 sécurisé — Partie 8: — Amendement 1: Sécurité de format de fichier

General Information

Status: Withdrawn
Publication Date: 14-Dec-2008
Current Stage: 9599 - Withdrawal of International Standard
Start Date: 16-Oct-2023
Completion Date: 30-Oct-2025

Standard
ISO/IEC 15444-8:2007/Amd 1:2008 - File format security
English language
31 pages

Frequently Asked Questions

ISO/IEC 15444-8:2007/Amd 1:2008 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - JPEG 2000 image coding system: Secure JPEG 2000 - Part 8: - Amendment 1: File format security".

ISO/IEC 15444-8:2007/Amd 1:2008 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.30 - Coding of graphical and photographical information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 15444-8:2007/Amd 1:2008 is related to the following standards: it has inter-standard links to ISO/IEC 15444-8:2007 and ISO/IEC 15444-8:2023, and it is listed as "excused to" ISO/IEC 15444-8:2007. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

You can purchase ISO/IEC 15444-8:2007/Amd 1:2008 directly from iTeh Standards. The document is available in PDF format and is delivered instantly after payment. Add the standard to your cart and complete the secure checkout process. iTeh Standards is an authorized distributor of ISO standards.

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 15444-8
First edition 2007-04-15
AMENDMENT 1 2008-12-15
Information technology — JPEG 2000
image coding system: Secure JPEG 2000
AMENDMENT 1: File format security
Technologies de l'information — Système de codage d'images
JPEG 2000: JPEG 2000 sécurisé
AMENDEMENT 1: Sécurité de format de fichier

Reference number
ISO/IEC 15444-8:2007/Amd.1:2008(E)
© ISO/IEC 2008
PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but
shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In
downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat
accepts no liability in this area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation
parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In
the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

©  ISO/IEC 2008
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published by ISO in 2009
Published in Switzerland

CONTENTS
1) Clause 2: Normative references
2) Clause 3: Terms and definitions
3) Annex E: File format security
Annex E – File Format Security
E.1 Scope
E.2 Introduction
E.3 Extension to ISO base media file format
E.4 Elementary stream and sample definitions
E.5 Protection at file format level
E.6 Examples (Informative)
E.7 Boxes defined in ISO/IEC 15444-12 (informative)

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 1 to ISO/IEC 15444-8:2007 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information in collaboration with ITU-T. The identical text is published as
ITU-T Rec. T.807 (05/2006)/Amd.1(E).

INTERNATIONAL STANDARD ISO/IEC 15444-8
RECOMMENDATION ITU-T T.807
Information technology – JPEG 2000 image coding system:
Secure JPEG 2000
Amendment 1
File format security
1) Clause 2: Normative references
Add the following references:
– Recommendation ITU-T T.803 (2002) | ISO/IEC 15444-4:2004, Information technology – JPEG 2000
image coding system: Conformance testing.
– ISO/IEC 13818-11:2004, Information technology – Generic coding of moving pictures and associated
audio information – Part 11: IPMP on MPEG-2 systems.
– ISO/IEC 15444-6:2003, Information technology – JPEG 2000 image coding system – Part 6: Compound
image file format.
– ISO/IEC 15444-12:2005, Information technology – JPEG 2000 image coding system – Part 12: ISO
base media file format (technically identical to ISO/IEC 14496-12).
2) Clause 3: Terms and definitions
a) Rewrite the first paragraph as follows (with the changes underlined):
For the purposes of this Recommendation | International Standard, the following definitions apply. The definitions
defined in ITU-T Rec. T.800 | ISO/IEC 15444-1 clause 3 and ISO/IEC 15444-12:2005 clause 3 apply to this
Recommendation | International Standard.
b) Add the following terms and definitions:
Normal decoder
A normal (standard) decoder is a process that decodes a codestream that is fully compliant with the normative part of
the coding standard. Its behaviour is not defined if it tries to decode a non-compliant codestream.
Adaptive-format decoder
Adaptive-format decoder is a process to decode a codestream which is not fully compliant with the normative part of
the coding standard. It shall reconstruct the media (possibly with low quality or resolution) even if the codestream has
missing packets or inconsistent packet headers. For example, an adaptive-format decoder is able to understand a
simply-transcoded codestream, such as the one that has its highest resolution packets removed.
Elementary Stream (ES)
An elementary stream contains a sequence of samples, where each sample could be a video frame or a contiguous
section of audio data. A sample in an ES contains media data, a ByteData structure, a pointer structure, a container
structure, or any mixture of these.
Self-Contained ES
Self-contained ES contains only media data, whose format is not defined in this amendment. The self-contained ES
could be stored in MDAT box co-located with the file format specified in this amendment, or be stored in a separate file
whose format is not specified by this amendment.
Rec. ITU-T T.807 (2006)/Amd.1 (03/2008) 1

Composed ES
Composed ES may contain a mixture of ByteData, pointer and container structures, that is, its samples are composed
with data from other elementary streams. A composed ES can either copy (using ByteData structure) or reference (using
pointer) data from other ESes.
Scalable Composed ES
Scalable composed ES is made up of samples that may not be decodable by themselves. It may need to be combined
with other scalable composed ESes to form a fully decodable codestream. Scalable composed ES is designed to support
scalability, i.e., to make media data "thinable". For example, for a motion JPEG 2000 codestream where each picture
has three layers, it can be divided into 3 scalable composed ESes: the first one consists of all layer 0 data, the second
one consists of all layer 1 data and the third one consists of all layer 2 data.
Decodable Composed ES
Decodable composed ES is made up of samples that are decodable by themselves. It is designed for simple adaptation
where the adaptor just needs to retrieve data pointed by pointer structure and remove the wrapper to form a fully
scalable codestream. For example, for a motion JPEG 2000 codestream where each picture has three layers, it can form
3 decodable composed ESes: the first one consists of layer 0 data, the second one consists of layer 0 and layer 1 data
and the third one consists of layer 0, 1 and 2 data.
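The three-layer example used in the two definitions above can be sketched in Python. This is only an illustration under stated assumptions: the function names and the representation of an ES as a list of layer indices are hypothetical and are not part of the file format.

```python
def scalable_composed_layers(num_layers):
    """Layers carried by each scalable composed ES: ES k holds only layer k."""
    return [[k] for k in range(num_layers)]


def decodable_composed_layers(num_layers):
    """Layers carried by each decodable composed ES: ES k holds layers 0..k."""
    return [list(range(k + 1)) for k in range(num_layers)]


def layers_after_combining(es_indices, num_layers):
    """Layers available once the chosen scalable composed ESes are combined."""
    per_es = scalable_composed_layers(num_layers)
    return sorted(layer for i in es_indices for layer in per_es[i])
```

In this sketch a scalable composed ES other than the first is not decodable alone, because it lacks the lower layers, while every decodable composed ES already contains them.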
Adaptor/transcoder
Adaptor/transcoder is a process to transform media data to lower scalability level, like lower resolution or lower quality
or bit-rate, by removing portions of the file. The adaptor/transcoder can transform media data based on the information
specified in this amendment. An adaptor/transcoder shall update byte offset values in file format parameters that are
impacted by the process.
Secure adaptor/transcoder
Secure adaptor/transcoder is a process to transform encrypted or authenticated media data without needing to decrypt
the data or regenerate the MAC or signature. Thus, end-to-end security is preserved for the transcoded media data.
JPEG 2000-aware adaptor/transcoder
JPEG 2000-aware adaptor/transcoder combines one or more scalable composed ESes to form a fully decodable media
codestream. It should have the capability to generate the headers and markers of media codestream and modify the
packet index, such that the adapted codestream can be decoded by a normal decoder. It may also add empty packets to
replace the removed ones, or it may insert POC marker.
Simple adaptor/transcoder
Simple adaptor/transcoder is able to transform data based on information specified by this amendment. It may not be
capable of generating media headers or modifying packet indices. It simply retrieves data pointed by pointer structure
and removes the wrappers, and the resulting codestream can be decoded by adaptive-format decoder, which can cope
with missing packets and inconsistent headers.
Authentication adaptor/transcoder
An authentication adaptor/transcoder removes data that is not verifiable with the available media data and
authentication data. For example, in a streaming system, some media packets may be lost during transmission. A file
format receiver may reconstruct the received data to the best of its ability based on the available data. Then, an
authentication adaptor/transcoder can determine which data can be verified, and then remove the packets that are not
verified. The resulting file only contains the decodable, verified data.
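The behaviour described for an authentication adaptor/transcoder might look roughly like the following Python sketch. It assumes one MAC per packet and HMAC with SHA-256; both are assumptions made for illustration, and the function name is hypothetical. The amendment signals the actual method through the authentication box parameters.

```python
import hashlib
import hmac


def keep_verified_packets(packets, macs, key):
    """Return only the packets whose HMAC-SHA-256 matches the stored MAC.

    packets: list of bytes objects, or None where a packet was lost.
    macs:    list of expected MAC values, one per packet (an assumption).
    """
    verified = []
    for packet, expected in zip(packets, macs):
        if packet is None:
            continue  # lost in transmission, nothing to verify
        actual = hmac.new(key, packet, hashlib.sha256).digest()
        if hmac.compare_digest(actual, expected):
            verified.append(packet)
    return verified
```

Packets that were lost or that fail verification are dropped, so the output contains only verified data.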
Container
Container structure is used to wrap a sample in a composed ES. It might contain any number of ByteData or pointer
structures, but is not allowed to contain another container structure.
Pointer
Pointer structure is used to reference a data segment in another ES. It must be contained inside a container structure.
ByteData
ByteData structure is used to wrap a data segment which is physically located in a composed ES. It must be contained
inside a container structure.
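The relationship between the container, pointer and ByteData structures can be modelled with a small Python sketch. The field names (es_id, offset, length) and the resolve_sample helper are hypothetical; the normative syntax of these structures is given elsewhere in the annex and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class ByteData:
    data: bytes          # bytes physically stored in the composed ES


@dataclass
class Pointer:
    es_id: int           # hypothetical identifier of the referenced ES
    offset: int          # start of the referenced segment in that ES
    length: int          # length of the referenced segment in bytes


@dataclass
class Container:
    parts: List[Union[ByteData, Pointer]]   # no nested Container allowed


def resolve_sample(container: Container, streams: Dict[int, bytes]) -> bytes:
    """Strip the wrappers and return the media bytes of one sample."""
    out = bytearray()
    for part in container.parts:
        if isinstance(part, ByteData):
            out += part.data
        elif isinstance(part, Pointer):
            source = streams[part.es_id]
            out += source[part.offset:part.offset + part.length]
        else:
            raise TypeError("a container may hold only ByteData or Pointer")
    return bytes(out)
```

This mirrors what the definitions call a simple adaptor: it copies ByteData, follows pointers and removes the wrappers, without touching the media headers.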
4CC Code
4CC code is a 32-bit identifier, normally 4 printable characters. A 4CC code can be used to indicate the file type, the
type of file format box, type of a file format track, type of a file format sample description and type of file format track
reference. A 4CC code must be registered with a registration authority.
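As a brief illustration, a 4CC code can be converted to and from its 32-bit integer form in Python. The sketch assumes big-endian byte order, which is what the ISO base media file format uses for its fields; the function names are hypothetical.

```python
import struct


def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into its 32-bit big-endian integer form."""
    raw = code.encode("latin-1")
    if len(raw) != 4:
        raise ValueError("a 4CC code must be exactly 4 bytes")
    return struct.unpack(">I", raw)[0]


def int_to_fourcc(value: int) -> str:
    """Inverse of fourcc_to_int."""
    return struct.pack(">I", value).decode("latin-1")
```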
3) Annex E: File format security
Create a new annex and add the following text:
Annex E
File Format Security
(This annex forms an integral part of this Recommendation | International Standard)
E.1 Scope
This annex specifies the JPSEC file format, derived from the ISO base file format, and modifications to the JPEG family
file formats (including JP2, JPX and JPM) for protection and secure adaptation of scalable pictures, which may be
encrypted and/or authenticated by the owner. The pictures may be either static or time-sequenced. In
particular, this annex provides functionality to do the following:
• To store coded media data corresponding to different scalability levels. Elementary stream (ES) is used
for this purpose. There are three types of ESes, self-contained ES, scalable composed ES and decodable
composed ES.
• To define tracks describing the characteristics of the coded media data stored in ES. For example, the
track should be able to indicate scalability level (resolution, layer, region, etc.) and the rate-distortion
hints of the coded media data in order to facilitate easy and secure adaptation.
• To define new file format boxes to signal protection tools and parameters applied to coded media data or
metadata. The protection tools can be applied to either static JPEG 2000 pictures or time-sequenced
JPEG 2000 pictures.
• To apply the protection tools defined in this amendment to JPEG family file formats, including
JP2, JPX and JPM, and to ISO-derived file formats such as MJ2 for motion JPEG.
E.2 Introduction
E.2.1 Security protection at file format level
This annex describes a JPSEC file format derived from the ISO base file format and modifications to JPEG family file
format, to add security protection to JPEG 2000 pictures at the file format level. The protection applied at the file
format level can be classified into two types: item-based protection and sample-based protection. Both structures are
defined by the ISO base file format. Item-based protection is designed to protect any byte ranges (including coded
media data and metadata), while sample-based protection is designed to protect time-sequenced media including
JPEG 2000 pictures.
When the applied security tools change the data length, all pointers and length fields in all boxes shall be updated to
ensure correct parsing by the reader.
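The offset bookkeeping implied by this requirement can be sketched as follows. The example assumes a list of non-overlapping (offset, length) byte ranges and a protection tool that changes the length of exactly one of them, for instance through cipher padding. The function name and data layout are hypothetical.

```python
def apply_length_change(extents, index, delta):
    """Return updated (offset, length) pairs after extents[index] grew
    (delta > 0) or shrank (delta < 0) by `delta` bytes.

    The changed extent gets a new length; every extent that starts at or
    after its original end is shifted by `delta`. Extents are assumed not
    to overlap.
    """
    changed_offset, changed_length = extents[index]
    changed_end = changed_offset + changed_length
    updated = []
    for i, (offset, length) in enumerate(extents):
        if i == index:
            updated.append((offset, length + delta))
        elif offset >= changed_end:
            updated.append((offset + delta, length))
        else:
            updated.append((offset, length))
    return updated
```

A real writer would apply the same adjustment to every box that stores an offset or size, not only to one list of extents.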
E.2.2 Item-based protection
This annex describes two item-based protection schemes in the ISO base file format, by leveraging the syntax and
structures specified by the JPSEC standard. Specifically, it describes schemes for decryption and authentication. Each
item in the ItemLocationBox is protected by one or more protection schemes in the ItemProtectionBox. When multiple
schemes are used (or chained together), the order in which they are applied may be significant and thus must be
specified. This annex also specifies how such operations should be chained together. In addition, the
ItemDescriptionBox and ItemCorrespondenceBox are added into the ISO base file format to allow the flexible processing
properties that are provided by JPSEC. Specifically, the ItemDescriptionBox allows media-dependent metadata (such as
resolution, quality layer, spatial region, and color space component) to be associated with different portions of the file.
These descriptions can be provided regardless of whether protection is applied. When used with scalable coded pictures,
this allows the file to be scaled down or transcoded without parsing or decoding the media data. In cases where
protection is applied, this provides the benefit of enabling transcoding without requiring decryption.
E.2.3 Sample-based protection of scalable media
For time-sequenced pictures, this annex adds syntaxes to facilitate scalability at the file format level, including scalable
composed elementary stream (ES), decodable composed ES, pointer structure, container structure and ByteData
structure. The scalable coded pictures can be divided (either physically or virtually) into elementary streams at different
scalability level, such that the adaptor/transcoder can "thin" media data with low complexity.
Figure E.1 gives an overview of the file format specified by this annex and also shows how the specified file format is
used to adapt the media data.
Given a sequence of JPEG 2000 pictures (also referred to as Self-contained ES), there are two approaches to construct
the file format. In the first approach, the MDAT box contains one or more Scalable Composed ESes, each of which
corresponds to one scalability level of the media data, e.g., a resolution or a layer. The scalable composed ES must be
stored in MDAT box that is co-located with the file format. The self-contained ES can be located in either MDAT box
in the same file, or a different file whose format is not specified in this amendment. The scalable composed ES may not
be decodable by itself; it may need to be combined with other scalable composed ESes to generate fully decodable
JPEG 2000 pictures. In the second approach, the MDAT box contains one or more Decodable Composed ESes, and
each ES constitutes fully decodable JPEG 2000 pictures by itself. Similarly, decodable composed ESes must be stored
in MDAT box co-located with the file format, and the self-contained ES can be stored in either MDAT box in the same
file, or a different file whose format is not specified by this annex.
Each scalable composed ES or decodable composed ES must be described by at least one track. The characteristics of
the ES (like resolution, layer, and region) are indicated in SampleEntryBox inside each track.
To generate a fully decodable JPEG 2000 codestream from scalable composed ESes, a JPEG 2000-aware adaptor
should have the capability to dynamically generate the image headers (based on the number of resolutions, layers and
region in the adapted codestream), to insert empty packets or to insert POC markers as needed to make the resulting
codestream decodable by any standard decoder. However, if a simple adaptor is used, the resulting codestream may
have an inconsistent image header and missing packets, which requires a JPEG 2000 adaptive-format
decoder.
As a decodable composed ES is decodable by itself, a simple adaptor is sufficient to generate fully compliant
JPEG 2000 pictures.

Figure E.1 – System diagram for time-sequenced scalable media
Each elementary stream is described by at least one media track, and its characteristics are described in
SampleDescriptionEntry or SampleGroupEntryBox within the track. It is possible that a single elementary stream is
described by multiple tracks, each of which may describe different aspects of the elementary stream.
The sample-based protection can be applied to all samples or a group of samples in a scalable composed ES or
decodable composed ES. If protection is applied to all samples, a ProtectionSchemeInfoBox signalling the parameters
of the protection tool is added to the SampleDescriptionBox, which is then encapsulated as described in E.5.2. In
addition, if protection is applied to a group of samples, a ProtectionSchemeInfoBox is added to their
SampleGroupEntryBox, which is then encapsulated as described in E.5.4.
E.3 Extension to ISO base media file format
E.3.1 Overview
This subclause documents technical extensions (additional box types) to the ISO base media file format, which could
be used for protection, adaptation, or secure adaptation of scalable coded pictures. However, the added box types could
be used for other purposes as well. In particular, this subclause defines ProtectionSchemeInfoBox for the decryption
tool and authentication tool, ItemDescriptionBox, ScalableSampleDescriptionEntry, ScalableSampleGroupEntry, and
Generic Protected Box. All other boxes defined in ISO/IEC 15444-12 are still used as is.
E.3.2 Incorporate JPSEC codestream into ISO-derived file format
A JPSEC codestream can be placed as a payload in the 'mdat' box of the ISO base file format. In the Sample
Description Box ('stsd'), the 'codingname' of the corresponding Sample Entry is defined to be 'jpsc', which is a
registered identifier for the JPSEC decoder. In this case, the security service is provided by JPSEC at the codestream level.
E.3.3 Protected file format brand
Files conforming to this Recommendation | International Standard may use 'ffsc' as the major brand in the File Type
Compatibility Box.
Files conforming to this Recommendation | International Standard, i.e., containing protection or authentication
information may use 'ffsc' as a compatible brand in the File Type Compatibility Box.
There are uses of this Recommendation | International Standard which are compatible with JP2, JPX, MJ2, and JPM
files. A typical use of this Recommendation | International Standard will leave the major brand of a file unchanged, but
add boxes and thus add 'ffsc' as a compatible brand.
Thus brands including 'isom', 'iso2', 'jp2\040', 'jpx\040' and 'jpm\040' should be compatible.
The 'ffsc' compatible brand indicates the use of new boxes and new tools corresponding to the protection methods in
JPSEC.
A file that has been protected, to the extent that an application intending to process the JP2, JPX, JPM, or other file type
content will be unable to do so without using protection tools, may use the 'ffsc' major brand as the file type; such a
protected file must not use a major brand for which it is no longer conformant.
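A reader could check for the 'ffsc' brand along the following lines. The sketch assumes the file type box layout of ISO/IEC 15444-12 (32-bit size, 'ftyp', major brand, minor version, then 4-byte compatible brands) and that the input starts at that box. It does not handle the 64-bit size form or any boxes that precede the file type box. The function names are hypothetical.

```python
import struct


def read_brands(data: bytes):
    """Parse a file type box located at the start of `data`.

    Returns (major_brand, compatible_brands) as bytes values.
    """
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("expected a file type box")
    major = data[8:12]
    # Bytes 12..15 hold the minor version; brands follow in 4-byte steps.
    compatible = [data[i:i + 4] for i in range(16, size, 4)]
    return major, compatible


def uses_ffsc(data: bytes) -> bool:
    """True if 'ffsc' is the major brand or one of the compatible brands."""
    major, compatible = read_brands(data)
    return major == b"ffsc" or b"ffsc" in compatible
```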
E.3.4 Summary of boxes used
The ISO base media file format defines two structures to describe a presentation: the logical structure and media
sequence structure. The logical structure uses the ItemLocationBox ('iloc') to describe an item which is the byte range or
a series of byte ranges for a particular file, either a local file or a remote file. The media sequence structure uses the
SampleGroupDescriptionBox ('sgpd') or SampleDescriptionBox ('stsd') to describe the samples, which could be a frame
of video, a time-contiguous series of video frames, or a time-contiguous compressed section of audio.
Accordingly, the protection in the ISO base media file format level is classified into item-based protection and sample-
based protection, as described in E.5.2 and E.5.4, respectively.
Several boxes are used from ISO/IEC 15444-12; these are marked as "Existing" in Table E.1. Boxes defined in this
Recommendation | International Standard are listed as "New" in Table E.1. The definitions for these boxes depend on
the definitions of Box and FullBox from ISO/IEC 15444-12, which are repeated for convenience in E.7.
Table E.1 – List of existing and new boxes
Box name                        Status    Remarks
meta                            Existing  Metadata
iloc                            Existing  Item location
ipro                            Existing  Item protection
sinf                            Existing  Protection scheme information box
frma                            Existing  Original format box
schm                            Existing  Scheme type box
schi                            Existing  Scheme information box
gran                            New       Granularity box
vall                            New       Value List box
bcip                            New       Block cipher box
keyt                            New       Key template box
scip                            New       Stream cipher box
keyt                            New       Key template box
auth                            New       Authentication box
keyt                            New       Key template box
iinf                            Existing  Item information box
ides                            New       Item description box
dest                            New       Description type box
desd                            New       Description data box
vide                            New       Visual item description entry
j2ke                            New       JPEG 2000 item description entry
icor                            New       Item correspondence box
…                               …         …
stbl                            Existing  Sample table box
stsd                            Existing  Sample description box
ScalableSampleDescriptionEntry  New       Scalable sample description entry
sbgp                            Existing  Sample to group box
sgpd                            Existing  Sample group box
ScalableSampleGroupEntry        New       Scalable sample group entry
gprt                            New       Generic Protected box
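Every entry in Table E.1 is a box in the sense of ISO/IEC 15444-12, so a reader can locate these boxes with a generic box walker. The Python sketch below assumes the usual box header (32-bit size, 4-byte type, size 1 meaning a 64-bit size follows, size 0 meaning the box runs to the end of the enclosing range). The function name is hypothetical.

```python
import struct


def iter_boxes(data: bytes, start: int = 0, end: int = None):
    """Yield (type, payload_offset, payload_length) for each box in a range."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:                       # 64-bit largesize follows the type
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:                     # box extends to the end of the range
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type.decode("latin-1"), pos + header, size - header
        pos += size
```

Calling iter_boxes again on the payload range of a container box such as 'meta' or 'sinf' would walk its children, subject to any extra header fields that box defines.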
E.3.5 Decryption scheme
The Decryption protection scheme is identified in SchemeTypeBox as follows:

scheme_type="decr"
scheme_version=0
scheme_uri=null
For the Decryption protection scheme, the structure of SchemeInfoBox is as follows:

aligned(8) class GranularityBox extends Box('gran') {
    unsigned int(8) granularity;
}
Semantics:
granularity specifies the processing unit. For item-based protection, 0 indicates that the processing unit is the
entire item and 1 indicates that the processing unit is one extent within an item. For sample-based protection, 0 indicates
that the processing unit is all samples in the track or sample group and 1 indicates that the processing unit is one sample.

aligned(8) class ValueListBox extends Box('vall') {
    unsigned int(8) value_size;
    unsigned int(16) value_count;
    unsigned int(16) count [value_count];
    unsigned char (value_size) value[value_count];
}
Semantics:
value_size is the size in bytes of each value in the array.
value_count is the number of (count, value) pairs in the array. For item-based protection, the (count,
value) pairs map each value to count processing units. For sample-based protection, the (count,
value) pairs map each value to count samples. For instance, value[0] corresponds to the first
count[0] samples or units, value[1] corresponds to the next count[1] samples or units, and so on.
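The (count, value) mapping is a run-length scheme, which the following Python sketch illustrates. The function names are hypothetical, and the values are treated as opaque byte strings.

```python
def expand_value_list(counts, values):
    """Expand (count, value) pairs into one value per processing unit."""
    if len(counts) != len(values):
        raise ValueError("count and value arrays must have the same length")
    per_unit = []
    for count, value in zip(counts, values):
        per_unit.extend([value] * count)
    return per_unit


def value_for_unit(counts, values, unit_index):
    """Look up the value for one unit without expanding the whole list."""
    remaining = unit_index
    for count, value in zip(counts, values):
        if remaining < count:
            return value
        remaining -= count
    raise IndexError("unit index beyond the units covered by the list")
```

For example, with counts [2, 3] the first two units share the first value and the next three share the second.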

aligned(8) class KeyTemplateBox extends Box('keyt') {
    unsigned int(16) key_size;
    unsigned int(8) key_info;
    GranularityBox GL;  //optional
    ValueListBox VL;
}
Semantics:
key_size is the size of key in bits.
key_info indicates the meaning of the values in the ValueListBox. 1 means the values are X.509 certificates; 2 means
the values are URIs for certificates or secret keys.
GL is a GranularityBox.
VL is a ValueListBox, containing a list of values, whose meaning is defined by the key_info field.

aligned(8) class BlockCipherBox extends Box('bcip') {
    unsigned int(16) cipher_id;
    bit (6) cipher_mode;
    bit (2) padding_mode;
    unsigned int(8) block_size;
    KeyTemplateBox KT;
}
Semantics:
cipher_id identifies which block cipher algorithm is used for protection. Values are defined in Table 25.
cipher_mode could be ECB, CBC, CFB, OFB or CTR. Values are defined in Table 29.
padding_mode is ciphertext stealing or PKCS#7-padding. Values are defined in Table 30.
block_size is the block size of the block cipher.
KT is a KeyTemplateBox, holding all the key information used by the block cipher.
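The 6-bit cipher_mode and 2-bit padding_mode fields share a single byte. A possible way to pack and unpack them in Python is sketched below. It assumes the first-declared field occupies the most significant bits, and the numeric values in the usage are placeholders, since the mode and padding tables are not reproduced here. The function names are hypothetical.

```python
def pack_mode_byte(cipher_mode: int, padding_mode: int) -> int:
    """Pack the 6-bit cipher_mode and 2-bit padding_mode into one byte.

    Assumes the first-declared field occupies the most significant bits.
    """
    if not 0 <= cipher_mode < 64:
        raise ValueError("cipher_mode must fit in 6 bits")
    if not 0 <= padding_mode < 4:
        raise ValueError("padding_mode must fit in 2 bits")
    return (cipher_mode << 2) | padding_mode


def unpack_mode_byte(value: int):
    """Return (cipher_mode, padding_mode) from the packed byte."""
    return value >> 2, value & 0b11
```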

aligned(8) class StreamCipherBox extends Box('scip') {
    unsigned int(8) cipher_type;
    unsigned int(16) cipher_id;
    KeyTemplateBox KT;
}
Semantics:
cipher_type indicates the type of cipher used. It has values cipher_type = STRE for stream cipher or
cipher_type = ASYM for asymmetric cipher.
cipher_id identifies the stream cipher algorithm used for the protection. If cipher_type = STRE, see Table 26; if
cipher_type = ASYM, see Table 27.
KT is a KeyTemplateBox, holding all the key information used by the stream cipher.

aligned(8) class SchemeInformationBox extends Box('schi', cipher_id) {
    unsigned int(8) MetaOrMedia;
    unsigned int(8) HeaderProtected;
    BlockCipherBox(); or StreamCipherBox();
    GranularityBox GL;
    ValueListBox VL;
}
Semantics:
MetaOrMedia indicates whether the protected data segment corresponds to a media data segment or to metadata
boxes (0 for media data and 1 for metadata boxes).
HeaderProtected indicates whether the protection is applied to the box content only (value 0) or to the
whole box including its header (value 1).
The SchemeInformationBox can contain a BlockCipherBox, or a StreamCipherBox, which are containers for the
parameters of the cipher algorithms. These boxes can only contain particular cipher_id values.
GL is a GranularityBox, holding information about the processing unit. This field is optional for sample-based
protection and required for item-based protection.
VL is a ValueListBox. For block cipher, this box may be empty. For stream cipher, this box contains all the initial
vectors.
E.3.6 Authentication scheme
The Authentication protection scheme is identified by SchemeTypeBox as follows:

scheme_type="auth"
scheme_version=0
scheme_uri=null
For Authentication protection scheme, the structure of SchemeInfoBox is defined as follows:

aligned(8) class AuthBox extends Box('auth') {
    unsigned int(8) auth_type;  //hash, cipher, or signature
    unsigned int(8) method_id;
    unsigned int(8) hash_id;
    unsigned int(16) MAC_size;
    KeyTemplateBox KT;
}
Semantics:
auth_type indicates authentication type, including hash-based (HASH), cipher-based (CIPH) and signature-based
(SIGN) authentication.
method_id identifies the authentication method. 1 indicates HMAC. If auth_type = HASH, see Table 36; if
auth_type = CIPH, see Table 39; if auth_type = SIGN, see Table 41.
hash_id identifies the hash function used. If auth_type = HASH or SIGN, see Table 37; if auth_type = CIPH,
see Table 25.
MAC_size is size of MAC (if auth_type = HASH or CIPH) or digital signature (if auth_type = SIGN) in bits.
KT is a KeyTemplateBox, holding all the key information for the authentication.

aligned(8) class SchemeInfoBox extends Box('schi', auth_method) {
    unsigned char (8) auth_method;
    AuthBox authBox;
    GranularityBox GL;  //optional
    ValueListBox VL;
}
Semantics:
auth_method identifies the authentication method used. 0 is for hash-based MAC; 1 is for cipher-based MAC; 2 is
for digital signature. Depending on the auth_method, this box could contain either HashAuthBox, CipherAuthBox,
or SignatureAuthBox.
GL is a GranularityBox. The field is optional for sample-based protection and required for item-based protection.
VL is a ValueListBox, holding all the MACs or signatures.
E.3.7 ItemDescriptionBox
In the Item Location Box, all the items are specified as byte ranges (using the offset and length). The 'iloc' box does not
contain the content-related information about the specified byte ranges, which is required by some protection methods.
For instance, if a secure transcoding method needs to discard the least important layer or resolution, it has to know which
byte ranges correspond to the layer or resolution to be discarded.
As such, this subclause defines two new boxes to enable the content-related processing at the file format level: Item
Description Box ('ides') and Item Correspondence Box ('icor'). The 'ides' box specifies the content-related information,
like layer, resolution (for visual content), period (audio content), and so on. The 'icor' box links the content-related
information to the items in 'iloc' box, in the same way as the 'iinf' box links 'iloc' box to 'ipro' box.
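The benefit of pairing item locations with item descriptions can be illustrated with a short Python sketch. The dictionaries below stand in for parsed 'iloc' and 'ides' contents; their layout and the function name are hypothetical, and each item is assumed to carry exactly one quality layer.

```python
def items_to_keep(locations, descriptions, max_layers):
    """Select the byte ranges that survive when only the first
    `max_layers` quality layers are kept.

    locations:    {item_id: (offset, length)}, an 'iloc'-style mapping
    descriptions: {item_id: layer_index}, an 'ides'-style mapping
    """
    kept = []
    for item_id, (offset, length) in sorted(locations.items()):
        if descriptions[item_id] < max_layers:
            kept.append((item_id, offset, length))
    return kept
```

The selection uses only file format metadata, so it works the same way whether or not the referenced byte ranges are encrypted.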
The Item Description box ('ides') is defined as follows:

aligned(8) class ItemDescriptionBox extends Box('ides') {
unsigned int(32) entry_count;
for(i=0; i < entry_count; i++) {
DescriptionTypeBox desType;
unsigned int(32) item_ID;
DescriptionDataBox desData;
}
}
Semantics:
entry_count is the number of entries in the ItemDescriptionBox.
item_ID references an Item in ItemLocationBox. If this field is 0, then item_ID will be specified by
ItemCorrespondenceBox.
aligned(8) class DescriptionTypeBox extends Box('dest') {
unsigned int(32) description_type;
unsigned int(32) description_version;
unsigned int(8) description_uri[];
}
Semantics:
description_type is the 4CC code defining the description scheme.
description_version is the version of the description.
description_uri directs the user to a web page when the description definition is not installed on their
system. It is an absolute URI, encoded as a null-terminated UTF-8 string.
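The DescriptionTypeBox fields can be read as in the following informative Python sketch. It assumes big-endian integers, as is usual for ISO base media file format boxes, and that the box size and type fields have already been consumed; the function name is hypothetical.

```python
import struct


def parse_description_type_payload(payload: bytes):
    """Parse the DescriptionTypeBox fields that follow the box header.

    Assumes big-endian integers and that `payload` excludes the box size
    and type fields.
    """
    # description_type is a 4CC, description_version a 32-bit integer.
    description_type, description_version = struct.unpack_from(">4sI", payload, 0)
    uri_bytes = payload[8:]
    end = uri_bytes.find(b"\x00")
    if end < 0:
        raise ValueError("description_uri is not null-terminated")
    return {
        "description_type": description_type.decode("ascii"),
        "description_version": description_version,
        "description_uri": uri_bytes[:end].decode("utf-8"),
    }
```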

aligned(8) class DescriptionDataBox extends Box('desd') {
Box description_specific_data [];
}
Semantics:
If description_type = 'vide', this is a VisualItemDescriptionEntry; if description_type = 'j2ke', this is a
J2KItemDescriptionEntry; for any other value of description_type, the syntax of the description can be found at
description_uri.
The VisualItemDescriptionEntry is defined as follows:

aligned(8) class VisualItemDescriptionEntry extends Box('vide') {
unsigned int(16) layer_start;
unsigned int(16) layer_count;
unsigned int(16) res_start;
unsigned int(16) res_count;
unsigned int(16) horizontal_offset;
unsigned int(16) horizontal_length;
unsigned int(16) vertical_offset;
unsigned int(16) vertical_length;
unsigned int(16) color_space;
unsigned int(16) time_start;
unsigned int(16) time_length;
}
Semantics:
The layer_start and layer_count together specify the range of layers. When layer_start equals 2^16 – 1,
the layer range starts from layer 0; when layer_count equals 2^16 – 1, the layer range ends at the last layer.
When both layer_start and layer_count equal 2^16 – 1, the layer range includes all layers.
The res_start and res_count together specify the range of resolutions. The semantics are the same as for
layer_start and layer_count when their value equals 2^16 – 1.
The horizontal_offset, horizontal_length, vertical_offset and vertical_length together
specify the spatial area. The semantics are the same as for layer_start and layer_count when their value equals
2^16 – 1.
The color_space specifies the color space. 0: red color space; 1: green color space; 2: blue color space.
The intersection is applied to the layer ranges, resolution ranges, areas and color space to get the portion of the
image/video specified by the VisualItemDescriptionEntry.
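The start/count semantics above can be illustrated with a short, informative Python sketch. It assumes that the reserved "all" value is the largest value of an unsigned 16-bit field (0xFFFF), that the total number of layers or resolutions is known to the caller, and that ranges are clamped to that total; the names are hypothetical.

```python
ALL = 0xFFFF  # assumed sentinel: the largest value of an unsigned 16-bit field


def resolve_range(start: int, count: int, total: int) -> range:
    """Resolve a (start, count) pair against `total` available indices.

    `total` (for example the number of layers in the image) is supplied by
    the caller; it is not carried in the entry itself. Clamping to `total`
    is an assumption of this sketch.
    """
    first = 0 if start == ALL else start
    last = total if count == ALL else min(first + count, total)
    return range(first, last)
```

For example, with five layers, resolve_range(ALL, ALL, 5) covers layers 0 to 4, while resolve_range(1, 2, 5) covers layers 1 and 2.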
The J2KItemDescriptionEntry is defined as follows:

aligned(8) class J2KItemDescriptionEntry extends Box('j2ke') {
VisualItemDescriptionEntry visualDesEntry; //optional
unsigned int(16) tile_start;
unsigned int(16) tile_count;
unsigned int(16) precinct_start;
unsigned int(16) precinct_count;
unsigned int(16) j2k_packet_start;
unsigned int(16) j2k_packet_count;
}
Semantics:
visualDesEntry specifies image/video-specific attributes. It is optional.
tile_start and tile_count specify the tiles.
precinct_start and precinct_count specify the precincts.
j2k_packet_start and j2k_packet_count specify the JPEG 2000 defined packets.
Similar to VisualItemDescriptionEntry, the intersection is applied to tiles, precincts and JPEG 2000 packets to get the
portion of the JPEG 2000 codestream specified by J2KItemDescriptionEntry.
The ItemCorrespondenceBox ('icor') is defined as follows:

aligned(8) class ItemCorrespondenceEntry extends Box('icor') {
unsigned int(16) item_ID;
unsigned int(16) desc_ID;
}
Semantics:
item_ID points to one item in the ItemLocationBox.
desc_ID points to one description entry in the ItemDescriptionBox.
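The linking role of the 'icor' box can be sketched as follows (informative only). The sketch assumes big-endian fields and a payload that excludes the box header. How desc_ID indexes the entries of the ItemDescriptionBox is not spelled out above, so a plain dictionary lookup is assumed; both function names are hypothetical.

```python
import struct


def parse_item_correspondence(payload: bytes):
    """Return (item_ID, desc_ID) from the 4-byte payload of an 'icor' box."""
    return struct.unpack(">HH", payload[:4])


def descriptions_by_item(correspondences, descriptions):
    """Group description entries by the item they describe.

    `correspondences` is an iterable of (item_ID, desc_ID) pairs and
    `descriptions` maps desc_ID to an already-parsed description entry.
    """
    result = {}
    for item_id, desc_id in correspondences:
        result.setdefault(item_id, []).append(descriptions[desc_id])
    return result
```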
E.3.8 ScalableSampleDescriptionEntry Box
The ScalableSampleDescriptionEntry is used to describe characteristics associated with scalable composed ES or
decodable composed ES, like resolution levels, quality layers, cropped region. For scalable composed ES, res and
layer indicate the media data of that particular resolution and layer, while for decodable composed ES, they indicate
the highest resolution and highest quality layer.
When the media samples are protected by a protection tool, the ScalableSampleDescriptionEntry is encapsulated as
follows:
The four-character-code of the ScalableSampleDescriptionEntry is replaced with another four-character-code indicating
protection encapsulation, which varies only by media type, as defined below:
• Encv: to indicate that video samples are encrypted and thereby un-protection must be applied to get
meaningful media data.
• Autv: to indicate that video samples are authenticated and the media data can still be meaningful before
un-protection.
• Enct: to indicate that text samples are encrypted.
• Autt: to indicate that text samples are authenticated.
• Encs: to indicate that system samples are encrypted.
• Auts: to indicate that system samples are authenticated.
A ProtectionSchemeInfoBox is added to the ScalableSampleDescriptionEntry, leaving all other boxes unmodified.
The original sample entry type is stored in the OriginalFormatBox within the ProtectionSchemeInfoBox.
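The encapsulation steps above can be sketched in Python (informative only). The sample entry is modelled as a hypothetical dictionary, and the four-character codes are written in lower case, which is an assumption based on common ISO base media file format practice; the list above capitalizes them.

```python
# Assumed lower-case spellings of the codes listed above.
PROTECTED_CODES = {
    ("video", "encrypted"): "encv",
    ("video", "authenticated"): "autv",
    ("text", "encrypted"): "enct",
    ("text", "authenticated"): "autt",
    ("system", "encrypted"): "encs",
    ("system", "authenticated"): "auts",
}


def encapsulate(entry: dict, media: str, protection: str, scheme_info: dict) -> dict:
    """Return a copy of a sample entry with protection encapsulation applied.

    `entry` is a hypothetical dictionary with a "type" key holding the
    four-character code and a "boxes" list holding child boxes.
    """
    # The original code is kept inside the ProtectionSchemeInfoBox.
    sinf = dict(scheme_info, original_format=entry["type"])
    return {
        **entry,
        "type": PROTECTED_CODES[(media, protection)],
        "boxes": list(entry.get("boxes", [])) + [{"ProtectionSchemeInfoBox": sinf}],
    }
```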

class ScalableSampleDescriptionEntry(codingname) extends VisualSampleEntry(codingname) {
unsigned int(8) res;
unsigned int(8) layer;
unsigned int(32) cropped_width, cropped_height;
if(cropped_width > 0 && cropped_height > 0) {
unsigned int(32) startx;
unsigned int(32) starty;
}
ProtectionSchemeInfoBox protectionSchemes[]; //optional
}
Semantics:
codingname is "sces" if the track is referring to a scalable composed ES and "dces" if the track is referring to a
decodable composed ES.
res is the resolution of the described samples. A value of –1 indicates all resolution levels.
layer is the quality layer of the described samples. A value of –1 indicates all quality layers.
cropped_width is the width of the cropped region.
cropped_height is the height of the cropped region.
startx & starty indicate the position of the top-left corner of the cropped region. If either cropped_width or
cropped_height is zero, the startx and starty will not be present.
protectionSchemes is a list of ProtectionSchemeInfoBoxes to indicate the protection tools applied to the
described samples. The un-protection process has to follow the order in which it appears in the list.
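The conditional presence of startx and starty can be illustrated with the following informative Python sketch. It assumes big-endian integers and that parsing starts just past the inherited VisualSampleEntry fields; the function name is hypothetical, and any ProtectionSchemeInfoBoxes that follow are left to the caller.

```python
import struct


def parse_scalable_fields(buf: bytes, offset: int = 0):
    """Parse res, layer and the cropped region starting at `offset`.

    Assumes big-endian integers and that `offset` points just past the
    VisualSampleEntry fields. Returns (fields, next_offset).
    """
    res, layer, width, height = struct.unpack_from(">BBII", buf, offset)
    offset += 10  # 1 + 1 + 4 + 4 bytes
    fields = {"res": res, "layer": layer,
              "cropped_width": width, "cropped_height": height}
    # startx and starty are present only for a non-empty cropped region.
    if width > 0 and height > 0:
        fields["startx"], fields["starty"] = struct.unpack_from(">II", buf, offset)
        offset += 8
    return fields, offset
```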
E.3.9 ScalableSampleGroupEntry Box
A track may be made up of samples with different characteristics and protected by different protection tools or with
different parameters. The ScalableSampleGroupEntry box is used to signal how samples are grouped according to these differences. For example,
if a track has 1000 samples where the first 500 samples are encrypted with Key 1 and the second 500 samples are
encrypted with Key 2, the SampleGroupDescription Box contains two ScalableSampleGroupEntry boxes: the first box
describes the first 500 samples while the second box describes the second 500 samples.
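The example above can be sketched in Python as a lookup from sample index to group entry (informative only). The run-length representation and all names are assumptions made for illustration; they are not the syntax of the SampleGroupDescription Box.

```python
from bisect import bisect_right

# Hypothetical run-length description of the example: (sample_count, group_entry).
RUNS = [(500, {"key": "Key 1"}), (500, {"key": "Key 2"})]


def group_for_sample(runs, sample_index: int):
    """Return the group entry that covers a zero-based sample index."""
    ends = []
    total = 0
    for count, _ in runs:
        total += count
        ends.append(total)  # exclusive end index of each run
    if not 0 <= sample_index < total:
        raise IndexError("sample index out of range")
    return runs[bisect_right(ends, sample_index)][1]
```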
When the media samples are protected by a protection tool, the ScalableSampleGroupEntry is encapsulated in the same
way as ScalableSampleDescriptionEntry in E.3.8.

class ScalableSampleGroupEntry(type) extends VisualSampleGroupEntry(type) {
unsigned int(8) res;
unsigned int(8) layer;
unsigned int(32) cropped_width, cropped_height;
if(cropped_width > 0 && cropped_height > 0) {
unsigned int(32) startx;
unsigned int(32) starty;
}
ProtectionSchemeInfoBox protectionSchemes[]; //optional
}
Semantics:
type indicates the grouping type. If the grouping is based on different protection tools applied to the samples, the type
is "prot"; if the grouping is based on different media characteristics (like resolution or layers), the type is "attr".
res is the resolution of the described samples. A value of –1 indicates all resolution levels.
layer is the quality layer of the described samples. A value of –1 indicates all quality layers.
cropped_width is the width of the cropped region.
cropped_height is the height of the cropped region.
startx & starty indicate the position of the top-left corner of the cropped region. If either cropped_width or
cropped_height is zero, the startx and starty will not be present.
protectionSchemes is a list of ProtectionSchemeInfoBoxes to indicate the protection tools applied to the
described samples. The un-protection process has to follow the order in which it appears in the list.
E.3.10 Generic Protected Box
E.3.10.1 Definition
Box Types: 'gprt'
Container: File or any container box
Mandatory: Yes, when failure to use would prevent parsing file
Quantity: Any number
The JProtected Box is used when a protection scheme is applied to a box, and use of the protection scheme prevents
parsing of the box. For example, if the contents of a superbox are encrypted, including the box lengths and types, that
box is no longer parsable, and thus fails to meet the original definition for the box. In this case, a JProtected Box may be
placed in the file in the place of the box that no longer parses correctly, and the encrypted data placed in the JProtected
Box.
Interpreting the content of the data[] portion of this box shall be done using the ItemLocationBox.
Once all of the protection methods from the ItemInformationBox have been processed, the contents may be
reorganized into original boxes. The first size[0] bytes of the unprotected data[] array are placed in a box of type[0], and
the next size[i] bytes are placed in a box of type[i], and so on. Note that when size_flag is 0, the size of the original
box is not disclosed and when type_flag is 0, the type of the original box is not disclosed. This is to prevent known-
plaintext attacks. When both the size_flag and type_flag are 0, total_size provides the total size of the
protected content.
If the entry count is 0, or if there is data left over, then that data shall be in the format of legal boxes with type and size
codes, i.e., the unprotected data contains the types and sizes.
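The reorganization described above can be sketched in Python (informative only). The sketch assumes that both size and type are present for every entry and treats each size as a count of bytes taken from the unprotected data; whether that count includes a box header is not stated above. The function name is hypothetical.

```python
def rebuild_boxes(data: bytes, entries):
    """Split unprotected data into (type, payload) pairs.

    `entries` is a list of (size, type) pairs taken from the Generic
    Protected Box. Returns (boxes, leftover), where `leftover` is expected
    to hold ordinary boxes with their own size and type fields.
    """
    boxes = []
    pos = 0
    for size, box_type in entries:
        if pos + size > len(data):
            raise ValueError("entry sizes exceed the available data")
        boxes.append((box_type, data[pos:pos + size]))
        pos += size
    return boxes, data[pos:]
```

With an entry count of 0, the whole array is returned as leftover data, matching the case where the unprotected data carries its own types and sizes.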
If any part-1 mandatory box is encrypted, the "jp2" brand should be removed from the compatible list in file type box.
E.3.10.2 Syntax
aligned(8) class JProtectedBox extends Box('gprt') {
bit(1) type_flag;
bit(1) size_flag;
bit(1) location_flag;
unsigned int(5) reserved; // for ISO use
if(size_flag == 1 || type_flag == 1 || location_flag == 1) {
unsigned int(32) entry_count;
if(location_flag == 1)
unsigned int(8) offset_size;
for(i=0; i < entry_count; i++) {
if(size_flag == 1)
unsigned int(32) size;
if(type_flag == 1)
unsigned int(32) type = boxtype;
if(size_flag == 1 && size == 1)
unsigned int(64) large_size;
if(location_flag == 1)
unsigned int(offset_size*8) offset;
}
}
else {
unsigned int(32) total_size;
if(total_size == 1)
unsigned int(64) large_total_size;
}
unsigned int(8) data[];
}
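An informative Python sketch of a reader for this syntax follows. It assumes big-endian fields, flags packed from the most significant bit of the first byte, a per-entry loop that runs entry_count times, and a payload that excludes the box header; the function name and the returned dictionary layout are hypothetical.

```python
import struct


def parse_jprotected_payload(payload: bytes):
    """Parse a 'gprt' payload (the bytes after the box header)."""
    flags = payload[0]
    type_flag = (flags >> 7) & 1
    size_flag = (flags >> 6) & 1
    location_flag = (flags >> 5) & 1  # the low 5 bits are reserved
    pos = 1
    info = {"entries": [], "total_size": None}
    if type_flag or size_flag or location_flag:
        (entry_count,) = struct.unpack_from(">I", payload, pos)
        pos += 4
        offset_size = 0
        if location_flag:
            offset_size = payload[pos]
            pos += 1
        for _ in range(entry_count):
            entry = {}
            if size_flag:
                (entry["size"],) = struct.unpack_from(">I", payload, pos)
                pos += 4
            if type_flag:
                entry["type"] = payload[pos:pos + 4].decode("ascii")
                pos += 4
            if size_flag and entry["size"] == 1:
                # A size of 1 signals that a 64-bit large_size follows.
                (entry["size"],) = struct.unpack_from(">Q", payload, pos)
                pos += 8
            if location_flag:
                entry["offset"] = int.from_bytes(payload[pos:pos + offset_size], "big")
                pos += offset_size
            info["entries"].append(entry)
    else:
        (total,) = struct.unpack_from(">I", payload, pos)
        pos += 4
        if total == 1:
            (total,) = struct.unpack_from(">Q", payload, pos)
            pos += 8
        info["total_size"] = total
    info["data"] = payload[pos:]
    return info
```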
E.3.10.3 Semantics
size_flag indicates whether or not size value is present. A value of 1 means the size of each entry in Generic
Protected Box is present; 0 means the size is not present.
type_flag indicates whether or not the box type is present. A value of 1 means the original type of each entry in
Generic Protected Box is present; 0 means the type is not present.
location_flag indicates whether or not the location is present. A value of 1 means the original location of each
entry in Generic Protected Box is present; 0 means the original location is not present.
entry_count is the number of entries in the Generic Protected Box.
offset_size is the length of the offset in bytes.
Each size entry is the size of a box that has been replaced by the Generic Protected Box.
Each type entry is the original type of a box that has
...
