Digital publishing — EPUB3 preservation — Part 1: Principles

The ISO/IEC TS 22424 series supports long-term preservation of EPUB publications via a dual strategy. This document considers EPUB features from a long-term preservation point of view. Some EPUB features are forbidden and others are required, depending on how they relate to long-term preservation. EPUB publications constructed according to these guidelines are suitable for preservation. ISO/IEC TS 22424-2 makes EPUB compliant with the Open Archival Information System (OAIS) and current practices of OAIS archives.

Publications numériques — EPUB3 preservation — Partie 1: Principes

General Information

Status: Published
Publication Date: 28-Jan-2020
Current Stage: 9092 - International Standard to be revised
Completion Date: 16-Sep-2024

Technical specification
ISO/IEC TS 22424-1:2020 - Digital publishing — EPUB3 preservation — Part 1: Principles
Released: 1/29/2020
English language
25 pages

Standards Content (Sample)


TECHNICAL SPECIFICATION
ISO/IEC TS 22424-1
First edition
2020-01
Digital publishing — EPUB3 preservation — Part 1: Principles
Publications numériques — EPUB3 preservation — Partie 1: Principes
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Packaging standards
6 Construction of OAIS information packages
6.1 Overview
6.2 General principles
6.2.1 EPUB publications shall be sent to a repository system as well-formed and complete submission information packages (SIPs)
6.2.2 Regardless of its type or format, it shall be possible to include any data or metadata in SIPs
6.2.3 It should be possible to transfer SIPs by any means, methods, or tools from the submitting organization to the repository system
6.2.4 The archive shall have a way to verify the identity of the submitting organization/person, no matter how the information packages are transferred
6.2.5 There is no 1:1 relation between OAIS information packages
6.2.6 A SIP may contain 0-n EPUB 3 publications, and one EPUB 3 publication may be submitted to the repository system in 1-n SIPs
6.2.7 The information package type (in this case, SIP) shall be indicated
6.2.8 SIP packaging method shall not restrict the application of any preservation method
6.2.9 The packaging method shall not limit the size of the SIP
6.3 Identification of information packages and their content
6.3.1 It shall be possible to identify any SIP uniquely both during and after the ingest process
6.3.2 Information objects (EPUB publications, PREMIS preservation metadata record, etc.) within SIPs shall be identified uniquely and persistently
6.3.3 EPUB Fragment Identifiers should not be used in EPUB publications sent to a repository system, unless the submission agreement explicitly allows their use
6.4 Structure of information packages
6.5 Generic Information package metadata
6.5.1 Metadata in information packages shall be based on standards
6.5.2 Metadata should allow (automatic) validation of the structure and content of SIPs in terms of integrity, fixity, and syntax
6.5.3 It shall be possible to edit metadata in information packages
Annex A (informative) EPUB and digital preservation: issues and recommendations
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that
are members of ISO or IEC participate in the development of International Standards through
technical committees established by the respective organization to deal with particular fields of
technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with ISO and IEC, also
take part in the work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 34, Document description and processing languages.
A list of all parts in the ISO/IEC TS 22424 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
0.1 General
This document facilitates the long-term preservation of EPUB publications by specifying, at a general
level, EPUB features which are mandatory for long-term preservation (such as font embedding) and
features which should be avoided if possible.
This document can be seen as a stepping stone towards a detailed specification which would be related
to EPUB in the same way as PDF/A, specified in ISO 19005-1 to ISO 19005-3, is related to the Portable
Document Format (PDF). If and when the EPUB community develops detailed guidelines for the
production of archivable EPUB publications, this document could be used as one of the starting points.
Long-term preservation in general requires two things:
— making the object, such as an EPUB publication, fit for preservation, including features to be used
and features to avoid;
— packaging the object (and any metadata related to it) together with any additional data such as
other versions of the object and other documentation into an Open Archival Information System
(OAIS) submission information package (SIP).
Packaging is covered in ISO/IEC TS 22424-2.
0.2 EPUB
The EPUB standard defines a distribution and interchange format for digital publications and documents.
The EPUB® format provides a means of representing, packaging and encoding structured and semantically
enhanced Web content — including HTML, CSS, SVG and other resources — for distribution in a single-file
container [17].
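The single-file container described above is a ZIP archive, defined by the Open Container Format. As a minimal sketch of a pre-ingest sanity check, the following tests three container properties: that the first entry is named `mimetype`, that this entry is stored uncompressed with the content `application/epub+zip`, and that `META-INF/container.xml` is present. The function names `check_ocf_container` and `build_minimal_container` are hypothetical and chosen for illustration only; this is not a substitute for full validation.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_ocf_container(data: bytes) -> list[str]:
    """Return a list of container problems (empty if none were found).

    Illustrative only: checks a few basic container properties,
    not full conformance.
    """
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        entries = zf.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("first entry is not 'mimetype'")
        else:
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' entry is compressed")
            if zf.read("mimetype") != EPUB_MIMETYPE:
                problems.append("'mimetype' content is not application/epub+zip")
        if "META-INF/container.xml" not in zf.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems


def build_minimal_container() -> bytes:
    """Build a tiny in-memory archive that passes the checks above.

    The container.xml body is a placeholder, not a valid container document.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # 'mimetype' is written first and stored without compression.
        zf.writestr("mimetype", EPUB_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", "<container/>")
    return buf.getvalue()


if __name__ == "__main__":
    print(check_ocf_container(build_minimal_container()))
```

A repository could run a check of this kind before handing a file to a full validator, so that files that are not EPUB containers at all are rejected early.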
The EPUB format was developed by the International Digital Publishing Forum, IDPF, which merged with
the World Wide Web Consortium, W3C, in January 2017. Ongoing technical development of the standard,
related extension specifications and ancillary deliverables is the responsibility of the W3C EPUB 3
Community Group 1), which published its charter in February 2017. According to the charter,
work on any future major revision of EPUB, e.g. an EPUB 4, is initially out of scope on the presumption that
this will be taken up by a new W3C WG as a W3C Recommendation Track activity. The EPUB 3 CG will
coordinate its work with such new WG, and meanwhile with the existing W3C Digital Publishing Interest
Group (DPUB IG) [23].
The International Digital Publishing Forum, IDPF, ceased operations as a membership organization
in January 2017, and its website 2) is now an archive. The latest version of the standard and information
about future EPUB developments are available at the Publishing@W3C webpage,
https://www.w3.org/publishing/.
The specification at hand covers EPUB 3 versions up to EPUB 3.0.1 3). EPUB 3.1 4) was the first major
revision of EPUB 3.0.1, but there are no implementations of version 3.1 and therefore it is not covered
in this document. The most widely used version of the standard is still 3.0.1. EPUB 3.2 was published in
May 2019 5). Unlike 3.1, it is fully backwards compatible with 3.0.1. It will be covered in the next edition
of this document.
1) https://www.w3.org/publishing/groups/epub3-cg/
2) http://idpf.org/
3) http://idpf.org/epub/301
4) https://www.w3.org/Submission/epub31/
5) https://w3c.github.io/publ-epub-revision/epub32/spec/epub-spec.html

Differences between EPUB specifications 2.0.1 to 3.2 are well documented:
— EPUB 3 Changes from EPUB 2.0.1 6)
— EPUB 3.0.1 Changes from EPUB 3.0 7)
— EPUB 3.2 Changes from EPUB 3.0.1 8)
All EPUB specifications are available on the Web: 2.0.1 at http://idpf.org/epub/201, 3.0.1 at
http://idpf.org/epub/301 and 3.2 at https://w3c.github.io/publ-epub-revision/epub32/spec/epub-spec.html.
All EPUB publications, including ones using version 3.2, can be validated using EPUBCheck version
4.2.0, which was released in March 2019.
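As a hedged illustration of how such validation might be automated, the sketch below calls EPUBCheck from Python. It assumes that Java is installed and that the EPUBCheck JAR is available at a local path; the default `epubcheck.jar` is a placeholder. The helper names `epubcheck_command` and `validate_epub` are hypothetical. Treating a zero exit status as "no errors reported" is also an assumption that should be confirmed against the EPUBCheck version in use.

```python
import subprocess


def epubcheck_command(epub_path: str, jar_path: str = "epubcheck.jar") -> list[str]:
    """Build the command line for running EPUBCheck on one publication."""
    return ["java", "-jar", jar_path, epub_path]


def validate_epub(epub_path: str, jar_path: str = "epubcheck.jar") -> bool:
    """Run EPUBCheck and report whether it exited with status 0.

    Assumption: a zero exit status means that no errors were reported.
    """
    result = subprocess.run(
        epubcheck_command(epub_path, jar_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Prints the command only; running it requires Java and the JAR file.
    print(" ".join(epubcheck_command("publication.epub")))
```

Keeping the command construction separate from its execution makes it easy to log the exact invocation alongside the validation result, which may be useful as provenance information.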
From a long-term preservation point of view, lack of backward compatibility between successive versions
of a file format is a problem because it makes migration more challenging. In addition, EPUB
3.1 has at least one feature which would have been problematic. In EPUB 3.1, foreign resources do not
require fallbacks if they are not in the spine and not embedded in EPUB Content Documents. In EPUB
3.0.1, a fallback guarantees that there is a version of the document that can be rendered; in 3.1 such
a guarantee no longer exists.
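The fallback requirement can be illustrated with a small check over the manifest of a package document. The sketch below is a simplification under stated assumptions. It only looks for the manifest `fallback` attribute, so it neither follows fallback chains nor considers fallbacks provided inside content documents. The `CORE_MEDIA_TYPES` set is an abbreviated list written for this example and should be checked against the specification. The function name `items_missing_fallback` and the sample package fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

# Abbreviated list for illustration; consult the EPUB specification
# for the authoritative set of core media types.
CORE_MEDIA_TYPES = {
    "application/xhtml+xml",
    "image/gif",
    "image/jpeg",
    "image/png",
    "image/svg+xml",
    "text/css",
    "audio/mpeg",
    "audio/mp4",
    "application/x-dtbncx+xml",
    "application/smil+xml",
}


def items_missing_fallback(opf_xml: str) -> list[str]:
    """Return ids of manifest items that are foreign resources without
    a manifest-level fallback attribute."""
    root = ET.fromstring(opf_xml)
    manifest = root.find(f"{OPF_NS}manifest")
    if manifest is None:
        return []
    missing = []
    for item in manifest.findall(f"{OPF_NS}item"):
        media_type = item.get("media-type", "")
        if media_type not in CORE_MEDIA_TYPES and item.get("fallback") is None:
            missing.append(item.get("id", ""))
    return missing


# Hypothetical, abbreviated package document (metadata and spine omitted).
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="ch1" href="ch1.xhtml" media-type="application/xhtml+xml"/>
    <item id="data" href="table.csv" media-type="text/csv"/>
    <item id="pdf" href="ch1.pdf" media-type="application/pdf" fallback="ch1"/>
  </manifest>
</package>"""

if __name__ == "__main__":
    print(items_missing_fallback(SAMPLE_OPF))
```

In the sample, only the CSV item is flagged: the XHTML item uses a core media type, and the PDF item declares a fallback. An archive might treat such flagged items as candidates for closer review rather than as definite errors.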
EPUB 3.0.1 was prepared by the IDPF. It consists of six interlinked documents:
— EPUB 3 Overview
— Publications 3.0.1
— Canonical fragment identifiers
— Content documents 3.0.1
— Media overlays 3.0.1
— Open Container Format 3.0.1
There are several extension specifications to these EPUB base standards. The list below is incomplete,
as it contains mainly specifications that are relevant from the long-term preservation point of view.
Some of them are still drafts:
— EPUB Accessibility specification 1.0 9) addresses evaluation and certification of accessible EPUB
publications, and discovery of the accessible qualities in such publications.
— EPUB Previews 1.0 10) describes how content previews can be included in EPUB publications.
— EPUB Distributable Objects 1.0 11) is a draft specification that defines a method for the encapsulation,
transportation, and integration of distributable objects in EPUB publications.
— EPUB Scriptable Components 1.0 12) provides an interoperable publish and subscribe (pubsub)
pattern by which i
...

