Digital publishing — EPUB3 preservation — Part 2: Metadata requirements

The ISO/IEC TS 22424 series supports long-term preservation of EPUB publications via a dual strategy. This document makes EPUB compliant with current practices of Open Archival Information Systems (OAIS) archives and technical requirements of repository systems. The former tend to rely on OAIS in their operations; the latter prefer to ingest electronic documents only in containers conforming to standards such as METS (Metadata Encoding and Transmission Standard). ISO/IEC TS 22424-1 considers EPUB features from a long-term preservation point of view.

General Information

Status: Published
Publication Date: 28-Jan-2020
Current Stage: 9093 - International Standard confirmed
Completion Date: 15-Sep-2023

Technical specification
ISO/IEC TS 22424-2:2020 - Digital publishing -- EPUB3 preservation
English language
35 pages

Standards Content (Sample)

TECHNICAL SPECIFICATION
ISO/IEC TS 22424-2
First edition
2020-01
Digital publishing — EPUB3
preservation —
Part 2:
Metadata requirements
Reference number: ISO/IEC TS 22424-2:2020(E)
© ISO/IEC 2020


COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Syntax
6 Packaging metadata
6.1 General
6.2 Package creator / submitter information
6.3 Package status
6.4 Package identifier
6.5 Work and publication identifiers
6.6 Core media type resource identifiers
6.7 Foreign resource identifiers
6.8 Identifiers for metadata records
6.9 Dates
6.9.1 General
6.9.2 Creation date of a submission information package
6.9.3 Modification date of a submission information package
6.9.4 Creation/modification date of an EPUB publication
6.9.5 Creation/modification of a metadata record
6.10 Metadata format and its versions
7 Administrative metadata
7.1 General
7.2 Technical metadata
7.2.1 File formats and their versions
7.2.2 Digital signatures and checksums
7.3 Rights metadata
7.3.1 General
7.3.2 Preservation related rights
7.4 Structural metadata
7.5 Preservation metadata
8 Structure of submission information packages
9 Content of submission information packages
Annex A (informative) Digital signature
Annex B (informative) Events
Bibliography


Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that
are members of ISO or IEC participate in the development of International Standards through
technical committees established by the respective organization to deal with particular fields of
technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with ISO and IEC, also
take part in the work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 34, Document description and processing languages.
A list of all parts in the ISO/IEC TS 22424 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
This document facilitates the long-term preservation of EPUB publications by specifying metadata
elements which are required or recommended for long-term preservation (such as identifiers) and the
ways in which the EPUB publication and related metadata can be packaged. EPUB versions 3 and 3.0.1
are covered; if necessary, the EPUB version applicable is specified.
Long-term preservation in general requires two things:
— making the object, such as an EPUB publication, fit for preservation, including features to be used and
features to avoid;
— packaging the object (and any metadata related to it) together with any additional data such as
other versions of the object and other documentation into an Open Archival Information System
(OAIS) submission information package (SIP).
ISO/IEC TS 22424-1 concentrates on the archivability of EPUB documents.
The background to this document comes from the Open Archival Information System, which is
described in ISO/IEC TS 22424-1.
When a submission information package (SIP) is formed, mandatory preservation metadata need to
be present in the package. Depending on the agreements made between the producer and the archive,
metadata elements are stored either in the container document or the EPUB publication itself, or both.
Usually an archive would expect to find all relevant metadata in the container, unless the submission
agreement allows embedding of metadata into EPUB publications.
This document does not require any changes to be made to the current or future EPUB standards.
However, when an EPUB publication is created or modified for submission to an archive, there are some
EPUB features that should be used and others that should be avoided. ISO/IEC TS 22424-1 describes
how the EPUB format should be applied. This document concentrates on mandatory and recommended
metadata elements needed for the long-term preservation of EPUB publications and their METS
encoding. ISO/IEC TS 22424-1 recommends the usage of METS but allows also other container standards;
this document concentrates on preservation metadata and its METS encoding in SIPs. Future editions
of these documents may specify other encodings such as BITS (Book Interchange Tag Suite) 1).
In order to guarantee access to documents, OAIS archives may migrate documents into new file formats
when the original formats are no longer supported by commonly used rendering tools. If the document
to be migrated is an e-book in an outdated EPUB format, migration can be made to a more modern
version of EPUB or, at least in principle, to another e-book format.
Generally, migration into another file format should be straightforward if the current and new format
are compatible and there are efficient and reliable migration tools available. If the target format is a
more modern version of the current format, compatibility should not be a problem. But if a format is
rich, migration tools may not be able to render all the properties of a resource.
This document applies to EPUB versions 3 and 3.0.1. Earlier versions (EPUB 2 and 2.0.1) are not covered.
Since there are no implementations of version 3.1, it is not covered in this document either. EPUB 3.2
was published in May 2019 2). It will be taken into account in the next edition of this document.
This document does not cover issues related to migration between EPUB versions or from EPUB to other
e-book formats. Migration to other formats is often lossy; this applies to e-book formats as well, since
there are EPUB features which are not supported in other e-book formats, and vice versa. Moreover,
even if the same feature is supported, technical implementations can be incompatible. For instance, if
an EPUB 3 publication using fixed layout is migrated to Amazon’s KF8 format, preserving fixed layout
properties requires special attention since there are significant technical differences between these
formats in how this feature has been implemented.
1) https://www.loc.gov/preservation/digital/formats/fdd/fdd000453.shtml
2) https://w3c.github.io/publ-epub-revision/epub32/spec/epub-spec.html

Sometimes migration cannot be applied at all; programs cannot be migrated without access to and
good understanding of the source code. In such cases long-term preservation is possible only if the OAIS
archive responsible is able to emulate either the program’s original hardware or software environment.
Within the preservation community, emulation is considered to be a viable option for some content. For
the time being there is no full understanding of how emulation will function in the long term, but this
may change as emulation-as-a-service offerings come to the market.
Metadata requirements in this document are based on the migration of file formats. Emulation is not
covered (just a single example of emulation-related preservation metadata is given), although emulation
is likely to be the best preservation method for fixed layout EPUB publications and interactive EPUB
publications. Preservation metadata requirements for an emulation-based preservation strategy may be
added to a future version of this document.
Supporting emulation might require just information about appropriate tools in the submission
agreement or in the related documentation. A more sustainable approach is to include a description
of the emulation environment (hardware and/or software) in the premis:object section of the PREMIS
metadata record in the SIP. During ingest this information is copied into the archival information
package (AIP). If migration is used, hardware and software environments needed for rendering the
versions of the document in the AIP can be specified separately as access environments.
The ambition level of migration may vary. Usually the aim is to preserve the intellectual content, since retaining
also the original look and feel of preserved documents is considered to be too demanding. If semantics
and layout are interlinked, it is important to keep also the original EPUB publication in order to facilitate
preservation of the semantics via emulation-based access to the original content.
Migration both requires and produces preservation metadata. For instance, archive staff have
to determine which tools can be used to carry out the migration, and what weak points they may have.
The intention of the preservation community is to maintain this information in format libraries such as
PRONOM 3). When a new AIP is created after a migration, the package should contain both the old and
the new representation of the migrated document and preservation metadata describing the migration
event and the possible differences between the document versions 4). Depending on their needs and
archived resources, archive users can then make a choice between the original, which is authentic but
possibly difficult to render, and the migrated document, which should be easy to use but less authentic.
In practice, finding access software for outdated versions of preserved documents may be difficult. The
OAIS archive, on the other hand, can migrate the original document again when better tools can be
used, or if there are significant issues in migrated documents.
Metadata elements that need to be included in SIPs are a priori essential for digital preservation. For
instance, if there is no digital signature present and a secure transfer channel has not been used, it is
impossible to guarantee the information entering the archive has not changed during transfer or that it
is coming from a correct source. Moreover, if the data has already been tampered with before it enters
the archive, all subsequent preservation actions may be useless.
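The integrity check described above can be illustrated with a short Python sketch that computes a checksum for a file before transfer and verifies it on receipt. The function names `file_checksum` and `verify_checksum` are hypothetical, and the choice of SHA-256 is an assumption made for the example; the requirements of this document for digital signatures and checksums are given in 7.2.2 and Annex A. Note that a checksum on its own only detects changes made in transit. It does not establish that the package comes from the correct source, which needs a digital signature or a secure transfer channel.

```python
import hashlib
import hmac


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded by the producer."""
    actual = file_checksum(path, algorithm)
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_hex.lower())
```

A producer would record the value returned by `file_checksum` in the package metadata, and the archive would call `verify_checksum` with that value during ingest.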
This document does not specify generic conformance requirements for EPUB publications, but may
place some restrictions on the use of EPUB specifications. The generic conformance requirements made
in the EPUB Content Documents specification apply to EPUB publications in SIPs as well.
ISO/IEC TS 22424-1 defines a set of requirements for archivable EPUB publications; consult it
for more information.
3) http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
4) This document is only concerned with those metadata elements which are to be included in SIPs. Preservation
metadata needed in AIPs (which describes the preservation related events such as migration) is beyond the scope.
1 Scope
The ISO/IEC TS 22424 series supports long-term preservation of EPUB publications via a dual strategy.
This document makes EPUB compliant with current practices of Open Archival Information Systems
(OAIS) archives and technical requirements of repository systems. The former tend to rely on OAIS
in their operations; the latter prefer to ingest electronic documents only in containers conforming to
standards such as METS (Metadata Encoding and Transmission Standard).
ISO/IEC TS 22424-1 considers EPUB features from a long-term preservation point of view.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 8601 (all parts), Date and time — Representations for information interchange
ISO/IEC TS 22424-1, Digital publishing — EPUB3 preservation — Part 1: Principles
METS Metadata Encoding & Transmission Standard. Version 1.12.1. [online]. Library of Congress, 2019.
Available from: https://www.loc.gov/standards/mets/
PREMIS PREMIS Data Dictionary for Preservation Metadata. Version 3.0. [online]. Library of Congress,
2015. Available from: http://www.loc.gov/standards/premis/
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC TS 22424-1 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
data dictionary
organized and constructed (electronic data base) compilation of descriptions of data concepts
that provides a consistent means for documenting, storing and retrieving the syntactical form (i.e.
representational form) and the meaning and connotation of each data concept
Note 1 to entry: PREMIS is a data dictionary. The PREMIS Data Dictionary for Preservation Metadata
(https://www.loc.gov/standards/premis/) is a leading specification for metadata needed for long-term preservation.
[SOURCE: ISO 24531:2013, 4.14, modified — Note 1 to entry has been added.]

3.2
structural metadata
metadata that indicates how compound objects are put together, for example how the pages of a
document are arranged to form chapters
Note 1 to entry: The definition is adapted from Reference [14].
4 Abbreviated terms
AIP archival information package
DIP dissemination information package
DRM digital rights management
OAIS Open Archival Information System
PDI preservation description information
SIP submission information package
5 Syntax
This document provides examples of how metadata elements should be expressed using either
1) Metadata Encoding and Transmission Standard (METS 5)) version 1.12.1 and PREMIS Data
Dictionary for Preservation Metadata (PREMIS 6)) version 3.0, and/or
2) EPUB version 3.0 and 3.0.1
for encoding SIPs. Other container standards may be added in future editions of this document.
This dual approach was chosen because there are different options available for a producer to turn
existing EPUB publications into SIPs:
1) All metadata (mandatory and otherwise) may be embedded in the EPUB publication.
2) Mandatory metadata is copied from EPUB document to the METS container if and when it is already
present, or created and placed in the METS container (recommended approach).
3) Option 2, but a container standard other than METS is used.
The first option looks appealing because that way it would be relatively easy to create EPUB publications
suitable for long-term preservation, especially if the mandatory metadata elements are already present
(and if the EPUB publication itself does not have features unsuitable for preservation).
Unfortunately this approach has some issues:
— Commonly used repository systems expect information packages based on container standards
such as METS. Current versions of these applications may not be able to process SIPs which contain
only an EPUB publication.
— Depending on the mandatory metadata required, it may not be possible to include all preservation
metadata in the EPUB publication.
5) http://www.loc.gov/standards/mets/
6) http://www.loc.gov/standards/premis/

— If there is no container document, it may be difficult to send multiple EPUB publications in a single
SIP, or partial updates (for instance, only descriptive metadata about a publication that has already
been archived).
Options 2 and 3 are based on the idea that there are two independent specifications, the core EPUB
specification (currently version 3.2), and a container specification (this document). This allows the two
communities (EPUB and digital archivists) to cooperate without putting unnecessary constraints on
each other. Both specifications are independent from one another, which makes it easier to manage them.
From a technical point of view, the main strength of the second option is that METS containers are
almost universally accepted in long-term preservation applications. One reason for the popularity of
the standard is that it is flexible – it is possible to embed any descriptive or administrative metadata
into a METS document. Whatever mandatory metadata will be agreed upon by the producer and the
OAIS archive, METS can be used as a container.
The option of using a container standard other than METS or EPUB is not examined in this
document. METS is used due to its technical features and popularity among long-term preservation
application vendors as well as libraries, archives, and museums. If and when other options emerge in
the future, it is possible to extend this document to support other container standards as well.
The main weakness of the METS approach is that currently very few publishers support it. Unless production
processes change radically, a common solution will be to submit e-books in EPUB format as such, with
accompanying ONIX metadata. In this approach, the producer (which can be the OAIS archive) creates
the METS SIP during pre-ingest, using the data and metadata delivered by the publisher. The publisher
does not need to know METS, but EPUB documents themselves and the accompanying metadata should
meet the requirements made in the submission agreement.
This document requires that each SIP shall have a METS document with mandatory descriptive and
administrative metadata elements embedded, using e.g. Dublin Core (ISO 15836-1) and PREMIS
formats. The use of a separate, METS-based preservation layer enables the current long-term
preservation applications to ingest EPUB publications. Producers and OAIS archives may also choose
other approaches, such as embedding all metadata in EPUB publications or using another container
standard. Whichever strategy is chosen, it should be planned out carefully.
In the hybrid approach, some descriptive and administrative metadata needed during ingest may not
be copied from the EPUB document to the METS document. In order to use this metadata, the OAIS
archive shall have reading systems or other applications which are able to render EPUB publications
and extract the relevant metadata from them.
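As a rough illustration of what extracting metadata from an EPUB publication involves, the following Python sketch opens the EPUB container, locates the package document through META-INF/container.xml, and collects the Dublin Core elements found in its metadata section. The function name `read_package_metadata` and the shape of the returned dictionary are hypothetical, chosen only for this example; a production tool would also need to handle `meta` refinements, encryption, and malformed packages, none of which is attempted here.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_package_metadata(epub_path):
    """Return the Dublin Core elements of the package document as a dict of lists."""
    with zipfile.ZipFile(epub_path) as archive:
        # META-INF/container.xml names the package document (the rootfile).
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml does not declare a rootfile")
        package = ET.fromstring(archive.read(rootfile.attrib["full-path"]))

    metadata = package.find("opf:metadata", OPF_NS)
    if metadata is None:
        raise ValueError("package document has no metadata element")

    result = {}
    dc_prefix = "{" + OPF_NS["dc"] + "}"
    for element in metadata:
        # Keep only elements in the Dublin Core namespace, e.g. dc:identifier.
        if element.tag.startswith(dc_prefix) and element.text:
            name = element.tag[len(dc_prefix):]
            result.setdefault(name, []).append(element.text.strip())
    return result
```

The values returned for keys such as `identifier`, `title` and `language` are the ones a producer could copy into the METS container, or an archive could read directly when they have not been copied.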
This document does not require copying of EPUB structural metadata to METS documents. Therefore,
the structural metadata in METS is simple, only specifying the location of EPUB publication or
publications in the SIP but not their internal structure. EPUB reading systems would not be able to use
the structural metadata in a METS document, because they utilize structural metadata in the EPUB
spine element when publications are rendered.
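To make the idea of a simple structural map concrete, the following Python sketch builds a skeletal METS document in which the file section lists one EPUB file and the structural map merely points at it. The element and attribute names (`metsHdr`, `fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`) come from the METS schema, but the function name `build_simple_mets`, the identifier `file-epub-1` and the `TYPE` value are hypothetical. The sketch omits the descriptive and administrative metadata sections that this document makes mandatory, so it should be read as an outline of the structural part only and not as a conforming SIP.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(namespace, name):
    """Build a namespace-qualified tag or attribute name for ElementTree."""
    return "{%s}%s" % (namespace, name)


def build_simple_mets(epub_href, sha256_hex, created):
    """Return a METS tree whose structMap points at one EPUB file in the SIP."""
    mets = ET.Element(q(METS_NS, "mets"))
    ET.SubElement(mets, q(METS_NS, "metsHdr"), {"CREATEDATE": created})

    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"))
    file_el = ET.SubElement(file_grp, q(METS_NS, "file"), {
        "ID": "file-epub-1",  # hypothetical identifier
        "MIMETYPE": "application/epub+zip",
        "CHECKSUM": sha256_hex,
        "CHECKSUMTYPE": "SHA-256",
    })
    ET.SubElement(file_el, q(METS_NS, "FLocat"), {
        "LOCTYPE": "URL",
        q(XLINK_NS, "href"): epub_href,
    })

    # The structural map only locates the publication; it does not
    # repeat the EPUB spine or any other internal structure.
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"))
    div = ET.SubElement(struct_map, q(METS_NS, "div"), {"TYPE": "publication"})
    ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": "file-epub-1"})
    return mets
```

Calling `ET.tostring(build_simple_mets("publications/book.epub", checksum, "2020-01-28T12:00:00Z"), encoding="unicode")` serializes the tree; a SIP with several publications would add one `file` and one `fptr` per EPUB.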
In order to eliminate uncertainty concerning the syntax and semantics of SIPs, submission agreements
shall specify a METS profile or profiles which can be used to facilitate packaging of EPUB publications.
This document can be used as a basis for these profiles. The profile can be part of the submission
agreement, or linked to it. The latter approach was chosen in the Finnish Digital Library initiative; the
benefit is that submission agreements will be relatively simple because technical details are stated
in the document “Metadata requirements and preparing content for digital preservation” 7). The Finnish
Digital Library initiative has also published a separate document titled “File formats” 8), which lists
the file formats suitable for ingest and preservation. Unfortunately, that document does not contain
guidelines on how these file formats should be applied. EPUB is an example of a file format which is
in principle archivable, but in practice can be used in a way which may make long-term preservation
challenging. The purpose of ISO/IEC TS 22424-1 is to provide guidelines for creation of archivable EPUB
publications.
7) http://digitalpreservation.fi/files/Metadata-1.7.1-en.pdf
8) http://digitalpreservation.fi/files/File-Formats-1.7.0-en.pdf

Specifications, such as the ones created in the Finnish Digital Library initiative, shall be sufficiently detailed;
for instance, they shall specify all mandatory metadata elements and all archivable or ingestible file
formats. Otherwise SIPs may lack crucial data, or contain files that cannot be processed. Even
this may not be sufficient: besides stating that MXF, TIFF and EPUB are archivable formats, it
is also necessary to specify what types of MXF videos, TIFF images and EPUB publications are acceptable.
Digital archiving projects like the National Digital Library in Finland do not necessarily have a mandate
9)
or resources for such work; that is why specification
...
