ISO/IEC 12034-1:2017
(Main)Information technology — Archive eXchange Format (AXF) — Part 1: Structure and semantics
Information technology — Archive eXchange Format (AXF) — Part 1: Structure and semantics
ISO/IEC 12034-1:2017 is Part 1 of a series of documents that specify a general-purpose format for the storage and/or communication of information in bulk form. The format is named the Archive eXchange Format (AXF). The format described is intended both for interchange between systems and to serve as a native format within systems. ISO/IEC 12034-1:2017 identifies two major categories of data storage media and specifies the basic structures of data stored on those Storage Media Types. It specifies a number of structural elements for use in constructing the appropriate structures for use on each of the Storage Media Types. It defines the semantics of data contained within fields specified for use in the structural elements. The structural elements themselves are documents coded in the eXtensible Markup Language (XML), and this document defines an XML Schema Description (XSD) file for use in formulating the XML documents to be used for the structural elements of AXF Objects.
Technologies de l'information — Archive eXchange Format (AXF) — Partie 1: Structure et sémantique
General Information
Standards Content (Sample)
INTERNATIONAL ISO/IEC
STANDARD 12034-1
First edition
2017-12
Information technology — Archive
eXchange Format (AXF) —
Part 1:
Structure and semantics
Technologies de l'information — Archive eXchange Format (AXF) —
Partie 1: Structure et sémantique
Reference number
ISO/IEC 12034-1:2017(E)
©
ISO/IEC 2017
---------------------- Page: 1 ----------------------
ISO/IEC 12034-1:2017(E)
COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2017, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
ii © ISO/IEC 2017 – All rights reserved
---------------------- Page: 2 ----------------------
ISO/IEC 12034-1:2017(E)
Foreword
ISO (the International Organization for Standardization) and IEC (the International
Electrotechnical Commission) form the specialized system for worldwide standardization.
National bodies that are members of ISO or IEC participate in the development of International
Standards through technical committees established by the respective organization to deal with
particular fields of technical activity. ISO and IEC technical committees collaborate in fields of
mutual interest. Other international organizations, governmental and non‐governmental, in
liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO
and IEC have established a joint technical committee, ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance
are described in the ISO/IEC Directives, Part 1. In particular the different approval criteria
needed for the different types of document should be noted. This document was drafted in
accordance with the editorial rules of the ISO/IEC Directives, Part 2
(see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the
subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such
patent rights. Details of any patent rights identified during the development of the document will
be in the Introduction and/or on the ISO list of patent declarations received (see
www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and
does not constitute an endorsement.
For an explanation on the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see the
following URL: www.iso.org/iso/foreword.html.
ISO/IEC 12034‐1:2017 was prepared by the Society of Motion Picture and Television Engineers
(SMPTE) and was adopted, under the PAS procedure, by Joint Technical Committee
ISO/IEC JTC 1, Information technology, in parallel with its approval by the national bodies of ISO
and IEC.
© ISO/IEC 2017 – All rights reserved
---------------------- Page: 3 ----------------------
ISO/IEC 12034-1:2017(E)
ISO/IEC 12034-1
SMPTE ST 2034-1:2017
ISO/IEC 12034-1 STANDARD
Archive eXchange Format (AXF) —
Part 1: Structure & Semantics
Page 1 of 101 pages
Table of Contents Page
Foreword . 3
Intellectual Property . 3
Introduction. 3
1 Scope . 7
2 Conformance Notation . 7
3 Normative References . 7
4 Definitions . 8
5 Storage Media Types . 13
5.1 Media with File Systems . 14
5.2 Media without File Systems . 14
5.3 File Marks . 15
5.4 Relationships Between AXF Structures and Storage Media Types . 15
6 Archive eXchange Format (AXF) Structure . 16
6.1 Form of Data Expression . 16
6.2 Byte Order . 17
6.3 General AXF Concepts . 17
6.4 AXF Data Structures . 18
7 General Usage Considerations . 36
7.1 File Naming . 36
7.2 Media Preparation . 36
7.3 AXF Object Index Structures . 37
7.4 Creating, Reading, Writing, Copying, and Transferring AXF Objects . 38
7.5 Nesting AXF Objects . 39
8 Spanning . 39
8.1 Spanning Linkages . 39
8.2 Encountering a Spanning Situation . 43
8.3 Recovery of Spanned AXF Objects . 43
8.4 Spanning Rules . 43
9 Collected Sets . 44
9.1 Collected Set Linkages . 44
9.2 Collected Set Structure . 45
9.3 Add/Replace/Delete Processes . 45
9.4 Tracking Versions . 46
© ISO/IEC 2017 – All rights reserved
---------------------- Page: 4 ----------------------
ISO/IEC 12034-1:2017(E)
ISO/IEC 12034-1
SMPTE ST 2034-1:2017
10 AXF Data Model . 48
10.1 AXF Medium Identifier . 49
10.2 Object Header . 52
10.3 Object Fragment Header . 58
10.4 File Footer . 61
10.5 Object Fragment Footer . 63
10.6 Object Footer . 68
10.7 AXF Object Index . 74
10.8 UUID . 78
10.9 PositionInteger . 78
10.10 FileFolder . 79
10.11 Folder . 81
10.12 File . 84
10.13 Symlink . 87
10.14 FileTree . 90
10.15 Application . 91
10.16 Entity . 93
10.17 Location . 95
10.18 Identifiers . 97
10.19 Checksums . 97
10.20 Identifier . 97
10.21 Checksum . 98
10.22 ByteOrder . 100
10.23 Media Type . 101
10.24 Structure Version . 101
Page 2 of 101 pages
© ISO/IEC 2017 – All rights reserved
---------------------- Page: 5 ----------------------
ISO/IEC 12034-1:2017(E)
ISO/IEC 12034-1
SMPTE ST 2034-1:2017
Foreword
SMPTE (the Society of Motion Picture and Television Engineers) is an internationally-recognized standard
developing organization. Headquartered and incorporated in the United States of America, SMPTE has
members in over 80 countries on six continents. SMPTE’s Engineering Documents, including Standards,
Recommended Practices, and Engineering Guidelines, are prepared by SMPTE’s Technology Committees.
Participation in these Committees is open to all with a bona fide interest in their work. SMPTE cooperates
closely with other standards-developing organizations, including ISO, IEC and ITU.
SMPTE Engineering Documents are drafted in accordance with the rules given in its Standards Operations
Manual.
SMPTE ST 2034-1 was prepared by Technology Committee 31FS on File Formats and Systems.
Intellectual Property
At the time of publication, no notice had been received by SMPTE claiming patent rights essential to the
implementation of this Engineering Document. However, attention is drawn to the possibility that some of the
elements of this document may be the subject of patent rights. SMPTE shall not be held responsible for
identifying any or all such patent rights.
Introduction
This section is entirely informative and does not form an integral part of this Engineering Document.
The Archive eXchange Format (AXF) is an open format that supports interoperability among disparate data
storage systems and ensures long–term availability of data, no matter how storage or file system technologies
evolve. AXF inherently supports interoperability between existing, discrete storage systems, irrespective of the
operating and file systems used, and also future–proofs digital storage by abstracting the underlying technology
so that content remains available across generations of technology development.
At the most basic level, AXF is a file container that can encapsulate any number, size, and type of files in a fully
self–contained and self–describing package. The package contains its own light-weight file system, which
establishes independence from underlying operating systems, storage technologies, and file systems and can
store any type of data on any type of storage media. Inside its packaging, AXF can contain metadata of any
format, applicable to either AXF Objects or to individual files contained within AXF Objects; AXF also carries
key preservation information, such as provenance, fixity, and the like — all key to ensuring long-term robustness
and recoverability.
Historically, digital archive systems have used media data storage formats that are proprietary to their
manufacturers, either intentionally or due to the lack of established standards. There have been neither
interchange of media nor interoperability of archive systems between manufacturers and in some cases
between different archive systems from the same manufacturer. Archives could be orphaned due to support
ending for the systems used to create data archives. End users and manufacturers recognized that the
proprietary nature of archive systems and the data stores that they create result in significant costs of operation
that are unnecessary. These costs could be avoided if there were standardization of the format used for storage
of the data on media and for transfer of the data between systems and locations. AXF permits separating the
stored content from the systems that create and recover sets of data, thereby enabling refreshing of storage
technology, recovering sets of data that otherwise would have been orphaned, and transferring sets of data
between systems and locations.
This standard specifies a structure for data that can be written to any current or future data storage subsystem,
regardless of the type of media on which it is stored. The data can include any types of files and associated
metadata that are stored and transferred together in a structure called an “AXF Object.” A single AXF Object
can be spanned across multiple physical media, can be copied from one set of physical media to another, and
Page 3 of 101 pages
© ISO/IEC 2017 – All rights reserved
---------------------- Page: 6 ----------------------
ISO/IEC 12034-1:2017(E)
ISO/IEC 12034-1
SMPTE ST 2034-1:2017
is agnostic to the Storage Media Type on which it is stored, e.g., spinning disc or linear tape. Regardless of
the Storage Media Types on which they are stored, AXF Objects are identically structured and formatted for
any given set and relationship of contained files and metadata.
AXF initially arose from the storage needs of the audiovisual production and archiving communities but quickly
encompassed any type of file-based data. The transition to file-based workflows led to a new set of requirements
throughout pre-production, production, distribution, storage, and preservation processes. Those requirements
included long-term archiving of finished and unfinished materials, writing data to any type of storage subsystem
using a standard scheme, transporting formatted archives between systems and locations using either media
or networks, and allowing extensibility sufficient to accommodate any type of file, of any size, from any source,
as well as adoption of any future storage technologies. AXF was created to address these requirements.
Audiovisual content archiving spans a wide range of content and data archiving systems and practices. At the
time this standard was written, many different methods and media were commonly used to store file-based
audiovisual content and its supporting information. Examples range from individual hard drives, solid state
drives, and linear magnetic tape drives in small organizations to large spinning disc arrays in combination with
very large robotic systems with multiple robots, each having multiple drives, in very large cultural, scientific, and
legal archives. Applications in other industries that could benefit from the methods defined herein include
medical imaging, geophysical exploration, scientific research, and similar high-volume producers of data.
The cultural, scientific, and business value of assets stored on these data systems is significant. Methods for
storage, interchange, transport, and preservation of such assets, both locally and remotely, over both short and
very long retention periods, demands a standardized, well-documented, non-manufacturer-specific method of
writing data to any data storage system, from which the data then can be recovered and its contents used,
updated, or transferred to another data storage system. All that would be necessary to achieve these objectives
is a mechanism for recovering data from the media on which it is stored, plus utilities or applications that
implement AXF.
The AXF standard creates a common method of writing individual files or related sets of files, and relevant
metadata, onto data storage subsystems so that the structure of an AXF Object will remain the same no matter
what vendor equipment or Storage Media Type is used. As long as the media remains viable and data can be
read from that media, it will be possible to recover an AXF Object and unwrap its contents with a suitable utility
or application running on whatever platform is current at the time. The AXF Object also has to be able to be
recovered and stored on future data storage systems without requiring any changes to its contents simply to
accomplish the act of medium migration, but it also needs to allow changes to its contents, in case updating is
needed to data that already has been archived.
AXF addresses these needs through a combination of predefined eXtensible Markup Language (XML) schema
fields, defined binary data structures that enable an AXF Object to carry any type of file within its File Payload,
internal file system functionality, and key metadata enabling the spanning of AXF Objects across multiple
physical media. The XML schema also enables essential information about an AXF Object and its contents to
be read without having to process all the information within the AXF Object.
In addition to media interchange, AXF enables the interoperability of disparate systems through networks
because it is structured as a streaming data set. Such interconnections enable seamless movement of AXF
Objects from systems that create them, to systems that do not recognize the AXF protocol but store the AXF
Object files nonetheless (perhaps in “cloud” storage), then to systems that are designed to recover data from
AXF Objects.
Functionally, AXF acts like a file wrapper or a repository for all types of data without constraint. Unlike media-
centric file formats such as MXF, which are similar in that they wrap essences, AXF can contain any number or
types of files of any size encapsulated in an AXF Object. It is applicable across a much broader variety of file
storage user groups than any media-specific file wrapper. Types of data can include media essence files,
related metadata files, production files (such as word processing documents, hypertext documents, associated
essence, applications, spreadsheets, and database copies), or any other type of data that users wish to store
together. Unlike other file wrapper definitions, it is payload agnostic and does not require any special mappings
or adaptations to accept the data an AXF Object carries.
AXF accommodates very large file sizes and quantities within AXF Objects. In the current version of this
standard, 64-bit numbers are used to define the sizes of various parameters applicable to elements of AXF
Page 4 of 101 pages
© ISO/IEC 2017 – All rights reserved
---------------------- Page: 7 ----------------------
ISO/IEC 12034-1:2017(E)
ISO/IEC 12034-1
SMPTE ST 2034-1:2017
18
Objects. 64-bit numbers can express values up to 18.44674 x 10 (e.g., 18.44674 petabytes). Use of 64-bit
numbers thus can define file sizes in bytes, numbers of files, numbers of media in a spanned set, and similar
18
characteristics up to 18.44674 x 10 of any particular element. If future requirements exceed the number
spaces provided in this document, there is nothing fundamental that limits any particular parameter to
expression using a 64-bit number. Future revisions of this standard could adopt larger number spaces (e.g.,
96-bit, 128-bit, etc.) for those parameters requiring them. The net result is effectively unlimited storage capability
within AXF Objects, in terms of file sizes, numbers of files in an AXF Object, number of AXF Objects on a
medium, number of media in a spanned set, and the like.
AXF enables updating AXF Objects when additions of, modifications to, or deletions of files or information that
they contain are needed. The functionality to modify AXF Objects is provided by linking “Supplemental” AXF
Objects, written into an archive system at a later time to an original (“Anchor”) AXF Object. A Supplemental
AXF Object updates contents of previous AXF Objects without requiring the original AXF Object itself to be
modified. Since the original content of the AXF Object is retained in its original form, it is possible to restore
either the original or the modified version whenever necessary. Additional Supplemental AXF Objects can be
added in a chain, with restoration of the current or any earlier version possible at any time. When AXF Objects
are refreshed by copying them to new media, it is possible to consolidate an Anchor Object and its Supplemental
Object(s) into a single, new AXF Object. In doing so, it is possible to retain all of the constituent Objects of the
Collected Set to which they belong, so that all earlier versions still can be reconstituted in the future.
AXF abstracts the storage of data from the applications that create AXF Objects and from the operating
systems, file systems, drivers, and drives that store data on media. By this mechanism, any of the surrounding
hardware and software components of systems can be replaced without affecting the data and its formatting
within AXF Objects. A simplified view of where AXF fits into a basic stack is shown in Figure 1.
Server/Storage Stack with AXF support
AXF-Aware Application
Archive eXchange Format (AXF), Including Internal File System
Block Level Addressing File System
Operating System – Hardware Abstraction Layer
Driver
Physical Drive
Medium (with File System)
Medium (without File System)
Figure 1 – Hardware/Software Stack Incorporating AXF Writing to and Reading from Media
AXF is designed so that each AXF Object comprises four main components, regardless of the technology that
is used to store it. These components are:
Object Header – Each AXF Object begins with an Object Header, which contains descriptive XML metadata
such as a unique identifier (UUID) for the AXF Object, information regarding its origin, its creation date, and a
full index of all the files and folders contained in the Object, including file permissions and the like.
Page 5 of 101 pages
© ISO/IEC 2017 – All rights reserved
---------------------- Page: 8 ----------------------
ISO/IEC 12034-1:2017(E)
ISO/IEC 12034-1
SMPTE ST 2034-1:2017
Generic Metadata Containers – Following an Object Header can be any number of optional Generic Metadata
Containers. Such containers are self–contained, open metadata containers in which applications can place
AXF-Object-specific metadata that is not part of the AXF Object File Payload. The metadata can be structured
or unstructured, open or vendor-specific, binary, XML, plain text, or any other format.
AXF Object File Payload – Following any Generic Metadata Containers is the AXF Object File Payload. It
contains the files encapsulated in the AXF Object. The File Payload consists of any number of triplets: File Data
+ File Padding + File Footer. File Padding ensures alignment of all AXF Object elements on the boundaries of
Chunks into which each AXF Object is divided, thereby enabling addressing, by location within the AXF Object,
by its internal file system. File Footer structures contain full information about the preceding file, along with a
file-level checksum designed to be processed on-the-fly, with little or no overhead, during restore operations by
an application. The information in File Footers enhances the resilience of AXF, as it can be used to recover File
Payload data even if Object Header and Footer structures are missing or corrupt.
Object Footer – Completing an AXF Object is an Object Footer. It repeats the information contained in the
Object Header and adds information captured during creation of the AXF Object, including per-file checksums,
precise file sizes, and file positions within the AXF Object. The Object Footer is important to the interchange of
an AXF Object because it allows efficient indexing by foreign systems when the media content is not previously
known, thereby enabling media transport between systems that follow the AXF standard. It is one of the key
structures that support the self-describing nature of AXF.
Other significant structures in the AXF protocol are AXF Medium Identifiers and AXF Object Indices. AXF
Medium Identifiers are used on media to indicate formatting of the media according to the AXF protocol and to
provide unique identification of the media. AXF Object Indices are optional compilations of the information in all
Object Footers preceding each AXF Object Index on a medium, providing a single structure from which it is
possible to obtain complete information on the contents of the preceding portion of a medium. When an AXF
Object Index is the last structure on a medium, complete information about all AXF Objects stored on the
medium can be obtained efficiently in one place.
AXF does not require a system to be fully compliant with this standard for it to be able to use and store AXF-
generated AXF Objects. The initial adoption of AXF is anticipated to be in applications that create AXF Objects
that then are stored on non-AXF-aware storage systems. Because the AXF Objects do not require a storage
system to know that the AXF Objects are AXF-formatted, the AXF Objects will be viewed simply as files to be
stored and retrieved. All that will be necessary to read AXF Objects will be the software and hardware needed
to read the physical storage medium. As adoption grows, files can be moved into and out of AXF-aware systems
as necessary, with the full range of features becoming available on systems that are AXF compliant. AXF-
compliant applications will be able to read stored AXF Objects from any current operating system without
unpacking entire AXF Objects to see critical metadata. Moreover, Archives, or AXF Objects within archives,
also can span different types of media, allowing for flexibility within mixed-media archives and for AXF Objects
to be identical, regardless of the media on which they are stored.
AXF offers resilience to data corruption and loss. AXF Object Indices, repeated identifier instances, and
cryptographic hash checksums on both contents and AXF Objects allow for data corruption to be identified and
mitigated. Even in catastrophic events, such as the loss of an external database containing records of the
contents of an archive, the content database can be recreated by reading the archive and re
...
Questions, Comments and Discussion
Ask us and Technical Secretary will try to provide an answer. You can facilitate discussion about the standard in here.