SIST ISO 24138:2024
Information and documentation - International Standard Content Code (ISCC)
This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an identification system for digital assets (including encodings of text, images, audio, video or other content across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC.
An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from multiple hash digests using the algorithms and rules in this document. This document does not provide information on registration of ISCCs.
Information et documentation — Code international normalisé de contenu (ISCC)
Informatika in dokumentacija - Mednarodna standardna koda digitalne vsebine (ISCC)
General Information
Overview
SIST ISO 24138:2024 - Information and documentation - International Standard Content Code (ISCC) defines a standardized identification system for digital assets. The standard specifies the syntax, structure and metadata of the ISCC, a deterministically constructed data-descriptor built from multiple hash digests. ISCC applies across media sectors to encodings of text, images, audio, video or mixed content. The document also describes encoding options (canonical form, URI, multiformats, readable encodings) and how ISCC metadata can be used with other identifier schemes. SIST ISO 24138:2024 does not cover ISCC registration processes.
Key topics and technical requirements
- Structure and format: A clearly defined ISCC structure comprising an ISCC-HEADER and ISCC-BODY, with rules for MainTypes, SubTypes, versioning and length.
- Deterministic construction: ISCCs are created from multiple hash digests following the algorithms and rules in the standard to ensure reproducible identification of specific digital assets.
- Encoding methods: Specifications for canonical form, URI encoding, multiformats encoding and human-readable encodings to support different technical ecosystems.
- ISCC-UNITs: Components such as Meta-Code, Content-Codes (with subtypes for Text, Image, Audio, Video, Mixed), Data-Code and Instance-Code - each with defined formats, inputs, outputs, processing and conformance requirements.
- ISCC-CODE composition: Guidance on ISCC-CODE purpose, format, subtypes and length/composition rules to produce interoperable content identifiers.
- Metadata handling: Rules for seed metadata processing, embedding and extraction to link ISCCs with descriptive and provenance information.
- Interoperability: Guidelines for using ISCC alongside existing identifier schemes (DOI, ISBN, ISAN, ISRC, ISSN, ISWC) to support metadata linking and cross-system discovery.
Practical applications
- Persistent content identification for publishing, archiving and digital preservation.
- Content deduplication and similarity detection across platforms and repositories.
- Provenance and authenticity tracking in workflows for news, media, scientific data and cultural heritage.
- Rights management and licensing by linking content identifiers to rights metadata and registries.
- Search, discovery and interoperability across libraries, streaming services, marketplaces and research datasets.
- Integration with semantic web and distributed systems using URI and multiformats encodings.
Who should use this standard
- Metadata specialists, digital librarians and archivists
- Publishers, content platforms and streaming services
- Rights organizations, registries and legal teams
- Media production and distribution firms
- Developers working on content provenance, blockchain linking or metadata pipelines
Related standards
- DOI (digital objects), ISBN (books), ISAN (audiovisual works), ISRC (recordings), ISSN (serials), ISWC (musical works) - SIST ISO 24138:2024 explains how ISCC can complement these established schemes for richer content identification and interoperability.
Standards Content (Sample)
SLOVENIAN STANDARD
1 November 2024
Informatika in dokumentacija - Mednarodna standardna koda digitalne vsebine (ISCC)
Information and documentation — International Standard Content Code (ISCC)
Information et documentation — Code international normalisé de contenu (ISCC)
This Slovenian standard is identical to: ISO 24138:2024
ICS:
01.140.20 Information sciences
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard, in whole or in part, is not permitted.
International Standard
ISO 24138
First edition
2024-05
Information and documentation - International Standard Content Code (ISCC)
Information et documentation - Code international normalisé de contenu (ISCC)
Reference number
© ISO 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Structure and format of the ISCC
4.1 General structure
4.2 ISCC-HEADER
4.2.1 General
4.2.2 MainTypes
4.2.3 SubTypes
4.2.4 Version
4.2.5 Length
4.3 ISCC-BODY
4.4 Encoding
4.4.1 Canonical form
4.4.2 URI encoding
4.4.3 Multiformats encoding
4.4.4 Readable encoding
5 ISCC-UNITs
5.1 Meta-Code
5.1.1 General
5.1.2 Purpose
5.1.3 Format
5.1.4 Inputs
5.1.5 Outputs
5.1.6 Seed metadata processing
5.1.7 Metadata embedding
5.1.8 Metadata extraction
5.2 Content-Codes
5.2.1 General
5.2.2 Purpose
5.3 Content-Code Subtype Text
5.3.1 General
5.3.2 Format
5.3.3 Inputs
5.3.4 Outputs
5.3.5 Processing
5.3.6 Conformance
5.4 Content-Code Subtype Image
5.4.1 General
5.4.2 Format
5.4.3 Inputs
5.4.4 Outputs
5.4.5 Processing
5.4.6 Conformance
5.5 Content-Code Subtype Audio
5.5.1 General
5.5.2 Format
5.5.3 Inputs
5.5.4 Outputs
5.5.5 Processing
5.5.6 Conformance
5.6 Content-Code Subtype Video
5.6.1 General
5.6.2 Format
5.6.3 Inputs
5.6.4 Outputs
5.6.5 Processing
5.6.6 Conformance
5.7 Content-Code Subtype Mixed
5.7.1 General
5.7.2 Format
5.7.3 Inputs
5.7.4 Outputs
5.7.5 Processing
5.7.6 Conformance
5.8 Data-Code
5.8.1 General
5.8.2 Format
5.8.3 Inputs
5.8.4 Outputs
5.8.5 Processing
5.8.6 Conformance
5.9 Instance-Code
5.9.1 General
5.9.2 Format
5.9.3 Inputs
5.9.4 Outputs
5.9.5 Processing
5.9.6 Conformance
6 ISCC-CODE
6.1 General
6.2 Purpose
6.3 Format
6.3.1 General
6.3.2 SubTypes for ISCC-CODEs
6.3.3 Length and composition of ISCC-CODEs
6.4 Inputs
6.5 Outputs
6.6 Processing
6.7 Comparing ISCC-CODEs
6.8 Conformance
Annex A (normative) Relationship between ISCC and other identifier systems
Annex B (normative) ISCC metadata
Annex C (informative) Evolution of this document
Annex D (normative) Reference implementation
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 46, Information and documentation,
Subcommittee SC 9, Identification and description.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
While ISO/TC 46/SC 9 has established a variety of specific identifier standards, a content-dependent
identifier for digital assets in all content formats has not yet been agreed.
Digital content is dynamic, always in motion, and acted upon globally by a variety of entities with different
interests and requirements. Digital content continuously re-encodes, resizes, and re-compresses, changing
its data as it travels through a complex network of actors and systems.
The International Standard Content Code (ISCC) is an identifier for numerous types of digital assets. An
ISCC-CODE is generated from the digital content itself. It is the result of processing the digital content
using a variety of algorithms including hash algorithms. The generated ISCC-CODE supports data integrity
verification and preserves an estimate of the data, digital content and metadata similarity. However, ISCC
has different functionality from content recognition systems.
The ISCC supports the association of higher-level identifiers (like work and product identifiers) with the
digitally encoded manifestations of content. The ISCC does not specify a system for managing authoritative
metadata. Other content identifier standards can use ISCC to support discoverability of their identifiers and
metadata based on digital content.
Organizations, individuals and machines may generate ISCCs for numerous kinds of digital assets and use
them for identification and management of those assets.
ISCCs are neither manually nor automatically assigned to digital media assets. Instead, ISCCs are derived
from media assets according to the procedures described in this document. Unrelated parties can
independently derive the same ISCC from a given media asset.
ISCCs exclusively reference media assets without any implication about ownership. As such, ISCCs are not
managed authoritatively by any institution or entity.
The ISCC enables interoperability between different actors and systems using digital assets and supports
scenarios that require content deduplication, database synchronization and indexing, integrity verification,
timestamping, versioning, data provenance, similarity clustering, anomaly detection, usage tracking,
allocation of royalties, fact-checking and general digital asset management use-cases.
This document includes sections targeting a general audience but also descriptions of more technical
procedures.
Future editions of this document can be developed as outlined in Annex C.
International Standard ISO 24138:2024(en)
Information and documentation — International Standard
Content Code (ISCC)
1 Scope
This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an
identification system for digital assets (including encodings of text, images, audio, video or other content
across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other
schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC.
An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from
multiple hash digests using the algorithms and rules in this document. This document does not provide
information on registration of ISCCs.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)
ISO/IEC 15938, Information technology — Multimedia content description interface
ISO/IEC 21778, Information technology — The JSON data interchange syntax
IETF RFC 4648, The Base16, Base32, and Base64 Data Encodings 1)
IETF RFC 2397, The "data" URL scheme 2)
IETF RFC 8785, JSON Canonicalization Scheme (JCS) 3)
W3C, C14N 1.1, Canonical XML Version 1.1 4)
W3C, JSON-LD 1.1, A JSON-based Serialization for Linked Data 5)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
- ISO Online browsing platform: available at https://www.iso.org/obp
- IEC Electropedia: available at https://www.electropedia.org/
1) Available online: https://datatracker.ietf.org/doc/html/rfc4648
2) Available online: https://datatracker.ietf.org/doc/html/rfc2397
3) Available online: https://datatracker.ietf.org/doc/html/rfc8785
4) Available online: https://www.w3.org/TR/xml-c14n11
5) Available online: https://www.w3.org/TR/json-ld/
3.1
bit
atomic unit of information in a computer system
3.2
byte
sequence of 8 bits (3.1)
3.3
nibble
half a byte (3.2), which can be represented by a single hexadecimal digit
[SOURCE: ISO 20038:2017, 3.12]
3.4
data
ordered sequence of bits (3.1)
3.5
file
stored data (3.4) with a known number of bits (3.1) and a filename
3.6
stream
data (3.4) in transit with a known or unknown number of bits (3.1)
3.7
content
information organized to provide value to a user
3.8
digital content
manifestation of content (3.7) in form of data (3.4) structured according to a set of rules
3.9
metadata
data (3.4) that defines and describes other data
[SOURCE: ISO 24531:2013, 4.32]
3.10
seed metadata
initial metadata (3.9) used as input to a hash algorithm (3.21)
3.11
content format
set of rules used to structure digital content (3.8)
3.12
media type
two-part identifier (3.15) specifying the nature of the referenced data (3.4)
[SOURCE: ISO/IEC 19757-4:2006, 3.9]
3.13
digital asset
file (3.5) or stream (3.6) encoded in conformance with a specific content format (3.11)
3.14
referent
object which is identified
3.15
identifier
sequence of characters that uniquely denotes a referent (3.14)
3.16
identifier system
system to enable the provision of identifiers (3.15) for a given category of referents (3.14)
3.17
content identifier
identifier (3.15) whose referent (3.14) is content (3.7)
3.18
content-dependent identifier
content identifier (3.17) whose data (3.4) depends on the digital content (3.8) that it identifies
3.19
content recognition system
system whose primary purpose is to recognise digital content (3.8) on a granular level
3.20
algorithm
set of instructions
3.21
hash algorithm
deterministic algorithm (3.20) that produces fixed-length data (3.4) from an input of arbitrary-length data
3.22
hash digest
result of processing data (3.4) with a hash algorithm (3.21)
3.23
cryptographic hash function
computationally efficient function mapping binary strings of arbitrary length to binary strings of fixed
length, such that it is computationally infeasible to find two distinct values that hash into a common value
3.24
similarity hash
hash digest (3.22) that preserves correlations between inputs to the hash algorithm (3.21)
3.25
content defined chunking
CDC
method to split data (3.4) into variable length chunks based on internal features such that chunk boundaries
are more resistant to byte (3.2) shifting
3.26
actor
human or non-human (hardware or software) entity that interacts with a system
3.27
Merkle tree
tree data structure in which every leaf node is labelled with the hash digest (3.22) of a data element and
every non-leaf node is labelled with the hash digest of the labels of its child nodes
3.28
Merkle root
root node of a Merkle tree (3.27)
[SOURCE: ISO 22739:2024, 3.57]
3.29
ISCC processor
application that generates ISCCs for digital content (3.8)
3.30
plain text
data (3.4) with a known text encoding that can be transcoded to Unicode
3.31
whitespace
nondisplaying formatting characters such as spaces, tabs, etc., that are embedded within a block of free text
[SOURCE: ISO/IEC/IEEE 31320-2:2012, 3.1.210]
4 Structure and format of the ISCC
4.1 General structure
a) An ISCC shall be composed of an ISCC-HEADER and an ISCC-BODY (see Figure 1).
b) The ISCC-HEADER shall describe the MainType, SubType, Version, and Length of its ISCC-BODY.
c) An ISCC-UNIT shall be an ISCC based on one specific algorithm.
d) An ISCC-CODE shall be an ISCC composed from two or more different ISCC-UNITs.
Figure 1 — General structure of an ISCC
The concatenation of the ISCC-UNITs is based on the underlying data and is not visible in the string
representation of the ISCC-CODE itself. See 6.6 on how an ISCC-CODE is composed from individual ISCC-UNITs.
4.2 ISCC-HEADER
4.2.1 General
4.2.1.1 The ISCC-HEADER is a variable-sized bitstream composed of an ordered sequence of the four header-fields MainType, SubType, Version and Length.
4.2.1.2 Each header-field is a bitstream with a length between 4 and 16 bits and encodes an integer value
between 0 and 4679 (see Table 1) with the following encoding scheme.
a) The total bit-length of a header-field shall be determined by its prefix-bits.
b) The prefix-bits shall be followed by data-bits.
c) The data-bits shall be interpreted as an unsigned integer value offset by the lower bound of the corresponding integer range in Table 1 (that is, the maximum value of the preceding range plus one).
d) If the total length of all header fields in number of bits is not divisible by 8, the header shall be padded
with 4 zero bits (0000) on the right side.
Table 1 — Variable length ISCC-HEADER field encoding
Prefix bits   Number of nibbles   Number of data bits   Integer range
0             1                   3                     0-7
10            2                   6                     8-71
110           3                   9                     72-583
1110          4                   12                    584-4679
4.2.1.3 The interpretation of the integer value of a header-field shall be context dependent.
a) For the MainType and SubType fields, it shall be an identifier for the designated type (see 4.2.2 and 4.2.3).
b) For the Version field it shall be the literal version number (see 4.2.4).
c) For the Length field of ISCC-UNITs, it shall be a number used as a multiplier to calculate the bit length of
the ISCC-BODY (see 4.2.5, Table 6).
d) For the Length field of ISCC-CODEs, it shall be a bit-pattern encoding the combination of ISCC-UNITs and
the bit-length of the ISCC-BODY (see 4.2.5, Table 7).
EXAMPLE Header Field Examples
0 = 0000
1 = 0001
…
7 = 0111
8 = 1000 0000
9 = 1000 0001
…
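The variable-length scheme of Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only, not the reference implementation of Annex D; the names FIELD_RANGES, encode_header_field, decode_header_field and encode_header are hypothetical. Bit strings are used instead of packed integers to keep the logic readable.

```python
# Rows of Table 1: (lowest value, highest value, prefix bits, number of data bits)
FIELD_RANGES = [
    (0, 7, "0", 3),
    (8, 71, "10", 6),
    (72, 583, "110", 9),
    (584, 4679, "1110", 12),
]


def encode_header_field(value: int) -> str:
    """Return the bit string for one header-field value."""
    for low, high, prefix, data_bits in FIELD_RANGES:
        if low <= value <= high:
            # The data bits hold the offset from the lowest value of the range.
            return prefix + format(value - low, f"0{data_bits}b")
    raise ValueError(f"header-field value out of range: {value}")


def decode_header_field(bits: str) -> tuple[int, str]:
    """Decode one header-field from the front of a bit string.

    Returns the integer value and the remaining, unread bits.
    """
    for low, _high, prefix, data_bits in FIELD_RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + data_bits
            if len(bits) < end:
                raise ValueError("bit string too short for this header-field")
            return low + int(bits[len(prefix):end], 2), bits[end:]
    raise ValueError("invalid header-field prefix")


def encode_header(main_type: int, sub_type: int, version: int, length: int) -> bytes:
    """Concatenate the four header-fields and pad to a whole number of bytes."""
    bits = "".join(
        encode_header_field(v) for v in (main_type, sub_type, version, length)
    )
    if len(bits) % 8:
        bits += "0000"  # rule d): pad with 4 zero bits on the right
    return int(bits, 2).to_bytes(len(bits) // 8, "big")
```

For instance, encode_header_field(9) yields the bit string 10000001 shown in the example above, and encode_header(5, 1, 0, 5) yields the two bytes 0x51 0x05.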
4.2.2 MainTypes
The MainType header-field shall signify the type of an ISCC (see Table 2).
Backward incompatible updates to an algorithm associated with a MainType shall be indicated by
incrementing the version field of the ISCC-HEADER of the respective MainType.
NOTE This document specifies initial algorithms (version 0) for all reserved MainTypes with the exception of the
SEMANTIC type which is not currently defined.
Table 2 — Reserved MainTypes
ID Symbol Bits Definition
0 META 0000 An ISCC-UNIT that matches on metadata similarity
1 SEMANTIC 0001 An ISCC-UNIT that matches on semantic content similarity
2 CONTENT 0010 An ISCC-UNIT that matches on perceptual content similarity
3 DATA 0011 An ISCC-UNIT that matches on data similarity
4 INSTANCE 0100 An ISCC-UNIT that matches on data identity
5 ISCC 0101 An ISCC-CODE composed of two or more headerless ISCC-UNITs for multi-modal matching
4.2.3 SubTypes
The MainTypes META, DATA, and INSTANCE shall have a single default SubType NONE encoded with the
bits 0000.
The MainTypes SEMANTIC, CONTENT, and ISCC shall have SubTypes that signify the perceptual mode (see
Table 3 and Table 4).
Table 3 — Reserved SubTypes for MainTypes ISCC, SEMANTIC and CONTENT
ID Symbol Bits Definition
0 TEXT 0000 Match on text similarity
1 IMAGE 0001 Match on image similarity
2 AUDIO 0010 Match on audio similarity
3 VIDEO 0011 Match on video similarity
4 MIXED 0100 Match on multi-modal similarity
Table 4 — Additional Reserved SubTypes for the MainType ISCC
ID Symbol Bits Definition
5 SUM 0101 Composite of ISCC-UNITs including only Data- and Instance-Code
6 NONE 0110 Composite of ISCC-UNITs including Meta-, Data- and Instance-Code
4.2.4 Version
All ISCC-HEADERs shall have a version header-field of 0000 for the first edition of this document (see
Table 5).
Table 5 — Reserved ISCC Versions
ID Symbol Bits Definition
0 V0 0000 Initial version of ISCC-UNITs and ISCC-CODE
4.2.5 Length
4.2.5.1 General
The encoding of the Length header-field shall be specific to the MainType.
4.2.5.2 Length of ISCC-UNITs
For ISCC-UNITs of the MainTypes META, SEMANTIC, CONTENT, DATA and INSTANCE the length value shall
be encoded as the number of 32-bit blocks of the ISCC-BODY in addition to the minimum length of 32 bits
(see Table 6).
Table 6 — Reserved length field values (multiples of 32 bit)
ID Symbol Bits Definition
0 L32 0000 Length of body is 32 bits (minimum length)
1 L64 0001 Length of body is 64 bits (default length)
2 L96 0010 Length of body is 96 bits
3 L128 0011 Length of body is 128 bits
4 L160 0100 Length of body is 160 bits
5 L192 0101 Length of body is 192 bits
6 L224 0110 Length of body is 224 bits
7 L256 0111 Length of body is 256 bits
4.2.5.3 Length of ISCC-CODEs
a) For ISCC-CODEs, the length value shall designate the composition of ISCC-UNITs (see Table 7).
b) The Data-Code and Instance-Code shall be mandatory 64-bit components of an ISCC-CODE.
c) The first data-bit shall designate the presence of a 64-bit Meta-Code.
d) The second data-bit shall designate the presence of a 64-bit Semantic-Code.
e) The third data-bit shall designate the presence of a 64-bit Content-Code.
f) The length of an ISCC-CODE shall be calculated as the number of active data-bits times 64 plus 128 bits
of mandatory data.
Table 7 — Reserved length field values (for MainType ISCC)
ID Symbol Bits Definition
0 SUM 0000 No optional ISCC-UNITs. Length of body is 128 bits.
1 CDI 0001 Includes Content-Code. Length of body is 192 bits
2 SDI 0010 Includes Semantic-Code. Length of body is 192 bits
3 SCDI 0011 Includes Semantic- and Content-Code. Length of body is 256 bits
4 MDI 0100 Includes Meta-Code. Length of body is 192 bits
5 MCDI 0101 Includes Meta-Code and Content-Code. Length of body is 256 bits
6 MSDI 0110 Includes Meta-Code and Semantic-Code. Length of body is 256 bits
7 MSCDI 0111 Includes Meta-, Semantic-, and Content-Code. Length of body is 320 bits
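The two interpretations of the Length field (Table 6 for ISCC-UNITs, Table 7 for ISCC-CODEs) reduce to simple arithmetic. The following Python sketch is illustrative; the function names are hypothetical, and the ordering of the returned unit names follows the symbols of Table 7 as an assumption rather than a stated rule.

```python
def unit_body_bits(length_value: int) -> int:
    """Body length of an ISCC-UNIT: multiples of 32 bits, minimum 32 (Table 6)."""
    if not 0 <= length_value <= 7:
        raise ValueError("only length values 0-7 are reserved")
    return (length_value + 1) * 32


def code_body_bits(length_value: int) -> int:
    """Body length of an ISCC-CODE: 64 bits per active data-bit plus 128 bits."""
    if not 0 <= length_value <= 7:
        raise ValueError("only length values 0-7 are reserved")
    active = bin(length_value).count("1")
    return active * 64 + 128


def code_units(length_value: int) -> list[str]:
    """List the ISCC-UNITs signalled by the three data-bits (Table 7)."""
    if not 0 <= length_value <= 7:
        raise ValueError("only length values 0-7 are reserved")
    units = []
    if length_value & 0b100:  # first data-bit
        units.append("Meta-Code")
    if length_value & 0b010:  # second data-bit
        units.append("Semantic-Code")
    if length_value & 0b001:  # third data-bit
        units.append("Content-Code")
    # Data-Code and Instance-Code are always present.
    return units + ["Data-Code", "Instance-Code"]
```

With these definitions, length value 5 (MCDI) gives a 256-bit body containing Meta-, Content-, Data- and Instance-Code.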
4.3 ISCC-BODY
a) The preceding MainType, SubType, and Version fields shall qualify the semantics of an ISCC-BODY.
b) The Length field shall determine the number of bits of an ISCC-BODY.
4.4 Encoding
4.4.1 Canonical form
The printable canonical form of an ISCC shall be its RFC 4648 Base32 encoded representation without
padding and prefixed with “ISCC:”. Base32 defines an upper case standard alphabet.
EXAMPLE ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY
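As an illustration, the canonical form can be split back into header-fields and body as follows. This is a sketch under stated assumptions, not the reference implementation: read_field and parse_canonical are hypothetical names, and because Python's base64.b32decode expects padded input, the padding characters omitted by the canonical form are restored before decoding.

```python
import base64

# Table 1 as (prefix bits, number of data bits, lowest value of the range)
PREFIXES = [("0", 3, 0), ("10", 6, 8), ("110", 9, 72), ("1110", 12, 584)]


def read_field(bits: str, pos: int) -> tuple[int, int]:
    """Read one variable-length header-field; return (value, next position)."""
    for prefix, data_bits, offset in PREFIXES:
        if bits.startswith(prefix, pos):
            start = pos + len(prefix)
            return offset + int(bits[start:start + data_bits], 2), start + data_bits
    raise ValueError("invalid header-field prefix")


def parse_canonical(iscc: str) -> tuple[int, int, int, int, bytes]:
    """Return (MainType, SubType, Version, Length, body) for a canonical ISCC."""
    if not iscc.startswith("ISCC:"):
        raise ValueError("canonical form must start with 'ISCC:'")
    code = iscc[len("ISCC:"):]
    # Restore the Base32 padding that the canonical form leaves out.
    raw = base64.b32decode(code + "=" * (-len(code) % 8))
    bits = "".join(f"{byte:08b}" for byte in raw)
    pos = 0
    fields = []
    for _ in range(4):
        value, pos = read_field(bits, pos)
        fields.append(value)
    if pos % 8:
        pos += 4  # skip the 4 zero bits of header padding
    return fields[0], fields[1], fields[2], fields[3], raw[pos // 8:]
```

Applied to the example above, the sketch returns MainType 5 (ISCC), SubType 1 (IMAGE), Version 0 and Length 5 (MCDI), followed by a 32-byte body.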
4.4.2 URI encoding
An ISCC may be encoded using the syntax of a Uniform Resource Identifier (URI) as defined in RFC 3986.
a) The URI representation shall have the format <scheme>:<path>.
b) The URI scheme shall be the string “iscc”.
c) The URI path shall be the lower-cased base32 representation of an ISCC without padding.
EXAMPLE iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
NOTE Because Base32 defines an upper case standard alphabet, the canonical form differs from the URI form,
which is represented in lower case.
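Since the canonical form and the URI form differ only in the case of the prefix and of the Base32 characters, converting between them is a matter of case folding. A minimal Python sketch, with hypothetical function names:

```python
def canonical_to_uri(canonical: str) -> str:
    """Convert 'ISCC:<BASE32>' to the lower-case URI form 'iscc:<base32>'."""
    if not canonical.startswith("ISCC:"):
        raise ValueError("expected canonical form starting with 'ISCC:'")
    return "iscc:" + canonical[len("ISCC:"):].lower()


def uri_to_canonical(uri: str) -> str:
    """Convert 'iscc:<base32>' back to the upper-case canonical form."""
    if not uri.startswith("iscc:"):
        raise ValueError("expected URI form starting with 'iscc:'")
    return "ISCC:" + uri[len("iscc:"):].upper()
```

The conversion is lossless in both directions because the Base32 alphabet of RFC 4648 contains no characters that differ only by case.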
4.4.3 Multiformats encoding
The ISCC may be encoded as a multibase string [13] (see Table 8).
a) The multicodec identifier of an ISCC shall be “0xcc01” (see Table 9).
b) A Multiformat representation of an ISCC shall be prefixed with a multibase code.
c) The encoding scheme shall be <multibase-prefix><multicodec-id><iscc-header><iscc-body>.
ISCC shall support the multibase encodings given in Tables 8 and 9.
Table 8 — Supported multibase encodings
Encoding Code Definition
base16 f hexadecimal
base32 b RFC4648 case-insensitive - no padding
base32hex v RFC4648 case-insensitive - no padding - highest char
base58btc z base58 bitcoin
base64url u RFC4648 no padding
Table 9 — Examples of ISCCs in multiformats encoding
Encoding        Example
MF base16       fcc015105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
MF base32       bzqavcbontuvx2jd2qmz7pmfx2lg2qblmhuk655zyyglc5ekimjh6vqobj4
MF base32hex    vpg0l21edjklnq93qgcpvfc5nqb6qg1bc7kauttpoo6b2t4a8c97ulge19s
MF base58btc    z2Yr3BMx3Rj56fyYkNvfa19PCk4SjspQhpVWoLSGg9yXr4vUGsx
MF base64url    uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP
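The examples in Table 9 suggest that the two bytes 0xcc 0x01 are placed in front of the ISCC-HEADER and ISCC-BODY before the multibase encoding is applied. Under that assumption, four of the five encodings can be sketched with the Python standard library alone (base58btc is left out because it needs a custom encoder). The names MULTICODEC_ISCC, B32_TO_B32HEX and multiformats are hypothetical.

```python
import base64

# Multicodec identifier bytes as they appear at the start of the Table 9 examples.
MULTICODEC_ISCC = bytes([0xCC, 0x01])

# Maps the standard Base32 alphabet onto the base32hex alphabet of RFC 4648.
B32_TO_B32HEX = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    "0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def multiformats(iscc_bytes: bytes) -> dict[str, str]:
    """Encode raw ISCC bytes (header + body) in several multibase encodings."""
    data = MULTICODEC_ISCC + iscc_bytes
    b32 = base64.b32encode(data).decode("ascii").rstrip("=")
    b64url = base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    return {
        "base16": "f" + data.hex(),
        "base32": "b" + b32.lower(),
        "base32hex": "v" + b32.translate(B32_TO_B32HEX).lower(),
        "base64url": "u" + b64url,
    }
```

Called with the header bytes 0x51 0x05 followed by the 32-byte body of the example ISCC, the sketch should reproduce the corresponding rows of Table 9.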
4.4.4 Readable encoding
The ISCC may be encoded in human readable representation.
a) The readable representation shall encode the header fields with their symbols and the ISCC-BODY in
base16 lower-case.
b) The header fields and the ISCC-BODY shall be separated with hyphens.
EXAMPLE
ISCC-IMAGE-V0-MCDI-cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
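The readable representation can be assembled from the symbols of Tables 2 to 7. The Python sketch below is illustrative only; the function name readable and the lookup lists are hypothetical, and the use of the Table 6 symbols (L32 to L256) for ISCC-UNITs is an assumption drawn from rule a), since the example above shows only an ISCC-CODE.

```python
MAIN_TYPES = ["META", "SEMANTIC", "CONTENT", "DATA", "INSTANCE", "ISCC"]   # Table 2
PERCEPTUAL_SUBTYPES = ["TEXT", "IMAGE", "AUDIO", "VIDEO", "MIXED"]         # Table 3
ISCC_SUBTYPES = PERCEPTUAL_SUBTYPES + ["SUM", "NONE"]                      # Table 4
UNIT_LENGTHS = ["L32", "L64", "L96", "L128", "L160", "L192", "L224", "L256"]  # Table 6
CODE_LENGTHS = ["SUM", "CDI", "SDI", "SCDI", "MDI", "MCDI", "MSDI", "MSCDI"]  # Table 7


def readable(main_type: int, sub_type: int, version: int, length: int, body: bytes) -> str:
    """Join header-field symbols and the base16 body with hyphens."""
    if min(main_type, sub_type, version, length) < 0:
        raise ValueError("header-field values must not be negative")
    if version != 0:
        raise ValueError("only version 0 is reserved")
    try:
        main = MAIN_TYPES[main_type]
        if main == "ISCC":
            sub = ISCC_SUBTYPES[sub_type]
            length_symbol = CODE_LENGTHS[length]
        elif main in ("SEMANTIC", "CONTENT"):
            sub = PERCEPTUAL_SUBTYPES[sub_type]
            length_symbol = UNIT_LENGTHS[length]
        else:
            # META, DATA and INSTANCE have the single default SubType NONE.
            if sub_type != 0:
                raise ValueError("this MainType only has SubType NONE")
            sub = "NONE"
            length_symbol = UNIT_LENGTHS[length]
    except IndexError:
        raise ValueError("header-field value is not reserved") from None
    return "-".join([main, sub, "V0", length_symbol, body.hex()])
```

For the example ISCC (MainType 5, SubType 1, Version 0, Length 5) the sketch returns the string shown in the example above.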
5 ISCC-UNITs
5.1 Meta-Code
5.1.1 General
The Meta-Code is a similarity hash generated from referent seed metadata in accordance with Annex B.
5.1.2 Purpose
The Meta-Code shall support the following use cases:
a) clustering of digital assets based on their metadata;
b) discovery of digital assets with similar metadata;
c) verification or manual disambiguation of matching codes.
5.1.3 Format
The Meta-Code shall have the data format as illustrated in Figure 2:
Figure 2 — Data format of the Meta-Code
EXAMPLE 1 64-bit Meta-Code in its canonical form:
ISCC:AAAUL6P7RMVNT4UJ
EXAMPLE 2 256-bit Meta-Code in its canonical form:
ISCC:AADUL6P7RMVNT4UJJ4SMTDXBL5JFZ5XPCDKO42XYPJEVQ4L7PTYDORQ
5.1.4 Inputs
5.1.4.1 General
Seed metadata is the metadata that is used as the input to calculate the Meta-Code and has three possible
elements:
a) name (required): the name or title of the work manifested by the digital asset;
b) description (optional): a disambiguating textual description of the digital asset;
c) meta (optional): subject, industry, or use-case specific metadata.
Seed metadata shall be stored and carried along unaltered with ISCC Metadata if automated verification of
the Meta-Code based on the original seed metadata is required.
NOTE 1 Because seed metadata is used to construct the Meta-Code, changes to its value can produce different (and
therefore no longer matching) Meta-Codes.
NOTE 2 The identifier standards and their schemas, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC, provide helpful
guidance in selecting seed metadata.
5.1.4.2 name element
The text input for the name element shall be pre-processed before similarity hashing as follows.
a) Apply ISO/IEC 10646 NFKC Unicode Normalization (see Unicode Normalization Forms https:/
...
International
Standard
ISO 24138
First edition
Information and documentation —
2024-05
International Standard Content
Code (ISCC)
Information et documentation — Code international normalisé
de contenu (ISCC)
Reference number
© ISO 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii
Contents Page
Foreword .v
Introduction .vi
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
4 Structure and format of the ISCC . 4
4.1 General structure .4
4.2 ISCC-HEADER . .5
4.2.1 General .5
4.2.2 MainTypes .6
4.2.3 SubTypes .7
4.2.4 Version .7
4.2.5 Length .7
4.3 ISCC-BODY.8
4.4 Encoding .8
4.4.1 Canonical form .8
4.4.2 URI encoding .9
4.4.3 Multiformats encoding.9
4.4.4 Readable encoding .9
5 ISCC-UNITs .10
5.1 Meta-Code .10
5.1.1 General .10
5.1.2 Purpose .10
5.1.3 Format.10
5.1.4 Inputs .10
5.1.5 Outputs . 12
5.1.6 Seed metadata processing . 12
5.1.7 Metadata embedding . 13
5.1.8 Metadata extraction . 13
5.2 Content-Codes .14
5.2.1 General .14
5.2.2 Purpose .14
5.3 Content-Code Subtype Text .14
5.3.1 General .14
5.3.2 Format.14
5.3.3 Inputs . 15
5.3.4 Outputs . 15
5.3.5 Processing . 15
5.3.6 Conformance . 15
5.4 Content-Code Subtype Image .16
5.4.1 General .16
5.4.2 Format.16
5.4.3 Inputs .16
5.4.4 Outputs .16
5.4.5 Processing .16
5.4.6 Conformance .17
5.5 Content-Code Subtype Audio .17
5.5.1 General .17
5.5.2 Format.17
5.5.3 Inputs .17
5.5.4 Outputs .18
5.5.5 Processing .18
5.5.6 Conformance .18
iii
5.6 Content-Code Subtype Video
5.6.1 General
5.6.2 Format
5.6.3 Inputs
5.6.4 Outputs
5.6.5 Processing
5.6.6 Conformance
5.7 Content-Code Subtype Mixed
5.7.1 General
5.7.2 Format
5.7.3 Inputs
5.7.4 Outputs
5.7.5 Processing
5.7.6 Conformance
5.8 Data-Code
5.8.1 General
5.8.2 Format
5.8.3 Inputs
5.8.4 Outputs
5.8.5 Processing
5.8.6 Conformance
5.9 Instance-Code
5.9.1 General
5.9.2 Format
5.9.3 Inputs
5.9.4 Outputs
5.9.5 Processing
5.9.6 Conformance
6 ISCC-CODE
6.1 General
6.2 Purpose
6.3 Format
6.3.1 General
6.3.2 SubTypes for ISCC-CODEs
6.3.3 Length and composition of ISCC-CODEs
6.4 Inputs
6.5 Outputs
6.6 Processing
6.7 Comparing ISCC-CODEs
6.8 Conformance
Annex A (normative) Relationship between ISCC and other identifier systems
Annex B (normative) ISCC metadata
Annex C (informative) Evolution of this document
Annex D (normative) Reference implementation
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 46, Information and documentation,
Subcommittee SC 9, Identification and description.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
While ISO/TC 46/SC 9 has established a variety of specific identifier standards, a content-dependent
identifier for digital assets in all content formats has not yet been agreed.
Digital content is dynamic, always in motion, and acted upon globally by a variety of entities with different
interests and requirements. Digital content continuously re-encodes, resizes, and re-compresses, changing
its data as it travels through a complex network of actors and systems.
The International Standard Content Code (ISCC) is an identifier for numerous types of digital assets. An
ISCC-CODE is generated from the digital content itself. It is the result of processing the digital content
using a variety of algorithms including hash algorithms. The generated ISCC-CODE supports data integrity
verification and preserves an estimate of the data, digital content and metadata similarity. However, ISCC
has different functionality from content recognition systems.
The ISCC supports the association of higher-level identifiers (like work and product identifiers) with the
digitally encoded manifestations of content. The ISCC does not specify a system for managing authoritative
metadata. Other content identifier standards can use ISCC to support discoverability of their identifiers and
metadata based on digital content.
Organizations, individuals and machines may generate ISCCs for numerous kinds of digital assets and use
them for identification and management of those assets.
ISCCs are neither manually nor automatically assigned to digital media assets. Instead, ISCCs are derived
from media assets according to the procedures described in this document. Unrelated parties can
independently derive the same ISCC from a given media asset.
ISCCs exclusively reference media assets without any implication about ownership. As such, ISCCs are not
managed authoritatively by any institution or entity.
The ISCC enables interoperability between different actors and systems using digital assets and supports
scenarios that require content deduplication, database synchronization and indexing, integrity verification,
timestamping, versioning, data provenance, similarity clustering, anomaly detection, usage tracking,
allocation of royalties, fact-checking and general digital asset management use-cases.
This document includes sections targeting a general audience but also descriptions of more technical
procedures.
Future editions of this document can be developed as outlined in Annex C.
International Standard ISO 24138:2024(en)
Information and documentation — International Standard
Content Code (ISCC)
1 Scope
This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an
identification system for digital assets (including encodings of text, images, audio, video or other content
across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other
schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC.
An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from
multiple hash digests using the algorithms and rules in this document. This document does not provide
information on registration of ISCCs.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)
ISO/IEC 15938, Information technology — Multimedia content description interface
ISO/IEC 21778, Information technology — The JSON data interchange syntax
IETF RFC 4648, The Base16, Base32, and Base64 Data Encodings 1)
IETF RFC 2397, The "data" URL scheme 2)
IETF RFC 8785, JSON Canonicalization Scheme (JCS) 3)
W3C, C14N 1.1, Canonical XML Version 1.1 4)
W3C, JSON-LD 1.1, A JSON-based Serialization for Linked Data 5)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
1) Online available: https://datatracker.ietf.org/doc/html/rfc4648
2) Online available: https://datatracker.ietf.org/doc/html/rfc2397
3) Online available: https://datatracker.ietf.org/doc/html/rfc8785
4) Online available: https://www.w3.org/TR/xml-c14n11
5) Online available: https://www.w3.org/TR/json-ld/
3.1
bit
atomic unit of information in a computer system
3.2
byte
sequence of 8 bits (3.1)
3.3
nibble
half a byte (3.2), which can be represented by a single hexadecimal digit
[SOURCE: ISO 20038:2017, 3.12]
3.4
data
ordered sequence of bits (3.1)
3.5
file
stored data (3.4) with a known number of bits (3.1) and a filename
3.6
stream
data (3.4) in transit with a known or unknown number of bits (3.1)
3.7
content
information organized to provide value to a user
3.8
digital content
manifestation of content (3.7) in form of data (3.4) structured according to a set of rules
3.9
metadata
data (3.4) that defines and describes other data
[SOURCE: ISO 24531:2013, 4.32]
3.10
seed metadata
initial metadata (3.9) used as input to a hash algorithm (3.21)
3.11
content format
set of rules used to structure digital content (3.8)
3.12
media type
two-part identifier (3.15) specifying the nature of the referenced data (3.4)
[SOURCE: ISO/IEC 19757-4:2006, 3.9]
3.13
digital asset
file (3.5) or stream (3.6) encoded in conformance with a specific content format (3.11)
3.14
referent
object which is identified
3.15
identifier
sequence of characters that uniquely denotes a referent (3.14)
3.16
identifier system
system to enable the provision of identifiers (3.15) for a given category of referents (3.14)
3.17
content identifier
identifier (3.15) whose referent (3.14) is content (3.7)
3.18
content-dependent identifier
content identifier (3.17) whose data (3.4) depends on the digital content (3.8) that it identifies
3.19
content recognition system
system whose primary purpose is to recognise digital content (3.8) on a granular level
3.20
algorithm
set of instructions
3.21
hash algorithm
deterministic algorithm (3.20) that produces fixed-length data (3.4) from an input of arbitrary-length data
3.22
hash digest
result of processing data (3.4) with a hash algorithm (3.21)
3.23
cryptographic hash function
computationally efficient function mapping binary strings of arbitrary length to binary strings of fixed
length, such that it is computationally infeasible to find two distinct values that hash into a common value
3.24
similarity hash
hash digest (3.22) that preserves correlations between inputs to the hash algorithm (3.21)
3.25
content defined chunking
CDC
method to split data (3.4) into variable length chunks based on internal features such that chunk boundaries
are more resistant to byte (3.2) shifting
3.26
actor
human or non-human (hardware or software) entity that interacts with a system
3.27
Merkle tree
tree data structure in which every leaf node is labelled with the hash digest (3.22) of a data element and
every non-leaf node is labelled with the hash digest of the labels of its child nodes
3.28
Merkle root
root node of a Merkle tree (3.27)
[SOURCE: ISO 22739:2024, 3.57]
3.29
ISCC processor
application that generates ISCCs for digital content (3.8)
3.30
plain text
data (3.4) with a known text encoding that can be transcoded to Unicode
3.31
whitespace
nondisplaying formatting characters such as spaces, tabs, etc., that are embedded within a block of free text
[SOURCE: ISO/IEC/IEEE 31320-2:2012, 3.1.210]
4 Structure and format of the ISCC
4.1 General structure
a) An ISCC shall be composed of an ISCC-HEADER and an ISCC-BODY (see Figure 1).
b) The ISCC-HEADER shall describe the MainType, SubType, Version, and Length of its ISCC-BODY.
c) An ISCC-UNIT shall be an ISCC based on one specific algorithm.
d) An ISCC-CODE shall be an ISCC composed from two or more different ISCC-UNITs.
Figure 1 — General structure of an ISCC
The concatenation of the ISCC-UNITs is based on the underlying data and is not visible in the string
representation of the ISCC-CODE itself. See 6.6 on how an ISCC-CODE is composed from individual ISCC-UNITs.
4.2 ISCC-HEADER
4.2.1 General
4.2.1.1 The ISCC-HEADER is a variable sized bitstream composed of an ordered sequence of the 4
header-fields MainType, SubType, Version, Length.
4.2.1.2 Each header-field is a bitstream with a length between 4 and 16 bits and encodes an integer value
between 0 and 4679 (see Table 1) with the following encoding scheme.
a) The total bit-length of a header-field shall be determined by its prefix-bits.
b) The prefix-bits shall be followed by data-bits.
c) The data-bits shall be interpreted as unsigned integer values plus the maximum value of the
preceding range.
d) If the total length of all header fields in number of bits is not divisible by 8, the header shall be padded
with 4 zero bits (0000) on the right side.
Table 1 — Variable length ISCC-HEADER field encoding
Prefix bits   Number of nibbles   Number of data bits   Integer range
0             1                   3                     0-7
10            2                   6                     8-71
110           3                   9                     72-583
1110          4                   12                    584-4679
4.2.1.3 The interpretation of the integer value of a header-field shall be context dependent.
a) For the MainType and SubType fields, it shall be an identifier for the designated type (see 4.2.2 and 4.2.3).
b) For the Version field it shall be the literal version number (see 4.2.4).
c) For the Length field of ISCC-UNITs, it shall be a number used as a multiplier to calculate the bit length of
the ISCC-BODY (see 4.2.5, Table 6).
d) For the Length field of ISCC-CODEs, it shall be a bit-pattern encoding the combination of ISCC-UNITs and
the bit-length of the ISCC-BODY (see 4.2.5, Table 7).
EXAMPLE Header Field Examples
0 = 0000
1 = 0001
…
7 = 0111
8 = 1000 0000
9 = 1000 0001
…
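The encoding scheme of 4.2.1.2 can be sketched in a few lines of Python. This is an illustrative sketch, not the reference implementation of Annex D; the names `RANGES`, `encode_field`, `decode_field` and `encode_header` are hypothetical. Following Table 1 and the examples above (8 = 1000 0000), the sketch assumes that the value added to the data-bits is the first value of the field's own range.

```python
# Rows of Table 1: (prefix bits, number of data bits, first value of the range)
RANGES = [("0", 3, 0), ("10", 6, 8), ("110", 9, 72), ("1110", 12, 584)]


def encode_field(value: int) -> str:
    """Return the bit string of one header-field for an integer in 0..4679."""
    for prefix, nbits, first in reversed(RANGES):
        if value >= first:
            if value - first >= 1 << nbits:
                raise ValueError("value exceeds the largest range of Table 1")
            return prefix + format(value - first, f"0{nbits}b")
    raise ValueError("value must not be negative")


def decode_field(bits: str) -> tuple[int, str]:
    """Decode one header-field from the front of a bit string.

    Returns the integer value and the bits that remain after the field.
    """
    for prefix, nbits, first in RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + nbits
            return first + int(bits[len(prefix):end], 2), bits[end:]
    raise ValueError("unknown prefix bits")


def encode_header(main_type: int, sub_type: int, version: int, length: int) -> bytes:
    """Concatenate the four header-fields and pad to a whole number of bytes."""
    bits = "".join(encode_field(v) for v in (main_type, sub_type, version, length))
    if len(bits) % 8:
        bits += "0000"  # rule d): pad with 4 zero bits on the right side
    return int(bits, 2).to_bytes(len(bits) // 8, "big")
```

For instance, `encode_header(5, 1, 0, 5)` produces the two bytes `51 05`, which is the header of the ISCC-CODE used in the examples of 4.4.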
4.2.2 MainTypes
The MainType header-field shall signify the type of an ISCC (see Table 2).
Backward incompatible updates to an algorithm associated with a MainType shall be indicated by
incrementing the version field of the ISCC-HEADER of the respective MainType.
NOTE This document specifies initial algorithms (version 0) for all reserved MainTypes with the exception of the
SEMANTIC type which is not currently defined.
Table 2 — Reserved MainTypes
ID Symbol Bits Definition
0 META 0000 An ISCC-UNIT that matches on metadata similarity
1 SEMANTIC 0001 An ISCC-UNIT that matches on semantic content similarity
2 CONTENT 0010 An ISCC-UNIT that matches on perceptual content similarity
3 DATA 0011 An ISCC-UNIT that matches on data similarity
4 INSTANCE 0100 An ISCC-UNIT that matches on data identity
5 ISCC 0101 An ISCC-CODE composed of two or more headerless ISCC-UNITs for multi-modal matching
4.2.3 SubTypes
The MainTypes META, DATA, and INSTANCE shall have a single default SubType NONE encoded with the
bits 0000.
The MainTypes SEMANTIC, CONTENT, and ISCC shall have SubTypes that signify the perceptual mode (see
Table 3 and Table 4).
Table 3 — Reserved SubTypes for MainTypes ISCC, SEMANTIC and CONTENT
ID Symbol Bits Definition
0 TEXT 0000 Match on text similarity
1 IMAGE 0001 Match on image similarity
2 AUDIO 0010 Match on audio similarity
3 VIDEO 0011 Match on video similarity
4 MIXED 0100 Match on multi-modal similarity
Table 4 — Additional Reserved SubTypes for the MainType ISCC
ID Symbol Bits Definition
5 SUM 0101 Composite of ISCC-UNITs including only Data- and Instance-Code
6 NONE 0110 Composite of ISCC-UNITs including Meta-, Data- and Instance-Code
4.2.4 Version
All ISCC-HEADERs shall have a version header-field of 0000 for the first edition of this document (see
Table 5).
Table 5 — Reserved ISCC Versions
ID Symbol Bits Definition
0 V0 0000 Initial version of ISCC-UNITs and ISCC-CODE
4.2.5 Length
4.2.5.1 General
The encoding of the Length header-field shall be specific to the MainType.
4.2.5.2 Length of ISCC-UNITs
For ISCC-UNITs of the MainTypes META, SEMANTIC, CONTENT, DATA and INSTANCE the length value shall
be encoded as the number of 32-bit blocks of the ISCC-BODY in addition to the minimum length of 32 bits
(see Table 6).
Table 6 — Reserved length field values (multiples of 32 bit)
ID Symbol Bits Definition
0 L32 0000 Length of body is 32 bits (minimum length)
1 L64 0001 Length of body is 64 bits (default length)
2 L96 0010 Length of body is 96 bits
3 L128 0011 Length of body is 128 bits
4 L160 0100 Length of body is 160 bits
5 L192 0101 Length of body is 192 bits
6 L224 0110 Length of body is 224 bits
7 L256 0111 Length of body is 256 bits
4.2.5.3 Length of ISCC-CODEs
a) For ISCC-CODEs, the length value shall designate the composition of ISCC-UNITs (see Table 7).
b) The Data-Code and Instance-Code shall be mandatory 64-bit components of an ISCC-CODE.
c) The first data-bit shall designate the presence of a 64-bit Meta-Code.
d) The second data-bit shall designate the presence of a 64-bit Semantic-Code.
e) The third data-bit shall designate the presence of a 64-bit Content-Code.
f) The length of an ISCC-CODE shall be calculated as the number of active data-bits times 64 plus 128 bits
of mandatory data.
Table 7 — Reserved length field values (for MainType ISCC)
ID Symbol Bits Definition
0 SUM 0000 No optional ISCC-UNITs. Length of body is 128 bits.
1 CDI 0001 Includes Content-Code. Length of body is 192 bits
2 SDI 0010 Includes Semantic-Code. Length of body is 192 bits
3 SCDI 0011 Includes Semantic- and Content-Code. Length of body is 256 bits
4 MDI 0100 Includes Meta-Code. Length of body is 192 bits
5 MCDI 0101 Includes Meta-Code and Content-Code. Length of body is 256 bits
6 MSDI 0110 Includes Meta-Code and Semantic-Code. Length of body is 256 bits
7 MSCDI 0111 Includes Meta-, Semantic-, and Content-Code. Length is 320 bits
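The two interpretations of the Length field (Table 6 for ISCC-UNITs, Table 7 for ISCC-CODEs) can be illustrated as follows. This is a minimal sketch with hypothetical function names; the order in which the units are listed follows the symbols of Table 7 and is an assumption of the sketch, not a statement about 6.6.

```python
# Optional units in the order of the three data-bits, from first to third
OPTIONAL_UNITS = ("Meta", "Semantic", "Content")


def unit_body_bits(length_value: int) -> int:
    """Bit length of an ISCC-UNIT body (Table 6): 32 bits plus 32 per step."""
    if not 0 <= length_value <= 7:
        raise ValueError("Table 6 reserves the values 0 to 7")
    return 32 * (length_value + 1)


def code_composition(length_value: int) -> tuple[list[str], int]:
    """Units and body bit length of an ISCC-CODE (Table 7)."""
    if not 0 <= length_value <= 7:
        raise ValueError("Table 7 reserves the values 0 to 7")
    data_bits = format(length_value, "03b")
    optional = [name for name, bit in zip(OPTIONAL_UNITS, data_bits) if bit == "1"]
    # Data-Code and Instance-Code are mandatory and contribute 128 bits
    return optional + ["Data", "Instance"], 64 * len(optional) + 128
```

With these definitions, the value 5 (bits 101, symbol MCDI) yields a Meta-, Content-, Data- and Instance-Code with a 256-bit body.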
4.3 ISCC-BODY
a) The preceding MainType, SubType, and Version fields shall qualify the semantics of an ISCC-BODY.
b) The Length field shall determine the number of bits of an ISCC-BODY.
4.4 Encoding
4.4.1 Canonical form
The printable canonical form of an ISCC shall be its RFC 4648 Base32 encoded representation without
padding and prefixed with “ISCC:”. Base32 defines an upper case standard alphabet.
EXAMPLE ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY
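A short Python sketch of the canonical form, using only the standard library. The function names are hypothetical and the sketch is not the reference implementation; it only shows the Base32 step with the padding removed on encoding and restored on decoding.

```python
import base64


def encode_canonical(iscc_bytes: bytes) -> str:
    """Encode ISCC-HEADER + ISCC-BODY bytes in the canonical form."""
    code = base64.b32encode(iscc_bytes).decode("ascii").rstrip("=")
    return "ISCC:" + code


def decode_canonical(iscc: str) -> bytes:
    """Return the ISCC-HEADER + ISCC-BODY bytes of a canonical ISCC string."""
    if not iscc.startswith("ISCC:"):
        raise ValueError("canonical form must start with 'ISCC:'")
    code = iscc[len("ISCC:"):]
    # base64.b32decode expects input padded to a multiple of 8 characters
    return base64.b32decode(code + "=" * (-len(code) % 8))
```

Decoding the example above gives 34 bytes: the 2-byte header `51 05` followed by a 256-bit body.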
4.4.2 URI encoding
An ISCC may be encoded using the syntax of a Uniform Resource Identifier (URI) as defined in RFC 3986.
a) The URI representation shall have the format <scheme>:<path>.
b) The URI scheme shall be the string “iscc”.
c) The URI path shall be the lower-cased base32 representation of an ISCC without padding.
EXAMPLE iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
NOTE Because Base32 defines an upper case standard alphabet, the canonical form differs from the URI form,
which is represented in lower case.
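Since the canonical form and the URI form differ only in letter case, conversion between them is a case change on both the prefix and the Base32 string. A minimal sketch with hypothetical function names:

```python
def canonical_to_uri(iscc: str) -> str:
    """Convert 'ISCC:<BASE32>' to the URI form 'iscc:<base32>'."""
    prefix, _, code = iscc.partition(":")
    if prefix != "ISCC" or not code:
        raise ValueError("not an ISCC in canonical form")
    return "iscc:" + code.lower()


def uri_to_canonical(uri: str) -> str:
    """Convert the URI form 'iscc:<base32>' back to the canonical form."""
    scheme, _, path = uri.partition(":")
    if scheme != "iscc" or not path:
        raise ValueError("not an ISCC URI")
    return "ISCC:" + path.upper()
```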
4.4.3 Multiformats encoding
The ISCC may be encoded as a multibase string [13] (see Table 8).
a) The multicodec identifier of an ISCC shall be “0xcc01” (see Table 9).
b) A Multiformat representation of an ISCC shall be prefixed with a multibase code.
c) The encoding scheme shall be .
ISCC shall support the multibase encodings given in Tables 8 and 9.
Table 8 — Supported multibase encodings
Encoding Code Definition
base16 f hexadecimal
base32 b RFC4648 case-insensitive - no padding
base32hex v RFC4648 case-insensitive - no padding - highest char
base58btc z base58 bitcoin
base64url u RFC4648 no padding
Table 9 — Examples of ISCCs in multiformats encoding
Encoding Example
MF base16 fcc015105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
MF base32 bzqavcbontuvx2jd2qmz7pmfx2lg2qblmhuk655zyyglc5ekimjh6vqobj4
MF base32hex vpg0l21edjklnq93qgcpvfc5nqb6qg1bc7kauttpoo6b2t4a8c97ulge19s
MF base58btc z2Yr3BMx3Rj56fyYkNvfa19PCk4SjspQhpVWoLSGg9yXr4vUGsx
MF base64url uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP
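The examples in Table 9 are consistent with the following layout: the multibase code, then the encoded bytes of the multicodec identifier `cc01`, the ISCC-HEADER and the ISCC-BODY. The sketch below assumes that layout (it is inferred from the examples, not quoted from the text) and covers only the three encodings available in the Python standard library on all current versions; base32hex and base58btc are left out. All names are hypothetical.

```python
import base64

MULTICODEC_ISCC = bytes.fromhex("cc01")  # multicodec identifier 0xcc01


def _base32(data: bytes) -> str:
    # RFC 4648 Base32, lower case, no padding
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()


def _base64url(data: bytes) -> str:
    # RFC 4648 URL-safe Base64, no padding
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


# Multibase code (Table 8) mapped to the function that encodes the payload
ENCODERS = {"f": bytes.hex, "b": _base32, "u": _base64url}


def encode_multiformat(iscc_bytes: bytes, code: str = "b") -> str:
    """Encode ISCC-HEADER + ISCC-BODY bytes as a multibase string."""
    if code not in ENCODERS:
        raise ValueError(f"multibase code {code!r} is not covered by this sketch")
    return code + ENCODERS[code](MULTICODEC_ISCC + iscc_bytes)
```

Applied to the 34 bytes of the example ISCC-CODE, the three encoders give the base16, base32 and base64url strings of Table 9.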
4.4.4 Readable encoding
The ISCC may be encoded in human readable representation.
a) The readable representation shall encode the header fields with their symbols and the ISCC-BODY in
base16 lower-case.
b) The header fields and the ISCC-BODY shall be separated with hyphens.
EXAMPLE
ISCC-IMAGE-V0-MCDI-cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
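The readable representation can be assembled from the decoded header-field values and the body bytes. The sketch below takes the four integers as input rather than parsing the header; the symbol tables are copied from Tables 2 to 7, and the function name `readable` is hypothetical.

```python
MAIN_TYPES = {0: "META", 1: "SEMANTIC", 2: "CONTENT", 3: "DATA", 4: "INSTANCE", 5: "ISCC"}
MODES = {0: "TEXT", 1: "IMAGE", 2: "AUDIO", 3: "VIDEO", 4: "MIXED"}  # Table 3
ISCC_ONLY = {5: "SUM", 6: "NONE"}  # Table 4
CODE_LENGTHS = {0: "SUM", 1: "CDI", 2: "SDI", 3: "SCDI",
                4: "MDI", 5: "MCDI", 6: "MSDI", 7: "MSCDI"}  # Table 7


def readable(main_type: int, sub_type: int, version: int, length: int, body: bytes) -> str:
    """Join header symbols and the base16 body with hyphens."""
    main = MAIN_TYPES[main_type]
    if main in ("META", "DATA", "INSTANCE"):
        if sub_type != 0:
            raise ValueError("this MainType only has the SubType NONE (0000)")
        sub = "NONE"
    elif main == "ISCC":
        sub = {**MODES, **ISCC_ONLY}[sub_type]
    else:
        sub = MODES[sub_type]
    if main == "ISCC":
        size = CODE_LENGTHS[length]  # composition symbol, Table 7
    else:
        size = f"L{32 * (length + 1)}"  # body length symbol, Table 6
    return "-".join([main, sub, f"V{version}", size, body.hex()])
```

The header values 5, 1, 0, 5 together with the 256-bit body of the example reproduce the string shown above.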
5 ISCC-UNITs
5.1 Meta-Code
5.1.1 General
The Meta-Code is a similarity hash generated from referent seed metadata in accordance with Annex B.
5.1.2 Purpose
The Meta-Code shall support the following use cases:
a) clustering of digital assets based on their metadata;
b) discovery of digital assets with similar metadata;
c) verification or manual disambiguation of matching codes.
5.1.3 Format
The Meta-Code shall have the data format as illustrated in Figure 2:
Figure 2 — Data format of the Meta-Code
EXAMPLE 1 64-bit Meta-Code in its canonical form:
ISCC:AAAUL6P7RMVNT4UJ
EXAMPLE 2 256-bit Meta-Code in its canonical form:
ISCC:AADUL6P7RMVNT4UJJ4SMTDXBL5JFZ5XPCDKO42XYPJEVQ4L7PTYDORQ
5.1.4 Inputs
5.1.4.1 General
Seed metadata is the metadata that is used as the input to calculate the Meta-Code and has three possible
elements:
a) name (required): the name or title of the work manifested by the digital asset;
b) description (optional): a disambiguating textual description of the digital asset;
c) meta (optional): subject, industry, or use-case specific metadata.
Seed metadata shall be stored and carried along unaltered with ISCC Metadata if automated verification of
the Meta-Code based on the original seed metadata is required.
NOTE 1 Because seed metadata is used to construct the Meta-Code, changes to its value can produce different (and
therefore no longer matching) Meta-Codes.
NOTE 2 The identifier standards and their schemas, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC, provide helpful
guidance in selecting seed metadata.
5.1.4.2 name element
The text input for the name element shall be pre-processed before similarity hashing as follows.
a) Apply ISO/IEC 10646 NFKC Unicode Normalization (see Unicode Normalization Forms
https://unicode.org/reports/tr15/#Norm_Forms).
b) Remove control characters (see Unicode Character Database https://www.unicode.org/ucd/).
c) Strip leading and trailing whitespace.
d) Trim the end of the text such that the UTF-8 encoded size does not exceed 128 bytes.
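The four steps for the name element can be sketched with the Python standard library. Two points are assumptions of this sketch: "control characters" is read as the Unicode general category Cc, and the trim drops a multi-byte character that would be cut by the 128-byte limit rather than keeping a partial byte sequence. The function name `clean_name` is hypothetical.

```python
import unicodedata


def clean_name(text: str, max_bytes: int = 128) -> str:
    """Pre-process the name element before similarity hashing."""
    # a) NFKC Unicode normalization
    text = unicodedata.normalize("NFKC", text)
    # b) remove control characters (assumed here: general category Cc)
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    # c) strip leading and trailing whitespace
    text = text.strip()
    # d) trim so that the UTF-8 encoding does not exceed max_bytes;
    #    an incomplete trailing byte sequence is discarded
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
```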
5.1.4.3 description element
Text input for the description element shall be pre-processed before similarity hashing as follows.
a) Apply NFKC Unicode Normalization.
b) Remove control characters (as specified by Unicode Character Database) except for the following
newline characters:
1) U+000A - Line Feed;
2) U+000B - Vertical Tab;
3) U+000C - Form Feed;
4) U+000D - Carriage Return;
5) U+0085 - Next Line;
6) U+2028 - Line Separator;
7) U+2029 - Paragraph Separator.
c) Collapse more than two consecutive newline characters to a maximum of two consecutive newlines.
d) Strip leading and trailing whitespace characters.
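A comparable sketch for the description element. As assumptions of this sketch, "control characters" is read as the Unicode general category Cc, and a run of three or more newline characters is collapsed by keeping its first two characters unchanged; the text above does not say which two characters of a run survive. A CR LF pair counts as two newline characters here. The names are hypothetical.

```python
import re
import unicodedata

# The seven newline characters that are exempt from removal
NEWLINES = "\u000a\u000b\u000c\u000d\u0085\u2028\u2029"
_NEWLINE_RUN = re.compile("[" + NEWLINES + "]{3,}")


def clean_description(text: str) -> str:
    """Pre-process the description element before similarity hashing."""
    # a) NFKC Unicode normalization
    text = unicodedata.normalize("NFKC", text)
    # b) remove control characters (assumed: category Cc) except newlines
    text = "".join(
        ch for ch in text if ch in NEWLINES or unicodedata.category(ch) != "Cc"
    )
    # c) collapse runs of more than two newline characters to two
    text = _NEWLINE_RUN.sub(lambda match: match.group(0)[:2], text)
    # d) strip leading and trailing whitespace
    return text.strip()
```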
5.1.4.4 meta element
a) The value of the meta element shall be wrapped in an RFC 2397 Data-URL.
b) The value of the meta
...
Frequently Asked Questions
SIST ISO 24138:2024 is a standard published by the Slovenian Institute for Standardization (SIST). Its full title is "Information and documentation - International Standard Content Code (ISCC)". This standard covers: This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an identification system for digital assets (including encodings of text, images, audio, video or other content across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC. An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from multiple hash digests using the algorithms and rules in this document. This document does not provide information on registration of ISCCs.
SIST ISO 24138:2024 is classified under the following ICS (International Classification for Standards) categories: 01.140.20 - Information sciences. The ICS classification helps identify the subject area and facilitates finding related standards.
The SIST ISO 24138:2024 document offers a comprehensive framework for the International Standard Content Code (ISCC), aimed at enhancing the identification and management of digital assets across various media sectors. The standard's scope is clearly defined, providing a robust syntax and structure that underpins the ISCC as an effective identification system. One of the document's key strengths is its detailed specification of how the ISCC is constructed, emphasizing its deterministic nature through the use of hash digests. This ensures unique identifiers for digital assets, addressing a critical need in the rapidly evolving digital landscape, where content proliferation makes accurate identification imperative. Furthermore, the integration of ISCC metadata facilitates enhanced organization and retrieval of digital resources. The relevance of the ISO 24138:2024 extends beyond standalone application, as it outlines the interoperability of the ISCC with existing standards such as DOI, ISAN, ISBN, ISRC, ISSN, and ISWC. This compatibility is essential for stakeholders across the digital asset management ecosystem, allowing for seamless integration into current workflows and systems, thereby streamlining processes and improving efficiency. In summary, the SIST ISO 24138:2024 document stands out for its thorough examination of the ISCC, its structured approach to digital asset identification, and its commitment to interoperability with other standards, making it an indispensable resource for professionals in information and documentation fields.









