Information and documentation — International Standard Content Code (ISCC)

This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an identification system for digital assets (including encodings of text, images, audio, video or other content across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC.
An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from multiple hash digests using the algorithms and rules in this document. This document does not provide information on registration of ISCCs.

Information et documentation — Code international normalisé de contenu (ISCC)

Informatika in dokumentacija - Mednarodna standardna koda digitalne vsebine (ISCC)


General Information

Status: Published
Publication Date: 06-Oct-2024
Current Stage: 6060 - National Implementation/Publication (Adopted Project)
Start Date: 17-Sep-2024
Due Date: 22-Nov-2024
Completion Date: 07-Oct-2024
Standard
SIST ISO 24138:2024
English language
39 pages
Standard
ISO 24138:2024 - Information and documentation — International Standard Content Code (ISCC) Released: 15.05.2024
English language
33 pages

Standards Content (Sample)


SLOVENIAN STANDARD
01-November-2024
Informatika in dokumentacija - Mednarodna standardna koda digitalne vsebine (ISCC)
Information and documentation — International Standard Content Code (ISCC)
Information et documentation — Code international normalisé de contenu (ISCC)
This Slovenian standard is identical to: ISO 24138:2024
ICS:
01.140.20 Information sciences
2003-01. Slovenian Institute for Standardization. Reproduction of all or part of this standard is not permitted.

International Standard
ISO 24138
First edition
2024-05
Information and documentation — International Standard Content Code (ISCC)
Information et documentation — Code international normalisé de contenu (ISCC)
Reference number
© ISO 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Structure and format of the ISCC
4.1 General structure
4.2 ISCC-HEADER
4.2.1 General
4.2.2 MainTypes
4.2.3 SubTypes
4.2.4 Version
4.2.5 Length
4.3 ISCC-BODY
4.4 Encoding
4.4.1 Canonical form
4.4.2 URI encoding
4.4.3 Multiformats encoding
4.4.4 Readable encoding
5 ISCC-UNITs
5.1 Meta-Code
5.1.1 General
5.1.2 Purpose
5.1.3 Format
5.1.4 Inputs
5.1.5 Outputs
5.1.6 Seed metadata processing
5.1.7 Metadata embedding
5.1.8 Metadata extraction
5.2 Content-Codes
5.2.1 General
5.2.2 Purpose
5.3 Content-Code Subtype Text
5.3.1 General
5.3.2 Format
5.3.3 Inputs
5.3.4 Outputs
5.3.5 Processing
5.3.6 Conformance
5.4 Content-Code Subtype Image
5.4.1 General
5.4.2 Format
5.4.3 Inputs
5.4.4 Outputs
5.4.5 Processing
5.4.6 Conformance
5.5 Content-Code Subtype Audio
5.5.1 General
5.5.2 Format
5.5.3 Inputs
5.5.4 Outputs
5.5.5 Processing
5.5.6 Conformance
5.6 Content-Code Subtype Video
5.6.1 General
5.6.2 Format
5.6.3 Inputs
5.6.4 Outputs
5.6.5 Processing
5.6.6 Conformance
5.7 Content-Code Subtype Mixed
5.7.1 General
5.7.2 Format
5.7.3 Inputs
5.7.4 Outputs
5.7.5 Processing
5.7.6 Conformance
5.8 Data-Code
5.8.1 General
5.8.2 Format
5.8.3 Inputs
5.8.4 Outputs
5.8.5 Processing
5.8.6 Conformance
5.9 Instance-Code
5.9.1 General
5.9.2 Format
5.9.3 Inputs
5.9.4 Outputs
5.9.5 Processing
5.9.6 Conformance
6 ISCC-CODE
6.1 General
6.2 Purpose
6.3 Format
6.3.1 General
6.3.2 SubTypes for ISCC-CODEs
6.3.3 Length and composition of ISCC-CODEs
6.4 Inputs
6.5 Outputs
6.6 Processing
6.7 Comparing ISCC-CODEs
6.8 Conformance
Annex A (normative) Relationship between ISCC and other identifier systems
Annex B (normative) ISCC metadata
Annex C (informative) Evolution of this document
Annex D (normative) Reference implementation
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 46, Information and documentation,
Subcommittee SC 9, Identification and description.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
While ISO/TC 46/SC 9 has established a variety of specific identifier standards, a content-dependent
identifier for digital assets in all content formats has not yet been agreed.
Digital content is dynamic, always in motion, and acted upon globally by a variety of entities with different
interests and requirements. Digital content continuously re-encodes, resizes, and re-compresses, changing
its data as it travels through a complex network of actors and systems.
The International Standard Content Code (ISCC) is an identifier for numerous types of digital assets. An
ISCC-CODE is generated from the digital content itself. It is the result of processing the digital content
using a variety of algorithms including hash algorithms. The generated ISCC-CODE supports data integrity
verification and preserves an estimate of the data, digital content and metadata similarity. However, ISCC
has different functionality from content recognition systems.
The ISCC supports the association of higher-level identifiers (like work and product identifiers) with the
digitally encoded manifestations of content. The ISCC does not specify a system for managing authoritative
metadata. Other content identifier standards can use ISCC to support discoverability of their identifiers and
metadata based on digital content.
Organizations, individuals and machines may generate ISCCs for numerous kinds of digital assets and use
them for identification and management of those assets.
ISCCs are neither manually nor automatically assigned to digital media assets. Instead, ISCCs are derived
from media assets according to the procedures described in this document. Unrelated parties can
independently derive the same ISCC from a given media asset.
ISCCs exclusively reference media assets without any implication about ownership. As such, ISCCs are not
managed authoritatively by any institution or entity.
The ISCC enables interoperability between different actors and systems using digital assets and supports
scenarios that require content deduplication, database synchronization and indexing, integrity verification,
timestamping, versioning, data provenance, similarity clustering, anomaly detection, usage tracking,
allocation of royalties, fact-checking and general digital asset management use-cases.
This document includes sections targeting a general audience but also descriptions of more technical
procedures.
Future editions of this document can be developed as outlined in Annex C.

International Standard ISO 24138:2024(en)
Information and documentation — International Standard
Content Code (ISCC)
1 Scope
This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an
identification system for digital assets (including encodings of text, images, audio, video or other content
across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other
schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC.
An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from
multiple hash digests using the algorithms and rules in this document. This document does not provide
information on registration of ISCCs.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)
ISO/IEC 15938, Information technology — Multimedia content description interface
ISO/IEC 21778, Information technology — The JSON data interchange syntax
IETF RFC 4648, The Base16, Base32, and Base64 Data Encodings 1)
IETF RFC 2397, The "data" URL scheme 2)
IETF RFC 8785, JSON Canonicalization Scheme (JCS) 3)
W3C, C14N 1.1, Canonical XML Version 1.1 4)
W3C, JSON-LD 1.1, A JSON-based Serialization for Linked Data 5)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
1) Online available: https://datatracker.ietf.org/doc/html/rfc4648
2) Online available: https://datatracker.ietf.org/doc/html/rfc2397
3) Online available: https://datatracker.ietf.org/doc/html/rfc8785
4) Online available: https://www.w3.org/TR/xml-c14n11
5) Online available: https://www.w3.org/TR/json-ld/

3.1
bit
atomic unit of information in a computer system
3.2
byte
sequence of 8 bits (3.1)
3.3
nibble
half a byte (3.2), which can be represented by a single hexadecimal digit
[SOURCE: ISO 20038:2017, 3.12]
3.4
data
ordered sequence of bits (3.1)
3.5
file
stored data (3.4) with a known number of bits (3.1) and a filename
3.6
stream
data (3.4) in transit with a known or unknown number of bits (3.1)
3.7
content
information organized to provide value to a user
3.8
digital content
manifestation of content (3.7) in form of data (3.4) structured according to a set of rules
3.9
metadata
data (3.4) that defines and describes other data
[SOURCE: ISO 24531:2013, 4.32]
3.10
seed metadata
initial metadata (3.9) used as input to a hash algorithm (3.21) function
3.11
content format
set of rules used to structure digital content (3.8)
3.12
media type
two-part identifier (3.15) specifying the nature of the referenced data (3.4)
[SOURCE: ISO/IEC 19757-4:2006, 3.9]
3.13
digital asset
file (3.5) or stream (3.6) encoded in conformance with a specific content format (3.11)
3.14
referent
object which is identified
3.15
identifier
sequence of characters that uniquely denotes a referent (3.14)
3.16
identifier system
system to enable the provision of identifiers (3.15) for a given category of referents (3.14)
3.17
content identifier
identifier (3.15) whose referent (3.14) is content (3.7)
3.18
content-dependent identifier
content identifier (3.17) whose data (3.4) depends on the digital content (3.8) that it identifies
3.19
content recognition system
system whose primary purpose is to recognise digital content (3.8) on a granular level
3.20
algorithm
set of instructions
3.21
hash algorithm
deterministic algorithm (3.20) that produces fixed-length data (3.4) from an input of arbitrary-length data
3.22
hash digest
result of processing data (3.4) with a hash algorithm (3.21)
3.23
cryptographic hash function
computationally efficient function mapping binary strings of arbitrary length to binary strings of fixed
length, such that it is computationally infeasible to find two distinct values that hash into a common value
3.24
similarity hash
hash digest (3.22) that preserves correlations between inputs to the hash algorithm (3.21)
3.25
content defined chunking
CDC
method to split data (3.4) into variable length chunks based on internal features such that chunk boundaries
are more resistant to byte (3.2) shifting
3.26
actor
human or non-human (hardware or software) entity that interacts with a system
3.27
Merkle tree
tree data structure in which every leaf node is labelled with the hash digest (3.22) of a data element and
every non-leaf node is labelled with the hash digest of the labels of its child nodes
3.28
Merkle root
root node of a Merkle tree (3.27)
[SOURCE: ISO 22739:2024, 3.57]

3.29
ISCC processor
application that generates ISCCs for digital content (3.8)
3.30
plain text
data (3.4) with a known text encoding that can be transcoded to Unicode
3.31
whitespace
nondisplaying formatting characters such as spaces, tabs, etc., that are embedded within a block of free text
[SOURCE: ISO/IEC/IEEE 31320-2:2012, 3.1.210]
4 Structure and format of the ISCC
4.1 General structure
a) An ISCC shall be composed of an ISCC-HEADER and an ISCC-BODY (see Figure 1).
b) The ISCC-HEADER shall describe the MainType, SubType, Version, and Length of its ISCC-BODY.
c) An ISCC-UNIT shall be an ISCC based on one specific algorithm.
d) An ISCC-CODE shall be an ISCC composed from two or more different ISCC-UNITs.

Figure 1 — General structure of an ISCC
The concatenation of the ISCC-UNITs is based on the underlying data and is not visible in the string
representation of the ISCC-CODE itself. See 6.6 on how an ISCC-CODE is composed from individual ISCC-UNITs.
4.2 ISCC-HEADER
4.2.1 General
4.2.1.1 The ISCC-HEADER is a variable-sized bitstream composed of an ordered sequence of the 4 header-fields MainType, SubType, Version, Length.

4.2.1.2 Each header-field is a bitstream with a length between 4 and 16 bits and encodes an integer value
between 0 and 4679 (see Table 1) with the following encoding scheme.
a) The total bit-length of a header-field shall be determined by its prefix-bits.
b) The prefix-bits shall be followed by data-bits.
c) The data-bits shall be interpreted as unsigned integer values plus the maximum value of the
preceding range.
d) If the total length of all header fields in number of bits is not divisible by 8, the header shall be padded
with 4 zero bits (0000) on the right side.
Table 1 — Variable length ISCC-HEADER field encoding
Prefix bits   Number of nibbles   Number of data bits   Integer range
0             1                   3                     0-7
10            2                   6                     8-71
110           3                   9                     72-583
1110          4                   12                    584-4679
4.2.1.3 The interpretation of the integer value of a header-field shall be context dependent.
a) For the MainType and SubType fields, it shall be an identifier for the designated type (see 4.2.2 and 4.2.3).
b) For the Version field it shall be the literal version number (see 4.2.4).
c) For the Length field of ISCC-UNITs, it shall be a number used as a multiplier to calculate the bit length of
the ISCC-BODY (see 4.2.5, Table 6).
d) For the Length field of ISCC-CODEs, it shall be a bit-pattern encoding the combination of ISCC-UNITs and
the bit-length of the ISCC-BODY (see 4.2.5, Table 7).
EXAMPLE Header Field Examples
0 = 0000
1 = 0001

7 = 0111
8 = 1000 0000
9 = 1000 0001
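
The header-field scheme of Table 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the reference implementation of Annex D: the names `FIELD_RANGES`, `encode_field`, `decode_field` and `encode_header` are hypothetical. The offset added to the data-bits is assumed to be the lower bound of each range in Table 1, which is what the example values above imply (8 is encoded as 1000 0000).

```python
from typing import Tuple

# Ranges from Table 1: (prefix bits, number of data bits, lowest value, highest value).
FIELD_RANGES = [
    ("0", 3, 0, 7),
    ("10", 6, 8, 71),
    ("110", 9, 72, 583),
    ("1110", 12, 584, 4679),
]


def encode_field(value: int) -> str:
    """Encode one header-field value as a string of '0' and '1' characters."""
    for prefix, data_bits, low, high in FIELD_RANGES:
        if low <= value <= high:
            # Data-bits hold the distance from the lower bound of the range.
            return prefix + format(value - low, "0{}b".format(data_bits))
    raise ValueError("header-field value out of range: {}".format(value))


def decode_field(bits: str) -> Tuple[int, str]:
    """Decode the first header-field in `bits`; return (value, remaining bits)."""
    for prefix, data_bits, low, _high in FIELD_RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + data_bits
            if len(bits) < end:
                raise ValueError("truncated header-field")
            return int(bits[len(prefix):end], 2) + low, bits[end:]
    raise ValueError("invalid header-field prefix")


def encode_header(main_type: int, sub_type: int, version: int, length: int) -> bytes:
    """Concatenate the four header-fields and pad to a whole number of bytes."""
    bits = "".join(encode_field(v) for v in (main_type, sub_type, version, length))
    if len(bits) % 8:
        # Every field is a whole number of nibbles, so 4 zero bits always suffice.
        bits += "0000"
    return int(bits, 2).to_bytes(len(bits) // 8, "big")
```

For instance, `encode_header(5, 1, 0, 5)` yields the two bytes `51 05`, which corresponds to MainType ISCC, SubType IMAGE, Version 0 and Length value 5.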

4.2.2 MainTypes
The MainType header-field shall signify the type of an ISCC (see Table 2).
Backward incompatible updates to an algorithm associated with a MainType shall be indicated by
incrementing the version field of the ISCC-HEADER of the respective MainType.
NOTE This document specifies initial algorithms (version 0) for all reserved MainTypes with the exception of the
SEMANTIC type which is not currently defined.

Table 2 — Reserved MainTypes
ID Symbol Bits Definition
0 META 0000 An ISCC-UNIT that matches on metadata similarity
1 SEMANTIC 0001 An ISCC-UNIT that matches on semantic content similarity
2 CONTENT 0010 An ISCC-UNIT that matches on perceptual content similarity
3 DATA 0011 An ISCC-UNIT that matches on data similarity
4 INSTANCE 0100 An ISCC-UNIT that matches on data identity
5 ISCC 0101 An ISCC-CODE composed of two or more headerless ISCC-UNITs for multi-modal matching
4.2.3 SubTypes
The MainTypes META, DATA, and INSTANCE shall have a single default SubType NONE encoded with the
bits 0000.
The MainTypes SEMANTIC, CONTENT, and ISCC shall have SubTypes that signify the perceptual mode (see
Table 3 and Table 4).
Table 3 — Reserved SubTypes for MainTypes ISCC, SEMANTIC and CONTENT
ID Symbol Bits Definition
0 TEXT 0000 Match on text similarity
1 IMAGE 0001 Match on image similarity
2 AUDIO 0010 Match on audio similarity
3 VIDEO 0011 Match on video similarity
4 MIXED 0100 Match on multi-modal similarity
Table 4 — Additional Reserved SubTypes for the MainType ISCC
ID Symbol Bits Definition
5 SUM 0101 Composite of ISCC-UNITs including only Data- and Instance-Code
6 NONE 0110 Composite of ISCC-UNITs including Meta-, Data- and Instance-Code
4.2.4 Version
All ISCC-HEADERs shall have a version header-field of 0000 for the first edition of this document (see
Table 5).
Table 5 — Reserved ISCC Versions
ID Symbol Bits Definition
0 V0 0000 Initial version of ISCC-UNITs and ISCC-CODE
4.2.5 Length
4.2.5.1 General
The encoding of the Length header-field shall be specific to the MainType.
4.2.5.2 Length of ISCC-UNITs
For ISCC-UNITs of the MainTypes META, SEMANTIC, CONTENT, DATA and INSTANCE the length value shall
be encoded as the number of 32-bit blocks of the ISCC-BODY in addition to the minimum length of 32 bits
(see Table 6).
Table 6 — Reserved length field values (multiples of 32 bit)
ID Symbol Bits Definition
0 L32 0000 Length of body is 32 bits (minimum length)
1 L64 0001 Length of body is 64 bits (default length)
2 L96 0010 Length of body is 96 bits
3 L128 0011 Length of body is 128 bits
4 L160 0100 Length of body is 160 bits
5 L192 0101 Length of body is 192 bits
6 L224 0110 Length of body is 224 bits
7 L256 0111 Length of body is 256 bits
4.2.5.3 Length of ISCC-CODEs
a) For ISCC-CODEs, the length value shall designate the composition of ISCC-UNITs (see Table 7).
b) The Data-Code and Instance-Code shall be mandatory 64-bit components of an ISCC-CODE.
c) The first data-bit shall designate the presence of a 64-bit Meta-Code.
d) The second data-bit shall designate the presence of a 64-bit Semantic-Code.
e) The third data-bit shall designate the presence of a 64-bit Content-Code.
f) The length of an ISCC-CODE shall be calculated as the number of active data-bits times 64 plus 128 bits
of mandatory data.
Table 7 — Reserved length field values (for MainType ISCC)
ID Symbol Bits Definition
0 SUM 0000 No optional ISCC-UNITs. Length of body is 128 bits.
1 CDI 0001 Includes Content-Code. Length of body is 192 bits
2 SDI 0010 Includes Semantic-Code. Length of body is 192 bits
3 SCDI 0011 Includes Semantic- and Content-Code. Length of body is 256 bits
4 MDI 0100 Includes Meta-Code. Length of body is 192 bits
5 MCDI 0101 Includes Meta-Code and Content-Code. Length of body is 256 bits
6 MSDI 0110 Includes Meta-Code and Semantic-Code. Length of body is 256 bits
7 MSCDI 0111 Includes Meta-, Semantic-, and Content-Code. Length is 320 bits
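
The two interpretations of the Length field (Table 6 for ISCC-UNITs, Table 7 for ISCC-CODEs) can be illustrated with a short Python sketch. The function names `unit_body_bits` and `code_composition` are hypothetical, and the sketch only covers the reserved values 0 to 7 listed in the tables.

```python
from typing import Tuple


def unit_body_bits(length_value: int) -> int:
    """Body length in bits of an ISCC-UNIT (Table 6)."""
    if not 0 <= length_value <= 7:
        raise ValueError("only the reserved values 0..7 are handled here")
    # The value counts 32-bit blocks in addition to the 32-bit minimum.
    return (length_value + 1) * 32


def code_composition(length_value: int) -> Tuple[str, int]:
    """Return (symbol, body length in bits) of an ISCC-CODE (Table 7)."""
    if not 0 <= length_value <= 7:
        raise ValueError("only the reserved values 0..7 are handled here")
    flags = (
        ("M", length_value & 0b100),  # first data-bit: Meta-Code present
        ("S", length_value & 0b010),  # second data-bit: Semantic-Code present
        ("C", length_value & 0b001),  # third data-bit: Content-Code present
    )
    optional = [letter for letter, present in flags if present]
    # Data-Code and Instance-Code are always present (the trailing "DI").
    symbol = "".join(optional) + "DI" if optional else "SUM"
    body_bits = len(optional) * 64 + 128
    return symbol, body_bits
```

With these definitions, `code_composition(5)` returns `("MCDI", 256)`: one Meta-Code and one Content-Code of 64 bits each on top of the 128 mandatory bits.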
4.3 ISCC-BODY
a) The preceding MainType, SubType, and Version fields shall qualify the semantics of an ISCC-BODY.
b) The Length field shall determine the number of bits of an ISCC-BODY.
4.4 Encoding
4.4.1 Canonical form
The printable canonical form of an ISCC shall be its RFC 4648 Base32 encoded representation without
padding and prefixed with “ISCC:”. Base32 defines an upper case standard alphabet.
EXAMPLE ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY
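
A minimal Python sketch of the canonical form, using only the standard library `base64` module. The helper names `encode_canonical` and `decode_canonical` are hypothetical and not taken from this document; Python's decoder expects padding, so the sketch restores it before decoding.

```python
import base64

PREFIX = "ISCC:"


def encode_canonical(raw: bytes) -> str:
    """Base32-encode header and body bytes, drop padding, add the prefix."""
    return PREFIX + base64.b32encode(raw).decode("ascii").rstrip("=")


def decode_canonical(iscc: str) -> bytes:
    """Return the raw header and body bytes of a canonical ISCC string."""
    if not iscc.startswith(PREFIX):
        raise ValueError("canonical form must start with 'ISCC:'")
    code = iscc[len(PREFIX):]
    # b32decode requires the input length to be a multiple of 8 characters.
    padding = "=" * (-len(code) % 8)
    return base64.b32decode(code + padding)


example = "ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY"
raw = decode_canonical(example)
print(raw.hex())
```

Decoding the example above gives 34 bytes: a 2-byte ISCC-HEADER (`5105`) followed by a 256-bit ISCC-BODY.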

4.4.2 URI encoding
An ISCC may be encoded using the syntax of a Uniform Resource Identifier (URI) as defined in RFC 3986.
a) The URI representation shall have the format <scheme>:<path>.
b) The URI scheme shall be the string “iscc”.
c) The URI path shall be the lower-cased base32 representation of an ISCC without padding.
EXAMPLE iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
NOTE Because Base32 defines an upper case standard alphabet, the canonical form differs from the URI form,
which is represented in lower case.
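
Because the two forms differ only in the prefix and in letter case, converting between them is a string operation. The following Python sketch shows one way to do it; the function names are hypothetical, and accepting an upper-case scheme on input is an assumption based on URI schemes being case-insensitive in RFC 3986.

```python
def canonical_to_uri(iscc: str) -> str:
    """Convert 'ISCC:<BASE32>' into the lower-case URI form 'iscc:<base32>'."""
    prefix, _, code = iscc.partition(":")
    if prefix != "ISCC" or not code:
        raise ValueError("not a canonical ISCC string")
    return "iscc:" + code.lower()


def uri_to_canonical(uri: str) -> str:
    """Convert the URI form back into the canonical upper-case form."""
    scheme, _, path = uri.partition(":")
    if scheme.lower() != "iscc" or not path:
        raise ValueError("not an ISCC URI")
    return "ISCC:" + path.upper()
```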
4.4.3 Multiformats encoding
The ISCC may be encoded as a multibase string [13] (see Table 8).
a) The multicodec identifier of an ISCC shall be “0xcc01” (see Table 9).
b) A Multiformat representation of an ISCC shall be prefixed with a multibase code.
c) The encoding scheme shall be <multibase><multicodec><iscc-header><iscc-body>.
ISCC shall support the multibase encodings given in Tables 8 and 9.
Table 8 — Supported multibase encodings
Encoding Code Definition
base16 f hexadecimal
base32 b RFC4648 case-insensitive - no padding
base32hex v RFC4648 case-insensitive - no padding - highest char
base58btc z base58 bitcoin
base64url u RFC4648 no padding
Table 9 — Examples of ISCCs in multiformats encoding
Encoding Example
MF base16 fcc015105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
MF base32 bzqavcbontuvx2jd2qmz7pmfx2lg2qblmhuk655zyyglc5ekimjh6vqobj4
MF base32hex vpg0l21edjklnq93qgcpvfc5nqb6qg1bc7kauttpoo6b2t4a8c97ulge19s
MF base58btc z2Yr3BMx3Rj56fyYkNvfa19PCk4SjspQhpVWoLSGg9yXr4vUGsx
MF base64url uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP
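
The examples in Table 9 are consistent with a byte layout of multibase code, multicodec identifier `cc01`, ISCC-HEADER and ISCC-BODY. The Python sketch below reproduces four of them with the standard library. The function name `encode_multiformat` is hypothetical, and base58btc is left out because the Python standard library has no base58 encoder.

```python
import base64

MULTICODEC_ISCC = bytes.fromhex("cc01")
B32_STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
B32_HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"


def encode_multiformat(raw: bytes, encoding: str = "base32") -> str:
    """Prefix header and body bytes with the multicodec id and a multibase code."""
    data = MULTICODEC_ISCC + raw
    if encoding == "base16":
        return "f" + data.hex()
    if encoding == "base64url":
        return "u" + base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    if encoding in ("base32", "base32hex"):
        text = base64.b32encode(data).decode("ascii").rstrip("=")
        if encoding == "base32hex":
            # Same 5-bit values, written with the "extended hex" alphabet.
            text = text.translate(str.maketrans(B32_STD, B32_HEX))
            return "v" + text.lower()
        return "b" + text.lower()
    raise ValueError("unsupported encoding in this sketch: " + encoding)


# Header and body bytes of the ISCC used throughout 4.4.
raw = bytes.fromhex(
    "5105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f"
)
for name in ("base16", "base32", "base32hex", "base64url"):
    print(name, encode_multiformat(raw, name))
```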
4.4.4 Readable encoding
The ISCC may be encoded in human readable representation.
a) The readable representation shall encode the header fields with their symbols and the ISCC-BODY in
base16 lower-case.
b) The header fields and the ISCC-BODY shall be separated with hyphens.
EXAMPLE
ISCC-IMAGE-V0-MCDI-cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
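
The readable encoding can be derived from the raw bytes by decoding the four header-fields and looking up their symbols in Tables 2 to 7. The following Python sketch shows one possible approach; all names (`read_field`, `to_readable` and the lookup lists) are hypothetical, and only the reserved values from the tables are handled.

```python
MAINTYPES = ["META", "SEMANTIC", "CONTENT", "DATA", "INSTANCE", "ISCC"]
SUBTYPES = ["TEXT", "IMAGE", "AUDIO", "VIDEO", "MIXED"]
SUBTYPES_ISCC = SUBTYPES + ["SUM", "NONE"]
CODE_LENGTHS = ["SUM", "CDI", "SDI", "SCDI", "MDI", "MCDI", "MSDI", "MSCDI"]
# Table 1: (prefix bits, number of data bits, lowest value of the range).
FIELD_RANGES = [("0", 3, 0), ("10", 6, 8), ("110", 9, 72), ("1110", 12, 584)]


def read_field(bits: str, pos: int):
    """Decode the header-field starting at `pos`; return (value, next position)."""
    for prefix, data_bits, low in FIELD_RANGES:
        if bits.startswith(prefix, pos):
            start = pos + len(prefix)
            end = start + data_bits
            return int(bits[start:end], 2) + low, end
    raise ValueError("invalid header-field prefix")


def to_readable(raw: bytes) -> str:
    """Render header and body bytes in the hyphen-separated readable form."""
    bits = "".join(format(byte, "08b") for byte in raw)
    pos = 0
    fields = []
    for _ in range(4):
        value, pos = read_field(bits, pos)
        fields.append(value)
    if pos % 8:
        pos += 4  # skip the 4 zero padding bits after the header
    main, sub, version, length = fields
    body = raw[pos // 8:]
    main_symbol = MAINTYPES[main]
    if main_symbol == "ISCC":
        sub_symbol = SUBTYPES_ISCC[sub]
        length_symbol = CODE_LENGTHS[length]
    else:
        if main_symbol in ("SEMANTIC", "CONTENT"):
            sub_symbol = SUBTYPES[sub]
        elif sub == 0:
            sub_symbol = "NONE"
        else:
            raise ValueError("unexpected SubType for " + main_symbol)
        length_symbol = "L{}".format((length + 1) * 32)
    return "-".join(
        [main_symbol, sub_symbol, "V{}".format(version), length_symbol, body.hex()]
    )
```

Applied to the bytes of the ISCC from 4.4.1, `to_readable` returns the string shown in the example above.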

5 ISCC-UNITs
5.1 Meta-Code
5.1.1 General
The Meta-Code is a similarity hash generated from referent seed metadata in accordance with Annex B.
5.1.2 Purpose
The Meta-Code shall support the following use cases:
a) clustering of digital assets based on their metadata;
b) discovery of digital assets with similar metadata;
c) verification or manual disambiguation of matching codes.
5.1.3 Format
The Meta-Code shall have the data format as illustrated in Figure 2:
Figure 2 — Data format of the Meta-Code
EXAMPLE 1 64-bit Meta-Code in its canonical form:
ISCC:AAAUL6P7RMVNT4UJ
EXAMPLE 2 256-bit Meta-Code in its canonical form:
ISCC:AADUL6P7RMVNT4UJJ4SMTDXBL5JFZ5XPCDKO42XYPJEVQ4L7PTYDORQ
5.1.4 Inputs
5.1.4.1 General
Seed metadata is the metadata that is used as the input to calculate the Meta-Code and has three possible
elements:
a) name (required): the name or title of the work manifested by the digital asset;
b) description (optional): a disambiguating textual description of the digital asset;
c) meta (optional): subject, industry, or use-case specific metadata.
Seed metadata shall be stored and carried along unaltered with ISCC Metadata if automated verification of
the Meta-Code based on the original seed metadata is required.
NOTE 1 Because seed metadata is used to construct the Meta-Code, changes to its value can produce different (and
therefore no longer matching) Meta-Codes.
NOTE 2 The identifier standards and their schemas, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC, provide helpful
guidance in selecting seed metadata.

5.1.4.2 name element
The text input for the name element shall be pre-processed before similarity hashing as follows.
a) Apply ISO/IEC 10646 NFKC Unicode Normalization (see Unicode Normalization Forms https:/
...


International
Standard
ISO 24138
First edition
Information and documentation —
2024-05
International Standard Content
Code (ISCC)
Information et documentation — Code international normalisé
de contenu (ISCC)
Reference number
© ISO 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii
Contents Page
Foreword .v
Introduction .vi
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
4 Structure and format of the ISCC . 4
4.1 General structure .4
4.2 ISCC-HEADER . .5
4.2.1 General .5
4.2.2 MainTypes .6
4.2.3 SubTypes .7
4.2.4 Version .7
4.2.5 Length .7
4.3 ISCC-BODY.8
4.4 Encoding .8
4.4.1 Canonical form .8
4.4.2 URI encoding .9
4.4.3 Multiformats encoding.9
4.4.4 Readable encoding .9
5 ISCC-UNITs .10
5.1 Meta-Code .10
5.1.1 General .10
5.1.2 Purpose .10
5.1.3 Format.10
5.1.4 Inputs .10
5.1.5 Outputs . 12
5.1.6 Seed metadata processing . 12
5.1.7 Metadata embedding . 13
5.1.8 Metadata extraction . 13
5.2 Content-Codes .14
5.2.1 General .14
5.2.2 Purpose .14
5.3 Content-Code Subtype Text .14
5.3.1 General .14
5.3.2 Format.14
5.3.3 Inputs . 15
5.3.4 Outputs . 15
5.3.5 Processing . 15
5.3.6 Conformance . 15
5.4 Content-Code Subtype Image .16
5.4.1 General .16
5.4.2 Format.16
5.4.3 Inputs .16
5.4.4 Outputs .16
5.4.5 Processing .16
5.4.6 Conformance .17
5.5 Content-Code Subtype Audio .17
5.5.1 General .17
5.5.2 Format.17
5.5.3 Inputs .17
5.5.4 Outputs .18
5.5.5 Processing .18
5.5.6 Conformance .18

iii
5.6 Content-Code Subtype Video.18
5.6.1 General .18
5.6.2 Format.18
5.6.3 Inputs .19
5.6.4 Outputs .19
5.6.5 Processing .19
5.6.6 Conformance .19
5.7 Content-Code Subtype Mixed .19
5.7.1 General .19
5.7.2 Format. 20
5.7.3 Inputs . 20
5.7.4 Outputs . 20
5.7.5 Processing . 20
5.7.6 Conformance . 20
5.8 Data-Code .21
5.8.1 General .21
5.8.2 Format.21
5.8.3 Inputs .21
5.8.4 Outputs .21
5.8.5 Processing .21
5.8.6 Conformance . 22
5.9 Instance-Code . 22
5.9.1 General . 22
5.9.2 Format. 23
5.9.3 Inputs . 23
5.9.4 Outputs . 23
5.9.5 Processing . 23
5.9.6 Conformance . 23
6 ISCC-CODE .24
6.1 General .24
6.2 Purpose .24
6.3 Format .24
6.3.1 General .24
6.3.2 SubTypes for ISCC-CODEs .24
6.3.3 Length and composition of ISCC-CODEs .24
6.4 Inputs . 25
6.5 Outputs . 25
6.6 Processing . 25
6.7 Comparing ISCC-CODEs . 25
6.8 Conformance . 26
Annex A (normative) Relationship between ISCC and other identifier systems .27
Annex B (normative) ISCC metadata .29
Annex C (informative) Evolution of this document .31
Annex D (normative) Reference implementation .32
Bibliography .33

iv
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 46, Information and documentation,
Subcommittee SC 9, Identification and description.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

v
Introduction
While ISO/TC 46/SC 9 has established a variety of specific identifier standards, a content-dependent
identifier for digital assets in all content formats has not yet been agreed.
Digital content is dynamic: it is always in motion and is acted upon globally by a variety of entities with
different interests and requirements. It is continuously re-encoded, resized and re-compressed, so its data
changes as it travels through a complex network of actors and systems.
The International Standard Content Code (ISCC) is an identifier for numerous types of digital assets. An
ISCC-CODE is generated from the digital content itself. It is the result of processing the digital content
using a variety of algorithms including hash algorithms. The generated ISCC-CODE supports data integrity
verification and preserves an estimate of the data, digital content and metadata similarity. However, ISCC
has different functionality from content recognition systems.
The ISCC supports the association of higher-level identifiers (like work and product identifiers) with the
digitally encoded manifestations of content. The ISCC does not specify a system for managing authoritative
metadata. Other content identifier standards can use ISCC to support discoverability of their identifiers and
metadata based on digital content.
Organizations, individuals and machines may generate ISCCs for numerous kinds of digital assets and use
them for identification and management of those assets.
ISCCs are neither manually nor automatically assigned to digital media assets. Instead, ISCCs are derived
from media assets according to the procedures described in this document. Unrelated parties can
independently derive the same ISCC from a given media asset.
ISCCs exclusively reference media assets without any implication about ownership. As such, ISCCs are not
managed authoritatively by any institution or entity.
The ISCC enables interoperability between different actors and systems using digital assets and supports
scenarios that require content deduplication, database synchronization and indexing, integrity verification,
timestamping, versioning, data provenance, similarity clustering, anomaly detection, usage tracking,
allocation of royalties, fact-checking and general digital asset management use-cases.
This document includes sections targeting a general audience but also descriptions of more technical
procedures.
Future editions of this document can be developed as outlined in Annex C.

International Standard ISO 24138:2024(en)
Information and documentation — International Standard
Content Code (ISCC)
1 Scope
This document specifies the syntax and structure of the International Standard Content Code (ISCC), as an
identification system for digital assets (including encodings of text, images, audio, video or other content
across all media sectors). It also describes ISCC metadata and the use of ISCC in conjunction with other
schemes, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC.
An ISCC applies to a specific digital asset and is a data-descriptor deterministically constructed from
multiple hash digests using the algorithms and rules in this document. This document does not provide
information on registration of ISCCs.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)
ISO/IEC 15938, Information technology — Multimedia content description interface
ISO/IEC 21778, Information technology — The JSON data interchange syntax
IETF RFC 4648, The Base16, Base32, and Base64 Data Encodings 1)
IETF RFC 2397, The "data" URL scheme 2)
IETF RFC 8785, JSON Canonicalization Scheme (JCS) 3)
W3C, C14N 1.1, Canonical XML Version 1.1 4)
W3C, JSON-LD 1.1, A JSON-based Serialization for Linked Data 5)
1) Available online: https://datatracker.ietf.org/doc/html/rfc4648
2) Available online: https://datatracker.ietf.org/doc/html/rfc2397
3) Available online: https://datatracker.ietf.org/doc/html/rfc8785
4) Available online: https://www.w3.org/TR/xml-c14n11
5) Available online: https://www.w3.org/TR/json-ld/
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/

3.1
bit
atomic unit of information in a computer system
3.2
byte
sequence of 8 bits (3.1)
3.3
nibble
half a byte (3.2), which can be represented by a single hexadecimal digit
[SOURCE: ISO 20038:2017, 3.12]
3.4
data
ordered sequence of bits (3.1)
3.5
file
stored data (3.4) with a known number of bits (3.1) and a filename
3.6
stream
data (3.4) in transit with a known or unknown number of bits (3.1)
3.7
content
information organized to provide value to a user
3.8
digital content
manifestation of content (3.7) in form of data (3.4) structured according to a set of rules
3.9
metadata
data (3.4) that defines and describes other data
[SOURCE: ISO 24531:2013, 4.32]
3.10
seed metadata
initial metadata (3.9) used as input to a hash algorithm (3.21)
3.11
content format
set of rules used to structure digital content (3.8)
3.12
media type
two-part identifier (3.15) specifying the nature of the referenced data (3.4)
[SOURCE: ISO/IEC 19757-4:2006, 3.9]
3.13
digital asset
file (3.5) or stream (3.6) encoded in conformance with a specific content format (3.11)
3.14
referent
object which is identified
3.15
identifier
sequence of characters that uniquely denotes a referent (3.14)
3.16
identifier system
system to enable the provision of identifiers (3.15) for a given category of referents (3.14)
3.17
content identifier
identifier (3.15) whose referent (3.14) is content (3.7)
3.18
content-dependent identifier
content identifier (3.17) whose data (3.4) depends on the digital content (3.8) that it identifies
3.19
content recognition system
system whose primary purpose is to recognise digital content (3.8) on a granular level
3.20
algorithm
set of instructions
3.21
hash algorithm
deterministic algorithm (3.20) that produces fixed-length data (3.4) from an input of arbitrary-length data
3.22
hash digest
result of processing data (3.4) with a hash algorithm (3.21)
3.23
cryptographic hash function
computationally efficient function mapping binary strings of arbitrary length to binary strings of fixed
length, such that it is computationally infeasible to find two distinct values that hash into a common value
3.24
similarity hash
hash digest (3.22) that preserves correlations between inputs to the hash algorithm (3.21)
3.25
content defined chunking
CDC
method to split data (3.4) into variable length chunks based on internal features such that chunk boundaries
are more resistant to byte (3.2) shifting
3.26
actor
human or non-human (hardware or software) entity that interacts with a system
3.27
Merkle tree
tree data structure in which every leaf node is labelled with the hash digest (3.22) of a data element and
every non-leaf node is labelled with the hash digest of the labels of its child nodes
3.28
Merkle root
root node of a Merkle tree (3.27)
[SOURCE: ISO 22739:2024, 3.57]

3.29
ISCC processor
application that generates ISCCs for digital content (3.8)
3.30
plain text
data (3.4) with a known text encoding that can be transcoded to Unicode
3.31
whitespace
nondisplaying formatting characters such as spaces, tabs, etc., that are embedded within a block of free text
[SOURCE: ISO/IEC/IEEE 31320-2:2012, 3.1.210]
4 Structure and format of the ISCC
4.1 General structure
a) An ISCC shall be composed of an ISCC-HEADER and an ISCC-BODY (see Figure 1).
b) The ISCC-HEADER shall describe the MainType, SubType, Version, and Length of its ISCC-BODY.
c) An ISCC-UNIT shall be an ISCC based on one specific algorithm.
d) An ISCC-CODE shall be an ISCC composed from two or more different ISCC-UNITs.

Figure 1 — General structure of an ISCC
The concatenation of the ISCC-UNITs is based on the underlying data and is not visible in the string
representation of the ISCC-CODE itself. See 6.6 on how an ISCC-CODE is composed from individual ISCC-UNITs.
4.2 ISCC-HEADER
4.2.1 General
4.2.1.1 The ISCC-HEADER is a variable sized bitstream composed of an ordered sequence of the 4
header-fields MainType, SubType, Version, Length.

4.2.1.2 Each header-field is a bitstream with a length between 4 and 16 bits and encodes an integer value
between 0 and 4679 (see Table 1) with the following encoding scheme.
a) The total bit-length of a header-field shall be determined by its prefix-bits.
b) The prefix-bits shall be followed by data-bits.
c) The data-bits shall be interpreted as an unsigned integer value to which the lowest value of the
integer range of the field is added, that is, the maximum value of the preceding range plus 1 (see Table 1).
d) If the total length of all header fields in number of bits is not divisible by 8, the header shall be padded
with 4 zero bits (0000) on the right side.
Table 1 — Variable length ISCC-HEADER field encoding
Prefix bits   Number of nibbles   Number of data bits   Integer range
0             1                   3                     0-7
10            2                   6                     8-71
110           3                   9                     72-583
1110          4                   12                    584-4679
4.2.1.3 The interpretation of the integer value of a header-field shall be context dependent.
a) For the MainType and SubType fields, it shall be an identifier for the designated type (see 4.2.2 and 4.2.3).
b) For the Version field it shall be the literal version number (see 4.2.4).
c) For the Length field of ISCC-UNITs, it shall be a number used as a multiplier to calculate the bit length of
the ISCC-BODY (see 4.2.5, Table 6).
d) For the Length field of ISCC-CODEs, it shall be a bit-pattern encoding the combination of ISCC-UNITs and
the bit-length of the ISCC-BODY (see 4.2.5, Table 7).
EXAMPLE Header Field Examples
0 = 0000
1 = 0001

7 = 0111
8 = 1000 0000
9 = 1000 0001
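The field encoding of Table 1 and the padding rule in 4.2.1.2 d) can be sketched as follows. This is a minimal illustration, not a reference implementation; the names FIELD_RANGES, encode_field, decode_field and encode_header are hypothetical and are not defined by this document.

```python
# Illustrative sketch only: all names below are hypothetical and are not
# defined by this document.

# One row per line of Table 1: (prefix bits, number of data bits, lowest value).
FIELD_RANGES = [("0", 3, 0), ("10", 6, 8), ("110", 9, 72), ("1110", 12, 584)]


def encode_field(value: int) -> str:
    """Return the bit string for one header-field value (0 to 4679)."""
    for prefix, data_bits, lowest in FIELD_RANGES:
        if lowest <= value < lowest + 2**data_bits:
            return prefix + format(value - lowest, f"0{data_bits}b")
    raise ValueError("header-field value must be between 0 and 4679")


def decode_field(bits: str) -> tuple[int, str]:
    """Decode the leading header-field; return (value, remaining bits)."""
    for prefix, data_bits, lowest in FIELD_RANGES:
        if bits.startswith(prefix):
            end = len(prefix) + data_bits
            if len(bits) < end:
                raise ValueError("bit string is too short for this field")
            return lowest + int(bits[len(prefix):end], 2), bits[end:]
    raise ValueError("unknown header-field prefix")


def encode_header(main_type: int, sub_type: int, version: int, length: int) -> bytes:
    """Concatenate the four fields and pad with 0000 to a whole number of bytes."""
    bits = "".join(encode_field(v) for v in (main_type, sub_type, version, length))
    if len(bits) % 8:
        bits += "0000"
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


print(encode_field(8))                  # 10000000
print(encode_header(5, 1, 0, 5).hex())  # 5105
```

Because every field occupies a whole number of nibbles, a header that is not byte-aligned is always exactly one nibble short, which is why a single 0000 pad is sufficient.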

4.2.2 MainTypes
The MainType header-field shall signify the type of an ISCC (see Table 2).
Backward incompatible updates to an algorithm associated with a MainType shall be indicated by
incrementing the version field of the ISCC-HEADER of the respective MainType.
NOTE This document specifies initial algorithms (version 0) for all reserved MainTypes with the exception of the
SEMANTIC type which is not currently defined.

Table 2 — Reserved MainTypes
ID   Symbol     Bits   Definition
0    META       0000   An ISCC-UNIT that matches on metadata similarity
1    SEMANTIC   0001   An ISCC-UNIT that matches on semantic content similarity
2    CONTENT    0010   An ISCC-UNIT that matches on perceptual content similarity
3    DATA       0011   An ISCC-UNIT that matches on data similarity
4    INSTANCE   0100   An ISCC-UNIT that matches on data identity
5    ISCC       0101   An ISCC-CODE composed of two or more headerless ISCC-UNITs for multi-modal matching
4.2.3 SubTypes
The MainTypes META, DATA, and INSTANCE shall have a single default SubType NONE encoded with the
bits 0000.
The MainTypes SEMANTIC, CONTENT, and ISCC shall have SubTypes that signify the perceptual mode (see
Table 3 and Table 4).
Table 3 — Reserved SubTypes for MainTypes ISCC, SEMANTIC and CONTENT
ID   Symbol   Bits   Definition
0    TEXT     0000   Match on text similarity
1    IMAGE    0001   Match on image similarity
2    AUDIO    0010   Match on audio similarity
3    VIDEO    0011   Match on video similarity
4    MIXED    0100   Match on multi-modal similarity
Table 4 — Additional Reserved SubTypes for the MainType ISCC
ID   Symbol   Bits   Definition
5    SUM      0101   Composite of ISCC-UNITs including only Data- and Instance-Code
6    NONE     0110   Composite of ISCC-UNITs including Meta-, Data- and Instance-Code
4.2.4 Version
All ISCC-HEADERs shall have a version header-field of 0000 for the first edition of this document (see
Table 5).
Table 5 — Reserved ISCC Versions
ID   Symbol   Bits   Definition
0    V0       0000   Initial version of ISCC-UNITs and ISCC-CODE
4.2.5 Length
4.2.5.1 General
The encoding of the Length header-field shall be specific to the MainType.
4.2.5.2 Length of ISCC-UNITs
For ISCC-UNITs of the MainTypes META, SEMANTIC, CONTENT, DATA and INSTANCE the length value shall
be encoded as the number of 32-bit blocks of the ISCC-BODY in addition to the minimum length of 32 bits
(see Table 6).
Table 6 — Reserved length field values (multiples of 32 bit)
ID   Symbol   Bits   Definition
0    L32      0000   Length of body is 32 bits (minimum length)
1    L64      0001   Length of body is 64 bits (default length)
2    L96      0010   Length of body is 96 bits
3    L128     0011   Length of body is 128 bits
4    L160     0100   Length of body is 160 bits
5    L192     0101   Length of body is 192 bits
6    L224     0110   Length of body is 224 bits
7    L256     0111   Length of body is 256 bits
4.2.5.3 Length of ISCC-CODEs
a) For ISCC-CODEs, the length value shall designate the composition of ISCC-UNITs (see Table 7).
b) The Data-Code and Instance-Code shall be mandatory 64-bit components of an ISCC-CODE.
c) The first data-bit shall designate the presence of a 64-bit Meta-Code.
d) The second data-bit shall designate the presence of a 64-bit Semantic-Code.
e) The third data-bit shall designate the presence of a 64-bit Content-Code.
f) The length of an ISCC-CODE shall be calculated as the number of active data-bits times 64 plus 128 bits
of mandatory data.
Table 7 — Reserved length field values (for MainType ISCC)
ID   Symbol   Bits   Definition
0    SUM      0000   No optional ISCC-UNITs. Length of body is 128 bits.
1    CDI      0001   Includes Content-Code. Length of body is 192 bits.
2    SDI      0010   Includes Semantic-Code. Length of body is 192 bits.
3    SCDI     0011   Includes Semantic- and Content-Code. Length of body is 256 bits.
4    MDI      0100   Includes Meta-Code. Length of body is 192 bits.
5    MCDI     0101   Includes Meta-Code and Content-Code. Length of body is 256 bits.
6    MSDI     0110   Includes Meta-Code and Semantic-Code. Length of body is 256 bits.
7    MSCDI    0111   Includes Meta-, Semantic- and Content-Code. Length of body is 320 bits.
4.3 ISCC-BODY
a) The preceding MainType, SubType, and Version fields shall qualify the semantics of an ISCC-BODY.
b) The Length field shall determine the number of bits of an ISCC-BODY.
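As an illustration of how the Length field determines the size of the ISCC-BODY, the rules of 4.2.5.2 and 4.2.5.3 can be sketched as below. The function names body_bits and optional_units are hypothetical, and the sketch assumes that only the reserved Length values 0 to 7 occur.

```python
# Illustrative sketch only: names are hypothetical, and only the reserved
# Length values 0 to 7 (Tables 6 and 7) are handled.

MAINTYPE_ISCC = 5  # MainType ID of an ISCC-CODE (Table 2)


def body_bits(main_type: int, length: int) -> int:
    """Return the ISCC-BODY length in bits for a decoded header."""
    if not 0 <= length <= 7:
        raise ValueError("only the reserved length values 0 to 7 are supported")
    if main_type == MAINTYPE_ISCC:
        # Each active data-bit adds one optional 64-bit ISCC-UNIT to the
        # mandatory 128 bits of Data-Code and Instance-Code.
        return bin(length).count("1") * 64 + 128
    # ISCC-UNITs: number of 32-bit blocks in addition to the 32-bit minimum.
    return (length + 1) * 32


def optional_units(length: int) -> list[str]:
    """List the optional ISCC-UNITs signalled by an ISCC-CODE Length value."""
    flags = [(0b100, "Meta-Code"), (0b010, "Semantic-Code"), (0b001, "Content-Code")]
    return [name for mask, name in flags if length & mask]


print(body_bits(MAINTYPE_ISCC, 0b101), optional_units(0b101))
# 256 ['Meta-Code', 'Content-Code']
```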
4.4 Encoding
4.4.1 Canonical form
The printable canonical form of an ISCC shall be its RFC 4648 Base32 encoded representation without
padding and prefixed with “ISCC:”. Base32 defines an upper case standard alphabet.
EXAMPLE ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY
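A minimal sketch of converting between the raw bytes of an ISCC (ISCC-HEADER followed by ISCC-BODY) and its canonical form is given below. It relies on the base64 module of the Python standard library; the function names encode_canonical and decode_canonical are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical.
import base64


def encode_canonical(raw: bytes) -> str:
    """Encode ISCC-HEADER + ISCC-BODY bytes as unpadded Base32 with the ISCC: prefix."""
    return "ISCC:" + base64.b32encode(raw).decode("ascii").rstrip("=")


def decode_canonical(iscc: str) -> bytes:
    """Return the raw bytes of an ISCC given in canonical form."""
    if not iscc.startswith("ISCC:"):
        raise ValueError("canonical form must start with 'ISCC:'")
    code = iscc[len("ISCC:"):]
    # The standard library decoder expects padding to a multiple of 8 characters.
    return base64.b32decode(code + "=" * (-len(code) % 8))


raw = decode_canonical("ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY")
print(raw[:2].hex())  # 5105 (the ISCC-HEADER of the example above)
```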

4.4.2 URI encoding
An ISCC may be encoded using the syntax of a Uniform Resource Identifier (URI) as defined in RFC 3986.
a) The URI representation shall have the format <scheme>:<path>.
b) The URI scheme shall be the string “iscc”.
c) The URI path shall be the lower-cased base32 representation of an ISCC without padding.
EXAMPLE iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
NOTE Because Base32 defines an upper case standard alphabet, the canonical form differs from the URI form,
which is represented in lower case.
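Since the two forms differ only in the prefix and in letter case, the conversion can be sketched as follows. The function names canonical_to_uri and uri_to_canonical are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical.


def canonical_to_uri(iscc: str) -> str:
    """Convert the canonical form (ISCC:...) to the URI form (iscc:...)."""
    if not iscc.startswith("ISCC:"):
        raise ValueError("canonical form must start with 'ISCC:'")
    return "iscc:" + iscc[len("ISCC:"):].lower()


def uri_to_canonical(uri: str) -> str:
    """Convert the URI form (iscc:...) back to the canonical form."""
    scheme, separator, path = uri.partition(":")
    if scheme != "iscc" or not separator or not path:
        raise ValueError("URI form must be 'iscc:' followed by a path")
    return "ISCC:" + path.upper()


print(canonical_to_uri("ISCC:KEC43HJLPUSHVAZT66YLPUWNVACWYPIV533TRQMWF2IUQYSP5LA4CTY"))
# iscc:kec43hjlpushvazt66ylpuwnvacwypiv533trqmwf2iuqysp5la4cty
```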
4.4.3 Multiformats encoding
The ISCC may be encoded as a multibase string [13] (see Table 8).
a) The multicodec identifier of an ISCC shall be “0xcc01” (see Table 9).
b) A Multiformat representation of an ISCC shall be prefixed with a multibase code.
c) The encoding scheme shall be <multibase-prefix><multicodec-id><iscc-header><iscc-body>.
ISCC shall support the multibase encodings given in Tables 8 and 9.
Table 8 — Supported multibase encodings
Encoding    Code   Definition
base16      f      hexadecimal
base32      b      RFC4648 case-insensitive - no padding
base32hex   v      RFC4648 case-insensitive - no padding - highest char
base58btc   z      base58 bitcoin
base64url   u      RFC4648 no padding
Table 9 — Examples of ISCCs in multiformats encoding
Encoding       Example
MF base16      fcc015105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
MF base32      bzqavcbontuvx2jd2qmz7pmfx2lg2qblmhuk655zyyglc5ekimjh6vqobj4
MF base32hex   vpg0l21edjklnq93qgcpvfc5nqb6qg1bc7kauttpoo6b2t4a8c97ulge19s
MF base58btc   z2Yr3BMx3Rj56fyYkNvfa19PCk4SjspQhpVWoLSGg9yXr4vUGsx
MF base64url   uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP
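The examples in Table 9 are consistent with prepending the multicodec identifier bytes cc01 to the raw ISCC bytes and then applying the multibase encoding. The following sketch reproduces three of the five encodings with the Python standard library; base32hex and base58btc are omitted for brevity, and the function name to_multiformat is hypothetical.

```python
# Illustrative sketch only: the function name is hypothetical, and only three
# of the five encodings of Table 8 are covered.
import base64

MULTICODEC_ISCC = bytes.fromhex("cc01")  # multicodec identifier of an ISCC


def to_multiformat(raw: bytes, encoding: str = "base32") -> str:
    """Encode ISCC-HEADER + ISCC-BODY bytes as a multibase string."""
    data = MULTICODEC_ISCC + raw
    if encoding == "base16":
        return "f" + data.hex()
    if encoding == "base32":
        return "b" + base64.b32encode(data).decode("ascii").rstrip("=").lower()
    if encoding == "base64url":
        return "u" + base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    raise ValueError(f"encoding not covered by this sketch: {encoding}")


raw = bytes.fromhex(
    "5105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f"
)
print(to_multiformat(raw, "base64url"))
# uzAFRBc2dK30keoMz97C30s2oBWw9Fe73OMGWLpFIYk_qwcFP
```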
4.4.4 Readable encoding
The ISCC may be encoded in human readable representation.
a) The readable representation shall encode the header fields with their symbols and the ISCC-BODY in
base16 lower-case.
b) The header fields and the ISCC-BODY shall be separated with hyphens.
EXAMPLE
ISCC-IMAGE-V0-MCDI-cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f
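The readable form can be derived from the raw bytes by decoding the four header-fields and mapping them to the symbols of Tables 2 to 7. The sketch below is one possible way to do this; the names read_field and to_readable are hypothetical, and only the reserved values listed in this document are handled.

```python
# Illustrative sketch only: names are hypothetical, and only the reserved
# values of Tables 2 to 7 are handled.

MAINTYPES = ["META", "SEMANTIC", "CONTENT", "DATA", "INSTANCE", "ISCC"]
MODES = ["TEXT", "IMAGE", "AUDIO", "VIDEO", "MIXED"]          # Table 3
ISCC_SUBTYPES = MODES + ["SUM", "NONE"]                       # Tables 3 and 4
CODE_LENGTHS = ["SUM", "CDI", "SDI", "SCDI", "MDI", "MCDI", "MSDI", "MSCDI"]
FIELD_RANGES = [("0", 3, 0), ("10", 6, 8), ("110", 9, 72), ("1110", 12, 584)]


def read_field(bits: str, pos: int) -> tuple[int, int]:
    """Decode the header-field starting at pos; return (value, next position)."""
    for prefix, data_bits, lowest in FIELD_RANGES:
        if bits.startswith(prefix, pos):
            start = pos + len(prefix)
            return lowest + int(bits[start:start + data_bits], 2), start + data_bits
    raise ValueError("unknown header-field prefix")


def to_readable(raw: bytes) -> str:
    """Render ISCC-HEADER + ISCC-BODY bytes in the readable encoding."""
    bits = "".join(f"{byte:08b}" for byte in raw)
    pos, fields = 0, []
    for _ in range(4):
        value, pos = read_field(bits, pos)
        fields.append(value)
    main_type, sub_type, version, length = fields
    pos += -pos % 8  # skip the 0000 padding if the header is not byte-aligned
    body = raw[pos // 8:]
    main = MAINTYPES[main_type]
    if main == "ISCC":
        sub, size = ISCC_SUBTYPES[sub_type], CODE_LENGTHS[length]
    else:
        sub = MODES[sub_type] if main in ("SEMANTIC", "CONTENT") else "NONE"
        size = f"L{(length + 1) * 32}"
    return f"{main}-{sub}-V{version}-{size}-{body.hex()}"


raw = bytes.fromhex(
    "5105cd9d2b7d247a8333f7b0b7d2cda8056c3d15eef738c1962e9148624feac1c14f"
)
print(to_readable(raw))
```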

5 ISCC-UNITs
5.1 Meta-Code
5.1.1 General
The Meta-Code is a similarity hash generated from referent seed metadata in accordance with Annex B.
5.1.2 Purpose
The Meta-Code shall support the following use cases:
a) clustering of digital assets based on their metadata;
b) discovery of digital assets with similar metadata;
c) verification or manual disambiguation of matching codes.
5.1.3 Format
The Meta-Code shall have the data format as illustrated in Figure 2:
Figure 2 — Data format of the Meta-Code
EXAMPLE 1 64-bit Meta-Code in its canonical form:
ISCC:AAAUL6P7RMVNT4UJ
EXAMPLE 2 256-bit Meta-Code in its canonical form:
ISCC:AADUL6P7RMVNT4UJJ4SMTDXBL5JFZ5XPCDKO42XYPJEVQ4L7PTYDORQ
5.1.4 Inputs
5.1.4.1 General
Seed metadata is the metadata that is used as the input to calculate the Meta-Code and has three possible
elements:
a) name (required): the name or title of the work manifested by the digital asset;
b) description (optional): a disambiguating textual description of the digital asset;
c) meta (optional): subject, industry, or use-case specific metadata.
Seed metadata shall be stored and carried along unaltered with ISCC Metadata if automated verification of
the Meta-Code based on the original seed metadata is required.
NOTE 1 Because seed metadata is used to construct the Meta-Code, changes to its value can produce different (and
therefore no longer matching) Meta-Codes.
NOTE 2 The identifier standards and their schemas, such as DOI, ISAN, ISBN, ISRC, ISSN and ISWC, provide helpful
guidance in selecting seed metadata.

5.1.4.2 name element
The text input for the name element shall be pre-processed before similarity hashing as follows.
a) Apply ISO/IEC 10646 NFKC Unicode Normalization (see Unicode Normalization Forms
https://unicode.org/reports/tr15/#Norm_Forms).
b) Remove control characters (see Unicode Character Database https://www.unicode.org/ucd/).
c) Strip leading and trailing whitespace.
d) Trim the end of the text such that the UTF-8 encoded size does not exceed 128 bytes.
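The four pre-processing steps for the name element can be sketched with the unicodedata module of the Python standard library. The sketch assumes that "control characters" means characters of Unicode general category Cc, and that trimming drops a trailing character that would otherwise be cut in the middle of its UTF-8 byte sequence; both are interpretations made for illustration. The function name clean_name is hypothetical.

```python
# Illustrative sketch only: the function name is hypothetical.
import unicodedata


def clean_name(name: str, max_bytes: int = 128) -> str:
    """Pre-process the name element before similarity hashing."""
    # a) NFKC Unicode normalization.
    text = unicodedata.normalize("NFKC", name)
    # b) Remove control characters (assumption: general category Cc).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    # c) Strip leading and trailing whitespace.
    text = text.strip()
    # d) Trim to at most max_bytes of UTF-8 (assumption: an incomplete
    #    trailing multi-byte sequence is dropped rather than kept).
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


print(clean_name("  Hello\u0000 World\n "))  # Hello World
```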
5.1.4.3 description element
Text input for the description element shall be pre-processed before similarity hashing as follows.
a) Apply NFKC Unicode Normalization.
b) Remove control characters (as specified by Unicode Character Database) except for the following
newline characters:
1) U+000A - Line Feed;
2) U+000B - Vertical Tab;
3) U+000C - Form Feed;
4) U+000D - Carriage Return;
5) U+0085 - Next Line;
6) U+2028 - Line Separator;
7) U+2029 - Paragraph Separator.
c) Collapse more than two consecutive newline characters to a maximum of two consecutive newline characters.
d) Strip leading and trailing whitespace characters.
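The pre-processing of the description element can be sketched in the same way. The sketch assumes that "control characters" means Unicode general category Cc, that each of the seven listed characters counts as one newline, and that the first two characters of a longer run are the ones retained; these are interpretations made for illustration. The function name clean_description is hypothetical.

```python
# Illustrative sketch only: the function name is hypothetical.
import re
import unicodedata

# The seven newline characters that are exempt from removal.
NEWLINES = "\u000A\u000B\u000C\u000D\u0085\u2028\u2029"
NEWLINE_RUN = re.compile("[" + NEWLINES + "]{3,}")


def clean_description(description: str) -> str:
    """Pre-process the description element before similarity hashing."""
    # a) NFKC Unicode normalization.
    text = unicodedata.normalize("NFKC", description)
    # b) Remove control characters (assumption: category Cc) except newlines.
    text = "".join(
        ch for ch in text if ch in NEWLINES or unicodedata.category(ch) != "Cc"
    )
    # c) Collapse runs of three or more newline characters to the first two
    #    (assumption: the leading characters of the run are kept).
    text = NEWLINE_RUN.sub(lambda match: match.group(0)[:2], text)
    # d) Strip leading and trailing whitespace characters.
    return text.strip()


print(repr(clean_description("  Line one\n\n\n\nLine two\n ")))
# 'Line one\n\nLine two'
```

Under these assumptions a Windows-style sequence such as three consecutive CR LF pairs counts as six newline characters and collapses to a single CR LF pair.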
5.1.4.4 meta element
a) The value of the meta element shall be wrapped in an RFC 2397 Data-URL.
b) The value of the meta
...
