Genomics informatics — Data elements and their metadata for describing the microsatellite instability (MSI) information of clinical massive parallel DNA sequencing

This document identifies data elements and metadata to represent information about microsatellite instability (MSI) for reporting the value of the biomarker using clinical massive parallel DNA sequencing. It covers the MSI test result and related data, such as the resources used, data generation conditions, and data processing information, which are helpful for clinical diagnosis and research. This document is not intended for defining experimental protocols or methods for calculating the value of MSI, for biological species other than humans, or for Sanger sequencing methods.

Informatique génomique — Éléments de données et leurs métadonnées pour décrire les informations relatives à l'instabilité des microsatellites (MSI) du séquençage massif parallèle d'ADN

General Information

Status: Published
Publication Date: 01-May-2023
Current Stage: 6060 - International Standard published
Start Date: 30-Apr-2023
Due Date: 14-Mar-2024
Completion Date: 02-May-2023

Technical specification
ISO/TS 4425:2023 - Genomics informatics — Data elements and their metadata for describing the microsatellite instability (MSI) information of clinical massive parallel DNA sequencing. Released: 02-May-2023
English language
20 pages
Draft
REDLINE ISO/DTS 4425 - Genomics informatics — Data elements and their metadata for describing the microsatellite instability (MSI) information of clinical massive parallel DNA sequencing. Released: 21-Dec-2022
English language
21 pages
Draft
ISO/DTS 4425 - Genomics informatics — Data elements and their metadata for describing the microsatellite instability (MSI) information of clinical massive parallel DNA sequencing. Released: 21-Dec-2022
English language
21 pages

Standards Content (Sample)

TECHNICAL SPECIFICATION
ISO/TS 4425
First edition
2023-04
Genomics informatics — Data elements and their metadata for describing the microsatellite instability (MSI) information of clinical massive parallel DNA sequencing
Informatique génomique — Éléments de données et leurs métadonnées pour décrire les informations relatives à l'instabilité des microsatellites (MSI) du séquençage massif parallèle d'ADN
Reference number
ISO/TS 4425:2023(E)
© ISO 2023

COPYRIGHT PROTECTED DOCUMENT
© ISO 2023
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
  © ISO 2023 – All rights reserved

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Microsatellite instability (MSI)
6 Composition of elements for describing MSI status on clinical DNA NGS report
6.1 General
6.2 Summary part
6.3 Detail part
7 Fields and their nomenclature of required data
7.1 General
7.2 Clinical sequencing order
7.2.1 General
7.2.2 Clinical sequencing order code
7.2.3 Date and time
7.3 Information on subject of care
7.3.1 General
7.3.2 Subject of care identifier
7.3.3 Subject of care name
7.3.4 Subject of care birth date
7.3.5 Subject of care sex
7.3.6 Subject of care ancestry
7.3.7 Referring diagnosis
7.4 Information on legally authorized person ordering clinical sequencing
7.4.1 General
7.5 Performing laboratory
7.5.1 General
7.5.2 Basic information on performing laboratory
7.5.3 Information on report generator
7.5.4 Information of legally confirmed person on sequencing report
7.6 Biospecimen information
7.6.1 General
7.6.2 Type of specimen
7.7 MSI status result information
7.7.1 General
7.7.2 MSI status
7.8 Recommended treatment
7.8.1 General
7.8.2 Medication
7.8.3 Clinical trial information
7.8.4 Other recommendations
7.8.5 Supporting information
8 Fields and their nomenclature of optional data
8.1 General
8.2 Reference genome information
8.3 MSI information
8.3.1 General
8.3.2 Criteria of MSI status
8.3.3 Genomic position for determining MSI status
8.3.4 Genomic position against markers of alternative method
8.3.5 Clinical implication of MSI status
8.4 Sequencing information
8.4.1 Clinical sequencing date
8.4.2 Sequencing type
8.4.3 Quality control metrics
8.4.4 Sequencing platform information
8.4.5 Analysis platform information
Annex A (informative) Example structure of MSI status report
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 215, Health informatics, Subcommittee
SC 1, Genomics informatics.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Massively parallel sequencing is a high-throughput analytical approach to nucleic acid sequencing that
allows whole genomes, transcriptomes, and specific nucleic acid targets to be sequenced. These advanced
technologies have been used in the clinical field, and clinical sequencing has been applied to realize
personalized medicine and precision medicine. ISO/TS 20428 [1] has been developed for clinical usage.
In the field of cancer treatment, various treatment strategies that differ from traditional anti-cancer
chemotherapies have been applied. One of those strategies is to control the human immune system so that
it sustains its action against cancer cells. Recent outcomes of clinical trials show that this immune
therapy is effective for some patients whose tumors have a specific molecular characteristic, such as
PD-L1 or CTLA4 surface protein expression [2]. As a result, these molecular characteristics are used as
biomarkers for selecting patients. In colon cancer, several clinical trials have reported that MSI
(microsatellite instability) status is a biomarker: drugs based on immunotherapy are more effective for
patients with MSI-H (high) [3].
The status of MSI can be calculated and reported from small nucleotide deletions in specific regions of
the human reference genome using NGS [4]. According to the US FDA, four NGS products were approved for
companion diagnostics. Among these products, three provide MSI status and value on their NGS report.
CLIA-certified laboratories, or agencies of an equivalent level in other countries, also report MSI
status using their own methods [5]. It is expected that more clinical NGS products will be approved to
report MSI.
However, there is no standard for describing MSI status, value, and metadata. ISO/TS 20428 focuses only
on DNA variations compared with the reference genome. According to some research results, MSI status and
the way it is described can differ even when the same sequencing data are used. This makes it difficult
for clinicians and researchers to use MSI status results, both for clinical decisions and for secondary
analysis, when results are received from more than one sequencing laboratory. Related metadata are
essential to expand the usage of MSI status results.
In this document, the data elements and their standardized metadata for MSI status in electronic health
records are described. The clinical report for MSI provides helpful information on bioinformatics
analysis to support clinical decisions.
TECHNICAL SPECIFICATION ISO/TS 4425:2023(E)
Genomics informatics — Data elements and their
metadata for describing the microsatellite instability (MSI)
information of clinical massive parallel DNA sequencing
1 Scope
This document identifies data elements and metadata to represent information about microsatellite
instability (MSI) for reporting the value of the biomarker using clinical massive parallel DNA sequencing.
This document covers the MSI test result and related data, such as the resources used, data generation
conditions, and data processing information, which are helpful for clinical diagnosis and research.
This document is not intended
— for defining experimental protocols or methods for calculating the value of microsatellite instability
(MSI),
— for biological species other than humans, or
— for Sanger sequencing methods.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 8601 (all parts), Date and time — Representations for information interchange
ISO/TS 22220:2011, Health informatics — Identification of subjects of health care
ISO/TS 27527:2010, Health informatics — Provider identification
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
biological specimen
biospecimen
specimen
sample of tissue, body fluid, food, or other substance that is collected or acquired to support the
assessment, diagnosis, treatment, mitigation or prevention of a disease, disorder or abnormal physical
state, or its symptoms
[SOURCE: ISO/TS 20428:2017, 3.34]
3.2
clinical sequencing
next generation sequencing or later sequencing technologies with human samples for clinical practice
and clinical trials
[SOURCE: ISO/TS 20428:2017, 3.5]
3.3
deletion
contiguous removal of one or more bases from a genomic sequence
[SOURCE: ISO/IEC 23092-2:2020, 3.4]
3.4
deoxyribonucleic acid
DNA
molecule that encodes genetic information in the nucleus of cells
[SOURCE: ISO 25720:2009, 4.7]
3.5
DNA sequencing
determining the order of nucleotide bases (adenine, guanine, cytosine and thymine) in a molecule of
DNA
Note 1 to entry: Sequence is generally described from the 5' end.
[SOURCE: ISO 17822:2020, 3.19]
3.6
exome
part of the genome formed by exons
[SOURCE: ISO/TS 20428:2017, 3.13]
3.7
gene
basic unit of hereditary material that encodes and controls the expression of a protein or protein
subunit
3.8
indel
insertion (3.9) and/or deletion (3.3)
[SOURCE: ISO/TS 20428:2017, 3.18]
3.9
insertion
contiguous addition of one or more bases into a genomic sequence
[SOURCE: ISO/IEC 23092-2:2020, 3.18]
3.10
microsatellite
repetitive DNA elements, also known as simple sequence repeats (SSR), consisting of short tandem
repeat motifs of one to a few nucleotides that tend to occur in non-coding DNA of eukaryotic genomes
and that are sometimes referred to as variable number of tandem repeats (VNTRs)
3.11
microsatellite instability
MSI
condition of genetic hypermutability (predisposition to mutation) that results from impaired DNA
mismatch repair (MMR)
3.12
DNA mismatch repair
MMR
system for recognizing and repairing erroneous insertion, deletion, and misincorporation of bases that
can arise during DNA replication and recombination, as well as repairing some forms of DNA damage
3.13
nucleotide
monomer of a nucleic acid polymer such as DNA or RNA
Note 1 to entry: Nucleotides are denoted as letters ('A' for adenine; 'C' for cytosine; 'G' for guanine; 'T' for thymine
which only occurs in DNA; and 'U' for uracil, which only occurs in RNA). The chemical formula for a specific
DNA or RNA molecule is given by the sequence of its nucleotides, which can be represented as a string over the
alphabet ('A', 'C', 'G', 'T') in the case of DNA, and a string over the alphabet ('A', 'C', 'G', 'U') in the case of RNA. Bases
with unknown molecular composition are denoted with 'N'.
[SOURCE: ISO/IEC 23092-2:2020, 3.20]
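The alphabets in Note 1 can be checked mechanically. The following is a minimal Python sketch, not part of the specification; the names `DNA_ALPHABET`, `RNA_ALPHABET` and `is_valid_sequence` are hypothetical, and treating an empty string as invalid is an assumption made here for illustration.

```python
# Alphabets from Note 1: four bases per molecule type, plus 'N' for unknown bases.
DNA_ALPHABET = frozenset("ACGTN")
RNA_ALPHABET = frozenset("ACGUN")


def is_valid_sequence(sequence: str, molecule: str = "DNA") -> bool:
    """Return True if every symbol belongs to the alphabet of the given molecule."""
    if molecule == "DNA":
        alphabet = DNA_ALPHABET
    elif molecule == "RNA":
        alphabet = RNA_ALPHABET
    else:
        raise ValueError("molecule must be 'DNA' or 'RNA'")
    # An empty string is treated as invalid (an assumption of this sketch).
    return bool(sequence) and set(sequence) <= alphabet
```

For example, `is_valid_sequence("ACGU")` is false because 'U' occurs only in RNA, while `is_valid_sequence("ACGU", "RNA")` is true.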
3.14
polymerase chain reaction
PCR
in vitro enzymatic technique to increase the number of copies of a specific DNA fragment by several
orders of magnitude
[SOURCE: ISO 16577:2022, 3.6.47]
3.15
quality score
Phred quality score
Q score
sequencing quality score of a given nucleotide base
Note 1 to entry: Q is defined by the following equation: $Q = -10 \log_{10}(e)$, where $e$ is the estimated
probability of the base call being wrong.
Note 2 to entry: A quality score of 20 represents an error rate of 1 in 100, with a corresponding call accuracy of
99 %.
Note 3 to entry: Higher quality scores indicate a smaller probability of error. Lower quality scores can result in a
significant portion of the reads being unusable. Low quality scores may also indicate false-positive variant calls,
resulting in inaccurate conclusions.
[SOURCE: ISO 20397-2:2021, 3.30]
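The relationship in Note 1 can be inverted to recover the error probability from a quality score. The following Python sketch illustrates both directions; the function names are hypothetical and are not defined by the specification.

```python
import math


def error_probability_to_phred(e: float) -> float:
    """Return Q = -10 * log10(e) for an estimated error probability 0 < e <= 1."""
    if not 0 < e <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(e)


def phred_to_error_probability(q: float) -> float:
    """Return the estimated error probability e = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)
```

Consistent with Note 2, a quality score of 20 corresponds to an error probability of about 0.01, that is, a call accuracy of about 99 %.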
3.16
read type
type of run in the sequencing instrument
Note 1 to entry: It can be either single-end or paired-end.
Note 2 to entry: Single-end: in a single-read run, the sequencing instrument reads from one end of a fragment to the
other end.
Note 3 to entry: Paired-end: a paired-end run reads from one end to the other and then starts another round of
reading from the opposite end.
[SOURCE: ISO/TS 20428:2017, 3.27]
3.17
reference sequence
nucleic acid sequence with biological relevance
Note 1 to entry: Each reference sequence is indexed by a one-dimensional integer coordinate system whereby
each integer within range identifies a single nucleotide. Coordinate values can only be equal to or larger than
zero. The coordinate system in the context of this standard is zero-based (i.e., the first nucleotide has coordinate
0, and it is said to be at position 0) and linearly increases within the string from left to right.
[SOURCE: ISO/IEC 23092-1:2020, 3.22]
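The zero-based coordinate system in Note 1 matches ordinary string indexing in many programming languages. The following Python sketch is illustrative only; `base_at` and `to_one_based` are hypothetical helper names, and the one-based conversion is included on the assumption that positions may need to be exchanged with tools or file formats that count from 1.

```python
def base_at(reference: str, position: int) -> str:
    """Return the nucleotide at a zero-based position of a reference sequence."""
    if position < 0 or position >= len(reference):
        raise IndexError("position is outside the reference sequence")
    return reference[position]


def to_one_based(position: int) -> int:
    """Convert a zero-based coordinate to the equivalent one-based coordinate."""
    if position < 0:
        raise ValueError("zero-based coordinates cannot be negative")
    return position + 1


reference = "ACGTTG"
first_base = base_at(reference, 0)  # the first nucleotide is at position 0
```

Recording which convention a reported genomic position uses avoids off-by-one errors when results from different laboratories are compared.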
3.18
read
sequence read
fragmented nucleotide sequences that are used to reconstruct the original sequence for next-generation
sequencing technologies
[SOURCE: ISO/TS 20428:2017, 3.26]
3.19
variation
sequence variation
DNA sequence variation
differences of DNA sequence among individuals in a population
[SOURCE: ISO 25720:2009, 4.8]
3.20
small indel
insertion (3.9) or deletion (3.3) of 2 nucleotides to 100 nucleotides
[SOURCE: ISO/TS 20428:2017, 3.32]
3.21
subject of care
person who uses, or is a potential user of, a healthcare service
[SOURCE: ISO/TS 22220:2011, 3.2, modified — Note to entry and second preferred term deleted.]
3.22
target capture
method to capture genomic regions of interest from a DNA sample prior to sequencing
[SOURCE: ISO/TS 20428:2017, 3.36]
3.23
targeted sequencing
technique used for sequencing only selected/targeted genomic regions of interest from a DNA sample
[SOURCE: ISO/TS 22692:2020, 3.22, modified — Note to entry and second preferred term deleted.]
3.24
whole exome sequencing
WES
technique for sequencing the exons of the protein-coding genes in a genome
3.25
whole genome sequencing
WGS
technique that determines the complete DNA sequence of an organism's genome at a single time
[SOURCE: ISO/TS 20428:2017, 3.39]
4 Abbreviated terms
ATC Anatomical Therapeutic Chemical
CTLA4 Cytotoxic T-Lymphocyte Associated Protein 4
EBI European Bioinformatics Institute
FHIR Fast Healthcare Interoperability Resources
HL7® Health Level Seven
IDMP Identification of Medicinal Product
IMPID Investigational MPID
INN International Nonproprietary Names
MPID Medicinal Product Identifier
NCCN National Comprehensive Cancer Network
NGS Next Generation Sequencing
NIH National Institutes of Health
PD-L1 Programmed death-ligand 1
PD-1 Programmed cell death protein 1
SPREC Standard PREanalytical Code
WHO World Health Organization
UTN Universal Trial Number
5 Microsatellite instability (MSI)
The DNA mismatch repair (MMR) pathway plays an important role in the cell cycle by recognizing and
repairing mismatches that arise during DNA replication. The major components are four key enzymes
encoded by the genes MLH1, MSH2, MSH6, and PMS2; a fifth gene, EPCAM, is also relevant because its
deletion can silence MSH2. MMR function is lost when mutational or epigenetic inactivation occurs in
these five genes. This condition is called deficient mismatch repair (dMMR). One of the most closely
related diseases is Lynch syndrome [6]. Lynch syndrome, also known as hereditary non-polyposis
colorectal cancer (HNPCC), is the most common cause of hereditary colorectal cancer. People with Lynch
syndrome are more likely to develop colorectal cancer and other cancers at a younger age (under 50).
Patients develop dMMR tumors following the inactivation of the second wild-type allele through somatic
mutation, loss of heterozygosity, or epigenetic silencing. These alterations, whether mutational or
epigenetic, are related not only to Lynch syndrome; they also differ by cancer type. However, both lead
to the accumulation of alterations in short repeated DNA sequences at specific locations throughout the
genome and to an increased risk of malignant transformation in certain tissues. These tumors have a
higher frequency of somatic mutations compared with non-dMMR cancers and are assumed to have a large
range of tumor neoantigens (high tumor mutation burden) and a highly immunogenic signature, including a
high proportion of tumor-infiltrating lymphocytes. Defective mismatch repair results in a high tumor
mutation burden and abundant neoantigen formation, which can be recognized by the host immune system.
Microsatellite instability (MSI) is found in 1,5 % to 3,5 % of all human cancers, such as colorectal,
endometrial, and ovarian cancers and cancers of the stomach, small intestine, pancreas, biliary tract,
and ureter. The human genome contains more than 19 million microsatellites, short tandem repeats of
motifs of 1 nucleotide to 6 nucleotides, typically spanning 10 nucleotides to 60 nucleotides in total
length [7]. If MMR does not function well, nucleotide errors accumulate, especially at genomic positions
that include microsatellites. Certain polymorphic microsatellites can also serve as an individual's
molecular barcode, which can be used in forensic identification.
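To make the description of a microsatellite concrete (a motif of 1 to 6 nucleotides repeated in tandem over roughly 10 to 60 nucleotides), the following Python sketch scans a DNA string for such runs. It only illustrates the definition and is not a method for calculating MSI status, which this document does not define. The function name, the regular expression, and the default length bounds are assumptions chosen for illustration. One known simplification is that a very short repeat, such as "GG", at the start of a longer repeat can consume characters and hide the longer repeat.

```python
import re

# A motif of 1 to 6 nucleotides followed by one or more further copies of itself.
# The lazy quantifier makes the shortest repeating unit win, so "AAAA" is
# reported with motif "A" rather than "AA".
TANDEM_REPEAT = re.compile(r"([ACGT]{1,6}?)\1+")


def find_microsatellites(sequence, min_length=10, max_length=60):
    """Return (zero-based start, motif, copy count) for each tandem repeat in range."""
    hits = []
    for match in TANDEM_REPEAT.finditer(sequence.upper()):
        motif = match.group(1)
        total_length = match.end() - match.start()
        if min_length <= total_length <= max_length:
            hits.append((match.start(), motif, total_length // len(motif)))
    return hits


example = "GT" + "CA" * 6 + "GT"
print(find_microsatellites(example))  # [(2, 'CA', 6)]
```

The start positions are zero-based, in line with the coordinate convention given for reference sequences in 3.17.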
The status of MSI is used as a biomarker for predicting the prognosis of colorectal cancer or for
selecting patients for whom immune therapy is likely to be more effective. Recent clinical studies have
shown that MSI status predicts clinical benefit from immune checkpoint blockade (ICB) with PD-1/PD-L1
interaction inhibitors. In May 2017, the drug pembrolizumab (KEYTRUDA®, see footnote 1), a humanized
antibody against the programmed death receptor-1 (PD-1), was approved in the US for the treatment of
patients with any advanced solid cancer harboring a high tumor mutation burden as measured by the
presence of microsatellite instability (MSI-H) [8]. In addition, combined therapy, including nivolumab
and ipilimumab, was approved for MSI-H metastatic colorectal cancer. This diagnostic and treatment
strategy is recommended in NCCN guidelines for many types of cancer. MSI status also has prognostic
significance, most notably in colorectal cancer, where testing is recommended in clinical practice
guidelines for all patients.
Traditionally, MSI status testing is most often performed via PCR and/or IH
...

ISO/AWI TS 4425:2022(E)
ISO TC 215/SC 1/WG 1
Secretariat: KATS
Date: 2022-11-17
Genomics informatics — Data elements and their metadata for describing the microsatellite
instability (MSI) information of clinical massive parallel DNA sequencing

WD stage

Warning for WDs and CDs
This document is not an ISO International Standard. It is distributed for review and comment. It is subject to
change without notice and may not be referred to as an International Standard.
Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of
which they are aware and to provide supporting documentation.
ISO/DTS 4425:2023(E)
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Microsatellite instability (MSI)
6 Composition of elements for describing MSI status on clinical DNA NGS report
6.1 General
6.2 Summary part
6.3 Detail part
7 Fields and their nomenclature of required data
7.1 General
Table 1 — Data elements, their metadata, and cardinality for required fields
7.2 Clinical sequencing order
7.2.1 General
7.2.2 Clinical sequencing order code
7.2.3 Date and time
7.3 Information on subject of care
7.3.1 General
7.3.2 Subject of care identifier
7.3.3 Subject of care name
7.3.4 Subject of care birth date
7.3.5 Subject of care sex
7.3.6 Subject of care ancestry
7.3.7 Referring diagnosis
7.4 Information on legally authorized person ordering clinical sequencing
7.4.1 General
7.5 Performing laboratory
7.5.1 General
7.5.2 Basic information on performing laboratory
7.5.3 Information on report generator
7.5.4 Information of legally confirmed person on sequencing report
7.6 Biospecimen information
7.6.1 General
7.6.2 Type of specimen
7.7 MSI status result information
7.7.1 General
7.7.2 MSI status
7.8 Recommended treatment
7.8.1 General
7.8.2 Medication
7.8.3 Clinical trial information
7.8.4 Other recommendations
7.8.5 Supporting information
8 Fields and their nomenclature of optional data
8.1 General
Table 2 — Data elements and their metadata for optional fields
8.2 Reference genome information
8.3 MSI information
8.3.1 General
8.3.2 Criteria of MSI status
8.3.3 Genomic position for determining MSI status
8.3.4 Genomic position against markers of alternative method
8.3.5 Clinical implication of MSI status
8.4 Sequencing information
8.4.1 Clinical sequencing date
8.4.2 Sequencing type
8.4.3 Quality control metrics
8.4.4 Sequencing platform information
8.4.5 Analysis platform information
Annex A (informative) Example structure of MSI status report
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO
collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on
the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the World
Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 215, Health informatics, Subcommittee SC
1, Genomics informatics.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
Massively parallel sequencing is a high-throughput analytical approach to nucleic acid sequencing that
allows whole genomes, transcriptomes, and specific nucleic acid targets to be sequenced. These advanced
technologies have been used in the clinical field, and clinical sequencing has been applied to realize
personalized medicine and precision medicine. ISO/TS 20428 [1] has been developed for clinical usage.
In the field of cancer treatment, various treatment strategies that differ from traditional anti-cancer
chemotherapies have been applied. One of those strategies is to control the human immune system so that
it sustains its action against cancer cells. Recent outcomes of clinical trials show that this immune
therapy is effective for some patients whose tumors have a specific molecular characteristic, such as
PD-L1 or CTLA4 surface protein expression [2]. As a result, these molecular characteristics are used as
biomarkers for selecting patients. In colon cancer, several clinical trials have reported that MSI
(microsatellite instability) status is a biomarker: drugs based on immunotherapy are more effective for
patients with MSI-H (high) [3].
The status of MSI can be calculated and reported from small nucleotide deletions in specific regions of
the human reference genome using NGS [4]. According to the US FDA, four NGS products were approved for
companion diagnostics. Among these products, three provide MSI status and value on their NGS report.
CLIA-certified laboratories, or agencies of an equivalent level in other countries, also report MSI
status using their own methods [5]. It is expected that more clinical NGS products will be approved to
report MSI.
However, there is no international standard for describing MSI status, value, and metadata.
ISO/TS 20428 focuses only on DNA variations compared with the reference genome. According to some
research results, MSI status and the way it is described can differ even when the same sequencing data
are used. This makes it difficult for clinicians and researchers to use MSI status results, both for
clinical decisions and for secondary analysis, when results are received from more than one sequencing
laboratory. Related metadata are essential to expand the usage of MSI status results.
In this document, the data elements and their standardized metadata for MSI status in electronic health
records are described. The clinical report for MSI provides helpful information on bioinformatics
analysis to support clinical decisions.
Genomics informatics — General guidelines for describing the
microsatellite instability (MSI) information of clinical massive
parallel DNA sequencing
1 Scope
This document identifies data elements and metadata to represent the
information about microsatellite instability (MSI) for reporting the value of the biomarker using clinical
massive parallel DNA sequencing.
This document covers information about the MSI test result and related data, such as used resources, data
generation condition, and data processing information which are helpful to clinical diagnosis and
research.
This document is not intended
— for defining experimental protocols or methods for calculating the value of microsatellite
instability (MSI),
— for biological species other than human, or
— for Sanger sequencing methods.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 8601 (all parts), Date and time — Representations for information interchange
ISO/TS 22220:2011, Health informatics — Identification of subjects of health care
ISO/TS 27527:2010, Health informatics — Provider identification
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following
addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
biological specimen
biospecimen
specimen
sample of tissue, body fluid, food, or other substance that is collected or acquired to support the
assessment, diagnosis, treatment, mitigation or prevention of a disease, disorder or abnormal physical
state, or its symptoms
[SOURCE: ISO/TS 20428:2017, 3.34]
3.2
clinical sequencing
next generation sequencing or later sequencing technologies with human samples for clinical practice
and clinical trials
[SOURCE: ISO/TS 20428:2017, 3.5]
3.2.1
deletion
contiguous removal of one or more bases from a genomic sequence
[SOURCE: ISO/IEC 23092-2:2020, 3.4]
3.3
deoxyribonucleic acid
DNA
molecule that encodes genetic information in the nucleus of cells
[SOURCE: ISO 25720:2009, 4.7]
3.4
DNA sequencing
determining the order of nucleotide bases (adenine, guanine, cytosine and thymine) in a molecule of DNA
Note 1 to entry: Sequence is generally described from the 5' end.
[SOURCE: ISO/TS 17822:2020, 3.19]
3.5
exome
part of the genome formed by exons
[SOURCE: ISO/TS 20428:2017, 3.13]
3.6
gene
basic unit of hereditary material that encodes and controls the expression of a protein or protein subunit
[SOURCE: ISO 11238:2012, 2.1.16]
3.7
indel
insertion (3.8) and/or deletion (3.2.1)
[SOURCE: ISO/TS 20428:2017, 3.18]
3.8
insertion
contiguous addition of one or more bases into a genomic sequence
[SOURCE: ISO/IEC 23092-2:2020, 3.18]
3.9
microsatellite
repetitive DNA elements, also known as simple sequence repeats (SSR), consisting of short in tandem
repeat motifs of one to a few nucleotides that tend to occur in non-coding DNA of eukaryotic genomes
and that are sometimes referred to as variable number of tandem repeats (VNTRs)
[SOURCE: ISO 16577:2016(en), 3.111]
3.10
microsatellite instability
MSI
condition of genetic hypermutability (predisposition to mutation) that results from impaired DNA
mismatch repair (MMR)
3.11
DNA mismatch repair
MMR
system for recognizing and repairing erroneous insertion, deletion, and misincorporation of bases that
can arise during DNA replication and recombination, as well as repairing some forms of DNA damage
3.12
nucleotide
base
base pair
monomer of a nucleic acid polymer such as DNA or RNA
Note 1 to entry: Nucleotides are denoted as letters ('A' for adenine; 'C' for cytosine; 'G' for guanine; 'T' for thymine
which only occurs in DNA; and 'U' for uracil, which only occurs in RNA). The chemical formula for a specific DNA or
RNA molecule is given by the sequence of its nucleotides, which can be represented as a string over the alphabet
('A',' C',' G', 'T') in the case of DNA, and a string over the alphabet ('A', 'C', 'G', 'U') in the case of RNA. Bases with
unknown molecular composition are denoted with 'N'.
[SOURCE: ISO/IEC 23092-2:2020, 3.20]
3.13
polymerase chain reaction
PCR
in vitro enzymatic technique to increase the number of copies of a specific DNA fragment by several
orders of magnitude
[SOURCE: ISO 16577:2022, 3.6.47]
3.14
quality score
Phred quality score
Q score
sequencing quality score of a given nucleotide base
Note 1 to entry: Q is defined by the following equation: $Q = -10 \log_{10}(e)$, where $e$ is the estimated
probability of the base call being wrong.
Note 2 to entry: A quality score of 20 represents an error rate of 1 in 100, with a corresponding call accuracy of
99 %.
Note 3 to entry: Higher quality scores indicate a smaller probability of error. Lower quality scores can result in a
significant portion of the reads being unusable. Low quality scores may also indicate false-positive variant calls,
resulting in inaccurate conclusions.
[SOURCE: ISO/DIS 20397-2:2021, 3.30]
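The relation in Note 1 to entry can be checked numerically. The following is a minimal sketch, not part of the specification; the function names `phred_from_error` and `error_from_phred` are illustrative assumptions.

```python
import math


def phred_from_error(e: float) -> float:
    """Return Q = -10 * log10(e) for an error probability e in (0, 1]."""
    if not 0.0 < e <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(e)


def error_from_phred(q: float) -> float:
    """Invert the relation: e = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # Note 2 to entry: Q = 20 corresponds to an error rate of 1 in 100.
    print(phred_from_error(0.01))
    print(error_from_phred(20))
```

With these helpers, an error probability of 0.01 maps to a quality score of 20, which matches the 99 % call accuracy stated in Note 2 to entry.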
3.15
read type
type of run in the sequencing instrument
Note 1 to entry: It can be either single-end or paired-end.
Note 2 to entry: Single-end: in a single-read run, the sequencing instrument reads from one end of a fragment to
the other end.
Note 3 to entry: Paired-end: a paired-end run reads from one end to the other and then starts another round of
reading from the opposite end.
[SOURCE: ISO/TS 20428:2017, 3.27]
3.16
reference sequence
nucleic acid sequence with biological relevance
Note 1 to entry: Each reference sequence is indexed by a one-dimensional integer coordinate system whereby each
integer within range identifies a single nucleotide. Coordinate values can only be equal to or larger than zero. The
coordinate system in the context of this standard is zero-based (i.e., the first nucleotide has coordinate 0, and it is
said to be at position 0) and linearly increases within the string from left to right.
[SOURCE: ISO/IEC 23092-1:2020, 3.22]
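The zero-based coordinate convention in Note 1 to entry maps directly onto string indexing. The sketch below is illustrative only; the helper names `to_one_based` and `fetch` are assumptions, and the one-based conversion is included only because many genomic file formats display positions that way.

```python
def to_one_based(pos0: int) -> int:
    """Convert a zero-based coordinate to the equivalent one-based position."""
    if pos0 < 0:
        raise ValueError("zero-based coordinates cannot be negative")
    return pos0 + 1


def fetch(reference: str, start0: int, length: int) -> str:
    """Return `length` nucleotides starting at zero-based position `start0`."""
    if start0 < 0 or length < 0 or start0 + length > len(reference):
        raise ValueError("requested range lies outside the reference sequence")
    return reference[start0:start0 + length]


if __name__ == "__main__":
    reference = "ACGTACGT"
    # The first nucleotide has coordinate 0.
    print(fetch(reference, 0, 1), to_one_based(0))
```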
3.17
read
sequence read
fragmented nucleotide sequences that are used to reconstruct the original sequence for next-generation
sequencing technologies
[SOURCE: ISO/TS 20428:2017, 3.26]
3.18
variation
sequence variation
DNA sequence variation
differences of DNA sequence among individuals in a population
[SOURCE: ISO 25720:2009, 4.8]
3.19
small indel
insertion (3.8) or deletion (3.2.1) of 2 nucleotides to 100 nucleotides
[SOURCE: ISO/TS 20428:2017, 3.32]
3.20
subject of care
person who uses, or is a potential user of, a healthcare service
[SOURCE: ISO/TS 22220:2011, 3.2, modified — Note to entry and second preferred term deleted.]
3.21
target capture
method to capture genomic regions of interest from a DNA sample prior to sequencing
[SOURCE: ISO/TS 20428:2017, 3.36]
3.22
targeted sequencing
technique used for sequencing only selected/targeted genomic regions of interest from a DNA sample
[SOURCE: ISO/TS 22692:2020, 3.22, modified — Note to entry and second preferred term deleted.]
3.23
whole exome sequencing
WES
technique for sequencing the exomes of the protein-coding genes in a genome
[SOURCE: ISO/TS 20428:2018, 3.38]
3.24
whole genome sequencing
WGS
technique that determines the complete DNA sequence of an organism's genome at a single time
[SOURCE: ISO/TS 20428:2017, 3.39]
4 Abbreviated terms
ATC  Anatomical Therapeutic Chemical
CDC  Centers for Disease Control and Prevention
CTLA4 Cytotoxic T-Lymphocyte Associated Protein 4
EBI  European Bioinformatics Institute
FHIR Fast Healthcare Interoperability Resources
HL7  Health Level Seven
IDMP Identification of Medicinal Product
IMPID Investigational MPID
INN  International Nonproprietary Names
MPID Medicinal Product Identifier
NCCN National Comprehensive Cancer Network
NGS  Next Generation Sequencing
NIH  National Institutes of Health
PD-L1 Programmed death-ligand 1
PD-1 Programmed cell death protein 1
SPREC Standard PREanalytical Code
WHO World Health Organization
UTN Universal Trial Number

5 Microsatellite instability (MSI)
The DNA mismatch repair (MMR) pathway plays an important role in the cell cycle by recognizing and
repairing mismatches that arise during DNA replication. Its major components are four key enzymes
encoded by the genes MLH1, MSH2, MSH6, and PMS2; a fifth gene, EPCAM, is also relevant because its
alteration can silence MSH2. MMR function is lost when any of these five genes is inactivated by
mutation or by epigenetic silencing. This condition is called mismatch repair deficiency (dMMR). One
of the most closely related diseases is Lynch syndrome [6]. Lynch syndrome, also known as hereditary
non-polyposis colorectal cancer (HNPCC), is the most common cause of hereditary colorectal cancer.
People with Lynch syndrome are more likely to get colorectal cancer and other cancers at a younger age
(under 50). Patients develop dMMR tumors following the inactivation of the second wild-type allele
through somatic mutation, loss of heterozygosity, or epigenetic silencing. These alterations (mutation
or epigenetic inactivation) are related not only to Lynch syndrome; they also differ by cancer type.
However, both lead to the accumulation of changes in short DNA sequences repeated at specific locations
throughout the genome and to an increased risk of malignant transformation in certain tissues. These
tumors have a higher frequency of somatic mutations than non-dMMR cancers and are assumed to have a
large range of tumor neoantigens (high tumor mutation burden) and a highly immunogenic signature,
including a high proportion of tumor-infiltrating lymphocytes. The abundant neoantigens that result
from defective mismatch repair can be recognized by the host immune system. Microsatellite instability
(MSI) is found in 1,5 % to 3,5 % of all human cancers, such as colorectal, endometrial, ovarian, and
cancers of the stomach, small intestine, pancreas, biliary tract, and ureter. The human genome contains
more than 19 million microsatellites, short tandem repeats of motifs of 1 nucleotide to 6 nucleotides,
typically spanning 10 nucleotides to 60 nucleotides in total length [7]. However, if MMR function is
impaired, nucleotide errors accumulate, especially at genomic positions that contain microsatellites.
Certain polymorphic microsatellites can also serve as an individual's molecular barcode, which can be
used in forensic identification.
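The description of a microsatellite above (a motif of 1 nucleotide to 6 nucleotides repeated in tandem, spanning roughly 10 nucleotides to 60 nucleotides) can be illustrated with a short scan over a sequence string. This is only a sketch of the definition, not a method for determining MSI status, which this document explicitly leaves out of scope; the function name `find_microsatellites` and the regular expression are assumptions made for illustration.

```python
import re

# A lazily matched motif of 1-6 nucleotides followed by at least one more
# copy of itself. Lazy matching prefers the shortest repeating unit.
_TANDEM = re.compile(r"([ACGT]{1,6}?)\1+")


def find_microsatellites(seq: str, min_len: int = 10, max_len: int = 60):
    """Return (zero-based start, motif, copy count) for each tandem repeat
    whose total length lies between min_len and max_len nucleotides."""
    hits = []
    for match in _TANDEM.finditer(seq.upper()):
        motif = match.group(1)
        total = match.end() - match.start()
        if min_len <= total <= max_len:
            hits.append((match.start(), motif, total // len(motif)))
    return hits


if __name__ == "__main__":
    example = "GGC" + "A" * 12 + "TTG" + "CA" * 6 + "GT"
    for start, motif, copies in find_microsatellites(example):
        print(start, motif, copies)
```

Production pipelines usually work from a curated list of loci rather than a pattern scan; the scan is used here only because it makes the motif and length limits concrete.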
The status of MSI is used as a biomarker for predicting the prognosis of colorectal cancer or for
selecting patients for whom immune therapy is more likely to be effective. Recent clinical studies have
shown that MSI status predicts clinical benefit from immune checkpoint blockade (ICB) with PD-1/PD-L1
interaction inhibitors. In May 2017, a new drug, pembrolizumab (KEYTRUDA®¹), which is a humanized
antibody against the programmed death receptor-1 (PD-1), was approved for the treatment of patients with
an
...

FINAL DRAFT TECHNICAL SPECIFICATION ISO/DTS 4425
ISO/TC 215/SC 1
Secretariat: KATS
Voting begins on: 2023-01-04
Voting terminates on: 2023-03-01
Genomics informatics — Data elements and their metadata for describing the microsatellite
instability (MSI) information of clinical massive parallel DNA sequencing
Informatique génomique — Éléments de données et leurs métadonnées pour décrire les informations
relatives à l'instabilité des microsatellites (MSI) du séquençage massif parallèle d'ADN
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT
PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND
USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF
THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number ISO/DTS 4425:2023(E)
© ISO 2023

COPYRIGHT PROTECTED DOCUMENT
© ISO 2023
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Microsatellite instability (MSI)
6 Composition of elements for describing MSI status on clinical DNA NGS report
6.1 General
6.2 Summary part
6.3 Detail part
7 Fields and their nomenclature of required data
7.1 General
7.2 Clinical sequencing order
7.2.1 General
7.2.2 Clinical sequencing order code
7.2.3 Date and time
7.3 Information on subject of care
7.3.1 General
7.3.2 Subject of care identifier
7.3.3 Subject of care name
7.3.4 Subject of care birth date
7.3.5 Subject of care sex
7.3.6 Subject of care ancestry
7.3.7 Referring diagnosis
7.4 Information on legally authorized person ordering clinical sequencing
7.4.1 General
7.5 Performing laboratory
7.5.1 General
7.5.2 Basic information on performing laboratory
7.5.3 Information on report generator
7.5.4 Information of legally confirmed person on sequencing report
7.6 Biospecimen information
7.6.1 General
7.6.2 Type of specimen
7.7 MSI status result information
7.7.1 General
7.7.2 MSI status
7.8 Recommended treatment
7.8.1 General
7.8.2 Medication
7.8.3 Clinical trial information
7.8.4 Other recommendations
7.8.5 Supporting information
8 Fields and their nomenclature of optional data
8.1 General
8.2 Reference genome information
8.3 MSI information
8.3.1 General
8.3.2 Criteria of MSI status
8.3.3 Genomic position for determining MSI status
8.3.4 Genomic position against markers of alternative method
8.3.5 Clinical implication of MSI status
8.4 Sequencing information
8.4.1 Clinical sequencing date
8.4.2 Sequencing type
8.4.3 Quality control metrics
8.4.4 Sequencing platform information
8.4.5 Analysis platform information
Annex A (informative) Example structure of MSI status report
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 215, Health informatics, Subcommittee SC
1, Genomics informatics.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Massively parallel sequencing is a high-throughput analytical approach to nucleic acid sequencing that
allows whole genomes, transcriptomes, and specific nucleic acid targets to be sequenced. These advanced
technologies have been used in the clinical field, and clinical sequencing has been applied to realize
personalized medicine and precision medicine. ISO/TS 20428 [1] has been developed for clinical usage.
In the field of cancer treatment, various treatment strategies that differ from traditional anti-cancer
chemotherapies have been introduced. One of those strategies is to modulate the human immune system
so that it maintains its ability to eliminate cancer cells. Recent outcomes of clinical trials show that
this immune therapy is effective for some patients whose tumor mass has a specific molecular
characteristic, such as PD-L1 or CTLA4 surface protein expression [2]. As a result, these molecular
characteristics are used as biomarkers for selecting patients. In colon cancer, several clinical trials
have reported that the status of MSI (microsatellite instability) can serve as a biomarker, because
drugs based on immunotherapy are more effective for patients with MSI-H (high) [3].
The status of MSI can be calculated and reported from small nucleotide deletions in specific regions
of the human reference genome using NGS [4]. According to the US FDA, four NGS products have been
approved for companion diagnostics. Among these products, three provide MSI status and value on
their sequencing report. CLIA-certified labs or equivalent agencies in other countries also report MSI
status using their own methods [5]. It is forecast that more clinical NGS tests will be approved to
report MSI.
However, there is no standard for describing MSI status, value, and metadata. ISO/TS 20428 focuses
only on DNA variations compared with the reference genome. According to some research results,
MSI status and the way it is described can differ even when the same sequencing data are used. This
makes it difficult for clinicians and researchers to use MSI status results, both for clinical decisions
and for secondary analysis, when results are received from more than one sequencing lab. Related
metadata are essential to expand the usage of MSI status results.
This document describes the data elements and their standardized metadata for MSI status in electronic
health records. The clinical report for MSI will provide helpful information on bioinformatics
analysis to help clinical decisions.
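To give a rough idea of what such data elements could look like in software, the sketch below models a handful of them as a record and serializes it. The field names are hypothetical and loosely follow clause titles listed in the Contents (clinical sequencing order code, date and time, subject of care identifier, type of specimen, MSI status, reference genome information). The class name, types, and example values are assumptions and are not defined by this document.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class MsiReportSummary:
    """Hypothetical container for a few required and optional elements."""
    order_code: str            # clinical sequencing order code (assumed free text)
    order_datetime: str        # date and time as an ISO 8601 string
    subject_of_care_id: str    # subject of care identifier
    specimen_type: str         # type of specimen
    msi_status: str            # e.g. "MSI-H"; the value set is an assumption
    reference_genome: Optional[str] = None  # optional element


def to_json(report: MsiReportSummary) -> str:
    """Serialize the record with stable key order for exchange or logging."""
    return json.dumps(asdict(report), sort_keys=True)


if __name__ == "__main__":
    report = MsiReportSummary(
        order_code="ORD-0001",
        order_datetime=datetime(2023, 1, 4, 9, 30, tzinfo=timezone.utc).isoformat(),
        subject_of_care_id="SOC-42",
        specimen_type="tumor tissue",
        msi_status="MSI-H",
    )
    print(to_json(report))
```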
TECHNICAL SPECIFICATION ISO/DTS 4425:2022(E)
Genomics informatics — Data elements and their
metadata for describing the microsatellite instability (MSI)
information of clinical massive parallel DNA sequencing
1 Scope
This document identifies data elements and metadata to represent the information about microsatellite
instability (MSI) for reporting the value of the biomarker using clinical massive parallel DNA sequencing.
This document covers information about the MSI test result and related data, such as used resources,
data generation condition, and data processing information which are helpful to clinical diagnosis and
research.
This document is not intended
— for defining experimental protocols or methods for calculating the value of microsatellite instability
(MSI),
— for biological species other than human, or
— for Sanger sequencing methods.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 8601 (all parts), Date and time — Representations for information interchange
ISO/TS 22220:2011, Health informatics — Identification of subjects of health care
ISO/TS 27527:2010, Health informatics — Provider identification
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
biological specimen
biospecimen
specimen
sample of tissue, body fluid, food, or other substance that is collected or acquired to support the
assessment, diagnosis, treatment, mitigation or prevention of a disease, disorder or abnormal physical
state, or its symptoms
[SOURCE: ISO/TS 20428:2017, 3.34]
3.2
clinical sequencing
next generation sequencing or later sequencing technologies with human samples for clinical practice
and clinical trials
[SOURCE: ISO/TS 20428:2017, 3.5]
3.2.1
deletion
contiguous removal of one or more bases from a genomic sequence
[SOURCE: ISO/IEC 23092-2:2020, 3.4]
3.3
deoxyribonucleic acid
DNA
molecule that encodes genetic information in the nucleus of cells
[SOURCE: ISO 25720:2009, 4.7]
3.4
DNA sequencing
determining the order of nucleotide bases (adenine, guanine, cytosine and thymine) in a molecule of
DNA
Note 1 to entry: Sequence is generally described from the 5' end.
[SOURCE: ISO/TS 17822:2020, 3.19]
3.5
exome
part of the genome formed by exons
[SOURCE: ISO/TS 20428:2017, 3.13]
3.6
gene
basic unit of hereditary material that encodes and controls the expression of a protein or protein
subunit
3.7
indel
insertion (3.8) and/or deletion (3.2.1)
[SOURCE: ISO/TS 20428:2017, 3.18]
3.8
insertion
contiguous addition of one or more bases into a genomic sequence
[SOURCE: ISO/IEC 23092-2:2020, 3.18]
3.9
microsatellite
repetitive DNA elements, also known as simple sequence repeats (SSR), consisting of short in tandem
repeat motifs of one to a few nucleotides that tend to occur in non-coding DNA of eukaryotic genomes
and that are sometimes referred to as variable number of tandem repeats (VNTRs)
3.10
microsatellite instability
MSI
condition of genetic hypermutability (predisposition to mutation) that results from impaired DNA
mismatch repair (MMR)
3.11
DNA mismatch repair
MMR
system for recognizing and repairing erroneous insertion, deletion, and misincorporation of bases that
can arise during DNA replication and recombination, as well as repairing some forms of DNA damage
3.12
nucleotide
monomer of a nucleic acid polymer such as DNA or RNA
Note 1 to entry: Nucleotides are denoted as letters ('A' for adenine; 'C' for cytosine; 'G' for guanine; 'T' for thymine
which only occurs in DNA; and 'U' for uracil, which only occurs in RNA). The chemical formula for a specific
DNA or RNA molecule is given by the sequence of its nucleotides, which can be represented as a string over the
alphabet ('A',' C',' G', 'T') in the case of DNA, and a string over the alphabet ('A', 'C', 'G', 'U') in the case of RNA. Bases
with unknown molecular composition are denoted with 'N'.
[SOURCE: ISO/IEC 23092-2:2020, 3.20]
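The alphabets described in Note 1 to entry lend themselves to a simple validity check. The sketch below is illustrative only; the names `is_valid_sequence`, `DNA_ALPHABET` and `RNA_ALPHABET` are assumptions, and an empty string is treated as valid purely for simplicity.

```python
DNA_ALPHABET = frozenset("ACGT")
RNA_ALPHABET = frozenset("ACGU")
UNKNOWN = "N"  # bases with unknown molecular composition


def is_valid_sequence(seq: str, molecule: str = "DNA", allow_unknown: bool = True) -> bool:
    """Check that every symbol in seq belongs to the DNA or RNA alphabet."""
    if molecule == "DNA":
        alphabet = DNA_ALPHABET
    elif molecule == "RNA":
        alphabet = RNA_ALPHABET
    else:
        raise ValueError("molecule must be 'DNA' or 'RNA'")
    if allow_unknown:
        alphabet = alphabet | {UNKNOWN}
    return set(seq) <= alphabet


if __name__ == "__main__":
    print(is_valid_sequence("ACGTN"))
    print(is_valid_sequence("ACGU", molecule="RNA"))
```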
3.13
polymerase chain reaction
PCR
in vitro enzymatic technique to increase the number of copies of a specific DNA fragment by several
orders of magnitude
[SOURCE: ISO 16577:2022, 3.6.47]
3.14
quality score
Phred quality score
Q score
sequencing quality score of a given nucleotide base
Note 1 to entry: Q is defined by the following equation: $Q = -10 \log_{10}(e)$, where $e$ is the estimated
probability of the base call being wrong.
Note 2 to entry: A quality score of 20 represents an error rate of 1 in 100, with a corresponding call accuracy of
99 %.
Note 3 to entry: Higher quality scores indicate a smaller probability of error. Lower quality scores can result in a
significant portion of the reads being unusable. Low quality scores may also indicate false-positive variant calls,
resulting in inaccurate conclusions.
[SOURCE: ISO 20397-2:2021, 3.30]
3.15
read type
type of run in the sequencing instrument
Note 1 to entry: It can be either single-end or paired-end.
Note 2 to entry: Single-end: in a single-read run, the sequencing instrument reads from one end of a fragment to
the other end.
Note 3 to entry: Paired-end: a paired-end run reads from one end to the other and then starts another round of
reading from the opposite end.
[SOURCE: ISO/TS 20428:2017, 3.27]
3.16
reference sequence
nucleic acid sequence with biological relevance
Note 1 to entry: Each reference sequence is indexed by a one-dimensional integer coordinate system whereby
each integer within range identifies a single nucleotide. Coordinate values can only be equal to or larger than
zero. The coordinate system in the context of this standard is zero-based (i.e., the first nucleotide has coordinate
0, and it is said to be at position 0) and linearly increases within the string from left to right.
[SOURCE: ISO/IEC 23092-1:2020, 3.22]
3.17
read
sequence read
fragmented nucleotide sequences that are used to reconstruct the original sequence for next-generation
sequencing technologies
[SOURCE: ISO/TS 20428:2017, 3.26]
3.18
variation
sequence variation
DNA sequence variation
differences of DNA sequence among individuals in a population
[SOURCE: ISO 25720:2009, 4.8]
3.19
small indel
insertion (3.8) or deletion (3.2.1) of 2 nucleotides to 100 nucleotides
[SOURCE: ISO/TS 20428:2017, 3.32]
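The size range in this definition can be expressed as a simple check. The sketch below assumes alleles are given as reference and alternate strings, in the style of common variant call formats, and that the indel size is the difference in their lengths. Both the representation and the function names are illustrative assumptions, and complex substitutions are not handled.

```python
def indel_size(ref_allele: str, alt_allele: str) -> int:
    """Number of nucleotides inserted or deleted between the two alleles."""
    return abs(len(alt_allele) - len(ref_allele))


def is_small_indel(ref_allele: str, alt_allele: str) -> bool:
    """True when 2 to 100 nucleotides are inserted or deleted."""
    return 2 <= indel_size(ref_allele, alt_allele) <= 100


if __name__ == "__main__":
    # Loss of one "AC" unit from a dinucleotide repeat: a 2-nucleotide deletion.
    print(is_small_indel("ACAC", "AC"))
```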
3.20
subject of care
person who uses, or is a potential user of, a healthcare service
[SOURCE: ISO/TS 22220:2011, 3.2, modified — Note to entry and second preferred term deleted.]
3.21
target capture
method to capture genomic regions of interest from a DNA sample prior to sequencing
[SOURCE: ISO/TS 20428:2017, 3.36]
3.22
targeted sequencing
technique used for sequencing only selected/targeted genomic regions of interest from a DNA sample
[SOURCE: ISO/TS 22692:2020, 3.22, modified — Note to entry and second preferred term deleted.]
3.23
whole exome sequencing
WES
technique for sequencing the exomes of the protein-coding genes in a genome
3.24
whole genome sequencing
WGS
technique that determines the complete DNA sequence of an organism's genome at a single time
[SOURCE: ISO/TS 20428:2017, 3.39]
4 Abbreviated terms
ATC Anatomical Therapeutic Chemical
CTLA4 Cytotoxic T-Lymphocyte Associated Protein 4
EBI European Bioinformatics Institute
FHIR Fast Healthcare Interoperability Resources
HL7® Health Level Seven
IDMP Identification of Medicinal Product
IMPID Investigational MPID
INN International Nonproprietary Names
MPID Medicinal Product Identifier
NCCN National Comprehensive Cancer Network
NGS Next Generation Sequencing
NIH National Institutes of Health
PD-L1 Programmed death-ligand 1
PD-1 Programmed cell death protein 1
SPREC Standard PREanalytical Code
WHO World Health Organization
UTN Universal Trial Number
5 Microsatellite instability (MSI)
The DNA mismatch repair (MMR) pathway plays an important role in the cell cycle by recognizing
and repairing mismatches that arise during DNA replication. Its major components are four key
proteins encoded by the genes MLH1, MSH2, MSH6, and PMS2; a fifth gene, EPCAM, is also implicated
because its deletion can silence the adjacent MSH2 gene. MMR function is lost when mutational or
epigenetic inactivation occurs in these five genes. This condition is called deficiency of
mismatch repair (dMMR). One of the most closely related diseases is Lynch syndrome [6]. Lynch
syndrome, also known as hereditary non-polyposis colorectal cancer (HNPCC), is the most common
cause of hereditary colorectal cancer. People with Lynch syndrome are more likely to develop
colorectal cancer and other cancers at a younger age (under 50). Patients develop dMMR tumors
following the inactivation of the second wild-type allele through somatic mutation, loss of
heterozygosity, or epigenetic silencing. These alterations, whether mutation or epigenetic
inactivation, are not limited to Lynch syndrome, and their pattern differs between cancer types.
However, both lead to the accumulation of errors in short DNA sequences repeated at specific
locations throughout the genome and to an increased risk of malignant transformation in certain
tissues. These tumors have a higher frequency of somatic mutations than non-dMMR cancers and are
assumed to have a large range of tumor neoantigens (high tumor mutation burden) and a highly
immunogenic signature, including a high proportion of tumor-infiltrating lymphocytes. Defective
mismatch repair results in a high tumor mutation burden and abundant neoantigen formation, and
these neoantigens can be recognized by the host immune system. Microsatellite instability (MSI)
is found in 1,5 % to 3,5 % of all human cancers, including colorectal, endometrial, and ovarian
cancers and cancers of the stomach, small intestine, pancreas, biliary tract, and ureter. The
human genome contains more than 19 million microsatellites, short tandem repeats of motifs of
1 nucleotide t
...
