ISO 20397-3:2025
Biotechnology — Massively parallel sequencing — Part 3: General requirements and guidance for metagenomics
Biotechnologie — Séquençage à grande échelle — Partie 3: Exigences générales et recommandations pour la métagénomique
International Standard ISO 20397-3
First edition
2025-07
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principle
4.1 General
5 Sampling strategy
5.1 General
5.2 Primary sample type
5.3 Primary sample stabilization and storage
5.4 Primary sample transportation
5.5 DNA/RNA isolation
5.6 DNA/RNA sample quality
6 Nucleic acid library preparation
7 Design and review process including sequencing strategy and assessment
7.1 General
7.2 Short read
7.3 Long read
7.4 Hybrid assembly
7.5 Sample preparation and library construction
7.6 Assessment
8 Database construction
8.1 General
8.2 Public database
8.3 Self-build database
9 Bioinformatics analysis
9.1 General
9.2 Identification list of microorganisms
10 Validation and verification
10.1 General
10.2 In silico sequence control for bioinformatics pipeline evaluation
10.3 Real sample control for pipeline evaluation
11 Evaluation
12 Test report
12.1 General
12.2 Test report content
Annex A (informative) Checklist for NA sample quality assessment before library construction
Annex B (informative) Methods for sample stabilization and storage
Annex C (informative) Bioinformatics pipeline
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 276 Biotechnology, Subcommittee SC 1,
Analytical methods.
A list of all parts in the ISO 20397 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Massively parallel sequencing (MPS) is a high-throughput analytical approach to nucleic acid sequencing
utilizing massively parallel processing, that allows whole genomes, transcriptomes, and specific nucleic acid
targets from different organisms to be investigated in a relatively short time.
Metagenomics approaches are an extremely powerful strategy in large-scale genomics applications as
a way to study the taxonomic and functional composition of microbial communities from environmental,
agricultural, and clinical primary samples/samples. Metagenomics does not require the isolation of a single
bacterium from a complex microbial community; instead, it catalogues the community by sequencing all genes
and genomes from total DNA (tDNA). It offers great advantages for identifying new species, including
microorganisms that are difficult to culture under typical laboratory conditions.
Analysing metagenomics data involves a complex and statistically driven process that extends beyond
traditional MPS pipelines to include identification, functional and relative abundance analyses. In
metagenomics, whole genomic DNA is prepared from primary samples, regardless of their microbial
composition, and is characterized by whole genome sequencing. The annotation of the resulting DNA fragments,
individual reads or assembled sequence contigs, to individual taxonomic groups or known genome sequences,
is carried out by sophisticated bioinformatic tools. The analysis also focuses on the identification of the
functional composition of a microbial community, which includes the
assignment of protein-coding open reading frames to functional categories, such as protein domain families
or gene ontologies. Consequently, the analysis of whole genome sequencing (WGS) metagenomics data
sets involves a significant statistical component, as sequence data must be evaluated based on relative
abundances rather than on absolute presence/absence data.
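The point about relative abundances can be illustrated with a minimal sketch. It assumes that each read has already been assigned a taxon label by some classifier; the labels `taxon_A` and `taxon_B`, the function name `relative_abundance` and the read counts are hypothetical and are not taken from this document.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Convert per-read taxon labels into fractions that sum to 1."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical classifications for two samples sequenced at different depths.
sample_a = ["taxon_A"] * 60 + ["taxon_B"] * 40
sample_b = ["taxon_A"] * 600 + ["taxon_B"] * 400

print(relative_abundance(sample_a))
print(relative_abundance(sample_b))
```

The two samples differ tenfold in raw read counts but yield the same fractions, which is one reason why comparisons between samples are usually made on relative rather than raw counts.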
International Standard ISO 20397-3:2025(en)
1 Scope
This document specifies general requirements and guidance for metagenomics-dedicated sample
preparation, and for generating and analysing metagenomics sequence data obtained from massively parallel
sequencing platforms. The specified metagenomics process includes the following stages:
a) sampling strategy and process, including type, storage, transportation, extraction and quality;
b) nucleic acid library preparation;
c) design and review process, including sequencing strategy and assessment;
d) database construction;
e) bioinformatics analysis and report;
f) validation and verification of the bioinformatics pipeline and database.
This document applies to laboratories and research organizations.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 20397-1, Biotechnology — Massively parallel sequencing — Part 1: Nucleic acid and library preparation
ISO 20397-2, Biotechnology — Massively parallel sequencing — Part 2: Quality evaluation of sequencing data
ISO/TS 24420, Biotechnology — Massive Parallel DNA Sequencing — General requirements for data processing
of shotgun metagenomics sequences
ISO 20395, Biotechnology — Requirements for evaluating the performance of quantification methods for nucleic
acid target sequences — qPCR and dPCR
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 20397-1, ISO 20397-2 and the
following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
library preparation
set of procedures used to prepare DNA or RNA fragments containing tags, and sequencing primer binding
regions for massively parallel sequencing (MPS)
[SOURCE: ISO 20397-1:2022, 3.6]
3.2
library
sequencing library
DNA, cDNA or RNA that has been prepared for MPS within a specific size range and typically
containing adapters and/or identifiers recognised for sequence specific priming, sequence capture, and/or
identification of specific extracts
Note 1 to entry: Libraries can be DNA or cDNA. cDNA libraries are prepared for RNA sequencing on most sequencers.
Some instruments can directly sequence RNA.
[SOURCE: ISO 20397-1:2022, 3.5]
3.3
algorithm
completely determined finite sequence of instructions by which the values of the output variables may be
calculated from the values of the input variables
[SOURCE: IEC 60050-351:2006, 351-21-37]
3.4
bioinformatics analysis
process that applies computational methods, algorithms, and statistical techniques to process,
analyse, and interpret biological data
3.5
bioinformatics pipeline
individual programs, scripts, or pieces of software linked together, where raw data or output from one
program is used as input for the next step in data processing
EXAMPLE The output from a base quality trimming program may be used as input to a de-novo assembler.
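The chaining described in this entry can be sketched as follows. This is an illustrative toy, not a pipeline specified by this document: the step names `trim_low_quality` and `drop_short`, the thresholds and the input reads are all assumptions, and a simple length filter stands in for the de novo assembler mentioned in the example.

```python
from functools import reduce


def trim_low_quality(records, min_quality=20):
    """Cut each read at the first base whose quality is below min_quality."""
    trimmed = []
    for seq, quals in records:
        cut = next((i for i, q in enumerate(quals) if q < min_quality), len(seq))
        trimmed.append((seq[:cut], quals[:cut]))
    return trimmed


def drop_short(records, min_length=4):
    """Discard reads that are shorter than min_length after trimming."""
    return [(seq, quals) for seq, quals in records if len(seq) >= min_length]


def run_pipeline(data, steps):
    """Feed the output of each step to the next step, in order."""
    return reduce(lambda acc, step: step(acc), steps, data)


# Hypothetical reads as (sequence, per-base quality scores).
raw = [
    ("ACGTACGT", [30, 30, 30, 30, 30, 10, 30, 30]),
    ("ACG", [30, 30, 30]),
    ("TTGCA", [30, 5, 30, 30, 30]),
]
result = run_pipeline(raw, [trim_low_quality, drop_short])
print(result)
```

Keeping each step as a function with the same input and output shape makes it straightforward to reorder, replace or validate steps independently.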
3.6
coverage
coverage depth
number of times that a given base position is read in a sequencing run
Note 1 to entry: The number of reads that cover a particular position.
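As a minimal sketch of this definition, the depth at each position can be computed by counting the reads that span it. The example assumes reads are given as 0-based, end-exclusive `(start, end)` intervals on a reference of known length; the function name `coverage_depth` and the coordinates are hypothetical.

```python
def coverage_depth(reads, reference_length):
    """Return the number of reads covering each position of the reference.

    reads: iterable of (start, end) intervals, 0-based and end-exclusive.
    """
    # Difference array: +1 where a read starts, -1 just after it ends.
    diff = [0] * (reference_length + 1)
    for start, end in reads:
        start = max(start, 0)
        end = min(end, reference_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    depth = []
    running = 0
    for delta in diff[:reference_length]:
        running += delta
        depth.append(running)
    return depth


# Three hypothetical reads aligned to a 10-base reference.
reads = [(0, 5), (2, 8), (4, 10)]
depth = coverage_depth(reads, 10)
print(depth)                    # per-position depth
print(sum(depth) / len(depth))  # mean depth across the reference
```

The difference-array approach runs in time proportional to the number of reads plus the reference length, rather than the product of the two.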
3.7
DNA
deoxyribonucleic acid
molecule that encodes genetic information
[SOURCE: ISO 25720:2009, 4.7, modified — in the nucleus of cells was deleted]
3.8
sequencing
determining the order and the content of nucleotide bases (adenine, guanine, cytosine, thymine, and uracil)
of a nucleic acid molecule
Note 1 to entry: A sequence is generally described from the 5’ to 3’ end.
[SOURCE: ISO/TS 17822-1:2014, 3.20, modified — DNA was deleted in the term; DNA was replaced by nucleic
acid, and uracil was added in the definition.]
3.9
sequence alignment
arrangement of nucleic acid sequences according to regions of similarity
Note 1 to entry: Sequence alignment does not necessarily require a reference genome or reference targeted nucleic
acid region, and its aim is not necessarily to produce an assembly.
Note 2 to entry: Sequence alignment may involve two or more sequences aligned to find overlaps or differences, or to build
contigs.
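The use of alignment to find an overlap and build a contig can be sketched with exact suffix-prefix matching. This is a deliberately simplified illustration: real aligners tolerate mismatches and gaps, and the function names, the `min_overlap` parameter and the sequences below are assumptions made for the example.

```python
def longest_suffix_prefix_overlap(left, right, min_overlap=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_into_contig(left, right, min_overlap=3):
    """Join two sequences on their exact overlap, or return None if none exists."""
    k = longest_suffix_prefix_overlap(left, right, min_overlap)
    if k == 0:
        return None
    return left + right[k:]


# Two hypothetical reads that share the four bases "TGCA".
print(longest_suffix_prefix_overlap("ACGTTGCA", "TGCAGGTC"))
print(merge_into_contig("ACGTTGCA", "TGCAGGTC"))
```

Requiring a minimum overlap length reduces the chance of joining two sequences that only match by coincidence over one or two bases.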
3.10
raw data
primary sequencing data produced by a sequencer without involving any software-based pre-filtering
Note 1 to entry: Raw data refers to the data prior to base calling. The data file containing raw signals from the
sequencer is not classified as raw data in this document.
3.11
RNA
ribonucleic acid
polymer of ribonucleotides occurring in a double-stranded or single-stranded form
[SOURCE: ISO 22174:2024, 3.1.9]
3.12
read
sequence read
fragmented nucleotide sequence generated by a sequencing device
Note 1 to entry: A read is a deduced sequence of nucleic acid base pairs (or base pair probabilities) corresponding to
all (or part of) a single nucleic acid fragment. The term usually refers to sequences obtained from MPS experiments.
3.13
reference sequence
nucleic acid sequence used either to align by mapping sequence reads or as the basis for annotations such as
genes and sequence variations
3.14
mapping
determination of the origin of a sequence (read) in a reference sequence
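A minimal sketch of this idea is to search a reference sequence for exact occurrences of a read on either strand. Practical mappers use indexed, mismatch-tolerant algorithms; the naive search below, together with the names `map_read` and `reverse_complement` and the sequences used, is only an assumption-laden illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def map_read(read, reference):
    """Return sorted (position, strand) pairs for exact matches of the read.

    Positions are 0-based offsets in the reference; strand is "+" or "-".
    """
    hits = []
    for strand, query in (("+", read), ("-", reverse_complement(read))):
        pos = reference.find(query)
        while pos != -1:
            hits.append((pos, strand))
            pos = reference.find(query, pos + 1)
    return sorted(hits)


# Hypothetical reference and reads.
reference = "TTACGGATCCA"
print(map_read("ACGG", reference))  # forward-strand match
print(map_read("TGGA", reference))  # match via the reverse complement
print(map_read("GGGG", reference))  # no match
```

A read that matches at several positions returns several hits, which mirrors the ambiguity that repeated regions cause when determining the origin of a read.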
3.15
massively parallel sequencing
MPS
next generation sequencing
non-Sanger-based high-throughput nucleic acid sequencing
Note 1 to entry: Millions or billions of nucleic acid strands can be sequenced in parallel, yielding substantially more
throughput.
Note 2 to entry: NGS (next generation sequencing) is also well recognized as MPS in the ISO 20397 series.
Note 3 to entry: MPS or NGS covers long read sequencing and short read sequencing.
3.16
paired end reads
paired sequencing reads from both ends of a DNA fragment
Note 1 to entry: In paired-end sequencing, the instrument sequences both ends of short inserts typically ranging from
75 bp to 600 bp.
3.17
microbiome
microbiota in a particular environment (in human or non-human)
3.18
knowledge-based report
metagenomics report that incorporates knowledge of microorganism function, or additional information, drawn
from the identification list and annotation results of the metagenomics analysis
3.19
knowledge database
database with knowledge of microorganism function or extra information relevant to specific applications,
such as clinical screening of infection
3.20
room temperature
temperature defined as being the range of 18 °C to 25 °C
3.21
targeted metagenomics
approach that amplifies and sequences specific selected regions to identify organism groups based on variable regions
3.22
metagenomic shotgun sequencing
approach that sequences all DNA/RNA in a sample without bias
4 Principle
4.1 General
Metagenomics is the study of genetic material recovered directly from environmental primary samples. This
includes not only genomic DNA from microorganisms but also mobile genetic elements, such as plasmids,
phages, and other extrachromosomal elements. Metagenomics is useful when attempting to understand
what microbes are present, what they are doing and how genetic material is shared within a particular
environment.
NOTE These mobile genetic elements can exist within, and be identified in, multiple organisms.
There are two types of metagenomics approach: targeted metagenomics sequencing and metagenomic
shotgun sequencing. The former is being phased out as sequencing costs fall and technology improves,
but it is still used frequently because each approach has its own advantages and disadvantages. Both can answer the
question of what is in the primary sample/sample, but only shotgun metagenomics can truly address the
functional composition of microbial communities. This is because it provides more comprehensive sequence
data or assembly results, leading to more detailed annotations that explain the functional aspects of the
data, such as sequences or species.
— Targeted Metagenomics: In this application, certain conserved regions (16S rRNA, 18S rRNA, ITS regions)
are amplified with PCR primers or capture methods and sequenced. These regions contain variable
sequences that enable the identification of various organism groups. However, accurately identifying
organisms at the species level is unreliable with this method. Additionally, it does not provide direct
insights into the functional roles of these organisms based on the data obtained. Some significant
drawbacks include: a) bias stemming from primer choice, b) inability to determine antimicrobial
resistance (AMR), c) limited resolution for species identification, and others.
NOTE 1 Targeted metagenomics does not include exome-capture sequencing.
NOTE 2 16S rRNA is a gene that codes for a ribosomal RNA (rRNA) molecule found in bacteria and certain
archaea. It is part of the small subunit (30S) of the ribosome and plays a crucial role in the translation of mRNA
into proteins. The 16S rRNA gene is widely used in microbial ecology and taxonomy studies because it contains
both conserved (similar across species) and variable regions (unique to specific groups), making it useful for
identifying and classifying bacteria and archaea.
NOTE 3 18S rRNA gene is found in eukaryotes (organisms with a nucleus, including protists, fungi, plants, and
animals). It is a component of the small subunit (40S) of the eukaryotic ribosome and functions similarly to the
16S rRNA in translation. Like its bacterial counterpart, the 18S rRNA gene also contains conserved and variable
regions, which are used for phylogenetic analysis and taxonomic classification of eukaryotic organisms.
NOTE 4 ITS regions (internal transcribed spacer) are found between the rRNA genes in the nuclear ribosomal
DNA of eukaryotes. They include the ITS1 and ITS2 regions, which are separated by the 5,8S rRNA gene. ITS
regions evolve more rapidly than rRNA genes and contain both conserved and variable regions. These regions
are highly useful for identifying and distinguishing between closely related species or strains within eukaryotic
groups, such as fungi and plants, due to their variability and rapid evolution.
— Shotgun Metagenomics: This method is non-discriminant in that it will sequence all DNA/RNA present
in the sample. This can enable assignment of taxonomy, quantitation of the number of species, and
assignment of the functional composition of microbial communi
...







