Genomics informatics — Requirements of data analysis for direct-to-consumer testing

This document specifies the requirements for genetic data analysis relating to direct-to-consumer (DTC) testing, including preprocessing, detection site, evaluation models, the use of databases and the elements of assessment reports. This document applies to the analysis of genetic data from DTC testing without the involvement of a health care provider.

Informatique génomique — Exigences d'analyse des données pour les tests en libre accès

General Information

Status: Published
Publication Date: 04-Mar-2026

ICS: 35.240.80 - IT applications in health care technology

Technical Committee: ISO/TC 215/SC 1 - Genomics Informatics
Drafting Committee: ISO/TC 215/SC 1 - Genomics Informatics

Current Stage: 6060 - International Standard published
Start Date: 05-Mar-2026
Due Date: 04-Sep-2027
Completion Date: 05-Mar-2026

Overview

ISO/TS 20738:2026 Genomics informatics - Requirements of data analysis for direct-to-consumer testing sets out standardized requirements for the analysis of genetic data generated through direct-to-consumer (DTC) testing services. This technical specification addresses the complete workflow for genetic data analysis in DTC contexts, including the preprocessing of raw data, quality control, application of evaluation models, use of genetic and phenotypic databases, and the compilation of assessment reports. It applies to genetic testing provided to consumers without the involvement of health care professionals, and it focuses on accuracy, transparency, and consumer confidence in the results.

The standard supports the growing demand for personal genomics by offering guidance that facilitates robust data analysis methods across diverse DTC testing scenarios, including ancestry, health risk, traits, and pharmacogenomic applications.

Key Topics

Data Analysis Process: Outlines validated steps for DTC genetic data interpretation, such as initial data integrity checks, preprocessing, genotype imputation, and variant annotation.
Quality Control: Specifies thresholds for raw data from DNA chips and whole genome sequencing (WGS), including requirements for file formats (such as VCF and FASTQ), sequencing depth, call rates, and quality score metrics (Q20, Q30).
Evaluation Models and Databases:
- Use of appropriate haplotype reference panels based on population genetics.
- Guidance for employing public mutation frequency databases (e.g., dbSNP, gnomAD) and phenotype annotation databases (e.g., OMIM, ClinVar).
- Quality filtering criteria for variant imputation based on widely accepted industry standards.
Reporting Requirements: Defines the structure and content of DTC genetic test reports, including clear disclaimers that results are for research use only, quality assurance summaries, database version documentation, and explicit labeling of imputed data.
Data Privacy and Consent: Establishes rules for secondary data use, anonymization, data sharing restrictions, cross-border compliance (GDPR, HIPAA), and the requirement for consumer consent.

Applications

ISO/TS 20738:2026 is designed to bring consistency and reliability to genomic data analysis in the DTC industry. It is particularly relevant for:

DTC Genetics Companies: Ensures product quality, increases consumer trust, and aligns with global benchmarks for genetic data handling and reporting.
Bioinformatics Solution Providers: Supports the development and validation of analytic pipelines compatible with internationally recognized standards.
Data Science Teams: Guides the curation and filtration of genotype and sequencing data for application in ancestry, wellness, and trait reports.
Regulatory Authorities and Auditors: Provides an authoritative reference for evaluating DTC test providers’ compliance with best practices for laboratory data analysis, variant interpretation, and consumer data protection.
Consumers and Advocacy Organizations: Assures individuals that their genetic information is assessed according to clear quality and privacy requirements.

The standard is highly relevant for fields such as personal genomics, medical informatics, population genetics, and consumer health technology.

Related Standards

The field of genomics informatics and DTC genetic testing interfaces with several other international standards and regulatory requirements, including:

ISO 15189: Medical laboratories - Requirements for quality and competence
ISO/IEC 23092: Information technology - Genomic information representation
ISO 20397-2: Biotechnology - Massively parallel sequencing - Part 2: Quality evaluation of sequencing data
Regulation (EU) 2016/679 (GDPR): Data protection and privacy for personal data
US CLIA (42 CFR Part 493): Clinical Laboratory Improvement Amendments
FDA Guidance on NGS-Based IVDs
HGVS Nomenclature: For standardized description of variants
Common Public Databases: dbSNP, gnomAD, OMIM, ClinVar

By adopting ISO/TS 20738:2026, organizations contribute to the responsible development of consumer genomics, safeguarding both scientific robustness and consumer rights in a rapidly evolving sector.

Keywords: ISO/TS 20738, genomics informatics, direct-to-consumer genetic testing, DTC testing, genetic data analysis, quality control, whole genome sequencing, variant annotation, VCF, data privacy, consent, evaluation models, reporting requirements.

Technical specification

ISO/TS 20738:2026 - Genomics informatics — Requirements of data analysis for direct-to-consumer testing

Release Date: 05-Mar-2026

English language (14 pages)

Frequently Asked Questions

What is ISO/TS 20738:2026?

ISO/TS 20738:2026 is a technical specification published by the International Organization for Standardization (ISO). Its full title is "Genomics informatics — Requirements of data analysis for direct-to-consumer testing". It specifies the requirements for genetic data analysis relating to direct-to-consumer (DTC) testing, including preprocessing, detection site, evaluation models, the use of databases and the elements of assessment reports. It applies to the analysis of genetic data from DTC testing without the involvement of a health care provider.

What ICS categories does ISO/TS 20738:2026 belong to?

ISO/TS 20738:2026 is classified under the following ICS (International Classification for Standards) categories: 35.240.80 - IT applications in health care technology. The ICS classification helps identify the subject area and facilitates finding related standards.

How can I access ISO/TS 20738:2026?

ISO/TS 20738:2026 is available in PDF format for immediate download after purchase. The document can be added to your cart and obtained through the secure checkout process. Digital delivery ensures instant access to the complete standard document.

Standards Content (Sample)

Technical Specification
ISO/TS 20738
First edition
2026-03
Genomics informatics — Requirements of data analysis for direct-to-consumer testing
Informatique génomique — Exigences d'analyse des données pour les tests en libre accès
Reference number
© ISO 2026
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Data analysis process
5 Quality control of raw data
5.1 DNA chip data preprocessing and quality control requirements
5.1.1 Data preprocessing
5.1.2 Data quality control
5.2 Whole genome sequencing quality requirements
5.2.1 Sequencing type and data quality
5.2.2 Sequencing data comparison and quality control
6 Evaluation model and database
6.1 DNA chip analysis requirements
6.1.1 DNA chip selection
6.1.2 Genotyping analysis
6.1.3 Genotype imputation analysis
6.2 WGS analysis requirements
6.2.1 Variant detection, genotype imputation and quality control
6.2.2 Variant site annotation
6.2.3 Interpretation of variation
7 Evaluation report
7.1 Interpretation
7.2 Use and disclosure of data
Annex A (normative) VCF format file example
Annex B (informative) Annotation databases
Annex C (informative) Call rate thresholds by application
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 215, Health informatics, Subcommittee SC 1,
Genomics Informatics.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
With people's increasing awareness of their right to know their own body and of the need for disease
prevention, prediction, participation and personalized treatment, and with the rapid development of
sequencing technology, genetic testing has expanded from clinical application to general consumer
application. Direct-to-consumer (DTC) testing refers to genetic testing that individuals can order without
needing a clinician or a health care provider. These tests typically analyze DNA from a sample, often saliva,
to provide insights into various genetic traits.
DTC tests cover a wide range of genetic analyses, including ancestry and heritage (understanding ethnic
background and lineage), health and disease risk (identifying genetic predispositions to conditions such as
cancer or heart disease), traits and lifestyle (examining genetic influences on taste preferences, hair loss,
or lactose digestion), and pharmacogenomics (assessing how genetic variations affect drug metabolism). DTC
testing improves awareness of and attention to certain diseases, and it allows existing precautions to be
adjusted under the guidance of professionals. It provides the necessary basis for the formation of personalized
disease prevention programs. As an increasingly prevalent link between clinical care and lifestyle, DTC
testing has grown enormously in both actual and expected use, becoming more and more indispensable
in the genetic testing ecosystem.
This document is based on current DTC industry data, combined with the needs of upstream and downstream
industry users. It puts forward general requirements and suggestions on the data and technical content of
genotype imputation technology, analysis and interpretation of results, as well as specific requirements in
the development of a supporting evaluation model and database. With this document’s specifications as the
basis of data analysis in the development of DTC testing products and services, consumers can have greater
confidence in the conclusions drawn from the data, thereby facilitating greater confidence in DTC testing.

Technical Specification ISO/TS 20738:2026(en)
Genomics informatics — Requirements of data analysis for
direct-to-consumer testing
1 Scope
This document specifies the requirements for genetic data analysis relating to direct-to-consumer (DTC)
testing, including preprocessing, detection site, evaluation models, the use of databases and the elements of
assessment reports.
This document applies to the analysis of genetic data from DTC testing without the involvement of a health
care provider.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
coverage
coverage depth
number of times that a given base position is read in a sequencing run
Note 1 to entry: The number of reads that cover a particular position.
[SOURCE: ISO 20397-2:2021 3.6]
3.2
DNA chip
DNA microarray
solid substrate where a collection of probe DNA arranged in a specific design is attached in a high-density
fashion, directly or indirectly, that assays large amounts of biological material using high-throughput
screening methods
[SOURCE: ISO 16577:2022 3.4.13]
3.3
direct-to-consumer
DTC
retail business model which eliminates any intermediaries and sells directly to the consumer
Note 1 to entry: Also referred to as business to consumer (B2C).
Note 2 to entry: The sample (e.g. blood, saliva, cheek swab (cells from the buccal cavity), fecal matter or nail
clippings) is provided by the consumer, on the assumption that the collection protocol provided by the business
has been followed.

3.4
FASTQ
genomic information representation that includes FASTA and quality values
[SOURCE: ISO/IEC 23092-2:2024, 3.8]
3.5
GC content
proportion of guanine and cytosine in a DNA molecule
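As an illustration of the definition in 3.5, the following is a minimal sketch in Python. The function name `gc_content` is hypothetical, and the choice to exclude ambiguous bases (such as N) from the denominator is an assumption made for this example, not something the definition states.

```python
def gc_content(sequence: str) -> float:
    """Return the proportion of G and C among the unambiguous bases of a DNA sequence.

    Assumption for this sketch: only A, C, G and T are counted, so ambiguous
    symbols such as N do not affect the result.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))  # four of eight bases are G or C
```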
3.6
genotype imputation
computational process to infer unobserved or missing genotypes in sequencing/genotyping data
Note 1 to entry: Using statistical models (e.g. hidden Markov models) and reference haplotype panels (e.g. 1000
Genomes Project, TOPMed), imputation predicts missing variants by leveraging linkage disequilibrium (LD) patterns.
Common tools include IMPUTE2, Minimac, and BGI-lowpass.
Note 2 to entry: The output is typically a completed genomic variant dataset. This step is critical for enhancing data
utility in low-coverage whole genome sequencing (lcWGS) or genome-wide association studies (GWAS).
3.7
haplotype
combination of alleles at multiple sites that are inherited together on the same chromosome
3.8
InDel
insertion or deletion, or both, that occurs at a certain position in the genome
Note 1 to entry: InDel length is less than 50 bp.
3.9
quality score
Q score
Phred score
quality of base calling
measure of the probability of correct base recognition, usually expressed directly by a numerical value
Note 1 to entry: Q score is defined by the following formula:
$Q = -10 \log_{10}(p)$
where p is the estimated probability of the base call being wrong.
Note 2 to entry: A quality score of 20 represents an error rate of 1 in 100, with a corresponding call accuracy of 99 %.
Note 3 to entry: A quality score of 30 represents an error rate of 1 in 1 000, with a corresponding call accuracy of
99,9 %.
Note 4 to entry: Higher quality scores indicate a smaller probability of error. Lower quality scores can result in a
significant portion of the reads being unusable. Low quality scores can also indicate false-positive variant calls,
resulting in inaccurate conclusions.
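The relationship in 3.9 between a quality score and the error probability can be checked numerically. The sketch below assumes the conventional base-10 logarithm of the Phred scale. The function names are hypothetical, and `fraction_at_or_above` is one plausible way to compute a Q20 or Q30 percentage, not a method taken from this document.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Return p, the estimated probability that a base call with score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Return the quality score Q = -10 * log10(p) for an error probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


def fraction_at_or_above(quality_scores, threshold: float) -> float:
    """Return the fraction of bases whose quality score is at least `threshold`."""
    scores = list(quality_scores)
    if not scores:
        raise ValueError("no quality scores supplied")
    return sum(1 for q in scores if q >= threshold) / len(scores)


if __name__ == "__main__":
    # Q20 and Q30 correspond to error rates of 1 in 100 and 1 in 1 000.
    for q in (20, 30):
        print(q, phred_to_error_probability(q))
```

Working with the error probability rather than the score itself makes it easier to compare thresholds such as Q20 and Q30 on a linear scale.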
3.10
sequencing depth
average number of times a nucleotide in a genome has been sequenced
Note 1 to entry: It is calculated by dividing the total number of sequenced bases in the aligned genome by the total
number of bases in the genome (excluding N).
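The calculation in Note 1 to entry of 3.10 can be sketched as follows. The function and parameter names are hypothetical, and the figures in the usage example are round illustrative numbers, not measured values.

```python
def sequencing_depth(total_aligned_bases: int, genome_length: int, n_bases: int = 0) -> float:
    """Return the average sequencing depth.

    The depth is the total number of sequenced bases aligned to the genome
    divided by the number of bases in the genome, excluding N positions.
    """
    effective_length = genome_length - n_bases
    if effective_length <= 0:
        raise ValueError("genome must contain at least one non-N base")
    return total_aligned_bases / effective_length


if __name__ == "__main__":
    # Illustrative numbers only: 90 Gb aligned against 3.0 Gb of non-N reference.
    print(sequencing_depth(90_000_000_000, 3_100_000_000, n_bases=100_000_000))
```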

3.11
whole genome sequencing
WGS
process that determines the complete DNA sequence of a human’s genome, including all 23 chromosome
pairs and mitochondrial DNA
Note 1 to entry: While performed through a coordinated workflow, current next-generation sequencing systems
cannot process the entire genome in a single run. The DNA is fragmented, sequenced in sections, and computationally
reconstructed using bioinformatics tools to assemble the complete genomic sequence.
Note 2 to entry: WGS is divided into high-coverage whole genome sequencing (hcWGS) and low-coverage whole
genome sequencing (lcWGS) according to the amount of sequencing.
Note 3 to entry: High-coverage WGS has sequencing depth >20×, while the coverage of clinical grade WGS is usually
>100×.
Note 4 to entry: For low-coverage WGS: 0,5× ≤ sequencing depth ≤ 6×.
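The depth ranges in Notes 3 and 4 to entry of 3.11 can be expressed as a small classifier. This is a sketch only: the function name and the label "unclassified" are hypothetical, and depths that fall outside both stated ranges (for example between 6× and 20×) are reported as unclassified because the notes do not assign them to either category.

```python
def classify_wgs(depth: float) -> str:
    """Classify a WGS data set by average sequencing depth.

    Returns "hcWGS" for depth > 20, "lcWGS" for 0.5 <= depth <= 6,
    and "unclassified" for any other depth.
    """
    if depth > 20:
        return "hcWGS"
    if 0.5 <= depth <= 6:
        return "lcWGS"
    return "unclassified"


if __name__ == "__main__":
    for depth in (0.5, 4, 10, 30):
        print(depth, classify_wgs(depth))
```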
4 Data analysis process
4.1 The integrity of the sample provided should be checked and verified prior to performing the analysis.
4.2 The data analysis process supported by the DNA chip shall include data preprocessing, genotype
calling, quality evaluation (quality assurance or quality control), genotype imputation and cluster analysis.
4.3 The WGS data analysis process shall include sequencing data quality control, alignment and
deduplication, alignment quality control, variant calling (hcWGS) or genotype imputation (lcWGS),
variation quality control, variant annotation, variant interpretation and manual confirmation of variation,
as shown in Figure 1.
Figure 1 — Analysis and interpretation process based on WGS
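The ordering of the WGS stages in 4.3 can be written down as data, which makes the branch between variant calling and genotype imputation explicit. The sketch below is illustrative: the function name is hypothetical and the stage labels are paraphrased from 4.3 rather than quoted from Figure 1.

```python
STAGES_BEFORE_DETECTION = [
    "sequencing data quality control",
    "alignment and deduplication",
    "alignment quality control",
]

STAGES_AFTER_DETECTION = [
    "variation quality control",
    "variant annotation",
    "variant interpretation",
    "manual confirmation of variation",
]


def wgs_stages(wgs_type: str) -> list:
    """Return the ordered analysis stages for "hcWGS" or "lcWGS" data."""
    if wgs_type == "hcWGS":
        detection = "variant calling"
    elif wgs_type == "lcWGS":
        detection = "genotype imputation"
    else:
        raise ValueError("wgs_type must be 'hcWGS' or 'lcWGS'")
    return [*STAGES_BEFORE_DETECTION, detection, *STAGES_AFTER_DETECTION]


if __name__ == "__main__":
    for number, stage in enumerate(wgs_stages("lcWGS"), start=1):
        print(number, stage)
```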
5 Quality control of raw data
5.1 DNA chip data preprocessing and quality control requirements
5.1.1 Data preprocessing
5.1.1.1 The original data format of the DNA chip shall be as defined by the chip manufacturer. Individual
data should be converted into VCF files or raw genotype data files for subsequent analysis. File formats are
specified in 5.1.1.3 and 5.1.1.4.
5.1.1.2 When converting chip data from raw data to variant call format (VCF) files or raw genotype
data files, cluster analysis should be used. The reference data used in the cluster analysis should be the
target population data of the detection service. The source of the reference data should be explained to the
consumer so there is a clear understanding of the relative nature of the results.

5.1.1.3 The raw genotype data file shall consist of four columns, including RSID (Reference SNP Cluster
ID), chromosome, the position on the chromosome, and a pair of bases. The raw genotype data file shall
explain the detection platform, detection time, reference genome sequence and other information in the
form of comments at the beginning of the file.
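A reader for the four-column raw genotype data file described in 5.1.1.3 could look like the sketch below. Several details are assumptions made for illustration, because the clause does not fix them: comment lines are taken to start with "#", columns are taken to be tab-separated, and the platform name and RSIDs in the example are invented.

```python
from typing import NamedTuple


class GenotypeRecord(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str  # a pair of bases, e.g. "AG"


def parse_raw_genotype(text: str):
    """Split a raw genotype file into leading comment lines and data records.

    Assumptions for this sketch: comments start with "#" and data columns
    are separated by tabs.
    """
    comments = []
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            comments.append(line.lstrip("#").strip())
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"line {line_number}: expected 4 columns, got {len(fields)}")
        rsid, chromosome, position, genotype = fields
        records.append(GenotypeRecord(rsid, chromosome, int(position), genotype))
    return comments, records


# Invented example content; the platform name and RSIDs are placeholders.
EXAMPLE = (
    "# platform: hypothetical-chip-v1\n"
    "# reference genome: GRCh38\n"
    "rs0000001\t1\t12345\tAG\n"
    "rs0000002\tX\t67890\tCC\n"
)

if __name__ == "__main__":
    comments, records = parse_raw_genotype(EXAMPLE)
    print(comments)
    print(records)
```

Keeping the comment lines separate from the records preserves the platform, detection time and reference genome information that the clause requires at the beginning of the file.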
5.1.1.4 VCF file format requirements shall conform with Annex A.
5.1.2 Data quality control
5.1.2.1 The marker density of DNA chips can range from dozens to over 600 000 genome
sites. High-density DNA chips (> 600 000 markers) shall have a sample call rate ≥ 0,98. Targeted chips
(< 600 000 markers) shall meet call rate thresholds appropriate to their designed purpose; see Table C.1 for
recommended minimum call rates by chip type.
5.1.2.2 The detection rate of a single site in the same batch of samples shall not be lower than 0,8.
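The two thresholds in 5.1.2.1 and 5.1.2.2 can be checked with a short script. The sketch below assumes a batch is held as a mapping from sample ID to a list of genotype calls, with `None` marking a no-call, and it reads the detection rate of a site as the fraction of samples in the batch with a call at that site. The data layout and function names are hypothetical. The default thresholds are 0,98 (high-density chips) and 0,8, written with a decimal point in code.

```python
def sample_call_rates(batch):
    """Map each sample ID to the fraction of sites that have a genotype call."""
    return {
        sample: sum(call is not None for call in calls) / len(calls)
        for sample, calls in batch.items()
    }


def site_detection_rates(batch):
    """Return, for each site index, the fraction of samples with a call at that site."""
    rows = list(batch.values())
    if not rows:
        raise ValueError("batch is empty")
    n_sites = len(rows[0])
    if any(len(row) != n_sites for row in rows):
        raise ValueError("all samples must cover the same sites")
    return [
        sum(row[i] is not None for row in rows) / len(rows)
        for i in range(n_sites)
    ]


def failing_samples(batch, min_call_rate=0.98):
    """Return sample IDs whose call rate is below the minimum."""
    return [s for s, rate in sample_call_rates(batch).items() if rate < min_call_rate]


def failing_sites(batch, min_detection_rate=0.8):
    """Return site indices whose detection rate is lower than the minimum."""
    return [i for i, rate in enumerate(site_detection_rates(batch)) if rate < min_detection_rate]


if __name__ == "__main__":
    # Tiny invented batch: five samples, four sites.
    batch = {
        "s1": ["AA", None, "GG", "CC"],
        "s2": ["AA", "AG", "GG", None],
        "s3": ["AA", "AG", "GG", None],
        "s4": ["AA", "AG", "GG", "CC"],
        "s5": ["AA", "AG", "GG", "CC"],
    }
    print(failing_samples(batch))
    print(failing_sites(batch))
```

A real high-density chip has far more sites than this toy batch, so a single no-call would not by itself push a sample below 0,98.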
5.1.2.3 The international general human nucleic acid database shall be used as
...
