IEC PAS 63621:2026
Artificial intelligence enabled medical devices - Data management
IEC PAS 63621:2026 provides a framework for the data life cycle processes for management of data used to train, test or validate an AI model that is part of a medical device.
For data acquisition and management lifecycle the following considerations apply, amongst others: data suitability, data quality and integrity insurance, data privacy and security, data governance and documentation, data sampling and bias mitigation, data versioning and traceability, data storage and infrastructure, data access and sharing, and data labelling and annotation.
This document outlines the requirements for the data lifecycle, covering stages from planning and acquisition to usage and decommissioning. It emphasizes maintaining data quality, including aspects such as dataset classification, data annotations, traceability, metadata comprehensiveness, representativeness, and validity periods.
The scope is limited to the high-level process concepts applicable across medical specialties and device types and does not include specific requirements that can be covered by modality- or device-specific standards documents.
This document outlines the additional requirements for a quality management system for data management, where an organization demonstrates its capability to manage data in accordance with applicable medical device guidance and standards. Organizations can be involved in one or more stages of the life-cycle, including design and development, production, storage and distribution, installation, or servicing and maintenance of a medical device that incorporates AI. This document can also be used by suppliers or external parties that provide data, including quality management system-related services to such organizations.
General Information
- Status: Published
- Publication Date: 17-Mar-2026
- Technical Committee: TC 62 - Medical equipment, software, and systems
- Current Stage: PPUB - Publication issued
- Start Date: 18-Mar-2026
- Completion Date: 13-Mar-2026
Frequently Asked Questions
IEC PAS 63621:2026 is a Publicly Available Specification (PAS) published by the International Electrotechnical Commission (IEC). Its full title is "Artificial intelligence enabled medical devices - Data management".
IEC PAS 63621:2026 is classified under the following ICS (International Classification for Standards) categories: 11.020 - Medical sciences and health care facilities in general; 11.040 - Medical equipment. The ICS classification helps identify the subject area and facilitates finding related standards.
Standards Content (Sample)
IEC PAS 63621 ®
Edition 1.0 2026-03
PUBLICLY AVAILABLE
SPECIFICATION
Artificial intelligence enabled medical devices - Data management
ICS 11.040; 11.020 ISBN 978-2-8327-1124-8
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or
by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either
IEC or IEC's member National Committee in the country of the requester. If you have any questions about IEC copyright
or have an enquiry about obtaining additional rights to this publication, please contact the address below or your local
IEC member National Committee for further information.
IEC Secretariat Tel.: +41 22 919 02 11
3, rue de Varembé info@iec.ch
CH-1211 Geneva 20 www.iec.ch
Switzerland
CONTENTS
FOREWORD
INTRODUCTION
1 Scope
2 Normative references
3 Terms and definitions
4 Data management principles
5 Data management process
5.1 Data management process
5.2 Data development process
5.2.1 Data quality planning
5.2.2 Data quality improvement
5.2.3 Data quality verification
5.2.4 Data quality analysis
6 Data management
6.1 General
6.2 Data requirements
6.3 Data planning
6.4 Data acquisition
6.5 Data development
6.5.1 General
6.5.2 Data de-identification
6.5.3 Dataset composition
6.5.4 Data annotation
6.5.5 Data quality improvement
6.5.6 Data quality verification
6.5.7 Data quality analysis
6.6 Data provisioning
6.7 Data decommissioning
Annex A (informative) Explanation of data management techniques
A.1 Data quality improvement techniques
A.1.1 Cleaning
A.1.2 Data encoding
A.1.3 Data transformation
A.1.4 Data aggregation
A.1.5 Data normalization
A.1.6 Data standardization
A.1.7 Data imputation
A.1.8 Data augmentation
A.1.9 Data mining
A.2 Data quality verification
Annex B (informative) Data quality characteristics
B.1 Description of data quality characteristics
B.1.1 Integrity: Accuracy
B.1.2 Integrity: Completeness
B.1.3 Uniqueness
B.1.4 Consistency
B.1.5 Authenticity
B.1.6 Timeliness
B.1.7 Accessibility
B.1.8 Conformance
B.1.9 Confidentiality
B.1.10 Resource utilization
B.1.11 Precision
B.1.12 Traceability
B.1.13 Comprehensibility
B.1.14 Availability
B.1.15 Portability
B.1.16 Recoverability
B.1.17 Representativeness
B.2 Demonstration of data quality characteristics
B.2.1 Integrity: Accuracy
B.2.2 Integrity: Completeness
B.2.3 Uniqueness
B.2.4 Consistency
B.2.5 Authenticity
B.2.6 Timeliness
B.2.7 Accessibility
B.2.8 Conformance
B.2.9 Confidentiality
B.2.10 Resource utilization
B.2.11 Precision
B.2.12 Traceability
B.2.13 Comprehensibility
B.2.14 Availability
B.2.15 Portability
B.2.16 Recoverability
B.2.17 Representativeness
B.3 Evaluation of data characteristics
B.3.1 Accuracy
B.3.2 Completeness
B.3.3 Uniqueness
B.3.4 Consistency
B.3.5 Authenticity
B.3.6 Timeliness
B.3.7 Accessibility
B.3.8 Conformance
B.3.9 Confidentiality
B.3.10 Resource utilization
B.3.11 Precision
B.3.12 Traceability
B.3.13 Comprehensibility
B.3.14 Availability
B.3.15 Portability
B.3.16 Recoverability
B.3.17 Representativeness
B.3.18 Dataset risk analysis assessment
B.4 Evaluation of dataset description
Annex C (informative) Description of data screening and cleaning
Bibliography
Figure 1 – Data management process
Figure 2 – Data management process
Figure 3 – Data quality measures information for quality reports
Figure 4 – Data quality assessment in data development
Figure A.1 – Flow chart of dataset quality evaluation
Table B.1 – Classification and evaluation methods for data characteristics
Table C.1 – Examples for data screening
Table C.2 – Examples for data exclusion criteria
INTERNATIONAL ELECTROTECHNICAL COMMISSION
____________
Artificial intelligence enabled medical devices - Data management
FOREWORD
1) The International Electrotechnical Commission (IEC) is a worldwide organization for
standardization comprising all national electrotechnical committees (IEC National Committees).
The object of IEC is to promote international co-operation on all questions concerning
standardization in the electrical and electronic fields. To this end and in addition to other
activities, IEC publishes International Standards, Technical Specifications, Technical Reports,
Publicly Available Specifications (PAS) and Guides (hereafter referred to as "IEC
Publication(s)"). Their preparation is entrusted to technical committees; any IEC National
Committee interested in the subject dealt with may participate in this preparatory work.
International, governmental and non-governmental organizations liaising with the IEC also
participate in this preparation. IEC collaborates closely with the International Organization for
Standardization (ISO) in accordance with conditions determined by agreement between the two
organizations.
2) The formal decisions or agreements of IEC on technical matters express, as nearly as
possible, an international consensus of opinion on the relevant subjects since each technical
committee has representation from all interested IEC National Committees.
3) IEC Publications have the form of recommendations for international use and are accepted
by IEC National Committees in that sense. While all reasonable efforts are made to ensure that
the technical content of IEC Publications is accurate, IEC cannot be held responsible for the
way in which they are used or for any misinterpretation by any end user.
4) In order to promote international uniformity, IEC National Committees undertake to apply IEC
Publications transparently to the maximum extent possible in their national and regional
publications. Any divergence between any IEC Publication and the corresponding national or
regional publication shall be clearly indicated in the latter.
5) IEC itself does not provide any attestation of conformity. Independent certification bodies
provide conformity assessment services and, in some areas, access to IEC marks of conformity.
IEC is not responsible for any services carried out by independent certification bodies.
6) All users should ensure that they have the latest edition of this publication.
7) No liability shall attach to IEC or its directors, employees, servants or agents including
individual experts and members of its technical committees and IEC National Committees for
any personal injury, property damage or other damage of any nature whatsoever, whether direct
or indirect, or for costs (including legal fees) and expenses arising out of the publication, use
of, or reliance upon, this IEC Publication or any other IEC Publications.
8) Attention is drawn to the Normative references cited in this publication. Use of the referenced
publications is indispensable for the correct application of this publication.
9) IEC draws attention to the possibility that the implementation of this document may involve
the use of (a) patent(s). IEC takes no position concerning the evidence, validity or applicability
of any claimed patent rights in respect thereof. As of the date of publication of this document,
IEC had not received notice of (a) patent(s), which may be required to implement this document.
However, implementers are cautioned that this may not represent the latest information, which
may be obtained from the patent database available at https://patents.iec.ch. IEC shall not be
held responsible for identifying any or all such patent rights.
IEC PAS 63621 was prepared by IEC technical committee 62: Medical equipment, software,
and systems. It is a Publicly Available Specification.
The text of this Publicly Available Specification is based on the following documents:
Draft          Report on voting
62/559/DPAS    62/579/RVDPAS
Full information on the voting for its approval can be found in the report on voting indicated in
the above table.
The language used for the development of this document is English.
This document was drafted in accordance with ISO/IEC Directives, Part 2, and developed in
accordance with ISO/IEC Directives, Part 1 and ISO/IEC Directives, IEC Supplement, available
at www.iec.ch/members_experts/refdocs. The main document types developed by IEC are
described in greater detail at www.iec.ch/standardsdev/publications.
The committee has decided that the contents of this document will remain unchanged until the
stability date indicated on the IEC website under webstore.iec.ch in the data related to the
specific document. At this date, the document will be
– reconfirmed,
– withdrawn, or
– revised.
NOTE In accordance with ISO/IEC Directives, Part 1, IEC PASs are automatically withdrawn after 4 years.
INTRODUCTION
Ensuring the safety and effectiveness of a medical device that incorporates AI involves several
steps. These steps include establishing data requirements, data planning and data acquisition,
managing data throughout its life cycle, and demonstrating that the device meets its intended
purpose without posing unacceptable risks.
This document outlines requirements for data management life cycle processes, detailing
the activities and tasks essential for managing data used by medical devices incorporating AI.
It specifies requirements for each stage of the data life cycle.
The data management process consists of a number of activities. These activities are shown in
Figure 1 below.
Figure 1 – Data management process
Data management process: Data management is a lifecycle activity that continues throughout
the full device lifecycle. This includes adjustments to improve the data quality in case data
characteristics no longer meet the requirements defined in data planning.
This document does not specify an organizational structure for the manufacturer or which part
of the organization is to perform which process, activity, or task.
This document does not specify the name, format, or detailed content of the required
documentation. While this document provides requirements about the documentation of tasks,
it leaves the choice of how to organize and present this documentation up to the user.
Annex A provides further information about how the clauses of this document should be seen
in relation to the quality management system.
This document assumes that the manufacturer has a quality management system in place which
is appropriate for medical device development.
For the purposes of this document:
– "shall" means that conformance with a requirement is mandatory for conformance with this
document;
– "should" means that conformance with a recommendation is advised but is not mandatory for
conformance with this document;
– "may" is used to describe a permissible way to achieve conformance with a requirement;
– "establish" means to define, document, and implement; and
– where this document uses the term "as appropriate" in conjunction with a required process,
activity, task or output, the intention is that the manufacturer shall use the process, activity,
task or output unless the manufacturer can document a justification for not so doing.
1 Scope
This document provides a framework for the data life cycle processes for management of data
used to train, test or validate an AI model that is part of a medical device.
For data acquisition and management lifecycle the following considerations apply, amongst
others: data suitability, data quality and integrity insurance, data privacy and security, data
governance and documentation, data sampling and bias mitigation, data versioning and
traceability, data storage and infrastructure, data access and sharing, and data labelling and
annotation.
This document outlines the requirements for the data lifecycle, covering stages from planning
and acquisition to usage and decommissioning. It emphasizes maintaining data quality,
including aspects such as dataset classification, data annotations, traceability, metadata
comprehensiveness, representativeness, and validity periods.
The scope is limited to the high-level process concepts applicable across medical specialties
and device types and does not include specific requirements that can be covered by modality-
or device-specific standards documents.
This document outlines the additional requirements for AI data management, where an
organization demonstrates its capability to manage data in accordance
with applicable medical device guidance and standards. Organizations can be involved in one
or more stages of the life-cycle, including design and development, production, storage and
distribution, installation, or servicing and maintenance of a medical device that incorporates AI.
This document can also be used by suppliers or external parties that provide data, including
quality management system-related services to such organizations.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminological databases for use in standardization at the following
addresses:
– IEC Electropedia: available at https://www.electropedia.org/
– ISO Online browsing platform: available at https://www.iso.org/obp
3.1
manufacturer
natural or legal person with responsibility for design and/or manufacture of a medical device
with the intention of making the medical device available for use, under his name; whether or
not such a medical device is designed and/or manufactured by that person himself or on his
behalf by another person(s)
Note 1 to entry: "Design and/or manufacture", as referred to in the above definition, may include specification
development, production, fabrication, assembly, processing, packaging, repackaging, labelling, relabelling,
sterilization, installation, or remanufacturing of a medical device; or putting a collection of devices, and possibly other
products, together for a medical purpose.
Note 2 to entry: The manufacturer’s responsibilities are described in other GHTF or IMDRF guidance documents.
These responsibilities include meeting both pre-market requirements and post-market requirements, such as adverse
event reporting and notification of corrective actions.
Note 3 to entry: This "natural or legal person" has ultimate legal responsibility for ensuring compliance with all
applicable regulatory requirements for the medical devices in the countries or jurisdictions where it is intended to be
made available or sold, unless this responsibility is specifically imposed on another person by the Regulatory
Authority (RA) within that jurisdiction.
Note 4 to entry: Any person who assembles or adapts a medical device that has already been supplied by another
person for an individual patient, in accordance with the instructions for use, is not the manufacturer, provided the
assembly or adaptation does not change the intended use of the medical device.
Note 5 to entry: Any person who changes the intended use of, or modifies, a medical device without acting on behalf
of the original manufacturer and who makes it available for use under his own name, should be considered the
manufacturer of the modified medical device.
Note 6 to entry: An authorized representative, distributor or importer who only adds its own address and contact
details to the medical device or the packaging, without covering or changing the existing labelling, is not considered
a manufacturer.
Note 7 to entry: To the extent that an accessory is subject to the regulatory requirements of a medical device, the
person responsible for the design and/or manufacture of that accessory is considered to be a manufacturer.
[SOURCE: ISO 13485:2016 [1], 3.10, modified – Reordering of notes to entry; addition of "or
IMDRF guidance documents" in Note 2 to entry.]
3.2
data
re-interpretable representation of information in a formalized manner suitable for
communication, interpretation, or processing
Note 1 to entry: Data can be processed by humans or by automatic means.
[SOURCE: ISO/IEC 2382:2015 [2], 2121272, modified – Deletion of Note 2 to entry and Note 3
to entry.]
3.3
data annotation
process of attaching a set of descriptive information to data without any change to that data
Note 1 to entry: The descriptive information can take the form of metadata, labels and anchors.
[SOURCE: ISO/IEC 22989:2022 [3], 3.2.1]
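The defining property of data annotation in 3.3, that descriptive information is attached without any change to the data, can be illustrated with a minimal Python sketch. The `AnnotatedItem` structure, the `annotate` and `fingerprint` helpers and the sample label are hypothetical names introduced only for illustration; this document does not prescribe any data structure or tooling.

```python
from dataclasses import dataclass
from hashlib import sha256


@dataclass(frozen=True)
class AnnotatedItem:
    """Raw data plus descriptive information kept alongside it (hypothetical structure)."""
    payload: bytes        # the data itself; annotation never modifies it
    labels: tuple = ()    # descriptive labels attached to the data


def annotate(item: AnnotatedItem, label: str) -> AnnotatedItem:
    # Return a new item carrying one more label; the payload is passed through unchanged.
    return AnnotatedItem(item.payload, item.labels + (label,))


def fingerprint(item: AnnotatedItem) -> str:
    # Hash of the payload only, so annotations cannot influence it.
    return sha256(item.payload).hexdigest()


raw = AnnotatedItem(payload=b"example-image-bytes")
labelled = annotate(raw, "lesion-present")

# The payload fingerprint is identical before and after annotation.
assert fingerprint(raw) == fingerprint(labelled)
```

Comparing a payload hash before and after annotation is one possible way to show that annotation left the data untouched; other integrity checks would serve equally well.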
3.4
dataset
data set
collection of data with a shared format
EXAMPLE 1 Micro-blogging posts from June 2020 associated with hashtags #rugby and #football
EXAMPLE 2 Macro photographs of flowers in 256x256 pixels.
Note 1 to entry: Datasets can be used for validating or testing an AI model. In a machine learning context, datasets
can also be used to train a machine learning algorithm.
[SOURCE: ISO/IEC 22989:2022 [3], 3.2.5, modified – Addition of the synonym "data set".]
3.5
data management
encompassing processes and tools for dataset acquisition, construction, storage, governance,
privacy, security, integrity, provision, use and decommissioning
[SOURCE: ISO/IEC 22989:2022 [3], 6.1, modified – Addition of "construction, storage,
governance, privacy, security, integrity, provision, use and decommissioning".]
3.6
data quality
ability of data to meet the manufacturer's requirements for a specified context
3.7
data quality management
coordinated activities to direct and control an organization with regard to data quality
[SOURCE: ISO 8000-2:2020 [4], 3.8.2]
3.8
data quality model
defined set of characteristics which provides a framework for specifying data quality
requirements and evaluating data quality
[SOURCE: ISO/IEC 25012:2008 [5], 4.6]
3.9
data quality verification
activity that ensures that data is accurate, consistent, and meets established requirements and
acceptance criteria as defined by the manufacturer
Note 1 to entry: This involves assessing dataset descriptions, evaluating quality characteristics, and analyzing
dataset risk to confirm that the data is fit for its intended purpose and meets required standards and documenting
the validation.
3.10
data quality analysis
activity that examines the data development cycle to determine whether the failure of the
data quality verification is caused by the data itself or by the data improvement process
3.11
medical device
instrument, apparatus, implement, machine, appliance, implant, reagent for in vitro use,
software, material or other similar or related article, intended by the manufacturer to be used,
alone or in combination, for human beings, for one or more of the specific medical purpose(s) of
– diagnosis, prevention, monitoring, treatment or alleviation of disease,
– diagnosis, monitoring, treatment, alleviation of or compensation for an injury,
– investigation, replacement, modification, or support of the anatomy or of a physiological
process,
– supporting or sustaining life,
– control of conception,
– disinfection of medical devices,
– providing information by means of in vitro examination of specimens derived from the human
body,
and which does not achieve its primary intended action by pharmacological, immunological or
metabolic means, in or on the human body, but can be assisted in its function by such means
Note 1 to entry: Products which can be considered to be medical devices in some jurisdictions but not in others
include:
– disinfection substances;
– aids for persons with disabilities;
– devices incorporating animal and/or human tissues;
– devices for in vitro fertilization or assisted reproduction technologies.
[SOURCE: ISO/IEC Guide 63:2019 [6], 3.7]
3.12
metadata
data that define and describe other data
Note 1 to entry: In the context of analytics and machine learning, metadata provides information on data items or
data records such as their properties, structure, context, intended use, ownership, access and volatility.
[SOURCE: ISO/IEC 11179-1:2023 [7], 3.2.26, modified – Note 1 to entry added.]
3.13
process
set of interrelated or interacting activities that use inputs to deliver an intended result
Note 1 to entry: Whether the "intended result" of a process is called output, product or service depends on the
context of the reference.
Note 2 to entry: Inputs to a process are generally the outputs of other processes and outputs of a process are
generally the inputs to other processes.
Note 3 to entry: Two or more interrelated and interacting processes in series can also be referred to as a process.
[SOURCE: ISO 9000:2015 [8], 3.4.1, modified – Notes to entry 4, 5 and 6 are deleted.]
3.14
DOUP
Data of Unknown Provenance
data previously collected for which adequate records of the collecting process are not available
EXAMPLE Data where the source, date of creation, metadata, or validity of the information cannot be confirmed.
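As a rough illustration of how data of unknown provenance might be flagged in practice, the following Python sketch marks a record as DOUP when any required provenance field is missing or empty. The field names in `REQUIRED_PROVENANCE_FIELDS`, the record layout and the sample values are assumptions made for this example; what counts as "adequate records of the collecting process" is for the manufacturer to define.

```python
# Hypothetical provenance fields; a real list would come from the manufacturer's data planning.
REQUIRED_PROVENANCE_FIELDS = ("source", "created_on", "collection_method")


def is_doup(record: dict) -> bool:
    """Return True when any required provenance field is missing or empty."""
    provenance = record.get("provenance") or {}
    return any(not provenance.get(name) for name in REQUIRED_PROVENANCE_FIELDS)


# Illustrative records only; not real data.
records = [
    {"id": "a1", "provenance": {"source": "site-01",
                                "created_on": "2024-05-02",
                                "collection_method": "protocol-v2"}},
    {"id": "b7", "provenance": {"source": "site-02"}},  # creation date and method unknown
    {"id": "c3"},                                        # no provenance recorded at all
]

doup_ids = [r["id"] for r in records if is_doup(r)]
```

Here `doup_ids` contains the identifiers of the second and third records, which could then be routed to whatever handling the manufacturer has defined for such data.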
3.15
synthetic data
data that is artificially generated rather than produced by real-world events
3.16
data augmentation
process of creating augmented copies of existing data
3.17
integrity insurance
set of measures and practices put in place to ensure the accuracy, consistency, and reliability
of data throughout its lifecycle
Note 1 to entry: Integrity insurance can include aspects such as data quality, data privacy and security, data
governance, and documentation. For example, in the context of data management, integrity insurance might involve
ensuring that data is collected, stored, and processed in a way that maintains its accuracy and prevents unauthorized
access or modification.
4 Data management principles
Manufacturers shall take appropriate actions to manage the data, as defined in data planning.
NOTE 1 Managing data and ensuring data quality are critical aspects when developing medical devices that
incorporate AI. Throughout the entire data lifecycle, maintaining data quality is essential as conformance against the
data quality characteristics can change over time. Therefore, data management, including data quality, is a lifecycle
activity.
The manufacturer shall plan and develop the processes needed for data management.
The manufacturer shall define and document general data requirements taking into account the
following aspects:
– appropriateness of data and datasets for the specified intended use;
– proper storage and governance of data and datasets;
– assurance of traceability, as appropriate, considering the provenance and sample
identification, and dataset version control;
– assurance of transparency, as appropriate, considering the intended purpose and the
provenance identification, and keeping clear process descriptions;
– usage of a data quality model based on data quality characteristics (see 6.5.6);
– verification that data quality aligns with specified data quality requirements, characteristics
and specified targets.
NOTE 2 The verification of data quality includes the evaluation of dataset description, quality characteristics and
the dataset risk analysis documents. See Clause A.2 for more details.
– alignment with organizational requirements for security, privacy, fairness and ethics.
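The traceability and version control aspects listed above could be supported, for example, by a dataset manifest that records a content hash and a provenance entry for every sample and derives a version identifier from the whole manifest. The Python sketch below is one possible approach under that assumption; `build_manifest`, the manifest fields and the 12-character version identifier are hypothetical choices, not requirements of this document.

```python
import hashlib
import json


def sample_digest(content: bytes) -> str:
    # Content hash used to identify a sample independently of its file name.
    return hashlib.sha256(content).hexdigest()


def build_manifest(samples: dict, provenance: dict) -> dict:
    """Build a manifest from sample id -> bytes and sample id -> provenance description."""
    entries = [
        {
            "sample_id": sid,
            "sha256": sample_digest(samples[sid]),
            "provenance": provenance.get(sid, "unknown"),
        }
        for sid in sorted(samples)  # sorted so the version does not depend on insertion order
    ]
    canonical = json.dumps(entries, sort_keys=True).encode("utf-8")
    version = hashlib.sha256(canonical).hexdigest()[:12]
    return {"dataset_version": version, "entries": entries}


# Illustrative content only.
v1 = build_manifest({"s1": b"scan-a", "s2": b"scan-b"}, {"s1": "site-01", "s2": "site-02"})
v2 = build_manifest({"s1": b"scan-a", "s2": b"scan-b-corrected"}, {"s1": "site-01", "s2": "site-02"})

# Any change to sample content yields a different dataset version.
assert v1["dataset_version"] != v2["dataset_version"]
```

Deriving the version from the content rather than assigning it by hand means two datasets with the same identifier are known to contain the same samples, which is one way to make later audits reproducible.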
5 Data management process
5.1 Data management process
The manufacturer shall establish and maintain an ongoing process, as shown in Figure 2, for:
– data requirements;
– data planning;
– data acquisition;
– data development;
– data provisioning;
– data decommissioning.
This process shall apply throughout the data life cycle.
This process shall include data development based on the following phases:
– data preparation;
– data improvement;
– data quality improvement;
– data quality verification;
– data quality analysis.
Figure 2 – Data management process
The manufacturer shall determine if data contributes to a hazardous situation using a risk
management process. During hazard identification, if data contributes to a hazardous situation,
it should be identified, the risk assessed and managed per the risk management process.
NOTE Whether the data is a contributing factor to a hazardous situation is determined during the hazard
identification activity of the risk management process. Hazardous situations that could be indirectly caused by the
use of the data (for example, introducing a specific bias could cause inappropriate treatment to be administered) are
considered when determining whether data is a contributing factor. The decision to use the data based on the required
data characteristics and the activities employed is made during the risk control activity of the data management
process. The data risk management process required in this document is part of the overall risk management process.
5.2 Data development process
5.2.1 Data quality planning
As part of data quality planning, the manufacturer shall define the following:
– the relevant data characteristics for the intended use including the intended target
population;
– the relevant acceptance criteria for the defined characteristics;
– the methods used to check the characteristics against the acceptance criteria;
– the data quality model.
The manufacturer shall describe the rationale for the selected characteristics and acceptance
criteria.
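The four planning elements above (characteristic, acceptance criterion, checking method, rationale) could be captured in one structure per characteristic. The sketch below is illustrative only: the `QualityCriterion` class, the `fraction_present` measure and the 0.95 threshold are assumptions, not values taken from this document.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class QualityCriterion:
    """One planned data quality characteristic (hypothetical structure)."""
    characteristic: str                          # what is measured
    method: Callable[[Sequence[dict]], float]    # how it is measured
    minimum: float                               # acceptance criterion: value >= minimum
    rationale: str                               # why this characteristic and threshold


def fraction_present(field: str) -> Callable[[Sequence[dict]], float]:
    """Build a completeness measure: share of records with a non-missing field."""
    def measure(records: Sequence[dict]) -> float:
        if not records:
            return 0.0
        present = sum(1 for r in records if r.get(field) is not None)
        return present / len(records)
    return measure


def evaluate(plan, records):
    """Apply every criterion; return {characteristic: (value, passed)}."""
    results = {}
    for criterion in plan:
        value = criterion.method(records)
        results[criterion.characteristic] = (value, value >= criterion.minimum)
    return results


# Illustrative plan entry; the threshold and rationale are placeholders.
example_plan = [
    QualityCriterion(
        characteristic="completeness:age",
        method=fraction_present("age"),
        minimum=0.95,
        rationale="Placeholder: age is assumed to be needed to describe the target population.",
    ),
]
```

Keeping the rationale next to the threshold means the justification for each acceptance criterion travels with the plan instead of living in a separate document.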
5.2.2 Data quality improvement
The data quality improvement process activities and outcomes include:
– activities:
• apply data quality improvement methods as described in Clause A.1;
• apply data augmentation methods (see A.1.8), if appropriate;
– outcomes:
• documentation of the data quality improvement and augmentation methods used;
• documentation of the data quality improvement actions taken.
NOTE Data quality improvement can also mean the creation of synthetic data.
5.2.3 Data quality verification
The manufacturer shall conduct verification of the datasets against the acceptance criteria for
the characteristics as defined in the requirements to check if the established targets for the
data are met.
– outcomes:
• Documentation of the differences between the established targets and the verification
result of the data requirements;
• Documentation of the data quality verification.
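The first outcome above, the differences between the established targets and the verification result, could be produced by a comparison such as the following. This is a sketch under stated assumptions: every target is treated as a minimum value (higher is better), and the `verify` function and its report fields are hypothetical.

```python
def verify(targets: dict, measured: dict) -> list:
    """Compare measured values with targets, one report row per characteristic.

    Assumes each target is a minimum acceptable value. A characteristic
    with no measured value is reported as failed, not silently skipped.
    """
    report = []
    for name in sorted(targets):
        target = targets[name]
        value = measured.get(name)
        if value is None:
            row = {"characteristic": name, "target": target,
                   "measured": None, "difference": None, "passed": False}
        else:
            row = {"characteristic": name, "target": target,
                   "measured": value, "difference": value - target,
                   "passed": value >= target}
        report.append(row)
    return report


def all_passed(report: list) -> bool:
    """True only if every characteristic met its target."""
    return all(row["passed"] for row in report)
```

Reporting the signed difference, and not only pass or fail, records how close each characteristic came to its target, which is useful input for any later data quality analysis.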
5.2.4 Data quality analysis
If the data used for training, tuning, or testing fails verification, the manufacturer shall document
the analysis of the failure to determine if it was a data process failure or a failure of the data to
meet requirements. This analysis shall include the process assessment, guidance on improving
the data quality process, and an assessment of whether the data meets requirements, or if there
were missing requirements. If the data does not meet the requirements, or there were missing
requirements, changes to the data through a quality improvement process should be
considered.
If the data management process was insufficient, the process should be improved by the
established methods as defined in the quality management system.
Activities:
– Evaluate and implement necessary changes through the quality improvement process if the
data does not meet the requirements.
Outcomes:
– Documentation of the data quality analysis.
NOTE All datasets used for training, tuning, and verification are verified before use. The dataset used for formal
verification is separate from those used for training and tuning.
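The separation described in the NOTE above could be checked mechanically before a dataset is released for use. The sketch below assumes that separation is judged on a shared identifier such as a patient ID, which is an assumption of this example; the `overlapping_ids` helper is hypothetical.

```python
from itertools import combinations


def overlapping_ids(splits: dict) -> dict:
    """Find identifiers shared between any two dataset compartments.

    `splits` maps a compartment name (e.g. "training") to an iterable of
    identifiers. Returns {(name_a, name_b): shared_ids} for each pair
    that overlaps; an empty dict means the compartments are disjoint.
    """
    overlaps = {}
    for name_a, name_b in combinations(sorted(splits), 2):
        shared = set(splits[name_a]) & set(splits[name_b])
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps
```

Checking at the patient level, as opposed to the individual image or recording, also catches the case where different samples from the same patient end up in both the training and the verification compartments.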
6 Data management
6.1 General
The data management clause provides requirements aligned with the data management
principles as described in Clause 4. Data management is based on the data management
process as described in Clause 5.
As shown in 5.1, the high-level elements of a data management process include:
– data requirements;
– data planning;
– data acquisition;
– data development;
– data provisioning;
– data decommissioning.
6.2 Data requirements
Data requirements are based on the intended use of the medical device and system
requirements relevant to the medical device that incorporates AI. The data requirements shall
include the following aspects as appropriate:
– required features in the data;
EXAMPLE Examples of measurable data features include:
• sizes, location and types of lesions in images;
• signs and texture information in images;
• signal patterns of ECG/EEG waves;
• specific peaks of optical spectra.
– amount of data required for coverage of the intended use including the compartmentalizing
of:
• training data;
• tuning data;
• verification data.
NOTE 1 Refer to 6.5.3 for further details.
– requirements that define bias assessment and mitigation;
– relevant statistical properties;
– data quality characteristics including acceptance criteria (see Annex B);
NOTE 2 The data model or data architecture includes the metadata necessary to achieve the data requirements.
NOTE 3 The data characteristics can be different for different datasets, e.g. for robustness tests, the characteristics
and acceptance criteria can be different due to the purpose of the data set.
Good data quality can therefore be interpreted differently based on the dataset use.
– legal requirements;
NOTE 4 The selection of data quality characteristics can be influenced by local legislation or regulatory
requirements, if applicable. For example, EU General Data Protection Regulation (GDPR), U.S. Health Insurance
Portability and Accountability Act (HIPAA), China Data Security Law, etc.
– regulatory requirements.
NOTE 5 The regulatory requirements are from medical device regulatory authorities. The medical device regulatory
authorities can release specific data requirements for AI medical devices in different forms. For instance, China
National Medical Products Administration (NMPA) released Guidelines for the Registration and Technical Review of
Artificial Intelligence Medical Devices, which describes how manufacturers can promote data collection, curation,
annotation and dataset construction. China NMPA also published sectoral standard YY/T 1833.2:2022 [9] to guide
data quality evaluation.
Additionally, for supervised, semi-supervised and reinforcement learning, the data
requirements shall include, as applicable:
– quality of the metadata (semi-supervised);
– balance of data labels among different categories (semi-supervised);
– distribution of data without metadata (semi-supervised);
– learning object of reinforcement learning (reinforcement);
– necessary quantity of data to be generated (reinforcement);
– rules of data generation (reinforcement).
The data characteristics defined in the requirements shall be linked to a method for determining
acceptability.
NOTE 6 The choice of data quality characteristics and data quality measures is an additional level of detail for
the data requirements.
The selection of data quality characteristics can be based on local legislation or regulatory requirements, if
applicable.
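As an illustration of linking a data characteristic to a method for determining acceptability, the balance of data labels among categories could be measured as the ratio of the rarest to the most common category. This is only one possible measure and is an assumption of this sketch, as are the `balance_ratio` function and any threshold applied to its result.

```python
from collections import Counter


def balance_ratio(labels, categories) -> float:
    """Ratio of the rarest to the most common category count.

    `categories` lists every category the requirements expect, so a
    category that is entirely absent from `labels` gives a ratio of 0.0.
    Returns 1.0 for perfectly balanced labels.
    """
    counts = Counter(labels)
    per_category = [counts[c] for c in categories]  # missing category counts as 0
    most_common = max(per_category, default=0)
    if most_common == 0:
        return 0.0
    return min(per_category) / most_common
```

For example, six "benign" and three "malignant" labels give a ratio of 0.5; whether that is acceptable depends on the acceptance criterion defined for the dataset in question.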
6.3 Data planning
The manufacturer shall document the analysis of how the data covers the variables of the use
environment, claimed population and the intended use.
NOTE 1 For the data planning activity, a data quality model derived from the data quality characteristics (e.g.
balance, completeness over expected population, currentness, consistency, etc.) is defined based on the intended use.
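Part of the documented coverage analysis could be a comparison between the subgroup shares in the dataset and those of the claimed population. The following is a minimal sketch: the `coverage_gaps` function, the age-band grouping in the usage note and the default tolerance are illustrative assumptions, and a real analysis would likely cover several variables of the use environment.

```python
from collections import Counter


def coverage_gaps(sample_groups, target_shares: dict, tolerance: float = 0.10) -> dict:
    """Report subgroups whose dataset share deviates from the target share.

    `sample_groups` holds one subgroup label per sample (e.g. an age band).
    `target_shares` maps each subgroup to its share of the claimed population.
    Returns {subgroup: observed_share - target_share} for every subgroup whose
    absolute deviation exceeds `tolerance`; a subgroup that is missing from
    the dataset has an observed share of 0.0.
    """
    sample_groups = list(sample_groups)
    if not sample_groups:
        raise ValueError("dataset contains no samples")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    gaps = {}
    for group, target in target_shares.items():
        deviation = counts[group] / total - target
        if abs(deviation) > tolerance:
            gaps[group] = deviation
    return gaps
```

Called with ten samples split evenly between the bands "18-40" and "41-65", target shares of 0.3, 0.4 and 0.3 for "18-40", "41-65" and "65+", and a tolerance of 0.15, the function flags "18-40" as over-represented and "65+" as missing.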
In data planning, the organization shall document the following, as appropriate:
– data requirements;
– data availability;
– data licensing;
– data acquisition;
– data quality process;
– skills, roles and resources for data management process;
– data labelling and annotation;
– dataset composition and compartmentalizing;
– data reproducibility;
– data environment;
– data quality reporting.
The output of this planning shall be documented in a form suitable for the organization’s method
of operations.
When a learning method is used that does not use data as input for the AI component's learning,
data planning can be excluded, based on a documented rationale.
NOTE 2 For reinforcement learning, for example, there is no data planning process since the data are generated
through the learning process.
The manufacturer shall describe the effect of the following as app
...



