Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 1: Overview, terminology, and examples

This document provides the means for understanding and associating the individual documents of the ISO/IEC 5259 series and is the foundation for conceptual understanding of data quality for analytics and machine learning. It also discusses associated technologies and examples (e.g. use cases and usage scenarios).

Intelligence artificielle — Qualité des données pour les analyses de données et l’apprentissage automatique — Partie 1: Vue d'ensemble, terminologie et exemples

General Information

Status: Published
Publication Date: 01-Jul-2024
Current Stage: 6060 - International Standard published
Start Date: 02-Jul-2024
Due Date: 30-Apr-2024
Completion Date: 02-Jul-2024

Standard
ISO/IEC 5259-1:2024 - Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 1: Overview, terminology, and examples. Released: 02.07.2024
English language
19 pages
Draft
ISO/IEC FDIS 5259-1 - Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 1: Overview, terminology, and examples. Released: 18.03.2024
English language
19 pages
Draft
REDLINE ISO/IEC FDIS 5259-1 - Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 1: Overview, terminology, and examples. Released: 18.03.2024
English language
19 pages

Standards Content (Sample)


International Standard
ISO/IEC 5259-1
First edition
2024-07
Artificial intelligence — Data quality for analytics and machine learning (ML) —
Part 1:
Overview, terminology, and examples
Intelligence artificielle — Qualité des données pour les analyses de données et l’apprentissage automatique —
Partie 1: Vue d'ensemble, terminologie et exemples
Reference number
© ISO/IEC 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2024 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviated terms
5 Data quality concepts for analytics and machine learning
5.1 Data quality considerations for analytics and machine learning
5.1.1 General
5.1.2 Machine learning and data quality
5.1.3 Data characteristics that pose quality challenges for analytics and machine learning
5.1.4 Data sharing, data re-use and data quality for analytics and machine learning
5.2 Data quality concept framework for analytics and machine learning
5.2.1 Overview
5.2.2 Data quality management
5.2.3 Data quality governance
5.2.4 Data provenance
5.3 Data life cycle for analytics and ML
5.3.1 Overview
5.3.2 Data life cycle model
5.3.3 Processes across the multiple stages
Annex A (informative) Examples and scenarios
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of document should be noted. This document was drafted in accordance with the editorial rules of the ISO/
IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of any
claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC had not
received notice of (a) patent(s) which may be required to implement this document. However, implementers
are cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents and https://patents.iec.ch. ISO and IEC shall not be held
responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www.iso.org/iso/foreword.html.
In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 42, Artificial intelligence.
A list of all parts in the ISO/IEC 5259 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

Introduction
Data are the raw material for analytics and machine learning (ML) and data quality is a critical aspect for
related analytics and ML projects and systems. The aim of the ISO/IEC 5259 series is to provide tools and
methods to assess and improve the quality of data used for analytics and ML.
Other parts of the ISO/IEC 5259 series include:
— ISO/IEC 5259-2¹⁾ provides a data quality model, data quality measures and guidance on reporting data
quality in the context of analytics and ML. ISO/IEC 5259-2 builds on the ISO 8000 series, ISO/IEC 25012
and ISO/IEC 25024.
The aim of ISO/IEC 5259-2 is to enable organizations to achieve their data quality objectives and is
applicable to all types of organizations.
— ISO/IEC 5259-3 specifies requirements and provides guidance for establishing, implementing,
maintaining and continually improving the quality for data used in the areas of analytics and ML.
ISO/IEC 5259-3 does not define detailed processes, methods or measurement. Rather it defines the
requirements and guidance for a quality management process along with a reference process and
methods that can be tailored to meet the requirements in ISO/IEC 5259-3.
The requirements and recommendations set out in ISO/IEC 5259-3 are generic and are intended to be
applicable to all organizations, regardless of type, size or nature.
— ISO/IEC 5259-4 provides general common organizational approaches, regardless of type, size or nature
of the applying organization, to ensure data quality for training and evaluation in analytics and ML. It
includes guidelines on the data quality process for:
— supervised ML with regard to the labelling of data used for training ML systems, including common
organizational approaches for training data labelling;
— unsupervised ML;
— semi-supervised ML;
— reinforcement learning;
— analytics.
ISO/IEC 5259-4 is applicable to training and evaluation data that come from different sources, including
data acquisition and data composition, data pre-processing, data labelling, evaluation and data use.
ISO/IEC 5259-4 does not define specific services, platforms or tools.
— ISO/IEC 5259-5²⁾ provides a data quality governance framework for analytics and machine learning to
enable the governing bodies of organization to direct and oversee the implementation and operation of
data quality measures, management, and related processes with adequate controls throughout the DLC
model according to ISO/IEC 5259-1.
— ISO/IEC TR 5259-6³⁾ describes a visualization framework for data quality in analytics and ML. The aim is
to enable stakeholders using visualization methods to access the results of data quality measures. This
visualization framework supports data quality goals.
1) Under preparation. Stage at the time of publication: ISO/IEC FDIS 5259-2:2024.
2) Under preparation. Stage at the time of publication: ISO/IEC DIS 5259-5:2023.
3) Under preparation. Stage at the time of publication: ISO/IEC CD TR 5259-6:2023.

International Standard ISO/IEC 5259-1:2024(en)
Artificial intelligence — Data quality for analytics and
machine learning (ML) —
Part 1:
Overview, terminology, and examples
1 Scope
This document provides the means for understanding and associating the individual documents of the
ISO/IEC 5259 series and is the foundation for conceptual understanding of data quality for analytics and
machine learning. It also discusses associated technologies and examples (e.g. use cases and usage scenarios).
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 22989, Information technology — Artificial intelligence — Concepts and terminology
ISO/IEC 23053, Framework for Artificial Intelligence (AI) Systems Using Machine Learning (ML)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 22989 and ISO/IEC 23053 and
the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
data life cycle
life cycle of data
stages in the process of data usage from idea conception to its discontinuation
3.2
data originator
party that created the data and that can have rights
Note 1 to entry: A data originator can be an individual person.
Note 2 to entry: The data originator can be distinct from the natural or legal person(s) mentioned in, described by, or
implicitly or explicitly associated with the data. For example, PII can be collected by a data originator that identifies
other individuals. Those data subjects (PII Principals) can also have rights, in relation to the data set.
Note 3 to entry: Rights can include the right to publicity, right to display name, right to identity, right to prohibit data
use in a way that offends honourable mention.
[SOURCE: ISO/IEC 23751:2022, 3.2]

3.3
data holder
party that has legal control to authorize data processing of the data by other parties
Note 1 to entry: A data originator (3.2) can be a data holder.
[SOURCE: ISO/IEC 23751:2022, 3.4]
3.4
data user
party that is authorized to perform processing of data under the legal control of a data holder (3.3)
[SOURCE: ISO/IEC 23751:2022, 3.5]
3.5
data quality
characteristic of data that the data meet the organization's data requirements for a specified context
3.6
data quality characteristic
category of data quality attributes (3.13) that has a bearing on data quality (3.5)
[SOURCE: ISO/IEC 25012:2008, 4.4, modified — Definition revised.]
3.7
data quality model
defined set of characteristics which provides a framework for specifying data quality requirements (3.9) and
eval
...


FINAL DRAFT
International Standard
ISO/IEC FDIS 5259-1
ISO/IEC JTC 1/SC 42
Secretariat: ANSI
Voting begins on: 2024-04-01
Voting terminates on: 2024-05-27
Artificial intelligence — Data quality for analytics and machine learning (ML) —
Part 1:
Overview, terminology, and examples
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number
ISO/IEC FDIS 5259-1:2024(en) © ISO/IEC 2024

Introduction
Data are the raw material for analytics and machine learning (ML) and data quality is a critical aspect for
related analytics and ML projects and systems. The aim of the ISO/IEC 5259 series is to provide tools and
methods to assess and improve the quality of data used for analytics and ML.
Other parts of the ISO/IEC 5259 series include:
— ISO/IEC 5259-2¹⁾ provides a data quality model, data quality measures and guidance on reporting data
quality in the context of analytics and ML. ISO/IEC 5259-2 builds on the ISO 8000 series, ISO/IEC 25012
and ISO/IEC 25024.
The aim of ISO/IEC 5259-2 is to enable organizations to achieve their data quality objectives and is
applicable to all types of organizations.
— ISO/IEC 5259-3²⁾ specifies requirements and provides guidance for establishing, implementing,
maintaining and continually improving the quality for data used in the areas of analytics and ML.
ISO/IEC 5259-3 does not define detailed processes, methods or measurement. Rather it defines the
requirements and guidance for a quality management process along with a reference process and
methods that can be tailored to meet the requirements in ISO/IEC 5259-3.
The requirements and recommendations set out in ISO/IEC 5259-3 are generic and are intended to be
applicable to all organizations, regardless of type, size or nature.
— ISO/IEC 5259-4³⁾ provides general common organizational approaches, regardless of type, size or nature
of the applying organization, to ensure data quality for training and evaluation in analytics and ML. It
includes guidelines on the data quality process for:
— supervised ML with regard to the labelling of data used for training ML systems, including common
organizational approaches for training data labelling;
— unsupervised ML;
— semi-supervised ML;
— reinforcement learning;
— analytics.
ISO/IEC 5259-4 is applicable to training and evaluation data that come from different sources, including
data acquisition and data composition, data pre-processing, data labelling, evaluation and data use.
ISO/IEC 5259-4 does not define specific services, platforms or tools.
— ISO/IEC 5259-5⁴⁾ provides a data quality governance framework for analytics and machine learning to
enable the governing bodies of organization to direct and oversee the implementation and operation of
data quality measures, management, and related processes with adequate controls throughout the DLC
model according to ISO/IEC 5259-1.
— ISO/IEC TR 5259-6⁵⁾ describes a visualization framework for data quality in analytics and ML. The aim is
to enable stakeholders using visualization methods to access the results of data quality measures. This
visualization framework supports data quality goals.
1) Under preparation. Stage at the time of publication: ISO/IEC DIS 5259-2:2023.
2) Under preparation. Stage at the time of publication: ISO/IEC FDIS 5259-3:2024.
3) Under preparation. Stage at the time of publication: ISO/IEC FDIS 5259-4:2024.
4) Under preparation. Stage at the time of publication: ISO/IEC DIS 5259-5:2023.
5) Under preparation. Stage at the time of publication: ISO/IEC WD TR 5259-6:2023.



ISO/IEC FDIS 5259-1:202X(E)
ISO/IEC JTC 1/SC 42/WG 2
Secretariat: ANSI
Date: 2024-03-15
Artificial intelligence — Data quality for analytics and machine
learning (ML) —
Part 1:
Overview, terminology, and examples

FDIS stage
Warning for WDs and CDs
This document is not an ISO International Standard. It is distributed for review and comment. It is subject to
change without notice and may not be referred to as an International Standard.
Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of
which they are aware and to provide supporting documentation.


Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).
ISO and IEC draw attention to the possibility that the implementation of this document may involve the
use of (a) patent(s). ISO and IEC take no position concerning the evidence, validity or applicability of any
claimed patent rights in respect thereof. As of the date of publication of this document, ISO and IEC had not
received notice of (a) patent(s) which may be required to implement this document. However, implementers
are cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents and https://patents.iec.ch. ISO and IEC shall not be held
responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www.iso.org/iso/foreword.html.
In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 42, Artificial intelligence.
A list of all parts in the ISO/IEC 5259 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
Introduction
Data are the raw material for analytics and machine learning (ML) and data quality is a critical aspect for
related analytics and ML projects and systems. This document is part of the ISO/IEC 5259 series. The aim of
the ISO/IEC 5259 series is to provide tools and methods to assess and improve the quality of data used for
analytics and ML.
Other parts of the ISO/IEC 5259 series include:
ISO/IEC 5259-2, Artificial Intelligence — Data quality for analytics and machine learning (ML) — Part 2: Data
quality measures
— ISO/IEC 5259-2 provides a data quality model, data quality measures and guidance on
reporting data quality in the context of analytics and ML. ISO/IEC 5259-2 builds on the ISO 8000 series,
ISO/IEC 25012 and ISO/IEC 25024.
The aim of ISO/IEC 5259-2 is to enable organizations to achieve their data quality objectives and is
applicable to all types of organizations.
ISO/IEC 5259-3, Artificial Intelligence — Data quality for analytics and machine learning (ML) — Part 3: Data
quality management requirements and guidelines
— ISO/IEC 5259-3 specifies requirements and provides guidance for establishing,
implementing, maintaining and continually improving the quality for data used in the areas of analytics
and ML.
ISO/IEC 5259-3 does not define detailed processes, methods or measurement. Rather it defines the
requirements and guidance for a quality management process along with a reference process and
methods that can be tailored to meet the requirements in ISO/IEC 5259-3.
The requirements and recommendations set out in ISO/IEC 5259-3 are generic and are intended to be
applicable to all organizations, regardless of type, size or nature.
ISO/IEC 5259-4, Artificial Intelligence — Data quality for analytics and machine learning (ML) — Part 4: Data
quality process framework
— ISO/IEC 5259-4 provides general common organizational approaches, regardless of type, size or nature of
the applying organization, to ensure data quality for training and evaluation in analytics and ML. It
includes guidelines on the data quality process for:
— supervised ML with regard to the labelling of data used for training ML systems, including common
organizational approaches for training data labelling;
— unsupervised ML;
— semi-supervised ML;
— reinforcement learning;
— analytics.
ISO/IEC 5259-4 is applicable to training and evaluation data that come from different sources, including
data acquisition and data composition, data pre-processing, data labelling, evaluation and data use.
ISO/IEC 5259-4 does not define specific services, platforms or tools.
ISO/IEC 5259-5, Artificial intelligence — Data quality for analytics and machine learning (ML) — Part 5: Data
quality governance framework
— ISO/IEC 5259-5 provides a data quality governance framework for analytics and machine
learning to enable the governing bodies of organization to direct and oversee the implementation and
operation of data quality measures, management, and related processes with adequate controls
throughout the DLC model according to ISO/IEC 5259-1.
ISO/IEC TR 5259-6, Artificial intelligence – Data quality for analytics and machine learning (ML) – Part 6:
Visualization framework for data quality
— ISO/IEC TR 5259-6 describes a visualization framework for data quality in analytics
and ML. The aim is to enable stakeholders using visualization methods to access the results of data quality
measures. This visualization framework supports data quality goals.

Under preparation. Stage at the time of publication: ISO/IEC DIS 5259-2:2023.
Under preparation. Stage at the time of publication: ISO/IEC FDIS 5259-3:2024.
Under preparation. Stage at the time of publication: ISO/IEC FDIS 5259-4:2024.
Under preparation. Stage at the time of publication: ISO/IEC DIS 5259-5:2023.
Under preparation. Stage at the time of publication: ISO/IEC WD TR 5259-6:2023.
