Data quality — Part 230: Sensor data — Guidelines for data cleansing

This document specifies guidelines for cleansing data that are recorded by sensors as a stream of single, discrete digital values, based on quality characteristics and quality measures defined in ISO 8000-210 and 220, respectively. The following are within the scope of this document: — principles for sensor data cleansing; — the process of sensor data cleansing; — implementation requirements for sensor data cleansing; — cleansing methods for data anomalies; — examples of sensor data cleansing. The following are outside the scope of this document: — detailed algorithms or methods to detect and repair data anomalies.


General Information

Status
Not Published
Current Stage
5000 - FDIS registered for formal approval
Start Date
17-Nov-2025
Completion Date
10-Jan-2026
Draft

ISO/DTS 8000-230 - Data quality — Part 230: Sensor data — Guidelines for data cleansing Released: 7.01.2026

English language
43 pages
Draft

REDLINE ISO/DTS 8000-230 - Data quality — Part 230: Sensor data — Guidelines for data cleansing Released: 7.01.2026

English language
43 pages

Frequently Asked Questions

ISO/DTS 8000-230 is a draft published by the International Organization for Standardization (ISO). Its full title is "Data quality — Part 230: Sensor data — Guidelines for data cleansing". This standard covers: This document specifies guidelines for cleansing data that are recorded by sensors as a stream of single, discrete digital values, based on quality characteristics and quality measures defined in ISO 8000-210 and 220, respectively. The following are within the scope of this document: — principles for sensor data cleansing; — the process of sensor data cleansing; — implementation requirements for sensor data cleansing; — cleansing methods for data anomalies; — examples of sensor data cleansing. The following are outside the scope of this document: — detailed algorithms or methods to detect and repair data anomalies.

ISO/DTS 8000-230 is classified under the following ICS (International Classification for Standards) categories: 25.040.40 - Industrial process measurement and control. The ICS classification helps identify the subject area and facilitates finding related standards.

You can purchase ISO/DTS 8000-230 directly from iTeh Standards. The document is available in PDF format and is delivered instantly after payment. Add the standard to your cart and complete the secure checkout process. iTeh Standards is an authorized distributor of ISO standards.

Standards Content (Sample)


FINAL DRAFT Technical Specification
ISO/TC 184/SC 4
Secretariat: ANSI
Data quality — Part 230: Sensor data — Guidelines for data cleansing
Voting begins on: 2026-01-21
Voting terminates on: 2026-03-18
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY
RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL,
COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE
TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH
REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
© ISO 2026
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
0.1 Foundations of the ISO 8000 series
0.2 Understanding more about the ISO 8000 series
0.3 Role of this document
0.4 Benefits of the ISO 8000 series
1 Scope
2 Normative references
3 Terms and definitions
3.1 Terms relating to sensor data
3.2 Terms relating to data quality
3.3 Terms relating to measurement
4 Principles for sensor data cleansing
5 Process for sensor data cleansing
5.1 General
5.2 Functional model of sensor data cleansing
5.2.1 Perform sensor data cleansing (A0)
5.2.2 Prepare measurement plan (A1)
5.2.3 Measure data quality (A2)
5.2.4 Improve data quality (A3)
6 Implementation requirements
Annex A (informative) Document identification
Annex B (informative) Cleansing methods for data anomaly
Annex C (informative) Examples for sensor data cleansing
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 184, Automation systems and integration,
Subcommittee SC 4, Industrial data.
A list of all parts in the ISO 8000 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
0.1 Foundations of the ISO 8000 series
Digital data deliver value by enhancing all aspects of organizational performance including:
— operational effectiveness and efficiency;
— safety and security;
— reputation with customers and the wider public;
— compliance with statutory regulations;
— innovation;
— consumer costs, revenues and stock prices.
In addition, many organizations are now addressing these considerations with reference to the United
Nations Sustainable Development Goals 1).
The influence on performance originates from data being the formalized representation of information.
ISO 8000-2 defines information as “knowledge concerning objects, such as facts, events, things, processes, or
ideas, including concepts, that within a certain context has a particular meaning”. This information enables
organizations to make reliable decisions. This decision making can be performed by human beings directly
and also by automated data processing including artificial intelligence systems.
Through widespread adoption of digital computing and associated communication technologies,
organizations become dependent on digital data. This dependency amplifies the negative consequences of
a lack of quality in these data, namely a decrease in organizational performance.
The biggest impact of digital data comes from two key factors:
— the data having a structure that reflects the nature of the subject matter;
EXAMPLE 1 A research scientist writes a report using a software application for word processing. This report
includes a table that uses a clear, logical layout to show results from an experiment. These results indicate how
material properties vary with temperature. The report is read by a designer, who uses the results to create a product
that works in a range of different operating temperatures.
— the data being computer processable (machine readable) rather than just being for a person to read and
understand.
EXAMPLE 2 A research scientist uses a database system to store the results of experiments on a material. This
system controls the format of different values in the data set. The system generates an output file of digital data.
This file is processed by a software application for engineering analysis. The application determines the optimum
geometry when using the material to make a product.
ISO 9000 [1] explains that quality is not an abstract concept of absolute perfection. Quality is actually the
conformance of characteristics to requirements. This actuality means that any item of data can be of high
quality for one purpose but not for a different purpose. The quality is different because the requirements are
different between the two purposes.
EXAMPLE 3 Time data are processed by calendar applications and also by control systems for propulsion units
on spacecraft. These data include start times for meetings in a calendar application and activation times in a control
system. These start times require less precision than the activation times.
1) https://sdgs.un.org/goals
The nature of digital data is fundamental to establishing requirements that are relevant to the specific
decisions made by each organization.
EXAMPLE 4 ISO 8000-1 [2] identifies that data have syntactic (format), semantic (meaning) and pragmatic
(usefulness) characteristics.
To support the delivery of high-quality data, the ISO 8000 series addresses:
— data governance, data quality management and maturity assessment;
EXAMPLE 5 ISO 8000-61 [3] specifies a process reference model for data quality management.
— creating and applying requirements for data and information;
EXAMPLE 6 ISO 8000-110 [4] specifies how to exchange characteristic data that are master data.
— monitoring and measuring information and data quality;
EXAMPLE 7 ISO 8000-8 [5] specifies approaches to measuring information and data quality.
— improving data and, consequently, information quality;
EXAMPLE 8 ISO/TS 8000-81 [6] specifies an approach to data profiling, which identifies opportunities to improve
data quality.
— issues that are specific to the type of content in a data set.
EXAMPLE 9 ISO/TS 8000-311 [7] specifies how to address quality considerations for product shape data.
Data quality management covers all aspects of data processing, including creating, collecting, storing,
maintaining, transferring, exploiting and presenting data to deliver information.
Effective data quality management is systemic and systematic, requiring an understanding of the root causes
of data quality issues. This understanding is the basis for not just correcting existing nonconformities but
for also implementing solutions that prevent future reoccurrence of those nonconformities.
EXAMPLE 10 If a data set includes dates in multiple formats including “yyyy-mm-dd”, “mm-dd-yy” and “dd-mm-yy”,
then data cleansing can correct the consistency of the values. Such cleansing requires additional information, however,
to resolve ambiguous entries (such as, “04-05-20”). The cleansing also cannot address any process issues and people
issues, including training, that have caused the inconsistency.
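The ambiguity described in EXAMPLE 10 can be made concrete with a minimal Python sketch. It tries the three formats named in the example and declines to choose a value when an entry parses to more than one distinct date. The names `candidate_dates` and `normalize` are illustrative assumptions and are not defined by the ISO 8000 series.

```python
from datetime import datetime

# The three formats named in EXAMPLE 10, written as strptime patterns.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m-%d-%y", "%d-%m-%y"]


def candidate_dates(text):
    """Return every distinct ISO date that the text could represent."""
    found = set()
    for fmt in CANDIDATE_FORMATS:
        try:
            found.add(datetime.strptime(text, fmt).date().isoformat())
        except ValueError:
            pass  # the text does not match this format
    return sorted(found)


def normalize(text):
    """Return (iso_date, status). Ambiguous entries are reported, not guessed."""
    options = candidate_dates(text)
    if len(options) == 1:
        return options[0], "ok"
    if not options:
        return None, "unparseable"
    return None, "ambiguous"


print(normalize("2020-05-04"))  # one matching format
print(normalize("25-12-20"))    # only dd-mm-yy is a valid reading
print(normalize("04-05-20"))    # two valid readings, needs additional information
```

Entries reported as ambiguous still need the additional information mentioned in the example, such as knowledge of which system produced the record.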
0.2 Understanding more about the ISO 8000 series
ISO 8000-1 [2] provides a detailed explanation of the structure and scope of the ISO 8000 series.
ISO 8000-2 specifies the single, common vocabulary for the ISO 8000 series. This vocabulary is ideal
reading material by which to understand the overall subject matter of data quality. ISO 8000-2 presents the
vocabulary structured by a series of topic areas (e.g. terms relating to quality and terms relating to data and
information).
ISO has identified ISO 8000-1 [2], ISO 8000-2 and ISO 8000-8 [5] as horizontal deliverables, i.e. deliverables
dealing with a subject relevant to a number of committees or sectors or of crucial importance to ensure
coherence across standardization deliverables.
0.3 Role of this document
As a contribution to the overall capability of the ISO 8000 series, this document provides guidelines for
improving the quality of sensor data by cleansing the data anomalies that degrade its quality characteristics.
The guidelines include principles, the process and implementation requirements for sensor data cleansing.
The process performs sensor data cleansing using the data quality characteristics and anomalies defined
in ISO 8000-210 and the data quality measures defined in ISO 8000-220. To aid understanding, the guidelines
also present methods and examples of cleansing data anomalies. This document describes procedures and
methods for improving the quality of sensor data collected from IoT or sensor network environments prior
to data analysis and exploitation.
This document supports activities that affect:
— one or more information systems;
— data flows within the organization and with external organizations;
— any phase of the data life cycle.
Organizations can use this document on its own or in conjunction with other parts in the ISO 8000 series.
Annex A contains an identifier that conforms to ISO/IEC 8824-1 [8]. The identifier unambiguously identifies
this document in an open information system.
0.4 Benefits of the ISO 8000 series
By implementing parts of the ISO 8000 series to improve organizational performance, an organization
achieves the following benefits:
— objective validation of the foundations for digital transformation of the organization;
— a sustainable basis for data in digital form becoming a fundamental asset class the organization relies on
to deliver value;
— securing evidence-based trust from other parties (including supply chain partners and regulators) about
the repeatability and reliability of data and information processing in the organization;
— portability of data with resulting protection against loss of intellectual property and re-usability across
the organization and applications;
— effective and efficient interoperability between all parties in a supply chain to achieve traceability of
data back to original sources;
— readiness to acquire or supply services where the other party expects to work with common understanding
of explicit data requirements.

FINAL DRAFT Technical Specification ISO/DTS 8000-230:2026(en)
Data quality —
Part 230:
Sensor data — Guidelines for data cleansing
1 Scope
This document specifies guidelines to improve data quality by cleansing sensor data anomalies that affect
low inherent quality characteristics.
The following are within the scope of this document:
— principles for sensor data cleansing;
— the process for sensor data cleansing;
— implementation requirements for sensor data cleansing;
— list of data anomaly detection and repair methods (see Annex B);
— examples of sensor data cleansing (see Annex C).
The following are outside the scope of this document:
— algorithms or detailed methods to detect and repair data anomalies;
— the process of sensor data cleansing for real time processing.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 8000-2, Data quality — Part 2: Vocabulary
ISO 8000-210, Data quality – Part 210: Sensor data: Data quality characteristics
ISO 8000-220, Data quality – Part 220: Sensor data: Quality measurement
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 8000-2 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/

3.1 Terms relating to sensor data
3.1.1
sensor
device that observes and measures a property of a natural phenomenon, system or human-made process
and converts that measurement into a signal
Note 1 to entry: A sensor can exist not only in a single physical form but also in a sensor-based variant such as a virtual
sensor.
[SOURCE: ISO/IEC 29182-2:2013 [9], 2.1.5, modified — “system” has been added to the definition, “man”
changed to “human”, and “physical” deleted from the definition. Note 1 to entry has been changed.]
3.1.2
sensor network
system of spatially distributed sensor (3.1.1) nodes interacting with each other and, depending on
applications, possibly with other infrastructure in order to acquire, process, transfer, and provide
information extracted from its environment with a primary function of information gathering and possible
control capability
Note 1 to entry: Distinguishing features of a sensor network can include wide area coverage, use of radio networks,
flexibility of purpose, self-organization, openness, and providing data for multiple applications.
[SOURCE: ISO/IEC 29182-2:2013 [9], 2.1.6]
3.1.3
sensor node
sensor network (3.1.2) element that includes at least one sensor (3.1.1) and, optionally actuators with
communication capabilities and data processing capabilities
Note 1 to entry: It can include additional application capabilities.
Note 2 to entry: A hybrid sensor (3.1.1) composed of multiple sensors is considered a sensor node that includes multiple
sensors.
[SOURCE: ISO/IEC 29182-2:2013 [9], 2.1.8, modified — Note 2 to entry has been added to the definition.]
3.1.4
sensor data
data produced by a sensor node (3.1.3)
Note 1 to entry: Sensor data consist of a stream of digital values converted from sensor (3.1.1) signals, and information
such as the identification of each sensor (3.1.1) and timestamps of data acquired by the sensor node (3.1.3).
3.1.5
internet of things
IoT
infrastructure of interconnected entities, people, systems and information resources together with services
which processes and reacts to information from the physical world and virtual world
[SOURCE: ISO/IEC 20924:2024 [10], 3.2.4]
3.2 Terms relating to data quality
3.2.1
data anomaly
item of data in a data set, where the item deviates from the expected pattern for items in the data set

3.2.2
quality characteristic
inherent characteristic of an object related to a requirement
Note 1 to entry: ISO 8000-8 [5] uses the term quality dimension as a synonym for quality characteristics that determine
the pragmatic quality of data.
[SOURCE: ISO 9000:2015 [11], 3.10.2, modified — Note 1 to entry has been added.]
3.2.3
data cleansing
process used to improve data quality by detecting and repairing defects and errors in data
Note 1 to entry: In ISO 8000-2, data error is defined as non-fulfilment of a data requirement and also noted as
synonymous with data nonconformity.
Note 2 to entry: In ISO 9000 [1], defect is defined as non-fulfilment of a requirement related to an intended or specified
use.
Note 3 to entry: In ISO 8000-61 [3], data cleansing is specified as a sub-process of data quality improvement.
[SOURCE: ISO 13008:2022 [12], 3.4, modified — “correcting (or removing)” is changed to “repairing” and
Notes 1, 2 and 3 to entry are added.]
3.2.4
data profiling
activities that are performed to understand the data structures and system rules that affect the extraction
of audit data
[SOURCE: ISO 21378:2019 [13], 3.6]
3.3 Terms relating to measurement
3.3.1
data quality measure
quality measure
variable to which a value is assigned as the result of measuring a data quality characteristic (3.2.2)
Note 1 to entry: Adapted from ISO/IEC 25012:2008 [14], 4.5.
4 Principles for sensor data cleansing
— When a data anomaly occurs due to sensor or system errors, the quality of the data shall be improved by
deleting or modifying the anomalous data.
— When a data anomaly reflects actual phenomena in the field, whether to maintain, delete, or modify the
anomalous data shall be decided according to the stated purpose of an intended or specified use.
— When the cause of data anomaly is not clearly identified, data deletion or modification shall be minimized
to avoid changing the original correct data.
— When a data anomaly cannot be deleted or modified for any reason, a flag or mark shall be placed on the
data so that the person in charge of the data can recognize it and take appropriate actions.
— Data cleansing shall be carried out with the consent of stakeholders.
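The five principles above can be read as a decision rule. The following minimal Python sketch encodes one possible reading. The enumeration names, the function `choose_action` and the order in which the checks are applied are assumptions made for illustration; this document does not prescribe them.

```python
from enum import Enum


class Cause(Enum):
    SENSOR_OR_SYSTEM_ERROR = "sensor or system error"
    ACTUAL_PHENOMENON = "actual phenomenon in the field"
    UNKNOWN = "cause not clearly identified"


class Action(Enum):
    REPAIR = "delete or modify the anomalous data"
    DECIDE_BY_PURPOSE = "maintain, delete or modify according to the intended use"
    MINIMIZE_CHANGES = "keep deletion and modification to a minimum"
    FLAG = "flag the data for the person in charge"
    OBTAIN_CONSENT = "obtain stakeholder consent before cleansing"


def choose_action(cause, can_change=True, consent=True):
    """Map an anomaly to an action, following one reading of the principles."""
    if not consent:          # fifth principle: consent of stakeholders
        return Action.OBTAIN_CONSENT
    if not can_change:       # fourth principle: flag what cannot be changed
        return Action.FLAG
    if cause is Cause.SENSOR_OR_SYSTEM_ERROR:   # first principle
        return Action.REPAIR
    if cause is Cause.ACTUAL_PHENOMENON:        # second principle
        return Action.DECIDE_BY_PURPOSE
    return Action.MINIMIZE_CHANGES              # third principle


print(choose_action(Cause.SENSOR_OR_SYSTEM_ERROR).value)
print(choose_action(Cause.UNKNOWN).value)
print(choose_action(Cause.ACTUAL_PHENOMENON, can_change=False).value)
```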

5 Process for sensor data cleansing
5.1 General
The sensor data cleansing process is designed with the following considerations in mind:
— The plan-do-check-act concept used to define the data quality management process in ISO 8000-61 [3]
is applied to the data cleansing process. In other words, the process is designed with the following
activities: provide a quality measurement plan (plan), measure data quality (do and check) and improve
data quality (act). In addition, once the plan is provided, activities of measurement (do and check) and
improvement (act) are repeatedly performed to determine whether the sensor data satisfy quality
requirements.
— This process is designed for post processing (or offline mode), not for real time processing (or online
mode).
NOTE 1 Because sensor data are collected as real-time streams in very large volumes, cleansing them takes
time. Real-time data cleansing is therefore not realistic in environments that require rapid decision-making.
It can only be performed in special environments where the data anomalies are already known and do not
need to be checked or verified.
— The process is represented by the IDEF0 (integration definition for function modelling) functional model
defined by ISO/IEC/IEEE 31320-1 [15]. This model breaks down a process into hierarchical activities to
show what activities are performed and how. It helps analyse and design processes by clearly showing
the inputs, outputs, controls, and mechanisms of each activity.
NOTE 2 A functional model is identified by a model name, an IDEF0 box is identified by a box name, and an
IDEF0 arrow segment is identified by an arrow label. An identifier is written in title case, i.e. the first letter of
each word is capitalized. See ISO/IEC/IEEE 31320-1 [15] for details on the notation in the functional model.
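The plan-do-check-act structure described above, with the plan prepared once and the measurement and improvement activities repeated, can be sketched as a small offline loop. This is a minimal Python sketch under stated assumptions: the function `cleanse`, the callables passed to it, the `max_rounds` limit and the toy rule that treats `None` as a missing value are all hypothetical and only illustrate the control flow.

```python
def cleanse(data, prepare_plan, measure, improve, max_rounds=5):
    """Plan once, then repeat measure and improve until requirements are met."""
    plan = prepare_plan(data)                      # plan
    for round_no in range(1, max_rounds + 1):
        ok, findings = measure(data, plan)         # do and check
        if ok or not findings:
            return data, {"rounds": round_no, "meets_requirements": ok}
        data = improve(data, findings)             # act
    ok, _ = measure(data, plan)
    return data, {"rounds": max_rounds, "meets_requirements": ok}


# Toy callables, assumed for this example only.
def prepare_plan(data):
    return {"max_missing": 0}


def measure(data, plan):
    missing = [i for i, v in enumerate(data) if v is None]
    return len(missing) <= plan["max_missing"], missing


def improve(data, findings):
    # Hypothetical repair: drop the flagged positions.
    return [v for i, v in enumerate(data) if i not in findings]


cleaned, report = cleanse([1.0, None, 2.0], prepare_plan, measure, improve)
print(cleaned, report)
```

The loop processes a stored batch, which matches the post-processing (offline) design stated above.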
5.2 Functional model of sensor data cleansing
5.2.1 Perform sensor data cleansing (A0)
The functional model of the sensor data cleansing process is represented by the A-0 context diagram for
perform sensor data cleansing (see Figure 1).

Figure 1 — A-0 context diagram for perform sensor data cleansing (model diagram A-0)
This process performs data cleansing to improve the quality of sensor data prior to data analysis or
exploitation. It accepts sensor data as input and, taking into account quality requirements, the quality
characteristics defined in ISO 8000-210 and the quality measures defined in ISO 8000-220, outputs the
sensor data together with a quality report.
Figure 2 — Perform sensor data cleansing (model diagram A0)

As in Figure 2, this process consists of three activities: prepare measurement plan (A1), measure data
quality (A2) and improve data quality (A3).
NOTE 1 Figure 2 is a child diagram of Figure 1.
Each activity at the lowest level of the process is described by the following elements:
— a title, which is a descriptive heading for an activity (modified from ISO/IEC TR 24774:2010 [16]);
— a purpose, which describes the goal of performing an activity (modified from ISO/IEC TR 24774:2010 [16]);
— tasks, which are required, recommended, or permissible actions, intended to contribute to the
achievement of the goal of an activity (modified from ISO/IEC/IEEE 24774:2021 [17]);
— inputs, which are items transformed into output by an activity (modified from ISO/IEC/IEEE 31320-1 [15]);
— outputs, which are product, result or service produced by an activity (modified from
ISO/IEC/IEEE 24774:2021 [17]);
— controls, which are conditions or constraints required for an activity to produce correct output (modified
from ISO/IEC/IEEE 31320-1 [15]);
— mechanisms, which are the means used by an activity to transform input into output (modified from
ISO/IEC/IEEE 31320-1 [15]).
NOTE 2 These elements are adapted from those of process description in ISO/IEC TR 24774:2010 [16],
ISO/IEC/IEEE 24774:2021 [17] and those of functional model in ISO/IEC/IEEE 31320-1 [15] to fit the activity definition.
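The seven descriptive elements listed above map naturally onto a record type. The following minimal Python sketch shows one way to hold an activity description. The class `Activity` and its field names are assumptions for illustration, and the sample instance copies the entries given for develop measurement plan (A13) in 5.2.2.4.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    node: str                       # IDEF0 node number, e.g. "A13"
    title: str
    purpose: str
    tasks: list = field(default_factory=list)
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    controls: list = field(default_factory=list)
    mechanisms: list = field(default_factory=list)


a13 = Activity(
    node="A13",
    title="Develop measurement plan",
    purpose="Establish the measurement plan used to measure the quality of sensor data",
    tasks=[
        "define methods and procedures to measure data quality",
        "determine criteria and information necessary to assess data quality",
    ],
    inputs=[],                      # the text lists "None" as the input
    outputs=["Measurement plan"],
    controls=[
        "Data quality goal",
        "Data profile with quality issues",
        "Quality measures defined in ISO 8000-220",
    ],
    mechanisms=["Software", "Human"],
)

print(a13.node, a13.title, "->", a13.outputs)
```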
5.2.2 Prepare measurement plan (A1)
5.2.2.1 General
This activity is intended to prepare a plan for measuring sensor data quality based on quality requirements,
quality characteristics, quality measures and sensor data.

Figure 3 — Prepare measurement plan (model diagram A1)
As in Figure 3, this activity consists of three sub-activities, establish data quality goal (A11), perform data
profiling (A12) and develop measurement plan (A13).
5.2.2.2 Establish data quality goal (A11)
Purpose: Establish data quality goal is to determine the data quality-related goals that reflect quality
requirements of sensor data.
Task:
— gather data quality requirements from stakeholders;
— determine the goal to achieve based on data quality requirements.
Input: Sensor data collected from sensor nodes.
Output: Data quality goal represented by data quality requirements such as quality measure levels of quality
characteristics of interest.
Control: Quality requirements, quality characteristics and corresponding data anomalies defined in
ISO 8000-210, and quality measures defined in ISO 8000-220.
Mechanism: Software/Human
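A data quality goal expressed as required levels of quality measures can be sketched as a small set of targets. This is a minimal Python sketch: the class `QualityTarget`, the function `unmet_targets` and the characteristic and measure labels are hypothetical placeholders, not the names defined in ISO 8000-210 or ISO 8000-220.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityTarget:
    characteristic: str   # hypothetical label for a quality characteristic
    measure: str          # hypothetical label for a quality measure
    minimum: float        # required level, assumed here to lie in [0, 1]


def unmet_targets(goal, measured):
    """Return the targets whose measured value is absent or below the minimum."""
    return [t for t in goal if measured.get(t.measure, 0.0) < t.minimum]


goal = [
    QualityTarget("completeness", "ratio_of_present_values", 0.98),
    QualityTarget("accuracy", "ratio_of_in_range_values", 0.95),
]
measured = {"ratio_of_present_values": 0.99, "ratio_of_in_range_values": 0.90}

for target in unmet_targets(goal, measured):
    print("not met:", target.characteristic, target.minimum)
```

Treating an unmeasured target as not met is a design choice of this sketch; it keeps gaps in measurement visible instead of passing them silently.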
5.2.2.3 Perform data profiling (A12)
Purpose: Perform data profiling is to acquire historical sensor data and profile them. Through
this activity, the profile and data quality issues of sensor data are extracted from a cluster of historical
occurrences of the relevant sensor data.
Task:
— collect historical sensor data;

— perform data profiling for the sensor data.
[6]
NOTE Refer to ISO/TS 8000-81 for data profiling.
Input: Historical sensor data
Output: Data profile with quality issues
Control: Data quality goal
Mechanism: Software that provides statistical, mathematical, or data learning techniques, or a human who
inputs information interactively or manually.
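As an illustration of the statistical end of this activity, the following minimal Python sketch computes a basic profile of historical readings. The function `profile`, the record layout of (timestamp in seconds, value) pairs and the chosen statistics are assumptions; ISO/TS 8000-81 remains the reference for data profiling.

```python
import statistics


def profile(readings):
    """Profile a list of (timestamp_seconds, value) pairs; value may be None."""
    values = [v for _, v in readings if v is not None]
    times = [t for t, _ in readings]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "count": len(readings),
        "missing": len(readings) - len(values),
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "mean": statistics.fmean(values) if values else None,
        "median_interval": statistics.median(gaps) if gaps else None,
        # Repeated or backwards timestamps are a candidate quality issue.
        "non_increasing_timestamps": sum(1 for g in gaps if g <= 0),
    }


history = [(0, 20.0), (10, 21.0), (20, None), (30, 22.0), (30, 23.0)]
print(profile(history))
```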
5.2.2.4 Develop measurement plan (A13)
Purpose: Develop measurement plan is to establish the measurement plan that includes the methods,
procedures, criteria, and rationale that will be used to measure the quality of sensor data in accordance
with the reference data patterns.
Task:
— define methods and procedures to measure data quality;
— determine criteria and information necessary to assess data quality.
Input: None
Output: Measurement plan
Control: Data quality goal, data profile with quality issues, and quality measures defined in ISO 8000-220
Mechanism: Software/Human
5.2.3 Measure data quality (A2)
5.2.3.1 General
This activity is intended to derive an anomaly detection model and quality measure values of sensor data
based on the established measurement plan, and to identify opportunities for quality improvement.

Figure 4 — Measure data quality (model diagram A2)
As in Figure 4, this activity consists of three sub-activities, derive anomaly detection model (A21), find
quality improvement opportunity (A22), and report quality result (A23).
5.2.3.2 Derive anomaly detection model (A21)
Purpose: Derive anomaly detection model is to analyse data patterns in sensor data and determine a model
that can detect data anomalies.
Task:
— analyse data patterns;
— determine an anomaly detection model.
NOTE Refer to Clause B.1 for anomaly detection models, which have functions that identify the type of anomaly or
detect anomalous data values included in sensor data.
Input: Sensor data
Output: Anomaly detection model
Control: Measurement plan
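Detailed detection algorithms are outside the scope of this document, so the following is only a minimal Python sketch of what deriving a model from data patterns could look like. It assumes a simple range model built from the median and the median absolute deviation (MAD) of historical values. The functions `derive_range_model` and `detect` and the parameter `k` are hypothetical and are not taken from Clause B.1.

```python
import statistics


def derive_range_model(history, k=3.0):
    """Derive (low, high) bounds from historical values using median and MAD."""
    center = statistics.median(history)
    mad = statistics.median([abs(v - center) for v in history])
    # 1.4826 scales the MAD so it is comparable to a standard deviation
    # when the values are normally distributed.
    spread = 1.4826 * mad
    return center - k * spread, center + k * spread


def detect(values, model):
    """Return the indices of missing values and values outside the bounds."""
    low, high = model
    return [i for i, v in enumerate(values) if v is None or not (low <= v <= high)]


model = derive_range_model([20.0, 20.5, 21.0, 20.2, 20.8])
print(model)
print(detect([20.4, 35.0, None, 19.0], model))
```

If the historical values are all identical, the MAD is zero and every differing value is reported, so a practical model would need a minimum spread.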
5.2.3.3 Find quality improvement opportunity (A22)
Purpose: Find quality improvement opportunity is to assess quality measures based on the anomaly
detection model and find opportunities to improve the quality of sensor data by modifying data
anomalies.
Task:
— Assess quality characteristic-specific quality measures: Measure quality characteristic-specific quality
measures defined in ISO 8000-220. If they satisfy quality requirements, the task stops since the sensor
data do not require quality improvement. Otherwise, the following additional task is carried out for data
anomalies that affect data quality.
— Assess anomaly-specific quality measures: Detect data anomalies included in sensor data, and measure
anomaly-specific quality measures defined in ISO 8000-220. If there exists any data anomaly modifiable
to improve quality characteristic-specific quality measures (or to reduce anomaly-specific quality
measures), the sensor data are those with quality improvement opportunity. Otherwise, the sensor data
are those without quality improvement opportunity.
Input: Sensor data
Output:
— sensor data with quality improvement opportunity that identifies data anomalies modifiable to improve
the quality of sensor data;
— sensor data without quality improvement opportunity that do not require quality improvement because
they meet quality requirements, or that cannot be improved because no data anomaly to improve the
quality of sensor data is identified.
Control: Anomaly detection model, measurement plan, quality characteristics and quality measures defined
in ISO 8000-220.
Mechanism: Software/Human
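The two assessment tasks of this activity form a short decision flow: stop if the characteristic-specific measure already satisfies the requirement, otherwise look for modifiable anomalies. The following minimal Python sketch illustrates that flow. The measure used here (the fraction of non-missing values) is a stand-in and not the definition from ISO 8000-220, and the function names and callables are hypothetical.

```python
def find_improvement_opportunity(values, required_level, detect, is_repairable):
    """Two-stage assessment sketched from the tasks of A22."""
    present = sum(1 for v in values if v is not None)
    level = present / len(values) if values else 0.0

    # Stage 1: characteristic-specific measure. Stop if the requirement is met.
    if level >= required_level:
        return {"meets_requirements": True, "opportunity": False, "anomalies": []}

    # Stage 2: anomaly-specific assessment of the detected anomalies.
    repairable = [i for i in detect(values) if is_repairable(values, i)]
    return {
        "meets_requirements": False,
        "opportunity": bool(repairable),
        "anomalies": repairable,
    }


# Toy callables, assumed for this example only.
def detect_missing(values):
    return [i for i, v in enumerate(values) if v is None]


def has_both_neighbours(values, i):
    # A gap is treated as repairable if it could be interpolated.
    return 0 < i < len(values) - 1 and values[i - 1] is not None and values[i + 1] is not None


print(find_improvement_opportunity([1.0, None, 3.0, None], 0.9, detect_missing, has_both_neighbours))
```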
5.2.3.4 Report quality result (A23)
Purpose: Report quality result is to report the quality result of sensor data.
Task:
— gather quality information including quality requirements, problems, and improvements;
— write up the report that reflects quality improvement efforts.
Input: Sensor data without quality improvement opportunity that do not require quality improvement
because they meet quality requirements, or that cannot be improved because no data anomaly to improve is
identified.
Output: Sensor data with quality report that includes quality information such as quality requirements,
improvements by cleansing, and problems. The sensor data fall into one of two categories:
a) the sensor data that meet quality requirements and are usable for data analysis or exploitation;
b) the sensor data that do not meet quality requirements, whose quality can no longer be improved,
and therefore are not usable for data analysis or exploitation. In this case, the sensor data are either
discarded or subjected to more in-depth cause analysis for poor data quality.
Control: None
Mechanism: Software/Human
5.2.4 Improve data quality (A3)
5.2.4.1 General
This activity is intended to improve data quality by cleansing sensor data based on the identified data
quality improvement opportunity and provide cleansed sensor data.

Figure 5 — Improve data quality (model diagram A3)
As in Figure 5, this activity consists of three sub-activities: establish data repair plan, confirm data repair
plan, and execute data repair.
5.2.4.2 Establish data repair plan (A31)
Purpose: Establish data repair plan is to establish a specific action plan to cleanse sensor data based on the
identified quality improvement opportunity.
Task:
— list alternative methods for repairing data anomalies;
— determine data repair plans.
Input: Sensor data with quality improvement opportunity
Output: Data repair plan
Control: None
Mechanism: Software/Human
5.2.4.3 Confirm data repair plan (A32)
Purpose: Confirm data repair plan is to obtain confirmation from various stakeholders on the established
data repair plan and finalize it. This is an effort to ensure that stakeholders are fully informed and agree on
the risks and issues that may arise from data repair.
Task:
— collect stakeholders’ opinions on data repair plans;
— confirm the implementable data repair plan including data repair priorities.
Input: Data repair plan
Output: Confirmed data repair plan
Control: None
Mechanism: Software/Human
5.2.4.4 Execute data repair (A33)
Purpose: Execute data repair is to put the data repair plan into concrete action and provide cleansed sensor
data.
Task:
— refine the implementation plan for data repair;
— execute data repair and result checking.
NOTE Refer to Clause B.2 for methods on how to execute data repair.
Input: Sensor data with quality improvement opportunity
Output: Cleansed sensor data with indication that they have been cleansed
Control: Confirmed data repair plan
Mechanism: Software/Human, including methods for repairing data anomalies.
6 Implementation requirements
In order to perform cleansing of sensor data, the following requirements shall be met:
— sensor data are identifiable;
— sensor data are obtained according to data formats predefined for data acquisition, and therefore, readily
accessible and understandable.
When wishing to understand and potentially improve the quality of sensor data, an organization shall
perform data cleansing using:
— the data quality characteristics and anomalies specified by ISO 8000-210;
— the data quality measures specified by ISO 8000-220.

Annex A
(informative)
Document identification
To provide for unambiguous identification of an information object in an open system, the following object
identifier is assigned to this document. The meaning of this value is defined in ISO 10303-1 [18].

Annex B
(informative)
Cleansing methods for data anomaly
B.1 Detection of data anomalies
B.1.1 Data anomaly and detection cases
Data anomalies can be classified into three cases:
— Point anomaly: If an individual data instance can be considered as anomalous with respect to the rest of
data, then the instance is termed as a point anomaly.
— Collective anomaly: If a collection of related data instances is anomalous with respect to the entire data
set or pattern, it is termed as a collective anomaly.
— Contextual anomaly: If a data instance is anomalous in a specific context (but not otherwise), then it is
termed as a contextual anomaly (also referred to as conditional anomaly).
Anomaly detection approaches are based on models and predictions from past historical data. When an
anomaly detection algorithm is applied, three possible cases can be considered:
— Correct detection: Detected data anomalies do correspond exactly to abnormalities that happened in the
real field.
— False positives: The real field continues to be normal, but unexpected anomalous data values are
observed, e.g. due to system failure and malfunction.
— False negatives: The real field becomes abnormal, but the result does not appear as data anomalies.
In this document, correct detection and false positives will be considered since data cleansing is possible
only when data anomalies are detected in the retained sensor data.
B.1.2 Anomaly detection for time series
B.1.2.1 General
A time series is a totally ordered sequence of data items (numerical values), each associated with a timestamp
which makes it possible to identify the time gap between any two items. Therefore, sensor data as a stream
of single, discrete digital values are a type of time series.
There are many anomaly detection methods for time series, among which 21 well-known ones are presented
and grouped into four categories: basic, statistical, digital signal processing, and machine learning.
NOTE These anomaly detection methods are intended to detect anomalous data values or patterns as outliers,
but not to identify the type of data anomaly.
B.1.2.2 Basic method type
This includes several different methods such as fixed threshold and dynamic threshold. The individual
methods are described below:
— Fixed threshold: a technique which employs predetermined static values, known as thresholds, to
identify anomalies. A lower limit and/or an upper limit is set as a threshold based on domain knowledge,
historical data analysis, or other relevant criteria. When a new data point in the time series is measured,
it is compared against this fixed threshold. If the measured value exceeds the upper threshold or falls
below the lower threshold, it is flagged as an anomaly.
— Dynamic threshold: a method which uses a dynamic threshold to identify anomalies. If a data point is
greater than or less than the threshold, it is considered an anomaly. The threshold is adjusted adaptively
based on the statistical properties of the signal (such as mean and variance) and the levels of possible
background noise in the current or recent period of time [19].
— Time interval analysis: an approach used to identify anomalies in time-stamped data sets by examining
the intervals between consecutive timestamps. It calculates the differences in time between adjacent
timestamps and compares these intervals against a predefined threshold. Intervals that fall outside the
threshold are flagged as anomalies, indicating possible incorrect timestamps or data loss.
— Sequential dependency check: a method used to verify the correctness of timestamps by ensuring that
the timestamps match the chronological and logical sequence of events. It analyses the chronological
order of timestamps and the logical sequence of associated events to identify missing, extraneous, or
out-of-order data.
— Sliding window: a fundamental method which involves defining a window or range in the input data and
then moving that window across the data to perform some operation within the window. It shifts the
sliding window one element at a time to the right until the end of the data set. For each window, it computes
the mean and standard deviation and compares the data point against a threshold, for example, . A
data point greater than or less than the threshold is considered an anomaly.
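Three of the basic methods above can be sketched in a few lines. This is a hedged illustration only: the function names and parameters are hypothetical, the time interval check assumes an expected sampling interval with a tolerance, and the sliding window assumes a trailing window with a threshold of `k` standard deviations around the window mean (`k = 3` is an assumed default, not a value taken from this document).

```python
from statistics import mean, stdev


def fixed_threshold(values, lower=None, upper=None):
    """Return indices of values below `lower` or above `upper`."""
    return [i for i, v in enumerate(values)
            if (lower is not None and v < lower) or (upper is not None and v > upper)]


def time_interval_anomalies(timestamps, expected, tolerance):
    """Return indices i where the gap between timestamps[i-1] and timestamps[i]
    differs from `expected` by more than `tolerance` (all in the same time unit)."""
    return [i for i in range(1, len(timestamps))
            if abs((timestamps[i] - timestamps[i - 1]) - expected) > tolerance]


def sliding_window(values, window=5, k=3.0):
    """Flag values[i] when it lies more than k standard deviations from the
    mean of the `window` values immediately before it (assumed variant)."""
    flagged = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        # With a constant window (sigma == 0) any deviation is flagged.
        if abs(values[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged


readings = [10.0, 10.1, 9.9, 10.0, 10.2, 30.0, 10.1, 10.0]
spikes_by_limit = fixed_threshold(readings, lower=0.0, upper=20.0)
spikes_by_window = sliding_window(readings, window=5, k=3.0)
gaps = time_interval_anomalies([0, 10, 20, 45, 55], expected=10, tolerance=1)
```

In this toy data set both threshold methods flag the reading at index 5, and the interval check flags the 25-unit gap that ends at index 3.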
B.1.2.3 Statistical method type
This includes several different methods such as principal component analysis and inter-quartile range. The
individual methods are described below:
— Principal component analysis: one of the linear dimensionality reduction techniques which transforms
a data set into a new set of features called principal components. By using dimensionality reduction
technique, the main components are extracted from the source data, and then the original data are
reconstructed using only a few of these main components. The reconstructed data items with large
reconstruction errors are considered to be anomalies [20].
— Inter-quartile range: a statistical technique which employs inter-quartile range (IQR) to detect anomalies.
The IQR is a measure of statistical dispersion, which refers to the spread of the data, and is defined as
the difference between the 75th and 25th percentiles of the data. The 25th percentile is also known as
the first quartile (Q1) and the 75th percentile as the third quartile (Q3). To calculate the IQR, the data set
is divided into quartiles which divide the number of data points into four parts, or quarters,
...
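As a rough illustration of the inter-quartile range method, the sketch below computes Q1 and Q3 and flags values outside a pair of fences. The fences at 1.5 × IQR beyond the quartiles are the conventional rule of thumb and are an assumption here, not a value taken from this document. The function name `iqr_outliers` and the parameter `k` are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional multiplier (an assumption for this sketch).
    """
    # "inclusive" treats the data as the whole population when interpolating.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


suspect = iqr_outliers([10, 11, 12, 13, 14, 15, 16, 17, 100])
```

For this sample, Q1 is 12 and Q3 is 16, so the IQR is 4 and the fences are 6 and 22. Only the value 100 at index 8 falls outside them.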


ISO/TC 184/SC 4
ISO/CD TS 8000-230(en)
Secretariat: ANSI
Date: 2026-01-06
Data quality —
Part 230:
Sensor data — Guidelines for data cleansing

ISO/DTS 8000-230:2026(en)
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication
may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying,
or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO
at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: + 41 22 749 01 11
E-mail: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principles for sensor data cleansing
5 Process for sensor data cleansing
6 Implementation requirements
Annex A (informative) Document identification
Annex B (informative) Cleansing methods for data anomaly
Annex C (informative) Examples for sensor data cleansing
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different
types of ISO document should be noted. This document was drafted in accordance with the editorial
rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that this
may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 184, Automation systems and integration,
Subcommittee SC 4, Industrial data.
A list of all parts in the ISO 8000 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
0.1 Foundations of the ISO 8000 series
Digital data deliver value by enhancing all aspects of organizational performance including:
— operational effectiveness and efficiency;
— safety and security;
— reputation with customers and the wider public;
— compliance with statutory regulations;
— innovation;
— consumer costs, revenues and stock prices.
In addition, many organizations are now addressing these considerations with reference to the United Nations
Sustainable Development Goals 1).
The influence on performance originates from data being the formalized representation of information 2).
This information enables
organizations to make reliable decisions. This decision making can be performed by human beings directly
and also by automated data processing including artificial intelligence systems.
Through widespread adoption of digital computing and associated communication technologies,
organizations become dependent on digital data. This dependency amplifies the negative consequences of lack
of quality in these data. These consequences are the decrease of organizational performance.
The biggest impact of digital data comes from two key factors:
— the data having a structure that reflects the nature of the subject matter;
EXAMPLE 1 A research scientist writes a report using a software application for word processing. This report includes
a table that uses a clear, logical layout to show results from an experiment. These results indicate how material properties
vary with temperature. The report is read by a designer, who uses the results to create a product that works in a range
of different operating temperatures.
— the data being computer processable (machine readable) rather than just being for a person to read and
understand.
EXAMPLE 2 A research scientist uses a database system to store the results of experiments on a material. This system
controls the format of different values in the data set. The system generates an output file of digital data. This file is
processed by a software application for engineering analysis. The application determines the optimum geometry when
using the material to make a product.
1) https://sdgs.un.org/goals
2) ISO 8000-2 defines information as “knowledge concerning objects, such as facts, events, things, processes, or ideas,
including concepts, that within a certain context has a particular meaning”.
ISO 9000 [1] explains that quality is not an abstract concept of absolute perfection. Quality is actually the
conformance of characteristics to requirements. This actuality means that any item of data can be of high
quality for one purpose but not for a different purpose. The quality is different because the requirements are
different between the two purposes.
EXAMPLE 3 Time data are processed by calendar applications and also by control systems for propulsion units on
spacecraft. These data include start times for meetings in a calendar application and activation times in a control system.
These start times require less precision than the activation times.
The nature of digital data is fundamental to establishing requirements that are relevant to the specific
decisions made by each organization.
EXAMPLE 4 ISO 8000-1 [2] identifies that data have syntactic (format), semantic (meaning) and pragmatic
(usefulness) characteristics.
To support the delivery of high-quality data, the ISO 8000 series addresses:
— data governance, data quality management and maturity assessment;
EXAMPLE 5 ISO 8000-61 [3] specifies a process reference model for data quality management.
— creating and applying requirements for data and information;
EXAMPLE 6 ISO 8000-110 [4] specifies how to exchange characteristic data that are master data.
— monitoring and measuring information and data quality;
EXAMPLE 7 ISO 8000-8 [5] specifies approaches to measuring information and data quality.
— improving data and, consequently, information quality;
EXAMPLE 8 ISO/TS 8000-81 [6] specifies an approach to data profiling, which identifies opportunities to improve data
quality.
— issues that are specific to the type of content in a data set.
EXAMPLE 9 ISO/TS 8000-311 [7] specifies how to address quality considerations for product shape data.
Data quality management covers all aspects of data processing, including creating, collecting, storing,
maintaining, transferring, exploiting and presenting data to deliver information.
Effective data quality management is systemic and systematic, requiring an understanding of the root causes
of data quality issues. This understanding is the basis for not just correcting existing nonconformities but for
also implementing solutions that prevent future reoccurrence of those nonconformities.
EXAMPLE 10 If a data set includes dates in multiple formats including “yyyy-mm-dd”, “mm-dd-yy” and “dd-mm-yy”,
then data cleansing can correct the consistency of the values. Such cleansing requires additional information, however,
to resolve ambiguous entries (such as, “04-05-20”). The cleansing also cannot address any process issues and people
issues, including training, that have caused the inconsistency.
0.2 Understanding more about the ISO 8000 series
ISO 8000-1 [2] provides a detailed explanation of the structure and scope of the whole ISO 8000 series.
ISO 8000-2 3) specifies the single, common vocabulary for the ISO 8000 series. This vocabulary is ideal reading
material by which to understand the overall subject matter of data quality. ISO 8000-2 presents the vocabulary
structured by a series of topic areas (e.g. terms relating to quality and terms relating to data and
information).
3) The content is available on the ISO Online Browsing Platform. http://www.iso.org/obp
ISO has identified ISO 8000-1 [2], ISO 8000-2 and ISO 8000-8 [5] as horizontal deliverables 4).
0.3 Role of this document
As a contribution to the overall capability of the ISO 8000 series, this document addresses guidelines to
improve the quality of sensor data by cleansing data anomalies that affect low quality characteristics. The
guidelines include principles, the process and implementation requirements for sensor data cleansing. The
process performs sensor data cleansing using data quality characteristics and anomalies defined in ISO 8000-
210 and data quality measures defined in ISO 8000-220. To help users understand, they also present methods
and examples of cleansing data anomalies. Through this document, users will learn procedures and methods
for improving the quality of sensor data collected from IoT or sensor network environments prior to data
analysis and exploitation.
This document supports activities that affect:
— one or more information systems;
— data flows within the organization and with external organizations;
— any phase of the data life cycle.
Organizations can use this document on its own or in conjunction with other parts in the ISO 8000 series.
Annex A contains an identifier that conforms to ISO/IEC 8824-1 [8]. The identifier unambiguously identifies
this document in an open information system.
0.4 Benefits of the ISO 8000 series
By implementing parts of the ISO 8000 series to improve organizational performance, an organization
achieves the following benefits:
— objective validation of the foundations for digital transformation of the organization;
— a sustainable basis for data in digital form becoming a fundamental asset class the organization relies on
to deliver value;
— securing evidence-based trust from other parties (including supply chain partners and regulators) about
the repeatability and reliability of data and information processing in the organization;
— portability of data with resulting protection against loss of intellectual property and re-usability across
the organization and applications;
— effective and efficient interoperability between all parties in a supply chain to achieve traceability of data
back to original sources;
— readiness to acquire or supply services where the other party expects to work with common
understanding of explicit data requirements.

4)
Deliverable dealing with a subject relevant to a number of committees or sectors or of crucial importance to ensure
coherence across standardization deliverables.
Data quality —
Part 230:
Sensor data — Guidelines for data cleansing
1 Scope
This document specifies guidelines to improve data quality by cleansing sensor data anomalies that affect low
inherent quality characteristics.
The following are within the scope of this document:
— principles for sensor data cleansing;
— the process for sensor data cleansing;
— implementation requirements for sensor data cleansing;
— list of data anomaly detection and repair methods (see Annex B);
— examples of sensor data cleansing (see Annex C).
The following are outside the scope of this document:
— algorithms or detailed methods to detect and repair data anomalies;
— the process of sensor data cleansing for real time processing.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 8000-2, Data quality — Part 2: Vocabulary
ISO 8000-210, Data quality — Part 210: Sensor data: Data quality characteristics
ISO 8000-220, Data quality — Part 220: Sensor data: Quality measurement
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 8000-2 and the following apply.
NOTE Prior to publication of this document as an International Standard, the following terms and definitions will be
published in ISO 8000-2 and removed from this document.
ISO and IEC maintain terminology databases for use in standardization at the following
addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1 Terms relating to sensor data
3.1.1
sensor
device that observes and measures a property of a natural phenomenon, system or human-made process and
converts that measurement into a signal
Note 1 to entry: A sensor can exist not only in a single physical form but also in a sensor-based variant such as a
virtual sensor.
[SOURCE: ISO/IEC 29182-2 [9], 2.1.5, modified — “system” has been added to the definition, “man” changed to
“human”, and “physical” deleted from the definition. Note 1 to entry has been changed.]
3.1.2
sensor network
system of spatially distributed sensor (3.1.1) nodes interacting with each other and, depending on applications,
possibly with other infrastructure in order to acquire, process, transfer, and provide information extracted
from its environment with a primary function of information gathering and possible control capability
Note 1 to entry: Distinguishing features of a sensor network can include wide area coverage, use of radio networks,
flexibility of purpose, self-organization, openness, and providing data for multiple applications.
[SOURCE: ISO/IEC 29182-2 [9], 2.1.6]
3.1.3
sensor node
sensor network (3.1.2) element that includes at least one sensor (3.1.1) and, optionally actuators with
communication capabilities and data processing capabilities
Note 1 to entry: It can include additional application capabilities.
Note 2 to entry: A hybrid sensor (3.1.1) composed of multiple sensors is considered a sensor node that includes
multiple sensors.
[SOURCE: ISO/IEC 29182-2 [9], 2.1.8, modified — Note 2 to entry has been added to the definition.]
3.1.4
sensor data
data produced by a sensor node (3.1.3)
Note 1 to entry: Sensor data consist of a stream of digital values converted from sensor (3.1.1) signals, and information
such as the identification of each sensor (3.1.1) and timestamps of data acquired by the sensor node (3.1.3).
3.1.5
internet of things
IoT
infrastructure of interconnected entities, people, systems and information resources together with services
which processes and reacts to information from the physical world and virtual world
[SOURCE: ISO/IEC 20924 [10], 3.2.4]
3.2 Terms relating to data quality
3.2.1
data anomaly
item of data in a data set, where the item deviates from the expected pattern for items in the data set
3.2.2
quality characteristic
inherent characteristic of an object related to a requirement
Note 1 to entry: ISO 8000-8 [5] uses the term quality dimension as a synonym for quality characteristics that determine
the pragmatic quality of data.
[SOURCE: ISO 9000 [11], 3.10.2, modified — Note 1 to entry has been added.]
3.2.3
data cleansing
process used to improve data quality by detecting and repairing defects and errors in data
Note 1 to entry: In ISO 8000-2, data error is defined as non-fulfilment of a data requirement and also noted as
synonymous with data nonconformity.
Note 2 to entry: In ISO 9000 [1], defect is defined as non-fulfilment of a requirement related to an intended or specified
use.
Note 3 to entry: In ISO 8000-61 [3], data cleansing is specified as a sub-process of data quality improvement.
[SOURCE: ISO 13008 [12], 3.4, modified — “correcting (or removing)” is changed to “repairing”, and
Notes 1, 2 and 3 to entry are added.]
3.2.4
data profiling
activities that are performed to understand the data structures and system rules that affect the extraction of
audit data
[SOURCE: ISO 21378:2019 [13], 3.6]
3.3 Terms relating to measurement
3.3.1
data quality measure
quality measure
variable to which a value is assigned as the result of measuring a data quality characteristic (3.2.2)
Note 1 to entry: Adapted from ISO/IEC 25012 [14], 4.5.
4 Principles for sensor data cleansing
— When a data anomaly occurs due to sensor or system errors, the quality of the data shall be improved by
deleting or modifying the anomalous data.
— When a data anomaly reflects actual phenomena in the field, whether to maintain, delete, or modify the
anomalous data shall be decided according to the stated purpose of an intended or specified use.
— When the cause of data anomaly is not clearly identified, data deletion or modification shall be minimized
to avoid changing the original correct data.
— When a data anomaly cannot be deleted or modified for any reason, a flag or mark shall be placed on the
data so that the person in charge of the data can recognize it and take appropriate actions.
— Data cleansing shall be carried out with the consent of stakeholders.
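The principles above can be read as a small decision table. The sketch below is one possible reading under stated assumptions: the `Cause` enumeration, the function `cleansing_action`, the returned strings and the order in which the conditions are checked are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class Cause(Enum):
    SENSOR_OR_SYSTEM_ERROR = "sensor or system error"
    ACTUAL_PHENOMENON = "actual phenomenon in the field"
    UNKNOWN = "cause not clearly identified"


def cleansing_action(cause, can_change=True, stakeholders_consent=True):
    """Map the cause of a data anomaly to the action suggested by the principles.

    The precedence of the checks is an assumption made for this sketch.
    """
    if not stakeholders_consent:
        # Cleansing is carried out with the consent of stakeholders.
        return "wait for stakeholder consent"
    if not can_change:
        # Anomalies that cannot be deleted or modified are flagged instead.
        return "flag for the person in charge"
    if cause is Cause.SENSOR_OR_SYSTEM_ERROR:
        return "delete or modify"
    if cause is Cause.ACTUAL_PHENOMENON:
        return "decide by intended use: maintain, delete or modify"
    return "minimize deletion and modification"
```

For instance, `cleansing_action(Cause.UNKNOWN)` returns the conservative action, which reflects the aim of not changing original correct data.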
5 Process for sensor data cleansing
5.1 General
The sensor data cleansing process is designed with the following considerations in mind:
— The plan-do-check-act concept used to define the data quality management process in
ISO 8000-61 [3] is applied to the data cleansing process. In other words, the process is designed with the
following activities: provide a quality measurement plan (plan), measure data quality (do and
check) and improve data quality (act). In addition, once the plan is provided, activities of
measurement (do and check) and improvement (act) are repeatedly performed to determine
whether the sensor data satisfy quality requirements.
— This process is designed for post processing (or offline mode), not for real time processing (or online
mode).
NOTE 1 As sensor data are collected in the form of streams in real time and the amount is very large, it takes time
to cleanse them. Therefore, real-time data cleansing is not realistic in the environment for rapid decision-making.
Real-time data cleansing can only be performed in special environments where data anomalies are already known
and do not need to be checked or verified.
— The process is represented by the IDEF0 (integration definition for function modelling) functional
model defined by ISO/IEC/IEEE 31320-1 [15]. This model breaks down a
process into hierarchical activities to show what activities are performed and how. It helps analyse and
design processes by clearly showing the inputs, outputs, controls, and mechanisms of each activity.
NOTE 2 A functional model is identified by a model name, an IDEF0 box is identified by a box name, and an IDEF0
arrow segment is identified by an arrow label. An identifier is written in title case, i.e. the first letter of each word
is capitalized. See ISO/IEC/IEEE 31320-1 [15] for details on the notation in the functional model.
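The repeated measure-and-improve cycle described in the first consideration above can be sketched as an offline loop. This is only an illustration under assumptions: the function `perform_sensor_data_cleansing`, its callable parameters, the `has_opportunity` key and the `max_rounds` safety limit are hypothetical and not defined by this document.

```python
def perform_sensor_data_cleansing(data, prepare_plan, measure, improve, max_rounds=10):
    """Plan once, then repeat measure (do and check) and improve (act)."""
    plan = prepare_plan(data)                    # A1: plan
    for _ in range(max_rounds):
        result = measure(data, plan)             # A2: do and check
        if not result["has_opportunity"]:
            return data, result                  # sensor data with quality report
        data = improve(data, result)             # A3: act
    return data, measure(data, plan)             # stop after max_rounds (assumed limit)


# Toy usage: the only quality requirement is that no value exceeds a planned limit.
def prepare_plan(data):
    return {"upper_limit": 100.0}


def measure(data, plan):
    bad = [i for i, v in enumerate(data) if v > plan["upper_limit"]]
    return {"has_opportunity": bool(bad), "anomalies": bad}


def improve(data, result):
    drop = set(result["anomalies"])
    return [v for i, v in enumerate(data) if i not in drop]


cleansed, report = perform_sensor_data_cleansing(
    [20.0, 21.5, 999.0, 22.0], prepare_plan, measure, improve)
```

The loop runs on retained data after collection, in line with the post processing (offline) design of the process.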
5.2 Functional model of sensor data cleansing
5.2.1 Perform sensor data cleansing (A0)
The functional model of the sensor data cleansing process is represented by the A-0 context diagram for
perform sensor data cleansing (see Figure 1).
[Figure 1 diagram: box “Perform Sensor Data Cleansing” (A0); input: Sensor Data; controls: Quality Requirements,
Quality Characteristics (ISO 8000-210), Quality Measures (ISO 8000-220); output: Sensor Data With Quality Report;
mechanism: Software/Human]
Figure 1 — A-0 context diagram for perform sensor data cleansing
(model diagram A0)
This process is to perform data cleansing to improve the quality of sensor data prior to data analysis or
exploitation. By accepting sensor data and considering quality requirements, quality characteristics defined
in ISO 8000-210, and quality measures defined in ISO 8000-220, the process provides sensor data with a
quality report as an output.
[Figure 2 diagram: boxes “Prepare Measurement Plan” (A1), “Measure Data Quality” (A2) and “Improve Data Quality” (A3);
arrow labels: Sensor Data, Quality Requirements, Quality Characteristics (ISO 8000-210), Quality Measures
(ISO 8000-220), Measurement Plan, Sensor Data With Quality Improvement Opportunity, Cleansed Sensor Data,
Sensor Data With Quality Report; mechanism: Software/Human]
Figure 2 — Perform sensor data cleansing (model diagram A0)
As in Figure 2, this process consists of three activities: prepare measurement plan (A1), measure data
quality (A2) and improve data quality (A3).
NOTE 1 Figure 2 is a child diagram of Figure 1.
Each activity at the lowest level of the process is described by the following elements:
— a title, which is a descriptive heading for an activity (modified from ISO/IEC TR 24774:2010 [16]);
— a purpose, which describes the goal of performing an activity (modified from ISO/IEC TR 24774 [16]);
— tasks, which are required, recommended, or permissible actions, intended to contribute to the
achievement of the goal of an activity (modified from ISO/IEC/IEEE 24774 [17]);
— inputs, which are items transformed into output by an activity (modified from ISO/IEC/IEEE 31320-1 [15]);
— outputs, which are products, results or services produced by an activity (modified from ISO/IEC/IEEE
24774 [17]);
— controls, which are conditions or constraints required for an activity to produce correct output (modified
from ISO/IEC/IEEE 31320-1 [15]);
— mechanisms, which are the means used by an activity to transform input into output (modified from
ISO/IEC/IEEE 31320-1 [15]).
NOTE 2 These elements are adapted from those of process description in ISO/IEC TR 24774 [16], ISO/IEC/IEEE
24774 [17] and those of functional model in ISO/IEC/IEEE 31320-1 [15] to fit the activity definition.
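The seven descriptive elements listed above can be held in a simple record. The following is a minimal sketch, not part of this document's model: the class name `Activity` and its field names are assumptions chosen for illustration, and the example values are transcribed from the description of develop measurement plan (A13) in 5.2.2.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One lowest-level activity, described by the seven elements listed above.

    The class and field names are illustrative assumptions, not defined by the standard.
    """
    node: str                          # IDEF0 node number, e.g. "A13"
    title: str                         # descriptive heading
    purpose: str                       # goal of performing the activity
    tasks: tuple[str, ...]             # required, recommended or permissible actions
    inputs: tuple[str, ...] = ()       # items transformed into output
    outputs: tuple[str, ...] = ()      # products, results or services produced
    controls: tuple[str, ...] = ()     # conditions or constraints
    mechanisms: tuple[str, ...] = ()   # means used to transform input into output


# Values transcribed from 5.2.2.4; an empty tuple stands for "None".
a13 = Activity(
    node="A13",
    title="Develop measurement plan",
    purpose="Establish the measurement plan used to measure the quality of sensor data",
    tasks=(
        "define methods and procedures to measure data quality",
        "determine criteria and information necessary to assess data quality",
    ),
    inputs=(),
    outputs=("Measurement plan",),
    controls=(
        "Data quality goal",
        "Data profile with quality issues",
        "Quality measures defined in ISO 8000-220",
    ),
    mechanisms=("Software/Human",),
)

print(a13.node, a13.title, "->", ", ".join(a13.outputs))
```

Making the record immutable (`frozen=True`) is a design choice for this sketch: an activity description is reference information and is not expected to change while a cleansing run is in progress.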
5.2.2 Prepare measurement plan (A1)
5.2.2.1 General
This activity is intended to prepare a plan for measuring sensor data quality based on quality requirements,
quality characteristics, quality measures and sensor data.
[Figure 3: IDEF0 diagram with three boxes: Establish Data Quality Goal (A11), Perform Data Profiling (A12), Develop
Measurement Plan (A13). Arrow labels: Quality Requirements, Quality Characteristics (ISO 8000-210), Quality
Measures (ISO 8000-220), Sensor Data, Data Quality Goal, Historical Sensor Data, Data Profile With Quality Issues,
Measurement Plan. Mechanism for each box: Software/Human.]
Figure 3 — Prepare measurement plan (model diagram A1)
As in Figure 3, this activity consists of three sub-activities: establish data quality goal (A11), perform data
profiling (A12) and develop measurement plan (A13).
5.2.2.2 Establish data quality goal (A11)
Purpose: Establish data quality goal is to determine the data quality-related goals that reflect
quality requirements of sensor data.
Task:
— gather data quality requirements from stakeholders;
— determine the goal to achieve based on data quality requirements.
Input: Sensor data collected from sensor nodes.
Output: Data quality goal represented by data quality requirements such as quality measure levels
of quality characteristics of interest.
Control: Quality requirements, quality characteristics and corresponding data anomalies defined in
ISO 8000-210, and quality measures defined in ISO 8000-220.
Mechanism: Software/Human
5.2.2.3 Perform data profiling (A12)
Purpose: Perform data profiling is to acquire historical sensor data and profile them. Through this activity,
the profile and data quality issues of sensor data are extracted from a cluster of historical occurrences of
the relevant sensor data.
Task:
— collect historical sensor data;
— perform data profiling for the sensor data.
NOTE Refer to ISO/TS 8000-81 [6] for data profiling.
Input: Historical sensor data
Output: Data profile with quality issues
Control: Data quality goal
Mechanism: Software that provides statistical, mathematical, or data learning techniques, or a human
who inputs information interactively or manually.
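As an illustration of the statistical kind of software mechanism mentioned for A12, the sketch below summarises a list of historical readings. It is only an assumed, minimal profile and not the profiling method of ISO/TS 8000-81: the function name `profile`, the chosen statistics, and the convention that `None` or NaN marks a missing reading are all assumptions.

```python
import math
import statistics


def profile(values):
    """Summarise historical readings.

    Assumption: None or NaN marks a missing reading. The statistics chosen
    here are illustrative, not a list required by any standard.
    """
    present = [v for v in values if v is not None and not math.isnan(v)]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": statistics.fmean(present) if present else None,
        # Sample standard deviation needs at least two readings.
        "stdev": statistics.stdev(present) if len(present) > 1 else None,
    }


# Hypothetical historical readings from one sensor, with one missing value.
historical = [20.1, 20.3, None, 20.2, 35.0, 20.4]
print(profile(historical))
```

In such a profile, a non-zero `missing` count or a `max` far from the `mean` would be candidate quality issues to record alongside the profile.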
5.2.2.4 Develop measurement plan (A13)
Purpose: Develop measurement plan is to establish the measurement plan that includes
the methods, procedures, criteria, and rationale that will be used to measure the quality of sensor data in
accordance with the reference data patterns.
Task:
— define methods and procedures to measure data quality;
— determine criteria and information necessary to assess data quality.
Input: None
Output: Measurement plan
Control: Data quality goal, data profile with quality issues, and quality measures defined in ISO 8000-220
Mechanism: Software/Human
5.2.3 Measure data quality (A2)
5.2.3.1 General
This activity is intended to derive an anomaly detection model and quality measure values of sensor data
based on the established measurement plan and to identify opportunities for quality improvement.
[Figure 4: IDEF0 diagram with three boxes: Derive Anomaly Detection Model (A21), Find Quality Improvement
Opportunity (A22), Report Quality Result (A23). Arrow labels: Measurement Plan, Quality Characteristics, Quality
Measures (ISO 8000-220), Sensor Data, Anomaly Detection Model, Sensor Data With Quality Improvement Opportunity,
Sensor Data Without Quality Improvement Opportunity, Sensor Data With Quality Report. Mechanism for each box:
Software/Human.]
Figure 4 — Measure data quality (model diagram A2)
As in Figure 4, this activity consists of three sub-activities: derive anomaly detection model (A21), find quality
improvement opportunity (A22), and report quality result (A23).
5.2.3.2 Derive anomaly detection model (A21)
Purpose: Derive anomaly detection model is to analyse data patterns in sensor data
and determine a model that can detect data anomalies.
Task:
— analyse data patterns;
— determine an anomaly detection model.
NOTE Refer to Clause B.1 for anomaly detection models, which have functions that identify the type of anomaly or
detect anomalous data values included in sensor data.
Input: Sensor data
Output: Anomaly detection model
Control: Measurement plan
5.2.3.3 Find quality improvement opportunity (A22)
Purpose: Find quality improvement opportunity is to assess quality measures based on the anomaly detection
model and find opportunities to improve the quality of sensor data by modifying data anomalies.
Task:
— Assess quality characteristic-specific quality measures: Measure quality characteristic-specific quality
measures defined in ISO 8000-220. If they satisfy quality requirements, the task stops since the sensor
data do not require quality improvement. Otherwise, the following additional task is carried out for data
anomalies that affect data quality.
— Assess anomaly-specific quality measures: Detect data anomalies included in sensor data, and measure
anomaly-specific quality measures defined in ISO 8000-220. If there exists any data anomaly modifiable
to improve quality characteristic-specific quality measures (or to reduce anomaly-specific quality
measures), the sensor data are those with quality improvement opportunity. Otherwise, the sensor data
are those without quality improvement opportunity.
Input: Sensor data
Output:
— sensor data with quality improvement opportunity that identifies data anomalies modifiable to improve
the quality of sensor data;
— sensor data without quality improvement opportunity that do not require quality improvement because
they meet quality requirements, or that cannot be improved because no data anomaly to improve the
quality of sensor data is identified.
Control: Anomaly detection model, measurement plan, quality characteristics and quality measures defined in
ISO 8000-220.
Mechanism: Software/Human
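The two tasks of A22 amount to a small decision procedure, which can be sketched as follows. This is a hedged illustration only: the names `Opportunity` and `find_quality_improvement_opportunity`, the boolean `meets_requirements`, and the representation of detected anomalies as `(label, is_modifiable)` pairs are assumptions, and the anomaly labels in the example are hypothetical rather than taken from ISO 8000-210.

```python
from enum import Enum


class Opportunity(Enum):
    WITH = "with quality improvement opportunity"
    WITHOUT = "without quality improvement opportunity"


def find_quality_improvement_opportunity(meets_requirements, anomalies):
    """Classify sensor data following the two tasks described for A22.

    meets_requirements: outcome of assessing the characteristic-specific measures
        against the quality requirements (assumed to be computed elsewhere).
    anomalies: list of (label, is_modifiable) pairs produced with the anomaly
        detection model (an assumed representation).
    Returns the classification and the labels of the modifiable anomalies.
    """
    # Task 1: requirements satisfied, so no quality improvement is required.
    if meets_requirements:
        return Opportunity.WITHOUT, []
    # Task 2: look for anomalies that can be modified to improve the measures.
    modifiable = [label for label, is_modifiable in anomalies if is_modifiable]
    if modifiable:
        return Opportunity.WITH, modifiable
    # Requirements not met, but nothing can be repaired.
    return Opportunity.WITHOUT, []


# Hypothetical anomaly labels for illustration.
detected = [("gap in timestamps", False), ("out-of-range value", True)]
print(find_quality_improvement_opportunity(False, detected))
```

Data classified as `WITHOUT` corresponds to the input of report quality result (A23), and data classified as `WITH` corresponds to the input of improve data quality (A3).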
5.2.3.4 Report quality result (A23)
Purpose: Report quality result is to report the quality result of sensor data.
Task:
— gather quality information including quality requirements, problems, and improvements;
— write up the report that reflects quality improvement efforts.
Input: Sensor data without quality improvement opportunity
that do not require quality improvement because they meet quality requirements, or that cannot be improved
because no data anomaly to improve is identified.
Output: Sensor data with quality report that includes quality information such as
quality requirements, improvements by cleansing, and problems. The sensor data fall into one of two
categories:
a) the sensor data that meet quality requirements and are usable for data analysis or exploitation;
b) the sensor data that do not meet quality requirements, whose quality can no longer be improved, and
therefore are not usable for data analysis or exploitation. In this case, the sensor data are either discarded
or subjected to more in-depth cause analysis for poor data quality.
Control: None
Mechanism: Software/Human
5.2.4 Improve data quality (A3)
5.2.4.1 General
This activity is intended to improve data quality by cleansing sensor data based on the identified data quality
improvement opportunity and provide cleansed sensor data.
[Figure 5: IDEF0 diagram with three boxes: Establish Data Repair Plan (A31), Confirm Data Repair Plan (A32),
Execute Data Repair (A33). Arrow labels: Sensor Data With Quality Improvement Opportunity, Data Repair Plan,
Confirmed Data Repair Plan, Cleansed Sensor Data. Mechanism for each box: Software/Human.]
Figure 5 — Improve data quality (model diagram A3)
As in Figure 5, this activity consists of three sub-activities: establish data repair plan (A31), confirm data
repair plan (A32), and execute data repair (A33).
5.2.4.2 Establish data repair plan (A31)
Purpose: Establish data repair plan is to establish a specific action plan to cleanse sensor data
based on the identified quality improvement opportunity.
Task:
— list alternative methods for repairing data anomalies;
— determine data repair plans.
Input: Sensor data with quality improvement opportunity
Output: Data repair plan
Control: None
Mechanism: Software/Human
5.2.4.3 Confirm data repair plan (A32)
Purpose: Confirm data repair plan is to obtain confirmation from various stakeholders on the
established data repair plan and finalize it. This is an effort to ensure that stakeholders are fully informed and
agree on the risks and issues that may arise from data repair.
Task:
— collect stakeholders’ opinions on data repair plans;
— confirm the implementable data repair plan including data repair priorities.
Input: Data repair plan
Output: Confirmed data repair plan
Control: None
Mechanism: Software/Human
5.2.4.4 Execute data repair (A33)
Purpose: Execute data repair is to put the data repair plan into concrete action and provide
cleansed sensor data.
Task:
— refine the implementation plan for data repair;
— execute data repair and result checking.
NOTE Refer to Clause B.2 for methods on how to execute data repair.
Input: Sensor data with quality improvement opportunity
Output: Cleansed sensor data with indication that they have been cleansed
Control: Confirmed data repair plan
Mechanism: Software/Human, which includes methods for repairing data anomalies.
6 Implementation requirements
In order to perform cleansing of sensor data, the following requirements shall be met:
— sensor data are identifiable;
— sensor data are obtained according to data formats predefined for data acquisition, and therefore, readily
accessible and understandable.
When wishing to understand and potentially improve the quality of sensor data, an organization shall perform
data cleansing using:
— the data quality characteristics and anomalies specified by ISO 8000-210;
— the data quality measures specified by ISO 8000-220.
Annex A
(informative)
Document identification
To provide for unambiguous identification of an information object in an open system, the following object
identifier is assigned to this document. The meaning of this value is defined in ISO 10303-1 [18].

{ ISO standard 8000 part(230) version(1) }
Annex B
(informative)
Cleansing methods for data anomaly
B.1 Detection of data anomalies
B.1.1 Data anomaly and detection cases
Data anomalies can be classified into three cases:
— Point anomaly: If an individual data instance can be considered as anomalous with respect to the rest of
data, then the instance is termed as a point anomaly.
— Collective anomaly: If a collection of related data instances is anomalous with respect to the entire data
set or pattern, it is termed as a collective anomaly.
— Contextual anomaly: If a data instance is anomalous in a specific context (but not otherwise), then it is
termed as a contextual anomaly (also referred to as conditional anomaly).
Anomaly detection approaches are based on models and predictions from past historical data. When an
anomaly detection algorithm is applied, three possible cases can be considered:
— Correct detection: Detected data anomalies do correspond exactly to abnormalities that happened in the
real field.
— False positives: The real field continues to be normal, but unexpected anomalous data values are observed,
e.g. due to system failure and malfunction.
— False negatives: The real field becomes abnormal, but the result does not appear as data anomalies.
In this document, correct detection and false positives will be considered since data cleansing is possible only
when data anomalies are detected in the retained sensor data.
B.1.2 Anomaly detection for time series
B.1.2.1 General
A time series is a totally ordered sequence of data items (numerical values), each associated with a timestamp,
which makes it possible to identify the time gap between any two items. Therefore, sensor data as a stream of
single, discrete digital values are a type of time series.
There are many anomaly detection methods for time series, among which 21 well-known ones are presented
and grouped into four categories: basic, statistical, digital signal processing, and machine learning.
NOTE These anomaly detection methods are intended to detect anomalous data values or patterns as outliers, but
not to identify the type of data anomaly.
B.1.2.2 Basic method type
This includes several different methods such as fixed threshold and dynamic threshold. The individual
methods are described below:
— Fixed threshold: a technique which employs predetermined static values, known as thresholds, to identify
anomalies. A lower limit and/or an upper limit is set as a threshold based on domain knowledge, historical
data analysis, or other relevant criteria. When a new data point in the time series is measured, it is
compared against this fixed threshold. If the measured value exceeds the upper threshold or falls below
the lower threshold, it is flagged as an anomaly.
— Dynamic threshold: a method which uses a dynamic threshold to identify anomalies. If a data point is
greater than or less than the threshold, it is considered an anomaly. The threshold is adjusted adaptively
based on the statistical properties of the signal (such as mean and variance) and the levels of possible
background noise in the current or recent period of time [19].
— Time interval analysis: an approach used to identify anomalies in time-stamped data sets by examining
the intervals between consecutive timestamps. It calculates the differences in time between adjacent
timestamps and compares these intervals against a predefined threshold. Intervals that fall outside the
threshold are flagged as anomalies, indicating possible incorrect timestamps or data loss.
— Sequential dependency check: a method used to verify the correctness of timestamps by ensuring that the
timestamps match the chronological and logical sequence of events. It analyses the chronological order of
timestamps and the logical sequence of associated events to identify missing, extraneous, or out-of-order
data.
— Sliding window: a fundamental method which involves defining a window or range in the input data and
then moving that window across the data to perform some operation within the window. The window is
shifted to the right one element at a time until the end of the data set. For each window, the mean (μ) and
standard deviation (σ) are computed and the data point is compared against a threshold, for example
μ ± 3σ. A data point above the upper threshold or below the lower threshold is considered an anomaly.
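Three of the basic methods above (fixed threshold, time interval analysis and sliding window) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a reference implementation: the function names and parameters are hypothetical, timestamps are assumed to be numbers such as seconds, and the sliding window is assumed to be the `size` values immediately preceding the point being tested (the text above does not fix the position of the window relative to the point).

```python
import statistics


def fixed_threshold(values, lower=None, upper=None):
    """Return indices of values below `lower` or above `upper`."""
    return [
        i for i, v in enumerate(values)
        if (lower is not None and v < lower) or (upper is not None and v > upper)
    ]


def time_interval_anomalies(timestamps, max_gap):
    """Return indices i whose gap to the previous timestamp is not in (0, max_gap].

    Assumption: timestamps are numeric (e.g. seconds). A gap larger than max_gap
    suggests data loss; a gap of zero or less suggests an incorrect timestamp.
    """
    flagged = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap <= 0 or gap > max_gap:
            flagged.append(i)
    return flagged


def sliding_window(values, size, k=3.0):
    """Return indices of values outside mean ± k * stdev of the preceding window.

    Assumption: the window holds the `size` values just before the tested point,
    and the population standard deviation of the window is used.
    """
    flagged = []
    for i in range(size, len(values)):
        window = values[i - size:i]
        mu = statistics.fmean(window)
        sigma = statistics.pstdev(window)
        if abs(values[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged


# Hypothetical readings and timestamps for illustration.
readings = [20.0, 20.1, 19.9, 20.0, 20.2, 35.0, 20.1]
stamps = [0, 10, 20, 50, 60, 55]

print(fixed_threshold(readings, lower=0.0, upper=30.0))  # [5]
print(time_interval_anomalies(stamps, max_gap=15))       # [3, 5]
print(sliding_window(readings, size=5))                  # [5]
```

In the sliding window example, the reading after the spike is not flagged because the spike itself inflates the standard deviation of the following window. This masking effect is one reason the window size and the multiplier `k` need to be chosen with the data in mind.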
B.1.2.3 Statistical
...
