Chemometrics for process analytical technologies - Part 1: General provisions, and methods for univariate statistics and chemometric processing of data

IEC TR 62829-1:2019, which is a Technical Report, covers
• a study into the pre-requisites of chemometric (exploratory) data analysis,
• an overview of common data analysis procedures for univariate, bivariate and multivariate data analysis,
• explanations of the basic principles and major application areas of the different methods,
• some recommendations on the selection of an appropriate data analysis strategy.
These recommendations, which are not covered by earlier guidance documents on the topic, are complemented by some advice on the validation of commercial software (at the site of installation) and tailored software for process analytical purposes. Where available, reference data sets (Annex B) are recommended for benchmarking software that implements the data analysis methods covered.

General Information

Status: Published
Publication Date: 21-Nov-2019
Current Stage: PPUB - Publication issued
Start Date: 22-Nov-2019
Completion Date: 27-Nov-2019

Overview

IEC TR 62829-1:2019 is a Technical Report from IEC that provides guidance on applying chemometrics in process analytical technologies (PAT). It focuses on general provisions and methods for univariate statistics and chemometric data processing. The report surveys prerequisites for exploratory data analysis, common univariate/bivariate/multivariate procedures, method principles and application areas, and offers recommendations for selecting appropriate data analysis strategies. It also includes practical advice on software validation (Annex A) and reference data sets for benchmarking (Annex B).

Key topics and technical requirements

  • Pre-requisites for chemometric analysis
    • Data adequacy, representativeness and quality
    • Data acquisition, management, validation and security
    • Metadata and database vs spreadsheet considerations
  • Data preprocessing
    • Filtering, smoothing, data reduction and dimensionality handling
  • Statistical methods covered
    • Univariate analysis: descriptive statistics, hypothesis testing, ANOVA, general linear models
    • Bivariate analysis: regression analysis, time series analysis
    • Overview of multivariate methods (contextual guidance for PAT applications)
  • Technical and methodological classifications
    • Data formats, method selection criteria and performance considerations
  • Software validation and benchmarking
    • Practical recommendations for validating commercial and customized software at installation site
    • Reference data sets for software benchmarking (Annex B)
  • Fields of application
    • Sensor-level analytics, analytics within production units (process control systems, LIMS), and analytics along production chains (ERP, data mining)

Practical applications and users

IEC TR 62829-1 is intended for professionals working with PAT and industrial measurement data who need reliable, validated chemometric approaches, including:

  • Process engineers and control specialists applying real-time monitoring and multivariate control charts
  • Analytical chemists and laboratory personnel developing calibration and QA/QC procedures
  • Sensor and instrument vendors embedding chemometric algorithms into devices
  • Software developers and system integrators implementing and validating chemometric software
  • Quality assurance, regulatory and compliance teams assessing method performance

Common use cases:

  • Design of sampling and experimental strategies
  • Signal processing and data preprocessing for sensor streams
  • Process monitoring, optimization and control using chemometric models
  • Validation of calibration models and software at the site of installation
  • Benchmarking chemometric tools with reference data sets

Related standards

  • Part of the IEC 62829 series on "Chemometrics for process analytical technologies" - consult other parts of the series for additional, complementary guidance.

Keywords: IEC TR 62829-1, chemometrics, process analytical technologies, PAT, univariate analysis, multivariate, data preprocessing, software validation, benchmarking, process monitoring.

Technical report

IEC TR 62829-1:2019 - Chemometrics for process analytical technologies - Part 1: General provisions, and methods for univariate statistics and chemometric processing of data

English language
39 pages

Frequently Asked Questions

IEC TR 62829-1:2019 is a technical report published by the International Electrotechnical Commission (IEC). Its full title is "Chemometrics for process analytical technologies - Part 1: General provisions, and methods for univariate statistics and chemometric processing of data". This report covers a study into the pre-requisites of chemometric (exploratory) data analysis; an overview of common data analysis procedures for univariate, bivariate and multivariate data analysis; explanations of the basic principles and major application areas of the different methods; and some recommendations on the selection of an appropriate data analysis strategy. These recommendations, which are not covered by earlier guidance documents on the topic, are complemented by some advice on the validation of commercial software (at the site of installation) and tailored software for process analytical purposes. Where available, reference data sets (Annex B) are recommended for benchmarking software that implements the data analysis methods covered.


IEC TR 62829-1:2019 is classified under the following ICS (International Classification for Standards) categories: 25.040.40 - Industrial process measurement and control. The ICS classification helps identify the subject area and facilitates finding related standards.

You can purchase IEC TR 62829-1:2019 directly from iTeh Standards. The document is available in PDF format and is delivered instantly after payment. Add the standard to your cart and complete the secure checkout process. iTeh Standards is an authorized distributor of IEC standards.

Standards Content (Sample)


IEC TR 62829-1 ®
Edition 1.0 2019-11
TECHNICAL
REPORT
Chemometrics for process analytical technologies –
Part 1: General provisions, and methods for univariate statistics and
chemometric processing of data
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form
or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from
either IEC or IEC's member National Committee in the country of the requester. If you have any questions about IEC
copyright or have an enquiry about obtaining additional rights to this publication, please contact the address below or
your local IEC member National Committee for further information.

IEC Central Office
3, rue de Varembé
CH-1211 Geneva 20
Switzerland
Tel.: +41 22 919 02 11
info@iec.ch
www.iec.ch
About the IEC
The International Electrotechnical Commission (IEC) is the leading global organization that prepares and publishes
International Standards for all electrical, electronic and related technologies.

About IEC publications
The technical content of IEC publications is kept under constant review by the IEC. Please make sure that you have the
latest edition, a corrigendum or an amendment might have been published.

IEC publications search - webstore.iec.ch/advsearchform
The advanced search enables to find IEC publications by a variety of criteria (reference number, text, technical
committee,…). It also gives information on projects, replaced and withdrawn publications.

Electropedia - www.electropedia.org
The world's leading online dictionary on electrotechnology, containing more than 22 000 terminological entries in
English and French, with equivalent terms in 16 additional languages. Also known as the International
Electrotechnical Vocabulary (IEV) online.

IEC Just Published - webstore.iec.ch/justpublished
Stay up to date on all new IEC publications. Just Published details all new publications released. Available online
and once a month by email.

IEC Glossary - std.iec.ch/glossary
67 000 electrotechnical terminology entries in English and French extracted from the Terms and Definitions clause
of IEC publications issued since 2002. Some entries have been collected from earlier publications of IEC TC 37,
77, 86 and CISPR.

IEC Customer Service Centre - webstore.iec.ch/csc
If you wish to give us your feedback on this publication or need further assistance, please contact the Customer
Service Centre: sales@iec.ch.

INTERNATIONAL
ELECTROTECHNICAL
COMMISSION
ICS 25.040.40 ISBN 978-2-8322-7584-9

– 2 – IEC TR 62829-1:2019 © IEC 2019
CONTENTS
FOREWORD
INTRODUCTION
1 Scope
2 Normative references
3 Terms and definitions
4 Fields of application
4.1 Process control and process analytical technologies (PAT)
4.2 Physical and chemical properties
4.3 PAT fields of application
4.3.1 Definition of chemometrics
4.3.2 Overview on PAT fields of applications
4.3.3 Chemometrics for sensors
4.3.4 Chemometrics for production units
4.3.5 Chemometrics along a production chain
5 Pre-requisites of chemometric data analysis
5.1 Data has to be adequate and reliable
5.2 Data representativeness
5.3 Data acquisition
5.4 Data management
5.5 Databases versus spreadsheets
5.6 Data quality
5.7 Data validation
5.8 Data corruption
5.9 Data security and fraudulent data detection
5.10 Data management for data mining
6 Pre-requisites of chemometric data analysis
6.1 Technical requirements of chemometric data analysis
6.2 Data dimensionality
6.3 Method classification
6.4 Data pre-processing
6.4.1 Filtering
6.4.2 Smoothing
6.4.3 Data reduction
7 Methods of chemometric data analysis
7.1 Univariate analysis
7.1.1 Descriptive statistics
7.1.2 Hypothesis testing
7.1.3 Analysis of variance (ANOVA)
7.1.4 General linear models
7.2 Bivariate analysis
7.2.1 Regression analysis
7.2.2 Time series analysis

Annex A (informative) Advice on software validation for process analytical applications
A.1 General
A.2 Basic recommendations
A.3 Software validation
Annex B (informative) Reference data sets available for software benchmarking
Bibliography

Figure 1 – Different levels of chemometric applications: (a) within an (intelligent or smart) sensor, (b) within a
production unit, e.g., process control system, process control environment, or laboratory information management
system (LIMS), (c) along a production chain, e.g., ERP system, data mining, etc.
Figure 2 – Influence of pre-processing techniques for classification of the geographical origin of wine
Figure A.1 – Different paths for the introduction of new software in a laboratory

Table 1 – Data analysis techniques and data formats
Table A.1 – Categories of software
Table A.2 – Software validation levels

INTERNATIONAL ELECTROTECHNICAL COMMISSION
____________
CHEMOMETRICS FOR PROCESS ANALYTICAL TECHNOLOGIES –

Part 1: General provisions, and methods for univariate statistics
and chemometric processing of data

FOREWORD
1) The International Electrotechnical Commission (IEC) is a worldwide organization for standardization comprising
all national electrotechnical committees (IEC National Committees). The object of IEC is to promote
international co-operation on all questions concerning standardization in the electrical and electronic fields. To
this end and in addition to other activities, IEC publishes International Standards, Technical Specifications,
Technical Reports, Publicly Available Specifications (PAS) and Guides (hereafter referred to as “IEC
Publication(s)”). Their preparation is entrusted to technical committees; any IEC National Committee interested
in the subject dealt with may participate in this preparatory work. International, governmental and non-
governmental organizations liaising with the IEC also participate in this preparation. IEC collaborates closely
with the International Organization for Standardization (ISO) in accordance with conditions determined by
agreement between the two organizations.
2) The formal decisions or agreements of IEC on technical matters express, as nearly as possible, an international
consensus of opinion on the relevant subjects since each technical committee has representation from all
interested IEC National Committees.
3) IEC Publications have the form of recommendations for international use and are accepted by IEC National
Committees in that sense. While all reasonable efforts are made to ensure that the technical content of IEC
Publications is accurate, IEC cannot be held responsible for the way in which they are used or for any
misinterpretation by any end user.
4) In order to promote international uniformity, IEC National Committees undertake to apply IEC Publications
transparently to the maximum extent possible in their national and regional publications. Any divergence
between any IEC Publication and the corresponding national or regional publication shall be clearly indicated in
the latter.
5) IEC itself does not provide any attestation of conformity. Independent certification bodies provide conformity
assessment services and, in some areas, access to IEC marks of conformity. IEC is not responsible for any
services carried out by independent certification bodies.
6) All users should ensure that they have the latest edition of this publication.
7) No liability shall attach to IEC or its directors, employees, servants or agents including individual experts and
members of its technical committees and IEC National Committees for any personal injury, property damage or
other damage of any nature whatsoever, whether direct or indirect, or for costs (including legal fees) and
expenses arising out of the publication, use of, or reliance upon, this IEC Publication or any other IEC
Publications.
8) Attention is drawn to the Normative references cited in this publication. Use of the referenced publications is
indispensable for the correct application of this publication.
9) Attention is drawn to the possibility that some of the elements of this IEC Publication may be the subject of
patent rights. IEC shall not be held responsible for identifying any or all such patent rights.
The main task of IEC technical committees is to prepare International Standards. However, a
technical committee may propose the publication of a Technical Report when it has collected
data of a different kind from that which is normally published as an International Standard, for
example "state of the art".
IEC TR 62829-1, which is a Technical Report, has been prepared by subcommittee 65B:
Measurement and control devices, of IEC technical committee 65: Industrial-process
measurement, control and automation.

The text of this Technical Report is based on the following documents:
Enquiry draft     Report on voting
65B/1062/DTR      65B/1095B/RVDTR
Full information on the voting for the approval of this Technical Report can be found in the
report on voting indicated in the above table.
This document has been drafted in accordance with the ISO/IEC Directives, Part 2.
A list of all parts in the IEC 62829 series, published under the general title Chemometrics for
process analytical technologies, can be found on the IEC website.
The committee has decided that the contents of this document will remain unchanged until the
stability date indicated on the IEC website under "http://webstore.iec.ch" in the data related to
the specific document. At this date, the document will be
• reconfirmed,
• withdrawn,
• replaced by a revised edition, or
• amended.
A bilingual version of this publication may be issued at a later date.

IMPORTANT – The 'colour inside' logo on the cover page of this publication indicates
that it contains colours which are considered to be useful for the correct
understanding of its contents. Users should therefore print this document using a
colour printer.
INTRODUCTION
Chemometrics is a rapidly developing subject. It was thus felt that a report offering guidance
on its application to process analytical applications would both be helpful to all users of such
technology and would stimulate specialists in chemometrics to work with users and
developers of this technology.
This document does not seek to do other than provide a useful overview and a brief
bibliography that enables interested parties to learn about and, hopefully, apply chemometrics
in the most useful and appropriate ways for their circumstances. In that sense, it is definitely
not prescriptive but constructively critical and seeks to encourage good practice and a wider
appreciation.
It also aims at encouraging new research and development, as well as innovation, in
applications of chemometrics for process analytical applications by highlighting areas to which
such activities might usefully be directed.
Nowadays, the use of chemometric data analysis methods is widespread. Applications are in
fields like
• design of statistical/chemometric sampling strategies, design of experiments, design of
observational studies,
• design of data collection (including signal processing) protocols, data validation methods
and database management (including metadata management),
• quality management, including quality assurance and quality control,
• data analysis and interpretation, not only in the use of multivariate (many variable)
methods but also univariate (one variable) and bivariate (two variable) methods,
• process monitoring, optimization and control,
• chemical process and property modelling,
• guiding decision analysis and designing decision analysis methods/protocols in process
control and optimization,
• method and instrumentation performance validation (Annex A) and calibration.
Because chemometrics is interdisciplinary and multidisciplinary in nature, it is often
possible to make unusual links and thereby solve problems by taking cues from disciplines
as diverse as medical diagnostics, decision sciences and quality assurance.
For example, in diagnosing the likely environmental impact of discharges of waste water from
an industrial process, we might want to link toxicity assessment to chemical composition, the
route and extent of discharge and the organisms likely to be affected. This might involve
establishing a chemometric (mathematical) model of the impact of the discharge, bio-sensing
the toxicity of the discharge on-line and relating both to the time, volume and concentration
variations in chemical composition and physicochemical properties. This could then be used
to assess the predictive reliability of the model and how this might be linked to process control
and optimization of the discharge treatment and any associated risk assessment of the
discharge process.
Conventionally, process control has involved using control charts for individual variables and
this sometimes leads us to false impressions of process behaviour. Since 2010, techniques
including both commercial and other software have become available to construct a wide
variety of useful multivariate control charts that sometimes reveal "out-of-control" situations
not apparent using conventional univariate control charts.
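
The contrast between univariate and multivariate control charts can be illustrated with a minimal sketch. It is not taken from this report: the synthetic two-variable data, the function names, and the choice of a Hotelling T² statistic with a large-sample chi-square limit are assumptions made purely for illustration, and a production implementation would normally rely on a validated numerical library.

```python
import math
import random
import statistics


def make_reference(n=200, seed=0):
    """Synthetic in-control data: two strongly correlated variables (assumed)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys


def fit_reference(xs, ys):
    """Return the mean vector and the sample covariance terms (sxx, sxy, syy)."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def hotelling_t2(point, mean, cov):
    """T^2 = d^T S^-1 d for one two-variable observation."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def within_univariate_limits(point, mean, cov, k=3.0):
    """True if each variable lies inside its own mean +/- k*sigma band."""
    sxx, _, syy = cov
    return (abs(point[0] - mean[0]) <= k * math.sqrt(sxx)
            and abs(point[1] - mean[1]) <= k * math.sqrt(syy))


# Large-sample limit: chi-square quantile with 2 degrees of freedom, which
# has the closed form -2*ln(alpha); alpha = 0.0027 mirrors a 3-sigma chart.
T2_LIMIT = -2.0 * math.log(0.0027)

if __name__ == "__main__":
    xs, ys = make_reference()
    mean, cov = fit_reference(xs, ys)
    suspect = (2.0, -2.0)  # each value is plausible alone, the pair is not
    print("inside univariate limits:", within_univariate_limits(suspect, mean, cov))
    print("T2 above limit:", hotelling_t2(suspect, mean, cov) > T2_LIMIT)
```

In this sketch the suspect observation stays inside both individual 3-sigma bands, yet it violates the correlation structure of the reference data, so only the multivariate statistic flags it.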

Because chemometric methods are applicable to a nearly unlimited number of cases in all
fields of measurement and testing, and particularly because chemometric techniques are
needed in process analytical applications, guidance on the available methods and their
appropriate choice was felt to be necessary.


1 Scope
This part of IEC 62829, which is a Technical Report, covers
• a study into the pre-requisites of chemometric (exploratory) data analysis,
• an overview of common data analysis procedures for univariate, bivariate and multivariate
data analysis,
• explanations of the basic principles and major application areas of the different methods,
• some recommendations on the selection of an appropriate data analysis strategy.
These recommendations, which are not covered by earlier guidance documents on the topic, are
complemented by some advice on the validation of commercial software (at the site of
installation) and tailored software for process analytical purposes. Where available,
reference data sets (Annex B) are recommended for benchmarking software that implements the
data analysis methods covered. An application example is given.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
No terms and definitions are listed in this document.
ISO and IEC maintain terminological databases for use in standardization at the following
addresses:
• IEC Electropedia: available at http://www.electropedia.org/
• ISO Online browsing platform: available at http://www.iso.org/obp
4 Fields of application
4.1 Process control and process analytical technologies (PAT)
There is currently a considerable trend to use process analytical technologies (PAT) for
reaction monitoring and (direct loop) process control. Current developments in process
engineering are unimaginable without PAT. Examples are modern process design, integrated
processes (e.g., reactive separation processes), and intensified processes, along with the
associated requirements for process control, model-based control, and soft sensing, all of
which involve chemometrics.
The process industry relies on the design, operation, control, and optimization of chemical,
physical, or biological processes. This involves creating production facilities that translate raw
materials into value-added products along the supply chain. Such conversions typically take
place in repeated reaction and separation steps – either in batch or continuous processes.
The end products of a chemical production facility are the result of several production steps
that are connected not only in a sequential fashion, but also involve recycling of unused raw
materials and by-products, as well as waste treatment stages. Production processes in the
process industry are particularly disturbed by variations in feed-stocks and other influences
that impact the product quality. An integrated process control approach enables constant
product quality and prevents out-of-spec production by effectively compensating for such
process variations. In a conventional approach, quality is determined by withdrawing samples
from material streams and conducting offline analytics, which is called in-process control or
at-line or off-line control. By applying quality by design (QbD, see ICH definition in 4.2)
approaches, quality can significantly improve to generate less waste, reduce reprocessing of
substandard material, and create products of superior quality.
Today's optimized process design relies heavily on computer aided tools, which account for,
for example, mass transfer, thermodynamic, kinetic, and other physical properties of the
treated materials. Typically, a sufficient understanding of such properties is available and
implemented in dynamic numeric models. Dynamic models are in turn the essential basis for
optimized process and plant design. Unfortunately, they are only sparsely used for process
control. A definition from Lee (2008) brings this to a contemporary level:
"Cyber-physical systems (CPS) are integrations of computation with physical processes.
Embedded computers and networks monitor and control the physical processes, usually with
feedback loops where physical processes affect computations and vice versa."
4.2 Physical and chemical properties
Process analytical techniques are extremely useful tools for chemical production and
manufacture and are of particular interest to the pharmaceutical, food and (petro-)chemical
industries. They can be easily transferred to manufacturing for process control and for quality
assurance of final products to meet required product specifications, since they provide dynamic
information about product properties, material stream characteristics, and process conditions.
Normally, the quality of the final product is assessed after processing by adequate testing
procedures. The rationale behind process analytical technologies (PAT) is to measure, and
assess, physical and chemical properties over, and throughout, the production process in
order to assure a product which is within the tolerance limits or regulatory restrictions.
The quality of any product or the properties of a material can be described by a complex
functional relationship to physical and/or chemical properties of the constituents of the
product/material and its temporal changes during processing.
According to ICH Q8(R2), quality is the suitability of either a drug substance or drug product
for its intended use (ICH: International Conference on Harmonisation). This term includes
attributes such as identity, strength, and purity. Before the launch of the PAT initiative,
pharmaceutical production was confronted with challenges like drug shortages due to
manufacturing difficulties, process deviations coupled with frequent inconclusive
investigations, batch failures and rejections, in-process test debates (e.g., blend uniformity),
slow and protracted cGMP (current good manufacturing practice) remediation, warning letters,
and others.
Science-based regulatory guidance documents, such as the FDA (US Food and Drug
Administration) and ICH PAT guidance, have recognized spectroscopic techniques as potentially
useful tools for building quality into the product and manufacturing processes, as well as
for continuous process improvement. The goal of PAT is to enhance understanding of, and
thereby control, the manufacturing process.
The common future vision in pharmaceutical production is continuous manufacturing (CM),
based on real-time release (RTR), i.e. a risk-based and integrated quality control in each
process unit. This will allow for flexible hook-up of smaller production facilities, production
transfer towards fully automated facilities (featuring less operator intervention and less
down-time), and end-to-end process understanding over product life cycle, future knowledge,
and faster product to market.
Both methods, i.e. in-process and final-product testing, intensively use statistical methods as
described in this document.
4.3 PAT fields of application
4.3.1 Definition of chemometrics
At the dawn of chemometrics, an appropriate definition of the term was that chemometrics is
"… the use of statistical, mathematical and other logic-based techniques, together with
chemical knowledge to solve chemical problems" (D1) with a clear emphasis on chemical
problems, thus explaining the word "chemo" in the term. A definition taken from the Journal of
Chemical Information and Computer Sciences (1975), Vol. 15, page 201, defines
chemometrics as the "development/application of mathematical/statistical methods to extract
useful chemical information from chemical measurements" (D2) thus detailing the aim, namely
the extraction of (useful) information from measurement data.
"Chemometrics is a sub-discipline of metrology dealing with the application of mathematical,
statistical and other methods employing formal logic to evaluate and interpret (chemical,
analytical) data, optimize and model (chemical, analytical) processes and instrumentation,
extract maximum (chemical, analytical) information from experimental and observational
data" (D3).
To date, no internationally agreed and standardized definition of chemometrics exists.
Although all three definitions reflect, in principle, both the intention of, and the instruments
used for, chemometrics, definition D3 is the most general. The application of the principles
and tools of chemometrics is explicitly not limited to the field of chemical/analytical
measurement, so definition D3 may be read and used without the specification
"chemical/analytical", deliberately put in parentheses here.
4.3.2 Overview on PAT fields of applications
Fields of application of PAT are described below and visualized in Figure 1.

(a) within an (intelligent or smart) sensor
(b) within a production unit, e.g. process control system, process control environment, or laboratory information
management system (LIMS)
(c) along a production chain, e.g. ERP system, data mining
Figure 1 – Different levels of chemometric applications
4.3.3 Chemometrics for sensors
Sensors are the sense organs of process automation. Information and communication technology
is currently undergoing major changes, which offer a great opportunity for optimized process
control and value-added production with dedicated, network-communicating sensors. These kinds
of smart or intelligent sensors (see Figure 1) provide services within a network and use
information from the process information systems.

Intelligent field devices, digital field networks, Internet protocol (IP)-enabled connectivity and
web services, historians, and advanced data analysis software are providing the basis for the
future project “Industrie 4.0”, and industrial Internet of Things (IIoT). This is a prerequisite for
the realization of cyber physical systems (CPS) within these future automation concepts for
the process industry.
As a consequence, smart process sensors enable new business models for users, device
manufacturers, and service providers.
4.3.4 Chemometrics for production units
With the introduction of advanced process analytical technology, the closeness of key process
variables to their limits can be directly monitored and controlled, and the processes can
automatically be driven much closer to the optimal operating limit. Classical, non-model-based
solutions reach their limits when sensor information from several sources has to be merged.
In addition, adapting them requires considerable effort over the life cycle of the process.
This calls for adaptive control strategies based on dynamic process models, as mentioned
above. Model-based control concepts also have the potential to cope automatically with
changes in the raw materials as well as in process conditions.
Chemical process control technology has advanced significantly during the last decades. For
world-scale high-throughput continuous processing units such as crackers and separation
trains, in most cases classically engineered control solutions (proportional–integral–derivative
(PID) controllers, cascade and override structures) have been replaced by model-based
techniques, most prominently model-predictive control (MPC) based on linear plant models.
However, the engineering and implementation costs of such advanced controllers are still
high. For smaller, flexible processes in which varying products or intermediates are
manufactured, it is not economic to re-engineer the control concept or to re-model the process
for all intended processes.
Advanced control strategies have to be built upon empirical – often data-based – models
which describe cause-effect mappings between the degrees of freedom of the process and
the product properties in a black-box fashion. Chemometric techniques for the derivation of
empirical models – e.g. partial least squares (PLS), principal component analysis (PCA) – are
available but are currently used mostly for off-line data analysis to detect the causes of variations
in product quality. Automated application along the life cycle is still very limited. The
development of such models requires significant experimental work, and the reduction of the
effort needed for these experiments is the focus of ongoing research. When such stationary
models are available and are combined with dynamic models that describe the times needed
for the transition from one steady state to the other, feedback control and iterative
optimization schemes can be built that make use of the novel sensors.
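As a minimal sketch of how such an empirical, black-box model can be derived, the following pure-Python code fits a PLS model with a single latent variable (PLS1) to mean-centred data. The function names, the toy data and the restriction to one component are assumptions made for illustration and are not taken from this report; practical work would normally rely on a validated chemometrics package.

```python
from math import sqrt


def pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model. X is a list of rows, y a list of floats."""
    n, p = len(X), len(X[0])
    # Mean-centre the predictors and the response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector w is proportional to X'y (direction of maximum covariance with y).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("predictors carry no covariance with the response")
    w = [v / norm for v in w]
    # Scores t = Xc w, then regress the centred response on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    # For a single component the regression vector is simply q * w.
    b = [q * wj for wj in w]
    return x_mean, y_mean, b


def predict(model, row):
    """Predict the response for one new observation."""
    x_mean, y_mean, b = model
    return y_mean + sum(bj * (xj - mj) for bj, xj, mj in zip(b, row, x_mean))


# Hypothetical data: two collinear process variables and one product property.
X = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
y = [1.0, 4.0, 7.0, 10.0]
model = pls1_one_component(X, y)
prediction = predict(model, [4.0, 8.0])
print(round(prediction, 6))
```

The toy predictors are deliberately collinear: ordinary least squares has no unique solution in that situation, whereas the latent-variable projection still yields a usable model.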
The departure from current automation measurement to smart sensor systems has already
begun. Further development will build on the current situation in several steps. Possible
next steps include additional communication channels to mobile devices, bidirectional
communication, integration of the cloud, and virtualization. The cost of connectivity is dropping
dramatically, providing powerful potential to connect people, assets, and information across
the industrial enterprise. As long as they provide only add-on information, the first cloud
services may not require high availability or real-time capabilities. Once such capabilities
become available, however, even process control tasks will be possible using cloud services,
e.g. when complex algorithms are needed that demand high computing performance and availability.
4.3.5 Chemometrics along a production chain
Current research focuses on closed-loop adaptive control concepts for plant-wide process
control, which make use of specific or non-specific sensors along with conventional plant
instruments. Such advanced control solutions could provide more than control information
alone, for example sensor failure detection and control performance monitoring, and could
improve simulation-based engineering. At present, such data is typically collected and
analysed in an enterprise resource planning (ERP) system.

Closed-loop adaptive control concepts can be used to optimize global mass and energy
balances, local response surfaces that relate specifications of outgoing and ingoing streams
to the consumption of energy and the cost of production, or simple dynamic models of the
behaviour of the process stream (i.e. delays, settling times, etc.). In such a manner, the plant-
wide control of the entire process can be performed by setting targets and constraints on the
flow rates and on the properties of material streams.
The plant-wide control scheme is implemented using iterative set-point optimization on the
basis of the local models taking into account the dynamic behaviour. When the local
controllers are model-based, the response surfaces can be computed from these models. This
is not the case if classical control schemes are used, where they must be derived from
empirical data.
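The idea of iterative set-point optimization on a local model can be illustrated with a short sketch. The quadratic response surface, the bounds and the step size below are hypothetical and are not taken from this report; the loop is a plain projected gradient descent, chosen only because it is easy to follow.

```python
def optimize_set_point(cost, start, lower, upper, step=0.1, iterations=200, eps=1e-6):
    """Iteratively move a set-point downhill on a local cost model, within bounds."""
    set_point = start
    for _ in range(iterations):
        # Central finite difference approximates the slope of the response surface.
        slope = (cost(set_point + eps) - cost(set_point - eps)) / (2.0 * eps)
        # Take a step against the slope, then project back onto the constraints.
        set_point = min(upper, max(lower, set_point - step * slope))
    return set_point


def local_cost(flow_rate):
    """Hypothetical local response surface: production cost versus flow rate."""
    return 2.0 * (flow_rate - 12.0) ** 2 + 5.0


unconstrained = optimize_set_point(local_cost, start=5.0, lower=0.0, upper=20.0)
constrained = optimize_set_point(local_cost, start=5.0, lower=0.0, upper=10.0)
print(round(unconstrained, 4), round(constrained, 4))
```

In the second call the upper constraint is active, so the optimizer settles on the bound rather than on the minimum of the surface. This mirrors the setting of targets and constraints on stream properties.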
Such powerful analytics will help optimize both assets and systems. Predictive analytics will
be installed to reduce unplanned down-time. Newly available information generated by these
tools will lead to new, transformative business models supported by new applications. Instead
of offering physical products for sale, companies will increasingly offer products as a service.
5 Pre-requisites of chemometric data analysis
5.1 Data has to be adequate and reliable
Extraction of useful information from measurement results implicitly supposes, and even
requires, the data to be adequate and reliable. This imposes certain restrictions on data
acquisition. Issues to be tackled before the start of measurement campaigns for the
acquisition of large amounts of data are briefly described in 5.2 to 5.10.
5.2 Data representativeness
Data representativeness is understood as the validity of data obtained from a certain selection
of sub-units of a manifold for characterizing that manifold. Sub-units may be single products,
samples taken from a reactor, sub-samples taken from a plot of land at different horizons,
prospective measurements taken in an oil, gas or gold field, the atmospheric environs of a
huge city, or anything related. The manifold is the entity from which the samples are taken,
whichever applies.
Representativeness refers to the fitness-for-purpose of decisions (not measurements) taken on
a manifold. A single product or a single sample can easily be qualified using adequate
methods of analysis, but propagating this result, and the corresponding decision on
fitness-for-purpose, to the whole manifold remains arbitrary.
Good sampling procedures providing reliable estimates for a manifold consisting of many
sampling units (defined, for example, by the final packaging unit size for the consumer) are
well described, leading to factor analysis procedures as described for ANOVA (see 7.1.3).
Visman (1969) and Ingamells (1973) developed specific sampling constants for defining the
amount of sub-sampling from a manifold of matter. These can be used to characterize the
manifold by
• a sum parameter, in the simplest form the average over a certain number of sub-samples,
• a value distribution, which might be spatial or time-dependent, requiring a (much) larger
number of observations.
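The following sketch illustrates the first option, a sum parameter derived from replicate sub-samples, together with a sampling constant. It assumes the commonly cited form of the Ingamells constant, K_s = R² · w, where R is the relative standard deviation in percent and w is the sub-sample mass. That formula, the function names and the replicate values are assumptions for illustration and are not quoted from this report.

```python
from statistics import mean, stdev


def ingamells_constant(results, sample_mass_g):
    """Sampling constant K_s (in grams), assuming K_s = R^2 * w with R in percent."""
    rsd_percent = 100.0 * stdev(results) / mean(results)
    return rsd_percent ** 2 * sample_mass_g


def mass_for_target_rsd(ks_g, target_rsd_percent):
    """Sub-sample mass needed to reach a target relative standard deviation."""
    return ks_g / target_rsd_percent ** 2


# Hypothetical replicate results from 2 g sub-samples of one manifold.
replicates = [9.0, 10.0, 11.0]
average = mean(replicates)                      # the sum parameter
ks = ingamells_constant(replicates, sample_mass_g=2.0)
required_mass = mass_for_target_rsd(ks, target_rsd_percent=5.0)
print(average, ks, required_mass)
```

Under these assumptions, halving the relative standard deviation requires four times the sub-sample mass. This is why a value distribution needs far more material and observations than a simple average.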
Although it requires more data than the above methods, the Krige (1951) approach may be
helpful. For prospecting of value-carrying metals (like gold or platinum), this might be
acceptable; for all other manifold-investigation tasks, prior information should be included in
the final assessment. However, none of the statistical procedures can guarantee to find the
legendary needle in a haystack.

5.3 Data acquisition
In gathering data from process instrumentation, one should be seeking to obtain data of
adequate quality and quantity for further processing, display and finally reliable interpretation
within the context of process monitoring, control and perhaps feedback for process
adjustment. In order for data to be appropriate and "fit for purpose", the data users need to
think very carefully about the specification of the data at each stage in data processing from
initial capture to final application of the interpretation. Issues relating to the various biases
and uncertainties that may accumulate through these various stages need addressing.
Process instrumentation may have sensors and transducers for direct measurement or may
be more complex involving spectrometric or chromatographic systems for online or off-line
measurement of either batch or continuous processes with either discrete or continuous
physical sampling, either taken from a single location along a chain or gradient or from a two-
dimensional surface or out of a three-dimensional space. The immediate output from sensors,
transducers or more complex instrumentation is commonly analogue and may be conditioned
by filtering and/or amplifying and subsequently digitized for further processing. Alternatively,
the immediate output may be digital, as with photon or ion counting systems.
An important part of the design of the data gathering process involves the careful
consideration of (a) the nature of the signal (analogue or digital, range, frequency, noise
extent and types, etc.), (b) how it is gathered, (c) how it is conditioned and (d) how it is
amplified. These various processes may seriously affect the usability of the data gathered for
process monitoring, control or adjustment.
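A small sketch can show how conditioning and digitization shape the data that reach later processing stages. The trailing moving-average filter, the 8-bit converter and the 0 V to 10 V input range are assumptions chosen for illustration; real instrumentation will differ.

```python
def moving_average(signal, window):
    """Condition a signal with a trailing moving average (shorter at the start)."""
    smoothed = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def quantize(value, v_min, v_max, bits):
    """Digitize an analogue value into an integer code; out-of-range input is clipped."""
    levels = 2 ** bits - 1
    clipped = min(v_max, max(v_min, value))
    return round((clipped - v_min) / (v_max - v_min) * levels)


def code_to_value(code, v_min, v_max, bits):
    """Convert an integer code back to the engineering value it represents."""
    return v_min + code * (v_max - v_min) / (2 ** bits - 1)


raw = [1.0, 2.0, 3.0, 4.0]                       # hypothetical analogue readings in volts
conditioned = moving_average(raw, window=2)
codes = [quantize(v, 0.0, 10.0, bits=8) for v in conditioned]
resolution = code_to_value(1, 0.0, 10.0, bits=8)  # smallest distinguishable step
print(conditioned, codes, resolution)
```

The example makes two effects visible: filtering changes the values that are digitized, and the number of bits limits the smallest change that can be resolved. Both belong in the specification of the data.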
5.4 Data management
Metadata is a class of data describing data, such as process measurement data acquired
from process instrumentation. Examples include the time and date when a measurement was
made, what variable was measured, in what units the variable was measured, from where and
from what the specimen was sampled, under what conditions the specimen was taken, and time
delays and time lags in the sampling system. Reliable recording of metadata associated
with measurements made using process instrumentation, whether for calibration or process
monitoring, is essential. It is also crucially important that the links between measurement data
and associated metadata are preserved intact through all data collection, data management
and data analysis processes. Otherwise, interpretation of such measurements has little or no
value in process monitoring, control or feedback adjustments.
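One way to keep the link between a measurement and its metadata intact is to store both in a single immutable record. The sketch below uses Python dataclasses; the field names, the example values and the `process_time` helper are hypothetical and only mirror the kinds of metadata listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Metadata:
    measured_at: datetime      # time and date of the measurement
    variable: str              # what was measured
    unit: str                  # unit of the measured variable
    sampling_point: str        # from where the specimen was sampled
    conditions: str            # conditions under which the specimen was taken
    sampling_delay_s: float    # time delay of the sampling system, in seconds


@dataclass(frozen=True)
class Measurement:
    value: float
    meta: Metadata             # metadata travels with the value and cannot be detached

    def process_time(self) -> datetime:
        """Time at which the specimen left the process (measurement time minus delay)."""
        return self.meta.measured_at - timedelta(seconds=self.meta.sampling_delay_s)


reading = Measurement(
    value=351.2,
    meta=Metadata(
        measured_at=datetime(2019, 11, 21, 8, 0, 45, tzinfo=timezone.utc),
        variable="temperature",
        unit="K",
        sampling_point="reactor R1 outlet",
        conditions="steady state",
        sampling_delay_s=45.0,
    ),
)
print(reading.process_time().isoformat())
```

Declaring the records as frozen means that an accidental overwrite of the value or of its metadata raises an error instead of silently breaking the link.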
5.5 Databases versus spreadsheets
In the electronic recording of data and associated metadata from instrumentation, various
options may be available from which to choose.
1) Data may be recorded on a data collection device and subsequently transferred manually
by a data clerk onto an electronic record.
2) Data may be transferred automatically into a computer-software-based spreadsheet.
3) Data may be transferred automatically into a computer-software-based relational database
management system in which data from an observation (involving one or more variables)
made using process instrumentation and the associated metadata linked to that
observation are considered as a single entity. This enables analysis to be performed with
queries designed in a structured query language (SQL) or via well designed on-line
analytical processing (OLAP) systems as is now being implemented in various types of
data mining.
Option 1, involving manual entry to electronic records, requires data quality and validation
checks to ensure adequate correction of transcription errors, whose incidence may be as high
as 5 % of data entries.
Option 2, involving automatic data transfer to spreadsheets, is less satisfactory than Option 3
for the following reasons. Although some spreadsheet software packages have facilities for
security and auditing, many inexperienced users do not appreciate the value of these, and
they tend not to be used or are implemented in a less than satisfactory manner. Spreadsheets
are easy to corrupt, whether accidentally or deliberately. Accidental corruption may occur, for
example, as a result of instrumentation faults or signal transmission errors. Alternatively,
corruption may be deliberate and fraudulent.
Because it is not straightforward to protect spreadsheets from accidental or deliberate
corruption and because the auditing of such errors is not straightforward or reliable, it is
difficult to ensure data validity and quality.
Metadata needs to be properly linked to the data with which it is associated, and in
spreadsheets this link is much less secure and reliable, and more difficult to protect.
In well-designed database software, facilities are available to set many levels of data
protection to limit corruption and to have various forms of data validation (see 5.7),
although the latter types of data screening require care in their design so that they flag
data rather than reject it. This then enables rigorous forms of data audit to be
implemented.
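The database option can be sketched with the SQLite engine from the Python standard library. The table layout, column names and example rows below are assumptions made for illustration; an actual installation would use whatever relational database management system and schema the site has validated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # enforce the data-to-metadata link
conn.executescript("""
CREATE TABLE observation (
    id    INTEGER PRIMARY KEY,
    value REAL NOT NULL,
    flag  TEXT NOT NULL DEFAULT 'ok'       -- flagged, never deleted
);
CREATE TABLE metadata (
    observation_id INTEGER NOT NULL UNIQUE REFERENCES observation(id),
    measured_at    TEXT NOT NULL,
    variable       TEXT NOT NULL,
    unit           TEXT NOT NULL,
    sampling_point TEXT NOT NULL
);
""")


def record(conn, value, measured_at, variable, unit, sampling_point, flag="ok"):
    """Store a value and its metadata in one transaction, so neither exists alone."""
    with conn:
        cur = conn.execute(
            "INSERT INTO observation (value, flag) VALUES (?, ?)", (value, flag)
        )
        conn.execute(
            "INSERT INTO metadata VALUES (?, ?, ?, ?, ?)",
            (cur.lastrowid, measured_at, variable, unit, sampling_point),
        )
    return cur.lastrowid


record(conn, 351.2, "2019-11-21T08:00:00", "temperature", "K", "reactor R1 outlet")
record(conn, 352.0, "2019-11-21T08:05:00", "temperature", "K", "reactor R1 outlet")
record(conn, 2.1, "2019-11-21T08:00:00", "pressure", "bar", "reactor R1 outlet")

# A structured query retrieves values together with their metadata.
rows = conn.execute(
    """
    SELECT o.value, m.unit, m.measured_at
    FROM observation AS o
    JOIN metadata AS m ON m.observation_id = o.id
    WHERE m.variable = ?
    ORDER BY m.measured_at
    """,
    ("temperature",),
).fetchall()
print(rows)
```

Writing the value and its metadata within one transaction means that a failure part-way through leaves neither row behind. A spreadsheet cannot easily guarantee this.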
5.6 Data quality
The quality of data may be expressed in terms of an uncertainty budget. Included in that
budget will be contributions from sources of random variation in measurement and sampling
(components of the precision contributions) and sources of bias in measurement and
sampling (accuracy).
These should be checked periodically, preferably in a planned and strategic way, with
additional opportunities for unplanned checking as well, for example using various kinds of
calibration protocols. The data management system protocols should include ways of
checking data quality and of verifying that quality management protocols are being actively
and reliably used.
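A simple uncertainty budget can be sketched as follows. The code assumes that all contributions are independent and expressed as standard uncertainties in the same unit, so that they combine as the root sum of squares. This is the usual convention in measurement uncertainty guidance rather than a rule stated in this report, and the contribution names and values are hypothetical.

```python
from math import sqrt


def combined_standard_uncertainty(budget):
    """Combine independent standard uncertainties as the root sum of squares."""
    return sqrt(sum(u * u for u in budget.values()))


def relative_shares(budget):
    """Fraction of the combined variance contributed by each budget entry."""
    total = sum(u * u for u in budget.values())
    return {name: u * u / total for name, u in budget.items()}


# Hypothetical budget, all entries in the unit of the measured quantity.
budget = {
    "measurement repeatability": 0.3,   # random variation in measurement
    "sampling variability": 0.4,        # random variation in sampling
    "uncorrected bias estimate": 1.2,   # bias in measurement and sampling
}
u_combined = combined_standard_uncertainty(budget)
shares = relative_shares(budget)
print(round(u_combined, 3), max(shares, key=shares.get))
```

Listing the shares shows which contribution dominates the budget and therefore where periodic checks are likely to be most worthwhile.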
5.7 Data validation
Data validation as data are entered into a database may include checks on whether data values
are in the expected range. Data should never be deleted or rejected merely because they are
outside the expected range. Rather, they should be flagged as outside the set range to enable or
initiate validity checks. The data may actually be valid and outside the expected range
because of changes in process behaviour and thus provide a useful warning of such changes.
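The principle of flagging rather than rejecting can be sketched in a few lines. The flag labels, the function names and the range limits are hypothetical.

```python
def range_flag(value, low, high):
    """Return a flag describing where the value lies relative to the expected range."""
    if value < low:
        return "below_range"
    if value > high:
        return "above_range"
    return "ok"


def ingest(values, low, high):
    """Keep every value and attach a flag; nothing is deleted or rejected."""
    return [(value, range_flag(value, low, high)) for value in values]


incoming = [5.0, 12.5, -1.0]                 # hypothetical readings
stored = ingest(incoming, low=0.0, high=10.0)
needs_review = [value for value, flag in stored if flag != "ok"]
print(stored, needs_review)
```

All three readings are retained. The two flagged values can then trigger a validity check, which may reveal either a fault or a genuine change in process behaviour.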
5.8 Data corruption
Data may be prone to accidental corruption as a result of failure or faulty behaviour of
sensors and transducers, electrical interference, faulty transmission, incorrect setting of
signal conditioning, amplification, integration, operator error, etc. The integrity of sensor and
transducer behaviour, signal acquisition, etc., should be checked on a planned basis with
opportunities for unplanned checking for accidental corruption. Quality management of
operator compliance with standard operating procedures should be included in the quality
management system to allow for data corruption
...


The article discusses IEC TR 62829-1:2019, a Technical Report that focuses on chemometrics for process analytical technologies. It covers various aspects of data analysis, including the prerequisites of exploratory data analysis, common procedures for univariate, bivariate, and multivariate data analysis, and the principles and applications of different methods. The report also provides recommendations on selecting an appropriate data analysis strategy and offers advice on validating software for process analytical purposes. Additionally, it mentions the availability of reference data sets for benchmarking software implementing the covered data analysis methods.
