ASTM E2891-20
Standard Guide for Multivariate Data Analysis in Pharmaceutical Development and Manufacturing Applications
SIGNIFICANCE AND USE
4.1 A significant amount of data is generated during pharmaceutical development and manufacturing activities. The interpretation of such data is becoming increasingly difficult. Individual examination of the univariate process variables is relevant but can be significantly complemented by multivariate data analysis (MVDA). MVDA may be particularly appropriate for exploring and handling large sets of heterogenous data, mapping data of high dimensionality onto lower dimensional representations, exposing significant correlations among multivariate variables within a single data set or significant correlations among multivariate variables across data sets. MVDA may extract statistically significant information which may enhance process understanding, decision making in process development, process monitoring and control (including product release), product life-cycle management, and continuous improvement.
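To illustrate what mapping high-dimensional data onto a lower-dimensional representation can look like, the following minimal Python sketch extracts the first principal component of a small data set by power iteration. The data values, function names, and the choice of power iteration are assumptions made for illustration; the guide does not prescribe any particular algorithm or software.

```python
import math

# Hypothetical data: 5 observations of 3 strongly correlated process variables.
rows = [
    [1.0, 2.0, 3.1],
    [2.0, 4.1, 6.0],
    [3.0, 6.0, 9.2],
    [4.0, 8.1, 12.0],
    [5.0, 9.9, 15.1],
]

def mean_center(data):
    """Subtract each column mean so the analysis describes variation only."""
    n, p = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in data]

def covariance(centered):
    """Sample covariance matrix (divisor n - 1) of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def mat_vec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def first_component(cov, iterations=200):
    """Power iteration: dominant eigenvector (loading) and its eigenvalue.

    Assumes a non-zero covariance matrix and a start vector that is not
    orthogonal to the dominant eigenvector, which holds for this demo data.
    """
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, eigenvalue

centered = mean_center(rows)
cov = covariance(centered)
loading, eigenvalue = first_component(cov)

# One score per observation: the 3-variable data summarized on a single axis.
scores = [sum(a * b for a, b in zip(r, loading)) for r in centered]
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(f"fraction of variance on the first component: {explained:.3f}")
```

Because the three hypothetical variables move together, a single score per observation retains almost all of the variation. Real process data would normally also need scaling and more than one component.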
4.2 MVDA is widely used in various industries including the pharmaceutical industry. To achieve a valid outcome, an MVDA model/application should incorporate the following:
4.2.1 A predefined risk-based objective incorporating one or more relevant scientific hypotheses specific to the application;
4.2.2 Sufficient relevant data of requisite quality covering the variance space encountered during intended use, that is, pharmaceutical development, or pharmaceutical manufacturing, or both;
4.2.3 Appropriate data analysis and model utilization practices including considerations on testing, validation, and qualification of all new data prior to using a model to analyze it;
4.2.4 Appropriately trained staff;
4.2.5 Appropriate standard operating procedures; and
4.2.6 Life-cycle management.
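Items 4.2.2 and 4.2.3 imply that new data should be checked against the variance space covered during model building before the model is applied. The sketch below shows one very simple first screen under stated assumptions: the variable names and calibration values are hypothetical, and a per-variable range check is only a stand-in for the fuller multivariate diagnostics a real application would use.

```python
# Hypothetical calibration data: rows are batches, columns follow variable_names.
variable_names = ["temperature_C", "pressure_bar", "moisture_pct"]
calibration = [
    [24.0, 1.0, 2.8],
    [25.5, 1.3, 3.1],
    [26.0, 1.1, 3.4],
    [24.8, 1.4, 2.9],
]

def calibration_ranges(data):
    """Per-variable (min, max) spanned by the calibration data."""
    return [(min(col), max(col)) for col in zip(*data)]

def out_of_range(observation, ranges, names):
    """Names of variables whose new value lies outside the calibrated range."""
    return [name
            for value, (low, high), name in zip(observation, ranges, names)
            if not low <= value <= high]

ranges = calibration_ranges(calibration)
print(out_of_range([25.0, 1.2, 3.0], ranges, variable_names))  # inside every range
print(out_of_range([25.0, 2.5, 3.0], ranges, variable_names))  # pressure is outside
```

A univariate check like this cannot detect an unusual combination of values that are each within range, which is one reason multivariate diagnostics are used alongside it.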
4.3 This guide can be used to support data analysis activities associated with pharmaceutical development and manufacturing, process performance and product quality monitoring in manufacturing, as well as for troubleshooting and investigation events. Technical detai...
SCOPE
1.1 This guide covers the applications of multivariate data analysis (MVDA) to support pharmaceutical development and manufacturing activities. MVDA is one of the key enablers for process understanding and decision making in pharmaceutical development, and for the release of intermediate and final products after being validated appropriately using a science and risk-based approach.
1.2 The scope of this guide is to provide general guidelines on the application of MVDA in the pharmaceutical industry. While MVDA refers to typical empirical data analysis, the scope is limited to providing a high level guidance and not intended to provide application-specific data analysis procedures. This guide provides considerations on the following aspects:
1.2.1 Use of a risk-based approach (understanding the objective requirements and assessing the fit-for-use status);
1.2.2 Considerations on the data collection and diagnostics used for MVDA (including data preprocessing and outliers);
1.2.3 Considerations on the different types of data analysis, model testing, and validation;
1.2.4 Qualified and competent personnel; and
1.2.5 Life-cycle management of MVDA model.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
General Information
- Status: Published
- Publication Date: 30-Jun-2020
- Technical Committee: E55 - Manufacture of Pharmaceutical and Biopharmaceutical Products
- Drafting Committee: E55.14 - Measurement Systems and Analysis
Relations
- Effective Date: 15-Feb-2020
- Refers: ASTM E2617-17 - Standard Practice for Validation of Empirically Derived Multivariate Calibrations - Effective Date: 15-Dec-2017
- Effective Date: 01-Jul-2017
- Effective Date: 01-Jun-2016
- Effective Date: 01-Apr-2014
- Effective Date: 01-Apr-2013
- Refers: ASTM E1355-12 - Standard Guide for Evaluating the Predictive Capability of Deterministic Fire Models - Effective Date: 01-Apr-2012
- Refers: ASTM E1355-11 - Standard Guide for Evaluating the Predictive Capability of Deterministic Fire Models - Effective Date: 01-Jan-2011
- Effective Date: 01-Mar-2010
- Refers: ASTM E2617-10 - Standard Practice for Validation of Empirically Derived Multivariate Calibrations - Effective Date: 01-Mar-2010
- Effective Date: 15-May-2009
- Refers: ASTM E2617-09a - Standard Practice for Validation of Empirically Derived Multivariate Calibrations - Effective Date: 01-Apr-2009
- Refers: ASTM E2617-09 - Standard Practice for Validation of Empirically Derived Multivariate Calibrations - Effective Date: 01-Mar-2009
- Refers: ASTM E2617-08ae1 - Standard Practice for Validation of Empirically Derived Multivariate Calibrations - Effective Date: 01-Oct-2008
- Refers: ASTM E2617-08a - Standard Practice for Validation of Empirically Derived Multivariate Calibrations - Effective Date: 01-Oct-2008
Overview
ASTM E2891-20: Standard Guide for Multivariate Data Analysis in Pharmaceutical Development and Manufacturing Applications provides high-level guidance for implementing multivariate data analysis (MVDA) in the pharmaceutical industry. Developed by ASTM International, this standard is designed to complement traditional univariate data approaches by enabling more comprehensive analysis of the complex, heterogeneous datasets generated during pharmaceutical process development and manufacturing. By applying MVDA, organizations can improve process understanding, enhance data interpretation, support decision making, and optimize process monitoring, control, and product quality assurance throughout the product life cycle.
Key Topics
Risk-Based Approach: The standard emphasizes the use of a risk-based methodology, ensuring that objectives, data collection, analysis, and model validation are fit for the intended scientific or regulatory purposes. The level of validation depends on the model’s impact and is higher for critical process control or product release activities.
Data Collection and Diagnostics: ASTM E2891-20 highlights the importance of gathering relevant, high-quality data representing all variables affecting the MVDA objective. Proper data preprocessing and systematic screening for outliers or irrelevant information are crucial for effective modeling.
Model and Method Development: A clear distinction is made between MVDA models (mathematical representations of multivariate relationships) and MVDA methods (broader analytical or process control strategies utilizing one or more models). Validation of both is essential, with model robustness, predictive ability, and reliability being central evaluation criteria.
Life-Cycle Management: Ongoing review, documentation, and adjustment of MVDA models are necessary to maintain performance when data patterns, process conditions, or materials change. Standard operating procedures (SOPs) and change control mechanisms are recommended for continuous improvement and compliance.
Qualified Personnel: Effective application of MVDA requires qualified subject matter experts (SMEs) skilled in statistics, chemometrics, and process engineering, a competency that goes beyond routine software-driven analyses.
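The distinction between an MVDA model and an MVDA method can be sketched in a few lines of Python. The class name `MVDAMethod`, the coefficients, and the limits below are all hypothetical; the sketch only shows a method wrapping a model together with input screening and a transformation of the model output, and is not an implementation taken from the standard.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class MVDAMethod:
    """Hypothetical wrapper: input screening + model + output transformation."""
    model: Callable[[Sequence[float]], float]          # the MVDA model itself
    input_is_valid: Callable[[Sequence[float]], bool]  # screening of new input data
    interpret: Callable[[float], str]                  # model output -> reported result

    def run(self, observation: Sequence[float]) -> str:
        # A valid model only gives a valid result if the input is valid too.
        if not self.input_is_valid(observation):
            return "input rejected"
        return self.interpret(self.model(observation))

def linear_model(x: Sequence[float]) -> float:
    # Stand-in for a fitted multivariate model; the coefficients are invented.
    return 1.0 + 0.5 * x[0] + 0.2 * x[1]

method = MVDAMethod(
    model=linear_model,
    input_is_valid=lambda x: all(0.0 <= v <= 100.0 for v in x),
    interpret=lambda y: "within target" if 6.0 <= y <= 8.0 else "outside target",
)

print(method.run([10.0, 5.0]))    # prediction 7.0
print(method.run([20.0, 5.0]))    # prediction 12.0
print(method.run([10.0, 500.0]))  # second input fails screening
```

Keeping the model as one replaceable component makes it easier to validate the model and the surrounding method as separate activities.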
Applications
ASTM E2891-20 has substantial practical value in a range of pharmaceutical activities, including:
Process Development: By mapping high-dimensional data into interpretable formats, MVDA supports deeper understanding of process variables and their interactions, leading to more robust process designs.
Manufacturing Monitoring and Control: Real-time and retrospective MVDA enable early detection of deviations, facilitate root-cause investigations, and support multivariate statistical process control.
Quality Assurance: MVDA models can be used for product release decisions, ensuring product quality consistency and regulatory compliance, especially when validated with rigorous, risk-based standards.
Troubleshooting and Continuous Improvement: The standard underpins systematic approaches for investigating process disturbances and implementing effective corrective actions based on comprehensive data insights.
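As a rough illustration of multivariate statistical process control, the Python sketch below computes Hotelling's T² for two correlated variables and flags a new observation whose individual values look normal but whose combination does not. The reference data are made up, and using the largest reference T² as the limit is an assumption for illustration only; in practice a limit is usually derived statistically and justified for the application.

```python
# Hypothetical reference batches: two correlated process variables (x, y).
reference = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 11.9)]

n = len(reference)
mean_x = sum(x for x, _ in reference) / n
mean_y = sum(y for _, y in reference) / n

# Sample covariance matrix S = [[sxx, sxy], [sxy, syy]] with divisor n - 1.
sxx = sum((x - mean_x) ** 2 for x, _ in reference) / (n - 1)
syy = sum((y - mean_y) ** 2 for _, y in reference) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in reference) / (n - 1)
det = sxx * syy - sxy ** 2

def hotelling_t2(point):
    """T^2 = d' S^-1 d, using the closed-form inverse of the 2x2 matrix S."""
    dx, dy = point[0] - mean_x, point[1] - mean_y
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

# Illustrative limit only: the largest T^2 observed in the reference set.
limit = max(hotelling_t2(p) for p in reference)

consistent = (3.0, 6.0)  # follows the reference correlation
unusual = (5.0, 3.0)     # each value is inside its univariate range; the pair is not

for label, point in (("consistent", consistent), ("unusual", unusual)):
    t2 = hotelling_t2(point)
    print(label, round(t2, 2), "flagged" if t2 > limit else "ok")
```

The second observation would pass a check on each variable separately, which is the kind of deviation that univariate monitoring alone can miss.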
Related Standards
ASTM E2891-20 references several related ASTM and international standards that support or complement MVDA practices in the pharmaceutical sector, including:
- ASTM E1655: Practices for Infrared Multivariate Quantitative Analysis
- ASTM E1790: Practice for Near Infrared Qualitative Analysis
- ASTM E2617: Practice for Validation of Empirically Derived Multivariate Calibrations
- ASTM E178: Practice for Dealing With Outlying Observations
- ASTM E2476: Guide for Risk Assessment and Risk Control in Process Analytical Technology (PAT)
- ASTM E2363: Terminology Relating to PAT in the Pharmaceutical Industry
- ICH Q2(R1): Validation of Analytical Procedures: Text and Methodology
By following ASTM E2891-20, pharmaceutical organizations can implement scientifically sound, risk-based MVDA to enhance process understanding, ensure consistent quality, and drive continuous improvement in development and manufacturing environments. This standard supports organizations in meeting regulatory expectations for data integrity, model validation, and systematic process monitoring.
Frequently Asked Questions
ASTM E2891-20 is a guide published by ASTM International. Its full title is "Standard Guide for Multivariate Data Analysis in Pharmaceutical Development and Manufacturing Applications".
ASTM E2891-20 is classified under the following ICS (International Classification for Standards) categories: 11.120.01 - Pharmaceutics in general; 35.240.80 - IT applications in health care technology. The ICS classification helps identify the subject area and facilitates finding related standards.
ASTM E2891-20 is linked to the following standards: ASTM C1174-20, ASTM E2617-17, ASTM C1174-17, ASTM E178-16, ASTM E2474-14, ASTM C1174-07(2013), ASTM E1355-12, ASTM E1355-11, ASTM E1790-04(2010), ASTM E2617-10, ASTM E2476-09, ASTM E2617-09a, ASTM E2617-09, ASTM E2617-08ae1, ASTM E2617-08a. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the
Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Designation: E2891 − 20
Standard Guide for Multivariate Data Analysis in Pharmaceutical Development and Manufacturing Applications
This standard is issued under the fixed designation E2891; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This guide covers the applications of multivariate data analysis (MVDA) to support pharmaceutical development and manufacturing activities. MVDA is one of the key enablers for process understanding and decision making in pharmaceutical development, and for the release of intermediate and final products after being validated appropriately using a science and risk-based approach.
1.2 The scope of this guide is to provide general guidelines on the application of MVDA in the pharmaceutical industry. While MVDA refers to typical empirical data analysis, the scope is limited to providing a high level guidance and not intended to provide application-specific data analysis procedures. This guide provides considerations on the following aspects:
1.2.1 Use of a risk-based approach (understanding the objective requirements and assessing the fit-for-use status);
1.2.2 Considerations on the data collection and diagnostics used for MVDA (including data preprocessing and outliers);
1.2.3 Considerations on the different types of data analysis, model testing, and validation;
1.2.4 Qualified and competent personnel; and
1.2.5 Life-cycle management of MVDA model.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:
C1174 Guide for Evaluation of Long-Term Behavior of Materials Used in Engineered Barrier Systems (EBS) for Geological Disposal of High-Level Radioactive Waste
E178 Practice for Dealing With Outlying Observations
E1355 Guide for Evaluating the Predictive Capability of Deterministic Fire Models
E1655 Practices for Infrared Multivariate Quantitative Analysis
E1790 Practice for Near Infrared Qualitative Analysis
E2363 Terminology Relating to Manufacturing of Pharmaceutical and Biopharmaceutical Products in the Pharmaceutical and Biopharmaceutical Industry
E2474 Practice for Pharmaceutical Process Design Utilizing Process Analytical Technology (Withdrawn 2020)
E2476 Guide for Risk Assessment and Risk Control as it Impacts the Design, Development, and Operation of PAT Processes for Pharmaceutical Manufacture
E2617 Practice for Validation of Empirically Derived Multivariate Calibrations
2.2 ICH Publications:
ICH Q2(R1) Validation of Analytical Procedures: Text and Methodology
ICH-Endorsed Guide for ICH Q8/Q9/Q10 Implementation ICH Quality Implementation Working Group Points to Consider (R2)
3. Terminology
3.1 Definitions: Common term definitions can be found in Terminology E2363 for pharmaceutical applications and some terms can be found in other standards and are cited when they are mentioned.
This guide is under the jurisdiction of ASTM Committee E55 on Manufacture of Pharmaceutical and Biopharmaceutical Products and is the direct responsibility of Subcommittee E55.14 on Measurement Systems and Analysis. Current edition approved July 1, 2020. Published July 2020. Originally approved in 2013. Last previous edition approved in 2013 as E2891 – 13. DOI: 10.1520/E2891-20.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
The last approved version of this historical standard is referenced on www.astm.org.
Available from International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH), ICH Secretariat, Route de Pré-Bois, 20, P.O Box 1894, 1215 Geneva, Switzerland, https://www.ich.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
E2891 − 20
4. Significance and Use 5.3 In assessment of fitness for use of data analysis, several
aspects should be considered:
4.1 A significant amount of data is generated during phar-
5.3.1 Criteria for Acceptable Data Analysis—Criteria for
maceutical development and manufacturing activities. The
the data analysis are defined by user requirements and project
interpretation of such data is becoming increasingly difficult.
objectives.
Individual examination of the univariate process variables is
5.3.2 Data Source—Relevant data should be collected and
relevant but can be significantly complemented by multivariate
used in MVDA.
data analysis (MVDA). MVDA may be particularly appropri-
5.3.3 Data Integrity—Confirmation of accuracy,
ate for exploring and handling large sets of heterogenous data,
consistency, and traceability of the data from the source to the
mapping data of high dimensionality onto lower dimensional
analysis.
representations, exposing significant correlations among mul-
5.3.4 Data Analysis Practice (Technique and
tivariate variables within a single data set or significant
Procedure)—In data analysis practice, numerous options are
correlations among multivariate variables across data sets.
available and different options may generate similar results, all
MVDA may extract statistically significant information which
of which may be deemed fit for use. The data analysis process
may enhance process understanding, decision making in pro-
is an iterative approach; in case of an unsatisfactory result, a
cess development, process monitoring and control (including
different data analysis technique may be used or it may be
product release), product life-cycle management, and continu-
necessary to obtain additional data or data of higher quality, or
ous improvement.
both, until a valid model can be established which is deemed fit
4.2 MVDA is widely used in various industries including
for use.
the pharmaceutical industry. To achieve a valid outcome, an
MVDA model/application should incorporate the following:
6. Concepts of MVDA Model and MVDA Method
4.2.1 A predefined risk-based objective incorporating one or
6.1 When implementing MVDA it is important to under-
more relevant scientific hypotheses specific to the application;
stand the differentiation between a multivariate model and a
4.2.2 Sufficient relevant data of requisite quality covering
multivariate method. This is especially true as an MVDA
the variance space encountered during intended use, that is,
application reaches the validation stage.
pharmaceutical development, or pharmaceutical
6.2 MVDA Model:
manufacturing, or both;
6.2.1 As defined in Guide C1174, a model is a simplified
4.2.3 Appropriate data analysis and model utilization prac-
representation of a system or phenomenon with multiple
tices including considerations on testing, validation, and quali-
variables based on a set of hypotheses (assumptions, data,
fication of all new data prior to using a model to analyze it;
simplifications, or idealizations, or combinations thereof) that
4.2.4 Appropriately trained staff;
describe the system or explain the phenomenon, often ex-
4.2.5 Appropriate standard operating procedures; and
pressed mathematically. In the context of this guidance the
4.2.6 Life-cycle management.
term MVDA model is to be taken in a broad sense covering,
4.3 This guide can be used to support data analysis activities
multivariate exploratory, regression as well as dimension
associated with pharmaceutical development and
reduction techniques — such as, but not limited to latent
manufacturing, process performance and product quality moni-
variable-based, principal component analysis (PCA), principal
toring in manufacturing, as well as for troubleshooting and
component regression (PCR) and partial least-squares (PLS)
investigation events. Technical details in data analysis can be
regression. These models often relate observational data to a
found in the scientific literature and standard practices in data
known property or set of properties from a process, or a
analysis are already available (such as Practices E1655 and
summarized measure of the process state that can be used for
E1790 for spectroscopic applications, Practice E2617 for
statistical process control (SPC) approach, as described in 8.3.
model validation, and Practice E2474 for utilizing process
The mathematical relationship is established for a sufficient
analytical technology).
number of cases — preferably derived from experimental
designs. The model can then be applied to a similar set of
5. Risk-Based Approach for MVDA
observational data in order to estimate the targeted property/
5.1 A risk-based approach requires consideration of two properties.
aspects: the risk associated with the use of MVDA for a 6.2.2 MVDA is not limited to such multivariate calibrations
specific objective and the justifications and rationales during and predictions, and similar considerations as the ones de-
the data analysis to ensure the model is fit for use. Aspects of scribed in this guidance are applicable to direct and indirect
general risk assessment and control are described in Guide calibration, as well as PCA-based approaches used for example
E2476 and more specific model considerations are discussed in for exploratory data analysis.
ICH-Endorsed Guide for ICH Q8/Q9/Q10 Implementation.
6.3 Analytical and Process Control Methods, Including One
5.2 The risk level is considered high when the data analysis or More MVDA Elements:
is an integral part of the control strategy, is used directly for the 6.3.1 The MVDA method uses the output from the MVDA
product or intermediate product release or is used to directly model to define the targeted and predefined process character-
control the process. The risk is considered low when the output istic of interest. The MVDA model is one component of the
of the data analysis does not have significant impact on the broader concept that is an MVDA method. Such method should
assessment of the product quality. typically be characterized by the collection of data, the input
E2891 − 20
data to the calculation, the data analysis, and some potential transformation from the MVDA model output to generate the pre-defined MVDA method characteristic of interest. (See Fig. 1.)
6.3.2 Note that an MVDA method can incorporate multiple MVDA models (for example, across multiple unit operations, from multiple pieces of equipment, etc.) that can be running in parallel or feeding sequentially into one another to provide the pre-defined MVDA method output. MVDA methods can also incorporate mechanistic and univariate models. The validation of the MVDA model and the MVDA method are two different activities. Section 7 of this guideline provides an overview of the MVDA model validation. The validation of an MVDA method should follow the same overarching principles as for any method validation, such as the ones described in ICH Q2(R1).
6.3.3 Method development comprises the creation of a model, its testing, and validation.
6.3.4 Method validation consists in the validation of not only the model but also all the aspects of data acquisition, analysis and reporting outlined in Fig. 1. The level of method validation will depend on the intended model impact as defined in ICH-Endorsed Guide for ICH Q8/Q9/Q10 Implementation. A low impact model will require a fit for purpose model calibration, testing and validation but a lower consideration for method validation. A medium or high impact model will require a higher consideration for method validation.
FIG. 1 Relationship Between an MVDA Method and an MVDA Model
6.4 Two-Phase Nature of MVDA:
6.4.1 Data analysis usually, but not always, has two phases. The first phase is the creation of a model from acquired data with a corresponding known property, and the second phase is the application of the model to newly acquired independent data to estimate a value of the property. The first-phase analysis is usually called a multivariate calibration for a regression process or training for a learning process. The emphasis is usually on the model building phase in practice: how to design cases properly, how to process the data to build a model, and how to test the model to see whether the model is fit for use. The model prediction phase, however, should be emphasized equally. A valid model will generate a valid result only if the input is valid too. It is important to screen the input data and monitor the prediction diagnostics when using the model for prediction. For latent variable-based models, such diagnostics are often referred to as residual and score space diagnostics or inner/outer model diagnostics (see Section 7). In addition, a strategy for life-cycle management of the MVDA method is required (see Section 11).
6.4.2 In process monitoring analysis, the first phase is to establish data analysis parameters, trending limits, or a criterion for the end point of trajectory monitoring. A model may be created in the first-phase trending analysis. The second phase is estimating the new values based on the established parameter set (including a possible model) and assessing the trajectory based on the established criteria.
7. Data Collection and Diagnostics
7.1 Relevant data properly representing all factors impacting the MVDA objective should be used for data analysis. Data gathered from various sources should be screened for errors, appropriate data preprocessing should be used, and data should be screened for outliers and for irrelevant or accidental correlations which may confound attempts to find exploitable correlations. All processing of data, exclusion of outliers, selection of samples or variables, or both, and other analysis parameters need to be justified and documented.
7.2 Data Source:
7.2.1 Data can be continuous, discrete, or categorical and from multiple sources. The most common sources are input/raw material properties, process parameters, in situ/PAT data and intermediate/finished product properties. Data should be gathered with acceptable quality (free of any obvious human or machine errors but properly representing a typical noise level
likely to be present in such data), with appropriate significant figures. Outlier detection is strongly recommended (see 7.4).
7.2.2 Depending on the MVDA-defined objective, the data can come from designed experiments (DOE) specifically for developing the MVDA model, or from other development runs and routine manufacturing, or both. Data originating from a DOE on input/process parameters has inherent variation (special cause variability), while data obtained from routine operations may reflect smaller variation within the acceptable operational ranges, tighter than ranges studied during process development (common cause variability). The data collected from a routine process may be used for trending, process monitoring, identification of atypical behavior but rarely for predictive regression-based multivariate analysis. A predictive empirical model built from the data that has small variation will typically have a very small range limited by the combination of specification, constrained incoming material variation and routine process parameter variation (operating ranges). Model diagnostics should be used to ensure the model is providing reliable outputs. Often, intentionally induced variations, preferably following a DOE, are created so that the data with a larger variation range and non-confounded process conditions is used as part of the training set to build the model.
7.2.3 The manufacturing scale at which data collection is performed should be carefully considered. While collecting full-scale manufacturing data at target conditions is necessary, it is often prohibitive (time and cost) to perform DOEs at full-scale. Small scale DOEs are often preferred. However, if the MVDA model is to be used as part of the control strategy or the final release of the commercial manufacturing scale, or both, the commercial manufacturing scale should be considered during the MVDA data collection phase. When appropri-
...
should be considered carefully, particularly as the order chosen for the individual preprocessing steps is likely to have significant impact on the data analysis outcome. It may take several iterative cycles to optimize the preprocessing steps to ensure the necessary, yet sufficient level of preprocessing is applied to the data set to enable the MVDA model objectives to be achieved.
7.3.2 Even though preprocessing can reduce or eliminate some unwanted variations in data, this must not be aimed to transform data that are not fit for purpose (for example, measurement errors) into usable data. If the data is unusable, a new data collection step should be considered to improve the quality of the data.
7.4 Outliers:
7.4.1 An outlier means an outlying observation that appears to deviate markedly in value from other members of the sample in which it appears (Practice E178). Outliers typically originate from either a measurement error (clerical, sampling, sensor) or a process error (process deviation), or from extreme samples outside the model’s space.
7.4.2 Outliers in Model Building Phase:
7.4.2.1 The purpose in identifying outliers in the model building phase or data exploration phase is to ensure that the model is not distorted by the inclusion of a few non-representative data points. Justification and documentation of assignable cause to suspected outliers is recommended prior to the removal of any point in the dataset. There are a variety of statistical tools and visualization techniques (such as Hotelling’s T² plots, histograms, distance to model plots, control charts, cluster analysis) available to the MVDA practitioners, but one must recognize that the knowledge of the process and the measurement is fundamental t
...
This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because
it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version
of the standard as published by ASTM is to be considered the official document.
Designation: E2891 − 20
Standard Guide for
Multivariate Data Analysis in Pharmaceutical Development
and Manufacturing Applications
This standard is issued under the fixed designation E2891; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This guide covers the applications of multivariate data analysis (MVDA) to support pharmaceutical development and
manufacturing activities. MVDA is one of the key enablers for process understanding and decision making in pharmaceutical
development, and for the release of intermediate and final products after being validated appropriately using a science
and risk-based approach.
1.2 The scope of this guide is to provide general guidelines on the application of MVDA in the pharmaceutical industry. While
MVDA refers to typical empirical data analysis, the scope is limited to providing a high level guidance and not intended to provide
application-specific data analysis procedures. This guide provides considerations on the following aspects:
1.2.1 Use of a risk-based approach (understanding the objective requirements and assessing the fit-for-use status);
1.2.2 Considerations on the data collection and diagnostics used for MVDA (including data preprocessing and outliers);
1.2.3 Considerations on the different types of data analysis, model testing, and validation;
1.2.4 Qualified and competent personnel; and
1.2.5 Life-cycle management of MVDA model.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility
of the user of this standard to establish appropriate safety, health, and environmental practices and determine the
applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization
established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued
by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:
C1174 Guide for Evaluation of Long-Term Behavior of Materials Used in Engineered Barrier Systems (EBS) for Geological
Disposal of High-Level Radioactive Waste
E178 Practice for Dealing With Outlying Observations
E1355 Guide for Evaluating the Predictive Capability of Deterministic Fire Models
E1655 Practices for Infrared Multivariate Quantitative Analysis
E1790 Practice for Near Infrared Qualitative Analysis
E2363 Terminology Relating to Process Analytical Technology in the Pharmaceutical Industry
E2474 Practice for Pharmaceutical Process Design Utilizing Process Analytical Technology (Withdrawn 2020)
E2476 Guide for Risk Assessment and Risk Control as it Impacts the Design, Development, and Operation of PAT Processes
for Pharmaceutical Manufacture
E2617 Practice for Validation of Empirically Derived Multivariate Calibrations
This guide is under the jurisdiction of ASTM Committee E55 on Manufacture of Pharmaceutical and Biopharmaceutical Products and is the direct responsibility of
Subcommittee E55.01 on Process Understanding and PAT System Management, Implementation and Practice.
Current edition approved July 1, 2020. Published July 2020. Originally approved in 2013. Last previous edition approved in 2013 as E2891
– 13. DOI: 10.1520/E2891-20.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards
volume information, refer to the standard’s Document Summary page on the ASTM website.
The last approved version of this historical standard is referenced on www.astm.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
2.2 ICH Publications:
ICH Q2(R1) Validation of Analytical Procedures: Text and Methodology
ICH-Endorsed Guide for ICH Q8/Q9/Q10 Implementation ICH Quality Implementation Working Group Points to Consider (R2)
3. Terminology
3.1 Definitions—Common term definitions can be found in Terminology E2363 for pharmaceutical applications and some terms
can be found in other standards and are cited when they are mentioned.
4. Significance and Use
4.1 A significant amount of data is generated during pharmaceutical development and manufacturing activities. The
interpretation of such data is becoming increasingly difficult. Individual examination of the univariate process variables is relevant
but can be significantly complemented by multivariate data analysis (MVDA). MVDA may be particularly
appropriate for exploring and handling large sets of heterogenous data, mapping data of high dimensionality onto lower
dimensional representations, exposing significant correlations among multivariate variables within a single data set or significant
correlations among multivariate variables across data sets. MVDA may extract statistically significant information which may
enhance process understanding, decision making in process development, process monitoring and control (including product
release), product life-cycle management, and continuous improvement.
4.2 MVDA is widely used in various industries including the pharmaceutical industry. To achieve a valid
outcome, an MVDA model/application should incorporate the following:
4.2.1 A predefined risk-based objective incorporating one or more relevant
scientific hypotheses specific to the application;
4.2.2 Sufficient relevant data of requisite quality covering the variance space encountered during intended use,
that is, pharmaceutical development, or pharmaceutical manufacturing, or both;
4.2.3 Appropriate data analysis and model utilization practices including considerations on testing,
validation, and qualification of all new data prior to using a model to analyze it;
4.2.4 Appropriately trained staff;
4.2.5 Appropriate standard operating procedures; and
4.2.6 Life-cycle management.
4.3 This guide can be used to support data analysis activities associated with pharmaceutical development and manufacturing,
process performance and product quality monitoring in manufacturing, as well as for troubleshooting and investigation events.
Technical details in data analysis can be found in the scientific literature and standard practices in data analysis are already
available (such as Practices E1655 and E1790 for spectroscopic applications, Practice E2617 for model validation, and Practice
E2474 for utilizing process analytical technology).
5. Risk-Based Approach for MVDA
5.1 A risk-based approach requires consideration of two aspects: the risk associated with the use of MVDA for a specific
objective and the justifications and rationales during the data analysis to ensure the model is fit for use. Aspects of general risk
assessment and control are described in Guide E2476 and more specific model considerations are discussed in ICH-Endorsed
Guide for ICH Q8/Q9/Q10 Implementation.
5.2 The risk level is considered high when the data analysis is an integral part of the control strategy, is used directly for the
product or intermediate product release or is used to directly control the process. The risk is considered low when the output of
the data analysis does not have significant impact on the assessment of the product quality.
5.3 In assessment of fitness for use of data analysis, several aspects should be considered:
5.3.1 Criteria for Acceptable Data Analysis—Criteria for the data analysis are defined by user requirements and project
objectives.
5.3.2 Data Source—Relevant data should be collected and used in MVDA.
5.3.3 Data Integrity—Confirmation of accuracy, consistency, and traceability of the data from the source to the analysis.
5.3.4 Data Analysis Practice (Technique and Procedure)—In data analysis practice, numerous options are available and
different options may generate similar results, all of which may be deemed fit for use. The data analysis process is an iterative
approach; in case of an unsatisfactory result, a different data analysis technique may be used or it may be necessary to obtain
additional data or data of higher quality, or both, until a valid model can be established which is deemed fit for use.
Available from International Council for Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH), ICH
Secretariat, Route de Pré-Bois, 20, P.O. Box 1894, 1215 Geneva, Switzerland, https://www.ich.org.
6. Concepts of MVDA Model and MVDA Method
6.1 When implementing MVDA it is important to understand the differentiation between a multivariate model and a multivariate
method. This is especially true as an MVDA application reaches the validation stage.
6.2 MVDA Model:
6.2.1 As defined in Guide C1174, a model is a simplified representation of a system or phenomenon with multiple
variables based on a set of hypotheses (assumptions, data, simplifications, or idealizations, or a combination thereof) that describe
the system or explain the phenomenon, often expressed mathematically. In the context of this guidance the term MVDA model is
to be taken in a broad sense covering, for example, multivariate exploratory, regression as well as latent variable-based
techniques such as, but not limited to, principal component analysis (PCA), principal component regression (PCR) and partial
least-squares (PLS) regression. These models often relate observational data to a known property or set of properties from a
process, or a summarized measure of the process state that can be used for a statistical process control (SPC) approach, as described
in 8.3. The mathematical relationship is established for a sufficient number of cases, preferably derived from experimental
designs. The model can then be applied to a similar set of observational data in order to estimate the targeted property/properties.
6.2.2 MVDA is not limited to such multivariate calibrations and predictions, and similar considerations as the ones described
in this guidance are applicable to direct and indirect calibration, as well as PCA-based approaches used for example for exploratory
data analysis.
6.3 Analytical and Process Control Methods, Including One or More MVDA Elements:
6.3.1 The MVDA method uses the output from the MVDA model to define the targeted and predefined process characteristic
of interest. The MVDA model is one component of the broader concept that is an MVDA method. Such method should typically
be characterized by the collection of data, the input data to the calculation, the data analysis, and some potential transformation
from the MVDA model output to generate the pre-defined MVDA method characteristic of interest. (See Fig. 1.)
6.3.2 Note that an MVDA method can incorporate multiple MVDA models (for example, across multiple unit operations, from
multiple pieces of equipment, etc.) that can be running in parallel or feeding sequentially into one another to provide the
pre-defined MVDA method output. MVDA methods can also incorporate mechanistic and univariate models. The validation of the
MVDA model and the MVDA method are two different activities. Section 7 of this guideline provides an overview of the MVDA
model validation. The validation of an MVDA method should follow the same overarching principles as for any method validation,
such as the ones described in ICH Q2(R1).
6.3.3 Method development comprises the creation of a model, its testing, and validation.
6.3.4 Method validation consists in the validation of not only the model but also all the aspects of data acquisition, analysis and
reporting outlined in Fig. 1. The level of method validation will depend on the intended model impact as defined in ICH-Endorsed
Guide for ICH Q8/Q9/Q10 Implementation. A low impact model will require a fit for purpose model calibration, testing and
validation but a lower consideration for method validation. A medium or high impact model will require a higher consideration
for method validation.
FIG. 1 Relationship Between an MVDA Method and an MVDA Model
6.4 Two-Phase Nature of MVDA:
6.4.1 Data analysis usually, but not always, has two phases. The first phase is the creation of a model
from acquired data with a corresponding known property, and the second phase is the application of the model to newly acquired
independent data to estimate a value of the property. The first-phase analysis is usually called a multivariate calibration for
a regression process or training for a learning process. The emphasis is usually on the model building phase in practice: how to
design cases properly, how to process the data to build a model, and how to test the model to see whether the model is fit for use.
The model prediction phase, however, should be emphasized equally. A valid model will
generate a valid result only if the input is valid too. It is important to screen the input data and monitor the prediction
diagnostics when using the model for prediction. For latent variable-based models, such diagnostics are often referred to as
residual and score space diagnostics or inner/outer model diagnostics (see Section 7). In addition, a strategy for
life-cycle management of the MVDA method is required (see Section 11).
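The two phases described in 6.4.1 can be illustrated with a short sketch. The Python code below is an illustration only and is not part of the guide. It assumes a one-component PCA model on autoscaled data, fitted by power iteration, and it uses placeholder limits (a score limit corresponding to three standard deviations and a residual limit of three times the largest calibration residual) that a real application would replace with statistically justified values. All function names and the calibration data are hypothetical.

```python
import math

def _autoscale(x, mean, std):
    """Center and scale one observation with the calibration mean and std."""
    return [(xj - m) / s for xj, m, s in zip(x, mean, std)]

def diagnostics(model, x):
    """Return (T2, Q): score-space and residual diagnostics for one observation."""
    z = _autoscale(x, model["mean"], model["std"])
    w = model["loading"]
    t = sum(zj * wj for zj, wj in zip(z, w))             # score on component 1
    t2 = t * t / model["score_var"]                      # Hotelling's T2, one component
    q = sum((zj - t * wj) ** 2 for zj, wj in zip(z, w))  # squared residual (Q)
    return t2, q

def fit_model(rows, n_iter=200):
    """Phase 1: build a one-component PCA model from calibration data."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    std = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [_autoscale(r, mean, std) for r in rows]
    w = [1.0 / math.sqrt(p)] * p                         # starting guess for the loading
    for _ in range(n_iter):                              # power iteration on Z'Z
        t = [sum(zi[j] * w[j] for j in range(p)) for zi in z]
        w = [sum(t[i] * z[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    t = [sum(zi[j] * w[j] for j in range(p)) for zi in z]
    model = {"mean": mean, "std": std, "loading": w,
             "score_var": sum(v * v for v in t) / (n - 1)}
    # Placeholder limits: assumptions for illustration, not statistically derived.
    model["t2_limit"] = 9.0
    model["q_limit"] = 3.0 * max(diagnostics(model, r)[1] for r in rows)
    return model

def screen(model, x):
    """Phase 2: check a new observation before trusting any model output."""
    t2, q = diagnostics(model, x)
    return {"t2": t2, "q": q,
            "valid_input": t2 <= model["t2_limit"] and q <= model["q_limit"]}

# Hypothetical calibration set: three strongly correlated process variables.
calibration = [[i, 2 * i + (0.1 if i % 2 else -0.1), 5 - i + 0.05 * ((i % 3) - 1)]
               for i in range(10)]
model = fit_model(calibration)
```

In this sketch an observation that breaks the correlation structure of the calibration data is expected to exceed the residual limit, while an observation that follows the structure but lies far from the calibration center is expected to exceed the score limit. Either case indicates that the model output should not be relied on without investigation.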
6.4.2 In process monitoring analysis, the first phase is to establish data analysis parameters, trending
limits, or a criterion for the end point of trajectory monitoring. A model may be created in the first-phase trending analysis.
The second phase is estimating the new values based on the established parameter set (including a possible model) and
assessing the trajectory based on the established criteria.
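As a minimal sketch of the two monitoring phases in 6.4.2, the Python code below first derives trending limits from reference values and then assesses new values against them. It also shows one possible end-point criterion for a trajectory. The choice of limits at three standard deviations around the mean, the moving-window standard deviation criterion, and all function names are assumptions made for illustration; the guide does not prescribe them.

```python
from statistics import mean, stdev

def establish_limits(reference_values, k=3.0):
    """Phase 1: derive trending limits from reference (historical) values."""
    center = mean(reference_values)
    spread = stdev(reference_values)
    return {"center": center,
            "lower": center - k * spread,
            "upper": center + k * spread}

def assess(limits, new_values):
    """Phase 2: return (index, value) pairs for new values outside the limits."""
    return [(i, v) for i, v in enumerate(new_values)
            if not limits["lower"] <= v <= limits["upper"]]

def end_point_index(trajectory, window, threshold):
    """Return the index at which the moving-window standard deviation first
    drops below the threshold, or None if the end point is not reached.
    The window must contain at least two points."""
    for end in range(window, len(trajectory) + 1):
        if stdev(trajectory[end - window:end]) < threshold:
            return end - 1
    return None
```

The monitored value can be a single process variable or a summary statistic from a multivariate model, such as a score or a residual.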
7. Data Collection and Diagnostics
7.1 Relevant data properly representing all factors impacting the MVDA objective should be used for data analysis. Data
gathered from various sources should be screened for errors, appropriate data preprocessing should be used, and data should be
screened for outliers and for irrelevant or accidental correlations which may confound attempts to find exploitable
correlations. All processing of data, exclusion of outliers, selection of samples or variables, or both, and other analysis parameters
need to be justified and documented.
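One way to make the documentation requirement in 7.1 concrete is to record each data-handling decision together with its justification at the moment it is made. The Python sketch below is a hypothetical illustration, not a procedure from the guide. The `AnalysisLog` class, the `exclude_samples` function, and the sample identifiers are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class AnalysisLog:
    """Running record of data-handling decisions (hypothetical structure)."""
    entries: list = field(default_factory=list)

    def record(self, action, target, justification):
        # Refuse to log a decision that has no stated justification.
        if not justification.strip():
            raise ValueError("a justification is required for every decision")
        self.entries.append(
            {"action": action, "target": target, "justification": justification})

def exclude_samples(data, sample_ids, justification, log):
    """Return a copy of `data` without `sample_ids`, logging the reason."""
    unknown = sorted(s for s in sample_ids if s not in data)
    if unknown:
        raise KeyError(f"unknown sample ids: {unknown}")
    log.record("exclude_samples", sorted(sample_ids), justification)
    return {k: v for k, v in data.items() if k not in sample_ids}
```

Because the justification is checked before the data are filtered, an exclusion without a stated reason fails and leaves both the data and the log unchanged.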
7.2 Data Source:
7.2.1 Data can be continuous, discrete, or categorical and from multiple sources. The most common sources are input/raw
material properties, process parameters, in situ/PAT data and intermediate/finished product properties. Data should be gathered with
acceptable quality (free of any obvious human or machine errors but properly representing a typical noise level likely to be present
in such data), with appropriate significant figures. Outlier detection is strongly recommended (see 7.4).
7.2.2 Depending on the MVDA-defined objective, the data can come from designed experiments (DOE) specifically for
developing the MVDA model, or from other development runs and routine manufacturing, or both.
Data originating from a DOE on input/process parameters has inherent variation (special cause variability), while data obtained
from routine operations may reflect smaller variation within the acceptable operational ranges, tighter than ranges studied during
process development (common cause variability). The data collected from a routine process may be used for trending, process
monitoring, identification of atypical behavior but rarely for predictive regression-based multivariate analysis. A predictive
empirical model built from the data that has small variation will typically have a very small range limited by the combination of
specification, constrained incoming material variation and routine process parameter variation (operating ranges). Model
diagnostics should be used to ensure the model is providing reliable outputs. Often, intentionally
induced variations, preferably following a DOE, are created so that the data with a larger variation range and non-confounded
process conditions is used as part of the training set to build the model.
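The point in 7.2.2 that an empirical model is only supported over the range spanned by its training data can be sketched as a simple range check. The Python code below is an assumption-laden illustration: the function names, variable names, and values are hypothetical, and a per-variable check of this kind is only a first screen.

```python
def calibration_ranges(training_rows, names):
    """Record the minimum and maximum of each variable seen in the training set."""
    return {name: (min(r[j] for r in training_rows),
                   max(r[j] for r in training_rows))
            for j, name in enumerate(names)}

def out_of_range(ranges, observation):
    """List the variables of a new observation that fall outside the trained range."""
    return [name for name, (low, high) in ranges.items()
            if not low <= observation[name] <= high]

# Hypothetical training data: two process parameters per run.
ranges = calibration_ranges([[50.0, 1.2], [55.0, 1.5], [52.0, 1.3]],
                            ["temperature", "feed_rate"])
flagged = out_of_range(ranges, {"temperature": 60.0, "feed_rate": 1.4})
```

An observation can sit inside every individual range and still lie outside the multivariate space covered by the training set, which is why the model diagnostics mentioned above remain necessary.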
7.2.3 The manufacturing scale at which data collection is performed should be carefully considered. While collecting full-scale
manufacturing data at target conditions is necessary, it is often prohibitive (time and cost) to perform DOEs at full-scale. Small
scale DOEs are often preferred. However, if the MVDA model is to be used as part of the control strategy or the final release of the
commercial manufacturing scale, or both, the commercial manufacturing scale should be considered during the MVDA data
collection phase. When appropriate and the MVDA model is to be used at full-scale manufacturing, the relevance and equivalence
of the data collected at other scales than full-scale (scale, operation, raw materials, process dynamics, impact on the signal, batch
run time, etc.) should be discussed and documented to support the overall data collection strategy, such that the impact of ma
...







