Statistical methods of uncertainty evaluation — Guidance on evaluation of uncertainty using two-factor crossed designs

ISO/TS 17503:2015 describes the estimation of uncertainties on the mean value in experiments conducted as crossed designs, and the use of variances extracted from such experiments and applied to the results of other measurements (for example, single observations). ISO/TS 17503:2015 covers balanced two-factor designs with any number of levels. The basic designs covered include the two-way design without replication and the two-way design with replication, with one or both factors considered as random. Calculations of variance components from ANOVA tables and their use in uncertainty estimation are given. In addition, brief guidance is given on the use of restricted maximum likelihood estimates from software, and on the treatment of experiments with small numbers of missing data points. Methods for review of the data for outliers and approximate normality are provided. The use of data obtained from the treatment of relative observations (for example, apparent recovery in analytical chemistry) is included.

Méthodes statistiques d'évaluation de l'incertitude — Lignes directrices pour l'évaluation de l'incertitude des modèles à deux facteurs croisés

General Information

Status
Published
Publication Date
05-Nov-2015
Current Stage
9093 - International Standard confirmed
Completion Date
10-May-2023

Technical specification
ISO/TS 17503:2015 - Statistical methods of uncertainty evaluation -- Guidance on evaluation of uncertainty using two-factor crossed designs
English language
19 pages

Standards Content (Sample)

TECHNICAL SPECIFICATION ISO/TS 17503
First edition 2015-11-01
Statistical methods of uncertainty evaluation — Guidance on evaluation of uncertainty using two-factor crossed designs
Méthodes statistiques d’évaluation de l’incertitude — Lignes directrices pour l’évaluation de l’incertitude des modèles à deux facteurs croisés
Reference number ISO/TS 17503:2015(E)
© ISO 2015

COPYRIGHT PROTECTED DOCUMENT
© ISO 2015, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols
5 Conduct of experiments
6 Preliminary review of data — Overview
7 Variance components and uncertainty estimation
7.1 General considerations for variance components and uncertainty estimation
7.2 Two-way layout without replication
7.2.1 Design
7.2.2 Preliminary inspection
7.2.3 Variance component estimation
7.2.4 Standard uncertainty for the mean of all observations
7.2.5 Degrees of freedom for the standard uncertainty
7.3 Two-way balanced experiment with replication (both factors random)
7.3.1 Design
7.3.2 Preliminary inspection
7.3.3 Variance component extraction
7.3.4 Standard uncertainty for the mean of all observations
7.3.5 Degrees of freedom for the standard uncertainty
7.4 Two-way balanced experiment with replication (one factor fixed, one factor random)
7.4.1 Design
7.4.2 Preliminary inspection
7.4.3 Variance component extraction
7.4.4 Standard uncertainty for the mean of all observations
7.4.5 Degrees of freedom for the standard uncertainty
8 Application to observations on a relative scale
9 Use of variance components in subsequent measurements
10 Alternative treatments
10.1 Restricted (or residual) maximum likelihood estimates
10.2 Alternative methods for model reduction
11 Treatment with missing values
Annex A (informative) Examples
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical
Barriers to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/TC 69, Applications of statistical methods,
Subcommittee SC 6, Measurement methods and results.

Introduction
Uncertainty estimation usually requires the estimation and subsequent combination of uncertainties
arising from random variation. Such random variation may arise within a particular experiment under
repeatability conditions, or over a wider range of conditions. Variation under repeatability conditions
is usually characterized as repeatability standard deviation or coefficient of variation; precision under
wider changes in conditions is generally termed intermediate precision or reproducibility.
The most common experimental design for estimating the long- and short-term components of variance
is the classical balanced nested design of the kind used by ISO 5725-2. In this design, a (constant)
number of observations are collected under repeatability conditions for each level of some other factor.
Where this additional factor is ‘Laboratory’, the experiment is a balanced inter-laboratory study, and
can be analysed to yield estimates of within-laboratory variance, $\sigma_r^2$, the between-laboratory
component of variance, $\sigma_L^2$, and hence the reproducibility variance, $\sigma_R^2 = \sigma_L^2 + \sigma_r^2$. Estimation of
uncertainties based on such a study is considered by ISO 21748. Where the additional grouping factor is
another condition of measurement, however, the between-group term can usefully be taken as the
uncertainty contribution arising from random variation in that factor. For example, if several different
extracts are prepared from a homogeneous material and each is measured several times, analysis of
variance can provide an estimate of the effect of variations in the extraction process. Further
elaboration is also possible by adding successive levels of grouping. For example, in an inter-laboratory
study the repeatability variance, between-day variance and between-laboratory variance can be
estimated in a single experiment by requiring each laboratory to undertake an equal number of
replicated measurements on each of two days.
While nested designs are among the most common designs for estimation of random variation, they
are not the only useful class of design. Consider, for example, an experiment intended to characterize
a reference material, conducted by measuring three separate units of the material in three separate
instrument runs, with (say) two observations per unit per run. In this experiment, unit and run are
said to be ‘crossed’; all units are measured in all runs. This design is often used to investigate variation
in ‘fixed’ effects, by testing for changes which are larger than expected from the within-group or
‘residual’ term. This particular experiment, for example, could easily test whether there is evidence
of significant differences between units or between runs. However, the units are likely to have been
selected randomly from a much larger (if ostensibly homogeneous) batch, and the run effects are also
most appropriately treated as random. If the mean of all the observations is taken as the estimate of
the reference material value, it becomes necessary to consider the uncertainties arising from both run-
to-run and unit-to-unit variation. This can be done in much the same way as for the nested designs
described previously, by extracting the variances of interest using two-way analysis of variance. In the
statistical literature, this is generally described as the use of a random-effects or (if one factor is a fixed
effect) mixed-effects model.
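As a sketch of the random-effects model referred to above, written in conventional textbook notation: the symbols $\mu$, $a_i$, $b_j$, $(ab)_{ij}$ and $e_{ijk}$ are assumptions introduced here for illustration and are not taken from this Technical Specification, while $p$, $q$, $n$ and the standard deviations $\sigma_1$, $\sigma_2$, $\sigma_I$ and $\sigma_r$ follow Clause 4.

```latex
x_{ijk} = \mu + a_i + b_j + (ab)_{ij} + e_{ijk},
\qquad i = 1,\dots,p,\quad j = 1,\dots,q,\quad k = 1,\dots,n
% with independent zero-mean random terms:
\operatorname{Var}(a_i) = \sigma_1^2,\quad
\operatorname{Var}(b_j) = \sigma_2^2,\quad
\operatorname{Var}\bigl((ab)_{ij}\bigr) = \sigma_I^2,\quad
\operatorname{Var}(e_{ijk}) = \sigma_r^2
```

In the reference material example, $a_i$ would be the unit effect and $b_j$ the run effect. In a mixed-effects model one of the two main effects is treated as a fixed constant rather than a random variable.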
Variance component extraction can be achieved by several methods. For balanced designs, equating
expected mean squares from classical analysis of variance is straightforward. Restricted (sometimes
also called residual) maximum likelihood estimation (REML) is also widely recommended for estimation
of variance components, and is applicable to both balanced and unbalanced designs. This Technical
Specification describes the classical ANOVA calculations in detail and permits the use of REML.
Note that random effects rarely include all of the uncertainties affecting a particular measurement
result. If using the mean from a crossed design as a measurement result, it is generally necessary
to consider uncertainties arising from possible systematic effects, including between-laboratory
effects, as well as the random variation visible within the experiment, and these other effects can be
considerably larger than the variation visible within a single experiment.
This present Technical Specification describes the estimation and use of uncertainty contributions
using factorial designs.
TECHNICAL SPECIFICATION ISO/TS 17503:2015(E)
Statistical methods of uncertainty evaluation — Guidance on
evaluation of uncertainty using two-factor crossed designs
1 Scope
This Technical Specification describes the estimation of uncertainties on the mean value in experiments
conducted as crossed designs, and the use of variances extracted from such experiments and applied to
the results of other measurements (for example, single observations).
This Technical Specification covers balanced two-factor designs with any number of levels. The
basic designs covered include the two-way design without replication and the two-way design with
replication, with one or both factors considered as random. Calculations of variance components from
ANOVA tables and their use in uncertainty estimation are given. In addition, brief guidance is given on
the use of restricted maximum likelihood estimates from software, and on the treatment of experiments
with small numbers of missing data points.
Methods for review of the data for outliers and approximate normality are provided.
The use of data obtained from the treatment of relative observations (for example, apparent recovery
in analytical chemistry) is included.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO 3534-1, Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms used
in probability
ISO 3534-3, Statistics — Vocabulary and symbols — Part 3: Design of experiments
3 Terms and definitions
For the purposes of this document, the terms and definitions in ISO 3534-1, ISO 3534-3 and the
following apply.
3.1
factor
predictor variable that is varied with the intent of assessing its effect on the response variable
Note 1 to entry: A factor may provide an assignable cause for the outcome of an experiment.
Note 2 to entry: The use of factor here is more specific than its generic use as a synonym for predictor variable.
Note 3 to entry: A factor may be associated with the creation of blocks.
[SOURCE: ISO 3534-3:2013, 3.1.5, modified — cross-references within ISO 3534-3 omitted from
Notes to entry]
3.2
level
potential setting, value or assignment of a factor
Note 1 to entry: A synonym is the value of a predictor variable.
Note 2 to entry: The term “level” is normally associated with a quantitative characteristic. However, it also serves
as the term describing the version or setting of qualitative characteristics.
Note 3 to entry: Responses observed at the various levels of a factor provide information for determining the
effect of the factor within the range of levels of the experiment. Extrapolation beyond the range of these levels is
usually inappropriate without a firm basis for assuming model relationships. Interpolation within the range may
depend on the number of levels and the spacing of these levels. It is usually reasonable to interpolate, although
it is possible to have discontinuous or multi-modal relationships that cause abrupt changes within the range of
the experiment. The levels may be limited to certain selected fixed values (whether these values are or are not
known) or they may represent purely random selection over the range to be studied.
EXAMPLE The ordinal-scale levels of a catalyst may be presence and absence. Four levels of a heat treatment
may be 100 °C, 120 °C, 140 °C and 160 °C. The nominal-scale variable for a laboratory can have levels A, B and C,
corresponding to three facilities.
[SOURCE: ISO 3534-3:2013, 3.1.12]
3.3
fixed effects analysis of variance
analysis of variance in which the levels of each factor are pre-selected over the range of values of the
factors
Note 1 to entry: With fixed levels, it is inappropriate to compute components of variance. This model is sometimes
referred to as a model 1 analysis of variance.
[SOURCE: ISO 3534-3:2013, 3.3.9]
3.4
random effects analysis of variance
analysis of variance in which each level of each factor is assumed to be sampled from the population of
levels of each factor
Note 1 to entry: With random levels, the primary interest is usually to obtain components of variance estimates.
This model is commonly referred to as a model 2 analysis of variance.
EXAMPLE Consider a situation in which an operation processes batches of raw material. “Batch” may be
considered a random factor in an experiment when a few batches are randomly selected from the population
of all batches.
[SOURCE: ISO 3534-3:2013, 3.3.10]
4 Symbols
$\nu_{\mathrm{eff}}$  Calculated effective degrees of freedom for a standard error calculated from a two-way factorial (crossed) experiment
$\sigma_1$  True between-level standard deviation for the first factor (if considered a random effect) in a two-way factorial (crossed) experiment
$\sigma_2$  True between-level standard deviation for the second factor (if considered a random effect) in a two-way factorial (crossed) experiment
$\sigma_I$  True between-group standard deviation for the interaction term in a factorial experiment (where one or more of the factors is considered a random effect)
$\sigma_r$  True standard deviation for the residual term in a classical analysis of variance for a two-way factorial (crossed) experiment
$d_{ij}$  Residual corresponding to level i of one factor and level j of a second factor in a two-way factorial experiment without replication
$M_1$  Mean square for the first factor in a classical analysis of variance for a two-way factorial (crossed) experiment
$M_2$  Mean square for the second factor in a classical analysis of variance for a two-way factorial (crossed) experiment
$M_I$  Mean square for the interaction term in a classical analysis of variance for a two-way factorial (crossed) experiment with replication
$M_r$  Mean square for the residual term in a classical analysis of variance for a two-way factorial (crossed) experiment
$M_{\mathrm{tot}}$  Mean square calculated from the “Total” sum of squares in a classical analysis of variance for a two-way factorial (crossed) experiment
$n$  The number of replicate observations at each combination of factor levels (that is, within each “cell”) in a two-way factorial (crossed) experiment with replication
$p$  The number of levels for the first factor in a two-way factorial (crossed) experiment
$q$  The number of levels for the second factor in a factorial (crossed) experiment
$x_{ij}$  Observation corresponding to level i of one factor and level j of a second factor in a two-way factorial experiment without replication
$x_{ijk}$  $k$th observation corresponding to level i of one factor and level j of a second factor in a two-way factorial experiment with replication
$S_1$  Sum of squares for the first factor in a classical analysis of variance for a two-way factorial (crossed) experiment
$S_2$  Sum of squares for the second factor in a classical analysis of variance for a two-way factorial (crossed) experiment
$S_I$  Sum of squares for the interaction term in a classical analysis of variance for a two-way factorial (crossed) experiment with replication
$S_r$  Sum of squares for the residual term in a classical analysis of variance for a two-way factorial (crossed) experiment
$S_{\mathrm{tot}}$  “Total” sum of squares in a classical analysis of variance for a two-way factorial (crossed) experiment
$s$  Standard deviation of a set of independent observations
$s_1$  Estimated between-level standard deviation for the first factor (if considered a random effect) in a two-way factorial (crossed) experiment
$s_2$  Estimated between-level standard deviation for the second factor (if considered a random effect) in a two-way factorial (crossed) experiment
$s_I$  Estimated between-group standard deviation for the interaction term in a factorial experiment (where one or more of the factors is considered a random effect)
$s_r$  Estimated standard deviation for the residual term in a classical analysis of variance for a two-way factorial (crossed) experiment
$s_{\bar{x}}$  Estimated standard error associated with the mean in a two-way factorial (crossed) experiment
$u$  A standard uncertainty
$u_{\bar{x}}$  Standard uncertainty, associated with random variation, for the mean in a two-way factorial (crossed) experiment
$\bar{x}_{i\bullet}$  The mean of all data for a particular level i of Factor 1 in a factorial design
$\bar{x}_{\bullet j}$  The mean for a particular level j of Factor 2 in a factorial design
$\bar{x}$  The mean for all data in a given experiment
5 Conduct of experiments
As far as possible, observations should be collected in randomized order. Action should also be taken
to remove confounding effects; for example, a design intended to investigate the
effect of changes in test material matrix and different analyte concentrations on recovery in analytical
chemistry should not run each different sample type in a single run on a different day, because the effect of sample type could not then be separated from the effect of day.
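A minimal sketch of how a randomized run order for a crossed design might be generated, assuming that every factor combination can be measured in any order. The function name, the level labels and the use of a seed are hypothetical choices for illustration and do not come from this Technical Specification.

```python
import random
from itertools import product


def randomized_run_order(levels_1, levels_2, n_replicates=1, seed=None):
    """Return every (level_1, level_2, replicate) combination in random order.

    The design is fully crossed: each level of the first factor is paired
    with each level of the second factor, n_replicates times.
    """
    rng = random.Random(seed)  # a seed is used only to make the plan reproducible
    runs = list(product(levels_1, levels_2, range(1, n_replicates + 1)))
    rng.shuffle(runs)
    return runs


# Hypothetical example: 3 test items crossed with 2 instruments, duplicates.
order = randomized_run_order(["item A", "item B", "item C"],
                             ["instrument X", "instrument Y"],
                             n_replicates=2, seed=1)
for position, (item, instrument, replicate) in enumerate(order, start=1):
    print(position, item, instrument, replicate)
```

Where one factor cannot be randomized freely (for example, an instrument run that has to be completed before the next one starts), the shuffle would need to be applied within each level of that factor instead.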
6 Preliminary review of data — Overview
In general, preliminary review should rely on graphical inspection. The general principle is to form
and fit the appropriate linear model (for balanced designs this is adequately done by estimating row,
column and, if necessary, cell means in the two-way layout) and inspect the residuals.
Mandel’s statistics, as presented in ISO 5725-2, are recommended for inspection of individual data points
in two-way designs; the ‘laboratory’ in ISO 5725-2 is replaced by the ‘cell’ in the two-way
design.
Ordinary residual plots and normal probability plots are also applicable to the residuals.
Outlier tests might additionally be suggested, though they would need to be used with care; the number of degrees
of freedom for the residuals is smaller than for the whole data set, which compromises the usual critical values. In
addition, in designs for duplicate measurements, the residuals for a cell with a serious outlier typically
appear as two outliers equidistant from a common mean. Residuals for the ‘main effects’ model as well
as the model including cell means (the interaction term) may usefully be inspected separately to avoid
such an effect.
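A minimal sketch of Mandel-type h and k statistics computed per cell, assuming the usual ISO 5725-2 definitions with each cell of the two-way layout taking the place of a laboratory. The function name and the data are hypothetical, and no critical values are included; these would have to be taken from ISO 5725-2.

```python
import math
import statistics


def mandel_h_k(cells):
    """Compute Mandel-type h and k statistics for a list of cells.

    cells: list of lists, one list of replicate observations per cell
           (at least two replicates per cell are needed for k).
    h compares each cell mean with the mean of all cell means;
    k compares each cell standard deviation with the pooled value.
    """
    means = [statistics.mean(c) for c in cells]
    sds = [statistics.stdev(c) for c in cells]
    grand_mean = statistics.mean(means)
    sd_of_means = statistics.stdev(means)
    pooled_sd = math.sqrt(sum(s * s for s in sds) / len(sds))
    h = [(m - grand_mean) / sd_of_means for m in means]
    k = [s / pooled_sd for s in sds]
    return h, k


# Hypothetical duplicate observations in three cells.
h_values, k_values = mandel_h_k([[10.0, 10.2], [9.8, 10.0], [10.4, 10.6]])
print(h_values)
print(k_values)
```

The sketch fails with a division by zero if all cell means, or all within-cell standard deviations, are identical; a production implementation would need to handle those cases.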
7 Variance components and uncertainty estimation
7.1 General considerations for variance components and uncertainty estimation
Basic calculations are based on the two-way ANOVA tables obtained from classical ANOVA for the two-
way layout. Detailed procedures are shown below. The use of software implementations of restricted
maximum likelihood estimation (“REML”) is permitted when normality is a realistic assumption for all
random effects.
When calculating variance estimates from classical ANOVA tables, negative estimates of variance can
arise. In the following calculations (7.2 to 7.4), it is recommended that these estimates be set to zero. It
is further recommended that terms in the initial, complete, statistical model that are associated with
negative or zero estimates of variance are dropped from the model and the model recalculated when
standard uncertainties and associated effective degrees of freedom are of interest.
NOTE 1 REML calculations do not return negative estimates of variance and it is then unnecessary to reduce
and re-fit models unless effective degrees of freedom are of interest.
NOTE 2 Variance estimates from small data sets are highly variable from one sample to another. For example,
estimated variances taken from independent samples of 10 observations drawn from a normal distribution can
vary by more than a factor of two (that is, either greater or smaller) from the true variance. Variance estimates
from other distributions can vary more.
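The variability described in NOTE 2 can be illustrated with a small simulation. This is a sketch under stated assumptions: standard normal data with true variance 1, samples of 10 observations, and a hypothetical function name, trial count and seed chosen only for illustration.

```python
import random


def fraction_outside_factor_two(n_obs=10, n_trials=5000, seed=0):
    """Estimate how often a sample variance misses the true variance
    by more than a factor of two (in either direction)."""
    rng = random.Random(seed)
    outside = 0
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        mean = sum(sample) / n_obs
        variance = sum((x - mean) ** 2 for x in sample) / (n_obs - 1)
        # The true variance is 1, so the estimate itself is the ratio.
        if variance > 2.0 or variance < 0.5:
            outside += 1
    return outside / n_trials


print(fraction_outside_factor_two())
```

Under these assumptions the theoretical fraction, from the chi-squared distribution with 9 degrees of freedom, is about 0.16, so the simulated value should fall close to that.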
7.2 Two-way layout without replication
7.2.1 Design
The experiment involves variation in two different factors (for example, test item and instrument) with
a single observation per factor combination. Let p be the number of levels for the first factor of interest,
and q the number of levels for the second, so that there are pq observations $x_{ij}$, where the subscripts
denote level i of Factor 1 and level j of Factor 2.
7.2.2 Preliminary inspection
Calculate the mean $\bar{x}_{i\bullet}$ of all data for each level i of Factor 1, the mean $\bar{x}_{\bullet j}$ for each level j of Factor 2, and
the mean $\bar{x}$ for all data. Calculate the residuals $d_{ij}$ from
$d_{ij} = x_{ij} - \bar{x}_{i\bullet} - \bar{x}_{\bullet j} + \bar{x}$ (1)
Plot the residuals in run order and inspect for unexpected trends and outlying observations. Additionally,
prepare a normal probability plot and inspect for serious departures from normality. Check and correct
any aberrant values, by re-measurement if necessary. If outlying observations are found and cannot
reasonably be corrected, inspect other values within the same factor levels. If values within the same
level of one factor all appear discrepant (for example, if results for a particular test material appear
unusually imprecise), discard all data from that factor level before estimating variances. If this affects
more than one factor level, discontinue the analysis and either treat different factor levels separately or
investigate the cause and repeat the experiment.
NOTE A single value can be removed if it is inconsistent with normal performance of the
measurement, that is, if it can be attributed to instrumental or other causes. Refer to ‘Treatment with missing
values’ (Clause 11) for further analysis.
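The residual calculation of Formula (1) can be sketched as follows. The function name and the data are hypothetical; the observations are assumed to be held as a list of p rows (levels of Factor 1), each with q entries (levels of Factor 2).

```python
def two_way_residuals(x):
    """Residuals d[i][j] = x[i][j] - row_mean[i] - col_mean[j] + grand_mean
    for a two-way layout without replication."""
    p, q = len(x), len(x[0])
    row_means = [sum(row) / q for row in x]
    col_means = [sum(x[i][j] for i in range(p)) / p for j in range(q)]
    grand_mean = sum(row_means) / p
    return [[x[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(q)]
            for i in range(p)]


# Hypothetical 3 x 3 data set (rows: Factor 1 levels, columns: Factor 2 levels).
data = [[10.1, 10.4, 9.9],
        [10.3, 10.5, 10.2],
        [9.8, 10.2, 9.7]]
for row in two_way_residuals(data):
    print([round(d, 4) for d in row])
```

By construction the residuals sum to zero within every row and every column, which is a quick check on the arithmetic before plotting them in run order.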
7.2.3 Variance component estimation
Conduct an analysis of variance to obtain the ANOVA table of the form shown in Table 1.
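Table 1 — ANOVA table for two-way design without replication
| Factor | SS | DF | MS | Expected mean square |
|---|---|---|---|---|
| Factor 1 | $S_1$ | $p - 1$ | $M_1 = S_1/(p - 1)$ | $\sigma_r^2 + q\sigma_1^2$ |
| Factor 2 | $S_2$ | $q - 1$ | $M_2 = S_2/(q - 1)$ | $\sigma_r^2 + p\sigma_2^2$ |
| Residual | $S_r$ | $(p - 1)(q - 1)$ | $M_r = S_r/[(p - 1)(q - 1)]$ | $\sigma_r^2$ |
| Total | $S_{\mathrm{tot}} = S_1 + S_2 + S_r$ | $pq - 1$ | $M_{\mathrm{tot}} = S_{\mathrm{tot}}/(pq - 1)$ | |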
From the table, the variance estimates $s_1^2$, $s_2^2$ and $s_r^2$ for Factor 1, Factor 2 and the repeatability
variance, respectively, are given by
$s_1^2 = \frac{M_1 - M_r}{q}$ with p − 1 degrees of freedom
$s_2^2 = \frac{M_2 - M_r}{p}$ with q − 1 degrees of freedom
$s_r^2 = M_r$
Where a variance component is less than zero and is of interest for uncertainty evaluation other than in
the assessment of the uncertainty for the mean value from the experiment, set the estimate equal to zero.
EXAMPLE In a randomized block design used to determine a between-unit variance for a reference material,
the between-unit variance is of interest for uncertainty evaluation even though the mean of the homogeneity
experiment is of no importance.
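The calculations of Table 1 and the variance estimates above can be sketched in a few lines. The function name, the dictionary keys and the data are hypothetical; both the raw estimates and the estimates truncated at zero are returned, so that the choice described in the text is left to the user.

```python
def variance_components_no_replication(x):
    """Classical ANOVA variance components for a p x q layout, one value per cell.

    x[i][j] is the observation at level i of Factor 1 and level j of Factor 2.
    """
    p, q = len(x), len(x[0])
    row_means = [sum(row) / q for row in x]
    col_means = [sum(x[i][j] for i in range(p)) / p for j in range(q)]
    grand = sum(row_means) / p

    # Sums of squares as in the ANOVA table.
    ss1 = q * sum((m - grand) ** 2 for m in row_means)
    ss2 = p * sum((m - grand) ** 2 for m in col_means)
    ssr = sum((x[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(p) for j in range(q))

    # Mean squares.
    m1 = ss1 / (p - 1)
    m2 = ss2 / (q - 1)
    mr = ssr / ((p - 1) * (q - 1))

    # Variance estimates; raw values can be negative.
    s1_sq = (m1 - mr) / q
    s2_sq = (m2 - mr) / p
    return {
        "M1": m1, "M2": m2, "Mr": mr,
        "s1_sq": s1_sq, "s2_sq": s2_sq, "sr_sq": mr,
        "s1_sq_truncated": max(s1_sq, 0.0),
        "s2_sq_truncated": max(s2_sq, 0.0),
    }


# Hypothetical, exactly additive data: row effects 0, 1, 2 and column effects 0, 10.
result = variance_components_no_replication([[0.0, 10.0], [1.0, 11.0], [2.0, 12.0]])
print(result["s1_sq"], result["s2_sq"], result["sr_sq"])
```

With this artificial data set the residual mean square is zero, so the estimates reduce to the sample variances of the row effects and of the column effects.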
7.2.4 Standard uncertainty for the mean of all observations
Where the experiment is intended to yield a mean value $\bar{x}$ over all observations and all variance
estimates are positive, the standard uncertainty arising from repeatability, r, and from variation in the
two experimental factors $F_1$ and $F_2$ is identical to the standard error $s_{\bar{x}}$ calculated from
$s_{\bar{x}} = \sqrt{\frac{s_1^2}{p} + \frac{s_2^2}{q} + \frac{s_r^2}{pq}}$ (2)
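Formula (2) can be sketched as a small function. The function and argument names are hypothetical. Terms whose variance estimate is not positive are set to zero here, which corresponds to the option given in the text when only the standard uncertainty of the mean is of interest; the alternative of refitting a reduced model is not covered by this sketch.

```python
import math


def standard_error_of_mean(s1_sq, s2_sq, sr_sq, p, q):
    """Standard error of the grand mean for a p x q layout without replication.

    s1_sq, s2_sq, sr_sq: variance estimates for Factor 1, Factor 2 and the residual.
    Non-positive estimates contribute zero to the sum.
    """
    terms = [
        max(s1_sq, 0.0) / p,
        max(s2_sq, 0.0) / q,
        max(sr_sq, 0.0) / (p * q),
    ]
    return math.sqrt(sum(terms))


# Hypothetical variance estimates for a 3 x 2 design.
print(standard_error_of_mean(1.0, 50.0, 0.0, p=3, q=2))
```

When all three estimates are positive, substituting the expressions for the variance estimates into Formula (2) gives the equivalent form $s_{\bar{x}}^2 = (M_1 + M_2 - M_r)/(pq)$, which can serve as a cross-check against the ANOVA table.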
Where one or more variance estimates are negative or zero, either set the corresponding term in
Formula (2) to zero if only the standard uncertainty in the mean is of interest or, if the effective degrees
o
...
