Three statistical approaches for the assessment and interpretation of measurement uncertainty

ISO/TR 13587:2012 is concerned with three basic statistical approaches for the evaluation and interpretation of measurement uncertainty: the frequentist approach including bootstrap uncertainty intervals, the Bayesian approach, and fiducial inference. The common feature of these approaches is a clearly delineated probabilistic interpretation or justification for the resulting uncertainty intervals. For each approach, the basic method is described and the fundamental underlying assumptions and the probabilistic interpretation of the resulting uncertainty are discussed. Each of the approaches is illustrated using two examples including an example from the ISO/IEC Guide 98-3 (Uncertainty of measurement - Part 3: Guide to the expression of uncertainty in measurement (GUM:1995)). This document also includes a discussion of the relationship between the methods proposed in GUM Supplement 1 and these three statistical approaches.

Trois approches statistiques pour l'évaluation et l'interprétation de l'incertitude de mesure

General Information

Status
Published
Publication Date
12-Jul-2012
Current Stage
6060 - International Standard published
Start Date
13-Jul-2012
Completion Date
13-Dec-2025
Ref Project
Technical report
ISO/TR 13587:2012 - Three statistical approaches for the assessment and interpretation of measurement uncertainty
English language
43 pages

Standards Content (Sample)


TECHNICAL REPORT ISO/TR 13587
First edition 2012-07-15
Three statistical approaches for the assessment and interpretation of measurement uncertainty
Trois approches statistiques pour l'évaluation et l'interprétation de l'incertitude de mesure
©  ISO 2012
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols (and abbreviated terms)
5 The problem addressed
6 Statistical approaches
6.1 Frequentist approach
6.2 Bayesian approach
6.3 Fiducial approach
6.4 Discussion
7 Examples
7.1 General
7.2 Example 1a
7.3 Example 1b
7.4 Example 1c
8 Frequentist approach to uncertainty evaluation
8.1 Basic method
8.2 Bootstrap uncertainty intervals
8.3 Example 1
8.3.1 General
8.3.2 Example 1a
8.3.3 Example 1b
8.3.4 Example 1c
9 Bayesian approach for uncertainty evaluation
9.1 Basic method
9.2 Example 1
9.2.1 General
9.2.2 Example 1a
9.2.3 Example 1b
9.2.4 Example 1c
9.2.5 Summary of example
10 Fiducial inference for uncertainty evaluation
10.1 Basic method
10.2 Example 1
10.2.1 Example 1a
10.2.2 Example 1b
10.2.3 Example 1c
11 Example 2: calibration of a gauge block
11.1 General
11.2 Frequentist approach
11.3 Bayesian approach
11.4 Fiducial approach
12 Discussion
12.1 Comparison of uncertainty evaluations using the three statistical approaches
12.2 Relation between the methods proposed in GUM Supplement 1 (GUMS1) and the three statistical approaches
13 Summary
Bibliography


Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
In exceptional circumstances, when a technical committee has collected data of a different kind from that
which is normally published as an International Standard (“state of the art”, for example), it may decide by a
simple majority vote of its participating members to publish a Technical Report. A Technical Report is entirely
informative in nature and does not have to be reviewed until the data it provides are considered to be no
longer valid or useful.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO/TR 13587 was prepared by Technical Committee ISO/TC 69, Applications of statistical methods,
Subcommittee SC 6, Measurement methods and results.
This Technical Report is primarily based on Reference [10].

Introduction
The adoption of ISO/IEC Guide 98-3 (GUM) [1] has led to an increasing recognition of the need to include
uncertainty statements in measurement results. Laboratory accreditation based on International Standards
like ISO 17025 [2] has accelerated this process. Recognizing that uncertainty statements are required for
effective decision-making, metrologists in laboratories of all types, from National Metrology Institutes to
commercial calibration laboratories, are exerting considerable effort on the development of appropriate
uncertainty evaluations for different types of measurement using methods given in the GUM.
Some of the strengths of the procedures outlined and popularized in the GUM are its standardized approach
to uncertainty evaluation, its accommodation of sources of uncertainty that are evaluated either statistically
(Type A) or non-statistically (Type B), and its emphasis on reporting all sources of uncertainty considered. The
main approach to uncertainty propagation in the GUM, based on linear approximation of the measurement
function, is generally simple to carry out and in many practical situations gives results that are similar to those
obtained more formally. In short, since its adoption, the GUM has sparked a revolution in uncertainty
evaluation.
Of course, there will always be more work needed to improve the evaluation of uncertainty in particular
applications and to extend it to cover additional areas. Among such other work, the Joint Committee for
Guides in Metrology (JCGM), responsible for the GUM since the year 2000, has completed Supplement 1 to
the GUM, namely, “Propagation of distributions using a Monte Carlo method” (referred to as GUMS1) [3]. The
JCGM is developing other supplements to the GUM on topics such as modelling and models with any number
of output quantities.
Because it should apply to the widest possible set of measurement problems, the definition of measurement
uncertainty in ISO/IEC Guide 99:2007 [4] as a “non-negative parameter characterizing the dispersion of the
quantity values being attributed to a measurand, based on the information used” cannot reasonably be given
at more than a relatively conceptual level. As a result, defining and understanding the appropriate roles of
different statistical quantities in uncertainty evaluation, even for relatively well-understood measurement
applications, is a topic of particular interest to both statisticians and metrologists.
Earlier investigations have approached these topics from a metrological point of view, some authors focusing
on characterizing statistical properties of the procedures given in the GUM. Reference [5] shows that these
procedures are not strictly consistent with either a Bayesian or frequentist interpretation. Reference [6]
proposes some minor modifications to the GUM procedures that bring the results into closer agreement with a
Bayesian interpretation in some situations. Reference [7] discusses the relationship between procedures for
uncertainty evaluation proposed in GUMS1 and the results of a Bayesian analysis for a particular class of
models. Reference [8] also discusses different possible probabilistic interpretations of coverage intervals and
recommends approximating the posterior distributions for this class of Bayesian analyses by probability
distributions from the Pearson family of distributions.
Reference [9] compares frequentist (“conventional”) and Bayesian approaches to uncertainty evaluation.
However, the study is limited to measurement systems for which all sources of uncertainty can be evaluated
using Type A methods. In contrast, measurement systems with sources of uncertainty evaluated using both
Type A and Type B methods are treated in this Technical Report and are illustrated using several examples,
including one of the examples from Annex H of the GUM.
Statisticians have historically placed strong emphasis on using methods for uncertainty evaluation that have
probabilistic justification or interpretation. Through their work, often outside metrology, several different
approaches for statistical inference relevant to uncertainty evaluation have been developed. This Technical
Report presents some of those approaches to uncertainty evaluation from a statistical point of view and
relates them to the methods that are currently being used in metrology or are being developed within the
metrology community. The particular statistical approaches under which different methods for uncertainty
evaluation will be described are the frequentist, Bayesian, and fiducial approaches, which are discussed
further after outlining the notational conventions needed to distinguish different types of quantities.

TECHNICAL REPORT ISO/TR 13587:2012(E)

1 Scope
This Technical Report is concerned with three basic statistical approaches for the evaluation and
interpretation of measurement uncertainty: the frequentist approach including bootstrap uncertainty intervals,
the Bayesian approach, and fiducial inference. The common feature of these approaches is a clearly
delineated probabilistic interpretation or justification for the resulting uncertainty intervals. For each approach,
the basic method is described and the fundamental underlying assumptions and the probabilistic interpretation
of the resulting uncertainty are discussed. Each of the approaches is illustrated using two examples, including
an example from ISO/IEC Guide 98-3 (Uncertainty of measurement — Part 3: Guide to the expression of
uncertainty in measurement (GUM:1995)). This document also includes a discussion of the
relationship between the methods proposed in GUM Supplement 1 and these three statistical approaches.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 3534-1:2006, Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms used in
probability
ISO 3534-2:2006, Statistics — Vocabulary and symbols — Part 2: Applied statistics
ISO/IEC Guide 98-3:2008, Uncertainty of measurement — Part 3: Guide to the expression of uncertainty in
measurement (GUM:1995)
ISO/IEC Guide 98-3:2008/Suppl 1:2008, Uncertainty of measurement — Part 3: Guide to the expression of
uncertainty in measurement (GUM:1995) — Supplement 1: Propagation of distributions using a Monte Carlo
method
3 Terms and definitions
For the purposes of this document, the terms and definitions in ISO 3534-1, ISO 3534-2 and the following
apply.
3.1
empirical distribution function
empirical cumulative distribution function
distribution function that assigns probability $1/n$ to each of the $n$ items in a random sample, i.e., the empirical
distribution function is a step function defined by
$$F_n(x) = \frac{\left|\{x_i : x_i \le x\}\right|}{n},$$
where $x_1, \ldots, x_n$ is the sample and $|A|$ is the number of elements in the set $A$.
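The definition in 3.1 can be sketched in a few lines of Python. This is only an illustration: the helper name `make_ecdf` and the four-item sample are assumptions made for the example and do not come from the report.

```python
from bisect import bisect_right


def make_ecdf(sample):
    """Return F_n, the empirical distribution function of `sample`."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def ecdf(x):
        # bisect_right gives the number of sample items x_i with x_i <= x
        return bisect_right(ordered, x) / n

    return ecdf


# Hypothetical sample, for illustration only
F = make_ecdf([2.0, 1.0, 3.0, 2.0])
values = [F(0.5), F(1.0), F(2.0), F(2.5), F(3.0)]
print(values)
```

Each sample item contributes a jump of 1/n, so a value that occurs twice produces a jump of 2/n at that point.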
3.2
Bayesian sensitivity analysis
study of the effect of the choices of prior distributions for the parameters of the statistical model on the
posterior distribution of the measurand
3.3
sufficient statistic
function of a random sample $X_1, \ldots, X_n$ from a probability density function with parameter $\theta$ for which the
conditional distribution of $X_1, \ldots, X_n$ given this function does not depend on $\theta$
NOTE A sufficient statistic contains as much information about $\theta$ as $X_1, \ldots, X_n$.
3.4
observation model
mathematical relation between a set of measurements (indications), the measurand, and the associated
random measurement errors
3.5
structural equation
statistical model relating the observable random variable to the unknown parameters and an unobservable
random variable whose distribution is known and free of unknown parameters
3.6
non-central chi-squared distribution
probability distribution that generalizes the typical (or central) chi-squared distribution
NOTE 1 For $k$ independent, normally distributed random variables $X_i$ with mean $\mu_i$ and variance $\sigma_i^2$, the random
variable
$$X = \sum_{i=1}^{k} \left(\frac{X_i}{\sigma_i}\right)^2$$
is non-central chi-squared distributed. The non-central chi-squared distribution has two
parameters: $k$, the degrees of freedom (i.e., the number of $X_i$), and $\lambda$, which is related to the means of the random
variables $X_i$ by
$$\lambda = \sum_{i=1}^{k} \left(\frac{\mu_i}{\sigma_i}\right)^2$$
and called the non-centrality parameter.
NOTE 2 The corresponding probability density function is expressed as a mixture of central $\chi^2$ probability density
functions as given by
$$g_X(x) = \sum_{i=0}^{\infty} \frac{e^{-\lambda/2}\,(\lambda/2)^i}{i!}\; g_{Y_{k+2i}}(x)
= \frac{e^{-(x+\lambda)/2}\, x^{k/2-1}}{2^{k/2}} \sum_{i=0}^{\infty} \frac{(\lambda x)^i}{2^{2i}\, i!\; \Gamma(k/2+i)},$$
where $Y_q$ is distributed as chi-squared with $q$ degrees of freedom.
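As a numerical check on NOTE 2, the following sketch evaluates the density both as a Poisson-weighted mixture of central chi-squared densities and as the single series, and compares the two. The function names, the truncation at 40 terms, and the evaluation point are assumptions chosen for illustration.

```python
import math


def central_chi2_pdf(x, q):
    """Density of the central chi-squared distribution with q degrees of freedom (x > 0)."""
    return x ** (q / 2 - 1) * math.exp(-x / 2) / (2 ** (q / 2) * math.gamma(q / 2))


def ncx2_pdf_mixture(x, k, lam, terms=40):
    """Mixture form: Poisson(lam / 2) weights applied to central densities with k + 2i d.o.f."""
    return sum(
        math.exp(-lam / 2) * (lam / 2) ** i / math.factorial(i)
        * central_chi2_pdf(x, k + 2 * i)
        for i in range(terms)
    )


def ncx2_pdf_series(x, k, lam, terms=40):
    """Series form: common factor times a power series in lam * x."""
    head = math.exp(-(x + lam) / 2) * x ** (k / 2 - 1) / 2 ** (k / 2)
    tail = sum(
        (lam * x) ** i / (2 ** (2 * i) * math.factorial(i) * math.gamma(k / 2 + i))
        for i in range(terms)
    )
    return head * tail


# Hypothetical evaluation point and parameters
mixture_value = ncx2_pdf_mixture(4.0, k=3, lam=2.0)
series_value = ncx2_pdf_series(4.0, k=3, lam=2.0)
print(mixture_value, series_value)
```

With a non-centrality parameter of zero only the first mixture term survives, so the density reduces to the central chi-squared density with k degrees of freedom.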
4 Symbols (and abbreviated terms)
In 4.1.1 of the GUM, it is stated that Latin letters are used to represent both physical quantities to be
determined by measurement (i.e., measurands in GUM terminology) as well as random variables that may
take different observed values of a physical quantity. This use of the same symbols, whose different meanings
are only indicated by context, can be difficult to interpret and sometimes leads to unnecessary ambiguities or
misunderstandings. To mitigate this potential source of confusion, the more traditional notation often used in
the statistical literature is employed in this Technical Report. In this notation, Greek letters are used to
represent parameters in a statistical model (e.g., measurands), which can be either random variables or
constants depending on the statistical approach being used and nature of the model. Upper-case Latin letters
are used to represent random variables that can take different values of an observable quantity (e.g., potential
measured values), and lower-case Latin letters to represent specific observed values of a quantity (e.g.,
specific measured values). Since additional notation may be required to denote other physical, mathematical,
or statistical concepts, there will still always be some possibility for ambiguity (see footnote 1). In those cases the context
clarifies the appropriate interpretation.
5 The problem addressed
5.1 The concern in this Technical Report is with a measurement model in which $\mu_1, \ldots, \mu_p$ are input
quantities and $\theta$ is the output quantity:
$$\theta = f(\mu_1, \ldots, \mu_p), \tag{1}$$
where $f$ is known as the measurement function. The function $f$ is specified mathematically or as a
calculation procedure. In the GUM (4.1, NOTE 1), the same functional relationship is given as
$$Y = f(X_1, \ldots, X_p), \tag{2}$$
which cannot be easily distinguished from the measurement function evaluated at the values of the
corresponding random variables for each observed input.
Using the procedure recommended in the GUM, the $p$ unknown quantities $\mu_1, \ldots, \mu_p$ are estimated by
values $x_1, \ldots, x_p$ obtained from physical measurement or from other sources. Their associated standard
uncertainties are also obtained from the relevant data by statistical methods or from probability density
functions based on expert knowledge that characterize the variables. The GUM (also see 4.5 in
Reference [11]) recommends that the same measurement model that relates the measurand $\theta$ to the input
quantities $\mu_1, \ldots, \mu_p$ be used to calculate $y$ from $x_1, \ldots, x_p$. Thus, the measured value (or, in statistical
nomenclature, the estimate) $y$ of $\theta$ is obtained as
$$y = f(x_1, \ldots, x_p), \tag{3}$$
that is, the evaluated $Y$, $y = f(x_1, \ldots, x_p)$, is taken to be the measured value of $\theta$. The estimates $y, x_1, \ldots, x_p$
are realizations of $Y, X_1, \ldots, X_p$, respectively.
1 p
5.2 In this Technical Report, three statistical approaches are each used to provide (a) a best estimate $y$ of
$\theta$, (b) the associated standard uncertainty $u(y)$, and (c) a confidence interval or coverage interval for $\theta$ for a
prescribed coverage probability (often taken as 95 %).
5.3 When discussing standard uncertainties, distinction is made between evaluated standard uncertainties
associated with estimates of various quantities and their corresponding theoretical values. Accordingly,
notation such as $\sigma_\mu$ or $\sigma_X$ will denote theoretical standard uncertainties and notation such as $S_X$ and $s_x$ will
denote an evaluated standard uncertainty before and after being observed, respectively.

1) For example, not all quantities represented by Greek letters in a statistical model must be parameters of the model.
One common example of this type of quantity is the set of unobservable quantities that represent the random
measurement errors found in most statistical models (i.e., the $\varepsilon_i$ in the model $Y_i = \mu + \varepsilon_i$).
6 Statistical approaches
6.1 Frequentist approach
6.1.1 The first statistical approach to be considered, in which uncertainty can be evaluated probabilistically,
is frequentist. The frequentist approach is sometimes referred to as “classical” or “conventional”. However,
due to the nature of uncertainty in metrology, these familiar methods must often be adapted to obtain
frequentist uncertainty intervals under realistic conditions.
6.1.2 In the frequentist approach, the input quantities $\mu_1, \ldots, \mu_p$ in the measurement model (1) and the output
quantity $\theta$ are regarded as unknown constants. Then, data related to each input parameter, $\mu_i$, is obtained
and used to estimate the value of $\theta$ based on the measurement model or the corresponding statistical
models. Finally, confidence intervals for $\theta$, for a specified level of confidence, are obtained using one of
several mathematical principles or procedures, for example, least-squares, maximum likelihood, or the
bootstrap.
6.1.3 Because $\theta$ is treated as a constant, a probabilistic statement associated with a confidence interval
for $\theta$ is not a direct probability statement about its value. Instead, it is a probability statement about how
frequently the procedure used to obtain the uncertainty interval for the measurand would encompass the value
of $\theta$ with repeated use. “Repeated use” means that the uncertainty evaluation is replicated many times using
different data drawn from the same distributions. Traditional frequentist uncertainty intervals provide a
probability statement about the long-run properties of the procedure used to construct the interval under the
particular set of conditions assumed to apply to the measurement process.
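The "repeated use" interpretation in 6.1.3 can be illustrated by simulation. The sketch below is a minimal example under assumed conditions: Gaussian data with a known standard deviation and a normal-theory interval. The true value, standard deviation, sample size, and seed are hypothetical choices, not values from the report.

```python
import random
from statistics import NormalDist, fmean


def coverage_of_z_interval(mu, sigma, n, level=0.95, replications=20000, seed=1):
    """Fraction of replicated intervals ybar +/- z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(replications):
        # Fresh data drawn from the same distribution on every replication
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        ybar = fmean(sample)
        if ybar - half_width <= mu <= ybar + half_width:
            hits += 1
    return hits / replications


# Hypothetical true value and standard deviation
coverage = coverage_of_z_interval(mu=3.5, sigma=0.3, n=5)
print(coverage)
```

The true value stays fixed throughout; only the interval changes from one replication to the next, and the long-run fraction of intervals that contain it should be close to the nominal level.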
6.1.4 In most practical metrological settings, on the other hand, uncertainty intervals are to account for the
uncertainty associated with estimates of quantities obtained using measured values (observed data) and also
the uncertainty associated with estimates of quantities based on expert knowledge. To obtain an uncertainty
interval analogous to a confidence interval, the quantities that are not based on measured values are treated
as random variables with probability distributions for their values while those quantities whose values can be
estimated using statistical data are treated as unknown constants.
6.1.5 Traditional frequentist procedures for the construction of confidence intervals are then to be modified
to attain the specified confidence level after averaging over the potential values of the quantities assessed
using expert judgment [5]. Such modified coverage intervals provide long-run probability statements about the
procedure used to obtain the interval given probability distributions for the quantities that have not been
measured, just as traditional confidence intervals do when all parameters are treated as constants.
6.1.6 Table 1 summarizes interpretations of the frequentist, Bayesian and fiducial approaches to uncertainty
evaluation.
Table 1 — Interpretations of the approaches to uncertainty evaluation
| Approach | Characterization of quantities in measurement model $\theta = f(\mu_1, \ldots, \mu_p)$ | Uncertainty interval for output quantity $\theta$ | Note |
|---|---|---|---|
| Frequentist | $\theta$ and the $\mu_i$ all unknown constants | Long-run occurrence frequency that interval contains $\theta$ | Classical frequentist approach extended to integrate over uncertainties that are not statistically evaluated |
| Bayesian | $\theta$ and the $\mu_i$ are random variables. Their probability distributions represent beliefs about the values of the input and output quantities | Coverage interval containing $\theta$ based on a posterior distribution for $\theta$ | Possible non-uniqueness of interval due to the choice of priors |
| Fiducial | $\mu_i$ regarded as random variables whose distributions are obtained from assumptions on observed data used to estimate $\mu_i$ and expert knowledge about $\mu_i$ | Coverage interval containing $\theta$ based on a fiducial distribution for $\theta$ | Non-uniqueness due to the choice of the structural equation |
6.2 Bayesian approach
The second approach is called the Bayesian approach. It is named after the fundamental theorem on which it
is based, which was proved by the Reverend Thomas Bayes in the mid-1700s [12]. In this approach,
knowledge about the quantities in measurement model (1) in Clause 5 is modelled as a set of random
variables that follow a joint probability distribution for $\mu_1, \ldots, \mu_p$ and $\theta$. Bayes’ theorem then allows these
probability distributions to be updated based on the observed data (also modelled using probability
distributions) and the interrelationships of the parameters defined by the function $f$ or equivalent statistical
models. Then, a probability distribution is obtained that describes knowledge of $\theta$ given the observed data.
Uncertainty intervals that contain $\theta$ with any specified probability can then be obtained from this distribution.
Because knowledge of the parameter values is described by probability distributions, Bayesian methods
provide direct probabilistic statements about the value of $\theta$ and the other parameters, using a definition of
probability as a measure of belief.
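A textbook special case shows the updating step in concrete terms. The sketch below assumes a single Gaussian quantity with known standard deviation and a Gaussian prior, for which the posterior is again Gaussian. The prior settings and the choice to treat the standard deviation as known are illustrative assumptions, not the analysis carried out in the report. The data are the signal-plus-background values of Example 1a (7.2), written with decimal points.

```python
from statistics import NormalDist, fmean


def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Posterior for the mean of N(mean, sigma^2) data under a N(prior_mean, prior_sd^2) prior."""
    n = len(data)
    # Precisions (inverse variances) add when a Gaussian prior meets Gaussian data
    precision = 1 / prior_sd ** 2 + n / sigma ** 2
    post_mean = (prior_mean / prior_sd ** 2 + n * fmean(data) / sigma ** 2) / precision
    return NormalDist(post_mean, precision ** -0.5)


# Example 1a values (7.2); sigma and the prior are assumptions for this sketch
observations = [3.738, 3.442, 2.994, 3.637, 3.874]
posterior = normal_posterior(observations, sigma=0.342, prior_mean=3.0, prior_sd=10.0)

# Central 95 % interval read directly from the posterior distribution
lower, upper = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)
print(posterior.mean, posterior.stdev, (lower, upper))
```

With such a diffuse prior the posterior mean sits very close to the sample mean. A narrower prior would pull it toward the prior mean, which is the sensitivity to priors that Table 1 flags for the Bayesian approach.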
6.3 Fiducial approach
6.3.1 The fiducial approach was developed by R.A. Fisher [13] in the 1930s. In this approach, a probability
distribution, called the fiducial distribution, for $\theta$ conditional on the data is obtained based on the
interrelationship of $\theta$ and the $\mu_i$ described by $f$ and the distributional assumptions about the data used to
estimate the $\mu_i$. Once obtained, the fiducial distribution for $\theta$ can be used to obtain uncertainty intervals that
contain $\theta$ with any specified probability.
6.3.2 The argument that justifies the process used to obtain the fiducial distribution is illustrated using a
simple example. Suppose the values taken by a quantity $Y$ can be described by the equation $Y = \theta + Z$,
where $\theta$ is the measurand and $Z$ is a quantity characterized by a standard normal random variable. If $y$ is a
realized value of $Y$ corresponding to a realized value $z$ of $Z$, then $y = \theta + z$. Despite $Z$ not being
observable, knowledge of the distribution from which $z$ was generated enables a set of plausible values of $\theta$
to be determined. The probability distribution for $Z$ can be used to infer the probability distribution for $\theta$. The
process of transferring the relationship $y = \theta + z$ to the relation $\theta = y - Z$ is what constitutes the fiducial
argument. The fiducial distribution for $\theta$ is the probability distribution for the random variable $y - Z$ with $y$
fixed.
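The simple example in 6.3.2 can be mimicked numerically: hold the observed value fixed, draw many values of the standard normal quantity, and read an interval off the resulting values of the observed value minus each draw. The observed value 2.0, the number of draws, and the seed below are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist


def fiducial_draws(y, draws=50000, seed=7):
    """Sorted draws of y - Z, with y fixed and Z standard normal."""
    rng = random.Random(seed)
    return sorted(y - rng.gauss(0.0, 1.0) for _ in range(draws))


def percentile_interval(sorted_draws, level=0.95):
    """Central interval taken from the empirical percentiles of the draws."""
    lo_idx = int((1 - level) / 2 * len(sorted_draws))
    hi_idx = int((1 + level) / 2 * len(sorted_draws)) - 1
    return sorted_draws[lo_idx], sorted_draws[hi_idx]


y_observed = 2.0  # hypothetical realized value of Y
mc_lower, mc_upper = percentile_interval(fiducial_draws(y_observed))

# In this simple model the fiducial distribution is N(y, 1), so the exact interval is known
z = NormalDist().inv_cdf(0.975)
exact_lower, exact_upper = y_observed - z, y_observed + z
print((mc_lower, mc_upper), (exact_lower, exact_upper))
```

Sampling is unnecessary here because the distribution is available in closed form. The Monte Carlo route is shown because it carries over to structural equations for which no closed form exists.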
6.4 Discussion
When describing the different methods for uncertainty evaluation under each of these statistical approaches,
their fundamental underlying assumptions, incorporation of uncertainties obtained using Type A or Type B
evaluation, and the probabilistic interpretation of the resulting uncertainty evaluations will be discussed. A
description of how the methods used in the GUM relate to the frequentist, Bayesian, or fiducial results will also
be given.
7 Examples
7.1 General
Two examples are given to illustrate the approaches. Example 1 is concerned with a physical quantity that is
to be corrected for background interference. Table 2 gives the notation used and Subclauses 7.2 to 7.4 define
variants of this evaluation problem. Example 2 is the calibration of the length of a gauge block taken from
Annex H.1 of the GUM. Because it is more complicated, it is considered in Clause 11, after the three methods
for uncertainty evaluation are discussed and illustrated using Example 1.
In later clauses, the three approaches will be applied to these examples.
NOTE The units of the quantities involved are not given when they are immaterial for the example.
Table 2 — Notation for Example 1
| Quantity | Symbol |
|---|---|
| Physical quantity of interest (the measurand) | $\theta$ |
| Quantity detected by measurement method when measuring background (i.e., expected value of $B$) (background interference) | $\beta$ |
| Quantity detected by measurement method when measuring the physical quantity of interest $\theta$ (i.e., expected value of $Y$) | $\gamma$ |
| Standard deviation of measurement method when measuring the physical quantity of interest $\theta$ (i.e., standard deviation of $Y$) | $\sigma_Y$ |
| Standard deviation of measurement method when measuring background (i.e., standard deviation of $B$) | $\sigma_B$ |

7.2 Example 1a
Five measured values, obtained independently, of signal plus background are observed. Each measured
value is assumed to be a realization of a random variable, $Y$, having a Gaussian distribution with mean
$\gamma$ and standard deviation $\sigma_Y$. The measured values, $y$, of the signal plus background are
3,738,  3,442,  2,994,  3,637,  3,874.
These data have a sample mean of $\bar{y}$ = 3,537 and a sample standard deviation of $s_y$ = 0,342.
Similarly, five measured values, obtained independently, of the background are obtained. These measured
values are assumed to be realizations of a random variable, $B$, having a Gaussian distribution with mean $\beta$
and standard deviation $\sigma_B$. The observed values, $b$, of the background are
1,410,  1,085,  1,306,  1,137,  1,200.
Because there are measured values for each quantity that is a source of uncertainty, Example 1a has a
straightforward statistical interpretation for each approach.
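The summary statistics quoted for Example 1a can be reproduced with Python's standard `statistics` module. The values are copied from 7.2 with decimal points in place of the decimal commas used in the report. The variable names are illustrative. The background statistics are computed here and are not quoted in the text above.

```python
from statistics import mean, stdev

# Measured values from 7.2 (decimal commas converted to decimal points)
signal_plus_background = [3.738, 3.442, 2.994, 3.637, 3.874]
background = [1.410, 1.085, 1.306, 1.137, 1.200]

# stdev() is the sample standard deviation with n - 1 in the denominator
y_bar, s_y = mean(signal_plus_background), stdev(signal_plus_background)
b_bar, s_b = mean(background), stdev(background)

print(round(y_bar, 3), round(s_y, 3))
print(round(b_bar, 4), round(s_b, 4))
```

The first line of output agrees with the sample mean 3,537 and sample standard deviation 0,342 stated in 7.2.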
7.3 Example 1b
Example 1b is identical to Example 1a with the exception that the assessment of the background is based on
expert knowledge or past experience, rather than on fresh experimental data. In this case, the background $\beta$
is believed to follow a uniform (or rectangular) distribution with endpoints 1,126 and 1,329. Because expert
judgment is applied, the uncertainty associated with a value of the background will be obtained using a Type B
evaluation. Thus, Example 1b can be considered closer than Example 1a to a real measurement situation.
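For a rectangular distribution, the usual Type B treatment takes the midpoint as the estimate and the width divided by the square root of 12 as the standard uncertainty. The sketch below applies that standard result to the endpoints given in 7.3. The function and variable names are assumptions made for illustration.

```python
import math


def rectangular_mean_and_std(lower, upper):
    """Mean and standard deviation of a uniform distribution on [lower, upper]."""
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    return (lower + upper) / 2, (upper - lower) / math.sqrt(12)


# Endpoints from 7.3 (decimal commas converted to decimal points)
background_estimate, u_background = rectangular_mean_and_std(1.126, 1.329)
print(round(background_estimate, 4), round(u_background, 4))
```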
7.4 Example 1c
Example 1c is identical to Example 1b except that the signal $\theta$ is closer to the background. The data
observed for the signal plus background in this case are
1,340,  1,078,  1,114,  1,256,  1,192.
With the signal just above the background, Example 1c illustrates how physical constraints can be
incorporated in the evaluation of uncertainty for each approach.
8 Frequentist approach to uncertainty evaluation
8.1 Basic method
8.1.1 In the frequentist context, parameters are unknown constants. Following the convention to denote
random variables by upper case letters and observed values of random variables by lower case letters, a
confidence interval can be obtained from a pivotal quantity for $\theta$, i.e., a function $W(Y, \theta)$ of the (possibly
multivariate) data $Y$ and the parameter $\theta$, whose probability distribution is parameter-free (provided such a
distribution can be determined). Then, a $100(1-\alpha)$ % confidence interval for $\theta$ can be determined by
calculating lower and upper percentiles $\ell_\alpha$ and $u_\alpha$ to satisfy $P(\ell_\alpha \le W(Y, \theta) \le u_\alpha) = 1 - \alpha$.
8.1.2 For example, let $Y = (Y_1, \ldots, Y_n)$ be random variables, distributed as $N(\mu, \sigma^2)$, with the further random
variable $\bar{Y} = \sum_{i=1}^{n} Y_i / n$. If the parameter of interest is $\mu$, then for known $\sigma$,
$$Z = \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
is a pivotal quantity. The frequentist confidence interval for $\mu$ is
$$\bar{Y} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}, \tag{4}$$
where $z_\beta$ is the $100\beta$ percentile of the standardized normal distribution.
If $\sigma$ is not known, it can be estimated by the sample standard deviation
$$S = \sqrt{\frac{\sum_{j=1}^{n} (Y_j - \bar{Y})^2}{n-1}}.$$
Then, the (exact) pivotal quantity for $\mu$ is obtained by replacing $\sigma$ by $S$ in the pivotal quantity $Z$:
$$\frac{\bar{Y} - \mu}{S/\sqrt{n}} \sim t(n-1). \tag{5}$$
Thus, a $100(1-\alpha)$ % confidence interval for $\mu$ based on the Student’s t-distribution is
$$\bar{Y} \pm t_{n-1,\,1-\alpha/2}\,\frac{S}{\sqrt{n}},$$
where $t_{n-1,\,\beta}$ is the $100\beta$ percentile of the t-distribution with $n-1$ degrees of freedom.
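As a worked instance of the Student's t interval, the sketch below applies it to the five signal-plus-background values of Example 1a (7.2). The t percentile is hard-coded from standard tables because the Python standard library has no t quantile function. The constant and function names are illustrative assumptions. The result is an interval for the mean of the signal-plus-background readings, not for the background-corrected measurand.

```python
import math
from statistics import mean, stdev

# 97.5th percentile of Student's t with 4 degrees of freedom, from standard tables
T_4_0975 = 2.776


def student_interval(sample, t_quantile):
    """Interval ybar +/- t * s / sqrt(n) for the mean of a Gaussian sample."""
    half_width = t_quantile * stdev(sample) / math.sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Example 1a values (decimal commas converted to decimal points)
y = [3.738, 3.442, 2.994, 3.637, 3.874]
t_lower, t_upper = student_interval(y, T_4_0975)
print(round(t_lower, 3), round(t_upper, 3))
```

Using the normal percentile 1.96 instead would give a noticeably shorter interval, which is why the t-distribution matters for a sample of only five values.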
8.1.3 Instead of exact pivotal quantities, which exist only in simple situations, approximate pivotal quantities
are commonly employed in applications. For large samples, the central limit theorem can be invoked to obtain
approximate confidence intervals based on the normal distribution.
8.1.4 Further methods of obtaining confidence intervals (inverting a test statistic, pivoting a continuous
cumulative distribution function, ordering the discrete sample values according to their probabilities, etc.) are
discussed in Reference [14]. Some of them are mentioned in Example 1. A computer-intensive method, called
the bootstrap, also can be used to construct a confidence interval for pivotal quantities that have unknown
distributions. The bootstrap procedure is discussed in 8.2.
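To give a rough idea of how resampling can stand in for an unknown pivot distribution, here is a generic bootstrap-t sketch for a mean. It is one common textbook variant and is not claimed to be the procedure described in 8.2. The function name, the number of resamples, and the seed are assumptions. The data are the Example 1a values from 7.2.

```python
import math
import random
from statistics import fmean, stdev


def bootstrap_t_interval(sample, level=0.95, resamples=4000, seed=11):
    """Bootstrap-t interval: the studentized pivot's percentiles are estimated by resampling."""
    rng = random.Random(seed)
    n = len(sample)
    centre, spread = fmean(sample), stdev(sample)
    pivots = []
    while len(pivots) < resamples:
        resample = rng.choices(sample, k=n)  # n draws with replacement
        s_star = stdev(resample)
        if s_star == 0.0:
            continue  # all items equal: the studentized pivot is undefined
        pivots.append((fmean(resample) - centre) / (s_star / math.sqrt(n)))
    pivots.sort()
    q_lo = pivots[int((1 - level) / 2 * resamples)]
    q_hi = pivots[int((1 + level) / 2 * resamples) - 1]
    standard_error = spread / math.sqrt(n)
    # Inverting the pivot swaps the roles of the upper and lower percentiles
    return centre - q_hi * standard_error, centre - q_lo * standard_error


y = [3.738, 3.442, 2.994, 3.637, 3.874]
boot_lower, boot_upper = bootstrap_t_interval(y)
print(boot_lower, boot_upper)
```

With only five observations the resampled pivot is very variable, so this interval should be read as an illustration of the mechanics, not as a recommended evaluation.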
8.1.5 Although not explicitly given a frequentist justification from fundamental scientific considerations, the
procedures recommended in the GUM can be used to obtain an approximate confidence interval for the
measurand. Such confidence intervals are based on an approximate pivotal quantity with an
assumed t-distribution obtainable from the measurement model (1). Under this procedure, the unknown
quantities $\mu_1, \ldots, \mu_p$ are estimated by values $x_1, \ldots, x_p$ obtained from physical measurement or from other
sources. Some of the values $x_i$ might be sample means or other functions of data designed to estimate the
quantities $\mu_i$, $i = 1, \ldots, m$. Their associated standard uncertainties $u(x_i)$ are also evaluated from the data by
statistical methods, typically using the sample standard deviation or using robust rank-based procedures.
Such methods are known as Type A evaluations of uncertainty. The degrees of freedom $\nu_i$ associated with
$u(x_i)$ is determined from the sample size used to estimate $\mu_i$.
8.1.6 Since physical measurements might not always be possible or feasible for some of the $\mu_i$, estimates
$x_i$ of $\mu_i$ for some $i$, say $i = m+1, \ldots, p$, are obtained by subjective (or potentially subjective) evaluations, and
used together with $x_i$, for $i = 1, \ldots, m$, obtained from Type A evaluations of uncertainty. Thus, non-statistical
types of information are used to estimate $\mu_{m+1}, \ldots, \mu_p$ using Type B evaluations of uncertainty, including
scientific judgment, manufacturer’s specifications, or other indirectly related or incompletely specified
information.
NOTE Sometimes uncertainties are obtained by both Type A and Type B evaluations of uncertainty.
8.1.7 The GUM recommends that the same measurement model relating the measurand $\theta$ to the input
quantities $\mu_1, \ldots, \mu_p$ be used to calculate $y$ from $x_1, \ldots, x_p$. Thus, the measured value (or the estimate) $y$ of $\theta$
is obtained as
$$y = f(x_1, \ldots, x_m, x_{m+1}, \ldots, x_p),$$
that is, the evaluated $Y$, $y = f(x_1, \ldots, x_p)$, is taken to be the measured value of $\theta$.
8.1.8 In the GUM, the law of propagation of uncertainty is used to evaluate the standard uncertainty, $u(y)$,
associated with $y$. The standard uncertainties $u(x_1), \ldots, u(x_p)$ associated with the values $x = (x_1, \ldots, x_p)$ are
used in the Taylor series expansion of the function $f(x_1, \ldots, x_p)$ at $\mu_1, \ldots, \mu_p$, whose terms up to first order are
$$f(x_1, \ldots, x_p) \approx f(\mu_1, \ldots, \mu_p) + \sum_{i=1}^{p} c_i (x_i - \mu_i) \qquad (6)$$
Denoting $(\mu_1, \ldots, \mu_p)$ by $\mu$, the partial derivatives
$$c_i = \left. \frac{\partial f}{\partial \mu_i} \right|_{\mu = x}$$
are called sensitivity coefficients. Applying the law of propagation of uncertainty in the GUM gives the
approximate standard uncertainty associated with $y$:
$$u^2(y) = \sum_{i=1}^{p} c_i^2 u^2(x_i) + 2 \sum_{i<j} c_i c_j u(x_i, x_j) \qquad (7)$$
where $u(x_i, x_j)$ is the covariance between $X_i$ and $X_j$.
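Expressions (6) and (7) can be illustrated numerically. The following is a minimal Python sketch, not a procedure taken from the GUM or from this document: the function names, the central-difference approximation of the sensitivity coefficients, and the example model (a product of two input quantities) are all assumptions made for illustration.

```python
import math

def sensitivity_coefficients(f, x, rel_step=1e-6):
    """Approximate c_i = df/dmu_i evaluated at mu = x by central differences."""
    coeffs = []
    for i, xi in enumerate(x):
        h = rel_step * max(abs(xi), 1.0)
        up = list(x)
        up[i] = xi + h
        down = list(x)
        down[i] = xi - h
        coeffs.append((f(up) - f(down)) / (2.0 * h))
    return coeffs

def combined_standard_uncertainty(f, x, u, cov=None):
    """Expression (7): u^2(y) = sum c_i^2 u^2(x_i) + 2 sum_{i<j} c_i c_j u(x_i, x_j).

    `cov` maps index pairs (i, j) with i < j to covariances u(x_i, x_j);
    pairs that are not listed are treated as uncorrelated (sketch assumption).
    """
    c = sensitivity_coefficients(f, x)
    variance = sum((ci * ui) ** 2 for ci, ui in zip(c, u))
    for (i, j), u_ij in (cov or {}).items():
        variance += 2.0 * c[i] * c[j] * u_ij
    return math.sqrt(variance)

# Hypothetical measurement model: y = x1 * x2
def model(x):
    return x[0] * x[1]

x = [2.0, 3.0]   # estimates x_1, x_2 (made-up values)
u = [0.1, 0.2]   # standard uncertainties u(x_1), u(x_2) (made-up values)
u_y = combined_standard_uncertainty(model, x, u)
```

For this model the sensitivity coefficients are c_1 = x_2 and c_2 = x_1, so the uncorrelated case reduces to the square root of (3 × 0.1)² + (2 × 0.2)². A covariance can be supplied as, for example, `cov={(0, 1): 0.01}`.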
8.1.9 To evaluate the standard uncertainty $u(y)$, the GUM uses the effective degrees of freedom $\nu_{\mathrm{eff}}$
computed from the Welch-Satterthwaite formula,
$$\nu_{\mathrm{eff}} = \frac{u^4(y)}{\sum_{i=1}^{p} \dfrac{c_i^4 u^4(x_i)}{\nu(x_i)}} \qquad (8)$$
NOTE Reference [15] discusses a counter-intuitive property according to which in interlaboratory studies a
confidence interval based on the Welch-Satterthwaite approximation may be shorter for a between-laboratory difference
than for one of its components.
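As an illustration of Formula (8), the following minimal Python sketch computes the effective degrees of freedom. It assumes uncorrelated input quantities, so that the squared standard uncertainty is the sum of the squared contributions; the function name and the numerical values are hypothetical.

```python
def welch_satterthwaite(c, u, nu):
    """Formula (8): nu_eff = u^4(y) / sum_i [ c_i^4 u^4(x_i) / nu(x_i) ].

    Assumes uncorrelated inputs, so u^2(y) = sum_i c_i^2 u^2(x_i).
    """
    contributions = [(ci * ui) ** 2 for ci, ui in zip(c, u)]   # c_i^2 u^2(x_i)
    u2_y = sum(contributions)                                  # u^2(y)
    denominator = sum(t ** 2 / nui for t, nui in zip(contributions, nu))
    return u2_y ** 2 / denominator

# Made-up example: two inputs with 9 and 4 degrees of freedom
nu_eff = welch_satterthwaite(c=[1.0, 1.0], u=[0.3, 0.4], nu=[9, 4])
```

With a single input quantity the function returns that quantity's own degrees of freedom, which is a convenient sanity check.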
8.1.10 Finally, in order to construct a confidence interval for $\theta$, the approximate pivotal quantity,
$$W(y) = \frac{y - \theta}{u(y)} \qquad (9)$$
is employed. According to the GUM,
$$W(Y) \sim t(\nu_{\mathrm{eff}}), \qquad (10)$$
that is, $W(Y)$ is an approximately pivotal quantity having a t-distribution with $\nu_{\mathrm{eff}}$ degrees of freedom.
The $100(1 - \alpha)$ % confidence interval
$$y \pm u(y) \, t_{\nu_{\mathrm{eff}},\, 1 - \alpha/2}, \qquad (11)$$
for $\theta$ can then be recommended as the $100(1 - \alpha)$ % uncertainty interval for $\theta$. The half-width $t_{\nu_{\mathrm{eff}},\, 1 - \alpha/2} \, u(y)$ of
this interval is known as the expanded uncertainty associated with $y$.
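The interval (11) can be sketched in Python using only the standard library. The t-quantile is obtained here by numerical integration of the t-density and bisection, which is an implementation choice of this sketch rather than anything prescribed by the GUM; in practice a library routine such as `scipy.stats.t.ppf` would normally be used. All function names and input values are hypothetical.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(x, nu, n=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (n must be even)."""
    h = x / n
    s = t_pdf(0.0, nu) + t_pdf(x, nu)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    return 0.5 + s * h / 3

def t_quantile(p, nu):
    """Quantile t_{nu, p} for 0.5 < p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:      # expand the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def uncertainty_interval(y, u_y, nu_eff, alpha=0.05):
    """Interval (11): y -/+ t_{nu_eff, 1 - alpha/2} * u(y)."""
    expanded = t_quantile(1 - alpha / 2, nu_eff) * u_y   # expanded uncertainty
    return y - expanded, y + expanded

# Made-up values: y = 10.0, u(y) = 0.5, nu_eff = 10
low, high = uncertainty_interval(y=10.0, u_y=0.5, nu_eff=10)
```

Because the density is evaluated through `math.lgamma`, non-integer values of the effective degrees of freedom, as typically produced by Formula (8), are handled without modification.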
8.1.11 This recommendation agrees with standard statistical practice when all uncertainties are determined
using Type A evaluation, in which case the most commonly used statistical estimate for a particular input
quantity  is the sample mean of n observed values. The traditional method for summarizing data to obtain
the Type A standard uncertainty of this estimator is Sn with n1 degrees of freedom. This is based on the
fact that (1nS)  has a chi-squared distribution with n1degrees of freedom. This method applies to
more general statistics of the form YG(.X .,X) , where estimators Xi 1.,p obey the central limit
1 p i
theorem. Indeed in this situation, the standard deviation of Y can be approximated by Expression (7)
with ux( x) replaced by Cov(X X ) .
i j i j
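The Type A summary described above (sample mean, standard uncertainty $S/\sqrt{n}$, and $n - 1$ degrees of freedom) can be sketched as follows. The function name and the readings are hypothetical and serve only as an illustration.

```python
import math
import statistics

def type_a_evaluation(observations):
    """Return (sample mean, standard uncertainty S/sqrt(n), degrees of freedom n - 1)."""
    n = len(observations)
    mean = statistics.fmean(observations)
    s = statistics.stdev(observations)   # sample standard deviation S (divisor n - 1)
    return mean, s / math.sqrt(n), n - 1

# Made-up repeated readings of one input quantity
readings = [10.1, 9.9, 10.2, 10.0, 9.8]
x_bar, u_x, nu = type_a_evaluation(readings)
```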
The GUM method presents the collective wisdom of many metrologists, but is restricted by assumptions of
- local linearity of the function $f$: ideally the sensitivity coefficients should not vary much and not vanish;
- normality of the probability distribution of point estimators $Y = f(X_1, \ldots, X_p)$: may not hold even
approximately for small samples;
- validity of the Welch-Satterthwaite Formula (8): it may not work well when the input quantities are
mutually dependent, the input quantities are not normally distributed, and the standard uncertainties are
dissimilar (degrees of freedom for distributions unrelated to the chi-squared law are difficult to interpret;
indeed, they are not used in statistical theory).
8.1.12 To motivate Expression (7) in the frequentist setting, the concepts of statistical decision theory can be
employed and the variance (squared standard uncertainty) $u^2(y)$ interpreted as the mean squared error of the
statistical estimator of $f(x_1, x_2, \ldots, x_p)$. These steps can be taken provided that the quantities whose
uncertainties are determined using a Type B evaluation, namely, $x_{m+1}, \ldots, x_p$, are eliminated by integrating over
their distributions. See Reference [5]. If f “is sufficiently close to being linear”, Expression (7) provides the
first order approximation of the mean squared error.
8.1.13 The discussion in Example 1 gives another customary frequentist procedure for obtaining confidence
intervals.
8.2 Bootstrap uncertainty intervals
8.2.1 Bootstrapping [16] is a resampling strategy for estimating distribution parameters such as variance and
determining confidence intervals for parameters when the form of the underlying distribution is unknown. The
key idea for the bootstrap method is that the relation between the cumulative distribution function (CDF) $F$
for $Y$ and a sample from $F$ is similar to the relation between an estimated CDF $\hat{F}$, which need not be the
empirical distribution generated by the sample, and a second sample drawn from $\hat{F}$. When $F$ is not available,
draws cannot be made from it, but modern computers allow a large number of draws to be made from $\hat{F}$. So,
one uses the primary sample to form an approximation $\hat{F}$ of $F$, and then calculates the sampling distribution
of the parameter estimate based on $\hat{F}$. This calculation is carried out by drawing many secondary samples
and forming the estimate (or a function of the estimate) for each of the secondary samples. If $\hat{F}$ is a good
approximation to $F$, then $H$, the sampling distribution of the estimate based on $\hat{F}$, is generally a good
approximation to the sampling distribution for the estimate based on $F$. $H$ is commonly called the bootstrap
distribution of the parameter.
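The resampling idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of this document: $\hat{F}$ is taken to be the empirical distribution of the primary sample (resampling with replacement), the interval is a simple percentile interval, and the function names, the number of secondary samples and the data are hypothetical.

```python
import random
import statistics

def bootstrap_distribution(sample, estimator, n_boot=2000, seed=0):
    """Draw secondary samples from F-hat and collect the estimates.

    F-hat is assumed here to be the empirical distribution of `sample`,
    so each secondary sample is drawn with replacement from the primary one.
    """
    rng = random.Random(seed)
    n = len(sample)
    return [estimator(rng.choices(sample, k=n)) for _ in range(n_boot)]

def percentile_interval(estimates, alpha=0.05):
    """Simple percentile interval read off the sorted bootstrap estimates."""
    ordered = sorted(estimates)
    lo_idx = int((alpha / 2) * len(ordered))
    hi_idx = int((1 - alpha / 2) * len(ordered)) - 1
    return ordered[lo_idx], ordered[hi_idx]

# Made-up primary sample
primary = [10.1, 9.9, 10.2, 10.0, 9.8, 10.3, 9.7, 10.0]
H = bootstrap_distribution(primary, statistics.fmean)   # bootstrap distribution of the mean
u_boot = statistics.stdev(H)                            # bootstrap standard uncertainty
low, high = percentile_interval(H)
```

The fixed seed only makes the sketch reproducible; other interval constructions based on the bootstrap distribution exist, and the percentile interval is used here for brevity.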
8.2.2 There are two types of bootstrap procedures useful, respectively, for non-parametric and parametric
inference. The non-parametric bootstrap
...