Accuracy (trueness and precision) of measurement methods and results — Part 3: Intermediate precision and alternative designs for collaborative studies

This document provides
a) a discussion of alternative experimental designs for the determination of trueness and precision measures including reproducibility, repeatability and selected measures of intermediate precision of a standard measurement method, including a review of the circumstances in which their use is necessary or beneficial, and guidance as to the interpretation and application of the resulting estimates, and
b) worked examples including specific designs and computations.
Each of the alternative designs discussed in this document is intended to address one (or several) of the following issues:
a) a discussion of the implications of the definitions of intermediate precision measures;
b) guidance on the interpretation and application of the estimates of intermediate precision measures in practical situations;
c) determining reproducibility, repeatability and selected measures of intermediate precision;
d) improved determination of reproducibility and other measures of precision;
e) improving the estimate of the sample mean;
f) determining the range of in-house repeatability standard deviations;
g) determining other precision components such as operator variability;
h) determining the level of reliability of precision estimates;
i) reducing the minimum number of participating laboratories by optimizing the reliability of precision estimates;
j) avoiding distorted estimations of repeatability (split-level designs);
k) avoiding distorted estimations of reproducibility (taking the heterogeneity of the material into consideration).
Often, the performance of the method whose precision is being evaluated in a collaborative study will have previously been assessed in a single-laboratory validation study conducted by the laboratory which developed it. Relevant factors for the determination of intermediate precision will have been identified in this prior single-laboratory study.

General Information

Status: Published
Publication Date: 17-Jun-2024
Technical Committee:
Current Stage: 6060 - National Implementation/Publication (Adopted Project)
Start Date: 17-Jun-2024
Due Date: 22-Aug-2024
Completion Date: 18-Jun-2024

Relations

Standard
SIST ISO 5725-3:2024
English language
64 pages
Standard
ISO 5725-3:2023 - Accuracy (trueness and precision) of measurement methods and results — Part 3: Intermediate precision and alternative designs for collaborative studies Released:28. 06. 2023
English language
57 pages

Standards Content (Sample)


SLOVENIAN STANDARD
01-September-2024
Točnost (pravilnost in natančnost) merilnih metod in rezultatov – 3. del : Vmesne
mere natančnosti in alternativni pristopi za primerjalne študije
Accuracy (trueness and precision) of measurement methods and results — Part 3:
Intermediate precision and alternative designs for collaborative studies
Exactitude (justesse et fidélité) des résultats et méthodes de mesure — Partie 3: Fidélité
intermédiaire et plans alternatifs pour les études collaboratives
This Slovenian standard is identical to: ISO 5725-3:2023
ICS:
03.120.30 Application of statistical methods
17.020 Metrology and measurement in general
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.

INTERNATIONAL STANDARD ISO 5725-3
Second edition
2023-06
© ISO 2023
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols
5 General requirements
6 Intermediate measures of the precision of a standard measurement method
  6.1 Factors and factor levels
    6.1.1 Definitions and examples
    6.1.2 Selection of factors of interest
    6.1.3 Random and fixed effects
    6.1.4 Statistical model
  6.2 Within-laboratory study and analysis of intermediate precision measures
    6.2.1 Simplest approach
    6.2.2 Alternative method
    6.2.3 Effect of the measurement conditions on the final quoted result
7 Nested design
  7.1 Balanced fully-nested design
  7.2 Staggered-nested design
  7.3 Balanced partially-nested design
  7.4 Orthogonal array design
8 Design for heterogeneous material
  8.1 Applications of the design for a heterogeneous material
  8.2 Layout of the design for a heterogeneous material
  8.3 Statistical analysis
9 Split-level design
  9.1 Applications of the split-level design
  9.2 Layout of the split-level design
  9.3 Statistical analysis
10 Design across levels
  10.1 Applications of the design across levels
  10.2 Layout of the design across levels
  10.3 Statistical analysis
11 Reliability of interlaboratory parameters
  11.1 Reliability of precision estimates
  11.2 Reliability of estimates of the overall mean
    11.2.1 General
    11.2.2 Balanced fully-nested design (2 factors)
    11.2.3 Staggered nested design (2 factors)
    11.2.4 Balanced partially-nested design
    11.2.5 Orthogonal array design
    11.2.6 Split-level design
Annex A (informative) Fully- and partially-nested designs
Annex B (informative) Analysis of variance for balanced fully-nested design
Annex C (informative) Analysis of variance for staggered design
Annex D (informative) Analysis of variance for the balanced partially-nested design (three factors)
Annex E (informative) Statistical model for an experiment with heterogeneous material
Annex F (informative) Analysis of variance for split-level design
Annex G (informative) Example for split-level design
Annex H (informative) Design across levels
Annex I (informative) Restricted maximum likelihood (REML)
Annex J (informative) Examples of the statistical analysis of intermediate precision experiment
Annex K (informative) Example for an analysis across levels
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use
of (a) patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed
patent rights in respect thereof. As of the date of publication of this document, ISO had not received
notice of (a) patent(s) which may be required to implement this document. However, implementers are
cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents. ISO shall not be held responsible for identifying any or all
such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 69, Applications of statistical methods,
Subcommittee SC 6, Measurement methods and results.
This second edition cancels and replaces the first edition (ISO 5725-3:1994), which has been technically
revised. It also incorporates the Technical Corrigendum ISO 5725-3:1994/Cor.1:2001.
The main changes are as follows:
— Several additional experimental designs have been added to this version compared to the previous
version, some of them from ISO 5725-5. These are orthogonal array designs, split-level designs,
designs for heterogeneous sample material as well as designs across levels.
— Furthermore, the standard was supplemented by considerations on the selection of factors and
modelling of the factorial effects, as well as by a section in which the reliability of the various
interlaboratory test parameters (mean and precision parameters) is considered.
A list of all parts in the ISO 5725 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
0.1  ISO 5725 uses two terms, “trueness” and “precision”, to describe the accuracy of a measurement
method. “Trueness” refers to the degree of agreement between the average value of a large number
of test results and the true or accepted reference value. “Precision” refers to the degree of agreement
between test results.
0.2  General consideration of these quantities is given in ISO 5725-1 and is not repeated here. It is
stressed that ISO 5725-1 provides underlying definitions and general principles and should be read in
conjunction with all other parts of ISO 5725.
0.3  Many different factors (apart from test material heterogeneity) may contribute to the variability of
results from a measurement method, including:
a) the laboratory;
b) the operator;
c) the equipment used;
d) the calibration of the equipment;
e) the batch of a reagent;
f) the time elapsed between measurements;
g) environment (temperature, humidity, air pollution, etc.);
h) other factors.
0.4  Two conditions of precision, termed repeatability and reproducibility conditions, have been found
necessary and, for many practical cases, useful for describing the variability of a measurement method.
Under repeatability conditions, none of the factors a) to h) in 0.3 are considered to vary, while under
reproducibility conditions, all of the factors are considered to vary and contribute to the variability of
the test results. Thus, repeatability and reproducibility conditions are the two extremes of precision,
the first describing the minimum and the second the maximum variability in results. Intermediate
conditions between these two extreme conditions of precision are also conceivable, when one or more
of the factors listed in b) to g) are allowed to vary.
To illustrate the need for including a consideration of intermediate conditions in method validation,
consider the operation of a present-day laboratory connected with a production plant involving, for
example, a three-shift working system where measurements are made by different operators on
different equipment. Operators and equipment are then some of the factors that contribute to the
variability in the test results.
The standard deviation of test results obtained under repeatability conditions is generally less than
that obtained under intermediate precision conditions. Generally, in chemical analysis, the standard
deviation under intermediate precision conditions may be two or three times larger than that under
repeatability conditions. It should not, of course, exceed the reproducibility standard deviation.
As an example, in the determination of copper in copper ore, a collaborative study among 35 laboratories
revealed that the standard deviation under intermediate precision conditions (different times) was
1,5 times larger than that under repeatability conditions, both for the electrolytic gravimetry and
Na₂S₂O₃ titration methods.
0.5  This document focuses on intermediate precision and alternative designs for collaborative studies
of a measurement method. Apart from the determination of intermediate precision measures, the
aims of these alternative designs include reducing the number of required measurements, increasing
the reliability of the estimates for precision and overall mean and taking into account test material
heterogeneity.
Indeed, a $t$-factor fully-nested experiment with two levels per factor (inside each laboratory, there are
$t-1$ factors) and two replicates per setting requires $2 \cdot 2^{t-1}$ test results from each laboratory, which
can be an excessive requirement on the laboratories. For this reason, in the previous version of
ISO 5725-3, the staggered nested design is also discussed. While the estimation of the precision
parameters is more complex and subject to greater uncertainty in a staggered nested design, the
workload is reduced. This document offers alternative strategies to reduce the workload without
compromising the reliability of the precision estimates.
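The workload argument above can be made concrete with a short sketch. It only evaluates the count stated in the text (replicates times the number of settings inside one laboratory). The function name and the generalisation to other numbers of levels and replicates are illustrative assumptions, not part of the standard.

```python
def fully_nested_results_per_lab(t: int, levels: int = 2, replicates: int = 2) -> int:
    """Test results needed from each laboratory in a t-factor fully-nested design.

    The laboratory counts as one of the t factors, so t - 1 factors are nested
    inside each laboratory. With the defaults this is 2 * 2**(t - 1).
    """
    if t < 1:
        raise ValueError("t must be at least 1 (the laboratory factor itself)")
    settings = levels ** (t - 1)  # distinct settings inside one laboratory
    return replicates * settings


# Workload per laboratory as more factors are nested (illustrative range of t).
for t in range(2, 7):
    print(f"t = {t}: {fully_nested_results_per_lab(t)} test results per laboratory")
```

The count doubles with every additional nested factor, which is why designs with fewer runs per laboratory become attractive once more than two or three factors are studied.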
As far as the special designs for sample heterogeneity are concerned, they were discussed in the
previous version of ISO 5725-5. However, it is convenient to have one part of this standard dedicated to
the question of the design of experiments.
0.6  The repeatability precision as determined in accordance with ISO 5725-2 is computed as a mean
across participating laboratories. Whether it can be used for quality control purposes depends on
whether the repeatability standard deviation can be considered to remain constant across laboratories.
For this reason, it is important to obtain information on how the repeatability standard deviation varies
within and between the laboratories under different conditions.
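As a rough illustration of this point, the sketch below computes a pooled repeatability standard deviation alongside the individual within-laboratory standard deviations. It assumes the usual degrees-of-freedom-weighted pooling of within-laboratory variances. The function name and the data are made up for illustration and do not come from the standard.

```python
import math


def repeatability_summary(results_by_lab):
    """Return (pooled repeatability SD, dict of per-laboratory SDs).

    results_by_lab maps a laboratory label to its replicate test results,
    all obtained under repeatability conditions.
    """
    per_lab_sd = {}
    pooled_sum_sq = 0.0
    pooled_dof = 0
    for lab, values in results_by_lab.items():
        n = len(values)
        if n < 2:
            raise ValueError(f"{lab}: at least two replicates are needed")
        mean = sum(values) / n
        sum_sq = sum((v - mean) ** 2 for v in values)
        per_lab_sd[lab] = math.sqrt(sum_sq / (n - 1))
        pooled_sum_sq += sum_sq
        pooled_dof += n - 1
    return math.sqrt(pooled_sum_sq / pooled_dof), per_lab_sd


# Hypothetical replicate results from three laboratories.
data = {
    "lab1": [10.1, 10.3],
    "lab2": [9.8, 10.6],
    "lab3": [10.0, 10.1],
}
pooled_sd, lab_sds = repeatability_summary(data)
print("pooled repeatability SD:", round(pooled_sd, 3))
for lab, sd in lab_sds.items():
    print(lab, "within-laboratory SD:", round(sd, 3))
```

In this made-up data set one laboratory is clearly noisier than the others, so the pooled value describes none of the laboratories particularly well. This is the situation in which a single repeatability figure is of limited use for quality control.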
0.7  In many collaborative studies, the between-laboratory variability is large in comparison to the
repeatability, and it would be useful to a) decompose it into several different precision components, b)
reduce, if possible, some sources of variability which are due to the intermediate precision conditions.
This can be done by identifying factors (e.g. time, calibration, operator or equipment) which contribute
to the variability under intermediate precision conditions of measurement, by quantifying the
corresponding variability components and, wherever achievable, decreasing their contribution. In this
manner, the intermediate precision component of the overall variance is enlarged while the between-
laboratory component of the overall variance is reduced. Only random effects are considered: it is only
reasonable to model a factor as a fixed effect after a method or calibration optimization study has been
conducted. In this standard, different relationships between factors are taken into account, e.g. whether
a particular factor is subsumed under another factor or not.
0.8  Estimates for precision and overall mean are subject to random variability. Accordingly, it
is important to determine the uncertainty associated with each estimate, and to understand the
relationships between this uncertainty, the number of participants and the design. Once these
relationships are understood, it becomes possible to make much more informed decisions concerning
the number of participants and the experimental design.
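To give a feel for how the number of participants drives this uncertainty, the following sketch uses a common large-sample approximation: for normally distributed data, a standard deviation estimated with ν degrees of freedom has a relative standard error of roughly 1/√(2ν). This approximation is an assumption introduced here for illustration. It applies to a standard deviation taken from a single mean square (for example the repeatability standard deviation) and is not the reliability calculation given in Clause 11.

```python
import math


def approx_relative_se_of_sd(dof: int) -> float:
    """Approximate relative standard error of an SD estimate with `dof` degrees of freedom.

    Large-sample approximation for normal data: se(s) / sigma ~ 1 / sqrt(2 * dof).
    """
    if dof < 1:
        raise ValueError("dof must be at least 1")
    return 1.0 / math.sqrt(2.0 * dof)


# Repeatability SD from p laboratories with n replicates each: dof = p * (n - 1).
n = 2  # hypothetical number of replicates per laboratory
for p in (4, 8, 15, 30):
    dof = p * (n - 1)
    rel_se = approx_relative_se_of_sd(dof)
    print(f"p = {p:2d}: relative standard error of about {100 * rel_se:.0f} %")
```

Under this approximation, halving the uncertainty requires roughly four times as many degrees of freedom, which is why the choice of design matters as much as the number of laboratories.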
0.9  Provided different factorial effects do contribute to the variability, determining the respective
precision components may make it possible to reduce the required number of participating laboratories,
since the between-laboratory variability can be expected to be less dominant. However, it is highly
recommended to have a reasonable number of participating laboratories in order to ensure a realistic
assessment of the overall method variability obtained under routine conditions of operation.
0.10  In the uniform-level design according to part 2 of this standard, there is a risk that an operator will
allow the result of a measurement on one sample to influence the result of a subsequent measurement
on another sample of the same material, causing the estimates of the repeatability and reproducibility
standard deviations to be biased. When this risk is considered to be serious, the split-level design
described in this document may be preferred as it reduces this risk. Care should be taken that the
two materials used at a particular level of the experiment are sufficiently similar to ensure that the
same precision measures can be expected (in other words: the question arises whether the precision
component associated with a particular factor remains unchanged across a range of similar matrices).
0.11  The experimental design presented in ISO 5725-2 requires the preparation of a number of
identical samples of the material for use in the experiment. With heterogeneous materials this may not
be possible, so that the use of the basic method then gives estimates of the reproducibility standard
deviation that are inflated by the variation between the samples. The design for a heterogeneous
material given in this document yields information about the variability between samples which is not
obtainable from the basic method; it may be used to calculate an estimate of reproducibility from which
the between-sample variation has been removed.
INTERNATIONAL STANDARD ISO 5725-3:2023(E)
Accuracy (trueness and precision) of measurement
methods and results —
Part 3:
Intermediate precision and alternative designs for
collaborative studies
1 Scope
This document provides
a) a discussion of alternative experimental designs for the determination of trueness and precision
measures including reproducibility, repeatability and selected measures of intermediate precision
of a standard measurement method, including a review of the circumstances in which their use
is necessary or beneficial, and guidance as to the interpretation and application of the resulting
estimates, and
b) worked examples including specific designs and computations.
Each of the alternative designs discussed in this document is intended to address one (or several) of the
following issues:
a) a discussion of the implications of the definitions of intermediate precision measures;
b) guidance on the interpretation and application of the estimates of intermediate precision
measures in practical situations;
c) determining reproducibility, repeatability and selected measures of intermediate precision;
d) improved determination of reproducibility and other measures of precision (see footnote 1);
e) improving the estimate of the sample mean;
f) determining the range of in-house repeatability standard deviations;
g) determining other precision components such as operator variability;
h) determining the level of reliability of precision estimates;
i) reducing the minimum number of participating laboratories by optimizing the reliability of
precision estimates;
j) avoiding distorted estimations of repeatability (split-level designs);
k) avoiding distorted estimations of reproducibility (taking the heterogeneity of the material into
consideration).
Often, the performance of the method whose precision is being evaluated in a collaborative study will
have previously been assessed in a single-laboratory validation study conducted by the laboratory
which developed it. Relevant factors for the determination of intermediate precision will have been
identified in this prior single-laboratory study.
1) Allowing a reduction in the number of laboratories.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 3534-1, Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms used in
probability
ISO 3534-2, Statistics — Vocabulary and symbols — Part 2: Applied statistics
ISO 5725-1, Accuracy (trueness and precision) of measurement methods and results — Part 1: General
principles and definitions
ISO Guide 33, Reference materials — Good practice in using reference materials
ISO Guide 35, Reference materials — Guidance for characterization and assessment of homogeneity and
stability
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 3534-1, ISO 3534-2 and
ISO 5725-1 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
block
group of settings (3.7) conducted in parallel or within a short time interval, and with the same samples
EXAMPLE Two settings:
Operator 1 + Calibration 1 + Equipment 1 + Batch 1
and
Operator 1 + Calibration 2 + Equipment 2 + Batch 1
Note 1 to entry: This definition is more specific than the general definition given in ISO 3534-3:2013, 3.1.25,
where block is defined as a collection of experimental units.
3.2
factor
feature under examination as a potential source of variation
EXAMPLE Operator, calibration, equipment, day, reagent batch, storage temperature, shaker orbit, shaker
frequency.
Note 1 to entry: Strictly speaking, the factor laboratory is a factor just like any other. However, since the ISO 5725
standard focuses on method validation by means of interlaboratory studies, the factor laboratory can be
considered to have a somewhat privileged role. The following characteristics distinguish it from other factors:
— The factor laboratory is indispensable: For each measurement, the name of the particular laboratory where
it was performed will always be provided in a collaborative study.
— The factor laboratory will almost always have more levels than other factors.
It should also be noted that categories such as measurand, sample/matrix and level may also be
considered to be factors. However, in collaborative studies, they are often not taken into account
as such in the factorial design. The reason is that, for these factors, one is interested in a separate
statistical analysis for each separate factor level. In other words, one is interested in obtaining separate
precision measures for each particular measurand or concentration level, not across measurands or
concentration levels. However, in cases where it is required to quantify precision across, say, matrices,
then the factor sample/matrix should also be included in the design. Accordingly, in this document,
designs are discussed to be applied for a particular measurand or concentration level by different
laboratories all applying the same measurement procedure.
[SOURCE: ISO 3534-3:2013, 3.1.5, modified — Note 1 to entry was modified and Note 2 to entry was
deleted.]
3.3
factor level
setting (3.7), value or assignment of a factor (3.2)
EXAMPLE Operator 1, Operator 2
Note 1 to entry: In many designs, the majority of factors will be varied across two levels.
3.4
fully-nested design
nested design, where there is a nesting hierarchy for every pair of factors (3.2)
EXAMPLE There are 2 operators in each laboratory, and each operator performs 2 calibrations, i.e., the
study includes 2 operators and 4 calibrations for each laboratory.
3.5
partially-nested design
nested design where one factor (3.2) (the factor laboratory) is ranked higher than all other factors (i.e.,
all other factors are nested within the factor laboratory), and there is at least one factor pair without a
nesting hierarchy
EXAMPLE There are 2 operators and 2 instruments in each laboratory, and each operator performs
measurements on 2 instruments, i.e., the study includes 2 operators and 2 instruments for each laboratory.
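The difference between the two examples in 3.4 and 3.5 can be seen by listing the settings inside one laboratory. The sketch below is illustrative only. The function names and labels are assumptions, and only the counts of operators, calibrations and instruments are taken from the examples.

```python
from itertools import product


def fully_nested_settings(n_operators=2, calibrations_per_operator=2):
    """Each operator performs their own calibrations: calibration is nested in operator."""
    settings = []
    calibration_id = 0
    for op in range(1, n_operators + 1):
        for _ in range(calibrations_per_operator):
            calibration_id += 1  # a calibration belongs to exactly one operator
            settings.append((f"Operator {op}", f"Calibration {calibration_id}"))
    return settings


def partially_nested_settings(n_operators=2, n_instruments=2):
    """Every operator uses every instrument: no nesting between these two factors."""
    return [
        (f"Operator {op}", f"Instrument {inst}")
        for op, inst in product(range(1, n_operators + 1), range(1, n_instruments + 1))
    ]


for name, settings in (
    ("fully nested", fully_nested_settings()),
    ("partially nested", partially_nested_settings()),
):
    print(name)
    for setting in settings:
        print("  ", " + ".join(setting))
```

Both layouts give four settings per laboratory. The fully-nested one involves 2 operators and 4 distinct calibrations, while the partially-nested one involves 2 operators and 2 instruments that are shared between the operators.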
3.6
run
actual measurement carried out for a particular setting (3.7) and for a particular laboratory
EXAMPLE Operator 1 + Equipment 1 + Batch 1 + Day 1 carried out in laboratory 1
Note 1 to entry: This definition is more specific than the general definition given in ISO 3534-3 (3.1.13), where
run is defined as specific settings of every factor used on a particular experimental unit.
Note 2 to entry: “Identical” runs are called replicates, whereby “identical” means that the different time points are
close enough to each other to allow for the results to be considered as obtained under repeatability conditions.
3.7
setting
combination of factor levels (3.3), for all factors (3.2) except the factor laboratory
EXAMPLE Operator 1 + Equipment 1 + Batch 1 + Day 1.
4 Symbols
$B$    Component in a test result representing the deviation of a laboratory from the general average (laboratory component of bias)
$B_0$    Component of $B$ representing all factors that do not vary under intermediate precision conditions (laboratory bias per se)
$B_{(1)}, B_{(2)}$, etc.    Components of $B$ representing factors that vary under intermediate precision conditions
$e$    Component representing the random error occurring in every test result, corresponding to the analytical, repeatability, model or residual error
$m$    Overall mean of the measurand or test property for a particular matrix; level
$\hat{m}$    Estimate of the overall mean
$n$    Number of replicate test results obtained in one laboratory at one level for one setting
$p$    Number of laboratories participating in the collaborative study
$q$    Number of levels of the test property in the collaborative study
$\sigma_w$    Within-laboratory standard deviation of the residual term $e$
$\sigma_r$    Repeatability standard deviation
$\sigma_R$    Reproducibility standard deviation
$\sigma_0$    Standard deviation corresponding to factor $B_0$
$\sigma_{(1)}$    Standard deviation corresponding to factor $B_{(1)}$
$\sigma_{(2)}$    Standard deviation corresponding to factor $B_{(2)}$
$\sigma_A$    Standard deviation corresponding to factor $A$
$\sigma_{\mathrm{Interaction}}$    Standard deviation corresponding to the interaction of two factors
$\sigma_{AB}$    Standard deviation corresponding to the interaction of the two factors $A$ and $B$
$s$    Estimate of a standard deviation
$se$    Standard error
$\mathrm{Var}(X)$    Variance of $X$
$w$    Range of a set of test results
$y$    Test result
$\bar{X}$    Mean of $X$
$|X|$    Absolute value of $X$
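Read together, these symbols suggest an additive model in which a test result is the overall mean plus a laboratory component and a residual error. The block below is a sketch under the assumption that all components are independent random effects with zero mean. The formal statistical model is the subject of 6.1.4, which is not reproduced in this excerpt.

```latex
\begin{align}
  y &= m + B + e, \\
  B &= B_0 + B_{(1)} + B_{(2)} + \dots, \\
  \mathrm{Var}(y) &= \sigma_0^2 + \sigma_{(1)}^2 + \sigma_{(2)}^2 + \dots + \sigma_w^2 .
\end{align}
```

If the within-laboratory variance $\sigma_w^2$ is taken to be the same in all laboratories, it plays the role of the repeatability variance $\sigma_r^2$, and the full sum corresponds to the reproducibility variance $\sigma_R^2$. An intermediate precision variance is then obtained by keeping only the terms for the factors that are allowed to vary, together with the residual term.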
5 General requirements
In order to ensure that measurements are carried out in the same way, the measurement method shall
have been standardized. All measurements obtained in the framework of an experiment within a
specific laboratory or of a collaborative study shall be carried out according to that standard.
NOTE The terms collaborative experiment, collaborative trial and interlaboratory experiment are
used interchangeably to denote a collaborative study conducted in order to characterize and/or assess the
performance of a measurement method.
6 Intermediate measures of the precision of a standard measurement method
6.1 Factors and factor levels
6.1.1 Definitions and examples
In this document, the term factor denotes an identifiable and quantifiable source of variability such
as time, calibration, operator or equipment (see 3.2). In order to investigate a factor’s contribution to
variability, it is necessary to conduct measurements under different conditions or states. For instance,
measurements shall be carried out with different pieces of equipment, or with different operators. The
different states associated with a particular factor are called factor levels (see 3.3). Table 1 provides
typical examples of factors and their factor levels.
Table 1 — Examples of factors
| Factor | Description/example of the different factor levels | Comments |
|---|---|---|
| Laboratory | The different participating laboratories, typically between 4 and 15 different laboratories. | Some of the special designs presented in this document allow reliable precision estimates with as few as 4 participating laboratories. |
| Point in time | Two different time points (e.g. different days, different weeks, etc.) | Differences between “measurements made at different times”, i.e. separated by a relatively long time interval (as compared with the repeatability interval) will reflect effects which correspond to uncontrolled changes in environmental conditions as well as other “controlled” sources of variability such as the use of different reagent batches, etc. |
| Calibration | Before and after instrument is sent to the manufacturer for a recalibration | Calibration does not refer here to any calibration required as an integral part of obtaining a test result by the measurement method. It refers to the calibration process that takes place at regular intervals between groups of measurements within a laboratory. |
| Operator | The different technicians working in the laboratory | In some circumstances, the operator may be, in fact, a team of operators, each of whom performs some specific part of the procedure. In such a case, the team should be regarded as the operator, and any change in membership or in the allotment of duties within the team should be regarded as constituting a different operator. |
| Equipment | Two different pieces of equipment | Equipment is often a set of equipment, and any change in any significant component should be regarded as constituting different equipment. As to what constitutes a significant component, common sense must prevail (e.g. different burettes/pipettes, thermometers, pH meters, centrifuges, shaker orbits or frequencies). |
| Consumables (buffer solutions, reagents, calibrators, cartridges) | Different batches or producers | A change of a batch of a reagent should be considered a significant component. It can lead to different equipment or to a recalibration if such a change is followed by calibration. |
NOTE 1 In practice, it may not be possible to consider factors in isolation from one another; this is due to a characteristic
of experimental designs called confounding. In theory, it should always be possible to disentangle the effects of different
factors by additional testing. For instance, if Operator 1 always carried out tests with Equipment 1 (e.g. HPLC system 1) and
Operator 2 with Equipment 2, then it would be possible to tell the effects of the two factors Operator and Equipment apart
by adding further runs for Operator 1 with Equipment 2 and for Operator 2 with Equipment 1.
NOTE 2 Further effects called interaction effects are not explicitly considered here. However, some interaction effects are
implicitly taken into consideration. For instance, the effect of skill or fatigue of an operator may be considered to be the
interaction of operator and time. Similarly, the performance of a piece of equipment may be different at the time it is first
turned on and after many hours of use: this is an example of interaction between equipment and time.
NOTE 3 In ISO 5725-2, the factor laboratory is implicitly included in the analysis.
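The confounding described in NOTE 1 can be illustrated with a small sketch. It is a simplified view, assuming that each factor contributes a single main-effect column to a design matrix; it is not the random-effects analysis used in this document. The function names `matrix_rank` and `design_matrix` and the run lists are hypothetical. When Operator and Equipment always change together, their columns are identical and the rank of the design matrix falls short of the number of columns; adding the two crossed runs restores full rank.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(runs):
    """One row per run: intercept, Operator indicator, Equipment indicator."""
    return [[1, int(op == 2), int(eq == 2)] for op, eq in runs]


# Operator 1 always uses Equipment 1, Operator 2 always uses Equipment 2.
confounded = [(1, 1), (1, 1), (2, 2), (2, 2)]
# Additional runs: Operator 1 with Equipment 2, Operator 2 with Equipment 1.
extended = confounded + [(1, 2), (2, 1)]

rank_confounded = matrix_rank(design_matrix(confounded))
rank_extended = matrix_rank(design_matrix(extended))
print(rank_confounded, rank_extended)
```

A rank below 3 means that the Operator and Equipment columns cannot be told apart from the data; a rank of 3 means that both effects are separable.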
6.1.2 Selection of factors of interest
In the standard for a measurement method, the repeatability and reproducibility standard deviations
should always be specified, but it is not necessary (or even feasible) to state all possible intermediate
precision measures. The selection of relevant factors is informed by experience and an understanding
of the relevant physical, chemical or microbiological processes.
Practical considerations in most laboratories, such as the desired precision of the final quoted result
and the cost of performing the measurements, will govern the number and choice of factors taken into
consideration in the standardization of the measurement method.
Finally, the choice of factors to include in the design should reflect concerns with uncontrollable
variations between the laboratories.
It will often be sufficient to specify only one suitable intermediate precision measure, together with
a detailed stipulation of the specific measurement conditions associated with it. The factors should
be carefully defined; in particular, for the intermediate precision associated with the factor Time, a
practical mean time interval between successive measurements should be specified.
It is assumed that, in the case of a standardized measurement method, the bias inherent in the method
itself will have been corrected by technical means. For this reason, this document only addresses the
bias arising in connection with different measurement conditions.
6.1.3 Random and fixed effects
This subclause provides a discussion of the question why, in this document, factors are modelled as
random rather than as fixed effects.
The term fixed effect is used to describe a contribution to the deviation from the overall mean or true
value whose direction and magnitude is predictable and can thus be determined. Say, for example, that
measurements always lie below the true value with equipment 1 or reagent supplier 1 and above the
true value with equipment 2 or reagent supplier 2. Then it would be appropriate to model the factor
Equipment or Reagent supplier as a fixed effect.
On the other hand, the term random effect is used to describe a contribution to the deviation from the
overall mean or true value whose direction varies – and thus cannot be determined. In such cases, the
only quantity of interest is the magnitude of the contribution (independently of its direction) often
described in terms of a standard deviation.
NOTE A factor is modelled as a fixed effect if the specific factor levels included in the experiment are of
interest in and of themselves. On the other hand, if the aim is to characterize the variability associated with
the underlying population from which the factor levels were selected, the factor is modelled as a random effect.
In this document, it is usually the variability of the underlying population which is of interest, rather than the
individual factor levels included in the experiment – this is the rationale for modelling factors as random.
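The distinction can be written out as a short sketch. The notation below is an assumption made for illustration: $\alpha_k$ denotes the offset of factor level $k$ in the fixed-effects case, and $A$ denotes a random factor with standard deviation $\sigma_A$.

```latex
% Fixed effect: the K level offsets are unknown constants that can be estimated
y = m + \alpha_k + e, \qquad k = 1, \dots, K

% Random effect: the level offset is a draw from a population with zero mean;
% only its standard deviation \sigma_A is of interest
y = m + A + e, \qquad \mathrm{E}(A) = 0, \qquad \mathrm{Var}(A) = \sigma_A^2
```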
The rationale for modelling factors as random rather than as fixed effects is now illustrated on the
basis of several examples.
Table 2 — Rationale for modelling factors as random rather than as fixed effects
| Factor | Discussion |
|---|---|
| Operator | Effects due to differences between operators include personal habits in operating measurement methods, e.g. in reading graduations on scales, etc. Thus, even though there is a bias in the test results obtained by an individual operator, this bias is not always constant. The magnitude of such a bias should be reduced by use of a clear operation manual and training. Under such circumstances, the effect of changing operators can be considered to be of a random nature. |
| Equipment | Effects due to different equipment include the effects due to different places of installation, particularly in fluctuations of the indicator, etc. Systematic differences should be corrected by calibration and such a procedure should be included in the standard method (e.g. a change in the batch of a reagent). An accepted reference value is needed for this, for which ISO Guide 33 and ISO Guide 35 shall be consulted. Remaining equipment effects are considered random. |
| Time | Effects due to time may be caused by environmental differences, such as changes in room temperature, humidity, etc. Standardization of environmental conditions should be attempted to minimize these effects. Clearly, achieving an ideal degree of standardization would make it appropriate to model the factor Time as a fixed effect. However, it is more realistic to model this factor in terms of random effects. |
6.1.4 Statistical model
6.1.4.1 Basic model
For the reader’s convenience and ease of reference, the basic model described in ISO 5725-1 is
reproduced here. For estimating the accuracy (trueness and precision) of a measurement method, it is
useful to assume that every test result y is the sum of three components given by Formula (1):
$y = m + B + e$ (1)
where, for the particular material tested
m is the overall mean (expectation);
B is the laboratory component of bias under repeatability conditions;
e is the random error occurring in every measurement under repeatability conditions.
For a general discussion of these components, the reader is referred to ISO 5725-1, 5.1.
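As a minimal sketch of how the basic model is used, the following Python code estimates the overall mean and the variances of $e$ and $B$ from a balanced data set by a one-way analysis of variance. The function name, the data values and the name `var_L` (between-laboratory variance, i.e. the variance of $B$) are illustrative assumptions and are not taken from this document.

```python
from statistics import mean


def one_way_variance_components(results):
    """Estimate m, Var(e), Var(B) and their sum from balanced data.

    results: one list of replicate test results per laboratory,
    all lists having the same length n >= 2.
    """
    p = len(results)
    n = len(results[0])
    if p < 2 or n < 2 or any(len(lab) != n for lab in results):
        raise ValueError("balanced data with p >= 2 and n >= 2 required")

    lab_means = [mean(lab) for lab in results]
    grand_mean = mean(lab_means)

    # Mean square within laboratories: estimates Var(e).
    ss_within = sum(
        (y - lab_mean) ** 2
        for lab, lab_mean in zip(results, lab_means)
        for y in lab
    )
    ms_within = ss_within / (p * (n - 1))

    # Mean square between laboratories: estimates Var(e) + n * Var(B).
    ms_between = n * sum((lm - grand_mean) ** 2 for lm in lab_means) / (p - 1)

    var_r = ms_within
    var_L = max(0.0, (ms_between - ms_within) / n)  # truncated at zero
    var_R = var_L + var_r
    return grand_mean, var_r, var_L, var_R


# Hypothetical data: 3 laboratories, 2 replicates each.
data = [[10.0, 10.2], [10.5, 10.7], [9.8, 10.0]]
m_hat, var_r, var_L, var_R = one_way_variance_components(data)
print(m_hat, var_r ** 0.5, var_R ** 0.5)
```

The square roots of `var_r` and `var_R` correspond to estimates of the repeatability and reproducibility standard deviations under the assumptions stated above.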
NOTE 1 Depending on the context, m denotes either the theoretical (unknown) overall mean or its estimate.
It is possible to use different symbols (e.g. $m$ versus $\hat{m}$) in order to distinguish between a theoretical quantity
and its estimate. However, this type of notational nuance seems unnecessary in this document. The same holds
for the other symbols used to denote quantities which are to be estimated – though the symbol σ will be
reserved for theoretical standard deviations and s for their estimates. The reader is referred to ISO 5725-1 for a
discussion of this issue.
NOTE 2 In ISO 5725-4, the bias is further decomposed into two parts: method bias and l
...


INTERNATIONAL STANDARD ISO 5725-3
Second edition
2023-06
Accuracy (trueness and precision) of measurement methods and results — Part 3: Intermediate precision and alternative designs for collaborative studies
Exactitude (justesse et fidélité) des résultats et méthodes de mesure — Partie 3: Fidélité intermédiaire et plans alternatifs pour les études collaboratives
Reference number
© ISO 2023
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols
5 General requirements
6 Intermediate measures of the precision of a standard measurement method
6.1 Factors and factor levels
6.1.1 Definitions and examples
6.1.2 Selection of factors of interest
6.1.3 Random and fixed effects
6.1.4 Statistical model
6.2 Within-laboratory study and analysis of intermediate precision measures
6.2.1 Simplest approach
6.2.2 Alternative method
6.2.3 Effect of the measurement conditions on the final quoted result
7 Nested design
7.1 Balanced fully-nested design
7.2 Staggered-nested design
7.3 Balanced partially-nested design
7.4 Orthogonal array design
8 Design for heterogeneous material
8.1 Applications of the design for a heterogeneous material
8.2 Layout of the design for a heterogeneous material
8.3 Statistical analysis
9 Split-level design
9.1 Applications of the split-level design
9.2 Layout of the split-level design
9.3 Statistical analysis
10 Design across levels
10.1 Applications of the design across levels
10.2 Layout of the design across levels
10.3 Statistical analysis
11 Reliability of interlaboratory parameters
11.1 Reliability of precision estimates
11.2 Reliability of estimates of the overall mean
11.2.1 General
11.2.2 Balanced fully-nested design (2 factors)
11.2.3 Staggered nested design (2 factors)
11.2.4 Balanced partially-nested design
11.2.5 Orthogonal array design
11.2.6 Split-level design
Annex A (informative) Fully- and partially-nested designs
Annex B (informative) Analysis of variance for balanced fully-nested design
Annex C (informative) Analysis of variance for staggered design
Annex D (informative) Analysis of variance for the balanced partially-nested design (three factors)
Annex E (informative) Statistical model for an experiment with heterogeneous material
Annex F (informative) Analysis of variance for split-level design
Annex G (informative) Example for split-level design
Annex H (informative) Design across levels
Annex I (informative) Restricted maximum likelihood (REML)
Annex J (informative) Examples of the statistical analysis of intermediate precision experiment
Annex K (informative) Example for an analysis across levels
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use
of (a) patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed
patent rights in respect thereof. As of the date of publication of this document, ISO had not received
notice of (a) patent(s) which may be required to implement this document. However, implementers are
cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents. ISO shall not be held responsible for identifying any or all
such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 69, Applications of statistical methods,
Subcommittee SC 6, Measurement methods and results.
This second edition cancels and replaces the first edition (ISO 5725-3:1994), which has been technically
revised. It also incorporates the Technical Corrigendum ISO 5725-3:1994/Cor.1:2001.
The main changes are as follows:
— Several additional experimental designs have been added to this version compared to the previous
version, some of them from ISO 5725-5. These are orthogonal array designs, split level designs,
designs for heterogeneous sample material as well as designs across levels.
— Furthermore, the standard was supplemented by considerations on the selection of factors and
modelling of the factorial effects, as well as by a section in which the reliability of the various
interlaboratory test parameters (mean and precision parameters) is considered.
A list of all parts in the ISO 5725 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
0.1  ISO 5725 uses two terms, “trueness” and “precision”, to describe the accuracy of a measurement
method. “Trueness” refers to the degree of agreement between the average value of a large number
of test results and the true or accepted reference value. “Precision” refers to the degree of agreement
between test results.
0.2  General consideration of these quantities is given in ISO 5725-1 and is not repeated here. It is
stressed that ISO 5725-1 provides underlying definitions and general principles and should be read in
conjunction with all other parts of ISO 5725.
0.3  Many different factors (apart from test material heterogeneity) may contribute to the variability of
results from a measurement method, including:
a) the laboratory;
b) the operator;
c) the equipment used;
d) the calibration of the equipment;
e) the batch of a reagent;
f) the time elapsed between measurements;
g) environment (temperature, humidity, air pollution, etc.);
h) other factors.
0.4  Two conditions of precision, termed repeatability and reproducibility conditions, have been found
necessary and, for many practical cases, useful for describing the variability of a measurement method.
Under repeatability conditions, none of the factors a) to h) in 0.3 are considered to vary, while under
reproducibility conditions, all of the factors are considered to vary and contribute to the variability of
the test results. Thus, repeatability and reproducibility conditions are the two extremes of precision,
the first describing the minimum and the second the maximum variability in results. Intermediate
conditions between these two extreme conditions of precision are also conceivable, when one or more
of the factors listed in b) to g) are allowed to vary.
To illustrate the need for including a consideration of intermediate conditions in method validation,
consider the operation of a present-day laboratory connected with a production plant involving, for
example, a three-shift working system where measurements are made by different operators on
different equipment. Operators and equipment are then some of the factors that contribute to the
variability in the test results.
The standard deviation of test results obtained under repeatability conditions is generally less than
that obtained under intermediate precision conditions. Generally, in chemical analysis, the standard
deviation under intermediate precision conditions may be two or three times larger than that under
repeatability conditions. It should not, of course, exceed the reproducibility standard deviation.
As an example, in the determination of copper in copper ore, a collaborative study among 35 laboratories
revealed that the standard deviation under intermediate precision conditions (different times) was
1,5 times larger than that under repeatability conditions, both for the electrolytic gravimetry and
Na₂S₂O₃ titration methods.
0.5  This document focuses on intermediate precision and alternative designs for collaborative studies
of a measurement method. Apart from the determination of intermediate precision measures, the
aims of these alternative designs include reducing the number of required measurements, increasing
the reliability of the estimates for precision and overall mean and taking into account test material
heterogeneity.
Indeed, a $t$-factor fully-nested experiment with two levels per factor (inside each laboratory, there are
$t-1$ factors) and two replicates per setting requires $2 \cdot 2^{t-1}$ test results from each laboratory, which
can be an excessive requirement on the laboratories. For this reason, in the previous version of
ISO 5725-3, the staggered nested design is also discussed. While the estimation of the precision
parameters is more complex and subject to greater uncertainty in a staggered nested design, the
workload is reduced. This document offers alternative strategies to reduce the workload without
compromising the reliability of the precision estimates.
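The growth of the workload with the number of factors can be checked with a short calculation. The sketch below assumes a fully-nested design in which the factor laboratory is one of the t factors; the function name and its parameters are hypothetical.

```python
def results_per_laboratory(t, levels=2, replicates=2):
    """Test results required from each laboratory in a t-factor
    fully-nested design (the factor laboratory counts as one of the t)."""
    if t < 1:
        raise ValueError("t must be at least 1")
    # t - 1 factors are nested inside each laboratory.
    return replicates * levels ** (t - 1)


for t in range(2, 6):
    print(t, results_per_laboratory(t))
```

With two levels per factor and two replicates per setting, each additional factor doubles the number of test results required from every laboratory.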
As far as the special designs for sample heterogeneity are concerned, they were discussed in the
previous version of ISO 5725-5. However, it is convenient to have one part of this standard dedicated to
the question of the design of experiments.
0.6  The repeatability precision as determined in accordance with ISO 5725-2 is computed as a mean
across participating laboratories. Whether it can be used for quality control purposes depends on
whether the repeatability standard deviation can be considered to remain constant across laboratories.
For this reason, it is important to obtain information on how the repeatability standard deviation varies
within and between the laboratories under different conditions.
0.7  In many collaborative studies, the between-laboratory variability is large in comparison to the
repeatability, and it would be useful to a) decompose it into several different precision components, b)
reduce, if possible, some sources of variability which are due to the intermediate precision conditions.
This can be done by identifying factors (e.g. time, calibration, operator or equipment) which contribute
to the variability under intermediate precision conditions of measurement, by quantifying the
corresponding variability components and, wherever achievable, decreasing their contribution. In this
manner, the intermediate precision component of the overall variance is enlarged while the between-
laboratory component of the overall variance is reduced. Only random effects are considered: it is only
reasonable to model a factor as a fixed effect after a method or calibration optimization study has been
conducted. In this standard, different relationships between factors are taken into account, e.g. whether
a particular factor is subsumed under another factor or not.
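The decomposition described in 0.7 can be sketched numerically. The example assumes that the variability components are independent, so that their variances add; the numerical values and the names `sigma_r`, `sigma_time` and `sigma_0` are invented for illustration and are not taken from any study.

```python
import math


def combined_sd(*components):
    """Standard deviation of a sum of independent components."""
    return math.sqrt(sum(s * s for s in components))


sigma_r = 3.0      # repeatability standard deviation (assumed value)
sigma_time = 4.0   # component for a factor varied within the laboratory, e.g. time
sigma_0 = 12.0     # remaining between-laboratory component

# Intermediate precision: repeatability plus the within-laboratory factor.
sd_intermediate = combined_sd(sigma_r, sigma_time)
# Reproducibility: all components contribute.
sd_reproducibility = combined_sd(sigma_r, sigma_time, sigma_0)
print(sd_intermediate, sd_reproducibility)
```

In this illustration, the intermediate precision standard deviation lies between the repeatability and reproducibility standard deviations, as expected.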
0.8  Estimates for precision and overall mean are subject to random variability. Accordingly, it
is important to determine the uncertainty associated with each estimate, and to understand the
relationships between this uncertainty, the number of participants and the design. Once these
relationships are understood, it becomes possible to make much more informed decisions concerning
the number of participants and the experimental design.
0.9  Provided different factorial effects do contribute to the variability, determining the respective
precision components may make it possible to reduce the required number of participating laboratories,
since the between-laboratory variability can be expected to be less dominant. However, it is highly
recommended to have a reasonable number of participating laboratories in order to ensure a realistic
assessment of the overall method variability obtained under routine conditions of operation.
0.10  In the uniform-level design according to part 2 of this standard, there is a risk that an operator will
allow the result of a measurement on one sample to influence the result of a subsequent measurement
on another sample of the same material, causing the estimates of the repeatability and reproducibility
standard deviations to be biased. When this risk is considered to be serious, the split-level design
described in this document may be preferred as it reduces this risk. Care should be taken that the
two materials used at a particular level of the experiment are sufficiently similar to ensure that the
same precision measures can be expected (in other words: the question arises whether the precision
component associated with a particular factor remains unchanged across a range of similar matrices).
0.11  The experimental design presented in ISO 5725-2 requires the preparation of a number of
identical samples of the material for use in the experiment. With heterogeneous materials this may not
be possible, so that the use of the basic method then gives estimates of the reproducibility standard
deviation that are inflated by the variation between the samples. The design for a heterogeneous
material given in this document yields information about the variability between samples which is not
obtainable from the basic method; it may be used to calculate an estimate of reproducibility from which
the between-sample variation has been removed.
INTERNATIONAL STANDARD ISO 5725-3:2023(E)
1 Scope
This document provides
a) a discussion of alternative experimental designs for the determination of trueness and precision
measures including reproducibility, repeatability and selected measures of intermediate precision
of a standard measurement method, including a review of the circumstances in which their use
is necessary or beneficial, and guidance as to the interpretation and application of the resulting
estimates, and
b) worked examples including specific designs and computations.
Each of the alternative designs discussed in this document is intended to address one (or several) of the
following issues:
a) a discussion of the implications of the definitions of intermediate precision measures;
b) a guidance on the interpretation and application of the estimates of intermediate precision
measures in practical situations;
c) determining reproducibility, repeatability and selected measures of intermediate precision;
d) improved determination of reproducibility and other measures of precision¹⁾;
e) improving the estimate of the sample mean;
f) determining the range of in-house repeatability standard deviations;
g) determining other precision components such as operator variability;
h) determining the level of reliability of precision estimates;
i) reducing the minimum number of participating laboratories by optimizing the reliability of
precision estimates;
j) avoiding distorted estimations of repeatability (split-level designs);
k) avoiding distorted estimations of reproducibility (taking the heterogeneity of the material into
consideration).
Often, the performance of the method whose precision is being evaluated in a collaborative study will
have previously been assessed in a single-laboratory validation study conducted by the laboratory
which developed it. Relevant factors for the determination of intermediate precision will have been
identified in this prior single-laboratory study.
1) Allowing a reduction in the number of laboratories.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 3534-1, Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms used in
probability
ISO 3534-2, Statistics — Vocabulary and symbols — Part 2: Applied statistics
ISO 5725-1, Accuracy (trueness and precision) of measurement methods and results — Part 1: General
principles and definitions
ISO Guide 33, Reference materials — Good practice in using reference materials
ISO Guide 35, Reference materials — Guidance for characterization and assessment of homogeneity and
stability
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 3534-1, ISO 3534-2 and
ISO 5725-1 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
block
group of settings (3.7) conducted in parallel or within a short time interval, and with the same samples
EXAMPLE Two settings:
Operator 1 + Calibration 1 + Equipment 1 + Batch 1
and
Operator 1 + Calibration 2 + Equipment 2 + Batch 1
Note 1 to entry: This definition is more specific than the general definition given in ISO 3534-3:2013, 3.1.25,
where block is defined as a collection of experimental units.
3.2
factor
feature under examination as a potential source of variation
EXAMPLE Operator, calibration, equipment, day, reagent batch, storage temperature, shaker orbit, shaker
frequency.
Note 1 to entry: Strictly speaking, the factor laboratory is a factor just like any other. However, since the ISO 5725
standard focuses on method validation by means of interlaboratory studies, the factor laboratory can be
considered to have a somewhat privileged role. The following characteristics distinguish it from other factors:
— The factor laboratory is indispensable: For each measurement, the name of the particular laboratory where
it was performed will always be provided in a collaborative study.
— The factor laboratory will almost always have more levels than other factors.
It should also be noted that categories such as measurand, sample/matrix and level may also be
considered to be factors. However, in collaborative studies, they are often not taken into account
as such in the factorial design. The reason is that, for these factors, one is interested in a separate
statistical analysis for each separate factor level. In other words, one is interested in obtaining separate
precision measures for each particular measurand or concentration level, not across measurands or
concentration levels. However, in cases where it is required to quantify precision across, say, matrices,
then the factor sample/matrix should also be included in the design. Accordingly, in this document,
designs are discussed to be applied for a particular measurand or concentration level by different
laboratories all applying the same measurement procedure.
[SOURCE: ISO 3534-3:2013, 3.1.5, modified — Note 1 to entry was modified and Note 2 to entry was
deleted.]
3.3
factor level
setting (3.7), value or assignment of a factor (3.2)
EXAMPLE Operator 1, Operator 2
Note 1 to entry: In many designs, the majority of factors will be varied across two levels.
3.4
fully-nested design
nested design, where there is a nesting hierarchy for every pair of factors (3.2)
EXAMPLE There are 2 operators in each laboratory, and each operator performs 2 calibrations, i.e., the
study includes 2 operators and 4 calibrations for each laboratory.
3.5
partially-nested design
nested design where one factor (3.2) (the factor laboratory) is ranked higher than all other factors (i.e.,
all other factors are nested within the factor laboratory), and there is at least one factor pair without a
nesting hierarchy
EXAMPLE There are 2 operators and 2 instruments in each laboratory, and each operator performs
measurements on 2 instruments, i.e., the study includes 2 operators and 2 instruments for each laboratory.
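The two examples in 3.4 and 3.5 can be made concrete by listing the settings within one laboratory. The following sketch uses hypothetical function names; it assumes that, in the fully-nested case, every operator performs his or her own calibrations, whereas in the partially-nested case every operator uses every instrument.

```python
from itertools import product


def nested_settings(n_operators=2, calibrations_per_operator=2):
    """Calibration nested within operator: each calibration belongs to one operator."""
    settings = []
    calibration = 0
    for operator in range(1, n_operators + 1):
        for _ in range(calibrations_per_operator):
            calibration += 1
            settings.append((f"Operator {operator}", f"Calibration {calibration}"))
    return settings


def crossed_settings(n_operators=2, n_instruments=2):
    """Operator and instrument without nesting: every combination occurs."""
    return [
        (f"Operator {o}", f"Instrument {i}")
        for o, i in product(range(1, n_operators + 1), range(1, n_instruments + 1))
    ]


nested = nested_settings()
crossed = crossed_settings()
for setting in nested + crossed:
    print(" + ".join(setting))
```

Both layouts give four settings per laboratory, but the nested layout involves four distinct calibrations while the crossed layout involves only two instruments.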
3.6
run
actual measurement carried out for a particular setting (3.7) and for a particular laboratory
EXAMPLE Operator 1 + Equipment 1 + Batch 1 + Day 1 carried out in laboratory 1
Note 1 to entry: This definition is more specific than the general definition given in ISO 3534-3:2013, 3.1.13, where
run is defined as specific settings of every factor used on a particular experimental unit.
Note 2 to entry: “Identical” runs are called replicates, whereby “identical” means that the different time points are
close enough to each other to allow for the results to be considered as obtained under repeatability conditions.
3.7
setting
combination of factor levels (3.3), for all factors (3.2) except the factor laboratory
EXAMPLE Operator 1 + Equipment 1 + Batch 1 + Day 1.
4 Symbols
$B$    Component in a test result representing the deviation of a laboratory from the general average (laboratory component of bias)
$B_0$    Component of $B$ representing all factors that do not vary under intermediate precision conditions – laboratory bias per se
$B_{(1)}$, $B_{(2)}$, etc.    Components of $B$ representing factors that vary under intermediate precision conditions
$e$    Component representing the random error occurring in every test result, corresponding to the analytical, repeatability, model or residual error
$m$    Overall mean of the measurand or test property for a particular matrix; level
$\hat{m}$    Estimate of the overall mean
$n$    Number of replicate test results obtained in one laboratory at one level for one setting
$p$    Number of laboratories participating in the collaborative study
$q$    Number of levels of the test property in the collaborative study
$\sigma_w$    Within-laboratory standard deviation of the residual term $e$
$\sigma_r$    Repeatability standard deviation
$\sigma_R$    Reproducibility standard deviation
$\sigma_0$    Standard deviation corresponding to factor $B_0$
$\sigma_{(1)}$    Standard deviation corresponding to factor $B_{(1)}$
$\sigma_{(2)}$    Standard deviation corresponding to factor $B_{(2)}$
$\sigma_A$    Standard deviation corresponding to factor A
$\sigma_{\text{Interaction}}$    Standard deviation corresponding to the interaction of two factors
$\sigma_{AB}$    Standard deviation corresponding to the interaction of the two factors A and B
$s$    Estimate of a standard deviation
$se$    Standard error
$\operatorname{Var}(X)$    Variance of $X$
$w$    Range of a set of test results
$y$    Test result
$\bar{X}$    Mean of $X$
$|X|$    Absolute value of $X$
5 General requirements
In order to ensure that measurements are carried out in the same way, the measurement method shall
have been standardized. All measurements obtained in the framework of an experiment within a
specific laboratory or of a collaborative study shall be carried out according to that standard.
NOTE The terms collaborative experiment, collaborative trial and interlaboratory experiment are
used interchangeably to denote a collaborative study conducted in order to characterize and/or assess the
performance of a measurement method.
6 Intermediate measures of the precision of a standard measurement method
6.1 Factors and factor levels
6.1.1 Definitions and examples
In this document, the term factor denotes an identifiable and quantifiable source of variability such
as time, calibration, operator or equipment (see 3.2). In order to investigate a factor’s contribution to
variability, it is necessary to conduct measurements under different conditions or states. For instance,
measurements shall be carried out with different pieces of equipment, or with different operators. The
different states associated with a particular factor are called factor levels (see 3.3). Table 1 provides
typical examples of factors and their factor levels.
Table 1 — Examples of factors
| Factor | Description/example of the different factor levels | Comments |
|---|---|---|
| Laboratory | The different participating laboratories, typically between 4 and 15 different laboratories. | Some of the special designs presented in this document allow reliable precision estimates with as few as 4 participating laboratories. |
| Point in time | Two different time points (e.g. different days, different weeks, etc.) | Differences between “measurements made at different times”, i.e. separated by a relatively long time interval (as compared with the repeatability interval) will reflect effects which correspond to uncontrolled changes in environmental conditions as well as other “controlled” sources of variability such as the use of different reagent batches, etc. |
| Calibration | Before and after instrument is sent to the manufacturer for a recalibration | Calibration does not refer here to any calibration required as an integral part of obtaining a test result by the measurement method. It refers to the calibration process that takes place at regular intervals between groups of measurements within a laboratory. |
| Operator | The different technicians working in the laboratory | In some circumstances, the operator may be, in fact, a team of operators, each of whom performs some specific part of the procedure. In such a case, the team should be regarded as the operator, and any change in membership or in the allotment of duties within the team should be regarded as constituting a different operator. |
| Equipment | Two different pieces of equipment | Equipment is often a set of equipment, and any change in any significant component should be regarded as constituting different equipment. As to what constitutes a significant component, common sense must prevail (e.g. different burettes/pipettes, thermometers, pH meters, centrifuges, shaker orbits or frequencies). |
| Consumables (buffer solutions, reagents, calibrators, cartridges) | Different batches or producers | A change of a batch of a reagent should be considered a significant component. It can lead to different equipment or to a recalibration if such a change is followed by calibration. |
NOTE 1 In practice, it may not be possible to consider factors in isolation from one another; this is due to a characteristic
of experimental designs called confounding. In theory, it should always be possible to disentangle the effects of different
factors by additional testing. For instance, if Operator 1 always carried out tests with Equipment 1 (e.g. HPLC system 1) and
Operator 2 with Equipment 2, then it would be possible to tell the effects of the two factors Operator and Equipment apart
by adding further runs for Operator 1 with Equipment 2 and for Operator 2 with Equipment 1.
NOTE 2 Further effects called interaction effects are not explicitly considered here. However, some interaction effects are
implicitly taken into consideration. For instance, the effect of skill or fatigue of an operator may be considered to be the
interaction of operator and time. Similarly, the performance of a piece of equipment may be different at the time it is first
turned on and after many hours of use: this is an example of interaction between equipment and time.
NOTE 3 In ISO 5725-2, the factor laboratory is implicitly included in the analysis.
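The confounding situation described in NOTE 1 can be checked mechanically from the list of planned runs. The sketch below is illustrative only: the function name `is_confounded` is hypothetical, and it detects only complete confounding of two factors (each level of one factor always paired with the same single level of the other).

```python
def is_confounded(runs):
    """Return True if operator and equipment are completely confounded.

    `runs` is a list of (operator, equipment) pairs. The two factors are
    completely confounded when each operator only ever uses one piece of
    equipment and each piece of equipment is only ever used by one operator.
    """
    equipment_by_operator = {}
    operators_by_equipment = {}
    for operator, equipment in runs:
        equipment_by_operator.setdefault(operator, set()).add(equipment)
        operators_by_equipment.setdefault(equipment, set()).add(operator)
    return (all(len(v) == 1 for v in equipment_by_operator.values())
            and all(len(v) == 1 for v in operators_by_equipment.values()))


# Design from NOTE 1: Operator 1 always with Equipment 1, Operator 2 with Equipment 2.
original = [("Operator 1", "Equipment 1"), ("Operator 2", "Equipment 2")]

# Additional runs suggested in NOTE 1 to tell the two effects apart.
extended = original + [("Operator 1", "Equipment 2"), ("Operator 2", "Equipment 1")]

print(is_confounded(original))  # the two factors cannot be separated
print(is_confounded(extended))  # every operator/equipment combination is present
```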
6.1.2 Selection of factors of interest
In the standard for a measurement method, the repeatability and reproducibility standard deviations
should always be specified, but it is not necessary (or even feasible) to state all possible intermediate
precision measures. The selection of relevant factors is informed by experience and an understanding
of the relevant physical, chemical or microbiological processes.
Practical considerations in most laboratories, such as the desired precision of the final quoted result
and the cost of performing the measurements, will govern the number and choice of factors taken into
consideration in the standardization of the measurement method.
Finally, the choice of factors to include in the design should reflect concerns with uncontrollable
variations between the laboratories.
It will often be sufficient to specify only one suitable intermediate precision measure, together with
a detailed stipulation of the specific measurement conditions associated with it. The factors should
be carefully defined; in particular, for the intermediate precision associated with the factor Time, a
practical mean time interval between successive measurements should be specified.
It is assumed that, in the case of a standardized measurement method, the bias inherent in the method
itself will have been corrected by technical means. For this reason, this document only addresses the
bias arising in connection with different measurement conditions.
6.1.3 Random and fixed effects
This subclause discusses why, in this document, factors are modelled as random effects rather than
as fixed effects.
The term fixed effect is used to describe a contribution to the deviation from the overall mean or true
value whose direction and magnitude are predictable and can thus be determined. Say, for example, that
measurements always lie below the true value with equipment 1 or reagent supplier 1 and above the
true value with equipment 2 or reagent supplier 2. Then it would be appropriate to model the factor
Equipment or Reagent supplier as a fixed effect.
On the other hand, the term random effect is used to describe a contribution to the deviation from the
overall mean or true value whose direction varies unpredictably and thus cannot be determined. In such
cases, the only quantity of interest is the magnitude of the contribution (independently of its direction),
often described in terms of a standard deviation.
NOTE A factor is modelled as a fixed effect if the specific factor levels included in the experiment are of
interest in and of themselves. On the other hand, if the aim is to characterize the variability associated with
the underlying population from which the factor levels were selected, the factor is modelled as a random effect.
In this document, it is usually the variability of the underlying population which is of interest, rather than the
individual factor levels included in the experiment – this is the rationale for modelling factors as random.
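The distinction can be illustrated with simulated data. This is a minimal sketch under stated assumptions: all numerical values (offsets of ±0.5, an operator standard deviation of 0.3, a noise standard deviation of 0.1) are hypothetical, and the operator effects are observed directly, without measurement noise, to keep the example short.

```python
import random
import statistics

rng = random.Random(1)
noise_sd = 0.1  # hypothetical measurement noise

# Fixed effect: two specific pieces of equipment, each with its own constant
# offset. The levels themselves are of interest, so each offset is estimated.
true_offsets = {"Equipment 1": -0.5, "Equipment 2": +0.5}
fixed_estimates = {
    name: statistics.fmean([offset + rng.gauss(0.0, noise_sd) for _ in range(200)])
    for name, offset in true_offsets.items()
}

# Random effect: operators are treated as a sample from a population.
# Individual effects are not of interest; only their spread is estimated.
operator_sd = 0.3
operator_effects = [rng.gauss(0.0, operator_sd) for _ in range(500)]
estimated_operator_sd = statistics.stdev(operator_effects)

print(fixed_estimates)        # one estimated offset per named level
print(estimated_operator_sd)  # a single standard deviation for the population
```

The fixed-effect summary is one signed value per factor level; the random-effect summary is a single standard deviation that says nothing about any individual operator.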
The rationale for modelling factors as random rather than as fixed effects is now illustrated on the
basis of several examples.
Table 2 — Rationale for modelling factors as random rather than as fixed effects
| Factor | Discussion |
|---|---|
| Operator | Effects due to differences between operators include personal habits in operating measurement methods, e.g. in reading graduations on scales, etc. Thus, even though there is a bias in the test results obtained by an individual operator, this bias is not always constant. The magnitude of such a bias should be reduced by use of a clear operation manual and training. Under such circumstances, the effect of changing operators can be considered to be of a random nature. |
| Equipment | Effects due to different equipment include the effects due to different places of installation, particularly in fluctuations of the indicator, etc. Systematic differences should be corrected by calibration and such a procedure should be included in the standard method (e.g. a change in the batch of a reagent). An accepted reference value is needed for this, for which ISO Guide 33 and ISO Guide 35 shall be consulted. Remaining equipment effects are considered random. |
| Time | Effects due to time may be caused by environmental differences, such as changes in room temperature, humidity, etc. Standardization of environmental conditions should be attempted to minimize these effects. Clearly, achieving an ideal degree of standardization would make it appropriate to model the factor Time as a fixed effect. However, it is more realistic to model this factor in terms of random effects. |
6.1.4 Statistical model
6.1.4.1 Basic model
For the reader’s convenience and ease of reference, the basic model described in ISO 5725-1 is
reproduced here. For estimating the accuracy (trueness and precision) of a measurement method, it is
useful to assume that every test result y is the sum of three components given by Formula (1):
$y = m + B + e$    (1)
where, for the particular material tested
m is the overall mean (expectation);
B is the laboratory component of bias under repeatability conditions;
e is the random error occurring in every measurement under repeatability conditions.
For a general discussion of these components, the reader is referred to ISO 5725-1, 5.1.
NOTE 1 Depending on the context, m denotes either the theoretical (unknown) overall mean or its estimate.
It is possible to use different symbols (e.g. $m$ versus $\hat{m}$) in order to distinguish between a theoretical quantity
and its estimate. However, this type of notational nuance seems unnecessary in this document. The same holds
for the other symbols used to denote quantities which are to be estimated – though the symbol σ will be
reserved for theoretical standard deviations and s for their estimates. The reader is referred to ISO 5725-1 for a
discussion of this issue.
NOTE 2 In ISO 5725-4, the bias is further decomposed into two parts: method bias and laboratory bias. While
laboratory bias is modelled as a random effect, method bias is modelled as a fixed effect.
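The basic model of Formula (1) can be explored by simulation. The sketch below is illustrative only: it assumes normally distributed B and e, a balanced design, and the usual one-way analysis of variance estimators known from ISO 5725-2, with the between-laboratory standard deviation written `sigma_L` (a symbol not listed in Clause 4). The number of laboratories is deliberately unrealistic (p = 2000) so that the estimates are stable; all numerical values are hypothetical.

```python
import math
import random
import statistics


def simulate(p, n, m, sigma_L, sigma_r, seed=0):
    """Simulate y = m + B + e for p laboratories with n replicates each."""
    rng = random.Random(seed)
    data = []
    for _ in range(p):
        B = rng.gauss(0.0, sigma_L)  # laboratory component of bias
        data.append([m + B + rng.gauss(0.0, sigma_r) for _ in range(n)])
    return data


def estimate(data):
    """One-way ANOVA estimates for a balanced design (same n in every laboratory)."""
    n = len(data[0])
    lab_means = [statistics.fmean(lab) for lab in data]
    # Repeatability variance: pooled within-laboratory variance.
    s_r2 = statistics.fmean([statistics.variance(lab) for lab in data])
    # Between-laboratory variance from the mean square between laboratories.
    ms_between = n * statistics.variance(lab_means)
    s_L2 = max(0.0, (ms_between - s_r2) / n)
    return {
        "m": statistics.fmean(lab_means),
        "s_r": math.sqrt(s_r2),
        "s_L": math.sqrt(s_L2),
        "s_R": math.sqrt(s_L2 + s_r2),  # reproducibility: both components combined
    }


data = simulate(p=2000, n=4, m=10.0, sigma_L=2.0, sigma_r=1.0)
result = estimate(data)
print(result)
```

With these inputs the estimates should lie close to the simulated values (m = 10, s_r near 1, s_L near 2, s_R near √5 ≈ 2.24), which shows how the laboratory component B inflates the reproducibility standard deviation relative to the repeatability standard deviation.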
6.1.4.2 Partitioning the laboratory bias term
The model described in Formula (1) is appropriate for the situation described in ISO 5725-2, where,
within each laboratory, results are obtained under repeatability conditions (i.e. within a short period
of time, by the same operator, etc.). Under these conditions, B can be considered constant and is called
the “laboratory component of bias”. In practice, however, B arises from a combination of a number of
effects. The statistical model as given in Formula (1) can be rewritten in the form given by Formula (2):
$y = m + B_0 + B_{(1)} + B_{(2)} + \dots + e$    (2)
where B is partitioned into contributions from variates
$B_0$    the residual component of the laboratory bias;
$B_{(1)}, B_{(2)}, \dots$    effects corresponding to intermediate precision factors (such as those in Table 1).
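As a hedged illustration of what the partition implies, suppose that $B_0$, $B_{(1)}$, $B_{(2)}$, … and $e$ are mutually independent random variables with zero mean. This independence is an assumption made here for illustration; it is not stated in the text above. The variance of a test result then splits into one term per component, using the symbols $\sigma_0$, $\sigma_{(1)}$ and $\sigma_{(2)}$ from Clause 4:

```latex
\begin{align}
\operatorname{Var}(y)
  &= \operatorname{Var}\bigl(m + B_0 + B_{(1)} + B_{(2)} + \dots + e\bigr) \\
  &= \sigma_0^2 + \sigma_{(1)}^2 + \sigma_{(2)}^2 + \dots + \operatorname{Var}(e)
\end{align}
```

Under this assumption, a component that is held constant within a laboratory shifts all of that laboratory's results by the same amount and so acts as bias, whereas a component that is allowed to vary contributes its variance to the scatter of the results.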
6.1.4.3 Terms $B_0$, $B_{(1)}$, $B_{(2)}$, etc.
Under repeatability conditions, these terms all remain constant and add to the bias of the test results.
Under intermediate precision conditions, $B_0$ is the effect corresponding to the residual laboratory bias,
i.e. it characterizes the background component of laboratory bias which remains invariant as th
...
