Standard Practice for Dealing With Outlying Observations

ABSTRACT
This practice covers outlying observations in samples and how to test the statistical significance of outliers. The procedures in this practice were developed primarily to apply to the simplest kind of experimental data, that is, replicate measurements of some property of a given material or observations in a supposedly random sample.
SCOPE
1.1 This practice covers outlying observations in samples and how to test the statistical significance of outliers.  
1.2 The system of units for this standard is not specified. Dimensional quantities in the standard are presented only as illustrations of calculation methods. The examples are not binding on products or test methods treated.  
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.  
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

General Information

Status
Published
Publication Date
31-May-2021
Technical Committee
E11 - Quality and Statistics


Overview

ASTM E178-21: Standard Practice for Dealing With Outlying Observations provides guidance for identifying and handling outlier data points in samples. Published by ASTM International, this practice addresses the detection of outliers in experimental datasets, primarily those consisting of replicate measurements or observations assumed to be from random samples. The standard outlines statistical methods recognized internationally for determining whether a data point is significantly different from the remainder of a dataset. The importance of correctly managing outliers is emphasized, as they can result from either valid but extreme natural variability or experimental error.

Key Topics

  • Definitions and Terminology: Clarifies key terms such as "outlier," "order statistic," and "null hypothesis" in the context of quality and statistics, referencing ASTM E456 for consistency.
  • Statistical Significance Testing: Details established statistical tests, including methods using sample mean and standard deviation, to determine if extreme values should be considered outlying observations.
  • Recommended Procedures: Provides step-by-step guidance on treating outliers, including physical investigation and multiple statistical tests (e.g., Grubbs' test, Dixon's criteria, Tietjen-Moore tests).
  • Handling Outliers in Practice:
    • When a known physical cause is identified, the observation may be discarded.
    • If no cause is known, formal statistical testing is advised before excluding any data.
    • When outliers are retained, documentation should clearly note how they were addressed.
  • Significance Levels and Critical Values: Introduces the concept of significance levels (e.g., 1%, 5%, 10%) and the use of critical values, acknowledging that most outlier criteria assume an underlying normal distribution.
  • Recursive and Multiple Outlier Testing: Covers recursive testing methods for identifying multiple outliers on one or both ends of a sample.
  • Alternative Criteria: Discusses the use of skewness and kurtosis as additional statistical tools for assessing sample normality and outlier detection.

Applications

The practical application of ASTM E178-21 spans various industries and fields where data integrity and reliability are critical:

  • Laboratory Testing: Ensures accurate measurement reporting by identifying and scrutinizing suspicious results in chemical, materials, or environmental labs.
  • Quality Control in Manufacturing: Facilitates root cause analysis and process improvement by distinguishing between random variation and genuine process anomalies.
  • Scientific Research: Guides researchers in deciding whether to retain, adjust, or exclude anomalous data points, maintaining scientific rigor in fields such as biology, physics, or engineering.
  • Data Analysis and Statistical Auditing: Provides a framework for analysts to report and handle data outliers transparently in compliance, regulatory reporting, and peer-reviewed studies.
  • Process Optimization: Assists organizations in segregating suspect test units for further investigation, which may inform corrective actions or further research.

Related Standards

  • ASTM E456 – Terminology Relating to Quality and Statistics: Offers standardized definitions relevant to statistical practices and quality analysis, referenced throughout E178-21.
  • ASTM E2586 – Practice for Calculating and Using Basic Statistics: Provides foundational procedures for statistical calculations, which underpin the statistical significance tests used in ASTM E178-21.
  • ISO/IEC Guide 98-3 – Uncertainty of Measurement: Though not directly referenced, this international guideline is relevant for laboratories concerned with uncertainty and data quality.

By implementing ASTM E178-21, organizations can improve the validity of their data analyses, enhance consistent decision-making, and align with recognized best practices for outlier detection and management. This standard is essential for industries reliant on statistically sound methodologies to ensure product quality, safety, and compliance.

Frequently Asked Questions

ASTM E178-21 is a standard published by ASTM International. Its full title is "Standard Practice for Dealing With Outlying Observations".

ASTM E178-21 is classified under the following ICS (International Classification for Standards) categories: 03.120.30 - Application of statistical methods. The ICS classification helps identify the subject area and facilitates finding related standards.

ASTM E178-21 is linked to the following standards: ASTM E456-13a(2022)e1, ASTM E2586-19e1, ASTM E456-13A(2017)e1, ASTM E456-13A(2017)e3, ASTM E2586-14, ASTM E456-13ae1, ASTM E456-13ae2, ASTM E456-13ae3, ASTM E456-13a, ASTM E2586-13, ASTM E456-13, ASTM E2586-12b, ASTM E456-12, ASTM E456-12e1, ASTM E2586-12a. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

Standards Content (Sample)


Designation: E178 − 21
An American National Standard
Standard Practice for Dealing With Outlying Observations
This standard is issued under the fixed designation E178; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This practice covers outlying observations in samples and how to test the statistical significance of outliers.
1.2 The system of units for this standard is not specified. Dimensional quantities in the standard are presented only as illustrations of calculation methods. The examples are not binding on products or test methods treated.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Referenced Documents
2.1 ASTM Standards:
E456 Terminology Relating to Quality and Statistics
E2586 Practice for Calculating and Using Basic Statistics

3. Terminology
3.1 Definitions—Unless otherwise noted in this standard, all terms relating to quality and statistics are defined in Terminology E456.
3.1.1 null hypothesis, $H_0$, n—a statement about a parameter of a probability distribution or about the type of probability distribution, tentatively regarded as true until rejected using a statistical hypothesis test. E2586
3.1.2 order statistic $x_{(k)}$, n—value of the kth observed value in a sample after sorting by order of magnitude. E2586
3.1.2.1 Discussion—In this practice, $x_k$ is used to denote order statistics in place of $x_{(k)}$, to simplify the notation.
3.1.3 outlier—see outlying observation.
3.1.4 outlying observation, n—an extreme observation in either direction that appears to deviate markedly in value from other members of the sample in which it appears.
3.1.4.1 Discussion—The identification of a value as outlying, and therefore a doubtful observation, is a judgement of the analyst and can be made before any statistical test.

4. Significance and Use
4.1 An outlying observation, or “outlier,” is an extreme one in either direction that appears to deviate markedly from other members of the sample in which it occurs.
4.2 Statistical rules test the null hypothesis of no outliers against the alternative of one or more actual outliers. The procedures covered were developed primarily to apply to the simplest kind of experimental data, that is, replicate measurements of some property of a given material or observations in a supposedly random sample.
4.3 A statistical test may be used to support a judgment that a physical reason does actually exist for an outlier, or the statistical criterion may be used routinely as a basis to initiate action to find a physical cause.

5. Procedure
5.1 In dealing with an outlier, the following alternatives should be considered:
5.1.1 An outlying observation might be the result of gross deviation from prescribed experimental procedure or an error in calculating or recording the numerical value. When the experimenter is clearly aware that a deviation from prescribed experimental procedure has taken place, the resultant observation should be discarded, whether or not it agrees with the rest of the data and without recourse to statistical tests for outliers. If a reliable correction procedure is available, the observation may sometimes be corrected and retained.
5.1.2 An outlying observation might be merely an extreme manifestation of the random variability inherent in the data. If this is true, the value should be retained and processed in the same manner as the other observations in the sample. Transformation of data or using methods of data analysis designed for a non-normal distribution might be appropriate.

Footnotes:
This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics.
Current edition approved June 1, 2021. Published June 2021. Originally approved in 1961. Last previous edition approved in 2016 as E178–16a. DOI: 10.1520/E0178-21.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
5.1.3 Test units that give outlying observations might be of special interest. If this is true, once identified they should be segregated for more detailed study. Outliers may contain important information for a possible root cause analysis and action on the process or procedure.
5.2 In many cases, evidence for deviation from prescribed procedure will consist primarily of the discordant value itself. In such cases it is advisable to adopt a cautious attitude. Use of one of the criteria discussed below will sometimes permit a clearcut decision to be made.
5.2.1 When the experimenter cannot identify abnormal conditions, they should report the discordant values and indicate to what extent they have been used in the analysis of the data.
5.3 Thus, as part of the over-all process of experimentation, the process of screening samples for outlying observations and acting on them is the following:
5.3.1 Physical Reason Known or Discovered for Outlier(s):
5.3.1.1 Reject observation(s) and possibly take additional observation(s).
5.3.1.2 Correct observation(s) on physical grounds.
5.3.2 Physical Reason Unknown—Use Statistical Test:
5.3.2.1 Reject observation(s) and possibly take additional observation(s).
5.3.2.2 Transform observation(s) to improve fit to a normal distribution.
5.3.2.3 Use estimation appropriate for non-normal distributions.
5.3.2.4 Segregate samples for further study.

6. Basis of Statistical Criteria for Outliers
6.1 In testing outliers, the doubtful observation is included in the calculation of the numerical value of a sample criterion (or statistic), which is then compared with a critical value based on the theory of random sampling to determine whether the doubtful observation is to be retained or rejected. The critical value is that value of the sample criterion which would be exceeded by chance with some specified (small) probability on the assumption that all the observations did indeed constitute a random sample from a common system of causes, a single parent population, distribution or universe. The specified small probability is called the “significance level” or “percentage point” and can be thought of as the risk of erroneously rejecting a good observation. If a real shift or change in the value of an observation arises from nonrandom causes (human error, loss of calibration of instrument, change of measuring instrument, or even change of time of measurements, and so forth), then the observed value of the sample criterion used will exceed the “critical value” based on random-sampling theory. Tables of critical values are usually given for several different significance levels. In particular for this practice, significance levels 10, 5, and 1 % are used.
NOTE 1—In this practice, we will usually illustrate the use of the 5 % significance level. Proper choice of level in probability depends on the particular problem and just what may be involved, along with the risk that one is willing to take in rejecting a good observation, that is, if the null-hypothesis stating “all observations in the sample come from the same normal population” may be assumed correct.
6.2 Almost all criteria for outliers are based on an assumed underlying normal (Gaussian) population or distribution. The null hypothesis that we are testing in every case is that all observations in the sample come from the same normal population. In choosing an appropriate alternative hypothesis (one or more outliers, separated or bunched, on same side or different sides, and so forth) it is useful to plot the data as shown in the dot diagrams of the figures. When the data are not normally or approximately normally distributed, the probabilities associated with these tests will be different. The experimenter is cautioned against interpreting the probabilities too literally.
6.3 Although our primary interest here is that of detecting outlying observations, some of the statistical criteria presented may also be used to test the hypothesis of normality or that the random sample taken comes from a normal or Gaussian population. The end result is for all practical purposes the same, that is, we really wish to know whether we ought to proceed as if we have in hand a sample of homogeneous normal observations.
6.4 One should distinguish between data to be used to estimate a central value from data to be used to assess variability. When the purpose is to estimate a standard deviation, it might be seriously underestimated by dropping too many “outlying” observations.

7. Recommended Criteria for Single Samples
7.1 Criterion for a Single Outlier—Sort the n observations in order of increasing magnitude by $x_1 \le x_2 \le x_3 \le \dots \le x_n$, called order statistics. Let the largest value, $x_n$, be the doubtful value, that is the largest value. The test criterion, $T_n$, for a single outlier is as follows:
$T_n = (x_n - \bar{x})/s$   (1)
where:
$\bar{x}$ = arithmetic average of all n values, and
$s$ = estimate of the population standard deviation based on the sample data, calculated as follows:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$   (2)
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n-1}}$   (3)
If $x_1$ rather than $x_n$ is the doubtful value, the criterion is as follows:
$T_1 = (\bar{x} - x_1)/s$   (4)
The critical values for either case, for the 1, 5, and 10 % levels of significance, are given in Table 1.
7.1.1 The test criterion $T_n$ can be equated to the Student’s t test statistic for equality of means between a population with one observation $x_n$ and another with the remaining observations $x_1, \dots, x_{n-1}$, and the critical value of $T_n$ for significance
level α can be approximated using the α/n percentage point of Student’s t with n – 2 degrees of freedom. The approximation is exact for small enough values of α, depending on n, and otherwise a slight overestimate unless both α and n are large:
$T_n(\alpha) \le \dfrac{t_{\alpha/n,\,n-2}}{\sqrt{1 + \dfrac{n\,t_{\alpha/n,\,n-2}^{2} - 1}{(n-1)^{2}}}}$   (5)
7.1.2 To test outliers on the high side, use the statistic $T_n = (x_n - \bar{x})/s$ and take as critical value the 0.05 point of Table 1. To test outliers on the low side, use the statistic $T_1 = (\bar{x} - x_1)/s$ and again take as a critical value the 0.05 point of Table 1. If we are interested in outliers occurring on either side, use the statistic $T_n = (x_n - \bar{x})/s$ or the statistic $T_1 = (\bar{x} - x_1)/s$, whichever is larger. If in this instance we use the 0.05 point of Table 1 as our critical value, the true significance level would be twice 0.05 or 0.10. Similar considerations apply to the other tests given below.
7.1.3 Example 1—As an illustration of the use of $T_n$ and Table 1, consider the following ten observations on breaking strength (in pounds) of 0.104-in. hard-drawn copper wire: 568, 570, 570, 570, 572, 572, 572, 578, 584, 596. See Fig. 1. The doubtful observation is the high value, $x_{10} = 596$. Is the value of 596 significantly high? The mean is $\bar{x} = 575.2$ and the estimated standard deviation is $s = 8.70$. We compute:
$T_{10} = (596 - 575.2)/8.70 = 2.39$
From Table 1, for n = 10, note that a $T_{10}$ as large as 2.39 would occur by chance with probability less than 0.05. In fact, so large a value would occur by chance not much more often than 1 % of the time. Thus, the weight of the evidence is against the doubtful value having come from the same population as the others (assuming the population is normally distributed). Investigation of the doubtful value is therefore indicated.

FIG. 1 Ten Observations of Breaking Strength from Example 1

TABLE 1 Critical Values for T (One-Sided Test) When Standard Deviation is Calculated from the Same Sample (see Footnote A)

Number of        Upper 10 %     Upper 5 %      Upper 1 %
Observations,    Significance   Significance   Significance
n                Level          Level          Level
 3               1.1484         1.1531         1.1546
 4               1.4250         1.4625         1.4925
 5               1.602          1.672          1.749
 6               1.729          1.822          1.944
 7               1.828          1.938          2.097
 8               1.909          2.032          2.221
 9               1.977          2.110          2.323
10               2.036          2.176          2.410
11               2.088          2.234          2.485
12               2.134          2.285          2.550
13               2.175          2.331          2.607
14               2.213          2.371          2.659
15               2.247          2.409          2.705
16               2.279          2.443          2.747
17               2.309          2.475          2.785
18               2.335          2.504          2.821
19               2.361          2.532          2.854
20               2.385          2.557          2.884
21               2.408          2.580          2.912
22               2.429          2.603          2.939
23               2.448          2.624          2.963
24               2.467          2.644          2.987
25               2.486          2.663          3.009
26               2.502          2.681          3.029
27               2.519          2.698          3.049
28               2.534          2.714          3.068
29               2.549          2.730          3.085
30               2.563          2.745          3.103
35               2.628          2.811          3.178
40               2.682          2.866          3.240
45               2.727          2.914          3.292
50               2.768          2.956          3.336

Footnote A: Values of T are taken from Grubbs (1), Table 1. All values have been adjusted for division by n – 1 instead of n in calculating s. Use Ref. (1) for higher sample sizes up to n = 147.

7.2 Dixon Criteria for a Single Outlier—An alternative system, the Dixon criteria (2), based entirely on ratios of differences between the observations, may be used in cases where it is desirable to avoid calculation of s or where quick judgment is called for. For the Dixon test, the sample criterion or statistic changes with sample size. Table 2 gives the appropriate statistic to calculate and also gives the critical values of the statistic for the 1, 5, and 10 % levels of significance. In most situations, the Dixon criteria are less powerful at detecting an outlier than the criterion given in 7.1.
7.2.1 Example 2—As an illustration of the use of Dixon’s test, consider again the observations on breaking strength given in Example 1. Table 2 indicates use of:
$r_{11} = (x_n - x_{n-1})/(x_n - x_2)$   (6)
Thus, for n = 10:
$r_{11} = (x_{10} - x_9)/(x_{10} - x_2)$   (7)
For the measurements of breaking strength above:
$r_{11} = (596 - 584)/(596 - 570) = 0.462$
which is a little less than 0.478, the 5 % critical value for n = 10. Under the Dixon criterion, we should therefore not consider this observation as an outlier at the 5 % level of significance. These results illustrate how borderline cases may be accepted under one test but rejected under another.
7.3 Recursive Testing for Multiple Outliers in Univariate Samples—For testing multiple outliers in a sample, recursive application of a test for a single outlier may be used. In recursive testing, a test for an outlier, $x_1$ or $x_n$, is first conducted. If this is found to be significant, then the test is repeated, omitting the outlier found, to test the point on the opposite side of the sample, or an additional point on the same side. The performance of most tests for single outliers is affected by masking, where the probability of detecting an outlier using a test for a single outlier is reduced when there are two or more outliers. Therefore, the recommended procedure is to use a criterion designed to test for multiple outliers, using recursive testing to investigate after the initial criterion is significant.

Footnote: The boldface numbers in parentheses refer to a list of references at the end of this standard.
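
The arithmetic of Example 1 can be checked with a few lines of Python. This is a minimal sketch, not part of the standard: the function name `single_outlier_statistics` and the dictionary `TABLE_1_N10` are illustrative choices, and the critical values are simply the n = 10 row copied from Table 1.

```python
import statistics

def single_outlier_statistics(values):
    """Return (mean, s, T_1, T_n) as defined in Eq. (1) to (4)."""
    x = sorted(values)                 # order statistics x_1 <= ... <= x_n
    mean = statistics.fmean(x)         # Eq. (2)
    s = statistics.stdev(x)            # Eq. (3), divisor n - 1
    t_low = (mean - x[0]) / s          # Eq. (4): smallest value doubtful
    t_high = (x[-1] - mean) / s        # Eq. (1): largest value doubtful
    return mean, s, t_low, t_high

# One-sided critical values for n = 10, copied from Table 1.
TABLE_1_N10 = {0.10: 2.036, 0.05: 2.176, 0.01: 2.410}

# Breaking strength data from Example 1 (pounds).
strength = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]
mean, s, t_low, t_high = single_outlier_statistics(strength)
print(f"mean = {mean:.1f}, s = {s:.2f}, T_n = {t_high:.2f}")
# prints: mean = 575.2, s = 8.70, T_n = 2.39

for alpha, critical in TABLE_1_N10.items():
    verdict = "exceeds" if t_high > critical else "does not exceed"
    print(f"alpha = {alpha:.2f}: T_n {verdict} the critical value {critical}")
```

The statistic exceeds the 10 % and 5 % critical values but not the 1 % value, which matches the reading of Table 1 given in Example 1. For a two-sided question, 7.1.2 indicates taking the larger of `t_low` and `t_high`, with the true significance level then being twice the tabulated one.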
TABLE 2 Dixon Criteria for Testing of Extreme Observation (Single Sample) (see Footnote A)

Criterion, by sample size n:
n = 3 to 7:    $r_{10} = (x_2 - x_1)/(x_n - x_1)$ if smallest value is suspected;
                      $= (x_n - x_{n-1})/(x_n - x_1)$ if largest value is suspected.
n = 8 to 10:   $r_{11} = (x_2 - x_1)/(x_{n-1} - x_1)$ if smallest value is suspected;
                      $= (x_n - x_{n-1})/(x_n - x_2)$ if largest value is suspected.
n = 11 to 13:  $r_{21} = (x_3 - x_1)/(x_{n-1} - x_1)$ if smallest value is suspected;
                      $= (x_n - x_{n-2})/(x_n - x_2)$ if largest value is suspected.
n = 14 to 50:  $r_{22} = (x_3 - x_1)/(x_{n-2} - x_1)$ if smallest value is suspected;
                      $= (x_n - x_{n-2})/(x_n - x_3)$ if largest value is suspected.

                 Significance Level (One-Sided Test)
n     Criterion   10 %     5 %      1 %
 3    r10         0.886    0.941    0.988
 4    r10         0.679    0.766    0.889
 5    r10         0.558    0.642    0.781
 6    r10         0.484    0.562    0.698
 7    r10         0.434    0.507    0.637
 8    r11         0.480    0.554    0.681
 9    r11         0.440    0.511    0.634
10    r11         0.410    0.478    0.597
11    r21         0.517    0.575    0.674
12    r21         0.490    0.546    0.643
13    r21         0.467    0.521    0.617
14    r22         0.491    0.546    0.641
15    r22         0.470    0.524    0.618
16    r22         0.453    0.505    0.598
17    r22         0.437    0.489    0.580
18    r22         0.424    0.475    0.564
19    r22         0.412    0.462    0.550
20    r22         0.401    0.450    0.538
21    r22         0.391    0.440    0.526
22    r22         0.382    0.430    0.516
23    r22         0.374    0.421    0.506
24    r22         0.366    0.413    0.497
25    r22         0.359    0.406    0.489
26    r22         0.353    0.399    0.482
27    r22         0.347    0.393    0.474
28    r22         0.342    0.387    0.468
29    r22         0.336    0.381    0.462
30    r22         0.332    0.376    0.456
35    r22         0.311    0.354    0.431
40    r22         0.295    0.337    0.412
45    r22         0.283    0.323    0.397
50    r22         0.272    0.312    0.384

Footnote A: $x_1 \le x_2 \le \dots \le x_n$. Original Table in Dixon (2), Appendix. Critical values updated by calculations by Bohrer (3) and Verma-Ruiz (4).
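
The way Table 2 switches criterion with sample size can be sketched in Python as follows. The function name `dixon_ratio` and its `suspect` argument are hypothetical names chosen for illustration; the index offsets are read directly from the four criteria in Table 2, and the data are the breaking strengths from Example 1.

```python
def dixon_ratio(values, suspect="largest"):
    """Return (criterion name, ratio) following the criteria in Table 2."""
    x = sorted(values)
    n = len(x)
    if not 3 <= n <= 50:
        raise ValueError("Table 2 covers sample sizes 3 to 50 only")
    # r_ij: i is the gap used in the numerator, j the trim in the denominator.
    if n <= 7:
        name, i, j = "r10", 1, 0
    elif n <= 10:
        name, i, j = "r11", 1, 1
    elif n <= 13:
        name, i, j = "r21", 2, 1
    else:
        name, i, j = "r22", 2, 2
    if suspect == "largest":
        # (x_n - x_{n-i}) / (x_n - x_{1+j})
        ratio = (x[-1] - x[-1 - i]) / (x[-1] - x[j])
    elif suspect == "smallest":
        # (x_{1+i} - x_1) / (x_{n-j} - x_1)
        ratio = (x[i] - x[0]) / (x[-1 - j] - x[0])
    else:
        raise ValueError("suspect must be 'largest' or 'smallest'")
    return name, ratio

# Breaking strength data from Example 1 (pounds), n = 10.
strength = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]
name, ratio = dixon_ratio(strength, suspect="largest")
print(f"{name} = {ratio:.3f}")   # prints: r11 = 0.462
```

The result, 0.462, is below the 5 % critical value of 0.478 listed in Table 2 for n = 10, in line with Example 2. The sketch does not guard against a zero denominator, which occurs when the values spanned by the denominator are all equal.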
7.4 Criterion for Two Outliers on Opposite Sides of a Sample—In testing the least and the greatest observations simultaneously as probable outliers in a sample, use the ratio of sample range to sample standard deviation test of David, Hartley, and Pearson (5):
$w/s = (x_n - x_1)/s$   (8)
The significance levels for this sample criterion are given in Table 3. Alternatively, the largest residuals test of Tietjen and Moore (7.5) could be used.
7.4.1 Example 3—This classic set consists of a sample of 15 observations of the vertical semidiameters of Venus made by Lieutenant Herndon in 1846 (6). In the reduction of the observations, Prof. Pierce found the following residuals (in seconds of arc) which have been arranged in ascending order of magnitude. See Fig. 2, above.

FIG. 2 Fifteen Residuals from the Semidiameters of Venus from Example 3

7.4.2 The deviations –1.40 and 1.01 appear to be outliers. Here the suspected observations lie at each end of the sample. The mean of the deviations is $\bar{x} = 0.018$, the standard deviation is $s = 0.551$, and:
$w/s = [1.01 - (-1.40)]/0.551 = 2.41/0.551 = 4.374$
From Table 3 for n = 15, we see that the value of w/s = 4.374 falls between the critical values for the 1 and 5 % levels, so if the test were being run at the 5 % level of significance, we would conclude that this sample contains one or more outliers.
7.4.3 The lowest measurement, –1.40, is 1.418 below the sample mean, and the highest measurement, 1.01, is 0.992 above the mean. Since these extremes are not symmetric about the mean, either both extremes are outliers, or else only –1.40 is an outlier. That –1.40 is an outlier can be verified by use of the $T_1$ statistic. We have:
$T_1 = (\bar{x} - x_1)/s = [0.018 - (-1.40)]/0.551 = 2.574$
This value is greater than the critical value for the 5 % level, 2.409 from Table 1, so we reject –1.40. Since we have decided that –1.40 should be rejected, we use the remaining 14 observations and test the upper extreme 1.01, either with the criterion:
$T_n = (x_n - \bar{x})/s$   (9)
or with Dixon’s $r_{22}$. Omitting –1.40 and renumbering the observations, we compute:
$\bar{x} = 1.67/14 = 0.119$, $s = 0.401$
and:
$T_{14} = (1.01 - 0.119)/0.401 = 2.22$
From Table 1, for n = 14, we find that a value as large as 2.22 would occur by chance more than 5 % of the time, so we should retain the value 1.01 in further calculations. The Dixon test criterion is:
$r_{22} = (x_{14} - x_{12})/(x_{14} - x_3) = (1.01 - 0.48)/(1.01 + 0.24) = 0.53/1.25 = 0.424$
From Table 2 for n = 14, we see that the 5 % critical value for $r_{22}$ is 0.546. Since our calculated value (0.424) is less than the critical value, we also retain 1.01 by Dixon’s test, and no further values would be tested in this sample.

TABLE 3 Critical Values (One-Sided Test) for w/s (Ratio of Range to Sample Standard Deviation) (see Footnote A)

Number of        10 %           5 %            1 %
Observations,    Significance   Significance   Significance
n                Level          Level          Level
 3               1.9973         1.9993         2.0000
 4               2.409          2.429          2.445
 5               2.712          2.755          2.803
 6               2.949          3.012          3.095
 7               3.143          3.222          3.338
 8               3.308          3.399          3.543
 9               3.449          3.552          3.720
10               3.574          3.685          3.875
11               3.684          3.803          4.011
12               3.782          3.909          4.133
13               3.871          4.005          4.244
14               3.952          4.092          4.344
15               4.025          4.171          4.435
16               4.093          4.244          4.519
17               4.156          4.311          4.597
18               4.214          4.374          4.669
19               4.269          4.433          4.736
20               4.320          4.487          4.799
21               4.368          4.539          4.858
22               4.413          4.587          4.913
23               4.456          4.633          4.965
24               4.497          4.676          5.015
25               4.535          4.717          5.061
26               4.572          4.756          5.106
27               4.607          4.793          5.148
28               4.641          4.829          5.188
29               4.673          4.863          5.226
30               4.704          4.895          5.263
35               4.841          5.040          5.426
40               4.957          5.162          5.561
45               5.057          5.265          5.674
50               5.144          5.356          5.773

Footnote A: Each entry calculated by 50 000 000 simulations.

Now relabel the original observations $x_1, x_2, \dots, x_n$ as z’s in such a manner that $z_i$ is that x whose $r_i$ is the $i$th smallest absolute residual above. This now means that $z_1$ is that observation x which is closest to the mean and that $z_n$ is the observation x which is farthest from the mean. The Tietjen-Moore statistic for testing the significance of the k largest residuals is then:
$E_k = \left[ \sum_{i=1}^{n-k} (z_i - \bar{z}_k)^2 \right] / \left[ \sum_{i=1}^{n} (z_i - \bar{z})^2 \right]$   (11)
where:
$\bar{z}_k = \sum_{i=1}^{n-k} z_i / (n - k)$   (12)
is the mean of the (n − k) least extreme observations and $\bar{z}$ is the mean of the full sample. Percentage points of $E_k$ in Table 4 were computed by simulation.
7.5.1 Example 4—Applying this test to the Venus semidiameter residuals data in Example 3, we find that the total sum of squares of deviations for the entire sample is 4.24964. Omitting –1.40 and 1.01, the suspected two outliers, we find that the sum of squares of deviations for the reduced sample of 13 observations is 1.24089. Then $E_2$ = 1.24089/4.24964 = 0.292, and by using Table 4, we find that this observed $E_2$ is slightly smaller than the 5 % critical value of 0.317, so that the $E_2$ test would reject both of the observations, –1.40 and 1.01.
7.6 Criterion for Two Outliers on the Same Side of the Sample—Where the two largest or the two smallest observations are probable outliers, employ a test provided by Grubbs (8, 9) which is based on the ratio of the sample sum of squares when the two doubtful values are omitted to the sample sum of squares when the two doubtful values are included. In illustrating the test procedure, we give the following Examples 5 and 6.
7.6.1 It should be noted that the critical values in Table 5 for the 1 % level of significance are smaller than those for the 5 % level. So for this particular test, the calculated value is significant if it is less than the chosen critical value.
7.6.2 Example 5—In a comparison of strength of various plastic materials, one characteristic studied was the percentage elongation at break. Before comparison of the average elongation of the several materials, it was desirable to isolate for further study any pieces of a given material which gave very small elongation at breakage compared with the rest of the pieces in the sample. Ten measurements of percentage elongation at break made on a material are: 3.73, 3.59, 3.94, 4.13, 3.04, 2.22, 3.23, 4.05, 4.11, and 2.02. See Fig. 3. Arranged in
ascending order of magnitude, these measurements are: 2.02,
2.22, 3.04, 3.23, 3.59, 3.73, 3.94, 4.05, 4.11, 4.13.
7.5 Criteria for Two or More Outliers on Opposite Sides of
7.6.2.1 The questionable readings are the two lowest, 2.02
the Sample—For suspected observations on both the high and
and 2.22. We can test these two low readings simultaneously
lowsidesinthesample,andtodealwiththesituationinwhich
2 2
by using the S /S criterion of Table 5. For the above
some of k ≥ 2 suspected outliers are larger and some smaller 1,2
measurements:
thantheremainingvaluesinthesample,TietjenandMoore (7)
n
suggest the following statistic. Let the sample values be x , x ,
1 2
2 2
S 5 Σ x 2 x¯ 5 5.351
~ !
i
x , ., x . Compute the sample mean, x¯ , and the n absolute
3 n
i51
residuals:
n n
2 2
S 5 Σ x 2 x¯ 5 1.196, where x¯ 5 Σ x ⁄ n 2 2
~ ! ~ !
1,2 1,2 1,2 i
r 5 x 2 x¯ , r 5 x 2 x¯ , … , r 5 x 2 x¯ (10)
1 ? 1 ? 2 ? 2 ? n ? n ? i53 i53
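TABLE 4 Tietjen-Moore Critical Values (One-Sided Test) for E_k^A
Columns: n, followed by the 10 %, 5 %, and 1 % critical values (α) for k = 1, then k = 2, k = 3, k = 4, and k = 5.
n | k = 1: 10 % 5 % 1 % | k = 2: 10 % 5 % 1 % | k = 3: 10 % 5 % 1 % | k = 4: 10 % 5 % 1 % | k = 5: 10 % 5 % 1 %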
3 0.003 0.001 0.000 . . . . . . . . . . . .
4 0.049 0.025 0.004 0.002 0.001 0.000 . . . . . . . . .
5 0.127 0.081 0.029 0.022 0.010 0.002 . . . . . . . . .
6 0.203 0.145 0.068 0.056 0.034 0.012 0.009 0.004 0.001 . . . . . .
7 0.270 0.207 0.110 0.094 0.065 0.028 0.027 0.016 0.006 . . . . . .
8 0.326 0.262 0.156 0.137 0.099 0.050 0.053 0.034 0.014 0.016 0.010 0.004 . . .
9 0.374 0.310 0.197 0.175 0.137 0.078 0.080 0.057 0.026 0.032 0.021 0.009 . . .
10 0.415 0.353 0.235 0.214 0.172 0.101 0.108 0.083 0.044 0.052 0.037 0.018 0.022 0.014 0.006
11 0.451 0.390 0.274 0.250 0.204 0.134 0.138 0.107 0.064 0.073 0.055 0.030 0.036 0.026 0.012
12 0.482 0.423 0.311 0.278 0.234 0.159 0.162 0.133 0.083 0.094 0.073 0.042 0.052 0.039 0.020
13 0.510 0.453 0.337 0.309 0.262 0.181 0.189 0.156 0.103 0.116 0.092 0.056 0.068 0.053 0.031
14 0.534 0.479 0.374 0.337 0.293 0.207 0.216 0.179 0.123 0.138 0.112 0.072 0.086 0.068 0.042
15 0.556 0.503 0.404 0.360 0.317 0.238 0.240 0.206 0.146 0.160 0.134 0.090 0.105 0.084 0
...


This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because
it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version
of the standard as published by ASTM is to be considered the official document.
Designation: E178 − 21 An American National Standard
Standard Practice for
Dealing With Outlying Observations
This standard is issued under the fixed designation E178; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
Note—Corrections were made to Table 2 and the year date was changed on Sept. 7, 2016.
1. Scope
1.1 This practice covers outlying observations in samples and how to test the statistical significance of outliers.
1.2 The system of units for this standard is not specified. Dimensional quantities in the standard are presented only as illustrations
of calculation methods. The examples are not binding on products or test methods treated.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility
of the user of this standard to establish appropriate safety, health, and environmental practices and determine the
applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization
established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued
by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:
E456 Terminology Relating to Quality and Statistics
E2586 Practice for Calculating and Using Basic Statistics
3. Terminology
3.1 Definitions—Unless otherwise noted in this standard, all terms relating to quality and statistics are defined in Terminology E456.
3.1.1 null hypothesis, $H_0$, n—a statement about a parameter of a probability distribution or about the type of probability
distribution, tentatively regarded as true until rejected using a statistical hypothesis test. E2586
3.1.2 order statistic $x_{(k)}$, n—value of the kth observed value in a sample after sorting by order of magnitude. E2586
3.1.2.1 Discussion—
In this practice, $x_k$ is used to denote order statistics in place of $x_{(k)}$, to simplify the notation.
3.1.3 outlier—see outlying observation.
This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics.
Current edition approved June 1, 2021. Published June 2021. Originally approved in 1961. Last previous edition approved in 2016 as
E178 – 16a. DOI: 10.1520/E0178-21.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards
volume information, refer to the standard’s Document Summary page on the ASTM website.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3.1.4 outlying observation, n—an extreme observation in either direction that appears to deviate markedly in value from other
members of the sample in which it appears.
3.1.4.1 Discussion—
The identification of a value as outlying, and therefore a doubtful observation, is a judgement of the analyst and can be made before
any statistical test.
4. Significance and Use
4.1 An outlying observation, or “outlier,” is an extreme one in either direction that appears to deviate markedly from other
members of the sample in which it occurs.
4.2 Statistical rules test the null hypothesis of no outliers against the alternative of one or more actual outliers. The procedures
covered were developed primarily to apply to the simplest kind of experimental data, that is, replicate measurements of some
property of a given material or observations in a supposedly random sample.
4.3 A statistical test may be used to support a judgment that a physical reason does actually exist for an outlier, or the statistical
criterion may be used routinely as a basis to initiate action to find a physical cause.
5. Procedure
5.1 In dealing with an outlier, the following alternatives should be considered:
5.1.1 An outlying observation might be the result of gross deviation from prescribed experimental procedure or an error in
calculating or recording the numerical value. When the experimenter is clearly aware that a deviation from prescribed experimental
procedure has taken place, the resultant observation should be discarded, whether or not it agrees with the rest of the data and
without recourse to statistical tests for outliers. If a reliable correction procedure is available, the observation may sometimes be
corrected and retained.
5.1.2 An outlying observation might be merely an extreme manifestation of the random variability inherent in the data. If this is
true, the value should be retained and processed in the same manner as the other observations in the sample. Transformation of
data or using methods of data analysis designed for a non-normal distribution might be appropriate.
5.1.3 Test units that give outlying observations might be of special interest. If this is true, once identified they should be segregated
for more detailed study. Outliers may contain important information for a possible root cause analysis and action on the process
or procedure.
5.2 In many cases, evidence for deviation from prescribed procedure will consist primarily of the discordant value itself. In such
cases it is advisable to adopt a cautious attitude. Use of one of the criteria discussed below will sometimes permit a clear-cut
decision to be made.
5.2.1 When the experimenter cannot identify abnormal conditions, they should report the discordant values and indicate to what
extent they have been used in the analysis of the data.
5.3 Thus, as part of the over-all process of experimentation, the process of screening samples for outlying observations and acting
on them is the following:
5.3.1 Physical Reason Known or Discovered for Outlier(s):
5.3.1.1 Reject observation(s) and possibly take additional observation(s).
5.3.1.2 Correct observation(s) on physical grounds.
5.3.2 Physical Reason Unknown—Use Statistical Test:
5.3.2.1 Reject observation(s) and possibly take additional observation(s).
5.3.2.2 Transform observation(s) to improve fit to a normal distribution.
5.3.2.3 Use estimation appropriate for non-normal distributions.
5.3.2.4 Segregate samples for further study.
6. Basis of Statistical Criteria for Outliers
6.1 In testing outliers, the doubtful observation is included in the calculation of the numerical value of a sample criterion (or
statistic), which is then compared with a critical value based on the theory of random sampling to determine whether the doubtful
observation is to be retained or rejected. The critical value is that value of the sample criterion which would be exceeded by chance
with some specified (small) probability on the assumption that all the observations did indeed constitute a random sample from
a common system of causes, a single parent population, distribution or universe. The specified small probability is called the
“significance level” or “percentage point” and can be thought of as the risk of erroneously rejecting a good observation. If a real
shift or change in the value of an observation arises from nonrandom causes (human error, loss of calibration of instrument, change
of measuring instrument, or even change of time of measurements, and so forth), then the observed value of the sample criterion
used will exceed the “critical value” based on random-sampling theory. Tables of critical values are usually given for several
different significance levels. In particular for this practice, significance levels 10, 5, and 1 % are used.
NOTE 1—In this practice, we will usually illustrate the use of the 5 % significance level. Proper choice of level in probability depends on the particular
problem and just what may be involved, along with the risk that one is willing to take in rejecting a good observation, that is, if the null-hypothesis stating
“all observations in the sample come from the same normal population” may be assumed correct.
6.2 Almost all criteria for outliers are based on an assumed underlying normal (Gaussian) population or distribution. The null
hypothesis that we are testing in every case is that all observations in the sample come from the same normal population. In
choosing an appropriate alternative hypothesis (one or more outliers, separated or bunched, on same side or different sides, and
so forth) it is useful to plot the data as shown in the dot diagrams of the figures. When the data are not normally or approximately
normally distributed, the probabilities associated with these tests will be different. The experimenter is cautioned against
interpreting the probabilities too literally.
6.3 Although our primary interest here is that of detecting outlying observations, some of the statistical criteria presented may also
be used to test the hypothesis of normality, that is, that the random sample taken comes from a normal or Gaussian population. The end
result is for all practical purposes the same, that is, we really wish to know whether we ought to proceed as if we have in hand
a sample of homogeneous normal observations.
6.4 One should distinguish between data to be used to estimate a central value and data to be used to assess variability. When
the purpose is to estimate a standard deviation, it might be seriously underestimated by dropping too many “outlying” observations.
7. Recommended Criteria for Single Samples
7.1 Criterion for a Single Outlier—Sort the sample of n observations in order of increasing magnitude, $x_1 \le x_2 \le x_3 \le \dots \le x_n$, called order statistics. Let the largest value, $x_n$, be the doubtful value. The test criterion, $T_n$, for a single outlier is as follows:
$$T_n = (x_n - \bar{x})/s \tag{1}$$
where:
$\bar{x}$ = arithmetic average of all n values, and
s = estimate of the population standard deviation based on the sample data, calculated as follows:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{2}$$
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - n \cdot \bar{x}^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 / n}{n - 1}} \tag{3}$$
If $x_1$ rather than $x_n$ is the doubtful value, the criterion is as follows:
$$T_1 = (\bar{x} - x_1)/s \tag{4}$$
The critical values for either case, for the 1, 5, and 10 % levels of significance, are given in Table 1.
7.1.1 The test criterion $T_n$ can be equated to the Student’s t test statistic for equality of means between a population with one
observation $x_n$ and another with the remaining observations $x_1, \dots, x_{n-1}$, and the critical value of $T_n$ for significance level α can
be approximated using the α/n percentage point of Student’s t with n – 2 degrees of freedom. The approximation is exact for small
enough values of α, depending on n, and otherwise a slight overestimate unless both α and n are large:
$$T_n(\alpha) \le \frac{t_{\alpha/n,\,n-2}}{\sqrt{1 + \dfrac{n\,t_{\alpha/n,\,n-2}^2 - 1}{(n - 1)^2}}} \tag{5}$$
7.1.2 To test outliers on the high side, use the statistic $T_n = (x_n - \bar{x})/s$ and take as critical value the 0.05 point of Table 1. To test
outliers on the low side, use the statistic $T_1 = (\bar{x} - x_1)/s$ and again take as a critical value the 0.05 point of Table 1. If we are
interested in outliers occurring on either side, use the statistic $T_n = (x_n - \bar{x})/s$ or the statistic $T_1 = (\bar{x} - x_1)/s$, whichever is larger.
If in this instance we use the 0.05 point of Table 1 as our critical value, the true significance level would be twice 0.05 or 0.10.
Similar considerations apply to the other tests given below.
TABLE 1 Critical Values for T (One-Sided Test) When Standard Deviation is Calculated from the Same Sample^A
Number of         Upper 10 %      Upper 5 %       Upper 1 %
Observations,     Significance    Significance    Significance
n                 Level           Level           Level
 3                1.1484          1.1531          1.1546
 4                1.4250          1.4625          1.4925
 5                1.602           1.672           1.749
 6                1.729           1.822           1.944
 7                1.828           1.938           2.097
 8                1.909           2.032           2.221
 9                1.977           2.110           2.323
10                2.036           2.176           2.410
11                2.088           2.234           2.485
12                2.134           2.285           2.550
13                2.175           2.331           2.607
14                2.213           2.371           2.659
15                2.247           2.409           2.705
16                2.279           2.443           2.747
17                2.309           2.475           2.785
18                2.335           2.504           2.821
19                2.361           2.532           2.854
20                2.385           2.557           2.884
21                2.408           2.580           2.912
22                2.429           2.603           2.939
23                2.448           2.624           2.963
24                2.467           2.644           2.987
25                2.486           2.663           3.009
26                2.502           2.681           3.029
27                2.519           2.698           3.049
28                2.534           2.714           3.068
29                2.549           2.730           3.085
30                2.563           2.745           3.103
35                2.628           2.811           3.178
40                2.682           2.866           3.240
45                2.727           2.914           3.292
50                2.768           2.956           3.336
^A Values of T are taken from Grubbs (1), Table 1. All values have been adjusted for division by n – 1 instead of n in calculating s. Use Ref. (1) for higher sample sizes up to n = 147.

7.1.3 Example 1—As an illustration of the use of $T_n$ and Table 1, consider the following ten observations on breaking strength (in
pounds) of 0.104-in. hard-drawn copper wire: 568, 570, 570, 570, 572, 572, 572, 578, 584, 596. See Fig. 1. The doubtful
observation is the high value, $x_{10}$ = 596. Is the value of 596 significantly high? The mean is $\bar{x}$ = 575.2 and the estimated standard
deviation is s = 8.70. We compute:
$$T_{10} = (596 - 575.2)/8.70 = 2.39$$
From Table 1, for n = 10, note that a $T_{10}$ as large as 2.39 would occur by chance with probability less than 0.05. In fact, so large
a value would occur by chance not much more often than 1 % of the time. Thus, the weight of the evidence is against the doubtful
value having come from the same population as the others (assuming the population is normally distributed). Investigation of the
doubtful value is therefore indicated.
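The arithmetic of Example 1 can be checked with a short Python sketch. The function name `grubbs_t` and the dictionary `CRIT_N10` are illustrative names introduced here, not part of the practice; the critical values are copied from the n = 10 row of Table 1.

```python
from statistics import mean, stdev

def grubbs_t(data, side="high"):
    """Single-outlier criterion T of Eq (1) (high side) or Eq (4) (low side).

    Hypothetical helper for illustration; s uses the n - 1 divisor, as in Eq (3).
    """
    xbar = mean(data)
    s = stdev(data)
    if side == "high":
        return (max(data) - xbar) / s
    return (xbar - min(data)) / s

# Breaking strengths (pounds) from Example 1.
strengths = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]

# Table 1 critical values for n = 10 (one-sided test).
CRIT_N10 = {"10%": 2.036, "5%": 2.176, "1%": 2.410}

t_high = grubbs_t(strengths, side="high")
print(f"mean = {mean(strengths):.1f}, s = {stdev(strengths):.2f}, T = {t_high:.2f}")
print("significant at 5 %:", t_high > CRIT_N10["5%"])
print("significant at 1 %:", t_high > CRIT_N10["1%"])
```

The computed T lies between the 5 % and 1 % critical values, which matches the conclusion drawn in the example.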
7.2 Dixon Criteria for a Single Outlier—An alternative system, the Dixon criteria (2), based entirely on ratios of differences
between the observations may be used in cases where it is desirable to avoid calculation of s or where quick judgment is called
for. For the Dixon test, the sample criterion or statistic changes with sample size. Table 2 gives the appropriate statistic to calculate
and also gives the critical values of the statistic for the 1, 5, and 10 % levels of significance. In most situations, the Dixon criterion
is less powerful at detecting an outlier than the criterion given in 7.1.
7.2.1 Example 2—As an illustration of the use of Dixon’s test, consider again the observations on breaking strength given in
Example 1. Table 2 indicates use of:
$$r_{11} = (x_n - x_{n-1})/(x_n - x_2) \tag{6}$$
Thus, for n = 10:
$$r_{11} = (x_{10} - x_9)/(x_{10} - x_2) \tag{7}$$
For the measurements of breaking strength above:
$$r_{11} = (596 - 584)/(596 - 570) = 0.462$$
which is a little less than 0.478, the 5 % critical value for n = 10. Under the Dixon criterion, we should therefore not consider
this observation as an outlier at the 5 % level of significance. These results illustrate how borderline cases may be accepted under
one test but rejected under another.
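A minimal Python sketch of the same Dixon calculation follows. The helper name `dixon_r11_high` is an assumption made for illustration; it implements only the high-side form of $r_{11}$ that Table 2 assigns to n = 8 through 10, and the 5 % critical value 0.478 is taken from Table 2.

```python
def dixon_r11_high(data):
    """Dixon r11 for a suspected largest value: (x_n - x_{n-1}) / (x_n - x_2).

    Hypothetical helper; valid only for the sample sizes (n = 8 to 10)
    for which Table 2 prescribes r11.
    """
    x = sorted(data)
    return (x[-1] - x[-2]) / (x[-1] - x[1])

# Breaking strengths (pounds) from Example 1.
strengths = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]

CRIT_R11_N10_5PCT = 0.478  # Table 2, n = 10, 5 % level

r11 = dixon_r11_high(strengths)
print(f"r11 = {r11:.3f}")
print("outlier at 5 % level:", r11 > CRIT_R11_N10_5PCT)
```

Because no standard deviation is needed, the whole test reduces to two subtractions and a division on the sorted sample.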
7.3 Recursive Testing for Multiple Outliers in Univariate Samples—For testing multiple outliers in a sample, recursive application
of a test for a single outlier may be used. In recursive testing, a test for an outlier, $x_1$ or $x_n$, is first conducted. If this is found to
be significant, then the test is repeated, omitting the outlier found, to test the point on the opposite side of the sample, or an
additional point on the same side. The performance of most tests for single outliers is affected by masking, where the probability
of detecting an outlier using a test for a single outlier is reduced when there are two or more outliers. Therefore, the recommended
procedure is to use a criterion designed to test for multiple outliers, using recursive testing to investigate after the initial criterion
is significant.
7.4 Criterion for Two Outliers on Opposite Sides of a Sample—In testing the least and the greatest observations simultaneously
as probable outliers in a sample, use the ratio of sample range to sample standard deviation test of David, Hartley, and Pearson
(5):
$$w/s = (x_n - x_1)/s \tag{8}$$
The significance levels for this sample criterion are given in Table 3. Alternatively, the largest residuals test of Tietjen and Moore
(7.5) could be used.
7.4.1 Example 3—This classic set consists of a sample of 15 observations of the vertical semidiameters of Venus made by
Lieutenant Herndon in 1846 (6). In the reduction of the observations, Prof. Pierce found the following residuals (in seconds of arc)
which have been arranged in ascending order of magnitude. See Fig. 2, above.
FIG. 1 Ten Observations of Breaking Strength from Example 1
The boldface numbers in parentheses refer to a list of references at the end of this standard.
TABLE 2 Dixon Criteria for Testing of Extreme Observation (Single Sample)^A
Criterion, by sample size:
n = 3 to 7: $r_{10} = (x_2 - x_1)/(x_n - x_1)$ if smallest value is suspected; $= (x_n - x_{n-1})/(x_n - x_1)$ if largest value is suspected.
n = 8 to 10: $r_{11} = (x_2 - x_1)/(x_{n-1} - x_1)$ if smallest value is suspected; $= (x_n - x_{n-1})/(x_n - x_2)$ if largest value is suspected.
n = 11 to 13: $r_{21} = (x_3 - x_1)/(x_{n-1} - x_1)$ if smallest value is suspected; $= (x_n - x_{n-2})/(x_n - x_2)$ if largest value is suspected.
n = 14 to 50: $r_{22} = (x_3 - x_1)/(x_{n-2} - x_1)$ if smallest value is suspected; $= (x_n - x_{n-2})/(x_n - x_3)$ if largest value is suspected.

                      Significance Level (One-Sided Test)
n     Criterion       10 %      5 %       1 %
 3    r10             0.886     0.941     0.988
 4    r10             0.679     0.766     0.889
 5    r10             0.558     0.642     0.781
 6    r10             0.484     0.562     0.698
 7    r10             0.434     0.507     0.637
 8    r11             0.480     0.554     0.681
 9    r11             0.440     0.511     0.634
10    r11             0.410     0.478     0.597
11    r21             0.517     0.575     0.674
12    r21             0.490     0.546     0.643
13    r21             0.467     0.521     0.617
14    r22             0.491     0.546     0.641
15    r22             0.470     0.524     0.618
16    r22             0.453     0.505     0.598
17    r22             0.437     0.489     0.580
18    r22             0.424     0.475     0.564
19    r22             0.412     0.462     0.550
20    r22             0.401     0.450     0.538
21    r22             0.391     0.440     0.526
22    r22             0.382     0.430     0.516
23    r22             0.374     0.421     0.506
24    r22             0.366     0.413     0.497
25    r22             0.359     0.406     0.489
26    r22             0.353     0.399     0.482
27    r22             0.347     0.393     0.474
28    r22             0.342     0.387     0.468
29    r22             0.336     0.381     0.462
30    r22             0.332     0.376     0.456
35    r22             0.311     0.354     0.431
40    r22             0.295     0.337     0.412
45    r22             0.283     0.323     0.397
50    r22             0.272     0.312     0.384
^A $x_1 \le x_2 \le \dots \le x_n$. Original Table in Dixon (2), Appendix. Critical values updated by calculations by Bohrer (3) and Verma-Ruiz (4).
FIG. 2 Fifteen Residuals from the Semidiameters of Venus from Example 3
TABLE 3 Critical Values (One-Sided Test) for w/s (Ratio of Range to Sample Standard Deviation)^A
Number of         10 %            5 %             1 %
Observations,     Significance    Significance    Significance
n                 Level           Level           Level
 3                1.9973          1.9993          2.0000
 4                2.409           2.429           2.445
 5                2.712           2.755           2.803
 6                2.949           3.012           3.095
 7                3.143           3.222           3.338
 8                3.308           3.399           3.543
 9                3.449           3.552           3.720
10                3.574           3.685           3.875
11                3.684           3.803           4.011
12                3.782           3.909           4.133
13                3.871           4.005           4.244
14                3.952           4.092           4.344
15                4.025           4.171           4.435
16                4.093           4.244           4.519
17                4.156           4.311           4.597
18                4.214           4.374           4.669
19                4.269           4.433           4.736
20                4.320           4.487           4.799
21                4.368           4.539           4.858
22                4.413           4.587           4.913
23                4.456           4.633           4.965
24                4.497           4.676           5.015
25                4.535           4.717           5.061
26                4.572           4.756           5.106
27                4.607           4.793           5.148
28                4.641           4.829           5.188
29                4.673           4.863           5.226
30                4.704           4.895           5.263
35                4.841           5.040           5.426
40                4.957           5.162           5.561
45                5.057           5.265           5.674
50                5.144           5.356           5.773
^A Each entry calculated by 50 000 000 simulations.

7.4.2 The deviations –1.40 and 1.01 appear to be outliers. Here the suspected observations lie at each end of the sample. The mean
of the deviations is $\bar{x}$ = 0.018, the standard deviation is s = 0.551, and:
$$w/s = [1.01 - (-1.40)]/0.551 = 2.41/0.551 = 4.374$$
From Table 3 for n = 15, we see that the value of w/s = 4.374 falls between the critical values for the 1 and 5 % levels, so if
the test were being run at the 5 % level of significance, we would conclude that this sample contains one or more outliers.
7.4.3 The lowest measurement, –1.40, is 1.418 below the sample mean, and the highest measurement, 1.01, is 0.992 above the
mean. Since these extremes are not symmetric about the mean, either both extremes are outliers, or else only –1.40 is an outlier.
That –1.40 is an outlier can be verified by use of the $T_1$ statistic. We have:
$$T_1 = (\bar{x} - x_1)/s = [0.018 - (-1.40)]/0.551 = 2.574$$
This value is greater than the critical value for the 5 % level, 2.409 from Table 1, so we reject –1.40. Since we have decided
that –1.40 should be rejected, we use the remaining 14 observations and test the upper extreme 1.01, either with the criterion:
$$T_n = (x_n - \bar{x})/s \tag{9}$$
or with Dixon’s $r_{22}$. Omitting –1.40 and renumbering the observations, we compute:
$$\bar{x} = 1.67/14 = 0.119, \quad s = 0.401$$
and:
$$T_{14} = (1.01 - 0.119)/0.401 = 2.22$$
From Table 1, for n = 14, we find that a value as large as 2.22 would occur by chance more than 5 % of the time, so we should
retain the value 1.01 in further calculations. The Dixon test criterion is:
$$r_{22} = (x_{14} - x_{12})/(x_{14} - x_{3}) = (1.01 - 0.48)/(1.01 + 0.24) = 0.53/1.25 = 0.424$$
From Table 2 for n = 14, we see that the 5 % critical value for $r_{22}$ is 0.546. Since our calculated value (0.424) is less than the
critical value, we also retain 1.01 by Dixon’s test, and no further values would be tested in this sample.
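The sequence worked through in 7.4.2 and 7.4.3 (a range test, then a recursive single-outlier test) can be sketched in Python. Note the main assumption: the fifteen residuals are shown only in Fig. 2 and are not listed in this text, so the values below are the ones commonly quoted for this data set elsewhere. They are an assumption here, although they do reproduce the totals 0.27 (mean 0.018) and 1.67 used above. Function names are illustrative, and the critical values are copied from Tables 1 and 3.

```python
from statistics import mean, stdev

# ASSUMPTION: residuals as commonly quoted for the Venus data; not listed in this text.
residuals = [-1.40, -0.44, -0.30, -0.24, -0.22, -0.13, -0.05, 0.06,
             0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01]

CRIT_WS_5PCT = {15: 4.171}            # Table 3, 5 % level
CRIT_T_5PCT = {14: 2.371, 15: 2.409}  # Table 1, upper 5 % level

def range_over_s(x):
    """w/s of Eq (8): sample range divided by sample standard deviation."""
    return (max(x) - min(x)) / stdev(x)

def t_low(x):
    """T_1 of Eq (4), for a suspected smallest value."""
    return (mean(x) - min(x)) / stdev(x)

def t_high(x):
    """T_n of Eq (9), for a suspected largest value."""
    return (max(x) - mean(x)) / stdev(x)

# Step 1: range test on the full sample.
ws = range_over_s(residuals)
print(f"w/s = {ws:.3f}, exceeds 5 % value: {ws > CRIT_WS_5PCT[15]}")

# Step 2: test the low extreme with all 15 observations.
t1 = t_low(residuals)
print(f"T1 = {t1:.3f}, reject -1.40: {t1 > CRIT_T_5PCT[15]}")

# Step 3: drop -1.40 and test the high extreme on the remaining 14.
reduced = sorted(residuals)[1:]
t14 = t_high(reduced)
print(f"T14 = {t14:.2f}, reject 1.01: {t14 > CRIT_T_5PCT[14]}")
```

Each step recomputes the mean and standard deviation from the observations still under test, which is why the second test uses the n = 14 row of Table 1 rather than the n = 15 row.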
7.5 Criteria for Two or More Outliers on Opposite Sides of the Sample—For suspected observations on both the high and low sides
in the sample, and to deal with the situation in which some of k ≥ 2 suspected outliers are larger and some smaller than the
remaining values in the sample, Tietjen and Moore (7) suggest the following statistic. Let the sample values be $x_1, x_2, x_3, \dots, x_n$.
Compute the sample mean, $\bar{x}$, and the n absolute residuals:
$$r_1 = |x_1 - \bar{x}|, \; r_2 = |x_2 - \bar{x}|, \; \dots, \; r_n = |x_n - \bar{x}| \tag{10}$$
Now relabel the original observations $x_1, x_2, \dots, x_n$ as z’s in such a manner that $z_i$ is that x whose $r_i$ is the i-th smallest absolute
residual above. This now means that $z_1$ is that observation x which is closest to the mean and that $z_n$ is the observation x which
is farthest from the mean. The Tietjen-Moore statistic for testing the significance of the k largest residuals is then:
$$E_k = \left[ \sum_{i=1}^{n-k} (z_i - \bar{z}_k)^2 \Big/ \sum_{i=1}^{n} (z_i - \bar{z})^2 \right] \tag{11}$$
where:
$$\bar{z}_k = \sum_{i=1}^{n-k} z_i / (n - k) \tag{12}$$
is the mean of the (n − k) least extreme observations and $\bar{z}$ is the mean of the full sample. Percentage points of $E_k$ in Table 4
were computed by simulation.
7.5.1 Example 4—Applying this test to the Venus semidiameter residuals data in Example 3, we find that the total sum of squares
of deviations for the entire sample is 4.24964. Omitting –1.40 and 1.01, the suspected two outliers, we find that the sum of squares
of deviations for the reduced sample of 13 observations is 1.24089. Then $E_2$ = 1.24089/4.24964 = 0.292, and by using Table 4,
we find that this observed $E_2$ is slightly smaller than the 5 % critical value of 0.317, so that the $E_2$ test would reject both of the
observations, –1.40 and 1.01.
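Equations (10) through (12) translate directly into a few lines of Python. This is a sketch under stated assumptions: the function name `tietjen_moore_e` is hypothetical, and the fifteen residuals are not listed in this text (they appear only in Fig. 2), so the values used are the ones commonly quoted for this data set. They reproduce the two sums of squares given in Example 4. The critical value 0.317 is the one quoted in the example.

```python
from statistics import mean

def tietjen_moore_e(data, k):
    """Tietjen-Moore E_k of Eq (11) for the k largest absolute residuals."""
    xbar = mean(data)
    # Relabel as z: sort by absolute residual, closest to the mean first (Eq 10).
    z = sorted(data, key=lambda v: abs(v - xbar))
    kept = z[:len(z) - k]          # the n - k least extreme observations
    zbar_k = mean(kept)            # Eq (12)
    reduced_ss = sum((v - zbar_k) ** 2 for v in kept)
    total_ss = sum((v - xbar) ** 2 for v in z)
    return reduced_ss / total_ss

# ASSUMPTION: residuals as commonly quoted for the Venus data; not listed in this text.
residuals = [-1.40, -0.44, -0.30, -0.24, -0.22, -0.13, -0.05, 0.06,
             0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01]

CRIT_E2_N15_5PCT = 0.317  # 5 % critical value quoted in Example 4

e2 = tietjen_moore_e(residuals, k=2)
print(f"E_2 = {e2:.3f}")
# For this test, small values are significant.
print("reject both suspected values at 5 %:", e2 < CRIT_E2_N15_5PCT)
```

Sorting by absolute residual is what lets the statistic pick up suspected values on either side of the sample without the analyst specifying which side each one lies on.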
7.6 Criterion for Two Outliers on the Same Side of the Sample—Where the two largest or the two smallest observations are
probable outliers, employ a test provided by Grubbs (8, 9) which is based on the ratio of the sample sum of squares when the two
doubtful values are omitted to the sample sum of squares when the two doubtful values are included. In illustrating the test
procedure, we give the following Examples 5 and 6.
7.6.1 It should be noted that the critical values in Table 5 for the 1 % level of significance are smaller than those for the 5 % level.
So for this particular test, the calculated value is significant if it is less than the chosen critical value.
7.6.2 Example 5—In a comparison of strength of various plastic materials, one characteristic studied was the percentage elongation
at break. Before comparison of the average elongation of the several materials, it was desirable to isolate for further study any
pieces of a given material which gave very small elongation at breakage compared with the rest of the pieces in the sample. Ten
measurements of percentage elongation at break made on a material are: 3.73, 3.59, 3.94, 4.13, 3.04, 2.22, 3.23, 4.05, 4.11, and
2.02. See Fig. 3. Arranged in ascending order of magnitude, these measurements are: 2.02, 2.22, 3.04, 3.23, 3.59, 3.73, 3.94, 4.05,
4.11, 4.13.
7.6.2.1 The questionable readings are the two lowest, 2.02 and 2.22. We can test these two low readings simultaneously by using
the $S_{1,2}^2/S^2$ criterion of Table 5. For the above measurements:
$$S^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = 5.351$$
$$S_{1,2}^2 = \sum_{i=3}^{n} (x_i - \bar{x}_{1,2})^2 = 1.196, \quad \text{where } \bar{x}_{1,2} = \sum_{i=3}^{n} x_i / (n - 2)$$
$$S_{1,2}^2 / S^2 = 1.197/5.351 = 0.2237$$
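The sums of squares above can be reproduced with the following Python sketch, using the ten elongation measurements from Example 5. The helper name `sum_sq_dev` is an assumption for illustration, and 0.2305 is the 5 % value for n = 10 that the text quotes from Table 5. Working from unrounded sums gives a ratio of about 0.2236, which differs from the value above only in the last digit because the example divides rounded figures.

```python
from statistics import mean

def sum_sq_dev(values):
    """Sum of squared deviations from the mean of the given values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# Percentage elongation at break, from Example 5.
elongation = [3.73, 3.59, 3.94, 4.13, 3.04, 2.22, 3.23, 4.05, 4.11, 2.02]

x = sorted(elongation)
s2_full = sum_sq_dev(x)          # S^2, all n observations
s2_reduced = sum_sq_dev(x[2:])   # S^2_{1,2}, two smallest omitted
ratio = s2_reduced / s2_full

CRIT_N10_5PCT = 0.2305  # 5 % value for n = 10 quoted from Table 5

print(f"S^2 = {s2_full:.3f}, ratio = {ratio:.4f}")
# For this criterion, a ratio below the critical value is significant.
print("both low readings significant at 5 %:", ratio < CRIT_N10_5PCT)
```

To test the two largest values instead, the same function applies with `x[:-2]` as the reduced sample.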
From Table 5 for n = 10, the 5 % significance level for $S_{1,2}^2/S^2$ is 0.2305. Since the calculated value is less than the critical value,
we should conclude that both 2.02 and 2.22 are outliers. In a situation such as the one described in this example, where the outliers
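TABLE 4 Tietjen-Moore Critical Values (One-Sided Test) for E_k^A
Columns: n, followed by the 10 %, 5 %, and 1 % critical values (α) for k = 1, then k = 2, k = 3, k = 4, and k = 5.
n | k = 1: 10 % 5 % 1 % | k = 2: 10 % 5 % 1 % | k = 3: 10 % 5 % 1 % | k = 4: 10 % 5 % 1 % | k = 5: 10 % 5 % 1 %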
3 0.003 0.001 0.000 . . . . . . . . . . . .
4 0.049 0.025 0.004 0.002 0.001 0.000 . . . . . . . . .
5 0.127 0.081 0.029 0.022 0.010 0.002 . . . . . . . . .
6 0.203 0.145 0.068 0.056 0.034 0.012 0.009 0.004 0.001 . . . . . .
7 0.270 0.207 0.110 0.094 0.065 0.028 0.027 0.016 0.006 . . . . . .
8 0.326 0.262 0.156 0.137 0.099 0.050 0.053 0.034 0.014 0.016 0.010 0.004 . . .
9 0.374 0.310 0.197 0.175 0.137 0.078 0.080 0.057 0.026 0.032 0.021 0.009 . . .
10 0.415 0.353 0.235 0.214 0.172 0.101 0.108 0.083 0.044 0.052 0.037 0.018 0.022 0.014 0.006
11 0.451 0.390 0.274 0.250 0.204 0.134 0.138 0.107 0.064 0.073 0.055 0.030 0.036 0.026 0.012
12 0.482 0.423 0.311 0.278 0.234 0.159 0.162 0.1
...
