
7/20/2018

Detection of Outliers in Analytical Data
The Grubbs Test and Dixon's Q-test

• Often, a data set of replicates contains a value that seems far removed from the other values
• such a value is suspected of not being accountable by random error alone
• such data are known as outliers

• many statistical techniques used for the treatment of quantitative data are sensitive to the presence of outliers
• simple calculations (e.g. mean, standard deviation) on a set of data may be distorted by even a single outlying point
• checking for outliers should therefore be a routine part of any data analysis

Q-test
• still used
• likely to continue to be used for some time

$$Q_{\text{calc}} = \frac{\text{suspected value} - \text{nearest neighbor}}{\text{largest value} - \text{smallest value}} = \frac{\text{gap}}{\text{range}}$$
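The gap/range ratio for the Q-test can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `q_calc` and the sample readings are hypothetical, and taking the absolute value of the gap (so that a suspiciously low value is treated like a high one) is an assumption not stated on the slide.

```python
def q_calc(values, suspect):
    """Return gap / range for a suspect value at either end of the data.

    Hypothetical helper: the name and the input checks are illustrative.
    """
    ordered = sorted(values)
    if len(ordered) < 3:
        raise ValueError("need at least three values")
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical")
    if suspect == ordered[-1]:
        nearest = ordered[-2]   # neighbor of the largest value
    elif suspect == ordered[0]:
        nearest = ordered[1]    # neighbor of the smallest value
    else:
        raise ValueError("suspect must be the smallest or largest value")
    # Assumption: absolute value, so a low-side suspect gives a positive ratio.
    return abs(suspect - nearest) / data_range


# Hypothetical replicate readings, for illustration only.
readings = [10.1, 10.2, 10.3, 11.0]
print(round(q_calc(readings, 11.0), 3))
```

The function only computes the ratio; comparing it with a tabulated critical value is a separate step.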

• good judgment and common sense are the most valuable 'tools' (to accept or reject a datum) before calculating the average value of the set of replicates
• statistical tests that can also be applied: Grubbs test and Dixon's Q-test

Grubbs test
• now recommended by some professional organizations
• more robust test for the detection of outliers
• now considered a more accurate test

$$G_{\text{calc}} = \frac{\text{suspect value} - \bar{x}}{s}$$
where $\bar{x}$ = average
      $s$ = standard deviation
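The Grubbs statistic can be sketched the same way. This is an illustrative sketch under stated assumptions: the name `g_calc` and the sample readings are hypothetical, the absolute value is an assumption so that low-side suspects are handled too, and the standard deviation is taken to be the sample standard deviation (n − 1 denominator), which the slide does not specify.

```python
from statistics import mean, stdev


def g_calc(values, suspect):
    """Return |suspect - mean| / s for the full data set (suspect included).

    Hypothetical helper; stdev() is the sample standard deviation (n - 1).
    """
    x_bar = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("standard deviation is zero; all values are identical")
    # Assumption: absolute value, so a low-side suspect gives a positive G.
    return abs(suspect - x_bar) / s


# Hypothetical replicate readings, for illustration only.
readings = [10.1, 10.2, 10.3, 11.0]
print(round(g_calc(readings, 11.0), 3))
```

Note that the mean and standard deviation are computed with the suspect value still in the data set.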


Critical Values for rejection of outliers

Number of observations    G (95% confidence)    Q (90% confidence)
 4                        1.463                 0.76
 5                        1.672                 0.64
 6                        1.822                 0.56
 7                        1.938                 0.51
 8                        2.032                 0.47
 9                        2.110                 0.44
10                        2.176                 0.41
11                        2.234
12                        2.285
15                        2.409
20                        2.557

If Qcalc (or Gcalc) > Qcritical (or Gcritical), reject datum.
If Qcalc (or Gcalc) < Qcritical (or Gcritical), retain datum.

Problem:
Consider the data set: 5.43, 5.39, 5.47, 5.88, and 5.42. Here, the value 5.88 seems to lie far away from the other values. Should the value be retained or rejected?

Solution: In either the Grubbs test or the Q-test, rank the data set from smallest to largest value.

5.39, 5.42, 5.43, 5.47, 5.88

Solution 1 (Q-test):

$$Q_{\text{calc}} = \frac{\text{suspected value} - \text{nearest neighbor}}{\text{largest value} - \text{smallest value}} = \frac{\text{gap}}{\text{range}}$$

$$Q_{\text{calc}} = \frac{5.88 - 5.47}{5.88 - 5.39} = 0.837$$

• From the table of critical values: Qcritical (for 90% confidence and 5 observations) = 0.64.
• Since Qcalc (0.837) > Qcritical (0.64), the value of 5.88 should be rejected.
• The average of the remaining four data points should then be considered.

Solution 2 (Grubbs test):

$$G_{\text{calc}} = \frac{\text{suspect value} - \bar{x}}{s}$$
where $\bar{x}$ = average value of the data set
      $s$ = standard deviation

In this case, $\bar{x}$ = 5.518 and $s$ = 0.204.

Hence,
$$G_{\text{calc}} = \frac{5.88 - 5.518}{0.204} \approx 1.77$$
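The arithmetic in the worked problem can be checked with a few lines of Python. This is only a numerical check of the values quoted in the solution, assuming the sample standard deviation (n − 1 denominator); the variable names are illustrative.

```python
from statistics import mean, stdev

data = [5.43, 5.39, 5.47, 5.88, 5.42]
ordered = sorted(data)          # rank from smallest to largest
suspect = ordered[-1]           # 5.88, the value that looks too high

# Q-test: gap to the nearest neighbor divided by the range.
gap = suspect - ordered[-2]
data_range = ordered[-1] - ordered[0]
q = gap / data_range

# Grubbs test: distance from the mean in units of the standard deviation.
x_bar = mean(data)
s = stdev(data)                 # sample standard deviation (n - 1)
g = (suspect - x_bar) / s

print(f"ordered = {ordered}")
print(f"Q = {q:.3f}")
print(f"mean = {x_bar:.3f}, s = {s:.3f}")
print(f"G = {g:.3f}")
```

With the unrounded standard deviation carried through, this gives Q ≈ 0.837, a mean of 5.518, s ≈ 0.204 and G ≈ 1.771.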


From the table of critical values (for 95% confidence level and 5 observations), Gcritical = 1.672.
Since Gcalc (≈ 1.77) > Gcritical (1.672), the datum may be rejected.

Note:
• neither the Grubbs test nor the Q-test should be applied a second time if a datum has already been rejected.
• that is, once the value 5.88 in the data set above has been rejected, you cannot apply either test a second time to consider whether the value 5.47 might also be an outlier.

• Consider the data set: 0.5980, 0.5993, 0.5995, 0.5997, 0.601, 0.6400.
Is there an outlier in the set?
Determine the outlier by Q-test and Grubbs test. Calculate the average and the relative standard deviation.
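The decision rule (compare the calculated statistic with the tabulated critical value) and the note about applying a test only once can be sketched in Python as follows. This is a hedged sketch, not a definitive implementation: the critical values are copied from the table in this handout, while the function name `screen_once`, the dictionary names, and the automatic choice of the most extreme value as the suspect are assumptions made for illustration.

```python
from statistics import mean, stdev

# Critical values copied from the table in this handout.
G_CRIT_95 = {4: 1.463, 5: 1.672, 6: 1.822, 7: 1.938, 8: 2.032, 9: 2.110,
             10: 2.176, 11: 2.234, 12: 2.285, 15: 2.409, 20: 2.557}
Q_CRIT_90 = {4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def screen_once(values, test="Q"):
    """Test the most extreme value once (hypothetical helper).

    Returns (suspect, statistic, critical, reject). The function does not
    loop: after a rejection it is not meant to be called again on the
    reduced data set.
    """
    if test not in ("Q", "G"):
        raise ValueError("test must be 'Q' or 'G'")
    ordered = sorted(values)
    n = len(ordered)
    table = Q_CRIT_90 if test == "Q" else G_CRIT_95
    if n not in table:
        raise ValueError(f"no tabulated critical value for n = {n}")
    if ordered[0] == ordered[-1]:
        raise ValueError("all values are identical")

    if test == "Q":
        # Assumption: the suspect is the end value with the larger gap.
        low_gap = ordered[1] - ordered[0]
        high_gap = ordered[-1] - ordered[-2]
        suspect = ordered[0] if low_gap > high_gap else ordered[-1]
        stat = max(low_gap, high_gap) / (ordered[-1] - ordered[0])
    else:
        # Assumption: the suspect is the value farthest from the mean.
        x_bar = mean(ordered)
        s = stdev(ordered)      # sample standard deviation (n - 1)
        suspect = max(ordered, key=lambda v: abs(v - x_bar))
        stat = abs(suspect - x_bar) / s

    critical = table[n]
    return suspect, stat, critical, stat > critical


# Worked problem from this handout; each test is applied exactly once.
data = [5.43, 5.39, 5.47, 5.88, 5.42]
for test in ("Q", "G"):
    suspect, stat, critical, reject = screen_once(data, test)
    verdict = "reject" if reject else "retain"
    print(f"{test}-test: suspect={suspect}, calc={stat:.3f}, "
          f"critical={critical}, {verdict}")
```

For the worked data set both tests flag 5.88 for rejection. The lookup raises an error for sample sizes the table does not list rather than interpolating, since the handout gives no values for them.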
