• The critical assessment of analytical data is central to quality
assurance.
Some Important Terms
Replicate determinations:
A single measurement cannot be taken as an accurate result.
We therefore determine how many times a measurement should
be replicated so that the experimental mean falls within a
stated range around the true mean with a chosen degree of
probability.
Our confidence in an analytical result is increased by
increasing the number of parallel determinations, known as
replicate determinations.
That is, the more numerous the observations, the more
closely their results approach the truth.
The mean is the most commonly used measure
of the central value; the less commonly used
measures are the median and the mode.
Second, an analysis of the variation in results
helps us to estimate the uncertainty associated
with the central value of the data.
You should note that a population is the
collection of all measurements (a very large
number, tending to infinity for the analyst), while a
sample is the subset of these measurements
(a finite number of measurements) selected from
the population, also called a finite sample.
3.1 Mean, Median, Standard Deviation, Variance
• Mean
• The mean is the most widely used measure of the central
value.
• For a finite sample (n < 30) the mean, known as the sample
mean and represented by x̄, is the arithmetic average of all
the observations in the set of data:
x̄ = (x1 + x2 + … + xn)/n = Σxi/n
• Example.3.1
• What is the mean for the data in Table 3.1?
• SOLUTION
• To calculate the mean, we add the results of all measurements
3.080 + 3.094 + 3.107 + 3.056 + 3.112 + 3.174 + 3.198 =
21.821 and divide by the number of measurements:
x̄ = 21.821/7 = 3.117
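The arithmetic above can be checked with a short script (a minimal sketch; the data are the seven results from Table 3.1):

```python
# Sample mean: x_bar = sum(x_i) / n, using the Table 3.1 results
data = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]

total = sum(data)          # sum of all measurements
mean = total / len(data)   # divide by the number of measurements

print(round(total, 3))
print(round(mean, 3))
```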
Median
• Median (M): the middle value of an odd number of results
listed in order of magnitude, or the average of the two middle
values for an even number of results.
• Example: For X1, X2, X3, X4, X5, where X1 < X2 < X3 < X4 < X5, M = X3
• For X1, X2, X3, X4, X5, X6, where X1 < X2 < X3 < X4 < X5 < X6,
M = (X3 + X4)/2
• An outlying result does not affect the median, since the
outlying result lies at the extremes. On the other hand, an
outlying result can have a significant effect on the mean of the
set, since it is included in the calculation of the mean.
• Example.3.2
• What is the median for the data in Table 3.1?
SOLUTION
• To determine the median, we order the data from the
smallest to the largest value
3.056 3.080 3.094 3.107 3.112 3.174 3.198
• Since there is a total of seven measurements, the median is
the fourth value in the ordered data set; thus, the median is
3.107.
Mode
• The observation which occurs most frequently (i.e. with
maximum frequency) in a series of observations is known as the
mode.
• For example, in the data set 12.6, 12.7, 12.9, 12.7, 12.6, 12.8,
13.0, 12.5, 12.6, the value 12.6 is the mode, since it occurs with
maximum frequency (three times).
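Both statistics can be computed with the standard library; a minimal sketch using the Table 3.1 data for the median and the example data above for the mode:

```python
from collections import Counter
from statistics import median

# Median of the Table 3.1 data (odd n, so the middle value of the sorted list)
table_3_1 = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]
print(median(table_3_1))

# Mode: the value occurring with maximum frequency
mode_data = [12.6, 12.7, 12.9, 12.7, 12.6, 12.8, 13.0, 12.5, 12.6]
value, count = Counter(mode_data).most_common(1)[0]
print(value, count)
```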
Historically the average deviation was widely employed as the estimate of
precision. However, it suffers from the disadvantage that this statistic
depends upon the number of measurements: the larger the number, the better
the estimate.
Standard Deviation
• Standard deviation is the most important statistic to indicate the
precision of an analysis.
σ = √( Σ Di² / N ) = √( Σ (xi − μ)² / N )
• where xi represents the individual observations, Di = xi − μ the individual
deviations, μ the population mean, N the number of observations,
and the symbol Σ denotes the summation for i = 1 to i = N values.
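As a sketch, the population form (denominator N) and the sample form (denominator n − 1) of the standard deviation can be computed directly; the data are the Table 3.1 results:

```python
import math

def population_sd(values):
    """sigma = sqrt(sum((x_i - mu)^2) / N), used when all N members are known."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_sd(values):
    """s = sqrt(sum((x_i - x_bar)^2) / (n - 1)), used for a finite sample."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

data = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]
print(round(sample_sd(data), 4))
```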
Standard Deviation of the Mean
We know that the arithmetic mean of a series of n
measurements is more reliable (precise) than an
individual observation.
It can be shown statistically that the mean of n results
is √n times as reliable as any one of the individual
results; the standard deviation of the mean is s/√n.
Precision is expressed in terms of deviation, and the
smaller the deviation the more precise the result.
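The standard deviation of the mean (often called the standard error) falls as √n; a minimal sketch using the Table 3.1 data:

```python
import math
from statistics import stdev

data = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]
s = stdev(data)                   # sample standard deviation
sem = s / math.sqrt(len(data))    # standard deviation of the mean, s / sqrt(n)
print(round(sem, 4))
```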
• Pooled standard deviation (sp)
• When we wish to calculate a standard deviation for a number of sets
of analytical data obtained from several samples of varying
composition, it is preferable to use the pooled standard deviation, sp.
That is, rather than relying on a single set of data to describe the
precision of a method, it is sometimes preferable to perform several
sets of analyses, for example, on different days, or on different
samples with slightly different compositions.
• If the indeterminate (random) error is assumed to be the same
for each set (assume the same source of random error in all the
measurements), then the precision of data of the different sets
can be pooled.
• This assumption is usually valid if the samples have similar
composition and have been analyzed in exactly the same way.
• This provides a more reliable estimate of the precision of a
method than is obtained from a single set.
• In the pooled standard deviation calculation, one degree of
freedom is lost for each subset.
• Thus, the number of degrees of freedom for the pooled s is
equal to the total number of measurements minus the number
of subsets.
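The rule above can be sketched in a few lines. This is a minimal illustration, and the three replicate sets below are hypothetical (e.g. the same analysis run on three different days):

```python
import math

def pooled_sd(*data_sets):
    """Pooled standard deviation: squared deviations from each set's own
    mean are summed, then divided by (total measurements - number of subsets)."""
    ss = 0.0       # pooled sum of squared deviations
    n_total = 0    # total number of measurements
    for values in data_sets:
        m = sum(values) / len(values)
        ss += sum((x - m) ** 2 for x in values)
        n_total += len(values)
    dof = n_total - len(data_sets)   # one degree of freedom lost per subset
    return math.sqrt(ss / dof)

# Hypothetical replicate sets from three days of analysis
day1 = [10.1, 10.3, 10.2]
day2 = [9.8, 10.0, 9.9, 10.1]
day3 = [10.4, 10.2, 10.3]
print(round(pooled_sd(day1, day2, day3), 3))
```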
• 3.2 Accuracy and Precision of measurements
• A dart board is a good way to illustrate precision and accuracy.
Fig. 1 Precision
Precision is the strength of agreement between replicate
measurements. It tells us how close multiple values are to each
other. It refers to the magnitude of random errors and the
reproducibility of measurements.
Figure 1 illustrates a series of results that are very close to each
other, i.e. have good precision.
Measuring Precision
Precision is usually discussed in terms of standard deviation (SD)
and percent coefficient of variation (%CV).
Fig. 2 Accuracy
Accuracy is a measure of the agreement between the estimates of a value and the
“true” value.
• Accuracy refers to how close a value is to the “true” value.
• Figure 2 illustrates a series of results that are accurate, i.e. close to the true value.
• Accuracy is expressed in terms of either absolute or relative error.
• a) Absolute error (E): the difference between the measured value and the
accepted true value. It carries a sign (positive or negative); a negative sign
indicates the experimental result is smaller than the accepted value: E = Xi − Xt,
where Xi is the measured value and Xt is the accepted true value.
• b) Relative error (Er): the absolute error divided by the true
value, often expressed as a percentage:
Er = (Xi − Xt)/Xt × 100%
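As a minimal sketch (the measured and true values below are assumed purely for illustration):

```python
# Absolute and relative error for a hypothetical measurement
x_measured = 19.8   # measured value Xi (assumed for illustration)
x_true = 20.0       # accepted true value Xt (assumed for illustration)

E = x_measured - x_true      # absolute error, keeps its sign
Er = E / x_true * 100        # relative error, as a percentage

print(round(E, 3), round(Er, 2))
```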
• Examples include estimating the position of a pointer between two
scale divisions, the color of a solution at the end point of a
titration, or the level of liquid with respect to a graduation in a pipette
or burette.
• Some examples of personal errors include the following:
mechanical loss of materials in various steps of analysis, under-
washing or over-washing a precipitate, ignition of a precipitate at an
incorrect temperature, insufficient cooling of crucibles before
weighing, allowing hygroscopic materials to absorb moisture
before or during weighing,
• errors during transfer of solutions, effervescence and ‘bumping’
during sample dissolution, incomplete drying of samples, and
mathematical errors in calculation and prejudice in estimating
measurements.
• Most personal errors can be minimized by care and self-discipline.
b) Instrumental errors
• These arise from faulty construction of balances, the use of
uncalibrated or improperly calibrated weights, graduated glassware
and other instruments.
• Generally they include faulty instruments, uncalibrated weights
and uncalibrated glassware.
• Instrument errors are caused by imperfections in measuring devices
and instabilities in their power supplies; all measuring devices are
potential sources of systematic errors. For example, pipettes, burettes and
volumetric flasks may hold or deliver volumes slightly different
from those indicated by their graduations.
• These differences may arise from using glassware at a temperature
that differs significantly from the calibration temperature, from
distortion in container walls due to heating while drying, from
errors in the original calibration, or from contaminants on the inner
surfaces of the container.
• Systematic instrument errors are usually found and corrected by
calibrations.
• Periodic calibration of equipment is always desirable because the
response of most instruments changes with time as a result of wear,
corrosion or mistreatment.
c) Method errors
• Method errors often arise from non-ideal chemical or physical
behavior of the analytical system.
• The non-ideal chemical or physical behavior of the reagents and
reactions upon which an analysis is based often introduces
systematic method errors.
• Such sources of non-ideality include
the slowness of some reactions,
• However, errors inherent in a method are often difficult to detect
and are thus the most serious of the three types of systematic
errors.
• Of the three types of systematic errors encountered in chemical
analysis, method errors are usually the most difficult to identify
and correct.
• Sometimes correction can be relatively simple, for example, by
running a blank titration. Bias in an analytical method is
particularly difficult to detect.
• 2.3.1.2 Random or indeterminate errors
• Random or indeterminate errors arise because every physical or
chemical measurement is affected by many uncontrollable variables
that are an inevitable part of the measurement process.
• They are due to causes over which the analyst has no control, and
which are so intangible that they are incapable of analysis.
• They have no specific causes. There are many contributors to
random error, but none can be positively identified or measured,
because most are so small that they cannot be detected individually.
• Indeterminate errors are random and cannot be avoided. They
accompany every measurement, are due to non-permanent causes,
and include the noise present in the measurement.
• Random fluctuations of electronic signals appearing in a
recorded spectrum are one example.
• Various types of random noise may occur in measurements, such
as electronic noise in a detector or noise due to non-reproducible
placement of a sample cuvette in the cell holder of a
spectrophotometer.
• Random errors represent the experimental uncertainty that occurs
in any measurement. They are revealed by small differences
in successive measurements made by the same analyst under
virtually identical conditions, and they cannot be predicted or
estimated.
• Accidental errors follow a random distribution; therefore, the
mathematical laws of probability can be applied to arrive at some
conclusion regarding the most probable result of a series of
measurements. If a sufficiently large number of observations
(measurements) is taken, it can be shown that these errors lie on
the normal or Gaussian curve.
• An inspection of this normal (Gaussian) error curve shows that a)
small errors occur more frequently than large ones and b) positive
and negative errors of the same numerical magnitude are
equally likely to occur.
• Random or indeterminate error causes data to be scattered
more or less symmetrically around a mean value. They are
bidirectional (positive and negative), and therefore affect the
results irregularly.
• Random errors are decreased to a certain extent by increasing
the number of measurements, but they can’t be eliminated,
since an infinite number of measurements would be required.
• In general, random error in measurement is reflected by its
precision. The total error observed in any chemical analysis is a
combination of the determinate and the random error.
Some of the features of systematic and random errors are shown in
the table below.
Quiz
The reproducibility of a method for the determination of selenium
in foods was investigated by taking nine samples from a single
batch of brown rice and determining the selenium concentration
in each. The following results were obtained:
0.07 0.07 0.08 0.07 0.07 0.08 0.08 0.09 0.08 μg g−1
Calculate the mean, standard deviation and relative standard
deviation of these results.
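A sketch of the requested calculation (a worked check, not an official answer key), using the sample standard deviation since the nine results are a finite sample:

```python
from statistics import mean, stdev

# Selenium results (micrograms per gram) for nine brown rice samples
se = [0.07, 0.07, 0.08, 0.07, 0.07, 0.08, 0.08, 0.09, 0.08]

m = mean(se)
s = stdev(se)          # sample standard deviation (n - 1 denominator)
rsd = s / m * 100      # relative standard deviation, as a percentage

print(round(m, 3), round(s, 4), round(rsd, 1))
```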
Confidence limit and test of significance
• The mean of a set of analytical results is an estimate of the true mean for
the analysis. The true mean is the mean result of an infinite number of
analyses.
• Those upper and lower boundaries are the confidence limits. As
the number of analyses increases, the mean of the results
approaches the actual mean and confidence limits around the
mean move closer together.
• Figure 2.6 shows the sampling distribution of the mean for
samples of size n. If we assume that this distribution is normal,
then 95% of the sample means will lie in the range given by:
μ − 1.96σ/√n < x̄ < μ + 1.96σ/√n
For large samples, the confidence limits of the mean are given by
x̄ ± z·s/√n
• where the value of z depends on the degree of confidence
required (z = 1.96 for 95%, z = 2.58 for 99%).
The term ‘degrees of freedom’ refers to the number of independent
deviations which are used in calculating s. In this case the number is (n
− 1), because when (n − 1) deviations are known the last can be
deduced, since Σ(xi − x̄) = 0.
• Example 2.6.1
• Calculate the 95% and 99% confidence limits of the mean for
the nitrate ion concentration measurements in Table 2.1.
• We have x̄ = 0.500, s = 0.0165 and n = 50. Using the equation
above gives the 95% confidence limits as:
0.500 ± 1.96 × 0.0165/√50 = 0.500 ± 0.0046
and the 99% confidence limits as:
0.500 ± 2.58 × 0.0165/√50 = 0.500 ± 0.0060
Example
• The sodium ion content of a urine specimen was determined by
using an ion-selective electrode. The following values were
obtained: 102, 97, 99, 98, 101, 106 mM. What are the 95% and
99% confidence limits for the sodium ion concentration?
• The mean and standard deviation of these values are 100.5 mM
and 3.27 mM respectively.
• There are six measurements and therefore 5 degrees of freedom.
• From Table A.2 the value of t5 for calculating the 95%
confidence limits is 2.57, and the 95% confidence limits of the
mean are given by:
100.5 ± 2.57 × 3.27/√6 = 100.5 ± 3.4 mM
Similarly, with t5 = 4.03 the 99% confidence limits are 100.5 ± 5.4 mM.
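The calculation for this example can be sketched as follows. The t values are hard-coded here as the tabulated values for 5 degrees of freedom (taken as 2.57 and 4.03, matching the table cited in the text):

```python
import math
from statistics import mean, stdev

sodium = [102, 97, 99, 98, 101, 106]   # mM

m = mean(sodium)
s = stdev(sodium)
n = len(sodium)

# Tabulated Student's t for 5 degrees of freedom (assumed from Table A.2)
t_95, t_99 = 2.57, 4.03

half_95 = t_95 * s / math.sqrt(n)   # half-width of the 95% interval
half_99 = t_99 * s / math.sqrt(n)   # half-width of the 99% interval

print(round(m, 1), round(half_95, 1), round(half_99, 1))
```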
• Confidence intervals can be used as a test for systematic errors, as
shown in the following example.
• The absorbance scale of a spectrometer is tested at a particular
wavelength with a standard solution which has an absorbance
given as 0.470. Ten measurements of the absorbance with the
spectrometer give x̄ = 0.461 and s = 0.003. Find the 95%
confidence interval for the mean absorbance as measured by the
spectrometer, and hence decide whether a systematic error is
present. The 95% confidence limits for the absorbance as
measured by the spectrometer are:
0.461 ± 2.26 × 0.003/√10 = 0.461 ± 0.002
Since the stated absorbance, 0.470, lies outside this interval, a
systematic error is indicated.
Tests of significance:
• Experimental data rarely agree completely with those expected on
the basis of a theoretical model.
• Tests of this kind make use of the null hypothesis, which assumes that
the numerical quantities being compared are, in fact, the same. The
probability of the observed differences appearing as a result of
random errors is then computed from statistical theory.
• Other probability levels, such as 1 in 100 (99% confidence) or
1 in 1000 (99.9% confidence), may also be adopted, depending
upon the certainty desired in the judgment.
a) Comparing the precision of two
measurements: the F-test
• The F-test provides a simple method for comparing the
precision of two sets of measurements.
• This test is designed to indicate whether there is a significant
difference between two methods based on their standard
deviations.
• The sets do not necessarily have to be obtained from the same
sample as long as the samples are sufficiently alike that the
sources of random error can be assumed to be the same.
• F is defined as the ratio of the variances of the two methods,
where the variance is the square of the standard deviation:
F = s1²/s2²
• where s1 > s2. The larger s is always used as the numerator so that the value
of F is greater than unity. There are two different degrees of freedom, ν1
and ν2, defined as N − 1 for each set.
• If the calculated F value exceeds the tabulated F value at the selected
confidence level, then there is a significant difference between the
variances of the methods, i.e. their precisions differ significantly.
• However, if the calculated F value is less than the tabulated F value at
the selected confidence level, then there is no statistically significant
difference between the variances, and the scatter of the
measurements is due only to random errors.
• Example
• A proposed method for the determination of the chemical oxygen
demand of wastewater was compared with the standard (mercury salt)
method. The following results were obtained for a sewage effluent
sample:
• Example
• The standard deviation, sA, from one set of 11 determinations was
0.210 and the standard deviation, sB, from another 9 determinations
was 0.641. Is there any significant difference between the precision
of the two sets of results at 95% confidence level?
• Solutions:
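A sketch of the calculation (the tabulated critical F value must still be looked up separately to finish the test):

```python
def f_statistic(s1, s2):
    """F-test statistic: larger variance over smaller, so that F >= 1."""
    v1, v2 = s1 ** 2, s2 ** 2
    return max(v1, v2) / min(v1, v2)

sA, nA = 0.210, 11   # first set: 11 determinations
sB, nB = 0.641, 9    # second set: 9 determinations

F = f_statistic(sA, sB)
dof = (nB - 1, nA - 1)   # degrees of freedom: numerator (larger s), denominator
print(round(F, 2), dof)
```

The computed F is then compared against the tabulated value for (8, 10) degrees of freedom at the chosen confidence level.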
• The F-test is used to determine whether two variances are statistically
different.
• The tabulated F value for ν1 = 6 and ν2 = 5 is 4.95. Since the
calculated value is less than the tabulated value, there is no
significant difference in the precision of the two methods, and
the difference in the standard deviations is due to random
error.
Comparison of an experimental mean with a
known value
• In order to decide whether the difference between the measured
and standard amounts can be accounted for by random error, a
statistical test known as a significance test can be employed. As
its name implies, this approach tests whether the difference
between the two results is significant, or whether it can be
accounted for merely by random variations. Significance tests are
widely used in the evaluation of experimental results.
• In making a significance test we are testing the truth of a
hypothesis which is known as a null hypothesis, often denoted
by H0. The term null is used to imply that there is no difference
between the observed and known values other than that which
can be attributed to random variation.
• Assuming that this null hypothesis is true, statistical theory can
be used to calculate the probability that the observed difference
(or a greater one) between the sample mean x̄ and the true
value μ arises solely as a result of random errors.
• Using this level of significance there is, on average, a 1 in 20
chance that we shall reject the null hypothesis when it is in fact
true.
• In order to decide whether the difference between x̄ and μ is
significant, that is, to test H0: population mean = μ, the statistic t is
calculated:
t = (x̄ − μ)√n/s
• where x̄ = sample mean, s = sample standard deviation and n =
sample size.
• If |t| (i.e. the calculated value of t without regard to sign) exceeds a
certain critical value then the null hypothesis is rejected.
• Example
• In a new method for determining selenourea in water, the following
values were obtained for tap water samples spiked with 50 ng ml−1 of
selenourea: 50.4, 50.7, 49.1, 49.0, 51.1 ng ml−1. Is there any evidence of
systematic error?
• The mean of these values is 50.06 and the standard deviation is 0.956.
Adopting the null hypothesis that there is no systematic error, i.e. μ = 50,
and using the equation above gives
t = (50.06 − 50)√5/0.956 = 0.14
From Table A.2, the critical value is t4 = 2.78 (P = 0.05). Since the
observed value of |t| is less than the critical value, the null hypothesis is
retained:
there is no evidence of systematic error. Note again that this does not
mean that there are no systematic errors, only that they have not been
demonstrated.
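The one-sample t calculation above can be sketched directly (the critical value is still read from a t table):

```python
import math
from statistics import mean, stdev

values = [50.4, 50.7, 49.1, 49.0, 51.1]   # ng/ml, samples spiked at 50 ng/ml
mu = 50.0                                  # value assumed under H0

m = mean(values)
s = stdev(values)
n = len(values)

t = (m - mu) * math.sqrt(n) / s   # t = (x_bar - mu) * sqrt(n) / s
print(round(m, 2), round(s, 3), round(t, 2))
```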
Comparison of two experimental means
• Another way in which the results of a new analytical method may
be tested is by comparing them with those obtained by using a
second (perhaps a reference) method. In this case we have two
sample means, x̄1 and x̄2.
• Taking the null hypothesis that the two methods give the same
result, that is H0: μ1 = μ2, we need to test whether (x̄1 − x̄2) differs
significantly from zero.
• In a comparison of two methods for the determination of
chromium in rye grass, the following results (mg kg−1 Cr) were
obtained:
• In fact since the critical value of t for P = 0.01 is about 3.36, the
difference is significant at the 1% level. In other words, if the null
hypothesis is true the probability of such a large difference arising
by chance is less than 1 in 100.
• In a series of experiments on the determination of tin in foodstuffs,
samples were boiled with hydrochloric acid under reflux for
different times. Some of the results are shown below:
Does the mean amount of tin found differ significantly for the two boiling times?
The mean and variance (square of the standard deviation) for the two times are:
The null hypothesis is adopted that boiling has no effect on the amount of tin
found. By equation (3.3), the pooled value for the variance is given by:
• There are 10 degrees of freedom, so the critical value is t10 =
2.23 (P = 0.05). The observed value of |t| (= 0.88) is less than
the critical value, so the null hypothesis is retained: there is no
evidence that the length of boiling time affects the recovery
rate.
• If the population standard deviations are unlikely to be equal
then it is no longer appropriate to pool sample standard
deviations in order to give an overall estimate of standard
deviation. An approximate method in these circumstances is
given below:
• In order to test H0: μ1 = μ2 when it cannot be assumed that the
two samples come from populations with equal standard
deviations, the statistic t is calculated, where
t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
with the number of degrees of freedom given approximately by
(s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ],
truncated to an integer.
• The data below give the concentration of thiol (mM) in the blood
lysate of two groups of volunteers, the first group
being ‘normal’ and the second suffering from rheumatoid
arthritis:
• Normal: 1.84, 1.92, 1.94, 1.92, 1.85, 1.91, 2.07
• Rheumatoid: 2.81, 4.06, 3.62, 3.27, 3.27, 3.76
Substitution in equation (3.4) gives t = −8.48 and substitution in equation (3.5) gives 5.3,
which is truncated to 5. The critical value is t5 = 4.03 (P = 0.01) so the null hypothesis is
rejected: there is sufficient evidence to say that the mean concentration of thiol differs
between the groups.
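The unequal-variance (Welch-type) calculation for the thiol data can be sketched as follows:

```python
import math
from statistics import mean, variance

normal = [1.84, 1.92, 1.94, 1.92, 1.85, 1.91, 2.07]
rheumatoid = [2.81, 4.06, 3.62, 3.27, 3.27, 3.76]

m1, m2 = mean(normal), mean(rheumatoid)
a = variance(normal) / len(normal)          # s1^2 / n1
b = variance(rheumatoid) / len(rheumatoid)  # s2^2 / n2

t = (m1 - m2) / math.sqrt(a + b)            # equation (3.4)
dof = (a + b) ** 2 / (a ** 2 / (len(normal) - 1)
                      + b ** 2 / (len(rheumatoid) - 1))  # equation (3.5)

print(round(t, 2), int(dof))   # degrees of freedom truncated to an integer
```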
Outliers
• Every experimentalist is familiar with the situation in which one
(or possibly more) of a set of results appears to differ
unreasonably from the others in the set. Such a measurement is
called an outlier.
• In order to use Grubbs’ test for an outlier, that is, to test H0: all
measurements come from the same population, the statistic G is
calculated:
G = |suspect value − x̄| / s    (3.8)
• where x̄ and s are calculated with the suspect value included.
• The test assumes that the population is normal.
• The following values were obtained for the nitrite concentration
(mg l−1) in a sample of river water:
0.403, 0.410, 0.401, 0.380
• The last measurement is suspect: should it be rejected? The four
values have x̄ = 0.3985 and s = 0.01292, giving
G = |0.380 − 0.3985|/0.01292 = 1.43
Since this does not exceed the critical value of G for n = 4
(P = 0.05), 1.481, the suspect measurement is retained.
• If three further measurements were added to
those given in the example above, so that the
complete results became:
• 0.403, 0.410, 0.401, 0.380, 0.400, 0.413, 0.408
• should 0.380 still be retained?
• The seven values have x̄ = 0.4021 and s = 0.01088.
The calculated value of G is now
G = |0.380 − 0.4021|/0.01088 = 2.03,
which exceeds the critical value of G for n = 7 (P = 0.05),
2.020, so the suspect measurement is now rejected.
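Both Grubbs' calculations can be sketched with a small helper (the critical values are still taken from the table):

```python
from statistics import mean, stdev

def grubbs_g(values, suspect):
    """G = |suspect value - mean| / s, with the suspect value included."""
    return abs(suspect - mean(values)) / stdev(values)

four = [0.403, 0.410, 0.401, 0.380]
seven = [0.403, 0.410, 0.401, 0.380, 0.400, 0.413, 0.408]

print(round(grubbs_g(four, 0.380), 2))   # compare with the critical value for n = 4
print(round(grubbs_g(seven, 0.380), 2))  # compare with the critical value for n = 7
```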
• In order to use Dixon’s test for an outlier, that is, to test H0: all
measurements come from the same population, the statistic Q is
calculated:
Q = |suspect value − nearest value| / (largest value − smallest value)
• This test is valid for sample sizes from 3 to 7 and assumes that the
population is normal.
• The critical values of Q for P = 0.05 for a two-sided test are given in
Table A.6. If the calculated value of Q exceeds the critical value, the
suspect value is rejected.
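Dixon's Q can be sketched using the four nitrite results from the earlier example (the decision still requires the tabulated critical value for n = 4 from Table A.6):

```python
def dixon_q(values, suspect):
    """Q = |suspect - nearest value| / (largest - smallest); valid for n = 3 to 7."""
    others = sorted(v for v in values if v != suspect)
    nearest = min(others, key=lambda v: abs(v - suspect))
    return abs(suspect - nearest) / (max(values) - min(values))

nitrite = [0.403, 0.410, 0.401, 0.380]   # mg/l, 0.380 is the suspect value
Q = dixon_q(nitrite, 0.380)
print(round(Q, 2))   # compare with the tabulated critical value for n = 4
```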