
Analytical Chemistry

• Chemical analysis:
Process that is associated with detection, identification and
determination of different chemical species (atoms, ions,
compounds, functional groups etc.) in a particular sample.

Figure 1: Analytical chemistry and related fields (biology, chemistry, physics, ecology, engineering, biotechnology, medicine, geology, material science, environment, social science, agriculture).

Applications of chemical analysis:
1. Determination of hazardous chemicals and pollutants
2. Purity check and quality control
3. Discovery of chemical compounds and reaction mechanisms
4. Diagnosis of diseases
5. Study of the causes of natural phenomena
6. Criminology
7. Determining composition
• In manufacturing industry
• Government regulation
Types of chemical analysis
• Qualitative: Deals with identification of the substance (analyte).
• Quantitative: Concerned with the determination of amount
/quantity of the analyte in a sample.
Series of steps in chemical analysis
1. Problem identification
2. Selection of a method
3. Sample collection
4. Processing the sample
5. Making the sample solution
6. Preparation of the standard
7. Measuring the property of interest
8. Comparison with the standard
9. Calculation and interpretation
10. Checking validity and reliability
CHAPTER 1
Errors and the Treatment of Analytical Data

[Figure: results of coin flipping]
Syllabus
Topics to be Covered
1. Errors in chemical analysis-absolute, relative errors.
2. Types of errors in experimental data: systematic errors (instrumental,
methodical and personal). Effect of systematic error-constant,
proportional error.
3. Sources of random error, distribution of experimental data.
4. Statistical treatment of random error. Central tendency and variability,
confidence limits, Student’s t, detection of gross error- Q-test
(rejection of data),
5. The least squares method, correlation coefficients.
6. Propagation of determinate and indeterminate errors, Numerical
Errors in chemical analysis
• Error is the magnitude of the estimated uncertainty in a measurement process.
• It refers to the numerical difference between a measured value and the true value.
– The true value of any quantity is something that we never know.
– A value is accepted as being true when it is believed that the uncertainty in the value is less than the uncertainty in something else with which it is being compared.
Sources of errors:
a. Personal skill of the analyst.
b. Selected method (steps and formulas).
c. Instrument used (calibration, standardization).
d. Random error due to different factors.
e. Accidental (gross) error.
• The reliability of result: Performance of the process is considerably good, true and dependable to some ..…{a reliable assistant, a … data, a… source}
Absolute Error
• Difference between the experimental value and the true value.
• Shows whether the value is high or low; it can carry either a positive or a negative sign.
• Sign is retained.
• Less informative, as it has no relation with sample size.
For individual data:
$E_a = X_i - X_t$
For the average of data:
$E_a = \bar{x} - X_t$
Where,
$X_i$ = observed value
$\bar{x}$ = mean of the observed values
$X_t$ = true value

Example: In the analysis of 20.0 ppm of Fe2+ the following results are obtained. Calculate the absolute error of measurement.
Here, $\bar{x} = \frac{19.4 + 19.5 + 19.6 + 19.7 + 20.1 + 20.3}{6} = 19.8$ ppm
$X_t = 20.0$ ppm
$E_a = 19.8 - 20.0 = -0.2$ ppm
The negative sign indicates that the measured value is less than the true value.
Relative Error
• Magnitude of error is expressed relative to the size of the quantity measured (sample size).
• Expressed as %, ppt, ppm.
$E_r = \frac{X_i - X_t}{X_t} \times 100$ (in %)
$E_r = \frac{X_i - X_t}{X_t} \times 1000$ (in ppt)
In general, $E_r = \frac{X_i - X_t}{X_t} \times 10^n$, with n = 2, 3, 6, … for %, ppt, ppm.
• Relative error also carries the sign.
• It is independent of the sample size if the error is of the proportional kind.

Exercises
1) Suppose that 0.5 mg of precipitate is lost as a result of being washed with 200 mL of wash liquid. If the precipitate weighs 500 mg, what will be the relative error due to solubility loss?
2) A method of analysis yields weights for gold that are low by 0.4 mg. Calculate the percent relative error caused by this uncertainty if the weight of gold in the sample is (i) 250 mg (ii) 700 mg.
3) A loss of 0.4 mg of zinc occurs in the course of an analysis for that element. Calculate the percent relative error due to this loss if the weight of zinc in the sample is 40 mg.
4) (i) A relative error of 0.5% is how many parts per thousand? (ii) A relative error of 2.0 parts per 500 is what percent error?
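The absolute and relative error definitions can be checked numerically. The sketch below reuses the Fe2+ replicate data from the absolute error example; the function names are illustrative choices, not part of the notes. Because the mean is not rounded to 19.8 first, the absolute error comes out near -0.23 ppm rather than the rounded -0.2 ppm.

```python
def absolute_error(measured, true_value):
    """E_a = measured - true; the sign is retained."""
    return measured - true_value


def relative_error(measured, true_value, scale=100):
    """E_r = (measured - true) / true * scale (100 for %, 1000 for ppt)."""
    return (measured - true_value) / true_value * scale


# Replicate results for a sample with a true Fe2+ content of 20.0 ppm
fe_results = [19.4, 19.5, 19.6, 19.7, 20.1, 20.3]
true_fe = 20.0

mean_fe = sum(fe_results) / len(fe_results)
e_abs = absolute_error(mean_fe, true_fe)
e_rel_percent = relative_error(mean_fe, true_fe)
e_rel_ppt = relative_error(mean_fe, true_fe, scale=1000)

print(f"mean = {mean_fe:.2f} ppm")
print(f"absolute error = {e_abs:.2f} ppm")
print(f"relative error = {e_rel_percent:.2f} % = {e_rel_ppt:.1f} ppt")
```

The negative sign on both errors shows that the measured mean lies below the true value.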
Accuracy and Precision
Accuracy and precision sound like the same thing… is there a difference?
A precise value may be inaccurate and vice versa.

Accuracy
• Measure of correctness of a result.
• A result is accurate if it is close to the true value.
• Closeness of a measured value to the true value.
• Less is the … error greater is the accuracy and vice versa.
• Can never be determined exactly.

Precision
• Measure of reproducibility of the result (closeness or agreement of the results in replicate measurements).
• Has no relation with the true value.
• Measure of the scatter from the average value.
• Commonly stated in terms of standard deviation, average deviation or range.

Dart board (figure: four targets (a) to (d) around the true value)
        Accuracy     Precision
Low     (a) & (c)    (a) & (b)
High    (b) & (d)    (c) & (d)
Commonly used statistical parameters for precision
(a) Absolute standard deviation (root mean square deviation): $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}}$
(b) Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{N - 1}$
(c) Coefficient of variation: $CV = \frac{s}{\bar{x}} \times 100\%$ (also in ppt)
(d) Relative standard deviation: $RSD = \frac{s}{\bar{x}}$
(e) Standard deviation (error) of the mean: $s_m = \frac{s}{\sqrt{N}}$
(f) Average deviation: $\bar{d} = \frac{\sum d_i}{N}$, where $d_i = |x_i - \bar{x}|$
(g) Relative average deviation (ppt): $\frac{\bar{d}}{\bar{x}} \times 1000$

All these parameters depend on the deviation from the mean.
Determination of nitrogen in benzyl isothiourea hydrochloride and in nicotinic acid by the Kjeldahl method.

Kjeldahl method:
Sample → (hot, conc. H2SO4; catalyst HgO, Se, Cu2+) → ammonium sulphate → (conc. NaOH) → ammonia gas

[Figure: structures of benzyl isothiourea hydrochloride and nicotinic acid (prevents pellagra), with a plot of the errors of the individual determinations]

• Analyses 3 & 4 give negative deviation (bias) because of incomplete decomposition by H2SO4 (N is in the pyridine ring).
• Each dot represents the error for a single determination and the vertical line represents the average deviation.
• This error can be minimized by:
– Addition of K2SO4: increases the boiling temperature.
– Pretreatment to obtain O-N & N-N,
TYPES OF ERROR IN EXPERIMENTAL DATA

Systematic (Determinate)
• Error with a definite value and an assignable cause
• Unidirectional (same sign)
• Bias
• Affects accuracy
• Known cause
• Could be minimized
• May be either constant or proportional

Random (Indeterminate)
• Errors with no assignable cause
• Can never be totally eliminated
• Bidirectional (both high and low results have equal probability)
• Symmetrically distributed around the mean
• Affects precision
• Obeys the theory of statistics
• Decreases with increasing number of samples

Gross
• Caused by a mistake in operation
• A few results are outliers
• Affects accuracy
Types of Systematic Errors

Personal Error
• Results from carelessness, inattention, or personal limitation of the required knowledge.
• Examples: estimation errors, observation errors, etc.
• Misreading of an instrument or scale
• Insensitivity to colour changes
• Improper calibration
• Poor technique/sample preparation
• Improper calculation of results
• Preconceived idea of 'true' value – personal bias
• These are blunders that can be minimised or eliminated with proper training and experience.

Method Error
• Caused by non-ideal chemical or physical behavior of the reagents and reactions upon which the method is based.
• Slow or incomplete reaction.
• Incomplete drying before weighing.
• Instability of reagents/products.
• Possibility of side and chain reactions.
• Non-specificity of most reagents.
• Improper amount of reagents, indicators.
• Most serious of the three types. Difficult to detect.

Instrumental Error
• Failure of measuring devices to be in accordance with the required standard leads to instrumental error.
• Due to improper calibration of the instrument.
• Working outside of the specified conditions (temperature, concentration, voltage, pressure, pH).
• Side reactions, heating to high temperature.
• They are detectable and correctable.
• Calibration eliminates most of this type of error.
Detection and Elimination of Systematic Error

Personal / Operative error
• Causes: skill, knowledge, attitude, carelessness, habits, health, physical condition, bias.
• Detection and removal:
– Proper reading of an instrument or scale.
– Using glasses to correct an eye defect.
– Proper calibration every time. Using proper technique for sample preparation and analysis. Leaving out personal bias.
– Proper training and experience.

Method error
• Causes: instability of analytes, reagents and products; slow reaction; incompleteness of some reagents in a reaction; improper drying before weighing; side and chain reactions; non-specificity of most reagents; amount of indicator; improper set of temperature; reagent interference in the method.
• Detection and removal:
– Validation of the method can be done by using standards (from NIST, which sells a variety of reference materials).
– Analyzing standard samples.
– Using an independent analytical method.
– Performing blank determinations.

Instrumental Error
• Causes: non-ideal instrumental behavior, faulty construction, no proper calibration, attack by reagents, etc.
• Using outside of the calibration temperature; contamination of the inner surface due to reagents; attack by reagents, corrosion and distortion; decrease of voltage or increase of resistance; use outside the working range, e.g. a pH meter suffers large error in the highly acidic or basic range.
• These types of error are easily detected and removed.
• Simply frequent calibration can remove large
Constant error
• The magnitude of the error is constant whatever the sample size.
• It is more serious for a small sample. On the other hand, on increasing the sample size the effect of this error is minimized (relative error decreases with sample size).
• 0.1 mL end point error in titration: it causes 1% error for a 10 mL sample but only 0.2% for a 50 mL sample.
• Also if 0.5 mg of precipitate is lost in washing by a certain volume of solvent,
• Weighing error and rounding of significant figures.

Proportional error
• The absolute value of this type of error varies with the sample size in such a way that the relative error remains constant.
• Caused by the presence of interfering contaminants in the sample.
(i) In the determination of Cu(II) by KI, the presence of Fe(III) also liberates I2 from KI, which gives a positive error. If the sample size is doubled, the iodine liberated by the iron contaminant is also doubled.
(ii) In the determination of an oxidant like chlorate, the presence of another oxidizing agent such as bromate would cause a positive error.
• Taking larger samples would increase the absolute error, but the relative error would remain constant provided the sample was homogeneous.
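The contrast between the two kinds of systematic error can be made concrete with a short calculation. The sketch below uses the 0.1 mL end point error from the constant error slide; the 1% contaminant contribution used for the proportional case is a hypothetical figure chosen only for illustration.

```python
def relative_error_percent(abs_error, sample_size):
    """Relative error (%) of an absolute error for a given sample size."""
    return abs_error / sample_size * 100


titrant_volumes_ml = [10, 25, 50]

# Constant error: the same 0.1 mL end point error whatever the volume
constant = {v: relative_error_percent(0.1, v) for v in titrant_volumes_ml}

# Proportional error: the absolute error grows with the sample size
# (hypothetical contaminant adding 1% of the sample size)
contaminant_fraction = 0.01
proportional = {
    v: relative_error_percent(contaminant_fraction * v, v)
    for v in titrant_volumes_ml
}

for v in titrant_volumes_ml:
    print(f"{v:>3} mL: constant {constant[v]:.2f} %, "
          f"proportional {proportional[v]:.2f} %")
```

The constant error shrinks in relative terms as the sample grows, while the proportional error stays at the same percentage for every sample size.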
Random Error
• These are errors which have no particular assignable cause.
• They can never be totally eliminated or corrected.
• They are caused by the many uncontrollable variables that are an inevitable part of every analysis made by human beings.
• These variables are impossible to identify completely; even if we identify some, they cannot be measured, because most of them are so small.
• Both high and low results appear with equal probability.
• In the absence of systematic error, random error affects only the precision.
• Random error obeys the probability theory of statistics.
• Increasing the number of samples will minimize the magnitude of random error.
How do small undetectable uncertainties produce a detectable random error ?
Imagine a situation in which four small random errors combine to give an overall error. Let each factor fluctuate the final result by ± U with equal probability of occurrence. The following combinations (addition/subtraction) are possible.

Combination of        Magnitude of    Number of combinations   Relative
uncertainties         random error    (frequency)              frequency
+U1 +U2 +U3 +U4       +4U             1                        1/16 = 0.0625

−U1 +U2 +U3 +U4       +2U             4                        4/16 = 0.250
+U1 −U2 +U3 +U4
+U1 +U2 −U3 +U4
+U1 +U2 +U3 −U4

−U1 −U2 +U3 +U4       0               6                        6/16 = 0.375
−U1 +U2 −U3 +U4
−U1 +U2 +U3 −U4
+U1 −U2 −U3 +U4
+U1 −U2 +U3 −U4
+U1 +U2 −U3 −U4

+U1 −U2 −U3 −U4       −2U             4                        4/16 = 0.250
−U1 +U2 −U3 −U4
−U1 −U2 +U3 −U4
−U1 −U2 −U3 +U4

−U1 −U2 −U3 −U4       −4U             1                        1/16 = 0.0625

If a curve of relative frequency versus deviation from the mean is plotted, it will give a bell shaped curve. As the number of individual errors becomes very large, the curve takes the shape shown in the figure; this curve is called the normal error curve or Gaussian curve (properties are discussed later).
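The combination table can be reproduced by brute-force enumeration. This is a minimal sketch assuming each of the four uncertainties contributes exactly +U or -U with equal probability; setting `U = 1` is an arbitrary unit chosen for the illustration.

```python
from collections import Counter
from itertools import product

U = 1          # size of each individual uncertainty (arbitrary unit)
n_sources = 4  # number of independent small uncertainties

# Enumerate all 2**4 = 16 sign combinations and tally the net error
counts = Counter(
    sum(signs) * U for signs in product((+1, -1), repeat=n_sources)
)
total = 2 ** n_sources
rel_freq = {err: counts[err] / total for err in sorted(counts, reverse=True)}

for err, freq in rel_freq.items():
    print(f"net error {err:+d}U: {counts[err]} combinations, "
          f"relative frequency {freq:.4f}")
```

Raising `n_sources` makes the tally approach the bell shape of the normal error curve.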
Some examples of random distribution of data:
• Calibration of a pipette:
10 mL of water is pipetted → poured in stoppered flask of known weight →
weight change is calculated for all steps → Volume is calculated by dividing
with density, at that temperature → Data is tabulated (converting into discrete
frequency or continuous frequency) → This can also be plotted as bar diagram
or histogram or frequency polygon → in absence of systematic error, with
increasing number of data the frequency polygon curve is bell shaped,
symmetric in both sides → This curve is called as normal error curve.
• In this determination there are numerous possible sources of random error. → visual
error (level of water, mercury level in thermometer…) → temperature fluctuation (affects;
volume of pipette, viscosity of liquid, performance of balance…) → variation in drainage
time and the angle of pipette at drainage → vibrations and drafts (affects the balance
reading)
• Any other examples can be considered.
→ Production run of multivitamin tablets
→ Determination of Ca2+ in community water supply
→ Determination of glucose in the blood of diabetic patient
→ Measurement of absorbance of 50 replicate 10ppm Fe(II) sample complexing
with thiocyanate ion.

Measurement of central tendency: A single value that can represent the whole sample (mass of data) is called a measure of central tendency.

1. Mean: Average of all the values.
Population and sample: population mean = µ, sample mean = $\bar{x}$; population standard deviation = σ, sample standard deviation = s.
• Clear
• Strictly defined
• Most common
• Mostly affected by extreme values
• Reliability affected by sample size
Sample mean: $\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$
Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
For a frequency distribution: $\bar{x} = \frac{\sum_{i=1}^{N} f_i x_i}{N}$
Others (assumed mean methods):
$\bar{x} = A + \frac{1}{N} \sum_{i=1}^{N} d_i$, where $d_i = x_i - A$ (deviation) and A = assumed mean
$\bar{x} = A + \frac{1}{N} \sum_{i=1}^{N} f_i d_i$
$\bar{x} = A + \frac{h}{N} \sum_{i=1}^{N} f_i d_i$, where h = width of class interval and, in this step-deviation form, $d_i = (x_i - A)/h$
The mean of N results is $\sqrt{N}$ times more reliable than an individual result.
2. Median:
The value which divides the data equally in two parts.
• Individual data set
– Odd number of data: the middle value is the median.
– Even number of data: the average of the two middle values is the median.
• Discrete frequency distribution
– Find $\frac{N+1}{2}$.
– Calculate the cumulative frequency (Cf).
– The value with Cf ≥ $\frac{N+1}{2}$ is the median.
• Continuous frequency distribution
– Find $\frac{N}{2}$ and locate the class with Cf ≥ $\frac{N}{2}$.
– Calculate the median:
$M_d = l + \frac{h}{f}\left(\frac{N}{2} - c\right)$
l = lower limit of the median class, h = width of the median class, f = frequency of the median class, c = Cf of the preceding class.
3. Mode: The value of observation that repeats the maximum number of times is the mode.

For the continuous frequency distribution,
$M_0 = l + \frac{h (f_1 - f_0)}{(f_1 - f_0) + (f_1 - f_2)}$
l and h = lower limit and width of the modal class, $f_1$ = frequency of the modal class, $f_0$ = frequency of the preceding class, $f_2$ = frequency of the succeeding class.
Also, $M_0 = 3M_d - 2\bar{x}$ if it is difficult to calculate the mode.
Variability or Dispersion or Spread:
Measurement of the scatter of the data (precision).
Data may have the same central value but different dispersion.

Range
Discrete data: Range = Largest value − Smallest value in the data (R = L − S)
Continuous data: Range = upper limit of the highest class − lower limit of the lowest class, or the difference of their mid values.
Coefficient of range $= \frac{L - S}{L + S}$

Quartile deviation (or semi-interquartile range)
$QD = \frac{Q_3 - Q_1}{2}$, where $Q_1 = \left(\frac{N+1}{4}\right)^{th}$ item and $Q_3 = \left(\frac{3(N+1)}{4}\right)^{th}$ item
Coefficient of QD $= \frac{Q_3 - Q_1}{Q_3 + Q_1}$. It is the relative measure of variability.

Average or mean deviation (MD, $\bar{d}$)
$MD = \frac{\sum |x_i - \bar{x}|}{N}$, or $MD = \frac{\sum f_i |x_i - \bar{x}|}{N}$ for a frequency distribution
Coefficient of MD $= \frac{MD}{\bar{x}}$
Relative average deviation $= \frac{MD}{\bar{x}} \times 100\%$, also in ppt.
Standard deviation:
Sample standard deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}}$
Population standard deviation: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
Equivalent computational forms:
$\sigma = \sqrt{\frac{\sum x_i^2}{N} - \frac{(\sum x_i)^2}{N^2}}$
For a frequency distribution: $\sigma = \sqrt{\frac{\sum f_i x_i^2}{N} - \frac{(\sum f_i x_i)^2}{N^2}}$
Coefficient of standard deviation or coefficient of variation: $v = \frac{s}{\bar{x}} \times 100\%$

• When N − 1 is used instead of N, $s^2$ is an unbiased estimator of the population variance. If N is used in the formula, the estimate is negatively biased.
• The means of successive sets are less and less scattered as N increases.
• The standard deviation of the means is known as the standard error of the mean; it is proportional to 1/√N. That is, the mean of four measurements is more precise by √4 = 2 than an individual measurement.

Variance:
It is also a measure of dispersion.
It is simply the square of the SD, but it is additive:
$\sigma_t^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \dots$
Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{N - 1}$
Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
The greater the total variance, the greater the random error.
Root mean square deviation: Measures the deviation from any arbitrary value A, i.e.
$RMSD = \sqrt{\frac{\sum (x_i - A)^2}{N}}$
Standard error of mean: The SD divided by the square root of the total number of data is called the standard error of the mean.
$s_m = \frac{s}{\sqrt{N}}$

Q. The normality of a solution is determined to be 0.2041, 0.2043, 0.2039, 0.2049. Calculate the mean, median, range, mean deviation, SD and coefficient of variation.
Sample mean: $\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} = 0.2043$
Median: $M_d = \frac{0.2041 + 0.2043}{2} = 0.2042$
Range: $R = L - S = 0.001$
Mean deviation: $\bar{d} = \frac{\sum |x_i - \bar{x}|}{4} = 0.0003$
Relative average deviation $= \frac{\bar{d}}{\bar{x}} \times 1000 = 1.5$ ppt
Standard deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{4 - 1}} = 0.0004$
Coefficient of variation: $v = \frac{s}{\bar{x}} \times 100 = 0.2\%$
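The worked normality example can be reproduced with Python's standard `statistics` module. This sketch takes the fourth reading as 0.2049, which is an assumption: it is the value that reproduces the stated mean of 0.2043 and range of 0.001. The variable names are illustrative.

```python
import statistics

# Four normality determinations (fourth value assumed to be 0.2049)
data = [0.2041, 0.2043, 0.2039, 0.2049]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
data_range = max(data) - min(data)

# Average (mean) deviation and its relative form in parts per thousand
mean_dev = sum(abs(x - mean) for x in data) / n
rel_avg_dev_ppt = mean_dev / mean * 1000

# Sample standard deviation (N - 1 in the denominator)
s = statistics.stdev(data)
cv_percent = s / mean * 100
std_error_of_mean = s / n ** 0.5

print(f"mean = {mean:.4f}, median = {median:.4f}, range = {data_range:.4f}")
print(f"mean deviation = {mean_dev:.4f} ({rel_avg_dev_ppt:.1f} ppt)")
print(f"s = {s:.4f}, CV = {cv_percent:.1f} %, s_m = {std_error_of_mean:.5f}")
```

`statistics.stdev` uses N − 1, so it matches the sample standard deviation s; `statistics.pstdev` would give the population form with N.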
Distribution of Experimental data and Statistical measurement
The data can be tabulated as, Individual series, Discrete frequency distribution,
Continuous frequency distribution, or in the form of Diagrams; Bar diagram,
Histogram, Frequency polygon.

Individual series: Roll No | Marks (1, 2, …)
Continuous frequency distribution: Marks | No of students (0-10: 2; 10-20: 1; …)
[Figure: histogram and frequency polygon of number of students versus marks, with class intervals 0-10, 10-20, …, 90-100]
Coin flipping: Measure = How many heads will occur in ten (10) flips?
Methods to draw the frequency polygon:
• Arrange the data in sequence, lowest to highest or …
• Condense the data into classes/cells by grouping: decide the number of cells (10-20) and choose the boundaries. Confusion can be eliminated by choosing boundaries halfway between two possible values.
• Represent the frequency distribution pictorially as a histogram and a frequency polygon. In the frequency polygon the middle values of the cells are connected together.

[Figure: classification of data and the resulting frequency polygon]
• Normal error curve:
– The limiting case approached by the frequency polygon as more and more replicate measurements are performed is the normal error curve or Gaussian distribution curve.
– If relative frequency is plotted against deviation from the mean and the total probability of a datum falling under the curve is set to 1 (normalization of the distribution function), the curve is called the normal error curve. The area under the curve between two limits is proportional to the number of data in that range.
– The area under the curve between any two values of x − µ gives the fraction of the total population having magnitudes between these two values.
µ ± σ includes 68.27% of results
µ ± 2σ includes 95.45% of results
µ ± 3σ includes 99.73% of results
Where, $z = \frac{x - \mu}{\sigma}$
Properties of Normal error curve
• The curve is described by an equation containing two parameters, µ and σ:
$y = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / 2\sigma^2}$
Area $= \int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / 2\sigma^2} \, dx = 1$
• The curve is exponential, as well as symmetric, with the mean at the center.
• It is bell shaped. The breadth of the curve gives the precision, i.e. the broader the base, the larger the range or spread.
• The x axis can be taken as (x − µ). Data with the same mean but different SD then give different curves. But if the axis is the deviation from the mean in units of SD, i.e. $z = \frac{x - \mu}{\sigma}$, all curves fall on the same scale.
• With $z = \frac{x - \mu}{\sigma}$ and $dz = \frac{dx}{\sigma}$,
Area $= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-z^2 / 2} \, dz = 1$
• The area of the curve between different limits can be calculated, i.e. between µ ± σ = 68.27%, µ ± 2σ = 95.45% and µ ± 3σ = 99.73% respectively.
• The range for a required % can also be determined: 50% (µ ± 0.67σ), 80% (µ ± 1.28σ), 90% (µ ± 1.64σ), 99% (µ ± 2.58σ),
• assuming $\bar{x}$ and s are good estimates of µ and σ.
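The areas quoted for the normal error curve follow from the error function, since the fraction of a Gaussian population within µ ± zσ equals erf(z/√2). A minimal sketch using only the standard library is given below; the function name `area_within` is an illustrative choice.

```python
import math


def area_within(z):
    """Fraction of a normal population lying within mu +/- z*sigma."""
    return math.erf(z / math.sqrt(2))


# Whole-sigma limits, then the z values quoted for 50, 80, 90 and 99 %
for z in (1, 2, 3, 0.67, 1.28, 1.64, 2.58):
    print(f"mu +/- {z:>4} sigma encloses {area_within(z) * 100:.2f} % of results")
```

The same function can be inverted numerically to find the z needed for any other confidence level.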
Methods to obtain a good estimate of σ:
• Here we discuss how to obtain a reliable estimate of σ from small samples of data.
• Performing preliminary experiments: If we have enough time and adequate sample, a reliable SD for the method can be obtained in a preliminary step by increasing N; if N is about 20, s and σ can be assumed identical.
• Pooling data: If we have several subsets of data, we can get a better estimate of the population SD by pooling (combining) the data, assuming that the data are replicates and each subset has a common σ.
• $s_{pooled}$ is the weighted average of the individual estimates.
$s_{pooled} = \sqrt{\frac{\sum_{i=1}^{N_1} (x_i - \bar{x}_1)^2 + \sum_{j=1}^{N_2} (x_j - \bar{x}_2)^2 + \dots}{N_1 + N_2 + \dots - n_t}}$
The number of degrees of freedom is equal to the total number of samples ($N_1 + N_2 + \dots$) minus the number of subsets ($n_t$).
Q. The mercury in samples of seven fish taken from the Mississippi River was determined with a method based on the absorption of radiation by gaseous elemental mercury.
(a) Calculate the pooled estimate of the standard deviation for the method, based on the first three columns of data in the table that follows.
(b) Calculate the 50% and the 95% confidence limits for the mean value (1.67 ppm) for specimen 1; consider s ≈ σ = 0.10.
(c) How many replicate measurements of specimen 1 would be needed to decrease the 95% confidence interval to ± 0.07 ppm Hg?

Specimen   No. of samples   Hg content, ppm                       Mean,     Sum of squares of
number     measured                                               ppm Hg    deviations from mean
1          3                1.80, 1.58, 1.64                      1.673     0.0259
2          4                0.96, 0.98, 1.02, 1.10                1.015     0.0115
3          2                3.13, 3.35                            3.240     0.0242
4          6                2.06, 1.93, 2.12, 2.16, 1.89, 1.95    2.018     0.0611
5          4                0.57, 0.58, 0.64, 0.49                0.570     0.0114
6          5                2.35, 2.44, 2.70, 2.48, 2.44          2.482     0.0685
7          4                1.11, 1.15, 1.22, 1.04                1.130     0.0170
           Σ = 28                                                           Σ = 0.2196

$s_{pooled} = \sqrt{\frac{\sum_{i=1}^{N_1} (x_i - \bar{x}_1)^2 + \sum_{j=1}^{N_2} (x_j - \bar{x}_2)^2 + \dots}{N_1 + N_2 + \dots - n_t}}$
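Part (a) of the mercury problem can be worked through numerically with the pooled standard deviation formula. In this sketch the third reading of specimen 7 is taken as 1.22 ppm, an assumption made because that value reproduces the tabulated mean of 1.130 and sum of squares of 0.0170; the helper name `sum_sq_dev` is illustrative.

```python
import math

# Hg content (ppm) of the seven fish specimens
specimens = {
    1: [1.80, 1.58, 1.64],
    2: [0.96, 0.98, 1.02, 1.10],
    3: [3.13, 3.35],
    4: [2.06, 1.93, 2.12, 2.16, 1.89, 1.95],
    5: [0.57, 0.58, 0.64, 0.49],
    6: [2.35, 2.44, 2.70, 2.48, 2.44],
    7: [1.11, 1.15, 1.22, 1.04],  # third value assumed (see lead-in)
}


def sum_sq_dev(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


total_ss = sum(sum_sq_dev(v) for v in specimens.values())
n_total = sum(len(v) for v in specimens.values())
dof = n_total - len(specimens)  # total measurements minus number of subsets

s_pooled = math.sqrt(total_ss / dof)
print(f"sum of squares = {total_ss:.4f}, degrees of freedom = {dof}")
print(f"s_pooled = {s_pooled:.2f} ppm Hg")
```

One degree of freedom is lost for each subset because each subset mean is estimated from its own data.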
Confidence Limit or Interval: How sure are we?
• For most analytical tasks we do not know the true value; however, we can use the experimental mean and standard deviation to estimate the range in which the true value lies.
• µ (the true value) cannot be determined exactly.
• In the absence of systematic error, we can define a numerical interval around the mean ($\bar{x}$) of replicate results within which the population mean (µ) is expected to lie with a given degree of probability. This is called the confidence interval, and its boundaries are called confidence limits.
• For example, the conc. of population at 90% CI is 7.25 ± 0.15, which is 7.10% - 7.40%. It is calculated from the sample standard deviation. Thus it is clear that the population mean lies ……., at 90% confidence level.
• t, z and $C_n$ values are taken from tables to determine the confidence interval.

• Confidence interval: calculated range of the estimated true value
• Confidence limit: limit of this range
• Confidence level: the likelihood that the true value falls in this range
Large number of samples: $\mu = \bar{x} \pm \frac{z\sigma}{\sqrt{n}}$
Confidence interval in terms of range: $\mu = \bar{x} \pm C_n R$
When σ is known or s is a good estimate of σ:
• For a large number of sample data, N > 20.
• From the equation $z = \frac{x - \mu}{\sigma}$, the confidence interval of the true mean based on a single value (x) can be written as
CI for µ = x ± zσ.

Confidence level, %    z
50                     0.67
68                     1.00
80                     1.28
90                     1.64
95                     1.96
95.4                   2.00
99                     2.58
99.7                   3.00
99.9                   3.29

• If we use the experimental mean,
CI for $\mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}$; here x is replaced by $\bar{x}$ and σ by the standard error of the mean.
• The mean $\bar{x}$ of N data is a better estimator of µ than a single measurement.
• The CI can be narrowed by taking more measurements (taking 4 measurements will halve the CI of µ).
• This equation applies only if bias is zero and s is a good approximation of σ.
• Used to compare an experimental mean with a known value.
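The z-based confidence interval can be applied to parts (b) and (c) of the mercury problem (mean 1.67 ppm for specimen 1, three replicates, s ≈ σ = 0.10). The sketch below uses the z values 0.67 and 1.96 from the table; the function name `z_half_width` is an illustrative choice.

```python
import math


def z_half_width(z, sigma, n):
    """Half-width of the confidence interval for the mean: z*sigma/sqrt(n)."""
    return z * sigma / math.sqrt(n)


mean_hg, sigma, n = 1.67, 0.10, 3

half_50 = z_half_width(0.67, sigma, n)   # 50 % confidence level
half_95 = z_half_width(1.96, sigma, n)   # 95 % confidence level
print(f"50 % CI: {mean_hg} +/- {half_50:.2f} ppm Hg")
print(f"95 % CI: {mean_hg} +/- {half_95:.2f} ppm Hg")

# Replicates needed so that the 95 % half-width is at most 0.07 ppm:
# z*sigma/sqrt(N) <= 0.07  ->  N >= (z*sigma/0.07)**2
target = 0.07
n_needed = math.ceil((1.96 * sigma / target) ** 2)
print(f"replicates needed for +/- {target} ppm at 95 %: {n_needed}")
```

Because the half-width scales with 1/√N, each halving of the interval requires four times as many measurements.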
F-Test (Variance ratio test)
• The F test compares the spread of results for two types of analysis.
$F = \frac{s_1^2}{s_2^2}$, with the larger variance in the numerator ($s_1^2 > s_2^2$).
• If F_calculated > F_tabulated at a given confidence level, there is a significant difference between the variances of the two methods.
• If F_calculated ≤ F_tabulated, the two standard deviations and variances are not significantly different.

Determination of glucose in blood:
(a) Colorimetric procedure
(b) Standard Folin-Wu procedure

Colorimetry    Folin-Wu
127 mg/dL      130 mg/dL
125 mg/dL      128 mg/dL
123 mg/dL      131 mg/dL
130 mg/dL      129 mg/dL
131 mg/dL      127 mg/dL
126 mg/dL      125 mg/dL
129 mg/dL      -

The tabulated value for ν1 = 6 and ν2 = 5 is 4.95 at 95% confidence level.
Determine whether the variances differ significantly or not. Are the two means significantly different at 95% probability?
$s_{pooled} = \sqrt{\frac{\sum (x_i - \bar{x}_1)^2 + \sum (x_j - \bar{x}_2)^2}{N_1 + N_2 - 2}}$
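The variance comparison for the glucose data can be sketched as follows. The code places the larger sample variance in the numerator, which is the usual convention for the variance ratio, and compares the result with the tabulated value of 4.95 quoted above; the variable names are illustrative.

```python
import statistics

# Blood glucose results, mg/dL
colorimetric = [127, 125, 123, 130, 131, 126, 129]
folin_wu = [130, 128, 131, 129, 127, 125]

var_color = statistics.variance(colorimetric)  # sample variance, N - 1
var_folin = statistics.variance(folin_wu)

# Larger variance goes in the numerator so that F >= 1
f_calc = max(var_color, var_folin) / min(var_color, var_folin)
f_tab = 4.95  # tabulated value quoted for nu1 = 6, nu2 = 5 at 95 %

significant = f_calc > f_tab
print(f"s1^2 = {var_color:.2f}, s2^2 = {var_folin:.2f}, F = {f_calc:.2f}")
print("variances differ significantly" if significant
      else "variances are not significantly different")
```

Since the variances turn out not to differ significantly, pooling the two standard deviations for a subsequent t-test on the means is reasonable.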
Student’s t (t-test)

– W. S. Gosset (1908) studied the problem of making predictions based upon a finite sample drawn from an unknown population.
– In practical work we know $\bar{x}$ and s rather than µ and σ (where $\bar{x}$ and s are estimates of µ and σ).
– These estimates are subject to uncertainty, and predictions can be made about an odd observation falling outside the limit.
– The confidence limit for a small data set is expressed through a new statistical parameter "t", analogous to z.
– s calculated from a small set of data may be quite uncertain, and this broadens the confidence interval.
– The quantity t compensates for the uncertainty introduced by using "s" instead of "σ".
When σ is unknown:
t, for a single measurement, is defined as: $t = \frac{x - \mu}{s}$
t, for a mean of N measurements: $t = \frac{\bar{x} - \mu}{s / \sqrt{N}}$
• This parameter t is called Student's t; it accounts for the uncertainty in s as the estimate of σ.
• The magnitude of t depends on the degrees of freedom used in calculating s (DF for a small sample is N − 1).
• The confidence interval for the mean of N replicate measurements can be calculated from the t-equation:
CI for $\mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$; here x is replaced by $\bar{x}$ and s by the standard error of the mean
where $\bar{x}$ = sample mean
µ = population mean or true mean
s = sample standard deviation
N = number of observations
t depends on the degrees of freedom in the calculation of s

Degrees of freedom: the number of values in a sample that can be chosen freely (the number of observations that remain unspecified).
Degrees of freedom = sample size − number of population parameters that are estimated from sample observations.
• $\bar{x}$: mean value; s: standard deviation
• N: number of measurements
• t: located using the table
• Determine the value of ν, the degrees of freedom (N − 1), then identify the t value at the respective % confidence level.
Testing for Significance, by t statistics
i. Comparison of experimental mean with known/true mean:
• Null hypothesis (H0): μ = μ0
• Alternative hypothesis (Ha): μ ≠ μ0 (two tailed). (It may also be one tailed, of two kinds: μ < μ0 or μ > μ0.)
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{N}}$
• We apply the z test for a large sample, where s is a good approximation of σ.
• We apply the t test for a small sample, where s is not a good approximation of σ.
• The null hypothesis is accepted if the test statistic lies within the acceptance region and is rejected if it lies in the rejection region. We compare the calculated value with the tabulated value at the given degrees of freedom.
• If t or z calculated > t or z tabulated, we reject the null hypothesis, ie………
• If t or z calculated < t or z tabulated, we accept the null hypothesis i.e.

For determination of Cu in a biological sample, orchard leaf (standard from NIST, with listed concentration of 11.7 ppm) is analysed in 5 replicates. Mean is calculated = 10.8 ppm, SD is calculated = ± 0.7. Is your method statistically correct at 95% probability?

Note: As no of replicate increases, there is less probability of …
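The orchard leaf question can be worked through as a one-sample t-test. This sketch compares the calculated t with 2.776, the standard two-tailed tabulated value for 4 degrees of freedom at 95% confidence; the variable names are illustrative.

```python
import math

mu_0 = 11.7   # listed Cu concentration of the NIST standard, ppm
x_bar = 10.8  # experimental mean, ppm
s = 0.7       # sample standard deviation, ppm
n = 5         # number of replicates

# t = |x_bar - mu_0| / (s / sqrt(N)), compared without regard to sign
t_calc = abs(x_bar - mu_0) * math.sqrt(n) / s

t_tab = 2.776  # two-tailed t for N - 1 = 4 degrees of freedom, 95 %
reject_null = t_calc > t_tab

print(f"t calculated = {t_calc:.2f}, t tabulated = {t_tab}")
print("significant difference: a determinate error is indicated"
      if reject_null
      else "no significant difference from the listed value")
```

Here the calculated t slightly exceeds the tabulated value, so at the 95% level the method mean differs significantly from the listed concentration.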


ii. Comparison of Two Means: The t-test can be used for the comparison of two means.
• A sample is analyzed by two different methods, each repeated several times, and the mean values obtained are different. Is the difference between the two values significant?
• The t-test enables us to decide whether the difference in means is simply due to random error or whether there exists a systematic error in one of them.
Null hypothesis: mean of analyst (or method) 1 = mean of analyst (or method) 2, i.e. $\bar{x}_1 = \bar{x}_2$.
Alternative hypothesis: $\bar{x}_1 \neq \bar{x}_2$ (two tailed).
If $s_1$ and $s_2$ are their standard deviations, t for the difference between the means is computed using the formula
$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}} \sqrt{\frac{N_1 N_2}{N_1 + N_2}}$, with $s_{pooled} = \sqrt{\frac{\sum (x_i - \bar{x}_1)^2 + \sum (x_j - \bar{x}_2)^2}{N_1 + N_2 - 2}}$
Here the number of degrees of freedom (df) = $N_1 + N_2 - 2$.
• If t_calculated < t_tabulated, we accept the null hypothesis, i.e. there is no significant difference between the two means.

Determination of iron %
Gravimetric method    Ammonia method
20.10                 18.89
20.50                 19.20
18.65                 19.00
19.25                 19.70
19.40                 19.40
19.99                 -
Mean 19.65            Mean 19.24
Is there any significant difference between the two methods?
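The iron comparison can be sketched with the pooled standard deviation and the two-mean t formula. The tabulated value 2.262 used below is the standard two-tailed t for 9 degrees of freedom at 95% confidence; the helper name `sum_sq_dev` is illustrative.

```python
import math

# Iron %, determined by two methods
gravimetric = [20.10, 20.50, 18.65, 19.25, 19.40, 19.99]
ammonia = [18.89, 19.20, 19.00, 19.70, 19.40]


def sum_sq_dev(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


n1, n2 = len(gravimetric), len(ammonia)
mean1 = sum(gravimetric) / n1
mean2 = sum(ammonia) / n2

dof = n1 + n2 - 2
s_pooled = math.sqrt((sum_sq_dev(gravimetric) + sum_sq_dev(ammonia)) / dof)

t_calc = abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
t_tab = 2.262  # two-tailed t for 9 degrees of freedom, 95 %

print(f"means: {mean1:.2f} and {mean2:.2f}, s_pooled = {s_pooled:.3f}")
print(f"t calculated = {t_calc:.2f}, t tabulated = {t_tab}")
print("significant difference" if t_calc > t_tab
      else "no significant difference between the two methods")
```

Pooling assumes the two methods have comparable variances, which is what the F-test is used to check beforehand.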
Paired t Test: (not in course)
• In the clinical chemistry laboratory, a new method is frequently tested against an accepted method by analyzing several different samples of slightly varying composition (within physiological range). In this case the t value is calculated in a slightly different form. The difference between each of the paired measurements on each sample is computed. An average difference $\bar{D}$ is calculated and the individual deviations of each difference from $\bar{D}$ are used to compute a standard deviation $s_d$. The t value is calculated from
$t = \frac{\bar{D}}{s_d} \sqrt{N}$, with $s_d = \sqrt{\frac{\sum (D_i - \bar{D})^2}{N - 1}}$
• Where $D_i$ is the individual difference between the two methods for each sample, with regard to sign, and $\bar{D}$ is the mean of all individual differences.
Confidence interval of the Mean
– The confidence interval for the mean is the range of values within which the population mean µ is expected to lie with a certain probability.
– The confidence level is the probability that the true mean lies within a certain interval.
– The confidence limits are given as
Confidence limits of $\mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$
– The CL is used to estimate the probability that the population mean lies within a certain region centered at $\bar{x}$, the sample mean (a certain range on either side of $\bar{x}$).
Gross Error
• A third type of error is gross error; it differs from determinate and indeterminate errors.
• Gross errors occur occasionally and are large (either high or low).
• Only a few of the results will scatter outside the rest.
• Results that differ markedly from all other replicate data are called outliers.
• Gross errors often leave no direct evidence, but they are produced by human error, e.g. if a part of the precipitate is lost before weighing or if a weighed bottle is touched by fingers.
• Outliers can introduce error in the analysis, so criteria should be set for whether to retain or reject the data (while remaining free from bias).
• A single result appears to be outside the range of what random errors in the procedure would give.
• Generally arise due to human error.
Gross Error
• Criteria must be developed to decide whether outlying data are rejected or retained.
• Proper statistical treatment is needed before rejection.
• The consequences of making an error in a statistical test are often compared with the consequences of an error made in a judicial procedure.
• In the Q test, the absolute value of the difference between the questionable result and its nearest neighbor is divided by the spread (range) of the whole set; the quotient is called Q.
• This Q value is compared with the tabulated value (Qtab or Qcrit):
  – Qcal > Qtab: the result is rejected (discarded), e.g. with 90% confidence when the 90% table is used.
  – Qcal < Qtab: the result is retained (accepted).
Steps:
1. Calculate the range of the results (W).
2. Calculate the difference between the suspected result and its nearest neighbor.
3. Divide the difference from step 2 by the range from step 1 to get the rejection quotient (Q).
4. Find the tabulated value and compare.
Apply the Q test to the following data to determine whether the outlying result should be retained or rejected at the 95 % confidence level: 50.27, 50.61, 50.84, 50.70, 50.76 ppm.
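The four steps above can be sketched in Python as follows. The function name `q_test` is hypothetical, and the data are read as 50.27, 50.61, 50.84, 50.70 and 50.76 ppm. The critical value 0.710 (N = 5, 95 % level) is an assumption taken from commonly tabulated Q values and should be checked against the table actually in use.

```python
def q_test(values, q_crit):
    """Q test on the most extreme value of a small data set.

    Returns (suspect, q_calc, reject), where reject is True when
    q_calc exceeds the supplied critical value.
    """
    data = sorted(values)
    spread = data[-1] - data[0]            # step 1: range W
    if spread == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = data[1] - data[0]            # step 2: gap to nearest neighbor
    gap_high = data[-1] - data[-2]
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low
    q_calc = gap / spread                  # step 3: rejection quotient
    return suspect, q_calc, q_calc > q_crit  # step 4: compare with table

# Assumed critical value for N = 5 at the 95 % level (check your Q table).
Q_CRIT_95_N5 = 0.710
suspect, q_calc, reject = q_test([50.27, 50.61, 50.84, 50.70, 50.76], Q_CRIT_95_N5)
print(f"suspect = {suspect}, Q = {q_calc:.3f}, reject = {reject}")
```

The sketch tests only the single most extreme value, which is how the Q test is normally applied to small replicate sets.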
Gross Error

• Type I and Type II Error

The minimum difference between the suspected result and the other data has to be fixed before a result is discarded, and this choice can itself introduce two kinds of error.
– If the minimum difference is made too small, valid data are rejected too frequently; such errors are TYPE I.
  • A Type I error occurs when the null hypothesis is rejected although it is actually true (false positive).
– If the minimum difference is made too large, highly erroneous values are retained too frequently; such errors are TYPE II.
  • A Type II error occurs when the null hypothesis is accepted although it is actually false (false negative).

Propagation of Error
• The way in which errors in the individual observations are carried through a series of calculations into the final result is called propagation of errors.
• Attention is usually focused on the accuracy and precision of the final computed result, but it is instructive to see how errors in the individual measurements are propagated into that result.
1. Determinate error
   (i) Addition and subtraction
   (ii) Multiplication and division
2. Indeterminate error
   (i) Addition and subtraction
   (ii) Multiplication and division

Propagation of determinate error:
Let R be the final computed result and A, B and C the measured quantities, with associated errors ρ, α, β and γ respectively.
• In the case of addition and subtraction, suppose
$$R = A + B - C \qquad \text{(i)}$$
Introducing the respective errors,
$$R + \rho = (A + \alpha) + (B + \beta) - (C + \gamma) = (A + B - C) + (\alpha + \beta - \gamma) \qquad \text{(ii)}$$
Subtracting (i) from (ii) gives
$$\rho = \alpha + \beta - \gamma \qquad \text{(iii)}$$
That is, when addition and subtraction are involved, the absolute determinate errors are transmitted directly into the result.
The maximum positive error in the final result arises when γ is negative and α and β are positive; the maximum negative error arises when α and β are negative and γ is positive.
• In the case of multiplication and division, suppose
$$R = \frac{AB}{C} \qquad \text{(iv)}$$
Introducing the respective errors,
$$R + \rho = \frac{(A + \alpha)(B + \beta)}{C + \gamma} = \frac{AB + \alpha B + \beta A + \alpha\beta}{C + \gamma}$$
The product αβ is negligible, so
$$R + \rho = \frac{AB + \alpha B + \beta A}{C + \gamma} \qquad \text{(v)}$$
Subtracting (iv) from (v) gives
$$\rho = \frac{AB + \alpha B + \beta A}{C + \gamma} - \frac{AB}{C} = \frac{ABC + \alpha BC + \beta AC - ABC - \gamma AB}{C(C + \gamma)} = \frac{\alpha BC + \beta AC - \gamma AB}{C(C + \gamma)} \qquad \text{(vi)}$$
Dividing (vi) by (iv),
$$\frac{\rho}{R} = \frac{\alpha BC + \beta AC - \gamma AB}{C(C + \gamma)} \cdot \frac{C}{AB}$$
Since γ is very small compared with C, $C + \gamma \approx C$, and therefore
$$\frac{\rho}{R} = \frac{\alpha}{A} + \frac{\beta}{B} - \frac{\gamma}{C}$$
i.e. if multiplication and division are involved, the relative determinate errors are transmitted directly into the result.
[To obtain the maximum relative error, the expression C − γ should be used in place of C + γ.]
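As a quick numerical check of the derivation, the sketch below compares the exact error in R = AB/C with the first-order estimate R(α/A + β/B − γ/C). The function name and all numerical values are hypothetical, chosen only for illustration.

```python
def quotient_error(A, B, C, alpha, beta, gamma):
    """Return (R, exact_error, first_order_error) for R = A*B/C.

    alpha, beta and gamma are the determinate errors in A, B and C.
    """
    R = A * B / C
    exact = (A + alpha) * (B + beta) / (C + gamma) - R
    first_order = R * (alpha / A + beta / B - gamma / C)
    return R, exact, first_order

# Hypothetical measurements and errors (not taken from the slides).
R, exact, approx = quotient_error(25.00, 0.1000, 50.00,
                                  alpha=0.05, beta=0.0002, gamma=-0.10)
print(f"R = {R:.5f}, exact error = {exact:.7f}, first-order = {approx:.7f}")
```

The two error values agree closely as long as each error is small compared with the quantity it belongs to, which is the assumption made when αβ is neglected and C + γ is replaced by C.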
• Propagation of indeterminate error:
Can indeterminate errors be measured individually? No.
Here the error is interpreted as scatter.
How is the scatter in A, B and C transformed into scatter in R?
Let us use the standard deviation or the variance as the measure of variability. Suppose the error in the final result R depends upon the errors in the preliminary measurements A, B and C, i.e. R = f(A, B, C).
The propagation of error in terms of variance obeys the equation
$$s_R^2 = \left(\frac{\partial R}{\partial A}\right)^2_{B,C} s_A^2 + \left(\frac{\partial R}{\partial B}\right)^2_{A,C} s_B^2 + \left(\frac{\partial R}{\partial C}\right)^2_{A,B} s_C^2$$
• For the case of addition and subtraction, we have
$$R = A + B - C \qquad \text{(i)}$$
If $s_R$, $s_A$, $s_B$ and $s_C$ are the respective standard deviations,
$$s_R^2 = \left(\frac{\partial R}{\partial A}\right)^2_{B,C} s_A^2 + \left(\frac{\partial R}{\partial B}\right)^2_{A,C} s_B^2 + \left(\frac{\partial R}{\partial C}\right)^2_{A,B} s_C^2 = (1)^2 s_A^2 + (1)^2 s_B^2 + (-1)^2 s_C^2$$
$$s_R^2 = s_A^2 + s_B^2 + s_C^2 \qquad \text{(ii)}$$
The absolute variances of the measured values are additive in determining the most probable uncertainty in the result.
• For the case of multiplication and division,
$$R = \frac{AB}{C} \qquad \text{(iii)}$$
If $s_R$, $s_A$, $s_B$ and $s_C$ are the respective standard deviations, then, as in the previous case,
$$s_R^2 = \left(\frac{\partial R}{\partial A}\right)^2_{B,C} s_A^2 + \left(\frac{\partial R}{\partial B}\right)^2_{A,C} s_B^2 + \left(\frac{\partial R}{\partial C}\right)^2_{A,B} s_C^2 = \left(\frac{B}{C}\right)^2 s_A^2 + \left(\frac{A}{C}\right)^2 s_B^2 + \left(-\frac{AB}{C^2}\right)^2 s_C^2$$
Dividing by $R^2 = (AB/C)^2$,
$$\left(\frac{s_R}{R}\right)^2 = \left(\frac{s_A}{A}\right)^2 + \left(\frac{s_B}{B}\right)^2 + \left(\frac{s_C}{C}\right)^2 \qquad \text{(iv)}$$
The relative variances (the squares of the relative standard deviations) are transmitted.

Thus, for products and quotients, the square of the relative standard deviation of the result is equal to the sum of the squares of the relative standard deviations of the numbers making up the product or quotient.

Consider the following addition and subtraction:
(25.12 ± 0.08) + (13.51 ± 0.02) − (15.24 ± 0.06)
Calculate the result with suitable uncertainty.
What will be the relative uncertainty of measurement?
$$s_R = \sqrt{(0.08)^2 + (0.02)^2 + (0.06)^2} = 0.10198$$
R = 23.39 ± 0.10
Relative error = 0.10198 / 23.39 = 0.0044 (0.44 %)

The replicate measurements of mercury in a water sample are 3.152 ± 0.004 ppm, 2.912 ± 0.003 ppm and 3.021 ± 0.005 ppm. Calculate the average mercury content of the water sample, indicating the absolute and relative uncertainties.

Consider the calculation
$$R = \frac{(25.12 \pm 0.08)(13.51 \pm 0.02)}{(15.24 \pm 0.06)}$$
Calculate the result with suitable uncertainty.
What will be the relative uncertainty of measurement?
$$\frac{s_R}{R} = \sqrt{\left(\frac{s_A}{A}\right)^2 + \left(\frac{s_B}{B}\right)^2 + \left(\frac{s_C}{C}\right)^2} = \sqrt{\left(\frac{0.08}{25.12}\right)^2 + \left(\frac{0.02}{13.51}\right)^2 + \left(\frac{0.06}{15.24}\right)^2} = 0.005275791$$
R = 22.27 ± 0.12

Calculate the number of millimoles of Cl⁻ ion in a 250.0 mL sample when equal aliquots of 25.00 mL, treated with AgNO3, give the following results: 48.78, 48.82, 48.75 mL (molarity of AgNO3 = 0.1207 ± 0.0003).
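The two worked results can be reproduced with a short Python sketch. The helper names `sd_of_sum` and `rel_sd_of_product` are hypothetical, and the second calculation is assumed to be the quotient 25.12 × 13.51 / 15.24, which reproduces the printed R = 22.27.

```python
import math

def sd_of_sum(*sds):
    """Absolute SD of a sum or difference: variances add."""
    return math.sqrt(sum(s ** 2 for s in sds))

def rel_sd_of_product(*pairs):
    """Relative SD of a product or quotient from (value, sd) pairs."""
    return math.sqrt(sum((s / v) ** 2 for v, s in pairs))

# Addition and subtraction: 25.12 + 13.51 - 15.24
R_sum = 25.12 + 13.51 - 15.24
s_sum = sd_of_sum(0.08, 0.02, 0.06)

# Multiplication and division (assumed form): 25.12 * 13.51 / 15.24
R_quot = 25.12 * 13.51 / 15.24
rel_quot = rel_sd_of_product((25.12, 0.08), (13.51, 0.02), (15.24, 0.06))
s_quot = R_quot * rel_quot

print(f"sum:      {R_sum:.2f} ± {s_sum:.2f} (relative {s_sum / R_sum:.4f})")
print(f"quotient: {R_quot:.2f} ± {s_quot:.2f} (relative {rel_quot:.6f})")
```

Note that the sign of each term in the sum does not matter, because every standard deviation enters as a square.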
The Least Square Method:
• Most chemical analyses require a linear plot (i.e. the detector response or the final result is linearly related to the concentration of the analyte):
  y = mx + b
  [Figure: calibration line y = mx + b; instrumental response (Y) vs concentration (X)]
• By using different standard solutions we can plot the curve, called the calibration curve. Because errors accumulate, not all readings lie on the line. In such a situation we have to draw the line of best fit (using the method of least squares).
• In the least square method the slope and intercept of the line are determined mathematically.
  y = mx + b ……….. (i); where m = slope and b = intercept
• The least square method assumes that the sum of squares of the residuals from all the points is a minimum. [The vertical deviation of each point from the line is called the residual.]
• Not only can the best line be determined, but the uncertainties in the use of the calibration graph for the analysis of an unknown can also be determined from this regression line.
The Least Square Method:
• First, calculate the sum of squares of the residuals. For a point P($x_i$, $y_i$) and the line y = mx + b, the residual is PQ = $y_i - (mx_i + b)$.
  [Fig: Calibration curve for isooctane (peak area vs mole %)]
The sum of squares of the residuals, $SS_{resid}$, is given by
$$SS_{resid} = \sum_i [y_i - (mx_i + b)]^2$$
• To minimize $SS_{resid}$, its first derivatives with respect to the variables m and b are set to zero. We have to estimate m and b.
$$\frac{\partial SS_{resid}}{\partial m} = 0 \quad \& \quad \frac{\partial SS_{resid}}{\partial b} = 0 \qquad \text{(ii)}$$
From the principle of minima,
$$\frac{\partial SS_{resid}}{\partial m} = 0, \quad \frac{\partial SS_{resid}}{\partial b} = 0, \quad \frac{\partial^2 SS_{resid}}{\partial m^2} > 0 \quad \& \quad \frac{\partial^2 SS_{resid}}{\partial b^2} > 0$$
• Solving the equations gives
$$\frac{\partial}{\partial m}\sum [y_i - (mx_i + b)]^2 = 0$$
$$\text{i.e. } 2\sum [y_i - (mx_i + b)] \cdot (-x_i) = 0$$
$$\text{i.e. } \sum [x_i y_i - mx_i^2 - bx_i] = 0 \qquad \text{(iii)}$$
Also,
$$\frac{\partial}{\partial b}\sum [y_i - (mx_i + b)]^2 = 0$$
$$\text{i.e. } 2\sum [y_i - (mx_i + b)] \cdot (-1) = 0$$
$$\text{i.e. } \sum [y_i - mx_i - b] = 0 \qquad \text{(iv)}$$
Equations (iii) and (iv) are the normal equations that fix the line, i.e. solving them for m and b gives the line of best fit.
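The normal equations (iii) and (iv) form a 2 × 2 linear system in m and b, which can be solved directly. The sketch below does this in plain Python. The function name is hypothetical, and the data are the vitamin B calibration values (concentration vs fluorescence) that appear later in these slides.

```python
def solve_normal_equations(xs, ys):
    """Solve  m*Σx² + b*Σx = Σxy  and  m*Σx + N*b = Σy  for (m, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_x2 - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = (n * sum_xy - sum_x * sum_y) / det
    b = (sum_y - m * sum_x) / n
    return m, b

conc = [0.000, 0.100, 0.200, 0.400, 0.800]
fluorescence = [0.0, 5.8, 12.2, 22.3, 43.3]
m, b = solve_normal_equations(conc, fluorescence)

# Both normal equations should be satisfied by the residuals.
residuals = [y - (m * x + b) for x, y in zip(conc, fluorescence)]
print(f"m = {m:.2f}, b = {b:.3f}")
print(f"Σ residual = {sum(residuals):.1e}, "
      f"Σ x·residual = {sum(x * r for x, r in zip(conc, residuals)):.1e}")
```

The two printed sums are the left-hand sides of (iv) and (iii); both come out as zero to within rounding, which confirms that the fitted line satisfies the normal equations.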
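If we represent
$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{N}$$
$$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{N}$$
$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{N}$$
$$\bar{x} = \frac{\sum x_i}{N}, \qquad \bar{y} = \frac{\sum y_i}{N}$$
where $x_i$ and $y_i$ are the individual pairs of x and y values, N is the number of pairs, and $\bar{x}$ and $\bar{y}$ are the average values of x and y ($S_{xx}$ is also written C and $S_{yy}$ is written D), then we can put these into the formulas:
(a) Slope of line: $m = \dfrac{S_{xy}}{S_{xx}}$
(b) Intercept: $b = \bar{y} - m\bar{x}$
(c) SD about regression (standard error of the estimate), also called the SD of y (a rough measure of a typical deviation from the regression line):
$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}}$$
(d) 90% C.L. for slope: $CL_{0.90} = m \pm t s_m$, with (N − 2) degrees of freedom
(e) SD of slope: $s_m = \sqrt{\dfrac{s_r^2}{S_{xx}}}$
(f) SD of intercept: $s_b = s_r \sqrt{\dfrac{\sum x_i^2}{N \sum x_i^2 - (\sum x_i)^2}}$
(g) SD of replicate results obtained using the same calibration curve:
$$s_c = \frac{s_r}{m} \sqrt{\frac{1}{M} + \frac{1}{N} + \frac{(\bar{y}_c - \bar{y})^2}{m^2 S_{xx}}}$$
where M is the number of replicate measurements of the unknown and $\bar{y}_c$ is their mean.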
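Formulas (a) to (g) can be collected into a small Python sketch. The function names are hypothetical. The data are the lead standards of Problem 46 later in these slides (concentration in M, diffusion current, unknown reading 4.8), and t = 2.353 for 3 degrees of freedom is the value quoted there.

```python
import math

def calibration_stats(xs, ys):
    """Least-squares line y = m*x + b using formulas (a)-(c), (e), (f)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    m = s_xy / s_xx                                   # (a) slope
    b = sum_y / n - m * sum_x / n                     # (b) intercept
    # (c) SD about regression; max() guards against tiny negative rounding
    s_r = math.sqrt(max(s_yy - m * m * s_xx, 0.0) / (n - 2))
    s_m = math.sqrt(s_r ** 2 / s_xx)                  # (e) SD of slope
    s_b = s_r * math.sqrt(sum_x2 / (n * sum_x2 - sum_x ** 2))  # (f)
    return {"n": n, "y_mean": sum_y / n, "s_xx": s_xx,
            "m": m, "b": b, "s_r": s_r, "s_m": s_m, "s_b": s_b}

def unknown_from_calibration(stats, y_c, replicates=1):
    """Concentration of an unknown and its SD, formula (g)."""
    m, s_r = stats["m"], stats["s_r"]
    x_c = (y_c - stats["b"]) / m
    s_c = (s_r / m) * math.sqrt(
        1 / replicates + 1 / stats["n"]
        + (y_c - stats["y_mean"]) ** 2 / (m * m * stats["s_xx"]))
    return x_c, s_c

conc = [0.0002, 0.001, 0.002, 0.0005, 0.0015]
current = [2.8, 6.2, 10.1, 3.9, 7.8]
stats = calibration_stats(conc, current)
ci_half = 2.353 * stats["s_m"]                        # (d) 90 % CL half-width
x_c, s_c1 = unknown_from_calibration(stats, 4.8, replicates=1)
_, s_c4 = unknown_from_calibration(stats, 4.8, replicates=4)

print(f"y = {stats['m']:.1f} x + {stats['b']:.4f}, s_r = {stats['s_r']:.4f}")
print(f"s_m = {stats['s_m']:.1f}, s_b = {stats['s_b']:.4f}, "
      f"90 % CL of slope = ±{ci_half:.1f}")
print(f"unknown = {x_c:.3e} M, s_c (M=1) = {s_c1:.2e}, s_c (M=4) = {s_c4:.2e}")
```

Averaging more replicate readings of the unknown (larger M) reduces only the 1/M term, so $s_c$ cannot fall below the contribution of the calibration line itself.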
Alternative formula
If we represent the sums as above, the following quantities are obtained:
– the SD of the y-values (degrees of freedom = n − 2),
– the SD of m, $s_m$,
– the 90% CL of m,
– the standard deviation in the result.
Determination of vitamin B
Vit B    Fluorescence
0.000    0
0.100    5.8
0.200    12.2
0.400    22.3
0.800    43.3
Obtain the best-fitted calibration curve.
Calculate the vitamin B concentration if the sample fluorescence is 15.4.
Calculate the uncertainty in
(i) Slope
(ii) Intercept
(iii) Vitamin B concentration

Results:
Sxx (C) = 0.4
Syy (D) = 1156.9
Sxy = 21.5
Slope, m = 53.75
Intercept, b = 0.595
SD about regression, sr = 0.6437
SD of slope, sm = 1.0178
CHAPTER 1
Errors and the Treatment of Analytical Data

Problems
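Example: 2.1 For the following normalities: 0.2039, 0.2041, 0.2043 and 0.2049 N

Mean: $\bar{x} = \dfrac{1}{4}\sum_{i=1}^{4} x_i = 0.2043$ N

Median: $M_d = \dfrac{0.2041 + 0.2043}{2} = 0.2042$ N

Range: L − S = 0.2049 − 0.2039 = 0.0010 N

Average deviation: $\bar{d} = \dfrac{\sum_{i=1}^{4} |x_i - \bar{x}|}{4} = 0.0003$ N

Relative average deviation: $RAD = \dfrac{\bar{d}}{\bar{x}} \times 100 = 0.15\%$

Standard deviation: $s = \sqrt{\dfrac{\sum_{i=1}^{4} (x_i - \bar{x})^2}{3}} = 0.0004$ N

Coefficient of variation: $v = \dfrac{s}{\bar{x}} \times 100 = \dfrac{0.0004\ \text{N}}{0.2043\ \text{N}} \times 100 = 0.2\%$

Standard error of mean: $s_m = \dfrac{s}{\sqrt{n}} = \dfrac{0.0004\ \text{N}}{\sqrt{4}} = 0.0002$ N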
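The quantities in Example 2.1 can be computed with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative, and `statistics.stdev` is the sample standard deviation (divisor n − 1), which matches the divisor 3 used for four results.

```python
import math
import statistics

normalities = [0.2039, 0.2041, 0.2043, 0.2049]
n = len(normalities)

mean = statistics.mean(normalities)
median = statistics.median(normalities)
data_range = max(normalities) - min(normalities)
avg_dev = sum(abs(x - mean) for x in normalities) / n
rad_percent = avg_dev / mean * 100          # relative average deviation
s = statistics.stdev(normalities)           # sample SD, divisor n - 1
cv_percent = s / mean * 100                 # coefficient of variation
std_error = s / math.sqrt(n)                # standard error of the mean

print(f"mean = {mean:.4f} N, median = {median:.4f} N, range = {data_range:.4f} N")
print(f"average deviation = {avg_dev:.4f} N, RAD = {rad_percent:.2f} %")
print(f"s = {s:.4f} N, CV = {cv_percent:.1f} %, standard error = {std_error:.4f} N")
```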
Example: 2.2 Calculate the 90% and 99% confidence intervals of the mean if the percentage of iron in iron ore from four samples gives (i) mean = 15.3% and (ii) SD = 0.1.
Given: t at 90% = 2.353 and t at 99% = 5.841
Confidence interval of mean:
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{n}}$$
90% confidence interval of mean: 15.3 ± 0.11765 = 15.18235% to 15.41765%
99% confidence interval of mean: 15.3 ± 0.29205 = 15.00795% to 15.59205%
Example: 2.3 The % of soda ash (Na2CO3) is determined by two methods. Are the two means significantly different at the 95% confidence level? Given t = 2.365
Data:
                 Method A    Method B
Mean             42.44       42.34
SD               0.10        0.12
No of samples    5           4

Pooled standard deviation (general form, where $n_t$ is the number of data sets pooled):
$$s_{pooled} = \sqrt{\frac{\sum_{i=1}^{N_1} (x_i - \bar{x}_1)^2 + \sum_{j=1}^{N_2} (x_j - \bar{x}_2)^2 + \cdots}{N_1 + N_2 + \cdots - n_t}}$$
Weighted SD:
$$s = \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}} = \sqrt{\frac{4 \times 0.01 + 3 \times 0.0144}{7}} = 0.1090216$$
Calculation of t:
$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{s} \sqrt{\frac{N_1 N_2}{N_1 + N_2}}$$
Calculated t:
$$t = \frac{|42.44 - 42.34|}{0.1090216} \sqrt{\frac{20}{9}} = 1.3674$$
t cal < t tabulated: the difference between the two means is not significant.
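The pooled-SD and t calculation can be sketched in Python as below. The function names are hypothetical. The inputs are the numbers that appear in the worked calculation: means 42.44 and 42.34, variances 0.01 and 0.0144 (SDs 0.10 and 0.12), N = 5 and 4, and tabulated t = 2.365.

```python
import math

def pooled_sd(groups):
    """Pooled SD from (n, s) pairs: sqrt(Σ(n-1)s² / (Σn - number of sets))."""
    groups = list(groups)
    numerator = sum((n - 1) * s ** 2 for n, s in groups)
    dof = sum(n for n, _ in groups) - len(groups)
    return math.sqrt(numerator / dof)

def t_two_means(mean1, mean2, s_pooled, n1, n2):
    """t = |mean1 - mean2| / s_pooled * sqrt(n1*n2 / (n1 + n2))."""
    return abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

s_p = pooled_sd([(5, 0.10), (4, 0.12)])
t_calc = t_two_means(42.44, 42.34, s_p, 5, 4)
T_TAB = 2.365  # tabulated t at 95 % for 5 + 4 - 2 = 7 degrees of freedom
significant = t_calc > T_TAB

print(f"s_pooled = {s_p:.4f}, t = {t_calc:.4f}, significant = {significant}")
```

Because the comparison uses the absolute difference of the means, it does not matter which method is labelled 1 and which 2.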
Example: 2.4 Comparison of a calculated mean with a standard value: a sample containing 10.6% iron gives the following result on analysis: mean = 10.52, SD = 0.05, n = 10.
Is the calculated mean significantly different at the 95% confidence level? Given t = 2.262
First, t is calculated using the equation
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{n}}$$
where µ is the standard value and $\bar{x}$ is the calculated mean:
$$t = \frac{|10.6 - 10.52|}{0.05} \sqrt{10} = 5.06$$
t cal > t tabulated: the difference between the two means is significant.


Example: 2.5 Rejecting or discarding a datum:
Vitamin C in a fruit juice is determined.
Data: 0.215, 0.218, 0.219, 0.22 and 0.23, all in mg/mL
Should the largest value be discarded? Given Qtab = 0.64

First, Q is calculated using the equation
$$Q = \frac{0.23 - 0.22}{0.23 - 0.215} = 0.67$$
Q cal > Q tabulated: the value can be discarded.

Additional questions:
i) Should the smallest value be discarded?
ii) What is the largest value that can be retained?
Example: 2.6 Propagation of error:
MW(potassium thiocyanate) = AW(potassium) + AW(sulphur) + AW(carbon) + AW(nitrogen)
                  Potassium   Sulphur   Carbon   Nitrogen
Measured AW       39.1        32.6      12       14
Accurate AW       39.01       32.06     12       14
Absolute error    0.09        0.54      0.00     0.00

Error in determined value of MW = 0.09 + 0.54 = 0.63


Example: 2.7 The % of chlorine in a sample is determined from the weight of AgCl precipitate:
$$\%Cl = \frac{Wt(AgCl) \times AW(Cl)}{MW(AgCl) \times Wt(sample)} \times 100$$
Wt of sample = 0.8625 g
Wt of AgCl ppt = 0.7864 g

By mistake the AW of Cl is taken as 35.345; the correct atomic weight is 35.453.

Determine the determinate error made while calculating the % of chlorine.
$$\frac{\rho}{R} = \frac{\alpha}{A} + \frac{\beta}{B} - \frac{\gamma}{C} - \frac{\delta}{D} \quad \text{where } \alpha = 0 \text{ and } \delta = 0$$
Here A = Wt(AgCl), B = AW(Cl), C = MW(AgCl) and D = Wt(sample).
First calculate R, B, C, β and γ, and finally calculate ρ.
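Example: 2.8 A liquid is transferred into a beaker using a calibrated pipette. Calculate the uncertainty in the measured volume.
Given: uncertainty in reading the level of the liquid = ± 0.02

Ans: $\pm\sqrt{(0.02)^2 + (0.02)^2} = \pm 0.03$ (two level readings, each with an uncertainty of ± 0.02)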

Example: 2.9 The % of copper in a sample is determined from the relation
$$\%Cu = \frac{\text{mL of reagent} \times \text{M of reagent} \times \text{AW of Cu}}{\text{mg of sample}} \times 100$$
Readings
mL of reagent          30.34 ± 0.03
Molarity of reagent    0.1012 ± 0.0002
mg of sample           1073.2 ± 0.2
AW Cu                  63.546 ± 0.003
$$\frac{s_R}{R} = \sqrt{\left(\frac{0.03}{30.34}\right)^2 + \left(\frac{0.0002}{0.1012}\right)^2 + \left(\frac{0.003}{63.546}\right)^2 + \left(\frac{0.2}{1073.2}\right)^2}$$
With R = 0.18180408 (i.e. 18.18 %), this gives $s_R$ = 0.000403 (i.e. 0.04 %).

First calculate R, A, B, C, D, $s_A$, $s_B$, $s_C$ and $s_D$, and finally calculate $s_R$.
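As a cross-check, the general variance formula $s_R^2 = \sum (\partial R/\partial x_i)^2 s_i^2$ can be evaluated numerically instead of using the relative-variance shortcut. The sketch below estimates each partial derivative by a central finite difference. The names `propagate_sd` and `percent_cu` and the step size are assumptions made for this illustration only.

```python
import math

def propagate_sd(f, values, sds, rel_step=1e-6):
    """SD of f(*values) from s_R² = Σ (∂f/∂x_i)² s_i².

    Partial derivatives are estimated by central finite differences.
    """
    variance = 0.0
    for i, (v, s) in enumerate(zip(values, sds)):
        h = abs(v) * rel_step or rel_step   # fall back to rel_step if v == 0
        up, down = list(values), list(values)
        up[i], down[i] = v + h, v - h
        derivative = (f(*up) - f(*down)) / (2 * h)
        variance += (derivative * s) ** 2
    return math.sqrt(variance)

def percent_cu(ml_reagent, molarity, aw_cu, mg_sample):
    return ml_reagent * molarity * aw_cu / mg_sample * 100

values = [30.34, 0.1012, 63.546, 1073.2]
sds = [0.03, 0.0002, 0.003, 0.2]

R = percent_cu(*values)
s_R = propagate_sd(percent_cu, values, sds)
print(f"%Cu = {R:.2f} ± {s_R:.2f} (relative SD = {s_R / R:.5f})")
```

For a pure product or quotient this agrees with the relative-variance expression; the numerical route is mainly useful when R mixes sums, products and other functions.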
Problem: 46 (determination of lead by polarographic analysis)
Specimen             Concn of lead, M    Diffusion current, id (µA)
Standard sample 1    0.0002              2.8
Standard sample 2    0.001               6.2
Standard sample 3    0.002               10.1
Standard sample 4    0.0005              3.9
Standard sample 5    0.0015              7.8
Unknown sample                           4.8

Calculate:
a. Equation for the best straight line
b. Standard deviation of the current values
c. Standard deviation of slope
d. 90% confidence interval of slope
e. Concentration of unknown
f. Standard deviation of result
   i. Only one sample is considered.
   ii. Mean of four samples is considered.
Solution Problem: 46 (Construct a table)
       x = [Pb]    y = id    xy         x²          y²
       0.0002      2.8       0.00056    4E-08       7.84
       0.001       6.2       0.0062     1E-06       38.44
       0.002       10.1      0.0202     4E-06       102.01
       0.0005      3.9       0.00195    2.5E-07     15.21
       0.0015      7.8       0.0117     2.25E-06    60.84
sum    0.0052      30.8      0.04061    7.54E-06    224.34

$$C = S_{xx} = \sum_{i=1}^{5} x_i^2 - \frac{(\sum x_i)^2}{n}, \qquad D = S_{yy} = \sum_{i=1}^{5} y_i^2 - \frac{(\sum y_i)^2}{n}, \qquad S_{xy} = \sum_{i=1}^{5} x_i y_i - \frac{\sum x_i \sum y_i}{n}$$
$\bar{x} = \dfrac{\sum x_i}{N} = 0.00104$
$\bar{y} = \dfrac{\sum y_i}{N} = 6.16$
Slope: $m = \dfrac{S_{xy}}{S_{xx}} = 4023.5$
Intercept: $b = \bar{y} - m\bar{x} = 1.9756$
Line: y = 4023.5x + 1.9756
SD of current: $s_r = \sqrt{\dfrac{S_{yy} - m^2 S_{xx}}{N - 2}} = 0.1815$
SD of slope: $s_m = \sqrt{\dfrac{s_r^2}{S_{xx}}} = 124.3$
90% CL of slope: $m \pm t s_m = 4023.5 \pm 292.49$, where t = 2.353 for 3 df
SD of intercept: $s_b = s_r \sqrt{\dfrac{\sum x_i^2}{N \sum x_i^2 - (\sum x_i)^2}} = 0.1526$
SD of the result:
$$s_c = \frac{s_r}{m} \sqrt{\frac{1}{M} + \frac{1}{N} + \frac{(\bar{y}_c - \bar{y})^2}{m^2 S_{xx}}}$$
i. Only one sample is considered: $s_c$ = 0.00005
Example Skoog (determination of iron by absorbance measurement)
Specimen             Conc. of Fe    Absorbance
Standard sample 1    0.352          1.09
Standard sample 2    0.803          1.78
Standard sample 3    1.08           2.6
Standard sample 4    1.38           3.03
Standard sample 5    1.75           4.01
Unknown sample                      2.65

Calculate:
a. Equation for the best straight line
b. Standard deviation of the absorbance values (SD about regression)
c. Standard deviation of slope
d. 90% confidence interval of slope
e. Concentration of unknown
f. Standard deviation of result
   i. Only one sample is considered.
   ii. Mean of four samples is considered.
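Example: Skoog (Construct a table)
       x        y        xy         x²         y²
       0.352    1.09     0.38368    0.1239     1.1881
       0.803    1.78     1.42934    0.6448     3.1684
       1.08     2.6      2.808      1.1664     6.76
       1.38     3.03     4.1814     1.9044     9.1809
       1.75     4.01     7.0175     3.0625     16.0801
sum    5.365    12.51    15.8199    6.90201    36.3775

$$C = S_{xx} = \sum_{i=1}^{5} x_i^2 - \frac{(\sum x_i)^2}{n}, \qquad D = S_{yy} = \sum_{i=1}^{5} y_i^2 - \frac{(\sum y_i)^2}{n}, \qquad S_{xy} = \sum_{i=1}^{5} x_i y_i - \frac{\sum x_i \sum y_i}{n}$$
$\bar{x} = \dfrac{\sum x_i}{N} = 1.073$
$\bar{y} = \dfrac{\sum y_i}{N} = 2.502$
Slope: $m = \dfrac{S_{xy}}{S_{xx}} = 2.0925$
Intercept: $b = \bar{y} - m\bar{x} = 0.2567$
Line: y = 2.0925x + 0.2567
SD about regression: $s_r = \sqrt{\dfrac{S_{yy} - m^2 S_{xx}}{N - 2}} = 0.1442$
SD of slope: $s_m = \sqrt{\dfrac{s_r^2}{S_{xx}}} = 0.1347$
90% CL of slope: $m \pm t s_m = 2.0925 \pm 0.3171$, where t = 2.353 for 3 df
SD of intercept: $s_b = s_r \sqrt{\dfrac{\sum x_i^2}{N \sum x_i^2 - (\sum x_i)^2}} = 0.1583$
SD of the result:
$$s_c = \frac{s_r}{m} \sqrt{\frac{1}{M} + \frac{1}{N} + \frac{(\bar{y}_c - \bar{y})^2}{m^2 S_{xx}}}$$
i. Only one sample is considered: $s_c = \dfrac{0.1442}{2.0925} \sqrt{\dfrac{1}{1} + \dfrac{1}{5} + \dfrac{(2.65 - 2.502)^2}{(2.0925)^2 \times 1.1454}} = 0.0756$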
