
Statistics: Error (Chpt. 5)
• Always some amount of error in every analysis (How much can you tolerate?)

• We examine error in our measurements to know reliably that a given amount of analyte is
in the sample

• To determine the error in the measurement, we run replicate samples: samples of about
the same size that are carried through an analysis in exactly the same way

• If a measurement had no error, replicate samples would yield identical answers. In practice, this does not happen

• With replicate data, we usually report the mean or average

• In some instances, we are interested in the median: middle value in a set of data that has
been arranged in order of size

• Median is important in data sets with outliers. Outliers can have large effects on the mean,
but they have little effect on the median.

• Example: Consider the masses: 3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198
What happens if 3.107 is accidentally recorded as 31.07?
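A minimal Python sketch of this example (the variable names are mine, not from the slides):

```python
# Compare mean and median for the replicate masses, with and without
# the transcription error (3.107 mis-recorded as 31.07).
from statistics import mean, median

masses = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]
typo   = [3.080, 3.094, 31.07, 3.056, 3.112, 3.174, 3.198]

print(f"correct: mean = {mean(masses):.3f}, median = {median(masses):.3f}")
# correct: mean = 3.117, median = 3.107
print(f"typo:    mean = {mean(typo):.3f}, median = {median(typo):.3f}")
# typo:    mean = 7.112, median = 3.112
```

The single bad entry shifts the mean by almost 4, but moves the median by only 0.005.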
Statistics: Error (Chpt. 5)
Precision vs. Accuracy
Precision is the closeness of data to other data that have been obtained in exactly the same
way
High-precision measurements have small standard deviations, variances, and coefficients of
variation. These quantities are functions of the deviation from the mean value and have no
relationship to the true value.
Accuracy is the closeness of a result to its true or accepted value. Accuracy reflects how
much error is in the method, not how reproducible the method is

Statistics: Error (Chpt. 5)
Error related to Accuracy
Absolute error: difference between the measured value and the true value. It bears a sign
E = xi – xt where xt is true or accepted value and xi is measured value

Relative Error: absolute error divided by the true value (aka % error)

Er = (xi – xt) / xt × 100%
Example: True value is 20.0 ppm and measured value is 19.8 ppm
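Worked out, this gives: E = 19.8 – 20.0 = –0.2 ppm, and Er = (–0.2 / 20.0) × 100% = –1.0%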

Precision is determined by comparing replicate data, but accuracy is not as easy to
determine (we usually don’t know the true value)

Different types of error


Random (or indeterminate) errors: affect the precision of measurement; non-traceable

Systematic (or determinate) errors: affect the accuracy of results; traceable; has assignable
cause; same magnitude for replicate measurements

Gross errors (aka outliers): quite large, don’t occur often, caused by human error (loss of
precipitate, etc.)

Statistics: Error (Chpt. 5)
Sources of systematic errors

1. Instrumental errors (fixed by calibration)


- volumetric glassware may differ from listed value
- electrical: increased resistance from dirty contacts or temperature changes

2. Method errors (from non-ideal behavior of reagents used in analysis)


- slow reactivity between analyte and titrant, side reactions, end point vs. equiv. point
- often most difficult to detect
- fixed by doing analysis of standard samples (standard reference material) and/or by
performing blank determinations
- also fixed by cross validation with other method

3. Personal errors (fix by taking care and doing replicates)


- incorrect reading of liquid level in a buret
- error in detecting the color change in a titration (esp. if color blind)
- prejudice in numerical readings
- incorrect significant figures

Statistics: Error (Chpt. 5)
Effect of systematic error on results

1. Constant error: same amount of error is made each time, but the relative error will change
- independent of sample size
- becomes serious as sample size decreases
Example: 0.5 mg of a precipitate (ppt) is lost as a result of a wash with liquid. Calculate
Er if the ppt weighs 500 mg or 50 mg
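Worked out: Er = (–0.5 / 500) × 100% = –0.1% for the 500 mg ppt, but (–0.5 / 50) × 100% = –1.0% for the 50 mg ppt. The same absolute loss is ten times more serious for the smaller sample.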

2. Proportional Error: the absolute error changes, but the relative error remains constant
- dependent on sample size
- absolute error grows with sample size
Example: When washing a ppt with a liquid, a proportional error occurs. If Er is
2.5%, calculate E for washing a 50 mg and a 500 mg ppt
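Worked out: E = 0.025 × 50 mg ≈ 1.2 mg for the 50 mg ppt and 0.025 × 500 mg = 12.5 mg for the 500 mg ppt; the absolute error grows with the sample, while Er stays at 2.5%.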

Random Errors (Chpt. 6)
Significant Figures
General rule: don’t report what you don’t know
See pages 134-136; you must know these, but we won’t cover them in class

1. Addition/subtraction

Do not be more specific than your least specific number

2. Multiplication/Division

the same general rules apply, but count significant figures rather than decimal places (see the examples below)
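Illustrative examples (the numbers here are mine, not from the slides): addition: 12.11 + 18.0 + 1.013 = 31.123, reported as 31.1, since 18.0 is known only to one decimal place; multiplication: 4.56 × 1.4 = 6.384, reported as 6.4, since 1.4 carries only two significant figures.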

Random Errors (Chpt. 6)
All measurements have random error (can only be minimized not eliminated)
Consider measuring the volume dispensed by a 10-mL volumetric pipet

As N grows past ~30, a histogram of the results starts to form a bell-shaped curve


Central limit theorem: distribution of measurements subject to random errors is
often a normal distribution (Gaussian distribution)
Random Errors (Chpt. 6)
Properties of a Gaussian Curve
Population (collection of all measurements of interest to an experiment) vs. sample (subset
of measurements selected from the population)

Population mean (µ) vs. sample mean (x̄)

Precision = closeness of data to other data that have been obtained in a similar manner,
expressed usually by standard deviation

Population std. dev. (σ): σ = √( Σ(xi − µ)² / N )

Random Errors (Chpt. 6)
Properties of a Gaussian Curve
z-variable: deviation from the mean relative to the standard deviation, z = (x − µ)/σ;
it describes all populations of data regardless of their standard deviation

µ ± 1σ = 68.3%
µ ± 2σ = 95.5%
µ ± 3σ = 99.7%

Sample standard deviation (s):

s = √( Σᵢ₌₁ⁿ (xᵢ − x̄)² / (n − 1) )

A more calculator-friendly equivalent:

s = √( (Σxᵢ² − (Σxᵢ)²/n) / (n − 1) )
• Use sample std. dev. (s) with data sets of 30 points or less
• Lower value of s indicates better precision
• Scatter of the mean from the “true” value will decrease as N is increased
• What is n − 1? Degrees of freedom: each time you estimate a quantity from the data (here,
the mean), you lose one degree of freedom; n − 1 = # of data points that remain independent
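A minimal Python sketch (reusing the mass data from the first slide; variable names are mine) showing that the defining formula and the calculator-friendly rearrangement agree with the standard library routine:

```python
# Sample standard deviation three ways: defining formula,
# calculator-friendly rearrangement, and the standard library.
import math
import statistics

data = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]
n = len(data)
xbar = sum(data) / n

s_def  = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
s_calc = math.sqrt((sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1))

assert math.isclose(s_def, statistics.stdev(data))
print(f"s = {s_def:.4f}, calculator form = {s_calc:.4f}")  # both ~0.0509
```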

Random Errors (Chpt. 6)
Relative standard deviation
RSD (parts per thousand) = (s / x̄) × 1000 ppt

Coefficient of variation (% RSD) = (s / x̄) × 100%

Standard error of the mean (Sm): Sm = s / √N

- shows how the uncertainty of the mean shrinks as the number of measurements grows

Pooled Standard Deviation

sPooled is used to pool standard deviations from different measurements, done when
increasing the # of measurements is not possible (several subsets of data):

sPooled = √( (Σ(xi − x̄1)² + Σ(xj − x̄2)² + …) / (N1 + N2 + … − Nsets) )

When you have 2 sets of data, this simplifies to a calculator-friendly form (not in the book):

sPooled = √( ((N1 − 1)s1² + (N2 − 1)s2²) / (N1 + N2 − 2) )
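A minimal Python sketch of the two-set formula, using invented replicate values (the numbers are hypothetical, purely for illustration):

```python
# Pool the standard deviations of two small data sets using
# s_pooled = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2)).
import math
import statistics

set1 = [12.58, 12.62, 12.65, 12.60]  # hypothetical replicates
set2 = [12.50, 12.55, 12.53]         # hypothetical replicates

n1, n2 = len(set1), len(set2)
s1, s2 = statistics.stdev(set1), statistics.stdev(set2)

s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
print(f"s_pooled = {s_pooled:.4f}")
```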

Statistical Treatment of Data (Chpt. 7)
Scientists use statistical calculations to judge the quality of experimental measurements
These calculations are based upon means, standard deviations, Gaussian curves and test
statistics

Confidence Limits
define an interval around the experimentally determined mean that “probably” contains the
population mean (µ)

If the population standard deviation (σ) is known, the confidence interval is:

CI for µ = x̄ ± z σ/√N

Again, the CI narrows in proportion to 1/√N as N increases

The value of z depends on the confidence level of the measurement

Example: Determine the 80% and 95% confidence intervals for an experimentally determined
glucose level of 1108 mg/L if s = 19 mg/L and s is a good estimator of σ (n = 7)
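Worked out (since s is a good estimate of σ, z is used): σ/√N ≈ 19/√7 ≈ 7.2 mg/L. 80% CI: 1108 ± (1.28)(7.2) ≈ 1108 ± 9 mg/L; 95% CI: 1108 ± (1.96)(7.2) ≈ 1108 ± 14 mg/L.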

Statistical Treatment of Data (Chpt. 7)
But… s is not always a good estimator of σ
Then use the t statistic, which depends on the number of measurements:

CI for µ = x̄ ± t s/√N

Example: A chemist found the following data for the alcohol content of a sample of blood:
0.084%, 0.089%, and 0.079%. Calculate the 95% confidence interval for the mean.
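Worked out: x̄ = 0.084% and s = 0.0050%. With n − 1 = 2 degrees of freedom, the 95% t value from a t table is 4.303, so CI = 0.084 ± (4.303)(0.0050)/√3 = 0.084 ± 0.012% alcohol.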

Statistical Treatment of Data (Chpt. 7)
Often use t or z statistic to accept or reject data: Hypothesis testing

Null hypothesis: postulates that there is no difference between two observed quantities

Rules for hypothesis testing when true mean is known:


1. Write null hypothesis
2. Depending upon whether σ or s is to be used, look up corresponding test statistic
(z or t) for a given confidence level
3. Determine zcal or tcal: zcal = (x̄ − µ)/(σ/√N) or tcal = (x̄ − µ)/(s/√N)

4. If the calculated value is greater than the table value, reject the null hypothesis;
if the calculated value is less than the table value, accept the null hypothesis

Example: A new procedure for testing sulfur in fuel. A certified standard gives 0.123% S.
The new test (n = 4) gives 0.112, 0.118, 0.115, and 0.119% S. Is there a bias
at the 95% confidence level?
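Worked out: x̄ = 0.116% S and s ≈ 0.0032% S, so tcal = |0.116 − 0.123| / (0.0032/√4) ≈ 4.4. The table t for 3 degrees of freedom at the 95% level is 3.182; since 4.4 > 3.182, reject the null hypothesis: the new procedure shows a bias.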

Statistical Treatment of Data (Chpt. 7)
Oftentimes we want to compare two different experimentally determined means (N < 30).
Use sPooled and a different tcal formula, but the same t table:

tcal = (x̄1 − x̄2) / (sPooled × √(1/N1 + 1/N2))

Example: Analysis of two barrels of wine for alcohol content. 6 analyses of the 1st barrel
average 12.61%; 4 analyses of the 2nd barrel average 12.53%; sPooled for the 10 analyses is
0.070%. At the 95% CL, is there a difference between the 2 wines?
Note: number of degrees of freedom = N1 + N2 − 2
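Worked out: tcal = (12.61 − 12.53) / (0.070 × √(1/6 + 1/4)) ≈ 1.8. The table t for 8 degrees of freedom at the 95% CL is 2.306; since 1.8 < 2.306, accept the null hypothesis: no significant difference between the two wines is demonstrated.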

Statistical Treatment of Data (Chpt. 7)
Comparison of precision: F-test
Similar to t-tests, but this test compares precision of two sets of data
Can be used to test experimental and true standard dev.

Fcalc = s1² / s2², arranged with the larger variance in the numerator so that Fcalc > 1.0

If Fcalc > Ftable: reject the null hypothesis

Example: The standard method for measuring CO has a std. dev. of 0.21 ppm. This is a well-
established method that has been performed many times. A modification of the method,
performed 13 times, gives a std. dev. of 0.15 ppm. Is one method more precise?
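Worked out: the standard method’s std. dev. is treated as σ (effectively infinite degrees of freedom), while the modified method has 12. With the larger variance on top, Fcalc = (0.21)²/(0.15)² = 1.96. The table F(∞, 12) at the 95% level is 2.30; since 1.96 < 2.30, accept the null hypothesis: the modification is not demonstrably more precise.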

Statistical Treatment of Data (Chpt. 7)
Test for Outliers: Q-test
Is the outlier the result of a gross error?
For small data sets, it is best to try to collect more data; if that is not possible, apply
the Q-test:

Qexp = |xq − xn| / w

where xq is the questionable result, xn is its nearest neighbor, and w is the spread (range)
of the data set

If Qexp is greater than Qcrit, reject the questionable result: it comes from a gross error

Consider the following data set: 81, 100, 101, 102, 103. Is 81 bad at the 99% CL?
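Worked out: Qexp = |81 − 100| / (103 − 81) = 19/22 = 0.86. The critical Q for n = 5 at the 99% CL is 0.821 (from a Q table); since 0.86 > 0.821, reject 81 as the product of a gross error.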