
IX anal module statistics

Arithmetic mean or Average

The arithmetic mean is often referred to simply as the mean or the arithmetic average. It is
calculated by adding all the numbers in a given data set and then dividing the sum by the total number
of items in that set. For evenly spaced numbers, the arithmetic mean is equal to the
middle number. The arithmetic mean can be calculated by several methods, the choice of which
depends on the amount of data and its distribution. The general formula for
the arithmetic mean of a given data set is:

Mean (x̄) = Sum of all observations/ Number of observations


It is denoted by x̄ (read as "x bar").

Deviation

Deviation represents the difference between an individual value and the mean.

Average deviation

The average deviation is the mean of the absolute values of the individual deviations.

Relative average deviation

Relative average deviation is the average deviation divided by the mean, $(\bar{d}/\bar{x}) \times 1000$, and is usually expressed in parts per thousand.

Standard deviation

$s = \sqrt{\dfrac{d_1^2 + d_2^2 + d_3^2 + \dots + d_n^2}{n - 1}}$

Variance

Variance is the square of the standard deviation, $s^2$.
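As a rough illustration, the quantities defined above can be computed in a few lines of Python. This is only a sketch: the function names are invented for this example, and the data are the four % Ag results quoted in the Range example of this module.

```python
import math


def mean(values):
    """Arithmetic mean: sum of all observations / number of observations."""
    return sum(values) / len(values)


def average_deviation(values):
    """Mean of the absolute deviations of the values from their mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def relative_average_deviation_ppt(values):
    """Average deviation divided by the mean, in parts per thousand."""
    return average_deviation(values) / mean(values) * 1000


def variance(values):
    """Sum of the squared deviations divided by (n - 1)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


# % Ag results taken from the silver alloy example in this module.
silver = [16.37, 16.29, 16.39, 16.35]

print(f"mean                       = {mean(silver):.2f}")
print(f"average deviation          = {average_deviation(silver):.3f}")
print(f"relative average deviation = {relative_average_deviation_ppt(silver):.1f} ppt")
print(f"standard deviation         = {standard_deviation(silver):.3f}")
print(f"variance                   = {variance(silver):.5f}")
```

For these four results the mean works out to 16.35 and the average deviation to 0.03, so the relative average deviation is a little under 2 parts per thousand.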

Median

The median is the middle value of a set of results arranged in order of magnitude. For an odd number of values, the middle
value is the median; for an even number of values, the average of the two middle values is the
median.

Ex. For the odd set of values 18, 19, 20, 21, 22: Median M = 20

For the even set of values 18, 19, 20, 21, 22, 23: Median M = 20.5
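The odd/even rule can be sketched in Python as follows. The function name is chosen for this example only; Python's standard library also offers statistics.median, which follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([18, 19, 20, 21, 22]))      # odd number of values  -> 20
print(median([18, 19, 20, 21, 22, 23]))  # even number of values -> 20.5
```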

Range

Range is the arithmetical difference between the smallest and the largest values of a series.
Ex. Analysis of silver alloy revealed the following % of silver:

Sample No.    1        2        3        4
% Ag          16.37    16.29    16.39    16.35

Highest value = 16.39; lowest value = 16.29; Range = 0.10

Standard deviation

*A quantity expressing by how much the members of a group differ from the mean value for
the group.

*In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set
of values. A low standard deviation indicates that the values tend to be close to the mean of the
set, while a high standard deviation indicates that the values are spread out over a wider range.

*The standard deviation is a statistic that measures the dispersion of a dataset relative to its
mean and is calculated as the square root of the variance.

The confidence limit

Calculation of the standard deviation for a set of data provides an indication of the precision
inherent in a particular procedure or analysis. But, unless there is a large number of data, it does
not by itself give any information about how close the experimentally determined mean x̄ might
be to the true mean value. Statistical theory, though, allows us to estimate the range within which
the true value might fall with a given probability, defined by the experimental mean and the
standard deviation. This range is called the confidence interval, and the limits of this range are
called the confidence limits. The likelihood that the true value falls within the range is called
the probability or the confidence level, usually expressed as a percent. The confidence limit is
given by

Confidence limit = $\bar{x} \pm \dfrac{t s}{\sqrt{N}}$


where t is a statistical factor that depends on the number of degrees of freedom and the
confidence level desired. The number of degrees of freedom is one less than the number of
measurements. Values of t at different confidence levels and degrees of freedom are given in
the data table. Note that the confidence limit is simply the product of t and the standard deviation of
the mean, $s/\sqrt{N}$. The confidence limit of a single observation x (N = 1) is given by $x \pm t s$, which is
larger than that of the mean by a factor of $\sqrt{N}$; here t is taken for the number of measurements used to
determine s.
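A minimal sketch of the confidence-limit calculation is given below. The function name is hypothetical, the data are the % Ag results from the Range example, and the t value of 3.182 is the commonly tabulated two-sided value for 95% confidence and 3 degrees of freedom; it should be checked against whatever t table is actually in use.

```python
import math
import statistics


def confidence_limits(values, t):
    """Return (mean, half_width) so that the limits are mean +/- half_width.

    t must be read from a t table for len(values) - 1 degrees of freedom
    at the desired confidence level.
    """
    n = len(values)
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation, (N - 1) in the denominator
    half_width = t * s / math.sqrt(n)
    return x_bar, half_width


silver = [16.37, 16.29, 16.39, 16.35]  # % Ag, from the Range example
T_95_DF3 = 3.182                       # assumed table value: 95%, 3 degrees of freedom

x_bar, half_width = confidence_limits(silver, T_95_DF3)
print(f"% Ag = {x_bar:.2f} +/- {half_width:.2f} at the 95% confidence level")
```

With only four results the half-width is noticeably larger than s divided by the square root of N, because t is large when few degrees of freedom are available.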

Tests of significance

In developing a new analytical method, it is often desirable to compare the results of this
method with those of an accepted, perhaps standard, method. How can one tell whether there is a
significant difference between the new method and the accepted one? Again, we resort to
statistics for the answer.

Deciding whether one set of results is significantly different from another depends not only
on the difference in the means, but also on the amount of data available and the spread. As the
number of measurements increases, both t and $1/\sqrt{N}$ decrease, with the result that the confidence
interval is narrowed. The more measurements we make, the more confident we are that the true value
lies within a given range or, conversely, the narrower the range becomes at a given confidence
level.

The F test

This is a test designed to indicate whether there is a significant difference between two
methods based on their standard deviations. F is defined in terms of the variances of the two
methods, where the variance is the square of the standard deviation:

$F = \dfrac{s_1^2}{s_2^2}$, where $s_1^2 > s_2^2$.

There are two different degrees of freedom, $\nu_1$ and $\nu_2$, where the degrees of freedom are defined as
(N - 1) in each case.

If the calculated F value from the above relation exceeds the tabulated F value at the selected
confidence level, then there is a significant difference between the variances of the two methods.
A list of F values at the 95% confidence level is presented in the table.
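The F calculation might be sketched as below. The function name and both data sets are made up for illustration, and the tabulated value of 6.39 (95% level, 4 and 4 degrees of freedom) is an assumption to be checked against the F table in use.

```python
import statistics


def f_ratio(method_a, method_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a = statistics.variance(method_a)  # s^2 with (N - 1) in the denominator
    var_b = statistics.variance(method_b)
    if var_a >= var_b:
        return var_a / var_b, len(method_a) - 1, len(method_b) - 1
    return var_b / var_a, len(method_b) - 1, len(method_a) - 1


# Hypothetical replicate results from a new method and an accepted method.
new_method = [10.2, 10.5, 10.1, 10.6, 10.3]
accepted_method = [10.3, 10.4, 10.3, 10.4, 10.4]

F_TABLE_95 = 6.39  # assumed table value for 4 and 4 degrees of freedom

f_calc, df_num, df_den = f_ratio(new_method, accepted_method)
print(f"F = {f_calc:.2f} with {df_num} and {df_den} degrees of freedom")
if f_calc > F_TABLE_95:
    print("The variances differ significantly at the 95% level.")
else:
    print("No significant difference between the variances at the 95% level.")
```

Putting the larger variance in the numerator keeps F at or above 1, which is what the tabulated values assume.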

The Student's t test

The t test is used to determine whether two sets of measurements are statistically different.

Frequently, the analyst wishes to decide whether there is a statistical difference between the
results obtained using two different procedures, i.e., whether they both indeed measure the same
thing. The t test is very useful for such comparisons.

In this method, comparison is made between two sets of replicate measurements made by two
different methods. One of them will be the test method and the other will be an accepted method.
A statistical t value is calculated and compared with a tabulated value for the given number of
tests at the desired confidence level. If the calculated t value exceeds the tabulated t value, then
there is a significant difference between the results by the two methods at that confidence level.
If it does not exceed the tabulated value, then there is no statistically significant
difference between the methods. This in no way implies that the two results are identical.

Three ways in which a t test can be used will be described. If an accepted value of the mean is
available from other measurements, then the test can be used to determine whether a particular analysis
method gives results statistically equal to the true value at a given confidence level. If an
accepted value is not available, then a series of replicate analyses on a single sample may be
performed using two methods, or a series of analyses may be performed on a set of different
samples by the two methods. One method should be an accepted method.

(i) t test when an accepted value is known

True value = $\bar{x} \pm \dfrac{t s}{\sqrt{N}}$

It follows that $\pm t = (\bar{x} - \text{true value}) \dfrac{\sqrt{N}}{s}$

If a good estimate of the true value is available from other analyses (for example, from a
National Institute of Standards and Technology (NIST) standard reference material or, the
ultimate in chemical analysis, an atomic weight), then the above relation can be used to
determine whether the value obtained from a test method is statistically equal to the accepted
value.

As the precision is improved, i.e., as s becomes smaller, the calculated t becomes larger.
Thus, there is a greater chance that the tabulated t value will be less than the calculated one; that is, as the precision
improves, it is easier to distinguish nonrandom differences.
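The comparison with an accepted value can be sketched as follows. The function name and the accepted value of 16.30 are hypothetical; the measurements are the % Ag results from the Range example, and 3.182 is the commonly tabulated t for 95% confidence and 3 degrees of freedom.

```python
import math
import statistics


def t_against_accepted(values, accepted_value):
    """Calculated t = |mean - accepted value| * sqrt(N) / s."""
    n = len(values)
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)  # (N - 1) in the denominator
    return abs(x_bar - accepted_value) * math.sqrt(n) / s


silver = [16.37, 16.29, 16.39, 16.35]  # % Ag, from the Range example
ACCEPTED = 16.30                       # hypothetical accepted value
T_TABLE_95_DF3 = 3.182                 # assumed table value: 95%, 3 degrees of freedom

t_calc = t_against_accepted(silver, ACCEPTED)
print(f"calculated t = {t_calc:.2f}")
if t_calc > T_TABLE_95_DF3:
    print("Significant difference from the accepted value at the 95% level.")
else:
    print("No significant difference from the accepted value at the 95% level.")
```

Because s sits in the denominator, a more precise method gives a larger calculated t for the same difference in means, which is the point made in the paragraph above.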

(ii) comparison of the means of two samples

When the t test is applied to two sets of data, the true value in the above relation is replaced by
the mean of the second set. The reciprocal of the standard deviation of the mean, $\sqrt{N}/s$, is
replaced by that of the difference between the two means, which is readily shown to be
$\sqrt{N_1 N_2/(N_1 + N_2)}\,/\,s_p$,
where $s_p$ is the pooled standard deviation of the individual measurements of the two sets.

$\pm t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p} \sqrt{\dfrac{N_1 N_2}{N_1 + N_2}}$

In applying the t test between the two methods, it is assumed that both methods have essentially
the same standard deviation.
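A sketch of the two-sample calculation follows. The text does not spell out how the pooled standard deviation is obtained, so the usual definition is assumed here: the squared deviations of each set from its own mean are summed and divided by (N1 + N2 - 2). The function names and data are hypothetical, and 2.306 is the commonly tabulated t for 95% confidence and 8 degrees of freedom.

```python
import math
import statistics


def pooled_std(set_1, set_2):
    """Pooled standard deviation (assumed definition, N1 + N2 - 2 degrees of freedom)."""
    mean_1 = statistics.mean(set_1)
    mean_2 = statistics.mean(set_2)
    ss_1 = sum((x - mean_1) ** 2 for x in set_1)
    ss_2 = sum((x - mean_2) ** 2 for x in set_2)
    return math.sqrt((ss_1 + ss_2) / (len(set_1) + len(set_2) - 2))


def t_two_means(set_1, set_2):
    """Calculated t = |mean1 - mean2| / sp * sqrt(N1 * N2 / (N1 + N2))."""
    n1, n2 = len(set_1), len(set_2)
    difference = abs(statistics.mean(set_1) - statistics.mean(set_2))
    return difference / pooled_std(set_1, set_2) * math.sqrt(n1 * n2 / (n1 + n2))


# Hypothetical replicate results on the same sample by two methods.
new_method = [10.2, 10.5, 10.1, 10.6, 10.3]
accepted_method = [10.3, 10.4, 10.3, 10.4, 10.4]
T_TABLE_95_DF8 = 2.306  # assumed table value: 95%, 8 degrees of freedom

t_calc = t_two_means(new_method, accepted_method)
print(f"calculated t = {t_calc:.2f}")
print("significant difference" if t_calc > T_TABLE_95_DF8 else "no significant difference")
```

Since the test assumes comparable standard deviations, it is sensible to run the F test on the two sets first.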

(iii) paired t test

In the clinical chemistry laboratory, a new method is frequently tested against an accepted
method by analyzing several different samples of slightly varying composition within the
physiological range. In this case, the t value is calculated in a slightly different form. The
difference between the paired measurements on each sample is computed. An average
difference $\bar{D}$ is calculated, and the individual deviations of each difference from $\bar{D}$ are used to compute a
standard deviation, $s_d$. The t value is calculated from $t = \dfrac{\bar{D}}{s_d}\sqrt{N}$, where $\bar{D}$ represents the
mean of all the individual differences.
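The paired calculation could be sketched as below. The function name and the six pairs of results are invented for illustration; the standard deviation of the differences is assumed to use (N - 1) in the denominator, like s elsewhere in this module, and 2.571 is the commonly tabulated t for 95% confidence and 5 degrees of freedom.

```python
import math
import statistics


def paired_t(method_a, method_b):
    """Calculated t = |mean difference| * sqrt(N) / s_d for paired results."""
    if len(method_a) != len(method_b):
        raise ValueError("paired data need the same number of results per method")
    differences = [a - b for a, b in zip(method_a, method_b)]
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # assumed (N - 1) denominator
    return abs(d_bar) * math.sqrt(n) / s_d


# Hypothetical results on six different samples, one pair per sample.
new_method = [10.2, 12.7, 8.6, 17.5, 11.2, 11.5]
accepted_method = [10.5, 11.9, 8.7, 16.9, 10.9, 11.1]
T_TABLE_95_DF5 = 2.571  # assumed table value: 95%, 5 degrees of freedom

t_calc = paired_t(new_method, accepted_method)
print(f"calculated t = {t_calc:.2f}")
print("significant difference" if t_calc > T_TABLE_95_DF5 else "no significant difference")
```

Working with the differences removes the sample-to-sample variation in composition, so only the disagreement between the two methods is tested.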

Statistics for small data sets

Large-population statistics do not strictly apply to small populations.

*The median may be a better representative of the true value than the mean for small numbers of
measurements.

*The range may be used instead of the standard deviation as a measure of spread.

Rejection of a result

When a series of replicate analyses is performed, one of the results may be
abnormal, i.e., it may differ markedly from the others. The following rules help decide whether to reject the
result or to retain it.

I. Rule based on average deviation

To apply this rule, first calculate the mean and the average deviation of the good results.
Then determine the deviation of the suspected result from the mean of the good ones. If the
deviation of the suspected value x from this mean is at least four times the
average deviation, then rejection of the result is justified.
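The average-deviation rule can be sketched as follows. The function name and the data are hypothetical; the factor of four is the one stated in the rule.

```python
def reject_by_average_deviation(values, suspect, factor=4.0):
    """True if the suspect lies at least `factor` average deviations from the mean
    of the remaining (good) results."""
    good = list(values)
    good.remove(suspect)  # set aside one occurrence of the suspected result
    good_mean = sum(good) / len(good)
    average_dev = sum(abs(x - good_mean) for x in good) / len(good)
    return abs(suspect - good_mean) >= factor * average_dev


# Hypothetical replicate results; 11.0 is the suspected value.
results = [10.0, 10.2, 10.1, 10.1, 11.0]
print(reject_by_average_deviation(results, 11.0))
```

Here the four good results have a mean of about 10.1 and an average deviation of about 0.05, so a result 0.9 away from the mean is rejected.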

II. Rule based on the Range

The Q test is applied as follows:

(i) The data are arranged in decreasing order.
(ii) The range, w, of the results is calculated.
(iii) The difference, a, between the suspected result and its nearest neighbor is found.
(iv) The difference obtained in step (iii) is divided by the range from step (ii) to obtain the rejection
quotient, Q = a/w.
(v) The computed value of Q is compared with the value in the
Q table. If the value of Q is greater than or equal to the value in the Q table, then the
suspected result can be discarded.

Critical Values for the Rejection Quotient Q

Number of         Qcrit (Reject if Qexp > Qcrit)
Observations      90% Confidence    95% Confidence    99% Confidence
3                 0.941             0.970             0.994
4                 0.765             0.829             0.926
5                 0.642             0.710             0.821
6                 0.560             0.625             0.740
7                 0.507             0.568             0.680
8                 0.468             0.526             0.634
9                 0.437             0.493             0.598
10                0.412             0.466             0.568
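The Q test steps, together with the critical values tabulated above, can be sketched in Python. The function and variable names are chosen for this example; the first data set is the % Ag series from the Range example and the second is hypothetical. The comparison uses "greater than or equal to", as in step (v); the table heading uses a strict "greater than", and the two only differ when Q falls exactly on the critical value.

```python
# Critical values copied from the table above: {observations: {confidence %: Qcrit}}
Q_CRIT = {
    3: {90: 0.941, 95: 0.970, 99: 0.994},
    4: {90: 0.765, 95: 0.829, 99: 0.926},
    5: {90: 0.642, 95: 0.710, 99: 0.821},
    6: {90: 0.560, 95: 0.625, 99: 0.740},
    7: {90: 0.507, 95: 0.568, 99: 0.680},
    8: {90: 0.468, 95: 0.526, 99: 0.634},
    9: {90: 0.437, 95: 0.493, 99: 0.598},
    10: {90: 0.412, 95: 0.466, 99: 0.568},
}


def q_quotient(values, suspect):
    """Rejection quotient Q = a / w for a suspect at either end of the data."""
    ordered = sorted(values, reverse=True)  # step (i): decreasing order
    w = ordered[0] - ordered[-1]            # step (ii): range
    if suspect == ordered[0]:               # step (iii): gap to nearest neighbor
        a = ordered[0] - ordered[1]
    elif suspect == ordered[-1]:
        a = ordered[-2] - ordered[-1]
    else:
        raise ValueError("the suspected result must be the highest or lowest value")
    return a / w                            # step (iv)


def reject_by_q(values, suspect, confidence=90):
    """Step (v): compare Q with the tabulated value for this number of observations."""
    return q_quotient(values, suspect) >= Q_CRIT[len(values)][confidence]


silver = [16.37, 16.29, 16.39, 16.35]  # % Ag, from the Range example
print(f"Q = {q_quotient(silver, 16.29):.2f}, reject: {reject_by_q(silver, 16.29)}")

hypothetical = [10.0, 10.1, 10.2, 11.0]
print(f"Q = {q_quotient(hypothetical, 11.0):.2f}, reject: {reject_by_q(hypothetical, 11.0)}")
```

For the silver series the lowest result gives Q of about 0.6, below the 90% critical value of 0.765 for four observations, so it is retained.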
