
CHAPTER FIVE

ANALYSIS OF VARIANCE (ANOVA)


Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two
or more population (or treatment) means by examining the variances of samples that are taken.
ANOVA allows one to determine whether the differences between the samples are simply due
to random error (sampling errors) or whether there are systematic treatment effects that cause
the mean in one group to differ from the mean in another.
ANOVA is most often used to compare three or more means. When the means of just two samples
are compared using ANOVA, the result is equivalent to using a t-test (with equal variances
assumed) to compare the means of independent samples.
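The two-sample equivalence can be checked numerically. The sketch below is illustrative only: the data values and the function names `pooled_t` and `f_two_groups` are hypothetical and not taken from the text. It assumes the pooled-variance (equal variances) form of the t-test and shows that the ANOVA F statistic for two groups equals the square of that t statistic.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic, assuming equal population variances."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def f_two_groups(a, b):
    """One-way ANOVA F statistic for exactly two groups."""
    na, nb = len(a), len(b)
    grand = mean(list(a) + list(b))
    # With two groups the between-groups d.f. is 1, so MS_between = SS_between.
    ms_between = na * (mean(a) - grand) ** 2 + nb * (mean(b) - grand) ** 2
    ms_within = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return ms_between / ms_within

# Hypothetical data, for illustration only.
group_a = [4.0, 6.0, 8.0, 5.0, 7.0]
group_b = [9.0, 11.0, 10.0, 12.0, 8.0]

t = pooled_t(group_a, group_b)
f = f_two_groups(group_a, group_b)
print(round(t ** 2, 6), round(f, 6))  # the two printed values agree
```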
ANOVA is based on comparing the variance (or variation) between the data samples to the variation
within each particular sample. If the between variation is much larger than the within variation,
this is evidence that the population means are not all equal. If the between and within
variations are approximately the same size, then there is no significant difference between the
sample means.
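In symbols, the comparison is usually expressed as a ratio of two mean squares. The notation here is an assumption, not taken from the text: k is the number of samples, n_i the size of sample i, N the total number of observations, x̄_i and s_i² the mean and variance of sample i, and x̄ the grand mean.

```latex
\[
F = \frac{MS_B}{MS_W}, \qquad
MS_B = \frac{\sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2}{k-1}, \qquad
MS_W = \frac{\sum_{i=1}^{k} (n_i - 1)\,s_i^2}{N-k}
\]
```

A value of F near 1 is consistent with equal population means; a large value of F points to a difference among them.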
Assumptions of ANOVA:
(i) All populations involved follow a normal distribution.
(ii) All populations have the same variance or standard deviation (homogeneity of variance).
(iii) The samples are randomly selected and independent of one another.

Since ANOVA assumes the populations involved follow a normal distribution, ANOVA falls into a
category of hypothesis tests known as parametric tests. If the populations involved did not
follow a normal distribution, an ANOVA test could not be used to examine the equality of the
population means. Instead, one would have to use a non-parametric test (or distribution-free
test), which is a more general form of hypothesis testing that does not rely on distributional
assumptions.
Analysis of variance (ANOVA), as the name implies, is a statistical technique that is intended to
analyze variability in data in order to infer the inequality among population means.

Characteristics of the F-Distribution


• There is a "family" of F distributions.
• Each member of the family is determined by two parameters: the numerator degrees of
freedom and the denominator degrees of freedom.
• The F distribution is continuous, and its values cannot be negative.
• The F distribution is positively skewed.
• Its values range from 0 to ∞. As F → ∞, the curve approaches the X-axis but never
touches it.
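These characteristics can be seen by evaluating the F density for a few members of the family. The sketch below uses the standard closed-form density and only the Python standard library; the function name `f_pdf` and the chosen degrees of freedom are illustrative assumptions.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 (numerator) and d2 (denominator) d.f."""
    if x <= 0:
        # No probability lies below 0; the boundary value at x = 0 depends on d1
        # and is simply reported as 0 in this sketch.
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (
        (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
        - log_beta
    )
    return math.exp(log_pdf)

# Each (d1, d2) pair selects a different member of the family.
for d1, d2 in [(2, 15), (5, 10), (10, 30)]:
    row = [round(f_pdf(x, d1, d2), 4) for x in (0.5, 1.0, 2.0, 4.0, 8.0)]
    print(f"F({d1}, {d2}):", row)
```

The printed densities stay positive and shrink toward 0 as x grows, which matches the long right tail described above.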

Example: A state employee wishes to see if there is a significant difference in the number of
employees at the interchanges of three state toll roads. The data are shown. At α = 0.05, can it
be concluded that there is a significant difference in the average number of employees at each
interchange?

State X    State Y    State Z
   7         10          1
  14          1         12
  32          1          1
  19          0          9
  10         11          1
  11          1         11

Solution
Step 1: State the hypotheses and identify the claim.
H0: μ1 = μ2 = μ3
H1: At least one mean is different from the others (claim).
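The remaining steps (finding the critical value, computing the test value, and making the decision) might be carried out as in the following sketch. It is not the source's worked solution: the function name `one_way_anova` is hypothetical, and the critical value 3.68 is read from a standard F table for α = 0.05 with 2 and 15 degrees of freedom.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups variation: how far each sample mean lies from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variation: pooled spread inside each sample.
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

state_x = [7, 14, 32, 19, 10, 11]
state_y = [10, 1, 1, 0, 11, 1]
state_z = [1, 12, 1, 9, 1, 11]

f_stat, df1, df2 = one_way_anova([state_x, state_y, state_z])
F_CRITICAL = 3.68  # assumption: F table value for alpha = 0.05, d.f. = (2, 15)

print(f"F = {f_stat:.2f} with d.f. ({df1}, {df2})")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```

With these data the test value is about 5.04, which exceeds 3.68, so the null hypothesis of equal means is rejected at α = 0.05.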

Exercise: A researcher wishes to try three different techniques to lower the blood pressure of
individuals diagnosed with high blood pressure. The subjects are randomly assigned to three
groups; the first group takes medication, the second group exercises, and the third group
follows a special diet. After four weeks, the reduction in each person’s blood pressure is
recorded. At α = 0.05, test the claim that there is no difference among the means. The data are
shown.
Medication    Exercise    Diet
    10            6         5
    12            8         9
     9            3        12
    15            0         8
    13            2         4

Exercise 2: XYZ Restaurants specialize in meals for families. Ato Yohannes, the manager,
recently developed a new meat loaf dinner. Before making it a part of the regular menu, he
decides to test it in several of his restaurants. He would like to know if there is a difference in
the mean number of dinners sold per day at the Soddo, Hawassa, and Arbaminch restaurants.
Use the 0.05 significance level.

Number of Dinners Sold by Restaurant

Day Sodo Hawassa Arbaminch

Day 1 13 10 18
Day 2 12 12 16
Day 3 14 13 17
Day 4 12 11 17
Day 5 17

Exercise 3: Muger cement factory operates 24 hours a day, five days a week. The workers
rotate shifts each week. Bulcha Tufa, the manager, is interested in whether there is a
difference in the number of units produced when the employees work on various shifts. A
sample of five workers is selected and their output recorded on each shift. At the .05
significance level, can we conclude there is a difference in the mean production by shift and in
the mean production by employee?

Employee    Day Output    Evening Output    Night Output
Beti            31              25               35
Ahmed           33              26               33
Tolcha          28              24               30
Hiwot           30              29               28
Tesfaye         28              26               27

Variance test
A chi-square test can be used to test if the variance of a population is equal to a specified
value. This test can be either a two-sided test or a one-sided test. The two-sided version tests
against the alternative that the true variance is either less than or greater than the specified
value. The one-sided version tests in only one direction. The choice of a two-sided or one-sided
test is determined by the problem. For example, if we are testing a new process, we may only be
concerned if its variability is greater than the variability of the current process.
Steps in variance test
• Step 1: State the null hypothesis and the alternative hypothesis.
• Step 2: Find the critical value(s).
• Step 3: Compute the test value.
• Step 4: Make the decision.
• Step 5: Summarize the results.
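The test value in Step 3 is usually computed with the standard chi-square statistic for a single variance. The symbols are assumptions, not taken from the text: n is the sample size, s² the sample variance, and σ₀² the variance specified in the null hypothesis.

```latex
\[
\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2}, \qquad \text{d.f.} = n - 1
\]
```

The critical value(s) in Step 2 are read from a chi-square table with n − 1 degrees of freedom: one value for a one-sided test, two values for a two-sided test.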
Example: An instructor wishes to see whether the variation in scores of the 23 students in her
class is less than the variance of the population. The variance of the class is 198. The variance
of the population is 225. Is there enough evidence to support the claim that the variation of the
students is less than the population variance at α = 0.05? Assume that the scores are normally
distributed.
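A minimal sketch of the left-tailed test for this example follows. The function name `chi_square_statistic` is hypothetical, and the critical value 12.338 is read from a standard chi-square table (22 degrees of freedom, area 0.05 in the left tail); it is not given in the text.

```python
def chi_square_statistic(n, sample_variance, hypothesized_variance):
    """Test value for a single-variance chi-square test."""
    return (n - 1) * sample_variance / hypothesized_variance

n, s2, sigma2_0 = 23, 198, 225
chi2 = chi_square_statistic(n, s2, sigma2_0)

CRITICAL_LEFT = 12.338  # assumption: chi-square table, d.f. = 22, left-tail area 0.05

# Left-tailed test: reject H0 only if the test value falls below the critical value.
print(f"chi-square = {chi2:.2f}")
print("Reject H0" if chi2 < CRITICAL_LEFT else "Do not reject H0")
```

The test value is 19.36, which is not below 12.338, so there is not enough evidence to support the claim that the class variance is less than the population variance.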

Example: A hospital administrator believes that the standard deviation of the number of
people using outpatient surgery per day is greater than 8. A random sample of 15 days is
selected. The data are shown. At α = 0.10, is there enough evidence to support the
administrator’s claim? Assume the variable is normally distributed.

25 30 5 15 18 42 16 9 10 12 12 38 8 14 27

First, calculate the sample variance.
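The calculation can be sketched as follows, using the sample variance with divisor n − 1. The critical value 21.064 is an assumption read from a standard chi-square table (14 degrees of freedom, area 0.10 in the right tail); the variable names are illustrative.

```python
from statistics import variance

surgeries = [25, 30, 5, 15, 18, 42, 16, 9, 10, 12, 12, 38, 8, 14, 27]
n = len(surgeries)

s2 = variance(surgeries)   # sample variance (divisor n - 1)
sigma_0 = 8                # hypothesized standard deviation from the claim
chi2 = (n - 1) * s2 / sigma_0 ** 2

CRITICAL_RIGHT = 21.064  # assumption: chi-square table, d.f. = 14, right-tail area 0.10

# Right-tailed test: reject H0 only if the test value exceeds the critical value.
print(f"s^2 = {s2:.2f}, chi-square = {chi2:.2f}")
print("Reject H0" if chi2 > CRITICAL_RIGHT else "Do not reject H0")
```

The sample variance is about 125.5 (a standard deviation of about 11.2) and the test value is about 27.45, which exceeds 21.064, so the data support the claim that the standard deviation is greater than 8.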

Example 3: A cigarette manufacturer wishes to test the claim that the variance of the nicotine
content of its cigarettes is 0.644. Nicotine content is measured in milligrams; assume that
it is normally distributed. A sample of 20 cigarettes has a standard deviation of 1.00 milligram.
At α = 0.05, is there enough evidence to reject the manufacturer's claim?
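Because the claim is that the variance equals 0.644, this is a two-sided test with two critical values. The sketch below assumes the values 8.907 and 32.852, read from a standard chi-square table (19 degrees of freedom, area 0.025 in each tail); they are not given in the text.

```python
n, s, sigma2_0 = 20, 1.00, 0.644

# Test value: (n - 1) * s^2 / hypothesized variance.
chi2 = (n - 1) * s ** 2 / sigma2_0

# Assumption: chi-square table, d.f. = 19, area 0.025 in each tail.
LOWER, UPPER = 8.907, 32.852

# Two-sided test: reject H0 if the test value falls in either tail.
reject = chi2 < LOWER or chi2 > UPPER
print(f"chi-square = {chi2:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```

The test value is about 29.50, which lies between the two critical values, so there is not enough evidence to reject the manufacturer's claim.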

Exercise 1: In a random sample, the amount of time which 18 women took to complete the
written test for their driver licenses has standard deviation 2.1 minutes. Do the data give
sufficient evidence that the population variance σ² is significantly less than 6.25?
(Assume normality.)

Exercise 2: A researcher claims that the standard deviation of the number of deaths annually
from tornadoes in the United States is less than 35. If a sample of 11 randomly selected years
had a standard deviation of 32, is the claim believable? Use α = 0.05.
