CHAPTER 9
Analysis of variance
9.1 Introduction
9.2 Analysis of variance method for two treatments
9.3 One-way Analysis of variance, completely randomized design
9.4 Two-way Analysis of variance, randomized complete block design
9.5 Multiple comparisons
9.6 Chapter Summary
9.7 Computer examples
Projects for chapter 9

EXERCISES 9.2

9.2.1.
(a) We need to test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$.
From the random sample, we obtain the following needed estimates ,
, , , ,

, ,

Where Total , then


, , and

With ,
Since $F = 3.0843$ is not greater than the critical value 4.49, $H_0$ is not rejected.
There is not enough evidence to indicate that the means differ for the two populations.

(b) , ,
Then, the t-statistic is

Now, , and the rejection region is


Since the observed $t$ is not less than the critical value, $H_0$ is not rejected, which implies that there is
no significant difference between the means of the two populations. Moreover, $t^2 = F$, implying that
in the two-sample case, the t-test and the F-test lead to the same result.

9.2.2
(a) We need to test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$.
From the random sample, we obtain the following needed estimates

, , , , ,

, ,
,where ,then

, , and

At ,
Since the observed $F$ is greater than the critical value, $H_0$ is rejected. Based on
the data, there is evidence to suggest a difference in the means of the two populations.

(b) , ,
Then, the t-statistic is

Now, , and the rejection region is


Since the observed $t$ is less than the critical value, $H_0$ is rejected, which implies that there is
a significant difference between the means of the two populations. Moreover, $t^2 = F$, implying
that, in the two-sample case, the t-test and the F-test lead to the same result.

9.2.3.
(a) At $\alpha = 0.01$, we need to test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$.
We need the estimates , , ,

, ,

and since , SSE = 294.60897.
Then MST = SST/1, MSE = SSE/($n_1 + n_2 - 2$), and F = MST/MSE = 6.898. At $\alpha = 0.01$, the critical value is 7.881134. Because 6.898 is not greater than 7.881134, $H_0$ is not rejected.

Therefore, there is not enough evidence at the 1% significance level to suggest that the mean relief times
of the two medicines are significantly different.

(b) Assumptions: The samples are assumed to be independent random samples from normal populations with
respective means $\mu_1$ and $\mu_2$ and equal but unknown variances.
(c) , , . Then, the t-statistic is:

Now, the critical value is 2.807, and the rejection region is $|t| > 2.807$. Since 2.6263 is not greater
than 2.807, $H_0$ is not rejected at the 1% level, which implies that there is no significant difference
between the mean times to relief for the two populations. Moreover, $t^2 = (2.6263)^2 \approx 6.90 \approx F$, which shows that in the two-sample
case, the t-test and the F-test lead to the same result.
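
As a quick numerical check of this agreement, the sketch below squares the t quantities quoted in part (c) and compares them with the F quantities from part (a). It uses only the rounded values printed in this solution, so the match is expected only up to rounding; the variable names are illustrative.

```python
# Values quoted in the solution to Exercise 9.2.3.
t_stat = 2.6263      # pooled two-sample t-statistic from part (c)
f_stat = 6.898       # ANOVA F-statistic from part (a)
t_crit = 2.807       # two-sided t critical value at the 1% level
f_crit = 7.881134    # F critical value at the 1% level

# Squaring the t quantities should reproduce the F quantities up to rounding.
print(round(t_stat ** 2, 3), f_stat)
print(round(t_crit ** 2, 3), f_crit)

# Both tests must reach the same decision about H0.
reject_by_t = abs(t_stat) > t_crit
reject_by_f = f_stat > f_crit
print(reject_by_t, reject_by_f)
```

Both comparisons give the same decision because the F rejection region is the t rejection region squared.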

9.2.4. (a) We need to test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$, where $\mu_1$ is the mean SAT score for
math in 1989 and $\mu_2$ is the mean SAT score for math in 1999.
From the random sample, we obtain the following needed estimates
, , , , ,

, ,

,where ,then

, , and

Since the observed $F$ is not greater than the critical value, $H_0$ is not rejected.

Based on the data, there is not enough evidence to indicate that the mean SAT score for math
in 1999 is different from that in 1989.

(b) , ,
Then, the t-statistic is

Now, , and the rejection region is


Since the observed $t$ does not fall in the rejection region, $H_0$ is not rejected, which implies that there is
no significant difference between the means of the two populations, in agreement with part (a). Moreover, $t^2 = F$, implying
that in the two-sample case, the t-test and the F-test lead to the same result.

9.2.5. Let , with for , and let , with for


, be two sets of independent random variables. To test vs.

we reject when . Now, for ANOVA, with , we have

, since

Therefore . Then, we reject if .

Since and for appropriate

values and , the probabilities of these events are the same. Hence, the two-sample t-test and the
analysis of variance are equivalent for testing vs. .
Note: In the text, is , is , is , and is .
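
The equivalence can also be checked numerically. The following is a minimal sketch, assuming two small hypothetical samples (the data, `pooled_t`, and `anova_f` are illustrative and not taken from the text): it computes the pooled two-sample t-statistic and the one-way ANOVA F-statistic with two treatments, and compares $t^2$ with $F$.

```python
from math import sqrt

# Hypothetical samples, not taken from any exercise.
x1 = [12.1, 14.3, 11.8, 13.5, 12.9]
x2 = [15.2, 13.9, 16.1, 14.8]


def mean(xs):
    return sum(xs) / len(xs)


def pooled_t(a, b):
    """Two-sample t-statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    sp2 = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / sqrt(sp2 * (1 / na + 1 / nb))


def anova_f(groups):
    """One-way ANOVA F-statistic, MST / MSE."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    sst = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (sst / (k - 1)) / (sse / (n_total - k))


t = pooled_t(x1, x2)
f = anova_f([x1, x2])
print(t ** 2, f)  # the two numbers agree up to floating-point error
```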

EXERCISES 9.3

9.3.1.

(a) We assume that the samples come from populations that are normally distributed with
equal variances and means $\mu_1, \mu_2, \mu_3$. In our case, , ,

, , , ,

, , ,

, or ,

, ,

, ,

. At
Therefore, the ANOVA table is

ANOVA table
Source of variation   Degrees of freedom   Sum of squares   Mean square   F-statistic   p-value
Treatments            2                    10206            5103          1.0972        0.37462
Error                 9                    41859            4651
Total                 11                   52065

From the table, since the p-value is more than 0.05, we do not reject the null
hypothesis at $\alpha = 0.05$.
(b) Let $H_0$: the mean auto insurance premium paid per six months by all drivers
insured by each of these companies is the same. Based on the data, there is not enough evidence to
suggest that the mean auto insurance premiums paid per six months by all drivers insured
by these companies are different.
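
The entries of the ANOVA table can be re-derived from the sums of squares and degrees of freedom alone. The sketch below does this for the values reported in part (a); the helper name `anova_rows` is illustrative, and the p-value is not recomputed because that would need an F-distribution routine.

```python
def anova_rows(ss_treat, df_treat, ss_error, df_error):
    """Fill in mean squares, F, and totals from sums of squares and df."""
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "MST": ms_treat,
        "MSE": ms_error,
        "F": ms_treat / ms_error,
        "SS_total": ss_treat + ss_error,
        "df_total": df_treat + df_error,
    }


# Sums of squares and degrees of freedom from the table for Exercise 9.3.1.
table = anova_rows(ss_treat=10206, df_treat=2, ss_error=41859, df_error=9)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```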

9.3.2.

(a) We assume that the samples come from populations that are normally distributed with

equal variances and means $\mu_1, \mu_2, \mu_3$. In our case , , ,

, ,

, ,

. At ,

Therefore, the ANOVA Table is


ANOVA table
Source of variation   Degrees of freedom   Sum of squares   Mean square   F-statistic   p-value
Treatments            2                    339.733          169.867       0.7319        0.5013
Error                 12                   2785.2           232.1
Total                 14                   3124.93
From the table, since the p-value is more than 0.05, we do not reject the null
hypothesis at $\alpha = 0.05$.

(b) Let $H_0$: the mean scores for the three persons teaching are the same. Based on
the data, there is not enough evidence to suggest that the mean scores for the three persons
teaching are different.

9.3.3.

because .

Therefore,

9.3.4.

Let and , then

where, are normally distributed with common variance and


then,

and, since all are independent


, where

then,

then,

9.3.5.

(a) From Exercise 9.3.4 we know that (where T stands for

“Treatment”), and SSTotal . Then,

SSE =SSTotal – SST

and

Since

Then

Therefore, ;

since it is given that .

(b) Since ~ , ~ , and since they are independent,

follows a chi-square distribution with , or ,

degrees of freedom.
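
The decomposition SSTotal = SST + SSE used in part (a), and the matching split of the degrees of freedom, can be checked on any data set. The sketch below uses hypothetical data for three treatments with unequal sample sizes; none of the numbers come from the text.

```python
# Hypothetical data: k = 3 treatments with unequal sample sizes.
groups = [
    [4.2, 5.1, 3.9, 4.8],
    [6.0, 5.5, 6.3],
    [5.0, 4.4, 4.9, 5.2, 4.6],
]

n_total = sum(len(g) for g in groups)
k = len(groups)
grand_mean = sum(sum(g) for g in groups) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Total, treatment, and error sums of squares.
ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

print(ss_total, sst + sse)                   # equal up to rounding
print(n_total - 1, (k - 1) + (n_total - k))  # degrees of freedom also add up
```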

9.3.6. Suppose each observation in a set of independent random variables is ,


for . Then, under , for , ,

are independent, and $SST/\sigma^2$, where T stands for

“Treatment”, follows a chi-square distribution with $k - 1$ degrees of freedom.

Then, . Therefore, by definition, .

9.3.7. , , , ,

, , ,

, , ,

, , or

, ,

, ,

Therefore, the ANOVA table is

(a) ANOVA table

Source of variation   Degrees of freedom   Sum of squares   Mean square   F-statistic   p-value
Treatments            3                    241              80.3333       9.8367        0.00046
Error                 18                   147              8.166667
Total                 21                   388
Assumptions: The samples are randomly selected from the 4 populations in an
independent manner. The populations are normally distributed with equal variances
and means $\mu_1, \mu_2, \mu_3, \mu_4$.
(b) Since $F = 9.8367$ is greater than the critical value (p-value = 0.00046), there is sufficient evidence to
indicate a difference among the mean numbers of customers served by the 4 employees.

9.3.8.
We assume the samples are randomly selected from four populations in an independent
manner. The populations are normally distributed with common variance $\sigma^2$ and with
means $\mu_1, \mu_2, \mu_3, \mu_4$. In this case: , , , , , ,

, , , , , , , ,

, , , ,

, ,

, ,

, ,

The sample evidence supports the conclusion that the means of the four age groups are indeed
different at the 0.05 level of significance.

9.3.9. Assumptions: The samples are randomly selected from the populations in an
independent manner. The populations are assumed to be normally distributed with
a common variance.
, , ,

, , ,

, ,

, or ,

, ,

, ,

At
Since the observed $F$ exceeds the critical value, the sample evidence supports the alternative hypothesis that the true
rental and homeowner vacancy rates by area are indeed different for all five years at the 0.01
level of significance.

9.3.10. , , ,

, , ,

, ,

, ,

, ,

At
Since the observed $F$ is greater than the critical value, the sample evidence supports the alternative
hypothesis that the true income lower limits of the top 5 percent of U.S. households differ among the
races for all five years at the 0.05 level of significance.

9.3.11. , , ,

, , ,

, , or

, ,

, ,

At
Since the observed $F$ does not exceed the critical value, based on the data there is not enough evidence to support the
alternative hypothesis that the true mean cholesterol levels for all races in the United
States during 1978-1980 are different at the 0.01 significance level.

EXERCISES 9.4

9.4.1.

Then

Then

Now, since , and


Then, , and

Then,

and

and .

Therefore,

9.4.2.
Considering the model ,
(i) By some advanced theorems, which are beyond the scope of this text, it can be shown
that
, with noncentrality parameter , and

with noncentrality parameter , so .


Then,

(ii) . Since with noncentrality parameter ,

, then

(iii) , where T stands for “Treatment”, and since with
noncentrality parameter , then
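
The expectations in this exercise can be illustrated by simulation. The sketch below assumes the randomized complete block model $y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$ with sum-to-zero effects and normal errors, and compares the simulated averages of MSE and MST with the standard values $\sigma^2$ and $\sigma^2 + b\sum_i \tau_i^2/(k-1)$. All parameter values, the seed, and the function name are hypothetical choices for illustration, not values from the text.

```python
import random

random.seed(0)

# Hypothetical parameters for the model y_ij = mu + tau_i + beta_j + eps_ij.
mu = 10.0
tau = [-1.5, -0.5, 0.5, 1.5]        # treatment effects, sum to zero
beta = [-1.0, -0.5, 0.0, 0.5, 1.0]  # block effects, sum to zero
sigma = 1.0
k, b = len(tau), len(beta)


def mean_squares():
    """Simulate one k-by-b layout and return (MST, MSE)."""
    y = [[mu + tau[i] + beta[j] + random.gauss(0.0, sigma) for j in range(b)]
         for i in range(k)]
    grand = sum(map(sum, y)) / (k * b)
    row = [sum(y[i]) / b for i in range(k)]
    col = [sum(y[i][j] for i in range(k)) / k for j in range(b)]
    sst = b * sum((r - grand) ** 2 for r in row)
    sse = sum((y[i][j] - row[i] - col[j] + grand) ** 2
              for i in range(k) for j in range(b))
    return sst / (k - 1), sse / ((k - 1) * (b - 1))


reps = 4000
results = [mean_squares() for _ in range(reps)]
avg_mst = sum(r[0] for r in results) / reps
avg_mse = sum(r[1] for r in results) / reps

expected_mst = sigma ** 2 + b * sum(t ** 2 for t in tau) / (k - 1)
print(avg_mse, sigma ** 2)    # simulated vs. theoretical E(MSE)
print(avg_mst, expected_mst)  # simulated vs. theoretical E(MST)
```

A simulation only approximates the expectations, so the averages should be close to, not equal to, the theoretical values.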

9.4.3. , then

If , then , and since by the restriction

, the solution is given by

Now, for any fixed i

if , then . Then, since then , for

any i = 1, 2,…,k; i.e.,

For any fixed j, . If , since the solution is

, j = 1,2,…,b.
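
The least-squares solution can be evaluated directly from the row and column means. The following is a minimal sketch, assuming the usual estimators under the sum-to-zero restrictions (grand mean for $\mu$, treatment mean minus grand mean for $\tau_i$, block mean minus grand mean for $\beta_j$) and a hypothetical 3-by-4 data table.

```python
# Hypothetical k = 3 treatments (rows) by b = 4 blocks (columns).
y = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [9.0, 11.0, 10.0, 12.0],
]
k, b = len(y), len(y[0])

grand_mean = sum(map(sum, y)) / (k * b)
row_means = [sum(row) / b for row in y]
col_means = [sum(y[i][j] for i in range(k)) / k for j in range(b)]

mu_hat = grand_mean
tau_hat = [m - grand_mean for m in row_means]   # treatment effects
beta_hat = [m - grand_mean for m in col_means]  # block effects

print(mu_hat, tau_hat, beta_hat)
print(sum(tau_hat), sum(beta_hat))  # both are zero up to rounding
```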
9.4.4.
, ,

, , ,

, , ,

, ,

, ,
(a) Since the observed $F$ exceeds the critical value, we reject the null hypothesis and conclude that there is a difference in
average wear among the four materials.
(b) To test for a difference in average wear among the positions,
and . Since the observed $F$ does not exceed the critical value, we do not reject the null hypothesis:
there is not enough evidence of a difference in average wear among the positions.
(c) Assumptions: The samples are randomly selected in an independent manner from
the populations. The populations are assumed to be normally distributed with equal
variances. Also, there is no interaction between the two factors.
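
The computations behind parts (a) and (b) follow the same pattern for any randomized complete block layout. The sketch below is one way to organize them, assuming no interaction; the function `rcbd_anova` and the wear measurements are hypothetical and are not the data of this exercise.

```python
def rcbd_anova(y):
    """Two-way ANOVA without interaction; rows are treatments, columns are blocks."""
    k, b = len(y), len(y[0])
    grand = sum(map(sum, y)) / (k * b)
    row = [sum(r) / b for r in y]
    col = [sum(y[i][j] for i in range(k)) / k for j in range(b)]
    ss_total = sum((v - grand) ** 2 for r in y for v in r)
    sst = b * sum((m - grand) ** 2 for m in row)   # treatments
    ssb = k * sum((m - grand) ** 2 for m in col)   # blocks
    sse = ss_total - sst - ssb
    mse = sse / ((k - 1) * (b - 1))
    return {
        "SST": sst, "SSB": ssb, "SSE": sse, "SSTotal": ss_total,
        "F_treatments": (sst / (k - 1)) / mse,
        "F_blocks": (ssb / (b - 1)) / mse,
    }


# Hypothetical wear measurements: 4 materials (rows) by 3 positions (columns).
wear = [
    [241.0, 270.0, 274.0],
    [195.0, 241.0, 218.0],
    [235.0, 273.0, 230.0],
    [234.0, 236.0, 227.0],
]
result = rcbd_anova(wear)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Each F value is then compared with the critical value for its own numerator degrees of freedom, $k - 1$ for treatments and $b - 1$ for blocks, with $(k-1)(b-1)$ in the denominator.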

9.4.5.

, ,

, , ,

, ,

, ,

To test whether the true income lower limits of the top 5 percent of U.S. households are the same
for each race, , . Since the observed value
, we reject the null hypothesis and conclude that there is a difference in the true income
lower limits of the top 5 percent of U.S. households among the races.
To test whether the true income lower limits of the top 5 percent of U.S. households are the same for each year
between 1994 and 1998, , and . Since
the observed value , we conclude that there is a difference in the true income
lower limits of the top 5 percent of U.S. households among the years 1994-1998.

9.4.6.

, ,

, ,

, , ,

, ,

To test whether the true mean cholesterol levels for all races in the United States during 1978-
1980 are the same, , . Since the observed value
, there is not enough evidence to conclude that there is a difference in the true mean
cholesterol levels for all races in the United States during 1978-1980 at the 0.01
significance level.
To test whether the true mean cholesterol levels for all ages in the United States during 1978-
1980 are the same, , and . Since the observed value
, there is evidence to conclude that there is a difference in the true mean
cholesterol levels for all ages in the United States during 1978-1980.

9.4.7. ,
, ,

, ,

, , ,

, ,

To test whether the true mean performance is the same for different hours of sleep,
, . Since the observed value , there is
not enough evidence to conclude that there is a difference in the true mean performance for
different hours of sleep.

To test whether the true mean performance is the same for each category of the test,
, and . Since the observed value , there
is not enough evidence to conclude that there is a difference in the true mean performance for each
category of the test.

EXERCISES 9.5

9.5.1.
(a) For simplicity of computation, we will use SPSS. The following is the output.

Oneway
ANOVA
Averagetime
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   0.900            3    0.300         0.919   0.461
Within Groups    3.919            12   0.327
Total            4.818            15

(b) Since there is no significant difference, Tukey’s method is not necessary.

(c) Since F is smaller than the critical value, , there is not enough evidence to
conclude that the average time to process claim forms differs among the four
processing facilities.

Assumptions: The samples are randomly selected from the 4 populations in an
independent manner. The populations are normally distributed with equal variances
and means $\mu_1, \mu_2, \mu_3, \mu_4$.

9.5.2.
(a) Oneway
ANOVA
Vacancy
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   18.420           3    6.14          18.33   0.00001
Within Groups    5.360            16   0.335
Total            23.780           19

Since F is greater than the critical value, , there is evidence to conclude that
there is a difference in the true rental and homeowner vacancy rates by area for all five
years at the 0.01 level of significance.

(b)
Post Hoc Tests
Multiple Comparisons
Vacancy
Tukey HSD
(I)        (J)        Mean Difference   Std.                99% Confidence Interval
area_num   area_num   (I-J)             Error     Sig.      Lower Bound   Upper Bound
1          2          -1.06             0.36606   0.05      -2.4039       0.2839
           3          -2.32000*         0.36606   0         -3.6639       -0.9761
           4          0.02              0.36606   1         -1.3239       1.3639
2          1          1.06              0.36606   0.05      -0.2839       2.4039
           3          -1.26             0.36606   0.02      -2.6039       0.0839
           4          1.08              0.36606   0.04      -0.2639       2.4239
3          1          2.32000*          0.36606   0         0.9761        3.6639
           2          1.26              0.36606   0.02      -0.0839       2.6039
           4          2.34000*          0.36606   0         0.9961        3.6839
4          1          -0.02             0.36606   1         -1.3639       1.3239
           2          -1.08             0.36606   0.04      -2.4239       0.2639
           3          -2.34000*         0.36606   0         -3.6839       -0.9961
*. The mean difference is significant at the 0.01 level.

The next table summarizes these results, where N.R. represents ‘not reject’.

Means compared   Tukey interval      Reject or N.R.
6.86-7.92        (-2.404, 0.284)     N.R.
6.86-9.18        (-3.664, -0.976)    R
6.86-6.84        (-1.324, 1.364)     N.R.
7.92-9.18        (-2.604, 0.084)     N.R.
7.92-6.84        (-0.264, 2.424)     N.R.
9.18-6.84        (0.996, 3.684)      R

Based on the 99% Tukey intervals, the Northeast vacancy rate is different from the South's, and the South's
is different from the West's. All other pairs of vacancy rates are similar.
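
The Reject / N.R. column follows mechanically from the intervals: a pair is declared different exactly when its interval excludes zero. The sketch below applies that rule to the six intervals reported by SPSS, and also checks that the reported standard error matches $\sqrt{2\,\mathrm{MSE}/n}$ with MSE = 0.335 from the ANOVA table and n = 5 (the five years per area). Variable names are illustrative.

```python
from math import sqrt

# 99% Tukey intervals copied from the SPSS output for Exercise 9.5.2.
intervals = {
    (1, 2): (-2.4039, 0.2839),
    (1, 3): (-3.6639, -0.9761),
    (1, 4): (-1.3239, 1.3639),
    (2, 3): (-2.6039, 0.0839),
    (2, 4): (-0.2639, 2.4239),
    (3, 4): (0.9961, 3.6839),
}

# A pair differs significantly exactly when its interval excludes zero.
decisions = {pair: ("R" if lo > 0 or hi < 0 else "N.R.")
             for pair, (lo, hi) in intervals.items()}
for pair, decision in decisions.items():
    print(pair, decision)

# Standard error of a difference of two group means with equal group sizes.
std_error = sqrt(2 * 0.335 / 5)
print(round(std_error, 5))
```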

9.5.3.
(a) Oneway

ANOVA
Income
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   6344.550         3    2114.850      32.350   .000
Within Groups    1046.000         16   65.375
Total            7390.550         19

Since F is greater than the critical value, , based on the data provided, there is
evidence to conclude that there is a difference in the income lower limits of the top 5 percent
of U.S. households among the races for all five years at the 0.05 level of significance.

(b) Post Hoc Tests

Multiple Comparisons
Income
Tukey HSD
(I)        (J)        Mean Difference   Std.                95% Confidence Interval
race_num   race_num   (I-J)             Error     Sig.      Lower Bound   Upper Bound
1          2          -3.4              5.11371   0.91      -18.0304      11.2304
           3          35.00000*         5.11371   0         20.3696       49.6304
           4          32.60000*         5.11371   0         17.9696       47.2304
2          1          3.4               5.11371   0.91      -11.2304      18.0304
           3          38.40000*         5.11371   0         23.7696       53.0304
           4          36.00000*         5.11371   0         21.3696       50.6304
3          1          -35.00000*        5.11371   0         -49.6304      -20.3696
           2          -38.40000*        5.11371   0         -53.0304      -23.7696
           4          -2.4              5.11371   0.97      -17.0304      12.2304
4          1          -32.60000*        5.11371   0         -47.2304      -17.9696
           2          -36.00000*        5.11371   0         -50.6304      -21.3696
           3          2.4               5.11371   0.97      -12.2304      17.0304

(c)
Means compared   Tukey interval      Reject or N.R.
120.4-123.8      (-18.03, 11.23)     N.R.
120.4-85.4       (20.37, 49.63)      R
120.4-87.8       (17.97, 47.23)      R
123.8-85.4       (23.77, 53.03)      R
123.8-87.8       (21.37, 50.63)      R
85.4-87.8        (-17.03, 12.23)     N.R.

Assuming the samples are randomly selected in an independent manner and the populations
are normally distributed with equal variances, the 95% Tukey intervals show that the All
races group is similar to White, and Black is similar to Hispanic. All other pairs of true income lower
limits are different.
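
The intervals in part (c) can be rebuilt from the four group means and a single half-width. In the sketch below the means, MSE = 65.375, and n = 5 come from this solution; the half-width is read off the reported interval for the first pair rather than computed from a studentized-range table, which is an assumption of the sketch.

```python
from itertools import combinations
from math import sqrt

# Group means, MSE, and group size taken from the solution to Exercise 9.5.3.
means = {1: 120.4, 2: 123.8, 3: 85.4, 4: 87.8}
mse, n = 65.375, 5

std_error = sqrt(2 * mse / n)  # standard error of a difference of two means
half_width = 11.2304 - (-3.4)  # read off the reported interval for pair (1, 2)

results = {}
for i, j in combinations(means, 2):
    diff = means[i] - means[j]
    lo, hi = diff - half_width, diff + half_width
    verdict = "R" if lo > 0 or hi < 0 else "N.R."
    results[(i, j)] = (lo, hi, verdict)
    print(f"{i}-{j}: diff={diff:.1f}, interval=({lo:.2f}, {hi:.2f}), {verdict}")

print(round(std_error, 5))
```

Because all groups have the same size, every pair shares the same half-width, which is why one reported interval is enough to recover the rest.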

9.5.4.
(a) According to Exercise 9.3.11, and at
. Then, based on the data, there is no significant difference in the
true mean cholesterol levels for all races in the United States during 1978-1980 at the 0.01
significance level.

(b) Since there is no significant difference, Tukey’s method is not necessary.

EXERCISES 9.7

9.7.1.
Oneway
ANOVA
colesterol
                 Sum of Squares   df   Mean Square   F      Sig.
Between Groups   32.444           2    16.222        .041   .960
Within Groups    5961.833         15   397.456
Total            5994.278         17
Since F is smaller than the critical value, , based on the data provided, there
is not enough evidence to conclude that there is a difference in the true cholesterol levels
for all races in the United States during 1978-1980.
