9 Testing the Difference Between Two Means, Two Proportions, and Two Variances
STATISTICS TODAY
To Vaccinate or Not to Vaccinate? Small versus Large Nursing Homes
Influenza is a serious disease among the elderly, especially those living in nursing homes. Those residents are more susceptible to influenza than elderly persons living in the community because the former are usually older and more debilitated, and they live in a closed environment where they are exposed more so than community residents to the virus if it is introduced into the home. Three researchers decided to investigate the use of vaccine and its value in determining outbreaks of influenza in small nursing homes.
These researchers surveyed 83 randomly selected licensed homes in seven counties in Michigan. Part of the study consisted of comparing the number of people being vaccinated in small nursing homes (100 or fewer beds) with the number in larger nursing homes (more than 100 beds). Unlike the statistical methods presented in Chapter 8, these researchers used the techniques explained in this chapter to compare two sample proportions to see if there was a significant difference in the vaccination rates of patients in small nursing homes compared to those in large nursing homes. See Statistics Today Revisited at the end of the chapter.
Source: Nancy Arden, Arnold S. Monto, and Suzanne E. Ohmit, “Vaccine Use and the Risk of Outbreaks in a Sample of Nursing Homes During an Influenza Epidemic,” American Journal of Public Health 85, no. 3, pp. 399–401. Copyright by the American Public Health Association.
OUTLINE
Introduction
9–1 Testing the Difference Between Two Means: Using the z Test
9–2 Testing the Difference Between Two Means of Independent Samples: Using the t Test
9–3 Testing the Difference Between Two Means: Dependent Samples
9–4 Testing the Difference Between Proportions
9–5 Testing the Difference Between Two Variances
Summary
OBJECTIVES
After completing this chapter, you should be able to:
1 Test the difference between two means, using the z test.
2 Test the difference between two means for independent samples, using the t test.
3 Test the difference between two means for dependent samples.
4 Test the difference between two proportions.
5 Test the difference between two variances or standard deviations.
488 Chapter 9 Testing the Difference Between Two Means, Two Proportions, and Two Variances
Introduction
The basic concepts of hypothesis testing were explained in Chapter 8. With the z, t, and χ² tests, a sample mean, variance, or proportion can be compared to a specific population mean, variance, or proportion to determine whether the null hypothesis should be rejected.
There are, however, many instances when researchers wish to compare two sample
means, using experimental and control groups. For example, the average lifetimes of two
different brands of bus tires might be compared to see whether there is any difference in
tread wear. Two different brands of fertilizer might be tested to see whether one is better
than the other for growing plants. Or two brands of cough syrup might be tested to see
whether one brand is more effective than the other.
In the comparison of two means, the same basic steps for hypothesis testing shown
in Chapter 8 are used, and the z and t tests are also used. When comparing two means
by using the t test, the researcher must decide if the two samples are independent or
dependent. The concepts of independent and dependent samples will be explained in
Sections 9–2 and 9–3.
The z test can be used to compare two proportions, as shown in Section 9–4. Finally,
two variances can be compared by using an F test as shown in Section 9–5.
9–1 Testing the Difference Between Two Means: Using the z Test
OBJECTIVE 1 Test the difference between two means, using the z test.
Suppose a researcher wishes to determine whether there is a difference in the average age of nursing students who enroll in a nursing program at a community college and those who enroll in a nursing program at a university. In this case, the researcher is not interested in the average age of all beginning nursing students; instead, he is interested in comparing the means of the two groups. His research question is, Does the mean age of nursing students who enroll at a community college differ from the mean age of nursing students who enroll at a university? Here, the hypotheses are
H0: μ1 = μ2
H1: μ1 ≠ μ2
Another way of stating the hypotheses for this situation is
H0: μ1 − μ2 = 0
H1: μ1 − μ2 ≠ 0
If there is no difference in population means, subtracting them will give a difference of
zero. If they are different, subtracting will give a number other than zero. Both methods
of stating hypotheses are correct; however, the first method will be used in this text.
If two samples are independent of each other, the subjects selected for the first sample
in no way influence the way the subjects are selected in the second sample. For example,
if a group of 50 people were randomly divided into two groups of 25 people each in order
to test the effectiveness of a new drug, where one group gets the drug and the other group
gets a placebo, the samples would be independent of each other.
On the other hand, two samples would be dependent if the selection of subjects for
the first group in some way influenced the selection of subjects for the other group. For
example, suppose you wanted to determine if a person’s right foot was slightly larger than
his or her left foot. In this case, the samples are dependent because once you selected a person’s right foot for sample 1, you must select his or her left foot for sample 2 because you are using the same person for both feet.
FIGURE 9–1 Differences of Means of Pairs of Samples (distribution of $\bar{X}_1 - \bar{X}_2$, centered at 0)
Before you can use the z test to test the difference between two independent sample
means, you must make sure that the following assumptions are met.
Assumptions for the z Test to Determine the Difference Between Two Means
In this book, the assumptions will be stated in the exercises; however, when encountering
statistics in other situations, you must check to see that these assumptions have been met
before proceeding.
The theory behind testing the difference between two means is based on selecting
pairs of samples and comparing the means of the pairs. The population means need not
be known.
All possible pairs of samples are taken from populations. The means for each pair of
samples are computed and then subtracted, and the differences are plotted. If both popu-
lations have the same mean, then most of the differences will be zero or close to zero.
Occasionally, there will be a few large differences due to chance alone, some positive
and others negative. If the differences are plotted, the curve will be shaped like a normal distribution and have a mean of zero, as shown in Figure 9–1.
Unusual Stats
Adult children who live with their parents spend more than 2 hours a day doing household chores. According to a study, daughters contribute about 17 hours a week and sons about 14.4 hours.
The variance of the difference $\bar{X}_1 - \bar{X}_2$ is equal to the sum of the individual variances of $\bar{X}_1$ and $\bar{X}_2$. That is,
$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2}$$
where
$$\sigma^2_{\bar{X}_1} = \frac{\sigma_1^2}{n_1} \quad\text{and}\quad \sigma^2_{\bar{X}_2} = \frac{\sigma_2^2}{n_2}$$
So the standard deviation of $\bar{X}_1 - \bar{X}_2$ is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
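The claim that the differences of sample means center on zero with variance σ1²/n1 + σ2²/n2 can be checked with a small simulation. The sketch below is only illustrative: the population mean of 50, the standard deviations 6.3 and 5.8, the sample size of 35, and the number of simulated pairs are assumptions chosen for the demonstration, and a finite simulation only approximates "all possible pairs of samples."

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Assumed illustrative values: both populations share the same mean,
# sigma1 = 6.3, sigma2 = 5.8, and each sample has 35 observations.
MU, SIGMA1, SIGMA2, N1, N2 = 50.0, 6.3, 5.8, 35, 35
PAIRS = 4000  # number of simulated pairs of samples (an arbitrary choice)


def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))


# Difference of sample means for each simulated pair of samples
differences = [
    sample_mean(MU, SIGMA1, N1) - sample_mean(MU, SIGMA2, N2)
    for _ in range(PAIRS)
]

simulated_mean = statistics.fmean(differences)
simulated_var = statistics.pvariance(differences)
theoretical_var = SIGMA1**2 / N1 + SIGMA2**2 / N2

print(f"mean of differences:     {simulated_mean:.3f}")
print(f"variance of differences: {simulated_var:.3f}")
print(f"sigma1^2/n1 + sigma2^2/n2 = {theoretical_var:.3f}")
```

With equal population means, the simulated mean should fall near zero and the simulated variance near the theoretical sum, up to sampling noise.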
Formula for the z Test for Comparing Two Means from Independent Populations
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where $\bar{X}_1 - \bar{X}_2$ is the observed difference, and the expected difference μ1 − μ2 is zero when the null hypothesis is μ1 = μ2, since that is equivalent to μ1 − μ2 = 0. Finally, the standard error of the difference is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
FIGURE 9–2 (a) Difference is not significant: do not reject H0: μ1 = μ2 since $\bar{X}_1 - \bar{X}_2$ is not significant; the means of the populations are the same. (b) Difference is significant: reject H0: μ1 = μ2 since $\bar{X}_1 - \bar{X}_2$ is significant; the means of the populations are different.
In the comparison of two sample means, the difference may be due to chance, in
which case the null hypothesis will not be rejected and the researcher can assume that
the means of the populations are basically the same. The difference in this case is not
significant. See Figure 9–2(a). On the other hand, if the difference is significant, the null
hypothesis is rejected and the researcher can conclude that the population means are
different. See Figure 9–2(b).
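These tests can also be one-tailed, using the following hypotheses:
Right-tailed: H0: μ1 = μ2 and H1: μ1 > μ2 (or H0: μ1 − μ2 = 0 and H1: μ1 − μ2 > 0)
Left-tailed: H0: μ1 = μ2 and H1: μ1 < μ2 (or H0: μ1 − μ2 = 0 and H1: μ1 − μ2 < 0)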
The same critical values used in Section 8–2 are used here. They can be obtained
from Table E in Appendix A.
The basic format for hypothesis testing using the traditional method is reviewed here.
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s).
Step 3 Compute the test value.
Step 4 Make the decision.
Step 5 Summarize the results.
... studies is 6.3 hours, and the population standard deviation of those in the second group found by previous studies was 5.8 hours. At α = 0.05, can it be concluded that there is a significant difference in the average times each group spends on leisure activities?
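SOLUTION
Step 1 State the hypotheses and identify the claim.
H0: μ1 = μ2 and H1: μ1 ≠ μ2 (claim)
Step 2 Find the critical values. Since α = 0.05, the critical values are +1.96 and −1.96.
Step 3 Compute the test value.
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(39.6 - 35.4) - 0}{\sqrt{\dfrac{6.3^2}{35} + \dfrac{5.8^2}{35}}} = 2.90$$
Step 4 Make the decision. Reject the null hypothesis at α = 0.05 since 2.90 > 1.96. See Figure 9–3.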
FIGURE 9–3 Critical and Test Values for Example 9–1 (critical values z = −1.96 and z = +1.96; test value z = +2.90)
Step 5 Summarize the results. There is enough evidence to support the claim that
the means are not equal. That is, the average of the times spent on leisure
activities is different for the groups.
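The traditional method used in Example 9–1 can be sketched in a few lines of Python. This is an illustration rather than the text's own procedure: the function name `two_mean_z_test` is hypothetical, the sample means 39.6 and 35.4 are the values that appear in the confidence-interval calculation later in this section, and the critical value is computed from the standard normal distribution instead of being read from Table E.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z_test(x1, x2, sigma1, sigma2, n1, n2, alpha, hypothesized_diff=0.0):
    """Two-tailed z test for two independent means with known sigmas.

    Returns the test value, the positive critical value, and the decision.
    """
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((x1 - x2) - hypothesized_diff) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # plays the role of Table E
    return z, critical, abs(z) > critical


# Values from Example 9-1: means 39.6 and 35.4, sigmas 6.3 and 5.8, n = 35 each
z, critical, reject = two_mean_z_test(39.6, 35.4, 6.3, 5.8, 35, 35, alpha=0.05)
print(f"z = {z:.2f}, critical values = ±{critical:.2f}, reject H0: {reject}")
```

The decision agrees with Step 4 of the example: the test value falls beyond the critical value, so the null hypothesis is rejected.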
The P-values for this test can be determined by using the same procedure shown in Section 8–2. For example, if the test value for a two-tailed test is 2.90, then the P-value obtained from Table E is 0.0038. This value is obtained by looking up the area for z = 2.90, which is 0.9981. Then 0.9981 is subtracted from 1.0000 to get 0.0019. Finally, this value is doubled to get 0.0038 since the test is two-tailed. If α = 0.05, the decision would be to reject the null hypothesis, since P-value < α (that is, 0.0038 < 0.05). Note: The P-value obtained on the TI-84 is 0.0037.
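The table lookup described above (area to the left of z, subtract from 1, then double) can be mirrored in code. The sketch below uses Python's `statistics.NormalDist` in place of Table E; the helper name `two_tailed_p_value` is an assumption made for illustration. Because the code does not round the area to four places, its result is closer to the calculator value than to the table value.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """P-value for a two-tailed z test: twice the area beyond |z|."""
    area_to_left = NormalDist().cdf(abs(z))  # the area Table E would report
    area_in_tail = 1.0 - area_to_left        # area in one tail
    return 2.0 * area_in_tail                # doubled for a two-tailed test


p_value = two_tailed_p_value(2.90)
alpha = 0.05
print(f"P-value = {p_value:.4f}; reject H0 at alpha = {alpha}: {p_value < alpha}")
```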
The P-value method for hypothesis testing for this chapter also follows the same for-
mat as stated in Chapter 8. The steps are reviewed here.
Step 1 State the hypotheses and identify the claim.
Step 2 Compute the test value.
Step 3 Find the P-value.
Step 4 Make the decision.
Step 5 Summarize the results.
... sample of the number of sports offered by colleges for males and females is shown. At α = 0.10, is there enough evidence to support the claim? Assume σ1 and σ2 = 3.3.
Males              | Females
 6 11 11  8 15     |  6  8 11 13  8
 6 14  8 12 18     |  7  5 13 14  6
 6  9  5  6  9     |  6  5  5  7  6
 6  9 18  7  6     | 10  7  6  5  5
15  6 11  5  5     | 16 10  7  8  5
 9  9  5  5  8     |  7  5  5  6  5
 8  9  6 11  6     |  9 18 13  7 10
 9  5 11  5  8     |  7  8  5  7  6
 7  7  5 10  7     | 11  4  6  8  7
10  7 10  8 11     | 14 12  5  8  5
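SOLUTION
Step 1 State the hypotheses and identify the claim.
H0: μ1 = μ2 and H1: μ1 > μ2 (claim)
Step 2 Compute the test value. Using a calculator or the formula in Chapter 3, find the mean of each data set. For the males, $\bar{X}_1 = 8.6$; for the females, $\bar{X}_2 = 7.9$. Substitute in the formula.
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(8.6 - 7.9) - 0}{\sqrt{\dfrac{3.3^2}{50} + \dfrac{3.3^2}{50}}} = 1.06^{*}$$
Step 3 Find the P-value from Table E. For z = 1.06, the area is 0.8554, and 1.0000 − 0.8554 = 0.1446. Hence, the P-value is 0.1446.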
Step 4 Make the decision. Since the P-value is larger than α (that is, 0.1446 > 0.10),
the decision is to not reject the null hypothesis. See Figure 9–4.
Step 5 Summarize the results. There is not enough evidence to support the claim that
colleges offer more sports for males than they do for females at the 0.10 level
of significance.
FIGURE 9–4 P-Value and α Value for Example 9–2 (right-tail areas under the z curve: P-value = 0.1446, α = 0.10)
*Note: Calculator results may differ due to rounding.
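As a rough check of Example 9–2, the sketch below recomputes the test from the raw data. It assumes that the first five values in each printed row are the male counts and the last five are the female counts, and that the test is right-tailed, as the stated conclusion suggests. The variable names are illustrative. It reports the test value twice, once with the means rounded to one decimal place, as in a hand calculation, and once with unrounded means, which shows why calculator results can differ.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Each row holds ten values: assumed five for males, then five for females.
rows = [
    [6, 11, 11, 8, 15, 6, 8, 11, 13, 8],
    [6, 14, 8, 12, 18, 7, 5, 13, 14, 6],
    [6, 9, 5, 6, 9, 6, 5, 5, 7, 6],
    [6, 9, 18, 7, 6, 10, 7, 6, 5, 5],
    [15, 6, 11, 5, 5, 16, 10, 7, 8, 5],
    [9, 9, 5, 5, 8, 7, 5, 5, 6, 5],
    [8, 9, 6, 11, 6, 9, 18, 13, 7, 10],
    [9, 5, 11, 5, 8, 7, 8, 5, 7, 6],
    [7, 7, 5, 10, 7, 11, 4, 6, 8, 7],
    [10, 7, 10, 8, 11, 14, 12, 5, 8, 5],
]
males = [value for row in rows for value in row[:5]]
females = [value for row in rows for value in row[5:]]

SIGMA = 3.3   # population standard deviation given for both groups
ALPHA = 0.10
standard_error = sqrt(SIGMA**2 / len(males) + SIGMA**2 / len(females))

mean_males, mean_females = fmean(males), fmean(females)

# Test value with means rounded to one decimal place (hand calculation)
z_rounded = (round(mean_males, 1) - round(mean_females, 1)) / standard_error
# Test value with unrounded means (what a calculator would use)
z_exact = (mean_males - mean_females) / standard_error

# Right-tailed P-values: area to the right of the test value
p_rounded = 1 - NormalDist().cdf(z_rounded)
p_exact = 1 - NormalDist().cdf(z_exact)

print(f"means: {mean_males:.2f} (males), {mean_females:.2f} (females)")
print(f"rounded means: z = {z_rounded:.2f}, P-value = {p_rounded:.4f}")
print(f"exact means:   z = {z_exact:.2f}, P-value = {p_exact:.4f}")
print("reject H0:", p_exact < ALPHA)
```

Either way the P-value exceeds α = 0.10, so the decision not to reject the null hypothesis is unchanged.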
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Confidence intervals for the difference between two means can also be found. When
you are hypothesizing a difference of zero, if the confidence interval contains zero, the
null hypothesis is not rejected. If the confidence interval does not contain zero, the null
hypothesis is rejected.
Confidence intervals for the difference between two means can be found by using
this formula:
Formula for the z Confidence Interval for Difference Between Two Means
$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
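Substituting the values from Example 9–1 in the formula, with $z_{\alpha/2} = 1.96$, gives
$$(39.6 - 35.4) - 1.96\sqrt{\frac{6.3^2}{35} + \frac{5.8^2}{35}} < \mu_1 - \mu_2 < (39.6 - 35.4) + 1.96\sqrt{\frac{6.3^2}{35} + \frac{5.8^2}{35}}$$
$$4.2 - 2.8 < \mu_1 - \mu_2 < 4.2 + 2.8$$
$$1.4 < \mu_1 - \mu_2 < 7.0$$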
(The confidence interval obtained from the TI-84 is 1.363 < μ1 − μ2 < 7.037.)
Since the confidence interval does not contain zero, the decision is to reject the
null hypothesis, which agrees with the previous result.
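The interval above can be reproduced with a short function. This is a sketch under the assumption that the critical value comes from the standard normal distribution (computed here rather than read from a table); the function name `z_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x1, x2, sigma1, sigma2, n1, n2, confidence):
    """z confidence interval for mu1 - mu2 when both sigmas are known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    difference = x1 - x2
    return difference - margin, difference + margin


# Values from Example 9-1
lower, upper = z_interval(39.6, 35.4, 6.3, 5.8, 35, 35, confidence=0.95)
print(f"{lower:.3f} < mu1 - mu2 < {upper:.3f}")
print("interval contains zero:", lower <= 0 <= upper)
```

Because the margin is not rounded to one decimal place, the endpoints match the calculator interval quoted above more closely than the hand-computed 1.4 and 7.0.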
National League
47 49 73 50 65 70 49 47 40 43
46 35 38 40 47 39 49 37 37 36
40 37 31 48 48 45 52 38 38 36
44 40 48 45 45 36 39 44 52 47
American League
47 57 52 47 48 56 56 52 50 40
46 43 44 51 36 42 49 49 40 43
39 39 22 41 45 46 39 32 36 32
32 32 37 33 44 49 44 44 49 32
Exercises 9–1
1. Explain the difference between testing a single mean and testing the difference between two means.
2. When a researcher selects all possible pairs of samples from a population in order to find the difference between the means of each pair, what will be the shape of the distribution of the differences when the original distributions are normally distributed? What will be the mean of the distribution? What will be the standard deviation of the distribution?
3. What three assumptions must be met when you are using the z test to test differences between two means when σ1 and σ2 are known?
4. Show two different ways to state that the means of two populations are equal.
For Exercises 5 through 16, perform each of the following steps.
a. State the hypotheses and identify the claim.
b. Find the critical value(s).
c. Compute the test value.
d. Make the decision.
e. Summarize the results.
Use the traditional method of hypothesis testing unless otherwise specified.
5. Recreational Time A researcher wishes to see if there is a difference between the mean number of hours per week that a family with no children participates in recreational activities and a family with children participates in recreational activities. She selects two random samples and the data are shown. At α = 0.10, is there a difference between the means?
              X̄      σ     n
No children   8.6    2.1   36
Children      10.6   2.7   36
6. Teachers’ Salaries New York and Massachusetts lead the list of average teacher’s salaries. The New York average is $76,409 while teachers in Massachusetts make an average annual salary of $73,195. Random samples of 45 teachers from each state yielded the following.
... enough evidence to reject the claim that the average cost of a home in both locations is the same? Use α = 0.01.
Scott Ligonier
                   X̄     σ     n
Day students       4.7   1.5   40
Evening students   6.2   1.7   40
15. Self-Esteem Scores In a study of a group of women science majors who remained in their profession and a group who left their profession within a few months ...
23. Store Sales A company owned two small Bath and Body Goods stores in different cities. It was desired to see if there was a difference in their mean daily sales. The following results were obtained from a random ...
σ1 = 15 σ2 = 15
n1 = 60 n2 = 60
24. Home Prices According to the almanac, the average sales price of a single-family home in the metropolitan Dallas/Ft. Worth/Irving, Texas, area is $215,200. The average home price in Orlando, Florida, is $198,000. ...
... age earnings of year-round full-time workers with bachelor’s degrees or more is $88,641 for men and $58,000 for women, a difference of slightly over $30,000 a year. One hundred of each were randomly sampled, resulting in a sample mean of $90,200 for men, and the population standard deviation is $15,000; and a mean of $57,800 for women, and the population standard deviation is $12,800. At the 0.01 level of significance, can it be concluded that the difference in means is not $30,000?
Source: New York Times Almanac.
26. Sale Prices for Houses The average sales price of new one-family houses in the Midwest is $250,000 and in the South is $253,400. A random sample of 40 houses in each region was examined with the following results. At the 0.05 level of significance, can it be concluded that the difference in mean sales price for the two regions is greater than $3400?
Hypothesis Test for the Difference Between Two Means and z Distribution (Statistics)
Example TI9–2
This refers to Example 9–1 in the text.
1. Press STAT and move the cursor to TESTS.
2. Press 3 for 2-SampZTest.
3. Move the cursor to Stats and press ENTER.
4. Type in the appropriate values.
5. Move the cursor to the appropriate alternative hypothesis
and press ENTER.
6. Move the cursor to Calculate and press ENTER.
Set A 10 2 15 18 13 15 16 14 18 12 15 15 14 18 16
Set B 5 8 10 9 9 11 12 16 8 8 9 10 11 7 6
The two-sample z test dialog box is shown (before the variances are entered); the results appear in the table that Excel generates. Note that the P-value and critical z value are provided for both the one-tailed test and the two-tailed test. The P-values here are expressed in scientific notation: 7.09045E-06 = 7.09045 × 10⁻⁶ = 0.00000709045. Because this value is less than 0.05, we reject the null hypothesis and conclude that the population means are not equal.
9–2 Testing the Difference Between Two Means of Independent Samples: Using the t Test
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
The formula
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
is used, where the expected difference μ1 − μ2 is equal to zero when no difference between population means is hypothesized. The denominator $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ is the standard error of the difference between two means. This formula is similar to the one used when σ1 and σ2 are known; but when we use this t test, σ1 and σ2 are unknown, so s1 and s2 are used in the formula in place of σ1 and σ2. Since mathematical derivation of the standard error is somewhat complicated, it will be omitted here.
Before you can use the testing methods to determine whether two independent sample means differ when σ1 and σ2 are unknown, the following assumptions must be met.
Assumptions for the t Test for Two Independent Means When σ1 and σ2
Are Unknown
1. The samples are random samples.
2. The sample data are independent of one another.
3. When the sample sizes are less than 30, the populations must be normally or
approximately normally distributed.
In this book, the assumptions will be stated in the exercises; however, when encountering
statistics in other situations, you must check to see that these assumptions have been met
before proceeding.
Again the hypothesis test here follows the same steps as those in Section 9–1; how-
ever, the formula uses s1 and s2 and Table F to get the critical values.
... sample was 1.23. A random sample of 7 women found that the mean was 4.3 days and a standard deviation of 1.19 days. At α = 0.05, can it be concluded that there is a ...
SOLUTION
Step 2 Find the critical values. Since the test is two-tailed and α = 0.05, the degrees of freedom are the smaller of n1 − 1 and n2 − 1. In this case, n1 − 1 = 9 − 1 = 8 and n2 − 1 = 7 − 1 = 6, so d.f. = 6. From Table F, the critical values are +2.447 and −2.447.
Step 3 Compute the test value.
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(5.5 - 4.3) - 0}{\sqrt{\dfrac{1.23^2}{9} + \dfrac{1.19^2}{7}}} = 1.972$$
Step 4 Make the decision. Do not reject the null hypothesis since 1.972 < 2.447. (The accompanying figure shows the t distribution with critical values −2.447 and +2.447 and the test value 1.972.)
Step 5 Summarize the results. There is not enough evidence to support the claim that the means are different.
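The t test in this example can be sketched as follows. The inputs are taken from the numbers that appear in the example (means 5.5 and 4.3, standard deviations 1.23 and 1.19, sample sizes 9 and 7); the function name is hypothetical, and the critical value is hard-coded as the Table F value quoted in the example rather than computed.

```python
from math import sqrt


def unequal_variance_t(x1, x2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """t test value for two independent means when sigmas are unknown.

    Degrees of freedom follow the conservative rule used in this section:
    the smaller of n1 - 1 and n2 - 1.
    """
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    t = ((x1 - x2) - hypothesized_diff) / standard_error
    df = min(n1 - 1, n2 - 1)
    return t, df


t_value, df = unequal_variance_t(5.5, 4.3, 1.23, 1.19, 9, 7)

# Critical value quoted in the example (Table F, two-tailed, alpha = 0.05, d.f. = 6)
CRITICAL_T = 2.447
reject = abs(t_value) > CRITICAL_T

print(f"t = {t_value:.3f} with d.f. = {df}; reject H0: {reject}")
```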
When raw data are given in the exercises, use your calculator or the formulas in
Chapter 3 to find the means and variances for the data sets. Then follow the procedures
shown in this section to test the hypotheses.
Confidence intervals can also be found for the difference of two means with this
formula:
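$$(\bar{X}_1 - \bar{X}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
with d.f. equal to the smaller of n1 − 1 and n2 − 1.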
EXAMPLE 9–5
Find the 95% confidence interval for the data in Example 9–4.
SOLUTION
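One way to carry out this computation is sketched below. It assumes the Example 9–4 values that appear in this section (means 5.5 and 4.3, standard deviations 1.23 and 1.19, sample sizes 9 and 7) and the Table F critical value 2.447 for d.f. = 6; the function name `t_interval` is illustrative, and the critical value is supplied by the caller rather than computed.

```python
from math import sqrt


def t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """t confidence interval for mu1 - mu2 (sigmas unknown, not pooled).

    t_crit must be looked up for the chosen confidence level with
    d.f. = the smaller of n1 - 1 and n2 - 1.
    """
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    difference = x1 - x2
    return difference - margin, difference + margin


# 95% interval; t_crit = 2.447 is the Table F value for d.f. = 6
lower, upper = t_interval(5.5, 4.3, 1.23, 1.19, 9, 7, t_crit=2.447)
print(f"{lower:.3f} < mu1 - mu2 < {upper:.3f}")
print("interval contains zero:", lower <= 0 <= upper)
```

Under these assumptions the interval contains zero, which is consistent with the decision in Example 9–4 not to reject the null hypothesis.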
In many statistical software packages, a different method is used to compute the degrees of freedom for this t test. They are determined by the formula
$$\text{d.f.} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
When the variances are assumed to be equal, this formula is used:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
... significance will change the overall level of significance of the t test. Their reasons are beyond the scope of this text. Because of this, we will assume that σ1 ≠ σ2 in this text.
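The two formulas above can be compared numerically. The sketch below applies them to the Example 9–4 values used earlier in this section (an assumption made for illustration; the text does not work this comparison). The function names are hypothetical.

```python
from math import sqrt


def software_df(s1, s2, n1, n2):
    """Degrees of freedom as computed by many software packages."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def pooled_t(x1, x2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """t test value when the population variances are assumed equal."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n2)
    return ((x1 - x2) - hypothesized_diff) / standard_error


df_software = software_df(1.23, 1.19, 9, 7)
df_conservative = min(9 - 1, 7 - 1)
t_pooled = pooled_t(5.5, 4.3, 1.23, 1.19, 9, 7)

print(f"software d.f. = {df_software:.2f}, conservative d.f. = {df_conservative}")
print(f"pooled t = {t_pooled:.3f}")
```

For these values the software formula gives noticeably more degrees of freedom than the smaller-of-(n − 1) rule, which is why software P-values can differ from those found with Table F.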
... division are 6.93 and 4.93, respectively. A hypothesis test was run, and the computer output follows.
Degrees of freedom = 56
Exercises 9–2
For these exercises, perform each of these steps. Assume that all variables are normally or approximately normally distributed.
a. State the hypotheses and identify the claim.
b. Find the critical value(s).
c. Compute the test value.
d. Make the decision.
e. Summarize the results.
Use the traditional method of hypothesis testing unless ...
... the difference in the means is statistically significant? Use α = 0.10.
Chocolate: 29 25 17 36 41 25 32 29 38 34 24 27 29
Nonchocolate: 41 41 37 29 30 38 39 10 29 55 29
Source: The Doctor’s Pocket Calorie, Fat, and Carbohydrate Counter.
2. Tax-Exempt Properties A tax collector wishes to see if the mean values of the tax-exempt properties are dif- ...
... and the sample standard deviation was 7.5 dBA. At α = 0.05, can it be concluded that there is a difference in the means?
4. Ages of Gamblers The mean age of a random sample of 25 people who were playing the slot machines is 48.7 years, and the standard deviation is 6.8 years. The mean age of a random sample of 35 people who were ...
6. Weights of Vacuum Cleaners Upright vacuum cleaners have either a hard body type or a soft body type. ...
7. Weights of Running Shoes The weights in ounces of a sample of running shoes for men and women are shown. ...
... ondary school teachers is $45,633. The sample standard deviation is $5533. At α = 0.05, can it be concluded that the mean of the salaries of the elementary school teachers is greater than the mean of the salaries of the secondary school teachers? Use the P-value method.
9. Find the 90% confidence interval for the difference of the means in Exercise 1 of this section.
13. Cyber School Enrollment The data show the number of students attending cyber charter schools in Allegheny County and the number of students attending cyber schools in counties surrounding Allegheny County. At α = 0.01, is there enough evidence to support the claim that the average number of students in school districts in Allegheny County who attend cyber schools is greater than those who attend cyber schools in school districts outside Allegheny County? Give a factor that should be considered in interpreting this answer.
Allegheny County Outside Allegheny County
25 75 38 41 27 32 57 25 38 14 10 29
Source: Pittsburgh Tribune-Review.
14. Hockey’s Highest Scorers The number of points held by random samples of the NHL’s highest scorers for ...
474 577 605 663 783 605 427 728
783 467 670 414 546 474 371 107
813 443 565 696 442 587 293 277
692 694 277 419 662 555 527 320
884
Source: U.S. News & World Report Best Graduate Schools.
18. Out-of-State Tuitions The out-of-state tuitions (in dollars) for random samples of both public and private four-year colleges in a New England state are listed. Find the 95% confidence interval for the difference in the means.
Private Public
13,600 13,495 7,050 9,000
16,590 17,300 6,450 9,758
23,400 12,500 7,050 7,871
16,100
21. Home Runs Two random samples of professional baseball players were selected and the number of home runs hit were recorded. One sample was obtained from the National ...
22. Batting Averages Random samples of batting averages from the leaders in both leagues prior to the All-Star break are shown. At the 0.05 level of significance, can a ...
Example XL9–2
Test the claim that there is no difference between population means based on these sample data. Assume the population variances are not equal. Use α = 0.05.
Set A 32 38 37 36 36 34 39 36 37 42
Set B 30 36 35 36 31 34 37 33 32
1. Enter the 10-number data set A into column A.
2. Enter the 9-number data set B into column B.
3. Select the Data tab from the toolbar. Then select Data Analysis.
4. In the Data Analysis box, under Analysis Tools select t-test: Two-Sample Assuming Unequal Variances, and click [OK].
5. In Input, type in the Variable 1 Range: A1:A10 and the Variable 2 Range: B1:B9.
6. Type 0 for the Hypothesized Mean Difference.
7. Type 0.05 for Alpha.
8. In Output options, type D7 for the Output Range, then click [OK].
Note: You may need to increase the column width to see all the results. To do this:
1. Highlight the columns D, E, and F.
2. Under the Home tab, select Format>AutoFit Column Width.
The output reports both one- and two-tailed P-values.
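For readers without Excel, the test value for this example can be cross-checked with a few lines of Python using only the standard library. This sketch computes the unequal-variance t statistic from the raw data; it does not reproduce Excel's full output table or its P-values, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

set_a = [32, 38, 37, 36, 36, 34, 39, 36, 37, 42]
set_b = [30, 36, 35, 36, 31, 34, 37, 33, 32]

# Sample means and sample variances (variance() divides by n - 1)
mean_a, mean_b = mean(set_a), mean(set_b)
var_a, var_b = variance(set_a), variance(set_b)

# Standard error of the difference, without pooling the variances
standard_error = sqrt(var_a / len(set_a) + var_b / len(set_b))

# Hypothesized mean difference is 0, as entered in step 6
t_stat = (mean_a - mean_b - 0) / standard_error

print(f"mean A = {mean_a:.3f}, mean B = {mean_b:.3f}")
print(f"variance A = {var_a:.3f}, variance B = {var_b:.3f}")
print(f"t = {t_stat:.3f}")
```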
9–3 Testing the Difference Between Two Means: Dependent Samples
Besides samples in which the same subjects are used in a pre-post situation, there are
other cases where the samples are considered dependent. For example, students might
be matched or paired according to some variable that is pertinent to the study; then one
student is assigned to one group, and the other student is assigned to a second group. For
instance, in a study involving learning, students can be selected and paired according
to their IQs. That is, two students with the same IQ will be paired. Then one will be as-
signed to one sample group (which might receive instruction by computers), and the other
student will be assigned to another sample group (which might receive instruction by the
lecture discussion method). These assignments will be done randomly. Since a student’s
IQ is important to learning, it is a variable that should be controlled. By matching subjects
on IQ, the researcher can eliminate the variable’s influence, for the most part. Matching,
then, helps to reduce type II error by eliminating extraneous variables.
Two notes of caution should be mentioned. First, when subjects are matched according
to one variable, the matching process does not eliminate the influence of other variables.
Matching students according to IQ does not account for their mathematical ability or their
familiarity with computers. Since not all variables influencing a study can be controlled, it
is up to the researcher to determine which variables should be used in matching. Second,
when the same subjects are used for a pre-post study, sometimes the knowledge that they
are participating in a study can influence the results. For example, if people are placed in
a special program, they may be more highly motivated to succeed simply because they
have been selected to participate; the program itself may have little effect on their success.
When the samples are dependent, a special t test for dependent means is used. This
test employs the difference in values of the matched pairs. The hypotheses are as follows:
H 0: μD = 0 H 0: μD = 0 H 0: μD = 0
Two-tailed Left-tailed Right-tailed
H 1: μD ≠ 0 H 1: μD < 0 H 1: μD > 0
Here, μD is the symbol for the expected mean of the difference of the matched pairs. The
general procedure for finding the test value involves several steps.
First, find the differences of the values of the pairs of data.
$$D = X_1 - X_2$$
Second, find the mean $\bar{D}$ of the differences, using the formula
$$\bar{D} = \frac{\Sigma D}{n}$$
where n is the number of data pairs. Third, find the standard deviation $s_D$ of the differences, using the formula
$$s_D = \sqrt{\frac{n\Sigma D^2 - (\Sigma D)^2}{n(n - 1)}}$$
Fourth, find the estimated standard error $s_{\bar{D}}$ of the differences, which is
$$s_{\bar{D}} = \frac{s_D}{\sqrt{n}}$$
Finally, find the test value, using the formula
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}}$$
This formula follows the format
$$\text{Test value} = \frac{\text{observed value} - \text{expected value}}{\text{standard error}}$$
where the observed value is the mean of the differences. The expected value μD is zero if the hypothesis is μD = 0. The standard error of the difference is the standard deviation of the difference, divided by the square root of the sample size. Both populations must be normally or approximately normally distributed.
Before you can use the testing method presented in this section, the following
assumptions must be met.
Assumptions for the t Test for Two Means When the Samples Are Dependent
In this book, the assumptions will be stated in the exercises; however, when encountering
statistics in other situations, you must check to see that these assumptions have been met
before proceeding.
The formulas for this t test are given next.
Formulas for the t Test for Dependent Samples
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}}$$
with d.f. = n − 1 and where
$$\bar{D} = \frac{\Sigma D}{n} \quad\text{and}\quad s_D = \sqrt{\frac{n\Sigma D^2 - (\Sigma D)^2}{n(n - 1)}}$$
The steps for this t test are summarized in the Procedure Table.
Procedure Table
Testing the Difference Between Means for Dependent Samples
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s).
Step 3 Compute the test value.
a. Make a table, as shown.
X1    X2    A: D = X1 − X2    B: D² = (X1 − X2)²
⋮     ⋮     ΣD =              ΣD² =
b. Find the differences and place the results in column A.
$$D = X_1 - X_2$$
c. Find the mean of the differences.
$$\bar{D} = \frac{\Sigma D}{n}$$
d. Square the differences and place the results in column B. Complete the table.
$$D^2 = (X_1 - X_2)^2$$
e. Find the standard deviation of the differences.
$$s_D = \sqrt{\frac{n\Sigma D^2 - (\Sigma D)^2}{n(n - 1)}}$$
f. Find the test value.
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}} \quad\text{with d.f.} = n - 1$$
Step 4 Make the decision.
Step 5 Summarize the results.
Unusual Stat: ... spend at least one night in jail each year.
A random sample of nine local banks shows their deposits (in billions of dollars) 3 years ago and their deposits (in billions of dollars) today. At α = 0.05, can it be concluded ...
Bank          1      2     3     4     5     6     7     8     9
3 years ago   11.42  8.41  3.98  7.37  2.28  1.10  1.00  0.90  1.35
Today         16.69  9.44  6.53  5.58  2.92  1.88  1.78  1.50  1.22
SOLUTION
Step 1 State the hypotheses and identify the claim. Since we want to see whether deposits have increased, the deposits 3 years ago must be significantly less than the deposits today. Hence, the mean of the differences D = X1 − X2 must be less than zero.
H0: μD = 0 and H1: μD < 0 (claim)
Step 2 Find the critical value. The degrees of freedom are n − 1 = 9 − 1 = 8. The critical value for a left-tailed test with α = 0.05 is −1.860.
3 years A B
ago (X1) Today (X2) D = X1 – X2 D2 = (X1 – X2)2
11.42 16.69
8.41 9.44
3.98 6.53
7.37 5.58
2.28 2.92
1.10 1.88
1.00 1.78
0.90 1.50
1.35 1.22
9–24
Section 9–3 Testing the Difference Between Two Means: Dependent Samples 511
d. Square the differences and place the results in column B.
(−5.27)² = 27.7729
(−1.03)² = 1.0609
(−2.55)² = 6.5025
(+1.79)² = 3.2041
(−0.64)² = 0.4096
(−0.78)² = 0.6084
(−0.78)² = 0.6084
(−0.60)² = 0.3600
(+0.13)² = 0.0169
ΣD² = 40.5437
The completed table is shown next.
3 years ago (X1)   Today (X2)   A: D = X1 − X2   B: D² = (X1 − X2)²
11.42              16.69        −5.27            27.7729
 8.41               9.44        −1.03             1.0609
 3.98               6.53        −2.55             6.5025
 7.37               5.58        +1.79             3.2041
 2.28               2.92        −0.64             0.4096
 1.10               1.88        −0.78             0.6084
 1.00               1.78        −0.78             0.6084
 0.90               1.50        −0.60             0.3600
 1.35               1.22        +0.13             0.0169
                                ΣD = −9.73        ΣD² = 40.5437
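$$\bar{D} = \frac{\Sigma D}{n} = \frac{-9.73}{9} = -1.081$$
$$s_D = \sqrt{\frac{n\Sigma D^2 - (\Sigma D)^2}{n(n-1)}} = \sqrt{\frac{9(40.5437) - (-9.73)^2}{9(9-1)}} = \sqrt{\frac{270.2204}{72}} = 1.937$$
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}} = \frac{-1.081 - 0}{1.937/\sqrt{9}} = -1.674$$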
Step 4 Make the decision. Do not reject the null hypothesis since the test value,
−1.674, is greater than the critical value, −1.860. See Figure 9–6.
[Figure 9–6: t curve showing the critical value −1.860 and the test value −1.674]
Step 5 Summarize the results. There is not enough evidence to show that the
deposits have increased over the last 3 years.
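The arithmetic in the bank deposit example can be checked with a short script. The following is a minimal sketch in Python using only the standard library; the function name `paired_t` and the variable names are illustrative choices, not part of the text. It applies the shortcut formula for sD shown above, with D = X1 − X2, and compares the test value with the critical value −1.860 quoted in the solution.

```python
import math

def paired_t(x1, x2, mu_d=0.0):
    """Return (mean difference, s_D, t) for dependent samples, with D = X1 - X2."""
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    sum_d = sum(diffs)                      # column A total
    sum_d2 = sum(d * d for d in diffs)      # column B total
    d_bar = sum_d / n
    # Shortcut formula for the standard deviation of the differences
    s_d = math.sqrt((n * sum_d2 - sum_d ** 2) / (n * (n - 1)))
    t = (d_bar - mu_d) / (s_d / math.sqrt(n))
    return d_bar, s_d, t

three_years_ago = [11.42, 8.41, 3.98, 7.37, 2.28, 1.10, 1.00, 0.90, 1.35]
today = [16.69, 9.44, 6.53, 5.58, 2.92, 1.88, 1.78, 1.50, 1.22]

d_bar, s_d, t = paired_t(three_years_ago, today)
critical_value = -1.860          # left-tailed, d.f. = 8, alpha = 0.05 (from the text)
reject_h0 = t < critical_value   # left-tailed decision rule

print(f"mean D = {d_bar:.3f}, s_D = {s_d:.3f}, t = {t:.3f}, reject H0: {reject_h0}")
```

The optional `mu_d` argument covers the case where a specific difference is hypothesized; leaving it at 0 corresponds to H0: μD = 0.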
table. (Cholesterol level is measured in milligrams per deciliter.) Can it be concluded
that the cholesterol level has been changed at α = 0.10? Assume the variable is approximately
normally distributed.
Subject 1 2 3 4 5 6
Before (X1) 210 235 208 190 172 244
After (X2) 190 170 210 188 173 228
SOLUTION
Step 1 State the hypotheses and identify the claim. If the diet is effective, the
before cholesterol levels should be different from the after levels.
H0: μD = 0 and H1: μD ≠ 0 (claim)
Step 2 Find the critical value. The degrees of freedom are 6 − 1 = 5. At α = 0.10,
the critical values are ±2.015.
Before (X1)   After (X2)   A: D = X1 − X2   B: D² = (X1 − X2)²
210           190
235           170
208           210
190           188
172           173
244           228
$$\bar{D} = \frac{\Sigma D}{n} = \frac{100}{6} = 16.7$$
(20)² = 400
(65)² = 4225
(−2)² = 4
(2)² = 4
(−1)² = 1
(16)² = 256
ΣD² = 4890
Before (X1)   After (X2)   A: D = X1 − X2   B: D² = (X1 − X2)²
210           190           20               400
235           170           65              4225
208           210           −2                 4
190           188            2                 4
172           173           −1                 1
244           228           16               256
                            ΣD = 100         ΣD² = 4890
$$s_D = \sqrt{\frac{n\Sigma D^2 - (\Sigma D)^2}{n(n-1)}} = \sqrt{\frac{6 \cdot 4890 - 100^2}{6(6-1)}} = \sqrt{\frac{29{,}340 - 10{,}000}{30}} = 25.4$$
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}} = \frac{16.7 - 0}{25.4/\sqrt{6}} = 1.610$$
Step 4 Make the decision. The decision is to not reject the null hypothesis, since the
test value 1.610 is in the noncritical region, as shown in Figure 9–7.
FIGURE 9–7 Critical and Test Values for Example 9–7
Step 5 Summarize the results. There is not enough evidence to support the claim
that the mineral changes a person’s cholesterol level.
The P-values for the t test are found in Table F. For a two-tailed test with d.f. = 5,
the test value t = 1.610 falls between the table values 1.476 and 2.015; hence,
0.10 < P-value < 0.20. Since the P-value is greater than α = 0.10, the null hypothesis
cannot be rejected.
If a specific difference is hypothesized, this formula should be used
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}}$$
where μD is the hypothesized difference.
SPEAKING OF STATISTICS Can Video Games Save Lives?
Can playing video games help doctors perform surgery?
The answer is yes. A study showed that surgeons who
played video games for at least 3 hours each week made
about 37% fewer mistakes and finished operations 27%
faster than those who did not play video games.
The type of surgery that they performed is called
laparoscopic surgery, where the surgeon inserts a
tiny video camera into the body and uses a joystick to
maneuver the surgical instruments while watching the
results on a television monitor. This study compares two
groups and uses proportions. What statistical test do
you think was used to compare the percentages? (See
Section 9–4.)
For example, if a dietitian claims that people on a specific diet will lose an average of
3 pounds in a week, the hypotheses are
H0: μD = 3 and H1: μD ≠ 3
The value 3 will be substituted in the test statistic formula for μD.
Confidence intervals can be found for the mean differences with this formula.
EXAMPLE 9–8
Find the 90% confidence interval for the data in Example 9–7.
SOLUTION
Since 0 is contained in the interval, the decision is to not reject the null hypothesis
H0: μD = 0. Hence, there is not enough evidence to support the claim that the mineral
changes a person’s cholesterol level.
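The interval for Example 9–8 can be reproduced numerically. The sketch below assumes the usual form of the confidence interval for the mean difference, D̄ − t(α∕2)·sD∕√n < μD < D̄ + t(α∕2)·sD∕√n with d.f. = n − 1; that form is an assumption here, since the formula itself is not reproduced above. The critical value 2.015 is the one used in Example 9–7. Because the script carries unrounded values of D̄ and sD, its endpoints can differ in the second decimal place from a hand calculation that uses 16.7 and 25.4.

```python
import math
import statistics

before = [210, 235, 208, 190, 172, 244]
after = [190, 170, 210, 188, 173, 228]
diffs = [x1 - x2 for x1, x2 in zip(before, after)]   # D = X1 - X2

n = len(diffs)
d_bar = statistics.mean(diffs)      # mean of the differences
s_d = statistics.stdev(diffs)       # sample standard deviation of the differences
t_crit = 2.015                      # t for a 90% interval, d.f. = 5 (from Example 9-7)

margin = t_crit * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
contains_zero = lower < 0 < upper

print(f"{lower:.2f} < mu_D < {upper:.2f}; contains 0: {contains_zero}")
```

An interval that contains 0 agrees with the two-tailed test decision not to reject H0: μD = 0 at α = 0.10.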
Exercises 9–3
1. Classify each as independent or dependent samples.
a. Heights of identical twins
b. Test scores of the same students in English and psychology
c. The effectiveness of two different brands of aspirin on two different groups of people
d. Effects of a drug on reaction time of two different groups of people, measured by a before-and-after test
Use the traditional method of hypothesis testing unless otherwise specified.
2. Retention Test Scores A random sample of non-English majors at a selected college was used in a
study to see if the student retained more from reading a 19th-century novel or by watching it in DVD form.
Each student was assigned one novel to read and a different one to watch, and then they were given a
the seminar. At α = 0.10, did attending the seminar increase the number of hours the students studied
per week?
Before 9 12 6 15 3 18 10 13 7
After 9 17 9 20 2 21 15 22 6
4. Obstacle Course Times An obstacle course was set up on a campus, and 8 randomly selected volunteers
were given a chance to complete it while they were being timed. They then sampled a new energy drink
and were given the opportunity to run the course again. The “before” and “after” times in seconds are
shown. Is there sufficient evidence at α = 0.05 to conclude that the
can it be concluded that the number of errors has been reduced?
Student 1 2 3 4 5 6
canned green beans. Six overweight dogs were randomly selected from her practice and were put on this
program. Their initial weights were recorded, and they were weighed again after 4 weeks. At the 0.05 level
of significance, can it be concluded that the dogs lost weight?
Before 42 53 48 65 40 52
After 39 45 40 58 42 47
9. Pulse Rates of Identical Twins A researcher wanted to compare the pulse rates of identical twins to see
whether there was any difference. Eight sets of twins were ran-
12. Mistakes in a Song A random sample of six music students played a short song, and the number of
mistakes in music each student made was recorded. After they practiced the song 5 times, the number of
mistakes each
6. Move the cursor to Data and press ENTER.
7. Type in the appropriate values, using 0 for μ0 and L3 for the list.
Set A 33 35 28 29 32 34 30 34
Set B 27 29 36 34 30 29 28 24
5. In Input, type in the Variable 1 Range: A1:A8 and the Variable 2 Range: B1:B8.
6. Type 0 for the Hypothesized Mean Difference.
7. Type 0.05 for Alpha.
8. In Output options, type D5 for the Output Range, then click [OK].
Note: You may need to increase the column width to see all the results. To do this:
1. Highlight the columns D, E, and F.
2. Under the Home tab, select Format>AutoFit Column Width.
The output shows a P-value of 0.3253988 for the two-tailed case. This value is greater than the
alpha level of 0.05, so we fail to reject the null hypothesis.
After 2 weeks of regular training, supplemented with the vitamin, they are tested again. Test
the effectiveness of the vitamin regimen at α = 0.05. Each value in these data represents the
maximum number of pounds the athlete can bench-press. Assume that the variable is approximately
normally distributed.
Athlete 1 2 3 4 5 6 7 8
Before (X1) 210 230 182 205 262 253 219 216
After (X2) 219 236 179 204 270 250 222 216
Section 9–4 Testing the Difference Between Proportions 519
             N     Mean    StDev   SE Mean
After        8   224.50    27.91      9.87
Difference   8    −2.38     4.84      1.71
Since the P-value is 0.104, do not reject the null hypothesis. The sample difference of −2.38 in
the strength measurement is not statistically significant.
n = sample size
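When you are testing the difference between two population proportions p1 and p2, the
hypotheses can be stated as follows when no specific difference is hypothesized:
H0: p1 = p2 or H0: p1 − p2 = 0
H1: p1 ≠ p2 or H1: p1 − p2 ≠ 0
Similar statements using < or > in the alternate hypothesis can be formed for one-tailed
tests.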
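For two proportions, p̂1 = X1∕n1 is used to estimate p1 and p̂2 = X2∕n2 is used to
estimate p2. The standard error of the difference is
$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
Since p1 and p2 are unknown, a weighted estimate of p is used:
$$\bar{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$
and q̄ = 1 − p̄. This weighted estimate is based on the hypothesis that p1 = p2. Since
p̂1 = X1∕n1 and p̂2 = X2∕n2, p̄ can be simplified to
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$
Finally, the standard error of the difference in terms of the weighted estimate is
$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$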
In this book, the assumptions will be stated in the exercises; however, when encountering
statistics in other situations, you must check to see that these assumptions have been met
before proceeding.
The hypothesis-testing procedure used here follows the five-step procedure presented
previously except that p̂1, p̂2, p̄, and q̄ must be computed.
In the nursing home study, 12 out of 34 randomly selected small nursing homes had a resident
vaccination rate of less than 80%, while 17 out of 24 randomly selected large nursing
homes had a vaccination rate of less than 80%. At α = 0.05, test the claim that there is
no difference in the proportions of the small and large nursing homes with a resident
vaccination rate of less than 80%.
Source: Nancy Arden, Arnold S. Monto, and Suzanne E. Ohmit, “Vaccine Use and the Risk of Outbreaks in a Sample of Nursing
Homes During an Influenza Epidemic,” American Journal of Public Health.
SOLUTION
Step 1 State the hypotheses and identify the claim.
H0: p1 = p2 (claim) and H1: p1 ≠ p2
Step 2 Find the critical values. Since α = 0.05, the critical values are +1.96 and −1.96.
Step 3 Compute the test value. First compute p̂1, p̂2, p̄, and q̄. Then substitute in the
formula.
Let pˆ 1 be the proportion of the small nursing homes with a vaccination rate of less
than 80% and pˆ 2 be the proportion of the large nursing homes with a vaccination rate of
less than 80%. Then
$$\hat{p}_1 = \frac{X_1}{n_1} = \frac{12}{34} = 0.35 \qquad \hat{p}_2 = \frac{X_2}{n_2} = \frac{17}{24} = 0.71$$
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{12 + 17}{34 + 24} = \frac{29}{58} = 0.5$$
$$\bar{q} = 1 - \bar{p} = 1 - 0.5 = 0.5$$
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(0.35 - 0.71) - 0}{\sqrt{(0.5)(0.5)\left(\frac{1}{34} + \frac{1}{24}\right)}} = \frac{-0.36}{0.1333} = -2.70$$
Step 4 Make the decision. Reject the null hypothesis, since −2.70 < −1.96.
See Figure 9–8.
FIGURE 9–8 Critical and Test Values for Example 9–9
Step 5 Summarize the results. There is enough evidence to reject the claim that
there is no difference in the proportions of small and large nursing homes
with a resident vaccination rate of less than 80%.
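The computation in Example 9–9 can be sketched in a few lines of Python. The function name `two_proportion_z` is an illustrative choice, and the two-tailed P-value is obtained from the standard normal distribution in Python's `statistics` module rather than from Table E. Because the script does not round p̂1 and p̂2 to two decimal places, it gives z ≈ −2.67 instead of the hand-computed −2.70; the decision is the same.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z test value for H0: p1 = p2, using the weighted (pooled) estimate p-bar."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    std_error = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / std_error
    return z, std_error

# Small homes: 12 of 34; large homes: 17 of 24 (vaccination rate below 80%)
z, std_error = two_proportion_z(12, 34, 17, 24)
p_value = 2 * NormalDist().cdf(-abs(z))   # two-tailed P-value
reject_h0 = abs(z) > 1.96                 # critical values ±1.96 at alpha = 0.05

print(f"SE = {std_error:.4f}, z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```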
In a sample of 50 burglaries, 16% of the criminals were arrested. In a random sample of
50 car thefts, 12% of the criminals were arrested. At α = 0.10, can it be concluded that
the percentage of people who committed burglaries and were arrested was greater than the
percentage of people who committed car thefts and were arrested?
SOLUTION
Step 2 Find the critical value, using Table E. At α = 0.10 the critical value is 1.28.
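Step 3 Compute the test value. Since percentages are given, you need to compute p̄
and q̄.
X1 = p̂1n1 = 0.16(50) = 8
X2 = p̂2n2 = 0.12(50) = 6
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{8 + 6}{50 + 50} = \frac{14}{100} = 0.14$$
$$\bar{q} = 1 - \bar{p} = 1 - 0.14 = 0.86$$
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(0.16 - 0.12) - 0}{\sqrt{(0.14)(0.86)\left(\frac{1}{50} + \frac{1}{50}\right)}} = \frac{0.04}{0.0694} = 0.58$$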
Step 4 Make the decision. Do not reject the null hypothesis since 0.58 < 1.28.
That is, 0.58 falls in the noncritical region. See Figure 9–9.
[Figure 9–9: z curve showing the test value 0.58 and the critical value 1.28]
Step 5 Summarize the results. There is not enough evidence to support the
claim that the percentage of people who are arrested for burglaries is
greater than the percentage of people who are arrested who committed
car thefts.
The P-value for the difference of proportions can be found from Table E as shown
in Section 9–1. In Example 9–10, the table value for 0.58 is 0.7190, and 1 − 0.7190 =
0.2810. Hence, 0.2810 > 0.10; thus the decision is to not reject the null hypothesis.
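The table lookup can be checked with the standard normal distribution in Python's `statistics` module. This is only a sketch of the same right-tailed calculation; the variable names are illustrative.

```python
from statistics import NormalDist

z = 0.58        # test value from Example 9-10
alpha = 0.10    # significance level stated in the example

area_to_left = NormalDist().cdf(z)   # corresponds to the Table E entry for 0.58
p_value = 1 - area_to_left           # right-tailed test
reject_h0 = p_value < alpha

print(f"area = {area_to_left:.4f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```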
The sampling distribution of the difference of two proportions can be used to
construct a confidence interval for the difference of two proportions. The formula for the
confidence interval for the difference between two proportions is shown next.
Here, the confidence interval uses a standard deviation based on estimated values
of the population proportions, but the hypothesis test uses a standard deviation based
on the assumption that the two population proportions are equal. As a result, you may
obtain different conclusions when using a confidence interval or a hypothesis test. So
when testing for a difference of two proportions, you use the z test rather than the
confidence interval.
EXAMPLE 9–11
Find the 95% confidence interval for the difference of proportions for the data in
Example 9–9.
SOLUTION
$$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
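A minimal Python sketch of this interval, applied to the nursing home counts from Example 9–9 (12 of 34 and 17 of 24), is given below. The function name `proportion_difference_ci` is hypothetical. The value 1.96 is the z value for 95% confidence, and the standard error uses the separate sample estimates rather than the weighted estimate p̄, as the confidence interval formula requires. Unrounded sample proportions are used, so the endpoints may differ slightly from a hand calculation with 0.35 and 0.71.

```python
import math

def proportion_difference_ci(x1, n1, x2, n2, z_crit):
    """Confidence interval for p1 - p2 using unpooled sample proportions."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat
    std_error = math.sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)
    diff = p1_hat - p2_hat
    return diff - z_crit * std_error, diff + z_crit * std_error

lower, upper = proportion_difference_ci(12, 34, 17, 24, z_crit=1.96)
contains_zero = lower < 0 < upper

print(f"{lower:.3f} < p1 - p2 < {upper:.3f}; contains 0: {contains_zero}")
```

An interval that excludes 0 is consistent with rejecting H0: p1 = p2 in the z test, although the two procedures use different standard errors and need not always agree.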
Exercises 9–4
1. Find the proportions p̂ and q̂ for each.
a. n = 52, X = 32
b. n = 80, X = 66
c. n = 36, X = 12
d. n = 42, X = 7
e. n = 160, X = 50
2. Find p̂ and q̂ for each.
a. n = 36, X = 20
b. n = 50, X = 35
c. n = 64, X = 16
d. n = 200, X = 175
e. n = 148, X = 16
At α = 0.10, is there a difference in the proportions?
a. pˆ = 0.60, n = 240
3. Find each X, given pˆ .
Find the 90% confidence interval for the difference of the
pˆ = 0.20, n = 320 two proportions. Does the confidence interval contain 0?
b. X1 = 9, n1 = 15, X2 = 7, n2 = 20
Philadelphia, in a random sample of 80 mail carriers,
b. X1 = 21, n1 = 100, X2 = 43, n2 = 150 the claim that fewer household owners have cats than
c. X1 = 20, n1 = 80, X2 = 65, n2 = 120 household owners who have dogs as pets.
d. X1 = 15, n1 = 50, X2 = 3, n2 = 12
e. X1 = 24, n1 = 40, X2 = 18, n2 = 36
7. Lecture versus Computer-Assisted Instruction A survey found that 83% of the men questioned
preferred computer-assisted instruction to lecture and 75% of the women preferred computer-assisted
test the claim that there is no difference in the proportion of men and the proportion of women who
favor computer-assisted instruction over lecture. Find the 95% confidence interval for the difference
of the two proportions.
8. Leisure Time In a sample of 150 men, 132 said that they had less leisure time today than they had
10 years ago. In a random sample of 250 women, 240 women said that they had less leisure time than
they had 10 years ago.
12. Seat Belt Use In a random sample of 200 men, 130 said they used seat belts. In a random sample of
300 women, 63 said they used seat belts. Test the claim
14. Hypertension It has been found that 26% of men 20 years and older suffer from hypertension (high
blood pressure) and 31.5% of women are hypertensive. A random sample of 150 of each gender was selected
percentage of women have high blood pressure? Use α = 0.05.
Men 43 patients had high blood pressure
Women 52 patients had high blood pressure
Source: www.nchs.gov
15. Commuters A recent random survey of 100 individuals in Michigan found that 80 drove to
work alone. A similar survey of 120 commuters in
27. Bullying Bullying is a problem at any age but especially for students aged 12 to 18. A study
showed that 7.2% of all students in this age bracket reported being bullied at school during the past
six months with 6th grade having the highest incidence at 13.9% and 12th grade the lowest at 2.2%.
To see if there is a difference between public and private schools, 200 students were randomly
selected from each. At the 0.05 level of significance, can a difference be concluded?
              Private   Public
Sample size   200       200
No. bullied   13        16
Source: www.nces.ed.gov
−0.3554 Difference
34 24 58 n
0. Hypothesized difference
0.1333 Standard error
−2.67 z
0.0077 P-value (two-tailed)
b) For the Test method, select Use the pooled estimate of the proportion.
6. Click [OK] twice. The results are shown in the session window.
The P-value of the test is 0.008. Reject the null hypothesis. The difference is statistically sig-
nificant. Of all small nursing homes 35%, compared to 71% of all large nursing homes, have an
immunization rate of less than 80%. We can’t tell why, only that there is a difference.
Section 9–5 Testing the Difference Between Two Variances 529
Figure 9–10 shows the shapes of several curves for the F distribution.
FIGURE 9–10 The F Family of Curves
1. The values of F cannot be negative, because variances are always positive or zero.
2. The distribution is positively skewed.
3. The mean value of F is approximately equal to 1.
4. The F distribution is a family of curves based on the degrees of freedom of the variance
of the numerator and the degrees of freedom of the variance of the denominator.
$$F = \frac{s_1^2}{s_2^2}$$
where the larger of the two variances is placed in the numerator regardless of the subscripts.
(See note on page 534.)
The F test has two values for the degrees of freedom: that of the numerator, n1 − 1, and
that of the denominator, n2 − 1, where n1 is the sample size from which the larger variance
was obtained.
When you are finding the F test value, the larger of the variances is placed in the
numerator of the F formula; this is not necessarily the variance of the larger of the two
sample sizes.
Table H in Appendix A gives the F critical values for α = 0.005, 0.01, 0.025, 0.05, and
0.10 (each α value involves a separate table in Table H). These are one-tailed values; if a
two-tailed test is being conducted, then the α∕2 value must be used. For example, if a two-
tailed test with α = 0.05 is being conducted, then the 0.05∕2 = 0.025 table of Table H
should be used.
EXAMPLE 9–12
Find the critical value for a right-tailed F test when α = 0.05, the degrees of free-
dom for the numerator (abbreviated d.f.N.) are 15, and the degrees of freedom for the
denominator (d.f.D.) are 21.
SOLUTION
Since this test is right-tailed with α = 0.05, use the 0.05 table. The d.f.N. is listed across
the top, and the d.f.D. is listed in the left column. The critical value is found where the
row and column intersect in the table. In this case, it is 2.18. See Figure 9–11.
FIGURE 9–11 Finding the Critical Value in Table H for Example 9–12
(α = 0.05 table: column d.f.N. = 15, row d.f.D. = 21, critical value 2.18)
As noted previously, when the F test is used, the larger variance is always placed in
the numerator of the formula. When you are conducting a two-tailed test, α is split; and
even though there are two values, only the right tail is used. The reason is that the F test
value is always greater than or equal to 1.
EXAMPLE 9–13
Find the critical value for a two-tailed F test with α = 0.05 when the sample size from
which the variance for the numerator was obtained was 21 and the sample size from which
the variance for the denominator was obtained was 12.
SOLUTION
Since this is a two-tailed test with α = 0.05, the 0.05∕2 = 0.025 table must be used.
Here, d.f.N. = 21 − 1 = 20, and d.f.D. = 12 − 1 = 11; hence, the critical value is 3.23.
See Figure 9–12.
FIGURE 9–12 Finding the Critical Value in Table H for Example 9–13
(α = 0.025 table: column d.f.N. = 20, row d.f.D. = 11, critical value 3.23)
If the exact degrees of freedom are not specified in Table H, the closest smaller value
should be used. For example, if α = 0.05 (right-tailed test), d.f.N. = 18, and d.f.D. = 20,
use the column d.f.N. = 15 and the row d.f.D. = 20 to get F = 2.20. Using the smaller
value is the more conservative approach.
When you are testing the equality of two variances, these hypotheses are used:
Right-tailed        Left-tailed         Two-tailed
H0: σ1² = σ2²       H0: σ1² = σ2²       H0: σ1² = σ2²
H1: σ1² > σ2²       H1: σ1² < σ2²       H1: σ1² ≠ σ2²
There are four key points to keep in mind when you are using the F test.
1. The larger variance should always be placed in the numerator of the formula regardless of
the subscripts. (See note on page 534.)
$$F = \frac{s_1^2}{s_2^2}$$
2. For a two-tailed test, the α value must be divided by 2 and the critical value placed on the
right side of the F curve.
3. If the standard deviations instead of the variances are given in the problem, they must be
squared for the formula for the F test.
4. When the degrees of freedom cannot be found in Table H, the closest value on the
smaller side should be used.
Unusual Stat
Of all U.S. births, 2% are twins.
Before you can use the testing method to determine the difference between two vari-
ances, the following assumptions must be met.
In this book, the assumptions will be stated in the exercises; however, when encountering
statistics in other situations, you must check to see that these assumptions have been met
before proceeding.
Remember also that in tests of hypotheses using the traditional method, these five
steps should be taken:
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value.
Step 3 Compute the test value.
Step 4 Make the decision.
Step 5 Summarize the results.
This procedure is not robust, so minor departures from normality will affect the
results of the test. So this test should not be used when the distributions depart from
normality because standard deviations are not a good measure of the spread in nonsym-
metrical distributions. The reason is that the standard deviation is not resistant to outliers
or extreme values. These values increase the value of the standard deviation when the
distribution is skewed.
The claim is that the variance of heart rates (in beats per minute) of smokers is different
from the variance of heart rates of people who do not smoke. Two samples are selected, and the
data are shown. Using α = 0.05, is there enough evidence to support the claim? Assume the
variable is normally distributed.
Smokers     Nonsmokers
n1 = 26     n2 = 18
SOLUTION
Step 2 Find the critical value. Use the 0.025 table in Table H since α = 0.05 and this
is a two-tailed test. Here, d.f.N. = 26 − 1 = 25, and d.f.D. = 18 − 1 = 17. The
critical value is 2.56.
sample of 14 engineering students and found that the standard deviation of their grade point
average was 0.51. At α = 0.01, can we conclude that the variance of the grade point
averages of the psychology graduates is greater than the variance of the grade point
averages of the engineering graduates?
SOLUTION
$$F = \frac{s_1^2}{s_2^2} = \frac{0.72^2}{0.51^2} = 1.99$$
Step 4 Make the decision. Do not reject the null hypothesis since 1.99 < 4.19. That
is, 1.99 does not fall in the critical region. See Figure 9–14.
FIGURE 9–14 Critical and Test Value for Example 9–15
Step 5 Summarize the results. There is not enough evidence to support the claim
that the variance in the grade point average of psychology graduates is greater
than the variance in the grade point average of the engineering graduates.
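Because Example 9–15 gives standard deviations rather than variances, each value must be squared before it is placed in the F formula. The sketch below repeats that calculation in Python; the variable names are illustrative, and the critical value 4.19 is the one used in Step 4.

```python
s_psychology = 0.72    # sample standard deviation placed in the numerator
s_engineering = 0.51   # sample standard deviation placed in the denominator

# Square the standard deviations to obtain the variances, then form the ratio
f_value = s_psychology ** 2 / s_engineering ** 2

critical_value = 4.19  # right-tailed critical value from the worked solution
reject_h0 = f_value > critical_value

print(f"F = {f_value:.2f}, reject H0: {reject_h0}")
```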
Finding P-values for the F test statistic is somewhat more complicated since it requires
looking through all the F tables (Table H in Appendix A) using the specific d.f.N. and
d.f.D. values. For example, suppose that a certain test has F = 3.58, d.f.N. = 5, and d.f.D.
= 10. To find the P-value interval for F = 3.58, you must first find the corresponding F
values for d.f.N. = 5 and d.f.D. = 10 for α equal to 0.005, 0.01, 0.025, 0.05, and 0.10 in
Table H. Then make a table as shown.
𝛂 0.10 0.05 0.025 0.01 0.005
F 2.52 3.33 4.24 5.64 6.87
Now locate the two F values that the test value 3.58 falls between. In this case, 3.58 falls
between 3.33 and 4.24, corresponding to 0.05 and 0.025. Hence, the P-value for a right-
tailed test for F = 3.58 falls between 0.025 and 0.05 (that is, 0.025 < P-value < 0.05).
For a right-tailed test, then, you would reject the null hypothesis at α = 0.05, but not at
α = 0.01. The P-value obtained from a calculator is 0.0408. Remember that for a two-tailed
test the values found in Table H for α must be doubled. In this case, 0.05 < P-value <
0.10 for F = 3.58. Once again, if the P-value is less than α, we reject the null hypothesis.
Once you understand the concept, you can dispense with making a table as shown
and find the P-value directly from Table H.
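The table-bracketing procedure described above can be sketched as a small Python function. The name `p_value_interval` and the dictionary layout are illustrative assumptions; the F values are the Table H entries for d.f.N. = 5 and d.f.D. = 10 listed in the table above. The function returns the pair of α values that bracket the right-tailed P-value, with `None` marking an open end.

```python
def p_value_interval(f_value, table_row):
    """Bracket a right-tailed P-value using Table H critical values.

    table_row maps each alpha to its critical F value. Returns (lower, upper):
    lower is None if F exceeds every table value (P-value < smallest alpha);
    upper is None if F is below every table value (P-value > largest alpha).
    """
    lower, upper = None, None
    for alpha in sorted(table_row, reverse=True):   # 0.10, 0.05, 0.025, ...
        if f_value > table_row[alpha]:
            upper = alpha        # F is beyond this critical value, so P-value < alpha
        else:
            lower = alpha        # F falls short of this critical value, so P-value > alpha
            break
    return lower, upper

# Table H values for d.f.N. = 5, d.f.D. = 10 (from the text)
row = {0.10: 2.52, 0.05: 3.33, 0.025: 4.24, 0.01: 5.64, 0.005: 6.87}

right_tailed = p_value_interval(3.58, row)
two_tailed = tuple(2 * a for a in right_tailed)   # table alphas are doubled

print("right-tailed:", right_tailed)
print("two-tailed:", two_tailed)
```

For F = 3.58 the right-tailed interval is 0.025 < P-value < 0.05, and doubling gives 0.05 < P-value < 0.10 for the two-tailed case, matching the discussion above.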
The hypothesis is that the variance in the number of passengers for
American airports is greater than the variance in the number of passengers for foreign
airports. At α = 0.10, is there enough evidence to support the hypothesis? The data in
millions of passengers per year are shown for selected airports. Use the P-value method.
Assume the variable is normally distributed and the samples are random and independent.
American airports     Foreign airports
36.8   73.5           60.7   51.2
72.4   61.2           42.7   38.6
60.5   40.1
Source: Airports Council International.
SOLUTION
Step 2 Compute the test value. Using the formula in Chapter 3 or a calculator, find
the variance for each group.
$$s_1^2 = 246.38 \qquad \text{and} \qquad s_2^2 = 95.87$$
$$F = \frac{s_1^2}{s_2^2} = \frac{246.38}{95.87} = 2.57$$
Step 3 Find the P-value in Table H, using d.f.N. = 6 − 1 = 5 and d.f.D. = 4 − 1 = 3.
𝛂 0.10 0.05 0.025 0.01 0.005
F 5.31 9.01 14.88 28.24 45.39
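Since 2.57 is less than 5.31, the P-value is greater than 0.10. (The P-value
obtained from a calculator is 0.234.)
Step 4 Make the decision. Do not reject the null hypothesis, since the P-value is
greater than α = 0.10.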
Step 5 Summarize the results. There is not enough evidence to support the claim
that the variance in the number of passengers for American airports is
greater than the variance in the number of passengers for foreign airports.
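The variances and the F test value in this example can be verified with Python's `statistics` module. The sketch below is illustrative only. It compares the test value with 5.31, the smallest Table H entry listed in Step 3, to confirm that the P-value exceeds 0.10; it does not compute an exact P-value.

```python
from statistics import variance

american = [36.8, 73.5, 72.4, 61.2, 60.5, 40.1]   # millions of passengers per year
foreign = [60.7, 51.2, 42.7, 38.6]

s1_sq = variance(american)    # sample variance, larger value goes in the numerator
s2_sq = variance(foreign)
f_value = s1_sq / s2_sq

df_num = len(american) - 1    # d.f.N.
df_den = len(foreign) - 1     # d.f.D.

f_at_alpha_010 = 5.31         # Table H value for alpha = 0.10 (from Step 3)
p_value_above_010 = f_value < f_at_alpha_010

print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}, F = {f_value:.2f}")
print(f"d.f.N. = {df_num}, d.f.D. = {df_den}, P-value > 0.10: {p_value_above_010}")
```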
Note: It is not absolutely necessary to place the larger variance in the numerator when
you are performing the F test. Critical values for left-tailed hypotheses tests can be found
by interchanging the degrees of freedom and taking the reciprocal of the value found in
Table H.
Also, you should use caution when performing the F test since the data can run
not be performed and you would not reject the null hypothesis.
3. Is there a significant difference in the variability in the prices between the German cars and
the U.S. cars?
4. What effect does a small sample size have on the standard deviations?
5. What degrees of freedom are used for the statistical test?
6. Could two sets of data have significantly different variances without having significantly dif-
ferent means?
See page 545 for the answers.
Exercises 9–5
1. When one is computing the F test value, what condi- 8. Using Table H, find the P-value interval for each F test
3. What are the two different degrees of freedom associ- For Exercises 9 through 24, perform the following steps.
ated with the F distribution? Assume that all variables are normally distributed.
a. State the hypotheses and identify the claim.
4. What are the characteristics of the F distribution?
b. Find the critical value.
Two-tailed α = 0.05
e. Summarize the results.
Right-tailed α = 0.01
packs? Random samples of packs were selected for each
area, and the numbers of pups per pack were recorded.
At the 0.05 level of significance, can a difference in
21 = 27.3, n1 = 5
6. Using Table H, find the critical value for each. variances be concluded?
Right-tailed, α = 0.01
wolf packs 3 1 7 6 5
21 = 164, n1 = 21
Idaho 2 4 5 4 2 4 6 3
Two-tailed, α = 0.10
Source: www.fws.gov
21 = 92.8, n1 = 11
Sample 2: s22 = 43.6, n2 = 11
c. Sample 1: s 10. Noise Levels in Hospitals In a hospital study, it was
Right-tailed, α = 0.05
found that the standard deviation of the sound levels
from 20 randomly selected areas designated as “casualty
doors” was 4.1 dBA and the standard deviation of 24
11. Calories in Ice Cream The numbers of calories contained in 1/2-cup servings of randomly selected
flavors of ice cream from two national brands are listed. At the 0.05 level of significance, is there
sufficient evidence to conclude that the variance in the number of calories differs between the two brands?
Brand A Brand B
330 300 280 310
310 350 300 370
270 380 250 300
310 300 290 310
12. Winter Temperatures A random sample of daily high
a difference between the variances of the two groups exists?
Research Primary care
30,897 34,280 31,943 26,068 21,044 30,897
34,294 31,275 29,590 34,208 20,877 29,691
20,618 20,500 29,310 33,783 33,065 35,000
21,274 27,297
Source: U.S. News & World Report Best Graduate Schools.
16. County Size in Indiana and Iowa A researcher wishes to see if the variance of the areas in square
miles for counties in Indiana is less than the variance of the areas
concluded that the variance of the areas for counties in
is greater than the variation in the salaries of secondary school teachers. A random sample of the
salaries of 30 elementary school teachers has a variance of 8324,
23. Test Scores An instructor who taught an online statistics course and a classroom course feels
that the variance of the final exam scores for the students
n1 = 11 n2 = 16
Two random samples of pet owners who own dogs are selected. Sample 1 of 13 dog owners was selected
from owners who live in Miami. The standard deviation of the ages of the dogs in this sample is
1.3 years. Sample 2 of 8 dog owners was selected from dog owners who
24. Museum Attendance A metropolitan children’s museum open year-round wants to see if the variance
Portfolio A 36.44 44.21 12.21 59.60 55.44 39.42 51.29 48.68 41.59 19.49
Portfolio B 32.69 47.25 49.35 36.17 63.04 17.74 4.23 34.98 37.02 31.48
Source: Washington Observer-Reporter.
Example XL9–4
At α = 0.05, test the hypothesis that the two population variances are equal, using the sample
Set A 63 73 80 60 86 83 70 72 82
Set B 86 93 64 82 81 75 88 63 63
select (sample 1 variance) / (sample 2 variance) and change the confidence level to 90. The
hypothesized ratio should be 1. For the Alternative hypothesis, select Ratio > hypothesized
ratio. Check the box for Use test and confidence intervals based on normal distribution.
3. Click [OK] twice. A graph window will open that includes a small window that says
the P-value is 0.234. In the session window, the F-test statistic is shown as the Ratio of
variances = 2.570. You can view the session window by closing the graph or clicking and
Summary
Many times researchers are interested in comparing two parameters such as two means, two proportions, or two variances. These measures are obtained from two samples, then compared using a z test, t test, or an F test.
• If two sample means are compared, when the samples are independent and the population standard deviations are known, a z test is used. If the sample sizes are less than 30, the populations should be normally distributed. (9–1)
• If two means are compared when the samples are independent and the sample standard deviations are used, then a t test is used. The two variances are assumed to be unequal. (9–2)
• When the two samples are dependent or related, such as using the same subjects and comparing the means of before-and-after tests, then the t test for dependent samples is used. (9–3)
• Two proportions can be compared by using the z test for proportions. In this case, each of n1p1, n1q1, n2p2, and n2q2 must be 5 or more. (9–4)
• Two variances can be compared by using an F test. The critical values for the F test are obtained from the F distribution. (9–5)
• Confidence intervals for differences between two parameters can also be found.
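The sample-size requirement for the z test for proportions (9–4) can be checked before any test value is computed. The short Python sketch below is only an illustration: the function name and the counts in the usage line are hypothetical, and it assumes the sample proportions are used, so that n·p̂ is simply the observed number of successes X and n·q̂ is n − X.

```python
def proportions_z_test_ok(x1, n1, x2, n2, minimum=5):
    """Check the requirement for the two-proportion z test.

    x1, x2 are the observed numbers of successes and n1, n2 the sample
    sizes (hypothetical argument names). Because n * p-hat = x and
    n * q-hat = n - x, the four products reduce to four counts.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= minimum for count in counts)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the chapter.
    print(proportions_z_test_ok(12, 40, 9, 35))
```

Working with the counts rather than multiplying n by a rounded proportion avoids small rounding errors near the cutoff of 5.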
Important Terms
dependent samples 507
F distribution 529
F test 528
independent samples 499
pooled estimate of the variance 502
Important Formulas
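Formula for the z test for comparing two means from independent populations; σ1 and σ2 are known:
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Formula for the confidence interval for the difference of two means when σ1 and σ2 are known:
$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Formula for the t test for comparing two means (independent samples, variances not equal); σ1 and σ2 are unknown:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
and d.f. = smaller of n1 − 1 and n2 − 1.

Formula for the confidence interval for the difference of two means (independent samples, variances unequal):
$$(\bar{X}_1 - \bar{X}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
and d.f. = smaller of n1 − 1 and n2 − 1.

Formula for the t test for comparing two means from dependent samples:
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}}$$
where $\bar{D}$ is the mean of the differences
$$\bar{D} = \frac{\Sigma D}{n}$$
and sD is the standard deviation of the differences
$$s_D = \sqrt{\frac{n\,\Sigma D^2 - (\Sigma D)^2}{n(n - 1)}}$$
d.f. = n − 1

Formula for the z test for comparing two proportions:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} \qquad \bar{q} = 1 - \bar{p} \qquad \hat{p}_1 = \frac{X_1}{n_1} \qquad \hat{p}_2 = \frac{X_2}{n_2}$$

Formula for confidence interval for the difference of two proportions:
$$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Formula for the F test for comparing two variances:
$$F = \frac{s_1^2}{s_2^2} \qquad \text{d.f.N.} = n_1 - 1 \qquad \text{d.f.D.} = n_2 - 1$$
The larger variance is placed in the numerator.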
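The test-value formulas above translate into a few lines of code. The Python sketch below is an illustration under stated assumptions, not a definitive implementation: the function and argument names are hypothetical, it uses only the standard library, and it computes test values and degrees of freedom without looking up critical values or P-values.

```python
import math


def z_two_means(xbar1, xbar2, sigma1, sigma2, n1, n2, diff0=0.0):
    """z test value for two independent means; sigma1 and sigma2 known."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - diff0) / se


def t_two_means(xbar1, xbar2, s1, s2, n1, n2, diff0=0.0):
    """t test value (variances not assumed equal) and its d.f."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t = ((xbar1 - xbar2) - diff0) / se
    return t, min(n1 - 1, n2 - 1)  # d.f. = smaller of n1 - 1 and n2 - 1


def t_dependent(first, second, mu_d=0.0):
    """t test value and d.f. for dependent (paired) samples."""
    d = [a - b for a, b in zip(first, second)]  # differences D
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt((n * sum(x * x for x in d) - sum(d) ** 2) / (n * (n - 1)))
    return (d_bar - mu_d) / (s_d / math.sqrt(n)), n - 1


def z_two_proportions(x1, n1, x2, n2):
    """z test value for two proportions under H0: p1 = p2."""
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    se = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


def f_two_variances(var1, n1, var2, n2):
    """F test value with the larger variance in the numerator.

    Returns (F, d.f.N., d.f.D.).
    """
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the chapter.
    print(z_two_means(10, 8, 3, 4, 9, 16))
    print(t_dependent([10, 12, 14], [8, 11, 11]))
    print(f_two_variances(4.0, 10, 9.0, 12))
```

Returning the degrees of freedom together with the test value keeps the pairing explicit, which matters for the F test because swapping the variances also swaps d.f.N. and d.f.D.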
Review Exercises
b. Find the critical value(s).
c. Compute the test value.
d. Make the decision.
e. Summarize the results.
Use the traditional method of hypothesis testing unless otherwise specified.
Section 9–1
1. Driving for Pleasure Two groups of randomly selected drivers are surveyed to see how many miles ... σ2 = 16.1.
Single drivers          Married drivers
106 110 115 121 132     97 104 138 102 115
119  97 118 122 135    133 120 119 136  96
110 117 116 138 142    139 108 117 145 114
115 114 103  98  99    140 136 113 113 150
108 117 152 147 117    101 114 116 113 135
154  86 115 116 104    115 109 147 106  88
107 133 138 142 140    113 119  99 108 105
2. Average Earnings of College Graduates The average yearly earnings of male college graduates (with at least a bachelor’s degree) are $58,500 for men aged 25 to 34. The average yearly earnings of female college graduates with the same qualifications are $49,339. Based on the results below, can it be concluded that there is a difference in mean earnings between male and female college graduates? Use the 0.01 level of significance.
                                Male      Female
Sample mean                     $59,235   $52,487
Population standard deviation   $8,945    $10,125
Sample size                     40        35
Source: New York Times Almanac.
Section 9–2
3. Physical Therapy A recent study of 20 individuals found that the average number of therapy sessions a person takes for a shoulder problem is 9.6. The standard deviation of the sample was 2.8. A study of 25 individuals with a hip problem found that
records of the actual high and low temperatures for a selection of days in March from the weather report for Pittsburgh, Pennsylvania. At the 0.01 level of significance, is there sufficient evidence to conclude that there is more than a 10° difference between average highs and lows?
Maximum 44 46 46 36 34 36 57 62 73 53
Minimum 27 34 24 19 19 26 33 57 46 26
Source: www.wunderground.com
8. Testing After Review A statistics class was given a pretest on probability (since many had previous experience in some other class). Then the class was given a six-page review handout to study for two days. At the next class they were given another test. Is there sufficient evidence that the scores improved? Use α = 0.05.
Student  1  2  3  4  5  6
Pretest  52 50 40 58 60 52
Posttest 62 65 50 65 68 63
evidence at α = 0.05 to conclude that there is a difference in the variances in height between the two groups?
Cathedrals 72 114 157 56 83 108 90 151
Tallest buildings 452 442 415 391 355 344 310 302 209
Source: www.infoplease.com
13. Sodium Content of Cereals The sodium content of brands of cereal produced by two major manufacturers is shown. At α = 0.01, is there a significant difference in the variances?
Manufacturer 1     Manufacturer 2
 87  92  96         87  92  93
100  94  94         91 100  94
101 103  98        103  96  98
 91  92  96         87  92  91
STATISTICS TODAY
To Vaccinate or Not to Vaccinate? Small or Large? Revisited
Using a z test to compare two proportions, the researchers found that the proportion of residents in smaller nursing homes who were vaccinated (80.8%) was statistically greater than that of residents in large nursing homes who were vaccinated (68.7%). Using statistical methods presented in later chapters, they also found that the larger size of the nursing home and the lower frequency of vaccination were significant predictors of influenza outbreaks in nursing homes.
Data Analysis
The Data Bank is found in Appendix B, or on the World Wide Web by following links from www.mhhe.com/math/stat/bluman/
1. From the Data Bank, select a variable and compare the mean of the variable for a random sample of at least 30 men with the mean of the variable for the random sample of at least 30 women. Use a z test.
2. Repeat the experiment in Exercise 1, using a different variable and two samples of size 15. Compare the means by using a t test.
3. Compare the proportion of men who are smokers with the proportion of women who are smokers. Use the data in the Data Bank. Choose random samples of size 30 or more. Use the z test for proportions.
4. Select two samples of 20 values from the data in Data Set IV in Appendix B. Test the hypothesis that the mean heights of the buildings are equal.
5. Using the same data obtained in Exercise 4, test the hypothesis that the variances are equal.
Chapter Quiz
Determine whether each statement is true or false. If the statement is false, explain why.
1. When you are testing the difference between two means, it is not important to distinguish whether the samples are independent of each other.
2. If the same diet is given to two groups of randomly selected individuals, the samples are considered to be dependent.
3. When computing the F test value, you should place the larger variance in the numerator of the fraction.
4. Tests for variances are always two-tailed.
Select the best answer.
5. To test the equality of two variances, you would use a(n) _______ test.
a. z c. Chi-square
b. t d. F
6. To test the equality of two proportions, you would use a(n) _______ test.
a. z c. Chi-square
b. t d. F
7. The mean value of F is approximately equal to
a. 0 c. 1
b. 0.5 d. It cannot be determined.
8. What test can be used to test the difference between two sample means when the population variances are known?
a. z c. Chi-square
b. t d. F
Complete these statements with the best answer.
9. If you hypothesize that there is no difference between means, this is represented as H0: _______.
10. When you are testing the difference between two means, the _______ test is used when the population variances are not known.
11. When the t test is used for testing the equality of two means, the populations must be _______.
12. The values of F cannot be _______.
13. The formula for the F test for variances is _______.
For each of these problems, perform the following steps.
a. State the hypotheses and identify the claim.
b. Find the critical value(s).
Use the traditional method of hypothesis testing unless otherwise specified.
14. Cholesterol Levels A researcher wishes to see if there
495 390 540 445 420 525 400 310 375 750
410 550 499 500 550 390 795 554 450 370
389 350 450 530 350 385 395 425 500 550
375 690 325 350 799 380 400 450 365 425
475 295 350 485 625 375 360 425 400 475
concluded that the average number of accidents per year has increased from one period to the next?
Earlier period 376 650 844 1162 1513
Later period 1650 2236 3002 4028 4010
Source: USA TODAY.
18. Salaries of Chemists A random sample of 12 chemists from Washington state shows an average salary of $39,420 with a standard deviation of $1659, while a random sample of 26 chemists from New Mexico has an average salary of $30,215 with a standard deviation
19. Family Incomes The average income of 15 randomly selected families who reside in a large metropolitan East Coast city is $62,456. The standard deviation is $9652. The average income of 11 randomly selected families
21. Egg Production To increase egg production, a farmer decided to increase the amount of time the lights in his hen house were on. Ten hens were randomly selected, and the number of eggs each produced was recorded. After one week of lengthened light time, the same hens
23. Male Head of Household A recent survey of 200 randomly selected households showed that 8 had a single male as the head of household. Forty years ago, a survey of 200 randomly selected households
variances of the amounts spent in the two counties? Use the P-value method.
County A          County B
s1 = $11,596      s2 = $14,837
Data Projects
Use a significance level of 0.05 for all tests below.
1. Business and Finance Use the data collected in data project 1 of Chapter 2 to complete this problem. Test the claim that the mean earnings per share for Dow Jones stocks are greater than for NASDAQ stocks.
2. Sports and Leisure Use the data collected in data project 2 of Chapter 7 regarding home runs for this problem. Test the claim that the mean number of home runs hit by the American League sluggers is the same as the mean for the National League.
3. Technology Use the cell phone data collected for data project 2 in Chapter 8 to complete this problem. Test the claim that the mean length for outgoing calls is the same as that for incoming calls. Test the claim that the standard deviation for outgoing calls is more than that for incoming calls.
4. Health and Wellness Use the data regarding BMI that were collected in data project 6 of Chapter 7 to complete this problem. Test the claim that the mean BMI for males is the same as that for females. Test the claim that the standard deviation for males is the same as that for females.
5. Politics and Economics Using data from the Internet for the last Presidential election to categorize the 50 states as “red” or “blue” based on who was supported for President in that state, the Democratic or Republican candidate, test the claim that the mean incomes for red states and blue states are equal.
6. Your Class Use the data collected in data project 6 of Chapter 2 regarding heart rates. Test the claim that the heart rates after exercise are more variable than the heart rates before exercise.
3. The P-value of 0.06317 also gives the probability of a type I error.
4. Since two critical values are shown, we know that a two-tailed test was done.
Section 9–3 Air Quality
1. The purpose of the study is to determine if the air quality in the United States has changed over the past 2 years.
2. These are dependent samples, since we have two readings from each of 10 metropolitan areas.
3. The hypotheses we will test are H0: μD = 0 and H1: μD ≠ 0.
4. We will use the 0.05 significance level and critical values of t = ±2.262.
Section 9–4 Smoking and Education
1. Our hypotheses are H0: p1 = p2 and H1: p1 ≠ p2.
2. At the 0.05 significance level, our critical values are z = ±1.96.
... $\sqrt{(0.234)(0.766)\left(\frac{1}{1000} + \frac{1}{1000}\right)}$ ... and our P-value is very close to zero. We reject the null hypothesis and find that there is enough evidence to conclude that there is a difference in the proportions of public school students and private school students who smoke.
Section 9–5 Variability and Automatic Transmissions
1. The null hypothesis is that the variances are the same: H0: $\sigma_1^2 = \sigma_2^2$ (H1: $\sigma_1^2 \neq \sigma_2^2$).
Hypothesis-Testing Summary 1
1. Comparison of a sample mean with a specific population mean.
Example: H0: μ = 100
a. Use the z test when σ is known:
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
b. Use the t test when σ is unknown:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \quad \text{with d.f.} = n - 1$$
2. Comparison of a sample variance or standard deviation with a specific population variance or standard deviation.
Example: H0: σ² = 225
Use the chi-square test:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} \quad \text{with d.f.} = n - 1$$
3. Comparison of two sample means.
Example: H0: μ1 = μ2
a. Use the z test when the population variances are known:
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
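b. Use the t test for independent samples when the population variances are unknown and assume the sample variances are unequal:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
c. Use the t test for means for dependent samples:
Example: H0: μD = 0
$$t = \frac{\bar{D} - \mu_D}{s_D/\sqrt{n}} \quad \text{with d.f.} = n - 1$$
where n = number of pairs.
4. Comparison of a sample proportion with a specific population proportion.
Example: H0: p = 0.32
Use the z test:
$$z = \frac{X - \mu}{\sigma} \quad \text{or} \quad z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$
5. Comparison of two sample proportions.
Example: H0: p1 = p2
Use the z test:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} \qquad \bar{q} = 1 - \bar{p} \qquad \hat{p}_1 = \frac{X_1}{n_1} \qquad \hat{p}_2 = \frac{X_2}{n_2}$$
6. Comparison of two sample variances or standard deviations.
Example: H0: $\sigma_1^2 = \sigma_2^2$
Use the F test:
$$F = \frac{s_1^2}{s_2^2}$$
where $s_1^2$ = larger variance and d.f.N. = n1 − 1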