

9.6  The Power of a Test


Section 9.1 defines Type I and Type II errors and their associated risks. Recall that α represents
the probability that you reject the null hypothesis when it is true and should not be rejected,
and β represents the probability that you do not reject the null hypothesis when it is false and
should be rejected. The power of the test, 1 - β, is the probability that you correctly reject a
false null hypothesis. This probability depends on how different the actual population parameter
is from the value being hypothesized (under H0), the value of α used, and the sample size. If
there is a large difference between the population parameter and the hypothesized value, the
power of the test will be much greater than if the difference between the population parameter
and the hypothesized value is small. Selecting a larger value of α makes it easier to reject H0
and therefore increases the power of a test. Increasing the sample size increases the precision
in the estimates and therefore increases the ability to detect differences in the parameters and
increases the power of a test.
The power of a statistical test can be illustrated by using the Oxford Cereal Company scenario.
The filling process is subject to periodic inspection from a representative of the consumer
affairs office. The representative’s job is to detect the possible “short weighting” of boxes, which
means that cereal boxes having less than the specified 368 grams are sold. Thus, the representative
is interested in determining whether there is evidence that the cereal boxes have a mean
weight that is less than 368 grams. The null and alternative hypotheses are as follows:
H0: μ ≥ 368 (filling process is working properly)
H1: μ < 368 (filling process is not working properly)
The representative is willing to accept the company’s claim that the standard deviation, σ, equals
15 grams. Therefore, one can use the Z test. Using Equation (9.1) on page 280, with $\bar{X}_L$ (the
lower critical $\bar{X}$ value) substituted for $\bar{X}$, one can calculate the value of $\bar{X}$ that enables one to
reject the null hypothesis:

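$$
Z = \frac{\bar{X}_L - \mu}{\sigma / \sqrt{n}}
$$

$$
Z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}} = \bar{X}_L - \mu
$$

$$
\bar{X}_L = \mu + Z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}
$$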

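Because this is a one-tail test with a level of significance of 0.05, the entire 0.05 rejection
area lies in the lower tail, so the critical value written as $Z_{\alpha/2}$ in the formula above is equal to
-1.645 (see Figure 9.15). The sample size is n = 25. Therefore,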

$$
\bar{X}_L = 368 + (-1.645)\frac{15}{\sqrt{25}} = 368 - 4.935 = 363.065
$$

The decision rule for this one-tail test is

Reject H0 if $\bar{X} < 363.065$;
otherwise, do not reject H0.
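
The critical-value calculation above can be reproduced in a few lines. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the function name `lower_critical_mean` is illustrative and does not come from the text. It takes the Z critical value from the inverse normal distribution rather than from a table, so it agrees with the rounded value of -1.645 only to about three decimal places.

```python
from math import sqrt
from statistics import NormalDist


def lower_critical_mean(mu0, sigma, n, alpha):
    """Lower critical sample mean for a one-tail (lower-tail) Z test.

    mu0   : hypothesized mean under H0
    sigma : known population standard deviation
    n     : sample size
    alpha : level of significance, placed entirely in the lower tail
    """
    z_crit = NormalDist().inv_cdf(alpha)  # negative whenever alpha < 0.5
    standard_error = sigma / sqrt(n)
    return mu0 + z_crit * standard_error


# Oxford Cereal values from the text: mu0 = 368, sigma = 15, n = 25, alpha = 0.05
x_l = lower_critical_mean(mu0=368, sigma=15, n=25, alpha=0.05)
print(f"Reject H0 if the sample mean is below {x_l:.3f} grams")
```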

Copyright © 2021 Pearson Education, Inc.



2 CHAPTER 9  |  Fundamentals of Hypothesis Testing: One-Sample Tests

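FIGURE 9.15 Determining the lower critical value for a one-tail Z test for a population mean at the 0.05 level of significance
[Figure: normal curve centered at μ = 368 on the $\bar{X}$ scale (0 on the Z scale). The region of rejection is the lower tail below $\bar{X}_L$ ($Z_L = -1.645$), with area .05; the region of nonrejection lies above it, with area .95.]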

The decision rule states that if in a random sample of 25 boxes, the sample mean is less than
363.065 grams, you reject the null hypothesis, and the representative concludes that the process
is not working properly. The power of the test measures the probability of concluding that the
process is not working properly for differing values of the true population mean.
What is the power of the test if the actual population mean is 360 grams? To determine the
chance of rejecting the null hypothesis when the population mean is 360 grams, first determine
the area under the normal curve below $\bar{X}_L = 363.065$ grams. Using Equation (9.1), with the
population mean μ = 360,

$$
Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{363.065 - 360}{15 / \sqrt{25}} = 1.02
$$

From Table E.2, there is an 84.61% chance that the Z value is less than +1.02. This is the
power of the test when the actual population mean is μ = 360 grams, as Figure 9.16 visualizes.
The probability (β) that one will not reject the null hypothesis (μ = 368) is 1 - 0.8461 = 0.1539.
Thus, the probability of committing a Type II error is 15.39%.
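
As a check on the table lookup, the power at μ = 360 can also be computed directly. This sketch assumes Python 3.8 or later (`statistics.NormalDist`) and uses an illustrative helper, `power_lower_tail`, that is not part of the text. Because the text rounds Z to 1.02 before consulting Table E.2, the unrounded result agrees with 0.8461 only to about three decimal places.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tail(mu_true, x_crit, sigma, n):
    """P(sample mean < x_crit) when the population mean is mu_true.

    For a lower-tail test this is the probability of rejecting H0,
    which is the power whenever H0 is actually false.
    """
    z = (x_crit - mu_true) / (sigma / sqrt(n))
    return NormalDist().cdf(z)


# Values from the text: critical value 363.065, sigma = 15, n = 25
power_360 = power_lower_tail(mu_true=360, x_crit=363.065, sigma=15, n=25)
beta_360 = 1 - power_360  # probability of a Type II error

print(f"power = {power_360:.4f}, beta = {beta_360:.4f}")
```

The same call with `mu_true=352` or `mu_true=367` covers the other two cases worked in this section.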

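FIGURE 9.16 Determining the power of the test and the probability of a Type II error when μ = 360 grams
[Figure: normal curve centered at μ = 360 on the $\bar{X}$ scale (0 on the Z scale). The area below $\bar{X}_L = 363.065$ (Z = +1.02) is the power, .8461; the area above it is β = .1539.]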

Having determined the power of the test if the population mean were equal to 360, one can
calculate the power for any other value of μ. For example, what is the power of the test if the
population mean is 352 grams? Assuming the same standard deviation, sample size, and level
of significance, the decision rule is

Reject H0 if $\bar{X} < 363.065$;
otherwise, do not reject H0.


Because this is a test of a hypothesis for a mean, from Equation (9.1),


$$
Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
$$

If the population mean shifts down to 352 grams (see Figure 9.17), then

$$
Z_{STAT} = \frac{363.065 - 352}{15 / \sqrt{25}} = 3.69
$$

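FIGURE 9.17 Determining the power of the test and the probability of a Type II error when μ = 352 grams
[Figure: normal curve centered at μ = 352 on the $\bar{X}$ scale (0 on the Z scale). The area below $\bar{X}_L = 363.065$ (Z = +3.69) is the power, .99989; the area above it is β = .00011.]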

From Table E.2, there is a 99.989% chance that the Z value is less than +3.69. This is the
power of the test when the population mean is 352. The probability (β) that one will not reject
the null hypothesis (μ = 368) is 1 - 0.99989 = 0.00011. Thus, the probability of committing
a Type II error is only 0.011%.
In the preceding two examples, the power of the test is high, and the chance of committing
a Type II error is low. Consider a third example in which the population mean is equal to
367 grams, a value that is very close to the hypothesized mean of 368 grams.
Once again, from Equation (9.1),

$$
Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
$$

If the population mean is equal to 367 grams (see Figure 9.18), then

$$
Z_{STAT} = \frac{363.065 - 367}{15 / \sqrt{25}} = -1.31
$$

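FIGURE 9.18 Determining the power of the test and the probability of a Type II error when μ = 367 grams
[Figure: normal curve centered at μ = 367 on the $\bar{X}$ scale (0 on the Z scale). The area below $\bar{X}_L = 363.065$ (Z = -1.31) is the power, .0951; the area above it is β = .9049.]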


From Table E.2, the probability that Z is less than -1.31 is 0.0951 (or 9.51%). Because the rejection
region is in the lower tail of the distribution, the power of the test is 9.51%, and the chance of
making a Type II error is 90.49%.
Figure 9.19 illustrates the power of the test for various possible values of μ (including the
three values examined). This graph is called a power curve.
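
The points on a power curve such as Figure 9.19 can be generated by repeating the same calculation over a grid of possible means. This is a minimal sketch, assuming Python 3.8 or later and the critical value 363.065 from the text; the name `power_curve` is illustrative. Values are computed without rounding Z, so they can differ from table-based values in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def power_curve(mu_values, x_crit, sigma, n):
    """Map each candidate population mean to P(sample mean < x_crit)."""
    standard_error = sigma / sqrt(n)
    normal = NormalDist()
    return {mu: normal.cdf((x_crit - mu) / standard_error) for mu in mu_values}


# Possible means 352, 353, ..., 368 grams, as on the horizontal axis
curve = power_curve(range(352, 369), x_crit=363.065, sigma=15, n=25)

for mu, power in curve.items():
    print(f"mu = {mu}: power = {power:.4f}")
```

At μ = 368 the computed value is approximately α = 0.05, which corresponds to the right-hand end of the curve.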

FIGURE 9.19 Power curve of the cereal-box-filling process for H1: μ < 368 grams
[Figure: power (vertical axis, 0.00 to 1.00) plotted against possible values for μ in grams (horizontal axis, 352 to 368). Plotted values:]

μ (grams)   Power
352         .99989
353         .99961
354         .99874
355         .9964
356         .9909
357         .9783
358         .9545
359         .9131
360         .8461
361         .7549
362         .6406
363         .5080
364         .3783
365         .2578
366         .1635
367         .0951
368         .0500

From Figure 9.19, you can see that the power of this one-tail test increases sharply (and
approaches 100%) as the population mean takes on values farther below the hypothesized mean
of 368 grams. Clearly, for this one-tail test, the smaller the actual mean μ, the greater the power
to detect this difference.¹ For values of μ close to 368 grams, the power is small because the test
cannot effectively detect small differences between the actual population mean and the
hypothesized value of 368 grams. When the population mean approaches 368 grams, the power of the
test approaches α, the level of significance (which is 0.05 in this example).

¹ For situations involving one-tail tests in which the actual mean, μ1, exceeds the hypothesized
mean, the converse would be true. The larger the actual mean, μ1, compared with the hypothesized
mean, the greater is the power. For two-tail tests, the greater the distance between the actual
mean, μ1, and the hypothesized mean, the greater the power of the test.

Figure 9.20 summarizes the computations for the three cases. You can see
the drastic changes in the power of the test for different values of the actual population mean
by reviewing the different panels of Figure 9.20. From Panels A and B you can see that when
the population mean does not greatly differ from 368 grams, the chance of rejecting the null
hypothesis, based on the decision rule involved, is not large. However, when the population mean
shifts substantially below the hypothesized 368 grams, the power of the test greatly increases,
approaching its maximum value of 1 (or 100%).
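
The behavior at both ends of the power curve follows from writing the power as a function of the actual mean. The derivation below is a sketch in this section's notation; $\Phi$ (the standard normal cumulative distribution function) and $\mu_0 = 368$ (the hypothesized mean) are symbols introduced here for convenience and are not taken from the text.

$$
\begin{aligned}
\text{Power}(\mu) &= P(\bar{X} < \bar{X}_L \mid \mu)
  = \Phi\!\left(\frac{\bar{X}_L - \mu}{\sigma / \sqrt{n}}\right)
  = \Phi\!\left(Z_L + \frac{\mu_0 - \mu}{\sigma / \sqrt{n}}\right),
  \qquad Z_L = -1.645 \\
\text{Power}(368) &= \Phi(-1.645) \approx 0.05 = \alpha \\
\text{Power}(\mu) &\to 1 \quad \text{as } \mu_0 - \mu \text{ grows large}
\end{aligned}
$$

The second equality uses $\bar{X}_L = \mu_0 + Z_L \, \sigma / \sqrt{n}$, the critical value obtained earlier in the section.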
In the above discussion, a one-tail test with α = 0.05 and n = 25 was used. The type of
statistical test (one-tail vs. two-tail), the level of significance, and the sample size all affect the
power. Three basic conclusions regarding the power of the test can be summarized as:
• A one-tail test is more powerful than a two-tail test (for the same α and n, when the actual
mean lies in the direction specified by H1).
• An increase in the level of significance (α) results in an increase in power. A decrease in
α results in a decrease in power.
• An increase in the sample size, n, results in an increase in power. A decrease in the sample
size, n, results in a decrease in power.
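
These three conclusions can be checked numerically. The sketch below assumes Python 3.8 or later and a hypothetical helper, `z_test_power`, that is not from the text. It evaluates the power at μ = 360 for the cereal scenario while changing one setting at a time; the two-tail version assumes the rejection region is split equally between both tails.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha, two_tail=False):
    """Power of a Z test about a mean when sigma is known.

    two_tail=False: lower-tail test (H1: mu < mu0).
    two_tail=True : two-tail test (H1: mu != mu0), alpha/2 in each tail.
    """
    normal = NormalDist()
    # Under the true mean, Z_STAT is normal with this mean and variance 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    if two_tail:
        c = normal.inv_cdf(1 - alpha / 2)
        return normal.cdf(-c - shift) + (1 - normal.cdf(c - shift))
    return normal.cdf(normal.inv_cdf(alpha) - shift)


base = z_test_power(368, 360, 15, 25, 0.05)  # one-tail, as in the text
two_tail = z_test_power(368, 360, 15, 25, 0.05, two_tail=True)
smaller_alpha = z_test_power(368, 360, 15, 25, 0.01)
larger_n = z_test_power(368, 360, 15, 100, 0.05)

print(f"one-tail, alpha=0.05, n=25 : {base:.4f}")
print(f"two-tail, alpha=0.05, n=25 : {two_tail:.4f}")
print(f"one-tail, alpha=0.01, n=25 : {smaller_alpha:.4f}")
print(f"one-tail, alpha=0.05, n=100: {larger_n:.4f}")
```

With these inputs the one-tail test has higher power than the two-tail test, lowering α to 0.01 reduces the power, and raising n to 100 increases it.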


FIGURE 9.20 Determining statistical power for varying values of the population mean
[Figure: four panels, each a normal curve on the $\bar{X}$ scale with the region of rejection below $\bar{X}_L = 363.065$ and the region of nonrejection above it. Every panel is given α = .05, σ = 15, n = 25, one-tail test. In order:]
• μ = 368 (null hypothesis is true): $\bar{X}_L = 368 - (1.645)\frac{15}{\sqrt{25}} = 363.065$; 1 - α = .95, α = .050. Decision rule: Reject H0 if $\bar{X} < 363.065$; otherwise do not reject.
• H0: μ = 368, μ = 367 (true mean shifts to 367 grams): $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{363.065 - 367}{3} = -1.31$; Power = .0951, β = .9049.
• H0: μ = 368, μ = 360 (true mean shifts to 360 grams): $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{363.065 - 360}{3} = +1.02$; Power = .8461, β = .1539.
• H0: μ = 368, μ = 352 (true mean shifts to 352 grams): $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{363.065 - 352}{3} = +3.69$; Power = .99989, β = .00011.


PROBLEMS FOR SECTION 9.6

APPLYING THE CONCEPTS
9.80  A coin-operated soft-drink machine is designed to discharge at least 7 ounces of beverage per cup, with a standard deviation of 0.2 ounce. If you select a random sample of 16 cups and you are willing to have an α = 0.05 risk of committing a Type I error, compute the power of the test and the probability of a Type II error (β) if the population mean amount dispensed is actually
a. 6.9 ounces per cup.
b. 6.8 ounces per cup.

9.81  Refer to Problem 9.80. If you are willing to have an α = 0.01 risk of committing a Type I error, compute the power of the test and the probability of a Type II error (β) if the population mean amount dispensed is actually
a. 6.9 ounces per cup.
b. 6.8 ounces per cup.
c. Compare the results in (a) and (b) of this problem and in Problem 9.80. What conclusion can you reach?

9.82  Refer to Problem 9.80. If you select a random sample of 25 cups and are willing to have an α = 0.05 risk of committing a Type I error, compute the power of the test and the probability of a Type II error (β) if the population mean amount dispensed is actually
a. 6.9 ounces per cup.
b. 6.8 ounces per cup.
c. Compare the results in (a) and (b) of this problem and in Problem 9.80. What conclusion can you reach?

9.83  A tire manufacturer produces tires that have a mean life of at least 25,000 miles when the production process is working properly. Based on past experience, the standard deviation of the tires is 3,500 miles. The operations manager stops the production process if there is evidence that the mean tire life is below 25,000 miles. If you select a random sample of 100 tires (to be subjected to destructive testing) and you are willing to have an α = 0.05 risk of committing a Type I error, compute the power of the test and the probability of a Type II error (β) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.

9.84  Refer to Problem 9.83. If you are willing to have an α = 0.01 risk of committing a Type I error, compute the power of the test and the probability of a Type II error (β) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.
c. Compare the results in (a) and (b) of this problem and (a) and (b) in Problem 9.83. What conclusion can you reach?

9.85  Refer to Problem 9.83. If you select a random sample of 25 tires and are willing to have an α = 0.05 risk of committing a Type I error, compute the power of the test and the probability of a Type II error (β) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.
c. Compare the results in (a) and (b) of this problem and (a) and (b) in Problem 9.83. What conclusion can you reach?

9.86  Refer to Problem 9.83. If the operations manager stops the process when there is evidence that the mean life is different from 25,000 miles (either less than or greater than) and a random sample of 100 tires is selected, along with a level of significance of α = 0.05, compute the power of the test and the probability of a Type II error (β) if the population mean life is actually
a. 24,000 miles.
b. 24,900 miles.
c. Compare the results in (a) and (b) of this problem and (a) and (b) in Problem 9.83. What conclusion can you reach?
