Chapter 9

Testing a Claim

Section 9.1
Significance Tests: The Basics

LEARNING TARGETS
By the end of this section, you should be able to:
✓ STATE appropriate hypotheses for a significance test about a
population parameter.
✓ INTERPRET a P-value in context.
✓ MAKE an appropriate conclusion for a significance test.
✓ INTERPRET a Type I error and a Type II error in context. GIVE a
consequence of each error in a given setting.

Starnes/Tabor, The Practice of Statistics


Activity: I’m a great free-throw shooter!
Page 526

In this activity, you and your classmates will perform a simulation to test a claim about a
population proportion.

1. Using the spinner provided by your teacher, numbers 1-4 will represent a “made shot” and
number 5 will represent a “missed shot”. On a flat surface, flick the spinner and see where the
pointer lands. {for die: 1-8 is a “made shot” and 9-0 is a “missed shot”}
2. Flick the spinner a total of 50 times, and count the number of times that the pointer lands in
the “made shot” region.

4. Repeat Steps 2 and 3 as needed to get at least 40 trials of the simulation for your class.
5. Based on the class’s simulation results, how likely is it for an 80% shooter to make 64% or less
when he shoots 50 free throws?
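
The spinner simulation above can also be sketched in code. This is a minimal illustration, not part of the textbook activity: the function names, the default of 400 trials, and the seed are assumptions chosen for the example. A "made shot" is modeled as an independent event with probability 0.80, and 32 makes corresponds to 64% of 50 shots.

```python
import random

def simulate_makes(n_shots=50, p_make=0.80, rng=random):
    """Simulate one set of free throws and return the number of made shots."""
    return sum(rng.random() < p_make for _ in range(n_shots))

def estimate_tail(n_trials=400, n_shots=50, p_make=0.80, cutoff=32, seed=1):
    """Estimate how often a true p_make shooter makes `cutoff` shots or fewer."""
    rng = random.Random(seed)  # fixed seed (an assumption) so results repeat
    low_counts = sum(
        simulate_makes(n_shots, p_make, rng) <= cutoff for _ in range(n_trials)
    )
    return low_counts / n_trials

# Proportion of simulated trials in which the 80% shooter made 64% or less
print(estimate_tail())
```

Increasing `n_trials` should give a steadier estimate, in the same way that pooling more class trials does in Step 4.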

Activity: Free-throw shooter Applet

In this activity, you and your classmates will perform a simulation to test a claim about a
population proportion.
1. Open the Free-throw Shooter Applet
https://digitalfirst.bfwpub.com/stats_applet/stats_applet_15_reasoning.html

2. Change “Shots” to 50.

3. Click “Shoot”. Once the applet has completed all the shots, decide whether you think it is
plausible that the player is actually an 80% free-throw shooter overall.

4. Check “Show true probability” to see if the player actually is an 80% free-throw shooter.

5. Write down how far the actual results were from the “True Probability”.

6. Uncheck “Show true probability”, click “New Shooter” and repeat the process. Go through a
few cycles and try to guess the true probability for each before revealing the value. Is the
simulation value always pretty close to the true value?

Stating Hypotheses
Confidence intervals are one of the two most
common methods of statistical inference.

The second common method of inference, called a significance test, allows us to weigh
the evidence in favor of or against a particular claim.

The claim that we weigh evidence against in a significance test is called the null hypothesis (H0).
The claim that we are trying to find evidence for is the alternative hypothesis (Ha).

Usually, the null hypothesis H0 is a statement of “no difference.”

The alternative hypothesis is one-sided if it states that a parameter is greater than the null
value or if it states that the parameter is less than the null value.
The alternative hypothesis is two-sided if it states that the parameter is different from the null
value (it could be either greater than or less than).

Problem: At the Hawaii Pineapple Company, managers
are interested in the size of the pineapples grown in the
company’s fields. Last year, the mean weight of the
pineapples harvested from one large field was 31 ounces.
A different irrigation system was installed in this field after
the growing season. Managers wonder if this change will
affect the mean weight of pineapples grown in the field this year.
State appropriate hypotheses for performing a significance test. Be sure to define
the parameter of interest.

H0: µ = 31
Ha: µ ≠ 31

where µ = the true mean weight (in ounces) of all pineapples grown in the field this year.

CAUTION:
The hypotheses should express the belief or suspicion we have before
we see the data.

AP® Exam Tip

Check Your Understanding: Page 556

For each of the following settings, state appropriate hypotheses for performing a significance
test. Be sure to define the parameter of interest.

1. According to the National Sleep Foundation, 85% of teens are getting too little sleep on
school nights. Jannie wonders whether this result holds in her large high school. She asks an
SRS of 100 students at the school how much sleep they get on a typical night. In all, 75 of the
students are getting less than the recommended amount of sleep.

2. As part of its marketing campaign for the 2010 census, the U.S. Census Bureau advertised “10
questions, 10 minutes – that’s all it takes.” On the census form itself, we read, “The U.S. Census
Bureau estimates that, for the average household, this form will take about 10 minutes to
complete, including the time for reviewing the instructions and answers.” We suspect that the
time it takes to complete the form may be longer than advertised.

Interpreting P-values

In other words, how likely is it for an 80% shooter to make 64% or less by chance alone in a
random sample of 50 attempts?

The P-value of a test is the probability of getting
evidence for the alternative hypothesis Ha as
strong or stronger than the observed evidence
when the null hypothesis H0 is true.

P-value ≈ 3/400 = 0.0075

We’ll show you how to calculate P-values later. For now, let’s focus on interpreting them.
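
For readers who want to check the simulated estimate, one possible calculation is sketched here. It assumes a binomial model (50 independent shots, each made with probability 0.80), which the slides have not introduced at this point, and the helper name `binomial_cdf` is made up for the example. The result should land in the same neighborhood as the simulation-based value.

```python
from math import comb

def binomial_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p), summing exact probabilities."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# Chance that a true 80% shooter makes 32 or fewer (64% or less) of 50 shots
p_value = binomial_cdf(32, 50, 0.80)
print(round(p_value, 4))
```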

Problem:
(a) Explain what it would mean for the null hypothesis to be true in this setting.
(b) Interpret the P-value.

(a) If H0: µ = 1300 is true, then the mean daily calcium intake in the population of teenagers
is 1300 mg.

(b) Assuming that the mean daily calcium intake in the teen population is 1300 mg, there is about
a 14% probability of getting a sample mean of 1198 mg or less just by chance in a random
sample of 20 teens.

Making Conclusions
We make a decision based on the strength of the evidence
in favor of the alternative hypothesis (and against the null
hypothesis) as measured by the P-value.
• If the observed result is unlikely to occur by chance alone
when H0 is true (small P-value), we will “reject H0.”
• If the observed result is not unlikely to occur by chance alone
when H0 is true (large P-value), we will “fail to reject H0.”

How to Make a Conclusion in a Significance Test


• If the P-value is small, reject H0 and conclude that there is convincing evidence
for Ha (in context).
• If the P-value is not small, fail to reject H0 and conclude that there is not
convincing evidence for Ha (in context).

How small does a P-value have to be for us to reject H0?

In Chapter 4, we suggested that you use a boundary of 5% when determining whether a
result is statistically significant.

That is equivalent to saying, “View a P-value less than 0.05 as small.”

Sometimes it may be preferable to use a different boundary value, like 0.01 or 0.10, when
drawing a conclusion in a significance test.

The significance level α is the value that we use as a boundary for deciding whether an
observed result is unlikely to happen by chance alone when the null hypothesis is true.

If the P-value is less than α, we say that the result is “statistically significant at the
α = ____ level.”

CAUTION:
α should be stated before the data are produced.

Problem:
A significance test is performed using the hypotheses
H0: µ = 30
Ha: µ > 30
where µ is the true mean lifetime (in hours) of the
deluxe AAA batteries. The resulting P-value is 0.0717.
What conclusion would you make at the α = 0.05 level?

Because the P-value of 0.0717 > α = 0.05, we fail to reject H0. We don’t have convincing
evidence that the true mean lifetime of the company’s deluxe AAA batteries is greater than
30 hours.
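
The decision rule used in this example can be written as a small helper. This is only a sketch: the function name `conclude` and its wording are illustrative assumptions, and it simply compares a given P-value to a stated α without computing anything about the data.

```python
def conclude(p_value, alpha, claim):
    """Return a two-sentence conclusion: a decision about H0, then context."""
    if p_value < alpha:
        return (
            f"Because the P-value of {p_value} is less than alpha = {alpha}, "
            f"we reject H0. We have convincing evidence that {claim}."
        )
    return (
        f"Because the P-value of {p_value} is not less than alpha = {alpha}, "
        f"we fail to reject H0. We don't have convincing evidence that {claim}."
    )

# The battery example: P-value 0.0717 compared with alpha = 0.05
print(conclude(0.0717, 0.05, "the true mean lifetime is greater than 30 hours"))
```

Note that the helper never returns "accept H0"; a large P-value only means the evidence for Ha is not convincing.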

CAUTION:
Never “accept H0” or conclude that H0 is true!


AP® Exam Tip


We recommend that you follow the two-sentence structure from the
example when writing the conclusion to a significance test.
• The first sentence should give a decision about the null
hypothesis—reject H0 or fail to reject H0—based on an explicit
comparison of the P-value to a stated significance level.

• The second sentence should provide a statement about whether or not there is
convincing evidence for Ha in the context of the problem.

Example
Because the P-value of 0.0717 > α = 0.05, we fail to reject H0. We don’t have convincing
evidence that the true mean lifetime of the company’s deluxe AAA batteries is greater than
30 hours.

Type I and Type II Errors
When we draw a conclusion from a significance test, we hope our
conclusion will be correct. But sometimes it will be wrong.

A Type I error occurs if a test rejects H0 when H0 is true. That is, the test
finds convincing evidence that Ha is true when it really isn’t.

A Type II error occurs if a test fails to reject H0 when Ha is true. That is, the
test does not find convincing evidence that Ha is true when it really is.


Type I error: The producer finds convincing evidence that more than
8% of the potatoes in the shipment have blemishes, when the true
proportion is really 0.08 (or less, which would also be acceptable in this case).

Consequence: The potato-chip producer sends away the truckload of
acceptable potatoes, wasting time and depriving the supplier of
money.

Type II error: The producer does not find convincing evidence that
more than 8% of the potatoes in the shipment have blemishes, when
the true proportion is greater than 0.08.

Consequence: More potato chips are made with blemished potatoes,
which may upset customers and lead to decreased sales.

The most common significance levels are α = 0.05, α = 0.01, and α = 0.10.

Which one of these is the best choice for a given significance test? That
depends on whether a Type I error or a Type II error is more serious.

Type I Error Probability


The probability of making a Type I error in a significance test is equal to
the significance level α.
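
A quick simulation can illustrate this fact. The sketch below is built on assumptions that are not in the slides: a one-sided z test of H0: µ = 30 against Ha: µ > 30, a known population standard deviation of 5, samples of size 25, and normally distributed data. Every sample is generated with H0 true, so each rejection is a Type I error.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, mu0=30.0, sigma=5.0, n=25,
                      n_tests=4000, seed=1):
    """Simulate many tests with H0 true and return the share that reject H0."""
    rng = random.Random(seed)          # fixed seed (an assumption) for repeatability
    std_normal = NormalDist()
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(n_tests):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]  # H0 is true here
        z = (sum(sample) / n - mu0) / standard_error
        p_value = 1 - std_normal.cdf(z)  # one-sided, Ha: mu > mu0
        if p_value < alpha:
            rejections += 1              # rejecting a true H0 is a Type I error
    return rejections / n_tests

# The observed rejection rate should be close to the chosen alpha
print(type_i_error_rate(alpha=0.05))
```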

We can decrease the probability of making a Type I error in a significance test
by using a smaller significance level.

But there is a trade-off between P(Type I error) and P(Type II error): as one
increases, the other decreases.

If we make it more difficult to reject H0 by decreasing α, we increase the probability
that we will not find convincing evidence for Ha when it is true.
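
The trade-off can be made concrete with a small calculation. All numbers below are hypothetical and chosen only for illustration: a one-sided z test of H0: µ = 30 against Ha: µ > 30, a known standard deviation of 5, a sample size of 25, and a true mean of 32. Under those assumptions, the probability of a Type II error is the chance that the test statistic falls below the critical value even though Ha is true.

```python
from statistics import NormalDist

def type_ii_error_prob(alpha, mu0=30.0, mu_true=32.0, sigma=5.0, n=25):
    """Return P(fail to reject H0) when the true mean is mu_true (> mu0)."""
    std_normal = NormalDist()
    standard_error = sigma / n ** 0.5
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject H0 when z > z_crit
    shift = (mu_true - mu0) / standard_error    # mean of z when Ha is true
    return std_normal.cdf(z_crit - shift)

# Smaller alpha makes H0 harder to reject, so P(Type II error) goes up
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(type_ii_error_prob(alpha), 3))
```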

Check Your Understanding: Page 562

The manager of a fast food restaurant wants to reduce the proportion of customers who have to
wait longer than 2 minutes to receive their food after placing an order. Based on store records,
the proportion of customers who had to wait longer than 2 minutes was p = 0.63. To reduce this
proportion, the manager assigns an additional employee to drive-thru orders.

1. Describe a Type I error and a Type II error in this setting.

2. Which type of error is more serious in this case? Justify your answer.

4. The P-value of the manager’s test is 0.0385. Interpret the P-value.


Section Summary

LEARNING TARGETS
After this section, you should be able to:
✓ STATE appropriate hypotheses for a significance test about a
population parameter.
✓ INTERPRET a P-value in context.
✓ MAKE an appropriate conclusion for a significance test.
✓ INTERPRET a Type I error and a Type II error in context. GIVE a
consequence of each error in a given setting.

Section 9.1 Homework, pages 563–567

1–9 odd, 13–15 all, 19–27 odd, 29–32 all
