
The Statistical Imagination

Chapter 9.
Hypothesis Testing I: The Six
Steps of Statistical Inference

© 2008 McGraw-Hill Higher Education


A Hypothesis

• A hypothesis is a prediction about the relationship between two variables that asserts that differences among the measurements of an independent variable will correspond to differences among the measurements of a dependent variable

© 2008 McGraw-Hill Higher Education


Using a Hypothesis to
Test a Theory

• The hypothesis is stated before we gather data
• The theoretical purpose of a hypothesis
test is to corroborate theory by testing
preconceived ideas against facts
• Theory motivates or pushes us to expect
certain empirical outcomes
© 2008 McGraw-Hill Higher Education
Statistical Inference

• Statistical inference is drawing conclusions about a population on the basis of sample statistics
• The logic of hypothesis testing involves
deciding whether to accept or reject a
statement on the basis of observations of data
• “Accounting for sampling error” with sampling
distributions is key to the process

© 2008 McGraw-Hill Higher Education


The Test Effect of the
Hypothesis Test

• The difference between what is observed in a sample and what is hypothesized is called the test effect
• We ask: What is the probability that the test
effect is simply the result of sampling
error?
• The sampling distribution provides a
measuring stick to answer the question

© 2008 McGraw-Hill Higher Education


The Statistical Purpose
of a Hypothesis Test

• The statistical purpose of a hypothesis test is to determine whether statistical test effects computed from a sample indicate (1) real effects in the population or (2) sampling error
© 2008 McGraw-Hill Higher Education
Making Empirical Predictions
• To test a hypothesis we must predict two things:
1. A mathematical prediction of a
parameter outcome
2. A sampling distribution, a prediction of all
possible sampling outcomes factoring in
sampling error
• With these predictions, we determine the probability that our single sample outcome differs significantly from the predicted outcome (i.e., whether the effect is real)
© 2008 McGraw-Hill Higher Education
Basic Logical Procedure
of Hypothesis Testing

a. A question is raised
b. Predictions are made on the basis of
probability theory
c. An event is observed and its effects are
measured
d. The probability of the test effect occurring
is computed
e. A conclusion is drawn
© 2008 McGraw-Hill Higher Education
Two Sets of Tasks When
Testing a Hypothesis

1. Test preparation: deciding what test to use and organizing the data
2. Test the hypothesis following the
six steps of statistical inference

© 2008 McGraw-Hill Higher Education


Test Preparation

• State the research question: A goal that can be stated in terms of a hypothesis
• Draw a conceptual diagram depicting
givens (population under study, sample
size, variables and their levels of
measurement, provided and calculated
parameters and statistics)
• Select the statistical test

© 2008 McGraw-Hill Higher Education


Step 1 of The Six Steps
of Statistical Inference
• Step 1: State the null hypothesis (H0).
State the alternative hypothesis (HA)
and stipulate the direction of the test

© 2008 McGraw-Hill Higher Education


The Null Hypothesis

• The null hypothesis, H0, is a hypothesis stated in such a way that we will know what statistical outcomes will occur in repeated random sampling if this hypothesis is true
• It is a “statistical” hypothesis: it directs
us to the sampling distribution, which
provides sampling predictions
© 2008 McGraw-Hill Higher Education
What does “null” mean?

• Null means none


• H0 predicts sampling outcomes
assuming no effect or no difference
• Often we can “nullify” (negate the wording of) the research question to determine the H0

© 2008 McGraw-Hill Higher Education


The Alternative
Hypothesis (HA)

• HA is the statement we accept if H0 is rejected
• HA is often a direct statement of the
research question

© 2008 McGraw-Hill Higher Education


The Direction of a
Hypothesis Test

• Test direction refers to whether we are able to predict the direction in which our observed sample statistic will fall
• Direction must be specified before we
observe data

© 2008 McGraw-Hill Higher Education


The Direction of a
Hypothesis Test (cont.)

• Three possible directional statements:
1. Nondirectional (two-tailed test)
2. Positive direction (one-tailed test)
3. Negative direction (one-tailed test)

© 2008 McGraw-Hill Higher Education


The Direction of a
Hypothesis Test (cont.)

• In Step 1, in stating the HA, we specify whether we expect the outcome in our observed sample to fall above (positive, one-tailed) or below (negative, one-tailed) the hypothesized parameter of the H0
• For a nondirectional, two-tailed test, we do
not predict a direction and simply assert
that the outcome is expected to differ from
the hypothesized parameter

© 2008 McGraw-Hill Higher Education
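The choice of direction fixes which tail(s) of the sampling distribution carry the p-value. A minimal sketch, not from the text, using SciPy's normal distribution; the observed Z of 1.80 is a made-up value for illustration.

```python
# Hypothetical illustration of how test direction changes the p-value
from scipy.stats import norm

z_observed = 1.80  # made-up test statistic

p_positive_one_tailed = norm.sf(z_observed)      # area above Z (about .036)
p_negative_one_tailed = norm.cdf(z_observed)     # area below Z (about .964)
p_two_tailed = 2 * norm.sf(abs(z_observed))      # both tails (about .072)
```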


When to State a Positive,
One-Tailed Test

• When the content of the research question includes terms such as greater than, more, increase, faster, heavier, and gain

© 2008 McGraw-Hill Higher Education


When to State a Negative,
One-Tailed Test

• When the content of the research question includes terms such as less than, fewer, decrease, slower, lighter, and loss

© 2008 McGraw-Hill Higher Education


When to State a Nondirectional,
Two-Tailed Test

• When the content of the research question includes no statements about direction, or simply asserts inequality

© 2008 McGraw-Hill Higher Education


Step 2 of The Six Steps of
Statistical Inference
• Step 2: Describe the sampling
distribution and draw its curve
• The sampling distribution is a
description of all possible sampling
outcomes and a stipulation of the
probability of each outcome assuming
that the H0 is true
• It is built around the H0
© 2008 McGraw-Hill Higher Education
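One way to see what the Step 2 curve describes is to simulate repeated sampling under the H0. A rough sketch assuming a hypothetical population with mean 100 and standard deviation 15 and samples of n = 150; none of these numbers come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_0, sigma, n = 100, 15, 150   # hypothetical givens under H0

# Draw many repeated random samples assuming H0 is true and keep each sample mean
sample_means = rng.normal(mu_0, sigma, size=(10_000, n)).mean(axis=1)

print(sample_means.mean())   # centers on mu_0 (about 100)
print(sample_means.std())    # spread is about sigma / sqrt(n) (about 1.22)
```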
Step 3 of The Six Steps of
Statistical Inference
• Step 3: State the chosen level of
significance, alpha (α), and indicate again
whether the test is one-tailed or two-tailed.
Specify the critical test value
• The level of significance, alpha, is the
amount of sampling error we are willing to
tolerate in coming to a conclusion
• Critical test values are obtained from the
statistical tables in Appendix B
© 2008 McGraw-Hill Higher Education
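Instead of reading Appendix B, the critical Z score for a chosen alpha can be looked up in code. A small sketch assuming a normal sampling distribution and α = .05; the results match the familiar table values 1.64 (one-tailed) and 1.96 (two-tailed).

```python
from scipy.stats import norm

alpha = 0.05
z_critical_one_tailed = norm.ppf(1 - alpha)      # about 1.64
z_critical_two_tailed = norm.ppf(1 - alpha / 2)  # about 1.96
```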
Step 4 of The Six Steps
of Statistical Inference

• Step 4: Observe the actual sample; compute the test effects, the test statistic, and the p-value

© 2008 McGraw-Hill Higher Education


Step 4 (cont.): The
Test Effect

• The test effect is the difference between the value of the sample statistic and the parameter value predicted by the null hypothesis (H0 in Step 1)
• It is a deviation score on the sampling
distribution curve

© 2008 McGraw-Hill Higher Education


Step 4 (cont.):
The Test Statistic
• The test statistic is a formula for
measuring the likelihood of the
observed effect
• It transforms the effect into standard
error units so that the result may be
compared to critical scores of the
statistical tables in Appendix B

© 2008 McGraw-Hill Higher Education
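For the large single-sample means test covered at the end of this chapter, the test statistic that performs this transformation is Z = (sample mean - hypothesized mean) / standard error. A minimal sketch with made-up numbers; the sample mean, standard deviation, and sample size are assumptions, not values from the text.

```python
import math

x_bar = 103.1   # hypothetical observed sample mean
mu_0 = 100.0    # parameter value predicted by H0
s = 15.0        # hypothetical sample standard deviation
n = 150         # hypothetical sample size

standard_error = s / math.sqrt(n)
test_effect = x_bar - mu_0                  # the Step 4 test effect
z_observed = test_effect / standard_error   # effect in standard error units
print(z_observed)                           # about 2.53
```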


Step 4 (cont.)
The p-Value
• The p-value is a measure of the
unusualness of a sample outcome when the
H0 is true. E.g., Is it unusual to roll four 7’s in
a row with honest dice?
• Calculation: p-value = probability (p) of
sampling outcomes as unusual as or more
unusual than the outcome observed under
the assumption that the H0 is true
• An area in the tail(s) of the curve in Step 2
© 2008 McGraw-Hill Higher Education
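Two small illustrations of the p-value idea: the slide's dice analogy worked out, and the p-value for the hypothetical Z of 2.53 from the sketch above, assuming a two-tailed test.

```python
from scipy.stats import norm

# Dice analogy: probability of rolling four 7's in a row with honest dice
p_four_sevens = (6 / 36) ** 4            # about .00077, quite unusual

# p-value for a hypothetical two-tailed test with Z observed = 2.53
z_observed = 2.53
p_value = 2 * norm.sf(abs(z_observed))   # tail area(s) beyond |Z| (about .011)
```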
Step 5 of The Six Steps of
Statistical Inference

• Step 5: Make the rejection decision by comparing the p-value to α
• If p < α, reject the H0 and accept the HA at the 1 - α level of confidence
• If p > α, “fail to reject” the H0

© 2008 McGraw-Hill Higher Education
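A minimal sketch of the Step 5 decision rule, reusing the hypothetical p-value of .011 and an assumed alpha of .05.

```python
alpha = 0.05      # stated in Step 3
p_value = 0.011   # computed in Step 4 (hypothetical)

if p_value < alpha:
    print(f"Reject H0; accept HA at the {1 - alpha:.2f} level of confidence")
else:
    print("Fail to reject H0")
```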


Step 6 of The Six
Steps: Interpretation
• Step 6: Interpret and apply the results, and
provide best estimates in everyday terms
• Fit the interpretation to either a professional
or public audience: use as little statistical
jargon as possible
• Frame the interpretation around the H0 or
the HA, whichever survived the hypothesis
test
© 2008 McGraw-Hill Higher Education
Probability Theory in
Hypothesis Testing
• Computing probabilities is the essential
mathematical operation in hypothesis
testing
• Hypothesis testing is based on comparing
two probabilities:
1. What actually occurs in our single observed
sample
2. What we expect to occur in repeated
sampling
© 2008 McGraw-Hill Higher Education
A Focus on p-Values:
When the p-Value is Large

• When p > α, we fail to reject the H0


• A large p-value tells us that our observed
sample outcome is not much different or “far
off” from the outcome predicted by the H0
• A large p-value occurs when the test effect is
small, and this suggests that the effect could
easily be the result of expected sampling
error
© 2008 McGraw-Hill Higher Education
A Focus on p-Values:
When the p-Value is Small

• When p < α, we reject the H0


• A small p-value tells us that, assuming the H0 is true, our sample outcome is unusual or “far off” from the outcome predicted by the H0
• A small p-value occurs when the test effect is large, leading us to conclude that the test effect did not result from sampling error

© 2008 McGraw-Hill Higher Education


Inverse Relationship Between
Effect Size and p-Value

• A small test effect = a large p-value = “fail to reject” the H0
• A large test effect = a small p-value = “reject” the H0 and accept the HA

© 2008 McGraw-Hill Higher Education


The Level of Significance (α)
in Hypothesis Testing
• The level of significance (α) is the
critical probability point at which we are
no longer willing to say that our
sampling outcome resulted from
random sampling error
• α is stated in Step 3 and compared in
Step 5 to the p-value. This comparison
is called the rejection decision
© 2008 McGraw-Hill Higher Education
Critical Test Scores
• The critical test score (Zα) is the statistical
test score that is large enough to indicate a
significant difference between the
observed sample statistic and the
hypothesized parameter
• The critical region is the area in the tail(s)
of the probability curve that is beyond the
critical test score of the stated level of
significance
© 2008 McGraw-Hill Higher Education
Critical Test Scores (cont.)

• Zobserved is a test statistic
• Zα is a critical score
• If │Zobserved│ > │Zα│, then p < α; reject H0
• If │Zobserved│ < │Zα│, then p > α; fail to reject H0

© 2008 McGraw-Hill Higher Education
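The same decision can be made with critical scores instead of p-values. A short sketch using the hypothetical Z of 2.53 and the two-tailed α = .05 critical score of 1.96.

```python
z_observed = 2.53   # hypothetical test statistic from Step 4
z_critical = 1.96   # two-tailed critical score for alpha = .05

if abs(z_observed) > abs(z_critical):
    print("p < alpha: reject H0")
else:
    print("p > alpha: fail to reject H0")
```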


Critical Z-scores
on the Normal Curve
• Critical Z-scores are of great importance in statistical procedures and are used very frequently
• Some widely used critical Z-scores
are 1.64, 1.96, 2.33, 2.58, 3.08, and
3.30
• See if you can match these scores to the
level of significance and direction of a
hypothesis test
© 2008 McGraw-Hill Higher Education
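The matching exercise on the slide above can be checked in code by generating the critical Z score for each conventional alpha and direction; the small differences from the slide's 3.08 and 3.30 come from rounding in the printed tables.

```python
from scipy.stats import norm

for alpha in (0.10, 0.05, 0.01, 0.001):
    one_tailed = norm.ppf(1 - alpha)
    two_tailed = norm.ppf(1 - alpha / 2)
    print(f"alpha = {alpha}: one-tailed {one_tailed:.2f}, two-tailed {two_tailed:.2f}")
# one-tailed: 1.28, 1.64, 2.33, 3.09; two-tailed: 1.64, 1.96, 2.58, 3.29
```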
Choosing the Level
of Significance

• Setting the level of significance (α) allows us to control the chances of making a wrong decision or “error”
• Short of double-checking against data
for the entire population, we will never
know for sure whether we made the
correct rejection decision or made an
error
© 2008 McGraw-Hill Higher Education
Possible Results of a
Rejection Decision

• Correct decision: Failing to reject a true H0
• Type I error: Rejecting a true H0
• Correct decision: Rejecting a false H0
• Type II error: Failing to reject a false H0

© 2008 McGraw-Hill Higher Education


Managing and Controlling
Rejection Decision Errors
• When we reject H0, we either made a
correct decision or made a Type I error; we
could not have made a Type II error
• When we fail to reject H0, we made either
a correct decision or a Type II error; we
could not have made a Type I error

© 2008 McGraw-Hill Higher Education


Controlling Type I
and Type II Errors

• Type I error is easily controlled by setting the level of significance (α), because it turns out that α = p [of making a Type I error]
• β = p [of making a Type II error]; controlling
beta (β) is difficult
• β is indirectly controlled when we set α
because the two are inversely related; β is
also minimized by using a large sample size
© 2008 McGraw-Hill Higher Education
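A rough Monte Carlo sketch of these claims for a positive, one-tailed large single-sample means test: rejecting when H0 is true happens about α of the time, and β can be estimated for an assumed true mean. The population values (mean 100, standard deviation 15, true mean 103, n = 150) are made up for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu_0, sigma, n, alpha = 100, 15, 150, 0.05
z_crit = norm.ppf(1 - alpha)   # one-tailed critical score, about 1.64
reps = 10_000

def rejection_rate(true_mu):
    """Share of repeated samples in which H0 (mu = 100) is rejected."""
    samples = rng.normal(true_mu, sigma, size=(reps, n))
    z = (samples.mean(axis=1) - mu_0) / (sigma / np.sqrt(n))
    return np.mean(z > z_crit)

print(rejection_rate(100))       # Type I error rate: close to alpha (.05)
print(1 - rejection_rate(103))   # Type II error rate, beta (about .21 here)
```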
Four Conventional
Levels of Alpha (α)
• α = .10: High likelihood of rejecting the H0. Used in exploratory research, where little is known about a topic
• α = .05: Moderate likelihood of rejecting the H0. Used in survey research
• α = .01 and α = .001: Low likelihood of rejecting the H0. Used in biological, laboratory, and medical research, especially when a Type I error is life-threatening

© 2008 McGraw-Hill Higher Education


The Level of Confidence
(LOC) for a Hypothesis Test
• The LOC is the confidence we have that
we did not make a Type I error
• LOC = 1 - level of significance = 1 - α
• E.g., the .05 level of significance
corresponds to a 95% LOC
• The only time we have 100% confidence in
a conclusion is when every subject in a
population is observed

© 2008 McGraw-Hill Higher Education


Selecting Which
Statistical Test to Use

• Ask: How many variables are we observing for this test?
• What are the levels of measurement of the
variables?
• Are we dealing with one representative
sample from a single population or more?
• What is the sample size?
• Are there peculiar circumstances to
consider?
© 2008 McGraw-Hill Higher Education
When to Use a Large
Single-Sample Means Test

1. One variable
2. Interval/ratio level of measurement
3. One representative sample from one
population
4. n > 121 cases
Sampling distribution will be the normal
curve (See Chapter 7)
© 2008 McGraw-Hill Higher Education
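Pulling the six steps together for this test, a compact end-to-end sketch; every number is hypothetical and the two-tailed direction is an assumption.

```python
import math
from scipy.stats import norm

# Step 1: H0: mu = 100; HA: mu != 100 (nondirectional, two-tailed)
mu_0 = 100.0
# Step 2: with n > 121, the sampling distribution of the mean is the normal curve
x_bar, s, n = 103.1, 15.0, 150           # hypothetical sample results
# Step 3: level of significance and critical score
alpha = 0.05                             # two-tailed critical Z is about 1.96
# Step 4: test effect, test statistic, and p-value
z_observed = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = 2 * norm.sf(abs(z_observed))
# Step 5: rejection decision
decision = "reject H0" if p_value < alpha else "fail to reject H0"
# Step 6: interpret in everyday terms, framed around whichever hypothesis survived
print(z_observed, p_value, decision)     # about 2.53, .011, reject H0
```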
