Hypothesis Testing

Hypothesis
An educated opinion
What you think will happen, based on
previous research
anecdotal evidence
reading the literature
Body fat level of 8th graders
National Norm:
Mean = 23%, SD = 7%
postulated parameters (μ and σ)
Your 8th grade PE program (N=200)

How does my program compare??
Your gut feeling
You expect to find, you want to find,
your instincts tell you that your
students are better.

But are they??
Question
Is any observed difference between your
sample mean (representative of your 8th
grade population mean) and the
National Norm (population of all 8th
graders) attributable to random sampling
errors, or is there a real difference?
Is the mean of your class REALLY the
same as the National Norm?
How to determine this
Research Question
Is my POPULATION mean really 23%?
Statistical Question
μ = 23%
set the Null Hypothesis that the mean of YOUR
group is 23% (equal to the National Norm)
assume that your group is NOT REALLY different
Null Hypothesis
H₀: μ = 23%
The true difference between your sample
and the population mean is 0.
There is NO real difference between your
sample mean and the population mean.
The performance of your students is not
really different from the national norm.
Null Hypothesis
In inferential statistics, we usually want
to reject the Null hypothesis
to say that the differences are more than
what would be expected by random
sampling error
this was our initial gut feeling
our program is better
3 Possible Outcomes
No difference between groups
do not reject the null hypothesis
One specific group is higher than the other
directional hypothesis
What you EXPECT to happen when planning the
experiment/measurement
Either group mean is higher
non-directional hypothesis
The possible outcome of the
experiment/measurement

Alternative Hypothesis
Our research hypothesis (what we
expect to see)
Hₐ: μ ≠ 23%
non-directional hypothesis
interested to see if my grade body composition
is better than or worse than the national norm
Hₐ: μ < 23% (or Hₐ: μ > 23%)
directional hypothesis
expect to see my grade mean less than (better
than) the national norm
expect to see my grade mean greater than
(worse than) that of the national norm

Comparing My Class to the
National Norm
My 8th grade PE program (N = 200)
National Norm = 23%
postulated parameter
At the end of the semester, calculate the
mean % body fat
Using a random sample ( n = 25)
mean % body fat of 20 %
Is my sample mean different from the National Norm?
Need to test H₀

Determine whether the observed
difference in means is attributable to
random sampling error rather than a true
difference between the groups (my class
and the national norm), that is,
a treatment effect
Hypothesis Testing
Null Hypothesis
No true difference between two means
(sample mean and national norm)
Implies: my sample is drawn from the
identified population
Nothing more than random sampling error
accounts for any observed difference
between the means.
An element of uncertainty is inherent
in any act of observation (Menard's Philosophy)
Alternative Hypothesis
A true difference does exist between two
means
Implies: my sample is not drawn from the
identified population
Observed difference between the means is
larger than what we are willing to attribute
to random sampling error
Testing H₀
Test the probability that the observed
difference between means is attributable
to random sampling error alone
Evaluate the probability that H₀ is not to
be rejected
reject or do not reject H₀

What amount of risk
are you willing to take?
Weatherman Example
85% chance of rain
put up the sunroof
5% chance of rain
it may happen, but the chance is slight
not very likely to rain
willing to risk being wrong to avoid the
inconvenience of having to put up the sunroof.
If we do not put up the sunroof:
We reject the hypothesis that it will rain
We could be right
or
We could be wrong
Wait for certainty means to wait forever
What risk are YOU willing to take?
1%?? 5%?? 10%??
Applied Research
α = 0.10
α = 0.05
α = 0.01
α = 0.05
With these observed conditions
5 times in 100 it will rain when we have
kept the sunroof down

95 times in 100 it will not rain when we
have kept the sunroof down
α = 0.05
Reject H₀ if the observed mean
difference is so large that we would
expect it to occur by chance (random
sampling error) fewer than 5 times in 100
instances

reported in research as a statistically
significant difference
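
The "5 times in 100" idea can be checked by simulation. The sketch below is only an illustration: it assumes body fat in the population is normally distributed with the National Norm values (mean 23, SD 7), which the slides do not state, and the function name, number of trials, and seed are arbitrary choices.

```python
import math
import random


def false_alarm_rate(pop_mean=23.0, pop_sd=7.0, n=25, trials=10_000, seed=1):
    """Fraction of random samples whose mean falls beyond +/-1.96 standard
    errors even though H0 is true (every sample comes from the norm population)."""
    rng = random.Random(seed)
    se = pop_sd / math.sqrt(n)  # standard error of the mean
    extreme = 0
    for _ in range(trials):
        # Assumption: individual scores follow a normal distribution.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - pop_mean) / se
        if abs(z) > 1.96:
            extreme += 1
    return extreme / trials


rate = false_alarm_rate()
print(f"Proportion of samples beyond +/-1.96 SE: {rate:.3f}")
```

The printed proportion should sit near 0.05: random sampling error alone produces a "significant" difference about 5 times in 100.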
Testing H₀ at α = 0.05
If p > 0.05: do not reject H₀
difference is attributable to random
sampling error (expected variability in means
drawn from a population)

If p ≤ 0.05: reject H₀
difference is attributable to something other
than random sampling error
Decision Table
                              DECISION
                       H₀ TRUE               H₀ FALSE
REALITY   H₀ TRUE      Correct               Incorrect (RT1)
          H₀ FALSE     Incorrect (AFII)      Correct

RT1: Reject a True H₀ (Type I error)
AFII: Accept a False H₀ (Type II error)
Belief in God as Decision Table
H₀: God does not exist
                              DECISION
                       H₀ TRUE                     H₀ FALSE
REALITY   H₀ TRUE      Life no hope                Lived life of hope
          H₀ FALSE     Lost out on Eternal life    Eternal life
To this juncture
Sampling involves error
Expect differences between samples

If we expect a difference between
treatments/conditions, BUT we also expect a
difference because of random sampling error

HOW do we determine if the difference is
statistically significant (greater than random sampling error, RSE)?

Testing H₀ requires
Mean value
measure of typical performance level
Standard deviation
measure of the variability
n of cases
known to affect the variability expected with the
estimate of the population mean
z test for one sample
Our beginning point
National Norm BF = 23% (SD = 7%)
Our sample performance
n = 25
Mean = 20%
SD = 6%
Do my students differ
from the National Norm??
Our hypotheses
Research Hypothesis
Do my students differ from the national norm
want to know if better OR worse
H₀: There is no real difference between the BF% of my
students and the national norm

α = 0.05
Recall
A z-score > 1.96 or < -1.96 occurs about
5% of the time
see table of the Normal Curve
That is, the probability of obtaining a z-score
value this extreme purely by
chance is 5% (only 5 times in 100)
(explain).
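
The table lookup can also be reproduced in code. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `two_tailed_probability` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, SD 1


def two_tailed_probability(z):
    """Probability of a z-score at least this far from 0, in either direction."""
    return 2 * std_normal.cdf(-abs(z))


# Area beyond +/-1.96 (both tails combined)
print(round(two_tailed_probability(1.96), 4))

# Going the other way: the z value that leaves 2.5% in each tail
z_critical = std_normal.inv_cdf(1 - 0.05 / 2)
print(round(z_critical, 2))
```

The first value is the 5% quoted above; the second recovers the familiar 1.96.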
Relevance to Hypothesis Testing
Use the same general idea to evaluate
the probability of obtaining a sample
mean score of 20% with n = 25 if the
true population mean is 23%

Recall the concept of the distribution of
sampling means
Recall: Z score equation
$$Z = \frac{X - \bar{X}}{SD}$$
Introduce: Z test equation
$$Z = \frac{\bar{X} - \mu}{SE_m}$$
Standard Error of the Mean
$$SE_m = \frac{SD}{\sqrt{n}}$$
Z test equation
$$Z = \frac{\bar{X} - \mu}{SE_m}$$
Numerator: mean difference
Denominator: expected variability in sample means
Our given & required data
X̄ = 20%
SD = 6%
n = 25
μ = 23%
σ = 7%
$$Z = \frac{\bar{X} - \mu}{SE_m}$$
SE_m = 7/√25 = 7/5 = 1.4
Use the population standard deviation (SD_p = σ)
X̄ - μ = 20% - 23% = -3%
Z = -3 / 1.4 = -2.14
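
The same arithmetic can be sketched in a few lines of Python. The function name `one_sample_z` is hypothetical, and the two-tailed p-value comes from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """One-sample z test: returns (standard error, z, two-tailed p)."""
    se = pop_sd / math.sqrt(n)            # uses the POPULATION SD
    z = (sample_mean - pop_mean) / se     # mean difference / expected variability
    p = 2 * NormalDist().cdf(-abs(z))     # probability of a z this extreme by chance
    return se, z, p


# Body-fat example: sample mean 20%, norm 23%, population SD 7%, n = 25
se, z, p = one_sample_z(20.0, 23.0, 7.0, 25)
print(f"SE_m = {se:.1f}, Z = {z:.2f}, p = {p:.3f}")
```

Note that the sample SD of 6% is not used here; the z test relies on the known population SD of 7%.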
Decision Making
What is the probability of obtaining a Z = -2.14
IF the difference is attributable only to random
sampling error?

Is the observed probability (p) LESS THAN or
EQUAL TO the α level set?

Is p ≤ α?
From the tables
Z > 1.96 or Z < -1.96 has a 5% chance of
occurring purely by chance (explain).
Since Z_observed = -2.14, our statistical
conclusion is to reject H₀
a Z of -2.14 (a mean difference of -3%) is not likely
to have occurred by chance
The data indicate/suggest (not prove) that
our class HAS less body fat than the norm.
Graphically, α = 0.05
[Figure: normal curve with Z_critical = ±1.96. The Region of Non-Rejection lies between -1.96 and 1.96; the Regions of Rejection lie in both tails. Z_observed = -2.14 falls in the lower Region of Rejection.]
Reporting the Results
α = 0.05
The observed mean of our treatment group of 25
students was 20% (± 6%) body fat. The z-test for one
sample indicates that the difference between the
observed mean of 20% and the National Norm of 23%
was statistically significant (Z_obs = -2.14, p ≤ 0.05).
These data suggest that our measured percent body
fat was less than the national norm.

Reporting the Results
α = 0.01
The observed mean of our treatment group was
20% (± 6%) body fat. The z-test for one sample
indicates that the difference between the observed
mean of 20% and the National Norm of 23% was not
statistically significant (Z_obs = -2.14, p > 0.01). Our
measured percent body fat was not significantly
different from the national norm.

Reporting the Results, you
set α = 0.01
The observed mean of our treatment group was
20% (± 6%) body fat. With α = 0.01, the z-test for
one sample indicates that the difference between
the observed mean of 20% and the National Norm
of 23% was not statistically significant (Z_obs = -2.14,
p = 0.032). Our measured percent body fat was
not significantly different from the national norm.

Consider all possible reasons
for your outcome
Statistics humour
What does a statistician call it
when the heads of 10 rats are
cut off and 1 survives?

Non-significant.

Do not reject H₀ vs Accept H₀
Accept implies that we are sure H₀ is valid

Do not reject reflects that this time we are
unable to say with a high enough degree of
confidence that the difference observed is
attributable to something other than sampling error.
Examples
Decision (statistical conclusion) = ???
Z_obs = -3.45, α = 0.05
Z_obs = 1.45, α = 0.01
Z_obs = 1.96, α = 0.05
Z_obs = -1.96, α = 0.01
Z_obs = 1.96, α = 0.01
Z_obs = -1.95, α = 0.05
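
One way to check answers to these exercises is to apply the "reject if p ≤ α" rule in code. The sketch below is an assumption-laden helper, not part of the slides: the name `decide` is hypothetical, the test is taken to be two-tailed, and the p-value is computed exactly rather than read from a rounded table, which matters for borderline cases such as Z_obs = 1.96.

```python
from statistics import NormalDist


def decide(z_obs, alpha):
    """Two-tailed decision for an observed z: returns (decision, p)."""
    p = 2 * NormalDist().cdf(-abs(z_obs))
    decision = "reject H0" if p <= alpha else "do not reject H0"
    return decision, p


examples = [(-3.45, 0.05), (1.45, 0.01), (1.96, 0.05),
            (-1.96, 0.01), (1.96, 0.01), (-1.95, 0.05)]

for z_obs, alpha in examples:
    decision, p = decide(z_obs, alpha)
    print(f"Z_obs = {z_obs:+.2f}, alpha = {alpha}: p = {p:.4f} -> {decision}")
```

The same observed Z can lead to different conclusions depending on the α chosen before the data were examined.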

Z-test vs t-test
SPSS does not provide the z-test
Can only use z-test if you know population SD
Typically, all population parameter values are
estimated from sample statistics
Mean
Standard deviation
Standard error
SPSS uses t-test
Same concept, different assumptions
t-test is more robust against departures from normality
(doesn't affect the accuracy of the p-estimate as much)

When population SD is not
known ... changing distributions
The Z-test uses one sample statistic to
estimate population parameters
sample mean → population mean
Population standard deviation is known
The t-test uses two sample statistics to
estimate population parameters
sample mean → population mean
sample SD → population SD
t-test equation
So the test statistic now becomes
$$t = \frac{\bar{X} - \mu_0}{s_{\bar{X}}}$$
Estimated population SD
To estimate pop SD from sample
SD, the sample SD is inflated a
little
$$\text{est } \sigma = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
You may have noticed this modification earlier
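
A short sketch makes the "inflated a little" point concrete by computing the SD both ways. The function names are illustrative, and the data are the five heart-rate values from the SPSS example later in these slides.

```python
import math


def sd_divide_by_n(values):
    """SD computed with n in the denominator (describes the sample itself)."""
    mean = sum(values) / len(values)
    ss = sum((x - mean) ** 2 for x in values)   # sum of squared deviations
    return math.sqrt(ss / len(values))


def sd_divide_by_n_minus_1(values):
    """Estimated population SD: n - 1 in the denominator."""
    mean = sum(values) / len(values)
    ss = sum((x - mean) ** 2 for x in values)
    return math.sqrt(ss / (len(values) - 1))


heart_rates = [147, 155, 132, 165, 133]
print(round(sd_divide_by_n(heart_rates), 4))
print(round(sd_divide_by_n_minus_1(heart_rates), 4))
```

The n - 1 version is always the larger of the two, and the gap shrinks as n grows.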
SE_m from estimated SD population
To estimate standard error from
sample SD, use the estimated SD
again, thus
$$s_{\bar{X}} = \frac{s}{\sqrt{n}}$$
Recall factors affecting s_X̄
Size of estimated SE obviously
depends on both SD of sample, and
sample size
$$s_{\bar{X}} = \frac{s}{\sqrt{n}}$$
When population SD is not
known ... changing distributions
The distribution used to evaluate the calculated
ratio switches from the normal distribution to
the t-distribution
Sampling variation in the Z-distribution reflected
variability with respect to the sample mean
BUT sampling variation in the t-distribution reflects
variability with respect to the sample mean and
the standard error of the mean
So, as the sample gets smaller (and the
standard error of the mean increases) the
sampling distribution of t differs from that of Z
The good old 1.96 for 95% is toast
Concept of
Degrees of Freedom (df)
The number of independent pieces of information a
sample of observations can provide for purposes of
statistical inference
E.g. 3 numbers in a sample: 2, 2, 5
Sample mean = 3; deviations are -1, -1, 2
Are these independent?
No: when you know two, you'll know the other because
$$\sum (X - \bar{X}) = 0$$
For any sample of size n you have n-1 values
that are free to vary; the last value is fixed
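
The constraint behind degrees of freedom can be seen directly with the three-number sample. This is a small illustrative sketch; the variable names are arbitrary.

```python
sample = [2, 2, 5]
mean = sum(sample) / len(sample)

# Deviations of each value from the sample mean
deviations = [x - mean for x in sample]
print(deviations, "sum =", sum(deviations))

# Because the deviations must sum to zero, the first n - 1 of them
# determine the last one: only n - 1 are free to vary.
last_from_others = -sum(deviations[:-1])
print(last_from_others == deviations[-1])
```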
Sampling distribution of t
With large n, the t-distribution is pretty much like the z-distribution
(because sample SD is a good estimate of pop SD,
& sample SE is a good estimate of pop SE)
Sampling distribution of t
Because distribution gets flatter as n
gets smaller, this implies t for
significance gets bigger as n gets
smaller
http://duke.usask.ca/~rbaker/Tables.html
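
The claim that the critical t grows as n shrinks can be checked numerically instead of from a table. The sketch below uses only the standard library: it integrates the t density with Simpson's rule and finds the two-tailed critical value by bisection. The function names, step count, and search bounds are illustrative choices, not part of the slides.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), via Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t


def t_critical(df, alpha=0.05):
    """Two-tailed critical t for the given df, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > alpha:
            lo = mid   # not extreme enough yet
        else:
            hi = mid
    return (lo + hi) / 2


for df in (4, 9, 24, 120):
    print(f"df = {df:3d}: critical t = {t_critical(df):.3f}")
```

The printed values shrink toward 1.96 as df increases, which is the flattening described above seen from the other direction.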

Work an example with SPSS
Heart Rate (bpm) following aerobic activity
147
155
132
165
133
National standard: 158
Group Mean: 146.4 (± 14.21)
Atble351.sav
SPSS Output
One-Sample Statistics
        N    Mean        Std. Deviation    Std. Error Mean
HR      5    146.4000    14.2056           6.3530

One-Sample Test (Test Value = 158)
                                                               95% Confidence Interval
                                                               of the Difference
        t          df    Sig. (2-tailed)    Mean Difference    Lower       Upper
HR      -1.8264    4     .142               -11.6000           -29.2386    6.0386
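
For readers without SPSS, the same one-sample t-test can be sketched with Python's standard library. This is an illustration rather than a reproduction of SPSS internals: the critical value 2.776 (two-tailed, α = 0.05, df = 4) is taken from a standard t table, and the results should land close to the SPSS figures, with small differences possible in the last decimals.

```python
import math
from statistics import mean, stdev

heart_rates = [147, 155, 132, 165, 133]   # bpm following aerobic activity
test_value = 158                          # national standard
T_CRIT_DF4 = 2.776                        # t table: two-tailed, alpha = 0.05, df = 4

n = len(heart_rates)
m = mean(heart_rates)
s = stdev(heart_rates)                    # estimated population SD (n - 1)
se = s / math.sqrt(n)                     # estimated standard error of the mean
diff = m - test_value                     # mean difference
t = diff / se
df = n - 1

# 95% confidence interval of the difference
ci = (diff - T_CRIT_DF4 * se, diff + T_CRIT_DF4 * se)
reject = abs(t) > T_CRIT_DF4

print(f"mean = {m:.4f}, SD = {s:.4f}, SE = {se:.4f}")
print(f"t = {t:.4f}, df = {df}, mean difference = {diff:.4f}")
print(f"95% CI of the difference: {ci[0]:.2f} to {ci[1]:.2f}")
print("reject H0" if reject else "do not reject H0")
```

Because |t| is smaller than the critical value and the confidence interval includes 0, the decision agrees with the non-significant SPSS result.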
Statistics and beer
Time Out
