
Chapter 5

Introduction to
Hypothesis Testing
Chapter Goals
After completing this chapter, you
should be able to:
• Formulate null and alternative hypotheses for applications involving a single population mean or proportion
• Formulate a decision rule for testing a hypothesis
• Know how to use the test statistic, critical value, and p-value approaches to test the null hypothesis
• Know what Type I and Type II errors are
• Compute the probability of a Type II error
What is a Hypothesis?
A hypothesis is a claim (assumption) about a population parameter:
◦ population mean
Example: The mean monthly cell phone bill of this city is μ = $42
◦ population proportion
Example: The proportion of adults in this city with cell phones is p = .68
The Null Hypothesis, H0
• States the assumption (numerical) to be tested
Example: The average number of TV sets in U.S. homes is at least three (H0: μ ≥ 3)
• Is always about a population parameter, not about a sample statistic
Correct: H0: μ ≥ 3    Incorrect: H0: x̄ ≥ 3
The Null Hypothesis, H0
(continued)
• Begin with the assumption that the null hypothesis is true
◦ Similar to the notion of innocent until proven guilty
• Refers to the status quo
• Always contains the “=”, “≤” or “≥” sign
• May or may not be rejected
The Alternative Hypothesis, HA
• Is the opposite of the null hypothesis
◦ e.g.: The average number of TV sets in U.S. homes is less than 3 (HA: μ < 3)
• Challenges the status quo
• Never contains the “=”, “≤” or “≥” sign
• May or may not be accepted
• Is generally the hypothesis that is believed (or needs to be supported) by the researcher
Hypothesis Testing Process

Claim: the population mean age is 50.
(Null Hypothesis: H0: μ = 50)
Now select a random sample from the population.
Suppose the sample mean age is 20: x̄ = 20
Is x̄ = 20 likely if μ = 50?
If not likely, REJECT the Null Hypothesis.
Reason for Rejecting H0
Sampling Distribution of x̄
[Figure: sampling distribution of x̄ centered at μ = 50 (if H0 is true), with the observed x̄ = 20 far out in the lower tail]
If it is unlikely that we would get a sample mean of this value (x̄ = 20) if in fact this were the population mean (μ = 50), then we reject the null hypothesis that μ = 50.
Level of Significance, α
• Defines unlikely values of the sample statistic if the null hypothesis is true
• Defines the rejection region of the sampling distribution
• Is designated by α (level of significance)
• Typical values are .01, .05, or .10
• Is selected by the researcher at the beginning
• Provides the critical value(s) of the test
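Level of Significance
and the Rejection Region
Level of significance = α. The shaded rejection region has area α, and its boundary is the critical value.
Lower tail test: H0: μ ≥ 3, HA: μ < 3. Rejection region of area α in the lower tail.
Upper tail test: H0: μ ≤ 3, HA: μ > 3. Rejection region of area α in the upper tail.
Two tailed test: H0: μ = 3, HA: μ ≠ 3. Rejection regions of area α/2 in each tail.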
Errors in Making Decisions
• Type I Error
◦ Reject a true null hypothesis
◦ Considered a serious type of error
The probability of Type I Error is α
• Called level of significance of the test
• Set by researcher in advance
Errors in Making Decisions (continued)
• Type II Error
◦ Fail to reject a false null hypothesis
The probability of Type II Error is β
Outcomes and Probabilities
Possible Hypothesis Test Outcomes
Key: Outcome (Probability)

Decision            State of Nature: H0 True    State of Nature: H0 False
Do Not Reject H0    No error (1 - α)            Type II Error (β)
Reject H0           Type I Error (α)            No Error (1 - β)
Type I & II Error Relationship
• Type I and Type II errors cannot happen at the same time
◦ Type I error can only occur if H0 is true
◦ Type II error can only occur if H0 is false
If Type I error probability (α) increases, then Type II error probability (β) decreases (all else equal)
Factors Affecting Type II Error
• All else equal,
◦ β increases when the difference between the hypothesized parameter and its true value decreases
◦ β increases when α decreases
◦ β increases when σ increases
◦ β increases when n decreases
Critical Value
Approach to Testing
• Convert the sample statistic (e.g., x̄) to a test statistic (z or t statistic)
• Determine the critical value(s) for a specified level of significance α from a table or computer
• If the test statistic falls in the rejection region, reject H0; otherwise do not reject H0
Lower Tail Tests
H0: μ ≥ 3
HA: μ < 3
• The cutoff value, -zα or x̄α, is called a critical value
• Reject H0 if z < -zα (equivalently, if x̄ < x̄α); otherwise do not reject H0
$$\bar{x}_{\alpha} = \mu - z_{\alpha} \frac{\sigma}{\sqrt{n}}$$
Upper Tail Tests
H0: μ ≤ 3
HA: μ > 3
• The cutoff value, zα or x̄α, is called a critical value
• Reject H0 if z > zα (equivalently, if x̄ > x̄α); otherwise do not reject H0
$$\bar{x}_{\alpha} = \mu + z_{\alpha} \frac{\sigma}{\sqrt{n}}$$
Two Tailed Tests
H0: μ = 3
HA: μ ≠ 3
• There are two cutoff values (critical values): ± zα/2, or x̄α/2 (lower) and x̄α/2 (upper)
• Reject H0 if z < -zα/2 or z > zα/2 (area α/2 in each tail); otherwise do not reject H0
$$\bar{x}_{\alpha/2} = \mu \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Critical Value
Approach to Testing
• Convert the sample statistic (x̄) to a test statistic (z or t statistic)
Hypothesis Tests for μ:
◦ σ Known
◦ σ Unknown, Large Samples
◦ σ Unknown, Small Samples
Calculating the Test Statistic
Hypothesis Tests for μ, σ Known
The test statistic is:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Calculating the Test Statistic
(continued)
Hypothesis Tests for μ, σ Unknown, Large Samples
The test statistic is:
$$t_{n-1} = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
But is sometimes approximated using a z:
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Calculating the Test Statistic
(continued)
Hypothesis Tests for μ, σ Unknown, Small Samples
The test statistic is:
$$t_{n-1} = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
(The population must be approximately normal)
Review: Steps in Hypothesis
Testing
1. Specify the population value of interest
2. Formulate the appropriate null and alternative hypotheses
3. Specify the desired level of significance
4. Determine the rejection region
5. Obtain sample evidence and compute the test statistic
6. Reach a decision and interpret the result
Hypothesis Testing Example
Test the claim that the true mean number of TV sets in US homes is at least 3.
(Assume σ = 0.8)
1. Specify the population value of interest
• The mean number of TVs in US homes
2. Formulate the appropriate null and alternative hypotheses
• H0: μ ≥ 3   HA: μ < 3 (This is a lower tail test)
3. Specify the desired level of significance
• Suppose that α = .05 is chosen for this test
Hypothesis Testing Example
(continued)
4. Determine the rejection region
This is a one-tailed test with α = .05, so the rejection region is the lower tail of area α = .05, below -zα = -1.645.
Since σ is known, the cutoff value is a z value:
Reject H0 if z < -zα = -1.645; otherwise do not reject H0
Hypothesis Testing Example
5. Obtain sample evidence and compute the test statistic
Suppose a sample is taken with the following results: n = 100, x̄ = 2.84 (σ = 0.8 is assumed known)
◦ Then the test statistic is:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-.16}{.08} = -2.0$$
Hypothesis Testing Example
(continued)
6. Reach a decision and interpret the result
The test statistic z = -2.0 falls in the rejection region (α = .05, z < -1.645).
Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3
Hypothesis Testing Example
(continued)
• An alternate way of constructing the rejection region, now expressed in x̄, not z units (α = .05):
$$\bar{x}_{\alpha} = \mu - z_{\alpha} \frac{\sigma}{\sqrt{n}} = 3 - 1.645 \frac{0.8}{\sqrt{100}} = 2.8684$$
Reject H0 if x̄ < 2.8684; otherwise do not reject H0.
Since x̄ = 2.84 < 2.8684, we reject the null hypothesis
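The six steps of the TV example can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function name `lower_tail_z_test` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed table, so it is 1.6449 rather than the rounded 1.645.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Lower tail z test for a mean with sigma known (H0: mu >= mu0)."""
    std_error = sigma / sqrt(n)
    z = (x_bar - mu0) / std_error            # test statistic
    z_crit = NormalDist().inv_cdf(alpha)     # -z_alpha (a negative number)
    x_crit = mu0 + z_crit * std_error        # same cutoff in x-bar units
    return z, z_crit, x_crit, z < z_crit

# TV example: n = 100, x-bar = 2.84, sigma = 0.8, alpha = .05
z, z_crit, x_crit, reject = lower_tail_z_test(2.84, 3.0, 0.8, 100, 0.05)
print(f"z = {z:.2f}, critical z = {z_crit:.3f}, critical x-bar = {x_crit:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Both forms of the decision rule (z units and x̄ units) give the same answer because the x̄ cutoff is just the z cutoff rescaled by the standard error.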
p-Value Approach to Testing
• Convert the sample statistic (e.g., x̄) to a test statistic (z or t statistic)
• Obtain the p-value from a table or computer
• Compare the p-value with α
◦ If p-value < α, reject H0
◦ If p-value ≥ α, do not reject H0
p-Value Approach to Testing
(continued)
• p-value: Probability of obtaining a test statistic more extreme (≤ or ≥) than the observed sample value given H0 is true
◦ Also called observed level of significance
◦ Smallest value of α for which H0 can be rejected
p-value example
• Example: How likely is it to see a sample mean of 2.84 (or something further below the mean) if the true mean is μ = 3.0?
$$P(\bar{x} \le 2.84 \mid \mu = 3.0) = P\left(z \le \frac{2.84 - 3.0}{0.8 / \sqrt{100}}\right) = P(z \le -2.0) = .0228$$
p-value = .0228 (the area below x̄ = 2.84, which lies below the α = .05 cutoff of 2.8684)
p-value example (continued)
• Compare the p-value with α
◦ If p-value < α, reject H0
◦ If p-value ≥ α, do not reject H0
Here: p-value = .0228 and α = .05
Since .0228 < .05, we reject the null hypothesis
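The p-value for the TV example can be reproduced with the standard normal CDF. This is a small sketch, assuming Python's standard library `statistics.NormalDist` in place of the printed normal table; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 2.84, 3.0, 0.8, 100, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic, about -2.0
p_value = NormalDist().cdf(z)           # lower tail area P(Z <= z)
reject = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

For an upper tail test the p-value would instead be the area above z, that is, one minus the CDF value.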
Example: Upper Tail z Test
for Mean (σ Known)
A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume σ = 10 is known)
Form hypothesis test:
H0: μ ≤ 52 the average is not over $52 per month
HA: μ > 52 the average is greater than $52 per month (i.e., sufficient evidence exists to support the manager’s claim)
Example: Find Rejection Region
(continued)
• Suppose that α = .10 is chosen for this test
Find the rejection region: the upper tail of area α = .10, above zα = 1.28
Reject H0 if z > 1.28; otherwise do not reject H0


Review:
Finding Critical Value - One Tail
What is z given α = 0.10?
The area below z is .90 and the area above it is .10, so the area between 0 and z is .90 - .50 = .40. The table entry closest to .40 is .3997, at z = 1.28.

Standard Normal Distribution Table (Portion)
Z      .07      .08      .09
1.1    .3790    .3810    .3830
1.2    .3980    .3997    .4015
1.3    .4147    .4162    .4177

Critical Value = 1.28
Example: Test Statistic (continued)
Obtain sample evidence and compute the test statistic
Suppose a sample is taken with the following results: n = 64, x̄ = 53.1 (σ = 10 was assumed known)
◦ Then the test statistic is:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88$$
Example: Decision (continued)
Reach a decision and interpret the result:
The test statistic z = .88 lies below the critical value 1.28, outside the rejection region (α = .10).
Do not reject H0 since z = 0.88 ≤ 1.28
i.e.: there is not sufficient evidence that the mean bill is over $52
p-Value Solution (continued)
Calculate the p-value and compare to α
$$P(\bar{x} \ge 53.1 \mid \mu = 52.0) = P\left(z \ge \frac{53.1 - 52.0}{10 / \sqrt{64}}\right) = P(z \ge 0.88) = .5 - .3106 = .1894$$
p-value = .1894
Do not reject H0 since p-value = .1894 > α = .10
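The upper tail cell phone example can be worked the same way, using both the critical value and the p-value approaches. This is a sketch under the same assumption as before (Python standard library, `statistics.NormalDist` standing in for the table); the computed critical value is 1.2816 rather than the table's 1.28.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 53.1, 52.0, 10.0, 64, 0.10

z = (x_bar - mu0) / (sigma / sqrt(n))        # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)     # upper tail critical value z_alpha
p_value = 1 - NormalDist().cdf(z)            # upper tail area P(Z >= z)

reject_by_critical_value = z > z_crit
reject_by_p_value = p_value < alpha

print(f"z = {z:.2f}, critical z = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_by_p_value else "Do not reject H0")
```

The two approaches always agree: z exceeds the critical value exactly when the p-value is below α.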


Example: Two-Tail Test
(σ Unknown)
The average cost of a hotel room in New York is said to be $168 per night. A random sample of 25 hotels resulted in x̄ = $172.50 and s = $15.40. Test at the α = 0.05 level.
(Assume the population distribution is normal)
H0: μ = 168
HA: μ ≠ 168
Example Solution: Two-Tail Test
H0: μ = 168
HA: μ ≠ 168
• α = 0.05 (α/2 = .025 in each tail)
• n = 25
• σ is unknown, so use a t statistic
• Critical Value: t24 = ± 2.0639
$$t_{n-1} = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$
Since -2.0639 < 1.46 < 2.0639, the test statistic is not in the rejection region.
Do not reject H0: not sufficient evidence that true mean cost is different than $168
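A short sketch of the hotel example follows. Python's standard library has no t distribution, so the critical value 2.0639 is copied from the slide (t table, 24 d.f., α/2 = .025) rather than computed; that constant and the variable names are assumptions of this illustration.

```python
from math import sqrt

x_bar, mu0, s, n = 172.50, 168.0, 15.40, 25
t_crit = 2.0639   # from the t table: 24 d.f., alpha/2 = .025 (not computed here)

t = (x_bar - mu0) / (s / sqrt(n))   # test statistic with n - 1 = 24 d.f.
reject = abs(t) > t_crit            # two-tailed decision rule

print(f"t = {t:.2f}, critical values = ±{t_crit}")
print("Reject H0" if reject else "Do not reject H0")
```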
Hypothesis Tests for Proportions
• Involves categorical values
• Two possible outcomes
◦ “Success” (possesses a certain characteristic)
◦ “Failure” (does not possess that characteristic)
• Fraction or proportion of population in the “success” category is denoted by p
Proportions (continued)
• Sample proportion in the success category is denoted by p̄
$$\bar{p} = \frac{x}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$
• When both np and n(1-p) are at least 5, the distribution of p̄ can be approximated by a normal distribution with mean and standard deviation
$$\mu_{\bar{p}} = p \qquad \sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
Hypothesis Tests for Proportions
• The sampling distribution of p̄ is normal, so the test statistic is a z value:
$$z = \frac{\bar{p} - p}{\sqrt{\frac{p(1 - p)}{n}}}$$
Hypothesis Tests for p:
◦ np ≥ 5 and n(1-p) ≥ 5: use the z statistic above
◦ np < 5 or n(1-p) < 5: not discussed in this chapter
Example: z Test for Proportion
A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = .05 significance level.
Check:
n p = (500)(.08) = 40 and n(1-p) = (500)(.92) = 460, both at least 5
Z Test for Proportion: Solution
H0: p = .08
HA: p ≠ .08
α = .05
n = 500, p̄ = .05
Critical Values: ± 1.96 (.025 in each tail)
Test Statistic:
$$z = \frac{\bar{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1 - .08)}{500}}} = -2.47$$
Decision: Reject H0 at α = .05 (z = -2.47 < -1.96)
Conclusion: There is sufficient evidence to reject the company’s claim of 8% response rate.
p-Value Solution (continued)
Calculate the p-value and compare to α
(For a two sided test the p-value is always two sided)
$$\text{p-value} = P(z \le -2.47) + P(z \ge 2.47) = 2(.5 - .4932) = 2(.0068) = 0.0136$$
Reject H0 since p-value = .0136 < α = .05
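The mailing example can be sketched in a few lines. This assumes Python's standard library only; the helper name `proportion_z_test` is hypothetical. Because the code does not round z to -2.47 before looking up the tail area, its p-value (about .0134) differs slightly from the table-based .0136.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Two-tailed z test for a single proportion (H0: p = p0)."""
    p_bar = successes / n                          # sample proportion
    z = (p_bar - p0) / sqrt(p0 * (1 - p0) / n)     # standard error uses p0
    p_value = 2 * NormalDist().cdf(-abs(z))        # both tails
    return p_bar, z, p_value

# Mailing example: 25 responses out of 500, claimed rate 8%
p_bar, z, p_value = proportion_z_test(25, 500, 0.08)
reject = p_value < 0.05

print(f"p-bar = {p_bar:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```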


Type II Error
• β is the probability of a Type II error, that is, of failing to reject a false H0
Suppose we fail to reject H0: μ ≥ 52 when in fact the true mean is μ = 50
[Figure: x̄ axis marked at 50 and 52; reject H0: μ ≥ 52 to the left of the cutoff, do not reject H0: μ ≥ 52 to the right]
Type II Error (continued)
• Suppose we do not reject H0: μ ≥ 52 when in fact the true mean is μ = 50
[Figure: the true distribution of x̄ if μ = 50, drawn next to the hypothesized distribution centered at 52; the range of x̄ to the right of the cutoff is where H0 is not rejected]
Type II Error (continued)
• Suppose we do not reject H0: μ ≥ 52 when in fact the true mean is μ = 50
Here, β = P( x̄ ≥ cutoff ) if μ = 50
[Figure: α is the area of the hypothesized distribution below the cutoff; β is the area of the true distribution (μ = 50) above the cutoff]
Calculating β
• Suppose n = 64, σ = 6, and α = .05
$$\text{cutoff} = \bar{x}_{\alpha} = \mu - z_{\alpha} \frac{\sigma}{\sqrt{n}} = 52 - 1.645 \frac{6}{\sqrt{64}} = 50.766 \quad (\text{for } H_0: \mu \ge 52)$$
So β = P( x̄ ≥ 50.766 ) if μ = 50
Calculating β
(continued)
• Suppose n = 64, σ = 6, and α = .05
$$P(\bar{x} \ge 50.766 \mid \mu = 50) = P\left(z \ge \frac{50.766 - 50}{6 / \sqrt{64}}\right) = P(z \ge 1.02) = .5 - .3461 = .1539$$
Probability of type II error: β = .1539
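The β calculation above can be sketched as a two-step computation: find the x̄ cutoff under H0, then find the area above that cutoff under the true mean. The function name `type_ii_error_lower_tail` is hypothetical, and because the code does not round z to 1.02, it gives β ≈ .1534 instead of the table-based .1539.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_lower_tail(mu0, mu_true, sigma, n, alpha):
    """Beta for a lower tail test of H0: mu >= mu0 when the true mean is mu_true."""
    std_error = sigma / sqrt(n)
    # Step 1: cutoff in x-bar units under H0 (reject H0 when x-bar < cutoff)
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * std_error
    # Step 2: probability of NOT rejecting, computed under the true mean
    beta = 1 - NormalDist(mu_true, std_error).cdf(cutoff)
    return cutoff, beta

cutoff, beta = type_ii_error_lower_tail(mu0=52, mu_true=50, sigma=6, n=64, alpha=0.05)
print(f"cutoff = {cutoff:.3f}, beta = {beta:.4f}")
```

Re-running the function with a larger n or a true mean further from 52 shows β shrinking, which matches the factors listed earlier for Type II error.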
Chapter Summary
• Addressed hypothesis testing methodology
• Performed z Test for the mean (σ known)
• Discussed p-value approach to hypothesis testing
• Performed one-tail and two-tail tests
Chapter Summary (continued)
• Performed t test for the mean (σ unknown)
• Performed z test for the proportion
• Discussed type II error and computed its probability
Chapter 5
Estimation and Hypothesis
Testing for Two Population
Parameters
(Part B)
Chapter Goals
After completing this chapter, you
should be able to:
• Test hypotheses or form interval estimates for
◦ two independent population means
   - Standard deviations known
   - Standard deviations unknown
◦ two means from paired samples
◦ the difference between two population proportions
Estimation for Two Populations
Estimating two population values:
• Population means, independent samples. Example: Group 1 vs. independent Group 2
• Paired samples. Example: Same group before vs. after treatment
• Population proportions. Example: Proportion 1 vs. Proportion 2
Difference Between Two Means
Population means, independent samples
Goal: Form a confidence interval for the difference between two population means, μ1 – μ2
The point estimate for the difference is x̄1 – x̄2
Three cases are considered:
• σ1 and σ2 known
• σ1 and σ2 unknown, n1 and n2 ≥ 30
• σ1 and σ2 unknown, n1 or n2 < 30
Independent Samples
Population means, independent samples
• Different data sources
◦ Unrelated
◦ Independent: the sample selected from one population has no effect on the sample selected from the other population
• Use the difference between 2 sample means
• Use z test or pooled variance t test
σ1 and σ2 known
Population means, independent samples
Assumptions:
• Samples are randomly and independently drawn
• Population distributions are normal or both sample sizes are ≥ 30
• Population standard deviations are known
σ1 and σ2 known (continued)
When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a z-value, and the standard error of x̄1 – x̄2 is
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
σ1 and σ2 known (continued)
The confidence interval for μ1 – μ2 is:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
σ1 and σ2 unknown, large samples
Population means, independent samples
Assumptions:
• Samples are randomly and independently drawn
• Both sample sizes are ≥ 30
• Population standard deviations are unknown
σ1 and σ2 unknown, large samples (continued)
Forming interval estimates:
• Use sample standard deviation s to estimate σ
• The test statistic is a z value
σ1 and σ2 unknown, large samples (continued)
The confidence interval for μ1 – μ2 is:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
σ1 and σ2 unknown, small samples
Population means, independent samples
Assumptions:
• Populations are normally distributed
• The populations have equal variances
• Samples are independent
σ1 and σ2 unknown, small samples (continued)
Forming interval estimates:
• The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ
• The test statistic is a t value with (n1 + n2 – 2) degrees of freedom
σ1 and σ2 unknown, small samples (continued)
The pooled standard deviation is
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
σ1 and σ2 unknown, small samples (continued)
The confidence interval for μ1 – μ2 is:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where tα/2 has (n1 + n2 – 2) d.f., and
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
Paired Samples
Tests Means of 2 Related Populations
◦ Paired or matched samples
◦ Repeated measures (before/after)
◦ Use difference between paired values:
d = x1 - x2
• Eliminates Variation Among Subjects
• Assumptions:
◦ Both Populations Are Normally Distributed
◦ Or, if Not Normal, use large samples
Paired Differences
The ith paired difference is di, where
di = x1i - x2i
The point estimate for the population mean paired difference is d̄:
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$$
The sample standard deviation is
$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
n is the number of pairs in the paired sample
Paired Differences (continued)
The confidence interval for the population mean paired difference μd is
$$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$$
where tα/2 has n - 1 d.f. and sd is:
$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
n is the number of pairs in the paired sample
Hypothesis Tests for the
Difference Between Two Means
• Testing Hypotheses about μ1 – μ2
• Use the same situations discussed already:
◦ Standard deviations known or unknown
◦ Sample sizes ≥ 30 or not ≥ 30
Hypothesis Tests for
Two Population Means
Two Population Means, Independent Samples
Lower tail test: H0: μ1 ≥ μ2, HA: μ1 < μ2, i.e., H0: μ1 – μ2 ≥ 0, HA: μ1 – μ2 < 0
Upper tail test: H0: μ1 ≤ μ2, HA: μ1 > μ2, i.e., H0: μ1 – μ2 ≤ 0, HA: μ1 – μ2 > 0
Two-tailed test: H0: μ1 = μ2, HA: μ1 ≠ μ2, i.e., H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0
Hypothesis tests for μ1 – μ2
Population means, independent samples
• σ1 and σ2 known: Use a z test statistic
• σ1 and σ2 unknown, n1 and n2 ≥ 30: Use s to estimate unknown σ, approximate with a z test statistic
• σ1 and σ2 unknown, n1 or n2 < 30: Use s to estimate unknown σ, use a t test statistic and pooled standard deviation
σ1 and σ2 known
Population means, independent samples
The test statistic for μ1 – μ2 is:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
σ1 and σ2 unknown, large samples
Population means, independent samples
The test statistic for μ1 – μ2 is:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
σ1 and σ2 unknown, small samples
Population means, independent samples
The test statistic for μ1 – μ2 is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where t has (n1 + n2 – 2) d.f., and
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples
Lower tail test: H0: μ1 – μ2 ≥ 0, HA: μ1 – μ2 < 0. Reject H0 if z < -zα
Upper tail test: H0: μ1 – μ2 ≤ 0, HA: μ1 – μ2 > 0. Reject H0 if z > zα
Two-tailed test: H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0. Reject H0 if z < -zα/2 or z > zα/2
Pooled sp t Test: Example
You’re a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming equal variances, is there a difference in average yield (α = 0.05)?
Calculating the Test Statistic
The pooled standard deviation needed for the test statistic is:
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{21 + 25 - 2}} = 1.2256$$
Solution
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
HA: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05 (.025 in each tail)
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154
Test Statistic:
$$t = \frac{3.27 - 2.53}{1.2256 \sqrt{\frac{1}{21} + \frac{1}{25}}} = 2.040$$
Decision: Reject H0 at α = 0.05 (t = 2.040 > 2.0154)
Conclusion: There is evidence of a difference in means.
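The pooled-variance t test for the NYSE/NASDAQ example can be sketched as follows. The function name `pooled_t_statistic` is hypothetical, and the critical value 2.0154 is taken from the slide (t table, 44 d.f., α/2 = .025) because the Python standard library does not provide a t distribution.

```python
from math import sqrt

def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Pooled standard deviation and t statistic for two independent samples."""
    df = n1 + n2 - 2
    s_p = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = ((mean1 - mean2) - hypothesized_diff) / (s_p * sqrt(1 / n1 + 1 / n2))
    return s_p, t, df

# NYSE (sample 1) vs. NASDAQ (sample 2)
s_p, t, df = pooled_t_statistic(3.27, 1.30, 21, 2.53, 1.16, 25)
t_crit = 2.0154   # from the t table: 44 d.f., alpha/2 = .025 (not computed here)
reject = abs(t) > t_crit

print(f"s_p = {s_p:.4f}, t = {t:.3f}, d.f. = {df}")
print("Reject H0" if reject else "Do not reject H0")
```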
Hypothesis Testing
for Paired Samples
The test statistic for d̄ is
$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$
where t has n - 1 d.f. and sd is:
$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
n is the number of pairs in the paired sample
Hypothesis Testing for
Paired Samples
(continued)
Lower tail test: H0: μd ≥ 0, HA: μd < 0. Reject H0 if t < -tα
Upper tail test: H0: μd ≤ 0, HA: μd > 0. Reject H0 if t > tα
Two-tailed test: H0: μd = 0, HA: μd ≠ 0. Reject H0 if t < -tα/2 or t > tα/2
Where t has n - 1 d.f.
Paired Samples Example
• Assume you send your salespeople to a “customer service” training workshop. Is the training effective? You collect the following data:

Number of Complaints:
Salesperson    Before (1)    After (2)    Difference, di = (2) - (1)
C.B.           6             4            -2
T.F.           20            6            -14
M.H.           3             2            -1
R.K.           0             0            0
M.O.           4             0            -4
Sum                                       -21

$$\bar{d} = \frac{\sum d_i}{n} = -4.2$$
$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 5.67$$
Paired Samples: Solution
• Has the training made a difference in the number of complaints (at the 0.01 level)?
H0: μd = 0
HA: μd ≠ 0
α = .01, d̄ = -4.2
d.f. = n - 1 = 4
Critical Value = ± 4.604
Test Statistic:
$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$
Decision: Do not reject H0 (t stat is not in the reject region: -4.604 < -1.66 < 4.604)
Conclusion: There is not a significant change in the number of complaints.
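The paired-sample calculation can be reproduced from the raw complaint counts. This sketch assumes Python's standard library (`statistics.mean` and `statistics.stdev`, the latter using the n - 1 divisor); the critical value 4.604 is copied from the slide (t table, 4 d.f., α/2 = .005). The unrounded statistic is about -1.655, which the slide reports as -1.66 after rounding sd to 5.67.

```python
from math import sqrt
from statistics import mean, stdev

before = [6, 20, 3, 0, 4]   # complaints before training, per salesperson
after = [6 - 2, 6, 2, 0, 0] # complaints after training: 4, 6, 2, 0, 0

# Differences follow the slide's convention: After (2) minus Before (1)
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = mean(diffs)          # point estimate of the mean paired difference
s_d = stdev(diffs)           # sample standard deviation (n - 1 divisor)
t = (d_bar - 0) / (s_d / sqrt(n))

t_crit = 4.604               # from the t table: 4 d.f., alpha/2 = .005
reject = abs(t) > t_crit

print(f"d-bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```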
Two Population Proportions
Goal: Form a confidence interval for or test a hypothesis about the difference between two population proportions, p1 – p2
Assumptions:
n1p1 ≥ 5 , n1(1-p1) ≥ 5
n2p2 ≥ 5 , n2(1-p2) ≥ 5
The point estimate for the difference is p̄1 – p̄2
Confidence Interval for
Two Population Proportions
The confidence interval for p1 – p2 is:
$$(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$$
Hypothesis Tests for
Two Population Proportions
Population proportions
Lower tail test: H0: p1 ≥ p2, HA: p1 < p2, i.e., H0: p1 – p2 ≥ 0, HA: p1 – p2 < 0
Upper tail test: H0: p1 ≤ p2, HA: p1 > p2, i.e., H0: p1 – p2 ≤ 0, HA: p1 – p2 > 0
Two-tailed test: H0: p1 = p2, HA: p1 ≠ p2, i.e., H0: p1 – p2 = 0, HA: p1 – p2 ≠ 0
Two Population Proportions
Since we begin by assuming the null hypothesis is true, we assume p1 = p2 and pool the two p̄ estimates
The pooled estimate for the overall proportion is:
$$\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$$
where x1 and x2 are the numbers from samples 1 and 2 with the characteristic of interest
Two Population Proportions
(continued)
The test statistic for p1 – p2 is:
$$z = \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Hypothesis Tests for
Two Population Proportions
Population proportions
Lower tail test: H0: p1 – p2 ≥ 0, HA: p1 – p2 < 0. Reject H0 if z < -zα
Upper tail test: H0: p1 – p2 ≤ 0, HA: p1 – p2 > 0. Reject H0 if z > zα
Two-tailed test: H0: p1 – p2 = 0, HA: p1 – p2 ≠ 0. Reject H0 if z < -zα/2 or z > zα/2
Example:
Two population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
• In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes
• Test at the .05 level of significance


Example:
Two population Proportions
(continued)
• The hypothesis test is:
H0: p1 – p2 = 0 (the two proportions are equal)
HA: p1 – p2 ≠ 0 (there is a significant difference between proportions)
• The sample proportions are:
◦ Men: p̄1 = 36/72 = .50
◦ Women: p̄2 = 31/50 = .62
• The pooled estimate for the overall proportion is:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549$$
Example:
Two population Proportions
(continued)
The test statistic for p1 – p2 is:
$$z = \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(.50 - .62) - (0)}{\sqrt{.549(1 - .549)\left(\frac{1}{72} + \frac{1}{50}\right)}} = -1.31$$
Critical Values = ±1.96 for α = .05 (.025 in each tail)
Decision: Do not reject H0 (-1.96 < -1.31 < 1.96)
Conclusion: There is not significant evidence of a difference between men and women in the proportion who will vote yes.
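The Proposition A example can be verified with a short function. This is a sketch assuming Python's standard library; the name `two_proportion_z_test` is hypothetical, and the critical value is computed with `statistics.NormalDist` (1.960) rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_bar - p2_bar) / std_error
    return p_pooled, z

# Men: 36 of 72 vote Yes; women: 31 of 50 vote Yes
p_pooled, z = two_proportion_z_test(36, 72, 31, 50)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed, alpha = .05
reject = abs(z) > z_crit

print(f"pooled p = {p_pooled:.3f}, z = {z:.2f}, critical values = ±{z_crit:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```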
Chapter Summary
• Compared two independent samples
◦ Formed confidence intervals for the differences between two means
◦ Performed Z test for the differences in two means
◦ Performed t test for the differences in two means
• Compared two related samples (paired samples)
◦ Formed confidence intervals for the paired difference
◦ Performed paired sample t tests for the mean difference
• Compared two population proportions
◦ Formed confidence intervals for the difference between two population proportions
◦ Performed Z-test for two population proportions
Exercises 9.1
n1 = 100, n2 = 150,
x̄1 = 50, x̄2 = 65
s1 = 6, s2 = 8
• Determine the 90% confidence interval estimate for the difference between population means. Interpret the estimate. (90% => Zα/2 = 1.645)
• Determine the 98% confidence interval estimate for the difference between population means. Interpret the estimate. (98% => Zα/2 = 2.33)
• What are the advantages and disadvantages of using a higher confidence level to estimate the difference between the two population means?
Exercises 9.2
• You are given the following results of a paired difference test: d̄ = 344, sd = 34, n = 23
• a. Construct and interpret a 95% confidence interval estimate for the paired difference in mean values
• b. Construct and interpret a 90% confidence interval estimate for the paired difference in mean values
• Discuss why the two estimates are different. What are the advantages and disadvantages of using a lower confidence level?
Chapter 5
Hypothesis Tests for
One and Two Population
Variances
(Part C)
Chapter Goals
After completing this chapter, you
should be able to:
• Formulate and complete hypothesis tests for a single population variance
• Find critical chi-square distribution values from the chi-square table
• Formulate and complete hypothesis tests for the difference between two population variances
• Use the F table to find critical F values
Hypothesis Tests for Variances
• Tests for a Single Population Variance: Chi-Square test statistic
• Tests for Two Population Variances: F test statistic
Single Population
Tests for a Single Population Variance (Chi-Square test statistic)
Two tailed test: H0: σ² = σ0², HA: σ² ≠ σ0²
Lower tail test: H0: σ² ≥ σ0², HA: σ² < σ0²
Upper tail test: H0: σ² ≤ σ0², HA: σ² > σ0²
Chi-Square Test Statistic
The chi-squared test statistic for a Single Population Variance is:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$
where
χ² = standardized chi-square variable
n = sample size
s² = sample variance
σ² = hypothesized variance
The Chi-square Distribution
• The chi-square distribution is a family of distributions, depending on degrees of freedom:
• d.f. = n - 1
[Figure: three chi-square density curves plotted against χ² from 0 to 28, for d.f. = 1, d.f. = 5, and d.f. = 15]
Finding the Critical Value
• The critical value, χ²α, is found from the chi-square table
Upper tail test:
H0: σ² ≤ σ0²
HA: σ² > σ0²
Reject H0 if χ² > χ²α (upper tail area α); otherwise do not reject H0
Example
• A commercial freezer must hold the selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (or variance of 16 degrees²). A sample of 16 freezers is tested and yields a sample variance of s² = 24. Test to see whether the standard deviation specification is exceeded. Use α = .05
Finding the Critical Value
• Use the chi-square table to find the critical value:
χ²α = 24.9958 (α = .05 and 16 – 1 = 15 d.f.)
The test statistic is:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(16 - 1)24}{16} = 22.5$$
Since 22.5 < 24.9958, do not reject H0
There is not significant evidence at the α = .05 level that the standard deviation specification is exceeded
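The freezer example reduces to one line of arithmetic plus a comparison. In this sketch the critical value 24.9958 is copied from the slide (chi-square table, 15 d.f., α = .05) because the Python standard library has no chi-square distribution; the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(n, sample_variance, hypothesized_variance):
    """Chi-square test statistic for a single population variance."""
    return (n - 1) * sample_variance / hypothesized_variance

# Freezer example: n = 16, s^2 = 24, hypothesized variance = 16
chi_sq = chi_square_statistic(16, 24, 16)
chi_sq_crit = 24.9958   # from the chi-square table: 15 d.f., alpha = .05
reject = chi_sq > chi_sq_crit   # upper tail test

print(f"chi-square = {chi_sq}, critical value = {chi_sq_crit}")
print("Reject H0" if reject else "Do not reject H0")
```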
Lower Tail or Two Tailed
Chi-square Tests
Lower tail test:
H0: σ² ≥ σ0²
HA: σ² < σ0²
Reject H0 if χ² < χ²1-α (lower tail area α); otherwise do not reject H0
Two tail test:
H0: σ² = σ0²
HA: σ² ≠ σ0²
Reject H0 if χ² < χ²1-α/2 or χ² > χ²α/2 (area α/2 in each tail); otherwise do not reject H0
F Test for Difference in Two
Population Variances
Tests for Two Population Variances (F test statistic)
Two tailed test: H0: σ1² – σ2² = 0, HA: σ1² – σ2² ≠ 0
Lower tail test: H0: σ1² – σ2² ≥ 0, HA: σ1² – σ2² < 0
Upper tail test: H0: σ1² – σ2² ≤ 0, HA: σ1² – σ2² > 0
F Test for Difference in Two
Population Variances
The F test statistic is:
$$F = \frac{s_1^2}{s_2^2}$$
(Place the larger sample variance in the numerator)
s1² = Variance of Sample 1
n1 - 1 = numerator degrees of freedom
s2² = Variance of Sample 2
n2 - 1 = denominator degrees of freedom
The F Distribution
• The F critical value is found from the F table
• There are two appropriate degrees of freedom: numerator and denominator
$$F = \frac{s_1^2}{s_2^2} \quad \text{where } df_1 = n_1 - 1 ;\ df_2 = n_2 - 1$$
• In the F table,
◦ numerator degrees of freedom determine the row
◦ denominator degrees of freedom determine the column
Finding the Critical Value
One-tail test:
H0: σ1² – σ2² ≥ 0, HA: σ1² – σ2² < 0
or
H0: σ1² – σ2² ≤ 0, HA: σ1² – σ2² > 0
• The rejection region for a one-tail test is
$$F = \frac{s_1^2}{s_2^2} > F_{\alpha}$$
Two-tailed test:
H0: σ1² – σ2² = 0, HA: σ1² – σ2² ≠ 0
• The rejection region for a two-tailed test is
$$F = \frac{s_1^2}{s_2^2} > F_{\alpha/2}$$
(when the larger sample variance is in the numerator)
F Test: An Example
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number      21      25
Mean        3.27    2.53
Std dev     1.30    1.16

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.05 level?
F Test: Example Solution
• Form the hypothesis test:
H0: σ1² – σ2² = 0 (there is no difference between variances)
HA: σ1² – σ2² ≠ 0 (there is a difference between variances)
• Find the F critical value for α = .05:
◦ Numerator: df1 = n1 – 1 = 21 – 1 = 20
◦ Denominator: df2 = n2 – 1 = 25 – 1 = 24
F.05/2, 20, 24 = 2.327


F Test: Example Solution
(continued)
• The test statistic is:

$$F = \frac{s_1^2}{s_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

[Figure: F distribution for $H_0: \sigma_1^2 - \sigma_2^2 = 0$ against $H_A: \sigma_1^2 - \sigma_2^2 \ne 0$ with α/2 = .025; do not reject H0 below $F_{\alpha/2} = 2.327$, reject H0 above it.]

• F = 1.256 is not greater than the critical F value of 2.327, so
we do not reject H0
• Conclusion: There is no evidence of a
difference in variances at α = .05
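
The NYSE/NASDAQ example can be reproduced with a short Python sketch. The sample sizes, standard deviations and the critical value 2.327 are taken from the slides; the variable names are illustrative only.

```python
# Summary data from the example
n_nyse, s_nyse = 21, 1.30
n_nasdaq, s_nasdaq = 25, 1.16

# Critical value F(.05/2; 20, 24) as read from the F table on the slide
f_crit = 2.327

# NYSE has the larger sample variance, so it goes in the numerator
f_stat = s_nyse ** 2 / s_nasdaq ** 2
df_num, df_den = n_nyse - 1, n_nasdaq - 1

# Two-tailed test: reject H0 only if F exceeds the alpha/2 critical value
reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.3f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```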
Chapter Summary
• Performed chi-square tests for the variance
• Used the chi-square table to find chi-square
critical values
• Performed F tests for the difference
between two population variances
• Used the F table to find F critical values
Exercise
• A company is interested in determining whether there is
a difference in mean sales after launching an ad campaign.
They conduct a test using random samples of 15 shops
before and after the ad campaign, as follows:
Shops    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Before  57  61  12  38  12  69   5  39  88   9  92  26  14  70  22
After   60  54  20  35  21  70   1  65  79  10  90  32  19  77  29

• What do you conclude at the .05 level of significance?


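One possible way to set up this exercise is sketched below. It assumes the before and after figures for each shop form matched pairs, which the slide suggests but does not state explicitly, and it computes a paired t statistic on the differences. The variable names are illustrative.

```python
import math
import statistics

before = [57, 61, 12, 38, 12, 69, 5, 39, 88, 9, 92, 26, 14, 70, 22]
after = [60, 54, 20, 35, 21, 70, 1, 65, 79, 10, 90, 32, 19, 77, 29]

# Assumption: each shop is measured twice, so we analyse the paired differences
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_d = statistics.mean(diffs)   # mean of the differences
sd_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 divisor)
t_stat = mean_d / (sd_d / math.sqrt(n))

print(f"n = {n}, mean difference = {mean_d:.2f}, t = {t_stat:.3f}, df = {n - 1}")
```

The resulting t statistic would then be compared with the two-tailed critical value from the t table for n - 1 = 14 degrees of freedom.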
Exercise 2
A medical research group is investigating what differences
might exist between two pain-killing drugs, Azerlieve and
Zynumbic. The researchers have already established that
there is no difference between the two drugs in terms of the
average amount of time required before the drugs take
effect. However, they are also interested in knowing whether there is
any difference in the variability of the time until pain relief
occurs. A random sample of 24 patients using Azerlieve and
32 patients using Zynumbic yields the following results:
• Azerlieve: n1 = 36, s1 = 37.5 sec
• Zynumbic: n2 = 32, s2 = 41.3 sec
At the .05 level of significance, can the researchers conclude
that there is a significant difference in the variability of the
time until pain relief between the two drugs?
Exercise 3
• In clinical trials of a certain drug, conducted before its
release to the public, 3800 adults were randomly
divided into two groups. The patients in group 1
(experimental group) received 200 mg of the drug,
while the patients in group 2 (control group) received
a placebo. Of the 2100 patients in the
experimental group, 550 reported headache as a side
effect. Of the 1700 patients in the control group, 370
reported headache as a side effect. Is there
significant evidence to support the claim that the
proportion of drug users who experienced
headache as a side effect is greater than the
proportion in the control group at the α = 0.05 level of
significance?
• Checking the conditions required to carry out a hypothesis test on the
difference between two proportions, we have:
1. The samples are independently obtained using simple random sampling.
2. $\hat{p}_1 = x/n_1 = 550/2100 = 0.2619$ and $\hat{p}_2 = y/n_2 = 370/1700 = 0.2176$.
3. $n_1 \hat{p}_1 (1 - \hat{p}_1) = 2100(0.2619)(1 - 0.2619) = 405.9476 \ge 10$, and
4. $n_2 \hat{p}_2 (1 - \hat{p}_2) = 1700(0.2176)(1 - 0.2176) = 289.4254 \ge 10$.
Thus we proceed with the classical method using the 6 steps, and then we apply the
p-value method second. So we have:
1. H0: p1 ≤ p2 versus H1: p1 > p2, right-tailed test.
2. α = 0.05 is the level of significance. The critical value is Z0.05 = 1.645 and the
rejection region is given by Z > 1.645.
3. The test statistic is

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}},$$

where $\hat{p} = (x + y)/(n_1 + n_2)$ is the pooled sample proportion.
4. Zcal = 3.1668, based on the data provided.
5. Since the test statistic falls in the rejection region, the null hypothesis is rejected, i.e.,
H0: p1 ≤ p2 is rejected, and H1: p1 > p2 is supported.
6. There is sufficient evidence at the α = 0.05 level of significance to support the claim
that the proportion of adults taking 200 mg of the drug who experienced headaches is
greater than the proportion of adults taking a placebo who experienced headaches.
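
The calculation behind Zcal can be checked with a short Python sketch. It assumes the pooled sample proportion is used in the standard error, which reproduces the value 3.1668 quoted above. The upper-tail p-value is obtained from the standard normal distribution via `math.erfc`. Variable names are illustrative.

```python
import math

x1, n1 = 550, 2100   # headaches / patients, experimental group
x2, n2 = 370, 1700   # headaches / patients, control group

p1_hat = x1 / n1
p2_hat = x2 / n2

# Assumption: pooled proportion under H0 (p1 - p2 = 0)
p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / se

# Right-tailed test: P(Z > z) for a standard normal variable
p_value = 0.5 * math.erfc(z / math.sqrt(2))

z_crit = 1.645  # Z(0.05) from the slide
reject_h0 = z > z_crit
print(f"z = {z:.4f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```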
 Test the claim that μ1 > μ2 at the 0.05
level of significance for the given data
     Population 1   Population 2
n    35             35
x̄    15.3           14.2
s    3.2            3.5

We have two large samples, each with n > 30. We will use the p-value method to
test the difference between two means, with population variances
unknown.
1. H0: μ1 ≤ μ2 versus H1: μ1 > μ2, right-tailed test.
2. Let α = 0.05.
3. The test statistic is

$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$

4. Based on the information provided, the test statistic is Z = 1.3722.
5. For the right-tailed test, p-value = 0.08499 > α.
Hence the null hypothesis is not rejected.
6. There is insufficient evidence at the 0.05 level of significance to support the claim that μ1 > μ2.
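
The Z statistic and p-value for this problem can be verified with a short Python sketch based on the summary data in the table. It assumes the large-sample normal approximation used in the solution. Because the p-value exceeds α, the sketch reports that H0 is not rejected. Variable names are illustrative.

```python
import math

n1, xbar1, s1 = 35, 15.3, 3.2   # Population 1 sample
n2, xbar2, s2 = 35, 14.2, 3.5   # Population 2 sample
alpha = 0.05

# Standard error with unknown variances estimated by the sample variances
se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Under H0 the hypothesized difference mu1 - mu2 is 0
z = (xbar1 - xbar2) / se

# Right-tailed p-value from the standard normal distribution
p_value = 0.5 * math.erfc(z / math.sqrt(2))

reject_h0 = p_value < alpha
print(f"z = {z:.4f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```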
Problem 1: Reading Scores
Suppose we want to compare the reading scores
of men and women on a standardized reading test.
We take a random sample of 31 people and obtain
the results below. Note that the women
outperform the men by 4 points. Of course, this
might simply be sampling error. We would like to
test whether or not this difference is significant at
the α = .05 level.
     Men        Women
x̄    80 = x̄1    84 = x̄2
S    16 = S1    20 = S2
n    16 = n1    15 = n2
Problem 1: Reading Scores
(cont’d)
• H0: μ1 = μ2
H1: μ1 ≠ μ2
• Note that
◦ σ1, σ2 are not given
◦ n1 + n2 = 31
• We will use a t statistic with n1 + n2 - 2 =
29 degrees of freedom
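
A t statistic with n1 + n2 - 2 degrees of freedom corresponds to the pooled-variance form of the two-sample t test. The sketch below assumes that form, which the slide implies but does not write out. It uses the summary data for men and women, and the critical value 2.045 is the standard t table entry for a two-tailed test at α = .05 with 29 degrees of freedom. Variable names are illustrative.

```python
import math

n1, xbar1, s1 = 16, 80, 16   # men
n2, xbar2, s2 = 15, 84, 20   # women

df = n1 + n2 - 2

# Assumption: equal population variances, so the sample variances are pooled
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))

t_stat = (xbar1 - xbar2) / se

t_crit = 2.045  # t(.025; 29) from a standard t table
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```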
Using MS Excel: Job Satisfaction
• Comparing men and women on job satisfaction
◦ 10 is the highest job satisfaction score; 0 the lowest
MEN WOMEN
7 1 1 4
8 7 10 3
6 2 3 5
5 4 4 6
6 6 1 4
5 7 1 2
6 8 2 5
9 9 3 1
8 7 5 4
Spending on Wine
• A marketer wants to determine whether men and
women spend different amounts on wine. (It is
well known that men spend considerably more on
beer.)
• The researcher randomly samples 34 people (17
women and 17 men).
Spending on Wine (cont’d)
• This is the data showing how much money 17 men and
17 women spent on wine over the year.
Women      Men
$100.00    $107.00
$250.00    $240.00
$890.00    $880.00
$765.00    $770.00
$456.00    $409.00
$356.00    $500.00
$876.00    $800.00
$740.00    $900.00
$231.00    $1,000.00
$222.00    $489.00
$555.00    $800.00
$666.00    $890.00
$876.00    $770.00
$10.00     $509.00
$290.00    $100.00
$98.00     $102.00