
LECTURE 10

Two-sample Hypothesis Tests


Learning Objectives
In this chapter, you learn about:
• Comparing Two Means: Independent Samples
• Confidence Interval for the Difference of Two Means
• Comparing Two Means: Paired Samples
• Comparing Two Proportions
• Confidence Interval for the Difference of Two Proportions
• Comparing Two Variances
Two-sample Tests
• A one-sample test compares a sample estimate against a non-sample benchmark
• A two-sample test compares two sample estimates with each other

Two-Sample Tests
• Population Means, Independent Samples
• Population Means, Related Samples
• Population Proportions
• Population Variances
Difference Between Two Means
Independent Samples

Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is X̄1 – X̄2


Hypothesis Tests for
Two Population Means
Two Population Means, Independent Samples

Lower-tail test:        Upper-tail test:        Two-tail test:
H0: μ1 ≥ μ2             H0: μ1 ≤ μ2             H0: μ1 = μ2
H1: μ1 < μ2             H1: μ1 > μ2             H1: μ1 ≠ μ2
i.e.,                   i.e.,                   i.e.,
H0: μ1 – μ2 ≥ 0         H0: μ1 – μ2 ≤ 0         H0: μ1 – μ2 = 0
H1: μ1 – μ2 < 0         H1: μ1 – μ2 > 0         H1: μ1 – μ2 ≠ 0
Rejection region:       Rejection region:       Rejection region:
α in the lower tail     α in the upper tail     α/2 in each tail
Hypothesis tests for µ1 - µ2 with σ1
and σ2 known

Population means, independent samples. Three cases:
• σ1 and σ2 known (this case)
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are known
Hypothesis tests for µ1 - µ2 with σ1
and σ2 known

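The test statistic is:

$$Z_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$$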
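
The Z statistic and confidence interval for known σ1 and σ2 can be sketched in a few lines of Python. This is only an illustration: the function name `two_sample_z` and the input numbers are hypothetical and do not come from the lecture, and the critical value is taken from the standard normal distribution in Python's standard library.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2,
                 hypothesized_diff=0.0, alpha=0.05):
    """Return (Z statistic, (lower, upper) CI) for mu1 - mu2 with known sigmas."""
    # Standard error of the difference between the two sample means
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    z_stat = (diff - hypothesized_diff) / se
    # Two-tail critical value Z_{alpha/2}
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_stat, (diff - z_crit * se, diff + z_crit * se)

# Hypothetical inputs (NOT from the lecture), chosen only to exercise the formulas
z_stat, ci = two_sample_z(xbar1=50.0, xbar2=48.0, sigma1=4.0, sigma2=3.0, n1=40, n2=36)
print(f"Z = {z_stat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

For a two-tail test at α = 0.05, H0 would be rejected when the absolute value of `z_stat` exceeds the critical value, which is equivalent to the interval excluding the hypothesized difference.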
Hypothesis tests for µ1 - µ2 with σ1
and σ2 unknown and assumed equal

Population means, independent samples. Three cases:
• σ1 and σ2 known
• σ1 and σ2 unknown, assumed equal (this case)
• σ1 and σ2 unknown, not assumed equal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are unknown but assumed equal
Hypothesis tests for µ1 - µ2 with σ1
and σ2 unknown and assumed equal

The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where tSTAT has d.f. = (n1 + n2 – 2)


Confidence interval for µ1 - µ2 with σ1
and σ2 unknown and assumed equal

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$$

where tα/2 has d.f. = n1 + n2 – 2


Pooled-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?
Pooled-Variance t Test Example:
Calculating the Test Statistic
H0: μ1 - μ2 = 0
H1: μ1 - μ2 ≠ 0

The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)(1.30)^2 + (25 - 1)(1.16)^2}{(21 - 1) + (25 - 1)} = 1.5021$$

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$
Pooled-Variance t Test Example:
Hypothesis Test Solution
α = 0.05
df = 21 + 25 - 2 = 44
Critical values: t = ± 2.0154 (0.025 in each tail; reject H0 if tSTAT < -2.0154 or tSTAT > 2.0154)

Test statistic:

$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

Decision: Reject H0 at α = 0.05, since 2.040 > 2.0154
Conclusion: There is evidence of a difference in means.
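
The arithmetic of this example can be checked with a short Python sketch. The function name `pooled_variance_t` is an illustrative choice, and the critical value 2.0154 is simply copied from the slide rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

def pooled_variance_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((xbar1 - xbar2) - hypothesized_diff) / se
    return sp2, t_stat, df

# NYSE (sample 1) and NASDAQ (sample 2) summary statistics from the example
sp2, t_stat, df = pooled_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)

T_CRIT = 2.0154  # two-tail critical value for alpha = 0.05, df = 44, as given on the slide
reject_h0 = abs(t_stat) > T_CRIT

print(f"Sp^2 = {sp2:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```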
Minitab Pooled-Variance t test
Comparing NYSE & NASDAQ
Two-Sample T-Test and CI

Sample   N    Mean   StDev   SE Mean
1        21   3.27   1.30    0.28
2        25   2.53   1.16    0.23

Difference = mu (1) - mu (2)
Estimate for difference: 0.740
95% CI for difference: (0.009, 1.471)
T-Test of difference = 0 (vs not =): T-Value = 2.04 P-Value = 0.047 DF = 44
Both use Pooled StDev = 1.2256

Decision: Reject H0 at α = 0.05
Conclusion: There is evidence of a difference in means.
Pooled-Variance t Test Example:
Confidence Interval for µ1 - µ2

Since we rejected H0, can we be 95% confident that µNYSE > µNASDAQ?

95% Confidence Interval for µNYSE - µNASDAQ:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009, 1.471)$$

Since 0 is less than the entire interval, we can be 95% confident that µNYSE > µNASDAQ.
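
As a rough check on the interval, the following sketch recomputes it from the summary statistics. The name `pooled_ci` is illustrative, and the critical value t = 2.0154 (df = 44) is passed in as given in the example rather than derived.

```python
from math import sqrt

def pooled_ci(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Return (lower, upper) limits of the pooled-variance CI for mu1 - mu2."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    # Margin of error: critical value times the pooled standard error
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# NYSE vs. NASDAQ; t_crit = 2.0154 is the value quoted in the example for df = 44
lower, upper = pooled_ci(3.27, 1.30, 21, 2.53, 1.16, 25, t_crit=2.0154)
print(f"95% CI: ({lower:.3f}, {upper:.3f}); excludes 0: {lower > 0 or upper < 0}")
```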


Hypothesis tests for µ1 - µ2 with σ1
and σ2 unknown, not assumed equal

Population means, independent samples. Three cases:
• σ1 and σ2 known
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal (this case)

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are unknown and cannot be assumed to be equal
Hypothesis tests for µ1 - µ2 with σ1 and
σ2 unknown and not assumed equal
The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

tSTAT has d.f. ν:

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$

A quick (conservative) rule for the degrees of freedom is to use min(n1 – 1, n2 – 1).


Confidence interval for µ1 - µ2 with σ1
and σ2 unknown and not assumed equal

The confidence interval for μ1 - μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$$
Difference Between Two Means

The table below presents the summary statistics for the starting annual salaries (in thousands of dollars) for individuals entering the public accounting and financial planning professions.

Sample I (public accounting): X̄1 = 60.35, S1 = 3.25, n1 = 12
Sample II (financial planning): X̄2 = 58.20, S2 = 2.48, n2 = 14

Test whether the mean starting annual salary for individuals entering the public accounting profession is higher than that for individuals entering financial planning.
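
One possible way to set up this exercise is the separate-variance t test. The slide does not say whether equal variances should be assumed, nor does it give a significance level, so the sketch below is an assumption on both counts: it only computes the test statistic and the two degrees-of-freedom choices, and leaves the comparison with a critical value to the reader. The function name `separate_variance_t` is illustrative.

```python
from math import sqrt

def separate_variance_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """Return (t statistic, degrees of freedom nu) without assuming equal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # squared standard errors of each sample mean
    t_stat = ((xbar1 - xbar2) - hypothesized_diff) / sqrt(v1 + v2)
    # Degrees of freedom from the formula given for the separate-variance test
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, nu

# Sample I: public accounting; Sample II: financial planning (salaries in $ thousands)
t_stat, nu = separate_variance_t(60.35, 3.25, 12, 58.20, 2.48, 14)
quick_df = min(12 - 1, 14 - 1)  # the "quick rule" for degrees of freedom

print(f"t = {t_stat:.3f}, nu = {nu:.2f}, quick-rule df = {quick_df}")
```

Because the claim is that public accounting salaries are higher, this would be an upper-tail test (H1: μ1 – μ2 > 0), so the statistic is compared against a single upper critical value tα.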
Summary:
Two Independent Sample Tests
Related Populations
The Paired Difference Test
• Tests means of 2 related populations
  • Paired samples
  • Repeated measures (before/after)
• Use the difference between paired values:

Di = X1i - X2i

• Assumptions:
  • Both populations are normally distributed
  • Or, if not normal, use large samples
Related Populations
The Paired Difference Test

The ith paired difference is: Di = X1i - X2i

The point estimate for the paired difference population mean μD is:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

The sample standard deviation is:

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

n is the number of pairs in the paired sample


The Paired Difference Test:
Finding tSTAT

• The test statistic for μD is:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

• where tSTAT has n - 1 d.f.


The Paired Difference Test:
Possible Hypotheses

Lower-tail test:            Upper-tail test:           Two-tail test:
H0: μD ≥ 0                  H0: μD ≤ 0                 H0: μD = 0
H1: μD < 0                  H1: μD > 0                 H1: μD ≠ 0
α in the lower tail         α in the upper tail        α/2 in each tail
Reject H0 if tSTAT < -tα    Reject H0 if tSTAT > tα    Reject H0 if tSTAT < -tα/2 or tSTAT > tα/2

where tSTAT has n - 1 d.f.
The Paired Difference
Confidence Interval
The confidence interval for μD is:

$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$
Paired Difference Test:
Example
• Assume you send your salespeople to a "customer service" training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints
Salesperson   Before (1)   After (2)   Difference, Di = (2) - (1)
C.B.          6            4           -2
T.F.          20           6           -14
M.H.          3            2           -1
R.K.          0            0           0
M.O.          4            0           -4
Total                                  -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Paired Difference Test:
Solution
Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0
α = .01, D̄ = -4.2
d.f. = n - 1 = 4
Critical values: t0.005 = ± 4.604 (α/2 = 0.005 in each tail)

Test statistic:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

Decision: Do not reject H0, since -4.604 < -1.66 < 4.604
Conclusion: There is insufficient evidence of a significant change in the number of complaints.
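
The paired calculation can be reproduced from the raw complaint counts. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 4.604 is taken from the example rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Complaints per salesperson (C.B., T.F., M.H., R.K., M.O.) from the example
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Paired differences, After - Before, matching the table's (2) - (1) convention
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = mean(diffs)   # point estimate of mu_D
s_d = stdev(diffs)    # sample standard deviation (n - 1 in the denominator)
t_stat = (d_bar - 0) / (s_d / sqrt(n))

T_CRIT = 4.604  # t_{0.005} with 4 d.f., as quoted in the example
reject_h0 = abs(t_stat) > T_CRIT

print(f"D-bar = {d_bar:.2f}, S_D = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```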
Paired t Test In Minitab Yields
The Same Conclusions
Paired T-Test and CI: After, Before

Paired T for After - Before

             N    Mean    StDev   SE Mean
After        5    2.40    2.61    1.17
Before       5    6.60    7.80    3.49
Difference   5   -4.20    5.67    2.54

95% CI for mean difference: (-11.25, 2.85)
T-Test of mean difference = 0 (vs not = 0): T-Value = -1.66 P-Value = 0.173
Two Population Proportions

Goal: Test a hypothesis or form a confidence interval for the difference between two population proportions, π1 – π2

The point estimate for the difference is p1 – p2


Two Population Proportions

The pooled estimate for the overall proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

where X1 and X2 are the number of items of interest in samples 1 and 2
Two Population Proportions
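The test statistic for π1 – π2 is a Z statistic:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \quad p_1 = \frac{X_1}{n_1}, \quad p_2 = \frac{X_2}{n_2}$$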
Hypothesis Tests for
Two Population Proportions

Lower-tail test:        Upper-tail test:        Two-tail test:
H0: π1 ≥ π2             H0: π1 ≤ π2             H0: π1 = π2
H1: π1 < π2             H1: π1 > π2             H1: π1 ≠ π2
i.e.,                   i.e.,                   i.e.,
H0: π1 – π2 ≥ 0         H0: π1 – π2 ≤ 0         H0: π1 – π2 = 0
H1: π1 – π2 < 0         H1: π1 – π2 > 0         H1: π1 – π2 ≠ 0
Hypothesis Tests for
Two Population Proportions

Lower-tail test:            Upper-tail test:           Two-tail test:
H0: π1 – π2 ≥ 0             H0: π1 – π2 ≤ 0            H0: π1 – π2 = 0
H1: π1 – π2 < 0             H1: π1 – π2 > 0            H1: π1 – π2 ≠ 0
α in the lower tail         α in the upper tail        α/2 in each tail
Reject H0 if ZSTAT < -Zα    Reject H0 if ZSTAT > Zα    Reject H0 if ZSTAT < -Zα/2 or ZSTAT > Zα/2
Hypothesis Test Example:
Two population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

• In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote "Yes"
• Test at the .05 level of significance


Hypothesis Test Example:
Two population Proportions
• The hypothesis test is:
H0: π1 – π2 = 0
H1: π1 – π2 ≠ 0
• The sample proportions are:
  • Men: p1 = 36/72 = 0.50
  • Women: p2 = 35/50 = 0.70
• The pooled estimate for the overall proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 35}{72 + 50} = \frac{71}{122} = 0.582$$
Hypothesis Test Example:
Two population Proportions
The test statistic is:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(.50 - .70) - 0}{\sqrt{.582(1 - .582)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -2.20$$

Critical values = ±1.96 for α = .05 (.025 in each tail; reject H0 if ZSTAT < -1.96 or ZSTAT > 1.96)

Decision: Reject H0, since -2.20 < -1.96
Conclusion: There is evidence of a difference in proportions who will vote yes between men and women.
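
The pooled Z test above can be sketched as follows. The function name `two_proportion_z` is illustrative; the critical value and the two-tail p-value are computed from the standard normal distribution in Python's standard library, and the p-value is an addition that the hand calculation does not report.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, Z statistic, two-tail p-value) for H0: pi1 - pi2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate of the overall proportion
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z_stat = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return p_bar, z_stat, p_value

# Men: 36 of 72 voting "Yes"; women: 35 of 50 voting "Yes"
p_bar, z_stat, p_value = two_proportion_z(36, 72, 35, 50)

z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tail critical value for alpha = .05
reject_h0 = abs(z_stat) > z_crit

print(f"p-bar = {p_bar:.3f}, Z = {z_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```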
Two Proportion Test In Minitab
Shows The Same Conclusions

Test and CI for Two Proportions

Sample X N Sample p
1 36 72 0.500000
2 35 50 0.700000

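Difference = p (1) - p (2)
Estimate for difference: -0.2
95% CI for difference: (-0.371676, -0.0283244)
Test for difference = 0 (vs not = 0): Z = -2.28 P-Value = 0.022

Note: the Z = -2.28 in this output corresponds to the separate (unpooled) standard error, whereas the hand calculation with the pooled estimate gives Z = -2.20. Both values fall below -1.96, so the decision is the same.

Conclusion: There is evidence of a difference in proportions who will vote yes between men and women.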
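Confidence Interval for
Two Population Proportions

• If the hypothesized difference is nonzero (e.g., π1 – π2 = 0.02), use the following test statistic, which does not pool the two sample proportions:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$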
Confidence Interval for
Two Population Proportions

The confidence interval for π1 – π2 is:

$$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$$
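
Applied to the Proposition A data (36 of 72 men, 35 of 50 women), this interval can be computed with a short sketch. The function name `two_proportion_ci` is illustrative, and Zα/2 is obtained from the standard library's normal distribution.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) limits of the CI for pi1 - pi2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    # Each sample contributes its own variance term; no pooling is used here
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Men: 36 of 72; women: 35 of 50 (Proposition A example)
lower, upper = two_proportion_ci(36, 72, 35, 50)
print(f"95% CI for pi1 - pi2: ({lower:.4f}, {upper:.4f})")
```

The limits agree with the Minitab output for the same data, and since the whole interval lies below zero it points to the same conclusion as the two-tail test.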
Testing for the Ratio of Two
Population Variances
Hypotheses:
Two-tail test:     H0: σ1² = σ2²    H1: σ1² ≠ σ2²
Upper-tail test:   H0: σ1² ≤ σ2²    H1: σ1² > σ2²

Test statistic:

$$F_{STAT} = \frac{S_1^2}{S_2^2}$$

Where:
S1² = variance of sample 1 (the larger sample variance)
n1 = sample size of sample 1
S2² = variance of sample 2 (the smaller sample variance)
n2 = sample size of sample 2
The F Distribution
• The F critical value is found from the F table
• There are two degrees of freedom required: numerator and denominator
• The larger sample variance is always the numerator
• When FSTAT = S1²/S2²: df1 = n1 – 1; df2 = n2 – 1
• In the F table,
  • numerator degrees of freedom determine the column
  • denominator degrees of freedom determine the row
Finding the Rejection Region

Two-tail test:                           Upper-tail test:
H0: σ1² = σ2²                            H0: σ1² ≤ σ2²
H1: σ1² ≠ σ2²                            H1: σ1² > σ2²
α/2 in each tail                         α in the upper tail
Reject H0 if FSTAT > FR or FSTAT < FL    Reject H0 if FSTAT > Fα

For the two-tail test:
FR = Fα/2, df1, df2
FL = F1-α/2, df1, df2 = 1/Fα/2, df2, df1


F Test: An Example

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE and NASDAQ. You collect the following data:

           NYSE    NASDAQ
Number     21      25
Mean       3.27    2.53
Std dev    1.30    1.16

Is there a difference in the variances between the NYSE and NASDAQ at the α = 0.05 level?
F Test: Example Solution
• Form the hypothesis test:
H0: σ1² = σ2² (there is no difference between variances)
H1: σ1² ≠ σ2² (there is a difference between variances)
• Significance level α = 0.05
• Numerator d.f. = n1 – 1 = 21 – 1 = 20
• Denominator d.f. = n2 – 1 = 25 – 1 = 24
• FR = F.025, 20, 24 = 2.33 (FINV(0.025, 20, 24))
• FL = 1/F.025, 24, 20 = 0.41 (or FINV(0.975, 20, 24))
F Test: Example Solution
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
• The test statistic is:

$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• Rejection region (α/2 = .025 in each tail): reject H0 if FSTAT < FL = 0.41 or FSTAT > FR = 2.33
• FSTAT = 1.256 is not in the rejection region, so we do not reject H0
• Conclusion: There is insufficient evidence of a difference in variances at α = .05
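
A small sketch of the F test arithmetic follows. The function name `f_statistic` is illustrative; it places the larger sample variance in the numerator, as the slides require, and the critical values FL = 0.41 and FR = 2.33 are copied from the example because the Python standard library has no F distribution.

```python
def f_statistic(s_a, n_a, s_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var_a, var_b = s_a**2, s_b**2
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# NYSE: s = 1.30, n = 21; NASDAQ: s = 1.16, n = 25
f_stat, df_num, df_den = f_statistic(1.30, 21, 1.16, 25)

F_L, F_R = 0.41, 2.33  # two-tail critical values at alpha = 0.05, as given in the example
reject_h0 = f_stat > F_R or f_stat < F_L

print(f"F = {f_stat:.3f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```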
Two Variance F Test In Minitab
Yields The Same Conclusion
Test and CI for Two Variances

Null hypothesis          Sigma(1) / Sigma(2) = 1
Alternative hypothesis   Sigma(1) / Sigma(2) not = 1
Significance level       Alpha = 0.05

Statistics
Sample   N    StDev   Variance
1        21   1.300   1.690
2        25   1.160   1.346

Ratio of standard deviations = 1.121
Ratio of variances = 1.256

95% Confidence Intervals
Distribution of Data   CI for StDev Ratio   CI for Variance Ratio
Normal                 (0.735, 1.739)       (0.540, 3.024)

Tests
Method            DF1   DF2   Test Statistic   P-Value
F Test (normal)   20    24    1.26             0.589
Chapter Summary
• Compared two independent samples
  • Performed pooled-variance t test for the difference in two means
  • Performed separate-variance t test for the difference in two means
  • Formed confidence intervals for the difference between two means
• Compared two related samples (paired samples)
  • Performed paired t test for the mean difference
  • Formed confidence intervals for the mean difference
• Compared two population proportions
  • Formed confidence intervals for the difference between two population proportions
  • Performed Z test for two population proportions
• Performed F test for the ratio of two population variances
The Wall Street Journal recently published an article indicating differences in perception of sexual harassment on the job between men and women. The article claimed that women perceived the problem to be much more prevalent than did men. One question asked of both men and women was: "Do you think sexual harassment is a major problem in the American workplace?" 24% of the men compared to 62% of the women responded "Yes." Assume W designates women's responses and M designates men's.
1. What hypothesis should The Wall Street Journal test in order to show that its claim is true?
2. Suppose that 150 women and 200 men were interviewed. For a 0.01 level of significance, what is the critical value for the rejection region?
3. What is the value of the test statistic?
4. Construct a 99% confidence interval estimate of the difference between the proportion of women and men who think sexual harassment is a major problem in the American workplace.
5. Calculate the p-value for testing the above claim of The Wall Street Journal.
Homework
• Ebook: Chapter 10
  • 10.70
  • 10.76
  • 10.82
  • 10.86
  • 10.90
