
LECTURE 10

Two-sample Hypothesis Tests

Lecturer: Nguyen Thi Thu Van


Email: van.nguyen@isb.edu.vn

Content
- Comparing Two Means
- Confidence Interval for the Difference of Two Means
- Comparing Two Means: Independent Samples
- Comparing Two Means: Paired Samples
- Comparing Two Proportions
- Confidence Interval for the Difference of Two Proportions
- Comparing Two Variances


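Two-sample Tests
- A test performed on the data of two random samples, each independently obtained from a different population.
- Aims to determine whether the difference between these two populations is statistically significant, e.g. whether $\mu_1 \neq \mu_2$.

[Figure: two populations with means $\mu_1$ and $\mu_2$, and the sample means $\bar{X}_1$ and $\bar{X}_2$ drawn from them]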
Two-Sample Tests

Two-sample tests covered in this lecture:
- Population means, independent samples
- Population means, related samples
- Population proportions
- Population variances
Comparing Two Means
Difference Between Two Means & Independent Samples
Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2, where the point estimate for the difference is $\bar{X}_1 - \bar{X}_2$.

                 Lower-tail test     Upper-tail test     Two-tail test
H0:              μ1 ≥ μ2             μ1 ≤ μ2             μ1 = μ2
H1:              μ1 < μ2             μ1 > μ2             μ1 ≠ μ2
meaning
H0:              μ1 – μ2 ≥ 0         μ1 – μ2 ≤ 0         μ1 – μ2 = 0
H1:              μ1 – μ2 < 0         μ1 – μ2 > 0         μ1 – μ2 ≠ 0
Rejection area   α (lower tail)      α (upper tail)      α/2 in each tail
Hypothesis Tests for µ1 - µ2 with σ1 and σ2 known

Population means, independent samples: three cases
- σ1 and σ2 known (this case)
- σ1 and σ2 unknown, assumed equal
- σ1 and σ2 unknown, not assumed equal

Assumptions
- Samples are randomly and independently drawn.
- Populations are normally distributed or both sample sizes are at least 30.
Hypothesis tests for µ1 - µ2 with σ1 and σ2 known
- The test statistic:

$$Z_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

- The confidence interval for μ1 – μ2:

$$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
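
The Z test and confidence interval for known σ1 and σ2 can be sketched in a few lines of Python. This is only an illustration: the function name `two_sample_z` and the input numbers are hypothetical and do not come from the lecture. The critical value is taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05, delta0=0.0):
    """Two-tail Z test and confidence interval for mu1 - mu2 with known sigmas."""
    # Standard error of the difference in sample means
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Test statistic under H0: mu1 - mu2 = delta0
    z_stat = ((xbar1 - xbar2) - delta0) / se
    # Upper alpha/2 critical value of the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * se
    ci = ((xbar1 - xbar2) - margin, (xbar1 - xbar2) + margin)
    return z_stat, z_crit, ci


# Hypothetical inputs, chosen only to exercise the formulas
z_stat, z_crit, ci = two_sample_z(
    xbar1=52.0, xbar2=50.0, sigma1=4.0, sigma2=5.0, n1=40, n2=50
)
print(f"Z_STAT = {z_stat:.3f}, critical values = +/-{z_crit:.3f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.3f}, {ci[1]:.3f})")
print("Reject H0" if abs(z_stat) > z_crit else "Do not reject H0")
```

For a one-tail test, the same statistic would be compared with the α (not α/2) critical value on the relevant side.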
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown assumed equal

Population means, independent samples: three cases
- σ1 and σ2 known
- σ1 and σ2 unknown, assumed equal (this case)
- σ1 and σ2 unknown, not assumed equal

Assumptions
- Samples are randomly and independently drawn.
- Populations are normally distributed or both sample sizes are at least 30.
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and assumed equal
- The pooled variance:

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

- The test statistic, where $t_{STAT}$ has d.f. = $n_1 + n_2 - 2$:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

- The confidence interval, where $t_{\alpha/2}$ has d.f. = $n_1 + n_2 - 2$:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
Pooled-Variance t Test Example

                NYSE    NASDAQ
Number          21      25
Sample mean     3.27    2.53
Sample SD       1.30    1.16

- You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the above data.
- Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?
Pooled-Variance t Test Example: Calculating the Test Statistic
H0: μ1 - μ2 = 0
H1: μ1 - μ2 ≠ 0

The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)(1.30)^2 + (25 - 1)(1.16)^2}{(21 - 1) + (25 - 1)} = 1.5021$$

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$$
Pooled-Variance t Test Example: Hypothesis Test Solution

α = 0.05
df = 21 + 25 - 2 = 44
Critical values: t = ±2.0154 (area .025 in each tail; reject H0 if tSTAT < -2.0154 or tSTAT > 2.0154)

Test statistic:

$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$$

Since 2.040 > 2.0154, the test statistic falls in the upper rejection region.

Conclusion: Reject H0 at α = 0.05. There is evidence of a difference in means.
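
The pooled-variance calculation can be checked with a short Python sketch. The helper name `pooled_t` is hypothetical. The critical value 2.0154 is copied from the example and not computed, because Python's standard library has no t-distribution quantile function.

```python
from math import sqrt


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Test statistic under H0: mu1 - mu2 = delta0
    t_stat = ((xbar1 - xbar2) - delta0) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t_stat, df


# Summary statistics from the NYSE / NASDAQ example
sp2, t_stat, df = pooled_t(3.27, 1.30, 21, 2.53, 1.16, 25)

T_CRIT = 2.0154  # two-tail critical value for alpha = 0.05, df = 44 (from the example)
reject = abs(t_stat) > T_CRIT
print(f"Sp^2 = {sp2:.4f}, t_STAT = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```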
Pooled-Variance t Test Example: Confidence Interval for µ1 - µ2
Since we rejected H0, can we be 95% confident that µNYSE > µNASDAQ?

95% confidence interval for µNYSE - µNASDAQ:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009, 1.471)$$

Since 0 is less than the entire interval, we can be 95% confident that µNYSE > µNASDAQ.
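
The interval arithmetic can be reproduced as follows. This is a minimal sketch using the pooled variance and critical value quoted in the example; the variable names are illustrative only.

```python
from math import sqrt

n1, n2 = 21, 25
xbar1, xbar2 = 3.27, 2.53
sp2 = 1.5021     # pooled variance from the example
t_crit = 2.0154  # t_{0.025} with 44 d.f., as quoted in the example

diff = xbar1 - xbar2
# Margin of error: critical value times the standard error of the difference
margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
lower, upper = diff - margin, diff + margin

print(f"95% CI for mu_NYSE - mu_NASDAQ: ({lower:.3f}, {upper:.3f})")
print("Interval excludes 0:", lower > 0 or upper < 0)
```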
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown, not assumed equal

Population means, independent samples: three cases
- σ1 and σ2 known
- σ1 and σ2 unknown, assumed equal
- σ1 and σ2 unknown, not assumed equal (this case)

Assumptions
- Samples are randomly and independently drawn.
- Populations are normally distributed or both sample sizes are at least 30.
- Population variances are unknown and cannot be assumed to be equal.
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and not assumed equal
- The test statistic:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

- $t_{STAT}$ has d.f. ν:

$$\nu = \frac{\left( \dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2} \right)^2}{\dfrac{\left( S_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( S_2^2 / n_2 \right)^2}{n_2 - 1}}$$

- Welch's rule: df = min(n1 – 1, n2 – 1)

- The confidence interval:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
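
To see how the two degrees-of-freedom rules compare, the sketch below applies the unequal-variance formulas to the NYSE / NASDAQ summary statistics. Reusing that data here is an assumption made for illustration; the lecture itself analyses it only with the pooled test. The function name `welch_t` is hypothetical.

```python
from math import sqrt


def welch_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Return (t statistic, d.f. from the nu formula, d.f. from the min rule)."""
    # Per-sample variance of the mean: S^2 / n
    v1, v2 = s1**2 / n1, s2**2 / n2
    t_stat = ((xbar1 - xbar2) - delta0) / sqrt(v1 + v2)
    # Degrees of freedom nu from the formula above
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # Quick rule: the smaller of the two individual degrees of freedom
    df_quick = min(n1 - 1, n2 - 1)
    return t_stat, nu, df_quick


t_stat, nu, df_quick = welch_t(3.27, 1.30, 21, 2.53, 1.16, 25)
print(f"t_STAT = {t_stat:.3f}, nu = {nu:.2f}, quick-rule df = {df_quick}")
```

The quick rule never gives more degrees of freedom than the ν formula, so it leads to a wider interval and a more cautious test.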
Difference Between Two Means & Paired Samples
- The average score of subjects on the posttest is different from the average of those same subjects on the pretest.
- People will listen longer to a female telephone marketer than the very same people will listen to a male telephone marketer.
- On average, soldiers weighed less after they completed basic training than they weighed before they started.
- And so forth.
Difference Between Two Means & Paired Samples
- Tests means of 2 related populations
  - Paired samples
  - Repeated measures (before/after)
- Use the difference between paired values: $D_i = X_{1i} - X_{2i}$
- Assumptions:
  - Both populations are normally distributed
  - Or, if not normal, use large samples
Related Populations - Paired Difference Test
- The ith paired difference: $D_i = X_{1i} - X_{2i}$, where n is the number of pairs in the paired sample.
- The point estimate for the paired difference population mean μD:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

- The sample standard deviation:

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

- The test statistic for μD, where $t_{STAT}$ has n - 1 d.f.:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$
Paired Difference Test: Possible Hypotheses & CI

                 Lower-tail test     Upper-tail test     Two-tail test
H0:              μD ≥ 0              μD ≤ 0              μD = 0
H1:              μD < 0              μD > 0              μD ≠ 0
Rejection area   α (lower tail)      α (upper tail)      α/2 in each tail
Reject H0 if     tSTAT < -tα         tSTAT > tα          tSTAT < -tα/2 or tSTAT > tα/2

where tSTAT has n - 1 d.f.

The confidence interval for μD is:

$$\bar{D} \pm t_{\alpha/2} \frac{S_D}{\sqrt{n}}$$
Paired Difference Test: Example
- Assume you send your salespeople to a "customer service" training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Salesperson   Number of complaints   Difference
              Before     After       Di (After - Before)
A             6          4           -2
B             20         6           -14
C             3          2           -1
D             0          0           0
F             4          0           -4
Total                                -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$

Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0

α = .01, $\bar{D}$ = -4.2, d.f. = n - 1 = 4
Critical values: t0.005 = ±4.604 (area α/2 in each tail)

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

Since -4.604 < -1.66 < 4.604, the test statistic is not in the rejection region.

Conclusion: Do not reject H0. There is insufficient evidence of a significant change in the number of complaints.
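
The paired calculation can be reproduced from the raw before/after counts with Python's standard library. This is a sketch only: the critical value 4.604 is taken from the example and not computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Complaints per salesperson (A, B, C, D, F) from the example table
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Paired differences, After - Before, matching the Difference column
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = mean(diffs)   # point estimate of mu_D
s_d = stdev(diffs)    # sample standard deviation (n - 1 in the denominator)
t_stat = (d_bar - 0) / (s_d / sqrt(n))

T_CRIT = 4.604  # t_{0.005} with n - 1 = 4 d.f., as quoted in the example
reject = abs(t_stat) > T_CRIT
print(f"D_bar = {d_bar:.1f}, S_D = {s_d:.2f}, t_STAT = {t_stat:.2f}, reject H0: {reject}")
```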
Comparing Two Proportions
Two Population Proportions
- Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed to those surveyed was: "Should the federal tax on cigarettes be raised to pay for health care reform?" Is there sufficient evidence at, say, the α = 0.05 level to conclude that the two populations, smokers and non-smokers, differ significantly in their responses?
Assumptions about Normality
- We have assumed a normal distribution for the statistic $p_1 - p_2$.
- For a test of two proportions, the criterion for normality is np ≥ 10 and n(1 − p) ≥ 10 for each sample.
- If either sample proportion is not normal, their difference cannot safely be assumed normal.
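
The normality criterion is easy to check in code. In the sketch below, the helper `normal_approx_ok` is a hypothetical name, the first two samples use the counts from the Proposition A example in this lecture, and the third sample is made up to show a failing case. The check uses the sample proportion p = x/n, so np is the number of successes and n(1 − p) the number of failures.

```python
def normal_approx_ok(x, n, threshold=10):
    """Check n*p >= threshold and n*(1 - p) >= threshold with p = x / n."""
    # With the sample proportion, n*p is the count of successes x
    # and n*(1 - p) is the count of failures n - x.
    return x >= threshold and (n - x) >= threshold


men_ok = normal_approx_ok(36, 72)    # 36 successes, 36 failures
women_ok = normal_approx_ok(35, 50)  # 35 successes, 15 failures
small_ok = normal_approx_ok(3, 20)   # hypothetical sample: only 3 successes

print("Men:", men_ok, "Women:", women_ok, "Small sample:", small_ok)
```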
Two Population Proportions Test for Zero Difference

                 Lower-tail test     Upper-tail test     Two-tail test
H0:              π1 – π2 ≥ 0         π1 – π2 ≤ 0         π1 – π2 = 0
H1:              π1 – π2 < 0         π1 – π2 > 0         π1 – π2 ≠ 0
Rejection area   α (lower tail)      α (upper tail)      α/2 in each tail
Reject H0 if     ZSTAT < -Zα         ZSTAT > Zα          ZSTAT < -Zα/2 or ZSTAT > Zα/2

The point estimate for the difference is $p_1 - p_2$.
Two Population Proportions Test for Zero Difference
- The pooled estimate for the overall proportion, where X1 and X2 are the number of items of interest in samples 1 and 2:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \qquad p_1 = \frac{X_1}{n_1}, \qquad p_2 = \frac{X_2}{n_2}$$

- The test statistic:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$
Hypothesis Test Example: Two Population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
- In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote "Yes".
- Test at the .05 level of significance.
Hypothesis Test Example: Two Population Proportions
- Hypotheses:
  H0: π1 – π2 = 0
  H1: π1 – π2 ≠ 0
- The sample proportions:
  Men: p1 = 36/72 = 0.50
  Women: p2 = 35/50 = 0.70
- The pooled estimate for the overall proportion:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 35}{72 + 50} = \frac{71}{122} = 0.582$$

- The test statistic:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(0.50 - 0.70) - 0}{\sqrt{0.582 (1 - 0.582) \left( \dfrac{1}{72} + \dfrac{1}{50} \right)}} = -2.20$$

- Critical values for α = .05: ±1.96 (area .025 in each tail). Since -2.20 < -1.96, the test statistic falls in the lower rejection region.

Conclusion: Reject H0. There is evidence of a difference between men and women in the proportion who will vote Yes.
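
The Proposition A calculation can be verified with a short Python sketch. The function name `two_proportion_z` is hypothetical. The two-tail p-value is an addition not shown on the slide, computed from the standard normal distribution in `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z test of H0: pi1 - pi2 = 0. Returns (p_bar, z_stat, two-tail p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under H0
    p_bar = (x1 + x2) / (n1 + n2)
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z_stat = (p1 - p2) / se
    # Two-tail p-value: probability of a result at least this extreme under H0
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return p_bar, z_stat, p_value


# 36 of 72 men and 35 of 50 women would vote "Yes"
p_bar, z_stat, p_value = two_proportion_z(36, 72, 35, 50)
z_crit = NormalDist().inv_cdf(0.975)  # upper critical value for alpha = .05
reject = abs(z_stat) > z_crit
print(f"p_bar = {p_bar:.3f}, Z_STAT = {z_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```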
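Two Population Proportions Test for Difference

        Lower-tail test     Upper-tail test     Two-tail test
H0:     π1 – π2 ≥ D0        π1 – π2 ≤ D0        π1 – π2 = D0
H1:     π1 – π2 < D0        π1 – π2 > D0        π1 – π2 ≠ D0

The test statistic, where π1 – π2 takes the hypothesized value D0:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{p_1 (1 - p_1)}{n_1} + \dfrac{p_2 (1 - p_2)}{n_2}}}$$

The confidence interval for π1 – π2:

$$(p_1 - p_2) \pm Z_{\alpha/2} \sqrt{\frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2}}$$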
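
As an illustration of the unpooled standard error, the sketch below reuses the Proposition A counts. The hypothesized difference `d0 = -0.10` is hypothetical, chosen only to show a nonzero D0, and the function name `proportion_diff` is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff(x1, n1, x2, n2, d0=0.0, alpha=0.05):
    """Z statistic for H0: pi1 - pi2 = d0 and a (1 - alpha) CI for pi1 - pi2."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: each sample keeps its own proportion
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_stat = ((p1 - p2) - d0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = ((p1 - p2) - z_crit * se, (p1 - p2) + z_crit * se)
    return z_stat, ci


# Proposition A counts with a hypothetical D0 of -0.10
z_stat, ci = proportion_diff(36, 72, 35, 50, d0=-0.10)
print(f"Z_STAT = {z_stat:.3f}")
print(f"95% CI for pi1 - pi2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The pooled estimate is not used here because, when D0 ≠ 0, the null hypothesis no longer says the two proportions are equal.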
Comparing Two Variances
Comparing Two Variances

Hypothesis tests for variances:
- Test for a single population variance: chi-square test statistic
- Test for two population variances: F test statistic
F - Statistic
- For variance tests, we'll use the F-test to determine whether two variances are different by considering $F = \sigma_1^2 / \sigma_2^2$.
- The test statistic is the ratio between the two sample variances: $F_{STAT} = s_1^2 / s_2^2$.
- Degrees of freedom for the top: $df_1 = n_1 - 1$
- Degrees of freedom for the bottom: $df_2 = n_2 - 1$
Example of Comparing Two Variances

                NYSE    NASDAQ
Number          21      25
Sample mean     3.27    2.53
Sample SD       1.30    1.16

- You are a financial analyst for a brokerage firm. Is there a difference in the variances between the NYSE and NASDAQ at the α = 0.05 level?
Solution.
- Form the hypothesis test:
  H0: σ1² = σ2² (no difference between variances)
  H1: σ1² ≠ σ2² (a difference between variances)
- Significance level α = 0.05, so α/2 = .025 in each tail
- Numerator d.f. = n1 – 1 = 21 – 1 = 20
- Denominator d.f. = n2 – 1 = 25 – 1 = 24
- FR = F.025, 20, 24 = 2.33 (FINV(0.025, 20, 24))
- FL = 1 / F.025, 24, 20 = 0.41 (or FINV(0.975, 20, 24))
- The test statistic:

$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

- Reject H0 if FSTAT < FL = 0.41 or FSTAT > FR = 2.33. FSTAT = 1.256 is not in the rejection region, so we do not reject H0.

Conclusion: There is insufficient evidence of a difference in variances at α = .05.
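
The F test decision can be reproduced with the sketch below. The critical values FL and FR are copied from the solution and not computed, since Python's standard library has no F-distribution quantile function. The variable names are illustrative.

```python
# Summary statistics from the NYSE / NASDAQ example
n1, s1 = 21, 1.30  # NYSE: sample size and sample SD
n2, s2 = 25, 1.16  # NASDAQ: sample size and sample SD

# Test statistic: ratio of the two sample variances
f_stat = s1**2 / s2**2
df_num, df_den = n1 - 1, n2 - 1

# Two-tail critical values for alpha = 0.05 with (20, 24) d.f., as quoted in the solution
F_L, F_R = 0.41, 2.33

reject = f_stat < F_L or f_stat > F_R
print(f"F_STAT = {f_stat:.3f} with ({df_num}, {df_den}) d.f., reject H0: {reject}")
```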
-- The End of Topic --
Thank You!
