
MODULE 9: STATISTICAL INFERENCE OF TWO SAMPLES

Review question: What is the use of the level of significance in statistics? Write your
answer in the space below.

Read: Chapter 10: One- and Two-Sample Tests of Hypotheses, in
Probability and Statistics for Engineers and Scientists by Myers, W.

The previous module presented hypothesis tests for a single population
parameter (the mean µ, the variance σ², or a proportion p). This module extends
those results to the case of two independent populations.

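The z-test
1. Population Parameter
If the population variances are known, use:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$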

Property of and for the exclusive use of SLU. Reproduction, storing in a retrieval system, distributing, uploading or posting online, or transmitting in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise of any part of this document, without the prior written permission of SLU, is strictly prohibited.
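
| Null Hypothesis (H0) | Alternative Hypothesis (H1) | Critical Region (Reject H0 if) |
|---|---|---|
| $\mu_1 - \mu_2 = d_0$ | $\mu_1 - \mu_2 < d_0$ | $z < -z_\alpha$ |
| | $\mu_1 - \mu_2 > d_0$ | $z > z_\alpha$ |
| | $\mu_1 - \mu_2 \neq d_0$ | $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$ |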

Example 1: An examination was given to two classes consisting of 40 and 50
students, respectively. In the first class the mean grade was 74 with a
standard deviation of 8, while in the second class the mean grade was
78 with a standard deviation of 7. Is there a significant difference
between the performance of the two classes at the (a) 0.05 and (b) 0.01
levels?

Solution:
1. Hypotheses:
H0: $\mu_1 = \mu_2$ ($\mu_1 - \mu_2 = 0$), and the difference is due merely to chance.
H1: $\mu_1 \neq \mu_2$, and there is a significant difference between the classes.

2. Level of Significance:
a. $\alpha = 0.05$
b. $\alpha = 0.01$

3. Critical regions ($z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$):
a. $P(Z < -z_{\alpha/2}) = 0.05/2 = 0.025$
z < -1.96
or z > 1.96
b. $P(Z < -z_{\alpha/2}) = 0.01/2 = 0.005$
z < -2.575
or z > 2.575

4. Calculate the z-value and compare it with the critical value
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
$$z = \frac{(74 - 78) - 0}{\sqrt{\frac{8^2}{40} + \frac{7^2}{50}}}$$

$$z = -2.49$$
5. Decision
Since |z| = 2.49 exceeds 1.96, reject H0 at the 0.05 level: the mean grade of
the first class is significantly lower than that of the second class. Since
2.49 does not exceed 2.575, do not reject H0 at the 0.01 level.
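
The arithmetic in Example 1 can be checked with a short script. This is only a sketch using the Python standard library; the function names `two_sample_z` and `two_sided_critical_z` are illustrative and not part of the module. As in the worked solution, the reported standard deviations are used in place of the population values.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Return the z statistic for H0: mu1 - mu2 = d0 (variances treated as known)."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - d0) / standard_error


def two_sided_critical_z(alpha):
    """Return z_{alpha/2}, the upper critical value of a two-sided test."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Example 1: class 1 (n = 40, mean 74, sd 8) vs. class 2 (n = 50, mean 78, sd 7)
z = two_sample_z(74, 8, 40, 78, 7, 50)
print(f"z = {z:.2f}")

for alpha in (0.05, 0.01):
    z_crit = two_sided_critical_z(alpha)
    decision = "reject H0" if abs(z) > z_crit else "do not reject H0"
    print(f"alpha = {alpha}: critical value {z_crit:.3f}, {decision}")
```

The critical values are computed from the standard normal distribution instead of being read from a table, so they may differ from the tabulated 1.96 and 2.575 in the third decimal place.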

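2. Population proportions
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where:
$$\hat{p}_1 = \frac{x_1}{n_1}, \quad \hat{p}_2 = \frac{x_2}{n_2}$$
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}, \quad \hat{q} = 1 - \hat{p}$$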

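| Null Hypothesis (H0) | Alternative Hypothesis (H1) | Critical Region (Reject H0 if) |
|---|---|---|
| $p_1 = p_2$ | $p_1 < p_2$ | $z < -z_\alpha$ |
| | $p_1 > p_2$ | $z > z_\alpha$ |
| | $p_1 \neq p_2$ | $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$ |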

Example 2: Two groups, A and B, consist of 100 people each who have a
disease. A serum is given to group A but not to group B (which is
called the control); otherwise, the two groups are treated identically.
It is found that in groups A and B, 75 and 65 people, respectively,
recover from the disease. At significance levels of (a) 0.01, (b) 0.05,
and (c) 0.10, test the hypothesis that the serum helps cure the
disease.

Solution:
1. Hypotheses:
Let p1 and p2 denote the population proportions cured by (1) using
the serum and (2) not using the serum, respectively.
H0: $p_1 = p_2$, and the observed difference is due merely to chance
(i.e., the serum is ineffective).
H1: $p_1 > p_2$, and the serum is effective.

2. Level of Significance:
a. $\alpha = 0.01$

b. $\alpha = 0.05$
c. $\alpha = 0.10$

3. Critical region ($z > z_\alpha$):
a. $P(Z > z_\alpha) = 1 - P(Z < z_\alpha) = 0.01$
z > 2.33
b. $P(Z > z_\alpha) = 1 - P(Z < z_\alpha) = 0.05$
z > 1.645
c. $P(Z > z_\alpha) = 1 - P(Z < z_\alpha) = 0.10$
z > 1.28

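4. Calculate the z-value and compare it with the critical value
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where:
$$\hat{p}_1 = \frac{75}{100} = 0.75, \quad \hat{p}_2 = \frac{65}{100} = 0.65$$
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{75 + 65}{100 + 100} = 0.7, \quad \hat{q} = 1 - \hat{p} = 0.3$$
$$z = \frac{0.75 - 0.65}{\sqrt{(0.7)(0.3)\left(\frac{1}{100} + \frac{1}{100}\right)}}$$
$$z = 1.54$$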

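5. Decision
Since z = 1.54 exceeds 1.28 but not 1.645 or 2.33, we reject H0 at the
0.10 level but not at the 0.05 or 0.01 level. That is, we conclude that
the serum is effective at the 0.10 level only. Note that this
conclusion depends on how much we are willing to risk being wrong.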
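
As a check on Example 2, the pooled two-proportion statistic can be computed directly. The sketch below uses only the Python standard library; `two_proportion_z` is a hypothetical helper name, and the one-sided p-value is an addition for illustration that the module itself does not compute.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return the pooled z statistic for H0: p1 = p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    q_hat = 1 - p_hat
    return (p1_hat - p2_hat) / sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))


# Example 2: 75 of 100 recover with the serum, 65 of 100 without it
z_serum = two_proportion_z(75, 100, 65, 100)
p_value = 1 - NormalDist().cdf(z_serum)  # one-sided, for H1: p1 > p2
print(f"z = {z_serum:.2f}, one-sided p-value = {p_value:.4f}")

for alpha in (0.01, 0.05, 0.10):
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    decision = "reject H0" if z_serum > z_alpha else "do not reject H0"
    print(f"alpha = {alpha}: critical value {z_alpha:.3f}, {decision}")
```

Comparing the p-value with each significance level gives the same decisions as comparing z with the three critical values.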

The t-test
1. Population Parameter
a. For unknown but equal variances
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where the pooled sample variance is
$$s_p^2 = \frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}$$
Degrees of freedom, $v = n_1 + n_2 - 2$

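| Null Hypothesis (H0) | Alternative Hypothesis (H1) | Critical Region (Reject H0 if) |
|---|---|---|
| $\mu_1 - \mu_2 = d_0$ | $\mu_1 - \mu_2 < d_0$ | $t < -t_\alpha$ |
| | $\mu_1 - \mu_2 > d_0$ | $t > t_\alpha$ |
| | $\mu_1 - \mu_2 \neq d_0$ | $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ |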

Example 3: An experiment was performed to compare the abrasive wear
of two different laminated materials. Twelve pieces of material 1
were tested by exposing each piece to a machine measuring
wear. Ten pieces of material 2 were similarly tested. In each case,
the depth of wear was observed. The samples of material 1 gave
an average (coded) wear of 85 units with a sample standard
deviation of 4, while the samples of material 2 gave an average of
81 with a sample standard deviation of 5. Can we conclude at the
0.05 level of significance that the abrasive wear of material 1
exceeds that of material 2 by more than 2 units? Assume the
populations to be approximately normal with equal variances.

Solution:
1. Hypotheses:
H0: $\mu_1 - \mu_2 = 2$
H1: $\mu_1 - \mu_2 > 2$

2. Level of Significance:
$\alpha = 0.05$

3. Critical region ($t > t_\alpha$):
$P(t > t_\alpha) = 0.05$

From the t-distribution table at $\alpha = 0.05$ with degrees of
freedom v = 12 + 10 - 2 = 20, we have
t > 1.725

4. Calculate the t-value and compare it with the critical value
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where:
$$s_p^2 = \frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2} = \frac{4^2(12 - 1) + 5^2(10 - 1)}{12 + 10 - 2} = 20.05$$
$$s_p = 4.4777$$
$$t = \frac{(85 - 81) - 2}{4.4777\sqrt{\frac{1}{12} + \frac{1}{10}}}$$
$$t = 1.04$$
5. Decision
Since t = 1.04 is NOT greater than 1.725, DO NOT reject H0.
We are unable to conclude that the abrasive wear of material 1
exceeds that of material 2 by more than 2 units.
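
The pooled-variance calculation in Example 3 can be reproduced as follows. This is a sketch under the example's equal-variance assumption; `pooled_t_statistic` is an illustrative name, and the critical value 1.725 is simply the table value quoted in the solution, since the Python standard library has no t-distribution quantile function.

```python
from math import sqrt


def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Return (t, pooled variance, degrees of freedom) for H0: mu1 - mu2 = d0."""
    dof = n1 + n2 - 2
    pooled_var = (s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / dof
    sp = sqrt(pooled_var)
    t = ((mean1 - mean2) - d0) / (sp * sqrt(1 / n1 + 1 / n2))
    return t, pooled_var, dof


# Example 3: material 1 (n = 12, mean 85, s = 4) vs. material 2 (n = 10, mean 81, s = 5)
t_stat, sp2, dof = pooled_t_statistic(85, 4, 12, 81, 5, 10, d0=2)

T_CRITICAL = 1.725  # t_0.05 with 20 degrees of freedom, taken from the table in the text
decision = "reject H0" if t_stat > T_CRITICAL else "do not reject H0"
print(f"sp^2 = {sp2:.2f}, v = {dof}, t = {t_stat:.2f}: {decision}")
```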

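b. For unknown and unequal variances
$$t' = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$$\text{Degrees of freedom, } v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$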

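| Null Hypothesis (H0) | Alternative Hypothesis (H1) | Critical Region (Reject H0 if) |
|---|---|---|
| $\mu_1 - \mu_2 = d_0$ | $\mu_1 - \mu_2 < d_0$ | $t' < -t_\alpha$ |
| | $\mu_1 - \mu_2 > d_0$ | $t' > t_\alpha$ |
| | $\mu_1 - \mu_2 \neq d_0$ | $t' < -t_{\alpha/2}$ or $t' > t_{\alpha/2}$ |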

Example 4: The following data represent the running times of films produced
by two motion-picture companies:
| Company | Time (minutes) |
|---|---|
| 1 | 102, 86, 98, 109, 92 |
| 2 | 81, 165, 97, 134, 92, 87, 114 |
Test the hypothesis that the average running time of films produced
by company 2 exceeds the average running time of films produced
by company 1 by 10 minutes. Use a 0.1 level of significance and
assume the distributions of times to be approximately normal with
unequal variances.

Solution:
1. Hypotheses:
H0: $\mu_2 - \mu_1 = 10$
H1: $\mu_2 - \mu_1 > 10$

2. Level of Significance:
$\alpha = 0.1$

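3. Critical region ($t' > t_\alpha$):
$P(t' > t_\alpha) = 0.1$
The sample statistics are
$$n_1 = 5, \quad \bar{x}_1 = \frac{102 + 86 + 98 + 109 + 92}{5} = 97.4, \quad s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_i - \bar{x}_1)^2 = 78.8$$
Similarly, $n_2 = 7$, $\bar{x}_2 = 110$, and $s_2^2 = 913.3333$.
The degrees of freedom are
$$v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} = \frac{\left(\frac{78.8}{5} + \frac{913.3333}{7}\right)^2}{\frac{(78.8/5)^2}{5 - 1} + \frac{(913.3333/7)^2}{7 - 1}} = 7.3756 \approx 7$$
From the t-distribution table at $\alpha = 0.1$ with v = 7, we have
t > 1.415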

4. Calculate the t-value and compare it with the critical value
$$t' = \frac{(\bar{x}_2 - \bar{x}_1) - d_0}{\sqrt{\frac{s_2^2}{n_2} + \frac{s_1^2}{n_1}}}$$
$$t' = \frac{(110 - 97.4) - 10}{\sqrt{\frac{913.3333}{7} + \frac{78.8}{5}}}$$
$$t' = 0.215$$

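5. Decision
Since t' = 0.215 is not greater than 1.415, fail to reject H0. The data
do not show that the average running time of films from company 2
exceeds that of company 1 by more than 10 minutes.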
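
The sample statistics, degrees of freedom, and t' value of Example 4 can be recomputed from the raw running times. The sketch below uses only the Python standard library; `welch_t` is a hypothetical helper name, and the critical value 1.415 is the table value quoted in the solution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b, d0=0.0):
    """Return (t', approximate degrees of freedom) for H0: mu_a - mu_b = d0."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # s_a^2 / n_a, using the sample variance
    vb = variance(sample_b) / nb  # s_b^2 / n_b
    t_prime = ((mean(sample_a) - mean(sample_b)) - d0) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va**2 / (na - 1) + vb**2 / (nb - 1))
    return t_prime, dof


company_1 = [102, 86, 98, 109, 92]
company_2 = [81, 165, 97, 134, 92, 87, 114]

# H0: mu2 - mu1 = 10, so company 2 is passed as the first sample
t_prime, dof = welch_t(company_2, company_1, d0=10)

T_CRITICAL = 1.415  # t_0.1 with 7 degrees of freedom, taken from the table in the text
decision = "reject H0" if t_prime > T_CRITICAL else "fail to reject H0"
print(f"t' = {t_prime:.3f}, v = {dof:.4f}: {decision}")
```

The computed degrees of freedom are not an integer; the worked solution uses v = 7 when reading the table.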

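c. Paired Observations (e.g., a before-and-after study)
$$t = \frac{\bar{x}_d - \mu_d}{s_d/\sqrt{n}}$$
where d is the difference within each pair (e.g., after result − before result),
$\bar{x}_d$ and $s_d$ are the mean and standard deviation of the n differences,
and $\mu_d$ is the mean difference specified by H0 ($\mu_d = d_0$).
Degrees of freedom, $v = n - 1$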

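| Null Hypothesis (H0) | Alternative Hypothesis (H1) | Critical Region (Reject H0 if) |
|---|---|---|
| $\mu_D = d_0$ | $\mu_D < d_0$ | $t < -t_\alpha$ |
| | $\mu_D > d_0$ | $t > t_\alpha$ |
| | $\mu_D \neq d_0$ | $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ |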

Example 5: It is claimed that a new diet will reduce a person’s weight by 4.5
kilograms on average in a period of 2 weeks. The weights of 7
women who followed this diet were recorded before and after the
2-week period.
| Woman | Weight Before | Weight After |
|---|---|---|
| 1 | 58.5 | 60.0 |
| 2 | 60.3 | 54.9 |
| 3 | 61.7 | 58.1 |
| 4 | 69.0 | 62.1 |
| 5 | 64.0 | 58.5 |
| 6 | 62.6 | 59.9 |
| 7 | 56.7 | 54.4 |

Use the t-distribution to test the hypothesis that the diet reduces a
woman’s weight by 4.5 kilograms on average against the
alternative hypothesis that the mean difference in weight is less
than 4.5 kilograms. Use 0.05 level of significance.
Solution:
1. Hypotheses:
H0: $\mu_1 - \mu_2 = 4.5$ kg, or $\mu_D = 4.5$ kg
H1: $\mu_1 - \mu_2 < 4.5$ kg, or $\mu_D < 4.5$ kg

2. Level of Significance:
$\alpha = 0.05$

3. Critical region ($t < -t_\alpha$):
$P(t < -t_\alpha) = 0.05$
From the t-distribution table at $\alpha = 0.05$ with degrees of
freedom v = 7 - 1 = 6, we have
t < -1.943

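4. Calculate the t-value and compare it with the critical value
$$t = \frac{\bar{x}_d - \mu_d}{s_d/\sqrt{n}}$$

| Woman | Weight Before | Weight After | Difference, d (Before − After) |
|---|---|---|---|
| 1 | 58.5 | 60.0 | -1.5 |
| 2 | 60.3 | 54.9 | 5.4 |
| 3 | 61.7 | 58.1 | 3.6 |
| 4 | 69.0 | 62.1 | 6.9 |
| 5 | 64.0 | 58.5 | 5.5 |
| 6 | 62.6 | 59.9 | 2.7 |
| 7 | 56.7 | 54.4 | 2.3 |
| Mean, $\bar{x}_d$ | | | 3.5571 |
| s.d. of d ($s_d$) | | | 2.776 |

$$t = \frac{3.5571 - 4.5}{2.776/\sqrt{7}}$$
$$t = -0.899$$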

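5. Decision
Since t = -0.899 is not less than -1.943, do not reject H0. The data do
not contradict the claim that the diet reduces weight by 4.5 kilograms
on average.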
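
The paired calculation in Example 5 can be verified from the recorded weights. This sketch uses only the Python standard library and takes each difference as before − after, matching the difference column in the worked solution; the variable names are illustrative, and the critical value 1.943 is the table value quoted in the text.

```python
from math import sqrt
from statistics import mean, stdev

before = [58.5, 60.3, 61.7, 69.0, 64.0, 62.6, 56.7]
after = [60.0, 54.9, 58.1, 62.1, 58.5, 59.9, 54.4]

# Difference within each pair, taken as before - after (weight lost)
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
d_bar = mean(differences)  # mean of the differences
s_d = stdev(differences)   # sample standard deviation of the differences
mu_d = 4.5                 # mean difference claimed under H0

t_paired = (d_bar - mu_d) / (s_d / sqrt(n))

T_CRITICAL = 1.943  # t_0.05 with 6 degrees of freedom, taken from the table in the text
decision = "reject H0" if t_paired < -T_CRITICAL else "do not reject H0"
print(f"mean d = {d_bar:.4f}, s_d = {s_d:.3f}, t = {t_paired:.3f}: {decision}")
```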

1. The mean height of 50 male students who showed above-average
participation in college athletics was 68.2 inches (in) with a standard
deviation of 2.5 in, while 50 male students who showed no interest in such
participation had a mean height of 67.5 in with a standard deviation of
2.8 in. Test the hypothesis that male students who participate in college
athletics are taller than other male students at 0.1 level.
2. A study was made to determine if the subject matter in a physics course
is better understood when a lab constitutes part of the course. Students
were randomly selected to participate in either a 3-semester-hour course
without labs or a 4-semester-hour course with labs. In the section with labs,
11 students made an average grade of 85 with a standard deviation of
4.7, and in the section without labs, 17 students made an average grade
of 79 with a standard deviation of 6.1. Would you say that the laboratory
course increases the average grade by as much as 8 points? Use 0.05
level of significance.
3. In a study conducted by the Department of Human Nutrition and Foods
at Virginia Tech, the following data were recorded on sorbic acid
residuals, in parts per million, in ham immediately after dipping in a
sorbate solution and after 60 days of storage:

Assuming the populations to be normally distributed, is there sufficient
evidence, at the 0.05 level of significance, to say that the length of
storage influences sorbic acid residual concentrations?

Summative Graded Quiz
