
Chapter 9: Statistical Inference for Two Samples

Course Name: PROBABILITY & STATISTICS


Lecturer: Duong Thi Hong

Hanoi, 2022

1 / 38 Chapter 9: Statistical Inference for Two Samples


Content

1 Inference on the Difference in Means of Two Normal Distributions,
  Variance Known
2 Inference on the Difference in Means of Two Normal Distributions,
  Variance Unknown
3 Inference on the Two Proportions



Inference on the Difference in Means of Two Normal Distributions, Variance Known

We will assume that

$X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample from population 1
$X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample from population 2
The two populations represented by $X_1$ and $X_2$ are independent
Both populations are normal

Then, the quantity

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

has a $N(0, 1)$ distribution.


Confidence Interval on the Difference in Means, Variances Known

If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of
sizes $n_1$ and $n_2$ from two independent normal populations with known
variances $\sigma_1^2$ and $\sigma_2^2$, respectively, a $100(1 - \alpha)\%$
confidence interval for $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the standard
normal distribution.


Example 1

A product developer is interested in reducing the drying time of a
primer paint. Two formulations of the paint are tested; formulation
1 is the standard chemistry, and formulation 2 has a new drying
ingredient that should reduce the drying time. From experience, it
is known that the standard deviation of drying time is 8 minutes,
and this inherent variability should be unaffected by the addition
of the new ingredient. Ten specimens are painted with formulation 1,
and another 10 specimens are painted with formulation 2; the 20
specimens are painted in random order. The two sample average drying
times are $\bar{x}_1 = 121$ minutes and $\bar{x}_2 = 112$ minutes.
Construct a 95% confidence interval on the difference in means
$\mu_1 - \mu_2$.
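The interval asked for in Example 1 can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and the standard-library `statistics.NormalDist`; the function name `two_sample_z_ci` is illustrative and not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_ci(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-sided 100(1 - alpha)% CI for mu1 - mu2, standard deviations known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point of N(0, 1)
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


# Example 1: sigma1 = sigma2 = 8 minutes, n1 = n2 = 10
lower, upper = two_sample_z_ci(121, 112, 8, 8, 10, 10, alpha=0.05)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

With these inputs the interval is roughly 1.99 to 16.01 minutes. Because it does not contain zero, the data suggest at the 95% level that the mean drying times differ.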

Sample Size for a Confidence Interval on the Difference in Means, Variances Known

If the standard deviations $\sigma_1$ and $\sigma_2$ are known and the two
sample sizes $n_1$ and $n_2$ are equal ($n_1 = n_2 = n$), we can determine the
sample size required so that the error in estimating $\mu_1 - \mu_2$ by
$\bar{x}_1 - \bar{x}_2$ will be less than $E$ at $100(1 - \alpha)\%$
confidence. The required sample size from each population is

$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \left(\sigma_1^2 + \sigma_2^2\right)$$
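As a rough illustration of the sample-size formula, the sketch below evaluates it and rounds up, since $n$ must be an integer and rounding down would not guarantee the error bound. The function name `sample_size_equal_n` and the target error $E = 4$ minutes are assumptions made for this example; only $\sigma_1 = \sigma_2 = 8$ comes from Example 1.

```python
from math import ceil
from statistics import NormalDist


def sample_size_equal_n(sigma1, sigma2, error, alpha=0.05):
    """Per-population sample size so that the estimation error is below `error`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    n = (z / error) ** 2 * (sigma1**2 + sigma2**2)   # formula from the slide
    return ceil(n)                                   # round up to an integer


# Hypothetical target: estimate mu1 - mu2 to within E = 4 minutes at 95% confidence
n_required = sample_size_equal_n(8, 8, error=4, alpha=0.05)
print(n_required)
```

Under these assumed inputs the formula gives about 30.7, so 31 specimens per formulation would be needed.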


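One-Sided Confidence Bounds

One-Sided Upper Confidence Bound

$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

One-Sided Lower Confidence Bound

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2$$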
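The one-sided bounds differ from the two-sided interval only in using $z_{\alpha}$ instead of $z_{\alpha/2}$. A minimal sketch, reusing the Example 1 paint data purely for illustration (the function name `one_sided_z_bounds` is an assumption):

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_bounds(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Return (lower_bound, upper_bound); each is a separate one-sided bound."""
    z = NormalDist().inv_cdf(1 - alpha)   # z_alpha, not z_{alpha/2}
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


lower_bound, upper_bound = one_sided_z_bounds(121, 112, 8, 8, 10, 10, alpha=0.05)
print(f"lower bound: {lower_bound:.2f}, upper bound: {upper_bound:.2f}")
```

Each value is a 95% bound on its own. Reading the pair together as an interval would give only 90% confidence, because each tail carries probability $\alpha$.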



Hypothesis Tests on the Difference in Means, Variances Known

Formally, we summarize these results in the following display.


Tests on the Difference in Means, Variances Known
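A sketch of the usual z-test built on the statistic $Z$ defined earlier: under $H_0: \mu_1 - \mu_2 = \Delta_0$, compute $z_0$ and take the P-value from the standard normal distribution according to the alternative. The function name, the `alternative` labels, and the choice of $H_1: \mu_1 > \mu_2$ for the paint data are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2,
                      delta0=0.0, alternative="two-sided"):
    """Return (z0, p_value) for H0: mu1 - mu2 = delta0, variances known."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z0 = (xbar1 - xbar2 - delta0) / se
    phi = NormalDist().cdf                    # standard normal CDF
    if alternative == "two-sided":            # H1: mu1 - mu2 != delta0
        p_value = 2 * (1 - phi(abs(z0)))
    elif alternative == "greater":            # H1: mu1 - mu2 > delta0
        p_value = 1 - phi(z0)
    elif alternative == "less":               # H1: mu1 - mu2 < delta0
        p_value = phi(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, p_value


# Paint data from Example 1, testing whether formulation 2 dries faster
z0, p_value = two_sample_z_test(121, 112, 8, 8, 10, 10, alternative="greater")
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")
```

Here $z_0$ is about 2.52 and the P-value is about 0.006, so $H_0$ would be rejected at $\alpha = 0.05$.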


Exercise 1

Consider the hypothesis test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \ne \mu_2$
with known standard deviations $\sigma_1 = 10$ and $\sigma_2 = 5$. Suppose that
the sample sizes are $n_1 = 10$ and $n_2 = 15$ and that $\bar{x}_1 = 4.7$ and
$\bar{x}_2 = 7.8$. Use $\alpha = 0.05$.
a) Test the hypothesis and find the P-value.
b) Explain how the test could be conducted with a confidence interval.


Exercise 2

Consider the hypothesis test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 < \mu_2$
with known standard deviations $\sigma_1 = 10$ and $\sigma_2 = 5$. Suppose that
the sample sizes are $n_1 = 10$ and $n_2 = 15$ and that $\bar{x}_1 = 14.2$ and
$\bar{x}_2 = 19.7$. Use $\alpha = 0.05$.
a) Test the hypothesis and find the P-value.
b) Explain how the test could be conducted with a confidence interval.



Question 1



Question 2



Question 3



Question 4



Inference on the Difference in Means of Two Normal Distributions, Variance Unknown

Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

Let
$X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample from population 1,
$X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample from population 2, and
$\bar{X}_1$, $\bar{X}_2$, $S_1^2$ and $S_2^2$ be the sample means and sample
variances, respectively.



Inference on the Difference in Means of Two Normal Distributions, Variance Unknown and Equal

Pooled Estimator of Variance

The pooled estimator of $\sigma^2$, denoted by $S_p^2$, is defined by

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
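The pooled estimator is a weighted average of the two sample variances, with weights given by their degrees of freedom. A small sketch with made-up numbers (the function name `pooled_variance` and all input values are hypothetical):

```python
def pooled_variance(s1_sq, s2_sq, n1, n2):
    """Pooled estimator S_p^2 of the common variance sigma^2."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


# Hypothetical samples: s1^2 = 2.0 with n1 = 8, s2^2 = 3.0 with n2 = 12
sp_sq = pooled_variance(2.0, 3.0, 8, 12)
print(round(sp_sq, 4))
```

The result, about 2.611, lies between the two sample variances and closer to $s_2^2$, because the second sample contributes 11 of the 18 degrees of freedom.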


Given the assumptions of this section, the quantity

$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$$

has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.



Confidence Interval on the Difference in Means, Variances Unknown and Equal

If $\bar{x}_1$, $\bar{x}_2$, $s_1^2$ and $s_2^2$ are the sample means and
variances of two random samples of sizes $n_1$ and $n_2$, respectively, from
two independent normal populations with unknown but equal variances, then a
$100(1 - \alpha)\%$ confidence interval on the difference in means
$\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

where $s_p^2$ is the pooled estimate of the common variance and
$t_{\alpha/2,\,n_1+n_2-2}$ is the upper $\alpha/2$ percentage point of the t
distribution with $n_1 + n_2 - 2$ degrees of freedom.
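A minimal sketch of the pooled-variance interval. Python's standard library has no t quantile function, so the critical value is passed in as `t_crit`, read from a t table. The function name and all data values are hypothetical; $t_{0.025,18} = 2.101$ is the usual table value.

```python
from math import sqrt


def pooled_t_ci(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_crit):
    """CI for mu1 - mu2 assuming equal variances; t_crit = t_{alpha/2, n1+n2-2}."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df   # pooled variance
    margin = t_crit * sqrt(sp_sq / n1 + sp_sq / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin, df


# Hypothetical data; t_{0.025, 18} = 2.101 taken from a t table
lower, upper, df = pooled_t_ci(20.0, 18.0, 2.0, 3.0, 8, 12, t_crit=2.101)
print(f"df = {df}: {lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

If SciPy is installed, `scipy.stats.t.ppf(1 - alpha / 2, df)` can supply the critical value instead of a table lookup.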


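One-Sided Confidence Bounds on the Difference in Means

One-Sided Upper Confidence Bound

$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

One-Sided Lower Confidence Bound

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} \le \mu_1 - \mu_2$$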



Hypothesis Tests on the Difference in Means, Variances Unknown and Equal

Tests on the Difference in Means of Two Normal Distributions, Variances Unknown and Equal
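A sketch of the pooled t-test for $H_0: \mu_1 - \mu_2 = \Delta_0$, using the statistic $T$ with $\Delta_0$ in place of $\mu_1 - \mu_2$. For a two-sided alternative the standard rule is to reject $H_0$ when $|t_0| > t_{\alpha/2,\,n_1+n_2-2}$. The function name and the data are hypothetical, and the critical value is taken from a t table rather than computed.

```python
from math import sqrt


def pooled_t_statistic(xbar1, xbar2, s1_sq, s2_sq, n1, n2, delta0=0.0):
    """Return (t0, df) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df   # pooled variance
    t0 = (xbar1 - xbar2 - delta0) / sqrt(sp_sq / n1 + sp_sq / n2)
    return t0, df


# Hypothetical data, two-sided test at alpha = 0.05
t0, df = pooled_t_statistic(20.0, 18.0, 2.0, 3.0, 8, 12)
t_crit = 2.101                      # t_{0.025, 18} from a t table
reject_h0 = abs(t0) > t_crit
print(f"t0 = {t0:.3f}, df = {df}, reject H0: {reject_h0}")
```

For a one-sided alternative the comparison would use $t_{\alpha,\,n_1+n_2-2}$ and the sign of $t_0$ instead of its absolute value.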



Hypothesis Tests on the Difference in Means, Variances Unknown and Not Assumed Equal

Case 2: $\sigma_1^2 \ne \sigma_2^2$.

Test Statistic for the Difference in Means, Variances Unknown and Not Assumed Equal

If $H_0: \mu_1 - \mu_2 = \Delta_0$ is true, the statistic

$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

is distributed approximately as t with degrees of freedom given by

$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

If $v$ is not an integer, round down to the nearest integer.
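The degrees-of-freedom formula is easy to mistype by hand, so a short sketch may help. It computes $t_0^*$ and $v$ (rounded down, as stated above) for hypothetical data; the function name `welch_statistic_and_df` is an assumption.

```python
from math import floor, sqrt


def welch_statistic_and_df(xbar1, xbar2, s1_sq, s2_sq, n1, n2, delta0=0.0):
    """Return (t0_star, v) when the variances are not assumed equal."""
    a = s1_sq / n1                     # s1^2 / n1
    b = s2_sq / n2                     # s2^2 / n2
    t0_star = (xbar1 - xbar2 - delta0) / sqrt(a + b)
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t0_star, floor(v)           # round v down to an integer


# Hypothetical data with clearly different sample variances
t0_star, v = welch_statistic_and_df(20.0, 18.0, 2.0, 6.0, 8, 12)
print(f"t0* = {t0_star:.3f}, v = {v}")
```

With these inputs the unrounded value of $v$ is about 17.77, so the test would use 17 degrees of freedom rather than the $n_1 + n_2 - 2 = 18$ of the pooled case.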


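Confidence Interval on the Difference in Means, Variances Unknown and Not Assumed Equal

Case 2: $\sigma_1^2 \ne \sigma_2^2$.

Approximate Confidence Interval on the Difference in Means, Variances Unknown and Not Assumed Equal

If $\bar{x}_1$, $\bar{x}_2$, $s_1^2$ and $s_2^2$ are the means and variances of
two random samples of sizes $n_1$ and $n_2$, respectively, from two independent
normal populations with unknown and unequal variances, an approximate
$100(1 - \alpha)\%$ confidence interval on the difference in means
$\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t_{\alpha/2,\,v}$ is the upper $\alpha/2$ percentage point of the t
distribution with $v$ degrees of freedom.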


Exercise 1

Consider the hypothesis test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \ne \mu_2$.
Suppose that the sample sizes are $n_1 = 15$ and $n_2 = 15$ and that
$\bar{x}_1 = 4.7$, $\bar{x}_2 = 7.8$, $s_1^2 = 4$ and $s_2^2 = 6.25$. Assume
that $\sigma_1^2 = \sigma_2^2$ and that the data are drawn from normal
distributions. Use $\alpha = 0.05$.
a) Test the hypothesis and find the P-value.
b) Explain how the test could be conducted with a confidence interval.



Question 1



Question 2



Question 3



Question 4



Question 5



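Inference on the Two Proportions

Two independent random samples of sizes $n_1$ and $n_2$ (large enough), with
$x_1$ and $x_2$ observations, respectively, belonging to the class of interest.

Sample proportions: $\hat{p}_1 = \dfrac{x_1}{n_1}$, $\hat{p}_2 = \dfrac{x_2}{n_2}$

$\hat{p}_1 - \hat{p}_2$ is a point estimator of $p_1 - p_2$

If $n_1$, $n_2$ are large enough, we have approximately

$$\hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2,\ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right)$$

Pooled proportion

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$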



Confidence Interval on the Difference in Population Proportions

Approximate Confidence Interval on the Difference in Population Proportions

If $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions of observations in two
independent random samples of sizes $n_1$ and $n_2$ that belong to a class
of interest, an approximate two-sided $100(1 - \alpha)\%$ confidence interval
on the difference in the true proportions $p_1 - p_2$ is

$$\hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the standard
normal distribution.
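A minimal sketch of this large-sample interval using only the standard library. The function name `two_proportion_ci` and the counts (10 of 85 and 8 of 85) are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, alpha=0.05):
    """Approximate two-sided CI for p1 - p2 from counts x1, x2 and sizes n1, n2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Hypothetical counts: 10 of 85 items in sample 1, 8 of 85 in sample 2
lower, upper = two_proportion_ci(10, 85, 8, 85, alpha=0.05)
print(f"{lower:.4f} <= p1 - p2 <= {upper:.4f}")
```

For these counts the interval runs from about -0.069 to 0.116. It contains zero, so the samples give no clear evidence that the two proportions differ.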
Large-Sample Tests on the Difference in Population Proportions

We are interested in testing the hypotheses

$$H_0: p_1 = p_2$$
$$H_1: p_1 \ne p_2$$

Test Statistic:

$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$


Approximate Tests on the Difference of Two Population Proportions
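Under $H_0: p_1 = p_2$ the unknown common proportion in the statistic $Z$ is usually replaced by the pooled proportion $\hat{p}$ defined earlier, giving $z_0 = (\hat{p}_1 - \hat{p}_2) / \sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}$. The sketch below assumes that standard pooled form and a two-sided alternative. The function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z0, p_value) for H0: p1 = p2 against H1: p1 != p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                        # pooled proportion
    se0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # standard error under H0
    z0 = (p1_hat - p2_hat) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))         # two-sided P-value
    return z0, p_value


# Hypothetical counts: 10 of 85 items in sample 1, 8 of 85 in sample 2
z0, p_value = two_proportion_z_test(10, 85, 8, 85)
print(f"z0 = {z0:.3f}, P-value = {p_value:.3f}")
```

With these counts $z_0$ is close to 0.5 and the P-value is well above 0.05, so $H_0$ would not be rejected.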



Question 1



Question 2



Question 3



Question 4
