
6. Confidence Intervals

Dave Goldsman
H. Milton Stewart School of Industrial and Systems Engineering
Georgia Institute of Technology

3/2/20

ISYE 6739 — Goldsman 3/2/20 1 / 75


Outline

1 Introduction to Confidence Intervals


2 Normal Mean (variance known)
3 Difference of Two Normal Means (variances known)
4 Normal Mean (variance unknown)
5 Difference of Two Normal Means (unknown equal variances)
6 Difference of Two Normal Means (variances unknown)
7 Difference of Paired Normal Means (variances unknown)
8 Normal Variance
9 Ratio of Variances of Two Normals
10 Bernoulli Proportion




Lesson 6.1 — Introduction to Confidence Intervals

Next Few Lessons:


This Introduction to Confidence Intervals.
CI for the Mean of a Normal with Known Variance (easiest case).
CI for the Mean Difference of Two Normals with Known Variances.

Idea: Instead of estimating a parameter by a point estimator alone, give a (random) interval that contains the parameter with a certain probability.

Example: X̄ is a point estimator for the parameter µ.


A 95% confidence interval for µ might look like

µ ∈ [X̄ − z0.025 √(σ²/n), X̄ + z0.025 √(σ²/n)],

where z0.025 is the Nor(0, 1)’s 0.975 quantile.


This means that µ is in the interval with probability 0.95.

Definition: A 100(1 − α)% confidence interval for an unknown parameter θ is given by two random variables L and U satisfying

P(L ≤ θ ≤ U) = 1 − α.

L and U are the lower and upper confidence limits, and 1 − α is the confidence coefficient, specified in advance.

There is a 1 − α chance that θ actually lies between L and U .

Example: We’re 95% sure that President Smith’s popularity is 56% ± 3%.

Since L ≤ θ ≤ U , we call [L, U ] a two-sided CI for θ.

If L is such that P(L ≤ θ) = 1 − α, then [L, ∞) is a 100(1 − α)% one-sided lower CI for θ.

Similarly, if U is such that P(θ ≤ U) = 1 − α, then (−∞, U] is a 100(1 − α)% one-sided upper CI for θ.

Example: Here are some results from 10 independent samples, each consisting of 100 different observations. From each sample, we use the 100 observations to re-calculate L and U.

Is the unknown θ in [L, U ]?

Sample #     L      U     θ    CI covers θ?
    1      1.86   2.23    2    Yes
    2      1.90   2.31    2    Yes
    3      3.21   3.86    2    No
    4      1.75   2.10    2    Yes
    5      1.72   2.03    2    Yes
    ⋮        ⋮      ⋮     ⋮      ⋮
   10      1.62   1.98    2    No


As the number of samples gets large, the proportion of CIs that cover the
unknown θ approaches 1 − α.

Sometimes CIs miss θ too low, i.e., θ > U .

Sometimes CIs miss θ too high, i.e., θ < L.

But 1 − α of the time, they’re just right — Three Little Bears.

There are lots of different kinds of confidence intervals. We’ll look at CIs for
the mean and variance of normal distributions as well as CIs for the success
probability of Bernoulli distributions. We’ll also extend those results to
compare competing normal distributions as well as competing Bernoulli
distributions.
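This long-run coverage property is easy to see in a quick simulation — a sketch in Python (mine, not part of the original slides), using the known-variance z-interval discussed in the next lesson; the true µ, σ, sample size, and replication count are arbitrary choices:

```python
import random
from statistics import NormalDist, fmean

random.seed(6739)
mu, sigma, n, alpha = 2.0, 1.0, 100, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{0.025} ≈ 1.96
half = z * sigma / n ** 0.5                  # known-variance half-width

reps = 2000
covered = 0
for _ in range(reps):
    xbar = fmean(random.gauss(mu, sigma) for _ in range(n))
    if xbar - half <= mu <= xbar + half:     # did [L, U] capture the true mu?
        covered += 1

coverage = covered / reps
print(f"empirical coverage = {coverage:.3f}")   # should land near 0.95
```

With 2000 replications the empirical proportion of CIs covering µ settles close to 1 − α = 0.95, just as claimed above.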




Lesson 6.2 — Normal Mean (variance known)

Set-up: Sample from a normal distribution with unknown mean µ and known variance σ².

Goal: Obtain a confidence interval for µ.

Remark: This is an unrealistic case, since if we didn’t know µ in real life, then we wouldn’t know σ² either. But it’s a good place to start the discussion.

Details: Suppose X1, . . . , Xn ∼ iid Nor(µ, σ²), where σ² is known.

Use X̄ = ∑ᵢ₌₁ⁿ Xi/n as our point estimator. Recall that

X̄ ∼ Nor(µ, σ²/n) ⇒ Z ≡ (X̄ − µ)/√(σ²/n) ∼ Nor(0, 1).

The quantity Z is called a pivot. It’s a “starting point” for us.



The definition of Z implies that

1 − α = P(−zα/2 ≤ Z ≤ zα/2)
      = P(−zα/2 ≤ (X̄ − µ)/√(σ²/n) ≤ zα/2)
      = P(−zα/2 √(σ²/n) ≤ X̄ − µ ≤ zα/2 √(σ²/n))
      = P(X̄ − zα/2 √(σ²/n) ≤ µ ≤ X̄ + zα/2 √(σ²/n))
      = P(L ≤ µ ≤ U),

where L and U are the two endpoints in the previous line.

Thus, the 100(1 − α)% two-sided CI for µ is

X̄ − zα/2 √(σ²/n) ≤ µ ≤ X̄ + zα/2 √(σ²/n).


Remarks:

Notice how we used the pivot to “isolate” µ all by itself to the middle of the
inequalities.

After you observe X1, . . . , Xn, you calculate L and U. Nothing is unknown, since L and U don’t involve µ.

Sometimes we’ll write the CI as µ ∈ X̄ ± H, where the half-width is

H ≡ zα/2 √(σ²/n).


Example: Suppose we take n = 25 iid observations from a Nor(µ, σ²) distribution, where we somehow know that σ = 30. Further suppose that X̄ turns out to be 278. Let’s find a 100(1 − α)% = 95% CI for µ.

µ ∈ X̄ ± zα/2 √(σ²/n)
  = 278 ± z0.025 (30/5)
  = 278 ± 11.76   (z0.025 = 1.96).

So a 95% CI for µ is 266.24 ≤ µ ≤ 289.76. □
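As a quick sanity check (mine, not the slides’), the same computation in Python; `statistics.NormalDist().inv_cdf` supplies the z quantile:

```python
from statistics import NormalDist

xbar, sigma, n, alpha = 278.0, 30.0, 25, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{0.025} ≈ 1.96
half = z * sigma / n ** 0.5                  # 1.96 * (30/5) = 11.76
lo, hi = xbar - half, xbar + half
print(f"95% CI: [{lo:.2f}, {hi:.2f}]")       # [266.24, 289.76]
```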


Sample-Size Calculation

If we had taken more observations, then the CI would have gotten shorter, since H = zα/2 √(σ²/n).

In fact, how many observations should be taken to make the half-length (or “error”) ≤ ε?

zα/2 √(σ²/n) ≤ ε iff n ≥ (σ zα/2/ε)².

Thus, the sample size goes up as. . .


The variance σ 2 increases (more uncertainty).
The required half-width ε decreases (becomes more stringent).
The value of α decreases (more confidence).


Example: Suppose, in the previous example, that we want the half-length to be ≤ 10, i.e., µ ∈ X̄ ± 10. What should n be?

n ≥ (σ zα/2/ε)² = [(30)(1.96)/10]² = 34.57.

Just to make n an integer, round up to n = 35. □
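The same sample-size arithmetic, sketched in Python (not from the slides):

```python
import math
from statistics import NormalDist

sigma, eps, alpha = 30.0, 10.0, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{0.025} ≈ 1.96
n_min = (sigma * z / eps) ** 2            # smallest real-valued n that works
n = math.ceil(n_min)                      # round up to an integer sample size
print(n_min, n)                           # ≈ 34.57, 35
```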

Remark: We can similarly obtain one-sided CIs for µ (if we’re just interested in one bound). . .

100(1 − α)% upper CI: µ ≤ X̄ + zα √(σ²/n).

100(1 − α)% lower CI: µ ≥ X̄ − zα √(σ²/n).

Note that we use the 1 − α quantile (not 1 − α/2).


Lesson 6.3 — Difference of Two Normal Means (variances known)

Idea: Compare the means of two competing alternatives or processes by getting a confidence interval for their difference.

Example: Do Georgia Tech students have higher average IQ’s than Univ. of
Georgia students? Answer: Yes.

We’ll again assume that the variances of the two competitors are somehow
known. We’ll do the more-realistic unknown variance case in the next lesson.


Set-up: Suppose we have samples of sizes n and m from the two competing
populations.
X1, X2, . . . , Xn ∼ iid Nor(µx, σx²)  (population 1)
Y1, Y2, . . . , Ym ∼ iid Nor(µy, σy²)  (population 2),

where the means µx and µy are unknown, while σx² and σy² are somehow known.

Also assume that the Xi ’s are independent of the Yi ’s.

Let’s find a CI for the difference in means, µx − µy .


Define the sample means from populations 1 and 2,

X̄ ≡ (1/n) ∑ᵢ₌₁ⁿ Xi and Ȳ ≡ (1/m) ∑ᵢ₌₁ᵐ Yi.

Obviously,

X̄ ∼ Nor(µx, σx²/n) and Ȳ ∼ Nor(µy, σy²/m),

so that

X̄ − Ȳ ∼ Nor(µx − µy, σx²/n + σy²/m).


This implies that

Z ≡ (X̄ − Ȳ − (µx − µy)) / √(σx²/n + σy²/m) ∼ Nor(0, 1),

so that

P(−zα/2 ≤ Z ≤ zα/2) = 1 − α.

Using the same manipulations as in the single-population case, we get a two-sided 100(1 − α)% CI for µx − µy:

µx − µy ∈ X̄ − Ȳ ± zα/2 √(σx²/n + σy²/m).


Similarly,

One-sided upper CI:

µx − µy ≤ X̄ − Ȳ + zα √(σx²/n + σy²/m).

One-sided lower CI:

µx − µy ≥ X̄ − Ȳ − zα √(σx²/n + σy²/m).


Example: A traveling professor gives the same test to Georgia Tech (X) and University of Georgia (Y ) students. She assumes that the test scores are normally distributed with known standard deviations of σx = 20 points and σy = 12 points, respectively. She takes random samples of 40 GT scores and 24 UGA scores and observes sample means of X̄ = 95 points and Ȳ = 60 points, respectively. Let’s find the 90% two-sided CI for µx − µy.
µx − µy ∈ X̄ − Ȳ ± zα/2 √(σx²/n + σy²/m)
        = 35 ± 1.645 √(400/40 + 144/24)
        = 35 ± 6.58,

implying that 28.42 ≤ µx − µy ≤ 41.58. In other words, we’re 90% sure that µx − µy lies in this interval. □
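Here’s the two-sample computation in Python (a sketch, not part of the slides):

```python
from statistics import NormalDist

xbar, ybar = 95.0, 60.0
var_x, var_y = 20.0 ** 2, 12.0 ** 2          # known variances
n, m, alpha = 40, 24, 0.10
z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{0.05} ≈ 1.645
half = z * (var_x / n + var_y / m) ** 0.5    # 1.645 * 4 = 6.58
diff = xbar - ybar
print(f"{diff} ± {half:.2f}")                # 35.0 ± 6.58
```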


Remark: Let’s assume that the sample sizes are equal, i.e., n = m. To obtain a half-length ≤ ε, we require

n ≥ zα/2² (σx² + σy²)/ε².

In the next lesson we’ll explore the more-realistic case in which the variance
of the underlying observations is unknown. Now it’s time for t. . . .


Lesson 6.4 — Normal Mean (variance unknown)

Next Few Lessons:


Confidence Interval for the Mean of a Normal Distribution with
Unknown Variance.
CIs for the Mean Difference of Two Normals with Unknown Variances
when. . .
the variances are assumed to be equal.
the variances are unequal.
observations between the two normals are taken in special pairs.


At this point we’ll look at the more-realistic case in which the variance of the
underlying normal random variables is unknown.

This takes a little more work, but has many more applications.

Set-up: X1, X2, . . . , Xn ∼ iid Nor(µ, σ²), with σ² unknown.

Facts:

(a) (X̄ − µ)/√(σ²/n) ∼ Nor(0, 1).

(b) X̄ and S² are independent.

(c) S² = (1/(n − 1)) ∑ᵢ₌₁ⁿ (Xi − X̄)² ∼ σ²χ²(n − 1)/(n − 1).


Sketch of Proof: We proved Fact (a) previously. The proof of Fact (b) uses what is known as Cochran’s Theorem, which is a bit beyond our scope. In order to derive Fact (c), we first do some algebra to re-write ∑ᵢ₌₁ⁿ (Xi − µ)²:

∑ᵢ₌₁ⁿ (Xi − µ)² = ∑ᵢ₌₁ⁿ [(Xi − X̄) + (X̄ − µ)]²
               = ∑ᵢ₌₁ⁿ (Xi − X̄)² + 2(X̄ − µ) ∑ᵢ₌₁ⁿ (Xi − X̄) + n(X̄ − µ)²
               = ∑ᵢ₌₁ⁿ (Xi − X̄)² + n(X̄ − µ)²   (since ∑ᵢ (Xi − X̄) = 0),

which can be rewritten as

(1/(n − 1)) ∑ᵢ₌₁ⁿ (Xi − µ)² = S² + (n/(n − 1)) (X̄ − µ)².   (1)


Now note that Xi ∼ Nor(µ, σ²) and the fact that [Nor(0, 1)]² ∼ χ²(1) together imply (Xi − µ)² ∼ σ²χ²(1), for all i; so the additive property of independent χ²’s implies that

(1/(n − 1)) ∑ᵢ₌₁ⁿ (Xi − µ)² ∼ σ²χ²(n)/(n − 1).   (2)

Similarly, Fact (a) implies that

(n/(n − 1)) (X̄ − µ)² ∼ σ²χ²(1)/(n − 1).   (3)

Since X̄ and S² are independent (by Fact (b)), we can substitute the results from Equations (2) and (3) into (1) and then invoke χ² additivity to obtain

σ²χ²(n)/(n − 1) ∼ S² + σ²χ²(1)/(n − 1) ⇒ S² ∼ σ²χ²(n − 1)/(n − 1). □


Using Facts (a)–(c), we have, by the definition of the t distribution,

[(X̄ − µ)/√(σ²/n)] / √(S²/σ²) ∼ Nor(0, 1)/√(χ²(n − 1)/(n − 1)) ∼ t(n − 1).

In other words,

(X̄ − µ)/√(S²/n) ∼ t(n − 1).

Note that this expression doesn’t contain the unknown σ². It’s been replaced by S², which we can calculate.

This process is called “standardizing and Studentizing.”


Now, by the same manipulations as in the known-variance case, we can get a two-sided 100(1 − α)% CI for µ:

µ ∈ X̄ ± tα/2,n−1 √(S²/n).

100(1 − α)% lower CI: µ ≥ X̄ − tα,n−1 √(S²/n).

100(1 − α)% upper CI: µ ≤ X̄ + tα,n−1 √(S²/n).

Remark: Here we use t-distribution quantiles (instead of normal quantiles as in the known-variance case from the previous lesson). The t quantile tends to
be larger than the corresponding Nor(0,1) quantile, so these
unknown-variance CIs tend to be a bit longer than the known-variance CIs.
The longer CIs are the result of the fact that we lack precise info about the
variance.


Example: Here are 20 residual flame times (in seconds) of treated specimens
of children’s nightwear. (Don’t worry — children were not in the nightwear
when the clothing was set on fire.)

9.85  9.93  9.75  9.77  9.67
9.87  9.67  9.94  9.85  9.75
9.83  9.92  9.74  9.99  9.88
9.95  9.95  9.93  9.92  9.89

Let’s get a 95% CI for the mean residual flame time.


After a little algebra, we obtain

X̄ = 9.8525 and S = 0.0965.

Further, you can use the Excel function T.INV(0.975,19) to get tα/2,n−1 = t0.025,19 = 2.093.

Then the half-length of the CI is

H = tα/2,n−1 √(S²/n) = (2.093)(0.0965)/√20 = 0.0451.

Thus, the CI is µ ∈ X̄ ± H, or 9.8074 ≤ µ ≤ 9.8976. □
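The whole computation can be replicated in Python (a sketch of mine, not the slides’). The standard library has no t quantile function, so the value t0.025,19 = 2.093 quoted above is hard-coded from the slide:

```python
from statistics import fmean, stdev

times = [9.85, 9.93, 9.75, 9.77, 9.67, 9.87, 9.67, 9.94, 9.85, 9.75,
         9.83, 9.92, 9.74, 9.99, 9.88, 9.95, 9.95, 9.93, 9.92, 9.89]
n = len(times)
xbar = fmean(times)                  # 9.8525
s = stdev(times)                     # sample standard deviation ≈ 0.0965
t_quantile = 2.093                   # t_{0.025,19}, taken from the slide
half = t_quantile * s / n ** 0.5     # ≈ 0.0451
print(f"{xbar:.4f} ± {half:.4f}")    # 9.8525 ± 0.0451
```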


R Code:

x <- data.frame(x=c(
9.85, 9.93, 9.75, 9.77, 9.67,
9.87, 9.67, 9.94, 9.85, 9.75,
9.83, 9.92, 9.74, 9.99, 9.88,
9.95, 9.95, 9.93, 9.92, 9.89))

print(confint(lm(x~1,x)))
              2.5 %   97.5 %
(Intercept) 9.807357 9.897643


Lesson 6.5 — Difference of Two Normal Means (unknown equal variances)

Compare the means from two competing normal populations.

Suppose we have independent samples of sizes n and m from the two.

X1, X2, . . . , Xn ∼ iid Nor(µx, σx²)  (population 1).
Y1, Y2, . . . , Ym ∼ iid Nor(µy, σy²)  (population 2).

We assume that the means µx and µy are unknown, and the variances σx² and σy² are also unknown.

Big Assumption: Suppose for now that σx² = σy² = σ², i.e., that the two variances are unknown but equal.

Let’s find a CI for the difference in means, µx − µy .


To get things going, calculate sample means:

X̄ ≡ (1/n) ∑ᵢ₌₁ⁿ Xi and Ȳ ≡ (1/m) ∑ᵢ₌₁ᵐ Yi.

Now, sample variances:

Sx² ≡ ∑ᵢ₌₁ⁿ (Xi − X̄)²/(n − 1) and Sy² ≡ ∑ᵢ₌₁ᵐ (Yi − Ȳ)²/(m − 1).

Both Sx² and Sy² are estimators for σ² (the common variance).


A better estimator is the pooled estimator for σ², which uses the info from both Sx² and Sy²:

Sp² ≡ [(n − 1)Sx² + (m − 1)Sy²] / (n + m − 2).

Theorem:

Sp² ∼ σ²χ²(n + m − 2)/(n + m − 2).

Proof: Follows from the fact that independent χ² RVs add up. □

After some of the usual algebra, it turns out that

(X̄ − Ȳ − (µx − µy)) / (Sp √(1/n + 1/m)) ∼ t(n + m − 2).


So, after more of the usual algebra, we get a two-sided 100(1 − α)% CI for µx − µy:

µx − µy ∈ X̄ − Ȳ ± tα/2,n+m−2 Sp √(1/n + 1/m),

where tα/2,n+m−2 is the appropriate t distribution quantile.

One-Sided Lower CI:

µx − µy ≥ X̄ − Ȳ − tα,n+m−2 Sp √(1/n + 1/m).

One-Sided Upper CI:

µx − µy ≤ X̄ − Ȳ + tα,n+m−2 Sp √(1/n + 1/m).


Example: IQ’s of students at Georgia Tech and the “Univ.” of Georgia.

Georgia Tech students: X1, . . . , X25 ∼ iid Nor(µx, σ²).

Univ. of Georgia students: Y1, . . . , Y36 ∼ iid Nor(µy, σ²).

Note: We assume common σ².

Suppose it turns out that

X̄ = 120 and Ȳ = 80.

Sx² = ∑ᵢ₌₁ⁿ (Xi − X̄)²/(n − 1) = 100.

Sy² = ∑ᵢ₌₁ᵐ (Yi − Ȳ)²/(m − 1) = 95.


The two sample variances are pretty close, so we’ll go ahead and feel good about our common σ² assumption; and we’ll use the pooled estimator,

Sp² = [(n − 1)Sx² + (m − 1)Sy²]/(n + m − 2) = [(24)(100) + (35)(95)]/59 = 97.03.

Thus, a two-sided 95% CI for µx − µy is

µx − µy ∈ X̄ − Ȳ ± tα/2,n+m−2 Sp √(1/n + 1/m)
        = 120 − 80 ± 2.00 √97.03 √0.0678
        = 40 ± 5.13.

So the 95% CI is 34.87 ≤ µx − µy ≤ 45.13.

Note that the above CI doesn’t contain 0 (not even close). But it’s obvious anyway that GT students are smarter than UGA students! Duh. □
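A quick Python sketch of the pooled computation; t0.025,59 ≈ 2.00 is hard-coded from the slide, since the standard library has no t quantiles:

```python
n, m = 25, 36
xbar, ybar = 120.0, 80.0
sx2, sy2 = 100.0, 95.0
# pooled variance: weighted average of the two sample variances
sp2 = ((n - 1) * sx2 + (m - 1) * sy2) / (n + m - 2)   # ≈ 97.03
t_quantile = 2.00                                     # t_{0.025,59}, from the slide
half = t_quantile * sp2 ** 0.5 * (1 / n + 1 / m) ** 0.5
print(f"Sp^2 = {sp2:.2f}; CI = {xbar - ybar} ± {half:.2f}")   # 97.03; 40.0 ± 5.13
```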


Lesson 6.6 — Difference of Two Normal Means (variances unknown)

Again find a CI for the difference in means, µx − µy, from two normal populations. But now the normals have unknown and possibly unequal variances, σx² and σy².

Again suppose we have samples of sizes n and m,

X1, X2, . . . , Xn ∼ iid Nor(µx, σx²)  (population 1)
Y1, Y2, . . . , Ym ∼ iid Nor(µy, σy²)  (population 2),

where we assume that the Xi ’s are independent of the Yj ’s.

As before, start by calculating sample means and variances, X̄, Ȳ, Sx², Sy².


Since the variances are possibly unequal, we can’t use the pooled estimator Sp². Instead, use Welch’s approximation method:

t* ≡ (X̄ − Ȳ − (µx − µy)) / √(Sx²/n + Sy²/m) ≈ t(ν),

where the approximate degrees of freedom is given by

ν ≡ (Sx²/n + Sy²/m)² / [(Sx²/n)²/(n − 1) + (Sy²/m)²/(m − 1)].

After the usual algebra, we obtain an approximate two-sided 100(1 − α)% CI for µx − µy (which has very wide application in practice):

µx − µy ∈ X̄ − Ȳ ± tα/2,ν √(Sx²/n + Sy²/m).


Example: Two normal populations. Let’s get a 95% CI for µx − µy.

Suppose it turns out that

n = 25, X̄ = 100, Sx² = 400;
m = 16, Ȳ = 80, Sy² = 100.

You can tell from Sx² and Sy² that there’s no way that the two variances are equal. So we’ll have to use the approximation method.

The approximate degrees of freedom is

ν = (Sx²/n + Sy²/m)² / [(Sx²/n)²/(n − 1) + (Sy²/m)²/(m − 1)]
  = (400/25 + 100/16)² / [(400/25)²/24 + (100/16)²/15]
  = 37.30,

which we round down to ν = 37 to be conservative (i.e., slightly longer CIs).


Since the confidence coefficient is 0.95, we have tα/2,ν = t0.025,37 = 2.026.

This gives us the following CI for µx − µy:

µx − µy ∈ X̄ − Ȳ ± tα/2,ν √(Sx²/n + Sy²/m)
        = 20 ± 2.026 √22.25
        = 20 ± 9.56. □
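Welch’s ν and the resulting CI, checked in Python (again hard-coding the t quantile from the slide):

```python
n, m = 25, 16
xbar, ybar = 100.0, 80.0
sx2, sy2 = 400.0, 100.0
se2 = sx2 / n + sy2 / m                                   # 22.25
# Welch's approximate degrees of freedom
nu = se2 ** 2 / ((sx2 / n) ** 2 / (n - 1) + (sy2 / m) ** 2 / (m - 1))
df = int(nu)                                              # round down: 37
t_quantile = 2.026                                        # t_{0.025,37}, from the slide
half = t_quantile * se2 ** 0.5                            # ≈ 9.56
print(f"nu = {nu:.2f} -> df = {df}; CI = {xbar - ybar} ± {half:.2f}")
```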


Lesson 6.7 — Difference of Paired Normal Means (variances unknown)

Again consider two competing normal populations. Suppose we collect observations from the two populations in pairs.

The RVs between different pairs are independent, but the two observations within the same pair may not be independent.


Pair 1: (X1, Y1)
Pair 2: (X2, Y2)
   ⋮
Pair n: (Xn, Yn)

Pair i is indep of pair j, i.e., the pair (Xi, Yi) is indep of (Xj, Yj). But within pair i, it may be that Xi and Yi are not independent.

Example: One twin takes a new drug, the other takes a placebo.

Idea: By setting up such experiments, we hope to be able to capture the difference between the two normal populations more precisely, since we’re using the pairs to eliminate extraneous noise.

Here’s the set-up. Take n pairs of observations:

X1, X2, . . . , Xn ∼ iid Nor(µx, σx²).
Y1, Y2, . . . , Yn ∼ iid Nor(µy, σy²).

(Additional technical assumption: All Xi ’s and Yj ’s are jointly normal.)

We assume that the means µx and µy are unknown, and the variances σx² and σy² are also unknown and possibly unequal.

Further, pair i is independent of pair j (between pairs), but Xi may not be independent of Yi (within a pair).


Goal: Find a CI for the difference in means, µx − µy .

Define the pair-wise differences,

Di ≡ Xi − Yi , i = 1, 2, . . . , n.

Note that

D1, D2, . . . , Dn ∼ iid Nor(µd, σd²),
where µd ≡ µx − µy (which is what we want the CI for), and

σd² ≡ σx² + σy² − 2 Cov(Xi, Yi).

Now the problem reduces to the old Nor(µ, σ²) case with unknown µ and σ².


So let’s calculate the sample mean and variance as before:

D̄ ≡ (1/n) ∑ᵢ₌₁ⁿ Di ∼ Nor(µd, σd²/n).

Sd² ≡ (1/(n − 1)) ∑ᵢ₌₁ⁿ (Di − D̄)² ∼ σd²χ²(n − 1)/(n − 1).

Just like before, we get

(D̄ − µd)/√(Sd²/n) ∼ t(n − 1).


After the usual algebra on the pivot (with the t distribution instead of the normal), we obtain a two-sided 100(1 − α)% CI,

µd = µx − µy ∈ D̄ ± tα/2,n−1 √(Sd²/n).

One-sided lower: µd ≥ D̄ − tα,n−1 √(Sd²/n).

One-sided upper: µd ≤ D̄ + tα,n−1 √(Sd²/n).


Example: Times for people to parallel park two cars.

Person   Park Honda   Park Cadillac   Difference
  1          10             20           −10
  2          25             40           −15
  3           5              5             0
  4          20             35           −15
  5          15             20            −5

Clearly, the people are independent, but the times for the same individual to
park the two cars may not be independent.


Let’s assume that all times are normal. We want a 90% two-sided CI for µd = µh − µc.

We see that n = 5, D̄ = −9, and Sd² = 42.5.

Thus, the CI is

µd ∈ D̄ ± t0.05,4 √(Sd²/n) = −9 ± 2.13 √(42.5/5) = −9 ± 6.21. □
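The paired-t computation in Python (a sketch; t0.05,4 = 2.13 is hard-coded from the slide):

```python
from statistics import fmean, variance

honda = [10, 25, 5, 20, 15]
caddy = [20, 40, 5, 35, 20]
d = [h - c for h, c in zip(honda, caddy)]    # pairwise differences
n = len(d)
dbar = fmean(d)                              # -9.0
sd2 = variance(d)                            # sample variance of differences = 42.5
t_quantile = 2.13                            # t_{0.05,4}, from the slide
half = t_quantile * (sd2 / n) ** 0.5         # ≈ 6.21
print(f"{dbar} ± {half:.2f}")                # -9.0 ± 6.21
```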


So why didn’t we just use the “usual” CI for the difference of two means? Namely,

µx − µy ∈ X̄ − Ȳ ± tα/2,ν √(Sx²/n + Sy²/m).

Main reason: This CI requires the X’s to be independent of the Y ’s. (Recall
that the paired-t method allows Xi and Yi to be dependent.)

Good thing about using the “usual” method: The approximate df ν from the
“usual” method would probably be larger than the df n − 1 from the paired-t
method. This would make the CI smaller.

Bad thing: You’d introduce much more noise into the system by using the
“usual” method. This could make the CI much larger.


Example: Back to the car example (now use “usual” approximate method).

A guy parks   Same guy      Different guy
Honda Xi      parks Caddy   parks Caddy Yi
   10             20              30
   25             40              15
    5              5              40
   20             35              10
   15             20              25

Just concentrate on the Xi and Yi columns. (The middle column is from the
last example for comparison purposes.)

The Xi ’s and Yi ’s have the same sample averages as before, but now there’s
more natural variation, since we’re using 10 different people.


Now all of the Xi ’s are independent of all of the Yi ’s.

X̄ = 15, Ȳ = 24, Sx² = 62.5, Sy² = 142.5.

Then we have

ν = (Sx²/n + Sy²/n)² / [(Sx²/n)²/(n − 1) + (Sy²/n)²/(n − 1)]
  = 4(62.5 + 142.5)² / [(62.5)² + (142.5)²]
  = 6.94 ≈ 7.

This gives us the following 90% CI for µx − µy:

µx − µy ∈ X̄ − Ȳ ± t0.05,7 √(Sx²/n + Sy²/n) = −9 ± 1.895 √41 = −9 ± 12.13.

This CI is wider than the paired-t version, even though we have more df here.

Moral: Use paired-t when you can. □


Lesson 6.8 — Normal Variance

Next Few Lessons: We’ll now look at a potpourri of different types of CIs.
Normal variance.
Ratio of normal variances from two populations.
Bernoulli proportion.
Difference of Bernoulli proportions from two populations.

First, CIs for normal variance. . . .


To address how much variability we can expect from some system, we’ll now get a CI for the variance σ² of a normal distribution (instead of the mean).

Usual set-up:

X1, X2, . . . , Xn ∼ iid Nor(µ, σ²).

Recall that the distribution of the sample variance is

S² = (1/(n − 1)) ∑ᵢ₌₁ⁿ (Xi − X̄)² ∼ σ²χ²(n − 1)/(n − 1).


Since

(n − 1)S²/σ² ∼ χ²(n − 1),

we use χ² quantiles and some algebra to get

1 − α = P(χ²1−α/2,n−1 ≤ (n − 1)S²/σ² ≤ χ²α/2,n−1)
      = P(1/χ²1−α/2,n−1 ≥ σ²/[(n − 1)S²] ≥ 1/χ²α/2,n−1)
      = P((n − 1)S²/χ²α/2,n−1 ≤ σ² ≤ (n − 1)S²/χ²1−α/2,n−1).


So a 100(1 − α)% CI for σ² is

σ² ∈ [(n − 1)S²/χ²α/2,n−1, (n − 1)S²/χ²1−α/2,n−1].

Remark: The CI for σ² is directly proportional to the sample variance S².

Remark: This CI contains no reference to the unknown µ!

Meanwhile, a 100(1 − α)% lower CI for σ² is:

(n − 1)S²/χ²α,n−1 ≤ σ².

100(1 − α)% upper CI for σ²:

σ² ≤ (n − 1)S²/χ²1−α,n−1.


Example: Suppose 25 people take an IQ test and that their scores are
normally distributed.

If S² = 100, find a 95% upper CI for the variance σ².

Looking up the χ² quantile, we get

σ² ≤ (n − 1)S²/χ²1−α,n−1 = (24)(100)/χ²0.95,24 = 2400/13.85 = 173.3. □


Lesson 6.9 — Ratio of Variances of Two Normals

Which of two normal distributions is more variable?

Set-up:

X1, X2, . . . , Xn ∼ iid Nor(µx, σx²).
Y1, Y2, . . . , Ym ∼ iid Nor(µy, σy²).

All X’s and Y ’s are independent with unknown means and variances.

We’ll get a confidence interval for the ratio σx²/σy².


Recall the distributions of the two sample variances:

Sx² = (1/(n − 1)) ∑ᵢ₌₁ⁿ (Xi − X̄)² ∼ σx²χ²(n − 1)/(n − 1), and

Sy² = (1/(m − 1)) ∑ᵢ₌₁ᵐ (Yi − Ȳ)² ∼ σy²χ²(m − 1)/(m − 1).

Thus,

(Sy²/σy²)/(Sx²/σx²) ∼ [χ²(m − 1)/(m − 1)]/[χ²(n − 1)/(n − 1)] ∼ F(m − 1, n − 1).


Using F quantiles and some algebra, we get

1 − α = P(F1−α/2,m−1,n−1 ≤ (Sy²/σy²)/(Sx²/σx²) ≤ Fα/2,m−1,n−1)
      = P((Sx²/Sy²)(1/Fα/2,n−1,m−1) ≤ σx²/σy² ≤ (Sx²/Sy²) Fα/2,m−1,n−1).

So a 100(1 − α)% CI for σx²/σy² is

σx²/σy² ∈ [(Sx²/Sy²)(1/Fα/2,n−1,m−1), (Sx²/Sy²) Fα/2,m−1,n−1].

Remark: The CI for σx²/σy² is proportional to the ratio of the sample variances, Sx²/Sy², and contains no reference to µx or µy.


Meanwhile, a 100(1 − α)% lower CI for σx²/σy² is:

(Sx²/Sy²)(1/Fα,n−1,m−1) ≤ σx²/σy².

100(1 − α)% upper CI for σx²/σy²:

σx²/σy² ≤ (Sx²/Sy²) Fα,m−1,n−1.

Remark: If you want CIs for σy²/σx², just flip all of the X’s and Y ’s in the various CIs discussed above.


Example: Suppose 25 people take IQ test A, and 16 people take IQ test B. Assume all scores are normal and independent.

If SA² = 100 and SB² = 70, find a 95% upper CI for the ratio σA²/σB².

Looking up the Fα,nB−1,nA−1 = F0.05,15,24 quantile, we get

σA²/σB² ≤ (SA²/SB²) Fα,nB−1,nA−1 = (100/70)(2.11) = 3.01. □
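In Python (F0.05,15,24 = 2.11 hard-coded from the slide, since the standard library has no F quantiles):

```python
sa2, sb2 = 100.0, 70.0
f_quantile = 2.11                   # F_{0.05,15,24}, from the slide
upper = (sa2 / sb2) * f_quantile    # one-sided upper limit for the variance ratio
print(f"ratio <= {upper:.2f}")      # 3.01
```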


Lesson 6.10 — Bernoulli Proportion

Suppose that X1, X2, . . . , Xn ∼ iid Bern(p).

What probability of “success” can we expect from this distribution?

We’ll get a CI for the proportion p of successes.

Since ∑ᵢ₌₁ⁿ Xi ∼ Bin(n, p), we know that

X̄ ∼ Bin(n, p)/n.
Let’s assume that n is “large,” so that we’ll be able to use the Central Limit
Theorem (and don’t worry about the continuity correction).
(If n isn’t large, then we’ll have to use nasty Binomial tables, which I don’t
want to deal with here!)


Note that

E[X̄] = E[Xi] = p, and Var(X̄) = Var(Xi)/n = pq/n.

Then for large n, the Central Limit Theorem implies

(X̄ − E[X̄])/√Var(X̄) = (X̄ − p)/√(pq/n) ≈ Nor(0, 1).

Now let’s do something crazy and estimate pq by its maximum likelihood estimator, X̄(1 − X̄). This gives

(X̄ − p)/√(X̄(1 − X̄)/n) ≈ Nor(0, 1).


Then the “usual” algebra implies

1 − α ≈ P(−zα/2 ≤ (X̄ − p)/√(X̄(1 − X̄)/n) ≤ zα/2)
      = P(X̄ − zα/2 √(X̄(1 − X̄)/n) ≤ p ≤ X̄ + zα/2 √(X̄(1 − X̄)/n)).

So an approximate two-sided CI for p is

p ∈ X̄ ± zα/2 √(X̄(1 − X̄)/n).

Similarly, an approximate lower CI is

X̄ − zα √(X̄(1 − X̄)/n) ≤ p,

and an approximate upper CI is

p ≤ X̄ + zα √(X̄(1 − X̄)/n).

Example: The probability that a student correctly answers a certain test question is p.

Suppose that a random sample of 200 students yields 160 correct answers to
the question.

A 95% two-sided CI for p is given by

p ∈ X̄ ± zα/2 √(X̄(1 − X̄)/n)
  = 0.8 ± 1.96 √((0.8)(0.2)/200)
  = 0.8 ± 0.055. □
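Checked in Python (a sketch of mine, not the slides’):

```python
from statistics import NormalDist

n, successes, alpha = 200, 160, 0.05
phat = successes / n                          # X-bar = 0.8
z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{0.025} ≈ 1.96
half = z * (phat * (1 - phat) / n) ** 0.5     # ≈ 0.055
print(f"{phat} ± {half:.3f}")                 # 0.8 ± 0.055
```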


The half-width of the two-sided CI is

zα/2 √(X̄(1 − X̄)/n).

How many observations should we take so that the half-length is ≤ ε?

zα/2 √(X̄(1 − X̄)/n) ≤ ε ⟺ n ≥ (zα/2/ε)² X̄(1 − X̄).

Of course, X̄ is unknown before taking observations. A conservative choice for n arises from the worst case X̄(1 − X̄) ≤ 1/4 (attained at X̄ = 1/2). Then we have

n ≥ zα/2²/(4ε²).

On the other hand, if we can somehow make a preliminary estimate p̂ of p (based on a preliminary sample mean), we could use

n ≥ zα/2² p̂(1 − p̂)/ε².


Example: Suppose we want to take enough observations so that a 95% two-sided CI for p will be no bigger than ±3%. How many are required?

n ≥ zα/2²/(4ε²) = (1.96)²/[4(0.03)²] = 1067.1.

So take n = 1068 samples (though this will likely be conservative).

Instead suppose that a pilot study gives an estimate of p̂ = 0.7. How should n be revised?

n ≥ zα/2² p̂(1 − p̂)/ε² = [(1.96)²/(0.03)²](0.7)(0.3) = 896.4.

So now we only have to take n = 897. □
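Both sample-size formulas in Python (a sketch, not from the slides):

```python
import math
from statistics import NormalDist

alpha, eps = 0.05, 0.03
z = NormalDist().inv_cdf(1 - alpha / 2)           # z_{0.025} ≈ 1.96
n_conservative = z ** 2 / (4 * eps ** 2)          # worst case, p(1-p) = 1/4
phat = 0.7
n_pilot = z ** 2 * phat * (1 - phat) / eps ** 2   # with the pilot estimate
print(math.ceil(n_conservative), math.ceil(n_pilot))   # 1068 897
```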


Remark: We can also derive approximate CIs for the difference between
two competing Bernoulli parameters.

Suppose that X1, X2, . . . , Xn ∼ iid Bern(px) and Y1, Y2, . . . , Ym ∼ iid Bern(py), where the two samples are independent of each other.

Using the techniques already developed in this lesson, we can easily obtain the following approximate two-sided 100(1 − α)% CI for the difference of the success proportions:

px − py ∈ X̄ − Ȳ ± zα/2 √(X̄(1 − X̄)/n + Ȳ(1 − Ȳ)/m).


Example: The probabilities that Georgia Tech and University of Georgia students earn at least a 700 on the Math portion of the SAT test are px and py, respectively.

A random sample of 200 GT students yielded 160 students who scored ≥ 700 on the test. But, sadly, a sample of 400 UGA students revealed only 50 who succeeded.

Then a 95% CI for the difference in success probabilities is

px − py ∈ X̄ − Ȳ ± zα/2 √(X̄(1 − X̄)/n + Ȳ(1 − Ȳ)/m)
        = 0.8 − 0.125 ± 1.96 √((0.8)(0.2)/200 + (0.125)(0.875)/400)
        = 0.675 ± 0.064.

Not looking real good for UGA. □
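And the two-proportion computation in Python (again a sketch of mine):

```python
from statistics import NormalDist

n, m = 200, 400
xbar, ybar = 160 / 200, 50 / 400              # 0.8 and 0.125
z = NormalDist().inv_cdf(1 - 0.05 / 2)        # z_{0.025} ≈ 1.96
half = z * (xbar * (1 - xbar) / n + ybar * (1 - ybar) / m) ** 0.5
diff = xbar - ybar
print(f"{diff} ± {half:.3f}")                 # 0.675 ± 0.064
```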

Coming Up: Hypothesis Testing!

