
Lecture 4

April 20, 2005

Normal Approximation to the Binomial Distribution

• Binomial calculations for compound events
• The normal approximation to the binomial
• Parameters of the approximating distribution
• Behavior of the approximation as a function of p
• Calculations with the normal approximation
• The continuity correction*
• Sampling distributions
• The mean and standard deviation of x̄
• Normal approximation to the binomial revisited
Binomial Calculations for Compound Events

Recall the equation for Binomial probabilities:
If X ∼ B(n, p), then

P(X = k) = C(n, k) p^k (1 − p)^(n−k)

where C(n, k) = n!/(k!(n − k)!) is the binomial coefficient.

For a compound event such as P(X ≤ k), the probability is given by

P(X ≤ k) = Σ_{j=0}^{k} P(X = j)
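These formulas translate directly into code; a minimal sketch using only the Python standard library (the function names are my own):

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) = C(n, k) p^k (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    # P(X <= k) = sum of P(X = j) for j = 0, ..., k
    return sum(binom_pmf(j, n, p) for j in range(k + 1))
```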

The Normal Approximation to the
Binomial
Consider the following problem:
Suppose we draw an SRS of 1,500 Americans and
want to assess whether the representation of
blacks in the sample is accurate. We know that
about 12% of Americans are black, so we expect
X, the number of blacks in the sample, to be
around 180. Allowing a little leeway, what is the
probability that the sample contains 170 or fewer
blacks?
P(X ≤ 170) = Σ_{j=0}^{170} P(X = j)

           = Σ_{j=0}^{170} C(1500, j) (0.12)^j (0.88)^(1500−j)

That’s pretty ugly. Is there an easier way?
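Ugly by hand, but easy for a computer; a sketch evaluating the sum directly (double-precision floats turn out to be adequate for these magnitudes):

```python
from math import comb

# P(X <= 170) for X ~ B(1500, 0.12): sum the exact Binomial pmf
exact = sum(comb(1500, j) * 0.12**j * 0.88**(1500 - j)
            for j in range(171))
print(round(exact, 4))
```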

It turns out that as n gets larger, the
Binomial distribution looks increasingly
like the Normal distribution.
Consider the following Binomial
histograms, each representing 10,000
samples from a Binomial distribution with
p = 0.1:

[Figure: histograms of 10,000 draws from the Binomial(n, 0.1)
distribution for n = 5, 10, 50, 100, 500, and 1000. The histograms
look increasingly Normal as n grows.]


Parameters of the Approximating Distribution

The approximating Normal distribution has the same mean and standard
deviation as the underlying Binomial distribution.
Thus, if X ∼ B(n, p), having mean E[X] = np and standard deviation
SD(X) = √(np(1 − p)), it is approximated by a Normal distribution
Y ∼ N(µ, σ), where

µ = np
σ = √(np(1 − p))
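As a quick sketch (the helper name is mine), the approximating parameters for the running example:

```python
from math import sqrt

def normal_approx_params(n, p):
    # mean and sd of the approximating Normal: mu = np, sigma = sqrt(np(1-p))
    return n * p, sqrt(n * p * (1 - p))

mu, sigma = normal_approx_params(1500, 0.12)
print(round(mu), round(sigma, 2))   # 180 12.59
```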

When is the approximation appropriate?

The farther p is from 1/2, the larger n needs to be for the
approximation to work.
Thus, as a rule of thumb, only use the approximation if

np ≥ 10 and
n(1 − p) ≥ 10
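The rule of thumb is a one-liner to check (the function name is my own):

```python
def normal_approx_ok(n, p):
    # rule of thumb: np >= 10 and n(1 - p) >= 10
    return n * p >= 10 and n * (1 - p) >= 10

print(normal_approx_ok(1500, 0.12))   # True: np = 180, n(1-p) = 1320
print(normal_approx_ok(100, 0.001))   # False: np = 0.1
```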

Behavior of the Approximation as a
Function of p, for n = 100
[Figure: histograms of the Binomial(100, p) distribution for
p = 0.001, 0.005, 0.01, 0.05, 0.1, and 0.5.]
Calculations with the Normal
Approximation

Recall the problem we set out to solve:

P (X ≤ 170), where X ∼ B(1500, 0.12)

How do we calculate this using the Normal approximation?
If we were to draw a histogram of the
B(1500, 0.12) distribution with bins of
width one, P (X ≤ 170) would be
represented by the total area of the bins
spanning

(−0.5, 0.5], (0.5, 1.5], . . . , (169.5, 170.5]

Thus, using the approximating Normal
distribution Y ∼ N (180, 12.59), we
calculate

P (X ≤ 170) ≈ P (Y ≤ 170.5) = 0.2253

For reference, the exact Binomial probability is 0.2265, so the
approximation is apparently pretty good.
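The calculation can be reproduced with the standard Normal cdf, written here via math.erf (a sketch; the function name is my own):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    # Phi((x - mu) / sigma), the Normal cdf, via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 180, sqrt(1500 * 0.12 * 0.88)
approx = normal_cdf(170.5, mu, sigma)   # continuity-corrected P(Y <= 170.5)
print(round(approx, 4))   # 0.2253
```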

The Continuity Correction*
The addition of 0.5 in the previous slide is an example of the
continuity correction, which is intended to refine the approximation
by accounting for the fact that the Binomial distribution is discrete
while the Normal distribution is continuous.
In general, we make the following
adjustments:
P (X ≤ x) ≈ P (Y ≤ x + 0.5)
P (X < x) = P (X ≤ x − 1)
≈ P (Y ≤ x − 0.5)
P (X ≥ x) ≈ P (Y ≥ x − 0.5)
P (X > x) = P (X ≥ x + 1)
≈ P (Y ≥ x + 0.5)
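The four adjustments can be packaged as small helpers (names mine), all built on the Normal cdf:

```python
from math import erf, sqrt

def phi(x, mu, sigma):
    # Normal cdf
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def p_le(x, mu, sigma):   # P(X <= x) ~ P(Y <= x + 0.5)
    return phi(x + 0.5, mu, sigma)

def p_lt(x, mu, sigma):   # P(X < x)  ~ P(Y <= x - 0.5)
    return phi(x - 0.5, mu, sigma)

def p_ge(x, mu, sigma):   # P(X >= x) ~ P(Y >= x - 0.5)
    return 1 - phi(x - 0.5, mu, sigma)

def p_gt(x, mu, sigma):   # P(X > x)  ~ P(Y >= x + 0.5)
    return 1 - phi(x + 0.5, mu, sigma)
```

Note that p_le and p_gt are complementary, as they must be, since P(X ≤ x) + P(X > x) = 1.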

Sampling Distributions
The Normal approximation to the
Binomial distribution is, in fact, a special
case of a more general phenomenon.
The general reason for this phenomenon
depends on the notion of a sampling
distribution.
Consider the following setup: We observe
a sample of size n from some population
and compute the mean
x̄ = (1/n) Σ_{i=1}^{n} x_i

Since the particular individuals included in our sample are random,
we would observe a different value of x̄ if we repeated the
procedure. That is, x̄ is also a random quantity.
If we repeatedly drew samples of size n
and calculated x̄, we could ascertain the
sampling distribution of x̄.
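This repeated-sampling thought experiment is easy to simulate; a sketch drawing repeated samples from a Uniform(0, 1) population (the population, sample size, and repetition count are arbitrary choices):

```python
import random
random.seed(0)  # for reproducibility

n, reps = 50, 2000
xbars = []
for _ in range(reps):
    sample = [random.random() for _ in range(n)]   # Uniform(0,1) population
    xbars.append(sum(sample) / n)                  # one observed x-bar

# each repetition gives a different x-bar, all centered near mu = 0.5
print(min(xbars), max(xbars))
```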

A Word on Notation

If Xi(ωi) = xi, ∀i, then

X̄(ω) = (1/n) Σ_{i=1}^{n} Xi(ωi)

where ω = (ω1, …, ωn). That is, X̄, like Xi, is a function mapping
values in a sample space to numbers.
Thus, X̄(ω) = x̄, where

x̄ = (1/n) Σ_{i=1}^{n} xi

The thing to keep in mind is that x̄ is a fixed number, while X̄ is a
random variable.

The Mean and Standard Deviation
of X̄

What are the mean and standard deviation of X̄?
Let’s be more specific about what we mean
by a sample of size n. We consider the
sample to be a collection of n independent
and identically distributed or iid random
variables X1 , X2 , . . . , Xn , with common mean
µ and common standard deviation σ.
Thus,

E(X̄) = E((1/n) Σ_{i=1}^{n} Xi) = (1/n) Σ_{i=1}^{n} E(Xi)
      = (1/n) Σ_{i=1}^{n} µ = (1/n)(nµ) = µ

à n
! n
1 X 1 X
Var(X̄) = Var Xi = 2 Var(Xi )
n n
i=1 i=1
n
1 X 2 1 2 σ2
= 2 σ = 2 (nσ ) =
n n n
i=1
p σ
SD(X) = Var(X) = √
n
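A quick simulation check of the σ/√n law: quadrupling the sample size should halve the standard deviation of X̄ (a sketch; the population and simulation sizes are arbitrary choices):

```python
import random
random.seed(1)

def sd_of_xbar(n, reps=4000):
    # empirical SD of the sample mean over repeated samples of size n
    xbars = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]
    m = sum(xbars) / reps
    return (sum((x - m) ** 2 for x in xbars) / reps) ** 0.5

# sigma/sqrt(n) predicts a ratio of 2 between n = 25 and n = 100
ratio = sd_of_xbar(25) / sd_of_xbar(100)
print(round(ratio, 1))
```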

The Central Limit Theorem*
Now we know that X̄ has mean µ and standard deviation σ/√n, but what
is its distribution?
If X1, X2, …, Xn are Normally distributed, then X̄ is also Normally
distributed. Thus,

Xi ∼ N(µ, σ), ∀i   =⇒   X̄ ∼ N(µ, σ/√n)

If X1, X2, …, Xn are not Normally distributed, then the Central Limit
Theorem tells us that X̄ is approximately Normal.

The Central Limit Theorem
Suppose X1 , X2 , . . . , Xn are iid random
variables with mean µ and finite standard
deviation σ.
If n is sufficiently large, the sampling distribution of X̄ is
approximately Normal with mean µ and standard deviation σ/√n.
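Even a strongly skewed population obeys this; a sketch using an Exponential(1) population (mean 1, sd 1), checking the empirical 68% rule for X̄ (the simulation sizes are arbitrary choices):

```python
import random
random.seed(2)

n, reps = 100, 2000
mu, sigma = 1.0, 1.0                      # Exponential(1) population
se = sigma / n ** 0.5                     # sd of x-bar predicted by the CLT

xbars = [sum(random.expovariate(1.0) for _ in range(n)) / n
         for _ in range(reps)]

# if x-bar is roughly Normal, about 68% of repetitions fall within 1 SE of mu
within = sum(1 for x in xbars if abs(x - mu) <= se) / reps
print(round(within, 2))
```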

Normal Approximation to the
Binomial Revisited
What does all this have to do with the
Normal approximation to the Binomial?
An observation from a Binomial distribution
Y is actually the sum of n independent
observations from a simpler distribution, the
Bernoulli distribution.
A Bernoulli random variable X takes the
value 1 with probability p or the value 0 with
probability 1 − p, and has
E[X] = p   and   SD(X) = √(p(1 − p))

Letting X1, …, Xn be n iid Bernoulli random variables,

Y = Σ_{i=1}^{n} Xi = nX̄.
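This identity is easy to see in simulation: summing n Bernoulli(p) draws gives one Binomial(n, p) draw, with the matching mean and variance (a sketch; the parameter values and simulation sizes are arbitrary choices):

```python
import random
random.seed(3)

n, p, reps = 100, 0.3, 5000

def binomial_draw(n, p):
    # sum of n iid Bernoulli(p) random variables
    return sum(1 if random.random() < p else 0 for _ in range(n))

draws = [binomial_draw(n, p) for _ in range(reps)]
mean = sum(draws) / reps
var = sum((d - mean) ** 2 for d in draws) / reps

# compare to the Binomial mean np = 30 and variance np(1-p) = 21
print(round(mean, 1), round(var, 1))
```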

According to the CLT, X̄ has approximately a N(µ, σ) distribution,
where

µ = p   and   σ = √(p(1 − p)/n)

It turns out that nX̄ is also (approximately) Normal, and it has

E[nX̄] = nE[X̄] = nµ = np
Var(nX̄) = n² Var(X̄) = n² σ² = n² · p(1 − p)/n = np(1 − p)

Thus, in general, if X1, …, Xn are iid random variables with mean µ
and standard deviation σ, then

Σ_{i=1}^{n} Xi ∼ N(nµ, √n σ), approximately.

