
Ismor Fischer, 5/29/2012 5.2-1

5.2 Formal Statement and Examples

Sampling Distribution of a Normal Variable


Given a random variable X, suppose that the population
distribution of X is known to be normal, with mean µ and
variance σ², that is, X ~ N(µ, σ). Then, for any sample size n,
it follows that the sampling distribution of the sample mean $\bar{X}$
is normal, with mean µ and variance $\sigma^2/n$, that is,
$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$.

Comments:
• $\frac{\sigma}{\sqrt{n}}$ is called the “standard error of the mean,” denoted SEM, or more simply, s.e.

• The corresponding Z-score transformation formula is $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.

Example: Suppose that the ages X of a certain population are normally distributed,
with mean µ = 27.0 years, and standard deviation σ = 12.0 years, i.e., X ~ N(27, 12).
The probability that the age of a single randomly selected individual is less than 30 years is

$P(X < 30) = P\left(Z < \frac{30 - 27}{12}\right) = P(Z < 0.25) = 0.5987.$

[Figure: normal density curve of X, with µ = 27 and 30 marked on the horizontal axis.]

Now consider all random samples of size n = 36 taken from this population. By the above, their mean ages $\bar{X}$ are also normally distributed, with mean µ = 27 yrs as before, but with standard error $\frac{\sigma}{\sqrt{n}} = \frac{12 \text{ yrs}}{\sqrt{36}} = 2$ yrs. That is, $\bar{X} \sim N(27, 2)$.

The probability that the mean age of a single sample of n = 36 randomly selected individuals is less than 30 years is

$P(\bar{X} < 30) = P\left(Z < \frac{30 - 27}{2}\right) = P(Z < 1.5) = 0.9332.$

[Figure: normal density curve of $\bar{X}$, with µ = 27 and 30 marked on the horizontal axis.]

In this population, the probability that the average age of 36 random people is under 30 years old is much greater than the probability that the age of one random person is under 30 years old.

Exercise: Compare the two probabilities of being under 24 years old.

Exercise: Compare the two probabilities of being between 24 and 30 years old.
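The two probabilities in this example can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are chosen here for illustration and are not part of the original notes.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 27.0, 12.0, 36

# Distribution of one individual's age: N(27, 12)
single = NormalDist(mu, sigma)
# Distribution of the mean age of n = 36 individuals: N(27, 12 / sqrt(36)) = N(27, 2)
mean_of_36 = NormalDist(mu, sigma / sqrt(n))

p_single = single.cdf(30)    # P(X < 30)
p_mean = mean_of_36.cdf(30)  # P(Xbar < 30)

print(round(p_single, 4), round(p_mean, 4))  # 0.5987 0.9332
```

Replacing 30 by another cutoff in the two `cdf` calls gives the corresponding pair of probabilities for that cutoff.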

• If X ~ N(µ, σ) approximately, then $\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ approximately. (The larger the value
of n, the better the approximation.) In fact, more is true...

IMPORTANT GENERALIZATION:

The Central Limit Theorem


Let X be any random variable, discrete or continuous, with finite
mean µ and finite variance σ². Then, regardless of the shape of
the population distribution of X, as the sample size n gets larger,
the sampling distribution of $\bar{X}$ becomes increasingly closer to
normal, with mean µ and variance $\sigma^2/n$, that is,
$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$, approximately.

(More formally, $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \to N(0, 1)$ as $n \to \infty$.)

• Intuitively perhaps, there is less variation between different sample mean values than
there is between different population values. This formal result states that, under very
general conditions, the sampling variability is usually much smaller than the population
variability, and it also gives the precise form of the “limiting distribution” of the statistic.

• What if the population standard deviation σ is unknown? Then it can be replaced by the
sample standard deviation s, provided n is large. That is, $\bar{X} \sim N\left(\mu, \frac{s}{\sqrt{n}}\right)$ approximately,
if n ≥ 30 or so, for “most” distributions (... but see example below). Since the value $\frac{s}{\sqrt{n}}$
is a sample-based estimate of the true standard error s.e., it is commonly denoted $\widehat{\text{s.e.}}$.

• Because the mean $\mu_{\bar{X}}$ of the sampling distribution is equal to the mean $\mu_X$ of the
population distribution (i.e., $E[\bar{X}] = \mu_X$), we say that $\bar{X}$ is an unbiased estimator of
$\mu_X$. In other words, the sample mean is an unbiased estimator of the population mean.
A biased sample estimator is a statistic $\hat{\theta}$ whose “expected value” either consistently
overestimates or underestimates its intended population parameter $\theta$.

• Many other versions of the CLT exist, related to so-called Laws of Large Numbers.

Example: Consider a(n infinite) population of paper notes, 50% of which are
blank, 30% are ten-dollar bills, and the remaining 20% are twenty-dollar bills.

Experiment 1: Randomly select a single note from the population.


Random variable: X = $ amount obtained

x       f(x) = P(X = x)
0       .5
10      .3
20      .2

[Figure: probability histogram of X, with bar heights .5, .3, .2 at x = 0, 10, 20.]

• Mean $\mu_X = E[X] = (.5)(0) + (.3)(10) + (.2)(20) = \$7.00$

• Variance $\sigma_X^2 = E[(X - \mu_X)^2] = (.5)(-7)^2 + (.3)(3)^2 + (.2)(13)^2 = 61$

• Standard deviation $\sigma_X = \sqrt{61} = \$7.81$
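The three population summaries for Experiment 1 follow directly from the probability table. A minimal Python sketch of that arithmetic (the dictionary `pmf` and the other names are introduced here for illustration only):

```python
from math import sqrt

# Population distribution of X from Experiment 1: dollar value -> probability
pmf = {0: 0.5, 10: 0.3, 20: 0.2}

# Mean: probability-weighted average of the values
mean = sum(p * x for x, p in pmf.items())
# Variance: probability-weighted average squared deviation from the mean
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())
sd = sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.2f}")
# mean = 7.00, variance = 61.00, sd = 7.81
```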



Experiment 2: Each of n = 2 people randomly selects a note, and they split the winnings.

Random variable: $\bar{X}$ = sample mean dollar amount obtained per person

$\bar{x}$      0         5         10        5         10        15        10        15        20
(x1, x2)       (0, 0)    (0, 10)   (0, 20)   (10, 0)   (10, 10)  (10, 20)  (20, 0)   (20, 10)  (20, 20)
Probability    .5 × .5   .5 × .3   .5 × .2   .3 × .5   .3 × .3   .3 × .2   .2 × .5   .2 × .3   .2 × .2
               = 0.25    = 0.15    = 0.10    = 0.15    = 0.09    = 0.06    = 0.10    = 0.06    = 0.04

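$\bar{x}$   $f(\bar{x}) = P(\bar{X} = \bar{x})$
0           .25
5           .30 = .15 + .15
10          .29 = .10 + .09 + .10
15          .12 = .06 + .06
20          .04

[Figure: probability histogram of $\bar{X}$, with bar heights .25, .30, .29, .12, .04 at $\bar{x}$ = 0, 5, 10, 15, 20.]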

• Mean $\mu_{\bar{X}} = (.25)(0) + (.30)(5) + (.29)(10) + (.12)(15) + (.04)(20) = \$7.00 = \mu_X$ !!

• Variance $\sigma_{\bar{X}}^2 = (.25)(-7)^2 + (.30)(-2)^2 + (.29)(3)^2 + (.12)(8)^2 + (.04)(13)^2 = 30.5 = \frac{61}{2} = \frac{\sigma_X^2}{n}$ !!

• Standard deviation $\sigma_{\bar{X}} = \$5.52 = \frac{\sigma_X}{\sqrt{n}}$ !!
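The table for Experiment 2 can be rebuilt by brute-force enumeration of all ordered outcomes. The sketch below assumes independent draws, as in the notes; the function names `sampling_distribution` and `mean_and_variance` are hypothetical helpers introduced here, not part of the original.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Population distribution of a single note: dollar value -> probability
pmf = {0: 0.5, 10: 0.3, 20: 0.2}

def sampling_distribution(pmf, n):
    """Exact distribution of the sample mean of n independent draws from pmf."""
    dist = defaultdict(float)
    for outcome in product(pmf, repeat=n):       # every ordered n-tuple of values
        prob = prod(pmf[x] for x in outcome)     # independence: multiply probabilities
        dist[sum(outcome) / n] += prob           # pool outcomes sharing the same mean
    return dict(sorted(dist.items()))

def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as value -> probability."""
    m = sum(p * x for x, p in dist.items())
    v = sum(p * (x - m) ** 2 for x, p in dist.items())
    return m, v

dist2 = sampling_distribution(pmf, 2)
m2, v2 = mean_and_variance(dist2)
for xbar, p in dist2.items():
    print(f"{xbar:5.1f}  {p:.2f}")
print(f"mean = {m2:.2f}, variance = {v2:.2f}")
```

Calling `sampling_distribution(pmf, 3)` enumerates the 27 outcomes of Experiment 3 in the same way.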

Experiment 3: Each of n = 3 people randomly selects a note, and they split the winnings.

Random variable: $\bar{X}$ = sample mean dollar amount obtained per person


$\bar{x}$      0              3.33           6.67           3.33           6.67           10             6.67           10             13.33
(x1, x2, x3)   (0, 0, 0)      (0, 0, 10)     (0, 0, 20)     (0, 10, 0)     (0, 10, 10)    (0, 10, 20)    (0, 20, 0)     (0, 20, 10)    (0, 20, 20)
Probability    .5 × .5 × .5   .5 × .5 × .3   .5 × .5 × .2   .5 × .3 × .5   .5 × .3 × .3   .5 × .3 × .2   .5 × .2 × .5   .5 × .2 × .3   .5 × .2 × .2
               = 0.125        = 0.075        = 0.050        = 0.075        = 0.045        = 0.030        = 0.050        = 0.030        = 0.020

$\bar{x}$      3.33           6.67           10             6.67           10             13.33          10             13.33          16.67
(x1, x2, x3)   (10, 0, 0)     (10, 0, 10)    (10, 0, 20)    (10, 10, 0)    (10, 10, 10)   (10, 10, 20)   (10, 20, 0)    (10, 20, 10)   (10, 20, 20)
Probability    .3 × .5 × .5   .3 × .5 × .3   .3 × .5 × .2   .3 × .3 × .5   .3 × .3 × .3   .3 × .3 × .2   .3 × .2 × .5   .3 × .2 × .3   .3 × .2 × .2
               = 0.075        = 0.045        = 0.030        = 0.045        = 0.027        = 0.018        = 0.030        = 0.018        = 0.012

$\bar{x}$      6.67           10             13.33          10             13.33          16.67          13.33          16.67          20
(x1, x2, x3)   (20, 0, 0)     (20, 0, 10)    (20, 0, 20)    (20, 10, 0)    (20, 10, 10)   (20, 10, 20)   (20, 20, 0)    (20, 20, 10)   (20, 20, 20)
Probability    .2 × .5 × .5   .2 × .5 × .3   .2 × .5 × .2   .2 × .3 × .5   .2 × .3 × .3   .2 × .3 × .2   .2 × .2 × .5   .2 × .2 × .3   .2 × .2 × .2
               = 0.050        = 0.030        = 0.020        = 0.030        = 0.018        = 0.012        = 0.020        = 0.012        = 0.008

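$\bar{x}$   $f(\bar{x}) = P(\bar{X} = \bar{x})$
0.00        .125
3.33        .225 = .075 + .075 + .075
6.67        .285 = .050 + .045 + .050 + .045 + .045 + .050
10.00       .207 = .030 + .030 + .030 + .027 + .030 + .030 + .030
13.33       .114 = .020 + .018 + .018 + .020 + .018 + .020
16.67       .036 = .012 + .012 + .012
20.00       .008

[Figure: probability histogram of $\bar{X}$, with bar heights .125, .225, .285, .207, .114, .036, .008 at $\bar{x}$ = 0.00, 3.33, 6.67, 10.00, 13.33, 16.67, 20.00.]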

• Mean $\mu_{\bar{X}}$ = Exercise $= \$7.00 = \mu_X$ !!!

• Variance $\sigma_{\bar{X}}^2$ = Exercise $= 20.333 = \frac{61}{3} = \frac{\sigma_X^2}{n}$ !!!

• Standard deviation $\sigma_{\bar{X}} = \$4.51 = \frac{\sigma_X}{\sqrt{n}}$ !!!
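The pattern seen in Experiments 1 through 3 holds for every n. A short derivation, assuming only that the individual draws are independent, each with mean $\mu_X$ and variance $\sigma_X^2$ (the notation $X_1, \dots, X_n$ for the individual draws is introduced here for illustration):

```latex
\begin{align*}
E[\bar{X}]
  &= E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
   = \frac{1}{n}\, n\,\mu_X
   = \mu_X, \\
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
   = \frac{1}{n^2}\, n\,\sigma_X^2
   = \frac{\sigma_X^2}{n}.
\end{align*}
```

The second line uses independence, which lets the variance of a sum be written as the sum of the variances. With $\sigma_X^2 = 61$ this gives 61/2 = 30.5 for n = 2 and 61/3 ≈ 20.333 for n = 3.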

The tendency toward a normal distribution becomes stronger as the sample size
n gets larger, despite the mild skew in the original population values. This is
an empirical consequence of the Central Limit Theorem.

For most such distributions, n ≥ 30 or so is sufficient for a reasonable
normal approximation to the sampling distribution. In fact, if the
distribution is symmetric, then convergence to a bell curve can often be
seen for much lower n, say only n = 5 or 6. Recall also, from the first
result in this section, that if the population is normally distributed (with
known σ), then so will be the sampling distribution, for any n.
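One way to quantify how the mild skew of the paper-note population fades as n grows is to compute the skewness of the exact sampling distribution. The sketch below is an illustration only: using skewness as the yardstick is a choice made here, not something the notes prescribe, and the helper names are hypothetical. It builds the exact distribution of the total of n draws by repeated convolution, then rescales to the mean.

```python
from math import sqrt

# Population distribution of a single note: dollar value -> probability
pmf = {0: 0.5, 10: 0.3, 20: 0.2}

def sum_distribution(pmf, n):
    """Exact distribution of the total of n independent draws, by repeated convolution."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for total, p in dist.items():
            for x, q in pmf.items():
                new[total + x] = new.get(total + x, 0.0) + p * q
        dist = new
    return dist

def skewness_of_mean(pmf, n):
    """Skewness (standardized third central moment) of the sample mean of n draws."""
    dist = {total / n: p for total, p in sum_distribution(pmf, n).items()}
    m = sum(p * x for x, p in dist.items())
    var = sum(p * (x - m) ** 2 for x, p in dist.items())
    third = sum(p * (x - m) ** 3 for x, p in dist.items())
    return third / var ** 1.5

population_skew = skewness_of_mean(pmf, 1)
for n in (1, 2, 3, 30):
    # Second column: exact skewness; third column: population skewness / sqrt(n)
    print(n, round(skewness_of_mean(pmf, n), 4), round(population_skew / sqrt(n), 4))
```

The population skewness here is about 0.58, and the two printed columns agree because the skewness of the mean of n independent draws equals the population skewness divided by √n.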

BUT BEWARE....

However, if the population distribution of X is highly skewed, then the sampling
distribution of $\bar{X}$ can be highly skewed as well (especially if n is not very large),
i.e., relying on the CLT can be risky! (Sometimes, though, using a transformation
such as ln(X) or $\sqrt{X}$ can restore a bell shape to the values. Later…)
Example: The two graphs on the bottom of this page are simulated sampling
distributions for the highly skewed population shown below. Both are density
histograms based on the means of 1000 random samples; the first corresponds to
samples of size n = 30, the second to n = 100. Note that skew is still present!
[Figure: Population Distribution]
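A simulation of this kind can be sketched as follows. The population behind the original graphs is not specified in the text, so the lognormal population used here is an assumption chosen only because it is highly right-skewed; the seed, the helper names, and the use of sample skewness as a summary are likewise illustrative choices.

```python
import random
from statistics import fmean

def skewness(values):
    """Sample skewness: third central moment divided by the 1.5 power of the second."""
    m = fmean(values)
    m2 = fmean((v - m) ** 2 for v in values)
    m3 = fmean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5

def simulate_sample_means(n, reps=1000, rng=None):
    """Means of `reps` random samples of size n from an ASSUMED lognormal(0, 1) population."""
    rng = rng or random.Random()
    return [fmean(rng.lognormvariate(0.0, 1.0) for _ in range(n)) for _ in range(reps)]

rng = random.Random(2012)  # fixed seed so that a run is repeatable
for n in (30, 100):
    means = simulate_sample_means(n, rng=rng)
    print(f"n = {n}: skewness of 1000 sample means = {skewness(means):.2f}")
```

A skewness near zero would indicate a roughly symmetric sampling distribution; clearly positive values for both sample sizes would match the right skew that the notes say is still present.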
