Comments:
$\sigma/\sqrt{n}$ is called the “standard error of the mean,” denoted SEM, or more simply, s.e.
The corresponding Z-score transformation formula is $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.
Example: Suppose that the ages X of a certain population are normally distributed,
with mean µ = 27.0 years, and standard deviation σ = 12.0 years, i.e., X ~ N(27, 12).
The probability that the age of a single randomly selected individual is less than 30 years is
$P(X < 30) = P\left(Z < \dfrac{30 - 27}{12}\right) = P(Z < 0.25) = 0.5987$.
[Figure: normal density curve of X, with µ = 27 and the cutoff 30 marked on the axis.]
Now consider all random samples of size n = 36 taken from this population. By the above, their mean ages $\bar{X}$ are also normally distributed, with mean µ = 27 yrs as before, but with standard error $\dfrac{\sigma}{\sqrt{n}} = \dfrac{12 \text{ yrs}}{\sqrt{36}} = 2$ yrs. That is, $\bar{X} \sim N(27, 2)$.
The probability that the mean age of a single sample of n = 36 randomly selected individuals is less than 30 years is
$P(\bar{X} < 30) = P\left(Z < \dfrac{30 - 27}{2}\right) = P(Z < 1.5) = 0.9332$.
[Figure: normal density curve of $\bar{X}$, with µ = 27 and the cutoff 30 marked on the axis.]
In this population, the probability that the average age of 36 random people is under 30 years old is much greater than the probability that the age of one random person is under 30 years old.
Exercise: Compare the two probabilities of being under 24 years old.
Exercise: Compare the two probabilities of being between 24 and 30 years old.
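The two probabilities in the age example can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative and not part of the original notes.

```python
from statistics import NormalDist

mu, sigma, n = 27.0, 12.0, 36

# Distribution of a single individual's age: X ~ N(27, 12)
single = NormalDist(mu, sigma)

# Distribution of the mean age of n = 36 individuals: N(27, 12 / sqrt(36)) = N(27, 2)
mean_of_36 = NormalDist(mu, sigma / n ** 0.5)

p_single = single.cdf(30)    # P(X < 30)
p_mean = mean_of_36.cdf(30)  # P(Xbar < 30)

print(f"P(X < 30)    = {p_single:.4f}")
print(f"P(Xbar < 30) = {p_mean:.4f}")
```

Changing the argument of `cdf`, or taking the difference of two `cdf` values, gives the same comparison for other cutoffs and intervals.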
Ismor Fischer, 5/29/2012 5.2-2
If $X \sim N(\mu, \sigma)$ approximately, then $\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$ approximately. (The larger the value of n, the better the approximation.) In fact, more is true...
IMPORTANT GENERALIZATION:
Intuitively, there is less variation among different sample mean values than there is among different population values. This formal result states that, under very general conditions, the sampling variability is usually much smaller than the population variability, and it also gives the precise form of the “limiting distribution” of the statistic.
What if the population standard deviation σ is unknown? Then it can be replaced by the sample standard deviation s, provided n is large. That is, $\bar{X} \sim N\left(\mu, \dfrac{s}{\sqrt{n}}\right)$ approximately, if n ≥ 30 or so, for “most” distributions (... but see example below). Since the value $\dfrac{s}{\sqrt{n}}$ is a sample-based estimate of the true standard error s.e., it is commonly denoted $\widehat{\text{s.e.}}$
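As a rough illustration of estimating the standard error from data, the sketch below computes $s/\sqrt{n}$ for one sample. The sample is hypothetical: it is simulated here from N(27, 12) with a fixed seed, purely so that the example is self-contained.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sample of n = 36 ages, simulated for illustration only
sample = [random.gauss(27.0, 12.0) for _ in range(36)]

n = len(sample)
s = stdev(sample)      # sample standard deviation (n - 1 in the denominator)
se_hat = s / n ** 0.5  # estimated standard error of the mean

print(f"sample mean     = {mean(sample):.2f}")
print(f"sample s        = {s:.2f}")
print(f"estimated s.e.  = {se_hat:.2f}")
```

With real data, `sample` would simply be the list of observed values.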
Because the mean $\mu_{\bar{X}}$ of the sampling distribution is equal to the mean $\mu_X$ of the population distribution (i.e., $E[\bar{X}] = \mu_X$), we say that $\bar{X}$ is an unbiased estimator of $\mu_X$. In other words, the sample mean is an unbiased estimator of the population mean.
A biased sample estimator is a statistic $\hat{\theta}$ whose “expected value” either consistently overestimates or underestimates its intended population parameter $\theta$.
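For reference, the unbiasedness of the sample mean and the form of its standard error follow from a short calculation. This sketch assumes the observations $X_1, \dots, X_n$ are independent draws from the population, each with mean $\mu_X$ and variance $\sigma_X^2$ (an assumption stated here for the derivation, in line with the random sampling used throughout).

```latex
\begin{align*}
E[\bar{X}]
  &= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
   = \frac{1}{n}\, n\, \mu_X
   = \mu_X, \\
\operatorname{Var}(\bar{X})
  &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
   = \frac{1}{n^2}\, n\, \sigma_X^2
   = \frac{\sigma_X^2}{n}
  \quad\Longrightarrow\quad
  \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}.
\end{align*}
```

The first line uses only linearity of expectation; the second line is where independence is needed.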
Many other versions of the CLT exist, related to so-called Laws of Large Numbers.
Example: Consider a(n infinite) population of paper notes, 50% of which are
blank, 30% are ten-dollar bills, and the remaining 20% are twenty-dollar bills.
x     f(x) = P(X = x)
0     .5
10    .3
20    .2
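The mean and variance of this population can be computed directly from the table. The sketch below does so; the dictionary `pmf` and the other names are illustrative choices, not part of the original notes.

```python
# Population distribution of note values (in dollars), taken from the table
pmf = {0: 0.5, 10: 0.3, 20: 0.2}

# Mean: sum of x * P(X = x)
mu = sum(x * p for x, p in pmf.items())

# Variance: sum of (x - mu)^2 * P(X = x)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

sd = var ** 0.5

print(f"mean     = {mu:.2f}")
print(f"variance = {var:.2f}")
print(f"std dev  = {sd:.2f}")
```

The variance obtained here is the value 61 that the experiments in this example divide by the sample size n.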
Experiment 2: Each of n = 2 people randomly selects a note, and they split the winnings.
(x1, x2)    $\bar{x}$    Probability
(0, 0)      0     .5 × .5 = 0.25
(0, 10)     5     .5 × .3 = 0.15
(0, 20)     10    .5 × .2 = 0.10
(10, 0)     5     .3 × .5 = 0.15
(10, 10)    10    .3 × .3 = 0.09
(10, 20)    15    .3 × .2 = 0.06
(20, 0)     10    .2 × .5 = 0.10
(20, 10)    15    .2 × .3 = 0.06
(20, 20)    20    .2 × .2 = 0.04
$\bar{x}$    $f(\bar{x}) = P(\bar{X} = \bar{x})$
0     .25
5     .30 = .15 + .15
10    .29 = .10 + .09 + .10
15    .12 = .06 + .06
20    .04
Variance $\sigma_{\bar{X}}^2 = 30.5 = \dfrac{61}{2} = \dfrac{\sigma_X^2}{n}$ !!
Standard deviation $\sigma_{\bar{X}} = \$5.52 = \dfrac{\sigma_X}{\sqrt{n}}$ !!
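The sampling distribution for n = 2 can be reproduced by enumerating all ordered pairs of notes, as in the outcome table. This is a minimal sketch, assuming the two selections are independent; the names `pmf` and `dist` are illustrative.

```python
from collections import defaultdict
from itertools import product

pmf = {0: 0.5, 10: 0.3, 20: 0.2}  # population distribution of note values
n = 2

# Accumulate the probability of each possible sample mean
dist = defaultdict(float)
for draws in product(pmf, repeat=n):  # all ordered outcomes (x1, x2)
    prob = 1.0
    for x in draws:
        prob *= pmf[x]                # independence: multiply probabilities
    dist[sum(draws) / n] += prob

mean = sum(x * p for x, p in dist.items())
var = sum((x - mean) ** 2 * p for x, p in dist.items())

for x in sorted(dist):
    print(f"{x:5.1f}  {dist[x]:.2f}")
print(f"variance = {var:.2f}, std dev = {var ** 0.5:.2f}")
```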
Experiment 3: Each of n = 3 people randomly selects a note, and they split the winnings.
$\bar{x}$    $f(\bar{x}) = P(\bar{X} = \bar{x})$
0.00 .125
20.00 .008
Variance $\sigma_{\bar{X}}^2$ = (Exercise) = 20.333 = $\dfrac{61}{3} = \dfrac{\sigma_X^2}{n}$ !!!
Standard deviation $\sigma_{\bar{X}} = \$4.51 = \dfrac{\sigma_X}{\sqrt{n}}$ !!!
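For n = 3 the full sampling distribution can be built exactly by repeated convolution of the population distribution with itself. The sketch below uses exact fractions so that the variance comes out as exactly 61/3; the helper `sample_mean_pmf` is a hypothetical name introduced for illustration, and independence of the selections is assumed.

```python
from fractions import Fraction

# Population distribution of note values, as exact fractions
pmf = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}

def sample_mean_pmf(pmf, n):
    """Exact pmf of the mean of n independent draws, via repeated convolution."""
    totals = {0: Fraction(1)}  # distribution of the running total, starting at 0
    for _ in range(n):
        nxt = {}
        for t, p in totals.items():
            for x, q in pmf.items():
                nxt[t + x] = nxt.get(t + x, 0) + p * q
        totals = nxt
    # Convert totals to means by dividing each total by n
    return {Fraction(t, n): p for t, p in totals.items()}

dist = sample_mean_pmf(pmf, 3)
mean = sum(x * p for x, p in dist.items())
var = sum((x - mean) ** 2 * p for x, p in dist.items())

for x in sorted(dist):
    print(f"{float(x):6.2f}  {float(dist[x]):.3f}")
print(f"variance = {var} = {float(var):.3f}, std dev = {float(var) ** 0.5:.2f}")
```

Calling `sample_mean_pmf(pmf, n)` with other values of n gives the corresponding exact distributions.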
The tendency toward a normal distribution becomes stronger as the sample size
n gets larger, despite the mild skew in the original population values. This is
an empirical consequence of the Central Limit Theorem.
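One way to see the skew fading as n grows is to compute the skewness of the exact sampling distribution of the mean for several sample sizes. This is only a sketch: skewness captures asymmetry, which is one aspect of closeness to normality, and the function name `skewness_of_mean` is a hypothetical helper introduced here.

```python
pmf = {0: 0.5, 10: 0.3, 20: 0.2}  # population distribution of note values

def skewness_of_mean(pmf, n):
    """Skewness of the mean of n independent draws, from its exact distribution."""
    totals = {0: 1.0}  # distribution of the running total
    for _ in range(n):
        nxt = {}
        for t, p in totals.items():
            for x, q in pmf.items():
                nxt[t + x] = nxt.get(t + x, 0.0) + p * q
        totals = nxt
    dist = {t / n: p for t, p in totals.items()}
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    third = sum((x - mean) ** 3 * p for x, p in dist.items())
    return third / var ** 1.5  # standardized third central moment

for n in (1, 2, 3, 10, 30):
    print(f"n = {n:2d}: skewness = {skewness_of_mean(pmf, n):.3f}")
```

For independent draws the skewness of the mean equals the population skewness divided by $\sqrt{n}$, so it shrinks toward 0 (the value for a normal distribution) as n increases.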
BUT BEWARE....