
CENTRAL LIMIT THEOREM

(CLT)
BOI205/4 - BIOSTATISTICS
Given a population with a mean of μ and standard deviation σ

Example: Suppose my population is all Malaysian adults, across all age groups.

Variable: Heights.

The population has a mean height: μ

and also a standard deviation: σ (how spread out the data are around the mean, μ).
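A minimal sketch of what μ and σ mean, assuming we could measure everyone. The five heights below are made-up numbers standing in for a tiny "population", not real Malaysian data, and the variable names are chosen only for illustration.

```python
import statistics

# Hypothetical heights in cm for a tiny "population" (made-up numbers)
heights_cm = [150.0, 155.0, 160.0, 165.0, 170.0]

mu = statistics.mean(heights_cm)       # population mean
sigma = statistics.pstdev(heights_cm)  # population standard deviation

print(f"mu = {mu:.1f} cm, sigma = {sigma:.2f} cm")
```

Note that `pstdev` is the population standard deviation (divide by N), which matches σ here because the list is treated as the whole population.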
BUT THERE IS NO WAY YOU CAN GET ACCURATE HEIGHT
MEASUREMENTS FOR ALL ADULTS IN MALAYSIA JUST LIKE
THAT!

HOW ABOUT SAMPLING?

YES…

BUT…

SAMPLING CAN BE VERY EXTENSIVE AND TIME CONSUMING! IT COULD
EVEN TAKE MONTHS? YEARS? WHAT IF YOU NEED THE DATA
FAST? YOU HAVE DEADLINES?
Start by sampling with a sample size of “n”

So you start sampling Malaysian adults in samples of
size “n”

UNTIL YOU SAMPLE ALL MALAYSIAN ADULTS?

Yet, it is still very extensive, expensive, and impractical…

You will see how the Central Limit Theorem will help you!
You DO NOT NEED TO SAMPLE each and every adult in
Malaysia. With the CENTRAL LIMIT THEOREM, you can still
draw conclusions about the whole population…

Initially, you do not know the shape or type of the
distribution of your population of interest, but that is NO PROBLEM
AT ALL FOR THE CENTRAL LIMIT THEOREM!

The population of interest DOES NOT HAVE TO HAVE A
NORMAL DISTRIBUTION FOR THE CENTRAL LIMIT THEOREM
TO WORK…
For example, with sample size (n) = 2, each sample gives one sample mean.

[Figure: five samples (Sample #1 to Sample #5) drawn from the population, each made up of Volunteer 1 and Volunteer 2]

THE POPULATION CAN BE OF ANY SHAPE!
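The sampling step can be sketched in a few lines. This is only an illustration: the population below is a hypothetical list of heights generated at random, and the names `population`, `n`, and `num_samples` are assumptions made for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of heights in cm (invented for illustration)
population = [random.uniform(140.0, 190.0) for _ in range(1000)]

n = 2            # sample size: two volunteers per sample
num_samples = 5  # Sample #1 to Sample #5

sample_means = []
for i in range(1, num_samples + 1):
    volunteers = random.sample(population, n)  # pick n volunteers without replacement
    x_bar = statistics.mean(volunteers)        # one sample mean per sample
    sample_means.append(x_bar)
    print(f"Sample #{i}: sample mean = {x_bar:.1f} cm")
```

Each sample gives a different sample mean, which is exactly why we look at the distribution of the sample means next.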
THEOREM 1:
THE MEAN OF THE SAMPLE MEANS = MEAN OF THE
ENTIRE POPULATION!

μx̄ = μ

(μx̄: mean of the sample means; μ: mean of the population)

[Figure: sample means numbered 1 to 8, each taken from a sample of size n drawn from the population]

INDEPENDENT OF THE SHAPE/TYPE OF POPULATION AND
SAMPLE SIZES!
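Theorem 1 can be checked with a small simulation. The sketch below assumes a deliberately skewed (exponential) population and a small sample size of n = 5. Both choices, and all variable names, are assumptions made for illustration only.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population (exponential), invented for illustration
population = [random.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)  # population mean

n = 5                 # small sample size on purpose
num_samples = 10_000  # how many samples we draw

# One sample mean per sample; random.choices samples with replacement
sample_means = [
    statistics.fmean(random.choices(population, k=n))
    for _ in range(num_samples)
]
mean_of_sample_means = statistics.fmean(sample_means)

print(f"population mean      = {mu:.3f}")
print(f"mean of sample means = {mean_of_sample_means:.3f}")
```

The two printed values should be close to each other even though the population is far from normal and n is small.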
In real-life situations, you CAN NEVER GET ALL THE POSSIBLE
SAMPLES FROM A POPULATION. BUT AS LONG AS YOU HAVE AN
ACCEPTABLE NUMBER OF SAMPLES (NOT ALL OF THEM), AS MANY
AS YOUR EFFORTS ALLOW, THE CENTRAL LIMIT
THEOREM STILL APPLIES…
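THEOREM 2:

σx̄ = σ / √n

(σx̄: standard deviation of the sample means, also called the standard error; σ: standard deviation of the population; n: sample size)

YOU CAN ESTIMATE THE
POPULATION STANDARD DEVIATION
JUST BY SAMPLING DATA!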
In reality, even if you don’t take ALL possible samples from your
population…
You can still get closer and closer to the population standard
deviation as the number of samples you take gets larger and larger…
BUT THAT DOES NOT MEAN YOU HAVE TO TAKE ALL
SAMPLES FROM YOUR POPULATION!
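A rough check of Theorem 2 (σx̄ = σ / √n), assuming a hypothetical population of heights with mean 165 cm and standard deviation 8 cm. These numbers and the variable names are invented for the sketch. The last step rearranges the formula to σ = σx̄ × √n to estimate the population standard deviation from the sample means alone.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of heights in cm (invented for illustration)
population = [random.gauss(165.0, 8.0) for _ in range(20_000)]
sigma = statistics.pstdev(population)  # population standard deviation

n = 25               # sample size
num_samples = 5_000  # how many samples we draw

# One sample mean per sample; random.choices samples with replacement
sample_means = [
    statistics.fmean(random.choices(population, k=n))
    for _ in range(num_samples)
]

observed_se = statistics.pstdev(sample_means)  # spread of the sample means
predicted_se = sigma / math.sqrt(n)            # what Theorem 2 predicts
estimated_sigma = observed_se * math.sqrt(n)   # formula rearranged for sigma

print(f"observed  sd of sample means = {observed_se:.3f}")
print(f"predicted sigma / sqrt(n)    = {predicted_se:.3f}")
print(f"estimated population sigma   = {estimated_sigma:.3f} (true: {sigma:.3f})")
```

Drawing more samples (a larger `num_samples`) should bring the estimate closer to the true σ, without ever needing to measure the whole population.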
THEOREM 3:

If the population is normal (has a normal distribution
pattern), then the sampling distribution of the sample
means will also be normal, regardless of the
sample size

[Figure: sample means (x̄) stacked into a bell-shaped histogram centred on μ]
Surely the sample means will differ from
each other (even when the sample sizes are the same), but if
you plot the sample means according to their
respective frequencies (as a
histogram), and the parent population is known to
be normal, you will have a normal sampling
distribution…
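One simple way to check Theorem 3 without plotting is the 68/95 rule: in a normal distribution about 68% of values lie within 1 standard deviation of the mean and about 95% within 2. The sketch below assumes a hypothetical normal population (mean 165 cm, standard deviation 8 cm) and a deliberately tiny sample size of n = 3. All numbers and names are assumptions for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

n = 3                 # deliberately small sample size
num_samples = 10_000  # how many samples we draw

# Each sample is n draws from a hypothetical normal population (165 cm, sd 8 cm)
sample_means = [
    statistics.fmean([random.gauss(165.0, 8.0) for _ in range(n)])
    for _ in range(num_samples)
]

center = statistics.fmean(sample_means)   # mean of the sample means
spread = statistics.pstdev(sample_means)  # sd of the sample means

# Fraction of sample means within 1 and 2 standard deviations of the center
within_1 = sum(abs(m - center) <= spread for m in sample_means) / num_samples
within_2 = sum(abs(m - center) <= 2 * spread for m in sample_means) / num_samples

print(f"within 1 sd: {within_1:.1%} (normal: about 68%)")
print(f"within 2 sd: {within_2:.1%} (normal: about 95%)")
```

Even with only three volunteers per sample, the fractions should land close to the normal values because the parent population is itself normal.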
THEOREM 4:

If the population IS NOT normal, BUT the sample size (n) is
greater than 30 (n > 30, a common rule of thumb), then the sampling
distribution of the sample means (plotted as a histogram)
approximates a normal distribution!
For any population distribution shape…

[Figure: many samples, each of size n > 30, are drawn from the population; their sample means (x̄) stack into a bell-shaped histogram]
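Theorem 4 can be illustrated by measuring how lopsided the histogram of sample means is. Skewness is roughly 0 for a symmetric, bell-shaped histogram. The sketch below assumes a strongly skewed (exponential) population and compares n = 2 with n = 40. The population, the sample sizes, and the helper names `skewness` and `sample_means` are hypothetical choices for this example.

```python
import random
import statistics

random.seed(11)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Skewness of a list: roughly 0 for a symmetric, bell-shaped histogram."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, num_samples=10_000):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

skew_small = skewness(sample_means(2))   # n = 2: still clearly skewed
skew_large = skewness(sample_means(40))  # n = 40: above the n > 30 rule of thumb

print(f"skewness of sample means, n = 2:  {skew_small:.2f}")
print(f"skewness of sample means, n = 40: {skew_large:.2f}")
```

The skewness for n = 40 should be much closer to 0 than for n = 2, meaning the histogram of sample means is already close to a normal shape even though the population is not.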
SO WHAT IS THE POINT OF THE CENTRAL LIMIT THEOREM, AGAIN?

We do not need to know what the parent population looks like…
(Symmetric, Non-Symmetric, Skewed, Uniform)
As long as you sample with sample sizes greater than 30 (n > 30), the
sampling distribution of your sample means will look approximately
normal…

IMPORTANT!
INFERENTIAL STATISTICS, IN MOST CASES, RELIES ON THE
NORMAL DISTRIBUTION PATTERN!
