
Sampling Distributions

• Concept of Sampling Distribution


• Central Limit Theorem
• Distributions of Sample Mean and Sample Proportion

Gaurav Garg (IIM Lucknow)


Parameter and Statistic
• Parameter:
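• A statistical measure computed from the population observations.
• Let X_1, X_2, ..., X_N be the population observations.
• Population mean: μ = (1/N) Σ X_i
• Population variance: σ² = (1/N) Σ (X_i − μ)²
• Statistic:
• A statistical measure computed from the sample observations.
• Let x_1, x_2, ..., x_n be the sample units.
• Sample mean: x̄ = (1/n) Σ x_i
• Sample variance: s² = (1/(n − 1)) Σ (x_i − x̄)²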



• In practice, parameter values are unknown.
• They are estimated through sample statistics.
• Parameter values are fixed.
• Values of statistic vary from sample to sample.
• Let us consider the following population of size 4:
• 18, 20, 22, 24
• Clearly, μ = 21 and σ² = 5.
• Consider all possible samples of size 2, drawn with replacement (16 ordered samples).
• Obtain the sample mean and sample variance of each sample.



Samples   Sample mean   Variance (divisor n − 1)   Variance (divisor n)
18, 18    18            0                          0
20, 18    19            2                          1
22, 18    20            8                          4
24, 18    21            18                         9
18, 20    19            2                          1
20, 20    20            0                          0
22, 20    21            2                          1
24, 20    22            8                          4
18, 22    20            8                          4
20, 22    21            2                          1
22, 22    22            0                          0
24, 22    23            2                          1
18, 24    21            18                         9
20, 24    22            8                          4
22, 24    23            2                          1
24, 24    24            0                          0
Average   21            5                          2.5

• E(x̄) = 21 = μ
• E(s²) = 5 = σ², where s² uses the divisor n − 1.
• If E(statistic) = parameter, then the statistic is said to be an unbiased estimate of the parameter.
• The sample mean is an unbiased estimate of the population mean.
• This means that the average of all sample means equals the population mean.
• Also, the variance with divisor n averages 2.5 = (n − 1)σ²/n, so it underestimates σ².
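The averages in the table can be checked by brute force. The following is a minimal sketch, assuming the 16 ordered samples are drawn with replacement and are equally likely; the variable names are illustrative only.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [18, 20, 22, 24]
mu = mean(population)            # population mean
sigma2 = pvariance(population)   # population variance (divisor N)

# All ordered samples of size 2 drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(population, repeat=2))

avg_mean = mean(mean(s) for s in samples)
avg_var_n_minus_1 = mean(variance(s) for s in samples)   # divisor n - 1
avg_var_n = mean(pvariance(s) for s in samples)          # divisor n

print("mu =", mu, "sigma^2 =", sigma2)
print("averages:", avg_mean, avg_var_n_minus_1, avg_var_n)
```

The first two averages should agree with μ and σ², while the divisor-n average should come out at half of σ² for n = 2.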
Sampling Distributions
• Unknown parameters are estimated using sample observations.
• Parameter values are fixed.
• Values of a statistic vary from sample to sample.
• Each sample has some probability of being chosen.
• Each value of a statistic is associated with a probability.
• Thus, a statistic is a random variable.
• The distribution of a statistic is called its sampling distribution.
• The distribution of a statistic may not be the same as the distribution of the population.



Sampling Distribution of Mean (or Distribution of Sample Mean)

• Consider the previous example again.


• Histogram of population units
[Figure: histogram of the population units 18, 20, 22 and 24, each with relative frequency 0.25.]
• Each value occurs exactly once.
• The population distribution is a discrete uniform distribution.
Samples                                   Sample mean   Frequency   Probability = relative frequency
(18, 18)                                  18            1           1/16
(20, 18), (18, 20)                        19            2           2/16
(22, 18), (18, 22), (20, 20)              20            3           3/16
(24, 18), (18, 24), (20, 22), (22, 20)    21            4           4/16
(20, 24), (24, 20), (22, 22)              22            3           3/16
(22, 24), (24, 22)                        23            2           2/16
(24, 24)                                  24            1           1/16
Total                                                   16          1

[Figure: histogram of the sample mean over the values 18 to 24, with heights rising from 1/16 at 18 to 4/16 at 21 and falling back to 1/16 at 24; the distribution is no longer uniform.]
• The value of x̄ depends on the chosen sample.
• Each sample is chosen with a certain probability.
• So, each possible value of x̄ is associated with some probability.
• The distribution of x̄ is the list of all its possible values along with the corresponding probabilities.
Sample Mean   18     19     20     21     22     23     24
Probability   1/16   2/16   3/16   4/16   3/16   2/16   1/16

• Thus, x̄ is a random variable.
• The mean and variance of x̄ can be calculated using the probability distribution of x̄.
• Clearly, E(x̄) = 21 = μ and Var(x̄) = 2.5 = σ²/n.
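The probability distribution of the sample mean, and its mean and variance, can be tabulated mechanically. This is a small sketch under the same assumption as the example (samples of size 2 drawn with replacement, all 16 equally likely); exact fractions are used, so probabilities print in lowest terms (2/16 appears as 1/8).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]
n = 2

# Count how often each sample mean occurs among the equally likely samples.
samples = list(product(population, repeat=n))
counts = Counter(Fraction(sum(s), n) for s in samples)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Mean and variance of the sample mean, computed from its distribution.
expected = sum(m * p for m, p in dist.items())
var = sum((m - expected) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(m, p)
print("E(x-bar) =", expected, " Var(x-bar) =", var)
```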
• We saw in the previous example that E(x̄) = μ and Var(x̄) = σ²/n.
• This is always true and can be proved as below:
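A sketch of the standard argument, assuming the sample units x_1, ..., x_n are drawn independently (for example, with replacement) from a population with mean μ and variance σ²:

```latex
\begin{aligned}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(x_i)
            = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{x}) &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
            = \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(x_i)
            = \frac{1}{n^{2}}\, n\sigma^{2} = \frac{\sigma^{2}}{n}.
\end{aligned}
```

The second line uses independence: the variance of a sum of independent variables is the sum of their variances.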

• The square root of a variance is generally called the standard deviation.
• For a statistic such as x̄, we shall call it the standard error.
• Different samples of the same size from the same population yield different sample means.
• The standard error of x̄, SE(x̄) = σ/√n, measures the variability among the different values of the sample mean.
Central Limit Theorem
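• When the population distribution is N(μ, σ), where σ is the standard deviation,
• then x̄ ~ N(μ, σ/√n) for any sample size n.
• When the population distribution is not normal,
• then also x̄ ~ N(μ, σ/√n) approximately, provided n is large (n → ∞).
• Practically, this result holds for n ≥ 30.
• The result may also be written as Z = (x̄ − μ)/(σ/√n) ~ N(0,1).
• Clearly, this result is valid when
• the sample comes from a normal population, or
• the sample size is large (n ≥ 30).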
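The theorem can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original slides: it draws from an exponential population with rate 1 (so μ = 1 and σ = 1), uses 5,000 repetitions per sample size, and fixes the seed for repeatability. It compares the mean and standard deviation of the simulated sample means with μ and σ/√n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0     # exponential with rate 1: mean 1, s.d. 1 (illustrative choice)
reps = 5000              # number of samples drawn for each sample size

results = {}
for n in (2, 5, 30):
    # Draw `reps` samples of size n and record each sample mean.
    sample_means = [sum(rng.expovariate(1.0) for _ in range(n)) / n
                    for _ in range(reps)]
    results[n] = (mean(sample_means), pstdev(sample_means))
    print(f"n={n:2d}  mean of x-bar={results[n][0]:.3f}  "
          f"s.d. of x-bar={results[n][1]:.3f}  sigma/sqrt(n)={sigma / sqrt(n):.3f}")
```

As n grows, the simulated standard deviation should track σ/√n, and a histogram of `sample_means` would look increasingly bell-shaped even though the population is strongly skewed.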



[Figure: 1,800 randomly selected values from an exponential distribution, with histograms of the distribution of the sample mean x̄ for n = 2, n = 5 and n = 30.]



[Figure: histogram (frequency against X) of 1,800 randomly selected values from a uniform distribution, with histograms of the distribution of the sample mean x̄ for n = 2, n = 5 and n = 30.]
• Example:
• Suppose a population has mean μ = 8 and standard deviation σ = 3.
• Suppose a random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.75 and 8.25?

• The sample may not have come from a normal population,
• but the sample size is large (n = 36).
• Therefore, as per the central limit theorem,
• x̄ ~ N(8, 3/√36), or x̄ ~ N(8, 0.5) approximately.
• Using Excel,
• = NORM.DIST(8.25,8,0.5,1)-NORM.DIST(7.75,8,0.5,1) ≈ 0.3829
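The same probability can be computed outside Excel. A minimal sketch using Python's standard library, assuming the normal approximation from the central limit theorem with standard error σ/√n = 3/√36 = 0.5:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / sqrt(n)                  # standard error of the sample mean: 0.5
xbar = NormalDist(mu=mu, sigma=se)    # approximate distribution of the sample mean

# P(7.75 < x-bar < 8.25) as a difference of two cumulative probabilities
prob = xbar.cdf(8.25) - xbar.cdf(7.75)
print(round(prob, 4))  # 0.3829
```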



Sampling Distribution of Proportion (or Distribution of Sample Proportion)

• Let us consider that the population is divided into two mutually exclusive and
collectively exhaustive classes.
• One class possesses a particular attribute,
• Other class does not possess that attribute.
• For example, people in a city could be divided into “Smokers” and “Non-smokers”.
• Let
• N = population size
• X = number of people out of N possessing a particular attribute
• π = X/N = actual proportion of people possessing the attribute
• Let a sample be selected from this population.
• n = sample size
• x = number of people in the sample possessing the attribute
• p = x/n = sample proportion
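• X and π are population parameters.
• x and p are sample statistics.
• p provides an estimate of π.
• Note that x ~ Binomial(n, π), so E(x) = nπ and Var(x) = nπ(1 − π).
• This implies that E(p) = π and Var(p) = π(1 − π)/n.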
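The mean and variance of the sample proportion can be checked numerically from the binomial distribution. This is a small sketch, assuming x follows a binomial distribution with parameters n and π; the values n = 10 and π = 0.3 are hypothetical and chosen only for illustration.

```python
from math import comb

n, pi = 10, 0.3   # hypothetical values, for illustration only

# Binomial probabilities P(x = k) for k = 0, ..., n
pmf = [comb(n, k) * pi**k * (1 - pi)**(n - k) for k in range(n + 1)]

# Mean and variance of p = x/n computed directly from the distribution
mean_p = sum((k / n) * pmf[k] for k in range(n + 1))
var_p = sum((k / n - mean_p) ** 2 * pmf[k] for k in range(n + 1))

print(mean_p, "vs", pi)                  # E(p) against pi
print(var_p, "vs", pi * (1 - pi) / n)    # Var(p) against pi(1 - pi)/n
```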



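• When the sample size is large enough, the binomial distribution approaches the normal distribution.
• So, for large n,
• x ~ N(nπ, √(nπ(1 − π))), or
• (x − nπ)/√(nπ(1 − π)) ~ N(0,1), or
• (p − π)/√(π(1 − π)/n) ~ N(0,1).
• This is a particular case of the central limit theorem.
• Practically, this result holds for large n (a common rule of thumb is n ≥ 30),
• or when nπ as well as n(1 − π) is sufficiently large (a common rule of thumb is at least 5).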
• Example:
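• Suppose the true proportion of voters who support ABC party is 0.4.
• What is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
• π = 0.4, 1 − π = 1 − 0.4 = 0.6
• n = 200.
• P[0.40 < p < 0.45] = ?
• We can use the following result:
• (p − π)/√(π(1 − π)/n) ~ N(0,1), with √(π(1 − π)/n) = √(0.4 × 0.6/200) ≈ 0.03464.
• P[0.40 < p < 0.45] = P[(0.40 − 0.4)/0.03464 < Z < (0.45 − 0.4)/0.03464] = P[0 < Z < 1.4434]
• = NORM.S.DIST(1.4434,1)-0.5 = 0.4255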
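The calculation can be reproduced step by step in Python. A minimal sketch, assuming the normal approximation to the distribution of the sample proportion with no continuity correction, as in the example:

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.4, 200
se = sqrt(pi * (1 - pi) / n)    # standard error of p, about 0.0346

# Standardise both limits of the interval 0.40 < p < 0.45
z_low = (0.40 - pi) / se        # 0
z_high = (0.45 - pi) / se       # about 1.4434

std_normal = NormalDist()       # N(0, 1)
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(round(z_high, 4), round(prob, 4))
```

The result should agree with the Excel figure above to about four decimal places.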
