INTRODUCTION TO SAMPLING DISTRIBUTION
Obtain k different random samples, each of size n, from the designated population distribution.
For each sample, calculate the value of the statistic, then construct a histogram of the k
calculated values. This histogram gives the approximate sampling distribution of the statistic.
The larger the value of k, the better the approximation tends to be (the actual sampling
distribution emerges as $k \to \infty$); in practice, k = 500 or 1000 is used.
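The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the exponential population with mean 1, the sample size n = 30, the choice k = 1000, and the helper name `simulate_sampling_distribution` are all assumptions made for the example, not part of the original slides.

```python
import random
import statistics


def simulate_sampling_distribution(draw, statistic, n, k, seed=0):
    """Return k values of `statistic`, each computed on a fresh random sample of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [statistic([draw(rng) for _ in range(n)]) for _ in range(k)]


# Assumed population: exponential distribution with mean 1 (rate 1.0).
means = simulate_sampling_distribution(
    draw=lambda rng: rng.expovariate(1.0),
    statistic=statistics.fmean,
    n=30,
    k=1000,
)

print("mean of the k sample means:", round(statistics.fmean(means), 3))
print("std. dev. of the k sample means:", round(statistics.pstdev(means), 3))

# Crude text histogram of the k calculated values (bin width 0.1).
bins = {}
for m in means:
    left_edge = int(m * 10) / 10
    bins[left_edge] = bins.get(left_edge, 0) + 1
for left_edge in sorted(bins):
    print(f"{left_edge:4.1f} | {'#' * (bins[left_edge] // 5)}")
```

Any other statistic (for example `statistics.median`) can be passed in place of `statistics.fmean` to approximate its sampling distribution in the same way.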
Example Q1: Suppose a small finite population consists of only N=8 numbers:
54 55 59 63 64 68 69 70
The shape of the distribution of this population data can be seen from a histogram.
[Figure: Histogram (Frequency axis).]
[Figure: histogram with X from 0 to 10 and Frequency from 0 to 450.]
[Figure: histogram with x from 0.00 to 4.00 and Frequency from 0 to 9.]
8/15/2023 DEVELOPED BY PROF. U.K. BHATTACHARYA, IIM INDORE
[Figure: Means of 60 Samples (n = 5) from an Exponential Distribution. Histogram with x from 0.00 to 4.00 and Frequency from 0 to 10.]
[Figure: Means of 60 Samples (n = 30) from an Exponential Distribution. Histogram with x from 0.00 to 3.00 and Frequency from 0 to 16.]
[Figure: histogram with X from 0.0 to 5.0 and Frequency from 0 to 250.]
[Figure: histogram with x from 1.00 to 4.25 and Frequency from 0 to 10.]
[Figure: Means of 60 Samples (n = 5) from a Uniform Distribution. Histogram with x from 1.00 to 4.25 and Frequency from 0 to 12.]
[Figure: Means of 60 Samples (n = 30) from a Uniform Distribution. Histogram with x from 1.00 to 4.25 and Frequency from 0 to 25.]
CENTRAL LIMIT THEOREM
If samples of size n are drawn randomly from a population that has a mean $\mu$
and a standard deviation $\sigma$, the sample means, $\bar{x}$, are approximately normally
distributed for sufficiently large sample sizes ($n \ge 30$), regardless of the shape of
the population distribution.
If the population is normally distributed, the sample means are normally
distributed for any sample size.
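Also, the mean of the sampling distribution is $\mu_{\bar{x}} = \mu$, and the standard deviation
of the sample means (called the standard error of the mean) is the standard
deviation of the population divided by the square root of the sample size:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
For a finite population, the standard error includes the finite population multiplier:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
where N = size of the population and n = size of the sample.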
When the population is small relative to the sample (that is, the sample makes up a
sizable fraction of the population), the finite population multiplier reduces the
standard error. Any decrease in the standard error increases the precision with
which the sample mean can be used to estimate the population mean.
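As a rough check of the finite population multiplier, the sketch below reuses the N = 8 population from Example Q1 and compares the formula with an exhaustive enumeration of every sample drawn without replacement. The sample size n = 2 is an assumption chosen only to keep the enumeration small; the slides do not state which n was used.

```python
import itertools
import math
import statistics

population = [54, 55, 59, 63, 64, 68, 69, 70]  # population from Example Q1
N = len(population)
n = 2  # assumed sample size, for illustration only

sigma = statistics.pstdev(population)  # population standard deviation

# Standard error without and with the finite population multiplier.
se_infinite = sigma / math.sqrt(n)
fpc = math.sqrt((N - n) / (N - 1))
se_finite = se_infinite * fpc

# Exact check: list every possible sample of size n drawn without replacement
# and take the standard deviation of the resulting sample means.
sample_means = [statistics.fmean(s) for s in itertools.combinations(population, n)]
se_enumerated = statistics.pstdev(sample_means)

print(f"population mean      = {statistics.fmean(population):.4f}")
print(f"mean of sample means = {statistics.fmean(sample_means):.4f}")
print(f"sigma / sqrt(n)      = {se_infinite:.4f}")
print(f"with multiplier      = {se_finite:.4f}")
print(f"from enumeration     = {se_enumerated:.4f}")
```

The value obtained with the multiplier should agree with the enumerated value, and both are smaller than $\sigma / \sqrt{n}$, which is the reduction in standard error described above.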
According to Nielsen Media Research, the average number of hours of TV viewing per
household per week in the United States is 50.4 hours. Suppose the standard deviation is 11.8
hours and a random sample of 42 is taken.
a. What is the probability that the sample average is more than 52 hours?
b. What is the probability that the sample average is less than 47.5 hours?
c. What is the probability that the sample average is less than 40 hours? If the sample
average actually is less than 40 hours, what would it mean in terms of the Nielsen Media
Research figures?
d. Suppose the population standard deviation is unknown. If 71% of all sample means are
greater than 49 hours and the population mean is still 50.4 hours, what is the value of the
population standard deviation?
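One possible way to work parts a to d numerically is sketched below, assuming the central limit theorem applies (n = 42 is above 30) so that the sample mean is treated as normal with mean 50.4 and standard error $11.8 / \sqrt{42}$. The variable names are illustrative, and the code uses exact normal probabilities, so results may differ slightly from values read off a rounded z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.4, 11.8, 42
se = sigma / sqrt(n)            # standard error of the mean
xbar_dist = NormalDist(mu, se)  # approximate distribution of the sample mean

p_a = 1 - xbar_dist.cdf(52)   # (a) P(sample mean > 52)
p_b = xbar_dist.cdf(47.5)     # (b) P(sample mean < 47.5)
p_c = xbar_dist.cdf(40)       # (c) P(sample mean < 40)

# (d) P(sample mean > 49) = 0.71 means P(sample mean < 49) = 0.29.
# Solve z = (49 - mu) / (sigma_d / sqrt(n)) for the unknown sigma_d.
z_d = NormalDist().inv_cdf(0.29)
sigma_d = (49 - mu) / z_d * sqrt(n)

print(f"standard error = {se:.4f}")
print(f"(a) {p_a:.4f}")
print(f"(b) {p_b:.4f}")
print(f"(c) {p_c:.2e}")
print(f"(d) z = {z_d:.4f}, sigma = {sigma_d:.2f}")
```

Under these assumptions the probabilities come out to roughly 0.19 for part a and 0.056 for part b, and the standard deviation in part d is about 16.4 hours. The probability in part c is essentially zero, so a sample average below 40 hours would suggest that the stated mean of 50.4 hours (or the assumed standard deviation) does not describe the sampled population.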
If the research produces measurable data such as weight, distance, time and income, the
sample mean is often the statistic of choice. However, if research results in countable items,
such as how many people in the sample have a flexible work schedule, the sample
proportion is the statistic of choice.
Sample proportion is computed by dividing the frequency with which a given
characteristic occurs in a sample by the number of items in the sample.
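$$\hat{p} = \frac{x}{n}$$
where x = number of items in the sample that have the characteristic and n = number of items in the sample.
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p \cdot q}{n}}}$$
where p = population proportion and q = 1 - p.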
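The two formulas for the sample proportion and its z value can be tried out with a short sketch. The numbers used here (p = 0.60, n = 120, x = 60) are hypothetical and not taken from the slides, the helper name `proportion_z` is illustrative, and q is taken to be 1 - p.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(x, n, p):
    """Return the sample proportion x/n and its z value under population proportion p."""
    p_hat = x / n
    q = 1 - p
    z = (p_hat - p) / sqrt(p * q / n)
    return p_hat, z


# Hypothetical figures: 60 of 120 sampled people have the characteristic,
# and the population proportion is assumed to be 0.60.
p_hat, z = proportion_z(x=60, n=120, p=0.60)
prob = NormalDist().cdf(z)  # P(sample proportion <= p_hat), normal approximation

print(f"p_hat = {p_hat:.3f}")
print(f"z = {z:.4f}")
print(f"P(sample proportion <= {p_hat:.2f}) = {prob:.4f}")
```

The normal approximation behind the z value is usually considered reasonable only when both n·p and n·q are comfortably large (a common rule of thumb is greater than 5).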