LECTURE 5
SAMPLING METHODS
&
SAMPLING DISTRIBUTIONS
5-2
SAMPLING
Some terms:
• Census: conducting a survey to collect data for the
entire population
• Sampling: selecting a sample from the population
Why is sampling necessary?
• Cost
• Practicality
Statistical inference allows us to draw
conclusions about the population based on sample data
5-3
SAMPLING WITH/WITHOUT REPLACEMENT
5-4
SAMPLING METHODS
5-5
SAMPLING METHODS
5-6
SOME IMPORTANT PROBABILITY
SAMPLING METHODS
5-7
SIMPLE RANDOM SAMPLING
Selecting a sample in such a way that every possible
sample of the same size is equally likely to be
chosen.
Example: Suppose our lecture class is the
population. We are going to select a simple random
sample (SRS) of 20 students from the population.
5-8
TAKE A SRS IN R
sample(x, size, replace = FALSE)
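As a minimal sketch of the class example, assume a hypothetical roster of 120 student IDs (the lecture does not give the class size). `sample()` then draws a simple random sample of 20 without replacement. The seed is fixed only so the draw can be reproduced.

```r
# Hypothetical population: student IDs 1..120 (class size is an assumption)
set.seed(5)
students <- 1:120

# Simple random sample of 20 students, without replacement
srs <- sample(students, size = 20, replace = FALSE)
srs

# With replace = TRUE the same student could be selected more than once
srs_with_replacement <- sample(students, size = 20, replace = TRUE)
```

Because `replace = FALSE` is the default, `sample(students, 20)` gives the same kind of draw.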
5-9
STRATIFIED RANDOM SAMPLING
Example:
5 - 10
OTHER TYPES OF SAMPLING
5 - 11
STATISTICAL METHODS
Statistical Methods
• Descriptive Statistics
• Inferential Statistics
5 - 12
IMPORTANCE OF SAMPLING DISTRIBUTION
5 - 13
REPEATED SAMPLING
Example
We wish to estimate the population mean
Select a random sample
Find the sample mean (e.g. x̄ = 20) and use it as an
estimate
If other people select different samples and find
markedly different sample means,
would we trust our estimate?
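The question above can be explored with a short simulation. This is only a sketch: the population (10,000 normal values with mean 20 and SD 5) and the sample size of 25 are assumptions chosen for illustration, not values from the lecture.

```r
# Hypothetical population of 10,000 values (an assumption for illustration)
set.seed(1)
population <- rnorm(10000, mean = 20, sd = 5)

# Five different people each draw their own random sample of size 25
sample_means <- replicate(5, mean(sample(population, size = 25)))

round(sample_means, 2)   # five different estimates of the same population mean
mean(population)         # the value they are all trying to estimate
```

Each run of `sample()` gives a different sample mean, and how much those means vary is what the sampling distribution describes.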
5 - 14
REPEATED SAMPLING
5 - 15
EXAMPLE OF SAMPLING DISTRIBUTION
5 - 16
EXAMPLE OF SAMPLING DISTRIBUTION
5 - 17
SAMPLING DISTRIBUTION OF SAMPLE MEANS
5 - 18
QUESTIONS
5 - 19
EXAMPLE OF SAMPLING DISTRIBUTION
OF VARIANCE
5 - 20
IN GENERAL
5 - 21
ACTIVITY: EXPLORING SAMPLING DISTRIBUTIONS
VIA SIMULATION
5 - 22
ACTIVITY
5 - 23
OBSERVATIONS
Let’s write down our observations:
Many sampling distributions (for each n)
Shape of sampling distribution
Mean of sampling distribution (and compare it
with mean of population)
SD of sampling distribution (and compare it with
SD of population)
The difference between sampling distribution
and population
5 - 24
ACTIVITY
Now let’s choose a different population (a non-normal
population) provided by the website
Repeat what we have done
Write down our observations
When does the sampling distribution become
approximately normal?
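The same experiment can be sketched in R instead of on the website. The population below (exponential with rate 1), the sample sizes 2, 5 and 30, and the helper names `simulate_means` and `skewness` are all assumptions made for this illustration.

```r
# Assumed non-normal population: exponential with rate 1 (mean 1, SD 1)
set.seed(42)
population <- rexp(10000, rate = 1)

# Draw many samples of size n and return their sample means
simulate_means <- function(n, reps = 2000) {
  replicate(reps, mean(sample(population, size = n)))
}

# Simple moment-based skewness (close to 0 for a symmetric distribution)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# One column per sample size: mean, SD and skewness of the sample means
results <- sapply(c(2, 5, 30), function(n) {
  m <- simulate_means(n)
  c(n = n, mean = mean(m), sd = sd(m), skewness = skewness(m))
})
round(t(results), 3)

# For comparison: population mean, and population SD divided by sqrt(n)
mean(population)
sd(population) / sqrt(c(2, 5, 30))
```

In runs like this the mean of the sample means stays close to the population mean, their SD shrinks roughly like the population SD divided by √n, and the skewness moves toward 0 as n grows.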
5 - 25
ACTIVITY
Now we should
Clearly distinguish between population and sampling
distribution.
Homework: experiment with other
populations on the website to deepen your
understanding of sampling distributions.
Question: Is there a sampling distribution of
another statistic?
5 - 26
THEOREM I
If a random sample is selected from a normal
population, the sampling distribution of the sample
mean is normal.
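In symbols, with µ and σ denoting the population mean and standard deviation and n the sample size (notation assumed here to match the later slides), Theorem I can be written as:

```latex
X \sim N(\mu, \sigma^2)
\quad\Longrightarrow\quad
\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)
\quad \text{for any sample size } n
```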
5 - 27
THEOREM II: CENTRAL LIMIT THEOREM
5 - 28
THEOREM II: CENTRAL LIMIT THEOREM
Practical guideline:
• If the population is nearly normal, then a sample of size n
= 5 will probably be large enough to ensure that X̄ is
approximately normal.
• If the population is symmetric, then a sample of size n =
20 to 25 is enough for the Central Limit Theorem (CLT) to
hold.
• For most moderately skewed distributions, a sample size
of around 30 is traditionally thought to be sufficiently
large for the CLT to hold. This is a rule of thumb but this
is not a definitive number.
• For very skewed distributions or distributions with
outliers, the sample size required for the CLT to hold may
be much larger than 30.
5 - 29
PROPERTIES OF SAMPLING DISTRIBUTION OF
MEAN
5 - 30
SAMPLING ERROR
Difference between a sample statistic and the
corresponding population parameter
Important when making inferences about the
population
5 - 31
STANDARD ERROR OF MEAN
SD of the sample means
Represents (approx.) the average deviation of the sample
means from the center
The center = population mean
Represents (approx.) the average error when using the
sample mean to estimate the population mean
This is called the standard error of the mean:
$$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$
(if n/N ≤ 0.05)
5 - 32
FINITE POPULATION CORRECTION FACTOR
In cases where n/N > 0.05, the standard error
of the mean is:
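A sketch of the usual textbook form of the finite population correction, where N is the population size, n the sample size, and σ_X the population standard deviation (assumed to be the form intended on this slide):

```latex
\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}
```

When n is small relative to N the correction factor is close to 1, which is why it is dropped when n/N ≤ 0.05.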
5 - 33
FINDING PROBABILITY OF SAMPLE MEAN
First, check that the sampling distribution of
sample mean is normal or nearly so
If so, convert to Z to find probability:
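The conversion used in the solution slides, written with µ for the population mean, σ for the population standard deviation (σ_X on the standard error slide) and n for the sample size, and assuming n/N ≤ 0.05 so no finite population correction is needed:

```latex
Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}
  = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
```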
5 - 34
EXERCISE 1
You’re an operations analyst for AT&T. Long-distance
telephone calls are normally distributed with
µ = 8 min. and σ = 2 min. If you select a
random sample of 25 calls, what is the probability that
the sample mean would be between 7.8 and 8.2 minutes?
5 - 35
SOLUTION
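$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{2/\sqrt{25}} = -.50$$
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{8.2 - 8}{2/\sqrt{25}} = .50$$
Sampling distribution: $\sigma_{\bar{X}} = .4$
Standard normal distribution: $\sigma = 1$
$P(7.8 < \bar{X} < 8.2) = P(-.50 < Z < .50) = .3830$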
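The Exercise 1 numbers (µ = 8, σ = 2, n = 25) can be checked in R with `pnorm()`, the standard normal cumulative distribution function. This is a sketch; the variable names are chosen here for illustration.

```r
mu    <- 8    # population mean (minutes)
sigma <- 2    # population SD (minutes)
n     <- 25   # sample size

se     <- sigma / sqrt(n)     # standard error of the mean
z_low  <- (7.8 - mu) / se     # lower limit converted to Z
z_high <- (8.2 - mu) / se     # upper limit converted to Z

# P(7.8 < sample mean < 8.2) = P(z_low < Z < z_high)
p <- pnorm(z_high) - pnorm(z_low)
round(p, 4)
```

The result agrees with the table value .3830 to about three decimal places; the small difference comes from rounding in the printed normal table.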
5 - 37
SOLUTION
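$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{2/\sqrt{30}} = -.55$$
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{8.2 - 8}{2/\sqrt{30}} = .55$$
Sampling distribution: $\sigma_{\bar{X}} = .365$
Standard normal distribution: $\sigma = 1$
$P(7.8 < \bar{X} < 8.2) = P(-.55 < Z < .55) = .4176$
5 - 38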
CONCLUSION
Sampling methods
The importance of sampling distribution
5 - 39