Sampling
Learning Objectives
Determine when to use sampling instead of a census.
Distinguish between random and nonrandom
sampling.
Decide when and how to use various sampling
techniques.
Be aware of the different types of errors that can
occur in a study.
Reasons for Sampling
Sampling – a means of gathering useful information
about a population
Information is gathered from the sample, and conclusions
are drawn about the population
Sampling has advantages over a census
Sampling can save money.
Sampling can save time.
For given resources, sampling can broaden the scope of the study.
Because the research process is sometimes destructive, sampling
can save product.
If accessing the whole population is impossible, the sample is
the only option.
Reasons for Taking a Census
Eliminate the possibility that a random
sample is not representative of the population.
The person authorizing the study is
uncomfortable with sample information.
Sampling Frame
Every research study has a target population that consists of the
individuals, institutions, or entities that are the objects of
investigation.
The sample is taken from a population list, map, directory, or other
source used to represent the population, called the sampling
frame.
Sampling is done from the frame, not from the target population.
Ideally, a one-to-one correspondence exists between frame units and
population units.
Frames may have overregistration (units that are not in the target
population) or underregistration (target population units that are missing).
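To make overregistration and underregistration concrete, the sketch below compares a frame with a target population using set operations. The unit identifiers are hypothetical and chosen only for illustration; in practice the full target population is rarely available as a list.

```python
# Hypothetical unit identifiers; none of these come from the slides.
target_population = {"A", "B", "C", "D", "E"}
frame = {"B", "C", "D", "E", "F", "G"}

# Overregistration: units listed in the frame but not in the target population.
overregistered = frame - target_population

# Underregistration: target population units missing from the frame.
underregistered = target_population - frame

# Share of the target population that the frame actually covers.
coverage = len(frame & target_population) / len(target_population)

print(sorted(overregistered), sorted(underregistered), coverage)
```

A frame with one-to-one correspondence would leave both difference sets empty.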
Random Versus Nonrandom Sampling
Nonrandom sampling (nonprobability
sampling) - Not every unit of the population
has the same probability of being
included in the sample.
Random sampling (probability sampling) -
Every unit of the population has the same
probability of being included in the sample.
Random Sampling Techniques
Simple Random Sample – basis for other
random sampling techniques
Each unit of the frame is numbered from 1 to N
A random number generator can be used to select
n items from the frame
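A minimal sketch of the random-number-generator approach, assuming the frame units are numbered 1 to N. The function name `simple_random_sample` and the `seed` parameter are illustrative choices, not part of the slides.

```python
import random


def simple_random_sample(N, n, seed=None):
    """Draw n distinct unit numbers from a frame numbered 1..N."""
    rng = random.Random(seed)  # seed only makes the draw reproducible
    # random.sample draws without replacement, so no unit repeats.
    return sorted(rng.sample(range(1, N + 1), n))


sample = simple_random_sample(N=30, n=6, seed=42)
print(sample)
```

Every possible set of n units is equally likely to be drawn, which is what makes this the basis for the other random techniques.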
Random Sampling Techniques
Stratified Random Sample
Proportionate (% of the sample taken from each
stratum is proportionate to the % that each stratum
is within the whole population)
Disproportionate (when the % of the sample taken
from each stratum is not proportionate to the %
that each stratum is within the whole population)
Systematic Random Sample
Cluster (or Area) Sampling
Simple Random Sample:
Sample Members
01 Alaska Airlines    11 DuPont              21 Lucent
02 Alcoa              12 Exxon Mobil         22 Mattel
03 Ashland            13 General Dynamics    23 Mead
04 Bank of America    14 General Electric    24 Microsoft
05 BellSouth          15 General Mills       25 Occidental Petroleum
06 Chevron            16 Halliburton         26 JCPenney
07 Citigroup          17 IBM                 27 Procter & Gamble
08 Clorox             18 Kellogg             28 Ryder
09 Delta Air Lines    19 KMart               29 Sears
10 Disney             20 Lowe’s              30 Time Warner
N = 30, n = 6
Simple Random Sampling:
Random Number Table
9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8
5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6
8 0 8 8 0 6 3 1 7 1 4 2 8 7 7 6 6 8 3 5 6 0 5 1 5 7 0 2 9 6 5 0 0 2 6 4 5 5 8 7
8 6 4 2 0 4 0 8 5 3 5 3 7 9 8 8 9 4 5 4 6 8 1 3 0 9 1 2 5 3 8 8 1 0 4 7 4 3 1 9
6 0 0 9 7 8 6 4 3 6 0 1 8 6 9 4 7 7 5 8 8 9 5 3 5 9 9 4 0 0 4 8 2 6 8 3 0 6 0 6
5 2 5 8 7 7 1 9 6 5 8 5 4 5 3 4 6 8 3 4 0 0 9 9 1 9 9 7 2 9 7 6 9 4 8 1 5 9 4 1
8 9 1 5 5 9 0 5 5 3 9 0 6 8 9 4 8 6 3 7 0 7 9 5 5 4 7 0 6 2 7 1 1 8 2 6 4 4 9 3
N = 30
n=6
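The table can be read by hand as follows: take the digits two at a time (N = 30 has two digits), skip 00, skip values above 30, skip repeats, and stop once n = 6 units are selected. The sketch below does this for the first two rows. It assumes the first row begins 9 9 4 3 7 8, so that every row holds 40 digits; `read_random_number_table` is an illustrative helper name.

```python
# First two rows of the random number table (first row assumed to start 9 9 4 3 7 8).
ROWS = [
    "9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8",
    "5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6",
]


def read_random_number_table(rows, N, n):
    """Select n distinct unit numbers in 1..N by reading fixed-width digit groups."""
    digits = "".join("".join(row.split()) for row in rows)
    width = len(str(N))  # two digits per number when N = 30
    selected = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # Skip 00, numbers beyond the frame, and units already selected.
        if 1 <= number <= N and number not in selected:
            selected.append(number)
            if len(selected) == n:
                break
    return selected


selected = read_random_number_table(ROWS, N=30, n=6)
print(selected)  # [16, 18, 1, 27, 8, 15]
```

In the numbered list above, these correspond to Halliburton, Kellogg, Alaska Airlines, Procter & Gamble, Clorox, and General Mills.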
Stratified Random Sample
Stratified random sampling – the population is
divided into non-overlapping subpopulations
called strata
The researcher extracts a simple random sample from
each stratum
Stratified random sampling has the potential for
reducing error
Stratified Random Sample
Sampling error – occurs when a sample does not represent
the population
Stratified random sampling has the potential to
match the sample closely to the population
Stratified sampling is more costly than simple random sampling
Each stratum should be relatively homogeneous; strata are often
formed on variables such as race, gender, or religion
Stratified Random Sample
Proportionate -- the percentage of the sample
taken from each stratum is proportionate to
the percentage that each stratum is within the
population
Disproportionate -- proportions of the strata
within the sample are different than the
proportions of the strata within the population
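Proportionate allocation can be sketched as below. The strata (age groups) and their sizes are hypothetical, and the largest-remainder rounding is one reasonable way, assumed here, to make the stratum sample sizes add up to exactly n.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    exact = {name: n * size / total for name, size in stratum_sizes.items()}
    # Start from the whole-number part of each stratum's exact share.
    allocation = {name: int(share) for name, share in exact.items()}
    # Hand any leftover units to the strata with the largest fractional parts.
    leftover = n - sum(allocation.values())
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - allocation[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical strata: age groups and their population counts.
strata = {"20-30": 3000, "30-40": 5000, "40-50": 2000}
allocation = proportionate_allocation(strata, n=100)
print(allocation)  # {'20-30': 30, '30-40': 50, '40-50': 20}
```

A simple random sample of the allocated size would then be drawn within each stratum. A disproportionate design would replace the computed shares with sizes chosen by the researcher.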
Systematic Sampling
Used because of its convenience and ease of administration
Population elements are an ordered sequence (at least, conceptually).
With systematic sampling, every kth item is selected to produce a sample of
size n from a population of size N
$k = \frac{N}{n}$
where:
n = sample size
N = population size
k = size of selection interval
Systematic Sampling
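The first sample element is selected randomly from the first k
population elements.
Thereafter, sample elements are selected at a
constant interval, k, from the ordered sequence
frame.
Advantages of systematic sampling
The sample is evenly distributed across the
frame
It is easy to determine whether the sampling plan has been followed
Caveat: systematic sampling is based on the assumption that the
source of population elements is random; a periodic pattern in the
frame that lines up with k would bias the sample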
Systematic Sampling: Example
Purchase orders for the previous fiscal year
are serialized 1 to 10,000 (N = 10,000).
A sample of fifty (n = 50) purchase orders is
needed for an audit.
k = 10,000/50 = 200
Systematic Sampling: Example
The first sample element is randomly selected from
the first 200 purchase orders. Assume the
45th purchase order was selected.
Subsequent sample elements: 245, 445,
645, . . .
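The purchase-order example can be reproduced with a short sketch. The function name `systematic_sample` and its `start` and `seed` parameters are illustrative; `start` is fixed at 45 here to match the slide, and it is drawn at random from 1..k when omitted.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Select every kth unit from a frame numbered 1..N, where k = N // n."""
    k = N // n  # size of the selection interval
    if start is None:
        # Random start among the first k units of the frame.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]


orders = systematic_sample(N=10_000, n=50, start=45)
print(orders[:4], orders[-1])  # [45, 245, 445, 645] 9845
```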
Cluster Sampling
Cluster sampling – involves dividing the
population into non-overlapping areas, or clusters
It identifies clusters that tend to be internally
heterogeneous
Ideally, each cluster is a microcosm of the population
If the clusters are too large, a second set of
clusters is taken from each original cluster
This is two-stage sampling
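A rough sketch of two-stage sampling, assuming hypothetical cities as first-stage clusters and city blocks as the second set of clusters taken from each selected city. All names and sizes are made up for illustration.

```python
import random


def two_stage_cluster_sample(clusters, n_first, n_second, seed=None):
    """Pick n_first clusters at random, then n_second subclusters within each."""
    rng = random.Random(seed)
    # Stage 1: random selection of first-stage clusters.
    chosen = rng.sample(sorted(clusters), n_first)
    # Stage 2: random selection of subclusters inside each chosen cluster.
    return {name: rng.sample(clusters[name], n_second) for name in chosen}


# Hypothetical frame: 5 cities, each divided into 10 blocks.
clusters = {
    f"city_{c}": [f"city_{c}_block_{b}" for b in range(1, 11)]
    for c in "ABCDE"
}
stage_two = two_stage_cluster_sample(clusters, n_first=2, n_second=3, seed=1)
print(stage_two)
```

Only the selected blocks need to be visited, which is where the savings in travel and administration come from.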
Cluster Sampling
Advantages
More convenient for geographically dispersed
populations
Reduced travel costs to contact sample elements
Simplified administration of the survey
Usable when the unavailability of a sampling frame prohibits
other random sampling methods
Cluster Sampling
Disadvantages
Statistically less efficient when the cluster
elements are similar
Costs and problems of statistical analysis are
greater than for simple random sampling
Nonrandom Sampling
Non-Random sampling – sampling techniques
used to select elements from the population
by any mechanism that does not involve a
random selection process
These techniques are not desirable for use in
gathering data to be analyzed by inferential
statistics
Sampling error cannot be determined objectively
from these techniques
Types of nonrandom sampling
techniques
Convenience sampling: Elements of the sample are
selected for the convenience of the researcher (readily
available, nearby, willing to participate).
Judgment sampling: Elements of sample are
chosen by judgment of the researcher.
Quota sampling: Quota sets the size of samples to
be obtained from subgroups based on the
proportions of subclasses in population.
Snowball sampling: Survey subjects are selected
based on referral from other survey respondents.
Errors
Data from nonrandom samples are not appropriate for
analysis by inferential statistical methods.
Sampling error occurs when the sample is not representative
of the population
Nonsampling errors – all errors other than sampling errors
Missing data, recording, data entry, and analysis errors
Poorly conceived concepts, unclear definitions, and defective
questionnaires
Response errors, which occur when people do not know, will not say, or
overstate in their answers
Determining Sample Size when Estimating µ
It may be necessary to estimate the sample size
needed to accomplish the purpose of the
study.
In studies where µ is being estimated, the size of the
sample can be determined by using the z formula for
sample means to solve for n
The difference between x̄ and µ is the error of estimation
Error of estimation = (x̄ − µ)
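Determining Sample Size when
Estimating µ
• z formula: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• Error of estimation (tolerable error): $E = \bar{x} - \mu$
• Estimated sample size: $n = \frac{z^2 \sigma^2}{E^2} = \left( \frac{z \sigma}{E} \right)^2$
• Estimated σ: $\sigma \approx \frac{1}{4} \cdot \text{range}$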
Determining Sample Size when
Estimating p
• z formula: $z = \frac{\hat{p} - p}{\sqrt{pq / n}}$, where $q = 1 - p$
• Error of estimation (tolerable error): $E = \hat{p} - p$
• Estimated sample size: $n = \frac{z^2 pq}{E^2}$
Demonstration Problem
• A researcher wants to estimate the
average monthly expenditure on
bread by a family in Chicago. She
wants to be 90% confident in her
results and wants the estimate to be
within $1 of the actual figure. The
standard deviation of average
monthly bread purchases is $4. What
is the sample size estimate?
Sample Size When Estimating
µ: Example
$E = 1, \quad \sigma = 4$
90% confidence $\Rightarrow z = 1.645$
$n = \frac{z^2 \sigma^2}{E^2} = \frac{(1.645)^2 (4)^2}{1^2} = 43.30$, rounded up to 44
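The same calculation as a small sketch. The function name `sample_size_for_mean` is illustrative; it takes the table value z = 1.645 used above and rounds up, since a fractional sample unit is not possible.

```python
import math


def sample_size_for_mean(z, sigma, E):
    """Sample size needed to estimate a mean within E: n = (z * sigma / E)^2."""
    exact = (z * sigma / E) ** 2
    # Round up so the tolerable error is not exceeded.
    return exact, math.ceil(exact)


exact_n, n = sample_size_for_mean(z=1.645, sigma=4, E=1)
print(round(exact_n, 2), n)  # 43.3 44
```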
Demonstration Problem
• Hewitt Associates conducted a national
survey to determine the extent to which
employers are promoting health and fitness
among their employees. One of the
questions asked was, "Does your company
offer on-site exercise classes?" Suppose it
was estimated before the study that no
more than 40% of the companies would
answer "Yes."
How large a sample would Hewitt Associates
have to take in estimating the population
proportion to ensure 98% confidence in
the results and to be within .03 of the true
population proportion?
Solution for Demonstration
Problem
$E = 0.03$
98% confidence $\Rightarrow z = 2.33$
Estimated $p = 0.40$
$q = 1 - p = 0.60$
$n = \frac{z^2 pq}{E^2} = \frac{(2.33)^2 (0.40)(0.60)}{(0.03)^2} = 1{,}447.7$, rounded up to 1,448
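The solution can be checked with a short sketch. The function name `sample_size_for_proportion` is illustrative; it uses the table value z = 2.33 and the prior estimate p = 0.40 from the problem statement.

```python
import math


def sample_size_for_proportion(z, p, E):
    """Sample size needed to estimate a proportion within E: n = z^2 * p * q / E^2."""
    q = 1 - p
    exact = z ** 2 * p * q / E ** 2
    # Round up to the next whole unit.
    return exact, math.ceil(exact)


exact_n, n = sample_size_for_proportion(z=2.33, p=0.40, E=0.03)
print(round(exact_n, 1), n)  # 1447.7 1448
```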