Sources:
-EPIET Introductory course,
Thomas Grein, Denis Coulombier, Philippe Sudre, Mike Catchpole
-IDEA
Brigitte Helynck, Philippe Malfait, Institut de veille sanitaire
• Definition of sampling
• Why do we use samples?
• Concept of representativeness
• Main methods of sampling
• Sampling error
• Sample size calculation
Definition of sampling
• Sampling unit
– Subject under observation on which information is
collected
• Sampling fraction
– Ratio between the sample size and the population
size
• Sampling frame
– Any list of all the sampling units in the population
• Sampling scheme
– Method of selecting sampling units from sampling
frame
Why do we use samples?
Precision
Cost
What we need to know
• Concepts
– Representativeness
– Sampling methods
– Choice of the right design
• Calculations
– Sampling error
– Design effect
– Sample size
Sampling and representativeness
Sampling
Population
Sample
Target Population
• Non-probability samples
• Probability samples
Non-probability samples
• Quotas
• Sample reflects population structure
• Time/resources constraints
• Random sampling
• Each subject has a known probability of
being chosen
• Reduces possibility of selection bias
• Allows application of statistical theory to
results
Sampling error
• No sample is the exact mirror image of
the population
• Magnitude of error can be measured in
probability samples
• Expressed by standard error
– of mean, proportion, differences, etc
• Function of
– amount of variability in the factor of interest
– sample size
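As a rough illustration of how the standard error depends on variability and sample size, the sketch below uses the usual textbook formulas for a proportion and a mean under simple random sampling. The function names and the example values are hypothetical, not taken from the slides.

```python
import math

def se_proportion(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n subjects."""
    return math.sqrt(p * (1 - p) / n)

def se_mean(sd: float, n: int) -> float:
    """Standard error of a mean, given the standard deviation sd."""
    return sd / math.sqrt(n)

# Hypothetical example: expected proportion of 15%, growing sample sizes.
for n in (100, 400, 1600):
    print(n, round(se_proportion(0.15, n), 4))
```

Quadrupling the sample size halves the standard error, which is why precision becomes expensive quickly.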
Methods used in probability samples
Random error
Systematic error (bias)
Simple random sampling
• Principle
– Equal chance of drawing each unit
• Procedure
– Number all units
– Randomly draw units
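The two-step procedure above can be sketched with Python's standard `random` module. The population of 1200 numbered units and the sample size of 60 are illustrative assumptions, as is the helper name `simple_random_sample`.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units, each with the same chance of selection."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(list(units), n)  # sampling without replacement

# Step 1: number all units (here a hypothetical list numbered 1..1200).
units = range(1, 1201)
# Step 2: randomly draw the units.
sample = simple_random_sample(units, 60, seed=42)
print(sorted(sample)[:10])
```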
Simple random sampling
• Advantages
– Simple
– Sampling error easily measured
• Disadvantages
– Need complete list of units
– Does not always achieve best
representativeness
– Units may be scattered
Simple random sampling
Example: evaluate the prevalence of tooth
decay among the 1200 children attending a
school
• N = 1200 and n = 60 (selection by systematic sampling)
⇒ sampling interval k = N/n = 1200/60 = 20
(sampling fraction = n/N = 60/1200 = 1/20)
• List persons from 1 to 1200
• Randomly select a number between 1 and 20 (e.g. 8)
⇒ 1st person selected = the 8th on the list
⇒ 2nd person = 8 + 20 = the 28th
etc.
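The selection steps in this example (a step of N/n = 20, a random start between 1 and 20, then every 20th person) can be sketched as follows. The function name and the optional `start` argument are hypothetical. Integer division is an assumption for the case where N is not an exact multiple of n.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the 1-based list positions selected by systematic sampling."""
    # Step between selected units: k = N / n.
    k = population_size // sample_size
    if start is None:
        # Random start between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]

# The example above: N = 1200, n = 60, random start 8.
positions = systematic_sample(1200, 60, start=8)
print(positions[:3], "...", positions[-1])
```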
Systematic sampling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
46 47 48 49 50 51 52 53 54 55 ...
Systematic sampling
Example: systematic sampling
Stratified sampling
• Principle :
• Principle = consecutive samplings
• Example: sampling unit = household
[Figure labels: Section 3, Section 4, Section 5]
Cluster sampling
• Advantages
– Simple as complete list of sampling units
within population not required
– Less travel/resources required
• Disadvantages
– Imprecise if units within clusters are homogeneous:
the variance of the estimate is then greater than
under simple random sampling (large design effect)
– Sampling error difficult to measure
EPI cluster sampling
Village   Cumulative population   Clusters
A         1600                    IIII
B         1820                    I
C         5020                    IIIIIIIIII
D         5420                    I
E         6220                    II
F         6420                    I
G         7620                    IIII
H         7820                    I
I         9420                    IIIII
J         9820                    I
Then compute the sampling interval:
K = 9820 / 30 = 327
Draw a random number (between 1 and 327)
Example: 62
Start from the village including "62" and draw the clusters by repeatedly adding the sampling interval
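The drawing of 30 clusters from the cumulative population list can be sketched as below. This is a minimal illustration of selection proportional to size, not the official EPI procedure. The function name `draw_clusters` and the use of integer division for the step of 327 are assumptions.

```python
import random
from bisect import bisect_left
from collections import Counter

# Villages with their cumulative population, as listed on the slide.
VILLAGES = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
CUMULATIVE = [1600, 1820, 5020, 5420, 6220, 6420, 7620, 7820, 9420, 9820]

def draw_clusters(villages, cumulative, n_clusters=30, start=None, seed=None):
    """Count how many clusters fall in each village."""
    step = cumulative[-1] // n_clusters          # 9820 // 30 = 327
    if start is None:
        start = random.Random(seed).randint(1, step)
    counts = Counter()
    for i in range(n_clusters):
        point = start + i * step
        # First village whose cumulative population reaches the point.
        counts[villages[bisect_left(cumulative, point)]] += 1
    return counts

counts = draw_clusters(VILLAGES, CUMULATIVE, start=62)
for village in VILLAGES:
    print(village, counts[village])
```

Larger villages receive more clusters because they span more of the cumulative range. Counts near village boundaries depend on the exact start and on how the step is rounded, so they need not reproduce the slide's tally mark for mark.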
Drawing households and children
On the spot
$$\mathrm{Var}_{srs} = \frac{p(1-p)}{n} \qquad \mathrm{Var}_{clus} = \frac{\sum_{i=1}^{k} (p_i - p)^2}{k(k-1)}$$
$$\text{Design effect} = \frac{\mathrm{Var}_{clus}}{\mathrm{Var}_{srs}}$$
p = global proportion
p_i = proportion in each cluster
n = number of subjects
k = number of clusters
srs = simple random sampling
clus = cluster sampling
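The design effect formulas can be turned into a short calculation. The sketch below assumes clusters of equal size, so that the global proportion is the plain average of the cluster proportions (pass `p` explicitly otherwise). The function name and the six cluster proportions are hypothetical.

```python
def design_effect(cluster_proportions, n_subjects, p=None):
    """Ratio of the cluster-sampling variance to the SRS variance."""
    k = len(cluster_proportions)
    if p is None:
        # Assumption: equal-sized clusters, so the global proportion
        # is the simple average of the cluster proportions.
        p = sum(cluster_proportions) / k
    var_srs = p * (1 - p) / n_subjects
    var_clus = sum((pi - p) ** 2 for pi in cluster_proportions) / (k * (k - 1))
    return var_clus / var_srs

# Hypothetical data: 6 clusters of 10 subjects each (n = 60).
proportions = [0.1, 0.2, 0.2, 0.3, 0.5, 0.5]
print(round(design_effect(proportions, n_subjects=60), 2))
```

The more the cluster proportions differ from one another, the larger the design effect.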
EPITABLE: Calculating design effect
Selecting a sampling method
• Population to be studied
– Size/geographical distribution
– Heterogeneity with respect to variable
• Level of precision required
• Resources available
• Importance of having a precise estimate
of the sampling error
Steps in estimating sample size
• Identify major study variable
• Determine type of estimate (%, mean, ratio,...)
• Indicate expected frequency of factor of interest
• Decide on desired precision of the estimate
• Decide on acceptable risk (α) that the estimate will differ from
the true population value by more than the desired precision
• Adjust for estimated design effect
• Adjust for expected response rate
• (Adjust for population size? Only if the population
is small)
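The three adjustment steps at the end of this list can be sketched as a small helper. The order of the adjustments and the finite population correction n / (1 + (n - 1) / N) are assumptions based on common practice, not something the slides specify. The function name and the example figures are hypothetical.

```python
import math

def adjust_sample_size(n_srs, design_effect=1.0, response_rate=1.0,
                       population_size=None):
    """Apply the usual adjustments to a simple-random-sampling sample size."""
    n = float(n_srs)
    if population_size is not None:
        # Finite population correction (assumed formula), for small populations.
        n = n / (1 + (n - 1) / population_size)
    n *= design_effect    # inflate for cluster sampling
    n /= response_rate    # inflate for expected non-response
    # Round away floating-point noise before rounding up to whole subjects.
    return math.ceil(round(n, 6))

# Hypothetical example: base size 544, design effect 2, 50% response.
print(adjust_sample_size(544, design_effect=2, response_rate=0.5))
```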
Sample size formula in
descriptive survey
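Simple random / systematic sampling
$$n = \frac{z^2 \, p \, q}{d^2} = \frac{1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 544$$
Cluster sampling
$$n = g \times \frac{z^2 \, p \, q}{d^2} = \frac{2 \times 1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 1088$$
z = 1.96 for a 5% risk, p = expected proportion, q = 1 − p, d = desired precision, g = design effect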
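The two calculations above can be checked with a few lines of Python. The function name is hypothetical, and the values (p = 0.15, d = 0.03, z = 1.96, design effect 2) are those of the worked example.

```python
def sample_size_proportion(p, d, z=1.96, design_effect=1.0):
    """Sample size needed to estimate a proportion p with precision d."""
    q = 1 - p
    return design_effect * z ** 2 * p * q / d ** 2

n_srs = sample_size_proportion(0.15, 0.03)                       # about 544.2
n_cluster = sample_size_proportion(0.15, 0.03, design_effect=2)  # about 1088.5
print(int(n_srs), int(n_cluster))
```

The figures 544 and 1088 are truncated values. Rounding up to whole subjects would give 545 and 1089.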
• If in doubt…