
Sampling Frames and Sample Design

Pres. 5

United Nations Regional Workshop on the 2010 World Programme on Population and Housing Censuses:
Census Evaluation and Post Enumeration Surveys, Bangkok, Thailand, 10-14 May, 2010
Sample Frames & Sample Design
Frames: the material from which a sample is drawn

• Each unit in the universe should be included
• There should be no duplicates
• Each unit should be well defined and distinguishable from other units (it should be unique)
• The frame should be kept up to date
• For a PES, the first-stage units are the primary sampling units (PSUs); in many countries these are area clusters

Sampling Strategies
• Probability household surveys
• In a PES it is usual to make inferences for a number of analytical domains
• Relatively large samples are necessary in each domain for reliable estimates
• A stratified cluster sample design is common
• First-stage units: area clusters/enumeration areas (EAs)
• PPS systematic sample selection
• At the second stage, it is common to canvass all persons in the selected households

Importance of Stratification
• The population is subdivided into groups that differ from one another but are internally homogeneous
• Stratification is based on variables correlated with the extent of coverage, such as geopolitical subdivisions
• Internal homogeneity can be maintained with regard to socio-demographic variables, e.g. an urban stratum
• Common strata include rural, urban, provinces, etc.

Multi-stage Cluster Sampling

• Usually used when sampling hierarchical populations
• The hierarchical levels are called stages
• First-stage units are called primary sampling units (PSUs), e.g. EAs
• Second-stage units are called secondary sampling units (SSUs), e.g. households
• Last-stage units are called ultimate sampling units (USUs), e.g. persons within households selected from EAs

Why Area sampling?

• At national level only a frame of EAs is required
• Data collection is more efficient
• Costs are lower than with simple random sampling (SRS)
• Supervision is easier
• However, estimates are prone to higher variability than under SRS

Choices of PSUs

• Must have clearly identifiable and stable boundaries
• Must completely cover the relevant population
• Should preferably have measures of size
• Should be mapped
• Must cover the whole country
• The number of PSUs must be relatively large

Common problems with EAs

• Incomplete coverage
• Inadequate maps
• Poor or missing measures of size

PES sample design

• A single-stage stratified cluster sample design is commonly adopted
• Once the PSUs (i.e. EAs) are selected, all households in the selected EAs are canvassed or, more rarely, only a sample of them (e.g. 1 in every 5)
• Canvassing all households in the selected EAs is beneficial for the matching operation

Sample Size

Sample size depends on the requirements for the estimates:

• Geographic level (national, province, urban/rural)
• Demographic detail (sex, age)
• Reliability
• Confidence level

Sample Size
• To estimate the sample size in the case of proportions you must:
• Know the expected occurrence of the event in the population, by domain of estimation
• Specify a confidence level (e.g. 95%)
• Specify the margin of error or precision (e.g. 1%)

Sample Size (contd.)
• To estimate the sample size in the case of proportions, the following formula can be used:

$$P\left[\,Y \pm t_{1-\alpha}\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2(y)}{n}}\,\right] = 1-\alpha$$

where
$Y$ = estimated value
$n$ = sample size
$t_{1-\alpha}$ = value of t that cuts $\alpha$% area in the tails of a normal distribution curve
$N$ = size of the total population
$s^2(y)/n$ = estimated variance of the proportion to estimate

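Sample size (contd.)
• From that it is deduced that

$$m = t_{1-\alpha}\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2(y)}{n}}$$

is the precision, in percentage points. With

$$s^2(y) = pq \ \text{(binomial distribution)}, \qquad 1-\frac{n}{N} \approx 1,$$

then

$$n = t_{1-\alpha}^{2}\,\frac{pq}{m^{2}}$$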

Sample Size (contd.)
• Example:
• To estimate the percentage of households omitted in the census (expected to be about 5%), with a 95% confidence level (t = 1.96) and a margin of error of 2%
• The sample size works out to be:

$$n_h = t^{2}\,\frac{pq}{m^{2}} = (1.96)^{2} \times \frac{5 \times 95}{2^{2}} \approx 456$$

Sample Size (contd.)
• Adjusting for non-response, e.g. 10%: 456 / 0.90 ≈ 507
• Adjusting for the design effect of a complex sample design
• A design effect of 2 is a default value: 2 x 507 = 1,014
• This may apply to each province (analysis domain). If there are five provinces, the sample size will be 5 x 1,014 = 5,070
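The arithmetic of the worked example can be reproduced with a minimal Python sketch. The function name and its parameters are illustrative and not part of the presentation; rounding to the nearest whole household after each step mirrors the figures on the slides, whereas in practice one would often round up.

```python
def required_sample_size(p, margin, t=1.96, response_rate=1.0, deff=1.0, domains=1):
    """Households needed to estimate a proportion p (in %) within +/- margin (in %).

    Hypothetical helper: rounds to the nearest integer after each adjustment,
    as the worked example on the slides does.
    """
    q = 100.0 - p
    n_srs = round(t ** 2 * p * q / margin ** 2)  # n = t^2 * pq / m^2, ignoring 1 - n/N
    n_resp = round(n_srs / response_rate)        # inflate for expected non-response
    n_domain = round(n_resp * deff)              # inflate for the design effect
    return n_domain * domains                    # same size in every analysis domain


print(required_sample_size(5, 2))                                          # 456
print(required_sample_size(5, 2, response_rate=0.90))                      # 507
print(required_sample_size(5, 2, response_rate=0.90, deff=2))              # 1014
print(required_sample_size(5, 2, response_rate=0.90, deff=2, domains=5))   # 5070
```

Because the margin of error enters squared, halving it to 1% would roughly quadruple every figure above.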

Sample selection procedures

• For greater convenience and efficiency, the sample of PSUs should be selected using a systematic procedure.
• If there are good measures of size, probability proportional to size (PPS) selection should be used to increase the efficiency of the sample design.
• Otherwise, the selection should be made with equal probabilities.
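For the equal-probability case, systematic selection might be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the function name, the fractional sampling interval and the 0-based positions in the ordered EA list are all choices made for this example.

```python
import math
import random


def systematic_sample(n_units, n_sample, start=None, rng=random):
    """Return 0-based positions of n_sample units picked systematically
    with equal probability from an ordered list of n_units units."""
    interval = n_units / n_sample        # sampling interval (may be fractional)
    if start is None:
        start = rng.random() * interval  # random start in [0, interval)
    # Take every interval-th position from the start; clamp guards float round-off.
    return [min(math.floor(start + k * interval), n_units - 1)
            for k in range(n_sample)]


print(systematic_sample(10, 5, start=0.5))  # [0, 2, 4, 6, 8]
```

Sorting the list geographically before selection spreads the sample across the territory, which is the implicit stratification effect of a systematic procedure.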

Sample selection procedures -- PPS
• 1) Order the EAs geographically (and, if applicable, by other stratification characteristics) to allow implicit stratification
• 2) Record for each EA i of stratum h the measure of size $M_{hi}$, typically the number of households or persons from the census mapping operation
• 3) Cumulate the size measures down the list of EAs; the last cumulated number will equal the total number of households (or persons) in stratum h ($M_h$)
• 4) Determine the number of EAs ($n_h$) to be selected in the stratum according to the allocation

Sample selection procedures -- PPS (contd.)

• 5) Determine the sampling interval ($I_h$) by: $I_h = M_h / n_h$
• 6) Obtain a random number ($A_h$) between 1 and $I_h$ inclusive
• 7) Determine the selection numbers as follows:
  $S_{hi} = A_h + (i-1) \times I_h$, for $i = 1, \ldots, n_h$, rounded up to the next integer
• The i-th EA selected is the first one on the list whose cumulated measure is equal to or greater than $S_{hi}$
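Steps 1 to 7 can be sketched in Python for a single stratum. The sketch rests on two assumptions that the slides do not spell out: the sampling interval is taken as the stratum total divided by the number of EAs to select (the usual definition), and the selected EA is the first one whose cumulated measure reaches or exceeds the selection number, which is the standard cumulative-total reading of the rule. The function name and the sample sizes are hypothetical.

```python
import math
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n_select, start=None, rng=random):
    """Return 0-based positions of the EAs selected with PPS systematic
    sampling from `sizes`, the size measures of an already ordered EA list."""
    cumulated = list(accumulate(sizes))       # step 3: cumulate the size measures
    total = cumulated[-1]                     # stratum total M_h
    interval = total / n_select               # step 5 (assumed: I_h = M_h / n_h)
    if start is None:
        start = rng.uniform(1, interval)      # step 6: random start A_h
    selected = []
    for i in range(n_select):
        s = math.ceil(start + i * interval)   # step 7: S_hi, rounded up
        s = min(s, total)                     # guard against float round-off
        # First EA whose cumulated measure is >= S_hi (assumed selection rule).
        selected.append(bisect_left(cumulated, s))
    return selected


sizes = [50, 30, 20, 100, 40, 60]             # hypothetical household counts
print(pps_systematic(sizes, 3, start=37.5))   # [0, 3, 4]
```

An EA whose size measure exceeds the sampling interval can be hit more than once; such EAs are usually handled separately, for example by selecting them with certainty.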

Illustration: Selection of eight EAs with probability proportional to size

Sample Allocation – 2009 Kenyan PES

Thank You!

