
Cluster Sampling and EPI Methods

Cluster sampling

Cluster:
A group of sampling units close to each other, i.e.
located together in the same area or neighborhood

• Used when "natural" groupings are evident in a
statistical population and the groups are relatively
homogeneous with respect to one another.
• In this technique, the total population is divided into
these groups (or clusters) and a simple random
sample of the groups is selected.


1. The population within a cluster should ideally be as
heterogeneous as possible.
2. There should be homogeneity between cluster means.
3. Each cluster should be a small scale representation
of the total population.
4. The clusters should be mutually exclusive and
collectively exhaustive.
All the sampling units within a selected cluster are included in the
sample, for example every household in a selected village.

This is particularly useful in population-based studies.
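
The procedure described so far (divide the population into clusters, draw a simple random sample of clusters, include every unit in the chosen clusters) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the village and household names are made up, and `one_stage_cluster_sample` is a hypothetical helper, not part of any library.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Draw a simple random sample of clusters and keep every unit in them.

    clusters: dict mapping a cluster id to the list of its sampling units.
    """
    rng = random.Random(seed)
    # Simple random sample of cluster ids, without replacement.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Every sampling unit inside a chosen cluster enters the sample.
    sample = [unit for cid in chosen for unit in clusters[cid]]
    return chosen, sample

# Hypothetical population: villages are clusters, households are units.
villages = {
    "village_A": ["A1", "A2", "A3"],
    "village_B": ["B1", "B2"],
    "village_C": ["C1", "C2", "C3", "C4"],
    "village_D": ["D1", "D2", "D3"],
}

chosen, households = one_stage_cluster_sample(villages, n_clusters=2, seed=1)
print(chosen, households)
```

Note that the final sample size depends on which clusters are drawn, because the clusters in this toy example differ in size.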

[Figure: an area divided into five clusters, Block 1 to Block 5]
EPI (Expanded Programme on Immunization) cluster survey design

• Select a central location in the village or town (e.g.
market, church, tree)
• Randomly select a direction to walk in
• Walk to the edge of the village in the selected direction
and count the number of houses along the way
• Select a random number between 1 and the total number
of houses and return to the house with that number
• This is the first household
• The second household should be the one whose front
door is closest to the first
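
The household selection steps above can be simulated as a rough sketch. Everything specific here is an assumption made for illustration: the village layout, the house names, the representation of "direction" as one of a few pre-mapped walking paths, and the helper name `select_epi_households`. In the field the direction is chosen physically rather than from a list.

```python
import math
import random

def select_epi_households(houses_by_direction, positions, rng):
    """Pick the first and second household following the steps on the slide.

    houses_by_direction: dict mapping a direction to the houses passed, in
        walking order, between the central location and the village edge.
    positions: dict mapping every house in the village to (x, y) coordinates
        of its front door (assumed known for this simulation).
    """
    # Randomly select a direction to walk in.
    direction = rng.choice(sorted(houses_by_direction))
    # Houses counted while walking to the edge of the village.
    path = houses_by_direction[direction]
    # Random number between 1 and the number of houses (both inclusive).
    k = rng.randint(1, len(path))
    first = path[k - 1]
    # Second household: the front door closest to the first one.
    # Ties are broken by the order of the positions dict.
    others = [h for h in positions if h != first]
    second = min(others, key=lambda h: math.dist(positions[h], positions[first]))
    return direction, first, second

# Hypothetical village with the central location at (0, 0).
positions = {
    "h1": (0.0, 1.0), "h2": (0.0, 2.0), "h3": (0.0, 3.0),
    "h4": (1.0, 0.0), "h5": (2.0, 0.0), "h6": (2.2, 0.1),
}
houses_by_direction = {
    "north": ["h1", "h2", "h3"],
    "east": ["h4", "h5", "h6"],
}

rng = random.Random(42)
direction, first, second = select_epi_households(houses_by_direction, positions, rng)
print(direction, first, second)
```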

Cluster Sampling

Pros
• Most economical way of sampling
• Sampling frame may exist at cluster level (e.g.
districts/boroughs/postcode)
• Efficient (less time to list and implement)

Cons
• May not reflect the diversity of the community
• Other elements in a cluster may share similar
characteristics, so the sample provides less
information than simple random sampling
• More variation around the estimate (bigger confidence
intervals)
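
The last point can be made concrete with the standard design effect approximation, DEFF = 1 + (m - 1) * ICC, where m is the number of units sampled per cluster and ICC is the intra-cluster correlation. This formula is not stated on the slides; it is a common textbook approximation that assumes equal cluster sizes. The numbers below (210 units, 7 per cluster, ICC of 0.2, a proportion of 0.5) are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

def ci_half_width(p, n, deff=1.0, z=1.96):
    """Half-width of a normal-approximation 95% CI for a proportion.

    The variance of a simple random sample is inflated by the design effect.
    """
    return z * math.sqrt(deff * p * (1 - p) / n)

# Hypothetical survey: 210 units in total, 7 units per cluster.
deff = design_effect(cluster_size=7, icc=0.2)
srs_width = ci_half_width(p=0.5, n=210)                 # simple random sample
cluster_width = ci_half_width(p=0.5, n=210, deff=deff)  # cluster sample

print(f"design effect: {deff:.2f}")
print(f"SRS half-width: {srs_width:.3f}")
print(f"cluster half-width: {cluster_width:.3f}")
```

With a design effect above 1, the confidence interval from the cluster sample is wider than the one from a simple random sample of the same size, by a factor of the square root of the design effect.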

Example: GPS use

• What about an area which isn’t mapped, e.g. a refugee camp?
• GPS or drones can be used to map the area.
Cluster vs. Stratified Sampling
• Cluster sampling is often confused with stratified sampling, because
they both involve "groups".
• In stratified sampling, the population is split into groups (strata) based
on some characteristic, and a sample is drawn from every stratum.
• In cluster sampling, the population is already broken into groups
(clusters), each cluster represents the population, and only a sample of
the clusters is studied.
• Cluster sampling is appropriate when the clusters are approximately
the same size.
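
The difference between the two designs shows up clearly in code. The sketch below applies both to the same hypothetical grouping of 40 units; the helper names `stratified_sample` and `cluster_sample` and the group labels are assumptions for illustration only.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample of units from EVERY group (stratum)."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Draw a random sample of GROUPS (clusters) and keep all their units."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: four groups of ten units each.
groups = {f"group_{i}": [f"g{i}_u{j}" for j in range(10)] for i in range(1, 5)}

rng = random.Random(0)
strat = stratified_sample(groups, n_per_stratum=2, rng=rng)
clust = cluster_sample(groups, n_clusters=2, rng=rng)

print({name: len(units) for name, units in strat.items()})  # all groups, few units each
print({name: len(units) for name, units in clust.items()})  # few groups, all units each
```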
Multi-stage sampling

In a two-stage sampling design, a sample of primary units
is selected and then a sample of secondary units is selected
within each selected primary unit

Example: classes are the primary units and students are the secondary units.

Two-stage cluster sampling aims at minimizing survey costs and at the
same time controlling the uncertainty related to estimates of interest. It is
used frequently in health and social sciences.
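
A two-stage design can be sketched as follows, using classes as primary units and students as secondary units. The class rosters, the sample sizes, and the helper name `two_stage_sample` are hypothetical and chosen only to illustrate the idea.

```python
import random

def two_stage_sample(primary_units, n_primary, n_secondary, seed=None):
    """Stage 1: sample primary units. Stage 2: sample units within each.

    primary_units: dict mapping a primary unit to its list of secondary units.
    """
    rng = random.Random(seed)
    # Stage 1: simple random sample of primary units.
    chosen = rng.sample(sorted(primary_units), n_primary)
    sample = {}
    for pu in chosen:
        members = primary_units[pu]
        # Stage 2: simple random sample within the primary unit
        # (take everyone if the unit is smaller than n_secondary).
        k = min(n_secondary, len(members))
        sample[pu] = rng.sample(members, k)
    return sample

# Hypothetical school: classes are primary units, students are secondary units.
classes = {
    "class_1": ["s01", "s02", "s03", "s04", "s05"],
    "class_2": ["s06", "s07", "s08", "s09"],
    "class_3": ["s10", "s11", "s12", "s13", "s14", "s15"],
    "class_4": ["s16", "s17", "s18"],
}

sample = two_stage_sample(classes, n_primary=2, n_secondary=3, seed=7)
print(sample)
```

Compared with taking every student in the chosen classes, sampling within each class lowers the fieldwork per class, which is one way the design trades cost against uncertainty.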
Summary
• The basic sampling frame (city, organization, etc.) is
often selected non-randomly
• Fit between sampling frame and research goals
must be evaluated
• Sampling frame as a concept is relevant to all kinds of
research (including nonprobability)
• Nonprobability sampling means you cannot generalize
beyond the sample
• Probability sampling means you can generalize to
the population defined by the sampling frame
