Sampling Methods
Objective
– Explain the role of sampling in the research process
– Distinguish between probability and non-probability sampling
– Understand factors considered when determining sample size
– Understand the steps in developing a sample
08/12/2021 1
Introduction
• Proverbs
– “To call in the statistician after the experiment is done
may be no more than asking him to perform a post-mortem
examination: he may be able to say what the
experiment died of.”
Introduction
• Sampling refers to strategies that enable us to pick a subgroup
from a larger group and then use this subgroup as a basis for
making inferences about the larger group.
Reasons for Sampling
• Main reasons for sampling instead of doing a
census.
– Economy
– Timeliness
– The large size of many populations
– Inaccessibility of some of the population
– Accuracy
Sampling….
• Sample surveys are almost never conducted for the purposes
of describing the particular sample under study.
• Rather they are conducted for purposes of understanding the
larger population from which the sample was initially selected
Disadvantages of sampling
There is always sampling error.
Sampling may create a feeling of
discrimination in the population.
Inadvisable where every unit in the population
is legally required to have a record.
Sampling Plan scheme
1. Define the Population of Interest
2. Identify a Sampling Frame (if possible)
3. Select a Sampling Method
4. Determine Sample Size
5. Execute the Sampling Plan
Types of Sampling
A. Probability Sampling
Each member of the population has a known chance of
being selected
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
5. Multistage sampling
Types of sampling cont…
B. Non-probability sampling
The researcher has no way of forecasting that each member
of the population will be represented in the sample.
1. Judgmental/Purposive
2. Quota
3. Convenience/haphazard
4. Snowball
5. Voluntary/self-selection
1. Simple random sampling
• The least sophisticated of all sampling designs
• Simple random selection where every member of the population
is given an equal chance of being selected
• Good for homogeneous population
• Easy when the population is small and elements are known
• Impractical for very large populations
Simple random sampling cont…
• SRS removes the possibility of any bias on the part of
researcher in selecting the sample from sampling frame
• we can apply methods like
Lottery method (sample drawn from box)
Table of random numbers
Computer generated random numbers
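The computer-generated option can be sketched as follows. This is a minimal illustration, assuming the sampling frame is a list of ID numbers; the function name, the frame of 500 IDs, and the seed are hypothetical choices for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each with an equal chance."""
    rng = random.Random(seed)  # the seed is only for reproducibility
    return rng.sample(list(frame), n)

# Hypothetical sampling frame: ID numbers 1..500
frame = range(1, 501)
sample = simple_random_sample(frame, n=20, seed=1)
print(sorted(sample))
```

Sampling is without replacement, so no unit can appear twice.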
2. Systematic random sampling
• A method of probability sampling in which the defined
target population is ordered and the sample is selected
according to position using a skip interval
• Selecting elements of the population in predetermined
sequence
• Select every kth item on a list (k= N/n)
• Randomness element is in picking up the starting point
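The skip-interval procedure can be sketched as follows. This is an illustration only, assuming an ordered list as the frame and an integer interval k = N // n; the function name and the frame of 100 units are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th unit after a random start within the first interval."""
    units = list(frame)
    k = len(units) // n  # skip interval k = N / n, rounded down
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(k)  # random start: position 0..k-1
    return [units[start + i * k] for i in range(n)]

# Hypothetical ordered frame of N = 100 units, sample of n = 10 (k = 10)
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

Only the starting point is random; every later selection is fixed by the interval.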
Systematic random sampling cont…
Advantage
• Easier to perform
• Requires less time than simple random sampling
• Works well when the population from which the
sample is to be drawn is homogeneously distributed
3. Stratified random sampling
• Method of probability sampling in which the population is
divided into different subgroups and samples are selected from
each.
• Divide the population by certain characteristics into homogeneous
subgroups (strata)
• Elements within each stratum are homogeneous, but are
heterogeneous across strata.
• A simple random or a systematic sample is taken from each stratum,
sized relative to that stratum's proportion of the population.
Stratified random sampling cont…
Researchers use stratified sampling:
• When a stratum of interest is a small percentage
of a population and random processes could
miss the stratum by chance.
• When enough is known about the population
that it can be easily broken into subgroups or
strata.
Stratified random sampling cont…
1. Equal intensity
Population: n = 1000
Stratum 1: n = 500
Stratum 2: n = 500
Stratified random sampling cont…
2. Proportional to size
Population: n = 1000
Stratum 1: n = 400
Stratum 2: n = 600
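Allocation proportional to size can be sketched as follows. This is an illustration under assumptions: the strata of 400 and 600 units are read as stratum sizes within a population of 1000, and the total sample of 100 is a hypothetical choice.

```python
import random

def proportional_stratified_sample(strata, n, seed=None):
    """Take a simple random sample from each stratum, sized by its population share."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding can make the overall total differ slightly from n
        n_h = round(n * len(units) / total)
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical population of 1000 split into strata of 400 and 600 units
strata = {
    "stratum_1": list(range(0, 400)),
    "stratum_2": list(range(400, 1000)),
}
sample = proportional_stratified_sample(strata, n=100, seed=1)
print({name: len(units) for name, units in sample.items()})
```

For equal intensity, the same number would be drawn from each stratum regardless of its size.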
Stratified random sampling cont…
Advantages
• representativeness of the sample is improved
• focuses on important subpopulations and
ignores irrelevant ones
• improves the accuracy of estimation
Stratified random sampling cont…
Disadvantages
• can be difficult to select relevant stratification
variables
• not useful when there are no homogeneous subgroups
• can be expensive
• Requires accurate information about the population;
otherwise it introduces bias.
4. Cluster sampling
• Used when:
– Researchers lack a good sampling frame for a
dispersed population.
– The cost to reach an element to sample is very
high.
• A random sample of clusters is taken, then all units
within those clusters are examined.
• Ideally, each cluster is internally heterogeneous (as varied
as the population) and similar to all the other clusters.
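One-stage cluster sampling can be sketched as follows. This is an illustration only; the village names and household IDs are hypothetical, and only a list of clusters is needed to make the random draw.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Randomly pick m clusters, then include every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # random sample of cluster names
    units = [u for name in chosen for u in clusters[name]]
    return chosen, units

# Hypothetical clusters: villages and the households in each
clusters = {
    "village_A": ["a1", "a2", "a3"],
    "village_B": ["b1", "b2"],
    "village_C": ["c1", "c2", "c3", "c4"],
    "village_D": ["d1", "d2"],
}
chosen, units = cluster_sample(clusters, m=2, seed=1)
print(chosen, units)
```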
Cluster sampling cont….
Advantage:
• Sampling frame of the reference population is not
required (Sufficient to have a list of clusters)
• Cost effective
Disadvantage:
• Larger sampling error than other forms of random
sampling.
• If clusters are not small it can become expensive.
5. Multistage sampling
• Used when the reference population is large and
widely scattered.
• Selection is done in stages until the final sampling
units are reached.
• Finally, study subjects are selected by
simple random sampling (SRS)
• No need of sampling frame for the
reference population.
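A two-stage version of this design can be sketched as follows. This is an illustration under assumptions: only two stages are shown, and the district names, household IDs, and sample sizes are hypothetical.

```python
import random

def two_stage_sample(clusters, m, n_per_cluster, seed=None):
    """Stage 1: randomly pick m clusters. Stage 2: SRS of units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    # A unit-level frame is needed only for the clusters actually chosen
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical population: 8 districts with 50 households each
districts = {
    f"district_{i}": [f"d{i}_hh{j}" for j in range(50)] for i in range(1, 9)
}
sample = two_stage_sample(districts, m=3, n_per_cluster=5, seed=1)
print(sample)
```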
Multistage sampling cont…
Advantage
• Cuts the cost of preparing the sample
frame
Disadvantage
• sampling error is high compared with
simple random sampling
Probability sampling summary
Population characteristics                 Appropriate sampling technique
1. Homogeneous members                     Simple random sampling;
                                           systematic random sampling
2. Stratified population with strata       Stratified random sampling
   approximately equal in size
3. Stratified population, strata           Proportional stratified sampling
   different in size
4. Population with discrete clusters       Cluster sampling
   with similar characteristics
B. Non-probability sampling
• A type of sampling where each study unit in the
population has an unknown probability of inclusion in the
sample.
• The selection of subjects is subjective.
When to use Non probability
• Group that represents the target population already exists.
• Difficult or impossible to obtain a list of names for
sampling (e.g., homeless people, IV drug users).
• For rare population.
Non-probability sampling…
Advantages
• Used when a sampling frame does not exist.
• They are quick, inexpensive and convenient.
• Good for pretests, pilot studies, In-depth interviews.
• Used when Precise representativeness is not
necessary.
Non-probability sampling…
Disadvantage
• No random selection (unrepresentative).
• Reliability cannot be measured.
• No way to measure the precision of the resulting
sample.
• Inappropriate for generalizing findings.
1. Judgmental sampling
• The researchers choose the sample based on who they
think would be appropriate for the study.
• Primarily used when there is a limited number of people
that have expertise in the area being researched.
• Appropriate when the study subjects are difficult to
locate.
Judgmental Sampling cont…
• More efficient and economic where the sample sizes
are small.
• Used where randomization is not expected to provide
representative samples.
• Reduced cost and time involved in acquiring the
sample
2. Quota sampling
• A variation of convenience sampling
• Elements are selected in the same proportion as in
the population but not in a random fashion
Advantage
• Interviewers are required to find cases with
particular characteristics
• Better than convenience, introduce some diversity
Disadvantage
• non random sampling
3. Convenience/haphazard sampling
• Selection of subjects based on easy availability and
accessibility
E.g., people who just happen to be walking by
• Often used in face to face interviews
Advantage
• very easy to carry out
Disadvantage
• Difficult to draw any meaningful conclusion.
• May not be representative
4. Purposive sampling
• Use judgment and deliberate effort to pick individuals
who meet specific criteria.
• Choosing people who we have decided are “typical”
of a group
• Especially good for exploratory or field research.
Purposive sampling cont…
• Appropriate for at least 3 situations.
1. select cases that are especially informative
2. desired population for the study is rare or very
difficult to locate.
3. Case study analysis: find important individuals
and study them in depth.
5. Snowball sampling
• Involves a process of “chain referrals”
• Suitable for locating key informants.
• You start with one or two key informants and ask
them if they know persons who know a lot about your
topic of interest.
• Used when trying to interview hard to reach groups.
6. Volunteer/self-selection
• Subjects selected are volunteers who show interest in
the study.
• Common in trials demanding long duration.
• Payment of subjects may sometimes be involved.
• Introduces strong self-selection bias.
Errors in Sampling
1. Non sampling error (Bias)
• Systematic error in the design or conduct of a sampling
procedure which results in distortion of the sample so that
it is no longer representative of the reference population.
We can eliminate or reduce the non-sampling error (bias)
by careful design of the sampling procedure and not by
increasing the sample size.
Errors in sampling cont…
2. Sampling error (random error)
• Results obtained from a sample cannot reflect
the full range of variation found in the larger group
(population).
• This type of error, arising from the sampling process
itself, is called sampling error, which is a form of
random error.
Sampling error can be minimized by increasing the
size of the sample.
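The effect of sample size on sampling error can be illustrated with the standard error of a sample mean, SE = σ/√n. This is a sketch only: the formula assumes simple random sampling, and σ = 10 is a hypothetical population standard deviation.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard error of a sample mean under simple random sampling."""
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 10
se_100 = standard_error_of_mean(10, 100)  # 1.0
se_400 = standard_error_of_mean(10, 400)  # 0.5
print(se_100, se_400)
```

Quadrupling the sample size only halves the standard error, and it does nothing to reduce bias.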
Data collection
Learning objectives
By the end of this session, students will be expected to:
o List different data collection techniques
o Discuss the strengths and limitations of different data collection
techniques
o Develop a data collection instrument for their own research
proposal
Data collection methods
Data collection techniques allow us to systematically collect data to meet the
objectives of the study
If data are collected haphazardly, it will be difficult to answer the research
questions
Depending on the type of variables and the objective of the study different
data collection methods can be used
Commonly used data collection techniques
• Using available information
• Observation
• Interviewing
• Questionnaires
• Focus group discussions
Differences between data collection
techniques and data collection tools
Data collection technique             Data collection tools
Using available information           Checklist; data compilation forms
Observation                           Eyes and other senses, pen/paper, watch,
                                      scales, microscope, etc.
Interviewing                          Interview guide, checklist, questionnaire,
                                      tape recorder
Administering written questionnaire   Questionnaire
Various data collection techniques
– Observation
– Face-to-face and self-administered interviews
– Postal or mail method and telephone interviews
– Using available information (document review)
– Focus group discussions (FGD)
– Others
1. Observation
Observation is a technique that involves systematically selecting, watching
and recording behavior and characteristics of living beings, objects or
phenomena
It ranges from simple visual observation to measurements made with
sophisticated equipment or facilities, such as radiographic, biochemical,
or X-ray examinations
Examples
Observing midwives while they conduct deliveries
Observing health extension workers while they provide ANC service
Observation…
Types
Participant observation
The observer takes part in the situation he or she observes
Non-participant observation
The observer watches the situation, openly or concealed, but does not
participate
Observation…
Advantages
o Gives relatively more accurate data on behavior and activities
Disadvantages
o Subject to the investigator's or observer's own biases, prejudices, and desires
o Needs more resources and skilled personnel when high-level
machines are used
2. Questionnaire
A questionnaire is simply a list of mimeographed or printed
questions that is completed by or for a respondent(s)
Types of Questionnaires
1. Interviewer-administered
– face to face
– Telephone
Tadesse A 51
Types of Questionnaires
2. Self-administered
– by post
– email/Internet
– Administered to respondents gathered in a room
Common problems during data collection might include
o Language barriers
o Lack of adequate time
o Expense
o Inadequately trained and experienced staff
o Invasion of privacy
o Bias
o Cultural norms (e.g. which may preclude men interviewing women)
Choosing a method of data collection
Decision-makers need information that is relevant, timely, accurate and usable
Some methods pay attention to:
– timeliness and reduction in cost
Others pay attention to:
– accuracy and the strength of the method in using scientific approaches
The selection of the method of data collection is also based on practical
considerations, such as:
o The need for personnel, skills, equipment, etc …
o The acceptability of the procedures to the study subjects
Factors to be considered when designing a tool include:
o Study objectives and major research questions
o Study hypotheses: what data are required to accept or reject a
hypothesis?
o Data to be collected
o Plan for analysis
o Budget; and
o The audience or target population (e.g. can a wife be interviewed in the
absence of the husband?)
o Above all, will the respondents be able to give the required information?
Question forms
• Nonstructured questions
– Open-ended
E.g., What do you think about the new abortion legislation?
Explain.
• Structured questions/closed ended
– Fixed-response
1. Do you smoke cigarettes?
___Yes ___No
2. Have you ever watched CNN News?
___Yes ___No
Requirements of questions
Must have face validity
– The appropriate source of information should be consulted
– The question that we design should be one that gives an obviously valid
and relevant measurement for the variable
– For example, it may be self-evident that records kept in an obstetrics
ward will provide a more valid indication of birth weights than
information obtained by questioning the mother
Requirements of questions…
Must be clear and unambiguous
– They must be phrased in language that the respondent is expected to
understand, and that all respondents will understand in the same way
To ensure clarity, each question should contain only one idea;
– Avoid ‘double-barreled’ questions like:
‘Do you take your child to a doctor when s/he has a cold or diarrhea?’
Requirements of questions…
Must not be offensive
– whenever possible it is wise to avoid questions that may offend
the respondent
– for example those that deal with intimate matters
– those which may seem to expose the respondent’s ignorance
– those requiring her/him to give a socially unacceptable answer
Requirements of questions…
The questions should be fair
– They should not be phrased in a way that suggests a specific answer
and should not be loaded
– Short questions are generally preferable to long ones
Requirements of questions…
Sensitive questions
– It may not be possible to avoid asking ‘sensitive’ questions
that may offend respondents
– Asking them last is recommended
Designing of Questionnaire
Designing a good questionnaire always takes several drafts
Steps of questionnaire development:
o Content of the questionnaire
o Formulating of the questions
o Sequencing of the questions
o Formatting of the questions
o Translation of the questionnaire
o Pretesting the questionnaire
Step1: Content
Take study objectives and variables as a starting point:
– Decide what questions will be needed to measure or to define your
variables and reach your objectives
– When developing the questionnaire, you should reconsider the
variables you have chosen, if necessary, add, drop or change some of
the variables
– You may even change some of your objectives at this stage
Step 2: Formulating Questions
Formulate one or more questions that will provide the
information needed to measure each variable
– Take care that questions are specific and precise enough that different
respondents do not interpret them differently
– For example, a question such as “Where do community members
usually seek treatment when they are sick?” is too general, because each
respondent may have a different illness or situation in mind
Step 2: Formulating Questions…
The question, therefore, as a rule:
– has to be broken up into different parts
– made so specific that all informants focus on the same thing
Check whether each question measures one thing at a time:
– For example, the question “How large an interval would you and your
husband prefer between two successive births?” is better split into two
questions, because husband and wife may hold different opinions
Step 2: Formulating Questions…
Avoid professional jargon and abbreviations
Avoid leading questions
o A question is leading if it suggests a certain response
– ‘Don’t you think that the intrauterine device is safer than the pill?’ It
would be better to ask like: ‘Which do you think is safer, the intrauterine
device or the pill?’
Step 3: Sequencing of questions
Design your interview schedule or questionnaire to be “consumer friendly”
– The sequence of questions must be logical for the respondent and allow as much as
possible for a “natural” discussion, even in more structured interviews
At the beginning of the interview, keep questions concerning “background
variables” to a minimum
If possible, pose most or all of these questions later in the interview
Respondents may be hesitant to provide “personal” information early in
an interview
Step 3: Sequencing of questions…
Pose more sensitive questions as late as possible in the interview
– questions pertaining to income, sexual behavior, or disease with
stigma attached to it, etc…
Use simple everyday local language
Make the questionnaire as short as possible
– Conduct the interview in two parts if the nature of the topic requires a
long questionnaire (more than 1 hour interview)
Step 4: Formatting the questionnaire
When you finalize your questionnaire, be sure that:
o A separate, introductory page is attached to each questionnaire, explaining
the purpose of the study, requesting the informant’s consent to be
interviewed and assuring confidentiality of the response
o Each questionnaire has a heading and space to insert the number, date
and location of the interview, if required the name of the informant
o You may add the name of the interviewer to facilitate quality control
Step 4: Formatting the questionnaire…
Layout is such that questions belonging together appear together visually
If the questionnaire is long, you may use sub-headings for groups of
questions
Sufficient space is provided for answers to open-ended questions if any
Boxes (codes) for pre-categorized answers are placed in a consistent
manner on one half of the page
Step 5: Translation to local language
If interviews will be conducted in one or more local languages, the questionnaire
has to be translated to standardize the way questions will be asked
After having it translated, you should have it translated back into the original
language to ensure consistency
You can then compare the two versions of questionnaire for differences and
make a decision concerning the final phrasing of difficult concepts and
wordings
Step 6: Pre-testing the questionnaire
A pretest is a try-out of the questionnaire
Pretesting is carried out on a small number of people who are
comparable with the study respondents
In this way, errors and confusing questions can be corrected in
time
After the pretest interviews are completed, one can modify the questionnaire
accordingly
The pretest also checks the content and approach of the questionnaire
Take home assignment
• Design a data collection method and tool for
your research proposal
Factors considered in sample design
• Research objectives
• Resources
• Knowledge of target population
• Degree of accuracy
• Time frame
• Research scope
• Statistical analysis needs
Sample size determination
In order to calculate the required sample size, you need
to know the following facts:
1. The reasonable estimate of the key proportion to be
studied. If you cannot guess the proportion, take it as
50%.
2. The degree of accuracy required. That is, the allowed
deviation from the true proportion in the population as a
whole. It can be within 1% or 5%, etc.
3. The confidence level required, usually specified as 95%.
Sample size determination …
4. The expected difference between the two sub-groups and the
desired power, that is, the likelihood of finding a
statistically significant difference.
Note that number 4 is required when there are two
population groups and the interest is in comparing
two means or proportions.
A. Sample size for estimating a single population mean
• AIM: Estimate µ
• WANT: Estimate ± d units
where d = margin of error = $Z_{\alpha/2}\,\sigma/\sqrt{n}$, so $n = (Z_{\alpha/2}\,\sigma/d)^2$
Population $\sigma^2$ can be estimated from
• Pilot or preliminary sample
• Previous similar survey
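A worked calculation for a single mean is sketched below. It assumes the usual normal-approximation formula n = (Z·σ/d)²; the function name and the inputs σ = 15 and d = 3 are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, d, confidence=0.95):
    """Smallest n so that the margin of error for a mean is at most d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return math.ceil((z * sigma / d) ** 2)

# Hypothetical inputs: sigma = 15 from a pilot sample, margin of error d = 3
n = sample_size_mean(sigma=15, d=3)  # 97
print(n)
```

The result is rounded up, since a fractional subject cannot be sampled.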
B. Sample size to estimate a single
population proportion
• Aim: Estimate p
• Want: Estimate ± d units where d = Z•SE
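A worked calculation for a single proportion is sketched below. It assumes SE = √(p(1−p)/n), which gives n = Z²·p(1−p)/d²; the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, d, confidence=0.95):
    """Smallest n so that the margin of error for a proportion is at most d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

# With no prior estimate, p = 0.5 gives the largest (most conservative) n
n = sample_size_proportion(p=0.5, d=0.05)  # 385
print(n)
```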
C. Sample size for estimating difference in
two means
• Aim: Estimate μ1-μ2
• Want: within ± d units,
where $d = Z_{\alpha/2} \cdot SE$
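A worked calculation for the difference in two means is sketched below. It assumes two independent groups of equal size n, so SE = √((σ1² + σ2²)/n) and n = Z²(σ1² + σ2²)/d²; the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_diff_means(sigma1, sigma2, d, confidence=0.95):
    """Per-group n to estimate mu1 - mu2 within +/- d (equal group sizes)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * (sigma1 ** 2 + sigma2 ** 2) / d ** 2)

# Hypothetical inputs: both standard deviations 10, margin of error d = 4
n_per_group = sample_size_diff_means(10, 10, d=4)  # 49
print(n_per_group)
```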
D. Sample size for estimating difference in
two proportions
• Aim: Estimate p1-p2
• Want: within ± d units
where $d = Z_{\alpha/2} \cdot SE$
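A worked calculation for the difference in two proportions is sketched below. It assumes two independent groups of equal size n, so SE = √((p1(1−p1) + p2(1−p2))/n); the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_diff_proportions(p1, p2, d, confidence=0.95):
    """Per-group n to estimate p1 - p2 within +/- d (equal group sizes)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(z ** 2 * variance_sum / d ** 2)

# Hypothetical inputs: anticipated proportions 0.4 and 0.3, margin d = 0.1
n_per_group = sample_size_diff_proportions(0.4, 0.3, d=0.1)  # 173
print(n_per_group)
```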
Sample Size Based on Hypothesis Testing
• The method of determining sample size in the
preceding sections takes into account the probability of
a type I error, but not a type II error, since only the
confidence level (1-α) is specified.
• However, in many statistical inference procedures, both type
I and type II errors are considered when determining
the sample size.
• Type I error (α) = The probability of rejecting H0
when it is true
• Type II error (β) = The probability of not rejecting H0
when it is false
• Power (1-β) = the probability that H0 is rejected
given that it is false
H0: There is no difference between the
two groups
H0: µ1 - µ2 = 0
P1 - P2 = 0
HA: There is a difference between the
two groups
HA: µ1 - µ2 ≠ 0
P1 - P2 ≠ 0
A. Comparison between two means (Equal
sample sizes)
$\Delta = |\mu_1 - \mu_2|$
The means and variances of the two respective groups
are $(\mu_1, \sigma_1^2)$ and $(\mu_2, \sigma_2^2)$.
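A worked calculation for this comparison is sketched below. The slide's own formula did not survive, so the code assumes the common two-sided normal-approximation form n = (Zα/2 + Zβ)²(σ1² + σ2²)/∆² per group; the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_compare_means(sigma1, sigma2, delta, alpha=0.05, power=0.80):
    """Per-group n to detect a difference delta between two means (equal n)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    n = (z_alpha + z_beta) ** 2 * (sigma1 ** 2 + sigma2 ** 2) / delta ** 2
    return math.ceil(n)

# Hypothetical inputs: both standard deviations 10, difference to detect 5
n_per_group = sample_size_compare_means(10, 10, delta=5)  # 63
print(n_per_group)
```

Raising the power or shrinking the difference to be detected both increase the required n.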
B. Comparison between two means (Unequal
sample sizes)
λ =n2/n1
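A worked calculation with unequal group sizes is sketched below. The slide's own formula did not survive, so the code assumes n2 = λ·n1 and the normal-approximation form n1 = (Zα/2 + Zβ)²(σ1² + σ2²/λ)/∆²; the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_sizes_unequal_means(sigma1, sigma2, delta, ratio,
                               alpha=0.05, power=0.80):
    """Return (n1, n2) with n2 = ratio * n1 for comparing two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n1 = (z_alpha + z_beta) ** 2 * (sigma1 ** 2 + sigma2 ** 2 / ratio) / delta ** 2
    n1 = math.ceil(n1)
    return n1, math.ceil(ratio * n1)

# Hypothetical inputs: twice as many subjects in group 2 (lambda = 2)
n1, n2 = sample_sizes_unequal_means(10, 10, delta=5, ratio=2)  # (48, 96)
print(n1, n2)
```

With a ratio of 1 this reduces to the equal-sample-size case.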
C. Comparison between two proportions
(Equal sample sizes)
∆ = p1-p2
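A worked calculation for comparing two proportions is sketched below. The slide's own formula did not survive, so the code assumes the simple unpooled normal-approximation form n = (Zα/2 + Zβ)²[p1(1−p1) + p2(1−p2)]/∆² per group; textbooks that pool the variance under H0 give slightly different numbers. The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_compare_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n to detect the difference p1 - p2 (equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    delta = p1 - p2
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / delta ** 2)

# Hypothetical inputs: anticipated proportions 0.5 and 0.3
n_per_group = sample_size_compare_proportions(0.5, 0.3)  # 91
print(n_per_group)
```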
D. Comparison between two proportions
(Unequal sample sizes)
Note: This formula is quite general, and applies to cross-sectional, case-
control and cohort studies.
EPIDEMIOLOGICAL STUDY DESIGN
Epidemiological studies
– Descriptive
   – Case reports
   – Case series
   – Cross-sectional
   – Ecological
– Analytical
   – Observational
      – Case-control
      – Cohort
      – Cross-sectional (comparative)
   – Interventional
      – RT