
Research Design: The Population, Sampling & Data Collection

Study Population
A complete set of elements (persons or
objects) that possess some common
characteristic defined by the sampling
criteria established by the researcher.
• The entire group of people or objects to
which the researcher wishes to generalize
the study findings
• Composed of two groups: target
population & accessible population
Target population (universe)
• The entire group of people or objects to which
the researcher wishes to generalize the study
findings
• Meets a set of criteria of interest to the
researcher
Examples:
• All institutionalized elderly with Alzheimer's
• All low birth weight infants
• All school-age children with asthma
• All pregnant teens
Accessible population
The portion of the population to which the researcher
has reasonable access; may be a subset of the target
population.
• May be limited to region, state, city, county, or
institution
• Examples:
• All institutionalized elderly with Alzheimer's in Kinondoni
Municipal public hospitals
• All low birth weight infants admitted to the neonatal
ICUs of Ilala and Temeke municipal hospitals
• All school-age children with asthma treated in pediatric
asthma clinics in private hospitals in Ilala
• All pregnant teens in the district of Kilombero
Specifying the study population
[Figure: nested levels, from widest to narrowest: total population, target population, accessible population, sample]
Who’s in and who’s out? Selecting a study
population
• Why select? It is rarely possible to study the entire target
population
• Criteria for selection: relevance to the study
objectives; practicality (accessible); usually
defined by time & place
• Sources of study population: Community,
workplace, school, hospital etc.

The sampling concepts
• Sampling: Selecting a group of people, events,
behaviors, or other elements with which to
conduct a study
• Sampling plan: the sampling method; defines the
selection process
• Sample: the selected group of people or
elements from which data are collected for a
study
• Members of the sample can be called the
subjects/participants/individual cases.
Selecting samples
[Figure 7.1 Population, sample and individual cases. Source: Saunders et al. (2009)]

Primary Considerations in Sampling
I. Population -- determine who (or what subjects/objects)
can provide the required information.
II. Sample Frame -- develop a specific list of population
members.
III. Sampling Unit -- determine the basis for drawing the
sample (Individuals, Study Centres, households etc.).
IV. Sampling Method -- determine how the sample will be
selected.
V. Sample Size -- determine how many population
members are to be included in the sample.
VI. Sample Plan -- develop a mechanism for selecting and
contacting the sample members.
VII. Execution -- carry out the sampling plan.
The need for sampling
• It is often not feasible to study the whole population
(impracticable)
• Budget constraints restrict data collection
• Time constraints may restrict data collection
• Results from data collection might be needed quickly
• A sample should provide accurate estimates of population
characteristics
• Reliability of the information drawn from the sample about
the entire population depends on the nature of the sample,
its size & representativeness
• If the sample is not representative, the conclusions drawn
from the research cannot be generalized to the entire
population but are limited to the sample studied
Qualities of a good sampling method
• Should provide a sample that is
representative of the population from
which it is drawn

• Should provide information about the likely accuracy
of sample estimates of population
characteristics

• Should be achievable at a reasonable cost


Important issues about sample size
• General rule - as large as possible to increase the
representativeness of the sample
• Increased size decreases sampling error
• Relatively small samples are used in qualitative, exploratory,
case and experimental studies
• Descriptive studies need large samples; e.g. 10
subjects for each item on the questionnaire or
interview guide
• As the number of variables studied increases, the
sample size also needs to increase in order to detect
significant relationships or differences
• A minimum of about 30 subjects is a common rule of thumb for
relying on the central limit theorem (statistics based on the mean)
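The point that a larger sample reduces sampling error can be illustrated with the standard error of a sample mean, sd / sqrt(n). The sketch below assumes a simple random sample and a made-up standard deviation of 15; the function name is illustrative and not part of the slides.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean under simple random sampling."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)

# Hypothetical standard deviation of 15 units; only the trend matters here.
for n in (30, 120, 480):
    print(n, round(standard_error(15.0, n), 3))
```

Each fourfold increase in n halves the standard error, so the gain in precision per added subject shrinks as the sample grows.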
Overview of sampling techniques/methods
• Choice of sampling techniques/methods
depends upon the research question(s) and
their objectives
Probability sampling

The four-stage process:

1. Identify sampling frame from research objectives

2. Decide on a suitable sample size

3. Select the appropriate technique and the sample

4. Check that the sample is representative


Probability sampling
Stage 1: Identifying a suitable sampling frame

Key points to consider:

• Problems of using existing databases

• Extent of possible generalisation from the sample

• Validity and reliability

• Avoidance of bias
Probability sampling
Stage 2: Decide on a suitable sample size
Choice of sample size is influenced by:

• Confidence needed in the data

• Margin of error that can be tolerated (degree of accuracy)

• Types (categories) of analyses to be undertaken

• Size of the sample population and its distribution

• Expected response rate
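As a rough illustration of how confidence level, margin of error and population size combine, the sketch below uses Cochran's formula for a proportion with a finite population correction. The slides do not name a formula, so this choice, the function name and the default proportion of 0.5 (the most conservative value) are all assumptions.

```python
import math

# Standard normal critical values for common confidence levels.
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def required_sample_size(population: int, margin_of_error: float,
                         confidence: float = 0.95,
                         proportion: float = 0.5) -> int:
    """Minimum sample size for estimating a proportion (Cochran's formula)."""
    z = Z_SCORES[confidence]
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction shrinks n0 for small populations.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(required_sample_size(2000, 0.05))    # population of 2000, 5% margin
print(required_sample_size(2000, 0.03))    # tighter margin needs more cases
print(required_sample_size(10 ** 9, 0.05)) # very large population
```

A tighter margin of error or a higher confidence level raises the required size, while the size of the population matters only when the population is small.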


Probability sampling
The importance of response rate
Key considerations

• Non-respondents and analysis of refusals

• Obtaining a representative sample

• Calculating the active response rate

• Estimating response rate and sample size
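The response-rate calculations can be sketched as follows. The two rate definitions follow the usual textbook form (responses divided by the sample, net of ineligible cases and, for the active rate, unreachable cases as well); the function names and the figures in the example are hypothetical.

```python
import math

def total_response_rate(responses: int, sample_size: int, ineligible: int) -> float:
    """Responses as a share of the eligible sample."""
    return responses / (sample_size - ineligible)

def active_response_rate(responses: int, sample_size: int,
                         ineligible: int, unreachable: int) -> float:
    """Responses as a share of the eligible cases that could be reached."""
    return responses / (sample_size - (ineligible + unreachable))

def sample_to_draw(minimum_n: int, expected_rate: float) -> int:
    """Inflate the minimum sample size to allow for expected non-response."""
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(minimum_n / expected_rate, 9))

# Hypothetical survey: 200 selected, 20 ineligible, 30 unreachable, 120 replies.
print(total_response_rate(120, 200, 20))
print(active_response_rate(120, 200, 20, 30))
print(sample_to_draw(323, 0.6))
```

Estimating the response rate before fieldwork matters because the number of cases to approach must exceed the minimum sample size by the expected share of non-respondents.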


Probability sampling
Stage 3: Selecting a sampling technique
Probability (formal) sampling methods:
► Based on the assumption that every member of the
population has a known, non-zero probability of
selection.
• Increases the sample's representativeness of the
population.
Five main techniques are used for a probability sample:
• Simple random – a procedure that ensures that each
case in the population has an equal chance of being
included in the sample.
• Systematic random – involves sampling at regular
intervals (e.g. every tenth or one-hundredth person).
The initial sampling point is determined at random, and
then the cases are selected at regular intervals.
• Example: population (N) = 2000, sample size (n) = 50,
sampling interval k = N/n = 2000/50 = 40. Use a table of
random numbers to obtain the starting point.
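A minimal sketch of simple random and systematic selection, using the N = 2000, n = 50 example. A seeded pseudo-random generator stands in for the table of random numbers, and the sampling frame is a hypothetical list of ID numbers.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every case in the frame has an equal chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start within the first interval, then every k-th case."""
    k = len(frame) // n        # sampling interval k = N / n
    start = rng.randrange(k)   # random starting position 0..k-1
    return [frame[start + i * k] for i in range(n)]

rng = random.Random(42)        # seeded so the draw can be repeated
frame = list(range(1, 2001))   # hypothetical frame of 2000 ID numbers

chosen = systematic_sample(frame, 50, rng)
print(chosen[:5])              # consecutive IDs are 40 apart
print(sorted(simple_random_sample(frame, 50, rng))[:5])
```

Systematic selection assumes the frame has no periodic pattern that lines up with the interval k; if it does, the sample can be biased.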
Probability sampling
• Stratified random – involves dividing a
population into sub-populations (strata) and
sampling separately & proportionately from these
sub-populations (main aim: to reduce variability
and hence sampling error)
• The population is divided into two or more
relevant strata and a random sample
(simple or systematic) is drawn from each stratum
• There are two approaches to stratification:
proportional & disproportional
Probability sampling
• Proportional stratified sampling:
• Subgroup sample sizes match the proportions of the subgroups in the
population, e.g. a high school population with 15% seniors, 25%
juniors, 25% sophomores and 35% freshmen is sampled in those same shares
• With a proportional sample, the sample has the same proportions as
the population
• Disproportional stratified sampling:
• Subgroup sample sizes are not equal to the proportions of the
subgroups in the population

CLASS        POPULATION   SAMPLE
Seniors      15%          25%
Juniors      25%          25%
Sophomores   25%          25%
Freshmen     35%          25%

With a disproportional sample, the sample does not have the same
proportions as the population.
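The two allocation approaches can be sketched as below, using the class shares from the proportional example (15% seniors, 25% juniors, 25% sophomores, 35% freshmen). The school size of 1000, the total sample of 200 and the function names are assumptions made for illustration.

```python
import random

def proportional_allocation(strata_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    # Rounding can make the parts sum to slightly more or less than n.
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

def equal_allocation(strata_sizes, n):
    """A disproportional scheme: the same sample size in every stratum."""
    per_stratum = n // len(strata_sizes)
    return {name: per_stratum for name in strata_sizes}

def stratified_sample(strata, allocation, rng):
    """Draw a simple random sample of the allocated size within each stratum."""
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

# Hypothetical school of 1000 students.
sizes = {"seniors": 150, "juniors": 250, "sophomores": 250, "freshmen": 350}
strata = {name: [f"{name}-{i}" for i in range(size)]
          for name, size in sizes.items()}
rng = random.Random(1)

proportional = proportional_allocation(sizes, 200)
equal = equal_allocation(sizes, 200)
sample = stratified_sample(strata, proportional, rng)
print(proportional)
print(equal)
```

With a disproportional sample, estimates for the whole population normally need to be weighted back to the population shares.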
• Cluster sampling – applied to highly dispersed
populations (to reduce cost) by concentrating the sampling
in certain areas.
• The population is divided into discrete groups or
clusters prior to sampling. Then a random sample
(systematic or simple) of these clusters is drawn

• Multi-stage sampling – developing clusters and
sampling within clusters using systematic or simple
random sampling
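Cluster and multi-stage selection can be sketched as follows. The clusters here are hypothetical wards with made-up household IDs, and simple random selection is used at both stages; systematic selection would work equally well.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: pick clusters at random. Stage 2: sample within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical wards, each holding 100 household IDs.
wards = {f"ward_{w}": [f"{w}-{i}" for i in range(100)] for w in "ABCDEF"}
rng = random.Random(7)

whole = cluster_sample(wards, 2, rng)
staged = multistage_sample(wards, 3, 10, rng)
print({name: len(members) for name, members in whole.items()})
print({name: len(members) for name, members in staged.items()})
```

Fieldwork is concentrated in the chosen clusters, which lowers cost, but cases within a cluster tend to resemble one another, so sampling error is usually larger than for a simple random sample of the same size.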
Non-probability sampling (1)

Key considerations:

• Deciding on a suitable sample size

• Selecting the appropriate technique


Non-probability sampling (3)

Consideration 1: Deciding on a suitable sample size

• As in probability sampling
Non-probability sampling (5)

• Purposive sampling – the judgement of the
researcher is used to select the cases that make
up the sample.
• Involves use of the researcher's own judgement or intuition to identify
sampling units and select the sample from them. Commonly used when a
particular (often rare) category within a population
is to be studied
• Bases include
• Extreme cases
• Heterogeneity (maximum variation)
• Homogeneity (maximum similarity)
• Critical or typical cases
Non-probability sampling (6)
• Snowball sampling – subsequent respondents
are obtained from information provided by the
initial respondents. Networks of people are traced
from an initial contact. The first contact person may
be chosen randomly or non randomly and subsequent
contacts follow from that (his / her links). Applicable in
migration / educational studies.
• Self-selection sampling – the case, usually an
individual, is allowed to identify their desire to be
part of the sample
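Snowball sampling, described above, amounts to tracing a referral network outward from the initial contacts. The sketch below does this breadth-first over a hypothetical referral map; the names and the stopping rule (a maximum sample size) are assumptions.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referrals outward from the seed contacts, wave by wave."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        # Each respondent names further contacts who have not been seen yet.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network: who names whom.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"]}
print(snowball_sample(referrals, ["A"], 4))
```

The result depends heavily on the initial contact, because only people linked to the seed can ever enter the sample.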
Non-probability sampling (7)
• Convenience (or accidental) sampling –
uses what is immediately available. Cases are
selected haphazardly because they are the easiest to
obtain, not because they are known to be
representative.
• A weak approach to sampling because it is hard to
control for bias
• The sample includes whoever is available and willing
to give consent.
• Representativeness is a concern
Non-probability sampling (4)
Consideration 2: Selecting the appropriate technique
• Quota sampling (larger populations) – ensures that
the sample represents certain characteristics of the
population chosen by the researcher
• Selects units of study on the basis of set criteria until the
quota is filled but representativeness of the sample is
not assured
• Uses convenience sampling, but with a strategy to
ensure inclusion of subject types who are likely to be
underrepresented in the convenience sample
• Goal is to replicate the proportions of subgroups
present in the population
• Works better than convenience sampling to reduce bias
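Quota sampling can be sketched as convenience sampling with a cap per subgroup: people are taken in the order they become available, but only while their group's quota is still open. The groups, quotas and arrival order below are hypothetical.

```python
def quota_sample(arrivals, quotas, group_of):
    """Accept cases in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled
    return sample

# Hypothetical stream of available respondents: (id, sex).
arrivals = [(1, "F"), (2, "F"), (3, "F"), (4, "M"), (5, "F"), (6, "M"), (7, "M")]
quotas = {"F": 2, "M": 2}
picked = quota_sample(arrivals, quotas, group_of=lambda person: person[1])
print(picked)
```

The quotas fix the subgroup proportions, but selection within each subgroup is still non-random, which is why representativeness is not assured.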
Overview of sampling techniques
[Figure 7.2 Sampling techniques. Source: Saunders et al. (2009)]
Closing remarks on sampling

• Probability sampling requires a sampling frame and can be
more time consuming

• When a sampling frame is not possible, non-probability
sampling is used

• Many research projects use a combination of sampling
techniques

• All choices depend on the ability to gain access to organisations


Visit: https://www.surveysystem.com/sscalc.htm for sample size
calculators
Data collection methods
Primary considerations
• Whether qualitative or quantitative data
• Primary or secondary data
Primary Data collection Methods
1) Questionnaire -- a formalized instrument for asking
information directly from respondents.

2) Observation -- the direct examination of behaviour, the
results of behaviour or physiological changes.
Information is sought by way of the investigator's own direct
observation.

3) Interviewing (face-to-face) -- involves direct
communication between the interviewer and the
interviewee
Primary Data collection Methods (continued)
4) Projective Techniques and Depth Interviews -- designed
to gather information that respondents are either
unable or unwilling to provide in response to direct
questions.
5) Focus group discussions -- discussions
conducted by a researcher with a group of respondents
who are considered to be representative of the target
population
6) Experiment -- an investigation in which a factor or
variable is isolated and its effects measured
Questionnaire
Steps to Develop a Good Questionnaire
• Decide and then plan what to measure.

• Formulate questions to obtain the needed information.

• Decide the order and wording of questions and the layout of
the questionnaire.

• Using a small sample, test the questionnaire for omissions
and ambiguities.

• Correct the problems (and pretest it again).

Preparation of Questionnaires: Crucial Decisions-1
1. Preliminary Decisions
* Exactly what information is required?
* Exactly who are the target respondents?
* What method of communication to use to reach the target
respondents?
2. Decisions About the Content of each Question
* Is this question really needed?
* Is this question sufficient to generate the needed
information?
* Can the respondent answer the question correctly?
* Will the respondent answer the question correctly?
* Are there any external events that might bias the response to
the question?

Preparation of Questionnaires: Crucial Decisions-2
3. Decisions Concerning the Language of Questions
* Do the words used have just one meaning for all the
respondents?
* Are any of the words/phrases loaded or misleading in any
way?
* Are there any alternatives implied in the question?
* Are there any unstated assumptions related to the question?
* Will the respondents approach the question from the frame
of reference desired by the researcher?
4. Decisions About the Response Format
* Should the question be asked as an open-ended, a multiple-
choice, or a dichotomous one?
Preparation of Questionnaires: Crucial Decisions-3
5. Decisions Concerning the Sequence of Questions
• Are the questions organized in a logical manner to
eliminate possibilities of errors in interpretation?
6. Decisions on the Layout of the Questionnaire
• Is the questionnaire designed in a manner to avoid
confusion and minimize response errors?
7. Decisions related to the Pretest(s) and Revisions
• Has the final questionnaire been subjected to a
thorough pretest(s), using respondents similar to
those who will be included in the final survey?

Potential Sources of Error-1

Error type: cause/source

1. Surrogate information error: variation between the information required
to solve the problem and the information sought by the researcher.
2. Measurement error: variation between the information sought by
the researcher and the information produced by the measurement process.
3. Experimental error: variation between the impact attributed to
the independent variable(s) and the actual impact of the independent
variable(s).
4. Population specification error: variation between the population
required to provide the needed information and the population selected
by the researcher.

Potential Sources of Error-2

Error type: cause/source

5. Frame error: variation between the population as defined by
the researcher and the list of population members used by the researcher.
6. Sampling error: variation between a representative sample and
the sample obtained by using a probability sampling method.
7. Selection error: variation between a representative sample and
the sample obtained by using a non-probability sampling method.
8. Non-response error: variation between the selected sample and the
sample that actually participates in the study.

THANK YOU FOR
LISTENING
