Day 1 & 2

Introduction
Research Design in Analysis of Survey Data

Sampling
Emerging data collection platforms
Ethics
ANALYSIS OF SURVEY DATA (SPSS/STATA)
Kiplimo Araap Langat, PhD
Alexander Brookes Associates Limited,

The Old Church, Quicks Road Wimbledon, London, SW19 1EX, United Kingdom
trainer@alexanderbrookes.com
INTRODUCTION TO SURVEY DATA AND METHODOLOGIES
July 26, 2016

What you will get
Certificate
Directly Applicable Action Point(s)
Follow-up Q&A with us for 3 months (via
trainer@alexanderbrookes.com )
Access to regular updates via our newsletter & knowledge
centre

Course Objectives
1 Introduction to survey data and methodologies

2 Introduction to SPSS/STATA
3 Introduction to Basic Inferential Statistics
4 Introduction to Linear Regression Analysis
5 Introduction to limited dependent variable models

Course Outline
1 Introduction
Definitions
CBN Surveys
2 Research Design in Analysis of Survey Data
3 Sampling
Instruments and Measurements
4 Emerging data collection platforms
5 Ethics

What is the research problem?
This is an area of conflict, concern, or controversy (a gap
between what is wanted and what is observed).
Include the most relevant reference that supports the claim. [1]
[1] From research problem to research questions
Statement of the Problem
This should include:
a A clear statement that the problem exists (specific and
feasible),
b Evidence that supports the existence of the problem,
c Evidence of an existing trend (primary cause) that has led to the
problem,
d A clear description of the setting.

Evaluating your problem
It is extremely important to evaluate your research problem in
light of the financial resources at your disposal, the time available,
and your own knowledge in the field of study.
It is equally important to identify any gaps in your knowledge
of relevant disciplines, such as statistics, required for analysis.
Also, ask yourself whether you have sufficient knowledge
about computers and software if you plan to use them.

What to consider in selecting a problem
interest,
magnitude,
measurement of concepts,
level of expertise,
relevance,
availability of data,
ethical issues,
temporal horizons.

Difference between Research in Academia and Policy

Common mistakes in research
Insufficiently motivated research questions

Pursuing research fads
Unresearchable problem
Favored research methods
Blind data mining

Blind data mining
Data collection is only one step in a long and elaborate
process of planning, designing, and executing research.
Several activities (rituals) are needed in a research process
prior to data collection.
Jumping into data collection without careful planning may
render the data irrelevant, imperfect, or useless, and the data
collection effort may be entirely wasted.

Common Errors in Research Process
1 Population specification
2 Sampling
3 Selection
4 Non-response
5 Measurement [2]
[2] See Qualtrics
Population specification
Selecting an inappropriate population or universe from which to
obtain data.
Example:
Food manufacturers often conduct surveys of housewives, because
they are easier to contact, and it is assumed they decide what is
to be purchased and also do the actual purchasing. In this
situation there often is population specification error. The
husband may purchase a significant share of the food, and have
significant direct and indirect influence over what is bought.
For this reason, excluding husbands from samples may yield
results targeted to the wrong audience.

Sampling
Failure to obtain a representative sample.
Example:
Suppose that we collected a random sample of 500 people from the
general Nigerian adult population to gauge their financial
inclusion and, upon analysis, found it to be composed of 70%
females. This sample would not be representative of the general
adult population and would bias the results: the financial
inclusion of females would carry more weight, preventing accurate
extrapolation to the general Nigerian adult population. Sampling
error is affected by the homogeneity of the population being
studied and by the size of the sample.
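The last point can be illustrated with the standard error of a sample proportion under simple random sampling. The sketch below assumes, purely for illustration, that half of the adult population is female; the slides do not state the true share, so `ASSUMED_SHARE` is a hypothetical value.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

def z_score(observed, p, n):
    """Number of standard errors between the observed and the assumed share."""
    return (observed - p) / standard_error(p, n)

# Assumed population share of females (illustrative, not from the slides).
ASSUMED_SHARE = 0.5

se = standard_error(ASSUMED_SHARE, 500)
z = z_score(0.70, ASSUMED_SHARE, 500)
print(f"standard error: {se:.4f}, z-score: {z:.1f}")

# A larger sample, or a more homogeneous population (share far from 0.5),
# both shrink the standard error.
print(standard_error(ASSUMED_SHARE, 2000) < se)
print(standard_error(0.9, 500) < se)
```

Under that assumption, a 70% share lies many standard errors from the expected value, which points to a problem in how the sample was drawn rather than to chance.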

Selection
Selection error is the sampling error for a sample selected by a
non-probability method.
Example:
Interviewers conducting a mall intercept study have a natural
tendency to select those respondents who are the most accessible
and agreeable whenever there is latitude to do so. Such samples
often comprise friends and associates who bear some degree of
resemblance in characteristics to those of the desired
population.

Non-Response
Obtained sample differs from the original selected sample.
Example:
In some surveys, some respondents are inaccessible because they
are not at home for the initial call or call-backs. Others have
moved or are away from home for the period of the survey.
Not-at-home respondents are typically younger with no small
children, and have a much higher proportion of working wives than
households with someone at home. People who have moved or are
away for the survey period have a higher geographic mobility than
the average of the population. Thus, most surveys can anticipate
errors from non-contact of respondents. Online surveys seek to
avoid this error through e-mail distribution, thus eliminating
not-at-home respondents. In Africa the situation is more complex,
for example if you want to include nomadic communities or rural areas.

Types of data
Cross-sectional data
Time series data
Panel data
Big data

Survey Data in Developing Countries
Panel data have become increasingly available in developing
countries [IFC, 2015].
Several panel datasets have been identified by CPRC.
The World Bank has sponsored and helped design many panel
surveys, for example the Nigeria LSMS panel data sets.

Nigeria 2010-2011/2012-2013 GHS Survey Data


The GHS-Panel is supported by the Living Standards Measurement
Study - Integrated Surveys on Agriculture (LSMS-ISA) project,
undertaken by the Development Research Group at the World
Bank and implemented by the National Bureau of Statistics.
The LSMS-ISA project aims to support governments in seven
Sub-Saharan African countries to generate nationally
representative household panel data with a strong focus on
agriculture and rural development.
The GHS is a cross-sectional survey of 22,000 households
carried out annually throughout the country.
The panel component (GHS-Panel) applies to 5,000
households of the GHS, collecting additional data on multiple
agricultural activities and household consumption.
The first wave of the revised GHS and GHS-Panel was carried
out in two visits to the panel households (post-planting visit in

Challenges of Collecting Data in Developing Countries
Class discussion
Collecting Panel Data in Developing Countries: Does It Make
Sense? LSMS working paper (1986)
Ongoing debate on open data

Big Data & Central Banks
Big data: data sets that are granular, high frequency and/or
non-numeric.
They require advanced analytical techniques, governance
arrangements, and strategic objectives in central banks.
The recent financial and economic crisis has prompted
demand for data by central banks.

Big Data & Central Banks
Big data is likely to become a topic of increasing interest to
central banks in the years ahead.
This is because it is likely to change both the internal
operations of central banks, and transform the external
economic and financial systems central banks analyze.

Characteristics of Big Data
These data are high volume, often because they are reported
on a granular basis, that is, item-by-item, for example,
loan-by-loan or security-by-security.
These data are high velocity, because they are frequently
updated and, at the limit, collected and analyzed in real time.
These data are high variety, meaning they are either
non-numeric, such as text or video, or extracted from
novel sources, such as social media, internet search records or
biometric sensors.

Why Big data for Central banks?
Recent example from Kenya: Chase Bank and Social Media

Demand and Supply of Big Data
On the supply side, increases in the volume, velocity and
variety of data have been driven by technological advances
that have increased storage capacity and processing power
while lowering costs.
And on the demand side, there is increasing interest from
economic agents in understanding how analysis of their data
might enhance productivity and profits (Bakhshi et al. (2014),
Brown et al. (2014) and Einav and Levin (2013)).

The use of surveys by central banks
The importance of surveys for central banks.

Short term surveys for measuring economic activity: industrial
production (manufacturing) surveys, services surveys, etc.
Surveys of leading indicators: purchasing managers index.
Surveys on investment spending and investment projects.
Surveys on business conditions and sentiment: outlook for
business activity, sales, production, confidence and corporate
balance sheets.
Surveys of consumers/households: consumer confidence,
income and expenditure conditions, indebtedness,
employment.

The use of surveys by central banks cont...
Surveys of economic expectations and forecasts (inflation
expectations, interest rates, economic growth, etc.) compiled
from groups of economic analysis and consulting firms or from
companies.
Surveys on the external sector: international travelers, workers'
remittances, export prospects, foreign direct investment.
Surveys on monetary and financial conditions: credit supply
availability and cost, corporate indebtedness.
Surveys regarding currency issues: need for cash balances, etc.
Currently, surveys on the effectiveness of financial inclusion policies.

Common Practice
Globally, central banks are active collectors and compilers of
statistics, in particular with respect to money and banking,
external statistics, and finance [IFC, 2015].
Traditionally they have used a full reporting or census
approach.
However, increasingly they are also starting to rely on survey
methods.
There are a number of reasons for this, including
1 the reduction of compilation costs
2 the easing of reporting burdens on respondents
3 the need to speed up the process of gathering information in
order to obtain more timely statistics

Surveys can also be used more readily to collect qualitative
information, e.g. inflation expectations and
consumer/business sentiment.
Sometimes survey methods can also provide more statistical
rigour in compilation techniques (for instance the calculation
of representative interest rates), and facilitate the monitoring
of particular developments and innovations (such as financial
innovations in payment systems and financial literacy of the
population).
Finally, in some cases surveys are part of coordinated
international data collection efforts, e.g. the IMF Coordinated
Portfolio Investment Survey (CPIS) and the Bank for
International Settlements (BIS) Triennial Survey of Foreign
Exchange and Derivative Market Activity.


CBN Surveys

The CBN Inflation Attitudes Survey Q1 2012

CBN Monetary survey
Is this a survey or census?

Why Analysis of Survey Data


Research Design & Analytical Approach of survey Data

The dominant analytical approach inside central banks has
been deductive.
A deductive approach starts from a general theory and then
seeks particular data to evaluate it.
Suppose an analyst starts by positing an accounting identity
that the product of the quantity of money (M) and its
velocity (V) is equal to the product of the price level (P) and
expenditures on goods and services in the economy (Q).
If the analyst further assumes that the velocity of money is
stable, then an increase in money might be hypothesised to
result in inflation.
The analyst might then seek to test the validity of the theory
using money and price data over a particular period of time.
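The identity and the step from stable velocity to inflation can be written out as below. This is a standard log-difference rearrangement offered as a sketch; the growth-rate notation ($\Delta \ln$) is an addition that does not appear on the slides.

```latex
\begin{align}
  M V &= P Q \\
  \Delta \ln M + \Delta \ln V &= \Delta \ln P + \Delta \ln Q \\
  \Delta \ln P &\approx \Delta \ln M - \Delta \ln Q
  \quad \text{if } \Delta \ln V \approx 0
\end{align}
```

With velocity stable, money growth in excess of the growth in real expenditure shows up as inflation, which is the hypothesis the analyst would take to the data.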
Induction
An alternative research strategy is induction.

An inductive approach starts from data and then seeks to
generate a theoretical explanation of them.
Induction may mitigate confirmation bias, that is, the
tendency to seek data which confirms ex-ante assumptions.

The Trinity of Research

There are three fundamentals of research, dubbed the trinity of
research: design, measurement, and analysis.
Figure 1: Essential characteristics by design, measurement, & analysis
The Trinity and Validity
The figure above presents a schematic view of the essential
roles of design, measurement, and analysis and the main type
of validity associated with each.
Internal validity refers to the approximate truth of inferences
about causal relations;
External validity refers to whether inferences hold across
variations in cases, treatments, settings, or measures;
Construct validity refers to the correct measurement of the
variables that the researcher intends to study;
Conclusion validity, also called statistical conclusion
validity, refers to the appropriate use of statistical methods in
the analysis to estimate relations between variables of interest.

Research Design
Research design is a comprehensive plan for data collection in an
empirical research project. It is a blueprint for empirical research
aimed at answering specific research questions or testing specific
hypotheses, and must specify at least three processes:
1 The data collection process;
2 The instrument development process; and
3 The sampling process.

Data Collection Methods
Data collection methods can be broadly grouped into two
categories:
Positivist, and
Interpretive.

Positivist & Interpretive
Positivist methods, such as laboratory experiments and survey
research, are aimed at theory (or hypotheses) testing,
while interpretive methods, such as action research and
ethnography, are aimed at theory building.
Positivist methods employ a deductive approach to research,
starting with a theory and testing theoretical postulates using
empirical data.

Qualitative Vs Quantitative
Interpretive methods employ an inductive approach that starts
with data and tries to derive a theory about the phenomenon
of interest from the observed data.
Often, these methods are incorrectly equated with
quantitative and qualitative research.
Quantitative and qualitative methods refer to the type of
data being collected and analyzed (i.e., using quantitative
techniques such as regression or qualitative techniques such
as coding):
Quantitative data involve numeric scores, metrics, and so
on, while
Qualitative data include interviews, observations, and so
forth.

Qualitative & Quantitative
Positivist research uses predominantly quantitative data, but
can also use qualitative data.
Interpretive research relies heavily on qualitative data, but can
sometimes benefit from including quantitative data as well.
Mixed-mode designs that combine qualitative and quantitative
data are often highly desirable.

Key Attributes of a Research Design
The quality of research designs can be defined in terms of four
key design attributes:
internal validity,
external validity,
construct validity, and
statistical conclusion validity.

Internal, external validity and survey Data
Figure 2: Source: [?]

Internal Validity
Examines whether the observed change in a DV is indeed
caused by a corresponding change in the hypothesized IV, and
not by variables extraneous to the research context.
Causality requires four conditions:
1 Covariation of cause and effect (the association between
cause and effect must be consistent),
2 Temporal precedence (time order or contiguity): the cause
must precede the effect in time and space, not vice versa,
3 No plausible alternative explanation (no spurious
correlation),
4 Rationale: there must be a rationale that explains and
justifies causality.

External Validity
External validity, or generalizability, refers to whether the
observed associations can be generalized from the sample to the
population (population validity), or to other people,
organizations, contexts, or time (ecological validity). [3]
[3] Example: can findings from a sample of financial firms in Nigeria
be generalized to the population of financial firms (population
validity) or to other firms within Nigeria (ecological validity)?
Survey research, where data is sourced from a wide variety of
individuals, firms, or other units of analysis, tends to have broader
generalizability than laboratory experiments.
External Validity
Ideally, studies should have both high internal and high
external validity.
No single method achieves this.
Using multiple methods is important.
Researchers' choice of designs is ultimately a matter of their
personal preference and competence, and the level of internal
and external validity they desire.


Construct Validity
Examines how well a given measurement scale is measuring
the theoretical construct that it is expected to measure.
Construct validity is assessed in positivist research based on
correlation or factor analysis of pilot test data.
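A minimal sketch of the correlation check: items intended to measure the same construct should correlate strongly with each other and weakly with unrelated items. The pilot responses below are invented for illustration (1 to 5 Likert scores), and the item names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical pilot-test responses from seven respondents.
item_a = [1, 2, 3, 4, 5, 4, 2]      # intended to measure the construct
item_b = [2, 2, 3, 5, 5, 4, 1]      # intended to measure the same construct
unrelated = [3, 5, 1, 2, 4, 1, 5]   # intended to measure something else

print(f"item_a vs item_b:    {pearson(item_a, item_b):.2f}")
print(f"item_a vs unrelated: {pearson(item_a, unrelated):.2f}")
```

A high correlation between `item_a` and `item_b` together with a low one against `unrelated` is the pattern a researcher would look for; a full assessment would typically use factor analysis on a larger pilot sample.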

Statistical Conclusion Validity
Examines the extent to which conclusions derived using a
statistical procedure are valid. It examines:
whether the right statistical method was used for hypotheses
testing,
whether the variables used meet the assumptions of that
statistical test (such as sample size or distributional
requirements), and so forth.
Because interpretive research designs do not employ statistical
tests, statistical conclusion validity is not applicable to such
analysis.

Improving internal and External Validity
The best research designs are those that can assure high levels
of internal and external validity.
Such designs would guard against spurious correlations, inspire
greater faith in the hypotheses testing, and ensure that the
results drawn from a small sample are generalizable to the
population at large.

Five Ways to Assure Internal Validity
Controls are required to assure internal validity (causality) of
research designs, and can be accomplished in five ways:
manipulation,
elimination,
inclusion,
statistical control, and
randomization.

Manipulation
The researcher manipulates the IVs in one or more levels
(called treatments), and compares the effects of the
treatments against a control group where subjects do not
receive the treatment.

Elimination
This technique relies on eliminating extraneous variables by
holding them constant across treatments, such as by
restricting the study to a single gender or a single
socioeconomic status.

Inclusion
In this technique, the role of extraneous variables is considered
by including them in the research design and separately
estimating their effects on the dependent variable, such as via
factorial designs where one factor is gender (male versus
female).

Statistical control
In this technique, the extraneous variables are measured and
used as covariates during the statistical testing process.
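One way to see what a covariate does is to partial it out of both the IV and the DV and then relate the residuals (the Frisch-Waugh approach, which gives the same coefficient as including the covariate in a multiple regression). The data below are invented so that the DV depends only on the covariate; all names are hypothetical.

```python
def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]

def controlled_slope(iv, dv, covariate):
    """Slope of dv on iv after partialling the covariate out of both."""
    return slope(residuals(covariate, iv), residuals(covariate, dv))

# Hypothetical data: dv is driven entirely by the covariate.
covariate = [1, 2, 3, 4, 5, 6]
iv = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]   # correlated with the covariate
dv = [2 * c for c in covariate]

print("naive slope:     ", slope(iv, dv))
print("controlled slope:", controlled_slope(iv, dv, covariate))
```

In this constructed case the naive slope is large even though the IV has no effect of its own, and it vanishes once the covariate is controlled for.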

Randomization
This technique is aimed at cancelling out the effects of extraneous
variables through a process of random sampling, if it can be
assured that these effects are of a random (non-systematic) nature.
Two types of randomization are:
Random selection, where a sample is selected randomly from
a population, and
Random assignment, where subjects selected in a
non-random manner are randomly assigned to treatment
groups.
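The two types of randomization can be sketched with Python's standard `random` module. The frame of 1,000 unit IDs, the sample size of 100, and the seed are arbitrary choices made for illustration.

```python
import random

rng = random.Random(2016)  # fixed seed so the illustration is reproducible

# Hypothetical sampling frame of 1,000 unit IDs.
frame = list(range(1, 1001))

# Random selection: draw 100 units from the frame without replacement.
sample = rng.sample(frame, 100)

# Random assignment: shuffle the selected units, then split them in half.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:50], shuffled[50:]

print(len(sample), len(treatment), len(control))
```

Random assignment works the same way when `sample` comes from a non-random source, such as a list of volunteers.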

Randomization and external validity
Randomization also assures external validity, allowing
inferences drawn from the sample to be generalized to the
population from which the sample is drawn.
Note: random assignment is mandatory when random
selection is not possible because of resource or access
constraints.
However, generalizability across populations is harder to
ascertain, since populations may differ on multiple dimensions
and you can only control for a few of those dimensions.

Popular research designs
As noted before, positivist designs are meant for theory
testing, while interpretive designs are meant for theory
building.
Positivist designs seek to generalize patterns based on an
objective view of reality, while
interpretive designs seek subjective interpretations of social
phenomena from the perspectives of the subjects involved.

Examples
Some popular examples of positivist designs include:
laboratory experiments,
field experiments,
field surveys (cross-sectional & longitudinal),
secondary data analysis,
case research,
focus group research, and
action research,
while examples of interpretive designs include case research,
phenomenology, and ethnography.

Experimental Design
Experimental designs are intended to test cause-effect relationships
(hypotheses) in a tightly controlled setting by separating the cause
from the effect in time, administering the cause to one group of
subjects (the treatment group) but not to another group (the control
group), and observing how the mean effects vary between subjects
in these two groups. [4]
[4] Example: administer the drug to subjects in the treatment group, but only
give a placebo to the control group.
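The comparison of mean effects between the two groups can be sketched as a simple difference in means. The outcome scores below are invented for illustration; a real analysis would also test whether the difference is statistically significant.

```python
from statistics import mean

def mean_difference(treatment, control):
    """Difference between the mean outcome of the two groups."""
    return mean(treatment) - mean(control)

# Hypothetical outcome scores for five subjects per group.
treatment_outcomes = [7.1, 6.8, 7.5, 6.9, 7.3]
control_outcomes = [6.2, 6.0, 6.5, 6.1, 6.4]

effect = mean_difference(treatment_outcomes, control_outcomes)
print(f"estimated mean effect: {effect:.2f}")
```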
Experimental Design
More complex designs may include multiple treatment groups
or multiple treatments, such as combining drug
administration with dietary interventions.
In a true experimental design, subjects must be randomly
assigned to each group.
If random assignment is not followed, then the design
becomes quasi-experimental.
Experiments can be conducted in an artificial or laboratory
setting such as at a university (laboratory experiments) or in
field settings such as in an organization where the
phenomenon of interest is actually occurring (field
experiments).

Field Surveys
Field surveys are non-experimental designs.
They do not control for or manipulate IVs or treatments;
they measure IVs and test their effects using statistical
methods.
They capture snapshots of practices, beliefs, or situations
from a random sample of subjects in field settings through a
survey questionnaire or a structured interview.

Cross-sectional & Longitudinal
In cross-sectional field surveys, IVs and DVs are measured at the
same point in time (e.g., using a single questionnaire), while
in longitudinal field surveys, DVs are measured at a later point
in time than the IVs.

Cross-sectional & Longitudinal
The strengths of field surveys are their external validity, since
data is collected in field settings,
their ability to capture and control for a large number of
variables, and
their ability to study a problem from multiple
perspectives or using multiple theories.
Because of their non-temporal nature, internal validity is difficult
to infer, and surveys may be subject to respondent biases.

Secondary Data Analysis
Secondary data analysis is an analysis of data that has
previously been collected and tabulated by other sources.
Examples:
Government agencies, e.g. statistical bureaus,
Development statistics, e.g. UNDP, World Bank,
Other researchers,
Third-party data, such as financial data from stock markets or
real-time auction data from eBay.

The limitations of this design are that the data might not have
been collected in a systematic or scientific manner and hence
may be unsuitable for scientific research, they may not adequately
address the research questions of interest to the researcher, and
internal validity is problematic if the temporal precedence between
cause and effect is unclear.

Case Study
A case study is an in-depth investigation of a problem in one or more
real-life settings (case sites) over an extended period of time.
Case studies can be positivist in nature (for hypotheses
testing) or interpretive (for theory building).
The strength of this research method is its ability to discover
a wide variety of social, cultural, and political factors
potentially related to the phenomenon of interest that may
not be known in advance.

Focus Group
A focus group is a type of research that involves bringing in a small
group of subjects (typically 6 to 10 people) at one location, and having
them discuss a phenomenon of interest for a period of 1.5 to 2
hours. The discussion is moderated and led by a trained facilitator,
who sets the agenda, poses an initial set of questions for
participants, and makes sure that the ideas and experiences of all
participants are represented.

The facilitator attempts to build a holistic understanding of the
problem situation based on participants' comments and experiences.
Internal validity cannot be established due to lack of controls,
and the findings may not be generalized to other settings
because of the small sample size.
Focus groups are not generally used for explanatory or
descriptive research, but are more suited for exploratory
research.

Action Research
Action research assumes that complex social phenomena are best
understood by introducing interventions or actions into those
phenomena and observing the effects of those actions.
Often, the researcher is a consultant or an organization
member embedded within a social context such as an
organization, e.g. during the introduction of new technologies.

Action Research
The researcher's choice of actions must be based on theory,
which should explain why and how such actions may cause the
desired change.
The researcher then observes the results of that action,
modifying it as necessary, while simultaneously learning from
the action and generating theoretical insights about the target
problem and interventions.
The initial theory is validated by the extent to which the
chosen action successfully solves the target problem.

Ethnography
Ethnography is an interpretive research design inspired by
anthropology that emphasizes that a research phenomenon must be
studied within the context of its culture.
The researcher is deeply immersed in a certain culture over an
extended period of time (8 months to 2 years), and during
that period, engages, observes, and records the daily life of
the studied culture, and theorizes about the evolution and
behaviors in that culture.

Selecting Research Design
Researchers tend to select those research designs that they are
most comfortable with and feel most competent to handle.
Ideally, the choice should depend on the nature of the
research phenomenon being studied.
When the research problem is unclear, a focus group (for
individual unit of analysis) or a case study (for organizational
unit of analysis) is an ideal strategy for exploratory research.

When there are no good theories to explain the phenomenon
of interest, interpretive designs such as case research or
ethnography may be useful for building a theory.
If competing theories exist and the researcher wishes to test
these different theories or integrate them into a larger theory,
positivist designs such as experimental design, survey
research, or secondary data analysis are more appropriate.

Regardless of the specific research design chosen, the
researcher should strive to collect quantitative and qualitative
data using a combination of techniques such as questionnaires,
interviews, observations, documents, or secondary data.
Even in a highly structured survey questionnaire, intended to
collect quantitative data, the researcher may leave some room
for a few open-ended questions to collect qualitative data that
may generate unexpected insights not otherwise available
from structured quantitative data alone.


Population and Sampling
Selecting Participants?
The research question will dictate the type of participants
selected for the study.
You also need to match the participants to the instrumentation
and methods.

Definition of terms
Population refers to an entire group of elements with common
characteristics that you want to generalize about.
Sampling frame is a list from which you can draw your sample.
Sampling is the process whereby a small proportion or
subgroup of a population is selected for analysis.
Sample refers to the small subgroup which is thought to be
representative of the larger population.

Important Consideration
The sample needs to be representative of your population of
interest.
Generalizability (external validity) of your results is dependent
on this factor!

Representative Sample
A sample is used to make observations and statistical inferences
about the population.
Inferences derived from a good representative sample of the
population can be generalized back to the population of
interest.
Improper and biased sampling leads to divergent and
erroneous inferences.

Steps in Sampling Process
1 Identify the target population

2 Identify the accessible population (Sampling frame)
3 Determine the size of the sample needed
4 Select the sampling technique
5 Implement the plan

Sampling Technique
Sampling techniques can be grouped into two broad categories:
1 probability (random) sampling and
2 non-probability sampling
Probability sampling is ideal if generalizability of results is
important for your study, but there may be unique circumstances
where non-probability sampling can also be justified.

Introduction
Ethics
Probability Sampling
All probability sampling techniques have two attributes in common:
1 every unit in the population has a known non-zero probability
of being sampled, and
2 the sampling procedure involves random selection at some
point.

Introduction
Ethics
Types of Probability Sampling
Simple random sampling

Systematic Sampling
Stratified Sampling
Cluster Sampling
Matched-Pairs Sampling
Multi-stage Sampling

Introduction
Ethics
Simple Random Sampling
In this technique, all possible subsets of a population (more
accurately, of a sampling frame) are given an equal probability
of being selected.
Simple random sampling involves randomly selecting
respondents from a sampling frame; with large sampling
frames, a table of random numbers or a computerized
random number generator is usually used.
If you wish to select 50 firms to survey from a list of 180
firms, and this list is entered into a spreadsheet like Excel, you
can use Excel's RANDBETWEEN() function to generate a
random number for each of the 180 firms on that list.
(Class demo.)
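The spreadsheet approach above can also be sketched in a few lines of Python. This is a minimal illustration only: the firm labels, the seed and the function name simple_random_sample are assumptions made for the example, not part of the course material.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n distinct units drawn at random from the sampling frame."""
    rng = random.Random(seed)      # seeded generator so the draw can be repeated
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: 180 firms labelled firm_001 ... firm_180
frame = [f"firm_{i:03d}" for i in range(1, 181)]
sample = simple_random_sample(frame, 50, seed=2016)
print(len(sample), "firms selected; first five:", sample[:5])
```

Because the draw is without replacement, no firm can appear twice in the sample.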

Introduction
Ethics
Systematic Sampling
In this technique, the sampling frame is ordered according to
some criteria and elements are selected at regular intervals
through that ordered list.
It involves a random start and then proceeds with the selection
of every kth element from that point onwards, where k = N/n.
Here k is the ratio of the sampling frame size N to the desired
sample size n, and is formally called the sampling ratio.
It is important that the starting point is not automatically the
first in the list, but is instead randomly chosen from within
the first k elements on the list.
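A minimal sketch of the procedure just described, assuming a hypothetical frame of 200 numbered units and a desired sample of 20 (so k = 10). The function name and the use of integer division for k are assumptions for illustration; when N/n is not a whole number, a real design needs an explicit rule for the remainder.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every k-th unit after a random start within the first k units."""
    rng = random.Random(seed)
    k = len(frame) // n            # sampling ratio k = N/n (floored here)
    start = rng.randrange(k)       # random start among the first k elements
    return [frame[start + i * k] for i in range(n)]

# Hypothetical ordered sampling frame of 200 units
frame = list(range(1, 201))
sample = systematic_sample(frame, 20, seed=26)
print(sample)
```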

Introduction
Ethics
Stratified sampling
The sampling frame is divided into homogeneous and
non-overlapping subgroups (called strata), and a simple
random sample is drawn within each subgroup.
Where the strata are of unequal size, an alternative technique
is to select subgroup samples in proportion to their size
in the population.
This technique is called proportional stratified sampling.
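Proportional stratified sampling can be sketched as follows. The strata (small, medium and large firms), their sizes and the function name are hypothetical, chosen only so that the proportions work out to whole numbers; with other sizes the rounded stratum samples may not add up exactly to n.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(units, stratum_of, n, seed=None):
    """Draw a simple random sample within each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        n_h = round(n * len(members) / len(units))   # stratum share of the total sample
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical frame: 120 small, 60 medium and 20 large firms
units = ([("small", i) for i in range(120)]
         + [("medium", i) for i in range(60)]
         + [("large", i) for i in range(20)])
sample = proportional_stratified_sample(units, lambda u: u[0], 20, seed=3)
print({name: len(chosen) for name, chosen in sample.items()})
```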

Introduction
Ethics
Cluster Sampling
If you have a population dispersed over a wide geographic

region, it may not be feasible to conduct a simple random
sampling of the entire population.
In such a case, it may be reasonable to divide the population
into clusters (usually along geographic boundaries), randomly
sample a few clusters, and measure all units within those
clusters.
However, the variability of sample estimates in a cluster
sample is generally higher than in a simple random sample, so
the results are less generalizable to the population.
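As a rough sketch of cluster sampling, the example below picks whole clusters at random and keeps every unit inside them. The districts, household labels and function name are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # sample cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: 10 districts with 20 households each
clusters = {
    f"district_{d}": [f"d{d}_household_{h}" for h in range(1, 21)]
    for d in range(1, 11)
}
selected = cluster_sample(clusters, 3, seed=7)
print({name: len(units) for name, units in selected.items()})
```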

Introduction
Ethics
Matched-Pairs Sampling
Used to compare two subgroups within one population based
on a specific criterion, e.g. adoption of a given technology by
farmers.
You categorize a sampling frame of farmers into adopters and
non-adopters.
You would then select a simple random sample of farmers in one
subgroup, and match each farmer in this group with a farmer
in the second subgroup, based on farm size, education,
gender, and/or other matching criteria.
This technique is ideal for understanding bipolar differences
between different subgroups within a given population.

Introduction
Ethics
Multi-stage Sampling
Depending on your sampling needs, you may combine the
single-stage techniques above to conduct multi-stage sampling.
For instance, you can stratify a list of businesses based on firm
size, and then conduct systematic sampling within each
stratum.

Introduction
Ethics
Non-Probability Sampling
In non-probability sampling, some units of the population have
zero chance of selection, or the probability of selection cannot
be accurately determined.
Units are selected based on certain non-random criteria, such
as quota or convenience.
Because selection is non-random, non-probability sampling
does not allow the estimation of sampling errors, and
may be subject to sampling bias.
Information from such a sample cannot be generalized back to
the population.

Introduction
Ethics
Types of non-probability sampling techniques
Convenience sampling;
Quota sampling;
Expert sampling;
Snowball sampling;

Introduction
Ethics
Sampling Distribution
Responses from several respondents can be graphed into a
frequency distribution.
With a large number of observations, this distribution tends to
resemble a normal distribution (bell-shaped curve).
It can be used to estimate overall characteristics of the sample,
e.g. the sample mean and SD (called sample statistics).
Population characteristics are always unknown, and are called
population parameters (and not statistics, because they are not
statistically estimated from data).

Introduction
Ethics
If a sample is truly representative of the population, then the

estimated sample statistics should be identical to theoretical
population parameters.
Sample statistics may differ from population parameters if the
sample is not perfectly representative of the population;
The difference between the two is called sampling error.

Introduction
Ethics
Theoretically, if we could gradually increase the sample size so

that the sample approaches closer and closer to the
population, then
Sampling error will decrease and a sample statistic will
increasingly approximate the corresponding population
parameter.

Introduction
Ethics
How do we know if the sample statistics are at least
reasonably close to the population parameters?
The concept of a sampling distribution helps here.
Take three different random samples from a given population
and, for each sample, derive the sample mean and standard
deviation.
If each random sample were truly representative of the
population, the three sample means would be identical (and
equal to the population parameter), and the variability in
sample means would be zero.
In practice each sample differs slightly, so the sample means vary.
Hence, a sampling distribution is a frequency distribution of a
sample statistic (like the sample mean) from a set of samples,
whereas the commonly referenced frequency distribution is the
distribution of a response (observation) from a single sample.
Introduction
Ethics
The variability or spread of a sample statistic in a sampling

distribution (i.e., the standard deviation of a sampling
statistic) is called its standard error.
In contrast, the term standard deviation is reserved for
variability of an observed response from a single sample.
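The distinction between standard deviation and standard error can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the population is synthetic (10,000 normally distributed values), and the sample size (100) and number of samples (500) are arbitrary.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 values (e.g. ages), mean about 40, SD about 12
population = [rng.gauss(40, 12) for _ in range(10_000)]

def sample_means(population, sample_size, n_samples, rng):
    """Mean of each of n_samples random samples: an empirical sampling distribution."""
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(n_samples)]

means = sample_means(population, 100, 500, rng)

# Standard error: the standard deviation of the sample statistic across samples
standard_error = statistics.stdev(means)

# For comparison: population SD divided by the square root of the sample size
expected = statistics.pstdev(population) / 100 ** 0.5

print(f"SD of population: {statistics.pstdev(population):.2f}")
print(f"standard error of the mean (simulated): {standard_error:.2f}")
print(f"population SD / sqrt(n): {expected:.2f}")
```

The two last figures should be close to each other and much smaller than the standard deviation of the individual observations.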

Introduction
Ethics
Confidence Interval
Based on the standard error, it is also possible to estimate
confidence intervals for the prediction of a population
parameter.
All normal distributions tend to follow a 68-95-99 percent
rule:
1 (Sample statistic ± 1 standard error) represents a 68%
confidence interval for the population parameter.
2 (Sample statistic ± 2 standard errors) represents a 95%
confidence interval.
3 (Sample statistic ± 3 standard errors) represents a 99%
confidence interval.
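A minimal sketch of an interval of the form (sample statistic ± 2 standard errors) for a sample mean. The ten scores are made up, and estimating the standard error from a single sample as s/√n is the usual textbook shortcut assumed here, not something stated on the slide.

```python
import statistics

def confidence_interval(observations, n_standard_errors=2):
    """Interval of mean +/- a chosen number of standard errors."""
    mean = statistics.mean(observations)
    # Standard error estimated from one sample: sample SD / sqrt(n)
    se = statistics.stdev(observations) / len(observations) ** 0.5
    return mean - n_standard_errors * se, mean + n_standard_errors * se

# Hypothetical responses from 10 respondents on a 1-5 item
scores = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]
low, high = confidence_interval(scores)        # roughly a 95% interval
print(f"mean = {statistics.mean(scores):.2f}, interval = ({low:.2f}, {high:.2f})")
```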

Introduction
Ethics
Biased Sample
A sample is biased (i.e., not representative of the population)
if its sampling distribution cannot be estimated or
if the sampling distribution violates the 68-95-99 percent rule.

Introduction
Ethics
Research instruments
Depending on how the data is collected, survey research can be

divided into two broad categories:
Questionnaire surveys (mail-in, group-administered, or online

surveys), and
Interview surveys (personal, telephone, or focus group
interviews).

Introduction
Ethics
Response formats
1 Dichotomous response, where respondents are asked to

select one of two possible choices, such as true/false, yes/no,
or agree/disagree.
2 Nominal response, where respondents are presented with
more than two unordered options;
3 Ordinal response, where respondents have more than two
ordered options, such as: what is your highest level of
education: high school / college degree / graduate studies;
4 Continuous response, where respondents enter a continuous
(ratio-scaled) value with a meaningful zero point, such as
their age

Introduction
Ethics
Question content wording

Responses obtained in survey research are very sensitive to the
types of questions asked.
Is the question clear and understandable?
Is the question worded in a negative manner?
Is the question ambiguous?
Does the question have biased or value-laden words?
Is the question double-barreled?
Is the question too general?
Is the question too detailed?
Is the question presumptuous?
Is the question imaginary?
Do respondents have the information needed to correctly answer
the question?
Introduction
Ethics
Question sequencing
Questions should flow from the least sensitive to the most

sensitive,
From the factual and behavioral to the attitudinal, and
From the more general to the more specific.
Start with easy non-threatening questions that can be easily
recalled.
Never start with an open ended question.
If following an historical sequence of events, follow a
chronological order from earliest to latest.
Ask about one topic at a time.
Use filter or contingency questions as needed,

Introduction
Ethics
Golden rules in survey
People's time is valuable;

Always assure respondents about the confidentiality of their
responses, and how you will use their data;
For organizational surveys, assure respondents that you will
send them a copy of the final results;
Thank your respondents for their participation in your study;
Always pretest your questionnaire.

Introduction
Ethics
Survey delivery platform

Introduction
Ethics
Biases in survey research
Despite all of its strengths and advantages, survey research is often

tainted with systematic biases that may invalidate some of the
inferences derived from such surveys.
Five such biases are:
Non-response bias,
Sampling bias,
Social desirability bias,
Recall bias, and
Common method bias

Introduction
Ethics
Types of variable measurements
Basically, there are two types of data:

1 Qualitative and
2 Quantitative.
Qualitative data are numerically non-measurable while
Quantitative data can be measured numerically.
Most statistical analysis is based on quantitative data using
appropriate measurement of their variables.

Introduction
Ethics
Quantitative variables
Quantitative variables are also classified into two types:

1 Discrete (categorical)
2 Continuous.
A discrete variable can take only certain distinct or isolated
values in a given range, for example, number of siblings: 0, 1,
2, ..., 10.
A continuous variable can take any value in a given range, for
example, age from 0 years to 100 years.

Introduction
Ethics
Scales of Measurement
Level of measurement is the first decision to be made in
operationalizing a construct
Levels of measurement, also called rating scales, refer to the
values that an indicator can take (but say nothing about the
indicator itself).
For example, male and female (or M and F, or 1 and 2) are
two levels of the indicator gender.
According to psychologist Stanley Smith Stevens (1946),
there are four generic types of rating scales for scientific
measurements:
1 nominal,
2 ordinal,
3 interval, and
4 ratio scales
Introduction
Ethics
Figure 3: Statistical properties of rating scales

Introduction
Ethics
Nominal scales
Also called categorical scales, measure categorical data.

They are used for variables or indicators that have mutually
exclusive attributes.
Examples: gender (two values: male or female), industry type
(manufacturing, financial, agriculture, etc.), and religious
affiliation (Christian, Muslim, Jew, etc.).

Introduction
Ethics
Nominal
Even if we assign unique numbers to each value, for instance
1 for male and 2 for female, the numbers don't really mean
anything (i.e., 1 is not less than or half of 2) and could
easily have been represented non-numerically, such as M for
male and F for female.
Nominal scales merely offer names or labels for different
attribute values.
The appropriate measure of central tendency of a nominal
scale is mode, and neither the mean nor the median can be
defined.
Permissible statistics are chi-square and frequency
distribution, and only a one-to-one (equality) transformation
is allowed (e.g., 1=Male, 2=Female).
Introduction
Ethics
Ordinal scales
Are those that measure rank-ordered data, such as the

ranking of students in a class as first, second, third, and so
forth, based on their grade point average or test scores.
The actual or relative values of attributes or difference in
attribute values cannot be assessed.
For instance, the ranking of students in class says nothing about
the actual test scores of the students, or how well they
performed relative to one another.

Introduction
Ethics
Ordinal Scales
Ordinal scales can also use attribute labels (anchors) such as
"bad", "medium", and "good", or
"strongly dissatisfied", "somewhat dissatisfied", "neutral",
"somewhat satisfied", and "strongly satisfied".
In the latter case, we can say that respondents who are
somewhat satisfied are less satisfied than those who are
strongly satisfied, but we cannot quantify their satisfaction
levels.

Introduction
Ethics
Ordinal Scales
The central tendency measure of an ordinal scale can be its

median or mode, and means are uninterpretable.
Statistical analyses may involve percentiles and
non-parametric analysis, but more sophisticated techniques
such as correlation, regression, and analysis of variance, are
not appropriate.
Monotonically increasing transformation (which retains the
ranking) is allowed.

Introduction
Ethics
Interval Scales
Interval scales are those where the values measured are not
only rank-ordered, but are also equidistant from adjacent
attributes.
Example: the temperature scale (in Fahrenheit or Celsius),
where the difference between 10° and 20° is the same as that
between 20° and 30°.
Likewise, if you have a scale that asks for respondents' annual
income using the following attributes (ranges): Ksh. 0 to
10,000, Ksh. 10,000 to 20,000, Ksh. 20,000 to 30,000, and so
forth, this is also an interval scale, because the mid-points of
each range (i.e., Ksh. 5,000, Ksh. 15,000, Ksh. 25,000, etc.)
are equidistant from each other.

Introduction
Ethics
Interval scale allows us to examine how much more is one

attribute when compared to another, which is not possible
with nominal or ordinal scales.
Allowed central tendency measures include mean, median, or
mode, as are measures of dispersion, such as range and
standard deviation.
Permissible statistical analyses include all of those allowed for
nominal and ordinal scales, plus correlation, regression,
analysis of variance, and so on.
Allowed scale transformations are positive linear.
Satisfaction scale discussed earlier is not strictly an interval
scale, because we cannot say whether the difference between
strongly satisfied and somewhat satisfied is the same as that
between neutral and somewhat satisfied or between somewhat
dissatisfied and strongly dissatisfied.
Social science researchers often assume (incorrectly) that
these differences are equal so that we can use statistical
techniques intended for interval scales.
Introduction
Ethics
Ratio
Ratio scales are those that have all the qualities of nominal,
ordinal, and interval scales, and in addition, also have a true
zero point (where the value zero implies lack or nonavailability
of the underlying construct).
Most measurements in the natural sciences and engineering,
such as mass, incline of a plane, and electric charge, employ
ratio scales, as do some social science variables such as age,
tenure in an organization, and firm size (measured as
employee count or gross revenues).

Introduction
Ethics
Example
The Kelvin temperature scale is also a ratio scale, in contrast

to the Fahrenheit or Celsius scales, because the zero point on
this scale (equaling -273.15 degree Celsius) is not an arbitrary
value but represents a state where the particles of matter at
this temperature have zero kinetic energy.
All measures of central tendency, including geometric and
harmonic means, are allowed for ratio scales, as are ratio
measures, such as the coefficient of variation.
All statistical methods are allowed.
Sophisticated transformations such as positive similar (e.g.,
multiplicative or logarithmic) are also allowed.
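The properties listed on the preceding slides for the four generic scales can be collected into a small lookup table. This is only a summary sketch of what the slides state; the dictionary layout and the helper name mean_is_meaningful are assumptions for illustration.

```python
# Summary of the four generic scales as described on the slides
SCALES = {
    "nominal": {
        "central_tendency": ["mode"],
        "transformation": "one-to-one (equality)",
    },
    "ordinal": {
        "central_tendency": ["median", "mode"],
        "transformation": "monotonically increasing",
    },
    "interval": {
        "central_tendency": ["mean", "median", "mode"],
        "transformation": "positive linear",
    },
    "ratio": {
        "central_tendency": ["mean", "median", "mode",
                             "geometric mean", "harmonic mean"],
        "transformation": "positive similar (multiplicative)",
    },
}

def mean_is_meaningful(scale):
    """True if the arithmetic mean is an allowed summary for this scale."""
    return "mean" in SCALES[scale]["central_tendency"]

for name in SCALES:
    print(f"{name:8s} mean allowed: {mean_is_meaningful(name)}")
```

A check like this can be run before analysis to avoid, for example, averaging the numeric codes of a nominal variable.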

Introduction
Ethics
Rating Scales
The above scales are generic scales;

Based on the four generic types of scales discussed above, we
can create specific rating scales for social science research.
The most common rating scales include binary, Likert,
semantic differential, or Guttman scales.

Introduction
Ethics
Binary
Binary scales are nominal scales consisting of binary items
that assume one of two possible values, such as yes or no, true
or false, and so on; for example, gender (male, female).
Figure 4: A six-item binary scale for measuring political activism

Introduction
Ethics
Likert scale
Designed by Rensis Likert, this is a very popular rating scale

for measuring ordinal data in social science research.
This scale includes Likert items that are simply worded
statements to which respondents can indicate their extent of
agreement or disagreement on a five- or seven-point scale
ranging from strongly disagree to strongly agree.
Likert scales are summated scales, that is, the overall scale
score may be a summation of the attribute values of each
item as selected by a respondent.
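A summated score of this kind can be sketched as below. The five anchor labels and their 1 to 5 coding are assumptions for the example (the slide only names the two end points), and the six responses are invented.

```python
# Assumed five-point coding; only the end anchors are named on the slide
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def summated_score(responses):
    """Overall scale score: the sum of the coded values of each item."""
    return sum(LIKERT[answer] for answer in responses)

# One hypothetical respondent answering a six-item scale
answers = ["agree", "strongly agree", "neutral",
           "agree", "disagree", "strongly agree"]
print(summated_score(answers))
```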

Introduction
Ethics
Example
Figure 5: A six-item Likert scale for measuring employment self-esteem

Introduction
Ethics
Likert items allow for more granularity (more finely tuned

response) than binary items, including whether respondents
are neutral to the statement.
Three or nine values (often called anchors) may also be used,
but it is important to use an odd number of values to allow
for a neutral (or neither agree nor disagree) anchor.
Likert scales are ordinal scales because the anchors are not
necessarily equidistant, even though sometimes we treat them
like interval scales.

Introduction
Ethics
Semantic differential scale
This is a composite (multi-item) scale where respondents are

asked to indicate their opinions or feelings toward a single
statement using different pairs of adjectives framed as polar
opposites.
Unlike the Likert scale, in semantic differential scales the
statement remains constant, while the anchors (adjective
pairs) change across items.
Semantic differential is believed to be an excellent technique
for measuring people's attitudes or feelings toward objects,
events, or behaviors.

Introduction
Ethics
Example
Figure 6: A semantic differential scale for measuring attitude toward

national health insurance

Introduction
Ethics
Guttman scale
Designed by Louis Guttman, this composite scale uses a series

of items arranged in increasing order of intensity of the
construct of interest, from least intense to most intense.
For example, the construct attitude toward immigrants can be
measured using the five items shown in Figure 7.
Each item in the Guttman scale has a weight (not
indicated in the figure) which varies with the intensity of that
item, and the weighted combination of each response is used as
an aggregate measure of an observation.

Introduction
Ethics
Example
Figure 7: A five-item Guttman scale for measuring attitude toward

immigrants

Introduction
Ethics
Scaling
Scaling is a branch of measurement that involves the

construction of measures by associating qualitative judgments
about unobservable constructs with quantitative, measurable
metric units.
Stevens (1946) said, "Scaling is the assignment of objects to
numbers according to a rule."
This process of measuring abstract concepts in concrete terms
remains one of the most difficult tasks in empirical social
science research.
The outcome of a scaling process is a scale, which is an
empirical structure for measuring items or indicators of a
given construct.

Introduction
Ethics
Indexes
An index is a composite score derived from aggregating

measures of multiple constructs (called components) using a
set of rules and formulas.
It is different from scales in that scales also aggregate
measures, but these measures measure different dimensions or
the same dimension of a single construct.
A well-known example of an index is the consumer price index
(CPI), which is computed every month by the Kenya National
Bureau of Statistics.
The CPI is a measure of how much consumers have to pay for
goods and services in general, and is divided into major
categories

Introduction
Ethics
Summary
Scale (or index) construction in social science research is a
complex process involving several key decisions. Some of
these decisions are:
1 Should you use a scale or an index?
2 How do you plan to analyze the data?
3 What is your desired level of measurement (nominal, ordinal,
interval, or ratio) or rating scale?
4 How many scale attributes should you use (e.g., 1 to 10; 1 to
7; −3 to +3)?
5 Should you use an odd or even number of attributes (i.e., do
you wish to have neutral or mid-point value)?
6 How do you wish to label the scale attributes (especially for
semantic differential scales)?
7 Finally, what procedure would you use to generate the scale
items (e.g., Thurstone, Likert, or Guttman method) or index
components?
Introduction
Ethics
Validity and reliability of instrument
Reliability and validity are jointly called the psychometric
properties of measurement scales.
They are the yardsticks against which the adequacy and
accuracy of our measurement procedures are evaluated in
scientific research.
A measure can be reliable but not valid, if it is measuring
something very consistently but is consistently measuring the
wrong construct

Introduction
Ethics
Likewise, a measure can be valid but not reliable if it is

measuring the right construct, but not doing so in a consistent
manner.
In order to be valid, a test must be reliable; but reliability
does not guarantee validity.

Introduction
Ethics
Reliability
Reliability is the degree to which the measure of a construct is

consistent or dependable.
Note that reliability implies consistency but not accuracy (e.g.
a miscalibrated weight scale).
Sources of unreliable observations:
1 Observer's (or researcher's) subjectivity;
2 Asking imprecise or ambiguous questions;
3 Asking questions about issues that respondents are not very
familiar with or do not care about.

Introduction
Ethics
Creating Reliable Measures
Replacing data collection techniques that depend more on
researcher subjectivity (such as observations) with those that
are less dependent on subjectivity (such as questionnaires),
Avoiding ambiguous items in your measures
Measurement instruments must still be tested for reliability.

Introduction
Ethics
Methods of estimating Reliability
Inter-rater reliability;
Test-retest reliability;
Split-half reliability;
Internal consistency reliability

Introduction
Ethics
Inter-rater reliability
Also called inter-observer reliability, this is a measure of
consistency between two or more independent raters
(observers) of the same construct.
Usually, this is assessed in a pilot study, and can be done in two ways,
depending on the level of measurement of the construct.
If the measure is categorical, a set of all categories is defined, raters check off
which category each observation falls in, and the percentage of agreement
between the raters is an estimate of inter-rater reliability. For instance, if there
are two raters rating 100 observations into one of three possible categories, and
their ratings match for 75% of the observations, then inter-rater reliability is
0.75. If the measure is interval or ratio scaled (e.g., classroom activity is being
measured once every 5 minutes by two raters on 1 to 7 response scale), then a
simple correlation between measures from the two raters can also serve as an
estimate of inter-rater reliability.
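Both estimates described above can be sketched in Python. The rater data are invented so that the categorical example reproduces the 75% agreement case; the function names and the hand-written Pearson correlation helper are assumptions for illustration.

```python
def percent_agreement(rater_a, rater_b):
    """Share of observations that two raters put in the same category."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same observations")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def pearson(x, y):
    """Simple (Pearson) correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Categorical measure: 100 hypothetical observations, three categories
rater_a = ["low"] * 40 + ["medium"] * 35 + ["high"] * 25
rater_b = ["low"] * 40 + ["medium"] * 35 + ["low"] * 25   # disagrees on the last 25
agreement = percent_agreement(rater_a, rater_b)
print("percentage agreement:", agreement)

# Interval measure: two raters scoring the same 8 intervals on a 1-7 scale
scores_a = [3, 5, 4, 6, 2, 7, 5, 4]
scores_b = [3, 4, 4, 6, 3, 7, 5, 5]
inter_rater_r = pearson(scores_a, scores_b)
print("correlation between raters:", round(inter_rater_r, 2))
```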
Introduction
Ethics
Test-retest reliability
Test-retest reliability is a measure of consistency between two

measurements (tests) of the same construct administered to the
same sample at two different points in time.
If the observations have not changed substantially between
the two tests, then the measure is reliable.
The correlation in observations between the two tests is an
estimate of test-retest reliability.
Note here that the time interval between the two tests is
critical.
Generally, the longer is the time gap, the greater is the chance
that the two observations may change during this time (due to
random error), and the lower will be the test-retest reliability.

Introduction
Ethics
Split-half reliability
Split-half reliability is a measure of consistency between two
halves of a construct measure.
For instance, if you have a ten-item measure of a given
construct, randomly split those ten items into two sets of five
(unequal halves are allowed if the total number of items is
odd), and administer the entire instrument to a sample of
respondents.
Then, calculate the total score for each half for each
respondent, and the correlation between the total scores in
each half is a measure of split-half reliability.
The longer is the instrument, the more likely it is that the two
halves of the measure will be similar (since random errors are
minimized as more items are added), and hence, this
technique tends to systematically overestimate the reliability
of longer instruments
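The split-half procedure can be sketched as follows. The response data are simulated (50 hypothetical respondents answering ten 1 to 5 items driven by one underlying level), and the function names are assumptions for the example.

```python
import random

def pearson(x, y):
    """Simple (Pearson) correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def split_half_reliability(responses, seed=None):
    """Randomly split the items in two, total each half, correlate the totals."""
    rng = random.Random(seed)
    items = list(range(len(responses[0])))
    rng.shuffle(items)                       # random assignment of items to halves
    half = len(items) // 2
    first, second = items[:half], items[half:]
    totals_first = [sum(row[i] for i in first) for row in responses]
    totals_second = [sum(row[i] for i in second) for row in responses]
    return pearson(totals_first, totals_second)

# Simulated data: each respondent has one underlying level plus item-level noise
rng = random.Random(1)
responses = []
for _ in range(50):
    level = rng.uniform(1, 5)
    responses.append([min(5, max(1, round(level + rng.gauss(0, 0.7))))
                      for _ in range(10)])

reliability = split_half_reliability(responses, seed=1)
print("split-half reliability:", round(reliability, 2))
```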
Introduction
Ethics
Internal consistency reliability
Measure of consistency between different items of the same

construct.
If a multiple-item construct measure is administered to
respondents, the extent to which respondents rate those items
in a similar manner is a reflection of internal consistency.
This reliability can be estimated in terms of
1 average inter-item correlation,
2 average item-to-total correlation, or
3 more commonly, Cronbach's alpha.

Introduction
Ethics
Cronbach's alpha
It is a coefficient of reliability or consistency, not a statistical
test;
It gives reliability estimates measuring internal consistency of scores;
It is not a measure of homogeneity;
Neither is it a measure of uni-dimensionality.

Introduction
Ethics
Cronbach's alpha
The following are the ranges of the coefficient.
Range: 0 ≤ α ≤ 1
If α = 0: no consistency in measurement;
If α = 1: perfect consistency in measurement;
α = 0.70 means that 70% of the variance in the scores is
reliable variance; hence 30% is error variance.

Introduction
Ethics
Composite scores
Internal consistency reliability is relevant to composite scores.

Composite scores are the sum (or average) of two or more
scores, not individual item scores.
Example: Item 1 + Item 2 + Item 3 + Item 4 + Item 5
Internal consistency is about these scores

Introduction
Ethics
Cronbach's alpha formula
A reliability measure designed by Lee Cronbach in 1951, it
factors in scale size in reliability estimation and is calculated
using the following formula:
\[
\alpha = \frac{k\,\bar{r}}{1 + (k - 1)\,\bar{r}} \tag{1}
\]
where $k$ = number of indicators/items in the measure and
$\bar{r}$ = mean inter-indicator/item correlation.
This denominator is the total variance.
This is a standardized Cronbach's alpha.

Introduction
Ethics
Cronbach's alpha formula
\[
\alpha = \frac{5(.346)}{1 + (5 - 1)(.346)} = .73
\]
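The worked example can be checked with a short function implementing the standardized alpha, that is, the number of items times the mean inter-item correlation, divided by one plus (number of items minus one) times the mean inter-item correlation. The function name is an assumption for illustration.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha from item count k and mean inter-item correlation."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

# Values from the worked example: 5 items, mean inter-item correlation .346
alpha = standardized_alpha(5, 0.346)
print(round(alpha, 2))   # 0.73
```

Holding the mean correlation fixed, adding items raises alpha, which is what "factors in scale size" refers to.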

Introduction
Ethics
Validity
Often called construct validity, refers to the extent to which a

measure adequately represents the underlying construct that it
is supposed to measure.
For instance, is a measure of compassion really measuring
compassion, and not measuring a different construct such as
empathy?

Introduction
Sampling
Ethics
Data Collection Platforms
PAPI,
CAPI,
WAPI,
CASI,
CAWI,
CATI,
TAPI,
TASI,
SAPI and
SASI.

Introduction
Sampling
Ethics
PAPI and CAPI
PAPI: Paper And Pencil Interviewing. Data obtained from the

interview is filled in on a paper form using a pencil.
CAPI: Computer Assisted Personal Interviewing. This method
is very much similar to the PAPI method, but the data is
directly entered into a computer programme instead of first
using paper forms.

Introduction
Sampling
Ethics
WAPI: Web Assisted Personal Interviewing. The respondents

answer the questions online, but they are also assisted online
in doing so.
CASI: Computer Assisted Self Interviewing. The CASI method
involves respondents sitting at the computer themselves
in order to fill in the questionnaire.

Introduction
Sampling
Ethics
CAWI: Computer Assisted Web Interviewing. Online research

in which data is obtained electronically using online
questionnaires. These questionnaires contain routing references
so that the correct questions are asked of each respondent.

Introduction
Sampling
Ethics
CATI: Computer Assisted Telephone Interviewing. The
questions are usually presented to the interviewers on a
computer screen, and the interviewers then put them to the
respondents.
To ensure that the correct questions are asked of each
respondent, the specialised computer software uses "skips":
certain answers can lead to the next question being different.
This also prevents the respondent from having to answer
irrelevant questions.
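The idea of "skips" can be illustrated with a toy routing table. The questions, answer values and function name below are invented for the sketch and do not describe how any particular CATI package works.

```python
# Each entry: question text plus a routing rule that maps the answer
# to the id of the next question (None ends the interview).
QUESTIONNAIRE = {
    "employed": ("Are you currently employed?",
                 lambda answer: "sector" if answer == "yes" else "seeking"),
    "sector": ("Which sector do you work in?", lambda answer: None),
    "seeking": ("Are you looking for work?", lambda answer: None),
}

def route(answers, first="employed"):
    """Return the ids of the questions actually asked, given a respondent's answers."""
    asked = []
    current = first
    while current is not None:
        asked.append(current)
        _, next_rule = QUESTIONNAIRE[current]
        current = next_rule(answers[current])   # the skip: the answer decides what comes next
    return asked

print(route({"employed": "yes", "sector": "banking"}))
print(route({"employed": "no", "seeking": "yes"}))
```

An employed respondent is never asked whether they are looking for work, and an unemployed respondent is never asked about their sector.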

Introduction
Sampling
Ethics
TAPI, TASI and SAPI
TAPI: Tablet Assisted Personal Interviewing. This method is

virtually identical to the CAPI method, but the data is entered
into a tablet instead of a computer/laptop.
TASI: Tablet Assisted Self Interviewing. This method is
virtually identical to the CASI method, but the data is entered
into a tablet instead of a computer/laptop.
SAPI: Smartphone Assisted Personal Interviewing. With this
method, the data is entered into a smartphone by the
interviewer.
SASI: Smartphone Assisted Self Interviewing. With this
method, the data is entered into a smartphone by the
respondent.

Introduction
Sampling
Ethics
Introduction to ODK
Practical Class Sessions

Trainees will go through a practical class session of
downloading the Collect application on their smartphones and
understanding the use of Google engine to design the CAPI
platforms.
This session will also involve a fieldwork exercise of
collecting data from real situations around Abuja.

Introduction
Sampling
Ethics
Areas of Bias in Research

Introduction
Sampling
Ethics
Ethical standards are an integral part of any research design

Researcher records must be securely kept for future reference
and evidence
Multiple authorship should be clearly explained, recorded and
evidenced
Publication of multiple papers from the same data is improper
Potential conflicts of interests should be disclosed
Respondents must be fully informed about research details
that may affect them
Informed consent must be ensured and documented in all
cases
Full justification must be given where ethical standards are
thought not to be required
Research proposals must obtain approval from relevant ethics
committees
Problems arising from the research are to be communicated to
the ethics committee
Introduction
Sampling
Ethics
The Ten Commandments

Thou shall NOT
1 Include in the study or continue working with a person who
demonstrates resistance or discomfort relating to the study or
to the research topic.
2 Attempt to convince a person to take part in the study, when
this person is not in a position to respond adequately to the
research question.
3 Fail to explain all relevant aspects of the study to the
respondents before they agree to participate
4 Promise anonymity and confidentiality if it is likely that this
promise will not be honoured

5 Fail to respect the respondent's privacy
6 Deceive the respondent in any way
7 Subject respondents to procedures that may entail physical
or mental stress
8 Include in the study techniques whose degree of safety is
Introduction
Sampling
Ethics
Plagiarism
Researchers should abstain from using other people’s work

without appropriate acknowledgement.
Plagiarism means including the work of others in one's
publication without due acknowledgement, hence presenting it
as one's own.

Group Work
Practical Class Sessions

Trainees will work in groups to identify available survey datasets relevant to the CBN.
The same data will be used for the analysis in all class sessions.

CPRC (2011). Household Panel Data Sets in Developing and Transition Countries. http://www.chronicpoverty.org/uploads/publication_files/Annotated_Listing_of_Panel_Datasets_in_Developing_and_Transitional_Countries.pdf
Ashenfelter, O., A. Deaton and G. Solon (1986). Collecting Panel Data in Developing Countries: Does It Make Sense? http://www.princeton.edu/~deaton/downloads/Collecting_Panel_Data_in_Developing_Countries.pdf
Bhattacherjee, A. (2012). Social Science Research: Principles, Methods, and Practices. Textbooks Collection, Book 3. http://scholarcommons.usf.edu/oa_textbooks/3
Tissot, B., T. Hülagü, P. Nymand-Andersen and L. Comino Suarez (2015). Central banks' use of and interest in big data. IFC Report. http://www.bis.org/ifc/publ/ifc-report-bigdata.pdf
IFC (2009). The use of surveys by central banks. Proceedings of the IFC Workshops in Pune (June 2007), Buenos Aires (December 2007) and Vienna (March 2008). IFC Bulletin No. 30. http://www.bis.org/ifc/publ/ifcb30.pdf

The End
