Abiot Research Methods
Abiyot Animaw (PhD)
Email: abiotanimaw2014@gmail.com
Abiyot Animaw(PhD.) 1
Course outlines
1. The Fundamentals of Research: The scientific
Method
2. The Research Process and Preparing the
Research proposal
3. Survey and Elements of Sampling
4. Data Collection Techniques
5. Data Processing and Analysis
6. Writing the Research Report
7. Quantitative Analysis: Basic univariate and
multivariate analysis
Chapter One
The Fundamentals of
Research
The Scientific Method and Economic Research
• To understand the term research clearly, it is important
to know the meaning of the scientific method.
• The practice of economic research is informed by
innovative thinking and careful attention to the details
of data.
• It is the research methodology adopted in the process
of economic research that makes economic research
scientific.
• Thus, research methods have become a central part of
social science investigation.
• A science is a coherent body of thought about a topic
over which there is a broad consensus among its
practitioners.
• Science aims to discover universal laws about how the
world works.
• The scientific method (SM) is a conscious, objective,
logical and systematic method of investigation.
• The SM can be defined as the pursuit of truth as
determined by logical considerations.
• It refers to the ideas, rules, techniques, and approaches
that are commonly used in research.
• The scientific method uses both inductive and
deductive logical arguments.
• Inductive logic
• begins by observing facts,
• then proceeds from observations of facts to
universal laws.
• The deductive view starts with
• (a) universal law(s), and
• (b) initial conditions.
• We then show how the event we are trying to
explain follows from them.
• Then we test the generality of this explanation.
• The goal of scientific research is to test
current theories and develop new ones:
• Is the current theory consistent with the
world?
• Can we develop a new theory that is more
consistent with the world?
• The SM involves the following series of steps or
procedures:
– Identification of the problem to be investigated
– Collection of essential facts to prove or disprove
the theory
– Selection (hypothesizing) of tentative solutions to
the problem
– Evaluation of these alternative solutions to
determine which of them is in accordance with all
the facts, and
– The final selection of the most likely solution.
• Each research problem involves gathering data to
confirm or refute an existing theory
• We are not supposed to let our prior beliefs influence
our conclusions
• Because of this, we formulate our hypothesis before
we gather the data
• We do not revise the hypothesis after seeing the data
just to make sure that it will be confirmed.
• Any truly scientific theory must be refutable;
otherwise, it is just dogma.
• A sound theory is one that has withstood many
attempts to refute it.
Guiding Principles of the SM
Three useful guiding principles need to be considered:
a) Use of empirical evidence and ethical neutrality
• The scientific method is based on empirical
evidence and utilizes relevant concepts
• The goal of SM is to facilitate independent
verification of scientific observation through the
use of empirical evidence.
• It presupposes ethical neutrality, i.e., it aims at
nothing but making adequate and correct
statements about population objects.
b) Logical reasoning (Critical thinking)
• The SM practices logical reasoning, which allows the
truth to be determined through steps distinct from
emotional and wishful thinking.
• Its methodology is made known to all concerned
for critical scrutiny and for use in testing the
conclusions through replications
• Critical thinkers always use logical reasoning.
• Logic is not an ability that humans are born with
but rather it is a skill that must be learned within a
formal educational environment.
c) Possessing a Skeptical Attitude
• The final key idea is skepticism: the constant
questioning of your beliefs and conclusions.
• It requires the possession of a skeptical attitude.
• A scientific attitude (SA) implies a skeptical attitude.
• A skeptic holds beliefs tentatively and is open to new
evidence and rational argument.
The Meaning of Research
• Research begins with a question that the researcher is
trying to answer
• Research inculcates scientific reasoning and promotes
the development of logical habits of thinking.
• Hence, the term research can be broadly defined as the
scientific and systematic search or inquiry for pertinent
information or knowledge.
• It is a movement from the known to the unknown
and is a deliberate response to a need for
information in order to solve a given problem.
• It is an original contribution to the existing stock of
knowledge.
• Research comprises the following
activities:
– the defining and redefining of the problem,
– the formulation of hypotheses or suggested
solutions,
– the collection, organization and evaluation of data or
facts,
– the making of deductions and reaching conclusions.
• Research requires a specific plan or procedure
– We need to know if the question is answerable and
how.
– We need to know whether the research project is
feasible in terms of time and money
• Research usually divides a problem into more manageable
sub-problems
– “What is the current state of the Ethiopian economy?” is
vague by itself
– We could divide the economy into sectors
• by occupation: Agriculture, Household, etc.
• by industry
• Research accepts certain critical assumptions
– If you do not make assumptions, you cannot make logical
conclusions
• Research is, by its nature, cyclical
– Every research project brings answers but also new questions
– Those questions, in turn, bring new research projects
The Purpose of Research
• The purpose of research is to discover new ideas or
solutions through the application of scientific
procedures.
• Research has a clearly articulated goal
• Examples:
»Test a current theory
»Add details to a theory
»Replace a theory with a better one
»Write a new theory where none existed,
etc.
• In general, the purpose of research may be any of the
following:
Exploration
Description
Explanation
• The main factors to be considered before embarking on
research include:
Type and nature of information sought
Timing
Availability of resources
Cost/benefit analysis
Ethical considerations
• Classification of Research Activities
• Different people may use different classification
systems.
– The classification may be in terms of:
• methods employed,
• the time dimension,
• research environment or
• data used.
– Accordingly, several types of research classifications
could be identified some of which may include:
Descriptive versus Analytical Research:
• The purpose of descriptive research is to describe
the state of affairs as it exists at present.
• The main characteristic of this method is that the
researcher has no control over the variables.
• He/she can only report what has happened or what is
happening.
• Examples: the frequency of shopping by people,
the preferences of people, the number of
workers employed in a factory, etc.
• In analytical research the researcher has to use facts
or information and analyze these to make a critical
evaluation of the material.
– Analytical studies go beyond simple description in
their attempt to model empirically the social
phenomena under investigation.
• It asks “why” and tries to find the answer to a
problem.
Applied versus fundamental research:
• Research may be undertaken either to understand the
fundamental nature of a social reality (basic research)
or to apply knowledge to address specific practical
issues (applied research).
• Applied research aims at finding a solution to an
immediate, pressing problem facing a society or an
industrial or business organization.
• Applied research tries to solve specific policy problems
or help practitioners accomplish a specific task.
– Theory is less central than seeking a solution to a
specific problem in a limited setting.
• Fundamental research is mainly concerned with
generalizations and with the formulation of a theory.
• It is primarily concerned with the understanding
of the fundamental nature of social reality.
• It is the source of most scientific ideas and ways
of thinking about the world.
• It is mostly exploratory in nature.
• So, gathering knowledge for knowledge’s sake is
termed fundamental or basic research.
• It is mostly deductive: it seeks new conclusions from
current assumptions.
Quantitative versus Qualitative Research:
• Quantitative research is based on the measurement
of quantity or amount.
• It is applicable to phenomena that can be
expressed in terms of quantity.
• Most often we are testing a hypothesis.
• We collect data and see whether the hypothesis is
consistent with the data.
• Its methodology is simpler than that of qualitative research.
• But it often takes longer: identifying, collecting, and
analyzing appropriate data is difficult and expensive.
• This approach can be further subdivided into:
– inferential,
– experimental and
– simulation approaches.
• The purpose of the inferential approach is to form a
database from which to infer characteristics or
relationships of populations.
• This usually means survey research, where a sample of
the population is studied to determine its
characteristics and it is then inferred that the
population has the same characteristics.
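As a minimal sketch of the inferential approach, the Python snippet below draws a simple random sample from a made-up population and uses it to estimate the population mean with an approximate 95% interval. The population values, the sample size of 200 and the normal-approximation interval are assumptions chosen for illustration; they do not come from the course material.

```python
import math
import random
import statistics


def estimate_population_mean(sample, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, mean - z * std_error, mean + z * std_error


# Hypothetical population: monthly incomes of 10,000 households.
random.seed(1)
population = [random.gauss(3000, 800) for _ in range(10_000)]

# Study a sample, then infer a characteristic of the whole population.
sample = random.sample(population, 200)
mean, lower, upper = estimate_population_mean(sample)
print(f"Sample mean: {mean:.1f} (approx. 95% interval {lower:.1f} to {upper:.1f})")
print(f"Population mean: {statistics.mean(population):.1f}")
```

In a real survey the population mean is unknown; it is printed here only because the population is simulated, which makes it possible to see how close the inference comes.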
• The experimental approach is characterized by much
greater control over the research environment.
• Some or all of the variables are manipulated to
observe their effect on other variables.
• The simulation approach involves the construction of an
artificial environment within which relevant information
and data can be generated.
• This permits observation of the dynamic
behavior of a system under controlled conditions.
• Given values of initial conditions, parameters and
exogenous variables, a simulation is run to
represent the behavior of the process over time.
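The simulation idea can be sketched in a few lines of Python. The model below is a hypothetical one, chosen only for illustration: a quantity grows at a fixed rate (the parameter) from a starting value (the initial condition) and receives an outside shock in each period (the exogenous variable). The function name `simulate` and all numbers are assumptions.

```python
def simulate(initial_value, growth_rate, exogenous_shocks):
    """Simulate y[t+1] = (1 + growth_rate) * y[t] + shock[t]."""
    path = [initial_value]
    for shock in exogenous_shocks:
        # Each period depends on the previous value, the parameter and the shock.
        path.append((1 + growth_rate) * path[-1] + shock)
    return path


# Hypothetical run: initial value 100, growth parameter 0.5, two shocks.
path = simulate(100, 0.5, [10, 0])
for period, value in enumerate(path):
    print(f"period {period}: {value}")
```

Changing the initial value, the parameter or the shocks and re-running the function is what allows the behavior of the system to be observed under controlled conditions.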
• Qualitative Research
– Qualitative research is concerned with subjective
assessment of attitudes, opinions, and behavior.
– Qualitative Research is a function of researchers’
insights and impressions.
– It generates results, which are not subjected to
rigorous quantitative analysis.
– Generally, group interviews, projective techniques
and in-depth interviews are used.
• Qualitative research is particularly important in the
behavioral sciences.
• However, social research is essentially pluralistic:
researchers often combine quantitative and
qualitative research methods within the same study.
Mixed-method research strategies are particularly
effective in policy-oriented research and the
contribution that qualitative research can make to
policy evaluation is increasingly being recognized.
Conceptual versus Empirical Research:
– This classification is similar to the applied versus
fundamental classification.
– Conceptual research is related to some abstract idea or
theory.
• Generally used by philosophers and other similar
thinkers to develop new concepts or reinterpret
the existing ones.
• Mostly deductive: Seeks new conclusions from
current assumptions
• Empirical research relies on experience or
observation alone, often without due regard to system and
theory.
• It is data-based research that comes up with
conclusions capable of being verified by
observation or experiment.
• In empirical research the researcher first
formulates a working hypothesis.
• He/she then works to gather enough data to prove or
disprove that hypothesis.
Some other types of research:
– Research:
• Can be one time or longitudinal research,
• can be field setting or laboratory based or
simulation research,
• can also be clinical or diagnostic research,
• can be conclusion oriented or decision oriented,
etc.
• Time Dimension in Research
• Quantitative research may be divided into two
groups in terms of the time dimension:
• A single point in time (cross sectional)
• Multiple points research (longitudinal research)
• Cross-sectional research takes a snapshot
approach to the social world.
• This is the simplest and least costly research
approach.
• Limitation: it cannot capture social processes or
changes.
• Longitudinal research examines features of people or
other units more than one time.
– It is usually more complex and costly than
cross-sectional research but is also more powerful,
especially with respect to social change.
• Types of Longitudinal Research
– Time series research: a longitudinal study of a
group of people or other units across multiple
periods (e.g. time series data on exports of coffee).
– The panel study: the researcher observes exactly
the same people, group or organization across time
periods, each time using the snapshot approach.
• In a panel study the focus is on individuals or
households.
– Example: interviewing the same people in 1991,
1993, 1995, etc., and observing the changes yields
a panel data set.
• Cohort analysis is similar to the panel study, but
rather than observing exactly the same people, it studies a
category of people who share a similar life experience in
a specified period.
– Hence the focus is on a group of individuals, not on
specific individuals or households.
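To make the difference between a snapshot and a panel concrete, here is a small Python sketch. The household identifiers and income figures are invented for illustration, and the two helper functions are hypothetical names rather than part of any library.

```python
# Hypothetical panel: household id -> {survey year: income}.
panel = {
    "hh01": {1991: 1200, 1993: 1350, 1995: 1500},
    "hh02": {1991: 900, 1993: 850, 1995: 1000},
}


def cross_section(panel, year):
    """Snapshot of all units at a single point in time."""
    return {unit: obs[year] for unit, obs in panel.items() if year in obs}


def change_within_units(panel, start, end):
    """Change for each unit between two waves (needs the same units twice)."""
    return {
        unit: obs[end] - obs[start]
        for unit, obs in panel.items()
        if start in obs and end in obs
    }


print(cross_section(panel, 1993))
print(change_within_units(panel, 1991, 1995))
```

The second function only works because the same households are observed in both years, which is what distinguishes panel data from a series of unrelated cross-sections.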
Ethical Consideration in the Research Process
• Shared Values
• There is no one best way to undertake research.
– There is no universal method that applies to all
scientific investigations.
• Accepted practices for the conduct of research
can and do vary from discipline to discipline.
• There are, however, some important shared
values for the responsible conduct of research
that bind all researchers together.
Some of the most important shared values
include:
HONESTY — conveying information truthfully and
honoring commitments,
ACCURACY — reporting findings precisely and
taking care to avoid errors,
EFFICIENCY — using resources wisely and
avoiding waste, and
OBJECTIVITY — letting the facts speak for
themselves and avoiding improper bias.
During data collection
• Some ethical principles governing data collection
include: informed consent, respect for privacy and
safeguarding the confidentiality of data.
– Informed consent implies that persons who are
invited to participate in social research activities
should be free to choose to take part or refuse.
– They are free to decide after having been given the
fullest information concerning the nature and
purpose of the research, including any risks to
which they personally would be exposed, the
arrangements for maintaining the confidentiality of
the data, and so on.
– Thus, collecting data illegally, under false
pretenses, from minors, etc., is unethical.
– Getting access and consent to do research is,
therefore, essential.
• During analysis (Misuse of data)
• Fabrication and falsification of research results are
serious forms of misconduct.
– It is a primary responsibility of a researcher to avoid
either a false statement or an omission that distorts
the truth.
• In order to preserve accurate documentation of
observed facts with which later reports or conclusions
can be compared, every researcher has an obligation to
maintain a clear and complete record of data acquired.
• Records should include sufficient detail to permit
examination for the purpose of
• replicating the research,
• responding to questions that may result from
unintentional error or misinterpretation,
• establishing authenticity of the records, and
• confirming the validity of the conclusions.
When writing the research paper: Plagiarism
• Plagiarism is the unauthorized use of someone else's
thoughts or wording, whether through
• incorrect documentation,
• failing to cite your sources altogether, or
• simply relying too heavily on external resources.
• Whether intentional or inadvertent, the result is that some
or all of another author's ideas are represented as your
own.
– Plagiarizing undermines your academic integrity.
• It betrays your own responsibilities as a student
writer, your audience, and the very research
community you were entering by deciding to write a
research paper in the first place.
• Incidentally, plagiarism also covers informally obtained
material, such as a paper "bought" from another
student.
• If you feel cheating is an easy way out, and the moral
and intellectual consequences don't sound alarm bells,
stop and think of the serious punitive repercussions you
could incur.
• Because it is intellectual theft, plagiarism is considered
an academic crime, with punishment ranging from
an F on that particular paper to dismissal from the
course or expulsion from a college or university.
Chapter Two
The Research Process and
Preparing the Research Proposal
Introduction
Most research activities go through the following steps:
– Selecting a topic
– Formulating the research problem and research
questions
– Extensive literature survey
– Formulating the working hypothesis
– Preparing the research design and determining the
sample design
– Collecting and analyzing the data
– Generalizations and interpretations of results
– Preparing the report and presentation of the results
(formal write up of conclusions reached)
1. Identification of a Research Topic
• To do research, a topic or research problem must first
be identified.
What is a Research problem?
• A research problem refers to some difficulty, which a
researcher experiences in the context of either a
theoretical or practical situation and wants to obtain a
solution for it.
• A research topic should seek to advance the state of
science
• It usually starts with a felt practical or theoretical
difficulty.
• It should ask a question to which the answer is not
known
– It should ask an interesting question
– It should be as objective as possible
Some Potential Sources of a Research Topic
• A topic must spring from the researcher’s mind like a
plant springs from its own seed.
• The best way to identify a topic is to draw up a shortlist
of possible topics that have emerged from your reading
or from your own experience that look interesting.
• A general area of interest or aspect of a subject
matter (agriculture, industry, social sector, etc.) may
have to be identified at first.
Some important sources that may help in
selecting a research problem:
a) Professional Experience
• One's own professional experience is the most important
source of a research problem.
– Contacts and discussions with research-oriented
people,
– attending conferences and seminars, and
– listening to learned speakers
are all helpful in identifying research problems.
b) Inferences from theory and Professional literature
• Research problems can also emanate from inferences
that can be drawn from theories or from empirical
literature.
• Two types of literature can be reviewed.
The conceptual literature
The empirical literature
• Research reports, bibliographies of books, and articles,
periodicals, research abstracts and research guides
suggest areas that need research.
c) Technological and Social Changes
– New developments bring forth new development
challenges for research.
– Innovations and changes need to be carefully
evaluated through the research process.
• In general, the most fundamental rule in choosing a good
research topic is to investigate questions that sincerely
interest you,
• i.e. research that the researcher honestly enjoys
even when he/she encounters frustrating or
discouraging problems.
The following points are important in selecting a
research problem:
• A subject that is overdone should be avoided,
since it will be difficult for the average researcher
to throw any new light on it.
• Controversial subjects should not become the
choice of the average researcher.
• Problems that are too narrow, too broad or vague
should be avoided.
• Other criteria to consider include:
• the importance of the subject,
• the qualifications and training of the researcher,
• the cost involved and the time factor, etc.
• In general, the choice of a research topic is not made
in a vacuum and is influenced by several factors:
• Interest and Values of the Researcher,
• Current Debate in the Academic world,
• Funding,
• The value and power of the subject, etc.
Common/overused topics
• For example, if the impact of microfinance on
poverty reduction in rural Ethiopia has been
well researched, you may instead consider the
impact of microfinance on poverty reduction
in urban Ethiopia.
General/too broad topics
• General or too broad topics should be avoided.
• For example, “Why is productivity in Ethiopia
lower than in Kenya?” is too broad.
• However, “Why is labor productivity in
agriculture lower in Ethiopia than in Kenya?”
may be appropriate.
Topics related to religion,
politics/controversy
• Controversies have the propensity to arouse
emotions among people, because the
surrounding issues are highly subjective and
sensitive.
• Select such a topic only if it is required by the programme
of study.
Too narrow topics
• Picking a topic that is too narrow should be
avoided, because it will be nearly impossible to find
enough information to conduct the research.
• For example, consider the research topic “Why did Fasil
break up with Sara?” This topic is too narrow and
focused on a single event.
• If the topic is changed to “Determinants of
breakups in relationships among undergraduate
students”, it becomes more
researchable.
Research gaps and topic selection
• Research gap-explained
Exercise: identify research gaps in the
text
2. Definition and Statement of the Problem
– After a topic has been selected the next task is to
define it clearly.
• To define a problem means to put a fence around
it.
• It involves the task of laying down the boundaries
within which a researcher shall study the
problem.
• The researcher must be certain that he/she knows
exactly what the problem is before
beginning work on it.
• A problem clearly defined is a problem half
solved.
• Defining the problem unambiguously will help to find
answers to questions like:
– What data are to be collected?
– What characteristics of the data are relevant and need
to be studied?
– What relations are to be explored?
– What techniques are to be used for the purpose?
• Hence, in the formal definition of the problem the
researcher is required to describe the background of the
study, its theoretical basis and underlying assumptions, and
to state the problem in concrete, specific and workable
questions.
Useful steps in defining the research problem:
a) Statement of the problem in a general way
– Problem should be stated in a broad and general
way keeping in mind either some practical concern
or some scientific or intellectual interest.
b) Understanding the nature of the problem more
clearly
– The next step is to understand the origin and nature
of the problem clearly.
– The best way to understand the problem is to
discuss it with people who are more familiar with it
or more experienced.
c) Survey of the available literature
• The researcher must devote sufficient time to reviewing
both the conceptual and empirical literature.
– Research already undertaken on related topics or
problems needs to be systematically reviewed.
• This exercise enables the researcher to
1. find out what data are available,
2. find out if there are gaps in theories,
3. find out whether the existing theory is
applicable to the problem under study,
4. find out what other researchers have to say
about the topic, and
5. ensure that no one else has already exhausted
the questions that he/she aims to examine, etc.
d) Developing the idea through discussion
– Discussion concerning a problem often produces
useful information.
– The discussion sharpens the researcher’s focus of
attention on specific aspects of the study.
e) Rephrasing the research problem:
– The researcher must then rephrase the research
problem into a working proposition.
– Through rephrasing, the researcher puts the research
problem in as specific terms as possible so that it may
become operationally viable and may help in the
development of a working hypothesis.
f) In addition
– Technical terms or phrases, with special meanings
used in the statement of the problem should be
clearly defined.
– Basic assumptions or postulates relating to the
research problem should be clearly stated.
– The suitability of the time period and the sources of
data available must be considered in defining the
problem.
– The scope of the investigation within which the
problem is to be studied must be mentioned
explicitly in defining a research problem.
3. Extensive Literature Survey
– Once the problem is formulated, the researcher
should undertake an extensive literature survey
connected with the problem.
• Academic journals, conference proceedings,
dissertations, government reports, policy
reports, publications of international
organizations, books, etc. must be tapped
depending on the nature of the problem.
– Usually one source leads to the next and the best
place for the survey is the library.
The main goals are:
– To familiarize oneself with the issue and establish
credibility
– To show the path of prior research and how the current
project is linked to it
– To integrate and summarize what is known in the
area
– To learn from others and stimulate new ideas.
• From the survey of the literature, you will know
whether your question has already been answered
elsewhere.
• You will also know what other people have said about
similar topics.
– You can learn how other people faced
methodological and theoretical issues similar to
your own
– You can learn about sources of data that you might
not have known before
• You can identify other researchers tackling similar
problems
• Potential literature sources
• General information: Google, etc.
• Books: Library, amazon.com
• Articles:
–JSTOR: www.jstor.org
–EconLit
• Web Pages
Structuring the review:
– Summarize every article briefly; a sentence or two
will do
– Interpret the article in light of its relevance to your
own study
– Critique it, if necessary
– Show the stock of knowledge building up over the
course of the literature
– Show how your research topic adds naturally to this
stock of knowledge
4. Developing a Working Hypothesis
• A hypothesis is a statement that predicts the
relationship between two or more variables.
– Formulating an appropriate and realistic research
hypothesis is a sine qua non for sound research.
• The role of the hypothesis is to guide the researcher by
delimiting the area of research and keeping him/her on the
right track.
• It is a tentative answer to a research question that can
be confirmed or refuted by data.
• Formulating a hypothesis is particularly useful for causal
relationships.
Main problems in formulating a working hypothesis
– Formulation of a hypothesis is not an easy task.
– The main problems that may arise include:
• The lack of clear theoretical framework
• The lack of ability to utilize that theoretical
framework logically
• The failure to be acquainted with available
research techniques so as to be able to phrase the
hypothesis properly.
Characteristics of useable hypotheses
• The hypothesis must be conceptually clear.
• This involves two things:
– the concepts should be clearly defined, and
– the definitions should be commonly accepted
ones. In other words, the hypothesis should be
stated in simple terms.
• The hypothesis should have empirical referents.
– No useable hypothesis can embody moral judgments.
– While a hypothesis may study value judgments, such a
goal must be kept separate from moral preaching or a
plea for acceptance of one’s values.
• The hypothesis must be specific.
– all the operations and predictions indicated by it
should be spelled out.
• The hypothesis should be related to available
techniques.
– A theorist who does not know what techniques are
available to test his/her hypothesis is poorly placed
to formulate useable hypotheses or questions.
• The hypothesis should be related to a body of theory.
– It should possess theoretical relevance.
• The hypothesis should be testable.
– A hypothesis should be formulated in such a way that it
is possible to verify it.
5. Scope and Limitations
A research project must be clear about its scope.
(a) Geographical limitations
– The study might focus only on some regions, even
though the question pertains to a whole country, e.g.
Ethiopia.
(b) Limitations by industry or occupation
– The study might only be able to capture some
industries or occupations, e.g. the formal or informal sector.
(c) Limitations by subject matter
– The researcher must also be aware that many other
interesting questions may arise that are outside the
scope of the study.
6. Preparing the Research Design
– The research design is a plan that specifies the
sources and types of information relevant to the
research question.
• It is the arrangement of conditions for the
collection and analysis of data in a manner that
aims to combine relevance to the research
purpose with economy in procedure.
• It is the conceptual structure, plan, and strategy of
investigation within which research is conducted.
• It constitutes the blueprint for the collection,
measurement and analysis of data.
– The design that gives the smallest experimental error
is the best design.
• The following questions are critical when making design
decisions:
– What type of data is required? (required data)
– Where can the required data be found? (source of
data)
– What will the sampling design be?
– What techniques of data collection will be used?
– How will the data be analyzed? (method of data
analysis)
7. Selecting the Sample
– The researcher must decide the way of selecting a
sample.
– Samples can be either probability or non-probability
samples.
8. Execution of the Project
– Execution involves how the survey is conducted, by
means of structured questionnaire or otherwise,
etc.
• Several ways of collecting the data exist. They may
differ in terms of
(i) money costs
(ii) time costs and
(iii) other resources
• Survey data can be collected by any one or more of the
following ways:
• By observations
• Through personal interviews
• Through telephone interviews
• By mailing questionnaires/through schedules
• The researcher should select one of these methods
taking into account:
– the nature of investigations,
– objectives and scope of the study,
– financial resources,
– available time and the desired level of accuracy, etc.
9. Analyzing the Data
– After the data have been collected the researcher
turns to the task of analyzing them.
– The analysis may involve a number of closely related
operations such as:
–Editing of the raw data
–Summarizing and tabulation of the data to
obtain answers to research questions
–Drawing statistical inferences.
• Various statistical software packages are available for
data entry and analysis.
– SPSS, Stata, CSPro, and spreadsheet programs such as
Excel, Lotus, etc.
• Second-round editing is done once the data entry is
completed, by examining the frequency distributions,
averages, ranges, modes, etc. to detect outliers.
• Analysis is completed with the preparation of
descriptive tables, running econometric and
mathematical models or programming models.
10. Interpretation and Generalizations
– Explaining and discussing the research results in line
with the theoretical framework is part of the
interpretation exercise.
– The real value of research lies in its ability to arrive
at certain generalizations.
11. Preparation of the Report
– The research process is completed only when the
results are shared with the scientific community.
– The report should be written in a concise and objective
style, in simple language, avoiding vague expressions.
Preparing the Research Proposal
• The research proposal helps the researcher to organize
his/her ideas in a form that makes it possible to look
for flaws or inadequacies; it is a prerequisite in the
research process.
– It serves as a basis for determining the feasibility of
the project and provides a systematic plan of
procedure for the researcher to follow.
– It assures that the parties understand the project’s
purpose and the proposed method of investigation.
– It provides an inventory of what must be done and
which materials have to be collected
The research proposal should usually contain the
following categories of information:
I. Introduction – this part should include the following
information
– a) The title – the title or the topic should be worded
in such a way that it suggests the theme of the
study.
– It should be long enough to be explicit but not so
long that it becomes tedious – usually between 15 and
25 words.
– It should contain the key words – the important
words that indicate the subject.
There are three types of titles:
– Indicative title:
• they state the subject of the proposal rather than
expected outcomes.
• Example: The role of agricultural credit in
alleviating poverty in a low-potential area of
Ethiopia.
– Hanging titles have two parts: a general first part
followed by a more specific second part.
• Example: ‘Alleviation of poverty in low-potential
area of Ethiopia: the impact of agricultural credit’.
• Question-type titles are used less commonly than
indicative and hanging titles.
• However, they are acceptable where it is possible to
use few words – say fewer than 15.
– Example: ‘Does agricultural credit alleviate poverty
in low-potential areas of Ethiopia?’
2. Statement of the Problem
• This section makes up between one fourth and one
half of the proposal.
• It is an expansion of the title.
– It introduces the research by situating it (by giving
background), presenting the research problem and
saying how and why this problem will be "solved."
• Without this important information the reader cannot
easily understand the more detailed information about
the research that comes later.
– It also explains why the research is being done
(rationale) which is crucial for the reader to
understand the significance of the study.
The problem statement should make a convincing
argument that there is not sufficient knowledge
available to explain the problem or there is a need to
test what is known and taken as fact.
– It should provide a brief overview of the literature
and research done in the field related to the
problem and of the gaps that the proposed research
is intended to fill.
• Some ways to demonstrate that you are adding to the
knowledge in your field:
• Gap: A research gap is an area where no or little
research has been carried out.
• The first step in good data management is designing experiments that
create meaningful and unbiased data, that do not waste resources, and that
appropriately protect human and animal subjects.
• If data are not recorded in a fashion that allows others to validate findings,
• Methodologies, etc.
Data collection methods
• Data collection is the process of gathering and measuring
information on variables of interest in an accepted systematic
fashion.
• Wasted resources.
• Effective methods make the detection of errors easier - whether the errors are
intentional and deliberate falsifications or inadvertent systematic or random
errors.
• Data can be acquired from Secondary and primary sources or from both.
Secondary Sources of data
– Secondary data are those that have already been collected by other
individuals or agencies.
• Why reinvent the wheel (and waste resources) if the data already exist?
• The Internet
Advantages of Secondary data
• Can be found more quickly and cheaply.
• BUT, FGD should not be used for quantitative purposes, such as the testing of
hypotheses or the generalization of findings for larger areas, which would
require more elaborate surveys.
– In group discussions, people tend to centre their opinions on the most common
ones.
• In case of very sensitive topics group members may hesitate to express their feelings
and experiences freely.
• Combining them.
The Observation Method
• Advantages
• One can collect data at the time it occurs and need not
depend on reports by others.
• Questioning might be the best way to learn about opinion and attitudes
of people.
• By telephone interview
• By mail or e-mail, or
– They are ideal for large sample sizes, or when the sample
comes from a wide geographic area.
Advantages
• ability, and
• Boxes
• Blank spaces
Providing Instructions
A. IBM PC
B. Apple
• IBM PC
• Apple
• Other, specify
• There should be only one correct or appropriate choice for the respondent to
make.
A. Country side
B. Farm
C. City
As another example:
[Figure: irrelevant variables need no control; relevant variables are handled by manipulated control or randomization]
Frequency Distribution: is an overview of all distinct values in some variable and the
number of times they occur. That is, a frequency distribution tells
how frequencies are distributed over values.
Graphical Presentation
[Figure: chart of Number of Subjects (axis 0 to 30) by Treatment Group (1, 2, 3)]
Pie Chart: Lists the categories and presents the
percent or count of individuals who fall in each
category.
Measures of Descriptive Statistics
Descriptive statistics: are methods for organizing
and summarizing data.
For example, tables or graphs are used to
organize data, and descriptive values such as
the average score are used to summarize data.
A descriptive value for a population is called a
parameter and a descriptive value for a sample
is called a statistic.
•Descriptive statistics are used to describe the basic
features of the data in a study.
Mean:
The mean is the sum of the values divided by the
number of values. The mean of a set of numbers
x1, x2, ..., xn is typically denoted by $\bar{x}$, pronounced "x
bar": $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The
mean describes the central location of the data;
this arithmetic mean is the "standard" average,
often simply called the "mean".
The mode
• The most frequently occurring score.
• Typically useful in describing central tendency
when the scores reflect a nominal scale of
measurement ("nominal" scales could simply
be called "labels").
• E.g., eye color, gender, and hair color are all
examples of nominal data.
• To find the mode, or modal value, it is best to
put the numbers in order.
• Then count how many of each number. A
number that appears most often is the mode.
• We can have only one mode (unimodal) or
more than one mode (bimodal) or more than
two modes (multimodal).
• E.g1, what is the mode?
Example2: {1, 3, 3, 3, 4, 4, 6, 6, 6, 9}
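The counting procedure described above can be sketched in Python. This is a minimal illustration rather than a prescribed method; the function name `modes` is a hypothetical helper, and the data are the values listed in Example 2.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often, in ascending order."""
    counts = Counter(values)          # how many times each value appears
    top = max(counts.values())        # the highest frequency
    return sorted(v for v, c in counts.items() if c == top)


example2 = [1, 3, 3, 3, 4, 4, 6, 6, 6, 9]
print(modes(example2))  # [3, 6]: two modes, so the data set is bimodal
```

Returning a list rather than a single number keeps the unimodal, bimodal and multimodal cases distinct.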
• However, mode gives us limited information
about a distribution
– Might be misleading
Median:
• The score at the 50th percentile (in the middle); it
tells you where the middle of a data set is.
• Used to summarize ordinal scores (where only the order
of the values matters) or highly skewed interval or
ratio scores.
Cont’d
• In the first case the number of pages varies
from 15 to 35 while in the second case the
number of pages varies from 10 to 40.
Cont’d
THE RANGE
• It is the simplest possible measure of
dispersion.
• The range of a set of numbers (data) is the
difference between the largest and the smallest
numbers in the set.
• If this difference is small, the series of numbers
is considered regular; if this difference is large,
the series is considered irregular.
• Example: 15 20 25 25 30 35
• Range = Largest – Smallest = 35 – 15 = 20
Cont’d
Mean Deviation
• How far, on average, all values are from the
middle.
– three steps to calculating:
1. Find the mean of all values
2. Find the distance of each value from that
mean (subtract the mean from each value,
ignore minus signs)
3. Then find the mean of those distances
cont’d
• Example: the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16
• Step 1: Find the mean:
• Mean = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16)/8 = 72/8 = 9
• Step 2: Find the distance of each value from that mean:
Value   Distance from 9
  3       6
  6       3
  6       3
  7       2
  8       1
 11       2
 15       6
 16       7
• Step 3: Find the mean of those distances:
• Mean deviation = (6 + 3 + 3 + 2 + 1 + 2 + 6 + 7)/8 = 30/8 = 3.75
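The three steps of the mean deviation can be checked with a short Python sketch. The function name `mean_deviation` is a hypothetical helper; the data are the eight values of the worked example.

```python
def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    mean = sum(values) / len(values)                 # step 1: the mean
    distances = [abs(v - mean) for v in values]      # step 2: absolute deviations
    return sum(distances) / len(distances)           # step 3: mean of the distances


data = [3, 6, 6, 7, 8, 11, 15, 16]
print(max(data) - min(data))   # range of the same data: 13
print(mean_deviation(data))    # 3.75
```

Unlike the range, which uses only the two extreme values, the mean deviation changes whenever any single value changes.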
Cont’d
• Each distance we calculate is called
an Absolute Deviation, because it is
the Absolute Value of the deviation (how far
from the mean).
• From our example, the value 16 has Absolute
Deviation = |x - μ| = |16 - 9| = |7| = 7
Cont’d
• Mean deviation depends on all the values of
the variables and therefore it is a better
measure of dispersion than the range.
• Since signs of the deviations are ignored
(because all deviations are taken positive),
some artificiality is created.
cont’d
• Example: [Formula: SE(x), built from the sum of squared deviations $\sum_{i=1}^{n}(x_i - \bar{x})^2$]
Cont’d
–Example
• You and your friends have just measured the
heights of your dogs (in millimeters):
• Now we calculate each dog's difference from
the Mean:
So, using the Standard Deviation we have a "standard" way of knowing what is normal,
and what is extra large or extra small.
Coefficient of variation (CV):
In probability theory and statistics, the coefficient
of variation (CV) is a normalized measure of
dispersion of a probability distribution.
The coefficient of variation (CV) is defined as the
ratio of the standard deviation to the mean:
$$CV = \frac{SD}{\text{Mean}}$$
Covariance :
Covariance between X and Y refers to a measure of
how much two variables change together.
Covariance indicates how two variables are related.
• A positive covariance means the variables are
positively related, while a negative covariance
means the variables are inversely related. The
formula for calculating covariance of sample data is
shown below.
$$\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n}$$
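A minimal Python sketch of the covariance calculation follows. It divides by n, as the slide formula does; many textbooks divide by n − 1 for sample data, so the divisor here is an assumption carried over from the slide. The function name and the small data set are hypothetical.

```python
def covariance(xs, ys):
    """Covariance of paired data, dividing by n as in the slide formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical illustrative data: y rises with x, then falls with x.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]
print(covariance(x, y_up))    # 4.0  (positively related)
print(covariance(x, y_down))  # -4.0 (inversely related)
```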
Shape of Frequency Distribution
Skewness:
Kurtosis:
Skewness:
• It refers to symmetry or asymmetry of the distribution.
• It is a measure of the asymmetry of the probability
distribution of a real-valued random variable. The skewness
value can be positive or negative, or even undefined.
• Qualitatively, a negative skew indicates that the tail on the
left side of the probability density function is longer than the
right side and the bulk of the values (possibly including the
median) lie to the right of the mean.
• A positive skew indicates that the tail on the right side is
longer than the left side and the bulk of the values lie to the
left of the mean.
• A zero value indicates that the values are relatively evenly
distributed on both sides of the mean, typically but not
necessarily implying a symmetric distribution.
The coefficient of Skewness is a measure for the degree of symmetry in
the variable distribution.
Kurtosis:
$$r(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\mathrm{var}(x_i - \bar{x})\,\mathrm{var}(y_i - \bar{y})}}$$
Cont’d
• A value of ± 1 indicates a perfect degree of
association between the two variables.
• As the correlation coefficient value goes towards
0, the relationship between the two variables will
be weaker.
• Correlation works for quantifiable data in which
numbers are meaningful, usually quantities of
some sort.
• It cannot be used for purely categorical data, such
as gender, brands purchased, or favorite color.
Cont’d
• For example, height and weight are related;
taller people tend to be heavier than shorter
people.
• The relationship isn't perfect. People of the
same height vary in weight, and you can easily
think of two people you know where the shorter
one is heavier than the taller one.
• Correlation can tell you just how much of the
variation in people's weights is related to their
heights.
Example
• A correlation coefficient of 1
means that for every positive
increase in one variable, there is a
positive increase of a fixed
proportion in the other. For
example, shoe sizes go up in
(almost) perfect correlation with
foot length.
• A correlation coefficient of -1
means that for every increase in
one variable, there is a decrease
of a fixed proportion in the
other. For example, the amount of
gas in a tank decreases in (almost)
perfect correlation with speed.
• Zero means that an increase in one
variable is not associated with any
consistent linear increase or
decrease in the other. The two just
aren’t linearly related.
Correlation techniques
• In statistics, four types of correlation are commonly
used: Pearson correlation, Kendall rank correlation,
Spearman correlation, and the Point-Biserial
correlation.
Pearson r correlation: Pearson r correlation is the
most widely used correlation statistic to measure
the degree of the relationship between linearly
related variables.
• For example, if we want to measure how age and
glucose level are related to each other,
Pearson r correlation is used to measure the
degree of relationship between the two.
Cont’d
• Types of research questions a Pearson
correlation can examine:
• Is there a statistically significant relationship
between age and glucose level?
• Is there a relationship between temperature
and ice cream sales?
• Is there a relationship between job satisfaction
and income?
Cont’d
Kendall rank correlation: is a non-parametric test
that measures the strength of dependence
between two variables.
• Sample Question: Two interviewers ranked 12
candidates (A through L) for a position. The results from
most preferred to least preferred are:
• Interviewer 1: ABCDEFGHIJKL.
• Interviewer 2: ABDCFEHGJILK
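One way to work the sample question is to count concordant and discordant pairs of candidates. The sketch below assumes the simple tau (tau-a) definition, (concordant − discordant) / total pairs, which is appropriate here because neither interviewer gives tied ranks; the function name `kendall_tau` is a hypothetical helper.

```python
from itertools import combinations


def kendall_tau(order_a, order_b):
    """Kendall tau-a for two rankings of the same items, with no ties."""
    rank_a = {item: r for r, item in enumerate(order_a)}
    rank_b = {item: r for r, item in enumerate(order_b)}
    concordant = discordant = 0
    for p, q in combinations(order_a, 2):
        # Positive product: both interviewers order p and q the same way.
        sign = (rank_a[p] - rank_a[q]) * (rank_b[p] - rank_b[q])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total_pairs = len(order_a) * (len(order_a) - 1) // 2
    return (concordant - discordant) / total_pairs


tau = kendall_tau("ABCDEFGHIJKL", "ABDCFEHGJILK")
print(round(tau, 3))  # 0.848
```

The second interviewer swaps five adjacent pairs (C/D, E/F, G/H, I/J, K/L), so 5 of the 66 pairs are discordant and tau = (61 − 5)/66.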
Cont’d
Spearman rank correlation: Spearman rank
correlation is a non-parametric test that is used to
measure the degree of association between two
variables.
• The Spearman rank correlation test does not carry
any assumptions about the distribution of the data
and is the appropriate correlation analysis when
the variables are measured on a scale that is at
least ordinal.
• $\rho = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)}$, where
ρ = Spearman rank correlation
di = the difference between the ranks of corresponding
variables
n = number of observations
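As a sketch, the quantities defined above (ρ, di, n) can be combined with the standard no-ties formula ρ = 1 − 6Σdi² / (n(n² − 1)). That formula is an assumption here, since only its symbols are listed, and the function name is hypothetical. The two interviewer rankings from the Kendall sample question are reused as data.

```python
def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two rankings of the same items (no ties)."""
    rank_b = {item: r for r, item in enumerate(order_b, start=1)}
    n = len(order_a)
    # d_i is the difference between the two ranks given to the same item.
    d_squared = sum((r - rank_b[item]) ** 2
                    for r, item in enumerate(order_a, start=1))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho("ABCDEFGHIJKL", "ABDCFEHGJILK")
print(round(rho, 3))  # 0.965
```

Ten candidates move by exactly one rank, so Σdi² = 10 and ρ = 1 − 60/1716.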
Cont’d
• Types of research questions a Spearman Correlation
can examine:
• Is there a statistically significant relationship
between participants’ level of education (high
school, bachelor’s, or graduate degree) and
their starting salary?
ECONOMETRIC ANALYSIS:
Robust Regression Analysis
What is Regression?
• A way of predicting the value of one variable from
another.
• It is a hypothetical model of the relationship
between two variables.
• For example, relationship between rash driving
and number of road accidents by a driver is best
studied through regression.
• Regression analysis is a statistical tool for the
investigation of relationships between variables.
Usually, we seek to ascertain the causal effect of one
variable upon another.
Regression analysis estimates the conditional
expectation of the dependent variable given the
independent variables that is, the average value of the
dependent variable when the independent variables
are held fixed.
In all cases, the estimation target is a function of the
independent variables called the regression function.
Regression analysis is widely used for prediction and
forecasting.
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$
Broadly speaking, traditional econometric methodology proceeds along
the following lines:
•How then does one know that the data really support the Keynesian theory of
consumption? Is it because the Keynesian consumption function (i.e., the regression
line) shown in Figure I.3 is extremely close to the actual data points?
•Is it possible that another consumption model (theory) might equally fit the data as
well? For example, Milton Friedman has developed a model of consumption, called
the permanent income hypothesis.
•Robert Hall has also developed a model of consumption, called the life-cycle
permanent income hypothesis. Could one or both of these models also fit the data
in Table I.1?
•Let us use the Keynesian model for the time being. Keynes states that, on average,
consumers increase their consumption as their income increases, but not by as much as the
increase in their income (MPC < 1).
2. Specification of the Mathematical Model of Consumption (single-
equation model)
Y = β1 + β2X,  0 < β2 < 1    (I.3.1)
• To allow for the inexact relationships between economic variables, (I.3.1) is modified as
follows:
• Y = β1 + β2X + u (I.3.2)
• where u, known as the disturbance, or error, term, is a random (stochastic) variable that
has well-defined probabilistic properties. The disturbance term u may well represent all
those factors that affect consumption but are not taken into account explicitly.
• N.B.: The dependent variable (Y) is also called the response, explained, predictand, endogenous,
or outcome variable. The independent variables (X) are also called explanatory variables, regressors,
exogenous variables, or predictors. The coefficients are called statistics in a sample and
parameters in a population.
(I.3.2) is an example of a linear regression model, i.e., it hypothesizes that Y is
linearly related to X, but that the relationship between the two is not exact; it is
subject to individual variation. The econometric model of (I.3.2) can be
depicted as shown in Figure I.2.
4. Obtaining Data
• To obtain the numerical values of β1 and β2, we need data. Look at Table
I.1, which relates to the personal consumption expenditure (PCE) and the
gross domestic product (GDP). The data are in "real" terms.
5. Estimation of the Econometric Model
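$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
where $Y_i$ is the dependent (response) variable, $X_i$ is the independent
(explanatory) variable, and $\beta_0 + \beta_1 X_i$ is the regression line.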
Ordinary Least Squares Method
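$$E(Y) = \hat{Y} = a + bX$$
$$\varepsilon = Y - \hat{Y} = Y - (a + bX) = Y - a - bX$$
$$\varepsilon^2 = (Y - \hat{Y})^2 = (Y - a - bX)^2$$
$$(Y - a - bX)^2 = Y^2 + a^2 + b^2X^2 - 2aY - 2bXY + 2abX$$
$$\sum \varepsilon^2 = \sum (Y - \hat{Y})^2 = \sum (Y - a - bX)^2$$
$$\min \sum \varepsilon^2 = \min \sum (Y - a - bX)^2$$
How do we get the coefficients a and b that minimize the sum of squares of errors?
Compute a and b so that the partial derivatives
with respect to a and b are equal to zero:
$$\frac{\partial \sum \varepsilon^2}{\partial a} = \frac{\partial \sum (Y - a - bX)^2}{\partial a} = 2na - 2\sum Y + 2b\sum X = 0$$
$$na - \sum Y + b\sum X = 0$$
$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = \bar{Y} - b\bar{X}$$
Take the partial derivative with respect to b and
plug in the a you got, $a = \bar{Y} - b\bar{X}$:
$$\frac{\partial \sum \varepsilon^2}{\partial b} = \frac{\partial \sum (Y - a - bX)^2}{\partial b} = 2b\sum X^2 - 2\sum XY + 2a\sum X = 0$$
$$b\sum X^2 - \sum XY + a\sum X = 0$$
$$b\sum X^2 - \sum XY + (\bar{Y} - b\bar{X})\sum X = 0$$
$$b\sum X^2 - \sum XY + \left(\frac{\sum Y}{n} - b\,\frac{\sum X}{n}\right)\sum X = 0$$
$$b\sum X^2 - \sum XY + \frac{\sum X \sum Y}{n} - b\,\frac{\left(\sum X\right)^2}{n} = 0$$
$$b\,\frac{n\sum X^2 - \left(\sum X\right)^2}{n} = \frac{n\sum XY - \sum X \sum Y}{n}$$
The least squares method is an algebraic solution
that minimizes the sum of squares of errors
(variance component of error):
$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{SP_{xy}}{SS_x}$$
$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = \bar{Y} - b\bar{X}$$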
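The closed-form solution, b = (nΣXY − ΣXΣY) / (nΣX² − (ΣX)²) and a = Ȳ − bX̄, can be sketched in Python as follows. The function name `ols_simple` and the income and consumption figures are hypothetical, chosen only so that the arithmetic is easy to follow; they are not the Table I.1 data.

```python
def ols_simple(xs, ys):
    """Return (a, b) for the least squares line Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Slope from the normal equations, then intercept from the two means.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


# Hypothetical illustrative data (not from the slides).
income = [80, 100, 120, 140, 160]
consumption = [70, 65, 90, 95, 110]
a, b = ols_simple(income, consumption)
print(round(a, 2), round(b, 2))  # 20.0 0.55
```

With these numbers the slope of 0.55 plays the role of the marginal propensity to consume, lying between 0 and 1 as the Keynesian specification requires.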
Properties of OLS estimators: The outcome of the least squares method is
the OLS parameter estimators a and b.
•OLS estimators are linear
•OLS estimators are unbiased (correct on average)
•OLS estimators are efficient (small variance)
•Gauss-Markov Theorem: Among linear unbiased estimators, the least
squares estimator (OLS estimator) has minimum variance. It is BLUE (best
linear unbiased estimator).
In order to estimate coefficients, first we need to build the classical
linear regression model:
Linear in parameters
Linear relationship between Y and Xs
Constant slopes (coefficients of Xs)
Xs are fixed; Y is conditional on Xs
X is exogenous and the error is not related to Xs
Constant variance of errors (homoscedasticity)
No autocorrelation among error terms
Therefore, the estimation of the Econometric Model of the example we
have is as follows:
• Regression analysis is the main tool used to obtain the estimates. Using this
technique and the data given in Table I.1, we obtain the following estimates
of β1 and β2, namely, −184.08 and 0.7064. Thus, the estimated
consumption function is:
$$\hat{Y} = -184.08 + 0.7064X \qquad (I.3.3)$$
[Figure: response plane $\mu_{Y|X} = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}$ over the regressors X1 and X2, with an observation at (X1i, X2i)]
R2
Goodness of Fit: How Good Is the Model?
• SST
– Total variability
(variability between
scores and the mean).
Residual SS or Error SS (SSR)
• SSR
– Residual/error variability (variability
between the regression model and the
actual data).
• Difference between the
observed data and the model
• This represents the degree of
inaccuracy when fitting the best
fit model to the data.
Model SS or Regression SS (SSM)
• SSM
– Model variability (difference in
variability between the model and the
mean).
• This is the improvement we get from
fitting the model to the data relative to
the null model.
SST = SSR +SSM
• How do we get a large SSM?
• What happens if the SSM is large?
• The regression model then differs greatly from simply
using the mean of the outcome, so the regression
model improves the prediction of the outcome.
• So, we can calculate the proportion of
improvement due to the model.
• SSM/SST, percentage of variation explained by
the model.
• In simple regression (only one X), R2 is the
square of the Karl Pearson correlation coefficient,
e.g. r2 = .8967^2 ≈ .80
• If a regression model includes many regressors,
R2 is not equal to r2.
• Adding a regressor never decreases R2, and almost
always increases it, regardless of the relevance of
the regressor.
• Adjusted R2 gives a penalty for adding regressors:
$$\bar{R}^2 = 1 - \frac{(n - 1)}{(n - k)}\,(1 - R^2)$$
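The two goodness-of-fit measures can be sketched in Python. The sketch takes R² = SSM/SST with SSM = SST − SSR (SSR being the residual sum of squares, as named in the slides) and adjusted R² = 1 − (n − 1)/(n − k) · (1 − R²). The function names and the input numbers are hypothetical.

```python
def r_squared(sst, ssr):
    """Share of total variability explained by the model: SSM / SST."""
    ssm = sst - ssr              # model SS = total SS - residual SS
    return ssm / sst


def adjusted_r_squared(r2, n, k):
    """R-squared penalised for the number of estimated coefficients k."""
    return 1 - (n - 1) / (n - k) * (1 - r2)


# Hypothetical illustrative values: SST = 100, residual SS = 20,
# n = 21 observations, k = 3 coefficients (including the constant).
r2 = r_squared(100, 20)
print(r2)                                        # 0.8
print(round(adjusted_r_squared(r2, 21, 3), 4))   # 0.7778
```

Because (n − 1)/(n − k) is greater than 1 whenever k > 1, the adjusted value is always below the unadjusted one for a model with regressors.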
Statistical Test:
Inferential Statistics and Hypothesis
Testing
Inferential statistics
• Inferential statistics use a random sample of data
taken from a population to describe and make
inferences about the population.
• With inferential statistics, researchers are trying to
reach conclusions that extend beyond the
immediate data alone.
Hypothesis testing or significance testing
• It is a method for testing a claim or hypothesis
about a parameter in a population, using data
measured in a sample.
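$$H_0: \beta_1 = \beta_2 = 0 \qquad H_1: \beta_1 \neq 0 \text{ or } \beta_2 \neq 0$$
$$t = \frac{\hat{\beta}}{s_{\hat{\beta}}} = \frac{\hat{\beta}}{se(\hat{\beta})}$$
$$se(\hat{\beta}) = \sqrt{\frac{\sum e_t^2}{(n - k)\sum (X_t - \bar{X})^2}} = \sqrt{\frac{\sum (Y_t - \hat{Y}_t)^2}{(n - k)\sum (X_t - \bar{X})^2}}$$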
1) Statistically testing for joint level of significance
• The F-ratio provides a test of the significance of all the
independent variables (other than the constant term)
taken together.
• The F-ratio is the ratio of the explained variance per
degree of freedom used to the unexplained variance
per degree of freedom unused, i.e.:
$$F = \frac{ESS/(k - 1)}{RSS/(n - k)}$$
where k is the number of coefficients and n is the number of observations.
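A short Python sketch of the F-ratio, F = (ESS/(k − 1)) / (RSS/(n − k)), is given below, together with the algebraically equivalent form in terms of R² (using R² = ESS/(ESS + RSS)). The function names and the numbers are hypothetical.

```python
def f_ratio(ess, rss, n, k):
    """Explained variance per df used over unexplained variance per df unused."""
    return (ess / (k - 1)) / (rss / (n - k))


def f_ratio_from_r2(r2, n, k):
    """Same statistic written in terms of R-squared."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


# Hypothetical illustrative values: ESS = 80, RSS = 20, n = 21, k = 3.
f_stat = f_ratio(80, 20, 21, 3)
print(round(f_stat, 2))                          # 36.0
print(round(f_ratio_from_r2(0.8, 21, 3), 2))     # 36.0
```

The computed value is then compared with the tabulated F value for (k − 1, n − k) degrees of freedom.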
• That is to find out whether the estimates obtained in, Eq.
(I.3.3) are in accord with the expectations of the theory that is
being tested. Keynes expected the MPC to be positive but less
than 1. In our example we found the MPC to be about 0.70.
But before we accept this finding as confirmation of
Keynesian consumption theory, we must enquire whether this
estimate is sufficiently below unity. In other words, is 0.70
statistically less than 1? If it is, it may support Keynes's theory.
• Such confirmation or refutation of economic theories on the
basis of sample evidence is based on a branch of statistical
theory known as statistical inference (hypothesis testing).
• It goes along with statistical inference from sample to
population.
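$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$
$$t = \frac{\hat{\beta}_1 - 0}{se(\hat{\beta}_1)} = \frac{0.7064 - 0}{se(\hat{\beta}_1)} = 28.56, \qquad se(\hat{\beta}_1) \approx 0.025$$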
C) Decision Rules:
8. Diagnostic Tests (Post-Estimation Tests)
• The results of the model MUST satisfy the
assumptions of the linear regression model and the
properties of the coefficients. Otherwise, the
results should not be used!
• Test for Normality
• Test for Multicollinearity
• Test for Autocorrelation
• Test for Homoskedasticity
Test for Normality:
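The Jarque–Bera test is a goodness-of-fit test of
whether sample data have
the skewness and kurtosis matching a normal
distribution. The test statistic JB is defined as
$$JB = \frac{n}{6}\left(S^2 + \frac{(K - 3)^2}{4}\right)$$
where n is the number of observations, S the sample skewness and K the sample kurtosis.
a) Decision Rule
If $J_{cal} > J_{tab}$, reject H0 and accept H1.
If p-value < α, reject H0 and accept H1.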
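The Jarque–Bera statistic can be sketched in Python using the standard definition JB = (n/6)(S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis, both computed from central moments divided by n. That definition, the function name and the small data set are assumptions for illustration; in practice the test is applied to regression residuals.

```python
def jarque_bera(values):
    """Jarque-Bera statistic from moment-based skewness and kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


# Hypothetical illustrative data: symmetric, so skewness is 0.
jb = jarque_bera([1, 2, 3, 4, 5])
print(round(jb, 4))  # 0.3521
```

A small JB value (relative to the chi-square critical value with 2 degrees of freedom) means the hypothesis of normality is not rejected.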
Test for Multicollinearity
Multicollinearity is a linear relationship between two or more explanatory
variables. One of the features of multicollinearity is that the
standard errors of the affected coefficients tend to be large. In that
case, the test of the hypothesis that the coefficient is equal to zero
tends to lead to a failure to reject the null hypothesis.
Steps: run an OLS regression of one of the explanatory variables on all
the other explanatory variables, and calculate the VIF.
High VIF means high multicollinearity (MC). As a rule of thumb,
if the VIF is less than 10, MC is not considered a serious problem.
$$VIF_i = \frac{1}{1 - R_i^2} = \frac{1}{\text{tolerance}}$$
Test for Heteroscedasticity
In statistics, a sequence of random variables is heteroskedastic if
the random variables have different variances.
• When the errors have the same scatter regardless of the value of
X, the error terms are homoscedastic. When the scatter of the
errors is different, varying depending on the value of one or
more of the independent variables, the error terms
are heteroskedastic.
Test for Heteroskedasticity using Breusch-Pagan
Test for Heteroskedasticity using Goldfeld–Quandt:
H0: Constant variance
H1: Null hypothesis is not true
Decision rule: if $\chi^2_{cal} > \chi^2_{tab}$, reject H0 and accept H1; if p-value < α, reject H0 and accept H1.
Test for Autocorrelation
In statistics, the autocorrelation of a random process describes the correlation
between values of the process at different points in time, as a function of the two
times or of the time difference.
• degree of similarity between a given time series and a lagged version of itself
over successive time intervals.
• measures the relationship between a variable's current value and its past values.
The existence of autocorrelation can be detected using
Having the above regression estimate, Durbin and Watson propose the following to
detect the existence of autocorrelation:
$$y_t = \beta_0 + \beta_1 x_t + e_t = \hat{y}_t + e_t, \qquad e_t = \rho e_{t-1} + \nu_t$$
$$d = \frac{\sum_{i=2}^{n}(e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}, \qquad d \approx 2(1 - \rho)$$
Having this decision rule, the hypothesis test is:
$H_0: \rho = 0$ or $d = 2$, no autocorrelation
$H_1: \rho \neq 0$ or $d \neq 2$, there is autocorrelation
H0: No serial correlation
H1: Null hypothesis is not true
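The Durbin–Watson statistic, d = Σ(e_i − e_{i−1})² / Σe_i², is simple to compute once the residuals are available. The sketch below uses hypothetical residual series chosen to show the two extremes; the function name is also hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: alternating signs (negative autocorrelation)
# and constant values (strong positive autocorrelation).
print(durbin_watson([1, -1, 1, -1]))  # 3.0
print(durbin_watson([1, 1, 1, 1]))    # 0.0
```

Values of d near 2 are consistent with no autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.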
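$$y_i = \alpha + \beta D_i + u_i$$
[Figure: dummy-variable regression plotted against x; key: the Smoker group, at level α + β]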
Two ways of Specifying Model with Dummy:
1) A model with a constant term:
• Drop one of the dummy categories and treat it as the
reference category. This protects the model from perfect
multicollinearity (the dummy variable trap).
• The constant term is the mean value of the reference category.
• The coefficients of the dummy variables measure the marginal difference from the reference category.
• Example: a model with four seasons (three season dummy variables),
examining the impact of seasonality on wage income:
$$Y = \beta_0 + \beta_1 d_2 + \beta_2 d_3 + \beta_3 d_4$$
Exercise 1: seasonality is represented by dummy
variables and agricultural wage income is captured
by Y.
$$Y = 800 - 200d_2 + 400d_3 + 100d_4$$
The mean wage of season one (the reference season, S1) is 800 Birr.
1. The wage in season two is 200 Birr lower than the S1 wage.
2. The wage in season three is 400 Birr higher than the S1 wage.
3. The wage in season four is 100 Birr higher than the S1 wage.
2) A model without a constant term:
• Drop the constant term. This protects the model
from perfect multicollinearity.
• No season dummy variable is dropped to serve as
a reference category.
• The coefficients of the dummy variables measure mean
values, not marginal differences.
• Example: a model with four season dummy
variables, examining the impact of seasonality
on wage income.
Exercise 2: seasonality is represented by dummy
variables and agricultural wage income is captured
by Y. The model can be derived directly from the first model:
$$Y = 800d_1 + 600d_2 + 1200d_3 + 900d_4$$
1. The mean wage in season one is 800 Birr.
2. The mean wage in season two is 600 Birr (800 − 200).
3. The mean wage in season three is 1200 Birr (800 + 400).
4. The mean wage in season four is 900 Birr (800 + 100).
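The link between the two specifications can be sketched in Python: in the model with a constant, each dummy coefficient is a difference from the reference season, so each season mean in the no-constant model is the intercept plus that difference. The sketch uses the coefficients of the first exercise (800, 200, 400, 100), taking the season-two coefficient as negative because that season's wage is described as lower than the reference; the function name is hypothetical.

```python
def season_means(intercept, differences):
    """Turn reference-category coefficients into one mean per season."""
    # The reference season has the intercept as its mean; every other
    # season adds its own marginal difference to the intercept.
    return [intercept] + [intercept + d for d in differences]


means = season_means(800, [-200, 400, 100])
for season, mean_wage in enumerate(means, start=1):
    print(f"season {season}: {mean_wage} Birr")
```

Going the other way, the differences are recovered by subtracting the reference season's mean from each of the other means, which is why the two models carry the same information.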
Interactive Dummy
Dummy variables are simply variables that have been coded either 0 or 1
to indicate that an observation falls into a certain category. They are also
sometimes called indicator variables.
Interaction terms capture the possibility that the effect of one
independent variable might vary with the level of another independent
variable. For example, the effect of a drug on blood pressure may depend
on age.
OBS  PRESSURE  AGE  DRUG  Age*Drug
  1        85   30     0         0
  2        95   40     1        40
  3        90   40     1        40
  4        75   20     0         0
  5       100   60     1        60
  6        90   40     0         0
  7        90   50     0         0
  8        90   30     1        30
  9       100   60     1        60
 10        85   30     1        30
Suppose that when we run a regression, we get the following result
A) Again we set D = 0 for the control group and D = 1 for
those taking the drug.
Y = 70 + 5(Drug) + .44(Age) + .21(Drug*Age)
B) We obtain two separate equations for the two
groups:
set D = 0: Y = 70 + .44Age
Set D = 1: Y = 75 + .65Age
[Figure: blood pressure (70 to 90) against age (10 to 40) for the two groups; DRUG line Y = 75 + .65Age, CONTROL line Y = 70 + .44Age]
Note that for those taking the drug not only does the intercept increase
(that is, the average level of blood pressure), but so does the slope.
Interpretation of an interactive term -- The effect of one independent
variable (DRUG) depends on the level of another independent
variable (AGE).
The results here suggest that for people not taking the drug, each
additional year adds .44 units to blood pressure.
For people taking the drug, each additional year increases blood
pressure by .65 units.
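The way the interaction term splits into two group-specific lines can be sketched in Python, using the estimated equation Y = 70 + 5(Drug) + .44(Age) + .21(Drug*Age) given above. The function names are hypothetical.

```python
def predicted_pressure(age, drug):
    """Predicted blood pressure from the estimated interaction model."""
    return 70 + 5 * drug + 0.44 * age + 0.21 * drug * age


def group_line(drug):
    """Return (intercept, slope on age) for the control (0) or drug (1) group."""
    intercept = 70 + 5 * drug          # the drug dummy shifts the intercept
    slope = 0.44 + 0.21 * drug         # the interaction term shifts the slope
    return intercept, slope


print(group_line(0))                        # (70, 0.44)
intercept_drug, slope_drug = group_line(1)
print(intercept_drug, round(slope_drug, 2)) # 75 0.65
```

Setting the dummy to 0 or 1 reproduces the two separate equations, which shows that the dummy coefficient moves the intercept while the interaction coefficient moves the slope.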