August 2019
1 Introduction
4 Random assignment
5 What’s next
• I still have to discuss this with your TA, but the software we will
most probably use is Stata. You may use any version of EViews or Stata.
• Textbooks:
• Brooks, Chris. Introductory Econometrics for Finance, 2nd edition.
Cambridge University Press. (CB)
• Halcoussis, Dennis. Understanding Econometrics, 1st edition.
South-Western. (DH)
• Wooldridge, Jeffrey M. Introductory Econometrics: A Modern
Approach, 5th edition. South-Western Cengage Learning. (JW)
• Verbeek, Marno. A Guide to Modern Econometrics, 4th edition. John
Wiley. (MV)
• Angrist, Joshua and Jörn-Steffen Pischke. Mostly Harmless
Econometrics. (JJ-MHE)
• Angrist, Joshua and Jörn-Steffen Pischke. Mastering Metrics. (JJ-MM)
Sampling
• Population vs. sample
• In most cases, we cannot obtain data on the whole population. The
most we can do is draw some observations from the population and
analyze them.
• Careful sampling lets us use the sample to predict population
behavior.
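A minimal sketch of the idea, assuming a made-up population (the income figures are simulated, not real data): draw a simple random sample and compare its mean with the population mean that we normally cannot observe.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical population: 100,000 simulated incomes (arbitrary units).
population = rng.lognormal(mean=3.0, sigma=0.5, size=100_000)

# Simple random sample of 500 observations, drawn without replacement.
sample = rng.choice(population, size=500, replace=False)

pop_mean = population.mean()
sample_mean = sample.mean()
# Standard error of the sample mean, estimated from the sample itself.
std_error = sample.std(ddof=1) / np.sqrt(sample.size)

print(f"population mean: {pop_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f} (s.e. {std_error:.2f})")
```

With random sampling, the sample mean should land within a few standard errors of the population mean; a sample drawn in a non-random way carries no such guarantee.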
Types of Data
We may categorize a dataset into three types according to its time dimension:
• Time series: a sequence of data points recorded over a time interval.
• Cross-section: data collected by observing many subjects (such as
individuals, countries, or regions) at the same point in time. Ex:
census data
• Pooled data: a combination of time-series and cross-section data. Ex:
annual GDP data for all ASEAN countries
Sources of data:
• Primary data: data collected by the user directly from the source. Ex:
field survey on perception
• Secondary data: data collected by someone other than the user. Ex:
data from BPS (Statistics Indonesia)
Data Presentation
Why Econometrics
Correlation vs Causation
• The organization of the regression equation often leads people to
assume the explanatory variables cause the dependent variable, but
this interpretation is not necessarily correct.
• Correlation does not prove causation. If two variables, A and B, are
correlated, then it could be that:
• A causes B, or vice versa
• Both A and B are caused by some other event
• The correlation is due to random chance
• Studenmund (2017): “Don’t be deceived by the words dependent and
independent, however. Although many economic relationships are
causal by their very nature, a regression result, no matter how
statistically significant, cannot prove causality. All regression analysis
can do is test whether a significant quantitative relationship exists.
Judgments as to causality must also include a healthy dose of
economic theory and common sense.”
• Let’s watch the talk:
https://www.youtube.com/watch?v=8B271L3NtAw
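The second case, where A and B are both driven by some other event, can be sketched with simulated data. Everything here is an assumption for illustration: Z is a hypothetical common cause, and the coefficients are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 10_000

# Z is a common cause; A and B each depend on Z but not on each other.
z = rng.normal(size=n)
a = 2.0 * z + rng.normal(size=n)
b = 1.5 * z + rng.normal(size=n)

# Raw correlation between A and B is strong.
corr_ab = np.corrcoef(a, b)[0, 1]

# Remove the part of A and B explained by Z, then correlate what is left.
a_resid = a - np.polyfit(z, a, 1)[0] * z
b_resid = b - np.polyfit(z, b, 1)[0] * z
corr_given_z = np.corrcoef(a_resid, b_resid)[0, 1]

print(f"corr(A, B):           {corr_ab:.2f}")
print(f"corr(A, B) net of Z:  {corr_given_z:.2f}")
```

A and B are strongly correlated even though neither causes the other; once Z is accounted for, the correlation is close to zero.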
MPKP FEB UI Intro August 2019 12 / 23
Econometrics and Causality
Bringing Causality
                                                 Khuzdar   Maria
Potential outcome without insurance: Y_{0i}         3        5
Potential outcome with insurance: Y_{1i}            4        5
Treatment (insurance status chosen): D_i            1        0
Actual health outcome: Y_i                          4        5
Treatment effect: Y_{1i} − Y_{0i}                   1        0
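The rows of the table can be reproduced in a few lines. The numbers are the ones given for Khuzdar and Maria; the dictionary layout and variable names are only an illustrative choice.

```python
# Values taken from the table: potential outcomes and treatment per person.
people = {
    "Khuzdar": {"y0": 3, "y1": 4, "d": 1},
    "Maria": {"y0": 5, "y1": 5, "d": 0},
}

for name, p in people.items():
    # Observed outcome: Y1 if insured (D = 1), otherwise Y0.
    p["y"] = p["d"] * p["y1"] + (1 - p["d"]) * p["y0"]
    # Individual treatment effect; it needs both potential outcomes,
    # of which only one is ever observed in practice.
    p["effect"] = p["y1"] - p["y0"]
    print(name, "observed:", p["y"], "effect:", p["effect"])

# Naive comparison of observed outcomes: insured minus uninsured.
naive_difference = people["Khuzdar"]["y"] - people["Maria"]["y"]
print("naive difference:", naive_difference)  # -1
```

The naive difference is -1, although the treatment effect is 1 for Khuzdar and 0 for Maria.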
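Why the Comparison Misleads
• Comparing actual outcomes (Khuzdar's 4 against Maria's 5) suggests
that insurance lowers health by 1, yet the treatment effect is 1 for
Khuzdar and 0 for Maria.
• The causal effect is masked by the initial health status, which affects
the student’s decision to join the program. This is what we call
SELECTION BIAS.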
Back to counterfactual
• Now assume that more than two people join MPKP; some take up the
health insurance and others skip it. You want to evaluate its effect
on health status Y_i.
• Let D_i = 1 if individual i is insured and D_i = 0 if not.
Avg_n[Y_i | D_i = 1] is the average health status among the insured, while
Avg_n[Y_i | D_i = 0] is the average among the uninsured.
• What we want to know (Why?)
or,
Avg_n[Y_{1i} | D_i = 1] − Avg_n[Y_{0i} | D_i = 0]   (5)
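A rough simulation of the comparison in (5), under assumptions that are not from the slides: insurance raises everyone's health by exactly 1, and people with worse baseline health are more likely to insure (as with Khuzdar). The selection rule and all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 50_000

# Hypothetical potential outcomes: insurance raises health by 1 for everyone.
y0 = rng.normal(loc=4.0, scale=1.0, size=n)
y1 = y0 + 1.0

# Assumed selection rule: worse baseline health, higher chance of insuring.
p_insure = 1.0 / (1.0 + np.exp(y0 - 4.0))
d = rng.random(n) < p_insure

# Observed outcome: Y1 for the insured, Y0 for the uninsured.
y = np.where(d, y1, y0)

# Difference in observed group means versus the true average effect.
diff_in_means = y[d].mean() - y[~d].mean()
true_effect = (y1 - y0).mean()

print(f"difference in group means: {diff_in_means:.2f}")
print(f"true average effect:       {true_effect:.2f}")
```

Because the insured start out less healthy, the difference in group means falls well short of the true effect of 1.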
Constant-effects formula
• The causal effect is always masked by the last term of the expression.
What is it? Can we drop it? How?
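Under the constant-effects assumption, Y_{1i} = Y_{0i} + κ for every i, the difference in group means can be decomposed as follows. The symbol κ is notation assumed here, following the treatment in Mastering Metrics (JJ-MM); it is a sketch of the standard derivation, not necessarily the exact form shown in class.

```latex
\begin{align*}
\mathrm{Avg}_n[Y_i \mid D_i = 1] - \mathrm{Avg}_n[Y_i \mid D_i = 0]
  &= \mathrm{Avg}_n[Y_{1i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0] \\
  &= \mathrm{Avg}_n[\kappa + Y_{0i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0] \\
  &= \underbrace{\kappa}_{\text{causal effect}}
   + \underbrace{\mathrm{Avg}_n[Y_{0i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\end{align*}
```

The last term compares the no-insurance health of the insured with that of the uninsured. It is zero only if the two groups would have had the same average health without insurance.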
Regression
Y_i = α + β D_i + X_i γ + e_i   (6)
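A minimal sketch of equation (6) on simulated data, using NumPy least squares rather than Stata. The data-generating process is an assumption: X_i is a single baseline-health variable that drives both the insurance decision and the outcome, and the true β is set to 1.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n = 20_000
beta_true = 1.0  # assumed causal effect of insurance

# X: baseline health, which affects both insurance choice and outcome.
x = rng.normal(size=n)
d = (rng.random(n) < 1.0 / (1.0 + np.exp(x))).astype(float)
y = 4.0 + beta_true * d + 2.0 * x + rng.normal(size=n)

def ols(columns, outcome):
    """Least-squares coefficients for outcome on a constant plus columns."""
    design = np.column_stack([np.ones(len(outcome)), *columns])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef

beta_short = ols([d], y)[1]     # Y on D only: picks up selection on X
beta_long = ols([d, x], y)[1]   # Y on D and X, as in equation (6)

print(f"beta without control: {beta_short:.2f}")
print(f"beta with control:    {beta_long:.2f}")
```

Including X_i recovers a coefficient close to the assumed β only because, in this simulation, X_i is the sole source of selection. Regression cannot remove bias from variables that are left out.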