
Applied Econometrics: An introduction

Muhammad Halley Yudhistira


Department of Economics, Universitas Indonesia
m.halley@ui.ac.id

August 2019

MPKP FEB UI Intro August 2019 1 / 23


Content

1 Introduction

2 Standard Procedure in Research

3 Econometrics and Causality

4 Random assignment

5 What’s next



Introduction

Introduction to Our Course

• This is an introductory class in applied econometrics. I hope you still
remember your math and statistics classes from matriculation.
• Still, I hope this class leaves you with at least one good memory.
• Grading consists of:
  • Paper (20)
  • Tutor (10)
  • Midterm (35)
  • Final (35)


Software and Textbooks

• I still have to discuss this with your TA, but the software we will most
probably use is Stata. You may use any version of EViews or Stata.
• Textbooks:
  • Brooks, Chris. Introductory Econometrics for Finance, 2nd ed.
    Cambridge University Press (CB).
  • Halcoussis, Dennis. Understanding Econometrics, 1st ed.
    South-Western (DH).
  • Wooldridge, Jeffrey M. Introductory Econometrics: A Modern
    Approach, 5th ed. South-Western Cengage Learning (JW).
  • Verbeek, Marno. A Guide to Modern Econometrics, 4th ed. John
    Wiley (MV).
  • Angrist, Joshua and Jörn-Steffen Pischke. Mostly Harmless
    Econometrics (JJ-MHE).
  • Angrist, Joshua and Jörn-Steffen Pischke. Mastering Metrics (JJ-MM).


What we will cover

• This class aims to (hopefully) help you become familiar with regression
as one of the empirical tools in economics.
• We will cover:
  • Ordinary least squares (OLS)
  • Limited dependent variable models
  • Panel data
  • Introduction to time series
• We will try to make it as applicable as possible.



Standard Procedure in Research

Data Analysis in Research

Figure: Data analysis process


Sampling

• Population vs. sample
• In most cases, we cannot obtain data on the whole population. The most
we can do is draw some observations from the population and analyze them.
• Careful sampling allows us to infer the behavior of the population from
the sample.
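
As a minimal sketch of why careful sampling matters, the Python snippet below compares a simple random sample with a careless one. The population (100,000 simulated incomes), the sample size, and all variable names are assumptions made purely for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 100,000 incomes (arbitrary units).
population = [random.gauss(50, 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

# Careful sampling: 1,000 units drawn at random without replacement.
random_sample = random.sample(population, 1_000)
random_sample_mean = statistics.fmean(random_sample)

# Careless sampling: only the 1,000 highest incomes are observed.
top_sample = sorted(population)[-1_000:]
top_sample_mean = statistics.fmean(top_sample)

print(f"population mean:    {population_mean:.2f}")
print(f"random sample mean: {random_sample_mean:.2f}")
print(f"top-only sample:    {top_sample_mean:.2f}")
```

The random sample lands close to the population mean, while the top-only sample does not, even though both contain the same number of observations.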


“Cooking” the Data


• Once you get your data, what kind of “recipe” do you want to follow?
  • Descriptive statistics: collecting, presenting, and describing the data
  • Inferential statistics: drawing conclusions about the behavior of the
    population from the behavior of our sample
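
The two “recipes” can be contrasted in a few lines of Python. This is only a sketch: the ten test scores are hypothetical, and the critical value 2.262 is the standard two-sided 95% value of the t distribution with 9 degrees of freedom, hard-coded here to avoid extra libraries.

```python
import statistics

# Hypothetical sample: test scores of 10 students.
scores = [62, 70, 75, 58, 81, 66, 73, 69, 77, 64]
n = len(scores)

# Descriptive statistics: summarize the sample itself.
mean = statistics.fmean(scores)
median = statistics.median(scores)
std_dev = statistics.stdev(scores)  # sample standard deviation (n - 1)

# Inferential statistics: a 95% confidence interval for the population mean.
T_CRITICAL = 2.262  # t distribution, 9 degrees of freedom, two-sided 95%
margin = T_CRITICAL * std_dev / n ** 0.5
lower, upper = mean - margin, mean + margin

print(f"mean={mean:.2f}, median={median:.2f}, sd={std_dev:.2f}")
print(f"95% CI for the population mean: ({lower:.2f}, {upper:.2f})")
```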


Types of Data

We may categorize datasets into three types according to their time dimension:
• Time series: a sequence of observations recorded over time.
• Cross-section: data collected by observing many subjects (such as
individuals, countries, or regions) at the same point in time. Ex:
census data
• Pooled data: a combination of time-series and cross-section data. Ex:
annual GDP data for all ASEAN countries
Sources of data:
• Primary data: data collected by the researcher directly from the source.
Ex: a field survey on perceptions
• Secondary data: data collected by someone other than the user. Ex:
data from BPS


Data Presentation



Econometrics and Causality

Why Econometrics

• Descriptive analysis using tables and graphs is never enough; it serves
only a limited purpose.
• Further techniques enable us to understand the relationship between
two (or more) variables in the form of a specific function.
• For example, how do we analyze the relationship between price and
quantity demanded in our usual demand function?
• Econometric techniques will help us. Econometrics uses statistical
tests to tackle various questions, such as:
  • How well or badly does the model describe the observed data?
  • Does another available model describe the observed data any better?
  • In any model, how large is the estimated effect of one variable on
    another, and how reliable is the estimate?
  • How far into the future, and with what degree of reliability, can the
    model predict any variable of interest?
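
To make the demand-function example concrete, here is a minimal sketch of fitting a straight line to price and quantity data by ordinary least squares. The five observations are hypothetical, and the linear form of the demand function is an assumption; the slope answers “how large is the effect?” and R-squared answers “how well does the model describe the data?”.

```python
# Hypothetical price and quantity-demanded observations (illustrative only).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [9.8, 8.1, 6.2, 3.9, 2.0]

n = len(prices)
mean_p = sum(prices) / n
mean_q = sum(quantities) / n

# OLS formulas for a single regressor: slope = S_pq / S_pp.
s_pp = sum((p - mean_p) ** 2 for p in prices)
s_pq = sum((p - mean_p) * (q - mean_q) for p, q in zip(prices, quantities))
slope = s_pq / s_pp
intercept = mean_q - slope * mean_p

# R-squared: share of the variation in quantity explained by price.
fitted = [intercept + slope * p for p in prices]
ss_total = sum((q - mean_q) ** 2 for q in quantities)
ss_resid = sum((q - f) ** 2 for q, f in zip(quantities, fitted))
r_squared = 1 - ss_resid / ss_total

print(f"quantity = {intercept:.2f} + ({slope:.2f}) * price, R^2 = {r_squared:.3f}")
```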


Correlation vs Causation
• The organization of the regression equation often leads people to
assume the explanatory variables cause the dependent variable, but
this interpretation isn’t necessarily correct.
• Correlation does not prove causation. If two variables, A and B, are
correlated, then it could be that:
  • A causes B, or vice versa
  • Both A and B are caused by some other event
  • The correlation is due to random chance
• Studenmund (2017): “Don’t be deceived by the words dependent and
independent, however. Although many economic relationships are
causal by their very nature, a regression result, no matter how
statistically significant, cannot prove causality. All regression analysis
can do is test whether a significant quantitative relationship exists.
Judgments as to causality must also include a healthy dose of
economic theory and common sense.”
• Let’s watch the talk:
https://www.youtube.com/watch?v=8B271L3NtAw
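
The case where both A and B are caused by some other event can be simulated. In this sketch, a hypothetical common cause Z drives both A and B, while A and B have no effect on each other; the data-generating process and all names are assumptions for illustration only.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]   # common cause
a = [zi + random.gauss(0, 1) for zi in z]    # A depends on Z, not on B
b = [zi + random.gauss(0, 1) for zi in z]    # B depends on Z, not on A

corr_ab = correlation(a, b)

# Strip out the part of A and B that comes from Z.
a_net = [ai - zi for ai, zi in zip(a, z)]
b_net = [bi - zi for bi, zi in zip(b, z)]
corr_net = correlation(a_net, b_net)

print(f"corr(A, B) = {corr_ab:.2f}")
print(f"corr(A, B) after removing Z = {corr_net:.2f}")
```

A and B are clearly correlated, yet the correlation vanishes once the common cause is removed, so neither variable causes the other.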

Bringing Causality

• In recent applied econometrics, researchers are preoccupied with
establishing causality. “Does A cause B?” has become a mainstream question.
  • Does a social assistance program (e.g., PKH) improve welfare?
  • Is the trans-Java highway beneficial for household welfare?
  • Does the odd-even policy reduce traffic congestion?
• Suppose you are the governor of Jakarta and want to evaluate the
effect of KJP on students’ UAS results. How would you quantify the
effect?


Challenges: How to build a correct “counterfactual”

• Consider the following example. Two new students are admitted to
MPKP and offered an MPKP-customized health insurance plan by Pak
Triman. One student decides to join the program and the other
does not. As an SPS, you try to evaluate the effect of the program.

                                           Khuzdar   Maria
Potential outcome without insurance: Y0i      3        5
Potential outcome with insurance: Y1i         4        5
Treatment (insurance status chosen): Di       1        0
Actual health outcome: Yi                     4        5
Treatment effect: (XX)                        XX       XX


Challenges: How to build a correct “counterfactual” (2)

• The causal effect of the health insurance is Y1i − Y0i. The effect is
detected only for Khuzdar.
• If we have a group of n people, the average causal effect is
Avgn[Y1i − Y0i], where

\[
\mathrm{Avg}_n[Y_{1i} - Y_{0i}] = \frac{1}{n}\sum_{i=1}^{n}[Y_{1i} - Y_{0i}] = \frac{1}{n}\sum_{i=1}^{n} Y_{1i} - \frac{1}{n}\sum_{i=1}^{n} Y_{0i} \qquad (1)
\]

                                           Khuzdar   Maria
Potential outcome without insurance: Y0i      3        5
Potential outcome with insurance: Y1i         4        5
Treatment (insurance status chosen): Di       1        0
Actual health outcome: Yi                     4        5
Treatment effect: Y1i − Y0i                   1        0
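
The table can be reproduced in a few lines of Python. The numbers come from the table above; the dictionary layout and variable names are just one possible way to organize them.

```python
# Potential outcomes and treatment status from the table above.
students = {
    "Khuzdar": {"y0": 3, "y1": 4, "d": 1},
    "Maria": {"y0": 5, "y1": 5, "d": 0},
}

# Individual causal effect: Y1i - Y0i.
individual_effects = {
    name: s["y1"] - s["y0"] for name, s in students.items()
}

# Average causal effect, as in Equation (1).
average_effect = sum(individual_effects.values()) / len(individual_effects)

# Observed outcome: Y1i if insured, Y0i otherwise.
observed = {
    name: s["y1"] if s["d"] == 1 else s["y0"] for name, s in students.items()
}

print(individual_effects, average_effect, observed)
```

Note that only `observed` is available in real data; each student's other potential outcome is never seen.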


Challenges: How to build a correct “counterfactual” (3)

• What do we see in the real world?
  • The actual health outcomes of both students after the health insurance
    program
  • The temptation is to take the difference between the health outcomes of
    Khuzdar and Maria as the causal effect (YK − YM = Y1K − Y0M = −1).
  • This leads to a misleading conclusion and, worse, to misleading policy
    implications.
• Choosing the wrong counterfactual is a common mistake in causal
analysis. The key: comparability.


Why Is It Misleading?

• Let’s take a closer look at our misleading result. We may rewrite it as:

YK − YM = Y1K − Y0M
        = (Y1K − Y0K) + (Y0K − Y0M)    (2)
        = 1 + (−2)
        = −1

• The causal effect (the first term) is masked by the difference in initial
health status (the second term), which affects each student’s decision to
join the program. This is what we call SELECTION BIAS.
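
The decomposition can be checked numerically. The sketch below wraps it in a small helper; the function name and argument names are hypothetical, and the inputs are the Khuzdar and Maria values from the example.

```python
def decompose(y1_treated, y0_treated, y0_untreated):
    """Split an observed treated-minus-untreated gap into two parts."""
    observed_gap = y1_treated - y0_untreated
    causal_effect = y1_treated - y0_treated      # effect on the treated person
    selection_bias = y0_treated - y0_untreated   # gap in no-insurance health
    return observed_gap, causal_effect, selection_bias


# Khuzdar is insured (Y1 = 4, Y0 = 3); Maria is not (Y0 = 5).
observed_gap, causal_effect, selection_bias = decompose(4, 3, 5)

print(f"observed gap   = {observed_gap}")
print(f"causal effect  = {causal_effect}")
print(f"selection bias = {selection_bias}")
```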


Back to counterfactual
• Now suppose more than two people join MPKP; some take the health
insurance and others skip it. You attempt to evaluate the effect on
health status Yi.
• Let Di = 1 if individual i is insured and Di = 0 if not.
Avgn[Yi ∣ Di = 1] is the average health status among the insured, while
Avgn[Yi ∣ Di = 0] is the average health status among the uninsured.
• What we want to know (Why?)

Avgn [Y1i ∣Di = 1] − Avgn [Y0i ∣Di = 1] (3)

• Unfortunately, what we know (Why?)

Avgn [Yi ∣Di = 1] − Avgn [Yi ∣Di = 0] (4)

or,
Avgn [Y1i ∣Di = 1] − Avgn [Y0i ∣Di = 0] (5)

Constant-effects formula

• Let us further assume that insurance makes everyone healthier by the
same amount β, so that β is the average causal effect of insurance on
health; that is, Y1i = Y0i + β.
• Substituting into Equation (5), we have

Avgn[Y1i ∣ Di = 1] − Avgn[Y0i ∣ Di = 0]
= (β + Avgn[Y0i ∣ Di = 1]) − Avgn[Y0i ∣ Di = 0]
= β + (Avgn[Y0i ∣ Di = 1] − Avgn[Y0i ∣ Di = 0])

• The causal effect is always masked by the last term of the expression.
What is it? Can we drop it? How?
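
A small simulation can illustrate the constant-effects formula. Everything in this sketch is assumed for illustration: the value of β, the distribution of health without insurance, and the self-selection rule (people in poorer health are more likely to buy insurance).

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible
BETA = 1.0      # assumed constant causal effect of insurance
n = 20_000

people = []
for _ in range(n):
    y0 = random.gauss(5, 1)  # health without insurance
    # Assumed self-selection: poorer health makes buying insurance more likely.
    d = 1 if y0 + random.gauss(0, 1) < 5 else 0
    y = y0 + BETA * d        # observed health, Y1i = Y0i + beta
    people.append((y0, d, y))

insured_y = [y for _, d, y in people if d == 1]
uninsured_y = [y for _, d, y in people if d == 0]
insured_y0 = [y0 for y0, d, _ in people if d == 1]
uninsured_y0 = [y0 for y0, d, _ in people if d == 0]

difference_in_means = statistics.fmean(insured_y) - statistics.fmean(uninsured_y)
selection_bias = statistics.fmean(insured_y0) - statistics.fmean(uninsured_y0)

print(f"difference in means = {difference_in_means:.3f}")
print(f"beta + selection bias = {BETA + selection_bias:.3f}")
```

The two printed numbers coincide, and the difference in means is far from β because the insured would have been less healthy even without insurance. In real data `insured_y0` is never observed, which is why the bias cannot simply be subtracted.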



Random assignment

Random assignment for removing selection bias

• By randomly assigning the treatment, we expect the probability of
being treated to be the same across groups.
• Random assignment works by ensuring that the mix of individuals
being compared is the same, not by eliminating individual differences.
It creates ceteris paribus conditions.
• Note: The sample should be large enough and representative to be
able to draw any conclusion at the population level.
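
The following sketch illustrates how random assignment removes selection bias. The effect size β, the health distribution, and the coin-flip assignment are all assumptions of the simulation, not results from real data.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible
BETA = 1.0      # assumed constant causal effect of the treatment
n = 20_000

y0 = [random.gauss(5, 1) for _ in range(n)]    # health without treatment
d = [random.randint(0, 1) for _ in range(n)]   # coin-flip assignment
y = [y0_i + BETA * d_i for y0_i, d_i in zip(y0, d)]

treated_y = [yi for yi, di in zip(y, d) if di == 1]
control_y = [yi for yi, di in zip(y, d) if di == 0]
treated_y0 = [v for v, di in zip(y0, d) if di == 1]
control_y0 = [v for v, di in zip(y0, d) if di == 0]

difference_in_means = statistics.fmean(treated_y) - statistics.fmean(control_y)
selection_bias = statistics.fmean(treated_y0) - statistics.fmean(control_y0)

print(f"difference in means = {difference_in_means:.3f}")
print(f"selection bias      = {selection_bias:.3f}")
```

Individuals still differ in their baseline health, but the two groups have the same mix on average, so the selection-bias term is close to zero and the difference in means is close to β.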


Random assignment in practice

• Popular term: Randomized Controlled Trial (RCT)
• Having random assignment also means that you do not have to use
“complicated” econometrics.
  • Even a simple t-test of the difference in means between the treatment
    and control groups gives you almost the whole story.
  • You may consider skipping the next class afterwards.
• In reality, an RCT is perhaps the most difficult approach:
  • Careful preparation and design
  • Costly
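
A simple t-test of the difference in means can be sketched with the standard library alone. The data are simulated under assumed group means of 6 and 5, and the helper below uses the unequal-variance (Welch) form of the standard error, which is one common choice rather than the only one.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is reproducible

# Hypothetical outcomes from a randomized experiment.
treatment = [random.gauss(6, 1) for _ in range(500)]
control = [random.gauss(5, 1) for _ in range(500)]


def difference_in_means_test(x, y):
    """Return (difference, standard error, t-statistic) for mean(x) - mean(y)."""
    diff = statistics.fmean(x) - statistics.fmean(y)
    std_error = (statistics.variance(x) / len(x)
                 + statistics.variance(y) / len(y)) ** 0.5
    return diff, std_error, diff / std_error


diff, std_error, t_stat = difference_in_means_test(treatment, control)
print(f"difference = {diff:.3f}, SE = {std_error:.3f}, t = {t_stat:.1f}")
```

With samples this large, a t-statistic above roughly 2 in absolute value indicates a difference that is statistically significant at the 5% level.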



What’s next

Get away from bias

• You’ve (hopefully) already understood that a simple comparison
between treated and control groups tends to give a misleading
causal effect unless random assignment is applied.
• Question: Are there any alternative ways to escape the bias
(control for the selection)?


Regression

• In the next session, we will learn how the regression framework can
provide us with a causal estimate.
• Specifically, we aim for

Yi = α + βDi + Xi γ + ei (6)

and hope to have β as the causal effect by controlling for other factors
that may affect the outcome.
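
As a preview, the sketch below simulates Equation (6) with a single control Xi and compares a regression that ignores Xi with one that controls for it. The parameter values, the selection rule, and the helper functions are assumptions for illustration. Controlling for Xi is done here by first removing Xi from both Yi and Di and then regressing the residuals on each other (the Frisch-Waugh approach), which gives the same β as the full regression.

```python
import random
import statistics

random.seed(5)  # fixed seed so the sketch is reproducible
ALPHA, BETA, GAMMA = 2.0, 1.0, 0.5  # assumed true parameters
n = 20_000


def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on a single regressor x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Residuals from regressing y on x."""
    intercept, slope = simple_ols(x, y)
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


x = [random.gauss(0, 1) for _ in range(n)]
# Assumed selection: units with low X are more likely to be treated.
d = [1 if xi + random.gauss(0, 1) < 0 else 0 for xi in x]
y = [ALPHA + BETA * di + GAMMA * xi + random.gauss(0, 1)
     for di, xi in zip(d, x)]

_, naive_beta = simple_ols(d, y)  # ignores X
_, controlled_beta = simple_ols(residuals(x, d), residuals(x, y))

print(f"naive estimate:      {naive_beta:.3f}")
print(f"controlling for X:   {controlled_beta:.3f}")
```

The naive estimate is pulled away from β because Xi affects both treatment and outcome; once Xi is controlled for, the estimate is close to the assumed β. This only works here because the simulation contains no other omitted factor.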
