Yiran Xie
September 7, 2022
1. Motivation
Endogeneity
Sources of endogeneity
2. Instrumental Variables
IV Intuition
Model setup
IV estimator
2SLS and Optimal GMM Estimators
Motivation
yi = xi′ β + ui
Endogeneity
A regressor x is exogenous if
E [xu] = 0
Otherwise it is endogenous
Exogeneity guarantees that the things that are not accounted for (u) do not
interfere with the estimation of β
Endogeneity bias
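\[
\hat{\beta} = (X'X)^{-1} X'y
            = \beta + (X'X)^{-1} X'u
            = \beta + \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i' \right)^{-1} \left( \frac{1}{N} \sum_{i=1}^{N} x_i u_i \right)
\]
When E[x_i u_i] ≠ 0,
\[
\operatorname{plim} \frac{1}{N} \sum_{i=1}^{N} x_i u_i = E[x_i u_i] \neq 0
\quad \Rightarrow \quad \operatorname{plim} \hat{\beta} \neq \beta
\]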
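The result above can be checked numerically. The following is a minimal sketch, not part of the original slides: the data-generating design (x = v + rho·u with standard normal u and v), the function names, and the sample size are all assumptions chosen for illustration. Under this design E[xu] = rho, and the OLS slope converges to β + rho/(1 + rho²) rather than β.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(rho, n=50_000, beta=1.0, seed=0):
    """Hypothetical design: x = v + rho * u, y = beta * x + u, so E[x u] = rho."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)   # structural error
        v = rng.gauss(0, 1)   # exogenous part of x
        xi = v + rho * u
        x.append(xi)
        y.append(beta * xi + u)
    return x, y

b_exog = ols_slope(*simulate(rho=0.0))   # E[x u] = 0: should be close to beta = 1
b_endog = ols_slope(*simulate(rho=0.5))  # E[x u] = 0.5: plim is 1 + 0.5 / 1.25 = 1.4
print(b_exog, b_endog)
```

Increasing n does not shrink the gap in the endogenous case; the estimate only settles more tightly around the wrong value.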
Standard regression:
Assumes that x is uncorrelated with the error u
Then the only effect of x on y is a direct effect via the term βx
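⇒ β̂ is consistent
[Figure: path diagram of x and y, exogenous case]
When x is instead correlated with the error u
⇒ β̂ is biased
[Figure: path diagram of x and y, endogenous case]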
Why do we care?
Sources of Endogeneity
Omitted variables
Simultaneous equations
Selection bias
Measurement error
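Omitted variables: the true model is
yi = β0 + β1 x1i + αx2i + ui
but x2i is omitted and we estimate
yi = β0 + β1 x1i + ei
so that ei = αx2i + ui, which is correlated with x1i whenever α ≠ 0 and x1i is correlated with x2i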
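Reading the first equation above as the true model and the second as the regression actually run (with x2 left out), the size of the resulting bias can be written down directly. The derivation below is a sketch under the added assumption that u is uncorrelated with x1; the Cov and Var notation is not from the slides.

```latex
% Short regression error: e_i = \alpha x_{2i} + u_i
\operatorname{plim} \hat{\beta}_1
  = \beta_1 + \frac{\operatorname{Cov}(x_{1i}, e_i)}{\operatorname{Var}(x_{1i})}
  = \beta_1 + \alpha \, \frac{\operatorname{Cov}(x_{1i}, x_{2i})}{\operatorname{Var}(x_{1i})}
% The bias vanishes only if \alpha = 0 or x_1 and x_2 are uncorrelated.
```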
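Selection bias: di is a 0/1 dummy in
yi = β0 + β1 di + ui
We want to estimate β1
Can we use
β̂1 = mean(yi | di = 1) − mean(yi | di = 0)?
Notice that
E[yi | di = 1] − E[yi | di = 0] = β1 + E[ui | di = 1] − E[ui | di = 0]
so the difference in means recovers β1 only if E[ui | di = 1] = E[ui | di = 0]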
Measurement error: the true model is
yi = β0 + β1 xi + ui
but we observe xi∗ = xi + µi instead of xi and estimate
yi = β0 + β1 xi∗ + ei
ei = ui − β1 µi
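Because the error e contains −β1·µ and the mismeasured regressor x∗ contains µ, the two are correlated and OLS on x∗ is pulled toward zero. The simulation below is a minimal sketch of this attenuation, assuming x∗ = x + µ with x, µ, and u independent standard normals and β1 = 2; these values and all names are illustrative, not from the slides. Under these assumptions the slope on x∗ converges to β1·Var(x)/(Var(x) + Var(µ)) = 1.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(42)
n = 50_000
beta1 = 2.0                      # assumed true slope

x_true, x_star, y = [], [], []
for _ in range(n):
    xi = rng.gauss(0, 1)         # true regressor, Var = 1
    mu = rng.gauss(0, 1)         # measurement error, Var = 1
    ui = rng.gauss(0, 1)         # structural error
    x_true.append(xi)
    x_star.append(xi + mu)       # what is actually observed
    y.append(beta1 * xi + ui)

b_true = ols_slope(x_true, y)    # uses the unobserved true x: close to 2
b_noisy = ols_slope(x_star, y)   # uses x*: attenuated toward 2 * 1 / (1 + 1) = 1
print(b_true, b_noisy)
```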
IV Intuition
y = β0 + β1 x + u
Note: z does not directly cause y , though z and y are correlated via indirect
path of z being correlated with x which in turn determines y .
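Suppose a one-unit change in the instrument z is associated with 0.2 extra years of schooling (x) and $500 extra earnings (y).
Then 0.2 years extra schooling is associated with $500 extra earnings.
So a one year increase in schooling is associated with a $500/0.2 = $2,500
increase in earnings.
(Not a general formula; it holds only for this simplified model with one instrument z.)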
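The ratio in this example is the slope of y on z divided by the slope of x on z. The sketch below reproduces the arithmetic on simulated data. Everything in it is assumed for illustration: the true return of 2,500, the first-stage coefficient of 0.2, the error structure that makes schooling endogenous, and all variable names.

```python
import random

def slope(a, b):
    """Slope from a simple regression of b on a (with intercept)."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    return sab / saa

rng = random.Random(7)
n = 200_000
z, x, y = [], [], []
for _ in range(n):
    zi = rng.gauss(0, 1)                      # instrument
    v = rng.gauss(0, 1)                       # unobserved factor driving schooling
    ui = 1500 * v + 1000 * rng.gauss(0, 1)    # earnings error, correlated with v
    xi = 12 + 0.2 * zi + v                    # schooling: one unit of z adds 0.2 years
    z.append(zi)
    x.append(xi)
    y.append(2500 * xi + ui)                  # earnings: assumed true return is 2500

first_stage = slope(z, x)     # effect of z on schooling, about 0.2
reduced_form = slope(z, y)    # effect of z on earnings, about 500
b_iv = reduced_form / first_stage
b_ols = slope(x, y)           # contaminated by the correlation between x and u
print(first_stage, reduced_form, b_iv, b_ols)
```

In this design the OLS slope settles well above 2,500 because schooling and the earnings error share the factor v, while the ratio of the two slopes on z stays near the assumed return.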
Model setup
yi = xi′ β + ui
E (zi ui ) = 0
In the over-identified case (#instruments > #regressors)
we cannot solve Σi zi (yi − xi′ β) = 0
because we have more equations than unknowns:
dim(z) equations with only dim(x) unknowns
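Since the sample moment conditions cannot all be set to zero at once, the usual way out is to make them as small as possible in a weighted quadratic form. The block below sketches that standard GMM construction; the weighting matrix W is notation introduced here, not taken from the slides.

```latex
\hat{\beta}_{GMM}
  = \arg\min_{\beta}
    \Big[ \sum_i z_i (y_i - x_i'\beta) \Big]' \, W \,
    \Big[ \sum_i z_i (y_i - x_i'\beta) \Big]
  = (X'Z \, W \, Z'X)^{-1} X'Z \, W \, Z'y

% Choosing W = (Z'Z)^{-1} gives the 2SLS estimator:
\hat{\beta}_{2SLS}
  = \big( X'Z (Z'Z)^{-1} Z'X \big)^{-1} X'Z (Z'Z)^{-1} Z'y
```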
More on 2SLS
First stage: Do OLS of X on Z
Π̂ = (Z′Z)⁻¹ Z′X
X̂ = Z Π̂ = Z(Z′Z)⁻¹ Z′X
Structural: Do OLS of y on X̂
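The two steps can be traced by hand in a small over-identified case. The sketch below assumes one endogenous regressor, two instruments, mean-zero variables (so no intercept), and a true coefficient of 1; the design and all names are illustrative only. It also checks that OLS of y on X̂ equals the closed form (X̂′X)⁻¹X̂′y, which holds because X̂′X̂ = X̂′X.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))

rng = random.Random(1)
n = 50_000
z1, z2, x, y = [], [], [], []
for _ in range(n):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)   # two instruments
    u = rng.gauss(0, 1)                       # structural error
    v = 0.8 * u + rng.gauss(0, 1)             # first-stage error, correlated with u
    xi = 0.5 * a + 0.5 * b + v                # endogenous regressor
    z1.append(a)
    z2.append(b)
    x.append(xi)
    y.append(1.0 * xi + u)                    # assumed true beta = 1

# First stage: Pi_hat = (Z'Z)^{-1} Z'x, with the 2x2 inverse written out
s11, s12, s22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
r1, r2 = dot(z1, x), dot(z2, x)
det = s11 * s22 - s12 * s12
pi1 = (s22 * r1 - s12 * r2) / det
pi2 = (s11 * r2 - s12 * r1) / det
x_hat = [pi1 * a + pi2 * b for a, b in zip(z1, z2)]   # fitted values Z Pi_hat

# Structural step: OLS of y on x_hat
b_two_step = dot(x_hat, y) / dot(x_hat, x_hat)
# Closed form (x_hat'x)^{-1} x_hat'y gives the same number
b_closed = dot(x_hat, y) / dot(x_hat, x)
# Plain OLS of y on x for comparison
b_ols = dot(x, y) / dot(x, x)
print(b_two_step, b_closed, b_ols)
```

Running the second step as a literal OLS regression gives the right coefficient but not the right standard errors, since its residuals are computed from X̂ rather than X.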
use mus06data.dta
Drug expenditures for U.S. elderly (ldrugexp) regressed on
endogenous private health insurance dummy (hi_empunion) and
exogenous regressors defined by global x2list.