Metrics WT 2023-24 Unit12 Iv+2sls

Unit 12: Instrumental variables (IV) and 2SLS
C. Zulehner: Introductory Econometrics 1 / 32

Outline
1 Instrumental Variables
  The IV Estimator
  Identification under IV Estimation
  Variance of the IV Estimator
  Consistency of the IV Estimator
2 The Power of IV: Examples
  Addressing errors-in-variables with IV
  Dealing with unobserved heterogeneity: An Example
3 The Two-Stage Least Squares Estimator

1. The Instrumental Variables (IV) Estimator
Our regression model is:
\[ y_{N\times 1} = X_{N\times K}\,\beta_{K\times 1} + u_{N\times 1} \]
We suspect that $E(X'u) = \operatorname{plim}\left(\tfrac{1}{N}X'u\right) \neq 0$, which implies $\operatorname{plim}\hat{\beta}_{OLS} \neq \beta$; the OLS estimator is therefore inconsistent
We have named four different causes of this endogeneity problem
In many cases, only some of the variables in X suffer from endogeneity
So partition $X = [X_1 \; X_2]$ and define:
Exogenous: $\operatorname{plim}\dfrac{X_1'u}{N} = E(X_1'u) = 0$
Endogenous: $\operatorname{plim}\dfrac{X_2'u}{N} = E(X_2'u) \neq 0$

Endogeneity - Remarks
Our belief that $\operatorname{plim}(X_2'u/N) \neq 0$ is a suspicion, not a certainty
- We use economic reasoning, common sense and intuition to form our suspicion
- We will never know for sure whether the suspicion is justified
- We cannot test directly whether $\operatorname{plim}(X_2'u/N) = 0$, because u is not observable
We look for an alternative estimator that yields consistent estimates in the presence of endogeneity
We will call this estimator the Instrumental Variable (IV) estimator

The Instruments
The key idea is to find some variables ($Z_{N\times P}$) which have two properties:
- They are uncorrelated with the error term u, hence
\[ \operatorname{plim}\left(\tfrac{1}{N}Z'u\right) = E(Z'u) = 0 \]
- They are correlated with the potentially endogenous regressors X, hence
\[ \operatorname{plim}\left(\tfrac{1}{N}Z'X\right) = E(Z'X) = \Sigma_{ZX} \neq 0 \]
We will call these variables instruments

1.1 The IV Estimator
Let's start with the SLR model as usual. For $y = \beta_0 + \beta_1 x + u$, and given our assumptions $Cov(z,u) = 0$ and $Cov(z,x) \neq 0$:
\[ Cov(z,y) = Cov(z, \beta_0 + \beta_1 x + u) = \beta_1 Cov(z,x) + Cov(z,u) \]
\[ \Rightarrow \beta_1 = \frac{Cov(z,y)}{Cov(z,x)} \]
Then the IV estimator for $\beta_1$ is obtained by replacing the population moments with the sample moments:
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n}(z_i - \bar{z})(x_i - \bar{x})} \]
When z is equal to x, IV reduces to OLS, i.e. OLS is a particular case of IV in which the regressors are used as instruments for themselves
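
The covariance-ratio formula can be tried on simulated data. The sketch below is only an illustration under assumed values: the data-generating process, the coefficients 0.8 and 0.5, and the helper `sample_cov` are not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta0, beta1 = 1.0, 2.0            # illustrative "true" parameters (assumed)

z = rng.normal(size=n)             # instrument: independent of u
u = rng.normal(size=n)             # structural error
x = 0.8 * z + 0.5 * u + rng.normal(size=n)   # x depends on u, so it is endogenous
y = beta0 + beta1 * x + u


def sample_cov(a, b):
    """Sample covariance (1/n version; the scaling cancels in the ratios)."""
    return np.mean((a - a.mean()) * (b - b.mean()))


b_ols = sample_cov(x, y) / sample_cov(x, x)   # IV with z = x, i.e. OLS
b_iv = sample_cov(z, y) / sample_cov(z, x)    # IV with instrument z

print(f"OLS slope: {b_ols:.3f}")   # pulled away from beta1 because Cov(x, u) != 0
print(f"IV slope:  {b_iv:.3f}")    # close to beta1
```

In this design the OLS slope converges to $\beta_1 + Cov(x,u)/Var(x)$ rather than to $\beta_1$, while the IV slope uses only the variation in x that comes from z.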

The IV Estimator - The general case
The key assumption (moment condition) that we want to exploit is
\[ \operatorname{plim}\left(\tfrac{1}{N}Z'u\right) = E(Z'u) = 0 \]
- Note that this is a property of the entire population
We can impose its sample analog:
\[ \frac{Z'\hat{u}}{N} = 0 \;\Rightarrow\; Z'(y - X\hat{\beta}_{IV}) = 0 \;\Rightarrow\; Z'y = Z'X\hat{\beta}_{IV} \]
If $Z'X$ is invertible, we have:
Instrumental Variable (IV) estimator
\[ \hat{\beta}_{IV} = (Z'X)^{-1}Z'y \]

The IV Estimator - Weighted LS derivation
The alternative strategy is to minimize the residual sum of squares, weighting observations with the instruments, i.e. find the $\hat{\beta}$ that minimizes:
\[
\begin{aligned}
(Z'u)'(Z'u) = u'ZZ'u &= (y - X\beta)'ZZ'(y - X\beta) \\
&= y'ZZ'y - 2\beta'X'ZZ'y + \beta'X'ZZ'X\beta
\end{aligned}
\]
The FOC are
\[
\begin{aligned}
-2X'ZZ'y + 2X'ZZ'X\hat{\beta}_{IV} &= 0 \\
X'ZZ'X\hat{\beta}_{IV} &= X'ZZ'y \\
Z'X\hat{\beta}_{IV} &= Z'y \\
\hat{\beta}_{IV} &= (Z'X)^{-1}Z'y
\end{aligned}
\]
The third line follows from premultiplying by $(X'Z)^{-1}$, which exists when $Z'X$ is square and invertible
If we set $ZZ' = I$ we get the OLS estimator. Minimizing the weighted residual sum of squares is the same as minimizing the sample covariance between the instruments and the error terms

Comments to IV - Exact Identification
If we set Z = X we again get the OLS estimator
To get the IV estimator, $Z'_{P\times N}X_{N\times K}$ has to be invertible:
- Z must have the same number of variables as X, i.e. P = K
- The instruments must be linearly independent (Z must have full column rank)
In this case we have what is called exact identification. What does this mean?
If $X = [X_1, X_2]$, where $X_1$ are exogenous and $X_2$ are endogenous variables, then
- Z will include all $X_1$ variables as instruments for themselves
- We need to replace the $X_2$ variables with as many instruments
Exact Identification
# of exogenous instruments ($Z_2$) = # of endogenous variables ($X_2$)
$\Rightarrow Z = [X_1, Z_2]$
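
As a minimal sketch of the exactly identified case, the code below builds $Z = [X_1, Z_2]$ and computes $(Z'X)^{-1}Z'y$. The simulated design and all variable names (`w`, `z2`, `x2`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
beta = np.array([1.0, -0.5, 2.0])     # illustrative true coefficients (assumed)

const = np.ones(n)
w = rng.normal(size=n)                # exogenous regressor
z2 = rng.normal(size=n)               # outside instrument
u = rng.normal(size=n)
x2 = 0.5 * w + 1.0 * z2 + 0.7 * u + rng.normal(size=n)   # endogenous regressor

X1 = np.column_stack([const, w])      # exogenous block, instruments for itself
X = np.column_stack([X1, x2])         # X = [X1, X2]
Z = np.column_stack([X1, z2])         # Z = [X1, Z2], same number of columns as X
y = X @ beta + u

# beta_IV = (Z'X)^{-1} Z'y, computed with a linear solve instead of an explicit inverse
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
resid = y - X @ b_iv

print("beta_IV:", np.round(b_iv, 3))
print("Z'u_hat / N:", Z.T @ resid / n)   # sample moment condition, zero up to rounding
```

Because P = K, the sample moment condition $Z'\hat{u} = 0$ holds exactly (up to floating-point error), which is what pins down a unique estimate.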

1.2 Identification under IV Estimation
The intuitive idea behind IV is as follows:
- X varies in response to u (because $\operatorname{plim}(X'u/N) \neq 0$)
- But X also varies in response to Z
- Z does not vary as u changes (because $\operatorname{plim}(Z'u/N) = 0$)
- We exploit the variation in X that is due to the variation in Z to identify the effect of X on y
You see again the centrality of the orthogonality condition $E(Z'u) = 0$, which replaces $E(X'u) = 0$
Therefore we solve the endogeneity problem by looking for variables (instruments) which are exogenous!
We will see that this can be very useful

More formal intuition: Identification
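The effect we want to measure is
\[ \frac{\partial y_i}{\partial x_i} = \beta_i + \frac{\partial u_i}{\partial x_i} \]
If the assumptions of OLS are met ($\partial u_i / \partial x_i = 0$) we can identify the true parameter $\beta_i$; otherwise we do not get identification
In the IV case we have
\[ \frac{\partial y_i}{\partial z_i} = \beta_i \frac{\partial x_i}{\partial z_i} + \frac{\partial u_i}{\partial z_i} \]
If $z_i$ is a good instrument ($\partial u_i / \partial z_i = 0$) then
IV Identification
\[ \beta_i = \frac{\partial y_i / \partial z_i}{\partial x_i / \partial z_i} \]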
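
The ratio above has a direct sample counterpart: the slope of y on z divided by the slope of x on z. The sketch below, on simulated data with assumed coefficients, checks that this ratio coincides with the covariance-ratio IV estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.6 * z + 0.5 * u + rng.normal(size=n)   # assumed first stage
y = 1.0 + 2.0 * x + u                        # assumed structural equation

reduced_form = np.polyfit(z, y, 1)[0]   # slope of y on z, sample analog of dy/dz
first_stage = np.polyfit(z, x, 1)[0]    # slope of x on z, sample analog of dx/dz
ratio = reduced_form / first_stage

# Direct IV formula from sample covariances, for comparison
zc = z - z.mean()
b_iv = (zc @ (y - y.mean())) / (zc @ (x - x.mean()))

print(f"reduced form / first stage: {ratio:.4f}")
print(f"IV estimate:                {b_iv:.4f}")
```

The two numbers agree because both slopes share the denominator Var(z), which cancels in the ratio.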

1.3 Variance of the IV Estimator
The homoskedasticity assumption in this case is $E(u_i^2 | z_i) = Var(u) = \sigma^2$ or, more generally, $E(uu'|Z) = \sigma^2 I$
We can then derive the variance of the IV estimator, treating Z and X as given:¹
\[
\begin{aligned}
Var(\hat{\beta}_{IV}) &= E[(\hat{\beta}_{IV} - \beta)(\hat{\beta}_{IV} - \beta)'] \\
&= E[((Z'X)^{-1}Z'u)((Z'X)^{-1}Z'u)'] = E[(Z'X)^{-1}Z'uu'Z(X'Z)^{-1}] \\
&= (Z'X)^{-1}Z'\,E[uu']\,Z(X'Z)^{-1} = \sigma^2 (Z'X)^{-1}Z'Z(X'Z)^{-1}
\end{aligned}
\]
where the last step uses $E[uu'] = \sigma^2 I$
This can be estimated by
\[ \widehat{Var}(\hat{\beta}_{IV}) = \hat{\sigma}^2 (Z'X)^{-1}Z'Z(X'Z)^{-1} = \frac{\hat{u}'\hat{u}}{N}(Z'X)^{-1}Z'Z(X'Z)^{-1}, \qquad \hat{u} = y - X\hat{\beta}_{IV} \]
A correction for degrees of freedom can be made, but it is not relevant if we think in terms of asymptotic variance
¹ Since $\hat{\beta}_{IV} = (Z'X)^{-1}Z'y = \beta + (Z'X)^{-1}Z'u \Rightarrow \hat{\beta}_{IV} - \beta = (Z'X)^{-1}Z'u$
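
A minimal sketch of the variance estimator follows. The function name `iv_fit` and the simulated data are hypothetical; the formula is the homoskedastic one from this slide, without a degrees-of-freedom correction.

```python
import numpy as np


def iv_fit(y, X, Z):
    """Return (beta_IV, estimated Var(beta_IV)) under homoskedasticity."""
    n = len(y)
    ZX_inv = np.linalg.inv(Z.T @ X)
    b = ZX_inv @ Z.T @ y
    resid = y - X @ b                      # residuals use the original X
    sigma2 = resid @ resid / n             # no degrees-of-freedom correction
    V = sigma2 * ZX_inv @ (Z.T @ Z) @ ZX_inv.T   # ZX_inv.T equals (X'Z)^{-1}
    return b, V


rng = np.random.default_rng(3)
n = 2_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.7 * z + 0.5 * u + rng.normal(size=n)   # assumed first stage
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

b_iv, V_iv = iv_fit(y, X, Z)
b_ols, V_ols = iv_fit(y, X, X)     # Z = X reproduces OLS and its variance formula
se_iv = np.sqrt(np.diag(V_iv))
se_ols = np.sqrt(np.diag(V_ols))

print("IV  estimate and s.e.:", np.round(b_iv, 3), np.round(se_iv, 4))
print("OLS estimate and s.e.:", np.round(b_ols, 3), np.round(se_ols, 4))
```

Passing `Z = X` is a convenient check: the sandwich collapses to $\hat{\sigma}^2 (X'X)^{-1}$, the usual OLS variance.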

Remarks
When Z = X, the variance of the IV estimator is equal to the variance of the OLS estimator
In general, the IV estimator has a larger variance than the OLS estimator
We are not going to prove it for the general case, but show in the simple case that
\[ Var_{IV}(\hat{\beta}_1) = \frac{\sigma^2}{N\sigma_x^2}\,\frac{1}{\rho_{x,z}^2} = Var_{OLS}(\hat{\beta}_1)\,\frac{1}{R_{x,z}^2} \]
The standard error (the square root of the variance) in the IV case differs from OLS only in the $R_{x,z}^2$ from regressing x on z. Since $R_{x,z}^2 < 1$, IV standard errors are larger
What is the intuition behind this result?
- To identify the effect of X on y, we only use the part of the variability in X which is induced by the instrument
- Less variability in X allows for lower precision in the estimate of $\partial y / \partial X$
The stronger the correlation between Z and X, the smaller the IV standard errors

Variance of IV vs OLS - Formally
Suppose for simplicity that Z and X are mean-zero vectors (i.e., we have a single regressor and a single instrument). Thus
\[ Z'X = X'Z = \sum_{i=1}^{n} z_i x_i = n\,cov(z,x), \qquad Z'Z = \sum_{i=1}^{n} z_i^2 = n\,var(z), \qquad X'X = \sum_{i=1}^{n} x_i^2 = n\,var(x) \]
thus:
\[
\begin{aligned}
Var(\hat{\beta}_{IV}) &= \sigma_u^2 (Z'X)^{-1} Z'Z (X'Z)^{-1} = \frac{\sigma_u^2\,var(z)}{n\,cov(x,z)^2} \\
&= \frac{\sigma_u^2}{n}\,\frac{1}{var(x)}\,\frac{var(x)\,var(z)}{cov(x,z)^2} = \frac{\sigma_u^2}{n\,var(x)}\,\frac{1}{\rho_{xz}^2} = \frac{1}{\rho_{xz}^2}\,Var(\hat{\beta}_{OLS})
\end{aligned}
\]
where $\rho_{xz}$ is the correlation coefficient between x and z. Since $0 \le \rho_{xz}^2 \le 1$, it follows that $Var(\hat{\beta}_{IV}) \ge Var(\hat{\beta}_{OLS})$. Note that when we have instruments with low power, $\rho_{xz}^2 \to 0$ and $Var(\hat{\beta}_{IV}) \to \infty$. Thus high standard errors of the IV estimates are an indication of low power of the instruments.
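
The identity $Var(\hat{\beta}_{IV}) = Var(\hat{\beta}_{OLS})/\rho_{xz}^2$ holds exactly in the sample once x and z are demeaned, so it can be verified numerically. In the sketch below the error variance is treated as known, and the two first-stage strengths (1.0 and 0.05) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000
sigma2_u = 1.0                        # error variance, treated as known here


def variance_pair(strength):
    """Return (Var_IV, Var_OLS / rho^2, rho^2) for a given first-stage strength."""
    z = rng.normal(size=n)
    x = strength * z + rng.normal(size=n)
    z, x = z - z.mean(), x - x.mean()            # impose the mean-zero assumption
    var_iv = sigma2_u * (z @ z) / (z @ x) ** 2   # sigma^2 (Z'X)^{-1} Z'Z (X'Z)^{-1}
    var_ols = sigma2_u / (x @ x)                 # sigma^2 (X'X)^{-1}
    rho2 = (z @ x) ** 2 / ((z @ z) * (x @ x))    # squared sample correlation
    return var_iv, var_ols / rho2, rho2


strong = variance_pair(1.0)    # instrument strongly correlated with x
weak = variance_pair(0.05)     # instrument weakly correlated with x

print("strong instrument: Var_IV = %.5f, Var_OLS/rho^2 = %.5f, rho^2 = %.3f" % strong)
print("weak instrument:   Var_IV = %.5f, Var_OLS/rho^2 = %.5f, rho^2 = %.3f" % weak)
```

The first two entries of each tuple coincide, and the weak-instrument variance is far larger, which is the pattern the slide describes.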

1.4 Consistency of the IV estimator
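The IV estimator is:
\[ \hat{\beta}_{IV} = (Z'X)^{-1}Z'y = (Z'X)^{-1}Z'(X\beta + u) = \beta + (Z'X)^{-1}Z'u \]
Taking the plim
\[ \operatorname{plim}\hat{\beta}_{IV} = \beta + \operatorname{plim}\left(\tfrac{1}{N}Z'X\right)^{-1}\operatorname{plim}\left(\tfrac{1}{N}Z'u\right) = \beta + \Sigma_{ZX}^{-1} \times 0 = \beta \]
where $\operatorname{plim}\left(\tfrac{1}{N}Z'X\right)^{-1} = \Sigma_{ZX}^{-1}$ and $\operatorname{plim}\left(\tfrac{1}{N}Z'u\right) = E(Z'u) = 0$
Moreover the variance of the IV estimator tends to zero when $N \to \infty$
\[ \lim_{N\to\infty} Var(\hat{\beta}_{IV}) = \lim_{N\to\infty} \left(\tfrac{1}{N}\sigma^2\right)\left(\tfrac{1}{N}Z'X\right)^{-1}\left(\tfrac{1}{N}Z'Z\right)\left(\tfrac{1}{N}X'Z\right)^{-1} = 0 \]
because $\sigma^2/N \to 0$ while the three matrix terms converge to the finite limits $\Sigma_{ZX}^{-1}$, $\Sigma_{ZZ}$ and $(\Sigma_{ZX}')^{-1}$
The instrumental variable (IV) estimator is consistent

Note: While IV is not efficient if $E(X'u) = 0$ (i.e. in the absence of endogeneity), IV is consistent both when $E(X'u) = 0$ and when $E(X'u) \neq 0$, whereas OLS is consistent only in the first case
This is the key idea that we will use to "test" for endogeneity (see next lecture)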
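
A small Monte Carlo experiment can illustrate the note above. The design below is an assumption (500 replications of samples of size 500, first-stage coefficient 0.8); it compares OLS and IV with and without endogeneity.

```python
import numpy as np

rng = np.random.default_rng(5)
reps, n, beta1 = 500, 500, 2.0


def simulate(endog):
    """Return (OLS slopes, IV slopes) over `reps` samples; endog = effect of u on x."""
    z = rng.normal(size=(reps, n))
    u = rng.normal(size=(reps, n))
    x = 0.8 * z + endog * u + rng.normal(size=(reps, n))
    y = 1.0 + beta1 * x + u
    # Demean within each replication, then apply the covariance-ratio formulas
    zc = z - z.mean(axis=1, keepdims=True)
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    b_ols = (xc * yc).sum(axis=1) / (xc * xc).sum(axis=1)
    b_iv = (zc * yc).sum(axis=1) / (zc * xc).sum(axis=1)
    return b_ols, b_iv


ols_exog, iv_exog = simulate(endog=0.0)   # E(X'u) = 0
ols_endo, iv_endo = simulate(endog=0.5)   # E(X'u) != 0

for label, est in [("OLS, exogenous x", ols_exog), ("IV,  exogenous x", iv_exog),
                   ("OLS, endogenous x", ols_endo), ("IV,  endogenous x", iv_endo)]:
    print(f"{label}: mean = {est.mean():.3f}, sd = {est.std():.3f}")
```

Without endogeneity both estimators centre on the true slope but IV is more dispersed; with endogeneity only IV stays centred. Comparing the two sets of estimates is the logic behind the endogeneity "test" mentioned above.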

2. The Power of IV: When and how
IV can address any case where the orthogonality condition fails

What should we do if we suspect such a failure? Should we rush to find an instrument?²
It depends on the nature of the failure of the orthogonality condition
It is better first to try to fix the problem by eliminating its cause
- If a variable is omitted, try first to obtain it or a proxy for it
- Unobserved heterogeneity can sometimes be dealt with using dummy variables plus some other source of variability
- If a variable is measured with error, try to obtain a better-measured variable
- Reverse causality (& sample selection): very hard to deal with by eliminating the cause; relying on IV is unavoidable
² A very instructive reading on IV and identification is: Angrist, J. D. and A. B. Krueger, 2001, "Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments," Journal of Economic Perspectives 15, 4, 69–85.

Why be so careful?
1 Finding instruments is difficult, mainly because we can never be sure that the variables we propose as instruments are actually good ones, i.e. that the exclusion restrictions hold. Remember: $\operatorname{plim}(Z'u/N) = 0$ is an act of faith
2 Moreover, as we briefly mentioned above and discuss in detail in the next lecture, if the instruments are poorly correlated with X, IV can be very misleading: even worse than OLS!

2.1 Addressing errors-in-variables with IV
Remember the classical errors-in-variables problem, where we observe $x_1$ instead of $x_1^*$
- Where $x_1 = x_1^* + e_1$, and $e_1$ is uncorrelated with $x_1^*$ and $x_2, \ldots, x_k$
If there is a z such that $Corr(z,u) = 0$ and $Corr(z,x_1) \neq 0$, then IV will remove the attenuation bias
- Notice that z can also be measured with error; what matters is that the error in the instrument is uncorrelated with the error in $x_1$
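
The attenuation bias and its removal by IV can be sketched on simulated data. In the example below the instrument is a second noisy measurement of the true regressor with an independent error; this design and all numerical values are assumptions for illustration, not the lecture's data.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
beta1 = 2.0

x_star = rng.normal(size=n)                   # true regressor, unobserved
x = x_star + rng.normal(size=n)               # observed with classical error e1
z = x_star + 0.5 * rng.normal(size=n)         # second noisy measure, independent error
y = 1.0 + beta1 * x_star + rng.normal(size=n)


def slope(instr, regressor, outcome):
    """Covariance-ratio IV slope; passing instr = regressor gives OLS."""
    ic = instr - instr.mean()
    return (ic @ (outcome - outcome.mean())) / (ic @ (regressor - regressor.mean()))


# OLS is attenuated: plim = beta1 * Var(x*) / (Var(x*) + Var(e1)) = 2 * 1/2 = 1
b_ols = slope(x, x, y)
# IV is consistent for beta1 because the two measurement errors are independent
b_iv = slope(z, x, y)

print(f"OLS on mismeasured x: {b_ols:.3f}")
print(f"IV with second measure: {b_iv:.3f}")
```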
The use of checks in Italy

Last week, we saw the measurement-error bias when we perturbed the social capital variable (participation in referenda) with a random error
We choose as an instrument a dummy for blood donation
- It is correlated with participation in referenda (social awareness)
- It should not be correlated with the error term in the use-of-checks equation, nor with the measurement error that we randomly generated

Dependent variable      Use of Checks   Use of Checks   Social capital  Use of Checks
Specification           correct sc      w/ large error  w/ large error  IV
Social capital          0.9440          0.0127                          1.3759
                        (5.785)**       (1.681)                         (4.266)**
Blood donation                                          1.9287
                                                        (4.530)**
Age                     -0.0034         -0.0033         0.0004          -0.0038
                        (-7.740)**      (-7.406)**      (2.285)**       (-7.140)**
Married                 0.0589          0.0511          0.0157          0.0336
                        (5.111)**       (3.930)**       (1.802)*        (2.413)**
Male                    0.0197          0.0192          -0.0127         0.0356
                        (2.110)**       (1.814)*        (-1.136)        (1.959)*
Years of education      0.0287          0.0283          -0.0001         0.0290
                        (15.852)**      (15.528)**      (-0.134)        (12.986)**
Household wealth        0.0447          0.0410          -0.0109         0.0598
                        (1.429)         (1.209)         (-0.791)        (1.645)
Household income        0.0057          0.0063          0.0005          0.0052
                        (14.645)**      (14.403)**      (2.343)**       (10.456)**
Judicial Efficiency     -0.0130         -0.0481         -0.0259         0.0036
                        (-1.138)        (-3.647)**      (-4.917)**      (0.258)
Constant                -0.4778         0.3845          0.8063          -0.8407
                        (-2.995)**      (4.890)**       (32.351)**      (-3.174)**
Number of observations  32442           32442           32442           32442
R-squared adjusted      0.277           0.263           0.015           .
Robust t-statistics clustered at the provincial level in parentheses

2.2 Dealing with unobserved heterogeneity

Social capital varies across provinces
One objection is that it may be capturing other variables that are specific to the
province, that are omitted from the regression and that are correlated with social
capital
One can try to address this issue by inserting other controls that vary by province
and that may possibly matter for the regression (and are possibly correlated with
social capital)
1 Judicial inefficiency
2 GDP per capita
But there could be others that one is omitting either because they do not come to
one’s mind or because they are not easily measurable

Dummies may solve the problem
One way to solve the problem is to add a set of province dummies:
- $P_j = 1$ if the household is located in province j, zero otherwise
- These dummies control for ANY variable that varies across provinces (but is constant over time!) and that may matter for the choice of using checks
However, the dummies absorb all the variation across provinces of residence. Hence the coefficients on all other variables that are province-specific and do not vary over time are not identified
Hence, one has to find some other source of variation for variables such as social capital
- One can use the social capital of the province of origin
- This works because some households originally come from a different province than the one they now live in
- This requires the additional assumption that an individual's behavior depends not only on the social capital of the place where he lives but also on the social capital of the place where he was born
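
The identification argument can be illustrated with the rank of the design matrix. The sketch below uses hypothetical data (10 provinces, 400 households, about 30% movers): adding the social capital of the province of residence to the province dummies adds no new column space, while the social capital of the province of origin does.

```python
import numpy as np

rng = np.random.default_rng(7)
n_prov, n_hh = 10, 400

sc = rng.uniform(0.5, 1.0, size=n_prov)            # hypothetical province-level social capital
residence = np.arange(n_hh) % n_prov               # province of residence (every province populated)
origin = residence.copy()
movers = rng.random(n_hh) < 0.3                    # roughly 30% were born elsewhere
origin[movers] = rng.integers(0, n_prov, size=movers.sum())

D = (residence[:, None] == np.arange(n_prov)).astype(float)   # province dummies
X_res = np.column_stack([D, sc[residence]])        # dummies + SC of residence
X_orig = np.column_stack([D, sc[origin]])          # dummies + SC of origin

rank_res = np.linalg.matrix_rank(X_res)
rank_orig = np.linalg.matrix_rank(X_orig)

print("columns:", X_res.shape[1])
print("rank with SC of residence:", rank_res)      # deficient: SC is a combination of the dummies
print("rank with SC of origin:   ", rank_orig)     # full: movers provide within-province variation
```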

Dependent variable      Use of Checks   Use of Checks   Use of Checks
Specification           No dummies      No dummies      Prov. dummies
Social capital          0.9440
                        (5.785)**
Social capital - origin                 0.6433          0.2450
                                        (4.602)**       (2.672)**
Age                     -0.0034         -0.0036         -0.0035
                        (-7.740)**      (-7.787)**      (-8.317)**
Married                 0.0589          0.0537          0.0657
                        (5.111)**       (4.415)**       (5.666)**
Male                    0.0197          0.0235          0.0286
                        (2.110)**       (2.286)**       (3.993)**
Years of education      0.0287          0.0283          0.0276
                        (15.852)**      (15.122)**      (13.480)**
Household wealth        0.0447          0.0331          0.0455
                        (1.429)         (1.010)         (1.535)
Household income        0.0057          0.0060          0.0053
                        (14.645)**      (15.104)**      (11.515)**
Judicial Efficiency     -0.0130         -0.0285
                        (-1.138)        (-2.249)**
Constant                -0.4778         -0.1648         0.0513
                        (-2.995)**      (-1.056)        (0.791)
Number of observations  32442           31961           31961
R-squared adjusted      0.277           0.273           0.300
Robust t-statistics clustered at the regional level in parentheses

3. The Two-Stage Least Squares Estimator
IV estimates are numerically equivalent to a two-stage procedure where
1 We run an OLS regression of the endogenous variables on the instruments (i.e. $X = Z\delta + \eta$) and save the predicted values $\hat{X} = Z\hat{\delta}$
2 We then regress y on $\hat{X}$, again by OLS: $y = \hat{X}\beta + u$
The Two-Stage Least Squares (2SLS) estimator $\hat{\beta}_{2SLS}$ is numerically identical to $\hat{\beta}_{IV}$ (see the derivation below)
Hence, the 2SLS estimator, being identical to the IV estimator, is also consistent
2SLS and IV Estimator - Derivation
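1 The predicted value from the regression $X = Z\delta + \eta$ is
\[ \hat{X} = Z\hat{\delta} = Z(Z'Z)^{-1}Z'X, \qquad \text{where } \hat{\delta} = (Z'Z)^{-1}Z'X \text{ is the OLS estimator of } \delta \]
2 The OLS estimator for $y = \hat{X}\beta + u$ is³
\[
\begin{aligned}
\hat{\beta}_{2SLS} &= (\hat{X}'\hat{X})^{-1}\hat{X}'y \\
&= [(Z(Z'Z)^{-1}Z'X)'(Z(Z'Z)^{-1}Z'X)]^{-1}(Z(Z'Z)^{-1}Z'X)'y \\
&= [X'Z(Z'Z)^{-1}Z'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'y && \text{with } (Z'Z)^{-1}Z'Z = I \\
&= [X'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'y \\
&= (Z'X)^{-1}(Z'Z)(X'Z)^{-1}X'Z(Z'Z)^{-1}Z'y && \text{with } (X'Z)^{-1}X'Z = I \\
&= (Z'X)^{-1}Z'y = \hat{\beta}_{IV}
\end{aligned}
\]
³ In the exactly identified case $X'Z$, $Z'X$ and $Z'Z$ are square, invertible matrices, so that $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$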
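
The numerical equivalence can be checked directly. The sketch below runs the two OLS stages with `np.linalg.lstsq` on simulated data (an assumed design with one endogenous regressor and one instrument) and compares the result with $(Z'X)^{-1}Z'y$.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 1_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.7 * z + 0.5 * u + rng.normal(size=n)   # assumed first stage
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# Stage 1: regress every column of X on Z and keep the fitted values
delta_hat = np.linalg.lstsq(Z, X, rcond=None)[0]
X_hat = Z @ delta_hat

# Stage 2: regress y on the fitted values
b_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Direct IV formula
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

print("2SLS:", b_2sls)
print("IV:  ", b_iv)
```

The constant is regressed on Z along with x; since Z contains a constant, its fitted value is the constant itself.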

Standard Errors in 2SLS
Although the estimates produced by 2SLS are the same as those produced by IV, the standard errors that you obtain from the second-stage OLS are not the correct standard errors of the IV estimates
In general they will be larger. One can show that the variance reported by the second stage is:
\[
\begin{aligned}
Var(\hat{\beta}_{2SLS}) &= \sigma^2 (Z'X)^{-1}Z'Z(X'Z)^{-1} + Var(\hat{\eta}\beta)(Z'X)^{-1}Z'Z(X'Z)^{-1} \\
&> \sigma^2 (Z'X)^{-1}Z'Z(X'Z)^{-1} = Var(\hat{\beta}_{IV})
\end{aligned}
\]
- The reason is that the variable used in the second stage as an explanatory variable ($\hat{X}$) to replace X is itself a random variable (a generated regressor), and this inflates the variance
- Hence the second-stage standard errors need to be adjusted
- As usual, you won't have to do it by hand: Stata does it for you!

Derivation of the Variance of 2SLS
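Now, from
\[ y = X\beta + u \]
we use the fact that $X = \hat{X} + \hat{v}$, where $\hat{v}$ are the first-stage residuals (denoted $\hat{\eta}$ above), and rewrite the model as
\[ y = (\hat{X} + \hat{v})\beta + u = \hat{X}\beta + (u + \hat{v}\beta) = \hat{X}\beta + \varepsilon \]
where $\varepsilon = u + \hat{v}\beta$ is the error in the second stage of 2SLS. This is the regression equation we use in the second stage of 2SLS. The variance of $\hat{\beta}_{2SLS}$ which we get from this regression (and which will appear in your output) is
\[
\begin{aligned}
Var(\hat{\beta}_{2SLS}) &= \sigma_\varepsilon^2 (\hat{X}'\hat{X})^{-1} = \sigma_\varepsilon^2 (Z'X)^{-1}Z'Z(X'Z)^{-1} \\
&= \sigma_u^2 (Z'X)^{-1}Z'Z(X'Z)^{-1} + var(\hat{v}\beta)(Z'X)^{-1}Z'Z(X'Z)^{-1} \ge Var(\hat{\beta}_{IV})
\end{aligned}
\]
where the first term is $Var(\hat{\beta}_{IV})$, the decomposition $\sigma_\varepsilon^2 = \sigma_u^2 + var(\hat{v}\beta)$ ignores any covariance between u and $\hat{v}\beta$, and the equality sign applies to the case where there is a perfect fit in the first stage (so that $var(\hat{v}) = 0$).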
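
The difference between the second-stage and the correct standard errors comes only from which residuals are used to estimate the error variance. The sketch below, on simulated data with assumed coefficients, computes both; in this particular design the second-stage (naive) variance estimate is the larger one.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 2_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.7 * z + 0.7 * u + rng.normal(size=n)   # assumed first stage
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
b = np.linalg.lstsq(X_hat, y, rcond=None)[0]      # 2SLS = IV point estimates

bread = np.linalg.inv(X_hat.T @ X_hat)            # equals (Z'X)^{-1} Z'Z (X'Z)^{-1} here
s2_naive = np.sum((y - X_hat @ b) ** 2) / n       # second-stage residuals: estimates sigma_eps^2
s2_correct = np.sum((y - X @ b) ** 2) / n         # residuals with the original X: estimates sigma_u^2

se_naive = np.sqrt(s2_naive * np.diag(bread))
se_correct = np.sqrt(s2_correct * np.diag(bread))

print("naive second-stage s.e.:", np.round(se_naive, 4))
print("correct IV s.e.:        ", np.round(se_correct, 4))
```

Software that implements 2SLS directly reports the second kind, which is why the two stages should not be run by hand when standard errors matter.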

Overidentification
If we have more instruments than endogenous variables, the model is overidentified
- In practice it is good to have more instruments than strictly needed, because this increases the precision of the estimates
- But be careful! (see below and next lecture)
In case of overidentification there are several estimates of the "structural parameters" that we can obtain
- If we have one endogenous variable and two instruments, we can obtain one IV estimate using the first instrument and another IV estimate using the second
- We can even think of several possible combinations of the two instruments
What should we then do?
- Disregarding instruments (that is, disregarding identifying restrictions) is inefficient
- For this reason we will use all instruments but give more weight to the "better" ones

Overidentification and IV Estimator
In case of overidentification we can therefore think of a matrix $P_Z$ that gives more weight to the more "informative" instruments
The IV estimate is then obtained by minimizing the weighted residual sum of squares, i.e. by finding the $\hat{\beta}_{IV}$ that minimizes:
\[ u'P_Z u = (y - X\beta)'P_Z(y - X\beta) \]
The FOC is:
\[ -2X'P_Z y + 2X'P_Z X\hat{\beta}_{IV} = 0 \;\Rightarrow\; \hat{\beta}_{IV} = (X'P_Z X)^{-1}X'P_Z y \]
But where should the weights come from?
The 2SLS estimator gives us a clever way of assigning these weights

2SLS Estimator - Another notation
The predicted value from the regression $X = Z\delta + \eta$ is:
\[ \hat{X} = Z\hat{\delta} = Z(Z'Z)^{-1}Z'X = P_Z X, \qquad P_Z = Z(Z'Z)^{-1}Z' \]
The OLS estimator for $y = \hat{X}\beta + u$ is:⁴
\[
\begin{aligned}
\hat{\beta}_{2SLS} &= (\hat{X}'\hat{X})^{-1}\hat{X}'y \\
&= [(Z(Z'Z)^{-1}Z'X)'(Z(Z'Z)^{-1}Z'X)]^{-1}(Z(Z'Z)^{-1}Z'X)'y \\
&= (X'P_Z'P_Z X)^{-1}X'P_Z'P_Z y \\
&= (X'P_Z X)^{-1}X'P_Z y
\end{aligned}
\]
The matrix $P_Z$ is also known as a "projection" matrix: since we have more instruments than endogenous variables we have to "reduce" their dimension, i.e. $P_Z$ projects each regressor in X onto the space spanned by the instruments, so that the P instruments are combined into K fitted regressors
⁴ The matrix $P_Z$ is symmetric and idempotent, hence $P_Z'P_Z = P_Z$
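
As a sketch of the overidentified case, the code below uses two instruments for one endogenous regressor (an assumed simulated design), forms $P_Z$ explicitly, and checks that $(X'P_Z X)^{-1}X'P_Z y$ matches the two-stage computation. Forming the N x N matrix $P_Z$ is only sensible for small N.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 300
z1, z2 = rng.normal(size=n), rng.normal(size=n)   # two instruments, one endogenous regressor
u = rng.normal(size=n)
x = 0.8 * z1 + 0.2 * z2 + 0.5 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])              # N x 2
Z = np.column_stack([np.ones(n), z1, z2])         # N x 3: overidentified

P_Z = Z @ np.linalg.solve(Z.T @ Z, Z.T)           # P_Z = Z (Z'Z)^{-1} Z'
b_pz = np.linalg.solve(X.T @ P_Z @ X, X.T @ P_Z @ y)

# Same estimator via explicit two stages
X_hat = P_Z @ X
b_two_stage = np.linalg.lstsq(X_hat, y, rcond=None)[0]

# First-stage coefficients: the weights given to each instrument (3 x 2)
delta_hat = np.linalg.lstsq(Z, X, rcond=None)[0]

print("beta via P_Z:       ", b_pz)
print("beta via two stages:", b_two_stage)
print("first-stage weights on (const, z1, z2) for x:", np.round(delta_hat[:, 1], 3))
```

The first-stage coefficients show how the instruments are weighted: here the instrument with the stronger link to x receives the larger weight.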

What is 2SLS doing?
Imagine we have the "structural" model:
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u \qquad (1) \]
- We have two potentially endogenous variables $x_1$ and $x_2$ and one exogenous variable $x_3$
- We therefore need at least two instruments, $z_1$ and $z_2$
2SLS uses all available instruments in the best way, i.e. it calculates good weights for a linear combination of all the exogenous variables by running a regression!
In our example, we run two first-stage regressions, and for each of them we use a linear combination of all exogenous variables and instruments:
\[ x_1 = \delta_0 + \delta_1 z_1 + \delta_2 z_2 + \delta_3 x_3 + \eta_1 \]
\[ x_2 = \alpha_0 + \alpha_1 z_1 + \alpha_2 z_2 + \alpha_3 x_3 + \eta_2 \]
In the second stage, we use $\hat{x}_1$ and $\hat{x}_2$ instead of $x_1$ and $x_2$ and obtain the consistent estimates $\hat{\beta}_1$ and $\hat{\beta}_2$
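
The two first-stage regressions and the second stage for model (1) can be sketched as follows. All coefficients in the simulated design are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
beta = np.array([1.0, 2.0, -1.0, 0.5])       # illustrative beta_0..beta_3 (assumed)

z1, z2, x3 = rng.normal(size=(3, n))
u = rng.normal(size=n)
x1 = 1.0 * z1 + 0.3 * z2 + 0.2 * x3 + 0.6 * u + rng.normal(size=n)   # endogenous
x2 = 0.3 * z1 + 1.0 * z2 - 0.2 * x3 - 0.4 * u + rng.normal(size=n)   # endogenous
y = beta[0] + beta[1] * x1 + beta[2] * x2 + beta[3] * x3 + u

const = np.ones(n)
W = np.column_stack([const, z1, z2, x3])     # all exogenous variables and instruments

# First stage: one regression per endogenous variable on the same W
x1_hat = W @ np.linalg.lstsq(W, x1, rcond=None)[0]
x2_hat = W @ np.linalg.lstsq(W, x2, rcond=None)[0]

# Second stage: replace x1, x2 by their fitted values
X_second = np.column_stack([const, x1_hat, x2_hat, x3])
b_2sls = np.linalg.lstsq(X_second, y, rcond=None)[0]

# OLS on the original regressors, for comparison
X = np.column_stack([const, x1, x2, x3])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print("true:", beta)
print("2SLS:", np.round(b_2sls, 3))
print("OLS: ", np.round(b_ols, 3))
```

The exogenous variable $x_3$ enters both first stages and the second stage unchanged, i.e. it acts as its own instrument.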

Overidentification and 2SLS Estimator
So, we can think of the 2SLS estimator as a Weighted Least Squares estimator where $P_Z$ is a particular matrix of weights
The weights are derived from the regression coefficients of the first-stage regression(s)
Hence, we assign a larger weight to the instruments that are more strongly correlated with X
Thus the optimal matrix $P_Z$ has a simple interpretation: it is the matrix that transforms the endogenous regressors X into their predicted values from the first-stage regression
This new matrix of regressors ($\hat{X} = P_Z X$) satisfies the OLS orthogonality condition because it is a linear combination of the instruments
