TUTORIAL II
This second tutorial shows how to estimate a model with a more
general estimation method than OLS, namely the Generalized Method
of Moments (GMM), which embeds the OLS, IV and GIVE (or Two-stage
Least Squares) estimators as special cases.
To this end we will use the data contained in the schooling2.wf1 workfile.
These data are used in example 5.4 and in exercise 5.2 of Verbeek and
were originally used in an article by David Card (1995). The data refer to
3,010 US men observed in 1976. The variables are wage, experience and
other individual characteristics (race, residency in a metropolitan area
and residency in Southern regions). With these data we want to estimate
a human capital earnings function (i.e. a function which explains wages
in terms of the human capital accumulated by the individual):
$$w_i = \beta_1 + \beta_2 S_i + \beta_3 E_i + \beta_4 E_i^2 + \varepsilon_i \qquad (1)$$
where w is log wage (lwage76), S is the education level (ed76, in years),
E is working experience (exp76, in years) and $E^2$ is its square
(exp762). Both S and E are endogenous (S is an individual choice and
E is computed as age − S − 6), so appropriate instruments must be
found. The choice of instruments for E and $E^2$ is quite easy: age and
age squared. Finding appropriate instruments for S is more difficult.
The labour economics literature has proposed two groups of variables as
possible instruments: 1) those related to the family environment of the
individual (e.g. parents' education) and 2) those related to the
institutional setting of the schooling system (e.g. whether or not the
individual lives close to a college). We will use both types of variables
by instrumenting S first with college proximity (dummy nearc4) and then
with the education of both parents (daded and momed, in years). In the
first case the model is exactly identified and in the second it is
overidentified.
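The counting behind "exactly identified" and "overidentified" can be sketched in a few lines of Python. This is only an illustration of the order condition (number of instruments versus number of regressors), not a test of instrument validity. The label "c" for the constant and the helper name `identification` are assumptions made for this sketch; the variable names come from the workfile described above.

```python
# Regressors of equation (1); "c" stands for the constant term (assumed label).
regressors = ["c", "ed76", "exp76", "exp762"]
endogenous = {"ed76", "exp76", "exp762"}

# Exogenous regressors act as their own instruments.
own_instruments = [x for x in regressors if x not in endogenous]


def identification(extra_instruments):
    """Classify the model by the order condition (a counting rule only)."""
    n_instruments = len(own_instruments) + len(extra_instruments)
    surplus = n_instruments - len(regressors)
    if surplus < 0:
        return "underidentified"
    return "exactly identified" if surplus == 0 else "overidentified"


# Case 1: age, age squared and college proximity.
case_1 = identification(["age76", "age762", "nearc4"])
# Case 2: age, age squared and the education of both parents.
case_2 = identification(["age76", "age762", "daded", "momed"])
print(case_1, "/", case_2)
```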
$$w_i = \beta_1 + \beta_2 S_i + \gamma A_i + e_i \qquad (2)$$
where $A_i$ is unobserved worker ability.
Call the instrumental variable $Z_i$. A valid instrument needs to satisfy
three conditions:
I. $Z_i$ is as good as randomly assigned.
$$w_i = \beta_1 + \beta_2 S_i + \varepsilon_i$$
This relationship holds only when there is one endogenous regressor and one
instrument (the model is just identified). If there are multiple instruments for a
single endogenous regressor, the model is over-identified and there is no single way
to recover β2 from the first-stage and reduced-form coefficients.
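As a minimal sketch of the just-identified case, the simulation below assumes that the relationship in question is the usual ratio of the reduced-form slope (wage on instrument) to the first-stage slope (schooling on instrument). All data are simulated and every coefficient value is an arbitrary assumption chosen for illustration; nothing here uses the schooling2.wf1 data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

ability = rng.normal(size=n)                      # unobserved A_i
z = rng.binomial(1, 0.5, size=n).astype(float)    # binary instrument Z_i
s = 12.0 + 1.0 * z + ability + rng.normal(size=n)             # schooling
w = 1.0 + 0.10 * s + 0.30 * ability + rng.normal(scale=0.5, size=n)  # log wage


def ols_slope(y, x):
    """Slope of a simple OLS regression of y on a constant and x."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


first_stage = ols_slope(s, z)      # effect of Z on S
reduced_form = ols_slope(w, z)     # effect of Z on w
beta2_ratio = reduced_form / first_stage

# The same number from the matrix IV formula (Z'X)^{-1} Z'y.
X = np.column_stack([np.ones(n), s])
Z = np.column_stack([np.ones(n), z])
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ w)

beta2_ols = ols_slope(w, s)        # ignores the ability term in the error
print(beta2_ratio, beta_iv[1], beta2_ols)
```

With a second instrument there would be two such ratios, one per instrument, which is why the over-identified case needs a weighting scheme such as GIVE or GMM.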
If the instruments exhibit only weak correlation with the endogenous
regressor(s) (condition III above), the properties of the IV estimator
can be very poor (the IV estimator is biased, its standard errors are
misleading, hypothesis tests are unreliable).
To check whether there is a weak instruments problem, it is useful to:
• Report the first stage and think about whether it makes sense. Are
the magnitude and sign as you would expect?
• Report the F-statistic on the excluded instruments. The bigger
this is, the better. F-statistics above 10 to 20 are considered
relatively safe; lower F-statistics put you in the danger zone.
• Pick your "best" single instrument and report just-identified
estimates using this one only. Just-identified IV is approximately
median-unbiased, so it is less exposed to weak-instrument bias than
over-identified IV.
• Look at the coefficients, t-statistics, and F-statistics for excluded
instruments in the reduced-form regression of the dependent variable
on the instruments. The reduced-form estimates are just OLS, so they
are unbiased. If the relationship you expect is not in the reduced
form, it is probably not there.
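The F-statistic on the excluded instruments mentioned in the checklist can be sketched as a comparison of two OLS fits of the endogenous regressor: one on the included exogenous variables only, and one that also includes the instruments. The function names `rss` and `excluded_f` and the simulated data are assumptions made for this illustration; the formula is the standard F test for joint exclusion under homoskedasticity.

```python
import numpy as np


def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def excluded_f(y, X_included, Z_excluded):
    """F-statistic for the joint significance of Z_excluded in the first stage."""
    n = len(y)
    X_full = np.column_stack([X_included, Z_excluded])
    q = Z_excluded.shape[1]                      # number of excluded instruments
    rss_restricted = rss(y, X_included)
    rss_unrestricted = rss(y, X_full)
    return ((rss_restricted - rss_unrestricted) / q) / (
        rss_unrestricted / (n - X_full.shape[1])
    )


# Simulated first stage: two strong instruments and two nearly irrelevant ones.
rng = np.random.default_rng(1)
n = 3000
controls = np.column_stack([np.ones(n), rng.binomial(1, 0.3, n)])
strong_z = rng.normal(size=(n, 2))
weak_z = rng.normal(size=(n, 2))
educ = (12.0 - controls[:, 1]
        + strong_z @ np.array([0.5, 0.4])
        + weak_z @ np.array([0.01, 0.0])
        + rng.normal(size=n))

f_strong = excluded_f(educ, controls, strong_z)
f_weak = excluded_f(educ, controls, weak_z)
print(f_strong, f_weak)
```

In this simulated setting the first statistic lies far above the rule-of-thumb range of 10 to 20, while the second stays close to what pure noise would produce.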
where usually $\hat{\varepsilon}_i$ is the residual obtained in a first stage where the $W_n$
matrix is equal to the identity matrix. Therefore
$$
b^{opt}_{GMM} = (X'Z S^{-1} Z'X)^{-1} X'Z S^{-1} Z'y = (S_{XZ} S^{-1} S_{ZX})^{-1} S_{XZ} S^{-1} s_{ZY}
$$
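A minimal sketch of the two-step procedure behind this formula follows. It assumes that S is estimated by the heteroskedasticity-robust average of $\hat{\varepsilon}_i^2 z_i z_i'$, which is a common choice but is an assumption here; the function names and the simulated data are likewise illustrative only.

```python
import numpy as np


def gmm(X, Z, y, W):
    """Linear GMM estimator b = (X'Z W Z'X)^{-1} X'Z W Z'y."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))


def two_step_gmm(X, Z, y):
    """First step with W = I, then re-weight with the inverse of S."""
    n = len(y)
    b_step1 = gmm(X, Z, y, np.eye(Z.shape[1]))
    resid = y - X @ b_step1
    # Assumed estimator of S: (1/n) * sum_i resid_i^2 * z_i z_i'
    S = (Z * resid[:, None] ** 2).T @ Z / n
    b_step2 = gmm(X, Z, y, np.linalg.inv(S))
    return b_step1, b_step2, S


# Simulated data: one endogenous regressor, two instruments,
# heteroskedastic errors.
rng = np.random.default_rng(2)
n = 5000
z1, z2, u = rng.normal(size=(3, n))
x = 0.8 * z1 + 0.6 * z2 + 0.5 * u + rng.normal(size=n)
eps = u * np.sqrt(0.5 + z1 ** 2)
y = 1.0 + 0.5 * x + eps

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])
b_first, b_opt, S_hat = two_step_gmm(X, Z, y)

# With as many instruments as regressors the weighting matrix is irrelevant:
# GMM collapses to the simple IV estimator (Z'X)^{-1} Z'y.
Z_just = Z[:, :2]
b_just_gmm = gmm(X, Z_just, y, np.eye(2))
b_just_iv = np.linalg.solve(Z_just.T @ X, Z_just.T @ y)
print(b_first, b_opt, b_just_iv)
```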
$$E[x_i \varepsilon_i] = 0 \qquad (9)$$
so that
2.1.4 Two-stage Least Squares estimator
The GIVE estimator can also be seen as the outcome of a two-stage
estimation procedure.
$$
\begin{aligned}
b_{TSLS} &= (\hat{X}'\hat{X})^{-1}\hat{X}'y \qquad (16)\\
&= (X'Z(Z'Z)^{-1}Z'X)^{-1}X'Z(Z'Z)^{-1}Z'y\\
&= (S_{XZ}S_{ZZ}^{-1}S_{ZX})^{-1}S_{XZ}S_{ZZ}^{-1}s_{ZY} = b_{GIVE}
\end{aligned}
$$
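The equality between the two-stage procedure and the GIVE closed form can be checked numerically. The sketch below uses simulated data with arbitrary, assumed coefficients: stage one regresses every column of X on Z to obtain the fitted values, stage two regresses y on those fitted values, and the result is compared with the closed-form expression.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Three instruments plus a constant, one endogenous regressor.
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
u = rng.normal(size=n)
x = Z[:, 1:] @ np.array([0.6, 0.4, 0.3]) + 0.7 * u + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 0.5 * x + u

# Stage 1: OLS of each column of X on Z, keep the fitted values.
first_stage_coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
X_hat = Z @ first_stage_coef

# Stage 2: OLS of y on the fitted values.
b_tsls, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

# GIVE closed form: (X'Z (Z'Z)^{-1} Z'X)^{-1} X'Z (Z'Z)^{-1} Z'y.
ZZ_inv = np.linalg.inv(Z.T @ Z)
XZ = X.T @ Z
b_give = np.linalg.solve(XZ @ ZZ_inv @ XZ.T, XZ @ ZZ_inv @ (Z.T @ y))

print(b_tsls, b_give)
```

Note that the residuals from the second-stage regression are not the right ones for computing standard errors; those should use y minus X times the estimate, with the original X rather than the fitted values.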
3 GMM AND SPECIAL CASES ESTIMATION IN EViews
EViews allows one to estimate a regression equation with the methods
seen previously by specifying first the estimation method (in the
Estimation settings window) and then the regressors and instruments.
The TSLS (Two-Stage Least Squares) category groups the GIVE and IV
methods; GMM has its own category. In both cases we have to specify
regressors and instruments in the corresponding windows. Notice
that:
scalar overidp=1-@cchisq(overid, 2)
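For readers who want to check the number outside EViews, the same p-value can be sketched in Python. This assumes that `overid` holds the overidentifying restrictions test statistic and that it is compared with a chi-square distribution with 2 degrees of freedom, as the second argument of @cchisq suggests; how `overid` itself is computed is not shown here. For 2 degrees of freedom the chi-square CDF is 1 − exp(−x/2), so no statistics library is needed. The function name is hypothetical.

```python
import math


def overid_p_value_df2(j_stat):
    """Upper-tail probability of a chi-square(2) variable at j_stat."""
    if j_stat < 0:
        raise ValueError("the test statistic cannot be negative")
    # With 2 degrees of freedom: 1 - CDF(x) = exp(-x / 2).
    return math.exp(-j_stat / 2.0)


# 5.991 is the 5% critical value of the chi-square(2) distribution.
p_at_critical = overid_p_value_df2(5.991)
print(round(p_at_critical, 3))
```

A small p-value is evidence against the joint validity of the instruments.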
4 EQUATIONS TO ESTIMATE
1. Estimate the earnings function (1) with OLS by also including the
three dummies for race (black), for residency in a metropolitan area
(smsa76) and for residency in Southern regions (south76) among
the regressors.
2. Estimate the first stage for the education level by regressing
ed76 with OLS on all exogenous variables of the model (i.e. the
three dummies for race black, for residency in a metropolitan area
smsa76 and for residency in Southern regions south76, and the
three instruments age76, age762 and nearc4).
Perform an F test on the excluded instruments (age76, age762 and
nearc4) to test whether there is a weak instruments problem.
1. OLS estimation:
2. First stage for ed76:
3. Two-stage Least Squares, exactly identified:
4. GMM, exactly identified:
5. First stage for ed76 with additional instruments:
6. GMM, overidentified:
7. Two-stage Least Squares, overidentified: