This action might not be possible to undo. Are you sure you want to continue?
)
c ( 2005 A. Colin Cameron and Pravin K. Trivedi
"Microeconometrics: Methods and Applications"
1. Chapter 1: Introduction
No exercises.
2. Chapter 2: Causal and Noncausal Models
No exercises.
3. Chapter 3: Microeconomic Data Structures
No exercises.
4. Chapter 4: Linear Models
41 (a) For the diagonal entries i = , and E[n
2
i
] = o
2
.
For the …rst o¤diagonal i = , ÷1 or i = , + 1 so [i ÷,[ = 1 and E[n
i
n
j
] = jo
2
.
Otherwise [i ÷,[ 1 and E[n
i
n
j
] = 0.
(b)
´
OLS
is asymptotically normal with mean 0 and asymptotic variance matrix
V[
´
OLS
] = (X
0
X)
1
X
0
X(X
0
X)
1
.
where
=
o
2
jo
2
0 0
jo
2
.
.
.
.
.
.
.
.
.
0
.
.
.
.
.
.
.
.
. 0
.
.
.
.
.
.
.
.
. jo
2
0 0 jo
2
o
2
¸
¸
¸
¸
¸
¸
¸
¸
¸
.
(c) This example is a simple departure from the simplest case of = o
2
I.
1
Here depends on just two parameters and hence can be consistently estimated as
· ÷·.
So we use
´
V[
´
OLS
] = (X
0
X)
1
X
0
´
X(X
0
X)
1
.
where
´
=
´ o
2
´j´ o
2
0 0
´j´ o
2
.
.
.
.
.
.
.
.
.
0
.
.
.
.
.
.
.
.
. 0
.
.
.
.
.
.
.
.
. ´j´ o
2
0 0 ´j´ o
2
´ o
2
¸
¸
¸
¸
¸
¸
¸
¸
¸
and
´
p
÷ if ´ o
2
p
÷ o
2
and
¯
jo
2
p
÷jo
2
or ´j
p
÷j.
For o
2
=E[n
2
i
] the obvious estimate is ´ o
2
= ·
1
¸
N
i=1
´ n
2
i
, where ´ n
i
= n
i
÷x
0
i
´
.
For j we can directly use jo
2
=E[n
i
n
i1
] consistently estimated by
¯
jo
2
= ·
1
¸
N
i=2
´ n
i
´ n
i1
.
Or use j =E[n
i
n
i1
]´
E[n
i
]E[n
i1
] =E[n
i
n
i1
]´E[n
2
i
] consistently estimated by ´j =
·
1
¸
N
i=2
´ n
i
´ n
i1
´·
1
¸
N
i=1
´ n
2
i
and hence ´j´ o
2
= ·
1
¸
N
i=2
´ n
i
´ n
i1
.
(d) To answer (d) and (e) it is helpful to use summation notation:
´
V[
´
OLS
] =
¸
N
¸
i=1
x
i
x
0
i
¸
1
¸
´ o
2
N
¸
i=1
x
i
x
0
i
+ 2´j´ o
2
N
¸
i=2
x
i
x
0
i1
¸¸
N
¸
i=1
x
i
x
0
i
¸
1
= ´ o
2
¸
N
¸
i=1
x
i
x
0
i
¸
1
+ 2´j´ o
2
¸
N
¸
i=1
x
i
x
0
i
¸
1
¸
N
¸
i=2
x
i
x
0
i1
¸¸
N
¸
i=1
x
i
x
0
i
¸
1
(d) No. The usual OLS output estimate ´ o
2
(X
0
X)
1
is inconsistent as it ignores the
o¤diagonal terms and hence the second term above.
(e) No. The White heteroskedasticityrobust estimate is inconsistent as it also ignores
the o¤diagonal terms and hence the second term above.
43 (a) The error n is conditionally heteroskedastic, since V[n[r] =V[r[r] = r
2
V[[r] =
r
2
V[] = r
2
1 = r
2
which depends on the regressor r.
(b) For scalar regressor ·
1
X
0
X =·
1
¸
i
r
2
i
.
Here r
2
i
are iid with mean 1 (since E[r
2
i
] =E[(r
i
÷E[r
i
])
2
] =V[r
i
] = 1 using E[r
i
] = 0).
Applying a LLN (here Kolmogorov), ·
1
X
0
X =·
1
¸
i
r
2
i
p
÷ E[r
2
i
] = 1, so M
xx
= 1.
(c) V[n] =V[r] =E[(r)
2
] ÷ (E[r])
2
=E[r
2
]E[
2
] ÷ (E[r]E[])
2
=V[r]V[] ÷ 0 0 =
1 1 = 1 where use independence of r and  and fact that here E[r] = 0 and E[] = 0.
(d) For scalar regressor and diagonal ,
·
1
X
0
X =
1
·
N
¸
i=1
o
2
i
r
2
i
=
1
·
N
¸
i=1
r
i
2
r
2
i
=
1
·
N
¸
i=1
r
i
4
using o
2
i
= r
2
i
from (a).N
Here r
4
i
are iid with mean 3 (since E[r
4
i
] =E[(r
i
÷E[r
i
])
4
] = 3 using E[r
i
] = 0 and the
fact that fourth central moment of normal is 3o
4
= 3 1 = 3).
Applying a LLN (here Kolmogorov), ·
1
X
0
X =·
1
¸
i
r
4
i
p
÷ E[r
4
i
] = 3, so M
xx
=
3.
(e) Default OLS result
·(
´
OLS
÷)
d
÷A
0. o
2
M
1
xx
= A
0. 1 (1)
1
= A [0.1] .
(f ) White OLS result
·(
´
OLS
÷)
d
÷A
0. M
1
xx
M
xx
M
1
xx
= A
0. (1)
1
3 (1)
1
= A[0.3].
(g) Yes. Expect that failure to control for conditional heteroskedasticity when should
control for it will lead to inconsistent standard errors, though a priori the direction of
the inconsistency is not known. That is the case here.
What is unusual compared to many applications is that there is a big di¤erence in this
example  the true variance is three times the default estimate and the true standard
errrors are
3times larger.
45 (a) Di¤erentiate
0Q()
0
=
0u
0
Wu
0
=
0u
0
0
0u
0
Wu
0u
by chain rule for matrix di¤erentiation
= X
0
2Wu assuming W is symmetric
= 2X
0
Wu
Set to zero
2X
0
Wu = 0
= 2X
0
W(y ÷X)= 0
= X
0
Wy = X
0
WX
=
´
= (X
0
WX)
1
X
0
Wy
where need to assume the inverse exists.
Here W is rank r 1 =rank(X) = X
0
Z and Z
0
Z are rank 1 =X
0
Z(Z
0
Z)
1
Z
0
X is of
full rank 1.
(b) For W= I we have
´
=(X
0
IX)
1
X
0
Iy = (X
0
X)
1
X
0
y which is OLS.
Note that (X
0
X)
1
exists if · 1 matrix X is of full rank 1.
(c) For W=
1
we have
´
=(X
0
1
X)
1
X
0
1
y which is GLS (see (4.28)).
(d) For W= Z(Z
0
Z)
1
Z
0
we have
´
=(X
0
Z(Z
0
Z)
1
Z
0
X)
1
X
0
Z(Z
0
Z)
1
Z
0
y which is
2SLS (see (4.53)).
47 Given the information, E[r] = 0 and E[.] = 0 and
V[r] = E[r
2
] = E[(`n + )
2
] = `
2
o
2
u
+ o
2
"
V[.] = E[.
2
] = E[( + ·)
2
] =
2
o
2
"
+ o
2
v
Cov[r. .] = E[r.] = E[(`n + )( + ·)] = `o
2
"
Cov[r. n] = E[rn] = E[(`n + )n] = `o
2
u
(a) For regression of n on r we have
´
OLS
=
¸
i
r
2
i
1
¸
i
r
i
n
i
and as usual
plim(
´
OLS
÷) =
plim
¸
i
r
2
i
1
plim
¸
i
r
i
n
i
=
E[r
2
]
1
E[rn] as here data are iid
= (`
2
o
2
u
+ o
2
"
)
1
`o
2
u
.
(b) The squared correlation coe¢cient is
j
2
XZ
= [Cov[r. .]
2
]´[V[r]V[.]]
= [o
2
"
]
2
´[(`
2
o
2
u
+ o
2
"
)(
2
o
2
"
+ o
2
v
)]
(c) For single regressor and single instrument
´
IV
= (
¸
i
.
i
r
i
)
1
¸
i
.
i
n
i
= (
¸
i
.
i
r
i
)
1
¸
i
.
i
(r
i
+ n
i
)
= + (
¸
i
.
i
r
i
)
1
¸
i
.
i
n
i
= + (
¸
i
.
i
(`n
i
+ 
i
))
1
¸
i
.
i
n
i
= + (`
¸
i
.
i
n
i
+
¸
i
.
i

i
))
1
¸
i
.
i
n
i
= + (`:
zu
+ :
z"
)
1
:
zu
´
IV
÷ = :
zu
´ (`:
zu
+ :
z"
)
where :
zu
= ·
1
¸
i
.
i
n
i
and :
z"
= ·
1
¸
i
.
i

i
.
By a LLN :
zu
p
÷E[:
zu
] = E[.
i
n
i
] = E[(
i
+·
i
)n
i
] = 0 since , n and · are independent
with zero means.
By a LLN :
z"
p
÷ E[:
z"
] = E[.
i

i
] = E[(
i
+ ·
i
)
i
] = E[
2
i
] = o
2
"
.
´
IV
÷
p
÷0´` 0 + o
2
"
) = 0.
(d) If :
zu
= ÷:
z"
´` then `:
zu
= ÷:
z"
so `:
zu
÷ :
z"
= 0 and
´
IV
÷ = :
zu
´0
which is not de…ned.
(e) First
´
IV
÷ = :
zu
´ (`:
zu
+ :
z"
)
= 1´(` + :
z"
´:
zu
).
If :
zu
is large relative to :
z"
´` then ` is large relative to :
z"
´:
zu
so ` +:
z"
´:
zu
is
close to ` and 1´(` + :
z"
´:
zu
) is close to 1´`.
(f ) Given the de…nition of j
2
XZ
in part (c), j
2
XZ
is smaller the smaller is , the smaller is
o
2
"
, and the larger is `. So in the weak instruments case with small correlation between
r and . (ao j
2
XZ
is small),
´
IV
÷ is likely to converge to 1´` rather than 0, and there
is “…nite sample bias” in
´
IV
.
411 (a) The true variance matrix of OLS is
V[
´
OLS
] = (X
0
X)
1
X
0
X(X
0
X)
1
= (X
0
X)
1
X
0
o
2
(I
N
+AA
0
)X(X
0
X)
1
= o
2
(X
0
X)
1
+ o
2
(X
0
X)
1
X
0
AA
0
X(X
0
X)
1
.
(b) This equals or exceeds o
2
(X
0
X)
1
since (X
0
X)
1
X
0
AA
0
X(X
0
X)
1
is positive semi
de…nite. So the default OLS variance matrix, and hence standard errors, will generally
understate the true standard errors (the exception being if X
0
AA
0
X = 0).
(c) For GLS
V[
´
GLS
] = (X
0
1
X)
1
= (X
0
[o
2
(I +AA
0
)]
1
X)
1
= o
2
(X
0
[I +AA
0
]
1
X)
1
= o
2
(X
0
[I
N
÷A(I
m
+A
0
A)
1
A
0
]X)
1
= o
2
(X
0
X÷X
0
A(I
m
+A
0
A)
1
A
0
X)
1
.
(d) o
2
(X
0
X)
1
_ V[
´
GLS
] since
X
0
X _ X
0
X÷X
0
A(I
m
+A
0
A)
1
A
0
X in the matrix sense
= (X
0
X)
1
_ (X
0
X÷X
0
A(I
m
+A
0
A)
1
A
0
X)
1
in the matrix sense.
If we ran OLS and GLS and used the incorrect default OLS standard errors we would
obtain the puzzling result that OLS was more e¤…cient than GLS. But this is just an
artifact of using the wrong estimated standard errors for OLS.
(e) GLS requires (X
0
1
X)
1
which from (c) requires (I
m
+ A
0
A)
1
which is the
inverse of an :: matrix.
[We also need (X
0
X÷X
0
A(I
m
+A
0
A)
1
A
0
X)
1
but this is a smaller / / marix given
/ < : < ·.]
413 (a) Here = [1 1] and c = [1 0].
From bottom of page 86 the intercept will be
1
+ c
1
1
1
"
(c) = 1 + 1 1
1
"
(c) =
1 + 1
1
"
(c).
The slope will be
2
+ c
2
1
1
"
(c) = 1 + 0 1
1
"
(c) = 1.
The slope should be 1 at all quantiles.
The intercept varies with 1
1
"
(c). Here 1
1
"
(c) takes values ÷2.56, ÷1.68 , ÷1.05,
÷0.51, 0.0, 0.51, 1.05, 1.68 and 2.56 for c = 0.1, 0.2, .... , 0.9. It follows that the
intercept takes values ÷1.56, ÷0.68 , ÷0.05, 0.49, 1.0, 1.51, 2.05, 2.68.
[For example 1
1
"
(0.9) is 
such that Pr[ _ 
] = 0.9 for  ~ A[0. 4] or equivalently

such that Pr[. _ 
´2] = 0.9 for . ~ A[0. 1]. Then 
´2 = 1.28 so 
= 2.56.]
(b) The answers accord quite closely with theory as the slope and intercepts are quite
precisely estimated with slope coe¢cient standard errors less than 0.01 and intercept
coe¢cient standard errors less than 0.04.
(c) Now both the intercept and slope coe¢cients vary with the quantile. Both intercept
and slope coe¢cients increase with the quantile, and for 1 = 0.5 are within two standard
errors of the true values of 1 and 1.
(d) Compared to (b) it is now the intercept that is constant and the slope that varies
across quantiles.
This is predicted from theory similar to that in part (a). Now = [1 1] and c = [0 1].
From bottom of page 86 the intercept will be
1
+ c
1
1
1
"
(c) = 1 + 0 1
1
"
(c) = 1
and the slope will be
2
+ c
2
1
1
"
(c) = 1 + 1 1
1
"
(c) = 1 + 1
1
"
(c).
415 (a) The OLS slope estimate and standard error are 0.05209 and 0.00291, and
the IV estimates are 0.18806 and 0.02614. The IV slope estimate is much larger and
indicates a very large return to schooling. There is a lossin precision with IV standard
error ten times larger, but the coe¢cient is still statististically signi…cant.
(b) OLS of wage76 on an intercept and col4 gives slope coe¢cient 0.1559089 and OLS
regression of grade76 on an intercept and col4 gives slope coe¢cient 0.829019. From
(4.46) , dn´dr = (dn´d.)´(dr´d.) = 0.1559089´0.829019 = 0.18806. This is the same
as the IV estimate in part (a).
(c) We obtain Wald = (1.706234  1.550325) / ( 13.52703  12.69801) = 0.18806. This
is the same as the IV estimate in part (a).
(d) From OLS regression of grade76 on col4, 1
2
= 0.0208 and 1 = 60.37. This does
not suggest a weak instruments problem, except that precision of IV will be much lower
than that of OLS due to the relatively low 1
2
.
(e) Including the additional regressors the OLS slope estimate and standard error are
0.03304 and 0.00311, and the IV estimates are 0.09521 and 0.04932. The IV slope
estimate is again much larger and indicates a very large return to schooling. There is a
loss in precision with IV standard error now eighteed ten times larger, but the coe¢cient
is still statististically signi…cant using a onetail test at …ve percent.
Now OLS of wage76 on an intercept and col4 and other regressors gives slope coe¢cient
0.1559089 and OLS regression of grade76 on an intercept and col4 gives slope coe¢cient
0.829019. From (4.46) , dn´dr = (dn´d.)´(dr´d.) = 0.1559089´0.829019 = 0.18806.
This is the same as the IV estimate in part (a).
417 (a) The average of
´
OLS
over 1000 simulations was 1.502518.
This is close to the theoretical value of 1.5: plim(
´
OLS
÷ ) = `o
2
u
´
`
2
o
2
u
+ o
2
"
=
(1 1)´(1 1 + 1) = 1´2 and here = 1.
(b) The average of
´
IV
over 1000 simulations was 1.08551.
This is close to the theoretical value of 1: plim(
´
IV
÷) = 0 and here = 1.
(c) The observed values of
´
IV
over 1000 simulations were skewed to the right of = 1,
with lower quartile .964185, median 1.424028 and upper quartile 1.7802471. Exercise
47 part (e) suggested concentration of
´
IV
÷ around 1´` = 1 or concetration of
´
IV
around + 1 = 2 since here = 1.
(d) The 1
2
and 1 statistics across simulations from OLS regression (with intercept) of
. on r do indicate a likely weak instruments problem.
Over 1000 simulations, the average 1
2
was 0.0148093 and the average 1 was 1.531256.
[Aside: From Exercise 47 (b) j
2
XZ
= [o
2
"
]
2
´[(`
2
o
2
u
+ o
2
"
)(
2
o
2
"
+ o
2
v
) = [0.01]
2
´(1 +
1)(0.01
2
+ 1) = 0.00005.]
5. Chapter 5: Extremum, ML, NLS
51 First note that
0
´
E[n[r]
0r
=
0
0r
exp(1 + 0.01r)[1 + exp(1 + 0.01r)]
1
= 0.01 exp(1 + 0.01r)[1 + exp(1 + 0.01r)]
1
÷exp(1 + 0.01r) 0.01 exp(1 + 0.01r)[1 + exp(1 + 0.01r)]
2
= 0.01
exp(1 + 0.01r)
[1 + exp(1 + 0.01r)]
2
upon simpli…cation
(a) The average marginal e¤ect over all observations.
0
´
E[n[r]
0r
=
1
100
100
¸
i=1
0.01
exp(1 + 0.01i)
1 + exp(1 + 0.01i)
= 0.0014928.
(b) The sample mean r =
1
100
¸
100
i=1
i = 50.5. Then
0
´
E[n[r]
0r
x
= 0.01
exp(1 + 0.01 50.5)
[1 + exp(1 + 0.01r 50.5)]
2
= 0.0014867.
(c) Evaluating at r = 90
0
´
E[n[r]
0r
90
= 0.01
exp(1 + 0.01 90)
[1 + exp(1 + 0.01r 90)]
2
= 0.0011318.
(d) Using the …nite di¤erence method
´
E[n[r]
r
90
=
exp(1 + 0.01 90)
1 + exp(1 + 0.01r 90)
÷
exp(1 + 0.01 90)
1 + exp(1 + 0.01r 90)
= 0.0011276.
Comment: This example is quite linear, leading to answers in (a) and (b) being close,
and similarly for (c) and (d). A more nonlinear function, with greater variation is
obtained using
´
E[n[r] = exp(0 +0.04r)´[1 +exp(0 +0.04r)] for r = 1. .... 100. Then the
answers are 0.0026163, 0.0013895, 0.00020268, and 0.00019773.
52 (a) Here
ln1(n) = lnn ÷2 ln` ÷n´` with ` = exp(x
0
)´2 and ln` = x
0
÷ln2
= lnn ÷2(x
0
÷ln2) ÷n´[exp(x
0
)´2]
= lnn ÷2x
0
+ 2 ln 2 ÷2n exp(÷x
0
)
so
Q
N
()=
1
·
¸
i
ln1(n
i
) =
1
·
¸
i
¦lnn
i
÷2x
0
+ 2 ln 2 ÷2n
i
exp(÷x
0
)¦.
(b) Now using x nonstochastic so need only take expectations wrt n
Q
0
() = plimQ
N
()
= plim
1
·
¸
i
lnn
i
÷plim
1
·
¸
i
2x
0
i
+plim
1
·
¸
i
2 ln 2 ÷plim
1
·
¸
i
2n
i
exp(÷x
0
i
)
= lim
1
·
¸
i
E[lnn
i
] ÷2 lim
1
·
¸
i
x
0
i
+2 ln2 ÷2 lim
1
·
¸
i
E[n
i
] exp(÷x
0
i
)
= lim
1
·
¸
i
E[lnn
i
] ÷2 lim
1
·
¸
i
x
0
i
+2 ln2 ÷2 lim
1
·
¸
i
exp(x
0
i
0
) exp(÷x
0
i
).
where the last line uses E[n
i
] = exp(x
0
i
0
) in the dgp and we do not need to evaluate
E[lnn
i
] as the …rst sum does not invlove and will therefore have derivative of 0 wrt
.
(c) Di¤erentiate wrt (not
0
)
0Q
0
()
0
= ÷2 lim
1
·
¸
i
x
i
+lim
2
·
¸
i
exp(x
0
i
0
) exp(÷x
0
i
)x
i
= 0 when =
0
.
[Also 0
2
Q
0
()´00
0
= ÷2 lim·
1
¸
i
exp(x
0
i
0
) exp(÷x
0
i
)x
i
x
0
i
is negative de…nite at
0
, so local max.]
Since plimQ
N
() attains a local maximum at =
0
, conclude that
´
= arg max Q
N
()
is consistent for
0
.
(d) Consider the last term. Since n
i
exp(÷x
0
i
) is not iid need to use Markov SLLN.
This requires existence of second moments of n
i
which we have assumed.
53 (a) Di¤erentiating Q
N
() wrt
0Q
N
0
=
1
·
¸
i
÷2x
i
+ 2n
i
exp(÷x
0
i
)x
i
=
1
·
¸
i
2 ¦n
i
exp(÷x
0
i
) ÷1¦x
i
rearranging
=
1
·
¸
i
2
n
i
÷exp(x
0
i
)
exp(x
0
i
)
x
i
multiplying by
exp(x
0
i
)
exp(x
0
i
)
(b) Then
limE
¸
0Q
N
0
0
¸
= lim
1
·
¸
i
2
n
i
÷exp(x
0
i
0
)
exp(x
0
i
0
)
x
i
= 0 if E[n
i
[x
i
] = exp(x
0
i
0
).
So essential condition is correct speci…cation of E[n
i
[x
i
].
(c) From (a)
·
0Q
N
0
0
=
1
·
¸
i
2
n
i
÷exp(x
0
i
0
)
exp(x
0
i
0
)
x
i
.
Apply CLT to average of the term in the sum.
Now n
i
[x
i
has mean exp(x
0
i
0
) and variance (exp(x
0
i
0
))
2
´2.
So A
i
= 2
y
i
exp(x
0
i
0
)
exp(x
0
i
0
)
x
i
has mean 0 and variance 4
(exp(x
0
i
0
))
2
=2
(exp(x
0
i
0
))
2
x
i
x
0
i
= 2x
i
x
0
i
.
Thus for 7
N
= (V[
·
A])
1=2
(
·
A ÷
·E[
A]) =
1
N
¸
i
V[A
i
]
1=2
(
1
p
N
¸
i
A
i
)
7
N
=
1
·
¸
i
2x
i
x
0
i
1=2
1
·
¸
i
2
n
i
÷exp(x
0
i
0
)
exp(x
0
i
0
)
x
i
d
÷A[0. I]
=
1
·
¸
i
2
n
i
÷exp(x
0
i
0
)
exp(x
0
i
0
)
x
i
d
÷A
¸
0. lim
1
·
¸
i
2x
i
x
0
i
(d) Here n
i
is not iid. Use Liapounov CLT.
This will need a (2 + c)
th
absolute moment of n
i
. e.g. 4
th
moment of n
i
.
(e) Di¤erentiating (a) wrt
0
yields
0
2
Q
N
00
0
0
=
1
·
¸
i
÷2
exp(x
0
i
0
)
exp(x
0
i
0
)
x
i
x
0
i
p
÷lim
1
·
¸
i
÷2x
i
x
0
i
.
(f ) Combining
·(
´
÷
0
)
d
÷A[0. A(
0
)
1
B(
0
)A(
0
)
1
]
d
÷A
¸
0.
lim
1
·
¸
i
÷2x
i
x
0
i
1
lim
1
·
¸
i
2x
i
x
0
i
lim
1
·
¸
i
÷2x
i
x
0
i
1
¸
d
÷A
¸
0.
lim
1
·
¸
i
2x
i
x
0
i
1
¸
.
(g) Test H
0
:
0j
_
j
against H
a
:
0j
<
j
at level .05.
´
a
~ A
¸
.
¸
i
2x
i
x
0
i
1
= .
j
=
(
´
j
÷
j
)
:
j
a
~ A[0. 1], where :
j
is ,
th
diag entry in
¸
i
2x
i
x
0
i
1
.
Reject H
0
at level 0.05 if .
j
< ÷.
:05
= ÷1.645.
55 (a) t =
´
0
1
´:c[
´
0
1
] = 5´2 = 2.5. Since [2.5[ _ .
:05
= 1.645 we reject H
0
.
(b) Rewrite as H
0
: 0
1
÷20
2
= 0 versus H
0
: 0
1
÷20
2
= 0.
Use (5.32). Test H
0
: R = r where R = [1 ÷2] and r = 0 and
0
= [0
1
0
2
].
Here
´
=
¸
5
2
so R
´
÷r = [1 ÷2]
¸
5
2
= 1.
Also V[
´
] = ·
1
´
C =
¸
4 1
1 1
using Cov[
´
0
1
.
´
0
2
] = (Cor[
´
0
1
.
´
0
2
])
2
V[
´
0
1
]V[
´
0
2
] = 0.5
2
2
2
1
2
= 1.
Then R·
1
´
CR
0
= [1 ÷2]
¸
4 1
1 1
¸
1
÷2
= 4
so W = (R
´
÷r)
0
R(·
1
´
C)R
0
1
(R
´
÷r) = 1 4
1
1.
Since W = 0.25 < .
2
1;:05
= 3.84 do not reject H
0
.
[Alternatively as only one restriction here, note that
´
0
1
÷ 2
´
0
2
has variance V[
´
0
1
] +
4V[
´
0
1
] ÷4Cov[
´
0
1
.
´
0
2
] = 4 + 4 1 ÷4 1 = 4, leading to
t =
´
0
1
÷2
´
0
2
se[
´
0
1
÷2
´
0
2
]
=
5 ÷3
4
= 0.5
and do not reject as [0.5[ < .
:05
= 1.96. Note that t
2
=W.]
(c) Use (5.32) Test H
0
: R = r where R =
¸
1 0
0 1
and r =
¸
0
0
and =
¸
0
1
0
2
.
Then R
´
÷r =
¸
1 0
0 1
¸
5
2
=
¸
5
2
and R·
1
´
CR
0
=
¸
1 0
0 1
¸
4 1
1 1
¸
1 0
0 1
=
¸
4 1
1 1
so W = (R
´
÷r)
0
R(·
1
´
C)R
0
1
(R
´
÷r) =
5 2
¸
4 1
1 1
¸
5
2
= 124.
Since W = 124 < .
2
2;:05
= 5.99 reject H
0
.
57 Results will vary as uses generated data. Expect
´
1
· ÷1 and
´
2
· 1 and standard
errors similar to those below.
(a) For NLS got
´
1
= ÷1.1162 and
´
2
= 1.1098 with standard errors 0.0551 and 0.0256.
(b) Yes, will need to use sandwich errors due to heteroskedasticity as V[n[r] = exp(
1
+
2
r)
2
´2. Note that standard errors given in (a) do not correct for heteroskedasticity.
(c) For MLE got
´
1
= ÷1.0088 and
´
2
= 1.0262 with standard errors 0.0224 and 0.0215.
(d) Sandwich errors can be used but are not necessary since the ML simpli…cation that
A = ÷B is appropriate here.
Additional Exercises
c 2005 A. Colin Cameron and Pravin K. Trivedi
"Microeconometrics: Methods and Applications"
1. Chapter 1: Introduction
No exercises.
2. Chapter 2: Causal and Noncausal Models
No exercises.
3. Chapter 3: Microeconomic Data Structures
No exercises.
4. Chapter 4: Linear Models
47 THIS QUESTION HAD SEVERAL ERRORS (notable (d)(f )).
USE THE FOLLOWING REVISED QUESTION INSTEAD.
(Adapted from Nelson and Startz, 1990). Consider the three equation model, n = r+n;
r = `n+; . = +·. where the mutually independent errors n,  and · are iid normal
with mean 0 and variances, respectively, o
2
u
, o
2
"
and o
2
v
.
(a) Show that plim(
b
OLS
) = `o
2
u
´
`
2
o
2
u
+ o
2
"
.
(b) Show that j
2
XZ
= [o
2
"
]
2
´[(`
2
o
2
u
+ o
2
"
)(
2
o
2
"
+ o
2
v
)].
(c) Show that
b
IV
= :
zu
´ (`:
zu
+ :
z"
)
p
!0, where, for example, :
zu
= ·
1
P
i
.
i
n
i
.
(d) Show that
b
IV
is not de…ned if :
zu
= :
z"
´`. Nelson and Startz (1990) argue
that this region is visited often enough that the mean of
b
IV
does not exist.
(e) Show that
b
IV
= 1´(` +:
z"
´:
zu
) equals 1´` if :
zu
is large relative to :
z"
´`.
Nelson and Startz (1990) conclude that if :
zu
is large relative to :
z"
´` then
b
IV
is concentrated around 1´`, rather than the probability limit of zero from part (c).
1
(f ) Nelson and Startz (1990) argue that
b
IV
concentrates on 1´` more rapidly the
smaller is , the smaller is o
2
"
, and the larger is `. Given your answer in part (c), what
do you conclude about the small sample distribution of
b
IV
when j
2
XZ
is small?
410 Consider weighted least squares estimation of household medical expenditures on
household total expenditure using the data of section 4.6.4 where again only those with
positive medical expenditures are included, but in this question the regression is in
levels and not logs. [Use program mma04p2qreg and generate new variables med =
exp(lnmed) and total = exp(lntotal).]
(a) Perform OLS regression of med on total. You should obtain slope coe¢cient of
0.0938.
(b) Do you think that the errors in regression of med on total are likely to be het
eroskedastic? Explain.
(c) Compare the default OLS standard errors for the OLS slope coe¢cient estimate
with heteroskedasticrobust standard errors. Comment.
(d) OLS packages have an option for weighting. By appropriate use of weights option
in your package, perform GLS regression of med on total under the assumption that the
error has variance o
2
(total)
2
.
(e) Compare the default standard errors for the weighted LS slope coe¢cient estimate
with heteroskedasticrobust standard errors for the weighted LS slope coe¢cient esti
mate. Comment.
(f ) Compare the default standard errors for the weighted LS slope coe¢cient estimate
with heteroskedasticrobust standard errors for the weighted LS slope coe¢cient esti
mate. Comment.
(g) Obtain the LS estimates in part (d) manually by (unweighted) OLS regression by
…rst appropriately transforming med, total and the intercept.
411 Consider least squares estimation of the model y = X +u where = o
2
(I +AA
0
)
where A is an · : matrix with / < : < · and for simplicity we assume that o
2
and A are known.
(a) Obtain the variance of the OLS estimator using (4.19).
(b) Compare your answer in (a) to the default variance estimate o
2
(X
0
X)
1
. Will
default OLS standard errors be biased / inconsistent in any particular direction?
(c) Give the variance of the GLS estimator of , using the result that (I +AA
0
)
1
=
I
N
A(I
m
+A
0
A)
1
A
0
.
(d) In general GLS is more e¢cient than OLS. But what if we instead compare the (in
correct) default variance o
2
(X
0
X)
1
of OLS with the true variance of GLS? Comment.
(e) GLS requires matrix inversion. For this problem what is the largest size matrix
that needs to be inverted to perform GLS?
412 Consider regression of n on r when data (n. r) take values (1. 2), (0. 1), (0. 0),
(0. 1) and (1. 2).
(a) Using an appropriate statistical package obtain the OLS estimate of the slope co
e¢cient and the least absolute deviations regression estimate of the slope coe¢cient.
Compare the two estimates of the slope coe¢cient and the precision of the estimates.
(b) From part (a) you should …nd that the intercept is 0. We will …nd the least absolute
deviations regression estimate of the slope coe¢cient by grid search. For the given
data compute (4.34) with c = 0.5 and x
0
i
=r
i
(one regressor and no intercept) for
= 0.3. 0.35. 0.4. 0.45. 0.5. 0.55 and 0.6. Which value of minimizes (4.34)? Compare
your answer to that in part (a).
413 Consider dgp n = 1 + 1 r and n =  where r N[0. 25] and  N[0. 4]. This
is the same dgp as in Section 4.5.3 page 84 except that n is homoskedastic.
(a) Using the general result at bottom page 86 give the true slope and intercept coe¢
cient for the j
th
quantile of n conditional on x for quantiles c = 0.1, 0.2, .... , 0.9. [Hint:
1
1
"
(0.1) is that value 
such that Pr[ 
] = 0.1 for  N[0. 4]].
(b) Generate a sample of size 10. 000 from this dgp. This requires minor modi…cation
of the Section 4.5.3 program given on the book website. Estimate quantile regressions
of n on r for c = 0.1, 0.2, .... , 0.9. Compare your answers to the theoretical answers in
part (a).
(c) Redo part (b) for dgp with the modi…cation that n =
p
r rather than . Comment
on any changes compared to (b).
(d) Redo part (b) for dgp with the modi…cations that r is the square of a N[0. 25] variate
and n = r. [This setup ensures that x
0
1 as assumed on page 86]. Comment on
any changes compared to (b).
414 If n is exponential distributed with mean exp(x
0
) then the variance is [exp(x
0
)]
2
.
This can be written as the regression model n = exp(x
0
)+n where n = exp(x
0
) and
 iid[0. 1].
(a) Show by argument similar to that in Section 4.6.1 that the population c
th
quantile
of n conditional on x is j
q
(x. ) = exp(x
0
) + exp(x
0
) 1
1
"
(c).
(b) Hence state what happens for this model to the intercept and slope coe¢cients of
the population c
th
quantile of n conditional on x as c varies.
415 Consider the same data set as that in example 4.9.6 and use the program given
at the book website. Note that wage76 is log hourly wage, grade76 is years of schooling
and col4 is proximity to college.
(a) Regress wage76 on an intercept and grade76, with estimation by OLS and by IV
where col4 is the instrument for grade76. Compare the size and precision of the OLS
and IV slope coe¢cients, where heteroskedasticityrobust standard errors are used.
(b) Perform OLS regression of wage76 on an intercept and col4 and perform OLS
regression of grade76 on an intercept and col4. Obtain the ratio of the two slope
coe¢cient estimates as in (4.46) and compare to your answer in (a).
(c) The instrument here is a binary variable and there is one regressor. Compute the
Wald estimate (see (4.48)) and compare it to the IV estimate from part (a).
(d) Do you think there might be a weak instrument problem here? Provide appropriate
statistical measures.
(e) Redo parts (a) and (b), with all regressions now including as additional (exogenous)
regressors age76, agesq76, black, south76, smsa76, reg2reg9,
smsa66, momdad14, sinmom14, nodaded, nomomed, daded, momed, and famed1famed8.
Does result (4.46) still old?
416 Consider the linear regression model y= X +u and the IV estimator
b
= (Z
0
X)
1
Z
0
y
where Z and X are · / full rank matrices of constants (i.e. are nonstochastic).
(a) Suppose u N[0. ]. Obtain the …nite sample distribution of
b
.
(b) Suppose u [0. ], the probability limits of ·
1
X
0
X, ·
1
Z
0
Z and ·
1
Z
0
X all
exist and are …nite nonsingular and ·
1=2
Z
0
u
d
! N[0. plim·
1
Z
0
Z]. Obtain the
limit distribution of
p
·(
b
).
(c) Hence obtain the asymptotic distribution of
b
and compare to your answer in (a).
417 Consider the same three equation model as in Exercise 47, with n = r + n;
r = `n+; . = +·. where the mutually independent errors n,  and · are iid normal
with mean 0 and variances 1, and = 1, ` = 1 and = 0.01. Perform 1000 simulations
with · = 100 where in each simulation data (n. r. .) is generated and we obtain (1)
b
OLS
from OLS regression of n on r without intercept; (2)
b
IV
from IV regression of
n on r without intercept with instrument .; (3) 1
2
and 1 from OLS regression (with
intercept) of . on r.
(a) Compare the average across simulations of
b
OLS
with the probability limit given in
Exercise 47 part (b). Comment.
(b) Compare the average across simulations of
b
IV
with the probability limit given in
Exercise 47 part (c). Comment.
(c) Obtain percentiles and quartiles of the observed values across simulations of
b
IV
.
Comment in the light of Exercise 47 part (e).
(d) Do the 1
2
and 1 statistics across simulations from OLS regression (with intercept)
of . on r indicate a likely weak instruments problem?
Exercises: Di¢culty and Topics Covered
c 2005 A. Colin Cameron and Pravin K. Trivedi
"Microeconometrics: Methods and Applications"
Di¢culty: 1 is easiest; 2 is harder; 3 is hardest.
1. Chapter 1: Introduction
No exercises.
2. Chapter 2: Causal and Noncausal Models
No exercises.
3. Chapter 3: Microeconomic Data Structures
No exercises.
1
4. Chapter 4: Linear Models
Ques Sol:given Diff Section Topic
4 1 Y es 2 4:4 OLS with 6=
2
I
4 2 1 4:4 OLS with heteroskedasticity
4 3 Y es 3 4:4 OLS with heteroskedasticity
4 4 3 4:4 OLS limit distribution
4 5 Y es 1 4:5 LS estimators minimize u
0
Wu
4 6 1 4:8 IV estimation theory
4 7 Y es 3 4:8 IV estimation weak instruments theory
4 8 2 4:6 Quantile regression data application
4 9 Y es 3 4:9 IV with weak instruments data application
4 10 2 4:5 Weighted LS application
4 11 Y es 2 4:5 GLS theory
4 12 1 4:6 Quantile regression with arti…cial data
4 13 Y es 2 4:6 Quantile regression with generated data
4 14 2 4:6 Quantile regression theory
4 15 Y es 2 4:8 IV data application
4 16 2 4:8 IV theory
4 17 Y es 2 4:9 IV with weak instruments simulation
5. Chapter 5: ML and NLS Estimation
Ques Sol:given Diff Section Topic
5 1 Y es 1 5:2 Marginal e¤ects
5 2 Y es 2 5:3 mestimator example asymptotic theory
5 3 Y es 2 5:3 mestimator example asymptotic theory
5 4 3 4:3 mestimator example asymptotic theory
5 5 Y es 2 5:5 Wald test of linear restrictions
5 6 2 5:7 NLS estimation theory
5 7 Y es 2 5:7 8 NLS and ML with generated data
5 8 1 5:2 Marginal e¤ects
= b 2 " N X i=1 xi x0 i # i=1 1 + 2bb 2 " N X i=1 xi x0 i i 1 i=2 # 1" N X i=2 i i=1 xi x0 1 i #" N X i=1 xi x0 i # 1 43 (a) The error u is conditionally heteroskedastic. 0 ... .. . 4 . i i 1 i=2 b b p 2 ] consistently estimated by b = Or use =E[ui ui 1 ]= E[ui ]E[ui 1 ] =E[ui ui 1 ]=E[ui P P P N 1 N ui ui 1 =N 1 N u2 and hence bb2 = N 1 N ui ui 1 . .. The usual OLS output estimate b2 (X0 X) 1 is inconsistent as it ignores the o¤diagonal terms and hence the second term above. 0 . where ui = yi x0 b . 7 . where 6 6 bb 2 6 b =6 6 0 6 6 . So we use b V[ b OLS ] = (X0 X) 1 X0 b X(X0 X) 1 . i i P p Applying a LLN (here Kolmogorov). i i i=1 bi P 2 =E[u u 2 For we can directly use ] consistently estimated by d = N 1 N ui ui 1 . . .Here depends on just two parameters and hence can be consistently estimated as N ! 1. since V[ujx] =V[x"jx] = x2 V["jx] = x2 V["] = x2 1 = x2 which depends on the regressor x. 7 . . p p p 2 p and b ! if b2 ! 2 and d ! 2 or b ! . . bb 2 (d) To answer (d) and (e) it is helpful to use summation notation: "N # 1" #" N # N N X X X X b V[ b ] = xi x0 b2 xi x0 + 2bb2 xi x0 xi x0 OLS i i i=1 1 (d) No. . 0 2 b2 bb 2 .P b For 2 =E[u2 ] the obvious estimate is b2 = N 1 N u2 . so Mxx = 1: i i (c) V[u] =V[x"] =E[(x")2 ] (E[x"])2 =E[x2 ]E["2 ] (E[x]E["])2 =V[x]V["] 0 0 = 1 1 = 1 where use independence of x and " and fact that here E[x] = 0 and E["] = 0: . i Here x2 are iid with mean 1 (since E[x2 ] =E[(xi E[xi ])2 ] =V[xi ] = 1 using E[xi ] = 0).. 3 0 ... P (b) For scalar regressor N 1 X0 X =N 1 i x2 . N 1 X0 X =N 1 i x2 ! E[x2 ] = 1. . . i=2 b b i=1 bi i=2 b b . The White heteroskedasticityrobust estimate is inconsistent as it also ignores the o¤diagonal terms and hence the second term above. (e) No. 7 7 0 7 7 2 7 bb 5 b2 .
P p Applying a LLN (here Kolmogorov). Expect that failure to control for conditional heteroskedasticity when should control for it will lead to inconsistent standard errors. 1 1 ) ! N 0.1] : (f ) White OLS result p d 1 N ( b OLS ) ! N 0.3]: (g) Yes. What is unusual compared to many applications is that there is a big di¤erence in this example . 45 (a) Di¤erentiate @Q( ) @ @u0 Wu @ @u0 @u0 Wu = by chain rule for matrix di¤erentiation @ @u = X0 2Wu assuming W is symmetric = = 2X0 Wu Set to zero 2X0 Wu = 0 ) X )= 0 0 Wy = X0 WX ) X b = (X0 WX) 1 X0 Wy ) 2X0 W(y . (1) = N [0. N 1 X0 X =N 1 i x4 ! E[x4 ] = 3. (1) 1 3 (1) 1 = N [0. Mxx Mx 1 x Mxx = N 0. though a priori the direction of the inconsistency is not known. (e) Default OLS result p N ( b OLS d 2 1 Mxx = N 0.(d) For scalar regressor and diagonal N 1 .the true variance is three times the default estimate and the true standard p errrors are 3times larger. 2 2 i xi X0 X = N 1 X N i=1 = N N 1 X 2 2 1 X 4 xi xi = xi N N i=1 i=1 using 2 = x2 from (a).N i i Here x4 are iid with mean 3 (since E[x4 ] =E[(xi E[xi ])4 ] = 3 using E[xi ] = 0 and the i i fact that fourth central moment of normal is 3 4 = 3 1 = 3). so Mx x = i i 3. That is the case here.
where need to assume the inverse exists. z]2 ]=[V[x]V[z]] = [ 2 ]2 =[( 2 2 + 2 )( 2 " u " 2 " + 2 )] v (c) For single regressor and single instrument b IV b IV P P = ( i zi xi ) 1 i zi yi P P = ( i zi xi ) 1 i zi (xi + ui ) P P = + ( i zi xi ) 1 i zi ui P P = + ( i zi ( ui + "i )) 1 i zi ui P P 1P = +( i zi u i + i zi "i )) i zi u i 1 = + ( mzu + mz" ) mzu = mzu = ( mzu + mz" ) . (b) For W = I we have b =(X0 IX) 1 X0 Iy = (X0 X) 1 X0 y which is OLS. u] = E[xu] = E[( u + ")u] = 2 u P 2 (a) For regression of y on x we have b OLS = i xi plim( b OLS 1P i xi yi 2 " and as usual P P 1 ) = plim i x2 plim i xi ui i 1 = E[x2 ] E[xu] as here data are iid = ( 2 2 + 2) 1 2 : u " u (b) The squared correlation coe¢ cient is 2 XZ = [Cov[x.28)).53)). Here W is rank r > K =rank(X) ) X0 Z and Z0 Z are rank K ) X0 Z(Z0 Z) full rank K. (c) For W = 1 1 Z0 X is of (d) For W = Z(Z0 Z) 2SLS (see (4. 1 Z0 X) 1 X0 Z(Z0 Z) 1 Z0 y we have b =(X0 Z(Z0 Z) which is 47 Given the information. we have b =(X0 1 Z0 1 X) 1 X0 1 y which is GLS (see (4. Note that (X0 X) 1 exists if N K matrix X is of full rank K. z] = E[xz] = E[( u + ")( " + v)] = Cov[x. E[x] = 0 and E[z] = 0 and V[x] = E[x2 ] = E[( u + ")2 ] = 2 2 + 2 u " V[z] = E[z 2 ] = E[( " + v)2 ] = 2 2 + 2 " v Cov[x.
u and v are independent with zero means. and there XZ b . So in the weak instruments case with small correlation between " x and z (ao 2 is small). (e) First b IV mz" so mzu = mzu = ( mzu + mz" ) = 1=( + mz" =mzu ): + mz" =mzu is If mzu is large relative to mz" = then is large relative to mz" =mzu so close to and 1=( + mz" =mzu ) is close to 1= : (f ) Given the de…nition of 2 in part (c). and the larger is . the smaller is XZ XZ 2 . p By a LLN mz" ! E[mz" ] = E[zi "i ] = E[( "i + vi )"i ] = E["2 ] = 2 : " i b IV ! 0= p 0+ 2) " = 0: mz" = 0 and b IV = mzu =0 (d) If mzu = mz" = then mzu = which is not de…ned. p By a LLN mzu ! E[mzu ] = E[zi ui ] = E[( "i +vi )ui ] = 0 since ". 2 is smaller the smaller is . So the default OLS variance matrix. (c) For GLS V[ b GLS ] = (X0 = 2 2 2 1 2 X) 1 1 = (X0 [ = = (I + AA0 )] 1 X) 1 1 (X0 [I + AA0 ] (X0 [IN (X0 X X) A(Im + A0 A) X0 A(Im + A0 A) 1 A0 ]X) 1 1 1 A0 X) : . b IV is likely to converge to 1= rather than 0. is “…nite sample bias” in IV 411 (a) The true variance matrix of OLS is V[ b OLS ] = (X0 X) = (X0 X) 2 1 1 X0 X(X0 X) X0 1 2 1 1 1 (IN +AA0 )X(X0 X) 2 = (X0 X) + (X0 X) 1 X0 AA0 X(X0 X) : (b) This equals or exceeds 2 (X0 X) 1 since (X0 X) 1 X0 AA0 X(X0 X) 1 is positive semide…nite. and hence standard errors. will generally understate the true standard errors (the exception being if X0 AA0 X = 0).P P where mzu = N 1 i zi ui and mz" = N 1 i zi "i .
0:49. Both intercept and slope coe¢ cients increase with the quantile. and for 1 = 0:5 are within two standard errors of the true values of 1 and 1. 1:05. 0:0. It follows that the intercept takes values 1:56. The IV slope estimate is much larger and ... 0:68 . 0:05. 1:68 and 2:56 for q = 0:1. The intercept varies with F" 1 (q). 1:0.(d) 2 (X0 X) 1 X0 X ) (X0 X) 1 V[ b GLS ] since X0 X (X0 X X0 A(Im + A0 A) X0 A(Im + A0 A) 1 A0 X 1 A0 X) 1 in the matrix sense in the matrix sense.] 413 (a) Here = [1 1] and = [1 0]: From bottom of page 86 the intercept will be 1 + 1 F" 1 (q) = 1 + 1 F" 1 (q) = 1 + F" 1 (q): The slope will be 2 + 2 F" 1 (q) = 1 + 0 F" 1 (q) = 1: The slope should be 1 at all quantiles. . (e) GLS requires (X0 1 X) 1 which from (c) requires (Im + A0 A) 1 which is the inverse of an m m matrix. From bottom of page 86 the intercept will be 1 + 1 F" 1 (q) = 1 + 0 F" 1 (q) = 1 and the slope will be 2 + 2 F" 1 (q) = 1 + 1 F" 1 (q) = 1 + F" 1 (q): 415 (a) The OLS slope estimate and standard error are 0:05209 and 0:00291. 1]. 0:2. If we ran OLS and GLS and used the incorrect default OLS standard errors we would obtain the puzzling result that OLS was more e¤…cient than GLS. [For example F" 1 (0:9) is " such that Pr[" " ] = 0:9 for " N [0. 4] or equivalently " such that Pr[z " =2] = 0:9 for z N [0. 0:51. But this is just an artifact of using the wrong estimated standard errors for OLS. . This is predicted from theory similar to that in part (a). Now = [1 1] and = [0 1].. (d) Compared to (b) it is now the intercept that is constant and the slope that varies across quantiles. 0:51. (c) Now both the intercept and slope coe¢ cients vary with the quantile. Here F" 1 (q) takes values 2:56. 1:68 . Then " =2 = 1:28 so " = 2:56. [We also need (X0 X X0 A(Im + A0 A) 1 A0 X) 1 but this is a smaller k k marix given k < m < N . 0:9. 2:05. 1:05. 2:68.] (b) The answers accord quite closely with theory as the slope and intercepts are quite precisely estimated with slope coe¢ cient standard errors less than 0:01 and intercept coe¢ cient standard errors less than 0:04. 1:51. and the IV estimates are 0:18806 and 0:02614.
69801) = 0. This is the same as the IV estimate in part (a). the average R2 was 0.706234 .502518. dy=dx = (dy=dz)=(dx=dz) = 0:1559089=0:829019 = 0:18806. but the coe¢ cient is still statististically signi…cant using a onetail test at …ve percent.531256. with lower quartile .08551.52703 .550325) / ( 13. There is a loss in precision with IV standard error now eighteed ten times larger.18806. (c) The observed values of b IV over 1000 simulations were skewed to the right of = 1.0148093 and the average F was 1. 2= u 2 2 u + 2 " = (b) The average of b IV over 1000 simulations was 1. This is close to the theoretical value of 1: plim( b IV ) = 0 and here = 1.12. This is close to the theoretical value of 1:5: plim( b OLS ) = (1 1)=(1 1 + 1) = 1=2 and here = 1.7802471. dy=dx = (dy=dz)=(dx=dz) = 0:1559089=0:829019 = 0:18806.964185.1. [Aside: From Exercise 47 (b) 2 = [ 2 ]2 =[( 2 2 + 2 )( 2 2 + 2 ) = [0:01]2 =(1 + v " " u " XZ 1)(0:012 + 1) = 0:00005:] . From (4.indicates a very large return to schooling. R2 = 0:0208 and F = 60:37.46) . but the coe¢ cient is still statististically signi…cant. (e) Including the additional regressors the OLS slope estimate and standard error are 0:03304 and 0:00311. 417 (a) The average of b OLS over 1000 simulations was 1. (d) From OLS regression of grade76 on col4. There is a lossin precision with IV standard error ten times larger. Now OLS of wage76 on an intercept and col4 and other regressors gives slope coe¢ cient 0:1559089 and OLS regression of grade76 on an intercept and col4 gives slope coe¢ cient 0:829019. (c) We obtain Wald = (1.424028 and upper quartile 1. This does not suggest a weak instruments problem. This is the same as the IV estimate in part (a). This is the same as the IV estimate in part (a). (b) OLS of wage76 on an intercept and col4 gives slope coe¢ cient 0:1559089 and OLS regression of grade76 on an intercept and col4 gives slope coe¢ cient 0:829019. Exercise 47 part (e) suggested concentration of b IV around 1= = 1 or concetration of b IV around + 1 = 2 since here = 1: (d) The R2 and F statistics across simulations from OLS regression (with intercept) of z on x do indicate a likely weak instruments problem. Over 1000 simulations. and the IV estimates are 0:09521 and 0:04932. except that precision of IV will be much lower than that of OLS due to the relatively low R2 . median 1. From (4. The IV slope estimate is again much larger and indicates a very large return to schooling.46) .
0:0013895. 100. and 0:00019773. b @ E[yjx] 1 X = 0:01 @x 100 i=1 100 exp(1 + 0:01i) = 0:0014928: 1 + exp(1 + 0:01i) = 50:5. leading to answers in (a) and (b) being close. with greater variation is b obtained using E[yjx] = exp(0 + 0:04x)=[1 + exp(0 + 0:04x)] for x = 1. Then the answers are 0:0026163. . A more nonlinear function. NLS 51 First note that b @ E[yjx] @x = @ exp(1 + 0:01x)[1 + exp(1 + 0:01x)] 1 @x = 0:01 exp(1 + 0:01x)[1 + exp(1 + 0:01x)] 1 exp(1 + 0:01x) 0:01 exp(1 + 0:01x)[1 + exp(1 + 0:01x)] exp(1 + 0:01x) = 0:01 upon simpli…cation [1 + exp(1 + 0:01x)]2 2 (a) The average marginal e¤ect over all observations.5. ML. Chapter 5: Extremum. and similarly for (c) and (d). 0:00020268. :::. Then (b) The sample mean x = b @ E[yjx] @x 1 100 P100 i=1 i = 0:01 x exp(1 + 0:01 50:5) = 0:0014867: [1 + exp(1 + 0:01x 50:5)]2 (c) Evaluating at x = 90 b @ E[yjx] @x = 0:01 90 exp(1 + 0:01 90) = 0:0011318: [1 + exp(1 + 0:01x 90)]2 (d) Using the …nite di¤erence method b E[yjx] x = 90 exp(1 + 0:01 90) 1 + exp(1 + 0:01x 90) exp(1 + 0:01 90) = 0:0011276: 1 + exp(1 + 0:01x 90) Comment: This example is quite linear.
(c) Di¤erentiate wrt @Q0 ( ) @ = (not 0) 1 X 1 X ln f (yi ) = fln yi i i N N 1 X 2 X xi + lim exp(x0 0 ) exp( x0 )xi i i i i N N = 0 when = 0 : P 2 Q ( )=@ @ 0 = 2 lim N 1 0 0 0 [Also @ 0 i exp(xi 0 ) exp( xi )xi xi is negative de…nite at 0 .] Since plim QN ( ) attains a local maximum at = 0 . conclude that b = arg max QN ( ) is consistent for 0 .52 (a) Here ln f (y) = ln y = ln y = ln y so QN ( )= 2 ln 2(x 2x 0 0 y= with ln 2) + 2 ln 2 = exp(x0 )=2 and ln 0 = x0 ln 2 y=[exp(x )=2] 2y exp( x0 ) 2x0 + 2 ln 2 2yi exp( x0 )g: (b) Now using x nonstochastic so need only take expectations wrt y Q0 ( ) = plim QN ( ) 1 X 1 X 1 X 1 X = plim ln yi plim 2x0 + plim 2 ln 2 plim 2yi exp( x0 ) i i i i i i N N N N X X X 1 1 1 = lim E[ln yi ] 2 lim x0 +2 ln 2 2 lim E[yi ] exp( x0 ) i i i i i N N N X X X 1 1 1 = lim E[ln yi ] 2 lim x0 +2 ln 2 2 lim exp(x0 0 ) exp( x0 ). so local max. Since yi exp( x0 ) is not iid need to use Markov SLLN. 2 lim (d) Consider the last term. 53 (a) Di¤erentiating QN ( ) wrt 1 X @QN = 2xi + 2yi exp( x0 )xi i i @ N 1 X = 2 fyi exp( x0 ) 1gxi rearranging i i N 0 ) X 1 yi exp(xi exp(x0 ) i = 2 xi multiplying by i N exp(x0 ) exp(x0 ) i i . i This requires existence of second moments of yi which we have assumed. i i i i i i N N N where the last line uses E[yi ] = exp(x0 0 ) in the dgp and we do not need to evaluate i E[ln yi ] as the …rst sum does not invlove and will therefore have derivative of 0 wrt .
A( 0 ) 1 B( 0 )A( 0 ) 1 ] 1 1 X 1 X d ! N 0. This will need a (2 + )th absolute moment of yi . lim 2xi x0 lim 2xi x0 i i i i N N # " 1 1 X d ! N 0. I] 0 i i N exp(xi 0 ) N 1 X yi exp(x0 0 ) 1 X d i p 2 xi ! N 0. lim 2xi x0 i 0 i i exp(xi 0 ) N N 2 0 )) =2. i i (exp(x0 0 ))2 i p 1=2 1 P 1 P N E[X]) = N i V[Xi ] ( pN i X i ) (d) Here yi is not iid. (c) From (a) p N @QN @ 1 X =p 2 i N yi exp(x0 0 ) i xi : exp(x0 0 ) i 0 So Xi Apply CLT to average of the term in the sum. Use Liapounov CLT. Now yi jxi has mean exp(x0 0 ) and variance (exp(x0 i i 2 yi exp(x0 0 ) i exp(x0 0 ) i Thus for ZN ZN = ) xi has mean 0 p p = (V[ N X]) 1=2 ( N X 1=2 1 X 1 X yi exp(x0 0 ) d 0 i p 2xi xi 2 xi ! N [0. e. (e) Di¤erentiating (a) wrt @ 2 QN @ @ 0 (f ) Combining ! N [0.g. (exp(x0 0 ))2 =2 i and variance 4 xi x0 = 2xi x0 . lim 2xi x0 : i i N 0) 0 yields 2 exp(x0 i exp(x0 i 0) 0) = 0 1 X i N xi x0 i ! lim p 1 X 2xi x0 : i i N p N (b " d 1 X lim 2xi x0 i i N 1 # . 4th moment of yi .(b) Then " @QN lim E @ 0 # = lim 1 X 2 i N yi exp(x0 0 ) i xi = 0 if E[yi jxi ] = exp(x0 i exp(x0 0 ) i 0 ): So essential condition is correct speci…cation of E[yi jxi ].
32).(g) Test H0 : b a 0j j against Ha : X 2xi x0 i 1 0j < j at level :05. 5 5 Here b = so Rb r = [1 2] = 1: 2 2 4 1 b Also V[b] = N 1 C = using Cov[b1 .:05 2b2 5 3 = p = 0:5 4 2b2 ] 1 0 0 1 and r = 0 0 and = 1 2 . leading to t= se[b1 b1 1 C)R0 b 1 (Rb r) = 1 4 1 1. 1]. X i 2xi x0 i 1 : Reject H0 at level 0:05 if zj < (b) Rewrite as H0 : 1 2 2 = 0 versus H0 : 1 2 2 6= 0.] (c) Use (5. 5 2 = 4 1 1 1 5 2 1 0 0 1 1 = 4 1 1 1 5 2 4 1 1 1 5 2 = 124. _ Since W = 0:25 < = 3:84 do not reject H0 : 2b2 has variance V[b1 ] + [Alternatively as only one restriction here. Test H0 : R = r where R = [1 2] and r = 0 and 0 = [ 1 2 ]. where sj is j th diag entry in z:05 = 1:645. b2 ] = 4 + 4 1 4 1 = 4. and do not reject as j0:5j < z:05 = 1:96. Note that t2 =W. N (bj .:05 55 (a) t = b1 =se[b1 ] = 5=2 = 2:5. b2 ])2 V[b1 ]V[b2 ] = 0:52 1 1 22 12 = 1. b2 ] = (Cor[b1 . note that b1 4V[b1 ] 4Cov[b1 .32) Test H0 : R = r where R = Then Rb 1 0 0 1 1 0 b and RN 1 CR0 = 0 1 r= so W = (Rb r)0 R(N 2 2. 4 1 1 b Then RN 1 CR0 = [1 2] =4 1 1 2 so W = (Rb r)0 R(N 2 1. Since W = 124 < = 5:99 reject H0 : 1 C)R0 b (Rb r) = . Use (5. i ) zj = j) a sj N [0. Since j2:5j > z:05 = 1:645 we reject H0 .
Expect b 1 ' errors similar to those below. will need to use sandwich errors due to heteroskedasticity as V[yjx] = exp( 1 + 2 2 x) =2. . Note that standard errors given in (a) do not correct for heteroskedasticity. 1:0088 and b 2 = 1:0262 with standard errors 0:0224 and 0:0215. 1 and b 2 ' 1 and standard (c) For MLE got b 1 = (d) Sandwich errors can be used but are not necessary since the ML simpli…cation that A = B is appropriate here.57 Results will vary as uses generated data. 1:1162 and b 2 = 1:1098 with standard errors 0:0551 and 0:0256. (a) For NLS got b 1 = (b) Yes.
3. v 1 =[ + 2 )( 2 2 " " p + (d) Show that b IV is not de…ned if mzu = mz" = . y = x+u. 2 and 2 . mzu = N P i zi u i . " and v are iid normal with mean 0 and variances. 4. 2 . Consider the three equation model. 2 )]. Colin Cameron and Pravin K. z = " + v. (c) Show that b IV = mzu = ( mzu + mz" ) ! 0. u " v (a) Show that plim( b OLS (b) Show that 2 XZ )= 2 ]2 =[( 2 2 " u 2= u 2 2 u + 2 " . x = u + ". respectively. USE THE FOLLOWING REVISED QUESTION INSTEAD. for example. where the mutually independent errors u. Chapter 3: Microeconomic Data Structures No exercises. (e) Show that b IV = 1=( + mz" =mzu ) equals 1= if mzu is large relative to mz" = . Nelson and Startz (1990) argue that this region is visited often enough that the mean of b IV does not exist. 2. rather than the probability limit of zero from part (c). 1 . where. Chapter 4: Linear Models 47 THIS QUESTION HAD SEVERAL ERRORS (notable (d)(f )). (Adapted from Nelson and Startz. Chapter 2: Causal and Noncausal Models No exercises.Additional Exercises c 2005 A. 1990). Chapter 1: Introduction No exercises. Trivedi "Microeconometrics: Methods and Applications" 1. Nelson and Startz (1990) conclude that if mzu is large relative to mz" = then b IV is concentrated around 1= .
(f ) Compare the default standard errors for the weighted LS slope coe¢ cient estimate with heteroskedasticrobust standard errors for the weighted LS slope coe¢ cient estimate. (d) OLS packages have an option for weighting.6.4 where again only those with positive medical expenditures are included. (c) Compare the default OLS standard errors for the OLS slope coe¢ cient estimate with heteroskedasticrobust standard errors. total and the intercept. Given your answer in part (c). but in this question the regression is in levels and not logs.19). Will default OLS standard errors be biased / inconsistent in any particular direction? (c) Give the variance of the GLS estimator of . (e) Compare the default standard errors for the weighted LS slope coe¢ cient estimate with heteroskedasticrobust standard errors for the weighted LS slope coe¢ cient estimate. 411 Consider least squares estimation of the model y = X + u where = 2 (I + AA0 ) where A is an N m matrix with k < m < N and for simplicity we assume that 2 and A are known. Comment. (g) Obtain the LS estimates in part (d) manually by (unweighted) OLS regression by …rst appropriately transforming med. You should obtain slope coe¢ cient of 0:0938. using the result that (I + AA0 ) 1 (f ) Nelson and Startz (1990) argue that b IV concentrates on 1= more rapidly the smaller is .] (a) Perform OLS regression of med on total. (b) Do you think that the errors in regression of med on total are likely to be heteroskedastic? Explain.410 Consider weighted least squares estimation of household medical expenditures on household total expenditure using the data of section 4. and the larger is . Comment. [Use program mma04p2qreg and generate new variables med = exp(lnmed) and total = exp(lntotal). perform GLS regression of med on total under the assumption that the error has variance 2 (total)2 . (a) Obtain the variance of the OLS estimator using (4. (b) Compare your answer in (a) to the default variance estimate 2 (X0 X) 1 . Comment. the smaller is 2 . what " do you conclude about the small sample distribution of b IV when 2 is small? XZ = . By appropriate use of weights option in your package.
. 0:2. 25] variate and u = x". 414 If y is exponential distributed with mean exp(x0 ) then the variance is [exp(x0 )]2 . 2). (e) GLS requires matrix inversion. Which value of minimizes (4. . (0. 4]].. . (b) Generate a sample of size 10. . [This setup ensures that x0 > 1 as assumed on page 86]. 4]. 0:2. (a) Using the general result at bottom page 86 give the true slope and intercept coe¢ cient for the pth quantile of y conditional on x for quantiles q = 0:1.5.. 0:4. 0:35. This is the same dgp as in Section 4. . 0:5.3 page 84 except that u is homoskedastic. (d) Redo part (b) for dgp with the modi…cations that x is the square of a N [0.3 program given on the book website. This can be written as the regression model y = exp(x0 )+u where u = exp(x0 )" and " iid[0. (0. Comment on any changes compared to (b). 0). .34)? Compare your answer to that in part (a). (b) From part (a) you should …nd that the intercept is 0. x) take values ( 1. 0:55 and 0:6. [Hint: F" 1 (0:1) is that value " such that Pr[" " ] = 0:1 for " N [0. 1). Compare your answers to the theoretical answers in part (a).IN A(Im + A0 A) 1 A0 . But what if we instead compare the (incorrect) default variance 2 (X0 X) 1 of OLS with the true variance of GLS? Comment. 0:9. We will …nd the least absolute deviations regression estimate of the slope coe¢ cient by grid search. Compare the two estimates of the slope coe¢ cient and the precision of the estimates. 413 Consider dgp y = 1 + 1 x and u = " where x N [0.5. (d) In general GLS is more e¢ cient than OLS. 25] and " N [0. For this problem what is the largest size matrix that needs to be inverted to perform GLS? 412 Consider regression of y on x when data (y. 0:9. 0:45. 1) and (1. For the given data compute (4. 000 from this dgp. (a) Using an appropriate statistical package obtain the OLS estimate of the slope coe¢ cient and the least absolute deviations regression estimate of the slope coe¢ cient. 1].. Comment on any changes compared to (b).. p (c) Redo part (b) for dgp with the modi…cation that u = x" rather than ".34) with q = 0:5 and x0 = xi (one regressor and no intercept) for i = 0:3. This requires minor modi…cation of the Section 4. Estimate quantile regressions of y on x for q = 0:1.. 2). (0.
Does result (4. limit distribution of N ( b ! N [0. ]. smsa66.48)) and compare it to the IV estimate from part (a). daded. Note that wage76 is log hourly wage. 1 X0 X. nomomed. momdad14. south76. grade76 is years of schooling and col4 is proximity to college. Obtain the ratio of the two slope coe¢ cient estimates as in (4. plim N Z]. ) = exp(x0 ) + exp(x0 ) F" 1 (q). Compute the Wald estimate (see (4.46) and compare to your answer in (a). k full rank matrices of constants (i. Compare the size and precision of the OLS and IV slope coe¢ cients. with estimation by OLS and by IV where col4 is the instrument for grade76. agesq76. (b) Hence state what happens for this model to the intercept and slope coe¢ cients of the population q th quantile of y conditional on x as q varies. (a) Regress wage76 on an intercept and grade76. Obtain the . the probability limits of N 1=2 Z0 u d N 1 Z0 Z 1 Z0 and N 1 Z0 X all (c) Hence obtain the asymptotic distribution of b and compare to your answer in (a).6 and use the program given at the book website. nodaded.46) still old? 416 Consider the linear regression model y= X + u and the IV estimator b = (Z0 X) 1 Z0 y where Z and X are N (a) Suppose u (b) Suppose u N [0. smsa76. (e) Redo parts (a) and (b). 415 Consider the same data set as that in example 4. and famed1famed8. sinmom14. Obtain the …nite sample distribution of b .(a) Show by argument similar to that in Section 4. black.9. momed. (c) The instrument here is a binary variable and there is one regressor. [0. exist and are …nite nonsingular and N p ). with all regressions now including as additional (exogenous) regressors age76. are nonstochastic).1 that the population q th quantile of y conditional on x is q (x.6. reg2reg9.e. ]. (d) Do you think there might be a weak instrument problem here? Provide appropriate statistical measures. where heteroskedasticityrobust standard errors are used. (b) Perform OLS regression of wage76 on an intercept and col4 and perform OLS regression of grade76 on an intercept and col4.
x. (b) Compare the average across simulations of b IV with the probability limit given in Exercise 47 part (c). " and v are iid normal with mean 0 and variances 1. Comment. Comment. Perform 1000 simulations with N = 100 where in each simulation data (y. Comment in the light of Exercise 47 part (e). and = 1. (3) R intercept) of z on x: (a) Compare the average across simulations of b OLS with the probability limit given in Exercise 47 part (b). z = " + v. (d) Do the R2 and F statistics across simulations from OLS regression (with intercept) of z on x indicate a likely weak instruments problem? .417 Consider the same three equation model as in Exercise 47. z) is generated and we obtain (1) b b OLS from OLS regression of y on x without intercept. = 1 and = 0:01. (c) Obtain percentiles and quartiles of the observed values across simulations of b IV . (2) IV from IV regression of 2 and F from OLS regression (with y on x without intercept with instrument z. with y = x + u. x = u + ". where the mutually independent errors u.
Exercises: Di¢ culty and Topics Covered c 2005 A. 1. Trivedi "Microeconometrics: Methods and Applications" Di¢ culty: 1 is easiest. Chapter 2: Causal and Noncausal Models No exercises. 2. 3. Chapter 1: Introduction No exercises. 3 is hardest. 2 is harder. 1 . Chapter 3: Microeconomic Data Structures No exercises. Colin Cameron and Pravin K.
Chapter 5: ML and NLS Estimation Ques Sol:given Dif f 5 1 Y es 1 5 2 Y es 2 5 3 Y es 2 5 4 3 5 5 Y es 2 5 6 2 5 7 Y es 2 5 8 1 Section 5:2 5:3 5:3 4:3 5:5 5:7 5:7 8 5:2 T opic Marginal e¤ects mestimator example asymptotic theory mestimator example asymptotic theory mestimator example asymptotic theory Wald test of linear restrictions NLS estimation theory NLS and ML with generated data Marginal e¤ects .4. Chapter 4: Linear Models Ques Sol:given Dif f 4 1 Y es 2 4 2 1 4 3 Y es 3 4 4 3 4 5 Y es 1 4 6 1 4 7 Y es 3 4 8 2 4 9 Y es 3 4 10 2 4 11 Y es 2 4 12 1 4 13 Y es 2 4 14 2 4 15 Y es 2 4 16 2 4 17 Y es 2 Section 4:4 4:4 4:4 4:4 4:5 4:8 4:8 4:6 4:9 4:5 4:5 4:6 4:6 4:6 4:8 4:8 4:9 T opic OLS with 6= 2 I OLS with heteroskedasticity OLS with heteroskedasticity OLS limit distribution LS estimators minimize u0 Wu IV estimation theory IV estimation weak instruments theory Quantile regression data application IV with weak instruments data application Weighted LS application GLS theory Quantile regression with arti…cial data Quantile regression with generated data Quantile regression theory IV data application IV theory IV with weak instruments simulation 5.
This action might not be possible to undo. Are you sure you want to continue?
We've moved you to where you read on your other device.
Get the full title to continue listening from where you left off, or restart the preview.