Goethe University Frankfurt Fall 2021

Advanced Econometrics I
Problem Set 1 (OLS Mechanics)

Let $y$ be a random variable with $E[y^2] < \infty$. Let $x$ be a $k$-dimensional random vector with
$E[x'x] < \infty$ and $E[xx']$ invertible. Let $\{y_i, x_i\}_{i=1}^n$ be an IID sample drawn from $F_{y,x}$
with $\sum_{i=1}^n x_i x_i'$ invertible with probability 1. Moreover, you may assume other random variables
that appear below also satisfy appropriate regularity conditions, generate a sample by the same IID
sampling scheme, and produce well-defined estimators based on that sample and $\{y_i, x_i\}_{i=1}^n$.
Recall the CEF model of $y$ given $x$ and BLP model of $y$ given $x$ are defined as

$$y = E[y \mid x] + e, \quad E[e \mid x] = 0 \qquad \text{and} \qquad y = x'\beta + e, \quad E[xe] = 0_k,$$

respectively, where $\beta = (E[xx'])^{-1} E[xy]$.

Question 1: Regression on Constant

(a) (Best Guess) Recall that the CEF m(x) = E [y | x] is the best predictor of y given x. That is, the
CEF solves the squared loss prediction problem in the population
$$\min_{g(x)} E\left[(y - g(x))^2\right]$$

over all possible functions of x.

Now, suppose that we are interested in the econometric modeling of the individual earnings y. However,
except for the earnings, we have no other information about the individuals (such as gender, education
level). Put differently, we have to derive the best unconditional predictor of y.

(i) Write down the suitable population MSE minimization problem. Derive the best unconditional
guess of y in the population.

(ii) Replace the population MSE minimization problem in (i) by a sample counterpart. Derive the
best unconditional guess of y in the sample.

(b) (Application BLP) Recall that the linear projection coefficient is given by
$$\beta = \left(E[xx']\right)^{-1} E[xy].$$

Now, suppose that we are in the context of Q1(a), where we have only the information on the individual
earnings y. As a result, we decide to project y on a constant and set x = 1.

(i) Explicitly define the projection coefficient $\beta$ and the projection error $e$. Propose an estimator
$\hat{\beta}$ by the analogy principle and explicitly define the estimation error $\hat{e}$ (i.e., the residuals).

(ii) Compare the CEF model in Q1(a) with the BLP model in Q1(b). Are the CEF and the BLP
equivalent? Are the CEF error and the projection error equivalent?

(iii) Suppose that we project y on x = 2. Re-answer Q1(b)(i)-(ii).

(c) (Application FWL) Recall that the population foundation of the Frisch-Waugh-Lovell theorem is
given by
$$\beta_k = \frac{E[\tilde{x}_k \tilde{y}]}{E[\tilde{x}_k^2]},$$
where $\tilde{x}_k$ is the projection error from the linear projection of the $k$th variable $x_k$ on all the
other variables $x_{-k}$, and $\tilde{y}$ is the projection error from the linear projection of $y$ on $x_{-k}$.

Now, we would like to apply this population FWL Theorem to a simple setting. Consider the
linear projection of y on x = (1, x1 ):

y = β0 + β1 x1 + e1 , E [xe1 ] = 0.

Use the population FWL Theorem to obtain $\beta_1$ and show that it is equivalent to
$\operatorname{cov}[x_1, y] / \operatorname{var}[x_1]$. In doing so, explicitly write down the three-step
projection procedure suggested by the population FWL Theorem.
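
As a numerical sanity check of the claim in Q1(c) (not a substitute for the derivation), the three FWL steps can be mimicked in a sample, with sample moments standing in for population moments. The sketch below is in Python rather than R and uses simulated data; the data-generating process, the seed, and all variable names are illustrative assumptions.

```python
import random

random.seed(0)  # assumed seed, only for reproducibility

# Simulated data (an assumption for illustration): y is linear in x1 plus noise.
n = 1000
x1 = [random.gauss(2.0, 1.5) for _ in range(n)]
y = [1.0 + 0.5 * x + random.gauss(0.0, 1.0) for x in x1]

x1_bar = sum(x1) / n
y_bar = sum(y) / n

# Step 1: project x1 on the other regressor (the constant) and keep the residual.
x1_tilde = [x - x1_bar for x in x1]
# Step 2: project y on the constant and keep the residual.
y_tilde = [v - y_bar for v in y]
# Step 3: project the residualised y on the residualised x1.
beta1_fwl = (sum(a * b for a, b in zip(x1_tilde, y_tilde))
             / sum(a * a for a in x1_tilde))

# Sample covariance over sample variance (same 1/n normalisation in both).
cov_xy = sum(a * b for a, b in zip(x1, y)) / n - x1_bar * y_bar
var_x = sum(a * a for a in x1) / n - x1_bar ** 2
beta1_ratio = cov_xy / var_x

print(beta1_fwl, beta1_ratio)
```

The two printed numbers should agree up to floating-point error, which is the sample analogue of the equality to be shown in the population.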

(d) (Demeaned Regression without Intercept) Define $y^d = y - E[y]$ and $x_1^d = x_1 - E[x_1]$, that is,
$y^d$ and $x_1^d$ are the demeaned versions of $y$ and $x_1$. Consider the linear projection of $y^d$ on $x_1^d$:

$$y^d = \beta_1^d x_1^d + e_2, \qquad E[x_1^d e_2] = 0.$$

Is $\beta_1^d$ equivalent to $\beta_1$ in Q1(c)?

Question 2: Saturated Models

Saturated regression models are regression models with dummy variables as explanatory variables
for mutually exclusive and exhaustive categories. We will show that such models have as many
parameters as categories and are necessarily linear in parameters.

(a) (Linear CEF I) Let the continuous variable $y$ denote earnings. Let the binary variable $x_1$
denote gender (with $x_1 = 1$ being male and $x_1 = 0$ female). Let the vector $\mathbf{x}_1 \equiv (1, x_1)'$
stack a constant and $x_1$.

(i) Let α0 ≡ E [y | x1 = 0] and α1 ≡ E [y | x1 = 1]. Write down the CEF model of y given x1 .

Hint: For the questions below, it will be helpful to formulate the CEF in terms of $\alpha \equiv (\alpha_0, \alpha_1 - \alpha_0)'$.

(ii) Write down the BLP model of y given x1 and explicitly define the projection coefficient β.

(iii) Compare the CEF model in Q2(a)(i) with the BLP model in Q2(a)(ii). Are the CEF and the
BLP equivalent? Are the CEF error and the projection error equivalent?

(iv) We are interested in estimating the gender effect, defined as the average earnings of men relative
to women. How could this effect be estimated by OLS using the sample $\{(y_i, x_{1i})\}_{i=1}^n$? Explain.

(b) (Linear CEF II) We already illustrated in Q2(a) that the CEF of y given x1 is necessarily linear
in parameters if x1 is a dummy variable and thus corresponds to the BLP of y given x1 . Now, define
x0 = 1 − x1 as the negation of x1 (with x0 = 1 being female, and x0 = 0 male). The CEF of y given
x0 is again necessarily linear and the corresponding CEF (and BLP) model is given by:

y = γ0 + γ1 x0 + e, E [e | x0 ] = 0.

Note that we do not distinguish between the CEF parameters and the BLP coefficient here, since the
CEF corresponds to the BLP when the former is linear in parameters.

(i) Explicitly define $\gamma_0$ and $\gamma_1$. How could the gender effect be estimated by OLS using the
sample $\{(y_i, x_{0i})\}_{i=1}^n$? Explain.

(ii) Now, we consider the CEF of $y$ given $x_0$ and $x_1$ that is linear in parameters. Write down this
CEF explicitly. How could the gender effect be estimated by OLS using the sample
$\{(y_i, x_{0i}, x_{1i})\}_{i=1}^n$? Explain.

(c) (Interaction) Let the binary variable x2 denote the education level (with x2 = 1 being a PhD
holder, and x2 = 0 otherwise).

Write down the CEF of y given x1 and x2 explicitly as a function that is linear in parameters.

Hint: It will be helpful to define the interaction term x3 ≡ x1 x2 .

Question 3: Regression Variance

(a) (Variance Decomposition) Recall that for the OLS regression with a constant, the analysis-of-
variance formula is
$$\sum_{i=1}^n (y_i - \bar{y})^2 = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n \hat{e}_i^2.$$

That is, the sample variance of y can be decomposed into the sample variances of explained components
and unexplained components.
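
The sample identity above can be checked numerically before turning to its population counterpart. The following is a minimal Python sketch on simulated data; the data-generating process, the seed, and the variable names (`total_ss`, `explained_ss`, `residual_ss`) are assumptions made for illustration only.

```python
import random

random.seed(1)  # assumed seed, only for reproducibility

# Simulated sample (an assumption for illustration).
n = 500
x = [random.uniform(0.0, 10.0) for _ in range(n)]
y = [2.0 - 0.3 * xi + random.gauss(0.0, 1.0) for xi in x]

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS of y on a constant and x (closed form for simple regression).
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

y_hat = [intercept + slope * xi for xi in x]       # fitted values
e_hat = [yi - fi for yi, fi in zip(y, y_hat)]      # residuals

total_ss = sum((yi - y_bar) ** 2 for yi in y)
explained_ss = sum((fi - y_bar) ** 2 for fi in y_hat)
residual_ss = sum(ei ** 2 for ei in e_hat)

print(total_ss, explained_ss + residual_ss)
```

Both printed values should coincide up to rounding. The check relies on the regression including a constant, as stated in the text.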

Now, we would like to see whether a counterpart exists in the underlying population model.

(i) Show that a similar formula exists for the CEF model. That is, show Var [y] = Var [m(x)] + Var [e],
where e is the CEF error.

(ii) Show that a similar formula exists for the BLP model with a constant. That is, show Var [y] =
Var [x′ β] + Var [e], where e is the projection error.

(b) (Error Variance) Let $x_{\text{new}}$ be an additional random variable. Define the augmented vector
$\mathbf{x}_{\text{new}} \equiv (x', x_{\text{new}})'$. Let $e_{\text{new}}$ denote the error from either a CEF model or a BLP
model defined on $\mathbf{x}_{\text{new}}$. Compared to the models defined on $x$, these alternative models on
$\mathbf{x}_{\text{new}}$ include the additional information in $x_{\text{new}}$.

Now, we would like to see how this additional piece of information contributes to the regression
error variance.

(i) Consider the CEF models of $y$ given $x$ and of $y$ given $\mathbf{x}_{\text{new}}$. Does the variance of the CEF
error increase or decrease with more information? That is, is $\mathrm{Var}[e_{\text{new}}]$ greater or smaller than
$\mathrm{Var}[e]$?

Hint: Start with $0 \le E[\mathrm{Var}[m(x, x_{\text{new}}) \mid x]]$ and show what the latter term is equal to. The facts
that $E[E[m(\mathbf{x}_{\text{new}})^2 \mid x]] = E[m(\mathbf{x}_{\text{new}})^2]$ and $E[m(\mathbf{x}_{\text{new}}) \mid x]^2 = m(x)^2$ (why?)
will be useful.

(ii) Consider the BLP models of $y$ given $x$ and of $y$ given $\mathbf{x}_{\text{new}}$, each with a constant. Does the
variance of the projection error increase or decrease with more information?

Hint: It will be helpful to consider the population MSE minimization problem that defines the BLP.

Optional: Programming Exercises

(a) (Application FWL) Use R and cps09mar data to verify in the sample that the FWL Theorem can
be used to obtain the single OLS coefficient estimate in the multivariate linear regression
setting. In doing so, consider the following linear projection:
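$$\text{earnings}_i = \beta_0 + \beta_1\,\text{education}_i + \beta_2\,\text{female}_i + \beta_3\,\text{age}_i + e_i, \qquad E\left[\begin{pmatrix} 1 \\ \text{education}_i \\ \text{female}_i \\ \text{age}_i \end{pmatrix} e_i\right] = 0_4.$$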

(i) Run the above multivariate linear regression in R to obtain the OLS estimate for β1 .

(ii) Run the three-step regression procedure suggested by the sample FWL Theorem to obtain the
OLS estimate for β1 in R. Does it make any difference if we include an intercept in the last step?

Hint: lm(y ~ x1 + x2) fits a linear regression model of y on x1 and x2 including the intercept.
lm(y ~ x1 + x2 - 1) fits a linear regression model of y on x1 and x2 omitting the intercept.
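
For readers who want to see the mechanics of the three-step procedure before running it in R, here is a minimal Python sketch on simulated data. It does not use cps09mar: the variables `earnings`, `education`, `female`, and `age` below are simulated stand-ins, and the helpers `ols` and `residuals` are hypothetical names written for this illustration (OLS via the normal equations with Gauss-Jordan elimination).

```python
import random

def ols(columns, y):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k, n = len(columns), len(y)
    # Augmented system [X'X | X'y].
    a = []
    for r in range(k):
        row = [sum(columns[r][i] * columns[c][i] for i in range(n)) for c in range(k)]
        row.append(sum(columns[r][i] * y[i] for i in range(n)))
        a.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[piv] = a[piv], a[p]
        for r in range(k):
            if r != p:
                f = a[r][p] / a[p][p]
                a[r] = [v - f * w for v, w in zip(a[r], a[p])]
    return [a[r][k] / a[r][r] for r in range(k)]

def residuals(columns, y):
    """Residuals from the OLS regression of y on the given columns."""
    b = ols(columns, y)
    return [y[i] - sum(bj * col[i] for bj, col in zip(b, columns))
            for i in range(len(y))]

random.seed(2)  # assumed seed, only for reproducibility
n = 400
const = [1.0] * n
education = [float(random.randint(8, 20)) for _ in range(n)]
female = [float(random.random() < 0.5) for _ in range(n)]
age = [float(random.randint(20, 65)) for _ in range(n)]
earnings = [5.0 + 2.0 * ed - 3.0 * f + 0.1 * ag + random.gauss(0.0, 4.0)
            for ed, f, ag in zip(education, female, age)]

# Full regression: the coefficient on education is the second entry.
beta_full = ols([const, education, female, age], earnings)

others = [const, female, age]
educ_tilde = residuals(others, education)   # step 1: residualise education
earn_tilde = residuals(others, earnings)    # step 2: residualise earnings
beta1_fwl = ols([educ_tilde], earn_tilde)[0]  # step 3: residual-on-residual regression

print(beta_full[1], beta1_fwl)
```

The last step here is run without an intercept; whether adding one changes the estimate is left to the exercise.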

(b) (Saturated Models) Use R and cps09mar data to verify in the sample the relationship between
the OLS estimates for the CEF/BLP coefficients in Q2(a), Q2(b)(i) and Q2(b)(ii). In
doing so, consider the following linear projections:
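$$\text{earnings}_i = \beta_0 + \beta_1\,\text{male}_i + e_{1i}, \qquad E\left[\begin{pmatrix} 1 \\ \text{male}_i \end{pmatrix} e_{1i}\right] = 0_2$$

$$\text{earnings}_i = \gamma_0 + \gamma_1\,\text{female}_i + e_{2i}, \qquad E\left[\begin{pmatrix} 1 \\ \text{female}_i \end{pmatrix} e_{2i}\right] = 0_2$$

$$\text{earnings}_i = \theta_0\,\text{female}_i + \theta_1\,\text{male}_i + e_{3i}, \qquad E\left[\begin{pmatrix} \text{female}_i \\ \text{male}_i \end{pmatrix} e_{3i}\right] = 0_2$$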

Run the above linear regressions in R and verify that the OLS estimates satisfy
$\hat{\beta}_1 = -\hat{\gamma}_1 = \hat{\theta}_1 - \hat{\theta}_0$. What do these estimates measure?
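
The stated relationship between the three sets of estimates can also be checked on simulated data before turning to cps09mar. The Python sketch below is only an illustration: the simulated `male`, `female`, and `earnings` series and the helper `ols2` (a hypothetical name for a two-regressor OLS solved from the 2x2 normal equations) are assumptions, not part of the exercise.

```python
import random

def ols2(x_a, x_b, y):
    """OLS of y on exactly two regressors x_a and x_b (no intercept added)."""
    saa = sum(a * a for a in x_a)
    sab = sum(a * b for a, b in zip(x_a, x_b))
    sbb = sum(b * b for b in x_b)
    say = sum(a * v for a, v in zip(x_a, y))
    sby = sum(b * v for b, v in zip(x_b, y))
    det = saa * sbb - sab * sab
    return ((sbb * say - sab * sby) / det, (saa * sby - sab * say) / det)

random.seed(3)  # assumed seed, only for reproducibility
n = 300
const = [1.0] * n
male = [float(random.random() < 0.5) for _ in range(n)]
female = [1.0 - m for m in male]
# Simulated earnings (an assumption for illustration).
earnings = [40.0 + 6.0 * m + random.gauss(0.0, 10.0) for m in male]

beta0, beta1 = ols2(const, male, earnings)      # constant and male dummy
gamma0, gamma1 = ols2(const, female, earnings)  # constant and female dummy
theta0, theta1 = ols2(female, male, earnings)   # both dummies, no constant

print(beta1, -gamma1, theta1 - theta0)
```

The three printed numbers should agree up to floating-point error.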
