
EXTENSIONS OF THE LINEAR

REGRESSION MODEL

Christophe Croux
Katholieke Universiteit Leuven
christophe.croux@econ.kuleuven.ac.be

Session 1: The General Linear Model

I The General Linear Model


In the classical linear regression model
Yi = Xi'β + εi (i = 1, . . . , n)
we have that the error terms are homoscedastic
and uncorrelated. With

Y = (Y1, . . . , Yn)',   X = the matrix with rows X1', . . . , Xn',   ε = (ε1, . . . , εn)',

we can write the classical linear model in matrix


notation as
Y = Xβ + ε
where ε has mean zero and covariance matrix

Σ := Cov(ε) = σ²In = diag(σ², . . . , σ²).

The OLS-estimator β̂OLS = (X'X)⁻¹(X'Y) is
optimal in the classical model.
In the general linear model the matrix Σ may
have any form. So we allow for heteroscedasticity
and correlation among error terms.

Properties of β̂OLS in the general linear regression model
• Unbiasedness: E[β̂OLS] = β
• Cov(β̂OLS) = (X'X)⁻¹(X'ΣX)(X'X)⁻¹,
which differs from the formula in the classical
model −→ the formula for the standard errors differs.
• β̂OLS remains consistent, i.e. tends to β as the
sample size increases, if (X'ΣX)/n² tends to
zero. This is the case if Σ contains “enough
small elements.”
Although β̂OLS will, in most cases, remain a con-
sistent estimator in the general linear model, it
turns out to be possible to do better.
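To make the formulas concrete, here is a minimal numpy sketch (not part of the original slides) of the OLS estimator together with the sandwich covariance (X'X)⁻¹(X'ΣX)(X'X)⁻¹; the data X, Y and the matrix Σ below are purely illustrative.

import numpy as np

def ols_with_sandwich(X, Y, Sigma):
    # OLS estimate and its covariance (X'X)^-1 (X' Sigma X) (X'X)^-1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_ols = XtX_inv @ (X.T @ Y)
    cov_ols = XtX_inv @ (X.T @ Sigma @ X) @ XtX_inv
    return beta_ols, cov_ols

# illustrative heteroscedastic data (assumed, not from the slides)
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma2 = np.exp(X[:, 1])                      # error variances differ per observation
Sigma = np.diag(sigma2)
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * np.sqrt(sigma2)

b, V = ols_with_sandwich(X, Y, Sigma)
print(b, np.sqrt(np.diag(V)))                 # estimates and their standard errors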

II The Generalized Least Squares (GLS) estimator

Let A = Σ^(1/2) be a matrix such that AA' = Σ.


Define now Ỹ = A⁻¹Y, X̃ = A⁻¹X, and ε̃ =
A⁻¹ε. The general linear model can be written
as a classical linear model for the transformed
variables:
Ỹ = X̃β + ε̃,
with Cov(ε̃) = In. So the best estimator is given
by (X̃'X̃)⁻¹(X̃'Ỹ), or

β̂GLS = (X'Σ⁻¹X)⁻¹X'Σ⁻¹Y.

Properties of β̂GLS
• Unbiasedness: E[β̂GLS] = β
• Cov(β̂GLS) = (X'Σ⁻¹X)⁻¹
• β̂GLS is consistent and efficient at the normal
model
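As a sketch (assuming Σ is known and invertible, with X and Y as in the previous illustration), the GLS estimator can be computed either directly from the formula or through the transformed model Ỹ = A⁻¹Y, X̃ = A⁻¹X:

import numpy as np

def gls(X, Y, Sigma):
    # beta_GLS = (X' Sigma^-1 X)^-1 X' Sigma^-1 Y and Cov(beta_GLS) = (X' Sigma^-1 X)^-1
    Sigma_inv = np.linalg.inv(Sigma)
    cov_gls = np.linalg.inv(X.T @ Sigma_inv @ X)
    beta_gls = cov_gls @ (X.T @ Sigma_inv @ Y)
    return beta_gls, cov_gls

def gls_via_transform(X, Y, Sigma):
    # equivalent route: take A with A A' = Sigma (Cholesky), then OLS on the transformed data
    A = np.linalg.cholesky(Sigma)
    X_t = np.linalg.solve(A, X)
    Y_t = np.linalg.solve(A, Y)
    return np.linalg.lstsq(X_t, Y_t, rcond=None)[0]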
III The Feasible GLS estimator

The idea is to replace Σ by Σ̂ in the formula for


β̂GLS. Unfortunately, Σ is in general not known,
and contains too many unknown elements. So we
need to make hypotheses regarding the structure
of Σ.

To fix ideas, suppose that Σ is of the form



Σ = diag(σ1², . . . , σN²),

with log(σi²) = Zi'γ, and Zi an exogenous variable. This implies that Σ now depends only on
the parameter γ.

We can work with a 2-step procedure:
• Step 1: Compute β̂OLS and the residuals ri =
Yi − Xi'β̂OLS. Regress log(ri²) on Zi to obtain
γ̂.
• Step 2: Compute σ̂i² = exp(Zi'γ̂) → Σ̂ →
β̂GLS.
In principle, one could redo the 2 steps, now starting with the obtained feasible GLS-estimator. As
such, we obtain an iterative procedure.
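A compact numpy sketch of this two-step procedure, under the assumed specification log(σi²) = Zi'γ; the arrays X, Y, Z are hypothetical inputs (Z is assumed to contain a constant column):

import numpy as np

def feasible_gls(X, Y, Z):
    # Step 1: OLS residuals, then regress log(r_i^2) on Z_i to obtain gamma_hat
    beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
    r = Y - X @ beta_ols
    gamma_hat = np.linalg.lstsq(Z, np.log(r**2), rcond=None)[0]
    # Step 2: sigma_i^2 = exp(Z_i' gamma_hat) gives a diagonal Sigma_hat -> weighted LS
    sigma2_hat = np.exp(Z @ gamma_hat)
    w = 1.0 / sigma2_hat
    XtWX = X.T @ (w[:, None] * X)
    beta_fgls = np.linalg.solve(XtWX, X.T @ (w * Y))
    return beta_fgls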

Depending on the structure imposed on Σ, other


feasible GLS-procedures will be obtained.
In the example outlined above, we treated a heteroscedastic regression
model by specifying the form of the volatility σi². Many people find this
approach too restrictive and prefer in this case OLS with White-corrected
standard errors.
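For comparison, a sketch of OLS with White (HC0) heteroscedasticity-consistent standard errors, which avoids specifying the form of σi²; again the data X, Y are assumed given:

import numpy as np

def ols_white_se(X, Y):
    # White (HC0) covariance: (X'X)^-1 X' diag(r_i^2) X (X'X)^-1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ Y)
    r = Y - X @ beta
    meat = (X * (r**2)[:, None]).T @ X
    cov_white = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov_white))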

IV Exercise

We are interested in knowing whether persons having a loyalty card


spend more in the supermarket or not. Take the data in the Eviews
file “glsex.wf1”. We have data for 2 supermarkets, and for each of them
we have for 20 clients the amount spent (AMOUNT), the size of the
household the person belongs to (HHS), and a binary variable (CARD)
indicating whether the person owns a loyalty card or not. Denote by yij
the amount spent by customer i in supermarket j, and by xij the personal
characteristics of the client (in this case, only consisting of HHS). The
proposed model is:

yij = β'xij + αj + δ CARD + εij ,

where αj , for j = 1, 2 are fixed effects for each supermarket. Of main


interest is to know whether δ is significant or not.

1. Estimate the parameters of the above model by OLS. Interpret


briefly the parameter estimates. (Hint: since the β and δ parameters
are supposed to be the same for the 2 supermarkets, it will
be necessary to pool the data. You will also need to create yourself
the appropriate dummy variables STORE1 and STORE2 to take
the fixed effects into account.)

2. We are afraid that there is groupwise heteroscedasticity in the error


terms, i.e. Var(εij ) = σj² for j = 1, 2.

(a) Estimate the variances σ1² and σ2² using the OLS-residuals.
(b) Estimate now the parameters by GLS. Write down an expres-
sion for Σ, the covariance matrix of the error terms, and show
that GLS boils down to Weighted Least Squares (WLS) here.
Create the series of weights to be used, and carry out the WLS
estimation (in Eviews, take estimation method LS with option
Weighted LS).
(c) What is the advantage of WLS over OLS?

3. We are also afraid that there might be interaction between the


variable CARD and the supermarket. In particular, the effect of
the loyalty card might differ between supermarkets. The
model now becomes

yij = β'xij + αj + δj CARD + εij .

(a) Estimate the above model by OLS. Do you think there might
be interaction? (Hint: creating the variables STORE1*CARD
and STORE2*CARD might be useful)
(b) Test whether the interaction is significant or not.

4. Economists would say that there is a serious endogeneity prob-


lem here. There probably exists a feedback relation from
AMOUNT to CARD. Could you explain why this might be the
case? Explain in words why it might indeed be that the error
terms are correlated with the variable CARD.

Session 2: The Panel Data Model

I Introduction

For a collection of N units, called the cross-


sections, we observe variables (Yit, Xit) for T con-
secutive periods. The (Yit, Xit) form a panel. N
is the cross-sectional and T the temporal dimen-
sion of the panel.

Example: (i) Yit is the income of family i dur-


ing year t, for 1 ≤ i ≤ 1000, and observed for
t = 2000, 2001, 2002. (ii) Yit is the unemploy-
ment rate for EU-country i, (1 ≤ i ≤ 15), ob-
served monthly from 1998:01 up to 2001:12, so
T = 48.

For every cross section number i, consider the


regression equation
Yit = αi + Xit'β + εit (t = 1, . . . , T ).
Note that for every index i fixed, Yit is a time se-
ries of length T . A panel is therefore a collection
of N time series. These time series may be short.

Typically the slope parameter β is the same for all


cross-sections, but the intercepts are not. We call the αi
the cross-sectional effects. They differ between
cross-sections and are unobserved, so they take
“unobserved heterogeneity” into account. One
could call Xit the observed part of the hetero-
geneity.

In case H0 : α1 = . . . = αN := α, there is no un-


observed heterogeneity, and the model becomes
Yit = α + Xit'β + εit (t = 1, . . . , T ; 1 ≤ i ≤ N ),
and can simply be estimated by pooling all the
data and applying OLS to the pooled data. We
get the pooled regression estimator for the common intercept model. If this model is misspecified,
the estimate of β can also become severely biased.

Putting all the Yit on top of each other is called


“stacking”. As such, we obtain the stacked vectors

Y = (Y11, . . . , Y1T , . . . , YN1, . . . , YNT )',
X = the NT -row matrix with rows X11', . . . , X1T', . . . , XN1', . . . , XNT',
ε = (ε11, . . . , ε1T , . . . , εN1, . . . , εNT )'.

Regressing Y on X yields the pooled estimates.

One distinguishes between the fixed effects model,


where αi is an unknown parameter, and the ran-
dom effects model, where αi is a realization of a
(latent) random variable.

II The Fixed Effects (FE) model

Recall the model Yit = αi + Xit'β + εit. With


the dummy variables
     
d1 = (1, . . . , 1, 0, . . . , 0)',  d2 = (0, . . . , 0, 1, . . . , 1, 0, . . . , 0)',  . . . ,  dN = (0, . . . , 0, 1, . . . , 1)',

where dj equals 1 for the T observations of cross-section j and 0 elsewhere,
and the panel data in stacked form Y and X,
we have the following linear model in matrix no-
tation:
Y = α1d1 + . . . + αN dN + Xβ + ε
Estimation of β and the fixed effects follows now
by least squares estimation of Y on X and the
dummies d1, . . . , dN . This yields the Fixed Ef-
fects estimators.
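A minimal sketch of this least squares dummy variable estimation; the panel is assumed to arrive as stacked arrays Y and X of length NT, with an array ids giving the cross-section of each row (all names are illustrative):

import numpy as np

def fixed_effects_lsdv(Y, X, ids):
    # regress Y on X and one dummy per cross-section
    # (no overall constant: the dummies already span it)
    units = np.unique(ids)
    D = (ids[:, None] == units[None, :]).astype(float)   # NT x N dummy matrix
    Z = np.hstack([D, X])
    coefs = np.linalg.lstsq(Z, Y, rcond=None)[0]
    alpha_hat, beta_hat = coefs[:len(units)], coefs[len(units):]
    return alpha_hat, beta_hat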

Remark: note that if a variable Xit is constant in


time for all cross-sections, then this variable is a
linear combination of the dummies d1, . . . , dN and
we have a multicollinearity problem. The FE model cannot be estimated.

III Estimation of the FE model by OLS or GLS

Let Σ = Cov(ε) be the NT × NT covariance matrix of
the error terms. It has
• Var(εit) on the diagonal
• Cov(εit, εjs) for i ≠ j or t ≠ s off the diagonal
Note that Σ is a huge matrix, on which we need
to put more structure.

Three different specifications of Σ are common


1. Var(εit) = σ² and all covariances between error terms are zero. In this case, OLS can be
applied (no weighting).
2. Var(εit) = σi² and all covariances between error terms are zero. We have cross-sectional
heteroscedasticity. GLS can be applied (cross-section weights):
(i) Step 1: compute OLS-residuals.
(ii) Step 2: compute residual variances within
every cross-section −→ Σ̂ −→ GLS-estimator.
3. Var(εit) = σi², Cov(εit, εjt) = σij , all other
covariances zero. We also allow for contemporaneous correlation between cross-sections.
GLS can be applied (SUR weights).
(i) Step 1: compute OLS-residuals.
(ii) Step 2: compute residual variances and
covariances between cross-sections −→ Σ̂ −→ GLS-estimator.
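A sketch of specification 2 (cross-section weights); X is assumed to already contain the fixed-effect dummies, and Y, ids are as in the previous sketch:

import numpy as np

def cross_section_weighted_gls(Y, X, ids):
    # Step 1: OLS residuals
    beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
    r = Y - X @ beta_ols
    # Step 2: residual variance within every cross-section -> weights -> GLS
    units = np.unique(ids)
    sigma2 = np.array([np.mean(r[ids == u]**2) for u in units])
    w = 1.0 / sigma2[np.searchsorted(units, ids)]
    XtWX = X.T @ (w[:, None] * X)
    return np.linalg.solve(XtWX, X.T @ (w * Y))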
IV Exercise

For 8 South-American countries we want to model the Real


GDP per capita in 1985 prices (= Rgdpl) as a function of the
following explanatory variables.
• Population in 1000’s (Pop)
• Real Investment share of GDP, in % (I)
• Real Government share of GDP, in % (G)
• Exchange Rate with U.S. dollar (XR)
• Measure of Openness of the Economy (Open)
You find the data in the file ”penn.wmf”, already in Eviews
format. Now we want to estimate the panel data model

Rgdpl_it = a_i + b1*Pop_it + b2*I_it + b3*G_it + b4*XR_it + b5*Open_it + error_it

with Rgdpl, Pop, and XR in log-differences.


1. Create a “pool” object in Eviews. Give it a name and
define the cross-section identifiers. These identifiers are
those parts of the names of the series which identify the
cross-section.
2. Make plots of the XR-variables (for this you need to
open them as a group). Create now these variables
in log-difference and make the plots again. To create
the variables use ”logdifXR?=dlog(XR?)” and the Pool-
Genr menu of the pool object; the ? will be substituted
by every cross-section identifier.
3. Compute the medians of the variables I? for the different
countries (use descriptive statistics)
4. We will take the fixed effects approach. Why? Estimate
the fixed effect model, with no weighting. (In Eviews:
cross-section=Fixed Effect; dependent variable=dlog(rgdpl?);
common coefficients=dlog(pop?) i? g? dlog(xr?) open?
). Which variables are significant?
5. Have a look at the residuals and their variances (View
/residuals/Graphs|Covariance Matrix within the pool
object). Is there cross-sectional heteroscedasticity? If
you think so, use cross-section weighting in the estimation-
procedure. What does “iterate until convergence” mean?
6. Compute now the correlations between the residual se-
ries from different countries. Do you need to take an-
other weighting form?
7. Do the results depend strongly on the weighting method?
8. Take a look at the Standard Errors for the estimated
fixed effects. (For this, you need to estimate the panel
data model without intercept, but with a constant as
cross-section specific coefficient).
9. Test whether all fixed effects are equal (to know how
Eviews labels the coefficients, use View/Representation).
Test also whether all fixed effects except for Brazil are
equal.

V The Random Effects (RE) model

Write
Yit = Xit'β + εit
and with vit a white noise with mean zero and
variance σv²
εit = αi + vit.
The error term εit contains a permanent compo-
nent (the random effect) αi and a transitory com-
ponent vit. Both components are supposed to be
independent and αi has mean zero and variance
σα².

The model in matrix notation is


Y = Xβ + ε
with Σ := Cov(ε) not diagonal. Indeed
Cov(εit, εis) = Cov(αi + vit, αi + vis) = σα²
for t ≠ s. All other covariances are zero, and on
the diagonal of Σ we have
Var(εit) = Var(αi + vit) = σα² + σv²,
called the variance decomposition formula.

We will then estimate β by GLS, and the result-


ing estimator is the RE-estimator.
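A sketch of the implied covariance structure and the resulting GLS computation, for a balanced panel with rows stacked cross-section by cross-section and with σα² and σv² assumed known (in practice they are estimated first):

import numpy as np

def random_effects_gls(Y, X, N, T, sigma2_alpha, sigma2_v):
    # each T x T block of Sigma is sigma_alpha^2 * J_T + sigma_v^2 * I_T
    block = sigma2_alpha * np.ones((T, T)) + sigma2_v * np.eye(T)
    Sigma_inv = np.kron(np.eye(N), np.linalg.inv(block))
    cov = np.linalg.inv(X.T @ Sigma_inv @ X)
    beta_re = cov @ (X.T @ Sigma_inv @ Y)
    return beta_re, cov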

Remark: since the effect αi is part of the error


term in the RE-model, it needs to be uncorrelated
with the X-variables. If not, the RE-estimator
is inconsistent. Independence of the unobserved
heterogeneity αi and the observed part Xit is a
very strong assumption.

In econometrics, the fixed effects model seems to


be the most appropriate.

Exercise Consider a panel of 3 firms having data


on investment x and profit y for 10 years. You
can find the data in the file “smalldata.wf1”
1. Create a workfile and the appropriate “pool”
object in Eviews. Give it a name and define
the cross-section identifiers. Read in the vari-
ables.
2. Estimate the FE-model (with appropriate weight-
ing) and compare with the estimates of the
parameters in a RE-model.
Session 3: A note on instrumental vari-
ables

I The endogeneity problem

Consider the regression equation


Y1 = β1X1 + γ1Y2 + ε1, (1)
with Cov(Y2, ε1) ≠ 0 and Cov(X1, ε1) = 0. In
this case, we say that we have an endogenous
RHS-variable, and there is a serious problem. A
basic assumption of the general linear model is vi-
olated, and we cannot use the LS-estimator any-
more.

We discuss now the most important cases where
the endogeneity problem arises:

Simultaneity problem Suppose that Y1 and Y2 occur


simultaneously; there is an instantaneous feed-back mech-
anism. For example



Y1 = β1X1 + γ1Y2 + ε1
Y2 = γ2Y1 + ε2

We see that Cov(Y1, ε1) = γ1Cov(Y2, ε1) + Var(ε1) and
Cov(Y2, ε1) = γ2Cov(Y1, ε1) + Cov(ε1, ε2), implying

Cov(Y2, ε1) = [γ2Var(ε1) + Cov(ε1, ε2)] / (1 − γ1γ2) ≠ 0.
So even for uncorrelated ε1 and ε2, there is an endogeneity
problem.
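A quick simulation (with arbitrary illustrative parameter values) confirming this: even with uncorrelated ε1 and ε2, the sample covariance between Y2 and ε1 is far from zero.

import numpy as np

rng = np.random.default_rng(1)
n, beta1, gamma1, gamma2 = 100_000, 1.0, 0.5, 0.8
X1 = rng.normal(size=n)
eps1 = rng.normal(size=n)
eps2 = rng.normal(size=n)                          # uncorrelated with eps1
Y1 = (beta1 * X1 + gamma1 * eps2 + eps1) / (1 - gamma1 * gamma2)   # reduced form
Y2 = gamma2 * Y1 + eps2
print(np.cov(Y2, eps1)[0, 1])   # close to gamma2*Var(eps1)/(1 - gamma1*gamma2) = 1.33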

Measurement error
The regressor is observed with a measurement error. Call
X* the true value, and X the observed value of the exogenous variable:

Y = βX* + ε1
X = X* + ε2


Suppose that the measurement error ε2 is uncorrelated with


ε1. It follows that

Y = βX − βε2 + ε1 := βX + ε

May we estimate β by regressing Y on X? No, since


Cov(X, ε) = Cov(X* + ε2, −βε2 + ε1) = −βVar(ε2). So
there is an endogeneity problem; a RHS-variable is corre-
lated with the error term.
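A short simulation (illustrative values only) of the resulting attenuation: the OLS slope of Y on the observed X is biased towards zero.

import numpy as np

rng = np.random.default_rng(2)
n, beta = 100_000, 2.0
X_star = rng.normal(size=n)                    # true regressor
X = X_star + rng.normal(size=n)                # observed with measurement error eps2
Y = beta * X_star + rng.normal(size=n)
slope = np.cov(X, Y)[0, 1] / np.var(X)
print(slope)   # roughly beta * Var(X*)/(Var(X*) + Var(eps2)) = 1.0, not beta = 2.0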

II Instrumental variables

The endogeneity problem can be solved by intro-


ducing instrumental variables. Take again
the model Y1 = β1X1 + γ1Y2 + ε1.
A variable Z is called an instrumental variable
(IV) for Y2 if
• Y2 and Z have a non-zero correlation
• Z is uncorrelated with the error term.
The higher the correlation between Z and Y2, the
better.

One could even have several IVs for Y2; let Z1, . . . , Zk
be the list of instrumental variables. Estimation
of γ1 is then done by a two-stage least squares
approach:
1. Regress Y2 on the instrumental variables, giv-
ing the adjusted series
Ŷ2 = c + a1Z1 + . . . + ak Zk .
Note that Ŷ2 is itself an instrumental variable.
It can be seen as the optimal IV formed by a
linear combination of Z1, . . . , Zk .
2. Regress Y1 on X1 and Ŷ2, giving us
β̂1X1 + γ̂1Ŷ2.
The IV-estimators β̂1 and γ̂1 are consistent
estimators.
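A sketch of the two stages in numpy; y1 is the dependent variable, X1 the exogenous regressors (including a constant), y2 the endogenous regressor and Z the instruments — all names are assumptions for the illustration. Following the convention used later in these notes that exogenous regressors serve as instruments for themselves, X1 is included in the first stage.

import numpy as np

def two_stage_least_squares(y1, X1, y2, Z):
    # Stage 1: regress the endogenous variable on the instruments (and X1)
    W = np.hstack([X1, Z])
    y2_hat = W @ np.linalg.lstsq(W, y2, rcond=None)[0]
    # Stage 2: regress y1 on X1 and the fitted values y2_hat
    R = np.hstack([X1, y2_hat[:, None]])
    coefs = np.linalg.lstsq(R, y1, rcond=None)[0]
    return coefs[:-1], coefs[-1]               # (beta_hat, gamma_hat)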

How to find instrumental variables?
Most of the time, it is very difficult to find good
IV. Two exceptions:
• One can always try lagged versions of the vari-
able to instrument: Zt = Y2,t−1 or Zt =
Y2,t−2 (or both).
• In a system of equations, exogenous variables
from other equations can be taken.

III Exercise

We will analyse the data in the file “schooling2.wf1” (see Verbeek,


page 130, for details). We want to estimate the returns to schooling.
For this we will estimate a human capital earnings function:

log(wi) = β1 + β2Si + β3Ei + β4Ei² + γ'xi + εi.

The dependent variable is log(wi), the log of individual earnings. The


explanatory variables are

• Si: years of schooling

• Ei: years of experience (measured as age − Si − 6).

• xi: contains the control variables: black, south, smsa (living in a


metropolitan area). These are all dummy variables.

1. Make descriptive statistics of the wage variables


(histogram/qqplot/density estimate).

2. Create the variable Ei and make a scatterplot of log(wi) versus Ei.


Also apply a scatterplot smoother.

3. Make a scatterplot

4. Estimate the earnings function. Have a look at the regression


output.

5. Why is there an endogeneity problem with Si? As an instrumental


variable we use the dummy variable “nearc” indicating whether the
person lives near a college or not. To see whether this could be
suitable, we compute the correlation between the instrument and
Si.

6. Estimate now the earnings equation using the instrumental variable


estimator. Do the coefficients have the expected sign? Which
variables are significant?

7. Compare the SE of the estimates of β2 with OLS and with IV.

8. Perform a residual analysis (test for homoscedasticity, residual


plot, ...).

9. Comment on the value of R2.

10. Comment on the choice of the instrument.

Session 4: Estimation of Systems of Equa-
tions

I Introduction

In this Chapter, Y1, . . . , YN denote the endoge-


nous variables, while X1, . . . , Xp are the exoge-
nous variables. The economist decides which vari-
ables can be considered as exogenous. He/she
proposes an economic model of the form






Y1 = f1(X1, X2, . . . , Xp, Y2, Y3, . . . , YN ) + ε1
Y2 = f2(X1, X2, . . . , Xp, Y1, Y3, . . . , YN ) + ε2
...
YN = fN (X1, X2, . . . , Xp, Y1, Y2, . . . , YN−1) + εN

This is called the structural form of the system,


i.e. as given by economic theory. The functions
fj , for j = 1, . . . , N are most of the time linear
functions. We may then write







Y1 = a11X1 + a12X2 + . . . + a1pXp + b12Y2 + . . . + b1N YN + ε1
Y2 = a21X1 + a22X2 + . . . + a2pXp + b21Y1 + . . . + b2N YN + ε2
...
YN = aN1X1 + aN2X2 + . . . + aNpXp + bN1Y1 + . . . + bN,N−1YN−1 + εN

If no restrictions are put on the parameters, then


the structural form can never be identified. For
example, the second equation can be rewritten
as
Y1 = −(a21/b21)X1 − (a22/b21)X2 − . . . + (1/b21)Y2 − . . . − (b2N/b21)YN − (1/b21)ε2,

which is not compatible with the first equation,


unless certain restrictions are satisfied.

We can rewrite the system in its reduced form, by


solving for the endogenous variables (if possible):






Y1 = π11X1 + π12X2 + . . . + π1pXp + ε1′
Y2 = π21X1 + π22X2 + . . . + π2pXp + ε2′
...
YN = πN1X1 + πN2X2 + . . . + πNpXp + εN′

If the coefficients of the structural form can be re-
trieved from the coefficients of the reduced form,
then the system is identified. If
• πij → aij , bij uniquely: exactly identified
• πij → aij , bij in multiple ways: overidentified
Restrictions need to be put on the coefficients
before a system becomes identifiable.

A system-approach usually gives compact mod-


els, at the price of making some assumptions
(exogenous/endogenous; parameter restrictions).
This is in contrast with (unstructured) VAR-models.
The latter models make fewer hypotheses, but in-
volve a huge amount of unknown parameters to
estimate.

II Systems of regression equations

For the i-th equation of the model, with 1 ≤ i ≤


N , we observe (Yit, Xit) for T periods. For every
i, consider the regression equation
Yit = Xit'βi + εit (t = 1, . . . , T ).
(So no endogenous RHS variables for the mo-
ment)

Consider the stacked vectors


   

Y = (Y11, . . . , Y1T , . . . , YN1, . . . , YNT )',
β = (β1', β2', . . . , βN')',
ε = (ε11, . . . , ε1T , . . . , εN1, . . . , εNT )'

and the block-diagonal matrix X̃, whose i-th diagonal
block contains the rows Xi1', . . . , XiT' of equation i,
with zeros elsewhere.

We obtain the general linear model, in matrix


notation,
Y = X̃β + ε.
• Cov(εit, εjs) = 0 for all t, s and i ≠ j. The
regression equations are unrelated. We can
estimate every equation separately by OLS.
• Cov(εit, εjt) = σij . The regression equations
are seemingly unrelated. We have a SURE-
model, where we allow for contemporaneous cor-
relation between the innovations of different equa-
tions. We estimate the SURE-model using
GLS.
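A sketch of the SURE estimator for a balanced system of N equations over T periods; Y_list and X_list are hypothetical lists holding, per equation, a length-T vector and a T × ki regressor matrix.

import numpy as np
from scipy.linalg import block_diag

def sure_fgls(Y_list, X_list):
    T = len(Y_list[0])
    # Step 1: equation-by-equation OLS residuals
    R = np.column_stack([y - x @ np.linalg.lstsq(x, y, rcond=None)[0]
                         for y, x in zip(Y_list, X_list)])
    S = R.T @ R / T                            # estimated contemporaneous covariances
    # Step 2: GLS on the stacked system, Cov(eps) = S kron I_T
    X_tilde = block_diag(*X_list)
    Y = np.concatenate(Y_list)
    Omega_inv = np.kron(np.linalg.inv(S), np.eye(T))
    return np.linalg.solve(X_tilde.T @ Omega_inv @ X_tilde,
                           X_tilde.T @ Omega_inv @ Y)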
Exercise: Grunfeld data
We consider investment data for 5 firms during 20 years, and consider
the model

INVit = αi + βi1 VALit + βi2 CAPit + εit

for 1 ≤ i ≤ N = 5, and 1 ≤ t ≤ T = 20. The variables are

• gross investment (INV)

• market value of the firm (VAL)

• value of the stock of plant and equipment (CAP)

The firms have labels GM, CH, US, GE, and WH.

1. Open the file ”grunfeld.wf1” and create a new object of the “system”-
type. Specify the system by typing

invgm=c(1)+c(2)*valgm+c(3)*capgm
invch=c(4)+c(5)*valch+c(6)*capch
invge=c(7)+c(8)*valge+c(9)*capge
invwh=c(10)+c(11)*valwh+c(12)*capwh
invus=c(13)+c(14)*valus+c(15)*capus

2. Estimate the system by OLS.

3. Have a look at the residuals. Make graphs and compute correla-


tions between them. What is your conclusion?

4. Estimate now using the SURE method. Which coefficients are
significant? Is your answer different if you do not iterate the GLS-
estimator? (Tip: you can use “freeze” to keep a table on the
screen.)

5. Test whether the coefficients for market value are equal across
firms.

6. Is there autocorrelation in the residuals? Comment on the value


of the DW-statistic. Is there improvement if you model the error
terms by AR(1) processes?
(specify “invgm=c(1)+c(2)*valgm+c(3)*capgm + [ar(1)=c(16)]”
and similarly for the other equations).

7. It seems that many of the variables in the model are trending up-
wards. Is this a problem or not? Do the residuals look stationary?

8. Use a cross correlogram to check whether the assumption of only


contemporaneous correlation is plausible.

9. Can you make a forecast with this model?

III Simultaneous-Equations Models

In a Simultaneous-Equation model there will be


endogenous variables at the RHS of the equa-
tions.

Two-stage least squares (2SLS)

Here the equations of the system are estimated


one after another. The correlation between the
error terms of different equations is not exploited.
We speak about a “limited information” method.
This approach will still be consistent, but not effi-
cient. 2SLS is nothing else but IV-estimation.

The advantage of 2SLS is that you can still cor-


rectly estimate an equation, even when some of
the other equations are misspecified or not iden-
tified.
Suppose that we have the following system of
equations:









Y1 = β1X1 + β2X2 + γ1Y2 + ε1
Y2 = β3X3 + γ2Y1 + ε2
Y3 = β4X1 + γ4Y1 + γ5Y2 + ε3


For equation (1): use X3 as IV for Y2. By taking X1, X2 as instruments


for themselves, we have a list of 3 instruments for 3 RHS-variables. This
equation is exactly identified.

For equation (2): Use X1 and X2 as instruments for Y1. Take X3


as instrument for itself. We have a list of 3 instruments for 2 RHS
variables. The equation is overidentified.

For equation (3): Use X2 and X3 as instruments for Y1 and Y2. Take
X1 as instrument for itself. We have a list of 3 instruments for 3 RHS
variables. The equation is exactly identified.

An equation can be estimated with 2SLS, if the


number of instruments is ≥ the number of RHS-
variables.

Exercise: Estimation of the Klein model I
Consider the following model of the American (pre-war) economy:






CSt = α0 + α1Pt + α2Pt−1 + α3(WGt + WPt) + ε1   (consumption)
It = β0 + β1Pt + β2Pt−1 + β3Kt−1 + ε2   (investment)
WPt = γ0 + γ1Xt + γ2Xt−1 + γ3TIMEt + ε3   (private wages)

with Xt (demand), Pt (private profits), and Kt (capital stock) endogenous,
and T (indirect taxes plus net exports) and WG (government wages) exogenous.

1. Read in the data from “klein.wf1” and create a system object. Add
a list of instrumental variables that you can use:

cs=c(1)+c(2)*p+c(3)*p(-1)+c(4)*(wg+wp)
i=c(6)+c(7)*p+c(8)*p(-1)+c(9)*k1
wp=c(10)+c(11)*x+c(12)*x(-1)+c(13)*year
inst x(-1) k1 wg year p(-1)

Could you add x(-2) to this list?

2. Estimate the system by 2SLS. Do the parameters have the ex-


pected sign?

3. Which equations are overidentified?

4. Are the residuals from different equations correlated? If yes, is


2SLS still valid?

5. Comment on serial correlation in the residuals.


6. Is there a test to know whether the instruments are valid? No, the
only thing we can do is to test for overidentifying restrictions, by
which we mean testing for the validity of the use of a number of
instruments larger than strictly needed. Take for example the sec-
ond equation. Estimate it as a separate equation using the GMM
estimation tool of Eviews. Here we have q = 2 overidentifying
restrictions. Look at the value of the J-statistic. Under the null
hypothesis that all instruments are valid, we have

T·J ∼ χ²(q).

So we reject the null hypothesis if T J is larger than the 5 % upper


quantile of a chi-squared distribution with q degrees of freedom (in
Eviews, you get this critical value by genr cval=@qchisq(0.95,q)).
What is your conclusion? This procedure is a test for overidenti-
fying restrictions.

Three-stage least squares (3SLS)

This method provides efficient estimates, if the


complete system is well specified, by exploiting
the correlation between the error terms. To fix
ideas, take again









Y1 = β1X1 + β2X2 + γ1Y2 + ε1
Y2 = β3X3 + γ2Y1 + ε2
Y3 = β4X1 + γ4Y1 + γ5Y2 + ε3


Step 1: find the optimal instruments Ŷ2 (and Ŷ1)


by regressing Y2 (and Y1) on the list of instru-
ments. We obtain the system









Y1 = β1X1 + β2X2 + γ1Ŷ2 + ε1
Y2 = β3X3 + γ2Ŷ1 + ε2
Y3 = β4X1 + γ4Ŷ1 + γ5Ŷ2 + ε3


Step 2+3: the above system has no endogeneity
problem anymore. We will estimate it by SURE,
which is a 2-step GLS-estimation method.
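A schematic numpy sketch of the three stages under simplifying assumptions: exog_list and endog_list hold, per equation, the exogenous regressors and the endogenous right-hand-side columns (as T × q arrays), and Z is the full instrument matrix; all names are illustrative.

import numpy as np
from scipy.linalg import block_diag

def three_sls(Y_list, exog_list, endog_list, Z):
    T = len(Y_list[0])
    # Stage 1: replace endogenous regressors by their projections on the instruments
    P_Z = Z @ np.linalg.pinv(Z)
    X_list = [np.hstack([Xe, P_Z @ Ye]) for Xe, Ye in zip(exog_list, endog_list)]
    # Stages 2+3: SURE (2-step GLS) on the transformed system
    R = np.column_stack([y - x @ np.linalg.lstsq(x, y, rcond=None)[0]
                         for y, x in zip(Y_list, X_list)])
    S = R.T @ R / T
    X_tilde = block_diag(*X_list)
    Omega_inv = np.kron(np.linalg.inv(S), np.eye(T))
    Y = np.concatenate(Y_list)
    return np.linalg.solve(X_tilde.T @ Omega_inv @ X_tilde,
                           X_tilde.T @ Omega_inv @ Y)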

Remark: 3SLS is a full information approach, like


FIML.

