Chapter 4

Analysis of panel data

Applied Econometrics
Winter Term 2020/2021
Prof. Dr. Simone Maxand
Humboldt University Berlin
Applied Econometrics  Chapter 4

4.1.1 Introduction

I Panel data:

. pooled observations on a cross-section of investigation units

over several time periods
. typically: collected on the microeconomic level
. also increasingly common: pool individual time series of a
number of countries/industries & analyze them simultaneously

I Investigation units (in short: individuals):

. persons, households,
. firms, industries,
. countries, regions within a country,
. assets, ...

Examples of panel data sets

I German SOEP (Socio-economic panel, DIW):

. representative study of private households

. since 1984 annual survey of the same private households,

persons and families (new federal states since 1990)

. e.g. 2008: around 11,000 households with more than 20,000 persons

. subjects covered: composition of the household, occupational

and family biographies, labor participation and occupational
mobility, income developments, health, ...

. enables the analysis of political and social changes (e.g. effects of

policies and nature, reasons of poverty and societal/economic

Objective and notation

I Objective: Explain the relationship between a dependent

variable $y$ and $K$ explanatory variables $x_1, \dots, x_K$.

I Observations:

. $N$ individuals (persons, households, firms, countries, ...)

. over $T$ time periods (balanced panel).

I Notation: $(y_{it}, x_{kit})$, $i = 1, \dots, N$; $t = 1, \dots, T$; $k = 1, \dots, K$

. T = 1: cross sectional data

. N = 1: time series data

I Here: Focus on panels with many individuals (large N) and

comparatively few time periods (small T ).

Some advantages of panel data

I Panel data contain two types of information:
. cross-sectional information reflecting differences between
individuals, and

. time series information reflecting changes within individuals

over time.

I Consequently, panel data allow

. controlling for unobserved heterogeneity (omitted time-invariant

or individual-invariant variables), which reduces bias,

. constructing and testing more complex (more realistic) behavioral

models than purely cross-sectional or time series data, and

. a better identification of effects that are simply not detectable

in pure cross-section or pure time series data.

Efficiency considerations for panel data

I In general, there are gains in efficiency since explanatory

variables vary across two dimensions.

I Panel data contain more information due to repeated

observations of the same individual.

. This implies more information about temporal changes

(through path dependencies),

. but less variation in the explanatory variables compared with

repeated cross-sections.

I Therefore: In comparison to repeated cross sections, panel

data allow for more efficient estimation of temporal changes.

I But: In comparison to repeated cross sections, panel data

allow for less efficient estimation of averages over time.

Linear panel data regression models

I The most general linear panel data model is given by

$$y_{it} = \alpha_{it} + x_{it}'\beta_{it} + \varepsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T.$$

. Notation: $x_{it} = (x_{1it}, \dots, x_{Kit})'$

I The model cannot be estimated since it has more parameters than

observations (the parameters are not identifiable).

I Restrictions concerning the variation of $\alpha_{it}$ and $\beta_{it}$ in $i$ and $t$

as well as concerning the error term $\varepsilon_{it}$ (its variance cannot

arbitrarily depend on both $i$ and $t$) are necessary.

I Asymptotic results for panel estimators are typically derived for

fixed $T$ and $N \to \infty$.
4.1.2 Seemingly unrelated regression (SUR)

I A linear SUR model is given by

$$y_{it} = \alpha_i + x_{it}'\beta_i + \varepsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T.$$

I In general, this model might be seen as a multivariate linear
model (with $N$ equations), where
. $N$ (different) dependent variables are observed $T$ times, and
. the numbers of regressors $K_i$ might differ across the equations.
I Assumptions on the error terms:

$$E[\varepsilon_{it}] = 0 \;\; \forall i, t, \qquad E[\varepsilon_{it}\varepsilon_{js}] = \begin{cases} \sigma_{ij}, & t = s \\ 0, & t \neq s \end{cases}$$
. Allows for contemporaneous correlations across the equations.

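Cross section representation

I Equation $i$: $y_i = X_i\theta_i + \varepsilon_i$, $i = 1, \dots, N$,
where (assuming $K_i = K$ for all $i$)

$$X_i = \begin{pmatrix} 1 & x_{i1}' \\ \vdots & \vdots \\ 1 & x_{iT}' \end{pmatrix}_{T \times (K+1)}, \quad \theta_i = \begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}_{(K+1) \times 1},$$

$$y_i = \begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix}_{T \times 1}, \quad \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \vdots \\ \varepsilon_{iT} \end{pmatrix}_{T \times 1}$$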
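System representation

$$y = X\theta + \varepsilon,$$

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}_{NT \times 1}, \quad X = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & \cdots & X_N \end{pmatrix}_{NT \times (K+1)N},$$

$$\theta = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_N \end{pmatrix}_{(K+1)N \times 1}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{pmatrix}_{NT \times 1}$$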

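Covariance matrix

$$V[\varepsilon] = E[\varepsilon\varepsilon'] = \Sigma \otimes I_T =: \Omega,$$

where, with $\sigma_{ii} =: \sigma_i^2 = V(\varepsilon_{it})$ ($\forall t$),

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{N1} & \sigma_{N2} & \cdots & \sigma_N^2 \end{pmatrix}$$

and $I_T$ denotes the identity matrix of dimension $T$.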

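The Kronecker product ⊗

I Definition: Let

$$\underset{r \times s}{A} = ((a_{ij}))_{i=1,\dots,r;\; j=1,\dots,s} \quad \text{and} \quad \underset{n \times k}{B}.$$

Then

$$A \otimes B := ((a_{ij}B))_{i=1,\dots,r;\; j=1,\dots,s} = \begin{pmatrix} a_{11}B & \cdots & a_{1s}B \\ \vdots & & \vdots \\ a_{r1}B & \cdots & a_{rs}B \end{pmatrix}$$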

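Some properties of the Kronecker product

1. $(A \otimes B)' = A' \otimes B'$

2. $(A \otimes B)(C \otimes D) = AC \otimes BD$ for $A$ ($r \times s$), $B$ ($n \times k$), $C$ ($s \times p$), $D$ ($k \times q$)

3. $A \otimes (B + C) = A \otimes B + A \otimes C$ for $A$ ($r \times s$) and $B$, $C$ ($n \times k$)

4. $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$ (for regular square matrices $A$, $B$)

5. $\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\,\mathrm{tr}(B)$ (for square matrices $A$, $B$)

6. In general: $A \otimes B \neq B \otimes A$!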
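The definition and some of the listed properties can be checked numerically. The following is a minimal sketch in base R; the matrices are arbitrary illustration values (not from the lecture), and `S` merely plays the role of a small Σ.

```r
# Two small matrices with arbitrary illustration values
A <- matrix(c(1, 2, 3, 4), nrow = 2)          # 2 x 2
B <- matrix(c(0, 1, 1, 0, 2, 2), nrow = 2)    # 2 x 3

AB <- kronecker(A, B)                         # same as A %x% B
dim(AB)                                       # (2*2) x (2*3) = 4 x 6

# Property 1: transposition acts factor by factor
stopifnot(isTRUE(all.equal(t(AB), kronecker(t(A), t(B)))))

# Properties 4 and 5 need square (and, for 4, regular) matrices
S  <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)     # plays the role of Sigma (N = 2)
I3 <- diag(3)                                 # identity I_T with T = 3
Omega <- kronecker(S, I3)                     # Sigma (x) I_T
stopifnot(isTRUE(all.equal(solve(Omega), kronecker(solve(S), I3))))
stopifnot(isTRUE(all.equal(sum(diag(Omega)), sum(diag(S)) * sum(diag(I3)))))

# Property 6: the Kronecker product is not commutative in general
identical(kronecker(S, I3), kronecker(I3, S)) # FALSE
```

Property 4 is what makes the SUR covariance matrix cheap to invert: only the small N × N matrix has to be inverted, not the full NT × NT matrix.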

Least squares estimation

I General linear regression model

I OLS estimator:

$$\hat\theta_{OLS} = (X'X)^{-1}X'y,$$

where $V[\hat\theta_{OLS}]$ is given by

$$V[\hat\theta_{OLS}] = (X'X)^{-1}X'\Omega X(X'X)^{-1}.$$

I The BLUE is the GLS estimator

$$\hat\theta_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y,$$

with $V[\hat\theta_{GLS}] = (X'\Omega^{-1}X)^{-1}$.

System- vs. single equation estimation

I Single equation estimators are consistent but in general less efficient than system estimators.

I Single equation estimators are more robust since they are not influenced by potential misspecifications in other equations.

I In the special case of the same values of the regressors in all equations, i.e. $X_1 = \dots = X_N$,
. single equation estimation and system estimation coincide,

. OLS and GLS are identical.

I In case of no contemporaneous correlations ($\sigma_{ij} = 0$ for all $i \neq j$), single equation estimation is also efficient.

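Estimation of the error covariance matrix

I The covariance matrix $V(\hat\theta_{GLS}) = (X'\Omega^{-1}X)^{-1}$ is estimated by

$$\hat V(\hat\theta_{GLS}) = (X'\hat\Omega^{-1}X)^{-1},$$

with $\hat\Omega = \hat\Sigma \otimes I_T$; $\hat\Sigma = ((\hat\sigma_{ij}))_{i,j=1,\dots,N}$,

and $\sigma_{ij}$ is estimated by

$$\hat\sigma_{ij} = \frac{1}{T}\sum_{t=1}^{T} (y_{it} - \hat\alpha_i - \hat\beta_i'x_{it})(y_{jt} - \hat\alpha_j - \hat\beta_j'x_{jt}),$$

where $\hat\alpha_i$ and $\hat\beta_i$ denote the OLS estimators of $\alpha_i$ and $\beta_i$.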
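The two-step (feasible GLS) procedure for a SUR system can be sketched in base R as follows. This is an illustration under stated assumptions, not the lecture's own code: the data are simulated, there are only N = 2 equations with one regressor each, and the divisor 1/T is assumed for the residual covariance estimate.

```r
set.seed(9)
Tp <- 50                                  # number of time periods T
x1 <- rnorm(Tp); x2 <- rnorm(Tp)          # one regressor per equation
e1 <- rnorm(Tp)
e2 <- 0.8 * e1 + 0.6 * rnorm(Tp)          # contemporaneously correlated errors
y1 <-  1 + 2.0 * x1 + e1
y2 <- -1 + 0.5 * x2 + e2

# Step 1: equation-by-equation OLS residuals and Sigma_hat = ((sigma_hat_ij))
r1 <- resid(lm(y1 ~ x1))
r2 <- resid(lm(y2 ~ x2))
Sigma_hat <- crossprod(cbind(r1, r2)) / Tp

# Step 2: stacked system y = X theta + eps with block diagonal X
X1 <- cbind(1, x1); X2 <- cbind(1, x2)
X  <- rbind(cbind(X1, matrix(0, Tp, 2)),
            cbind(matrix(0, Tp, 2), X2))
y  <- c(y1, y2)

# (Sigma_hat (x) I_T)^{-1} = Sigma_hat^{-1} (x) I_T
Omega_inv  <- kronecker(solve(Sigma_hat), diag(Tp))
V_hat      <- solve(t(X) %*% Omega_inv %*% X)      # estimated covariance matrix
theta_fgls <- V_hat %*% t(X) %*% Omega_inv %*% y   # (alpha_1, beta_1, alpha_2, beta_2)'
theta_fgls
```

If both equations used the same regressor values, the result would coincide with equation-by-equation OLS, in line with the special case noted for system vs. single equation estimation.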

From SUR to panel data models

I Panel data models assume

. the same dependent variable for the $N$ individuals,

. the same regressors for the $N$ individuals (and thus $K_i \equiv K$),
but in general $X_i \neq X_j$ for $i \neq j$.

I SUR does not allow for an estimation of common effects, e.g.

. $\beta_1 = \beta_2 = \dots = \beta$,
. $\alpha_1 = \alpha_2 = \dots = \alpha$,
. common effects that vary over time like $y_{it} = \alpha_t + x_{it}'\beta + \varepsilon_{it}$.

I In case of common parameters, an estimation that takes this restriction into account yields more efficient inference.

4.1.3 Individual-specific error component (EC) model

I Linear panel data regression model:

$$y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}, \quad i = 1, \dots, N; \; t = 1, \dots, T$$

I Error terms $\varepsilon_{it}$ follow an individual-specific (one-way) EC model:

$$\varepsilon_{it} = \mu_i + \nu_{it},$$
. $\mu_i$: unobserved, individual-specific effects (describing the
unobserved heterogeneity across individuals); captures all
unobserved, time-constant effects that are not contained in $x_{kit}$
. $\nu_{it}$: remaining (idiosyncratic) disturbances, i.e. measurement
errors or omitted/unobservable effects that vary over time

Two-way error components model

I As before: $y_{it} = \alpha + \beta'x_{it} + \varepsilon_{it}$, $i = 1, \dots, N$; $t = 1, \dots, T$

I Error terms $\varepsilon_{it}$ follow a two-way error components model:

$$\varepsilon_{it} = \mu_i + \lambda_t + \nu_{it},$$

where $\lambda_t$ captures effects that are constant across the individuals but vary over time.

I We will focus on the individual-specific/one-way EC model.

I Further specifications (not addressed): systems of equations (like SUR) with EC, dynamic panel data models, and panel data models for qualitative dependent variables or count data.

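Matrix notation for the one-way EC model

I Variables of individual $i$ at time $t$:

$$y_{it}, \; \varepsilon_{it}, \; \nu_{it}, \; x_{it} = (x_{1it}, \dots, x_{Kit})'$$
I Variables for individual $i$ ($i = 1, \dots, N$):

$$X_i = \begin{pmatrix} x_{i1}' \\ \vdots \\ x_{iT}' \end{pmatrix}_{(T \times K)}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix},$$

$$y_i = \begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix}, \quad \nu_i = \begin{pmatrix} \nu_{i1} \\ \vdots \\ \nu_{iT} \end{pmatrix}, \quad \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \vdots \\ \varepsilon_{iT} \end{pmatrix} \quad (T \times 1)$$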

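I Notation for stacked observations:

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \; \nu = \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_N \end{pmatrix}, \; \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{pmatrix} \; (NT \times 1), \quad X = \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix} \; (NT \times K)$$

⇒ Model in matrix notation:

$$y = \alpha 1_{NT} + X\beta + \varepsilon = Z\theta + \varepsilon, \qquad \varepsilon = G\mu + \nu,$$
where $Z = [1_{NT} \,\vdots\, X]$, $\theta = (\alpha, \beta')'$,
$\mu = (\mu_1, \dots, \mu_N)'$, $G = I_N \otimes 1_T$,
and $1_n$ denotes the $n$-vector of ones.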

4.2.1 Assumptions for the fixed effects (FE) model

I The effects $\mu_i$ are fixed/deterministic.

. reasonable e.g. if interest is only in the behavior of the sample at hand

I $X$ is deterministic (or strictly exogenous: $E(\nu|X) = 0$) with

$\mathrm{rk}(G \,\vdots\, X) = N + K$ (with probability one).

⇒ $\mathrm{rk}(X) = K$ [note that by definition $\mathrm{rk}(G) = N$]

⇒ There are no time-invariant explanatory variables in $X$.

I The errors are homoscedastic and uncorrelated: $\nu|X \sim (0, \sigma_\nu^2 I_{NT})$
⇒ $y|X \sim (\alpha 1_{NT} + X\beta + G\mu, \; \sigma_\nu^2 I_{NT})$

I Under the above assumptions: $\beta$ is identifiable.

I $\alpha$ and $\mu$ are not identifiable (dummy trap), since

$$y = A\delta + \nu, \quad \text{with } A = (1_{NT} \,\vdots\, X \,\vdots\, G), \; \delta = (\alpha, \beta', \mu')',$$

where the $(NT \times (N + K + 1))$-matrix $A$ does not have full rank;

because of $G 1_N = 1_{NT}$ it follows

$$\mathrm{rk}(A) = \mathrm{rk}(G \,\vdots\, X) = N + K.$$

. The sum of the columns in $G$ equals the first column $1_{NT}$.
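The rank deficiency behind the dummy trap can be made visible numerically. A minimal base R sketch with a small hypothetical panel (simulated regressor; the name `Tp` stands for T):

```r
N <- 4; Tp <- 3; K <- 1                       # small hypothetical panel
set.seed(1)
X    <- matrix(rnorm(N * Tp * K), ncol = K)   # time-varying regressor
G    <- kronecker(diag(N), matrix(1, Tp, 1))  # G = I_N (x) 1_T, dimension NT x N
ones <- rep(1, N * Tp)                        # 1_NT

A <- cbind(ones, X, G)                        # NT x (N + K + 1)
ncol(A)                                       # 6 columns ...
qr(A)$rank                                    # ... but rank N + K = 5
qr(cbind(G, X))$rank                          # rk(G : X) = N + K = 5

# The columns of G add up to the intercept column
isTRUE(all.equal(as.vector(G %*% rep(1, N)), ones))
```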

Possible solutions to non-identifiability

(i) Individual-specific intercepts:

$$\mu_i^* := \alpha + \mu_i, \quad i = 1, \dots, N$$

$$\Rightarrow \; y = X\beta + G\mu^* + \nu \qquad (1)$$

with $\mu^* = (\mu_1^*, \dots, \mu_N^*)'$.

(ii) Linear restrictions, e.g.

$$\mu_\cdot := \sum_{i=1}^{N} \mu_i = 0.$$

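ANOVA notation

I Time sum and time average

$$x_{i\cdot} := \sum_{t=1}^{T} x_{it}, \qquad \bar x_{i\cdot} := \frac{1}{T}x_{i\cdot} = \frac{1}{T}\sum_{t=1}^{T} x_{it}$$
I Cross sectional sum and cross sectional average

$$x_{\cdot t} := \sum_{i=1}^{N} x_{it}, \qquad \bar x_{\cdot t} := \frac{1}{N}x_{\cdot t} = \frac{1}{N}\sum_{i=1}^{N} x_{it}$$
I Time and cross sectional sum and average

$$x_{\cdot\cdot} := \sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}, \qquad \bar x_{\cdot\cdot} := \frac{1}{NT}x_{\cdot\cdot} = \frac{1}{NT}\sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}$$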

4.2.2 Parameter estimation

I For notational convenience, we consider $X$ as deterministic.

I Because of $\nu \sim (0, \sigma_\nu^2 I_{NT})$ the OLSE for $\beta$ is BLUE.

I Separate estimation for $\beta$ using the Frisch-Waugh Theorem:

$$\hat\beta = (X'QX)^{-1}X'Qy, \qquad Q = I - P, \quad P = P_{R(G)} := G(G'G)^{-1}G'$$

I $P$, $Q$ are orthogonal projections (symmetric and idempotent $NT \times NT$ matrices) with $PQ = 0$ and $P = I_N \otimes \frac{1}{T}J_T$,
where $J_T = 1_T 1_T'$ denotes the $(T \times T)$ matrix of ones.

The projection matrices P and Q

I $P$: orthogonal projection onto $R(G)$ [column space of $G$]
I $Q$: orthogonal projection onto $R(G)^\perp$
I Transforming $y$ by $P$ yields group means (averages across time $\forall i$):

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \in \mathbb{R}^{NT} \;\Rightarrow\; Py = \begin{pmatrix} \bar y_{1\cdot} 1_T \\ \vdots \\ \bar y_{N\cdot} 1_T \end{pmatrix}$$
⇒ The (within-)transformation $Q$ yields deviations from the group means $\bar y_{i\cdot}$:
$$Qy = (y_{it} - \bar y_{i\cdot})_{i=1,\dots,N;\; t=1,\dots,T}$$
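The effect of the two projections is easy to verify on a toy vector. The sketch below uses base R only; the numbers in `y` are made up for illustration.

```r
N <- 3; Tp <- 4                                  # Tp stands for T
G  <- kronecker(diag(N), matrix(1, Tp, 1))       # G = I_N (x) 1_T
P  <- G %*% solve(crossprod(G)) %*% t(G)         # G (G'G)^{-1} G'
Q  <- diag(N * Tp) - P
JT <- matrix(1, Tp, Tp)                          # J_T = 1_T 1_T'

# P = I_N (x) J_T / T; P and Q are idempotent and mutually orthogonal
stopifnot(isTRUE(all.equal(P, kronecker(diag(N), JT / Tp))))
stopifnot(isTRUE(all.equal(P %*% P, P)), isTRUE(all.equal(Q %*% Q, Q)))
stopifnot(max(abs(P %*% Q)) < 1e-12)

# Stacked observations, individual by individual (illustrative values)
y  <- c(1, 2, 3, 6,   10, 10, 10, 10,   0, 4, 0, 4)
id <- rep(1:N, each = Tp)
group_means <- ave(y, id)                        # ybar_i. repeated T times

stopifnot(isTRUE(all.equal(as.vector(P %*% y), group_means)))      # Py: group means
stopifnot(isTRUE(all.equal(as.vector(Q %*% y), y - group_means)))  # Qy: deviations
```

In practice one never forms the NT × NT matrices; subtracting group means (here via `ave()`) gives the same result.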

Within(-group) estimator

I Accordingly, $\hat\beta$ can be written as

$$\hat\beta_W := (X'QX)^{-1}X'Qy = [(QX)'QX]^{-1}(QX)'Qy$$

$$= \Big[\sum_{i,t}(x_{it} - \bar x_{i\cdot})(x_{it} - \bar x_{i\cdot})'\Big]^{-1}\sum_{i,t}(x_{it} - \bar x_{i\cdot})(y_{it} - \bar y_{i\cdot}),$$

where $\bar x_{i\cdot} = (\bar x_{1i\cdot}, \dots, \bar x_{Ki\cdot})'$.

⇒ The within(-group) estimator $\hat\beta_W$ utilizes only the variation within each group (for each individual).

⇒ Elimination of $\mu_i$, $\alpha$ by forming deviations from group means

I Alternative names: covariance estimator; least squares dummy variable (LSDV) estimator; fixed effects (FE) estimator.
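That the within estimator and the LSDV estimator coincide can be checked on simulated data. A minimal base R sketch with a single regressor (K = 1); all numbers are hypothetical.

```r
set.seed(42)
N <- 5; Tp <- 4                                  # Tp stands for T
id <- rep(1:N, each = Tp)
mu <- rep(c(-2, -1, 0, 1, 2), each = Tp)         # hypothetical fixed effects
x  <- rnorm(N * Tp) + mu                         # regressor correlated with mu
y  <- 1 + 0.5 * x + mu + rnorm(N * Tp, sd = 0.1)

# Within transformation: deviations from individual means
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
beta_w <- sum(x_w * y_w) / sum(x_w^2)            # (X'QX)^{-1} X'Qy for K = 1

# LSDV: OLS with one dummy per individual and no common intercept
lsdv <- lm(y ~ 0 + x + factor(id))

c(within = beta_w, lsdv = unname(coef(lsdv)["x"]))   # identical slope estimates
```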

Optimality of $\hat\beta_W$
I $\hat\beta_W$ is the OLSE of $\beta$ in (1) and, equivalently, the OLSE in the (Frisch-Waugh) transformed model (note $Q1_{NT} = 0$ and $QG = 0$)

$$Qy = QX\beta + Q\nu \qquad (2)$$

i.e. $(y_{it} - \bar y_{i\cdot}) = (x_{it} - \bar x_{i\cdot})'\beta + (\nu_{it} - \bar\nu_{i\cdot})$

I The within-transformation $Q$ eliminates the individual effects ($\mu_i$) and the intercept ($\alpha$).

I The optimality (efficiency) is not obvious from (2), since $V[Q\nu] = \sigma_\nu^2 Q$ ($\neq \sigma_\nu^2 I$) is singular.

I But $\hat\beta_W$ is BLUE, since it corresponds to the OLSE in model (1) (Gauss-Markov Theorem).

I From the FW Theorem it follows that the residuals from (1) and (2) are identical.

I Thus

$$\hat\nu = y - \hat\alpha 1_{NT} - X\hat\beta - G\hat\mu = y - X\hat\beta - G\hat\mu^* = Qy - QX\hat\beta_W =: \hat\nu_W$$

. Here, $\hat\alpha$, $\hat\beta$ and $\hat\mu$ are (any) OLS estimators of $\alpha$, $\beta$ and $\mu$.
. Clearly, $\hat\beta = \hat\beta_W$ and the OLSE $\hat\mu^*$ of $\mu^*$ are unique.

Estimation for the error variance

I Unbiased estimator of the error variance:

$$\hat\sigma_\nu^2 = \frac{\hat\nu'\hat\nu}{NT - K - N} = \frac{\hat\nu_W'\hat\nu_W}{NT - K - N} \quad \text{with } E[\hat\sigma_\nu^2] = \sigma_\nu^2$$
I Note: We actually estimate $N + K$ parameters ($\beta$ and $\mu^*$) ⇒
only $NT - K - N$ degrees of freedom!

I Covariance matrix of the BLUE $\hat\beta_W$:

$$V[\hat\beta_W] = \sigma_\nu^2 (X'QX)^{-1}$$

Estimation of individual effects

I FW Theorem ⇒ $G\hat\mu^* = P(y - X\hat\beta_W)$
. This follows even without the full rank property of $G$.

I But $G$ has full rank, implying

$$\hat\mu^* = (G'G)^{-1}G'(y - X\hat\beta_W).$$
I Since $G = I_N \otimes 1_T$ and $(G'G)^{-1} = \frac{1}{T}I_N$ it follows that

$$\hat\mu^* = (G'G)^{-1}G'(y - X\hat\beta_W) = (I_N \otimes \tfrac{1}{T}1_T')(y - X\hat\beta_W) = (\bar y_{i\cdot} - \bar x_{i\cdot}'\hat\beta_W)_{i=1,\dots,N}$$
I Accordingly:

$$\hat\mu_i^* = \bar y_{i\cdot} - \bar x_{i\cdot}'\hat\beta_W \quad (i = 1, \dots, N)$$
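Separate estimation for $\alpha$ and $\mu_i$

I Assumption (for identification): $\mu_\cdot = 0$ (equivalently $\bar\mu_\cdot = 0$)
I On account of $\mu_i^* = \alpha + \mu_i$, it holds that

$$\bar\mu^*_\cdot = \alpha + \bar\mu_\cdot = \alpha \;\Rightarrow\; \mu_i = \mu_i^* - \alpha = \mu_i^* - \bar\mu^*_\cdot$$
and therefore

$$\hat\alpha = \bar{\hat\mu}^*_\cdot = \bar y_{\cdot\cdot} - \bar x_{\cdot\cdot}'\hat\beta_W, \qquad \hat\mu_i = \bar y_{i\cdot} - \bar y_{\cdot\cdot} - (\bar x_{i\cdot} - \bar x_{\cdot\cdot})'\hat\beta_W$$

. This is the only OLSE (out of infinitely many) satisfying $\hat\mu_\cdot = 0$.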
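The estimates of the individual intercepts, of the common intercept under the zero-sum restriction, and of the error variance can be computed directly from group means. A minimal base R sketch on simulated data with one regressor (K = 1); all values are hypothetical.

```r
set.seed(7)
N <- 4; Tp <- 5; K <- 1                           # Tp stands for T
id <- rep(1:N, each = Tp)
x  <- rnorm(N * Tp)
y  <- 2 + 1.5 * x + rep(c(-1, 0, 0.5, 0.5), each = Tp) + rnorm(N * Tp, sd = 0.2)

# Within estimator of beta
x_w <- x - ave(x, id); y_w <- y - ave(y, id)
beta_w <- sum(x_w * y_w) / sum(x_w^2)

# Individual-specific intercepts mu*_i = ybar_i. - xbar_i.' beta_W
ybar_i  <- as.vector(tapply(y, id, mean))
xbar_i  <- as.vector(tapply(x, id, mean))
mu_star <- ybar_i - xbar_i * beta_w

# Restriction sum(mu_i) = 0: alpha is the average of the mu*_i
alpha <- mean(y) - mean(x) * beta_w               # ybar.. - xbar..' beta_W
mu    <- mu_star - alpha

# Unbiased error variance with NT - K - N degrees of freedom
res       <- y_w - x_w * beta_w                   # within residuals
sigma2_nu <- sum(res^2) / (N * Tp - K - N)
var_beta  <- sigma2_nu / sum(x_w^2)               # sigma_nu^2 (X'QX)^{-1}

round(c(alpha = alpha, beta = beta_w, sigma2_nu = sigma2_nu), 3)
```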
First-difference (FD) estimation

I Idea: Elimination of $\mu_i$, $\alpha$ by taking first differences of the variables (instead of subtracting group means):

$$y_{it} = \alpha + x_{it}'\beta + \mu_i + \nu_{it}$$

$$y_{i,t-1} = \alpha + x_{i,t-1}'\beta + \mu_i + \nu_{i,t-1}$$

$$\underbrace{(y_{it} - y_{i,t-1})}_{=\Delta y_{it}} = \underbrace{(x_{it} - x_{i,t-1})'}_{=\Delta x_{it}'}\beta + \underbrace{(\nu_{it} - \nu_{i,t-1})}_{=\Delta\nu_{it}}$$

⇒ $\Delta y_{it} = \Delta x_{it}'\beta + \Delta\nu_{it}$, $i = 1, \dots, N$; $t = 2, \dots, T$

I The OLSE in this model yields the so-called FD estimator:

$$\hat\beta_{FD} = \Big[\sum_{i=1}^{N}\sum_{t=2}^{T}(x_{it} - x_{i,t-1})(x_{it} - x_{i,t-1})'\Big]^{-1}\sum_{i=1}^{N}\sum_{t=2}^{T}(x_{it} - x_{i,t-1})(y_{it} - y_{i,t-1})$$

I $T = 2$ ⇒ $\hat\beta_{FD} = \hat\beta_W$
I $T > 2$ ⇒ $\hat\beta_{FD}$ is less efficient than $\hat\beta_W$ (under our assumptions)
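The equality of the FD and the within estimator for T = 2 can be confirmed numerically. A minimal base R sketch with simulated data and one regressor; the variable names are illustrative.

```r
set.seed(3)
N  <- 6
mu <- rnorm(N)                                   # individual effects
x1 <- rnorm(N); x2 <- rnorm(N)                   # regressor in t = 1 and t = 2
y1 <- 1 + 2 * x1 + mu + rnorm(N, sd = 0.3)
y2 <- 1 + 2 * x2 + mu + rnorm(N, sd = 0.3)

# First-difference estimator: OLS of delta y on delta x (no intercept)
dx <- x2 - x1; dy <- y2 - y1
beta_fd <- sum(dx * dy) / sum(dx^2)

# Within estimator: deviations from the individual means over both periods
xbar <- (x1 + x2) / 2; ybar <- (y1 + y2) / 2
xw <- c(x1 - xbar, x2 - xbar)
yw <- c(y1 - ybar, y2 - ybar)
beta_w <- sum(xw * yw) / sum(xw^2)

c(FD = beta_fd, within = beta_w)                 # identical for T = 2
```

The reason is that for T = 2 the deviations from the mean are just plus and minus half the first difference, so the factor 1/2 cancels in the ratio.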


Applied Econometrics  Chapter 4

Comparing βbW and βbFD when errors are

serially correlated
I Assume e.g. {νit } ∼ AR(1):
νit = ρνi,t−1 + uit , uit ∼ (0, σu2 ) i.i.d., |ρ| < 1
⇔ ∆νit = uit − (1 − ρ)νi,t−1
I If ρ is large (close to 1, i.e. strong serial correlation)

⇒ (1 − ρ) is close to zero, then ∆νit is near to white noise.

⇒ Model (1) no longer satises the standard assumptions.

⇒ βbW is no longer ecient.

⇒ βbFD is more ecient than βbW .

Pooled OLS estimation

I The pooled OLS estimator uses cross sectional and time series variation in order to estimate $\beta$ (but does not control for cross sectional heterogeneity):

$$y_{it} = \alpha + x_{it}'\beta + \nu_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T$$

$$\Leftrightarrow \; y = \alpha 1_{NT} + X\beta + \nu.$$
I Using the FW Theorem:

$$\hat\beta_{POOL} = [X'(I - P_0)X]^{-1}X'(I - P_0)y,$$
where $P_0 = \frac{1}{NT}J_{NT}$ (orthogonal projection onto $R(1_{NT})$).
I Clearly, $V(\hat\beta_{POOL}) \leq V(\hat\beta_W)$; however, $\hat\beta_{POOL}$ is biased unless, for all $i$, $\mu_i^* = \alpha$ or $\bar x_{i\cdot} = \bar x_{\cdot\cdot}$!

4.2.3 Test for fixed effects

I Null hypothesis
$$H_0: \mu_1^* = \dots = \mu_N^*$$
. Under $H_0$, $\hat\beta_{POOL}$ is unbiased and more efficient than $\hat\beta_W$.

I Under the assumption of normally distributed error terms:

$$F = F(y) = \frac{(RRSS - RSS)/(N - 1)}{RSS/(NT - N - K)} \overset{H_0}{\sim} F_{N-1,\,NT-N-K}$$
I Unrestricted residual sum of squares:

$$RSS = \min_{\beta,\mu^*}\|y - X\beta - G\mu^*\|^2 = \min_{\alpha,\beta,\mu}\|y - \alpha 1_{NT} - X\beta - G\mu\|^2 = \hat\nu'\hat\nu$$

$$= \min_{\beta}\|Qy - QX\beta\|^2 = \|Qy - QX\hat\beta_W\|^2 = \hat\nu_W'\hat\nu_W$$

I Restricted residual sum of squares?

. Under $H_0$ we have the pooled model:

$$y = \alpha 1_{NT} + X\beta + \nu$$

⇒ Sum of squared residuals:

$$RRSS = \min_{\alpha,\beta}\|y - \alpha 1_{NT} - X\beta\|^2 = \|Q_0y - Q_0X\hat\beta_{POOL}\|^2, \quad Q_0 = I - P_0$$

I $H_0$ is rejected by an $\alpha_0$-test if

$$F(y) > F_{N-1,\,NT-N-K;\,1-\alpha_0}$$
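The F statistic can be assembled from the two residual sums of squares. The following base R sketch uses simulated data with one regressor (K = 1) and hypothetical effects; it cross-checks the hand-computed statistic against `anova()` for the nested linear models.

```r
set.seed(11)
N <- 8; Tp <- 5; K <- 1                           # Tp stands for T
id <- rep(1:N, each = Tp)
mu <- rep(seq(-2, 2, length.out = N), each = Tp)  # hypothetical individual effects
x  <- rnorm(N * Tp)
y  <- 1 + 0.5 * x + mu + rnorm(N * Tp)

# Unrestricted model: within regression gives RSS
x_w <- x - ave(x, id); y_w <- y - ave(y, id)
beta_w <- sum(x_w * y_w) / sum(x_w^2)
RSS <- sum((y_w - x_w * beta_w)^2)

# Restricted model: pooled OLS gives RRSS
pooled <- lm(y ~ x)
RRSS <- sum(resid(pooled)^2)

df1 <- N - 1
df2 <- N * Tp - N - K
F_stat  <- ((RRSS - RSS) / df1) / (RSS / df2)
p_value <- pf(F_stat, df1, df2, lower.tail = FALSE)
crit    <- qf(0.95, df1, df2)                     # critical value for alpha_0 = 0.05

# Cross-check with the standard F test for nested linear models
full <- lm(y ~ x + factor(id))
c(F = F_stat, F_anova = anova(pooled, full)$F[2], critical = crit, p = p_value)
```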

Example: Traffic fatalities in the US

I Data for $T = 7$ years (1982-1988) for $N = 48$ contiguous states, dataset Fatalities in R package AER.

I Approximately one third of fatal crashes involve a driver who was drinking (in the 1980s).

I Aim: How effective are various government policies designed to discourage drunk driving in reducing traffic deaths?

I Study the impact of the tax on beer in each state, $x$ (adjusted for inflation), on the fatality rate, $y$ (no. of traffic deaths per 10,000 people).

I Exemplary cross section analysis (for 1982):

$$\hat y = 2.01 + 0.15x$$

I The estimated coefficient is positive (but insignificant at $\alpha_0 = 0.1$).

⇒ Higher beer taxes lead to more traffic fatalities!

. This contradicts our expectation (hope).

I The conclusion might be misleading due to omitted variable bias.

I Panel data analysis, e.g. the FE model, controls for unobserved heterogeneity across states (time-invariant variables: long-standing cultural attitudes towards drinking and driving in the states etc.)

$$\hat y = -0.66x + \text{estimated fixed effects of the states}$$

I Now the estimated coefficient has the expected sign.

I Moreover, the impact of beer taxes is significant.
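The two regressions can be reproduced along the following lines. This is a sketch, not the lecture's own R code: it assumes the packages AER and plm are installed and that the fatality rate is built from the columns `fatal`, `pop` and `beertax` of the `Fatalities` data.

```r
library(AER)   # provides the Fatalities data
library(plm)   # panel data estimators

data("Fatalities", package = "AER")

# Fatality rate: traffic deaths per 10,000 inhabitants
Fatalities$frate <- Fatalities$fatal / Fatalities$pop * 10000

# Cross-section regression for 1982
cs82 <- lm(frate ~ beertax, data = subset(Fatalities, year == "1982"))
coef(cs82)

# Fixed effects (within) regression with state effects
fe <- plm(frate ~ beertax, data = Fatalities,
          index = c("state", "year"), model = "within")
coef(fe)
summary(fe)
```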

Model extension

I Add explanatory variables (driving performance, general economic situation in the state, legal minimum age for alcohol

I Take time effects $\lambda_t$ into account (e.g. improved safety standards are introduced in all states in order to reduce traffic fatality rates; they do not differ across states).

I Leads to the two-way classification model:

$$y_{it} = \alpha + x_{it}'\beta + \mu_i + \lambda_t + \nu_{it} \quad (i = 1, \dots, N; \; t = 1, \dots, T)$$

4.3.1 Assumptions of the random effects (RE) model

I Idea: Individuals form a representative sample from a large population. Individual-specific effects are random.

. Typical application: Household panel studies

I Assumptions: ($X$ and $Z = [1_{NT} \,\vdots\, X]$ as in the FE model)

. $\mathrm{rk}(Z) = K + 1$ (with probability one if $X$ is stochastic).

. One-way EC model:

$$\varepsilon_{it} = \mu_i + \nu_{it} \quad (i = 1, \dots, N; \; t = 1, \dots, T)$$

. All $\mu_i$, $\nu_{it}$ are uncorrelated with

$$\mu_i \sim (0, \sigma_\mu^2) \quad \text{and} \quad \nu_{it} \sim (0, \sigma_\nu^2)$$

⇒ One parameter ($\sigma_\mu^2$) describes the RE (unobserved heterogeneity)!

Moments of ε
I If $X$ is stochastic, it is assumed to be strictly exogenous [w.r.t. $\varepsilon$, i.e. $E(\varepsilon|X) = 0$], and all moments are conditioned on $X$.

I Because of $\mu \sim (0, \sigma_\mu^2 I_N)$ and $\nu \sim (0, \sigma_\nu^2 I_{NT})$ and the uncorrelatedness of $\mu$ and $\nu$ one gets

$$\varepsilon = G\mu + \nu \sim (0, \Omega), \quad \text{where}$$

$$\Omega := V[\varepsilon] = V[G\mu] + V[\nu] = \sigma_\mu^2 GG' + \sigma_\nu^2 I_{NT} = \sigma_\mu^2(I_N \otimes J_T) + \sigma_\nu^2 I_{NT}.$$
I Note: If $X$ is stochastic and $E[\nu|X] = 0$, then

. $E[y|X] = Z\theta$ only under $E[\mu|X] = 0$,

. but $E[y|X, \mu] = Z\theta + G\mu$ even when $E[\mu|X] \neq 0$.

Covariance matrix of the error terms

I Error terms $\varepsilon_{it}$ have homogeneous variances.

I $\Omega$ has a block diagonal structure with serial correlations only for errors of the same individual:

$$Cov[\varepsilon_{it}, \varepsilon_{js}] = \begin{cases} \sigma_\mu^2 + \sigma_\nu^2, & \text{if } i = j \text{ and } t = s \text{ (variances)} \\ \sigma_\mu^2, & \text{if } i = j, \; t \neq s \\ 0, & \text{else } (i \neq j) \end{cases}$$

⇒ $\Omega$ is an equicorrelated block diagonal matrix.

I Clearly, $\Omega$ is positive definite (as the sum of a positive definite and a nonnegative definite matrix).

4.3.2 GLS estimation

I Linear model with general error term structure:

$$y = \alpha 1_{NT} + X\beta + \varepsilon = Z\theta + \varepsilon, \quad \varepsilon|X \sim (0, \Omega) \qquad (3)$$

I The pooled OLS estimator is unbiased but inefficient.

I The FE estimator $\hat\beta_W$ remains unbiased (as long as $E[\nu|X] = 0$) and consistent, but it is inefficient!

. If $E[\mu|X] \neq 0$ and thus $E[\varepsilon|X] \neq 0$, $\hat\beta_W$ may be interpreted as an IV estimator with $QX$ as (valid) IVs for $X$.
I The GLS estimator is BLUE:
$$\hat\theta_{GLS} = \begin{pmatrix} \hat\alpha_{GLS} \\ \hat\beta_{GLS} \end{pmatrix} = (Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}y$$

⇒ Need to invert the huge $(NT \times NT)$-matrix $\Omega$.

Spectral decomposition of Ω
I Because of $P = I_N \otimes \frac{1}{T}J_T$ and $Q = I - P$, $\Omega$ can be written as

$$\Omega = \sigma_\mu^2(I_N \otimes J_T) + \sigma_\nu^2 I_{NT} = (T\sigma_\mu^2 + \sigma_\nu^2)P + \sigma_\nu^2 Q = \sigma_1^2 P + \sigma_\nu^2 Q, \quad \text{where } \sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2.$$
I Since $P$ and $Q$ are orthogonal projections with $PQ = 0$, this is the spectral decomposition of $\Omega$.

. $\Omega$ has two distinct eigenvalues $\sigma_1^2$ and $\sigma_\nu^2$ with multiplicities $\mathrm{rk}(P) = \mathrm{tr}(P) = N$ and $\mathrm{tr}(Q) = N(T - 1)$, respectively.
⇒ For any scalar $r$: $\Omega^r = (\sigma_1^2)^r P + (\sigma_\nu^2)^r Q$
⇒ $\Omega^{-1} = \frac{1}{\sigma_1^2}P + \frac{1}{\sigma_\nu^2}Q$ (for $r = -1$; easy to verify directly!)
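The decomposition and the resulting closed forms for the inverse and the inverse square root can be verified numerically. A minimal base R sketch; the variance components are hypothetical values chosen for illustration.

```r
N <- 3; Tp <- 4                                   # Tp stands for T
s2_mu <- 2; s2_nu <- 0.5                          # hypothetical variance components
s2_1  <- Tp * s2_mu + s2_nu                       # sigma_1^2 = 8.5

P <- kronecker(diag(N), matrix(1 / Tp, Tp, Tp))   # I_N (x) J_T / T
Q <- diag(N * Tp) - P
Omega <- s2_mu * kronecker(diag(N), matrix(1, Tp, Tp)) + s2_nu * diag(N * Tp)

# Spectral decomposition and the closed-form inverse
stopifnot(isTRUE(all.equal(Omega, s2_1 * P + s2_nu * Q)))
stopifnot(isTRUE(all.equal(solve(Omega), P / s2_1 + Q / s2_nu)))

# r = -1/2: Omega^{-1/2} Omega Omega^{-1/2} = I
Omega_mhalf <- P / sqrt(s2_1) + Q / sqrt(s2_nu)
stopifnot(isTRUE(all.equal(Omega_mhalf %*% Omega %*% Omega_mhalf, diag(N * Tp))))

# Eigenvalues: sigma_1^2 with multiplicity N, sigma_nu^2 with multiplicity N(T - 1)
ev <- round(eigen(Omega, symmetric = TRUE)$values, 8)
table(ev)
```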
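GLS transformation

I The GLSE for $\theta$ is the solution to the normal equation

$$Z'\Omega^{-1}Z\,\hat\theta_{GLS} = Z'\Omega^{-1}y$$

or (equivalently) the OLSE in the transformed model

$$\underbrace{\Omega^{-1/2}y}_{=:\tilde y} = \underbrace{\Omega^{-1/2}Z}_{=:\tilde Z}\theta + \underbrace{\Omega^{-1/2}\varepsilon}_{=:\tilde\varepsilon}, \qquad \tilde\varepsilon|X \sim (0, I_{NT}).$$

I Here $\Omega^{-1/2} = \frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q$.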

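I Because of $P1_{NT} = 1_{NT}$ and $Q1_{NT} = 0$ it follows

$$\Omega^{-1/2}1_{NT} = \frac{1}{\sigma_1}1_{NT}.$$
I Therefore the transformed model can be written as

$$\tilde y = \frac{\alpha}{\sigma_1}1_{NT} + \underbrace{\Omega^{-1/2}X}_{=:\tilde X}\beta + \tilde\varepsilon, \qquad \tilde\varepsilon|X \sim (0, I_{NT}) \qquad (4)$$

I GLS estimator:
$$\begin{pmatrix} \hat\alpha_{GLS}/\sigma_1 \\ \hat\beta_{GLS} \end{pmatrix} = \big((1_{NT} \,\vdots\, \tilde X)'(1_{NT} \,\vdots\, \tilde X)\big)^{-1}(1_{NT} \,\vdots\, \tilde X)'\tilde y$$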

Separate GLS estimation of β

I Alternatively, $\hat\beta_{GLS}$ can be obtained separately using the Frisch-Waugh Theorem as the OLSE in

$$\underbrace{Q_0\tilde y}_{=:y^*} = \underbrace{Q_0\tilde X}_{=:X^*}\beta + \underbrace{Q_0\tilde\varepsilon}_{=:\varepsilon^*}, \qquad (5)$$

where $Q_0 = I - P_0$ with $Q_0 1_{NT} = 0$ and

$$P_0 = 1_{NT}\underbrace{(1_{NT}'1_{NT})^{-1}}_{=1/(NT)}1_{NT}' = \frac{1}{NT}J_{NT}.$$

I This is equivalent to the OLS estimation of $\beta$ in model (4).

Properties of the matrices $P_0$ and $Q_0$

I $P_0$ and $Q_0$ are orthogonal projection matrices (i.e., they are symmetric and idempotent).

I $P_0Q_0 = 0$

I $P_0P = P_0$

I $P_0Q = 0$

I $Q_0Q = Q$

I $Q_0P = P - P_0$

I $y^* = Q_0\Omega^{-1/2}y$ and $X^* = Q_0\Omega^{-1/2}X$ in (5) can be written as

$$y^* = By \quad \text{and} \quad X^* = BX, \quad \text{where}$$

$$B = Q_0\Omega^{-1/2} = Q_0\Big(\frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q\Big) = \frac{1}{\sigma_1}(P - P_0) + \frac{1}{\sigma_\nu}Q = B'.$$
I On account of $(P - P_0)Q = PQ - P_0Q = 0$, it follows

$$B'B = B^2 = \frac{1}{\sigma_1^2}(P - P_0) + \frac{1}{\sigma_\nu^2}Q.$$
⇒ The GLS estimator of $\beta$ (OLSE in (5)) can be written as

$$\hat\beta_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'B'BX)^{-1}X'B'By.$$

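GLS estimator as weighted average

$$\hat\beta_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'B'BX)^{-1}X'B'By = W_1\hat\beta_W + W_2\hat\beta_B, \quad \text{where}$$

$$\hat\beta_W = (\underbrace{X'QX}_{=:W_{XX}})^{-1}X'Qy \quad \text{(within estimator)}$$

$$\hat\beta_B = [\underbrace{X'(P - P_0)X}_{=:B_{XX}}]^{-1}X'(P - P_0)y \quad \text{(between estimator)}$$

$$W_1 = \Big[W_{XX} + \frac{\sigma_\nu^2}{\sigma_1^2}B_{XX}\Big]^{-1}W_{XX}, \qquad W_2 = \Big[W_{XX} + \frac{\sigma_\nu^2}{\sigma_1^2}B_{XX}\Big]^{-1}\frac{\sigma_\nu^2}{\sigma_1^2}B_{XX}, \quad \text{with } W_1 + W_2 = I_K.$$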
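The weighted-average representation can be checked against a direct GLS computation. The base R sketch below uses simulated data with one regressor (so the weights are scalars) and treats the variance components as known; all numbers are hypothetical.

```r
set.seed(5)
N <- 6; Tp <- 4                                   # Tp stands for T
id <- rep(1:N, each = Tp)
s2_mu <- 1; s2_nu <- 0.5                          # treated as known here
s2_1  <- Tp * s2_mu + s2_nu
x <- rnorm(N * Tp) + rep(rnorm(N), each = Tp)
y <- 1 + 2 * x + rep(rnorm(N, sd = sqrt(s2_mu)), each = Tp) +
     rnorm(N * Tp, sd = sqrt(s2_nu))

# Direct GLS: (Z' Omega^{-1} Z)^{-1} Z' Omega^{-1} y
Z <- cbind(1, x)
Omega <- s2_mu * kronecker(diag(N), matrix(1, Tp, Tp)) + s2_nu * diag(N * Tp)
Oi <- solve(Omega)
theta_gls <- solve(t(Z) %*% Oi %*% Z, t(Z) %*% Oi %*% y)
alpha_gls <- theta_gls[1]; beta_gls <- theta_gls[2]

# Within and between building blocks
xw <- x - ave(x, id);       yw <- y - ave(y, id)        # Qx, Qy
xb <- ave(x, id) - mean(x); yb <- ave(y, id) - mean(y)  # (P - P0)x, (P - P0)y
Wxx <- sum(xw^2); Bxx <- sum(xb^2)
beta_w <- sum(xw * yw) / Wxx
beta_b <- sum(xb * yb) / Bxx

psi2 <- s2_nu / s2_1
W1 <- Wxx / (Wxx + psi2 * Bxx)
W2 <- psi2 * Bxx / (Wxx + psi2 * Bxx)

c(gls = beta_gls, weighted = W1 * beta_w + W2 * beta_b)  # identical
```

The same run also illustrates the intercept formula: `alpha_gls` equals `mean(y) - mean(x) * beta_gls`.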

The between estimator

I $\hat\beta_B$ only takes into account the variation between the groups, but ignores the variation within the groups, since it is the OLSE of $\beta$ in the model after transforming by $P$:

$$Py = \alpha P1_{NT} + PX\beta + P\varepsilon = \alpha 1_{NT} + PX\beta + P\varepsilon \qquad (6)$$

$$\Leftrightarrow \quad \bar y_{i\cdot} = \alpha + \bar x_{i\cdot}'\beta + \bar\varepsilon_{i\cdot}, \quad i = 1, \dots, N \; (\forall t).$$

I By the Frisch-Waugh Theorem, the between estimator of $\beta$ can be obtained separately as the OLSE in the model

$$\underbrace{Q_0P}_{=P-P_0}y = \underbrace{Q_0P}_{=P-P_0}X\beta + Q_0P\varepsilon \quad (\text{recall } Q_0 1_{NT} = 0).$$

I It follows:

$$\hat\beta_B = [X'(P - P_0)X]^{-1}X'(P - P_0)y = \Big[\sum_{i=1}^{N}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})(\bar x_{i\cdot} - \bar x_{\cdot\cdot})'\Big]^{-1}\sum_{i=1}^{N}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})(\bar y_{i\cdot} - \bar y_{\cdot\cdot})$$

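Interpretation of the weighting matrices

I The respective FW representations yield:

$$V[\hat\beta_W|X] = \sigma_\nu^2 W_{XX}^{-1}$$
$$V[\hat\beta_B|X] = \sigma_1^2 B_{XX}^{-1}$$
$$V[\hat\beta_{GLS}|X] = \sigma_\nu^2[W_{XX} + \psi^2 B_{XX}]^{-1}, \quad \text{where } \psi^2 = \frac{\sigma_\nu^2}{\sigma_1^2}$$
⇒ $W_{XX} = \sigma_\nu^2(V[\hat\beta_W|X])^{-1}$ and $\psi^2 B_{XX} = \sigma_\nu^2(V[\hat\beta_B|X])^{-1}$
⇒ $W_1 = [W_{XX} + \psi^2 B_{XX}]^{-1}W_{XX}$ and $W_2 = [W_{XX} + \psi^2 B_{XX}]^{-1}\psi^2 B_{XX}$
are proportional to the inverse covariance matrices of $\hat\beta_W$ and $\hat\beta_B$.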

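Special cases
(i) $\sigma_\mu^2 = 0$: ⇒ $\psi^2 = \frac{\sigma_\nu^2}{T \cdot 0 + \sigma_\nu^2} = 1$

⇒ $V[\varepsilon] = \sigma_\nu^2 I_{NT}$ ⇒ (pooled) OLSE = GLSE

(ii) $T \to \infty$:
$\psi^2 = \frac{\sigma_\nu^2}{T\sigma_\mu^2 + \sigma_\nu^2} \to 0$ ⇒ $W_1 \to I_K$ ⇒ $\hat\beta_{GLS} \to \hat\beta_W$

(iii) $\psi^2 \to \infty$ (hypothetically, since $0 \leq \psi^2 \leq 1$):

$W_1 \to 0$ ⇒ $\hat\beta_{GLS} \to \hat\beta_B$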

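Estimation of the intercept α

I The GLSE for $\alpha$ is the OLSE in model (4).
I Because of the identity of the residuals in (4) and (5) it follows

$$1_{NT} \cdot \frac{\hat\alpha_{GLS}}{\sigma_1} = P_0(\tilde y - \tilde X\hat\beta_{GLS}).$$
I Due to
$$P_0\Omega^{-1/2} = P_0\Big(\frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q\Big) = \frac{1}{\sigma_1}P_0,$$
we obtain
$$1_{NT} \cdot \frac{\hat\alpha_{GLS}}{\sigma_1} = \frac{1}{\sigma_1}P_0(y - X\hat\beta_{GLS}) = \frac{1}{\sigma_1}1_{NT}(\bar y_{\cdot\cdot} - \bar x_{\cdot\cdot}'\hat\beta_{GLS})$$
and hence

$$\hat\alpha_{GLS} = \bar y_{\cdot\cdot} - \bar x_{\cdot\cdot}'\hat\beta_{GLS}.$$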

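4.3.3 Estimation of the variance components

I Necessary to obtain a feasible GLS estimator!

I Starting point: $\varepsilon \sim (0, \Omega)$ [conditional on $X$], $\Omega = \sigma_1^2 P + \sigma_\nu^2 Q$

I Because of $QG = 0$ and $PQ = 0$ it follows

$$Q\varepsilon = Q\nu \sim (0, \sigma_\nu^2 Q) \quad \text{and} \quad P\varepsilon \sim (0, \sigma_1^2 P)$$

I Then $E[\|Q\varepsilon\|^2] = E[\|Q\nu\|^2] = \sigma_\nu^2\,\mathrm{tr}(Q) = \sigma_\nu^2 N(T - 1)$

I Hence,
$$E\Big[\frac{\varepsilon'Q\varepsilon}{N(T - 1)}\Big] = E\Big[\frac{\|Q\varepsilon\|^2}{N(T - 1)}\Big] = \sigma_\nu^2$$
I Similarly: $E[\|P\varepsilon\|^2] = \sigma_1^2\,\mathrm{tr}(P) = \sigma_1^2 N$
⇒ $E[\varepsilon'P\varepsilon/N] = \sigma_1^2$

I Now: replace $\varepsilon$ (not observable!) by residuals.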

Intermezzo: A useful lemma

I Lemma. Let $y$ be a random $n$-vector with $y \sim (\mu, \Sigma)$ and $A$
be a (symmetric) $n \times n$-matrix. Then

$$E[y'Ay] = \mu'A\mu + \mathrm{tr}[A\Sigma].$$
I Proof. Writing $y = \mu + (y - \mu)$, we obtain

$$E[y'Ay] = \mu'A\mu + E[(y - \mu)'A(y - \mu)] + 2\underbrace{E[\mu'A(y - \mu)]}_{=0}$$
$$= \mu'A\mu + E\{\mathrm{tr}[(y - \mu)'A(y - \mu)]\}$$
$$= \mu'A\mu + \mathrm{tr}\{A\underbrace{E[(y - \mu)(y - \mu)']}_{=\Sigma}\},$$

where we used $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ and the linearity of the trace and the expectation.

Within residuals

I Appropriate to estimate $\sigma_\nu^2$

I Within residuals are obtained from (3) after applying the
within transformation $Q$:
$$\hat\nu_W = Qy - QX\hat\beta_W = \underbrace{[Q - QX(X'QX)^{-1}X'Q]}_{=:C}\,y = Cy$$

I $C$ is an orthogonal projection (i.e. $C' = C = C^2$) with

$$CQ = C, \quad CX = 0, \quad CP = 0, \quad C1_{NT} = 0.$$
I It can be shown that

$$\hat\nu_W \sim (0, \sigma_\nu^2 C).$$

I The above Lemma thus provides

$$E[\|\hat\nu_W\|^2] = \sigma_\nu^2\,\mathrm{tr}(C) = \sigma_\nu^2(NT - N - K).$$

⇒ An unbiased estimator of $\sigma_\nu^2$ is given by:

$$\hat\sigma_\nu^2 = \frac{\|\hat\nu_W\|^2}{NT - N - K} = \frac{y'(Q - QX(X'QX)^{-1}X'Q)y}{NT - N - K}.$$

I To estimate $\sigma_1^2$, we use the between-group residuals, which can be obtained as OLS residuals in (6):

$$\hat\varepsilon_B = Py - PZ\hat\theta_B \quad (\text{with } \hat\theta_B = (Z'PZ)^{-1}Z'Py)$$

$$= \underbrace{(P - PZ(Z'PZ)^{-1}Z'P)}_{=:D}\,y = Dy$$

I $D$ is also an orthogonal projection (i.e. $D' = D = D^2$) with

$$DP = D, \quad DZ = 0, \quad DQ = 0.$$
I It can be shown that

$$\hat\varepsilon_B \sim (0, \sigma_1^2 D).$$

I From our Lemma it follows

$$E[\|\hat\varepsilon_B\|^2] = \sigma_1^2\,\mathrm{tr}(D) = \sigma_1^2(N - K - 1).$$

⇒ Unbiased estimator of $\sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2$:

$$\hat\sigma_1^2 = \frac{\|\hat\varepsilon_B\|^2}{N - K - 1} = \frac{y'Dy}{N - K - 1} = \frac{y'(P - PZ(Z'PZ)^{-1}Z'P)y}{N - K - 1}$$
⇒ Unbiased estimator of $\sigma_\mu^2$:
$$\hat\sigma_\mu^2 = \frac{\hat\sigma_1^2 - \hat\sigma_\nu^2}{T}$$
I Note: $\hat\sigma_\mu^2$ can yield negative values. (In this case alternative estimators have to be used!)

⇒ Problems with interpretation
. Possible reasons: The sample is too small, the individual effects are insignificant, or the model is misspecified.
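Once the variance components are estimated, the GLS transformation reduces to a partial demeaning: multiplying $\Omega^{-1/2} = \frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q$ by $\sigma_\nu$ gives $I - \theta P$ with $\theta = 1 - \sigma_\nu/\sigma_1$. The base R sketch below evaluates this weight using, for illustration, the variance components printed in the random effects output of Section 4.5 (idiosyncratic 2784.46, individual 7089.80, T = 20); the name `theta` for this weight is the one used in that output.

```r
s2_nu <- 2784.46      # idiosyncratic variance, taken from the output in Section 4.5
s2_mu <- 7089.80      # individual variance, taken from the same output
Tp    <- 20           # number of time periods T

s2_1  <- Tp * s2_mu + s2_nu              # sigma_1^2 = T sigma_mu^2 + sigma_nu^2
theta <- 1 - sqrt(s2_nu / s2_1)          # weight in y_it - theta * ybar_i.
round(theta, 4)                          # 0.8612

# Shares of the two components in the total error variance
round(c(idiosyncratic = s2_nu, individual = s2_mu) / (s2_nu + s2_mu), 3)
```

A value of theta close to 1 means the feasible GLS estimator is close to the within estimator; theta = 0 would give pooled OLS.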

4.4.1 Tests for poolability of the data

I Model assumption: $\alpha$ and $\beta$ depend neither on $i$ nor on $t$.
I Allow (under $H_1$) that parameters vary across individuals:

$$H_1: y_i = \alpha_i 1_T + X_i\beta_i + \varepsilon_i \quad (i = 1, \dots, N)$$
I Test for poolability of the data across individuals:

$$H_0^a: \alpha_i \equiv \alpha \text{ and } \beta_i \equiv \beta \quad (i = 1, \dots, N)$$

I Assume first $\varepsilon \sim N(0, \sigma^2 I_{NT})$ [no individual-specific effects]

. The model can be estimated efficiently by OLS (under $H_1$: OLS equation by equation).

. The linear hypothesis $H_0^a$ can be tested by an $F$-test.

. Distribution of the $F$-statistic under $H_0^a$: $F_{(N-1)(K+1),\,N(T-K-1)}$.
I Now: inclusion of fixed effects under both $H_0$ and $H_1$

. Model under $H_1$ as above: $\alpha_i = \mu_i^*$, $\varepsilon_i = \nu_i$
. Assume $\varepsilon = \nu \sim N(0, \sigma^2 I_{NT})$
⇒ Estimation by OLS (under $H_1$: equation by equation)

. $H_0^b: \beta_i \equiv \beta$ ($i = 1, \dots, N$) is tested by an $F$-test

. Distribution of the $F$-statistic under $H_0^b$: $F_{(N-1)K,\,N(T-K-1)}$

I Case of random effects: $\varepsilon \sim N(0, \Omega)$

. Estimation of the variance components ⇒ $\hat\Omega$

. The $F$-test for testing $H_0^a$ vs. $H_1$ in the model for $\hat\Omega^{-1/2}y$ is approximately valid.

4.4.2 Tests for individual effects

I Test for fixed effects:

. $F$-test, cf. Subsection 4.2.3 ($H_0: \mu_1 = \dots = \mu_N = 0$)

I Test for random effects:

$$H_0: \sigma_\mu^2 = 0$$
. Lagrange Multiplier test under normality (Breusch-Pagan)

. An asymptotic $\alpha_0$-test rejects $H_0$ ⇔

$$LM := \frac{NT}{2(T - 1)}\Big[1 - \frac{\hat\varepsilon_H'(I_N \otimes J_T)\hat\varepsilon_H}{\hat\varepsilon_H'\hat\varepsilon_H}\Big]^2 > \chi^2_{1,\,1-\alpha_0},$$
where $\hat\varepsilon_H$ is the vector of residuals under $H_0$ (pooled OLS residuals).
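The LM statistic only needs the pooled OLS residuals. A minimal base R sketch on simulated data with one regressor and a nonzero random effect variance; the quadratic form uses the fact that $\hat\varepsilon'(I_N \otimes J_T)\hat\varepsilon = \sum_i (\sum_t \hat\varepsilon_{it})^2$.

```r
set.seed(21)
N <- 30; Tp <- 5                                  # Tp stands for T
id <- rep(1:N, each = Tp)
x  <- rnorm(N * Tp)
y  <- 1 + 0.5 * x + rep(rnorm(N), each = Tp) + rnorm(N * Tp)   # sigma_mu^2 = 1

e <- resid(lm(y ~ x))                             # pooled OLS residuals (under H0)

# e'(I_N (x) J_T)e = sum over individuals of the squared residual sums
quad <- sum(tapply(e, id, sum)^2)

LM   <- N * Tp / (2 * (Tp - 1)) * (1 - quad / sum(e^2))^2
crit <- qchisq(0.95, df = 1)                      # critical value for alpha_0 = 0.05
c(LM = LM, critical = crit)
LM > crit                                         # TRUE means H0 is rejected
```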

4.4.3 Exogeneity of the regressors

I General assumption: $X$ is strictly exogenous w.r.t. $\nu$:

$$E[\nu|X] = 0$$

⇒ $\hat\beta_W$ is unbiased (and consistent under mild conditions) in both the FE and the RE model:

$$\hat\beta_W = \beta + (X'QX)^{-1}X'Q\nu$$
$$E[\hat\beta_W] = \beta + E\big[(X'QX)^{-1}X'Q\,E[\nu|X]\big] = \beta$$

. In the FE model, $\hat\beta_W$ is efficient (BLUE).

. In the RE model, $\hat\beta_W$ is inefficient.

I RE model: $\hat\beta_{GLS}$ is in general only unbiased (and consistent and efficient) if

$$E[\varepsilon|X] = 0, \quad \text{i.e. } E[\mu|X] = 0 \text{ besides } E[\nu|X] = 0$$

I Recall:

$$\hat\theta_{GLS} = \theta + (Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}\varepsilon$$

$$\Rightarrow \; E[\hat\theta_{GLS}] = \theta + E\big[(Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}\underbrace{E(\varepsilon|Z)}_{=G\,E[\mu|Z] + 0}\big]$$

$$= \theta \quad \text{(only) if } E[\mu|Z] = E[\mu|X] = 0!$$

Hausman test

I Null hypothesis: $H_0: E[\varepsilon|X] = 0$ (or $E[\mu|X] = 0$)

I Idea of the Hausman test:

$$\hat q := \hat\beta_{GLS} - \hat\beta_W \to 0 \;\; (\text{under } H_0), \quad \text{but } \hat q \not\to 0 \;\; (\text{under } H_1)$$
I An asymptotic $\alpha_0$-test rejects $H_0$ if

$$\hat q'[\widehat{V}(\hat q)]^{-1}\hat q > \chi^2_{K,\,1-\alpha_0},$$
where $V(\hat q) = V(\hat\beta_W) - V(\hat\beta_{GLS})$
(generally positive definite under $H_0$ due to the inefficiency of $\hat\beta_W$).

Practice: RE vs. FE model

I RE model if data are randomly drawn from some population

. Otherwise: FE model, within estimator (efficient)

I RE model: $X$ exogenous w.r.t. $\varepsilon$ (Hausman test)?

. If yes, use the RE estimator (feasible GLSE, asymptotically efficient).

. Otherwise, use the FE estimator (within estimator, consistent), which can be interpreted as an IV estimator (IV: $QX$).
. The within estimator is even efficient in the RE model if e.g.

$$\mu_i = \bar x_{i\cdot}'\pi + u_i, \quad u_i \sim (0, \sigma_u^2) \text{ i.i.d.}$$

I After deciding for the RE or FE model, test for individual effects:

. RE model $H_0: \sigma_\mu^2 = 0$, FE model $H_0: \mu_i \equiv 0$

. If $H_0$ is not rejected: Use the pooled OLSE.

4.5 Example: Estimation of an investment function

I Aim: Estimation of a linear investment function

I Data for $N = 10$ US (manufacturing) firms over $T = 20$ years from 1935 to 1954 (Grunfeld, 1958).

I Dependent variable $GI_{it}$: real gross investment for firm $i$ in year $t$
I Explanatory variables:

. $VF_{it}$: real value of the firm

. $VC_{it}$: real value of the firm's capital stock

Plot of Grunfeld data (see R code)

[Figure not reproduced: plot of the Grunfeld data]

Pooled OLS estimation

Oneway (individual) effect Pooling Model

plm(formula = inv ~ value + capital, data =, model = "pooling")
Balanced Panel: n=10, T=20, N=200

Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-292.0 -30.0 5.3 34.8 369.0

Coefficients :
Estimate Std. Error t-value Pr(>|t|)
(Intercept) -42.7143694 9.5116760 -4.4907 1.207e-05 ***
value 0.1155622 0.0058357 19.8026 < 2.2e-16 ***
capital 0.2306785 0.0254758 9.0548 < 2.2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares: 9359900

Residual Sum of Squares: 1755900
R-Squared : 0.81241
Adj. R-Squared : 0.80022
F-statistic: 426.576 on 2 and 197 DF, p-value: < 2.22e-16
One-way EC model with fixed effects: Within-estimation

Oneway (individual) effect Within Model

plm(formula = inv ~ value + capital, data =, model = "within")
Balanced Panel: n=10, T=20, N=200

Residuals :
    Min.  1st Qu.  Median 3rd Qu.    Max.
-184.000  -17.600   0.563  19.200 251.000

Coefficients :
        Estimate Std. Error t-value  Pr(>|t|)
value   0.110124   0.011857  9.2879 < 2.2e-16 ***
capital 0.310065   0.017355 17.8666 < 2.2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares: 2244400

Residual Sum of Squares: 523480
R-Squared : 0.76676
Adj. R-Squared : 0.72075
F-statistic: 309.014 on 2 and 188 DF, p-value: < 2.22e-16


Estimating the fixed effects

I Estimates of $\mu_i^* = \alpha + \mu_i$
> ( <- fixef(gr.fe))
          1           2           3           4           5           6
 -70.296717  101.905814 -235.571841  -27.809295 -114.616813  -23.161295
          7           8           9          10
 -66.553474  -57.545657  -87.222272   -6.567844
> summary(fixef(gr.fe))
    Estimate Std. Error t-value  Pr(>|t|)
1   -70.2967    49.7080 -1.4142   0.15730
2   101.9058    24.9383  4.0863 4.383e-05 ***
3  -235.5718    24.4316 -9.6421 < 2.2e-16 ***
4   -27.8093    14.0778 -1.9754   0.04822 *
5  -114.6168    14.1654 -8.0913 6.661e-16 ***
6   -23.1613    12.6687 -1.8282   0.06752 .
7   -66.5535    12.8430 -5.1821 2.194e-07 ***
8   -57.5457    13.9931 -4.1124 3.915e-05 ***
9   -87.2223    12.8919 -6.7657 1.327e-11 ***
10   -6.5678    11.8269 -0.5553   0.57867
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
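
The within regression does not estimate the individual intercepts directly. A common way to recover them is sketched below. The notation is an assumption made here: $\hat\beta_W$ is the within estimator, and $\bar y_{i\cdot}$, $\bar x_{i\cdot}$ are the time averages of firm $i$.

```latex
\hat{\mu}_i^* = \bar{y}_{i\cdot} - \bar{x}_{i\cdot}' \hat{\beta}_W,
\qquad
\bar{y}_{i\cdot} = \frac{1}{T} \sum_{t=1}^{T} y_{it},
\quad
\bar{x}_{i\cdot} = \frac{1}{T} \sum_{t=1}^{T} x_{it}.
```

Read this way, the value 101.9 for firm 2 is the intercept of that firm's regression line when the slopes are fixed at the within estimates.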


Test for joint significance of fixed effects

I Null hypothesis $H_0: \mu_i = 0 \;\forall i \iff \mu_1^* = \dots = \mu_N^* = \alpha$

I Under H0 , pooled OLS estimation is appropriate.

> pFtest(gr.fe,gr.pool)

F test for individual effects

data: inv ~ value + capital

F = 49.1766, df1 = 9, df2 = 188, p-value < 2.2e-16
alternative hypothesis: significant effects

⇒ H0 is rejected.
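
The F statistic can be reproduced by hand from the residual sums of squares of the pooled and the within regression. The base-R sketch below assumes the standard restricted-versus-unrestricted form of the test. The variable names are illustrative, and the inputs are the rounded values printed in the plm summaries, so the result matches `pFtest` only up to rounding.

```r
# Residual sums of squares as printed (rounded) in the plm summaries
rss_pool <- 1755900   # restricted model: pooled OLS
rss_fe   <- 523480    # unrestricted model: within (FE)

n_firms <- 10; n_years <- 20; k <- 2

df1 <- n_firms - 1                      # number of restrictions: 9
df2 <- n_firms * n_years - n_firms - k  # residual df of the FE model: 188

f_stat  <- ((rss_pool - rss_fe) / df1) / (rss_fe / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

round(f_stat, 2)   # about 49.18, in line with pFtest up to rounding
```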


One-way EC model with random effects: FGLS estimation

plm(formula = inv ~ value + capital, data =, model = "random")
Balanced Panel: n=10, T=20, N=200
                  var std.dev share
idiosyncratic 2784.46   52.77 0.282
individual    7089.80   84.20 0.718
theta: 0.8612
Residuals :
   Min. 1st Qu. Median 3rd Qu.   Max.
-178.00  -19.70   4.69   19.50 253.00
Coefficients :
              Estimate Std. Error t-value Pr(>|t|)
(Intercept) -57.834415  28.898935 -2.0013  0.04674 *
value         0.109781   0.010493 10.4627  < 2e-16 ***
capital       0.308113   0.017180 17.9339  < 2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares: 2381400

Residual Sum of Squares: 548900
R-Squared : 0.7695
Adj. R-Squared : 0.75796
F-statistic: 328.837 on 2 and 197 DF, p-value: < 2.22e-16
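
The reported theta is the weight of the quasi-demeaning transformation behind the feasible GLS estimator. Assuming the usual formula for a balanced panel, it can be recomputed from the variance components in the output. The symbol $\sigma_\nu^2$ for the idiosyncratic variance is chosen here for illustration, and $\sigma_\mu^2$ is the individual variance.

```latex
\theta = 1 - \sqrt{\frac{\hat\sigma_\nu^2}{T \hat\sigma_\mu^2 + \hat\sigma_\nu^2}}
       = 1 - \sqrt{\frac{2784.46}{20 \cdot 7089.80 + 2784.46}}
       \approx 0.8612 .
```

With theta this close to 1, the RE estimates lie close to the within estimates, which is what the two coefficient tables show (0.1098 vs. 0.1101 for value, 0.3081 vs. 0.3101 for capital).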


Between estimation in the RE model

Oneway (individual) effect Between Model

plm(formula = inv ~ value + capital, data =, model = "between")
Balanced Panel: n=10, T=20, N=200

Residuals :
   Min. 1st Qu. Median 3rd Qu.   Max.
-163.00   -3.68   2.97   20.70 144.00

Coefficients :
             Estimate Std. Error t-value Pr(>|t|)
(Intercept) -8.527114  47.515308 -0.1795  0.86266
value        0.134646   0.028745  4.6841  0.00225 **
capital      0.032031   0.190938  0.1678  0.87152
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares: 355780

Residual Sum of Squares: 50603
R-Squared : 0.85777
Adj. R-Squared : 0.60044
F-statistic: 21.1077 on 2 and 7 DF, p-value: 0.0010851


Test for poolability

I Under H0 : All slope parameters are the same, but the intercepts may differ across firms

I Under H1 : FE model with slope parameters varying across firms

> # pvcm: estimation of models with variable coefficients (model under H_1)
> gr.sur <- pvcm(inv ~ value + capital,,model="within")
> # Estimation under H_0:
> gr.fe.pool <- plm(inv ~ value + capital,
> pooltest(gr.fe.pool,gr.sur)
F statistic
data: inv ~ value + capital
F = 5.7805, df1 = 18, df2 = 170, p-value = 1.219e-10
alternative hypothesis: unstability

⇒ H0 is rejected.
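
The poolability statistic can be checked by hand in the same way as the F test for individual effects. The base-R sketch below assumes the standard F form, with the within model as the restricted model and the firm-by-firm regressions (`gr.sur`) as the unrestricted one. The variable names are illustrative, and the residual sums of squares are the rounded values printed in the respective summaries.

```r
# Residual sums of squares as printed (rounded) in the summaries
rss_fe  <- 523480   # restricted: common slopes, firm-specific intercepts
rss_vcm <- 324730   # unrestricted: separate regression per firm (gr.sur)

n_firms <- 10; n_years <- 20; k <- 2

df1 <- (n_firms - 1) * k            # slope restrictions: 18
df2 <- n_firms * (n_years - k - 1)  # residual df of the unrestricted model: 170

f_stat  <- ((rss_fe - rss_vcm) / df1) / (rss_vcm / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

round(f_stat, 2)   # about 5.78, in line with pooltest up to rounding
```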


There is substantial variation among the firms:

> summary(gr.sur)
Oneway (individual) effect No-pooling model

pvcm(formula = inv ~ value + capital, data =, model = "within")

Balanced Panel: n=10, T=20, N=200

     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-184.5000  -7.1180  -0.3926   0.0000   5.7030 144.0000

  (Intercept)           value              capital
 Min.   :-149.782   Min.   :0.004573   Min.   :0.003102
 1st Qu.:  -9.639   1st Qu.:0.058518   1st Qu.:0.087132
 Median :  -6.956   Median :0.082738   Median :0.137738
 Mean   : -21.368   Mean   :0.091285   Mean   :0.205263
 3rd Qu.:  -1.507   3rd Qu.:0.128411   3rd Qu.:0.357513
 Max.   :  22.707   Max.   :0.174856   Max.   :0.437369

Total Sum of Squares: 474010000

Residual Sum of Squares: 324730
Multiple R-Squared: 0.99931


Hausman test

I Under H0 , the regressors are exogenous.

> phtest(gr.fe,

Hausman Test

data: inv ~ value + capital

chisq = 2.3304, df = 2, p-value = 0.3119
alternative hypothesis: one model is inconsistent

⇒ H0 is not rejected, so that the RE model can be used.
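
For reference, the Hausman statistic compares the FE and RE slope estimates. The sketch below assumes its usual quadratic form, and the subscripts FE and RE are notation chosen here. With $K = 2$ slope coefficients the reference distribution is $\chi^2_2$, whose survival function is $e^{-x/2}$, so the reported p-value can be verified directly.

```latex
\begin{align*}
H &= (\hat\beta_{FE} - \hat\beta_{RE})'
     \left[ \widehat{V}(\hat\beta_{FE}) - \widehat{V}(\hat\beta_{RE}) \right]^{-1}
     (\hat\beta_{FE} - \hat\beta_{RE})
     \;\overset{a}{\sim}\; \chi^2_K \quad \text{under } H_0, \\
\text{here: } K &= 2, \quad H = 2.3304 < \chi^2_{2;\,0.95} \approx 5.99, \quad
p = e^{-H/2} = e^{-1.1652} \approx 0.312 .
\end{align*}
```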


Breusch-Pagan test for individual effects in the RE model

I $H_0: \sigma_\mu^2 = 0$
> plmtest(gr.pool,effect="individual",type="bp")
Lagrange Multiplier Test - (Breusch-Pagan)

data: inv ~ value + capital

chisq = 798.1615, df = 1, p-value < 2.2e-16
alternative hypothesis: significant effects

⇒ H0 is rejected.
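
For reference, the statistic behind this output can be written down explicitly. The sketch assumes the usual Breusch-Pagan form for a balanced panel, based on the pooled OLS residuals $\hat e_{it}$ (notation chosen here for illustration).

```latex
LM = \frac{NT}{2(T-1)}
     \left[ \frac{\sum_{i=1}^{N} \left( \sum_{t=1}^{T} \hat{e}_{it} \right)^2}
                 {\sum_{i=1}^{N} \sum_{t=1}^{T} \hat{e}_{it}^2} - 1 \right]^2
     \;\overset{a}{\sim}\; \chi^2_1 \quad \text{under } H_0 .
```

The observed value of 798.16 is far above the 5% critical value of the $\chi^2_1$ distribution, which is about 3.84.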

I Conclusion: An RE model seems to be appropriate, whereas the pooled OLS model cannot be used.

. However, the poolability assumption might be problematic, since the hypothesis of equal slope parameters across firms was rejected.


