Analysis of Panel Data: Applied Econometrics Prof. Dr. Simone Maxand

Chapter 4
Analysis of panel data
Applied Econometrics
Winter Term 2020/2021
Prof. Dr. Simone Maxand
Humboldt University Berlin
Contents
4.1 Linear panel data models
4.1.1 Introduction
4.1.2 SUR
4.1.3 Individual-specific EC model
4.2 The fixed effects model
4.2.1 Assumptions
4.2.2 Parameter estimation
4.2.3 Test for fixed effects
4.3 The random effects model
4.3.1 Assumptions
4.3.2 GLS estimation
4.3.3 FGLS estimation
4.4 Checking the assumptions
4.4.1 Tests for poolability
4.4.2 Tests for individual effects
4.4.3 Exogeneity of the regressors
4.5 Example: Estimation of an investment function

4.1.1 Introduction
I Panel data:
. pooled observations on a cross-section of investigation units
over several time periods
. typically: collected at the microeconomic level
. also increasingly common: pool individual time series of a
number of countries/industries and analyze them simultaneously
I Investigation units (in short: individuals):
. persons, households,
. firms, industries,
. countries, regions within a country,
. assets, ...

Examples of panel data sets

I German SOEP (Socio-Economic Panel, DIW):
. representative study of private households
. annual survey of the same private households, persons and
families since 1984 (new federal states since 1990)
. e.g. 2008: around 11,000 households with more than 20,000 persons
. subjects covered: composition of the household, occupational
and family biographies, labor participation and occupational
mobility, income developments, health, ...
. makes it possible to analyze political and social changes (e.g. effects of
policies and nature, reasons for poverty and societal/economic
differences)

Objective and notation

I Objective: Explain the relationship between a dependent
variable $y$ and $K$ explanatory variables $x_1, \dots, x_K$.
I Observations:
. $N$ individuals (persons, households, firms, countries, ...)
. over $T$ time periods (balanced panel).
I Notation: $(y_{it}, x_{kit})$, $i = 1, \dots, N$; $t = 1, \dots, T$; $k = 1, \dots, K$
. $T = 1$: cross-sectional data
. $N = 1$: time series data
I Here: Focus on panels with many individuals (large $N$) and
comparatively few time periods (small $T$).

Some advantages of panel data

I Panel data contain two types of information:
. cross-sectional information reflecting differences between
individuals, and
. time series information reflecting changes within individuals
over time.
I Consequently, panel data allow
. controlling for unobserved heterogeneity (omitted time-invariant
or individual-invariant variables), which reduces bias,
. constructing and testing more complex (more realistic) behavioral
models than purely cross-sectional or time series data, and
. better identification of effects that are simply not detectable
in pure cross-section or pure time series data.

Efficiency considerations for panel data
I In general, there are gains in efficiency since explanatory
variables vary across two dimensions.
I Panel data contain more information due to repeated
observations of the same individual.
. This implies more information about temporal changes
(through path dependencies),
. but less variation in the explanatory variables compared with
repeated cross-sections.
I Therefore: In comparison to repeated cross-sections, panel
data allow for more efficient estimation of temporal changes.
I But: In comparison to repeated cross-sections, panel data
allow for less efficient estimation of averages over time.

Linear panel data regression models
I The most general linear panel data model is given by
$$y_{it} = \alpha_{it} + x_{it}'\beta_{it} + \varepsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T.$$
. Notation: $x_{it} = (x_{1it}, \dots, x_{Kit})'$
I The model cannot be estimated since it has more parameters than
observations (the parameters are not identifiable).
I Restrictions are necessary concerning the variation of $\alpha_{it}$ and $\beta_{it}$ in $i$ and $t$
as well as concerning the error term $\varepsilon_{it}$ (its variance cannot
depend arbitrarily on both $i$ and $t$).
I Asymptotic results for panel estimators are typically derived for
fixed $T$ and $N \to \infty$.
4.1.2 Seemingly unrelated regression (SUR)

I A linear SUR model is given by
$$y_{it} = \alpha_i + x_{it}'\beta_i + \varepsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T.$$
I In general, this model can be seen as a multivariate linear
model (with $N$ equations), where
. $N$ (different) dependent variables are observed $T$ times, and
. the numbers of regressors $K_i$ may differ across the equations.
I Assumptions on the error terms:
$$E[\varepsilon_{it}] = 0 \;\; \forall i, t, \qquad E[\varepsilon_{it}\varepsilon_{js}] = \begin{cases} \sigma_{ij}, & t = s \\ 0, & t \neq s. \end{cases}$$
. Allows for contemporaneous correlations across the equations.

Cross section representation
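I Equation $i$: $y_i = X_i\theta_i + \varepsilon_i$, $i = 1, \dots, N$,
where (assuming $K_i = K$ for all $i$)
$$X_i = \begin{pmatrix} 1 & x_{i1}' \\ \vdots & \vdots \\ 1 & x_{iT}' \end{pmatrix}_{T \times (K+1)}, \quad \theta_i = \begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix}_{(K+1) \times 1},$$
$$y_i = \begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix}_{T \times 1}, \quad \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \vdots \\ \varepsilon_{iT} \end{pmatrix}_{T \times 1}$$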

System representation
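$$y = X\theta + \varepsilon,$$
where
$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}_{NT \times 1}, \quad X = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & \cdots & X_N \end{pmatrix}_{NT \times (K+1)N},$$
$$\theta = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_N \end{pmatrix}_{(K+1)N \times 1}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{pmatrix}_{NT \times 1}$$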

Covariance matrix
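$$V[\varepsilon] = E[\varepsilon\varepsilon'] = \Sigma \otimes I_T =: \Omega,$$
where, with $\sigma_{ii} =: \sigma_i^2 = V(\varepsilon_{it})$ ($\forall t$),
$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{N1} & \cdots & \cdots & \sigma_N^2 \end{pmatrix}_{N \times N}$$
and $I_T$ denotes the identity matrix of dimension $T$.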

The Kronecker product ⊗
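I Definition: Let
$$\underset{r \times s}{A} = ((a_{ij}))_{i=1,\dots,r;\; j=1,\dots,s} \quad\text{and}\quad \underset{n \times k}{B}.$$
Then,
$$\underset{rn \times sk}{A \otimes B} := ((a_{ij}B))_{i=1,\dots,r;\; j=1,\dots,s} = \begin{pmatrix} a_{11}B & \cdots & a_{1s}B \\ \vdots & & \vdots \\ a_{r1}B & \cdots & a_{rs}B \end{pmatrix}$$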

Some properties of the Kronecker product
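1. $(A \otimes B)' = A' \otimes B'$
2. $(A \otimes B)(C \otimes D) = AC \otimes BD$ for $A$: $r \times s$, $B$: $n \times k$, $C$: $s \times p$, $D$: $k \times q$
3. $A \otimes (B + C) = A \otimes B + A \otimes C$ for $A$: $r \times s$ and $B$, $C$: $n \times k$
4. $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$ (for regular square matrices $A$, $B$)
5. $\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\,\mathrm{tr}(B)$
6. In general: $A \otimes B \neq B \otimes A$!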
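
The properties listed above can be checked numerically on small matrices. The following is a minimal sketch in plain Python; the helper names (`kron`, `matmul`, `trace`) and the example matrices are illustrative assumptions, not part of the lecture material.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical 2 x 2 integer matrices chosen only for illustration
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

# Property 2: (A ⊗ B)(C ⊗ D) = AC ⊗ BD
mixed_product_ok = matmul(kron(A, B), kron(C, D)) == kron(matmul(A, C), matmul(B, D))
# Property 5: tr(A ⊗ B) = tr(A) tr(B)
trace_ok = trace(kron(A, B)) == trace(A) * trace(B)
# Property 6: the Kronecker product is not commutative in general
commutes = kron(A, B) == kron(B, A)
```

With integer entries all comparisons are exact, so no tolerance is needed.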

Least squares estimation
I General linear regression model
I OLS estimator:
$$\hat\theta_{OLS} = (X'X)^{-1}X'y,$$
where $V[\hat\theta_{OLS}]$ is given by
$$V[\hat\theta_{OLS}] = (X'X)^{-1}X'\Omega X(X'X)^{-1}.$$
I The BLUE is the GLS estimator
$$\hat\theta_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y,$$
with $V[\hat\theta_{GLS}] = (X'\Omega^{-1}X)^{-1}$.

System- vs. single equation estimation
I Single equation estimators are consistent but in general less
efficient than system estimators.
I Single equation estimators are more robust since they are not
influenced by potential misspecifications in other equations.
I In the special case of the same values of the regressors in all
equations, i.e. $X_1 = \dots = X_N$,
. single equation estimation and system estimation coincide,
. OLS and GLS are identical.
I In the case of no contemporaneous correlations ($\sigma_{ij} = 0$ for all
$i \neq j$), single equation estimation is also efficient.

Estimation of the error covariance matrix
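I The covariance matrix $V(\hat\theta_{GLS}) = (X'\Omega^{-1}X)^{-1}$ is estimated
by
$$\hat V[\hat\theta_{GLS}] = (X'\hat\Omega^{-1}X)^{-1},$$
with $\hat\Omega = \hat\Sigma \otimes I_T$, $\hat\Sigma = ((\hat\sigma_{ij}))_{i,j=1,\dots,N}$,
and $\sigma_{ij}$ is estimated by
$$\hat\sigma_{ij} = \frac{1}{T}\sum_{t=1}^{T}(y_{it} - \hat\alpha_i - \hat\beta_i'x_{it})(y_{jt} - \hat\alpha_j - \hat\beta_j'x_{jt}),$$
where $\hat\alpha_i$ and $\hat\beta_i$ denote the OLS estimators of $\alpha_i$ and $\beta_i$.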

From SUR to panel data models
I Panel data models assume
. the same dependent variable for the $N$ individuals,
. the same regressors for the $N$ individuals (and thus $K_i \equiv K$),
but in general $X_i \neq X_j$ for $i \neq j$.
I SUR does not allow for the estimation of common effects, e.g.
. $\beta_1 = \beta_2 = \dots = \beta$,
. $\alpha_1 = \alpha_2 = \dots = \alpha$,
. common effects that vary over time, as in $y_{it} = \alpha_t + x_{it}'\beta + \varepsilon_{it}$.
I In the case of common parameters, an estimation that takes this
restriction into account yields more efficient inference.

4.1.3 Individual-specific error component (EC) model
I Linear panel data regression model:
$$y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}, \quad i = 1, \dots, N;\; t = 1, \dots, T$$
I The error terms $\varepsilon_{it}$ follow an individual-specific (one-way) EC
model:
$$\varepsilon_{it} = \mu_i + \nu_{it},$$
. $\mu_i$: unobserved, individual-specific effects (describing the
unobserved heterogeneity across individuals); captures all
unobserved, time-constant effects that are not contained in $x_{kit}$
. $\nu_{it}$: remaining (idiosyncratic) disturbances, i.e. measurement
errors or omitted/unobservable effects that vary over time

Two-way error components model
I As before: $y_{it} = \alpha + \beta'x_{it} + \varepsilon_{it}$, $i = 1, \dots, N$; $t = 1, \dots, T$
I The error terms $\varepsilon_{it}$ follow a two-way error components model:
$$\varepsilon_{it} = \mu_i + \lambda_t + \nu_{it},$$
where $\lambda_t$ captures effects that are constant across the
individuals but vary over time.
I We will focus on the individual-specific/one-way EC model.
I Further specifications (not addressed): systems of equations
(like SUR) with EC, dynamic panel data models, and panel
data models for qualitative dependent variables or count data.

Matrix notation for the one-way EC model
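I Variables of individual $i$ at time $t$:
$$y_{it}, \quad \varepsilon_{it}, \quad \nu_{it}, \quad x_{it} = (x_{1it}, \dots, x_{Kit})'$$
I Variables for individual $i$ ($i = 1, \dots, N$):
$$\underset{(T \times K)}{X_i} = \begin{pmatrix} x_{i1}' \\ \vdots \\ x_{iT}' \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix}, \quad \underset{(T \times 1)}{y_i} = \begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix}, \quad \nu_i = \begin{pmatrix} \nu_{i1} \\ \vdots \\ \nu_{iT} \end{pmatrix}, \quad \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \vdots \\ \varepsilon_{iT} \end{pmatrix}$$
I Notation for stacked observations:
$$\underset{(NT \times 1)}{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad \nu = \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_N \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{pmatrix}, \quad \underset{(NT \times K)}{X} = \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix}$$
⇒ Model in matrix notation:
$$y = \alpha 1_{NT} + X\beta + \varepsilon = Z\theta + \varepsilon, \qquad \varepsilon = G\mu + \nu,$$
where
$$Z = [1_{NT} \,\vdots\, X], \quad \theta = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \quad \mu = (\mu_1, \dots, \mu_N)', \quad G = I_N \otimes 1_T,$$
and $1_n$ denotes the $n$-vector of ones.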

4.2.1 Assumptions for the fixed effects (FE) model
I The effects $\mu_i$ are fixed/deterministic.
. reasonable e.g. if interest is only in the behavior of the sample at hand
I $X$ is deterministic (or strictly exogenous: $E(\nu|X) = 0$) with
$\mathrm{rk}(G \,\vdots\, X) = N + K$ (with probability one).
⇒ $\mathrm{rk}(X) = K$ [note that by definition $\mathrm{rk}(G) = N$]
⇒ There are no time-invariant explanatory variables in $X$.
I The errors are homoscedastic and uncorrelated: $\nu|X \sim (0, \sigma_\nu^2 I_{NT})$
⇒ $y|X \sim (\alpha 1_{NT} + X\beta + G\mu, \; \sigma_\nu^2 I_{NT})$

Identifiability
I Under the above assumptions, $\beta$ is identifiable.
I $\alpha$ and $\mu$ are not identifiable (dummy trap), since
$$y = A\delta + \nu, \quad\text{with } A = (1_{NT} \,\vdots\, X \,\vdots\, G), \;\; \delta = (\alpha, \beta', \mu')',$$
where the $(NT \times (N + K + 1))$ matrix $A$ does not have full rank;
because of $G 1_N = 1_{NT}$ it follows that
$$\mathrm{rk}(A) = \mathrm{rk}(G \,\vdots\, X) = N + K.$$
. The sum of the columns of $G$ equals the first column $1_{NT}$.

Possible solutions to non-identifiability
(i) Individual-specific intercepts:
$$\mu_i^* := \alpha + \mu_i, \quad i = 1, \dots, N$$
$$\Rightarrow \; y = X\beta + G\mu^* + \nu \qquad (1)$$
with $\mu^* = (\mu_1^*, \dots, \mu_N^*)'$.
(ii) Linear restrictions, e.g.
$$\mu_\cdot := \sum_{i=1}^{N}\mu_i = 0.$$

ANOVA notation
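I Time sum and time average
$$x_{i\cdot} := \sum_{t=1}^{T} x_{it}, \qquad \bar x_{i\cdot} := \frac{1}{T}x_{i\cdot} = \frac{1}{T}\sum_{t=1}^{T} x_{it}$$
I Cross-sectional sum and cross-sectional average
$$x_{\cdot t} := \sum_{i=1}^{N} x_{it}, \qquad \bar x_{\cdot t} := \frac{1}{N}x_{\cdot t} = \frac{1}{N}\sum_{i=1}^{N} x_{it}$$
I Time and cross-sectional sum and average
$$x_{\cdot\cdot} := \sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}, \qquad \bar x_{\cdot\cdot} := \frac{1}{NT}x_{\cdot\cdot} = \frac{1}{NT}\sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}$$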

4.2.2 Parameter estimation
I For notational convenience, we consider $X$ as deterministic.
I Because of $\nu \sim (0, \sigma_\nu^2 I_{NT})$, the OLSE for $\beta$ is BLUE.
I Separate estimation of $\beta$ using the Frisch-Waugh Theorem:
$$\hat\beta = (X'QX)^{-1}X'Qy, \qquad Q = I - P, \quad P = P_{R(G)} := G(G'G)^{-1}G'$$
I $P$, $Q$ are orthogonal projections (symmetric and idempotent
$NT \times NT$ matrices) with $PQ = 0$ and $P = I_N \otimes \frac{1}{T}J_T$,
where $J_T = 1_T 1_T'$ denotes the $(T \times T)$ matrix of ones.

The projection matrices P and Q

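I $P$: orthogonal projection onto $R(G)$ [column space of $G$]
I $Q$: orthogonal projection onto $R(G)^\perp$
I Transforming $y$ by $P$ yields group means (averages across
time $\forall i$):
$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} \in \mathbb{R}^{NT} \;\Rightarrow\; Py = \begin{pmatrix} \bar y_{1\cdot}1_T \\ \vdots \\ \bar y_{N\cdot}1_T \end{pmatrix}$$
⇒ The (within) transformation $Q$ yields deviations from the group
means $\bar y_{i\cdot}$:
$$Qy = (y_{it} - \bar y_{i\cdot})_{i=1,\dots,N;\; t=1,\dots,T}$$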

Within(-group) estimator
I Accordingly, $\hat\beta$ can be written as
$$\hat\beta_W := (X'QX)^{-1}X'Qy = [(QX)'QX]^{-1}(QX)'Qy = \Big[\sum_{i,t}(x_{it} - \bar x_{i\cdot})(x_{it} - \bar x_{i\cdot})'\Big]^{-1}\sum_{i,t}(x_{it} - \bar x_{i\cdot})(y_{it} - \bar y_{i\cdot}),$$
where $\bar x_{i\cdot} = (\bar x_{1i\cdot}, \dots, \bar x_{Ki\cdot})'$.
⇒ The within(-group) estimator $\hat\beta_W$ utilizes only the variation
within each group (for each individual).
⇒ Elimination of $\mu_i$, $\alpha$ by forming deviations from group means
I Alternative names: covariance estimator; least squares dummy
variable (LSDV) estimator; fixed effects (FE) estimator.
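
To make the within transformation concrete, the following minimal sketch computes the within estimator and the pooled OLS slope for a tiny balanced panel with one regressor ($K = 1$). The data are hypothetical and constructed without noise, so that the individual intercepts are negatively related to the regressor; the function and variable names are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical balanced panel: N = 3 individuals, T = 3 periods, K = 1
x = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
mu_star = [10, 5, 0]   # individual intercepts, correlated with x (assumed)
beta = 2               # true slope used to generate y (assumed)
y = [[m + beta * v for v in row] for m, row in zip(mu_star, x)]

def mean(values):
    """Exact arithmetic mean."""
    return sum(values, Fraction(0)) / len(values)

def ols_slope(xs, ys):
    """OLS slope of ys on xs in a regression with intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

# Pooled OLS: ignores the individual effects
beta_pool = ols_slope([v for row in x for v in row],
                      [v for row in y for v in row])

# Within estimator: regress deviations from group means on each other
x_dev = [v - mean(row) for row in x for v in row]
y_dev = [v - mean(row) for row in y for v in row]
beta_w = sum(a * b for a, b in zip(x_dev, y_dev)) / sum(a * a for a in x_dev)

# Estimated individual intercepts: group mean of y minus group mean of x times beta_w
mu_star_hat = [mean(yr) - mean(xr) * beta_w for xr, yr in zip(x, y)]
```

In this constructed example the within estimator recovers the slope used to generate the data, while the pooled slope even has the wrong sign because it mixes within and between variation.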

Optimality of $\hat\beta_W$
I $\hat\beta_W$ is the OLSE of $\beta$ in (1) and, equivalently, the OLSE in the (Frisch-Waugh)
transformed model (note $Q1_{NT} = 0$ and $QG = 0$)
$$Qy = QX\beta + Q\nu \qquad (2)$$
i.e. $(y_{it} - \bar y_{i\cdot}) = (x_{it} - \bar x_{i\cdot})'\beta + (\nu_{it} - \bar\nu_{i\cdot})$
I The within transformation $Q$ eliminates the individual effects ($\mu_i$)
and the intercept ($\alpha$).
I The optimality (efficiency) is not obvious from (2), since
$V[Q\nu] = \sigma_\nu^2 Q$ ($\neq \sigma_\nu^2 I$) is singular.
I But $\hat\beta_W$ is BLUE, since it corresponds to the OLSE in model
(1) (Gauss-Markov Theorem).

Residuals
I From the FW Theorem it follows that the residuals from (1)
and (2) are identical.
I Thus
$$\hat\nu = y - \hat\alpha 1_{NT} - X\hat\beta - G\hat\mu = y - X\hat\beta - G\hat\mu^* = Qy - QX\hat\beta_W =: \hat\nu_W$$
. Here, $\hat\alpha$, $\hat\beta$ and $\hat\mu$ are (any) OLS estimators of $\alpha$, $\beta$ and $\mu$.
. Clearly, $\hat\beta = \hat\beta_W$ and the OLSE $\hat\mu^*$ of $\mu^*$ are unique.

Estimation for the error variance
I Unbiased estimator of the error variance:
$$\hat\sigma_\nu^2 = \frac{\hat\nu'\hat\nu}{NT - K - N} = \frac{\hat\nu_W'\hat\nu_W}{NT - K - N} \quad\text{with } E[\hat\sigma_\nu^2] = \sigma_\nu^2$$
I Note: We actually estimate $N + K$ parameters ($\beta$ and $\mu^*$) ⇒
only $NT - K - N$ degrees of freedom!
I Covariance matrix of the BLUE $\hat\beta_W$:
$$V[\hat\beta_W] = \sigma_\nu^2 (X'QX)^{-1}$$

Estimation of individual effects
I FW Theorem ⇒ $G\hat\mu^* = P(y - X\hat\beta_W)$
. This follows even without the full rank property of $G$.
I But $G$ has full rank, implying
$$\hat\mu^* = (G'G)^{-1}G'(y - X\hat\beta_W).$$
I Since $G = I_N \otimes 1_T$ and $(G'G)^{-1} = \frac{1}{T}I_N$, it follows that
$$\hat\mu^* = (G'G)^{-1}G'(y - X\hat\beta_W) = \Big(I_N \otimes \frac{1}{T}1_T'\Big)(y - X\hat\beta_W) = (\bar y_{i\cdot} - \bar x_{i\cdot}'\hat\beta_W)_{i=1,\dots,N}$$
I Accordingly:
$$\hat\mu_i^* = \bar y_{i\cdot} - \bar x_{i\cdot}'\hat\beta_W \quad (i = 1, \dots, N)$$
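Separate estimation for $\alpha$ and $\mu_i$
I Assumption (for identification): $\mu_\cdot = 0$
I On account of $\mu_i^* = \alpha + \mu_i$, averaging over $i$ gives
$$\bar\mu^*_\cdot = \alpha + \bar\mu_\cdot = \alpha \;\Rightarrow\; \mu_i = \mu_i^* - \alpha = \mu_i^* - \bar\mu^*_\cdot$$
and therefore
$$\hat\alpha = \hat{\bar\mu}^*_\cdot = \bar y_{\cdot\cdot} - \bar x_{\cdot\cdot}'\hat\beta_W, \qquad \hat\mu_i = \bar y_{i\cdot} - \bar y_{\cdot\cdot} - (\bar x_{i\cdot} - \bar x_{\cdot\cdot})'\hat\beta_W$$
. This is the only OLSE (out of infinitely many) satisfying
$\hat\mu_\cdot = 0$.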
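First-difference (FD) estimation
I Idea: Elimination of $\mu_i$, $\alpha$ by taking first differences of the
variables (instead of subtracting group means):
$$y_{it} = \alpha + x_{it}'\beta + \mu_i + \nu_{it}$$
$$y_{i,t-1} = \alpha + x_{i,t-1}'\beta + \mu_i + \nu_{i,t-1}$$
Subtracting the second equation from the first gives
$$\underbrace{(y_{it} - y_{i,t-1})}_{=\Delta y_{it}} = \underbrace{(x_{it} - x_{i,t-1})'}_{=\Delta x_{it}'}\beta + \underbrace{(\nu_{it} - \nu_{i,t-1})}_{=\Delta\nu_{it}}$$
⇒ $\Delta y_{it} = \Delta x_{it}'\beta + \Delta\nu_{it}$, $i = 1, \dots, N$; $t = 2, \dots, T$
I OLS in this model yields the so-called FD estimator:
$$\hat\beta_{FD} = \Big[\sum_{i=1}^{N}\sum_{t=2}^{T}(x_{it} - x_{i,t-1})(x_{it} - x_{i,t-1})'\Big]^{-1}\sum_{i=1}^{N}\sum_{t=2}^{T}(x_{it} - x_{i,t-1})(y_{it} - y_{i,t-1})$$
I $T = 2$ ⇒ $\hat\beta_{FD} = \hat\beta_W$
I $T > 2$ ⇒ $\hat\beta_{FD}$ is less efficient than $\hat\beta_W$ (under our
assumptions)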
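
The equality of the two estimators for $T = 2$ can be verified on a small example. The sketch below assumes a hypothetical panel with $N = 3$, $T = 2$ and a single regressor; the function names are illustrative and exact rational arithmetic is used so that the comparison needs no tolerance.

```python
from fractions import Fraction

# Hypothetical panel: N = 3 individuals, T = 2 periods, K = 1
x = [[1, 3], [2, 5], [4, 6]]
y = [[2, 7], [1, 8], [5, 8]]

def fd_estimator(x, y):
    """OLS (without intercept) of first-differenced y on first-differenced x."""
    num = sum((xi[t] - xi[t - 1]) * (yi[t] - yi[t - 1])
              for xi, yi in zip(x, y) for t in range(1, len(xi)))
    den = sum((xi[t] - xi[t - 1]) ** 2
              for xi in x for t in range(1, len(xi)))
    return Fraction(num, den)

def within_estimator(x, y):
    """OLS of group-demeaned y on group-demeaned x."""
    num = den = Fraction(0)
    for xi, yi in zip(x, y):
        mx = Fraction(sum(xi), len(xi))
        my = Fraction(sum(yi), len(yi))
        num += sum((a - mx) * (b - my) for a, b in zip(xi, yi))
        den += sum((a - mx) ** 2 for a in xi)
    return num / den

beta_fd = fd_estimator(x, y)
beta_w = within_estimator(x, y)
```

For $T = 2$ the deviation from the group mean is exactly half of the first difference (with opposite signs in the two periods), which is why both ratios coincide.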

Comparing $\hat\beta_W$ and $\hat\beta_{FD}$ when errors are serially correlated
I Assume e.g. $\{\nu_{it}\} \sim AR(1)$:
$$\nu_{it} = \rho\nu_{i,t-1} + u_{it}, \quad u_{it} \sim (0, \sigma_u^2) \text{ i.i.d.}, \; |\rho| < 1 \quad\Leftrightarrow\quad \Delta\nu_{it} = u_{it} - (1 - \rho)\nu_{i,t-1}$$
I If $\rho$ is large (close to 1, i.e. strong serial correlation)
⇒ $(1 - \rho)$ is close to zero, so $\Delta\nu_{it}$ is close to white noise.
⇒ Model (1) no longer satisfies the standard assumptions.
⇒ $\hat\beta_W$ is no longer efficient.
⇒ $\hat\beta_{FD}$ is more efficient than $\hat\beta_W$.

Pooled OLS estimation
I The pooled OLS estimator uses cross-sectional and time series
variation to estimate $\beta$ (but does not control for cross-sectional
heterogeneity):
$$y_{it} = \alpha + x_{it}'\beta + \nu_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T \quad\Leftrightarrow\quad y = \alpha 1_{NT} + X\beta + \nu.$$
I Using the FW Theorem:
$$\hat\beta_{POOL} = [X'(I - P_0)X]^{-1}X'(I - P_0)y,$$
where $P_0 = \frac{1}{NT}J_{NT}$ (orthogonal projection onto $R(1_{NT})$).
I Clearly, $V(\hat\beta_{POOL}) \leq V(\hat\beta_W)$; however, $\hat\beta_{POOL}$ is biased unless,
for all $i$, $\mu_i^* = \alpha$ or $\bar x_{i\cdot} = \bar x_{\cdot\cdot}$!

4.2.3 Test for fixed effects
I Null hypothesis
$$H_0: \mu_1^* = \dots = \mu_N^*$$
. Under $H_0$, $\hat\beta_{POOL}$ is unbiased and more efficient than $\hat\beta_W$.
I Under the assumption of normally distributed error terms:
$$F = F(y) = \frac{(RRSS - RSS)/(N - 1)}{RSS/(NT - N - K)} \overset{H_0}{\sim} F_{N-1,\,NT-N-K}$$
I Unrestricted residual sum of squares:
$$RSS = \min_{\beta,\mu^*}\|y - X\beta - G\mu^*\|^2 = \min_{\alpha,\beta,\mu}\|y - \alpha 1_{NT} - X\beta - G\mu\|^2 = \hat\nu'\hat\nu$$
$$\phantom{RSS} = \min_{\beta}\|Qy - QX\beta\|^2 = \|Qy - QX\hat\beta_W\|^2 = \hat\nu_W'\hat\nu_W$$
I Restricted residual sum of squares?
. Under $H_0$ we have the pooled model:
$$y = \alpha 1_{NT} + X\beta + \nu$$
⇒ Sum of squared residuals:
$$RRSS = \min_{\alpha,\beta}\|y - \alpha 1_{NT} - X\beta\|^2 = \|Q_0 y - Q_0 X\hat\beta_{POOL}\|^2, \quad Q_0 = I - P_0$$
I $H_0$ is rejected by an $\alpha_0$-test if
$$F(y) > F^{1-\alpha_0}_{N-1,\,NT-N-K}$$

Example: Traffic fatalities in the US
I Data for $T = 7$ years (1982-1988) for the $N = 48$ contiguous
states, dataset Fatalities in the R package AER.
I Approximately one third of fatal crashes involved a driver who
had been drinking (in the 1980s).
I Aim: How effective are various government policies designed
to discourage drunk driving in reducing traffic deaths?
I Study the impact of the tax on beer in each state, $x$ (adjusted
for inflation), on the fatality rate, $y$ (number of traffic deaths per
10,000 people).

I Exemplary cross-section analysis (for 1982):
$$\hat y = 2.01 + 0.15x$$
I The estimated coefficient is positive (but insignificant at $\alpha_0 = 0.1$).
⇒ Higher beer taxes lead to more traffic fatalities!
. This contradicts our expectation (hope).
I The conclusion might be misleading due to omitted variable bias.
I Panel data analysis, e.g. the FE model, controls for unobserved
heterogeneity across states (time-invariant variables such as
long-standing cultural attitudes towards drinking and driving in the
states, etc.):
$$\hat y = -0.66x + \text{estimated fixed effects of the states}$$
I Now the estimated coefficient has the expected sign.
I Moreover, the impact of beer taxes is significant.

Model extension
I Add explanatory variables (driving performance, general
economic situation in the state, legal minimum age for alcohol
consumption).
I Take time effects $\lambda_t$ into account (e.g. improved safety
standards are introduced in all states in order to reduce traffic
fatality rates; they do not differ across states).
I This leads to the two-way classification model:
$$y_{it} = \alpha + x_{it}'\beta + \mu_i + \lambda_t + \nu_{it} \quad (i = 1, \dots, N;\; t = 1, \dots, T)$$

4.3.1 Assumptions of the random effects (RE) model
I Idea: Individuals form a representative sample from a large
population. Individual-specific effects are random.
. Typical application: household panel studies
I Assumptions ($X$ and $Z = [1_{NT} \,\vdots\, X]$ as in the FE model):
. $\mathrm{rk}(Z) = K + 1$ (with probability one if $X$ is stochastic).
. One-way EC model:
$$\varepsilon_{it} = \mu_i + \nu_{it} \quad (i = 1, \dots, N;\; t = 1, \dots, T)$$
. All $\mu_i$, $\nu_{it}$ are uncorrelated with
$$\mu_i \sim (0, \sigma_\mu^2) \quad\text{and}\quad \nu_{it} \sim (0, \sigma_\nu^2)$$
⇒ One parameter ($\sigma_\mu^2$) describes the RE (unobserved heterogeneity)!

Moments of ε
I If $X$ is stochastic, it is assumed to be strictly exogenous [w.r.t.
$\varepsilon$, i.e. $E(\varepsilon|X) = 0$], and all moments are conditioned on $X$.
I Because of $\mu \sim (0, \sigma_\mu^2 I_N)$ and $\nu \sim (0, \sigma_\nu^2 I_{NT})$ and the
uncorrelatedness of $\mu$ and $\nu$, one gets
$$\varepsilon = G\mu + \nu \sim (0, \Omega), \quad\text{where}$$
$$\Omega := V[\varepsilon] = V[G\mu] + V[\nu] = \sigma_\mu^2 GG' + \sigma_\nu^2 I_{NT} = \sigma_\mu^2 (I_N \otimes J_T) + \sigma_\nu^2 I_{NT}.$$
I Note: If $X$ is stochastic and $E[\nu|X] = 0$, then
. $E[y|X] = Z\theta$ only under $E[\mu|X] = 0$,
. but $E[y|X, \mu] = Z\theta + G\mu$ even when $E[\mu|X] \neq 0$.

Covariance matrix of the error terms
I The error terms $\varepsilon_{it}$ have homogeneous variances.
I $\Omega$ has a block diagonal structure with serial correlations only
for errors of the same individual:
$$\mathrm{Cov}[\varepsilon_{it}, \varepsilon_{js}] = \begin{cases} \sigma_\mu^2 + \sigma_\nu^2, & \text{if } i = j \text{ and } t = s \text{ (variances)} \\ \sigma_\mu^2, & \text{if } i = j,\; t \neq s \\ 0, & \text{else } (i \neq j) \end{cases}$$
⇒ $\Omega$ is an equicorrelated block diagonal matrix.
I Clearly, $\Omega$ is positive definite (as the sum of a positive definite and
a nonnegative definite matrix).

4.3.2 GLS estimation
I Linear model with general error term structure:
$$y = \alpha 1_{NT} + X\beta + \varepsilon = Z\theta + \varepsilon, \quad \varepsilon|X \sim (0, \Omega) \qquad (3)$$
I The pooled OLS estimator is unbiased but inefficient.
I The FE estimator $\hat\beta_W$ remains unbiased (as long as $E[\nu|X] = 0$)
and consistent, but it is inefficient!
. If $E[\mu|X] \neq 0$ and thus $E[\varepsilon|X] \neq 0$, $\hat\beta_W$ may be interpreted as an
IV estimator with $QX$ as (valid) IVs for $X$.
I The GLS estimator is BLUE:
$$\hat\theta_{GLS} = \begin{pmatrix} \hat\alpha_{GLS} \\ \hat\beta_{GLS} \end{pmatrix} = (Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}y$$
⇒ Need to invert the huge $(NT \times NT)$ matrix $\Omega$.

Spectral decomposition of Ω
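I Because of $P = I_N \otimes \frac{1}{T}J_T$ and $Q = I - P$, $\Omega$ can be written
as
$$\Omega = \sigma_\mu^2 (I_N \otimes J_T) + \sigma_\nu^2 I_{NT} = (T\sigma_\mu^2 + \sigma_\nu^2)P + \sigma_\nu^2 Q = \sigma_1^2 P + \sigma_\nu^2 Q, \quad\text{where } \sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2.$$
I Since $P$ and $Q$ are orthogonal projections with $PQ = 0$, this is
the spectral decomposition of $\Omega$.
. $\Omega$ has two distinct eigenvalues $\sigma_1^2$ and $\sigma_\nu^2$ with multiplicities
$\mathrm{rk}(P) = \mathrm{tr}(P) = N$ and $\mathrm{tr}(Q) = N(T - 1)$, respectively.
⇒ For any scalar $r$: $\Omega^r = (\sigma_1^2)^r P + (\sigma_\nu^2)^r Q$
⇒ $\Omega^{-1} = \frac{1}{\sigma_1^2}P + \frac{1}{\sigma_\nu^2}Q$ (for $r = -1$; easy to verify directly!)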
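
The direct verification mentioned above can be sketched numerically. The example below assumes $N = 2$, $T = 2$ and hypothetical variance components $\sigma_\mu^2 = 3$, $\sigma_\nu^2 = 1$; the helper functions are illustrative and use exact fractions.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two lists-of-rows matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def eye(n):
    """n x n identity matrix."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def lin_comb(a, b, ca, cb):
    """Elementwise ca * a + cb * b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

N, T = 2, 2
sigma_mu2, sigma_nu2 = Fraction(3), Fraction(1)   # assumed values
sigma_1_2 = T * sigma_mu2 + sigma_nu2             # sigma_1^2 = T sigma_mu^2 + sigma_nu^2

J_T = [[Fraction(1)] * T for _ in range(T)]
P = kron(eye(N), [[v / T for v in row] for row in J_T])   # P = I_N ⊗ (1/T) J_T
Q = lin_comb(eye(N * T), P, 1, -1)                        # Q = I - P

Omega = lin_comb(kron(eye(N), J_T), eye(N * T), sigma_mu2, sigma_nu2)
Omega_spectral = lin_comb(P, Q, sigma_1_2, sigma_nu2)
Omega_inv = lin_comb(P, Q, 1 / sigma_1_2, 1 / sigma_nu2)

decomposition_ok = Omega == Omega_spectral
inverse_ok = matmul(Omega, Omega_inv) == eye(N * T)
```

The point of the decomposition is that the inverse is obtained from two scalars and two fixed projection matrices, without inverting an $NT \times NT$ matrix numerically.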
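GLS transformation
I The GLSE for $\theta$ is the solution to the normal equation
$$Z'\Omega^{-1}Z\,\hat\theta_{GLS} = Z'\Omega^{-1}y$$
or (equivalently) the OLSE in the transformed model
$$\underbrace{\Omega^{-1/2}y}_{=\tilde y} = \underbrace{\Omega^{-1/2}Z}_{=\tilde Z}\,\theta + \underbrace{\Omega^{-1/2}\varepsilon}_{=\tilde\varepsilon}, \qquad \tilde\varepsilon|X \sim (0, I_{NT}).$$
I With $\Omega^{-1/2} = \frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q$, and because of $P1_{NT} = 1_{NT}$ and $Q1_{NT} = 0$, it follows that
$$\Omega^{-1/2}1_{NT} = \frac{1}{\sigma_1}1_{NT}.$$
I Therefore the transformed model can be written as
$$\tilde y = \frac{\alpha}{\sigma_1}1_{NT} + \underbrace{\Omega^{-1/2}X}_{=:\tilde X}\,\beta + \tilde\varepsilon, \qquad \tilde\varepsilon|X \sim (0, I_{NT}) \qquad (4)$$
I GLS estimator:
$$\begin{pmatrix} \hat\alpha/\sigma_1 \\ \hat\beta \end{pmatrix} = \big((1_{NT} \,\vdots\, \tilde X)'(1_{NT} \,\vdots\, \tilde X)\big)^{-1}(1_{NT} \,\vdots\, \tilde X)'\tilde y$$
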

Separate GLS estimation of β
I Alternatively, $\hat\beta_{GLS}$ can be obtained separately using the
Frisch-Waugh Theorem as the OLSE in
$$\underbrace{Q_0\tilde y}_{=:y^*} = \underbrace{Q_0\tilde X}_{=X^*}\,\beta + \underbrace{Q_0\tilde\varepsilon}_{=\varepsilon^*}, \qquad (5)$$
where $Q_0 = I - P_0$ with $Q_0 1_{NT} = 0$ and
$$P_0 = 1_{NT}(\underbrace{1_{NT}'1_{NT}}_{=NT})^{-1}1_{NT}' = \frac{1}{NT}J_{NT}.$$
I This is equivalent to the OLS estimation of $\beta$ in model (4).

Properties of the matrices P0 and Q0
I P0 and Q0 are orthogonal projection matrices (i.e., they are
symmetric and idempotent).
I P0 Q0 = 0
I P0 P = P0
I P0 Q = 0
I Q0 Q = Q
I Q0 P = P − P0

I $y^* = Q_0\Omega^{-1/2}y$ and $X^* = Q_0\Omega^{-1/2}X$ in (5) can be written as
$$y^* = By \quad\text{and}\quad X^* = BX,$$
where
$$B = Q_0\Omega^{-1/2} = Q_0\Big(\frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q\Big) = \frac{1}{\sigma_1}(P - P_0) + \frac{1}{\sigma_\nu}Q = B'.$$
I On account of $(P - P_0)Q = PQ - P_0Q = 0$, it follows that
$$B'B = B^2 = \frac{1}{\sigma_1^2}(P - P_0) + \frac{1}{\sigma_\nu^2}Q.$$
⇒ The GLS estimator of $\beta$ (OLSE in (5)) can be written as
$$\hat\beta_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'B'BX)^{-1}X'B'By.$$

GLS estimator as weighted average

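$$\hat\beta_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'B'BX)^{-1}X'B'By = W_1\hat\beta_W + W_2\hat\beta_B,$$
where
$$\hat\beta_W = (\underbrace{X'QX}_{=W_{XX}})^{-1}\underbrace{X'Qy}_{=W_{XY}} \quad\text{(within estimator)}$$
$$\hat\beta_B = [\underbrace{X'(P - P_0)X}_{=B_{XX}}]^{-1}\underbrace{X'(P - P_0)y}_{=B_{XY}} \quad\text{(between estimator)}$$
$$W_1 = \Big[W_{XX} + \frac{\sigma_\nu^2}{\sigma_1^2}B_{XX}\Big]^{-1}W_{XX}, \qquad W_2 = \Big[W_{XX} + \frac{\sigma_\nu^2}{\sigma_1^2}B_{XX}\Big]^{-1}\frac{\sigma_\nu^2}{\sigma_1^2}B_{XX}, \qquad\text{with } W_1 + W_2 = I_K.$$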

The between estimator
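I $\hat\beta_B$ only takes into account the variation between the groups
and ignores the variation within the groups, since it is the
OLSE of $\beta$ in the model after transforming by $P$:
$$Py = \alpha P1_{NT} + PX\beta + P\varepsilon = \alpha 1_{NT} + PX\beta + P\varepsilon \qquad (6)$$
$$\Leftrightarrow\; \bar y_{i\cdot} = \alpha + \bar x_{i\cdot}'\beta + \bar\varepsilon_{i\cdot}, \quad i = 1, \dots, N \;\; (\forall t).$$
I By the Frisch-Waugh Theorem, the between estimator of $\beta$
can be obtained separately as the OLSE in the model
$$\underbrace{Q_0P}_{=P-P_0}\,y = \underbrace{Q_0P}_{=P-P_0}\,X\beta + Q_0P\varepsilon \quad (\text{recall } Q_0 1_{NT} = 0).$$
I It follows:
$$\hat\beta_B = [\underbrace{X'(P - P_0)X}_{=B_{XX}}]^{-1}\underbrace{X'(P - P_0)y}_{=B_{XY}} = \Big[\sum_{i=1}^{N}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})(\bar x_{i\cdot} - \bar x_{\cdot\cdot})'\Big]^{-1}\sum_{i=1}^{N}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})(\bar y_{i\cdot} - \bar y_{\cdot\cdot})$$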

Interpretation of the weighting matrices
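I The respective FW representations yield:
$$V[\hat\beta_W|X] = \sigma_\nu^2 W_{XX}^{-1}$$
$$V[\hat\beta_B|X] = \sigma_1^2 B_{XX}^{-1}$$
$$V[\hat\beta_{GLS}|X] = \sigma_\nu^2 [W_{XX} + \psi^2 B_{XX}]^{-1}, \quad\text{where } \psi^2 = \frac{\sigma_\nu^2}{\sigma_1^2}$$
⇒ $W_{XX} = \sigma_\nu^2 (V[\hat\beta_W|X])^{-1}$ and $\psi^2 B_{XX} = \sigma_\nu^2 (V[\hat\beta_B|X])^{-1}$
⇒ $W_1 = [W_{XX} + \psi^2 B_{XX}]^{-1}W_{XX}$ and $W_2 = [W_{XX} + \psi^2 B_{XX}]^{-1}\psi^2 B_{XX}$
are proportional to the inverse covariance matrices of $\hat\beta_W$ and $\hat\beta_B$.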

Special cases
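(i) $\sigma_\mu^2 = 0$: ⇒ $\psi^2 = \frac{\sigma_\nu^2}{T \cdot 0 + \sigma_\nu^2} = 1$
⇒ $V[\varepsilon] = \sigma_\nu^2 I_{NT}$ ⇒ (pooled) OLSE = GLSE
(ii) $T \to \infty$:
$$\psi^2 = \frac{\sigma_\nu^2}{T\sigma_\mu^2 + \sigma_\nu^2} \to 0 \;\Rightarrow\; W_1 \to I_K \;\Rightarrow\; \hat\beta_{GLS} \to \hat\beta_W$$
(iii) $\psi^2 \to \infty$ (hypothetically, since $0 \leq \psi^2 \leq 1$):
$$W_1 \to 0 \;\Rightarrow\; \hat\beta_{GLS} \to \hat\beta_B$$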

Estimation of the intercept α

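I The GLSE for $\alpha$ is the OLSE in model (4).
I Because of the identity of the residuals in (4) and (5) it follows that
$$1_{NT} \cdot \frac{\hat\alpha_{GLS}}{\sigma_1} = P_0(\tilde y - \tilde X\hat\beta_{GLS}).$$
I Due to
$$P_0\Omega^{-1/2} = P_0\Big(\frac{1}{\sigma_1}P + \frac{1}{\sigma_\nu}Q\Big) = \frac{1}{\sigma_1}P_0,$$
we obtain
$$1_{NT} \cdot \frac{\hat\alpha_{GLS}}{\sigma_1} = \frac{1}{\sigma_1}P_0(y - X\hat\beta_{GLS}) = \frac{1}{\sigma_1}1_{NT}(\bar y_{\cdot\cdot} - \bar x_{\cdot\cdot}'\hat\beta_{GLS})$$
and hence
$$\hat\alpha_{GLS} = \bar y_{\cdot\cdot} - \bar x_{\cdot\cdot}'\hat\beta_{GLS}.$$
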
4.3.3 Estimation of the variance components
I Necessary to obtain the feasible GLS estimator!
I Starting point: $\varepsilon \sim (0, \Omega)$ [conditional on $X$], $\Omega = \sigma_1^2 P + \sigma_\nu^2 Q$
I Because of $QG = 0$ and $PQ = 0$ it follows that
$$Q\varepsilon = Q\nu \sim (0, \sigma_\nu^2 Q) \quad\text{and}\quad P\varepsilon \sim (0, \sigma_1^2 P)$$
I Then $E[\|Q\varepsilon\|^2] = E[\|Q\nu\|^2] = \sigma_\nu^2\,\mathrm{tr}(Q) = \sigma_\nu^2 N(T - 1)$
I Hence,
$$E\Big[\frac{\varepsilon'Q\varepsilon}{N(T - 1)}\Big] = E\Big[\frac{\|Q\varepsilon\|^2}{N(T - 1)}\Big] = \sigma_\nu^2$$
I Similarly: $E[\|P\varepsilon\|^2] = \sigma_1^2\,\mathrm{tr}(P) = \sigma_1^2 N$
⇒ $E[\varepsilon'P\varepsilon/N] = \sigma_1^2$
I Now: replace $\varepsilon$ (not observable!) by residuals.

Intermezzo: A useful lemma

I Lemma. Let $y$ be a random $n$-vector with $y \sim (\mu, \Sigma)$ and let $A$
be a (symmetric) $n \times n$ matrix. Then
$$E[y'Ay] = \mu'A\mu + \mathrm{tr}[A\Sigma].$$
I Proof. Writing $y = \mu + (y - \mu)$, we obtain
$$E[y'Ay] = \mu'A\mu + E[(y - \mu)'A(y - \mu)] + 2\underbrace{E[\mu'A(y - \mu)]}_{=0}$$
$$\phantom{E[y'Ay]} = \mu'A\mu + E\{\mathrm{tr}[(y - \mu)'A(y - \mu)]\} = \mu'A\mu + \mathrm{tr}\{A\underbrace{E[(y - \mu)(y - \mu)']}_{=\Sigma}\},$$
where we used $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ and the linearity of the
expectation.
Within residuals
I Appropriate for estimating $\sigma_\nu^2$
I Within residuals are obtained from (3) after applying the
within transformation $Q$:
$$\hat\nu_W = Qy - QX\hat\beta_W = \underbrace{[Q - QX(X'QX)^{-1}X'Q]}_{=:C}\,y = Cy$$
I $C$ is an orthogonal projection (i.e. $C' = C = C^2$) with
$CQ = C$, $CX = 0$, $CP = 0$, $C1_{NT} = 0$.
I It can be shown that
$$\hat\nu_W \sim (0, \sigma_\nu^2 C).$$
I The above Lemma thus yields
$$E[\|\hat\nu_W\|^2] = \sigma_\nu^2\,\mathrm{tr}(C) = \sigma_\nu^2 (NT - N - K).$$
⇒ An unbiased estimator of $\sigma_\nu^2$ is given by:
$$\hat\sigma_\nu^2 = \frac{\|\hat\nu_W\|^2}{NT - N - K} = \frac{y'(Q - QX(X'QX)^{-1}X'Q)y}{NT - N - K}.$$

Between (-group) residuals
I To estimate $\sigma_1^2$, we use the between-group residuals, which
can be obtained as OLS residuals in (6):
$$\hat\varepsilon_B = Py - PZ\hat\theta_B = \underbrace{(P - PZ(Z'PZ)^{-1}Z'P)}_{=:D}\,y = Dy \quad (\text{with } \hat\theta_B = (Z'PZ)^{-1}Z'Py)$$
I $D$ is also an orthogonal projection (i.e. $D' = D = D^2$) with
$DP = D$, $DZ = 0$, $DQ = 0$.
I It can be shown that
$$\hat\varepsilon_B \sim (0, \sigma_1^2 D).$$
I From our Lemma it follows that
$$E[\|\hat\varepsilon_B\|^2] = \sigma_1^2\,\mathrm{tr}(D) = \sigma_1^2 (N - K - 1).$$
⇒ Unbiased estimator of $\sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2$:
$$\hat\sigma_1^2 = \frac{\|\hat\varepsilon_B\|^2}{N - K - 1} = \frac{y'Dy}{N - K - 1} = \frac{y'(P - PZ(Z'PZ)^{-1}Z'P)y}{N - K - 1}$$
⇒ Unbiased estimator of $\sigma_\mu^2$:
$$\hat\sigma_\mu^2 = \frac{\hat\sigma_1^2 - \hat\sigma_\nu^2}{T}$$
I Note: $\hat\sigma_\mu^2$ can take negative values. (In this case alternative
estimators have to be used!)
⇒ Problems with interpretation
. Possible reasons: The sample is too small, the individual
effects are insignificant, or the model is misspecified.

4.4.1 Tests for poolability of the data

I Model assumption: $\alpha$ and $\beta$ depend neither on $i$ nor on $t$.
I Allow (under $H_1$) the parameters to vary across individuals:
$$H_1: y_i = \alpha_i 1_T + X_i\beta_i + \varepsilon_i \quad (i = 1, \dots, N)$$
I Test for poolability of the data across individuals:
$$H_0^a: \alpha_i \equiv \alpha \text{ and } \beta_i \equiv \beta \quad (i = 1, \dots, N)$$
I Assume first $\varepsilon \sim N(0, \sigma^2 I_{NT})$ [no individual-specific effects]
. The model can be estimated efficiently by OLS
(under $H_1$: OLS equation by equation).
. The linear hypothesis $H_0^a$ can be tested by an $F$-test.
. Distribution of the $F$-statistic under $H_0^a$: $F_{(N-1)(K+1),\,N(T-K-1)}$.
I Now: inclusion of fixed effects under both $H_0$ and $H_1$
. Model under $H_1$ as above: $\alpha_i = \mu_i^*$, $\varepsilon_i = \nu_i$
. Assume $\varepsilon = \nu \sim N(0, \sigma^2 I_{NT})$
⇒ Estimation by OLS (under $H_1$: equation by equation)
. $H_0^b: \beta_i \equiv \beta$ $(i = 1, \dots, N)$ is tested by an $F$-test
. Distribution of the $F$-statistic under $H_0^b$: $F_{(N-1)K,\,N(T-K-1)}$
I Case of random effects: $\varepsilon \sim N(0, \Omega)$
. Estimation of the variance components ⇒ $\hat\Omega$
. The $F$-test for testing $H_0^a$ vs. $H_1$ in the model for $\hat\Omega^{-1/2}y$ is
approximately valid.

4.4.2 Tests for individual effects
I Test for fixed effects:
. $F$-test, cf. Subsection 4.2.3 ($H_0: \mu_1 = \dots = \mu_N = 0$)
I Test for random effects:
$$H_0: \sigma_\mu^2 = 0$$
. Lagrange multiplier test under normality (Breusch-Pagan)
. An asymptotic $\alpha_0$-test rejects $H_0$ ⇔
$$LM := \frac{NT}{2(T - 1)}\Big[1 - \frac{\hat\varepsilon_H'(I_N \otimes J_T)\hat\varepsilon_H}{\hat\varepsilon_H'\hat\varepsilon_H}\Big]^2 > \chi^2_{1,1-\alpha_0},$$
where $\hat\varepsilon_H$ is the vector of residuals under $H_0$ (pooled OLS
residuals).
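
As a rough illustration of how the LM statistic is evaluated, the sketch below uses a small set of hypothetical pooled OLS residuals arranged by individual ($N = 3$, $T = 2$). The quadratic form $\hat\varepsilon_H'(I_N \otimes J_T)\hat\varepsilon_H$ is simply the sum over individuals of the squared residual sums. The function name and the residual values are assumptions; the critical value 3.841 is the 95% quantile of the $\chi^2_1$ distribution.

```python
from fractions import Fraction

def breusch_pagan_lm(resid):
    """LM statistic for H0: sigma_mu^2 = 0 from residuals arranged as N rows of length T."""
    n, t = len(resid), len(resid[0])
    # e'(I_N ⊗ J_T)e: squared sum of the residuals of each individual
    quad_form = sum(sum(row) ** 2 for row in resid)
    # e'e: sum of all squared residuals
    total = sum(v * v for row in resid for v in row)
    return Fraction(n * t, 2 * (t - 1)) * (1 - Fraction(quad_form, total)) ** 2

# Hypothetical integer residuals that sum to zero, N = 3 individuals, T = 2 periods
residuals = [[2, 1], [-1, -2], [1, -1]]
lm = breusch_pagan_lm(residuals)

critical_value = 3.841          # chi-square(1) quantile for alpha_0 = 0.05
reject_h0 = lm > critical_value
```

With only six observations the asymptotic approximation is of course not reliable; the example only traces the arithmetic of the formula.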

4.4.3 Exogeneity of the regressors
I General assumption: $X$ is strictly exogenous w.r.t. $\nu$:
$$E[\nu|X] = 0$$
⇒ $\hat\beta_W$ is unbiased (and consistent under mild conditions) in
both the FE and the RE model:
$$\hat\beta_W = \beta + (X'QX)^{-1}X'Q\nu \;\Rightarrow\; E[\hat\beta_W] = \beta + E\big[(X'QX)^{-1}X'Q\,E[\nu|X]\big] = \beta$$
. In the FE model, $\hat\beta_W$ is efficient (BLUE).
. In the RE model, $\hat\beta_W$ is inefficient.
I RE model: $\hat\beta_{GLS}$ is in general only unbiased (and consistent
and efficient) if
$$E[\varepsilon|X] = 0, \text{ i.e. } E[\mu|X] = 0 \text{ besides } E[\nu|X] = 0$$
I Recall:
$$\hat\theta_{GLS} = \theta + (Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}\varepsilon$$
$$\Rightarrow\; E[\hat\theta_{GLS}] = \theta + E\big[(Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}\underbrace{E(\varepsilon|Z)}_{=G\,E[\mu|Z] + 0}\big] = \theta \quad\text{(only) if } E[\mu|Z] = E[\mu|X] = 0\,!$$

Hausman test
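I Null hypothesis: $H_0: E[\varepsilon|X] = 0$ (or $E[\mu|X] = 0$)
I Idea of the Hausman test:
$$\hat q := \hat\beta_{GLS} - \hat\beta_W \xrightarrow{P} 0 \;\;(\text{under } H_0), \quad\text{but}\quad \hat q \overset{P}{\not\to} 0 \;\;(\text{under } H_1)$$
I An asymptotic $\alpha_0$-test rejects $H_0$ if
$$\hat q'[\widehat{V(\hat q)}]^{-1}\hat q > \chi^2_{K,1-\alpha_0},$$
where $V(\hat q) = V(\hat\beta_W) - V(\hat\beta_{GLS})$
(generally positive definite under $H_0$ due to the inefficiency of $\hat\beta_W$).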

Practice: RE vs. FE model

I RE model if the data are randomly drawn from some population
. Otherwise: FE model, within estimator (efficient)
I RE model: Is $X$ exogenous w.r.t. $\varepsilon$ (Hausman test)?
. If yes, use the RE estimator (feasible GLSE, asymptotically efficient).
. Otherwise, use the FE estimator (within estimator, consistent),
which can be interpreted as an IV estimator (IV: $QX$).
. The within estimator is even efficient in the RE model if e.g.
(Mundlak)
$$\mu_i = \bar x_{i\cdot}'\pi + u_i, \quad u_i \sim (0, \sigma_u^2) \text{ i.i.d.}$$
I After deciding on the RE or FE model, test for individual effects:
. RE model $H_0: \sigma_\mu^2 = 0$, FE model $H_0: \mu_i \equiv 0$
. If $H_0$ is not rejected: Use the pooled OLSE.

4.5 Example: Estimation of an investment function
I Aim: Estimation of a linear investment function
I Data for $N = 10$ US (manufacturing) firms over $T = 20$ years
from 1935 to 1954 (Grunfeld, 1958).
I Dependent variable $GI_{it}$: real gross investment of firm $i$ in
year $t$
I Explanatory variables:
. $VF_{it}$: real value of the firm
. $VC_{it}$: real value of the firm's capital stock

Plot of Grunfeld data (see R code)

[Figure: pairwise plots of the variables firm, year, inv, value and capital]

Pooled OLS estimation
Oneway (individual) effect Pooling Model
Call:
plm(formula = inv ~ value + capital, data = panel.gr, model = "pooling")
Balanced Panel: n=10, T=20, N=200
Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-292.0 -30.0 5.3 34.8 369.0
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
(Intercept) -42.7143694 9.5116760 -4.4907 1.207e-05 ***
value 0.1155622 0.0058357 19.8026 < 2.2e-16 ***
capital 0.2306785 0.0254758 9.0548 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 9359900

Residual Sum of Squares: 1755900
R-Squared : 0.81241
Adj. R-Squared : 0.80022
F-statistic: 426.576 on 2 and 197 DF, p-value: < 2.22e-16
One-way EC model with fixed effects: Within estimation
Oneway (individual) effect Within Model
Call:
plm(formula = inv ~ value + capital, data = panel.gr, model = "within")
Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-184.000 -17.600 0.563 19.200 251.000
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
value 0.110124 0.011857 9.2879 < 2.2e-16 ***
capital 0.310065 0.017355 17.8666 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-Squared : 0.76676

Estimating the fixed effects
I Estimates of $\mu_i^* = \alpha + \mu_i$
> (fix.gr.fe <- fixef(gr.fe))
1 2 3 4 5 6
-70.296717 101.905814 -235.571841 -27.809295 -114.616813 -23.161295
7 8 9 10
-66.553474 -57.545657 -87.222272 -6.567844
> summary(fixef(gr.fe))
Estimate Std. Error t-value Pr(>|t|)
1 -70.2967 49.7080 -1.4142 0.15730
2 101.9058 24.9383 4.0863 4.383e-05 ***
3 -235.5718 24.4316 -9.6421 < 2.2e-16 ***
4 -27.8093 14.0778 -1.9754 0.04822 *
5 -114.6168 14.1654 -8.0913 6.661e-16 ***
6 -23.1613 12.6687 -1.8282 0.06752 .
7 -66.5535 12.8430 -5.1821 2.194e-07 ***
8 -57.5457 13.9931 -4.1124 3.915e-05 ***
9 -87.2223 12.8919 -6.7657 1.327e-11 ***
10 -6.5678 11.8269 -0.5553 0.57867
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Test for joint significance of fixed effects
I Null hypothesis $H_0: \mu_i = 0 \ \forall i \iff \mu_1^* = \dots = \mu_N^* = \alpha$

I Under $H_0$, pooled OLS estimation is appropriate.
> pFtest(gr.fe,gr.pool)
F test for individual effects
data: inv ~ value + capital

F = 49.1766, df1 = 9, df2 = 188, p-value < 2.2e-16
alternative hypothesis: significant effects
⇒ H0 is rejected.
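The degrees of freedom of this F test follow from the model dimensions: N - 1 = 9 restrictions and NT - N - K = 188 residual degrees of freedom in the FE model. The sketch below reproduces them and, as a hedged consistency check, backs out the within residual sum of squares implied by the printed F-statistic, assuming the usual form F = [(RSS_pool - RSS_FE)/(N - 1)] / [RSS_FE/(NT - N - K)]. Variable names are illustrative.

```r
# Model dimensions of the example.
n_firms   <- 10
t_periods <- 20
k         <- 2

df1 <- n_firms - 1                          # number of restrictions
df2 <- n_firms * t_periods - n_firms - k    # residual df of the FE model

# Printed test statistic and the rounded pooled RSS from the output.
f_stat   <- 49.1766
rss_pool <- 1755900

# 5% critical value and p-value of the F test.
crit  <- qf(0.95, df1, df2)
p_val <- pf(f_stat, df1, df2, lower.tail = FALSE)

# Within RSS implied by the F formula (approximate, inputs are rounded).
rss_fe_implied <- rss_pool / (1 + f_stat * df1 / df2)

# Implied estimate of the idiosyncratic error variance.
sigma2_eps_implied <- rss_fe_implied / df2

c(df1 = df1, df2 = df2, crit = crit, sigma2_eps = sigma2_eps_implied)
```

The implied error variance is close to the idiosyncratic variance of 2784.46 reported in the random effects output, which is what one would expect if that component is estimated from the within residuals.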

One-way EC model with random effects: FGLS estimation

Call:
plm(formula = inv ~ value + capital, data = panel.gr, model = "random")
Effects:
                  var std.dev share
idiosyncratic 2784.46   52.77 0.282
individual    7089.80   84.20 0.718
theta: 0.8612
Residuals :
   Min. 1st Qu.  Median 3rd Qu.    Max.
-178.00  -19.70    4.69   19.50  253.00
Coefficients :
              Estimate Std. Error t-value Pr(>|t|)
(Intercept) -57.834415  28.898935 -2.0013  0.04674 *
value         0.109781   0.010493 10.4627  < 2e-16 ***
capital       0.308113   0.017180 17.9339  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-Squared : 0.7695
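The reported theta and variance shares can be reproduced from the two estimated variance components. The sketch below assumes the standard quasi-demeaning weight for a balanced panel, theta = 1 - sqrt(sigma_eps^2 / (sigma_eps^2 + T * sigma_mu^2)); variable names are illustrative.

```r
# Variance components copied from the "Effects" block of the RE output.
sigma2_eps <- 2784.46   # idiosyncratic
sigma2_mu  <- 7089.80   # individual
t_periods  <- 20

# Share of the individual effect in the total error variance.
share_mu <- sigma2_mu / (sigma2_mu + sigma2_eps)

# Quasi-demeaning parameter: theta = 0 gives pooled OLS, theta = 1 the within estimator.
theta <- 1 - sqrt(sigma2_eps / (sigma2_eps + t_periods * sigma2_mu))

round(c(share_individual = share_mu, theta = theta), 4)
```

A theta this close to 1 means the FGLS transformation is close to the within transformation, which is consistent with the RE slope estimates being nearly identical to the FE estimates above.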

Between estimation in the RE model

Oneway (individual) effect Between Model
Call:
plm(formula = inv ~ value + capital, data = panel.gr, model = "between")
Residuals :
   Min. 1st Qu.  Median 3rd Qu.    Max.
-163.00   -3.68    2.97   20.70  144.00
Coefficients :
             Estimate Std. Error t-value Pr(>|t|)
(Intercept) -8.527114  47.515308 -0.1795  0.86266
value        0.134646   0.028745  4.6841  0.00225 **
capital      0.032031   0.190938  0.1678  0.87152
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-Squared : 0.85777
F-statistic: 21.1077 on 2 and 7 DF, p-value: 0.0010851

Test for poolability in the FE model
I Under H0: all slope parameters are the same, but the intercepts
may differ across firms
I Under H1: FE model with slope parameters varying across firms
> # pvcm: estimation of models with variable coefficients (model under H_1)
> gr.sur <- pvcm(inv ~ value + capital,data=panel.gr,model="within")
> # Estimation under H_0 (plm's default is model = "within"):
> gr.fe.pool <- plm(inv ~ value + capital,data=panel.gr)
> pooltest(gr.fe.pool,gr.sur)
F statistic
F = 5.7805, df1 = 18, df2 = 170, p-value = 1.219e-10
alternative hypothesis: unstability
⇒ H0 is rejected.
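The degrees of freedom of the poolability test can be derived by counting parameters: the unrestricted model has N separate regressions with K + 1 coefficients each, while the FE model restricts the K slopes to be equal across firms. A short sketch with illustrative variable names:

```r
# Model dimensions of the example.
n_firms   <- 10
t_periods <- 20
k         <- 2

# (N - 1) * K slope restrictions under H0.
df1 <- (n_firms - 1) * k

# Residual df of the unrestricted model: N regressions with T - K - 1 df each.
df2 <- n_firms * (t_periods - k - 1)

# p-value for the printed F statistic.
f_stat <- 5.7805
p_val  <- pf(f_stat, df1, df2, lower.tail = FALSE)

c(df1 = df1, df2 = df2, p.value = p_val)
```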

There is substantial variation among the firms:

> summary(gr.sur)
Oneway (individual) effect No-pooling model
Call:
pvcm(formula = inv ~ value + capital, data = panel.gr, model = "within")
Residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-184.5000 -7.1180 -0.3926 0.0000 5.7030 144.0000
Coefficients:
(Intercept) value capital
Min. :-149.782 Min. :0.004573 Min. :0.003102
1st Qu.: -9.639 1st Qu.:0.058518 1st Qu.:0.087132
Median : -6.956 Median :0.082738 Median :0.137738
Mean : -21.368 Mean :0.091285 Mean :0.205263
3rd Qu.: -1.507 3rd Qu.:0.128411 3rd Qu.:0.357513
Max. : 22.707 Max. :0.174856 Max. :0.437369

Multiple R-Squared: 0.99931

Hausman test
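I Under H0, the regressors are exogenous, i.e. uncorrelated with the
individual effects µ_i, so that both the FE and the RE estimator are consistent.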
> phtest(gr.fe,gr.re)
Hausman Test

chisq = 2.3304, df = 2, p-value = 0.3119
alternative hypothesis: one model is inconsistent
⇒ H0 is not rejected, so that the RE model can be used.
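The p-value of the Hausman test can be checked directly from the chi-squared distribution with 2 degrees of freedom (one per slope coefficient compared). A minimal sketch using the printed statistic:

```r
# Printed Hausman statistic and its degrees of freedom.
chisq_stat <- 2.3304
df         <- 2

# Upper-tail probability and the 5% critical value.
p_val <- pchisq(chisq_stat, df, lower.tail = FALSE)
crit  <- qchisq(0.95, df)

round(c(p.value = p_val, critical.value = crit), 4)
```

Because the statistic lies well below the critical value, the difference between the FE and RE estimates is not statistically significant at the 5% level.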

Breusch-Pagan Lagrange Multiplier test

I $H_0: \sigma_\mu^2 = 0$
> plmtest(gr.pool,effect="individual",type="bp")
Lagrange Multiplier Test - (Breusch-Pagan)

chisq = 798.1615, df = 1, p-value < 2.2e-16
alternative hypothesis: significant effects
⇒ H0 is rejected.
I Conclusion: An RE model seems to be appropriate, whereas the
pooled OLS model cannot be used.
. However, the poolability test rejects common slopes, so the
poolability assumption might be problematic.
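To illustrate what the Breusch-Pagan LM test computes, the sketch below evaluates the statistic LM = NT / (2(T - 1)) * [sum_i (sum_t e_it)^2 / sum_i sum_t e_it^2 - 1]^2 from pooled OLS residuals in a balanced panel. It uses simulated data with strong individual effects, not the Grunfeld data, and bp_lm is a hypothetical helper name; the value 798.16 reported above comes from plmtest.

```r
# LM statistic for individual effects from pooled OLS residuals.
# 'e' holds the residuals, 'id' the individual index (balanced panel assumed).
bp_lm <- function(e, id) {
  n_ind <- length(unique(id))
  t_len <- length(e) / n_ind
  sum_by_id <- tapply(e, id, sum)          # residual sums per individual
  ratio <- sum(sum_by_id^2) / sum(e^2)
  n_ind * t_len / (2 * (t_len - 1)) * (ratio - 1)^2
}

# Simulated balanced panel with N = 10, T = 20 and sizeable individual effects.
set.seed(1)
n_ind <- 10
t_len <- 20
id <- rep(seq_len(n_ind), each = t_len)
mu <- rnorm(n_ind, sd = 2)[id]             # individual effect, constant over time
x  <- rnorm(n_ind * t_len)
y  <- 1 + 0.5 * x + mu + rnorm(n_ind * t_len)

# Pooled OLS residuals, LM statistic and chi-squared(1) p-value.
lm_stat <- bp_lm(resid(lm(y ~ x)), id)
p_val   <- pchisq(lm_stat, df = 1, lower.tail = FALSE)

c(LM = lm_stat, p.value = p_val)
```

Under H0 the statistic is asymptotically chi-squared with one degree of freedom, so values far above the 5% critical value of about 3.84 indicate individual effects.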
