• Example:
$$sleep_{it} = \beta_0 + \beta_1\, total\_work_{it} + u_{it}$$
where sleep and total_work are time spent sleeping and working, measured in hours per week.
• The data set covers 239 people in 1975 and 1981, i.e.,
▫ i = 1, 2, …, 239
▫ t = 1975, 1981.
where
• $a_i$ is called unobserved heterogeneity (a fixed effect); it captures all unobserved, time-constant factors.
▫ Question: $a_i$ does not have a t subscript. What does this mean?
▫ Answer:
Time dummies
• We also normally include $d_t$, a dummy variable (or set of dummies) for the time period, which captures time-specific macro effects.
▫ Question: $d_t$ does not have an i subscript. What does this mean?
▫ Answer:
Unobserved heterogeneity
Example 1
• People who are organised pay attention to how much sleep they get.
• We worry that being organised is also positively correlated with how much they choose to work.
Example 2
• Remember the formula for omitted variable bias (Lecture Note 11,
p.16):
$$E(\hat{\beta}_1) = \beta_1 + \underbrace{\beta_2 \, \frac{\sum_i (x_i - \bar{x})(z_i - \bar{z})}{\sum_i (x_i - \bar{x})^2}}_{\text{bias}}$$
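The bias formula can be checked numerically. The sketch below is only an illustration on simulated data: the coefficient values, the sample size and the way x is made to depend on z are all assumptions, not taken from the lecture. It regresses y on x alone and compares the slope with $\beta_1 + \beta_2 \cdot S_{xz}/S_{xx}$.

```python
import random

def ols_slope(x, y):
    """Slope from regressing y on x with an intercept."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

random.seed(0)
n = 500
beta1, beta2 = 1.0, 2.0   # hypothetical true coefficients

z = [random.gauss(0, 1) for _ in range(n)]          # omitted variable
x = [0.5 * zi + random.gauss(0, 1) for zi in z]     # x is correlated with z
u = [random.gauss(0, 1) for _ in range(n)]          # idiosyncratic error
y = [beta1 * xi + beta2 * zi + ui for xi, zi, ui in zip(x, z, u)]

short_slope = ols_slope(x, y)                  # regression that omits z
predicted = beta1 + beta2 * ols_slope(x, z)    # beta1 + beta2 * Sxz / Sxx
sampling_noise = ols_slope(x, u)               # Sxu / Sxx, zero in expectation

print(short_slope, predicted, sampling_noise)
```

In any one sample the short-regression slope equals `predicted + sampling_noise` exactly; the formula above is what remains after taking expectations, because the noise term averages to zero.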
Example 1 (revisited)
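$$E(\hat{\beta}_1) = \beta_1 + \underbrace{\beta_2 \, \frac{\sum_i (x_i - \bar{x})(z_i - \bar{z})}{\sum_i (x_i - \bar{x})^2}}_{\text{bias}}$$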
Example 2 (revisited)
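$$E(\hat{\beta}_1) = \beta_1 + \underbrace{\beta_2 \, \frac{\sum_i (x_i - \bar{x})(z_i - \bar{z})}{\sum_i (x_i - \bar{x})^2}}_{\text{bias}}$$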
• Therefore,
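$$E(\hat{\beta}_{educ}) = \beta_{educ} + \underbrace{\beta_2 \, \frac{\sum_i (x_i - \bar{x})(z_i - \bar{z})}{\sum_i (x_i - \bar{x})^2}}_{\text{bias}}$$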
Estimation methods
• I’ll introduce four common ways to estimate panel data models.
1. Pooled OLS
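where
$$\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}, \quad \bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}, \quad \bar{d} = \frac{1}{T}\sum_{t=1}^{T} d_t, \quad \bar{u}_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}$$
and
$$\lambda = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}}$$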
• This is like pooled OLS accounting for the “serial correlation” in the composite error term, $a_i + u_{it}$, due to $a_i$.
$$E(\hat{\beta}) = \beta + \rho\theta$$
where
▫ ρ = the impact that a positive change in $a_i$ is likely to have on the dependent variable.
▫ θ = the correlation between $a_i$ and $x_{it}$.
• Note that any explanatory variable that is constant over time gets
swept away by first-differencing.
▫ Thus, we cannot estimate the effect of such a variable on y.
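A small simulation can make both points concrete. This is a sketch under assumed values: a two-period panel with a hypothetical true slope of 1.5, a fixed effect `a` that is correlated with x, and a time-constant dummy `w`. None of these numbers come from the lecture. Differencing removes `a` and `w`, so the first-difference slope is close to the true value while the pooled slope is not.

```python
import random

def ols_slope(x, y):
    """Slope from regressing y on x with an intercept."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

random.seed(1)
n, beta = 2000, 1.5            # hypothetical sample size and true slope
dx, dy, dw = [], [], []        # first differences
x_pool, y_pool = [], []        # stacked levels for pooled OLS

for _ in range(n):
    a = random.gauss(0, 1)             # unobserved fixed effect
    w = random.choice([0, 1])          # time-constant regressor
    x1 = a + random.gauss(0, 1)        # x depends on a in both periods
    x2 = a + random.gauss(0, 1)
    y1 = beta * x1 + 0.7 * w + a + random.gauss(0, 1)
    y2 = 0.3 + beta * x2 + 0.7 * w + a + random.gauss(0, 1)  # 0.3 = period effect
    dx.append(x2 - x1)
    dy.append(y2 - y1)
    dw.append(w - w)                   # always zero: w is swept away
    x_pool += [x1, x2]
    y_pool += [y1, y2]

fd_slope = ols_slope(dx, dy)               # a and w have dropped out
pooled_slope = ols_slope(x_pool, y_pool)   # contaminated by a

print(fd_slope, pooled_slope, set(dw))
```

The differenced regression keeps an intercept, which plays the role of the period effect. Because every entry of `dw` is zero, no coefficient on `w` can be estimated from the differenced data.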
where $\tilde{y}_{it} = y_{it} - \bar{y}_i$ is the mean deviation of y, and so on.
• Using OLS to estimate (5)’ yields the fixed effects (FE) estimator.
▫ It is also called the within estimator because we use the time variation in y, x and d within each cross-sectional observation.
• Note that any explanatory variable that is constant over time gets
swept away by the within transformation.
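The within transformation can be sketched on a tiny noise-free panel. The two units, their x values, the fixed effects and the slope of 2 are all made up for illustration, and time dummies are left out to keep the example short. Each unit's observations are demeaned, then OLS is run on the demeaned data.

```python
def demean(series):
    """Subtract the series' own average from every element."""
    mean = sum(series) / len(series)
    return [v - mean for v in series]

def slope_no_intercept(x, y):
    """OLS slope through the origin (valid for mean-zero data)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical panel: 2 units, 3 periods, y_it = 2 * x_it + a_i (no noise).
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
a = [0.0, 10.0]          # fixed effects, larger for the unit with larger x
beta = 2.0
y = [[beta * x_it + a_i for x_it in x_i] for x_i, a_i in zip(x, a)]

# Within transformation: demean separately inside each unit, then stack.
x_tilde = [v for x_i in x for v in demean(x_i)]
y_tilde = [v for y_i in y for v in demean(y_i)]
fe_slope = slope_no_intercept(x_tilde, y_tilde)

# Pooled OLS with an intercept: demean over the whole stacked sample instead.
x_all = [v for x_i in x for v in x_i]
y_all = [v for y_i in y for v in y_i]
pooled_slope = slope_no_intercept(demean(x_all), demean(y_all))

# A time-constant regressor becomes all zeros after the within transformation.
const = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
const_tilde = [v for c_i in const for v in demean(c_i)]

print(fe_slope, pooled_slope, const_tilde)
```

Here the within slope recovers the assumed value of 2, while the pooled slope is pushed upward because the unit with the larger fixed effect also has the larger x values.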
• You can, but the fact that $a_i$ is constant over time means that there is serial correlation within each i.
• To do so, let
$$\lambda = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}}$$
▫ when λ = 0, we get the estimating equation of pooled OLS.
▫ when λ = 1, we get the estimating equation of FE.
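The two limiting cases can be illustrated with a short sketch. It assumes the standard random-effects weight λ = 1 − sqrt(σ_u² / (σ_u² + T·σ_a²)); the variance values and the tiny data set are hypothetical, and the function names are chosen here only for illustration.

```python
import math

def re_lambda(sigma_u2, sigma_a2, T):
    """Quasi-demeaning weight: 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2))."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

def quasi_demean(values_by_unit, lam):
    """Subtract lam times each unit's time average from that unit's observations."""
    out = []
    for series in values_by_unit:
        mean = sum(series) / len(series)
        out.append([v - lam * mean for v in series])
    return out

# Hypothetical variance components with T = 8 periods.
lam_no_heterogeneity = re_lambda(1.0, 0.0, 8)    # sigma_a^2 = 0
lam_moderate = re_lambda(1.0, 1.0, 8)            # strictly between 0 and 1
lam_dominant = re_lambda(1.0, 1000.0, 8)         # sigma_a^2 very large

y = [[2.0, 4.0, 6.0], [18.0, 20.0, 22.0]]        # two units, three periods
raw = quasi_demean(y, 0.0)       # lam = 0: data unchanged, as in pooled OLS
within = quasi_demean(y, 1.0)    # lam = 1: full demeaning, as in FE

print(lam_no_heterogeneity, lam_moderate, lam_dominant)
print(raw, within)
```

With no unobserved heterogeneity the weight is zero and nothing is subtracted; as the variance of the fixed effect grows relative to the idiosyncratic variance, the weight approaches one and the transformation approaches the within transformation.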
Example
• Let’s consider a wage equation for men:
• Estimate the wage equation using observations on 545 males for the
years 1980-1987 in the US.
• We use 4 methods:
Pooled OLS, Random Effects, Fixed Effects and First Differences.
VARIABLES    (1) pooled OLS    (2) RE    (3) FE    (4) FD
• There is a tradeoff:
• What you need is a test of the correlation between $a_i$ and the x’s.
Hausman test
• The RE estimator is
▫ unbiased under H0.
▫ biased under H1.
• The FE estimator is
▫ unbiased both under H0 and H1.
• The Hausman statistic is asymptotically distributed as $\chi^2$ with (k+1) degrees of freedom, where k is the number of explanatory variables in the model.
• Decision rule:
▫ Use the RE estimator if H0 is not rejected.
▫ Use the FE estimator if H0 is rejected.
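The decision rule can be sketched for the simplest case, where only one coefficient is compared. In that case the statistic reduces to the squared difference between the FE and RE estimates divided by the difference in their variances, and the reference distribution is chi-square with one degree of freedom. With several coefficients the statistic is the corresponding matrix quadratic form and the degrees of freedom follow the rule stated above. All numbers below are hypothetical, not estimates from the lecture.

```python
import math

def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """One-coefficient Hausman statistic: (b_FE - b_RE)^2 / (Var_FE - Var_RE)."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("Var(FE) must exceed Var(RE) for this form of the statistic")
    diff = b_fe - b_re
    return diff * diff / var_diff

def chi2_df1_pvalue(h):
    """Upper-tail probability of a chi-square(1) variable: P(|Z| > sqrt(h))."""
    return math.erfc(math.sqrt(h / 2.0))

# Hypothetical estimates and variances for a single coefficient.
b_fe, b_re = 0.045, 0.064
var_fe, var_re = 0.0001, 0.00005

H = hausman_scalar(b_fe, b_re, var_fe, var_re)
p_value = chi2_df1_pvalue(H)

CRITICAL_5PCT_DF1 = 3.841          # 5% critical value of chi-square(1)
use_fe = H > CRITICAL_5PCT_DF1     # reject H0, so use the FE estimator

print(H, p_value, "FE" if use_fe else "RE")
```

With these made-up inputs the statistic exceeds the 5% critical value, so H0 is rejected and the rule points to FE.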
Example
• Consider again the wage equation for men:
$$\ln(wage_{it}) = \beta_0 + \beta_1 educ_i + \beta_2 black_i + \beta_3 hispanic_i + \beta_4 exper_{it} + \beta_5 exper_{it}^2 + \beta_6 married_{it} + \beta_7 union_{it} + a_i + u_{it}$$
• Estimating the equation by the RE and FE estimators yields $\hat{\beta}_{RE}$ and $\hat{\beta}_{FE}$:
Example – cont.
• The Hausman test statistic is:
Summary
1. What is panel data?
▫ time dummies, unobserved heterogeneity