
1

INTRODUCTORY PANEL DATA


2

CHAPTER 11: PANEL DATA

• What have we learned so far?

• Analyses based on…
1. Cross-section data ✓

• It is time to depart from cross-section and time-series
econometrics and deal with…
2. Panel data

• We’re now ready for our final topic for this module.
3

What is panel data?


1. Cross-sectional data - multiple observations of, say, individuals for,
say, a given year:
   y_i = β_0 + β_1 x_1i + β_2 x_2i + … + β_k x_ki + u_i
▫ “i” refers to the cross-sectional unit (e.g., individual).

2. Time-series data - say, multiple years of observations for, say, a given
country:
   y_t = β_0 + β_1 x_1t + β_2 x_2t + … + β_k x_kt + u_t
▫ “t” refers to the time-series unit (e.g., year).

3. Panel data - observations on, say, multiple individuals at multiple
points in time:
   y_it = β_0 + β_1 x_1it + β_2 x_2it + … + β_k x_kit + u_it
▫ e.g., data on 500 individuals for 10 years.
4

Panel data model - Example


• The simple panel data model looks like this:
   y_it = α + β x_it + u_it

• where i = 1,…,N indexes individuals and t = 1,…,T indexes time periods.

• Example:
   sleep_it = α + β total_work_it + u_it
where sleep and total_work are sleeping and working, measured in
hours per week.

• The data set covers 239 people in 1975 and 1981, i.e.,
▫ i = 1, 2,…, 239
▫ t = 1975 and 1981.
5

Panel data model – cont.


• One way to use panel data is to view the unobserved factors affecting
the dependent variable as consisting of two types,
▫ constant and time-varying:

   y_it = α + β x_it + δ d_t + a_i + u_it    (a_i + u_it: the errors)

where
• a_i is called unobserved heterogeneity (a fixed effect); it captures all
unobserved, time-constant factors.
▫ Question: a_i does not have a t subscript. What does this mean?
▫ Answer: a_i is constant over time for each individual: it varies across
individuals but not across periods.

• u_it is called the idiosyncratic error (time-varying error); it
represents unobserved factors that change over time.
6

Time dummies

   y_it = α + β x_it + δ d_t + a_i + u_it

• We also normally include d_t, a dummy variable (or variables) for the
time period, that captures time-specific macro effects.
▫ Question: d_t does not have an i subscript. What does this mean?
▫ Answer: d_t is common to all individuals in a given period: it varies
over time but not across individuals.

• Q: Why bother including dt?


▫ We often think behavior changes over time due to natural evolutions
in economic variables.
7

Unobserved heterogeneity

   y_it = α + β x_it + δ d_t + a_i + u_it

• Q: What is ai (unobserved heterogeneity) capturing?

• We often think people are fundamentally different.

• We want to account for that by including ai.


▫ It measures tastes or costs that are (or might be) different for
each individual in the sample,
 that yield different choices of yit.
8

Unobserved heterogeneity

• Q: Should you always account for unobserved heterogeneity?

• A: Depends on question being addressed…

• …but there almost always is a good reason to include it.


▫ Why?

• We will look at some examples!


9

Example 1

• When analysing the impact of hours worked on hours of sleep, ai


could measure “how organised someone is”.

• People that are organised pay attention to how much sleep they get.

• We worry it’s also positively correlated with how much they choose
to work.

• Thus ignoring it could yield an upward biased estimate of the


effect of working on sleep.
10

Example 2

• When analysing the impact of education on wages, ai could measure


“ability”.

• Highly “able” people are likely to have high wages.

• We worry it’s also positively correlated with their choice of education.

• Thus ignoring it could yield an upward biased estimate of the effect


of education on wages.
11

Unobserved heterogeneity bias

• I’ve claimed that ignoring unobserved heterogeneity could yield bias.

• Why? This is easy!

• Surely you remember how to sign the bias in an OLS regression.


▫ (I say this in jest - I’m sure you do not.)

• It is worth a reminder, as this is an important skill.


12

Unobserved heterogeneity bias – cont.

• How do you know there is bias from unobserved heterogeneity?

• Recall that ai is part of the error term in our regression:

   y_it = α + β x_it + δ d_t + a_i + u_it




• Since a_i is part of the error term, if it is also correlated with one of
our x’s...

• We’ve violated one of the CLRM assumptions: E(η_it | x_it) = 0,
where η_it = a_i + u_it.

▫ This means the OLS estimator for β is biased.
▫ Sounds familiar? (It should!)
13

Unobserved heterogeneity bias – cont.


• How do you measure the sign of the bias?
• Consider the true model:
   y = β_0 + β_1 x + β_2 z + u

while you omitted z and regressed:


   y = β_0 + β_1 x + u

• Remember the formula for omitted variable bias (Lecture Note 11,
p.16):
   E(β̂_1) = β_1 + β_2 · Σ(x − x̄)(z − z̄) / Σ(x − x̄)²
           = β_1 + β_2 θ,    where β_2 θ is the bias

▫ Note that θ is the slope from the regression of z on x.


 Think of θ as just measuring the correlation between z and x.
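• Aside: the omitted-variable-bias formula is easy to verify by simulation. Below is a minimal NumPy sketch (not part of the original notes; all names and numbers are illustrative): when z is omitted and θ > 0, the short-regression slope centres on β_1 + β_2 θ rather than β_1.

```python
# Sketch: omitted-variable bias via simulation (illustrative values only).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True model: y = b0 + b1*x + b2*z + u, with z positively correlated with x.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)        # theta > 0: z and x move together
u = rng.normal(size=n)
b0, b1, b2 = 1.0, 2.0, 3.0
y = b0 + b1 * x + b2 * z + u

# Short regression omitting z: the slope picks up b2*theta.
X = np.column_stack([np.ones(n), x])
b1_hat = np.linalg.lstsq(X, y, rcond=None)[0][1]

# theta = slope of the auxiliary regression of z on x.
theta = np.cov(x, z)[0, 1] / np.var(x)

print(b1_hat)            # noticeably above the true b1 = 2.0
print(b1 + b2 * theta)   # the omitted-variable-bias formula prediction
```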
14

Unobserved heterogeneity bias – cont.

   E(β̂_1) = β_1 + β_2 · Σ(x − x̄)(z − z̄) / Σ(x − x̄)²    (bias = β_2 θ)

• In a panel data setting, ai plays the role of the omitted z.

• Thus we infer the sign of…

▫ β_2 from the impact that an increase in a_i is likely to have on the
dependent variable.
▫ θ from the correlation between a_i and x_it.

• Here are some examples…


15

Example 1 (revisited)
E ( ˆ1 )  1   2

bias

• In the sleep-work example, we worry ai captures “how organised


someone is”.

• Higher organisation is likely to have a positive impact on the amount


of sleep someone gets.

• We further worry that highly organised people also work more.

• Therefore,

   E(β̂_working) = β_working + β_2 θ    (β_2 θ = bias)

• Meaning that we think there may be positive bias on our estimate of
the impact of working on sleep.
16

Example 2 (revisited)
E ( ˆ1 )  1  
 2
bias

• In the wages-education example, we worry ai captures ability.

• Higher ability is likely to have a positive impact on wages.

• We further worry that highly-able people get more education.

• Therefore,

   E(β̂_educ) = β_educ + β_2 θ    (β_2 θ = bias)

• Meaning that we think there may be positive bias on our estimate of
the impact of education on wages.
17

Final thoughts: Thinking about bias

• Thinking about possible sources of bias, such as unobserved


heterogeneity, and its impact on your regression coefficients…

• ...is a major part of contemporary econometric practice.

• It is the thing that separates


▫ “Careful Econometricians” from
▫ “Regression Runners”.

• Please be one of the former, not one of the latter!

• Now let’s see how to handle problems of unobserved heterogeneity.


18

Panel data analysis

• How to estimate the parameters in a panel data model?

• Consider the most common representation. Let’s begin by assuming


our data only have 2 time periods.
▫ This corresponds nicely to our sleep dataset.
   y_it = α + β x_it + δ d_t + a_i + u_it
where i = 1, . . . ,N and t = 0, 1.

▫ Therefore we have only a single time dummy, dt, where dt = 1 in


period 1 and dt = 0 in period 0.
19

Estimation methods
• I’ll introduce four common ways to estimate panel data models.

1. Pooled OLS

• In this case, we ignore concerns of unobserved heterogeneity and


simply use OLS to estimate:

   Pooled OLS:  y_it = α + β x_it + δ d_t + (a_i + u_it)

where a_i + u_it is a composite error term, η_it, that includes any
unobserved heterogeneity.
20

Estimation methods – cont.


2. First-Differences:

• In this case, we account for unobserved heterogeneity by estimating


in first-differences:

   First-Differences:  Δy_i = δ + β Δx_i + Δu_i

where Δy_i = y_i2 − y_i1,  Δx_i = x_i2 − x_i1,  Δu_i = u_i2 − u_i1

• Taking first-differences eliminates unobserved heterogeneity.

• This is the simplest way to correct for unobserved heterogeneity.


▫ though you lose one period by taking differences.
21

Estimation methods – cont.


3. Fixed Effects:

• In this case, we account for unobserved heterogeneity by estimating


the fixed effects model:

   Fixed Effects:  y_it − ȳ_i = β(x_it − x̄_i) + δ(d_t − d̄) + (u_it − ū_i)

where ȳ_i = (1/T) Σ_t y_it,  x̄_i = (1/T) Σ_t x_it,  d̄ = (1/T) Σ_t d_t,
ū_i = (1/T) Σ_t u_it

• Taking mean deviations eliminates unobserved heterogeneity.

• This is the most common panel data estimation method.


22

Estimation methods – cont.


4. Random Effects:

• In this case, we assume that any unobserved heterogeneity, ai, is


uncorrelated with the x’s and estimate the random effects model:

   Random Effects:
   y_it − λȳ_i = α(1 − λ) + β(x_it − λx̄_i) + δ(d_t − λd̄) + (η_it − λη̄_i)

where ȳ_i = (1/T) Σ_t y_it,  x̄_i = (1/T) Σ_t x_it,  d̄ = (1/T) Σ_t d_t,
η̄_i = (1/T) Σ_t η_it,  η_it = a_i + u_it, and

   λ = 1 − √( σ_u² / (σ_u² + T σ_a²) )
• This is like Pooled OLS accounting for the “serial correlation” in the
composite error term, ai + uit, due to ai.
23

Estimation method 1: Pooled OLS


• Our estimating equation for Pooled OLS is:
   y_it = α + β x_it + δ d_t + (a_i + u_it)

where ai + uit is a composite error term that includes any unobserved


heterogeneity.

• As you can see, we are ignoring the unobserved heterogeneity, ai.


▫ It is just part of the composite error, ηit = ai + uit.

• To estimate, we run a simple OLS regression.

• When we do this for our example, what do we get for β̂ ?


24

Pooled OLS – cont.


   y_it = α + β x_it + δ d_t + a_i + u_it        (1)

• What do we get for the expectation of β̂ ?

   E(β̂) = β + ρ θ

where
▫ ρ = the impact that an increase in a_i has on the dependent variable.
▫ θ = the correlation between a_i and x_it.

• Thus the OLS estimator is biased, unless


1. ρ = 0 in which case ai does not appear in equation (1).
2. θ = 0, that is, unobserved heterogeneity ai is unrelated to the
variable xit.
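• This bias can be seen in a quick simulation (a NumPy sketch, illustrative only, not our sleep data): with a_i built to be positively correlated with x, pooled OLS over-estimates β.

```python
# Pooled-OLS sketch on a simulated 2-period panel (illustrative values only).
import numpy as np

rng = np.random.default_rng(1)
N, T = 5_000, 2
alpha, beta, delta = 2.0, 1.5, 0.5

a = rng.normal(size=N)                       # unobserved heterogeneity
x = a[:, None] + rng.normal(size=(N, T))     # x correlated with a_i (theta > 0)
d = np.tile(np.arange(T), (N, 1))            # time dummy: 0 in t=0, 1 in t=1
u = rng.normal(size=(N, T))
y = alpha + beta * x + delta * d + a[:, None] + u

# Stack the panel into long form and run a single OLS regression,
# ignoring a_i (it stays inside the composite error).
Y = y.ravel()
X = np.column_stack([np.ones(N * T), x.ravel(), d.ravel()])
coef = np.linalg.lstsq(X, Y, rcond=None)[0]
print(coef[1])   # pooled-OLS beta: biased upward, since Cov(x, a) > 0
```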
25

Estimation method 2: First Differences


• We need to derive our estimating equation for the First Difference
estimation.

• For a cross-sectional observation i, write the two years as


   y_i2 = (α + δ) + β x_i2 + a_i + u_i2    for t = 2.
   y_i1 = α + β x_i1 + a_i + u_i1          for t = 1.

• Subtracting the second equation from the first, we obtain:

   (y_i2 − y_i1) = δ + β(x_i2 − x_i1) + (u_i2 − u_i1)        (2)
or
   Δy_i = δ + β Δx_i + Δu_i                                  (2)'
where Δy_i = y_i2 − y_i1,  Δx_i = x_i2 − x_i1,  Δu_i = u_i2 − u_i1.
26

First Differences – cont.

   Δy_i = δ + β Δx_i + Δu_i        (2)'

• Equation (2)’ is called the first-differenced equation.

• Using OLS to estimate (2)’ yields the first-differenced (FD) estimator.

• The first-differenced estimator is unbiased...


▫ as long as uit is uncorrelated with x in both time periods;
▫ even if ai is correlated with x because ai is differenced away.

• Note that any explanatory variable that is constant over time gets
swept away by first-differencing.
▫ Thus, we cannot estimate the effect of such a variable on y.
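• A sketch of the FD estimator for T = 2 (NumPy, illustrative values): differencing removes a_i, so the slope is recovered even though a_i is correlated with x.

```python
# First-difference estimator sketch for T = 2 (illustrative values only).
import numpy as np

rng = np.random.default_rng(2)
N = 5_000
beta, delta = 1.5, 0.5

a = rng.normal(size=N)                   # fixed effect, correlated with x
x = a[:, None] + rng.normal(size=(N, 2))
u = rng.normal(size=(N, 2))
d = np.array([0.0, 1.0])                 # time dummy
y = 2.0 + beta * x + delta * d + a[:, None] + u

# Differencing sweeps a_i away: dy = delta + beta*dx + du.
dy = y[:, 1] - y[:, 0]
dx = x[:, 1] - x[:, 0]
X = np.column_stack([np.ones(N), dx])
delta_hat, beta_hat = np.linalg.lstsq(X, dy, rcond=None)[0]
print(beta_hat)   # close to the true beta = 1.5 despite Cov(x, a) > 0
```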
27

Estimation method 3: Fixed Effects


• We need to derive our estimating equation for the Fixed Effects
estimation. Consider a model:
   y_it = α + β x_it + δ d_t + a_i + u_it        (3)

• Now, for each i, average this equation over time:

   ȳ_i = α + β x̄_i + δ d̄ + a_i + ū_i            (4)

where ȳ_i = (1/T) Σ_t y_it and so on.
▫ Note that α and a_i are fixed over time.

• If we subtract (4) from (3) (called the within transformation), we obtain

   y_it − ȳ_i = β(x_it − x̄_i) + δ(d_t − d̄) + (u_it − ū_i)    (5)
or
   ỹ_it = β x̃_it + δ d̃_t + ũ_it                              (5)'

where ỹ_it = y_it − ȳ_i is the mean deviation of y, and so on.
28

Fixed Effects – cont.


   ỹ_it = β x̃_it + δ d̃_t + ũ_it        (5)'

• Using OLS to estimate (5)’ yields the fixed effects (FE) estimator.
▫ It is also called the within estimator.
 because we use the time variation in y, x and d within each cross-
sectional observation.

• The fixed effects estimator is unbiased...


 as long as uit is uncorrelated with x across all time periods;
▫ even if ai is correlated with x because ai is dropped by the within
transformation.

• Note that any explanatory variable that is constant over time gets
swept away by the within transformation.
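• The within transformation is just per-individual demeaning, which is two lines of NumPy. A minimal sketch (illustrative values; time dummies omitted for brevity):

```python
# Within (fixed-effects) estimator sketch for general T (illustrative values).
import numpy as np

rng = np.random.default_rng(3)
N, T = 1_000, 5
beta = 1.5

a = rng.normal(size=N)                   # fixed effect, correlated with x
x = a[:, None] + rng.normal(size=(N, T))
u = rng.normal(size=(N, T))
y = 2.0 + beta * x + a[:, None] + u      # time dummies omitted for brevity

# Within transformation: subtract each individual's time mean.
y_dm = y - y.mean(axis=1, keepdims=True)
x_dm = x - x.mean(axis=1, keepdims=True)

# OLS on demeaned data (no intercept needed: demeaning removes it).
beta_hat = (x_dm.ravel() @ y_dm.ravel()) / (x_dm.ravel() @ x_dm.ravel())
print(beta_hat)   # close to beta = 1.5; a_i is removed by demeaning
```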
29

Estimation method 4: Random Effects


• The Random Effects estimator assumes that any unobserved
heterogeneity, ai, is uncorrelated with the x’s.
▫ OK, why not then just run OLS?

• You can, but the fact that ai is constant over time means that there is
serial correlation within each i.

• The composite error term is η_it = a_i + u_it.

• Assume t = 0, 1. The correlation of the composite errors across time
follows from:

   Cov(η_i0, η_i1) = Cov(a_i + u_i0, a_i + u_i1)
                   = Cov(a_i, a_i)
                   = Var(a_i) > 0
30

Random Effects – cont.


• You can fix the correlation caused by ai by modelling the correlation.
▫ This is what the Random Effects estimator does.

• For a general panel dataset, you can show (but we won’t)...


• ...that the correlation between any η_is and η_it (for a given i, with s ≠ t) is:

   Corr(η_is, η_it) = σ_a² / (σ_a² + σ_u²)

where η_it = a_i + u_it,  σ_a² = Var(a_i) and σ_u² = Var(u_it).

• Note that the pooled OLS ignores this serial correlation.


▫ Therefore, the usual standard errors from the pooled OLS are incorrect.
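• The serial-correlation formula can be checked by simulation. A NumPy sketch (variance values are illustrative): the sample correlation of the composite errors across two periods matches σ_a² / (σ_a² + σ_u²).

```python
# Checking Corr(eta_is, eta_it) = sigma_a^2 / (sigma_a^2 + sigma_u^2).
import numpy as np

rng = np.random.default_rng(4)
N = 200_000
sigma_a, sigma_u = 1.0, 2.0              # illustrative standard deviations

a = rng.normal(scale=sigma_a, size=N)
eta0 = a + rng.normal(scale=sigma_u, size=N)   # eta_i0 = a_i + u_i0
eta1 = a + rng.normal(scale=sigma_u, size=N)   # eta_i1 = a_i + u_i1

sample_corr = np.corrcoef(eta0, eta1)[0, 1]
theory = sigma_a**2 / (sigma_a**2 + sigma_u**2)
print(sample_corr, theory)   # both about 0.2
```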
31

Random Effects – cont.

• The random effects estimator exploits this correlation to create a


transformed equation.
▫ whose error term does not have any serial correlation within i.

• To do so, let

   λ = 1 − √( σ_u² / (σ_u² + T σ_a²) )

• Then the Random Effects model is:

   y_it − λȳ_i = α(1 − λ) + β(x_it − λx̄_i) + δ(d_t − λd̄) + (η_it − λη̄_i)    (6)


32

Random Effects – cont.

   y_it − λȳ_i = α(1 − λ) + β(x_it − λx̄_i) + δ(d_t − λd̄) + (η_it − λη̄_i)    (6)

• Equation (6) involves quasi-mean deviations of each variable.

• In practice, we never know λ. Instead, we use its estimate λ̂.

• The OLS estimator of equation (6), using λ̂ in place of λ, is called the
Random Effects (RE) estimator.

• Equation (6) allows for x that are constant over time.


▫ This is an advantage of RE over FE or FD.
▫ This is possible because RE assumes ai is uncorrelated with all x’s.
 though the assumption is quite unlikely to hold in reality…
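• The quasi-demeaning weight λ is easy to compute and inspect. A minimal NumPy sketch (the variance values and data are illustrative, not from the wage example):

```python
# Quasi-demeaning sketch: lambda = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T*sigma_a^2)).
import numpy as np

def re_lambda(sigma_u2, sigma_a2, T):
    """RE quasi-demeaning weight; 0 gives pooled OLS, 1 gives fixed effects."""
    return 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

lam = re_lambda(sigma_u2=1.0, sigma_a2=1.0, T=8)   # illustrative variances
print(lam)                      # strictly between 0 and 1

# Quasi-demean a variable: x_it - lambda * xbar_i (illustrative data).
rng = np.random.default_rng(5)
x = rng.normal(size=(100, 8))   # N = 100 individuals, T = 8 periods
x_qd = x - lam * x.mean(axis=1, keepdims=True)

print(re_lambda(1.0, 0.0, 8))   # sigma_a^2 = 0 gives lambda = 0 (pooled OLS)
```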
33

Relationships among estimators


   y_it − λȳ_i = α(1 − λ) + β(x_it − λx̄_i) + δ(d_t − λd̄) + (η_it − λη̄_i)    (6)
where η_it = a_i + u_it.

• In order to relate RE to OLS and FE, recall that:

   λ = 1 − √( σ_u² / (σ_u² + T σ_a²) )

▫ when λ = 0, we get the estimating equation of pooled OLS.
▫ when λ = 1, we get the estimating equation of FE.

• In practice, λ̂ is between 0 and 1.


▫ when λ is close to 0, the RE estimates will be close to the pooled OLS
estimates.
▫ when λ is close to 1, the RE estimates will be close to the FE estimates.
34

Example
• Let’s consider a wage equation for men:

   ln(wage_it) = β_0 + β_1 educ_i + β_2 black_i + β_3 hispanic_i
                 + β_4 exper_it + β_5 exper_it² + β_6 married_it
                 + β_7 union_it + a_i + u_it


where…
educ: years of schooling;
black: a dummy =1 if black, = 0 otherwise;
hispanic : a dummy =1 if hispanic, = 0 otherwise;
exper: years of labour market experience;
exper^2: squared exper;
married: a dummy =1 if married, = 0 otherwise;
union: a dummy =1 if in union, = 0 otherwise.

• Estimate the wage equation using observations on 545 males for the
years 1980-1987 in the US.
35

• We use 4 methods:
Pooled OLS, Random Effects, Fixed Effects and First Differences.
                 (1)           (2)           (3)           (4)
VARIABLES     pooled OLS       RE            FE            FD

educ          0.0994***     0.101***        -             -
             (0.00468)     (0.00891)
black        -0.144***     -0.144***        -             -
             (0.0236)      (0.0476)
hisp          0.0157        0.0202          -             -
             (0.0208)      (0.0426)
exper         0.0892***     0.112***      0.117***        -
             (0.0101)      (0.00826)     (0.00842)
expersq      -0.00285***   -0.00407***   -0.00430***   -0.00388***
             (0.000707)    (0.000592)    (0.000605)    (0.00139)
married       0.108***      0.0628***     0.0453**      0.0381*
             (0.0157)      (0.0168)      (0.0183)      (0.0229)
union         0.180***      0.107***      0.0821***     0.0428**
             (0.0171)      (0.0178)      (0.0193)      (0.0197)

Observations  4,360         4,360         4,360         3,815
36

Interpretation of the quadratic term


• The estimation results show that the quadratic term, exper^2, is
significant.

• What does this mean?
  [Diagram: effect of experience on wage, a concave profile.]
• Answer: wages rise with experience, but at a decreasing rate.

• Using the FE estimates, we can compute the years of experience that
maximise the (log of) wage:

   ln(wage) = 1.065 + 0.117 exper − 0.0043 exper² + …

   ∂ln(wage)/∂exper = 0.117 − 2 × 0.0043 exper = 0

   exper = 0.117 / (2 × 0.0043) ≈ 13.60
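• The turning-point calculation above can be verified with a one-line computation, using the FE estimates shown:

```python
# Turning point of the quadratic experience profile (FE estimates from the slide).
b_exper = 0.117       # coefficient on exper
b_expersq = -0.0043   # coefficient on exper^2

# Set the derivative of ln(wage) w.r.t. exper to zero and solve.
exper_star = -b_exper / (2 * b_expersq)
print(exper_star)     # about 13.6 years of experience
```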
37

Estimation methods: Which to use?


• OK, I just showed you a bunch of different estimation methods. The
next question is..

Which one should you use?

• There is a tradeoff:

▫ If ai is uncorrelated with all the x’s,


 the Random Effect estimator is the best.
 because it is efficient.

▫ If instead ai is correlated with at least one of the x’s,


 the Random Effects estimator is biased and inconsistent!
 and the Fixed Effects estimator is the best.
 because ai gets swept away by within transformation.
38

Which one to use? – cont.


• That’s great but it still doesn’t tell you which one to use!

• What you need is a test of the correlation between ai and the x’s.

• Fortunately, there is a way to test it, called the Hausman test.

• The basic idea of the Hausman test:


1. Compare the difference in the estimates between the FE and the RE
estimators.
2. If they are similar, then the assumption underlying the RE estimator,
 i.e., ai is uncorrelated with all the x’s
is not a bad one.
3. Thus use the RE estimator if the RE and FE estimates are “close
enough” and use the FE estimator otherwise.

• Let’s see it concretely..


39

Hausman test

• Consider the model:


   y_it = α + β x_it + a_i + u_it
• The null hypothesis to be tested is:

H0: ai and xit are not correlated


H1: otherwise

• The RE estimator is
▫ unbiased under H0.
▫ biased under H1.

• The FE estimator is
▫ unbiased both under H0 and H1.
40

Hausman test – cont.


• Using the difference (β̂_FE − β̂_RE), the Hausman test statistic is:

   Hausman = (β̂_FE − β̂_RE)′ [V̂(β̂_FE) − V̂(β̂_RE)]⁻¹ (β̂_FE − β̂_RE)

• Hausman is asymptotically distributed as χ² with (k + 1) degrees of
freedom, where k is the number of explanatory variables in the model.

• Remember: an important reason why the two estimators would be


different is the existence of correlation between ai and xit.

• Decision rule:
▫ Use the RE estimator if H0 is not rejected.
▫ Use the FE estimator if H0 is rejected.
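• As a sketch, the statistic can be computed directly from the two coefficient vectors and their covariance matrices. The numbers below are made up for illustration only; they are not the wage-equation results, and real covariance matrices would include off-diagonal terms:

```python
# Hausman statistic sketch: H = (b_FE - b_RE)' [V_FE - V_RE]^{-1} (b_FE - b_RE).
# All numbers are made-up illustrative values, not estimates from the example.
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Hausman test statistic; compare to a chi-squared critical value."""
    diff = b_fe - b_re
    return float(diff @ np.linalg.inv(V_fe - V_re) @ diff)

b_fe = np.array([0.117, -0.0043])
b_re = np.array([0.112, -0.0041])
V_fe = np.diag([8.4e-5, 3.7e-7])   # FE is less efficient: larger variances
V_re = np.diag([6.8e-5, 3.5e-7])

H = hausman(b_fe, b_re, V_fe, V_re)
print(H)   # reject H0 when H exceeds the chi-squared critical value
```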
41

Example
• Consider again the wage equation for men:
   ln(wage_it) = β_0 + β_1 educ_i + β_2 black_i + β_3 hispanic_i
                 + β_4 exper_it + β_5 exper_it² + β_6 married_it
                 + β_7 union_it + a_i + u_it

• Estimating the equation by the RE and FE estimators yields β̂_RE and β̂_FE:

• The Hausman test statistic is:


42

Example – cont.
• The Hausman test statistic is:

• Do we reject the null hypothesis?


• Answer:

• Which estimator should we use then?


43

Summary
1. What is panel data?
▫ time dummies, unobserved heterogeneity

2. What is unobserved heterogeneity bias?

3. How to estimate the parameters in a panel data model?


▫ Pooled OLS
▫ First Differences
▫ Fixed Effects
▫ Random Effects

4. Which estimation methods to use?


▫ Hausman test
