
Pooled Cross-Section Time Series Data

Wooldridge Chapters 13 and 14
Types of Data
• Pooled Cross Sections: independent cross-section data at different points in time.
• Panel / Longitudinal: uniquely identified cross-section units (i) followed over time.
  – Balanced Panel: all i appear in every period.
  – Unbalanced Panel: some i are missing for some time periods.
Example: Two Period Panel Data
N=4, T=2
i t Consumption (Y) Income (X)
1 1 72 98
1 2 75 102
2 1 31 40
2 2 26 39
3 1 55 66
3 2 62 70
4 1 41 59
4 2 45 60
Yit = B0 + B1Xit + eit

B1 = 0.75 in the pooled regression, but how to interpret?

[Figure: scatter of y against x with the pooled OLS fitted line]
Interpreting Coefficients
• Yit = B0 + B1Xit + eit

B1 = ΔYit/ΔXit = (Yit − Yjt)/(Xit − Xjt): the change in Y across individuals i and j at time t.

B1 = ΔYit/ΔXit = (Yi,t+1 − Yit)/(Xi,t+1 − Xit): the change in Y over time for a given individual.
Use intercept dummies to differentiate between “time” and “type” effects
• Time Dummies: the effect of being in time period 2 vs. time period 1 on the expected value of Yit, holding all else constant.
• Type Dummies: the effect of being of type B vs. type A on the expected value of Yit, holding all else constant.
Time Dummies
• Let D2,t = 0 if t = 1
           = 1 if t = 2

Yit = B0 + δT·D2,t + eit

δT = (Ȳ2 − Ȳ1), where Ȳ2 is the mean of Y at time 2 across all i.
Example: Two Period Panel Data with Time Dummy
i t DT (Yit) (Xit)
1 1 0 72 98
1 2 1 75 102
2 1 0 31 40
2 2 1 26 39
3 1 0 55 66
3 2 1 62 70
4 1 0 41 59
4 2 1 45 60
Time Dummy Example
sum y if t==1

 Variable | Obs  Mean
 ---------+-----------
 y        |   4  49.75

sum y if t==2

 Variable | Obs  Mean
 ---------+-----------
 y        |   4  52

reg y dt
Coeff = 52 − 49.75 = 2.25
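As a check on this output, the period means and the time-dummy coefficient can be reproduced by hand; a minimal Python sketch (illustrative helper code, not part of the original Stata session):

```python
# Example panel data from the slides: (i, t, y)
data = [(1, 1, 72), (1, 2, 75), (2, 1, 31), (2, 2, 26),
        (3, 1, 55), (3, 2, 62), (4, 1, 41), (4, 2, 45)]

# Mean of y in each period
mean_t1 = sum(y for i, t, y in data if t == 1) / 4   # 49.75
mean_t2 = sum(y for i, t, y in data if t == 2) / 4   # 52.0

# The time-dummy coefficient equals the difference in period means
delta_T = mean_t2 - mean_t1                          # 2.25
print(mean_t1, mean_t2, delta_T)
```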
Time Dummy represents the shift of the regression line from period 1 to period 2. When Yit is regressed on the dummy along with Xit:

[Figure: fitted lines for t=1 and t=2 against x; the vertical shift between them is δT = 2.25]
Type Dummies
• Separate the cross-sectional dimension of the sample into qualitative “types” (e.g. male vs. female, rural vs. urban, foreign vs. domestic, treatment vs. control, etc.)
• Let DiB = 1 if individual i is Type B
         = 0 otherwise.

Yit = B0 + δB·DiB + eit,  δB = (ȲB − ȲA)

When Xit is included in the regression, δB represents the shift in the intercept.
Example: Two Period Panel Data with Type Dummy
i t Type DB (Yit) (Xit)
1 1 A 0 72 98
1 2 A 0 75 102
2 1 B 1 31 40
2 2 B 1 26 39
3 1 B 1 55 66
3 2 B 1 62 70
4 1 A 0 41 59
4 2 A 0 45 60
From Simple Example
reg y db
 y     | Coef.   Std. Err.  t
 db    | -14.75  12.51582  -1.18
 _cons | 58.25   8.850024   6.58

sum y if db==1
 Variable | Obs  Mean
 y        |   4  43.5

sum y if db==0
 Variable | Obs  Mean
 y        |   4  58.25

Coefficient = difference in means = 43.5 − 58.25 = −14.75
Type Dummy represents the shift of the regression line from type A to type B. When Yit is regressed on the dummy along with Xit:

[Figure: parallel fitted lines for type A and type B against x; the vertical shift between them is δB = −14.25]
Difference-in-Differences Estimator
• Estimates the difference across types, and over time, using a simple dummy variable framework.
• Excellent for policy analysis. Takes advantage of the “natural experiment” quality of panel data.
• Can be expanded beyond the two-period framework.
• Examples: stadium construction, natural disaster, water treatment facility, tax cuts.
Use interaction term between type and time dummies.

Yit = B0 + B1Xit + δ0·D2,t + δ1·DB,i + δDD·(D2,t × DB,i) + eit

δDD = (ȲB,2 − ȲA,2) − (ȲB,1 − ȲA,1)
    = Difference “After” − Difference “Before”
Difference Coefficient
• Also known as the “Average Treatment Effect”.
• Can also be written as:

δDD = (ȲB,2 − ȲB,1) − (ȲA,2 − ȲA,1)
    = Treatment impact on the ‘treated’ − Treatment impact on the control group.
D-in-D example
i t Type DB D2T DB*D2T (Yit) (Xit)
1 1 A 0 0 0 72 98
1 2 A 0 1 0 75 102
2 1 B 1 0 0 31 40
2 2 B 1 1 1 26 39
3 1 B 1 0 0 55 66
3 2 B 1 1 1 62 70
4 1 A 0 0 0 41 59
4 2 A 0 1 0 45 60
From simple example
reg y db d2 dd
 y     | Coef.  Std. Err.  t
 ------+---------------------
 db    | -13.5  21.6015   -0.62
 d2    |  3.5   21.6015    0.16
 dd    | -2.5   30.54914  -0.08
 _cons | 56.5   15.27457   3.70

Mean of y for type b when t=2: 44.00
Mean of y for type a when t=2: 60.00
Mean of y for type b when t=1: 43.00
Mean of y for type a when t=1: 56.50

Coefficient = (44.00 − 60.00) − (43.00 − 56.50)
            = (−16) − (−13.5) = −2.5
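The same arithmetic can be verified directly from the group means; an illustrative Python sketch (not part of the slides' Stata session):

```python
# Group means from the example panel: (i, t, type, y)
data = [(1, 1, 'A', 72), (1, 2, 'A', 75), (2, 1, 'B', 31), (2, 2, 'B', 26),
        (3, 1, 'B', 55), (3, 2, 'B', 62), (4, 1, 'A', 41), (4, 2, 'A', 45)]

def mean(group, period):
    ys = [y for i, t, g, y in data if g == group and t == period]
    return sum(ys) / len(ys)

# Difference-in-differences: the "after" difference minus the "before" difference
dd = (mean('B', 2) - mean('A', 2)) - (mean('B', 1) - mean('A', 1))
print(dd)  # -2.5
```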
How much more did the treatment group (B) outcome increase than the control group (A) from time 1 to time 2?

δDD = (ȲB,2 − ȲB,1) − (ȲA,2 − ȲA,1)

[Figure: group means ȲB,1, ȲB,2, ȲA,1, ȲA,2 plotted at t=1 and t=2]
Panel Data Problem! Unobserved Heterogeneity
• There exist characteristics of each individual that persist over time which cannot be included in the regression (unobservable in available data), but which nonetheless impact the observed variation in our dependent variable.
Composite Errors
• These time-invariant unobserved effects are best modeled as a component in the regression error term.
• It is this “composite error” approach that sets apart panel regression from OLS.
Examples
• Unobservable motivational skills of a firm manager in a production function.
• Skills, charisma, connections, nepotism in a wage model.
• Levels of unobserved macro-level institutional corruption or inefficiency in a cross-sectional growth model.
The Composite Error Model
• Yit = B0 + B1Xit + vit
• where vit = uit + ai is the composite error, and…
• uit is the random, time-varying idiosyncratic error.
• ai is the time-invariant error component.
The Composite Error Problems
• 1.) If COV(ai, Xit) ≠ 0, then OLS estimates will be biased.
• Very much like simultaneous equations (endogeneity) bias, but here covariance with the error term will only involve cross-sectional variation.
Composite Error Bias

Yit = B0 + B1Xit + ai + uit

B̂1 = B1 + Σ(Xit − X̄)(ai + uit) / Σ(Xit − X̄)²

E(B̂1) = B1 + E[Σ(Xit − X̄)ai] / Σ(Xit − X̄)² + E[Σ(Xit − X̄)uit] / Σ(Xit − X̄)²

so E(B̂1) differs from B1 through terms proportional to COV(Xit, ai) and COV(Xit, uit).
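The bias can be illustrated numerically. The Python sketch below uses made-up data (all values are hypothetical, chosen so a_i is correlated with x_it): pooled OLS overstates the true slope of 0.5, while the de-meaning transformation used later in these slides recovers it.

```python
# Hypothetical panel: a_i correlated with x_it, true slope beta = 0.5
N, T, beta = 50, 2, 0.5
rows = []
for i in range(1, N + 1):
    a = i / N                      # time-invariant effect
    for t in (1, 2):
        x = 2 * a + 0.1 * t        # x_it depends on a_i -> COV(x, a) != 0
        y = 1 + beta * x + a       # no idiosyncratic error, for clarity
        rows.append((i, x, y))

def ols_slope(pairs):
    xs, ys = zip(*pairs)
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx

pooled = ols_slope([(x, y) for _, x, y in rows])   # biased upward

# The within (de-meaning) transformation eliminates a_i
demeaned = []
for i in range(1, N + 1):
    sub = [(x, y) for j, x, y in rows if j == i]
    xbar = sum(x for x, _ in sub) / T
    ybar = sum(y for _, y in sub) / T
    demeaned += [(x - xbar, y - ybar) for x, y in sub]
fe = ols_slope(demeaned)                            # recovers beta = 0.5
print(round(pooled, 3), round(fe, 3))
```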
Examples:
1. Manager charisma correlated with firm size in a production function.
2. Nepotism/networking correlated with education in a wage equation.
3. Institutional quality associated with development in a corruption equation.
• 2.) Since ai represents a time-invariant component of the error, composite errors will be correlated over time.
• Serial correlation is the result: Corr(vit, vi,t+s) ≠ 0
• Estimates will not be biased, but goodness of fit and significance of coefficients will be overstated.
How to deal with the Composite Error problem?
• Pooled OLS – do nothing about it.
• First Difference – eliminate ai.
• Dummy Variables – estimate the ai when N is small.
• Fixed Effects – estimate ai when N is large.
• Random Effects – account for serial correlation.
First Difference Transformation (two period panel) with Time Dummy
Yit = B0 + δ0D2,t + B1Xit + ai + uit
• For Period 2: Yi2 = (B0 + δ0) + B1Xi2 + ai + ui2
• For Period 1: Yi1 = B0 + B1Xi1 + ai + ui1
• First Difference: ΔYi = Yi2 − Yi1

ΔYi = δ0 + B1(Xi2 − Xi1) + (ui2 − ui1)
ΔYi = δ0 + B1ΔXi + Δui
First Difference
• Transformation eliminates the ai terms.
• Corrects for heterogeneity bias and serial correlation.
• Problems:
  1. Eliminates all time-invariant variables (type dummies).
  2. Eliminates the time dimension in a two-period panel (reduces T by 1 in general).
“Type” Dummy Variables for each i
• If the ai terms are viewed as coefficients to be estimated, a dummy can be constructed that uniquely identifies each individual in the sample.
• The dummy coefficient will represent the effect of the sum of all unobserved attributes.
Type Dummies
• Solves the ‘time-invariant bias’ problem by removing ai from the error component and directly estimating the effects.
• The obvious problem is that degrees of freedom are vastly reduced. Requires a large number of time periods relative to cross-sectional units.
Example: 4 country panel over 250 months
Step 1: append the separate country data files:
use c:/stata627/nfa/canada.dta
append using c:/stata627/nfa/italy.dta
append using c:/stata627/nfa/japan.dta
append using c:/stata627/nfa/uk.dta
tsset code time
Dummy Example – estimates of ai
xi:reg y cpi r er i.code
i.code _Icode_1-5 (naturally coded; _Icode_1 omitted)

Number of obs = 990
Prob > F = 0.0000
R-squared = 0.7464
Adj R-squared = 0.7448
------------------------------------------------------
y | Coef. Std. Err. t P>|t|
-------------+----------------------------------------
cpi | .3817633 .0117365 32.53 0.000
r | -.4944136 .0780945 -6.33 0.000
er | -.0196729 .0014589 -13.49 0.000
_Icode_3 | 26.41765 2.053128 12.87 0.000
_Icode_4 | -12.51685 .6298041 -19.87 0.000
_Icode_5 | -1.729212 .5753217 -3.01 0.003
_cons | 67.36739 1.632653 41.26 0.000
-------------------------------------------------------
• Code 1 = Canada, omitted
• Code 3 = Italy, positive estimate of ai
• Code 4 = Japan, negative ai
• Code 5 = UK, negative ai
Fixed Effects
• Assume CORR(ai, Xit) ≠ 0, but CORR(uit, Xit) = 0.
• An alternative to the first difference transformation is the “time de-meaning” transformation of the fixed effects model.
• Results in a model essentially identical to the Dummy model, without having to estimate N−1 dummy coefficients.
Fixed Effects Transformation

(1) yit = β0 + β1xit + ai + uit   (original model)

ȳi = average for each individual over time = (1/T) Σt yit

(2) ȳi = β0 + β1x̄i + ai + ūi

(2) is the “between” equation: it only shows variation between individuals, taking out the time element.

Subtract the mean from the level each time period:
yit − ȳi = (β0 − β0) + β1(xit − x̄i) + (ai − ai) + (uit − ūi)

(3) (yit − ȳi) = β1(xit − x̄i) + (uit − ūi), i.e. ÿit = β1ẍit + üit
Fixed Effects Regression is equivalent to running OLS on Equation 3:

(3) (yit − ȳi) = β1(xit − x̄i) + (uit − ūi), i.e. ÿit = β1ẍit + üit

Since CORR(ẍit, üit) = 0, β̂1FE is unbiased.

This is also known as the “within” estimation equation, as it shows the variation within a group over time.
Fixed Effects Coefficients
• Will have the same “two-dimension” interpretation as pooled OLS.
• Variation in the transformed variables is the same as in Yit and Xit:

B1 = ΔŸit / ΔẌit = ΔYit / ΔXit
Fixed Effects Transformation With Time-Invariant Dummy Independent Variable

(1) yit = β0 + β1xit + δ0Di + ai + uit,  Di ∈ {0, 1} and time-invariant for each i

D̄i = average for each individual over time = (1/T) Σt Di = (1/T)·T·Di = Di

(2) ȳi = β0 + β1x̄i + δ0Di + ai + ūi

yit − ȳi = (β0 − β0) + β1(xit − x̄i) + δ0(Di − Di) + (ai − ai) + (uit − ūi)

(3) (yit − ȳi) = β1(xit − x̄i) + (uit − ūi), i.e. ÿit = β1ẍit + üit

Problem: both ai and Di are eliminated.
Example: Two Period Panel Data
N=4, T=2
i t (Yit) Ȳi (Yit − Ȳi) = Ÿit
1 1 72 73.5 -1.5
1 2 75 73.5 1.5
2 1 31 28.5 2.5
2 2 26 28.5 -2.5
3 1 55 58.5 -3.5
3 2 62 58.5 3.5
4 1 41 43 -2
4 2 45 43 2
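The de-meaned column of this table can be reproduced directly; an illustrative Python sketch:

```python
# Two-period example data: (i, t, y)
data = [(1, 1, 72), (1, 2, 75), (2, 1, 31), (2, 2, 26),
        (3, 1, 55), (3, 2, 62), (4, 1, 41), (4, 2, 45)]

# Per-individual means over time (T = 2)
ybar = {i: sum(y for j, t, y in data if j == i) / 2 for i in range(1, 5)}

# The within transformation: deviation of each observation from its own mean
demeaned = [(i, t, y, ybar[i], y - ybar[i]) for i, t, y in data]
for row in demeaned:
    print(row)
# e.g. individual 1: mean 73.5, deviations -1.5 and 1.5
```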
Goodness of Fit
• A fixed effects regression returns three “R-square” measures. They are each actually squared correlations between predicted and observed values:
• 1. Within R²: fitted vs. de-meaned yit
• 2. Between R²: fitted vs. ȳi
• 3. Overall R²: fitted vs. yit (pooled OLS)
Panel Regressions in Stata
• XT = cross-section time series.
• “xtreg y x, fe” will run a panel fixed effects regression.
• Must declare your “i” and “t” identifiers: tsset code time, for example.
• Unfortunately, Stata refers to the time-invariant error component (our ai) as u_i.
Fixed Effects Stata Example
xtreg y cpi r er,fe

Fixed-effects (within) regression Number of obs = 990
Group variable (i): code Number of groups = 4

R-sq: within = 0.7071 Obs per group: min = 244
between = 0.0335 avg = 247.5
overall = 0.1827 max = 250

F(3,983) = 791.14
corr(u_i, Xb) = -0.7495 Prob > F = 0.0000

------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cpi | .3817633 .0117365 32.53 0.000 .3587318 .4047948
r | -.4944136 .0780945 -6.33 0.000 -.6476647 -.3411625
er | -.0196729 .0014589 -13.49 0.000 -.0225358 -.0168101
_cons | 70.49544 1.529625 46.09 0.000 67.49374 73.49715
-------------+----------------------------------------------------------------
sigma_u | 16.538008 (std. error of time-invariant error)
sigma_e | 6.3818613 (std. error of idiosyncratic error)
rho | .87038904 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3, 983) = 362.02 Prob > F = 0.0000
Random Effects
• Assumes CORR(ai, Xit) = 0.
• Therefore, OLS coefficients will not suffer “composite error bias”, as was assumed with Fixed Effects.
• We do not need to eliminate the ai terms.
• Although the ai terms do not truly have to be “randomly” assigned, there is no structural relationship between ai and Xit in a correctly specified model.
Random Effects
• Even when CORR(ai, Xit) = 0, we still have to account for the serial correlation introduced by the ai error component.
• A “quasi-demeaned” data transformation is used to accomplish this, wherein the ai are altered but not eliminated.
• A bonus is that time-invariant dummies are not eliminated.
Random Effects Assumptions
• 1. E(ai | Xit) = E(ai) = 0: independence of the ai’s and X’s; cov(ai, Xit) = 0.
• 2. E(uit | Xit, ai) = 0
• 3. E(uituis) = cov(uit, uis) = 0 for all t ≠ s
• 4. E(uit² | Xit, ai) = σu² = constant
• 5. E(ai² | Xit) = Var(ai) = σa²
Random Effects
• Under the preceding criteria, the composite error does not violate OLS assumptions.
• Unnecessarily eliminating the ai terms will cause estimates to be inefficient.
• Don’t use Fixed Effects unless warranted.
Random Effects
• However, running pooled OLS will not be appropriate because the composite errors are still serially correlated over time.
• It can be shown that:

corr(vit, vis) = σa² / (σa² + σu²),  t ≠ s

where, again: vit = uit + ai
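This formula can be checked against the variance estimates reported in the fixed-effects Stata example above (sigma_u = 16.538008, sigma_e = 6.3818613, rho = .87038904); note Stata's sigma_u is our σa and its sigma_e is our σu. A Python sketch:

```python
# Variance components taken from the fixed-effects Stata output in these slides
sigma_a = 16.538008   # Stata's sigma_u: std. dev. of the time-invariant a_i
sigma_u = 6.3818613   # Stata's sigma_e: std. dev. of the idiosyncratic u_it

# Serial correlation of the composite errors v_it = u_it + a_i
rho = sigma_a ** 2 / (sigma_a ** 2 + sigma_u ** 2)
print(round(rho, 4))  # 0.8704 -- matches the reported rho of .87038904
```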
Random Effects
 Random effects transformation is
more complicated than FD or FE, but
basic idea is to eliminate serial
correlation in the error term by using
information on variances of fixed and
idiosyncratic errors.
Random Effects (RE)
• The transformation results in a weighted average of the estimates provided by the “within” and “between” estimators.
RE Transformation

(1) yit = β0 + β1xit + ai + uit

Define a weighted average:
λȳi = λβ0 + λβ1x̄i + λai + λūi, then subtract from (1):

(yit − λȳi) = (1 − λ)β0 + β1(xit − λx̄i) + (1 − λ)ai + (uit − λūi)

where λ̂i = 1 − [ σ̂u² / (Ti σ̂a² + σ̂u²) ]^(1/2)
ˆ u
ˆ 2
Given i  1 
Ti a   u
ˆ 2
ˆ 2

 It can be shown that the composite error


term vit augmented by the weighting
term  (lambda) will NOT suffer from
serial correlation.

Corr(vit, vis) = 0
53
ˆ u ˆ 2
Given i  1 
Tiˆ a2  ˆ u2
NOTE:
 If var(a ) = 0, meaning a is always zero
i i
(no time-invariant effects), then lambda
equals 0 and RE regression is equivalent
to Pooled OLS equation (1) - all lambda-
weighted terms drop out.

 As 2a dominates 2u, ai terms become


more important,  goes to 1, and
RE→FE.
54
RE Stata Example (N=4)
xtreg y cpi r er
Random-effects GLS regression Number of obs = 990
Group variable (i): code Number of groups = 4
R-sq: within = 0.6252 Obs per group: min = 244
between = 0.7702 avg = 247.5
overall = 0.4662 max = 250
Random effects u_i ~ Gaussian Wald chi2(3) = 861.17
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
----------------------------------------------------------
y | Coef. Std. Err. z P>|z|
-------------+--------------------------------------------
cpi | .3468475 .0158341 21.91 0.000
r | .0072637 .1077631 0.07 0.946
er | -.0002592 .0005834 -0.44 0.657
_cons | 61.17895 2.168505 28.21 0.000
-------------+--------------------------------------------
sigma_u | 0
sigma_e | 6.3818613
rho | 0 (fraction of variance due to u_i)
Fixed vs. Random Effects
• As a practical matter, Random Effects is preferred when key explanatory variables are time-invariant.
• The Fixed Effects view is that the unobserved heterogeneity is in itself an explanatory variable that ideally would have a coefficient to be estimated.
Fixed vs. Random Effects
 The Random Effects view is that
unobserved heterogeneity is
“randomly assigned” to each
cross sectional entity and not
correlated with other explanatory
variables.
When to use FE vs. RE? The Hausman Coefficient Test
• The logic of the test is the following:
  – If CORR(ai, Xit) ≠ 0, then RE is biased.
  – If CORR(ai, Xit) = 0, then both RE and FE are unbiased, but it can be shown that RE is more efficient (smaller standard errors of coefficients).
  – Therefore, if the FE coefficients are significantly different from the RE coefficients, then RE must be biased, so use FE.
  – If FE coefficients are not significantly different from RE, then neither is biased, so use RE.
General Hausman Test
• Test the equality of the vectors of coefficients:

β̂FE = (β̂1FE, β̂2FE, β̂3FE)′,  β̂RE = (β̂1RE, β̂2RE, β̂3RE)′

V(β̂) = vector of variance terms for each coefficient.

Test statistic:
H = (β̂FE − β̂RE)′ [V(β̂FE) − V(β̂RE)]⁻¹ (β̂FE − β̂RE)

H is distributed Chi-square with k degrees of freedom.
Single Coefficient Version
• If we are primarily interested in a single parameter, there is a t-statistic version of the Hausman test.
• Let B1FE and B1RE be the fixed- and random-effects coefficients for X1,it:

t = (B1FE − B1RE) / [se(B1FE)² − se(B1RE)²]^(1/2)

where t is asymptotically normally distributed.
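Applied to the `er` coefficient from the FE and RE Stata outputs shown in these slides, the formula reproduces the standard error of the difference that Stata's hausman command reports (.0013371); a Python sketch:

```python
import math

# Coefficients and standard errors for `er` from the FE and RE outputs
b_fe, se_fe = -0.0196729, 0.0014589
b_re, se_re = -0.0002592, 0.0005834

# Standard error of the difference, then the Hausman t-statistic
se_diff = math.sqrt(se_fe ** 2 - se_re ** 2)
t = (b_fe - b_re) / se_diff
print(round(se_diff, 7), round(t, 2))
```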
Note: Hausman Test Problem
 Most of the time the Hausman test
works fine, however…

 The test statistic is based on the


assumption that RE is more efficient
(estimates have a smaller variance)
than FE.

• While this can be shown to be asymptotically true, it may not hold for a given sample.
• If this is the case, then the test statistic is negative, and cannot be interpreted as a Chi-square.
• This is why it is important to type:
  hausman unbiased efficient
where ‘unbiased’ is the vector of FE coefficients and ‘efficient’ is the vector of RE coefficients.
Hausman Test Interpretation
• H0: FE = RE (difference in coefficients is NOT systematic)
• HA: FE ≠ RE.
• If H > critical value, we reject H0:
  – conclude that since FE ≠ RE,
  – Random Effects is biased, therefore
  – CORR(ai, Xit) ≠ 0, and
  – Fixed Effects is the appropriate model.
Hausman Test in Stata
xtreg y cpi r er,fe
estimates store fe
xtreg y cpi r er
estimates store re
hausman fe re
---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| fe re Difference S.E.
----+------------------------------------------------------------
cpi | .3817633 .3468475 .0349158 .
r | -.4944136 .0072637 -.5016774 .
er | -.0196729 -.0002592 -.0194137 .0013371
-----------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test: Ho: difference in coefficients not systematic

chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 162.38
Prob>chi2 = 0.0000
(V_b-V_B is not positive definite)

Reject H0 in this case, so go with Fixed Effects


Lagrange Multiplier Test for Random Effects
• Essentially, this is a derivation of a test for heteroskedasticity in a panel composite error setting, where vit = ai + uit.
• Assume var(uit) is constant, and uit is not correlated with Xit.
• Then any correlation between var(vit) and Xit must be due to the time-invariant error ai.
Stata Note for Panel Regressions
• You will notice that running FE / RE regressions with large N can be time consuming, which is really annoying during the specification search process.
• This is because each regression requires Stata to perform the ‘de-meaning’ transformation for each observation from the original data.
Stata Note
• The ‘xtdata’ command allows you to create a new data set of the transformed variables.
• Running OLS on the transformed variables is equivalent to the transformed FE/RE regression.
• Typing ‘xtdata y x1 x2, fe’ will create a new .dta file with the fixed-effect de-meaned values of the specified variables for each observation.
Extensions to Panel Regression
• 1.) 2SLS/IV with panel: xtivreg y x1 (x2=z), fe
• 2.) Cluster effects for cross-sectional data.
• 3.) Auto-correlated idiosyncratic errors (uit).
Extension 1: IV Panel
• When an independent variable is endogenous in a panel regression, each stage of the two-stage least squares process must take into account the composite error issue.
• i.e. the first stage and second stage will either be RE or FE regressions, depending on which is appropriate.
Yit = B0 + Xit + ai + uit

 The fixed effects transformation will


address the issue of
COV(Xit,ai) ≠ 0.

But what about when


COV(Xit,uit) ≠ 0?

70
Panel 2SLS

(1) yit = β0 + β1xit + ai + uit
CORR(xit, ai) ≠ 0
CORR(xit, uit) ≠ 0

Define variable zit such that
CORR(zit, ai) ≠ 0 but
CORR(zit, uit) = 0

Therefore zit will be exogenous in (1), but will require the fixed effects transformation to be used as an effective instrument.
First Stage FE

ẍit = θ0 + θ1z̈it + eit, where z̈it = (zit − z̄i)

θ̂1FE is unbiased.
Save the fitted, transformed values x̂it.
Second Stage FE

(1′) ÿit = β1x̂it + üit, where x̂it is the fitted value of ẍit from the first stage

CORR(x̂it, ai) = 0 (because of the umlaut, i.e. the de-meaning)
CORR(x̂it, uit) = 0 (because of the hat, i.e. the first-stage fit)

β̂1FE,2SLS will be unbiased.
Extension 2: Cluster Regression
• Allows for a Fixed Effects transformation with single-period cross-section data.
• “Cluster-” or “group-” invariant errors replace the “time-invariant” errors (ai).
• For example, there may be “within village effects” that will be the same for all households in Village A that differ from Village B.
• Often can be controlled for with “cluster dummy” variables.
Cross Section Cluster Example
Household (i) Village (j) Consumption (Yij) Income (Xij)
1 1 500 750
2 1 650 1000
3 1 475 725
1 2 600 700
2 2 625 750
3 2 550 600
1 3 575 1100
2 3 625 1200
3 3 600 1000
Cluster Regression
• Model: Xij = observation for household i in village j

Yij = B0 + B1Xij + aj + uij

• The analogy to the panel structure is that i acts like the time variable, and j acts like the cross-sectional identifier.
• Multiple observations for a given village j.
• aj is the “cluster-invariant error” or “village-level fixed effect”.
Fixed Effects for Cluster
• Again, if there is correlation between the “cluster-invariant” error (aj) and the independent variables (Xij), then the coefficient estimates will be biased.
• The Fixed Effects transformation eliminates the aj by subtracting the cluster mean from each observation:

(Yij − Ȳj) = B1(Xij − X̄j) + (aj − aj) + (uij − ūj)
Ȳj = village-level mean
Ÿij = B1FE·Ẍij + üij
Cluster Effects Transformation

i j y x ȳj ÿij x̄j ẍij
1 1 500 750 541.67 -41.67 825 -75
2 1 650 1000 541.67 108.33 825 175
3 1 475 725 541.67 -66.67 825 -100
1 2 600 700 591.67 8.33 683.33 16.67
2 2 625 750 591.67 33.33 683.33 66.67
3 2 550 600 591.67 -41.67 683.33 -83.33
1 3 575 1100 600 -25 1100 0
2 3 625 1200 600 25 1100 100
3 3 600 1000 600 0 1100 -100
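The transformed columns and the within slope can be reproduced from the raw village data; an illustrative Python sketch (the slope matches the coefficient reported in the regressions that follow):

```python
# Cluster example data: (household i, village j, y, x)
data = [(1, 1, 500, 750), (2, 1, 650, 1000), (3, 1, 475, 725),
        (1, 2, 600, 700), (2, 2, 625, 750), (3, 2, 550, 600),
        (1, 3, 575, 1100), (2, 3, 625, 1200), (3, 3, 600, 1000)]

# Village-level means (3 households per village)
ybar = {j: sum(y for _, v, y, _ in data if v == j) / 3 for j in (1, 2, 3)}
xbar = {j: sum(x for _, v, _, x in data if v == j) / 3 for j in (1, 2, 3)}

# De-mean within each village, then OLS through the origin ("within" estimator)
dev = [(y - ybar[j], x - xbar[j]) for _, j, y, x in data]
beta = sum(dy * dx for dy, dx in dev) / sum(dx ** 2 for _, dx in dev)
print(round(beta, 4))  # 0.4759
```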
Transformed OLS Regression
reg y_umlat x_umlat

Source | SS df MS Number of obs = 9
-------------+------------------------------ F( 1, 7) = 27.86
Model | 17649.2873 1 17649.2873 Prob > F = .0011
Residual | 4434.04639 7 633.435199 R-squared = .7992
-------------+------------------------------ Adj R-squared = .7705
Total | 22083.3337 8 2760.41671 Root MSE = 25.168

------------------------------------------------------------------------
y_umlat | Coef. Std. Err. t P>|t|
-------------+----------------------------------------------------------
x_umlat | .4759358 .0901646 5.28 0.001
_cons | 4.09e-07 8.38938 0.00 1.000
------------------------------------------------------------------------

FIXED EFFECTS
tsset j i
panel variable: j (strongly balanced)
time variable: i, 1 to 3
delta: 1 unit

xtreg y x,fe
Fixed-effects (within) regression Number of obs = 9
Group variable: j Number of groups = 3
within = 0.7992 Obs per group: min = 3
between = 0.0961 avg = 3.0
overall = 0.2517 max = 3

F(1,5) = 19.90
corr(u_i, Xb) = -0.8365 Prob > F = 0.0066
---------------------------------------------------------------------
y | Coef. Std. Err. t P>|t|
-------------+-------------------------------------------------------
x | .4759358 .1066842 4.46 0.007
_cons | 163.978 93.28558 1.76 0.139
-------------+-------------------------------------------------------
sigma_u | 95.865925
sigma_e | 29.779343
rho | .91199744 (fraction of variance due to u_i)
---------------------------------------------------------------------
F test that all u_i=0: F(2, 5) = 9.34 Prob > F = 0.0205
Cluster (village) Dummies
xi:reg y x i.j

i.j _Ij_1-3 (naturally coded; _Ij_1 omitted)

Source | SS df MS Number of obs = 9
-------------+------------------------------ F( 3, 5) = 8.88
Model | 23621.5092 3 7873.8364 Prob > F = 0.0191
Residual | 4434.04635 5 886.809269 R-squared = 0.8420
-------------+------------------------------ Adj R-squared = 0.7471
Total | 28055.5556 8 3506.94444 Root MSE = 29.779

-----------------------------------------------------------------------
y | Coef. Std. Err. t P>|t|
-------------+---------------------------------------------------------
x | .4759358 .1066842 4.46 0.007
Vlg2 _Ij_2 | 117.4242 28.62912 4.10 0.009
Vlg3 _Ij_3 | -72.54902 38.10424 -1.90 0.115
Vlg1 _cons | 149.0196 89.678 1.66 0.157
-----------------------------------------------------------------------
“predict ai, u” to view the estimated ai

i j _Ij_2 _Ij_3 ai
1 1 0 0 -14.9584
2 1 0 0 -14.9584
3 1 0 0 -14.9584
1 2 1 0 102.4658
2 2 1 0 102.4658
3 2 1 0 102.4658
1 3 0 1 -87.5074
2 3 0 1 -87.5074
3 3 0 1 -87.5074
Aside. . . ”xtdes” command
xtdes
j: 1, 2, ..., 3 n = 3
i: 1, 2, ..., 3 T = 3
Delta(i) = 1 unit
Span(i) = 3 periods
(j*i uniquely identifies each observation)

Distribution of T_i:
min 5% 25% 50% 75% 95% max
3 3 3 3 3 3 3

Freq. Percent Cum. | Pattern


---------------------------+---------
3 100.00 100.00 | 111
---------------------------+---------
3 100.00 | XXX

Extension 3: Autocorrelation of uit’s
• The Random Effects transformation eliminated autocorrelation amongst composite errors due to the presence of ai.
• Fixed Effects eliminated autocorrelation due to ai by eliminating the time-invariant error.
• What if, in addition, uit is autocorrelated? RE or FE alone will not address the issue.
Panel FE Regression with AC

(1) yit = β0 + β1xit + ai + uit
CORR(xit, ai) ≠ 0
uit = ρ·ui,t−1 + εit,  −1 < ρ < 1,  εit ~ N(0, σε²)

(2) ÿit = β1ẍit + ρ·ui,t−1 + εit − ūi

(2.1) ÿit = β1ẍit + ρ·(ui,t−1 − ūi,t−1) + ε̈it
• Equation (2.1) is now a linear AR(1) model.
• To solve, we need to use the Cochrane-Orcutt method of estimating ρ, then use the generalized difference equation to eliminate the term:

ρ·(ui,t−1 − ūi,t−1)
STATA to the rescue again!
• The command: xtregar y x, fe
• will simultaneously transform the data to eliminate the ai terms AND estimate ρ AND provide consistent standard errors with the generalized difference equation.
Xtregar Example from 4 country panel
xtregar y r cpi er,fe

FE (within) regression with AR(1) disturbances Number of obs = 986
Group variable: code Number of groups = 4

R-sq: within = 0.0155 Obs per group: min = 243
between = 0.5840 avg = 246.5
overall = 0.4567 max = 249

F(3,979) = 5.13
corr(u_i, Xb) = -0.1308 Prob > F = 0.0016
------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t|
-------------+----------------------------------------------------------
r | -.0362285 .0875633 -0.41 0.679
cpi | .2832925 .076438 3.71 0.000
er | .0015201 .0029347 0.52 0.605
_cons | 68.766 .2288196 300.52 0.000
-------------+----------------------------------------------------------
rho_ar | .9718915
sigma_u | 6.3918957
sigma_e | 1.7246814
rho_fov | .93213626 (fraction of variance because of u_i)
------------------------------------------------------------------------
F test that all u_i=0: F(3,979) = 2.14 Prob > F = 0.094
Stata Note – balancing your panel
• It may be useful to use only those “entities” that appear in all time periods. Suppose T=20 – use the following:

sort entity time
by entity: gen count=_N
keep if count==20
Panel Data Management in STATA
 Common problem is that original data
is stored in “wide” or “rectangular”
form, wherein values for a given year
are stored in a separate column.
 For example, in a cross-country panel,
FDI in 2000 has one column, with each
row representing a unique country.
Likewise for FDI in 2001, etc.

Example of “wide” form data set

Countries Code fdi2000 fdi2001 fdi2002
Argentina 1 1.04E+10 2.17E+09 2.15E+09
Australia 2 1.36E+10 8.26E+09 1.77E+10
Austria 3 8.52E+09 5.91E+09 3.19E+08
Bangladesh 4 2.80E+08 7.90E+07 5.20E+07
Problem
• In order to run a panel regression in STATA, we need data to be stored in “long” form.
• Here, each row is identified by both a time period and a country code. A variable like FDI will have a single column.
Example of “long” form data set
 code year countries fdi
1 2000 Argentina 1.040e+10
1 2001 Argentina 2.170e+09
1 2002 Argentina 2.150e+09

2 2000 Australia 1.360e+10


2 2001 Australia 8.260e+09
2 2002 Australia 1.770e+10

The “reshape” STATA command
• Instead of copying and pasting in Excel, load the data into STATA as “wide” form, then transform.
• The “reshape” command will generate the “time” variable for you, and combine separate time periods into a single column.
reshape long fdi, i(code) j(year)
• Keys on the specified variable, here “fdi”.
• Must declare the cross-section identifier i.
• Generates the “within” group identifier j. Put the new varname in parentheses. Typically j will represent time, but not necessarily.
Reshape Notes
 In general, list all variables that must be
combined into a single column.
 You do not need to list time-invariant
variables, but they will be converted to
“long” as well.
 Note that “reshape wide” will convert data
from long to wide format.
 Seems to be touchy about year values. ‘99
for 1999 is ok, but ‘00 for 2000 is a
problem.
Fixed Effects Logit

y*it = βXit + αi + uit, where yit ∈ {0, 1}
yit = 1 if y*it > 0
yit = 1 if βXit + αi + uit > 0

Pr(yit = 1) = Pr(βXit + αi + uit > 0)
            = Pr(uit > −(βXit + αi))
            = 1 − G(−(βXit + αi))

G(βXit + αi) = e^(βXit + αi) / (1 + e^(βXit + αi))

log L = Σ [ yit·log G(βXit + αi) + (1 − yit)·log(1 − G(βXit + αi)) ]
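The log-likelihood on this slide can be evaluated directly; a minimal Python sketch at illustrative (not estimated) parameter values:

```python
import math

def G(z):
    # Logistic CDF: G(z) = e^z / (1 + e^z)
    return math.exp(z) / (1 + math.exp(z))

def log_lik(beta, alpha, obs):
    # obs: list of (y, x) pairs for one individual with fixed effect alpha
    return sum(y * math.log(G(beta * x + alpha)) +
               (1 - y) * math.log(1 - G(beta * x + alpha)) for y, x in obs)

print(G(0.0))                                    # 0.5
print(log_lik(1.0, 0.0, [(1, 2.0), (0, -1.0)]))  # illustrative values
```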
