
PANEL DATA WORKSHOP

BRUNEL UNIVERSITY
February 29, 2008.
PART II:
STATIC AND DYNAMIC PANEL DATA
MODELLING

Sourafel.girma@nottingham.ac.uk



Presentation outline
1. Introduction
2. Static panel data models
3. Empirical example
4. Dynamic panel data models
5. Empirical example
6. Further considerations
7. Summary

1. Introduction
• Suppose the aim is to establish the link between domestic
firms' profitability and foreign direct investment (foreign
finance) by multinational enterprises (MNEs).
• MNEs are at the frontier of technology and management
practices and have vast marketing resources. Thus it is
reasonable to expect that domestic firms receiving foreign
finance will increase their profitability.
• The regression analysis will typically require data on firms'
profits and the amount of foreign finance they have
attracted. These variables are observable to the researcher.
• Profitability, however, is also affected by several firm-level
characteristics that are unobservable to the analyst.

• These characteristics include managerial ability, political and
business connections, and happiness of the workforce, and are
referred to as firm heterogeneity (or firm-specific effects).
• Firm heterogeneity is part of the error term of the model
since it is unobservable.
• It is assumed to be constant over a reasonably short
space of time.
• Firm heterogeneity affects not only profits but also foreign
finance. It can be argued that MNEs are likely to invest in
firms with more able managers and useful connections.
• This creates error-regressor correlation, i.e. an endogeneity
problem.

• We saw that the method of instrumental variables can be
used to tackle the problem of endogeneity.
• Valid instruments, however, are not always easy to come by.
• One of the main advantages of panel data is the ability to
control for the problem of endogeneity without the necessity
of getting external instruments (in other words, without the
need for additional data).
• The objective of this lecture is to demonstrate this advantage
of panel data using an example from finance.
• In particular, static and dynamic linear panel data models will
be considered.

2. Static panel data models

• For simplicity, consider the following static panel data model
with a single explanatory variable:
    $y_{it} = \beta x_{it} + f_i + \varepsilon_{it}$    (1)
• $i$ and $t$ index firms and time periods, respectively; $y$ is
profits, $x$ is foreign finance, $f_i$ is firm heterogeneity and
$\varepsilon_{it}$ is the error term.
• The problem with estimating equation (1) by OLS is that the
individual heterogeneity $f_i$ is likely to be correlated with $x$:
$\mathrm{Cov}(x_{it}, f_i) \neq 0$.
• $f_i$ is also called a fixed effect or correlated effect.
• The panel data solution to the problem of correlated effects
is to eliminate them from the model by transforming the data.

• There are two methods of transforming the data to eliminate
correlated effects.

Method I: The within transformation
Step 1: For each firm i, average equation (1) over time as
    $\bar{y}_i = \beta \bar{x}_i + f_i + \bar{\varepsilon}_i$    (2)
where $\bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$, and so on.
• Note that because $f_i$ does not change over time, it appears in
both (1) and (2).
Step 2: Subtract (2) from (1) to eliminate $f_i$ and obtain the
following model:

    $y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$    (3)
Step 3: Estimate equation (3) by OLS.
• The resulting estimator of $\beta$ is called the within (or fixed
effects) estimator.
• This estimator is free of endogeneity bias because the
correlated effect no longer appears in the model.
• A drawback of this transformation is that it eliminates all
variables that are time-invariant.
• For example, if equation (1) includes the gender of the
manager and the location of the firm as regressors, these
variables will drop out of the model.
• Another drawback is that only the within variability of the
variables is used (i.e. between variability is neglected).
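The three steps of the within transformation can be sketched on simulated data. This is only an illustration: the data-generating process (beta = 0.5, x built to be positively correlated with the firm effect), the function names and the sample sizes are all assumptions made here, not the workshop's data.

```python
import random

def simulate_panel(n_firms=1000, n_periods=5, beta=0.5, seed=0):
    """Simulate y_it = beta*x_it + f_i + e_it, with x_it correlated with f_i (assumed DGP)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_firms):
        f = rng.gauss(0, 1)                                    # unobserved firm effect
        x = [f + rng.gauss(0, 1) for _ in range(n_periods)]    # x depends on f
        y = [beta * x_t + f + rng.gauss(0, 1) for x_t in x]
        panel.append((x, y))
    return panel

def avg(values):
    return sum(values) / len(values)

def ols_slope(xs, ys):
    """OLS slope of ys on xs, allowing for an intercept."""
    mx, my = avg(xs), avg(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def pooled_ols(panel):
    """Ignore the firm effect and regress y on x over all firm-years."""
    xs = [v for x, _ in panel for v in x]
    ys = [v for _, y in panel for v in y]
    return ols_slope(xs, ys)

def within_estimator(panel):
    """Steps 1-3: subtract firm-specific time means, then run OLS."""
    xs, ys = [], []
    for x, y in panel:
        mx, my = avg(x), avg(y)
        xs.extend(v - mx for v in x)
        ys.extend(v - my for v in y)
    return ols_slope(xs, ys)

panel = simulate_panel()
beta_pooled = pooled_ols(panel)
beta_within = within_estimator(panel)
print(f"pooled OLS: {beta_pooled:.3f}   within: {beta_within:.3f}")
```

Under these assumptions pooled OLS should land well above 0.5, because it absorbs the correlation between x and the firm effect, while the within estimate should sit close to 0.5.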

Method II: The first-difference transformation
Step 1: For each firm i, lag equation (1) by one time period as
    $y_{i,t-1} = \beta x_{i,t-1} + f_i + \varepsilon_{i,t-1}$    (4)
• Note that because $f_i$ does not change over time, it appears in
both (1) and (4).
Step 2: Subtract (4) from (1) to eliminate $f_i$ and obtain the
following first-differenced model:
    $\Delta y_{it} = \beta \Delta x_{it} + \Delta \varepsilon_{it}$    (5)
where $\Delta y_{it} = y_{it} - y_{i,t-1}$, and so on.
Step 3: Estimate equation (5) by OLS; the resulting
estimator of $\beta$ will also be free of endogeneity bias.
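A minimal sketch of the first-difference estimator, again on simulated data. The data-generating process (beta = 0.5, x correlated with the firm effect) and every name in the code are illustrative assumptions. The regression is run without a constant, since differencing removes any intercept from the model.

```python
import random

def simulate_panel(n_firms=1000, n_periods=5, beta=0.5, seed=0):
    """Simulate y_it = beta*x_it + f_i + e_it, with x_it correlated with f_i (assumed DGP)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_firms):
        f = rng.gauss(0, 1)                                    # unobserved firm effect
        x = [f + rng.gauss(0, 1) for _ in range(n_periods)]
        y = [beta * x_t + f + rng.gauss(0, 1) for x_t in x]
        panel.append((x, y))
    return panel

def first_difference_estimator(panel):
    """OLS of (y_t - y_{t-1}) on (x_t - x_{t-1}) without a constant."""
    num = den = 0.0
    for x, y in panel:
        for t in range(1, len(x)):
            dx = x[t] - x[t - 1]      # f_i cancels in the difference
            dy = y[t] - y[t - 1]
            num += dx * dy
            den += dx * dx
    return num / den

beta_fd = first_difference_estimator(simulate_panel())
print(f"first-difference estimate: {beta_fd:.3f}")
```

With this design the estimate should be close to 0.5, even though x and the firm effect are correlated in levels.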

• When estimating panel data models by OLS following the
within and first-difference transformations, it is vital to adjust
the standard errors for serial correlation in the transformed
error term within each i.
• For example, the transformed error terms in equation (5) for
time periods t = 3 and t = 4 are
    $\Delta \varepsilon_{i3} = \varepsilon_{i3} - \varepsilon_{i2}$  and  $\Delta \varepsilon_{i4} = \varepsilon_{i4} - \varepsilon_{i3}$
• It is easy to see that the two error terms are correlated, since
they have the common element $\varepsilon_{i3}$.
• Most econometric packages can adjust the OLS standard
errors for serial correlation in the transformed model.
• When T = 2, the within and first-difference estimators
coincide.
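To see how strong this induced correlation is, suppose (as an assumption for this illustration) that the original errors are independent over time with a common variance sigma^2. Then the covariance of two adjacent differenced errors is -sigma^2 and each has variance 2*sigma^2, so their correlation is -0.5. The sketch below checks this by simulation; the sample size and names are arbitrary.

```python
import random

def correlation(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(1)
n_firms = 20000
d_eps3, d_eps4 = [], []
for _ in range(n_firms):
    # serially uncorrelated errors for periods 2, 3 and 4 (assumed i.i.d. normal)
    e2, e3, e4 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    d_eps3.append(e3 - e2)   # differenced error at t = 3
    d_eps4.append(e4 - e3)   # differenced error at t = 4, shares e3

rho = correlation(d_eps3, d_eps4)
print(f"corr(d_eps3, d_eps4) = {rho:.3f}")
```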


Fixed/correlated effects.
Within transformation.
First-differencing.

3. Empirical example
• The aim is to test whether foreign direct investment (foreign
finance) leads to increased firm profitability in the UK.
• The following model is specified (i and t index firm and year):
    $PROF_{it} = \beta_1 FDI_{it} + \beta_2 MS_{it} + f_i + \varepsilon_{it}$    (6)
• PROF is the log of profitability, FDI is the log of foreign
finance, MS is market share, $f_i$ is firm heterogeneity (fixed
effect) and $\varepsilon_{it}$ is an error term which is assumed to be
serially uncorrelated.
Task 1: Estimate the model using the within transformation.
Task 2: Test for heterogeneity-regressor correlation using the
Hausman test (remember this test?).
Task 3: For comparison, re-estimate the model by
first-differencing the data.

• A peek at the data (N = 2813; T = 5): [Stata data listing]

TASK 1: [Stata output: within estimator]

TASK 2: [Stata output: Hausman test]
Reject the null.

TASK 3: [Stata output: first-difference estimator]

1. Recall that D is the first-difference operator in Stata (it
automatically first-differences the variables).
2. Also note that the noc (no constant) option was used.
3. The elasticity of profitability with respect to foreign finance
(0.0732) is very close to the corresponding elasticity from the
within estimator (0.0747).

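4. Dynamic panel data models

• When econometric models contain lagged dependent
variables, they are called dynamic models.
• In these models the past influences the present because of
adjustment costs, habits, etc.
• Consider the following dynamic panel data model:
    $y_{it} = \alpha y_{i,t-1} + \beta x_{it} + f_i + \varepsilon_{it}$,  $i = 1, \dots, N$;  $t = 3, \dots, T$    (7)
• Note that we need T to be at least 3, so $t = 3, \dots, T$.
• First-differencing equation (7) to eliminate $f_i$ gives
    $y_{it} - y_{i,t-1} = \alpha (y_{i,t-1} - y_{i,t-2}) + \beta (x_{it} - x_{i,t-1}) + (\varepsilon_{it} - \varepsilon_{i,t-1})$
• For convenience, rewrite the above model as
    $\Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta \Delta x_{it} + \Delta \varepsilon_{it}$    (8)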

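• Although the heterogeneity term is eliminated from the
dynamic panel data model, the transformed equation (8) has
a problem of its own!
• Namely, the first-differenced lagged dependent variable and
the first-differenced error term are correlated:
    $\mathrm{Cov}(\Delta y_{i,t-1}, \Delta \varepsilon_{it}) \neq 0$    (9)
• To prove equation (9), note that $\Delta y_{i,t-1}$ contains $y_{i,t-1}$,
which by equation (7) can be written as
    $y_{i,t-1} = \alpha y_{i,t-2} + \beta x_{i,t-1} + f_i + \varepsilon_{i,t-1}$    (10)
and
    $\Delta \varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$    (11)
• As equations (10) and (11) have the term $\varepsilon_{i,t-1}$ in common,
$\Delta y_{i,t-1}$ and $\Delta \varepsilon_{it}$ are correlated.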

• Because of this regressor-error correlation, the
first-differenced dynamic panel data model cannot be estimated
by OLS (unlike the static panel model).
• Provided that $\varepsilon_{it}$ is not serially correlated, however, it
can be estimated by IV/GMM using values of y lagged by two or
more periods as instruments.
• Consider the case of y lagged by two periods, $y_{i,t-2}$, as IV.
• It is not difficult to see that $y_{i,t-2}$ is uncorrelated with
$\Delta \varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$, while it is correlated with
$\Delta y_{i,t-1} = y_{i,t-1} - y_{i,t-2}$.
• As long as the lagged values of y are valid instruments, it is
possible to obtain consistent estimators of $\alpha$ and $\beta$.
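The contrast between OLS and IV on the first-differenced model can be sketched by simulation. To keep the example short, the explanatory variable x is dropped, so the assumed model is y_it = alpha*y_{i,t-1} + f_i + e_it with alpha = 0.5 and serially uncorrelated errors. The IV estimator uses y lagged two periods as the single instrument (an approach often called the Anderson-Hsiao estimator). All names and parameter values are assumptions for illustration.

```python
import random

def simulate_dynamic_panel(n_firms=3000, n_periods=6, alpha=0.5, burn_in=20, seed=0):
    """Simulate y_it = alpha*y_{i,t-1} + f_i + e_it (assumed DGP, no x)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_firms):
        f = rng.gauss(0, 0.5)                 # firm effect
        y_prev = f / (1 - alpha)              # start at the firm's long-run mean
        path = []
        for t in range(burn_in + n_periods):
            y_t = alpha * y_prev + f + rng.gauss(0, 1)
            if t >= burn_in:                  # keep only the last n_periods values
                path.append(y_t)
            y_prev = y_t
        panel.append(path)
    return panel

def fd_ols(panel):
    """OLS of dy_t on dy_{t-1}: inconsistent, since dy_{t-1} is correlated with de_t."""
    num = den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            dy, dy_lag = y[t] - y[t - 1], y[t - 1] - y[t - 2]
            num += dy_lag * dy
            den += dy_lag * dy_lag
    return num / den

def fd_iv(panel):
    """IV of dy_t on dy_{t-1}, using the level y_{t-2} as the instrument."""
    num = den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            z = y[t - 2]
            num += z * (y[t] - y[t - 1])
            den += z * (y[t - 1] - y[t - 2])
    return num / den

panel = simulate_dynamic_panel()
alpha_ols = fd_ols(panel)
alpha_iv = fd_iv(panel)
print(f"OLS on differences: {alpha_ols:.3f}   IV with y(t-2): {alpha_iv:.3f}")
```

In this design OLS on the differenced data should be badly biased downwards, while the IV estimate should be close to the assumed value of 0.5.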

First-difference model;
Lagged values of y as IV;
GMM.


5. Empirical example
• Now extend the static panel data model of equation (6) by
including a lagged dependent variable:
    $PROF_{it} = \alpha PROF_{i,t-1} + \beta_1 FDI_{it} + \beta_2 MS_{it} + f_i + \varepsilon_{it}$    (12)
1. Because of dynamics and individual heterogeneity in the
model, first-difference the data.
2. Use lagged values of profits as instruments. In this particular
case, use the second and third lags.
3. Estimate the model by GMM.
4. Test for the validity of the instruments.
• Note that there are several ways of estimating the model
depending on the choice of instruments, but the basic
principle is the same.

[Stata output: IV estimates of the first-differenced model]
• FDI is insignificant!
• Note: Obtained from ivregress 2sls.
• Instruments are valid.

6. Further considerations
1. In dynamic panel data modelling, is it desirable to use all
possible lagged values of the dependent variable as
instruments?
• Not necessarily, for two main reasons:
a. In general, too many instruments, even if all of them are
valid, can bias IV/GMM in finite samples.
b. From a practical point of view, if you employ the IV/GMM
estimator discussed in the previous section, the more lags
you use as IVs, the more observations you lose. For
example, if you decide to lag the dependent variable 4 times,
you can't use observations from the first 4 time periods.

2. Is there any technique that allows me to use further lags
of the dependent variable as instruments without losing
data from the early periods of the panel?
• Yes, there are several methods that allow you to do so!
• Without loss of generality, consider the following simple
dynamic panel model with T = 5 (as in our example):
    $y_{it} = \alpha y_{i,t-1} + f_i + \varepsilon_{it}$    [1]
• First-difference the model to obtain
    $\Delta y_{it} = \alpha \Delta y_{i,t-1} + \Delta \varepsilon_{it}$    [2]
• If $\varepsilon_{it}$ is not serially correlated, the following
instruments are valid for each i:
• At t = 3: $y_{i1}$; at t = 4: $y_{i1}$ and $y_{i2}$; and at t = 5: $y_{i1}$, $y_{i2}$ and $y_{i3}$.

• If you want to use all of the available instruments, construct
an instrument matrix Z with one row for each period that you
are instrumenting:
    $Z_i = \begin{pmatrix}
    y_{i1} & 0 & 0 & 0 & 0 & 0 \\
    0 & y_{i1} & y_{i2} & 0 & 0 & 0 \\
    0 & 0 & 0 & y_{i1} & y_{i2} & y_{i3}
    \end{pmatrix}$
    (rows: IV for t = 3, IV for t = 4, IV for t = 5)
• This type of instrumentation procedure ensures that the
number of instruments is maximised.
• Z is referred to as a GMM-style matrix of instruments.
• Using the variables in Z as instruments, the parameters of the
dynamic panel model can be estimated by GMM.
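A minimal sketch of how such a GMM-style instrument matrix could be built for a single firm. The function name and the list-of-lists layout are assumptions made for illustration. The row for period t holds the levels y_1, ..., y_{t-2} in its own block of columns and zeros elsewhere.

```python
def gmm_style_instruments(y):
    """Build the GMM-style instrument matrix for one firm.

    y is the list [y_1, ..., y_T]. One row is produced for each
    differenced equation t = 3, ..., T; the row for period t contains
    y_1, ..., y_{t-2} in columns reserved for that period.
    """
    n_periods = len(y)
    n_cols = sum(t - 2 for t in range(3, n_periods + 1))
    rows = []
    offset = 0
    for t in range(3, n_periods + 1):
        row = [0.0] * n_cols
        lags = y[: t - 2]                      # y_1 .. y_{t-2}
        row[offset:offset + len(lags)] = lags  # place them in this period's block
        offset += len(lags)
        rows.append(row)
    return rows

# Example with T = 5 and placeholder values y_1..y_5 = 1..5
Z = gmm_style_instruments([1, 2, 3, 4, 5])
for row in Z:
    print(row)
```

With T = 5 this gives 3 rows and 1 + 2 + 3 = 6 instrument columns, so no early observations are dropped merely because deeper lags are unavailable for them.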

• Using Stata with GMM-style instruments: [Stata output: GMM estimates]
• Test for instrument exogeneity.

• A two-step variant of the estimation: [Stata output: two-step GMM estimates]
• Note: In theory, the two-step estimator is superior!
• We are looking for the absence of second-order
serial correlation in the first-differenced errors.

3. First-differencing the model wipes out the effects of
time-invariant variables. Is there a way of identifying these
effects?
• The answer is yes, and this involves using the level equation
[1] rather than the first-differenced equation [2].
• The idea is to use first differences of the lagged dependent
variable as instruments in the level equation (as opposed to
using lagged dependent variables as instruments in the
first-differenced equation).
• But this requires the additional assumption that the process
under study is stationary, in the sense that the distribution of
initial observations coincides with the steady-state
distribution of the process.
• If this assumption is valid, $\Delta y_{i,t-1}$ is uncorrelated with the
level-equation error $f_i + \varepsilon_{it}$ and is therefore a valid
instrument.

• In fact, one can use both the level and first-differenced
equations with the relevant instruments to obtain what is
known as the system GMM estimator.
• This estimator is especially recommended when y is highly
persistent ($\alpha$ is close to 1) and lagged values of y are quite
weak (nearly irrelevant) instruments in the first-differenced model.
System GMM estimator: [Stata output: system GMM estimates]

4. How can I deal with endogenous conditioning variables
(variables in X) in a dynamic panel data model?
• A conditioning variable could be strictly exogenous
(uncorrelated with past, current and future values of $\varepsilon$),
predetermined (uncorrelated with current and future values
of $\varepsilon$, but possibly correlated with its past values) or
endogenous (correlated with current, and possibly past,
values of $\varepsilon$).
• If x is strictly exogenous, its past, current and future values
can be used as its instruments.
• Only current and past values of predetermined variables
may be used as valid instruments.
• With endogenous conditioning variables, the nature of
permissible instruments depends on the lag structure of the
error term.

System GMM estimation assuming MS and FDI are endogenous and the error term
is serially uncorrelated: [Stata output: system GMM estimates]

5. Is it possible to get widely varying results when trying
different IV/GMM estimations?
• Indeed! IV/GMM estimates can be quite erratic in finite
samples.
• It is advisable to try different IV/GMM estimators in order to
establish the sensitivity of the results to the choice of
instruments and estimation method.
• At the end of the day, however, you have to decide which
set of results, if any, is most convincing and be prepared to
explain and defend your decision.

7. Summary
1. Static panel data models with correlated effects can
be estimated by OLS after transforming the data to
eliminate individual heterogeneity.
2. The within or first-difference transformations can be
used to eliminate correlated effects.
3. The estimation of dynamic panel data models is
slightly more complicated and involves the use of
GMM.
THANK YOU!