
Economics Letters 87 (2005) 361–366

www.elsevier.com/locate/econbase

Approximating the bias of the LSDV estimator for dynamic unbalanced panel data models
Giovanni S.F. Bruno*
Istituto di Economia Politica, Università Bocconi, Via U. Gobbi, 5, 20136 Milan, Italy
Received 3 May 2004; received in revised form 5 August 2004; accepted 20 January 2005
Available online 2 April 2005

Abstract
This paper extends the LSDV bias approximations in [Bun, M.J.G., Kiviet, J.F., 2003. On the diminishing returns
of higher order terms in asymptotic expansions of bias. Economics Letters 79, 145–152.] to unbalanced panels. The
approximations are obtained by modifying the within operator to accommodate the dynamic selection rule. They
are accurate, with higher order terms bringing only decreasing improvements to the approximations. This removes
an important cause of the limited applicability of bias-corrected LSDV estimators.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Bias approximation; Unbalanced panels; Dynamic panel data; LSDV estimator; Monte Carlo experiment
JEL classification: C23

1. Introduction
It is well known that the least squares dummy variable (LSDV) estimator for dynamic panel data
models is not consistent for N large and finite T. Nickell (1981) derives an expression for the
inconsistency for $N \to \infty$, which is $O(T^{-1})$. Kiviet (1995) and, to a higher level of accuracy, Kiviet
(1999) use asymptotic expansion techniques to obtain approximations of the small sample bias of the
LSDV estimator that include higher order terms, thereby offering a method to correct the LSDV estimator for
samples where N is small or only moderately large. Bun and Kiviet (2003) analyze the performance of
* Corresponding author. Tel.: +39 02 5836 5411; fax: +39 02 5836 5438.
E-mail address: giovanni.bruno@unibocconi.it.
0165-1765/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.econlet.2005.01.005


the approximation of Kiviet (1999) using simpler formulae. Monte Carlo evidence in Judson and Owen
(1999) strongly supports the corrected LSDV estimator (LSDVC) compared to more traditional GMM
estimators when N is only moderately large. However, a method for implementing LSDVC for an
unbalanced panel has not yet been provided, which clearly limits the applicability of this technique.
This paper extends the bias approximation formulae in Bun and Kiviet (2003) to accommodate
unbalanced panels with a strictly exogenous selection rule. Monte Carlo experiments are also carried out
to assess how unbalancedness affects the LSDV bias and its approximations at the true parameter values.

2. Bias approximations
We consider the standard dynamic panel data model
$$y_{it} = \gamma y_{i,t-1} + x_{it}'\beta + \eta_i + \epsilon_{it}; \quad |\gamma| < 1; \quad i = 1, \ldots, N \text{ and } t = 1, \ldots, T, \tag{2.1}$$

where $y_{it}$ is the dependent variable; $x_{it}$ is the $((k-1) \times 1)$ vector of strictly exogenous explanatory
variables; $\eta_i$ is an unobserved individual effect; and $\epsilon_{it}$ is an unobserved white noise disturbance.
Collecting observations over time and across individuals gives

$$y = D\eta + W\delta + \epsilon,$$

where $y$ and $W = [y_{-1} \,\vdots\, X]$ are the $(NT \times 1)$ and $(NT \times k)$ matrices of stacked observations;
$D = I_N \otimes \iota_T$ is the $(NT \times N)$ matrix of individual dummies ($\iota_T$ is the $(T \times 1)$ vector of all unity
elements); $\eta$ is the $(N \times 1)$ vector of individual effects; $\epsilon$ is the $(NT \times 1)$ vector of disturbances; and
$\delta = [\gamma \,\vdots\, \beta']'$ is the $(k \times 1)$ vector of coefficients.
It has long been recognized that the LSDV estimator for model (2.1) is not consistent for finite T.
Nickell (1981) derives an expression for the inconsistency for $N \to +\infty$, which is $O(T^{-1})$. Kiviet (1995)
obtains a bias approximation that contains terms of higher order than $T^{-1}$. In Kiviet (1999) a more
accurate bias approximation is derived. Bun and Kiviet (2003) reformulate the approximation in Kiviet
(1999) with simpler formulae for each term. Here we extend the Bun and Kiviet (2003) formulae to a more
general version of model (2.1), which allows missing observations in the interval [0, T] for some
individuals. Define a selection indicator $r_{it}$ such that $r_{it} = 1$ if $(y_{it}, x_{it})$ is observed and $r_{it} = 0$ otherwise.
From this define the dynamic selection rule $s(r_{it}, r_{i,t-1})$ selecting only the observations that are usable
for the dynamic model, namely those for which both current values and one-time lagged values are
observable:

$$s_{it} = \begin{cases} 1 & \text{if } (r_{i,t}, r_{i,t-1}) = (1, 1) \\ 0 & \text{otherwise} \end{cases} \qquad i = 1, \ldots, N \text{ and } t = 1, \ldots, T.$$

Thus, for any i the number of usable observations is given by $T_i = \sum_{t=1}^{T} s_{it}$. The total number of
usable observations is given by $n = \sum_{i=1}^{N} T_i$; and $\bar{T} = n/N$ denotes the average group size. The
(possibly) unbalanced dynamic model can then be written as

$$s_{it} y_{it} = s_{it}\left(\gamma y_{i,t-1} + x_{it}'\beta + \eta_i + \epsilon_{it}\right); \quad i = 1, \ldots, N \text{ and } t = 1, \ldots, T. \tag{2.2}$$
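As an illustration, the dynamic selection rule and the resulting group sizes can be computed directly from an array of observation indicators. The following is a minimal sketch, not the author's Stata code: the function name `dynamic_selection` and the toy pattern `r` are hypothetical, and the indicator is assumed to be stored for periods 0 to T.

```python
import numpy as np

def dynamic_selection(r):
    """r: (N, T+1) 0/1 array whose column j holds r_{i,j} for periods j = 0..T.
    Returns s: (N, T) array with s[i, t-1] = 1 iff r_{i,t} = r_{i,t-1} = 1."""
    r = np.asarray(r, dtype=int)
    return r[:, 1:] * r[:, :-1]

# Hypothetical observation pattern: N = 3 individuals, periods 0..4 (T = 4).
r = np.array([
    [1, 1, 1, 1, 1],   # fully observed
    [1, 1, 0, 1, 1],   # one gap at period 2
    [0, 1, 1, 1, 0],   # late entry and early exit
])
s = dynamic_selection(r)
T_i = s.sum(axis=1)        # usable observations per individual
n = int(T_i.sum())         # total number of usable observations
T_bar = n / s.shape[0]     # average group size
```

Note that a single gap removes two usable observations for the second individual, because both the current and the lagged value must be observed.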


We can formulate Eq. (2.2) in matrix form. For each i define the $(T \times T)$ diagonal matrix $S_i = \mathrm{diag}(s_{it})$.
Define also the $(NT \times NT)$ block-diagonal matrix $S = \mathrm{diag}(S_i)$. Then, the following is equivalent to
model (2.2):

$$Sy = SD\eta + SW\delta + S\epsilon. \tag{2.3}$$

The LSDV estimator is given by $\delta_{LSDV} = (W'A_sW)^{-1}W'A_sy$, where $A_s = S(I - D(D'SD)^{-1}D')S$ is the
symmetric and idempotent $(NT \times NT)$ matrix wiping out individual means and selecting usable
observations.
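The modified within operator can be checked numerically. The sketch below builds the selection matrix, the dummy matrix and the operator from an (N, T) array of selection indicators, then verifies on noise-free artificial data that the LSDV formula recovers the coefficients. The function names and the toy data are hypothetical; the dense (NT × NT) construction is only meant for small examples, and every individual is assumed to have at least one usable observation so that D'SD is invertible.

```python
import numpy as np

def within_selection_operator(s):
    """s: (N, T) 0/1 array of s_it, stacked by individual and then by time.
    Returns (S, D, A_s) with A_s = S (I - D (D'SD)^{-1} D') S."""
    s = np.asarray(s, dtype=float)
    N, T = s.shape
    S = np.diag(s.ravel())                        # block-diagonal selection matrix
    D = np.kron(np.eye(N), np.ones((T, 1)))       # D = I_N kron iota_T
    DSD = D.T @ S @ D                             # diag(T_1, ..., T_N)
    A_s = S @ (np.eye(N * T) - D @ np.linalg.inv(DSD) @ D.T) @ S
    return S, D, A_s

def lsdv(W, y, A_s):
    """LSDV estimator (W'A_s W)^{-1} W'A_s y."""
    return np.linalg.solve(W.T @ A_s @ W, W.T @ A_s @ y)

# Hypothetical toy check: without disturbances the estimator returns delta exactly.
rng = np.random.default_rng(0)
s = np.array([[1, 1, 1, 1], [1, 0, 0, 1], [0, 1, 1, 0]])
S, D, A_s = within_selection_operator(s)
N, T = s.shape
W = rng.normal(size=(N * T, 2))       # artificial regressors, for the algebra only
delta = np.array([0.5, 0.5])
eta = rng.normal(size=N)
y = D @ eta + W @ delta               # noise-free version of the stacked model
delta_hat = lsdv(W, y, A_s)
```

The trace of the operator equals n − N, the number of usable observations net of the estimated individual effects.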
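Let $y_{t_0}$ denote the $(N \times 1)$-vector of start-up values and assume
$\epsilon_{it} \mid X, S, \eta, y_{t_0} \sim \text{i.i.d. } N(0, \sigma_\epsilon^2)$ $\forall i, t$. Then,
considering all expectations below as conditional on $(X, S, \eta, y_{t_0})$, the LSDV bias is given by

$$E(\delta_{LSDV} - \delta) = E\left[(W'A_sW)^{-1}W'A_s\epsilon\right]. \tag{2.4}$$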
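Under our assumption all the properties of normally distributed variables can be used as in Kiviet
(1999) to derive the terms of the bias approximation. These generalize the formulae of Bun and Kiviet
(2003) by replacing the standard within operator with $A_s$ ($A_s$ also matters for the order of the
approximation terms):

$$c_1(\bar{T}^{-1}) = \sigma_\epsilon^2 \,\mathrm{tr}(\Pi)\, q_1;$$

$$c_2(N^{-1}\bar{T}^{-1}) = -\sigma_\epsilon^2 \left[ Q\bar{W}'\Pi A_s\bar{W} + \mathrm{tr}(Q\bar{W}'\Pi A_s\bar{W})\, I_{k+1} + 2\sigma_\epsilon^2 q_{11}\, \mathrm{tr}(\Pi'\Pi\Pi)\, I_{k+1} \right] q_1;$$

$$c_3(N^{-1}\bar{T}^{-2}) = \sigma_\epsilon^4 \,\mathrm{tr}(\Pi) \left\{ 2 q_{11} Q\bar{W}'\Pi\Pi'\bar{W} q_1 + \left[ (q_1'\bar{W}'\Pi\Pi'\bar{W} q_1) + q_{11}\, \mathrm{tr}(Q\bar{W}'\Pi\Pi'\bar{W}) + 2\, \mathrm{tr}(\Pi'\Pi\Pi'\Pi)\, q_{11}^2 \right] q_1 \right\};$$

where $Q = [E(W'A_sW)]^{-1} = [\bar{W}'A_s\bar{W} + \sigma_\epsilon^2\, \mathrm{tr}(\Pi'\Pi)\, e_1e_1']^{-1}$; $\bar{W} = E(W)$; $e_1 = (1, 0, \ldots, 0)'$ is a $(k \times 1)$
vector; $q_1 = Qe_1$; $q_{11} = e_1'q_1$; $L_T$ is the $(T \times T)$ matrix with unit first lower subdiagonal and all other
elements equal to zero; $L = I_N \otimes L_T$; $\Gamma_T = (I_T - \gamma L_T)^{-1}$; $\Gamma = I_N \otimes \Gamma_T$; and $\Pi = A_sL\Gamma$. With an increasing
level of accuracy, the following three possible bias approximations emerge:

$$B_1 = c_1(\bar{T}^{-1}); \quad B_2 = B_1 + c_2(N^{-1}\bar{T}^{-1}); \quad B_3 = B_2 + c_3(N^{-1}\bar{T}^{-2}).$$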
Below we evaluate their performance in approximating the LSDV bias as estimated by Monte Carlo
simulations.
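The three approximations can be evaluated numerically once the expected regressor matrix, the operator A_s and the true parameter values are available. The sketch below is an illustration under stated assumptions rather than the author's Stata implementation: all function names and the toy design are hypothetical, E(W) is obtained by iterating the conditional mean of model (2.1) from the start-up values with a single exogenous regressor (k = 2), and the identity matrices in the second term are taken to be conformable with Q, that is (k × k).

```python
import numpy as np

def selection_within(s):
    """A_s = S (I - D (D'SD)^{-1} D') S for an (N, T) 0/1 array s (every T_i >= 1)."""
    N, T = s.shape
    S = np.diag(s.ravel().astype(float))
    D = np.kron(np.eye(N), np.ones((T, 1)))
    return S @ (np.eye(N * T) - D @ np.linalg.inv(D.T @ S @ D) @ D.T) @ S

def expected_W(x, eta, y0, gamma, beta):
    """E(W | X, eta, y0) for k = 2; x is (N, T) with x[:, t-1] = x_it, t = 1..T."""
    N, T = x.shape
    y_bar = np.empty((N, T + 1))
    y_bar[:, 0] = y0
    for t in range(1, T + 1):
        y_bar[:, t] = gamma * y_bar[:, t - 1] + beta * x[:, t - 1] + eta
    # First column: E(y_{i,t-1}); second column: x_it; rows stacked by i, then t.
    return np.column_stack([y_bar[:, :-1].ravel(), x.ravel()])

def bias_approximations(W_bar, A_s, gamma, sigma2, N, T):
    """Return (B1, B2, B3), each a length-k vector ordered as (gamma, beta)."""
    k = W_bar.shape[1]
    L_T = np.eye(T, k=-1)                              # unit first lower subdiagonal
    Gamma_T = np.linalg.inv(np.eye(T) - gamma * L_T)
    Pi = A_s @ np.kron(np.eye(N), L_T) @ np.kron(np.eye(N), Gamma_T)
    e1 = np.zeros(k)
    e1[0] = 1.0
    Q = np.linalg.inv(W_bar.T @ A_s @ W_bar
                      + sigma2 * np.trace(Pi.T @ Pi) * np.outer(e1, e1))
    q1 = Q @ e1
    q11 = q1[0]
    I_k = np.eye(k)                                    # assumed conformable with Q
    WPAW = W_bar.T @ Pi @ A_s @ W_bar
    WPPW = W_bar.T @ Pi @ Pi.T @ W_bar
    c1 = sigma2 * np.trace(Pi) * q1
    c2 = -sigma2 * (Q @ WPAW + np.trace(Q @ WPAW) * I_k
                    + 2 * sigma2 * q11 * np.trace(Pi.T @ Pi @ Pi) * I_k) @ q1
    c3 = sigma2 ** 2 * np.trace(Pi) * (
        2 * q11 * Q @ WPPW @ q1
        + (q1 @ WPPW @ q1 + q11 * np.trace(Q @ WPPW)
           + 2 * np.trace(Pi.T @ Pi @ Pi.T @ Pi) * q11 ** 2) * q1)
    B1 = c1
    B2 = B1 + c2
    B3 = B2 + c3
    return B1, B2, B3

# Hypothetical toy design: 4 individuals, 6 periods, two of them lose the last 2 periods.
rng = np.random.default_rng(0)
N, T, gamma, beta, sigma2 = 4, 6, 0.5, 0.5, 1.0
s = np.ones((N, T), dtype=int)
s[:2, 4:] = 0
x, eta, y0 = rng.normal(size=(N, T)), rng.normal(size=N), rng.normal(size=N)
A_s = selection_within(s)
W_bar = expected_W(x, eta, y0, gamma, beta)
B1, B2, B3 = bias_approximations(W_bar, A_s, gamma, sigma2, N, T)
```

The dense Kronecker products keep the code close to the formulae; for panels of realistic size one would presumably exploit the block-diagonal structure instead.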

3. Monte Carlo experiments


Our Monte Carlo experiments closely follow Kiviet (1995) and Bun and Kiviet (2003), with the
difference that we consider various unbalanced designs. Data for $y_{it}$ are generated by model (2.1) with
$k = 2$ and for $x_{it}$ by

$$x_{it} = \rho x_{i,t-1} + \xi_{it}; \quad \xi_{it} \sim N(0, \sigma_\xi^2); \quad i = 1, \ldots, N \text{ and } t = 1, \ldots, T.$$

Initial observations $y_{i0}$ and $x_{i0}$ are generated following a procedure that avoids the waste of
random numbers and small sample non-stationarity problems (see Kiviet (1995)) and are kept fixed
across replications. The long-run coefficient $\beta/(1-\gamma)$ is kept fixed to unity, so $\beta = 1 - \gamma$; $\sigma_\epsilon^2$ is
normalized to unity; $\gamma$ and $\rho$ alternate between 0.2 and 0.8. The individual effects $\eta_i$ are generated
by assuming $\eta_i \sim N(0, \sigma_\eta^2)$ and $\sigma_\eta = \sigma_\epsilon(1-\gamma)$. Kiviet (1995) finds that the signal to noise ratio of
the regression, $\sigma_s^2$, is a key determinant of the relative bias of estimators and therefore needs to be
controlled in the simulation, so $\sigma_s^2$ alternates between 2 and 9 (notice that, once $\sigma_s^2$ is fixed, $\sigma_\xi^2$ is
uniquely determined).

Table 1
Unbalanced designs

N     $\bar{T}$    T     $T_i$                                $\omega$
20    20           24    16 ($i \le 10$), 24 ($i > 10$)       0.96
20    20           36    4 ($i \le 10$), 36 ($i > 10$)        0.36
10    40           48    32 ($i \le 5$), 48 ($i > 5$)         0.96
10    40           72    8 ($i \le 5$), 72 ($i > 5$)          0.36
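The data-generating process and the LSDV estimate on an unbalanced design can be sketched as follows. This is a simplified illustration, not the procedure used for the reported results: the function names are hypothetical, start-up values are produced by a burn-in and redrawn in every replication instead of following Kiviet (1995) and being held fixed, the innovation standard deviation of x (`sigma_xi`) is passed directly rather than derived from the signal to noise ratio, and only 500 replications are run.

```python
import numpy as np

def simulate_panel(N, T, gamma, rho, sigma_xi, rng, burn=50):
    """Draw y and x for periods 0..T from model (2.1) with k = 2.
    Assumptions (not from the paper): burn-in start-up values, sigma_xi given."""
    beta = 1.0 - gamma                        # long-run coefficient fixed to unity
    sigma_eps = 1.0                           # disturbance variance normalized to 1
    sigma_eta = sigma_eps * (1.0 - gamma)
    eta = rng.normal(0.0, sigma_eta, size=N)
    y = np.empty((N, T + 1))
    x = np.empty((N, T + 1))
    x_prev = np.zeros(N)
    y_prev = np.zeros(N)
    for t in range(-burn, T + 1):
        x_t = rho * x_prev + rng.normal(0.0, sigma_xi, size=N)
        y_t = gamma * y_prev + beta * x_t + eta + rng.normal(0.0, sigma_eps, size=N)
        if t >= 0:
            y[:, t], x[:, t] = y_t, x_t
        x_prev, y_prev = x_t, y_t
    return y, x

def lsdv_unbalanced(y, x, T_i):
    """LSDV using periods 1..T_i[i] of each individual (later periods discarded).
    Demeaning within individuals is equivalent to including individual dummies."""
    W_rows, y_rows = [], []
    for i, Ti in enumerate(T_i):
        W_i = np.column_stack([y[i, 0:Ti], x[i, 1:Ti + 1]])   # [y_{i,t-1}, x_it]
        y_i = y[i, 1:Ti + 1]
        W_rows.append(W_i - W_i.mean(axis=0))
        y_rows.append(y_i - y_i.mean())
    W = np.vstack(W_rows)
    yv = np.concatenate(y_rows)
    return np.linalg.lstsq(W, yv, rcond=None)[0]

# Mildly unbalanced design with N = 20: ten individuals with 16 usable periods, ten with 24.
rng = np.random.default_rng(12345)
gamma, rho = 0.8, 0.2
T_i = [16] * 10 + [24] * 10
estimates = np.array([
    lsdv_unbalanced(*simulate_panel(20, 24, gamma, rho, 1.0, rng), T_i)
    for _ in range(500)
])
bias = estimates.mean(axis=0) - np.array([gamma, 1.0 - gamma])
```

Because the regressors and individual effects are redrawn in each replication, `bias` estimates an unconditional bias, whereas the approximations in Section 2 are conditional on X, S, the individual effects and the start-up values.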
Table 2
Actual LSDV bias and bias approximations

$\sigma_s^2$  $\bar{T}$  $\gamma$  $\rho$  $\omega$   Bias $\gamma$  $B_{1,\gamma}$  $B_{2,\gamma}$  $B_{3,\gamma}$  Bias $\beta$  $B_{1,\beta}$  $B_{2,\beta}$  $B_{3,\beta}$
2    20   0.2   0.2   0.96   0.021   0.020   0.021   0.021   0.002   0.002   0.002   0.002
2    20   0.2   0.2   0.36   0.019   0.018   0.018   0.018   0.003   0.003   0.003   0.003
2    20   0.2   0.8   0.96   0.038   0.036   0.038   0.038   0.026   0.024   0.025   0.025
2    20   0.2   0.8   0.36   0.034   0.032   0.034   0.034   0.024   0.022   0.023   0.024
2    20   0.8   0.2   0.96   0.102   0.098   0.100   0.102   0.003   0.002   0.003   0.003
2    20   0.8   0.2   0.36   0.072   0.067   0.070   0.072   0.001   0.001   0.001   0.001
2    20   0.8   0.8   0.96   0.108   0.101   0.105   0.108   0.022   0.021   0.022   0.022
2    20   0.8   0.8   0.36   0.076   0.069   0.074   0.076   0.020   0.018   0.020   0.020
2    40   0.2   0.2   0.96   0.011   0.010   0.011   0.011   0.002   0.001   0.001   0.001
2    40   0.2   0.2   0.36   0.011   0.010   0.010   0.010   0.002   0.002   0.002   0.002
2    40   0.2   0.8   0.96   0.020   0.018   0.019   0.020   0.014   0.012   0.013   0.013
2    40   0.2   0.8   0.36   0.019   0.017   0.018   0.019   0.014   0.012   0.014   0.014
2    40   0.8   0.2   0.96   0.051   0.046   0.050   0.051   0.001   0.001   0.001   0.001
2    40   0.8   0.2   0.36   0.040   0.036   0.039   0.040   0.001   0.000   0.001   0.001
2    40   0.8   0.8   0.96   0.054   0.048   0.052   0.054   0.015   0.013   0.015   0.015
2    40   0.8   0.8   0.36   0.043   0.036   0.042   0.043   0.011   0.010   0.012   0.012
9    20   0.2   0.2   0.96   0.004   0.004   0.004   0.004   0.000   0.000   0.000   0.000
9    20   0.2   0.2   0.36   0.004   0.004   0.004   0.004   0.001   0.001   0.001   0.001
9    20   0.2   0.8   0.96   0.013   0.012   0.013   0.013   0.009   0.009   0.009   0.009
9    20   0.2   0.8   0.36   0.012   0.011   0.012   0.012   0.009   0.008   0.009   0.009
9    20   0.8   0.2   0.96   0.006   0.006   0.006   0.006   0.000   0.000   0.000   0.000
9    20   0.8   0.2   0.36   0.004   0.004   0.004   0.004   0.000   0.000   0.000   0.000
9    20   0.8   0.8   0.96   0.034   0.032   0.033   0.033   0.012   0.011   0.012   0.012
9    20   0.8   0.8   0.36   0.019   0.017   0.019   0.019   0.008   0.007   0.008   0.008
9    40   0.2   0.2   0.96   0.003   0.003   0.003   0.003   0.000   0.000   0.000   0.000
9    40   0.2   0.2   0.36   0.003   0.002   0.002   0.002   0.000   0.000   0.000   0.000
9    40   0.2   0.8   0.96   0.008   0.007   0.008   0.008   0.006   0.005   0.005   0.005
9    40   0.2   0.8   0.36   0.007   0.006   0.007   0.007   0.005   0.005   0.005   0.005
9    40   0.8   0.2   0.96   0.007   0.007   0.007   0.007   0.000   0.000   0.000   0.000
9    40   0.8   0.2   0.36   0.004   0.003   0.004   0.004   0.000   0.000   0.000   0.000
9    40   0.8   0.8   0.96   0.020   0.018   0.020   0.020   0.007   0.005   0.006   0.006
9    40   0.8   0.8   0.36   0.014   0.012   0.013   0.014   0.006   0.005   0.006   0.006


Two different sample sizes are considered, $(N, \bar{T}) = (20, 20)$ and $(N, \bar{T}) = (10, 40)$. Then,
following Baltagi and Chang (1995), we control for the extent of unbalancedness as measured by the
Ahrens and Pincus index: $\omega = N / \left(\bar{T} \sum_{i=1}^{N} (1/T_i)\right)$ ($0 < \omega \le 1$, $\omega = 1$ when the panel is balanced). For
each sample size we analyze a case of mild unbalancedness ($\omega = 0.96$) and a case of severe
unbalancedness ($\omega = 0.36$). Individuals are partitioned into two sets of equal dimension: one set contains
the first N/2 individuals, each with the last h observations discarded, so $T_i = T - h$; the other contains the
remaining N/2 individuals, each with $T_i = T$. We set T and h so that $\bar{T}$ and $\omega$ take on the desired values
(the four panel designs are summarized in Table 1).
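The index values of the four designs can be reproduced with a few lines of code. In this sketch the function names are hypothetical, and the pairs (T, h) are inferred from the group sizes listed in Table 1 rather than stated explicitly in the text.

```python
import numpy as np

def ahrens_pincus(T_i):
    """Ahrens and Pincus index: N / (T_bar * sum_i 1/T_i); equals 1 if balanced."""
    T_i = np.asarray(T_i, dtype=float)
    return T_i.size / (T_i.mean() * np.sum(1.0 / T_i))

def two_group_sizes(N, T, h):
    """First N/2 individuals keep T - h observations, the remaining ones keep T."""
    half = N // 2
    return np.array([T - h] * half + [T] * (N - half))

# (N, T, h) for the four designs, with h inferred as T minus the smaller T_i.
designs = [(20, 24, 8), (20, 36, 32), (10, 48, 16), (10, 72, 64)]
summary = []
for N, T, h in designs:
    sizes = two_group_sizes(N, T, h)
    summary.append((N, sizes.mean(), round(ahrens_pincus(sizes), 2)))
# summary holds (N, average group size, index) for each design
```

For the first design, for instance, the average group size is (16 + 24)/2 = 20 and the index is 1/(20 · (10/16 + 10/24)/20) = 24/25 = 0.96.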
For our Monte Carlo experiments and bias approximations we have developed two codes in Stata,
version 8 (available on request). Results are presented in Table 2. Columns 1–5 show the
parametrizations for each panel design. Columns 6 and 10 show the actual LSDV biases as estimated
by 20,000 Monte Carlo replications. As expected, the bias for both $\gamma$ and $\beta$ is decreasing in $\bar{T}$.
Interestingly, the bias for $\gamma$ is also decreasing in unbalancedness. With respect to $\sigma_s^2$, $\gamma$ and $\rho$, the patterns
found by Bun and Kiviet (2003) are all confirmed.
Columns 7–9 and 11–13 in Table 2 present the bias approximations for $\gamma$ and $\beta$, respectively.
Regardless of the degree of unbalancedness, they are accurate, with the approximations including higher
order terms being equal to the true bias in a vast majority of cases. In addition, as for the
balanced designs studied by Bun and Kiviet (2003), the leading term of the approximations already
accounts for a predominant portion of the true bias (90% on average).

4. Conclusion
This paper has derived approximations of various orders to the bias of the LSDV dynamic estimator for
unbalanced panel data. We find that the bias approximations are accurate, with a decreasing contribution
to the bias of the higher order terms. We also find that the bias is decreasing in $\bar{T}$ and, for $\gamma$, in
unbalancedness. Our results, therefore, suggest that (1) the bias approximations can be used to construct
LSDVC estimators for unbalanced panels; (2) the finding of Bun and Kiviet (2003) that bias corrections
can be based on the simple leading term of the approximation carries over to unbalanced panels; (3)
while increasing $\bar{T}$ is always beneficial in reducing the LSDV bias, reducing unbalancedness at the
expense of time observations, for given N and $\bar{T}$, may instead exacerbate the bias.

Acknowledgement
This paper has benefited from helpful comments by Orietta Dessy. Financial support from Bocconi
Ricerca di Base "Labour Demand, Production and Globalization" is gratefully acknowledged.

References
Baltagi, B.H., Chang, Y.J., 1995. Incomplete panels. Journal of Econometrics, 67–89.
Bun, M.J.G., Kiviet, J.F., 2003. On the diminishing returns of higher order terms in asymptotic expansions of bias. Economics
Letters 79, 145–152.


Judson, R.A., Owen, A.L., 1999. Estimating dynamic panel data models: a guide for macroeconomists. Economics Letters 65,
9–15.
Kiviet, J.F., 1995. On bias, inconsistency and efficiency of various estimators in dynamic panel data models. Journal of
Econometrics 68, 53–78.
Kiviet, J.F., 1999. Expectation of expansions for estimators in a dynamic panel data model; some results for weakly exogenous
regressors. In: Hsiao, C., Lahiri, K., Lee, L.-F., Pesaran, M.H. (Eds.), Analysis of Panel Data and Limited Dependent
Variables. Cambridge University Press, Cambridge.
Nickell, S.J., 1981. Biases in dynamic models with fixed effects. Econometrica 49, 1417–1426.