
Banco Central de Reserva del Perú

57 Curso de Extensión Universitaria


2010

Econometría

Unit Roots: A Selected Survey


Gabriel Rodríguez
Central Bank of Peru
Pontificia Universidad Católica del Perú
University of Ottawa

Unit Roots: A Selected Survey


Gabriel Rodríguez
Banco Central de Reserva del Perú
Pontificia Universidad Católica del Perú
Universidad del Pacífico
University of Ottawa

Pontificia Universidad Católica del Perú


Viernes Económico
Lima, October 24, 2008

© Gabriel Rodríguez, 2008

Motivation (1)

[Figure: Random Walk (No Drift); simulated series, observations 1 to 1000]

[Figure: Japan: Exchange Rate, Yen per Dollar; sample Aug-01 to Aug-08]

Motivation (2)

[Figure: Random Walk (with Drift); simulated series, observations 1 to 1000]

[Figure: Federal Reserve Board's Industrial Production Index]

Motivation (3)

[Figure: simulated series, observations 1 to 100: Y_AR_0.5, Y_AR_0.97, Y_AR_0.98, Y_AR_0.99 and Y_RANDOM_WALK]

Outline
Basic References: Campbell and Perron (1991), Stock (1994), Phillips
and Xiao (1999), Maddala and Kim (2000), Haldrup and Jansson (2006)
Data Generating Process
Classical Unit Root Statistics
Other Unit Root Statistics
Recent Unit Root Statistics
Some Issues on Unit Roots
Structural Change and Unit Roots
The Role of the Initial Condition and Unit Roots
Covariates and Unit Roots
Additive Outliers and Unit Roots
Further Issues and/or Limitations of this Survey

3 The Data Generating Process (DGP)

$$y_t = d_t + u_t, \qquad t = 1, \ldots, T, \tag{1}$$
$$u_t = \alpha u_{t-1} + v_t. \tag{2}$$

$u_0 = 0$ (initial condition);
$v_t = \sum_{i=0}^{\infty} \gamma_i \eta_{t-i}$ with $\sum_{i=0}^{\infty} i|\gamma_i| < \infty$ and where $\{\eta_t\}$ is a martingale difference sequence;
$v_t$ has a non-normalized spectral density at frequency zero given by $\sigma^2 = \sigma_\eta^2 \gamma(1)^2$, where $\sigma_\eta^2 = \lim_{T\to\infty} T^{-1} \sum_{t=1}^{T} E(\eta_t^2)$;
Under $H_0$, the Functional Central Limit Theorem (FCLT) says: $T^{-1/2} \sum_{t=1}^{[rT]} v_t \Rightarrow \sigma W(r)$, where $W(r)$ is a standard Wiener process.

$d_t = \psi' z_t$, where $z_t$ is a set of deterministic components;

$H_0: \alpha = 1$; $H_A: |\alpha| < 1$;

Local-to-unity framework: $\alpha = 1 + c/T$. Used later.

4 Classical Unit Root Statistics

4.1 The Dickey-Fuller (DF) Statistic

References: Dickey and Fuller (1979, 1981).
The regression model is
$$y_t = \psi' z_t + \alpha y_{t-1} + v_t. \tag{3}$$
Assume that $v_t \sim i.i.d.(0, \sigma^2)$ and $z_t = \{\emptyset\}$. Then, the asymptotic distributions are
$$T(\hat\alpha - 1) \Rightarrow \frac{\int W(r)\,dW}{\int W(r)^2\,dr} = \frac{0.5[W(1)^2 - 1]}{\int W(r)^2\,dr}, \tag{4}$$
$$t_{\hat\alpha} \Rightarrow \frac{\int W(r)\,dW}{[\int W(r)^2\,dr]^{1/2}} = \frac{0.5[W(1)^2 - 1]}{[\int W(r)^2\,dr]^{1/2}}. \tag{5}$$
If $z_t = \{1\}$ or $z_t = \{1, t\}$, then $W$ is replaced by $W^i = W(r) - \int W Z' \left(\int Z Z'\right)^{-1} Z(r)$, for $i = \mu, \tau$. $W^i$ is the projection of $W$ onto the space orthogonal to $z$.
Asymptotic critical values at 5.0% are: -1.94, -2.86, -3.43 for $z_t = \{\emptyset\}$, $z_t = \{1\}$ and $z_t = \{1, t\}$, respectively.
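
As an illustration of regression (3) with no deterministic terms, the two statistics whose limits appear in (4) and (5) can be computed by OLS. This is a minimal sketch using NumPy; the function name `df_statistics` and the simulated series are hypothetical and only serve the example.

```python
import numpy as np

def df_statistics(y):
    """Dickey-Fuller statistics for regression (3) without deterministic terms.

    Returns (T * (alpha_hat - 1), t-statistic for H0: alpha = 1).
    """
    y = np.asarray(y, dtype=float)
    y_lag, y_cur = y[:-1], y[1:]
    n = y_cur.size
    alpha_hat = (y_lag @ y_cur) / (y_lag @ y_lag)   # OLS slope
    resid = y_cur - alpha_hat * y_lag
    s2 = (resid @ resid) / (n - 1)                  # residual variance
    se = np.sqrt(s2 / (y_lag @ y_lag))              # standard error of alpha_hat
    return n * (alpha_hat - 1.0), (alpha_hat - 1.0) / se

# Illustrative data (assumption): a random walk and a stationary AR(1).
rng = np.random.default_rng(0)
e = rng.standard_normal(500)
random_walk = np.cumsum(e)                          # alpha = 1
stationary = np.zeros(500)
for t in range(1, 500):
    stationary[t] = 0.5 * stationary[t - 1] + e[t]  # alpha = 0.5

print(df_statistics(random_walk))
print(df_statistics(stationary))
```

The t-statistic is compared with the critical value for $z_t = \{\emptyset\}$ (-1.94 at 5.0%); the stationary series should reject clearly, while the random walk typically does not.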

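4.2 The Parametric ADF

Reference: Said and Dickey (1984).
Now, assume that $v_t$ is I(0), as in Section 3. In general: $v_t$ is an ARMA(p, q) process.
Assuming, as before, that $z_t = \{\emptyset\}$, then
$$T(\hat\alpha - 1) \Rightarrow \frac{0.5[W(1)^2 - 1] + \lambda}{\int W(r)^2\,dr}, \tag{6}$$
$$t_{\hat\alpha} \Rightarrow \left(\frac{\sigma}{\sigma_v}\right) \frac{\{0.5[W(1)^2 - 1]\} + \lambda}{[\int W(r)^2\,dr]^{1/2}}, \tag{7}$$
where $\lambda = (\sigma^2 - \sigma_v^2)/(2\sigma^2)$.

Distributions depend on nuisance parameters.

The autocorrelation is corrected using the following autoregression
$$\Delta y_t = \psi' z_t + \beta_0 y_{t-1} + \sum_{i=1}^{k} b_i \Delta y_{t-i} + e_{tk}, \tag{8}$$
where $\beta_0 = \alpha - 1$. Then, $H_0: \beta_0 = 0$.

If we define a detrended time series as $\tilde y_t = y_t - \hat\psi' z_t$, then (8) is equivalently written as
$$\Delta \tilde y_t = \beta_0 \tilde y_{t-1} + \sum_{i=1}^{k} b_i \Delta \tilde y_{t-i} + e_{tk}. \tag{9}$$
If $k \to \infty$ and $k^3/T \to 0$, then $T(\hat\alpha - 1)$ and $t_{\hat\alpha}$ converge to the expressions (4) and (5).

Important empirical application: Nelson and Plosser (1982).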
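
The autoregression (8) can be sketched in a few lines of NumPy. The function below is an illustrative assumption (the name `adf_t_statistic`, the fixed lag `k` and the simulated series are not from the slides); it returns the t-statistic on $\beta_0$ with a constant and an optional linear trend.

```python
import numpy as np

def adf_t_statistic(y, k, trend=False):
    """t-statistic on beta_0 in regression (8) with z_t = {1} or {1, t}."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                          # dy[j] = y[j+1] - y[j]
    rows = np.arange(k, dy.size)             # observations with k lags available
    cols = [np.ones(rows.size), y[rows]]     # constant and y_{t-1}
    if trend:
        cols.append(rows.astype(float))      # linear trend
    for i in range(1, k + 1):
        cols.append(dy[rows - i])            # lagged differences
    X = np.column_stack(cols)
    target = dy[rows]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    s2 = resid @ resid / (rows.size - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])      # coefficient on y_{t-1}

# Illustrative stationary AR(1) series (assumption).
rng = np.random.default_rng(1)
shocks = rng.standard_normal(500)
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + shocks[t]
t_adf = adf_t_statistic(ar1, k=2)
print(t_adf)
```

In practice `k` would be chosen by a data-dependent rule rather than fixed in advance.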

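4.3 The Semi-Parametric $Z_{\hat\alpha}$ and $Z_t$ Statistics

References: Phillips (1987, 1988), Phillips and Perron (1988).

The coefficient $\alpha$ is estimated from equation (3). Residuals $\hat v_t$ are used in constructing an estimator of $\sigma^2$. Therefore, the autocorrelation is taken into account in a non-parametric way:
$$s^2 = T^{-1} \sum_{t=1}^{T} \hat v_t^2 + 2T^{-1} \sum_{\tau=1}^{k} w(\tau, k) \sum_{t=\tau+1}^{T} \hat v_t \hat v_{t-\tau}, \tag{10}$$
$$w(\tau, k) = 1 - \frac{\tau}{k+1}. \tag{11}$$

Using (10) and (11), and with $\hat s_v^2 = T^{-1} \sum_{t=1}^{T} \hat v_t^2$, we have that
$$Z_{\hat\alpha} = T(\hat\alpha - 1) - \frac{0.5(s^2 - \hat s_v^2)}{T^{-2} \sum_{t=2}^{T} y_{t-1}^2} \Rightarrow \frac{0.5[W(1)^2 - 1]}{\int W(r)^2\,dr}, \tag{12}$$
$$Z_t = \left(\frac{\hat s_v}{s}\right) t_{\hat\alpha} - \frac{0.5(s^2 - \hat s_v^2)}{s\,[T^{-2} \sum_{t=2}^{T} y_{t-1}^2]^{1/2}} \Rightarrow \frac{0.5[W(1)^2 - 1]}{[\int W(r)^2\,dr]^{1/2}}, \tag{13}$$
which are the same as in (4) and (5), respectively.

Asymptotic critical values of $Z_{\hat\alpha}$ at 5.0% are -8.0, -14.1, and -21.7 for $z_t = \{\emptyset\}$, $z_t = \{1\}$ and $z_t = \{1, t\}$, respectively.
Asymptotic critical values of $Z_t$ at 5.0% are -1.94, -2.86, -3.43 for $z_t = \{\emptyset\}$, $z_t = \{1\}$ and $z_t = \{1, t\}$, respectively.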
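
The corrections in (10)-(13) can be sketched for the case without deterministic terms. The function name `pp_statistics` and the simulated random walk are assumptions made for illustration; the t-statistic here uses the variance estimate $T^{-1}\sum \hat v_t^2$, so that with $k = 0$ the corrections vanish exactly.

```python
import numpy as np

def pp_statistics(y, k):
    """Return (Z_alpha, Z_t, T*(alpha_hat - 1), t_alpha) for z_t = {empty set}."""
    y = np.asarray(y, dtype=float)
    y_lag, y_cur = y[:-1], y[1:]
    T = y_cur.size
    sum_y2 = y_lag @ y_lag
    alpha_hat = (y_lag @ y_cur) / sum_y2
    v = y_cur - alpha_hat * y_lag                 # residuals from (3)
    s2_v = v @ v / T                              # first term of (10)
    s2 = s2_v
    for tau in range(1, k + 1):
        w = 1.0 - tau / (k + 1.0)                 # Bartlett weight (11)
        s2 += 2.0 * w * (v[tau:] @ v[:-tau]) / T  # autocovariance terms of (10)
    t_alpha = (alpha_hat - 1.0) / np.sqrt(s2_v / sum_y2)
    correction = 0.5 * (s2 - s2_v)
    scaled = sum_y2 / T**2                        # T^{-2} sum of y_{t-1}^2
    z_alpha = T * (alpha_hat - 1.0) - correction / scaled          # (12)
    z_t = np.sqrt(s2_v / s2) * t_alpha - correction / (np.sqrt(s2) * np.sqrt(scaled))  # (13)
    return z_alpha, z_t, T * (alpha_hat - 1.0), t_alpha

# Illustrative random walk (assumption).
rng = np.random.default_rng(2)
walk = np.cumsum(rng.standard_normal(300))
z_alpha, z_t, coef_stat, t_stat = pp_statistics(walk, k=4)
print(z_alpha, z_t)
```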


4.4 The M-Statistics

References: Stock (1999), Perron and Ng (1996).
Definitions:
$$MZ_\alpha = \frac{T^{-1} \tilde y_T^2 - s^2}{2T^{-2} \sum_{t=1}^{T} \tilde y_{t-1}^2}, \tag{14}$$
$$MSB = \left[\frac{T^{-2} \sum_{t=1}^{T} \tilde y_{t-1}^2}{s^2}\right]^{1/2}, \tag{15}$$
$$MZ_t = \frac{T^{-1} \tilde y_T^2 - s^2}{[4 s^2 T^{-2} \sum_{t=1}^{T} \tilde y_{t-1}^2]^{1/2}}, \tag{16}$$
where $\tilde y_t = y_t - \hat\psi' z_t$, $s^2 = s_{ek}^2/[1 - \hat b(1)]^2$, $s_{ek}^2 = (T-k)^{-1} \sum_{t=k+1}^{T} \hat e_{tk}^2$, and $\hat b(1) = \sum_{j=1}^{k} \hat b_j$, obtained from the autoregression (9).

The limiting distributions of $MZ_\alpha$ and $MZ_t$ are the expressions (4) and (5), respectively. The $MSB \Rightarrow [\int W(r)^2\,dr]^{1/2}$.
Asymptotically: $MZ_t = (MZ_\alpha) \times (MSB)$.

Asymptotic critical values: see Stock (1999).
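
A minimal sketch of definitions (14)-(16), taking the detrended series and an estimate of $s^2$ as given. The function name `m_statistics`, the demeaned example series and the crude variance estimate are assumptions for illustration only; in applications $s^2$ would come from the autoregressive estimator defined above.

```python
import numpy as np

def m_statistics(y_tilde, s2):
    """Return (MZ_alpha, MSB, MZ_t) from detrended data and a long-run variance estimate."""
    y_tilde = np.asarray(y_tilde, dtype=float)
    T = y_tilde.size
    # T^{-2} * sum of squared lagged values (the unavailable y_0 term is omitted).
    scaled_sum = np.sum(y_tilde[:-1] ** 2) / T**2
    numer = y_tilde[-1] ** 2 / T - s2
    mz_alpha = numer / (2.0 * scaled_sum)            # (14)
    msb = np.sqrt(scaled_sum / s2)                   # (15)
    mz_t = numer / np.sqrt(4.0 * s2 * scaled_sum)    # (16)
    return mz_alpha, msb, mz_t

# Illustrative data (assumption): demeaned random walk, i.i.d. errors.
rng = np.random.default_rng(3)
y = np.cumsum(rng.standard_normal(200))
y_tilde = y - y.mean()               # OLS detrending with z_t = {1}
s2 = np.var(np.diff(y))              # crude stand-in for s^2
mz_alpha, msb, mz_t = m_statistics(y_tilde, s2)
print(mz_alpha, msb, mz_t)
```

With these definitions the relation $MZ_t = MZ_\alpha \times MSB$ holds for any sample, which gives a quick check on an implementation.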


Simulation Monte-Carlo evidence.

[Table: Monte Carlo simulation evidence (scanned image; the entries are not recoverable)]
5 Recent Unit Root Statistics

5.1 The $ADF^{GLS}$

References: Elliott, Rothenberg and Stock (ERS, 1996), Ng and Perron (2001).
Under the local-to-unity framework: $\alpha = 1 + c/T$. Then, $T^{-1/2} u_{[Tr]} \Rightarrow \sigma W_c(r)$, where $W_c(r)$ is an Ornstein-Uhlenbeck process.

It bridges the gap between I(0) and I(1) asymptotics. If $c \to -\infty$, $T(\hat\alpha - 1)$ and $t_{\hat\alpha}$ have I(0) distributions. If $c \to +\infty$, $T(\hat\alpha - 1)$ and $t_{\hat\alpha}$ have Cauchy and Normal distributions, respectively.
Particular characteristic: use of GLS detrended data with $\bar\alpha = 1 + \bar c/T$.

Construction of GLS detrended data:
$$y_t^{\bar\alpha} = [y_1, (1 - \bar\alpha L) y_t], \qquad t = 2, \ldots, T, \tag{17}$$
$$z_t^{\bar\alpha} = [z_1, (1 - \bar\alpha L) z_t], \qquad t = 2, \ldots, T. \tag{18}$$

Let $\hat\psi$ be the estimator that minimizes:
$$S(\psi) = (y_t^{\bar\alpha} - \psi' z_t^{\bar\alpha})'(y_t^{\bar\alpha} - \psi' z_t^{\bar\alpha}). \tag{19}$$

Detrended series: $\tilde y_t = y_t - \hat\psi_{GLS}' z_t$.

All unit root statistics may be used with $\tilde y_t$. For the ADF, see ERS (1996) and for the M-statistics, see Ng and Perron (2001).
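
The construction in (17)-(19) amounts to an OLS regression on quasi-differenced data. The sketch below assumes $z_t = \{1\}$ or $z_t = \{1, t\}$; the function name `gls_detrend` and the example values are hypothetical.

```python
import numpy as np

def gls_detrend(y, c_bar, trend=False):
    """GLS detrending with alpha_bar = 1 + c_bar / T; returns (y_tilde, psi_hat)."""
    y = np.asarray(y, dtype=float)
    T = y.size
    alpha_bar = 1.0 + c_bar / T
    z = np.ones((T, 1))
    if trend:
        z = np.column_stack([z, np.arange(1, T + 1, dtype=float)])
    # Quasi-differences (17)-(18): the first observation is kept in levels.
    y_qd = np.concatenate([y[:1], y[1:] - alpha_bar * y[:-1]])
    z_qd = np.vstack([z[:1], z[1:] - alpha_bar * z[:-1]])
    psi_hat, *_ = np.linalg.lstsq(z_qd, y_qd, rcond=None)   # minimizes (19)
    return y - z @ psi_hat, psi_hat

# Sanity check on an exact linear trend (assumption): the detrended series is zero.
t_index = np.arange(1, 101, dtype=float)
y_line = 2.0 + 0.3 * t_index
y_tilde_line, psi_line = gls_detrend(y_line, c_bar=-13.5, trend=True)
print(psi_line)
```

Note that the estimated coefficients are applied to the original (not quasi-differenced) data to form $\tilde y_t$.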
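When $z_t = \{1\}$ and $z_t = \{1, t\}$, the limiting distributions are, respectively:
$$DF^{GLS} \Rightarrow \frac{0.5[W_c(1)^2 - 1]}{[\int W_c(r)^2\,dr]^{1/2}}, \tag{20}$$
$$DF^{GLS} \Rightarrow \frac{0.5[V_{c,\bar c}(1)^2 - 1]}{[\int V_{c,\bar c}(r)^2\,dr]^{1/2}}, \tag{21}$$
where $V_{c,\bar c}(r) = W_c(r) - r b$, $b = \lambda W_c(1) + 3(1 - \lambda) \int r W_c(r)\,dr$, and $\lambda = (1 - \bar c)/(1 - \bar c + \bar c^2/3)$.

Asymptotic critical values: see ERS (1996), Ng and Perron (2001).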

5.2 A Feasible Point Optimal Test

References: Dufour and King (1991), ERS (1996).
This test is denoted by $P_T^{GLS}$ and defined by:
$$P_T^{GLS}(c, \bar c) = \frac{S(\bar\alpha) - \bar\alpha S(1)}{s^2}, \tag{22}$$
where $S(\bar\alpha)$ and $S(1)$ are the sums of squared errors from GLS regressions with $\alpha = \bar\alpha$ and $\alpha = 1$, respectively.
Limiting distributions:
$$P_T^{GLS,\mu}(c, \bar c) \Rightarrow \bar c^2 \int W_c(r)^2\,dr - \bar c\, W_c(1)^2, \tag{23}$$
$$P_T^{GLS,\tau}(c, \bar c) \Rightarrow \bar c^2 \int V_{c,\bar c}(r)^2\,dr + (1 - \bar c) V_{c,\bar c}(1)^2, \tag{24}$$
for $z_t = \{1\}$ and $z_t = \{1, t\}$, respectively.

Selection of $\bar c$.
Asymptotic critical values: see ERS (1996), Ng and Perron (2001).

6 Some Issues on Unit Root Tests

6.1 The Asymptotic Gaussian Power Envelope

There is no uniformly most powerful (UMP) or uniformly most powerful invariant (UMPI) statistic in the unit root framework.
With $\alpha = 1 + c/T$, derivation of the asymptotic Gaussian power envelope.

The power envelope allows us to judge between different alternative statistics.

The asymptotic Gaussian power envelope is defined by:
$$\pi(c) = \Pr[H^{P_T^{GLS}}(c, c) < b^{P_T^{GLS}}(c)], \tag{25}$$
where $b^{P_T^{GLS}}(c)$ is such that
$$\Pr[H^{P_T^{GLS}}(0, c) < b^{P_T^{GLS}}(c)] = \nu, \tag{26}$$
with $\nu$ the size of the test.

Selection of $\bar c$ (-7.0 for $z_t = \{1\}$ and -13.5 for $z_t = \{1, t\}$).

6.2 Asymptotic Power Functions

The asymptotic power functions of the tests are defined by:
$$\pi^J(c, \bar c) = \Pr[H^{J^{GLS}}(c, \bar c) < b^{J^{GLS}}(\bar c)],$$
where $J = MZ_\alpha$, $MSB$, $MZ_t$, and $ADF$, and the constant $b^{J^{GLS}}(\bar c)$ is such that $\Pr[H^{J^{GLS}}(0, \bar c) < b^{J^{GLS}}(\bar c)] = \nu$, the size of the tests.

[Figure: scanned plot; not recoverable]
6.3 Selection of the Lag Length

Information Criteria: AIC, BIC
$$k_{aic} = \arg\min_{\{k\}} \log(s_{ek}^2) + \frac{2k}{T}, \tag{27}$$
$$k_{bic} = \arg\min_{\{k\}} \log(s_{ek}^2) + \frac{\log(T)\,k}{T}. \tag{28}$$

Recursive t-sig method

Modified Information Criteria: MAIC, MBIC (Ng and Perron, 2001):
$$k_{mic} = \arg\min_{\{k\}} \log(s_{ek}^2) + \frac{C_T[\hat\tau_T(k) + k]}{T}, \tag{29}$$
where
$$\hat\tau_T(k) = (s_{ek}^2)^{-1} \hat\beta_0^2 \sum_{t=1}^{T} \tilde y_{t-1}^2. \tag{30}$$

The MAIC uses $C_T = 2$ and the MBIC uses $C_T = \log(T)$.

Ng and Perron (2001), based on theoretical considerations and simulations, recommended MAIC.
The advantage of the MIC is that it takes into account the possible dependence of $\hat\beta_0$ on $k$.
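
A rough sketch of the modified criterion in (29)-(30), applied to an already detrended series. Several choices here are assumptions rather than part of the slides: the function name `select_lag_mic`, the use of a common estimation sample for all `k`, and the use of the effective sample size in place of $T$.

```python
import numpy as np

def select_lag_mic(y_tilde, k_max, penalty="maic"):
    """Pick k in {0, ..., k_max} minimizing the MAIC (C_T = 2) or MBIC (C_T = log n)."""
    y_tilde = np.asarray(y_tilde, dtype=float)
    dy = np.diff(y_tilde)
    rows = np.arange(k_max, dy.size)          # same sample for every k
    n = rows.size
    c_t = 2.0 if penalty == "maic" else np.log(n)
    best_k, best_value = 0, np.inf
    for k in range(k_max + 1):
        cols = [y_tilde[rows]] + [dy[rows - i] for i in range(1, k + 1)]
        X = np.column_stack(cols)             # regression (9)
        beta, *_ = np.linalg.lstsq(X, dy[rows], rcond=None)
        resid = dy[rows] - X @ beta
        s2_ek = resid @ resid / n
        tau = beta[0] ** 2 * np.sum(y_tilde[rows] ** 2) / s2_ek   # (30)
        value = np.log(s2_ek) + c_t * (tau + k) / n               # (29)
        if value < best_value:
            best_k, best_value = k, value
    return best_k

# Illustrative unit root series with negative MA(1) errors (assumption).
rng = np.random.default_rng(4)
e = rng.standard_normal(301)
y = np.cumsum(e[1:] - 0.5 * e[:-1])
k_hat = select_lag_mic(y - y.mean(), k_max=8)
print(k_hat)
```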


6.4 Summary of Monte-Carlo Evidence

All asymptotically valid tests exhibit finite-sample size distortions for models close to an I(0) model.
Importance of data-dependent methods to select the lag length.
The presence of non-normality or conditional heteroskedasticity increases size distortions.
Including additional trend terms reduces the power of unit root tests if the trends are unnecessary.
Span is important, not the frequency.
The power of unit root tests depends on the initial condition $u_0$.
If the trend is underspecified, unit root tests and estimators are inconsistent.


7 Structural Change and Unit Root Tests

7.1 Introduction

References: Perron (1989), Christiano (1992), Banerjee et al. (1992), Zivot and Andrews (1992), Perron (1997), Perron and Rodríguez (2003a).
Basic idea: misspecification of the trend function is responsible for the nonrejection of the null hypothesis of a unit root in Nelson and Plosser (1982).
Models (I, II, III):
$z_t = \{1, 1(t > T_B), t\}$;
$z_t = \{1, t, 1(t > T_B)(t - T_B)\}$;
$z_t = \{1, 1(t > T_B), t, 1(t > T_B)(t - T_B)\}$;
where $1(\cdot)$ is the indicator function and $T_B$ is the break point. Assume that $T_B = \delta T$, for some $\delta \in (0, 1)$.
Perron (1989)
Christiano (1992)
Zivot and Andrews (1992)
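
The deterministic components of Models I, II and III can be built directly from the indicator function. The following is a small sketch; the function name `break_regressors` and the rounding of the break date by `int` are assumptions made for illustration.

```python
import numpy as np

def break_regressors(T, delta, model):
    """Regressor matrix z_t for Models I, II or III with break date T_B = delta * T."""
    t = np.arange(1, T + 1, dtype=float)
    t_b = int(delta * T)                       # break point (truncated to an integer)
    level_shift = (t > t_b).astype(float)      # 1(t > T_B)
    slope_shift = level_shift * (t - t_b)      # 1(t > T_B)(t - T_B)
    const = np.ones(T)
    if model == "I":
        return np.column_stack([const, level_shift, t])
    if model == "II":
        return np.column_stack([const, t, slope_shift])
    if model == "III":
        return np.column_stack([const, level_shift, t, slope_shift])
    raise ValueError("model must be 'I', 'II' or 'III'")

z_example = break_regressors(10, 0.5, "III")
print(z_example)
```

For the unknown-break procedures, such a matrix would be rebuilt for each candidate value of the break fraction.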

[Figures (scanned): time series plots with fitted broken trends. The legible notes indicate that the broken straight line is a trend fitted by OLS, with break dates 1929 and 1973. The images are not recoverable.]

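7.2 GLS Detrended Data and Structural Change

Reference: Perron and Rodríguez (2003a).
Limiting distributions of the unit root statistics (Models II and III):
$$MZ_\alpha^{GLS}(\delta) \Rightarrow \frac{0.5 K_1(c, \bar c, \delta)}{K_2(c, \bar c, \delta)} \equiv H^{MZ_\alpha}(c, \bar c, \delta), \tag{31}$$
$$MSB^{GLS}(\delta) \Rightarrow (K_2(c, \bar c, \delta))^{1/2} \equiv H^{MSB}(c, \bar c, \delta), \tag{32}$$
$$MZ_t^{GLS}(\delta) \Rightarrow \frac{0.5 K_1(c, \bar c, \delta)}{(K_2(c, \bar c, \delta))^{1/2}} \equiv H^{MZ_t}(c, \bar c, \delta), \tag{33}$$
$$ADF^{GLS}(\delta) \Rightarrow \frac{0.5 K_1(c, \bar c, \delta)}{(K_2(c, \bar c, \delta))^{1/2}} \equiv H^{ADF}(c, \bar c, \delta), \tag{34}$$
where:
$$K_1(c, \bar c, \delta) = V_{c\bar c}^{(1)}(1, \delta)^2 - 2 V_{c\bar c}^{(2)}(1, \delta) - 1,$$
$$K_2(c, \bar c, \delta) = \int_0^1 V_{c\bar c}^{(1)}(r, \delta)^2\,dr - 2 \int_\delta^1 V_{c\bar c}^{(2)}(r, \delta)\,dr.$$

What happens with Model I under GLS detrended data? Reference: Rodríguez (2007).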

7.3 Selection of the Break Point

Method 1: Estimating $\delta$ as the break point that yields the minimal value of the statistics; see Zivot and Andrews (1992), i.e. using
$$\inf_{\{\delta\}} J^{GLS}(\delta),$$
where $J = MZ_\alpha$, $MSB$, $MZ_t$, and $ADF$.

By the Continuous Mapping Theorem (CMT), the limiting distribution using method 1 is:
$$\inf_{\delta \in (0,1)} J^{GLS}(\delta) \Rightarrow \inf_{\delta \in (0,1)} H^J(c, \bar c, \delta). \tag{35}$$

Method 2: Choose the break point such that the absolute value of the t-statistic on the coefficient of the change in slope is maximized; see Perron (1997):
$$\hat\delta = \arg\max_{\delta \in (\varepsilon, 1-\varepsilon)} |t_{\hat\beta_2}(\delta)|.$$

Limiting distribution using method 2:
$$\hat\delta = \arg\max_{\delta \in (\varepsilon, 1-\varepsilon)} |t_{\hat\beta_2}(\delta)| \Rightarrow \arg\max_{\delta \in (\varepsilon, 1-\varepsilon)} |b_4/(b_3)^{1/2}| \equiv \delta^*.$$

Hence, the limiting distributions of the statistics are given by
$$J^{GLS}(\hat\delta) \Rightarrow H^J(c, \bar c, \delta^*). \tag{36}$$

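7.4 The Feasible Point Optimal Test

When $\delta$ is unknown:
$$P_T^{GLS}(c, \bar c) = \Big\{ \inf_{\delta \in [\varepsilon, 1-\varepsilon]} S(\bar\alpha, \delta) - \bar\alpha \inf_{\delta \in [\varepsilon, 1-\varepsilon]} S(1, \delta) \Big\} / s^2. \tag{37}$$

Limiting distribution:
$$P_T^{GLS}(c, \bar c) \Rightarrow \sup_{\delta \in [\varepsilon, 1-\varepsilon]} M(c, 0, \delta) - \sup_{\delta \in [\varepsilon, 1-\varepsilon]} M(c, \bar c, \delta) - 2\bar c \int W_c(r)\,dW(r) + (\bar c^2 - 2\bar c c) \int W_c(r)^2\,dr \equiv H^{P_T^{GLS}}(c, \bar c). \tag{38}$$

Derivation of the power envelope.

Selection of $\bar c$ ($\bar c = -22.5$).

Asymptotic Power Functions.

Finite-Sample Size and Power.
Empirical Evidence.

Figure 1: Gaussian Local Power Envelope and the Local Asymptotic Power Functions of the Tests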

8 The Role of the Initial Condition

Traditionally, theoretical work assumes that the starting value of the time series is zero or has finite expectation, so the effect of the initial observation disappears asymptotically.
Exceptions: Elliott (1999), Müller and Elliott (2003), in models without structural change.
Hui and Rodríguez (2006) introduce both an unknown structural break and a random initial condition under the alternative hypothesis.
The data generating process (DGP) is the same as before, except that:

Condition A (Initial condition assumption). We assume that $u_0$ is zero when $\alpha = 1$, so $u_1 = v_1$; while $u_1$ has mean zero and variance $\sigma^2/(1 - \alpha^2)$ when $\alpha < 1$.
The innovations $\{v_t\}$ satisfy the standard assumptions.
Same statistics as in Perron and Rodríguez (2003a).
For Models I and II, the statistics have the following limiting distributions:
$$MZ_\alpha^{GLS}(\delta) \Rightarrow \frac{0.5 g_1(c, \bar c, \delta)}{g_2(c, \bar c, \delta)} \equiv J^{MZ_\alpha^{GLS}}(c, \bar c, \delta),$$
$$MSB^{GLS}(\delta) \Rightarrow (g_2(c, \bar c, \delta))^{1/2} \equiv J^{MSB^{GLS}}(c, \bar c, \delta),$$
$$MZ_t^{GLS}(\delta) \Rightarrow \frac{0.5 g_1(c, \bar c, \delta)}{(g_2(c, \bar c, \delta))^{1/2}} \equiv J^{MZ_t^{GLS}}(c, \bar c, \delta),$$
$$ADF^{GLS}(\delta) \Rightarrow \frac{0.5 g_1(c, \bar c, \delta)}{(g_2(c, \bar c, \delta))^{1/2}} \equiv J^{ADF^{GLS}}(c, \bar c, \delta).$$
Using the power envelope, we obtain $\bar c = -24$.

$T = 1000$ and 10,000 replications are used to calculate the asymptotic power function for each statistic.
The power functions lie below the power envelope, but not far from it.
Using the infimum method to choose the break point sometimes gives a slightly higher power function than the supremum method.

Figure 1. Gaussian Power Envelope and Asymptotic Power Functions; Infimum Method and Fixed and Random Initial Condition.

Figure 2. Gaussian Power Envelope and Asymptotic Power Functions; Supremum Method and Fixed and Random Initial Condition.

9 Covariates and Unit Root Tests

Importance of covariates in improving the power of unit root tests.
References: Hansen (1995), Elliott and Jansson (2003). For structural change models: Hui and Rodríguez (2006).
The data generating process (DGP):
$$y_t = d_{yt} + u_{yt}, \tag{39}$$
$$x_t = d_{xt} + u_{xt}, \tag{40}$$
$$A(L) \begin{bmatrix} (1 - \alpha L) u_{y,t} \\ u_{x,t} \end{bmatrix} \equiv A(L)\, u_t(\alpha) = e_t, \tag{41}$$
$$\alpha = 1 + cT^{-1}, \tag{42}$$
where $x_t$, an $m \times 1$ vector, is an arbitrary number of stationary covariates containing extra information on $y_t$, the variable to be tested.
$\Omega$ is defined as the spectral density at frequency zero (scaled by $2\pi$) of $u_t(\alpha)$. Therefore $R^2 = \omega_{yy}^{-1} \omega_{yx} \Omega_{xx}^{-1} \omega_{yx}'$ is a measure of the long-run correlation between shocks to $x_t$ and quasi-differences of $y_t$ at frequency zero. The value of $R^2$ represents the contribution of the covariates to the explanation of $y_t$, and the value of $R^2$ ranges from zero to unity.

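The optimal statistic is defined by
$$P^i(1, \bar\alpha) = \inf_{\delta \in (0,1)} \sum_{t=1}^{T} \hat u_t^i(\bar\alpha, \delta)'\, \Omega^{-1} \hat u_t^i(\bar\alpha, \delta) - \inf_{\delta \in (0,1)} \sum_{t=1}^{T} \hat u_t^i(1, \delta)'\, \Omega^{-1} \hat u_t^i(1, \delta) - \bar c, \tag{43}$$
where $\hat u_t^i(r) = z_t(r) - d_t(r)' \hat\psi$ and $r = \bar\alpha, 1$.

The Theorem establishes that for cases $i = 1$ and $2$:
$$P^i(1, \bar\alpha) \Rightarrow \varphi_1(c, \bar c, R^2) + \varphi_2^i(c, \bar c, \delta, R^2). \tag{44}$$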

The asymptotic power depends on $\bar c$, which corresponds to one particular point under the alternative hypothesis.
The distribution of the $P^i(1, \bar\alpha)$ test also depends on the parameter $R^2$.
When $R^2 = 0$, there is no covariate correlated with the quasi-differences of $y_t$ and consequently we retrieve the same asymptotic distribution as that derived in Perron and Rodríguez (2003). When $R^2$ is greater than zero, the limiting distribution is a function of $R^2$, indicating that the extra information contained in the covariates may make a difference to the performance of the test.

Figure 1. Power Envelopes for $R^2 = 0.0, 0.3, 0.5, 0.7, 0.9$.

10 Additive Outliers and Unit Roots

10.1 Motivation (1)

Additive outliers affect:

Inference on coefficients in ARMA models (Cheng and Liu, 1993)
Size and power of unit root statistics (Franses and Haldrup, 1994; Vogelsang, 1999; Perron and Rodríguez, 2003b)
Size and power of causality tests (Balde and Rodríguez, 2005)
Inference on the parameter of fractional integration (Haldrup and Nielsen, 2007)

Motivation (2)

[Figure: Monthly Latin-American Inflation Series]

Motivation (3)

[Figure: Quarterly Latin-American Inflation Series; panels: Argentina, Bolivia, Chile, Colombia, Ecuador, Paraguay, Peru, Uruguay, Venezuela]

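10.2 The DGP

The data generating process:
$$y_t = d_t + \sum_{i=1}^{m} \delta_i D(T_{ao,i})_t + u_t, \tag{45}$$
$$(1 - \alpha L) u_t = v_t, \tag{46}$$

$D(T_{ao,i})_t = 1$ if $t = T_{ao,i}$ and 0 otherwise;
$d_t = \mu$ or $d_t = \mu + \beta t$;
$v_t = \sum_{i=0}^{\infty} \varphi_i L^i e_t$, and $\sigma_e^2 = \lim_{T\to\infty} T^{-1} \sum_{t=1}^{T} E(e_t^2)$ is finite;
$T^{-1/2} \sum_{t=1}^{[Tr]} v_t \Rightarrow \sigma W(r)$, where $W(r)$ is the unit Wiener process.

$m$ is the number of additive outliers.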

10.3 Detection of Outliers

An iterative strategy using tests based on first-differences of the data.
Consider data generated by (45)-(46) with $\alpha = 1$, $d_t = \mu$, and a single outlier occurring at date $T_{ao}$ with magnitude $\delta$. Then,
$$\Delta y_t = \delta [D(T_{ao})_t - D(T_{ao})_{t-1}] + v_t, \tag{47}$$
where $D(T_{ao})_t = 1$ if $t = T_{ao}$ (0, otherwise) and $D(T_{ao})_{t-1} = 1$ if $t = T_{ao} + 1$ (0, otherwise).

Under the null hypothesis of no outliers, $\Delta y_t = u_t - u_{t-1}$ and the OLS estimate of $\delta$ is:
$$\hat\delta = \frac{\Delta y_{T_{ao}} - \Delta y_{T_{ao}+1}}{2}. \tag{48}$$

Consider the following test statistic
$$\tau_d = \sup_{T_{ao}} |t_{\hat\delta}(T_{ao})|, \tag{49}$$
where $t_{\hat\delta}(T_{ao}) = \hat\delta / [(\hat R_u(0) - \hat R_u(1))/2]^{1/2}$, $\hat R_u(j) = T^{-1} \sum_{t=1}^{T-j} \hat v_t \hat v_{t+j}$, and $\hat v_t$ are the least-squares residuals from (47).

Procedure:

Compute the $\tau_d$ statistic for the entire series. If $\tau_d$ exceeds the critical value, then an outlier is detected at date $\hat T_{ao} = \arg\max_{T_{ao}} |t_{\hat\delta}(T_{ao})|$.
The outlier and the corresponding row of the regression are dropped, and (47) is again estimated and tested for the presence of another outlier. This continues until the test shows a non-rejection.
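
The iterative procedure can be sketched as follows. The function names `tau_d` and `find_outliers` are hypothetical, the t-ratio is scaled by the variance of $(\Delta y_{T_{ao}} - \Delta y_{T_{ao}+1})/2$, and, as a simplifying assumption, a flagged observation is dropped from the level series before the statistic is recomputed. The critical value has to be supplied by the user.

```python
import numpy as np

def tau_d(y):
    """Return (tau_d, position in y of the most likely additive outlier)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[j] = y[j+1] - y[j]
    n = dy.size
    best_stat, best_pos = -np.inf, None
    # An outlier at y[j+1] adds +delta to dy[j] and -delta to dy[j+1];
    # the first and last observations are therefore not candidates.
    for j in range(n - 1):
        delta_hat = 0.5 * (dy[j] - dy[j + 1])
        resid = dy.copy()
        resid[j] -= delta_hat
        resid[j + 1] += delta_hat
        r0 = resid @ resid / n                       # autocovariance at lag 0
        r1 = resid[1:] @ resid[:-1] / n              # autocovariance at lag 1
        stat = abs(delta_hat) / np.sqrt((r0 - r1) / 2.0)
        if stat > best_stat:
            best_stat, best_pos = stat, j + 1
    return best_stat, best_pos

def find_outliers(y, critical_value, max_outliers=4):
    """Flag outliers one at a time until the test no longer rejects."""
    y = np.asarray(y, dtype=float)
    keep = np.arange(y.size)
    found = []
    while len(found) < max_outliers:
        stat, pos = tau_d(y[keep])
        if stat <= critical_value:
            break
        found.append(int(keep[pos]))
        keep = np.delete(keep, pos)       # drop the flagged observation
    return found

# Illustrative random walk with one large additive outlier (assumption).
rng = np.random.default_rng(5)
y = np.cumsum(rng.standard_normal(100))
y[50] += 20.0
found = find_outliers(y, critical_value=3.65)
print(found)
```

The first position flagged is the contaminated observation, because the outlier produces two adjacent differences of opposite sign and large magnitude.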

10.4 Monte Carlo Design

Data Generating Process:
$$y_t = d_t + \sum_{i=1}^{m} \delta_i D(T_{ao,i})_t + u_t,$$
$$(1 - L)^d u_t = v_t, \qquad v_t \sim i.i.d.\ N(0, 1);$$

$D(T_{ao,i})_t = 1$ if $t = T_{ao,i}$ and 0 otherwise;
$\delta_i$ is the size of the outliers;
$d_t$ are the deterministic components (only an intercept);

5000 replications; $T = 100$; $m = 4$.
Three cases for $\delta_i$:
$\delta_i = 0$ for $i = 1, 2, 3, 4$ (no outliers);
$\delta_1 = 10$, $\delta_i = 5$ for $i = 2, 3, 4$ (intermediate size outliers);
$\delta_i = 10$ for $i = 1, 2, 3, 4$ (big outliers).

When outliers are present they are located at 20%, 40%, 60% and 80% of the sample size.
Criteria to evaluate results: Mean, Bias, MSE of $\hat d$, and size of $t_{\hat d}$ ($H_0: d = d_0$).

P. PERRON AND G. RODRÍGUEZ

2.1. Simulation experiments for size

To assess the properties of the method in finite samples, we performed simulation experiments under the hypothesis that the series contain no outlier. We consider a simple data-generating process with an AR unit root, i.e.
$$y_t = y_{t-1} + u_t.$$
Two cases are considered for the errors $u_t$; namely MA(1) processes of the form
$$u_t = v_t + \theta v_{t-1}$$
and AR(1) processes of the form
$$u_t = \rho u_{t-1} + v_t.$$
In all cases, $v_t \sim i.i.d.\ N(0, 1)$. We consider values of $\theta$ and $\rho$ in the range [-0.8, 0.8] with a step size of 0.2. Two sample sizes are used: $T = 100$ and $T = 200$. The number of replications used was 10,000 and tests at the 5% and 10% significance levels were performed.
We first consider the size of the procedures in what we label the 'one pass' case. The size is the number of times an observation is categorized as an outlier when searching for a single outlier without iterating any further for a given sample. Results are presented in Table I.

TABLE I
EXACT SIZE OF SINGLE OUTLIER DETECTION

[Scanned table: MA case and AR case; models with a constant and with a time trend; T = 100 and T = 200; 5.0% and 10.0% significance levels; parameter values from -0.80 to 0.80. The individual entries are not reliably recoverable from the scan.]

Blackwell Publishing Ltd 2003

SEARCHING FOR OUTLIERS IN TIME SERIES

series since, by differencing, we effectively work with a stationary series. The standard practice in the literature is rather ad hoc and consists in rejecting if the t-statistic on some observation is greater than a critical value chosen to be some number between 3 and 4; see, for example, Tiao (1985), Chang and Tiao (1983) and Tsay (1986), among others. Here, we shall simulate critical values assuming i.i.d. normal errors and discuss the extent to which inference is affected when the data deviate from these specifications. So, the data generating process is again
$$y_t = y_{t-1} + u_t,$$
where $u_t \sim i.i.d.\ N(0, 1)$. Two sample sizes are considered, namely $T = 100$ and $T = 200$. The number of replications used was 50,000. The percentage points of the test $\tau_d$ are presented in Table IV. To assess the size of the test in finite samples when correlation is present in the errors, we consider, as in Section 2.1, the same process defined by (7) with correlated errors. Two cases are considered for the errors $u_t$: namely MA(1) processes of the form $u_t = v_t + \theta v_{t-1}$ and AR(1) processes of the form $u_t = \rho u_{t-1} + v_t$. In all cases, $v_t \sim i.i.d.\ N(0, 1)$. We consider values of $\theta$ and $\rho$ in the range [-0.8, 0.8] with a step size of 0.4. The sample size is $T = 100$, the number of replications used was 10,000 and tests at the 5% significance level were performed. We consider the iterative procedure with up to four outliers. The results are presented in Table V.

TABLE IV
FINITE SAMPLE CRITICAL VALUES OF THE TEST $\tau_d$

Level of significance   Model 1               Model 2
                        T = 100    T = 200    T = 100    T = 200
1.0%                    4.14       4.20       4.13       4.19
2.5%                    3.87       3.95       3.85       3.94
5.0%                    3.65       3.75       3.63       3.74
10.0%                   3.44       3.56       3.42       3.55

TABLE V
EXACT SIZE OF THE TEST BASED ON $\tau_d$

[Scanned table: probability of finding a 1st, 2nd, 3rd and 4th outlier in the i.i.d. case, the MA case and the AR case (parameter values -0.80, -0.40, 0.40, 0.80). The individual entries are not reliably recoverable from the scan.]

For the i.i.d. case, Vogelsang's method has an exact size close to nominal size. For the case with negative MA errors, the test has size distortions, being liberal. These distortions are smaller when more deterministic components are included in the models. For positive MA errors, and particularly for the model that includes a time trend, the procedure is slightly undersized. A similar result is observed when there are positively correlated AR errors.
The next experiments consider the properties of the method when applied in a full iterative fashion, i.e. continuing to search for additional outliers when one is found. Here, we record the total number of observations categorized as outliers divided by the number of replications. These values can be labelled as the expected number of outliers found. If the tests have the correct size $\alpha$, say, at each step of the iterations, and the tests are independent, this number should be close to $\alpha/(1 - \alpha)$, that is 0.111 for a significance level of 10% and 0.053 for a significance level of 5%.
The results are presented in Table II. The main thing to note is that Vogelsang's procedure finds many more outliers than would be expected if the test had the correct size at each step. For example, for the model with only a constant with i.i.d. errors, $T = 100$, and a significance level of 10%, the number is 0.293 instead of 0.111, i.e. an average of 2.93 outliers for each replication which contains at least one outlier. These distortions increase when $T$ increases to 200, with a value of 0.520 instead of 0.111, which corresponds to approximately 5.2 outliers per replication with at least one outlier.
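
The benchmark values quoted above follow from a geometric series: if each step rejects with probability $\alpha$ independently, the expected number of outliers found is $\alpha + \alpha^2 + \alpha^3 + \cdots = \alpha/(1 - \alpha)$. A short check (the function name `expected_outliers` is only for illustration):

```python
def expected_outliers(alpha):
    """Expected number of flagged observations when every step has size alpha."""
    return alpha / (1.0 - alpha)          # sum of alpha**j for j = 1, 2, ...

for level in (0.10, 0.05):
    print(level, round(expected_outliers(level), 3))   # 0.111 and 0.053
```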

TABLE II
EXPECTED NUMBER OF OUTLIERS FOUND USING MULTIPLE OUTLIER DETECTION

[Scanned table: MA case and AR case; models with a constant and with a time trend; T = 100 and T = 200; 5.0% and 10.0% significance levels; parameter values from -0.80 to 0.80. The individual entries are not reliably recoverable from the scan.]

TABLE VI

[Scanned table: probability of finding the 1st, 2nd, 3rd and 4th outlier for several parameter values. The title and the individual entries are not recoverable from the scan.]

TABLE (number illegible)
SIZE OF THE ADF TEST; AR(1) ERRORS

[Scanned table: rows labelled "Without", "With" and "Total". The individual entries are not recoverable from the scan.]

G. Rodríguez

Table 3. ADF and Phillips-Perron tests. Regression includes only a constant

[Scanned table: columns Country, Sample, kmax and test statistics for Argentina (1979:01-1999:03), Bolivia (1979:01-2000:05), Chile (1970:01-2000:05) and Peru (1979:01-2000:05). The entries are not reliably recoverable from the scan.]

a, b, c, d denote significance levels at 1.0%, 2.5%, 5.0% and 10.0%, respectively.

Table 4. $M^{GLS}$, $P_T^{GLS}$ and $ADF^{GLS}$ tests. Deterministic components $z_t = \{1\}$

[Scanned table: columns Country, Sample, kmax and test statistics for the same four countries and samples. The entries are not reliably recoverable from the scan.]

a, b, c, d denote significance levels at 1.0%, 2.5%, 5.0% and 10.0%, respectively. Critical values for the $M^{GLS}$ tests were obtained from Ng and Perron (2001). Critical values for the $ADF^{GLS}$ test are equivalent to the case where no deterministic components are included in the regression, as was shown by Elliott et al. (1996).

3.3. Evidence from standard unit root tests

Here, I apply standard unit root tests to verify the existence of a unit root in the inflation time series. In the end, I also consider the application of $M^{GLS}$ tests, which are considered robust unit root tests in the presence of negative moving average serial correlation.
The lag length is selected using the sequential t-sig procedure. For Tables 3 and 4, I use $kmax = int[12(T/100)^{1/2}]$. Most of the results go in the same direction, that is, they lead to a strong rejection of the unit root hypothesis in favor of stationarity. Notice that these results are found although a large lag is used. A similar comment applies when unit root tests using GLS detrended data are applied (see Table 4).

3.4. Detecting additive outliers

A following step is to search for additive outliers and then to verify our visual
inspection from Fig. 1. Since I use monthly data, I preclude the possibility of
finding an excessive number of outliers by using a critical value at 1.0% when the
procedure based on the first difference of the data (τ_d) is used. Critical values
at 5.0% are used when I use the procedure based on different critical values at
different steps of the iterations (τ_c).
The evidence (see the fourth column in Tables 5 and 6) shows that there
are significant numbers of outliers in all inflation time series. As I mentioned
in the previous section, τ_d is more powerful and the results reveal
this fact. Both procedures are able to detect the principal outliers, namely the
observations associated directly with the dates of application of stabilization
programs to stop high inflation episodes, and adjacent observations.
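
The idea behind a search based on first differences can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact procedure of Perron and Rodríguez (2003). It assumes that an additive outlier of size delta at date s raises the first difference at s by delta and lowers the next one by delta, and it scales the estimate with the lag-0 and lag-1 autocovariances of the centred differences. The function names and the `crit` argument are hypothetical. No critical values are hard-coded; they must be taken from the published tables.

```python
import numpy as np

def tau_d_search(y):
    """One pass of a first-difference-based additive-outlier search.

    Assumption: delta_hat(s) = (dy_s - dy_{s+1}) / 2 with variance
    (R(0) - R(1)) / 2, where R(j) are autocovariances of the differences.
    Returns (date index in y, sup |t|).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                        # dy[i] = y[i+1] - y[i]
    v = dy - dy.mean()                     # centring allows for a drift
    n = len(v)
    r0 = v @ v / n                         # lag-0 autocovariance
    r1 = v[1:] @ v[:-1] / n                # lag-1 autocovariance
    delta_hat = (dy[:-1] - dy[1:]) / 2.0   # candidate dates s = 1, ..., T-2
    t_stats = delta_hat / np.sqrt((r0 - r1) / 2.0)
    i = int(np.argmax(np.abs(t_stats)))
    return i + 1, float(abs(t_stats[i]))

def flag_outlier(y, crit):
    """Return (date, statistic) if sup |t| exceeds a user-supplied value."""
    date, stat = tau_d_search(y)
    return (date, stat) if stat > crit else None

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.cumsum(rng.standard_normal(300))   # simulated random walk
    y[100] += 20.0                            # planted additive outlier
    print(tau_d_search(y))
```

In an iterative search the flagged observation would be dropped and the pass repeated until the statistic no longer exceeds the chosen critical value.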

An empirical note about additive outliers and nonstationarity


Table 5. ADF corrected for additive outliers using τ_d. Deterministic components z_t = {1}

Country    Sample           kmax  Outliers  t-stat  alpha-hat  k
Argentina  1979:01-1999:03  15    12        -0.312  0.99?      n.l.
Bolivia    1979:01-2000:05  15    30        n.l.    0.399      5
Chile      1970:01-2000:05  17    19        n.l.    0.830      13
Peru       1979:01-2000:05  15    10        -0.652  0.968      10

Additive outliers were detected using the 1.0% critical value.
[n.l. and ? mark entries or digits that are not legible in this copy; one further t-statistic, read as -1.57?, cannot be assigned to a row.]


Table 6. ADF corrected for additive outliers using τ_c. Deterministic components z_t = {1}

Country    Sample           kmax  Outliers  t-stat  alpha-hat  k
Argentina  1979:01-1999:03  15    4         0.128   1.006      1?
Bolivia    1979:01-2000:05  15    4         4.049   1.289      12
Chile      1970:01-2000:05  17    4         -7.139  0.852      11
Peru       1979:01-2000:05  15    4         0.151   1.009      13

[? marks a digit that is not legible in this copy.]

3.5. Evidence from ADF test corrected for additive outliers


A final step is the application of the ADF test using dummy variables to
incorporate additive outliers found in the fourth column of Tables 5 and 6. In
this case, I use a kmax = int[12 (T/100)^{1/4}]. Results for the procedure τ_d are
presented in Table 5. According to these results, the inflation series for
Argentina and Peru can be considered as I(1) processes. In the cases of
Bolivia and Chile, inflation can be considered as an I(0) process. Notice that a
small lag length was selected in the case of Bolivia and some doubt can remain
about this rejection. However, when I impose a kmin = 6 or a kmin = 12, the
t-statistics are -3.69 and -2.62, which are significant at the 1.0% and the
10.0% levels, respectively.
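
As a quick arithmetic check, the rule kmax = 12 (T/100)^{1/4} can be evaluated for the four monthly samples. The sketch below is illustrative only; the helper names `n_months` and `kmax_rule` are hypothetical. It prints both the truncated value and the value rounded to the nearest integer, since the two differ for samples where the raw value falls just below an integer. For these samples, rounding gives 15, 15, 17 and 15, which are the kmax entries shown in Table 5.

```python
def n_months(start, end):
    """Number of monthly observations in an inclusive 'YYYY:MM' range."""
    y0, m0 = (int(p) for p in start.split(":"))
    y1, m1 = (int(p) for p in end.split(":"))
    return (y1 - y0) * 12 + (m1 - m0) + 1

def kmax_rule(T, power=0.25, scale=12.0):
    """Raw (unrounded) maximum lag: scale * (T / 100) ** power."""
    return scale * (T / 100.0) ** power

# Sample ranges as listed in the tables.
samples = {
    "Argentina": ("1979:01", "1999:03"),
    "Bolivia": ("1979:01", "2000:05"),
    "Chile": ("1970:01", "2000:05"),
    "Peru": ("1979:01", "2000:05"),
}

if __name__ == "__main__":
    for country, (start, end) in samples.items():
        T = n_months(start, end)
        raw = kmax_rule(T)
        # int() truncates; round() goes to the nearest integer.
        print(country, T, round(raw, 2), int(raw), round(raw))
```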
Results using the procedure τ_c are presented in Table 6. I obtain similar
results as in Table 5 except for Bolivia, which now shows evidence of an
explosive root. Although the result is different from Table 5, I can reject the
unit root hypothesis, which is important here.
One comment about the number of outliers is of note here. Unfortunately, I
was able to use only four critical values using the procedure τ_c, which
correspond to the first, second, third and fourth steps in the full iterative procedure
to search for additive outliers. The reason for this limited number of critical
values is the fact that each time I need critical values very far in the tail (see
Perron and Rodríguez 2003). A possible ad hoc suggestion is to use the critical
value corresponding to the fourth step for all subsequent steps. I have done this,
iterating until 36 steps. In all cases (not reported here), I find 36 outliers and no
different results were found with respect to those shown in Table 6.
In another approach, Lucas (1995a,b) proposes the use of robust estimates
for the presence of outliers. It is not a sequential procedure, which may be an
advantage in some cases where identification of the dates of the outliers is
not relevant. However, in most cases, researchers need to know the dates of
the outliers. This is our case because, among other reasons, dates of the
outliers allow us to confirm visual analysis, and from a macroeconomic
perspective they allow us to identify which phenomena are related to these
dates.


Further Issues and/or Limitations of this Survey
Change in persistence (Kim, 2000; Kim, Belaire-Franch and Badillo,
2002; Busetti and Taylor, 2004; Harvey, Leybourne and Taylor, 2006;
Leybourne, Kim and Taylor, 2007)
Fractional Integration (Geweke and Porter-Hudak, 1983; Sowell, 1992;
Chung and Baillie, 1993; Beran, 1995; Tanaka, 1999; Robinson, 2005)
Double Unit Roots (Hasza and Fuller, 1982; Haldrup, 1999; Juselius,
1999; Haldrup and Lildholdt, 2002)
Seasonal Unit Roots (HEGY, 1990; Franses, 1996)
Moving Average Unit Roots (KPSS, 1992; Jansson, 2004)
Other Issues. Example: testing convergence in growth in the presence of
unit roots (Vogelsang, 1998; Perron and Yabu, 2006)
