
Econometric Theory, 2012, Page 1 of 35.

doi:10.1017/S0266466612000357

WALD TESTS FOR DETECTING
MULTIPLE STRUCTURAL
CHANGES IN PERSISTENCE
MOHITOSH KEJRIWAL
Purdue University

PIERRE PERRON

Boston University

JING ZHOU

Orient Securities Company Limited

This paper considers the problem of testing for multiple structural changes in the persistence of a univariate time series. We propose sup-Wald tests of the null hypothesis
that the process has an autoregressive unit root throughout the sample against the
alternative hypothesis that the process alternates between stationary and unit root
regimes. We derive the limit distributions of the tests under the null and establish
their consistency under the relevant alternatives. We further show that the tests are
inconsistent when directed against the incorrect alternative, thereby enabling identification of the nature of persistence in the initial regime. We also propose hybrid
testing procedures that allow ruling out of stable stationary processes or ones that
are subject to only stationary changes under the null, thereby aiding the researcher
in interpreting a rejection as emanating from a switch between a unit root and stationary regime. The computation of the test statistics as well as asymptotic critical
values is facilitated by the dynamic programming algorithm proposed in Perron and
Qu (2006, Journal of Econometrics 134, 373–399) which allows imposing within- and cross-regime restrictions on the parameters. Finally, we present Monte Carlo
evidence to show that the proposed procedures perform well in finite samples relative to those available in the literature.

1. INTRODUCTION
Issues related to the detection and estimation of structural change in time series
models have received a great deal of attention in both the statistics and econometrics literature (see Perron, 2006, for a survey). Substantial advances have been
Perron acknowledges financial support for this work from the National Science Foundation under Grant
SES-0649350. The authors are grateful to Robert Taylor (the co-editor) and two anonymous referees for useful comments and suggestions that helped improve the paper. Address correspondence to Mohitosh Kejriwal,
Krannert School of Management, Purdue University, 403 West State Street, West Lafayette IN 47907 USA; e-mail:
mkejriwa@purdue.edu.
© Cambridge University Press 2012

made to cover models at a level of generality that allows a host of interesting
empirical applications. These include models with general stationary regressors,
models with trending variables and possible unit roots, cointegrated models, and
long memory processes, among others. Also of interest is the interplay between
structural changes and unit roots (Perron, 1989). The literature on testing for a
change in the persistence of a time series is less extensive and relatively recent. If
such a change preserves the stationarity properties of the series in the respective
regimes, methods developed in the context of stationary data can still be applied
(see Andrews, 1993; Bai and Perron, 1998; 2003). In many cases, however, a
process may switch from one with an autoregressive unit root [I (1)] to a stationary one [I (0)] or vice versa. This has been an issue of substantial empirical
interest, especially concerning inflation rate series (e.g., Barsky, 1987; Burdekin
and Siklos, 1999), short-term interest rates (e.g., Mankiw, Miron, and Wei, 1987),
government budget deficits (e.g., Hakkio and Rush, 1991), and real output (e.g.,
Delong and Summers, 1988). Kim (2003) shows that standard unit root tests are
not consistent against processes displaying a shift from stationarity to nonstationarity and vice versa. Hence, separate methods are needed to distinguish between a
process with stable persistence and one that undergoes a shift in persistence over
a given period.
Kim (2000), Busetti and Taylor (2004), and Taylor (2005) consider testing the
null hypothesis that the series is I (0) throughout the sample versus the alternative
that it switches from I (0) to I (1) and vice versa. Harvey, Leybourne, and Taylor
(2006) propose test statistics that allow the process to be I (1) or I (0) throughout under the null. The tests are based on partial sums of residuals obtained by
regressing the data on a constant or a constant and time trend. Leybourne, Kim,
Smith, and Newbold (2003) consider testing the null hypothesis of a stable unit
root process versus the same alternatives based on the minimal value of the locally
generalized least squares (GLS) detrended augmented Dickey-Fuller (ADF) unit
root statistic developed in Elliott, Rothenberg, and Stock (1996) over subsamples
of the data. They propose different test statistics depending on whether the initial
regime is I (1) or I (0). When the direction of the change is unknown, they consider the minimal value of the pair of statistics for each case. Kurozumi (2005)
suggests an alternative testing procedure based on the Lagrange multiplier (LM)
principle, while Leybourne, Kim, and Taylor (2007a) develop tests of the unit root
null based on standardized cumulative sums of squared subsample residuals that
do not spuriously reject when the series is a constant I (0) process. Chong (2001)
studies the asymptotic properties of the estimated parameters in the first-order
autoregressive model with a single break in persistence.
The above tests are designed to detect a single change in persistence and do not
allow for multiple changes. Single break tests usually have low power in detecting processes that display multiple shifts in persistence. It is thus useful to develop
tests that are valid in the presence of multiple structural changes. In a recent paper,
Leybourne, Kim, and Taylor (2007b) develop tests of the unit root null hypothesis based on doubly recursive sequences of ADF-type unit root statistics and
associated breakpoint estimators. Their proposed procedure can accommodate
processes that exhibit multiple changes in persistence and is valid regardless
of the direction of change(s). In particular, they demonstrate the consistency of
their tests against such alternatives and show that their procedure can be used to
consistently partition the data into its separate I (0) and I (1) regimes. Kang, Kim,
and Morley (2009) consider an alternative approach to analyzing multiple regime
shifts in U.S. inflation persistence based on an unobserved components model
with Markov-switching parameters.
As is evident from this brief review, most tests for changes in persistence are
based on either partial sums of the (demeaned or detrended) data or on unit root
statistics applied to various data subsamples. In contrast, this paper proposes
sup-Wald tests of the null hypothesis that the process is I (1) against the alternative hypothesis that the process alternates between stationary and I (1) regimes.
The tests are based on the difference between the sum of squared residuals
from the unit root model and those from a model that allows shifts in persistence between stationary and nonstationary regimes. We consider tests for both
single and multiple changes in persistence. The limit distributions of the tests
are derived under the null, and their consistency is established under the relevant alternatives. We further show that the tests are inconsistent when directed
against the incorrect alternative, thereby allowing the researcher to identify the
nature of persistence in the initial regime. We also propose hybrid testing procedures that allow ruling out of stable stationary processes or ones that are
subject to only stationary changes under the null, thereby aiding the researcher
in interpreting a rejection as emanating from a switch between a unit root and
stationary regime. We further discuss how our tests can be used to distinguish
between persistence breaks and pure level or trend breaks. The computation of
the test statistics as well as asymptotic critical values is facilitated by the dynamic programming algorithm proposed in Perron and Qu (2006), which allows
the minimization of the sum of squared residuals under the alternative hypothesis while imposing within- and cross-regime restrictions on the parameters. We
also propose estimators for the break dates that can be employed once evidence
against a stable persistence parameter is obtained. The performance of the proposed test statistics in small samples is evaluated via an extensive Monte Carlo
study.
The paper is organized as follows. Section 2 presents the models, the test statistics, and issues related to the computation of the statistics. Section 3 details the
asymptotic properties of the test statistics under the null and alternative hypotheses. Section 4 proposes hybrid testing procedures that allow ruling out processes
that are constant I (0) or ones that are subject to only I (0) changes under the null.
Section 5 suggests estimators for the locations of the break points that can be applied following evidence against the null hypothesis. Monte Carlo simulations are
presented in Section 6 to assess the adequacy of the asymptotic approximations
in finite samples. Recommendations for applied work are also included. Section 7
concludes. All technical derivations are in the Appendix.

2. THE MODELS AND TEST STATISTICS
Consider a scalar random variable $y_t$ generated by

$$y_t = c_i + \alpha_i y_{t-1} + u_{it} \qquad (1)$$

for $t \in [T_{i-1}+1, T_i]$, $i = 1, \ldots, m+1$, with the convention that $T_0 = 0$ and $T_{m+1} = T$, where $T$ is the sample size. Hence, we have $m$ breaks and $m+1$ regimes that increase in length in the same proportion as $T$ increases. The vector of break fractions is $\lambda = (\lambda_1, \ldots, \lambda_m)$ with $\lambda_i = T_i/T$ for $i = 1, \ldots, m$. We make the following assumptions regarding the innovation process $\{v_{it}\}$ and $u_{it}$ for $i = 1, \ldots, m+1$.

Assumption A1. The process $\{v_{it}\}$ is a martingale difference sequence with $E(v_{it}^2 \mid v_{it-1}, \ldots) = \sigma^2$, $E(|v_{it}|^r \mid v_{it-1}, \ldots) = \kappa_{ir}$ $(r = 3, 4)$, and $\sup_t E(|v_{it}|^{4+\beta} \mid v_{it-1}, \ldots) = \kappa_i < \infty$ for some $\beta > 0$.

Assumption A2. The errors $\{u_{it}\}$ are generated by the stationary linear process

$$u_{it} = d_i(L) v_{it}, \qquad d_i(L) = \sum_{s=0}^{\infty} d_{is} L^s, \qquad (2)$$

where $\sum_{s=1}^{\infty} s|d_{is}| < \infty$. All roots of $d_i(L)$ are outside the unit circle.

Also, $\alpha_i$ should be understood as standing for the sum of the coefficients in the autoregressive representation for $y_t$ in regime $i$. We consider the following two models depending on whether the initial regime contains a unit root or not:

Model 1a: $c_i = 0$, $\alpha_i = 1$ in odd regimes and $|\alpha_i| < 1$ in even regimes.
Model 1b: $c_i = 0$, $\alpha_i = 1$ in even regimes and $|\alpha_i| < 1$ in odd regimes.

In Model 1a, the process alternates between a unit root and a stationary process with a unit root in the first regime. Model 1b is similar except that the first regime is stationary. To allow for the possibility of trending data, we also consider the process

$$y_t = c_i + b_i t + \alpha_i y_{t-1} + u_{it}.$$

The corresponding models are:

Model 2a: $\alpha_i = 1$, $b_i = 0$ in odd regimes and $|\alpha_i| < 1$ in even regimes.
Model 2b: $\alpha_i = 1$, $b_i = 0$ in even regimes and $|\alpha_i| < 1$ in odd regimes.

We are interested in testing the null hypothesis that $y_t$ is I(1) throughout the sample. For Models 1a and 1b, this implies $H_0: c_i = 0$, $\alpha_i = 1$ for all $i$. For Models 2a and 2b, the null hypothesis is $H_0: c_i = c$, $b_i = 0$, $\alpha_i = 1$ for all $i$. In this case, the data generating process (DGP) is denoted by

$$y_t = c + y_{t-1} + u_t, \qquad (3)$$

where $u_t = d(L) v_t$, $d(L) = \sum_{s=0}^{\infty} d_s L^s$ with $v_t$ and $d(L)$ satisfying Assumptions A1–A2 and $\sum_{s=1}^{\infty} s|d_s| < \infty$.
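To make the regime structure concrete, the following is a minimal sketch of a draw from Model 1a. It assumes i.i.d. N(0, 1) errors (a special case of Assumptions A1–A2), a common autoregressive coefficient in all stationary regimes, and break dates obtained by rounding $\lambda_i T$; the function name and its parameters are illustrative and do not come from the paper.

```python
import numpy as np

def simulate_model_1a(T, break_fractions, alpha_stationary=0.5,
                      c_stationary=0.0, seed=0):
    """Simulate y_t = c_i + alpha_i * y_{t-1} + u_t under Model 1a.

    Odd regimes are I(1) (c_i = 0, alpha_i = 1); even regimes are I(0).
    All names and defaults here are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)
    # Break dates T_i = lambda_i * T, with the convention T_0 = 0, T_{m+1} = T.
    dates = [0] + [int(round(lam * T)) for lam in break_fractions] + [T]
    u = rng.standard_normal(T)       # i.i.d. N(0, 1) errors (assumption)
    y = np.zeros(T + 1)              # y[0] holds the initial condition y_0 = 0
    regime = np.zeros(T, dtype=int)  # regime index of each observation
    for i in range(1, len(dates)):   # regimes i = 1, ..., m + 1
        unit_root = (i % 2 == 1)     # Model 1a: unit root in odd regimes
        alpha = 1.0 if unit_root else alpha_stationary
        c = 0.0 if unit_root else c_stationary
        for t in range(dates[i - 1] + 1, dates[i] + 1):
            y[t] = c + alpha * y[t - 1] + u[t - 1]
            regime[t - 1] = i
    return y[1:], regime, u

# One break at lambda_1 = 0.5: a random walk followed by a stationary AR(1).
y, regime, u = simulate_model_1a(T=200, break_fractions=[0.5])
```

Swapping the parity test (`i % 2 == 0`) would give the analogous sketch for Model 1b, where the first regime is stationary.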

It is important to note that under the alternative hypothesis the process generating the data is such that all parameters are allowed to change across regimes, i.e., level shifts and changes in the slope of the trend are allowed, as well as changes in the dynamics and the variance of the errors. We, however, shall not construct test statistics that exploit the possible changes in the dynamics or the variance of the errors. This is because we wish to direct the test against potential changes in the I(0)/I(1) nature of the process to ensure the highest power possible. A joint test on all parameters would not be particularly informative given the difficulty in interpreting a rejection. Also, allowing for breaks in dynamics under the null would lead to limit distributions that depend on the (unknown) number and location of these breaks, thereby making asymptotic inference difficult. As shown in Section 6, our test does not have much power against pure changes in short-run dynamics but is powerful when there is a change in both persistence and these dynamics. We nevertheless allow for concurrent changes in level and slope of the trend function, since these often occur simultaneously with a change in persistence and can allow tests with higher power.

We study two types of tests in this section. First, we consider the Wald test that applies when the alternative involves a fixed value $m = k$ of changes. We first consider the test statistics for nontrending data, i.e., those based on Models 1a and 1b. Given the fact that the process has an autoregressive representation that can be approximated by an AR($l_T$) for some sequence $l_T$ increasing with the sample size, the starting point is to consider the regression

$$\Delta y_t = c_i + (\alpha_i - 1) y_{t-1} + \sum_{j=1}^{l_T} \pi_j \Delta y_{t-j} + v_t^{*}. \qquad (4)$$

In accordance with the discussion above, the coefficients $\pi_j$ pertaining to the dynamics are not allowed to change across regimes. Hence, the tests are based on the constrained and unconstrained sum of squared residuals, which follows a least-squares approach that does not exploit potential changes in the variance of the errors. For Models 1a–1b, the test is defined as

$$F_{1a}(\lambda, k) = \begin{cases} (T - k - l_T)(SSR_0 - SSR_{1a,k})/[k\, SSR_{1a,k}] & \text{if } k \text{ is even}, \\ (T - k - 1 - l_T)(SSR_0 - SSR_{1a,k})/[(k+1)\, SSR_{1a,k}] & \text{if } k \text{ is odd}, \end{cases} \qquad (5)$$

$$F_{1b}(\lambda, k) = \begin{cases} (T - k - 2 - l_T)(SSR_0 - SSR_{1b,k})/[(k+2)\, SSR_{1b,k}] & \text{if } k \text{ is even}, \\ (T - k - 1 - l_T)(SSR_0 - SSR_{1b,k})/[(k+1)\, SSR_{1b,k}] & \text{if } k \text{ is odd}. \end{cases} \qquad (6)$$

In (5) and (6), $SSR_0$ denotes the sum of squared residuals under the null hypothesis, i.e., that obtained from ordinary least squares (OLS) estimation of (4) subject to the restrictions $c_i = 0$, $\alpha_i = 1$ for all $i$. Also, $SSR_{1a,k}$ denotes the sum of squared residuals obtained from estimating (4) under the restrictions imposed by
Model 1a. Similarly, $SSR_{1b,k}$ denotes the sum of squared residuals obtained from estimating (4) under the restrictions imposed by Model 1b.

For some arbitrary small positive number $\epsilon$, we define the set $\Lambda_\epsilon^k = \{\lambda : |\lambda_{i+1} - \lambda_i| \geq \epsilon,\ \lambda_1 \geq \epsilon,\ \lambda_k \leq 1 - \epsilon\}$. The sup-Wald tests are then defined as $\sup F_{1a}(k) = \sup_{\lambda \in \Lambda_\epsilon^k} F_{1a}(\lambda, k)$ and $\sup F_{1b}(k) = \sup_{\lambda \in \Lambda_\epsilon^k} F_{1b}(\lambda, k)$.

The second type of test is based on the presumption that the nature of persistence in the first regime is unknown, i.e., we do not have any a priori knowledge regarding whether the first regime contains a unit root or not. The tests are given by

$$W_1(k) = \max\Big[\sup_{\lambda \in \Lambda_\epsilon^k} F_{1a}(\lambda, k),\ \sup_{\lambda \in \Lambda_\epsilon^k} F_{1b}(\lambda, k)\Big].$$

Finally, in order to accommodate the case with an unknown number of breaks, up to some maximal value $A$, we consider the statistic $W\max_1 = \max_{1 \leq m \leq A} W_1(m)$.

For Models 2a and 2b, regression (4) is replaced by

$$\Delta y_t = c_i + b_i t + (\alpha_i - 1) y_{t-1} + \sum_{j=1}^{l_T} \pi_j \Delta y_{t-j} + v_t^{*}. \qquad (7)$$

The Wald tests are defined as

$$F_{2a}(\lambda, k) = \begin{cases} (T - 2k - 1 - l_T)(SSR_0^{*} - SSR_{2a,k})/[(2k)\, SSR_{2a,k}] & \text{if } k \text{ is even}, \\ (T - 2k - 2 - l_T)(SSR_0^{*} - SSR_{2a,k})/[(2k+1)\, SSR_{2a,k}] & \text{if } k \text{ is odd}, \end{cases} \qquad (8)$$

$$F_{2b}(\lambda, k) = \begin{cases} (T - 2k - 3 - l_T)(SSR_0^{*} - SSR_{2b,k})/[(2k+2)\, SSR_{2b,k}] & \text{if } k \text{ is even}, \\ (T - 2k - 2 - l_T)(SSR_0^{*} - SSR_{2b,k})/[(2k+1)\, SSR_{2b,k}] & \text{if } k \text{ is odd}. \end{cases} \qquad (9)$$

In (8) and (9), $SSR_0^{*}$ denotes the sum of squared residuals under the null hypothesis, i.e., the sum of squared residuals obtained estimating (7) subject to the restrictions $c_i = c$, $b_i = 0$, $\alpha_i = 1$ for all $i$. Given these tests, the remaining statistics are defined in the same way as for Models 1a and 1b. These are denoted $\sup F_{2a}(k)$, $\sup F_{2b}(k)$, $W_2(k)$, and $W\max_2$.

Note that to ensure that the Wald tests are nonnegative, the same number of lags of the first differences of the dependent variable must be used when estimating the models under the null and alternative hypotheses, another reason not to model the changes in the dynamics. To compute the sup-Wald test for any particular model, we need to minimize the global sum of squared residuals over the set of permissible break fractions $\Lambda_\epsilon^k$ subject to the restrictions implied by the model. This is accomplished employing the dynamic programming algorithm of Perron and Qu (2006).

3. ASYMPTOTIC RESULTS
We now consider the limiting properties of the proposed statistics. In 3.1 we present the asymptotic distributions of the tests under the null hypothesis that the process is I(1) throughout the sample. The computation of the asymptotic critical
values is discussed in 3.2, and in 3.3 we demonstrate the consistency of the tests under the relevant alternative hypotheses.

3.1. The Null Limiting Distributions
Let $W(\cdot)$ denote a standard Brownian motion on $[0, 1]$. Also, let $\tilde{W}^{(j)}(r)$ and $\hat{W}^{(j)}(r)$ represent demeaned and detrended Brownian motions, respectively, over $r \in (\lambda_{j-1}, \lambda_j)$ (see the Appendix for detailed expressions). We start with the case where there is no serial correlation and subsequently show that all limit results are valid for the general case. The following theorem states the limit distributions of the tests under the null hypothesis of a unit root.

THEOREM 1. Suppose that the data are generated by (3) with $u_t = v_t$, where $v_t$ satisfies Assumption A1. Suppose also that the test statistics are constructed based on autoregressions that do not include the lags of first differences of $y_t$. Then under the null hypothesis $H_0: c_i = 0$, $\alpha_i = 1$ for all $i$, if $k$ is even, we have

$$F_{1a}(\lambda, k) \Rightarrow \frac{1}{k} \sum_{i=1}^{k/2} \left[ \frac{\left( \int_{\lambda_{2i-1}}^{\lambda_{2i}} \tilde{W}^{(2i)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} [\tilde{W}^{(2i)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i}) - W(\lambda_{2i-1})\}^2}{\lambda_{2i} - \lambda_{2i-1}} \right],$$

$$F_{1b}(\lambda, k) \Rightarrow \frac{1}{k+2} \sum_{i=0}^{k/2} \left[ \frac{\left( \int_{\lambda_{2i}}^{\lambda_{2i+1}} \tilde{W}^{(2i+1)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i}}^{\lambda_{2i+1}} [\tilde{W}^{(2i+1)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i+1}) - W(\lambda_{2i})\}^2}{\lambda_{2i+1} - \lambda_{2i}} \right].$$

If $k$ is odd, we have

$$F_{1a}(\lambda, k) \Rightarrow \frac{1}{k+1} \sum_{i=1}^{(k+1)/2} \left[ \frac{\left( \int_{\lambda_{2i-1}}^{\lambda_{2i}} \tilde{W}^{(2i)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} [\tilde{W}^{(2i)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i}) - W(\lambda_{2i-1})\}^2}{\lambda_{2i} - \lambda_{2i-1}} \right],$$

$$F_{1b}(\lambda, k) \Rightarrow \frac{1}{k+1} \sum_{i=0}^{(k-1)/2} \left[ \frac{\left( \int_{\lambda_{2i}}^{\lambda_{2i+1}} \tilde{W}^{(2i+1)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i}}^{\lambda_{2i+1}} [\tilde{W}^{(2i+1)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i+1}) - W(\lambda_{2i})\}^2}{\lambda_{2i+1} - \lambda_{2i}} \right].$$
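For a single break ($k = 1$, so that only the second regime, $(\lambda_1, 1)$, is stationary under Model 1a), the limit of $F_{1a}(\lambda, 1)$ in Theorem 1 has one Dickey-Fuller-type term and one squared-increment term, each weighted by $1/2$. The sketch below approximates one draw of that functional at a fixed $\lambda_1$ by replacing $W$ with scaled partial sums of i.i.d. N(0, 1) variables. It assumes that the demeaned process is $W$ minus its average over the regime and that stochastic integrals are approximated by left-endpoint sums; the function name and grid size are illustrative choices, not part of the paper.

```python
import numpy as np

def f1a_limit_draw(lam1, n_steps=500, rng=None):
    """One approximate draw of the k = 1 limit of F_1a at break fraction lam1."""
    rng = np.random.default_rng() if rng is None else rng
    dW = rng.standard_normal(n_steps) / np.sqrt(n_steps)  # Brownian increments
    W = np.concatenate(([0.0], np.cumsum(dW)))            # W on the grid 0, 1/n, ..., 1
    start = int(round(lam1 * n_steps))                    # grid index of lambda_1
    seg_W = W[start:-1]                                   # W at left endpoints of (lambda_1, 1)
    seg_dW = dW[start:]                                   # increments over (lambda_1, 1)
    length = (n_steps - start) / n_steps                  # regime length 1 - lambda_1 on the grid
    W_dm = seg_W - seg_W.mean()                           # demeaned within the regime (assumption)
    # (int W~ dW)^2 / int W~^2 dr, both integrals as Riemann/Ito sums.
    df_term = (W_dm @ seg_dW) ** 2 / (W_dm @ W_dm / n_steps)
    # {W(1) - W(lambda_1)}^2 / (1 - lambda_1)
    level_term = (W[-1] - W[start]) ** 2 / length
    return 0.5 * (df_term + level_term)                   # weight 1/(k + 1) with k = 1

draw = f1a_limit_draw(0.5, rng=np.random.default_rng(0))
```

Repeating such draws many times at the break fraction that maximizes the finite-sample statistic gives the kind of simulated quantiles that nonstandard limit distributions like these require.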

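Under the null hypothesis $H_0: c_i = c$, $b_i = 0$, $\alpha_i = 1$ for all $i$, if $k$ is even, we have

$$F_{2a}(\lambda, k) \Rightarrow \frac{1}{2k} \left[ -\{W(1)\}^2 + \sum_{i=0}^{k/2} \frac{\{W(\lambda_{2i+1}) - W(\lambda_{2i})\}^2}{\lambda_{2i+1} - \lambda_{2i}} + \sum_{i=1}^{k/2} \left[ \frac{\left( \int_{\lambda_{2i-1}}^{\lambda_{2i}} \hat{W}^{(2i)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} [\hat{W}^{(2i)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i}) - W(\lambda_{2i-1})\}^2}{\lambda_{2i} - \lambda_{2i-1}} + \frac{\left( \int_{\lambda_{2i-1}}^{\lambda_{2i}} \left( r - (\lambda_{2i} - \lambda_{2i-1})^{-1} \int_{\lambda_{2i-1}}^{\lambda_{2i}} s\, ds \right) dW(r) \right)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} \left( r - (\lambda_{2i} - \lambda_{2i-1})^{-1} \int_{\lambda_{2i-1}}^{\lambda_{2i}} s\, ds \right)^2 dr} \right] \right],$$

$$F_{2b}(\lambda, k) \Rightarrow (2k+2)^{-1} \left[ -\{W(1)\}^2 + \sum_{i=1}^{k/2} \frac{\{W(\lambda_{2i}) - W(\lambda_{2i-1})\}^2}{\lambda_{2i} - \lambda_{2i-1}} + \sum_{i=0}^{k/2} \left[ \frac{\left( \int_{\lambda_{2i}}^{\lambda_{2i+1}} \hat{W}^{(2i+1)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i}}^{\lambda_{2i+1}} [\hat{W}^{(2i+1)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i+1}) - W(\lambda_{2i})\}^2}{\lambda_{2i+1} - \lambda_{2i}} + \frac{\left( \int_{\lambda_{2i}}^{\lambda_{2i+1}} \left( r - (\lambda_{2i+1} - \lambda_{2i})^{-1} \int_{\lambda_{2i}}^{\lambda_{2i+1}} s\, ds \right) dW(r) \right)^2}{\int_{\lambda_{2i}}^{\lambda_{2i+1}} \left( r - (\lambda_{2i+1} - \lambda_{2i})^{-1} \int_{\lambda_{2i}}^{\lambda_{2i+1}} s\, ds \right)^2 dr} \right] \right].$$

If $k$ is odd, we have

$$F_{2a}(\lambda, k) \Rightarrow \frac{1}{2k+1} \left[ -\{W(1)\}^2 + \sum_{i=0}^{(k-1)/2} \frac{\{W(\lambda_{2i+1}) - W(\lambda_{2i})\}^2}{\lambda_{2i+1} - \lambda_{2i}} + \sum_{i=1}^{(k+1)/2} \left[ \frac{\left( \int_{\lambda_{2i-1}}^{\lambda_{2i}} \hat{W}^{(2i)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} [\hat{W}^{(2i)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i}) - W(\lambda_{2i-1})\}^2}{\lambda_{2i} - \lambda_{2i-1}} + \frac{\left( \int_{\lambda_{2i-1}}^{\lambda_{2i}} \left( r - (\lambda_{2i} - \lambda_{2i-1})^{-1} \int_{\lambda_{2i-1}}^{\lambda_{2i}} s\, ds \right) dW(r) \right)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} \left( r - (\lambda_{2i} - \lambda_{2i-1})^{-1} \int_{\lambda_{2i-1}}^{\lambda_{2i}} s\, ds \right)^2 dr} \right] \right],$$

$$F_{2b}(\lambda, k) \Rightarrow \frac{1}{2k+1} \left[ -\{W(1)\}^2 + \sum_{i=1}^{(k+1)/2} \frac{\{W(\lambda_{2i}) - W(\lambda_{2i-1})\}^2}{\lambda_{2i} - \lambda_{2i-1}} + \sum_{i=0}^{(k-1)/2} \left[ \frac{\left( \int_{\lambda_{2i}}^{\lambda_{2i+1}} \hat{W}^{(2i+1)}(r)\, dW(r) \right)^2}{\int_{\lambda_{2i}}^{\lambda_{2i+1}} [\hat{W}^{(2i+1)}(r)]^2\, dr} + \frac{\{W(\lambda_{2i+1}) - W(\lambda_{2i})\}^2}{\lambda_{2i+1} - \lambda_{2i}} + \frac{\left( \int_{\lambda_{2i}}^{\lambda_{2i+1}} \left( r - (\lambda_{2i+1} - \lambda_{2i})^{-1} \int_{\lambda_{2i}}^{\lambda_{2i+1}} s\, ds \right) dW(r) \right)^2}{\int_{\lambda_{2i}}^{\lambda_{2i+1}} \left( r - (\lambda_{2i+1} - \lambda_{2i})^{-1} \int_{\lambda_{2i}}^{\lambda_{2i+1}} s\, ds \right)^2 dr} \right] \right].$$

Theorem 1 shows that for all models, the limit distributions of the Wald tests based on a given vector of break fractions $(\lambda_1, \ldots, \lambda_k)$ are pivotal and depend only on functionals of a Wiener process. The limit distributions are different depending on whether the alternative hypothesis specifies that the initial regime has a unit root or is stationary, and are also different for the trending and nontrending cases. The form of the distributions varies according to whether the number of breaks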
under the alternative hypothesis is even or odd. With these theoretical results, we can obtain the limit distributions of the proposed tests as a direct consequence of the continuous mapping theorem.

COROLLARY 1. Denote the limit distribution of the test $F_j(\lambda, k)$ by $F_j^{*}(\lambda, k)$, $j = 1a, 1b, 2a, 2b$. Then, under the same null hypothesis as in Theorem 1, we have
(a) $\sup_{\lambda \in \Lambda_\epsilon^k} F_j(\lambda, k) \Rightarrow \sup_{\lambda \in \Lambda_\epsilon^k} F_j^{*}(\lambda, k)$;
(b) $W_1(k) \Rightarrow \max[\sup_{\lambda \in \Lambda_\epsilon^k} F_{1a}^{*}(\lambda, k), \sup_{\lambda \in \Lambda_\epsilon^k} F_{1b}^{*}(\lambda, k)]$, $W_2(k) \Rightarrow \max[\sup_{\lambda \in \Lambda_\epsilon^k} F_{2a}^{*}(\lambda, k), \sup_{\lambda \in \Lambda_\epsilon^k} F_{2b}^{*}(\lambda, k)]$;
(c) $W\max_1 \Rightarrow \max_{1 \leq m \leq A} [\max[\sup_{\lambda \in \Lambda_\epsilon^m} F_{1a}^{*}(\lambda, m), \sup_{\lambda \in \Lambda_\epsilon^m} F_{1b}^{*}(\lambda, m)]]$, $W\max_2 \Rightarrow \max_{1 \leq m \leq A} [\max[\sup_{\lambda \in \Lambda_\epsilon^m} F_{2a}^{*}(\lambda, m), \sup_{\lambda \in \Lambda_\epsilon^m} F_{2b}^{*}(\lambda, m)]]$.

We now show that the results of Theorem 1 and Corollary 1 remain valid when $u_t$ follows the general linear process (2) with the following assumption about the lag length $l_T$.

Assumption A3. As $T \to \infty$, the lag length $l_T$ is assumed to satisfy (a) (upper bound condition) $l_T^2/T \to 0$ and (b) (lower bound condition) $l_T \sum_{j > l_T} \pi_j \to 0$.

Note that the lower bound condition allows for a logarithmic rate of increase for $l_T$, thereby allowing the use of data-dependent rules such as information criteria to select the lag length (see Ng and Perron, 1995). We now state the result for the general case.

THEOREM 2. Under Assumptions A1–A3 and the null hypotheses considered in Theorem 1, the test statistics have the same limit distributions as those stated in Theorem 1 and Corollary 1.

3.2. Asymptotic Critical Values
Given the nonstandard nature of the limit distributions, the critical values are obtained by Monte Carlo simulations. First, we generate a sample of $T = 500$ observations from a random walk with i.i.d. $N(0, 1)$ errors. We then apply the algorithm to obtain the minimized sum of squared residuals and the corresponding vector of break fractions subject to the relevant restrictions. Here again we use Perron and Qu's (2006) dynamic programming algorithm. Next, we simulate a Wiener process using the partial sums of 500 i.i.d. $N(0, 1)$ random variables. Finally, we evaluate the expressions appearing in the limit distributions at the vector of break fractions obtained earlier. This procedure is repeated 5,000 times to obtain the required quantiles of the limit distributions. The maximum number of breaks considered is 5. Asymptotic critical values are provided in Table 1 with the level of trimming set at $\epsilon = 0.15$. Panel A provides critical values for the nontrending case, while those for the trending case are presented in Panel B.

TABLE 1. Asymptotic critical values
(A) Nontrending case: $\sup F_{1a}(\lambda, k)$, $\sup F_{1b}(\lambda, k)$, and $W_1(k)$ for number of breaks $k = 1, \ldots, 5$, and $W\max_1$, at the 10%, 5%, 2.5%, and 1% levels.
(B) Trending case: $\sup F_{2a}(\lambda, k)$, $\sup F_{2b}(\lambda, k)$, and $W_2(k)$ for number of breaks $k = 1, \ldots, 5$, and $W\max_2$, at the 10%, 5%, 2.5%, and 1% levels.

The critical values for Models 1a and 2a are larger than those for Models 1b and 2b. Note also that the critical values are not monotonically decreasing as $k$ increases. This is due to the fact that the limit
distributions are different for the cases with $k$ even or odd. For even or odd values they are, in general, monotonically decreasing as expected.

3.3. Consistency
We now study the properties of the tests under the alternative hypothesis of an unstable persistence parameter. In particular, we demonstrate that in the presence of shifts in persistence of the form considered in this paper, the tests that do not require any information regarding the direction of change are consistent regardless of whether the initial regime is I(1) or I(0). We further show that tests that are directed against alternatives in which the initial regime is I(1) [I(0)] are inconsistent when the data are generated by alternatives in which the initial regime is I(0) [I(1)]. We make the following assumptions.

Assumption A3′. As $T \to \infty$, the lag length $l_T$ is assumed to satisfy (a) (upper bound condition) $l_T^6/T \to 0$ and (b) (lower bound condition) $l_T \sum_{j > l_T} \pi_j \to 0$.

Assumption A4. The true vector of break fractions, denoted $\lambda^0 = (\lambda_1^0, \ldots, \lambda_m^0)$, is assumed to belong to the set of permissible break fractions, i.e., $\lambda^0 \in \Lambda_\epsilon^m$.

Assumption A3′ strengthens the upper bound condition in Assumption A3 to account for the fact that a subset of the regressors in the I(0) regimes (those corresponding to the lagged first differences) is over-differenced. Assumption A4 is not very restrictive given that in practice, $\epsilon$ can be chosen to be small. We can then state the following theorem regarding the consistency of the tests under the relevant alternative hypotheses given by Model (2), which allow for changes in the I(1)/I(0) nature of the data as well as changes in the trend function, the dynamics of the process, and the variance of the errors. Note, in particular, that under the alternative the dynamics of the process and the variance of the errors are allowed to change along with the level and/or slope of the trend function and the I(0)/I(1) nature of the process.

THEOREM 3. Suppose that the data are generated under the alternative hypothesis represented by Model $j$ ($j = 1a, 1b, 2a$, or $2b$) with $m$ breaks in persistence. Then, under Assumptions A1–A2, A3′, and A4, (a) the tests $\sup_{\lambda \in \Lambda_\epsilon^m} F_j(\lambda, m)$ and $W\max_1$ are consistent, (b) if the data are generated by Models 1a or 1b, the tests $W_1(m)$ and $W\max_1$ are consistent, while if the data are generated by Models 2a or 2b, the tests $W_2(m)$ and $W\max_2$ are consistent, and (c) the test $\sup_{\lambda \in \Lambda_\epsilon^1} F_{j'}(\lambda, m)$ is inconsistent, where $(j, j') = (1a, 1b), (1b, 1a)$.

Parts (a) and (b) of Theorem 3 state that the tests that are directed against the alternatives that represent the true DGP as well as those that do not require any information regarding the direction of change are both consistent, i.e., they reject the null hypothesis with probability one in large samples. Part (c) states that for models with nontrending data, tests that are directed against the "wrong" alternative are inconsistent, i.e., $O_p(1)$. In Section 6 we show through simulations that these tests have empirical power reasonably close to their nominal size. This feature is useful to identify the direction of persistence change,
thereby enabling the applied researcher to infer the direction of shift from the test outcomes.

4. HYBRID TESTING PROCEDURES
One aspect of the test statistics introduced in Section 2 is that they will reject the null with probability one in large samples even if the process is stable I(0) or one that involves changes in the value of the autoregressive parameter such that the process is still I(0) in each regime, i.e., I(0) preserving changes. Here, the researcher may be interested in reliably interpreting the test outcome as one emanating from a switch between an I(1) and an I(0) regime. To accommodate such an interpretation, we propose hybrid testing procedures that entail the joint application of our tests with the Bai and Perron (1998) structural change tests designed for a stationary framework as well as the unit root tests proposed by Ng and Perron (2001) with the modification of Perron and Qu (2007) to select the lag length.

The first hybrid procedure is designed to test the null hypothesis that the process is stable I(1) or stable I(0). To this end, we define $BP(m)$ as the Bai-Perron (1998) partial structural change test that jointly tests the stability of the intercept and the autoregressive parameter in (4) while holding fixed the coefficients on the lagged first differences. This test has the correct asymptotic size when the process is constant I(0). We therefore employ the following decision rule labeled the $D_m$ test: "Reject the null if both $W_1(m)$ and $BP(m)$ reject." If the significance level $\kappa$ is employed for both tests, the asymptotic size of $D_m$ cannot exceed $\kappa$, regardless of whether the process is I(1) or I(0) throughout. Further, since $BP(m)$ and $W_1(m)$ are both consistent against processes that involve a switch between an I(1) and an I(0) regime, $D_m$ has unit asymptotic power against such alternatives. The former feature ensures that our hybrid procedure is well sized, while the latter ensures little loss in power. The number of breaks $m$ is assumed to be known. In practice, the assumption of a known number of breaks can be relaxed using the $W\max_1$ test and the $UD\max$ version of the $BP$ test.

The second hybrid procedure allows the null hypothesis to include the case of I(0) preserving changes in addition to the stable I(1)/I(0) cases. This procedure is useful if the researcher seeks to distinguish between I(0) preserving changes and those that involve at least one switch between an I(1) and an I(0) regime. To facilitate this distinction, we note that with I(0) preserving changes, a unit root test applied on the regime with the largest estimated autoregressive root will reject the null asymptotically, while if an I(1) segment is present, such a test will reject only with probability equal to the nominal significance level in large samples. We therefore recommend using the $D_m$ procedure in conjunction with one of the $M^{GLS}$ tests proposed by Ng and Perron (2001) with the modification of Perron and Qu (2007) to select the lag length, given that these tests avoid the power reversal problem for nonlocal stationary alternatives while maintaining empirical size close to nominal size. We therefore propose joint application of the $D_m$ procedure and the particular $M^{GLS}$ test on
the regime with the largest estimated autoregressive root, where the regimes are identified by minimizing the unrestricted sum of squared residuals. In particular, the decision rule, labeled the $J_m$ test, is: "Reject the null if $D_m$ rejects and $M^{GLS}$ does not reject." If a significance level $\kappa$ is used for each of the tests in $D_m$ as well as for $M^{GLS}$, the asymptotic size of $J_m$ is bounded by $\kappa$, while for persistence changes that involve switches between I(1) and I(0) regimes, its asymptotic power is $(1 - \kappa)$. Using the $J_m$ test, in large samples one can obtain a complete correct classification into I(0) or I(1) throughout, I(0) changes or I(1)/I(0) changes by letting the size of each test go to zero at a suitable rate. The finite sample performance of $D_m$ and $J_m$ will be investigated through simulations in Section 6.

5. ESTIMATORS FOR THE BREAK DATES
Following evidence against the null hypothesis, it is desirable to determine the location of the break dates. To this end, we propose estimating the break dates from global minimization of the sum of squared residuals under the relevant alternative hypothesis. For a model with $k$ breaks, the estimated break dates are thus obtained as

$$(\tilde{T}_1, \ldots, \tilde{T}_k) = \arg\min_{T_1, \ldots, T_k} SSR_{j,k}(T_1, \ldots, T_k),$$

where $SSR_{j,k}(T_1, \ldots, T_k)$ is the sum of squared residuals for Model $j$ ($j = 1a, 1b, 2a, 2b$) evaluated at the partition $\{T_1, \ldots, T_k\}$.¹ When estimating the break dates, we allow the coefficients on the lagged first differences to vary across regimes. The number of lags is also allowed to be regime dependent. The computation of the sum of squared residuals is similar to that discussed in Section 2 except that the cross-regime restrictions on the coefficients governing the short-run dynamics are replaced by within-regime restrictions depending on the number of lags included in a specific regime. The asymptotic properties of these estimators, including their consistency, rate of convergence, and limit distribution, are investigated in a companion paper (Kejriwal and Perron, 2012). Simulations (not reported here) show that the estimators perform very well in small samples in terms of bias and root mean squared error.

6. SIMULATION EXPERIMENTS
In this section we conduct simulation experiments to assess the finite sample performance of the proposed tests as well as to provide a comparison with the tests proposed in Harvey et al. (2006) and Leybourne et al. (2007b). We report results only for the nontrending case, given that qualitatively similar results were obtained for the trending case. Specifically, we consider the statistics $W_1(1)$, $W_1(2)$, $D_1$, $D_2$, $J_1$, and $J_2$. Results for the $W\max_1$ test were found to be similar to those for the $W_1$ test based on the true number of breaks and hence not reported. The Harvey et al. class of tests is designed to detect a single persistence break and is based on partial sums of the demeaned or detrended data. They recommend using the so-called "m min-modified" and "Sm min-modified" tests
These tests differ in the method used to compute the critical values. Given their similar finite sample performance, we only report results for the “m min-modified” tests, denoted H. Further, we present results only for the test based on the mean-functional, since this was found to outperform the maximum and exponential versions in most of our experiments (as in Harvey et al.) while producing very similar results in others. The Leybourne et al. (2007b) tests allow for multiple changes and are based on a doubly recursive application of the unit root statistic using the local GLS detrending methodology developed in Elliott et al. (1996). More specifically, they propose the test statistic

$$M = \inf_{\lambda \in (0,1)} \inf_{\tau \in (\lambda,1]} DF_G(\lambda, \tau),$$

where DF_G(λ, τ) is the local GLS detrended ADF unit root t-statistic that uses the observations between λT and τT. Both the H and M tests allow the process to be stable I(1) or stable I(0) under the null hypothesis. In order to account for the stable I(0) possibility under the null, the rejection frequency of the M-procedure is computed as the proportion of Monte Carlo replications in which the M test rejects, and the corresponding partition selected by the test does not correspond to the full sample. For the M test we used the Gauss code posted by Leybourne et al. (2007b) on the Studies in Nonlinear Dynamics and Econometrics website, so that the lag length selection is based on the sequential approach of Ng and Perron (1995), with a maximal lag order of four and a 10% significance level for the t-test on the highest lag. The lag length in the autoregression for our proposed procedures is selected using the Bayesian information criterion (BIC) with the maximum number of lags allowed set at 10. We first obtain the number of lags based on the estimation of the alternative model and then use this number in the estimation of the null model. Finally, to compute J_m, we use the MZ_α^GLS unit root test of Ng and Perron (2001) with the modification of Perron and Qu (2007) to select the lag length with a maximum of five lags.²

We consider cases where the data generating processes (DGPs) involve no break (size), as well as some involving one and two breaks (power). In all experiments, {e_t} denotes a sequence of i.i.d. N(0, 1) variables. The sample sizes used are T = 150, 240. The nominal size for all tests is set at 5%. All experiments are based on 1,000 replications.

6.1. The Empirical Size of the Tests
In order to assess the empirical size of the tests, the DGP considered is DGP-0: y_t = αy_{t−1} + u_t, y_0 = 0. The errors {u_t} are generated by the autoregressive moving average (ARMA) process u_t = ρu_{t−1} + e_t + θe_{t−1}, u_0 = 0. We present results for the following combinations of values of the autoregressive parameter (ρ) and the moving average parameter (θ): (a) ρ = θ = 0; (b) ρ = 0.3, θ = 0; (c) ρ = 0.5, θ = 0; (d) ρ = 0, θ = −0.5; (e) ρ = 0.3, θ = −0.5; (f) ρ = 0.5, θ = −0.5. The results are presented in Table 2a for α = 1 and Table 2b for α < 1. For the latter, based on extensive simulation experiments, we report results only for ρ = θ = 0, although the full set of results is available upon request. Consider first the unit root case.
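As a rough illustration of the simulation design described above, the following sketch generates one sample path from DGP-0 with ARMA(1,1) errors. It is not the authors' code: the function name `simulate_dgp0`, the use of Python's standard `random` module, and the starting value e_0 = 0 for the moving average term are assumptions made here for the example.

```python
import random


def simulate_dgp0(T, alpha, rho, theta, seed=None):
    """Simulate DGP-0: y_t = alpha*y_{t-1} + u_t, u_t = rho*u_{t-1} + e_t + theta*e_{t-1}.

    y_0 = u_0 = 0 and e_t ~ i.i.d. N(0, 1) as in the text; e_0 = 0 is an
    extra assumption made here to start the MA term.
    """
    rng = random.Random(seed)
    y_prev = u_prev = e_prev = 0.0
    path = []
    for _ in range(T):
        e = rng.gauss(0.0, 1.0)                 # innovation e_t
        u = rho * u_prev + e + theta * e_prev   # ARMA(1,1) error u_t
        y = alpha * y_prev + u                  # AR(1) recursion for y_t
        path.append(y)
        y_prev, u_prev, e_prev = y, u, e
    return path


# One draw under the unit root null (alpha = 1) with a negative MA component.
sample = simulate_dgp0(T=150, alpha=1.0, rho=0.0, theta=-0.5, seed=42)
```

A size experiment would repeat such draws (1,000 times in the text) and record how often a given test rejects at the 5% level; the tests themselves are not implemented in this sketch.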

05 .01 .27 .87 .04 .92 .09 . Empirical size when the process is constant I (0) (DGP-0.93 .5) (0.02 .06 . Since the M test is based on the application of unit root tests to data subsamples.04 . 0) (0.05 . and T = 240.05 .07 .09 .03 .00 .08 .06 .04 .05 .92 .04 .06 .90 . The rejection probability is at least 15% for T = 150 and never falls below 10%.75 .0 .06 .93 .03 When the errors do not contain a negative MA component.04 1.DETECTING MULTIPLE CHANGES IN PERSISTENCE 15 TABLE 2a.04 .07 .05 .01 .08 .05 .06 .02 . these size problems arise from the downward bias in the persistence parameter estimates under the null hypothesis of a unit root. Nominal size = 5%) α = .06 .01 .05 .45 . which remain prominent even for T = 240.06 . which .15 .01 .04 .06 .05 . on the other hand.10 TABLE 2b.05 . the bias in the sum of the autoregressive coefficient estimates is exacerbated.94 .04 .99 . even for T = 240. 0) (.07 .46 . .03 .03 .02 .02 .05 .05 .05 .12 .07 .5.0 .5 α = .10 .9 Test\T 150 240 150 240 150 240 150 240 150 240 W1 (1) W1 (2) D1 D2 J1 J2 M H .0 1.06 .99 .04 . all the proposed statistics are adequately sized with the null rejection probabilities never exceeding 10% for either sample size.11 .05 . −.08 . Empirical size when the process is constant I (1) (DGP-0.09 . The M test.48 .05 .04 . θ) (0.07 . θ).6 α = .12 .18 . the empirical size of the M test is 83%.17 .39 .04 .87 .17 .02 .07 .10 .15 .5) (.05 .13 .05 .01 . As with standard unit root tests.7 α = .04 .02 .01 .07 .91 .04 .05 . For instance.08 .00 . the W1 (1) and W1 (2) tests suffer from important size distortions.03 .05 .8 α = .85 .04 .03 .10 .02 .06 .06 .02 .01 .04 1. A useful feature of the Dm and Jm tests is that they remain adequately sized across all values of (ρ.14 .04 .94 .07 .04 .05 .08 .09 .02 .13 .05 . is seriously oversized irrespective of the nature and extent of serial correlation in the errors.04 .02 .05 .08 .39 .05 .12 .13 .3. −.83 .06 .02 . 
These distortions are especially severe with negative MA errors.04 .08 .00 .02 .11 .01 .25 .06 .5.04 .23 .03 .07 .93 . .17 . With a negative MA component.05 .05 .05 .0 .99 .05 .3.5) Test\T 150 240 150 240 150 240 150 240 150 240 150 240 W1 (1) W1 (2) D1 D2 J1 J2 M H .05 .10 . ρ = θ = 0.05 .05 . Nominal size = 5%) (ρ.04 .21 .04 1.5) (.39 .41 .06 .04 .06 .01 . with ρ = 0.04 .03 .03 .01 .02 . θ = −0.13 .

including those for T = 150).8.3) and lower when it occurs late (λ01 = 0. Such a process is designed to avoid sharp jumps to zero at the break point between the I (1) and I (0) regimes and ensures a joining up of these regimes.3. it is higher when the break occurs early (λ01 = 0. The results are presented in Table 3.. 0. The H test is accurate except when a negative MA component is present.5. and M tests all overreject the null substantially. W1 (2). Otherwise. the qualitative features are similar.5 and vice versa for DGP-2. e. W1 (2). The Case with One Break We now consider the power of the tests with a single break and the following DGPs: For t ≤ [T λ01 ] DGP-1 DGP-2 DGP-3 DGP-4 DGP-5 yt = y t−1 +u t yt = αy t−1 +u t yt = y t−1 +π 1 y t−1 +et yt = αy t−1 +π 1 y t−1 +et yt = y t−1 +u t For t ≥ [T λ01 ] + 1 yt = αy t−1 +u t yt = y t−1 +u t yt = αy t−1 +π 2 y t−1 +et yt = y t−1 +π 2 y t−1 +et yt −y [T λ0 ] = α(y t−1 −y [T λ0 ] ) + u t 1 1 DGP-1 and DGP-2 are processes involving a shift in the persistence parameter but no change in the short-run dynamics. These spurious rejections decline as α increases but remain nonnegligible for α ≤ 0. When α < 1. in turn contributes to the poor finite sample performance of the test under the null hypothesis. We also examine the power of the tests when the persistence parameter is unity but the short-run dynamics change across regimes. however.2. the further away the series is from a pure unit root process. In contrast. i. 0. Relative to the H test and the proposed tests. As expected. the power of all the tests decreases as α increases. The tests are thus subject to a clear size-power trade-off in this latter case. and M tests are all sizeadjusted. 0.4.3.8.16 MOHITOSH KEJRIWAL ET AL.. the M test is much more sensitive to break location. DGP-3 and DGP-4 allow for the shortrun dynamics to simultaneously change as well. Power does vary with the location of the break: As expected. 6. 
Power is also lower with serially correlated errors compared to the i. and Jm tests maintain empirical size very close to nominal size for all stationary values of α and both sample sizes.i. (2007b). the H. power falls from 79% to . the W1 (1).7. 0.5 and T = 240 (more results are available in the working paper version.g. case.e.3 Panel (A) of Table 3 provides results for DGP-1. except when the errors contain a negative MA component. Given the extent of size distortions.7. Dm . We only report results for λ01 = 0. the powers of W1 (1). This is due to the fact that the longer the I (0) segment. We present results for three values of the autoregressive parameter: α = 0.d. DGP-5 is a variant of DGP-1 that is considered in Leybourne et al. the data are generated by DGP-3 (or DGP-4) with α = 1 but π1 = π2 . The loss in power from introducing an autoregressive component in the errors is especially significant for the M test.7) for DGP-1.5. We consider three values for the location of the break: λ01 = 0.

61 .46 .99 1.98 W 1 D1 J1 M H .34 .DETECTING MULTIPLE CHANGES IN PERSISTENCE 17 TABLE 3.70 .5) (0.78 .83 .70 .87 .67 .24 .99 .60 .47 . −. θ) (0.09 .72 .0 1.55 .06 .52 .85 .86 .62 .5 when α = 0.78 .93 1.36 .5) (0.60 .52 .18 .73 . −.92 1.5) (.0 1.50 .73 . 0) (0. .57 .3.76 .95 .21 .91 .94 .95 .92 .79 .0 .83 .88 .5. 0) (.78 .76 .94 .95 .93 .0 .60 .98 .38 .47 .92 .74 .71 .0 .90 .96 .5) 1.97 .85 .0 (0.99 1.91 . 0) (.50 . −.3.98 .96 .26 .54 .77 .74 .0 1.3.66 .84 .56 .28 .69 .96 .98 .0 .78 .76 .3.89 1.98 .93 .97 .0 .74 .5) H (B) DGP-2 (π1 .64 .82 .78 .70 .0 1.99 .35 .53 .93 .5) (.71 .63 .48 (ρ.35 .85 .5.59 .91 .0 .57 .3.95 .31 .92 .76 .92 .88 .76 .0 . This property is important in applications where the researcher does not want to take a stand on the nature of the process under the null .54 . −.87 .65 .2) (−.7 M H W 1 D1 (ρ.74 .5) (.99 .55 .26 .79 .53 .2) (−.0 .99 .89 1.91 .97 1.58 . 0) (0. Empirical power with one break (λ01 = 0.93 .79 .41 .66 .24 . .78 .71 .99 1.94 .21 .87 .48 . θ) (0.91 .95 .82 .0 .30 .48 .37 .96 .0 .0 1.30 .64 .50 .86 .94 .3.39 .91 (C) DGP-3 1.93 .93 .61 .5) (.91 .83 .0 .99 .07 .89 . π2 ) (D) DGP-4 .8 (E) DGP-5 1. 0) (0.79 .29 .0 .78 .5) (0.90 . −.72 .3.59 .26 .72 .90 .60 .70 .0 1.99 . .94 .95 .30 . .55 .99 1.51 .88 1. −.57 .86 .76 .90 .66 .26 . .93 .94 .85 .68 .90 .45 .97 1.73 . − .29 .95 .78 .79 .94 .35 (π1 .79 .90 .85 .21 .94 1. Moreover.7.0 1.54 .79 .88 .51 .95 1.87 .91 .94 .55 .95 .96 .57 .84 .5) 1.88 .80 .84 .50 .91 .98 .61 .89 .0 1.5 W1 D1 J1 α = 0.85 .73 .24 .92 . 0) (.97 . .0 .0 .98 .52 .37 .73 .5) (. −.94 .91 .48 .23 .5) (. W1 stands for the statistic W1 (1).47 .72 .37 .93 .46 .5) M (A) DGP-1 (ρ.71 .87 .81 .81 .94 .74 .53 .73 .95 .60 .0 . −.5.35 .81 . π2 ) (0.48 . there is only a mild loss in power from using the D1 and J1 tests compared to the less robust W1 (1).42 .91 .3.75 .90 .93 .98 .52 .59 .83 .58 .68 .99 .5) J1 α = 0.88 .20 .0 1.95 1.0 1.96 .5).85 . 45% as ρ increases from 0 to 0.65 .76 . 
θ ) (0.92 .67 .59 .77 .0 .68 Note: In all cases.69 .91 .90 .31 . In comparison.0 .65 .69 .85 .0 1.0 1. T = 240 α = 0. −.91 .93 .91 .91 . the power of the proposed tests is much more robust to the extent of error serial correlation.89 .54 .79 .99 .93 .20 .45 .22 .87 .71 .95 .89 .26 .43 .90 .

λ01 = 0. In contrast to the other tests. λ02 ) = (0. π1 = π2 . while the proposed tests have lower power. 0. α = 1.3.04 . our tests are more robust to potential changes in the dynamics only so that a rejection by our tests can be more reliably interpreted as coming from a change in persistence as opposed to a shift in short-run dynamics. while the H test performs favorably for DGP-4.2 π1 = −. nominal size = 5%). though the latter tests still exhibit the highest power except when the errors are driven by a negative MA component.30 .08 .08 . while similar results were obtained using the coordinates (0.3. T = 240 π1 = 0. The rejection frequencies of the M test increase sharply relative to the case where the DGP does not involve a shift in short-run dynamics. Again.3. 0. 0. we report results for locations of the breaks at (λ01 . π2 = −. relative to DGP-1. TABLE 4. It is of interest to assess the power of the tests when the short-run dynamics are allowed to change while the process remains I (1) throughout.09 hypothesis. the results are presented in Panels (C) and (D).7) and (0. The DGPs considered are: DGP-6 DGP-7 DGP-8 DGP-9 DGP-10 For t ≤ [T λ01 ] For [T λ01 ] + 1 ≤ t ≤ [T λ02 ] yt = y t−1 +u t yt = αy t−1 +u t yt = y t−1 +π 1 y t−1 +et yt = αy t−1 +π 1 y t−1 +et yt = y t−1 +u t yt = αy t−1 +u t yt = y t−1 +u t yt = αy t−1 +π 2 y t−1 +et yt = y t−1 +π 2 y t−1 +et yt −y [T λ0 ] = α(y t−1 −y [T λ0 ] ) + u t 1 1 For t ≥ [T λ02 ] + 1 yt = y t−1 +u t yt = αy t−1 +u t yt = y t−1 +et yt = αy t−1 +et yt = y t−1 +u t . the D1 and J1 tests retain power close to W1 (1). Again. The Case With Two Breaks With two breaks in persistence.3.6). in which case the M test rejects the null more often.08 . where the H test rejects the null more often. 6. except when the errors contain a pure negative MA component.5 W1 (1) W1 (2) D1 D2 J1 J2 M H .07 . the proposed tests are generally superior to the others for DGP-3. Hence. π2 = −. The H test dominates in this case. and ρ = θ = 0. 
The use of the proposed tests appears to be advantageous relative to the M and H tests in terms of detecting an I (1)-I (0) shift. ρ = θ = 0. For DGP-3 and DGP-4. The results for DGP-2 are reported in Panel (B) of Table 3.09 . while the rejection probabilities of the M and W1 (1) tests are broadly similar.05 .19 .18 MOHITOSH KEJRIWAL ET AL.5.07 . Empirical power (DGP-3.06 . the rejection frequencies of the Dm and Jm tests do not exceed 10% in any of the cases.5. the M and H tests now have higher power. the rejection frequencies for DGP-5 reported in Panel (E) indicate that.04 . Table 4 reports the rejection frequencies for DGP-3 when α = 1.05 . Finally.7).07 .07 .4. λ01 = 0.
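The persistence-change designs above (an autoregressive coefficient that switches between unity and a stationary value α at the break dates [Tλ]) can be sketched as follows. This is only an illustration under simplifying assumptions: i.i.d. N(0, 1) errors (ρ = θ = 0), y_0 = 0, and no regime-specific short-run dynamics. The function name `simulate_regime_ar1` and the parameter values in the usage line are hypothetical choices, not taken from the paper's code.

```python
import random


def simulate_regime_ar1(T, break_fractions, alphas, seed=None):
    """Simulate y_t = alpha_i * y_{t-1} + e_t, with alpha_i fixed within regime i.

    Assumptions (for illustration only): e_t ~ i.i.d. N(0, 1), y_0 = 0, and
    regime i covers t = [T*lambda_{i-1}] + 1, ..., [T*lambda_i].
    """
    if len(alphas) != len(break_fractions) + 1:
        raise ValueError("need exactly one alpha per regime")
    rng = random.Random(seed)
    # [T * lambda]: integer part; the small offset guards against float round-off.
    break_dates = [int(T * lam + 1e-9) for lam in break_fractions]
    path, y_prev, regime = [], 0.0, 0
    for t in range(1, T + 1):
        # Move to the next regime once t passes the current break date.
        while regime < len(break_dates) and t > break_dates[regime]:
            regime += 1
        y = alphas[regime] * y_prev + rng.gauss(0.0, 1.0)
        path.append(y)
        y_prev = y
    return path, break_dates


# Two-break design in the spirit of DGP-6: I(1), then I(0), then I(1) again.
path, dates = simulate_regime_ar1(240, [0.3, 0.6], [1.0, 0.5, 1.0], seed=0)
```

Swapping the coefficient list to `[0.5, 1.0, 0.5]` gives a design in the spirit of DGP-7, where the initial regime is stationary; a single-break design is obtained by passing one break fraction and two coefficients.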

0 (−.64 .90 .54 .84 .78 .85 .91 .22 .73 .36 .45 .0 .42 .75 .48 .25 .28 .45 .98 . .94 .94 .22 (ρ.45 .40 .32 .79 .43 .90 .42 .94 .17 .89 .49 .09 .22 .81 .56 .22 .41 .03 .05 .26 .75 . 0) (0.85 .33 .71 .61 .50 .06 .21 .94 . 0) (.84 .50 . Empirical power with two breaks (λ01 = 0.40 .57 .31 .91 .91 .43 .77 .43 .99 .87 .07 .81 .86 1.04 .99 W 2 D2 J2 M H .30 .71 1.55 .10 Note: In all cases.50 .22 .26 .75 .83 .73 .98 .41 .41 . the proposed tests are clearly preferred to the M and H tests.75 .63 .65 .41 .89 . π2 ) (D) DGP-9 .81 .70 . 0) (.71 .0 . 0) (.3.40 .5.18 .49 .30 . T = 240 α = 0.87 .42 .63 .80 .25 .66 .41 .41 .14 .23 .78 .16 .3.38 .5) J2 α = 0.73 .70 .29 .15 .5) 1.08 .15 .72 .54 .59 .5 W2 D2 J2 α = 0.87 1.41 .90 .17 .57 .0 .29 .63 .7 M H W 2 D2 (ρ.92 .47 . consider the power of the various tests when the data are generated by DGP-6 and DGP-7 (Panels (A)–(B)).49 .46 . −.5. θ) (0.04 .5) (0. θ) (0.97 .23 .5) (.43 . .32 .81 . 0) (0.46 .89 .95 .5) (0.28 .89 .51 .3.39 .68 .39 .74 .63 .53 .38 .11 .41 .09 (π1 .DETECTING MULTIPLE CHANGES IN PERSISTENCE 19 TABLE 5.87 .40 .54 .84 .50 .5) (.5) H (B) DGP-7 (π1 . .61 .5) (.3.33 .21 .73 .07 .42 .17 .45 .90 .58 .94 . θ) (0.84 .56 .59 .65 .34 .5) (0.96 .92 .49 .04 .06 .96 1.5) (.5) (.46 .28 .70 .49 .04 .70 .60 .24 .14 . λ02 = 0.20 .41 (C) DGP-8 (0.68 .90 .61 .83 .5) (.63 .3.16 .04 .95 .71 .54 .92 .2) (−.67 .95 .24 .26 .22 .28 .5) .04 .05 .45 .06 .−.48 . First. In unreported .96 .96 .55 .68 .20 .06 .65 .05 .96 .03 .91 . π2 ) (0.63 .64 .07 . .63 .32 .10 .04 .−.98 .74 .50 .07 .3.27 .60 . with the H test exhibiting very little power even with a large sample size.58 .41 .92 .97 .58 .75 .69 .68 .41 .55 .81 .15 .42 .99 .−.10 .68 .39 .3.43 .78 .43 .89 .32 .91 1.97 .−.0 .49 .74 .0 .78 .38 .77 .5.04 .12 .61 .49 .48 .28 .69 .64 .51 .3.42 .94 .37 . W2 stands for the statistic W1 (2).3.66 .73 1.28 . For DGP-6.89 .40 .06 .2) 1.58 .0 .81 .82 .94 .42 .49 .05 .56 . The results are presented in Table 5.40 .87 . 
−.87 .6).76 .26 .48 .8 (E) DGP-10 .92 .96 .88 .26 .60 .76 .82 .23 .64 .77 .30 . −.57 .51 .84 .49 .86 .19 .10 .44 .42 . −.95 .06 .78 . −. . 0) (0. .06 .0 .10 .63 .5) M (A) DGP-6 (ρ.65 .44 . −.55 .03 .

For DGP-9 (Panel (D)). the rejection frequencies of the tests are close to those in the absence of regimespecific short-run dynamics.3. in the case of DGP-8 (Panel (C)). our tests again outperform the others except in the case with pure negative MA errors. Finally. even though the tests are based against the alternative that these dynamics remain unchanged across regimes. and 7 for the case ρ = θ = 0. λ02 = 0. For DGP-7. the proposed tests are more powerful relative to the case with no change in the short-run dynamics. An important feature of these results is that the rejection frequencies do not display any tendency to increase with the sample size. the proposed tests can be used to distinguish between processes with an initial I (1) regime and those with an initial I (0) regime.7 compared to the other two location pairs. The results indicate that when the initial regime is I (0) in the true DGP (DGPs 2 and 7). we only present results for DGPs 1. Jm . and H tests have much better size control in finite samples relative to the M test. The performance of the M test was again found to be quite sensitive to the location of the breaks for both DGP-6 and DGP-7. 6. the rejection frequencies in most cases are within 10%. while those for two breaks are reported in Panels (C) and (D) of the same table. for instance. the conclusions based on power results for DGP-10 (Panel (E)) are qualitatively similar to those discussed for DGP-5.3. the H test has much higher power against DGP-7 relative to DGP-6. 6. Summary and Practical Recommendations In summary. we found that the power of all tests (except the H test) is higher for λ01 = 0.4. Interestingly. the results are reported in Panels (A) and (B) of Table 6. λ2 ) = (0.20 MOHITOSH KEJRIWAL ET AL. the exceptions are when the break occurs early in the single change case and when (λ1 . although the discrepancy in this latter case is not substantial.4. Surprisingly though. 
Here we evaluate the empirical power of single and double break tests that are directed against the incorrect alternative. This is not unexpected since power should depend positively on the length of the I (0) segment in the data. 6. 2. simulations. For the single break case. which when combined with the results in Table 3 indicates that this test is more effective at detecting deviations from the null when the initial regime is I (0). The latter test has a substantial probability of overrejection regardless of the degree of serial correlation in the errors and whether the process is I (1) or I (0).5. In most cases the suggested . Even when the initial regime is I (1). thereby confirming that the tests are indeed inconsistent when directed against incorrect alternatives. To save space. when the data involve an I (1)–I (0) change but the researcher applies a test directed against the I (0)– I (1) alternative. 0. Identifying the Initial Regime As discussed in Section 3. the rejection frequencies are well controlled irrespective of the number and locations of breaks as well as the sample size.7) in the two breaks case. the simulation results reveal that the Dm .

01 . but also in distinguishing between shifts that preserve the I (0) nature of the process in each segment and those that are characterized by switches between I (1) and I (0) regimes.01 .7) .04 .03 .02 .0.00 .01 .01 .6) (0. In particular. sup F1a (2) (0.01 .09 .01 .13 .01 .02 .02 .4.12 .04 .09 .3.04 .00 . combining the size and power results in the previous section.3.0.01 .01 .15 . In what follows we show that the use of the Jm test allows one to successfully discriminate between these possibilities.8 150 α = 0. the Dm and Jm tests appear to constitute a very useful addition to the existing battery of procedures designed to detect shifts in persistence.06 . with power being much higher in the latter case.02 .02 .7) (0.01 .09 .04 .02 . sup F1b (1) 0.7 .02 .01 . Hence.0.03 .00 (B) DGP-2.09 .01 .03 .02 .01 .7 150 240 α = 0.5 0. 1).07 .0.03 .5 λ0 \ T 150 240 α = 0.7 .03 . This feature appears especially relevant in the presence of multiple breaks.01 .04 .DETECTING MULTIPLE CHANGES IN PERSISTENCE 21 TABLE 6.02 .01 .3.00 .02 .01 .01 .02 (C) DGP-6.9 240 150 240 .0.00 .02 .00 . The power performance of the H test is quite sensitive to whether the initial regime is I (1) or I (0). λ01 = 0. and H tests for a range of stationary values of α are reported in Table 7. while existing procedures are not suited for the same. The rejection probabilities of the J1 .01 .13 .01 .4. Empirical power against incorrect alternatives.06 (D) DGP-7.01 .01 .01 .06 . where y0 = 0.09 .05 .6) (0.17 .05 .18 .02 . ρ = θ = 0 α = 0.01 . in which case the H test has very little power when the initial regime is I (1).7) (0.01 statistics are also shown to have superior performance in terms of rejecting the null when the alternatives of interest drive the DGP. sup F1b (2) (0.01 .02 .02 .00 .01 .01 .3 0.01 .3 0.01 .00 .01 .01 .3.04 .01 .00 . the researcher may be interested not only in determining if the process is governed by a stable persistence parameter.09 . In practice.00 .03 .5 0.01 .02 . 
M.01 (A) DGP-1.17 .04 .0. sup F1a (1) 0.7) . The results show that the M test almost always .01 .01 .01 . we consider the following DGP-S: yt = et if t ≤ [T λ01 ] and yt = αyt−1 + et if t ≥ [T λ01 ] + 1.02 .5 and et ∼ iid N (0.

0 .5) α = .08 1.0 .99 . 2000).24 . see Kim and Perron (2009). consider the case in which the process is I (1) across two segments but with a change in trend. the use of the Jm tests again provides a reliable safeguard. our tests will have power but so would unit root tests allowing for a change in the trend function. The problem is that the limits of the resulting statistics under the unit root null depend on the true trend break date.0 . Consider next the case where the process is I (0) with breaks in the slope of the trend function. The following procedure can be used to distinguish such a process from a persistence change process.51 . So our test can be used in conjunction with those of Kim and Perron to make sure that a change in persistence is indeed present and not only a change in the trend function.97 . We then apply our persistence change tests to the detrended data. however. the J1 test is much more immune to the value of α. Among the latter two tests.6 α = . ρ = θ = 0. To examine the finite sample performance of the detrended test statistics in the single break case. since its rejection frequencies are controlled owing to the fact that the unit root test on the regime with the largest estimated autoregressive unit root rejects with probability one in large samples (given the consistency of the estimated breakpoints).22 MOHITOSH KEJRIWAL ET AL. α2 = α. But we can use the critical values corresponding to Models 2a or 2b (as the case may be) as a benchmark. α1 = 0.99 . This experiment thus clearly illustrates the usefulness of the recommended tests in identifying the nature of the persistence shifts responsible for instabilities in the process generating the data.27 .06 . then the latter would not reject (see Kim. We first detrend the data using a regression of the data on a time trend and a slope dummy (where the break date is chosen by minimizing the sum of squared residuals).21 1.9 and T = 150.15 . Then.07 1.87 . 
one of the tests (for Model 2a or 2b) would reject.68 .9 Test \T 150 240 150 240 150 240 150 240 150 240 J1 M H .95 rejects.5 α = . TABLE 7. the likelihood of rejection being substantial only when α = 0.98 .06 . If there are changes both in persistence and in the slope of the trend.75 .7 α = . these tests should not reject the null.55 1.0 . λ01 = 0. regardless of the sample size and the break magnitude.96 .23 1.22 .35 . rejecting the null more frequently as the break becomes larger.8 α = . The H and J1 tests are much more sensitive to the magnitude of the change. If there is only a pure trend break. Finally. while if there is an accompanying change in persistence.44 . we consider .15 . It remains to discuss how to disentangle a rejection of the proposed statistics as coming from a change in persistence and not only a change in the trend function. Null rejection probabilities for an I (0)-I (0) change (DGP-S.07 . In the trendless case where the process is I (0) with pure level shifts.0 .

never exceeding 10%.37 .93 . the number of trend breaks can be estimated using the sequential procedure developed by Kejriwal and Perron (2010) that is robust to whether the errors are I (1) or I (0). .97 . α2 = 0.52 .72 .99 . the size remains adequate.45 .53 .86 .5 α1 = 1. it has the advantage of being agnostic to the types of breaks.5.67 . μ1 = 5) in which case the test J1 is employed.08 λ01 = 0. we fix τ10 = 0.05 .71 .3.7 α1 = 0. we can potentially adopt the following procedure.7.44 .56 . et ∼ iid N (0.80 .82 . ∗ + e if t ≤ [λ0 T ] and y ∗ = α y ∗ + e if t ≥ [λ0 T ] + 1 with where yt∗ = α1 yt−1 t 2 t−1 t t 1 1 ∗ y0 = 0. Empirical size and power of W1 (1) and W2 (1) (DGP-T.95 .7. Power is generally highest when the trend break date coincides with the persistence break date. α2 = 1 . 0.5) in which case the test W2 (1) is employed.70 .39 .5) (For J1 : β1 = 0. 0. We consider the case of a pure level shift (μ0 = β0 = β1 = 0.5 . This estimate can subsequently be used to detrend the data and apply the W max2 test in the second step.32 W2 W2 λ01 = 0. While such a procedure is likely to be computationally intensive. power is highest when the the trend break precedes the persistence break. μ1 = 5 and for W2 (1): μ1 = 0.71 .95 . Note that we allow τ10 = λ01 .45 .68 . 1).78 .DETECTING MULTIPLE CHANGES IN PERSISTENCE 23 TABLE 8.96 .11 . α2 = 0.85 . β1 = 0. In the simulations.92 .92 .78 W2 W2 λ01 = 0. which allows for the possibility that the number of trend beaks can be different from the number of persistence breaks. as well as the case of a trend break (μ0 = μ1 = β0 = 0. μ0 = β0 = 0.33 .97 .95 .09 .04 .5. CONCLUSION This paper has presented issues related to testing for multiple structural changes in the persistence of a univariate time series.0 .96 1. τ 0 = 0. β1 = 0.27 . In a first step. In contrast to the existing literature.66 .5 and set λ01 = 0.28 .72 .22 . The results are reported in Table 8. Note that in both the pure level shift and trend break cases. 
An exception is the trend break case where the persistence change is from an I (0) to an I (1) process: Here.55 .3 α1 = 1.76 . α2 = 1 α1 = 0. In the general case.90 the following DGP-T: yt = μ0 + μ1 I (t > [T τ10 ]) + β0 t + β1 (t − [T τ10 ])I (t > [T τ10 ]) + yt∗ .78 .5) α 1 = α2 = 1 T = 150 T = 240 T = 150 T = 240 T = 150 T = 240 J1 W2 J1 W2 J1 J1 J1 J1 .47 .80 .23 . Investigation of the asymptotic and finite sample properties of such a procedure is left as an important avenue for future research.7 . 7.30 .

In contrast to the existing literature, which has primarily focused on subsample unit root tests and tests based on partial sums of residuals, we propose sup-Wald tests based on the difference between the sum of squared residuals under the null hypothesis of a unit root and that under the alternative hypothesis that the process displays changes in persistence over the sample. Our simulation experiments demonstrate that these tests have adequate finite sample properties.

One important issue that we have not addressed is how to select the number of breaks. Indeed, we have assumed that the number of breaks is known a priori or less than some known upper bound. Bai and Perron (1998) propose a sequential strategy based on repeated application of the single break test in the context of stationary regression models. Such a strategy, however, does not directly extend to our framework, given that the process is stationary in only some regimes but has a unit root in others. Developing methods that would allow the consistent estimation of the number of breaks in this framework is an important avenue for future research. Finally, it is important to address the issue of the estimation of the break dates and develop a method to form confidence intervals. These and other issues are the object of ongoing research.

NOTES
1. Such an estimate was proposed by Chong (2001) for an AR(1) model with a single shift in persistence, although his estimation procedure did not impose the unit root restriction in the relevant regime.
2. The size and power properties using other versions of the M^GLS test were very similar.
3. The full set of results is available upon request.

REFERENCES
Andrews, D.W.K. (1993) Tests for parameter instability and structural change with unknown change point. Econometrica 61, 821–856.
Bai, J. & P. Perron (1998) Estimating and testing linear models with multiple structural changes. Econometrica 66, 47–78.
Bai, J. & P. Perron (2003) Computation and analysis of multiple structural change models. Journal of Applied Econometrics 18, 1–22.
Barsky, R.B. (1987) The Fisher hypothesis and the forecastability and persistence of inflation. Journal of Monetary Economics 19, 3–24.
Berk, K.N. (1974) Consistent autoregressive spectral estimates. Annals of Statistics 2, 489–502.
Burdekin, R.C.K. & P.L. Siklos (1999) Exchange rate regimes and shifts in inflation persistence: Does nothing else matter? Journal of Money, Credit and Banking 31, 235–247.
Busetti, F. & A.M.R. Taylor (2004) Tests of stationarity against a change in persistence. Journal of Econometrics 123, 33–66.
Chang, M.C. (1989) Testing for Overdifferencing. Ph.D. dissertation, North Carolina State University.
Chang, M.C. & D.A. Dickey (1994) Recognizing overdifferenced time series. Journal of Time Series Analysis 15, 1–18.
Chong, T.T.L. (2001) Structural change in AR(1) models. Econometric Theory 17, 87–155.
DeLong, J.B. & L.H. Summers (1988) How does macroeconomic policy affect output? Brookings Papers on Economic Activity 2, 433–494.
Elliott, G., T.J. Rothenberg, & J.H. Stock (1996) Efficient tests for an autoregressive unit root. Econometrica 64, 813–836.

Hakkio, C.S. & M. Rush (1991) Is the budget deficit too large? Economic Inquiry 29, 429–445.
Harvey, D.I., S.J. Leybourne, & A.M.R. Taylor (2006) Modified tests for a change in persistence. Journal of Econometrics 134, 441–469.
Kang, K.H., C.J. Kim, & J. Morley (2009) Changes in U.S. inflation persistence. Studies in Nonlinear Dynamics & Econometrics vol. 13(4), article 1.
Kejriwal, M. & P. Perron (2010) A sequential procedure to determine the number of breaks in trend with an integrated or stationary noise component. Journal of Time Series Analysis 31, 305–328.
Kejriwal, M. & P. Perron (2012) Estimating a Structural Change in Persistence. Manuscript in preparation, Boston University.
Kim, D. & P. Perron (2009) Unit root tests allowing for a break in the trend function under both the null and alternative hypotheses. Journal of Econometrics 148, 1–13.
Kim, J.Y. (2000) Detection of change in persistence of a linear time series.
Kim, J.Y. (2003) Inference on segmented cointegration. Econometric Theory 19, 620–639.
Kurozumi, E. (2005) Detection of structural change in the long-run persistence in a univariate time series. Oxford Bulletin of Economics and Statistics 67, 181–206.
Leybourne, S.J., T.H. Kim, V. Smith, & P. Newbold (2003) Tests for a change in persistence against the null of difference-stationarity. Econometrics Journal 6, 291–311.
Leybourne, S.J., A.M.R. Taylor, & T.H. Kim (2007a) CUSUM of squares-based tests for a change in persistence. Journal of Time Series Analysis 28, 408–433.
Leybourne, S.J., T.H. Kim, & A.M.R. Taylor (2007b) Detecting multiple changes in persistence. Studies in Nonlinear Dynamics & Econometrics vol. 11(3), article 2.
Lütkepohl, H. & P. Saikkonen (1999) Order selection in testing for the cointegrating rank of a VAR process. In R.F. Engle & H. White (eds.), Cointegration, Causality and Forecasting, pp. 168–99. Oxford University Press.
Mankiw, N.G., J.A. Miron, & D.N. Weil (1987) The adjustment of expectations to change in regime: A study of the founding of the Federal Reserve. American Economic Review 77, 358–374.
Ng, S. & P. Perron (1995) Unit root tests in ARMA models with data dependent methods for the selection of the truncation lag. Journal of the American Statistical Association 90, 268–281.
Ng, S. & P. Perron (2001) Lag length selection and the construction of unit root tests with good size and power. Econometrica 69, 1519–1554.
Perron, P. (1989) The great crash, the oil price shock, and the unit root hypothesis. Econometrica 57, 1361–1401.
Perron, P. (2006) Dealing with structural breaks. In K. Patterson & T.C. Mills (eds.), Palgrave Handbook of Econometrics, pp. 278–352. Palgrave Macmillan.
Perron, P. & Z. Qu (2006) Estimating restricted structural change models. Journal of Econometrics 134, 373–399.
Perron, P. & Z. Qu (2007) A simple modification to improve the finite sample properties of Ng and Perron’s unit root tests. Economics Letters 94, 12–19.
Taylor, A.M.R. (2005) Fluctuation tests for a change in persistence. Oxford Bulletin of Economics and Statistics 67, 207–230.

APPENDIX

As a matter of notation, throughout, we use the usual norm $\|B\|^2 = \mathrm{tr}(B'B)$, the standard Euclidean norm. Also, we use the matrix norm $\|B\|_1 = \sup_{\|x\|\le 1}\|Bx\|$. Note that $\|B\|_1$ equals the square root of the largest eigenvalue of $B'B$ and that $\|Bx\| \le \|B\|_1\|x\|$, such that $\|B\|_1^2 \le \|B\|^2$. Note that for any conformable matrices $B_1$ and $B_2$, we have $\|B_1B_2\| \le \|B_1\|\,\|B_2\|_1$. Next, we define $\bar z_j = (T_j - T_{j-1})^{-1}\sum_{t=T_{j-1}+1}^{T_j} z_t$ and $\bar z_{j,-1} = (T_j - T_{j-1})^{-1}\sum_{t=T_{j-1}+1}^{T_j} z_{t-1}$. Finally, we define the following regime-wise

demeaned and detrended Brownian motions:
\[
W^{(j)}(r) = W(r) - (\lambda_j-\lambda_{j-1})^{-1}\int_{\lambda_{j-1}}^{\lambda_j} W(s)\,ds
\]
and
\[
\widetilde W^{(j)}(r) = W^{(j)}(r) - \left[\frac{\int_{\lambda_{j-1}}^{\lambda_j} s\,W^{(j)}(s)\,ds}{\int_{\lambda_{j-1}}^{\lambda_j}\bigl(s-(\lambda_j-\lambda_{j-1})^{-1}\int_{\lambda_{j-1}}^{\lambda_j}v\,dv\bigr)^2 ds}\right]\left(r-(\lambda_j-\lambda_{j-1})^{-1}\int_{\lambda_{j-1}}^{\lambda_j}s\,ds\right),
\]
where $W(\cdot)$ denotes a standard Brownian motion on $[0,1]$. We first state a lemma about the weak convergence of various sample moments whose proof is standard and thus omitted.

LEMMA A.1. If $\{w_t\}$ is generated as $w_t = w_{t-1} + v_t$, where $v_t$ satisfies Assumption A1, the following weak convergence results hold (for $i = 1, \ldots, k+1$):
(a) $T^{-3/2}\sum_{t=1}^{[T\lambda_i]} w_t \Rightarrow \sigma\int_0^{\lambda_i} W(r)\,dr$,
(b) $T^{-2}\sum_{t=1}^{[T\lambda_i]} w_t^2 \Rightarrow \sigma^2\int_0^{\lambda_i} W(r)^2\,dr$,
(c) $T^{-1}\sum_{t=1}^{[T\lambda_i]} w_{t-1}v_t \Rightarrow \sigma^2\int_0^{\lambda_i} W(r)\,dW(r)$.

Proof of Theorem 1. We shall prove the theorem for Models 1a and 2a. The proofs for the other models are similar and hence omitted. For Model 1a, we have $y_t = c_i + \alpha_i y_{t-1} + u_t$, $t = T_{i-1}+1, \ldots, T_i$ for $i = 1, \ldots, k+1$, with $\alpha_i = 1$, $c_i = 0$ in odd regimes and $|\alpha_i| < 1$, $c_i$ unrestricted in even regimes. Under the null hypothesis of a unit root throughout the sample, the sum of squared residuals is $SSR_0 = \sum_{t=1}^T (y_t - y_{t-1})^2 = \sum_{t=1}^T u_t^2$. If $k$ is even, the sum of squared residuals under the alternative hypothesis is
\[
SSR_{1a,k} = \sum_{i=1}^{k/2}\sum_{t=T_{2i-1}+1}^{T_{2i}}\bigl(y_t-\bar y_{2i}-\hat\alpha_{2i}(y_{t-1}-\bar y_{2i,-1})\bigr)^2 + \sum_{i=0}^{k/2}\sum_{t=T_{2i}+1}^{T_{2i+1}} u_t^2, \tag{A.1}
\]
where $\hat\alpha_{2i} = \sum_{t=T_{2i-1}+1}^{T_{2i}}(y_t-\bar y_{2i})(y_{t-1}-\bar y_{2i,-1}) \big/ \sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})^2$. Note that, under the null hypothesis, $y_t = y_{t-1} + u_t$, which implies, for $i = 1, \ldots, k/2$, $\bar y_{2i} = \bar y_{2i,-1} + \bar u_{2i}$. Substituting in the expression for $\hat\alpha_{2i}$ and using Lemma A.1, we have
\[
T(\hat\alpha_{2i}-1) = \frac{T^{-1}\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})u_t}{T^{-2}\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})^2} \Rightarrow \frac{\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)\,dW(r)}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)^2\,dr}.
\]
From (A.1), we thus have, under the null,
\[
SSR_{1a,k} = \sum_{i=1}^{k/2}\left[-\frac{\bigl(\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})u_t\bigr)^2}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})^2} + \sum_{t=T_{2i-1}+1}^{T_{2i}}(u_t-\bar u_{2i})^2\right] + \sum_{i=0}^{k/2}\sum_{t=T_{2i}+1}^{T_{2i+1}} u_t^2,
\]
so that
\[
SSR_0 - SSR_{1a,k} = \sum_{i=1}^{k/2}\left[\frac{\bigl(\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})u_t\bigr)^2}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})^2} + \frac{T}{T_{2i}-T_{2i-1}}\Bigl(T^{-1/2}\sum_{t=T_{2i-1}+1}^{T_{2i}} u_t\Bigr)^2\right]
\]
\[
\Rightarrow \sigma^2\sum_{i=1}^{k/2}\left[\frac{\bigl(\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)\,dW(r)\bigr)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)^2\,dr} + \frac{1}{\lambda_{2i}-\lambda_{2i-1}}\bigl\{W(\lambda_{2i})-W(\lambda_{2i-1})\bigr\}^2\right].
\]
It is easy to show that $T^{-1}SSR_{1a,k} = T^{-1}\sum_{t=1}^T u_t^2 + o_p(1) \to_p \sigma^2$, so that
\[
k\,F_{1a}(\lambda,k) \Rightarrow \sum_{i=1}^{k/2}\left[\frac{\bigl(\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)\,dW(r)\bigr)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)^2\,dr} + \frac{1}{\lambda_{2i}-\lambda_{2i-1}}\bigl\{W(\lambda_{2i})-W(\lambda_{2i-1})\bigr\}^2\right].
\]
If $k$ is odd,
\[
SSR_{1a,k} = \sum_{i=0}^{(k-1)/2}\sum_{t=T_{2i}+1}^{T_{2i+1}} u_t^2 + \sum_{i=1}^{(k+1)/2}\sum_{t=T_{2i-1}+1}^{T_{2i}}\bigl(y_t-\bar y_{2i}-\hat\alpha_{2i}(y_{t-1}-\bar y_{2i,-1})\bigr)^2
\]
and similar derivations show that
\[
(k+1)\,F_{1a}(\lambda,k) \Rightarrow \sum_{i=1}^{(k+1)/2}\left[\frac{\bigl(\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)\,dW(r)\bigr)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}} W^{(2i)}(r)^2\,dr} + \frac{1}{\lambda_{2i}-\lambda_{2i-1}}\bigl\{W(\lambda_{2i})-W(\lambda_{2i-1})\bigr\}^2\right].
\]
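The finite-sample step in the Model 1a argument is purely algebraic, so it can be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes a simulated driftless random walk, two illustrative break dates, and hypothetical function names, and it compares $SSR_0 - SSR_{1a,k}$ computed from the regime-wise OLS fits with the closed form $\sum_i[(\sum (y_{t-1}-\bar y_{2i,-1})u_t)^2/\sum (y_{t-1}-\bar y_{2i,-1})^2 + (\sum u_t)^2/(T_{2i}-T_{2i-1})]$.

```python
import random


def even_regimes(breaks, T):
    """Yield (start, end) for the even regimes T_{2i-1}+1, ..., T_{2i}."""
    edges = [0] + list(breaks) + [T]          # T_0, T_1, ..., T_{k+1}
    for j in range(2, len(edges), 2):
        yield edges[j - 1] + 1, edges[j]


def ssr_gap_direct(y, breaks):
    """SSR_0 - SSR_{1a,k} from OLS of y_t on (1, y_{t-1}) in each even regime.

    Odd regimes contribute nothing: they impose a driftless unit root under
    both hypotheses, so their residuals coincide.
    """
    T = len(y) - 1
    gap = 0.0
    for start, end in even_regimes(breaks, T):
        ts = range(start, end + 1)
        n = len(ts)
        ybar = sum(y[t] for t in ts) / n
        ylag_bar = sum(y[t - 1] for t in ts) / n
        sxx = sum((y[t - 1] - ylag_bar) ** 2 for t in ts)
        sxy = sum((y[t] - ybar) * (y[t - 1] - ylag_bar) for t in ts)
        alpha_hat = sxy / sxx
        ssr_alt = sum(
            (y[t] - ybar - alpha_hat * (y[t - 1] - ylag_bar)) ** 2 for t in ts
        )
        ssr_null = sum((y[t] - y[t - 1]) ** 2 for t in ts)
        gap += ssr_null - ssr_alt
    return gap


def ssr_gap_closed_form(y, breaks):
    """Same quantity from the closed form, with u_t = y_t - y_{t-1}."""
    T = len(y) - 1
    gap = 0.0
    for start, end in even_regimes(breaks, T):
        ts = range(start, end + 1)
        n = len(ts)
        ylag_bar = sum(y[t - 1] for t in ts) / n
        sxu = sum((y[t - 1] - ylag_bar) * (y[t] - y[t - 1]) for t in ts)
        sxx = sum((y[t - 1] - ylag_bar) ** 2 for t in ts)
        sum_u = y[end] - y[start - 1]          # telescoping sum of u_t
        gap += sxu ** 2 / sxx + sum_u ** 2 / n
    return gap


# Illustrative data: a random walk of length T = 200 with k = 2 break dates.
rng = random.Random(12345)
y = [0.0]
for _ in range(200):
    y.append(y[-1] + rng.gauss(0.0, 1.0))
breaks = [60, 140]

direct = ssr_gap_direct(y, breaks)
closed = ssr_gap_closed_form(y, breaks)
```

The two values agree up to floating-point error for any series, since the identity only uses $y_t = y_{t-1} + u_t$ with $u_t$ defined as the first difference; the distributional content of the theorem enters only when the scaled sums are replaced by their limits.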

For Model 2a, we have $y_t = c_i + b_i t + \alpha_i y_{t-1} + u_t$, $t = T_{i-1}+1, \ldots, T_i$, with $\alpha_i = 1$, $b_i = 0$, $c_i$ unrestricted in odd regimes and $|\alpha_i| < 1$, $b_i$, $c_i$ unrestricted in even regimes. Under the null, $y_t = c + y_{t-1} + u_t$. For this model, we have $SSR_0^* = \sum_{t=1}^T\bigl[\Delta y_t - T^{-1}\sum_{t=1}^T \Delta y_t\bigr]^2 = \sum_{t=1}^T (u_t-\bar u)^2$. Again, consider first the case with $k$ even. For $t \in [T_{2i-1}+1, T_{2i}]$, define
\[
\tilde y_t = y_t - \bar y_{2i} - \frac{\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_t-\bar y_{2i})(t-\bar t_{2i})}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})^2}\,(t-\bar t_{2i}),
\]
\[
\tilde y_{t-1} = y_{t-1} - \bar y_{2i,-1} - \frac{\sum_{t=T_{2i-1}+1}^{T_{2i}}(y_{t-1}-\bar y_{2i,-1})(t-\bar t_{2i})}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})^2}\,(t-\bar t_{2i}).
\]
Then, under the null hypothesis, we can write
\[
\tilde y_t = \tilde y_{t-1} + u_t - \bar u_{2i} - \frac{\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})u_t}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})^2}\,(t-\bar t_{2i}). \tag{A.2}
\]
We have
\[
SSR_{2a,k} = \sum_{i=1}^{k/2}\sum_{t=T_{2i-1}+1}^{T_{2i}}(\tilde y_t-\tilde\alpha_{2i}\tilde y_{t-1})^2 + \sum_{i=0}^{k/2}\sum_{t=T_{2i}+1}^{T_{2i+1}}\Bigl[\Delta y_t - \frac{1}{T_{2i+1}-T_{2i}}\sum_{t=T_{2i}+1}^{T_{2i+1}}\Delta y_t\Bigr]^2, \tag{A.3}
\]
where $\tilde\alpha_{2i} = \sum_{t=T_{2i-1}+1}^{T_{2i}}\tilde y_t\tilde y_{t-1} \big/ \sum_{t=T_{2i-1}+1}^{T_{2i}}\tilde y_{t-1}^2$. Then using (A.2), we can express (A.3) as
\[
SSR_{2a,k} = \sum_{i=1}^{k/2}\left[-\frac{\bigl(\sum_{t=T_{2i-1}+1}^{T_{2i}}\tilde y_{t-1}u_t\bigr)^2}{\sum_{t=T_{2i-1}+1}^{T_{2i}}\tilde y_{t-1}^2} + \sum_{t=T_{2i-1}+1}^{T_{2i}}(u_t-\bar u_{2i})^2 - \frac{\bigl(\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})u_t\bigr)^2}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})^2}\right] + \sum_{i=0}^{k/2}\sum_{t=T_{2i}+1}^{T_{2i+1}}(u_t-\bar u_{2i+1})^2.
\]
We thus get
\[
SSR_0^* - SSR_{2a,k} = -\Bigl(T^{-1/2}\sum_{t=1}^T u_t\Bigr)^2 + \sum_{i=0}^{k/2}\frac{T}{T_{2i+1}-T_{2i}}\Bigl(T^{-1/2}\sum_{t=T_{2i}+1}^{T_{2i+1}}u_t\Bigr)^2
\]
\[
\qquad + \sum_{i=1}^{k/2}\left[\frac{\bigl(\sum_{t=T_{2i-1}+1}^{T_{2i}}\tilde y_{t-1}u_t\bigr)^2}{\sum_{t=T_{2i-1}+1}^{T_{2i}}\tilde y_{t-1}^2} + \frac{T}{T_{2i}-T_{2i-1}}\Bigl(T^{-1/2}\sum_{t=T_{2i-1}+1}^{T_{2i}}u_t\Bigr)^2 + \frac{\bigl(\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})u_t\bigr)^2}{\sum_{t=T_{2i-1}+1}^{T_{2i}}(t-\bar t_{2i})^2}\right],
\]
which yields
\[
2k\,F_{2a}(\lambda,k) \Rightarrow -\{W(1)\}^2 + \sum_{i=0}^{k/2}\frac{1}{\lambda_{2i+1}-\lambda_{2i}}\bigl[W(\lambda_{2i+1})-W(\lambda_{2i})\bigr]^2
\]
\[
\qquad + \sum_{i=1}^{k/2}\left[\frac{\bigl\{\int_{\lambda_{2i-1}}^{\lambda_{2i}}\widetilde W^{(2i)}(r)\,dW(r)\bigr\}^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}}\widetilde W^{(2i)}(r)^2\,dr} + \frac{1}{\lambda_{2i}-\lambda_{2i-1}}\bigl[W(\lambda_{2i})-W(\lambda_{2i-1})\bigr]^2 + \frac{\bigl[\int_{\lambda_{2i-1}}^{\lambda_{2i}}\bigl(r-(\lambda_{2i}-\lambda_{2i-1})^{-1}\int_{\lambda_{2i-1}}^{\lambda_{2i}}s\,ds\bigr)dW(r)\bigr]^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}}\bigl(r-(\lambda_{2i}-\lambda_{2i-1})^{-1}\int_{\lambda_{2i-1}}^{\lambda_{2i}}s\,ds\bigr)^2dr}\right].
\]
If $k$ is odd,
\[
SSR_{2a,k} = \sum_{i=0}^{(k-1)/2}\sum_{t=T_{2i}+1}^{T_{2i+1}}\Bigl[\Delta y_t - \frac{1}{T_{2i+1}-T_{2i}}\sum_{t=T_{2i}+1}^{T_{2i+1}}\Delta y_t\Bigr]^2 + \sum_{i=1}^{(k+1)/2}\sum_{t=T_{2i-1}+1}^{T_{2i}}(\tilde y_t-\tilde\alpha_{2i}\tilde y_{t-1})^2
\]
and similar derivations yield the result stated in Theorem 1. Given these limits, the results of Theorem 1 follow from an application of the continuous mapping theorem.

For the proof of Theorem 2, we consider Model 1a when $k$ is even; the proof is similar for the other cases. The autoregression in regime $2i$ ($i = 1, \ldots, k/2$) is
\[
\Delta y_t = c_{2i} + (\alpha_{2i}-1)y_{t-1} + \sum_{j=1}^{l_T}\pi_j\Delta y_{t-j} + v_t^*, \tag{A.4}
\]
with $v_t^* = e_t + v_t$, and $e_t = \sum_{j>l_T}\pi_j\Delta y_{t-j}$. Let $\eta_t = (\Delta y_{t-1}, \ldots, \Delta y_{t-l_T})'$, $\Pi = (\pi_1, \ldots, \pi_{l_T})'$, $\eta = (\eta_1, \ldots, \eta_T)'$, and $V^* = (v_1^*, \ldots, v_T^*)' = V + E$ with $V = (v_1, \ldots, v_T)'$ and $E = (e_1, \ldots, e_T)'$. We can write (A.4) as $\Delta y_t = c_i + (\alpha_i-1)y_{t-1} + \eta_t'\Pi + v_t^*$ with $\alpha_i = 1$, $c_i = 0$ in odd regimes and $|\alpha_i| < 1$, $c_i$ unrestricted in even regimes. For $j = 1, \ldots, k+1$, we denote $Y_j = (\Delta y_{T_{j-1}+1}, \ldots, \Delta y_{T_j})'$, $\eta_j^* = (\eta_{T_{j-1}+1}, \ldots, \eta_{T_j})'$, $V_j = (v_{T_{j-1}+1}, \ldots, v_{T_j})'$, $E_j = (e_{T_{j-1}+1}, \ldots, e_{T_j})'$, and $V_j^* = (v^*_{T_{j-1}+1}, \ldots, v^*_{T_j})'$. For $i = 1, \ldots, k/2$, let $\hat\gamma_{2i} = (\hat c_{2i}, \hat\alpha_{2i}-1)'$ and $Z_{2i} = (z_{T_{2i-1}+1}, \ldots, z_{T_{2i}})'$ where $z_t = (1, y_{t-1})'$ for $t = T_{2i-1}+1, \ldots, T_{2i}$. Define the $(2\times 2)$ diagonal matrix $D_T = \mathrm{diag}(T^{-1/2}, T^{-1})$. The proof of Theorem 2 is based on the following lemma.

LEMMA A.2. Assume $y_t$ is generated as $y_t = y_{t-1} + u_t$. Under Assumptions A1–A3, for $i = 1, \ldots, k/2$, we have
(a) $\|(\eta'\eta)^{-1}\|_1 = O_p(T^{-1})$,
(b) (i) $\|D_TZ_{2i}'\eta_{2i}^*\| = O_p(l_T^{1/2})$ and (ii) $\|D_TZ_{2i}'E_{2i}\| = o_p(l_T^{-1})$,
(c) $\|\eta'V\| = O_p(T^{1/2}l_T^{1/2})$,
(d) $\|\eta'E\| = o_p(Tl_T^{-1/2})$,
(e) $\|E'E\| = o_p(T)$,
(f) $\|E'V\| = o_p(T)$,
(g) $\|\eta'V^*\| = o_p(Tl_T^{-1/2})$,
(h) $\bigl\|\bigl[\eta'\eta - \sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\bigr]^{-1}\bigr\|_1 = O_p(T^{-1})$.

Proof of Lemma A.2. (a) Let $\Gamma_l^* = (\Gamma_{i-j})_{i,j=1,\ldots,l_T}$, where $\Gamma_h = E(u_tu_{t-h})$. From Berk (1974, Lem. 3), it follows that $\|(T^{-1}\eta'\eta)^{-1} - (\Gamma_l^*)^{-1}\|_1 = O_p(T^{-1/2}l_T)$. Since $\|(\Gamma_l^*)^{-1}\|_1 = O(1)$ uniformly in $l_T$ for sequences such that $T^{-1/2}l_T \to 0$, the result follows from the fact that $\bigl|\,\|(T^{-1}\eta'\eta)^{-1}\|_1 - \|(\Gamma_l^*)^{-1}\|_1\bigr| \le \|(T^{-1}\eta'\eta)^{-1} - (\Gamma_l^*)^{-1}\|_1 = o_p(1)$.

(b) For (i), the result follows since each element of $D_TZ_{2i}'\eta_{2i}^*$ is $O_p(1)$ and the number of elements is of order $O(l_T)$. For (ii), the result follows from Lemma A.2(a) of Lütkepohl and Saikkonen (1999).

(c) The elements of $T^{-1/2}\eta'V$ are each $O_p(1)$ (since each element of $\eta_t$ and $v_t$ is uncorrelated), and the result follows since the number of elements is of order $O(l_T)$.

(d) We have
\[
E\|T^{-1}\eta'E\| \le T^{-1}\sum_{t=1}^T E\bigl(|e_t|\,\|\eta_t\|\bigr) \le \bigl\{E\|\eta_t\|^2E(e_t^2)\bigr\}^{1/2} = C_2l_T^{1/2}\Bigl\{E\Bigl(\sum_{j>l_T}\Delta y_{t-j}\pi_j\Bigr)^2\Bigr\}^{1/2}
\]
\[
\le C_2l_T^{1/2}\Bigl(\sum_{i>l_T}\sum_{j>l_T}|\Gamma_{i-j}|\,|\pi_i|\,|\pi_j|\Bigr)^{1/2} \le C_3l_T^{1/2}\sum_{j>l_T}|\pi_j| = o\bigl(l_T^{-1/2}\bigr),
\]
using the fact that $|\Gamma_{i-j}|$ is uniformly bounded by the stationarity of $u_t$.

(e) We have
\[
E\|T^{-1}E'E\| = T^{-1}\sum_{t=1}^TE(e_t^2) = T^{-1}\sum_{t=1}^T\sum_{i>l_T}\sum_{a>l_T}\pi_iE(\Delta y_{t-i}\Delta y_{t-a})\pi_a \le T^{-1}\sum_{t=1}^T\sum_{i>l_T}\sum_{a>l_T}|\pi_i|\,|\Gamma_{a-i}|\,|\pi_a| \le o\bigl(l_T^{-2}\bigr) = o(1),
\]
where we again use the fact that $|\Gamma_j|$ is bounded uniformly in $j$.

(f) We have $T^{-1}\sum_{t=1}^Tv_te_t = T^{-1}\sum_{i>l_T}\pi_i\sum_{t=1}^T\Delta y_{t-i}v_t$, so that
\[
\Bigl|T^{-1}\sum_{t=1}^Tv_te_t\Bigr| \le T^{-1/2}\sum_{i>l_T}|\pi_i|\,\Bigl|T^{-1/2}\sum_{t=1}^T\Delta y_{t-i}v_t\Bigr| = o_p\bigl(l_T^{-1}T^{-1/2}\bigr) = o_p(1),
\]
where we used the fact that $T^{-1/2}\sum_{t=1}^T\Delta y_{t-i}v_t = O_p(1)$.

(g) Since $V^* = V + E$, $\|\eta'V^*\| \le \|\eta'V\| + \|\eta'E\| = O_p(T^{1/2}l_T^{1/2}) + o_p(Tl_T^{-1/2}) = o_p(Tl_T^{-1/2})$.

(h) Let
\[
q = \Bigl\|\Bigl[T^{-1}\eta'\eta - T^{-1}\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\Bigr]^{-1} - (\Gamma_l^*)^{-1}\Bigr\|_1
\]
and $Q = \bigl\|T^{-1}\eta'\eta - T^{-1}\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^* - \Gamma_l^*\bigr\|_1$. Then $q \le \{q + \|(\Gamma_l^*)^{-1}\|_1\}\,Q\,\|(\Gamma_l^*)^{-1}\|_1$ or $q \le \|(\Gamma_l^*)^{-1}\|_1^2\,Q\big/\bigl[1 - Q\|(\Gamma_l^*)^{-1}\|_1\bigr]$. Next,
\[
Q \le \bigl\|T^{-1}\eta'\eta - \Gamma_l^*\bigr\|_1 + \Bigl\|T^{-1}\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\Bigr\|
\]
\[
\le \bigl\|T^{-1}\eta'\eta - \Gamma_l^*\bigr\|_1 + T^{-1}\sum_{i=1}^{k/2}\bigl\|\eta_{2i}^{*\prime}Z_{2i}D_T\bigr\|\,\bigl\|(D_TZ_{2i}'Z_{2i}D_T)^{-1}\bigr\|\,\bigl\|D_TZ_{2i}'\eta_{2i}^*\bigr\|
\]
\[
= O_p\bigl(l_T/T^{1/2}\bigr) + T^{-1}O_p\bigl(l_T^{1/2}\bigr)O_p(1)O_p\bigl(l_T^{1/2}\bigr) = O_p\bigl(l_T/T^{1/2}\bigr).
\]
Since $\|(\Gamma_l^*)^{-1}\|_1 = O_p(1)$, we get $q = O_p(l_T/T^{1/2})$, and thus
\[
\Bigl|\,\Bigl\|\Bigl[T^{-1}\eta'\eta - T^{-1}\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\Bigr]^{-1}\Bigr\|_1 - \bigl\|(\Gamma_l^*)^{-1}\bigr\|_1\Bigr| = O_p\bigl(l_T/T^{1/2}\bigr) = o_p(1),
\]
so that $\bigl\|\bigl[T^{-1}\eta'\eta - T^{-1}\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\bigr]^{-1}\bigr\|_1 = O_p(1)$ and the result follows.

Proof of Theorem 2 (Model 1a and k even). For $i = 1, \ldots, k+1$, we denote the vector of residuals in the $i$th regime under the null and alternative hypotheses by $\tilde V_i^*$ and $\hat V_i^*$, respectively. Then we have
\[
\tilde V_i^* = Y_i - \eta_i^*\tilde\Pi, \quad \text{for } i = 1, \ldots, k+1,
\]
\[
\hat V_{2i}^* = Y_{2i} - \eta_{2i}^*\hat\Pi - Z_{2i}\hat\gamma_{2i}, \quad \text{for } i = 1, \ldots, k/2,
\]
\[
\hat V_{2i+1}^* = Y_{2i+1} - \eta_{2i+1}^*\hat\Pi, \quad \text{for } i = 0, \ldots, k/2, \tag{A.5}
\]
where $\hat\Pi$ and $\hat\gamma_{2i}$ satisfy the first-order conditions
\[
\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}\hat V_{2i}^* + \sum_{i=0}^{k/2}\eta_{2i+1}^{*\prime}\hat V_{2i+1}^* = 0, \tag{A.6}
\]
\[
Z_{2i}'\hat V_{2i}^* = 0, \quad \text{for } i = 1, \ldots, k/2. \tag{A.7}
\]
Under $H_0$, $\tilde\Pi - \Pi = (\eta'\eta)^{-1}\eta'V^*$. Also, from (A.6), we have $\hat\Pi - \Pi = (\eta'\eta)^{-1}\bigl(\eta'V^* - \sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}\hat\gamma_{2i}\bigr)$. Also, from (A.7), for $i = 1, \ldots, k/2$,
\[
D_T^{-1}\hat\gamma_{2i} = (D_TZ_{2i}'Z_{2i}D_T)^{-1}\bigl[D_TZ_{2i}'\eta_{2i}^*(\Pi-\hat\Pi) + D_TZ_{2i}'E_{2i} + D_TZ_{2i}'V_{2i}\bigr]. \tag{A.8}
\]
Solving for $(\hat\Pi - \Pi)$ we get
\[
\hat\Pi - \Pi = \Bigl[\eta'\eta - \sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\Bigr]^{-1}\Bigl[\eta'V^* - \sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'V_{2i}^*\Bigr]. \tag{A.9}
\]

Using Lemma A.2(b,g,h), we get $\|\hat\Pi - \Pi\| = o_p(l_T^{-1/2})$. Then,
\[
\bigl\|(D_TZ_{2i}'Z_{2i}D_T)^{-1}D_TZ_{2i}'\eta_{2i}^*(\hat\Pi-\Pi)\bigr\| \le \bigl\|(D_TZ_{2i}'Z_{2i}D_T)^{-1}\bigr\|\,\bigl\|D_TZ_{2i}'\eta_{2i}^*\bigr\|\,\bigl\|\hat\Pi-\Pi\bigr\| = O_p(1)\,O_p\bigl(l_T^{1/2}\bigr)\,o_p\bigl(l_T^{-1/2}\bigr) = o_p(1).
\]
Also, using Lemma A.2(b), we have $\|D_TZ_{2i}'E_{2i}\| = o_p(l_T^{-1})$. Using this in (A.8),
\[
D_T^{-1}\hat\gamma_{2i} = (D_TZ_{2i}'Z_{2i}D_T)^{-1}D_TZ_{2i}'V_{2i} + o_p(1). \tag{A.10}
\]
Further, $\hat\Pi - \tilde\Pi = -(\eta'\eta)^{-1}\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}\hat\gamma_{2i}$ so that we get
\[
\bigl\|\hat\Pi-\tilde\Pi\bigr\| \le \bigl\|(\eta'\eta)^{-1}\bigr\|_1\Bigl\|\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}D_T\bigl(D_T^{-1}\hat\gamma_{2i}\bigr)\Bigr\| \le \bigl\|(\eta'\eta)^{-1}\bigr\|_1\sum_{i=1}^{k/2}\bigl\|\eta_{2i}^{*\prime}Z_{2i}D_T\bigr\|\,\bigl\|D_T^{-1}\hat\gamma_{2i}\bigr\| = O_p\bigl(l_T^{1/2}T^{-1}\bigr). \tag{A.11}
\]
We can write, using (A.5), for $i = 1, \ldots, k/2$, $\tilde V_{2i}^* = \hat V_{2i}^* + Z_{2i}\hat\gamma_{2i} + \eta_{2i}^*(\hat\Pi-\tilde\Pi)$, and for $i = 0, \ldots, k/2$, $\tilde V_{2i+1}^* = \hat V_{2i+1}^* + \eta_{2i+1}^*(\hat\Pi-\tilde\Pi)$. Thus the numerator of the F statistic can be written as
\[
SSR_0 - SSR_{1a,k} = \sum_{i=1}^{k/2}\bigl(\tilde V_{2i}^{*\prime}\tilde V_{2i}^* - \hat V_{2i}^{*\prime}\hat V_{2i}^*\bigr) + \sum_{i=0}^{k/2}\bigl(\tilde V_{2i+1}^{*\prime}\tilde V_{2i+1}^* - \hat V_{2i+1}^{*\prime}\hat V_{2i+1}^*\bigr)
\]
\[
= \sum_{i=1}^{k/2}\bigl(D_T^{-1}\hat\gamma_{2i}\bigr)'D_TZ_{2i}'Z_{2i}D_T\bigl(D_T^{-1}\hat\gamma_{2i}\bigr) + \bigl(\hat\Pi-\tilde\Pi\bigr)'\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}D_T\bigl(D_T^{-1}\hat\gamma_{2i}\bigr). \tag{A.12}
\]
Next, from (A.11),
\[
\Bigl|\bigl(\hat\Pi-\tilde\Pi\bigr)'\sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}D_T\bigl(D_T^{-1}\hat\gamma_{2i}\bigr)\Bigr| \le \bigl\|\hat\Pi-\tilde\Pi\bigr\|\sum_{i=1}^{k/2}\bigl\|\eta_{2i}^{*\prime}Z_{2i}D_T\bigr\|\,\bigl\|D_T^{-1}\hat\gamma_{2i}\bigr\| = O_p\bigl(l_T^{1/2}T^{-1}\bigr)\,O_p\bigl(l_T^{1/2}\bigr)\,O_p(1) = O_p\bigl(l_TT^{-1}\bigr) = o_p(1). \tag{A.13}
\]
Then, using (A.10) and (A.13) in (A.12), we have
\[
SSR_0 - SSR_{1a,k} = \sum_{i=1}^{k/2}V_{2i}'Z_{2i}D_T(D_TZ_{2i}'Z_{2i}D_T)^{-1}D_TZ_{2i}'V_{2i} + o_p(1).
\]
Under $H_0$, we have the Beveridge-Nelson decomposition, $y_t = d(1)w_t + \bar u_0 - \bar u_t$, where $w_t = \sum_{j=1}^tv_j$, $\bar u_t = \sum_{s=0}^\infty\bar d_sv_{t-s}$, and $\bar d_s = \sum_{i=s+1}^\infty d_i$. Note that $(\bar u_t)$ is stochastically of smaller order of magnitude than $(w_t)$. Then for $r \in (0,1]$, we have $T^{-2}\sum_{t=1}^{[Tr]}y_t^2 = d(1)^2T^{-2}\sum_{t=1}^{[Tr]}w_t^2 + o_p(1)$ and $T^{-1}\sum_{t=1}^{[Tr]}y_{t-1}v_t = d(1)T^{-1}\sum_{t=1}^{[Tr]}w_{t-1}v_t + o_p(1)$. Hence,
\[
SSR_0 - SSR_{1a,k} \Rightarrow \sigma^2\sum_{i=1}^{k/2}\left[\frac{\bigl(\int_{\lambda_{2i-1}}^{\lambda_{2i}}W^{(2i)}(r)\,dW(r)\bigr)^2}{\int_{\lambda_{2i-1}}^{\lambda_{2i}}W^{(2i)}(r)^2\,dr} + \frac{1}{\lambda_{2i}-\lambda_{2i-1}}\bigl\{W(\lambda_{2i})-W(\lambda_{2i-1})\bigr\}^2\right].
\]
Using the fact that $T^{-1}SSR_{1a,k} \to_p \sigma^2$, the result follows.

Proof of Theorem 3. For part (a), we prove the result for Model 1a and $k$ even. To show that the test is consistent, we will show that for $\lambda^0 = (\lambda_1^0, \ldots, \lambda_k^0)$, the true break fractions, the statistic $F_{1a}(\lambda^0,k)$ diverges. To see this, first note that we can express the vector of residuals computed under the null and alternative as, respectively,
\[
\tilde V^* = M_\eta Y, \qquad \hat V^* = M_\eta\hat V^* = M_\eta Y - M_\eta\bar Z^0\hat\gamma = \tilde V^* - M_\eta\bar Z^0\hat\gamma, \tag{A.14}
\]
where $M_\eta = I_T - \eta(\eta'\eta)^{-1}\eta'$, $\hat\gamma$ and $\gamma^0$ are the estimated and true values under the alternative and $\bar Z^0$ is the diagonal partition of $Z = (z_1, \ldots, z_T)'$ at the true break dates $(\lambda_1^0, \ldots, \lambda_k^0)$ (see Bai and Perron, 1998). From (A.14), we can write
\[
\tilde V^{*\prime}\tilde V^* - \hat V^{*\prime}\hat V^* = \hat\gamma'\bar Z^{0\prime}M_\eta\bar Z^0\hat\gamma + 2\hat V^{*\prime}M_\eta\bar Z^0\hat\gamma = \hat\gamma'\bar Z^{0\prime}M_\eta\bar Z^0\hat\gamma, \tag{A.15}
\]
where the second term is zero by the first-order conditions (A.6) and (A.7). Define the $[2(k+1)\times 2(k+1)]$ matrix $D_{1T} = \mathrm{diag}(D_T, T^{-1/2}I_2, D_T, \ldots, T^{-1/2}I_2, D_T)$. Then we have $D_{1T}\bar Z^{0\prime}M_\eta\bar Z^0D_{1T} = O_p(1)$. Next, note that $D_{1T}^{-1}\hat\gamma = T^{1/2}(0, \hat\gamma_2', 0, \hat\gamma_4', \ldots, 0)'$ since $\hat\gamma_{2i+1} = 0$ for $i = 0, 1, \ldots, k/2$. Hence, we only need to focus on the behavior of $T^{1/2}\hat\gamma_{2i}$ for $i = 1, 2, \ldots, k/2$. Now, from (A.6) and (A.7), we get
\[
\hat\gamma_{2i} = \gamma_{2i}^0 + (Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\bigl(\Pi^0-\hat\Pi\bigr) + (Z_{2i}'Z_{2i})^{-1}Z_{2i}'V_{2i}^*,
\]
\[
\hat\Pi = \Bigl[\eta'\eta - \sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\Bigr]^{-1}\Bigl[\eta'Y - \sum_{i=1}^{k/2}\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'Y_{2i}\Bigr]. \tag{A.16}
\]
It is easy to show that $(Z_{2i}'Z_{2i})^{-1} = O_p(T^{-1})$ and $Z_{2i}'V_{2i}^* = O_p(T^{1/2})$ given that regime $2i$ (for $i = 1, \ldots, k/2$) is an $I(0)$ regime. Next, using results in Chang (1989) and Chang and Dickey (1994) assuming the condition $l_T^6/T = o_p(1)$ holds, we have $\|\eta'Y\| = O_p(Tl_T^{1/2})$, $\|\eta_{2i}^{*\prime}Z_{2i}\| = O_p(T)$, $\|Z_{2i}'Y_{2i}\| = O_p(T)$, and $\bigl\|\bigl[\eta'\eta - \sum_{i=1}^{k/2}\{\eta_{2i}^{*\prime}Z_{2i}(Z_{2i}'Z_{2i})^{-1}Z_{2i}'\eta_{2i}^*\}\bigr]^{-1}\bigr\|_1 = O_p(l_T^2T^{-1})$. Using these results in (A.16), we get $T^{1/2}\hat\gamma_{2i} = O_p(T^{1/2}l_T^{5/2})$ and hence $D_{1T}^{-1}\hat\gamma = O_p(T^{1/2}l_T^{5/2})$. Substituting in (A.15), we have
\[
SSR_0 - SSR_{1a,k} = \bigl(D_{1T}^{-1}\hat\gamma\bigr)'D_{1T}\bar Z^{0\prime}M_\eta\bar Z^0D_{1T}\bigl(D_{1T}^{-1}\hat\gamma\bigr) = O_p\bigl(Tl_T^5\bigr). \tag{A.17}
\]

Based on similar arguments, the denominator of $F_{1a}(\lambda,k)$ is
\[
T^{-1}SSR_{1a,k} = T^{-1}\hat V^{*\prime}\hat V^* = T^{-1}Y'M_\eta Y - 2T^{-1}Y'M_\eta\bar Z^0\hat\gamma + T^{-1}\hat\gamma'\bar Z^{0\prime}M_\eta\bar Z^0\hat\gamma = O_p(1) - O_p(1) + O_p\bigl(l_T^5\bigr) = O_p\bigl(l_T^5\bigr). \tag{A.18}
\]
From (A.17) and (A.18), we therefore have $F_{1a}(\lambda^0,k) = O_p(T)$. This proves (a). Part (b) follows directly from (a) and the definition of the tests.

For part (c), we focus on the simple AR(1) model with a single break for simplicity of exposition. We also abstract from short-run dynamics in the regression model so that the regressors included are only a constant and the lagged dependent variable. The proof for the more general model essentially follows the same steps although it is much more tedious and thus omitted. We assume that the true DGP is given by Model 1a and study the limit of $F_{1b}(\lambda,1)$ for $\lambda \le \lambda^0$ and $\lambda > \lambda^0$. We show that $F_{1b}(\lambda,1) = O_p(1)$, uniformly over $\lambda$. First, consider the case where $\lambda \le \lambda^0$. We have
\[
SSR_0 = \sum_{t=1}^T(y_t-y_{t-1})^2 = \sum_{t=1}^{[T\lambda]}(y_t-y_{t-1})^2 + \sum_{t=[T\lambda]+1}^T(y_t-y_{t-1})^2,
\]
\[
SSR_{1b,1} = \sum_{t=1}^{[T\lambda]}\bigl(y_t-\bar y_1-\hat\alpha_1(y_{t-1}-\bar y_{1,-1})\bigr)^2 + \sum_{t=[T\lambda]+1}^T(y_t-y_{t-1})^2,
\]
so that
\[
SSR_0 - SSR_{1b,1} = \sum_{t=1}^{[T\lambda]}(y_t-y_{t-1})^2 - \sum_{t=1}^{[T\lambda]}\bigl(y_t-\bar y_1-\hat\alpha_1(y_{t-1}-\bar y_{1,-1})\bigr)^2
\]
\[
= [T\lambda]\bar u_1^2 - (1-\hat\alpha_1)^2\sum_{t=1}^{[T\lambda]}(y_{t-1}-\bar y_{1,-1})^2 - 2(1-\hat\alpha_1)\sum_{t=1}^{[T\lambda]}(y_{t-1}-\bar y_{1,-1})u_t.
\]
Using the facts that $\bar u_1 = O_p(T^{-1/2})$, $1-\hat\alpha_1 = O_p(T^{-1})$, $\sum_{t=1}^{[T\lambda]}(y_{t-1}-\bar y_{1,-1})^2 = O_p(T^2)$, and $\sum_{t=1}^{[T\lambda]}(y_{t-1}-\bar y_{1,-1})u_t = O_p(T)$, we get $SSR_0 - SSR_{1b,1} = O_p(1)$. Next, we have $SSR_{1b,1} = O_p(T)$ so that $F_{1b}(\lambda,1) = O_p(1)$ for $\lambda \le \lambda^0$. For $\lambda > \lambda^0$,
\[
SSR_0 - SSR_{1b,1} = \sum_{t=1}^{[T\lambda]}(y_t-y_{t-1})^2 - \sum_{t=1}^{[T\lambda]}\bigl(y_t-\bar y_1-\hat\alpha_1(y_{t-1}-\bar y_{1,-1})\bigr)^2
\]
\[
= \sum_{t=1}^{[T\lambda^0]}(y_t-y_{t-1})^2 + \sum_{t=[T\lambda^0]+1}^{[T\lambda]}(y_t-y_{t-1})^2 - \sum_{t=1}^{[T\lambda^0]}\bigl(y_t-\hat\alpha_1y_{t-1}-(\bar y_1-\hat\alpha_1\bar y_{1,-1})\bigr)^2 - \sum_{t=[T\lambda^0]+1}^{[T\lambda]}\bigl(y_t-\hat\alpha_1y_{t-1}-(\bar y_1-\hat\alpha_1\bar y_{1,-1})\bigr)^2
\]
\[
= \sum_{t=1}^{[T\lambda^0]}u_t^2 + \sum_{t=[T\lambda^0]+1}^{[T\lambda]}\bigl(c_2+(\alpha_2-1)y_{t-1}+u_t\bigr)^2 - \sum_{t=1}^{[T\lambda^0]}\bigl((1-\hat\alpha_1)y_{t-1}+u_t-(\bar y_1-\hat\alpha_1\bar y_{1,-1})\bigr)^2 - \sum_{t=[T\lambda^0]+1}^{[T\lambda]}\bigl(c_2+(\alpha_2-\hat\alpha_1)y_{t-1}+u_t-(\bar y_1-\hat\alpha_1\bar y_{1,-1})\bigr)^2
\]
\[
= \bigl\{(\alpha_2-1)^2-(\alpha_2-\hat\alpha_1)^2\bigr\}\sum_{t=[T\lambda^0]+1}^{[T\lambda]}y_{t-1}^2 + 2c_2(\hat\alpha_1-1)\sum_{t=[T\lambda^0]+1}^{[T\lambda]}y_{t-1} + 2(\hat\alpha_1-1)\sum_{t=[T\lambda^0]+1}^{[T\lambda]}y_{t-1}u_t
\]
\[
\quad + 2(\bar y_1-\hat\alpha_1\bar y_{1,-1})\Bigl(\sum_{t=1}^{[T\lambda^0]}u_t + \sum_{t=[T\lambda^0]+1}^{[T\lambda]}u_t\Bigr) - (1-\hat\alpha_1)^2\sum_{t=1}^{[T\lambda^0]}y_{t-1}^2 - 2(1-\hat\alpha_1)\sum_{t=1}^{[T\lambda^0]}y_{t-1}u_t
\]
\[
\quad + 2(1-\hat\alpha_1)(\bar y_1-\hat\alpha_1\bar y_{1,-1})\sum_{t=1}^{[T\lambda^0]}y_{t-1} - [T\lambda](\bar y_1-\hat\alpha_1\bar y_{1,-1})^2
\]
\[
\quad + 2(\bar y_1-\hat\alpha_1\bar y_{1,-1})\,T(\lambda-\lambda^0)\,(\alpha_2-\hat\alpha_1)\Bigl(-\frac{c_2}{\hat\alpha_1-\alpha_2} + \bigl(T(\lambda-\lambda^0)\bigr)^{-1}\sum_{t=[T\lambda^0]+1}^{[T\lambda]}y_{t-1}\Bigr). \tag{A.19}
\]
Now we have $\sum_{t=1}^{[T\lambda^0]}y_{t-1}^2 = O_p(T^2)$, $\sum_{t=[T\lambda^0]+1}^{[T\lambda]}y_{t-1}^2 = [T(\lambda-\lambda^0)]O_p(1)$, $\sum_{t=[T\lambda^0]+1}^{[T\lambda]}y_{t-1}u_t = [T(\lambda-\lambda^0)]^{1/2}O_p(1)$, $\sum_{t=[T\lambda^0]+1}^{[T\lambda]}u_t = [T(\lambda-\lambda^0)]^{1/2}O_p(1)$, $\hat\alpha_1 = 1 + O_p(T^{-1})$, and $\bar y_1 - \hat\alpha_1\bar y_{1,-1} = (1-\hat\alpha_1)\bar y_{1,-1} + O_p(T^{-1}) = O_p(T^{-1/2})$. Substituting these orders in (A.19), each of the terms in brackets is $O_p(1)$. Note that the last term in brackets can be expressed as
\[
2(\bar y_1-\hat\alpha_1\bar y_{1,-1})(\alpha_2-\hat\alpha_1)\sum_{t=[T\lambda^0]+1}^{[T\lambda]}\Bigl(y_{t-1}-\frac{c_2}{\hat\alpha_1-\alpha_2}\Bigr) = O_p\bigl(T^{-1/2}\bigr)O_p(1)O_p\bigl(T^{1/2}\bigr) = O_p(1).
\]
Again, similar arguments can be used to show that $SSR_{1b,1} = O_p(T)$ and thus $F_{1b}(\lambda,1) = O_p(1)$.
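The expansion leading to (A.19) is long enough that a numerical cross-check may be useful. The sketch below is illustrative only and makes several assumptions that are not in the text: Gaussian errors, $y_0 = 0$, specific values for $T$, the true break date, the candidate break date, $c_2$ and $\alpha_2$, and hypothetical function names. It computes $SSR_0 - SSR_{1b,1}$ directly from the first-regime OLS fit and again from the term-by-term expansion, using $[T\lambda] - [T\lambda^0]$ in place of $T(\lambda-\lambda^0)$.

```python
import random


def simulate_model_1a(T, n0, c2, alpha2, seed):
    """y_0 = 0; driftless unit root up to date n0, stationary AR(1) after it."""
    rng = random.Random(seed)
    u = [0.0] + [rng.gauss(0.0, 1.0) for _ in range(T)]   # u[0] is unused
    y = [0.0] * (T + 1)
    for t in range(1, T + 1):
        if t <= n0:
            y[t] = y[t - 1] + u[t]
        else:
            y[t] = c2 + alpha2 * y[t - 1] + u[t]
    return y, u


def gap_direct(y, n):
    """SSR_0 - SSR_{1b,1} when the candidate break date is n.

    Observations after n have the same residuals under both hypotheses,
    so only t = 1, ..., n enter the difference.
    """
    ts = range(1, n + 1)
    ybar = sum(y[t] for t in ts) / n
    ylag_bar = sum(y[t - 1] for t in ts) / n
    sxx = sum((y[t - 1] - ylag_bar) ** 2 for t in ts)
    a1 = sum((y[t] - ybar) * (y[t - 1] - ylag_bar) for t in ts) / sxx
    m = ybar - a1 * ylag_bar                 # estimated intercept
    ssr0 = sum((y[t] - y[t - 1]) ** 2 for t in ts)
    ssr1 = sum((y[t] - a1 * y[t - 1] - m) ** 2 for t in ts)
    return ssr0 - ssr1, a1, m


def gap_expanded(y, u, n0, n, c2, alpha2, a1, m):
    """Right-hand side of the expansion, term by term."""
    pre, post = range(1, n0 + 1), range(n0 + 1, n + 1)
    n2 = n - n0
    post_y = sum(y[t - 1] for t in post)
    post_yy = sum(y[t - 1] ** 2 for t in post)
    post_yu = sum(y[t - 1] * u[t] for t in post)
    pre_y = sum(y[t - 1] for t in pre)
    pre_yy = sum(y[t - 1] ** 2 for t in pre)
    pre_yu = sum(y[t - 1] * u[t] for t in pre)
    sum_u = sum(u[t] for t in pre) + sum(u[t] for t in post)

    total = ((alpha2 - 1) ** 2 - (alpha2 - a1) ** 2) * post_yy
    total += 2 * c2 * (a1 - 1) * post_y
    total += 2 * (a1 - 1) * post_yu
    total += 2 * m * sum_u
    total -= (1 - a1) ** 2 * pre_yy
    total -= 2 * (1 - a1) * pre_yu
    total += 2 * (1 - a1) * m * pre_y
    total -= n * m ** 2
    total += 2 * m * n2 * (alpha2 - a1) * (-c2 / (a1 - alpha2) + post_y / n2)
    return total


# Illustrative design: true break at 150, candidate break at 240 (lambda > lambda^0).
T, n0, n, c2, alpha2 = 300, 150, 240, 1.0, 0.5
y, u = simulate_model_1a(T, n0, c2, alpha2, seed=7)
direct, a1_hat, m_hat = gap_direct(y, n)
expanded = gap_expanded(y, u, n0, n, c2, alpha2, a1_hat, m_hat)
```

Agreement of `direct` and `expanded` only confirms the algebra of the expansion for one simulated path; the stochastic orders used afterwards are asymptotic statements and are not verified by this check.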