
Published by: Renoir Vieira on Apr 01, 2011
Copyright:Attribution Non-commercial


Sections

  • 1 Introduction
  • 2 Definitions and Representations
  • 3 Testing TI Cointegration Against TV Cointegration
  • 4 The LR Test under the Alternative of TV Cointegration
  • 5 The Drift Case
  • 6 An Empirical Application
  • 7 Conclusion

Time Varying Cointegration∗

Herman J. Bierens† and Luis F. Martins‡ February 9, 2010

Abstract

In this paper we propose a time varying vector error correction model in which the cointegrating relationship varies smoothly over time. The Johansen setup is a special case of our model. A likelihood ratio test for time-invariant cointegration is defined and its asymptotic chi-square distribution is derived. We apply our test to the purchasing power parity hypothesis of international prices and nominal exchange rates, and find evidence of time-varying cointegration.

Keywords: Time Varying Error Correction Model; Chebyshev Polynomials; Likelihood Ratio; Power; Trace Statistic

J.E.L. Classification: C32.
∗ The authors are grateful to Peter Phillips, Pentti Saikkonen and two anonymous referees for helpful comments.

† Department of Economics and CAPCP, Pennsylvania State University, 608 Kern Graduate Building, University Park, PA 16802, USA. E-mail: hbierens@psu.edu. Support for research within the Center for the Study of Auctions, Procurements, and Competition Policy (CAPCP) at Penn State has been provided by a gift from the Human Capital Foundation (http://www.hcfoundation.ru/en/).

‡ UNIDE and Department of Quantitative Methods, ISCTE - Business School, Av. das Forças Armadas, 1649-026 Lisbon, Portugal. E-mail: luis.martins@iscte.pt. Financial support under grants SFRH/BD/814/2000 and PTDC/ECO/68367/2006 from the Fundação para a Ciência e Tecnologia is gratefully acknowledged. This paper is a substantial further elaboration of Martins' Ph.D. thesis at the Pennsylvania State University. A previous version of this paper was presented by Martins at the Econometric Society World Congress 2005 in London.

1 Introduction

Since the seminal papers by Granger (1987), Engle and Granger (1987) and Johansen (1988), the growth of the literature on cointegration has been impressive. In the standard approach it is assumed that the cointegrating vectors do not change over time. However, this assumption is quite restrictive.

The literature on structural change and cointegration has focused on developing procedures to detect structural breaks and/or to estimate their dates. Papers addressing these issues in a single-equation framework include Hansen (1992), Quintos and Phillips (1993), Hao (1996), Andrews et al. (1996), Bai et al. (1998), and Kuo (1998), among others (see Maddala and Kim 1998 for a survey). Moreover, Lütkepohl et al. (2003), Inoue (1999) and Johansen et al. (2000) analyze the effects of breaks in the deterministic trend. In the context of a system of equations, which is the focus of our analysis, the main contributions are those by Seo (1998), who extends the tests of Hansen (1992). Hansen and Johansen (1999) and Quintos (1997) propose fluctuation tests (based on recursive sequences of eigenvalues and cointegrating vectors) for parameter constancy in cointegrated VARs, but they do not parameterize the shifts.

Regarding time-varying error correction models, Hansen (2003) generalizes reduced-rank methods to cointegration under sudden regime shifts with a known number of break points. Andrade et al. (2005) study a model similar to that of Hansen (2003) and develop tests on the cointegration rank and on the cointegration space under known and unknown break points. Also, the Markov-switching approach of Hall et al. (1997) and the smooth transition model of Saikkonen and Choi (2004) provide an interesting way of modeling shifts in the cointegrating vectors. The former authors consider sudden shifts between two states, whereas the latter authors permit a gradual shift between regimes. Lütkepohl et al.
(1999) and Terasvirta and Eliasson (2001) propose money demand functions modeled by single-equation error correction models in which a smooth transition stationary term is added. The transition function is driven by one of the processes of the long-run relationship.

Park and Hahn (1999) propose a cointegrating regression in the spirit of Engle and Granger (1987) with parameters that vary with time. They model the elements of a (single) cointegrating vector as smooth functions of time, via Fourier series expansions. They derive the asymptotic properties of the semi-nonparametric sieve estimators involved and propose several residual-based specification tests. The latter approach is part of the growing literature on modeling nonlinear long run relationships. See for example Blake and Fomby (1997), de Jong (2001), Granger and Yoon (2002), Harris et al. (2002) and Juhl and Xiao (2005), among others.

In this paper we propose a likelihood ratio test for time varying cointegration, with time invariant cointegration as the null hypothesis, by allowing the cointegrating vectors in a vector error correction model (VECM) to be smooth functions of time, similar to Park and Hahn (1999). In particular, we propose to model these time varying cointegrating vectors via expansions in terms of Chebyshev time polynomials. The resulting extended VECM can be estimated similarly to Johansen's (1988, 1991, 1995) ML approach. The null hypothesis of standard cointegration then corresponds to the hypothesis that the parameters in the VECM that are related to Chebyshev time polynomials are jointly zero. The latter hypothesis can be tested via a likelihood ratio test.

The remainder of the paper is organized as follows. In Section 2 we introduce the time varying (TV) VECM using Chebyshev time polynomials. In Section 3 we propose a likelihood ratio test to distinguish Johansen's standard cointegration from our time-varying alternative, for the case without drift, and show that the asymptotic null distribution is chi-square. In Section 4 the asymptotic power of the test is derived analytically and via Monte Carlo simulations. In Section 5 we show that our results carry over to the drift case. In Section 6 we illustrate the merits of our approach by testing for TV cointegration of international prices and nominal exchange rates. In Section 7 we make some concluding remarks. The proofs of the lemmas and theorems can be found in either the Appendix at the end of this paper or in Bierens and Martins (2009).

As to notation, "$\Rightarrow$" denotes weak convergence, "$\overset{d}{\to}$" denotes convergence in distribution, and $1(\cdot)$ is the indicator function.

2 Definitions and Representations

For the $k \times 1$ vector time series $Y_t$, we assume that for some $t$ there are fixed $r < k$ linearly independent columns of the time-varying $k \times r$ matrix $\beta_t = (\beta_{1t}, \beta_{2t}, \ldots, \beta_{rt})$ of cointegrating vectors. Thus, these columns form the basis of the time-varying space of cointegrating vectors, $S_t^c = \mathrm{span}(\beta_{1t}, \beta_{2t}, \ldots, \beta_{rt}) \subset \mathbb{R}^k$, $t = 1, 2, \ldots$ The remaining $k - r$ orthogonal vectors, expressed by a $k \times (k - r)$ matrix $\beta_{t\perp}$, are such that $\beta_{t\perp}' Y_{t-1}$ does not represent a cointegrating relationship. The matrices $\beta_t$ will be modeled using Chebyshev time polynomials.

2.1 Time Varying VECM Representation

Consider the time-varying VECM($p$) with Gaussian errors, without intercepts and time trends,

$$\Delta Y_t = \Pi_t' Y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta Y_{t-j} + \varepsilon_t, \quad t = 1, \ldots, T, \tag{1}$$

where $Y_t \in \mathbb{R}^k$, $\varepsilon_t \sim$ i.i.d. $N_k[0, \Omega]$ and $T$ is the number of observations. Our objective is to test the null hypothesis of time-invariant (TI) cointegration, $\Pi_t' = \Pi' = \alpha\beta'$, where $\alpha$ and $\beta$ are fixed $k \times r$ matrices with rank $r$, against TV cointegration of the type $\Pi_t' = \alpha\beta_t'$, where $\alpha$ is the same as before but now the $\beta_t$'s are time-varying $k \times r$ matrices with constant rank $r$. In both cases $\Omega$ and the $\Gamma_j$'s are fixed $k \times k$ matrices, and $1 \le r < k$.

Admittedly, this form of TV cointegration is quite restrictive, as only the $\beta_t$'s are assumed to be time dependent. A more general form of TV cointegration is the case $Y_t = C_t Z_t$, where $C_t$ is a sequence of nonsingular $k \times k$ matrices and $Z_t \in \mathbb{R}^k$ is a time-invariant cointegrated I(1) process with a VECM($p$) representation. Then $Y_t$ has a VECM($p$) representation, but where all the parameters are functions of $t$.

2.2 Chebyshev Time Polynomials

Chebyshev time polynomials $P_{i,T}(t)$ are defined by

$$P_{0,T}(t) = 1, \quad P_{i,T}(t) = \sqrt{2}\cos\left(i\pi(t - 0.5)/T\right), \quad t = 1, 2, \ldots, T, \quad i = 1, 2, 3, \ldots$$

See for example Hamming (1973). Bierens (1997) uses them in his unit root test against nonlinear trend stationarity. Chebyshev time polynomials are orthonormal, in the sense that for all integers $i, j$, $\frac{1}{T}\sum_{t=1}^{T} P_{i,T}(t) P_{j,T}(t) = 1(i = j)$. Due to this orthonormality property, any function $g(t)$ of discrete time, $t = 1, \ldots, T$, can be represented by

$$g(t) = \sum_{i=0}^{T-1} \xi_{i,T} P_{i,T}(t), \quad \text{where} \quad \xi_{i,T} = \frac{1}{T}\sum_{t=1}^{T} g(t) P_{i,T}(t).$$

In this expression, $g(t)$ is decomposed linearly in components $\xi_{i,T} P_{i,T}(t)$ of decreasing smoothness. Therefore, if $g(t)$ is smooth (to be made more precise in Lemma 1 below), it can be approximated quite well by

$$g_{m,T}(t) = \sum_{i=0}^{m} \xi_{i,T} P_{i,T}(t)$$

for some fixed natural number $m < T - 1$.

Lemma 1. Let $g(t) = \varphi(t/T)$, where $\varphi(x)$ is a square integrable real function on $[0, 1]$. Then

$$\lim_{m \to \infty} \lim_{T \to \infty} \frac{1}{T}\sum_{t=1}^{T} \left(g(t) - g_{m,T}(t)\right)^2 = 0.$$

Moreover, if $\varphi(x)$ is $q \ge 2$ times differentiable, where $q$ is even, with $\varphi^{(q)}(x) = d^q\varphi(x)/(dx)^q$ satisfying $\int_0^1 \left(\varphi^{(q)}(x)\right)^2 dx < \infty$, then for $m \ge 1$,

$$\lim_{T \to \infty} \frac{1}{T}\sum_{t=1}^{T} \left(g(t) - g_{m,T}(t)\right)^2 \le \frac{\int_0^1 \left(\varphi^{(q)}(x)\right)^2 dx}{\pi^{2q}(m+1)^{2q}}.$$

Proof. See Bierens and Martins (2009).

Consequently, we may without loss of generality write $\beta_t$ for $t = 1, \ldots, T$ as $\beta_t = \sum_{i=0}^{T-1} \xi_{i,T} P_{i,T}(t)$, where $\xi_{i,T} = \frac{1}{T}\sum_{t=1}^{T} \beta_t P_{i,T}(t)$, $i = 0, \ldots, T - 1$, are unknown $k \times r$ matrices. Then the null hypothesis of TI cointegration corresponds to $\xi_{i,T} = O_{k \times r}$ for $i = 1, \ldots, T - 1$, and the alternative of TV cointegration corresponds to $\lim_{T \to \infty} \xi_{i,T} \ne O_{k \times r}$ for some $i \ge 1$. To make the latter operational, we will confine our analysis to TV alternatives for which $\lim_{T \to \infty} \xi_{i,T} \ne O_{k \times r}$ for some $i = 1, \ldots, m$, and $\xi_{i,T} = O_{k \times r}$ for all $i > m$, where $m$ is chosen in advance.

Because low-order Chebyshev polynomials are rather smooth functions of $t$, we allow $\beta_t$ to change gradually over time under the alternative of TV cointegration, contrary to Hansen's (2003) sudden change assumption. Effectively this means that under the alternative $\beta_t$ is specified as

$$\beta_t = \beta_m(t/T) = \sum_{i=0}^{m} \xi_{i,T} P_{i,T}(t) \tag{2}$$

for some fixed $m$.

This specification of the matrix of time varying cointegrating vectors is related to the approach of Park and Hahn (1999). They consider a TV cointegrating relationship of the form $Z_t = \alpha_t' X_t + U_t$, where $Z_t \in \mathbb{R}$, $X_t$ is a $k$-variate I(1) process and $U_t$ is a stationary process. Park and Hahn (1999) assume that the elements of $\alpha_t$ are of the form $\varphi(t/T)$, where $\varphi(x)$ has a Fourier flexible functional form. Thus, with $Y_t = (Z_t, X_t')'$ and $\beta_t = (1, -\alpha_t')'$, $\beta_t' Y_t = U_t$ is stationary.

2.3 Modeling TV Cointegration via Chebyshev Time Polynomials

Substituting $\Pi_t' = \alpha\beta_t' = \alpha\left(\sum_{i=0}^{m} \xi_i P_{i,T}(t)\right)'$ in (1) yields

$$\Delta Y_t = \alpha\left(\sum_{i=0}^{m} \xi_i P_{i,T}(t)\right)' Y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta Y_{t-j} + \varepsilon_t$$

for some $k \times r$ matrices $\xi_i$, which can be written more conveniently as

$$\Delta Y_t = \alpha\xi' Y_{t-1}^{(m)} + \Gamma X_t + \varepsilon_t, \tag{3}$$

where $\xi' = (\xi_0', \xi_1', \ldots, \xi_m')$ is an $r \times (m+1)k$ matrix of rank $r$, $Y_{t-1}^{(m)}$ is defined by

$$Y_{t-1}^{(m)} = \left(Y_{t-1}', P_{1,T}(t) Y_{t-1}', P_{2,T}(t) Y_{t-1}', \ldots, P_{m,T}(t) Y_{t-1}'\right)' \tag{4}$$

and $X_t = \left(\Delta Y_{t-1}', \ldots, \Delta Y_{t-p+1}'\right)'$.

The null hypothesis of TI cointegration corresponds to $\xi' = (\beta', O_{r,k.m})$, where $\beta$ is the $k \times r$ matrix of TI cointegrating vectors, so that then $\xi' Y_{t-1}^{(m)} = \beta' Y_{t-1}^{(0)}$, with $Y_{t-1}^{(0)} = Y_{t-1}$. This suggests to test the null hypothesis via a likelihood ratio test

$$LR^{tvc} = -2\left[\hat{l}_T(r, 0) - \hat{l}_T(r, m)\right],$$

where $\hat{l}_T(r, 0)$ is the log-likelihood of the VECM($p$) (3) in the case $m = 0$, so that $Y_{t-1}^{(0)} = Y_{t-1}$, and $\hat{l}_T(r, m)$ is the log-likelihood of the VECM($p$) (3) in the case where $Y_{t-1}^{(m)}$ is given by (4), where in both cases $r$ is the cointegration rank.

3 Testing TI Cointegration Against TV Cointegration

3.1 ML Estimation and the LR Test

Denote

$$S_{00,T} = \frac{1}{T}\sum_{t=1}^{T} \Delta Y_t \Delta Y_t' - \hat{\Sigma}_{X\Delta Y}' \hat{\Sigma}_{XX}^{-1} \hat{\Sigma}_{X\Delta Y},$$

$$S_{11,T}^{(m)} = \frac{1}{T}\sum_{t=1}^{T} Y_{t-1}^{(m)} Y_{t-1}^{(m)\prime} - \hat{\Sigma}_{XY^{(m)}}' \hat{\Sigma}_{XX}^{-1} \hat{\Sigma}_{XY^{(m)}},$$

$$S_{01,T}^{(m)} = \frac{1}{T}\sum_{t=1}^{T} \Delta Y_t Y_{t-1}^{(m)\prime} - \hat{\Sigma}_{X\Delta Y}' \hat{\Sigma}_{XX}^{-1} \hat{\Sigma}_{XY^{(m)}}, \quad S_{10,T}^{(m)} = \left(S_{01,T}^{(m)}\right)',$$

where $\hat{\Sigma}_{XX} = \frac{1}{T}\sum_{t=1}^{T} X_t X_t'$, $\hat{\Sigma}_{X\Delta Y} = \frac{1}{T}\sum_{t=1}^{T} X_t \Delta Y_t'$, and $\hat{\Sigma}_{XY^{(m)}} = \frac{1}{T}\sum_{t=1}^{T} X_t Y_{t-1}^{(m)\prime}$, and let $\hat{\lambda}_{m,1} \ge \hat{\lambda}_{m,2} \ge \ldots \ge \hat{\lambda}_{m,r} \ge \ldots \ge \hat{\lambda}_{m,(m+1)k}$ be the ordered solutions of the generalized eigenvalue problem

$$\det\left[\lambda S_{11,T}^{(m)} - S_{10,T}^{(m)} S_{00,T}^{-1} S_{01,T}^{(m)}\right] = 0. \tag{5}$$

Note that $\hat{\lambda}_{m,k+1} = \ldots = \hat{\lambda}_{m,(m+1)k} \equiv 0$, because the rank of $S_{10,T}^{(m)} S_{00,T}^{-1} S_{01,T}^{(m)}$ is $k$. Then, similar to Johansen (1988), the log-likelihood $\hat{l}_T(r, m)$, given $m$ and $r$, takes the form

$$\hat{l}_T(r, m) = -0.5\,T \sum_{j=1}^{r} \ln\left(1 - \hat{\lambda}_{m,j}\right) - 0.5\,T \ln\left(\det\left(S_{00,T}\right)\right)$$

plus a constant. Therefore, given $m$ and $r$, the LR test of the null hypothesis of standard (TI) cointegration against the alternative of TV cointegration takes the form

$$LR_T^{tvc} = -2\left[\hat{l}_T(r, 0) - \hat{l}_T(r, m)\right] = T \sum_{j=1}^{r} \ln\left(\frac{1 - \hat{\lambda}_{0,j}}{1 - \hat{\lambda}_{m,j}}\right). \tag{6}$$

3.2 Data-Generating Process under the Null Hypothesis

For $m = 0$ we have the standard cointegration case:

Assumption 1. $\Delta Y_t$ is a strictly stationary zero-mean $k$-variate Gaussian process with Wold decomposition $\Delta Y_t = C(L) U_t = \sum_{j=0}^{\infty} C_j U_{t-j}$, where $U_t \sim$ i.i.d. $N_k[0, I_k]$. The elements of the $k \times k$ matrices $C_j$ decrease exponentially to zero as $j \to \infty$.

Assumption 2. The matrix $C(1)$ is singular, with rank $k - r$, where $1 \le r < k$: There exists a $k \times r$ matrix $\beta$ with rank $r$ such that $\beta' C(1) = O_{r,k}$. Moreover, the $r \times k$ matrix $\beta' D(1)$ has rank $r$, where

$$D(L) = \frac{C(L) - C(1)}{1 - L}.$$

We can write $\Delta Y_t$ as $\Delta Y_t = C(1) U_t + (1 - L) D(L) U_t$, which implies that

$$Y_t = C(1)\sum_{j=1}^{t} U_j + V_t + Y_0 - V_0, \tag{7}$$

where $V_t = D(L) U_t$ is a zero-mean stationary Gaussian process. This is the well-known Beveridge-Nelson (1981) decomposition.

For the time being we will also assume that

Assumption 3. $U_t = 0$ for $t < 1$, so that $Y_0 = V_0 = 0$ in (7).

Admittedly, Assumption 3 is too restrictive, but is made to focus on the main issues. For the same reason we do not yet consider the more realistic case of drift in $Y_t$. Once we have completed the asymptotic analysis for the case under review, we will show what happens if there is drift in $Y_t$ and Assumption 3 is dropped.

Under some further regularity conditions it follows from Assumptions 1 and 2 and the Granger representation theorem (see Engle and Granger 1987) that $Y_t$ has a VECM representation. Rather than listing these standard regularity conditions we assume that the Granger representation theorem holds:

Assumption 4. Under Assumptions 1-3, $Y_t$ has the VECM($p$) representation

$$\Delta Y_t = \alpha\beta' Y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta Y_{t-j} + \varepsilon_t, \quad t \ge 1, \tag{8}$$

where $\varepsilon_t \sim$ i.i.d. $N_k[0, \Omega]$, with $\Omega$ non-singular.

Due to Assumption 3, there is no vector of constants in this model. Moreover, note that $\varepsilon_t = C_0 U_t$, so that $\Omega = C_0 C_0'$. Similar to (3), model (8) can be written more conveniently as

$$\Delta Y_t = \alpha\beta' Y_{t-1} + \Gamma X_t + C_0 U_t, \quad t \ge 1,$$

and replacing $\varepsilon_t$ by $C_0 U_t$ in (3), the time-varying VECM($p$) model becomes

$$\Delta Y_t = \alpha\xi' Y_{t-1}^{(m)} + \Gamma X_t + C_0 U_t, \quad t \ge 1,$$

where under the null hypothesis,

$$\xi = \begin{pmatrix} \beta \\ O_{m.k \times r} \end{pmatrix}. \tag{9}$$
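The statistic (6) can be computed from the eigenvalue problem (5) with a few lines of linear algebra. The sketch below is one possible implementation under stated assumptions, not the authors' code: the function names `cheb`, `_ordered_eigenvalues` and `lr_tvc` are hypothetical, the first $p$ rows of `Y` are used as initial values, and the example data (a random walk `Y2` with `Y1 = Y2 + noise`) are purely illustrative.

```python
import numpy as np

def cheb(T, m):
    """(m+1) x T array of Chebyshev time polynomials P_{i,T}(t), t = 1..T."""
    t = np.arange(1, T + 1)
    rows = [np.ones(T)]
    rows += [np.sqrt(2.0) * np.cos(i * np.pi * (t - 0.5) / T) for i in range(1, m + 1)]
    return np.vstack(rows)

def _ordered_eigenvalues(dY, Ylag, X):
    """Ordered solutions of det[lam*S11 - S10 S00^{-1} S01] = 0, cf. (5)."""
    T = dY.shape[0]
    if X.shape[1] > 0:  # partial out the lagged differences X_t
        dY = dY - X @ np.linalg.lstsq(X, dY, rcond=None)[0]
        Ylag = Ylag - X @ np.linalg.lstsq(X, Ylag, rcond=None)[0]
    S00 = dY.T @ dY / T
    S11 = Ylag.T @ Ylag / T
    S01 = dY.T @ Ylag / T
    A = S01.T @ np.linalg.solve(S00, S01)        # S10 S00^{-1} S01
    Linv = np.linalg.inv(np.linalg.cholesky(S11))
    lam = np.linalg.eigvalsh(Linv @ A @ Linv.T)  # symmetric reduction
    return np.sort(lam)[::-1]

def lr_tvc(Y, r, m, p=1):
    """LR statistic (6) for TI against TV cointegration of order m."""
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    dY_all = np.diff(Y, axis=0)                  # dY_all[s-1] = Y[s] - Y[s-1]
    idx = np.arange(p, n)                        # time indices of Delta Y_t
    T = len(idx)
    dY, Ylag = dY_all[idx - 1], Y[idx - 1]
    X = (np.hstack([dY_all[idx - 1 - j] for j in range(1, p)])
         if p > 1 else np.empty((T, 0)))
    P = cheb(T, m)
    Ym = np.hstack([P[i][:, None] * Ylag for i in range(m + 1)])  # cf. (4)
    lam0 = _ordered_eigenvalues(dY, Ylag, X)[:r]
    lamm = _ordered_eigenvalues(dY, Ym, X)[:r]
    return T * np.sum(np.log((1.0 - lam0) / (1.0 - lamm)))

# Illustrative data: Y2 is a random walk and Y1 = Y2 + noise (TI cointegration).
rng = np.random.default_rng(0)
Y2 = np.cumsum(rng.standard_normal(201))
Y = np.column_stack([Y2 + rng.standard_normal(201), Y2])
lr = lr_tvc(Y, r=1, m=2)
lr_m0 = lr_tvc(Y, r=1, m=0)   # m = 0: both models coincide
df = 2 * Y.shape[1] * 1       # m * k * r degrees of freedom of the chi-square limit
```

With `m = 0` the two eigenvalue problems coincide and the statistic is zero; for `m >= 1` the statistic is nonnegative because the restricted model is nested in the extended one.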

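Finally, to exclude the case that $\beta' Y_{t-1}$ and $X_t$ are multicollinear we need to assume that

Assumption 5. $\mathrm{Var}\left[\left(Y_{t-1}'\beta, X_t'\right)'\right]$ is nonsingular.

3.3 Asymptotic Null Distribution

The asymptotic results in the standard cointegration case hinge on the following well-known convergence results. Under Assumptions 1-2,

$$\frac{1}{T}\sum_{t=1}^{T} U_t Y_{t-1}' \overset{d}{\to} \int_0^1 (dW) W' C(1)',$$

$$\frac{1}{T}\sum_{t=1}^{T} \left(\Delta Y_{t-\ell}\right) Y_{t-1}' \overset{d}{\to} C(1)\left(\int_0^1 (dW) W'\right) C(1)' + M_\ell, \quad \ell \ge 0,$$

$$\frac{1}{T^2}\sum_{t=1}^{T} Y_{t-1} Y_{t-1}' \overset{d}{\to} C(1)\left(\int_0^1 W(x) W'(x) dx\right) C(1)',$$

where $W$ is a $k$-variate standard Wiener process, and the $M_\ell$'s are nonrandom $k \times k$ matrices. See Phillips and Durlauf (1986) and Phillips (1988). We need to generalize these results to the case where $Y_{t-1}$ is replaced by $Y_{t-1}^{(m)}$:

Lemma 2. Under Assumptions 1-2,

$$\frac{1}{T}\sum_{t=1}^{T} U_t \left(Y_{t-1}^{(m)}\right)' \overset{d}{\to} \int_0^1 (dW) \widetilde{W}_m' \left(C(1)' \otimes I_{m+1}\right),$$

$$\frac{1}{T}\sum_{t=1}^{T} \left(\Delta Y_{t-\ell}\right) \left(Y_{t-1}^{(m)}\right)' \overset{d}{\to} C(1)\int_0^1 (dW) \widetilde{W}_m' \left(C(1)' \otimes I_{m+1}\right) + M_\ell^*, \quad \ell \ge 0, \tag{10}$$

$$\frac{1}{T^2}\sum_{t=1}^{T} \left(Y_{t-1}^{(m)}\right)\left(Y_{t-1}^{(m)}\right)' \overset{d}{\to} \left(C(1) \otimes I_{m+1}\right) \int_0^1 \widetilde{W}_m(x) \widetilde{W}_m'(x) dx \left(C(1)' \otimes I_{m+1}\right),$$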

where $W$ is a $k$-variate standard Wiener process,

$$\widetilde{W}_m(x) = \left(W'(x), \sqrt{2}\cos(\pi x) W'(x), \ldots, \sqrt{2}\cos(m\pi x) W'(x)\right)',$$

and the $M_\ell^*$'s are $k \times k(m+1)$ non-random matrices.

Proof. See the Appendix.

Note that

$$\int_0^1 (dW) \widetilde{W}_m' = \left(\int_0^1 (dW(x)) W'(x), \; \sqrt{2}\int_0^1 \cos(1\pi x)\, dW(x) W'(x), \; \sqrt{2}\int_0^1 \cos(2\pi x)\, dW(x) W'(x), \; \ldots, \; \sqrt{2}\int_0^1 \cos(m\pi x)\, dW(x) W'(x)\right).$$

In Bierens and Martins (2009) we define the proper meaning of the random matrices $\int_0^1 \cos(\ell\pi x)\, dW(x) W'(x)$ for $\ell = 1, 2, 3, \ldots$ In particular, if $W(x)$ is univariate then

$$\int_0^1 \cos(\ell\pi x) W(x)\, dW(x) = \frac{(-1)^\ell}{2} W^2(1) + \frac{\ell\pi}{2}\int_0^1 \sin(\ell\pi x) W^2(x)\, dx.$$

The result (10) implies that $\frac{1}{T}\sum_{t=1}^{T} \left(\Delta Y_{t-\ell}\right)\left(Y_{t-1}^{(m)}\right)' = O_p(1)$. The latter is what is needed for our analysis. Therefore, the question of what the matrices $M_\ell^*$ look like is not relevant.

Using Lemma 2 (together with a rather long list of auxiliary lemmas), the following results can be shown.

Lemma 3. Under Assumptions 1-5 the $r$ largest ordered solutions $\hat{\lambda}_{m,1} \ge \hat{\lambda}_{m,2} \ge \ldots \ge \hat{\lambda}_{m,r}$ of the generalized eigenvalue problem (5) converge in probability to constants $1 > \lambda_1 \ge \ldots \ge \lambda_r > 0$, which do not depend on $m$. Thus, these probability limits are the same as in the standard TI cointegration case.

Proof. See the Appendix.

As is well-known (see Johansen, 1988), in the standard TI cointegration case $m = 0$ and under Assumptions 1-5, $T\left(\hat{\lambda}_{0,r+1}, \hat{\lambda}_{0,r+2}, \ldots, \hat{\lambda}_{0,k}\right)'$ converges in distribution to the vector of ordered solutions $\rho_{0,1} \ge \rho_{0,2} \ge \ldots \ge \rho_{0,k-r}$ of

$$\det\left[\rho\int_0^1 W_{k-r}(x) W_{k-r}'(x) dx - \int_0^1 W_{k-r}\, dW_{k-r}' \int_0^1 \left(dW_{k-r}\right) W_{k-r}'\right] = 0, \tag{11}$$

where $W_{k-r}(x) = \left(\alpha_\perp' \Omega \alpha_\perp\right)^{-1/2} \alpha_\perp' C_0 W(x)$ is a $k - r$ variate standard Wiener process.$^1$ This result is based on the fact that one can choose an orthogonal complement $\beta_\perp$ of $\beta$ such that

$$\frac{1}{T}\beta_\perp' S_{11,T}^{(0)} \beta_\perp \overset{d}{\to} \int_0^1 W_{k-r}(x) W_{k-r}'(x) dx, \quad \left(\alpha_\perp' \Omega \alpha_\perp\right)^{-1/2} \alpha_\perp' C_0 S_{01,T}^{(0)} \beta_\perp \overset{d}{\to} \int_0^1 \left(dW_{k-r}\right) W_{k-r}'.$$

One would therefore expect that this result can be generalized to the TV cointegration case simply by replacing $W_{k-r}(x)$ in (11) with

$$\widetilde{W}_{k-r,m}(x) = \left(W_{k-r}'(x), \sqrt{2}\cos(\pi x) W_{k-r}'(x), \ldots, \sqrt{2}\cos(m\pi x) W_{k-r}'(x)\right)' = \left(\left(\alpha_\perp' \Omega \alpha_\perp\right)^{-1/2} \alpha_\perp' C_0 \otimes I_{m+1}\right) \widetilde{W}_m(x) \tag{12}$$

while leaving $dW_{k-r}$ as is. However, that is not the case!

Lemma 4. Under Assumptions 1-5, $T\left(\hat{\lambda}_{m,r+1}, \hat{\lambda}_{m,r+2}, \ldots, \hat{\lambda}_{m,k}\right)' \overset{d}{\to} \left(\rho_{m,1}, \ldots, \rho_{m,k-r}\right)'$, where $\rho_{m,1} \ge \rho_{m,2} \ge \ldots \ge \rho_{m,k-r}$ are the $k - r$ largest solutions of the generalized eigenvalue problem

$$\det\left[\rho\begin{pmatrix} \int_0^1 \widetilde{W}_{k-r,m}(x) \widetilde{W}_{k-r,m}'(x) dx & O_{(k-r)(m+1),r.m} \\ O_{r.m,(k-r)(m+1)} & I_{r.m} \end{pmatrix} - \begin{pmatrix} \int_0^1 \widetilde{W}_{k-r,m}\, dW_{k-r}' \\ V \end{pmatrix}\begin{pmatrix} \int_0^1 \left(dW_{k-r}\right) \widetilde{W}_{k-r,m}', & V' \end{pmatrix}\right] = 0 \tag{13}$$

$^1$ Because $\alpha_\perp' C_0 C_0' \alpha_\perp = \alpha_\perp' \Omega \alpha_\perp$.

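with $V$ an $r.m \times (k - r)$ random matrix with i.i.d. $N[0, 1]$ elements. Moreover, $V$ is independent of $W_{k-r}$ and $\widetilde{W}_{k-r,m}$.

Proof. See the Appendix.

The reason for this unexpected result is the following. Under the null hypothesis (9), any orthogonal complement of the $(m+1)k \times r$ matrix $\xi$ of TV cointegrating vectors is an $(m+1)k \times (k(m+1) - r)$ matrix of the form

$$\xi_\perp = \left(\beta_\perp \otimes I_{m+1}, \begin{pmatrix} O_{k,r} \\ \beta \otimes I_m \end{pmatrix}\right) \times R,$$

where the $k \times (k - r)$ matrix $\beta_\perp$ is an orthogonal complement of $\beta$ and $R$ is a nonsingular $(k(m+1) - r) \times (k(m+1) - r)$ matrix, possibly depending on $T$. We need to choose $R$ such that $\frac{1}{T}\xi_\perp' S_{11,T}^{(m)} \xi_\perp$ converges in distribution to a nonsingular matrix.$^2$ A suitable version of $\xi_\perp$ that delivers this result is

$$\xi_{\perp,T} = \left(\beta_\perp \otimes I_{m+1}, \sqrt{T}\begin{pmatrix} O_{k,r} \\ \beta\Sigma_{\beta\beta}^{-1/2} \otimes I_m \end{pmatrix}\right), \tag{14}$$

where

$$\Sigma_{\beta\beta} = \operatorname*{p\,lim}_{T \to \infty} \frac{1}{T}\sum_{t=1}^{T} \beta' Y_{t-1} Y_{t-1}' \beta.$$

Then $\frac{1}{T}\xi_{\perp,T}' S_{11,T}^{(m)} \xi_{\perp,T}$ converges in distribution to the first matrix in (13). Moreover,

$$\left(\alpha_\perp' \Omega \alpha_\perp\right)^{-1/2} \alpha_\perp' C_0 S_{01,T}^{(m)} \xi_{\perp,T} \overset{d}{\to} \left(\int_0^1 \left(dW_{k-r}\right) \widetilde{W}_{k-r,m}', \; V'\right).$$

The matrix $V$ involved is now due to

$$\sqrt{T}\begin{pmatrix} O_{k,r} \\ \beta\Sigma_{\beta\beta}^{-1/2} \otimes I_m \end{pmatrix}.$$

Under standard cointegration, the ML estimator $\hat{\beta}$ of $\beta$, normalized as $\tilde{\beta} = \hat{\beta}\left(\beta'\hat{\beta}\right)^{-1}\beta'\beta$, satisfies

$$T\left(\tilde{\beta} - \beta\right) \overset{d}{\to} \beta_\perp\left(\int_0^1 W_{k-r} W_{k-r}'\right)^{-1}\left(\int_0^1 W_{k-r}\, dW_\alpha'\right)\left(\alpha'\Omega^{-1}\alpha\right)^{-1/2},$$

$^2$ So that Lemma 2 in Andersson et al. (1983) can be applied.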

where $W_\alpha$ is an $r$-variate standard Wiener process which is independent of $W_{k-r}$. See Johansen (1988). In our case, however, the corresponding result is again quite different:

Lemma 5. Let $\hat{\xi}$ be the ML estimator of $\xi$, and denote $\tilde{\xi} = \hat{\xi}\left(\xi'\hat{\xi}\right)^{-1}\xi'\xi$. Let $\xi_{\perp,T}$ be the orthogonal complement of $\xi$ defined by (14). We can always write $\tilde{\xi} - \xi = \xi_{\perp,T}\hat{U}_{m,T}$, where $\hat{U}_{m,T} = \left(\xi_{\perp,T}'\xi_{\perp,T}\right)^{-1}\xi_{\perp,T}'\tilde{\xi}$. Under Assumptions 1-5,

$$T\,\hat{U}_{m,T} \overset{d}{\to} \begin{pmatrix} \left(\int_0^1 \widetilde{W}_{k-r,m}(x)\widetilde{W}_{k-r,m}'(x)dx\right)^{-1}\int_0^1 \widetilde{W}_{k-r,m}\, dW_\alpha' \\ V_\alpha \end{pmatrix}\left(\alpha'\Omega^{-1}\alpha\right)^{-1/2},$$

where $W_\alpha$ is an $r$-variate standard Wiener process, $V_\alpha$ is an $r.m \times r$ matrix with independent $N[0, 1]$ distributed elements, and $V_\alpha$, $W_\alpha$ and $\widetilde{W}_{k-r,m}$ are independent. Consequently,

$$\begin{pmatrix} T I_k & O_{k,k.m} \\ O_{k.m,k} & \sqrt{T} I_{k.m} \end{pmatrix}\left(\tilde{\xi} - \xi\right) \overset{d}{\to} \begin{pmatrix} \left(\beta_\perp, O_{k,m}\right)\left(\int_0^1 \widetilde{W}_{k-r,m}(x)\widetilde{W}_{k-r,m}'(x)dx\right)^{-1}\int_0^1 \widetilde{W}_{k-r,m}\, dW_\alpha' \\ \left(\beta\Sigma_{\beta\beta}^{-1/2} \otimes I_m\right) V_\alpha \end{pmatrix}\left(\alpha'\Omega^{-1}\alpha\right)^{-1/2}. \tag{15}$$

Proof. See the Appendix.

Recall that under the null hypothesis, $\xi' = \left(\beta', O_{r,k.m}\right)$. The test for standard cointegration is based on a simple hypothesis. The chi-square asymptotic distribution of the likelihood ratio statistic, derived in the Appendix, follows from the previous four lemmas and the Taylor expansion around the MLE of a function of the type

$$\hat{f}_{m,T}(x) = T \ln\left(\frac{\det\left(x' S_{11,T}^{(m)} x\right)}{\det\left(x'\left(S_{11,T}^{(m)} - S_{10,T}^{(m)} S_{00,T}^{-1} S_{01,T}^{(m)}\right)x\right)}\right).$$

t = Y2. It follows now straightforwardly that: Theorem 1 Given m ≥ 1 and r ≥ 1.T (β) − f0.t−1 + U2. to the decomposition of the LR statistic ³ ´ ³ ´ e e ξ) ξ) fm.t .T (β) − f0.³ ´−1 0 e b 0b where similar to e defined in Lemma 5.t with Ut = (U1.t )0 .r + χr(m+1)(k−r) .t )0 drawn independently from the bivariate standard normal 15 .m × µZ 0 1 0 Then we simply apply the Taylor expansion.T (β) = fm.4 Empirical Size To check how close the asymptotic critical values based on the χ2 distribution are to the ones based on the small sample null distribution.m dW 0α ¶¸ 0 2 ∼ χ2 r. It follows then from Lemma 5 that under the null hypothesis. ¡ ¢ d ξ) fm.t = Y2. β = β β β ξ β β.m (x)dx + trace dW α Wk−r.m (x)Wk−r.t + U1.m. where Y1. Wk−r dW 0α 0 ¶¸ 0 ∼ χ2 r(k−r) . mkr 3.T (e − f0.T (β) → trace V 0α V α "µZ ¶ µZ 1 ¶−1 1 0 0 f f f Wk−r.000 replications of the bivariate cointegrated vector time series process Yt = (Y1.t . Y2.T (e − f0. we have applied our test to 10. derived in Johansen (1988). whereas it has been shown by Johansen (1988) that ´ ³ ³ ´ b e b T f0 β − f0 (β) "µZ ¶ µZ 1 ¶−1 1 d 0 0 → trace dW α Wk−r Wk−r (x)Wk−r (x)dx × µZ 0 1 f Wk−r. Y2. U2.T (b − f0. under the null hypothesis of stantvc dard cointegration the LR statistic LRT defined in (6) is asymptotically χ2 distributed. where the two chi-square distributions are independent.T (β) .t .

for T = 100 and 5% asymptotic size the nominal size is 3% for m = 1.3% for m = 5. For example. for example. Thus.j ∆Z2.j ∆Z2. Ut = (U1.t−j + U1.t . 4 4.t ∈ Rr and Z2.t−1 + p−1 X j=1 C12.t )0 ∼ i. for T = 500 the empirical and the asymptotic distributions almost coincide. by using the asymptotic critical values the test tends to over-reject the correct null hypothesis of standard cointegration. with inverses D(L)−1 = ∞ Πj Lj and C22 (L)−1 = ∞ Γj Lj j=0 j=0 satisfying Πj → O.distribution.i. Γj → O exponentially as j → ∞. Vu ] . respectively.t . where A = (α. U2.t = C22. The matrix valued lag Pp P polynomials D(L) = Ir − j=1 Dj Lj and C22 (L) = Ik−r − p−1 C22.t . γ) is a nonsingular k × k matrix. Nk [0. where 0 0 Assumption 6. for various values of T and m.t + γZ2.t ∈ Rk−r are I(1) processes generated by Z1. As expected. with α the matrix of the first r columns of A and γ the matrix of the remaining k − r columns of A.t−j + B2 (t/T ) Z2. In this expression Z1.t−j + U2.t .1 The LR Test under the Alternative of TV Cointegration The Data Generating Process under TV Cointegration A time-varying cointegrated data-generating process Yt with VECM(p) representation (1) can be constructed. For large T and small m the right tail of the distribution is very well approximated by the asymptotic one. 2% for m = 3 and 1. (16) ∆Z2. as follows.t = p X j=1 p−1 X j=1 Dj Z1. For smaller T the test suffers from size distortion. Let Yt = AZt = αZ1.d. The numerical results are given in Bierens and Martins (2009). The elements of B2 (τ ) 16 .j Lj are j=1 P P invertible.

k−r j=1 0 = αβt Yt−1 + X j=1 p−1 Γj ∆Yt−j + AUt .t .j ∆Z1. ∆Yt = A∆Zt µ ¶ p−1 X B2 (t/T ) B1 = A ACj A−1 ∆Yt−j + AUt A−1 Yt−1 + Ok−r. C12.t is due to the dependence of Z1.t−j + U1. µ B1 B2 (t/T ) Ok−r.r C22. where βt0 = (B1. and B2 (τ ) = B2 (0) for τ < 0.k−r µ ¶ 4 hence p−1 X j=1 ∆Zt = where Zt−1 + Cj ∆Zt−j + Ut Cj = Thus.j ¶ . Γj = ACj A−1 . 1] with bounded derivatives. B2 (t/T )) A−1 .t = B1 Z1.t−1 p−1 p−1 X X + C11.j ∆Z2.j C11. (18) for example. Note that the nonstationarity of Z1. we can rewrite model (16) as ∆Z1. j=1 j=1 (17) where B1 = Pp j=1 Dj − Ir is nonsingular.t−1 .r Ok−r.are continuously differentiable function on an open interval containing [0.j Ok−r.r Ok−r. B2 (τ ) = B2 (1) for τ > 1.t−1 + B2 (t/T ) Z2.t on B2 (t/T ) Z2. As is well-known. 17 .t−j + C12. 4 (19) Because the invertibility of D(L) implies that all the roots of the polynomial ´ ³ Pp det Ir − j=1 Dj xj lie outside the complex unit circle.
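A small simulation may help to see what the construction (16)-(19) produces. The sketch below is an illustration under stated assumptions rather than the authors' design: it takes $p = 1$, $k = 2$, $r = 1$, $A = I_2$ and $V_u = I_2$, and the values `D1 = 0.5` and `B2(tau) = 0.5 * (1 + tau)` as well as the zero starting values are hypothetical choices.

```python
import numpy as np

D1 = 0.5                      # hypothetical AR coefficient, |D1| < 1 so D(L) is invertible
B1 = D1 - 1.0                 # B_1 = D_1 - I_r for p = 1, cf. (17)

def B2(tau):
    """Hypothetical smooth time-varying coefficient B_2(tau) on [0, 1]."""
    return 0.5 * (1.0 + tau)

def simulate_tv_dgp(T, seed=0):
    """Simulate (16) with p = 1, k = 2, r = 1 and A = I_2, so that Y_t = Z_t."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((T + 1, 2))
    Z = np.zeros((T + 1, 2))  # Z[0] = 0 is an assumed starting value
    for t in range(1, T + 1):
        Z[t, 1] = Z[t - 1, 1] + U[t, 1]                                  # Delta Z_2 = U_2
        Z[t, 0] = D1 * Z[t - 1, 0] + B2(t / T) * Z[t - 1, 1] + U[t, 0]   # first line of (16)
    return Z, U

T = 2000
Z, U = simulate_tv_dgp(T)
tau = np.arange(1, T + 1) / T
ec = B1 * Z[:-1, 0] + B2(tau) * Z[:-1, 1]   # beta_t' Y_{t-1} with beta_t' = (B1, B2(t/T))
dZ1 = np.diff(Z[:, 0])                      # Delta Z_{1,t}
```

By construction `dZ1` equals `ec + U[1:, 0]`, which is the VECM form (18) for this special case, and the error correction term `ec` stays bounded in probability while `Z[:, 1]` wanders like a random walk.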

A more general TV model can be formulated by allowing $B_1$ to be a function of $t/T$ as well, and by including lagged $\Delta Z_{1,t}$'s in the equation for $\Delta Z_{2,t}$. However, that will make the power analysis too complicated.

Under Assumption 6 the process $Y_t$ is TV cointegrated, in the sense that with $\beta_t$ defined in (19), $\beta_t' Y_{t-1} = R_t + O_p(1)$, where $R_t$ is a strictly stationary zero-mean Gaussian process, and the $O_p(1)$ term is uniform in $t = 1, \ldots, T$. This follows from the result (21) in the following lemma.

Lemma 6. Under Assumption 6 we can write

$$\Delta Z_{1,t} = \sum_{j=0}^{t-1} \Pi_j \left(B_2((t-j)/T) - B_2(0)\right) \Delta Z_{2,t-1-j} + V_t + O_p\left(1/\sqrt{T}\right), \tag{20}$$

uniformly in $t = 1, \ldots, T$, where $V_t$ is a strictly stationary zero-mean Gaussian process. Moreover, denote $\sum_{j=0}^{\infty} Q_j L^j = D(L)^{-1} C_{11}(L)$, where $C_{11}(L) = I_r - \sum_{j=1}^{p-1} C_{11,j} L^j$. Then

$$B_1 Z_{1,t-1} + B_2(t/T) Z_{2,t-1} = \sum_{j=0}^{t-1} Q_j \left(B_2((t-j)/T) - B_2(0)\right) \Delta Z_{2,t-1-j} + R_t + O_p\left(1/\sqrt{T}\right), \tag{21}$$

uniformly in $t = 1, \ldots, T$, where $R_t$ is a strictly stationary zero-mean Gaussian process. Consequently, $B_1 Z_{1,t-1} + B_2(t/T) Z_{2,t-1} = R_t + O_p(1)$ uniformly in $t = 1, \ldots, T$.

Proof. See the Appendix.

4.2 Power of the LR test

To study the power of our test, we will adopt the VECM($p$) model (18) with $\beta_t$ defined by (2) as the data generating process. Moreover, to keep the power analysis tractable we will focus on the case $p = 1$, $k = 2$, $r = 1$, $A = I_2$, $V_u = I_2$. Thus, $Y_t = Z_t = (Z_{1,t}, Z_{2,t})'$, where $Z_{1,t} \in \mathbb{R}$ and $Z_{2,t} \in \mathbb{R}$ are assumed to be generated by

$$\Delta Z_{1,t} = b_1 Z_{1,t-1} + b_2(t/T) Z_{2,t-1} + U_{1,t}, \quad \Delta Z_{2,t} = U_{2,t}, \quad U_t = (U_{1,t}, U_{2,t})' \sim \text{i.i.d. } N_2[0, I_2].$$

Next, suppose that for some $m > 0$,

$$b_1^{-1} b_2(t/T) = \sum_{j=0}^{m} \rho_j P_{j,T}(t).$$

Then

$$\left(b_1, b_2(t/T)\right) = b_1 \sum_{j=0}^{m} \varsigma_j' P_{j,T}(t),$$

where $\varsigma_0 = (1, \rho_0)'$ and $\varsigma_j = (0, \rho_j)'$ for $j \ge 1$. Hence,

$$\Delta Z_{1,t} = b_1\left(Z_{1,t-1} + \sum_{j=0}^{m} \rho_j P_{j,T}(t) Z_{2,t-1}\right) + U_{1,t} = b_1 \sum_{j=0}^{m} \varsigma_j' P_{j,T}(t) Z_{t-1} + U_{1,t} = b_1 \varsigma' Z_{t-1}^{(m)} + U_{1,t},$$

where

$$\varsigma' = \left(1, 0, \ldots, 0, \rho_0, \rho_1, \ldots, \rho_m\right) \tag{22}$$

and

$$Z_{t-1}^{(m)} = \begin{pmatrix} Z_{1,t-1}^{(m)} \\ Z_{2,t-1}^{(m)} \end{pmatrix} = Z_{t-1} \otimes p_m(t/T),$$

with $p_m(t/T) = \left(1, P_{1,T}(t), P_{2,T}(t), \ldots, P_{m,T}(t)\right)'$ and

$$Z_{i,t-1}^{(m)} = \left(Z_{i,t-1}, P_{1,T}(t) Z_{i,t-1}, \ldots, P_{m,T}(t) Z_{i,t-1}\right)', \quad i = 1, 2.$$
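The stacking used in (22) can be verified numerically: with $Z_{t-1}^{(m)} = Z_{t-1} \otimes p_m(t/T)$, the product $\varsigma' Z_{t-1}^{(m)}$ should reproduce $Z_{1,t-1} + \sum_{j} \rho_j P_{j,T}(t) Z_{2,t-1}$. The following sketch checks this identity; the coefficients in `rho` and the random stand-in values for $Z_{t-1}$ are hypothetical.

```python
import numpy as np

def p_m(t, T, m):
    """Vector p_m(t/T) = (1, P_{1,T}(t), ..., P_{m,T}(t))'."""
    i = np.arange(1, m + 1)
    return np.concatenate(([1.0], np.sqrt(2.0) * np.cos(i * np.pi * (t - 0.5) / T)))

T, m = 50, 3
rho = np.array([1.0, 0.4, -0.2, 0.1])                  # hypothetical rho_0, ..., rho_m
varsigma = np.concatenate(([1.0], np.zeros(m), rho))   # (1, 0, ..., 0, rho_0, ..., rho_m)

rng = np.random.default_rng(0)
Z_lag = rng.standard_normal((T, 2))                    # stand-in values for Z_{t-1}

lhs = np.empty(T)
rhs = np.empty(T)
for t in range(1, T + 1):
    p = p_m(t, T, m)
    Zm = np.kron(Z_lag[t - 1], p)                      # (Z_1 * p_m', Z_2 * p_m')'
    lhs[t - 1] = varsigma @ Zm                         # varsigma' Z^{(m)}_{t-1}
    rhs[t - 1] = Z_lag[t - 1, 0] + (rho @ p) * Z_lag[t - 1, 1]
```

The arrays `lhs` and `rhs` agree for every `t`, which confirms that the $m$ zeros in $\varsigma$ switch off the time-varying terms attached to $Z_{1,t-1}$.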

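We can now write the model in VECM(1) form as

$$\Delta Z_t = \delta\varsigma' Z_{t-1}^{(m)} + U_t, \quad \text{where} \quad \delta = \begin{pmatrix} b_1 \\ 0 \end{pmatrix}.$$

In the sequel we will refer to this model, together with the applicable parts of Assumptions 1-2, as $H_1^{(m)}(p = 1)$. Under $H_1^{(m)}(p = 1)$ the matrices $S_{00,T}$, $S_{11,T}^{(m)}$ and $S_{01,T}^{(m)}$ become

$$S_{00,T} = \frac{1}{T}\sum_{t=1}^{T} \Delta Z_t \Delta Z_t', \quad S_{11,T}^{(m)} = \frac{1}{T}\sum_{t=1}^{T} Z_{t-1}^{(m)} Z_{t-1}^{(m)\prime}, \quad S_{01,T}^{(m)} = \frac{1}{T}\sum_{t=1}^{T} \Delta Z_t Z_{t-1}^{(m)\prime},$$

respectively. The maximum log-likelihood in the standard case with $r = 1$ is

$$\hat{l}_T(1, 0) = -\frac{1}{2} T \ln\left(1 - \max_\beta \frac{\beta' S_{10,T}^{(0)} S_{00,T}^{-1} S_{01,T}^{(0)} \beta}{\beta' S_{11,T}^{(0)} \beta}\right) - \frac{1}{2} T \ln\left(\det\left(S_{00,T}\right)\right) - T k \ln\left(\sqrt{2\pi}\right) - \frac{1}{2} k T$$

and in the TV case

$$\hat{l}_T(1, m) = -\frac{1}{2} T \ln\left(1 - \max_\xi \frac{\xi' S_{10,T}^{(m)} S_{00,T}^{-1} S_{01,T}^{(m)} \xi}{\xi' S_{11,T}^{(m)} \xi}\right) - \frac{1}{2} T \ln\left(\det\left(S_{00,T}\right)\right) - T k \ln\left(\sqrt{2\pi}\right) - \frac{1}{2} k T.$$

Thus,

$$\operatorname*{p\,lim}_{T \to \infty} T^{-1}\left(\hat{l}_T(1, m) - \hat{l}_T(1, 0)\right) > 0 \tag{23}$$

if $\operatorname{p\,lim}_{T \to \infty} \hat{\lambda}_{\max}^{(0)} < \operatorname{p\,lim}_{T \to \infty} \hat{\lambda}_{\max}^{(m)}$, where

$$\hat{\lambda}_{\max}^{(0)} = \max_\beta \frac{\beta' S_{10,T}^{(0)} S_{00,T}^{-1} S_{01,T}^{(0)} \beta}{\beta' S_{11,T}^{(0)} \beta}, \quad \hat{\lambda}_{\max}^{(m)} = \max_\xi \frac{\xi' S_{10,T}^{(m)} S_{00,T}^{-1} S_{01,T}^{(m)} \xi}{\xi' S_{11,T}^{(m)} \xi}.$$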

Note that $\hat\lambda^{(m)}_{\max}$ is the maximal solution of (5). Therefore, the consistency of our test against the alternative $H_1^{(m)}$ ($p = 1$) follows from the following theorem.

Theorem 2. Under $H_1^{(m)}$ ($p = 1$),

$$\operatorname*{plim}_{T\to\infty} \frac{\varsigma' S^{(m)}_{10,T} S^{-1}_{00,T} S^{(m)}_{01,T}\, \varsigma}{\varsigma' S^{(m)}_{11,T}\, \varsigma} \in (0,1),$$

where $\varsigma$ is defined by (22), and

$$\operatorname*{plim}_{T\to\infty} \frac{\beta' S^{(0)}_{10,T} S^{-1}_{00,T} S^{(0)}_{01,T}\, \beta}{\beta' S^{(0)}_{11,T}\, \beta} = 0$$

for all nonzero vectors $\beta \in \mathbb{R}^2$. Because

$$\operatorname*{plim}_{T\to\infty} \hat\lambda^{(m)}_{\max} = \operatorname*{plim}_{T\to\infty} \frac{\varsigma' S^{(m)}_{10,T} S^{-1}_{00,T} S^{(m)}_{01,T}\, \varsigma}{\varsigma' S^{(m)}_{11,T}\, \varsigma},$$

(23) holds.

The proof of Theorem 2 is not too difficult but tedious and lengthy. This proof is therefore given in Bierens and Martins (2009). It is our conjecture that Theorem 2 carries over to more general alternatives, but verifying this analytically proved to be too tedious an exercise. The same applies to the local power of the test. It is our conjecture that along the lines of the proof of Theorem 2 it can be shown that the test has nontrivial local power, but a formal proof is beyond the scope of this paper.

The power of our test depends on the choice of the Chebyshev polynomial order m. The optimal choice for m can be compared to the optimal choice of the order of an autoregressive process. As to the latter, researchers usually employ the Hannan-Quinn (1979) or Schwarz (1978) information criteria. The results in this section suggest that these information criteria can also be used to estimate m consistently if m is finite.

4.3 Empirical Power

The assumption that the time varying cointegrating vector can be exactly represented by a fixed number of Chebyshev polynomials is quite restrictive. Therefore, in this subsection we check via a limited Monte Carlo study how the test performs if this assumption is not true. The data generating process we have used is

$$\Delta Z_{1,t} = -0.5\,\bigl(Z_{1,t-1} - (1 - \omega + \omega f(t/T))\, Z_{2,t-1}\bigr) + 0.25\,\Delta Z_{1,t-1} + U_{1,t},$$
$$\Delta Z_{2,t} = U_{2,t}, \qquad t = 1, \ldots, T,$$

where $Z_{1,t} \in \mathbb{R}$, $Z_{2,t} \in \mathbb{R}$, the error vectors $(U_{1,t}, U_{2,t})'$ are independently $N_2[0, I_2]$ distributed, and $\omega \in [0, 1]$. For the function f we have chosen the following S-shaped function on [0, 1]:

$$f(x) = 12 \int_0^x y(1 - y)\, dy - 1 = 6x^2 - 4x^3 - 1.$$

Note that f(t/T) cannot be represented by a fixed number m of Chebyshev polynomials. The case $\omega = 0$ corresponds to TI cointegration, with cointegrating vector $\beta = (1, -1)'$, whereas for $\omega > 0$ we have time-varying cointegration with cointegrating vector $\beta_t$ moving smoothly from $\beta_0 = (1, 2\omega - 1)'$ to $\beta_T = (1, -1)'$.

In order to check the size and to mimic local alternatives we have conducted the power simulations for $\omega \in \{0, 0.01, 0.05, 0.1, 0.2, 0.5, 1\}$, for m = 1, 5 and T = 100, 200. The number of replications is 10,000. The results are presented in Table 1. In Table 1, $\alpha_{asy}$ indicates the asymptotic size, so that the rejection rates involved are with respect to the asymptotic critical values, whereas $\alpha_{real}$ is the empirical size, so that rejection rates involved are with respect to the empirical critical values. See Bierens and Martins (2009) for the latter.

As expected, our test suffers from size distortion in small samples if the asymptotic critical values are used. This size distortion increases with m. In view of the results for $\omega = 0.01$ and $\omega = 0.05$ our test seems to have non-trivial local power, despite the fact that f(t/T) cannot be represented by a fixed number of Chebyshev polynomials. Moreover, note that in general the power is not affected much by the choice of m.

Finally, we have also analyzed the size and power properties of the two tests proposed by Park and Hahn (1999), for the same cases as in Table 1. The number of replications is 10,000. The results are presented in Bierens and Martins (2009). Surprisingly, both Park-Hahn tests suffer from extreme size distortion. Therefore, it is difficult to compare the actual power of these tests with the power of our test.
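As a small illustration of the design above, the following Python sketch evaluates the S-shaped function f and the implied path of the second element of the cointegrating vector, $-(1 - \omega + \omega f(t/T))$. The helper names (`f_numeric`, `beta_path`) and the choice $\omega = 0.5$, T = 100 are ours and purely illustrative; this is not the simulation code behind Table 1.

```python
import math

def f(x):
    """Closed form of the S-shaped function: 6x^2 - 4x^3 - 1 on [0, 1]."""
    return 6.0 * x**2 - 4.0 * x**3 - 1.0

def f_numeric(x, steps=20000):
    """Midpoint-rule evaluation of 12 * int_0^x y(1 - y) dy - 1 (illustrative check)."""
    h = x / steps
    integral = sum((i + 0.5) * h * (1.0 - (i + 0.5) * h) for i in range(steps)) * h
    return 12.0 * integral - 1.0

def beta_path(omega, T):
    """Second element of beta_t = (1, -(1 - omega + omega * f(t/T)))' for t = 0, ..., T."""
    return [-(1.0 - omega + omega * f(t / T)) for t in range(T + 1)]

# Hypothetical parameter choice for illustration only.
omega, T = 0.5, 100
path = beta_path(omega, T)
print(path[0], path[-1])  # endpoints: 2*omega - 1 and -1
```

Under these assumptions the path starts at $2\omega - 1$ and ends at $-1$, matching $\beta_0$ and $\beta_T$ in the text, and it is monotone because $f'(x) = 12x(1 - x) \ge 0$ on [0, 1].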

252 = 0.092 0.999 0.1 0.998 1.594 0.430 0.038 0.683 0.693 0.01 0.290 = 0.05 0.000 αasy = 0.988 αreal = 0.995 1.532 0.236 0. m = 1 αasy = 0.914 =1 0.818 0.063 0.369 0.01 0.202 0.057 0.096 0.05 0.01 0.019 0.164 0.603 = 0.5 0.178 0.106 0.196 0.991 1.688 0.000 23 .2 0.000 αreal = 0.000 αreal = 0.067 0. m = 5 αasy = 0.01 0.985 αasy = 0.909 0.403 0.366 0.000 1.112 0.237 = 0.373 = 0.Table 1: Power of the LR test T ω ω ω ω ω ω ω T ω ω ω ω ω ω ω = 100.829 0.05 =0 0.999 0.093 = 0.079 = 0.800 0.1 0.998 0.105 0.140 0.127 0.05 0.015 0.396 0.999 =1 1.095 = 0.884 0.01 0.227 0.162 0.491 0.05 =0 0.402 0.299 0.05 0.10 αasy = 0.208 0.152 = 0.163 0.05 0.10 αasy = 0.958 =1 0.130 = 0.278 = 0.071 = 0.000 αasy = 0.600 = 0.941 0.2 0.026 0.976 0.5 0.2 0.01 0.093 0.141 0.999 = 200.05 =0 0.998 0.947 0.253 0.950 0.405 0.01 0.05 0.920 = 0.487 = 0.382 0.825 = 0.05 =0 0.772 0.999 1.577 0.996 =1 1.374 0.127 0.10 αasy = 0.126 = 0.347 0.027 0.896 0.076 0.036 0.05 0.997 = 200.10 αasy = 0.133 0.065 0.079 0. m = 1 αasy = 0.000 1.1 0.5 0.05 0.052 0.692 0.251 = 0.360 0.01 0.1 0.2 0.997 αasy = 0. m = 5 αasy = 0.000 T ω ω ω ω ω ω ω T ω ω ω ω ω ω ω = 100.110 0.284 = 0.998 αreal = 0.5 0.574 = 0.

5 The Drift Case

Assumptions 1-2 imply that $\Delta Y_t$ and $\beta' Y_t$ are zero-mean stationary processes. However, for most cointegrated macroeconomic time series, $\Delta Y_t$ and $\beta' Y_t$ are nonzero-mean stationary processes, which correspond to the following modification of Assumption 1:

Assumption 1*. Assume

$$\Delta Y_t = C(L)(U_t + \mu) = \sum_{j=0}^{\infty} C_j (U_{t-j} + \mu),$$

where $\mu \neq 0$ is a vector of imbedded drift parameters, and $U_t$ and $C(L)$ are the same as in Assumption 1.

Then similar to (7) we can write

$$Y_t = C(1) \sum_{j=1}^{t} U_j + C(1)\mu\, t + V_t + Y_0 - V_0.$$

Under Assumption 2, $\beta' Y_t = \beta' V_t + \beta'(Y_0 - V_0)$. Thus, Assumption 2 can be adopted without modifications, but Assumption 3 needs to be dropped as otherwise $\beta'(Y_0 - V_0) = 0$. However, due to the drift we now need to include a vector of intercepts in VECM (8), as in Johansen (1991):

Assumption 4*. Assume $\Delta Y_t$ has the VECM(p) representation

$$\Delta Y_t = \gamma_0 + \alpha\beta' Y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta Y_{t-j} + C_0 U_t.$$

Moreover, Assumption 5 still applies, with $X_t = (\Delta Y_{t-1}', \ldots, \Delta Y_{t-p+1}')'$. These modified Assumptions 1-5 will be referred to as "the drift case". The corresponding time-varying VECM(p) is now

$$\Delta Y_t = \gamma_0 + \alpha\xi' Y^{(m)}_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta Y_{t-j} + C_0 U_t.$$

To re-derive our previous results for this drift case, we need some additional notation. First, let

$$\bar\mu = \left( \mu' C_0' \alpha_\perp (\alpha_\perp' \Omega \alpha_\perp)^{-1} \alpha_\perp' C_0 \mu \right)^{-1/2} (\alpha_\perp' \Omega \alpha_\perp)^{-1/2} \alpha_\perp' C_0 \mu,$$

which is a vector in $\mathbb{R}^{k-r}$. Note that $\bar\mu'\bar\mu = 1$ by normalization. Let $\bar\mu_\perp$ be an orthogonal complement of $\bar\mu$, normalized such that $\bar\mu_\perp' \bar\mu_\perp = I_{k-r-1}$. Then

Lemma 7. In the drift case,

$$(\bar\mu_\perp' \otimes I_{m+1}) (\beta_\perp \otimes I_{m+1})' \frac{1}{\sqrt{T}} Y^{(m)}_{[xT]} \Rightarrow p(x) \otimes \bar W_{k-r-1}(x),$$
$$(\bar\mu' \otimes I_{m+1}) (\beta_\perp \otimes I_{m+1})' \frac{1}{T} Y^{(m)}_{[xT]} \Rightarrow p(x) \otimes x$$

for $x \in [0, 1]$, where $p(x) = \left(1, \sqrt{2}\cos(\pi x), \ldots, \sqrt{2}\cos(m\pi x)\right)'$ and

$$\bar W_{k-r-1} = \bar\mu_\perp' (\alpha_\perp' \Omega \alpha_\perp)^{-1/2} \alpha_\perp' C_0 W \qquad (24)$$

is a (k − r − 1)-variate standard Wiener process.

Next, let $M_T = \left( T^{-1/2} \bar\mu, \bar\mu_\perp \right)$. Redefine the orthogonal complement $\xi_{\perp,T}$ of $\xi$ in (14) as

$$\xi_{\perp,T} = \left( (\beta_\perp \otimes I_{m+1}) (M_T \otimes I_{m+1}),\ \sqrt{T} \begin{pmatrix} O_{k,r} \\ \beta \Sigma_{\beta\beta}^{-1/2} \otimes I_m \end{pmatrix} \right), \qquad (25)$$

and redefine $\widetilde W_{k-r,m}$ in (12) as

$$\widetilde W_{k-r,m}(x) = p(x) \otimes \begin{pmatrix} \bar W_{k-r-1}(x) \\ x \end{pmatrix} - \int_0^1 p(y) \otimes \begin{pmatrix} \bar W_{k-r-1}(y) \\ y \end{pmatrix} dy, \qquad (26)$$

where $\bar W_{k-r-1}$ is defined by (24). Then

Theorem 3. With $\xi_{\perp,T}$ in (14) replaced by (25) and $\widetilde W_{k-r,m}$ in (12) replaced by (26) the results of Lemmas 3-5 and Theorem 1 carry over.

The proofs of Lemma 7 and Theorem 3 are not too difficult but rather lengthy. These proofs are therefore given in Bierens and Martins (2009).
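The vector p(x) of Chebyshev functions in Lemma 7 has a natural discrete-time counterpart. The Python sketch below assumes the time polynomials take the form $P_{0,T}(t) = 1$ and $P_{i,T}(t) = \sqrt{2}\cos(i\pi(t - 0.5)/T)$; the half-step shift is our assumption, since this section does not restate the definition. It checks their orthonormality over t = 1, ..., T and projects a smooth path onto the first m + 1 of them. The function names and the example path are hypothetical.

```python
import math

def cheb(i, t, T):
    """Assumed Chebyshev time polynomial P_{i,T}(t) for t = 1, ..., T."""
    if i == 0:
        return 1.0
    return math.sqrt(2.0) * math.cos(i * math.pi * (t - 0.5) / T)

def inner(i, j, T):
    """Sample inner product (1/T) * sum_t P_{i,T}(t) P_{j,T}(t)."""
    return sum(cheb(i, t, T) * cheb(j, t, T) for t in range(1, T + 1)) / T

def fourier_coefficients(g, m, T):
    """Coefficients xi_i = (1/T) * sum_t g_t P_{i,T}(t), i = 0, ..., m."""
    return [sum(g[t - 1] * cheb(i, t, T) for t in range(1, T + 1)) / T
            for i in range(m + 1)]

def rms_error(g, m, T):
    """Root mean squared error of the order-m approximation of the path g."""
    xi = fourier_coefficients(g, m, T)
    fit = [sum(x * cheb(i, t, T) for i, x in enumerate(xi)) for t in range(1, T + 1)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g, fit)) / T)

# Illustrative smooth path: the S-shaped function 6x^2 - 4x^3 - 1 at x = t/T.
T = 200
g = [6.0 * (t / T) ** 2 - 4.0 * (t / T) ** 3 - 1.0 for t in range(1, T + 1)]
print(inner(2, 3, T), inner(3, 3, T))
print(rms_error(g, 1, T), rms_error(g, 5, T))
```

Because the functions are orthonormal in the sample, the coefficients are simple sample averages, and raising m can only lower the mean squared approximation error of a given path.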

6 An Empirical Application

The validity of the purchasing power parity (PPP) hypothesis has generated a great deal of controversy, intimately related to the type of method applied. A reason why linear VECM models may be unable to detect long run PPP is the presence of transaction costs in equilibrium models of real exchange rate determination, which imply a nonlinear adjustment process in the PPP relationship. Michael et al. (1997) successfully fit an exponential smooth transition autoregressive model, thus capturing the implied nonlinearities. We propose an alternative framework where the cointegrating vectors fluctuate over time.

Recently, Falk and Wang (2003) found that the PPP hypothesis holds for some economies but not for all. Their work is based on Caner's (1998) concept of cointegration where the VECM errors follow a stable distribution. By means of the standard Johansen's approach, they find support of the PPP hypothesis at the 5% level in eight of the 12 cases. With one cointegrating vector, Belgium, Denmark, France, Japan, Netherlands, Norway, Spain, and UK are found to have PPP with the US. At the 10% level, Italy and Sweden were added to the list. Therefore, Canada and Germany are the only countries for which the US has not had price parity according to the standard approach.

We test the TI cointegration hypothesis against TV cointegration for

$$Y_t = \left( \ln S_t^f, \ln P_t^n, \ln P_t^f \right)',$$

where $P_t^n$ and $P_t^f$ are the price indices in the domestic and foreign economies, respectively, and $S_t^f$ is the nominal exchange rate in home currency per unit of the foreign currency. The time-varying cointegrating relation is $\beta_t' Y_t = e_t$, where the process $e_t$ represents the short run deviations from the PPP due to disturbances in the economic system (real or monetary shocks), and $\beta_t$ is an unknown vector-valued function of time. Using Chebyshev time polynomials $P_{i,T}(t)$, $\beta_t$ will be approximated by

$$\beta_t(m) = \sum_{i=0}^{m} \xi_i P_{i,T}(t),$$

where the $\xi_i$'s are the Fourier coefficients.

We use the same data as Falk and Wang (2003), downloaded from the Journal of Applied Econometrics data archives web site. The data are monthly and cover the period from January 1973 to December 1999, so that the time series involved have length 324. The domestic country is the US and the bilateral relationship of study is with Canada, France, Germany, Italy, Japan, and the UK. Falk and Wang (2003) find support for the presence of unit roots in all series. Since the log-prices are unit root with drift processes the tests will be conducted under the "drift-case" assumptions.

The asymptotic p-values of our test, for different combinations of the order m of the Chebyshev polynomial expansion and the lag order p, are presented in Bierens and Martins (2009). We find that, regardless of the lag order, the p-values are zero for any m larger than four. Thus, for all cases there is strong evidence of a time varying type of cointegration between international prices and nominal exchange rates. Hence, our results refute Falk and Wang's (2003) findings of standard PPP for all countries except Canada and Germany. However, Caner's (1998) concept of cointegration employed by Falk and Wang (2003) is fundamentally different from our time-varying cointegration concept, so that one should expect differences in findings as well.

The plots of the time-varying coefficients $\beta_{1t}$, $\beta_{2t}$ and $\beta_{3t}$ in the cointegrating PPP relation $\beta_t' Y_t = \beta_{1t} \ln S_t^f + \beta_{2t} \ln P_t^n + \beta_{3t} \ln P_t^f$ are also presented in Bierens and Martins (2009). It is unclear from these plots why Falk and Wang (2003) find standard PPP for all countries except Canada and Germany because the patterns of $\beta_{2t}$ and $\beta_{3t}$ for Canada and Germany do not look distinct from those of the other countries. Moreover, the variation of $\beta_{1t}$ is minor compared with the variation of $\beta_{2t}$ and $\beta_{3t}$, suggesting that $\beta_{1t}$ may be constant. The patterns of these parameters suggest that, approximately, $\beta_{2t} + \beta_{3t} = \delta$ for some constant $\delta$. This is related to the symmetry assumption in the standard PPP theory, where $\beta_3 + \beta_2 = 0$. On the other hand, $\delta$ seems to be positive for Canada and the UK, and negative for the other countries.

7 Conclusion

In Johansen's standard approach it is assumed that the cointegrating vector is constant over time. This assumption may be restrictive in practice due to changes in taste, technology, or economic policies. We propose a generalization of the standard approach by allowing the cointegrating vectors to be time-varying and we approximate them by using orthogonal Chebyshev time polynomials. We propose a cointegration model that captures smooth time transitions of the cointegrating vectors, the time varying error correction model, and estimate it by maximum likelihood. To distinguish our model from the time invariant Johansen's specification, we construct a likelihood ratio test for the null hypothesis of standard cointegration. The limiting law appears to be chi-square.

To illustrate the practical significance of our approach we apply our test to international prices and nominal exchange rates. We find evidence of time-varying cointegration between these series.

There are issues that merit further research. In particular, the analytical study of the power of the test against local alternatives deserves attention. Moreover, a natural extension of our approach is to include deterministic components such as time trends and/or seasonal dummy variables, and to allow for other time varying parameters.

References

Andersson, S.A., H.K. Brons and S.T. Jensen (1983), Distribution of Eigenvalues in Multivariate Statistical Analysis, Annals of Statistics 11, 392-415.
Andrade, P., C. Bruneau and S. Gregoir (2005), Testing for the Cointegration Rank when some Cointegrating Directions are Changing, Journal of Econometrics 124, 269-310.
Andrews, D.W.K., I. Lee and W. Ploberger (1996), Optimal Changepoint Tests for Normal Linear Regression, Journal of Econometrics 70, 9-38.
Bai, J., R.L. Lumsdaine and J.H. Stock (1998), Testing for and Dating Breaks in Multivariate Time Series, Review of Economic Studies 65, 395-432.
Balke, N.S. and T.B. Fomby (1997), Threshold Cointegration, International Economic Review 38, 627-645.
Beveridge, S. and C.R. Nelson (1981), A New Approach to the Decomposition of Economic Time Series into Permanent and Transitory Components with Particular Attention to Measurement of the Business Cycle, Journal of Monetary Economics 7, 151-174.
Bierens, H.J. (1994), Topics in Advanced Econometrics: Estimation, Testing and Specification of Cross-Section and Time Series Models, Cambridge University Press.
Bierens, H.J. (1997), Testing the Unit Root with Drift Hypothesis Against Nonlinear Trend Stationarity, with an Application to the US Price Level and Interest Rate, Journal of Econometrics 81, 29-64.
Bierens, H.J. and L.F. Martins (2009), "Appendix: Time Varying Cointegration", http://econ.la.psu.edu/~hbierens/TVCOINT_APPENDIX.PDF

Caner, M. (1998), Tests for Cointegration with Infinite Variance Errors, Journal of Econometrics 86, 155-175.
De Jong, R.M. (2001), Nonlinear Estimation Using Estimated Cointegrating Relations, Journal of Econometrics 101, 109-122.
Engle, R.F. and C.W.J. Granger (1987), Cointegration and Error Correction: Representations, Estimation and Testing, Econometrica 55, 251-276.
Falk, B. and C. Wang (2003), Testing Long Run PPP with Infinite Variance Returns, Journal of Applied Econometrics 18, 471-484.
Granger, C.W.J. (1987), Developments in the Study of Cointegrated Economic Variables, Oxford Bulletin of Economics and Statistics 48, 213-218.
Granger, C.W.J. and G. Yoon (2002), Hidden Cointegration, Working Paper, Department of Economics, UCSD.
Hall, S.G., Z. Psaradakis and M. Sola (1997), Cointegration and Changes in Regime: the Japanese Consumption Function, Journal of Applied Econometrics 12, 151-168.
Hamming, R.W. (1973), Numerical Methods for Scientists and Engineers, Dover.
Hannan, E.J. and B.G. Quinn (1979), The Determination of the Order of an Autoregression, Journal of the Royal Statistical Society B 41, 190-195.
Hansen, B.E. (1992), Tests for Parameter Instability in Regressions with I(1) Processes, Journal of Business and Economic Statistics 10, 321-335.
Hansen, H. and S. Johansen (1999), Some Tests for Parameter Constancy in Cointegrated VAR-Models, Econometrics Journal 2, 306-333.
Hansen, P.R. (2003), Structural Changes in the Cointegrated Vector Autoregressive Model, Journal of Econometrics 114, 261-295.
Hao, K. (1996), Testing for Structural Change in Cointegrated Regression Models: Some Comparisons and Generalizations, Econometric Reviews 15, 401-429.
Harris, D., B. McCabe and S. Leybourne (2002), Stochastic Cointegration: Estimation and Inference, Journal of Econometrics 111, 363-384.
Inoue, A. (1999), Tests of Cointegration Rank with a Trend Break, Journal of Econometrics 90, 215-237.
Johansen, S. (1988), Statistical Analysis of Cointegration Vectors, Journal of Economic Dynamics and Control 12, 231-254.
Johansen, S. (1991), Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models, Econometrica 59, 1551-1580.
Johansen, S. (1995), Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, Oxford University Press, Oxford, UK.
Johansen, S., R. Mosconi and B. Nielsen (2000), Cointegration Analysis in the Presence of Structural Breaks in the Deterministic Trend, Econometrics Journal 3, 216-249.
Juhl, T. and Z. Xiao (2005), Testing for Cointegration using Partially Linear Models, Journal of Econometrics 124, 363-394.
Kuo, B.-S. (1998), Test for Partial Parameter Instability in Regressions with I(1) Processes, Journal of Econometrics 86, 337-368.
Lütkepohl, H., T. Terasvirta and J. Wolters (1999), Investigating Stability and Linearity of a German M1 Money Demand Function, Journal of Applied Econometrics 14, 511-525.
Lütkepohl, H., P. Saikkonen and C. Trenkler (2003), Comparison of Tests for the Cointegrating Rank of a VAR Process with a Structural Shift, Journal of Econometrics 113, 201-229.
Maddala, G.S. and I-M. Kim (1998), Unit Roots, Cointegration and Structural Change, Cambridge University Press.
Michael, P., A.R. Nobay and D.A. Peel (1997), Transactions Costs and Nonlinear Adjustment in Real Exchange Rates: An Empirical Investigation, Journal of Political Economy 105, 862-879.
Park, J.Y. and S.B. Hahn (1999), Cointegrating Regressions with Time Varying Coefficients, Econometric Theory 15, 664-703.
Phillips, P.C.B. (1988), Weak Convergence to the Matrix Stochastic Integral $\int_0^1 B\,dB'$, Journal of Multivariate Analysis 24, 252-264.
Phillips, P.C.B. and S.N. Durlauf (1986), Multiple Time Series Regression with Integrated Processes, Review of Economic Studies 53, 473-496.
Quintos, C.E. (1997), Stability Tests in Error Correction Models, Journal of Econometrics 82, 289-315.
Quintos, C.E. and P.C.B. Phillips (1993), Parameter Constancy in Cointegrating Regressions, Empirical Economics 18, 675-706.
Saikkonen, P. and I. Choi (2004), Cointegrating Smooth Transition Regressions, Econometric Theory 20, 301-340.
Schwarz, G. (1978), Estimating the Dimension of a Model, Annals of Statistics 6, 461-464.
Seo, B. (1998), Tests for Structural Change in Cointegrated Systems, Econometric Theory 14, 222-259.
Terasvirta, T. and A.C. Eliasson (2001), Non-Linear Error Correction and the UK Demand for Broad Money, 1878-1993, Journal of Applied Econometrics 16, 277-288.

Appendix: Proofs

The proofs of Lemmas 2-4 employ the following auxiliary results.

Lemma A.1. Let $\eta_t$ be an arbitrary sequence in $\mathbb{R}^n$, and let F(x) be an arbitrary differentiable function on [0, 1], with derivative f(x). Then

$$\sum_{t=1}^{T} \eta_t F(t/T) = \left( \sum_{t=1}^{T} \eta_t \right) F(1) - \int_0^1 f(x) \left( \sum_{t=1}^{[xT]} \eta_t \right) dx.$$

Proof. See Bierens (1994, Lemma 9.6.3, p. 200).

Lemma A.2. Under Assumptions 1-2,

$$\frac{1}{T} \sum_{t=1}^{T} U_t Y_{t-1}' \xrightarrow{d} \int_0^1 (dW) W' C(1)',$$
$$\frac{1}{T} \sum_{t=1}^{[xT]} (\Delta Y_t) Y_{t-1}' \xrightarrow{d} C(1) \left( \int_0^x (dW) W' \right) C(1)' + x M_0,$$
$$\frac{1}{T} \sum_{t=1}^{T} (\Delta Y_{t-\ell}) Y_{t-1}' \xrightarrow{d} C(1) \left( \int_0^1 (dW) W' \right) C(1)' + M_\ell, \qquad \ell \geq 0,$$

where W is a k-variate standard Wiener process, and the $M_\ell$'s are nonrandom k × k matrices.

Proof. See Phillips and Durlauf (1986) and Phillips (1988).

Lemma A.3. Under Assumptions 1-2 the following probability limits exist:

$$\Sigma_{\beta\beta} = \operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} \beta' Y_{t-1} Y_{t-1}' \beta, \quad \Sigma_{X\beta} = \operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} X_t Y_{t-1}' \beta, \quad \Sigma_{XX} = \operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} X_t X_t'.$$

Moreover, under the additional Assumption 5, $\Sigma_{XX}$ is nonsingular and the matrix $\Sigma^*_{\beta\beta} = \Sigma_{\beta\beta} - \Sigma_{X\beta}' \Sigma_{XX}^{-1} \Sigma_{X\beta}$ is nonsingular. Furthermore, under Assumptions 1, 2 and 5,

$$\Sigma_{\beta,\beta\otimes I_{m+1}} = \operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} \beta' Y_{t-1} Y^{(m)\prime}_{t-1} (\beta \otimes I_{m+1}) = \left( \Sigma_{\beta\beta}, O_{r,rm} \right),$$
$$\Sigma_{X,\beta\otimes I_{m+1}} = \operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} X_t Y^{(m)\prime}_{t-1} (\beta \otimes I_{m+1}) = \left( \Sigma_{X\beta}, O_{k(p-1),rm} \right),$$

and

$$\Sigma_{\beta\otimes I_{m+1},\beta\otimes I_{m+1}} = \operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=1}^{T} (\beta \otimes I_{m+1})' Y^{(m)}_{t-1} Y^{(m)\prime}_{t-1} (\beta \otimes I_{m+1}) = \Sigma_{\beta\beta} \otimes I_{m+1}.$$

Consequently,

$$\Sigma^*_{\beta\otimes I_{m+1},\beta\otimes I_{m+1}} = \Sigma_{\beta\otimes I_{m+1},\beta\otimes I_{m+1}} - \Sigma_{X,\beta\otimes I_{m+1}}' \Sigma_{XX}^{-1} \Sigma_{X,\beta\otimes I_{m+1}} = \begin{pmatrix} \Sigma^*_{\beta\beta} & O_{r,rm} \\ O_{rm,r} & \Sigma_{\beta\beta} \otimes I_m \end{pmatrix}.$$

The latter is a nonsingular matrix.

Proof. See Bierens and Martins (2009).

Proof.T S00.T S00.r (α0⊥ Ωα⊥ )−1/2 α0⊥ ³ ´−1 −1 S00.T ³ ´−1 (0) (0) (0) −1 −1 0 (0) −1 −1 = S00. There exists an orthogonal complement β⊥ of β such that 0 0 β⊥ C(1) = (α⊥ Ωα⊥ ) −1/2 0 α0⊥ C0 .T ξ ξ 0 S10. Lemma 10.T − S00.5. Then Z 1 d −1/2 0 (m) 0 0 f0 (α⊥ Ωα⊥ ) α⊥ S01.Lemma A. See Johansen (1995.T (β ⊗ Im+1 ) → Z (27) 33 .T = −1 −1 Or.7. See Johansen (1995).T 0 = α⊥ (α⊥ Ωα⊥ ) −1 0 α⊥ + op (1). −1 (α0 Ω−1 α) α0 Ω−1 Proof.k−r (α0 Ω−1 α) + Σ∗ (α0 Ω−1 α) α0 Ω−1 ββ Ã ! (α0⊥ Ωα⊥ )−1/2 α0⊥ × + op (1).T − S00.T S01. Let ξ be given by (9).T S00. This is a standard result.4.T ξ ξ 0 S10. Ã !0 Ã ! Ik−r Ok−r.T S00. Let α⊥ be an orthogonal complement of α. 0 Lemma A.T (β⊥ ⊗ Im+1 ) → (dWk−r ) Wk−r.T β β 0 S10. Then under Assumptions 1-5.T S01.T S01.m 0 and √ d −1/2 0 (m) 0 T (α⊥ Ωα⊥ ) α⊥ S01. Let β⊥ be the orthogonal complement of β defined in Lemma A. Lemma A.T β β S10.6.1). Proof. This is a standard result. Lemma A. See for example Johansen (1995). ³ ´−1 (m) (m) −1 (m) (m) −1 −1 −1 NT = S00. Under Assumptions 1-5.6. Let Assumptions 1-5 hold.T S01.

where M is a r × (k − r)(m + 1) random matrix.T S00.9 it follows from Lemma 2 in Andersson et al.4 and A. αΩ α Proof.T (β ⊗ Im+1 ) = (Σββ . Or.8 and A.m.r Σββ ⊗ Im distributed.m → . Combining the results of Lemmas A.T β ⊗ Im+1 . See Bierens and Martins (2009). Ok−r+k.8. β ⊗ Im+1 T ⎛ R1 ⎞ f f0 Wk−r.r Ok−r+k. the k − r columns of Z 0 are independent ∙ µ ∗ ¶¸ Σββ Or. Or. ¡ ¢−1 0 −1 (m) 0 d α0 Ω−1 α α Ω S01.T S01. Lemma A.7. ¢ (m) ¡ ¢ ¡ −1/2 0 0 β⊥ ⊗ Im+1 .r(m+1) O 0 ¶ ⎠ d →⎝ Or.jointly.m.m.r Σββ ⊗ Im Proof. Lemma A.(k−r)(m+1) Or.m (x)dx µ(k−r)(m+1).m ) + op (1). and ¡ 0 −1 ¢−1 0 −1 (m) α Ω S01.r. where Z is a (k − r) × r(m + 1) random matrix. (1983) that 34 .9. Moreover. Under Assumptions 1-5.m Σ∗ ββ Or(m+1). In particular.m Proof. Under Assumptions 1-5. This result follows straightforwardly from Lemmas A.m. β 0 ⊗ Im+1 S11. T −1/2 β⊥ ⊗ Im+1 à ! ³ ´−1 −1 ∗ 0 −1 ∗ ∗ d Σββ (α Ω α) + Σββ Σββ Or. T −1/2 β⊥ ⊗ Im+1 S10. See Bierens and Martins (2009).m (28) Nr(m+1) 0.T T −1/2 β⊥ ⊗ Im+1 .r.k−r+k.k−r+k. ¡ 0 ¢ (m) −1 (m) ¡ ¢ 0 0 β ⊗ Im+1 .m (x)Wk−r.r.T (β⊥ ⊗ Im+1 ) → M.

r(m+1) Wk−r. λk.m (x)Wk−r.T . λr+2. Proof of Lemma 4: To derive the limiting distribution of T (λr+1..T S01.T from converging to a singular matrix.k(m+1)−r Σββ (α Ω α) + Σββ − = 0.T − S10. because otherwise we cannot apply Lemma 35 .. Proof of Lemma 2: As shown in Bierens and Martins (2009). p..T of the generalized eigenvalue problem (5) converge in distribution to the ordered solutions λ1 ≥ λ2 ≥ .. and the non-zero solutions are the solutions of eigenvalue problem µ ¶ ³¡ ´−1 ¢−1 ∗ ∗ 0 −1 ∗ ∗ det λΣββ − Σββ α Ω α + Σββ Σββ = 0 This is the same result as in the standard TI cointegration case! With these results at hand we are now able to prove our main results..m Or(m+1).k(m+1)−r Obviously.159). all but r solutions are zero.r ³ ´ ξ⊥.m (x)dx 0 à !# ³ ´−1 −1 ∗ 0 −1 ∗ ∗ Σββ Or. Ok(m+1)−r.T S00.T ξ⊥.10.λ = Op (1). Let (m) (m) −1 (m) S (λ) = λS11. ≥ λ(m+1)k of ¶ ⎞ ⎡ ⎛ µ ∗ Σββ Or..T ≥ .(k−r)(m+1) ⎠ Or. Lemma 2 follows straightforwardly from Lemmas A. ≥ λ(m+1)k. Under Assumptions 1-5 the ordered solutions λ1.T = β⊥ ⊗ Im+1 .Lemma A.r Ok(m+1)−r.T )0 .T à à !! Ok..T ≥ λ2.r.m.T S11. −1/2 T βΣββ ⊗ Im (29) ρ = T. √ . .T . Proof of Lemma 3: Lemma 3 follows from Lemma A.10. we follow a similar procedure as in Johansen (1995.2.m.1 and A. √ (m) 0 The reason for the factor T in (29) is to prevent T −1 ξ⊥.r Σββ ⊗ Im det ⎣λ ⎝ R1 0 f f O(k−r)(m−1).

T 1 0 (m) (m) (m) 0 = ρ ξ⊥.T S00.159).T ) = det 0 0 0 ξ⊥.T ξ⊥. (0) (0) −1 ξ 0 S (λ) ξ = −β 0 S10.T .r.T S (λ) − S (λ) ξ (ξ S (λ) ξ) ξ S (λ) ξ⊥. (m) (0) 2 in Andersson et al.T S (λ) ξ⊥.T ξ⊥.T ξ⊥.T S11.m 1 (m) 0 ξ⊥. Next.T 1/2 Or.T β = Op (1) and ξ⊥.5.T S01.T ξ = Op (1).T S (λ) − S (λ) ξ (ξ 0 S (λ) ξ) ξ 0 S (λ) ξ⊥. S01. (1983).9 that β 0 S11.r.T S10.T S (λ) ξ ξ⊥.T S10.T − ξ⊥. (0) (m) 0 It follows from Lemma A.T S01. p.T α⊥ (α0⊥ Ωα⊥ ) α⊥ S01.T S (λ) ξ = −ξ⊥.T S00.T S11.where ξ is defined by (9).m (x)dx O(k−r)(m+1). Or.T = Op (1). ξ⊥.9 that ¶ µ I(m+1)(k−r) O(m+1)(k−r).r.T ξ⊥.m 0 → .T β + op (1).T ³ ´ ³ ´ −1 0 = det (ξ 0 S (λ) ξ) det ξ⊥. whereas by assumption. −1/2 Combining these results it follows that ³ ´ −1 0 0 0 ξ⊥. Then µµ 0 ¶ ¶ µ 0 ¶ ξ S (λ) ξ ξ ξ 0 S (λ) ξ⊥.T 1 0 (m) = ρ ξ⊥. Therefore.T + op (1).(m+1)(k−r) Σββ ⊗ Im µ R1 ¶ f f0 d Wk−r.T T −1 0 (m) (m) 0 − ξ⊥.T ξ⊥.m × 1/2 Or.7 that ³ ´ −1 0 0 0 ξ⊥.T NT S01.T β + op (1) and 0 0 −1 ξ⊥.T S11. T (m) where NT is defined in Lemma A. similar to Johansen (1995.m (x)Wk−r.T S10.(k−r)(m+1) Ψ 36 . Since by Lemma A. The reason for the normalization of β by Σββ will become clear below.T S11.T ξ⊥. it follows now from Lemmas A.T + op (1).m.m.T det S (λ) (ξ.m. observe from Lemma A.(m+1)(k−r) Σββ ⊗ Im T µ ¶ I(m+1)(k−r) O(m+1)(k−r).T ξ⊥.7.T S (λ) − S (λ) ξ (ξ S (λ) ξ) ξ S (λ) ξ⊥.5-A. ρ = Op (1).

r.T S10.m (31) (32) Z2 = Z 0 with Z defined in Lemma A.9.r (m) Ψ = p lim S11.r.T ¶ µZ 1 ¶ µ R1 0 f d Wk−r.m (β⊥ ⊗ Im+1 ) → µ d ¶ . µ ¶0 µ ¶ Ok.T → T µ R1 f f0 Wk−r.(k−1)(m+1) Ir.T d f0 (dWk−r ) Wk−r.m dWk−r 0 f0 0 → (dWk−r ) Wk−r.T Z 1 0 and −1/2 0 (α⊥ Ωα⊥ ) → Z2 .m ] distributed.m [0.m (m) 0 = p lim (β ⊗ Im+1 ) S11.T ξ⊥.m Ir.m (m) 0 α⊥ S01.m. and it follows from (31) and (32) that 0 0 0 ξ⊥.m ¶ (33) are independent Nr.r Ok. where ¶ Ok.r. Σββ ⊗ Im ] distributed.T T β ⊗ Im+1 Ir. it follows from Lemma A.7 that −1/2 0 (α⊥ Ωα⊥ ) (m) 0 α⊥ S01. Consequently. V 0 (m) −1 (m) (35) 37 . the columns of ³ ´ −1/2 0 V = Σββ ⊗ Im Z2 (34) µ Or.r.m Or. Ir.T ξ⊥.r.T β ⊗ Im β ⊗ Im T →∞ µ µ ¶0 ¶ Or.m.m.m T →∞ = Σββ ⊗ Im .m . Hence 1 0 d (m) ξ⊥.m.7. (30) Moreover.m 0 Or.m −1/2 0 (m) ¡ 1/2 0 = (α⊥ Ωα⊥ ) α⊥ S01.m (x)dx O(k−1)(m+1). V .T S11.T α⊥ (α⊥ Ωα⊥ ) α⊥ S01.m Ir.r T 1/2 β ⊗ Im µ ¶ ¢ Or.m [0.where by Lemma A. the columns of Z2 are independently Nr. Hence.m (x)Wk−r.T (β ⊗ Im+1 ) Ir.

T 11.T S11. Under Assumptions 1-5.T 01. .Um.T 00. br be the ML estimator of ξ. (m) (m) (m) b ξ ξ S S −1 S bi = λm. r.. are the eigenvectors associated with the r largest eigenvalues ξ b λm..T If we normalize b as ξ then similar to Johansen (1988) we can write ³ ´−1 0 e = b ξ 0b ξξ ξ ξ ξ where ξ⊥..T Moreover.T ξα0 T.UT as ¢−1 0 ξ⊥...T ξ⊥.i S bi . .T Um. i = 1. and the next lemma: Lemma A.T ξα0 = Y UC T t=1 t−1 t 0 Ã !Ã !−1 Ã ! T T T X (m) 0 X X 1 1 1 0 0 − Yt−1 Xt Xt Xt Xt Ut0 C0 . 10..T = ¡ e − ξ = ξ⊥.i . Lemma 2 in Anderson et al.11. where ξ ξ bi .T ξ ξ (ξ 0 ξ) ..T ξ ³ ´ ³ ´−1 0 b ξ 0b ξ⊥.Lemma 4 now follows from (30). i = 1.m . .T S10. (1983). ³ ´ ξ Proof of Lemma 5: Let b = b1 ..T − S11. See Bierens and Martins (2009). (35).T is defined by (29).T = T −1 ξ⊥. r. 1 X (m) 0 0 b(m) b(m) S10.T − S11. V is independent of Wk−r and f Wk−r.T ¡ ¢−1 × Ω−1 α α0 Ω−1 α + op (1). Proof. similar to Johansen (1988) it can be shown that T ³ ´−1 ³ ´ 0 0 b(m) b(m) b(m) ξ⊥. and Um. T t=1 T t=1 T t=1 38 .T ξ⊥. (36) Similar to Johansen (1988) we can expand T.

T ξα0 ´´ ³ ´ ⎠ =⎝ ³ √ ³ −1/2 0 b(m) − S (m) ξα0 b Om. ³ ´ 0 b(m) b(m) ξ⊥. T Σββ β ⊗ Im S10. Or.r.T − S11.7 it follows that ¡ T ´ ¢ 1 X ³ −1/2 0 ¡ ¢−1/2 (m) 0 Om. Moreover. which is independent of Wk−r. ³ ´ 0 b(m) b(m) (β⊥ ⊗ Im+1 ) S10. Ir. Similar to Johansen (1988) it follows now that Z 1 T 1X 0 (m) 0 0 −1 ¡ 0 −1 ¢−1/2 d f (β ⊗ Im+1 ) Yt−1 Ut C0 Ω α α Ω α → Wk−r.3.k .T S10. T 1X 0 (m) 0 (β ⊗ Im+1 ) Yt−1 Ut0 C0 T t=1 ⊥ ! Hence. similar to parts (27) and (28) of Lemma A.k(p−1) T t=1 Ã +op (1).T Ã 1 PT ! (m) 0 0 0 (β⊥ ⊗ Im+1 ) Yt−1 Ut C0 t=1 T ´ P ³ −1/2 0 = (m) 1 0 Σββ β ⊗ Im+1 Yt−1 Ut0 C0 (Om.m .m dW 0α .k .m ) √T T t=1 + op (1).T 11. d 39 .m.T − S11. T ³ ´ √ 1 X 0 (m) 0 b(m) b(m) T (β 0 ⊗ Im+1 ) S10.T − S11.T ξα0 ´ ⎞ ⎛ ³ 0 b(m) b(m) (β⊥ ⊗ Im+1 ) S10.T ξα0 = √ (β ⊗ Im+1 ) Yt−1 Ut0 C0 T t=1 µ 0 ¶ T 1 X ΣXβ Σ−1 0 XX √ − Xt Ut0 C0 + op (1).T − S11.m) √ Σββ β ⊗ Im+1 Yt−1 Ut0 C0 Ω−1 α α0 Ω−1 α T t=1 → V α.r.r.k . Ir. T t=1 ⊥ 0 ¡ ¢−1/2 0 −1 W α = α0 Ω−1 α α Ω C0 W where f is an r-variate standard Wiener process.Thus.T ξα0 = and by Lemma A.

m (x)dx 0 Z 1 ¢−1/2 ¡ f × Wk−r. where Um.T e − ξ ³ ´ e f0. where and ³ ´−1 0 ³ ´−1 0 e = b ξ0b e b 0b ξ ξ ξ ξ ξ. 40 . ³ ´ fm. ln ⎝ (0) det β 0 S11.11 it can f be shown that V α .m dW 0α α0 Ω−1 α .11 on page 224. ³ ´ f0. similar to Lemma A.T S00. where e0 is a k × r matrix and em a k.T S01. β = β β β ββ ⎛ Recall from Lemma 5 that e = ξ + ξ⊥.m .T ) ξ 5 ´ ´⎞ ³ ³ 0 (0) (0) (0) −1 det β S11.T − S10.m×r matrix with independent N[0. it follows from Lemma A.T ξ See also Johansen (1995). It folξ lows from Johansen (1988. Therefore.T S01. µZ 1 ¶−1 ³ ´ d b0 − β → (β⊥ .where V α is an r.T S00.m ) fk−r. Or. (29) and (36) that jointly. Ok.T e = fm.T (ξ + ξ⊥. ln ⎝ 0 (m) det ξ S11.m (x)W 0 f T ξ W k−r. ³ 0 0 ´ Denoting e = e0 . 249) 5 that under the null hypothesis ξ = (β 0 .T β .T .m × r ξ ξ ξ ξ ξ matrix. ³ ´ Proof of Theorem 1: Consider the likelihood-ratio statistic fm. However.T − S10. it follows now from (15).T Um. 1] distributed elements.T = Op (T −1 ) .k.m )0 .T (ξ) = T. 0 ³ ´ √ ¢−1/2 ¡ d −1/2 T em → βΣββ ⊗ Im V α α0 Ω−1 α ξ . ³ ´ fm.m are independent.T (β) = T.T β ⎠.9 that (15) holds. W α and Wk−r.k.T β ³ ³ ´ ´⎞ ⎛ 0 (m) (m) −1 (m) det ξ S11.T Um. p. em . Lemma 7. f which is also independent of Wk−r. equation A.T ξ ⎠.

T S ξ⊥.T S00.T β = −α0 Ω−1 α + op (1).T − S10.T − S10.T ξ⊥.T S01.T ξ = Op (1).T = Op T −1/2 .Um.T S00.T ξ⊥.k denotes the maximum absolute value of its elements.T S01.T (T.T S00.T S00.T Um.T − S10.T ξ⊥.T ¡ ¢ d → trace V 0α V α "µZ ¶−1 ¶ µZ 1 1 f0 fk−r.T ξ⊥. ³ 0 ´−1 ³ 0 ³ ´ ´−1 (0) (0) (0) (0) −1 β S11.T β ³ ´ ´o 0 (m) (m) −1 (m) ×ξ S11.T k3 .T − S10.m dW 0α 41 ¶¸ 0 .m (x)dx × µZ 0 1 0 and by Johansen (1995.1).T β ξ S11.T S11. f Wk−r.T ξ = Op (1).T Um.T e − f0.T = Op T −1 .Um.T − S10. ξ⊥.T ξ⊥.T S11.T ξ⊥. Since ¡ ¢ ¡ ¢ Um.T −T.m (x)W 0 f + trace dW α Wk−r.T S00.T S01.T S11.m W k−r.T ½³ ´−1 ³ 0 (0) (m) 0 0 UT ξ⊥. Lemma 10. 0 ξ⊥.T S01.T S00.T ) + op (1) T 11.T S11.T β ¾ ³ 0 ´−1 0 (m) (0) (m) 0 0 −Um.T S00.T UT ³ ´ (m) (m) −1 (m) 0 0 −Um.trace β S11.T ξ β S11. (m) ½³ ³ ´ ´−1 0 (0) (0) (0) −1 = f0. kξ⊥.T ξ⊥.T − S10.T Um.T ξ ´ ´−1 ³ 0³ (0) (0) (0) −1 × β S11.T S01.T Um.T Um.T S11. ³ ´ (m) (m) −1 (m) 0 ξ⊥.where for a matrix the norm k.T ¡ ¢ +O T. it follows now from (30) and Lemma 5 that ³ ´ ξ fm.T S01.T S01.T − S10.T S11.T ξ⊥.trace β S11.T β − β S11.T (β) ∙ µ ¶ ¸ ¡ 0 −1 ¢ ¡ ¢ 0 1 (m) 0 = trace α Ω α T.T β ³ ³ ´ (m) (m) −1 (m) 0 0 × Um.T (β) + T.

with V α = (α0 Ω−1 α) α0 Ω−1/2 W.and similarly. where Y ∈ Rp . p XX ΣXY ΣXX 42 µZ 1 0 f Wk−r. page 192) has shown that. (38) × f because W α and Wk−r.m (x)Wk−r.m. Thus. which is distributed as (α0 Ω−1 α) V α with V α as in Johansen(1995).T (β) "µZ ¶ µZ 1 d 0 → trace dW α Wk−r 0 1 0 Wk−r (x)Wk−r (x)dx 0 × µZ 1 Wk−r dW 0α 0 ¶¸ ¶−1 . X ∈ Rq and X ¶ µ ΣY Y ΣY X Σ= . Then the difference of (38) and 2 (37) is χr. ³ ´ e f0. Σ] . "µZ ¶ µZ 1 ¶−1 1 0 0 trace dW α Wk−r Wk−r (x)Wk−r (x)dx (37) 0 × Similarly.m are independent. det(Σ) > 0. which follows from the following easy result: µ ¶ Y If Z = ∼ Np+q [0.m dW 0α ¶¸ 0 ∼ χ2 r(m+1)(k−r) .(k−r) distributed. " µZ 1 ¶−1 ¶ µZ 1 ¡ 0 −1 ¢ 0 0 dV α Wk−r Wk−r (x)Wk−r (x)dx trace α Ω α × µZ 0 1 Wk−r dV α −1/2 0 0 ¶¸ 0 ∼ χ2 r(k−r) In our notation. −1 Johansen (1995. .T β − f0. W α = (α0 Ω−1 α) α0 Ω−1 C0 W is a r-variate standard 1/2 Wiener process.m Wk−r. then Z 0 Σ−1 Z − X 0 Σ−1 X ∼ χ2 . it follows that "µZ ¶ µZ 1 ¶−1 1 0 0 f f f dW α Wk−r.m (x)dx trace 0 µZ 1 Wk−r dW 0α 0 ¶¸ 0 ∼ χ2 r(k−r) .

t−j + ∆U1.j (τ ) and let Z2.j ((t − 1)/T )∆Z2. j) of B2 (t/T ) with derivative b02.j.i. (39) r.j (t/T )Z2.T be the matrix with elements Ψi.t−1 for some λt.T ∈ [0. the likelihood-ratio statistic ³ ³ ´ ³ ³ ´ ´ b e − f0 (β) − T f0 β − f0 (β) converges in distribution to (39) b b e b T f1 ξ plus (38) minus (37).j (τ ) be element (i. let b2.j.t−1 / T = Op (1). since V α is a r.t−1 .t−1 ³ √ ´ = B2 (t/T )∆Z2.m .j.t−i−j + ∞ X j=0 ∞ X j=t p−1 X j=1 C12.j (t/T ) − b2.i. 1].t−1 /T + B2 (t/T )∆Z2.i.i.t−1−j 43 .t−j + ∆ (B2 (t/T ) Z2. 1] distributed elements.i.t−1 ) = (b2. (39) is independent of f (37) and´(38). Denote by Ψt. it follows that ¡ ¢ trace V 0α V α ∼ χ2 .t−1 + Op 1/ T (40) where the latter follows from the fact that Ψt.t−1 + b2.t−1 = b02.i. Then ∆ (B2 (t/T )Z2.i. Hence.t−1 ) + Πj ∆ (B2 ((t − j)/T ) Z2. conditional on Wk−r.T )/T ).j.T Z2. mkr Proof of Lemma 6: To prove (20).i.j ((t − 1)/T )) Z2. since V α and W α are independent.T = b02.i ∞ X j=0 + = t−1 X j=0 p−1 X i=1 Πj ∆2 Z2.i.j ∆2 Z2. resulting in a χ2 distribution. ∆ (b2.j.j ((t − λt.j.j.Moreover.i.i.i.j ((t − λt.t−j Πj (B2 ((t − j)/T ) − B2 (0)) ∆Z2.T is uniformly bounded and √ that Z2. observe from (16) and (40) that ∆Z1.j. Then by the mean value theorem.r Furthermore.t−1 ) = Ψt.j ((t − 1)/T )∆Z2.t = = p X j=1 t−1 X j=0 Dj ∆Z1.j.t−1−j Πj ∆U1.t−1 be component j of Z2.j.T )/T )Z2.t. Next.m.t−1−j ) + C12.t Πj B2 (0) ∆Z2.m × r matrix with independent N[0.t−1 /T + b2.

i Vt−i − ³ √ ´ Qj (B2 ((t − j)/T ) − B2 (0)) ∆Z2. 44 .j ∆Z2.t + Op 1/ T Rt = Vt − C11.t−1 + B2 (t/T ) Z2. p−1 X i=1 ∞ X j=0 ∞ X j=0 (41) Πj B2 (0) ∆Z2. it follows from (17) and (41) that B1 Z1.j ∆Z1.t − C11.t−j − U1.t−i−j−1 p−1 X j=1 + Vt − = where t−1 X j=0 C11.t−j − U1.t−1−j C11.where Vt = ∞ X j=0 ³ √ ´ +Vt + Op 1/ T .t−1−j + C12. p−1 X i=1 ³ √ ´ C12.t .t−j − C12.i Πj ∆2 Z2.t−1−j + Rt + Op 1/ T .t−j .i Vt−i − p−1 X j=1 C12.t−i−j + Πj ∆U1.j ∆Z2. Finally. This proves (20).i X i=1 p−1 t−1−i X j=0 Πj (B2 ((t − j − i)/T ) − B2 (0)) ∆Z2.t−1 p−1 p−1 X X = ∆Z1.j ∆Z2.t−j − U1.t j=1 j=1 = − p−1 X i=1 t−1 X j=0 Πj (B2 ((t − j)/T ) − B2 (0)) ∆Z2.
