Regime Switching for Dynamic Correlations∗

Denis Pelletier †
North Carolina State University and Université de Montréal
First version: March 2002
This version: May 2003
Compiled: 19th May 2003

∗ I thank my co-advisor Nour Meddahi for his extensive advice. I also thank Peter Christoffersen, William McCausland, Jean-Marie Dufour, Doug Pearce, James McKinnon, Lynda Khalaf and Todd Smith for several useful comments, and seminar participants at Carleton University, NCSU, Queen’s University, University of Alberta, UBC, Bank of Canada, LSU and Texas A&M for useful discussions. Financial support by the Social Sciences and Humanities Research Council of Canada, the Fonds FCAR (Government of Québec), CIRANO and CRDE is gratefully acknowledged.
† Centre interuniversitaire de recherche en économie quantitative (CIREQ), Centre interuniversitaire de recherche en analyse des organisations (CIRANO), and Département de sciences économiques, Université de Montréal. Mailing address: Département de sciences économiques, Université de Montréal, C.P. 6128 succursale Centre-ville, Montréal, Québec, Canada H3C 3J7. e-mail: denis.pelletier@umontreal.ca.

ABSTRACT
In this work we propose a new model for the variance between multiple time series where we decompose the covariances into correlations and standard deviations. Both the correlations and the standard deviations are dynamic. For the correlation matrix we propose a regime switching model driven by an unobserved state variable which follows a Markov chain. Our model has the interesting property of having constant correlations within each regime but still having dynamic correlations because of the regime switching. This property can have important impacts, notably for the computation of Value at Risk. We also present a restricted version of our model where a single factor is driving the changes in the correlation matrix. This regime switching model for the correlations, when combined with the ARMACH model for the standard deviations, allows explicit computation of multi-step ahead conditional expectations of the whole variance matrix. The maximization of the likelihood can be performed through a two-step procedure where the number of parameters in every nonlinear optimization is not a function of the number of time series. An application of this model to exchange rate time series is also presented.
Key words: dynamic correlation, regime switching, factor model, Markov chain, ARMACH.
Journal of Economic Literature Classification: C32, C53, G0, G1.


Contents
List of Definitions, Propositions and Theorems
1. Introduction
2. The Model
2.1. Regime switching for the correlations
2.2. A restricted model
2.3. Univariate volatility models
2.4. Review of multivariate GARCH models
3. Estimation
3.1. One-Step Estimation
3.2. Two-Step Estimation
4. Multi-Step Ahead Conditional Expectations
5. Application to exchange rate data
5.1. Switching regime model with two states
5.2. Switching regime model with three states
5.3. DCC-GARCH
5.4. Series associated to the Markov chain
6. Conclusion
A. Proofs
B. Moments of the ARMACH model

List of Definitions, Propositions and Theorems
2.1 Proposition: PSD variance matrix
Proof of Proposition 2.1
3.1 Theorem: One-step maximum likelihood estimation
Proof of Theorem 3.1
3.2 Theorem: Two-step maximum likelihood estimation
3.3 Theorem: Two-step efficient maximum likelihood estimation
Proof of Theorem 3.2

List of Tables
1 Moments of the absolute value of a standard normal variable
2 Unrestricted model with two states and ARMACH
3 Unrestricted model with two states and GARCH
4 Restricted model with two states and ARMACH
5 Restricted model with two states and GARCH
6 Unrestricted model with three states and ARMACH
7 Unrestricted model with three states and GARCH
8 Restricted model with three states and ARMACH
9 Restricted model with three states and GARCH
10 DCC-GARCH(1,1)
11 DCC-ARMACH(1,1)
12 Likelihood comparison

List of Figures
1 Exchange rate series
2 ACF of the cross-product of the standardized residuals
3 ACF of the cross-product of the standardized residuals (regime switching)
4 ACF of the cross-product of the standardized residuals (DCC-GARCH)
5 Smoothed probabilities for the models with two states and ARMACH
6 Smoothed correlations for the models with two states and ARMACH
7 Smoothed probabilities for the models with three states and ARMACH
8 Smoothed correlations for the models with three states and ARMACH
9 Correlations for the DCC-ARMACH(1,1)
10 Conditional variance from a GARCH(1,1) for the return on the Dow Jones index


1. Introduction
It is well known that the variance of most financial time series is time-varying. A lot of work has been done to model univariate financial time series since the introduction of the ARCH model by Engle (1982). Modeling time-varying variance is not just a statistical exercise where someone tries to increase the value of the likelihood; it has important impacts in terms of asset allocation, asset pricing, and the computation of Value at Risk (VaR). But univariate time series models are not enough. For example, to compute the variance of a portfolio of assets we need not only the variance of each asset but also the covariance between each pair of assets. Just like the variances, the covariances are also time-varying. But we face additional problems when we try to write a multivariate model of volatility. Not only do we need the variances to be positive, we also need the variance matrix to be positive semi-definite (PSD) at every point in time. Another important problem is the curse of dimensionality. We want models that can be applied to more than a few time series. This rules out most multivariate GARCH models such as the BEKK model of Engle and Kroner (1995). The most popular multivariate volatility model so far is certainly the constant conditional correlation model of Bollerslev (1990). In this model, the variance of a vector of returns is decomposed into standard deviations and correlations. The major hypothesis in this model is that the conditional correlations are constant through time. With this hypothesis it is easy to get PSD variance matrices because we only have to ensure that the correlation matrix is PSD and that the standard deviations are non-negative. It also breaks the curse of dimensionality because the likelihood can be seen as a set of SURE equations, i.e. a two-step estimation procedure where univariate volatility models are estimated in a first step will yield consistent estimates.
To test the hypothesis that the correlations are constant, Bollerslev computed a series of portmanteau test statistics (Ljung-Box tests) on the standardized residuals from the univariate volatility models (GARCH). The test statistics being small for the cross-product of the standardized residuals, Bollerslev did not reject the hypothesis of constant correlations. But since then we have learned that these tests have low power (e.g. see Hong (1996)), so the non-rejection of the hypothesis of i.i.d. standardized innovations could simply be due to the low power of this test. We illustrate this with a small Monte Carlo experiment later on. There are two reasons why the covariances could be time-varying. Since a covariance is the product of a correlation and two standard deviations, it could be time-varying because the standard deviations or the correlations are time-varying. This is an additional reason to study models with time-varying correlations. Recent results on realized volatilities from high frequency data by Andersen, Bollerslev, Diebold, and Labys (2001) also indicate that correlations are not constant through time.

Recently, Tse and Tsui (2002) and Engle (2002) introduced multivariate GARCH models with time-varying correlations which have interesting properties. These models use the same decomposition for the variance matrix as Bollerslev (1990) but instead of taking constant correlations they propose a GARCH-type dynamic for the correlations. In these models, it is easy to constrain the variance matrices to be PSD and, using an argument presented in Engle (2002), the parameters can be consistently estimated in a two-step procedure even though the correlations are not constant. Because the correlations must lie between -1 and 1, these GARCH-type models for the correlations must include a rescaling that introduces non-linearities. One side-effect of this rescaling is that we can’t explicitly compute multi-step ahead conditional expectations of the correlation and variance matrices. We can also ask ourselves if a GARCH-type model is appropriate for the correlations because the dynamic of a correlation can be intrinsically different from the behavior of a variance, e.g. a correlation is bounded from below and above.

Another approach for breaking the curse of dimensionality of the multivariate GARCH is Ledoit, Santa-Clara, and Wolf (2001), who propose a flexible estimation procedure for the Diagonal-Vech model of Bollerslev, Engle, and Wooldridge (1988). The maximization of the likelihood of this model is not computationally feasible if the number of time series is greater than five [see Ding and Engle (2001)]. They propose a way to combine the estimates from univariate and bivariate models so as to get consistent estimates of the parameters of the full multivariate Diagonal-Vech and ensure that the variance matrices are PSD. This procedure is only valid for the somewhat restrictive Diagonal-Vech model.

In this work we propose a multivariate volatility model where the covariances are decomposed into correlations and standard deviations. The correlations follow a regime switching model driven by an unobserved Markov chain. The correlation matrix is constant in each regime and potentially different across regimes. Dynamic correlations can be obtained by transiting among the different regimes. We also present a restricted version of this model where the change in the correlations between the various states is driven by a single factor. This model has many interesting properties. First, it is easy to impose that the variance matrices are PSD. Second, it does not suffer from a curse of dimensionality because it can be estimated with a two-step procedure. Third, when combined with the ARMACH model [see Taylor (1986)] for the standard deviations this correlation model allows explicit computations of multi-step ahead conditional expectations of the whole variance matrix. An empirical application to exchange rate time series illustrates that it fits financial time series well and can produce smoother paths for the correlations than the GARCH-type models of Tse and Tsui (2002) and Engle (2002), which appears to be a better description of these series.

The paper is organized as follows. The second section presents the model and its properties. Section three describes the estimation of this model and the theoretical properties of the estimates. Section four outlines the computation of one-step and multi-step ahead conditional expectations of the variance matrix. Section five presents an application of the model to multiple exchange rate series. Section six contains a few concluding remarks.
Finally, proofs are in the appendix.

2. The Model
Assume that the K-variate process $Y_t$ has the form:
$$Y_t = H_t^{1/2} U_t \tag{2.1}$$
where $U_t$ is an i.i.d. $(0, I_K)$ process. The series $Y_t$ could be a filtered process. The time varying covariance matrix $H_t$ can be decomposed into:
$$H_t \equiv S_t \Gamma_t S_t \tag{2.2}$$
where $S_t$ is a diagonal matrix composed of the standard deviations $s_{i,t}$, $i = 1, \ldots, K$, and the matrix $\Gamma_t$ contains the correlations. Both $S_t$ and $\Gamma_t$ are time varying. This decomposition of the covariance matrix has previously been used by Bollerslev (1990), Tse and Tsui (2002), Engle (2002) and Barnard, McCulloch, and Meng (1999). This is the first building block of our model: to model the full covariance matrix we model the variances and the correlations separately. With this decomposition the log-likelihood can be written
$$\begin{aligned}
L &= -\frac{1}{2}\sum_{t=1}^{T}\left( K\log(2\pi) + \log|H_t| + Y_t' H_t^{-1} Y_t \right) \\
  &= -\frac{1}{2}\sum_{t=1}^{T}\left( K\log(2\pi) + \log|S_t \Gamma_t S_t| + Y_t' S_t^{-1} \Gamma_t^{-1} S_t^{-1} Y_t \right) \\
  &= -\frac{1}{2}\sum_{t=1}^{T}\left( K\log(2\pi) + 2\log|S_t| + \log|\Gamma_t| + \tilde{U}_t' \Gamma_t^{-1} \tilde{U}_t \right)
\end{aligned} \tag{2.3}$$
where $\tilde{U}_t = S_t^{-1} Y_t = [\tilde{u}_{1,t}, \ldots, \tilde{u}_{K,t}]'$ is a zero-mean process with covariance matrix $\Gamma_t$.

2.1. Regime switching for the correlations
In this work we will argue for a regime switching model for the correlations. This can be seen as a midpoint between the model with constant conditional correlations of Bollerslev (1990) and models such as the dynamic conditional correlations of Engle (2002) where the correlations change every period. This model will have the appealing property of constant correlations within a regime but will still have dynamic correlations because of the regime
switching. More specifically, the time-varying correlation matrix $\Gamma_t$ follows:
$$\Gamma_t = \sum_{n=1}^{N} \mathbb{1}_{\{\Delta_t = n\}} \Gamma_n \tag{2.4}$$
with $\Delta_t$ a random variable governed by a first order Markov chain independent of $U_t$ which can take $N$ possible values ($\Delta_t = 1, 2, \ldots, N$). The symbol $\mathbb{1}$ is the indicator function. The $K \times K$ matrices $\Gamma_i$ are correlation matrices (symmetric, PSD, ones on the diagonal, off-diagonal elements between -1 and 1) with $\Gamma_i \neq \Gamma_j$ for $i \neq j$. The probability law governing $\Delta_t$ is defined by its transition probability matrix, denoted by $\Pi$. The probability of going from state $i$ in period $t$ to state $j$ in period $t + 1$ is denoted by $\pi_{i,j}$ and the limiting probability of being in state $i$ is $\pi_i$. The element on row $j$ and column $i$ of $\Pi$ is $\pi_{i,j}$. We make the standard assumptions on the Markov chain (aperiodic, irreducible and ergodic).

Besides its very intuitive interpretation this model has many appealing properties. It is easy to impose that $\Gamma_t$ is a correlation matrix because we only have to impose it for every $\Gamma_i$, $i = 1, \ldots, N$. Note that imposing that the diagonal elements are equal to one and that the off-diagonal elements are in $[-1, 1]$ does not guarantee that $\Gamma_i$ is PSD. One way to impose that $\Gamma_i$ will be a correlation matrix is to take its Choleski decomposition, i.e. $\Gamma_i = P_i P_i'$ where $P_i$ is a lower triangular matrix, and to impose constraints on $P_i$ so that we get ones on the diagonal. Consider a trivariate example:
$$\Gamma =
\begin{bmatrix} p_{1,1} & 0 & 0 \\ p_{2,1} & p_{2,2} & 0 \\ p_{3,1} & p_{3,2} & p_{3,3} \end{bmatrix}
\begin{bmatrix} p_{1,1} & p_{2,1} & p_{3,1} \\ 0 & p_{2,2} & p_{3,2} \\ 0 & 0 & p_{3,3} \end{bmatrix}
=
\begin{bmatrix}
p_{1,1}^2 & p_{1,1} p_{2,1} & p_{1,1} p_{3,1} \\
p_{1,1} p_{2,1} & p_{2,1}^2 + p_{2,2}^2 & p_{2,1} p_{3,1} + p_{2,2} p_{3,2} \\
p_{1,1} p_{3,1} & p_{2,1} p_{3,1} + p_{2,2} p_{3,2} & p_{3,1}^2 + p_{3,2}^2 + p_{3,3}^2
\end{bmatrix}.$$
Imposing that the elements on the diagonal of the Choleski decomposition are positive, the restrictions are
$$p_{j,j} = \sqrt{1 - \sum_{i=1}^{j-1} p_{j,i}^2} \qquad (j = 1, \ldots, K) \tag{2.5}$$
where the sum is zero for $j = 1$. Equation (2.5) is restricting elements $p_{j,i}$, $i = 1, \ldots, j - 1$, to be inside a sphere of unit radius and these restrictions are easy to impose. These constraints will automatically give off-diagonal elements between −1 and 1.

We could think that estimation of this model would be complicated by the possibly high number of parameters coming from each $\Gamma_i$. Fortunately we will see later on that we
can use the EM algorithm (Dempster, Laird, and Rubin (1977)) as presented in Hamilton (1994, chapter 22) so increasing the number of time series the model is applied to will not complicate the estimation.

This specification has two additional interesting properties. The first is that because this model for the correlations is basically linear we are able to compute multi-step ahead conditional expectations of the correlation matrix. Also, if we use an appropriate model for the standard deviations we will also be able to perform these computations for the whole variance matrix. We present this model in section 2.3. This is in contrast to the models of Tse and Tsui (2002) and Engle (2002) where the rescaling that is used to keep the correlations between -1 and 1 introduces non-linearities that forbid the computation of multi-step ahead conditional expectations.

The second property comes from the Markov chain. If there is some general form of persistence in the chain (high probability of staying in a given state for more than one period) then this will lead to smooth time-varying correlations. This can have very important impacts, notably for the computation of Value at Risk (VaR), because the diversification gains of a portfolio would be less volatile. It can also have important impacts for dynamic portfolio allocations.

The idea of regime switching for the correlation matrix has been used by Ang and Chen (2002) to explain the asymmetric correlations of equity portfolios. The correlations between a U.S. equity portfolio and the U.S. market conditional on a fall of the two appear to be higher than what a Gaussian model with constant correlation would imply. A model where we switch between states with a different correlation matrix could generate these higher correlations conditional on a joint downside move.

2.2. A restricted model
We next present a restricted version of the general regime switching model which will have a reduced number of parameters and will remain easy to estimate. For the matrix $\Gamma_t$ we propose the following form:
$$\Gamma_t = \Gamma\, \lambda(\Delta_t) + I_K \left(1 - \lambda(\Delta_t)\right) \tag{2.6}$$
where $\Gamma$ is a fixed correlation matrix, $I_K$ is the $K \times K$ identity matrix, and $\lambda(\Delta_t) \in [0, 1]$ is a univariate random process governed by a first order Markov chain $\Delta_t$ that can take $N$ possible values ($\Delta_t = 1, 2, \ldots, N$) and is independent of $U_t$. The probability law governing $\Delta_t$ is defined by its transition probability matrix, denoted by $\Pi$. The correlation matrix at time $t$ is a weighted average of two extreme states of the world. In one state, the returns are uncorrelated and in the other the returns are (highly) correlated. We then have regimes of generally higher or lower correlations and the changes across returns are proportional. The variable $\lambda(\Delta_t)$ can be related to the notion of common features and factor models [Engle and Susmel (1993), Bollerslev and Engle (1993), King,
Sentana, and Wadhwani (1994), Diebold and Nerlove (1989), Engle, Ng, and Rothschild (1990), Engle, Ng, and Rothschild (1992)] where the factor affects the variance matrix instead of the correlation matrix.

Note that for the off-diagonal elements only the product of $\Gamma$ and $\lambda$ can be identified (by construction the diagonal elements of $\Gamma_t$ are equal to 1). To solve this identification problem we can consider two natural sets of constraints. The first is:
$$\lambda(1) = 1, \quad \lambda(1) > \lambda(2), \quad \lambda(2) > \lambda(3), \quad \ldots, \quad \lambda(N-1) > \lambda(N). \tag{2.7}$$
In this case, fixing one of the $\lambda(i)$ to be one identifies the product of $\Gamma$ and $\lambda$. We also restrict the $\lambda(i)$’s to be a decreasing sequence to remove the possibility of relabelling state $i$ as state $j$ and vice versa. A second set of constraints is:
$$1 > \lambda(1), \quad \lambda(1) > \lambda(2), \quad \ldots, \quad \lambda(N-1) > \lambda(N) \quad \text{with} \quad \max_{i \neq j} |\Gamma_{i,j}| = 1. \tag{2.8}$$
In this case instead of fixing the highest value of $\lambda(i)$ to be one, we impose this restriction on an off-diagonal element of $\Gamma$. The second set of constraints does not impose that one correlation is equal to 1 or -1 because we multiply $\Gamma$ by $\lambda(\Delta_t)$. Depending on the estimation scheme that we use, one of the two sets of constraints will be more appropriate.

We can prove that the matrix $H_t$ is positive semi-definite with probability one for all $t$.

Proposition 2.1 PSD VARIANCE MATRIX. If the standard deviations $s_{i,t}$ are nonnegative with probability one for all $t$, $\lambda(i) \in [0, 1]$ for $i = 1, \ldots, N$ and $\Gamma$ is a PSD correlation matrix then the variance matrix $H_t$ will be positive semi-definite with probability one for all $t$.

PROOF OF PROPOSITION 2.1 We can first state that $\Gamma_t$ is positive semi-definite for all $t$. To prove this, consider a vector $c = [c_1, \ldots, c_K]' \in \mathbb{R}^K$:
$$c' \Gamma_t c = c' \Gamma c\, \lambda(\Delta_t) + c' c \left(1 - \lambda(\Delta_t)\right) \geq 0$$
because $\Gamma$ is PSD and $\lambda(\Delta_t) \in [0, 1]$. If the standard deviations are non-negative then the product $S_t \Gamma_t S_t$, i.e. the variance matrix $H_t$, will also be PSD.

It is tempting to allow $\lambda(\Delta_t)$ to take negative values so as to allow the correlations to change sign. But so far we don’t have a result for a lower bound on $\lambda(\Delta_t)$ that would guarantee that $\Gamma_t$ is PSD. To understand the problem, consider the correlation matrix of a trivariate time series. If all the correlations are 0.99 then the correlation matrix is PSD. But if the correlations are all −0.99 then it will not be PSD.

2.3. Univariate volatility models
The most common model that is used to model the volatility of univariate processes is certainly the GARCH model of Bollerslev (1986) where the conditional variance at time $t$, $h_t$, is a linear function of past squared innovations and past conditional variances:
$$h_t = \omega + \sum_{i=1}^{q} \alpha_i y_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j}. \tag{2.9}$$
But we should notice that our model is not written in terms of variances but in terms of standard deviations: a covariance is a correlation times the standard deviations. By using a model such as the GARCH for the variance, the covariance becomes the product of a correlation and the square-root of the product of two variances. The square-root introduces non-linearities that will prohibit analytic computation of conditional expectations.

One model for the volatility of univariate time series that would not have this problem is the GARCH in absolute innovations of Taylor (1986) and Schwert (1989). This class of model is also referred to as ARMACH process in Taylor (1986). In these models the conditional standard deviation follows:
$$s_t = \omega + \sum_{i=1}^{q} \tilde{\alpha}_i |y_{t-i}| + \sum_{j=1}^{p} \beta_j s_{t-j} \tag{2.10}$$
with $\tilde{\alpha}_i = \alpha_i / E|\tilde{u}_t|$. The conditional standard deviations (instead of the conditional variance) are a recursive function of the absolute value of past innovations (instead of squared innovations).

There are numerous reasons why a volatility model based on the absolute value instead of the square of the innovations could be a good thing. One reason can be linked to the least absolute deviations versus least squares approach. As argued by Davidian and Carroll (1987), the model could be more robust if we use the absolute value instead of the square of the innovation. But we must recognize that the interpretation of an outlier in a volatility model is not as straightforward as in the least squares/regression setup. It could also be that the absolute return is a better measure of risk than the squared return. This question is studied by Granger and Ding (1993).

Using the ARMACH model for the volatility of univariate time series is not a prerequisite of our model. We consider this model because it allows the computation of multi-step ahead conditional expectations of the variance matrix. If conditional expectations are not a point of interest or if the ARMACH gives a clearly inferior fit of the data then another model could be used.
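
To make the contrast between (2.9) and (2.10) concrete, here is a small sketch of both recursions for the (1,1) case. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the innovations are assumed standard normal so that $E|\tilde{u}_t| = \sqrt{2/\pi}$, and the starting values `h0` and `s0` are supplied by the user. The forecast helper uses the fact that, when $y_t = s_t \tilde{u}_t$ with $\tilde{u}_t$ i.i.d., $E_t[\tilde{\alpha} |y_{t+1}|] = \alpha s_{t+1}$, so expected standard deviations obey a linear recursion.

```python
import math

# E|u| for a standard normal u; normality is an assumption of this sketch.
E_ABS_NORMAL = math.sqrt(2.0 / math.pi)

def garch11_variance(y, omega, alpha, beta, h0):
    """h_t = omega + alpha * y_{t-1}^2 + beta * h_{t-1}: (2.9) with p = q = 1."""
    h = [h0]
    for t in range(1, len(y) + 1):
        h.append(omega + alpha * y[t - 1] ** 2 + beta * h[t - 1])
    return h

def armach11_std(y, omega, alpha, beta, s0):
    """s_t = omega + alpha_tilde * |y_{t-1}| + beta * s_{t-1}: (2.10) with p = q = 1."""
    alpha_tilde = alpha / E_ABS_NORMAL
    s = [s0]
    for t in range(1, len(y) + 1):
        s.append(omega + alpha_tilde * abs(y[t - 1]) + beta * s[t - 1])
    return s

def armach11_forecast(s_next, omega, alpha, beta, steps):
    """Expected standard deviations: f_h = omega + (alpha + beta) * f_{h-1}."""
    f = [s_next]
    for _ in range(steps):
        f.append(omega + (alpha + beta) * f[-1])
    return f

# Illustrative (made-up) returns and parameter values.
y = [0.5, -1.0, 2.0]
h = garch11_variance(y, omega=0.05, alpha=0.1, beta=0.85, h0=1.0)
s = armach11_std(y, omega=0.05, alpha=0.1, beta=0.85, s0=1.0)
f = armach11_forecast(s[-1], omega=0.05, alpha=0.1, beta=0.85, steps=5)
```

Each returned list has one more element than `y`: the last entry is the one-step ahead value implied by the final observation. No comparable linear forecast recursion is available for the standard deviation under GARCH, because of the square root.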

2.4. Review of multivariate GARCH models
The most straightforward multivariate generalization of the univariate GARCH model can be written in the following way:
$$H_t = C + \sum_{i=1}^{p} A_i Y_{t-i} Y_{t-i}' + \sum_{j=1}^{q} B_j H_{t-j}. \tag{2.11}$$
This model is not really useful because it is very hard to impose that the matrices $H_t$ are PSD, it is not parsimonious and it is hard to estimate because of the high number of parameters. Engle and Kroner (1995) propose the BEKK representation which guarantees that (2.11) will generate PSD variance matrices but the problem of simultaneous estimation of a high number of parameters is not solved.

The most popular multivariate variance model is certainly the constant conditional correlations model of Bollerslev (1990). As its name says the correlations are constant, i.e. in equation (2.3) we have $\Gamma_t = \Gamma$, $\forall t$. Standard univariate GARCH models are used for the conditional variance. This model has many attractive properties. It is easy to impose that the variance matrix is PSD since we only have to take $\Gamma$ to be PSD. The interpretation of this model is easy because of the decomposition of the variance matrix into a matrix of correlations and a diagonal matrix of standard deviations. The model is also easy to estimate because we can maximize univariate GARCH likelihoods in a first step and then get an estimate of $\Gamma$ from the correlation matrix of the standardized residuals.

To test the hypothesis that the conditional correlations are constant, Bollerslev computed Portmanteau test statistics with the standardized residuals from the univariate GARCH estimations. Under the null hypothesis of constant conditional correlations the cross-product of the standardized residuals from the univariate GARCH estimations should be i.i.d. Given the low value of these (Ljung-Box) tests he did not reject the null hypothesis. The same argument is used by Baillie and Bollerslev (1990). Another paper which favors constant correlations is Schwert and Seguin (1990), who tried several specifications of the multivariate GARCH model (2.11) for monthly stock returns and could not find one that obviously dominated the constant conditional correlations model. They don’t mention which model they tried.

Since then, questions have been raised about the power of these tests [e.g. see Hong (1996)]. To illustrate the lack of power we repeat the work of Bollerslev by fitting a GARCH(1,1) to the four exchange rate series that we will later use in section 5 and compute the autocorrelation function for each cross-product of the standardized residuals, just like Bollerslev (1990) did. These are plotted in Figure 2 with the two standard deviations confidence band. Looking at these we are tempted to conclude that there is no dynamic in the cross-product of the standardized innovations. We can run simple Monte Carlo simulations to illustrate that the conclusion of constant
correlations could be erroneous. We simulate a multivariate volatility model with a strong dynamic in the correlations, estimate the univariate volatility model and plot the ACF of the cross-product of the standardized residuals. In our example we simulated two models: our restricted model (Figure 3) and the DCC-GARCH (Figure 4) of Engle (2002) which we review below. The parameter values are the estimates obtained with the same exchange rates dataset. Looking at the two figures, we see that the results from the simulated sample and the true data are similar. A conclusion of this simulation is that unless we directly take a model with dynamic correlations we can be under the impression that they are constant. One interpretation of this experiment is that the standardized residuals are very noisy estimates of the standardized innovations and that this noise is hiding the dynamic in the correlations. This is certainly a reason that would explain why there is very little evidence in the literature that the conditional correlations are not constant. One piece of evidence is Andersen, Bollerslev, Diebold, and Labys (2001), who give strong evidence of important dynamics in the correlations by studying realized volatilities of exchange rates data.

More recently, Tse and Tsui (2002) and Engle (2002) introduced multivariate GARCH models with dynamic correlations. Both of them employ the $S_t \Gamma_t S_t$ decomposition of the variance matrix $H_t$. The intuition behind these models is to impose a GARCH-type dynamic for the correlations. In Engle (2002) the conditional correlation matrix $\Gamma_t$ follows
$$\tilde{\Gamma}_t = \Big(1 - \sum_{i=1}^{q} a_i - \sum_{j=1}^{p} b_j\Big)\Gamma + \sum_{i=1}^{q} a_i \big(\tilde{U}_{t-i} \tilde{U}_{t-i}'\big) + \sum_{j=1}^{p} b_j \tilde{\Gamma}_{t-j}, \tag{2.12}$$
$$\Gamma_t = D_t^{-1}\, \tilde{\Gamma}_t\, D_t^{-1} \tag{2.13}$$
where $D_t$ is a diagonal matrix with $\sqrt{\tilde{\Gamma}_t(i,i)}$ on row $i$ and column $i$, and $a_i$ and $b_j$ are scalars. Since a correlation matrix must have ones on the diagonal and off-diagonal elements between -1 and 1, we must rescale the correlation matrix [equation (2.13)] because $\tilde{U}_{t-i} \tilde{U}_{t-i}'$ is not constrained to have elements between -1 and 1. The model of Tse and Tsui (2002) is similar to the one of Engle (2002) but the rescaling is done differently:
$$\Gamma_t = (1 - \theta_1 - \theta_2)\Gamma + \theta_1 \Gamma_{t-1} + \theta_2 \Psi_{t-1}, \tag{2.14}$$
$$\Psi_{i,j,t-1} = \frac{\sum_{h=1}^{M} \tilde{u}_{i,t-h}\, \tilde{u}_{j,t-h}}{\sqrt{\Big(\sum_{h=1}^{M} \tilde{u}_{i,t-h}^2\Big)\Big(\sum_{h=1}^{M} \tilde{u}_{j,t-h}^2\Big)}} \tag{2.15}$$
with $M \geq K$.
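
The role of the rescaling step (2.13) can be seen in a one-step numerical sketch of the recursion (2.12) with $p = q = 1$. This is an illustration only: the function names `dcc_update` and `rescale_to_corr` are hypothetical, and the parameter values and residual vector are made up.

```python
import math

def dcc_update(q_prev, u_prev, gamma_bar, a, b):
    """One step of (2.12) with p = q = 1:
    Q_t = (1 - a - b) * Gamma + a * u_{t-1} u_{t-1}' + b * Q_{t-1}."""
    k = len(u_prev)
    return [[(1.0 - a - b) * gamma_bar[i][j]
             + a * u_prev[i] * u_prev[j]
             + b * q_prev[i][j]
             for j in range(k)] for i in range(k)]

def rescale_to_corr(q):
    """Equation (2.13): divide element (i, j) by sqrt(q_ii * q_jj)."""
    k = len(q)
    d = [math.sqrt(q[i][i]) for i in range(k)]
    return [[q[i][j] / (d[i] * d[j]) for j in range(k)] for i in range(k)]

# Illustrative bivariate inputs.
gamma_bar = [[1.0, 0.3], [0.3, 1.0]]
u_prev = [2.0, -1.5]          # standardized residuals from the previous period
q = dcc_update(gamma_bar, u_prev, gamma_bar, a=0.05, b=0.9)
corr = rescale_to_corr(q)
```

Before rescaling, the diagonal of `q` differs from one because the outer product of the residuals is unrestricted; after rescaling, `corr` has a unit diagonal. The division by square roots of time-varying diagonal elements is the non-linearity that prevents closed-form multi-step expectations in these models.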

3. Estimation
The estimation of our regime switching model can in theory be done in one step but if we have a great number of time series the high number of parameters will prevent us from doing so. Fortunately we can use a two-step estimation procedure as in Engle (2002). In a first step we can estimate the univariate volatility models and in a second step we can estimate the parameters in the correlation matrix conditional on the first step estimates. In the first subsection we review the theoretical properties of the one-step estimates and explain how the likelihood can be evaluated. In the following subsection we present estimation methods which can greatly ease the estimation problem due to the high number of parameters.

3.1. One-Step Estimation
To maximize the likelihood we need to evaluate
$$QL(\theta, Y) = \sum_{t=1}^{T} \log f(Y_t \mid \underline{Y}_{t-1}; \theta) \tag{3.1}$$
where $\underline{Y}_{t-1} = \{Y_{t-1}, Y_{t-2}, \ldots\}$ and $\theta$ is the vector of parameter values. Since the variable $\Delta_t$ which drives the correlation matrix is unobserved this is not straightforward. To do this we use Hamilton’s filter [Hamilton (1989), Hamilton (1994, chapter 22)] which we adapt to our setup. Inference on the state of the Markov chain is given by the following equations:
$$\hat{\xi}_{t|t} = \frac{\hat{\xi}_{t|t-1} \odot \eta_t}{\mathbf{1}'\big(\hat{\xi}_{t|t-1} \odot \eta_t\big)}, \tag{3.2}$$
$$\hat{\xi}_{t+1|t} = \Pi\, \hat{\xi}_{t|t}, \tag{3.3}$$
$$\eta_t = \begin{bmatrix} f(Y_t \mid \underline{Y}_{t-1}, \Delta_t = 1; \theta) \\ \vdots \\ f(Y_t \mid \underline{Y}_{t-1}, \Delta_t = N; \theta) \end{bmatrix} \tag{3.4}$$
where $\hat{\xi}_{t|t}$ is an $(N \times 1)$ vector which contains the probability of being in each state at time $t$ conditional on the observations up to time $t$. The $(N \times 1)$ vector $\hat{\xi}_{t+1|t}$ gives these probabilities at time $t + 1$ conditional on observations up to time $t$. The $i$th element of the $(N \times 1)$ vector $\eta_t$ is the density of $Y_t$ conditional on past observations and being in state $i$ at time $t$, $\mathbf{1}$ is an $(N \times 1)$ vector of 1s, and $\odot$ denotes element-by-element multiplication. Given a starting value $\hat{\xi}_{1|0}$ and parameter values $\theta$, one can iterate over (3.2) and (3.3) for
$t = 1, \ldots, T$. The likelihood is obtained as a by-product of this algorithm:
$$QL(\theta) = \sum_{t=1}^{T} \log\Big(\mathbf{1}'\big(\hat{\xi}_{t|t-1} \odot \eta_t\big)\Big). \tag{3.5}$$
Smoothing inference on the state of the Markov chain can also be computed using an algorithm developed by Kim (1994). The probability of being in each state at time $t$ conditional on observations up to time $T$ is given by the following equation:
$$\hat{\xi}_{t|T} = \hat{\xi}_{t|t} \odot \Big\{ \Pi' \big[ \hat{\xi}_{t+1|T} \,(\div)\, \hat{\xi}_{t+1|t} \big] \Big\} \tag{3.6}$$
where $(\div)$ denotes element-by-element division. One would iterate over (3.6) backward in time, starting from $\hat{\xi}_{T|T}$, which is given by (3.2).

The only thing remaining is to decide how to start up the algorithm, i.e. specifying $\hat{\xi}_{1|0}$. One approach would be to add this vector to the parameter space and estimate these initial probabilities, $p_1, \ldots, p_N \geq 0$ with $p_1 + \cdots + p_N = 1$. This would add $N - 1$ parameters. Another approach would be to use the limiting probabilities $(\pi_1, \ldots, \pi_N)$ of the Markov process [Ross (1993, Chapter 4)]. These probabilities are the solution of the following system of equations:
$$\begin{bmatrix} \pi_1 \\ \vdots \\ \pi_N \end{bmatrix} = \Pi \begin{bmatrix} \pi_1 \\ \vdots \\ \pi_N \end{bmatrix}, \qquad \sum_{i=1}^{N} \pi_i = 1.$$
In the two-state case the solution is
$$\pi_1 = \frac{1 - \pi_{2,2}}{(1 - \pi_{1,1}) + (1 - \pi_{2,2})}, \qquad \pi_2 = \frac{1 - \pi_{1,1}}{(1 - \pi_{1,1}) + (1 - \pi_{2,2})}.$$
In this work both approaches will be used, depending on the estimation method used. As we will see below, when using the EM algorithm there is an advantage in treating $\hat{\xi}_{1|0}$ as unknown parameters. If we are not using the EM algorithm then we will use the limiting probabilities of the Markov chain because in this case these extra parameters would complicate the estimation.

One thing to notice in the evaluation of the likelihood is that the correlation matrix can take $N$ possible values in our model so we only have to invert $N$ times a $K \times K$ matrix. This can be a computational advantage when the number of time series is large over models
such as Tse and Tsui (2002) and Engle (2002) where a different correlation matrix has to be inverted for every observation.

We are now ready to state the properties of the maximum likelihood estimates.

Theorem 3.1 ONE-STEP MAXIMUM LIKELIHOOD ESTIMATION. If the assumptions of proposition 2.1 hold and if the usual regularity assumptions for the validity of the QMLE are satisfied then the maximum likelihood estimates are consistent and their asymptotic distribution is given by:

$$\sqrt{T}\,(\hat\theta - \theta) \longrightarrow N\left(0,\; J^{-1} I J^{-1}\right)$$

with $J = E\left[\dfrac{\partial^2 \log f}{\partial\theta\,\partial\theta'}\right]$ and $I = E\left[\dfrac{\partial \log f}{\partial\theta}\dfrac{\partial \log f}{\partial\theta'}\right]$. The matrices $I$ and $J$ can be consistently estimated by their plug-in estimates:

$$\hat I = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \log f(Y_t \mid \underline{Y}_{t-1}; \hat\theta)}{\partial\theta}\,\frac{\partial \log f(Y_t \mid \underline{Y}_{t-1}; \hat\theta)}{\partial\theta'}, \qquad \hat J = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial^2 \log f(Y_t \mid \underline{Y}_{t-1}; \hat\theta)}{\partial\theta\,\partial\theta'}.$$

PROOF OF THEOREM 3.1 See Newey and McFadden (1994).

But one-step estimation is not really usable in practice if the number of time series is more than a few because of a curse of dimensionality. To practically use this model we need an estimation method which only requires non-linear optimization of $O(1)$ parameters at a time. This is what we present in the next subsection.

3.2. Two-Step Estimation

By splitting the model in two parts, standard deviations and correlations, we can estimate the model in two steps like in Engle (2002). The first step involves the parameters of the univariate volatility models and the second step involves the parameters of the correlation model. We first begin by introducing elements of notation. The complete parameter space $\theta$ is split into $\theta_1$ for the parameters in the univariate volatility model and $\theta_2$ for the parameters in the correlation model. We denote by $QL_1$ the likelihood where the correlation matrix is taken to be an identity matrix:

$$QL_1(\theta_1; Y) = -\frac{1}{2}\sum_{t=1}^{T}\left(K\log(2\pi) + 2\log(|S_t|) + U_t'U_t\right). \qquad (3.7)$$
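Both the one-step likelihood of Section 3.1 and the second-step likelihood are evaluated with Hamilton's filter, so a small illustration of the recursion (3.2)-(3.3), the likelihood (3.5) and the two-state limiting probabilities may help. The following is only a minimal sketch in plain Python, not the Ox code used for the paper: the function names are hypothetical, the state densities in `eta` are taken as given inputs rather than computed from the model, and the transition matrix is assumed to be stored with columns summing to one so that the prediction step is a matrix-vector product as in (3.3).

```python
import math

def hamilton_filter(eta, Pi, xi_init):
    """Filter recursion (3.2)-(3.3) with the log-likelihood (3.5).

    eta     : list of T lists; eta[t][i] is the density of Y_t given state i
              (assumed to be supplied by the caller)
    Pi      : N x N nested list with Pi[j][i] = P(state j at t+1 | state i at t),
              i.e. columns sum to one (an assumption of this sketch)
    xi_init : starting probabilities xi_{1|0}
    """
    n = len(xi_init)
    xi_pred = list(xi_init)
    loglik = 0.0
    filtered = []
    for eta_t in eta:
        # element-by-element product of predicted probabilities and densities
        joint = [xi_pred[i] * eta_t[i] for i in range(n)]
        dens = sum(joint)                      # 1'(xi_{t|t-1} * eta_t)
        loglik += math.log(dens)               # contribution to (3.5)
        xi_filt = [x / dens for x in joint]    # equation (3.2)
        filtered.append(xi_filt)
        # equation (3.3): xi_{t+1|t} = Pi xi_{t|t}
        xi_pred = [sum(Pi[j][i] * xi_filt[i] for i in range(n)) for j in range(n)]
    return filtered, loglik

def limiting_probs_two_state(p11, p22):
    """Limiting probabilities of a two-state chain (closed form from Section 3.1)."""
    denom = (1.0 - p11) + (1.0 - p22)
    return [(1.0 - p22) / denom, (1.0 - p11) / denom]

# Hypothetical inputs: staying probabilities 0.93 and 0.67, made-up densities.
Pi = [[0.93, 0.33],
      [0.07, 0.67]]
xi0 = limiting_probs_two_state(0.93, 0.67)
eta = [[0.40, 0.10], [0.05, 0.30], [0.20, 0.20]]
filtered, loglik = hamilton_filter(eta, Pi, xi0)
```

Starting from the limiting probabilities corresponds to the second of the two start-up options discussed in Section 3.1; treating `xi_init` as free parameters would only change what is passed in.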

We denote by $QL_2$ the likelihood given $\theta_1$ where we have concentrated out $S_t$:

$$QL_2(\theta_2; Y, \theta_1) = -\frac{1}{2}\sum_{t=1}^{T}\left(K\log(2\pi) + \log(|\Gamma_t|) + U_t'\Gamma_t^{-1}U_t\right). \qquad (3.8)$$

Notice two important features of $QL_1$. Firstly it is the sum of $K$ univariate log-likelihoods so that maximizing it is equivalent to maximizing each univariate log-likelihood separately. Secondly the evaluation of these log-likelihoods is straightforward since it does not involve the use of Hamilton's filter.

To maximize $QL_2$ we again have to use Hamilton's filter since $\Delta_t$, the state process, is unobserved. The procedure is the same as the one-step case because the correlations are not a function of the standard deviations.

Because the number of parameters in the correlation model grows at a quadratic rate with the number of time series, direct maximization of the likelihood is not practicable if we analyze more than a few series. To bypass this problem we present two estimation methods, one for the non-restricted model and one for the restricted model, which do not rely on simultaneous non-linear maximization of all the parameters.

For the non-restricted model, it turns out that maximization of the likelihood $QL_2$ for the correlation model can be done with the EM algorithm. Using the results of Hamilton (1994, chapter 22) we know that the MLE estimates of the transition probabilities and the correlation matrices satisfy the following equations if the initial probabilities $\hat\xi_{1|0}$ are not function of $\Pi$ and $\Gamma_i$:

$$\hat\pi_{i,j} = \frac{\sum_{t=2}^{T} P[\Delta_t = j, \Delta_{t-1} = i \mid \underline{\hat U}_T; \hat\theta_2]}{\sum_{t=2}^{T} P[\Delta_{t-1} = i \mid \underline{\hat U}_T; \hat\theta_2]}, \qquad (3.9)$$

$$\hat\Gamma_i = \frac{\sum_{t=1}^{T} (\hat U_t \hat U_t')\, P[\Delta_t = i \mid \underline{\hat U}_T; \hat\theta_2]}{\sum_{t=1}^{T} P[\Delta_t = i \mid \underline{\hat U}_T; \hat\theta_2]}. \qquad (3.10)$$

Starting with an initial value $\hat\theta_2^{(0)}$ for the vector $\theta_2$, we can compute a new vector $\hat\theta_2^{(1)}$ using equations (3.9) and (3.10). We then continue the iteration until the difference between successive vectors $\hat\theta_2^{(n)}$ and $\hat\theta_2^{(n+1)}$ is small. This estimation method is more efficient than blindly maximizing the likelihood with Newton-type algorithms because more information on the structure of the problem is used. Notice also that the dimension of $\Gamma_i$ (i.e. the number of time series) does not affect the complexity of the estimation because we only have to take weighted sums of outer-products.

We should also mention that equation (3.10) can't be used directly because typically it does not provide correlation matrices, i.e. the elements on the diagonal of $\hat\Gamma_i$ are not imposed to be one. One should rescale these matrices like in equation (2.13) so they are correlation matrices. By doing this transformation, the estimates obtained with these equations will not exactly be the numerical maximum of

the likelihood, but something very close to it. From our experience, a limited number of Newton-type iterations are necessary to obtain the exact numerical maximum. For the vector of initial probabilities $\hat\xi_{1|0}$, it is also shown that their MLE estimates are given by the smoothed probabilities of the first observation. This leaves a number of parameters to be non-linearly estimated which grows with the number of states in the Markov chain, not with the number of time series.

For the restricted model we can estimate the matrix $\Gamma$, up to a scale factor, by doing correlation targeting. To do the correlation targeting notice that

$$E[\Gamma_t] = \Gamma \sum_{i=1}^{N} \lambda(i)\pi_i + I_K \sum_{i=1}^{N} (1 - \lambda(i))\pi_i.$$

So a correlation matrix computed with the standardized residuals from the first step estimation will provide an estimate $\hat\Gamma$ of $\Gamma$ up to the scale factor $\sum_{i=1}^{N}\lambda(i)\pi_i$ for the off-diagonal elements. The scale indetermination can be solved by using the constraints on $\Gamma$ and $\lambda(i)$ described in equation (2.8). We would divide the off-diagonal elements of $\hat\Gamma$ by the highest in absolute value, so as to get a 1 or −1 off the diagonal, and we would take $\lambda(1) > 1$. This leaves $O(1)$ parameters to be non-linearly estimated.

The properties of the two-step estimation are described in the following theorem.

Theorem 3.2 TWO-STEP MAXIMUM LIKELIHOOD ESTIMATION. If the assumptions of theorem 3.1 are satisfied then the two-step estimates are consistent and their asymptotic distribution is:

$$\sqrt{T}\left(\begin{bmatrix}\hat\theta_1 \\ \hat\theta_2\end{bmatrix} - \begin{bmatrix}\theta_1 \\ \theta_2\end{bmatrix}\right) \longrightarrow N(0, V)$$

with

$$V = \begin{bmatrix} G_{\theta_1}^{-1} & -G_{\theta_1}^{-1}G_{\theta_2}M^{-1} \\ 0 & M^{-1} \end{bmatrix} E\left[\frac{\partial \ln f}{\partial\theta}\frac{\partial \ln f}{\partial\theta'}\right] \begin{bmatrix} G_{\theta_1}^{-1} & -G_{\theta_1}^{-1}G_{\theta_2}M^{-1} \\ 0 & M^{-1} \end{bmatrix}'$$

where

$$G_{\theta_1} = E\left[\frac{\partial g(Y, \theta_1, \theta_2)}{\partial\theta_1'}\right], \quad G_{\theta_2} = E\left[\frac{\partial g(Y, \theta_1, \theta_2)}{\partial\theta_2'}\right], \quad M = E\left[\frac{\partial m(Y, \theta_1, \theta_2)}{\partial\theta_2'}\right],$$

$$g(Y, \theta_1, \theta_2) = \frac{\partial \ln f(Y_t \mid \underline{Y}_{t-1})}{\partial\theta_1}, \qquad m(Y, \theta_1, \theta_2) = \frac{\partial \ln f(Y_t \mid \underline{Y}_{t-1})}{\partial\theta_2}.$$
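The correlation targeting step described above can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the paper's code: it takes the restricted specification to be $\Gamma_t = \Gamma\lambda(\Delta_t) + I_K(1 - \lambda(\Delta_t))$, which is the form implied by the expression for $E[\Gamma_t]$, uses hypothetical function names, and uses made-up values for $\Gamma$, $\lambda(i)$ and $\pi_i$. It builds $E[\Gamma_t]$ and then recovers $\Gamma$ by dividing the off-diagonal elements by the largest one in absolute value.

```python
def expected_corr(gamma, lam, pi):
    """E[Gamma_t] = Gamma * sum(lam_i pi_i) + I_K * sum((1 - lam_i) pi_i)."""
    w = sum(l * p for l, p in zip(lam, pi))        # scale factor on the off-diagonals
    w_identity = sum((1.0 - l) * p for l, p in zip(lam, pi))
    k = len(gamma)
    return [[gamma[i][j] * w + (w_identity if i == j else 0.0)
             for j in range(k)] for i in range(k)]

def target_gamma(sample_corr):
    """Divide off-diagonals by the largest absolute off-diagonal element.

    Returns the rescaled matrix (unit diagonal, one off-diagonal equal to +/-1)
    and the scale that was removed.
    """
    k = len(sample_corr)
    scale = max(abs(sample_corr[i][j])
                for i in range(k) for j in range(k) if i != j)
    gamma = [[1.0 if i == j else sample_corr[i][j] / scale
              for j in range(k)] for i in range(k)]
    return gamma, scale

# Hypothetical values, for illustration only.
gamma_true = [[1.0, 1.0, 0.5],
              [1.0, 1.0, -0.4],
              [0.5, -0.4, 1.0]]
lam = [0.9, 0.3]          # lambda(1), lambda(2)
pi = [0.825, 0.175]       # limiting probabilities of the two states
mean_corr = expected_corr(gamma_true, lam, pi)
gamma_hat, scale_hat = target_gamma(mean_corr)
```

In an application `mean_corr` would be replaced by the sample correlation matrix of the first-step standardized residuals, so the recovery would hold only approximately.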

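The matrix $V$ can be consistently estimated by its plug-in estimate:

$$\hat V = \begin{bmatrix} \hat G_{\theta_1}^{-1} & -\hat G_{\theta_1}^{-1}\hat G_{\theta_2}\hat M^{-1} \\ 0 & \hat M^{-1} \end{bmatrix} \hat I \begin{bmatrix} \hat G_{\theta_1}^{-1} & -\hat G_{\theta_1}^{-1}\hat G_{\theta_2}\hat M^{-1} \\ 0 & \hat M^{-1} \end{bmatrix}'$$

where

$$\hat G_{\theta_1} = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial^2 \ln f(Y_t \mid \underline{Y}_{t-1}; \hat\theta_1, \hat\theta_2)}{\partial\theta_1\,\partial\theta_1'}, \qquad \hat G_{\theta_2} = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial^2 \ln f(Y_t \mid \underline{Y}_{t-1}; \hat\theta_1, \hat\theta_2)}{\partial\theta_1\,\partial\theta_2'},$$

$$\hat M = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial^2 \ln f(Y_t \mid \underline{Y}_{t-1}; \hat\theta_1, \hat\theta_2)}{\partial\theta_2\,\partial\theta_2'}, \qquad \hat I = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \ln f(Y_t \mid \underline{Y}_{t-1}; \hat\theta_1, \hat\theta_2)}{\partial\theta}\,\frac{\partial \ln f(Y_t \mid \underline{Y}_{t-1}; \hat\theta_1, \hat\theta_2)}{\partial\theta'}.$$

The proof is in the appendix.

Using the general results summarized in Pagan (1986) on two-step estimation we can compute efficient estimates from the two-step estimates by doing one step of a Newton-Raphson estimation of the full likelihood using our two-step estimates as the starting point. The properties of the estimates that we get from this procedure are described in the following theorem.

Theorem 3.3 TWO-STEP EFFICIENT MAXIMUM LIKELIHOOD ESTIMATION. If the assumptions of theorem 3.1 are satisfied then efficient estimates can be obtained by doing one step of a Newton-Raphson estimation of the full likelihood using the two-step estimates $\hat\theta$:

$$\bar\theta = \hat\theta - \left[\left.\frac{\partial^2 QL}{\partial\theta\,\partial\theta'}\right|_{\hat\theta}\right]^{-1} \left.\frac{\partial QL}{\partial\theta}\right|_{\hat\theta}, \qquad \sqrt{T}\,(\bar\theta - \theta_0) \longrightarrow N\left(0,\; J^{-1} I J^{-1}\right).$$

The matrices $I$ and $J$ can be consistently estimated by their plug-in estimates.

Proof: See Pagan (1986).

Notice that the computation of these estimates could be costly in computing time when dealing with very large systems because of the need to compute the matrix of second derivatives.

The remaining problem in this work is to specify the number of states in the Markov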

chain. It is well known that testing for the number of states in a Markov chain is a hard problem to tackle. The asymptotic theory of an LR test of $N + 1$ versus $N$ states is complicated by the fact that some parameters are not identified under the null hypothesis and we are testing parameter values that are on the boundary of the maintained hypothesis [see Andrews (1999, 2001)]. A solution could be the use of Monte Carlo test procedures [see Dufour (2000)]. An alternative procedure could be the specification tests presented in Hamilton (1996). The asymptotic properties of this test are unknown for the moment. We leave this problem for further work.

From the literature on mixture models we know that the likelihood of our model is unbounded [e.g. Day (1969)]. It is possible to choose the parameter values such that the determinant of the correlation matrix in one state is decreasing to zero while at the same time the parameters of the standard deviation models are chosen such that we have $\tilde U_t'\Gamma_t^{-1}\tilde U_t$ going to zero. The maximization of the likelihood is an ill-posed problem and this is a well known fact in Bayesian econometrics. Because we are using a two-step estimation procedure we see that we will not be affected by this problem in practice, i.e. we will find a local maximum. One way to completely avoid this problem is to restrict the parameter space such that the correlation matrices are positive definite. We can easily do this by using the Choleski decomposition in equation (2.5) and restricting the $p_{i,j}$ to be in a sphere of radius $1 - \varepsilon$ for a given positive small value for $\varepsilon$.

4. Multi-Step Ahead Conditional Expectations

In this section we study one-step and multi-step ahead conditional expectations of the covariance matrix. To compute these we must take the conditional expectations of the product of a correlation and two standard deviations. All the calculus will be presented for the case where the univariate volatility model is an ARMACH(1,1). Extension to a more general ARMACH(p,q) would not introduce new difficulties. We begin by introducing a notation for the matrix $\Gamma_t$ that covers both the restricted and unrestricted model. We will denote by $\Gamma(\Delta_t = i)$ the value taken by $\Gamma_t$ when the chain is in the state $\Delta_t = i$.

One-step ahead conditional expectations are straightforward. Using the fact that tomorrow's conditional standard deviations are known, to compute $E_t[H_{t+1}]$ we have to compute

$$E_t[s_{i,t+1}\, s_{j,t+1}\, \Gamma_{i,j}(\Delta_{t+1})] = s_{i,t+1}\, s_{j,t+1}\, E_t[\Gamma_{i,j}(\Delta_{t+1})] = s_{i,t+1}\, s_{j,t+1}\, \Gamma_{i,j,t+1|t}$$

for $i, j = 1, 2, \ldots, K$, where $\Gamma_{i,j,t+1|t} = E_t[\Gamma_{i,j}(\Delta_{t+1})]$. To compute this expectation we use the fact that the Markov chain $\Delta_t$ is independent of $U_t$. Given the information up to

time $t$, the probability of being in each state at time $t + 1$ is $\xi_{t+1|t} = \Pi\xi_{t|t}$. From this we deduce that

$$\Gamma_{t+1|t} = \sum_{i=1}^{N} \Gamma(\Delta_{t+1} = i)\, \xi_{i,t+1|t}.$$

We see that for the one-step ahead conditional expectations the choice of the model for the standard deviations does not play a role when a GARCH-type model is used because tomorrow's standard deviations are known.

To compute the n-step ahead conditional expectations $E_t[H_{t+n}]$ we have to compute elements of the following form, for $i, j = 1, 2, \ldots, K$:

$$E_t[s_{i,t+n}\, s_{j,t+n}\, \Gamma_{i,j}(\Delta_{t+n})].$$

If $i = j$, the correlation is always equal to 1 and the Markov chain does not play a role. In the following, we assume that $i \neq j$. Using the fact that the Markov chain is independent of the process $U_t$, we can first compute the expectation conditional on the Markov chain and then integrate it out:

$$E_t^{\Delta}\left[E_t^{U}\left[s_{i,t+n}\, s_{j,t+n}\, \Gamma_{i,j}(\Delta_{t+n}) \mid \Delta\right]\right] = E_t^{\Delta}\left[\Gamma_{i,j}(\Delta_{t+n})\, E_t^{U}\left[s_{i,t+n}\, s_{j,t+n} \mid \Delta\right]\right]$$

where $E_t^{U}[\cdots \mid \Delta]$ is the expectation with respect to the innovations $U_t$ conditional on the present and future values of $\Delta_t$, and $E_t^{\Delta}[\cdots]$ is the expectation with respect to the process $\Delta_t$. We can now treat the correlations as known for the computation of $E_t^{U}[\cdots \mid \Delta]$.

At this point we see why we can't analytically compute multi-step ahead conditional expectations with a GARCH model for the standard deviations. We would have to take conditional expectations of the square root of a linear expression. The ARMACH model described in equation (2.10) can be rewritten in an ARMA-type representation and for an ARMACH(1,1) we get:

$$s_{i,t} = \omega_i + (\alpha_i + \beta_i)\, s_{i,t-1} + \alpha_i\, s_{i,t-1}\, v_{i,t-1} \qquad (4.1)$$

$$v_{i,t-1} = \frac{|\tilde u_{i,t-1}|}{E|\tilde u_{i,t-1}|} - 1 \qquad (4.2)$$

where $v_{i,t-1}$ is a martingale difference sequence. Before

j. where (x)k = x(x + 1) · · · (x + k). 1−Γi.t+n−1 sj.t+1 .t+n ||˜j. 2 . = ωi 1 − (αi + βi ) For the expectation in (4. for the form of the distribution of Ut .going on we define the following elements: v ˜ fi.t+n | ∆t+n = d u u (E|˜t |)2 u and ai.t+n vj.t+n−1 + {(αi + βi )(αj + βj ) + αi αj fi.t+n−1 + αi si.t+n−l ∆ ˜ (4. In the case where Ut is not Gaussian and a closed-form solution cannot be found.t+n−1 ) |∆ ˜ = ωi ωj + ωi (αj + βj )aj.t+n (d) = EtU [˜i.t+n |∆ ˜ = EtU (ωi + (αi + βi )si.t+n sj. 18 . 2. fi.t+n−1 ) × (ωj + (αj + βj )sj.j (d)2 )2 + 2Γi.t+n−1 vj.t+n .t+n−1 vi.j.t+n−1 (d)} EtU si.4) 1 − (αi + βi )n−1 + (αi + βi )n−1 si. c.t+n (d) = HG(a.j (a)k (b)k z k (c)k k! (E|˜t |)2 π u 1 − Γi. which is known.t+n−1 + ωj (αi + βi )ai.t+n−1 + αj sj.t+n−1 |∆ 1 Computed with Mathematica. a stronger stand must be taken than only saying that it has mean zero and an identity matrix for the variance.j (d)2 HG ∞ k=0 2 1 3 −Γ (d) . as an argument1 : ˜ ˜ fi.t+n−l vi. z) = 2 (1 − Γi.j.j.t+1 + l=1 αi (αi + βi )l−1 si.t+n and uj.j (d)2 2 i.t+n = EtU [si.t+n (d) could be evaluated by numerical integration. Using these expressions the n-step ahead conditional expectation becomes: EtU si.t+n |∆t+n = d] 1 = E |˜i.j (d)2 . b. if we assume that the Ut ’s are jointly Gaussian then it has a closed-form solution which involve an hyper-geometric function with the correlation between ui. In any case.3).3) n−1 l n−1 = EtU ωi l=0 (αi + βi ) + (αi + βi ) si. But this would have to be done only N times because d can take only N possible values.t+n |∆] n−2 (4.

j. So we see that the sum over N n+1 terms in equation (4.j (∆t+n )] N N = dn =1 n−1 ··· d0 =1 EtU si.t+n sj. bi.j.t+n−1 + ωj (αi + βi )ai.t+n Γi.5) can be rearranged in the same way.j.t|t πd0 .n .d1 · · · πn−1.t+1 d0 =1 m=1 Γi.t+n sj. Doing so we get Et [si. The second is the use of a model for the conditional standard deviation (ARMACH) instead of the variance.d1 · · · πdn−1 . The first is that since our model for the correlation matrix is linear the conditional expectations are given by the summation of a constant times a probability which is linearly updated.t|t πd0 .= ai.t+1 sj.d1 · · · πn−1. The summations in the second term of the last equality can be rearranged so as to obtained N N N N si.j.t+n−1 .j.j.j (dn ) dn−1 =1 bi.t+n sj.t|t πd0 .j (dn )ξd0 .t+1 sj.j.t+n−m (d)si.t+n−1 (d) = (αi + βi )(αj + βj ) + αi αj fi.t+1 dn =1 Γi.t+n−m (d) + m=1 bi.t+n−1 (d)EtU si.t+n−1 sj.5) dn =1 ··· bi.t+n−1 + bi.t+n−1 (d).t+n−1 (dn−1 )πdn−1 . If another univariate model for the conditional volatility is obviously 19 .j. Keeping in mind that bi.j.j (dn )ξd0 .t|t πd0 .t+n |∆ = ai.n + (4.t+n−m (d)si.t+1 sj.j.t+n−m (d) d0 =1 m=1 Γi.dn · · · bi.d1 .t+n |∆t+n = dn Γi.t+n−1 = ωi ωj + ωi (αj + βj )aj. We see that we are able to compute multi-step ahead conditional expectations of the whole variance matrix for two reasons. The summations in the first element of equation (4.t+n−l l=1 m=1 bi.t (d) depends on the state of the Markov chain at time t we next integrate out the Markov chain.j.t+n−1 |∆ where ai.t+n−l dn =1 N n−1 ··· bi.5) can be written as a sum over (n + 1)N terms.j.j (dn )ξd0 .dn N N l−1 = l=1 N ai. Note that the use of the ARMACH model is not required.j.t+1 (d1 )πd1 .d2 d1 =1 d0 =1 ξd0 . We can solve this expression recursively to get n−1 l−1 n−1 EtU si.j.t+1 .

better and if analytic computation of multi-step ahead conditional expectations is not a requirement then this alternative univariate model should be used. It is not easy to design a multivariate volatility model that has a rich enough dynamic but allows these analytic computations of multi-step ahead conditional expectations of the variance matrix. For example, in the DCC-GARCH of Engle (2002) it is not even possible to compute multi-step ahead conditional expectations of the correlation matrix because the rescaling performed in equation (2.13) introduces non-linearities.

5. Application to exchange rate data

In this section we apply both the unrestricted and restricted version of our model to the exchange rate dataset used by Harvey, Ruiz, and Shephard (1994) and Kim, Shephard, and Chib (1998). This dataset contains four weekdays close exchange rates (Pound, Deutschmark, Yen, Swiss-Franc all against the U.S. dollar) over the period 1/10/81 to 28/6/85. The number of observations is 946. The reason why we use this dataset is that Harvey, Ruiz, and Shephard (1994) use the same dataset to present a multivariate stochastic volatility model where they assume that correlations are constant through time. Using our model we can check if their assumption was reasonable. We first take the first difference of the logarithm of each series before applying directly our variance model (these are our filtered series). The results are generated using Ox version 3.30 on Linux [see Doornik (1999)].

The estimation results that we present in the various tables are for full one-step maximum likelihood estimation. We can do it because we have a limited number of time series in our example. We first do the two-step estimation (EM algorithm or correlation targeting) and then use these values to initialize the full maximization.

5.1. Switching regime model with two states

We first present results for the regime switching models with two states. Models with three states are studied in the following subsection. The results for the unrestricted models are presented in Tables 2 [ARMACH(1,1) for the standard deviations] and 3 [GARCH(1,1) for the standard deviations]. The outputs for the restricted version of the model are in Tables 4 [ARMACH(1,1)] and 5 [GARCH(1,1)]. For the restricted model we present the correlation matrix in each state and their standard deviations computed with the delta method instead of the matrix $\Gamma$ and the value of $\lambda(2)$ (we use the identification scheme of equation (2.7) when doing the one-step estimation) so that the results are directly comparable to those of the unrestricted model.

The results for the univariate volatility models are similar to the usual findings with this type of financial series. The level of persistence for the univariate GARCH models ($\alpha + \beta$)

are high but strictly lower than one. For an ARMACH(1,1) the degree of persistence of the standard deviations is given by $\tilde\alpha + \beta$, not $\alpha + \beta$. We also find that the persistences are high but strictly lower than one. To check if an ARMACH(1,1) or a GARCH(1,1) is capturing the dynamic in the second moment of each series we compute the ACF of the absolute value of the standardized residuals. We can conclude that an ARMACH(1,1) or GARCH(1,1) is doing a good job at capturing the dynamic since most of the autocorrelations are within the confidence bands. We nonetheless keep these orders for the ARMACH and the GARCH. The impact on the likelihood of replacing the ARMACH models by GARCH models is an increase of about 15 points.

For the estimation of the regime switching model the first thing to notice is that the results do not depend on the univariate model for the standard deviations. The likelihood may be higher with the GARCH model but the parameters of the correlation model are basically the same for both models. It is an indication that we can replace the traditional GARCH model by the ARMACH model or that the correlation model is robust to the specification of the standard deviations.

Looking at the tables and the figures 5 and 6 where we have plotted, for the unrestricted and the restricted model, the smoothed probabilities from Hamilton's filter of being in state one and the smoothed correlations at each point in time we see that the correlations appear to be dynamic. Figure 5 shows that we frequently move between both states and there is little uncertainty about the state we are in at each point in time. But remembering the results in section 2.4 we know that we should not have too much faith in these graphs.

The process is spending more time in state one and spells in state two are shorter on average than in state one. This is explained by the estimate of the transition probability matrix. The probability of being in state one at time $t+1$ conditional on being in state one at time $t$, $\pi_{1,1}$, which is very similar across the various models with two states, is around 0.93. That means a high level of persistence in the Markov chain because the probability of spending the next five days in state one is $0.93^5 = 0.70$. In comparison, for state two this probability is $0.67^5 = 0.14$. This illustrates that 0.93 and 0.67, although both high probabilities, are very different.

As for the value of the correlations in each state, the results for the restricted model are similar to those of the unrestricted model. Under the unrestricted model the magnitude of all the correlations in state two is smaller than in state one. So the hypothesis of the restricted version of the model that there is an ordering in the magnitude of the correlations across the different states seems plausible. The hypothesis that they all decrease in the same proportion is less supported by the data. In the unrestricted model, the implied value for $\lambda(2)$ for each correlation is as low as 0.243 and as high as 0.592 (for the ARMACH case). Since these two models are nested we can use an LR test for this hypothesis. Under the null hypothesis that the restricted model is the reality, twice the difference in the log-likelihood should follow a Chi-square with five degrees of freedom. The value of the test statistic is 27.2 and the 1% critical value is 15.09. We would reject the restricted version of the model at the 1% level.
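The persistence figures quoted above for the two-state chain are easy to reproduce. The short sketch below uses hypothetical helper names and the rounded staying probabilities 0.93 and 0.67 from the text; the expected-duration formula $1/(1-p)$ is the standard result for the geometric sojourn time of a Markov chain and is added here for illustration, not taken from the paper.

```python
def prob_stay(p_stay, days):
    """Probability of remaining in the same state for `days` consecutive days."""
    return p_stay ** days

def expected_duration(p_stay):
    """Expected length of a spell in a state (standard geometric sojourn time)."""
    return 1.0 / (1.0 - p_stay)

five_day_state_one = prob_stay(0.93, 5)      # about 0.70
five_day_state_two = prob_stay(0.67, 5)      # about 0.14
spell_state_one = expected_duration(0.93)    # roughly 14 days
spell_state_two = expected_duration(0.67)    # roughly 3 days
```

Under these rounded values a spell in state one lasts on average several times longer than a spell in state two, which is the sense in which two seemingly similar staying probabilities imply very different dynamics.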

5.2. Switching regime model with three states

We next allow a third state in the Markov chain. The estimation results for the various models are presented in Tables 6 to 9. The full maximum likelihood estimates are reported. As expected, the estimates of univariate volatility models are not affected by the addition of a third state. Again, there is no impact on the estimates of the correlation model when going from the GARCH to the ARMACH model.

The increase of the log-likelihood is about 40 points for the unrestricted model and 50 points for the restricted model, while the third state adds respectively eleven and five parameters. If we have in mind a likelihood ratio test to gauge the increase in the likelihood we would compare 80 or 100 (twice the increase) to the critical values of a Chi-square with eleven or five degrees of freedom (24.73 and 15.09 respectively), although it is not a valid procedure because the LR test is probably not asymptotically Chi-square with these degrees of freedom.

Examining more closely the correlation matrix for each state, the smoothed probabilities and the smoothed correlations in figure 8, we see that with a third state the Markov chain is beginning to identify what could be outliers. The addition of a third state now allows the data to identify two states with high correlations and one state of very low correlations. Looking at Figure 7 we see that the Markov chain is spending most of its time in states of high correlations (state two and three for the unrestricted model, state one and two for the restricted model). Very rarely does the chain go in the state of low correlation. Again we see that most of the time we have a strong idea about which state we are in at all points in time as the smoothed probabilities are close to either zero or one most of the time. The chain is going very rarely in a state which is very different from the others. This could be seen as an indicator that three states is enough.

Again we have in general the same ordering of the magnitude of the correlations across the states with the unrestricted model. The magnitude of the correlations in state one are smaller than in state two, which are smaller than in state three. We can again test the restricted model versus the unrestricted. In this case, we compare twice the difference of the likelihood, i.e. 5.8 for the ARMACH, to a Chi-square with ten degrees of freedom and doing so we don't reject the restricted model.

5.3. DCC-GARCH

To evaluate the relative performance of our model to fit the data we estimate the DCC-GARCH(1,1) of Engle (2002). To isolate the impact of not using the same model for the standard deviations we also estimate a DCC-ARMACH(1,1). With both of these univariate volatility models we get similar results for the matrix $\hat\Gamma$ and for the parameters $\alpha$ and $\beta$, again an indication that the correlation models are robust to the univariate volatility model employed. The results for the DCC-GARCH are in Table 10 and the results for the DCC-ARMACH(1,1) are in Table 11.
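The Chi-square critical values quoted in Section 5.2 (24.73 for eleven and 15.09 for five degrees of freedom, both at the 1% level) can be checked without external libraries, because the upper tail of a Chi-square with an odd number of degrees of freedom has a closed form in terms of the complementary error function. The function name below is hypothetical and the sketch covers odd degrees of freedom only. As stressed in the text, the LR statistic for the number of states is probably not asymptotically Chi-square, so these tail probabilities are only a rough gauge.

```python
import math

def chi2_sf_odd(x, df):
    """Upper tail P(X > x) of a chi-square with odd degrees of freedom.

    For df = 2m + 1:
      P(X > x) = erfc(sqrt(x/2))
                 + sqrt(2/pi) * exp(-x/2) * sum_{j=1..m} x**(j - 1/2) / (2j - 1)!!
    """
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form is for odd degrees of freedom only")
    m = (df - 1) // 2
    series = 0.0
    double_factorial = 1.0
    for j in range(1, m + 1):
        double_factorial *= (2 * j - 1)       # (2j - 1)!! built up term by term
        series += x ** (j - 0.5) / double_factorial
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * series)

tail_5df = chi2_sf_odd(15.09, 5)      # close to 0.01
tail_11df = chi2_sf_odd(24.73, 11)    # close to 0.01
tail_two_state = chi2_sf_odd(27.2, 5) # LR statistic of the two-state comparison
```

The same function applied to the statistics 80 (eleven degrees of freedom) and 100 (five degrees of freedom) returns tail probabilities that are numerically negligible.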

What is interesting is to compare the different values of the log-likelihood. The GARCH(1,1) appears to fit the data a bit better than the ARMACH(1,1) because the likelihood increases by 28 points when we use the first of the two models. We also get a similar increase in our regime switching model. But there is a big difference in the level of the log-likelihood when we compare our regime switching model and the DCC model. The difference in the log-likelihood is 114.5 points between the unrestricted model with two states and the DCC-GARCH at the cost of seven additional parameters. For our restricted model with two states (and GARCH model) the log-likelihood is 100 points higher than the DCC-GARCH while the regime switching model has only one more parameter than the DCC-GARCH. Because our regime switching model and the DCC model are not nested we cannot perform a likelihood ratio test to test if the increase in the likelihood is significant. One valid test for testing non-nested models is proposed by Vuong (1989). With this test we reject at the 5% level the hypothesis that the DCC model is as close to the true model as the regime switching model.

Another interesting comparison is the correlations extracted from both models. If we compare the smoothed correlations from our unrestricted regime switching model with ARMACH models for the standard deviations (Figure 8) with the correlations from the DCC-ARMACH (Figure 9) we see that the correlations are generally smoother with the switching regime model. This is even more apparent when we take the smoothed correlations from the restricted model for the comparison. The exception would be the correlation between the Deutschmark and the Swiss-Franc where there is almost no movement for the DCC-GARCH while the single factor imposes changes in this correlation.

One interesting implication of smoother behavior for the correlation is for the VaR. If the time-varying correlations are smoother then the gain from portfolio diversification will also be smoother which might imply a better behaved time-varying VaR.

5.4. Series associated to the Markov chain

An interesting exercise with regime switching models is identifying what is driving the latent process $\Delta_t$. With our model, because the regime switching is for the standardized innovations $U_t$ we have to look at series other than the standard deviations of each return. One process which could drive the correlations of the various currencies is the return on the stock market. Since all the currencies are expressed in terms of U.S. dollars we can look at the return on the Dow Jones index. The conditional variance from a GARCH(1,1) fitted on this series over the same period as our exchange rates is plotted in Figure 10. Comparing with the smoothed probabilities in Figure 7 for the unrestricted model we see that the increase in the volatility after observation number 200 of the index corresponds to a period where the process is in regime 1 (highest correlations) for a prolonged period. This is far from a complete explanation because we cannot really discern a link between this conditional variance and the rest of the smoothed probabilities.

If we believe that adding a third state is equivalent to chasing outliers we can try to see if something special happened in the days when the process went into that third and infrequent regime. Looking again at the smoothed probabilities for the unrestricted model in Figure 7 we see that around observation number 450 the process is spending five days in regime 3. These observations correspond to the July 11, 1983 to July 15, 1983 period. Reading newspapers from this period we see that over this week there was a lot of uncertainty about what the Fed would do with the interest rates. At the beginning of the week, Volcker sent a strong but noisy signal that something might or might not happen to the interest rates. The process enters the state of very low correlations. Throughout the week the Fed keeps sending this strong and noisy signal, and the process stays in this regime. Then at the end of the week, on July 15, Volcker announces that the interest rates will go up. The uncertainty is resolved. The process leaves the state of very low correlations. Again this is not a complete explanation because similar event studies for the other periods where the process goes into the regime of low correlations are not as satisfactory.

6. Conclusion

In this work we propose a model for the variance between multiple time series where we decompose the covariances into correlations and standard deviations. Both the correlations and the standard deviations are dynamic. For the correlation matrix we propose a regime switching model driven by an unobserved state variable which follows a Markov chain. Our model has the interesting properties of having constant correlations within each regime but still having dynamic correlations because of the regime switching. This property can have important impacts, namely for the computation of Value at Risk. We also present a restricted version of our model where a single factor is driving the changes of the whole correlation matrix. These regime switching models can be seen as a mid-point between the constant conditional correlations model of Bollerslev (1990) and the multivariate GARCH models of Tse and Tsui (2002) and Engle (2002).

One appealing feature of this regime switching model for the correlations is that when combined with the ARMACH model (Taylor (1986) and Schwert (1989)) for the conditional standard deviations, it allows analytic computation of multi-step ahead conditional expectations of the whole variance matrix. The ARMACH model is a GARCH-type model for the conditional standard deviations instead of the conditional variance.

The evaluation of the likelihood is done with Hamilton's filter because of the unobserved Markov chain. By decomposing the variance matrix into a diagonal matrix of standard deviations and a correlation matrix, we can use a two-step estimation procedure as in Engle (2002). Combining this two-step estimation procedure with either correlation targeting (for the restricted model) or the EM algorithm (for the unrestricted model) breaks the curse of dimensionality, i.e. the number of parameters in every non-linear estimation is not

a function of the number of time series. An application of this model to four major exchange rate series illustrates its good behavior. An interesting aspect of our regime switching model is that we find strong persistence in the Markov chain, which produces smoother time-varying correlations than the DCC-GARCH. A comparison of our regime switching model with the DCC-GARCH of Engle and Sheppard (2001) shows that our model has a better in-sample fit.

Possible extensions in future work include the addition of relations between correlations and standard deviations as the work of Andersen, Bollerslev, Diebold, and Labys (2001) seems to indicate. Another research avenue is the use of a better proxy for volatility than the absolute value (or the square) of past observations. The daily range, as indicated by Alizadeh, Brandt, and Diebold (2002), could be this better proxy. Identification of the number of states in the Markov chain is also an ongoing research project.

References

Alizadeh, S., M. W. Brandt, and F. X. Diebold (2002): “Range-Based Estimation of Stochastic Volatility Models,” Journal of Finance, 57, Forthcoming.
Andersen, T., T. Bollerslev, F. X. Diebold, and P. Labys (2001): “The Distribution of Realized Exchange Rate Volatility,” Journal of the American Statistical Association, 96, 42–55.
Andrews, D. W. K. (1999): “Estimation when a parameter is on a boundary,” Econometrica, 67(6), 1341–1383.
Andrews, D. W. K. (2001): “Testing when a parameter is on the boundary of the maintained hypothesis,” Econometrica, 69(3), 683–734.
Ang, A., and J. Chen (2002): “Asymmetric Correlations of Equity Portfolios,” Journal of Financial Economics, 63(3), 443–494.
Baillie, R. T., and T. Bollerslev (1990): “A Multivariate Generalized ARCH Approach to Modeling Risk Premia in Forward Foreign Exchange Rate Markets,” Journal of International Money and Finance, 9, 309–324.
Barnard, J., R. McCulloch, and X.-L. Meng (1999): “Modelling Covariance Matrices in Terms of Standard Deviations and Correlations, with Application to Shrinkage,” Working Paper.
Bollerslev, T. (1986): “Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics, 31, 307–327.
Bollerslev, T. (1990): “Modelling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized ARCH Model,” Review of Economics and Statistics, 72, 498–505.
Bollerslev, T., and R. F. Engle (1993): “Common Persistence in Conditional Variances,” Econometrica, 61(1), 167–186.
Bollerslev, T., R. F. Engle, and J. M. Wooldridge (1988): “A capital asset pricing model with time-varying covariances,” The Journal of Political Economy, 96(1), 116–131.
Davidian, M., and R. J. Carroll (1987): “Variance Function Estimation,” Journal of the American Statistical Association, 82, 1079–1091.

Day, N. E. (1969): “Estimating the components of a mixture of normal distributions,” Biometrika, 56, 463–474.
Dempster, A. P., N. M. Laird, and D. B. Rubin (1977): “Maximum likelihood from incomplete data via the EM algorithm,” J. Roy. Statist. Soc. Ser. B, 39(1), 1–38, With discussion.
Diebold, F. X., and M. Nerlove (1989): “The dynamics of exchange rate volatility: a multivariate latent factor ARCH model,” Journal of Applied Econometrics, 4(1), 1–21.
Ding, Z., and R. F. Engle (2001): “Large Scale Conditional Covariance Matrix Modeling, Estimation and Testing,” Discussion Paper FIN-01-029, Stern School of Business, New York University.
Doornik, J. A. (1999): “Object-Oriented Matrix Programming Using Ox, 3rd ed.,” Timberlake Consultants Press and Oxford: www.nuff.ox.ac.uk/Users/Doornik.
Dufour, J.-M. (2000): “Monte Carlo Tests with Nuisance Parameters: A General Approach to Finite-Sample Inference and Nonstandard Asymptotics in Economics,” Discussion paper, Université de Montréal.
Engle, R. F. (1982): “Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation,” Econometrica, 50, 987–1007.
Engle, R. F. (2002): “Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized Autoregressive Conditional Heteroskedasticity Models,” Journal of Business and Economic Statistics, 20(3), 339–350.
Engle, R. F., and K. F. Kroner (1995): “Multivariate simultaneous generalized ARCH,” Econometric Theory, 11, 122–150.
Engle, R. F., V. K. Ng, and M. Rothschild (1990): “Asset pricing with a factor-ARCH covariance structure: empirical estimates for treasury bills,” Journal of Econometrics, 45, 213–237.
Engle, R. F., and K. Sheppard (2001): “Theoretical and Empirical Properties of Dynamic Conditional Correlation Multivariate GARCH,” UCSD Discussion Paper 2001-15.
Engle, R. F., and R. Susmel (1993): “Common Volatility in International Equity Markets,” Journal of Business and Economic Statistics, 11(2), 167–176.
Granger, C. W. J., and Z. Ding (1993): “Some Properties of Absolute Return. An Alternative Measure of Risk,” Discussion Paper 93-38, University of California, San Diego.

Hamilton, J. D. (1989): “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle,” Econometrica, 57(2), 357–384.
Hamilton, J. D. (1994): Time Series Analysis. Princeton University Press.
Hamilton, J. D. (1996): “Specification Testing in Markov-Switching Time-Series Models,” Journal of Econometrics, 70, 127–157.
Harvey, A., E. Ruiz, and N. Shephard (1994): “Multivariate Stochastic Variance Models,” Review of Economic Studies, 61, 247–264.
Hong, Y. (1996): “Consistent Testing for Serial Correlation of Unknown Form,” Econometrica, 64(4), 837–864.
Kim, C.-J. (1994): “Dynamic linear models with Markov-switching,” Journal of Econometrics, 60(1-2), 1–22.
Kim, S., N. Shephard, and S. Chib (1998): “Stochastic Volatility: Likelihood Inference and Comparison With ARCH Models,” Review of Economic Studies, 65, 361–393.
King, M., E. Sentana, and S. Wadhwani (1994): “Volatility and Links Between National Stock Markets,” Econometrica, 62(4), 901–933.
Ledoit, O., P. Santa-Clara, and M. Wolf (2001): “Flexible Multivariate GARCH Modeling With an Application to International Stock Markets,” Discussion Paper 578, Universitat Pompeu Fabra.
Ledoit, O., and M. Wolf (2001a): “Improved Estimation of the Covariance Matrix of Stock Returns With an Application to Portfolio Selection,” Discussion Paper 586, Universitat Pompeu Fabra.
Ledoit, O., and M. Wolf (2001b): “Some Hypothesis Tests for the Covariance Matrix when the Dimension is Large Compared to the Sample Size,” Discussion Paper 575, Universitat Pompeu Fabra.
Newey, W. K., and D. McFadden (1994): “Large sample estimation and hypothesis testing,” in Handbook of econometrics, Vol. IV, pp. 2111–2245. North-Holland, Amsterdam.
Ng, V. K., R. F. Engle, and M. Rothschild (1992): “A multi-dynamic-factor model for stock returns,” Journal of Econometrics, 52, 245–266.
Pagan, A. (1986): “Two Stage and Related Estimators and Their Applications,” Review of Economic Studies, 53(4), 517–538.

Ross, S. M. (1993): Introduction to Probability Models. Academic Press, fifth edn.
Schwert, G. W. (1989): “Why Does Stock Market Volatility Change Over Time,” Journal of Finance, 44(5), 1115–1153.
Schwert, G. W., and P. J. Seguin (1990): “Heteroskedasticity in Stock Returns,” Journal of Finance, 45(4), 1129–1155.
Taylor, S. J. (1986): Modelling Financial Time Series. John Wiley and Sons.
Tse, Y. K., and K. C. Tsui (2002): “A Multivariate Generalized Autoregressive Conditional Heteroscedasticity Model With Time-Varying Correlations,” Journal of Business and Economic Statistics, 20(3), 351–362.
Vuong, Q. H. (1989): “Likelihood ratio tests for model selection and non-nested hypotheses,” Econometrica, 57, 307–333.

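Appendix

A. Proofs

PROOF OF THEOREM 3.2. Scaling (3.7) by $1/T$, the objective limit is
\[
-\frac{1}{2} E_{\theta_0}\left[ \sum_{i=1}^{K} \left( \log 2\pi + 2 \log s_{i,t} + \frac{y_{i,t}^2}{s_{i,t}^2} \right) \right],
\]
where $E_{\theta_0}$ is the expectation with respect to the true density. Scaling (2.3) by $1/T$, the objective limit is
\[
-\frac{1}{2} E_{\theta_0}\left[ K \log 2\pi + \log |\Gamma_t| + 2 \sum_{i=1}^{K} \log s_{i,t} + \tilde{U}_t' \Gamma_t^{-1} \tilde{U}_t \right].
\]
If we can show that both sets of first order conditions with respect to $\theta_1$ are satisfied for the same vector of parameters, then we can conclude that the estimates from (3.7) will converge to their true value. Denoting by $\theta_{i,j}$ one of the parameters in $\theta_1$ which appears in the expression of $s_{i,t}$, with $j = 1, \ldots, p_i + q_i + 1$, and writing $\tilde{u}_{i,t} = y_{i,t}/s_{i,t}$, we can write the first order condition for the objective limit of (3.7) as
\[
-\frac{1}{2} E_{\theta_0}\left[ \frac{2}{s_{i,t}} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} - 2 \frac{y_{i,t}^2}{s_{i,t}^3} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} \right]
= -\frac{1}{2} E_{\theta_0}\left[ \frac{2}{s_{i,t}} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} - 2 \tilde{u}_{i,t}^2 \frac{1}{s_{i,t}} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} \right] = 0,
\]
while the first order conditions for the objective limit of (2.3) are
\[
-\frac{1}{2} E_{\theta_0}\left[ \frac{2}{s_{i,t}} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} + \frac{\partial}{\partial \theta_{i,j}} \left( \tilde{U}_t' \Gamma_t^{-1} \tilde{U}_t \right) \right] = 0.
\]
With $\tilde{U}_t = S_t^{-1} Y_t$, the second part of the last equation becomes:
\[
2 E_{\theta_0}\left[ \tilde{U}_t' \Gamma_t^{-1} \frac{\partial S_t^{-1}}{\partial \theta_{i,j}} Y_t \right]
= -2 E_{\theta_0}\left[ \tilde{U}_t' \Gamma_t^{-1} \begin{pmatrix} 0 \\ \vdots \\ \frac{1}{s_{i,t}^2} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} y_{i,t} \\ \vdots \\ 0 \end{pmatrix} \right]
= -2 E_{\theta_0}\left[ \tilde{U}_t' \Gamma_t^{-1} \begin{pmatrix} 0 \\ \vdots \\ \tilde{u}_{i,t} \\ \vdots \\ 0 \end{pmatrix} \frac{1}{s_{i,t}} \frac{\partial s_{i,t}}{\partial \theta_{i,j}} \right].
\]
Since $\tilde{U}_t' \Gamma_t^{-1} [0, \ldots, \tilde{u}_{i,t}, \ldots, 0]'$ is a random variable with unit mean (its expectation is the $i$th diagonal element of $\Gamma_t \Gamma_t^{-1}$, because $\tilde{U}_t$ has correlation matrix $\Gamma_t$), we see that the two sets of first order conditions are equivalent and so will give the same estimates asymptotically. For the rest of the proof see Newey and McFadden (1994).

B. Moments of the ARMACH model

The properties of ARMACH processes are not well known but we can reuse the work of Bollerslev (1986) to find the conditions under which the $2m$th moment is finite. For an ARMACH(1,1), where $s_t = \omega + \tilde{\alpha} |y_{t-1}| + \beta s_{t-1}$, first take the filtered process $y_t$:
\[
y_t = s_t \tilde{u}_t, \qquad y_t^{2m} = s_t^{2m} \tilde{u}_t^{2m}.
\]
Taking the unconditional expectation and assuming that $\tilde{u}_t$ follows a Gaussian distribution, $E[y_t^{2m}] = a_m E[s_t^{2m}]$ where $a_m$ is given by
\[
a_0 = 1, \qquad a_m = \prod_{j=1}^{m} (2j - 1).
\]
Using the binomial formula the $2m$th power of $s_t$ is
\[
s_t^{2m} = \sum_{n=0}^{2m} \binom{2m}{n} \omega^{2m-n} \sum_{j=0}^{n} \binom{n}{j} \tilde{\alpha}^{j} \beta^{n-j} |y_{t-1}|^{j} s_{t-1}^{n-j}.
\]
Because
\[
E\left[ |y_{t-1}|^{j} s_{t-1}^{n-j} \mid I_{t-2} \right]
= s_{t-1}^{n-j} E\left[ |y_{t-1}|^{j} \mid I_{t-2} \right]
= s_{t-1}^{n-j} s_{t-1}^{j} E\left[ |\tilde{u}_{t-1}|^{j} \mid I_{t-2} \right]
= s_{t-1}^{n} E|\tilde{u}_{t-1}|^{j}
= s_{t-1}^{n} b_j
\]
(the $b_j$'s are given in Table 1 for $j \le 8$) we have
\[
E\left[ s_t^{2m} \mid I_{t-2} \right] = \sum_{n=0}^{2m} \binom{2m}{n} \omega^{2m-n} s_{t-1}^{n} \sum_{j=0}^{n} \binom{n}{j} \tilde{\alpha}^{j} \beta^{n-j} b_j,
\]
which is the equivalent to the expression (A10) in Bollerslev (1986). Let $w_t = (s_t^{2m}, s_t^{2m-1}, \ldots, s_t)'$, then
\[
E[w_t \mid I_{t-2}] = d + C w_{t-1},
\]
where $C$ is a $2m \times 2m$ upper triangular matrix with diagonal elements
\[
\mu(\tilde{\alpha}, \beta, i) = \sum_{j=0}^{i} \binom{i}{j} \tilde{\alpha}^{j} \beta^{i-j} b_j, \qquad i = 1, \ldots, 2m.
\]
Following the remaining argument in this article the $2m$th moment is finite if $\mu(\tilde{\alpha}, \beta, i) < 1$ for $i \le 2m$. For the GARCH counterpart of $\mu$, Bollerslev (1986) shows that if $\mu(\alpha, \beta, i) < 1$ then $\mu(\alpha, \beta, i-1) < 1$, so that only one condition has to be checked to ensure that the $2m$th moment is finite. We have not proved a similar result for the ARMACH yet.

Table 1: Moments of the absolute value of a standard normal variable.

i            1         2      3        4      5        6       7        8
E|u_t|^i     0.797885  1.00   1.59577  3.00   6.38308  15.00   38.2985  105.00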
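The quantities used in this appendix are simple to evaluate numerically. The sketch below is an illustration rather than code from the paper: the function names are hypothetical, the closed form $E|u|^j = 2^{j/2}\,\Gamma((j+1)/2)/\sqrt{\pi}$ for a standard normal $u$ is used to obtain the $b_j$'s, and the parameter values passed to the functions are arbitrary examples. It computes $b_j$, $a_m$ and $\mu(\tilde{\alpha}, \beta, i)$, and checks the condition $\mu(\tilde{\alpha}, \beta, i) < 1$ for every $i \le 2m$, since no single-condition shortcut is available here.

```python
from math import comb, gamma, pi, sqrt


def abs_normal_moment(j):
    """b_j = E|u|^j for a standard normal u (closed form via the gamma function)."""
    return 2 ** (j / 2) * gamma((j + 1) / 2) / sqrt(pi)


def even_normal_moment(m):
    """a_m = E[u^(2m)] = prod_{j=1..m} (2j - 1), with a_0 = 1."""
    out = 1
    for j in range(1, m + 1):
        out *= 2 * j - 1
    return out


def mu(alpha_tilde, beta, i):
    """Diagonal element mu(alpha_tilde, beta, i) of the matrix C."""
    return sum(
        comb(i, j) * alpha_tilde ** j * beta ** (i - j) * abs_normal_moment(j)
        for j in range(i + 1)
    )


def has_finite_moment(alpha_tilde, beta, m):
    """Check mu(alpha_tilde, beta, i) < 1 for every i = 1, ..., 2m."""
    return all(mu(alpha_tilde, beta, i) < 1 for i in range(1, 2 * m + 1))


# Reproduce the b_j values for j = 1, ..., 8.
for j in range(1, 9):
    print(j, round(abs_normal_moment(j), 6))

# Arbitrary example parameters (not estimates from the paper).
print(has_finite_moment(0.10, 0.85, m=1))
print(has_finite_moment(0.50, 0.90, m=1))
```

The even-order values of `abs_normal_moment` coincide with `even_normal_moment`, which gives a quick consistency check between the $a_m$ and $b_j$ definitions.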

Figure 1: Exchange rate series. The top and bottom figures are respectively the level and the growth rate of each series. [Panels: Pound, Deutschmark, Yen, Swiss-Franc; horizontal axis from 0 to 900.]

Figure 2: ACF of the cross-product of the standardized residuals from an ARMACH(1,1). [Panels: Pound/DM, Pound/Yen, Pound/SF, DM/Yen, DM/SF, Yen/SF; horizontal axis: lags 0 to 100.]

Figure 3: ACF of the cross-product of the standardized residuals with data simulated from a regime switching model. Sample size is 1000. [Panels: Pound/DM, Pound/Yen, Pound/SF, DM/Yen, DM/SF, Yen/SF; horizontal axis: lags 0 to 100.]

Figure 4: ACF of the cross-product of the standardized residuals with data simulated from a DCC-GARCH. Sample size is 1000. [Panels: Pound/DM, Pound/Yen, Pound/SF, DM/Yen, DM/SF, Yen/SF; horizontal axis: lags 0 to 100.]

Table 2: Estimation results for the unrestricted model with two states and ARMACH. Standard errors are in parentheses. The log-likelihood value is -2011.6. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1 and State 2; transition matrix Π; ω, α̃, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Table 3: Estimation results for the unrestricted model with two states and GARCH. Standard errors are in parentheses. The log-likelihood value is -1994.7. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1 and State 2; transition matrix Π; ω, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Table 4: Estimation results for the restricted model with two states and ARMACH. Standard errors are in parentheses. The log-likelihood value is -2025.2. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1 and State 2; transition matrix Π; ω, α̃, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Table 5: Estimation results for the restricted model with two states and GARCH. Standard errors are in parentheses. The log-likelihood value is -2009.0. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1 and State 2; transition matrix Π; ω, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Figure 5: Smoothed probabilities for the models with two states and ARMACH. The top and bottom figures represent the smoothed state probabilities of being in state 1 for the unrestricted and restricted model respectively. [Vertical axis from 0 to 1; horizontal axis from 0 to 900.]

Figure 6: Smoothed correlations for the models with two states and ARMACH. The top and bottom panels are for the unrestricted and restricted version of the model respectively. [Panels in each version: Correlation Pound/DM, Pound/Yen, Pound/SF, DM/Yen, DM/SF, Yen/SF; horizontal axis from 0 to 900.]

Table 6: Estimation results for the unrestricted model with three states and ARMACH. Standard errors are in parentheses. The log-likelihood value is -1971.7. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1, State 2 and State 3; transition matrix Π; ω, α̃, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Table 7: Estimation results for the unrestricted model with three states and GARCH. Standard errors are in parentheses. The log-likelihood value is -1955.3. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1, State 2 and State 3; transition matrix Π; ω, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Table 8: Estimation results for the restricted model with three states and ARMACH. Standard errors are in parentheses. The log-likelihood value is -1975.7. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1, State 2 and State 3; transition matrix Π; ω, α̃, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Table 9: Estimation results for the restricted model with three states and GARCH. Standard errors are in parentheses. The log-likelihood value is -1961.3. [Entries: correlations Γ_{1,2}, Γ_{1,3}, Γ_{1,4}, Γ_{2,3}, Γ_{2,4}, Γ_{3,4} for State 1, State 2 and State 3; transition matrix Π; ω, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc.]

Figure 7: Smoothed probabilities for the three-state case with ARMACH. The top and bottom figures represent the smoothed state probabilities of being in each state for the unrestricted and restricted model respectively. [Panels in each version: Smoothed − State 1, Smoothed − State 2, Smoothed − State 3; horizontal axis from 0 to 900.]

Figure 8: Smoothed correlations for the three-state case with ARMACH. The top and bottom panels are for the unrestricted and restricted version of the model respectively. [Panels in each version: Correlation Pound/DM, Pound/Yen, Pound/SF, DM/Yen, DM/SF, Yen/SF; horizontal axis from 0 to 900.]

Table 10: Estimation results for the DCC-GARCH(1,1). Standard errors are in parentheses. The log-likelihood value is -2109.2. [Entries: estimates of ω, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc; estimated correlation matrix Γ with ones on the diagonal; estimates of a and b.]

Table 11: Estimation results for the DCC-ARMACH(1,1). Standard errors are in parentheses. The log-likelihood value is -2137.8. [Entries: estimates of ω, α̃, α, β for the series Pound, Deutschmark, Yen, Swiss-Franc; estimated correlation matrix Γ with ones on the diagonal; estimates of a and b.]

Figure 9: Correlations for the DCC-ARMACH(1,1). [Panels: Correlation Pound/DM, Pound/Yen, Pound/SF, DM/Yen, DM/SF, Yen/SF; horizontal axis from 0 to 900.]

Table 12: Likelihood value and number of parameters for various models.

Model                          Log-likelihood   Nb. par.
Unrestricted 3-state GARCH     -1955.3          38
Restricted 3-state GARCH       -1961.3          26
Unrestricted 3-state ARMACH    -1971.7          38
Restricted 3-state ARMACH      -1975.7          26
Unrestricted 2-state GARCH     -1994.7          27
Restricted 2-state GARCH       -2009.0          21
Unrestricted 2-state ARMACH    -2011.6          27
Restricted 2-state ARMACH      -2025.2          21
DCC-GARCH(1,1)                 -2109.2          20
DCC-ARMACH(1,1)                -2137.8          20
CCC-GARCH(1,1)                 -2272.1          18
CCC-ARMACH(1,1)                -2301.8          18

Figure 10: Conditional variance from a GARCH(1,1) for the return on the Dow Jones index. [Panel: Variance of the Dow Jones index returns − GARCH(1,1); horizontal axis from 0 to 900.]
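One way to read the log-likelihoods and parameter counts of Table 12 together is to penalize each fit for its number of parameters. The sketch below does this with the Akaike information criterion, AIC = 2k − 2 log L. This criterion is an assumption added here for illustration and is not used in the paper; the dictionary simply restates the Table 12 entries, and the names `table12` and `aic` are hypothetical.

```python
# (log-likelihood, number of parameters) as reported in Table 12.
table12 = {
    "Unrestricted 3-state GARCH": (-1955.3, 38),
    "Restricted 3-state GARCH": (-1961.3, 26),
    "Unrestricted 3-state ARMACH": (-1971.7, 38),
    "Restricted 3-state ARMACH": (-1975.7, 26),
    "Unrestricted 2-state GARCH": (-1994.7, 27),
    "Restricted 2-state GARCH": (-2009.0, 21),
    "Unrestricted 2-state ARMACH": (-2011.6, 27),
    "Restricted 2-state ARMACH": (-2025.2, 21),
    "DCC-GARCH(1,1)": (-2109.2, 20),
    "DCC-ARMACH(1,1)": (-2137.8, 20),
    "CCC-GARCH(1,1)": (-2272.1, 18),
    "CCC-ARMACH(1,1)": (-2301.8, 18),
}


def aic(loglik, n_params):
    """Akaike information criterion: smaller values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


# Sort the models from the smallest (best) to the largest AIC.
ranking = sorted(table12, key=lambda name: aic(*table12[name]))
for name in ranking:
    print(f"{name:30s} {aic(*table12[name]):8.1f}")
```

Because the regime switching models are not nested in the DCC models, such a criterion is only a rough descriptive device; a formal comparison would call for a non-nested test.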
