
Unit Roots, Structural Breaks and Cointegration Analysis: A Review of the Available Processes and Procedures and an Application

(Presented at the Macroeconomics and Financial Economics Workshop: Recent Developments in Theory and Empirical Modeling, held from October 8 to October 9 at Eastern Mediterranean University)

Asst. Prof. Dr. Mete Feridun
Department of Banking and Finance
Faculty of Business and Economics
Eastern Mediterranean University


 

Testing for stationarity
Testing for structural breaks
Dealing with structural breaks in cointegration analysis
Tests for parameter stability

Time-series Econometrics

Prior to any time-series econometric analysis, it is necessary to investigate the stationarity properties of the variables. A stationary series fluctuates around a constant long-run mean, which implies that the series has a finite variance that does not depend on time. On the other hand, non-stationary series have no tendency to return to a long-run deterministic path and the variances of the series are time-dependent.

Non-stationary series suffer permanent effects from random shocks and thus the series follow a random walk. The problems caused by non-stationary variables in standard regression analysis have been well documented in the time-series literature. The standard classical estimation methods are based on the assumption that the mean and variances of the stochastic series are finite, constant and time invariant.

Given that economic time-series are typically described as nonstationary processes, the estimates of such variables will lead to spurious regression and their economic interpretation will not be meaningful. A spurious regression occurs when a pair of independent series, but with strong temporal properties, are found apparently to be related according to standard inference in an OLS regression. If the unit root tests find that a series contains one unit root, the appropriate route in this case is to transform the data by differencing the variables prior to their inclusion in the regression model, but this incurs a loss of important long-run information. Alternatively, if a long-run relationship exists among the set of variables that share similar nonstationarity properties, that is, if the variables are cointegrated, regression involving the levels of the variables can proceed without generating spurious results.

In this case, a “balanced” regression leads to meaningful interpretations and evades the spurious regression problem. In binary choice models also, the methods of estimation are based on the assumption that the means and variances of the variables being tested are constant over time, i.e. stationary. Incorporating non-stationary or unit root variables in estimating the regression equations using these models gives misleading inferences.
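The spurious regression problem described above can be illustrated with a small Monte Carlo sketch. It regresses one simulated random walk on another, independent one, and counts how often the slope looks "significant" at the conventional |t| > 1.96 threshold, first in levels and then in first differences. The sample size, number of replications, seed and function names are all illustrative assumptions, not part of the original presentation.

```python
import math
import random


def ols_slope_t(x, y):
    """Return (slope, t-statistic of the slope) for y = a + b*x + e by OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return b, b / se_b


def random_walk(n, rng):
    """Cumulative sum of n standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def rejection_rates(n_obs=100, n_reps=200, seed=0):
    """Share of replications with |t| > 1.96, in levels and in differences."""
    rng = random.Random(seed)
    hits_levels = hits_diffs = 0
    for _ in range(n_reps):
        x, y = random_walk(n_obs, rng), random_walk(n_obs, rng)
        hits_levels += int(abs(ols_slope_t(x, y)[1]) > 1.96)
        hits_diffs += int(abs(ols_slope_t(diff(x), diff(y))[1]) > 1.96)
    return hits_levels / n_reps, hits_diffs / n_reps


if __name__ == "__main__":
    levels, diffs = rejection_rates()
    print(f"'significant' slopes in levels:      {levels:.2f}")
    print(f"'significant' slopes in differences: {diffs:.2f}")
```

Because the two series are independent by construction, any "significant" slope in levels is spurious; differencing removes the problem at the cost of the long-run information discussed above.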

Unit Root Tests

Traditionally, Augmented Dickey–Fuller (ADF) and Phillips–Perron (PP) tests are used to assess the order of integration of the variables. Usually, all variables are tested with a linear trend and/or intercept or none. Uniform outcomes of both tests are necessary for the final conclusion about the stationarity properties of each series.
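As a rough illustration of what such a test computes, the sketch below runs the simplest Dickey–Fuller regression with an intercept, dy_t = alpha + rho * y_{t-1} + e_t, and returns the t-statistic on rho. It is deliberately not the full ADF test: no lagged differences are added and no trend is included. The function names and simulated series are assumptions made for the example; the statistic has to be compared with Dickey–Fuller critical values (about -2.86 at the 5% level asymptotically for the intercept-only case), not with the usual normal ones.

```python
import math
import random

DF_5PCT_CONSTANT = -2.86  # asymptotic 5% critical value, intercept-only case


def dickey_fuller_stat(y):
    """t-statistic on rho in dy_t = alpha + rho * y_{t-1} + e_t (no lagged differences)."""
    lagged = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(dy)
    mx, my = sum(lagged) / n, sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in lagged)
    sxy = sum((v - mx) * (d - my) for v, d in zip(lagged, dy))
    rho = sxy / sxx
    alpha = my - rho * mx
    rss = sum((d - alpha - rho * v) ** 2 for v, d in zip(lagged, dy))
    return rho / math.sqrt(rss / (n - 2) / sxx)


def simulate_ar1(phi, n, seed):
    """AR(1) series y_t = phi * y_{t-1} + e_t; phi = 1 gives a random walk."""
    rng = random.Random(seed)
    y, level = [], 0.0
    for _ in range(n):
        level = phi * level + rng.gauss(0.0, 1.0)
        y.append(level)
    return y


if __name__ == "__main__":
    for label, phi in (("stationary AR(1), phi=0.5", 0.5), ("random walk, phi=1", 1.0)):
        stat = dickey_fuller_stat(simulate_ar1(phi, 500, seed=1))
        verdict = "reject unit root" if stat < DF_5PCT_CONSTANT else "cannot reject unit root"
        print(f"{label}: DF statistic = {stat:.2f} -> {verdict}")
```

In applied work a packaged ADF or PP routine would normally be used, since it handles lag augmentation, deterministic terms and finite-sample critical values.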

Structural Breaks

A well-known weakness of the ADF and PP unit root tests is their potential confusion of structural breaks in the series as evidence of non-stationarity. In other words, they may fail to reject the unit root hypothesis if the series have a structural break. That is, for the series that are found to be I(1), there may be a possibility that they are in fact stationary around the structural break(s), I(0), but are erroneously classified as I(1).

Perron (1989) shows that failure to allow for an existing break leads to a bias that reduces the ability to reject a false unit root null hypothesis. To overcome this, the author proposes allowing for a known or exogenous structural break in the Augmented Dickey-Fuller (ADF) tests. Following this development, many authors, including Zivot and Andrews (1992) and Perron (1997), proposed determining the break point ‘endogenously’ from the data.

Shrestha and Chowdhury (2005) argue that, in the case of a structural break, the testing power of the Perron-Vogelsang unit root test is superior to that of the Zivot-Andrews test. Enders (2004) argues that Perron-Vogelsang (1992) unit root tests are more appropriate “if the date of the break is uncertain”. Applying the unit root tests which allow for the possible presence of the structural break prevents obtaining a test result which is possibly biased towards non-rejection, as suspected by Perron (1989). Also, since this procedure can identify the date of the structural break, it facilitates the analysis of whether a structural break on a certain variable is associated with a particular event such as a change in government policy, a currency crisis, war and so forth.
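The idea of letting the data choose the break date can be sketched with a much simpler problem than the Zivot-Andrews or Perron-Vogelsang tests themselves. The example below fits a one-time mean-shift model for every candidate break date in a trimmed sample and keeps the date with the smallest residual sum of squares. It is only a grid-search illustration under assumed settings (15% trimming, simulated data, hypothetical function names), not an implementation of either unit root test.

```python
import random


def rss_about_mean(values):
    """Residual sum of squares of a segment around its own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def estimate_break_date(y, trim=0.15):
    """Index of the first post-break observation that minimises the RSS
    of a one-time mean-shift model, searched over the trimmed sample."""
    n = len(y)
    start, stop = int(trim * n), int((1 - trim) * n)
    best_tb, best_rss = None, float("inf")
    for tb in range(start, stop):
        rss = rss_about_mean(y[:tb]) + rss_about_mean(y[tb:])
        if rss < best_rss:
            best_tb, best_rss = tb, rss
    return best_tb


def series_with_mean_shift(n, true_break, shift, seed=0):
    """White noise whose mean rises by `shift` from index `true_break` onwards."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (shift if t >= true_break else 0.0)
            for t in range(n)]


if __name__ == "__main__":
    y = series_with_mean_shift(n=120, true_break=60, shift=5.0)
    print("estimated break index:", estimate_break_date(y))
```

Once a date has been estimated in this way, it can be compared with known events such as policy changes or crises, which is the use described above.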

Multiple Structural Breaks

The Zivot-Andrews and Perron-Vogelsang (1992) unit root tests allow for one structural break, whereas the Clemente-Montanes-Reyes (1998) unit root test allows for two structural breaks in the mean of the series. Clemente et al (1998) base their approach on Perron and Vogelsang (1992), allowing for the possibility of having two structural breaks in the mean of the series.

if they exist. In these tests. The advantage of these tests is that they do not require an a priori knowledge of the structural break dates. 2003: 304). Ben-David et al (2003) cautions that “just as failure to allow one break can cause non-rejection of the unit root null by the Augmented Dickey –Fuller test.   . can cause non-rejection of the unit root null by the tests which only incorporate one break” (Ben-David et al. failure to allow for two breaks. the null hypothesis is that the series has a unit root with structural break(s) against the alternative hypothesis that they are stationary with break(s).

Lumsdaine and Papell (1997) extended the Zivot and Andrews (1992) model to accommodate two structural breaks. However, this test was criticized for the absence of the breaks under the null hypothesis of unit root as this could result in a tendency for these tests to suggest evidence of stationarity with breaks (see Glynn et al, 2007).

Hence, the Perron-Vogelsang and Clemente-Montanes-Reyes unit root tests are preferable. Both of these tests offer two models: (1) an additive outliers (AO) model, which captures a sudden change in the mean of a series, and (2) an innovational outliers (IO) model, which allows for a gradual shift in the mean of the series.
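The difference between a sudden and a gradual shift in the mean can be made concrete with two noise-free paths. In this stylised sketch the AO path jumps by delta at the break date, while in the IO path the shift enters through assumed AR(1) dynamics and the mean drifts towards mu + delta / (1 - phi). The AR(1) form, the parameter values and the function names are assumptions chosen for illustration; the actual test regressions are richer.

```python
def additive_outlier_path(n, tb, mu, delta):
    """AO-style path: the mean jumps by delta at tb and stays there."""
    return [mu + (delta if t >= tb else 0.0) for t in range(n)]


def innovational_outlier_path(n, tb, mu, delta, phi):
    """IO-style path: the shift feeds through AR(1) dynamics, so the mean
    moves gradually from mu towards mu + delta / (1 - phi)."""
    y, level = [], mu
    for t in range(n):
        step = delta if t >= tb else 0.0
        level = mu * (1 - phi) + phi * level + step
        y.append(level)
    return y


if __name__ == "__main__":
    # Both paths end up near a mean of 2.0, but they get there differently.
    ao = additive_outlier_path(n=12, tb=5, mu=0.0, delta=2.0)
    io = innovational_outlier_path(n=12, tb=5, mu=0.0, delta=1.0, phi=0.5)
    for t, (a, i) in enumerate(zip(ao, io)):
        print(f"t={t:2d}  AO={a:.3f}  IO={i:.3f}")
```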

According to Baum (2004), if the estimates of the Perron-Vogelsang and Clemente-Montanes-Reyes unit root tests provide evidence of significant additive or innovational outliers in the time series, the results derived from ADF and PP tests are doubtful, as this is evidence that the model excluding structural breaks is misspecified. Therefore, in applying unit root tests in time series that exhibit structural breaks, only the results from the Clemente-Montanes-Reyes unit root tests should be considered if the two structural breaks indicated by the respective tests are statistically significant (at the 5% level as used by STATA).

On the other hand, if the results of the Perron-Vogelsang and Clemente-Montanes-Reyes unit root tests show no evidence of two significant breaks in the series, the results from the Perron–Vogelsang unit root tests are considered. If these tests show no evidence of a structural break, the ADF and PP tests can be considered.
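The selection sequence described above can be written down as a small decision function. This is only a restatement of the rule in code; the function name and the boolean inputs (whether the breaks are significant at the chosen level) are hypothetical and would come from the output of the respective tests.

```python
def select_unit_root_results(two_breaks_significant, one_break_significant):
    """Return which test's results to rely on, following the sequence in the text.

    two_breaks_significant: both breaks in the two-break test are significant.
    one_break_significant:  the break in the one-break test is significant.
    """
    if two_breaks_significant:
        return "Clemente-Montanes-Reyes"
    if one_break_significant:
        return "Perron-Vogelsang"
    return "ADF and PP"


if __name__ == "__main__":
    print(select_unit_root_results(True, True))    # two significant breaks
    print(select_unit_root_results(False, True))   # only one significant break
    print(select_unit_root_results(False, False))  # no significant break
```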

Cointegration Analysis

If it is certain that the underlying series are all I(1), the conventional Johansen cointegration technique can be safely used. However, in the case where the presence of structural breaks introduces uncertainty as to the true order of integration of the variables, the autoregressive distributed lag (ARDL) bounds testing procedure introduced by Pesaran and Pesaran (1997), Pesaran and Shin (1999), and Pesaran et al (2001) should be preferred. The advantage of this methodology is that it yields valid results regardless of whether the underlying variables are I(1) or I(0), or a combination of both.

Conventionally, in the case where the unit root tests reveal that a series contains one unit root, the appropriate method is to transform the data by differencing the variables prior to their inclusion in the regression model to avoid the risk of spurious regression. Nonetheless, this incurs a loss of important long-run information. Alternatively, if a long-run relationship exists among the set of variables that share similar nonstationarity properties, that is, if the variables are cointegrated, regression involving the levels of the variables can proceed without generating spurious results. In this case, a “balanced” regression leads to meaningful interpretations and evades the spurious regression problem.

Essentially, variables are cointegrated when a long-run linear relationship is obtained from a set of variables that share the same non-stationary properties. A search is made for a linear combination of such variables such that the combination is stationary. If such a stationary combination exists, then the variables are said to be cointegrated, meaning even though they are individually not stationary, they are bound by an equilibrium relationship. Hence, the intuition behind cointegration is that it allows capturing the equilibrium relationships dictated by the economic theory between nonstationary variables within a stationary model. Cointegration vectors are of considerable interest since they determine I(0) relations that hold between variables which are individually non-stationary.

In this case, the application of traditional econometric modelling to non-stationary time series data generates meaningful results. An advantage of the cointegration approach is that it provides a direct test of the economic theory and enables utilization of the estimated long-run parameters in the estimation of the short-run disequilibrium relationships. Although Engle and Granger’s (1987) original definition of cointegration refers to variables that are integrated of the same order, Enders (2004) argues that: “It is possible to find equilibrium relationships among groups of variables that are integrated of different orders”. Asteriou and Hall (2007) also explain that in cases where a mix of I(0) and I(1) variables are present in the model, cointegrating relationships might exist.

Similarly, Lütkepohl (2004) explains: “Occasionally it is convenient to consider systems with both I(1) and I(0) variables. Thereby the concept of cointegration is extended by calling any linear combination that is I(0) a cointegration relation, although this terminology is not in the spirit of the original definition because it can happen that a linear combination of I(0) variables is called a cointegration relation”.

Hence, it is possible to have series with different orders of integration. In this case, a subset of the higher order series must cointegrate to the order of the lower order series. In the multivariate case, it is possible to find long-run equilibrium relationships among a set of I(0) and I(1) variables if their linear combination reveals a cointegrating relationship. The long-run relationship among the variables could be achieved if the low frequency or the stochastic trend components of a set of variables offset each other to achieve a stationary linear combination of the variables. Therefore, even in the presence of a set of variables which contains both I(1) and I(0) variables, cointegration analysis is applicable and the presence of a long-run linear combination denotes the existence of cointegrated variables.
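The search for a stationary linear combination can be sketched with the two-step residual-based idea associated with Engle and Granger (1987): estimate the levels regression by OLS, then check whether its residuals look stationary. The example below uses simulated data in which y is built as 1 + 2x plus stationary noise, with x a random walk; the data-generating values and function names are assumptions. The residual statistic must be judged against residual-based cointegration critical values, which are more negative than the ordinary Dickey–Fuller ones, so no accept/reject rule is hard-coded here.

```python
import math
import random


def ols(x, y):
    """Intercept, slope and residuals of y = a + b*x + u."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return a, b, [yi - a - b * xi for xi, yi in zip(x, y)]


def residual_df_stat(u):
    """t-statistic on rho in du_t = rho * u_{t-1} + e_t.
    No intercept is used because OLS residuals already have mean zero."""
    lagged = u[:-1]
    du = [b - a for a, b in zip(u, u[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, du)) / sxx
    rss = sum((d - rho * v) ** 2 for v, d in zip(lagged, du))
    return rho / math.sqrt(rss / (len(du) - 1) / sxx)


def simulate_cointegrated_pair(n=300, beta=2.0, seed=0):
    """x is a random walk; y = 1 + beta * x + stationary noise."""
    rng = random.Random(seed)
    x, y, level = [], [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        x.append(level)
        y.append(1.0 + beta * level + rng.gauss(0.0, 1.0))
    return x, y


if __name__ == "__main__":
    x, y = simulate_cointegrated_pair()
    a, b, u = ols(x, y)
    print(f"estimated long-run slope: {b:.3f}")
    print(f"DF-type statistic on residuals: {residual_df_stat(u):.2f}")
```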

Autoregressive Distributed Lag (ARDL) Bounds Tests

The choice of the ARDL bounds testing procedure as a tool for investigating the existence of a long-run relationship is based on the following considerations: First and foremost, both the dependent and the independent variables can be introduced in the model with lags. “Autoregressive” refers to lags in the dependent variable. Therefore, the past values of a variable are allowed to determine its present value. On the other hand, “distributed lag” refers to the lags of the explanatory variables.

This is a highly plausible feature: Conceptually, the dependence of the dependent variable on the independent variables may or may not be instantaneous depending on the theoretical considerations. A change in the economic variables may not necessarily lead to an immediate change in another variable.
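A minimal sketch of this delayed response, assuming the simplest ARDL(1,1) form y_t = phi * y_{t-1} + beta0 * x_t + beta1 * x_{t-1} with no intercept and no shocks: after a permanent unit rise in x, y adjusts step by step towards the long-run effect (beta0 + beta1) / (1 - phi). The coefficient values and function names are illustrative assumptions.

```python
def ardl11_response(phi, beta0, beta1, x_path, y_start=0.0):
    """Iterate y_t = phi*y_{t-1} + beta0*x_t + beta1*x_{t-1} without shocks.
    x is taken to be 0 before the first element of x_path."""
    y, y_prev, x_prev = [], y_start, 0.0
    for x_t in x_path:
        y_t = phi * y_prev + beta0 * x_t + beta1 * x_prev
        y.append(y_t)
        y_prev, x_prev = y_t, x_t
    return y


def long_run_multiplier(phi, beta0, beta1):
    """Long-run effect of a permanent unit change in x; requires |phi| < 1."""
    if abs(phi) >= 1.0:
        raise ValueError("the long-run multiplier is only defined for |phi| < 1")
    return (beta0 + beta1) / (1.0 - phi)


if __name__ == "__main__":
    # x jumps from 0 to 1 and stays there; y catches up gradually.
    path = ardl11_response(phi=0.5, beta0=0.3, beta1=0.2, x_path=[1.0] * 10)
    for t, value in enumerate(path):
        print(f"t={t}: y={value:.4f}")
    print("long-run multiplier:", long_run_multiplier(0.5, 0.3, 0.2))
```

The impact effect is beta0, the lagged regressor and the lagged dependent variable spread the remaining adjustment over later periods.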

in some cases. The reaction to a change in each variable may be different depending on various factors. Hence. they may respond to the economic developments with a lag and there is usually no reason to assume that all regressors should have the same lags. ARDL bounds testing approach is appropriate as it allows flexibility in terms of the structure of lags of the regressors in the ARDL model as opposed to the cointegration VAR models where different lags for different variables is not permitted (Pesaran et al. 2001).   Hence. the ARDL approach has the advantage that it takes a sufficient number of lags to capture the data generating process in a general-to-specific modelling framework  . In this respect. It goes without saying that the correct choice of the order of the ARDL model is very important in the long-run analysis.

Also, the lag orders can be selected based on four different selection criteria taking into consideration the results of the diagnostic tests for residual serial correlation, functional form misspecification, non-normality, and heteroscedasticity. Furthermore, as shown by Pesaran et al (2001), the ARDL models yield consistent estimates of the long-run coefficients that are asymptotically normal irrespective of whether the underlying regressors are purely I(0), I(1) or mutually cointegrated. In other words, this procedure allows making inferences in the absence of any a priori information about the order of integration of the series under investigation.

As demonstrated by Pesaran and Shin (1999), the small sample properties of the bounds testing approach are superior to those of the traditional Johansen cointegration approach, which typically requires a large sample size for the results to be valid. In particular, Pesaran and Shin (1999) show that the ARDL approach has better properties in sample sizes up to 150 observations. Since the ARDL approach draws on the unrestricted error correction model, it is likely to have better statistical properties than the traditional cointegration techniques. On the other hand, Narayan and Smyth (2005) provide exact critical values for up to 80 observations.

The ARDL approach is particularly applicable in the presence of the disequilibrium nature of the time series data stemming from the presence of possible structural breaks, as happens with most economic variables. With the ARDL approach, it can be conveniently tested whether the underlying structural breaks have affected the long-run stability of the estimated coefficients. As suggested by Pesaran (1997), the cumulative sum of recursive residuals (CUSUM) and the CUSUM of squares (CUSUMSQ) tests proposed by Brown et al (1975) can be applied to the residuals of the estimated error correction models to test parameter constancy.

The order of the lags in the ARDL model is selected using the appropriate selection criteria such as the Akaike Information Criterion (AIC), Schwarz Bayesian Criterion (SBC), Hannan-Quinn Criterion (HQC) and R2, ensuring that there is no evidence of residual serial correlation, functional form misspecification, non-normality and heteroscedasticity. Having established the existence of a long-run relationship based on F-tests, the second step of the ARDL analysis is to estimate the long-run and the associated short-run coefficients. The long-run relationship is regarded as a steady-state equilibrium, whereas the short-run relationship is evaluated by the magnitude of the deviation from the equilibrium.
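The F-test step works by comparing the computed F-statistic with a pair of critical value bounds, as in Pesaran et al (2001): above the upper bound the null of no long-run relationship is rejected, below the lower bound it is not, and in between the test is inconclusive. The sketch below encodes only that comparison. The function name is hypothetical and the bounds used in the example are placeholders, not values from any published table; real bounds depend on the number of regressors, the deterministic terms and the sample size.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Classify an ARDL bounds F-statistic against I(0) and I(1) critical bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "long-run relationship"
    if f_stat < lower_bound:
        return "no long-run relationship"
    return "inconclusive"


if __name__ == "__main__":
    # Placeholder bounds for illustration only; take real ones from published tables.
    lower, upper = 3.0, 4.0
    for f in (2.1, 3.5, 6.8):
        print(f"F = {f}: {bounds_test_decision(f, lower, upper)}")
```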

In the presence of structural breaks, the diagnostic tests of the selected models will most likely suggest that the estimated model suffers from the non-normality problem. In this case, the short-run and long-run coefficients of the estimated models will not be valid. The presence of the non-normality problem can be attributed to the presence of outliers over the sample period, which result from non-recurring, exogenous shocks (such as wars, terrorist attacks, oil price shocks, financial crises etc.) rather than the normal evolution of the economic data.

In econometric modelling, the presence of extreme residuals, i.e. outliers, may lead to a rejection of the normality assumption, as can be seen in Table 1. The outliers can individually or collectively be responsible for the residual non-normality problem. Indeed, this is not surprising in most economic cases given that most of the series are characterized by frequent fluctuations. Let’s see an example of the diagnostic test results of Turkish macroeconomic data which has severe structural breaks due to currency crises (see Table 1).
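How a single extreme residual can overturn a normality diagnostic is easy to show numerically. The sketch below uses the Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), which is one common choice; the presentation does not say which normality test lies behind its tables, so this choice is an assumption, as are the simulated residuals and function names. Under normality JB is approximately chi-square with 2 degrees of freedom, with a 5% critical value of about 5.99.

```python
import random

CHI2_2DF_5PCT = 5.99  # 5% critical value of the chi-square(2) distribution


def jarque_bera(residuals):
    """Jarque-Bera statistic from sample skewness S and kurtosis K."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def sample_residuals(n=100, seed=0, outlier=None):
    """Simulated N(0, 1) residuals, optionally with one extreme value appended."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if outlier is not None:
        values.append(outlier)
    return values


if __name__ == "__main__":
    clean = jarque_bera(sample_residuals())
    contaminated = jarque_bera(sample_residuals(outlier=8.0))
    print(f"JB without outlier: {clean:.2f}")
    print(f"JB with one outlier: {contaminated:.2f} (5% critical value {CHI2_2DF_5PCT})")
```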

Pulse Dummy Variables

One way to improve the chances of error normality is to use pulse dummy variables to capture those one-off abnormal observations. Accordingly, the estimated models should be re-estimated by augmenting the cointegrating equations with pulse dummy variables. Following the existing econometrics literature, the operational definition of an outlier is considered as any data point for which the residuals are in excess of 2 standard deviations from the fitted model. The dummy variables are set equal to zero for all observations except the month in which the observation goes beyond the threshold of two standard errors. In these months, the dummy variable takes on the value of 1, i.e. separate dummy variables should be introduced for each of the outliers.
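The construction described above can be sketched as follows: flag every residual larger than two standard deviations in absolute value and create one 0/1 column per flagged observation. Using the sample standard deviation of the residuals as the yardstick is an assumption made here for simplicity (the standard error of the regression could be used instead), and the residual values in the example are made up.

```python
import statistics


def pulse_dummies(residuals, threshold=2.0):
    """Map each outlier position to its own pulse dummy column.

    An outlier is any residual exceeding `threshold` sample standard
    deviations in absolute value; each dummy equals 1 at that single
    observation and 0 everywhere else."""
    sd = statistics.stdev(residuals)
    n = len(residuals)
    outliers = [t for t, r in enumerate(residuals) if abs(r) > threshold * sd]
    return {t: [1 if s == t else 0 for s in range(n)] for t in outliers}


if __name__ == "__main__":
    # Hypothetical residuals with one crisis-type observation at position 4.
    residuals = [0.1, -0.2, 0.15, -0.1, 3.0, 0.05, -0.15, 0.2, -0.05, 0.1]
    for position, column in pulse_dummies(residuals).items():
        print(f"dummy for observation {position}: {column}")
```

Each column would then be added as an extra regressor when the model is re-estimated.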

For example, Figure I plots the residuals of several econometric models with structural breaks and 2 standard errors. The horizontal lines in the figures represent the 2 standard error bands. In these models the identified outliers correspond to the Turkish currency crises of 1994 and 2000-2001. In this case, pulse dummy variables may be justifiably used to remove observations corresponding to these one-off events that are highly unlikely to be repeated.

When the results of the models which are re-estimated using the pulse dummy variables to account for the presence of the outliers are examined, it can be seen that the use of the intervention dummy variables ensured normality of the probability distribution of the residuals, which permits hypothesis testing on the results of the model (see Table 2). The results also confirm that the re-estimated models do not suffer from autocorrelation, heteroscedasticity, or model misspecification problems.

Testing for Parameter Stability

The existence of cointegration does not necessarily imply that the estimated coefficients are stable. If the coefficients are unstable the results will be unreliable. In order to test for long-run parameter stability, Pesaran and Pesaran (1997) suggest applying the cumulative sum of recursive residuals (CUSUM) and the cumulative sum of squares of recursive residuals (CUSUMSQ) tests proposed by Brown et al (1975) to the residuals of the estimated ECMs to test for parameter constancy. The advantage of these tests is that, unlike the alternative Chow test that requires break point(s) to be specified, they can be used without the requirement of a priori knowledge of the exact date of the structural break(s).

The CUSUM test uses the cumulative sum of recursive residuals based on the first observations and is updated recursively and plotted against the break points. The test is more suitable for detecting systematic changes in the regression coefficients. The CUSUMSQ makes use of the squared recursive residuals and follows the same procedure. However, it is more useful in situations where the departure from the constancy of the regression coefficients is haphazard and sudden. In both CUSUM and CUSUMSQ, the related null hypothesis is that all coefficients are stable. Hansen and Johansen (1992) also suggest a parameter constancy test but they require the variables to be I(1). Also, they do not incorporate the short-run dynamics of a model into testing unlike CUSUM and CUSUMSQ tests.

If the plot of the CUSUM and CUSUMSQ stays within the 5 percent critical bounds, the null hypothesis that all coefficients are stable cannot be rejected. If, however, either of the parallel lines is crossed, then the null hypothesis of parameter stability is rejected at the 5 percent significance level. For example, in Figures II and III, which plot the CUSUM and CUSUMSQ tests, the plots of the CUSUM and CUSUMSQ statistics are generally confined within the 5 percent critical value bounds, indicating the absence of any instability of the coefficients. Thus, the parameters of the model do not suffer from any structural instability.
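The mechanics of the CUSUM plot can be sketched for the simplest possible model, y_t = mu + e_t with a single parameter (k = 1), rather than for a full error correction model. The code computes recursive residuals, cumulates them after scaling by their estimated standard deviation, and compares the path with the 5 percent bounds of Brown et al (1975), which run from ±a√(T−k) at the start to ±3a√(T−k) at the end with a = 0.948. The intercept-only model, the simulated series with a mean shift and the function names are assumptions for illustration.

```python
import math
import random


def recursive_residuals_mean_model(y):
    """Standardised one-step-ahead forecast errors for y_t = mu + e_t."""
    w, running_sum = [], y[0]
    for t in range(1, len(y)):            # t observations seen so far
        forecast = running_sum / t
        w.append((y[t] - forecast) / math.sqrt(1.0 + 1.0 / t))
        running_sum += y[t]
    return w


def cusum_path(y, a=0.948):
    """Return (cusum, bounds); a = 0.948 corresponds to the 5% level."""
    k, n = 1, len(y)
    w = recursive_residuals_mean_model(y)
    sigma = math.sqrt(sum(v * v for v in w) / (n - k))
    cusum, total = [], 0.0
    for v in w:
        total += v / sigma
        cusum.append(total)
    root = math.sqrt(n - k)
    bounds = [a * root + 2.0 * a * (j + 1) / root for j in range(len(w))]
    return cusum, bounds


def crosses_bounds(y):
    """True if the CUSUM path leaves the 5% band at any point."""
    cusum, bounds = cusum_path(y)
    return any(abs(c) > b for c, b in zip(cusum, bounds))


def mean_shift_series(n, tb, shift, seed=0):
    """White noise whose mean rises by `shift` from index `tb` onwards."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (shift if t >= tb else 0.0) for t in range(n)]


if __name__ == "__main__":
    broken = mean_shift_series(n=100, tb=50, shift=3.0)
    print("CUSUM crosses the 5% bounds:", crosses_bounds(broken))
```

A CUSUMSQ version would cumulate the squared recursive residuals instead and needs its own tabulated bounds, which are not reproduced here.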

Thank you!