# The Cointegrated VAR Model: Econometric Methodology and Macroeconomic Applications

Katarina Juselius, July 20, 2003

Chapter 1 Introduction
Economists frequently formulate an economically well-specified model as the empirical model and apply statistical methods to estimate its parameters. In contrast, statisticians might formulate a statistically well-specified model for the data and analyze the statistical model to answer the economic questions of interest. In the first case, statistics are used passively as a tool to get some desired estimates, and in the second case, the statistical model is taken seriously and used actively as a means of analyzing the underlying generating process of the phenomenon in question.

The general principle of analyzing statistical models instead of applying methods can be traced back to R.A. Fisher. It was introduced into econometrics by Haavelmo (1944) [hereafter Haavelmo] and operationalized and further developed by Hendry and Richard (1983), Hendry (1987), Johansen (1995) and recent followers. Haavelmo’s influence on modern econometrics has been discussed, for example, in Hendry, Spanos, and Ericsson (1989) and Anderson (1992).

Few observed macroeconomic variables can be assumed fixed or predetermined a priori. Haavelmo’s probability approach to econometrics, therefore, requires a probability formulation of the full process that generated the data. Thus, the statistical model is based on a full system of equations. The computational complexities involved in the solution of such a system were clearly prohibitive at the time of the monograph, when even the solution of a multiple regression problem was a nontrivial task. In today’s computerized world it is certainly technically feasible to adopt Haavelmo’s guidelines in empirical econometrics. Although the technical difficulties were solved long ago, most papers in empirical macroeconomics do not seem to follow the general principles stated very clearly in the monograph. In this book we will claim that the VAR approach offers a number of advantages as a general framework for addressing empirical questions in (macro)economics without violating Haavelmo’s general probability principle.

First we need to discuss a number of important questions raised by Haavelmo and their relevance for recent developments in the econometric analysis of time series. In so doing we will essentially focus on issues related to empirical macroeconomic analysis using readily available aggregated data, and only briefly contrast this situation with a hypothetical case in which the data have been collected by controlled experiments.

The last few decades, in particular the nineties, will probably be remembered as a period when the scientific status of macroeconomics, in particular empirical macroeconomics, was much debated and criticized, both by people from outside and increasingly also from inside the economics profession. See for example the discussions in Colander and Klamer (1987), Colander and Brenner (1992), and Colander (2001). An article expressing critical views on empirical economics is Summers (1991), in which he discusses what he claims to be “the scientific illusion in empirical macroeconomics”. He observes that applied econometric work in general has exerted little influence on the development of economic theory, and generated little new insight into economic mechanisms. As an illustration he discusses two widely different approaches to applied econometric modelling: (i) the representative agent’s approach, where the final aim of the empirical analysis is to estimate a few deep parameters characterizing preferences and technology, and (ii) the use of sophisticated statistical techniques, exemplified by VAR models à la Sims, to “identify” certain parameters on which inference to the underlying economic mechanisms is based.
In neither case does he find that the obtained empirical results can convincingly discriminate between theories aimed at explaining a macroeconomic reality which is infinitely richer and more complicated than the highly simplified empirical models. Therefore, he concludes that a less formal examination of empirical observations, the so-called stylized facts (usually given as correlations, mean growth rates, etc.), has generally resulted in more fruitful economic research. This is a very pessimistic view of the usefulness of formal econometric modelling which has not yet been properly met, nor officially discussed by the profession. The aim of this book is to challenge this view, claiming that part of the reason why empirical results often appear unconvincing is a neglect to follow the principles laid out by Haavelmo (1944).


The formal link between economic theory and empirical modelling lies in the field of statistical inference, and the focus is on statistical aspects of the proposed VAR methodology, while at the same time stressing its applicability to macroeconomic models. Therefore, all through the text the statistical concepts are interpreted in terms of relevant economic entities. The idea is to define a different class of ‘stylized facts’ which are statistically well founded and much richer than the conventional graphs, correlations and mean growth rates often referred to in discussions of stylized facts.

In this chapter we will revisit Haavelmo’s monograph as a background for the discussion of the reasons for “the scientific illusion in empirical macroeconomics” and ask whether it can be explained by a general failure to follow the principles expressed in the monograph. Section 1.1 provides a historical overview, and Section 1.2 discusses the choice of a theoretical model. Sections 1.3-1.5 discuss three issues from Haavelmo that often seem to have been overlooked in empirical macroeconomics: (i) the link between theoretical, true and observed variables, (ii) the distinction between testing a hypothesis and testing a theory, and (iii) the formulation of an adequate design of experiment in econometrics and its relation to a design by controlled experiments and by passive observation. Finally, Section 1.6 introduces the empirical problem which is used as an illustration all through the book.

1.1 A historical overview

To be included!

1.2 On the choice of economic models

This section discusses the important link between economic theory and the empirical model. In order to make the discussion more concrete, we will illustrate the ideas with an example taken from the monetary sector of the economy. In particular we will focus on the aggregate demand for money relation, one of the most analyzed relations in empirical macroeconomics. Before selecting a theoretical model describing the demand for money as a function of some hypothetical variables, we will first discuss the reasons why it is interesting to investigate such a relation.

The empirical interest in money demand relations stems from basic macroeconomic theory postulating that the inflation rate is directly related to the expansion in the (appropriately defined) supply of money at a rate greater than that warranted by the growth of the real productive potential of the economy. The policy implication is that the aggregate supply of money should be controlled in order to control the inflation rate. The optimal control of money, however, requires knowledge of the “noninflationary level” of aggregate demand for money at each point of time, defined as the level of money stock, $m^*$, at which there is no tendency for the inflation rate to increase or decrease. Thus, on a practical level, the reasoning is based on the assumption that there exists a stable aggregate demand-for-money relation, $m^* = f(x)$, that can be estimated.

Given this background, what can be learned from the available economic theories about the form of such a relation, and which are the crucial determinants? There are three distinct motives for holding money. The transactions motive is related to the need to hold cash for handling everyday transactions. The precautionary motive is related to the need to hold money to be able to meet unforeseen expenditures. Finally, the speculative motive is related to agents’ wishes to hold money as part of their portfolio. Since all three motives are likely to affect agents’ needs to hold money, let the initial assumption be that $m/p = f(y^r, c)$, saying that real money holdings, $m/p$, are a function of the level of real income, $y^r$ (assumed to determine the volume of transactions and precautionary money), and the cost of holding money, $c$. Further assumptions of optimizing behavior are needed in order to derive a formal model for agents’ willingness to hold money balances.

Among the available theories two different approaches can be distinguished: (i) theories treating money as a medium of exchange for transaction purposes, so that minimizing a derived cost function leads to optimizing behavior, and (ii) theories treating money as a good producing utility, so that maximizing the utility function leads to optimizing behavior. For expository purposes, only the first approach will be discussed here, specifically the theoretical model suggested by Baumol (1952), which is still frequently referred to in this context. The model is strongly influenced by inventory theory, and has the following basic features. Over a certain time period $t_1 - t_2$, the agent will pay out $T$ units of money in a steady stream. Two different costs are involved: the opportunity cost of the foregone investment, measured by the interest rate $r$, and the so-called “brokerage” cost $b$. The latter should be assumed to cover all kinds of costs in connection with a cash withdrawal. It is also assumed that liquid money does not yield interest.
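The trade-off between the two costs can be made concrete with a small numerical sketch. It is not part of Baumol's or the author's exposition: it assumes the textbook inventory-theoretic cost function, in which an agent who withdraws an amount C each time makes T/C withdrawals (brokerage cost bT/C) and holds C/2 on average (foregone interest rC/2). The parameter values and function names are hypothetical and chosen only for illustration.

```python
import math


def total_cost(C, b, T, r):
    """Assumed cost function: brokerage cost of T/C withdrawals
    plus interest foregone on the average cash balance C/2."""
    return b * T / C + r * C / 2.0


def optimal_withdrawal(b, T, r):
    """Square-root rule C = sqrt(2bT/r), the minimizer of total_cost in C."""
    return math.sqrt(2.0 * b * T / r)


def log_elasticity(f, x, h=1e-6):
    """Numerical elasticity d ln f / d ln x at x (central difference in logs)."""
    up = math.log(f(x * math.exp(h)))
    down = math.log(f(x * math.exp(-h)))
    return (up - down) / (2.0 * h)


# Hypothetical parameter values, for illustration only.
b, T, r = 2.0, 10_000.0, 0.05

C_star = optimal_withdrawal(b, T, r)

# Cross-check the closed form with a coarse grid search around it.
grid = [C_star * (0.5 + 0.01 * i) for i in range(101)]
C_grid = min(grid, key=lambda C: total_cost(C, b, T, r))

# Elasticities of the optimal withdrawal with respect to T and r.
e_T = log_elasticity(lambda t: optimal_withdrawal(b, t, r), T)
e_r = log_elasticity(lambda x: optimal_withdrawal(b, T, x), r)

print("closed-form optimum:", C_star)
print("grid-search optimum:", C_grid)
print("transactions elasticity:", e_T)
print("interest elasticity:", e_r)
```

Under these assumptions the grid search and the square-root rule select the same withdrawal, and the log-log slopes are 0.5 with respect to T and -0.5 with respect to r for any positive parameter values, because the rule is a power function of both.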

The optimal value of cash withdrawn from investment can now be found as:

$$C = \sqrt{2bT/r} \qquad (1.1)$$

so that the cost-minimizing agent will demand cash in proportion to the square root of the value of his transactions. The average holding of cash under these assumptions is $C/2$. Taking the logarithms of (1.1), $\ln C = 0.5 \ln(2b) + 0.5 \ln T - 0.5 \ln r$, gives a transactions elasticity of 0.5 and an interest elasticity of -0.5. These have been the prior hypotheses of many empirical investigations based on aggregated data, and estimates supporting this have been found, for instance, in Baba, Hendry, and Starr (1992).

If this theoretical model is tested against data, more precise statements of what is meant by the theoretical concepts $C$, $b$, $T$ and $r$ need to be made. This is no straightforward task, as expressed by Haavelmo, p. 4:

“When considering a theoretical set-up, involving certain variables and certain mathematical relations, it is common to ask about the actual meaning of this and that variable. But this question has no sense within a theoretical model. And if the question applies to reality it has no precise answer. It is one thing to build a theoretical model, it is another thing to give rules for choosing the facts to which the theoretical model is to be applied. It is one thing to choose the model from the field of mathematics, it is another thing to classify and measure objects of real life.”

As a means to clarify this difficult issue Haavelmo introduces the concepts of true and theoretical variables as opposed to observable variables and proposes that one should try to define how the variables should be measured in an ideal situation. This will be briefly discussed in the next section.

1.3 Theoretical, true and observable variables

In order to operationalize a theoretical concept, one has to make precise statements about how to measure the corresponding theoretical variable, i.e. to give the theoretical variable a precise meaning. This is expressed in Haavelmo, p. 5 as:

“We may express the difference [between the “true” and the theoretical variables] by saying that the “true” variables (or time functions) represent our ideal as to accurate measurements of reality “as it is in fact” while the variables defined in theory are the true measurements that we should make if reality were actually in accordance with our theoretical model.”

Say, for example, that a careful analysis of the above example shows that the true measurements are the average holdings of cash and demand deposits by private persons in private banks, postal banks or similar institutions (building societies, etc.) measured at closing time each trading day of a month. The theoretical variable $C$ as defined by Baumol’s model would then correspond to the true measurements given that i) no interest is paid on this liquid money, ii) transactions are paid out in a steady stream over successive periods, iii) the brokerage cost and interest rate $r$ are unambiguously defined, iv) no cash and demand deposits are held for speculative or precautionary motives, and so on. Needless to say, the available measurements from official statistics are very far from the definitions of the true variables. Even if it were possible to obtain measurements satisfying the above definition of the true measurements, it seems obvious that these would not correspond to the theoretical variables [1]. Nevertheless, if the purpose of the empirical investigation is to test a theory, a prerequisite for valid inference about the theoretical model is a close correspondence between the observed variables and the true variables, or in the words of Haavelmo: It is then natural to adopt the convention that a theory is called true or false according as the hypotheses implied are true or false, when tested against the data chosen as the “true” variables. Then we may speak interchangeably about testing hypotheses or testing theories.

[1] It should be pointed out that Baumol does not claim any such correspondence. In fact, he gives a long list of reasons why this cannot be the case.

Much of the criticism expressed by Summers may well be related to situations in which valid inference would require a much closer correspondence between observed and true variables. For example, the unit root tests of GDP series to discriminate between different real growth theories are good examples of misleading inference in this respect.

Even if data collected by passive observation do not generally qualify for testing “deep” theoretical models, most empirical macroeconomic models are nonetheless based on the officially collected data. This is simply because of a genuine interest in the macroeconomic data as such. Since “data seldom speak by themselves”, theoretical arguments are needed to understand the variation in these data in spite of the weak correspondence between the theoretical and observed variables. Nevertheless, the link between the theoretical and the empirical model is rather ambiguous in this case and the interpretation of the empirical results in terms of the theoretical arguments is not straightforward. This leads to the second issue to be discussed here, i.e. the distinction between testing a hypothesis and testing a theory. The frequent failure to separate between the two might explain a great deal of Summers’ critique. This monograph will claim that based on macroeconomic data it is only possible to test theoretical hypotheses but not a theory model as such.

1.4 Testing a theory as opposed to a hypothesis

When the arguments of the theory do not apply directly to the empirical model, one compromise is to be less ambitious about testing theories and instead concentrate on testing specific hypotheses derived from the theoretical model. For example, the hypotheses that the elasticity of the transactions demand for cash $e_c = 0.5$ and of the interest rate $e_r = -0.5$ have frequently been tested within empirical models that do not include all aspects of Baumol’s theoretical model. Other popular hypotheses that have been widely tested include long-run price and income homogeneity, the sign of first derivatives, zero restrictions, and the direction of causality. “Sophisticated statistical techniques” have often been used in this context. Since, according to Summers among others, the outcomes of these exercises have generally not been considered convincing or interesting enough to change economists’ views, there seems to be a need for a critical appraisal of econometric practice in this context.

There are several explanations why empirical results are often considered unpersuasive. First, not enough care is taken to ensure that the specification of empirical models mimics the general characteristics of the data. If the empirical model is based on a valid probability formulation of the data, then statistical hypothesis testing is straightforward, and valid procedures can be derived by analyzing the likelihood function. In this case, inferences about the specified hypotheses are valid, though valid inference about the theory as such depends on the strength of the correspondence between the true and the theoretical variables. Second, not enough attention is paid to the crucial role of the ceteris paribus (‘everything else constant’) assumption when the necessary information set is specified. For example, in many applications it is assumed that certain (linearly transformed) VAR residuals can be interpreted as structural shocks. This would require the VAR residuals to be invariant to changes in the information set, which is not likely to be the case in practical applications. Third, the role of rational expectations as a behavioral assumption in empirical models has been very unconvincing! Fourth, the available measurements do not correspond closely enough to the true values of the theoretical variables.

These issues will be further discussed in the next chapter and related to some general principles for VAR modelling with nonstationary data. But first the concept of “a design of experiment” in econometrics as discussed by Haavelmo has to be introduced.

1.5 Experimental design in macroeconomics

As discussed above, the link between the true variables suggested by the theory and the actual measurements taken from official statistics is, in most cases, too weak to justify valid inference from the empirical model to the theory. Even in ideal cases, when the official definition of the aggregated variable, say liquid money stock M1, corresponds quite closely to the true measurements, there are usually measurement problems. High-quality, reasonably long aggregated series are difficult to obtain because definitions change, new components have entered the aggregates and new regulations have changed the behavior. The end result is the best set of measurements in the circumstances, but still quite far from the true measurements of an ideal situation. This problem is discussed in terms of a design of experiment in Haavelmo, p. 14:

“... If economists would describe the experiments they have in mind when they construct the theories they would see that the experiments they have in mind may be grouped into two different classes namely, (1) experiments that we should like to make to see if certain real economic phenomena – when artificially isolated from other influences – would verify certain hypothesis and (2) the stream of experiments that nature is steadily turning out from his own enormous laboratory, and which we merely watch as passive observers. In the first case we can make agreements or disagreements between theory and facts depend on two things: the facts we choose to consider and our theory about them. ... In the second case we can only try to adjust our theories to reality as it appears before us. And what is the meaning of a design of experiment in this case. It is this: We try to choose a theory and a design of experiments to go with it, in such a way that the resulting data would be those which we get by passive observation of reality. And to the extent that we succeed in doing so, we become masters of reality – by passive agreement.”

Since the primary interest here is in the case of passive observations, we will restrict ourselves to this case. Controlled experiments are not usually possible within a single macroeconomy and the only possibility to ‘test’ hypothetical relationships is to wait for new data which were not used to generate the hypothesis. Another possibility is to rely on the “experiments” provided by other countries or regions that differ in various aspects with regard to the investigated economic problem. For instance, if the question of interest is whether an expansion of money supply generally tends to increase the inflation rate, it seems advisable to examine this using similar data from countries that differ in terms of the pursued economic policy.

What seems to be needed is a set of assumptions which are general enough to ensure a statistically valid description of typical macroeconomic data, and a common modelling strategy to allow questions of interest to be investigated in a consistent framework. Chapter 3 will discuss under which conditions the VAR model can work as a reasonable formalization of a “design of experiment” for data by passive observation. In the rest of the book we will discuss the applicability of the cointegrated VAR model as a common modelling strategy. We will argue that this model, in spite of its simplicity, offers a potential richness in the specification of economically meaningful short- and long-run structures and components, such as steady-state relations and common trends, interaction and feed-back effects. Even more importantly, in the unrestricted general form the VAR model is essentially only a reformulation of the covariances of the data. Provided that these have remained approximately constant over the sample period the VAR can be considered a convenient summary of the ‘stylized facts’ of the data. To the extent that the ‘true’ economic model underlying behavior satisfies a first order linear approximation (see Hendry and Richard, 1983), one can test economic hypotheses expressed as the number of autonomous permanent shocks, steady-state behavior, feed-back and interaction effects, etc. within a statistically valid framework. This is essentially the ‘general-to-specific’ approach described in Hendry and Mizon (1993), subsequently evaluated in Hoover and Perez (1999) and recently implemented as an automatic selection procedure in PcGets (Hendry and Krolzig, 2003).

The VAR procedure is less pretentious about the prior role of a theoretical economic model, but it avoids the lack of empirical relevance the theory-based approach has often been criticized for. Since the starting point is the time-series structure of the chosen data, it is often advantageous to search for structures against the background of not just one but a variety of possibly relevant theories. In this sense the suggested approach is a combination of inductive and deductive inference. The “final” empirical model should, in the ideal case, be structural both in the economic and statistical sense of the word.

1.6 On the choice of empirical example

Throughout the book we will illustrate the methodological arguments with empirical analyses of Danish money demand data. These data have been extensively analyzed by myself and Søren Johansen both as an inspiration for developing new test procedures and as a means to understand the potential use of the cointegrated VAR model. See for example Johansen and Juselius (1990) and Juselius (1993a).

Given the above discussion one should ask whether there is a close enough correspondence between observed variables and the true versus theoretical variables, for example as given by the Baumol model. The answer is clearly no. A detailed examination of the measurements reveals that essentially all the usual problems plague the data. For instance, new components have entered the aggregate, banking technology has changed, the time interval between the measurements is too broad, and so on. From these observed variables, it would be very hard to justify inference from the empirical analysis to a specific transactions demand-for-money theory.

The main reason why these variables were selected is simply because they were being used by the Central Bank of Denmark as a basis for their policy analysis. In that sense, the data have been collected because of an interest in the variables for their own sake. This choice does not exclude the possibility that these variables are closely related to the theoretical variables, nor that the empirical investigation might suggest other more relevant measurements for policy analysis. For example, LaCour (1993) demonstrated that long-run price homogeneity was more strongly accepted when a weighted average of components of different liquidity was used as a proxy for the transactions demand for money.

Looking in the rear-view mirror, after having applied the cointegrated VAR model to many other data sets, it is obvious that we were very fortunate to begin our first cointegration attempts using this data set. It turned out to be one of the rare data sets describing long-run relationships which have remained remarkably stable over the last few decades. However, it should be pointed out that this was true only after having econometrically accounted for a regime shift in 1983 as a consequence of deregulating capital movements in Denmark. In this sense the Danish data were also able to illustrate a most important finding: the sensitivity of the VAR approach to empirical shifts in growth rates and in equilibrium means.

Many of the theoretical advances were directly influenced by empirical analyses of this data set. In the first applied paper (Johansen and Juselius, 1990), a vector of real money, real income, and two interest rates was analyzed and only one cointegration relation was found. However, this was based on an implicit assumption of long-run price homogeneity. A need to test this properly resulted in the development of the I(2) analysis (Johansen, 1992, 1995, 1997). Juselius (1993) tested the long-run price homogeneity assumption in the I(2) model and found that it was accepted. The latter approach demonstrated that the original specification in real variables was misspecified without including the inflation rate. Re-estimating the model with the inflation rate correctly included as a new variable showed that the inflation rate was a crucial determinant in the system, which strongly affected the interpretation of the steady-state relations and the dynamic feed-back effects within the system. Including the inflation rate in the vector produced one more cointegration relation (a relation between the two interest rates and inflation) in addition to the previously found money demand relation.

The idea of gradually increasing the data vector of the VAR model, building on previously found results, was also influenced by empirical analyses of the Danish data. When the number of potentially interesting variables is large, so is the number of cointegrating relations. In this case, it is often a tremendously difficult task to identify them. The number of possible combinations is simply too large. If, instead, more information is gradually added to the analysis, it is possible to build on previous results and, thus, not to lose track so easily. Juselius (1992a) added the loan rate to the previously used information set and found one additional cointegration relation between the loan rate and the deposit rate. This approach was then further developed in Juselius (1992b), where three different sectors of the macroeconomy were analyzed separately and then combined into one model. Thus, contrary to the “general to specific” approach of the statistical modelling process (i.e. of imposing more and more restrictions on the unrestricted VAR), it appeared more advantageous to follow the principle of “specific to general” in the choice of information set.

The representation and statistical analysis of the I(2) model meant a major step forward in the empirical understanding of the mechanisms determining nominal growth rates in Denmark. Juselius (1993) analyzed common trends in both the I(2) and I(1) models, and found that the stochastic trend component in nominal interest rates seemed to have generated price inflation. This was clearly against conventional wisdom that predicted the link to be the other way around. Needless to say, such a result could not convince economists as long as it stood alone. Therefore, a similar design was used in a number of other studies (ref.) based on data from various European countries (with essentially the same conclusions!). The experience of looking at various economies characterized by different policies through the same ‘spectacles’ generated the idea that the VAR approach could possibly be used as a proxy for a ‘designed experiment’ in a situation where only data by passive observations were available.

An urge to understand more fully why this approach frequently produced results that seemed to add a question mark to conventional theories and beliefs was the motivation for writing the next chapter, distinguishing between models and relations in economics and econometrics. It draws heavily on Juselius (1999). It focuses on what could be called the economist’s approach as opposed to the statistician’s approach to macroeconomic modelling. The aim is to propose a framework for discussing the probability approach to econometrics as contrasted to more traditional methods.

It is no coincidence that the most influential works in econometrics in the last decades were those which combined theoretical results with the development of the necessary computer software. Today’s academics living under an increasingly strong pressure “to publish or perish” can seldom afford to spend time on writing computer software. To meet the demand for empirical applicability, all test and estimation procedures discussed in this book are readily implemented as user-friendly menu-driven programs in, for example, CATS for RATS (Hansen and Juselius, 1995) or in PcGive (Doornik and Hendry, 2002).
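For orientation, the model referred to throughout this chapter can be written down explicitly. The chapter itself gives no formulas, so the notation below is the standard textbook one and should be read as an assumption: $x_t$ is a $p \times 1$ vector of observed variables (for example real money, real income and interest rates), $k$ is the lag length, $\mu$ is a constant term, and the errors are Gaussian.

$$
\begin{aligned}
x_t &= \Pi_1 x_{t-1} + \dots + \Pi_k x_{t-k} + \mu + \varepsilon_t, \qquad \varepsilon_t \sim N_p(0, \Omega), \\
\Delta x_t &= \Pi x_{t-1} + \sum_{i=1}^{k-1} \Gamma_i \Delta x_{t-i} + \mu + \varepsilon_t, \\
\Pi &= \sum_{i=1}^{k} \Pi_i - I, \qquad \Gamma_i = -\sum_{j=i+1}^{k} \Pi_j .
\end{aligned}
$$

The second line is only a reparameterization of the first. Cointegration is the hypothesis that $\Pi$ has reduced rank $r < p$, so that $\Pi = \alpha \beta'$ with $\alpha$ and $\beta$ of dimension $p \times r$. In that case the $r$ combinations $\beta' x_t$ are the stationary steady-state relations and the remaining $p - r$ directions are driven by common stochastic trends.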

Chapter 2 Models and Relations in Economics and Econometrics In Chapter 1 we discussed Haavelmo’s probability approach to empirical macroeconomics and the need to formulate a statistical model based on a stochastic formulation of the process that has generated the chosen data. Section 2. and irregular components. it is natural to formulate the empirical model in terms of time dependent stochastic processes.3 discusses informally some empirical and theoretical implications of unit roots in the data. Section 2.6 as I(1). cycles. and Section 2. Because most macroeconomic data exhibit strong time dependence. 2. The organization of this chapter is as follows: Section 2.5 gives an empirical motivation for treating the stochastic trend in nominal prices as I(2).1 discusses in general terms the VAR approach as contrasted to a theory-based model approach. Section 2. Section 2.1 The VAR approach and theory based models The vector autoregressive (V AR) process based on Gaussian errors has frequently been a popular choice as a description of macroeconomic time series 13 .2 brieﬂy considers the treatment of inﬂation and monetary policy in Romer (1996) with special reference to the equilibrium in the money market.4 addresses more formally a stochastic formulation based on a decomposition of the data into trends.

data. There are many reasons for this: the VAR model is flexible, easy to estimate, and it usually gives a good fit to macroeconomic data. However, the possibility of combining long-run and short-run information in the data by exploiting the cointegration property is probably the most important reason why the VAR model continues to receive the interest of both econometricians and applied economists.

Theory-based economic models have traditionally been developed as nonstochastic mathematical entities and applied to empirical data by adding a stochastic error process to the mathematical model. [Footnote 1: Dynamic general equilibrium models.] As an example of this approach we will use the macroeconomic treatment in “Inflation and Monetary Policy”, Chapter 9 in D. Romer (1996): “Advanced Macroeconomics”.

From an econometric point of view the two approaches are fundamentally different: one starting from an explicit stochastic formulation of all data and then reducing the general statistical (dynamic) model by imposing testable restrictions on the parameters, the other starting from a mathematical (static) formulation of a theoretical model and then expanding the model by adding stochastic components. For a detailed methodological discussion of the two approaches, see for example Gilbert (1986), Hendry (1995), Juselius (1993), and Pagan (1987). Unfortunately, the two approaches have been shown to produce very different results even when applied to identical data and, hence, different conclusions. From a scientific point of view this is not satisfactory. Therefore, we will here attempt to bridge the gap between the two views by starting from some typical questions of theoretical interest and then show how one would answer these questions based on a statistical analysis of the VAR model. Because the latter by construction is “bigger” than the theory model, the empirical analysis not only answers a specific theoretical question, but also gives additional insight into the macroeconomic problem.

A theory model can be simplified by the ceteris paribus assumptions “everything else unchanged”, whereas a statistically well-specified empirical model has to address the theoretical problem in the context of “everything else changing”. By embedding the theory model in a broader empirical framework, the analysis of the statistically based model can provide evidence of possible pitfalls in macroeconomic reasoning. In this sense the VAR analysis can be useful for generating new hypotheses, or for suggesting modifications of too narrowly specified theoretical models. As a convincing illustration see
Hoffman (2001?).

Throughout the book we will illustrate a variety of econometric problems by addressing questions of empirical relevance based on an analysis of monetary inflation and the transmission mechanisms of monetary policy. These questions have been motivated by many empirical VAR analyses of money, prices, income, and interest rates and include questions such as:

• How effective is monetary policy when based on changes in money stock or changes in interest rates?
• What is the effect of expanding money supply on prices in the short run? in the medium run? in the long run?
• Is an empirically stable demand for money relation a prerequisite for monetary policy control to be effective?
• How strong is the direct (indirect) relationship between a monetary policy instrument and price inflation?

Based on the VAR formulation we will demonstrate that every empirical statement can, and should, be checked for its consistency with all previous empirical and theoretical statements. This is in contrast to many empirical investigations, where inference relies on many untested assumptions using test procedures that only make sense in isolation, but not in the full context of the empirical model.

2.2 Inflation and money growth

A fundamental proposition in macroeconomic theory is that growth in money supply in excess of real productive growth is the cause of inflation, at least in the long run. Here we will briefly consider some conventional ideas underlying this belief as described in Chapter 9 by Romer (1996).

The well-known diagram illustrating the intersection of aggregate demand and aggregate supply provides the framework for identifying potential sources of inflation as shocks shifting either aggregate demand upwards or aggregate supply to the left. See the upper panel of Figure 2.1. As examples of aggregate supply shocks that shift the AS curve to the left Romer (1996) mentions negative technology shocks, downward shifts in labor supply, and upwardly skewed relative-cost shocks. As examples of aggregate demand shocks that shift the AD curve to the right he mentions increases in money stock, downward shifts in money demand, and increases in government purchases. Since all these types of shocks, and many others, occur quite
frequently there are many factors that potentially can affect inflation. Some of these shocks may only influence inflation temporarily and are, therefore, less important than shocks with a permanent effect on inflation. Among the latter economists usually emphasize changes in money supply as the crucial inflationary source. The economic intuition behind this is that other factors are limited in scope, whereas money in principle is unlimited in supply.

More formally the reasoning is based on money demand and supply and the condition for equilibrium in the money market:

$$M/P = L(R, Y^r), \qquad L_R < 0, \quad L_y > 0, \tag{2.1}$$

where $M$ is the money stock, $P$ is the price level, $R$ the nominal interest rate, $Y^r$ real income, and $L(\cdot)$ the demand for real money balances. Based on the equilibrium condition, i.e. no changes in any of the variables, Romer (1996) concludes that the price level is determined by:

$$P = M/L(R, Y^r). \tag{2.2}$$

The equilibrium condition (2.1) and, hence, (2.2) is a static concept that can be thought of as a hypothetical relation between money and prices for fixed income and interest rate. The underlying comparative static analysis investigates the effect on one variable, say price, when changing another variable, say money supply, with the purpose of deriving the new equilibrium position after the change. Thus, the focus is on the hypothetical effect of a change in one variable ($M$) on another variable ($P$), when the additional variables ($R$ and $Y^r$) are exogenously given and everything else is taken account of by the ceteris paribus assumption.

However, when time is introduced the ceteris paribus assumption and the assumption of fixed exogenous variables become much more questionable. Neither interest rates nor real income have been fixed or controlled in most periods subject to empirical analysis. Therefore, in empirical macroeconomic analysis all variables (including the ceteris paribus ones) are more or less continuously subject to shocks, some of which permanently change the previous equilibrium condition. In this sense an equilibrium position is always related to a specific time point in empirical modelling. Hence, the static equilibrium concept has to be replaced by a dynamic concept, for instance a steady-state
position. For an equilibrium relation time is irrelevant whereas a steady-state relation without a time index is meaningless. In a typical macroeconomic system new disturbances push the variables away from steady-state, but the economic adjustment forces pull them back towards a new steady-state position.

To illustrate the ideas one can use an analogy from physics and think of the economy as a system of balls connected by springs. When left alone the system will be in equilibrium, but pushing a ball will bring the system away from equilibrium. Because all balls are connected, the ‘shock’ will influence the whole system, but after a while the effect will die out and the system is back in equilibrium. In the economy, the balls would correspond to the economic variables and the springs to the transmission mechanisms that describe how economic shocks are transmitted through the system.

As we know, the economy is not a static entity. Instead of saying that the economy is in equilibrium it is more appropriate to use the word ‘steady-state’. Hence, we need to replace the above picture with a system where the balls are moving with some ‘controlled’ speed and by pushing a ball the speed will change and influence all the other balls. Left alone the system will return to the ‘controlled’ state, i.e. the steady state. However, in real applications the adjustment back to steady-state is disturbed by new shocks and the system essentially never comes to rest. Therefore, we will not be able to observe a steady-state position and the empirical investigation has to account for the stochastic properties of the variables as well as the theoretical equilibrium relationship between them. See the lower panel of Figure 2.1 for an illustration of a stochastic steady-state relation.

In (2.1) the money market equilibrium is an exact mathematical expression and it is straightforward to invert it to determine prices as is done in (2.2). The observations from a typical macroeconomic system are adequately described by a stochastic vector time series process. But in stochastic systems, inversion of (2.1) is no longer guaranteed (see for instance Hendry and Ericsson, 1991). If the inverted (2.1) is estimated as a regression model it is likely to result in misleading conclusions.

The observed money stock can be demand or supply determined or both, but it is not necessarily a measurement of a long-run steady-state position. This raises the question whether it is possible to empirically identify and estimate the underlying theoretical relations. For instance, if central banks are able to effectively control money stock, then observed money holdings are likely to be supply determined and the demand for money has to adjust to the supplied quantities. This is likely to be the case in trade and capital
regulated economies or, possibly, in economies with flexible exchange rates, whereas in open deregulated economies with fixed exchange rates central banks would not in general be able to control money stock. In the latter case one would expect observed money stock to be demand determined.

[Figure 2.1: An equilibrium position of the AD and AS curve (upper panel) and deviations from an estimated money demand relation for Denmark: $(m - p - y)_t - 14.1(R_m - R_b)$ (lower panel).]

Under the assumption that the money demand relation can be empirically identified, the statistical estimation problem has to be addressed. Because macroeconomic variables are generally found to be nonstationary, standard regression methods are no longer feasible from an econometric point of view. Cointegration analysis specifically addresses the nonstationarity problem and is, therefore, a feasible solution in this respect. The empirical counterpart of (2.1) (with the opportunity cost of holding money, $R = R_b - R_m$) can be written as a cointegrating relation, i.e.:

$$\ln(M/PY^r)_t - L(R_b - R_m)_t = v_t \tag{2.3}$$

where $v_t$ is a stationary process measuring the deviation from the steady-state
position at time $t$. The stationarity of $v_t$ implies that whenever the system has been shocked it will adjust back to equilibrium. This is illustrated in Figure 2.1 (lower panel) by the graph of the deviations from an estimated money demand relation based on Danish data with the opportunity cost of holding money being measured by $R_b - R_m$ (Juselius, 1998b). Note the large equilibrium error at about 1983, as a result of removing restrictions on capital movements, and the consequent adjustment back to steady-state.

However, empirical investigation of (2.3) based on cointegration analysis poses several additional problems. Although in a theoretical exercise it is straightforward to keep some of the variables fixed (the exogenous variables), in an empirical model none of the variables in (2.1), i.e. money, prices, income or interest rates, can be assumed to be fixed (i.e. controlled). The stochastic feature of all variables implies that the equilibrium adjustment can take place in either money, prices, income or interest rates. Therefore, the equilibrium deviation $v_t$ is not necessarily due to a money supply shock at time $t$, but can originate from any change in the variables. Hence, one should be cautious about interpreting a coefficient in a cointegrating relation as in the conventional regression context, which is based on the assumption of “fixed” regressors. In multivariate cointegration analysis all variables are stochastic and a shock to one variable is transmitted to all other variables via the dynamics of the system until the system has found its new equilibrium position.

The empirical investigation of the above questions raises several econometric questions: What is the meaning of a shock and how do we measure it econometrically? How do we distinguish empirically between the long run, the medium run and the short run? Given the measurements, can the parameter estimates be given an economically meaningful interpretation? These questions will be discussed in more detail in the subsequent sections.

2.3 The time dependence of macroeconomic data

As advocated above, the strong time dependence of macroeconomic data suggests a statistical formulation based on stochastic processes. In this context it is useful to distinguish between:

• stationary variables with a short time dependence and
• nonstationary variables with a long time dependence.
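The distinction between short and long time dependence can be made concrete with a small simulation. The following is a minimal sketch, not part of the original analysis: it assumes a first-order autoregression with Gaussian shocks, and the parameter values, sample size, and function names (`simulate_ar1`, `mean_crossings`) are purely illustrative. It counts how often a stationary series and a unit root series cross their own sample mean.

```python
import random


def simulate_ar1(rho, n, seed):
    """Simulate x_t = rho * x_{t-1} + e_t with e_t ~ N(0, 1), starting from x_0 = 0."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def mean_crossings(path):
    """Count how many times the series crosses its own sample mean."""
    mean = sum(path) / len(path)
    above = [value > mean for value in path]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


n = 2000
stationary = simulate_ar1(rho=0.5, n=n, seed=1)    # short time dependence
random_walk = simulate_ar1(rho=1.0, n=n, seed=1)   # long time dependence (unit root)

crossings_stationary = mean_crossings(stationary)
crossings_random_walk = mean_crossings(random_walk)
print(crossings_stationary, crossings_random_walk)
```

Both series are driven by exactly the same shocks; only the autoregressive parameter differs. The stationary series returns to its mean frequently, whereas the unit root series wanders away from it for long stretches.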

In practice, it is useful to classify variables exhibiting a high degree of time persistence (insignificant mean reversion) as nonstationary and variables exhibiting a significant tendency to mean reversion as stationary. However, it is important to stress that the stationarity/nonstationarity or, alternatively, the order of integration of a variable, is not in general a property of an economic variable but a convenient statistical approximation to distinguish between the short-run, medium-run, and long-run variation in the data. We will illustrate this with a few examples involving money, prices, income, and interest rates.

Most countries have exhibited periods of high and low inflation, lasting sometimes a decade or even more, after which the inflation rate has returned to its mean level. If inflation crosses its mean level, say ten times, the econometric analysis will find significant mean reversion and, hence, conclude that inflation rate is stationary. For this to happen we might need up to a hundred years of observations. The time path of, for example, quarterly European inflation over the last few decades will cover a high inflation period in the seventies and beginning of the eighties and a low inflation period from the mid-eighties until the present date. Crossing the mean level a few times is not enough to obtain statistically significant mean reversion and the econometric analysis will show that inflation should be treated as a nonstationary variable. This is illustrated in Figure 2.2 where yearly observations of the Danish inflation rate have been graphed for 1901-1992 (upper panel), for 1945-1992 (middle panel), and 1975-1992 (lower panel). The first two time series of inflation rates look mean-reverting (though not to zero mean inflation), whereas significant mean-reversion would not be found for the last section of the series.

That inflation is considered stationary in one study and nonstationary in another, where the latter is based, say, on a sub-sample of the former might seem contradictory. This need not be so, unless a unit root process is given a structural economic interpretation. There are many arguments in favor of considering a unit root (a stochastic trend) as a convenient econometric approximation rather than as a deep structural parameter. For instance, if the time perspective of our study is the macroeconomic behavior in the medium run, then most macroeconomic variables exhibit considerable inertia, consistent with nonstationary rather than stationary behavior. Because inflation, for example, would not appear to be statistically different from a nonstationary variable, treating it as a stationary variable is likely to invalidate the statistical analysis and, therefore, lead to wrong economic conclusions.
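The dependence of the stationarity verdict on the length of the sample can be sketched numerically. The example below is only an illustration under stated assumptions: the "inflation" series is simulated from a stationary AR(1) with an illustrative parameter of 0.95, the function names are hypothetical, and the statistic is the ordinary least squares t-ratio from a regression of the first difference on a constant and the lagged level (a Dickey-Fuller type regression).

```python
import math
import random


def simulate_ar1(rho, n, seed):
    """Simulate x_t = rho * x_{t-1} + e_t with e_t ~ N(0, 1), starting from x_0 = 0."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def mean_reversion_t_ratio(path):
    """OLS t-ratio of gamma in  dx_t = c + gamma * x_{t-1} + e_t  (gamma = rho - 1)."""
    lagged = path[:-1]
    diffs = [b - a for a, b in zip(path, path[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    gamma = sxy / sxx
    const = mean_d - gamma * mean_x
    ssr = sum((d - const - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / se


# Hypothetical series: stationary, but strongly autocorrelated.
long_sample = simulate_ar1(rho=0.95, n=5000, seed=42)
short_sample = long_sample[-40:]   # a sub-sample of the same series

t_long = mean_reversion_t_ratio(long_sample)
t_short = mean_reversion_t_ratio(short_sample)
print(round(t_long, 2), round(t_short, 2))
```

Under a unit root this ratio does not follow a Student t distribution; the conventional asymptotic 5% Dickey-Fuller critical value for a regression with a constant is roughly -2.86. In the long sample the evidence of mean reversion is overwhelming, while the short sub-sample of the very same process provides far weaker evidence, which is the sense in which the order of integration is a statistical approximation that depends on the time perspective of the study.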

[Figure 2.2: Yearly Danish inflation 1901-92 (upper panel), 1945-92 (middle panel), and 1975-92 (lower panel).]

On the other hand, treating inflation as a nonstationary variable gives us the opportunity to find out which other variable(s) have exhibited a similar stochastic trend by exploiting the cointegration property. This will be discussed at some length in Section 2.5, where we will demonstrate that the unit root property of economic variables is very useful for the empirical analysis of long- and medium-run macroeconomic behavior.

When the time perspective of our study is the long historical macroeconomic movements, inflation as well as interest rates are likely to show significant mean reversion and, hence, can be treated as stationary variables.

Finally, to illustrate that the same type of stochastic processes are able to adequately describe the data, independently of whether one takes a close-up or a long-distance look, we have graphed the Danish bond rate in levels and differences in Figure 2.3 based on a sample of 95 quarterly observations (1972:1-1995:3), 95 monthly observations (1987:11-1995:9), and 95 daily observations (1.5.95-25.9.95). The daily sample corresponds to the little hump
at the end of the quarterly time series. It would be considered a small stationary blip from a quarterly perspective, whereas from a daily perspective it is nonstationary, showing no significant mean reversion. Altogether, the three time series look very similar from a stochastic point of view.

Thus, econometrically it is convenient to let the definition of long-run or short-run, or alternatively the very long-run, the medium long-run, and the short-run, depend on the time perspective of the study. From an economic point of view the question remains in what sense a “unit root” process can be given a “structural” interpretation.

[Figure 2.3: Average Danish bond rates in levels and differences, based on daily observations, 1.5.-25.9.95 (upper panel), monthly observations, Nov 1987 - Sept 1995 (middle panel), and quarterly observations, 1972:1-1995:3 (lower panel).]

2.4 A stochastic formulation

To illustrate the above questions we will consider a conventional decomposition into trend, $T$, cycle, $C$, and irregular component, $I$, of a typical
macroeconomic variable, $X = T \times C \times I$.

Instead of treating the trend component as deterministic, as is usually done in conventional analysis, we will allow the trend to be both deterministic, $T_d$, and stochastic, $T_s$, i.e. $T = T_s \times T_d$, and the cyclical component to be of long duration, say 6-10 years, $C_l$, and of shorter duration, say 3-5 years, $C_s$, i.e. $C = C_l \times C_s$. The reason for distinguishing between short and long cycles is that a long/short cycle can either be treated as nonstationary or stationary depending on the time perspective of the study. As an illustration of long cycles that have been found nonstationary by the statistical analysis (Juselius, 1998b) see the graph of trend-adjusted real income in Figure 2.4, middle panel.

An additive formulation is obtained by taking logarithms:

$$x = (t_s + t_d) + (c_l + c_s) + i \tag{2.4}$$

where lower case letters indicate a logarithmic transformation. In the subsequent chapters the stochastic time dependence of the variables will be of primary interest, but the linear time trend will also be important as a measure of average linear growth rates usually present in economic data.

To give the economic intuition for the subsequent multivariate cointegration analysis of money demand / money supply relations, we will illustrate the ideas in Sections 2.5 and 2.6 using the time series vector $x_t = [m, p, y^r, R_m, R_b]_t$, $t = 1, \dots, T$, where $m$ is a measure of money stock, $p$ the price level, $y^r$ real income, $R_m$ the own interest on money stock, and $R_b$ the interest rate on bonds. All variables are treated as stochastic and, hence, from a statistical point of view need to be modelled, independently of whether they are considered endogenous or exogenous in the economic model.

To illustrate the ideas we will assume two autonomous shocks $u_1$ and $u_2$, where for simplicity $u_1$ is a nominal shock causing a permanent shift in the AD curve and $u_2$ is a real shock causing a permanent shift in the AS curve. This would be consistent with the aggregate supply (AS) aggregate demand (AD) curve. To clarify the connection between the econometric analysis and the economic interpretation we will first assume that the empirical analysis is based
on a quarterly model of, say, a few decades and then on a yearly model of, say, a hundred years. In the first case, when the perspective of the study is the medium run, we will argue that prices should generally be treated as I(2), whereas in the latter case, when the perspective is the very long run, prices can sometimes be approximated as a strongly correlated I(1) process.

Figure 2.4 illustrates different stochastic trends in the Danish quarterly data set. The stochastic I(2) trend in the upper panel corresponds to trend-adjusted prices and the stochastic I(1) trend in the middle panel corresponds to trend-adjusted real income. The lower panel is a graph of $\Delta p_t$ and describes the stochastic I(1) trend in the inflation rate, which is equivalent to the differenced I(2) trend. Note, however, that inflation has been positive over the full sample, meaning that the price level will contain a linear deterministic trend. Thus, a nonzero sample average of inflation, $\overline{\Delta p} \neq 0$, is consistent with a linear trend in the price levels, $p_t$.

[Figure 2.4: Stochastic trends in Danish prices, real income and inflation, based on quarterly data 1975:1-1994:4. Panels: stochastic I(2) trend in prices (upper), stochastic I(1) trend in real income (middle), stochastic I(1) trend in inflation (lower).]

The concept of a common stochastic trend or a driving force requires a further distinction between:
• an unanticipated shock with a permanent effect (a disturbance to the system with a long lasting effect)
• an unanticipated shock with a transitory effect (a disturbance to the system with a short duration).

To give the non-expert reader a more intuitive understanding for the meaning of a stochastic trend of first or second order, we will assume that inflation rate, $\pi_t$, follows a random walk:

$$
\begin{aligned}
\pi_t &= \pi_{t-1} + \varepsilon_t, \qquad t = 1, \dots, T, \\
      &= \varepsilon_t/(1 - L) + \pi_0,
\end{aligned}
\tag{2.5}
$$

where $\varepsilon_t = \varepsilon_{p,t} + \varepsilon_{s,t}$ consists of a permanent shock, $\varepsilon_{p,t}$, and a transitory shock, $\varepsilon_{s,t}$. [Footnote 2: Note that in this case an ARIMA(0,1,1) model would give a more appropriate specification provided $\varepsilon_p$ and $\varepsilon_s$ are white noise processes.] By integrating the shocks, starting from an initial value of inflation rate, $\pi_0$, we get:

$$
\begin{aligned}
\pi_t &= \varepsilon_{p,t} + \varepsilon_{p,t-1} + \varepsilon_{p,t-2} + \dots + \varepsilon_{p,1} + \varepsilon_{s,t} + \varepsilon_{s,t-1} + \varepsilon_{s,t-2} + \dots + \varepsilon_{s,1} + \pi_0 \\
      &= \sum_{i=1}^{t} \varepsilon_{p,i} + \sum_{i=1}^{t} \varepsilon_{s,i} + \pi_0 \approx \sum_{i=1}^{t} \varepsilon_{p,i} + \varepsilon_{s,t} + \pi_0 .
\end{aligned}
$$

A permanent shock is by definition a shock that has a lasting effect on the level of inflation, such as a permanent increase in government expenditure, whereas the effect of a transitory shock disappears either during the next period or over the next few ones. For simplicity we assume that only the former case is relevant here. An example of a transitory price shock is a value added tax imposed in one period and removed the next. In that case prices increase temporarily, but return to their previous level after the removal. Therefore, a transitory shock can be described as a shock that occurs a second time in the series but then with opposite sign. Hence, a transitory shock disappears in cumulation, whereas a permanent shock has a long lasting effect on the level. In the summation of the shocks $\varepsilon_i$:

$$\pi_t = \sum_{i=1}^{t} \varepsilon_i + \pi_0 \tag{2.6}$$

only the permanent shocks will remain and we call $\sum_{i=1}^{t} \varepsilon_i$ a stochastic trend. The difference between a linear stochastic and deterministic trend is that
the increments of a stochastic trend change randomly, whereas those of a deterministic trend are constant over time. The former is illustrated in the middle and lower panels of Figure 2.4.

A representation of prices instead of inflation is obtained by integrating (2.6) once, i.e.

$$p_t = \sum_{s=1}^{t} \pi_s + p_0 = \sum_{s=1}^{t} \sum_{i=1}^{s} \varepsilon_i + \pi_0 t + p_0. \tag{2.7}$$

It appears that inflation being I(1) with a nonzero mean corresponds to prices being I(2) with linear trends. The stochastic I(2) trend is illustrated in the upper part of Figure 2.4.

The question whether inflation rates should be treated as I(1) or I(0) has been subject to much debate. Figure 2.2 illustrated that the Danish inflation rate measured over the last decades was probably best approximated by a nonstationary process, whereas measured over a century by a stationary, though strongly autocorrelated, process. For a description of the latter case, (2.5) should be replaced by:

$$
\begin{aligned}
\pi_t &= \rho \pi_{t-1} + \varepsilon_t, \qquad t = 1, \dots, T, \\
      &= \varepsilon_t/(1 - \rho L) + \pi_0,
\end{aligned}
\tag{2.8}
$$

which becomes:

$$\pi_t = \sum_{i=1}^{t} \rho^{t-i} \varepsilon_i + \pi_0 = \varepsilon_t + \rho \varepsilon_{t-1} + \dots + \rho^{t-1} \varepsilon_1 + \pi_0 \tag{2.9}$$

where the autoregressive parameter $\rho$ is less than but close to one. In this case prices would be represented by:

$$
\begin{aligned}
p_t &= \sum_{s=1}^{t} \pi_s + p_0 = \sum_{s=1}^{t} \sum_{i=1}^{s} \rho^{s-i} \varepsilon_i + \pi_0 t + p_0 \\
    &= (1 - \rho)^{-1} \sum_{i=1}^{t} \varepsilon_i - \rho (1 - \rho)^{-1} \sum_{i=1}^{t} \rho^{t-i} \varepsilon_i + \pi_0 t + p_0,
\end{aligned}
\tag{2.10}
$$

i.e. by a strongly autoregressive first order stochastic trend and a deterministic linear trend.
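The cumulation arguments in (2.5)-(2.10) can be checked numerically. The sketch below is illustrative only: the shock sizes, the values of $\pi_0$, $p_0$ and $\rho$, and all variable names are assumptions made for the example, and the autoregressive case follows the representation (2.9) as written in the text, with $\pi_0$ entering as a constant level. It first shows that a transitory shock (one that reappears with the opposite sign) disappears in cumulation while a permanent shock does not, and then verifies the closed forms (2.7) and (2.10) against direct summation.

```python
import random
from itertools import accumulate

# --- Permanent versus transitory shocks ---------------------------------
periods = 10
permanent = [0.0] * periods
permanent[4] = 1.0                           # a shock that is never reversed
transitory = [0.0] * periods
transitory[4], transitory[5] = 1.0, -1.0     # the same shock, reversed one period later

level_permanent = list(accumulate(permanent))    # stays shifted after the shock
level_transitory = list(accumulate(transitory))  # returns to the old level

# --- Random walk inflation and I(2) prices, (2.6) and (2.7) -------------
rng = random.Random(0)
n = 200
pi0, p0, rho = 0.01, 0.0, 0.9                # illustrative values
eps = [rng.gauss(0.0, 0.005) for _ in range(n)]

pi_rw = [c + pi0 for c in accumulate(eps)]       # (2.6): cumulated shocks plus pi0
p_rw = [c + p0 for c in accumulate(pi_rw)]       # prices as cumulated inflation
double_sum = list(accumulate(accumulate(eps)))   # twice cumulated shocks
p_rw_closed = [double_sum[t] + pi0 * (t + 1) + p0 for t in range(n)]   # (2.7)
gap_27 = max(abs(a - b) for a, b in zip(p_rw, p_rw_closed))

# --- Strongly autocorrelated inflation, (2.9) and (2.10) ----------------
ar_part, state = [], 0.0
for e in eps:
    state = rho * state + e                      # sum_i rho^(t-i) * eps_i
    ar_part.append(state)
pi_ar = [a + pi0 for a in ar_part]               # (2.9)
p_ar = [c + p0 for c in accumulate(pi_ar)]
cum_eps = list(accumulate(eps))
p_ar_closed = [
    cum_eps[t] / (1 - rho) - rho / (1 - rho) * ar_part[t] + pi0 * (t + 1) + p0
    for t in range(n)
]                                                # (2.10)
gap_210 = max(abs(a - b) for a, b in zip(p_ar, p_ar_closed))

print(level_permanent[-1], level_transitory[-1], gap_27 < 1e-9, gap_210 < 1e-9)
```

In the random walk case prices contain the twice cumulated shocks, whereas in the autoregressive case the double sum collapses into a once cumulated stochastic trend scaled by $(1-\rho)^{-1}$ plus a stationary term, in line with the interpretation given after (2.10).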

The difference between (2.6) and (2.9) is only a matter of approximation. In the first case the parameter $\rho$ is approximated with unity, because the sample period is too short for the estimate to be statistically different from one. In the second case the sample period contains enough turning points for $\rho$ to be significantly different from one. Econometrically it is preferable to treat a long business cycle component spanning over, say, 10 years as an I(1) process when the sample period is less than, say, 20 years. In this sense the difference between the long cyclical component $c_l$ and $t_s$ in (2.4) is that $t_s$ is a true unit root process ($\rho = 1$), whereas $c_l$ is a near unit root process ($\rho \leq 1$) that needs a very long sample to distinguish it from a true unit root process.

Whether inflation should be considered I(1) or I(0) has been much debated, often based on a structural (economic) interpretation of a unit root. We argue here that the order of integration should be based on statistical, rather than economic arguments. However, unless a unit root is given a structural interpretation, the choice of one representation or the other is as such not very important, as long as there is consistency between the economic analysis and the choice. However, from an econometric point of view the choice between the two representations is usually crucial for the whole empirical analysis and should, therefore, be based on all available information. If $\rho$ is not significantly different from one (for example because the sample period is short), but we, nevertheless, treat inflation as I(0) then the statistical inference will sooner or later produce logically inconsistent results.

The statement that inflation, $\Delta p_t$, is I(1) is consistent with inflationary shocks being strongly persistent. Statistically this is expressed in (2.6) as inflation having a first order stochastic trend defined as the cumulative sum of all previous shocks from the starting date. However, the fact that a small value of $\rho$, say 0.80, is often not significantly different from one in a small sample, whereas a much higher value of $\rho$, say 0.98, can differ significantly from one in a long sample, is likely to give semantic problems when using ‘long-run’ and ‘short-run’ to describe integration and cointegration properties. For example, a price variable could easily be considered I(2) based on a short sample, whereas I(1) based on a longer period. Nevertheless, the distinction between a permanent (long-lasting) and a transitory shock is fundamental for the empirical interpretation of integration and cointegration results. However, we will argue below that, because of the sample split, inference on the cointegration and integration properties will be based on relatively short samples leading to the above interpretational problems.

Because a cointegrating relation does not necessarily correspond to an interpretable economic relation, we make a further distinction between the statistical concept of a ‘long-run relation’ and the economic concept of a ‘steady-state relation’ as defined above. When interpreting the subsequent results we will use the concept of ‘long-run relation’ to mean a cointegrating relation between I(1) or I(2) variables. We will talk about ‘short-run adjustment’ when a stationary variable is significantly related to a cointegration relation or to another stationary variable. A necessary condition for a ‘long-run relation’ to be empirically relevant in the model is ‘short-run adjustment’ in at least one of the system equations. [Footnote 3: Based on this definition non-cointegrating relations incorrectly included in the model will eventually drop out as future observations become available: a stationary variable cannot significantly adjust to a nonstationary variable.] When data are I(2) we have integration and cointegration on different levels and the concepts need to be modified accordingly. The need to distinguish between these concepts is moderate here and we refer to Juselius (1999b) for a more formal treatment.

2.5 Scenario Analyses: treating prices as I(2)

In this section we will assume that the long-run stochastic trend $t_s$ in (2.4) can be described by the twice cumulated nominal (AD) shocks, $\sum\sum u_{1i}$, as in (2.7), and the long cyclical component $c_l$ by the once cumulated nominal shocks, $\sum u_{1i}$, and the once cumulated real (AS) shocks, $\sum u_{2i}$. This representation gives us the possibility of distinguishing between the long-run stochastic trend component in prices, $\sum\sum u_{1i}$, the medium-run stochastic trend in price inflation, $\sum u_{1i}$, and the medium-run stochastic trend in real activity, $\sum u_{2i}$.

As an illustration of how the econometric analysis is influenced by the above assumptions we will consider the following decomposition of the data vector:
$$
\begin{bmatrix} m_t \\ p_t \\ y^r_t \\ R_{m,t} \\ R_{b,t} \end{bmatrix}
=
\begin{bmatrix} c_{11} \\ c_{21} \\ 0 \\ 0 \\ 0 \end{bmatrix}
\Big[ \sum\sum u_{1i} \Big]
+
\begin{bmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \\ d_{31} & d_{32} \\ d_{41} & d_{42} \\ d_{51} & d_{52} \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+
\begin{bmatrix} g_1 \\ g_2 \\ g_3 \\ 0 \\ 0 \end{bmatrix} [t]
+ \text{stat. comp.}
\tag{2.11}
$$

The deterministic trend component, $t_d = t$, accounts for linear growth in nominal money and prices as well as real income. Generally, when $\{g_1 \neq 0, g_2 \neq 0, g_3 \neq 0\}$, then real income and the nominal variables all have nonzero average growth rates, consistent with stylized facts in most industrialized countries. If $g_3 = 0$ and $d_{31} = 0$ in (2.11), then $\sum u_{2i}$ is likely to describe the long-run real growth in the economy, i.e. a “structural” unit root process as discussed in the many papers on the stochastic versus deterministic real growth models. See for instance King, Plosser, Stock and Watson (1991). If $g_3 \neq 0$, then the linear time trend is likely to capture the long-run trend and $\sum u_{2i}$ will describe the medium-run deviations from this trend, i.e. the long business cycles. The trend-adjusted real income variable in the middle panel of Figure 2.4 illustrates such long business cycles. The first case explicitly assumes that the average real growth rate is zero whereas the latter case does not. Whether one includes a linear trend or not in (2.11) influences, therefore, the possibility of interpreting the second stochastic trend, $\sum u_{2i}$, as a long-run structural trend or not. For a further discussion, see Rubin (1998).

Conditions for long-run price homogeneity:

Let us now take a closer look at the trend components of $m_t$ and $p_t$ in (2.11):

$$
\begin{aligned}
m_t &= c_{11} \sum\sum u_{1i} + d_{11} \sum u_{1i} + d_{12} \sum u_{2i} + g_1 t + \text{stat. comp.} \\
p_t &= c_{21} \sum\sum u_{1i} + d_{21} \sum u_{1i} + d_{22} \sum u_{2i} + g_2 t + \text{stat. comp.}
\end{aligned}
$$

If $(c_{11}, c_{21}) \neq 0$, then $\{m_t, p_t\} \sim I(2)$. If, in addition, $c_{11} = c_{21}$ then

$$m_t - p_t = (d_{11} - d_{21}) \sum u_{1i} + (d_{12} - d_{22}) \sum u_{2i} + (g_1 - g_2) t + \text{stat. comp.}$$
With $c_{11} = c_{21}$, the difference $m_t - p_t$ is at most I(1). If $\{(d_{11} \neq d_{21}), (d_{12} \neq d_{22})\}$, then $m_t$ and $p_t$ are cointegrating from I(2) to I(1), i.e. they are CI(2,1). If, in addition, $(g_1 \neq g_2)$, then real money stock grows around a linear trend. The case $(m_t - p_t) \sim I(1)$ implies long-run price homogeneity and is a testable hypothesis. Money stock and prices are moving together in the long run, but not necessarily in the medium run (over the business cycle). Long-run and medium-run price homogeneity requires $\{c_{11} = c_{21}$ and $d_{11} = d_{21}\}$, i.e. the nominal (AD) shocks $u_{1t}$ affect nominal money and prices in the same way both in the long run and in the medium run. Because the real stochastic trend $\sum u_{2i}$ is likely to enter $m_t$ but not necessarily $p_t$, testing long- and medium-run price homogeneity jointly is not equivalent to testing $(m_t - p_t) \sim I(0)$. Therefore, the joint hypothesis is not as straightforward to test as long-run price homogeneity alone. Note that $(m_t - p_t) \sim I(1)$ implies $(\Delta m_t - \Delta p_t) \sim I(0)$, i.e. long-run price homogeneity implies cointegration between price inflation and money growth. If this is the case, then the stochastic trend in inflation can equally well be measured by the stochastic trend in the growth in money stock.

Assuming long-run price homogeneity: In the following we will assume long-run price homogeneity, i.e. $c_{11} = c_{21}$, and discuss various cases where medium-run price homogeneity is either present or absent. In this case it is convenient to transform the nominal data vector into real money $(m - p)$ and inflation rate $\Delta p$ (or equivalently $\Delta m$)⁴:

$$
\begin{bmatrix} m_t - p_t \\ \Delta p_t \\ y^r_t \\ R_{m,t} \\ R_{b,t} \end{bmatrix}
=
\begin{bmatrix} d_{11} - d_{21} & d_{12} - d_{22} \\ c_{21} & 0 \\ d_{31} & d_{32} \\ d_{41} & d_{42} \\ d_{51} & d_{52} \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+
\begin{bmatrix} g_1 - g_2 \\ 0 \\ g_3 \\ 0 \\ 0 \end{bmatrix} [t]
+ \dots
\tag{2.12}
$$

In (2.12) all variables are at most I(1). The inflation rate (measured by $\Delta p_t$ or $\Delta m_t$) is only affected by the once cumulated AD trend, $\sum u_{1i}$, but all the other variables can in principle be affected by both stochastic trends, $\sum u_{1i}$ and $\sum u_{2i}$. The case of trend-adjusted $(m_t - p_t) \sim I(0)$ requires that both $d_{11} = d_{21}$ and $d_{12} = d_{22}$, which is not very likely from an economic point of view.

⁴ Chapters 14 and 15 will discuss this nominal to real transformation in more detail.
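The cancellation of the I(2) trend under long-run price homogeneity can be checked numerically. The following is a minimal sketch, not part of the original text: the loadings, trend slopes and sample length are hypothetical values chosen only for illustration, and the stationary components are set to zero so that the identities hold exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
u1 = rng.standard_normal(T)        # nominal (AD) shocks u_1t
u2 = rng.standard_normal(T)        # real (AS) shocks u_2t

s1 = np.cumsum(u1)                 # once-cumulated AD trend, sum u_1i
s2 = np.cumsum(u2)                 # once-cumulated AS trend, sum u_2i
ss1 = np.cumsum(s1)                # twice-cumulated AD trend, sum sum u_1i

# Hypothetical coefficients; c11 == c21 imposes long-run price homogeneity.
c11 = c21 = 1.0
d11, d21, d12, d22 = 0.8, 0.3, 0.5, 0.0
g1, g2 = 0.02, 0.015
t = np.arange(1, T + 1)

# Trend components of money and prices (stationary components omitted).
m = c11 * ss1 + d11 * s1 + d12 * s2 + g1 * t
p = c21 * ss1 + d21 * s1 + d22 * s2 + g2 * t

# Real money: the twice-cumulated trend cancels, leaving I(1) trends only.
real_money = m - p
expected_real_money = (d11 - d21) * s1 + (d12 - d22) * s2 + (g1 - g2) * t
homogeneity_holds = np.allclose(real_money, expected_real_money)

# Money growth minus inflation contains no cumulated shocks at all.
growth_gap = np.diff(m) - np.diff(p)
expected_gap = (d11 - d21) * u1[1:] + (d12 - d22) * u2[1:] + (g1 - g2)
gap_is_stationary_combination = np.allclose(growth_gap, expected_gap)
```

Setting `c11` different from `c21` in this sketch would leave a multiple of `ss1` in `real_money`, so that real money would itself be I(2).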

A priori, one would expect the real stochastic trend $\sum u_{2i}$ to influence money stock (by increasing the transactions, precautionary and speculative demands for money) but not the price level, i.e. that $d_{12} \neq 0$ and $d_{22} = 0$.

The case $(m_t - p_t - y^r_t) \sim I(0)$, i.e. money velocity of circulation is a stationary variable, requires that $d_{11} - d_{21} - d_{31} = 0$, $d_{12} - d_{22} - d_{32} = 0$ and $g_1 - g_2 - g_3 = 0$. If $d_{11} = d_{21}$ (i.e. medium-run price homogeneity), $d_{22} = 0$ (real stochastic growth does not affect prices), $d_{31} = 0$ (medium-run price growth does not affect real income), and $d_{12} = d_{32}$, then $m_t - p_t - y^r_t \sim I(0)$. In this case real money stock and real aggregate income share one common trend, the real stochastic trend $\sum u_{2i}$. The stationarity of money velocity, implying common movements in money, prices, and income, is then not inconsistent with the conventional monetarist assumption as stated by Friedman (1970) that "inflation always and everywhere is a monetary problem". This case, $(m_t - p_t - y^r_t) \sim I(0)$, has generally found little empirical support (Juselius, 1996; Juselius, 1998b; Juselius and Toro, 1999; Juselius, 2000). As an illustration see the graph of money velocity in the upper panel of Figure 2.5.

We will now turn to the more realistic assumption of money velocity being I(1). The case $(m_t - p_t - y^r_t) \sim I(1)$ implies that either $(d_{11} - d_{21} - d_{31}) \neq 0$ or $(d_{12} - d_{22} - d_{32}) \neq 0$. It suggests that the two common stochastic trends affect the level of real money stock and real income differently. A few examples illustrate this:

Example 1: Inflation is cointegrating with velocity, i.e.:

$$
m_t - p_t - y^r_t + b_1 \Delta p_t \sim I(0), \tag{2.13}
$$

or alternatively

$$
(m_t - p_t - y^r_t) + b_2 \Delta m_t \sim I(0).
$$

Under the previous assumptions that $d_{31} = 0$, $d_{22} = 0$, and $d_{12} = d_{32}$, the loading of (2.13) on $\sum u_{1i}$ is $(d_{11} - d_{21}) + b_1 c_{21}$, so the I(0) assumption of (2.13) implies that $d_{11} - d_{21} = -b_1 c_{21}$. If $b_1 > 0$, then (2.13) can be interpreted as a money demand relation, where the opportunity cost of holding money relative to real stock, as measured by $\Delta p_t$, is a determinant of money velocity. On the other hand, if $b_1 < 0$ (or $b_2 > 0$), then inflation adjusts to excess money, though if $|b_1| > 1$, with some time lag. In this case it is not possible to interpret (2.13) as a money demand relation.

Example 2: The interest rate spread and velocity are cointegrating, i.e.:

$$
(m_t - p_t - y^r_t) - b_3 (R_m - R_b)_t \sim I(0). \tag{2.14}
$$

Because $(R_m - R_b)_t \sim I(1)$, either $(d_{41} - d_{51}) \neq 0$, or $(d_{42} - d_{52}) \neq 0$, or both. In either case the stochastic trend in the spread has to cointegrate with the stochastic trend in velocity for (2.14) to hold. If $b_3 > 0$, then (2.14) can be interpreted as a money demand relation in which the opportunity cost of holding money relative to bonds is a determinant of agents' desired money holdings. On the other hand, if $b_3 < 0$ then the money demand interpretation is no longer possible, and (2.14) could instead be a central bank policy rule. The middle panel of Figure 2.5 shows the interest spread between the Danish 10 year bond rate and the deposit rate, and the lower panel the linear combination (2.14) with $b_3 = 14$. It is notable how well the nonstationary behavior of money velocity and the spread cancels in the linear money demand relation. From the perspective of monetary policy a nonstationary spread suggests that the short-term central bank interest rate can be used as an instrument to influence money demand. A stationary spread on the other hand signals fast adjustment between the two interest rates, such that changing the short interest rate only changes the spread in the very short run and, hence, leaves money demand essentially unchanged.

In a model explaining monetary transmission mechanisms, the determination of real interest rates is likely to play an important role. The Fisher parity predicts that real interest rates are constant, i.e.

$$
R^m_t = E_t(\Delta_m p_{t+m})/m + R_0 = E_t(p_{t+m} - p_t)/m + R_0 = \frac{1}{m} \sum_{i=1}^{m} E_t \Delta p_{t+i} + R_0 \tag{2.15}
$$

where $R_0$ is a constant real interest rate and $E_t(\Delta_m p_{t+m})/m$ is the expected value at time $t$ of inflation at the period of maturity $t + m$. If $(\Delta_m p_{t+m} - E_t \Delta_m p_{t+m}) \sim I(0)$, then the predictions do not deviate from the actual realization by more than a stationary error. If, in addition, $(\Delta p_t - (\Delta_m p_{t+m})/m) \sim I(0)$, then $R_t - \Delta p_t$ is stationary. From (2.15) it appears that if $(R_m - \Delta p) \sim I(0)$ and $(R_b - \Delta p) \sim I(0)$, then $d_{42} = d_{52} = 0$. Also, if $d_{42} = d_{52} = 0$, then $R_m$ and $R_b$ must be cointegrating, $(R_m - b_4 R_b)_t \sim I(0)$ with $b_4 = 1$ for $d_{41} = d_{51}$.

[Figure 2.5: Money velocity (upper panel), the interest rate spread (middle panel), and money demand (lower panel) for Danish data. Panel titles: Money velocity; The interest rate spread; Deviations from money steady-state.]

In this sense stationary real interest rates are both econometrically and economically consistent with the spread and the velocity being stationary. It corresponds to the situation where real income and real money stock share the common AS trend, $\sum u_{2i}$, and inflation and the two nominal interest rates share the AD trend, $\sum u_{1i}$. This case can be formulated as a restricted version of (2.12):

$$
\begin{bmatrix} m_t - p_t \\ \Delta p_t \\ y^r_t \\ R_{m,t} \\ R_{b,t} \end{bmatrix}
=
\begin{bmatrix} 0 & d_{12} \\ c_{21} & 0 \\ 0 & d_{12} \\ c_{21} & 0 \\ c_{21} & 0 \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+ \dots
\tag{2.16}
$$

Though appealing from a theory point of view, (2.16) has not found much empirical support. Instead, real interest rates, interest rate spreads, and money velocity have frequently been found to be nonstationary. This suggests the presence of real and nominal interaction effects, at least over the horizon of a long business cycle.

By modifying some of the assumptions underlying the Fisher parity, the nonstationarity of real interest rates and the interest rate spread can be justified. For example, Goldberg and Frydman (2002) show that imperfect knowledge expectations (instead of rational expectations) are likely to generate an I(1) trend in the interest rate spread. Also, if agents systematically mispredict the future inflation rate we would expect $(\Delta_m p_{t+m} - E_t \Delta_m p_{t+m}) \sim I(1)$ and, hence, $R_t - \Delta p_t \sim I(1)$. In this case one would also expect $E_t\{(\Delta_b p_{t+b})/b - (\Delta_m p_{t+m})/m\} \sim I(1)$, and $(R_m - R_b)_t \sim I(1)$ would be consistent with the predictions from the expectations hypothesis (or the Fisher parity).

2.6 Scenario Analyses: treating prices as I(1)

In this case $\rho < 1$ in (2.9), implying that inflation is stationary albeit strongly autocorrelated. The representation of the vector process becomes:

$$
\begin{bmatrix} m_t \\ p_t \\ y^r_t \\ R_{m,t} \\ R_{b,t} \end{bmatrix}
=
\begin{bmatrix} c_{11} & d_{12} \\ c_{21} & d_{22} \\ 0 & d_{32} \\ 0 & d_{42} \\ 0 & d_{52} \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+
\begin{bmatrix} g_1 \\ g_2 \\ g_3 \\ 0 \\ 0 \end{bmatrix} [t]
+ \text{stat. comp.}
\tag{2.17}
$$

Money and prices are represented by:

$$
\begin{aligned}
m_t &= c_{11} \sum u_{1i} + d_{12} \sum u_{2i} + g_1 t + \text{stat. comp.} \\
p_t &= c_{21} \sum u_{1i} + d_{22} \sum u_{2i} + g_2 t + \text{stat. comp.}
\end{aligned}
$$

If $c_{11} = c_{21}$ there is long-run price homogeneity, but $(m_t - p_t) \sim I(1)$ unless $(d_{12} - d_{22}) = 0$. If $d_{12} \neq 0$ and $d_{22} = 0$, then

$$
m_t - p_t = d_{12} \sum u_{2i} + (g_1 - g_2) t + \text{stat. comp.}
$$

If $d_{12} = d_{32}$ then $(m_t - p_t - y^r_t) \sim I(0)$. From $\{m_t, p_t\} \sim I(1)$ it follows that $\{\Delta p, \Delta m\} \sim I(0)$, and real interest rates cannot be stationary unless $d_{42} = d_{52} = 0$.

Hence, a consequence of treating prices as I(1) is that nominal interest rates should be treated as I(0), unless one is prepared a priori to exclude the possibility of stationary real interest rates. As discussed above, the inflation rate and the interest rates have to cross their mean path fairly frequently to obtain statistically significant mean reversion. The restricted version of (2.17) given below is economically as well as econometrically consistent, but is usually only relevant in the analysis of long historical data sets.

$$
\begin{bmatrix} m_t \\ p_t \\ y^r_t \\ R_{m,t} \\ R_{b,t} \end{bmatrix}
=
\begin{bmatrix} c_{11} & d_{12} \\ c_{21} & 0 \\ 0 & d_{12} \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+
\begin{bmatrix} g_1 \\ g_2 \\ g_3 \\ 0 \\ 0 \end{bmatrix} [t]
+ \text{stat. comp.}
\tag{2.18}
$$

2.7 Concluding remarks

This chapter focused on the decomposition of a nonstationary time-series process into stochastic and deterministic trends as well as cycles and other stationary components. In the case when some of the variables contain common stochastic trends, Sections 4 and 5 showed that the latter can be canceled by taking linear combinations of the former and that these linear combinations potentially can be interpreted as economic steady-state relations. All this was given in a purely descriptive manner without specifying a statistical model consistent with these features. This is the purpose of the next chapter, which discusses the properties of aggregated time-series data and points out under which assumptions these data will produce the VAR model.

Chapter 3 The Probability Approach in Econometrics and the VAR

This chapter will (i) define the basic characteristics of a single time series and a vector process, (ii) derive the VAR model under certain simplifying assumptions on the vector process, (iii) discuss the dynamic properties of the VAR model, and (iv) illustrate the concepts with a data set of quarterly observations covering 1975:1-1993:4 on money, income, inflation and two interest rates from Denmark. The aim is to discuss under which simplifying assumptions on the vector time series process the VAR model can be used as an adequate summary description of the information in the sample data.

3.1 A single time series process

To begin with we will look at a single variable observed over consecutive time points and discuss its time-series properties. Let $x_{s,t}$, $s = 1, \dots, S$, $t = 1, \dots, T$ describe $S$ realizations of a variable $x$ over $T$ time periods. When $S > 1$ this could, for example, describe a variable in a study based on panel data, or it could describe a simulation study of a time series process $x_t$ in which the number of replications is $S$. Here we will focus on the case when $S = 1$, i.e. when there is just one realization $(x_1, \dots, x_T)$ on the index set $T$. Since we have just one realization of the random variable $x_t$, we cannot make inference on the shape of the distribution or its parameter values without making simplifying assumptions. We illustrate the difficulties with two simple examples in Figures 3.1 and 3.2.

[Figure 3.1. $E(x_t) = \mu$, $Var(x_t) = \sigma^2$, $t = 1, \dots, 6$. Time graph of the realizations $x_1, \dots, x_6$ around a constant mean $\mu$; vertical axis $x(t)$, horizontal axis time.]

[Figure 3.2. $E(x_t) = \mu_t$, $Var(x_t) = \sigma^2$, $t = 1, \dots, 6$. Time graph of the same realizations $x_1, \dots, x_6$ around time-varying means $\mu_1, \dots, \mu_6$; vertical axis $x(t)$, horizontal axis time.]

In the two examples, the line connecting the realizations $x_t$ produces the graph of the time series. For instance, in Figure 3.1 we have assumed that the distribution, the mean value and the variance are the same for each $x_t$, $t = 1, \dots, T$. In Figure 3.2 the distribution and the variance are identical, but the mean varies with $t$. Note that the observed time graph is the same in both cases, illustrating the fact that we often need rather long time series to be able to statistically distinguish between different hypotheses in time series models.

To be able to make statistical inference we need:

(i) a probability model for $x_t$, for example the normal model
(ii) a sampling model for $x_t$, for example dependent or independent drawings

For the normal distribution, the first two moments around the mean are sufficient to describe the variation in the data. Without simplifying assumptions on the time series process we have the general formulation for $t = 1, \dots, T$:

$$
\begin{aligned}
E(x_t) &= \mu_t \\
Var(x_t) &= E(x_t - \mu_t)^2 = \sigma_{t.0} \\
Cov(x_t, x_{t-h}) &= E[(x_t - \mu_t)(x_{t-h} - \mu_{t-h})] = \sigma_{t.h}, \qquad h = \dots, -1, 0, 1, \dots
\end{aligned}
$$

$$
E[x] = E\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_T \end{bmatrix}
= \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_T \end{bmatrix} = \mu
$$

$$
Cov[x] = E[x - E(x)][x - E(x)]' =
\begin{bmatrix}
\sigma_{1.0} & \sigma_{2.1} & \sigma_{3.2} & \cdots & \sigma_{T.T-1} \\
\sigma_{2.1} & \sigma_{2.0} & \sigma_{3.1} & \cdots & \sigma_{T.T-2} \\
\sigma_{3.2} & \sigma_{3.1} & \sigma_{3.0} & \cdots & \sigma_{T.T-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sigma_{T.T-1} & \sigma_{T.T-2} & \sigma_{T.T-3} & \cdots & \sigma_{T.0}
\end{bmatrix} = \Sigma
$$

$$
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_T \end{bmatrix} \sim N(\mu, \Sigma)
$$

Because there is just one realization of the process at each time $t$, there is not enough information to make statistical inference about the underlying functional form of the distribution of each $x_t$, $t \in T$, and we have to make simplifying assumptions to secure that the number of parameters describing the process is smaller than the number of observations available. A typical assumption in time series models is that each $x_t$ has the same distribution and that the functional form is approximately normal. Furthermore, given the normal distribution, it is frequently assumed that the mean is the same, i.e. $E(x_t) = \mu$ for $t = 1, \dots, T$, and that the variance is the same, i.e. $E(x_t - \mu)^2 = \sigma^2$ for $t = 1, \dots, T$. We use the following notation to describe (i) time varying mean and variance, (ii) time varying mean and constant variance, (iii) constant mean and variance:

$$
\begin{aligned}
\text{(i)} \quad & x_t \sim N(\mu_t, \sigma_t^2), & t = 1, \dots, T \\
\text{(ii)} \quad & x_t \sim N(\mu_t, \sigma^2), & t = 1, \dots, T \\
\text{(iii)} \quad & x_t \sim N(\mu, \sigma^2), & t = 1, \dots, T
\end{aligned}
$$

For a time series process time dependence is an additional problem that has to be addressed. Consecutive realizations cannot usually be considered as independent drawings from the same underlying stochastic process. For the normal distribution the time dependence between $x_t$ and $x_{t-h}$, $h = \dots, -1, 0, 1, \dots$ and $t = 1, \dots, T$, can be described by the covariance function. A simplifying assumption in this context is that the covariance function is a function of $h$, but not of $t$, i.e. $\sigma_{t.h} = \sigma_h$ for $t = 1, \dots, T$. If $x_t$ has constant mean and variance and in addition $\sigma_h = 0$ for all $h = 1, \dots, T$, then $\Sigma$ is a diagonal matrix and $x_t$ is independent of $x_{t-h}$ for $h = 1, \dots, t$. In this case we say that:

$$
x_t \sim Niid(\mu, \sigma^2),
$$

where $Niid$ is a shorthand notation for Normally, identically, independently distributed.

3.2 A vector process

We will now move on to the more interesting case where we observe a vector of $p$ variables. In this case we need additionally to discuss covariances between the variables at time $t$ as well as covariances between $t$ and $t - h$.

The covariances contain information about static and dynamic relationships between the variables which we would like to uncover using econometrics. For notational simplicity $x_t$ will here be used to denote both a random variable and its realization. Consider the $p \times 1$ dimensional vector $x_t$:

$$
x_t = \begin{bmatrix} x_{1,t} \\ x_{2,t} \\ \vdots \\ x_{p,t} \end{bmatrix}, \qquad t = 1, \dots, T.
$$

We introduce the following notation for the case when no simplifying assumptions have been made:

$$
E[x_t] = \begin{bmatrix} \mu_{1,t} \\ \mu_{2,t} \\ \vdots \\ \mu_{p,t} \end{bmatrix} = \mu_t, \qquad
Cov[x_t, x_{t-h}] =
\begin{bmatrix}
\sigma_{11.h} & \sigma_{12.h} & \cdots & \sigma_{1p.h} \\
\sigma_{21.h} & \sigma_{22.h} & \cdots & \sigma_{2p.h} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{p1.h} & \sigma_{p2.h} & \cdots & \sigma_{pp.h}
\end{bmatrix} = \Sigma_{t.h}, \qquad t = 1, \dots, T.
$$

We will now assume that the same distribution applies for all $x_t$ and that it is approximately normal, i.e. $x_t \sim N(\mu_t, \Sigma_t)$. Under the normality assumption the first two moments around the mean (central moments) are sufficient to describe the variation in the data. We introduce the notation:

$$
Z = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_T \end{bmatrix}, \qquad
E[Z] = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_T \end{bmatrix} = \tilde{\mu}
\tag{3.1}
$$

where $Z$ is a $pT \times 1$ vector. The covariance matrix $\tilde{\Sigma}$ is given by

$$
E[(Z - \tilde{\mu})(Z - \tilde{\mu})'] =
\begin{bmatrix}
\Sigma_{1.0} & \Sigma'_{2.1} & \cdots & \Sigma'_{T-1.T-2} & \Sigma'_{T.T-1} \\
\Sigma_{2.1} & \Sigma_{2.0} & \cdots & \cdot & \Sigma'_{T.T-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\Sigma_{T-1.T-2} & \cdot & \cdots & \Sigma_{T-1.0} & \Sigma'_{T.1} \\
\Sigma_{T.T-1} & \Sigma_{T.T-2} & \cdots & \Sigma_{T.1} & \Sigma_{T.0}
\end{bmatrix}
= \underset{(Tp \times Tp)}{\tilde{\Sigma}}
$$

where $\Sigma_{t.h} = Cov(x_t, x_{t-h}) = E(x_t - \mu_t)(x_{t-h} - \mu_{t-h})'$. The above notation provides a completely general description of a multivariate normal vector time series process. Since there are far more parameters than observations available for estimation, it has no meaning from a practical point of view. Therefore, we have to make simplifying assumptions to reduce the number of parameters. Empirical models are typically based on the following assumptions:

• $\Sigma_{t.h} = \Sigma_h$, for all $t \in T$, $h = \dots, -1, 0, 1, \dots$
• $\mu_t = \mu$ for all $t \in T$.

These two assumptions are needed to secure parameter constancy in the VAR model to be defined in Section 3.4. When the assumptions are satisfied we can write the mean and the covariances of the data matrix in the simplified form:

$$
\tilde{\mu} = \begin{bmatrix} \mu \\ \mu \\ \vdots \\ \mu \end{bmatrix}, \qquad
\tilde{\Sigma} =
\begin{bmatrix}
\Sigma_0 & \Sigma'_1 & \Sigma'_2 & \cdots & \Sigma'_{T-1} \\
\Sigma_1 & \Sigma_0 & \Sigma'_1 & \ddots & \vdots \\
\Sigma_2 & \Sigma_1 & \Sigma_0 & \ddots & \Sigma'_2 \\
\vdots & \ddots & \ddots & \ddots & \Sigma'_1 \\
\Sigma_{T-1} & \cdots & \Sigma_2 & \Sigma_1 & \Sigma_0
\end{bmatrix}
$$

The above two assumptions for infinite $T$ define a weakly stationary process:

Definition 1 Let $\{x_t\}$ be a stochastic process (an ordered series of random variables) for $t = \dots, -1, 0, 1, 2, \dots$. If

$$
\begin{aligned}
E[x_t] &= \mu < \infty && \text{for all } t, \\
E[x_t - \mu]^2 &= \sigma^2 < \infty && \text{for all } t, \\
E[(x_t - \mu)(x_{t+h} - \mu)] &= \sigma_h < \infty && \text{for all } t \text{ and } h = 1, 2, \dots
\end{aligned}
$$

then $\{x_t\}$ is said to be weakly stationary. Strict stationarity requires that the distribution of $(x_{t_1}, \dots, x_{t_k})$ is the same as that of $(x_{t_1+h}, \dots, x_{t_k+h})$ for $h = \dots, -1, 0, 1, 2, \dots$.
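Under the two stationarity assumptions, $\mu$ and $\Sigma_h$ can be estimated from a single realization by averaging over time. The following is a minimal sketch, assuming a $(T \times p)$ NumPy array of observations; the function name `autocovariance` and the small data matrix are hypothetical and serve only to make the arithmetic checkable by hand.

```python
import numpy as np

def autocovariance(x, h):
    """Sample analogue of Sigma_h = Cov(x_t, x_{t-h}) for a (T x p) array x.

    Uses the common sample mean for all t (the assumption mu_t = mu) and
    divides by T, one conventional normalisation among several.
    """
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    dev = x - x.mean(axis=0)              # deviations from the constant mean
    # Sum over t of (x_t - mean)(x_{t-h} - mean)'
    return dev[h:].T @ dev[:T - h] / T

# Tiny hypothetical data set with T = 3 observations on p = 2 variables.
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 0.0]])

sigma0 = autocovariance(x, 0)   # contemporaneous covariance matrix
sigma1 = autocovariance(x, 1)   # first-order autocovariance matrix
```

Note that `sigma0` is symmetric whereas `sigma1` in general is not, which is why the transposes $\Sigma'_h$ appear above the diagonal of $\tilde{\Sigma}$.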

An illustration: The data set introduced here will be used throughout the book to illustrate the many questions that can be asked, and their empirical answers, within the cointegrated VAR model. As a matter of fact, the development of many of the subsequent cointegration procedures was more or less forced upon us as a result of the empirical analyses being performed on this data set. The data vector is defined by $[m^r_t, y^r_t, \Delta p_t, R_{m,t}, R_{b,t}]$, $t = 1975{:}1, \dots, 1993{:}4$, where

$m^r_t = m_t - p_t$ is a measurement of real money stock at time $t$, where $m_t$ is the nominal M3 and $p_t$ is the implicit deflator of the gross national expenditure, GNE,
$y^r_t$ is the real gross national expenditure, GNE,
$\Delta p_t$ is the quarterly inflation rate measured by the implicit GNE deflator,
$R_{m,t}$ is the average deposit rate as a proxy for the interest yield on M3,
$R_{b,t}$ is the 10 year government bond rate.

Figures 3.3 and 3.4 show the graphs of the data in levels and in first differences.

[Figure 3.3. The Danish data in levels. Panels: Real money stock; Real aggregate expenditure; Inflation rate; Deposit rate; Bond rate.]

[Figure 3.4. The Danish data in first differences. Panels: Dm; Dry; DDpy; Did; Dib.]

A visual inspection reveals that neither the assumption of a constant mean nor that of a constant variance seems appropriate for the levels of the variables, whereas the differenced variables look more satisfactory in this respect. If the marginal processes are normal then the observations should lie symmetrically on both sides of the mean. This seems approximately to be the case for real money, but for real income, inflation rate, and the two interest rates there are some "outlier" observations. The question is whether these observations are too far away from the mean to be considered realizations from a normal distribution. At this stage it is a good idea to have a look in the economic calendar to find out if the outlier observations can be related to some significant economic interventions or reforms. The outlier observation in real income and inflation rate appears at the same date and seems related to a temporary removal of the value added tax rate in 1975:4. Denmark had experienced a stagnating domestic demand in the period after the first oil shock and to boost aggregate activity the government decided to remove VAT for one quarter and gradually put it back again over the next two quarters. The outlier observation in the bond rate is related to the lifting of previous restrictions on capital movements and the start of the 'hard' EMS in 1983, whereas the outliers in the deposit rate are related to Central Bank interventions.

Furthermore, we note that the variability of the inflation rate in Figure 3.3 looks higher in the first part of the sample. Thus, the assumption of constant variance does not seem to be satisfied in this particular case.

These are realistic examples that point at the need to include additional information on interventions and institutional reforms in the empirical model analysis. This can be done by including new variables measuring the effect of institutional reforms, or, if such variables are not available, by using dummy variables as a proxy for the change in institutions.

At the start of the empirical analysis it is not always possible to know whether an intervention was strong enough to produce an "extraordinary" effect or not. Essentially every single month, quarter, and year is subject to some kind of political intervention; most of them have a minor impact on the data and the model. Thus, if an ordinary intervention does not 'stick out' as an outlier, it will be treated as a random shock for practical reasons. Major interventions, like removing restrictions on capital movements, joining the EMS, etc., are likely to have a much more fundamental impact on economic behavior and, hence, need to be included in the systematic part of the model. Ignoring this problem is likely to seriously bias all estimates of our model and result in invalid inference.

It is always a good idea to start with a visual inspection of the data and their time series properties as a first check of the assumptions of the VAR model. Based on the graphs we can get a first impression of whether $x_{i,t}$ looks stationary with constant mean and variance, or whether this is the case for $\Delta x_{i,t}$. If the answer is negative to the first question, but positive to the next one, we can solve the problem by respecifying the VAR in error-correction form as will be demonstrated in the next chapter. If the answer is negative to both questions, it is often a good idea to check the economic calendar to find out whether any significant departure from the constant mean and constant variance coincides with specific reforms or interventions. The next step is then to include this information in the model and find out whether the intervention or reform had caused a permanent or transitory effect, whether it seems to be additive to the model or whether it fundamentally changed the parameters of the model. In the latter case the intervention is likely to have caused a regime shift and the model would need to be re-specified allowing for the shift in the structure. The econometric modelling of intervention effects will be discussed in more detail in Chapter 6.

3.3 Sequential decomposition of the likelihood function

The purpose of this section is to demonstrate (i) that the joint likelihood function $P(X; \theta)$ can be sequentially decomposed into $T$ conditional probabilities $P(x_t \mid x_{t-1}, \dots, x_1; X_0, \theta)$, and (ii) that the conditional process $(x_t \mid x_{t-1}, \dots, x_1; X_0)$ has a parameterization that corresponds to the vector autoregressive model. We first review the simple multiplicative rule for calculating joint probabilities, and the formulas for calculating the conditional and marginal mean and variance of a multivariate normal vector $Y$.

Repetition:
***********************************************
An illustration of the multiplicative rule for probability calculations based on four dependent events, $A$, $B$, $C$, and $D$:

$$
\begin{aligned}
P(A \cap B \cap C \cap D) &= P(A \mid B \cap C \cap D)\, P(B \cap C \cap D) \\
&= P(A \mid B \cap C \cap D)\, P(B \mid C \cap D)\, P(C \cap D) \\
&= P(A \mid B \cap C \cap D)\, P(B \mid C \cap D)\, P(C \mid D)\, P(D)
\end{aligned}
$$

Note that a multiplicative formulation has been achieved for the conditional events, even if the events themselves are not independent. The general principle of the multiplicative rule for probability calculations will be applied in the derivation of conditional and marginal distributions. Consider first two normally distributed random variables $y_{1,t}$ and $y_{2,t}$ with the joint distribution $Y \sim N(m, S)$:

$$
Y = \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix}, \qquad
E[Y] = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix}, \qquad
Cov\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{bmatrix}
$$

The marginal distributions for $y_{1,t}$ and $y_{2,t}$ are given by

$$
y_{1,t} \sim N(m_1, s_{11}), \qquad y_{2,t} \sim N(m_2, s_{22}).
$$

The conditional distribution for $y_{1,t} \mid y_{2,t}$ is given by

$$
(y_{1,t} \mid y_{2,t}) \sim N(m_{1.2}, s_{11.2})
$$

where

$$
\begin{aligned}
m_{1.2} &= m_1 + s_{12} s_{22}^{-1} (y_{2,t} - m_2) \\
&= (m_1 - s_{12} s_{22}^{-1} m_2) + s_{12} s_{22}^{-1} y_{2,t} \\
&= \beta_0 + \beta_1 y_{2,t}
\end{aligned}
\tag{3.2}
$$

and

$$
s_{11.2} = s_{11} - s_{12} s_{22}^{-1} s_{21}. \tag{3.3}
$$

The joint distribution of $Y$ can now be expressed as the product of the conditional and the marginal distribution:

$$
\underbrace{P(y_{1,t}, y_{2,t}; \theta)}_{\text{the joint distribution}}
= \underbrace{P(y_{1,t} \mid y_{2,t}; \theta_1)}_{\text{the conditional distribution}}
\times \underbrace{P(y_{2,t}; \theta_2)}_{\text{the marginal distribution}}
\tag{3.4}
$$
**************************************************
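The conditional mean and variance formulas (3.2)-(3.3) are easy to evaluate numerically. The following is a minimal sketch for the bivariate case; the function name `conditional_normal` and the numerical values of $m$ and $S$ are hypothetical, chosen only so that the result can be verified by hand.

```python
import numpy as np

def conditional_normal(m, S, y2):
    """Parameters of (y1 | y2) for a bivariate normal Y ~ N(m, S).

    Returns (beta0, beta1, conditional mean, conditional variance),
    following (3.2) and (3.3).
    """
    m1, m2 = m
    s11, s12 = S[0]
    s21, s22 = S[1]
    beta1 = s12 / s22                    # slope: s12 * s22^{-1}
    beta0 = m1 - beta1 * m2              # intercept: m1 - s12 * s22^{-1} * m2
    cond_mean = beta0 + beta1 * y2       # m_{1.2}
    cond_var = s11 - s12 * s21 / s22     # s_{11.2}
    return beta0, beta1, cond_mean, cond_var

# Hypothetical moments of the joint distribution.
m = np.array([1.0, 2.0])
S = np.array([[4.0, 1.5],
              [1.5, 3.0]])

beta0, beta1, cond_mean, cond_var = conditional_normal(m, S, y2=4.0)
```

The conditional variance does not depend on the observed value of $y_{2,t}$ and is never larger than the marginal variance $s_{11}$, which is the sense in which conditioning uses the information in $y_{2,t}$.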

3.4 Deriving the VAR¹

The empirical analysis begins with the data matrix $X = [x_1, \dots, x_T]'$ where $x_t$ is a $(p \times 1)$ vector of variables. Under the assumption that the observed data $X$ is a realization of a stochastic process we can express the joint probability of $X$ given the initial value $X_0$ and the parameter value $\theta$ describing the stochastic process:

$$
P(X \mid X_0; \theta) = P(x_1, x_2, \dots, x_T \mid X_0; \theta).
$$

For a given probability function maximum likelihood estimates can be found by maximizing the likelihood function. Here we will restrict the discussion to the multivariate normal distribution. To express the joint probability of $X \mid X_0$ it is convenient to use the stacked process $Z' = [x'_1, x'_2, x'_3, \dots, x'_T] \sim N_{Tp}(\tilde{\mu}, \tilde{\Sigma})$ defined in (3.1) instead of the $(T \times p)$ data matrix $X$. Since $\tilde{\mu}$ is $Tp \times 1$ and $\tilde{\Sigma}$ is $Tp \times Tp$, there are far more parameters than observations without simplifying assumptions. But even if we impose simplifying restrictions on the mean and the covariances of the process, they are not very informative as such. Therefore, to obtain more interpretable results we will decompose the joint process into a conditional process and a marginal process and then sequentially repeat the decomposition for the marginal process:

$$
\begin{aligned}
P(x_1, x_2, x_3, \dots, x_T \mid X_0; \theta) &= P(x_T \mid x_{T-1}, \dots, x_1, X_0; \theta)\, P(x_{T-1}, x_{T-2}, \dots, x_1 \mid X_0; \theta) \\
&\;\;\vdots \\
&= \prod_{t=1}^{T} P(x_t \mid X^0_{t-1}; \theta)
\end{aligned}
\tag{3.5}
$$

where $X^0_{t-1} = [x_{t-1}, x_{t-2}, \dots, x_1, X_0]$.

¹ This section draws heavily on Hendry and Richard (1983).

The VAR model is essentially a description of the conditional process

$$
\{x_t \mid X^0_{t-1}\} \sim NID_p(\mu_t, \Omega).
$$

It is now possible to see how $\mu_t$ and $\Omega$ are related to $\tilde{\mu}$ and $\tilde{\Sigma}$ by using the rules given in Section 3.3 for calculating the mean and the variance of the conditional distribution, (3.2)-(3.3). We first decompose the data into two sets, the vector $x_t$ and the conditioning set $X^0_{t-1}$, i.e. $X = \begin{bmatrix} x_t \\ X^0_{t-1} \end{bmatrix}$. Using the notation of Section 3.3 we write the marginal and the conditional process:

$$
\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix}
= \begin{bmatrix} x_t \\ x_{t-1} \\ x_{t-2} \\ \vdots \\ x_1 \end{bmatrix}, \qquad
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
= \begin{bmatrix} E[x_t] \\ E[x_{t-1}] \\ E[x_{t-2}] \\ \vdots \\ E[x_1] \end{bmatrix},
$$

where $y_{1,t} = x_t$ and $y_{2,t}$ collects the lagged vectors, and

$$
\tilde{\Sigma} =
\begin{bmatrix}
\Sigma_0 & \Sigma'_1 & \cdots & \Sigma'_{T-1} \\
\Sigma_1 & \Sigma_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \Sigma'_1 \\
\Sigma_{T-1} & \cdots & \Sigma_1 & \Sigma_0
\end{bmatrix}
= \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}.
$$

We can now derive the parameters of the conditional model:

$$
(x_t \mid X^0_{t-1}) \sim N(\mu_{1.2}, \Sigma_{11.2})
$$

where

$$
\mu_{1.2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (X^0_{t-1} - \mu_2) \tag{3.6}
$$

and

$$
\Sigma_{11.2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}. \tag{3.7}
$$

In the notation of the conditional process above, $\mu_t = \mu_{1.2}$ and $\Omega = \Sigma_{11.2}$. The difference between the observed value of the process and its conditional mean is denoted $\varepsilon_t$:

$$
x_t - \mu_t = \varepsilon_t.
$$

Inserting the expression for the conditional mean gives:

$$
\begin{aligned}
x_t &= \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (X^0_{t-1} - \mu_2) + \varepsilon_t \\
x_t &= \mu_1 - \Sigma_{12} \Sigma_{22}^{-1} \mu_2 + \Sigma_{12} \Sigma_{22}^{-1} X^0_{t-1} + \varepsilon_t
\end{aligned}
$$

Using the notation $\Pi_0 = \mu_1 - \Sigma_{12} \Sigma_{22}^{-1} \mu_2$ and $[\Pi_1, \Pi_2, \dots, \Pi_{T-1}] = \Sigma_{12} \Sigma_{22}^{-1}$, and assuming that $\Pi_{k+1}, \Pi_{k+2}, \dots, \Pi_{T-1} = 0$, we arrive at the $k$'th order vector autoregressive model:

$$
x_t = \Pi_0 + \Pi_1 x_{t-1} + \dots + \Pi_k x_{t-k} + \varepsilon_t, \qquad t = 1, \dots, T \tag{3.8}
$$

where $\varepsilon_t$ is $Niid_p(0, \Omega)$ and $x_0, \dots, x_{-k+1}$ are assumed fixed. If the assumption that $X = [x_1, x_2, \dots, x_T]$ is multivariate normal $(\tilde{\mu}, \tilde{\Sigma})$ is correct, then it follows that (3.8):

• is linear in the parameters
• has constant parameters
• has normally distributed errors $\varepsilon_t$.

Note that the constancy of parameters depends on the constancy of the covariance matrices $\Sigma_{12}$ and $\Sigma_{22}$. If any of them change as a result of a reform or intervention during the sample, both the intercept, $\Pi_0$, and the 'slope coefficients' $\Pi_1, \dots, \Pi_k$ are likely to change.
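The mapping from covariances to VAR parameters can be verified in the simplest case. The following is a minimal sketch, assuming a stationary VAR(1) so that conditioning on $x_{t-1}$ alone is sufficient ($\Sigma_{12} = \Sigma_1$, $\Sigma_{22} = \Sigma_0$); the parameter values are hypothetical. It builds the implied $\Sigma_0$ and $\Sigma_1$ from chosen $\Pi_1$ and $\Omega$ and then recovers the parameters through (3.6)-(3.7).

```python
import numpy as np

# Hypothetical "true" VAR(1) parameters: x_t = Pi0 + Pi1 x_{t-1} + eps_t.
Pi1 = np.array([[0.5, 0.1],
                [0.2, 0.3]])
Omega = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
mu = np.array([1.0, -1.0])          # unconditional mean E[x_t]
p = Pi1.shape[0]

# Stationarity implies Sigma0 = Pi1 Sigma0 Pi1' + Omega, which is linear in
# vec(Sigma0): (I - Pi1 kron Pi1) vec(Sigma0) = vec(Omega).
vec_sigma0 = np.linalg.solve(np.eye(p * p) - np.kron(Pi1, Pi1),
                             Omega.reshape(-1))
Sigma0 = vec_sigma0.reshape(p, p)
Sigma1 = Pi1 @ Sigma0               # Cov(x_t, x_{t-1})

# Partition as in the text: Sigma11 = Sigma0, Sigma12 = Sigma1, Sigma22 = Sigma0.
Sigma11, Sigma12, Sigma22 = Sigma0, Sigma1, Sigma0
Sigma22_inv = np.linalg.inv(Sigma22)

Pi1_from_cov = Sigma12 @ Sigma22_inv                            # slope block
Omega_from_cov = Sigma11 - Sigma12 @ Sigma22_inv @ Sigma12.T    # (3.7)
Pi0_from_cov = mu - Pi1_from_cov @ mu                           # intercept
```

In this sketch a change in `Sigma0` or `Sigma1`, for instance after a reform, would change `Pi1_from_cov` and `Pi0_from_cov`, which is the parameter non-constancy the text warns about.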

3.5 Interpreting the VAR model

We have shown that the VAR model is essentially a reformulation of the covariances of the data. The question is whether it can be interpreted in terms of rational economic behavior and, if so, whether it could be used as a 'design of experiment' when data are collected by passive observation. The idea, drawing on Hendry and Richard (1983), is to interpret the conditional mean $\mu_t$ of the VAR model

$$
\mu_t = E_{t-1}(x_t \mid x_{t-1}, \dots, x_{t-k}) = \Pi_0 + \Pi_1 x_{t-1} + \dots + \Pi_k x_{t-k}, \tag{3.9}
$$

as describing agents' plans at time $t - 1$ given the available information $X^0_{t-1} = [x_{t-1}, \dots, x_{t-k}]$. According to the assumptions of the VAR model the difference between the mean and the actual realization is a white noise process

$$
x_t - \mu_t = \varepsilon_t, \qquad \varepsilon_t \sim Niid_p(0, \Omega). \tag{3.10}
$$

Thus, the $Niid(0, \Omega)$ assumption in (3.10) is consistent with economic agents who are rational in the sense that they do not make systematic forecast errors when they make plans for time $t$ based on the available information at time $t - 1$. For example, a VAR model with autocorrelated and/or heteroscedastic residuals would describe agents that do not use all information in the data as efficiently as possible. This is because they could do better by including the systematic variation left in the residuals, thereby improving their expectations about the future. Checking the assumptions of the model, i.e. checking the white noise requirement of the residuals, is not only crucial for correct statistical inference, but also for the economic interpretation of the model as a rough description of the behavior of rational agents. As an illustration Figure 3.5 shows the graphs of the (0,1) standardized residuals from the VAR(2) model based on the Danish data.
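The white noise requirement can be illustrated on simulated data. The following is a minimal sketch, not the procedure used for the Danish data: it simulates a hypothetical bivariate VAR(1), estimates an unrestricted VAR(2) by ordinary least squares, and compares the first-order residual autocorrelation with that of a deliberately misspecified model containing only a constant. The function names `fit_var` and `lag1_autocorr` and all parameter values are assumptions made for the example.

```python
import numpy as np

def fit_var(x, k):
    """OLS estimate of a VAR(k) with constant; returns (coefficients, residuals)."""
    T, p = x.shape
    Y = x[k:]                                           # x_t for t = k, ..., T-1
    lags = [x[k - i:T - i] for i in range(1, k + 1)]    # x_{t-1}, ..., x_{t-k}
    Z = np.hstack([np.ones((T - k, 1))] + lags)         # regressors incl. constant
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return B, Y - Z @ B

def lag1_autocorr(e):
    """First-order autocorrelation of each residual series (columns of e)."""
    e = e - e.mean(axis=0)
    return (e[1:] * e[:-1]).sum(axis=0) / (e * e).sum(axis=0)

# Simulate a hypothetical stationary VAR(1).
rng = np.random.default_rng(0)
T = 2000
Pi0 = np.array([0.6, -0.9])
Pi1 = np.array([[0.5, 0.1],
                [0.2, 0.3]])
chol = np.linalg.cholesky(np.array([[1.0, 0.3],
                                    [0.3, 2.0]]))
x = np.zeros((T, 2))
for t in range(1, T):
    x[t] = Pi0 + Pi1 @ x[t - 1] + chol @ rng.standard_normal(2)

# Well-specified model: residuals should be close to white noise.
_, resid_var2 = fit_var(x, 2)
standardized = (resid_var2 - resid_var2.mean(axis=0)) / resid_var2.std(axis=0)
r_var2 = lag1_autocorr(resid_var2)

# Misspecified model (constant only): systematic variation stays in the residuals.
_, resid_const = fit_var(x, 0)
r_const = lag1_autocorr(resid_const)
```

In the misspecified case the residuals are predictable from their own past, which corresponds to agents making systematic forecast errors; a formal analysis would use proper misspecification tests rather than this single statistic.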

[Figure 3.5. The graphs of the residuals from the VAR(2) model for real money stock, real income, inflation, deposit rate and bond rate. Panels: res_m3; res_y; Res_infl; res_deprate; Res_bondrate.]

The residuals do not look too bad, though a few outlier observations can be found, approximately corresponding to the interventions and reforms discussed above. Unfortunately, in many economic applications the multivariate normality assumption is seldom satisfied for the VAR in its simplest form (3.8). Since in general the statistical inference is valid only to the extent that the assumptions of the underlying methods are satisfied, this is potentially a serious problem. Therefore, we have to ask whether it is possible to modify the baseline VAR model (3.8) so that it preserves its attractiveness as a convenient description of the basic properties of the data, while at the same time yielding valid inference. Simulation studies have shown that valid statistical inference is sensitive to the validity of some of the assumptions, like parameter non-constancy, autocorrelated residuals (the higher, the worse!) and skewed residuals, while quite robust to others, like excess kurtosis and residual heteroscedasticity.

This will be discussed in more detail in Chapter 5. Whatever the case, direct or indirect testing of the assumptions is crucial for the success of the empirical application. As soon as we understand the reasons why the model fails to satisfy the assumptions it is often possible to modify the model, so that in the end we can start from a statistically "well-behaved" model. Important tools in this context are:

• the use of intervention dummies to account for significant political or institutional events during the sample,
• conditioning on weakly exogenous variables,
• checking the measurements of the chosen variables,
• changing the sample period to avoid fundamental regime shift or splitting the sample into more homogeneous periods.

How to use these tools will be further discussed in the subsequent chapters.

3.6 The dynamic properties of the VAR process

The dynamic properties of the process can be investigated by calculating the roots of the VAR process (3.8). It is convenient to formulate the VAR as a polynomial in the lag operator $L$, where $L^i x_t = x_{t-i}$:

$$
\begin{aligned}
(I - \Pi_1 L - \dots - \Pi_k L^k) x_t &= \Phi D_t + \varepsilon_t, \qquad t = 1, \dots, T, \\
\Pi(L) x_t &= \Phi D_t + \varepsilon_t,
\end{aligned}
\tag{3.11}
$$

where the model has been extended to contain $D_t$, a vector of deterministic components, such as a constant, seasonal dummies and intervention dummies. The autoregressive formulation is useful for expressing hypotheses on economic behavior, whereas the moving average representation is useful when examining the properties of the process. When the process is stationary, the latter representation can be found directly by inverting the VAR model so that $x_t$, $t = 1, \dots, T$, is expressed as a function of past and present shocks, $\varepsilon_{t-j}$, $j = 0, 1, \dots$, initial values $X_0$, and possible deterministic components $D_t$:

$$x_t = \Pi^{-1}(L)(\Phi D_t + \varepsilon_t) \qquad (3.12)$$
$$\phantom{x_t} = |\Pi(L)|^{-1}\,\Pi^a(L)(\Phi D_t + \varepsilon_t) \qquad (3.13)$$
$$\phantom{x_t} = (I + C_1 L + C_2 L^2 + \cdots)(\Phi D_t + \varepsilon_t), \qquad (3.14)$$

where $|\Pi(L)| = \det(\Pi(L))$ and $\Pi^a(L)$ is the adjoint (adjugate) matrix of $\Pi(L)$. Johansen (1995), Chapter 2 gives a recursive formula for $C_j = f(\Pi_1, \ldots, \Pi_k)$ when the VAR process is stationary. When the VAR process is nonstationary $x_t$ is non-invertible and the $C_j$ matrices have to be derived under the assumption of reduced rank. This case will be discussed in Chapter 5. We will first illustrate that the roots of $|\Pi(z)| = 0$ summarize important information about the dynamics and the stability of the process.

3.6.1 The roots of the characteristic function

To calculate the roots of the VAR process we consider first the characteristic polynomial:

$$\Pi(z) = I - \Pi_1 z - \cdots - \Pi_k z^k, \qquad (\Pi(z))^{-1} = |\Pi(z)|^{-1}\,\Pi^a(z).$$

We assume a stationary two-dimensional VAR(2) model:

$$(I - \Pi_1 L - \Pi_2 L^2)\,x_t = \Phi D_t + \varepsilon_t.$$

The characteristic function is:

$$\Pi(z) = I - \begin{bmatrix} \pi_{1.11} & \pi_{1.12} \\ \pi_{1.21} & \pi_{1.22} \end{bmatrix} z - \begin{bmatrix} \pi_{2.11} & \pi_{2.12} \\ \pi_{2.21} & \pi_{2.22} \end{bmatrix} z^2 = \begin{bmatrix} 1 - \pi_{1.11} z - \pi_{2.11} z^2 & -\pi_{1.12} z - \pi_{2.12} z^2 \\ -\pi_{1.21} z - \pi_{2.21} z^2 & 1 - \pi_{1.22} z - \pi_{2.22} z^2 \end{bmatrix}$$

and

$$|\Pi(z)| = (1 - \pi_{1.11} z - \pi_{2.11} z^2)(1 - \pi_{1.22} z - \pi_{2.22} z^2) - (\pi_{1.12} z + \pi_{2.12} z^2)(\pi_{1.21} z + \pi_{2.21} z^2)$$
$$\phantom{|\Pi(z)|} = 1 - a_1 z - a_2 z^2 - a_3 z^3 - a_4 z^4$$
$$\phantom{|\Pi(z)|} = (1 - \rho_1 z)(1 - \rho_2 z)(1 - \rho_3 z)(1 - \rho_4 z).$$

Thus, the determinant is a fourth order polynomial in $z$ which gives us four characteristic roots, $z_1 = 1/\rho_1, \ldots, z_4 = 1/\rho_4$, when solving for $|\Pi(z)| = 0$. The characteristic roots contain useful information about the dynamic behavior of the process. This can be illustrated using the simple two-dimensional VAR(2) model:

$$x_t = \frac{\Pi^a(L)(\Phi D_t + \varepsilon_t)}{(1 - \rho_1 L)(1 - \rho_2 L)(1 - \rho_3 L)(1 - \rho_4 L)} = \left(\frac{\Pi^a(L)}{(1 - \rho_2 L)(1 - \rho_3 L)(1 - \rho_4 L)}\right)\left(\frac{\varepsilon_t + \Phi D_t}{1 - \rho_1 L}\right), \qquad t = 1, \ldots, T.$$

As an example of the dynamic behavior of the process we expand the first root component

$$(1 - \rho_1 L)^{-1}(\varepsilon_t + \Phi D_t) = (1 + \rho_1 L + \rho_1^2 L^2 + \cdots)(\varepsilon_t + \Phi D_t).$$

Thus, each shock $\varepsilon_t$ will dynamically affect both present and future values of the variables in $x_t$. The persistence of the effect depends on the magnitude of $|\rho_1|$: the larger, the stronger the persistence. Note that this holds true also for any dummy variable included in $D_t$. It is noteworthy that already the simple two-dimensional VAR(2) model can generate a very rich dynamic pattern in the variables $x_t$ as a result of the multiplicity of the roots and the additional dynamics given by the $\Pi^a(L)$ matrix. A real root $\rho_j$ will generate exponentially declining behavior; a complex pair of roots $\rho_j = \rho_{real} \pm i\rho_{complex}$ will generate exponentially declining cyclical behavior. If a real root is lying on the unit circle ($\rho = 1$, $\rho = -1$) it will generate non-stationary behavior, i.e. a stochastic trend in $x_t$. If the modulus of a complex root is one it corresponds to nonstationary seasonal behavior. An example of the latter is the simple fourth order difference model for quarterly data:

$$(1 - L^4)\,x_t = (1 - L)(1 + L)(1 + L^2)\,x_t = \varepsilon_t.$$

We set the characteristic polynomial to zero


(1 − z)(1 + z)(1 + z 2 ) = 0 and ﬁnd the characteristic roots: z1 = 1, z2 = −1, z3 = −i, z4 = i.
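The four roots of the quarterly difference polynomial can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not part of the text.

```python
import numpy as np

# Coefficients of 1 - z^4, listed from the highest power of z to the constant:
# -1*z^4 + 0*z^3 + 0*z^2 + 0*z + 1
coeffs = [-1.0, 0.0, 0.0, 0.0, 1.0]

# Characteristic roots z_1, ..., z_4 of (1 - z)(1 + z)(1 + z^2) = 0
roots = np.roots(coeffs)

# All four roots should lie exactly on the unit circle (modulus one)
moduli = np.abs(roots)

print(np.round(roots, 6))
print(np.round(moduli, 6))
```

The two real roots correspond to the factors (1 − z) and (1 + z), and the complex pair to the factor (1 + z²).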

3.6.2 Calculating the roots using the companion matrix

We will here demonstrate that the roots of the process can be conveniently calculated by first reformulating the VAR(k) model into the 'companion AR(1) form' and then solving an eigenvalue problem. In this case the eigenvalue solution gives the roots directly as $\rho_1, \ldots, \rho_{p \times k}$ instead of the inverses $\rho_1^{-1}, \ldots, \rho_{p \times k}^{-1}$ obtained by solving the characteristic function. To distinguish between the two cases we call the roots of the characteristic function characteristic roots and the eigenvalues of the companion matrix eigenvalue roots. The latter are calculated by transforming the VAR(k) model into an AR(1) model based on the companion form. For simplicity we assume k = 2 when we illustrate the procedure. First we rewrite the VAR(2) model in the AR(1) form:

$$\begin{bmatrix} x_t \\ x_{t-1} \end{bmatrix} = \begin{bmatrix} \Pi_1 & \Pi_2 \\ I & 0 \end{bmatrix} \begin{bmatrix} x_{t-1} \\ x_{t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \end{bmatrix},$$

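or more compactly:

$$\tilde{x}_t = \tilde{\Pi}\,\tilde{x}_{t-1} + \tilde{\varepsilon}_t.$$

The roots of the matrix $\tilde{\Pi}$ can be found by solving the eigenvalue problem:

$$\rho V = \tilde{\Pi} V$$

where $V$ is a $kp \times 1$ vector, i.e.

$$\rho \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} \Pi_1 & \Pi_2 \\ I & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix},$$

i.e.

$$\rho v_1 = \Pi_1 v_1 + \Pi_2 v_2,$$
$$\rho v_2 = v_1.$$

The solution can be found from:

$$\rho v_1 = \Pi_1 v_1 + \Pi_2 (v_1/\rho),$$
$$v_1 = \Pi_1 (v_1/\rho) + \Pi_2 (v_1/\rho^2),$$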

i.e. the eigenvalues of $\tilde{\Pi}$ are the $pk$ roots of the second order matrix polynomial:

$$|I - \Pi_1 \rho^{-1} - \Pi_2 \rho^{-2}| = 0$$

or

$$|I - \Pi_1 z - \Pi_2 z^2| = 0,$$

where $z = \rho^{-1}$. Note that the roots of the companion matrix $\rho_i$ are the inverse of the roots of the characteristic polynomial. Thus, the solution to

$$|I - z\tilde{\Pi}| = 0$$

gives the stationary roots outside the unit circle, whereas the solution to

$$|\rho I - \tilde{\Pi}| = 0$$

gives stationary roots inside the unit circle. To summarize:

• if the roots of $|\Pi(z)|$ are all outside the unit circle (or alternatively if the eigenvalues of the companion matrix are all inside the unit circle) then $\{x_t\}$ is stationary,
• if at least one root is on the unit circle and the remaining roots are outside it (alternatively if at least one eigenvalue is on the unit circle and the remaining eigenvalues are inside it) then $\{x_t\}$ is nonstationary,
• if any of the roots are inside the unit circle (alternatively if any of the eigenvalues are outside the unit circle) then $\{x_t\}$ is explosive.
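The correspondence between eigenvalue roots and characteristic roots can be illustrated numerically. The sketch below assumes NumPy and uses made-up coefficient matrices `Pi1` and `Pi2` for a bivariate VAR(2); they are hypothetical values chosen only so that the process is stationary, not estimates from the Danish data.

```python
import numpy as np

p = 2  # dimension of x_t

# Hypothetical VAR(2) coefficient matrices (illustrative values only)
Pi1 = np.array([[0.5, 0.1],
                [0.2, 0.3]])
Pi2 = np.array([[0.2, 0.0],
                [0.1, 0.1]])

# Companion matrix [[Pi1, Pi2], [I, 0]] of the AR(1) form
companion = np.block([[Pi1, Pi2],
                      [np.eye(p), np.zeros((p, p))]])

# Eigenvalue roots rho_1, ..., rho_{pk}
eig_roots = np.linalg.eigvals(companion)


def char_det(z):
    """Determinant of the characteristic polynomial Pi(z) = I - Pi1 z - Pi2 z^2."""
    return np.linalg.det(np.eye(p) - Pi1 * z - Pi2 * z**2)


# Each z = 1/rho should be a root of |Pi(z)| = 0
residuals = np.array([abs(char_det(1.0 / rho)) for rho in eig_roots])

# Stationarity: all eigenvalue roots strictly inside the unit circle
is_stationary = bool(np.all(np.abs(eig_roots) < 1.0))

print(np.round(np.abs(eig_roots), 3), is_stationary)
```

Working with the companion matrix avoids expanding the determinant by hand, which becomes tedious as p and k grow.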

Table 3.1: The roots of the VAR(2) model

  real    complex   modulus
  0.99     0.00      0.99
  0.75     0.10      0.76
  0.75    -0.10      0.76
  0.69    -0.33      0.76
  0.69     0.33      0.76
 -0.29    -0.40      0.49
 -0.29     0.40      0.49
 -0.35     0.00      0.35
  0.11    -0.28      0.30
  0.11     0.28      0.30

3.6.3 Illustration

Table 3.1 reports the roots of the VAR(2) model for the Danish data and Figure 3.6 shows them in the unit circle.

There are two real roots: one is almost on the unit circle, the other is a negative root (which is likely to be the result of ∆pt being to some extent over-differenced). All remaining roots come in complex pairs.
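As a quick consistency check, the moduli in Table 3.1 can be recomputed from the reported real and complex parts. This is a small sketch assuming NumPy; the array simply transcribes the two-decimal values of the table, so agreement is only expected up to rounding.

```python
import numpy as np

# Roots of the VAR(2) model as reported in Table 3.1 (real part + i * complex part)
roots = np.array([
    0.99 + 0.00j,
    0.75 + 0.10j, 0.75 - 0.10j,
    0.69 - 0.33j, 0.69 + 0.33j,
    -0.29 - 0.40j, -0.29 + 0.40j,
    -0.35 + 0.00j,
    0.11 - 0.28j, 0.11 + 0.28j,
])

# Modulus of each root, to be compared with the last column of the table
moduli = np.abs(roots)

# Number of purely real roots and the largest modulus
n_real = int(np.sum(roots.imag == 0))
largest = moduli.max()

print(np.round(moduli, 2), n_real, largest)
```

All moduli are below one, with the largest close to the unit circle, in line with the discussion of the table.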

[Figure 3.6. The pk = 10 roots of the VAR(2) model for the Danish data. Panel title: The eigenvalues of the companion matrix; both axes run from -1.0 to 1.0.]

3.7 Concluding remarks

The aim of this chapter was to describe a “design of experiment” that may have generated data by passive observation for which the VAR model is an appropriate description. On an aggregated level, economic agents were assumed rational in the sense that they learn by past experience and adjust their behavior accordingly, so that their plans do not systematically deviate from actual realizations. Thus, the “design of experiment” consistent with the Niid assumption of the residuals relies on the assumption that agents make plans based on conditional expectations using the information set {xt−1 , Dt }, so that the residual (the unexpected component given the chosen information set) behaves as a normal innovation process. In this framework, the success of the empirical analysis relies crucially on the choice of a suﬃcient and relevant information set of an appropriate sample period and the skillfulness of the investigator to extract economically interesting results from this information. The purpose of the next chapter is to discuss estimation of the unrestricted VAR and some diagnostic tools which can be used when assessing the appropriateness of the chosen model.

Chapter 4 Estimation, Speciﬁcation and Tests in the Unrestricted VAR
The probability approach in econometrics requires an explicit probability formulation of the empirical model so that a fully specified statistical model can be derived and checked against the data. Assume that we have derived an estimator under the assumption of multivariate normality as demonstrated in the previous chapter. We then take the model to the data and obtain model estimates derived under this assumption. If the multivariate normality assumption is correct the residuals should not deviate significantly from the Niid assumption. If they do not pass the tests, for example because they are autocorrelated or heteroscedastic, or because the distribution is skewed or leptokurtic, then the estimators may no longer have optimal properties and cannot be considered full information maximum likelihood (FIML) estimators. The obtained parameter estimates (based on an incorrectly derived estimator) may not have any meaning and since we do not know their "true" properties inference is likely to be hazardous. However, some assumptions are more crucial for the properties of the estimates than others. Therefore, when reporting the various misspecification tests below we will discuss robustness properties against modest violations of the assumptions. Nevertheless, if we are going to claim that our conclusions are based on FIML inference, then we also have to demonstrate that our model is capable of mirroring the 'full information' of the data in a satisfactory way. Before being able to test the assumptions we need to estimate the model and Section 4.1 derives the ML estimator under the null of correct model

specification. Section 4.2 discusses different parametrizations of the unrestricted VAR model and illustrates the estimates based on the Danish data. Section 4.3 reports briefly some frequently used misspecification tests.

4.1 Likelihood based estimation in the unrestricted VAR

Under the assumption that the parameters $\Theta = \{\Pi_1, \Pi_2, \ldots, \Pi_k, \Omega\}$ in the VAR model (3.6) of Chapter 3 are unrestricted, it can be shown that the simple OLS estimator is identical to the FIML estimator. When the data contain unit roots we need to derive the likelihood estimator subject to reduced rank restrictions. Chapter 7 will give a detailed discussion of how to solve this problem.

For simplicity we assume $\Phi D_t = 0$. To simplify notation we rewrite (4.7) in compact form:

$$x_t = B' Z_t + \varepsilon_t, \qquad t = 1, \ldots, T, \qquad \varepsilon_t \sim N_p(0, \Omega) \qquad (4.1)$$

where $B' = [\Pi_1, \Pi_2, \ldots, \Pi_k]$, $Z_t' = [x_{t-1}', x_{t-2}', \ldots, x_{t-k}']$ and the initial values $X^0 = [x_0', x_{-1}', \ldots, x_{-k+1}']$ are given. We need to derive the equations for estimating $B$ and $\Omega$, which can be done by finding the expressions for $B$ and $\Omega$ for which the first order derivatives of the likelihood function are equal to zero. We consider first the log likelihood function

$$\ln L(B, \Omega; X) = -\frac{Tp}{2}\ln(2\pi) - \frac{T}{2}\ln|\Omega| - \frac{1}{2}\sum_{t=1}^{T}(x_t - B' Z_t)'\,\Omega^{-1}(x_t - B' Z_t),$$

and calculate $\partial \ln L/\partial B = 0$, which gives

$$\sum_{t=1}^{T} x_t Z_t' = \hat{B}' \sum_{t=1}^{T} Z_t Z_t',$$

so that the FIML estimator for $B$ is:

$$\hat{B}' = \sum_{t=1}^{T}(x_t Z_t')\Big(\sum_{t=1}^{T} Z_t Z_t'\Big)^{-1} = M_{xZ} M_{ZZ}^{-1}. \qquad (4.2)$$

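Next we calculate $\partial \ln L/\partial \Omega = 0$, which gives the estimator of $\Omega$:

$$\hat{\Omega} = T^{-1}\sum_{t=1}^{T}(x_t - \hat{B}' Z_t)(x_t - \hat{B}' Z_t)' = T^{-1}\sum_{t=1}^{T}\hat{\varepsilon}_t\hat{\varepsilon}_t'. \qquad (4.3)$$

The ML estimators (4.2) and (4.3) are identical to the corresponding OLS estimators. We can now find the maximal value of the (log) likelihood function for the ML estimates $\hat{B}$ and $\hat{\Omega}$:

$$\ln L_{max} = -\frac{Tp}{2}\ln(2\pi) - \frac{T}{2}\ln|\hat{\Omega}| - \frac{1}{2}\sum_{t=1}^{T}(x_t - \hat{B}' Z_t)'\,\hat{\Omega}^{-1}(x_t - \hat{B}' Z_t).$$

We will now show that $\ln L_{max} = -\frac{1}{2}T\ln|\hat{\Omega}| +$ constant terms. Consider first:

$$(x_t - \hat{B}' Z_t)'\,\hat{\Omega}^{-1}(x_t - \hat{B}' Z_t) = \hat{\varepsilon}_t'\,\hat{\Omega}^{-1}\hat{\varepsilon}_t = u_{1t}^2 + \cdots + u_{pt}^2,$$

where $u_{it}$ is the $i$th element of $u_t = \hat{\Omega}^{-1/2}\hat{\varepsilon}_t$. Let:

$$U = \begin{bmatrix} u_{1t}^2 & & & \\ & u_{2t}^2 & & \\ & & \ddots & \\ & & & u_{pt}^2 \end{bmatrix}.$$

It follows that $\mathrm{trace}(U) = u_{1t}^2 + \cdots + u_{pt}^2$. Using the rule $\mathrm{trace}(ABC) = \mathrm{trace}(CAB)$ we have that:

$$\mathrm{trace}\big[(x_t - \hat{B}' Z_t)'\,\hat{\Omega}^{-1}(x_t - \hat{B}' Z_t)\big] = \mathrm{trace}\big[(x_t - \hat{B}' Z_t)(x_t - \hat{B}' Z_t)'\,\hat{\Omega}^{-1}\big]$$

so that $\mathrm{trace}(\hat{\varepsilon}_t'\hat{\Omega}^{-1}\hat{\varepsilon}_t) = \mathrm{trace}(\hat{\varepsilon}_t\hat{\varepsilon}_t'\hat{\Omega}^{-1})$. Summing over $T$ we obtain:

$$\mathrm{trace}\Big\{T^{-1}\sum_{t=1}^{T}(x_t - \hat{B}' Z_t)(x_t - \hat{B}' Z_t)'\,\hat{\Omega}^{-1}\Big\} = \mathrm{trace}\Big\{T^{-1}\sum_{t=1}^{T}\hat{\varepsilon}_t\hat{\varepsilon}_t'\,\hat{\Omega}^{-1}\Big\} = \mathrm{trace}\{\hat{\Omega}\hat{\Omega}^{-1}\} = \mathrm{trace}\{I_p\} = p$$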

and

$$\ln L_{max} = -\frac{T}{2}\ln|\hat{\Omega}| - \frac{Tp}{2} - \frac{Tp}{2}\ln(2\pi) = -\frac{T}{2}\ln|\hat{\Omega}| + \text{constant terms},$$

i.e. apart from some constant terms, the maximum of the log likelihood function is proportional to the log determinant of the residual covariance matrix $\hat{\Omega}$. This result will be used in many of the test procedures discussed below and in the derivation of the maximum likelihood estimator for the cointegrated VAR model in Chapter 7.

To be able to test hypotheses on $B$ we have to find the distribution of the estimates $\hat{B}$. We will use the simple VAR(2) model to discuss the asymptotic distribution of $\hat{B}$ under the assumption of stationarity of the process $x_t$. Next, consider the estimation error of the VAR coefficients:

$$\hat{B}' - B' = [\hat{\Pi}_1, \hat{\Pi}_2] - [\Pi_1, \Pi_2]. \qquad (4.4)$$

First we denote the covariance matrices between $x_{t-1}$ and $x_{t-2}$

$$\Sigma = \begin{bmatrix} Var(x_{t-1}) & Cov(x_{t-1}, x_{t-2}) \\ Cov(x_{t-2}, x_{t-1}) & Var(x_{t-2}) \end{bmatrix} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}.$$

Under the stationarity assumption the distribution of (4.4) has the following asymptotic property:

$$T^{1/2}(\hat{B} - B) \xrightarrow{w} N(0, \Sigma^{-1} \otimes \Omega) \qquad (4.5)$$

where, writing $(\Sigma^{-1})_{ij}$ for the $(i,j)$ block of $\Sigma^{-1}$,

$$\Sigma^{-1} \otimes \Omega = \begin{bmatrix} (\Sigma^{-1})_{11} \otimes \Omega & (\Sigma^{-1})_{12} \otimes \Omega \\ (\Sigma^{-1})_{21} \otimes \Omega & (\Sigma^{-1})_{22} \otimes \Omega \end{bmatrix}.$$

To see how the distribution of the unrestricted VAR estimates relates to the corresponding results for the standard regression model we digress briefly to the latter.

Repetition:
**********************************
The distribution of the linear regression model estimate.

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim Niid(0, \sigma_\varepsilon^2)$$
$$\hat{\beta} = (X'X)^{-1}X'y$$
$$\hat{\beta} - \beta = (X'X)^{-1}X'\varepsilon$$
$$Var(\hat{\beta} - \beta) = \sigma_\varepsilon^2 (X'X)^{-1}$$
***********************************

Thus, the VAR results are similar to the linear regression model, except that the design matrix $X'X$ of the latter is replaced by the $2p \times 2p$ covariance matrix M. Note that the asymptotic distribution of the linear regression model coefficients is based on the assumption that the design matrix $T^{-1}X'X \xrightarrow{P} A$ where $A$ is a constant matrix. When the data have unit roots this is no longer the case and the design matrix when normalized differently will instead converge towards a matrix of Brownian motions. This will be discussed further in Chapter 8.

Assume now that we would like to test the significance of a single coefficient, for example the first element $\pi_{1.11}$ of $\Pi_1$. We define two 'design' vectors $\xi' = [1, 0, 0, \ldots, 0]$ and $\eta' = [1, 0, 0, \ldots, 0]$ where $\xi$ is $p \times 1$ and $\eta$ is $2p \times 1$, so that $\xi' B' \eta = \pi_{1.11}$. Using (4.5) we can find the test statistic for the null hypothesis $\pi_{1.11} = 0$, which has a Normal (0,1) distribution. This can be generalized to testing any coefficient in $B$ by appropriately choosing the vectors $\xi$ and $\eta$:

$$\frac{T^{1/2}\,\xi'\hat{B}'\eta}{(\xi'\Omega\xi\;\eta'\Sigma^{-1}\eta)^{1/2}} \xrightarrow{w} N(0, 1). \qquad (4.6)$$
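The estimators (4.2) and (4.3), the concentrated likelihood value and the single-coefficient statistic (4.6) can be illustrated on simulated data. The following is a minimal sketch, assuming NumPy; the data generating matrices `Pi1` and `Pi2`, the sample size and the seed are made-up illustrative choices, and Σ and Ω in (4.6) are replaced by their sample counterparts.

```python
import numpy as np

rng = np.random.default_rng(0)
p, T = 2, 500

# Hypothetical stationary VAR(2) used only to simulate data
Pi1 = np.array([[0.5, 0.1], [0.2, 0.3]])
Pi2 = np.array([[0.2, 0.0], [0.1, 0.1]])

x = np.zeros((T + 2, p))          # two initial values plus T observations
for t in range(2, T + 2):
    x[t] = Pi1 @ x[t - 1] + Pi2 @ x[t - 2] + rng.standard_normal(p)

X = x[2:]                          # rows are x_t'
Z = np.hstack([x[1:-1], x[:-2]])   # rows are Z_t' = [x_{t-1}', x_{t-2}']

# Product moment matrices and the estimator B_hat' = M_xZ M_ZZ^{-1}, cf. (4.2)
M_xZ = X.T @ Z / T
M_ZZ = Z.T @ Z / T
B_hat_T = M_xZ @ np.linalg.inv(M_ZZ)          # p x 2p, equals [Pi1_hat, Pi2_hat]

# Residuals and the ML estimator of Omega, cf. (4.3)
resid = X - Z @ B_hat_T.T
Omega_hat = resid.T @ resid / T

# Equation-by-equation OLS gives the same coefficients
B_ols, *_ = np.linalg.lstsq(Z, X, rcond=None)  # 2p x p

# trace{ Omega_hat^{-1} T^{-1} sum e_t e_t' } should equal p
trace_term = np.trace(np.linalg.inv(Omega_hat) @ (resid.T @ resid) / T)

# Maximized log likelihood: concentrated form versus direct evaluation
_, logdet = np.linalg.slogdet(Omega_hat)
lnL_max = -0.5 * T * p * np.log(2 * np.pi) - 0.5 * T * logdet - 0.5 * T * p
quad = np.einsum('ti,ij,tj->', resid, np.linalg.inv(Omega_hat), resid)
lnL_direct = -0.5 * T * p * np.log(2 * np.pi) - 0.5 * T * logdet - 0.5 * quad

# Statistic (4.6) for the first element of Pi1, with xi = e_1 and eta = e_1
t_ratio = np.sqrt(T) * B_hat_T[0, 0] / np.sqrt(
    Omega_hat[0, 0] * np.linalg.inv(M_ZZ)[0, 0])

print(np.round(B_hat_T, 2), round(float(lnL_max), 2), round(float(t_ratio), 2))
```

The sketch uses the divisor T rather than a degrees-of-freedom correction, which matches the ML estimator of Ω; regression packages often report a corrected version.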

.22 r  0.27 Rb.11 0. coeﬃcients with a ‘t’-ratio greater than 1.03 −0.23 −2.0017   ε1.05  0.11 2.t−1  r    mt−2 0.30   Rm.t ε5.52 −0.       mr t r yt ∆pt Rm.25  0.59 0.41 2.59 −7. Instead.0      .42 −1..24 0.00 0.02 1.0013  0.02 −0.0145   −0.24   yt−2       ∆pT −2  +   0. instead.64 CHAPTER 4.0 0.37 −0.01  0.32 1.22 r  −0. not be interpreted as Student’s t.0 Ω =   ˆ     −0. In this case the ‘t’-ratios are more likely to be distributed as the Dickey-Fuller’s ‘τ ’ and should. εt ∼ Np (0.05 0.09 −0.t−2   0.1.t    .0267   0.01 −0.09 −0. This implies that the ’design matrix’ normalized diﬀerently will no longer converge towards a constant matrix in the limit but..04 −0.15 −0.21 −0. σε =  0.22 1.   .01 −0.t  r  mt−1 0.0 0.07   0.01 −0. As discussed above these estimates are ML estimates as long as no restrictions have been imposed on the VAR model.9 have been given in bold face. To increase readability we have omitted standard errors of estimates and ‘t’ ratios.0 −0.96 −0.1 The estimates of the unrestricted VAR(2) for the Danish data The unrestricted VAR(2) model was estimated based on the following assumptions: xt = Π1 xt−1 + Π2 xt−2 + ΦDt + εt t = 1.19 ˆ 1.t−2     1.01 0. ESTIMATION AND SPECIFICATION 4. xt is not likely to be stationary.43   +    0.03 0.25   Rm.13 −0. towards a matrix of Brownian motions.38 0.01 −0.00 0.07 3.t Rb.04 −0. Ω) (4.03 1.1 are calculated in GiveWin by running an OLS regression equation by equation.t−1  −0. Since Chapter 3 found at least one characteristic root very close to the unit circle in the unrestricted VAR.03 −0.T.87 0.07 2.83 −0.29       −0.09 0. The estimates reported in Table 4. therefore.40 Rb.t ε4.01 0.20 −0.0169  1.13   yt−1         ∆pt−1  +  =  −0.09 −0.04 0.7) where Dt contains three centered seasonal dummies and a constant.25 −0.t ε3.t ε2.

5∆pt−1 + εt . R2 (LR) = 0. For example.19 > −51.3.59): · r ¸ r r mt−1 yt−1 ∆pt−1 Rm. which does not imply that we have explained all variation in the data but. THREE DIFFERENT ECM-REPRESENTATIONS 65 ¯ ¯ ¯ˆ¯ Log(Lmax ) = 1973.24.t−2 4. The tests are distributed as F(5. ∆pt . lagged diﬀerences. R2 (LM) = 0. based on the VAR(1) model we obtained ¯ ¯ˆ¯ log ¯Ω¯ = −50. that R2 is an incorrect measure when the variables are trending as they are in the present case.5pt−2 then ∆pt = −0.7 2.24.7) in terms of diﬀerences. implying that the variables are highly autoregressive.0 0. without changing the value of the likelihood function.e.t−2 Rb. An inspection of the estimated coeﬃcients reveals more signiﬁcant coeﬃcients at lag 1 than lag 2.64. has a negative autoregressive coeﬃcient. The log ¯ ¯ˆ¯ likelihood value and log ¯Ω¯ are only informative when compared to another model¯ speciﬁcation. 4. Finally we report F-tests on the signiﬁcance of single regressors. The so called vector (equilibrium)-error-correction model gives a convenient reformulation of (4.2 Three diﬀerent ECM-representations The unrestricted V AR model can be given diﬀerent parametrization without imposing any binding restrictions on the model parameters.8 10. Only the inﬂation rate.3.0. F-test on all regressors: F(50.3 22.5pt−1 + 0.5.2.9 We note that the second lag of inﬂation and deposit rate could altogether be omitted from the system.4. The bond rate seems to be quite important for most of the variables in this system. . The R2 (LR) is almost 1. and levels of the process.t−1 Rb.9 2.272) =32.1 4. if pt = 0. where R2 (LR) and R2 (LM) will be explained in Section 4. Most of the coeﬃcients with large t-ratios are on the diagonal.8 14. which is a result of the imposed diﬀerence operator1 .1.9998. instead. log ¯Ω¯ = −51.8 3. See also the discussion in Section 4. 
The residual correlations between equations are generally moderately sized and suggest that the current eﬀects are not ¯likely to be very important in this case.7 1. There are several advantages of this formulation: 1 For example.t−1 mr t−2 yt−2 ∆pt−2 Rm. i.2.

1. The multicollinearity effect which typically is strongly present in time-series data is significantly reduced in the error-correction form. Differences are much more 'orthogonal' than the levels of variables.

2. All information about long-run effects is summarized in the levels matrix $\Pi$ which can, therefore, be given special attention when solving the problem of cointegration.

3. The interpretation of the estimates is much more intuitive, as the coefficients can be naturally classified into short-run effects and long-run effects.

4. We are generally interested in understanding why inflation rate, say, changed from the previous to the present period. The ECM-formulation answers this question directly.

We will now discuss three different versions of the VAR(k) model represented in the general error-correction form

$$\Delta x_t = \Gamma_1^{(m)}\Delta x_{t-1} + \Gamma_2^{(m)}\Delta x_{t-2} + \cdots + \Gamma_{k-1}^{(m)}\Delta x_{t-k+1} + \Pi x_{t-m} + \Phi D_t + \varepsilon_t \qquad (4.8)$$

where $m$ is an integer value between 1 and $k$. Note that the value of the likelihood function does not change even if we change the value of $m$. For the Danish data we will assume the lag length k = 2 and report the unrestricted parameter estimates for that choice. Additionally the value of the log likelihood function, some multivariate $R^2$ measures, and F-tests of the significance of the regressors will be reported. The purpose is to illustrate how different the estimates can look although it is exactly the same model that has been estimated in all three cases.

4.2.1 The ECM formulation with m = 1

The VAR(2) model is specified as:

$$\Delta x_t = \Gamma_1^{(1)}\Delta x_{t-1} + \Pi x_{t-1} + \Phi D_t + \varepsilon_t \qquad (4.9)$$

where $\Pi = -(I - \Pi_1 - \Pi_2)$ and $\Gamma_1^{(1)} = -\Pi_2$. In (4.9) the lagged levels matrix $\Pi$ has been placed at time $t - 1$.
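That the error-correction form is only a reparameterization of the VAR in levels can be verified numerically. The sketch below assumes NumPy and uses randomly drawn, purely hypothetical matrices `Pi1` and `Pi2`. It takes Π = −(I − Π1 − Π2), the sign under which Πx enters the equation with a plus, together with Γ1 = −Π2 for m = 1 and Γ1 = −(I − Π1) for m = 2; the m = 2 expression follows from matching coefficients and is stated here as an assumption of the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 3
I = np.eye(p)

# Hypothetical VAR(2) coefficient matrices (illustrative values only)
Pi1 = 0.4 * rng.standard_normal((p, p))
Pi2 = 0.2 * rng.standard_normal((p, p))

# Error-correction parameters
Pi = -(I - Pi1 - Pi2)        # levels matrix, identical for every m
Gamma1_m1 = -Pi2             # short-run matrix when Pi is placed at t-1
Gamma1_m2 = -(I - Pi1)       # short-run matrix when Pi is placed at t-2

# Arbitrary lagged values of the process
x_lag1 = rng.standard_normal(p)
x_lag2 = rng.standard_normal(p)
dx_lag1 = x_lag1 - x_lag2

# Systematic part of x_t implied by each parameterization (no shocks, no dummies)
levels_pred = Pi1 @ x_lag1 + Pi2 @ x_lag2
ecm1_pred = x_lag1 + Gamma1_m1 @ dx_lag1 + Pi @ x_lag1
ecm2_pred = x_lag1 + Gamma1_m2 @ dx_lag1 + Pi @ x_lag2

print(np.allclose(levels_pred, ecm1_pred), np.allclose(levels_pred, ecm2_pred))
```

Because all three versions imply the same systematic part, they have identical residuals and hence the same likelihood value, while the short-run matrices differ with m.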

The estimated coefficients reported below show that most of the significant coefficients are now in the lagged levels matrix $\Pi$, whereas only four out of 25 coefficients in the $\Gamma_1^{(1)}$ matrix seem significant. Among the former, two are on the diagonal and the remaining two describe the big change in inverse velocity, as a result of a reallocation of money holdings when restrictions on capital movements were lifted in 1983, which coincided with a huge drop in the bond rate. In Chapter 5 we will re-estimate the VAR model accounting for this intervention. Altogether, estimating the model exclusively in differences, i.e. setting $\Pi x_{t-1} = 0$, would not deliver many interesting results. But including $\Pi x_{t-1}$ in the model raises the question of how to handle the nonstationarity problem. Since a stationary process cannot be equal to a nonstationary process, the estimation results can only make sense if $\Pi x_{t-1}$ defines stationary linear combinations of the variables. This can be seen by noting that the first row of $\Pi x_{t-1}$ can be reformulated as:

$$-0.20\,(m_{t-1}^r - 1.1\,y_{t-1}^r + 0.9\,\Delta p_{t-1} - 30.0\,R_{m,t-1} + 25.0\,R_{b,t-1}).$$

If the linear combination in the bracket defines a stationary variable then all parts of the first equation in the system would be stationary and, therefore, balanced. For example the above relation may be interpreted as the deviation of observed money holdings from a steady-state money demand relation, $m_{t-1}^r - m_{t-1}^{r*}$, where

$$m_t^{r*} = 1.1\,y_t^r - 0.9\,\Delta p_t + 30.0\,R_{m,t} - 25.0\,R_{b,t}.$$

We might now like to test whether real income has a unit coefficient, whether the coefficient to inflation is zero, and whether the interest rate coefficients are equal with opposite sign. This is, in a nutshell, what cointegration analysis does: it identifies stationary linear combinations between nonstationary variables so that an I(1) model can be reformulated exclusively in stationary variables. Our task is to give the stationary linear combinations an economically meaningful interpretation by imposing relevant identifying or over-identifying restrictions on the coefficients. How to do it will be discussed in Chapter 9.

01 0.7 1.35  1. demonstrating that from a likelihood point of view the models are identical.12 −1. For example.9 4.18 6. whereas the test values for the lagged variables in levels are identical.02 0. which depend on how we choose m.43   t−1     −0.t−1  r   mt−1 −0.1. F-test on all regressors: F(50. The trace correlation coeﬃcient will be described in Section 4. log ¯Ω¯ = −51.46 −0.0 4.15 0.05   Rm.10 −0.07 −3. distributed as F(5.00 r  0.22 t−1 r  0.59).40 ∆Rb.1 16.01 0.00 −0.14   +   −0.24   ∆yt−1       ∆p2  =  −0.9 3.00 0.96.41 −2. ¸ · r 2 r r ∆mr t−1 ∆yt−1 ∆pt−1 ∆Rm. all residual tests or information criteria are identical.01 −0.31 −2.00 −0.07 −2.3. R2 (LM) = 0. ESTIMATION AND SPECIFICATION       ∆mr t r ∆yt ∆p2 t ∆Rm.t−1 2.74 1.t−1  −0.00 0.10) . Because the residuals are identical in all the ECM representations.01 −0.t−1 ∆Rb.01  0.08 + 0.8 3. R2 (LR) = 0.29 −0.7 4.04 −0.t−1 Rb.2 The ECM formulation with m = 2 The VAR(2) model is now speciﬁed so that Π is placed at xt−2 : ∆xt = Γ1 ∆xt−1 + Πxt−2 + Φ Dt + εt (2) (4.25   0. trace correlation = 0. the single F-tests of the lagged variables in differences. This illustrates that the Π matrix is invariant to linear (m) transformations of the VAR system but not the Γ1 matrices. are very diﬀerent when compared to the two subsequent speciﬁcations.2.00 −0.07 −0.02 0.22 −0.t ∆Rb.01 0.t−1   4.11   yt−1      ∆pt−1  + ΦDt + εt  0.20 0.5   ∆mr −0.20 0.42.26 0.2.t−1 mt−1 yt−1 ∆pt−1 Rm.05   ∆Rm.4.26 0.83 0.01 −5.272) =5.t−1  −0.13 Rb.9 2.01 0.54.04 0.0 0.22 −1.t ¯ ¯ ¯ˆ¯ Log(Lmax ) = 1973.68 CHAPTER 4. ¯ ¯ ¯ˆ¯ Note that Log(Lmax ) and log ¯Ω¯ are exactly the same as for the unrestricted VAR in the previous section.2.03 0. whereas tests of the signiﬁcance of single variables need not be (and often are not).

9 3.01 −0. Thus.05 0.38 0.t−2 6.t ∆Rb.272) = 5.01 0.01 0.04 −1.01 0.13 0.0 4. Usually many more signiﬁcant coeﬃcients are obtained with formulation (4. but that the Π matrix is unchanged.22 −0.96.7 4. F-test on all regressors: F(50.9) describes ‘pure’ transitory eﬀects measured by the lagged changes of the variables.t−2 Rb. It illustrates that the interpretation of the estimated coeﬃcients in dynamic models is less straightforward than in static regression models.2.03 + 0. For example.7 5. the Π matrix remains (2) unchanged.9 1.9).01  0.2 4.t   ∆mr −0.t−1 ∆Rb. F-test on single regressors: · r 2 r r ∆mr t−1 ∆yt−1 ∆pt−1 ∆Rm.42 −1.31 −2.4. R2 (LR) = 0.40 0. THREE DIFFERENT ECM-REPRESENTATIONS (2) 69 with Π= I− Π1 − Π2 and Γ1 = (I− Π1 ).3 33.01 −0.03 0.74 1.59 −7.01 −5.35  1. many signiﬁcant coeﬃcients does not necessarily imply high explanatory power.00 −0. whereas Γ1 in (4.00 0.46 −0.27 ∆Rb. .29   t−1     −0. log ¯Ω¯ = −51.       ∆mr t r ∆yt ∆p2 t ∆Rm. ∆p2 is now highly signiﬁcant.1.29 −0.26 0.11 2.t−2   ¯ ¯ ¯ˆ¯ Log(Lmax ) = 1973.11   yt−2      ∆pt−2  + ΦDt + εt  0.23 −2.13   ∆yt−1         ∆p2  =  −0.13 Rb.t−1  r   mt−2 −0. but not the Γ1 matrix.30   ∆Rm.20 0.00 −0.t−1  −0.04 −0.12 −1. Thus. on the lagged variables in diﬀerences have changed.10 −0.t−2  −0.59).18 6.36 −0.02 0.10) than with (4. the estimated coeﬃcients and their p-values can vary considerably.00 r  0.05   Rm. R2 (LM) = 0.00 0.22 t−1 r  0. F(5.00 −0. We note that the ˜ ˆ number of ‘signiﬁcant’ coeﬃcients in Γ1 is larger than in Γ1 . but may as well be a consequence of the parameterization of the model.03 0. whereas t−1 it was completely insigniﬁcant in the case m = 1.5 ¸ We note that the single F-tests.11 −0.42.1 16.t−1 mt−2 yt−2 ∆pt−2 Rm. While the explanatory power is identical for the two model versions.9 4. The latter measures the cumulative long(1) run eﬀect.2.14   +   −0.52 −0.4.

..11   r   0. and Π = I− Π1 − Π2 .t  ∆mr −1. Π= I− Π1 − .01 −0.− Γk−1 = I− Π2 − 2 Π3 − .13 Rb.22 t−1 r  0.t−2 −0. the signiﬁcance of the diagonal elements are only a consequence of applying the diﬀerence operator once more to ∆xt .23 −2.03 0. + Γ1. Thus. but it is in general a convenient representation when the sample contains periods of rapid change.13 0.36 −0.10 −0. and levels: ∆2 xt = Γ1. A second look shows that the coeﬃcients are identical except that a constant factor of -1 has been added to the diagonal elements.i = Γi+2 + .02 0.05 0.11 2..11) where Γ = I− Γ1 − .04 −0.42 −1...00 −1...03 0.t−2      +       + ΦDt + εt   . it may be more meaningful to test whether the diagonal elements are signiﬁcantly diﬀerent from -1 (or from -2 for the inﬂation rate) than from zero..29   ∆2 pt−1     −0.74 1. and Γ1.− Πk as before. At ﬁrst sight the estimates of the Γ1 matrix reported below look very diﬀerent from the previous case. changes and levels Another convenient formulation of the VAR model is in second order diﬀerences (acceleration rates).t−1 −0.46 −0.3 ECM-representation in acceleration rates.00 −0..01 0. Chapter 15 will show that this formulation is particularly useful when xt contains I(2) variables..t ∆2 Rb.03 −0.01  0.04 −2.01 0. − (k − 1) Πk .73 ∆Rb.k−2 ∆2 xt−2 + Γ∆xt−1 + Πxt−2 + Φ Dt + εt (4. ESTIMATION AND SPECIFICATION 4.00 −0.22 −0.11 −1.26 0.01 −0. + k Πk .t−1  r  mt−2 −0.12 −1.14   ∆pt−2 +   −0.2.31 −2.1 ∆2 xt−1 + .13   ∆yt−1      =  −0.       ∆2 mr t r ∆2 yt ∆∆2 pt ∆2 Rm. The VAR(2) model for the Danish data now becomes: ∆2 xt = Γ∆xt−1 + Πxt−2 + Φ Dt + εt (4.52 −0.18 6.30   ∆Rm.70 CHAPTER 4.01 0.+ Γk = Π3 + 2 Π4 + .00 0.20 0.05   Rm..00 0.00   yt−2  0. Therefore.38 0.40 0.29 −0.12) where Γ = I− Γ1 . changes. so that acceleration rates (in addition to growth rates) become relevant determinants of agents’ behavior.01 −5..35 1.59 −7.

5 ¸ The F-tests on the signiﬁcance of the ﬁrst ﬁve regressors have now obtained very large values.5. and they do not really say much about how important the lagged t − 1 variables are for explaining the variation in xt .3 3.14) can now be found as: . log ¯Ω¯ = −51.272) =11.9 33.3 20.9 4. 4.7 4. R2 (LR) = 0.0 4.14) = I − (I + Γ(1) + Π)z − ( Γ(1) − Γ1 )z2 + Γ2 z3 . 1 2 The relation between the parameters of 4. THREE DIFFERENT ECM-REPRESENTATIONS ¯ ¯ ¯ˆ¯ Log(Lmax ) = 1973.1. is: ∆xt = Γ1 ∆xt−1 + Γ2 ∆xt−2 + Πxt−1 + Φ Dt + εt and the characteristic function: Π(z) = I − z − Γ1 (1 − z)z − Γ(1) (1 − z)z2 − Πz.13) and (4.4 The relationship between the diﬀerent VAR formulations We will now evaluate the above model formulations using the characteristic function based on the slightly more general VAR(3) model: xt = Π1 xt−1 + Π2 xt−2 + Π3 xt−3 + ΦDt + εt with the characteristic function: Π(z) = I − Π1 z − Π2 z2 − Π3 z3 . F-test on all regressors: F(50.t−1 mt−2 yt−2 ∆pt−2 Rm.2.13).13) (4. The ECM form of (4.2.999.65.t−2 34. 2 (1) (1) (1) (1) (1) (4. R2 (LM) = 0.4. which is just an artifact of the ∆2 transformation. with m = 1.9 18. F-tests on single regressors: · 71 r 2 r r ∆mr t−1 ∆yt−1 ∆pt−1 ∆Rm.1 16.5 24.2.t−1 ∆Rb.t−2 Rb.

$$\Gamma_1^{(1)} = -(\Pi_2 + \Pi_3), \qquad \Gamma_2^{(1)} = -\Pi_3, \qquad \Pi = -(I - \Pi_1 - \Pi_2 - \Pi_3).$$

The ECM form of (4.13), with m = 3, is:

$$\Delta x_t = \Gamma_1^{(3)}\Delta x_{t-1} + \Gamma_2^{(3)}\Delta x_{t-2} + \Pi x_{t-3} + \Phi D_t + \varepsilon_t$$

and the characteristic function:

$$\Pi(z) = I - z - \Gamma_1^{(3)}(1 - z)z - \Gamma_2^{(3)}(1 - z)z^2 - \Pi z^3 \qquad (4.15)$$
$$\phantom{\Pi(z)} = I - (I + \Gamma_1^{(3)})z - (\Gamma_2^{(3)} - \Gamma_1^{(3)})z^2 + (\Gamma_2^{(3)} - \Pi)z^3.$$

The relationship between (4.13) and (4.15) is:

$$\Gamma_1^{(3)} = -(I - \Pi_1), \qquad \Gamma_2^{(3)} = -(I - \Pi_1 - \Pi_2), \qquad \Pi = -(I - \Pi_1 - \Pi_2 - \Pi_3).$$

In both cases the $\Pi$ matrix is unchanged, but the $\Gamma_i^{(m)}$ depend on the chosen lag $m$ of $x_t$ in the model.

4.3 Misspecification tests

After the model has been estimated the multivariate normality assumption underlying the VAR model can (and should) be checked against the data using the residuals $\hat{\varepsilon}_t$. In the subsequent sections we will briefly discuss some of the test procedures and information criteria contained in CATS in RATS (Hansen and Juselius, 1995) and in GiveWin (Doornik and Hendry, 2002).

4.3.1 Specification checking

It is always useful to begin the specification checking with a graphical analysis. Quite often the graphs reveal specification problems that the tests fail to

discover. Figures 4.1-4.5 show the fitted and actual values of $\Delta x_{i,t}$ (panel a), the empirical distribution compared to the normal (panel b), the residuals (panel c), and the autocorrelogram of order 20 (panel d), $i = 1, \ldots, 5$. Figure 4.6 shows all the autocorrelations for the full system. The diagonal autocorrelograms are defined by $Corr(\varepsilon_{x_i t}, \varepsilon_{x_i t-h})$, $h = 1, \ldots, 18$, i.e. are the same as in Figures 4.1-4.5, panel d, whereas the off diagonal diagrams define the cross-autocorrelograms $Corr(\varepsilon_{x_i t}, \varepsilon_{x_j t-h})$, $i \neq j$.

[Figure 4.1. Graphs of residuals from the money stock equation. Panels: Actual and Fitted for DMO; Histogram of Standardized Residuals (with Normal); Standardized Residuals; Correlogram of residuals (horizontal axis: Lag).]

[Figure 4.2. Graphs of residuals from the income equation. Panels: Actual and Fitted for DFY; Histogram of Standardized Residuals (with Normal); Standardized Residuals; Correlogram of residuals (horizontal axis: Lag).]

[Figure 4.3. Graphs of residuals from the inflation equation. Panels: Actual and Fitted for DDIFPY; Histogram of Standardized Residuals (with Normal); Standardized Residuals; Correlogram of residuals (horizontal axis: Lag).]

[Figures 4.4 and 4.5. Graphs of residuals from the deposit rate equation (DIDE) and from the bond rate equation (DIBO). Panels in each figure: Actual and Fitted; Histogram of Standardized Residuals (with Normal); Standardized Residuals; Correlogram of residuals (by Lag).]

6. but not of current values.2 Residual correlations and information criteria The VAR model is often called a ‘reduced form’ model because it describes the variation in xt as a function of lagged values of the process. Because correlations (standardized covariances) are easier to interpret most software programs . ESTIMATION AND SPECIFICATION Cross. the multivariate Doornik-Hansen test for normality. a trace correlation statistic. even if the graphical analysis can be a powerful tool to detect problems in model speciﬁcation.76 CHAPTER 4. a multivariate LM test for ﬁrst and fourth order residual autocorrelation.and autocorrelograms of the full system. The following multivariate and univariate residual tests will be discussed and illustrated based on the Danish data: an LR test and three information criteria tests for the choice of lag length.3. a univariate ARCH test and a univariate Jarque-Bera normality test. 4. The graphs helps us to spot the big ‘value-added’ residual in the income equation and the big ‘deregulation’ residual in the bond rate equation. Because the VAR estimates are more sensitive to deviations from normality due to skewness than to excess kurtosis we also report these measures.and autocorrelograms of the residuals DMO DMO DFY DDIFPY DIDE DIBO DFY DDIFPY DIDE DIBO Lags 1 to 19 Figure 4. But. This means that all information about current eﬀects in the data is stored in the residual covariance matrix Ω. it cannot replace formal misspeciﬁcation tests. Cross.

(including CATS and PcFiml) report correlations instead of covariances. The correlation coefficients are calculated as follows:

$$\hat\rho_{ij} = \frac{\hat\sigma_{ij}}{\sqrt{\hat\sigma_{ii}\hat\sigma_{jj}}}. \tag{4.16}$$

When the correlation coefficients and the residual variances (or residual standard deviations) are given it is, thus, straightforward to derive the corresponding covariances. The estimated standardized residual covariance matrix for the Danish data is reported in Section 4.1. The residual standard errors, $\hat\sigma_i = \sqrt{\hat\sigma_{ii}}$, $i = 1, \ldots, p$, are reported below:

$\Delta m^r$   $\Delta y^r$   $\Delta^2 p$   $\Delta R_m$   $\Delta R_b$
0.0132   0.0153   0.0241   0.0012   0.0016

Note that the residual standard errors, multiplied by 100, can be interpreted as a % error in this case because the variables are in logarithmic changes.

When assessing the adequacy of the VAR specification we frequently make use of the maximal likelihood value given by:

$$-2/T \ln L_{\max} = \ln|\hat\Omega| + \text{constant terms}.$$

For example, when determining the truncation lag $k$ of the VAR model one can use the likelihood ratio test procedure

$$-2\ln Q(H_k/H_{k+1}) = T(\ln|\hat\Omega_k| - \ln|\hat\Omega_{k+1}|)$$

where $H_k$ is the null hypothesis that $k$ lags are sufficient and $H_{k+1}$ is the alternative hypothesis that the VAR model needs $k + 1$ lags. Because the LR test is testing a $p \times p$ matrix to be zero, the test statistic $-2\ln Q$ is asymptotically distributed as $\chi^2$ with $p^2$ degrees of freedom. The LR test of $k = 1$ versus $k = 2$ for the Danish data becomes:

$$-2\ln Q(H_1/H_2) = 77(-50.2 + 51.2) = 77,$$

which is distributed as $\chi^2(25)$ under the null of no significant coefficients at lag 2 in the VAR model. The $\chi^2_{.95}(25)$ is approximately 37.7 and the null is therefore rejected.
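The LR calculation can be checked with a few lines of Python. The sketch below is only an illustration: the function name `lr_lag_test` is ours and is not part of CATS or PcFiml, the two log-determinants are the rounded values quoted in the text, and the 95% quantile of $\chi^2(25)$ is hard-coded from a standard table instead of being computed.

```python
def lr_lag_test(T, logdet_k, logdet_k1, p):
    """LR test of H_k (k lags suffice) against H_{k+1}.

    T         : effective sample size
    logdet_k  : ln|Omega_hat| from the VAR(k)
    logdet_k1 : ln|Omega_hat| from the VAR(k+1)
    p         : dimension of the system
    Returns the statistic T(ln|Omega_k| - ln|Omega_{k+1}|) and its
    degrees of freedom p*p (one p x p coefficient matrix is tested).
    """
    stat = T * (logdet_k - logdet_k1)
    return stat, p * p


# Danish data as quoted in the text: T = 77, ln|Omega_1| = -50.2, ln|Omega_2| = -51.2
stat, df = lr_lag_test(T=77, logdet_k=-50.2, logdet_k1=-51.2, p=5)

CRIT_95_CHI2_25 = 37.65  # tabulated 95% quantile of chi-square(25), hard-coded
reject = stat > CRIT_95_CHI2_25
print(f"-2lnQ = {stat:.1f}, df = {df}, reject k = 1 at 5%: {reject}")
```

With real output from an estimation program one would pass the unrounded log-determinants; rounding to one decimal, as here, can move the statistic by several units.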

There are various other test procedures for the determination of the lag length and we will briefly discuss three of them, the Akaike, the Schwarz and the Hannan-Quinn information criteria. They are defined by:

$$AIC = \ln|\hat\Omega| + (p^2 k)\frac{2}{T} \tag{4.17}$$

$$SC = \ln|\hat\Omega| + (p^2 k)\frac{\ln T}{T} \tag{4.18}$$

$$HQ = \ln|\hat\Omega| + (p^2 k)\frac{2 \ln \ln T}{T}. \tag{4.19}$$

All of them are based on the maximal value of the likelihood function with an additional penalizing factor related to the number of estimated parameters. The suggested criteria differ regarding the strength of the penalty associated with the increase in model parameters as a result of adding more lags. The idea is to calculate the test criterion for different values of $k$ and then choose the value of $k$ that corresponds to the smallest value.

When using these criteria for the choice of lag length it is important to remember that they are valid under the assumption of a correctly specified model. If there are other problems with the model, such as regime shifts and non-constant parameters, then these should be accounted for prior to choosing the lag length.

Section 4.1 showed that many of the coefficients at lag two were insignificant in the Danish data and we may consult the information criteria to find out whether a VAR(1) would be appropriate. We report the Schwarz and Hannan-Quinn criteria for lag 1, 2, and 3 below:

                 k = 1     k = 2     k = 3
Schwarz         -47.68    -47.30    -46.53
Hannan-Quinn    -48.50    -48.58    -48.28

The SC criterion suggests $k = 1$ and the H-Q suggests $k = 2$. Because the suggested information criteria are based on different penalties of estimated coefficients they need not produce the same answer and often do not. Checking the other misspecification tests for $k = 1$ showed that all of them got much worse as compared to $k = 2$. At this stage we continue with the VAR(2) model. However, the graphical analysis of the previous chapter suggested the possibility of a regime shift in the model which has not yet been tested for. Before this is done the specification tests remain tentative.
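The three criteria differ only in the penalty attached to the $p^2 k$ lag coefficients, which the following Python sketch makes explicit. It is a minimal illustration under stated assumptions: the function name is ours, only the $p^2 k$ lag coefficients are counted as in (4.17)-(4.19), and the log-determinants are the rounded values $-50.2$ and $-51.2$ quoted for the LR test. The resulting levels are therefore not expected to coincide with the values printed by CATS or PcFiml, which may count further estimated parameters.

```python
import math


def information_criteria(logdet, p, k, T):
    """Return (AIC, SC, HQ) following (4.17)-(4.19).

    logdet : ln|Omega_hat| of the VAR(k)
    p, k   : dimension and lag length, so p*p*k lag coefficients are penalized
    T      : effective sample size
    """
    n_lag_params = p * p * k
    aic = logdet + n_lag_params * 2.0 / T
    sc = logdet + n_lag_params * math.log(T) / T
    hq = logdet + n_lag_params * 2.0 * math.log(math.log(T)) / T
    return aic, sc, hq


T, p = 77, 5
logdets = {1: -50.2, 2: -51.2}  # rounded ln|Omega_hat| values quoted in the text

results = {k: information_criteria(ld, p, k, T) for k, ld in logdets.items()}
for k, (aic, sc, hq) in results.items():
    print(f"k = {k}: AIC = {aic:.2f}, SC = {sc:.2f}, HQ = {hq:.2f}")

# The lag length chosen by each criterion is the one with the smallest value.
best_sc = min(results, key=lambda k: results[k][1])
best_hq = min(results, key=lambda k: results[k][2])
print("SC chooses k =", best_sc, "and HQ chooses k =", best_hq)
```

Since $\ln T > 2 \ln \ln T > 2$ for $T = 77$, the Schwarz penalty per parameter is the heaviest and the Akaike penalty the lightest, which is why SC tends to select shorter lag lengths than HQ.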

In the VAR model we can calculate an overall measure of goodness of fit, which is similar to the conventional $R^2$ in the linear regression model. In CATS this is called 'trace correlation':

$$\text{Trace correlation} = 1 - \mathrm{trace}(\hat\Omega(V(\Delta x_t))^{-1})/p,$$

where $V(\Delta x_t)$ is the covariance matrix of $\Delta x_t$. In CATS it is calculated using the Rats instruction VCV. It can be roughly interpreted as an average $R^2$ of all the VAR equations. For the Danish data it is 0.54. PcFiml calculates two alternative measures called $R^2(LR)$ and $R^2(LM)$ in the VAR model which are defined in the PcFiml manual.

Finally, $R^2$ for each equation is calculated as $R_i^2 = 1 - \hat\Omega_{ii}/Var(\Delta x_t)_{ii}$, $i = 1, \ldots, p$. For the Danish data the estimated $R_i^2$ for the models in ECM form is:

$\Delta m^r$   $\Delta y^r$   $\Delta^2 p$   $\Delta R_m$   $\Delta R_b$
0.75   0.75   0.49   0.36   0.44

Note that the $R^2$ values are completely misleading when calculated for the unrestricted VAR(2) in Section 4.1 because the dependent variable in this case is a nonstationary, trending variable. The $R^2$ compares the model's ability to explain the variation in the dependent variable as compared to the baseline of a constant mean, i.e. $R^2 = 1 - \Sigma\hat\varepsilon_i^2/\Sigma(x_i - \bar{x})^2$. When $x_{i,t}$ is a nonstationary variable, the baseline hypothesis of a constant mean is no longer appropriate. Essentially any variable will do better in this respect and the random walk hypothesis should replace the constant mean as the baseline hypothesis.

4.3.3 Tests of residual autocorrelation

The Ljung-Box test of residual autocorrelation is given by:

$$\text{Ljung-Box} = T(T + 2)\sum_{h=1}^{[T/4]} (T - h)^{-1}\,\mathrm{trace}(\hat\Omega_h' \hat\Omega^{-1} \hat\Omega_h \hat\Omega^{-1}) \tag{4.20}$$

where $\hat\Omega_h = T^{-1}\sum_{t=h}^{T} \hat\varepsilon_t \hat\varepsilon_{t-h}'$ and the residuals are from the estimated VAR model. The Ljung-Box test is considered to be approximately distributed as $\chi^2$ with $p^2([T/4] - k + 1) - p^2$ degrees of freedom. (See Ljung & Box (1978)

and Hosking (1980)). For the Danish data the test becomes $\chi^2(425) = 438.4$ (p-value = 0.31).

The LM-tests for first and fourth order autocorrelation are calculated using an auxiliary regression as proposed in Godfrey (1988, Chapter 5). The covariance matrix of the residuals in the auxiliary model is calculated as:

$$T\tilde\Omega(j) = \hat\varepsilon'\hat\varepsilon - \hat\varepsilon' \begin{pmatrix} \hat\varepsilon'_{\text{Lag } j} \\ x' \end{pmatrix}' \left[ \begin{pmatrix} \hat\varepsilon'_{\text{Lag } j} \\ x' \end{pmatrix} \begin{pmatrix} \hat\varepsilon'_{\text{Lag } j} \\ x' \end{pmatrix}' \right]^{-1} \begin{pmatrix} \hat\varepsilon'_{\text{Lag } j} \\ x' \end{pmatrix} \hat\varepsilon,$$

where $x' = [x_1, x_2, \ldots, x_T]$, and $\hat\varepsilon'_{\text{Lag } j} = [\hat\varepsilon_{-j}, \hat\varepsilon_{-j+1}, \ldots, \hat\varepsilon_{T-j}]$. The first $j$ missing values $\hat\varepsilon_{-j}, \ldots, \hat\varepsilon_{-1}$ are set equal to 0. The LM-test is calculated as Wilks' ratio test with a small-sample correction,

$$LM(j) = -\left(T - p(k + 1) - \frac{1}{2}\right) \ln\left(\frac{|\tilde\Omega(j)|}{|\hat\Omega|}\right). \tag{4.21}$$

The test is asymptotically $\chi^2$-distributed with $p^2$ degrees of freedom. (See Anderson (1984, section 8.5.2) or Rao (1973, section 8c.5)). The test statistic for the Danish data became $LM(1): \chi^2(25) = 17.8$ (p-value = 0.86) and $LM(4): \chi^2(25) = 40.2$ (p-value = 0.03).

This is a fairly important test, partly because the whole VAR philosophy is based on the idea of decomposing the variation in the data into a systematic part describing all the dynamics and an unsystematic random part. If the test suggests that there are significant autocorrelations left in the model, agents' plans based on the conditional VAR expectations would have deviated systematically from actual realizations. The properties of the estimators are also sensitive to significant autocorrelations, the bigger the worse.

4.3.4 Tests of residual heteroscedasticity

In CATS the number of lags in the test for ARCH is equal to the number of lags in the model. The auxiliary regression is

$$\hat\varepsilon_{it}^2 = \gamma_0 + \sum_{j=1}^{k} \gamma_j \hat\varepsilon_{i,t-j}^2 + \text{error},$$

and the test statistic is calculated as $(T - k) \times R^2$, where $R^2$ is from the auxiliary regression.

The residuals from each equation are tested individually for ARCH effects. For the Danish data the tests became:

[Table of ARCH test statistics for the equations $\Delta m^r$, $\Delta y^r$, $\Delta^2 p$, $\Delta R_m$, $\Delta R_b$.]

Only the residuals from the deposit rate equation were borderline significant. However, simulation studies have demonstrated that the (cointegrated) VAR estimates are robust against moderate ARCH effects in the residuals.

4.3.5 Normality tests

The test for normality used in CATS is based on Doornik & Hansen (1994), and the univariate tests are based on the skewness and kurtosis estimates of the residuals with small sample corrections as proposed in Shenton & Bowman (1977), and a modification of that test as proposed in Doornik & Hansen (1994). Because the test is not very well known we explain the calculations in more detail.

The multivariate test for normality is the sum of $p$ univariate tests based on 'system residuals', where 'system residuals' are defined as

$$u_t = V\Lambda^{-1/2}V'\,\mathrm{diag}(\hat\sigma_{ii}^{-1/2})(\hat\varepsilon_t - \bar\varepsilon), \tag{4.22}$$

where $\Lambda$ is a diagonal matrix of eigenvalues of the correlation matrix of the (original) residuals and $V$ are the eigenvectors, so that (4.22) is a principal components decomposition of the standardized residuals. The 'system residuals', which are invariant to affine transformations of each variable and reordering of the system, are uncorrelated by construction and thereby independent under the assumption of normality.

Testing for normality is done by testing each of the system residual series. The skewness and kurtosis of each series is calculated as,

$$\text{skewness}_i = \sqrt{b_{1i}} = T^{-1}\sum_{t=1}^{T} \hat{u}_{it}^3, \tag{4.23}$$

$$\text{kurtosis}_i = b_{2i} = T^{-1}\sum_{t=1}^{T} \hat{u}_{it}^4, \qquad i = 1, \ldots, p. \tag{4.24}$$

With the small sample approximations, we obtain $2p$ independent standard normal variables, $\tau_{1i}$ and $\tau_{2i}$, which are squared and summed to a multivariate omnibus test for normality, so that

$$\text{Normality test} = \sum_{i=1}^{p} \left(\tau_{1i}^2 + \tau_{2i}^2\right) \tag{4.25}$$

is approximately $\chi^2$-distributed with $2p$ degrees of freedom. For the Danish data the test became $\chi^2(10) = 18.7$ (p-value = 0.04). Thus, normality is borderline rejected.

To be able to find out where in the system the non-normality is most pronounced we also need to calculate the univariate Jarque-Bera test for normality, distributed as $\chi^2(2)$:

$$\text{Jarq.Bera} = T(\text{skewness})^2/6 + T(\text{kurtosis} - 3)^2/24 \overset{a}{\sim} \chi^2(2)$$

where skewness and excess kurtosis are defined below:

$$\text{mean}(\hat\varepsilon_{i,t}) = T^{-1}\sum_{t=1}^{T} \hat\varepsilon_{i,t} = 0 \tag{4.26}$$

$$\text{std.dev}(\hat\varepsilon_{i,t}) = \hat\sigma_i \tag{4.27}$$

$$\text{skewness}(\hat\varepsilon_{i,t}) = T^{-1}\sum_{t=1}^{T} \left(\frac{\hat\varepsilon_{i,t}}{\hat\sigma_i}\right)^3 \tag{4.28}$$

$$\text{kurtosis}(\hat\varepsilon_{i,t}) = T^{-1}\sum_{t=1}^{T} \left(\frac{\hat\varepsilon_{i,t}}{\hat\sigma_i}\right)^4 \tag{4.29}$$

The Jarque-Bera test of normality is based on the null hypothesis of normally distributed errors, under which the following applies:

$$\sqrt{T}\,\text{skewness} \overset{a}{\sim} N(0, 6) \quad \text{and} \quad \sqrt{T}\,(\text{kurtosis} - 3) \overset{a}{\sim} N(0, 24).$$

Thus, the variance of skewness is smaller than the variance of kurtosis, which means that the normality test is more easily rejected when the empirical distribution is skewed (often because of outliers in the VAR model) than when it is leptokurtic (thick tails or too many small residuals close to the mean). Note also that the Jarque-Bera test is $\chi^2$-distributed only asymptotically. This means that the small sample behavior may not follow a $\chi^2$ distribution very closely.

For the Danish data the test results are reported in Table 4.1: We find that the normality is rejected primarily because of non-normality in the two interest rate equations. This is due to the big 'deregulation' outlier in 1983 for the bond rate and excess kurtosis for both interest rates.
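The univariate Jarque-Bera calculation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the CATS implementation: the function name is ours, the residual mean is subtracted explicitly (for VAR residuals with a constant it is zero by construction), and $\hat\sigma_i$ is computed with divisor $T$, whereas a program may use a degrees-of-freedom correction or small-sample adjustments. The residual series are made up.

```python
import math


def jarque_bera(resid):
    """Return (JB, skewness, kurtosis) for one residual series.

    JB = T*skewness^2/6 + T*(kurtosis - 3)^2/24, asymptotically chi-square(2)
    under normality.
    """
    T = len(resid)
    mean = sum(resid) / T
    dev = [e - mean for e in resid]
    sigma = math.sqrt(sum(d * d for d in dev) / T)   # divisor T (assumption)
    skew = sum((d / sigma) ** 3 for d in dev) / T
    kurt = sum((d / sigma) ** 4 for d in dev) / T
    jb = T * skew ** 2 / 6.0 + T * (kurt - 3.0) ** 2 / 24.0
    return jb, skew, kurt


# A symmetric, thin-tailed hypothetical series: zero skewness, kurtosis below 3
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0] * 4
jb_sym, skew_sym, kurt_sym = jarque_bera(symmetric)

# The same series with one large positive outlier added
with_outlier = symmetric + [8.0]
jb_out, skew_out, kurt_out = jarque_bera(with_outlier)

print(f"symmetric:    JB = {jb_sym:.2f}, skewness = {skew_sym:.2f}, kurtosis = {kurt_sym:.2f}")
print(f"with outlier: JB = {jb_out:.2f}, skewness = {skew_out:.2f}, kurtosis = {kurt_out:.2f}")
```

A single outlier moves both the skewness and the kurtosis, which is consistent with the remark above that outliers in the VAR model are a frequent source of rejection.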

[Table 4.1: Specification tests for the unrestricted VAR(2) model. Univariate normality tests for $\Delta m^r$, $\Delta y^r$, $\Delta p$, $\Delta R_m$, $\Delta R_b$; rows: Jarq.Bera(2), Skewness, Kurtosis.]

Altogether the formal misspecification tests have confirmed the findings based on the graphical inspection: multivariate normality is borderline rejected due to non-normality in the two interest rates, the deposit rate exhibits moderate ARCH effects, and there is some seasonal autocorrelation left in the residuals.

4.4 Concluding remarks

All parameters of the VAR models (4.7)-(4.11) of Chapter 3 were unrestricted and we demonstrated that OLS equation by equation produced ML estimates. This is consistent with the discussion in Chapter 3 which showed that the VAR model is essentially just a convenient reformulation of the covariances of the data. In this chapter the estimates of these models showed that the unrestricted VAR models are heavily over-parametrized: some of the coefficients are very small and statistically insignificant. By imposing statistically acceptable restrictions on the VAR model we hope to uncover meaningful economic models with interpretable coefficients. Such restrictions are, for example, reduced rank restrictions, zero parameter restrictions and other linear or nonlinear parameter restrictions. This will be the focus of the remaining chapters of this book.

The generality of the VAR formulation has a cost: adding one variable to a $p$-dimensional VAR($k$) system introduces $(2p + 1) \times k$ new parameters. When the sample is small, typically 50-100 in quarterly macroeconomic models, adding more variables can easily become prohibitive. However, in some cases there can be a trade-off between the number of variables in the system, $p$, and the number of lags, $k$, needed to obtain uncorrelated residuals. Since an extra lag corresponds to $p \times p$ additional parameters one can in some cases reduce the total number of VAR parameters by adding a relevant variable to the model.

Even if the VAR(2) model seemed to provide a fair description of the

information in the data, the tests and the graphs suggested some scope for improvements. In Chapter 6 we will discuss the important role of deterministic components as a means to improve model specification.
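The parameter counts behind the trade-off between variables and lags discussed in the concluding remarks are simple to verify. The Python sketch below counts only the $p^2 k$ lag coefficients, as in the text; the function names are ours, and deterministic terms and covariance parameters are left out by assumption.

```python
def lag_parameters(p, k):
    """Number of lag coefficients in a p-dimensional VAR(k): p*p*k."""
    return p * p * k


def extra_from_variable(p, k):
    """Lag coefficients added by going from p to p + 1 variables: (2p + 1)k."""
    return lag_parameters(p + 1, k) - lag_parameters(p, k)


def extra_from_lag(p, k):
    """Lag coefficients added by going from k to k + 1 lags: p*p."""
    return lag_parameters(p, k + 1) - lag_parameters(p, k)


p, k = 5, 2
print("VAR(2), p = 5:", lag_parameters(p, k), "lag coefficients")
print("one more variable adds", extra_from_variable(p, k))
print("one more lag adds", extra_from_lag(p, k))

# Trade-off: a six-variable VAR(2) versus a five-variable VAR(3)
print("p = 6, k = 2:", lag_parameters(6, 2), " versus  p = 5, k = 3:", lag_parameters(5, 3))
```

In this illustration the six-variable VAR(2) has fewer lag coefficients than the five-variable VAR(3), so if one relevant additional variable removes the need for a third lag the larger system is the more parsimonious one.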

Chapter 5 The Cointegrated VAR Model

The purpose of this chapter is to introduce the nonstationary VAR model and show that the nonstationarity can be accounted for by a reduced rank condition on the long-run matrix $\Pi = \alpha\beta'$. Section 5.1 defines the concepts of integration and cointegration and Section 5.2 gives an intuitive interpretation of the reduced rank, $r$, as the number of stationary long-run relations based on the unrestricted VAR estimates of the Danish data. Section 5.3 discusses the interpretation of the number of unit roots, $p - r$, as the number of common driving trends and shows how they can be related to the VAR model by inverting the AR lag polynomial. Based on the simple VAR(1) model, Section 5.4 demonstrates how the parameters of the MA representation are related to the AR parameters. Finally, Section 5.5 concludes the chapter with a discussion of the cointegrated VAR model as a general framework within which one can describe economic behavior in terms of pulling forces towards equilibrium, generating stationary behavior, and pushing forces away from equilibrium, generating nonstationary behavior.

5.1 Defining integration and cointegration

We will now show that the presence of unit roots in the unrestricted VAR model corresponds to nonstationary stochastic behavior which can be accounted for by a reduced rank ($r < p$) restriction of the long-run levels matrix $\Pi = \alpha\beta'$. Johansen (1996), Chapter 3, provides a mathematically precise definition of the order of integration and cointegration. Here we only reproduce the basic definitions:

Definition 1 $x_t$ is integrated of order $d$ if $x_t$ has the representation $(1 - L)^d x_t = C(L)\varepsilon_t$, where $C(1) \neq 0$, and $\varepsilon_t \sim Niid(0, \sigma^2)$.

Definition 2 The $I(d)$ process $x_t$ is called cointegrated $CI(d, b)$ with cointegrating vector $\beta \neq 0$ if $\beta' x_t$ is $I(d - b)$, $b = 1, \ldots, d$, $d = 1, \ldots$.

Cointegration implies that certain linear combinations of the variables of the vector process are integrated of lower order than the process itself. Thus, if the non-stationarity of one variable corresponds to the non-stationarity of another variable, then there exists a linear combination between them that becomes stationary. Another way of expressing this is that when two or several variables have common stochastic (and deterministic) trends, they will show a tendency to move together in the long-run. Such cointegrated relations, $\beta' x_t$, can often be interpreted as long-run economic steady-state relations and are, therefore, of considerable economic interest. As already discussed informally in Chapter 2, cointegrated variables are driven by the same persistent shocks. In the next section we will give an intuitive account of such relations and how one can find them in the long-run matrix $\Pi$.

5.2 An intuitive interpretation of $\Pi = \alpha\beta'$

Within the VAR model the cointegration hypothesis can be formulated as a reduced rank restriction on the $\Pi$ matrix defined in the previous chapter. Below we reproduce the VAR(2) model in ECM form with $m = 1$:

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \Pi x_{t-1} + \mu + \varepsilon_t \tag{5.1}$$

and give the estimate of the unrestricted $\Pi$ matrix for the Danish data (keeping in mind that it was invariant to the choice of model formulation):

[Estimated unrestricted $\Pi$ matrix for the Danish data: an array of coefficients on $m^r_{t-1}$, $y^r_{t-1}$, $\Delta p_{t-1}$, $R_{m,t-1}$, $R_{b,t-1}$ and the constant, one row for each of the five equations.]

Coefficients with a significant $|t\text{-ratio}|$ are given in bold face.

If $x_t \sim I(1)$, then $\Delta x_t \sim I(0)$ implying that $\Pi$ cannot have full rank as this would lead to a logical inconsistency in (5.1). This can be seen by considering $\Pi = I$ as a simple full rank matrix. In this case each equation would define a stationary variable $\Delta x_t$ to be equal to a nonstationary variable, $x_{t-1}$, plus some lagged stationary variables $\Gamma_1 \Delta x_{t-1}$ and a stationary error term. Hence, either $\Pi = 0$, or it must have reduced rank:

$$\Pi = \alpha\beta'$$

where $\alpha$ is a $p \times r$ matrix and $\beta$ is a $p \times r$ matrix, $r \leq p$. Thus, under the $I(1)$ hypothesis the cointegrated VAR model is given by:

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \ldots + \Gamma_{k-1} \Delta x_{t-k+1} + \alpha\beta' x_{t-1} + \mu + \varepsilon_t \tag{5.2}$$

where $\beta' x_{t-1}$ is an $r \times 1$ vector of stationary cointegration relations. Under the hypothesis that $x_t \sim I(1)$ all stochastic components are stationary in model (5.2) and the system is now logically consistent.

In Chapter 4, Section 3, we showed that the first row of $\Pi$ was a linear combination of the five variables which could be given a tentative interpretation as a (stationary) deviation from a long-run money demand relation, i.e. as an equilibrium error. We will now examine the implications of this for the full VAR model. Using roughly the estimated coefficients of the Danish data, $\alpha_{11} = -0.2$ and $\beta_1' = [1, -1, 0.9, -25, 25, 0]$, we can approximately reproduce the first row of $\Pi$ as $\alpha_{11}\beta_1'$. If, for simplicity, we assume that $\Gamma_1 = 0$ and that $r = 1$, i.e. among the five variables there exists only one stationary relation, the money demand relation, the cointegrated VAR model can now be written as:

$$\begin{pmatrix} \Delta m^r_t \\ \Delta y^r_t \\ \Delta^2 p_t \\ \Delta R_{m,t} \\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} -0.2 \\ a_{21} \\ a_{31} \\ a_{41} \\ a_{51} \end{pmatrix} \{m^r - y^r + 0.9\Delta p - 25(R_m - R_b)\}_{t-1} + \mu + \varepsilon_t \tag{5.3}$$

The question is now how to choose the coefficients $a_{21}, \ldots, a_{51}$ so that the relevant information contained in $\Pi$ is preserved. An inspection of the unrestricted $\Pi$ matrix shows that the coefficients of the second row corresponding

to real income might (with some good will) be considered proportional to $\beta_1' x_t$ with $a_{21} \approx 0.1$. (Whether this is so is a testable hypothesis which will be discussed formally in Chapter 9.) None of the remaining rows seems to have coefficients even vaguely proportional to $\beta_1' x_{t-1}$ and $a_{31} = a_{41} = a_{51} = 0$ seems the only possible choice. Thus for $r = 1$ the best representation of $\Pi = \alpha\beta'$ seems to be:

$$\begin{pmatrix} \Delta m^r_t \\ \Delta y^r_t \\ \Delta^2 p_t \\ \Delta R_{m,t} \\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} -0.2 \\ 0.1 \\ 0.0 \\ 0.0 \\ 0.0 \end{pmatrix} \{m^r - y^r + 0.9\Delta p - 25(R_m - R_b)\}_{t-1} + \mu + \varepsilon_t \tag{5.4}$$

Model (5.4) represents an economy where the deviation between agents' actual money holdings, $m^r_{t-1}$, and their desired demand for money, $m^{r*}_{t-1} = \{y^r - 0.9\Delta p + 25(R_m - R_b)\}_{t-1}$, determines the real money stock, $m^r_t$, and the aggregate level of expenditure, $y^r_t$. When actual money holdings are above or below the long-run desired level, agents make gradual adjustments of their money holdings (and in their expenditure level) until the level of money stock is back at the steady-state level. Because $a_{21} > 0$, real expenditure goes up when there is 'excess money' in the economy. 'Excess money' has no short-run or long-run impact on inflation, nor on the two interest rates.

Under the assumption that $a_{21} = 0$, model (5.4) would be equivalent to the single equation error-correction model. Such models have been widely used to estimate money demand relations based on the assumption that a stable relation is a prerequisite for monetary policy control. In this simple empirical model it can be shown (see Johansen and Juselius, 2003) that the central bank would only be able to influence the level of aggregate expenditure by changing money supply but this would have no effect on the inflation rate. Thus, the specific values of the coefficients $\alpha$ and $\beta$ can have important implications for whether a chosen policy is effective or not.

Needless to say the implications of (5.4) as discussed above suggest that the assumption $r = 1$ is too simplistic (footnote 2) to be relevant as an analytical tool for monetary policy decisions.

Footnote 2: At this point we will disregard the possibility of extending the information set and how this may change the interpretation of the model.

I will argue that one reason why this is necessarily so is

that $r = 1$ presumes $p - r = 4$ common stochastic trends. Given the present information set this seems too many to be consistent with most theoretical assumptions underlying inflation and monetary policy control.

The choice of $r = 1$ forced us to impose many restrictions on $\Pi$, such as the proportionality of the first two rows and the zeros of the remaining rows. For example, assume that the coefficient of inflation rate, 0.9, in $\beta_1' x_t$ is in fact zero, supported by its very small $t$-ratio in the first row of $\Pi$. With $r = 1$, $\Delta p$ would have to be excluded altogether from the cointegration space, which would leave out the variable of primary interest for monetary policy control. From an econometric point of view such a choice would be contradicted by the significant $t$-ratio of inflation in the third row of $\Pi$. However, it is only by formulating a model for the full system that we can examine the implications of this seemingly innocent assumption.

We will now assume that $r = 2$ and see how this choice can give us more flexibility. The question is how to choose the second relation $\beta_2' x_{t-1}$ so that the two cointegration relations together approximately describe the structure of the $\Pi$ matrix. An inspection of the first two rows of the unrestricted $\Pi$ shows that the proportionality assumption is probably not valid. Therefore, we can instead let the second cointegration relation describe an IS type of relation between real income, real money stock and the deposit rate with coefficients approximately consistent with the second row of the $\Pi$ matrix. The $r = 2$ system can now be represented as:

$$\begin{pmatrix} \Delta m^r_t \\ \Delta y^r_t \\ \Delta^2 p_t \\ \Delta R_{m,t} \\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} -0.2 & 0 \\ 0 & -0.3 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \{(m^r - y^r) - 25(R_m - R_b)\}_{t-1} \\ \{y^r - 0.3 m^r + 7R_d\}_{t-1} \end{pmatrix} + \mu + \varepsilon_t \tag{5.5}$$

Although (5.5) can approximately reproduce the relevant information of the first two rows of $\Pi$, it excludes the inflation rate from the long-run relations, which is inconsistent with its highly significant coefficient in the third row of $\Pi$. Thus, it does not allow for enough flexibility in the description of the feed-back dynamics of the long-run structure. One possibility is that the inflation rate is in fact stationary by itself and, therefore, is a 'cointegration' vector by itself. In this case the rank of $\Pi$ would

have to be increased to $r = 3$ and the model would become:

$$\begin{pmatrix} \Delta m^r_t \\ \Delta y^r_t \\ \Delta^2 p_t \\ \Delta R_{m,t} \\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} -0.2 & 0 & 0 \\ 0 & -0.3 & 0 \\ 0 & 0 & -1.74 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \{(m^r - y^r) - 25(R_m - R_b)\}_{t-1} \\ \{y^r - 0.3 m^r + 7R_d\}_{t-1} \\ \Delta p_{t-1} \end{pmatrix} + \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \\ \mu_5 \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ \varepsilon_{3,t} \\ \varepsilon_{4,t} \\ \varepsilon_{5,t} \end{pmatrix} \tag{5.6}$$

Model (5.6) is now able to roughly reproduce the data information in $\Pi x_{t-1}$, with the exception of the 'significant' deposit rate in the fourth row of $\Pi$.

The above discussion served the purpose of illustrating that the choice of $\alpha$ and $\beta$ has to reproduce the statistical information in the $\Pi$ matrix. No testing was performed and the proposed decompositions were, therefore, completely hypothetical. Chapters 8 and 9 will discuss likelihood based test procedures for a wide variety of hypotheses including the above mentioned. However, the choice of $\alpha$ and $\beta$ should also ideally describe an interpretable economic structure and provide some empirical insight on the appropriateness of the underlying economic model.

The tentatively proposed cointegration relations given above are quite different from the 'monetary theory' consistent cointegration relations discussed in Chapter 2 and reproduced below. Since Chapter 2 did not discuss a hypothetical adjustment structure for the cointegration relations, the $\alpha$ coefficients reported below have been chosen to be roughly consistent with the predictions from monetarists' theory models.

$$\begin{pmatrix} \Delta m^r_t \\ \Delta y^r_t \\ \Delta^2 p_t \\ \Delta R_{m,t} \\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} -a_{11} & a_{12} & 0 \\ 0 & 0 & -a_{23} \\ a_{31} & 0 & 0 \\ a_{41} & 0 & 0 \\ a_{51} & a_{52} & -a_{53} \end{pmatrix} \begin{pmatrix} (m - p - y^r)_{t-1} \\ (R_m - R_b)_{t-1} \\ (R_b - \Delta p)_{t-1} \end{pmatrix} + \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \\ \mu_5 \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ \varepsilon_{3,t} \\ \varepsilon_{4,t} \\ \varepsilon_{5,t} \end{pmatrix} \tag{5.7}$$
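The way the hypothetical $\alpha$ and $\beta$ of (5.6) build up a reduced rank $\Pi = \alpha\beta'$ can be made concrete with a small Python sketch. It rests on two assumptions of ours: the variables are ordered as $(m^r, y^r, \Delta p, R_m, R_b)$, and the rate written $R_d$ in the second relation is taken to be the deposit rate $R_m$. The helper `matmul` is a plain-Python matrix product written for this illustration.

```python
def matmul(A, B):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Adjustment coefficients alpha (5 x 3), taken from (5.6)
alpha = [
    [-0.2, 0.0, 0.0],
    [0.0, -0.3, 0.0],
    [0.0, 0.0, -1.74],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# Rows of beta' (3 x 5); variable order: m^r, y^r, dp, R_m, R_b (assumed)
beta_t = [
    [1.0, -1.0, 0.0, -25.0, 25.0],   # (m^r - y^r) - 25(R_m - R_b)
    [-0.3, 1.0, 0.0, 7.0, 0.0],      # y^r - 0.3 m^r + 7 R_d, with R_d read as R_m
    [0.0, 0.0, 1.0, 0.0, 0.0],       # dp
]

Pi = matmul(alpha, beta_t)
for name, row in zip(["dm", "dy", "d2p", "dRm", "dRb"], Pi):
    print(name, [round(v, 2) for v in row])

# The first row can equally be written with a coefficient -0.2 on (m^r - y^r)
# and a coefficient 5 on (R_m - R_b):
row_alt = [-0.2 * 1.0, -0.2 * -1.0, 0.0, 5.0 * 1.0, 5.0 * -1.0]
```

The two zero rows of $\alpha$ carry over to zero rows of $\Pi$, so under this hypothetical structure neither interest rate adjusts to any of the three relations.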

It is noteworthy that $-a_{11} = -0.20$ and $a_{12} \approx 5$ would approximately reproduce the money demand relation of (5.6). Thus, a row in the $\Pi$ matrix can also be found as a linear combination of several cointegration relations. In this sense the illustration of the three cointegration relations in (5.6) is likely to be too simplified. Most empirical models would require a much more complicated cointegration structure and without the help of sophisticated test procedures it would be utterly hard to uncover these structures. This will be the topic of Chapters 9-13.

5.3 Common trends and the moving average representation

Chapter 3 showed that the stationary VAR model could be directly inverted into the moving average form. When the VAR model contains unit roots the autoregressive lag polynomial becomes non-invertible. The purpose of this section is to demonstrate how we can find the moving average representation in this case. For simplicity of notation we focus on the simple VAR(2) model:

$$\Pi(L)x_t = (I - \Pi_1 L - \Pi_2 L^2)x_t = \mu + \varepsilon_t$$

We assume now that the characteristic polynomial $|\Pi(z)| = |I - \Pi_1 z - \Pi_2 z^2|$ contains a unit root. In this case the determinant $|\Pi(z)| = 0$ and $\Pi(z)$ cannot be inverted for $z = 1$. Therefore, we need to factorize out the unit root component of the lag polynomial of the VAR model. First we move the matrix lag polynomial to the right hand side of the equation for $x_t$:

$$\Pi(L)x_t = \mu + \varepsilon_t$$

$$x_t = \Pi^{-1}(L)(\mu + \varepsilon_t) \tag{5.8}$$

Since $\Pi^{-1}(L) = \Pi^a(L)/\det(\Pi(L))$, where $\Pi^a(L)$ is the adjoint matrix, and $\det(\Pi(L))$ is a polynomial in $z$ we multiply both sides of (5.8) with the difference operator $(1 - L)$ so that the non-invertible unit root is cancelled out:

$$(1 - L)x_t = \Pi^{-1}(L)(1 - L)(\varepsilon_t + \mu) = C(L)(\varepsilon_t + \mu) = (C_0 + C_1 L + C_2 L^2 + \ldots)(\varepsilon_t + \mu)$$

where $C(L)$ is now a stationary lag polynomial and hence invertible. The characteristic function of $C(L)$, $C(z) = C_0 + C_1 z + C_2 z^2 + \ldots$, is now expanded using the Taylor rule, evaluated for $z = 1$ and reformulated as:

$$C(L) = C(1) + C^*(L)(1 - L). \tag{5.9}$$

By inserting (5.9) in (5.8) we get:

$$(1 - L)x_t = \{C + C^*(L)(1 - L)\}(\varepsilon_t + \mu)$$

where $C = C(1)$. Dividing through by $(1 - L)$ we obtain:

$$x_t = C\left(\frac{\varepsilon_t}{1 - L}\right) + C^*(L)\varepsilon_t + \tilde{x}_0.$$

To see this, and leaving out the constant term $\mu$ to simplify the notation, the equation can be written as:

$$x_s = x_{s-1} + C\varepsilon_s + C^*(L)(\varepsilon_s - \varepsilon_{s-1}) = x_{s-1} + C\varepsilon_s + Y_s - Y_{s-1},$$

where $Y_s = C^*(L)\varepsilon_s$ is a short-hand notation for the stationary part of the process. Summing for $s = 1, \ldots, t$ we get:

$$x_t = x_0 + C\sum_{s=1}^{t}\varepsilon_s + Y_t - Y_0 = C\sum_{s=1}^{t}\varepsilon_s + Y_t + \tilde{x}_0.$$

We can now formulate the VAR model in moving average form as:

$$x_t = C\sum_{i=1}^{t}\varepsilon_i + C^*(L)\varepsilon_t + \tilde{x}_0, \tag{5.10}$$

showing that $x_t$ contains stochastic trends $C\sum_{i=1}^{t}\varepsilon_i$, stationary stochastic components $C^*(L)\varepsilon_t$, and initial values, $\tilde{x}_0$. Thus, $\tilde{x}_0$ contains both the initial value, $x_0$, of the process $x_t$ and the initial value of the short-run dynamics $C^*(L)\varepsilon_0$. Thus the VAR model is capable of reproducing a similar trend-cycle-irregular decomposition of the vector process as was informally discussed in Chapter 2.
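The decomposition (5.10) can be checked numerically in the simplest possible case. The Python sketch below is a scalar illustration chosen by us and is not taken from the text: it assumes $\Delta x_t = \varepsilon_t + \theta\varepsilon_{t-1}$, so that $C(L) = 1 + \theta L$, $C(1) = 1 + \theta$ and, from (5.9), $C^*(L) = -\theta$. The values of $\theta$, $x_0$ and the simulated shocks are arbitrary.

```python
import random

random.seed(0)
theta, x0, n = 0.5, 1.0, 50

# eps[0] is the pre-sample shock, eps[1..n] are the in-sample shocks
eps = [random.gauss(0.0, 1.0) for _ in range(n + 1)]

# Direct recursion: x_t = x_{t-1} + eps_t + theta * eps_{t-1}
x = [x0]
for t in range(1, n + 1):
    x.append(x[-1] + eps[t] + theta * eps[t - 1])

# Ingredients of the decomposition C(L) = C(1) + C*(L)(1 - L)
C1 = 1.0 + theta        # long-run impact C(1)
C_star = -theta         # C*(L) is a constant in this simple case
x0_tilde = x0 - C_star * eps[0]   # x_0 minus the initial value of the stationary part

# x_t = C(1) * (cumulated shocks) + C* eps_t + x0_tilde, compared with the recursion
cum = 0.0
max_gap = 0.0
for t in range(1, n + 1):
    cum += eps[t]
    decomposed = C1 * cum + C_star * eps[t] + x0_tilde
    max_gap = max(max_gap, abs(decomposed - x[t]))

print("largest discrepancy between recursion and decomposition:", max_gap)
```

The discrepancy is of the order of floating point rounding error, so the level of the process is exactly the sum of a stochastic trend, a stationary component and a term depending on initial values.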

5.4 From the AR to the MA representation (footnote 3)

We showed above that the $C_i$ matrices are functions of the $\Pi_i$ matrices. We will now illustrate how one can find the $C_i$ matrices when $\beta$ and $\alpha$ are known for the simple VAR(1) model. Johansen (1996), Chapter 4 provides a detailed and lengthy derivation of the results for the general VAR($k$) model. The interested reader is referred to the discussion there. Although the detailed derivation of the result for the VAR($k$) model is more complicated, the general principle can be understood using the VAR(1) model:

$$\Delta x_t = \alpha\beta' x_{t-1} + \mu + \varepsilon_t, \qquad t = 1, \ldots, T, \tag{5.11}$$

with initial value $x_0$. We consider now the matrix $\alpha_\perp$ of full rank and dimension $p \times (p - r)$ so that $\alpha'\alpha_\perp = 0$ and $\mathrm{rank}(\alpha, \alpha_\perp) = p$. As will be discussed in Chapter 12 the matrix $\alpha_\perp$ is not uniquely defined without imposing identifying restrictions, but the results here apply for any choice of $\alpha_\perp$. We will now make use of the following relationship between $\alpha$, $\beta$, $\alpha_\perp$, $\beta_\perp$:

$$\beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp' + \alpha(\beta'\alpha)^{-1}\beta' = I. \tag{5.12}$$

Using (5.12) we can decompose any vector $v$ in $R^p$ into a vector $v_1 \in sp(\beta_\perp)$ and a vector $v_2 \in sp(\alpha)$. We now apply the results to the $p$-dimensional vector $x_t$:

$$x_t = \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp' x_t + \alpha(\beta'\alpha)^{-1}\beta' x_t = \omega_1 \alpha_\perp' x_t + \omega_2 \beta' x_t. \tag{5.13}$$

Thus, $x_t$ can be expressed as a linear combination of $\alpha_\perp' x_t$ and the cointegration relations, $\beta' x_t$, and the next step is to express them as a function of initial values and the errors ($\varepsilon_t, \varepsilon_{t-1}, \ldots$). We first pre-multiply equation (5.11) with $\beta'$ and then solve for $\beta' x_t$ to get the equation

$$\beta' x_t = (I + \beta'\alpha)\beta' x_{t-1} + \beta'\mu + \beta'\varepsilon_t.$$

Footnote 3: This section relies heavily on the SJ Chapter 3.
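The identity (5.12) is easy to verify numerically for a small system. The Python sketch below uses a bivariate example with $r = 1$ that we have made up for the purpose: $\alpha = (-0.2, 0.1)'$ and $\beta = (1, -1)'$, with orthogonal complements $\alpha_\perp = (1, 2)'$ and $\beta_\perp = (1, 1)'$. With $p = 2$ and $r = 1$ the two inverses in (5.12) are scalars, so no matrix inversion routine is needed.

```python
def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def outer(u, v, scale=1.0):
    """Scaled outer product u v' as a list of rows."""
    return [[scale * a * b for b in v] for a in u]


# Hypothetical bivariate example with one cointegration relation
alpha, beta = [-0.2, 0.1], [1.0, -1.0]
alpha_perp, beta_perp = [1.0, 2.0], [1.0, 1.0]

# Orthogonality of the complements
orth_alpha = dot(alpha, alpha_perp)
orth_beta = dot(beta, beta_perp)

# The two projections in (5.12)
C = outer(beta_perp, alpha_perp, 1.0 / dot(alpha_perp, beta_perp))
P = outer(alpha, beta, 1.0 / dot(beta, alpha))
total = [[C[i][j] + P[i][j] for j in range(2)] for i in range(2)]

# Autoregressive coefficient of the cointegration relation, 1 + beta'alpha
root = 1.0 + dot(beta, alpha)

print("beta_perp (alpha_perp' beta_perp)^-1 alpha_perp' =", C)
print("alpha (beta' alpha)^-1 beta'                     =", P)
print("sum =", total, " and 1 + beta'alpha =", round(root, 2))
```

The two matrices sum to the identity, and $1 + \beta'\alpha = 0.7$ lies inside the unit circle, so the relation $\beta' x_t$ in this example follows a stationary first order autoregression.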

The eigenvalues of the matrix $(I + \beta'\alpha)$ are inside the unit circle when the $r$-dimensional process $\beta' x_t$ is stationary and it is straightforward to represent $\beta' x_t$ as a function of $\varepsilon_i$, $i = 1, \dots, T$ and the constant $\mu$:

$$\beta' x_t = \sum_{i=0}^{\infty}(I + \beta'\alpha)^i\beta'(\varepsilon_{t-i} + \mu). \tag{5.14}$$

An expression for $\alpha_\perp' x_t$ is found by pre-multiplying (5.11) with $\alpha_\perp'$, which gives the equation $\alpha_\perp'\Delta x_t = \alpha_\perp'\varepsilon_t + \alpha_\perp'\mu$, which has the solution:

$$\alpha_\perp' x_t = \alpha_\perp' x_0 + \sum_{i=1}^{t}\alpha_\perp'(\varepsilon_i + \mu). \tag{5.15}$$

Inserting (5.14) and (5.15) into (5.13) we obtain the following result:

$$
\begin{aligned}
x_t &= \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\Big[\alpha_\perp'\sum_{i=1}^{t}(\varepsilon_i + \mu) + \alpha_\perp' x_0\Big] + \alpha(\beta'\alpha)^{-1}\sum_{i=0}^{\infty}(I + \beta'\alpha)^i\beta'(\varepsilon_{t-i} + \mu) \\
&= C\sum_{i=1}^{t}\varepsilon_i + C\mu t + Cx_0 + \alpha(\beta'\alpha)^{-1}\sum_{i=0}^{\infty}(I + \beta'\alpha)^i\beta'(\varepsilon_{t-i} + \mu) \\
&= C\sum_{i=1}^{t}\varepsilon_i + \tau_1 t + \tau_0 + Y_t
\end{aligned}
\tag{5.16}
$$

where $C = \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp'$, $\tau_1 = C\mu$ measures the slope of a linear trend in $x_t$, $\tau_0 = Cx_0$ depends on initial values, and $Y_t$ is a stationary process.

By expressing the VAR(k) model in the companion form it is possible to derive the results for the more general case using the same principle as above. This will not be done here but is left for the interested reader to do as an exercise. The main result relates the $C$ matrix to all the parameters of the VAR(k) model:

$$C = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp' \tag{5.17}$$

where $\Gamma = I - \Gamma_1 - \dots - \Gamma_{k-1}$ (see Johansen (1996), Chapter 4). Thus, (5.16) with $\Gamma = I$ is a special case of (5.17). By expressing (5.17) as:

$$C = \tilde{\beta}_\perp\alpha_\perp'$$

where $\tilde{\beta}_\perp = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}$ it is easy to see that the decomposition of the $C$ matrix is similar to the one of the $\Pi$ matrix, except that in the AR representation $\beta$ determines the common long-run relations and $\alpha$ the loadings, whereas in the moving average representation $\alpha_\perp'$ determines the common stochastic trends and $\tilde{\beta}_\perp$ their loadings.

Representation (5.17) together with (5.16) show that the non-stationarity in the process $x_t$ originates from the cumulative sum of the $p - r$ combinations $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$, leading to the following definition:

Definition 3 The common driving trends are the variables $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$.

All the remaining chapters will use results based on both the AR and the MA representation. It is, therefore, important to have a good intuition for the meaning of the orthogonal complements. We will illustrate this with a few simple examples:

(5.3): $r = 1$ corresponds to $p - r = 4$, so that $\alpha_\perp$ and $\beta_\perp$ are $5 \times 4$ matrices. For $\alpha' = [-0.2, 0.1, 0.0, 0.0, 0.0]$ we can find $\alpha_\perp$ as:

$$
\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i =
\begin{bmatrix}
1 & 2 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\sum_{i=1}^{t}\varepsilon_{1,i} \\
\sum_{i=1}^{t}\varepsilon_{2,i} \\
\sum_{i=1}^{t}\varepsilon_{3,i} \\
\sum_{i=1}^{t}\varepsilon_{4,i} \\
\sum_{i=1}^{t}\varepsilon_{5,i}
\end{bmatrix}
\tag{5.18}
$$

It is easy to verify that $\alpha_\perp'\alpha = 0$. Interpreting $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$ as an estimate of the $p - r$ common stochastic trends shows that the first one is a weighted sum of cumulated shocks to real money stock and real income, whereas the next three stochastic trends are equal to the cumulated shocks to inflation, the deposit rate and the bond rate, respectively. We also note that a zero row in $\alpha$ corresponds to a unit vector in $\alpha_\perp$. Such a variable is called a weakly exogenous variable, which will be further discussed in Chapter 9.
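The orthogonality condition for the example above can be verified in a few lines. The sketch below uses $\alpha' = [-0.2, 0.1, 0, 0, 0]$ and the rows of $\alpha_\perp'$ as read from (5.18); the three periods of shocks in `shocks` are hypothetical numbers introduced only to show how the common trends are formed from cumulated residuals.

```python
alpha = [-0.2, 0.1, 0.0, 0.0, 0.0]

# Rows of alpha_perp' as read from (5.18)
alpha_perp_T = [
    [1.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


# Each row of alpha_perp' must be orthogonal to alpha
orthogonality = [dot(row, alpha) for row in alpha_perp_T]

# Hypothetical shocks for three periods (five variables each), illustration only
shocks = [
    [0.1, -0.2, 0.0, 0.3, -0.1],
    [-0.3, 0.1, 0.2, 0.0, 0.1],
    [0.2, 0.2, -0.1, -0.1, 0.0],
]
cumulated = [sum(period[j] for period in shocks) for j in range(5)]

# The p - r = 4 common driving trends: alpha_perp' times the cumulated shocks
common_trends = [dot(row, cumulated) for row in alpha_perp_T]

# Another valid choice of alpha_perp: add the second row to the first one
alt_first_row = [a + b for a, b in zip(alpha_perp_T[0], alpha_perp_T[1])]

print("alpha_perp' alpha =", orthogonality)
print("common trends     =", common_trends)
print("alternative row orthogonal to alpha:", abs(dot(alt_first_row, alpha)) < 1e-12)
```

The last check illustrates that $\alpha_\perp$ is not unique: any linear combination of its columns that keeps full rank is again orthogonal to $\alpha$.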

Note that a linear combination of common stochastic trends is also a stochastic trend. Thus, (5.18) is just one of infinitely many representations. We will now similarly find $\beta_\perp$ corresponding to $\beta' = [1, -1, 0, 25, -25]$, where for simplicity we have set the coefficient to inflation to zero.

$$
\beta_\perp' =
\begin{bmatrix}
1 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 \\
25 & 0 & 0 & 0 & 1
\end{bmatrix}
\tag{5.19}
$$

To be able to interpret $\beta_\perp$ as loadings to the common stochastic trends we would first need to post multiply by the component $(\alpha_\perp'\Gamma\beta_\perp)^{-1}$. This will be done in subsequent chapters. We leave it to the reader to find the orthogonal complements to $\alpha$ and $\beta$ for (5.5) and (5.6).

5.5 Pulling and pushing forces⁴

The simple error correction model (5.11) illustrates on one hand how the process is pulled towards steady-state, defined by $\beta' x_t - E(\beta' x_t) = 0$, with the force $\alpha$ which activates as soon as the process is out of steady-state, defined by $\beta' x_t \neq E(\beta' x_t)$. The common trends representation (5.16) illustrates on the other hand how the variables move in a nonstationary manner described by the common driving trends $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$. In this sense the AR and the MA representation are two sides of the same coin: the pulling and the pushing forces of the system.

Figure 5.1 illustrates these forces for a simple bivariate system with $x_t' = [m_t, y_t]$, where $m_t$ is money stock and $y_t$ is income. If the steady state position corresponds, for example, to a constant money velocity $m - y = \mu$, so that $\beta' = [1, -1]$, then the attractor set is $\beta_\perp' = [1, 1]$. In the picture this is indicated by the 45° line showing that $m_t = y_t$ in steady state. If $\beta' x_t = (m_t - y_t) - E(\beta' x_t) \neq 0$, then the adjustment coefficients $\alpha$ will force the process back towards the attractor set with a speed of adjustment that depends on the length of $\alpha$ and the size of the equilibrium error $\beta' x_t$. The common trend, measured by $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$, has pushed money and income along the line defined by $\beta_\perp$, the attractor set.

⁴ This section relies heavily on Johansen (1976) Chapter 3.

Thus, positive shocks

Hansen and Johansen (1998) also discuss a model with overshooting

$$
\begin{aligned}
\Delta x_{1,t} &= -\tfrac{1}{2}(x_{1,t-1} - x_{2,t-1}) + \varepsilon_{1,t}, \\
\Delta x_{2,t} &= -\tfrac{1}{4}(x_{1,t-1} - x_{2,t-1}) + \varepsilon_{2,t},
\end{aligned}
$$

i.e. a model where the variable $x_{2,t}$ is not error-correcting to the equilibrium error $(x_{1,t-1} - x_{2,t-1})$, but instead is pushing the process further away from steady state, despite the fact that the variables are cointegrating. In this model $x_{2,t}$ does not error-correct with a coefficient of plausible sign. But, since $x_{1,t}$ is also reacting on the same equilibrium error with a larger error correction coefficient, $|-0.50| > |-0.25|$, the process is nevertheless stable.

5.6 Concluding discussion

This chapter has shown that the notion of common trends, $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$, and the notion of cointegrating relations, $\beta' x_t$, are two sides of the same coin, as are the loadings coefficients, $\beta_\perp$, and the adjustment coefficients, $\alpha$. Although we can, of course, choose the representation we prefer, there is, nonetheless, one aspect in which the two concepts differ. A cointegration relation is invariant to changes in the information set, whereas this is not necessarily the case with a common trend. If cointegration holds between a set of variables, then the same cointegration relation will still be found in a larger set of variables. This will be discussed in great detail in Chapter 10.

An 'unanticipated disturbance', $\varepsilon_t$, defining the common trend $\alpha_\perp'\sum_{i=1}^{t}\varepsilon_i$, is only unanticipated for the chosen information set. Unless the latter is complete in the sense of comprising all relevant variables, the residual, $\varepsilon_t$, will necessarily contain the effect of omitted variables, see Hendry (1995). Thus an unanticipated $\varepsilon_t$ based on a smaller system need no longer be so in a larger system. Generally, both $\varepsilon_t$ and $\alpha_\perp$ change when the information set changes, implying that the definition of a common trend is not invariant to changes in the information set. This will be further discussed in Chapter 11.
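The stability of the overshooting model discussed earlier in this chapter can be illustrated with a short deterministic sketch. It iterates the two equations with all shocks set to zero, starting from a hypothetical point away from the attractor set; the starting values and the number of iterations are assumptions made for illustration. The equilibrium error $x_1 - x_2$ evolves with the root $1 + \beta'\alpha$.

```python
alpha = [-0.5, -0.25]   # adjustment coefficients of the overshooting model
beta = [1.0, -1.0]      # the equilibrium error is x1 - x2

x = [1.0, 0.0]          # illustrative starting point off the attractor set
path = [tuple(x)]
for _ in range(60):
    gap = beta[0] * x[0] + beta[1] * x[1]               # equilibrium error
    x = [x[0] + alpha[0] * gap, x[1] + alpha[1] * gap]  # shocks set to zero
    path.append(tuple(x))

gaps = [p[0] - p[1] for p in path]

# Autoregressive root of the equilibrium error: 1 + beta' alpha
root = 1.0 + beta[0] * alpha[0] + beta[1] * alpha[1]

print("root of the equilibrium error:", root)
print("first three gaps:", gaps[:3])
print("x2 after one step:", path[1][1], "(moves away from x1)")
print("end point:", path[-1])
```

In this sketch $x_2$ initially moves away from $x_1$, yet the gap shrinks every period because $x_1$ adjusts twice as fast, and both variables settle on the same value on the attractor set.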

Chapter 6 Deterministic Components in the I(1) Model

The purpose of this chapter is to discuss the interpretation of fixed effects, such as constant, deterministic trends, and intervention dummies, and show how they affect the mean of the differenced process, $E(\Delta x_t)$, and the mean of the equilibrium error process, $E(\beta' x_t)$. Section 6.1 first illustrates the dual role of the constant term and the linear trend in a simple dynamic regression model. Section 6.2 then extends the discussion to the more complicated case of the VAR model. Section 6.3 discusses five cases of different restrictions imposed on the trend and the constant in the VAR model. Section 6.4 derives the MA representation when there is a trend and a constant in the VAR model. Section 6.5 discusses the role of three different types of dummy variables in a simple dynamic regression model and Section 6.6 extends the discussion to the VAR model. Section 6.7 illustrates.

6.1 A trend and a constant in a simple dynamic regression model

Most economists are familiar with the standard regression model and how to interpret its coefficients. When dynamics are introduced, inference and interpretation change fundamentally and interpretational mistakes are easy to make. We will use a simple univariate model to demonstrate how the interpretation of a linear time trend and a constant term is crucially related to the dynamics of the model, in particular to whether the dynamics contains

a unit root or not. We consider the following simple regression model for $y_t$ containing a linear trend and a constant:

$$y_t = \gamma t + u_t + \mu_0, \quad t = 1, \dots, T \tag{6.1}$$

where the residual $u_t$ is a first order autoregressive process:

$$u_t = \frac{\varepsilon_t}{1 - \rho L} \tag{6.2}$$

and $u_0$ is assumed fixed. Note that the assumption (6.2) implies that (6.1) is a common factor model. As demonstrated below such a model imposes nonlinear restrictions on the parameters of the AR model and is, therefore, a special case of the general autoregressive model. Nonetheless, (6.1)-(6.2) serves the purpose of providing a pedagogical illustration of the dual roles of deterministic components in dynamic models.

It is useful to see how the constant $\mu_0$ is related to the initial value of $y_t$. Using (6.1) we have that $y_0 = \mu_0 + u_0$. Since an economic variable is usually given in logs, the level contains information about the unit of measurements (the log of 100,000 euro, say). Therefore, the value of $\mu_0$ is generally dominated by $y_0$. For practical purposes $\mu_0 \simeq y_0$ and in the discussion below we will set $\mu_0 = y_0$ to emphasize the role of measurements on the constant in a dynamic regression model. By substituting (6.2) in (6.1) we get:

$$y_t = \gamma t + \frac{\varepsilon_t}{1 - \rho L} + y_0 \tag{6.3}$$

and by multiplying through with $(1 - \rho L)$:

$$(1 - \rho L)y_t = (1 - \rho L)\gamma t + (1 - \rho L)y_0 + \varepsilon_t. \tag{6.4}$$

Rewriting (6.4) using $Lx_t = x_{t-1}$ we get:

$$y_t = \rho y_{t-1} + \gamma(1 - \rho)t + \rho\gamma + (1 - \rho)y_0 + \varepsilon_t. \tag{6.5}$$

It is easy to see that the "static" regression model (6.1) is equivalent to the following dynamic regression model:

$$y_t = b_1 y_{t-1} + b_2 t + b_0 + \varepsilon_t \tag{6.6}$$

with

$$
\begin{aligned}
b_1 &= \rho \\
b_2 &= \gamma(1 - \rho) \\
b_0 &= \rho\gamma + (1 - \rho)y_0.
\end{aligned}
\tag{6.7}
$$

We consider the following four cases:

Case 1. $\rho = 1$ and $\gamma \neq 0$. It follows from (6.5) that $\Delta y_t = \gamma + \varepsilon_t$, for $t = 1, \dots, T$, i.e. "the random walk with drift" model. Note that $E(\Delta y_t) = \gamma \neq 0$ is equivalent to $y_t$ having a linear trend, $\gamma t$.

Case 2. $\rho = 1$ and $\gamma = 0$. It follows from (6.5) that $\Delta y_t = \varepsilon_t$, for $t = 1, \dots, T$, i.e. "the pure random walk" model. In this case $E(\Delta y_t) = 0$ and $y_t$ contains no linear trend.

Case 3. $|\rho| < 1$ and $\gamma \neq 0$ gives (6.6), i.e. $y_t$ is stationary around its mean $Ey_t = a_1 t + a_0$. We will now show that $a_1 = \gamma$ and $a_0 = y_0$:

$$
\begin{aligned}
Ey_t &= \rho Ey_{t-1} + \gamma(1 - \rho)t + \rho\gamma + (1 - \rho)y_0 \\
a_1 t + a_0 &= \rho(a_1(t - 1) + a_0) + \gamma(1 - \rho)t + \rho\gamma + (1 - \rho)y_0 \\
a_1(1 - \rho)t + a_0(1 - \rho) + \rho a_1 &= \gamma(1 - \rho)t + \rho\gamma + (1 - \rho)y_0
\end{aligned}
$$

Hence:

$$a_1(1 - \rho)t = \gamma(1 - \rho)t, \quad a_1 = \gamma$$

and:

$$a_0(1 - \rho) + \rho\gamma = \rho\gamma + (1 - \rho)y_0, \quad a_0 = y_0.$$

Thus, one should note that the coefficients in a dynamic regression model have to be interpreted with caution. For example $b_2$ in (6.6) is not an estimate of the trend slope in $y_t$ and $b_0$ is not an estimate of $\mu_0$.

Case 4. $|\rho| < 1$ and $\gamma = 0$ gives us $y_t = \rho y_{t-1} + (1 - \rho)y_0 + \varepsilon_t$, where $Ey_t = y_0$, i.e. "the stationary autoregressive" model with a constant term.

To summarize:

• in the static regression model (6.1) the constant term is essentially accounting for the unit of measurement of $y_t$.

• in the dynamic regression model (6.5) the constant term is a weighted average of the growth rate $\gamma$ and the initial value $y_0$.

• in the differenced model ($\rho = 1$) the constant term is only measuring the growth rate, $\gamma$.

6.2 A trend and a constant in the VAR

The above results were derived for the univariate model. We will now demonstrate that one can, with some modifications, apply a similar interpretation of the deterministic components in the multivariate model. A characteristic feature of the error-correction formulation given below is the inclusion of both differences and levels in the same model, allowing us to investigate short-run as well as long-run effects in the data. But many economic variables typically exhibit linear deterministic growth (at least locally over the sample period) in addition to stochastic growth. Sometimes it is preferable to approximate the trend behavior with a stochastic trend, sometimes it is better to model it with a deterministic trend and in most cases we need a combination of the two. Statistically it is not always straightforward to distinguish between the two, especially over short sample periods.

When two variables share the same stochastic trend we showed in the previous chapter that it is possible to find a linear combination that cancels the trend, i.e. a cointegration relation. In other cases a linear combination between the variables removes the stochastic trend but not the deterministic trend and we need to add a linear trend to the cointegration relation to achieve stationarity. We call it a trend-stationary cointegration relation. We call a variable which contains only a deterministic trend, but no stochastic trend, a trend-stationary variable. In the cointegrated VAR model the latter case can be modeled by adding a trend to the cointegration space.

The basic idea can be illustrated with a simple VAR(1) model containing a constant, $\mu_0$, and a trend, $\mu_1 t$. For notational simplicity all short-run dynamic effects, $\Gamma_i$, have been set to zero. Thus we consider the following model in AR form:

$$\Delta x_t = \alpha\beta' x_{t-1} + \mu_0 + \mu_1 t + \varepsilon_t \tag{6.8}$$

and in the MA form:

$$x_t = C\sum_{i=1}^{t}(\varepsilon_i + \mu_0 + i\mu_1) + \sum_{i=0}^{\infty}C_i^*(\varepsilon_{t-i} + \mu_0 + (t - i)\mu_1) \tag{6.9}$$

Because $\Delta x_t$ and $\beta' x_{t-1}$ are stationary we can express (6.8) as:

$$\Delta x_t - E\Delta x_t = \alpha(\beta' x_{t-1} - E(\beta' x_{t-1})) + \varepsilon_t \tag{6.10}$$

where

$$E\Delta x_t = \alpha E(\beta' x_{t-1}) + \mu_0 + \mu_1 t \tag{6.11}$$

and

$$
\begin{aligned}
E\beta'\Delta x_t &= \beta'\alpha E(\beta' x_{t-1}) + \beta'\mu_0 + \beta'\mu_1 t \\
E\beta' x_t &= (I + \beta'\alpha)E(\beta' x_{t-1}) + \beta'\mu_0 + \beta'\mu_1 t
\end{aligned}
\tag{6.12}
$$

The two $(p \times 1)$ vectors $\mu_0$ and $\mu_1$ can always be decomposed into two new vectors, so that one of them belongs to the $\alpha$ space, i.e. to the cointegration relations, and the other to the orthogonal space, i.e. to $\Delta x_t$. Since a vector can be decomposed in many different ways we need some principle for doing it. Generally we would like an equilibrium error to have mean zero and one possibility is to decompose $\mu_0$ and $\mu_1$ so that $(\beta' x_t - E\beta' x_t) = 0$. To achieve this when $\mu_1 \neq 0$ is quite involved and we will focus here on the case ($\mu_0 \neq 0$, $\mu_1 = 0$), i.e. a constant term but no linear trend in the VAR model. In this case we can use the equality:

$$\alpha(\beta'\alpha)^{-1}\beta' + \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp' = I \tag{6.13}$$

to decompose the vector $\mu_0$ into two new vectors:

$$\mu_0 = \alpha(\beta'\alpha)^{-1}\beta'\mu_0 + \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp'\mu_0 = \alpha\beta_0 + \gamma_0 \tag{6.14}$$

where $\beta_0 = (\beta'\alpha)^{-1}\beta'\mu_0$ and $\gamma_0 = \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp'\mu_0$. We will now show that $E(\beta' x_t + \beta_0) = 0$ and $E\Delta x_t = \gamma_0$ in (6.8) when $\mu_1 = 0$. In this case (6.9) becomes:

as has εt .. DETERMINISTIC COMPONENTS xt = C and t ∞ X X (εi + µ0 ) + C∗ (εt−i + µ0 ) i i=1 i=0 E∆xt = Cµ0 ∞ X Eβ0 xt = β 0 C∗ µ0 i i=0 (6. i (6.15) and (6. However.104 CHAPTER 6.13) satisﬁes the criterium E(β 0 xt + β0 ) = 0. .16) Inserting (6.. E(β0 xt + β 0 ) = 0... E(β0 xt−1 + β0 ) = 0 and we have shown that the decomposition (6. − Γk−1 . Thus..17) has a zero mean.12) (because Eβ 0 xt is constant) we obtain the following expression for Eβ0 xt : Eβ xt = −(β α) β µ0 = β 0 0 −1 0 0 ∞ X i=0 C∗ µ0 .h.18) where Γ = I− Γ1 − . the derivation of the mean value of ∆xt becomes more complicated in the general case. When Γ1 . Γk−1 6= 0 (but µ1 = 0) we can apply a slightly diﬀerent decomposition: ΓC + (I − ΓC) = Γβ⊥ (α0⊥ Γβ ⊥ )−1 α0⊥ + I − Γβ⊥ (α0⊥ Γβ⊥ )−1 α0⊥ = I (6. of (6.13) is that when Γ1 = 0 and µ1 = 0 we have that E(4xt ) = γ 0 and. hence.16) into (6.17) ∆xt − γ = α(β 0 xt−1 + β 0 ) + εt Because the l. . the motivation for choosing the decomposition (6.15) By noting that Eβ0 ∆xt = 0 in (6.10) and noting that C = β⊥ (α0⊥ β⊥ )−1 α0⊥ gives: ∆xt − β⊥ (α0⊥ β⊥ )−1 α0⊥ µ0 = αβ 0 xt−1 + α(β0 α)−1 β0 µ0 + εt (6.s.

When $\mu_1 \neq 0$ we would need a different, more complicated, decomposition of $\mu_0$ and $\mu_1$ to achieve that the equilibrium error has a zero mean¹. If, nevertheless, (6.13) is used to obtain $(\beta_0, \beta_1, \gamma_0, \gamma_1)$, the consequence is that $E(\beta' x_{t-1} + \beta_0 + \beta_1 t) \neq 0$, but, in general, the deviation from a zero mean is very small. In any case a similar logic is used for the decomposition of the constant and the trend into the space spanned by $\alpha$ and $\beta_\perp$:

$$
\begin{aligned}
\mu_0 &= \alpha\beta_0 + \gamma_0 \\
\mu_1 &= \alpha\beta_1 + \gamma_1
\end{aligned}
\tag{6.19}
$$

By substituting (6.19) in (6.8) we get:

$$\Delta x_t = \alpha\beta' x_{t-1} + \alpha\beta_0 + \alpha\beta_1 t + \gamma_0 + \gamma_1 t + \varepsilon_t, \tag{6.20}$$

and by rearranging (6.20) can be written as:

$$
\Delta x_t = \alpha[\beta', \beta_0, \beta_1]
\begin{bmatrix}
x_{t-1} \\
1 \\
t
\end{bmatrix}
+ \gamma_0 + \gamma_1 t + \varepsilon_t.
$$

Thus, (??) can be reformulated as:

$$\Delta x_t = \alpha\tilde{\beta}'\tilde{x}_{t-1} + \gamma_0 + \gamma_1 t + \varepsilon_t, \tag{6.21}$$

where $\tilde{\beta}' = [\beta', \beta_0, \beta_1]$ and $\tilde{x}_{t-1} = (x_{t-1}, 1, t)'$.

From the above discussion it appears that the constant and the deterministic trend play a double role in the cointegration model and we need to be able to distinguish between the part that belongs to the cointegration relations, $\beta_0$, $\beta_1$, and the part that belongs to the differences, $\gamma_0$, $\gamma_1$. The $\gamma$ components can be interpreted from the equations:

$$E(\Delta x_t) = \gamma_0 + \gamma_1 t, \tag{6.22}$$

i.e. $\gamma_0 \neq 0$ implies linear growth in at least some of the variables as demonstrated in Case 1 in Section 6.1, and $\gamma_1 \neq 0$ implies quadratic trends in the variables.

6.3 Five cases

In empirical work we generally do not know from the outset whether there are linear trends in some of the variables, or whether they cancel in the cointegrating relations or not. The five different models discussed below arise from imposing different restrictions on the deterministic components in (??).

Case 1. $\mu_1, \mu_0 = 0$. This case corresponds to a model with no deterministic components in the data, i.e. $E\Delta x_t = 0$ and $E(\beta' x_t) = 0$, implying that the intercept of every cointegrating relation is zero. As demonstrated in the previous section an intercept is generally needed to account for the initial level of measurements, $X_0$, and only in the exceptional case when the measurements start from zero, or when the measurements cancel in the cointegrating relations, can a zero restriction be justified.

Case 2. $\mu_1 = 0$, $\gamma_0 = 0$ but $\beta_0 \neq 0$, i.e. the constant term is restricted to be in the cointegrating relations. In this case, there are no linear trends in the data, consistent with $E(\Delta x_t) = 0$ in (6.22). The only deterministic component in the model is the intercept of the cointegrating relations, implying that the equilibrium mean is different from zero.

Case 3. $\mu_1 = 0$, and the constant term $\mu_0$ is unrestricted, i.e. no linear trends in the VAR model (6.8), but linear trends in the data (6.9). In this case, there is no trend in the cointegration space, but $E(\Delta x_t) = \gamma_0 \neq 0$ is consistent with linear trends in the variables but, since $\beta_1 = 0$, these trends cancel in the cointegrating relations. It appears that $\mu_0 \neq 0$ implies both linear trends in the data and a non-zero intercept in the cointegration relations.

Case 4. $\gamma_1 = 0$, but $\gamma_0$, $\beta_0$, $\beta_1$ are unrestricted, i.e. the trend is restricted only to appear in the cointegrating relations, but the constant is unrestricted in the model. When $\gamma_1$ is restricted to zero we allow linear, but no quadratic, trends in the data. As illustrated in the previous section, $E(\Delta x_t) = \gamma_0 \neq 0$ implies a linear trend in the level of $x_t$. When, in addition, $\beta_1 \neq 0$ the linear trends in the variables do not cancel in the cointegrating relations, i.e. our model contains 'trend-stationary' variables or trend-stationary cointegrating relations. These can either describe a trend-stationary variable or a trend-stationary cointegration relation. Therefore, the hypothesis that a variable is trend-stationary, for example that the output gap is stationary, can be tested in this model. We will demonstrate in Chapter 9 that such hypotheses can be expressed as testable linear restrictions on the econometric model.

Case 5. No restrictions on $\mu_0$, $\mu_1$, i.e. trend and constant are unrestricted in the model. With unrestricted parameters, the model is consistent with linear trends in the differenced series $\Delta x_t$ and, thus, quadratic trends in $x_t$. Although quadratic trends may sometimes improve the fit within the sample, forecasting outside the sample is likely to produce implausible results. Instead it seems preferable to find out what has caused this approximate quadratic growth, and if possible include more appropriate information in the model (for example, population growth or the proportion of old/young people in a population).

These are only a few examples showing that the role of the deterministic and stochastic components in the cointegrated VAR is quite complicated. Not only is a correct specification important for the model estimates and their interpretation, but also the asymptotic distribution of the rank test depends on the specification of these components. This will be further discussed in Chapter 8.

6.4 The MA representation with deterministic components

For simplicity we will here focus on the derivation of the MA representation when the VAR contains an unrestricted constant and a linear trend. It is straightforward to generalize to other cases. Chapter 5 showed that the MA form of the VAR model (??) can be obtained by inverting:

$$\Delta x_t = C(L)(\varepsilon_t + \mu_0 + \mu_1 t) = [C(1) + C^*(L)(1 - L)](\varepsilon_t + \mu_0 + \mu_1 t) \tag{6.23}$$

and summing:

$$x_t = C(1)\frac{(\varepsilon_t + \mu_0 + \mu_1 t)}{(1 - L)} + C^*(L)(\varepsilon_t + \mu_0 + \mu_1 t) + \tilde{X}_0 \tag{6.24}$$

where $\tilde{X}_0$ contains the effect of the initial values defined so that $\beta'\tilde{X}_0 = 0$. As before:

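$$C(1) = C = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp', \tag{6.25}$$

for $t = 1, \dots, T$. By summing and rearranging (6.24) can be written as:

$$x_t = \underbrace{C\mu_0 t + 0.5C\mu_1 t + 0.5C\mu_1 t^2 + C^*(L)\mu_1 t + C^*(1)\mu_0}_{\text{determ. comp.}} + \underbrace{C\sum\varepsilon_i + C^*(L)\varepsilon_t}_{\text{stoch. comp.}} + \tilde{X}_0. \tag{6.26}$$

Substituting (6.25) in (6.26) we get:

$$x_t = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp'\left[\mu_0 t + 0.5\mu_1 t + 0.5\mu_1 t^2\right] + [C^*(L)\mu_1 t] + C\sum\varepsilon_i + C^*(L)\varepsilon_t + C^*(1)\mu_0 + \tilde{X}_0 \tag{6.27}$$

Substituting (6.14) in (6.27) and focusing on the linear and quadratic trend components we get:

$$
\begin{aligned}
\alpha_\perp'\mu_0 t &= \underbrace{\alpha_\perp'\alpha\beta_0 t}_{0} + \alpha_\perp'\gamma_0 t \\
\alpha_\perp' 0.5\mu_1 t &= 0.5(\underbrace{\alpha_\perp'\alpha\beta_1 t}_{0} + \alpha_\perp'\gamma_1 t) \\
\alpha_\perp' 0.5\mu_1 t^2 &= 0.5(\underbrace{\alpha_\perp'\alpha\beta_1 t^2}_{0} + \alpha_\perp'\gamma_1 t^2),
\end{aligned}
$$

and the MA representation can be written as:

$$x_t = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp'\{\gamma_0 t + 0.5\gamma_1 t + 0.5\gamma_1 t^2\} + C^*(L)\mu_1 t + C^*(1)\mu_0 + C\sum\varepsilon_i + C^*(L)\varepsilon_t + \tilde{X}_0. \tag{6.28}$$

Thus, (6.28) shows that linear trends in the variables can originate from three different sources in the VAR model: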

0. We write (6.0.. the β⊥ component (γ 1 t) of the unrestricted linear trend µ1 t 3.0.0.0.30) we get: (6.... Dpt .1.30) where Dst is a mean-shift dummy (.-1.. Dpt is a permanent intervention dummy (.32) (1 − ρL)yt = φs (1 − ρL)Dst + φp (1 − ρL)Dpt + φtr (1 − ρL)Dtrt +(1 − ρL)y0 + εt .0... the β⊥ component (γ 0 t) of unrestricted constant term µ0 .33) .0.1. the α component (C∗ (L)µ1 t) of the unrestricted linear trend µ1 t 2.) and Dtrt is a transitory shock dummy (. and the residual ut is a ﬁrst order autoregressive process: ut = εt 1 − ρL (6.28) in a more compact form: ˜ xt = C{τ 1 t + τ 2 t2 } + CΣεi + C∗ (L)εt +X0 ..1.1.0. | {z } | {z } 109 (6.31) By substituting (6.5 Dummy variables in a simple regression model Similarly as for the trend and the constant we consider ﬁrst a simple regression model for yt containing three diﬀerent types of dummy variables. DUMMY VARIABLES IN A SIMPLE REGRESSION MODEL 1.28).0....)....31) in (6.1.).5... T (6. and Dtrt : yt = φs Dst + φp Dpt + φtr Dtrt + ut + y0 .0. yt = ρyt−1 + φs Dst − ρφs Dst−1 + φp Dpt − ρφp Dpt−1 +φtr Dtrt − ρφtr Dtrt−1 + (1 − ρ)y0 + εt . = b1 yt−1 + b2 Dst + b3 Dst−1 + b4 Dpt + b5 Dpt−1 +b6 Dtrt + b7 Dtrt−1 + b0 + εt (6. Dst .0. 6. . t = 1...6.29) where τ 1 and τ 2 can be derived from (6.

Thus, the "static" regression model (6.30) with autoregressive errors corresponds to a dynamic model with lagged dummy variables. Note that the effects of the dummy variables can equivalently be formulated as:

$$\phi_s Ds_t - \rho\phi_s Ds_{t-1} = \phi_s\rho\Delta Ds_t + \phi_s(1 - \rho)Ds_t \tag{6.34}$$

$$\phi_p Dp_t - \rho\phi_p Dp_{t-1} = \phi_p\rho\Delta Dp_t + \phi_p(1 - \rho)Dp_t \tag{6.35}$$

$$\phi_{tr} Dtr_t - \rho\phi_{tr} Dtr_{t-1} = \phi_{tr}\rho\Delta Dtr_t + \phi_{tr}(1 - \rho)Dtr_t \tag{6.36}$$

where $\Delta Ds_t$ now becomes an impulse dummy $(\dots, 0, 0, 1, 0, 0, \dots)$ describing a permanent intervention, $\Delta Dp_t$ becomes a transitory blip dummy $(\dots, 0, 0, 1, -1, 0, 0, \dots)$, and $\Delta Dtr_t$ a double transitory blip dummy $(\dots, 0, 0, 1, -2, 1, 0, 0, \dots)$. Hence, (6.32) can be reformulated as:

$$\Delta y_t = -(1 - \rho)y_{t-1} + \phi_s(1 - \rho)Ds_t + [\phi_s\rho + \phi_p(1 - \rho)]Dp_t + [\phi_p\rho + \phi_{tr}(1 - \rho)]Dtr_t + \phi_{tr}\rho\Delta Dtr_t + (1 - \rho)y_0 + \varepsilon_t \tag{6.37}$$

When $\rho = 1$ (6.37) becomes:

$$
\begin{aligned}
\Delta y_t &= \phi_s\Delta Ds_t + \phi_p\Delta Dp_t + \phi_{tr}\Delta Dtr_t + \varepsilon_t \\
&= \phi_s Dp_t + \phi_p Dtr_t + \phi_{tr}\Delta Dtr_t + \varepsilon_t, \quad t = 2, \dots, T,
\end{aligned}
$$

i.e. a shift in the levels of a variable becomes a 'blip' in the differenced variable, a permanent 'blip' in the levels becomes a transitory blip in the differences, and finally a transitory blip in the levels becomes a double transitory blip in the differences.

6.6 Dummy variables and the VAR

Significant interventions and reforms frequently show up as extraordinary large (non-normal) shocks in the VAR analysis, thus violating the normality assumption. We will strongly argue below that it would be a mistake to treat them exclusively as a statistical nuisance to be remedied by appropriately correcting the observations. First we will illustrate the need for intervention dummies and how to model them based on the analysis of the Danish data, then we will discuss more formally how they influence the dynamics of the

VAR model, and finally discuss their significance for the empirical model estimates.

It is often much easier to recognize a possible outlier observation in $\Delta x_t$ than in the levels $x_t$. Thus, a first tentative decision to model a political intervention, for example with a permanent or a transitory blip dummy, can be made by examining the differenced process as illustrated below. The effect on the levels of the process will be discussed subsequently.

A graphical analysis of the residuals based on the unrestricted VAR(2) model for the Danish data showed that the temporary removal of the VAT in 1975 had a very strong impact on real aggregate expenditure² and that the removal of restrictions on capital movements in 1983 caused the yearly long-term bond rate to fall by approximately 10% from a level of ca. 25%, which was internationally high. The huge drop in interest rate was associated with a similarly large increase in aggregate money stock as a result of the possibility among foreigners to hold Danish krones.

² The purpose of the intervention was exactly to boost domestic demand to avoid a depression in the aftermath of the first oil shock.

The first intervention is transitory in the sense that the VAT was removed for one quarter and gradually restored again over two quarters. However, it was clearly meant to have a permanent effect on real aggregate income and we might hypothetically expect both a transitory and a permanent effect. The transitory effect of the VAT intervention should be modelled by a transitory blip dummy, for example $Dtr_t = [0, \dots, 0, 1, -0.5, -0.5, 0, \dots, 0]$, whereas a permanent effect should be modelled by a blip dummy, for example $Dp_t = [0, \dots, 0, 1, 0, \dots, 0]$. The statistical analysis of the VAR model can then be used to find out if the VAT intervention effects were indeed significant and, if this was the case, for which equations.

The removal of VAT is also likely to influence prices in various ways. For example, if we measure prices by a CPI index, then a change in VAT is likely to be seen as a roughly proportional effect. But it is also possible that some producers, in the expectation of a high demand for their products, will increase prices to some extent. In the present illustration we are using the implicit price deflator of GNE as a measure of prices so the effects are likely to be much smaller than for the CPI.

The second intervention is a result of permanently removing restrictions in the capital market and is, therefore, first of all likely to have permanent effects on the system. Nonetheless, markets sometimes overreact and we often

see both a permanent effect and a transitory effect as a result of a significant reform or intervention in the market. The latter would show up as a shock in one period, followed shortly afterwards by a similar shock of opposite sign making the variable return to its previous level. Such transitory shocks are quite common, in particular, for high frequency data. If they are of minor significance and, therefore, not explicitly modelled they are likely to produce some negative residual autocorrelation in the VAR model. But, contrary to permanent shocks they will disappear in cumulation and, therefore, will have no effect on the stochastic trends defined as the cumulative sum of all previous shocks.

Because the VAR model contains both differences and levels of the variables the role of dummy variables (and other deterministic terms) is more complicated than in the usual regression model (Hendry and Juselius, 2001, Johansen, Nielsen, and Mosconi, 2001). An unrestricted mean shift dummy accounts for a mean shift in $\Delta x_t$ and cumulates to a broken trend in $x_t$. An unrestricted permanent blip dummy accounts for a large blip (impulse) in $\Delta x_t$ and cumulates to a level shift in $x_t$. An unrestricted transitory blip dummy accounts for two consecutive blips of opposite signs in $\Delta x_t$ and cumulates to a single blip in $x_t$.

Using dummies to account for extraordinary mean-shifts, permanent blips, and transitory shocks, the CVAR model is reformulated as:

$$\Delta x_t = \Gamma_1\Delta x_{t-1} + \alpha\beta' x_{t-1} + \Phi_s Ds_t + \Phi_p Dp_t + \Phi_{tr} Dtr_t + \alpha\mu_0 + \varepsilon_t, \quad \varepsilon_t \sim \mathrm{Niid}(0, \Omega), \quad t = 1, \dots, T \tag{6.38}$$

where $Ds_t$ is a $d_1 \times 1$ vector of mean-shift dummy variables $(\dots, 0, 0, 0, 1, 1, 1, \dots)$, $Dp_t$ is a $d_2 \times 1$ vector of permanent blip dummy variables $(\dots, 0, 0, 1, 0, 0, \dots)$ and $Dtr_t$ is a $d_3 \times 1$ vector of transitory shock dummy variables $(\dots, 0, 0, 1, -1, 0, 0, \dots)$. To understand the role of the dummies in the CVAR model we use (6.13) to partition the dummy effects into an $\alpha$ and a $\beta_\perp$ component:

$$\Phi_s = \alpha\delta_0 + \delta_1, \tag{6.39}$$

$$\Phi_p = \alpha\varphi_0 + \varphi_1, \tag{6.40}$$

$$\Phi_{tr} = \alpha\psi_0 + \psi_1, \tag{6.41}$$

where $\delta_0 = (\beta'\alpha)^{-1}\beta'\Phi_s$ and $\delta_1 = \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp'\Phi_s$ and $\varphi_0$, $\varphi_1$, $\psi_0$, $\psi_1$ are similarly defined.

It is now possible to investigate the dynamic effects of the dummies from the moving average representation of the model, which defines the variables $x_t$ as a function of $\varepsilon_i$, $i = 1, \dots, t$, the dummy variables $Ds_t$, $Dp_t$ and $Dtr_t$, and the initial values, $\tilde{X}_0$. For model (6.38) it is given by:

$$x_t = C\sum_{i=1}^{t-1}\varepsilon_i + C\Phi_s\sum_{i=1}^{t-1}Ds_i + C\Phi_p\sum_{i=1}^{t-1}Dp_i + C\Phi_{tr}\sum_{i=1}^{t-1}Dtr_i + C^*(L)(\varepsilon_t + \mu_0 + \delta_0 Ds_t + \Phi_p Dp_t + \psi_1 Dtr_t) + \tilde{X}_0 \tag{6.42}$$

where, as before

$$C = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp' \tag{6.43}$$

and $C^*(L)$ is an infinite polynomial in the lag operator $L$. It appears from (6.43) that only the $\beta_\perp$ components of (6.39)-(6.41) will enter with a nonzero coefficient in the summation of the dummy components in (6.42), whereas the $\alpha$ components will have a zero coefficient and, hence, disappear. Thus, dummy variables which are restricted to be in the cointegration relations do not cumulate in $x_t$.

The first summation in (6.42) gives the common stochastic trends generated by the ordinary shocks to the system, the second summation generates a broken linear trend in $x_t$, the third summation a shift in the level of the variables associated with the permanent extraordinary large shock, and the fourth summation a blip in the variables. The second part of (6.42) contains the effects that are not cumulated, including those in the cointegration relations.

A mean shift in a variable $x_{j,t}$ implies a permanent blip in $\Delta x_{j,t}$ and, hence, $\delta_0 \neq 0$ implies $\Phi_p \neq 0$. Hence, to avoid broken linear trend in the data $\delta_1 = 0$ has to be imposed in (6.39), i.e. $\Phi_s = \alpha\delta_0$ is restricted to lie in the $\alpha$ space. Thus, $\delta_0 \neq 0$ describes a mean shift in $\beta' x_t$ as a result of mean shifts in the variables that do not cancel in a cointegrated relation. If $\Phi_s = 0$, then there is no broken trend in the variables nor a mean shift in the cointegration relations, but if $\Phi_p \neq 0$, then $\psi_1 \neq 0$ describes level shifts in the variables that cancel in $\beta' x_t$, whereas $\psi_0 \neq 0$ describes a blip

in $\beta' x_t$, and describes a situation when the blips in the levels of $x_t$ generated by transitory shocks to $\Delta x_t$ do not cancel in $\beta' x_t$. A blip in the variables $x_t$ implies a transitory shock in $\Delta x_t$. Thus, $\psi_0 \neq 0$ is consistent with $\Phi_{tr} \neq 0$. To avoid adding more dummy components we assume here that $\alpha\psi_0 = 0$.

(6.42) shows that a large shock at time $t$, accounted for by the dummies $D_{p,t}$ or $D_{tr,t}$, will influence the variables with the same dynamics as an ordinary shock unless the dummies enter the model with lags. Thus, we need to make a distinction between extraordinary intervention shocks with a permanent effect, for example as a result of central bank or government interventions, and 'ordinary' large shocks, for example as a result of market (over)reaction to various 'news'. Thus, if a dummy variable needs a lag in the model we will consider the corresponding 'intervention' shock to be inherently different from the 'ordinary' shocks, whereas if the dummy is needed only once, at the day of the 'news', we will consider it a big, but nevertheless ordinary, shock (footnote 3). Conceptually we can distinguish between:

• ordinary (normally distributed) random shocks,
• (extra)ordinary large permanent random shocks ($|\hat\varepsilon_{i,t}| > 3.3\hat\sigma_\varepsilon$) described by a blip dummy without lags,
• intervention shocks (large permanent shocks, $|\hat\varepsilon_{i,t}| > 3.3\hat\sigma_\varepsilon$, related to a well-defined intervention) described by a blip dummy with lags,
• transitory large shocks, outliers (typing mistakes, etc.) described by a +/- blip dummy.

The occurrence of transitory shocks in the model, whether large or small, will produce some (usually small) residual autocorrelations in the model and, hence, violate the independence assumption of the VAR model. Because transitory shocks appear unsystematically this problem cannot be solved by increasing the lag length of the VAR or by including a moving average term in the error process. To some extent it can be accounted for by the inclusion of transitory intervention dummies in the model. But only the very large transitory shocks will generally be accounted for by dummies and the

Footnote 3: A similar distinction is between additive outliers, which are extraordinary shocks which are not subject to the VAR dynamics (a typical example is a typing error in the data), and innovational outliers, which after they have occurred are subject to the VAR dynamics.

empirical model is, therefore, likely to exhibit some minor autocorrelations in the residuals.

Similar arguments as were made for the trend and the constant term in the VAR model can be made for intervention dummies. The intervention may have influenced several variables in such a way that the intervention effect is cancelled in a cointegration relation. Alternatively, the intervention may only have affected one of the variables (or several variables but not proportionally with $\beta$), so that the effect does not disappear in a cointegration relation.

Table 1 not yet finished. To be worked out later.

6.7 An illustrative example

Consistent with the discussion above we will now re-estimate the Danish VAR model allowing for a trend restricted to lie in the cointegration space, a shift dummy $Ds83_t = 1$ for $t$ = 1983:1,...,1993:4, 0 otherwise, also restricted to lie in the cointegration space, and a permanent blip dummy $Dp83_t = 1$ for 1983:1, 0 otherwise. The $Dp83_t$ and its lagged value will be included in the model. In addition, we include an unrestricted transitory blip dummy $Dtr75_t = 1$ for $t$ = 1975:4, -0.5 for 1976:1 and 1976:2, 0 otherwise, and three centered seasonal dummies defined by $Dq_t' = [Dq_{1t}, Dq_{2t}, Dq_{3t}]$, where $Dq_{it} = 0.75$ in quarter $i$ and $-0.25$ in quarters $i+1$, $i+2$, $i+3$. The advantage of using centered seasonal dummies is that $\sum_{j=1}^{T} Dq_{ij} = 0$ in samples covering complete years. The VAR model to be estimated is given by:

$$\begin{aligned}
\Delta x_t &= \Gamma_1 \Delta x_{t-1} + \alpha\beta' x_{t-1} + \alpha\beta_0 + \alpha\beta_1 t + \alpha\delta_0 Ds83_t + \Phi_{p.1} Dp83_t + \Phi_{p.2} Dp83_{t-1} + \Phi_{tr} Dtr75_t + \Phi_q Dq_t + \gamma_0 + \varepsilon_t \\
&= \Gamma_1 \Delta x_{t-1} + \alpha\tilde\beta' \tilde x_{t-1} + \Phi_{p.1} Dp83_t + \Phi_{p.2} Dp83_{t-1} + \Phi_{tr} Dtr75_t + \Phi_q Dq_t + \gamma_0 + \varepsilon_t,
\end{aligned}$$

where

$$\tilde\beta = \begin{bmatrix} \beta \\ \beta_0 \\ \beta_1 \\ \delta_0 \end{bmatrix} \quad \text{and} \quad \tilde x_{t-1} = \begin{bmatrix} x_{t-1} \\ 1 \\ t \\ Ds83_t \end{bmatrix}.$$
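The dummy definitions above, and the earlier claim that dummies entering the equation for the differences cumulate in the levels, can be checked with a small sketch. It uses only the Python standard library. The sample start 1973:1 and all variable names are assumptions made for illustration; only the dummy dates and the end date 1993:4 are taken from the text.

```python
from itertools import accumulate

# Quarterly sample as (year, quarter) pairs. The start date 1973:1 is an
# assumption for this sketch; the end date 1993:4 is taken from the text.
dates = [(y, q) for y in range(1973, 1994) for q in range(1, 5)]

# Mean-shift dummy: 1 from 1983:1 onwards, 0 otherwise.
Ds83 = [1.0 if d >= (1983, 1) else 0.0 for d in dates]
# Permanent blip dummy: 1 in 1983:1 only.
Dp83 = [1.0 if d == (1983, 1) else 0.0 for d in dates]
# Transitory blip dummy: +1 in 1975:4, -0.5 in 1976:1 and 1976:2.
transitory = {(1975, 4): 1.0, (1976, 1): -0.5, (1976, 2): -0.5}
Dtr75 = [transitory.get(d, 0.0) for d in dates]


def centered_seasonal(i):
    """Centered seasonal dummy: 0.75 in quarter i, -0.25 in the other quarters."""
    return [0.75 if q == i else -0.25 for (_, q) in dates]


Dq = [centered_seasonal(i) for i in (1, 2, 3)]

# Cumulating a dummy that enters the equation for the differences shows
# its effect on the levels.
level_from_blip = list(accumulate(Dp83))    # level shift, identical to Ds83
trend_from_shift = list(accumulate(Ds83))   # broken linear trend 0,...,0,1,2,3,...
level_from_trans = list(accumulate(Dtr75))  # temporary blip that returns to zero

print(level_from_blip == Ds83)
print(level_from_trans[-1], [sum(d) for d in Dq])
```

The last line illustrates why centered seasonal dummies are convenient: over complete years each of them sums to zero, so they do not cumulate to a trend in the levels.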

116 CHAPTER 6.19 29.58 Complex 0. .8 p-val.04 The 5 largest roots of the characteristic polynomial Real 0. 0.73 0.0 1.2 Jarq.26 0. For example the test for forth order autocorrelation is no longer signiﬁcant and the residuals from the bond rate equation now pass the normality test.88 0.31 0.2 p-val.92 3. DETERMINISTIC COMPONENTS Table 6.16 2 Normality: LM χ (10) = 17.3 1.LM1 χ2 (25) = 2 LM4 χ (25) = 31.9 2.02 0.66 0. 0. 0.1: Speciﬁcation tests for the unrestricted VAR(2) model with dummies. This is probably due to the fact that calculated standard errors of the equations are now smaller and.02 -0.1.1 2.12 -0.66 0. Multivariate tests: Ljung-Box χ2 (425) = 450.10 2.37 0. the ’excess’ kurtosis of the residuals from the deposit rate is now further away from normality and the Jarque-Bera test clearly rejects normality. 0.50 3.87 0. It appears that the model speciﬁcation has improved to some extent.8 10. However.07 Univariate tests: ∆mr ∆y r ∆p ∆Rm ∆Rb ARCH(2) 5. We will ﬁrst have a look at the misspeciﬁcation tests reported in Table 6.87 0.25 Residual autocorr.31 -0.88 0.Bera(2) 1.6 p-val.22 -0. We note that instead of one root almost on the unit circle we have now two fairly large complex roots in which the complex part is very small. The question whether they correspond to approximate unit roots will be tested in Chapter 8.2 4.58 The properties of the residuals from the estimated VAR model with dummies have now changed compared to the unrestricted model of Chapter 4.29 4.1 0.1 1.37 Kurtosis 3. hence. any deviations from normality will now be measured with a more precise ’yardstick’.3 p-val.73 0.00 Modulus 0.9 Skewness 0.

00 0.15 4.01 −0.000 0.00 −0.23 1. We notice that although the estimates of Π and Γ1 have not changed a lot compared to the no-dummy VAR.t−1 Ds83t−1 t   +    0.73 −0.t−1 Rb.000   0.01 0.09 0.05 0.17 0.09 −0.00 0.01 0.008 0.000    −0.10 −3.18 −0.00 −0.29 0.01 0. AN ILLUSTRATIVE EXAMPLE 117       ∆mr t r ∆yt ∆p2 t ∆Rm.41 1.0218 0.00 1.19 −0.t−1     +                +     −0.0       .20   ∆p2 t−1  −0.01 0.001 −0.7.34 −0.01 −0.20 1.83 0.000    0. trace correlation = 0.0143 0.006 0.26 −0.16 0.00 −0.31 0.006 −0.000 −0.001 −0.08 0.04 −0.01 0.00 −0.11 0.00 −0.03 0.002 −0.01 −0.031 −0.00 −0.01 0.12 −1.20 1.0130 0.0 −0.002 −0.01 −0.008 0.61.25 0.01                  Dtr75t Dp83t Dp83t−1 Dq1t Dq2t Dq3t const      + εt     Log(Lmax ) = 1973. the parameters of the implicit money   Ω=    1.0014 .14 −1.0.000    −0.11 −0.001 0.91 t−1 r −0.0 0.000  mr t−1 r yt−1 ∆pt−1 Rm.04 −0.69   ∆yt−1  −0.20 ∆Rb.17 0.00 −0.76 1.08 0.6.t      =       ∆mr −0.01 −0.0011 0.19 −0.t ∆Rb.27 −2.22 −1.87 −4.00 −0.07 0.00 0.03 −0.64 0.09 0.10 −0.1.048 −0.000 −0.28 0.10 −0. σε =  ˆ      0.01 −0.06 0.25   ∆Rm.0 −0.00 0.002 −0.00 −0. log |Ω| = −52.00 0.t−1 −0.00 −0.00 0.01 0.05 −0.02 −0.0 −0.05 0.93 0.03 0.13 0.

demand relation in the first row of the $\Pi$ matrix seem now more reasonable. For example the interest rate elasticities are now more moderately sized. The shift dummy is significant in the money stock and deposit rate equation which is consistent with our prior hypotheses. The linear trend seems to be important in the income and inflation rate equation. The transitory VAT dummy is significant only in the income equation and the 1983 blip dummy is significant in the money stock and the bond rate equation. The lagged 1983 dummy is not significant in any of the equations and we conclude that the 1983 shock can be considered a large, but ordinary shock. Most of the seasonal effects are quite insignificant. The residual correlations are almost unchanged, but the estimated standard errors have decreased quite substantially.

6.8 Conclusions

The normality assumption of $\varepsilon_t$ is frequently not satisfied in empirical VAR models without accounting for such reforms and interventions that have produced extraordinary large residuals. The estimates of the VAR model are generally robust to deviations from normality, but the properties of the VAR estimates are sensitive to the presence of extraordinary large shocks. There are several possibilities:

1. The linear relationship of the VAR model holds approximately.
2. Ordinary and extraordinary shocks are drawn from different distributions.
3. The linear relationship of the VAR model does not hold for large shocks: market reacts differently to ordinary and extraordinary shocks.

It is almost impossible to know beforehand which of the three cases is closest to the truth in a specific empirical application. It depends very much on such aspects as the length of the sample, how frequently the outliers occur and whether positive and negative outliers occur relatively symmetrically, and so on. Neglecting the outlier problem is likely to produce unreliable results. The best advice is to take them seriously.

Chapter 7 Estimation in the I(1) Model

We assume here that the empirical VAR model can describe the data satisfactorily (i.e. is data congruent) so that we can proceed by testing the reduced rank conditions of $\Pi$ under the assumption that $\Gamma = I - \Gamma_1 - \dots - \Gamma_{k-1}$ satisfies the I(1) condition, i.e. there are no I(2) components in the model. Section 7.1 demonstrates how the short-run dynamics can be concentrated out from the general VAR model, Section 7.2 derives the ML estimator of $\beta$ and $\alpha$, Section 7.3 shows how to normalize the cointegration vectors, Section 7.4 discusses the uniqueness of the unrestricted estimates, and Section 7.5 illustrates the estimation procedure.

7.1 Concentrating the general VAR-model

The I(1) condition can be stated as:

$$\Pi = \alpha\beta', \tag{7.1}$$

where $\alpha$ and $\beta$ are $p \times r$ matrices. If $r = p$ then $x_t$ is stationary and standard inference applies. If $r = 0$ then $x_t$ is nonstationary and it is not possible to obtain stationary relations between the levels of the variables by linear combinations. We say that the variables do not have any common stochastic trends and hence cannot move together in the long run. In this case the VAR model in levels becomes a VAR model in differences (without loss of long-run information) and since $\Delta x_t \sim I(0)$ standard inference applies. If $p > r > 0$ then $x_t \sim I(1)$ and there exist $r$ directions into which the process


can be made stationary by linear combinations. These are the cointegrating relations and the question is whether they can be given an interpretation as economic steady-state relations. We consider now a VAR(k) model in ECM form with $\Pi = \alpha\beta'$:

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \dots + \Gamma_{k-1} \Delta x_{t-k+1} + \alpha\beta' \tilde x_{t-1} + \Phi D_t + \varepsilon_t, \tag{7.2}$$

where $t = 1, \dots, T$, $\tilde x_{t-1}$ is $p_1 \times 1$, $p_1 = p + m$, $m$ is the number of deterministic components, such as constant or trend, and the initial values $x_{-1}, \dots, x_{-k}$ are assumed fixed. In (7.2) the time series process $x_t$ is dependent on lagged values $x_{t-1}, \dots, x_{t-k}$. When we estimate the model based on a finite sample it is useful to condition on the first $k$ observations $X_0 = \{x_{-1}, \dots, x_{-k}\}$, i.e. treating them as fixed known parameters. This is particularly the case when the data are nonstationary, simply because it is not meaningful to include the marginal probability of a nonstationary variable in the likelihood function. Note that when choosing the sample period, it is important to make sure that the first observations are not too far away from equilibrium. Otherwise the first initial values might generate explosive roots in the data. We use the following shorthand notation:

$$Z_{0t} = \Delta x_t, \qquad Z_{1t} = \tilde x_{t-1}, \qquad Z_{2t} = [\Delta x_{t-1}, \Delta x_{t-2}, \dots, \Delta x_{t-k+1}, D_t],$$

and write (7.2) in the more compact form

$$Z_{0t} = \alpha\beta' Z_{1t} + \Psi Z_{2t} + \varepsilon_t,$$

where $\Psi = [\Gamma_1, \Gamma_2, \dots, \Gamma_{k-1}, \Phi]$. We will now concentrate out the short-run 'transitory' effects, $\Psi Z_{2t}$, to obtain a 'cleaner' long-run adjustment model. To explain the idea of 'concentration', which is used in many different situations in econometrics, we will first illustrate its use in a multiple regression model.

A digression:
***********************************
It is well known (the Frisch-Waugh theorem) that the OLS estimate of $\beta_{2.1}$ in the linear regression model:

$$y_t = \beta_{1.2} x_{1t} + \beta_{2.1} x_{2t} + \varepsilon_t$$

can be obtained in two steps:

• 1.a. Regress $y_t$ on $x_{1t}$, obtaining the residual $u_{1t}$ from $y_t = \hat b_1 x_{1t} + u_{1t}$
• 1.b. Regress $x_{2t}$ on $x_{1t}$, obtaining the residual $u_{2t}$ from $x_{2t} = \hat b_2 x_{1t} + u_{2t}$
• 2. Regress $u_{1t}$ on $u_{2t}$, to obtain the estimate of $\beta_{2.1}$, i.e.:

$$u_{1t} = \beta_{2.1} u_{2t} + \text{error}.$$

Hence, we first concentrate out the effect of $x_{1t}$ on both $y_t$ and $x_{2t}$, and then regress the "cleaned" $y_t$, i.e. $u_{1t}$, on the "cleaned" $x_{2t}$, i.e. $u_{2t}$.
*************************************************

We now use the same idea on the VAR-model. First we define the auxiliary regressions:

$$Z_{0t} = \hat B_1' Z_{2t} + R_{0t}, \qquad Z_{1t} = \hat B_2' Z_{2t} + R_{1t} \tag{7.3}$$

where $\hat B_1' = M_{02} M_{22}^{-1}$ and $\hat B_2' = M_{12} M_{22}^{-1}$ are OLS estimates and $M_{ij} = \Sigma_t (Z_{it} Z_{jt}')/T$. Thus $M_{ij} - \bar Z_i \bar Z_j'$ are the empirical counterparts of the covariance matrices $\Sigma_{ij}$ discussed in Chapter 3. The following scheme shows how these are defined for the VAR(k) model:

$$\begin{array}{l|ccc}
 & \Delta x_t & \Delta x_{t-1}, \dots, \Delta x_{t-k+1} & x_{t-1} \\ \hline
\Delta x_t & M_{00} & M_{02} & M_{01} \\
\Delta x_{t-1}, \dots, \Delta x_{t-k+1} & M_{20} & M_{22} & M_{21} \\
x_{t-1} & M_{10} & M_{12} & M_{11}
\end{array}$$

The concentrated model

$$R_{0t} = \alpha\beta' R_{1t} + \text{error} \tag{7.4}$$

is important for understanding both the statistical and economic properties of the VAR model. In the form (7.4) we have transformed the original ”messy” VAR containing short-run adjustment and intervention eﬀects into ”the baby model” form, in which the adjustment exclusively takes place towards the long-run steady-state relations. This means that we not only have transformed the ”dirty” empirical model into a nice statistical model but also into a more interpretable economic form.
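As a minimal numerical sketch of the concentration step, the code below builds $Z_{0t}$, $Z_{1t}$ and $Z_{2t}$ for a VAR(2) with an unrestricted constant, computes the residuals $R_{0t}$ and $R_{1t}$ from the auxiliary regressions (7.3), and checks the Frisch-Waugh property that the unrestricted estimate of $\Pi$ is the same whether it is obtained from the concentrated model or from the full regression. The data are simulated random walks, not the Danish data, and all names (`resid`, `Pi_conc`, `Pi_full`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, p = 200, 3

# Hypothetical data: a p-dimensional random walk (not the Danish data).
x = np.cumsum(rng.standard_normal((T + 2, p)), axis=0)
dx = np.diff(x, axis=0)                      # dx[i] = x[i+1] - x[i]

# VAR(2) in ECM form with an unrestricted constant as the only D_t:
Z0 = dx[1:]                                  # Delta x_t
Z1 = x[1:-1]                                 # x_{t-1}
Z2 = np.hstack([dx[:-1], np.ones((T, 1))])   # [Delta x_{t-1}, 1]


def resid(Y, Z):
    """Residuals from the OLS regression of Y on Z."""
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return Y - Z @ B


R0 = resid(Z0, Z2)   # Delta x_t cleaned for short-run effects
R1 = resid(Z1, Z2)   # x_{t-1} cleaned for short-run effects

# Unrestricted Pi from the concentrated model R0t = Pi R1t + error ...
Pi_conc = np.linalg.lstsq(R1, R0, rcond=None)[0].T
# ... and from the full regression of Z0t on [Z1t, Z2t].
Pi_full = np.linalg.lstsq(np.hstack([Z1, Z2]), Z0, rcond=None)[0][:p].T

print(np.allclose(Pi_conc, Pi_full))
```

The product moment matrices $S_{ij}$ used in the next section are then simply `R_i.T @ R_j / T` for these residual matrices.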

7.2 Derivation of the ML estimator

Consider the concentrated model (7.4):

$$R_{0t} = \alpha\beta' R_{1t} + \varepsilon_t, \qquad t = 1, \dots, T,$$

where $\varepsilon_t \sim N_p(0, \Omega)$. The ML estimator is derived in two steps: First we assume that $\beta$ is known and derive an estimator of $\alpha$ under the assumption that $\beta' R_{1t}$ is a known variable. Then we insert $\alpha = \hat\alpha(\beta)$ in the expression for the maximum of the likelihood function so that it becomes a function of $\beta$, but not of $\alpha$. We then find the value of $\hat\beta$ that maximizes the likelihood function. When we have found the ML estimator of $\beta$ we can then find $\hat\alpha = \hat\alpha(\hat\beta)$.

Step 1. The ML estimator of $\alpha$ given $\beta$ corresponds to the standard LS estimator. It can be derived by post-multiplying (7.4) with $R_{1t}'\beta$ and dropping the error term:

$$R_{0t} R_{1t}' \beta = \alpha\beta' R_{1t} R_{1t}' \beta.$$

Summing over $t$, and dividing by $T$ gives:

$$S_{01}\beta = \alpha\beta' S_{11}\beta$$

where $S_{ij} = T^{-1} \Sigma_t R_{it} R_{jt}' = M_{ij} - M_{i2} M_{22}^{-1} M_{2j}$. It is now easy to derive the least squares estimator of $\alpha$ as a function of $\beta$:

$$\hat\alpha(\beta) = S_{01}\beta(\beta' S_{11}\beta)^{-1} \tag{7.5}$$

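Step 2. Given the assumption of multivariate normality, the maximum of the likelihood function of (7.4) is determined, apart from constant terms, by the determinant of the error covariance matrix as a function of fixed $\beta$ and $\alpha$:

$$L_{\max}^{-2/T}(\beta, \alpha) = |\hat\Omega(\beta, \alpha)| + \text{constant terms}$$

where

$$\hat\Omega(\beta, \alpha) = T^{-1} \sum_t (R_{0t} - \alpha\beta' R_{1t})(R_{0t} - \alpha\beta' R_{1t})' \tag{7.6}$$

$$\begin{aligned}
&= T^{-1}\Big(\sum_t R_{0t}R_{0t}' - \sum_t R_{0t}R_{1t}'\beta\alpha' - \alpha\beta'\sum_t R_{1t}R_{0t}' + \alpha\beta'\sum_t R_{1t}R_{1t}'\beta\alpha'\Big) \\
&= S_{00} - S_{01}\beta\alpha' - \alpha\beta' S_{10} + \alpha\beta' S_{11}\beta\alpha'
\end{aligned} \tag{7.7}$$

By substituting (7.5) in (7.7) we can express the error covariance matrix as a function exclusively of $\beta$:

$$\hat\Omega(\beta) = S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10} + S_{01}\beta\underbrace{(\beta' S_{11}\beta)^{-1}\beta' S_{11}\beta}_{I}(\beta' S_{11}\beta)^{-1}\beta' S_{10}.$$

Hence:

$$\hat\Omega(\beta) = S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10} \tag{7.8}$$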
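The two steps can be checked numerically. The sketch below is an illustration under stated assumptions: $R_{0t}$ and $R_{1t}$ are simulated stand-ins for the concentrated residuals, $\beta$ is an arbitrary "known" vector with $r = 1$, and the names `alpha_hat`, `omega` and `omega_beta` are hypothetical. It verifies that (7.5) coincides with the OLS regression of $R_{0t}$ on $\beta' R_{1t}$, that inserting (7.5) in (7.7) gives (7.8), and that another choice of $\alpha$ yields a larger determinant.

```python
import numpy as np

rng = np.random.default_rng(1)
T, p, r = 300, 3, 1

# Simulated stand-ins for the concentrated residuals R1t and R0t (T x p each).
R1 = rng.standard_normal((T, p))
beta = np.array([[1.0], [-0.5], [0.2]])        # a "known" beta, p x r
alpha_true = np.array([[-0.4], [0.1], [0.0]])  # used only to simulate R0t
R0 = R1 @ beta @ alpha_true.T + 0.5 * rng.standard_normal((T, p))

S00, S01, S11 = R0.T @ R0 / T, R0.T @ R1 / T, R1.T @ R1 / T
S10 = S01.T
bSb_inv = np.linalg.inv(beta.T @ S11 @ beta)

# Step 1: least squares estimator of alpha for given beta, equation (7.5).
alpha_hat = S01 @ beta @ bSb_inv
# The same estimate from an explicit OLS regression of R0t on beta'R1t.
alpha_ols = np.linalg.lstsq(R1 @ beta, R0, rcond=None)[0].T


def omega(alpha):
    """Error covariance matrix for fixed beta and given alpha, equation (7.7)."""
    return (S00 - S01 @ beta @ alpha.T - alpha @ beta.T @ S10
            + alpha @ beta.T @ S11 @ beta @ alpha.T)


# Step 2: covariance matrix as a function of beta only, equation (7.8).
omega_beta = S00 - S01 @ beta @ bSb_inv @ beta.T @ S10

print(np.allclose(alpha_hat, alpha_ols))
print(np.allclose(omega(alpha_hat), omega_beta))
print(np.linalg.det(omega(alpha_hat + 0.1)) > np.linalg.det(omega_beta))
```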

The ML estimator of $\beta$ is given by the estimate $\hat\beta$ that minimizes $|\hat\Omega(\beta)|$. To derive the estimator we use the following result:

$$\begin{vmatrix} A & B \\ B' & C \end{vmatrix} = |A| \cdot |C - B'A^{-1}B| = |C| \cdot |A - BC^{-1}B'| \tag{7.9}$$

where $A$ and $C$ are nonsingular square matrices and $B$ is of conformable dimension. Now substitute:

$$S_{00} = A, \qquad \beta' S_{11}\beta = C, \qquad S_{01}\beta = B$$

in (7.9), resulting in:

$$|S_{00}| \cdot |\beta' S_{11}\beta - \beta' S_{10}S_{00}^{-1}S_{01}\beta| = |\beta' S_{11}\beta| \cdot \underbrace{|S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10}|}_{|\hat\Omega(\beta)|}$$

Hence,

$$|\hat\Omega(\beta)| = |S_{00}| \cdot \frac{|\beta' S_{11}\beta - \beta' S_{10}S_{00}^{-1}S_{01}\beta|}{|\beta' S_{11}\beta|} = |S_{00}| \cdot \frac{|\beta'(S_{11} - S_{10}S_{00}^{-1}S_{01})\beta|}{|\beta' S_{11}\beta|}$$

Using the result that the extreme values of the function

$$f(X) = \frac{|X'MX|}{|X'NX|} \tag{7.10}$$

are found by solving the eigenvalue problem $|\rho N - M| = 0$, we can obtain a solution for $\beta$ that minimizes $|\hat\Omega(\beta)|$. We first substitute:

$$M = S_{11} - S_{10}S_{00}^{-1}S_{01}, \qquad N = S_{11}, \qquad X = \beta$$

in (7.10) to formulate the eigenvalue problem, the solution of which gives the estimates of $\beta$:

$$|\rho S_{11} - S_{11} + S_{10}S_{00}^{-1}S_{01}| = 0$$

or equivalently:

$$|\underbrace{(1 - \rho)}_{\lambda} S_{11} - S_{10}S_{00}^{-1}S_{01}| = 0 \tag{7.11}$$

The solution gives $p$ eigenvalues $\lambda_1, \dots, \lambda_p$ and $p$ eigenvectors $V_1, \dots, V_p$ and we can now express the determinant of the residual covariance matrix as:

$$|\hat\Omega| = |S_{00}| \prod_{i=1}^{p} (1 - \lambda_i) \tag{7.12}$$

Note that the relations $\hat\beta_i' x_t$ are ordered according to $\hat\lambda_1 > \dots > \hat\lambda_p > 0$ and the magnitude of $\lambda_i$ is a measure of the 'stationarity' of the corresponding $\hat\beta_i' x_t$. The next chapter will discuss how to classify the $p$ relations into $r$ stationary relations corresponding to the $r$ largest eigenvalues and $p - r$ non-stationary relations corresponding to the $p - r$ smallest eigenvalues. The cointegration vectors $V_i' x_t$ are not yet normalized on a variable and in the next section we will discuss how to choose an appropriate normalization for each vector. The normalized vectors will be called $\beta_i$ to distinguish them from the non-normalized vectors.

7.3 Normalization

To be able to interpret a cointegration relation as a relation to be primarily associated with a particular economic variable we need to normalize the former by setting the coefficient of the latter to be unity. This is similar to what we do in a regression model, $x_{1t} = \beta_0 + \beta_1 x_{2t} + \beta_3 x_{3t} + u_t$, when we choose one of the variables, $x_{1t}$, to be the 'dependent' variable, i.e. to have a unitary coefficient. In a regression model with stochastic variables it might happen that we choose the 'wrong' variable to be the dependent variable. This would be the case if it turns out that the regression, $x_{2t} = \tilde\beta_0 + \tilde\beta_1 x_{1t} + \tilde\beta_3 x_{3t} + e_t$, gives more interpretable coefficient estimates and improved statistical properties.

In an analogous manner the choice of normalization of a cointegrating relation should make sense economically as well as statistically. For example, normalizing on an insignificant or irrelevant coefficient does not make sense, i.e. normalizing on money stock, say, in a relation describing real income seems a bad choice. There is, however, an important difference between a regression model and a cointegration relation. Normalizing on either $x_{1t}$ or $x_{2t}$ in the regression model generally changes the estimates of the regression coefficients, whereas in a cointegration relation the ratios between coefficients are the same independent of the chosen normalization. In this sense the coefficient estimates in a cointegration relation are more 'canonical'.

7.4 The uniqueness of the unrestricted estimates

For a given choice of the number of stationary cointegrating relations, $r$, the Johansen procedure gives the maximum likelihood estimates of the unrestricted cointegrating relations $\beta' x_t$. In Chapter 6 we gave these estimates a first tentative interpretation in terms of underlying steady-state relations. In this section we will discuss whether such an interpretation is at all meaningful. The unrestricted estimates of $\alpha$ and $\beta$ are calculated given the following conditions:

1. Stationarity, $\hat\beta' x_t \sim I(0)$. How to determine $r$ will be discussed in the next chapter.
2. Conditional independence of $\hat\beta_j' x_t$, i.e. $\hat\beta' S_{11} \hat\beta = I$, where $S_{11}$ was defined at the beginning of this chapter.
3. The ordering given by the maximal conditional correlation with the stationary process $\Delta x_t$.

Given the above criteria the unrestricted cointegrating relations are uniquely determined and possibly meaningful if the former can be considered relevant. Because it happens quite frequently that the unrestricted cointegration relations are interpretable, it may be of some interest to discuss whether the three conditions (or rather the last two, since stationarity is clearly mandatory) are reasonable from an economic point of view. The conditional independence condition is a consequence of the chosen eigenvalue normalization, $\hat\beta' S_{11} \hat\beta = I$, as a result of analyzing the likelihood

function. It is a purely statistical condition and is arbitrary in the sense that we could have chosen another normalization. The third statistical criterion, the maximal correlation with the stationary part of the process, does not seem easily interpretable as a meaningful economic criterion. Because the conditional orthogonality condition surprisingly often seems to produce economically interpretable relations it is tempting to look for some regularity in macroeconomic behavior which could be associated with this purely statistical condition.

If the empirical problem is about macroeconomic behavior in a market where equilibrating forces are allowed to work without binding restrictions, at least in the long-run, one would generally expect two types of agents with disparate goals interacting in such a way that equilibrium is restored once it has been violated. These can be demanders versus suppliers, producers versus consumers, employers versus employees, etc. Therefore, if we choose a sufficiently rich set of variables for the VAR analysis, economic theory would generally suggest at least two (but often more) stationary long-run relationships. The question is whether we should expect them to be conditionally independent. A somewhat heuristic guess is that the conditional independence may produce empirically interpretable relations when the VAR model contains sufficiently many variables to identify these hypothetical long-run relations. For example, to be able to empirically identify a long-run demand and a supply relation among the unrestricted cointegration relations there should be at least one variable which is strongly influencing the demand behavior but unrelated to the supply behavior and vice versa. This is, of course, the basic idea behind identification which will be discussed in great detail in Chapter 10, where we discuss how to obtain unique estimates without the need to impose the condition of conditional independence. Thus, even if a direct interpretation of the unrestricted cointegration vectors is sometimes possible, the results should be considered indicative rather than conclusive, and cannot replace formal testing of structural hypotheses.

7.5 An illustration

Solving the eigenvalue problem (7.11) for the Danish data produced the five eigenvalues and the corresponding eigenvectors reported in the first part of Table 7.1. The eigenvectors are calculated based on the normalization $v' S_{11} v = I$. The ordering is based on the magnitude of $\lambda_i$ so that the first

relation $v_1' x_t$ is most strongly correlated with the stationary part of the process. The squared canonical correlation coefficient $\lambda_1 = 0.58$ corresponds to a correlation coefficient $\sqrt{0.58} = 0.76$. We note that the last eigenvalue $\lambda_5$ is quite close to zero. The question is when the value of $\lambda_i$ is small enough not to be significantly different from zero. The next chapter will deal with this important and difficult issue.

For each eigenvector $v_i$ there is a corresponding vector of weights (loadings) $w_i = S_{01} v_i$ satisfying $\sum_{i=1}^{p} w_i v_i' = \Pi$. The coefficients of the eigenvectors $v_i$ reported in Table 7.1 are generally quite large and the coefficients of the weights $w_i$ correspondingly small. Without an adequate normalization it is hard to see what they mean and the first task is to normalize (footnote 1) the eigenvectors by an element $v_{ij}$ as follows:

$$\beta_{i\cdot} = v_i v_{ij}^{-1}, \quad i = 1, \dots, p, \qquad \alpha_{i\cdot} = w_i v_{ij}, \quad i = 1, \dots, p.$$

The choice of normalization element can be done arbitrarily, but to be able to interpret the results one needs to normalize on a variable that is 'representative' for the relation. For example, normalizing on $\Delta p$ in the first relation means that the first relation should in some vague sense describe a relation for inflation rate (with significant equilibrium correction in the inflation rate equation). The first choice of normalization is tentative by nature and will often change as a result of more detailed inspection of the results. To distinguish between the non-normalized and the normalized vectors we use the notation $v_{i\cdot}$ and $w_{i\cdot}$ for the former and $\beta_{i\cdot}$ and $\alpha_{i\cdot}$ for the latter.

The normalized vectors are reported in the middle part of Table 7.1. The first vector has been normalized on $\Delta p$, the second on $m^r$, the third on $R_m$, the fourth on $y^r$, and finally the fifth on $R_b$. The estimated unrestricted cointegration vectors may or may not make economic sense. As already mentioned they are uniquely defined based on (1) the ordering of the $\lambda_i$ and (2) the choice of eigenvector normalization $\beta' S_{11} \beta = I$, both of which are statistical criteria without any obvious economic interpretation. Nevertheless, we will illustrate below that a careful inspection of the first estimation results can often be crucial for a successful completion of the empirical exercise.

Footnote 1: We will further discuss normalization of cointegration vectors in Section 9.
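The computations described above (solving (7.11), forming the weights $w_i = S_{01} v_i$, and normalizing each vector on one element) can be sketched as follows. This is a minimal illustration on simulated residuals, not a reproduction of Table 7.1. The eigenvalue problem is solved through a Cholesky factorization of $S_{11}$, which is one of several possible numerical routes and is an assumption of this sketch, as are all variable names and the choice to normalize vector $i$ on its own $i$-th element.

```python
import numpy as np

rng = np.random.default_rng(2)
T, p = 400, 3

# Simulated stand-ins for the concentrated residuals (not the Danish data).
R1 = rng.standard_normal((T, p))
Pi_true = np.array([[-0.5, 0.2, 0.0],
                    [0.1, -0.3, 0.1],
                    [0.0, 0.0, -0.05]])
R0 = R1 @ Pi_true.T + rng.standard_normal((T, p))

S00, S01, S11 = R0.T @ R0 / T, R0.T @ R1 / T, R1.T @ R1 / T
S10 = S01.T

# Solve |lambda*S11 - S10 S00^{-1} S01| = 0 via S11 = L L', which turns the
# problem into an ordinary symmetric eigenvalue problem.
L = np.linalg.cholesky(S11)
L_inv = np.linalg.inv(L)
K = L_inv @ S10 @ np.linalg.inv(S00) @ S01 @ L_inv.T
lam, U = np.linalg.eigh((K + K.T) / 2)   # eigh returns ascending eigenvalues
lam, U = lam[::-1], U[:, ::-1]           # reorder: lambda_1 > ... > lambda_p

V = L_inv.T @ U      # eigenvectors, normalized so that V' S11 V = I
W = S01 @ V          # weights (loadings) w_i = S01 v_i
Pi_hat = W @ V.T     # sum_i w_i v_i'

# Residual covariance matrix of the unrestricted (full rank) model.
Omega = S00 - S01 @ np.linalg.inv(S11) @ S10

# Normalize vector i on element j = i (an arbitrary illustrative choice).
v_jj = np.diag(V)
beta = V / v_jj      # beta_i  = v_i / v_ij  (column i divided by v_ii)
alpha = W * v_jj     # alpha_i = w_i * v_ij

print(np.round(lam, 3))
print(np.allclose(V.T @ S11 @ V, np.eye(p)))
print(np.isclose(np.linalg.det(Omega), np.linalg.det(S00) * np.prod(1 - lam)))
print(np.allclose(alpha @ beta.T, Pi_hat))
```

The last check shows that normalization rescales $\alpha$ and $\beta$ without changing their product, which is why the choice of normalization element does not affect $\Pi$.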

0 0.8) 1.20 0.9 -208.0) 0.00 (−1.6 -1.1: Estimated eigenvalues.03 −0.00 .1) 0.12 -0.0 0.5) (−2.2) 0.06 -0.00 The weights to the eigenvectors: αi α·1 α·2 α·3 α·4 α·5 ∆mr −0.28 0.001 (−1.00 0.00 0.2 25.04 4.7) (2.01 0.0) 0.1) 0.00 β 03 0 β4 -0. AN ILLUSTRATION 129 Table 7.00 0.1 -350.02 mr t −0.62 (−9.00 (0.9) (0.84 (−1.04 (−0.04 1.1) (−0.01 -0.9) (−0.5 17.06 1.0 0 -25.27 −0.2) (−1.5 -7.5) (0.2 19.01 0.2 327.01 −0.06 (−0.1) (1.01 -0.02 0.10 −0.01 (−2.00 (1.5) (−2.00 0.2) −1.00 (−0.3) 0.t 0.7) (−3.01 (1.25 −0.92 -0.00 β 02 1.6) (−5.96 (−3.6 2.0 -10.37 -0.11 (0.0 0.21 −0.9 0.36 v2 0.6) (−0.15 (0.00 -0.08 −0.2 -64.7 5.7 0.41 0.7 24.1) (0.7) (0.26 −0.5 -82.34 −0.9 3.6) −0.00 0.00 −0.02 (−0.00 (0.18 −0.04 (3.1) (−4.80 (1.1) (2.95 -0.1) 0.5) 0.7 199.8 -2. eigenvectors.3) 0.06 (0.33 0.63 0.5) ∆mr t r ∆yt The combined eﬀects : Π r yt ∆pt Rmt Rbt 0.8 -0.9 -3.00 −0. and loadings for the Danish data 0 Non-normalized eigenvectors: vi r r λi m y ∆p Rm Rb D83 trend 0 0.66 0.06 v 05 0 Normalized eigenvectors: β i 0 β1 0.6) (−1.6) (1.9) (−1.11 D83 t 0.05 (1.0 0 0.7) ∆2 pt ∆Rm.31 0.6 0.01 (0.6) (0.2 556.8) r ∆yt −0.3 40.58 v1 -1.01 −0.67 -.5) (1.5) (−0.94 (1.68 -12.03 0.5) trend 0.2) (0.11 −1.3 -0.21 −0.6) (0.00 β 05 0.30 (−1.00 −0.t ∆Rb.03 -0.8) −0.2) 0.5.97 13.0) (1.1 68.8 183.9) 0.9) (1.00 -1.2) (−0.01 0.60 0.8) −0.7) −0.07 1.09 (0.17 t (−0.8) (−2.7) −1.01 (0.2) −0.00 -1.1) (0.12 v4 -13.00 0.9) (3.1) (−1.7.9 3.3) ∆2 pt ∆Rm.00 -0.0) −0.5) −0.54 1.26 v 03 4.29 (−4.21 (−9.01 0.t 0.t ∆Rb.4) 0.2 -122.3) (3.00 −0.11 −1.01 0.7 -339.82 −4.7) 0.

To be able to discriminate between significant and less significant $\alpha_i$ coefficients we have reported the least squares standard errors of estimates. Note, however, that the 't' values are distributed as Student's t only if the corresponding $\beta_i' x_t$ is stationary. If this is not the case, then a Dickey-Fuller type distribution is probably more appropriate.

Based on the estimated $\alpha$ coefficients we note that (i) the first relation is significantly adjusting only in the inflation rate equation, (ii) the second relation only in the money stock equation, (iii) the third relation only in the deposit rate equation, (iv) the fourth relation only in real income, but with a t-value which would hardly be significant based on a Dickey-Fuller distribution, and (v) the last relation is not really important in any equation. Since a stationary variable, $\Delta x_t$, cannot be significantly explained by a nonstationary variable, this is a sign of nonstationarity of the last two eigenvectors. The graphs in Figure 7.1-7.5 of $\beta_i' x_t$ and $\beta_i' R_{1t}$, where $\beta_i' R_{1t}$ is derived from the concentrated model (7.4), seem to support this interpretation.

The finding that the cointegration vectors are significant in just one equation each is a fortunate (and atypical) situation. In this case we already know from the outset of the cointegration analysis that three of the variables, real money, inflation and deposit rate, are equilibrium error correcting, whereas the remaining two, the real income and the bond rate, are not. As a zero row of $\alpha$ is the condition for weak exogeneity w.r.t. the long-run parameters $\beta$, this tentative finding suggests that real income and bond rate are weakly exogenous in this model, an issue that will be further discussed in Chapter 9. This is supported by the finding of no significant coefficients in the fifth row of the $\Pi$ matrix corresponding to the bond rate equation and only a significant trend coefficient in the second row corresponding to the income equation.

Because each cointegration relation $\beta_i' x_t$ was found to be important in just one equation, we will tentatively try to interpret them as potential steady-state relations in respective equations. The first relation seems approximately to describe a relation between real deposit rate and the interest rate spread:

$$(R_m - \Delta p) = 0.6(R_b - R_m) + \dots \tag{7.13}$$

It was found to be important in the inflation rate equation where the sign of the adjustment coefficient suggests equilibrium error correction. Hence, inflation rate seems to have adjusted upwards when the real short-term interest rate has been above 0.6 times the long-short interest rate spread.

The second relation resembles a typical money demand relation where the opportunity cost of holding money is measured by the spread between the bond rate and the deposit rate:

$$m^r = y^r - 13.5(R_b - R_m) + \dots \tag{7.14}$$

Only money stock is significantly adjusting to this relation with a negative sign of the coefficient $\alpha_{12}$. This suggests that money stock is equilibrium error correcting to agents' demand for money.

The third relation seems to describe an interest rate relation:

$$R_m = 0.4 R_b + \dots \tag{7.15}$$

Only the deposit rate seems to be significantly adjusting to this relation. Again the coefficient suggests it is equilibrium error correcting.

The fourth relation, which is probably nonstationary, seems most important for the real income equation. Normalizing on real income gives the following relation:

$$y^r = 0.3 m^r - 1.4(R_m - R_b) + \dots \tag{7.16}$$

which resembles an IS-curve relationship with positive real money effects. The last relation seems definitely nonstationary and we will not attempt to interpret it.

It has been debated whether it is at all meaningful even tentatively to interpret the estimated eigenvectors and in some cases the unrestricted relations are clearly not interpretable, and it does not make sense to attempt to do so. But, surprisingly often the first unrestricted estimates give a rough picture of the basic long-run information in the data. The latter can subsequently be used to facilitate the identification of an acceptable structure of cointegration relations as argued below: If we assume that the cointegration rank is three in the above example, then the first three eigenvectors (7.13)-(7.15) define stationary relations (provided that the coefficients we tentatively set to zero were in fact

zero). This can be formally tested based on a LR test procedure discussed in Chapter 9 and if accepted we have identified three tentatively interpretable cointegration vectors spanning the cointegration space.

The last part of Table 7.1 reports the estimates of the unrestricted $\Pi$ based on full rank. First note that $\pi_{i\cdot}' \tilde x_t$, $i = 1, \dots, p$, defines a stationary relation only when $\Pi = \alpha\beta'$, where $\alpha$ and $\beta$ are $p \times r$ and $(p + m) \times r$ matrices and $m$ is the number of deterministic components estimated to be proportional to $\alpha$. Therefore the t-values in the brackets cannot be interpreted as Student's t, because some of the $\beta' x_t$ relations are nonstationary.

How should we proceed after this first inspection of the results? In my view already at this stage we need to adjust our intuition of how the economic and the empirical model work together. Assume now that the economic model we had in mind contained two steady-state relations: a money demand relation and an aggregate income relation, but due to a ceteris paribus assumption no prior relation for inflation rate and the interest rates. One possibility is to go back to the economic model and see whether it is possible to understand the long-run weak exogeneity of the real income variable and the long-term bond rate and whether it is possible to make inflation and the deposit rate enter the model. Another possibility is to reconsider the choice of variables and find out whether an extended empirical model would be more consistent with the chosen economic model. For example in the above example we found that the fourth relation, though probably nonstationary, exhibited coefficients which resembled an income relation. This could suggest that there is an important omitted I(1) variable, for example real exchange rate, which is needed for the income relation to become stationary. In some cases the economic model needs only minor modifications, in other cases the model needs more fundamental changes. The other alternative, which is to force your economic model onto the data, squeezing the reality into 'all-too-small-size clothes', is a too frustrating experience which all too often makes the desperate researcher choose solutions which are not scientifically justified.

Thus, a first tentative inspection of the empirical results might at an early stage of the analysis suggest how to modify either your empirical or your economic model. This is in my view one way of translating Haavelmo's

Figure 7.1. The first cointegration relation $\beta_1' x_t$ (upper panel) and $\beta_1' R_{1t}$ corrected for short-run effects (lower panel).

Figure 7.2. The second cointegration relation $\beta_2' x_t$ (upper panel) and $\beta_2' R_{1t}$ corrected for short-run effects (lower panel).

Figure 7.3. The third cointegration relation $\beta_3' x_t$ (upper panel) and $\beta_3' R_{1t}$ corrected for short-run effects (lower panel).

Figure 7.4. The fourth cointegration relation $\beta_4' x_t$ (upper panel) and $\beta_4' R_{1t}$ corrected for short-run effects (lower panel).

Figure 7.5. The fifth cointegration relation $\beta_5' x_t$ (upper panel) and $\beta_5' R_{1t}$ corrected for short-run effects (lower panel).

Chapter 8 Determination of Cointegration Rank
Section 8.1 gives the basic results for the derivation of the Likelihood Ratio test of the cointegration rank and discusses whether there is an optimal sequence of these tests. Section 8.2 discusses the derivation of the asymptotic tables and how the presence of deterministic components influences these tables. Section 8.3 discusses the difficult choice of the cointegration rank in a practical situation, Section 8.4 provides an empirical illustration, and Section 8.5 reports on some diagnostic tools for checking parameter constancy and illustrates them with an analysis of the Danish data.

8.1 The LR test for cointegration rank

The LR test for the cointegration rank $r$ is based on the VAR model in the R-form (??), where all short-run dynamics, dummies and other deterministic components have been concentrated out. Using (??) and (??) we can write the log likelihood function as:

$$-2\ln L(\beta) = T\ln|S_{00}| + T\sum_{i=1}^{p}\ln(1-\lambda_i), \tag{8.1}$$

where the eigenvalues are calculated from the determinant equation

$$|\lambda S_{11} - S_{10}S_{00}^{-1}S_{01}| = 0 \tag{8.2}$$


giving the solution $(\lambda_1, \lambda_2, \ldots, \lambda_p)$. The eigenvalues $\lambda_i$ can be interpreted as the squared canonical correlation between a linear combination of the levels, $\beta_i' R_{1t-1}$, and a linear combination of the differences, $\omega_i' R_{0t}$. In this sense the magnitude of $\lambda_i$ is an indication of how strongly the linear relation $\beta_i' R_{1t-1}$ is correlated with the stationary part of the process $R_{0t}$. Another way of expressing this is by noticing that $\mathrm{diag}(\hat\lambda_1, \ldots, \hat\lambda_r) = \hat\alpha' S_{00}^{-1}\hat\alpha = \hat\beta' S_{10}S_{00}^{-1}S_{01}\hat\beta$, i.e. $\hat\lambda_i$ is related to the estimated $\hat\alpha_i$. When $\lambda_i = 0$ the linear combination $\beta_i' x_t$ is nonstationary and there is no equilibrium correction, i.e. $\alpha_i = 0$.

The statistical problem is to derive a test procedure to discriminate between those $\lambda_i$, $i = 1, \ldots, r$, which correspond to stationary relations and those $\lambda_i$, $i = r+1, \ldots, p$, which correspond to nonstationary relations. Because $\lambda_i = 0$ does not change the likelihood function, the maximum is exclusively a function of the non-zero eigenvalues:

$$L_{\max}^{-2/T} = |S_{00}|\prod_{i=1}^{r}(1-\lambda_i). \tag{8.3}$$
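The eigenvalue problem (8.2) can be sketched numerically as follows. This is only a minimal illustration, assuming that the residual series $R_{0t}$ and $R_{1t}$ are already available as arrays; the function name `reduced_rank_eigenvalues` and the simulated random-walk data are hypothetical and not part of the original text.

```python
import numpy as np

def reduced_rank_eigenvalues(R0, R1):
    """Solve |lambda*S11 - S10 S00^{-1} S01| = 0.

    R0: (T, p) array of residuals of the differences.
    R1: (T, p1) array of residuals of the lagged levels.
    Returns the eigenvalues ordered from largest to smallest.
    """
    T = R0.shape[0]
    S00 = R0.T @ R0 / T
    S01 = R0.T @ R1 / T
    S11 = R1.T @ R1 / T
    A = S01.T @ np.linalg.solve(S00, S01)   # S10 S00^{-1} S01
    L = np.linalg.cholesky(S11)             # S11 = L L'
    B = np.linalg.solve(L, A)               # L^{-1} A
    M = np.linalg.solve(L, B.T).T           # L^{-1} A L^{-T}, symmetric
    M = (M + M.T) / 2.0                     # remove rounding asymmetry
    return np.linalg.eigvalsh(M)[::-1]

# Hypothetical data: three independent random walks, no short-run dynamics,
# so the residuals are simply the differences and the lagged levels.
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal((201, 3)), axis=0)
R0 = np.diff(x, axis=0)   # Delta x_t
R1 = x[:-1]               # x_{t-1}
lam = reduced_rank_eigenvalues(R0, R1)
print(lam)
```

The Cholesky step turns the generalized problem into an ordinary symmetric one, which keeps the computed eigenvalues real and inside the unit interval, as squared canonical correlations should be.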

Based on (8.3) it is straightforward to derive a likelihood ratio test for the determination of the cointegration rank $r$, which involves the following hypotheses:

$H_0(p)$: rank $= p$, i.e. no unit roots, $x_t$ is stationary;
$H_1(r)$: rank $= r$, i.e. $p - r$ unit roots, $r$ cointegration relations, $x_t$ is non-stationary.

The LR test, the so-called trace test, is found as:

$$-2\ln Q(H_r/H_p) = T\ln\left\{\frac{|S_{00}|(1-\hat\lambda_1)(1-\hat\lambda_2)\cdots(1-\hat\lambda_r)}{|S_{00}|(1-\hat\lambda_1)(1-\hat\lambda_2)\cdots(1-\hat\lambda_r)\cdots(1-\hat\lambda_p)}\right\}$$

$$\tau_{p-r} = -T\ln\{(1-\hat\lambda_{r+1})\cdots(1-\hat\lambda_p)\}. \tag{8.4}$$
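As a small numerical sketch of (8.4), the trace statistics for all ranks can be obtained from the eigenvalues in one pass. The function name `trace_statistics` and the eigenvalues used in the example are assumptions chosen for illustration only.

```python
import numpy as np

def trace_statistics(eigenvalues, T):
    """Return tau_{p-r} = -T * sum_{i=r+1}^{p} ln(1 - lambda_i) for r = 0, ..., p-1.

    Element r of the result tests 'rank r' (p - r unit roots) against full rank.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    components = -T * np.log1p(-lam)                           # -T ln(1 - lambda_i)
    return components[::-1].cumsum()[::-1]                     # sums over i = r+1, ..., p

# Hypothetical example with p = 2 and T = 100 observations.
tau = trace_statistics([0.5, 0.2], T=100)
print(tau)   # tau[0] tests r = 0, tau[1] tests r = 1
```

Each element is a tail sum of the individual components $-T\ln(1-\hat\lambda_i)$, which is why the statistics necessarily decrease as $r$ increases.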

As an illustration consider a VAR with $p = 5$ variables based on which we test the hypothesis $H_2$: $r = 2$, i.e. $p - r = 3$, against the null $H_5$: $r = 5$, i.e. $p - r = 0$. The test value is calculated as:

$$-2\ln Q(H_2/H_5) = T\ln\left\{\frac{|S_{00}|(1-\lambda_1)(1-\lambda_2)}{|S_{00}|(1-\lambda_1)(1-\lambda_2)(1-\lambda_3)(1-\lambda_4)(1-\lambda_5)}\right\}$$

$$\tau_3 = -T\{\ln(1-\lambda_3) + \ln(1-\lambda_4) + \ln(1-\lambda_5)\}$$


i.e. the LR test is a test of $\lambda_3 = \lambda_4 = \lambda_5 = 0$, corresponding to three unit roots in the model. If this hypothesis is correct then the test statistic should be "small" when compared to some critical value derived under the assumption that $\lambda_3 = \lambda_4 = \lambda_5 = 0$. Note, however, that $H_2$, i.e. $(\lambda_3 = \lambda_4 = \lambda_5 = 0)$, is correctly accepted also when $\lambda_2 = 0$ or $\lambda_2 = \lambda_1 = 0$. Therefore, if $H_2$ is accepted, we conclude that there are at least 3 unit roots and, hence, at most two stationary relations.

Assume that we have a prior hypothesis of the correct number of common trends $p - r^*$, i.e. $r^*$ cointegrating relations. We could then calculate the test statistic $\tau_{p-r^*}$ using (8.4) and compare it with the appropriate critical value $C_{p-r^*}$ to be discussed in the next section. If $\tau_{p-r^*} > C_{p-r^*}$, we reject the hypothesis of $p - r^*$ unit roots (common trends) in the model, and conclude that there are fewer than assumed. If $\tau_{p-r^*} \le C_{p-r^*}$, we accept the hypothesis of at least $p - r^*$ unit roots in the model, but conclude there may be more. Hence, the trace test (8.4) does not give us the exact number of unit roots $p - r$ (or cointegration relations $r$). It only tells us whether $p - r < p - r^*$ $(r > r^*)$ when $\tau_{p-r^*} > C_{p-r^*}$ or alternatively $p - r \ge p - r^*$ $(r \le r^*)$ when $\tau_{p-r^*} \le C_{p-r^*}$.

Therefore, to estimate the value of $r$ we have to perform a sequence of tests. The question is whether this sequence should be from top to bottom, i.e. $\{r = 0, p \text{ unit roots}\}$, $\{r = 1, p - 1 \text{ unit roots}\}$, ..., $\{r = p, 0 \text{ unit roots}\}$, or the other way around. The asymptotic tables are determined so that when $H_r$ is true then $P_{p-r}(\tau_{p-r} \le C_{95\%}(p-r)) = 95\%$, where $\tau_{p-r}$ is given by:

$$\tau_{p-r} = -2\ln Q(H_r|H_p) = -T\sum_{i=r+1}^{p}\ln(1-\hat\lambda_i) \tag{8.5}$$

We discuss first the 'top → bottom' procedure and then compare it with the 'bottom → top' procedure based on a simple example where $p = 3$. Applying the 'top → bottom' trace test procedure can hypothetically produce four different choices of the cointegration rank $\tilde r$ and, hence, the number of unit roots $(p - \tilde r)$:

$\{p - \tilde r = 3, \tilde r = 0\}$ when $\{\tau_3 \le C_3\}$
$\{p - \tilde r = 2, \tilde r = 1\}$ when $\{\tau_3 > C_3, \tau_2 \le C_2\}$
$\{p - \tilde r = 1, \tilde r = 2\}$ when $\{\tau_3 > C_3, \tau_2 > C_2, \tau_1 \le C_1\}$
$\{p - \tilde r = 0, \tilde r = 3\}$ when $\{\tau_3 > C_3, \tau_2 > C_2, \tau_1 > C_1\}$.
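The 'top → bottom' decision rule listed above can be written as a short loop. The sketch below is an illustration only: the function name `rank_top_down` is hypothetical, and both the trace statistics and the critical values in the example are assumed numbers, not values taken from the asymptotic tables.

```python
def rank_top_down(trace_stats, critical_values):
    """Choose the rank by testing r = 0, 1, ... until the first acceptance.

    trace_stats[r] and critical_values[r] refer to the hypothesis of rank r,
    i.e. p - r unit roots, for r = 0, ..., p - 1.
    """
    p = len(trace_stats)
    for r in range(p):
        if trace_stats[r] <= critical_values[r]:
            return r          # first hypothesis that is not rejected
    return p                  # every hypothesis rejected: full rank

# Hypothetical p = 3 example: (tau_3, tau_2, tau_1) against (C_3, C_2, C_1).
tau = [50.0, 12.0, 5.0]
crit = [29.38, 15.34, 3.84]   # illustrative critical values (assumed)
r_tilde = rank_top_down(tau, crit)
print(r_tilde)
```

In this example $\tau_3 > C_3$ and $\tau_2 \le C_2$, so the loop stops at $\tilde r = 1$ and $\tau_1$ is never consulted, which is exactly what distinguishes this sequence from one that starts at the other end.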


We will now illustrate the properties of the 'top → bottom' sequence by investigating $P_1\{\tilde r = i\}$, $i = 0, \ldots, 3$, when the true value of $r = 1$, i.e. $p - r = 2$, and the size of the test is 5%. First, $P_1\{\tilde r = 0\} = P_1\{\tau_3 \le C_3\}$ where

$$\tau_3 = -T\{\ln(1-\hat\lambda_1) + \ln(1-\hat\lambda_2) + \ln(1-\hat\lambda_3)\}.$$

For $\lambda_1 > 0$, we have that $-T\ln(1-\hat\lambda_1) \to \infty$ when $T \to \infty$. Thus, $P_1(\tau_3 \le C_3) \to 0$ asymptotically. The next value $p - \tilde r = 2$ corresponds to $\tilde r = 1$, the true cointegration number, and $P_1(\tau_2 \le C_2) \to 0.95$ asymptotically, in accordance with the way the critical tables have been constructed. Thus, $P_1(p - \tilde r = 1) \to 0.05$ asymptotically and $P_1(p - \tilde r = 0) \to\ \le 5\%$. We summarize:

$P_1(p - \tilde r = 3, \tilde r = 0) = P_1(\tau_3 \le C_3) \to 0$
$P_1(p - \tilde r = 2, \tilde r = 1) = P_1(\tau_3 > C_3, \tau_2 \le C_2) \to 0.95$
$P_1(p - \tilde r = 1, \tilde r = 2) = P_1(\tau_3 > C_3, \tau_2 > C_2, \tau_1 \le C_1) \to 0.05$
$P_1(p - \tilde r = 0, \tilde r = 3) = P_1(\tau_3 > C_3, \tau_2 > C_2, \tau_1 > C_1) \to p_0 < 0.05$

Thus, by applying the 'top → bottom' procedure we will asymptotically accept the correct value of $r$ in 95% of all cases, which is exactly what we would like a 5% test procedure to do.

We will now similarly investigate the 'bottom → top' procedure. For $p = 3$ there are the following four different choices of the cointegration rank:

$\{p - \tilde r = 0, \tilde r = 3\}$ when $\{\tau_1 > C_1\}$
$\{p - \tilde r = 1, \tilde r = 2\}$ when $\{\tau_1 \le C_1, \tau_2 > C_2\}$
$\{p - \tilde r = 2, \tilde r = 1\}$ when $\{\tau_1 \le C_1, \tau_2 \le C_2, \tau_3 > C_3\}$
$\{p - \tilde r = 3, \tilde r = 0\}$ when $\{\tau_1 \le C_1, \tau_2 \le C_2, \tau_3 \le C_3\}$.

In this case we start by testing $p - \tilde r = 0$, i.e. stationarity, and if rejected continue with $p - \tilde r = 1$, and so on until first acceptance.

$P_1(p - \tilde r = 0, \tilde r = 3) = P(\tau_1 > C_1) \to p_0 < 0.05$
$P_1(p - \tilde r = 1, \tilde r = 2) = P(\tau_1 \le C_1, \tau_2 > C_2) \to 0.05$
$P_1(p - \tilde r = 2, \tilde r = 1) = P(\tau_1 \le C_1, \tau_2 \le C_2, \tau_3 > C_3) \to\ \le 0.95$
$P_1(p - \tilde r = 3, \tilde r = 0) = P(\tau_1 \le C_1, \tau_2 \le C_2, \tau_3 \le C_3) \to 0$

In this case the probability of wrongly accepting $\tilde r = 2$ or $\tilde r = 3$ is $0.05 + p_0 \ge 0.05$. Therefore, the probability of choosing the correct value $\tilde r = 1$ is $\le 0.95$. The reason why the 'top → bottom' procedure is asymptotically more correct is that the probability of incorrectly accepting $\tilde r < r^*$ is asymptotically zero, whereas the probability of incorrectly accepting $\tilde r > r^*$ in the 'bottom → top' procedure is generally greater than the chosen size of the test.

8.2 The asymptotic tables and the deterministic components

The distribution of the Likelihood Ratio test statistic (8.4) is non-standard and has been determined by simulations for the asymptotic case. The asymptotic distributions depend on the deterministic terms in the VAR model as shown in Johansen, 1995c where a detailed treatment can be found. Here we will only give the intuition for how the distributions have been derived and how they have been affected by deterministic components in the VAR model.

The LR test statistic of the hypothesis $H(r)$ against $H(p)$ is given by (8.4). However, it can be shown that the asymptotic distribution of $-2\ln LR\{H(r)/H(p)\}$ is the same as that of $-2\ln LR\{H(0)/H(p-r)\}$. Therefore, the asymptotic tables have been simulated for $(p-r)$-dimensional VAR models where $r = 0$. Under the null of $p - r$ unit roots the last $p - r$ eigenvectors $v_i$, $i = r+1, \ldots, p$, should behave like random walks. Hence, if the null hypothesis is correct then the calculated trace test statistic, $-T\sum_{i=r+1}^{p}\ln(1-\hat\lambda_i)$, should not deviate significantly from the simulated test values $C^*$. Under the null hypothesis of $p - r$ unit roots, the following approximation can be used:

$$-T\sum_{i=1}^{p-r}\ln(1-\lambda_i) \approx T\sum_{i=1}^{p-r}\lambda_i$$

where

$$\sum_{i=1}^{p-r}\lambda_i \approx \mathrm{trace}(S_{11}^{-1}S_{10}S_{00}^{-1}S_{01}). \tag{8.6}$$

Without loss of generality we let $S_{00} = I$ and hence:

$$\sum_{i=1}^{p-r}\lambda_i \approx \mathrm{trace}(S_{11}^{-1}S_{10}S_{01}) = \mathrm{trace}(S_{01}S_{11}^{-1}S_{10}). \tag{8.7}$$

Thus, the idea behind the asymptotic tables is to simulate the distribution of (8.7) by first generating a $(p-r)$-dimensional random walk process of acceptable length and then replicating this process a large number of times. Altogether the tables have been simulated for $p - r = 1, \ldots, 12$ unit roots and for five different assumptions on the deterministic components.

In the following we will discuss how to simulate the asymptotic tables for five different assumptions on the deterministic terms in the model. They are all sub-models of the following model:

$$\Delta x_t = \alpha\beta' x_{t-1} + \mu_0 + \mu_1 t + \varepsilon_t \tag{8.8}$$

where $x_t$ is $(p \times 1)$,

$$\mu_0 = \alpha\beta_0 + \gamma_0 \tag{8.9}$$

and

$$\mu_1 = \alpha\beta_1 + \gamma_1. \tag{8.10}$$

Because the trace test is exclusively related to the non-stationary directions of the model we need not consider the stationary components of the model when deriving the asymptotic distributions. Therefore, the tables are simulated under the assumption that there are no short-run adjustment effects, i.e. $\Gamma_1 = 0, \ldots, \Gamma_{k-1} = 0$, and that $r = 0$, i.e. there is no equilibrium correction in the simulated models. Under the assumption that $r = 0$ in (8.8) we have that $\mu_0 = \gamma_0$, $\mu_1 = \gamma_1$ in (8.9) and (8.10). Thus, the process $x_t$ can be represented as:

$$x_t = \sum_{i=1}^{t}\varepsilon_i + \mu_0 t + \frac{1}{2}\mu_1 t(t+1) + x_0. \tag{8.11}$$

If $\mu_0$ and $\mu_1$ are unrestricted in (8.8) all nonstationary directions of the process contain stochastic trends as well as linear and quadratic deterministic
time trends. If there are linear but no quadratic trends in the data, then $\gamma_1 = 0$. If there are no linear trends at all in the data, $\mu_1 = 0$ and $\gamma_0 = 0$. Because the asymptotic distributions change depending on whether $\mu_0$ and $\mu_1$ are restricted or unrestricted in the model, a correct specification of the deterministic components in the VAR model is crucial for correct inference.

The asymptotic tables reported in Johansen (1976) and reproduced in the Appendix have been derived from 6000 replications of $p$-dimensional random walk processes, $\Delta y_t = \varepsilon_t$, $t = 1, \ldots, 400$, $p = 1, \ldots, 12$ and $\varepsilon_t \sim N_p(0, I)$. As discussed in Chapter 7 the reduced rank regression is based on the covariance matrices $S_{ij}$, $i, j = 0, 1$, reproduced below:

$$S_{11} = T^{-1}\sum_{t=1}^{T}R_{1,t}R_{1,t}', \qquad S_{01} = T^{-1}\sum_{t=1}^{T}R_{0,t}R_{1,t}', \qquad S_{00} = T^{-1}\sum_{t=1}^{T}R_{0,t}R_{0,t}' \tag{8.12}$$

where $R_{1,t} = x_{t-1} - \hat b_{10} - \hat b_{11}t$ and $R_{0,t} = \Delta x_t - \hat b_{00} - \hat b_{01}t$ are the residuals in the auxiliary regressions of $x_{t-1}$ and $\Delta x_t$ in (8.8). Because the simulated VAR model does not contain any short-run effects, $x_{t-1}$ and $\Delta x_t$ will only be corrected in those versions of the model which contain an unrestricted trend and/or a constant.

We will now show how to simulate the asymptotic tables using an example where $p - r = 3$. The following five VAR models with different assumptions on the deterministic terms have been simulated:

Case 1. $\mu_0, \mu_1 = 0$. This corresponds to the VAR model:

$$\Delta x_t = \alpha\beta' x_{t-1} + \varepsilon_t,$$

which under the assumption that $\alpha\beta' = 0$ gives:

$$x_t = \sum_{i=1}^{t}\varepsilon_i + x_0.$$

There are no deterministic terms in this model and $R_{1,t}$ and $R_{0,t}$ are specified as:

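$$R_{1,t} = \begin{bmatrix}\sum_{i=1}^{t-1}\varepsilon_{1i} \\ \sum_{i=1}^{t-1}\varepsilon_{2i} \\ \sum_{i=1}^{t-1}\varepsilon_{3i}\end{bmatrix} \quad\text{and}\quad R_{0,t} = \begin{bmatrix}\varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t}\end{bmatrix}, \quad t = 1, \ldots, T.$$

Case 2. $\mu_1 = 0$, $\mu_0 \ne 0$ with $\gamma_0 = 0$ and $\beta_0 \ne 0$. This corresponds to the VAR model:

$$\Delta x_t = \alpha\begin{bmatrix}\beta' & \beta_0\end{bmatrix}\begin{bmatrix}x_{t-1} \\ 1\end{bmatrix} + \varepsilon_t,$$

which under the assumption that $\alpha\beta' = 0$ gives:

$$x_t = \sum_{i=1}^{t}\varepsilon_i + x_0.$$

There are no linear trends in the model, nor in the data, but the cointegrating relations have an intercept term, so:

$$R_{1,t} = \begin{bmatrix}\sum_{i=1}^{t-1}\varepsilon_{1i} \\ \sum_{i=1}^{t-1}\varepsilon_{2i} \\ \sum_{i=1}^{t-1}\varepsilon_{3i} \\ 1\end{bmatrix} \quad\text{and}\quad R_{0,t} = \begin{bmatrix}\varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t}\end{bmatrix}, \quad t = 1, \ldots, T.$$

Case 3. $\mu_1 = 0$, $\mu_0$ is unrestricted. This corresponds to the VAR model:

$$\Delta x_t = \alpha\beta' x_{t-1} + \mu_0 + \varepsilon_t$$

with the vectors of corrected residuals: $R_{1,t} = x_{t-1} - \hat b_{10}$ and $R_{0,t} = \Delta x_t - \hat b_{00}$, $t = 1, \ldots, T$. Under the assumption that $\alpha\beta' = 0$ and $\mu_0 \ne 0$ the process $x_t$ is given by:

$$x_t = \sum_{i=1}^{t}\varepsilon_i + \mu_0 t + x_0. \tag{8.13}$$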

There are no linear trends in the VAR model but, because of the unrestricted constant $\mu_0$, the data contain linear trends. In the auxiliary regression of $x_{t-1}$ on the unrestricted constant in the VAR model, the constant $x_0$ in (8.13) cancels and $R_{1,t}$ will contain stochastic trends $\sum_{i=1}^{t-1}\varepsilon_i$ and a linear trend $t$ but no constant. Since a linear time trend asymptotically dominates a stochastic trend, the $(p - r) = 3$ nonstationary directions of the process are decomposed into $(p - r - 1) = 2$ directions that contain the corrected stochastic trends and one direction that contains the linear trend. The tables are based on:

$$R_{1,t} = \begin{bmatrix}(\sum_{i=1}^{t-1}\varepsilon_{1i}) \mid 1 \\ (\sum_{i=1}^{t-1}\varepsilon_{2i}) \mid 1 \\ t \mid 1\end{bmatrix} \quad\text{and}\quad R_{0,t} = \begin{bmatrix}\varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t}\end{bmatrix}, \quad t = 1, \ldots, T,$$

where $\mid 1$ indicates that the variable has been corrected for a constant.

Case 4. $\gamma_1 = 0$, $\beta_1 \ne 0$, $\mu_0$ is unrestricted. This corresponds to the VAR model:

$$\Delta x_t = \alpha\begin{bmatrix}\beta' & \beta_1\end{bmatrix}\begin{bmatrix}x_{t-1} \\ t\end{bmatrix} + \mu_0 + \varepsilon_t$$

with the vectors of corrected residuals given by: $R_{1,t} = [x_{t-1}', t]' - \hat b_{10}$ and $R_{0,t} = \Delta x_t - \hat b_{00}$, $t = 1, \ldots, T$. Under the assumption that $\alpha\beta' = 0$ the process $x_t$ becomes:

$$x_t = \sum_{i=1}^{t}\varepsilon_i + \mu_0 t + x_0.$$

In this case we have allowed for a linear trend both in the data and in the cointegration relations, but have restricted the quadratic trend to be zero. Because the constant $x_0$ cancels in the regression of $x_{t-1}$ on a constant, $R_{1,t}$ contains three corrected stochastic trends and a linear trend but no constant:

$$R_{1,t} = \begin{bmatrix}\sum_{i=1}^{t-1}\varepsilon_{1i} \mid 1 \\ \sum_{i=1}^{t-1}\varepsilon_{2i} \mid 1 \\ \sum_{i=1}^{t-1}\varepsilon_{3i} \mid 1 \\ \text{trend} \mid 1\end{bmatrix} \quad\text{and}\quad R_{0,t} = \begin{bmatrix}\varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t}\end{bmatrix}, \quad t = 1, \ldots, T.$$

Case 5. $\mu_1$, $\mu_0$ are unrestricted. This corresponds to the VAR model:

$$\Delta x_t = \alpha\beta' x_{t-1} + \mu_0 + \mu_1 t + \varepsilon_t$$

with the vectors of corrected residuals: $R_{1,t} = x_{t-1} - \hat b_{10} - \hat b_{11}t$ and $R_{0,t} = \Delta x_t - \hat b_{00} - \hat b_{01}t$, $t = 1, \ldots, T$. Under the assumption that $\alpha\beta' = 0$ and $\mu_1 \ne 0$ the process $x_t$ becomes:

$$x_t = \sum_{i=1}^{t}\varepsilon_i + \mu_0 t + \frac{1}{2}\mu_1 t(t+1) + x_0. \tag{8.14}$$

This model allows for linear trends and quadratic trends in the data as well as linear trends in the cointegrating relations. Because the constant and the linear trend in (8.14) cancel in the regression of $x_{t-1}$ on the unrestricted constant and the linear trend, only the stochastic trends, $\sum_{i=1}^{t-1}\varepsilon_i$, and the quadratic trend, $t^2$, are left in $R_{1,t}$. Furthermore, because a quadratic time trend asymptotically dominates a linear stochastic trend, the $(p - r) = 3$ nonstationary directions of the process are decomposed into $(p - r - 1) = 2$ directions which contain the corrected stochastic trends and one direction which contains the quadratic trend. Thus,

$$R_{1,t} = \begin{bmatrix}(\sum_{i=1}^{t-1}\varepsilon_{1i}) \mid 1, t \\ (\sum_{i=1}^{t-1}\varepsilon_{2i}) \mid 1, t \\ t^2 \mid 1, t\end{bmatrix} \quad\text{and}\quad R_{0,t} = \begin{bmatrix}\varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t}\end{bmatrix}, \quad t = 1, \ldots, T.$$

As an example let us consider a VAR model with an unrestricted constant. In this model we would like to test the hypothesis of $p - r = 3$ unit roots. This case corresponds to the third row of Table A.3 and the first test value corresponds to the 50% quantile of the 5000 test statistics calculated from the simulated VAR model with $p - r = 3$ unit roots and an unrestricted constant. Thus, in 2500 cases the trace test statistic was smaller than 18.65 and in 4750 cases smaller than 29.38, the 95% quantile.

We have shown that the asymptotic distributions depend on whether there is a constant and/or a trend in the VAR model and whether they are unrestricted or not. However, other deterministic components, such as intervention dummies, are also likely to influence the shape of the distributions.

In particular, care should be taken when a deterministic component generates trending behavior in the levels of the data. A typical example is an unrestricted shift dummy $(\cdots, 0, 0, 0, 1, 1, 1, \cdots)$ which cumulates to a broken linear trend in the data. A detailed discussion of this case can be found in Johansen, Nielsen, and Mosconi (2000), Juselius (2000), and Doornik, Hendry, and Nielsen (1998).

Because the asymptotic distributions for the rank test depend on the deterministic components in the model and whether these are restricted or unrestricted, the rank and the specification of the deterministic components have to be determined jointly. Nielsen and Rahbek (1998) have demonstrated that a test procedure based on a model formulation that allows a deterministic variable, $D_t$, to be in the cointegration relations and its difference, $\Delta D_t$, to be in the VAR equations gives similarity in the test procedure. Alternatively the deterministic components have to be removed from the model prior to testing.

Assume, for example, that the data contain a linear trend $t$ so that $E[\Delta x_t] = \gamma_0 \ne 0$. In this case we need to include an unrestricted constant term $\mu_0 = \alpha\beta_0 + \gamma_0$ in the VAR model (c.f. (??) and the discussion in the previous chapter) to account for the linear growth in the data. However, a linear trend in the variables need not cancel in the cointegrating relations and we need to allow for the possibility of trend-stationary cointegration relations, i.e. $\beta_1 \ne 0$ in (??). This is achieved by allowing the linear trend to enter the cointegrating relations. Thus, to achieve similarity in the test procedure the linear trend $t$ has to be restricted to the cointegration relations and a constant term (i.e. the difference of $t$) has to be unrestricted in the model. When the rank has been determined it is always possible to test the hypothesis $\beta_1 = 0$ as a linear hypothesis on the cointegrating relations. This will be illustrated in the next chapter.

To summarize: Given linear trends in the data, case 4 (see also Chapter 6, Section 3) is generally the best specification to start with unless we have a strong prior that the linear trends cancel in the cointegration relations. This is because case 4 allows for trends both in the stationary and nonstationary directions of the model and, hence, similarity in the test procedure. Given no linear trends in the data, case 2 is the appropriate specification (unless exceptionally the cointegration relations can be assumed to have a zero mean).
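The simulation idea behind the asymptotic tables can be sketched for the simplest situation, Case 1 with no deterministic terms. This is a rough illustration under stated assumptions rather than a reproduction of the published tables: the function name `simulate_trace_case1`, the number of replications and the seed are all hypothetical choices, and the resulting quantiles are subject to simulation error.

```python
import numpy as np

def simulate_trace_case1(n_unit_roots, T=400, n_rep=2000, seed=0):
    """Simulate the trace statistic for a pure random walk (Case 1, r = 0).

    Each replication generates an n_unit_roots-dimensional random walk,
    forms S00, S01, S11 from Delta x_t and x_{t-1}, and computes
    -n * sum ln(1 - lambda_i) = -n * ln det(I - S11^{-1} S10 S00^{-1} S01).
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(n_rep)
    identity = np.eye(n_unit_roots)
    for k in range(n_rep):
        eps = rng.standard_normal((T, n_unit_roots))
        x = np.cumsum(eps, axis=0)
        R0 = eps[1:]       # Delta x_t (no correction needed in Case 1)
        R1 = x[:-1]        # x_{t-1}
        n = R0.shape[0]
        S00 = R0.T @ R0 / n
        S01 = R0.T @ R1 / n
        S11 = R1.T @ R1 / n
        M = np.linalg.solve(S11, S01.T) @ np.linalg.solve(S00, S01)
        stats[k] = -n * np.linalg.slogdet(identity - M)[1]
    return stats

stats = simulate_trace_case1(n_unit_roots=1)
q50, q95 = np.quantile(stats, [0.50, 0.95])
print(q50, q95)   # simulated median and 95% quantile for one unit root
```

For the other cases the same loop applies after adding the drift terms to the generated process and correcting `R0` and `R1` for a constant and/or a trend, as described for Cases 2 to 5.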

8.3 The cointegration rank: a difficult and crucial choice

The cointegration rank divides the data into $r$ relations towards which the process is adjusting and $p - r$ relations which are pushing the process. The former will be given an interpretation as equilibrium errors (deviations from steady-state) and the latter as common driving trends in the system. Thus, the choice of $r$ will influence all subsequent econometric analysis and will be crucial for the conclusions we draw on our economic hypotheses.

In the previous section we showed that the asymptotic distributions depend on the deterministic components in the VAR model and that the stationary short-run effects $\Gamma_1\Delta x_{t-1} + \ldots + \Gamma_{k-1}\Delta x_{t-k+1}$ do not matter asymptotically. In small samples these effects are in most cases important. Johansen (2002) demonstrated that the closer the VAR model is to the I(2) boundary the more important are the short-term dynamic effects. In many cases the proper solution is to use bootstrap methods to determine the critical values (ref.). The idea is to simulate tables for models mimicking the short-run dynamics of the empirical model.

Unfortunately there is no clear answer to the question of how many observations we need for the asymptotic results to hold sufficiently well. Whether the sample is 'small' or 'big' is not exclusively a function of the number of observations available in the sample but also of the information in the data. If the data are very informative about a hypothetical long-run relation ($\beta' x_t$, i.e. the equilibrium error crosses the mean line several times over the sample period) then we might have good test properties even if the sample period is relatively short.

If the estimated eigenvalues are empirically informative in the sense of being either very high or very low the trace test is likely to perform well. Note, however, that a high value of $\lambda_i$ can also be an indication of a small number of observations relative to the number of estimated parameters. If some of the estimated eigenvalues are in the region where it is hard to discriminate between significant and insignificant eigenvalues the trace test will usually have low power for near unit root alternatives. A low power of the test is a sign that the data are not very informative about the cointegration rank. Hence, we may have a problem both with the size and the power of the test when determining the rank. In the ideal case we would like the probability to reject a correct null hypothesis ($r = r^*$) to be small and the probability
to accept a correct alternative hypothesis ($r \ne r^*$) to be high for relevant hypotheses in the 'near unit root' region.

Many simulation studies have demonstrated that the asymptotic distributions can be poor approximations to the true distributions when the sample size is small, resulting in substantial size and power distortions. In some cases the size of the test and the power of alternative hypotheses close to the unit circle are almost of the same magnitude. In such cases a 5% test procedure will reject $r = r^*$ incorrectly in 5% of all the cases where $r^*$ is the true value. It will also incorrectly accept $r = r^*$ in, say, 90% of the cases when the true value of $r$ is greater than $r^*$. This is particularly worrying when the null of a unit root is not a natural economic hypothesis. For example, if the adjustment back to equilibrium is very slow then the correct hypothesis would be a stationary but near unit root.

Small sample corrections have been developed in Johansen (2002). For moderately sized samples (50-70) typical of many empirical models in economics these corrections can be substantial. While applying a small sample correction to the trace test statistics leads to a more correct size, it does not solve the power problem.

As discussed above the trace test is based on a sequence of tests. In the 'top-down' case we test the hypothesis "$p$ unit roots" and, if rejected, continue until first acceptance of $p - r$ unit roots. Whether we choose the 'top-down' or the 'down-top' procedure, the test procedure is essentially based on the principle of "no prior economic knowledge" regarding the rank $r$. This is in many cases difficult to justify. For example, in the monetary model of Chapter 2 we demonstrated that the hypothesis ($r = 3$, $p - r = 2$) was a priori consistent with two types of autonomous shocks, one shifting the aggregate demand curve and the other the aggregate supply curve. We also discussed that for a more regulated economy the hypothesis ($r = 2$, $p - r = 3$) might be preferable a priori as a result of very slow market adjustment.

An alternative procedure is, therefore, to test a given prior economic hypothesis, say $p - r = 2$, using the trace test and, if accepted, continue with this assumption unless the data strongly suggest the presence of additional unit roots. The latter can be investigated in a number of ways. For example, we can test the significance of the adjustment coefficients $\alpha_{i,r}$, $i = 1, \ldots, p$, of the $r$'th cointegrating vector. If all $\alpha_{i,r}$ coefficients have small t-ratios, then including the $r$'th cointegrating relation in the model would not improve the explanatory power of the model but, more likely, would invalidate subsequent inference. But if the choice of $r$ incorrectly includes a nonstationary relation
among the cointegrating relations, then one of the roots of the characteristic polynomial of the model would correspond to a unit root or a near unit root and, thus, be large. If either of these cases occur, then the cointegration rank should be reduced. Note, however, that additional unit roots in the characteristic polynomial can be the result of I(2) components in the data. Reducing the rank in this case will not solve the problem as will be further discussed in Chapters 14 and 15.

Note also that the cointegration rank is not in general equivalent to the number of theoretical equilibrium relations derived from an economic model. For example, in the monetary model of Chapter 2 there was one equilibrium relation, the money demand relation (??). The prior assumption that $r = 1$ has been incorrectly assumed in many empirical applications of money demand data. As demonstrated in Chapter 2, Section 4, the monetary model would be consistent with $r = 3$ cointegrating relations (and not one) in a VAR model with real money, real income, inflation and two interest rates (instead of just one as in Romer's example). Thus, cointegration between variables is a statistical property of the data that only exceptionally can be given a direct interpretation as an economic equilibrium relation. The reason for this is that a theoretically meaningful relation can be (and often is) a weighted sum of several 'irreducible' cointegration relations (Davidson, 2001). However, these relations contain invaluable information about common stochastic trends between sets of variables. In the next chapter we will illustrate that this can be used to assess the hypothetical scenario of the economic problem as proposed in Chapter 2.

To summarize: When assessing the appropriateness of the asymptotic tables to determine the cointegration rank we need to consider not only the sample size but also the short-run dynamics. Because the power of the trace test can be very low for alternative hypotheses in the neighborhood of the unit circle it is advisable to use as much additional information as possible. For example, we can check

1. the characteristic roots of the model: If the $(r+1)$th cointegration vector is nonstationary and is wrongly included in the model, then the largest characteristic root will be close to the unit circle.
2. the t-values of the $\alpha$-coefficients to the $(r+1)$th cointegration vector. If all of them are small, say less than 3.0, then one would not gain a lot by including the $(r+1)$th vector as a cointegrating relation in the model.

3. the recursive graphs of the trace statistic for $\tilde r = 1, 2, \ldots, p$. Since the variable $-T_j\ln(1-\lambda_i)$, $j = T_1, \ldots, T$, grows linearly over time when $\lambda_i \ne 0$, the recursively calculated components of the trace statistic should grow linearly for all $i = 1, \ldots, r$, but stay constant for $i = r+1, \ldots, p$.
4. the graphs of the cointegrating relations: If the graph of a supposedly stationary cointegration relation reveals distinctly nonstationary behavior, one should reconsider the choice of $r$, or find out if the model specification is in fact incorrect, for example are data I(2) instead of I(1).
5. the economic interpretability of the results.

The above criteria will now be illustrated based on the Danish data.

8.4 An illustration based on the Danish data

In Table 8.1 we have reported the estimated eigenvalues, $\lambda_i$, the components of the trace test, $-T\ln(1-\lambda_i)$, the trace test, $\mathrm{Trace}(i) = -\sum_{j=i}^{p}T\ln(1-\lambda_j)$, and $C^*_{.95}$, the 95% quantiles from the asymptotic Table 15.4 in Johansen (1995). The trace test appears to reject 5 unit roots (140.4 > 87.0) and 4 unit roots (71.7 > 62.6), but not 3 unit roots (39.0 < 42.2). Based on the trace test we would, therefore, choose $r = 2$.

The sample size is 79 and the asymptotic tables may not be very precise approximations in this case. Furthermore, the power of the test might be low for the third eigenvector with the eigenvalue $\lambda_3 = 0.27$. This value is in the borderline region where it is hard to know whether the corresponding eigenvector should be considered stationary or nonstationary. As demonstrated in Chapter 1, our prior economic hypothesis was $r = 3$ assuming two driving trends, one nominal and one real stochastic trend. Therefore, before choosing the cointegration rank it is useful to examine all five sources of additional information suggested at the end of Section 8.3.

The largest characteristic root is 0.77 for $r = 2$ and 0.83 when $r = 3$, a moderate difference. Whether the root 0.83 corresponds to a unit root or not is difficult to know and we will collect more information by inspecting the significance of the adjustment coefficients $\alpha_{i.2}$ and $\alpha_{i.3}$. Table 7.1 reported the $\Pi$ matrix decomposed into all five $\alpha$ and $\beta$ vectors. We
notice that the choice $r = 2$ will exclude the relation $\beta_3' x_t$ describing a relation between the two interest rates from the model. The adjustment coefficient $\alpha_{4.3}$ is highly significant implying that the deposit rate is strongly adjusting to $\beta_3' x_t$. In Figure 8.1 the graph of the third relation suggests mean-reverting behavior in spite of some evidence of drift. Finally, the graph of the recursively calculated trace test for the third component given in Figure 8.2 exhibits linear growth over time, consistent with $\lambda_3$ being different from zero. Altogether we have found strong evidence supporting our economic prior $r = 3$. However, before trusting this choice we need to assess the constancy of parameters of the VAR model. In the next section we will discuss some recursive procedures which have been developed to detect possible sources of parameter nonconstancy in the VAR model.

Table 8.1: The trace test of the cointegration rank and the eigenvalue roots of the model

  λi      p−r    −T ln(1−λi)    Trace(i)    C*.95
  0.59    5      68.7           140.4       87.0
  0.35    4      32.7           71.7        62.6
  0.27    3      24.0           39.0        42.2
  0.12    2      10.1           15.0        25.5
  0.06    1      5.0            5.0         12.4

  Modulus of the 5 largest roots, for r = 5, r = 4, r = 3, r = 2: ...
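The moduli of the characteristic roots used above as additional information can be computed from the companion matrix of the VAR. The sketch below assumes an error-correction form $\Delta x_t = \Pi x_{t-1} + \sum_{i=1}^{k-1}\Gamma_i\Delta x_{t-i} + \varepsilon_t$; the function name `characteristic_root_moduli` and the two-variable example matrices are hypothetical and are not estimates from the Danish data.

```python
import numpy as np

def characteristic_root_moduli(Pi, Gammas=()):
    """Moduli of the companion-matrix eigenvalues, largest first.

    Assumes Delta x_t = Pi x_{t-1} + sum_i Gammas[i-1] Delta x_{t-i} + eps_t,
    rewritten in levels as x_t = A_1 x_{t-1} + ... + A_k x_{t-k}.
    """
    Pi = np.asarray(Pi, dtype=float)
    p = Pi.shape[0]
    G = [np.asarray(g, dtype=float) for g in Gammas]
    k = len(G) + 1
    A = []
    for i in range(1, k + 1):
        Ai = np.zeros((p, p))
        if i == 1:
            Ai += np.eye(p) + Pi      # A_1 = I + Pi + Gamma_1
        if i <= k - 1:
            Ai += G[i - 1]            # + Gamma_i
        if i >= 2:
            Ai -= G[i - 2]            # - Gamma_{i-1}
        A.append(Ai)
    companion = np.zeros((p * k, p * k))
    companion[:p, :] = np.hstack(A)
    companion[p:, :-p] = np.eye(p * (k - 1))
    return np.sort(np.abs(np.linalg.eigvals(companion)))[::-1]

# Hypothetical bivariate example with one cointegrating relation (r = 1):
alpha = np.array([[-0.5], [0.0]])
beta = np.array([[1.0], [-1.0]])
moduli = characteristic_root_moduli(alpha @ beta.T)
print(moduli)   # one unit root (the common trend) and one stationary root
```

With reduced rank imposed, $p - r$ of the roots are exactly one by construction, so it is the largest remaining modulus that indicates whether the last included relation is close to nonstationary.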

Figure 8.1. The graphs of the third cointegration relation. Upper panel (V3' * Zk(t)) based on $\beta' x_t$ and lower panel (V3' * Rk(t)) based on $\beta' R_{1t}$.

8.5 Recursive tests of constancy

In Chapter 4 we applied a number of residual misspecification tests of the VAR model. Though the empirical model seemed to pass these tests sufficiently well to continue the analysis, it is entirely possible that the model suffers from parameter non-constancy. The purpose of this section is to provide a number of diagnostic tests to check for this important feature of the model.

8.5.1 Recursively calculated trace tests

The expression for the trace test in (8.6) is used repeatedly in the recursive estimation of the VAR model. Figure 8.2 illustrates the recursively calculated components. To increase readability, the trace statistic is scaled by the 90% quantile of the appropriate asymptotic distribution. Note, however, that the scaling by the critical values can have the consequence that the lines cross each other. Note also that if the VAR model contains exogenous variables or dummy variables, then the applied 90% quantiles may no longer be appropriate and CATS will insert a warning in the picture "The critical values are not valid".

Figure 8.2. The recursively calculated components of the trace statistic scaled by the 90% quantile of the asymptotic distributions. Panels: Z(t) and R(t); 1 is the 10% significance level.

The upper panel is based on recursive estimation of the full model, whereas the lower panel is based on recursive estimation of the R-form in which short-run effects have been concentrated out as follows: Based on the full sample, $R_{0t}$ and $R_{1t}$ are determined once and for all by the auxiliary regressions (??)

in Chapter 7. The recursions are now calculated as if $R_{0t} = \alpha\beta' R_{1t} + \varepsilon_t$ is the true model. In this sense any parameter instability in the short-run coefficients will have been averaged out in the lower panel. All trace components exhibit (to some extent) trending behavior consistent with non-zero $\lambda_i$, $i = 1, \ldots, p$. However, the smallest eigenvalue test component in the upper panel is hardly growing and is, therefore, likely to correspond to a unit root or a near unit root.

8.5.2 The recursively calculated log-likelihood

The log-likelihood value is calculated as:

$$ -\frac{2}{t_1}\ln L(r) = \ln\left|S_{00.(t_1)}\right| + \sum_{i=1}^{r}\ln\left(1 - \hat\lambda_{i.(t_1)}\right), \qquad t_1 = T_0, \ldots, T, \tag{8.15} $$

and the 95% confidence bound is calculated as $\pm 2\sqrt{2p/t_1}$. Figure 8.3 illustrates.

Figure 8.3. The recursively calculated log-likelihood based on the full model and the R-form. Panels for Z(t) and R(t): -ln(det(S00)), -Sum(ln(1-lambda)), -2/T*log-likelihood.

The upper panel is for the full model and the lower panel for the R-form model. It appears that the calculated log-likelihood lies within the 95% confidence bands for $t_1$ = 1983:1-1994:3. Note also that the R-form is more stable than the full model. This is a typical outcome, because some of the short-run coefficients in the full model are likely to be unstable over time.

8.5.3 Recursively calculated prediction tests

The one-step-ahead prediction test is based on the hypothesis that the vector process $\Delta x_{t_1}$ is generated by the same cointegrated process that has generated $\Delta x_1, \ldots, \Delta x_{t_1-1}$, ($t_1 = T_0 + 1, \ldots, T$). (See Lütkepohl (1991), section 4.6).

Figure 8.4. Recursively calculated one-step ahead prediction errors of the system. Panels: Z(t) and R(t), 1-step prediction test.

The one-step-ahead prediction error for the system is calculated

as:

$$ f_{t_1} = \Delta x_{t_1} - \sum_{j=1}^{k-1}\hat\Gamma_{j.(t_1-1)}\Delta x_{t_1-j} - \hat\Pi_{(t_1-1)}x_{t_1-1} - \hat\mu_{0.(t_1-1)} - \hat\Phi_{(t_1-1)}D_{t_1}, \qquad t_1 = T_0 + 1, \ldots, T, \tag{8.16} $$

and the test statistic as:

$$ T(t_1) = \left(\frac{t_1}{d_1 + r} + 1\right) f_{t_1}'\,\hat\Omega^{-1}_{(t_1-1)}\,f_{t_1}, \qquad t_1 = T_0 + 1, \ldots, T, \tag{8.17} $$

where $d_1 = k - 1 + d$ and $d$ is the number of dummy variables in the model. Under the null $T(t_1)$ is asymptotically distributed as $\chi^2$ with $p$ degrees of freedom. Figure 8.4 illustrates. Note that the prediction errors from the full model are generally much larger than from the R-form model. This is almost an artifact of the R-model predicting exclusively the long-run components of the model, in contrast to the full model which has to predict both the long-run and the short-run components.

The test of one-step-ahead prediction errors for the individual variables $x_{i,t_1}$ is given by:

$$ T_i(t_1) = \left(\frac{t_1}{d_1 + r} + 1\right) \hat f_{i,t_1}^{\,2} / \hat\Omega_{ii.(t_1-1)}, \qquad t_1 = T_0 + 1, \ldots, T, \tag{8.18} $$

and is asymptotically distributed as $\chi^2(1)$. Also in this case we can choose between predictions from the full model illustrated in Figure 8.5 and from the R-form model illustrated in Figure 8.6. A prediction error larger than two standard errors is indicated with a long vertical line. Note the large prediction errors of the bond rate and the deposit rate at 1983:1 based on the full model. The R-form model does not show a similar prediction failure because this effect has been concentrated out by the dummy variable $Dp83_t$.

Figure 8.5. One-step ahead prediction errors of each variable of the system based on the full model. A vertical line indicates a prediction error larger than two standard errors. Panels: DMO, DFY, DDIFPY, DIDE, DIBO.

Figure 8.6. One-step ahead prediction errors of each variable of the system based on the R-form model. A vertical line indicates a prediction error larger than two standard errors. Panels: DMO_R, DFY_R, DDIFPY_R, DIDE_R, DIBO_R.

The time paths of the recursively calculated $r$ largest eigenvalues show the estimated eigenvalues from the unrestricted VAR model (8.22). The standard error of the estimate $\hat\lambda_i(t_1)$ is calculated as:

$$ s.e.(\hat\lambda_i) = \sqrt{T^{-1}\,4(1 - \hat\lambda_i)^2\left(\hat\lambda_i + \sum_{h=1}^{T^{1/3}}\left(1 - h/T^{1/3}\right)^2\left(r_{i.u}(h)^2 - r_{i.uv}(h)^2\right)\right)}, \tag{8.19} $$

where

$$ r_{i.u}(h) = T^{-1}\sum_{t=h}^{T}\hat u_{i,t}\,\hat u_{i,t-h}, \qquad r_{i.uv}(h) = T^{-1}\sum_{t=h}^{T}\hat u_{i,t}\,\hat v_{i,t-h}, $$

for

$$ \hat u_{i,t} = \hat\lambda_i^{-1/2}\,\hat\alpha_i' S_{00}^{-1} R_{0t} $$

and

$$ \hat v_{i,t} = \left(\hat\lambda_i(1 - \hat\lambda_i)\right)^{-1/2}\hat\alpha_i' S_{00}^{-1}\left(R_{0t} - \hat\alpha\hat\beta' R_{1t}\right). $$

For further details see Hansen & Johansen (1999). Figure 8.7 illustrates the recursively calculated $\lambda_i$, $i = 1, \ldots, 3$ together with 95% confidence bands. It appears that the estimated eigenvalues stay within the bands for all periods. This is a quite typical outcome which suggests that the power of detecting instability might be quite low.

Figure 8.7. Recursively calculated $\lambda_i$, $i = 1, \ldots, 3$ together with the 95% confidence bands. Panels: lambda1, lambda2, lambda3.

The test of constancy of $\beta$ is a test of the hypothesis:

$$ H_{\beta_\tau}: \tilde\beta \in sp(\beta_{(t_1)}), \qquad t_1 = T_0, \ldots, T, $$

in which $\tilde\beta$ is a known matrix. The test statistic is given by

$$ -2\ln\left(Q(H_{\beta_\tau}\,|\,\hat\beta_{(t_1)})\right) = t_1\sum_{i=1}^{r}\left(\ln(1 - \hat\rho_{i.(t_1)}) - \ln(1 - \hat\lambda_{i.(t_1)})\right), \qquad t_1 = T_0, \ldots, T, \tag{8.20} $$

where $\hat\rho_{i.(t_1)}$ are the solutions of

$$ \left|\rho\,\tilde\beta' S_{11.(t_1)}\tilde\beta - \tilde\beta' S_{10.(t_1)} S_{00.(t_1)}^{-1} S_{01.(t_1)}\tilde\beta\right| = 0, \qquad t_1 = T_0, \ldots, T, \tag{8.21} $$

and $\hat\lambda_{i.(t_1)}$ are the $r$ largest eigenvalues in the solution of the unrestricted eigenvalue problem:

$$ \left|\lambda S_{11.(t_1)} - S_{10.(t_1)} S_{00.(t_1)}^{-1} S_{01.(t_1)}\right| = 0, \qquad t_1 = T_0, \ldots, T. \tag{8.22} $$

The test statistic (8.20) is asymptotically distributed as $\chi^2$ with $(p_1 - r)r$ degrees of freedom (Hansen & Johansen, 1993). To increase readability the test values have been scaled by the 95% quantiles of the $\chi^2$ distribution. A value larger than 1.0 means that the test rejects constancy. The solid line is based on the full model, the dotted line on the R-form model.

Figure 8.8 illustrates the case where $\tilde\beta$ is the full sample estimate $\hat\beta$ and Figure 8.9 the case where $\tilde\beta$ is the estimate based on the sample 1974:2-1987:1. Based on Figure 8.8, the dotted line shows that the full sample $\hat\beta$ would have been accepted in all periods 1974:2-1983:1+j, $j = 1, \ldots, 47$, when the short-run effects had been corrected for, whereas the solid line shows that this would not have been the case in the first few years after 1983.

It is often useful to check the sensitivity of the stability tests to the choice of reference value $\tilde\beta$ using different sample periods. Here we have chosen a specific sub-sample, 1974:2-1987:1, as a reference point. In Figure 8.9 the conclusion is not as clear. Based on the R-form model (the dotted line) we could approximately accept constancy of $\beta$ over the full period, though at the end of the sample the test is close to the critical value. Based on the full model (solid line) constancy of $\beta$ is less strongly supported. This serves as an illustration that the test procedures can give quite different results depending on the questions we ask.

Figure 8.8. Recursively calculated tests of the full sample estimate $\hat\beta \subseteq sp(\beta_{t_1})$. Panel: Test of known beta eq. to beta(t), series BETA_Z and BETA_R; 1 is the 5% significance level.

Figure 8.9. Recursively calculated tests of $\tilde\beta \subseteq sp(\beta_{t_1})$ where $\tilde\beta$ is estimated on the sub-sample 1974:2-1987:1. Panel: Test of known beta eq. to beta(t), series BETA_Z and BETA_R; 1 is the 5% significance level.

Chapter 9 Testing restrictions on β and α

In Section 7.4 we discussed the eigenvector decomposition of the long-run matrix $\Pi$ and interpreted the unrestricted estimates as a convenient description of the information given by the covariances of the data. In Chapter 8 we discussed how to determine the cointegration rank which separates the eigenvectors into $r$ stationary and $p - r$ nonstationary directions. The purpose of this chapter is to discuss a number of test procedures by which we can test various restrictions on the $r$ stationary cointegrating relations. While the final aim is to test and impose over-identifying structural restrictions on the long-run structure $\beta$ and on the adjustment coefficients $\alpha$, the restrictions discussed here are not identifying by themselves. Nonetheless, they imply binding restrictions on $\Pi = \alpha\beta'$ and, hence, are testable. By systematic application of these tests one will gain invaluable information on the time-series properties of the long-run relations which will facilitate identification of the long-run and short-run structure to be discussed in Chapters 10-14.

This chapter will discuss how to test for stationarity of cointegration relations subject to the following types of restrictions on $\beta$ and $\alpha$: (i) same restrictions on all cointegrating relations, (ii) all coefficients known in some of the relations, (iii) only some coefficients known in a cointegrating relation, (iv) zero restrictions of rows of $\alpha$.

The organization is as follows: Section 9.1 discusses how to formulate hypotheses as restrictions on the parameter matrices, Section 9.2 how to test the same restriction on all $\beta$, Section 9.3 how to test a known $\beta$, Section 9.4 how to test some restrictions on a cointegration vector, Section 9.5 how to test long-run weak exogeneity as a row restriction on $\alpha$, and Section 9.6 interprets the results in terms of the scenario analysis of Chapter 2.

9.1 Formulating hypotheses as restrictions on β

Hypotheses on the cointegration vectors can be formulated in two alternative ways: either by specifying the $s_i$ free parameters in each $\beta_i$ vector, or the $m_i$ restrictions on each vector. We will consider both cases for a general formulation of restrictions on the $(p_1 \times r)$ matrix $\beta$, where $p_1$ is the dimension of $x_{t-1}$ in the VAR model. We first specify the constrained cointegration vector $\beta^c$ in terms of the $s_i$ free parameters:

$$ \beta^c = (\beta_1^c, \ldots, \beta_r^c) = (H_1\varphi_1, \ldots, H_r\varphi_r), \tag{9.1} $$

where $\varphi_i$ is a $(s_i \times 1)$ coefficient matrix, $H_i$ is a $(p_1 \times s_i)$ design matrix, and $i = 1, \ldots, r$. In this case we use the design matrices to determine the $s_i$ free parameters in each cointegration vector. In the other case we specify some restriction matrices $R_i$ $(p_1 \times m_i)$ which define the $m_i$ restrictions on $\beta_i$:

$$ R_1'\beta_1 = 0, \;\ldots,\; R_r'\beta_r = 0. $$

As an illustration consider the following hypothetical specification of $\beta'\tilde x_t$ where $\tilde x_t' = [m_t^r, y_t^r, \Delta p_t, R_{m,t}, R_{b,t}, Ds83_t]$:

$$ \beta_1'\tilde x_t = m_t^r - y_t^r - b_1(R_{m,t} - R_{b,t}) - b_2 Ds83_t $$
$$ \beta_2'\tilde x_t = y_t^r - b_3(\Delta p_t - R_{b,t}) $$
$$ \beta_3'\tilde x_t = (R_{m,t} - R_{b,t}) + b_4 Ds83_t $$

The first cointegration relation has three free parameters ($s_1 = 3$), corresponding to three restrictions ($m_1 = 3$), and the second and the third relation have two free parameters ($s_2 = s_3 = 2$), corresponding to four restrictions ($m_2 = m_3 = 4$). The three restricted vectors $H_i\varphi_i$ take the following form:

  0 0 0 0 0 1   · ¸  ϕ31   ϕ32 . FORMULATING HYPOTHESES 167    β1 = H1 ϕ1 =        β2 = H2 ϕ2 =       1 0 0 −1 0 0 0 0 0 0 1 0 0 −1 0 0 0 1 0 0 1 0 0 1 0 0 0 −1 0 0  0 0 0 1 −1 0      ϕ11    ϕ12  . After normalization on real money in the ﬁrst relation the normalized coeﬃcients become b1 = ϕ12 /ϕ11 . ϕ13 = β 36 . b2 = ϕ13 /ϕ11 and similarly for the remaining relations. ϕ12 = β24 = −β 25 .1.9. We will now formulate the above relations as restrictions on the βi coefﬁcients.   ϕ13   · ¸  ϕ21   ϕ22 . They are as follows: .      β3 = H3 ϕ3 =     It appears that ϕ11 = β11 = −β12 .

$$ R_1'\beta_1 = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \end{pmatrix}\begin{pmatrix} \beta_{11} \\ \beta_{12} \\ \beta_{13} \\ \beta_{14} \\ \beta_{15} \\ \beta_{16} \end{pmatrix} = 0, $$

$$ R_2'\beta_2 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} \beta_{21} \\ \beta_{22} \\ \beta_{23} \\ \beta_{24} \\ \beta_{25} \\ \beta_{26} \end{pmatrix} = 0, $$

$$ R_3'\beta_3 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \end{pmatrix}\begin{pmatrix} \beta_{31} \\ \beta_{32} \\ \beta_{33} \\ \beta_{34} \\ \beta_{35} \\ \beta_{36} \end{pmatrix} = 0. $$

Note that $R_i = H_{\perp,i}$, i.e. $R_i'H_i = 0$.

After the rank has been determined all tests are about stationarity, i.e. the null hypothesis is now that a restricted linear combination of the vector process, $\beta_i^{c\prime}x_t$, is stationary. This is in contrast to the rank test where the null hypothesis was a unit root. Under the assumption that the rank was correctly chosen, the matrix $\Pi = \alpha\beta'$, where $\alpha$ and $\beta$ have $r$ columns, describes the $r$ stationary directions $\beta$. Hence testing restrictions on $\beta$ is the same as asking whether a restricted vector $\beta_i^c$ lies in the 'stationarity space' spanned by $\beta$. When a test rejects it means that the restricted vector points outside the 'stationarity space'.

9.2 Same restriction on all β

Before imposing specific restrictions on each $\beta_i$ it is often useful to test for general restrictions on all relations. Typical examples are tests of long-run

exclusion of a variable, i.e. a zero row restriction on $\beta$. If accepted, the variable is not needed in the cointegration relations and can be omitted altogether from the cointegration space. Another example is the test of long-run price homogeneity between some (or all) of the variables. For example, we want to test whether nominal money and prices are long-run homogeneous in all cointegration relations. If accepted, we can re-specify the long-run relations directly in real money (and, as will be shown in Chapter 14, a nominal growth rate), thus simplifying the long-run structure. These are testable hypotheses, since they impose the same restriction on all cointegration relations, but they are not identifying, as will be shown in the next chapter.

All $H_i$ (or $R_i$), $i = 1, \ldots, r$, are identical and we can formulate the hypothesis as:

$$ H^c(r): \beta^c = (H\varphi_1, \ldots, H\varphi_r) = H\varphi \tag{9.2} $$

where $\beta^c$ is $p_1 \times r$, $H$ is $p_1 \times s$, $\varphi$ is $s \times r$ and $s$ is the number of unrestricted coefficients in each vector. The hypothesis $H^c(r)$ is tested against $H(r)$: $\beta$ unrestricted, i.e. we test the following restricted model:

$$ \Delta x_t = \alpha\varphi'H'\tilde x_{t-1} + \sum_{i=1}^{k-1}\Gamma_i\Delta x_{t-i} + \varepsilon_t \tag{9.3} $$

The hypothesis (9.2) generally implies a transformation of the data vector, $H'x_t$, as can be illustrated by the following example. Assume that we wish to test the hypothesis of long-run proportionality between $m^r$ and $y^r$ in all cointegration relations specified by the following design matrix $H$:

$$ H = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \varphi = \begin{pmatrix} \varphi_{11} & \varphi_{12} & \varphi_{13} \\ \varphi_{21} & \varphi_{22} & \varphi_{23} \\ \varphi_{31} & \varphi_{32} & \varphi_{33} \\ \varphi_{41} & \varphi_{42} & \varphi_{43} \\ \varphi_{51} & \varphi_{52} & \varphi_{53} \end{pmatrix}. $$

The transformed data vector becomes:

$$ H'\tilde x_t = \left[(m^r - y^r)_t,\; \Delta p_t,\; R_{m,t},\; R_{b,t},\; Ds83_t\right]'. $$

If this restriction is accepted all cointegrating relations can be expressed as a function of the liquidity ratio $(m^r - y^r)_t$. Thus, the matrix of freely estimated coefficients, $\varphi$, is now $(5 \times 3)$ instead of the $(6 \times 3)$ unrestricted $\beta$.

Note that when $H$ is $p_1 \times s$, the restricted model will have $s$ eigenvalues, i.e. fewer than the $p_1$ eigenvalues of the unrestricted model. Since at least $r$ eigenvalues are needed, $s = p_1 - m \geq r$ and, hence, $m \leq p_1 - r$. In the money demand example with $r = 3$ and $p_1 - r = 3$ we can impose at most three identical restrictions on the cointegrating relations. We could, for example, in addition to the liquidity ratio impose the interest rate spread in all relations as well as long-run exclusion of the shift dummy $Ds83_t$, resulting in the transformed data vector $\tilde x_t'H = [(m^r - y^r), \Delta p, (R_m - R_b)]$. Imposing one more restriction would make the dimension of the transformed vector equal to 2, which would violate the condition $r = 3$.

It appears from (9.3) that we can find the restricted ML estimator similarly as for the unrestricted model, i.e. by solving the reduced rank regression of $\Delta x_t$ on $H'x_{t-1}$ corrected for the short-run dynamics:

$$ \left|\lambda H'S_{11}H - H'S_{10}S_{00}^{-1}S_{01}H\right| = 0. $$

This gives the eigenvalues $\lambda_1^c, \lambda_2^c, \ldots, \lambda_s^c$ and the eigenvectors $v_1^c, v_2^c, \ldots, v_s^c$. The ML estimator of $\beta^c$ is found by choosing $\hat\varphi = [v_1^c, v_2^c, \ldots, v_r^c]$ and then transforming the vectors back to $\hat\beta^c = H\hat\varphi$. Note that this type of restrictions are not identifying and the estimates of the constrained cointegration relations are, therefore, based on the normalization condition and the ordering of the eigenvalues as discussed in Section 7.4.

The likelihood ratio test procedure is derived by calculating the ratio between the value of the likelihood function in the restricted and unrestricted model. The maximum likelihood of the unrestricted model is:

$$ L_{\max}^{-2/T}(H(r)) = |S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i). $$
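The bookkeeping behind the design-matrix formulation can be illustrated with a small, dependency-free Python sketch. It builds the $H$ matrix for long-run proportionality between $m^r$ and $y^r$, the equivalent restriction matrix $R$, checks that $R'H = 0$, and applies $H'$ to a data vector. The numerical data vector and the helper names (`transpose_times`, `transform`, `max_identical_restrictions`) are hypothetical and serve only as illustration.

```python
# Variable ordering assumed for the data vector: [m_r, y_r, dp, R_m, R_b, Ds83]

# Design matrix H (p1 x s): m_r and y_r enter only as (m_r - y_r).
H = [
    [1, 0, 0, 0, 0],
    [-1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]

# Equivalent restriction matrix R (p1 x m): coefficient(m_r) + coefficient(y_r) = 0.
R = [[1], [1], [0], [0], [0], [0]]


def transpose_times(A, B):
    """Return A'B for matrices stored as lists of rows."""
    rows, cols = len(A[0]), len(B[0])
    return [[sum(A[k][i] * B[k][j] for k in range(len(A))) for j in range(cols)]
            for i in range(rows)]


def transform(H, x):
    """Return the transformed data vector H'x."""
    return [sum(H[k][i] * x[k] for k in range(len(x))) for i in range(len(H[0]))]


def max_identical_restrictions(p1, r):
    """Largest m such that s = p1 - m eigenvalues still cover rank r."""
    return p1 - r


p1, s, r = len(H), len(H[0]), 3
m = p1 - s                       # restrictions imposed on each vector
degrees_of_freedom = r * m       # degrees of freedom of the LR test

x_t = [5.0, 3.0, 0.01, 0.1, 0.12, 1.0]   # hypothetical observation
print("R'H =", transpose_times(R, H))
print("H'x_t =", transform(H, x_t))
print("m =", m, " df =", degrees_of_freedom,
      " max m =", max_identical_restrictions(p1, r))
```

With few restrictions $R$ is the more compact object to write down, while $H$ makes the transformed data vector explicit.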

The LR test statistic is calculated as:

$$ \Lambda^{-2/T} = \frac{L_{\max}^{-2/T}(H^c(r))}{L_{\max}^{-2/T}(H(r))} = \frac{|S_{00}|\,(1 - \hat\lambda_1^c)(1 - \hat\lambda_2^c)\cdots(1 - \hat\lambda_r^c)}{|S_{00}|\,(1 - \hat\lambda_1)(1 - \hat\lambda_2)\cdots(1 - \hat\lambda_r)}, $$

i.e.

$$ -2\ln\Lambda = T\left\{\ln(1 - \hat\lambda_1^c) - \ln(1 - \hat\lambda_1) + \ldots + \ln(1 - \hat\lambda_r^c) - \ln(1 - \hat\lambda_r)\right\}. $$

It is asymptotically distributed as $\chi^2(\nu)$ where $\nu = rm$. There are $rm$ degrees of freedom because we have imposed $m$ restrictions on each of the $r$ cointegration vectors. If the eigenvalues change significantly when we impose the restrictions $H'x_t$ the test will reject. Section 8.3 gave an interpretation of the eigenvalues $\lambda_i$ as squared correlation coefficients between a linear combination of the stationary part of the vector process and a linear combination of the non-stationary part. Thus, if $\hat\lambda_i^c$ becomes very small as a result of imposing the restrictions $R_i$, it is a sign of nonstationarity of the restricted cointegration relation. Another way of expressing this is again to notice that $\mathrm{diag}(\hat\lambda_1, \ldots, \hat\lambda_r) = \hat\alpha'S_{00}^{-1}\hat\alpha = \hat\beta'S_{10}S_{00}^{-1}S_{01}\hat\beta$, i.e. $\lambda_i$ is related to the estimated $\alpha_i$. Therefore, the rejection of the hypothesis $\beta = H\varphi$ implies that at least one of the restricted relations no longer defines a mean-reverting relation and, thus, will have insignificant $\alpha$ coefficients.

9.2.1 Illustrations

The test of $\beta = H\varphi$ can be calculated in CATS by choosing the option: "Restrictions on subsets of the β-vectors". The program first prompts for the number of different groups. When the same restriction is imposed on all vectors, the answer is [1]. The program then asks for the number of restrictions [m].

In Chapter 6 we discussed the role of the deterministic components in the model and showed that if there are linear trends in the data, the most general formulation is to include a linear trend inside the cointegration relations and an unrestricted constant in the VAR equations. As discussed in Chapter 8, this formulation delivers 'similar' inference based on the trace test for cointegration rank. After the rank is determined we first test whether the

linear trend is needed in the cointegration relations, i.e. whether we can impose a zero restriction on the trend coefficient in all cointegrating relations.

Example 1: A test of long-run exclusion of a linear trend in the cointegration relations for $\tilde x_t' = [m_t^r, y_t^r, \Delta p_t, R_{m,t}, R_{b,t}, Ds83_t, t]$.

$H_1: \beta^c = H\varphi$ or $R'\beta = 0$, where

$$ H = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \varphi = \begin{pmatrix} \varphi_{11} & \varphi_{12} & \varphi_{13} \\ \varphi_{21} & \varphi_{22} & \varphi_{23} \\ \varphi_{31} & \varphi_{32} & \varphi_{33} \\ \varphi_{41} & \varphi_{42} & \varphi_{43} \\ \varphi_{51} & \varphi_{52} & \varphi_{53} \\ \varphi_{61} & \varphi_{62} & \varphi_{63} \end{pmatrix}, $$

and

$$ R' = [0, 0, 0, 0, 0, 0, 1]. $$

The design matrix $H$ is specified so that the last row, corresponding to the trend, is equal to zero, and the restriction matrix $R$ so that it makes the last coefficient row of $\beta$ equal to zero. With few restrictions the $R$ formulation is more parsimonious than the $H$ formulation.

Table 9.1 reports the estimates of $\lambda^c$ and $\beta^c$ in the restricted model. It appears that the $\lambda_i^c$ hardly change compared to the $\lambda_i$. The same is true with $\beta^c$ compared to the unrestricted $\beta$ reported in Table 7.1 in Chapter 7. Not surprisingly, the three zero restrictions on the trend can be accepted with a p-value of 0.61 and we conclude that the appropriate specification is Case 3 of Section 6.3, i.e. unrestricted constant in the model, no trend in the cointegration relations.

When testing the hypothesis of long-run exclusion of variables which are strongly correlated, i.e. collinear, some caution is needed with this test. Thus, we sometimes find that long-run exclusion is accepted even if it subsequently turns out that the variable in question was very important in at least some of the long-run relations, in particular if the variables are strongly collinear. This could be the case, for example, if two variables are strongly trending.
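The arithmetic of the LR test used in Example 1 can be sketched in Python using only the standard library. The function `lr_statistic` implements $-2\ln\Lambda = T\sum_i\{\ln(1-\hat\lambda_i^c) - \ln(1-\hat\lambda_i)\}$, and `chi2_sf` evaluates the upper tail of the $\chi^2$ distribution for integer degrees of freedom from its closed-form series. Both names are hypothetical helpers, the eigenvalues passed to `lr_statistic` are made up for illustration, and the value 1.80 is taken to be the statistic reported for $H_1$ in Table 9.1.

```python
import math


def lr_statistic(T, lam_unrestricted, lam_restricted):
    """-2 ln(Lambda) from the r largest unrestricted and restricted eigenvalues."""
    return T * sum(math.log(1.0 - lc) - math.log(1.0 - lu)
                   for lu, lc in zip(lam_unrestricted, lam_restricted))


def chi2_sf(x, df):
    """Upper tail probability P(chi2(df) > x) for a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) exp(-x/2) * sum_k x^k / (1*3*...*(2k+1))
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)


# Example 1: r = 3 vectors, m = 1 restriction each, so 3 degrees of freedom.
p_value_h1 = chi2_sf(1.80, 3)
print(f"p-value for H1: {p_value_h1:.2f}")

# Hypothetical eigenvalues, only to show how the statistic is assembled.
stat = lr_statistic(77, [0.59, 0.35, 0.27], [0.58, 0.34, 0.26])
print(f"illustrative LR statistic: {stat:.2f}, p-value: {chi2_sf(stat, 3):.2f}")
```

The series form avoids any dependency on a statistics library; for non-integer degrees of freedom a library routine would be needed instead.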

Example 2. A test of long-run exclusion of the shift dummy $Ds83_t$ in the cointegration relations for $\tilde x_t' = [m_t^r, y_t^r, \Delta p_t, R_{m,t}, R_{b,t}, Ds83_t, t]$. At around this date the graphs of the variables in levels and differences reported in Section 3.2 showed a major shift in the level of money stock, a less pronounced shift in the level of real aggregate demand, and a big blip in the differences of the bond rate. A blip in $\Delta x_t$ corresponds to a shift in the levels $x_t$. Because it is often hard to know whether the mean shift will cancel or not in a cointegration relation, it is often useful to include the shift dummy in the cointegration relations and then test its significance. The specification of the $H$ and the $R$ matrices is similar to that for the test of long-run exclusion of the trend and will, therefore, not be reported.

The hypothesis of long-run exclusion of the shift dummy $Ds83_t$ was rejected with a p-value of 0.01. The restricted estimates are reported under $H_2$ in Table 9.1. Comparing the estimates under $H_2$ to the estimates under $H_1$ (which are close to the unrestricted $\hat\beta$ estimates) it appears that $\hat\beta_1^c$ has hardly changed (implying that the shift dummy was not needed in this relation), whereas this is not the case with $\hat\beta_2^c$ and $\hat\beta_3^c$. This implies that the estimates of the money demand relation and of the interest rate relation are sensitive to whether we include the shift dummy or not: The income coefficient in $\hat\beta_2^c$ is no longer close to -1, as it was when the shift dummy was allowed to enter the relation. Moreover, the interest coefficients have almost doubled and one of them has changed sign.

Thus, the consequence of deregulating capital movements in 1983 seems to have been a shift in money stock to a higher level without a corresponding shift in the level of real income. Figure 3.3 in Section 3.2 illustrates this graphically. If the shift to the new (deregulated) equilibrium level of money stock is not appropriately accounted for by the shift dummy, then the econometric procedure will account for the extraordinary increase in money stock by increasing the estimates of the real income and interest rate coefficients.

The deregulation of capital movements in 1983 also affected the Danish interest rates, in particular the long-term bond rate. As a result of the increased foreign demand for Danish bonds, the previously very high yield on bonds dropped dramatically and, after a period of increased volatility, settled down on a much lower level where it has stayed ever since. The effect was a marked decrease in the interest rate spread and a similar gradual stabilization at a much lower level in the new regime. Figure 2.5 in Chapter 2 illustrates this graphically. Without accounting for this shift in the equilibrium level between the interest rates

(the risk premium) the relationship between the two interest rates becomes negative, as $\hat\beta_3^c$ demonstrates. This example serves as an illustration of the importance of appropriately accounting for major reforms and interventions in order not to bias the coefficient estimates of the economic steady-state relations.

Example 3. A test of long-run homogeneity between $m^r$ and $y^r$ in all cointegrating relations for $\tilde x_t' = [m_t^r, y_t^r, \Delta p_t, R_{m,t}, R_{b,t}, Ds83_t, t]$. The design matrices $H$ and $R$ have the following form:

$$ H = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad R' = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}. $$

The hypothesis $H_3$ can be accepted with a p-value of 0.13. The restricted estimates are reported in Table 9.1 under $H_3$ and show that $\beta^c$ does not differ very much from the unrestricted values. The coefficients to real money and real income are very small except in the second relation, suggesting that they enter significantly only in one of the three cointegration relations. Thus, two of the acceptable long-run homogeneity restrictions are describing a '0, 0' relationship.

Example 4. A test of the interest rate spread in all cointegration relations. The $H$ and $R$ matrices are similar to the previous example and will not be reported. The hypothesis $H_4$ was strongly rejected with a p-value of 0.00. The restricted estimates of $H_4$ are reported in Table 9.1. It appears that the restriction is violated in the first and the third restricted cointegration relation. The violation of the spread restriction in the first relation suggests that inflation is not exclusively related to the interest rate spread, but also to the level of nominal interest rates. The violation of the spread in the third relation suggests that the two interest rates are not homogeneously related to each other.

Example 5. A joint test of $H_1$ and $H_3$. It appeared that the zero restriction on the trend and the homogeneity restriction on real money and income were each acceptable. The joint hypothesis $H_5$ was accepted with a p-value of 0.20. Note that the joint test is not the sum of the individual tests, because the latter are not independent. It frequently happens that the joint

test rejects, though the individual tests are accepted. The design matrices $H$ and $R$ are formulated as:

$$ H = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \qquad R' = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}. $$

Table 9.1: Estimated eigenvalues, eigenvectors, and loadings for the Danish data

Columns: $\lambda_i$, $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$, $Ds83$, trend; rows $\beta_1^{c\prime}$, $\beta_2^{c\prime}$, $\beta_3^{c\prime}$ under each hypothesis.

  H1: β7· = 0,                   χ2(3) = 1.80,   p-value = 0.61
  H2: β6· = 0,                   χ2(3) = 11.33,  p-value = 0.01
  H3: β1· = −β2·,                χ2(3) = 5.69,   p-value = 0.13
  H4: β4· = −β5·,                χ2(3) = 15.71,  p-value = 0.00
  H5: β1· = −β2· and β7· = 0,    χ2(6) = 8.54,   p-value = 0.20

We have given several examples of restrictions that were acceptable and restrictions that were not. In the latter cases it was quite straightforward to

see why a restriction violated the information in the data. This was partly because the unrestricted cointegration relations could be roughly interpreted as plausible steady-state relations. In most cases it is not so obvious why a restriction violates the information in the data. To be able to re-specify the hypothesis in a more adequate way it is crucial to understand why a restriction rejects. A useful tool in this context is to compare the unrestricted $\hat\Pi$ with the restricted $\hat\Pi^c$ matrix. If the restrictions are acceptable then the constrained $\hat\Pi^c = \hat\alpha^c\hat\beta^{c\prime} \approx \hat\alpha\hat\beta' = \hat\Pi$, i.e. the constrained cointegration relations and the corresponding weights should approximately reproduce the unrestricted $\hat\Pi$. If the restrictions are not acceptable, then $\hat\Pi^c = \hat\alpha^c\hat\beta^{c\prime} \neq \hat\alpha\hat\beta' = \hat\Pi$.

To illustrate how to locate the reason for a test failure, Table 9.2 compares $\hat\Pi$ with $\hat\Pi^c$, where the latter is calculated under the hypothesis $H_4$ in Table 9.1. It is now easy to see that imposing the interest rate spread on all cointegrating relations hardly changes the rows for real money, real income, and the bond rate of $\hat\Pi^c$, but does change the rows for inflation and the deposit rate. In the inflation equation the significant effect of the deposit rate in $\hat\Pi$ has now disappeared and in the deposit rate equation the deposit rate is no longer related to the bond rate. Therefore, if we had started the analysis with the transformed vector $\tilde x_t' = [m^r, y^r, \Delta p, (R_m - R_b)]$, important feedback information from changes in nominal interest rate on the system would have been lost. Such transformations are often suggested by the underlying theory model and, therefore, frequently employed in empirical models without first testing for data admissibility. The consequence is, however, that potentially important information in the original variables is being lost, c.f. the discussion in Section 6.7.

9.3 Some β assumed known

This test is useful when we want to test whether a hypothetical known vector is stationary. For example we might be interested in whether the real interest rate defined as $R - \Delta p$ is stationary, whether $\Delta p$ is stationary by itself, and whether the income velocity of money, $m^r - y^r$, is stationary. To formulate this hypothesis it is convenient to decompose the $r$ cointegrating relations into $s$ known vectors $b$ (in most cases $s = 1$) and $r - s$ unrestricted vectors $\varphi$:

[Table 9.2: Comparing the combined effects of unrestricted and restricted cointegration relations for $r = 3$. The table compares $\hat{\Pi} = \hat{\alpha}\hat{\beta}'$ (UR) with $\hat{\Pi}^c = \hat{\alpha}^c\hat{\beta}^{c\prime}$ (R) under $\mathcal{H}_4$, with t-ratios in parentheses. Columns: $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$, $Ds83$, trend. Rows: an unrestricted (UR) and a restricted (R) row for each of the equations $\Delta m^r_t$, $\Delta y^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta R_{b,t}$.]

$$\mathcal{H}_c(r): \beta^c = (b, \varphi), \tag{9.4}$$

where $b$ is a $p_1 \times s$ matrix and $\varphi$ is a $p_1 \times (r - s)$ matrix. We partition

$$\alpha = (\alpha_1, \alpha_2), \tag{9.5}$$

where $\alpha_1$ are the adjustment coefficients to $b$, and $\alpha_2$ to $\varphi$. The cointegrated VAR model can now be written as:

$$\Delta x_t = \alpha_1 b' x_{t-1} + \alpha_2 \varphi' x_{t-1} + \Gamma_1 \Delta x_{t-1} + \Phi D_t + \varepsilon_t. \tag{9.6}$$

The unrestricted estimates of $\alpha$ and $\beta$ are first obtained from the "concentrated model"

$$R_{0t} = \alpha\beta' R_{1t} + \varepsilon_t \tag{9.7}$$

defined in (??). Inserting (9.4) and (9.5) in (9.7) we get:

$$R_{0t} = \alpha_1 b' R_{1t} + \alpha_2 \varphi' R_{1t} + \varepsilon_t. \tag{9.8}$$

Since the variable $b' R_{1t}$ is known and stationary under $\mathcal{H}_c$ it can be concentrated out from (9.8) using the auxiliary regressions

$$R_{0t} = \hat{B}_1 b' R_{1t} + R_{0.b,t} \quad \text{and} \quad R_{1t} = \hat{B}_2 b' R_{1t} + R_{1.b,t}.$$

The new "concentrated model" is given as:

$$R_{0.b,t} = \alpha_2 \varphi' R_{1.b,t} + \varepsilon_t. \tag{9.9}$$
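Concentrating out a known regressor is an instance of partialling out: the coefficient on the remaining regressor is the same whether it is estimated jointly or from the residuals of the auxiliary regressions. The following is a minimal scalar sketch of that idea in Python, not the multivariate CATS computation. The series `y` (standing in for $R_{0t}$), `z` (standing in for the known $b'R_{1t}$) and `x` (standing in for a remaining regressor) are hypothetical numbers chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def residual(y, z):
    """Residual from regressing y on the single regressor z (no intercept)."""
    coef = dot(y, z) / dot(z, z)
    return [yi - coef * zi for yi, zi in zip(y, z)]


def slope_full(y, x, z):
    """Coefficient on x when y is regressed on x and z jointly."""
    sxx, szz, sxz = dot(x, x), dot(z, z), dot(x, z)
    sxy, szy = dot(x, y), dot(z, y)
    det = sxx * szz - sxz ** 2  # determinant of the 2x2 normal equations
    return (szz * sxy - sxz * szy) / det


def slope_concentrated(y, x, z):
    """Coefficient on x after y and x have both been corrected for z."""
    ry, rx = residual(y, z), residual(x, z)
    return dot(rx, ry) / dot(rx, rx)


# Hypothetical data, for illustration only.
z = [1.0, -0.5, 0.3, 0.8, -1.2, 0.4]   # plays the role of b'R_1t
x = [0.2, 1.1, -0.7, 0.5, 0.9, -1.0]   # plays the role of a remaining regressor
y = [0.9, 0.3, -0.2, 1.0, -0.4, -0.1]  # plays the role of R_0t

full = slope_full(y, x, z)
conc = slope_concentrated(y, x, z)
print(full, conc)
```

The two printed numbers agree up to rounding error, which is why the known relation can be removed first and the reduced rank problem solved on the corrected residuals.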

The remaining task is now to derive the ML estimator of $(\varphi, \alpha_1, \alpha_2)$. Two complications have to be solved. First, if $\varphi$ happens to lie in the space spanned by $b$, then the model will become singular. To avoid this we can force $\varphi$ to lie in the space spanned by $b_\perp$. This can be achieved by reformulating model (9.9) for $\varphi = b_\perp\psi$:

$$R_{0.b,t} = \alpha_2 \psi' b_\perp' R_{1.b,t} + \varepsilon_t.$$

Second, because the maximum of the likelihood function is given by

$$L_{\max}^{-2/T} = |S_{00.b}| \prod_{i=1}^{r-s} (1 - \hat{\lambda}_i^c)$$

and $|S_{00.b}| \neq |S_{00}|$, the determinants do not cancel in the LR test. This can be circumvented by solving an auxiliary eigenvalue problem:

$$\left| \rho S_{00} - S_{01} b (b' S_{11} b)^{-1} b' S_{10} \right| = 0,$$

from which we get the eigenvalues

$$\hat{\rho}_1 > \ldots > \hat{\rho}_s > \hat{\rho}_{s+1} = \ldots = \hat{\rho}_p = 0.$$

If $s = 1$, as is usually the case in practical applications, only $\hat{\rho}_1 > 0$. It can be shown that

$$|S_{00.b}| = |S_{00}| \prod_{i=1}^{s} (1 - \hat{\rho}_i).$$

Therefore,

$$L_{\max}^{-2/T}(\mathcal{H}_c(r)) = |S_{00}| \prod_{i=1}^{s} (1 - \hat{\rho}_i) \prod_{i=1}^{r-s} (1 - \hat{\lambda}_i^c),$$

and the LR test procedure is based on the ratio

$$\Lambda = L_{\max}^{-2/T}(\mathcal{H}_c(r)) \, / \, L_{\max}^{-2/T}(\mathcal{H}(r)).$$
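The determinant identity $|S_{00.b}| = |S_{00}|\prod_i (1 - \hat{\rho}_i)$ can be checked numerically in the simplest case of one known vector ($s = 1$) and a two-dimensional $S_{00}$. The sketch below is only an illustration: the matrix `S00`, the vector `v` (standing in for $S_{01}b$) and the scalar `c` (standing in for $b'S_{11}b$) are made-up numbers, not estimates from the Danish data. With $s = 1$ the matrix $S_{01}b(b'S_{11}b)^{-1}b'S_{10}$ has rank one, so the single non-zero root is $\hat{\rho}_1 = v'S_{00}^{-1}v / c$.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


# Hypothetical product moments, for illustration only.
S00 = [[2.0, 0.5], [0.5, 1.0]]
v = [0.6, 0.3]   # stands in for S01 b
c = 1.5          # stands in for b' S11 b

# Non-zero root of |rho*S00 - v v'/c| = 0 (rank-one case).
S00_inv = inv2(S00)
quad = sum(v[i] * S00_inv[i][j] * v[j] for i in range(2) for j in range(2))
rho1 = quad / c

# S00 corrected for the known relation: S00.b = S00 - v v'/c.
S00_b = [[S00[i][j] - v[i] * v[j] / c for j in range(2)] for i in range(2)]

lhs = det2(S00_b)
rhs = det2(S00) * (1.0 - rho1)
print(rho1, lhs, rhs)
```

Both sides of the identity coincide, which is what allows $|S_{00}|$ to cancel against the unrestricted likelihood in the LR ratio.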

The LR test statistic is:

$$-2\ln\Lambda = T\{\ln(1 - \hat{\rho}_1) + \ldots + \ln(1 - \hat{\rho}_s) + \ln(1 - \hat{\lambda}_1^c) + \ldots + \ln(1 - \hat{\lambda}_{r-s}^c) - \ln(1 - \hat{\lambda}_1) - \ldots - \ln(1 - \hat{\lambda}_r)\},$$

which is asymptotically distributed as $\chi^2$ with $(p_1 - r)s$ degrees of freedom.

9.3.1 Illustrations

This test can be calculated in CATS by choosing the same option as in the previous test: "Restrictions on subsets of the β-vectors". The program first prompts you to input the number of different groups. If there is one known vector to test, then the answer is [2], i.e. one group containing the known vector and the second group the remaining $r - 1$ vectors. The program prompts for the number of vectors in group 1 [1] and for the number of restrictions [p1-1]. Then you need to type in the restrictions defining the known vector and save the restrictions. Finally, the program will ask for the number of vectors in the second group [r-1] and the number of restrictions [0].

As an illustration of the test procedure we will ask whether the inflation rate, the real interest rates, and the interest rate spread are stationary by themselves. Note that the null is stationarity in this case. First, the test that $\Delta p \sim I(0)$ is formulated as:

$$\mathcal{H}_6: \beta^c = (b, \varphi),$$

where $b' = [0\ 0\ 1\ 0\ 0\ 0\ 0]$, i.e. $b$ is a unit vector that picks up the inflation rate. The remaining $r - 1 = 2$ vectors are unrestricted and described by the matrix $\varphi$ of dimension $p_1 \times (r - 1) = 7 \times 2$. The coefficients $\varphi_{ij}$ are uniquely determined based on the ordering of the eigenvalues and the normalization $\varphi' S_{11.b}\varphi = I$. In most cases the $\hat{\varphi}_{ij}$ are not of particular interest and there is no need to interpret the estimated coefficients. Only when $b$ is found to be stationary can an inspection of the remaining relations sometimes be helpful in tentatively identifying the complete cointegration structure.

$\mathcal{H}_6$ tests the stationarity of the inflation rate, $\mathcal{H}_7$ of the real deposit rate, $\mathcal{H}_8$ of the real bond rate, and $\mathcal{H}_9$ of the interest rate spread. The results of testing the four stationarity hypotheses are presented in the first part of Table 9.3. All of them are rejected. We note, however, that the test results are not independent of the choice of $r$. If a conservative value of $r$ is chosen (small $r$) then it will be more difficult to accept the stationarity hypothesis than when $r$ is large. Note, however, that the rejection might be related to the zero restriction on the shift dummy $Ds83_t$. For example, if the interest rate spread is stationary around one level before 1983 and another level after that date, then $\mathcal{H}_9$ would probably be rejected as a consequence of imposing a zero restriction on $Ds83_t$. If this is the case it would be more relevant to ask whether $(R_m - R_b - b_1 Ds83) \sim I(0)$ rather than $(R_m - R_b) \sim I(0)$, where $b_1 Ds83$ is the estimated shift in the level of the interest rate spread as a result of the deregulation of capital movements. In the next section we will discuss a test procedure for the situation when at least one of the coefficients is not known and, therefore, has to be estimated.

9.4 Only some coefficients are known

Here we will test restrictions on one (or a few) of the cointegration vectors assuming that some of the coefficients are known and some have to be estimated. For example, this test would answer the question whether there exists any stationary combination between the nominal interest rate $R$ and inflation $\Delta p$, i.e. whether $(R - \omega\Delta p)$ is stationary for some value of $\omega$. The test procedure will find the value of $\omega$ that produces the 'most stationary' relation and calculate the p-value associated with the null hypothesis.

In the general case we formulate the hypothesis as $\beta = \{H_1\varphi_1, H_2\varphi_2, \ldots, H_r\varphi_r\}$, where $H_i$ is a $(p_1 \times s_i)$ matrix, $i = 1, \ldots, r$. For simplicity we assume in the following that the $r$ cointegration relations are divided into two groups containing $r_1$ and $r_2$ vectors each. Here we will focus on the special case $r_1 = 1$, $r_2 = r - 1$ and $H_2 = I$ and leave the more general specification to the next chapter. This case is useful for asking questions about the stationarity of a single hypothetical cointegrating relation while leaving the remaining $r - 1$ relations unrestricted. The latter is expressed as:

$$\beta^c = (\beta_1^c, \beta_2) = (H_1\varphi_1, \varphi_2), \tag{9.10}$$

where $H_1$ is a known design matrix of dimension $p_1 \times s_1$, $\varphi_1$ is an $s_1 \times 1$ matrix and $\varphi_2$ is a $p_1 \times (r - 1)$ matrix of unrestricted coefficients. Again we partition $\alpha$ so that it corresponds to the partitioning of $\beta^c$: $\alpha = (\alpha_1, \alpha_2)$. The concentrated model can be written as:

$$R_{0t} = \alpha_1 \varphi_1' H_1' R_{1t} + \alpha_2 \beta_2' R_{1t} + \varepsilon_t.$$

The estimation problem is now more complicated because neither $\varphi_1' H_1' R_{1t}$ nor $\beta_2' R_{1t}$ is known, so neither can be concentrated out. Different algorithms for solving nonlinear estimation problems can be used. CATS uses the switching algorithm described below:

1. Estimate an initial value of $\beta_1^c = \tilde{\beta}_1$.

2. For a fixed value of $\beta_1^c = \tilde{\beta}_1$ estimate $\alpha_2$ and $\beta_2$ by reduced rank regression of $R_{0.\beta_1,t}$ on $R_{1.\beta_1,t}$, where $R_{0.\beta_1,t}$ and $R_{1.\beta_1,t}$ are corrected for $\tilde{\beta}_1' R_{1t}$. This defines $\tilde{\beta}_2$ and $L_{\max}^{-2/T}(\tilde{\beta}_1)$.

3. For a fixed value of $\beta_2 = \tilde{\beta}_2$ estimate $\alpha_1$ and $\varphi_1$ by reduced rank regression of $R_{0.\beta_2,t}$ on $H_1' R_{1.\beta_2,t}$, where $R_{0.\beta_2,t}$ and $H_1' R_{1.\beta_2,t}$ are corrected for $\tilde{\beta}_2' R_{1t}$. This defines $\tilde{\beta}_1 = H_1\tilde{\varphi}_1$ and $L_{\max}^{-2/T}(\tilde{\beta}_2)$.

4. Repeat the steps, always using the last obtained values of $\tilde{\beta}_i$, until the values of the maximized likelihood function converge. In CATS the algorithm stops when $\{L_{\max}^{-2/T}(\tilde{\beta}_1) - L_{\max}^{-2/T}(\tilde{\beta}_2)\} \leq 0.000001$. A maximum of 200 iterations is set by CATS.

The eigenvalue problem for fixed $\beta_1 = \tilde{\beta}_1$ is given by:

$$\left| \rho_1 S_{11.\beta_1} - S_{10.\beta_1} (S_{00.\beta_1})^{-1} S_{01.\beta_1} \right| = 0.$$
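The control flow of such a switching scheme (alternate between two conditional estimation steps, monitor the criterion, stop on a small change or after a fixed number of passes) can be sketched on a toy problem. The Python sketch below is not the CATS reduced rank procedure: it fits the bilinear regression $y_t = \alpha(x_{1t} + \varphi x_{2t})$ by alternating two least squares steps, and only borrows the tolerance 0.000001 and the cap of 200 iterations from the description above. The function name `switching_fit` and the data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def switching_fit(y, x1, x2, tol=1e-6, max_iter=200):
    """Alternate between estimating alpha (phi fixed) and phi (alpha fixed).

    Stops when the sum of squared residuals changes by at most `tol`
    between two passes, or after `max_iter` passes.
    Assumes the estimated alpha never becomes exactly zero.
    """
    phi = 0.0          # initial value for the unknown coefficient
    previous = None
    for iteration in range(1, max_iter + 1):
        # Step A: for fixed phi, alpha is an ordinary least squares slope.
        z = [a + phi * b for a, b in zip(x1, x2)]
        alpha = dot(y, z) / dot(z, z)
        # Step B: for fixed alpha, phi is a least squares slope as well.
        partial = [yi - alpha * a for yi, a in zip(y, x1)]
        phi = dot(partial, x2) / (alpha * dot(x2, x2))
        # Criterion: sum of squared residuals at the current values.
        fitted = [alpha * (a + phi * b) for a, b in zip(x1, x2)]
        ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
        if previous is not None and abs(previous - ssr) <= tol:
            return alpha, phi, iteration
        previous = ssr
    return alpha, phi, max_iter


# Hypothetical noise-free data generated with alpha = 2 and phi = 0.5.
x1 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
x2 = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
y = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0]

alpha_hat, phi_hat, n_iter = switching_fit(y, x1, x2)
print(alpha_hat, phi_hat, n_iter)
```

Because each step is a closed-form conditional estimate, the criterion cannot increase from one pass to the next, which is the property the switching algorithm relies on.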

For fixed $\beta_2 = \tilde{\beta}_2$ the eigenvalue problem is given by:

$$\left| \rho_2 H_1' S_{11.\beta_2} H_1 - H_1' S_{10.\beta_2} (S_{00.\beta_2})^{-1} S_{01.\beta_2} H_1 \right| = 0,$$

and the maximum of the likelihood function by:

$$L_{\max}^{-2/T}(\mathcal{H}_c) = |S_{00}| (1 - \hat{\rho}_1) \cdots (1 - \hat{\rho}_r),$$

where $\hat{\rho}_i$ are the eigenvalues obtained after convergence of the likelihood function. Using the LR procedure the hypothesis (9.10) can be tested by calculating the test statistic:

$$-2\ln\Lambda = T\{\ln(1 - \hat{\rho}_1) + \ldots + \ln(1 - \hat{\rho}_r) - \ln(1 - \hat{\lambda}_1) - \ldots - \ln(1 - \hat{\lambda}_r)\}, \tag{9.11}$$

which is asymptotically $\chi^2$ distributed with the degrees of freedom given by:

$$\nu = (m_1 - r + 1) = (p_1 - r) - (s_1 - 1),$$

where $m_1 = p_1 - s_1$ is the number of restrictions on the vector.

9.4.1 Illustrations

This test can be performed in CATS by choosing the same option as in the previous test: "Restrictions on subsets of the β-vectors". The program first prompts you to input the number of different groups, usually [2], i.e. one group containing the restricted vector(s) and the second group the remaining vectors. The program prompts for the number of vectors in group 1 [1] and for the number of restrictions [m]. Then CATS will ask you to type in the restrictions of the design matrix $H$ or alternatively the restriction matrix $R$ defining the restricted vector and save the restrictions. Finally, the program will ask for the number of vectors in the second group [r-1] and the number of restrictions [0].

We will now illustrate this type of test:

$$\beta^c = \{H\varphi_1, \varphi_2\} \tag{9.12}$$

[Table 9.3: Testing the stationarity of simple relations. Columns: $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$, $Ds83$, trend, $\chi^2(\upsilon)$, p-value. Rows $\mathcal{H}_6$ to $\mathcal{H}_{28}$, grouped under the panel headings: Tests of a known β vector ($\mathcal{H}_6$ to $\mathcal{H}_9$); Tests of a trend-stationary relation ($\mathcal{H}_{10}$, $\mathcal{H}_{11}$); Tests of velocity relations ($\mathcal{H}_{12}$ to $\mathcal{H}_{15}$); Tests of real income relations ($\mathcal{H}_{16}$ to $\mathcal{H}_{18}$); Tests of inflation, real interest rates and the spread ($\mathcal{H}_{19}$ to $\mathcal{H}_{22}$); Tests of combinations of interest rates and inflation rate ($\mathcal{H}_{23}$ to $\mathcal{H}_{25}$); Tests of homogeneity between inflation and the interest rates ($\mathcal{H}_{26}$ to $\mathcal{H}_{28}$).]

where we impose restrictions on just one of the vectors and leave the remaining ones unrestricted. Rejecting the stationarity of the hypothetical relation $\varphi_1' H' x_t$ implies that it does not qualify as a steady-state relation in the final long-run structure. Therefore, systematically testing the stationarity of all possible relationships is an indispensable help in spotting potentially relevant cointegration relations, our so called building blocks, in the structure of identified long-run steady-state relations. This test is indispensable for identifying irreducible sets of cointegrated variables (Davidson, 2001). The test of a fully identified cointegration structure will be discussed in more detail in the next chapter.

Trend-stationarity of real money and real income is tested as the hypotheses $\mathcal{H}_{10}$ and $\mathcal{H}_{11}$. None of them are accepted. Note also that the estimated trend coefficient implies a negatively sloped trend in both real money and income!

The next group of tests, $\mathcal{H}_{12} - \mathcal{H}_{15}$, are about (inverse) velocity relations. The trend-stationarity of the liquidity ratio, $\mathcal{H}_{12}$, is clearly rejected (see also Figure 2.5 in Chapter 2). It appears from $\mathcal{H}_{13}$ that the liquidity ratio and inflation are not cointegrated, while $\mathcal{H}_{14}$ shows strong cointegration between the liquidity ratio and the interest rate spread. To investigate whether the trend is significant we have imposed a zero restriction on the trend in $\mathcal{H}_{15}$. The difference between the two test values, 0.65 - 0.02 = 0.63, is approximately $\chi^2(1)$ and, hence, the effect is not significant and the trend can be set to zero in $\mathcal{H}_{14}$.

The hypotheses $\mathcal{H}_{16} - \mathcal{H}_{18}$ are tests about real income relations. Real income and inflation appear cointegrated in $\mathcal{H}_{16}$ but with implausible coefficients. This is probably a consequence of a declining inflation and a modestly growing real income in this period. Though being found stationary, we would not consider this relation to be a serious candidate for an economic steady-state relation. The same is true for $\mathcal{H}_{17}$, which cannot be interpreted as an IS relation, and for $\mathcal{H}_{18}$, the stationarity of which is rejected.

The tests in the next group of hypotheses, $\mathcal{H}_{19} - \mathcal{H}_{22}$, are similar to the tests in the first part of the table. The difference is that we here allow the shift dummy $Ds83_t$ to enter the relations. However, none of the variables becomes convincingly stationary and we conclude that inflation, real interest rates and the spread are all nonstationary even when we allow for a mean shift in 1983:1.

The hypotheses $\mathcal{H}_{23} - \mathcal{H}_{25}$ are similar to $\mathcal{H}_{20} - \mathcal{H}_{22}$ except that the unitary restriction on the bond rate coefficient is now relaxed. We note that the two

186 CHAPTER 9. but not ˆ convincingly so. some of which yielded no interest. We test the following hypothesis on α: c Hα (r) : α = Hαc (9. Stationarity did improve.5 Long-run weak exogeneity: restrictions on α The hypothesis that a variable has inﬂuenced the long-run stochastic path of the other variables of the system. Hence. By inspecting the Πc and comparing it with the unrestricted ˆ Π (not reported here) we found that imposing H27 on the data caused real income to become insigniﬁcant in the inﬂation equation. The low value of the coeﬃcient can possibly be explained by the deposit rate being an average of all the components in M3. To summarize: this exercise has ’identiﬁed’ the following three possible candidates for a long-run structure: H15 . the last group tests homogeneity between inﬂation rate and the two interest rates.13) .   Because H26 is rejected we ask in H27 whether the results would improve by allowing for a mean shift in 1983:1. and H28 .45. H25 . by adding real income (and the trend) to H27 we obtained H28 which was strongly accepted. R =  −1 −1    0 1   0 0  0 0  1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1   . is called the hypothesis of ‘no levels feedback’ or longrun weak exogeneity when the parameters of interest are β. 9. while others quite a high interest. Finally. while at the same time has not been inﬂuenced by them. TESTING RESTRICTIONS interest rates appear cointegrated with a coeﬃcient of 0. The design matrices in H26 are speciﬁed as:     H=       0 0  0 0    1 0    . These will be tested jointly in Chapter 10.

where $\alpha$ is $p \times r$, $H$ is a $p \times s$ matrix, $\alpha^c$ is an $s \times r$ matrix of nonzero α-coefficients and $s \geq r$. (Compare this formulation with the hypothesis of the same restriction on all $\beta$, i.e. $\beta = H\varphi$.) As with tests on $\beta$ we can express the restriction (9.13) in the equivalent form:

$$\mathcal{H}_\alpha^c(r): R'\alpha = 0, \tag{9.14}$$

where $R = H_\perp$. Since the $R$ matrix is often of much smaller dimension than the $H$ matrix, the default in CATS is to input the $R$ matrix.

The condition $s \geq r$ implies that the number of non-zero rows in $\alpha$ must not be smaller than $r$. This is because a variable that has a zero row in $\alpha$ does not adjust to the long-run relations and, hence, can be considered as a common driving trend in the system. Since there can be at most $(p - r)$ common trends, the number of zero-row restrictions can at most be equal to $(p - r)$.

The hypothesis (9.13) can be expressed as:

$$\mathcal{H}_0: H\alpha^c = \alpha = \begin{pmatrix} \alpha_1^c\\ 0 \end{pmatrix}.$$

Under $\mathcal{H}_\alpha^c(r)$:

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + H\alpha^c\beta' x_{t-1} + \mu_0 + \Phi D_t + \varepsilon_t.$$

We consider first the concentrated model:

$$R_{0t} = \alpha\beta' R_{1t} + \varepsilon_t.$$

Under $\mathcal{H}_0$:

$$R_{0t} = H\alpha^c\beta' R_{1t} + \varepsilon_t, \tag{9.15}$$

with $\varepsilon_t \sim NIID(0, \Omega)$ and

$$\Omega = \begin{pmatrix} \omega_{11} & \omega_{12}\\ \omega_{21} & \omega_{22} \end{pmatrix},$$

where $\omega_{12}$ is the covariance matrix between the errors from the $p - m$ 'endogenous' variables $x_{1,t}$ and the errors from the $m$ weakly exogenous variables

16) (9.e.t ” and the ”x2. Since ∆x2t is stationary we can correct R0.: H R0t = αc β 0 R1t + H εt 1 H0⊥ R0t = H0⊥ εt For the ”baby” model this would correspond to: ∆x1t = αc β0 xt−1 + ε1t 1 ∆x2t = ε2t Next step is to formulate the joint model (9.t . The idea behind the derivation of the test procedure is to partition the equation system (9.15) into the ”x1.2t .17) .188 CHAPTER 9. All relevant information about α and β are now in the ∆x1t equation and we can solve the eigenvalue problem entirely based on that equation.2t = ε1t −ρε2t and Cov(ε1. Formally this can be done by multiplying (9. To obtain simpler results we use the normalized 0 matrix H = H(H0 H)−1 instead of H (because H H = H0 H= I).e. i.t (∆x1t ) and R1.16) as a conditional and marginal model using results from multivariate normal distributions: H R0t = αc β 0 R1t + ρH0⊥ R0t − ρH0⊥ εt + H εt 1 H0⊥ R0t = H0⊥ εt where ρ = ω 12 ω −1 .H⊥ t 0 0 0 0 (9. ∆x1t = αc β 0 xt−1 + ρ∆x2t + ε1. TESTING RESTRICTIONS x2.t ” equations. ε2t ) = 0. The weak exogeneity hypothesis can be tested with a LR test procedure described in Johansen and Juselius (1990). For the ”baby” model this would correspond to: 22 ∆x1t = αc β0 xt−1 + ρ∆x2t − ρε2t + ε1t 1 ∆x2t = ε2t i.2t 1 ∆x2t = ε2t where ε1.t (xt−1 ) for this variable based on the auxiliary regressions: ˆ R0t = B1 ∆x2t + R0.15) with H and H⊥ respectively.

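and

$$R_{1t} = \hat{B}_2 \Delta x_{2t} + R_{1.H_\perp t}$$

so that:

$$R_{0.H_\perp t} = \alpha^c\beta' R_{1.H_\perp t} + u_t. \tag{9.18}$$

The 'usual' eigenvalue problem is based on (9.18) and the solution delivers $p - m$ eigenvalues $\hat{\lambda}_i^c$. The $r$ largest are used in the LR test, which is given by:

$$-2\ln\Lambda(\mathcal{H}_\alpha^c(r) / \mathcal{H}(r)) = T\sum_{i=1}^{r}\{\ln(1 - \hat{\lambda}_i^c) - \ln(1 - \hat{\lambda}_i)\}.$$

It is asymptotically distributed as $\chi^2(\nu)$ where $\nu = rm$ and $m$ is the number of weakly exogenous variables.

9.5.1 Empirical illustration

We will now test the hypothesis that the bond rate is weakly exogenous for the long-run parameters in the Danish money demand data, i.e. that $\alpha_{51} = \alpha_{52} = \alpha_{53} = 0$. We have $p = 5$, $r = 3$, $s = 4$, $m = 1$. The unrestricted model is

$$\begin{pmatrix} \Delta m^r_t\\ \Delta y^r_t\\ \Delta^2 p_t\\ \Delta R_{m,t}\\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13}\\ \vdots & \vdots & \vdots\\ \alpha_{51} & \alpha_{52} & \alpha_{53} \end{pmatrix} \begin{pmatrix} \beta_1' x_{t-1}\\ \beta_2' x_{t-1}\\ \beta_3' x_{t-1} \end{pmatrix} + \cdots$$

and under the hypothesis the first term becomes

$$\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \alpha^c_{11} & \alpha^c_{12} & \alpha^c_{13}\\ \vdots & \vdots & \vdots\\ \alpha^c_{41} & \alpha^c_{42} & \alpha^c_{43} \end{pmatrix} \begin{pmatrix} \beta_1' x_{t-1}\\ \beta_2' x_{t-1}\\ \beta_3' x_{t-1} \end{pmatrix} = \begin{pmatrix} \alpha^c_{11} & \alpha^c_{12} & \alpha^c_{13}\\ \vdots & \vdots & \vdots\\ \alpha^c_{41} & \alpha^c_{42} & \alpha^c_{43}\\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \beta_1' x_{t-1}\\ \beta_2' x_{t-1}\\ \beta_3' x_{t-1} \end{pmatrix}.$$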
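The weak exogeneity statistic and its p-value are easy to compute once the two sets of eigenvalues are available. The Python sketch below is a hedged illustration, not CATS output: the helper names `lr_weak_exogeneity` and `chi2_sf` are hypothetical, the eigenvalues and sample size are made-up numbers, and the $\chi^2$ tail probability is evaluated with the closed-form expressions that hold for integer degrees of freedom.

```python
import math


def lr_weak_exogeneity(T, lam_unrestricted, lam_restricted):
    """T * sum of ln(1 - lambda_c_i) - ln(1 - lambda_i) over the r largest roots."""
    return T * sum(
        math.log(1.0 - lc) - math.log(1.0 - l)
        for lc, l in zip(lam_restricted, lam_unrestricted)
    )


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: start from df = 1 and add one term per two extra degrees.
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total


# Hypothetical eigenvalues and sample size, for illustration only.
T = 100
lam = [0.40, 0.30, 0.20]      # unrestricted model, r = 3
lam_c = [0.39, 0.29, 0.19]    # model with a zero row imposed on alpha

r, m = 3, 1                   # rank and number of weakly exogenous variables
stat = lr_weak_exogeneity(T, lam, lam_c)
df = r * m
print(stat, df, chi2_sf(stat, df))

# A statistic of 3.21 with 6 degrees of freedom has a tail probability of about 0.78.
print(round(chi2_sf(3.21, 6), 2))
```

The statistic is non-negative because imposing the zero rows can only lower the eigenvalues; the last line evaluates the tail probability for the joint test of two weakly exogenous variables reported in this section (3.21 with 6 degrees of freedom).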

[Table 9.4: Tests of long-run weak exogeneity. One row for each choice of $r = 1, 2, 3, 4$, with the row for $r = 3$ marked by arrows. Columns: $r$, $\nu$, $\chi^2(\nu)$, and the test statistics for $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$.]

The specification of the zero row restriction on the adjustment coefficients of the bond rate is given by the $R$ matrix (which is the form used in CATS):

$$R' = [0, 0, 0, 0, 1].$$

The test statistic, distributed as $\chi^2(3)$, became 0.7 and the weak exogeneity of the bond rate is clearly accepted. In Table 9.4 we have reported the weak exogeneity test for all variables and for all possible values of $r$. The row corresponding to our preferred choice $r = 3$ is indicated by two arrows. The remaining test results for other choices of $r$ serve the purpose of sensitivity analyses. If, instead, we had chosen $r = 2$ (remember that based on the rank test we could as well have chosen this value), the two interest rates would have become weakly exogenous, implying that each of them would have acted as an independent common driving trend in this system, which a priori would not seem very plausible. On the other hand, if we had chosen $r = 1$ (as has often been the case in many empirical models of money demand), then only inflation would have been 'non-exogenous' and all information about agents' adjustment towards a 'money-demand' relation would have been lost. This is because the first unrestricted cointegration relation was significant exclusively in the inflation equation (see Table 7.1).

Furthermore, the test results in Table 9.4 show that the weak exogeneity result for $y^r$ and $R_b$ is robust to the choice of $r$ and we might ask whether they are jointly exogenous. Because the weak exogeneity tests of a single variable are not independent, the joint hypothesis of two or several variables can be (and often is) rejected even if the latter are individually accepted as weakly exogenous. The design matrix for the joint test is specified as:

$$R' = \begin{pmatrix} 0 & 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & 0 \end{pmatrix},$$

and the test statistic, distributed as $\chi^2(6)$, became 3.21 with a p-value of 0.78. Hence, we can safely accept both variables to be weakly exogenous in this system. Since $m = 2$, which corresponds to the number of common trends, $p - r$, this is the maximum number of weakly exogenous variables in this case.

When $m$ zero-row restrictions on $\alpha$ are accepted ($m \leq (p - r)$) we can partition the $p$ equations into $(p - m)$ equations which exhibit levels feedback, and $m$ equations with no levels feedback. We say that the $m$ variables are weakly exogenous when the parameters of interest are $\beta$. Long-run weak exogeneity of the real income and the bond rate implies that valid inference on $\beta$ can be obtained from the three-dimensional system describing $m^r$, $\Delta p$, and $R_m$ conditional on $y^r$ and $R_b$ (Johansen, 1992, Hendry and Richard, 1983). The argument is based on a partitioning of the joint density into the conditional and marginal densities:

$$D(\Delta x_t \mid \Delta x_{t-1}, x_{t-1}, Ds83_{t-1}, D_t; \theta) = D(\Delta x_{1t} \mid \Delta x_{2t}, \Delta x_{t-1}, x_{t-1}, Ds83_{t-1}, D_t; \theta_1) \times D(\Delta x_{2t} \mid \Delta x_{t-1}, D_t; \theta_2) \tag{9.19}$$

where $x_t' = [x_{1t}', x_{2t}']$ and $x_{1t}' = [m^r_t, \Delta p_t, R_{m,t}]$, $x_{2t}' = [y^r_t, R_{b,t}]$, $D_t = [D75.4, Dp92.4, \Delta Ds83_t]$, $\theta_1$, $\theta_2$ are variation free and only $\theta_1$ contains the long-run parameters of interest $\beta$. In this case the cointegrated VAR(2) model

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \alpha\beta' x_{t-1} + \Phi D_t + \varepsilon_t$$

is equivalent to

$$\begin{aligned} \Delta x_{1t} &= \rho\Delta x_{2t} + \tilde{\Gamma}_{1,1}\Delta x_{t-1} + \alpha_1\beta' x_{t-1} + \tilde{\Phi}_1 D_t + \varepsilon_{1t} - \rho\varepsilon_{2t}\\ \Delta x_{2t} &= \Gamma_{1,2}\Delta x_{t-1} + \Phi_2 D_t + \varepsilon_{2t} \end{aligned}$$

where $Cov(\varepsilon_{1.2t}, \varepsilon_{2t}) = 0$, $\rho$ is $(3 \times 2)$, $\alpha_1$ $(3 \times 3)$, $\tilde{\Gamma}_{1,1}$ $(3 \times 5)$, $\Gamma_{1,2}$ $(2 \times 5)$, $\tilde{\Phi}_1$ $(3 \times 5)$, and $\Phi_2$ $(2 \times 5)$. Because the

$m$ weakly exogenous variables do not contain information about the long-run parameters, we can obtain fully efficient estimates of $\beta$ from the $(p - m)$ equations conditional on the marginal models of the $m$ weakly exogenous variables. This gives the condition for when a partial model can be used to estimate $\beta$ without losing information. More formally: let $\{x_t\} = \{x_{1,t}, x_{2,t}\}$ where $x_{2,t}$ is weakly exogenous when $\beta$ is the parameter of interest, then a fully efficient estimate of $\beta$ can be obtained from the partial model:

$$\Delta x_{1,t} = A_0 \Delta x_{2,t} + \tilde{\Gamma}_{11}\Delta x_{t-1} + \alpha_1\beta' x_{t-1} + \mu_0 + \Phi_1 D_t + \tilde{\varepsilon}_{1t}. \tag{9.20}$$

Thus, to know whether we can estimate $\beta$ from a partial system we need first to estimate the full system and test $\alpha_2 = 0$ in that system. After having estimated the full system, the question is why we would bother to reestimate a partial system. There are two reasons:

1. By conditioning on weakly exogenous variables one can often achieve a partial system which has more stable parameters than the full system. This is often the case when the marginal model of the weakly exogenous variable has non-constant parameters or exhibits non-linearities in the parameters.

2. When the number of potentially relevant variables to include in the VAR model is large it can be useful to impose weak exogeneity restrictions from the outset. It is sometimes a priori very likely that weak exogeneity holds; for example, it is almost sure that the US bond rate has an influence on the Danish economy, while it is almost as sure that the Danish economy is irrelevant for the US bond rate. Hence, testing may not be necessary and we can estimate a partial system conditional on the US bond rate.

Finally, finding weakly exogenous variables is often helpful in the identification of the common driving trends, as will be illustrated in Chapter 13. Thus the hypothesis $\alpha_2 = 0$ is of interest for its own sake as it corresponds to the hypothesis of 'no long-run feed-back'.

9.6 Revisiting the scenario analysis

The choice of $r = 3$ is consistent with the discussion in Chapter 2, where we hypothetically assumed two autonomous common trends, $\sum_{i=1}^{t} u_{1i}$ and

$\sum_{i=1}^{t} u_{2i}$ as the driving forces in the small monetary system. We also noted in Chapter 2 that if nominal and real shocks are separated in the sense of nominal shocks exhibiting no real effects (at least in the long-run) we would expect to find the following relations to be stationary cointegrating relations:

$$(m - p - y^r) \sim I(0), \qquad (R_b - R_m) \sim I(0), \qquad (R_b - \Delta p) \sim I(0),$$

which together would completely determine the cointegration space:

$$\beta' x_t = \begin{pmatrix} 1 & -1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & -1\\ 0 & 0 & -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} (m - p)_t\\ y^r_t\\ \Delta p_t\\ R_{m,t}\\ R_{b,t} \end{pmatrix}.$$

However, none of the above relations were found to be stationary, suggesting that there have been permanent interaction effects between the nominal and the real side of the economy (at least over the business cycle horizon). We found strong empirical support for the stationarity of the following relations [1]:

$$\{(m - p - y^r) + 14(R_b - R_m)\} \sim I(0),$$
$$\{R_m - 0.45R_b\} \sim I(0),$$
$$\{(\Delta p - R_m) - 0.56(R_m - R_b) - 0.04(y^r - b_4 t)\} \sim I(0).$$

In the next chapter we will demonstrate that each of them imposes two overidentifying restrictions on the cointegrating space, i.e. a total of six testable restrictions.

[1] Similar results have been found in many other empirical applications based on monetary transmission models for Germany, Italy and Spain in the post Bretton Woods period. See for example Juselius (1998a, 1998b), Juselius (2000), and Juselius and Toro (2003).

The first relation shows that there is empirical support for a Danish money demand relation of the type discussed in Romer (1996). However, it is a linear combination of two nonstationary relations rather than a linear combination of two stationary relations. The finding that $(m^r - y^r)_t \sim I(1)$ and $(R_b - R_m)_t \sim I(1)$, but $\{(m^r - y^r)_t + 14(R_b - R_m)_t\} \sim I(0)$ implies that the stochastic

trend in money velocity is the same as the stochastic trend in the spread. We will now take a closer look at the time series implications of this result.

The fact that $(m^r - y^r)_t \sim I(1)$ can mean either that $m^r$ and $y^r$ contain different stochastic trends or that $m^r$ and $y^r$ have been affected by the same stochastic trend but not in the same proportion one to one. In the first case no value of $b$ in the linear combination $m^r - by^r$ would produce stationarity between $m^r$ and $y^r$, for example because $y^r$ had only been affected by cumulated real shocks, whereas $m^r$ by both real and nominal shocks. The hypothesis $(m^r - by^r) \sim I(0)$ is a testable hypothesis which was rejected by the data. Hence, money velocity seems to have been affected by the nominal stochastic trend, but possibly also by the real stochastic trend in this period.

The cointegration results showed that even if $(R_m - R_b)_t \sim I(1)$ there existed a stationary linear combination $(R_m - 0.45R_b)_t \sim I(0)$. Thus, the two interest rates do share a common trend though not in the proportion one to one. This trend is much more likely to describe cumulated nominal rather than real shocks to the system. Since $(m^r - y^r)$ and $(R_m - R_b)$ were found to be cointegrating, money velocity must have been affected by the nominal stochastic trend.

Chapter 2 pointed out that it is not irrelevant for the effectiveness of monetary policy whether we get one result or the other. As an example, let us consider the simple case when the central bank increases its interest rate in order to curb excess aggregate demand in the economy (and hence inflationary pressure). This is based on the assumption that the interest rate shock will first influence all market interest rates from the short to the long end, then lower the demand for investment, and finally bring inflation down. If $(R_m - R_b) \sim I(0)$, then the shocks will be transmitted one to one and the central bank may very well be successful in bringing inflation down by increasing interest rates. If $(R_m - R_b) \sim I(1)$, then the shocks will not be transmitted one to one and it is less obvious that the central bank is able to bring inflation down by increasing its own interest rates. Which of the two cases is empirically correct can be inferred by examining the time series properties of the two interest rates. Whether the policy succeeds or not depends on the remaining cointegration properties and the dynamics of the system (Johansen and Juselius, 2003). For example it depends on whether excess aggregate demand is related to the real interest rate, whether the latter is stationary or not, whether excess aggregate demand is related to inflation or not, and so on.

Because the implications of monetary policy are likely to differ depending on the cointegration properties between a policy instrument variable, intermediate targets and a goal variable, it is important to have reliable information about these relationships. In periods of deregulation or changes in regimes the latter are likely to change. Therefore, cointegration properties and changes in them are likely to provide valuable information about the consequences of shifting from one regime to another. See for example Juselius (1998a).

Chapter 10 Identification of the Long-Run Structure

When the empirical model is estimated with data that are nonstationary in levels we need to discuss two different identification problems: identification of the long-run structure, i.e. of the cointegration relations, and identification of the short-run structure, i.e. of the equations. The former is about imposing long-run economic structure on the unrestricted cointegration relations, the latter is about imposing short-run dynamic adjustment structure on the equations for the differenced process. In this chapter we will primarily discuss identification of the long-run relations and leave the short-run adjustment structure to the next chapter.

The organization of this chapter is the following: Section 10.1 discusses the cointegrated VAR model for first order integrated data in reduced and structural form. The parameters of the model are partitioned into the short-run and the long-run parameters and it is shown that the analysis of the long-run structure can be performed in either representation. Section 10.2 discusses the condition for identification in terms of restrictions of the long-run structure. A general result for formal identification in a statistical model is given, and empirical identification is defined. Section 10.3 discusses how to calculate degrees of freedom when testing over-identifying restrictions. Section 10.4 discusses just-identified restrictions of the long-run structure and provides two illustrations in which economic identification can be addressed. Section 10.5 does the same for an over-identified structure and Section 10.6 for a nonidentified structure. Section 10.7 illustrates some recursive procedures to check for parameter constancy and Section 10.8 concludes.

10.1 Identification when data are nonstationary

To illustrate the difference between the two identification problems it is useful to consider the cointegrated VAR model both in the so-called reduced form and in the structural form and to discuss in which respects they differ. First, consider the usual reduced form representation:

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \alpha\beta' x_{t-1} + \Phi D_t + \varepsilon_t, \qquad \varepsilon_t \sim N_p(0, \Omega) \qquad (10.1)$$

and then pre-multiply (10.1) with a nonsingular $p \times p$ matrix $A_0$ to obtain the so-called structural form representation (10.2):

$$A_0 \Delta x_t = A_1 \Delta x_{t-1} + a\beta' x_{t-1} + \tilde{\Phi} D_t + v_t, \qquad v_t \sim N_p(0, \Sigma). \qquad (10.2)$$

At this stage we assume that the reduced form parameters $\lambda_{RF} = \{\Gamma_1, \alpha, \beta, \Phi, \Omega\}$ and the structural form parameters $\lambda_{SF} = \{A_0, A_1, a, \beta, \tilde{\Phi}, \Sigma\}$ are unrestricted. To distinguish between parameters of the long-run and the short-run structure we partition $\lambda_{RF} = \{\lambda^S_{RF}, \lambda^L_{RF}\}$, where $\lambda^S_{RF} = \{\Gamma_1, \alpha, \Phi, \Omega\}$ and $\lambda^L_{RF} = \{\beta\}$, and $\lambda_{SF} = \{\lambda^S_{SF}, \lambda^L_{SF}\}$, where $\lambda^S_{SF} = \{A_0, A_1, a, \tilde{\Phi}, \Sigma\}$ and $\lambda^L_{SF} = \{\beta\}$. The relation between $\lambda^S_{RF}$ and $\lambda^S_{SF}$ is given by:

$$\Gamma_1 = A_0^{-1}A_1, \quad \alpha = A_0^{-1}a, \quad \Phi = A_0^{-1}\tilde{\Phi}, \quad \varepsilon_t = A_0^{-1}v_t, \quad \Omega = A_0^{-1}\Sigma A_0'^{-1}.$$

The short-run parameters of the reduced form, $\lambda^S_{RF}$, are uniquely defined, whereas those of the structural form, $\lambda^S_{SF}$, are not without imposing $p(p-1)$ just-identifying restrictions. Because the long-run parameters remain unaltered under linear transformations of the VAR model, $\beta$ is the same in both forms and identification of the long-run structure can be done in either the reduced form or the structural form. Although the long-run parameters $\beta$ are uniquely defined based on the normalization of the eigenvalue problem, this need not coincide with an economic identification and in general we need to impose $r(r-1)$ just-identifying restrictions also on $\beta$.

This gives the rationale for identifying the long-run and the short-run structure as two separate statistical problems, though from an economic point of view they are interrelated. Therefore, we can discuss the statistical problem of how to test structural hypotheses on the long-run structure $\{\beta\}$ before addressing structural hypotheses on the short-run structure $\{A_0, A_1, a, \tilde{\Phi}, \Sigma\}$. From a practical point of view this is invaluable, as the joint identification of the long- and short-run structure is likely to be immensely difficult. The identification process starts with the identification of $\beta$ in the reduced form and proceeds to the identification of the short-run structure keeping the identified $\beta$ fixed.

To understand all aspects of identification it is useful to distinguish between identification in three different meanings:

• generic (formal) identification, which is related to a statistical model
• empirical (statistical) identification, which is related to the actual estimated parameter values, and
• economic identification, which is related to the economic interpretability of the estimated coefficients of a formally and empirically identified model.

For identification to be empirically useful all three conditions for identification have to be satisfied in the empirical problem, which as a crucial part involves the choice of data.

10.2 Identifying restrictions¹

¹ This section relies strongly on Johansen and Juselius (1995).

In order to identify the long-run structure we have to impose restrictions on each of the cointegrating relations. As before, $R_i$ denotes a $p1 \times m_i$ restriction matrix and $H_i = R_{i\perp}$ a $p1 \times s_i$ design matrix ($m_i + s_i = p1$), so that $H_i$ is defined by $R_i'H_i = 0$. Thus, there are $m_i$ restrictions and consequently $s_i$ parameters to be estimated in the $i$'th relation. The cointegrating relations are assumed to satisfy the restrictions $R_i'\beta_i = 0$, or equivalently $\beta_i = H_i\varphi_i$ for some $s_i$-vector $\varphi_i$, that is

$$\beta = (H_1\varphi_1, \ldots, H_r\varphi_r), \qquad (10.3)$$

where the matrices $H_1, \ldots, H_r$ express linear hypotheses to be tested against the data. Note that the linear restrictions do not specify any normalization

of the vectors $\beta_i$. In the previous chapter we gave several examples of design matrices $H_i$, but in order to estimate the coefficients we need to know whether the restrictions are identifying. The idea is to choose $H_i$ so that (10.3) identifies the cointegrating relations. The well known rank condition expresses that the first cointegration relation, say, is identified if

$$\mathrm{rank}(R_1'\beta_1, \ldots, R_1'\beta_r) = \mathrm{rank}(R_1'H_1\varphi_1, \ldots, R_1'H_r\varphi_r) = r - 1. \qquad (10.4)$$

This implies that no linear combination of $\beta_2, \ldots, \beta_r$ can produce a vector that "looks like" the coefficients of the first relation, i.e. satisfies the restrictions defining the first relation. Note, however, that in order to check the rank condition (10.4) we need to know the coefficients $\varphi_i$, $i = 1, \ldots, r$. Most software programs check the rank condition prior to estimation, by first giving the coefficients $\varphi$ some arbitrary numbers. One can, however, avoid the arbitrary coefficients and explicitly check the rank condition based on the known matrices $R_i$ and $H_i$. Johansen (1992b) gives the following condition for a set of restrictions to be identifying:

The set of restrictions is formally identifying if for all $i$ and $k = 1, \ldots, r-1$ and any set of indices $1 \le i_1 < \ldots < i_k \le r$ not containing $i$ it holds that

$$\mathrm{rank}(R_i'H_{i_1}, \ldots, R_i'H_{i_k}) \ge k. \qquad (10.5)$$

If the rank condition is satisfied estimation can proceed. As an example we consider $r = 2$, where (10.5) reduces to the condition:

$$r_{i.j} = \mathrm{rank}(R_i'H_j) \ge 1, \quad i \ne j.$$

If $r = 3$ the conditions to be satisfied are

$$r_{i.j} = \mathrm{rank}(R_i'H_j) \ge 1, \quad i \ne j,$$
$$r_{i.jm} = \mathrm{rank}(R_i'(H_j, H_m)) \ge 2, \quad i, j, m \text{ different}.$$

The value of $r_{i.jm}$ can be determined by finding the eigenvalues of the symmetric matrix $(H_j, H_m)'(I - H_i(H_i'H_i)^{-1}H_i')(H_j, H_m)$ using the identity:

$$r_{i.jm} = \mathrm{rank}(R_i'(H_j, H_m)) = \mathrm{rank}\big((H_j, H_m)'(I - H_i(H_i'H_i)^{-1}H_i')(H_j, H_m)\big). \qquad (10.6)$$
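The rank indices in (10.5) depend only on the known design matrices, so they can be computed directly. The following is a minimal sketch, assuming NumPy is available; the function names `orth_complement`, `rank_indices` and `is_formally_identified` are hypothetical helpers introduced here, and the three design matrices are illustrative (six variables, three relations built from unit vectors), not a definitive implementation of what any particular software package does.

```python
import itertools
import numpy as np


def orth_complement(H, tol=1e-10):
    """Return a basis R of the orthogonal complement of sp(H), so R'H = 0."""
    U, s, _ = np.linalg.svd(H, full_matrices=True)
    rank = int(np.sum(s > tol))
    return U[:, rank:]


def rank_indices(H_list):
    """Compute rank(R_i'(H_i1, ..., H_ik)) for every i and every index set
    of size k = 1, ..., r-1 that does not contain i (1-based labels)."""
    r = len(H_list)
    out = {}
    for i in range(r):
        R_i = orth_complement(H_list[i])
        others = [j for j in range(r) if j != i]
        for k in range(1, r):
            for combo in itertools.combinations(others, k):
                stacked = np.hstack([H_list[j] for j in combo])
                label = (i + 1, tuple(j + 1 for j in combo))
                out[label] = int(np.linalg.matrix_rank(R_i.T @ stacked))
    return out


def is_formally_identified(H_list):
    """Condition (10.5): every rank index is at least the size of its index set."""
    return all(rank >= len(combo)
               for (_, combo), rank in rank_indices(H_list).items())


# Illustrative design matrices (an assumption for this sketch): each column
# selects one of six variables, indexed 0..5.
e = np.eye(6)
H1 = e[:, [1, 2]]
H2 = e[:, [0, 1, 4, 5]]
H3 = e[:, [3, 4]]

for label, rank in sorted(rank_indices([H1, H2, H3]).items()):
    print(label, rank)
print("formally identified:", is_formally_identified([H1, H2, H3]))
```

Using the orthogonal complement from the SVD avoids having to write down each $R_i$ by hand; any other basis of the complement gives the same ranks.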

Thus the usual rank condition (10.4) requires the knowledge of the (not yet estimated) parameters, whereas condition (10.5) is a property of the known design matrices. In the empirical applications below we will illustrate how to check the rank conditions using (10.6).

It is useful to distinguish between just-identifying restrictions and overidentifying restrictions. The former can be achieved by linear combinations of the relations (equations) and, hence, do not change the likelihood function, whereas the latter constrain the parameter space and, hence, change the likelihood function. For identifying restrictions it holds that $\mathrm{rank}(R_i'\beta) \ge r-1$. If equality holds the $i$'th relation is exactly identified, and if inequality holds then the $i$'th relation is overidentified. The system is exactly identified if $\mathrm{rank}(R_i'\beta) = r-1$ for all $i$, and overidentified if it is identified and $\mathrm{rank}(R_i'\beta) > r-1$ for at least one $i$.

As a general procedure it is useful to start with a just-identified system and then impose further restrictions if the estimated parameters indicate that a further reduction in the statistical model is possible. For example, one would generally prefer to set insignificant coefficients in the just-identified model to zero. Such restrictions constrain the parameter space and, thus, are testable. However they may, but need not, be overidentifying. The reason is that when accepting further overidentifying restrictions the rank condition (10.5) need not be satisfied and the more restricted model is no longer identified.

Generally the parameter values of the generically identified model are not known but have to be estimated. For example, if the rank condition (10.5) is satisfied under the condition that a certain coefficient is nonzero, but the true value is in fact zero, then the rank condition in the 'true' model is not satisfied. If the true coefficient is zero, then the estimate will in general not be significantly different from zero and restricting it to zero will in such a case violate the rank condition. Thus, we have that although the original statistical model is formally identifying, the economic model is not empirically identified. This can be formalized as: An economic model specified by the parameter value $\vartheta$, say, is formally identified if $\vartheta$ is contained in the parameter space specified by identifying restrictions. It is empirically identified if $\vartheta$ is not contained in any nonidentified sub-model.
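The statement that just-identifying restrictions are obtained by linear combinations of the relations, and therefore leave the likelihood unchanged, can be illustrated numerically: the likelihood depends on $\alpha$ and $\beta$ only through $\Pi = \alpha\beta'$, and rotating the cointegration space by a nonsingular matrix leaves $\Pi$ unaltered. The sketch below assumes NumPy; the helper `rotate` and the randomly generated $\alpha$ and $\beta$ are hypothetical and serve only as an illustration.

```python
import numpy as np


def rotate(alpha, beta, Q):
    """Return (alpha Q, beta Q'^{-1}) for a nonsingular r x r matrix Q."""
    return alpha @ Q, beta @ np.linalg.inv(Q).T


rng = np.random.default_rng(0)
p, r = 5, 3
alpha = rng.normal(size=(p, r))
beta = rng.normal(size=(p, r))
beta[:r, :] += 3.0 * np.eye(r)   # makes a singular upper block unlikely

Pi = alpha @ beta.T

# Choose Q as the first r columns of beta' (an r x r block); the rotated
# beta then has an identity matrix in its first r rows, i.e. r - 1 zero
# restrictions and one normalization on each relation.
Q = beta.T[:, :r]
alpha_tilde, beta_tilde = rotate(alpha, beta, Q)

print(np.allclose(alpha_tilde @ beta_tilde.T, Pi))
print(np.round(beta_tilde[:r, :], 10))
```

Because no entry of $\Pi$ changes, such a rotation imposes no testable restriction; only restrictions beyond the $r-1$ per relation can alter the likelihood.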

In a formally identified model the parameters can be estimated subject to the restrictions by the iterative procedure discussed in Chapter 9, Section 4. The results can now be generalized to the case where we impose (identifying) restrictions on all cointegrating vectors. The hypothesis on $\beta$ is expressed as:

$$\beta = (\beta_1, \ldots, \beta_r) = (H_1\varphi_1, \ldots, H_r\varphi_r)$$

where $H_i$, $i = 1, \ldots, r$ are known design matrices of dimension $p1 \times s_i$, and $\varphi_i$ are $s_i \times 1$ matrices of unrestricted coefficients. We consider the equilibrium error correction term of (10.1) and write it as:

$$\alpha\beta'x_{t-1} = \alpha_1\beta_1'x_{t-1} + \ldots + \alpha_r\beta_r'x_{t-1} = \alpha_1\varphi_1'H_1'x_{t-1} + \ldots + \alpha_r\varphi_r'H_r'x_{t-1}.$$

Again we partition $\alpha$ so that it corresponds to the partitioning of $\beta$: $\alpha = (\alpha_1, \ldots, \alpha_r)$. The concentrated model can be written as:

$$R_{0t} = \alpha_1\varphi_1'H_1'R_{1t} + \ldots + \alpha_r\varphi_r'H_r'R_{1t} + \varepsilon_t.$$

The estimation problem can be solved by extending the switching algorithm, introduced in Section 9.3, described below:

1. Estimate an initial value of $\beta_1 = \tilde\beta_1, \ldots,$ and $\beta_{r-1} = \tilde\beta_{r-1}$.

2. For fixed value of $\beta_1 = \tilde\beta_1, \ldots, \beta_{r-1} = \tilde\beta_{r-1}$ estimate $\alpha_r$ and $\varphi_r$ by reduced rank regression of $R_{0t}$ on $H_r'R_{1t}$ corrected for $\tilde\beta_1'R_{1t}, \ldots, \tilde\beta_{r-1}'R_{1t}$. This defines $\tilde\beta_r = H_r\tilde\varphi_r$ and $L_{\max}^{-2/T}(\tilde\beta_1, \ldots, \tilde\beta_{r-1})$.

3. For fixed value of $\beta_2 = \tilde\beta_2, \ldots, \beta_r = \tilde\beta_r$ estimate $\alpha_1$ and $\varphi_1$ by reduced rank regression of $R_{0t}$ on $H_1'R_{1t}$ corrected for $\tilde\beta_2'R_{1t}, \ldots, \tilde\beta_r'R_{1t}$. This defines $\tilde\beta_1 = H_1\tilde\varphi_1$ and $L_{\max}^{-2/T}(\tilde\beta_2, \ldots, \tilde\beta_r)$.

4. Repeat the steps using the last obtained values of $\tilde\beta_i$ until the value of the maximized likelihood function has converged.

The eigenvalue problem for fixed $\tilde{\boldsymbol\beta}_1 = \{\beta_1 = \tilde\beta_1, \ldots, \beta_{r-1} = \tilde\beta_{r-1}\}$ is given by:

$$\left|\rho_r H_r'S_{11.\tilde{\boldsymbol\beta}_1}H_r - H_r'S_{10.\tilde{\boldsymbol\beta}_1}S_{00.\tilde{\boldsymbol\beta}_1}^{-1}S_{01.\tilde{\boldsymbol\beta}_1}H_r\right| = 0$$

and for fixed $\tilde{\boldsymbol\beta}_2 = \{\beta_2 = \tilde\beta_2, \ldots, \beta_r = \tilde\beta_r\}$ by:

$$\left|\rho_1 H_1'S_{11.\tilde{\boldsymbol\beta}_2}H_1 - H_1'S_{10.\tilde{\boldsymbol\beta}_2}S_{00.\tilde{\boldsymbol\beta}_2}^{-1}S_{01.\tilde{\boldsymbol\beta}_2}H_1\right| = 0$$

and so on. The maximum of the likelihood function is given by:

$$L_{\max}^{-2/T}(H^c) = |S_{00}|(1-\hat\rho_1)\cdots(1-\hat\rho_r)$$

where $\hat\rho_i$ are the eigenvalues obtained from applying the switching algorithm until convergence of the likelihood function. Using the LR procedure, the hypothesis (??) can be tested by calculating the test statistic:

$$-2\ln\Lambda = T\{\ln(1-\hat\rho_1) + \ldots + \ln(1-\hat\rho_r) - \ln(1-\hat\lambda_1) - \ldots - \ln(1-\hat\lambda_r)\}, \qquad (10.7)$$

which, under the assumption that all $H_i$ are identifying, is asymptotically $\chi^2$ distributed with degrees of freedom given by:

$$\upsilon = \sum_{i=1}^{r}(m_i - r + 1) = \sum_{i=1}^{r}\big[(p1 - r) - (s_i - 1)\big].$$

How to calculate the degrees of freedom will be given a detailed discussion in the next section.

To summarize: For fixed values of $\varphi_2, \ldots, \varphi_r$, or $\beta_2, \ldots, \beta_r$, we can find the ML estimate of $\beta_1$ by performing a reduced rank regression of $\Delta x_t$ on $H_1'x_{t-1}$ corrected for all the stationary and deterministic terms, that is, $\beta_2'x_{t-1}, \ldots, \beta_r'x_{t-1}$, $\Delta x_{t-1}$ and $D_t$. This determines the estimate of $\varphi_1$ and, hence, $\beta_1 = H_1\varphi_1$. In the next step we keep the values $\beta_1, \beta_3, \ldots, \beta_r$ fixed and perform a reduced rank regression of $\Delta x_t$ on $H_2'x_{t-1}$ corrected for all

stationary and deterministic terms. This determines $\beta_2$. By applying the algorithm until the likelihood function has converged to its maximum we can find the maximum likelihood estimates of $\beta$ subject to the identifying restrictions.

The speed of convergence of the switching algorithm depends very much on how we choose the initial values of $\beta$. For example, the unrestricted estimates of $\beta$ are not in general the best choice, because the unrestricted eigenvectors need not correspond to the ordering given by $H_1, \ldots, H_r$ and, thus, can be very poor initial values. Instead, the linear combination of the unrestricted estimates which is as close as possible to $sp(H_i)$, $i = 1, \ldots, r$ is clearly preferable as a starting value for $\beta_i$. These can be found by solving the eigenvalue problem:

$$\left|\rho H_i'H_i - H_i'\hat\beta(\hat\beta'\hat\beta)^{-1}\hat\beta'H_i\right| = 0,$$

for the $r$ eigenvalues $\rho_1 > \ldots > \rho_r$ and eigenvectors $v_1, \ldots, v_r$, and choose as initial value for $\beta_i$ the first eigenvector defined by $H_i\varphi_i$. This choice of initial values has the extra advantage that for exactly identified equations no iterations are needed.

10.3 Formulation of identifying hypotheses and degrees of freedom

When testing restrictions imposed on the cointegration relations using readily available software packages, the degrees of freedom are usually provided by the program. However, some hypotheses can impose restrictions which are quite complicated and where the standard formula for calculating degrees of freedom may no longer be applicable. CATS will in such cases suggest a number and ask if you agree or simply prompt for the degrees of freedom. It is, therefore, important to understand the logic behind the calculations of the degrees of freedom. Furthermore, even if most software packages check identification using some generic values for the model parameters and inform the user when identification is violated, we need to understand why identification failed in order to respecify the restrictions.

The following example illustrates how to calculate degrees of freedom when we have imposed over-identifying restrictions on three cointegrating relations in a VAR analysis of the Danish data $(m^r_t, y^r_t, \Delta p_t, R_{m,t}, R_{b,t})$. The

first relation expresses that $(m^r_t - y^r_t)$ and $(R_{m,t} - R_{b,t})$ are cointegrated, so are driven by the same stochastic trend, the second relation that $y^r_t$, $\Delta p_t$ and $R_{b,t}$ are cointegrated, so share two common trends, and the third relation that $R_{m,t}$ and $R_{b,t}$ are cointegrated, so share one common trend. The first step is to examine whether the restricted structure defined by $\beta^c = \{H_1\varphi_1, \ldots, H_r\varphi_r\}$ satisfies the rank and order condition for identification. We will here illustrate how this can be done analytically using condition (10.5) based on the following example where, for simplicity, we disregard any deterministic components in the cointegration relations:

$$\begin{pmatrix} \beta^c_{11} & -\beta^c_{11} & 0 & \beta^c_{12} & -\beta^c_{12} \\ 0 & \beta^c_{21} & \beta^c_{22} & 0 & \beta^c_{23} \\ 0 & 0 & 0 & \beta^c_{31} & \beta^c_{32} \end{pmatrix} \begin{pmatrix} m^r_t \\ y^r_t \\ \Delta p_t \\ R_{m,t} \\ R_{b,t} \end{pmatrix} \qquad (10.8)$$

As discussed in Chapter 7 the parameters $(\beta^c_{11}, \beta^c_{12})$, $(\beta^c_{21}, \beta^c_{22}, \beta^c_{23})$ and $(\beta^c_{31}, \beta^c_{32})$ are defined up to a factor of proportionality, and it is possible to normalize on one element in each vector without changing the likelihood:

$$\begin{pmatrix} 1 & -1 & 0 & \beta^c_{12}/\beta^c_{11} & -\beta^c_{12}/\beta^c_{11} \\ 0 & 1 & \beta^c_{22}/\beta^c_{21} & 0 & \beta^c_{23}/\beta^c_{21} \\ 0 & 0 & 0 & 1 & \beta^c_{32}/\beta^c_{31} \end{pmatrix} \begin{pmatrix} m^r_t \\ y^r_t \\ \Delta p_t \\ R_{m,t} \\ R_{b,t} \end{pmatrix} \qquad (10.9)$$

When normalizing $\beta^c_i$ by dividing through with a non-zero element $\beta^c_{ij}$, the corresponding $\alpha^c_i$ vector is multiplied by the same element. Thus, normalization does not change $\Pi = \alpha^c\beta^{c\prime} = \alpha\beta'$ and we can generally choose whether to normalize or not. However, when the long-run structure is identified normalization becomes more important. This is because it is only possible to get standard errors of $\hat\beta_{ij}$ when each cointegration vector is properly normalized. In this case it is convenient to express $\beta^c_i = h_i + \tilde H_i\tilde\varphi_i$ where $\tilde\varphi_i$ is now $(s_i - 1) \times 1$, $h_i$ is a vector in $sp(H_i)$ defining the chosen normalization, and $sp(h_i, \tilde H_i) = sp(H_i)$ (see Johansen, 1995, p. 76).

As discussed in Chapter 9.1 hypotheses on the cointegration structure can be formulated either by specifying the number of free parameters $\varphi_i$, i.e. $\beta^c = (H_1\varphi_1, \ldots, H_r\varphi_r)$, or the number of restrictions $m_i$, i.e. $R'\beta = 0$. When we discuss identification it is convenient to choose the former formulation,

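i.e. to express the hypotheses in terms of the number of free parameters. For each cointegration vector $\beta^c_i$ we have to make a distinction between the normalized coefficient and the remaining $s_i - 1$ free coefficients $\tilde\varphi_i$. As an illustration we express the above structure (10.9) using $\beta^c_i = h_i + \tilde H_i\tilde\varphi_i$:

$$\beta^c = \{h_1 + \tilde H_1\tilde\varphi_1,\; h_2 + \tilde H_2\tilde\varphi_2,\; h_3 + \tilde H_3\tilde\varphi_3\},$$

where

$$h_1 + \tilde H_1\tilde\varphi_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ -1 \end{pmatrix}[\tilde\varphi_{12}], \quad h_2 + \tilde H_2\tilde\varphi_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{bmatrix} \tilde\varphi_{22} \\ \tilde\varphi_{23} \end{bmatrix}, \quad h_3 + \tilde H_3\tilde\varphi_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}[\tilde\varphi_{32}]. \qquad (10.10)$$

For given estimates of $\beta^c$ the estimates of $\alpha^c$ are given by the formula in Chapter 7:

$$\hat\alpha^c = S_{01}\hat\beta^c(\hat\beta^{c\prime}S_{11}\hat\beta^c)^{-1}, \qquad (10.11)$$

and the standard errors of $\hat\beta^c_{ij}$ are calculated as:

$$\hat\sigma_{\hat\beta^c_{ij}} = \sqrt{T^{-1}\,\mathrm{diag}\big[\tilde H\{\tilde H'(\hat\alpha^{c\prime}\hat\Omega^{-1}\hat\alpha^c \otimes S_{11})\tilde H\}^{-1}\tilde H'\big]_{ij}} \qquad (10.12)$$

where

$$\tilde H = \begin{pmatrix} \tilde H_1 & 0 & \cdots & 0 \\ 0 & \tilde H_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde H_r \end{pmatrix}$$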

however. thus. that some of the restrictions may not be identifying (for example the same restriction on all cointegration relations). . Whatever the case.10. Consider si .5). otherwise they are non-identifying though nevertheless real testable restrictions.5). p1. Note. and mi = p − si . thus.. Note that pseudo ’restrictions’ are just-identifying only if there are exactly r − 1 of them and they satisfy the condition (10. the i total number of restrictions on vector βc . then the degrees of freedom in the i above example are calculated as follows: si s1 = 2 s2 = 3 s3 = 2 mi m1 = 3 m2 = 2 m3 = 3 r−1 2 2 2 mi − (r − 1) 1 0 1 so the degrees of freedom are ν = 2. no testing is involved in this case. but are nevertheless testable restrictions. Additional restrictions on the structure change the value of the likelihood function and.. are testable.. The standard errors of the corresponding αc coeﬃcients are calculated ˆ ij as: q ˆ ˆ c ˆ c0 ˆ c −1 ˆ c0 = T −1 Ωii (β (β S11 β ) β )jj σ αc ˆ ij (10. Such restrictions are over-identifying if they satisfy (10.. Such ’pseudo’ restrictions do not change the value of the likelihood function and. FORMULATING IDENTIFYING HYPOTHESES 205 and the elements diag[·]ij are deﬁned by the ordering i = 1. .. the number of restricted coeﬃcients in βc .5) the degrees of freedom can be calculated from the following formula: ν= X (mi − (r − 1)).3. Given that the restrictions satisfy (10.. .13) We will show below that it is always possible to impose r − 1 justidentifying restrictions on β by linear manipulations of the unrestricted cointegration vectors. the restrictions that are identifying must as a minimum satisfy the condition for just identiﬁcation. r and j = 1.

10.4 Just-identifying restrictions

In general we can always transform the long-run matrix $\Pi = \alpha\beta'$ by a nonsingular $r \times r$ matrix $Q$ in the following way: $\Pi = \alpha QQ^{-1}\beta' = \tilde\alpha\tilde\beta'$, where $\tilde\alpha = \alpha Q$ and $\tilde\beta = \beta Q'^{-1}$. We will now demonstrate how to choose the matrix $Q$ so that it imposes $r - 1$ just-identifying restrictions on each $\beta_i$. As an example of the latter, we consider the following design matrix $Q = [\beta_1]$ where $\beta_1$ is a $(r \times r)$ nonsingular matrix defined by $\beta' = [\beta_1, \beta_2]$. In this case $\alpha\beta' = \alpha(\beta_1\beta_1^{-1}\beta') = \tilde\alpha[I, \tilde\beta]$ where $I$ is the $(r \times r)$ unit matrix and $\tilde\beta = \beta_1^{-1}\beta_2$ is a $r \times (p - r)$ matrix of full rank. For example, assume that $\beta$ is $(5 \times 3)$:

$$\beta = \begin{pmatrix} \beta_{11} & \beta_{12} & \beta_{13} \\ \beta_{21} & \beta_{22} & \beta_{23} \\ \beta_{31} & \beta_{32} & \beta_{33} \\ \beta_{41} & \beta_{42} & \beta_{43} \\ \beta_{51} & \beta_{52} & \beta_{53} \end{pmatrix} = \begin{pmatrix} \beta_1' \\ \beta_2' \end{pmatrix}, \qquad \beta\beta_1'^{-1} = \begin{pmatrix} I \\ \tilde\beta' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \tilde\beta_{41} & \tilde\beta_{42} & \tilde\beta_{43} \\ \tilde\beta_{51} & \tilde\beta_{52} & \tilde\beta_{53} \end{pmatrix}$$

We notice that the choice of $Q = \beta_1$ in our example has in fact imposed two zero restrictions and one normalization on each cointegration relation. These just-identifying restrictions have transformed $\beta$ to the long-run "reduced form". Thus, the above example for $x_t' = [x_{1t}, x_{2t}]$, where $x_{1t}' = [x_{1t}, x_{2t}, x_{3t}]$ and $x_{2t}' = [x_{4t}, x_{5t}]$, would describe an economic application where the three variables in $x_{1t}$ are 'endogenous' and the two in $x_{2t}$ are 'exogenous'. When 'endogenous' and 'exogenous' are given an economic interpretation this corresponds to the representation suggested by Phillips (1990). Note that the triangular representation requires that $\alpha_2 = 0$, which is a testable hypothesis and it is straightforward to test whether this is a good description of the data. Furthermore, if $\alpha_2 = 0$, then $\beta'x_t$ does not appear in the equation for $\Delta x_{2,t}$ and $x_{2,t}$ is weakly exogenous for $\beta$. In this case efficient inference on the long-run relations can be conducted in the conditional model of $\Delta x_{1,t}$ given $\Delta x_{2,t}$ as discussed in the previous chapter, see also Johansen (1992a).

Example 1: We consider here a just-identified structure describing the long-run 'reduced form' assuming real money, inflation, and the short-term

[Table 10.1: Two just-identified long-run structures, HS1 and HS2. Columns: $\hat\beta_1, \hat\beta_2, \hat\beta_3$ for each structure. Rows: $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$, $D_{83}$, followed by the adjustment coefficients $\hat\alpha_1, \hat\alpha_2, \hat\alpha_3$ in the equations for $\Delta m^r_t$, $\Delta y^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta R_{b,t}$, with t-values in parentheses. The cell values are not recoverable.]

interest rate to be 'endogenous' and real income and the bond rate to be 'exogenous' corresponding to the following restrictions on $\beta$:

$$HS_1: \beta = (H_1\varphi_1, H_2\varphi_2, H_3\varphi_3),$$

where, with rows ordered as $(m^r, y^r, \Delta p, R_m, R_b, D_{s83})$,

$$H_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad H_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad H_3 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

i.e. $H_1$ picks up the inflation rate, $H_2$ real money, $H_3$ the short-term interest rate, and the two 'exogenous' variables and the shift dummy $D_{s83}$ enter all

three relations. There is no testing in this case since the $r - 1 = 2$ restrictions have been obtained by linear combinations of the unrestricted relations, i.e. by rotating the cointegration space. Because no real restrictions have been imposed the matrix $\hat\Pi = \hat\alpha\hat\beta' = \hat\alpha^c\hat\beta^{c\prime}$, i.e. it is exactly the same as for the unrestricted model. The estimates are reported in Table 10.1. Only coefficients with a |t-value| > 1.6 have been reported. The $\alpha$ coefficients are reported in the lower part of Table 10.1.

The first relation is similar to the real income relation $H_{16}$ in Table 9.2, noticing that the coefficient to bond rate (and the shift dummy) is not significantly different from zero. The second relation is approximately money velocity as a function of the long-term bond rate, except that the income coefficient is not as close to unity as we have seen before. The third relation is almost replicating $H_{25}$ in Table 9.2 (noticing that real income is not significant). It is now easy to see that the combination $\hat\alpha^c_{12}\hat\beta^{c\prime}_2x_t + \hat\alpha^c_{13}\hat\beta^{c\prime}_3x_t$ will replicate the money demand relation $H_{15}$ in Table 9.2. Hence, the money demand relation, $m^r - y^r - 14.1(R_m - R_b) - 0.145D_{s83}$, is in fact a linear combination of two stationary relations. Similarly, the linear combination $\hat\alpha^c_{31}\hat\beta^{c\prime}_1x_t + \hat\alpha^c_{33}\hat\beta^{c\prime}_3x_t$ will replicate the inflation relation $H_{28}$ in Table 9.2.

Example 2. Here we give an example of both zero and non-zero (homogeneity) just-identifying restrictions:

$$HS_2: \beta = (H_1\varphi_1, H_2\varphi_2, H_3\varphi_3),$$

where the design matrices have the entries (given here as flattened lists, the matrix layout is not recoverable):

$H_1$: 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 −1 0 0 0 1

$H_2$: 1 −1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1

$H_3$: 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1

In this case we check that the restrictions are in fact just-identifying by calculating the rank condition (10.5). Just identification corresponds to $\mathrm{rank}(R_i'H_j) = 1$ for $i, j = 1, 2, 3$ and $j \ne i$, and $\mathrm{rank}(R_i'(H_j, H_k)) = 2$ for $i, j, k$ different. The rank indices reported in Table 10.3 column HS2

[Table 10.2: Two over-identified long-run structures for the Danish data, HS3 and HS4. Columns: $\hat\beta_1, \hat\beta_2, \hat\beta_3$ for each structure. Rows: $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$, $D_{83}$, followed by $\hat\alpha_1, \hat\alpha_2, \hat\alpha_3$ in the equations for $\Delta m^r_t$, $\Delta y^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta R_{b,t}$, with t-values in parentheses. The cell values are not recoverable.]

confirm that the above conditions are indeed met. The estimates of $\alpha$ and $\beta$ are given in the right hand side of Table 10.1. It appears that the second relation is now the money demand relation, whereas $\beta_1 + \beta_3$ becomes the first relation and $\beta_1 - \beta_3$ the third relation under HS1.

These two examples serve the purpose of illustrating that in general one can find several identified structures by rotating the cointegrating space. In the next section we will discuss how to impose over-identifying restrictions on $\beta$ and how to choose between different long-run structures.

10.5 Over-identifying restrictions

We will consider two overidentified structures based on the money demand data for $p1 = 6$ and $r = 3$. Table 10.1 showed that two of the estimated coefficients in the just-identified structures were insignificant. Formal identification requires that $\mathrm{rank}(R_i'H_j) \ge 1$ for $i, j = 1, 2, 3$ and $j \ne i$, and that $\mathrm{rank}(R_i'(H_j, H_k)) \ge 2$ for $i, j, k$

Table 10.3: Checking the rank conditions

| Rank index | HS2 | HS3 | HS4 | HS5 |
|---|---|---|---|---|
| $r_{1.2}$ | 1 | 3 | 2 | 2 |
| $r_{1.3}$ | 1 | 2 | 1 | 0 |
| $r_{2.1}$ | 1 | 1 | 2 | 3 |
| $r_{2.3}$ | 1 | 1 | 1 | 2 |
| $r_{3.1}$ | 1 | 2 | 2 | 2 |
| $r_{3.2}$ | 1 | 3 | 2 | 3 |
| $r_{1.23}$ | 2 | 4 | 3 | 2 |
| $r_{2.13}$ | 2 | 2 | 3 | 3 |
| $r_{3.12}$ | 2 | 4 | 4 | 4 |

different. The rank indices are given in Table 10.3 column HS3 and HS4, where the $i.j$ elements should be at least 1 and the $i.jk$ elements at least 2 for generic identification. The degrees of freedom in the test for overidentifying restrictions are calculated as $\nu = \sum_i(m_i - r + 1)$, where $m_i$ is the number of restrictions on $\beta_i$.

Example 3. We first consider an overidentified structure defined by:

$$HS_3: \beta = (H_1\varphi_1, H_2\varphi_2, H_3\varphi_3),$$

where, with rows ordered as $(m^r, y^r, \Delta p, R_m, R_b, D_{s83})$,

$$H_1 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad H_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad H_3 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix},$$

i.e. structure HS1 with the insignificant coefficients set to zero. The estimates are given in Table 10.2. The degrees of freedom are $\nu = \sum_{i=1}^{r}(m_i - (r - 1)) = (4 - 2) + (2 - 2) + (4 - 2) = 2 + 0 + 2 = 4$ and the test statistic became $\chi^2(4) = 1.79$ with a p-value of 0.77. Thus the structure can clearly be accepted.

Example 4. The next example consists of the joint testing of the three stationary relations $H_{28}$, $H_{15}$, and $H_{25}$ reported in Table 9.2. They are reported

under HS4 in Table 10.2. The degrees of freedom are $\nu = \sum_{i=1}^{r}(m_i - (r - 1)) = 1 + 1 + 2 = 4$ and the test statistic became $\chi^2(4) = 1.60$ with a p-value of 0.81. Thus, both HS3 and HS4 are acceptable long-run structures.

Does it matter or not whether we choose one representation or the other? In the present example the answer is probably 'not really'. The relations of HS4 describe directly interpretable steady-state relations, and as demonstrated above, are in fact linear combinations of the building blocks in HS3. HS3 has more the character of 'building blocks' (corresponding to the irreducible cointegration vectors in Davidson 2001) which, when weighted by the corresponding $\alpha_{ij}$, become more interpretable from an economic point of view. Most economists would probably prefer a structural representation that mimics as closely as possible the hypothetical steady-state relations.

From a statistical point of view the 'building blocks' representation is in many cases preferable for two different but related reasons: First, when a steady-state relation is a direct combination of two stationary building blocks, such as $(m^r - y^r) - \hat b_1(R_m - R_b)$ where $(m^r - y^r) \sim I(0)$ and $(R_m - R_b) \sim I(0)$, then $\hat b_1$ is no longer a coefficient between two nonstationary variables, and the super-consistency result for estimated cointegration coefficients no longer holds. In fact, $\hat b_1$ has now the meaning of an $\alpha$ coefficient combining two stationary cointegration relations. Second, when $r$ is large relative to $p - r$ it becomes increasingly difficult to identify $r$ meaningful steady-state relations given that at least $r - 1$ restrictions have to be imposed on each relation.

We demonstrated above that the same money demand relation can also be obtained by combining two stationary building blocks. Nevertheless, as demonstrated in Table 9.2, the money demand relation in HS4 is a linear combination of two nonstationary relations, $(m^r - y^r) \sim I(1)$ and $(R_m - R_b) \sim I(1)$. Similar arguments hold for the first relation in HS4. Because of the specific restrictions imposed the $\beta_{ij}$ coefficients of HS4 are all coefficients between nonstationary variables. Hence, the calculated standard errors are based on correct distributional assumptions in both HS3 and HS4.

10.6 Lack of identification

Example 5: We now consider the structure defined by:

$$HS_5: \beta = (H_1\varphi_1, H_2\varphi_2, H_3\varphi_3),$$

where, with rows ordered as $(m^r, y^r, \Delta p, R_m, R_b, D_{s83})$,

$$H_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad H_2 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad H_3 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix},$$

i.e. structure HS4 without imposing the homogeneity restriction on $\beta_1$. The rank conditions of Table 10.3 show that relation 1 is not identified relative to relation 3. We say that $H_3$ is a subset of $H_1$. Thus, the model specified by the restrictions in HS5 is not identifying in the sense defined here, implying that the four parameters $\varphi_{11}$, $\varphi_{12}$, $\varphi_{13}$ and $\varphi_{14}$ cannot be estimated uniquely without further restrictions. The reason is that the above structure cannot be distinguished from $\{\beta^c_1 + \omega\beta^c_3, \beta^c_2, \beta^c_3\}$, $\omega \in R$. Therefore, in this set-up we can only estimate the impact of a linear combination of the interest rates in the first relation. Another way of expressing this is that one of the interest rates can be removed from $\hat\beta_1$ by adding a linear combination of $\hat\beta_3$. For example, $\hat\beta_1 - 0.36\hat\beta_3$ removes the bond rate from $\hat\beta_1$.

Thus, though the restrictions in Table 10.4 are not identifying, they are genuine restrictions on the parameter space and the model can be tested by a likelihood ratio test, but in this case we need to tell CATS how many degrees of freedom there are in the test. In this case the test statistic became 2.07 with $\nu = 1 + 1 + 2 = 4$ degrees of freedom and the estimated relations clearly define stationary relations in the cointegration space. Nevertheless, identification can be restored by either imposing the homogeneity restriction as in HS4 or by setting one of the two interest rates to zero in $H_1\varphi_1$. For example, the homogeneity restriction of HS4 identifies $\beta_1$ uniquely in the sense that $\omega\beta^c_3$ can no longer be added without violating the former.

10.7 Recursive tests of α and β

The recursive tests of the parameters of the identified model are simply plots of $\hat\beta^c(t_1)$ and $\hat\alpha^c(t_1)$ for $t_1 = T_0, \ldots, T$. The standard errors are calculated as indicated in (10.12) and (10.13).
structure HS4 without imposing the homogeneity restriction on β 1 . βc .3 shows that relation 1 is not identiﬁed relative to relation 3.

[Table 10.4: An unidentified long-run structure for the Danish data, HS5. Columns: $\hat\beta_1, \hat\beta_2, \hat\beta_3$. Rows: $m^r$, $y^r$, $\Delta p$, $R_m$, $R_b$, $D_{83}$, followed by $\hat\alpha_1, \hat\alpha_2, \hat\alpha_3$ in the equations for $\Delta m^r_t$, $\Delta y^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta R_{b,t}$, with t-values in parentheses. The cell values are not recoverable.]

[Figure 10.1. Recursively calculated coefficients of $\beta_1(t_1)$ for $t_1$ = 1983:2, ..., 1993:4. Panels: MO, FY, DIFPY, IDE, IBO, D83; horizontal axis: years 83 to 93.]

[Figures 10.2 and 10.3: Recursively calculated coefficients of $\alpha_1(t_1)$ and $\beta_2(t_1)$ for $t_1$ = 1983:2, ..., 1993:4. Panels for $\alpha_1$: DMO, DFY, DDIFPY, DIDE, DIBO. Panels for $\beta_2$: MO = 1.00, FY = -1.00, DIFPY = 0.00, IDE, IBO, D83.]

[Figures 10.4 and 10.5: Recursively calculated coefficients of $\alpha_2(t_1)$ and $\beta_3(t_1)$ for $t_1$ = 1983:2, ..., 1993:4. Panels for $\alpha_2$: DMO, DFY, DDIFPY, DIDE, DIBO. Panels for $\beta_3$: MO = 0.00, FY = 0.00, DIFPY = 0.00, IDE = 1.00, IBO, D83 = 0.00.]

[Figure 10.6: Recursively calculated coefficients of $\alpha_3(t_1)$ for $t_1$ = 1983:2, ..., 1993:4. Panels: DMO, DFY, DDIFPY, DIDE, DIBO.]

We note that the estimated long-run coefficients have remained remarkably stable ever since 1983:2.

10.8 Concluding discussions

This chapter demonstrated that an empirically stable money demand relation of the type discussed in Romer (1996) can be found as a linear combination $(1, -13.8)$ of the two stationary cointegration relations $(m^r - y^r + 8.6R_b - 0.2D_s83)_t$ and $(R_m - R_b)_t$, or, alternatively, as a cointegration relation $(1, -13.8)$ between two nonstationary processes, $(m^r - y^r - 0.2D_s83)_t$ and $(R_m - 0.4R_b)_t$. Given that the VAR model is a good description of the information in the data, we do not only obtain full information maximum likelihood estimates of the money demand parameters (which in general have optimal properties) but, as already discussed in Chapter 9, we also gain information about the number of autonomous shocks, i.e. the common stochastic trends driving the system.

Furthermore, by embedding the money demand relation (as well as the other stationary relations) in a dynamic equilibrium error correction system it is possible to gain empirical insight into the short-run dynamics of the adjustment and feed-back behavior. This will be the topic of the next chapter, in which we will ask questions like: Is it possible, by expanding money supply in excess of the real productivity level in the economy, to permanently increase real income, or is the final effect only an increase in the inflation rate? Should money supply or the interest rates be used as monetary instruments? Is money stock endogenously or exogenously determined? Is the level of the market interest rates determined by the central bank or by the financial market, or by both? What are the empirical consequences of either case?

Chapter 11 Identification of the Short-Run Structure

In the previous chapter we gave the arguments for why it is possible to treat the identification as two separate, though interdependent, problems. In this chapter we will discuss identification of a short-run structure where we allow simultaneous current effects as well as short-run adjustment effects to lagged changes of the variables and to previous equilibrium errors to be part of the model. Essentially all the results discussed in Section 10.1 apply also to the short-run structure of the model and will not be repeated here. This means that an identified short-run adjustment structure should satisfy the conditions for generic, empirical and economic identification in the same way as the long-run structure. However, the residual covariance matrix plays an important role in the identification of the short-run structure, whereas the long-run covariance matrix of the cointegrating relations was not part of the identification process. In this important respect the two identification problems differ from each other. However, except trivially in the triangular form model, we will not impose identifying restrictions on the residuals, such as orthogonality restrictions. Identifying restrictions on the residuals will be discussed in Chapter 13, where we also will discuss how to distinguish between temporary and permanent errors in a structural VAR model.

The organization of this chapter is as follows: Section 11.1 discusses how to impose identifying restrictions on the short-run adjustment dynamics, Section 11.2 the difficult problem of interpreting linear functions of VAR residuals as structural shocks, and Section 11.3 empirical and economic identification and provides some examples of economic questions related to the short-run adjustment structure. Sections 11.4, 11.5 and 11.6 illustrate short-run identification in three different model representations. Section 11.7 discusses identification of the short-run structure in a partial model and illustrates with the Danish data. Finally, Section 11.8 concludes with a discussion of the economic plausibility of the estimated results based on generically and empirically identified models for the Danish data.

11.1 Formulating identifying restrictions

When discussing identification of the short-run adjustment parameters we will assume that the cointegration relations have been properly identified in the first step of the identification scheme. The discussion of how to impose identifying restrictions on the short-run parameters will then be based on a given identified $\beta = \hat\beta^c$, i.e. treating $\hat\beta^{c\prime}x_t$ as predetermined stationary regressors similarly to $\Delta x_{t-1}$. The statistical justification is that the estimates of the long-run parameters $\hat\beta^c$ are super consistent, i.e. the speed of convergence toward the true value $\beta^c$ is proportional to $t$ as $t \to \infty$, whereas the convergence of the estimates of the short-run adjustment parameters is proportional to $\sqrt{t}$.

The cointegrated VAR model is a reduced form model in the short-run dynamics in the sense that potentially important current (simultaneous) effects are not explicitly modeled but are left in the residuals. Thus, large off-diagonal elements of the covariance matrix $\Omega$ can be a sign of significant current effects between the system variables. In some cases the residual covariances are small and of minor importance and can be disregarded altogether. Since the reduced form is always generically identified, all further restrictions on the short-run structure are then overidentifying. While a simplification search in the reduced form VAR model is quite simple, this is generally not the case when the covariance matrix $\Omega$ is part of the identification process.

For simplicity we disregard dummy variables at this stage. As already discussed, the identification of the short-run structure is very much facilitated by keeping the properly identified cointegrating relations fixed at their estimated values, i.e. for:

as before.2) (11.1. Thus. have to satisfy the usual conditions for identiﬁcation. When checking for generic identiﬁcation of the short-run structure (11. Hp ϕp ). be formulated by the design matrices Hi : A = (H1 ϕ1 . In the latter case.1) (11. µ0.a + vt . we need to impose at least p×(p−1) just-identifying restrictions on (11... β 0 xt = ut : A0 Xt = µ0. similar as for the cointegration structure. . in general. Either of the two ways of calculating the identiﬁcation rank indices described in the previous chapter can be used. Σ) c0 (11. .. whereas in the former case they will be included in the vector X0t and. −A1 . x0t−1 βc ) is a stationary process.a = A0 µ0 .11. The structural vector error correction model (11. but introduces p×(p−1) new parameters (assuming that the diagonal elements of A0 are ones and that the residual covariance matrix is unrestricted). a = A0 α. To ﬁnd out whether the restrictions deﬁning the model are identifying one can either check the rank conditions in (??) based on (??) or by using some generic parameter values.4) we can use the same rank condition given in Chapter 10 for the longrun structure. the dummy variables will be unrestricted in all equations. −a) and X0t = (∆x0t . Ω) or equivalently: ˆ A0 ∆xt = A1 ∆xt−1 + aβ xt−1 + µ0.4) where A0 = (A0 . εt ∼ Np (0. thus.a +vt . choose whether to include them in the identiﬁcation process or not. Multiplying the VAR model with a nonsingular (p × p) matrix A0 does not change the likelihood function.3) where A1 = A0 Γ1 .3) to obtain a unique solution. (11.3) can be written in a more compact form. ∆x0t−1 . Identifying restrictions on the rows of A0 (we assume here that the constant term is not part of the identiﬁcation process) can. and vt = A0 εt . Note that when the model contains dummy variables we can. FORMULATING IDENTIFYING RESTRICTIONS 221 ˆ c0 A0 ∆xt = A0 Γ1 ∆xt−1 +A0 αβ xt−1 +A0 µ0 +A0 εt . vt ∼ Np (0.

Given that the short-run identification structure is generically identified, estimation can in principle be carried out by the switching algorithm of the eigenvalue routine described in Chapter 10. Since the dimension of the equation system is generally larger than the dimension of the cointegration space, the switching algorithm can be a little cumbersome and it is common to apply other maximization algorithms. See, for example, the selection of different estimation algorithms in GiveWin (Doornik and Hendry, 2003).

11.2 Interpreting shocks

Because different choices of $A_0$ lead to different estimates of the residuals, the question of how to define a shock is important. The theoretical concept of a shock, and its decomposition into an anticipated and unanticipated part, has a straightforward correspondence in the VAR model as a change of a variable, $\Delta x_t$, and its decomposition into an explained part, the conditional expectation $E_{t-1}\{\Delta x_t \mid X_{t-1}\}$, and an unexplained part, the residual $\varepsilon_t$. The requirement for $\varepsilon_t$ to be a correct measure of an unanticipated (autonomous) shock is that the conditional expectation $E_{t-1}\{\Delta x_t \mid X_{t-1}\}$ correctly describes how agents form their expectations. For example, if agents make model based rational expectations from a model that is different from the VAR model, then the conditional expectation would no longer be an adequate description of the anticipated part of the shock $\Delta x_t$. However, in this case the VAR model would not be a good description of the information in the data.

Theories also require shocks to be "structural", implying that they are, in some sense, objective, meaningful, or absolute. With the reservation that the word structural has been used to cover a wide variety of meanings, we will here assume that it describes a shock, the effect of which is (1) unanticipated (novelty), (2) unique (a shock hitting money stock alone), and (3) invariant (no additional explanation by increasing the information set).

The novelty of a shock depends on the credibility of the expectations formation, i.e. whether $\varepsilon_t = \Delta x_t - E_{t-1}\{\Delta x_t \mid g(X_{t-1})\}$ is a correct measure of the unanticipated change in $x$.

The uniqueness can be achieved econometrically by choosing $A_0$ so that the covariance matrix $\Omega$ becomes diagonal. For example, by postulating a causal ordering among the variables of the system one can trivially achieve uncorrelated residuals, as will be demonstrated in Section 11.5. In general, the VAR residuals can be orthogonalized in many different ways and whether the orthogonalized residuals $\hat v_t$

can be given an economic interpretation as a unique structural shock depends crucially on the plausibility of the identifying assumptions. Some of them are just-identifying and cannot be tested against the data. Thus, different schools will claim structural explanations for differently derived estimates based on the same data.

The invariance of a structural shock is probably the most crucial requirement as it implies that an empirically estimated structural shock should not change when increasing the information set. Theory models defining deep structural parameters are always based on many simplifying assumptions, including numerous ceteris paribus assumptions. In empirical models the ceteris paribus assumptions should preferably be accounted for by conditioning on the ceteris paribus variables. Since essentially all macroeconomic systems are stochastic and highly interdependent, the inclusion of additional ceteris paribus variables in the model is likely to change the VAR residuals and, hence, the estimated shocks.

Therefore, though derived from sophisticated theoretical models, structural interpretability of estimated shocks seems hard to justify. In the words of Haavelmo (1943) there is no close association between the true shocks defined by the theory model and the measured shock based on the estimated VAR residuals. A structural shock seems to be a theoretical concept with little or fragile empirical content in macro-econometric modelling.

In the remaining part of this chapter we will discuss economic identification in the broad sense of answering some questions of economic relevance, and not in the narrow sense of identifying deep structural parameters.

11.3 Which economic questions?

Macroeconomic theory is usually quite informative about prior economic hypotheses relevant for the long-run structure, whereas much less seems to be known about the short-run adjustment mechanisms. Thus, the identification of the short-run structure often has the character of data analysis aiming at the "identification" of a parsimonious parameterization rather than testing well-specified economic hypotheses. Nevertheless, the simplification search should preferably be guided by some relevant questions that the empirical analysis should provide an answer to.

From (11.3) we note that the data variation can be decomposed into a systematic (anticipated) part and an unsystematic (unanticipated) part. The

systematic part explains the change from $t-1$ to $t$ of the system variables as a result of:

1. current (anticipated) changes, $\Delta x_{j,t}$, $j \neq i$, in the system variables $\Delta x_{i,t}$, $i = 1, \ldots, p$,
2. previous changes of the system variables, $\Delta x_{i,t-m}$, $i = 1, \ldots, p$, $m = 1, \ldots, k-1$,
3. deviations from previous long-run equilibrium states, $\beta_i'x_{t-1}$, $i = 1, \ldots, r$, and
4. extraordinary events, $\Phi D_t$, such as reforms and interventions.

Thus, this type of empirical model allows for the possibility that agents react on (i) previous equilibrium errors in the long-run steady-state relations, (ii) current and lagged changes in the determinants, and (iii) extraordinary events. In contrast, most theoretical models make prior assumptions on how agents are supposed to react under optimizing behavior given some ceteris paribus assumptions. Based on these assumptions, the model may predict, for example, instantaneous or partial adjustment behavior towards equilibrium states. In this sense the specification of economic models is more precise with respect to the postulated behavior and its economic consequences. On the other hand these consequences are only empirically relevant given that the postulated behavior is (at least approximately) correct. The specification of an empirical model like (11.3) is less precise in terms of an underlying theoretical model, but far more flexible in terms of actual macroeconomic behavior, which is influenced by the wider circumstances that generated the data, i.e. by the ceteris paribus assumptions of the economic model.

In the ideal case the empirical model should allow for all relevant aspects of (possibly competing) theoretical model(s) as testable hypotheses, but also add to the realism of the empirical model by conditioning on relevant ceteris paribus variables. For example, the question of instantaneous or partial adjustment behavior in the domestic money market can be specified as hypotheses on the adjustment coefficients $\alpha_{ij}$ in the empirical model. A 'surprising' outcome of a test might then be associated with ceteris paribus assumptions of the theoretical model, such as "constant real exchange rate"

or "no risk aversion in the capital markets" just to mention a few of the possibly very crucial ones.

We will illustrate these ideas by recalling that the economic motivation for doing an empirical analysis of the Danish money demand data was the discussion in Chapter 2 of inflation and monetary policy. The latter was strongly influenced by the discussion in Romer (1996), Chapter 9, in which the economic model was based on an assumption on equilibrium behavior in the money market. The interest in empirical money demand relations is motivated by the idea that if the latter are known and empirically stable, then central banks would be able to supply exactly the amount of money satisfying agents' demand for money. Based on the empirical results reported in the preceding chapters we were able to conclude that the Danish money market behavior in the post Bretton Woods period has been characterized by short-run equilibrium error correction behavior. By allowing for dynamic adjustment towards long-run steady states it is now possible to address a number of additional questions which are directly or indirectly related to the effectiveness of monetary policy. For example:

• Is money stock adjusting to a long-run money demand or supply relation?
• Has monetary policy been more effective when based on changes in money stock or changes in interest rates?
• Does the effect of expanding money supply on prices differ in the short run, in the medium run, or in the long run?
• Is an empirically stable demand for money relation a prerequisite for monetary policy to be effective for inflation control?
• How strong is the direct (indirect) relationship between a monetary policy instrument and price inflation?

As demonstrated above, the short-run adjustment coefficients are not in general invariant to the choice of $A_0$, which is why the answers to the above questions are crucially related to the identification issue. In some cases the empirical answers can be very sensitive to the choice of identification scheme, in other cases results are more robust.

To illustrate how one can empirically investigate the above questions we will assume that the causal chain model below is an adequately identified representation of the short-run structure of the Danish data. For simplicity of notation we will here assume that $A_1 = 0$, i.e. there is no short-run dynamic adjustment in lagged changes of the process, and that the long-run structure can be described by the simple relations discussed in Chapter 2. Generalization to more realistic specifications, such as the empirically identified relations of Table 10.2, should be straightforward. We consider the following model specification:

$$\begin{pmatrix} 1 & a^0_{12} & a^0_{13} & a^0_{14} & a^0_{15}\\ 0 & 1 & a^0_{23} & a^0_{24} & a^0_{25}\\ 0 & 0 & 1 & a^0_{34} & a^0_{35}\\ 0 & 0 & 0 & 1 & a^0_{45}\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \Delta m^r_t\\ \Delta^2 p_t\\ \Delta R_{m,t}\\ \Delta y^r_t\\ \Delta R_{b,t} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\\ a_{41} & a_{42} & a_{43}\\ a_{51} & a_{52} & a_{53} \end{pmatrix} \begin{pmatrix} (m^r - y^r)_{t-1}\\ (R_b - R_m)_{t-1}\\ (\Delta p - R_m)_{t-1} \end{pmatrix} + \begin{pmatrix} \mu_{0,1}\\ \mu_{0,2}\\ \mu_{0,3}\\ \mu_{0,4}\\ \mu_{0,5} \end{pmatrix} + \begin{pmatrix} v_{m,t}\\ v_{\Delta p,t}\\ v_{R_m,t}\\ v_{y^r,t}\\ v_{R_b,t} \end{pmatrix}$$

To simplify the discussion we will focus solely on the money and inflation equations, albeit acknowledging that a complete answer should be based on the full system analysis:

$$\begin{aligned} \Delta m^r_t &= a^0_{12}\Delta^2 p_t + a^0_{13}\Delta R_{m,t} + a^0_{14}\Delta y^r_t + a^0_{15}\Delta R_{b,t} + a_{11}(m^r - y^r)_{t-1} + a_{12}(R_b - R_m)_{t-1} + a_{13}(\Delta p - R_m)_{t-1} + \varepsilon_{mt}\\ \Delta^2 p_t &= a^0_{23}\Delta R_{m,t} + a^0_{24}\Delta y^r_t + a^0_{25}\Delta R_{b,t} + a_{21}(m^r - y^r)_{t-1} + a_{22}(R_b - R_m)_{t-1} + a_{23}(\Delta p - R_m)_{t-1} + \varepsilon_{pt}\\ &\ \ \vdots \end{aligned} \tag{11.5}$$

Question 1. Is money stock adjusting to money demand or supply?

If $a_{11} < 0$, $a_{12} > 0$, and $a_{13} > 0$, then empirical evidence is in favor of money holdings adjusting to a long-run money demand relation. The latter can be derived from the money stock equation as follows:

$$m^r = y^r + a_{12}/a_{11}(R_b - R_m) + a_{13}/a_{11}(\Delta p - R_m) = y^r - \beta_1(R_b - R_m) - \beta_2(\Delta p - R_m) \tag{11.6}$$

Relation (11.6) corresponds to the aggregate money demand relation discussed in a static equilibrium framework by Romer, with the difference that (11.6) is now embedded in a dynamic adjustment framework. In this case money stock is endogenously determined by agents' demand for money and it is not obvious that money stock can be used as a monetary instrument by the central bank. On the other hand, the case $a_{11} < 0$, $a_{12} = 0$, and $a_{13} = 0$ could be consistent with a situation where the central bank has controlled money stock so that money velocity is stationary around a constant level.

Question 2. Is an empirically stable money demand relation a prerequisite for central banks to be able to control inflation rate?

This hypothesis is based on essentially three arguments:

2.1 There exists an empirically stable demand for money relation.
2.2 Central banks can influence the demanded quantity of money.
2.3 Deviations from this relation cause inflation.

The empirical requirement for a stable money demand relation is that $\{a_{11} < 0,\ a_{12} > 0,\ \text{and } a_{13} > 0\}$, and that the estimates are empirically stable. Thus, given the previous result (i.e. $a_{11} < 0$, $a_{12} > 0$, and $a_{13} > 0$), central banks cannot directly control money stock. Nevertheless, controlling money stock indirectly might still be possible by changing the short-term interest rate. But $a_{12} > 0$ and $(R_b - R_m) \sim I(0)$ implies that a change in the short-term interest rate will transmit through the system in a way that leaves the spread basically unchanged. Therefore, even if a stable demand for money relation has been found, it may, nevertheless, be difficult to control money stock.

Finally, $\{a_{21} > 0\}$ would be consistent with the situation where deviations from the empirically stable money demand relation influence inflation rate in the short-run. Under the assumption that agents can obtain their desired

level of money (i.e. no credit restrictions), it seems likely that there is 'excess money' in the economy because of 'excess supply' rather than 'excess private demand'. Excess money would generally then be the result of the central bank issuing more money than demanded, e.g. by monetization of government debt.

In the next sections we will illustrate the above discussion using the Danish data based on the following identification schemes:

1. imposing restrictions on the short-run parameters when $A_0 = I$,
2. imposing (just-identifying) zero restrictions on the off-diagonal elements of $\Sigma$,
3. imposing general restrictions on $A_0$ without imposing restrictions on $\Sigma$,
4. re-specifying the full system model as a partial model based on weak exogeneity test results.

For the Danish data we have no strong prior hypotheses about the short-run structure and imposing overidentifying restrictions has more the character of a simplification search than of stringent economic identification. The guiding principle is the plausibility of the results, in particular plausible estimates of the equilibrium correction coefficients.

11.4 Overidentifying restrictions on the reduced form

The estimates of the unrestricted VAR model were given in Table 7.1. We note that: (i) the short-run structure is over-parameterized with many insignificant coefficients, (ii) the adjustment coefficients $\alpha_{ij}$ seem to provide the bulk of explanatory power, (iii) the signs and the magnitudes of the $\alpha_{ij}$ coefficients seem reasonable, but the coefficients to the lagged changes of the process are more difficult to interpret, (iv) some of the correlations of the (0,1) standardized residuals are quite large, possibly because of important current effects.

Many of the short-run adjustment coefficients in the unrestricted reduced form VAR discussed in Chapter 4 were statistically insignificant and the first step is to test whether they can be set to zero without violating the

joint test of over-identifying restrictions. First we remove the insignificant lagged variables $\Delta^2 p_{t-1}$ and $\Delta R_{m,t-1}$ from the system (altogether 10 coefficients) based on an F-test. The dummy variables, except for $\Delta D_s83_t$, were only marginally significant and were also left out (altogether 15 coefficients). In this 'trimmed' system 20 additional zero restrictions were accepted based on a LR test of over-identifying restrictions, $\chi^2(20) = 19.7$ and a p-value of 0.54. The original VAR contained 50 autoregressive coefficients + 20 seasonal coefficients, 20 dummy variable coefficients and 15 parameters in the residual covariance matrix. Of these only 16 remained significant, which is a significant reduction. The estimates of the remaining coefficients reported in Table 11.1 are all significant on the 5% level, albeit some only borderline so. We note that the equilibrium correction coefficients $a_{ij}$ are highly significant, demonstrating the major loss of information usually associated with VAR models specified in differences.

All current effects are accounted for by the residual covariance matrix $\hat\Sigma$ at the bottom of Table 11.1. Most of the correlations are relatively small, but the residuals from the money stock equation and the real income equation are correlated with a coefficient 0.33 and the residuals from the short-term interest rate and the bond rate equations with a coefficient of 0.40.

There are at least three explanations for why the residuals from a VAR model are likely to be correlated:

1. Causal chain effects get blurred when the data are temporally aggregated. Assume, for example, that the bond rate changes the day after a change in the short-term interest rate (as a result of a central bank intervention, say) and that the central bank changes the short-term interest rate as a result of a market shock to the long-term bond rate but only after a week. In this case we would be able to identify correctly the causal links based on daily data. However, when we aggregate the data over time the information about these links gets mixed up. The more aggregated the data, the higher the correlations.

2. Expectations are inadequately modelled by the VAR model. Chapter 3 showed that the VAR model is consistent with agents making plans based on the expectation $E_{t-1}(\Delta x_t \mid \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, \beta'x_{t-1})$. Assume now that agents have forward looking expectations to the variable $x_{1t}$, so that $E_{t-1}(\Delta x_{1t}) = \Delta x^e_{1t}$, and that they use this value when making plans: $E_{t-1}(\Delta x_{j,t} \mid \Delta x^e_{1t}, \Delta x_{t-1}, \ldots, \Delta x_{t-k+1}, \beta'x_{t-1})$, $j \neq 1$. If

$\Delta x^e_{1t} = \Delta x_{1t}$, i.e. the expectation is exactly correct, then the reduced form VAR residuals from equation $\Delta x_{1t}$ would be correlated with the residuals from the equations in which $\Delta x^e_{1t}$ were important.

3. Omitted variables effects. In most cases the VAR model contains only a subset of (the most important) variables needed to explain the economic problem. This is because the VAR model is very powerful for the analysis of small systems, but the identification of cointegration relations becomes increasingly difficult when the dimension of the vector process gets larger. Thus, the residuals of a 'small' VAR model are likely to contain omitted variables effects and the residuals will be correlated if the left-out variables are important for several of the variables in the 'small' VAR. There are two important implications of omitted variables: (1) the estimated short-run adjustment coefficients are likely to change when we increase the number of variables in the model (provided that the new variables are not orthogonal to the old variables), (2) the residuals generally become smaller when increasing the dimension of the VAR.

The last point makes it very hard to argue that the residuals could possibly be a measure of autonomous errors (structural shocks). Therefore, a large residual correlation coefficient does not necessarily imply a 'structural' simultaneous effect, nor does it imply incorrectly specified expectations. With this in mind we will now attempt to 'identify' some of the current effects in the model as if they were a result of either point 1 or 2 above.

11.5 The VAR in triangular form

Assume that we want to estimate the VAR model with uncorrelated residuals. The most convenient way of achieving this is by choosing $A_0$ in (11.3) to be an upper triangular matrix $\hat\Omega^{-1/2}$, i.e. by pre-multiplying the VAR model with the inverse of the Choleski decomposition of the covariance matrix $\hat\Omega$. In this case the likelihood function is decomposed into $p$ independent sequentially decomposed likelihoods as demonstrated in Section 2.4:

$$P(\Delta x_{1,t}, \ldots, \Delta x_{p,t} \mid \Delta x_{t-1}, \beta'x_{t-1}; \lambda_s) = \prod_{i=1}^{p} P(\Delta x_{i,t} \mid \Delta x_{i+1,t}, \ldots, \Delta x_{p,t}, \Delta x_{t-1}, \beta'x_{t-1}; \lambda_{s,i})$$
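The triangular transformation can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the estimation routine used for the Danish data: the covariance matrix `omega` is made up, and the function name `unit_upper_triangular_A0` is hypothetical. It also normalizes $A_0$ to have ones on the diagonal, so the transformed residual covariance is diagonal rather than the identity matrix that $\hat\Omega^{-1/2}$ would give. The upper triangular factor is obtained by applying the standard (lower triangular) Cholesky routine to the covariance matrix with the variable ordering reversed.

```python
import numpy as np

def unit_upper_triangular_A0(omega):
    """Upper triangular A0 with unit diagonal such that A0 @ omega @ A0.T is diagonal."""
    # Cholesky of the order-reversed matrix gives a lower triangular factor ...
    lower = np.linalg.cholesky(omega[::-1, ::-1])
    # ... and reversing rows and columns again gives an upper triangular factor
    # satisfying upper @ upper.T == omega
    upper = lower[::-1, ::-1]
    # Divide column j by its diagonal element to get a unit diagonal
    unit_upper = upper / np.diag(upper)
    return np.linalg.inv(unit_upper)

# Illustrative (made-up) residual covariance matrix for p = 3 variables
omega = np.array([[1.0, 0.3, 0.1],
                  [0.3, 2.0, 0.4],
                  [0.1, 0.4, 1.5]])

A0 = unit_upper_triangular_A0(omega)
sigma = A0 @ omega @ A0.T   # covariance of the transformed residuals v_t = A0 eps_t

print(np.round(A0, 3))
print(np.round(sigma, 3))
```

Row $i$ of such an $A_0$ involves only variables $i, i+1, \ldots, p$, which mirrors the sequential conditioning of $\Delta x_{i,t}$ on $\Delta x_{i+1,t}, \ldots, \Delta x_{p,t}$ in the factorization above; a different ordering of the variables would give a different $A_0$.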

[Table 11.1: A parsimonious parametrization of the cointegrated VAR model. Columns: $\Delta m^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta y^r_t$, $\Delta R_{b,t}$. Rows: $\Delta m^r_{t-1}$, $\Delta y^r_{t-1}$, $\Delta R_{b,t-1}$, $ecm1_{t-1}$, $ecm2_{t-1}$, $ecm3_{t-1}$, $\Delta D_s83_t$, followed by the residual covariance matrix $\hat\Omega$ (off-diagonal elements are standardized).]

We find $p$ conditional expectations:

$$E(\Delta x_{i,t} \mid \Delta x_{i+1,t}, \ldots, \Delta x_{p,t}, \Delta x_{t-1}, \beta'x_{t-1}) = \sum_{j=i}^{p-1} a_{0i,j+1}\Delta x_{j+1,t} + A_1\Delta x_{t-1} + a\beta'x_{t-1}, \quad i = 1, \ldots, p,$$

where $a_{0i}$ is a row vector defined by the last $p - i$ elements of the $i$th row of $A_0$. Because the residuals are uncorrelated in this system, the coefficients can be estimated fully efficiently by OLS equation by equation. In this case the OLS estimates are equivalent to FIML estimates.

The triangular system is based on a specific ordering of the $p$ variables and, thus, on an underlying assumption of a causal chain. Since different orderings will, in general, produce different results, and the choice of ordering is subjective, there is an inherent arbitrariness in this type of model. It is, therefore, important to check the sensitivity of the results to various orderings of the causal chains. Here we first investigate the consequences of different orderings by performing an OLS regression for each variable as if it had been at the end of the causal chain. The results are reported in Table 11.2, where the coefficients of each row are given by the conditional expectation $E(\Delta x_{i,t} \mid \Delta x_{j,t}\ (j \neq i), \Delta x_{t-1}, \hat\beta^{c\prime}x_{t-1}, D_t)$, $i = 1, \ldots, 5$. By this exercise we can find out how the reduced form model estimates would have changed for equation $\Delta x_{i,t}$, $i = 1, \ldots, 5$, if we had included current changes of all the other variables in the system as regressors. However, the short-run adjustment parameters are not invariant to transformations by a non-singular matrix $A_0$ as demonstrated in (11.1). Thus, previously insignificant adjustment coefficients $\alpha_{ij}$ in the reduced form model might become significant in the transformed model.

The results show that $\Delta m^r_t$ and $\Delta y^r_t$ as well as $\Delta R_{b,t}$ and $\Delta R_{m,t}$ exhibited significant simultaneous effects and that $\Delta R_{m,t}$ was equilibrium correcting to the third cointegration relation but not $\Delta R_{b,t}$. In Chapter 10 we showed that both $\Delta y^r_t$ and $\Delta R_{b,t}$ are weakly exogenous for the long-run parameters $\beta$. Independently of the chosen ordering no such changes in equilibrium correction effects were found in the equations for $\Delta y^r_t$ and $\Delta R_{b,t}$, demonstrating the robustness of the previous results regarding the lack of long-run feedback for different model specifications. This is an important reason for placing $\Delta y^r_t$ and $\Delta R_{b,t}$ prior to $\Delta R_{m,t}$, $\Delta^2 p_t$, and $\Delta m^r_t$ in the causal chain. Finally, because the primary purpose of this study is to investigate the role of money

5) (1.26 (3.00 (0.14 (−1.79 0.30 0.09 −0.0) (−1.9) −0.38 (−0.26 (−0.83 (−0.71 (−1.0) 0.10 (−1.1) (2.22 (−1.7) −0.3) (−0. ∆2 pt and ∆mr are placed at the end of the t chain.3) 0.t → ∆yr → ∆Rm.3) −0.003 −1.52 (−0.4) (−0.05 (1.0) −0.3) and inﬂation in this system. t Based on the above arguments the estimates of the triangular system in Table 11.02 0.01 (0.01 0.7) 1 −0.0) −1.01 (0.01 (−1.6) −0.7) The upper triangular representation of the current eﬀects in Table 11.5.06 0.1) (−1.0) −0.01 (−2.61 1 0.61 −0.6) −0.11.3) −0.2) −0.66 −0.08 (0.t ∆y r ∆R b. Inﬂation did not exhibit any signiﬁcant eﬀects from current changes in the other variables and.0) (0.t ∆mr t−1 r ∆yt−1 −0.03 (−1.10 (1.0) (−0.9 are in bold face.2: The fully speciﬁed conditional expectations equation by equation ∆m r ∆2 pt ∆R m. To increase readability coeﬃcients with an absolute t-value > 1.28 (0.12 (1.8) −0.22 (2.02 0.01 (−4.2) −0.00 (−0.3) (−2.7) (0.5) 0.7) −0. indicating that the current eﬀects are not .75 (0.0) 0.3) 0.01 (−2.5) 0.31 (3.5) ∆Rb.40 −1.1) −0.01 (−1.6) 0.40 (−1.5) −1.t ∆y r t ∆R b. prior to ∆Rm.47 (3.9) (0.5) (2.6) −0.43 (−12. THE VAR IN TRIANGULAR FORM 233 Table 11.t → ∆2 pt → ∆mr .9) 0.8) 0.1) (2.4) 0.54 (−0.2) 0.00 (0.00 −0.3 corresponds to the Cholesky decomposition of the residual covariance matrix when the system variables are ordered as in (11.t t t r ∆m t 1 −0.01 (−0.3 are based on the following ordering: ∆Rb.1) (−1.4) 1 0. hence was placed prior to ∆mr .3) ∆2 pt ∆R m.t−1 ecm1t−1 ecm2t−1 ecm3t−1 ∆Ds83 t 0. It appears that the transformation by A0 does not change the VAR model estimates very much.7).3) −2.17 (0.04 (4.7) (−1.0) (−3.5) 0.12 −3.21 −0.t .13 (−2.02 (−0.4) −0.1) 0. t t (11.32 (−3.2) −0.30 (−4.3) 1 −0.14 0.3) −0.
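The link between a causal ordering and a triangular transformation of the residuals can be illustrated with a minimal two-variable sketch. The covariance matrix below is hypothetical (it is not taken from the Danish data), and the helper names are illustrative only. Whether the current-effects matrix appears as lower or upper triangular depends on how the variables are stacked; here the first variable is placed first in the chain.

```python
def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    """Transpose a matrix given as nested lists."""
    return [list(col) for col in zip(*x)]


def triangular_a0(omega):
    """Unit lower-triangular A0 such that A0 * omega * A0' is diagonal (2x2 case).

    The first residual is left unchanged; the second residual is replaced by
    its regression residual on the first one.
    """
    slope = omega[1][0] / omega[0][0]
    return [[1.0, 0.0], [-slope, 1.0]]


# Hypothetical residual covariance matrix (an assumption for illustration).
omega = [[4.0, 1.2], [1.2, 1.0]]
# The same covariance matrix with the two variables in reverse order.
omega_reversed = [[1.0, 1.2], [1.2, 4.0]]

a0 = triangular_a0(omega)
a0_reversed = triangular_a0(omega_reversed)

# Transformed residual covariance: the off-diagonal elements vanish.
transformed = matmul(matmul(a0, omega), transpose(a0))

print(a0[1][0], a0_reversed[1][0])  # current-effect coefficient under each ordering
print(transformed)
```

The two orderings give different current-effect coefficients from the same covariance matrix, which is the arbitrariness that the sensitivity check in Table 11.2 is meant to assess.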

Table 11.3: The short-run adjustment structure in triangular form. The table has one column for each of the equations $\Delta m^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta y^r_t$ and $\Delta R_{b,t}$, and is divided into the blocks $A_0'$ (current effects of $\Delta m^r_t$, $\Delta^2 p_t$, $\Delta R_{m,t}$, $\Delta y^r_t$, $\Delta R_{b,t}$), $A_1'$ (effects of $\Delta m^r_{t-1}$, $\Delta y^r_{t-1}$, $\Delta R_{b,t-1}$), $a'$ (effects of $ecm1_{t-1}$, $ecm2_{t-1}$, $ecm3_{t-1}$) and $\phi$ (effect of $\Delta Ds83_t$), with t-ratios in parentheses. The residual correlation matrix $\hat{\Sigma}$ is the identity matrix, i.e. the residuals are uncorrelated by construction.

11.6 Imposing general restrictions on $A_0$

Many of the estimates of the triangular system reported in Table 11.3 were insignificant and it would seem natural to restrict them to zero. Though statistical significance is not necessarily the same as economic significance, the former is important for two reasons: (i) our model should preferably describe empirically relevant aspects of the economic problem for the chosen period, (ii) leaving insignificant coefficients in the model may introduce near singularity in the model if generic identification is violated when restricting the insignificant coefficients to zero.

We will next demonstrate how to impose general restrictions on the matrices $A_0$, $A_1$, $a$, and $\Phi$ while leaving $\Omega$ unrestricted. In the following we will demonstrate two different model structures obtained by imposing as many zero restrictions as possible on ($A_0$, $A_1$, $a$, and $\Phi$) while at the same time keeping the residual correlations of $\Sigma$ as small as possible.

Short-run structure I reported in Table 11.4 allows simultaneous effects of $\Delta m^r_t$ and $\Delta y^r_t$ and of $\Delta R_{m,t}$ and $\Delta R_{b,t}$ while imposing overidentifying restrictions on the remaining model parameters. The $\Delta^2 p_t$ equation is in the reduced form and, therefore, identified by construction. The remaining equations contain simultaneous effects and therefore need to be checked for generic and empirical identification.

Table 11.4: An overidentified simultaneous short-run adjustment structure I. The layout is the same as in Table 11.3, with the blocks $A_0'$, $A_1'$, $a'$ and $\phi'$ and t-ratios in parentheses. The lower panel reports $\hat{\Sigma}$ (off-diagonal elements are standardized); the residual standard errors on its diagonal are 0.0236, 0.0147, 0.0012, 0.0210 and 0.0014, respectively.

The zero restriction of $ecm2_{t-1}$ in the equation for $\Delta y^r_t$ and the corresponding nonzero coefficient in the equation for $\Delta m^r_t$ are identifying, both generically and empirically (because of the highly significant coefficient of $ecm2_{t-1}$ in the $\Delta m^r_t$ equation). Similarly, the significant coefficients of $\Delta y^r_{t-1}$ and $\Delta R_{b,t-1}$ in the equation for $\Delta y^r_t$ and the corresponding zero coefficients in the equation for $\Delta m^r_t$ are identifying. The calculated rank indices (??) reported in Table 11.5 confirm the visual inspection: $r_{1.4} = 2$, implying that eq. 1 is identified w.r.t. eq. 4 with one overidentifying restriction, and $r_{4.1} = 3$, implying that eq. 4 is identified w.r.t. eq. 1 with two overidentifying restrictions.

The significant coefficients of $\Delta m^r_{t-1}$ and $Ds83_t$ in the equation for $\Delta R_{b,t}$ and the corresponding zero coefficients in the equation for $\Delta R_{m,t}$ are identifying the latter w.r.t. the former. This is an example where a strongly significant (and economically interpretable) dummy variable can contain valuable identifying information. Similarly, the significant coefficients of $ecm2_{t-1}$ and $ecm3_{t-1}$ in the $\Delta R_{m,t}$ equation and the corresponding zero coefficient in the $\Delta R_{b,t}$ equation are identifying, whereas the coefficients of $\Delta y^r_{t-1}$ and $\Delta R_{b,t-1}$ are not identifying, since they enter similarly in both equations. The calculated rank indices (??) reported in Table 11.5 confirm the visual inspection: $r_{3.5} = 2$, implying that eq. 3 is identified w.r.t. eq. 5 with one overidentifying restriction. The same is true for eq. 5 w.r.t. eq. 3.

Altogether 17 overidentifying restrictions have been imposed on the short-run structure I. The LR test, distributed as $\chi^2(17)$, produced a test statistic of 16.0 and the restrictions were accepted with a p-value of 0.52. The residual correlations reported at the bottom of Table 11.4 are quite small and insignificant, though not zero as in the triangular representation in Table 11.3. However, the estimated simultaneous effects are generally not very significant, suggesting that the identifying information is weak in this model. This is particularly so for the coefficients of the bond rate and the deposit rate. Therefore, short-run structure II, reported in Table 11.6, sets the insignificant current effect of $\Delta R_{b,t}$ in the equation for $\Delta R_{m,t}$ to zero and imposes additionally a zero restriction on $\Delta R_{b,t-1}$ in the equation for $\Delta R_{b,t}$. As a result the residual correlations changed to some extent, but not significantly so, as can be inferred from the test of the 20 overidentifying restrictions which were accepted based on a test value of 22.5 (p-value 0.34).

Based on the estimated results we notice that:

1. real money stock is exclusively equilibrium error correcting to long-run money demand,

2. the effect of an increase in real aggregate demand on money stock might be smaller in the short run (0.7) than in the long run (1.0),

3. inflation is exclusively equilibrium error correcting to the homogeneous inflation-interest rate relation,

4. the short-term interest rate is essentially equilibrium error correcting to the long-term bond rate and exhibits a (small) negative effect from excess liquidity,

5. real aggregate demand shows a negative short-run effect from increases in the long-term bond rate,

Table 11.5: Checking the rank conditions of short-run structure I. For each equation $i$ the table lists the rank indices $r_{i.j}$, $r_{i.jk}$, $r_{i.jkm}$ and $r_{i.jkmn}$ against every combination of the other equations, among them $r_{1.4} = 2$, $r_{4.1} = 3$ and $r_{3.5} = 2$.

Table 11.6: An overidentified simultaneous short-run adjustment structure II. The layout is the same as in Table 11.4, with the blocks $A_0'$, $A_1'$, $a'$ and $\phi'$ and t-ratios in parentheses. The lower panel reports $\hat{\Sigma}$ (off-diagonal elements are standardized); the residual standard errors on its diagonal are 0.0233, 0.0147, 0.0013, 0.0209 and 0.0014, respectively.

6. the bond rate exhibits a small negative short-run effect from an increase in liquidity and a positive short-run effect from an increase in aggregate demand,

7. neither real aggregate demand nor the long-term bond rate exhibits any long-run equilibrium correction effects.

The last point can probably be explained by the present information set not being sufficiently large to give a satisfactory explanation of all five variables. For example, the long-term bond rate would probably be equilibrium error correcting to the German bond rate of similar maturity and real aggregate demand to real exchange rates, terms of trade and foreign aggregate demand if these variables were included in the analysis. Chapter 16 will discuss a procedure for extending the model to include such effects.

11.7 A partial system

In this section we re-estimate the model as a partial model for real money stock, inflation and the short-term interest rate conditional on the bond rate and the real aggregate demand, which were found to be weakly exogenous for the long-run parameters $\beta$. Note, however, that the latter does not imply weak exogeneity for the short-run adjustment parameters ($\lambda_S$). Therefore, we need to estimate the full system of equations to find out whether weak exogeneity holds either for the long-run parameters $\beta$, or the short-run adjustment parameters $\lambda_S$. If the parameters of interest are $\lambda_S$ then neither $R_{b,t}$ nor $y^r_t$ is weakly exogenous for these parameters, as demonstrated by the estimation results in Tables 10.3 and 10.4.

The question is why one would like to continue the analysis in a partial model considering that we already have estimated the full system. As already discussed in Chapter 9, establishing long-run weak exogeneity for a variable is often used as a justification for performing the model analysis conditional on such a variable, because the motivation for the empirical study in most cases is an interest in the dynamic feed-back effects w.r.t. the long-run structure. However, in some cases there are clear advantages with a partial system analysis. Assume, for example, that a weakly exogenous variable (w.r.t. $\beta$) has been subject to many interventions and that current changes of this variable have significantly affected the other variables in the system. Conditioning on such a variable is likely to reduce the need for intervention dummies in the model and this might improve the stability of the parameter estimates in the conditional model. Furthermore, the residual variance in the conditional model is often significantly reduced, thus improving the precision of the statistical inference as compared to the full model.

In some cases the exogeneity status of a variable is so obvious that testing might not be needed and one can directly estimate a partial model. For example, the economic activity in the USA is likely to influence the Danish economy, but whatever happens in the Danish economy is not likely to influence the US economy. In more doubtful cases the procedure suggested in Harbo, Johansen, and Rahbek (1998) can be used as a safeguard against conditioning on variables which are not weakly exogenous for the long-run parameters.

Table 11.7 reports the estimates of the partial model for $\Delta m^r_t$, $\Delta^2 p_t$, and $\Delta R_{m,t}$ conditional on $\Delta R_{b,t}$ and $\Delta y^r_t$. The estimated results are similar to the estimates of the full system reported in Table 11.4 and 11.5. Thus, the empirical conclusions would be fairly robust whether based on the full or the partial system. As mentioned above the need for dummy variables may change in the partial model. This was the case here: all dummy variables became insignificant after conditioning on $\Delta R_{b,t}$ and $\Delta y^r_t$. The parameter estimates (particularly the current effects) have become slightly more significant in the present model. Finally, the residual standard error of the deposit rate equation decreased to some extent compared to the full system estimate:

Residual standard errors $\hat{\sigma}_\varepsilon$ for the equations $\Delta m^r$, $\Delta^2 p$, $\Delta R_m$:
Full system: 0.0233, 0.0147, 0.00130
Partial system: 0.0232, 0.0147, 0.00117

11.8 Economic identification

We will now make an attempt to answer the questions raised in Section 11.2 based on the above empirical results. First we notice that the basic results on the short-run adjustment structure remained essentially unaltered in all the different representations reported in Table 11.1-11.6. In particular, the ECM terms in the parsimonious reduced form model reported in Table 11.1 hardly changed at all when allowing for simultaneous effects in the model.

Table 11.7: The estimates of a partial system for money, the inflation rate and the deposit rate. The table has one column for each of the equations $\Delta m^r_t$, $\Delta^2 p_t$ and $\Delta R_{m,t}$, with the blocks $A_0'$ (current effects, including those of the conditioning variables $\Delta y^r_t$ and $\Delta R_{b,t}$), $A_1'$ (effects of $\Delta m^r_{t-1}$, $\Delta y^r_{t-1}$, $\Delta R_{b,t-1}$) and $a'$ (effects of $ecm1_{t-1}$, $ecm2_{t-1}$, $ecm3_{t-1}$), and t-ratios in parentheses. The lower panel reports $\hat{\Sigma}$; the residual standard errors on its diagonal are 0.0232, 0.0147 and 0.0011, respectively.

The simultaneous effects were essentially found between changes in real money stock and real income and between changes in the short and the long-term interest rate. Including the change in current real income in the real money stock equation seemed to improve the model specification. Including current interest rate changes improved the dynamic specification somewhat but the static long-run solutions remained unaltered. Altogether, current and lagged changes of the process do not seem very crucial for the interpretation of the results and we can focus on the equilibrium error correction results when answering the questions of Section 11.2.

Question 1. Is money stock adjusting to money demand or supply? The result that real money holdings have been exclusively adjusting to a long-run money demand relation was very robust in all representations. This seems to indicate either that the central bank has willingly supplied the demanded money, or that agents have been able to satisfy their desired level of money holdings independently of the central bank. The access to credit outside Denmark as a result of deregulation of capital movements might suggest the latter case.

Question 2. Given the empirically stable money demand relation found in the Danish data, can inflation be effectively controlled by the central bank? If the level of (the broad measure of) money stock is endogenously determined by agents' demand for money, then the central bank can indirectly influence the level of the demanded quantity by influencing its determinants. According to the estimated money-demand relation this can be achieved by changing the short-term interest rate. But a change in the short-term interest rate is likely to change money demand only to the extent that it changes the spread $(R_m - R_b)$. Although we found that $(R_m - R_b) \sim I(1)$ in the money demand relation, we also found that the money demand relation was a linear combination between two stationary relations, $(m^r - y^r + b_1 R_b + b_2 Ds83_t)$ and $(R_m - 0.4R_b)$. Since the 'modified spread' was found to be stationary, an increase in the central bank interest rate is likely to transmit through the system in a way that leaves this component basically unchanged. Therefore, it seems likely that the long-term movements in the demand for money are primarily influenced by the level of the long-term bond rate, i.e. the cost of holding money, which was found to be weakly exogenous and probably not controllable by the central bank. Thus, the present empirical evidence seems to suggest that the central bank would not have been able to effectively control M3, at least not in the short run.

Question 3. Does excess money, defined here as the deviation from the long-run money demand relation, cause inflation? If this is the case, ecm2 should have a positive coefficient in the inflation equation. The triangular form in Table 10.2 reports a positive, but very small (0.04) coefficient of ecm2. However, this insignificant effect is canceled by a negative coefficient of -0.04 to lagged changes in money stock. No such evidence is found in any of the short-run structures. Hence, there is no evidence that inflation has been caused by monetary expansion in this period.

Question 4. What is the effect of expanding money supply on prices in the long run? Is money causing prices or prices causing money? This question will be addressed in the common trends model in the next chapter.

Chapter 12 Identification of Common Trends

Sections 5.2 and 6.4 demonstrated the duality between the AR-representation describing the adjustment dynamics, $\alpha$, towards the long-run relations $\beta' x_t$ and the MA-representation describing the common driving trends, $\alpha_\perp' \sum \varepsilon_i$, and their weights, $\beta_\perp$. In this chapter we will discuss how to impose restrictions on the underlying common trends without attaching a structural meaning to the estimated shocks. This will be done in the next chapter, where we will discuss restrictions on the VAR model that aim at identifying $p - r$ permanent and $r$ transitory shocks. Generally, the identification problem for the common trends case is similar to the one of the long-run relations in the sense that one can choose a normalization and $(p - r - 1)$ restrictions without changing the value of the likelihood function, whereas additional restrictions are overidentifying and, hence, testable.

The organization of this chapter is as follows: Section 12.1 discusses the common trends decomposition based on the VAR model, Section 12.2 focuses on some special cases, Section 12.3 illustrates the ideas using the Danish data, and Section 12.4 discusses whether the results are economically identified using the scenario analysis of Chapter 2.

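12.1 The common trends representation

Chapters 10 and 11 demonstrated that the VAR residuals are not invariant to linear transformations. This was shown by pre-multiplying the VAR model:

$$\Delta x_t = \Gamma_1 \Delta x_{t-1} + \alpha\beta' x_{t-1} + \mu_0 + \alpha\beta_1 t + \varepsilon_t, \quad \varepsilon_t \sim N_p(0, \Omega) \qquad (12.1)$$

with a $(p \times p)$ non-singular matrix $A_0$:

$$A_0 \Delta x_t = A_1 \Delta x_{t-1} + a\beta' x_{t-1} + \mu_{0,a} + a\beta_1 t + v_t, \quad v_t \sim N_p(0, A_0 \Omega A_0') \qquad (12.2)$$

where $A_1 = A_0\Gamma_1$, $a = A_0\alpha$, $\mu_{0,a} = A_0\mu_0$, and $v_t = A_0\varepsilon_t$. The moving average representation for a VAR model with linear, but no quadratic, trends is given by:

$$x_t = C\sum \varepsilon_i + tC\mu_0 + C^*(L)(\varepsilon_t + \mu_0 + \alpha\beta_1 t), \qquad (12.3)$$

where

$$C = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp'. \qquad (12.4)$$

It is useful to express the C matrix as a product of two matrices (similarly as $\Pi = \alpha\beta'$):

$$C = \tilde{\beta}_\perp\alpha_\perp', \qquad (12.5)$$

where $\tilde{\beta}_\perp = \beta_\perp(\alpha_\perp'\Gamma\beta_\perp)^{-1}$, or, alternatively,

$$C = \beta_\perp\tilde{\alpha}_\perp',$$

where $\tilde{\alpha}_\perp' = (\alpha_\perp'\Gamma\beta_\perp)^{-1}\alpha_\perp'$. CATS uses the former formulation. Note that the matrices $\beta_\perp$ and $\alpha_\perp$ can be directly calculated for given estimates of $\alpha$, $\beta$, and $\Gamma$ based on (12.4). Thus, the common stochastic trends and their weights can be calculated either based on unrestricted $\hat{\alpha}$, $\hat{\beta}$ or, alternatively, on restricted estimates $\hat{\alpha}^c$, $\hat{\beta}^c$. When choosing the moving average option in CATS the program uses the latest estimates of $\alpha$ and $\beta$ as a basis for the calculations.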
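As a small numerical check of (12.4), the following sketch computes the C matrix for a hypothetical system with $p = 2$ and $r = 1$, so that $\alpha_\perp$ and $\beta_\perp$ are single vectors and the inverse is a scalar. The parameter values, the function names and the choice $\Gamma = I$ are assumptions made for the illustration and are not taken from the Danish data.

```python
def perp2(v):
    """Orthogonal complement of a nonzero 2-vector (case p = 2, r = 1)."""
    return [-v[1], v[0]]


def c_matrix(alpha, beta, gamma):
    """C = beta_perp (alpha_perp' Gamma beta_perp)^{-1} alpha_perp' for p = 2, r = 1."""
    a_perp = perp2(alpha)
    b_perp = perp2(beta)
    # Gamma * beta_perp, then the scalar alpha_perp' Gamma beta_perp.
    gamma_b = [sum(g * b for g, b in zip(row, b_perp)) for row in gamma]
    scale = sum(a * gb for a, gb in zip(a_perp, gamma_b))
    return [[b_perp[i] * a_perp[j] / scale for j in range(2)] for i in range(2)]


# Hypothetical adjustment and cointegration vectors (assumed values).
alpha = [-0.5, 0.25]
beta = [1.0, -1.0]
gamma = [[1.0, 0.0], [0.0, 1.0]]  # assumed: Gamma equal to the identity matrix

c = c_matrix(alpha, beta, gamma)

# Two defining properties: beta' C = 0 and C alpha = 0.
beta_c = [sum(beta[i] * c[i][j] for i in range(2)) for j in range(2)]
c_alpha = [sum(c[i][j] * alpha[j] for j in range(2)) for i in range(2)]

print(c)
print(beta_c, c_alpha)
```

The two checks express that the cointegration relation removes the common trend ($\beta' C = 0$) and that shocks in the direction of $\alpha$ have no long-run impact ($C\alpha = 0$).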

The decomposition $C = \tilde{\beta}_\perp\alpha_\perp'$ resembles the decomposition $\Pi = \alpha\beta'$ but with the important difference that $\tilde{\beta}_\perp$ is a function not only of $\beta_\perp$, but also of $\alpha_\perp$. It appears from (12.5) that the $p \times (p - r)$ matrix $\tilde{\beta}_\perp$ (alternatively $\tilde{\beta}^c_\perp$) can be given an interpretation as the coefficients of the $p - r$ common stochastic trends $\alpha_\perp'\sum\varepsilon_i$ (alternatively $\alpha^{c\prime}_\perp\sum\varepsilon_i$) of the variables $x_t$. In the last section of this chapter we will relate this decomposition to the more intuitive discussion of common stochastic trends of Chapter 2.

Similarly as for $\alpha$ and $\beta$ one can transform $\tilde{\beta}_\perp$ and $\alpha_\perp$ by a nonsingular $(p - r) \times (p - r)$ matrix $Q$:

$$C = \tilde{\beta}_\perp Q Q^{-1}\alpha_\perp' = \tilde{\beta}^c_\perp\alpha^{c\prime}_\perp \qquad (12.6)$$

without changing the value of the likelihood function. Thus, the $Q$ transformation leads to just-identified common trends for which no testing is involved. Additional restrictions on $\tilde{\beta}_\perp$ and $\alpha_\perp$ would constrain the likelihood function and, hence, be testable. The next section will discuss a few special cases of testable restrictions on $\alpha_\perp$ which can be expressed as testable restrictions on $\alpha$, so that new test procedures need not be derived.

12.2 Some special cases

We will now discuss a few special cases of restrictions on $\beta$ and $\alpha$ which imply some interesting restrictions on $\tilde{\beta}_\perp$ and $\alpha_\perp$.

Case 1: Long-run homogeneity:

$$\beta = \begin{bmatrix} a & b & c \\ -\omega_1 a & -\omega_2 b & -\omega_3 c \\ -(1-\omega_1)a & -(1-\omega_2)b & -(1-\omega_3)c \\ * & * & * \\ * & * & * \end{bmatrix} \rightarrow \tilde{\beta}_\perp = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$

Case 2: A stationary variable in $\beta$:

$$\beta = \begin{bmatrix} 0 & * & * \\ 1 & * & * \\ 0 & * & * \\ 0 & * & * \\ 0 & * & * \end{bmatrix} \rightarrow \tilde{\beta}_\perp = \begin{bmatrix} * & * \\ 0 & 0 \\ * & * \\ * & * \\ * & * \end{bmatrix}$$

i.e. a zero row in the C-matrix.

Case 3: A column of $\alpha$ is proportional to a unit vector:

$$\alpha = \begin{bmatrix} * & * & * \\ 0 & * & * \\ 0 & * & * \\ 0 & * & * \\ 0 & * & * \end{bmatrix} \rightarrow \alpha_\perp = \begin{bmatrix} 0 & 0 \\ * & * \\ * & * \\ * & * \\ * & * \end{bmatrix}$$

i.e. a zero column in the C-matrix.

Case 4: A row in $\alpha$ is equal to zero:

$$\alpha = \begin{bmatrix} * & * & * \\ * & * & * \\ * & * & * \\ * & * & * \\ 0 & 0 & 0 \end{bmatrix} \rightarrow \alpha_\perp = \begin{bmatrix} * & 0 \\ * & 0 \\ * & 0 \\ * & 0 \\ * & 1 \end{bmatrix}$$

In this case the last variable is weakly exogenous and corresponds to a common driving trend. In addition, if all the rows of $\Gamma_i$, $i = 1, \ldots, k - 1$, corresponding to the weakly exogenous variable are zero, then the latter will appear as a unit row vector in the C-matrix.

12.3 Illustrations

For the Danish data we found $r = 3$ and, hence, $p - r = 2$ common trends. We will report three different cases of the estimated common trends representation:

1. based on the unrestricted $\alpha$ and $\beta$ from the VAR model,

2. based on a restricted $\alpha$ (the zero row restrictions of real income and bond rate), but unrestricted $\beta$,

3. based on $\beta$ restricted to the structure $H_{S.4}$ of Table 10.3 and $\alpha$ restricted as in 2.

Table 12.1: The unrestricted VAR representations. The upper part reports the common trends representation, i.e. $\hat{\alpha}_{\perp,1}'$ and $\hat{\alpha}_{\perp,2}'$ as coefficients to the residuals $\hat{\varepsilon}_{m^r}$, $\hat{\varepsilon}_{\Delta p}$, $\hat{\varepsilon}_{R_m}$, $\hat{\varepsilon}_{y^r}$, $\hat{\varepsilon}_{R_b}$, with the residual standard deviations (0.024), (0.015), (0.001), (0.018), (0.002) in brackets, together with the loadings $\tilde{\beta}_{\perp,1}$ and $\tilde{\beta}_{\perp,2}$ for $m^r$, $\Delta p$, $R_m$, $y^r$, $R_b$. The lower part reports the C matrix with t-ratios in brackets.

The estimates $\tilde{\beta}_{\perp,1}$, $\tilde{\beta}_{\perp,2}$, $\hat{\alpha}_{\perp,1}$ and $\hat{\alpha}_{\perp,2}$ in Table 12.1 are based on a first tentative normalization by normalizing on the largest coefficient. We note that the largest coefficient in $\hat{\alpha}_{\perp,2}$ corresponds to the weakly exogenous bond rate, whereas in $\hat{\alpha}_{\perp,1}$ it corresponds to the short-term interest rate and not to the weakly exogenous real income. Thus, by normalizing on $\hat{\varepsilon}_{R_m}$ we have in fact normalized on an insignificant coefficient. This is a reminder that a large coefficient does not necessarily imply a statistically significant coefficient. Unless the residuals are standardized the magnitude of a coefficient is not very informative. To improve interpretability we have, therefore, reported the residual standard deviations in the brackets underneath the residuals in the upper part of Table 12.1. It appears that even if the coefficient of $\hat{\varepsilon}_{R_m}$ is 5 times larger than the coefficient of $\hat{\varepsilon}_{y^r}$, the residual standard error of the latter is 18 times larger than the one of the former.

Figure 12.1 (to be included) shows the graphs of the cumulated VAR residuals equation by equation and Figure 12.2 (to be included) the two unrestricted common trends defined by:

$$\sum_{i=1}^{t} u_{1,i} = \hat{\alpha}_{\perp,1}'\sum_{i=1}^{t}\hat{\varepsilon}_i, \qquad \sum_{i=1}^{t} u_{2,i} = \hat{\alpha}_{\perp,2}'\sum_{i=1}^{t}\hat{\varepsilon}_i,$$

where $\hat{\alpha}_{\perp,1}$ and $\hat{\alpha}_{\perp,2}$ are given by the estimates in Table 12.1 and $\hat{\varepsilon}_i' = [\varepsilon_{m^r}, \varepsilon_{\Delta p}, \varepsilon_{R_m}, \varepsilon_{y^r}, \varepsilon_{R_b}]_i$, $i$ = 1974:3, ..., 1993:3.

As already mentioned the common trends estimates of Table 12.1 are not unique in the sense that we can impose $(p - r - 1) = 1$ identifying restriction on each vector without changing the likelihood function. Assume, for example, that we would like to impose a zero restriction on the bond rate residual and a unitary coefficient on the real income residual in $\alpha_{\perp,1}$, and a zero restriction on the real income residual and a unitary coefficient on the bond rate residual in $\alpha_{\perp,2}$. This can be achieved by premultiplying $\hat{\alpha}_\perp'$ by the transformation matrix $Q^{-1}$, resulting in $\hat{\alpha}^{c\prime}_\perp = Q^{-1}\hat{\alpha}_\perp'$. By post-multiplying $\tilde{\beta}_\perp$ by $Q$ we get the corresponding loadings matrix $\tilde{\beta}^c_\perp = \tilde{\beta}_\perp Q$. Note that this can also be achieved by simple row manipulations of the unrestricted vectors, for example $\hat{\alpha}^{c\prime}_{\perp,1} = (\hat{\alpha}_{\perp,1}' + 0.54\,\hat{\alpha}_{\perp,2}')/(-0.33)$.
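The normalization by $Q$ can be sketched numerically. The matrices below are hypothetical stand-ins (they are not the estimates of Table 12.1), and the helper names are illustrative. The sketch assumes the residual ordering $(m^r, \Delta p, R_m, y^r, R_b)$ and takes $Q$ to be the $2 \times 2$ block of $\hat{\alpha}_\perp'$ belonging to the real income and bond rate residuals, which is one way of obtaining the unit and zero coefficients described above.

```python
def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def inv2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


# Hypothetical alpha_perp' (2 x 5); columns ordered as (m^r, dp, R_m, y^r, R_b).
alpha_perp_t = [[0.1, -0.2, 1.0, 0.2, 0.4],
                [0.0, 0.1, -0.3, 0.1, 1.0]]
# Hypothetical loadings beta_tilde_perp (5 x 2).
beta_tilde = [[0.8, -15.0], [-0.1, 0.4], [0.0, 0.5], [1.2, -3.0], [0.1, 1.4]]

# Q is the block of y^r and R_b coefficients; Q^{-1} alpha_perp' then has a
# unit matrix in those two columns.
q = [[row[3], row[4]] for row in alpha_perp_t]
alpha_c_t = matmul(inv2(q), alpha_perp_t)
beta_c = matmul(beta_tilde, q)

# The product, i.e. the C matrix, is unchanged by the transformation.
c_before = matmul(beta_tilde, alpha_perp_t)
c_after = matmul(beta_c, alpha_c_t)

print(alpha_c_t)
print(max(abs(c_before[i][j] - c_after[i][j]) for i in range(5) for j in range(5)))
```

Because $\tilde{\beta}_\perp Q Q^{-1}\alpha_\perp'$ equals $\tilde{\beta}_\perp\alpha_\perp'$, the normalization changes the labelling of the two trends but not the long-run impact matrix.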

Table 12.2: Three representations of the moving average form. The header lists the residuals $\hat{\varepsilon}_{m^r}$, $\hat{\varepsilon}_{\Delta p}$, $\hat{\varepsilon}_{R_m}$, $\hat{\varepsilon}_{y^r}$, $\hat{\varepsilon}_{R_b}$ with their standard deviations (0.024), (0.015), (0.001), (0.018), (0.002). The restricted $\hat{\alpha}_{\perp,1}$ and $\hat{\alpha}_{\perp,2}$ are unit vectors selecting $\hat{\varepsilon}_{y^r}$ and $\hat{\varepsilon}_{R_b}$, respectively. For Case 2 and Case 3 the table reports the two nonzero columns of the C matrix, $\hat{\varepsilon}_{y^r}$ ($\tilde{\beta}_{\perp,1}$) and $\hat{\varepsilon}_{R_b}$ ($\tilde{\beta}_{\perp,2}$), for $m^r$, $\Delta p$, $R_m$, $y^r$, $R_b$, with t-ratios in brackets.

In the lower part of Table 12.1 we report the C matrix for the unrestricted VAR. The columns of $\hat{\varepsilon}_{m^r}$, $\hat{\varepsilon}_{\Delta p}$ and $\hat{\varepsilon}_{R_m}$ contain no significant coefficients, which is consistent with the columns of $\alpha$ being proportional to unit vectors (as indeed seemed to be the case for $H_{S.4}$ in Chapter 10). The results suggest that there is no significant long-run impact of disturbances to real money stock, the short-term interest rate and the inflation rate on any of the variables of this system. Note that the first two are probably closely related to unanticipated shocks to the monetary policy instruments and the third one is the goal variable of monetary policy.

In representation 2 reported in Table 12.2 we have imposed the two zero row restrictions on $\alpha$ corresponding to the weak exogeneity of the real income and the bond rate. The corresponding two common trends, $\sum_{i=1}^{t}\hat{\varepsilon}_{y^r,i}$ and $\sum_{i=1}^{t}\hat{\varepsilon}_{R_b,i}$, are now overidentified with three overidentifying restrictions each (which is the equivalent of the six degrees of freedom in the joint exogeneity test in Chapter 9). Note that the two exogeneity restrictions completely determine the common trends in our empirical model (recalling that $p - r = 2$). Consequently, two rows of the $\Pi$ matrix are zero in this case and the remaining three rows can be represented as:

$$\Pi x_{t-1} = \begin{bmatrix} \alpha_{11} & 0 & 0 \\ 0 & \alpha_{22} & 0 \\ 0 & 0 & \alpha_{33} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} \beta_1' x_{t-1} \\ \beta_2' x_{t-1} \\ \beta_3' x_{t-1} \end{bmatrix}.$$

Thus, when there are exactly $p - r$ weakly exogenous variables, the $\alpha$ vectors can be represented as $r$ unit vectors resulting in $r$ zero columns in the C matrix. Since the first three columns of the C matrix are equal to zero, Table 12.2 only reports the last two columns. In this case they are equivalent to $\tilde{\beta}_{\perp,1}$ and $\tilde{\beta}_{\perp,2}$.

While the long-run weak exogeneity of the bond rate and the real income implies that their cumulated residuals can be considered common stochastic trends, it does not necessarily imply that these two variables are the common trends. For this to be the case we need the further condition on the $\Gamma_i$ matrices that the rows associated with the weakly exogenous variables have to be zero. In such a case the equation for the variable $x_{j,t}$ becomes $\Delta x_{j,t} = \varepsilon_{j,t}$ so that $x_{j,t} = \sum_{i=1}^{t}\varepsilon_{j,i}$, i.e. the common stochastic trend coincides with the variable itself. Table 12.2 shows that the bond rate is significantly affected by both of the stochastic trends. The same is true for the real aggregate income variable, though less significantly so. The reason for this can be found in Table 11.1 which shows that both variables exhibit significant effects from lagged changes of the process.

In addition to the weak exogeneity restrictions on $\alpha$, representation 3 has imposed the structure $H_{S.4}$ on $\beta$. A comparison of the columns for $\hat{\varepsilon}_{y^r}$ and $\hat{\varepsilon}_{R_b}$, i.e. of $\tilde{\beta}_{\perp,1}$ and $\tilde{\beta}_{\perp,2}$, in the three cases reveals that the unrestricted estimates have not changed much as a result of imposing the six $\alpha$ restrictions and the four $\beta$ restrictions. This is because these restrictions were accepted with very high p-values. When this is not the case the estimates of the C matrix may differ considerably depending on whether they are based on unrestricted or restricted $\alpha$ and $\beta$.

Chapter 5 demonstrated that cointegration and common trends are two sides of the same coin. We will now exploit this feature to learn more about the generating mechanisms underlying the cointegration structure $H_{S.4}$.
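The statement that unit-vector columns of $\alpha$ give zero columns in the C matrix can be checked on a small hypothetical system. The sketch below assumes $p = 3$, $r = 2$ and $\Gamma = I$ (no short-run dynamics), so that there is a single common trend and the orthogonal complements can be obtained as cross products. All numbers and names are illustrative assumptions.

```python
def cross(u, v):
    """Cross product of two 3-vectors, orthogonal to both arguments."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


# Hypothetical alpha with unit-vector columns and a zero third row, i.e. the
# third variable is weakly exogenous (columns are listed as vectors).
alpha_cols = [[-0.3, 0.0, 0.0], [0.0, -0.2, 0.0]]
# Hypothetical cointegration vectors.
beta_cols = [[1.0, 0.0, -1.0], [0.0, 1.0, -0.5]]

a_perp = cross(alpha_cols[0], alpha_cols[1])
b_perp = cross(beta_cols[0], beta_cols[1])

# With Gamma = I (an assumption), alpha_perp' Gamma beta_perp is a scalar.
scale = sum(a * b for a, b in zip(a_perp, b_perp))
c = [[b_perp[i] * a_perp[j] / scale for j in range(3)] for i in range(3)]

for row in c:
    print(row)
```

In this example the first two columns of C are zero and the row of the weakly exogenous variable is a unit vector, so its cumulated residual is the common trend and coincides with the variable itself.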

We reproduce the common trends estimates of Case 3 in Table 12.2:

$$\begin{bmatrix} m^r \\ \Delta p \\ R_m \\ y^r \\ R_b \end{bmatrix} = \begin{bmatrix} 0.82 & -16.0 \\ -0.10 & 0.44 \\ 0.02 & 0.55 \\ 1.27 & -3.49 \\ 0.05 & 1.45 \end{bmatrix}\begin{bmatrix} \sum\hat{\varepsilon}_{y^r,i} \\ \sum\hat{\varepsilon}_{R_b,i} \end{bmatrix} + \text{stationary and deterministic components}$$

where $\hat{\varepsilon}_{y^r,t}$ is a measure of $u_{1t}$ and $\hat{\varepsilon}_{R_b,t}$ a measure of $u_{2t}$. In Chapter 9 we found that the liquidity ratio, the interest rate spread and the short and long-term real interest rates were nonstationary. Using the above results we can now express these as linear functions of the two stochastic trends:

$$m^r - y^r = (0.82 - 1.27)\sum\hat{\varepsilon}_{y^r,i} - (16.0 - 3.5)\sum\hat{\varepsilon}_{R_b,i} = -0.45\sum\hat{\varepsilon}_{y^r,i} - 12.50\sum\hat{\varepsilon}_{R_b,i}$$

$$R_m - R_b = (0.02 - 0.05)\sum\hat{\varepsilon}_{y^r,i} + (0.55 - 1.45)\sum\hat{\varepsilon}_{R_b,i} = -0.03\sum\hat{\varepsilon}_{y^r,i} - 0.90\sum\hat{\varepsilon}_{R_b,i}$$

$$R_m - \Delta p = (0.02 + 0.10)\sum\hat{\varepsilon}_{y^r,i} + (0.55 - 0.44)\sum\hat{\varepsilon}_{R_b,i} = 0.12\sum\hat{\varepsilon}_{y^r,i} + 0.11\sum\hat{\varepsilon}_{R_b,i}$$

$$R_b - \Delta p = (0.05 + 0.10)\sum\hat{\varepsilon}_{y^r,i} + (1.45 - 0.44)\sum\hat{\varepsilon}_{R_b,i} = 0.15\sum\hat{\varepsilon}_{y^r,i} + 1.01\sum\hat{\varepsilon}_{R_b,i}$$

The nonstationarity of the liquidity ratio seems primarily to derive from cumulated shocks to the bond rate. The level of real money balances seems to have been strongly influenced by the latter trend, whereas the level of real aggregate income much less so. Thus, the results seem to suggest that the Danish money demand has been more interest rate elastic than aggregate demand. Furthermore, cumulated shocks to real aggregate income might have influenced real money balances and real income differently (though the difference between 0.82 and 1.27 may or may not be statistically significant).

The nonstationarity of the interest spread is related to both stochastic trends. Both the short- and the long-term interest rate have been affected by the two shocks, but the bond rate more strongly so. The magnitude of the nonstationarity of the short-term and the long-term real interest rate appears to differ in the sense that the real short-term interest rate is closer to stationarity than the real long-term rate. This is primarily because the real bond rate has been more strongly affected by the two stochastic trends than the real deposit rate.

Furthermore, it is interesting to notice that the first stochastic trend, describing shocks to real income, has had a positive impact on both interest rates but a negative one on the inflation rate, whereas the second trend describing shocks to the bond rate has had a positive impact on the inflation rate and the two interest rates. Altogether, the results indicate that the relationship between excess aggregate demand and inflation rate has not followed conventional mechanisms in the present sample period. It seems plausible that the nonstationarity of real interest rates in this period is related to the 'real income' stochastic trend and its effect on the inflation rate.

of which the real trend is probably the most signiﬁcant.11 − 0. whereas the common trends representation is based on a restricted β and α. The stationarity of the interest rate relation has been achieved by combining the two interest rates in the same proportion as the two stochastic trends enter the variables.4 Economic identiﬁcation We have now obtained estimates of the common trends representation that can be used to evaluate the empirical content of the real money representation .i + (0. Altogether.03ΣˆRb.i ε ε = −0. Adding a small fraction of real income is enough to counterbalance the two stochastic trends so that stationarity is strongly improved in the extended relation (H28 in Table 9.03Σˆyr .e. whereas the second trend describing shocks to the bond rate has had a positive impact on the inﬂation rate and the two interest rates. (mr − y r ) − 13.01 − 0. It seems plausible that the nonstationarity of real interest rates in this period is related to the ’real income’ stochastic trend and its eﬀect on the inﬂation rate.10 − 0. Based on the above results it is now straightforward to examine the stationary cointegration relations deﬁned by HS. H27 in Table 9.i − 0.254 CHAPTER 12. It appears that the homogeneous inﬂation-interest rates relation (i.i ε ε The reason why the stochastic trends do not cancel exactly is that cointegration relations are derived under the hypothesis H24 but without imposing weak exogeneity restrictions on α.06Σˆyr .06Σˆyr .i ε = = 0. The stationarity of the third relation is achieved by combining a homogeneous relation between the inﬂation rate and the two interest rates with the real income variable.4 .3) does not cancel the two stochastic trends.01ΣˆRb.8(Rm − Rb ) Rm − 0.4Rb (Rm − ∆p) + 0.10)ΣˆRb.3).i ε ε = 0.28)Σˆyr . the results indicate that the relationship between excess aggregate demand and inﬂation rate has not followed conventional mechanisms in the present sample period. 12. IDENTIFICATION OF COMMON TRENDS inﬂation rate. 
The stationarity of the empirical money demand relation is a result of the interest rate spread being inﬂuenced by the stochastic trends in the same proportion as the liquidity ratio.45 + 0.i − 0.08y r = (0.5(Rm − Rb ) − 0.

discussed in Chapter 2. The theoretical model predicted that nominal growth derives from money expansion in excess of real productive growth in the economy and real growth from productivity shocks to the economy. The empirical model was consistent with finding two autonomous permanent disturbances which cumulate to two common driving trends. One of these seemed consistent with the hypothetical real income trend, whereas the other was clearly not related to excess monetary expansion in the domestic market. Instead, it was generated by permanent 'shocks' to the long-term bond rate and, thus, seemed more related to financial behavior in the long-term capital market.

To facilitate the discussion of the cointegration implications of the empirical results for the real money representation we reproduce both the theoretical representation and the estimates from Table 12.2, Case 3, with the weak exogeneity restrictions imposed on α and the restrictions $H_{S.4}$ on β. To improve comparability $u_{1,t}$ denotes an autonomous real disturbance and $u_{2,t}$ a nominal disturbance.

$$
\begin{bmatrix} m^r_t \\ \Delta p_t \\ y^r_t \\ R_{m,t} \\ R_{b,t} \end{bmatrix}
=
\begin{bmatrix} d_{12} & 0 \\ 0 & c_{21} \\ d_{12} & 0 \\ 0 & c_{21} \\ 0 & c_{21} \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+ \text{stationary and deterministic components}
\tag{12.7}
$$

$$
\begin{bmatrix} m^r \\ \Delta p \\ y^r \\ R_m \\ R_b \end{bmatrix}
=
\begin{bmatrix} 0.82 & -16.44 \\ -0.10 & 0.55 \\ 1.27 & -3.49 \\ 0.02 & 0.45 \\ 0.05 & 1.0 \end{bmatrix}
\begin{bmatrix} \sum u_{1i} \\ \sum u_{2i} \end{bmatrix}
+ \text{stationary and deterministic components}
\tag{12.8}
$$

where $u_{1t} = \alpha_{\perp,1}'\varepsilon_t$ and $u_{2t} = \alpha_{\perp,2}'\varepsilon_t$.

The cointegration implications of the theory model (12.7) are that velocity, $(m^r - y^r)$, the interest rate spread, $(R_m - R_b)$, and the real interest rates, $(R_m - \Delta p)$ and $(R_b - \Delta p)$, should all be stationary. As demonstrated above this was not the case here and we will now discuss why.

According to (12.7) the real trend should influence real money stock and real income with the same coefficients. The estimated coefficients are 0.82 to money stock and 1.27 to real income, i.e. both are positive and not too far away

from unity, but may, nevertheless, deviate too much from each other. The theory model predicts that the nominal I(1) trend should influence neither real money nor real income. It appears that the nominal trend (defined here as the cumulated shocks to the long-term bond rate) has indeed an insignificant effect on the real income variable, but a very significant one on real money stock. Therefore, some of the discrepancy between the assumed monetary model and the empirical evidence can be related to the strong impact of the long-term interest rate in the money demand relation (conventional monetary models would assume the demand for money to be interest rate inelastic).

The Fisher parity predicts that nominal interest rates are only influenced by the stochastic nominal trend and the expectations hypothesis predicts that the latter influences the interest rates with equal coefficients. The empirical results show that both stochastic trends influence the two interest rates in a similar way though not in the proportion one to one. Instead the weights are approximately in the proportion 0.4 to 1.0, consistent with $(R_m - 0.4R_b) \sim I(0)$ (probably because we used average yields on M3 money stock).

Finally, but not least importantly, the inflation rate is significantly affected by both trends, which is against the econometric assumption of Chapter 2 that only the nominal trend should influence inflation. The effect from the nominal trend (shocks to the bond rate) is positive, possibly reflecting the self-fulfilling effect of long-term inflationary expectations on price inflation. Furthermore, the effect from the real trend on inflation is highly significant and negative. The latter seems counter-intuitive, at least in terms of a conventional Phillips curve relationship.

The finding of two stochastic trends in the inflation rate could also be consistent with prices containing two stochastic I(2) trends instead of one. However, the estimated results of the I(2) analysis in Chapter 14 clearly support the existence of one I(2) trend. Therefore, to achieve econometrically consistent results we need to redefine the common nominal trend as a linear combination of the shocks to real expenditure and to the nominal bond rate. By transforming $\tilde{\beta}_\perp$ based on (12.8) using the following transformation matrix $Q$

$$
Q = \begin{bmatrix} 1.00 & 0.0 \\ 0.23 & 1.0 \end{bmatrix}
$$

we achieve the following common trends representation

28 Rb  In (12. Note that the two common trends are no uncorrelated as they were in (12. the condition that real money stock and real aggregate demand should be aﬀected similarly by the real stochastic trend is now lost.i and u1i = Σεyr .9): · 1.23Σεyr .t and ˆRb.9) and (12. and negatively so.0 ¸ 0.31 m  ∆p   0. the ﬁnding that the nominal stochastic · P ¸ stationary and   P u1i + deterministic  u2i  components (12.55  components 0. the economic implications of the results became less plausible in (12.0 1.10) the econometric condition is satisﬁed but at the sacriﬁce of the uniqueness condition.1 about interpreting shocks we conclude that the deﬁnition of u1t and u2t satisﬁes the requirement of uniqueness and. in price inﬂation than were shocks to the bond rate.9) Q= and obtain the representation:  r   −2. and in (12.t ε ε being correlated with a coeﬃcient 0.0     Rm   0.10) .44  · P stationary and   P u1i + deterministic 0. Furthermore. but fails on the econometric condition that only the nominal stochastic trend should inﬂuence inﬂation.0 0.67 0.12.0 0. For example.86 −16. possibly novelty.15 1.47 0.9) and (12.i as before. Moreover.47 -3.45 0. Given the discussion in Secon β tion 12.86 −5.38 1.49  u2i 0.0 ¸     =       −2.03 (see Table 11.10).5). in (12.45 (12.4.9) on both.0 7.38 4.44  r    y  =  0.8) on α⊥ . ECONOMIC IDENTIFICATION 257       mr ∆p yr Rm Rb  in which inﬂation is aﬀected by a single common stochastic trend deﬁned by u2i = ΣεRb. In (12.10) we have identiﬁed the common trends by imposing restrictions e ⊥ . This seems to be related to the fact that the cumulated shocks to real aggregate demand were more signiﬁcant. Though the coeﬃcient of the nominal trend in y r is insigniﬁcant we can similarly apply the following transformation to (12.15 0.8) with ˆyr .i + 0.

trend did not arise from shocks to money stock was very strong and significant. Thus, the results illustrate how fragile a structural interpretation of VAR residuals can be.

Altogether, the empirical evidence based on the present sample does not seem to support the conventional monetary model as represented by (12.7). In particular, the finding that shocks to the long-term bond rate, instead of the money stock, were an important driving force, suggests that financial deregulation and the increased globalization may have played a more crucial role for nominal growth in the domestic economy than the actions of the central bank. Whether or not such an empirical claim can be forcefully brought forward depends on the reliability of the empirical results, i.e. whether or not the structuring of the economic reality using the cointegrated VAR model is empirically convincing. However, as long as a first order linear approximation of the underlying economic structure provides an adequate description of the empirical reality, the VAR model is essentially a convenient summary of the covariances of the data (see Chapter 3). Provided that further reductions (simplifications) of the model are based on valid procedures for statistical inference the final empirical results should essentially reflect the information in the covariances of the data. The logic of the conventional monetary model seems to be inconsistent with the logic of the econometric analysis and, thus, points to a need to reconsider the theoretical basis for understanding inflationary mechanisms in this period.

By comparing the theoretical model with the corresponding empirical results we may get some understanding for why the theory model failed. Thus, in the ideal case the empirical analysis might suggest possible directions for modifying the theoretical model. Alternatively, it might suggest how to modify the empirical model (for example by adding more data) to make it more consistent with the theory. In either case the analysis points forward, which is why we believe the VAR methodology has the potential of being a progressive research paradigm. We will return to this question in the last chapter.

Up to this point we have not explicitly identified the estimated shocks by making them uncorrelated, or by distinguishing between permanent and transitory shocks. This will be the topic of the next chapter.
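The linear combinations discussed in Section 12.3 can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the loadings of the two cumulated shocks are those read from representation (12.8) with rows ordered as mr, ∆p, yr, Rm, Rb; the helper name `loading` and the variable labels are hypothetical and only serve the illustration.

```python
import numpy as np

# Loadings on (real-income trend, bond-rate trend), as read from (12.8).
# Row order is an assumption of this sketch: mr, dp, yr, Rm, Rb.
names = ["mr", "dp", "yr", "Rm", "Rb"]
beta_perp = np.array([
    [0.82, -16.44],
    [-0.10, 0.55],
    [1.27, -3.49],
    [0.02, 0.45],
    [0.05, 1.00],
])

def loading(weights):
    """Trend loadings of the linear combination sum_j weights[name_j] * x_j."""
    w = np.array([weights.get(n, 0.0) for n in names])
    return w @ beta_perp

liquidity_ratio = loading({"mr": 1, "yr": -1})      # mr - yr
spread = loading({"Rm": 1, "Rb": -1})               # Rm - Rb
real_short_rate = loading({"Rm": 1, "dp": -1})      # Rm - dp
real_long_rate = loading({"Rb": 1, "dp": -1})       # Rb - dp
interest_relation = loading({"Rm": 1, "Rb": -0.4})  # Rm - 0.4 Rb

for label, value in [
    ("mr - yr", liquidity_ratio),
    ("Rm - Rb", spread),
    ("Rm - dp", real_short_rate),
    ("Rb - dp", real_long_rate),
    ("Rm - 0.4 Rb", interest_relation),
]:
    print(f"{label:12s} real trend: {value[0]: .2f}   nominal trend: {value[1]: .2f}")
```

A combination is a candidate for stationarity when both loadings are close to zero, which in this sketch is the case for Rm − 0.4Rb but not for the liquidity ratio or the spread taken on their own.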

There is. the shocks which do not cumulate to stochastic trends. The previous chapter argued that this is particularly so when the shock has to be estimated from the VAR residuals which are seldom invariant to extensions of the information set. whereas the identiﬁcation of the common trends is formulated as p − r linear hypotheses on the shocks to the variables/equations. the empirical deﬁnition of a (structural) shock is more arbitrary. however. Therefore. the deﬁnition of a structural shock and how to identify it based on the VAR residuals plays an important role in the identiﬁcation of the common trends. This is the purpose of this chapter. 13. with εt ∼ Np (0.1 Transitory and permanent shocks The VAR model: ∆xt = Γ1 ∆xt−1 +αβ0 xt−1 +µ + εt . i. an important diﬀerence: The identiﬁcation of the longrun structure is formulated as r linear hypotheses on the variables of the system.Chapter 13 Identiﬁcation of a Structural VAR In the previous chapter no attempt was made to identify the transitory shocks.e. Ω). While we do not in general need to discuss what a variable is. The Vector Equilibrium Correction model with simultaneous eﬀects: 259 .

$$
A_0 \Delta x_t = A_1 \Delta x_{t-1} + a\beta' x_{t-1} + \mu_a + v_t, \quad v_t \sim N_p(0, A_0 \Omega A_0')
$$

where $A_1 = A_0\Gamma_1$, $a = A_0\alpha$, $\mu_a = A_0\mu$, and $v_t = A_0\varepsilon_t$.

The MA representation of the VAR model assuming no linear trends in the data:

$$
x_t = C \sum \varepsilon_i + C^*(L)(\varepsilon_t + \mu_0), \tag{13.1}
$$

where

$$
\begin{aligned}
C &= \beta_\perp(\alpha_\perp' \Gamma \beta_\perp)^{-1}\alpha_\perp' && (13.2) \\
  &= \tilde{\beta}_\perp \alpha_\perp'. && (13.3)
\end{aligned}
$$

The SVAR model: The SVAR model differs from the Eq.C. model in the following sense:

1. the VAR residuals are assumed to be related to some underlying 'structural' shocks which are linearly independent
2. the p 'structural shocks' are divided into (p − r) permanent and r transitory shocks.

Most common trends models achieve this by assuming that

1. transitory shocks have no long-run impact on the variables in the system (i.e. the transitory shocks define zero columns in the C matrix)
2. permanent shocks have a significant long-run impact on the variables of the system.

We consider now a 'structural' representation as defined by the matrix B (which is similar to the matrix $A_0$ of the Eq.C model, except that B does not assume a normalization) associating the 'structural' shocks $u_t$ with the VAR residuals:

$$
u_t = B\varepsilon_t \tag{13.4}
$$

or, alternatively, associating the VAR residuals with the underlying structural shocks:

$$
\varepsilon_t = B^{-1}u_t \tag{13.5}
$$

We can now reformulate (13.1) using (13.5):

$$
x_t = \tilde{\beta}_\perp \alpha_\perp' B^{-1} \sum u_i + C^*(L)B^{-1}u_t + \mu_0. \tag{13.6}
$$

We need to choose B so that the 'usual' assumptions underlying a structural interpretation are satisfied:

1. Distinction between transitory and permanent shocks, i.e. $u_t = [u_s, u_l] = [u_{s,1}, ..., u_{s,r}, u_{l,1}, ..., u_{l,p-r}]$
2. The transitory shocks, $[u_{s,1}, ..., u_{s,r}]$, have no long-run impact on the variables of the system whereas the permanent shocks $[u_{l,1}, ..., u_{l,p-r}]$ have.
3. $E(u_t u_t') = I_p$, i.e. all 'structural' shocks are linearly independent or, alternatively
4. Orthogonality within the groups of transitory and permanent shocks, $E(u_{s,t}u_{s,t}') = I_r$ and $E(u_{l,t}u_{l,t}') = I_{p-r}$

Identification: By including the relation (13.4) to the unrestricted VAR model we have introduced p × p (= 25) additional parameters so that we need to impose as many restrictions to achieve just identification. The orthogonality condition 3. implies that {p × (p + 1)}/2 (= 15) of the parameters in $B^{-1}$ have been fixed. The condition 2. restricts (p − r) × r (= 6) additional parameters. Thus, in our example we need to impose 4 additional restrictions to achieve complete uniqueness of the 'structural' shocks.

We will first choose B so that conditions 1., 2., and 3. are satisfied.

1. The transitory shocks, $u_{s,t}$, and the permanent shocks, $u_{l,t}$, in $u_t = B\varepsilon_t$ can be achieved by choosing:

$$
u_{s,t} = \alpha'\Omega^{-1}\varepsilon_t, \qquad u_{l,t} = \alpha_\perp'\varepsilon_t.
$$

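2. Orthogonality of the transitory shocks implies that $E\{(\alpha'\Omega^{-1}\varepsilon_t)(\varepsilon_t'\Omega^{-1}\alpha)\} = \alpha'\Omega^{-1}\alpha = I$, which can be achieved by:

$$
u_{s,t} = (\alpha'\Omega^{-1}\alpha)^{-1/2}\alpha'\Omega^{-1}\varepsilon_t
$$

Orthogonality of the permanent shocks implies that $E\{(\alpha_\perp'\varepsilon_t)(\varepsilon_t'\alpha_\perp)\} = \alpha_\perp'\Omega\alpha_\perp = I$, which can be achieved by:

$$
u_{l,t} = (\alpha_\perp'\Omega\alpha_\perp)^{-1/2}\alpha_\perp'\varepsilon_t.
$$

Orthogonality within the two groups can be achieved by choosing:

$$
B = \begin{bmatrix} (\alpha'\Omega^{-1}\alpha)^{-1/2}\alpha'\Omega^{-1} \\ (\alpha_\perp'\Omega\alpha_\perp)^{-1/2}\alpha_\perp' \end{bmatrix}
$$

which corresponds to:

$$
B^{-1} = \begin{bmatrix} \alpha(\alpha'\Omega^{-1}\alpha)^{-1/2}, & \Omega\alpha_\perp(\alpha_\perp'\Omega\alpha_\perp)^{-1/2} \end{bmatrix}.
$$

We now relax the assumption that permanent and transitory shocks are uncorrelated, while maintaining the orthogonality assumption within the groups. Thus, we still need to distinguish between permanent and transitory shocks, but transitory shocks, say, can be correlated with permanent shocks. This can be achieved by the following choice:

$$
B = \begin{bmatrix} \bar{\alpha}' \\ \alpha_\perp' \end{bmatrix} \quad \text{and} \quad B^{-1} = \begin{bmatrix} \alpha, & \bar{\alpha}_\perp \end{bmatrix},
$$

where $\bar{\alpha} = \alpha(\alpha'\alpha)^{-1}$ and $\bar{\alpha}_\perp = \alpha_\perp(\alpha_\perp'\alpha_\perp)^{-1}$, so that $\bar{\alpha}'\varepsilon_t$ define the transitory shocks, and $\alpha_\perp'\varepsilon_t$ define the permanent shocks. The uniqueness can be achieved econometrically by choosing $A_0$ so that the covariance matrix Ω becomes diagonal and by appropriately restricting $\alpha_\perp$.

For empirical applications see Mellander, Vredin and Warne (1992),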

Hansen and Warne (200?), and Coenen and Vega (2000). Whether the resulting estimate $\hat{v}_t$ can be given an economic interpretation as a unique structural shock depends crucially on the plausibility of the identifying assumptions. Some of them are just-identifying and, thus, cannot be tested against the data.
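The matrix algebra of Section 13.1 can be verified numerically. The sketch below is only an illustration under stated assumptions: α, β, Γ and Ω are randomly generated placeholders (not estimates from the Danish data), the orthogonal complements are computed from a singular value decomposition, and all function names are hypothetical. It builds the long-run impact matrix C and the matrix B that makes the shocks orthonormal within and across the transitory and permanent groups, and then checks that the transitory shocks have no long-run impact.

```python
import numpy as np

def orth_complement(a):
    """Orthonormal basis for the orthogonal complement of the columns of a
    (a is p x r with full column rank)."""
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]

def inv_sqrt(m):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def long_run_impact(alpha, beta, gamma):
    """C = beta_perp (alpha_perp' Gamma beta_perp)^{-1} alpha_perp'."""
    a_perp, b_perp = orth_complement(alpha), orth_complement(beta)
    return b_perp @ np.linalg.inv(a_perp.T @ gamma @ b_perp) @ a_perp.T

def orthogonal_shock_matrix(alpha, omega):
    """B with rows [transitory; permanent] such that B Omega B' = I_p."""
    a_perp = orth_complement(alpha)
    omega_inv = np.linalg.inv(omega)
    transitory = inv_sqrt(alpha.T @ omega_inv @ alpha) @ alpha.T @ omega_inv
    permanent = inv_sqrt(a_perp.T @ omega @ a_perp) @ a_perp.T
    return np.vstack([transitory, permanent])

# Hypothetical inputs: p = 5 variables and r = 3 cointegration relations.
rng = np.random.default_rng(0)
p, r = 5, 3
alpha = rng.normal(size=(p, r))
beta = rng.normal(size=(p, r))
gamma = np.eye(p) - 0.2 * rng.normal(size=(p, p))   # plays the role of Gamma
root = rng.normal(size=(p, p))
omega = root @ root.T + p * np.eye(p)               # positive definite covariance

C = long_run_impact(alpha, beta, gamma)
B = orthogonal_shock_matrix(alpha, omega)
B_inv = np.linalg.inv(B)

shock_cov = B @ omega @ B.T            # covariance of u_t = B eps_t
transitory_impact = C @ B_inv[:, :r]   # long-run impact of the r transitory shocks

# Restriction count: p*p parameters in B, p(p+1)/2 fixed by orthonormality,
# (p - r)*r fixed by the zero long-run impact of the transitory shocks.
remaining = p * p - p * (p + 1) // 2 - (p - r) * r

print(np.round(shock_cov, 10))
print(np.abs(transitory_impact).max())
print(remaining)
```

Because the first r columns of $B^{-1}$ are proportional to α and $\alpha_\perp'\alpha = 0$, the long-run impact of the transitory shocks vanishes by construction; the count of remaining restrictions reproduces the 25 − 15 − 6 = 4 of the text.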

Chapter 14 I(2) Symptoms in I(1) Models

The purpose of this chapter is to give a soft introduction to the I(2) model so that the reader has a good intuition for the basic ideas prior to the analysis of the formal I(2) model of the next chapter. Generally, there are few applications of the cointegrated VAR model for I(2) data. The reason for this is that the existence of I(2) data and, hence, the relevance of the I(2) model has often been disregarded from the outset based on economic arguments. For example, the intuition that I(2) trends can be found over very long time periods is at odds with the fact that significant mean reversion is more likely to be found in large than in small samples. However, as argued in Chapter 2, the unit root property is a statistical concept which in general should not be translated into a property of an economic problem. Therefore, to give a double unit root a structural economic interpretation as a 'very' long-run relationship in the data is generally not granted. Thus, the hypothesis of a double unit root is often hard to reject even in moderately sized samples when the adjustment behavior is sluggish. Thus, while the distinction between the I(1) and I(2) model is theoretically sharp, empirically it is often much more diffuse.

The statistical analysis of the I(2) model is quite involved and the move from the fairly well-known I(1) world to the more complex I(2) analysis may easily seem prohibitive. The aim of this chapter and the next is to convince the reader that the I(2) analysis, though possibly demanding, is well worth the effort. We will argue here that the I(2) analysis offers a wealth of largely unexploited possibilities for an improved understanding of empirical problems where acceleration rates matter, for example, the determination of the inflation rate.

Most econometric software packages contain a routine for cointegration analysis based on the I(1) model, but only a few include tests and estimation procedures for the I(2) model. However, before doing that one would probably first like to know whether the data are I(2) or not. Therefore, most empirical applications in which any of the variables might contain a double unit root report the results of some univariate Dickey-Fuller type tests applied to each variable separately. We will, however, leave the formal I(2) testing to the next chapter and, therefore, not discuss the univariate test procedures here because the next chapter will strongly argue that univariate tests of individual variables cannot (and should not) replace the multivariate I(2) test procedure.

Section 14.1 gives a brief informal introduction to the role of deterministic and stochastic components in models with nominal variables. A further section provides an intuitive approach to the I(2) model based on the definition of the so called R model of Chapter 7. Section 14.3 discusses typical 'symptoms' in the I(1) model signalling I(2) problems. Section 14.4 gives some examples of questions which can be adequately asked and tested based on the I(1) model even if the data are I(2). Section 14.5 discusses under which circumstances the data can be transformed to I(1) variables without loss of information. Section 14.6 concludes.

14.1 Stochastic and deterministic components in nominal models

When analyzing nominal instead of real variables, for example, $m_t$ and $p_t$ instead of $(m - p)_t$, we need to reconsider the role of the stochastic and deterministic components and how they enter the model. The distinction between an I(2) variable and a trend-adjusted I(1) variable is often diffuse. It is often the case that an I(2) variable can be approximated by a trend-adjusted I(1) variable when it is observed over shorter periods. Because an I(2) variable typically exhibits a very smooth behavior, which can be difficult to distinguish from an I(1) variable with a linear trend, the differenced data are often more informative about potential I(2) behavior than the data in levels, particularly in small samples. Therefore, we will take a look at the graphs of the data in levels and differences as a first step in the analysis. Since the slope coefficient of a linear trend in $x_t$ corresponds to an average growth rate of $\Delta x_t$, the graph of the latter

Hypothetically.1. In the former case we would have chosen to model the shift from a high inﬂation to a low inﬂation regime deterministically in the latter case stochastically. often suggests whether there is signiﬁcant mean reversion in the diﬀerences or not. For example nominal money stock and prices in Figure 14. From an econometric point of view this shock violated the normality assumption of the VAR model and. STOCHASTIC AND DETERMINISTIC COMPONENTS IN NOMINAL MODELS267 . exhibit smooth trending behavior over the whole sample period but with a change in the slope at around 1982-83. was accounted for by a blip dummy.1: The levels of the variables for the Danish nominal data.025 .4 1. In many cases it may be possible to avoid the I(2) analysis by allowing suﬃciently many mean shifts in ∆xt (i.1. trend-adjusted nominal money and prices could be found to be empirically I(1) when allowing for diﬀerent growth rates in the two regimes.e. Does the choice matter or not? It depends! In the Danish data we detected an extraordinary large shock in the bond rate and real money stock at 1983:1 which approximately coincided with a change in inﬂationary regimes.03 .75 .02 .5 1.14.5 1980 1985 1990 1995 .5 . Consistent with this the growth rates (in panel c and d) seem to ﬂuctuate around a higher mean value up to 1983 and a lower value thereafter. The latter cumulates to a shift in the levels of the bond rate and the levels of real .03 1975 1980 1985 1990 1995 ib ry m .05 .015 1980 1985 1990 1995 py 1975 id 1980 1985 1990 1995 1975 1980 1985 1990 1995 Figure 14.2 1975 .25 1975 1.04 . but to be I(2) when assuming constant linear growth rates. broken linear trends in xt ) even if the growth rate seems to be drifting oﬀ in a nonstationary manner.3 1. panel a and b.5 0 -. hence.

money stock. Thus, by econometrically accounting for the extraordinary large shock in the data we have in fact chosen to model the shift deterministically. For example, Chapter 7 demonstrated that an unrestricted step dummy in the VAR model corresponds to a broken linear trend in the data. Nonetheless, Table 9.3 showed that the stationarity of the Danish inflation rate was rejected (though borderline so) even when corrected for a shift in the mean at 1983:1.

Figure 14.2: The differences of the variables for the Danish nominal data (panels Dm, Dpy, Dry, Dib, Did; 1975-1995).

In order not to violate the multivariate normality assumption of the VAR model, big interventions and reforms often need to be modelled deterministically and one can ask whether this is good or bad. One could on one hand argue that if the behavioral shift was properly anticipated by the economic agents then a model derived under the assumption of such forward looking behavior should be able to account for these large changes in the data. For example, if model based forward looking expectations are adequately describing agents' behavior then the linear VAR model should exhibit signs of non-constancy and we should preferably move to a nonlinear model analysis. If on the other hand the VAR model provides a good description of the data, then the magnitude of the shock was probably unanticipated and we should treat it as a large innovation outlier.

When there are large shocks in the data, though not large enough to violate the normality assumption, one might, nevertheless, prefer to allow for a mean shift in the differences to avoid the I(2) analysis. In this case we treat some shocks as deterministic though strictly not required by the econometric analysis. Such a choice should, therefore, be justified by economic arguments. For example, one could argue that a shift dummy/linear broken trend is a proxy for an omitted variable which, if included, would have made the trend/dummy variable superfluous in the model. To use broken linear trends and step dummies exclusively to avoid the I(2) analysis does not seem reasonable.

14.2 Estimating an I(1) model with I(2) data

Chapter 9 found that the inflation rate was empirically I(1) based on the data vector $x_t' = [m^r, \Delta p, y^r, R_m, R_b]$, where $m^r = m - p$. This implies that prices are I(2) and, therefore, that nominal money stock is likely to be I(2) as well. Chapter 2 demonstrated that we need to reconsider the role of the stochastic trends when the VAR analysis is based on nominal money and prices $x_t' = [m, p, y^r, R_m, R_b]$. However, in this case we also need to reconsider the role of the deterministic components.

The empirical analyses of Chapter 9 showed that the real money stock and the real income variable contained a linear trend and a level shift at around 1983. The latter did not seem to have generated a broken linear trend in the two variables. The question is whether we need to reconsider the possibility of a broken linear trend in the nominal variables. As a matter of fact nominal money and prices might very well contain a broken linear trend even though $(m - p)$ showed no such evidence. This would be the case if m and p contain the same (broken) trend as it would then cancel in the real transformation. The graphs of nominal money stock and prices in Figure 14.1 suggest that the slope of the linear trend might indeed have changed at around 1983. Based on the nominal VAR analysis it is possible to test the hypothesis that there are broken linear trends in the data and that they cancel in $m - p$.

To illustrate this possibility we respecify the VAR model so that it is consistent with broken linear trends both in the data and in the cointegration relations:

$$
\Delta x_t = \Gamma_1 \Delta x_{t-1} + \alpha\tilde{\beta}'\tilde{x}_{t-2} + \Phi Dp_t + \mu_{01} + \mu_{02}Ds_t + \varepsilon_t, \quad \varepsilon_t \sim N_p(0, \Omega), \quad t = 1, ..., T \tag{14.1}
$$

where $\tilde{\beta}' = [\beta', \beta_{11}, \beta_{12}]$ and $\tilde{x}_t' = [x_t', t, tDs_t]$. $\Phi$, $\mu_{01}$, and $\mu_{02}$ are unrestricted and the broken linear trend is restricted to lie in the cointegration relations to avoid quadratic trends in the data.

Since the specification of the deterministic components is likely to influence any inference regarding possible I(2) components in the VAR model we will first estimate the Danish nominal money model allowing for broken linear trends both in the data and in the cointegration relations and then without allowing for such trends.

When $x_t \sim I(2)$ and, hence, $\Delta x_t \sim I(1)$, the reduced rank restriction on the matrix Π is not sufficient to get rid of all (near) unit roots in the model. Because the process $\Delta x_t$ also contains unit roots we need to impose another reduced rank restriction on the matrix $\Gamma = I - \Gamma_1$. This will be formally discussed in the next chapter. Note, however, that even if the rank of $\Pi = \alpha\beta'$ has been correctly determined there will remain additional unit roots in the VAR model when $x_t$ is I(2). Therefore, a straightforward way of finding out whether there is an additional (near) unit root in the model is to calculate the roots of the vector process (as described in Chapter 3) after the rank r has been determined. If there remain one or several large roots in the model for any reasonable choice of r, then it is a clear sign of I(2) behavior in at least some of the variables. Because the additional unit root(s) belong to the difference matrix $\Gamma = I - \Gamma_1$, lowering the value of r does not remove the additional unit roots associated with the I(2) components in the data.

In the Danish nominal money model there are altogether p × k = 5 × 2 = 10 roots in the characteristic polynomial. The characteristic roots are reported below for the model (14.1) allowing for broken linear trends in the data and in the cointegration relations:

VAR(p)   0.98  0.87  ...
r = 3    1.0   1.0   0.90  ...
r = 2    1.0   1.0   1.0   0.78  ...

There appear to be two large roots in the unrestricted model consistent with r = 3. When imposing this rank restriction, the model contains two unit roots plus an additional fairly large root (0.90). Thus, if the rank is three as argued in Chapter 8, the results might suggest a double unit root in the data. When imposing three unit roots a quite large root (0.78) remains in the model, though it is now considerably smaller compared to r = 3. Nevertheless, the evidence of I(2) behavior is not very strong.

For the model version with no broken trends and $Ds83_t$ restricted to the cointegration space the moduli of the roots are reported below:

VAR(p)   0.99  ...
r = 3    1.0   1.0   ...
r = 2    1.0   1.0   1.0   ...

Compared to the model with broken linear trends the results are almost unchanged. Thus, the stochastic movements in the data do not seem to be particularly well approximated by the introduction of broken linear trends in the model, a conclusion that will be confirmed by the graphs of $\beta' x_t$ and $\beta' R_{1,t}$ given below.

14.3 An intuitive approach

One might ask whether it makes sense at all to estimate the I(1) model in this case. We will argue below that one can test a number of hypotheses based on the I(1) procedure even if $x_t$ is I(2) without losing much precision but that the interpretation of the results has to be modified as compared to the I(1) case. The intuition for this can be seen from the so called R-model discussed in Section 7.1. We reproduce the basic definitions of the R-model:

$$
R_{0t} = \alpha\tilde{\beta}' R_{1t} + \varepsilon_t, \tag{14.2}
$$

where $R_{0t}$ and $R_{1t}$ are found by first concentrating out lagged short-run effects, $\Delta x_{t-1}$, and intervention effects, $Dp_t$, $Ds_t$, in model (14.1):

$$
\Delta x_t = \hat{B}_1 \Delta x_{t-1} + \Phi Dp_t + \mu_{01} + \mu_{02}Ds_t + R_{0t} \tag{14.3}
$$

and

$$
\tilde{x}_{t-2} = \hat{B}_2 \Delta x_{t-1} + \Phi Dp_t + \mu_{01} + \mu_{02}Ds_t + R_{1t}. \tag{14.4}
$$
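The concentration step behind (14.3) and (14.4) amounts to two auxiliary least squares regressions on the same set of regressors. The following is a minimal sketch under simplifying assumptions: the data are simulated, only a constant is included among the deterministic terms (dummies can be passed through the hypothetical `extra_regressors` argument), the levels vector is not augmented with trend terms, and the function names are illustrative rather than taken from any particular software package.

```python
import numpy as np

def concentrate_out(y, z):
    """Residuals from an OLS regression of each column of y on the columns of z."""
    coef, *_ = np.linalg.lstsq(z, y, rcond=None)
    return y - z @ coef

def r_model_residuals(x, extra_regressors=None):
    """R0 and R1 for a VAR with two lags.

    x is a (T, p) array of levels. R0 are the residuals of dx_t and R1 the
    residuals of x_{t-2}, both regressed on dx_{t-1}, a constant and any
    extra deterministic regressors (a (T, k) array aligned with x).
    """
    dx = np.diff(x, axis=0)      # dx[i] = x[i + 1] - x[i]
    dx_t = dx[1:]                # dx_t      for t = 2, ..., T - 1
    dx_lag = dx[:-1]             # dx_{t-1}  for the same t
    x_lag2 = x[:-2]              # x_{t-2}   for the same t
    z = np.column_stack([dx_lag, np.ones(len(dx_lag))])
    if extra_regressors is not None:
        z = np.column_stack([z, extra_regressors[2:]])
    return concentrate_out(dx_t, z), concentrate_out(x_lag2, z), z

# Simulated I(2)-type data: twice cumulated white noise (illustration only).
rng = np.random.default_rng(1)
T, p = 120, 2
x = np.cumsum(np.cumsum(rng.normal(size=(T, p)), axis=0), axis=0)

r0, r1, z = r_model_residuals(x)
print(r0.shape, r1.shape)
```

Both residual matrices are orthogonal to the regressors by construction, so regressing `r0` on `r1` is then free of the short-run and deterministic effects that were concentrated out.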

When $x_t \sim I(2)$, both $\Delta x_t$ and $\Delta x_{t-1}$ contain a common I(1) trend which is cancelled when regressing one on the other as is done in (14.3). Thus, $R_{0t} \sim I(0)$ even if $\Delta x_t \sim I(1)$. On the other hand, the I(2) trend in $\tilde{x}_{t-2}$ cannot be cancelled by regression on the I(1) variable $\Delta x_{t-1}$ as is done in (14.4). Thus, $R_{1t} \sim I(2)$. Because $R_{0t} \sim I(0)$ and $\varepsilon_t \sim I(0)$, the equation (14.2) can only be true if $\beta' R_{1t} \sim I(0)$ or $\beta = 0$. Thus, the linear combination $\beta' R_{1t}$ transforms the process from I(2) to I(0) and we say that the estimate $\hat{\beta}$ is super-super consistent.

The connection between $\beta' \tilde{x}_{t-2}$ and $\beta' R_{1t}$ can be seen by inserting (14.4) into (14.2), leaving out the deterministic terms:

$$
\begin{aligned}
R_{0t} &= \alpha\beta'(\tilde{x}_{t-2} - \hat{B}_2 \Delta x_{t-1}) + \varepsilon_t \\
&= \alpha(\beta'\tilde{x}_{t-2} - \beta'\hat{B}_2 \Delta x_{t-1}) + \varepsilon_t \\
&= \alpha(\beta'\tilde{x}_{t-2} - \omega' \Delta x_{t-1}) + \varepsilon_t
\end{aligned}
\qquad (14.5)
$$

where $\omega' = \beta'\hat{B}_2$. Thus, the stationary relations $\beta' R_{1t}$ consist of two components, $\beta'\tilde{x}_{t-2}$ and $\omega'\Delta x_{t-1}$. For the cointegration relations $\beta_i' R_{1t}$, $i = 1, \ldots, r$, to be stationary there are two possibilities: (i) either $\omega_i = 0$ and $\beta_i'\tilde{x}_{t-2} \sim I(0)$, or (ii) $\omega_i'\Delta x_{t-1} \sim I(1)$ and $\beta_i'\tilde{x}_{t-2} \sim I(1)$ cointegrate to produce the stationary relation $\beta_i' R_{1t} \sim I(0)$. In the first case we talk about directly stationary relations, in the second case about polynomially cointegrated relations. Here we consider $\beta'\tilde{x}_t \sim I(1)$ without separating between the two cases, albeit recognizing that some of the cointegration relations $\beta'\tilde{x}_t$ may be stationary by themselves. The next chapter will discuss more formally how to distinguish between the two cases.

To conclude: when $x_t \sim I(2)$ we have that $\beta_i'\tilde{x}_t \sim I(1)$ and $\beta_i' R_{1t} \sim I(0)$ for at least one $i$, $i = 1, \ldots, r$. This gives a powerful diagnostic for detecting I(2) behavior in the VAR model. It is a strong indication of double unit roots in the data when the graph of at least one of the cointegration relations, $\beta_i'\tilde{x}_t$, exhibits nonstationary behavior but $\beta_i' R_{1t}$ looks stationary. The graphs of the cointegration relations based on model (14.1) with no broken trend in the cointegration relations are given in Figures 14.3-14.5. The upper panels show the relations, $\beta_i'\tilde{x}_t$, and the lower panels the cointegration relations corrected for short-run dynamics, $\beta_i' R_{1t}$.
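The diagnostic can be illustrated on simulated data. The sketch below is only an illustration under assumptions that are not taken from the text: a hypothetical bivariate process in which $x_{1t}$ is I(2), $x_{2t} = x_{1t} - 2\Delta x_{1t} + \text{noise}$, and $\beta = (1, -1)'$, so that $\beta' x_t$ is I(1) while $\beta' R_{1t}$ is stationary (the polynomially cointegrated case). Because least squares is linear, $\beta' R_{1t}$ is obtained as the residual from regressing $\beta' x_{t-2}$ on a constant and $\Delta x_{t-1}$. All function names are hypothetical.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols_residuals(y, regressors):
    """Residuals from regressing y on a constant and the given regressor series."""
    cols = [[1.0] * len(y)] + regressors
    k = len(cols)
    xtx = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * yi for p, yi in zip(cols[i], y)) for i in range(k)]
    coef = solve(xtx, xty)
    return [yi - sum(coef[i] * cols[i][t] for i in range(k))
            for t, yi in enumerate(y)]

def autocorr1(s):
    """First-order sample autocorrelation, a crude measure of persistence."""
    mean = sum(s) / len(s)
    d = [v - mean for v in s]
    return sum(p * q for p, q in zip(d[1:], d[:-1])) / sum(v * v for v in d)

# Hypothetical data generating process (an assumption made for illustration).
random.seed(1)
T = 400
u, trend, x1, x2 = 0.0, 0.0, [], []
for _ in range(T):
    u += random.gauss(0.0, 1.0)      # I(1): first difference of the I(2) trend
    trend += u                       # I(2) trend
    x1.append(trend)
    x2.append(trend - 2.0 * u + random.gauss(0.0, 1.0))

bx = [p - q for p, q in zip(x1, x2)]                 # beta'x_t with beta = (1, -1)
dx1 = [x1[t] - x1[t - 1] for t in range(1, T)]
dx2 = [x2[t] - x2[t - 1] for t in range(1, T)]
b_r1 = ols_residuals(bx[:-2], [dx1[:-1], dx2[:-1]])  # beta'R1_t

rho_levels = autocorr1(bx)
rho_r1 = autocorr1(b_r1)
print("persistence of beta'x:", rho_levels)
print("persistence of beta'R1:", rho_r1)
```

In this design the uncorrected relation is highly persistent whereas the relation corrected for short-run dynamics is not, which is the pattern the graphs are meant to reveal. The autocorrelation is used only as a rough numerical stand-in for visual inspection.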

Figure 14.3. The graphs of $\beta_1' x_t$ (upper panel) and $\beta_1' R_{1t}$ (lower panel).

Figure 14.4. The graphs of $\beta_2' x_t$ (upper panel) and $\beta_2' R_{1t}$ (lower panel).

Figure 14.5. The graphs of $\beta_3' x_t$ (upper panel) and $\beta_3' R_{1t}$ (lower panel).

As can be seen, the graphs of $\beta' x_t$ and $\beta' R_{1t}$ are quite different, supporting the interpretation above that there might be a double unit root in the data.

14.4 Transforming I(2) data to I(1)

Chapter 2 demonstrated that $(m - p) \sim I(1)$ when both nominal money and prices contain the same I(2) trend (with the same coefficients). All the previous empirical analyses using real money and the inflation rate were econometrically valid under the implicit assumption that the stochastic I(2) trend had been cancelled by the nominal-to-real transformation. In the next chapter we will address the nominal-to-real transformation more formally and show that long-run price homogeneity of the cointegration relations is a necessary, but not sufficient, condition for this transformation. We will now test the hypothesis that all cointegration relations satisfy long-run price homogeneity using the standard I(1) procedure.

Chapter 2 discussed the case where the I(2) trends exclusively affected nominal money and prices. Under the assumption of long-run price homogeneity we demonstrated that a transformation of the two nominal variables, $m$ and $p$, to real money, $m - p$, and price inflation, $\Delta p$, removed the I(2) trend without loss of information, i.e.

$$
\begin{bmatrix} m_t \\ p_t \\ y_t^r \\ R_{m,t} \\ R_{b,t} \end{bmatrix} \sim I(2)
\quad \longrightarrow \quad
\begin{bmatrix} (m - p)_t \\ \Delta p_t \\ y_t^r \\ R_{m,t} \\ R_{b,t} \end{bmatrix} \sim I(1)
$$

if $(m_t, p_t) \sim I(2)$, $(y_t^r, R_{m,t}, R_{b,t}) \sim I(1)$, and $m_t$ and $p_t$ cointegrate $(1, -1)$ from I(2) to I(1). In this case the VAR analysis based on the nominal or the real data gives essentially the same results, with the exception that a VAR(k) model based on the real vector will have one more lag of prices ($p_{t-k-1}$) compared to a VAR(k) in nominal variables. Thus, long-run price homogeneity is a very important property when analyzing models based on real transformations.

For the Danish data the test of long-run price homogeneity in the nominal model with no (broken) linear trends in the data or in the cointegration relations was accepted based on a $\chi^2(3) = 3.38$ and a p-value of 0.34. However, when broken linear trends were allowed in the model long-run price homogeneity was not accepted. To be added!

We will now examine the results of the nominal analysis when $\beta$ is unrestricted and when long-run price homogeneity has been imposed and compare the results with the empirical results from the real money analysis of Chapter 9.

Table 14.1 reports the estimates of the $\beta$ relations and the corresponding $\alpha$ for Danish money data in nominal values. We have imposed three over-identifying restrictions (similar to the ones of HS4 in Table 10.2) on the $\beta$ vectors of Table 14.1 to make the nominal results as comparable as possible with the results of the real analysis.

The results of the nominal and real analysis differ primarily with respect to inflation being absent in the cointegration space in the former case but present in the latter case. As shown in Table 8.2 inflation was only present in one of the long-run relations of the preferred structure H4, i.e. in a relation describing a homogeneous relation between inflation and the two interest rates as a function of real aggregate income, $(\Delta p - R_m) - 0.5(R_m - R_b) + 0.1 y^r$. The other two long-run relations described a relationship between (1) velocity and the interest rate spread and (2) the short and the long-term interest rate. We note that $\beta_2' x_t$ approximately reproduces the money demand relation and $\beta_3' x_t$ the interest rate relation of H4.
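Two pieces of the argument in this section can be checked with a few lines of code. The sketch below, which uses hypothetical function names and made-up numbers for the series, first applies the nominal-to-real transformation to two series that load on the same trend with unit coefficients (a deterministic quadratic is used here as a stand-in for a stochastic I(2) trend), and then recomputes the p-value of the reported $\chi^2(3) = 3.38$ statistic from the closed-form upper-tail probability for three degrees of freedom.

```python
import math

def nominal_to_real(m, p):
    """Map log nominal money m and log prices p to (m - p)_t and the inflation rate."""
    real_money = [mi - pi for mi, pi in zip(m, p)]
    inflation = [p[t] - p[t - 1] for t in range(1, len(p))]
    return real_money, inflation

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Uses Q(3/2, x/2) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Made-up illustration: a common trend plus series-specific components a and b.
trend = [0.5 * t * t for t in range(6)]
a = [0.3, 0.1, 0.4, 0.1, 0.5, 0.9]
b = [0.2, 0.7, 0.1, 0.8, 0.2, 0.8]
m = [tr + ai for tr, ai in zip(trend, a)]
p = [tr + bi for tr, bi in zip(trend, b)]

real_money, inflation = nominal_to_real(m, p)
# With price homogeneity the common trend drops out of m - p entirely.
print([round(v, 3) for v in real_money])

p_value = chi2_sf_3df(3.38)
print(round(p_value, 2))
```

The first part shows only the mechanics of the transformation: with coefficients $(1, -1)$ the common trend cancels exactly in $m - p$, while it survives, one order of integration lower, in $\Delta p$. The second part is a plain arithmetic check of the p-value quoted in the text.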

Table 14.1: Two just-identified long-run structures

[Table body illegible in the source. Layout: columns $\hat{\beta}_1$, $\hat{\beta}_2$, $\hat{\beta}_3$ under "Unrestricted VAR" and under "Generically identified"; rows $m$, $p$, $y^r$, $R_m$, $R_b$, $D83$; followed by the adjustment coefficients $\hat{\alpha}_1$, $\hat{\alpha}_2$, $\hat{\alpha}_3$, with t-values in parentheses, for the equations $\Delta m_t$, $\Delta p_t$, $\Delta y_t^r$, $\Delta R_{m,t}$, $\Delta R_{b,t}$.]

The homogeneous inflation interest rate relation can be reproduced from the second equation in Table 14.1 as:

$$
\begin{aligned}
\Delta p_t &= 0.64(-0.2\,y^r + 1.0\,R_m - 0.35\,R_b) + 1.25(R_m - 0.45\,R_b) && (14.6) \\
&= -0.12\,y^r + 1.9\,R_m - 0.8\,R_b, && (14.7)
\end{aligned}
$$

i.e. it corresponds approximately to the following cointegration relation: $(\Delta p - R_m) - 0.8(R_m - R_b) + 0.1 y^r$. Except that the coefficient on the spread is slightly higher in (14.6), the nominal money analysis reproduces the results of the real money analysis remarkably well. The difference between the two versions of the model is basically whether inflation should be treated as stationary or nonstationary. In the real model where inflation is included in the cointegration space, it should correspond to a unit vector if stationary. However, Table 9.3 indicated that this was not so.

14.5 Concluding remarks

We will argue here that both econometrically and economically there is potentially a lot to be gained from using the econometrically rich structure of the I(2) analysis. When long-run price homogeneity is not present we may still be able to transform the data but with some violation of the I(1) properties. Since the analysis based on the I(1) model will then imply some loss of information, the econometric analysis will benefit from treating nominal prices as I(2). There are at least three straightforward ways of checking the possibility of double unit roots in the data:

1. The graphs of the data in levels and differences.
2. The characteristic roots of the model for a reasonable choice of cointegration rank.
3. The graphs of the cointegration relations $\beta' x_t$ compared to $\beta' R_{1t}$.
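For the check based on characteristic roots, the logic can be sketched in the simplest possible setting. The example below is an assumption-laden simplification, not the procedure used for the Danish data: it treats a scalar AR(2) process $x_t = a_1 x_{t-1} + a_2 x_{t-2} + \varepsilon_t$, whose companion matrix $\begin{bmatrix} a_1 & a_2 \\ 1 & 0 \end{bmatrix}$ has characteristic polynomial $z^2 - a_1 z - a_2$, and counts the roots with modulus one. The function names are hypothetical.

```python
import cmath

def ar2_root_moduli(a1, a2):
    """Moduli of the companion-matrix eigenvalues of an AR(2) process.

    The eigenvalues solve z^2 - a1 z - a2 = 0 and are obtained from the
    quadratic formula; complex pairs are handled by cmath.
    """
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    roots = [(a1 + disc) / 2.0, (a1 - disc) / 2.0]
    return sorted((abs(z) for z in roots), reverse=True)

def count_unit_roots(moduli, tol=1e-8):
    """Number of roots whose modulus equals one up to a tolerance."""
    return sum(1 for v in moduli if abs(v - 1.0) < tol)

# Second difference is white noise: x_t = 2 x_{t-1} - x_{t-2} + e_t, an I(2) process.
i2 = ar2_root_moduli(2.0, -1.0)
# First difference is a stationary AR(1): x_t = 1.5 x_{t-1} - 0.5 x_{t-2} + e_t, an I(1) process.
i1 = ar2_root_moduli(1.5, -0.5)
# Stationary case.
i0 = ar2_root_moduli(0.5, 0.2)

for label, moduli in (("I(2)", i2), ("I(1)", i1), ("I(0)", i0)):
    print(label, [round(v, 3) for v in moduli], count_unit_roots(moduli))
```

In estimated models the roots are of course never exactly one, so in practice one looks for roots close to the unit circle that remain after the cointegration rank has been imposed; the exact tolerance used above is only meaningful for these constructed examples.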


Contents

0.1 Preface

1 Introduction
1.1 A historical overview
1.2 On the choice of economic models
1.3 Theoretical, true and observable variables
1.4 Testing a theory as opposed to a hypothesis
1.5 Experimental design in macroeconomics
1.6 On the choice of empirical example

2 Models and Relations
2.1 Inflation and money growth
2.2 The time dependence of macroeconomic data
2.3 A stochastic formulation
2.4 Scenario Analyses: treating prices as I(2)
2.5 Scenario Analyses: treating prices as I(1)
2.6 Concluding remarks

3 The Probability Approach
3.1 A single time series process
3.2 A vector process
3.3 Sequential decomposition of the likelihood function
3.4 Deriving the VAR
3.5 Interpreting the VAR model
3.6 The dynamic properties of the VAR process
3.6.1 The roots of the characteristic function
3.6.2 Calculating the roots using the companion matrix
3.6.3 Illustration
3.7 Concluding remarks

4 Estimation and Specification
4.1 Likelihood based estimation in the unrestricted VAR
4.1.1 The estimates of the unrestricted VAR(2) for the Danish data
4.2 Three different ECM-representations
4.2.1 The ECM formulation with m = 1
4.2.2 The ECM formulation with m = 2
4.2.3 ECM-representation in acceleration rates, changes and levels
4.2.4 The relationship between the different VAR formulations
4.3 Misspecification tests
4.3.1 Specification checking
4.3.2 Residual correlations and information criteria
4.3.3 Tests of residual autocorrelation
4.3.4 Tests of residual heteroscedasticity
4.3.5 Normality tests
4.4 Concluding remarks

5 The Cointegrated VAR Model
5.1 Integration and cointegration
5.2 An intuitive interpretation of $\Pi = \alpha\beta'$
5.3 Common trends
5.4 From AR to MA
5.5 Pulling and pushing forces
5.6 Concluding discussion

6 Deterministic Components
6.1 A dynamic regression model
6.2 A trend and a constant in the VAR
6.3 Five cases
6.4 The MA representation
6.5 Dummy variables in a simple regression model
6.6 Dummy variables and the VAR
6.7 An illustrative example
6.8 Conclusions

7 Estimation in the I(1) Model
7.1 Concentrating the general VAR-model
7.2 Derivation of the ML estimator
7.3 Normalization
7.4 The uniqueness of the unrestricted estimates
7.5 An illustration

8 Cointegration Rank
8.1 The trace test
8.2 The asymptotic tables
8.3 Choosing the rank
8.4 An illustration
8.5 Recursive tests of constancy
8.5.1 Recursively calculated trace tests
8.5.2 The recursively calculated log-likelihood
8.5.3 Recursively calculated prediction tests

9 Testing restrictions
9.1 Formulating hypotheses
9.2 Same restriction
9.2.1 Illustrations
9.3 Some $\beta$ assumed known
9.3.1 Illustrations
9.4 Some coefficients known
9.4.1 Illustrations
9.5 Long-run weak exogeneity
9.5.1 Empirical illustration
9.6 Revisiting the scenario analysis

10 Identification of the Long-Run Structure
10.1 Identification
10.2 Identifying restrictions
10.3 Formulating identifying hypotheses
10.4 Just-identifying restrictions
10.5 Over-identifying restrictions
10.6 Lack of identification
10.7 Recursive tests of $\alpha$ and $\beta$
10.8 Concluding discussions

11 Identification of the Short-Run Structure
11.1 Formulating identifying restrictions
11.2 Interpreting shocks
11.3 Which economic questions?
11.4 Reduced form restrictions
11.5 The VAR in triangular form
11.6 General restrictions
11.7 A partial system
11.8 Economic identification

12 Identification of Common trends
12.1 The common trends representation
12.2 Some special cases
12.3 Illustrations
12.4 Economic identification

13 Identification of a Structural VAR
13.1 Transitory and permanent shocks

14 I(2) Symptoms
14.1 Stochastic and deterministic components in nominal models
14.2 Estimating an I(1) model with I(2) data
14.3 An intuitive account
14.4 Transforming I(2) data to I(1)
14.5 Concluding remarks

15 The I(2) Model
15.1 Introducing the I(2) model
15.2 Defining the I(2) model
15.3 Deterministic components in the I(2) model
15.4 The two-step procedure
15.5 The full information maximum likelihood procedure
15.6 Price relations in the long run and the medium run
15.7 The nominal to real transformation

16 On the Econometric Approach
16.1 A concluding discussion
16.2 General-to-Specific and Specific-to-General