
Multivariate time series

models
VARs and Causality Tests
Chapters 12 & 15 Asteriou (approx. 20 pages)
Chapter 6 Brooks
Chapter 5 Enders
Simultaneous Equations Models

All the models we have looked at thus far have dealt with a single dependent
variable and estimation of a single equation.
All of the independent variables have been EXOGENOUS and y has been an
ENDOGENOUS variable.
However interdependence is often encountered in economics, where several
dependent variables are determined simultaneously
An example from economics to illustrate - the demand and supply of a good:

Q_dt = α + βP_t + γS_t + u_t    (1)
Q_st = λ + μP_t + κT_t + v_t    (2)
Q_dt = Q_st                     (3)

where Q_dt = quantity of the good demanded
Q_st = quantity of the good supplied
P_t = price of the good
S_t = price of a substitute good
T_t = some variable embodying the state of technology
[Note: We're trying to estimate the parameters (i.e. the Greek letters)]
Simultaneous Equations Models:
The Structural Form

Assuming that the market always clears, and dropping the time
subscripts for simplicity:

Q = α + βP + γS + u    (4)
Q = λ + μP + κT + v    (5)

This is a simultaneous STRUCTURAL FORM of the model.

The point is that price and quantity are determined simultaneously by
the market process (price affects quantity and quantity affects price).
Another way to think of this: when P changes all we observe is changes in
equilibrium; we don't know whether this is due to a shift of only demand or only
supply or some combination, i.e. we can't separately estimate the effect of a change
in price on demand or on supply.

P and Q are endogenous variables (they are determined by the model),
while S and T are taken to be exogenous.
We can obtain REDUCED FORM equations corresponding to (4) and
(5) by solving equations (4) and (5) for P and for Q (separately).
The steps before we see the maths
1. Solve the equations simultaneously to find
an expression for P that does not involve Q
This is the reduced form for P
2. Rearrange supply and demand to get them
in the form P = (expression). Then solve simultaneously
to get an expression for Q that does not
involve P
This is the reduced form for Q
3. We can estimate the reduced forms

Obtaining the Reduced Form

Solving to find an expression for P, set demand (4) equal to supply (5):

α + βP + γS + u = λ + μP + κT + v    (6)

Rearranging (6):

βP − μP = λ − α + κT − γS + (v − u)
(β − μ)P = (λ − α) + κT − γS + (v − u)

P = (λ − α)/(β − μ) + κ/(β − μ)·T − γ/(β − μ)·S + (v − u)/(β − μ)    (7)
Obtaining the Reduced Form (cont'd)

Rearranging (4) and (5) to get P on the left-hand side and then equating
them:

(Q − α − γS − u)/β = (Q − λ − κT − v)/μ    (8)

Solving to find an expression for Q, multiply (8) through by βμ:

μQ − μα − μγS − μu = βQ − βλ − βκT − βv
(μ − β)Q = (μα − βλ) − βκT + μγS + (μu − βv)

Q = (μα − βλ)/(μ − β) − βκ/(μ − β)·T + μγ/(μ − β)·S + (μu − βv)/(μ − β)    (9)
Obtaining the Reduced Form (cont'd)

(7) and (9) are the reduced form equations for P and Q:

P = (λ − α)/(β − μ) + κ/(β − μ)·T − γ/(β − μ)·S + (v − u)/(β − μ)    (7)

Q = (μα − βλ)/(μ − β) − βκ/(μ − β)·T + μγ/(μ − β)·S + (μu − βv)/(μ − β)    (9)

Note that each of the reduced form equations
includes the error terms from both (4) and (5)
(i.e. u and v).

But what would happen if we had estimated equations (4)
and (5), i.e. the structural form equations, separately
using OLS?

Both equations (4) and (5) depend on P. One of the
CLRM assumptions was that the error term should be
uncorrelated with the explanatory variables.

It is clear from (7) that P is related to the errors in (4) and
(5) - i.e. it is stochastic.
Another way of saying this is that a shock affecting one
equation will have an effect on the second equation via
the influence on P!

Simultaneous Equations Bias

This can be seen more clearly in the following general form of a simultaneous equation
model:

A) y1t = b10 + b11 y2t + b12 x2t + b14 x4t + e1t
B) y2t = b20 + b21 y1t + b23 x3t + b24 x4t + e2t

Let one of the error terms, e1t, increase and hold all else constant. Then:
- This will cause y1t to increase, due to equation A).
- In turn, y1t will cause y2t to increase (if b21 is positive), due to equation B).
- If y2t increases in B), then it also increases in A), where it is an explanatory variable.
Hence an increase in e1t caused an increase in an explanatory variable in the same equation
(meaning the error and the explanatory variable are correlated with each other!!).
Simultaneous Equations Bias
What would be the consequences for the OLS
estimator, β̂, if we ignore the simultaneity?

Application of OLS to structural equations which are
part of a simultaneous system will lead to biased and
inconsistent estimators

Hence it would not be possible to estimate
simultaneous equations validly using OLS.
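A small simulation can make this bias visible. The sketch below generates data from the structural demand and supply system (4)-(5) using made-up parameter values (α, β, γ, λ, μ, κ chosen for illustration only), then runs OLS on the demand equation. The estimated price slope lands far from the true β because the market-clearing price is correlated with the demand shock u.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# hypothetical structural parameters (illustrative only)
alpha, beta, gamma = 10.0, -1.0, 0.5   # demand: Q = alpha + beta*P + gamma*S + u
lam, mu, kappa = 2.0, 1.5, 0.8         # supply: Q = lam + mu*P + kappa*T + v
S, T = rng.normal(size=n), rng.normal(size=n)
u, v = rng.normal(size=n), rng.normal(size=n)
# market clearing: reduced form (7) gives the equilibrium price
P = (lam - alpha + kappa*T - gamma*S + v - u) / (beta - mu)
Q = alpha + beta*P + gamma*S + u
# naive OLS on the demand equation: regress Q on [1, P, S]
X = np.column_stack([np.ones(n), P, S])
b_ols = np.linalg.lstsq(X, Q, rcond=None)[0]
# b_ols[1] is badly biased away from beta, because corr(P, u) != 0
```

The second print of the regression slope would show it nowhere near β = −1: exactly the simultaneity bias described above.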

So What Can We Do?
Avoiding Simultaneous Equations Bias

Taking the reduced form equations (7) and (9), we can rewrite them as

P = π10 + π11·T + π12·S + ε1    (10)

Q = π20 + π21·T + π22·S + ε2    (11)

We CAN estimate equations (10) & (11) using OLS since all the RHS
variables are exogenous [(10) doesn't have Q while (11) doesn't have P
=> feedback has been controlled for!].
But ... we probably don't care what the values of the π coefficients are;
what we wanted were the original parameters in the structural equations -
α, β, γ, λ, μ, κ.
Can We Retrieve the Original Coefficients from the πs?
Short answer: sometimes.
This is known as the identification problem.
Identification:
We re-wrote:

P = (λ − α)/(β − μ) + κ/(β − μ)·T − γ/(β − μ)·S + (v − u)/(β − μ)

as

P = π10 + π11·T + π12·S + ε1

and

Q = (μα − βλ)/(μ − β) − βκ/(μ − β)·T + μγ/(μ − β)·S + (μu − βv)/(μ − β)

as

Q = π20 + π21·T + π22·S + ε2

So: π10 = (λ − α)/(β − μ), π11 = κ/(β − μ), etc. If we know the πs it
may be possible to solve these equations to find what
we care about, i.e. α, β, etc. - but not always!!
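As a sketch of this idea (sometimes called indirect least squares), the code below simulates the same hypothetical system as before, estimates the reduced forms (10) and (11) by OLS, and recovers the structural slopes from ratios of the πs. In this particular system β = π21/π11 and μ = π22/π12; the parameter values are illustrative, not from any dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
# same hypothetical structural system as before (illustrative values)
alpha, beta, gamma = 10.0, -1.0, 0.5
lam, mu, kappa = 2.0, 1.5, 0.8
S, T = rng.normal(size=n), rng.normal(size=n)
u, v = rng.normal(size=n), rng.normal(size=n)
P = (lam - alpha + kappa*T - gamma*S + v - u) / (beta - mu)
Q = alpha + beta*P + gamma*S + u
# estimate the reduced forms (10) and (11): only exogenous regressors on the RHS
X = np.column_stack([np.ones(n), T, S])
pi_P = np.linalg.lstsq(X, P, rcond=None)[0]   # pi10, pi11, pi12
pi_Q = np.linalg.lstsq(X, Q, rcond=None)[0]   # pi20, pi21, pi22
beta_hat = pi_Q[1] / pi_P[1]   # pi21/pi11 recovers the demand slope beta
mu_hat = pi_Q[2] / pi_P[2]     # pi22/pi12 recovers the supply slope mu
```

Unlike the naive OLS on the structural equation, these ratios come out close to the true β and μ: both structural slopes happen to be identified here because each equation excludes one exogenous variable.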

VARs
Vector Autoregressive (VAR) Models
Identification of exogenous and endogenous variables was
heavily criticised by Sims (1980).
According to Sims, if there is simultaneity among a number of
variables, then all these variables should be treated in the same
way.
All variables should be treated as endogenous.
This means that in its general reduced form each equation has
the same set of regressors, which leads to the development of
VAR models.
VARs have often been advocated as an alternative to large-scale
simultaneous equations structural models.
A natural generalisation of autoregressive models popularised by Sims.

A VAR is in a sense a systems regression model i.e. there is more than one
dependent variable.

Simplest case is a bivariate VAR(k):

y1t = β10 + β11 y1,t−1 + … + β1k y1,t−k + α11 y2,t−1 + … + α1k y2,t−k + u1t
y2t = β20 + β21 y2,t−1 + … + β2k y2,t−k + α21 y1,t−1 + … + α2k y1,t−k + u2t

where u1t and u2t are uncorrelated white noise error terms.

This is a bivariate VAR(k) since there are 2 variables and the longest lag is k.
The analysis could be extended to a VAR(k) model with g variables and g
equations, but still k lags.

(Note: when looking back at previous slides remember Var is used for
Variance!)
Vector Autoregressive Models:
Notation and Concepts

One important feature of VARs is the compactness with which we can write
the notation. For example, consider the case from above where k = 1.

We can write this as

y1t = β10 + β11 y1,t−1 + α11 y2,t−1 + u1t
y2t = β20 + β21 y2,t−1 + α21 y1,t−1 + u2t

or, using matrix notation, as

[ y1t ]   [ β10 ]   [ β11  α11 ] [ y1,t−1 ]   [ u1t ]
[ y2t ] = [ β20 ] + [ α21  β21 ] [ y2,t−1 ] + [ u2t ]

or even more compactly as

y_t = β0 + β1 y_{t−1} + u_t
2×1   2×1   2×2   2×1    2×1
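The standard-form VAR(1) above can be simulated and then estimated equation by equation with OLS, since both equations share the same regressors. The sketch below uses made-up coefficient values (chosen so the system is stable); nothing here comes from the readings.

```python
import numpy as np

rng = np.random.default_rng(2)
# made-up standard-form VAR(1) coefficients (stable: eigenvalues 0.6 and 0.3)
b0 = np.array([0.5, 1.0])
b1 = np.array([[0.5, 0.2],
               [0.1, 0.4]])
T = 5_000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = b0 + b1 @ y[t - 1] + rng.normal(size=2)
# OLS equation by equation; both equations use the regressors [1, y1(t-1), y2(t-1)]
X = np.column_stack([np.ones(T - 1), y[:-1]])
coef = np.linalg.lstsq(X, y[1:], rcond=None)[0]  # column j holds equation j's coefficients
b1_hat = coef[1:].T                              # estimated coefficient matrix
```

With a few thousand observations the estimated coefficient matrix sits close to the true β1, illustrating why OLS per equation is all that is needed for a standard-form VAR.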
Vector Autoregressive Models:
Notation and Concepts (cont'd)

This model can be extended to the case where there are k lags of each variable
in each equation and g endogenous variables:

y_t = β0 + β1 y_{t−1} + β2 y_{t−2} + … + βk y_{t−k} + u_t
g×1   g×1   g×g   g×1     g×g   g×1         g×g   g×1      g×1

We can also extend this to the case where the model includes first-difference
terms and cointegrating relationships (a VECM) - we'll see this later.

Vector Autoregressive Models Compared with Structural
Equations Models

Advantages of VAR Modelling
- Do not need to specify which variables are endogenous or exogenous - all are
endogenous
- Allows the value of a variable to depend on more than just its own lags or combinations of
white noise terms, so more general than ARMA modelling
- Provided that there are no contemporaneous terms on the right-hand side of the equations,
can simply use OLS separately on each equation. For OLS to be used efficiently it is also
required that there is the same lag length in each equation
- Forecasts are often better than those of traditional structural models

Problems with VARs
- VARs are a-theoretical (as are ARMA models)
- How do you decide the appropriate lag length?
- So many parameters! If we have g equations for g variables and we have k lags of each of
the variables in each equation, we have to estimate (g + kg²) parameters, e.g. g = 3, k = 3
gives 30 parameters - so we need a lot of data
- Do we need to ensure all components of the VAR are stationary?
- How do we interpret the coefficients?
Does the VAR Include Contemporaneous Terms?

So far, we have assumed the VAR is of the form

y1t = β10 + β11 y1,t−1 + α11 y2,t−1 + u1t
y2t = β20 + β21 y2,t−1 + α21 y1,t−1 + u2t

But what if the equations had a contemporaneous feedback term?

y1t = β10 + β11 y1,t−1 + α11 y2,t−1 + α12 y2t + u1t
y2t = β20 + β21 y2,t−1 + α21 y1,t−1 + α22 y1t + u2t

We can write this as

[ y1t ]   [ β10 ]   [ β11  α11 ] [ y1,t−1 ]   [ 0    α12 ] [ y1t ]   [ u1t ]
[ y2t ] = [ β20 ] + [ α21  β21 ] [ y2,t−1 ] + [ α22  0   ] [ y2t ] + [ u2t ]

This VAR is in primitive (structural) form. It cannot be estimated using
OLS.
Primitive versus Standard Form VARs

We can take the contemporaneous terms over to the LHS and write

[ 1    −α12 ] [ y1t ]   [ β10 ]   [ β11  α11 ] [ y1,t−1 ]   [ u1t ]
[ −α22  1   ] [ y2t ] = [ β20 ] + [ α21  β21 ] [ y2,t−1 ] + [ u2t ]

or

B y_t = β0 + β1 y_{t−1} + u_t

We can then pre-multiply both sides by B⁻¹ to give

y_t = B⁻¹β0 + B⁻¹β1 y_{t−1} + B⁻¹u_t

or

y_t = A0 + A1 y_{t−1} + e_t

This is known as a standard form VAR, which we can estimate using OLS.
In general the approach is to estimate the VAR in standard form.
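A minimal numerical sketch of the primitive-to-standard transformation, with arbitrary illustrative values for the α and β parameters (the contemporaneous terms α12, α22 sit off-diagonal in B):

```python
import numpy as np

# made-up primitive-form parameters; a12, a22 are the contemporaneous terms
a12, a22 = 0.3, 0.2
B = np.array([[1.0, -a12],
              [-a22, 1.0]])
beta0 = np.array([0.5, 1.0])
beta1 = np.array([[0.6, 0.1],
                  [0.2, 0.5]])
B_inv = np.linalg.inv(B)
A0 = B_inv @ beta0   # standard-form intercepts
A1 = B_inv @ beta1   # standard-form slope matrix
# note e_t = B_inv @ u_t: each standard-form error mixes both structural shocks
```

The last comment is the important design point: OLS on the standard form is valid, but its errors e_t are linear combinations of the structural shocks, which is why recovering the primitive form requires identifying restrictions.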
Choosing the Optimal Lag Length for a VAR

An important decision, as too few lags will mean the model is
misspecified and too many lags will waste degrees of freedom.
To check lag length, begin with the longest feasible length given
degrees-of-freedom considerations.
Can use single-equation and multi-equation information criteria.
The AIC and SBC can be applied to each equation in the VAR:

AIC = (RSS/T)·e^(2k/T)
SBC = (RSS/T)·T^(k/T)

Recall that in both the AIC and SBC increasing the lag length
will reduce the RSS at the expense of increasing k.
k = number of parameters in the equation, which equals g·p + 1,
where g is the number of variables, p is the number of
lags of each variable and 1 is for the intercept term.
Choose the lag length with the lowest AIC or SBC value.
Choosing the Optimal Lag Length for a VAR
Multivariate versions of the information criteria are available. These
can be defined as:

MAIC = T ln|Σ̂| + 2N
MSBC = T ln|Σ̂| + N ln(T)

where |Σ̂| is the determinant of the variance/covariance matrix of the
residuals in the standard VAR system.
N is the total number of parameters estimated in all equations, which
will be equal to g²k + g for g equations, each with k lags of the g
variables, plus a constant term in each equation. The values of the
information criteria are constructed for 0, 1, …, lags (up to some pre-
specified maximum).
Select the model with the lowest MAIC or MSBC
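The multivariate criteria can be sketched in code. Below, data are simulated from a bivariate VAR(1) with made-up coefficients, a VAR(p) is fitted by OLS for several p, and MAIC/MSBC are computed from the residual variance/covariance matrix; with a true lag order of 1, the MSBC minimum should land on p = 1.

```python
import numpy as np

rng = np.random.default_rng(3)
g, T = 2, 2_000
# simulate a bivariate VAR(1) (made-up, stable coefficients)
b1 = np.array([[0.5, 0.2],
               [0.1, 0.4]])
y = np.zeros((T, g))
for t in range(1, T):
    y[t] = b1 @ y[t - 1] + rng.normal(size=g)

def info_criteria(y, p):
    """Fit a VAR(p) by OLS and return (MAIC, MSBC)."""
    T_eff = len(y) - p
    X = np.column_stack([np.ones(T_eff)] +
                        [y[p - j:len(y) - j] for j in range(1, p + 1)])
    resid = y[p:] - X @ np.linalg.lstsq(X, y[p:], rcond=None)[0]
    sigma = resid.T @ resid / T_eff          # residual var/cov matrix
    N = g * g * p + g                        # total parameters across equations
    logdet = np.log(np.linalg.det(sigma))
    return T_eff * logdet + 2 * N, T_eff * logdet + N * np.log(T_eff)

crits = {p: info_criteria(y, p) for p in (1, 2, 3, 4)}
best_msbc = min(crits, key=lambda p: crits[p][1])   # lag order chosen by MSBC
```

Note the trade-off the criteria encode: adding lags can only shrink |Σ̂|, but N grows by g² per extra lag, and MSBC penalises this more heavily than MAIC.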

Diagnostic Checking

Once we have estimated a VAR(p) standard
form model we can check the residuals from
each equation to see if they are white noise
error terms - in particular, to see if the
residuals are serially correlated.
Apply the Lagrange Multiplier test of serial
correlation: LM ~ χ²(q), where q is the number of
lags you wish to test in the residual series.
Apply the Jarque-Bera test for normality: JB ~ χ²(2).
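The JB statistic can be computed directly from the residual moments. The sketch below uses simulated stand-in "residuals" rather than output from an actual VAR, comparing a normal series with a deliberately skewed one against the χ²(2) 5% critical value of 5.99.

```python
import numpy as np

def jarque_bera(resid):
    """JB statistic from the skewness and kurtosis of a residual series."""
    e = resid - resid.mean()
    s2 = (e**2).mean()
    skew = (e**3).mean() / s2**1.5
    kurt = (e**4).mean() / s2**2
    return len(e) / 6 * (skew**2 + (kurt - 3)**2 / 4)

rng = np.random.default_rng(4)
jb_normal = jarque_bera(rng.normal(size=5_000))       # typically small under normality
jb_skewed = jarque_bera(rng.exponential(size=5_000))  # very large: reject normality
# compare each statistic with the chi-squared(2) 5% critical value, 5.99
```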
Causality
Block Significance and Causality Tests

It is likely that, when a VAR includes many lags of variables, it will be difficult
to see which sets of variables have significant effects on each dependent variable
and which do not. For illustration, consider the following bivariate VAR(3):

[ y1t ]   [ α10 ]   [ β11  β12 ] [ y1,t−1 ]   [ γ11  γ12 ] [ y1,t−2 ]   [ δ11  δ12 ] [ y1,t−3 ]   [ u1t ]
[ y2t ] = [ α20 ] + [ β21  β22 ] [ y2,t−1 ] + [ γ21  γ22 ] [ y2,t−2 ] + [ δ21  δ22 ] [ y2,t−3 ] + [ u2t ]

This VAR could be written out to express the individual equations as

y1t = α10 + β11 y1,t−1 + β12 y2,t−1 + γ11 y1,t−2 + γ12 y2,t−2 + δ11 y1,t−3 + δ12 y2,t−3 + u1t
y2t = α20 + β21 y1,t−1 + β22 y2,t−1 + γ21 y1,t−2 + γ22 y2,t−2 + δ21 y1,t−3 + δ22 y2,t−3 + u2t

We might be interested in testing the following hypotheses, and their implied
restrictions on the parameter matrices:
Block Significance and Causality Tests (cont'd)

Hypothesis                                    Implied Restriction
1. Lags of y1t do not explain current y2t     β21 = 0 and γ21 = 0 and δ21 = 0
2. Lags of y1t do not explain current y1t     β11 = 0 and γ11 = 0 and δ11 = 0
3. Lags of y2t do not explain current y1t     β12 = 0 and γ12 = 0 and δ12 = 0
4. Lags of y2t do not explain current y2t     β22 = 0 and γ22 = 0 and δ22 = 0

Each of these four joint hypotheses can be tested within the F-test framework,
since each set of restrictions contains only parameters drawn from one
equation.
These tests are also referred to as Granger causality tests.
Granger causality tests seek to answer questions such as "Do changes in y1
cause changes in y2?" If y1 causes y2, lags of y1 should be significant in the
equation for y2. If this is the case, we say that y1 Granger-causes y2.
If y2 causes y1, lags of y2 should be significant in the equation for y1.
If both sets of lags are significant, there is bi-directional causality.
If neither set of lags is significant, then y1 and y2 are independent.
Vector Autoregression (VAR) - An Example

Granger causality tests have been interpreted as
testing, for example, the validity of the monetarist proposition
that autonomous variations in the money supply have been a
cause of output fluctuations.
Granger causality requires that lagged values of variable A are
related to subsequent values of variable B, keeping constant the
lagged values of variable B and any other explanatory variables.

Vector Autoregression (VAR)
Granger Causality

In a regression analysis, we deal with the dependence of one
variable on other variables, but this does not necessarily imply
causation. In other words, the existence of a relationship
between variables does not prove causality or direction of
influence.
In the relationship between GDP and money supply, the often-
asked question is whether GDP → M or M → GDP. Since we
have two variables, we are dealing with bilateral causality.
The VAR equations for the example are given as:

m_t = β11 m_{t−1} + β12 y_{t−1} + ε1t
y_t = β21 m_{t−1} + β22 y_{t−1} + ε2t
Vector Autoregression (VAR)
Granger Causality
We can distinguish four cases:
Unidirectional causality from M to GDP
Unidirectional causality from GDP to M
Feedback or bilateral causality
Independence
Assumptions:
Stationary variables for GDP and M
Number of lag terms can be established using AIC/SBC
Error terms are uncorrelated
Vector Autoregression (VAR)
Granger Causality - Example Estimation (t-test)

m_t = β11 m_{t−1} + β12 y_{t−1} + ε1t
y_t = β21 m_{t−1} + β22 y_{t−1} + ε2t

m_t does not Granger-cause y_t, relative to an information set
consisting of past m's and y's, iff β21 = 0.
y_t does not Granger-cause m_t, relative to an information set
consisting of past m's and y's, iff β12 = 0.
In a model with one lag, as in our example, a t-test can be used to test the
null hypothesis that one variable does not Granger-cause another variable. In
higher-order systems, an F-test is used.
1. Regress current GDP on all lagged GDP terms but do not
include the lagged M variable (restricted regression). From this,
obtain the restricted residual sum of squares, RSS_R.
2. Run the regression including the lagged M terms (unrestricted
regression). Also obtain the residual sum of squares, RSS_UR.
3. The null hypothesis is H0: αᵢ = 0 for all lagged M coefficients,
that is, the lagged M terms do not belong in the regression.
4. Compute the F-statistic for these restrictions.
5. If the computed F > the critical F value at a chosen level of
significance, we reject the null, in which case the lagged M terms
belong in the regression. This is another way of saying that M
causes GDP.
Vector Autoregression (VAR)
Granger Causality - Estimation (F-test)

F = [(RSS_R − RSS_UR)/m] / [RSS_UR/(n − k)]

where m is the number of restrictions (lagged M terms), n is the sample
size and k is the number of parameters in the unrestricted regression.
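The restricted/unrestricted comparison can be sketched as follows, using simulated data in which m Granger-causes y but not vice versa (all coefficient values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2_000
# made-up DGP: m Granger-causes y, but y does not Granger-cause m
y, m = np.zeros(T), np.zeros(T)
for t in range(1, T):
    m[t] = 0.5 * m[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.4 * m[t - 1] + rng.normal()

def granger_F(dep, other, p=1):
    """F-statistic for H0: lags of `other` do not belong in the equation for `dep`."""
    yd = dep[p:]
    n = len(yd)
    # restricted: own lags only; unrestricted: add the other variable's lags
    Xr = np.column_stack([np.ones(n)] + [dep[p - j:len(dep) - j] for j in range(1, p + 1)])
    Xu = np.column_stack([Xr] + [other[p - j:len(other) - j] for j in range(1, p + 1)])
    rss = lambda X: np.sum((yd - X @ np.linalg.lstsq(X, yd, rcond=None)[0])**2)
    rss_r, rss_ur = rss(Xr), rss(Xu)
    k = Xu.shape[1]
    return ((rss_r - rss_ur) / p) / (rss_ur / (n - k))

F_m_to_y = granger_F(y, m)   # large: lagged m helps explain y
F_y_to_m = granger_F(m, y)   # small: lagged y adds nothing for m
```

Comparing each statistic with the appropriate F critical value reproduces the decision rule in step 5: reject the null for the m → y direction only.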
The Sims Causality Test

Sims (1980) put forward an alternative test for
causality.
The key to this test is that it is not possible for the
future to cause the present.
Suppose we want to find out whether y causes m. For
Sims causality testing, consider the following model:

m_t = β11 m_{t−1} + β12 y_{t−1} + β13 y_{t+1} + ε1t
y_t = β21 m_{t−1} + β22 y_{t−1} + β23 m_{t+1} + ε2t
The Sims Causality Test

We can test whether y_t causes m_t by testing whether β13 = 0.
If β13 ≠ 0, this cannot mean that y_{t+1} causes m_t, since the
future cannot cause the present.
Again, this hypothesis can be tested using a t-test
if there is one lead term. If there is more
than one lead term, then an F-test is applied
to jointly test the hypothesis that they are all
equal to zero.
Overview of course:

TIME SERIES
- STATIONARY
  - One variable: homoskedastic → ARMA; heteroskedastic → GARCH
  - More than one variable → VAR
- NON-STATIONARY
  - One variable → ARIMA
  - More than one variable → Cointegration (VECM)