
Multivariate Time Series Models

Asad Dossani

Fin 625: Quantitative Methods in Finance

1 / 28
Multivariate Time Series Models

Multivariate time series models generalize univariate time series models to allow for interaction between multiple variables. The primary model we use is the vector autoregressive model, or vector autoregression (VAR). In a VAR, each variable depends on its own lags and on the lags of the other variables. VARs are used to determine how one variable affects and forecasts another over time.

2 / 28
Stationarity and Autocorrelation Matrices

Suppose yt is a vector of d time series. yt is stationary if all first and second moments are time invariant. The matrix-valued function Γ(·) is the cross covariance function.

yt = (yt,1 , . . . , yt,d )′
µ ≡ E(yt )
Γ(k) ≡ E[(yt+k − µ)(yt − µ)′]

3 / 28
Cross Covariance Function

γij (k) is the cross covariance between the i th and j th components at lag k for i ≠ j. It is the autocovariance of the j th component when i = j. Γ(k) is not symmetric unless k = 0.

γij (k) = Cov(yt+k,i , yt,j )

γjj (k) = γjj (−k)
γij (k) ≠ γij (−k) for i ≠ j and k = 1, 2, . . .
γij (k) = γji (−k)

4 / 28
Cross Correlation Matrix

Let D^{−1/2} be a diagonal matrix with γjj (0)^{−1/2}, the reciprocal of the standard deviation of the j th series, as its j th main diagonal element. R(k) is the cross correlation matrix, and ρij (k) is the cross correlation coefficient.

R(k) = D^{−1/2} Γ(k) D^{−1/2}

ρij (k) = Corr(yt+k,i , yt,j )

5 / 28
Cross Correlation Matrix

Autocorrelation coefficients are symmetric, but the cross correlation coefficient is not necessarily symmetric.

ρjj (k) = ρjj (−k)
ρij (k) ≠ ρij (−k) for i ≠ j and k = 1, 2, . . .
ρij (k) = ρji (−k)

6 / 28
Sample Cross Covariance/Correlation Matrices

With available observations y1 , . . . , yT , a natural estimator of the cross covariance matrix is the sample cross covariance matrix.

Γ̂(k) = (1/T) Σ_{t=1}^{T−k} (yt+k − µ̂)(yt − µ̂)′

µ̂ = (1/T) Σ_{t=1}^{T} yt

7 / 28
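A minimal numpy sketch of this estimator (the function and variable names are illustrative, not from the slides):

```python
import numpy as np

def sample_cross_cov(y, k):
    """Sample cross covariance matrix Gamma_hat(k) for a (T x d) array y, k >= 0."""
    T = y.shape[0]
    mu_hat = y.mean(axis=0)              # sample mean vector
    yc = y - mu_hat                      # demeaned observations
    # (1/T) * sum over t = 1, ..., T-k of (y_{t+k} - mu_hat)(y_t - mu_hat)'
    return (yc[k:].T @ yc[:T - k]) / T
```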
Sample Cross Covariance/Correlation Matrices

Similarly, a natural estimator of the cross correlation matrix is the sample cross correlation matrix.

R̂(k) = D̂^{−1/2} Γ̂(k) D̂^{−1/2}

D̂ = diag(γ̂11 (0), . . . , γ̂dd (0))

8 / 28
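Building on the sample_cross_cov sketch above, a sketch of the sample cross correlation matrix; for negative lags one can use the relation Γ̂(−k) = Γ̂(k)′ from the earlier slide:

```python
import numpy as np

def sample_cross_corr(y, k):
    """Sample cross correlation matrix R_hat(k) = D_hat^{-1/2} Gamma_hat(k) D_hat^{-1/2}."""
    gamma0 = sample_cross_cov(y, 0)                         # D_hat holds its diagonal
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(gamma0)))    # D_hat^{-1/2}
    return d_inv_sqrt @ sample_cross_cov(y, k) @ d_inv_sqrt
```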
Vector White Noise

A vector white noise process has no serial correlation across all the components of ut . However, different components of ut may be contemporaneously correlated, as Σu is not necessarily a diagonal matrix.

ut ∼ WN(a, Σu )
E(ut ) = a
Var(ut ) = Σu
Cov(ut , us ) = 0 for all t ≠ s

9 / 28
Vector Autoregressive Models: VAR(1)

We define a d-dimensional vector autoregressive model of order 1. Each component of yt is a linear combination of its own lagged values and the lagged values of the other components. The contemporaneous linear relationship among the different components of yt is reflected by the non-zero off-diagonal elements of Σu . yt , c , ut , and 0 are (d × 1) vectors and B is a (d × d) matrix.

yt ∼ VAR(1)
yt = c + Byt−1 + ut
ut ∼ WN(0, Σu )

10 / 28
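A minimal sketch of simulating a bivariate VAR(1) with numpy; c and B reuse the example given later in the deck, while Σu and the function name are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_var1(c, B, sigma_u, T, burn=200):
    """Simulate y_t = c + B y_{t-1} + u_t with Gaussian white noise u_t ~ N(0, sigma_u)."""
    d = len(c)
    y = np.zeros(d)
    draws = []
    for _ in range(T + burn):
        u = rng.multivariate_normal(np.zeros(d), sigma_u)
        y = c + B @ y + u
        draws.append(y)
    return np.array(draws[burn:])        # drop the burn-in draws

c = np.array([0.1, 0.2])
B = np.array([[0.5, 0.25],
              [0.0, 0.5]])
sigma_u = np.array([[1.0, 0.3],          # assumed illustrative error covariance
                    [0.3, 1.0]])
y = simulate_var1(c, B, sigma_u, T=500)
```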
Bivariate VAR(1)

yt = c + Byt−1 + ut

[ y1,t ]   [ c1 ]   [ B11  B12 ] [ y1,t−1 ]   [ u1,t ]
[ y2,t ] = [ c2 ] + [ B21  B22 ] [ y2,t−1 ] + [ u2,t ]

y1,t = c1 + B11 y1,t−1 + B12 y2,t−1 + u1,t
y2,t = c2 + B21 y1,t−1 + B22 y2,t−1 + u2,t

If B12 = 0 and B21 = 0, then both y1,t and y2,t are AR(1) processes.

11 / 28
Expected Value of yt

If yt is stationary, the expected value of yt is given by:

E(yt ) = E(c + Byt−1 + ut )
E(yt ) = c + B E(yt−1 )
E(yt ) − B E(yt ) = c          (since E(yt−1 ) = E(yt ) under stationarity)
(I − B )E(yt ) = c
E(yt ) = (I − B )^{−1} c

12 / 28
Bivariate VAR(1)

Suppose yt follows a bivariate VAR(1) process, where yt = (y1,t , y2,t )′:

yt = c + Ayt−1 + ut
ut ∼ WN(0, Σu )

c = [ 0.1 ]      A = [ 0.5  0.25 ]
    [ 0.2 ]          [ 0    0.5  ]

Compute the expected value of the series, i.e. E(yt ).

13 / 28
Bivariate VAR(1)

E(yt ) = (I − A)^{−1} c

       ( [ 1  0 ]   [ 0.5  0.25 ] )^{−1} [ 0.1 ]
     = ( [ 0  1 ] − [ 0    0.5  ] )      [ 0.2 ]

       [ 0.5  −0.25 ]^{−1} [ 0.1 ]
     = [ 0     0.5  ]      [ 0.2 ]

     = (1 / ((0.5)(0.5) − (−0.25)(0))) [ 0.5  0.25 ] [ 0.1 ]
                                       [ 0    0.5  ] [ 0.2 ]

     = [ 2  1 ] [ 0.1 ]
       [ 0  2 ] [ 0.2 ]

     = [ 0.4 ]
       [ 0.4 ]
14 / 28
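A quick numerical check of this result (illustrative code, not from the slides):

```python
import numpy as np

c = np.array([0.1, 0.2])
A = np.array([[0.5, 0.25],
              [0.0, 0.5]])

# E(y_t) = (I - A)^{-1} c, computed by solving (I - A) x = c
mean_y = np.linalg.solve(np.eye(2) - A, c)
print(mean_y)    # [0.4 0.4]
```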
Vector Autoregressive Models: VAR(p)

We similarly define a d-dimensional vector autoregressive model of order p. yt , c , ut , and 0 are (d × 1) vectors. B1 , . . . , Bp , and Σu are (d × d) matrices.

yt ∼ VAR(p)
yt = c + B1 yt−1 + · · · + Bp yt−p + ut
ut ∼ WN(0, Σu )
E(yt ) = (I − B1 − · · · − Bp )^{−1} c

15 / 28
Forecasting using a VAR(1)

yt+1 = c + Byt + ut+1


Et (yt+1 ) = Et (c + Byt + ut+1 )
Et (yt+1 ) = c + Byt
yt+2 = c + Byt+1 + ut+2
Et (yt+2 ) = Et (c + Byt+1 + ut+2 )
Et (yt+2 ) = c + B Et (yt+1 )
Et (yt+2 ) = c + B (c + Byt )
Et (yt+2 ) = c + Bc + B^2 yt

16 / 28
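A small sketch of this forecast recursion (the helper name is illustrative):

```python
import numpy as np

def var1_forecasts(c, B, y_t, horizon):
    """Iterate E_t(y_{t+k}) = c + B E_t(y_{t+k-1}) for k = 1, ..., horizon."""
    f = np.asarray(y_t, dtype=float)
    forecasts = []
    for _ in range(horizon):
        f = c + B @ f                 # one more step of the recursion
        forecasts.append(f)
    return np.array(forecasts)        # row k-1 holds E_t(y_{t+k})
```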
Forecasting using a VAR(1)

yt+k = c + Byt+k−1 + ut+k

Et (yt+k ) = c + Σ_{j=1}^{k−1} B^j c + B^k yt

Et (yt+k ) ≈ c + Σ_{j=1}^{k−1} B^j c   as k → ∞  (since B^k yt → 0 for a stationary VAR)

Et (yt+k ) → (I − B )^{−1} c   as k → ∞

17 / 28
Forecasting using a VAR(1)

Suppose yt follows a bivariate VAR(1) process, where yt = (y1,t , y2,t )′:

yt = c + Ayt−1 + ut
ut ∼ WN(0, Σu )

c = [ 0.1 ]      A = [ 0.5  0.25 ]
    [ 0.2 ]          [ 0    0.5  ]

Suppose yt = (1, 0)′. Compute the one period ahead and two period ahead forecasts, i.e. Et (yt+1 ) and Et (yt+2 ).

18 / 28
Forecasting using a VAR(1)

Et (yt+1 ) = c + Ayt

          = [ 0.1 ] + [ 0.5  0.25 ] [ 1 ]  =  [ 0.6 ]
            [ 0.2 ]   [ 0    0.5  ] [ 0 ]     [ 0.2 ]

Et (yt+2 ) = c + A Et (yt+1 )

          = [ 0.1 ] + [ 0.5  0.25 ] [ 0.6 ]  =  [ 0.45 ]
            [ 0.2 ]   [ 0    0.5  ] [ 0.2 ]     [ 0.3  ]

19 / 28
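Using the var1_forecasts helper sketched earlier, these numbers can be reproduced, and iterating further illustrates convergence to (I − A)^{−1} c = (0.4, 0.4)′:

```python
import numpy as np

c = np.array([0.1, 0.2])
A = np.array([[0.5, 0.25],
              [0.0, 0.5]])
y_t = np.array([1.0, 0.0])

f = var1_forecasts(c, A, y_t, horizon=50)
print(f[0])     # one period ahead:   [0.6  0.2]
print(f[1])     # two periods ahead:  [0.45 0.3]
print(f[-1])    # long horizon: approximately [0.4 0.4]
```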
Information Criteria

We can use multivariate information criteria to select the order p of the VAR. We choose p such that one of the information criteria is minimized. Let Σ̂u denote the variance covariance matrix of the residuals, T denote the sample size, and k′ = pd² + d denote the total number of regressors across all d equations (pd + 1 regressors in each equation).

MAIC = log(|Σ̂u |) + 2k′/T
MBIC = log(|Σ̂u |) + (k′/T) ln T
MHQIC = log(|Σ̂u |) + (2k′/T) ln(ln T)

20 / 28
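A hedged numpy sketch of selecting p with these criteria: each equation is estimated by OLS, k′ follows the definition above, and the full sample size T is used in the penalty (ignoring the p initial observations lost to lags); helper names are illustrative:

```python
import numpy as np

def fit_var_ols(y, p):
    """OLS fit of a VAR(p) on a (T x d) array y; returns the residual covariance matrix."""
    T, d = y.shape
    X = np.hstack([np.ones((T - p, 1))] +
                  [y[p - j - 1:T - j - 1] for j in range(p)])   # constant and p lags
    Y = y[p:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return resid.T @ resid / (T - p)

def var_ic(y, p):
    """Multivariate AIC, BIC, and HQIC for a VAR(p)."""
    T, d = y.shape
    logdet = np.log(np.linalg.det(fit_var_ols(y, p)))
    k_prime = p * d**2 + d                    # total regressors across all equations
    return {"MAIC":  logdet + 2 * k_prime / T,
            "MBIC":  logdet + k_prime * np.log(T) / T,
            "MHQIC": logdet + 2 * k_prime * np.log(np.log(T)) / T}
```

Choosing p as the minimizer of one of these criteria over p = 1, . . . , pmax then implements the selection rule above.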
Granger Causality

Time series y1,t is said to Granger cause time series y2,t if lags of
y1,t are useful for forecasting y2,t , after controlling for lags of y2,t .

E(y2,t |y1,t−1 , y2,t−1 , y1,t−2 , y2,t−2 , . . . ) ≠ E(y2,t |y2,t−1 , y2,t−2 , . . . )

21 / 28
Granger Causality

VARs can be used to test for Granger causality. Suppose we are interested in testing whether y1,t Granger causes y2,t , and we estimate a bivariate VAR(1):

yt = c + Byt−1 + ut
y1,t = c1 + B11 y1,t−1 + B12 y2,t−1 + u1,t
y2,t = c2 + B21 y1,t−1 + B22 y2,t−1 + u2,t

We can use an F-test to test the restriction that B21 = 0. If we reject the null hypothesis, we conclude that y1,t Granger causes y2,t .

22 / 28
Granger Causality

Suppose we are interested in testing whether y1,t Granger causes y2,t , and we now estimate a bivariate VAR(p):

yt = c + B1 yt−1 + · · · + Bp yt−p + ut

Testing for Granger causality is a test of the following restriction:

B1,21 = B2,21 = · · · = Bp,21 = 0

23 / 28
Granger Causality

To test for Granger causality, we first estimate the unrestricted VAR and compute the residual sum of squares, RSS. Next, we estimate a restricted VAR, and compute the residual sum of squares, RSSr . Under the null hypothesis:

F = [(RSSr − RSS)/p] / [RSS/(2T − 4p − 2)] ∼ F (p, 2T − 4p − 2)

If the test statistic is greater than the critical value, we reject the null hypothesis of no Granger causality.

24 / 28
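A sketch of this test for the single direction y1,t → y2,t , using scipy only for the F distribution; the function name is illustrative and the degrees of freedom follow the slide:

```python
import numpy as np
from scipy import stats

def granger_f_test(y1, y2, p):
    """F-test of whether y1 Granger causes y2 in a bivariate VAR(p)."""
    T = len(y2)
    Y = y2[p:]
    own_lags = np.column_stack([y2[p - j:T - j] for j in range(1, p + 1)])
    cross_lags = np.column_stack([y1[p - j:T - j] for j in range(1, p + 1)])
    const = np.ones(T - p)

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        e = Y - X @ beta
        return e @ e

    rss_u = rss(np.column_stack([const, own_lags, cross_lags]))   # unrestricted equation
    rss_r = rss(np.column_stack([const, own_lags]))               # restricted equation
    df2 = 2 * T - 4 * p - 2
    F = ((rss_r - rss_u) / p) / (rss_u / df2)
    return F, stats.f.sf(F, p, df2)           # statistic and p-value
```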
Impulse Response Functions

Impulse response functions measure the resulting changes in the other components at different time lags due to a unit change in one component series. For a VAR(p), we write the MA(∞) representation. The (i, j) element of Ak is the impulse response of yt+k,i , the i th component k units of time ahead, to one unit of extra shock in the j th component of yt .

yt = c + ut + Σ_{k=1}^{∞} Ak ut−k

25 / 28
Impulse Response Functions

In practice, the components of ut are not independent, and a change in one component of yt is typically associated with some changes in other components. Thus, it is not possible to define the responses with respect to a single component of yt alone.

26 / 28
Impulse Response Functions

We can apply the following transformation to alleviate correlation among the components of ut . Now, Ψ0 , Ψ1 , . . . are the impulse response functions.

ut = Ψ0 εt
Ψ0 Ψ0′ = Σu
Var(εt ) = I

yt = c + Ψ0 εt + Σ_{k=1}^{∞} Ψk εt−k

27 / 28
Impulse Response Functions

The definition of εt is not unique. We generally assume that Ψ0 is lower triangular, obtained from the Cholesky decomposition of Σu . This assumes that the first variable is not contemporaneously affected by shocks to the other variables, that the second variable is contemporaneously affected only by shocks to the first variable, and so on.

28 / 28
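A sketch of orthogonalized impulse responses for a VAR(1): here Ak = B^k, Ψ0 is the lower-triangular Cholesky factor of Σu , and Ψk = Ak Ψ0 . B reuses the earlier example; Σu is an assumed illustrative value:

```python
import numpy as np

def var1_orth_irf(B, sigma_u, horizon):
    """Orthogonalized impulse responses Psi_0, ..., Psi_horizon for a VAR(1)."""
    psi0 = np.linalg.cholesky(sigma_u)        # lower triangular, Psi_0 Psi_0' = Sigma_u
    irfs = [psi0]
    a_k = np.eye(B.shape[0])
    for _ in range(horizon):
        a_k = B @ a_k                         # A_k = B^k for a VAR(1)
        irfs.append(a_k @ psi0)
    return np.array(irfs)                     # entry [k, i, j]: response of y_{t+k,i} to eps_{t,j}

B = np.array([[0.5, 0.25],
              [0.0, 0.5]])
sigma_u = np.array([[1.0, 0.3],               # assumed illustrative error covariance
                    [0.3, 1.0]])
print(var1_orth_irf(B, sigma_u, horizon=5)[1])   # responses one period after the shock
```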
