Modelling Volatility and Correlation
‘Introductory Econometrics for Finance’ © Chris Brooks 2002
[Figure: time series plot. Vertical axis from -0.08 to 0.04; horizontal axis: Date, 1/01/90 to 9/01/97.]
• Models with nonlinear $g(\cdot)$ are “non-linear in mean”, while those with
nonlinear $\sigma^2(\cdot)$ are “non-linear in variance”.
• Many other non-linearity tests are available, e.g. the “BDS test” and the
bispectrum test.
• One particular non-linear model that has proved very useful in finance
is the ARCH model due to Engle (1982).
• So use a model which does not assume that the variance is constant.
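• Recall the definition of the variance of $u_t$:
$$\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots) = E[(u_t - E(u_t))^2 \mid u_{t-1}, u_{t-2}, \dots]$$
We usually assume that $E(u_t) = 0$,
so $\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots) = E[u_t^2 \mid u_{t-1}, u_{t-2}, \dots]$.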
• What could the current value of the variance of the errors plausibly
depend upon?
– Previous squared error terms.
• This leads to the autoregressive conditionally heteroscedastic model for
the variance of the errors:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$$
• This is known as an ARCH(1) model.
Autoregressive Conditionally Heteroscedastic
(ARCH) Models (cont’d)
• The full model would be
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
where $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$
• We can easily extend this to the general case where the error variance
depends on q lags of squared errors:
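$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2$$
• The ARCH(1) model can also be written as
$$u_t = v_t \sigma_t, \quad \sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2}, \quad v_t \sim N(0,1)$$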
• The two are different ways of expressing exactly the same model. The
first form is easier to understand while the second form is required for
simulating from an ARCH model, for example.
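A minimal sketch of simulating from an ARCH(1) model using the second form, $u_t = v_t \sigma_t$ with $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$. The function name, the parameter values and the start-up value $u_0 = 0$ are illustrative assumptions, not taken from the slides.

```python
import math
import random


def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate n shocks u_t = v_t * sigma_t, sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2."""
    rng = random.Random(seed)
    u_prev = 0.0  # assumed start-up value for u_0
    shocks, variances = [], []
    for _ in range(n):
        sigma2 = alpha0 + alpha1 * u_prev ** 2  # conditional variance for this period
        v = rng.gauss(0.0, 1.0)                 # v_t ~ N(0, 1)
        u = v * math.sqrt(sigma2)               # u_t = v_t * sigma_t
        shocks.append(u)
        variances.append(sigma2)
        u_prev = u
    return shocks, variances


# Hypothetical parameter values, chosen only for illustration.
u, sigma2 = simulate_arch1(alpha0=0.1, alpha1=0.5, n=1000)
```

Each simulated variance depends only on the previous shock, so a large shock in one period raises the variance in the next.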
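1. First, run any postulated linear regression of the form given in the equation
above, e.g. $y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t$,
saving the residuals, $\hat{u}_t$.
2. Then square the residuals, and regress them on q own lags to test for ARCH
of order q, i.e. run the regression
$$\hat{u}_t^2 = \gamma_0 + \gamma_1 \hat{u}_{t-1}^2 + \gamma_2 \hat{u}_{t-2}^2 + \dots + \gamma_q \hat{u}_{t-q}^2 + v_t$$
where $v_t$ is iid.
Obtain $R^2$ from this regression.
3. The test statistic is $TR^2$ (the number of observations multiplied by $R^2$), which
is distributed as $\chi^2(q)$ under the null hypothesis of no ARCH effects,
$\gamma_1 = \gamma_2 = \dots = \gamma_q = 0$.
4. If the value of the test statistic is greater than the critical value from the
$\chi^2$ distribution, then reject the null hypothesis.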
• Note that the ARCH test is also sometimes applied directly to returns
instead of the residuals from Stage 1 above.
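A minimal sketch of the ARCH test for the special case q = 1, where the auxiliary regression has a single regressor and $R^2$ is the squared correlation between $\hat{u}_t^2$ and $\hat{u}_{t-1}^2$. The function name and the artificial residual series are assumptions for illustration; here T is taken to be the number of observations in the auxiliary regression.

```python
def arch_lm_stat_q1(resid):
    """T * R^2 from regressing squared residuals on one own lag (ARCH test, q = 1)."""
    sq = [e * e for e in resid]
    y, x = sq[1:], sq[:-1]  # u_t^2 and u_{t-1}^2
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("squared residuals have no variation")
    r2 = sxy * sxy / (sxx * syy)  # R^2 of an OLS regression with intercept and one regressor
    return n * r2


CHI2_1_5PCT = 3.841  # 5% critical value of a chi-squared(1) distribution

# Artificial residuals whose squares alternate between 1 and 4.
example_resid = [1.0, 2.0] * 10
stat = arch_lm_stat_q1(example_resid)
reject_no_arch = stat > CHI2_1_5PCT
```

For q > 1 the same idea applies, but the auxiliary regression needs a multiple-regression routine to obtain $R^2$.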
• How do we decide on q?
• The required value of q might be very large
• Non-negativity constraints might be violated.
– When we estimate an ARCH model, we require $\alpha_i > 0 \;\; \forall \; i = 1, 2, \dots, q$
(since variance cannot be negative)
• A natural extension of an ARCH(q) model which gets around some of
these problems is a GARCH model.
• The model is due to Bollerslev (1986). It allows the conditional variance to
depend upon its own previous lags.
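• The variance equation is now
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 \quad (1)$$
• This is a GARCH(1,1) model, which is like an ARMA(1,1) model for the
variance equation.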
• We could also write
$$\sigma_{t-1}^2 = \alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2$$
$$\sigma_{t-2}^2 = \alpha_0 + \alpha_1 u_{t-3}^2 + \beta \sigma_{t-3}^2$$
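• Substituting into (1) for $\sigma_{t-1}^2$:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta(\alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \beta^2 \sigma_{t-2}^2 \quad (2)$$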
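• Now substituting into (2) for $\sigma_{t-2}^2$:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \beta^2(\alpha_0 + \alpha_1 u_{t-3}^2 + \beta \sigma_{t-3}^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \alpha_0 \beta^2 + \alpha_1 \beta^2 u_{t-3}^2 + \beta^3 \sigma_{t-3}^2$$
$$\sigma_t^2 = \alpha_0(1 + \beta + \beta^2) + \alpha_1 u_{t-1}^2(1 + \beta L + \beta^2 L^2) + \beta^3 \sigma_{t-3}^2$$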
• An infinite number of successive substitutions would yield
$$\sigma_t^2 = \alpha_0(1 + \beta + \beta^2 + \dots) + \alpha_1 u_{t-1}^2(1 + \beta L + \beta^2 L^2 + \dots) + \beta^\infty \sigma_0^2$$
• So the GARCH(1,1) model can be written as an infinite order ARCH model.
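The successive substitutions can be checked numerically. The sketch below compares the GARCH(1,1) recursion with the equivalent weighted sum of past squared shocks after t substitutions, $\sigma_t^2 = \alpha_0 \sum_{i=0}^{t-1} \beta^i + \alpha_1 \sum_{i=0}^{t-1} \beta^i u_{t-1-i}^2 + \beta^t \sigma_0^2$. The function names, shock values and parameters are illustrative assumptions.

```python
def garch11_recursive(u, alpha0, alpha1, beta, sigma2_0):
    """Conditional variances from the GARCH(1,1) recursion, started at sigma2_0."""
    sigma2 = [sigma2_0]
    for t in range(1, len(u)):
        sigma2.append(alpha0 + alpha1 * u[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


def garch11_as_arch(u, alpha0, alpha1, beta, sigma2_0, t):
    """The variance at time t written as a weighted sum of all past squared shocks."""
    level = alpha0 * sum(beta ** i for i in range(t))
    arch_part = alpha1 * sum(beta ** i * u[t - 1 - i] ** 2 for i in range(t))
    return level + arch_part + beta ** t * sigma2_0


# Hypothetical shocks and parameters, for illustration only.
shocks = [0.5, -1.0, 0.3, 0.8, -0.2]
recursive = garch11_recursive(shocks, 0.1, 0.2, 0.7, sigma2_0=1.0)
expanded = garch11_as_arch(shocks, 0.1, 0.2, 0.7, sigma2_0=1.0, t=4)
```

Because $\beta < 1$ here, the weight $\beta^i$ on older squared shocks decays geometrically, which is why a GARCH(1,1) can stand in for a long ARCH(q).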
• We can again extend the GARCH(1,1) model to a GARCH(p,q):
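$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2 + \dots + \beta_p \sigma_{t-p}^2$$
$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$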
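• For the GARCH(1,1) model, the unconditional variance of $u_t$ is given by
$$\mathrm{Var}(u_t) = \frac{\alpha_0}{1 - (\alpha_1 + \beta)}$$
when $\alpha_1 + \beta < 1$
• $\alpha_1 + \beta \geq 1$ is termed “non-stationarity” in variance
• $\alpha_1 + \beta = 1$ is termed integrated GARCH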
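A small sketch of the stationarity condition for a GARCH(1,1) model, using the standard result $\mathrm{Var}(u_t) = \alpha_0 / (1 - (\alpha_1 + \beta))$ when $\alpha_1 + \beta < 1$. The function name and the parameter values are assumptions for illustration.

```python
def garch11_unconditional_variance(alpha0, alpha1, beta):
    """Unconditional variance of u_t for a GARCH(1,1) that is stationary in variance."""
    persistence = alpha1 + beta
    if persistence >= 1.0:
        # alpha1 + beta >= 1: non-stationary in variance (= 1 is integrated GARCH)
        raise ValueError("alpha1 + beta must be below 1 for a finite unconditional variance")
    return alpha0 / (1.0 - persistence)


# Hypothetical parameter values.
long_run_variance = garch11_unconditional_variance(alpha0=0.1, alpha1=0.1, beta=0.8)
```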
• Since the model is no longer of the usual linear form, we cannot use OLS.
• We use another technique known as maximum likelihood.
• The method works by finding the most likely values of the parameters given
the actual data.
• More specifically, we form a log-likelihood function and maximise it.
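1. Specify the appropriate equations for the mean and the variance - e.g. an
AR(1)-GARCH(1,1) model:
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
2. Specify the log-likelihood function to maximise:
$$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$$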
3. The computer will maximise the function and give parameter values and
their standard errors
Parameter Estimation using Maximum Likelihood
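• Consider the bivariate regression case with homoscedastic errors for
simplicity:
$$y_t = \beta_1 + \beta_2 x_t + u_t$$
• Assuming that $u_t \sim N(0, \sigma^2)$, then $y_t \sim N(\beta_1 + \beta_2 x_t, \sigma^2)$, so the
probability density function of each $y_t$ is
$$f(y_t \mid \beta_1 + \beta_2 x_t, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \quad (1)$$
• Then the joint pdf for all the y’s can be expressed as a product of the individual
density functions
$$f(y_1, y_2, \dots, y_T \mid \beta_1 + \beta_2 x_t, \sigma^2) = f(y_1 \mid \beta_1 + \beta_2 x_1, \sigma^2)\, f(y_2 \mid \beta_1 + \beta_2 x_2, \sigma^2) \dots f(y_T \mid \beta_1 + \beta_2 x_T, \sigma^2)$$
$$= \prod_{t=1}^{T} f(y_t \mid \beta_1 + \beta_2 x_t, \sigma^2) \quad (2)$$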
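• Substituting into equation (2) for every $y_t$ from equation (1),
$$f(y_1, y_2, \dots, y_T \mid \beta_1 + \beta_2 x_t, \sigma^2) = \frac{1}{\sigma^T (\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \quad (3)$$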
Parameter Estimation using Maximum Likelihood
(cont’d)
• The typical situation we have is that the $x_t$ and $y_t$ are given and we want to
estimate $\beta_1$, $\beta_2$, $\sigma^2$. If this is the case, then $f(\cdot)$ is known as the likelihood
function, denoted $LF(\beta_1, \beta_2, \sigma^2)$, so we write
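$$LF(\beta_1, \beta_2, \sigma^2) = \frac{1}{\sigma^T (\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \quad (4)$$
• Maximum likelihood estimation chooses the parameter values that maximise this
function. Since (4) is a product of T terms, it is easier to work with its logarithm,
which has its maximum at the same parameter values:
$$LLF = -T\log\sigma - \frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}$$
• which is equivalent to
$$LLF = -\frac{T}{2}\log\sigma^2 - \frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2} \quad (5)$$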
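• Differentiating (5) with respect to $\beta_1$, $\beta_2$ and $\sigma^2$ gives
$$\frac{\partial LLF}{\partial \beta_1} = \frac{1}{\sigma^2}\sum (y_t - \beta_1 - \beta_2 x_t) \quad (6)$$
$$\frac{\partial LLF}{\partial \beta_2} = \frac{1}{\sigma^2}\sum (y_t - \beta_1 - \beta_2 x_t)\, x_t \quad (7)$$
$$\frac{\partial LLF}{\partial \sigma^2} = -\frac{T}{2}\frac{1}{\sigma^2} + \frac{1}{2}\sum \frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^4} \quad (8)$$
• Setting (6)-(8) to zero, and putting hats above the parameters to denote the
maximum likelihood estimators, from (6),
$$\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0$$
$$\sum y_t - T\hat{\beta}_1 - \hat{\beta}_2 \sum x_t = 0$$
$$\frac{1}{T}\sum y_t - \hat{\beta}_1 - \hat{\beta}_2 \frac{1}{T}\sum x_t = 0$$
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} \quad (9)$$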
Parameter Estimation using Maximum Likelihood
(cont’d)
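• From (7),
$$\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)\, x_t = 0$$
$$\sum y_t x_t - \sum \hat{\beta}_1 x_t - \sum \hat{\beta}_2 x_t^2 = 0$$
$$\sum y_t x_t - \hat{\beta}_1 \sum x_t - \hat{\beta}_2 \sum x_t^2 = 0$$
$$\hat{\beta}_2 \sum x_t^2 = \sum y_t x_t - (\bar{y} - \hat{\beta}_2 \bar{x}) \sum x_t$$
$$\hat{\beta}_2 \sum x_t^2 = \sum y_t x_t - T\bar{x}\bar{y} + \hat{\beta}_2 T \bar{x}^2$$
$$\hat{\beta}_2 \left(\sum x_t^2 - T\bar{x}^2\right) = \sum y_t x_t - T\bar{x}\bar{y}$$
$$\hat{\beta}_2 = \frac{\sum y_t x_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2} \quad (10)$$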
• From (8),
$$\frac{T}{\hat{\sigma}^2} = \frac{1}{\hat{\sigma}^4}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2$$
Parameter Estimation using Maximum Likelihood
(cont’d)
• Rearranging,
$$\hat{\sigma}^2 = \frac{1}{T}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2$$
$$\hat{\sigma}^2 = \frac{1}{T}\sum \hat{u}_t^2 \quad (11)$$
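The closed-form estimators (9), (10) and (11) can be computed directly. The sketch below does so for a small made-up data set; the function name and the data are assumptions for illustration.

```python
def ml_bivariate(x, y):
    """Maximum likelihood estimates (beta1, beta2, sigma2) for y_t = beta1 + beta2 * x_t + u_t."""
    T = len(x)
    xbar, ybar = sum(x) / T, sum(y) / T
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    beta2 = (sum_xy - T * xbar * ybar) / (sum_xx - T * xbar ** 2)   # equation (10)
    beta1 = ybar - beta2 * xbar                                     # equation (9)
    resid = [b - beta1 - beta2 * a for a, b in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / T                          # equation (11), divisor T
    return beta1, beta2, sigma2


# Made-up data lying exactly on the line y = 1 + 2x.
b1_hat, b2_hat, s2_hat = ml_bivariate([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Note that the variance estimator in (11) divides by T rather than by a degrees-of-freedom-adjusted count.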
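• Now we have
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
$$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$$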
• Unfortunately, the LLF for a model with time-varying variances cannot be
maximised analytically, except in the simplest of cases. So a numerical procedure
is used to maximise the log-likelihood function. A potential problem: local
optima or multimodalities in the likelihood surface.
• The way we do the optimisation is:
1. Set up LLF.
2. Use regression to get initial guesses for the mean parameters.
3. Choose some initial guesses for the conditional variance parameters.
4. Specify a convergence criterion - either by criterion or by value.
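Step 1 of the list above can be sketched as a function that evaluates the Gaussian log-likelihood of an AR(1)-GARCH(1,1) model for given parameter values; a numerical optimiser would then be wrapped around it. The function name, the treatment of the first observation (conditioning on it) and the start-up values for the variance and the lagged shock are assumptions, not part of the slides.

```python
import math


def ar1_garch11_llf(y, mu, phi, alpha0, alpha1, beta, sigma2_init):
    """Gaussian log-likelihood of an AR(1)-GARCH(1,1), conditioning on y[0]."""
    T = len(y) - 1  # number of observations that enter the likelihood
    llf = -0.5 * T * math.log(2.0 * math.pi)
    sigma2 = sigma2_init  # assumed start-up value for the first conditional variance
    u_prev = 0.0
    for t in range(1, len(y)):
        if t > 1:
            sigma2 = alpha0 + alpha1 * u_prev ** 2 + beta * sigma2
        u = y[t] - mu - phi * y[t - 1]  # residual from the mean equation
        llf -= 0.5 * (math.log(sigma2) + u * u / sigma2)
        u_prev = u
    return llf


# With alpha1 = beta = 0 the variance is constant and the LLF reduces to the
# homoscedastic case (made-up data).
data = [0.0, 1.0, -1.0, 2.0]
llf_const = ar1_garch11_llf(data, mu=0.0, phi=0.0, alpha0=2.0, alpha1=0.0,
                            beta=0.0, sigma2_init=2.0)
```

In practice the parameters would also need to be constrained so that every $\sigma_t^2$ stays positive, otherwise the logarithm is undefined.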
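• Due to Glosten, Jagannathan and Runkle
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$$
where $I_{t-1} = 1$ if $u_{t-1} < 0$
$= 0$ otherwise
• For a leverage effect, we would see $\gamma > 0$.
• We require $\alpha_1 + \gamma \geq 0$ and $\alpha_1 \geq 0$ for non-negativity.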
The news impact curve plots the next period volatility ($h_t$) that would arise from various
positive and negative values of $u_{t-1}$, given an estimated model.
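A minimal sketch of how the points on a news impact curve could be computed for a GARCH(1,1) model and for the GJR model, $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$. Holding the lagged variance fixed at a single value, and all coefficient values used here, are assumptions for illustration and not the S&P 500 estimates.

```python
def nic_garch(shock, alpha0, alpha1, beta, sigma2_lag):
    """Next-period variance from a GARCH(1,1) for a given lagged shock."""
    return alpha0 + alpha1 * shock ** 2 + beta * sigma2_lag


def nic_gjr(shock, alpha0, alpha1, beta, gamma, sigma2_lag):
    """Next-period variance from a GJR model; negative shocks get the extra gamma term."""
    indicator = 1.0 if shock < 0 else 0.0
    return alpha0 + (alpha1 + gamma * indicator) * shock ** 2 + beta * sigma2_lag


# Hypothetical coefficients and a fixed lagged variance.
grid = [i / 10.0 for i in range(-10, 11)]  # lagged shocks from -1 to 1
garch_curve = [nic_garch(s, 0.01, 0.05, 0.9, 0.02) for s in grid]
gjr_curve = [nic_gjr(s, 0.01, 0.05, 0.9, 0.1, 0.02) for s in grid]
```

The GARCH curve is symmetric around zero, while with $\gamma > 0$ the GJR curve is steeper for negative shocks.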
[Figure: News Impact Curves for S&P 500 Returns using Coefficients from GARCH and GJR Model Estimates. Series: GARCH, GJR. Vertical axis: Value of Conditional Variance (0 to 0.14); horizontal axis: Value of Lagged Shock (-1 to 1).]
• Engle, Lilien and Robins (1987) suggested the ARCH-M specification.
A GARCH-M model would be
$$y_t = \mu + \delta \sigma_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
• GARCH can model the volatility clustering effect since the conditional
variance is autoregressive. Such models can be used to forecast volatility.
• We could show that
$$\mathrm{Var}(y_t \mid y_{t-1}, y_{t-2}, \dots) = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots)$$
• So modelling $\sigma_t^2$ will give us models and forecasts for $y_t$ as well.
• Variance forecasts are additive over time.
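• Let $\sigma^{f\,2}_{1,T}$ be the one-step-ahead forecast for $\sigma^2$ made at time T. This is easy
to calculate since, at time T, the values of all the terms on the right-hand side
are known.
• $\sigma^{f\,2}_{1,T}$ would be obtained by taking the conditional expectation of the
first equation at the bottom of slide 36:
$$\sigma^{f\,2}_{1,T} = \alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2$$
• Given $\sigma^{f\,2}_{1,T}$, how is $\sigma^{f\,2}_{2,T}$, the 2-step ahead forecast for $\sigma^2$ made at time T,
calculated?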
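A sketch of multi-step GARCH(1,1) variance forecasts. The one-step forecast is $\alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2$; for longer horizons the sketch uses the standard result that the conditional expectation of a future squared shock equals the forecast variance for that period, so each further step is $\alpha_0 + (\alpha_1 + \beta)$ times the previous forecast. That recursion, the function name and the numbers are assumptions not shown in the surviving slide text.

```python
def garch11_forecasts(alpha0, alpha1, beta, u_T, sigma2_T, horizon):
    """Variance forecasts for steps 1..horizon made at time T from a GARCH(1,1)."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    forecasts = [alpha0 + alpha1 * u_T ** 2 + beta * sigma2_T]  # one step ahead
    for _ in range(horizon - 1):
        # E(u^2) for a future period is replaced by that period's forecast variance
        forecasts.append(alpha0 + (alpha1 + beta) * forecasts[-1])
    return forecasts


# Hypothetical parameter values and end-of-sample quantities.
steps = garch11_forecasts(alpha0=0.1, alpha1=0.1, beta=0.8, u_T=2.0, sigma2_T=1.0, horizon=3)
three_period_variance = sum(steps)  # variance forecasts are additive over time
```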
1. Option pricing
$C = f(S, X, \sigma^2, T, r_f)$
2. Conditional betas
$$\beta_{i,t} = \frac{\sigma_{im,t}}{\sigma^2_{m,t}}$$
3. Dynamic hedge ratios
The Hedge Ratio - the size of the futures position to the size of the underlying
exposure, i.e. the number of futures contracts to buy or sell per unit of the spot
good.
• What if the standard deviations and correlation are changing over time?
Use
$$h_t = p_t \frac{\sigma_{s,t}}{\sigma_{F,t}}$$
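A short sketch of the two time-varying quantities above: a conditional beta, $\beta_{i,t} = \sigma_{im,t} / \sigma^2_{m,t}$, and a hedge ratio, $h_t = p_t \sigma_{s,t} / \sigma_{F,t}$, each computed for a single date from conditional variances, a covariance and a correlation. The function names and input values are assumptions for illustration; in practice the inputs would come from an estimated multivariate model.

```python
import math


def conditional_beta(cov_im_t, var_m_t):
    """Conditional beta: covariance with the market over the market variance at time t."""
    return cov_im_t / var_m_t


def hedge_ratio(corr_t, var_s_t, var_f_t):
    """Hedge ratio: correlation times spot std deviation over futures std deviation."""
    return corr_t * math.sqrt(var_s_t) / math.sqrt(var_f_t)


# Hypothetical values for one date.
beta_t = conditional_beta(cov_im_t=0.02, var_m_t=0.04)
h_t = hedge_ratio(corr_t=0.9, var_s_t=0.04, var_f_t=0.09)
```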
Testing Non-linear Restrictions or
Testing Hypotheses about Non-linear Models
• Usual t- and F-tests are still valid in non-linear models, but they are
not flexible enough.
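[Figure: the log-likelihood function $L(\theta)$ plotted against $\theta$, showing the values $L(\hat{\theta})$ and $L(\tilde{\theta})$ at the unrestricted estimate $\hat{\theta}$ and the restricted estimate $\tilde{\theta}$, with points A and B marked.]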
• We know at the unrestricted MLE, $L(\hat{\theta})$, the slope of the curve is zero.
• But is it “significantly steep” at $L(\tilde{\theta})$?
• This formulation of the test is usually easiest to estimate.
• Purpose
• To consider the out of sample forecasting performance of GARCH and
EGARCH Models for predicting stock index volatility.
• Implied volatility is the market’s expectation of the “average” level of volatility of
the underlying asset over the life of an option.
• Which is better, GARCH or implied volatility?
• Data
• Weekly closing prices (Wednesday to Wednesday, and Friday to Friday) for the
S&P100 Index option and the underlying, 11 March 1983 - 31 December 1989
• Implied volatility is calculated using a non-linear iterative procedure.
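• Add in a lagged value of the implied volatility parameter to equations (2) and (3).
(2) becomes
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} + \delta \sigma_{t-1}^2 \quad (4)$$
and (3) becomes
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) + \delta \ln(\sigma_{t-1}^2) \quad (5)$$
• We are interested in testing $H_0: \delta = 0$ in (4) or (5).
• Also, we want to test $H_0: \alpha_1 = 0$ and $\beta_1 = 0$ in (4),
• and $H_0: \alpha_1 = 0$ and $\beta_1 = 0$ and $\theta = 0$ and $\gamma = 0$ in (5).
• If this second set of restrictions holds, then (4) collapses to
$$h_t = \alpha_0 + \delta \sigma_{t-1}^2 \quad (4')$$
• and (5) becomes
$$\ln(h_t) = \alpha_0 + \delta \ln(\sigma_{t-1}^2) \quad (5')$$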
• We can test all of these restrictions using a likelihood ratio test.
$$R_{Mt} - R_{Ft} = \lambda_0 + \lambda_1 \sqrt{h_t} + u_t \quad (8.78)$$
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} \quad (8.79)$$
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} + \delta \sigma_{t-1}^2 \quad (8.81)$$
$$h_t = \alpha_0 + \delta \sigma_{t-1}^2 \quad (8.81')$$

Equation for variance   $\lambda_0$   $\lambda_1$   $\alpha_0 \times 10^{-4}$   $\alpha_1$   $\beta_1$   $\delta$   Log-L     $\chi^2$
specification
(8.79)                  0.0072        0.071         5.428                       0.093        0.854       -          767.321   17.77
                        (0.005)       (0.01)        (1.65)                      (0.84)       (8.17)
(8.81)                  0.0015        0.043         2.065                       0.266        -0.068      0.318      776.204   -
                        (0.028)       (0.02)        (2.98)                      (1.17)       (-0.59)     (3.00)
(8.81')                 0.0056        -0.184        0.993                       -            -           0.581      764.394   23.62
                        (0.001)       (-0.001)      (1.50)                                               (2.94)

Notes: t-ratios in parentheses, Log-L denotes the maximised value of the log-likelihood function in
each case. $\chi^2$ denotes the value of the test statistic, which follows a $\chi^2(1)$ in the case of (8.81) restricted
to (8.79), and a $\chi^2(2)$ in the case of (8.81) restricted to (8.81'). Source: Day and Lewis (1992).
Reprinted with the permission of Elsevier Science.
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) + \delta \ln(\sigma_{t-1}^2) \quad (8.82)$$
$$\ln(h_t) = \alpha_0 + \delta \ln(\sigma_{t-1}^2) \quad (8.82')$$

Equation for variance   $\lambda_0$   $\lambda_1$   $\alpha_0 \times 10^{-4}$   $\beta_1$   $\theta$   $\gamma$   $\delta$   Log-L     $\chi^2$
specification
(8.80)                  -0.0026       0.094         -3.62                       0.529       -0.273     0.357      -          776.436   8.09
                        (-0.03)       (0.25)        (-2.90)                     (3.26)      (-4.13)    (3.17)
(8.82)                  0.0035        -0.076        -2.28                       0.373       -0.282     0.210      0.351      780.480   -
                        (0.56)        (-0.24)       (-1.82)                     (1.48)      (-4.34)    (1.89)     (1.82)
(8.82')                 0.0047        -0.139        -2.76                       -           -          -          0.667      765.034   30.89
                        (0.71)        (-0.43)       (-2.30)                                                       (4.01)

Notes: t-ratios in parentheses, Log-L denotes the maximised value of the log-likelihood function in
each case. $\chi^2$ denotes the value of the test statistic, which follows a $\chi^2(1)$ in the case of (8.82) restricted
to (8.80), and a $\chi^2(2)$ in the case of (8.82) restricted to (8.82'). Source: Day and Lewis (1992).
Reprinted with the permission of Elsevier Science.
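The $\chi^2$ columns in the two tables can be reproduced from the reported log-likelihood values, since a likelihood ratio statistic is $LR = -2(L_r - L_u)$, where $L_r$ and $L_u$ are the maximised log-likelihoods of the restricted and unrestricted models. The sketch below does this arithmetic; the function name is an assumption, the degrees of freedom are those stated in the table notes, and 3.841 and 5.991 are the 5% critical values of the $\chi^2(1)$ and $\chi^2(2)$ distributions.

```python
def lr_stat(llf_restricted, llf_unrestricted):
    """Likelihood ratio statistic: -2 * (restricted LLF - unrestricted LLF)."""
    return -2.0 * (llf_restricted - llf_unrestricted)


CHI2_5PCT = {1: 3.841, 2: 5.991}  # 5% critical values by degrees of freedom

# Log-likelihood values as reported in the two tables.
garch_vs_iv = lr_stat(767.321, 776.204)       # GARCH without implied volatility vs with it
iv_only_vs_garch = lr_stat(764.394, 776.204)  # implied volatility only vs GARCH with it
egarch_vs_iv = lr_stat(776.436, 780.480)      # EGARCH without implied volatility vs with it
iv_only_vs_egarch = lr_stat(765.034, 780.480) # implied volatility only vs EGARCH with it

rejections = {
    "garch_vs_iv": garch_vs_iv > CHI2_5PCT[1],
    "iv_only_vs_garch": iv_only_vs_garch > CHI2_5PCT[2],
    "egarch_vs_iv": egarch_vs_iv > CHI2_5PCT[1],
    "iv_only_vs_egarch": iv_only_vs_egarch > CHI2_5PCT[2],
}
```

All four statistics exceed their 5% critical values, so each restricted specification is rejected in favour of the model that contains both the GARCH-type terms and implied volatility.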
• But the models do not represent a true test of the predictive ability of IV.
• So the authors conduct an out of sample forecasting test.
• There are 729 data points. They use the first 410 to estimate the models,
and then make a 1-step ahead forecast of the following week’s volatility.
• Then they roll the sample forward one observation at a time,
constructing a new one step ahead forecast at each step.
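• The forecasts are evaluated with the regression
$$\sigma^2_{t+1} = b_0 + b_1 \sigma^2_{ft} + \xi_{t+1}$$
where $\sigma^2_{t+1}$ is the “actual” value of volatility, and $\sigma^2_{ft}$ is the value forecasted
for it during period t.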
• Perfectly accurate forecasts imply b0 = 0 and b1 = 1.
• But what is the “true” value of volatility at time t ?
Day & Lewis use 2 measures
1. The square of the weekly return on the index, which they call SR.
2. The variance of the week’s daily returns multiplied by the number
of trading days in that week.
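The two ex post volatility measures can be sketched as follows for one week of daily returns. Treating the weekly return as the sum of the daily returns (which holds for log returns) and using the sample variance with an n - 1 divisor are assumptions, as are the function names and the data; the slides do not specify these details.

```python
def squared_weekly_return(daily_returns):
    """SR: the square of the weekly return, taken here as the sum of daily log returns."""
    weekly = sum(daily_returns)
    return weekly ** 2


def scaled_daily_variance(daily_returns):
    """WV: variance of the week's daily returns times the number of trading days."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)  # assumed n - 1 divisor
    return variance * n


# Made-up daily returns for a five-day trading week.
week = [0.01, -0.02, 0.015, 0.0, 0.005]
sr = squared_weekly_return(week)
wv = scaled_daily_variance(week)
```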
$$\sigma^2_{t+1} = b_0 + b_1 \sigma^2_{ft} + \xi_{t+1} \quad (8.83)$$

Forecasting Model     Proxy for ex post   $b_0$      $b_1$     $R^2$
                      volatility
Historic              SR                  0.0004     0.129     0.094
                                          (5.60)     (21.18)
Historic              WV                  0.0005     0.154     0.024
                                          (2.90)     (7.58)
GARCH                 SR                  0.0002     0.671     0.039
                                          (1.02)     (2.10)
GARCH                 WV                  0.0002     1.074     0.018
                                          (1.07)     (3.34)
EGARCH                SR                  0.0000     1.075     0.022
                                          (0.05)     (2.06)
EGARCH                WV                  -0.0001    1.529     0.008
                                          (-0.48)    (2.58)
Implied Volatility    SR                  0.0022     0.357     0.037
                                          (2.22)     (1.82)
Implied Volatility    WV                  0.0005     0.718     0.026
                                          (0.389)    (1.95)

Notes: Historic refers to the use of a simple historical average of the squared returns to forecast
volatility; t-ratios in parentheses; SR and WV refer to the square of the weekly return on the S&P 100,
and the variance of the week’s daily returns multiplied by the number of trading days in that week,
respectively. Source: Day and Lewis (1992). Reprinted with the permission of Elsevier Science.
Implied vs. EGARCH        -0.00001   0.695    -        0.176    -       0.026
                          (-0.07)    (1.62)            (0.27)
Implied vs. EGARCH        0.00026    0.590    -0.374   -        0.118   0.038
vs. Historical            (1.37)     (1.45)   (-0.57)           (7.74)
GARCH vs. EGARCH          0.00005    -        1.070    -0.001   -       0.018
                          (0.37)              (2.78)   (-0.00)

Notes: t-ratios in parentheses; the ex post measure used in this table is the variance of the week’s daily
returns multiplied by the number of trading days in that week. Source: Day and Lewis (1992).
Reprinted with the permission of Elsevier Science.
• Within sample results suggest that IV contains extra information not
contained in the GARCH / EGARCH specifications.
• Out of sample results suggest that nothing can accurately predict
volatility!
• In the case of the VECH, the conditional variances and covariances would each depend upon lagged values of
all of the variances and covariances and on lags of the squares of both error terms and their cross products.
• In matrix form, it would be written
• Neither the VECH nor the diagonal VECH ensures a positive definite variance-covariance
matrix.
• An alternative approach is the BEKK model (Engle & Kroner, 1995).
• In matrix form, the BEKK model is
In Sample
            Unhedged      Naïve Hedge   Symmetric Time Varying Hedge   Asymmetric Time Varying Hedge
            $\beta = 0$   $\beta = 1$   $\beta_t = h_{FC,t}/h_{F,t}$   $\beta_t = h_{FC,t}/h_{F,t}$
Return      0.0389        -0.0003       0.0061                         0.0060
            {2.3713}      {-0.0351}     {0.9562}                       {0.9580}
Variance    0.8286        0.1718        0.1240                         0.1211

Out of Sample
            Unhedged      Naïve Hedge   Symmetric Time Varying Hedge   Asymmetric Time Varying Hedge
            $\beta = 0$   $\beta = 1$   $\beta_t = h_{FC,t}/h_{F,t}$   $\beta_t = h_{FC,t}/h_{F,t}$
Return      0.0819        -0.0004       0.0120                         0.0140
            {1.4958}      {0.0216}      {0.7761}                       {0.9083}
Variance    1.4972        0.1696        0.1186                         0.1188
[Figure: plot of two series, Symmetric BEKK and Asymmetric BEKK. Vertical axis from 0.65 to 0.95; horizontal axis from 500 to 3000.]
Conclusions
- OHR is time-varying and less than 1
- M-GARCH OHR provides a better hedge, both in-sample and out-of-sample.
- No role in calculating OHR for asymmetries