
# Market Models: Chapter 5

## Chapter 5: Forecasting Volatility and Correlation

Previous chapters have described how volatility and correlation forecasts may be generated
using different models. In some cases there are very noticeable differences between the
various forecasts of the same underlying volatility or correlation, and in some cases there are
great similarities. It is a generic problem with volatility forecasting that rather different results
may be obtained, depending on the model used and on the market conditions. Correlation
forecasting is even more problematic because the inherent instability of some correlations
compounds the difficulties.

Even if only one particular type of model were always used, the forecasts would depend on the
parameters chosen. For example, in Figure 3.6 the exponentially weighted moving average
volatility of the CAC at the end of 1997 could be 30% or 45% depending on whether the
smoothing constant is chosen to be 0.96 or 0.84. Many other such examples have been
encountered in previous chapters, such as the different ‘historic’ forecasts of the rand 1-year
swap rate volatilities in Figure 3.2. There are also differences between various types of
GARCH correlation estimates, as shown in Figure 4.13.
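
The sensitivity to the smoothing constant can be illustrated with a minimal sketch. It assumes the standard EWMA recursion, in which the new variance is λ times the old variance plus (1 − λ) times the latest squared return. The function name, the starting variance and the toy return series are illustrative assumptions and are not the CAC data behind Figure 3.6.

```python
import math

def ewma_variance(returns, lam, init_var):
    """Run the EWMA recursion var = lam*var + (1-lam)*r**2 over the returns."""
    var = init_var
    for r in returns:
        var = lam * var + (1.0 - lam) * r * r
    return var

# Toy daily returns: a quiet period of 1% moves followed by one 5% shock.
returns = [0.01] * 50 + [0.05]
init_var = 0.01 ** 2

for lam in (0.96, 0.84):
    daily_var = ewma_variance(returns, lam, init_var)
    annual_vol = math.sqrt(250 * daily_var)  # 250 trading days assumed
    print(f"lambda={lam}: annualized volatility = {annual_vol:.1%}")
```

The lower smoothing constant puts more weight on the most recent squared return, so the same shock produces a noticeably higher volatility estimate.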

The underlying market conditions will also affect results. When markets are stable, in that
they appear to be bounded above and below or that they are trending with a relatively stable
realized volatility, differences between the various forecasts are relatively small. It is
following extreme market events that differences between forecasts tend to be greatest.

If one decides to approach the difficult problem of forecast evaluation, the first consideration
is: which volatility is being forecast? For option pricing, portfolio optimization and risk
management one needs a forecast of the volatility that governs the underlying price process
until some future risk horizon. A geometric Brownian motion has constant volatility, so a
forecast of the process volatility will be a constant whatever the risk horizon. Future volatility
is an extremely difficult thing to forecast because the actual realization of the future process
volatility will be influenced by events that happen in the future. If there is a large market
movement at any time before the risk horizon but after t = 0, the forecast that is made at t = 0
will need to take this into account. Process volatility is not the only interesting volatility to
forecast. In some cases one might wish to forecast implied volatilities, for example in short-
term volatility trades with positions such as straddles and butterflies (Fitzgerald, 1996).

The second consideration is the choice of a benchmark forecast. The benchmark volatility
forecast could be anything, implied volatility or a long-term equally weighted average
statistical volatility being the most common. If a sophisticated and technical model, such as
GARCH, cannot forecast better than the implied or ‘historical’ forecasts that are readily
available from data suppliers (and very easily computed from the raw market data) then it
may not be worth the time and expense for development and implementation.

A third consideration is, which type of volatility should be used for the forecast? Since both
implied and statistical volatilities are forecasts of the same thing, either could be used. Thus a
model could forecast implied volatility with either implied volatility or statistical volatility.
Price process volatilities could be forecast by statistical or implied volatilities, or indeed both
(some GARCH models use implied volatility in the conditional variance equation). There is
much to be said for developing models that use a combination of several volatility forecasts.
When a number of independent forecasts of the same time series are available it is possible to
pool them into a combined forecast that always predicts at least as well as any single
component of that forecast (Granger and Ramanathan, 1984; Diebold and Lopez, 1996). The
Granger–Ramanathan procedure will be described in §5.2.3; the focus of that discussion will
be the generation of confidence intervals for volatility forecasts.

Copyright 2001 Carol Alexander


This chapter begins by outlining some standard measures of the accuracy of volatility and
correlation forecasts. Statistical criteria, which are based on diagnostics such as root mean
square forecasting error, out-of-sample likelihoods, or the correlation between forecasts and
squared returns, are discussed in §5.1.1. Operational evaluation methods are more subjective
because they depend on a trading or a risk management performance measure that is derived
from the particular use of the forecast; these are reviewed in §5.1.2.

Any estimator of a parameter of the current or future return distribution has a distribution
itself. A point forecast of volatility is just the expectation of the distribution of the volatility
estimator,1 but in addition to this expectation one might also estimate the standard deviation
of the distribution of the estimator, that is, the standard error of the volatility forecast. The
standard error determines the width of a confidence interval for the forecast and indicates how
reliable a forecast is considered to be. The wider the confidence interval, the more uncertainty
there is in the forecast.2 Standard errors and confidence intervals for some standard
forecasting models are described in §5.2.

Having quantified the degree of uncertainty in a forecast, one should make an adjustment to
the mark-to-market value of an option portfolio when some options have to be marked to
model. The scale of this adjustment will of course depend on the size of the standard error of
the volatility forecast, and in §5.3.1 it is shown that the model price of out-of-the-money
options should normally be increased to account for uncertainty in volatility. Section 5.3.2
shows how uncertainty in volatility is carried through to uncertainty in the value of a
dynamically delta hedged portfolio. It answers the question: how much does it matter if the
implied volatility that is used for hedging is not an accurate representation of the volatility of
the underlying process?

## 5.1 Evaluating the Accuracy of Point Forecasts

How can it be that so many different results are obtained when attempting to forecast
volatility and correlation using the same basic data? Unlike prices, volatility and correlation
are unobservable. They are parameters of the data generation processes that govern returns.
Volatility is a measure of the dispersion of a return distribution. It does not affect the shape of
the distribution but it still governs how much of the weight in the distribution is around the
centre, and at the same time how much weight is around the extreme values of returns. Small
volatilities give more weight in the centre than large volatilities, so it may be that some
volatility models give better forecasts of the central values, while other volatility models give
better forecasts of the extreme values of returns.

In financial markets the volatility of return distributions can change considerably over time,
but there is only one point against which to measure the success of a fixed horizon forecast:
the observed return over that horizon. The results of a forecast evaluation will therefore
depend on the data period chosen for the assessment. Furthermore, the assessment of
forecasting accuracy will depend very much on the method of evaluation employed (Diebold
and Mariano, 1995). Although we may come across statements such as ‘We employ
fractionally integrated EWMA volatilities because they are more accurate’, it is unlikely that
a given forecasting model would be ‘more accurate’ according to all possible statistical and
operational evaluation criteria. A forecasting model may perform well according to some
evaluation criterion but not so well according to others. In short, no definitive answer can ever
be given to the question ‘which method is more accurate?’.

1 Similarly, a point forecast of correlation is just the expectation of the distribution of the correlation
estimator.
2 Classical statistics gives the expected value of the estimator (point estimate) and the width of the
distribution of the estimator (confidence interval) given some true value of the underlying parameter. It
is a good approximation for the distribution of the true underlying parameter only when the statistical
information (the sample likelihood) is overwhelming compared to one’s prior beliefs. This is not
necessarily so for volatility forecasts, especially for the long term (§8.3.3).

Much research has been published on the accuracy of different volatility forecasts for
financial markets: see, for example, Andersen and Bollerslev (1998), Alexander and Leigh
(1997), Brailsford and Faff (1996), Cumby et al. (1993), Dimson and Marsh (1990), Figlewski
(1997), Frennberg and Hansson (1996), and West and Cho (1995). Given the remarks just
made about the difficulties of this task it should come as no surprise that the results are
inconclusive. However, there is one finding that seems to be common to much of this
research, and that is that ‘historical’ volatility is just about the worst predictor of a constant
volatility process. Considering Figure 5.1, this is really not surprising.3 A realization of a
constant volatility process is just a lag of historical volatility,4 and trying to predict the lag of
a time series by its current value will not usually give good results!

## Figure 5.1: Historic and Realised Volatility of DEM-USD

[Figure: time series of HIST30 and REAL30, May-88 to May-95; vertical axis from 0 to 25.]
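
The statement that realized volatility is a lag of historical volatility can be checked with a small sketch. It assumes equally weighted, zero-mean estimates with a look-back period equal to the forecast horizon; the function names and the toy return series are hypothetical and are not the DEM-USD data in the figure.

```python
import math

def window_vol(returns, start, n):
    """Equally weighted, zero-mean volatility of returns[start:start + n]."""
    window = returns[start:start + n]
    return math.sqrt(sum(r * r for r in window) / n)

def historic_vol(returns, t, n):
    """Volatility of the n returns up to and including index t."""
    return window_vol(returns, t - n + 1, n)

def realized_vol(returns, t, n):
    """Volatility of the n returns that follow index t."""
    return window_vol(returns, t + 1, n)

# Hypothetical returns with a single exceptional move at index 12.
returns = [0.001 * ((i % 5) - 2) for i in range(30)]
returns[12] = 0.05
n = 5

for t in range(6, 18):
    print(t, round(historic_vol(returns, t, n), 4), round(realized_vol(returns, t, n), 4))
```

In the printed columns the realized series jumps n periods before the historic series does, which is why the current historic value is a poor predictor of the realization.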

Some operational and statistical criteria for evaluating the success of a volatility and/or
correlation forecast are described below. Whatever criterion is used to validate the model it
should be emphasized that, however well a model fits in-sample (i.e. within the data period
used to estimate the model parameters), the real test of its forecasting power is in out-of-
sample, and usually post-sample, predictive tests. As explained in §A.5.2, a certain amount of
the historic data should be withheld from the period used to estimate the model, so that the
forecasts may be evaluated by comparing them to the out-of-sample data.

3 Let t = T be the forecast horizon and t = 0 the point at which the forecast is made. Suppose an
exceptional return occurs at time T − 1. The realized volatility of the constant volatility process will
already reflect this exceptional return at time t = 0; it jumps up T periods before the event. However,
the historical volatility only reflects this exceptional return at time T; it jumps up at the same time as
the realized volatility jumps down.
4 This is the case if the historical method uses an equally weighted average of past squared returns over
a look-back period that is the same length as the forecast horizon. More commonly, historical
volatilities over a very long-term average are used to forecast for a shorter horizon – for example, 5-
year averages are used to forecast 1-year average volatilities.


Suppose a volatility forecasting model produces a set of n post-sample forward volatility
predictions, denoted $\hat\sigma_{t+1}, \ldots, \hat\sigma_{t+T}$. Assume, just to make the exposition easier, that these
forecasts are of 1-day volatilities, so the forecasts are of the 1-day volatility tomorrow, and
the 1-day forward volatility on the next day, and so on until T days ahead. We might equally
well have assumed the forecasts were of 1-month volatility over the next month, 1 month
ahead, 2 months ahead, and so on until the 1-month forward volatility in T months’ time. Or
we might be concerned with intra-day frequencies, such as volatility over the next hour. The
unit of time does not matter for the description of the tests. All that matters is that these
forecasts be compared with observations on market returns of the same frequency.5

A process volatility is never observed; even ex post we can only ever know an estimate, the
realization of the process volatility that actually occurred. The only observation is on the
market return. A 1-day volatility forecast is the standard deviation of the 1-day return, so a 1-
day forecast should be compared with the relevant 1-day return. One common statistical
measure of accuracy for a volatility forecast is the likelihood of the return, given the volatility
forecast. That is, the value of the probability density at that point, as explained in §A.6.1.
Figure 5.2 shows that the observed return r has a higher likelihood under f(x) than under g(x).
That is, r is more likely under the density that is generated by the volatility forecast that is the
higher of the two. One can conclude that the higher volatility forecast was more accurate on
the day that the return r was observed.

## Figure 5.2: Volatility and the Likelihood

[Figure: two probability densities, f(x) and g(x); vertical axis from 0 to 0.22.]

Suppose that we want to compare the accuracy of two different volatility forecasting models,
A and B.6 Suppose model A generates a sequence of volatility forecasts, $\{\hat\sigma_{t+1}, \ldots, \hat\sigma_{t+T}\}_A$,
and model B generates a sequence of volatility forecasts, $\{\hat\sigma_{t+1}, \ldots, \hat\sigma_{t+T}\}_B$. For model
A, compare each forecast $\hat\sigma_{t+j}$ with the observed return on that day, $r_{t+j}$, by recording the
likelihood of the return as depicted in Figure 5.2. The out-of-sample likelihood of the whole
sequence of forecasts is the product of all the individual likelihoods, and we can denote this
$L_A$. Similarly, we can calculate the likelihood of the sample given the forecasts made with
model B, $L_B$. If over several such post-sample predictive tests, model A consistently gives
higher likelihoods than model B, we can say that model A performs better than model B.
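
The comparison of two models by out-of-sample likelihood might be sketched as follows. The sketch assumes zero-mean normal returns and works with the log-likelihood (the sum of log densities) rather than the raw product, which is a numerical convenience and gives the same ranking. The function name and all data values are hypothetical.

```python
import math

def out_of_sample_loglik(returns, vol_forecasts):
    """Sum of log normal densities of each return given its volatility forecast.

    Assumes zero-mean normal returns; summing logs is equivalent to taking
    the log of the product of the individual likelihoods.
    """
    total = 0.0
    for r, sigma in zip(returns, vol_forecasts):
        var = sigma * sigma
        total += -0.5 * math.log(2.0 * math.pi * var) - r * r / (2.0 * var)
    return total

# Hypothetical post-sample returns and two competing forecast sequences.
returns = [0.012, -0.020, 0.005, 0.031, -0.015]
model_a = [0.018, 0.018, 0.019, 0.019, 0.020]
model_b = [0.008, 0.008, 0.008, 0.009, 0.009]

ll_a = out_of_sample_loglik(returns, model_a)
ll_b = out_of_sample_loglik(returns, model_b)
print("model A" if ll_a > ll_b else "model B", "has the higher likelihood")
```

In practice the comparison would be repeated over several post-sample periods before ranking the models.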

5 If implied volatility is being forecast, then the market implied volatility is the observed quantity that
can be used to assess the accuracy of forecasts.
6 These could be two EWMA models, but with different smoothing constants; or a 30-day and a 60-day
historic model; or an EWMA and a GARCH; or two different types of GARCH; or a historic and an
EWMA, etc.


Different volatility forecasting models may be ranked by the value of the out-of-sample
likelihood, but the effectiveness of this method does rely on the correct specification of the
return distributions. Generally speaking, we assume that these return distributions are normal,
but if they are not normal then the results of out-of-sample likelihood tests will not be
reliable. If likelihood criteria are to be used it is advisable to accompany results with a test for
the assumed distribution of returns (§10.1).

Much of the literature on volatility forecasting uses a root mean square error (RMSE)
criterion instead of a likelihood (§A.5.3). But while a RMSE may be fine for assessing price
forecasts, or any forecasts that are of the mean parameter, there are problems with using the
RMSE criterion for volatility forecasting (Makridakis, 1993). In fact, the ‘minimize the
RMSE’ criterion is equivalent to the ‘maximize the likelihood’ criterion when the likelihood
function is normal with a constant volatility.7 Hence RMSEs are applicable to mean
predictions, such as those from a regression model, rather than variance or covariance
predictions.8

Not only is the RMSE criterion applicable to means rather than variances, one statistical
performance measure that has, unfortunately, slipped into common use is an RMSE between a
volatility forecast and the realized volatility, which is just one observation on the process
volatility. As a statistical criterion this makes no sense at all, because the correct test is an F-
test, not an RMSE.9 In fact the only justification for using the RMSE between a forecast and
the ex-post realized volatility is that it is a simple distance metric.

Notwithstanding these comments, a popular approach to assessing volatility forecasting
accuracy is to use the RMSE to compare the forecast of variance with the appropriate squared
return. The difference between the variance forecast and the squared return is taken as the
forecast error. These errors are squared and summed over a long post-sample period, and then
square-rooted to give the post-sample RMSE between the variance forecast and the squared
returns. However, these RMSE tests will normally give poor results, because although the
expectation of the squared return is the variance, there is a very large standard error around
this expectation. That is, the squared returns will jump about excessively while the variance
forecasts remain more stable. The reason for this is that the return $r_t$ is equal to $\sigma_t z_t$, where $z_t$ is
a standard normal variate, so the squared return yields very noisy measurements due to
excessive variation in $z_t^2$.
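
How noisy the squared return is can be seen in a small simulation. This is a sketch under stated assumptions: returns are simulated as σz with a constant, known σ, the forecast is the true variance itself, and the squared errors are averaged (the usual RMSE normalization) rather than summed. The function name, seed and sample size are arbitrary choices.

```python
import math
import random

def rmse_variance_vs_squared_returns(var_forecasts, returns):
    """Root mean square of (variance forecast - squared return)."""
    errors = [(v - r * r) ** 2 for v, r in zip(var_forecasts, returns)]
    return math.sqrt(sum(errors) / len(errors))

random.seed(0)
sigma = 0.01            # assumed true (constant) daily volatility
n = 200_000
returns = [sigma * random.gauss(0.0, 1.0) for _ in range(n)]
perfect_forecasts = [sigma ** 2] * n

rmse = rmse_variance_vs_squared_returns(perfect_forecasts, returns)
# For normal z, the standard deviation of z**2 is sqrt(2), so even the true
# variance has an RMSE of about sqrt(2) * sigma**2 against squared returns.
print(rmse / sigma ** 2)
```

Even a perfect variance forecast therefore has an RMSE larger than the variance it is forecasting, which is why these tests tend to look poor.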

Another popular statistical procedure is to perform a regression of the squared returns on the
variance forecast. If the variance is correctly specified the constant from this regression
should be zero and the slope coefficient should be one. But since the values for the
explanatory variable are only estimates, the standard errors-in-variables problem of regression
described in §A.4.2 produces a downward bias on the estimate of the slope coefficient.
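
A minimal sketch of this regression, written with plain Python so that the intercept, slope and R² are explicit. The function name and the input values are hypothetical; with real data the inputs would be the post-sample returns and the corresponding variance forecasts.

```python
def regress_squared_returns(returns, var_forecasts):
    """OLS of squared returns on variance forecasts: (intercept, slope, R2)."""
    y = [r * r for r in returns]
    x = list(var_forecasts)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical inputs: replace with post-sample returns and forecasts.
returns = [0.010, -0.022, 0.004, 0.015, -0.030, 0.008]
var_forecasts = [1.0e-4, 1.5e-4, 2.5e-4, 2.0e-4, 2.2e-4, 4.0e-4]
intercept, slope, r2 = regress_squared_returns(returns, var_forecasts)
print(intercept, slope, r2)
```

An intercept near zero and a slope near one would be consistent with a correctly specified variance, subject to the downward bias on the slope noted above.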

7 To see this, suppose returns are normal so (from §A.6.3) the likelihood L is most easily expressed as:
$$-2\ln L = n\ln(2\pi) + n\ln\sigma^2 + \sum (x_i - \mu)^2/\sigma^2.$$
Now maximizing L is equivalent to minimizing $-2\ln L$, and when volatility is constant this is equivalent
to minimizing $\sum (x_i - \mu)^2$. This is the same as minimizing $\sqrt{\sum (x_i - \mu)^2}$, that is, the root of the sum of the
squared errors between forecasts $x_i$ and the mean.
8 Of course, a variance is a mean, but the mean of the squared random variable, which is chi-squared
distributed, not normally distributed, so the likelihood function is totally different and does not involve
any sum of squared errors.
9 Hypothesis tests of the form $H_0: \sigma_A = \sigma_B$ would be relevant; that is, to test whether the process
volatility underlying the forecast is the same as the process volatility that generated the realization we
have observed, ex post. Therefore an F-test based on the test statistic $\hat\sigma_A^2 / \hat\sigma_B^2$ for the equality of
two variances would apply.


The $R^2$ from this regression will assess the amount of variation in squared returns that is
explained by the successive forecasts of $\sigma^2$. However, the excessive variation in squared
returns that was mentioned above also presents problems for the $R^2$ metric. In fact this $R^2$ will
be bounded above, and the bound will depend on the data generation process for returns. For
example, Andersen and Bollerslev (1998) show that if returns are generated by the symmetric
GARCH(1,1) model (4.2), then the true $R^2$ from a regression of the squared returns on the
variance forecast will be

$$R^2 = \alpha^2 / (1 - \beta^2 - 2\alpha\beta). \tag{5.1}$$

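Table 5.1: True $R^2$ from (5.1) for values of α and β

| Alpha | Beta | R² | Alpha | Beta | R² | Alpha | Beta | R² |
|-------|------|--------|-------|------|--------|-------|------|--------|
| 0.05 | 0.85 | 0.0130 | 0.075 | 0.83 | 0.0301 | 0.1 | 0.80 | 0.0500 |
| 0.05 | 0.86 | 0.0143 | 0.075 | 0.84 | 0.0334 | 0.1 | 0.81 | 0.0550 |
| 0.05 | 0.87 | 0.0160 | 0.075 | 0.85 | 0.0375 | 0.1 | 0.82 | 0.0611 |
| 0.05 | 0.88 | 0.0182 | 0.075 | 0.86 | 0.0428 | 0.1 | 0.83 | 0.0689 |
| 0.05 | 0.89 | 0.0210 | 0.075 | 0.87 | 0.0500 | 0.1 | 0.84 | 0.0791 |
| 0.05 | 0.90 | 0.0250 | 0.075 | 0.88 | 0.0601 | 0.1 | 0.85 | 0.0930 |
| 0.05 | 0.91 | 0.0309 | 0.075 | 0.89 | 0.0756 | 0.1 | 0.86 | 0.1131 |
| 0.05 | 0.92 | 0.0406 | 0.075 | 0.90 | 0.1023 | 0.1 | 0.87 | 0.1447 |
| 0.05 | 0.93 | 0.0594 | 0.075 | 0.91 | 0.1589 | 0.1 | 0.88 | 0.2016 |
| 0.05 | 0.94 | 0.1116 | 0.075 | 0.92 | 0.3606 | 0.1 | 0.89 | 0.3344 |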

Relation (5.1) provides an upper bound for the $R^2$ for GARCH(1,1) forecasts, and similar
upper bounds apply to other standard forecasting models. Table 5.1 shows how the true $R^2$
varies with some common values for the estimates of α and β. Most of the $R^2$ are extremely
small, and the largest value in the table is around 1/3, nothing like the maximum value of 1
that one normally expects with $R^2$. Therefore it is not surprising that most of the $R^2$ that are
reported in the literature are less than 0.05. Earlier conclusions from this literature, that
standard volatility models have very poor forecasting properties, should be reviewed in the
light of this finding. The fact that the $R^2$ from a regression of squared returns on the forecasts
of the variance is low does not mean that the model is misspecified.
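
The entries of Table 5.1 can be reproduced directly from (5.1). The sketch below is illustrative: the function name is hypothetical, and the check on the denominator is an added guard against parameter values for which the expression is not meaningful.

```python
def garch_true_r2(alpha, beta):
    """True R2 in (5.1) for a symmetric GARCH(1,1) data generation process."""
    denom = 1.0 - beta ** 2 - 2.0 * alpha * beta
    if denom <= 0.0:
        raise ValueError("(5.1) requires 1 - beta^2 - 2*alpha*beta > 0")
    return alpha ** 2 / denom

for alpha, beta in [(0.05, 0.85), (0.075, 0.92), (0.1, 0.89)]:
    print(alpha, beta, round(garch_true_r2(alpha, beta), 4))
```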

## 5.1.2 Operational Criteria

An operational evaluation of volatility and correlation forecasts will focus on the particular
application of the forecast. Thus any conclusions that may be drawn from an operational
evaluation will be much more subjective than those drawn from the statistical methods just
described. The advantage of using an operational criterion is that the volatility forecast is
being assessed in the actual context in which it will be used. The disadvantage of operational
evaluation is that the results might imply the use of a different type of forecast for every
different purpose.

Some operational evaluation methods are based on the P&L generated by a trading strategy.
A measurement of trading performance is described in §A.5.3 that is relevant for price
forecasting, where an underlying asset is bought or sold depending on the level of the price
forecast. A performance criterion for volatility or correlation forecasts should be based on
hedging performance (Engle and Rosenberg, 1995) or on trading a volatility- or correlation-
dependent product.

For example, the metric for assessing a forecast of implied volatility could involve buying or
selling straddles (a put and a call of the same strike) depending on the level of the volatility
that is forecast. Straddles have a V-shaped pay-off and so will be in-the-money if the market
is volatile, that is, for a large upward or downward movement in the underlying. The forecast
of volatility σ̂ can be compared with the current implied volatility level σ, and then a trading
strategy can be defined that depends on their difference.

The choice of strategy is an entirely subjective decision. It depends on how one proposes to
implement the trades. For example, a simple trading strategy for an implied volatility forecast
$\hat\sigma$ that relates to a single threshold implied volatility $\tau$, might be: ‘buy one at-the-money
straddle if $\hat\sigma - \sigma > \tau$; otherwise do nothing’. An alternative volatility strategy could be: ‘buy
one ATM straddle if $\hat\sigma - \sigma > \tau_1$; sell one ATM straddle if $\hat\sigma - \sigma < \tau_2$; otherwise do nothing’.
Or the strategy may go long or short several straddles, depending on various thresholds: ‘buy
$n_1$ straddles if $\hat\sigma - \sigma \ge \tau_1$; sell $n_2$ straddles if $\hat\sigma - \sigma \le \tau_2$; otherwise do nothing’.
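
The last of these rules can be written as a small decision function. This is only a sketch: the function name, the default trade sizes and the requirement that the sell threshold lies below the buy threshold are assumptions added here so that the two rules cannot both apply.

```python
def straddle_position(vol_forecast, implied_vol, tau1, tau2, n1=1, n2=1):
    """Signed number of ATM straddles: +n1 to buy, -n2 to sell, 0 otherwise."""
    if tau2 >= tau1:
        raise ValueError("expected tau2 < tau1 so the two rules cannot both fire")
    spread = vol_forecast - implied_vol
    if spread >= tau1:
        return n1
    if spread <= tau2:
        return -n2
    return 0

# Hypothetical thresholds of +/- 2 volatility points.
print(straddle_position(0.25, 0.20, tau1=0.02, tau2=-0.02))   # buy
print(straddle_position(0.17, 0.20, tau1=0.02, tau2=-0.02))   # sell
print(straddle_position(0.21, 0.20, tau1=0.02, tau2=-0.02))   # do nothing
```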

The P&L results from this strategy will depend on many choices: the number of trades $n_1$ and
$n_2$, the thresholds $\tau_1$ and $\tau_2$, the frequency of trades, the strike of the straddles and, of course,
the underlying market conditions during the test, including the current level of implied
volatility σ. Clearly an evaluation strategy that is closest to the proposed trading strategy
needs to be designed and the trader should be aware that the optimal forecasting model may
very much depend on the design of the evaluation strategy.

When volatility and correlation forecasts are used for risk management the operational
evaluation of volatility and correlation forecasts can be based on a standard risk measure such
as Value-at-Risk. The general framework for backtesting of VaR models will be discussed in
§9.5. But if the model performs poorly it may be for several reasons, such as non-normality in
return distributions, and not just the inaccuracy of the volatility and correlation forecasts.

Alexander and Leigh (1997) perform a statistical evaluation of the three types of statistical
volatility forecasts that are in standard use: ‘historical’ (equally weighted moving averages),
EWMAs and GARCH. Given the remarks just made, it is impossible to draw any firm
conclusions about the relative effectiveness of any volatility forecasting method for an
arbitrary portfolio. However, using data from the major equity indices and foreign exchange
rates, some broad conclusions do appear. While EWMA methods perform well for predicting
the centre of a normal distribution, the VaR model backtests indicate that GARCH and
equally weighted moving average methods are more accurate for the tails prediction required
by VaR models. These results seem relatively independent of the data period used.

GARCH forecasts are designed to capture the fat tails in return distributions, so VaR
measures from GARCH models tend to be larger than those that assume normality. The
‘ghost features’ of equally weighted averages that follow exceptional market moves have a
similar effect on the ‘historical’ VaR measures. Therefore it is to be expected that these two
types of forecasts generate larger VaR measures for most data periods, and consequently
better VaR backtesting results (§9.5.1).

## 5.2 Confidence Intervals for Volatility Forecasts

We have examined the ability of point forecasts of volatility to capture the constant volatility
of a price process. These point forecasts are the expectation of the future volatility estimator
distribution. Another important quality for volatility forecasts is that they have low standard
errors. That is, there is relatively little uncertainty surrounding the forecast or, to put it
another way, one has a high degree of confidence that the forecast is close to the true process
volatility. In §A.5.1 it is shown how standard errors of statistical regression forecasts are used
to generate confidence intervals for the true value of the underlying parameter. These
principles may also be applied to create confidence intervals for the true volatility, or for the
true variance if that is the underlying parameter of interest.

The statistical models described in chapters 3 and 4 are variance forecasting models.10 When
the variance is forecast, the standard error of the forecast refers to the variance rather than to
the volatility. Of course, the standard error of the volatility forecast is not the square root of
the standard error of the variance forecast, but there is a simple transformation between the
two. Since volatility is the square root of the variance, the density function of volatility is

$$g(\sigma) = 2\sigma h(\sigma^2) \quad \text{for } \sigma > 0, \tag{5.2}$$

where $h(\sigma^2)$ is the density function of variance.11 Relationship (5.2) may be used to transform
results about predictions of variances to predictions of volatility.
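
Relationship (5.2) can be derived from the distribution functions. In the short derivation below, $V$ denotes the variance as a random variable and $G$ and $H$ denote the distribution functions of volatility and variance; these three symbols are introduced here for illustration only.

```latex
G(\sigma) = P\big(\sqrt{V} \le \sigma\big) = P\big(V \le \sigma^2\big) = H(\sigma^2),
\qquad \sigma > 0,
\\[4pt]
g(\sigma) = \frac{d}{d\sigma}\, H(\sigma^2) = 2\sigma\, h(\sigma^2).
```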

## 5.2.1 Moving Average Models

A confidence interval for the variance $\hat\sigma^2$ estimated by an equally weighted average may be
obtained by a straightforward application of sampling theory. If a variance estimate is based
on n normally distributed returns with an assumed mean of zero, then $n\hat\sigma^2/\sigma^2$ has a chi-squared
distribution with n degrees of freedom.12 A 100(1 − α)% two-sided confidence
interval for $n\hat\sigma^2/\sigma^2$ would therefore take the form $(\chi^2_{n,\,1-\alpha/2},\ \chi^2_{n,\,\alpha/2})$ and a straightforward
calculation gives the associated confidence interval for the variance $\sigma^2$ as:

$$\left( n\hat\sigma^2 / \chi^2_{n,\,\alpha/2},\ \ n\hat\sigma^2 / \chi^2_{n,\,1-\alpha/2} \right). \tag{5.3}$$

For example a 95% confidence interval for an equally weighted variance forecast based on 30
observations is obtained using the upper and lower chi-squared critical values:

$$\chi^2_{30,\,0.025} = 46.979 \quad \text{and} \quad \chi^2_{30,\,0.975} = 16.791.$$

So the confidence interval is $(0.6386\,\hat\sigma^2,\ 1.7867\,\hat\sigma^2)$ and exact values are obtained by
substituting in the value of the variance estimate.
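
The interval (5.3) is simple to compute once the critical values are known. The sketch below takes the two chi-squared critical values as inputs, using the values quoted above for n = 30; for other sample sizes they would come from tables or a statistics library. The function name and the 20% volatility example are hypothetical.

```python
def variance_confidence_interval(var_hat, n, chi2_upper, chi2_lower):
    """Interval (5.3): (n*var_hat/chi2_upper, n*var_hat/chi2_lower).

    chi2_upper and chi2_lower are the upper and lower chi-squared critical
    values with n degrees of freedom, supplied by the caller.
    """
    return n * var_hat / chi2_upper, n * var_hat / chi2_lower

# Critical values for n = 30 and 95% confidence, as quoted in the text.
lower, upper = variance_confidence_interval(1.0, 30, 46.979, 16.791)
print(round(lower, 4), round(upper, 4))

# Hypothetical estimate: 20% annualized volatility, i.e. variance 0.04.
print(variance_confidence_interval(0.04, 30, 46.979, 16.791))
```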

Assuming normality,13 the standard error of an equally weighted average variance estimator
based on n (zero mean) squared returns is $\sqrt{2/n}\,\sigma^2$. Therefore, as a percentage of the
variance, the standard error of the variance estimator is 20% when 50 observations are used in
the estimate, and 10% when 200 observations are used in the estimate.

10 The volatility forecast is taken to be the square root of the variance forecast, even though
$E(\sigma) \ne \sqrt{E(\sigma^2)}$.
11 If y is a (monotonic and differentiable) function of x then their probability densities g(·) and h(·) are
related by $g(y) = |dx/dy|\,h(x)$. So if $y = \sqrt{x}$, $dx/dy = 2y$.
12 The usual degrees-of-freedom correction does not apply since we have assumed throughout that
returns have zero mean.
13 It follows from footnote 9 that if $X_i$ are independent random variables ($i = 1, \ldots, n$) then $f(X_i)$ are
also independent for any monotonic differentiable function f(·). Moving average models already
assume that returns are i.i.d. Now assuming normality too, so that the returns are NID(0, $\sigma^2$), we apply
the variance operator to $\hat\sigma_t^2 = \sum_{i=1}^{n} r_{t-i}^2 / n$. Since the squared returns are independent,
$V(\hat\sigma_t^2) = \sum_{i=1}^{n} V(r_{t-i}^2) / n^2$. Now $V(r_t^2) = E(r_t^4) - [E(r_t^2)]^2 = 3\sigma^4 - \sigma^4 = 2\sigma^4$ by normality,
and it follows that the variance of the equally weighted variance estimator is $2\sigma^4/n$.


The (infinite) EWMA variance estimator given by (3.3) has variance14

$$2\,\frac{1-\lambda}{1+\lambda}\,\sigma^4.$$

Therefore, as a percentage of the variance, the standard error of the EWMA variance
estimator is $\sqrt{2(1-\lambda)/(1+\lambda)}$, which is about 22.6% when λ = 0.95, 32.4% when λ = 0.9, and
40.3% when λ = 0.85.

To obtain error bounds for the corresponding volatility estimates, it is of course not
appropriate to take the square root of the error bounds for the variance estimate. However, it
can be shown that15

$$V(\hat\sigma^2) \approx (2\sigma)^2\, V(\hat\sigma).$$

The standard error of the volatility estimator (as a percentage of volatility) is therefore
approximately one-half the size of the variance standard error (as a percentage of the
variance).16 As a percentage of the volatility, the standard error of the equally weighted
volatility estimator is approximately 10% when 50 observations are used in the estimate, and
5% when 200 observations are used in the estimate; the standard error of the EWMA
volatility estimator is about 11.3% when λ = 0.95, 16.2% when λ = 0.9, and 20.1% when
λ = 0.85.

The standard errors on equally weighted moving average volatility estimates become very
large when only a few observations are used. This is one reason why it is advisable to use a
long averaging period in ‘historical’ volatility estimates. On the other hand, the longer the
averaging period, the longer-lasting the ‘ghost effects’ from exceptional returns (see Figure
3.1).
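
The relative standard errors discussed in this section follow from the closed-form estimator variances, $2\sigma^4/n$ for the equally weighted average and $2\sigma^4(1-\lambda)/(1+\lambda)$ for the infinite EWMA, together with the approximation that the volatility standard error is half the variance standard error in relative terms. A minimal sketch, with hypothetical function names, assuming zero-mean normal returns:

```python
import math

def equal_weight_se(n):
    """Relative standard errors (variance, volatility) for an n-observation
    equally weighted average, assuming zero-mean normal returns."""
    var_se = math.sqrt(2.0 / n)
    vol_se = 1.0 / math.sqrt(2.0 * n)
    return var_se, vol_se

def ewma_se(lam):
    """Relative standard errors (variance, volatility) for the infinite EWMA
    with smoothing constant lam, from V = 2*sigma^4*(1-lam)/(1+lam)."""
    ratio = (1.0 - lam) / (1.0 + lam)
    return math.sqrt(2.0 * ratio), math.sqrt(ratio / 2.0)

for n in (50, 200):
    print(n, equal_weight_se(n))
for lam in (0.95, 0.9, 0.85):
    print(lam, ewma_se(lam))
```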

## 5.2.2 GARCH Models

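The covariance matrix of the parameter estimates in a GARCH model can be used to generate
GARCH confidence intervals for conditional variance. For example, the variance of the one-
step-ahead variance forecast in a GARCH(1,1) model is

$$
\begin{aligned}
V_t(\hat\sigma_{t+1}^2) ={}& V_t(\hat\omega) + V_t(\hat\alpha)\,\varepsilon_t^4 + V_t(\hat\beta)\,\hat\sigma_t^4 \\
&+ 2\,\mathrm{cov}_t(\hat\omega, \hat\alpha)\,\varepsilon_t^2 + 2\,\mathrm{cov}_t(\hat\omega, \hat\beta)\,\hat\sigma_t^2 + 2\,\mathrm{cov}_t(\hat\alpha, \hat\beta)\,\varepsilon_t^2\,\hat\sigma_t^2
\end{aligned}
\tag{5.4}
$$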

The estimated covariance matrix of parameter estimates is part of the standard output in
GARCH procedures. It will depend on the sample size used to estimate the model: as with the
error bounds for moving average estimates given above, the smaller the sample size the
bigger these quantities. However, in the GARCH forecast by far the largest source of
uncertainty is the unexpected return at time t. If there is a large unexpected movement in
the market, in either direction, then $\varepsilon_t^2$ will be large and the effect in (5.4) will be to widen the
confidence interval for the GARCH variance considerably.

14 Applying the variance operator to (3.3): $V(\hat\sigma^2) = [(1-\lambda)^2/(1-\lambda^2)]\,V(r_{t-1}^2) = [(1-\lambda)/(1+\lambda)]\,2\sigma^4$.
15 Taking a second-order Taylor expansion of f(x) about µ, the mean of X, and taking expectations
gives $E(f(X)) \approx f(\mu) + \tfrac{1}{2}V(X)f''(\mu)$. Similarly, $E(f(X)^2) \approx f(\mu)^2 + V(X)[f'(\mu)^2 + f(\mu)f''(\mu)]$, again
ignoring higher-order terms. Thus $V(f(X)) \approx f'(\mu)^2\,V(X)$.
16 For the equally weighted average of length n, the variance of the volatility estimator is
$(2\sigma^4/n)(1/2\sigma)^2 = \sigma^2/(2n)$, so the standard error of the volatility estimator as a percentage of volatility is
$1/\sqrt{2n}$. For the EWMA, the variance of the volatility estimator is $(2[(1-\lambda)/(1+\lambda)]\sigma^4)(1/2\sigma)^2 = [(1-\lambda)/(1+\lambda)]\sigma^2/2$,
so the standard error of the volatility estimator as a percentage of volatility is $\sqrt{(1-\lambda)/[2(1+\lambda)]}$.

Consider the GARCH(1,1) models discussed in §4.3.2. Table 5.2 reports upper and lower
bounds for 95% confidence intervals for the 1-day-ahead GARCH variance forecasts that are
generated using (4.16) with a number of different values for unexpected daily return. The
confidence interval for the variance is quoted in annualized terms assuming 250 days a year.
Note that these returns will be squared in the symmetric GARCH(1,1), so it does not matter
whether they are positive or negative.

Table 5.2: GARCH variance 95% confidence interval bounds on 11 September 1998

| Unexpected return | S&P 500 lower | S&P 500 upper | CAC lower | CAC upper | Nikkei 225 lower | Nikkei 225 upper |
|---|---|---|---|---|---|---|
| 0.001 | 0.10080 | 0.10345 | 0.20367 | 0.20649 | 0.13829 | 0.14022 |
| 0.002 | 0.099647 | 0.10460 | 0.20273 | 0.20742 | 0.13722 | 0.14130 |
| 0.003 | 0.098144 | 0.10610 | 0.20144 | 0.20871 | 0.13606 | 0.14245 |
| 0.004 | 0.096568 | 0.10768 | 0.20006 | 0.21009 | 0.13489 | 0.14362 |
| 0.005 | 0.094967 | 0.10928 | 0.19865 | 0.21151 | 0.13371 | 0.14480 |

Note that the GARCH confidence intervals are much wider following a large unexpected
return − rather than uncertainty in parameter estimates, it is market behaviour that is the main
source of uncertainty in GARCH forecasts. These confidence intervals are for the variance,
not the volatility, but may be translated into confidence intervals for volatility.17 For example,
the confidence intervals for an unexpected return of 0.005 translate into a confidence interval
in volatility terms of (30.8%, 33.1%) for the S&P 500, (44.6%, 46%) for the CAC and
(36.5%, 37.9%) for the Nikkei 225.
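Since percentiles are preserved under the square root transformation, the conversion from variance limits to volatility limits is a one-line calculation. The sketch below applies it to the S&P 500 and CAC bounds from the last row of Table 5.2; the function name is illustrative only.

```python
import math

def variance_ci_to_vol_ci(lower_var, upper_var):
    """Map annualized variance bounds to volatility bounds, in percent."""
    return 100.0 * math.sqrt(lower_var), 100.0 * math.sqrt(upper_var)

# Bounds for an unexpected return of 0.005, taken from Table 5.2
sp500 = variance_ci_to_vol_ci(0.094967, 0.10928)
cac = variance_ci_to_vol_ci(0.19865, 0.21151)

print(f"S&P 500: ({sp500[0]:.1f}%, {sp500[1]:.1f}%)")  # (30.8%, 33.1%)
print(f"CAC:     ({cac[0]:.1f}%, {cac[1]:.1f}%)")      # (44.6%, 46.0%)
```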

## 5.2.3 Confidence Intervals for Combined Forecasts

Suppose a process volatility σ is being forecast and that there are m different forecasting
models available.18 Denote the forecasts from these models by σ̂ 1, σ̂ 2, … , σ̂ m and suppose
that each of these forecasts has been made over a data period from t = 0 to t = T. Now suppose
we have observed, ex-post, a realization of the process volatility over the same period. The
combined forecasting procedure of Granger and Ramanathan (1984) applied in this context
requires a least squares regression of the realized volatility σ on a constant and σ̂ 1, σ̂ 2, … ,
σ̂ m . The fitted value from this regression is a combined volatility forecast that will fit at least
as well as any of the component forecasts: the R² between realized volatility and the
combined volatility forecast will be at least as big as the R² between realized volatility and
any of the individual forecasts. The estimated coefficients in this regression, after
normalization so that they sum to one, will be the optimal weights to use in the combined
forecast.
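The regression just described can be sketched in a few lines. The example below assumes NumPy is available and uses synthetic data: the "realized" series and the three noisy "forecasts" are invented purely to show the mechanics and have no connection with the series analysed in this chapter.

```python
import numpy as np

def combine_forecasts(realized, forecasts):
    """OLS regression of realized volatility on a constant and the forecasts.

    realized  : array of shape (T,)
    forecasts : array of shape (T, m), one column per forecasting model
    Returns the coefficients (intercept first), the fitted values and R^2.
    """
    X = np.column_stack([np.ones(len(realized)), forecasts])
    coefs, *_ = np.linalg.lstsq(X, realized, rcond=None)
    fitted = X @ coefs
    ss_res = np.sum((realized - fitted) ** 2)
    ss_tot = np.sum((realized - realized.mean()) ** 2)
    return coefs, fitted, 1.0 - ss_res / ss_tot

# Synthetic illustration (hypothetical data, not from the text)
rng = np.random.default_rng(0)
T = 500
true_vol = 10.0 + 2.0 * np.sin(np.linspace(0.0, 12.0, T))
realized = true_vol + rng.normal(0.0, 1.0, T)
forecasts = np.column_stack(
    [true_vol + rng.normal(0.0, noise, T) for noise in (1.0, 2.0, 3.0)]
)

coefs, fitted, r2 = combine_forecasts(realized, forecasts)
weights = coefs[1:] / coefs[1:].sum()  # slope coefficients normalized to sum to one
print("combined R^2:", round(r2, 3))
print("normalized weights:", np.round(weights, 3))
```

Regressing on each forecast alone (one column at a time) gives the individual R² values, none of which can exceed the in-sample R² of the combined regression.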

Figure 5.3 shows several different forecasts of US dollar exchange rate volatilities for
sterling, the Deutschmark and the Japanese yen. Some general observations on these figures
are as follows:
• The GARCH forecasts seem closer to the implied volatilities.
• The historic 30-day and the EWMA forecasts with λ = 0.94 are similar (because the half-
life of an EWMA with λ = 0.94 is about 30 days).
• The EWMA 90-day forecasts are out of line with the other 90-day forecasts (because the
EWMA model assumes a constant volatility, the 90-day forecasts are the same as the 30-
day forecasts).
• There is more agreement between the different 30-day forecasts than there is between the
different 90-day forecasts (because uncertainties increase with the risk horizon).
• The volatility forecasts differ most at times of great uncertainty in the markets, for
example during 1994 in sterling and during 1990 in the Deutschmark. On the other hand,
they can be very similar when nothing unusual is expected to happen in the market, for
example during 1990 in the yen.

17 Percentiles are invariant under monotonic differentiable transformations, so the confidence
limits for volatility are the square root of the limits for variance.

Figure 5.3a: 30-day Volatility Forecasts, GBP-USD (series: GARCH30, EWMA, HIST30, IMP30)

Figure 5.3b: 90-day Volatility Forecasts, GBP-USD (series: GARCH90, EWMA, HIST90, IMP90)

Figure 5.3c: 30-day Volatility Forecasts, DEM-USD (series: GARCH30, EWMA, HIST30, IMP30)

Figure 5.3d: 90-day Volatility Forecasts, DEM-USD (series: GARCH90, EWMA, HIST90, IMP90)

Figure 5.3e: 30-day Volatility Forecasts, JPY-USD (series: GARCH30, EWMA, HIST30, IMP30)

Figure 5.3f: 90-day Volatility Forecasts, JPY-USD (series: GARCH90, EWMA, HIST90, IMP90)

[Each panel plots volatility on a scale of 0 to 25 against time, May-88 to May-95.]


Now suppose that a combination of these forecasts is to be used as a single forecast for
realized volatility. Table 5.3 summarizes the results of the ordinary least squares regressions
of realized volatility on the GARCH, historic, EWMA and implied volatility forecasts shown
in Figure 5.3. Some general conclusions may be drawn from the estimated coefficients and
the t-statistics (shown in parentheses). Firstly, the GARCH forecasts take the largest weight in
the combined forecast, although they are not always the most significant. In fact implied
volatilities are often very significant, particularly for the 30-day realized volatilities. Historic
and EWMA forecasts have low weights, in fact they are often less than zero, so they appear to
be negatively correlated with realized volatility.19 In each case the fitted series gives the
optimal combined forecast for realized volatility. Note that the intercept is quite large and
negative in most cases, which indicates that the forecasts have a tendency to overestimate
realized volatility.

Table 5.3 (t-statistics in parentheses)

| | GBP 30-day | GBP 90-day | DEM 30-day | DEM 90-day | JPY 30-day | JPY 90-day |
|---|---|---|---|---|---|---|
| Intercept | −2.02 (−2.78) | 0.44 (0.35) | −3.73 (−3.83) | −3.06 (−1.41) | −6.08 (−6.23) | −4.67 (−2.09) |
| GARCH | 1.13 (6.58) | 1.02 (5.12) | 1.16 (8.33) | 1.48 (6.52) | 1.56 (10.36) | 1.49 (5.94) |
| Historic | −0.26 (−3.56) | 0.19 (4.48) | −0.10 (−1.36) | 0.11 (2.97) | 0.24 (3.36) | −0.24 (−5.90) |
| EWMA (λ = 0.94) | −0.24 (−1.77) | −0.14 (−1.45) | −0.27 (−2.35) | −0.03 (−0.59) | −0.81 (−7.19) | −0.10 (−1.57) |
| Implied | 0.50 (8.32) | −0.12 (−1.6) | 0.51 (9.21) | −0.27 (−4.3) | 0.54 (11.2) | 0.26 (5.00) |
| Est. s.e. | 2.87 | 2.47 | 2.83 | 2.22 | 2.65 | 2.20 |
| R² | 0.56 | 0.46 | 0.47 | 0.3 | 0.51 | 0.34 |

Figure 5.4a shows two combined forecasts of 30-day realized volatility of the GBP–USD rate.
Forecast 1 uses all four forecasts as in Table 5.3, and forecast 2 excludes the EWMA forecast
because of the high multicollinearity with the historic 30-day forecast (§A.4.1). In each case
the model was fitted up to April 1995 and the last 6 months of data are used to compare the
model predictions with the realized volatility out-of-sample in Figure 5.4b. Since the 30-day
realized volatility jumps up 30 days before a large market movement, it is a very difficult
thing to predict, particularly when markets are jumpy. There is, however, a reasonable
agreement between the forecast and the realized volatility during less volatile times, and the
out-of-sample period marked with a dotted line on the figures happened to be relatively
uneventful.

The 90-day realized volatility jumps up 90 days before a large market movement, so it is even
more difficult to predict. Not surprisingly the models in Table 5.3 have much lower R2 for 90-
day realized volatility in all three rates. Figure 5.4c shows the combined forecast from the
GBP–USD model in the second column of Table 5.3, and much of the time it performs rather
poorly.

19 The low and insignificant coefficients on the historic 30-day forecast and the EWMA are a result of
a high degree of multicollinearity of these variables (§A.4.1). In fact their unconditional correlation
estimate during the sample is over 0.95 in all three models. When there is a high level of correlation
between some explanatory variables the standard errors on coefficient estimators will be inflated, so
that t-statistics are depressed, and the danger is that the model will be incorrectly specified. However,
multicollinearity does not affect forecasts so it does no real harm to the combined forecast if both
30-day historic and EWMA forecasts are in the model.


Figure 5.4a: 30-day Realised Volatility and Combined Forecasts of GBP-USD
[Plotted on a scale of 0 to 25 against time, May-88 to May-95.]

Figure 5.4b: Out-of-Sample Performance of GBP-USD Combined Forecast
[Plotted against time, 26-Apr-95 to 5-Jul-95.]

Figure 5.4c: 90-day Realised Volatility and the Combined Forecast of GBP-USD
[Series: 90-day Realised Volatility, Combined Forecast; plotted against time, May-88 to May-95.]

The lack of precision of these models is reflected by confidence intervals for realized
volatility which turn out to be rather wide, because of the large standard error in the models.
The linear regression approach to the construction of combined volatility forecasts allows one
to use standard results on confidence intervals for regression predictions (Appendix A.5.1) to
construct confidence intervals for volatility forecasts. For example, a two-sided confidence
interval for the realized volatility will be

$$
(\hat\sigma_t - \xi_t,\ \hat\sigma_t + \xi_t), \tag{5.5}
$$

where $\hat\sigma_t$ is the combined volatility forecast value at time t, and

$$
\xi_t = Z_\alpha\, s\,\sqrt{1 + R_t'(X'X)^{-1}R_t},
$$

where Zα is an appropriate critical value (in this case normal since very many points were
used for the regression), s is the estimated standard error of the regression, X is the matrix of
in-sample data on the different volatility forecasts and Rt = (1, σ̂ 1t, σ̂ 2t, … , σ̂ mt)´ the vector
of the individual volatility forecasts at the time when the combined forecast is made.

As an example, return to the realized volatility forecasts shown in Figure 5.3. The X matrix for
the GBP 30-day realized ‘combined forecast 2’ in Figure 5.4a contains data on 1s (for the
constant), 30-day GARCH volatility, 30-day historic and 30-day implied, and

$$
(X'X)^{-1} =
\begin{pmatrix}
0.04200 & -0.00543 & 0.00274 & -0.00110 \\
-0.00543 & 0.00115 & -0.0066 & 0.00001 \\
0.00274 & -0.0066 & 0.00042 & -0.00021 \\
-0.00110 & 0.00001 & -0.00021 & 0.00011
\end{pmatrix}.
$$

The vector Rt depends on the time period. For example, on 7 July 1995 it took the value (1,
9.36, 9.55, 8.7)´ and so Rt´(X´X)⁻¹Rt = 0.0023. Since the estimated in-sample standard error of
regression was s = 2.936, and the 5% critical value of N(0,1) is 1.645, the value of ξt for a 90%
confidence interval based on (5.5) for the true realized volatility on 7 July 1995 was 4.83. The
point prediction of realized volatility on that day was 8.44, so the interval prediction is 8.44 ±
4.83. Therefore from this model one can be 90% sure that realized volatility would be
between 3.61% and 13.27%. Similarly, a 95% confidence interval is (2.68%, 14.2%). These
interval predictions are very imprecise. Similar calculations for the other rates and for 90-day
volatilities show that all interval predictions are rather wide. This is because the standard
errors of regression are relatively large and the goodness of fit of the models is not
particularly good.
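The arithmetic of this example can be checked with a short calculation. The sketch below takes the quadratic form Rt´(X´X)⁻¹Rt = 0.0023 as given from the text rather than recomputing it from the rounded matrix entries; the function name is illustrative and the critical values come from Python's standard library.

```python
import math
from statistics import NormalDist

def prediction_interval(point, s, quad_form, level):
    """Two-sided interval point +/- xi, where xi = Z * s * sqrt(1 + quad_form).

    point     : combined volatility forecast
    s         : standard error of the regression
    quad_form : value of R'(X'X)^(-1)R at the forecast date
    level     : confidence level, e.g. 0.90
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal critical value
    xi = z * s * math.sqrt(1.0 + quad_form)
    return point - xi, point + xi, xi

# Values quoted in the text for 7 July 1995
lo90, hi90, xi90 = prediction_interval(8.44, 2.936, 0.0023, 0.90)
lo95, hi95, xi95 = prediction_interval(8.44, 2.936, 0.0023, 0.95)
print(f"90% interval: ({lo90:.2f}, {hi90:.2f})")
print(f"95% interval: ({lo95:.2f}, {hi95:.2f})")
```

Because the quadratic form is so small here, the width of the interval is driven almost entirely by the standard error of the regression.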

## 5.3 Consequences of Uncertainty in Volatility and Correlation

Volatility is very difficult to predict − and correlation perhaps even more so. This section
considers how one should account for the uncertainty in volatility and correlation when
valuing portfolios. When portfolios of financial assets are valued, they are generally marked-
to-market. That is, a current market price for each option in the portfolio is used to value the
portfolio. However, there may not be any liquid market for some assets in the portfolio, such
as OTC options. These must be valued according to a model, that is, they must be marked-to-
model (MtM). Thus the MtM value of a portfolio often contains marked-to-model values as
well as marked-to-market values.

There are uncertainties in many model parameters; volatility and correlation are particularly
uncertain. When uncertainty in volatility (or correlation) is captured by a distribution of the
volatility (or correlation) estimator, this distribution will result in a distribution of mark-to-
model values. In this section the MtM value of an options portfolio is regarded as a random
variable. The expectation of its distribution will give a point estimate of MtM value.
However, unlike the usual MtM value, this expectation will be influenced by the
uncertainty in the volatility parameter. The adjustment to the usual value will be greatest for
OTM options; on the other hand, the variance of the adjusted MtM value will be greatest for
ATM options.


We shall show that volatility and correlation uncertainty give rise to a distribution in MtM
values. Distributions of MtM values are nothing new. This is exactly what is calculated in
VaR models. However, in VaR models the value distribution arises from variations in the risk
factors of a portfolio, such as the yield curves, exchange rates or equity market indices. This
section discusses how to approach the problem of generating a value distribution where the
only uncertainty is in the covariance matrix forecast.

## 5.3.1 Adjustment in Mark-to-Model Value of an Option

In the first instance let us consider how the uncertainty in a volatility forecast can affect the
value of an option. Suppose that the volatility forecast is expressed in terms of a point
prediction and an estimated standard error of this prediction. The point prediction is an
estimate of the mean E(σ) of the volatility forecast, and the square of the estimated standard
error gives an estimate of the variance V(σ) of the volatility forecast. Denote by f(σ) the value
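of the option as a function of volatility, and take a second-order Taylor expansion of f(σ)
about E(σ). Taking expectations, the first-order term vanishes because E(σ − E(σ)) = 0, leaving

$$
E(f(\sigma)) \approx f(E(\sigma)) + \tfrac{1}{2}\,\frac{\partial^2 f}{\partial\sigma^2}\,V(\sigma), \tag{5.7}
$$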

and this can be approximated by putting in the point volatility prediction for E(σ) and the
square of the estimated standard error of that prediction for V(σ).

It is common practice for traders to plug a volatility forecast value into f(σ) and simply read
off the value of the option. But (5.7) shows that when the uncertainty in the volatility forecast
is taken into account, the expected value of the option requires more than just plugging the
point volatility forecast into the option valuation function. The extra term on the right-hand
side of (5.7) depends on ∂²f/∂σ² (see footnote 20). For the basic options that are usually priced
using the Black–Scholes formula, ∂²f/∂σ² is generally positive when the option is OTM or ITM, but
when the option is nearly ATM then ∂²f/∂σ² will be very small (and it may be very slightly negative).
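The sign pattern of ∂²f/∂σ² can be illustrated for a European call under Black–Scholes, where the second derivative with respect to volatility has the closed form vega × d1 × d2 / σ. The sketch below applies the adjustment in (5.7) to an at-the-money and an out-of-the-money strike. All inputs (spot 100, volatility forecast 20% with a standard error of 4%, three months to expiry, zero interest rate) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal distribution

def d1_d2(s, k, vol, tau, r=0.0):
    d1 = (math.log(s / k) + (r + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    return d1, d1 - vol * math.sqrt(tau)

def call_price(s, k, vol, tau, r=0.0):
    """Black-Scholes price of a European call (no dividends)."""
    d1, d2 = d1_d2(s, k, vol, tau, r)
    return s * N.cdf(d1) - k * math.exp(-r * tau) * N.cdf(d2)

def second_vol_derivative(s, k, vol, tau, r=0.0):
    """d^2(price)/d(vol)^2 = vega * d1 * d2 / vol."""
    d1, d2 = d1_d2(s, k, vol, tau, r)
    vega = s * N.pdf(d1) * math.sqrt(tau)
    return vega * d1 * d2 / vol

def adjusted_value(s, k, vol_hat, vol_se, tau, r=0.0):
    """Plug-in price plus the second-order adjustment in (5.7)."""
    return (call_price(s, k, vol_hat, tau, r)
            + 0.5 * second_vol_derivative(s, k, vol_hat, tau, r) * vol_se ** 2)

for strike in (100.0, 120.0):  # ATM and OTM strikes (hypothetical)
    plug_in = call_price(100.0, strike, 0.20, 0.25)
    adjusted = adjusted_value(100.0, strike, 0.20, 0.04, 0.25)
    print(f"K = {strike:.0f}: plug-in {plug_in:.4f}, adjusted {adjusted:.4f}")
```

With these inputs the ATM second derivative is slightly negative, because d1 and d2 have opposite signs when spot equals strike and the rate is zero, whereas the OTM second derivative is clearly positive, so only the OTM price is revised upwards by a noticeable amount.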

We have already seen that Black–Scholes ‘plug-in’ option prices for OTM and ITM options
are too low and that this is one reason for the smile effect (§2.2.1). The adjustment term in
(5.7) means that when some account is taken of the uncertainty in forecasts of volatility, the
Black–Scholes ‘plug-in’ price will be revised upwards. On the other hand, the Black–Scholes
price of an ATM option will need negligible revision. It may be revised downwards, but only
by a very small amount. We shall return to this problem in §10.3.3, when the volatility
uncertainty will be modelled by a mixture of normal densities. The empirical examples given
there will quantify the adjustment for option prices of different strikes and it will be seen that
simple ATM options will require only very small adjustments, if any, because they are
approximately linear in volatility.

20 This is similar to the new Greek ‘psi’ that was introduced by Hull and White (1997) to capture
kurtosis sensitivity in options (§10.3.3).

However, the variance of the model value due to uncertain volatility will be greatest for ATM
options. To see this, take variances of the Taylor expansion of f(σ) about E(σ), keeping only
the leading term:

$$
V(f(\sigma)) \approx \left(\frac{\partial f}{\partial\sigma}\right)^2 V(\sigma). \tag{5.8}
$$

This shows that the variance of the value is proportional to the variance of the volatility
forecast, and it also increases with the square of the option volatility sensitivity, that is, the
option vega (§2.3.3). Options that are near to ATM will have the largest contribution to the
variance of the option portfolio value due to uncertain volatility, since they have the greatest
volatility sensitivity.
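Taking the square root of (5.8) gives a standard error for the option value of |vega| times the standard error of the volatility forecast. The sketch below evaluates this for three strikes under Black–Scholes. The inputs are hypothetical (spot 100, volatility forecast 20% with a standard error of 2%, three months to expiry, zero interest rate).

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal distribution

def bs_vega(s, k, vol, tau, r=0.0):
    """Black-Scholes vega of a European option (no dividends)."""
    d1 = (math.log(s / k) + (r + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    return s * N.pdf(d1) * math.sqrt(tau)

def value_std_error(s, k, vol_hat, vol_se, tau, r=0.0):
    """Square root of (5.8): |vega| times the s.e. of the volatility forecast."""
    return abs(bs_vega(s, k, vol_hat, tau, r)) * vol_se

for strike in (80.0, 100.0, 120.0):  # ITM, ATM and OTM calls (hypothetical)
    se = value_std_error(100.0, strike, 0.20, 0.02, 0.25)
    print(f"K = {strike:.0f}: s.e. of option value = {se:.4f}")
```

The near-ATM strike has the largest standard error because vega peaks close to the money.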

The adjustments that need to be made to the value of a portfolio of options to account for
uncertainty in the correlation forecasts are not easy to express analytically. A simple graphical
method is to plot the value as a function of correlation and examine its convexity in the local
area of the point correlation forecast. Figure 5.5a illustrates a hypothetical value as a function
of correlation. If the correlation forecast is ρ̂1 then the value should be adjusted downwards
for uncertainty in correlation because the function is concave at this point, but if the
correlation forecast is ρ̂ 2 then the value should be adjusted upwards for uncertainty in
correlation because the function is convex at this point. Figure 5.5b shows that the amount of
this adjustment depends on the degree of uncertainty in correlation and the shape of the value
as a function of correlation. If correlation is known to be ρ then the option value is V(ρ).
However, if correlation is unknown, suppose it takes the value ρ1 with probability p and the
value ρ2 with probability 1− p: in this case, because the option value is convex in correlation,
the option value pV(ρ1) + (1 − p)V(ρ2) is greater than V(ρ). Thus the uncertainty in
correlation, in this case, would lead to an upward adjustment in the option value.
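The two-point argument is an instance of Jensen's inequality and can be checked numerically. In the sketch below the value functions are hypothetical quadratics, chosen only because one is convex and the other concave in correlation, and the point forecast ρ is assumed to be the probability-weighted mean of ρ1 and ρ2.

```python
def two_point_adjustment(value_fn, rho1, rho2, p):
    """Expected value under a two-point correlation distribution minus the
    value at the mean correlation (assumed here to be the point forecast)."""
    rho_bar = p * rho1 + (1.0 - p) * rho2
    expected = p * value_fn(rho1) + (1.0 - p) * value_fn(rho2)
    return expected - value_fn(rho_bar)

def convex_value(rho):
    # Hypothetical value function, convex in correlation
    return 10.0 + 4.0 * rho ** 2

def concave_value(rho):
    # Hypothetical value function, concave in correlation
    return 10.0 - 4.0 * rho ** 2

up = two_point_adjustment(convex_value, 0.2, 0.6, 0.5)
down = two_point_adjustment(concave_value, 0.2, 0.6, 0.5)
print("adjustment when value is convex: ", round(up, 4))
print("adjustment when value is concave:", round(down, 4))
```

The adjustment is positive in the convex case and negative in the concave case, and in both cases it grows as ρ1 and ρ2 move further apart, that is, as the uncertainty in correlation increases.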

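Figure 5.5a: Option Value as a Function of Correlation
[Value plotted against correlation, with ρ1 and ρ2 marked.]

Figure 5.5b: Adjustment for Uncertainty in Correlation
[Value plotted against correlation, with ρ1, ρ and ρ2 marked and the levels pV(ρ1) + (1 − p)V(ρ2) and V(ρ) indicated.]

[Figure 5.5c (caption not recovered): value plotted against correlation for Option 1 and Option 2.]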


Rather than making an ad-hoc adjustment in portfolio values, it may be preferable to hedge
this correlation risk by taking a position with the opposite characteristics. For example, if
value is a concave function of correlation this portfolio could be hedged with a portfolio that
has a value which is a convex function of correlation, as depicted in Figure 5.5c.
Unfortunately, it is not always possible to find such products in the market, although recently
there has been some growth in the markets for OTC products such as currency correlation
swaps to hedge correlations between two currency pairs.

Before ending this section, it is worthwhile to comment that the method for adjusting
portfolio values due to uncertain correlation that is depicted in Figure 5.5 should be applied to
all parameters in the portfolio pricing model. Volatility is easier, because the adjustments
have a simple analytic form described by (5.7) and (5.8). But for correlation, and all other
parameters, it is worthwhile to investigate the degree to which they affect the value and, if
necessary, make adjustments for uncertainties in parameter estimates.

## 5.3.2 Uncertainty in Dynamically Hedged Portfolios

How much does it matter if there are errors in the implied volatilities that are used to calculate
deltas for dynamic hedging of an option portfolio? It is possible to answer this question with
an empirical analysis, as the following example shows. Consider a short position on a 90-day
call option that is delta hedged daily using an implied volatility of 15%. The process volatility
is unknown, of course, but suppose you believe that it could take any value between 8% and
18% and that each value is equally likely. What is the error in the value of the delta hedged
portfolio?

For a given process volatility we can estimate the value of the option in h days’ time, and so
also the value of the delta hedged portfolio. Using 1000 Monte Carlo simulations on the
underlying price process, with a fixed process volatility, we obtain an expected value for the
option (taking the discounted expected value of the pay-off distribution as described in
§4.4.2) and a measure of uncertainty in this option value (either the standard error of the
option value distribution over 1000 simulations, or the 1% lower percentile of this
distribution21). In each simulation, the evolution of the underlying price and the fixed delta
(corresponding to an implied volatility of 15%) allow us to compute the expected present
value of the delta hedged portfolio, along with a measure of uncertainty (standard error or
percentile). We can do this for each process volatility x% (x = 8, 9, … , 18) and thus generate
a curve that represents the expected value of the hedged portfolio as a function of the process
volatility. Then, taking the time horizons h = 1, 2, …, 90 gives a surface that represents the
expected value of the hedged portfolio as a function of the process volatility x and time
horizon h. For the uncertainty around this surface, the 1% lower percentile of the value
change distribution is a good measure since it corresponds to the 1% h-day VaR measure.
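A simplified version of this experiment can be sketched with the standard library alone. The sketch makes several assumptions that are not in the text: the option is at the money with spot and strike equal to 100, interest rates and drift are zero, a year has 250 trading days, and the hedged position is evaluated only at expiry (h = 90) rather than at every horizon. The hedge is rebalanced daily using the Black–Scholes delta computed at the 15% hedging volatility, while the underlying follows a geometric Brownian motion with the chosen process volatility.

```python
import math
import random
from statistics import NormalDist

N = NormalDist()   # standard normal distribution
DT = 1.0 / 250.0   # one trading day in years (250-day year assumed)

def d1(s, k, vol, tau):
    return (math.log(s / k) + 0.5 * vol ** 2 * tau) / (vol * math.sqrt(tau))

def call_price(s, k, vol, tau):
    """Black-Scholes call price with a zero interest rate (assumed)."""
    x = d1(s, k, vol, tau)
    return s * N.cdf(x) - k * N.cdf(x - vol * math.sqrt(tau))

def hedged_pnl(process_vol, hedge_vol=0.15, s0=100.0, k=100.0,
               days=90, n_paths=1000, seed=0):
    """Terminal P&L of a short call that is delta hedged daily at hedge_vol
    while the underlying evolves with volatility process_vol."""
    rng = random.Random(seed)
    premium = call_price(s0, k, hedge_vol, days * DT)  # received for the call
    results = []
    for _ in range(n_paths):
        s, hedge_gain = s0, 0.0
        for day in range(days):
            tau = (days - day) * DT                    # time left to expiry
            delta = N.cdf(d1(s, k, hedge_vol, tau))    # delta at the hedging volatility
            z = rng.gauss(0.0, 1.0)
            s_next = s * math.exp(-0.5 * process_vol ** 2 * DT
                                  + process_vol * math.sqrt(DT) * z)
            hedge_gain += delta * (s_next - s)         # gain on the stock held
            s = s_next
        results.append(premium + hedge_gain - max(s - k, 0.0))
    return results

def summarize(pnl):
    """Mean, standard deviation and 1% lower percentile of the P&L."""
    ordered = sorted(pnl)
    mean = sum(ordered) / len(ordered)
    sd = math.sqrt(sum((x - mean) ** 2 for x in ordered) / (len(ordered) - 1))
    return mean, sd, ordered[int(0.01 * len(ordered))]

for vol in (0.08, 0.15, 0.18):
    mean, sd, pct1 = summarize(hedged_pnl(vol))
    print(f"process vol {vol:.0%}: mean {mean:.3f}, s.d. {sd:.3f}, 1% percentile {pct1:.3f}")
```

Repeating the calculation for each process volatility from 8% to 18%, and for each horizon if the position is revalued before expiry, would trace out a surface of the kind described above.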

Figure 5.6 shows the 1% VaR for each possible process volatility between 8% and 18% and
for each holding period from 1 to 90 days. The figure shows that VaR measures can be high
even when process volatility is low, if the option is hedged for a long period using the wrong
hedging volatility. So much is obvious, but what is more interesting is that the VaR measures
are much greater when the process volatility is above 15%. That is, when the (wrong) hedging
volatility of 15% is underestimating the process volatility, some VaR measures can be very
considerable indeed.

There is a simple explanation for the shape of this surface. With continuous rebalancing, the
variance of a delta hedged portfolio due to using the wrong hedging volatility is
approximately (σh − σ0)²(vega)², where σh denotes the implied volatility used for hedging and
σ0 denotes the process volatility.22 This is only a local approximation. In fact the standard
error of the hedged portfolio value has the asymmetric shape shown in Figure 5.7.

[Figure 5.6 (caption not recovered): surface plotted against process volatility (%) and time (days); vertical axis scaled by 10⁴.]

The standard error of the value of the portfolio (option plus hedge) is zero if σh = σ0. If the
hedging volatility is correct the option can be perfectly hedged. But σ0 is unknown, and if the
implied (hedging) volatility is not equal to the process volatility, the standard error of the
portfolio value increases with the volatility error, as shown in Figure 5.7. It is not symmetric:
if the process volatility is less than the hedging volatility then there is less uncertainty in the
value of the hedged portfolio; if the process volatility is greater than the hedging volatility
then there is more uncertainty in the value of the hedged portfolio. This means that if one
does not know whether the hedging volatility is accurate, it is better to use a higher value than
a lower value. That is, there will be less uncertainty in the value of the hedged portfolio if one
over-hedges the position.

Figure 5.7: The Effect of Using an Incorrect Hedging Volatility
[Standard error of the hedged portfolio plotted against hedging volatility, with σ0 marked on the horizontal axis.]

22 Discrete rebalancing increases the standard error of the hedged portfolio by a factor that depends on
1/√(2n), where n is the number of rebalancings, and on the process volatility; the standard error curve
still has an asymmetric smile shape.