Mandira Sarma∗†

∗An earlier draft of this paper was presented at the Fifth Capital Markets Conference at the UTI Institute of Capital Markets, Mumbai, India. I thank the participants of the conference for helpful comments and suggestions. All errors are mine.

†Address: Indira Gandhi Institute of Development Research, Goregaon (East), Mumbai 400 065. Phone: +91-22-840-0919. Fax: +91-22-840-2752. Email: mandira@igidr.ac.in
Abstract

Financial risk management is about understanding the large movements in the market values of asset portfolios. The conventional approach to estimating market risk measures relies on assuming a Gaussian distribution for the innovations of the return series. This approach may lead to faulty risk measures if the innovation distribution is non-Gaussian, which is often the case for financial series. This paper uses extreme value theory to explicitly model the tail regions of the innovation distribution of the return series of a prominent Indian equity index, the S&P CNX Nifty. We find that the lower tail of the Nifty innovations behaves very much like the lower tail of the standard Gaussian curve, while the upper tail has significant "tail thickness". This inherent asymmetry and the existence of tail thickness can provide valuable information to the risk manager. The EVT-based tail quantiles have been used to make daily forecasts of Value-at-Risk at the 95% and 99% levels for a long and a short position in the Nifty portfolio. These forecasts are found to provide statistically sound risk measures for the portfolio under consideration.
KEY WORDS: Extreme value theory; Value-at-Risk; pseudo maximum likelihood estimation
1 Introduction
Value-at-Risk (VaR) is widely used as a tool for measuring the market risk of asset portfolios. It is defined as the maximum monetary loss of a portfolio such that the likelihood of experiencing a loss exceeding that amount, due to its exposure to market movements over a specified horizon, equals a pre-specified tolerance level.
Extreme value theory (EVT) deals with the study of the asymptotic behaviour of extreme
(maxima and minima) observations of a random variable. Financial risk management is all
about understanding the large movements in the values of asset portfolios. It essentially
deals with the analysis of the tail regions of the distribution of changes in the market value of
the portfolio. Extreme value theory, by dealing with only extreme observations, can provide
a better treatment of the estimation of tail quantiles like VaR. In conventional techniques of measuring risk, inferences about the tail region are made after estimating the entire return distribution. In such an approach, the observations in the interior of the distribution dominate the estimation process: since extreme observations constitute only a small part of the data, their contribution to the estimation is relatively smaller than that of the observations in the central part of the distribution. Therefore, in such an approach the tail regions are not accurately estimated.
Extreme value theory, on the other hand, focuses primarily on analysing the extreme observations rather than the observations in the central region of the distribution. The theory provides robust tools for estimating only the tails by making use of the available data. Tail quantiles like VaR can therefore be estimated more accurately using EVT than using conventional approaches.
Another appealing aspect of EVT is that it does not require an a priori assumption
about the return distribution. The fundamental result of extreme value theory, known as the
“extremal types theorem”, identifies the possible classes of distributions for movements of
the extreme returns irrespective of the actual underlying return distribution. This extremely
powerful result of the extreme value theory makes the VaR estimation process free from any a priori assumption about the portfolio return distribution. Moreover, EVT-based methods inherently incorporate separate estimation of the upper and the lower tails, and thereby emphasise the need to treat the two tails separately owing to possible asymmetry in the return series. This becomes important when estimating VaR measures for long and short positions. Conventional models of VaR estimation treat both tails symmetrically, and hence the VaR for the long and short positions are assumed to be equal in magnitude.
This paper uses recent developments in extreme value theory to analyse the tails of
the innovation distribution of the Nifty returns. Using the “Peaks-Over-Threshold” (POT)
model (McNeil and Frey, 1999) we estimate the tail regions of the innovation distribution of
the Nifty returns. We find that the lower tail of the Nifty innovations behaves very much like a Gaussian tail, whereas the upper tail behaves significantly differently from a Gaussian tail. The upper tail is found to exhibit significant "tail thickness", which indicates the existence of asymmetry in the innovation distribution. The existence of asymmetry and tail thickness in the innovation distribution provides valuable information when estimating risk measures for long and short positions.
The rest of this paper is organised as follows. Sections 2 and 3 present an overview of extreme
value theory and its application in financial risk management. Section 4 describes the “Peaks-
Over-Threshold (POT)” model used in this paper. Section 5 provides an empirical analysis
of the tails of the Nifty innovations. In Section 5.4, daily 99% and 95% VaR forecasts for a short and a long Nifty position are estimated by using the estimated tail quantiles. These measures are tested for the existence of "correct conditional coverage" in Section 6.1.
2 Extreme value theory

The classical extreme value theory (EVT) deals with the study of the asymptotic behaviour of the extreme (maxima and minima) observations of a random variable.
Suppose that X ∈ (l, u) is a random variable with density f and cdf F. Let X₁, X₂, …, Xₙ be a sequence of i.i.d. random variables with the same distribution as X, and define

Yₙ = max{X₁, X₂, …, Xₙ}
Zₙ = min{X₁, X₂, …, Xₙ}
The extreme value theory deals with the distributional properties of Yn and Zn as n becomes
large.
It can easily be shown that the exact distributions of the extreme observations are degenerate in the limit. In order to obtain a non-degenerate distribution of interest, the extrema Yₙ and Zₙ are transformed with a location parameter aₙ ∈ R and a scale parameter bₙ (> 0), such that the limiting distribution of

(Yₙ − aₙ)/bₙ and (Zₙ − aₙ)/bₙ

is non-degenerate.
The two extremes, the maximum and the minimum, are related by the following relation:

Zₙ = min{X₁, …, Xₙ} = −max{−X₁, …, −Xₙ}

Therefore, every result for the distribution of maxima leads to an analogous result for the distribution of minima and vice versa. We will discuss the results for maxima only and ignore the minima.
2.1 The Fisher-Tippett Theorem
The Fisher-Tippett theorem (1928) is a fundamental result in EVT. The importance of this
result is that it exhibits the possible limiting forms for the distribution of Yn under linear
transformations even without the exact knowledge of the underlying distribution F . The
"Fisher-Tippett Theorem", also known as the "extremal types theorem", states thus: if there exist constants bₙ > 0 and aₙ ∈ R such that

(Yₙ − aₙ)/bₙ →d H as n → ∞

for some non-degenerate distribution H, then H must be one of only three possible 'extreme value distributions'.
In that case, X (and the underlying distribution F) is said to belong to the (maximum) domain of attraction of H, written F ∈ DA(H). More specifically, this basic result states that if, for some suitable normalising constants bₙ (> 0) and aₙ, the transformed maxima (Yₙ − aₙ)/bₙ have a non-degenerate limiting distribution function H(x), then H must have one of only three possible "forms". The limit laws for maxima were derived by Fisher and Tippett (1928). A first rigorous proof is due to Gnedenko (1943); De Haan (1970) and Weissman (1977) subsequently provided simpler proofs.
The three possible probability laws for suitably normalised extrema are: the Gumbel (or Type I) distribution, the Fréchet (or Type II) distribution and the Weibull (or Type III) distribution². The Gumbel distribution is the limit law for thin-tailed distributions such as the Gaussian, while the Fréchet distribution is the limit distribution for fat-tailed distributions such as Student's-t or the Stable Paretian distributions. The marginal distribution of a stationary GARCH process is also in the domain of attraction of the Fréchet family. Finally, the Weibull distribution is obtained when the distribution of X has a finite right endpoint, as in the case of the uniform distribution.
Von Mises (1976) gives necessary and sufficient conditions for a distribution F to belong to
the domain of attraction of a particular extreme value distribution. Using these conditions
it can be established that there exist suitable normalising constants for certain well known distributions³. For example,

• Gaussian (and other thin-tailed distributions such as the exponential distribution) ∈ DA(Gumbel)
• Poisson, Geometric ∉ any domain of attraction
The three families of extreme value distributions, viz. the Gumbel, the Fréchet and the
Weibull, can be nested into a single parametric representation, as shown by Jenkinson and Von
Mises. This representation is known as the “Generalised Extreme Value” (GEV) distribution,
and given by
Hξ(x) = exp{−(1 + ξx)^(−1/ξ)}    (1)

where 1 + ξx > 0.
The support of x is:

x > −1/ξ if ξ > 0
x < −1/ξ if ξ < 0
x ∈ R if ξ = 0
³Leadbetter et al. (1983, Chap. 1) and Embrechts et al. (1997, Chap. 3) discuss the Von Mises conditions and derive norming constants for each specific distribution to belong to a particular domain of attraction.
The parameter ξ, called the tail index, models the distribution tails. Each of the three extreme value distributions can be obtained as a special case of the GEV distribution: when ξ > 0 we get the Fréchet distribution, when ξ < 0 we get the Weibull distribution, and ξ = 0 corresponds to the Gumbel distribution.
belong to the domain of attraction of a single family Hξ , the extreme value distributions
being differentiated only by the value of ξ. This shows the generality of the extremal types
theorem.
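As a quick illustration of the extremal types theorem (not part of the original analysis), the following Python sketch simulates block maxima of exponential variates, for which the normalising constants aₙ = log n, bₙ = 1 are known, and compares the empirical distribution of the normalised maxima with the Gumbel cdf. Block count and block size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# For Exponential(1) variates, (Y_n - log n) converges in distribution to
# the Gumbel law H(x) = exp(-exp(-x)).
n_blocks, block_size = 5000, 1000
maxima = rng.exponential(1.0, size=(n_blocks, block_size)).max(axis=1)
normed = maxima - np.log(block_size)        # a_n = log n, b_n = 1

def gumbel_cdf(x):
    return np.exp(-np.exp(-x))

# The empirical cdf of the normalised maxima should track the Gumbel cdf
x0 = 1.0
empirical = (normed <= x0).mean()
print(empirical, gumbel_cdf(x0))
```

The underlying distribution (exponential) is thin-tailed, so the limit is Gumbel; replacing it with Student's-t draws would pull the normalised maxima toward the Fréchet family instead.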
2.2 The excess distribution

Let X be a random variable with distribution function F(x). Let u be the finite or infinite right endpoint of the distribution F. The distribution function of the excesses over a certain (high) threshold k is given by

Fk(x) = Pr{X − k ≤ x | X > k} = [F(x + k) − F(k)] / [1 − F(k)]

for 0 ≤ x < u − k.
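The excess distribution has a direct empirical counterpart. The sketch below computes the empirical Fk(x); the standard normal sample and the threshold value are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x_sample = rng.standard_normal(100_000)   # stand-in for observations on X
k = 1.65                                  # illustrative high threshold

# Excesses over the threshold: X - k for those X that exceed k
excesses = x_sample[x_sample > k] - k

def excess_cdf(x):
    """Empirical F_k(x) = P(X - k <= x | X > k)."""
    return (excesses <= x).mean()

print(excess_cdf(0.0), excess_cdf(10.0))
```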
2.3 The Pickands-Balkema-de Haan theorem

The Pickands-Balkema-de Haan theorem (Balkema & de Haan, 1974; Pickands, 1975) states that if the distribution function F ∈ DA(Hξ), then there exists a positive measurable function σ(k) such that

lim(k→u) sup(0≤x<u−k) |Fk(x) − Gξ,σ(k)(x)| = 0

and vice versa, where Gξ,σ(k)(x) denotes the Generalised Pareto distribution.
The above theorem states that as the threshold k becomes large, the distribution of the
excesses over the threshold tends to the Generalised Pareto distribution, provided the under-
lying distribution F belongs to the domain of attraction of the Generalised Extreme Value
distribution.
2.4 The Generalised Pareto Distribution (GPD)

The GPD is a two-parameter distribution with distribution function

Gξ,σ(x) = 1 − (1 + ξx/σ)^(−1/ξ)  if ξ ≠ 0
Gξ,σ(x) = 1 − exp(−x/σ)          if ξ = 0    (2)

where σ > 0, and the support of x is x ≥ 0 when ξ ≥ 0 and 0 ≤ x ≤ −σ/ξ when ξ < 0.
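A direct transcription of (2) into code (a sketch, with parameter values chosen only for illustration) makes the continuity of the GPD family at ξ = 0 easy to check numerically:

```python
import numpy as np

def gpd_cdf(x, xi, sigma):
    """Generalised Pareto distribution function G_{xi,sigma}(x) of eq. (2)."""
    x = np.asarray(x, dtype=float)
    if xi == 0.0:
        return 1.0 - np.exp(-x / sigma)
    return 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)

# As xi -> 0 the general case approaches the exponential case
print(gpd_cdf(1.0, 1e-8, 1.0))   # ~ 1 - exp(-1)
print(gpd_cdf(1.0, 0.0, 1.0))    # exactly 1 - exp(-1)
```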
3 EVT in financial risk management

There are two broad categories of approaches which use the results of extreme value theory in estimating the market risk of financial assets. The first, known as the 'Block Maxima Model (BMM)', utilises the 'extremal types theorem' to model the distribution of the extreme observations collected from non-overlapping blocks of a fixed size from the data. The 'generalised extreme value' distribution is then fitted to these block extrema. This distribution reflects the behaviour of very high profits (in the case of maxima) and very high losses (in the case of minima) from the portfolio.
For example, suppose the data consist of the daily returns of a particular portfolio and we are interested in analysing the lower tail of the portfolio return distribution. In this case, the BMM approach would involve fitting the GEV distribution Hξ(x) to the minimum observations collected from non-overlapping blocks over the entire sample. If the block size is 25 (a month), the 5th percentile of this distribution would give the magnitude of the daily loss that can be expected with probability 0.05, i.e. the daily loss level that one can expect to face once in 20 months. Such a value is known as the 'stress loss' with probability 0.05.
Longin (2000) proposes a method for VaR estimation using the BMM approach. He develops
a formula for VaR estimation by relating the distribution of the extremes and the distribu-
tion of the underlying returns in terms of the parameters of the ‘generalised extreme value’
distribution. This approach can be used even for stationary non-iid time series by estimating
an additional parameter called the 'extremal index'.
The above measures of market risk are essentially unconditional. These measures are constant
over the forecast period, and do not incorporate the changing time series dynamics of the
underlying returns.
The second approach, known as the 'Peaks-Over-Threshold (POT)' model, attempts to estimate the tails of the underlying return distribution, instead of modelling the distribution of the block extrema. In the POT model, a certain threshold is identified to define the starting point of the tail of the return distribution, and the distribution of the 'excesses' over the threshold is then estimated. There are two approaches to estimating the 'excess' distribution, viz. the semi-parametric models based on the Hill estimator (Danielsson and de Vries, 1997) and the fully parametric model based on the Generalised Pareto distribution (GPD) (McNeil and Frey, 1999). The Hill estimator based approach is limited in its application as it requires the assumption of fat tails of the underlying return distribution. On the other hand, the GPD version is applicable to any kind of distribution, fat-tailed or not. This approach utilises the Pickands-Balkema-de Haan theorem to fit a generalised Pareto distribution to the excesses over specific thresholds. The following sections describe the GPD approach of the POT model in detail.
4 The Peaks-Over-Threshold (POT) model

The POT model provides a framework for estimating the tails (positive or negative) of the return distribution by estimating the distribution of excesses over a high threshold. The distribution of excesses over a high threshold k on the portfolio's loss distribution F is defined, in terms of F, by

Φk(y) = [F(y + k) − F(k)] / [1 − F(k)]    (3)
for 0 ≤ y < u − k.

Thus, using the Pickands-Balkema-de Haan theorem, one can model the distribution of the excesses over the threshold k by a GPD, provided the threshold is sufficiently high, i.e. as k → u:

Φk(y) ≈ Gξ,σ(y)    (4)

Rearranging (3), the tail of the underlying loss distribution can then be written as

F(x) = [1 − F(k)] Gξ,σ(x − k) + F(k)    (5)

for x > k.
Using the historical simulation (HS) estimate for F(k) and ML estimates of the GPD parameters gives rise to the tail estimator

F̂(x) = 1 − (Nk/N) [1 + ξ̂ (x − k)/β̂]^(−1/ξ̂)    (6)

where N is the sample size and Nk is the number of observations beyond the threshold k.
For a given probability p > F(k), a tail quantile is estimated by inverting the tail estimator formula (6):

q̂p = k + (β̂/ξ̂) {[(N/Nk)(1 − p)]^(−ξ̂) − 1}    (7)
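Formula (7) is a one-liner in code. The sketch below uses made-up parameter values, not the estimates of Table 4, purely to illustrate the mechanics:

```python
def evt_quantile(p, k, xi_hat, beta_hat, N, N_k):
    """Tail quantile q_p obtained by inverting the tail estimator, eq. (7).

    p must satisfy p > F(k), i.e. 1 - p < N_k / N.  All parameter values
    used below are illustrative placeholders, not the paper's estimates.
    """
    return k + (beta_hat / xi_hat) * ((N / N_k * (1.0 - p)) ** (-xi_hat) - 1.0)

q95 = evt_quantile(0.95, k=1.65, xi_hat=0.2, beta_hat=0.5, N=1250, N_k=80)
q99 = evt_quantile(0.99, k=1.65, xi_hat=0.2, beta_hat=0.5, N=1250, N_k=80)
print(q95, q99)   # the 99% quantile lies further out in the tail
```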
5 A POT analysis of the Nifty tails

In this section, we present a POT analysis of the innovation distribution of the returns of the Nifty portfolio. Section 5.1 describes the data. Section 5.2 describes the estimation procedure in detail. Section 5.4 provides an example of VaR estimation using a two-stage methodology (McNeil and Frey, 1999).

5.1 Data
The data consist of 2,697 daily logarithmic returns of the Nifty portfolio (from 3 July 1990 to 15 March 2002). The first 1,250 observations (from 3 July 1990 to 7 May 1996) comprise the estimation window, and the remaining 1,446 observations (from 8 May 1996 to 15 March 2002) are used for making rolling window "out-of-sample" VaR forecasts.
5.2 Estimation procedure

As the POT model can be applied only to i.i.d. data, we first need to remove the time series dynamics from the return series and obtain an i.i.d. innovation series.

At first we try to ascertain the time series structure of the Nifty return series. A specification search in terms of the AIC and SBC criteria leads us to choose the AR(1)-GARCH(1,1) model as the appropriate specification.
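The mechanics of extracting (approximately) i.i.d. innovations from an AR(1)-GARCH(1,1) fit can be sketched as follows; the returns are simulated and the parameter values are placeholders, not the estimates reported in Table 1:

```python
import numpy as np

rng = np.random.default_rng(2)
r = rng.standard_normal(1250) * 0.015          # stand-in for Nifty returns

phi0, phi1 = 0.0, 0.1                          # mean equation (assumed values)
omega, alpha, beta = 1e-5, 0.08, 0.90          # volatility equation (assumed values)

eps = np.empty_like(r)
sig2 = np.empty_like(r)
sig2[0] = r.var()                              # initialise at the sample variance
eps[0] = r[0] - phi0
for t in range(1, len(r)):
    eps[t] = r[t] - phi0 - phi1 * r[t - 1]     # AR(1) residual
    sig2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sig2[t - 1]  # GARCH(1,1)

z = eps / np.sqrt(sig2)                        # standardised residuals
print(z.mean(), z.std())                       # roughly 0 and 1 if the fit is adequate
```

In practice the parameters would be estimated by (pseudo) maximum likelihood rather than fixed a priori; the loop above only shows how the fitted model filters the returns into innovations.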
Table 1 presents the estimated parameters of the mean and volatility equations of the Nifty returns. The constant term in the mean equation is found to be insignificant, although the AR(1) term is significant. The parameters of the volatility equation, viz. the constant, the ARCH(1) parameter and the GARCH(1) parameter, are all found to be significant.
We extract the standardised residuals from the estimated model and investigate how the moments of the residual series change after the time series dynamics of the Nifty are removed. Table 2 presents the values of the first four unconditional moments of the Nifty series and of the standardised residuals obtained from the AR(1)-GARCH(1,1) specification of the Nifty returns. The values of these descriptive statistics indicate the existence of positive skewness and leptokurtosis in both series.
Table 3 presents the values of the test statistics for testing the significance of skewness, excess kurtosis and autocorrelation (up to lag 35), along with their respective p-values, for the original return series and the residual series. The results in Table 3 indicate that the return series has significant skewness, excess kurtosis and autocorrelation. The residual series is found to have significant skewness and excess kurtosis but it does not possess significant autocorrelation. Thus, neither the return series nor the residual series can be considered normally distributed, since both series have significant positive skewness and excess kurtosis.

As Table 3 indicates, the residual series is found to be free from autocorrelation, and hence we treat it as an i.i.d. series for the EVT analysis.
The Pickands-Balkema-de Haan theorem offers the generalised Pareto distribution as a natural choice for the distribution of excesses (peaks) over sufficiently high thresholds. However, while choosing an appropriate threshold, one faces a trade-off between bias and variance. Theoretical considerations suggest that the threshold should be as high as possible for the Pickands-Balkema-de Haan theorem to hold good, but in practice too high a threshold might leave very few observations above it for estimating the GPD parameters⁶.
The GPD estimators are unbiased only as k → u, i.e. when the threshold is sufficiently high. However, if the threshold is chosen very high, there may be very few observations left for fitting
⁶For more on this issue, see McNeil and Frey (1999).
the GPD parameters to the tail, leading to statistical imprecision and very high variance of
the estimates.
There is no single correct choice of the threshold level. While McNeil and Frey (1999), McNeil (1996) and McNeil (1999) use the "mean-excess-plot" as a tool for choosing the optimal threshold level⁷, Gavin (2000) uses an arbitrary threshold at the 90% confidence level (i.e. the largest 10% of the positive and negative returns are considered extreme observations).
In this paper we follow Neftci (2000) and choose the threshold level as 1.65 times the unconditional standard deviation of the residuals⁸. This represents the 5% most extreme movements if the data were normally distributed. On both sides, observations lying beyond 1.65 standard deviations are treated as extreme observations⁹.
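The threshold rule and the resulting exceedances can be sketched as follows; the residual series here is simulated standard normal data, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.standard_normal(2697)      # stand-in for the standardised residuals

# Neftci-style rule: threshold at 1.65 unconditional standard deviations
k = 1.65 * z.std()

upper_excesses = z[z > k] - k      # peaks over the upper threshold
lower_excesses = -z[z < -k] - k    # lower tail handled via the negated series

print(k, len(upper_excesses), len(lower_excesses))
```

For truly Gaussian data, roughly 5% of the observations fall beyond the threshold on each side; the two excess samples would then each be fitted by a GPD via maximum likelihood.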
Table 4 presents the estimated threshold point, the number of extreme observations beyond the threshold, and the results of the maximum likelihood estimation of the GPD fitted to the excesses (peaks) over the chosen threshold, for the lower tail and the upper tail respectively of the i.i.d. residual series obtained by fitting the AR(1)-GARCH(1,1) model to the Nifty returns.
The first column of this table gives the estimated threshold points corresponding to the chosen
level of threshold. For the lower tail the threshold point is -1.6493 and for the upper tail it is
1.6494.
The second column of Table 4 gives the number of extreme observations beyond the thresholds on both the tails, the third column gives the estimated cdf at the thresholds, and the fourth and fifth columns present the ML estimates of the GPD parameters (with SEs in parentheses). The estimates of the GPD parameters can be used in the tail quantile estimation formula (7) to estimate extreme quantiles of the innovation distribution.
Table 5 presents some of the estimated quantiles on the lower tail and the upper tail along with the empirical quantiles and the corresponding quantiles on the standard normal curve¹⁰. The first column of this table indicates the probability levels and the second, third and fourth columns give the corresponding quantiles. The quantiles in the second column are estimated from the GPD approximation, those in the third column are the empirically observed quantiles, and those in the fourth column are the corresponding quantiles on the standard normal curve. Panel A of Table 5 provides the quantiles on the lower tail and Panel B those on the upper tail.

⁷Details on mean-excess-plots can be found in McNeil and Frey (1999) and Embrechts et al. (1997).
⁸We tried the mean excess plots but did not get a well behaved linear mean excess plot.
⁹For the analysis of the lower tail, i.e. the minima, we use the negative returns and then apply the results for maxima.
This table shows that the estimated tail quantiles are closer to the empirical quantiles than the quantiles from a normal approximation. This implies that a normal approximation of the underlying DGP would lead to misleading risk estimates. Particularly for the upper tail, the normal quantiles seem to underestimate the empirical distribution, while the EVT-based quantiles estimate it more precisely. This indicates the presence of the so-called 'fat-tailed' behaviour of financial series. The existence of 'fat tails' in this particular data set is also established by the significant excess kurtosis shown in Table 2.
It is interesting that the discrepancy between the normal quantiles and the empirical quantiles
is less in the case of the lower tail than the upper tail.
Another point worth noting is that the tail quantiles indicate the existence of asymmetry, as
the pth quantile is not equal in absolute value to the (1 − p)th quantile.
These aspects show that the assumption of normality does not reflect the actual riskiness of the portfolio.
¹⁰These quantiles can be considered as the unconditional VaR measures for the i.i.d. residual series obtained from three models – the EVT-based GPD approach, historical simulation and the normal distribution model.
5.3 The KS test for discrepancy

To test for the significance of the difference between the estimated and empirical tails, and to test whether the discrepancy between the normal approximation and the empirical quantiles is significant, we use the Kolmogorov-Smirnov (KS) test.

Suppose that F(x), G(x) and φ(x) denote the empirical, the estimated and the normal distribution functions respectively. The first hypothesis is

H0: F(x) = G(x)
H1: F(x) ≠ G(x)

This hypothesis tests whether the estimated quantiles are significantly different from the empirical quantiles. The hypothesis testing is done separately for the lower and the upper tail.
The second hypothesis tests whether the tails of the empirical distribution are significantly fatter than the corresponding normal tails:

H0′: F(x) = φ(x)
H1′: F(x) > φ(x)

and is tested using the one-sided KS statistic D⁺, again separately for the two tails.
Table 6 provides the estimated KS statistics for these hypotheses for the lower and the upper tails. The discrepancy between the estimated and empirical tails is found to be insignificant for both the tails, indicating that the GPD approximation fits the tails well.

The significance of D⁺ for the upper tail indicates that the empirical quantiles on the upper tail are significantly higher than the normal quantiles. This establishes the existence of significant "tail thickness" on the upper tail. However, the normal approximation does not lead to significant underestimation of the lower tail quantiles. This implies that the lower tail of the empirical distribution behaves much like the normal tail, while the upper tail is significantly fatter than the normal tail.
Figure 1 gives a plot of the estimated lower and upper tails of the Nifty innovation distribution, along with the empirical and the normal tails. For the lower tail, the fitted GPD as well as the normal approximation fit the data well, as there does not seem to be much discrepancy between the empirical tail and either the fitted or the normal tail. It is also clear that the GPD fit is good only beyond the threshold -1.6493, from where the lower tail is believed to start; the GPD does not fit the data inside the threshold level, i.e. in the middle part of the distribution.
For the upper tail, the tail fatness appears very prominently. In this case, the GPD approximation estimates the tail very well while the normal approximation underestimates it. Surprisingly, the GPD fits the distribution to a great extent even inside the threshold. While the normal approximation fails to capture the tail behaviour of the data on the upper tail, the extreme value theory model is able to capture both tails precisely.
The tail quantile estimates thus obtained can be translated back into the original return series, given estimates of the time-dependent mean and volatility. This idea has been developed in McNeil and Frey (1999) in their two-stage VaR estimation approach, as described below.
5.4 Estimating Value-at-Risk
VaR is a measure of extreme risk in terms of the unknown loss distribution F(x) of the portfolio under consideration. VaR is the pth quantile of the distribution F (where p is very small):

VaRp = F⁻¹(p)
Although EVT primarily deals with i.i.d. random variables, the recent work of McNeil and Frey (1999) has developed a procedure for applying this approach to stationary time series processes for conditional VaR estimation. This approach of VaR estimation is explained below.
Let {Xt} be a strictly stationary time series whose dynamics are given by

Xt = µt + σt Zt,    Zt ∼ fZ(z)    (8)

The pth quantile of the distribution of Xt at time t can be obtained from that of Zt as

xₚᵗ = µt + σt zp    (9)
McNeil and Frey (1999) propose the following approach to estimate VaR for financial returns:

1. Fit a time series model to the return series using a pseudo-maximum likelihood (PML) estimator, using normality for fZ(z). Estimate µt and σt from the fitted model and extract the standardised residuals.

   The use of the PML approach to estimate the parameters of the time series using the normal distribution for fZ(z) does not imply the assumption of normality of fZ(z). Under standard regularity conditions (Gourieroux, 1997; Gourieroux et al., 1984) the use of the normal distribution yields consistent estimates even if the underlying distribution is not normal. That is, the consistency of the PML estimator does not depend on the distribution which is used to build the likelihood function, provided it belongs to the quadratic exponential family of distributions (in this case, the normal distribution)¹¹.

2. If the residual series Zt is found to be strictly white noise, EVT can be applied to model the tail of the white noise distribution fZ(z). The EVT-based quantile formula (7) can be used to estimate VaR for the Zt series, say VaRᵗZ.

3. The conditional VaR for the return series is then obtained as

   VaRᵗX = µ̂t + σ̂t VaRᵗZ    (10)
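The final step is a single affine transformation. With hypothetical forecast values (none of these numbers come from the paper), eq. (10) reads:

```python
# Hypothetical inputs, for illustration only:
var_z_99 = -2.75    # assumed 1% quantile of the innovation distribution (long position)
mu_hat = 0.0004     # assumed one-day-ahead conditional mean forecast
sigma_hat = 0.018   # assumed one-day-ahead conditional volatility forecast

# Eq. (10): translate the innovation quantile back to the return scale
var_x_99 = mu_hat + sigma_hat * var_z_99
print(var_x_99)     # the 99% VaR on the return scale (negative for a long position)
```

Because σ̂t changes every day, the resulting VaR forecast inherits the volatility dynamics of the return series, which is what makes the measure conditional.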
The estimated quantiles on the innovation series can be used to make daily VaR forecasts for
the underlying returns. In order to do so, we dynamically estimate µ̂t and σ̂t forecasts for
the “out of sample” periods, by using a rolling window of size 1250. These daily forecasts
and the VaR for the residual series are used in the formula (10) to make one day ahead VaR
¹¹Pseudo Maximum Likelihood (PML) estimators are obtained by maximising the likelihood function associated with a family of probability distributions which does not necessarily contain the true pdf of the underlying random variable. Gourieroux et al. (1984) establish that the PML estimators of the first two moments of the unknown underlying distribution, based on the linear and quadratic exponential families, are consistent and asymptotically normally distributed regardless of the exact form of the true unknown distribution. The normal distribution, being a member of the quadratic exponential family, can provide consistent estimators of the first two moments.
forecasts for the underlying data.
Figure 2 depicts the forecasted 95% and 99% VaR vis-a-vis the actually observed portfolio returns over the forecast window. VaR forecasts are estimated for a long and a short position in the Nifty portfolio. The VaR plots on the negative side of the graph are for the long position and the plots on the positive side are for the short position. The graphs indicate that the VaR forecasts are able to capture the volatility dynamics of the underlying returns.
6 Testing the VaR forecasts

6.1 Correct conditional coverage

A correctly specified VaR model should generate the pre-specified failure rate conditionally at every point in time. This is known as the property of "conditional coverage" of the VaR model. The basic feature of a 99% VaR is that it should be exceeded 1% of the time, and that the probability of the VaR being exceeded at time t + 1 remains 1% even after conditioning on all information known at time t. This implies that the VaR should be small in times of low volatility and high in times of high volatility, so that the events where the loss exceeds the forecasted VaR measure are spread over the entire sample period and do not come in clusters. A model which fails to capture the volatility dynamics of the underlying return distribution will exhibit the symptom of clustering of failures, even if on average it generates the correct failure rate.
Consider a sequence of one period ahead VaR forecasts {v_{t|t−1}}, t = 1, …, T, estimated at a significance level 1 − p. These forecasts can be considered as one-sided interval forecasts (−∞, v_{t|t−1}] with coverage probability p. Given the realisations of the return series rt and the ex-ante VaR forecasts, define the failure indicator

It = 1 if rt < v_{t|t−1}, and It = 0 otherwise

where rt is the observed return and v_{t|t−1} is the forecasted VaR measure on day t.

The stochastic process {It} is called the "failure process". The VaR forecasts are said to be conditionally efficient if E[It | Ωt−1] = p, where Ωt−1 denotes the information available at time t − 1. This is equivalent to saying that the {It} series is i.i.d. with mean p.
Christoffersen and Diebold (2000) and Clements and Taylor (2000) suggest that a regression of the It series on its own lagged values and some other variables of interest, such as day-dummies or the lagged observed returns, can be used to test for the existence of various forms of dependence structure that may be present in the {It} series. Under this framework, conditional efficiency of the It process can be tested by testing the joint hypothesis

H: Φ = 0, α₀ = p    (11)

where Φ = (α₁, …, α_S, µ₁, …, µ_{S−1}) collects the slope coefficients in the regression

It = α₀ + Σ(s=1..S) αs I_{t−s} + Σ(s=1..S−1) µs D_{s,t} + εt    (12)

t = S + 1, S + 2, …, T    (13)

where the D_{s,t} are day-of-the-week dummies. The hypothesis (11) can be tested by using an F-statistic in the usual OLS framework.¹²
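The regression-based test of (11)–(12) can be sketched as follows. The failure series here is simulated i.i.d. Bernoulli(p), so the null holds by construction; S = 5 lags are used and the day dummies are cyclic stand-ins. A Wald form of the F-statistic is computed:

```python
import numpy as np

rng = np.random.default_rng(4)
p, T, S = 0.05, 1446, 5
I = (rng.random(T) < p).astype(float)          # i.i.d. Bernoulli(p) failures

# Regressors of eq. (12): intercept, S lags of I_t, S-1 day dummies
t_idx = np.arange(S, T)
X = np.column_stack(
    [np.ones(T - S)]
    + [I[S - s:T - s] for s in range(1, S + 1)]                  # I_{t-1}..I_{t-S}
    + [((t_idx % 5) == d).astype(float) for d in range(S - 1)]   # cyclic day dummies
)
y = I[S:]

b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b_hat
sigma2 = resid @ resid / (len(y) - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)

# Wald/F statistic for H: alpha_0 = p and all other coefficients = 0
b0 = np.zeros(X.shape[1]); b0[0] = p
diff = b_hat - b0
F = diff @ np.linalg.solve(cov, diff) / X.shape[1]
print(F)     # compare with an F(q, T-S-q) critical value
```

Under the null the statistic should be close to 1 on average; a large value signals either a wrong failure rate or dependence in the failure series.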
To test for the existence of the property of correct conditional coverage, we perform an OLS
regression of the It series on its five lagged values and five day–dummies representing the
¹²In view of the fact that the It series is binary, a more appropriate way is to run a binary regression rather than an OLS regression. However, there seems to be a technical problem in the implementation of the binary regression, as more than 90% of the It's are zero and only a few are unity. This asymmetry in the data results in singular Hessian matrices in the estimation process and the maximum likelihood estimation fails as a result. This problem seems to be more severe in the case of the 99% VaR models. Therefore we resort to an OLS regression, which is asymptotically equivalent to a binary regression.
trading days in a week. Significance of the F-statistic of this OLS will lead to rejection of a model; otherwise it will lead to its non-rejection. It should be noted that non-significance of the F-statistic does not necessarily imply non-significance of the t-statistics corresponding to the individual regressors in the OLS. We follow Hayashi (2000) and adopt the policy of preferring the F-statistic over the t-statistic(s) in the case of a conflict. Therefore the model will not be rejected if the F-statistic is not significant, even though some individual t-statistics may be significant.
Table 7 presents the results of the test of correct conditional coverage for the EVT-based VaR measures. Panel A of the table deals with the 95% VaR estimates and Panel B with the 99% VaR estimates. It is found that for both the long and the short Nifty positions, the VaR measures generate correct conditional coverage, at both the 95% and 99% levels. The estimated failure probabilities are not significantly different from the pre-specified levels, and the independence of the failure series is indicated by the insignificance of the F-statistics in the OLS regression of the failure series on its own past values and the day-dummies.
7 Conclusion
This paper carries out an analysis of the tail behaviour of the Nifty innovation distribution using extreme value theory. We find that the essential features of the innovation distribution are very different from those of the normal distribution. The right tail of the innovation distribution displays significant 'tail fatness' while the left tail behaves quite like the normal tail. This asymmetry in the innovation distribution necessitates treating the left tail and the right tail separately.
We see that the extreme value theory based gpd model of tail estimation is able to capture
these features of the innovation distribution. By estimating upper and lower tails separately,
this approach takes care of the inherent asymmetry present in the data. This approach also
gives a better fit to both the tails of the innovation distribution compared to the normal
distribution.
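As an illustration of this GPD tail-fitting step, here is a minimal numpy sketch. It is not the paper's implementation: it uses a simple method-of-moments estimator of the GPD parameters instead of the pseudo-maximum-likelihood estimator used in the paper, and simulated Student-t data stand in for the Nifty innovations:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_t(df=4, size=5000)   # heavy-tailed stand-in for innovations

u = np.quantile(z, 0.95)              # upper-tail threshold
excesses = z[z > u] - u

# Method-of-moments estimates of the GPD shape (xi) and scale (sigma),
# from the mean and variance of the excesses over the threshold.
m, v = excesses.mean(), excesses.var()
xi = 0.5 * (1.0 - m * m / v)
sigma = 0.5 * m * (m * m / v + 1.0)

# Standard GPD tail-quantile estimator:
#   x_p = u + (sigma/xi) * (((1 - p) / (N_u / n)) ** (-xi) - 1)
n, n_u = z.size, excesses.size
p = 0.99
x_p = u + (sigma / xi) * (((1.0 - p) / (n_u / n)) ** (-xi) - 1.0)
```

The lower tail is handled symmetrically by applying the same steps to -z; the estimated quantile x_p is the tail quantile from which a VaR figure is obtained.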
The quantiles estimated by using the GPD model are used to estimate 95% and 99% Value-at-Risk measures for a short and a long position in the Nifty portfolio. The tests of "correct conditional coverage" confirm that the VaR measures possess correct conditional coverage for both positions, at both levels.
References

Christoffersen PF, Diebold FX, 2000. How relevant is volatility forecasting for financial risk management? Review of Economics and Statistics 82:12–22.

Clements MP, Taylor N, 2000. Evaluating interval forecasts of high-frequency financial data.

Danielsson J, de Vries CG, 1997. Value-at-Risk and extreme returns. Manuscript, London School of Economics.

Embrechts P, Klüppelberg C, Mikosch T, 1997. Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin.

Gavin J, 2000. Extreme value theory: an empirical analysis of equity risk. UBS Warburg working paper.

Gourieroux C, Monfort A, Trognon A, 1984. Pseudo maximum likelihood methods: theory. Econometrica 52:681–700.

Hayashi F, 2000. Econometrics. Princeton University Press, Princeton.

Leadbetter MR, Lindgren G, Rootzén H, 1983. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York.

Longin F, 2000. From Value at Risk to stress testing: the extreme value approach. Journal of Banking and Finance 24:1097–1130.

McNeil AJ, 1996. Estimating the tails of loss severity distributions using extreme value theory.

McNeil AJ, 1999. Extreme value theory for risk managers. Manuscript, Department of Mathematics, ETH Zürich.

McNeil AJ, Frey R, 1999. Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach.

Neftci SN, 2000. Value at risk calculations, extreme events, and tail estimation. The Journal of Derivatives 7:23–37.
Table 1 Estimation of the AR(1)-GARCH(1,1) model

This table presents the first four unconditional moments of the raw Nifty return series and of the standardized residuals extracted from the AR(1)-GARCH(1,1) specification of the Nifty returns, together with the test statistics and corresponding p-values of the tests of skewness, kurtosis and autocorrelation for the raw data (Panel A) and the standardized residuals (Panel B).

Panel A: The return series (r_t)

                  statistic    p-value
skewness          15.7839*     0.000
kurtosis          292.3220*    0.000
H.C. Ljung-Box    57.2202*     0.01
Figure 1 The tails of the innovation distribution
This figure provides the plots of the estimated tails of the Nifty innovation distribution. The graphs provide
the empirical, the GPD approximation and the normal approximation to the lower and the upper tails. The threshold level from which each tail starts is shown by a vertical line: for the lower tail the threshold is -1.6493 and for the upper tail it is 1.6494.
[Two panels: probability vs. x, for the lower tail (x from -4 to -0.5) and the upper tail (x from 0.5 to 4).]
Figure 2 95% and 99% VaR measures estimated with the POT model
This graph shows the dynamic VaR forecasts for a long and a short position of Rs. 100 in the Nifty portfolio, obtained using the POT approach based on extreme value theory. The graph depicts two sets of VaR forecasts, one for the long position and the other for the short position: the curves on the negative side of the observed returns are for the long position and those on the positive side are for the short position in Nifty.
[Plot: Returns (%) vs. forecast dates, showing the observed returns together with the 95% and 99% VaR forecasts.]
This table provides the results of the estimated GPD parameters fitted to the excesses over the chosen threshold. The first column gives the threshold points on the left and the right tails corresponding to the 1.65σ threshold level. The second column presents the number of observations beyond the threshold and the third column gives the estimated cdf of the tails at the respective threshold points. The fourth and fifth columns present the pseudo-maximum-likelihood estimates of the GPD parameters fitted to the excesses over the thresholds, with the standard errors of estimation in parentheses.

u    N_u    F(u)    ξ̂    σ̂
Table 5 Estimated quantiles on the i.i.d. residuals
This table provides some of the estimated quantiles on the tails, along with the empirical quantiles and the corresponding quantiles of the standard normal distribution. Panel A deals with the lower tail and Panel B with the upper tail. Column 1 gives the probability level; columns 2, 3 and 4 give the estimated, empirical and standard-normal quantiles corresponding to these probability levels.
Table 6 Results of the Kolmogorov-Smirnov tests
This table provides the values of the Kolmogorov-Smirnov test statistics for the hypotheses of Section 5.3, using the two-sided statistic D and the one-sided statistic D+.

D     0.2794     0.1574
D+    0.8536*    0.3589

Critical value of D at the 0.05 level of significance = 0.467
Critical value of D+ at the 0.05 level of significance = 0.400
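The D and D+ statistics reported above can be computed as in the following generic numpy sketch of the one-sample Kolmogorov-Smirnov statistics against a hypothesised CDF; the function name is ours, and the standard-normal CDF is used only as an example of a reference distribution:

```python
import math
import numpy as np

def ks_statistics(sample, cdf):
    """Return (D, D+): the two-sided and the one-sided (ECDF above the
    model CDF) Kolmogorov-Smirnov statistics of a sample against a
    hypothesised continuous CDF."""
    x = np.sort(np.asarray(sample))
    n = x.size
    F = np.array([cdf(v) for v in x])
    d_plus = np.max(np.arange(1, n + 1) / n - F)   # ECDF exceeds model CDF
    d_minus = np.max(F - np.arange(0, n) / n)      # model CDF exceeds ECDF
    return max(d_plus, d_minus), d_plus

# Example: a small normal sample tested against the standard-normal CDF
phi = lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
rng = np.random.default_rng(0)
D, D_plus = ks_statistics(rng.standard_normal(50), phi)
```

In the table above, the same statistics are evaluated for the fitted tail models; values exceeding the critical values lead to rejection of the hypothesised distribution.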
Table 7 Tests of correct conditional coverage

This table presents the results of the tests of autoregressive and periodic dependence in the failure series generated by the three alternative approaches. The VaR measures are for a long position in the Nifty portfolio and are estimated by the GPD approach based on extreme value theory. The first column gives the portfolio, the second gives the estimated failure probability with the corresponding p-value in parentheses, and the third reports the estimated F-statistic for the hypothesis of no dependence in the failure series, with the corresponding p-value in parentheses. Panel A of this table deals with the 95% VaR estimation and Panel B with the 99% VaR estimation.