
Extreme value theory

[Figure: Extreme value theory is used to model the risk of extreme, rare events, such as the 1755 Lisbon
earthquake.]

Extreme value theory or extreme value analysis (EVA) is a branch of statistics dealing with the extreme
deviations from the median of probability distributions. It seeks to assess, from a given ordered sample of a
given random variable, the probability of events that are more extreme than any previously observed.
Extreme value analysis is widely used in many disciplines, such as structural engineering, finance, earth
sciences, traffic prediction, and geological engineering. For example, EVA might be used in the field of
hydrology to estimate the probability of an unusually large flooding event, such as the 100-year flood.
Similarly, for the design of a breakwater, a coastal engineer would seek to estimate the 50-year wave and
design the structure accordingly.
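
To make terms such as "100-year flood" concrete, the short Python sketch below computes the chance of seeing at least one T-year event within a planning horizon. It assumes that years are independent and that the event has probability 1/T of occurring in any single year; the function name is illustrative and not part of any library.

```python
def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """Chance of at least one T-year event within the horizon.

    Assumes independent years, each with exceedance probability 1/T.
    """
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** horizon_years


for horizon in (1, 30, 100):
    p = prob_at_least_one(100, horizon)
    print(f"100-year flood within {horizon:3d} years: {p:.3f}")
```

Under these assumptions a 100-year flood has roughly a 63% chance of occurring at least once in any 100-year window, so the name describes an average recurrence interval, not a guaranteed schedule.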

Data analysis
Two main approaches exist for practical extreme value analysis.

The first method relies on deriving block maxima (minima) series as a preliminary step. In many situations
it is customary and convenient to extract the annual maxima (minima), generating an "Annual Maxima
Series" (AMS).
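
A minimal sketch of the block-maxima step, assuming a regularly sampled record with a fixed number of observations per block (here 365 synthetic "daily" values per year). The helper name and the simulated data are illustrative assumptions only.

```python
import random


def block_maxima(values, block_size):
    """Return the maximum of each consecutive, complete block of `values`."""
    n_blocks = len(values) // block_size  # an incomplete trailing block is dropped
    return [
        max(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


# Synthetic record: 20 "years" of 365 daily values (illustrative data only).
rng = random.Random(0)
daily = [rng.gauss(10.0, 3.0) for _ in range(365 * 20)]

ams = block_maxima(daily, 365)  # annual maxima series
print(len(ams), "annual maxima; largest =", round(max(ams), 2))
```

A series of block minima can be obtained with the same helper by negating the record first and negating the result afterwards.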

The second method relies on extracting, from a continuous record, the peak values reached during any period
in which values exceed a certain threshold (or fall below a certain threshold). This method is generally
referred to as the "Peak Over Threshold"[1] method (POT).
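
The extraction step can be sketched as follows, assuming that a "period" is a run of consecutive observations above the threshold and that one peak is kept per run. Practical analyses often add a minimum separation between runs to reduce dependence; that refinement is not shown, and the function name is hypothetical.

```python
def peaks_over_threshold(values, threshold):
    """Return one peak per run of consecutive values above `threshold`."""
    peaks = []
    current_peak = None  # highest value seen in the current exceedance run
    for v in values:
        if v > threshold:
            current_peak = v if current_peak is None else max(current_peak, v)
        elif current_peak is not None:
            peaks.append(current_peak)  # the run has just ended
            current_peak = None
    if current_peak is not None:  # record ended while still above threshold
        peaks.append(current_peak)
    return peaks


record = [1.0, 4.2, 5.1, 2.0, 0.5, 3.9, 1.2, 6.3, 6.0, 7.4]
print(peaks_over_threshold(record, 3.0))  # [5.1, 3.9, 7.4]
```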

For AMS data, the analysis may partly rely on the results of the Fisher–Tippett–Gnedenko theorem, leading
to the generalized extreme value distribution being selected for fitting.[2][3] However, in practice, various
procedures are applied to select between a wider range of distributions. The theorem here relates to the
limiting distributions for the minimum or the maximum of a very large collection of independent random
variables from the same distribution. Given that the number of relevant random events within a year may be
rather limited, it is unsurprising that analyses of observed AMS data often lead to distributions other than
the generalized extreme value distribution (GEVD) being selected.[4]
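
As a hedged illustration of fitting a distribution to an annual maxima series, the sketch below fits a Gumbel distribution (the zero-shape member of the generalized extreme value family) by the method of moments and derives a return level from it. The estimator choice, function names, and synthetic data are assumptions made for the example; practical work more often uses maximum likelihood or L-moments, together with goodness-of-fit checks.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(maxima):
    """Method-of-moments estimates (loc, scale) for a Gumbel distribution."""
    scale = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    loc = statistics.mean(maxima) - EULER_GAMMA * scale
    return loc, scale


def gumbel_return_level(loc, scale, return_period):
    """Level exceeded on average once every `return_period` blocks."""
    p = 1.0 - 1.0 / return_period  # non-exceedance probability per block
    return loc - scale * math.log(-math.log(p))


def sample_gumbel(loc, scale, rng):
    """Draw one Gumbel variate by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


# Synthetic annual maxima for 60 "years" (illustrative data only).
rng = random.Random(1)
ams = [sample_gumbel(30.0, 5.0, rng) for _ in range(60)]

loc_hat, scale_hat = fit_gumbel_moments(ams)
print("loc =", round(loc_hat, 2), "scale =", round(scale_hat, 2))
print("100-year level =", round(gumbel_return_level(loc_hat, scale_hat, 100), 2))
```

With only 60 values the estimates are noisy, which is one reason extrapolated return levels should be reported with uncertainty intervals.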

For POT data, the analysis may involve fitting two distributions: one for the number of events in a time
period considered and a second for the size of the exceedances.

A common assumption for the first is the Poisson distribution, with the generalized Pareto distribution being
used for the exceedances. A tail-fitting can be based on the Pickands–Balkema–de Haan theorem.[5][6]
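
The two-part POT model can be sketched as below: a Poisson rate estimated as events per year, and a generalized Pareto distribution fitted to the excesses over the threshold. The method-of-moments estimator is used here purely for brevity and requires a shape parameter below 1/2; the function names, threshold, and synthetic peaks are assumptions for illustration.

```python
import math
import random
import statistics


def poisson_rate(n_events, n_years):
    """Estimated mean number of threshold exceedance events per year."""
    return n_events / n_years


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (shape, scale) for a generalized Pareto.

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)).
    """
    m = statistics.mean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v  # equals 1 - 2 * shape in theory
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale


def sample_gpd(shape, scale, rng):
    """Draw one generalized Pareto excess by inverting the CDF."""
    u = rng.random()  # in [0, 1), so 1 - u is never zero
    if shape == 0.0:
        return -scale * math.log(1.0 - u)
    return scale / shape * ((1.0 - u) ** (-shape) - 1.0)


# Synthetic peaks: 400 events over 100 "years" above a threshold of 50.
rng = random.Random(42)
threshold, years = 50.0, 100
peaks = [threshold + sample_gpd(0.1, 2.0, rng) for _ in range(400)]

excesses = [p - threshold for p in peaks]
rate = poisson_rate(len(peaks), years)
shape_hat, scale_hat = fit_gpd_moments(excesses)
print("rate =", rate, "shape =", round(shape_hat, 3), "scale =", round(scale_hat, 3))
```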

Novak[7] reserves the term “POT method” for the case where the threshold is non-random, and distinguishes
it from the case where one deals with exceedances of a random threshold.

Applications
Applications of extreme value theory include predicting the probability distribution of:
Extreme floods; the size of freak waves
Tornado outbreaks[8]
Maximum sizes of ecological populations[9]
Side effects of drugs (e.g., ximelagatran)
The magnitudes of large insurance losses
Equity risks; day-to-day market risk
Mutational events during evolution
Large wildfires[10]
Environmental loads on structures[11]
Fastest time humans are capable of running the 100 metres sprint[12] and performances in
other athletic disciplines[13][14][15]
Pipeline failures due to pitting corrosion
Anomalous IT network traffic, to prevent attackers from reaching important data
Road safety analysis[16][17]
Wireless communications[18]
Epidemics[19]
Neurobiology[20]

History
The field of extreme value theory was pioneered by Leonard Tippett (1902–1985). Tippett was employed
by the British Cotton Industry Research Association, where he worked to make cotton thread stronger. In
his studies, he realized that the strength of a thread was controlled by the strength of its weakest fibres. With
the help of R. A. Fisher, Tippett obtained three asymptotic limits describing the distributions of extremes
assuming independent variables. Emil Julius Gumbel codified this theory in his 1958 book Statistics of
Extremes, including the Gumbel distributions that bear his name. These results can be extended to allow for
slight correlations between variables, but the classical theory does not extend to strong correlations of the
order of the variance. One universality class of particular interest is that of log-correlated fields, where the
correlations decay logarithmically with the distance.

Univariate theory
Let $X_1, \dots, X_n$ be a sequence of independent and identically distributed random variables with
cumulative distribution function $F$ and let $M_n = \max(X_1, \dots, X_n)$ denote the maximum.

In theory, the exact distribution of the maximum can be derived:

$$\Pr(M_n \le z) = \Pr(X_1 \le z, \dots, X_n \le z) = \Pr(X_1 \le z) \cdots \Pr(X_n \le z) = (F(z))^n.$$

The associated indicator function $I_n = I(M_n > z)$ is a Bernoulli process with a success probability
$p(z) = 1 - (F(z))^n$ that depends on the magnitude $z$ of the extreme event. The number of extreme
events within $n$ trials thus follows a binomial distribution and the number of trials until an event occurs
follows a geometric distribution with expected value and standard deviation of the same order $O(1/p(z))$.
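
The exact result for the maximum can be checked numerically. The sketch below assumes Uniform(0, 1) variables, for which the distribution function is simply the identity on the unit interval, and compares the n-th power formula with a Monte Carlo estimate. All names are illustrative.

```python
import random


def prob_max_at_most(cdf, z, n):
    """P(max of n i.i.d. variables <= z), given their common CDF."""
    return cdf(z) ** n


def uniform_cdf(z):
    """CDF of the Uniform(0, 1) distribution."""
    return min(max(z, 0.0), 1.0)


n, z = 10, 0.9
exact = prob_max_at_most(uniform_cdf, z, n)

# Monte Carlo check: fraction of simulated samples whose maximum is <= z.
rng = random.Random(1)
trials = 20_000
hits = sum(max(rng.random() for _ in range(n)) <= z for _ in range(trials))
estimate = hits / trials

p_exceed = 1.0 - exact  # success probability of the exceedance indicator
print("exact =", round(exact, 4), "simulated =", round(estimate, 4))
print("expected number of samples until an exceedance =", round(1.0 / p_exceed, 3))
```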

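In practice, we might not have the distribution function $F$, but the Fisher–Tippett–Gnedenko theorem
provides an asymptotic result. If there exist sequences of constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that
the maximum $M_n$ of the first $n$ variables satisfies

$$\Pr\{(M_n - b_n)/a_n \le z\} \to G(z)$$

as $n \to \infty$, then

$$G(z) \propto \exp\left[-(1 + \xi z)^{-1/\xi}\right]$$

where $\xi$ depends on the tail shape of the distribution. When normalized, $G$ belongs to one of the following
non-degenerate distribution families:

Weibull law: $G(z) = \exp\{-(-(z - b)/a)^{\alpha}\}$ for $z < b$ and $G(z) = 1$ for $z \ge b$, when the
distribution of $M_n$ has a light tail with finite upper bound. Also known as Type 3.

Gumbel law: $G(z) = \exp\{-\exp(-(z - b)/a)\}$ for $z \in \mathbb{R}$, when the distribution of $M_n$ has an
exponential tail. Also known as Type 1.

Fréchet law: $G(z) = 0$ for $z \le b$ and $G(z) = \exp\{-((z - b)/a)^{-\alpha}\}$ for $z > b$, when the
distribution of $M_n$ has a heavy tail (including polynomial decay). Also known as Type 2.

For the Weibull and Fréchet laws, $\alpha > 0$.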
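
A small numerical illustration of the limit theorem, assuming standard exponential variables. For these, the normalizing constants can be taken as scale 1 and location ln n, and the limit is the Gumbel law. The sketch compares the exact distribution of the normalized maximum with the Gumbel distribution function on a grid; the function names and the grid are choices made for this example.

```python
import math


def normalized_max_cdf(z, n):
    """Exact P(M_n - ln(n) <= z) for n i.i.d. standard exponential variables."""
    x = z + math.log(n)
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-x)) ** n


def gumbel_cdf(z):
    """Standard Gumbel distribution function."""
    return math.exp(-math.exp(-z))


grid = [i / 10.0 for i in range(-30, 61)]  # z from -3.0 to 6.0
gaps = []
for n in (10, 100, 1000, 10000):
    gap = max(abs(normalized_max_cdf(z, n) - gumbel_cdf(z)) for z in grid)
    gaps.append(gap)
    print(f"n = {n:5d}: largest gap on grid = {gap:.6f}")
```

The gap shrinks as n grows, which is the convergence the theorem describes for this particular parent distribution.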

Multivariate theory
Extreme value theory in more than one variable introduces additional issues that have to be addressed. One
problem that arises is that one must specify what constitutes an extreme event.[21] Although this is
straightforward in the univariate case, there is no unambiguous way to do this in the multivariate case. The
fundamental problem is that although it is possible to order a set of real-valued numbers, there is no natural
way to order a set of vectors.

As an example, in the univariate case, given a set of observations $x_i$, it is straightforward to find the most
extreme event simply by taking the maximum (or minimum) of the observations. However, in the bivariate
case, given a set of observations $(x_i, y_i)$, it is not immediately clear how to find the most extreme event.
Suppose that one has measured the values $(x_1, y_1)$ at a specific time and the values $(x_2, y_2)$ at a later
time, where $x_2 > x_1$ but $y_2 < y_1$.
Which of these events would be considered more extreme? There is no universal answer to this question.
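
The ordering problem can be made concrete with a short sketch. It uses componentwise dominance as one possible partial order and two common scalar summaries (largest component and sum) as alternative rankings. The observation values are hypothetical and chosen only to show that the criteria can disagree.

```python
def dominates(a, b):
    """True if `a` is >= `b` in every component and > `b` in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def compare(a, b):
    """Compare two vectors under componentwise dominance (a partial order)."""
    if dominates(a, b):
        return "first is more extreme"
    if dominates(b, a):
        return "second is more extreme"
    return "incomparable"


first, second = (1.0, 6.0), (5.0, 3.0)  # hypothetical bivariate observations

print(compare(first, second))                      # incomparable
print("by largest component:", max(first) > max(second))  # first ranks higher
print("by sum:", sum(first) > sum(second))                 # second ranks higher
```

Each scalar summary yields a valid ordering, but the choice among them is a modeling decision and is not dictated by the data.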

Another issue in the multivariate case is that the limiting model is not as fully prescribed as in the univariate
case. In the univariate case, the model (GEV distribution) contains three parameters whose values are not
predicted by the theory and must be obtained by fitting the distribution to the data. In the multivariate case,
the model not only contains unknown parameters, but also a function whose exact form is not prescribed by
the theory. However, this function must obey certain constraints.[22][23] It is not straightforward to devise
estimators that obey such constraints, though some have been recently constructed.[24][25][26]

As an example of an application, bivariate extreme value theory has been applied to ocean research.[21][27]

Nonstationary extremes
Statistical modeling for nonstationary time series was developed in the 1990s.[28] Methods for
nonstationary multivariate extremes have been introduced more recently.[29] The latter can be used for
tracking how the dependence between extreme values changes over time, or over another
covariate.[30][31][32]

See also
Extreme risk
Extreme weather
Fisher–Tippett–Gnedenko theorem
Generalized extreme value distribution
Large deviation theory
Outlier
Pareto distribution
Pickands–Balkema–de Haan theorem
Rare events
Weibull distribution
Redundancy principle

Notes
1. Leadbetter, M. R. (1991). "On a basis for 'Peaks over Threshold' modeling". Statistics and
Probability Letters. 12 (4): 357–362. doi:10.1016/0167-7152(91)90107-3
(https://doi.org/10.1016%2F0167-7152%2891%2990107-3).
2. Fisher and Tippett (1928)
3. Gnedenko (1943)
4. Embrechts, Klüppelberg, and Mikosch (1997)
5. Pickands (1975)
6. Balkema and de Haan (1974)
7. Novak (2011)
8. Tippett, Michael K.; Lepore, Chiara; Cohen, Joel E. (16 December 2016). "More tornadoes in
the most extreme U.S. tornado outbreaks" (https://doi.org/10.1126%2Fscience.aah7393).
Science. 354 (6318): 1419–1423. Bibcode:2016Sci...354.1419T
(https://ui.adsabs.harvard.edu/abs/2016Sci...354.1419T). doi:10.1126/science.aah7393
(https://doi.org/10.1126%2Fscience.aah7393). PMID 27934705 (https://pubmed.ncbi.nlm.nih.gov/27934705).
9. Batt, Ryan D.; Carpenter, Stephen R.; Ives, Anthony R. (March 2017). "Extreme events in
lake ecosystem time series" (https://doi.org/10.1002%2Flol2.10037). Limnology and
Oceanography Letters. 2 (3): 63. doi:10.1002/lol2.10037 (https://doi.org/10.1002%2Flol2.10037).
10. Alvarado et al. (1998, p. 68)
11. Makkonen (2008)
12. J.H.J. Einmahl; S.G.W.R. Smeets (2009), "Ultimate 100m World Records Through Extreme-Value Theory"
(https://web.archive.org/web/20160312023048/https://pure.uvt.nl/ws/files/1244969/j.1467-9574.2010.00470.x.pdf)
(PDF), CentER Discussion Paper, Tilburg University, 57, archived from the original
(https://pure.uvt.nl/ws/files/1244969/j.1467-9574.2010.00470.x.pdf) (PDF) on 2016-03-12, retrieved 2009-08-12
13. D. Gembris; J.Taylor; D. Suter (2002), "Trends and random fluctuations in athletics", Nature,
417 (6888): 506, Bibcode:2002Natur.417..506G (https://ui.adsabs.harvard.edu/abs/2002Nat
ur.417..506G), doi:10.1038/417506a (https://doi.org/10.1038%2F417506a), hdl:2003/25362
(https://hdl.handle.net/2003%2F25362), PMID 12037557 (https://pubmed.ncbi.nlm.nih.gov/1
2037557), S2CID 13469470 (https://api.semanticscholar.org/CorpusID:13469470)
14. D. Gembris; J.Taylor; D. Suter (2007), "Evolution of athletic records : Statistical effects versus
real improvements", Journal of Applied Statistics, 34 (5): 529–545,
Bibcode:2007JApSt..34..529G (https://ui.adsabs.harvard.edu/abs/2007JApSt..34..529G),
doi:10.1080/02664760701234850 (https://doi.org/10.1080%2F02664760701234850),
hdl:2003/25404 (https://hdl.handle.net/2003%2F25404), S2CID 55378036 (https://api.seman
ticscholar.org/CorpusID:55378036)
15. H. Spearing, J. Tawn, D. Irons, T. Paulden & G. Bennett (2021), "Ranking, and other
properties, of elite swimmers using extreme value theory", Journal of the Royal Statistical
Society: Series A (Statistics in Society), 184 (1): 368–395, doi:10.1111/rssa.12628 (https://do
i.org/10.1111%2Frssa.12628), S2CID 204823947 (https://api.semanticscholar.org/CorpusID:
204823947)
16. Songchitruksa, P.; Tarko, A. P. (2006). "The extreme value theory approach to safety
estimation". Accident Analysis and Prevention. 38 (4): 811–822.
doi:10.1016/j.aap.2006.02.003 (https://doi.org/10.1016%2Fj.aap.2006.02.003).
PMID 16546103 (https://pubmed.ncbi.nlm.nih.gov/16546103).
17. Orsini, F.; Gecchele, G.; Gastaldi, M.; Rossi, R. (2019). "Collision prediction in roundabouts:
a comparative study of extreme value theory approaches". Transportmetrica A: Transport
Science. 15 (2): 556–572. doi:10.1080/23249935.2018.1515271 (https://doi.org/10.1080%2
F23249935.2018.1515271). S2CID 158343873 (https://api.semanticscholar.org/CorpusID:1
58343873).
18. C. G. Tsinos, F. Foukalas, T. Khattab and L. Lai, "On Channel Selection for Carrier
Aggregation Systems" (https://ieeexplore.ieee.org/abstract/document/8052574). IEEE
Transactions on Communications, vol. 66, no. 2, Feb. 2018, pp. 808–818.
19. Wong, Felix; Collins, James J. (2020-11-02). "Evidence that coronavirus superspreading is
fat-tailed" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703634). Proceedings of the
National Academy of Sciences. 117 (47): 29416–29418. Bibcode:2020PNAS..11729416W
(https://ui.adsabs.harvard.edu/abs/2020PNAS..11729416W). doi:10.1073/pnas.2018490117
(https://doi.org/10.1073%2Fpnas.2018490117). ISSN 0027-8424 (https://www.worldcat.org/i
ssn/0027-8424). PMC 7703634 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703634).
PMID 33139561 (https://pubmed.ncbi.nlm.nih.gov/33139561).
20. Basnayake, Kanishka; Mazaud, David; Bemelmans, Alexis; Rouach, Nathalie; Korkotian,
Eduard; Holcman, David (2019-06-04). "Fast calcium transients in dendritic spines driven by
extreme statistics" (https://dx.doi.org/10.1371/journal.pbio.2006202). PLOS Biology. 17 (6):
e2006202. doi:10.1371/journal.pbio.2006202 (https://doi.org/10.1371%2Fjournal.pbio.20062
02). ISSN 1545-7885 (https://www.worldcat.org/issn/1545-7885). PMC 6548358 (https://ww
w.ncbi.nlm.nih.gov/pmc/articles/PMC6548358). PMID 31163024 (https://pubmed.ncbi.nlm.ni
h.gov/31163024).
21. Morton, I.D.; Bowers, J. (December 1996). "Extreme value analysis in a multivariate offshore
environment". Applied Ocean Research. 18 (6): 303–317. doi:10.1016/s0141-1187(97)00007-2
(https://doi.org/10.1016%2Fs0141-1187%2897%2900007-2). ISSN 0141-1187
(https://www.worldcat.org/issn/0141-1187).
22. Beirlant, Jan; Goegebeur, Yuri; Teugels, Jozef; Segers, Johan (2004-08-27). Statistics of
Extremes: Theory and Applications. Wiley Series in Probability and Statistics. Chichester,
UK: John Wiley & Sons, Ltd. doi:10.1002/0470012382 (https://doi.org/10.1002%2F0470012
382). ISBN 9780470012383.
23. Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values. Springer
Series in Statistics. doi:10.1007/978-1-4471-3675-0 (https://doi.org/10.1007%2F978-1-4471-
3675-0). ISBN 978-1-84996-874-4. ISSN 0172-7397 (https://www.worldcat.org/issn/0172-73
97).
24. de Carvalho, M.; Davison, A. C. (2014). "Spectral density ratio models for multivariate
extremes" (https://www.maths.ed.ac.uk/~mdecarv/papers/decarvalho2014a.pdf) (PDF).
Journal of the American Statistical Association. 109: 764‒776. doi:10.1016/j.spl.2017.03.030
(https://doi.org/10.1016%2Fj.spl.2017.03.030).
25. Hanson, T.; de Carvalho, M.; Chen, Yuhui (2017). "Bernstein polynomial angular densities of
multivariate extreme value distributions" (https://www.maths.ed.ac.uk/~mdecarv/papers/hans
on2017.pdf) (PDF). Statistics and Probability Letters. 128: 60–66.
doi:10.1016/j.spl.2017.03.030 (https://doi.org/10.1016%2Fj.spl.2017.03.030).
26. de Carvalho, M. (2013). "A Euclidean likelihood estimator for bivariate tail dependence" (http
s://www.maths.ed.ac.uk/~mdecarv/papers/decarvalho2013.pdf) (PDF). Communications in
Statistics – Theory and Methods. 42 (7): 1176–1192. arXiv:1204.3524 (https://arxiv.org/abs/1
204.3524). doi:10.1080/03610926.2012.709905 (https://doi.org/10.1080%2F03610926.201
2.709905). S2CID 42652601 (https://api.semanticscholar.org/CorpusID:42652601).
27. Zachary, S.; Feld, G.; Ward, G.; Wolfram, J. (October 1998). "Multivariate extrapolation in the
offshore environment". Applied Ocean Research. 20 (5): 273–295. doi:10.1016/s0141-1187(98)00027-3
(https://doi.org/10.1016%2Fs0141-1187%2898%2900027-3). ISSN 0141-1187
(https://www.worldcat.org/issn/0141-1187).
28. Davison, A.C.; Smith, Richard (1990). "Models for exceedances over high thresholds" (http
s://rss.onlinelibrary.wiley.com/doi/10.1111/j.2517-6161.1990.tb01796.x). Journal of the
Royal Statistical Society: Series B (Methodological). 52 (3): 393–425. doi:10.1111/j.2517-
6161.1990.tb01796.x (https://doi.org/10.1111%2Fj.2517-6161.1990.tb01796.x).
29. de Carvalho, M. (2016). Statistics of extremes: Challenges and opportunities. In: Handbook
of EVT and its Applications to Finance and Insurance (https://www.maths.ed.ac.uk/~mdecar
v/papers/decarvalho2016b.pdf) (PDF). Hoboken: Wiley. pp. 195–214. ISBN 978-1-118-
65019-6.
30. Castro, D.; de Carvalho, M.; Wadsworth, J. (2018). "Time-Varying Extreme Value
Dependence with Application to Leading European Stock Markets" (https://www.maths.ed.a
c.uk/~mdecarv/papers/castro2018.pdf) (PDF). Annals of Applied Statistics. 12: 283–309.
doi:10.1214/17-AOAS1089 (https://doi.org/10.1214%2F17-AOAS1089). S2CID 33350408 (h
ttps://api.semanticscholar.org/CorpusID:33350408).
31. Mhalla, L.; de Carvalho, M.; Chavez-Demoulin, V. (2019). "Regression Type Models for
Extremal Dependence" (https://www.maths.ed.ac.uk/~mdecarv/papers/mhalla2019.pdf)
(PDF). Scandinavian Journal of Statistics. 46 (4): 1141–1167. doi:10.1111/sjos.12388 (http
s://doi.org/10.1111%2Fsjos.12388). S2CID 53570822 (https://api.semanticscholar.org/Corpu
sID:53570822).
32. Mhalla, L.; de Carvalho, M.; Chavez-Demoulin, V. (2018). "Local robust estimation of the
Pickands dependence function" (https://doi.org/10.1214%2F17-AOS1640). Annals of
Statistics. 46 (6A): 2806–2843. doi:10.1214/17-AOS1640 (https://doi.org/10.1214%2F17-AO
S1640). S2CID 59467614 (https://api.semanticscholar.org/CorpusID:59467614).

References
Abarbanel, H.; Koonin, S.; Levine, H.; MacDonald, G.; Rothaus, O. (January 1992),
"Statistics of Extreme Events with Application to Climate" (http://www.fas.org/irp/agency/dod/
jason/statistics.pdf) (PDF), JASON, JSR-90-30S, retrieved 2015-03-03
Alvarado, Ernesto; Sandberg, David V.; Pickford, Stewart G. (1998), "Modeling Large Forest
Fires as Extreme Events" (https://web.archive.org/web/20090226080558/http://www.vetmed.
wsu.edu/org_nws/NWSci%20journal%20articles/1998%20files/Special%20addition%201/v
72%20p66%20Alvarado%20et%20al.PDF) (PDF), Northwest Science, 72: 66–75, archived
from the original (http://www.vetmed.wsu.edu/org_nws/NWSci%20journal%20articles/199
8%20files/Special%20addition%201/v72%20p66%20Alvarado%20et%20al.PDF) (PDF) on
2009-02-26, retrieved 2009-02-06
Balkema, A.; de Haan, L. (1974), "Residual life time at great age", Annals of Probability, 2 (5):
792–804, doi:10.1214/aop/1176996548 (https://doi.org/10.1214%2Faop%2F1176996548),
JSTOR 2959306 (https://www.jstor.org/stable/2959306)
Burry K.V. (1975). Statistical Methods in Applied Science. John Wiley & Sons.
Castillo E. (1988) Extreme value theory in engineering. Academic Press, Inc. New York.
ISBN 0-12-163475-2.
Castillo, E., Hadi, A. S., Balakrishnan, N. and Sarabia, J. M. (2005) Extreme Value and
Related Models with Applications in Engineering and Science, Wiley Series in Probability
and Statistics Wiley, Hoboken, New Jersey. ISBN 0-471-67172-X.
Coles S. (2001) An Introduction to Statistical Modeling of Extreme Values. Springer, London.
Embrechts P., Klüppelberg C. and Mikosch T. (1997) Modelling extremal events for
insurance and finance. Berlin: Springer-Verlag
Fisher, R.A.; Tippett, L.H.C. (1928), "Limiting forms of the frequency distribution of the largest
and smallest member of a sample", Proc. Camb. Phil. Soc., 24 (2): 180–190,
Bibcode:1928PCPS...24..180F (https://ui.adsabs.harvard.edu/abs/1928PCPS...24..180F),
doi:10.1017/s0305004100015681 (https://doi.org/10.1017%2Fs0305004100015681),
S2CID 123125823 (https://api.semanticscholar.org/CorpusID:123125823)
Gnedenko, B.V. (1943), "Sur la distribution limite du terme maximum d'une serie aleatoire",
Annals of Mathematics, 44 (3): 423–453, doi:10.2307/1968974 (https://doi.org/10.2307%2F1
968974), JSTOR 1968974 (https://www.jstor.org/stable/1968974)
Gumbel, E.J. (1935), "Les valeurs extrêmes des distributions statistiques" (http://archive.num
dam.org/article/AIHP_1935__5_2_115_0.pdf) (PDF), Annales de l'Institut Henri Poincaré, 5
(2): 115–158, retrieved 2009-04-01
Gumbel, E. J. (2004) [1958], Statistics of Extremes (https://books.google.com/books?id=kXC
g8B5xSUwC&pg=PP1), Mineola, NY: Dover, ISBN 978-0-486-43604-3
Makkonen, L. (2008), "Problems in the extreme value analysis", Structural Safety, 30 (5):
405–419, doi:10.1016/j.strusafe.2006.12.001 (https://doi.org/10.1016%2Fj.strusafe.2006.12.
001)
Leadbetter, M. R. (1991), "On a basis for 'Peaks over Threshold' modeling", Statistics &
Probability Letters, 12 (4): 357–362, doi:10.1016/0167-7152(91)90107-3 (https://doi.org/10.1
016%2F0167-7152%2891%2990107-3)
Leadbetter M.R., Lindgren G. and Rootzen H. (1982) Extremes and related properties of
random sequences and processes. Springer-Verlag, New York.
Lindgren, G.; Rootzen, H. (1987), "Extreme values: Theory and technical applications",
Scandinavian Journal of Statistics, Theory and Applications, 14: 241–279
Novak S.Y. (2011) Extreme Value Methods with Applications to Finance. Chapman &
Hall/CRC Press, London. ISBN 978-1-4398-3574-6
Pickands, J (1975), "Statistical inference using extreme order statistics", Annals of Statistics,
3: 119–131, doi:10.1214/aos/1176343003 (https://doi.org/10.1214%2Faos%2F1176343003)

Software
Extreme Value Statistics in R (https://cran.r-project.org/web/views/ExtremeValue.html) -
Packages for extreme value statistics in R
ExtremeStats.jl (https://github.com/juliohm/ExtremeStats.jl) and Extremes.jl
(https://github.com/jojal5/Extremes.jl) - Extreme Value Statistics in Julia

External links
Extreme Value Theory can save your neck Easy non-mathematical introduction (pdf)
(http://www.risknet.de/fileadmin/eLibrary/EVT-Paper-Roehrl-Chavez-Demoulin.pdf)
Source Code for Stationary and Nonstationary Extreme Value Analysis University of
California, Irvine (http://amir.eng.uci.edu/neva.php)
Steps in Applying Extreme Value Theory to Finance: A Review
(http://www.bankofcanada.ca/wp-content/uploads/2010/01/wp00-20.pdf)
Les valeurs extrêmes des distributions statistiques Full-text access to conferences held by
E. J. Gumbel in 1933–34, in French (pdf) (http://www.numdam.org/item?id=AIHP_1935__5_2_115_0)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Extreme_value_theory&oldid=1164481581"
