ARTICLE IN PRESS

Journal of Wind Engineering and Industrial Aerodynamics 93 (2005) 535–545
www.elsevier.com/locate/jweia

Technical note
A comparison of methods of extreme wind
speed estimation
Ying An, M.D. Pandey
Department of Civil Engineering, University of Waterloo, Waterloo, Ont., Canada N2L 3G1
Received 8 March 2004; received in revised form 13 April 2005; accepted 4 May 2005
Available online 11 July 2005

Abstract

The paper presents a comparative assessment of four methods for extreme value analysis of
US wind speed data, namely Standard Gumbel, Modified Gumbel, Peaks-Over-Threshold (POT)
and Method of Independent Storms (MIS). The analysis highlights the influence of
methodological assumptions on the estimates of design wind speed corresponding to 50-year
and 500-year return periods. The results demonstrate that the MIS method leads to more
stable quantile estimates than the POT method.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Wind speed; Extreme value estimation; Pareto distribution; Gumbel distribution; Peaks-over-
threshold method; Method of independent storms; Return period

1. Introduction

There are two general classes of statistical models for the quantile estimation of
extreme wind speed, namely annual-maxima and Peaks-Over-Threshold (POT)
methods. In the first method, the Gumbel distribution is fitted to a sample of annual
maximum values recorded over a period of time. It is referred to as the Standard Gumbel
(SG) method [1]. Harris [2] proposed a Modified Gumbel (MG) approach in which
parametric plotting positions and a weighted least-squares method are applied for
fitting the Gumbel model to a sample of annual maxima.
The POT method considers, instead of just annual maxima, several of the largest
order statistics exceeding a sufficiently high threshold in the collected data. In this
manner, the data set is enlarged to decrease the sampling uncertainty. This method
has gained wide acceptance in the field of extreme value estimation due to Pickands's
work [3], in which the generalized Pareto distribution (GPD) was proved to be the
limiting distribution of peaks. Simiu and Heckert [4] applied this approach to
estimate the design wind speed in the US.
Cook [5] proposed the Method of Independent Storms (MIS), which analyzes a time
series of storm maxima modelled by the Gumbel distribution. The method was
refined by Harris [6], and it includes two parts: (a) obtaining a subset of independent
maxima by identifying storms; and (b) fitting the storm maxima by the Gumbel
distribution.
The primary objective of the paper is to study the impact of methodological
differences on extreme quantile estimates of the wind speed. We apply the POT, MIS,
SG and MG methods to common wind speed data sets and study the differences in
quantile estimates. For illustrative purposes, data sets of wind speed collected at six
stations in the US, namely, Albuquerque, Boise, Denver, Moline, Toledo and
Tucson [4], are analyzed, and results are discussed in the paper.

Corresponding author. Tel.: 519 888 4567x5858; fax: 519 888 4349.
E-mail addresses: y3an@engmail.uwaterloo.ca (Y. An), mdpandey@uwaterloo.ca (M.D. Pandey).

0167-6105/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jweia.2005.05.003

2. Annual-maxima method

2.1. Standard Gumbel method

In this method, a sample of annual maximum values of wind speed ($x$) is directly
modelled by the Gumbel distribution
$$F(x) = \exp\left[-e^{-(ax - \Pi)}\right], \tag{1}$$
where $a$ and $\Pi$ are referred to as the measure of dispersion and the characteristic
product, respectively. The empirical distribution is plotted on Gumbel
probability paper, and $a$ and $\Pi$ are estimated by the classical least-squares
method. Note that the rank-mean plotting positions, $i/(n+1)$, are typically used;
$i/(n+1)$ is the mean value of the cumulative probability of the $i$th order statistic in
a sample of size $n$. This is an unbiased, non-parametric estimate, because no
assumption is introduced about the distribution type.
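
The Standard Gumbel procedure can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used in the paper: the function names are hypothetical, the reduced variate $y = -\ln(-\ln p)$ is regressed on the wind speed (the paper does not state which variable is regressed on which), and the quantile follows from setting $F(x_R) = 1 - 1/R$ in Eq. (1).

```python
import math

def fit_standard_gumbel(annual_maxima):
    """Classical least-squares fit of y = a*x - Pi on Gumbel probability paper."""
    x = sorted(annual_maxima)                      # ascending order statistics
    n = len(x)
    # Rank-mean plotting positions p_i = i/(n+1), i = 1..n (smallest to largest)
    p = [i / (n + 1) for i in range(1, n + 1)]
    # Gumbel reduced variate y = -ln(-ln p)
    y = [-math.log(-math.log(pi)) for pi in p]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    a = sxy / sxx                                  # dispersion parameter (slope)
    big_pi = a * x_mean - y_mean                   # characteristic product (minus the intercept)
    return a, big_pi

def gumbel_quantile(a, big_pi, return_period):
    """Wind speed x_R satisfying F(x_R) = 1 - 1/R under Eq. (1)."""
    y_r = -math.log(-math.log(1.0 - 1.0 / return_period))
    return (y_r + big_pi) / a
```

Calling `gumbel_quantile(a, big_pi, 50)` and `gumbel_quantile(a, big_pi, 500)` with the fitted parameters gives the two design quantiles considered in the paper.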

2.2. Modified Gumbel method

Harris [2] proposed modifications of the Standard Gumbel method by revising the
plotting positions and using the weighted least-squares method for the estimation of
the distribution parameters. Using the fact that extreme wind speeds lie in the domain
of attraction of the Gumbel distribution, Harris explicitly derived the rank-mean
plotting position of the $v$th Gumbel order statistic in a sample of size $n$ as
$$y_v = \frac{n!}{(v-1)!\,(n-v)!} \int_0^1 -\ln(-\ln z)\, z^{n-v} (1-z)^{v-1}\, dz, \tag{2}$$
where $z = F(x)$ denotes the Gumbel cumulative probability distribution given in
Eq. (1).
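
As a hedged illustration of Eq. (2), the integral $\int_0^1 -\ln(-\ln z)\, z^{n-v}(1-z)^{v-1}\,dz$, scaled by $n!/[(v-1)!\,(n-v)!]$, can be evaluated numerically. The sketch below is not Harris's algorithm: it assumes the substitution $z = \exp(-e^{-y})$, a plain midpoint rule and a truncated integration range, and the function name and default arguments are hypothetical. With the weights $z^{n-v}(1-z)^{v-1}$, rank $v = 1$ corresponds to the largest value in the sample.

```python
import math

def gumbel_mean_plotting_position(v, n, y_min=-8.0, y_max=40.0, steps=48000):
    """Mean reduced variate of the v-th largest of n Gumbel variates, Eq. (2).

    The integral over z in (0, 1) is evaluated after the substitution
    z = exp(-exp(-y)), which moves the endpoint singularities to +/- infinity.
    """
    coeff = math.factorial(n) / (math.factorial(v - 1) * math.factorial(n - v))
    h = (y_max - y_min) / steps
    total = 0.0
    for j in range(steps):
        y = y_min + (j + 0.5) * h                  # midpoint rule
        e = math.exp(-y)
        z = math.exp(-e)                           # z = F(x); underflows harmlessly to 0
        one_minus_z = -math.expm1(-e)              # accurate 1 - z when z is close to 1
        dz_dy = z * e                              # Jacobian of the substitution
        total += y * z ** (n - v) * one_minus_z ** (v - 1) * dz_dy
    return coeff * total * h
```

A quick check is the largest of $n$ values, whose mean reduced variate is known in closed form as $\gamma + \ln n$, with $\gamma \approx 0.5772$ the Euler constant.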
This method uses a weighted least-squares technique to fit a line on Gumbel
probability paper. The weight of each rank is taken as inversely proportional to the
square of the corresponding rank's standard deviation. The parameters of the straight
line $y = ax - \Pi$ on Gumbel paper are obtained by minimizing the quantity $S^2$:
$$S^2 = \sum_{v=1}^{n} w_v \left(y_v - a q_v + \Pi\right)^2 \quad \text{subject to} \quad \sum_{v=1}^{n} w_v = 1 \quad \text{and} \quad w_v = \frac{1/\sigma_v^2}{\sum_{v=1}^{n} 1/\sigma_v^2}, \tag{3}$$
where $q_v$ is the value of the squared wind speed and $y_v$ is the rank-mean plotting
position; $w_v$ and $\sigma_v$ are the weight and the standard deviation for the $v$th
rank, respectively.
Note that the classical least-squares technique used in the Standard Gumbel
method assumes that the standard deviations associated with the order
statistics are uniform. The MG method contends that this approach is not suitable for
fitting extreme value data, because the scatter associated with the plotted
probability ordinates is not identical but varies systematically, e.g., being largest for
the largest value. Consequently, the classical least-squares method places undue
weight on the sample extremes.
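
The weighted fit of Eq. (3) has a closed-form solution, sketched below under stated assumptions. The function name is hypothetical, and the standard deviations of the ranks are taken as given inputs; how they are computed is described in [2] and is not reproduced here.

```python
def weighted_gumbel_fit(q, y, sigma):
    """Minimise S^2 = sum_v w_v (y_v - a*q_v + Pi)^2 with w_v proportional to 1/sigma_v^2."""
    inv_var = [1.0 / s ** 2 for s in sigma]
    norm = sum(inv_var)
    w = [iv / norm for iv in inv_var]              # weights sum to 1, as in Eq. (3)
    q_bar = sum(wi * qi for wi, qi in zip(w, q))   # weighted means
    y_bar = sum(wi * yi for wi, yi in zip(w, y))
    s_qq = sum(wi * (qi - q_bar) ** 2 for wi, qi in zip(w, q))
    s_qy = sum(wi * (qi - q_bar) * (yi - y_bar) for wi, qi, yi in zip(w, q, y))
    a = s_qy / s_qq                                # slope of the fitted line
    big_pi = a * q_bar - y_bar                     # characteristic product
    return a, big_pi, w
```

Because the largest ranks carry the largest standard deviations, their weights are the smallest, which is the intended contrast with the uniform weighting of the classical fit.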

3. Peaks-over-threshold approach

To model the upper tail of the wind speed distribution, consider $k$ exceedances of
$X$ ($n$ samples) over a threshold $u$ and let $Y_1, Y_2, \ldots, Y_k$ denote the peaks,
i.e., $Y_i = X_i - u$. Pickands [3] showed that, in some asymptotic sense, the conditional
cumulative distribution of the excess $Y = X - u$ of the variate $X$ over the threshold $u$,
given $X > u$ for $u$ sufficiently large, i.e., $P(Y = X_i - u < y \mid X_i > u)$, follows the GPD:
$$G(y) = 1 - \left[1 + \frac{c\,(y - h)}{a}\right]^{-1/c}, \tag{4}$$
where $h$, $a$ and $c$ denote the location, scale and shape parameters, respectively.
Generally, the location parameter is taken as zero. The distribution has an unbounded
upper tail, i.e., $0 < y < \infty$, if $c \ge 0$, and is bounded as $0 < y < -a/c$ if $c < 0$.
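
A minimal sketch of the GPD in Eq. (4) follows. The function names are hypothetical, and the case $c \to 0$ is handled with the exponential limit $1 - e^{-(y-h)/a}$, which is a standard property of the GPD rather than something stated in the paper.

```python
import math

def gpd_cdf(y, a, c, h=0.0):
    """Generalized Pareto CDF of Eq. (4); c -> 0 is treated as the exponential limit."""
    t = (y - h) / a
    if t <= 0.0:
        return 0.0
    if abs(c) < 1e-12:
        return 1.0 - math.exp(-t)
    base = 1.0 + c * t
    if base <= 0.0:                                # at or beyond the upper bound (c < 0)
        return 1.0
    return 1.0 - base ** (-1.0 / c)

def gpd_upper_bound(a, c, h=0.0):
    """Upper end of the support: infinite for c >= 0, h - a/c for c < 0."""
    return math.inf if c >= 0.0 else h - a / c
```

For example, with $a = 5$ and $c = -0.25$ the excess is bounded at $-a/c = 20$, whereas any $c \ge 0$ leaves the tail unbounded.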
Simiu and Heckert [4] adopted the De Haan method [7] for the estimation of the
scale and shape parameters with the order statistics of exceedances,
$\{X_{n-k}, \ldots, X_n\}$, where $X_{n-k}$ is the smallest data point to exceed a given
threshold. The shape parameter is computed as
$$c = c_1 + c_2, \quad \text{where} \quad c_1 = M_n^{(1)} \quad \text{and} \quad c_2 = 1 - \frac{1}{2}\left[1 - \frac{\left(M_n^{(1)}\right)^2}{M_n^{(2)}}\right]^{-1}, \tag{5}$$
by means of moments of excesses obtained from the log-transformed data:
$$M_n^{(r)} = \frac{1}{k}\sum_{i=1}^{k}\left[\log X_{n-i+1} - \log X_{n-k}\right]^{r}. \tag{6}$$
The scale parameter can be obtained by
$$a = \frac{u\,M_n^{(1)}}{\rho}, \quad \text{where } \rho = 1 \text{ if } c \ge 0 \text{ and } \rho = 1/(1-c) \text{ if } c < 0. \tag{7}$$
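
The moment estimator of Eqs. (5)-(7) can be sketched as follows. This is an illustrative reading, not the program of Simiu and Heckert [4]: the function name is hypothetical, and the threshold $u$ is assumed to coincide with the order statistic $X_{n-k}$, i.e., the $(k+1)$th largest value in the sample.

```python
import math

def de_haan_estimates(sample, k):
    """Moment estimates (c, a, u) from the k largest values, Eqs. (5)-(7).

    Assumption: the threshold u is taken as the (k+1)-th largest value X_{n-k}.
    Wind speeds must be positive; identical log-excesses make Eq. (5) singular.
    """
    x = sorted(sample)
    n = len(x)
    if not 1 < k < n:
        raise ValueError("k must satisfy 1 < k < n")
    u = x[n - k - 1]                               # X_{n-k} in 1-based notation
    log_u = math.log(u)
    # log X_{n-i+1} - log X_{n-k} for i = 1..k (0-based index n - i)
    logs = [math.log(x[n - i]) - log_u for i in range(1, k + 1)]
    m1 = sum(logs) / k                             # M_n^(1)
    m2 = sum(d ** 2 for d in logs) / k             # M_n^(2)
    c = m1 + 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)      # c = c1 + c2, Eq. (5)
    rho = 1.0 if c >= 0.0 else 1.0 / (1.0 - c)
    a = u * m1 / rho                               # Eq. (7)
    return c, a, u
```

As a small hand check, the sample $\{1, e, e^3\}$ with $k = 2$ gives $M_n^{(1)} = 2$ and $M_n^{(2)} = 5$, hence $c = 2 + 1 - 0.5/0.2 = 0.5$ and $a = 2u = 2$.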
A quantile value, $x_R$, corresponding to an $R$-year return period is calculated from
the quantile of peaks corresponding to a return period of $\lambda R$, where $\lambda$ is
the mean exceedance (or crossing) rate per year. If $n$ denotes the number of samples
collected over $T$ years and $k$ is the number of exceedances, then $\lambda = k/T$.
Finally, a required quantile value can be estimated as
$$x_R = G^{-1}\left(1 - \frac{1}{\lambda R}\right) + u. \tag{8}$$
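
Inverting the GPD of Eq. (4) with zero location gives $G^{-1}(p) = a\left[(1-p)^{-c} - 1\right]/c$, so Eq. (8) becomes $x_R = u + a\left[(\lambda R)^{c} - 1\right]/c$. The sketch below implements this closed form; the function name is hypothetical and the logarithmic branch for $c \to 0$ is the exponential limit, added here as an assumption for numerical convenience.

```python
import math

def pot_return_level(u, a, c, k, years, return_period):
    """R-year wind speed from Eq. (8) with the GPD of Eq. (4) and zero location."""
    lam = k / years                                # mean exceedance rate per year
    m = lam * return_period                        # expected number of exceedances in R years
    if abs(c) < 1e-12:
        excess = a * math.log(m)                   # exponential limit c -> 0
    else:
        excess = a * (m ** c - 1.0) / c            # G^{-1}(1 - 1/(lam*R))
    return u + excess
```

For instance, 150 exceedances in 15 years give $\lambda = 10$ per year, so the 50-year quantile is the excess exceeded on average once in 500 exceedances.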
The POT is a useful alternative to the popular Gumbel method in the field of
extreme value estimation. However, the threshold sensitivity of quantile estimates is
a topic of concern. Experience suggests that a very high threshold resulting in a
small POT sample would increase the sampling uncertainty (variance) associated
with a quantile estimate. On the other hand, as the threshold is lowered to include more
data, quantile bias tends to increase [8]. In this sense, it is expected that an optimal
threshold might exist that would best balance bias against variance.
Galambos and Macri [9] pointed out a critical limitation: the POT method is an
asymptotic concept, and the necessary limiting assumptions cannot be satisfied in a
small data set. The reason is that the number $k$ of top order statistics used in the
analysis must be quite large, while the ratio $k/n$ must be very small. The first
requirement demands a value of $k$ of the order of hundreds; the second then requires a
very large sample size, of the order of, say, 100,000. Galambos and Macri [9] argued
that $k$ is not sufficiently large in the POT analysis of wind data presented in [4],
which can lead to erroneous estimates of the GPD shape parameter.
Another source of error can be the measurement (or round-off) error associated
with the wind speed as well as the threshold value $u$. Because of this, the relative
error associated with an excess ($X - u$) increases considerably. For example, take
$X = 25 \pm 0.5$ mph and $u = 20 \pm 0.5$ mph. There is a 2% error associated with $X$, but
the error associated with the excess, $(25 \pm 0.5) - (20 \pm 0.5) = 5 \pm 1$ mph, becomes 20%.
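
The arithmetic of this example can be reproduced directly. The sketch assumes worst-case propagation, i.e., the absolute errors of $X$ and $u$ simply add, which is consistent with the 20% figure quoted above; the function name is hypothetical.

```python
def relative_error_of_excess(x, u, abs_err_x, abs_err_u):
    """Worst-case relative error of X - u when both values carry absolute errors."""
    excess = x - u
    abs_err_excess = abs_err_x + abs_err_u         # worst case: the two errors add
    return abs_err_excess / excess

rel_x = 0.5 / 25.0                                 # 2% on the wind speed itself
rel_excess = relative_error_of_excess(25.0, 20.0, 0.5, 0.5)   # 20% on the excess
```

The closer the threshold lies to the recorded speed, the smaller the excess and the larger this relative error becomes.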

4. Modified Independent Storms Method (MIS)

In the Method of Independent Storms (MIS), a continuous time series of wind
speed records is examined to identify wind storms. The downward crossings of the wind
speed below a chosen threshold define the beginning of the lulls in the record,
and the wind speed records between each pair of lulls are considered as a part of an
independent storm (Fig. 1). The maximum speed in each independent storm is
selected to form a sample of extreme values.

[Figure 1: time series of wind speed (mph) against day (0 to 50) for Denver, CO, showing the threshold (u = 29 mph), Storm-1 and Storm-2, the MIS points and the POT points.]
Fig. 1. Illustration of difference between MIS points and POT points.

Although POT and MIS appear similar in implementation, there are subtle
conceptual differences, as shown in Fig. 1. The POT considers the values exceeding
a threshold, i.e., the excesses $X - u$, which should be independent values.
Nevertheless, it is possible that several excesses within one storm are included in the
POT data set, as shown in Fig. 1 (correlation is not considered here, for the
purpose of illustration). On the other hand, the MIS uses the actual values of storm
maxima, and ignores other smaller peaks within a storm. Furthermore, increasing the
threshold in MIS only reduces the number of independent values in the sample, while
it affects both the number and magnitude of excesses in the POT analysis.
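
The difference between the two samples can be illustrated with a short sketch. It is a simplification under stated assumptions: a single threshold both separates storms (any value below it counts as a lull) and selects the POT points, no declustering is applied to the POT sample, and both function names are hypothetical.

```python
def independent_storm_maxima(series, threshold):
    """Maximum of each run of values separated by lulls (values below the threshold)."""
    maxima, current = [], None
    for speed in series:
        if speed < threshold:                      # lull: close the current storm, if any
            if current is not None:
                maxima.append(current)
                current = None
        else:                                      # inside a storm: track its peak
            current = speed if current is None else max(current, speed)
    if current is not None:                        # storm still open at the end of the record
        maxima.append(current)
    return maxima

def pot_excesses(series, threshold):
    """All excesses X - u over the threshold, with no declustering."""
    return [speed - threshold for speed in series if speed > threshold]
```

For a record such as `[20, 31, 35, 30, 22, 18, 33, 29, 40, 25]` with `threshold=29`, the MIS sample holds two storm maxima (35 and 40 mph), while the POT sample holds five excesses, three of which come from the first storm.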
Suppose that $n$ independent storm maxima above a threshold $u$ are selected from an
$R$-year time series. The distribution of storm maxima above the threshold is given as
$$G(x) = \begin{cases} \dfrac{F(x) - F(u)}{1 - F(u)}, & x > u, \\ 0, & x \le u. \end{cases} \tag{9}$$
The average annual rate of occurrence of storms is denoted by $r$, where $r = n/R$. It
is assumed that the occurrence of storms is sufficiently uniform that the number of
storms in any given wind year does not differ significantly from $r$. The distribution of
the maxima of $r$ independent storms is given as
$$\hat{G}(x) = \left[G(x)\right]^{r}. \tag{10}$$
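
Eqs. (9) and (10) translate into two short functions, sketched here with hypothetical names. The parent distribution $F$ is passed in as a callable; using the Gumbel form of Eq. (1) for it in the usage note is an assumption made only for illustration.

```python
import math

def gumbel_cdf(x, a, big_pi):
    """Gumbel distribution of Eq. (1), used here only as an example parent F."""
    return math.exp(-math.exp(-(a * x - big_pi)))

def storm_maxima_cdf(x, u, parent_cdf):
    """Eq. (9): parent distribution conditioned on exceeding the threshold u."""
    if x <= u:
        return 0.0
    f_u = parent_cdf(u)
    return (parent_cdf(x) - f_u) / (1.0 - f_u)

def annual_maximum_cdf(x, u, parent_cdf, n_storms, years):
    """Eq. (10): distribution of the largest of r = n/R independent storm maxima."""
    r = n_storms / years                           # average number of storms per year
    return storm_maxima_cdf(x, u, parent_cdf) ** r
```

With, say, `parent = lambda x: gumbel_cdf(x, 0.2, 6.0)`, a threshold of 29 mph and 150 storms in 15 years, `annual_maximum_cdf(45, 29, parent, 150, 15)` is the conditional probability of Eq. (9) raised to the power $r = 10$.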

Considering that the storm maxima converge to the Gumbel distribution, the
$-\ln(-\ln G(x))$ ordinates are plotted against the sample order statistics of wind
speed. Following the concept of the Modified Gumbel method, Harris [6] derived the
parametric plotting position formula and recommended a weighted least-squares
method for estimating the distribution parameters. The major difference between
MG and MIS is that only the uppermost samples of the $n$ observations are used in
MIS. The effect of the threshold on the size of the uppermost sample for the MIS method
is illustrated in Fig. 3.

5. Numerical results

5.1. Wind speed data

The four methods of extreme wind speed analysis, namely SG, MG, MIS and
POT, are applied to the US wind speed data. The data consist of time series of 4-day-
interval maxima recorded at six stations, namely Albuquerque (NM), Boise (ID),
Denver (CO), Moline (IL), Toledo (OH) and Tucson (AZ). These stations are
located in the regions where hurricanes and tropical storms are not expected.
Additional information is presented in Table 1. The sample size considered in the
POT and MIS methods depends on the threshold. The sample size decreases as
threshold wind speed is raised (Fig. 2).
The wind speed data and the POT analysis program were provided by Simiu and
Heckert [4]. Computer programs for the MIS and MG methods are adopted from the
original references [2,6]. In the MIS and MG methods, the wind speed data were
preconditioned by a square transformation, i.e., the distribution was fitted to $X^2$,
whereas POT and SG utilized the original wind speed $X$. However, the results of all the
methods are compared on the basis of the wind speed in mph.
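
When the Gumbel line is fitted to the squared wind speed, a quantile has to be transformed back before it can be compared with the other methods. A minimal sketch for an annual-maximum fit is given below; the function name and the parameter values in the usage note are hypothetical.

```python
import math

def quantile_from_squared_fit(a, big_pi, return_period):
    """Wind speed quantile when the line y = a*q - Pi was fitted to q = X^2."""
    y_r = -math.log(-math.log(1.0 - 1.0 / return_period))   # Gumbel reduced variate
    q_r = (y_r + big_pi) / a                       # quantile of the squared wind speed
    return math.sqrt(q_r)                          # back to wind speed (mph)
```

For example, `quantile_from_squared_fit(0.004, 4.0, 50)` returns a 50-year speed between 44 and 45 mph for these illustrative parameter values.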

5.2. Discussion of results

Numerical results for the 50- and 500-year wind speed quantiles are plotted against the
threshold velocity in Figs. 4–9. In these figures, note that an increasing value of the
threshold on the X-axis implies a decreasing sample size, as shown in Figs. 2 and
3. Results of the Gumbel methods (SG and MG) are independent of the threshold, as
they utilize a complete sample of annual maxima.

Table 1
Information about the US wind speed data sets

Location            Period             Number of years
Moline, IL          1965/1–1979/12     15
Tucson, AZ          1965/1–1979/12     15
Albuquerque, NM     1965/1–1983/12     19
Boise, ID           1965/1–1987/12     23
Denver, CO          1965/1–1979/11     15
Toledo, OH          1965/1–1989/12     25

Wind speed measurements (all stations): daily fastest mile wind speeds
Condition of measurement (all stations): 10 m above ground in open terrain

The Standard Gumbel method appears to provide an upper bound estimate
in most cases. Albuquerque and Boise, shown in Figs. 6 and 7, respectively,
are exceptions to this observation, where MG provides the upper bound estimates.
The quantile estimates obtained from MG are quite close (within 10%) to the SG
estimates.
In general, POT estimates exhibit large variation with respect to the threshold
speed. Such high threshold sensitivity is recognized as a limitation of the POT method,
as it makes it difficult to identify a representative quantile value. A possible reason
for this variability is that the Pareto model is only applicable to a narrow and
unidentifiable range of exceedance data [10]. From Figs. 4–9, it can be clearly
observed that the POT estimates are reasonable only when the threshold is raised
to a sufficiently high level, in line with the discussion in the paper by Davison and
Smith [10].
In contrast with the POT method, MIS estimates follow a smoother trend with very
limited variability with respect to the threshold speed. The MIS curve tends to be
lower than those of MG and SG, or within the band of the SG and MG lines. The stability
of the MIS estimates with respect to the threshold is probably due mainly to the
sample-selection strategy, in which only the largest values are considered when
calculating the weighted plotting positions for the Gumbel analysis. Neither the number
nor the values of these uppermost samples change much as the threshold is raised.
The numerical integration needed to calculate the plotting positions poses some
difficulties in the implementation of MIS. To deal with this problem, the MIS method
provides a modified numerical procedure that avoids overflow and underflow, but a
constraint of this modified integration limits the largest threshold level, above which
the integration cannot proceed. This limitation should be addressed in the future.

[Figure 2: number of data points against threshold (mph) for Moline, Tucson, Albuquerque, Boise, Denver and Toledo.]
Fig. 2. The effect of threshold on the sizes of samples for POT method.

[Figure 3: number of data against threshold (mph) for Albuquerque, Boise, Denver, Moline, Toledo and Tucson.]
Fig. 3. The effect of threshold on the sizes of upmost samples for MIS method.

Although the paper presents numerical results for six stations only, we have
analyzed many other stations and obtained qualitatively similar results. It is implicit
in the analysis that there is only one dominant wind mechanism at each site. In
reality, however, some records might result from mixed wind climates, which could
be a reason for the differences observed in the quantile estimates.

6. Conclusion

The paper presents a comparison of four different methods of extreme wind
speed estimation: SG, MG, POT and MIS. These four methods are applied to a
common data set that consists of six stations in the US.
The general conclusions of the paper are:

(a) The SG method tends to provide an upper bound estimate of the 50- and 500-year
design wind speed.
(b) The MIS estimates of design wind speed exhibit a more stable trend with limited
threshold sensitivity, in contrast with the rapidly fluctuating estimates
obtained from the POT method.

Appendix A

See Figs. 4–9.

[Figure 4: panels (a) and (b) plot the quantile (mph) against the threshold (mph) for the POT, MIS, MG and SG methods.]
Fig. 4. The (a) 50-year and (b) 500-year quantile estimates for Moline (IL).

[Figure 5: panels (a) and (b) plot the quantile (mph) against the threshold (mph) for the POT, MIS, MG and SG methods.]
Fig. 5. The (a) 50-year and (b) 500-year quantile estimates for Tucson (AZ).

[Figure 6: panels (a) and (b) plot the quantile (mph) against the threshold (mph) for the POT, MIS, MG and SG methods.]
Fig. 6. The (a) 50-year and (b) 500-year quantile estimates for Albuquerque (NM).

[Figure 7: panels (a) and (b) plot the quantile (mph) against the threshold (mph) for the POT, MIS, MG and SG methods.]
Fig. 7. The (a) 50-year and (b) 500-year quantile estimates for Boise (ID).

[Figure 8: panels (a) and (b) plot the quantile (mph) against the threshold (mph) for the POT, MIS, MG and SG methods.]
Fig. 8. The (a) 50-year and (b) 500-year quantile estimates for Denver (CO).

[Figure 9: panels (a) and (b) plot the quantile (mph) against the threshold (mph) for the POT, MIS, MG and SG methods.]
Fig. 9. The (a) 50-year and (b) 500-year quantile estimates for Toledo (OH).

References

[1] E.J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.
[2] R.I. Harris, Gumbel re-visited: a new look at extreme value statistics applied to wind speeds, J. Wind
Eng. Ind. Aerodyn. 59 (1996) 1–22.
[3] J. Pickands, Statistical inference using order statistics, Ann. Stat. 3 (1975) 119–131.
[4] E. Simiu, N.A. Heckert, Extreme wind distribution tails: a peaks over threshold approach,
J. Struct. Eng. ASCE 122 (5) (1996) 539–547.
[5] N.J. Cook, Towards better estimation of extreme winds, J. Wind Eng. Ind. Aerodyn. 9 (1982)
295–323.
[6] R.I. Harris, Improvements to the method of independent storms, J. Wind Eng. Ind. Aerodyn. 80
(1999) 1–30.
[7] L. De Haan, Extreme value statistics, in: J. Galambos, J. Lechner, E. Simiu (Eds.), Extreme Value
Theory and Applications, vol. 1, 1994, pp. 93–122.
[8] M.D. Pandey, An adaptive exponential model for extreme wind speed estimation, J. Wind Eng. Ind.
Aerodyn. 90 (2002) 839–866.
[9] J. Galambos, N.J. Macri, Classical extreme value model and prediction of extreme winds, J. Struct.
Eng. 125 (7) (1999) 792–794.
[10] A.C. Davison, R.L. Smith, Models of exceedances over high thresholds, J. R. Stat. Soc. Ser. B 52
(1990) 393–442.