Methodology
Julius Lieblein
October 1974
Final Report
5. Examples
7. Further Work
References
Tables
Addendum
Efficient Methods of Extreme-Value Methodology
Julius Lieblein
distribution.
bution that is quite different from that which governs ordinary data
extreme-value data requires estimation of the parameters of the extreme-
$$\operatorname{Prob}\{X \le x\} = \exp\!\left[-e^{-(x-u)/b}\right], \qquad -\infty < x < \infty,\quad -\infty < u < \infty,\quad 0 < b < \infty \tag{1}$$
scale.
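As a minimal sketch, the c.d.f. in eq. (1) can be evaluated directly. The function name `extreme_value_cdf` and the parameter values below are assumptions made only for illustration; they are not taken from the report.

```python
import math

def extreme_value_cdf(x, u, b):
    """Prob{X <= x} from eq. (1): exp(-exp(-(x - u) / b)), with b > 0."""
    if b <= 0:
        raise ValueError("scale parameter b must be positive")
    return math.exp(-math.exp(-(x - u) / b))

# Illustrative (hypothetical) parameter values, not taken from the report.
u, b = 50.0, 6.0
for x in (40.0, 50.0, 60.0, 70.0):
    print(f"x = {x:5.1f}  Prob{{X <= x}} = {extreme_value_cdf(x, u, b):.4f}")
```

At x = u the c.d.f. equals exp(-1) for any b, which gives a quick check on an implementation.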
normal distribution, then the average value of such a sample also has a
largest values, whose c.d.f. is given above, and to which attention will
functions, called "estimators", have been utilized for this purpose, with
E(f) = u, E(g) = b,
to be "unbiased".
(2) Minimum variance. Generally, as more and more data are taken
or dispersion is represented by the variance of the estimator
unbiased estimator.
based on a given sample size, n (see [1]). This lower bound could thus be
proposed unbiased estimator, f. Their ratio thus cannot exceed unity, and
is generally less.
This ratio,
for sample size. The variances of two (unbiased) estimators may be portrayed
with increasing sample size, for two estimators of the same parameter.
[Figure: variance plotted against sample size for Estimator 1 and Estimator 2]
Estimator 1 than with Estimator 2, for which the sample size required is
n_2 > n_1. The corresponding situation for efficiency of the same two
[Figure: efficiency plotted against sample size for Estimator 1 and Estimator 2]
variance and its related efficiency have a bearing on sample size, and the
most economical estimator to use is the one with smallest variance, or,
optimally efficient estimator (i.e., E = 1) has not been found for either
order
The x's are then called "order statistics". The desired estimators of the
extreme-value parameters then take the form of a linear function of the
order statistics:
$$\hat{u}_{(n)} = \sum_{i=1}^{n} a_i x_i, \qquad \hat{b}_{(n)} = \sum_{i=1}^{n} b_i x_i \tag{4}$$
The BLUE are then obtained by finding numerical coefficients, a_i, b_i,
for which each of the two linear functions of the order statistics is an
presented in Table 1 for n ≤ 16, were obtained from [11]. Their efficiencies
are given in Table 1a based on [8] which shows that relatively
10, at which efficiency, for u, has reached 98%, and for b is 85%.
denoted by a_t and b_t, giving the following BLUE for estimating u and b
Then coefficients a_i' and b_i' for good estimators u'_(m) and b'_(m) to take
the place of the a's and b's in (4) are given by (see [7])

$$a_i' = \sum_{t=1}^{m} a_t \,(t/i)\, p(n,m,i,t), \qquad i = 1, 2, \ldots, n \tag{6a}$$

$$b_i' = \sum_{t=1}^{m} b_t \,(t/i)\, p(n,m,i,t), \qquad i = 1, 2, \ldots, n \tag{6b}$$
where
may serve as a guide for computer programming in the case of large samples.
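For programming purposes, the weighted sums in (6a) and (6b) might be sketched as below. This is an illustration under a stated assumption: the definition of p(n,m,i,t) is not legible in this copy, so the sketch ASSUMES the hypergeometric form C(i,t) C(n-i,m-t) / C(n,m). The function names `assumed_p` and `extended_coefficients` are hypothetical. The coefficients used are the a values for a sample of size 2 from Table 1.

```python
from math import comb

def assumed_p(n, m, i, t):
    """ASSUMPTION: hypergeometric form C(i,t) * C(n-i,m-t) / C(n,m).

    The report's own definition of p(n,m,i,t) is not reproduced here.
    """
    return comb(i, t) * comb(n - i, m - t) / comb(n, m)

def extended_coefficients(coef_m, n):
    """Apply (6a)/(6b): c'_i = sum over t = 1..m of c_t * (t/i) * p(n,m,i,t)."""
    m = len(coef_m)
    result = []
    for i in range(1, n + 1):
        total = 0.0
        for t in range(1, m + 1):
            total += coef_m[t - 1] * (t / i) * assumed_p(n, m, i, t)
        result.append(total)
    return result

a_m = [0.916373, 0.083627]           # a_t for m = 2 (Table 1)
a_prime = extended_coefficients(a_m, n=5)
print(a_prime)
print(sum(a_prime))                  # should stay close to sum(a_m)
```

Under the assumed form of p, the a_i' sum to the same total as the a_t, so the check sum for unbiasedness carries over to the extended coefficients, and taking n = m returns the original coefficients unchanged.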
$$\hat{u}_{(4)} = \sum_{t=1}^{4} a_t x_t, \qquad \hat{b}_{(4)} = \sum_{t=1}^{4} b_t x_t,$$
with the a_t and b_t shown in the first two columns of the worksheet, the
a_i (BLUE)    b_i (BLUE)
.13417 •
.10132 .75516
parameters) will not exceed (m/n) times the variance of the BLUE for
efficiency will not be less than that of the BLUE for samples of size m,
rough measure of the efficiency associated with this method. Table 1a,
which gives efficiencies for different sample sizes, then shows that
10, a fact already noted above with regard to total sample size, n.
4. Estimators for Very Large Samples
The preceding methods become less practical for very large samples,
necessary to use all the sample values in order to obtain good estimates.
which "space" the full set of n values in a manner analogous to the way
of the observations above the median and the same fraction below; the
median is then the "50-percent quantile" of the sample. The "optimum
n_i = {nλ_i},
where the brackets denote the next larger integer, e.g., {4.3} = 5,
{16} = 17.
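The "next larger integer" rule is not the usual ceiling, since {16} = 17. A minimal sketch follows; the names `next_larger_integer` and `selected_index`, and the argument name `lam` for the spacing fraction, are chosen here only for illustration.

```python
import math

def next_larger_integer(x):
    """Smallest integer strictly greater than x: {4.3} = 5 and {16} = 17."""
    return math.floor(x) + 1

def selected_index(n, lam):
    """1-based rank of the order statistic picked for spacing fraction lam.

    Caution: n * lam is computed in floating point, so a product that is an
    exact integer on paper may land just below it (e.g. 0.29 * 100) and give
    an index one too small. Exact fractions avoid this if it matters.
    """
    return next_larger_integer(n * lam)

print(next_larger_integer(4.3), next_larger_integer(16))
print(selected_index(21, 0.90))
```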
This procedure also produced the other quantities in Table 3. The a_i* and
b_i* are the coefficients that give the good estimators for u and b based

$$u^* = \sum_{i=1}^{k} a_i^* x_{n_i}, \qquad b^* = \sum_{i=1}^{k} b_i^* x_{n_i}$$
Rao lower bound. This variance, while not obtained from the exact variances
exact efficiency for those sample sizes where the latter is available.
The comparison is made in Table 3 with BLUE for sample size 16,* where
the ratio is obtained to the exact values given in Table 1a. The relative
u*, and almost 85 percent as good for the estimator b*, as the efficiency
of the best linear unbiased estimator obtainable for n = 16. These high
ratios justify calling these starred values good estimators, although not
best for the sample size n. What is very useful about them is that, for a
given value of k, both the same spacings, λ_i, and the same coefficients, a_i*
and b_i*, can be used for any (large) n, and do not have to be recalculated
for each different n. Thus, a very compact table such as Table 3 can
These four selected observations thus yield estimators that are almost
(from Table 3) 91% as efficient as the BLUE for u, and 84% as efficient
5. Examples
The second example applies the procedure for large samples to maximum wind
i    Unordered    Ordered, x_i    Coefficients to Obtain BLUE: a_i, b_i
Check sums should be obtained for every column of values. The first two
columns should give the same sum because reordering does not change the
these are the conditions for the estimator to be unbiased. These checks
are satisfied above. When they are not, one should look for an error in
respectively,
$$\hat{u} = \sum_{i=1}^{8} a_i x_i = 3.8724, \qquad \hat{b} = \sum_{i=1}^{8} b_i x_i = 0.4171.$$
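The check sums can be automated along the following lines. This is a sketch only: the function names are hypothetical, the tolerance rule is an assumption based on ordinary rounding of tabulated coefficients, and the coefficients used are the four-value a_i* and b_i* quoted in Ex. 2.

```python
import math

def check_sums(a, b, decimals=4):
    """True when sum(a) = 1 and sum(b) = 0 to within accumulated rounding.

    Each coefficient rounded to `decimals` places can be off by half a unit
    in the last place, so the allowance grows with the number of terms.
    """
    half_unit = 0.5 * 10.0 ** (-decimals)
    tol_a = half_unit * len(a) + 1e-12
    tol_b = half_unit * len(b) + 1e-12
    return abs(sum(a) - 1.0) <= tol_a and abs(sum(b)) <= tol_b

def same_column_sum(unordered, ordered):
    """Reordering the observations must not change their sum."""
    return math.isclose(sum(unordered), sum(ordered), rel_tol=1e-12)

a_star = [0.1893, 0.4566, 0.2772, 0.0768]    # a_i* from Ex. 2
b_star = [-0.3127, -0.1123, 0.2607, 0.1643]  # b_i* from Ex. 2
print(check_sums(a_star, b_star))
print(same_column_sum([3, 1, 2], [1, 2, 3]))
```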
sums of products then the above check sums will be found a very useful
From this one can estimate the probability that a given value of x will
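As an illustration of this use of the fitted distribution, the sketch below substitutes the Ex. 1 estimates into eq. (1). The function names and the query value of x are hypothetical; the inverse formula follows from solving eq. (1) for x.

```python
import math

U_HAT = 3.8724  # estimate of u quoted in Ex. 1
B_HAT = 0.4171  # estimate of b quoted in Ex. 1

def prob_not_exceeded(x, u=U_HAT, b=B_HAT):
    """Fitted Prob{X <= x} under eq. (1)."""
    return math.exp(-math.exp(-(x - u) / b))

def prob_exceeded(x, u=U_HAT, b=B_HAT):
    """Fitted probability that a single extreme value exceeds x."""
    return 1.0 - prob_not_exceeded(x, u, b)

def value_at_probability(p, u=U_HAT, b=B_HAT):
    """Invert eq. (1): the x for which Prob{X <= x} = p, with 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return u - b * math.log(-math.log(p))

x_query = 5.0  # hypothetical value of x, chosen only for illustration
print(prob_exceeded(x_query))
print(value_at_probability(0.99))
```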
Ex. 2. The second example uses a set of maximum annual wind speeds
from Chattanooga, Tenn. , for the 21-year period 1944 through 1964. This
example illustrates the simplified procedure for "very large" samples given
40 45 52 57
41 45 53 59
41 49 53 62
42 49 54 63
45 50 57 63
67
Check sum = 1087
Table 3 allows us to make good estimates by using only 2, 3, or 4 selected
values out of the 21. Thus, for 2 values, the fractional distances to
and n_2 = {.90(21)} = 19. The calculations, based on the values from Table 3,
x_{n_1} = 41     0.5673    -0.4837
x_19    = 63      .4327      .4857
                 a_i*       b_i*
x_1  = 40      0.1893    -0.3127
x_6  = 45       .4566    - .1123
x_14 = 54       .2772      .2607
x_19 = 63       .0768      .1643

$$u^* = \sum_{i=1}^{4} a_i^* x_{n_i} = 47.9262, \qquad b^* = \sum_{i=1}^{4} b_i^* x_{n_i} = 6.8672$$
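The four-value calculation for the Chattanooga data can be reproduced with a short script. This is a sketch for checking the arithmetic, not the report's own worksheet: the variable names are illustrative, and the ranks 1, 6, 14, 19 are taken from the example rather than recomputed from Table 3.

```python
# Maximum annual wind speeds, Chattanooga, 1944-1964 (21 values from Ex. 2).
winds = [40, 41, 41, 42, 45, 45, 45, 49, 49, 50, 52,
         53, 53, 54, 57, 57, 59, 62, 63, 63, 67]

x = sorted(winds)                 # order statistics x_1 <= ... <= x_21
ranks = [1, 6, 14, 19]            # 1-based ranks n_i used in the example
a_star = [0.1893, 0.4566, 0.2772, 0.0768]
b_star = [-0.3127, -0.1123, 0.2607, 0.1643]

selected = [x[r - 1] for r in ranks]   # convert 1-based rank to 0-based index
u_star = sum(a * v for a, v in zip(a_star, selected))
b_est = sum(b * v for b, v in zip(b_star, selected))

print(selected)
print(round(u_star, 4), round(b_est, 4))
```

The rounded results agree with the values 47.9262 and 6.8672 quoted above, which also confirms that the fourth selected observation is x_19 = 63.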
distribution, eq. (1), then the procedure for fitting the distribution,
ascending order
x_1 ≤ x_2 ≤ ... ≤ x_n.
a. n ≤ 16

$$\hat{u} = \sum_{i=1}^{n} a_i x_i, \qquad \hat{b} = \sum_{i=1}^{n} b_i x_i$$
$$a_i' = \sum_{t=1}^{m} \left[ a_t \,(t/i)\, p(n,m,i,t) \right], \qquad b_i' = \sum_{t=1}^{m} \left[ b_t \,(t/i)\, p(n,m,i,t) \right], \qquad i = 1, 2, \ldots, n,$$
p(n,m,i,t) = -
$$u'_{(n)} = \sum_{i=1}^{n} a_i' x_i, \qquad b'_{(n)} = \sum_{i=1}^{n} b_i' x_i$$
These give the linear unbiased estimators that are the best available on
the basis of the known BLUE coefficients for samples of size m. Further
c. n > about 50
(1) Take k = 4 and obtain λ_i, i = 1, 2, ..., k, and coefficients a_i* and b_i* from
Table 3.
(2) Compute n_1, n_2, n_3, n_4 with n_i = {nλ_i}, the next larger integer
to nλ_i.

$$u^* = \sum_{i=1}^{k} a_i^* x_{n_i}, \qquad b^* = \sum_{i=1}^{k} b_i^* x_{n_i}.$$
considering that only a very small portion of the observations are used.
7. Further Work
of largest values.
(1) quick graphical aids for selecting the most appropriate
different situations;
n    i    a_i           b_i

2    1    0.916 373     -0.721 348
     2     .083 627       .721 348

3    1    [illegible]   [illegible]
     2    [illegible]   [illegible]
     3     .087 966     [illegible]

4    1     .510 998     [illegible]
     2     .263 943     [illegible]
     3    [illegible]     .223 919
     4     .071 380       .248 797

5    1     .418 934     [illegible]
     2     .246 282     [illegible]
     3     .167 609     [illegible]
     4     .108 824     [illegible]
     5    [illegible]   [illegible]

6    1     .355 450     [illegible]
     2     .225 488     [illegible]
     3     .165 620     [illegible]
     4     .121 054     [illegible]
     5     .083 522     [illegible]
     6     .048 867     [illegible]

7    1     .309 008     - .423 700
     2     .206 260     [illegible]
     3     .158 590     [illegible]
     4     .123 223     [illegible]
     5     .093 747     [illegible]
     6     .067 331     [illegible]
     7     .041 841     [illegible]

9    1    0.245 539     -0.369 242
     2    [illegible]   [illegible]
     3     .141 789     - .006 486
     4     .117 357       .037 977
     5     .097 218       .065 574
     6     .079 569       .082 654
     7     .063 400       .091 965
     8     .047 957       .094 369
     9     .032 291       .088 391

10   1     .222 867     - .347 830
     2     .162 308     - .091 158
     3    [illegible]   [illegible]
     4     .112 868       .022 179
     5     .095 636       .048 671
     6     .080 618       .066 064
     7     .066 988       .077 021
     8     .054 193       .082 771
     9     .041 748       .083 552
     10    .028 929       .077 940

11   1     .204 123     - .329 210
     2     .151 384     - .094 869
     3     .126 522     - .028 604
     4    [illegible]   [illegible]
     5     .093 234       .035 284
     6     .080 222       .052 464
     7     .068 485       .064 071
     8     .057 578       .071 381
     9     .047 159       .074 977
     10    .036 886       .074 830
     11    .026 180       .069 644
[Table 1, continued: entries illegible in the source.]
Table 1a. Efficiency of BLUE Listed in Table 1

n     Efficiency for u    Efficiency for b
2     0.84047             0.42700
3      .91732              .58786
4      .94448              .67463
5      .95824              .72960
6      .96654              .75732
7      .94315              .79606
8      .97605              .81785
9      .97903              .83519
10     .98135              .84938
11     .98312              .86119
12     .98474              .87121
13     .98600              .87982
14     .98706              .88725
15     .98800              .89381
16     .98880              .89961
[Tables 2 and 3: entries illegible in the source.]
References
(1969), 125-131.
9. Mann, N. R., "Point and Interval Estimation Procedures for the Two-
Parameter Weibull and Extreme-Value Distributions," Technometrics.
11. White, J. S., "Least-Squares Unbiased Censored Linear Estimation for
the Log Weibull (Extreme Value) Distribution," J. Indust. Math. Soc.,
Vol. 14, 21-60 (1964).
ADDENDUM
p, from which an extreme is taken; and (ii) the number of such extreme
data from the same population. It was also indicated that extreme-value
Professor E. J. Gumbel [b] to the present day, and the question is still
largely open.
of the text above, dealing with maximum annual wind speeds. Records of
throughout the year. Maximum values for five-minute periods are read and
the largest of these is taken as the single maximum for the year. Thus
the amount of data "in back of" the year's maximum is represented not by
the 365 daily maxima, but by the much larger number of five-minute
sample size n = 21, and this is the distinction between p and n. What may
periods are from the same population, as this is a basis for the
experience has made it seem likely that considerable departure from such
"smooth" out irregularities in the fundamental data, which may not even
any rate, it is true that in the cited example, and in many other such
It was stated above that the basic data behind the maxima may not
even be available, yet this need not impair the application of extreme-
value methods. Many successful applications where "basic data" are not
idea is that a bar of, e.g., steel is made up of many hypothetical small
theory of flaws enunciated in 1920 [a] ; and the first statistical treat-
as the half-size bar, i.e., the "amount of data, p" is twice as much, even
knowing the amount of "basic data, p". This may be a reason that little
workers who are already applying extreme-value methods to their problems
and may be using methods of estimation that may not be the most efficient
found, through other methods, outside the scope of the present preliminary
problems. This would imply, again on a heuristic basis, that the amount
However, the writer of the present report has found evidence to justify
earlier priority, namely, to Chaplin in 1880 (see Lieblein [c]).
basic data (in amount p) come from a simple exponential distribution
data, such as the normal (Gaussian), it is known that such small values
value of p.
References
c. Lieblein, J., "Two early papers on the relation between extreme values
and tensile strength," Biometrika, Vol. 41, Parts 3 and 4, 559-560
(December 1954).
ABSTRACT
This report presents the essentials of modern efficient methods of estimating the two
parameters of a Type I extreme-value distribution. These methods are an essential
phase of the analysis of data that follow such a distribution and occur in the study
of high winds, earthquakes, traffic peaks, extreme shocks and extreme quantities
and phenomena generally. Methods are given that are appropriate to the quantity
of data available: highly efficient methods for smaller samples and nearly as
efficient methods for large or very large samples. Necessary tables are provided.
The methods are illustrated by examples and summarized as a ready guide for analysts
and for computer programming. The report outlines further work necessary to
cover other aspects of extreme-value analysis, including other distribution types
that occur in failure phenomena such as consumer product failure, fatigue failure, etc.