ELSEVIER Journal of Hydrology 157 (1994) 13-34

Effective fractal dimension and corrections to the mean of annual maxima

I.J. Dwyer*, D.W. Reed

Institute of Hydrology, Wallingford, OX10 8BB, UK
(Received 26 October 1992; revision accepted 7 December 1993)

Abstract

Given a time series of daily rainfall totals, it is recognized that the maximum daily value is, in
general, lower than the true 24 h maximum. Thus, annual maximum series under-represent the
extreme events, and correction factors are therefore required to convert them to their 24 h
counterparts. The phenomenon is generalized and examined by reference to UK time series for
rainfall, wind speed and air temperature, resulting in mean correction factors of 1.167, 1.099
and 1.036, respectively. For air temperature, the result is influenced by the choice of datum. In
general, such correction factors are unavailable, and so the effective fractal dimension, d, of the
data is investigated as a means through which to predict them; the correlation found is
promising.

1. Introduction

When investigating environmental extremes, such as those of wind, rain or temperature,
it is common to extract the extreme events from a long data record. For
example, given a long record of daily rainfall totals one may obtain the annual
maximum series. Statistics derived from this series, such as the mean annual maximum,
may then be used to assess the likelihood of extreme rainfall occurring. This
can, however, be misleading.
The daily rainfall totals represent rain which fell during a fixed interval called the
observation day (often from 0900 h one day to 0900 h the next). The statistics derived
from the annual maximum series are therefore suitable for making inferences about
rain falling in such a fixed interval rather than about rain falling in any other 24 h
interval. If it was possible to slide a 24 h window continuously across a year's data, the

* Corresponding author.

0022-1694/94/$07.00 © 1994 - Elsevier Science B.V. All rights reserved


S S D I 0022-1694(93)02439-5

maximum accumulation found would, in general, be greater than that found from the
fixed intervals of the observation day. Unfortunately, the sliding or variable maxi-
mum is often unavailable, owing to the absence of continuous (or even hourly) data.
Only the fixed maximum is available, and the extreme events are thus under-appre-
ciated.
The example of daily rainfall has been chosen to reflect the historical development
of the problem. By examining many US rainfall data sets, Hershfield and Wilson
(1958) concluded that one could obtain an approximate value of the mean variable
maximum by multiplying the mean fixed maximum by 1.13. Weiss (1964) produced
theoretical arguments to suggest that this correction factor should be 1.14. More
recently, Van Montfort (1990) suggested a procedure for estimating correction
factors in an extreme value distribution context, and the broader investigation of
Coyle et al. (1991) served as a precursor to the study presented here.
The implications of this phenomenon are far-reaching. In hydrology, the accurate
estimation of extreme rainfall can be vital to the design of river flood defences. In
particular, the mean annual maximum value is often used to characterize extreme
rainfall of a given duration. More generally, the problem is present whenever extreme
value analysis is carried out upon discretized data (that is, data consisting of totals or
averages over some basic time interval). The problem is not confined to annual
maximum series, and is relevant also to partial duration (peaks over a threshold)
series and similar data sets.
In this paper, the phenomenon is examined for the climate variables of rainfall,
wind speed and air temperature. This study is confined to 'annual maximum' type
analysis, concentrating upon correction factors for the mean annual maximum.

2. Theory

Looking more closely at how discrepancies between fixed and variable maxima
arise, the following observations can be made:
(1) The true largest event may simply be split across two fixed intervals, with part of
the event being recorded in the first interval and the remainder in the second interval,
resulting in two relatively modest measurements. The fixed maximum recorded may
then be 'displaced' from this true (variable) maximum to some other secondary event
which is better synchronized with the fixed intervals (Fig. 1(a)). The highly variable
nature of rainfall (particularly its ability to start and stop abruptly) encourages
this behaviour, whereas it is less likely to occur for a variable such as air temperature,
which tends to vary more smoothly over time. That is, the more intermittent the
variable, the more opportunity there is for the fixed maximum to be displaced.
(2) Even when the fixed maximum does coincide with the largest event, the variable
maximum will still generally be higher because, unconstrained, it searches to either
side of the fixed maximum, to maximize the result (Fig. 1(b)). If there is strong
autocorrelation (as is the case for air temperature), the ratio of the variable to the
fixed maximum will be close to unity; weaker autocorrelation will tend to yield larger
ratios. Thus, greater temporal variability (i.e. erraticism) gives rise to higher correc-
tion factors.
[Figure 1: panels (a) and (b), climate variable against time, with the fixed and variable maxima marked.]
Fig. 1. Continuous data (graph) measured as accumulations over fixed time intervals (histogram). The
variable maximum is, in general, greater than the fixed maximum, and may be part of (a) a separate event or
(b) the same event.

Thus one would expect rainfall, being highly intermittent and erratic, to render
higher correction factors than the smoother variable of air temperature; correction
factors for wind data might lie between the two. The variogram is one means through
which the temporal character of a variable can be investigated.

2.1. Variograms

Let X(t) be a random variable in time, t, which has been sampled by a series of
regularly spaced measurements x(t) = [x(1), x(2), ..., x(n)]. Variogram analysis indicates
the temporal character of X(t) in that it examines the dependence between
values separated by a given time lag. The semi-variance $\gamma(h)$ is defined as

$$\gamma(h) = \tfrac{1}{2} E\{[x(t + h) - x(t)]^2\}$$

(where E denotes statistical expectation), so that $2\gamma(h)$ is the expected squared difference
between values separated by a time lag h. An estimate of $\gamma(h)$ is obtained using
the sample semi-variance $s^2(h)$, defined by

$$s^2(h) = \frac{1}{2(n - h)} \sum_{i=1}^{n-h} [x(i + h) - x(i)]^2 \qquad (1)$$

It is an unbiased estimator provided X(t) is stationary in the mean and $\gamma(h)$ is time-invariant.
The plot of $s^2(h)$ against h is known as the sample variogram.
The variogram is used to describe the temporal character because, unlike autocorrelation,
the calculation of semi-variance does not assume second-order stationarity
(constant mean and finite constant variance throughout the series). Where
second-order stationarity does hold, however, the semi-variance and autocorrelation,
$\rho(h)$, at lag h, are related by

$$\gamma(h) = \sigma^2 [1 - \rho(h)] \qquad (2)$$

where $\sigma^2$ is the variance of the series. Missing data are easily dealt with when calculating
semi-variance: the summation in Eq. (1) is simply taken over the number of pairs
available at that lag.
A more comprehensive discussion on variograms is that given by Webster and
Oliver (1990). For this study, a non-dimensionalized version of semi-variance is
used by defining the dimensionless structure function, S(h), as

$$S(h) = \frac{2 s^2(h)}{s_1 s_2} \qquad (3)$$

where $s_1$ and $s_2$ are the standard deviations of the series [x(1), ..., x(n - h)] and
[x(h + 1), ..., x(n)], respectively. This is to allow direct comparison between variograms.
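
As an illustration of Eqs. (1) and (3), the following is a minimal Python sketch and not the authors' code. The function names are hypothetical, missing values are assumed to be coded as `None`, and the standard deviations are assumed to use the n - 1 divisor, which the text does not specify.

```python
import math


def sample_semivariance(x, h):
    """Sample semi-variance s^2(h) of Eq. (1).

    Pairs containing a missing value (None) are skipped, so the sum is
    taken over the number of pairs available at lag h.
    """
    pairs = [(x[i], x[i + h]) for i in range(len(x) - h)
             if x[i] is not None and x[i + h] is not None]
    if not pairs:
        raise ValueError("no complete pairs at this lag")
    return sum((b - a) ** 2 for a, b in pairs) / (2 * len(pairs))


def _std(values):
    """Standard deviation of the non-missing values (n - 1 divisor, an assumption)."""
    vals = [v for v in values if v is not None]
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / (len(vals) - 1))


def structure_function(x, h):
    """Dimensionless structure function S(h) of Eq. (3)."""
    s1 = _std(x[:len(x) - h])   # series x(1), ..., x(n - h)
    s2 = _std(x[h:])            # series x(h + 1), ..., x(n)
    return 2.0 * sample_semivariance(x, h) / (s1 * s2)
```

For uncorrelated values S(h) is expected to lie near 2, which follows from Eq. (2) with zero autocorrelation.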

2.2. Scaling behaviour

The spatial and temporal scaling properties of climate data have received some
attention in recent years (e.g. Beer, 1989; Ladoy et al., 1991). In particular, X(t)
exhibits simple scaling if, for all $\lambda > 1$,

$$\Pr\{[X(t + h) - X(t)] > q\} = \Pr\{[X(t + \lambda h) - X(t)] > \lambda^{H} q\} \quad \text{for all } q \in \mathbb{R}$$

where Pr denotes probability and $H \in [0, 1]$ is the scaling parameter. For example, if
$\Delta X_D$ and $\Delta X_H$ represent fluctuations between daily and hourly values, respectively,
then the above definition says that $\Delta X_D$ has the same probability distribution as
$24^{H} \Delta X_H$ (24 being the number of hours in a day). In terms of the variogram defined
above, simple scaling corresponds to the existence of a power law near the origin
(Bruno and Raspa, 1989); namely,

$$\gamma(h) \sim c\,|h|^{2H} \quad \text{as } |h| \to 0 \qquad (4)$$

where c is a constant. In this case, a log-log plot of the variogram will reveal a straight
line at small lags with a gradient $\alpha = 2H$; if sufficient second-order stationarity holds
then the same is true of the log-log structure function plot, referred to hereafter as the
log-log variogram. The gradient $\alpha$ can be estimated by linear regression.

2.3. Fractals

Let us consider now the graph of a realization of the random variable X(t) over the
time period [0, T] defined by

$$\mathrm{Gr}(X) = \{[t, X(t)] \in \mathbb{R}^2 : t \in [0, T]\}$$

If X exhibits simple scaling, then the graph Gr(X) will be a fractal, that is,
d = dim [Gr(X)] > 1
where 'dim' indicates fractal (Hausdorff) dimension. The literature covering fractal
theory is extensive (e.g. Barnsley, 1989); for present purposes, it suffices to say that d
indexes the degree of erraticism and intermittency of the variable X(t). Necessarily,
1 < d < 2, with higher d indicating greater erraticism and intermittency.
Thus the scaling and fractal properties of a variable are closely related. Indeed, if
X(t) is a stationary Gaussian process with semi-variance independent of t and satisfy-
ing Eq. (4), then the fractal dimension, the scaling parameter (H) and the log-log
variogram gradient are simply related by

$$d = 2 - H$$

giving

$$d = 2 - \alpha/2 \qquad (5)$$

(see Adler, 1981; Constantine and Hall, 1991). Although it is recognized that rainfall
is strongly non-Gaussian (Lovejoy and Schertzer, 1994), investigations reveal that
logarithmically transforming rainfall data closer to normality has no significant effect
on d. This suggests that, although the resulting value for d cannot be called the
Hausdorff fractal dimension, it is nevertheless well defined and has similar properties
(e.g. it is invariant under equivalence transformations). Consequently, and similar to
Constantine and Hall (1991), Eq. (5) is used to define the 'effective fractal dimension',
d, for cases where Eq. (4) holds.
Thus, the extent of erraticism and intermittency of a variable is characterized by the
effective fractal dimension (d) of a graph of the variable. This dimension can be
readily estimated by linear regression upon a sample log-log variogram over the
range of lags for which scaling appears to hold. As it is erraticism and intermittency
which give rise to the discrepancy between fixed and variable maxima, one would
expect a relationship to exist between d and the correction factor.

3. Methods

The aim is to compare the effective fractal dimension of a data record with the
correction factor required to convert the mean fixed maximum into the mean variable
maximum.

3.1. Correction factors

The correction factor is simply the ratio of the mean variable maximum to the mean
fixed maximum. As an example, we let x(t) = [x(1), x(2), ..., x(n)] be a series of
hourly measurements so that x(i) is the accumulation for the ith hour. Let us suppose
we are interested in 24 h extremes; the daily accumulations $f_j$ (i.e. fixed 24 h
accumulations) can be constructed from the hourly series by defining

$$\begin{aligned}
f_1 &= x(1) + x(2) + \dots + x(24) \\
f_2 &= x(25) + x(26) + \dots + x(48) \\
&\;\;\vdots \\
f_p &= x(n - 23) + x(n - 22) + \dots + x(n)
\end{aligned}$$

where p = n/24. The maximum of these is called the fixed maximum, F, so that

$$F = \max_{1 \le j \le p} (f_j)$$

Similarly, the variable 24 h accumulations, $v_j$, can be constructed by defining

$$\begin{aligned}
v_1 &= x(1) + x(2) + \dots + x(24) \\
v_2 &= x(2) + x(3) + \dots + x(25) \\
&\;\;\vdots \\
v_q &= x(n - 23) + x(n - 22) + \dots + x(n)
\end{aligned}$$

where q = n - 23. Then

$$V = \max_{1 \le j \le q} (v_j)$$

is the variable maximum. It should be noted that $\{f_j\} \subset \{v_j\}$, whence $F \le V$.
To calculate the mean of the variable and fixed maxima, m consecutive samples
$x_1(t), \dots, x_m(t)$ are used, each of length n. The fixed maxima $F_1, \dots, F_m$ and variable
maxima $V_1, \dots, V_m$ are extracted for each sample and the means

$$\bar{F} = \frac{1}{m} \sum_{i=1}^{m} F_i, \qquad \bar{V} = \frac{1}{m} \sum_{i=1}^{m} V_i$$

are calculated. The ratio

$$R = \bar{V} / \bar{F}$$

is then the ratio of the mean variable maximum to the mean fixed maximum.
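
The construction above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, each sample is treated in isolation (so events split across the border of two samples are not handled), and the duration is left as a parameter `D` with 24 as the default.

```python
def fixed_maximum(x, D=24):
    """Largest accumulation over the non-overlapping windows f_1, ..., f_p."""
    if len(x) % D != 0:
        raise ValueError("sample length must be a multiple of D")
    p = len(x) // D
    return max(sum(x[j * D:(j + 1) * D]) for j in range(p))


def variable_maximum(x, D=24):
    """Largest accumulation over all sliding windows v_1, ..., v_q, q = n - D + 1."""
    window = sum(x[:D])          # v_1
    best = window
    for i in range(D, len(x)):
        window += x[i] - x[i - D]  # slide the window forward by one interval
        best = max(best, window)
    return best


def correction_ratio(samples, D=24):
    """Ratio R of the mean variable maximum to the mean fixed maximum."""
    F = [fixed_maximum(s, D) for s in samples]
    V = [variable_maximum(s, D) for s in samples]
    return (sum(V) / len(V)) / (sum(F) / len(F))
```

Because every fixed window is also one of the sliding windows, `variable_maximum` can never return less than `fixed_maximum` for the same sample, which mirrors the inequality stated above.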
For the sake of clarity, only hourly series and 24 h maxima have been discussed. In
general, however, the series may have some other basic data interval such as minutes
or days, and the duration of the extracted maxima could be any multiple of the basic
data interval. Thus, for the ith sample, the fixed accumulations of duration D are

$$\begin{aligned}
f_1^i(D) &= x_i(1) + \dots + x_i(D) \\
f_2^i(D) &= x_i(D + 1) + \dots + x_i(2D) \\
&\;\;\vdots \\
f_p^i(D) &= x_i(n - D + 1) + \dots + x_i(n)
\end{aligned}$$

with p = n/D, as illustrated in Fig. 2(a), resulting in a fixed maximum

$$F_i(D) = \max_{1 \le j \le p} (f_j^i)$$

The variable accumulations of duration D could be defined as previously (Fig. 2(b))
with the first and last variable intervals coinciding with the first and last fixed intervals.
An event occurring across the border of two samples, however, is not properly
represented by such a convention. If the variable maximum is to represent 'truth', it is
necessary to extract all extremes, including those which are split across two samples.
A convention is adopted whereby a border event is neither missed nor 'counted twice'
in the sense of it contributing to the variable maximum for both samples (see Appendix A
for further details). If $[v_j(D)]$ denotes the resulting set of q variable accumulations
of duration D assigned to sample i, then

$$V_i(D) = \max_{1 \le j \le q} [v_j(D)]$$

defines the variable maximum for sample i. Thus, the ratio of the means of variable to
fixed maxima for each duration D is defined by

$$R(D) = \frac{\bar{V}(D)}{\bar{F}(D)} = \frac{\sum_{i=1}^{m} V_i(D)}{\sum_{i=1}^{m} F_i(D)}$$


The ratio R(D) can be used as a sample estimate of the population ratio. Dropping
the notation D, an estimate, s(R), of the standard deviation of R(D) was suggested by
Barnett (1974) as

$$s^2(R) = \frac{1}{m(m - 1)\bar{F}^2} \sum_{i=1}^{m} (V_i - R F_i)^2 \qquad (6)$$

Confidence intervals for R(D) can then be calculated using Normal percentage points.
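
A small Python sketch of Eq. (6) and the associated Normal interval is given below. The function names are hypothetical, and the default multiplier z = 1.96 is the usual two-sided 95% Normal percentage point, assumed here for illustration.

```python
import math


def ratio_and_sd(F, V):
    """Return the ratio R of means and its estimated standard deviation s(R), Eq. (6)."""
    m = len(F)
    f_bar = sum(F) / m
    v_bar = sum(V) / m
    R = v_bar / f_bar
    s2 = sum((v - R * f) ** 2 for f, v in zip(F, V)) / (m * (m - 1) * f_bar ** 2)
    return R, math.sqrt(s2)


def normal_interval(R, s, z=1.96):
    """Two-sided confidence interval R +/- z * s(R)."""
    return (R - z * s, R + z * s)
```

For example, with fixed maxima (10, 10) and variable maxima (11, 13) the sketch gives R = 1.2 and s(R) = 0.1.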

Choice of datum
There is, however, a degree of arbitrariness in the value of R in that it is relative to
the datum (zero level) of the units of the variable; in other words, it is not location-invariant.
For instance, the ratio of the air temperatures 10°C and 5°C is 10/5 = 2,
whereas when measured in kelvin the ratio becomes 283/278 ≈ 1.02. Wind
speed and rainfall offer a natural datum corresponding to 'no wind' and 'no rain',
respectively. This is not the case for temperature, however, as it is meaningless to talk
of 'no temperature'. This problem is formalized in Appendix B.

[Figure 2: panels (a) and (b), climate variable against time, showing the fixed accumulations f1, f2, ... and the variable accumulations v1, v2, ...]
Fig. 2. For duration D = 4 the figure represents the construction of (a) the fixed accumulations f1, f2, ... and
(b) the variable accumulations v1, v2, ... for a discrete time series.
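
The dependence on the datum can be made concrete with a short Python sketch. It assumes, purely for illustration, that the datum enters as a constant subtracted from both mean maxima before the ratio is taken; the function name is hypothetical and the formal treatment is the one referred to in Appendix B.

```python
def ratio_relative_to_datum(mean_variable_max, mean_fixed_max, datum=0.0):
    """Ratio of two values after both are measured from the given datum."""
    return (mean_variable_max - datum) / (mean_fixed_max - datum)


# The temperature example from the text: 10 and 5 measured from 0 degrees C,
# then the same two temperatures measured from a datum 273 units lower.
ratio_celsius = ratio_relative_to_datum(10.0, 5.0, datum=0.0)
ratio_kelvin = ratio_relative_to_datum(10.0, 5.0, datum=-273.0)
```

The same pair of values therefore yields a ratio of 2 from one datum and roughly 1.02 from the other, which is why a consistent datum is needed before ratios are compared.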
Potential candidates for the datum of an air temperature record include the mean
and minimum of the entire record. A datum such as the mean, which divides the
record into positive and negative values, may render negative values of R for samples
which contain only a few positive values split across two fixed intervals. Negative
ratios are difficult to interpret, and the minimum is therefore a preferred datum to the
mean. However, the minimum suffers from being dependent on a single event. As a
compromise, the lower 1% quantile is used; this is more stable than the minimum,
does not admit negative values for R, and in practice yields a datum of zero for hourly
wind speed and rainfall.

[Figure 3: ratio R(D) against duration D.]
Fig. 3. Plot of ratios R(D) against duration D for wind speed record A.

Model for R(D)
Having obtained the ratios R(D) for a given data record, plots of R(D) against D
can be constructed, such as is shown in Fig. 3 for hourly wind speed at Eskdalemuir,
southern Scotland. The behaviour exhibited in this example is typical in that there is a
rapid increase in R(D) for small D, which then levels off, leaving R(D) to fluctuate
about some constant value. This can be explained by considering the two finite sets of
fixed and variable accumulations as separate approximations to the infinite set of
accumulations obtained by sliding a window continuously across the data. For any
duration D > 1, the variable maximum is more accurate than the fixed maximum in
the sense of there being D times more intervals over which to take the maximum.
Therefore, as D increases, the relative accuracy of the variable maximum increases,
whereupon the ratio converges to its true value.
Following Coyle et al. (1991), this behaviour is modelled as exponentially diminishing
growth to a limiting constant. Thus, denoting the model function by $R_e(D)$,
and noting that R(1) = 1, gives

$$R_e(D) = 1 + a\{1 - \exp[-b(D - 1)]\} \qquad (7)$$


Fig. 4 shows this model fitted to the graph of Fig. 3 by nonlinear least-squares
regression, resulting in estimates for the parameters of a = 0.1050 and b = 0.0717.

[Figure 4: ratio R(D) against duration D, with the fitted curve.]
Fig. 4. Model relationship (Eq. (7)) fitted by nonlinear regression to wind speed record A.

The limiting value, R*, of the ratio is given by

$$R^* = \lim_{D \to \infty} R_e(D) = 1 + a$$

whence R* = 1.105 for Fig. 4. The parameter b indicates the rate at which the curve
approaches this limit; for instance, defining the p%-life as the duration $D_p$ at which
the curve attains p% of its limiting growth above one gives

$$R_e(D_p) = 1 + a\,\frac{p}{100}$$

whence

$$D_p = 1 - \frac{1}{b} \ln(1 - p/100)$$

Thus $D_{95} \approx 43$ for Fig. 4. It should be noted that the quantity D95 should not be
interpreted as the duration above which the correction factor R* can be confidently
applied, but rather as the duration above which R* can be confidently approximated
by Re(D); that is, R* is considered to apply across all durations.
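
The model of Eq. (7), its limiting value and the p%-life can be sketched in Python as below. The fitting routine is an assumption made for illustration: it replaces a general nonlinear least-squares solver with a grid search over b, using the fact that for fixed b the model is linear in a. All function names are hypothetical.

```python
import math


def model_ratio(D, a, b):
    """Model function of Eq. (7): exponentially diminishing growth from 1 to 1 + a."""
    return 1.0 + a * (1.0 - math.exp(-b * (D - 1)))


def limiting_ratio(a):
    """Limiting value R* = 1 + a."""
    return 1.0 + a


def p_life(b, p=95.0):
    """Duration D_p at which the curve attains p% of its limiting growth above one."""
    return 1.0 - math.log(1.0 - p / 100.0) / b


def fit_model(durations, ratios, b_grid):
    """Least-squares fit of (a, b): grid search over b, closed-form a for each b."""
    best = None
    for b in b_grid:
        g = [1.0 - math.exp(-b * (D - 1)) for D in durations]
        denom = sum(gi * gi for gi in g)
        if denom == 0.0:
            continue
        a = sum(gi * (r - 1.0) for gi, r in zip(g, ratios)) / denom
        sse = sum((r - 1.0 - a * gi) ** 2 for gi, r in zip(g, ratios))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best[0], best[1]
```

With the parameter estimates quoted for Fig. 4 (a = 0.1050, b = 0.0717), `limiting_ratio` gives 1.105 and `p_life` gives a 95%-life of about 43 basic data intervals.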
By modelling R(D) in this way, it is assumed that the variation about the regression
curve is due to sampling error. To check this, the 95% confidence intervals for R(D)
can be calculated using Eq. (6) to see whether the regression curve falls within them.
For the data record illustrated in Fig. 4, and a number of other records examined, the
regression curve did fall within the intervals for all or almost all D, thus supporting
the assumption stated.

For each climate series examined, estimates are obtained for R* and D95 together
with their respective 95% confidence intervals (from the regression fit). The maximum
duration used in fitting the model must be high enough to represent the approach to
R* adequately, but not so high that it approaches the length of the samples; for the
series considered here, a maximum duration of 32 or 64 basic data intervals proved
apt.
No theoretical basis is claimed for the exponential model (Eq. (7)) other than it
matches the behaviour observed. Only exceptionally did the model appear to be
inadequate, as was the case for one of the data records presented here. With regard
to air temperature data, all of the above can be analogously applied to the analysis of
minima, with the upper 1% quantile being used for the datum.

3.2. Higher-order moments

It has been discussed how discretization causes the mean of the maxima to be
underestimated. This must clearly be corrected if the extremes are to be properly
appreciated. It is conceivable, however, that the variance of the maxima may also
be distorted by discretization.
Coyle et al. (1991) reported that the coefficient of variation (CV) for the fixed and
variable maxima are broadly similar. It is becoming efficacious to calculate L-moment
ratios (see Hosking, 1990), which are similar to conventional moment ratios but suffer
less from sample variability, are more robust to outliers and are bounded in [-1, 1]
(thus allowing easier comparisons). Consequently, for each data record, the L-CV, L-
skewness and L-kurtosis are calculated for the set of fixed maxima
$[F_i(D): i = 1, \dots, m]$ and variable maxima $[V_i(D): i = 1, \dots, m]$ at each duration
D. The results are plotted to discern any systematic higher-order differences between
the fixed and variable maxima. The L-moments were calculated by the method of
plotting positions (Hosking, 1990).
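
For readers unfamiliar with L-moment ratios, a minimal Python sketch follows. It is not the method used in the study: it uses the unbiased probability-weighted-moment estimators rather than plotting positions, and the function name is hypothetical.

```python
def sample_l_moment_ratios(values):
    """Return (L-CV, L-skewness, L-kurtosis) from unbiased sample L-moments."""
    x = sorted(values)
    n = len(x)
    if n < 4:
        raise ValueError("at least four values are needed")
    # Probability-weighted moments b0..b3; j is the zero-based rank.
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    b3 = sum(j * (j - 1) * (j - 2) * x[j] for j in range(n)) / (
        n * (n - 1) * (n - 2) * (n - 3))
    # First four L-moments as linear combinations of the b's.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l2 / l1, l3 / l2, l4 / l2
```

Applying the function separately to the fixed maxima and to the variable maxima at each duration gives the two sets of ratios that are compared.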

3.3. Effective fractal dimension

For each data record, the dimensionless structure function S(h) is calculated for
lags h = 1 , 2 , . . . , 100 using Eq. (3). A log-log plot of S(h) against h is then con-
structed, as illustrated in Fig. 5(a) for Eskdalemuir wind speed data.
As discussed above, linearity in this plot is indicative of scaling behaviour and is
anticipated at small lags. The range of lags over which linearity holds, known as the
range of scaling, is assessed by eye, and a straight line is fitted by least-squares regression (Fig.
5(b)). The resulting estimate of the gradient $\alpha$ of this line, together with its 95%
confidence interval, is used in Eq. (5) to calculate the effective fractal dimension d
of the data record and its 95% confidence interval.
It should be noted that in the example of Fig. 5, S(1) is slightly too low to align with
the succeeding points, hence the lag one point is excluded from the analysis. This
phenomenon is common, and is disconcerting in that Eq. (4) suggests an upper bound
to the range of scaling but no lower bound. The lag one point is excluded (preferring
stability of the calculated dimension to theoretical consistency) pending a better
understanding of the phenomenon. Possible explanations include the skewness of the
data, the presence of non-stationarity or multiple scaling behaviour.

[Figure 5: panels (a) and (b), S(h) against lag h on log-log axes.]
Fig. 5. (a) Dimensionless structure function S(h) plotted against lag h with (b) a straight line fitted over the
range of scaling for wind speed record A.
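
The estimation of the effective fractal dimension from a log-log variogram can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the function name is hypothetical, the caller is assumed to pass only the lags judged to lie in the range of scaling (for example, omitting lag one), and no confidence interval is computed.

```python
import math


def effective_fractal_dimension(lags, s_values):
    """Fit log S(h) = const + alpha * log h by least squares; return (alpha, d).

    d follows from Eq. (5): d = 2 - alpha / 2.
    """
    u = [math.log(h) for h in lags]
    w = [math.log(s) for s in s_values]
    n = len(u)
    u_bar = sum(u) / n
    w_bar = sum(w) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (wi - w_bar) for ui, wi in zip(u, w))
    alpha = sxy / sxx
    return alpha, 2.0 - alpha / 2.0
```

As a check on the arithmetic, a structure function that follows an exact power law with gradient 0.9 returns d = 1.55.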

4. Data series

This study examines 21 data records: eight rainfall, six wind speed and seven air
temperature. All records are from the same site (Eskdalemuir, southern Scotland) and
have a basic data interval of 1 h. Each air temperature record is analysed in terms of
extremes of minima as well as maxima, and thus a total of 28 sets of results is presented.
Setting m = 32, each data record then consists of 32 consecutive samples of length
n. It should be noted that n must be a multiple of D; for the samples used here, n is
chosen to be the greatest multiple of D such that $n \le 2^9$ (512). Thus, to construct 32
consecutive samples requires a data record of $2^{14}$ values, that is, nearly 2 years of
hourly data. These conventions follow Coyle et al. (1991). The data available to the
study span the years 1970-1989 inclusive, resulting in a maximum of 10 data records
for each variable. Thus, each data record has associated with it a letter A-J corresponding
to the period it covers. For reasons of quality assurance, not all the possible
records for each variable are analysed; most notably, significant changes in instru-
mentation and processing discouraged analysis of wind speed data beyond March
1981.

5. Results and discussion

First, the graphs and results of three representative examples are presented: one
each of rainfall, wind speed and air temperature maxima. Discussion and comparison
of these serve to summarize the complete set of results. Next, results are presented
supporting R* as a well-defined quantity, and finally the graph comparing the correc-
tion factor R* and effective fractal dimension d of the records is commented upon.
The log-log variogram for air temperature (Fig. 6(a)) shows a strong linear relationship,
with regression yielding a relatively low effective fractal dimension d = 1.270
and a tight 95% confidence interval (1.257, 1.283); in this case, as with all the air
temperature records, the lag one point can be satisfactorily included in the analysis.
The sinusoidal behaviour of the variogram reflects the diurnal cycle and may distort
the range of scaling. Fig. 6(b) exemplifies the low correction factors associated with
air temperature data; the model renders a limiting value R* = 1.032, and a narrow
95% confidence interval (1.030, 1.034). The rate of convergence to this limiting value
is slow, with a D95 of about 17 h; owing to the relative insensitivity of the model to the
parameter b, the confidence interval for D95 is broad, namely (12, 26).
Wind speed data also demonstrate strong scaling behaviour over a wide range of
lags (Fig. 7(a)). The resulting dimension d = 1.551 is appreciably higher than that of
air temperature, as is the correction factor R* = 1.098 (Fig. 7(b)). The plot of R(D) is
particularly well behaved for wind speed data (low variation about the regression
line). Fig. 8 shows the L-CV, L-skewness and L-kurtosis of the fixed and variable
maxima for this record. There is no evidence of systematic differences between the
fixed and variable maxima at higher moments. Broadly similar results were found for
the other data records, confirming that it is generally sufficient to correct only the
mean of the maxima.
The graphs for rainfall data disclose some important differences from those for
wind speed and air temperature. The structure function (Fig. 9(a)) very rapidly
approaches the sill (S(h)= 2), reached when the separation lag becomes great
enough for values to be uncorrelated. Consequently, it can be difficult to fit a straight
line, as this results in a very limited range of scaling and can make it difficult to
separate the scaling region from the sill; much higher values of d are obtained than
for air temperature or wind speed. The resulting correction factors are also higher;
R* = 1.163 (±0.011) for Fig. 9(b). Convergence to R* is rapid, although high variation
about the regression line accounts for the relatively broad confidence intervals
for R*.

[Figure 6: panel (a) S(h) against lag h on log-log axes, annotated d = 1.270; panel (b) ratio R(D) against duration D, annotated R* = 1.032.]
Fig. 6. (a) The log-log variogram and (b) graph of R(D), complete with regression lines, for air temperature
record E.

[Figure 7: panel (a) S(h) against lag h on log-log axes; panel (b) ratio R(D) against duration D, annotated D95 ≈ 47.]
Fig. 7. (a) The log-log variogram and (b) graph of R(D), complete with regression lines, for wind speed
record B.

[Figure 8: three panels, L-CV, L-skewness and L-kurtosis against duration D.]
Fig. 8. Graphs of L-CV, L-skewness and L-kurtosis of the fixed and variable maxima (solid and dotted
lines, respectively) for wind speed record B.
To establish R* as a well-defined quantity, it is helpful to examine its sensitivity to
data record length and resolution. The former is achieved by concatenating the eight
rainfall records analysed (A-H), to obtain one long record of $2^{17}$ hourly values
spanning the years 1970-1984. Retaining m = 32 (i.e. dividing the record into 32
consecutive samples), fixed and variable maxima are extracted from samples eight
times longer than previously. The resulting R* = 1.172 is consistent with the average
R* (1.167) for the eight separate records. Calculating the effective fractal dimension of
this concatenated record yields d = 1.806. Current investigations (to be reported)
using 8 h and daily rainfall records indicate that R* is generally insensitive to the
time resolution of the data, at least in the range 1 h to 1 day. Thus, R* is considered to
be sufficiently well defined for application to rainfall data.

[Figure 9: panel (a) S(h) against lag h on log-log axes, annotated d = 1.806; panel (b) ratio R(D) against duration D.]
Fig. 9. (a) The log-log variogram and (b) graph of R(D), complete with regression lines, for rainfall
record H.

[Figure 10: ratio R* against fractal dimension d; symbols for rainfall, wind speed, air temp. (max) and air temp. (min).]
Fig. 10. Graph comparing the correction factor R* with the effective fractal dimension d for each of the 28
sets of results. Also marked (cross) is the result for the concatenated rainfall record.
Results of the discretization study are usefully summarized by plotting R* against d
(Fig. 10). A clear association between the two statistics has emerged, with higher d
giving rise to higher R* in an approximately linear fashion. However, there is
considerable scatter, largely in the R* component. It should be noted that this
relationship could easily be destroyed by using different datum types for climate variables
such as air temperature: in view of this, consistency of datum (in this case, the 1%
quantile) is considered imperative to the comparison. Thus, for the given datum, the
effective fractal dimension can be used to indicate an approximate range for the
expected value of R*.

Table 1
Mean and standard deviation (SD) of R* for each climatic variable, and a suggested range given as 2SD
either side of the mean

Variable                   Mean of R*   SD of R*   Suggested range (a)

Air temperature maxima     1.036        0.004      (1.03, 1.04)
Air temperature minima     1.041        0.002      (1.04, 1.05)
Wind speed                 1.099        0.008      (1.08, 1.11)
Rainfall                   1.167        0.013      (1.14, 1.19)

(a) To two decimal places.
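
As a worked check of how the suggested ranges in Table 1 follow from the tabulated means and standard deviations, the short sketch below computes the mean plus or minus 2SD and rounds to two decimal places. The function name is hypothetical. Because the tabulated means and SDs are themselves rounded, rows whose bounds fall on a rounding boundary (for example 1.041 + 2 x 0.002 = 1.045) may not reproduce the published figure exactly.

```python
def suggested_range(mean_r, sd_r, k=2.0, places=2):
    """Return (lower, upper) as mean -/+ k standard deviations, rounded."""
    lower = round(mean_r - k * sd_r, places)
    upper = round(mean_r + k * sd_r, places)
    return lower, upper


# Values taken from Table 1.
rainfall_range = suggested_range(1.167, 0.013)
air_temp_max_range = suggested_range(1.036, 0.004)
```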
Fig. 10 also indicates the sample variability of d and R*. In terms of effective fractal
dimension, there is clear segregation of the three climate variables into narrow bands.
Although this was anticipated, in view of their differing temporal character, the
fineness of the segregation was not. Segregation of the variables into bands is less
marked in terms of R*; Table 1 shows the mean and standard deviation of R* for each
variable. The highest R* value for wind speed corresponds to the one data record for
which the exponential model (Eq. (7)) had proved inadequate. It is shown for com-
pleteness, but was censored from the calculations for Table 1.
The results for rainfall suggest that the Weiss (1 day to 24 h) correction factor of
1.14 is generally too low for the Eskdalemuir site. It is interesting that the correction
factors required for air temperature maxima at Eskdalemuir tend to be slightly lower
than those required for air temperature minima.
The success of the model, showing R(D) fluctuating about some limiting value,
supports the view that R* may be applied across all durations. Thus, whereas
Hershfield and Wilson (1958) concluded that it is by coincidence that the correction
factors to be applied to hourly and daily rainfall are the same, these results suggest
that correction factors for different durations are naturally equal. However, this
conclusion is tentative and might only apply to the range of durations considered.
Indeed, prompted by evidence that rainfall is a multiple scaling rather than a simple
scaling process (Lovejoy and Schertzer, 1994), research continues into the scaling
behaviour of rainfall by the examination of data sets of differing resolutions.

6. Conclusions

It is often necessary to carry out extreme value analysis upon discretized data. In
such cases, to appreciate fully the magnitude of extremes, it is appropriate to adjust
the mean fixed maximum to reflect the mean variable maximum. Correction factors
for doing so are, however, sample dependent, making their estimation difficult and
imprecise. Nevertheless, the correction factors found clearly distinguish between the
climate variables studied, whence an approximate range for the expected correction
factor according to type of variable can be indicated (Table 1). There is no evidence to
suggest that the higher-order moments need adjusting.
There is evidence that many climate variables exhibit simple scaling behaviour. The
effective fractal dimension, d, is a measure of the temporal erraticism and intermit-
tency of such variables. Our results show that d also clearly distinguishes between the
three types of variable considered, demonstrating the unique temporal character of
each. Furthermore, unlike R* and many other statistical measures, its calculation is
independent of datum and, owing to the scaling properties, it is also independent of
the time-scale. All this suggests that effective fractal dimension is a meaningful and
useful measure of the temporal variability of climatological time series.
As it is erraticism and intermittency which give rise to the discrepancy between
fixed and variable maxima, the calculation of d for a given data record may serve to
indicate the appropriate correction factor. For the data collected, a positive correlation
exists such that d can be used to indicate an approximate range for the correction
factor. The relationship may be investigated further by examining other sites with
different climate regimes, as well as including other types of variable, not solely
climatic.

7. Acknowledgements

The Allowance for Discretization in Hydrological and Environmental Risk
Estimation (ADHERE) project is funded by the Terrestrial and Freshwater Sciences
Directorate of the UK Natural Environment Research Council. We thank Andrew
Coyle, Bruce Kelbe, Christine Simmonds and Lisa Stewart for their contributions to
the project. The Eskdalemuir hydrometeorological data were supplied by the UK
Meteorological Office. We are grateful for referees' comments, one of which
prompted the analysis of the concatenated rainfall record.

8. References

Adler, R.J., 1981. The Geometry of Random Fields. Wiley, New York, pp. 188-206.
Barnett, V., 1974. Elements of Sampling Theory. English University Press, London, Chapter 3.
Barnsley, M., 1989. Fractals Everywhere. Academic Press, Boston, MA, pp. 1-394.
Beer, T., 1989. Rainfall as a fractal process. Proc. 4th Int. Meeting on Statistical Climatology, Rotarua,
New Zealand, March 1989. New Zealand Meteorological Association, Wellington, pp. 269-272.
Bruno, R. and Raspa, G., 1989. Geostatistical characterization of fractal models of surfaces. In: M.
Armstrong (Editor), Geostatistics, Vol. 1, Kluwer, Dordrecht, pp. 77-89.
Constantine, A.G. and Hall, P., 1991. Characterizing surface smoothness via estimation of effective fractal
dimension. Centre for Mathematics and its Applications, Canberra, A.C.T., Research Rep.
CMA-SR20-91. (See J. R. Statist. Soc. B, 56(1): 97-113.)
Coyle, A.J., Kelbe, B.E., Reed, D.W. and Stewart, E.J., 1991. A temporal look at hydrological extremes.
Proc. British Hydrological Society 3rd Nat. Hydrol. Symp., Southampton, 16-18 September, British
Hydrological Society, London, pp. 6.51-6.59.
Hershfield, D.M. and Wilson, W.T., 1958. Generalizing of rainfall-intensity frequency data. IUGG/IAHS
Publ. 43: 499-506.
Hosking, J.R.M., 1990. L-Moments: analysis and estimation of distributions using linear combinations of
order statistics. J. R. Statist. Soc. B, 52(1): 105-124.
Ladoy, Ph., Lovejoy, S. and Schertzer, D., 1991. Extreme variability of climatological data: scaling and
intermittency. In: D. Schertzer and S. Lovejoy (Editors), Nonlinear Variability in Geophysics. Kluwer
Academic, Dordrecht, pp. 241-250.
Lovejoy, S. and Schertzer, D., 1994. Multifractals and rain. In: Z.W. Kundzewicz (Editor), New Uncertainty
Concepts in Hydrology and Water Resources, UNESCO Series in Water Sciences. Cambridge
University Press, Cambridge.
Van Montfort, M.A.J., 1990. Sliding maxima. J. Hydrol., 118: 77-85.
Webster, R. and Oliver, M.A., 1990. Statistical Methods in Soil and Land Resource Survey. Oxford
University Press, Oxford, Chapter 12.
Weiss, L.L., 1964. Ratio of true to fixed-interval maximum rainfall. J. Hydraul. Div. Proc. ASCE, 90: 77-82.

9. Appendix A

To ensure that events occurring across the border of two samples are not missed, we
adopt a convention whereby a variable interval which is split across the border is
assigned to the sample in which it is mostly contained; if it is equally split between
them then it is assigned to the former (Fig. A1). Equally, a border event must not be
'counted twice' in the sense of it contributing to the variable maximum for both
samples. Thus, if the variable maximum for the first sample straddles the border,
then the variable intervals assigned to the second sample must not include those which
overlap with this maximum (Fig. A2).

[Figure A1: bar chart of the climate variable across the border between sample i and sample i+1.]

Fig. A1. Border intervals for duration D = 4. In searching for the variable maximum for each sample, the
accumulations I1 and I2 are assigned to sample i whereas I3 is assigned to sample i + 1.
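
The assignment rule for border intervals stated in this appendix can be sketched as a small function. This is an illustrative reading of the convention, not the authors' code: the name `assign_border_interval`, the use of integer time-step indices and the half-open interval are all assumptions made here.

```python
def assign_border_interval(start, duration, border):
    """Decide which sample an interval of `duration` time steps belongs to.

    The interval covers steps start, ..., start + duration - 1, and `border`
    is the index of the first step of the later sample. Returns 0 for the
    earlier sample and 1 for the later one. An equally split interval goes
    to the earlier (former) sample.
    """
    end = start + duration
    steps_in_former = max(0, min(end, border) - start)
    steps_in_latter = duration - steps_in_former
    return 0 if steps_in_former >= steps_in_latter else 1


# Duration D = 4 with the later sample starting at step 8 (hypothetical indices).
mostly_former = assign_border_interval(5, 4, 8)   # 3 steps before the border
equally_split = assign_border_interval(6, 4, 8)   # 2 steps either side
mostly_latter = assign_border_interval(7, 4, 8)   # 1 step before the border
```

With D = 4 the three intervals that straddle the border are assigned to the earlier, earlier and later sample respectively, which is the pattern described in the caption of Fig. A1. The second part of the convention (excluding intervals that overlap a border-straddling maximum, Fig. A2) is not covered by this sketch.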
[Figure A2: bar chart of the climate variable across the border between sample i and sample i+1.]

Fig. A2. If, for duration D = 4, accumulation I is the variable maximum for sample i, then the first variable
accumulation to be assigned to sample i + 1 is vl, as it does not overlap with accumulation 1.

10. Appendix B

Let us consider the effect upon the ratio R(D) of changing the units of the variable
X. We describe a change of units by an invertible transformation $h: \mathbb{R} \to \mathbb{R}$ such that if
X = x in the old units then X = x' = h(x) in the new units. If h is linear (h(x) = rx,
r constant), then this represents a simple change of scale such as converting knots
to tenths of knots or millimetres to inches. The ratio is unchanged, however, as

\[ R'(D) = \frac{V'(D)}{F'(D)} = \frac{rV(D)}{rF(D)} = \frac{V(D)}{F(D)} = R(D), \qquad r \neq 0 \]

If, however, h is affine (h(x) = rx + d, r and d constants), then this represents a change
of scale plus a relocation of the zero level (or datum) such as occurs when converting
degrees centigrade to degrees Fahrenheit. In this case, the ratio does change, as, in
general,

\[ \frac{rV(D) + d}{rF(D) + d} \neq \frac{V(D)}{F(D)} \]

Thus, the value of R(D) is dependent upon the choice of datum.
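
The argument above can be checked numerically with a small sketch. The two maxima below are hypothetical values chosen only for illustration (they are not taken from the Eskdalemuir data), and the helper names are assumptions.

```python
def ratio(variable_max, fixed_max):
    """Ratio of the variable maximum to the fixed maximum."""
    return variable_max / fixed_max


def change_units(x, r, d=0.0):
    """Affine change of units h(x) = r*x + d; linear when d is 0."""
    return r * x + d


# Hypothetical maxima in degrees centigrade.
v, f = 30.0, 25.0

original = ratio(v, f)
# Linear change of scale (d = 0): the ratio is preserved.
rescaled = ratio(change_units(v, 10.0), change_units(f, 10.0))
# Centigrade to Fahrenheit (r = 1.8, d = 32): the datum moves and the ratio changes.
fahrenheit = ratio(change_units(v, 1.8, 32.0), change_units(f, 1.8, 32.0))
```

Here the ratio is 30/25 = 1.2 in the original units and after the pure rescaling, but 86/77 (about 1.117) after conversion to Fahrenheit, which illustrates why a consistent datum matters when comparing correction factors for air temperature.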
