Data-Based Mechanistic Modelling and the Rainfall-Flow Non-Linearity
5, 335-363 (1994)
SUMMARY
Although rainfall-flow processes have received much attention in the hydrological literature, the nature of
the non-linear processes involved in the relationship between rainfall and river flow still remains rather
unclear. This paper outlines the first author’s data-based mechanistic (DBM) approach to model structure
identification and parameter estimation for linear and non-linear dynamic systems and uses it to explore
afresh the non-linear relationship between measured rainfall and flow in two typical catchments. Exploiting
the power of recursive estimation, state dependent non-linearities are identified objectively from the time
series data and used as the basis for the estimation of non-linear transfer function models of the rainfall-
flow dynamics. These objectively identified models not only explain the data in a parametrically efficient
manner but also reveal the possible parallel nature of the underlying physical processes within the catch-
ments. The DBM modelling approach provides a useful tool for the further investigation of rainfall-flow
processes, as well as other linear and non-linear environmental systems. Moreover, because DBM model-
ling uses recursive estimation, it provides a powerful vehicle for the design of real-time, self-adaptive envir-
onmental management systems. Finally, the paper points out how DBM models can often be interpreted
directly in terms of dynamic conservation equations (mass, energy or momentum) associated with environ-
mental flow processes and stresses the importance of parallel processes in this connection.
KEY WORDS Data-based mechanistic modelling Objective inference Rainfall-flow processes
Soil moisture non-linearity Evapo-transpiration Recursive estimation Fixed
interval smoothing Time variable and state dependent parameters Transfer
functions Parallel flow processes Flood warning Active mixing volume
Imperfect mixing
1. INTRODUCTION
Over many years, the first author has promoted an objective, data-based approach to the
modelling of environmental and other systems (see e.g. References 1-7). This favours simpler
mathematical descriptions of complex environmental processes, arguing that large, highly
parameterized models cannot be justified statistically because of the inherent limitations in the
available environmental time series data. Coupled with the inability to perform planned
experiments, such data deficiencies seriously restrict the identifiability of the parameters in such
models. This problem is exacerbated by modal dominance in dynamic systems; that is, the fact
that the normal response of high order dynamic systems is governed mainly by those few
eigenvalues which define the identifiable dominant modes of the system. Such modal dominance
is not often acknowledged in environmental simulation modelling but it can be dramatic in its
effect: for example, we have recently found that a linear, constant parameter, second order
transfer function model of the kind discussed in this paper can explain over 99 per cent of the
dynamic behaviour associated with the perturbed dynamics of a very high order global carbon
balance model.
Given these difficulties, we have suggested that the dominant modal structure should be
identified directly from the data using objective methods of statistical inference. In other words,
wherever data availability allows, the model builder should progress from these measured data to
an identifiable, efficiently parameterized, model in a manner which is as objective as possible.
This data-to-model or top-down approach contrasts with the alternative reductionist or bottom-up
methodology and, in situations where experimental data are available, helps to avoid the
possibility that prior scientific prejudice and an over-dependence on the current scientific
paradigm will lead to an over-parameterized, statistically unvalidated model. A similar
philosophy is inherent in the work of Beck and Jakeman, both of whom have papers in this
special issue (see References 8 and 9 and the references therein). The former, in particular, stresses
the importance of model structure identification in ensuring identifiability and proposes
analytical methods that can be compared with those espoused in the present paper.
But if such a data-based approach to modelling is to be of maximum potential utility, it has to
be carried out carefully. The simple input-output transfer function model, for instance, can
provide a reasonable basis for time series forecasting and control but it lacks the kind of clear,
mechanistic interpretation which is essential if the model is to be fully credible as a scientific
theory of behaviour. Data-based mechanistic (DBM) modelling (see e.g. References 4-7) is an
approach to environmetric (or other systems) analysis which attempts to extend data-based time
series methodology in a manner which enhances the model builder's ability to interpret the data-
based model in physical, biological, ecological or chemical terms.
In the DBM approach, the model structure is first obtained by a process of objective statistical
inference applied to the time series data and based on a given general class of linear transfer
function (TF) model whose parameters are allowed to vary over time. The presence of such time
variable parameters, which are estimated by the application of a sophisticated form of recursive
fixed interval smoothing parameter estimation, allows for the identification of any time or state
dependent variations in the model parameters which reflect non-stationary and non-linear
aspects of the observed system behaviour. However, the model equations so obtained are then
only accepted as a theory of behaviour if, in addition to explaining the data well, they also
provide a description which has direct relevance to the physical reality of the system under study.
Finally, if non-linear phenomena have been detected and identified, the identified non-linear
model parameters, which will normally be time invariant, are finally estimated by some
appropriate form of numerical non-linear optimization based on the identified model structure
and applied directly to the data.
Of course, while this novel approach should normally ensure that the model equations have an
acceptable physical interpretation, it does not guarantee that this interpretation will necessarily
conform to the current scientific paradigm. Indeed, one of the most exciting, albeit controversial,
aspects of DBM models is that they can tend to question such paradigms. For example, DBM
methods have been applied very successfully to the characterization of imperfect mixing in fluid
flow processes and, in the case of pollutant transport in rivers, have led to the development of the
aggregated dead zone (ADZ) model (References 10 and 11). Despite its initially unusual physical interpretation, the
practical success of this ADZ model and its formulation in terms of physically meaningful
parameters seriously questions the credibility of the dispersion assumptions of the ubiquitous
advection dispersion model (ADE) which preceded it as the most credible theory of pollutant
transport in stream channels.
In this paper, we outline the DBM approach and apply it to the problem of modelling the
DATA-BASED MECHANISTIC MODELLING 337
non-linear relationship between rainfall and flow data. Two sets of data are analysed: the first set,
shown in Figure 1, consists of 400 hourly measurements of rainfall and flow for a typical
catchment in the UK (References 4, 12 and 13); the second set is composed of 500 daily measurements of rainfall, flow
and temperature from an eastern United States catchment, as shown in Figure 2. Visual
inspection of both data sets quickly reveals that the physical process involved is non-linear,
since ‘antecedent’ rainfall conditions clearly affect the subsequent flow behaviour. In particular, if
the prior rainfall has been sufficient to wet thoroughly the soil in the catchment, then river flow
will be significantly higher than if the soil had dried out through lack of rainfall. In addition to
these ‘soil moisture’ induced characteristics, a longer term, seasonal effect is introduced by evapo-
transpiration: the annual climatic cycle will clearly affect the water balance and introduce
seasonal variations in the dynamic relationship between rainfall and flow. Although such an
evapo-transpirative effect is not obvious to the eye in either the hourly or the daily data, it is
clearly present in the daily data of Figure 2, as we shall see when the data are modelled in Section
3 of the paper.
The ‘soil moisture’ non-linearity is well known in hydrology and various models have been
proposed, from the simple ‘antecedent precipitation index’ (API) approach (see e.g. Reference
14) to the construction of large deterministic catchment simulation models such as the ‘Stanford
watershed model’ (see e.g. Reference 15, p. 30). In the API approach, the ‘effective rainfall’ or
‘rainfall excess’ (i.e. that part of the rainfall which is effective in causing flow variations in the
river) is obtained by multiplying the measured rainfall by an exponentially weighted index of past
rainfall inputs which, it is argued, reflects the antecedent rainfall behaviour and the associated
soil moisture conditions. A similar approach based on a TF model for soil moisture was first
suggested by Young and Whitehead et al. In its most recent form, this model can be
338 P. C. YOUNG A N D K . J. BEVEN
Figure 2. Daily flow, rainfall and temperature data used in second example (flow measured in equivalent rainfall units based on catchment area). Panels: daily flow, daily rainfall and temperature, each plotted against time (days).
formulated as follows:
where $k$ denotes the value of the associated variable at the $k$th sampling instant; $y(k)$ is the
measured flow (or 'discharge'); $u_g(k)$ is the gauged rainfall; $u_e(k)$ is the effective rainfall; $\delta$ is a
time delay; and $s(k)$ is a soil moisture index, similar to the API, which defines the proportion of
the gauged rainfall that is effective in this sense and is defined as
$$s(k) = c\,r(k) + \left(1 - \frac{1}{\tau_s(t)}\right) s(k-1). \qquad (2)$$
It is easy to see that, as in the API, $s(k)$ is an exponentially weighted past (EWP) measure of
antecedent rainfall. In (2), $c$ is simply a scaling parameter which constrains rainfall excess and
streamflow to be equivalent in volume over the observation period and the index $s(k)$ to lie
between 0 and 1.0; and $\tau_s(t)$ is a time constant which is assumed to vary as a function of
temperature $t(k)$ (or sometimes pan evaporation measurements if these are available), according
to the relationship
where the temperature modulation factor $f$ causes the loss rate to increase with temperature. This
temperature dependent term is introduced to account for the seasonal evapo-transpiration effects
which can reasonably be expected to be functions of the seasonal temperature variations. Note
that $\tau_s(t)$ is defined to be equal to a constant $\tau_s$ at 20°C and will tend to remain constant if the
rainfall-flow observations are made over a short interval of time, as in the later hourly data
example, where the temperature changes are not sufficient to induce any obvious evapo-
transpiration effects.
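As a minimal sketch of how the soil moisture index in (2) behaves, the following Python fragment iterates the recursion with a constant time constant (the short-record case described above, where temperature effects are negligible). The function name `effective_rainfall`, the numerical values of `c` and `tau_s`, and the product form $u_e(k) = s(k)\,r(k)$ for the effective rainfall are assumptions made for illustration only; the product form follows the verbal description of $s(k)$ as the proportion of gauged rainfall that is effective.

```python
import numpy as np

def effective_rainfall(rain, c, tau_s, s0=0.0):
    """Iterate s(k) = c*r(k) + (1 - 1/tau_s)*s(k-1) with constant tau_s.

    Returns the index s(k) and an assumed effective rainfall s(k)*r(k).
    """
    rain = np.asarray(rain, dtype=float)
    s = np.empty_like(rain)
    prev = s0
    for k, r in enumerate(rain):
        prev = c * r + (1.0 - 1.0 / tau_s) * prev
        s[k] = prev
    return s, s * rain

# Two identical storms: the second falls on a catchment that is still wet.
rain = np.array([0.0, 10.0, 0.0, 0.0, 10.0])
s, u_e = effective_rainfall(rain, c=0.05, tau_s=2.0)
```

In this illustrative run the second storm produces more effective rainfall than the first, even though the gauged rainfall is the same, which is the antecedent wetness effect the index is meant to capture.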
Finally, we see from equation (1) that a linear TF relationship is assumed to apply between the
effective rainfall $u_e(k)$ and flow $y(k)$, where $\delta$ is any pure time delay affecting the rainfall-flow
dynamics; and $A(z^{-1})$, $B(z^{-1})$ are polynomials in the backward shift operator (i.e.
$z^{-i} y(k) = y(k-i)$) which characterize this TF,
where $\mathcal{F}\{\cdot\}$ is a reasonably behaved, non-linear function of the variables in an extended or
non-minimum state space (NMSS) (Reference 20) defined by the following NMSS state vector:
$$\chi(k) = [\,y(k-1) \;\ldots\; y(k-n) \;\; \mathbf{u}(k)^T \;\ldots\; \mathbf{u}(k-m)^T \;\; \mathbf{U}(k)^T \;\ldots\; \mathbf{U}(k-q)^T\,]^T.$$
We see that $\chi(k)$ is composed of the past values of $y(k)$, as well as present and past values of a
deterministic input (or exogenous) variable vector $\mathbf{u}(k)$ with elements $u_i(k)$, $i = 1, 2, \ldots, r$; and
the present and past values of a vector $\mathbf{U}(k)$ of other exogenous variables $U_j(k)$, $j = 1, 2, \ldots, s$.
Finally, $p(k)$ is an unobserved, zero mean, stochastic process with fairly general properties, which
is the source of all stochasticity in the system and is assumed to be independent of the input
variables $u_i(k)$ and $U_j(k)$.
This model assumes that $y(k)$ is causally related to the primary input variables $u_i(k)$, while the
vector $\mathbf{U}(k)$ represents any other associated variables which may affect the system non-linearly
but whose relevance in this regard may not be clear prior to time series analysis. For example,
in the rainfall-flow case, $y(k)$ is the flow measurement; $u_1(k) = u(k)$ is the rainfall; and
$U_j(k)$, $j = 1, 2$ could, for example, represent the measurements of two other variables that are
expected to affect the system, such as temperature and pan evaporation.
In the present paper, we consider the case of a single, primary, input variable, so that $r = 1$.
Following arguments similar to those presented in Reference 4, the non-linear model (5) can then
be approximated, via a process of statistical linearization, by the following general TF model
with time or state dependent parameters:
$$y(k) = \frac{B(k, z^{-1})}{A(k, z^{-1})}\,u(k) + \xi(k), \qquad \frac{B(k, z^{-1})}{A(k, z^{-1})} = \frac{b_0(k) + b_1(k) z^{-1} + \ldots + b_m(k) z^{-m}}{1 + a_1(k) z^{-1} + \ldots + a_n(k) z^{-n}} \qquad (6a)$$
or
$$y(k) = x(k) + \xi(k), \qquad (6b)$$
where $x(k)$, which can be considered as the noise free output of the model, is defined as
and $\xi(k)$ is a stochastic noise term arising from the stochastic disturbance $p(k)$ in (5) which, like
$p(k)$, is assumed to be statistically independent of the input variables $u(k)$ and any $U_j(k)$. Finally,
any pure time delay $\delta$ affecting the relationship between $u(k)$ and $y(k)$ can be accommodated by
setting the $\delta$ leading coefficients of the $B(k, z^{-1})$ polynomial equal to zero.
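The role of the leading zero coefficients can be illustrated with a short simulation of the noise free TF output. This is only a sketch under simplifying assumptions: the parameters are held constant (the model (6) allows them to vary with $k$), and the helper name `simulate_tf` and the numerical values are hypothetical.

```python
import numpy as np

def simulate_tf(b, a, u):
    """Noise free output x(k) of x(k) = [B(z^-1)/A(z^-1)] u(k).

    b = [b0, b1, ..., bm]; a = [a1, ..., an] (the leading 1 of A is implied).
    Samples before k = 0 are taken to be zero.
    """
    u = np.asarray(u, dtype=float)
    x = np.zeros_like(u)
    for k in range(len(u)):
        acc = sum(bi * u[k - i] for i, bi in enumerate(b) if k - i >= 0)
        acc -= sum(ai * x[k - i] for i, ai in enumerate(a, start=1) if k - i >= 0)
        x[k] = acc
    return x

# A pure delay of two samples: the two leading B coefficients are zero.
impulse = np.zeros(6)
impulse[0] = 1.0
x = simulate_tf(b=[0.0, 0.0, 0.5], a=[-0.5], u=impulse)
```

The impulse response is zero for the first two samples and then decays geometrically, so no separate delay parameter is needed in the model structure.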
The model (6) is a stochastic version of the TF model in (1), where the polynomial parameters
are assumed to be possible functions of the time index $k$. In the present context, these time
variable parameters will reflect the nature of any non-linear or non-stationary aspects of the
system behaviour; and their statistical estimates, based on the data $y(k)$, $u(k)$ and any $U_j(k)$,
should provide a potential source of information on the nature of the non-linearity and/or
non-stationarity in the system.
variations can be related to any of the variables in the NMSS vector $\chi(k)$; in other words, the
analysis is intended to identify any state dependency in the parameter variations which is
associated with non-linear behaviour in the system. At this stage, it should be possible to develop
some general non-linear representation of the estimated parameter variations $\hat{\mathbf{a}}(k/N)$ in terms of
the other measured variables in $\chi(k)$. Such a systematic approach is currently being investigated
and will be the subject of subsequent publications. Here, however, we will assume that the
number of other measured variables is such that all physically meaningful non-linear relation-
ships can be explored on an individual basis in a reasonably straightforward fashion. As we shall
see, this proves to be possible in the rainfall-flow examples.
The linearization approach used to derive the TVP model (6) suggests that any temporal
variations in the linearized model parameters will be functions of the state $\chi(k)$. Some 25 years
ago the first author suggested and utilized one of the simplest general assumptions which
acknowledges such state dependency, namely that $\mathbf{a}(k)$ is linearly related to functions of $\chi(k)$,
i.e.
$$\mathbf{a}(k) = \mathbf{M}\{\chi(k)\}\,\boldsymbol{\alpha}(k), \qquad (8)$$
where, in this paper, we restrict $\mathbf{M}(k) = \mathbf{M}\{\chi(k)\}$ to be a diagonal transformation matrix
functionally dependent upon $\chi(k)$; and $\boldsymbol{\alpha}(k)$ is a transformed parameter vector which should,
ideally, have time invariant elements if our introduction of the state dependency in this manner
proves successful.
If $\boldsymbol{\alpha}(k)$ is time variant, then we may have to allow for some statistical degree of freedom in the
model (8). One of the simplest assumptions in this regard (see e.g. References 4, 23, 24) is to allow
$\boldsymbol{\alpha}(k)$ to vary as a vector random walk (RW), i.e.
$$\boldsymbol{\alpha}(k) = \boldsymbol{\alpha}(k-1) + \mathbf{v}(k), \qquad (9)$$
in which $\mathbf{v}(k)$ is a zero mean, white noise vector with covariance matrix $\mathbf{Q}$ which is assumed
statistically independent of the residual noise term in (7). Other models, such as the integrated
(IRW) and smoothed (SRW) random walks, provide alternative representations of parametric change
(see e.g. Reference 17) but the simpler RW model proves sufficient in the present context.
Substituting from (9) into (8), we see that the temporal evolution of the model parameter
vector $\mathbf{a}(k)$ is described by a state dependent model (SDM: referred to as SDM1 in Reference 4) in
the form of the following Gauss-Markov process:
$$\mathbf{a}(k) = \mathbf{F}(k, k-1)\,\mathbf{a}(k-1) + \mathbf{G}(k)\,\mathbf{v}(k), \qquad (10)$$
where
$$\mathbf{F}(k, k-1) = \mathbf{M}(k)\,\mathbf{M}(k-1)^{-1}; \qquad \mathbf{G}(k) = \mathbf{M}(k).$$
Since $\mathbf{M}(k)$ is diagonal, $\mathbf{F}(k, k-1)$ is also diagonal with elements $f_{ii}(k, k-1) =
m_{ii}(k)/m_{ii}(k-1)$; in other words, the parameter variation is primarily dependent on the ratio
of the state dependent functions at the $k$th and $(k-1)$th sampling instants. In this manner, a
large increase (decrease) in $m_{ii}(k)$, in relation to its prior value at the previous sampling instant
$m_{ii}(k-1)$, will lead to a similar proportionate increase (decrease) in the inter-sample change in
value of the parameter. Of course, if the $i$th element $v_i(k)$ of the white noise vector $\mathbf{v}(k)$ is zero,
then this will represent the actual variation in the parameter between samples; otherwise, the
parameter will also be subject to an additional stochastic change due to $v_i(k)$.
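For completeness, the substitution that leads to (10) can be written out in one line. The only assumption added here is that the diagonal elements of $\mathbf{M}(k-1)$ are non-zero, so that its inverse exists and $\boldsymbol{\alpha}(k-1) = \mathbf{M}(k-1)^{-1}\mathbf{a}(k-1)$ follows from (8):

```latex
\begin{align*}
\mathbf{a}(k) &= \mathbf{M}(k)\,\boldsymbol{\alpha}(k)
               = \mathbf{M}(k)\,[\,\boldsymbol{\alpha}(k-1) + \mathbf{v}(k)\,] \\
              &= \mathbf{M}(k)\,\mathbf{M}(k-1)^{-1}\,\mathbf{a}(k-1) + \mathbf{M}(k)\,\mathbf{v}(k).
\end{align*}
```

Comparing terms with (10) gives $\mathbf{F}(k, k-1) = \mathbf{M}(k)\mathbf{M}(k-1)^{-1}$ and $\mathbf{G}(k) = \mathbf{M}(k)$, as stated.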
estimate $\hat{\mathbf{a}}(k/N)$ of the TVP vector, together with its associated covariance matrix $\mathbf{P}^*(k/N)$, to
identify the nature of the potentially non-linear diagonal elements in the transformation matrix
$\mathbf{M}(k)$ and then estimate the parameters which characterize these non-linear elements. There are
two approaches to this problem: where $\mathbf{v}(k)$ is zero so that $\boldsymbol{\alpha}(k)$ in (8) is time invariant; and where
$\boldsymbol{\alpha}(k)$ is assumed to evolve as the RW process (9).
First, when $\boldsymbol{\alpha}(k)$ is time invariant, we can obtain an estimate of $\mathbf{M}(k)$ by assuming that (cf.
equation (8))
$$\hat{\mathbf{a}}(k/N) = \mathbf{M}(k)\,\boldsymbol{\alpha} + \boldsymbol{\varepsilon}(k), \qquad (11)$$
where $\boldsymbol{\varepsilon}(k)$ is a zero mean, white noise vector with covariance matrix $\mathbf{P}^*(k/N)$ which is introduced
to allow for the uncertainty in the estimate $\hat{\mathbf{a}}(k/N)$. This suggests that a weighted least squares
(WLS) estimate of $\boldsymbol{\alpha}$ can be obtained by minimizing the cost function
$$J = \sum_{k=1}^{N} [\hat{\mathbf{a}}(k/N) - \mathbf{M}(k)\boldsymbol{\alpha}]^T\,\mathbf{W}(k)\,[\hat{\mathbf{a}}(k/N) - \mathbf{M}(k)\boldsymbol{\alpha}],$$
where the weighting matrix $\mathbf{W}(k)$ is defined as the inverse of the estimated covariance matrix
$\mathbf{P}^*(k/N)$, i.e.
$$\mathbf{W}(k) = \mathbf{P}^*(k/N)^{-1}.$$
(In the Gaussian case, where $\boldsymbol{\varepsilon}(k)$ is assumed to have a zero mean, normal distribution, this cost
function procedure is equivalent to maximum likelihood estimation.) The exact nature of the
$\mathbf{M}(k)$ diagonal elements is specified in relation to the NMSS state variables by reference to the
FIS estimates $\{\hat{\mathbf{a}}(k/N), \mathbf{P}^*(k/N)\}$ and the physical nature of the system under consideration. This
emphasis is essential to the DBM approach: while the choice of non-linear transformation should
not be dictated by the prior perception of the modeller about the nature of the non-linear system,
it should be capable of a rational explanation in relation to the physical nature of the system.
Examples of this approach in the case of rainfall-flow modelling are discussed in subsequent
sections of the paper.
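Because the cost function above is quadratic in the transformed parameter vector, its minimizer has the usual closed form: the sum over $k$ of $\mathbf{M}(k)^T\mathbf{W}(k)\mathbf{M}(k)$, inverted and applied to the sum of $\mathbf{M}(k)^T\mathbf{W}(k)\hat{\mathbf{a}}(k/N)$. The following sketch evaluates that expression with NumPy. The function name `wls_alpha`, the array layout and the synthetic square-root state dependency are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def wls_alpha(a_hat, M, W):
    """Weighted least squares estimate of a constant alpha.

    a_hat: (N, p) smoothed TVP estimates; M: (N, p, p) transformation
    matrices; W: (N, p, p) weighting matrices (inverse covariances).
    """
    lhs = np.einsum('kji,kjl,klm->im', M, W, M)   # sum_k M(k)^T W(k) M(k)
    rhs = np.einsum('kji,kjl,kl->i', M, W, a_hat)  # sum_k M(k)^T W(k) a_hat(k)
    return np.linalg.solve(lhs, rhs)

# Synthetic check: second parameter scales with the square root of a state y(k).
y = np.array([1.0, 4.0, 9.0, 16.0])
M = np.zeros((4, 2, 2))
M[:, 0, 0] = 1.0
M[:, 1, 1] = np.sqrt(y)
alpha_true = np.array([-0.8, 2.0])
a_hat = np.einsum('kij,j->ki', M, alpha_true)      # noise free "estimates"
W = np.array([w * np.eye(2) for w in (1.0, 0.5, 2.0, 0.1)])
alpha_hat = wls_alpha(a_hat, M, W)
```

With noise free inputs the estimate reproduces `alpha_true`; with real FIS output the weights simply discount samples where the smoothed estimate is poorly determined.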
Second, the estimation problem is more difficult in examples where it proves necessary to
assume that $\boldsymbol{\alpha}(k)$ evolves as a RW process (i.e. when equation (11) is not sufficient to explain all
of the estimated variation in the parameters): in such cases, it is once again necessary to exploit
FIS estimation, but this time in this general non-linear estimation context. While this is possible,
we feel that the first approach described above will often prove adequate in practice and so only
this simpler approach will be considered further in the present paper. However, such a TVP
non-linear model may well be important in any on-line practical application of DBM models, since it
allows for on-line adaptation of the model parameters (see Section 3.1).
Once we have identified a plausible structural form for the non-linear model, however, it is then
possible to re-estimate the model against the time series data by some form of numerical
optimization (e.g. deterministic minimization of the model residual variance; maximum
likelihood estimation; prediction error minimization; etc.), the exact nature of which will tend to
depend on the identified form of the model. For example, in the case of the TVP model (6), the
identification will, in general, suggest optimization of the following TF model:
(13b)
where $\eta(k) = A(\chi(k), z^{-1})\,\xi(k)$ and, nominally, all of the parameters can be state dependent. In
practice, however, we would expect only a subset of the parameters to be so dependent, as in the
practical examples discussed in Section 3, with the others remaining constant, as in the standard
linear TF model.
6. Use the estimates of the parameters in step 5 as starting values in a final model estimation
stage where the (hopefully constant) parameters in the identified non-linear model are
estimated by some form of numerical optimization based on the identified non-linear model
form and all the relevant data in the NMSS vector. Again, at this stage, IV estimation is
recommended as a general approach to the problem; see Section 2.6.
7. Analyse the residuals from the non-linear estimation to ensure that there is no evidence of
any non-linearity not identified in steps 1-6. This should include non-linearity tests on the
residuals.
2.6. The systems approach to modelling input-output behaviour
Note that in steps 1 and 6, IV identification and estimation is recommended (see e.g. Reference
17) as a general method of modelling the causal relationships between the input and output
variables. This preference is linked strongly with the ‘systems approach’ to modelling the input-
output behaviour of stochastic, dynamic systems and contrasts somewhat with the more
traditional methods advocated in the statistical literature, where the emphasis tends to be
placed more on the modelling of the stochastic effects. In systems modelling, it is the presumed
causal mechanism between input and output which dominates attention because it is here that the
description of the physical aspects of the system behaviour will normally reside and where the
input-output mapping, which is so important to control systems design, is contained. Within this
systems context, the stochastic effects are often considered as a nuisance whose effect has to be
avoided or negated, rather than something of primary importance.
This is not to say, of course, that the systems analyst is not cognizant of the significance of
stochastic effects and the dangers of ignoring their presence in time series analysis. Indeed, one of
the major advances in time series analysis in the twentieth century, the optimal state estimation or
Kalman filter which we exploit in this paper for TVP estimation, is inherently stochastic and was
developed by the systems theorist Rudolf Kalman. But, to the systems analyst, stochastic inputs
arise for many and diverse reasons, such as: other inputs which affect the output but are not
measurable; the result of measurement noise which cannot be considered as having rational
spectral density and so is difficult to model explicitly; from the presence of higher frequency but
low amplitude modes of dynamic behaviour that are not sufficiently excited by the input signal to
be identifiable from the input-output data; and because of low amplitude, additive, non-linear
effects which yield characteristics which are, once again, not easily characterized by the more
common stochastic models.
The IV method is very appealing in circumstances such as these because it does not require any
strong assumptions about the stochastic nature of any additive noise processes (which are simply
required to be statistically independent of the chosen instrumental variables), or concurrent
estimation of a model for such noise, as in most other procedures, such as maximum likelihood or
prediction error minimization (PEM) (Reference 28). On the other hand, not only does the IV method have
respectable statistical credentials (e.g. Reference 29) but its statistical efficiency can be enhanced
quite simply if the noise process can be assumed to have rational spectral density. For example,
the IV methodology proposed by Young, where the instrumental variables are generated by
an iteratively updated ‘auxiliary model’ of the system, can be extended to an optimal refined IV
(RIV) form (see e.g. Reference 17) which is asymptotically equivalent to maximum likelihood estimation and
yields asymptotically efficient estimates of the TF model parameters if the noise can be assumed
to follow an ARMA process. Moreover, the related simplified refined instrumental variable
(SRIV) algorithm (Reference 37), which is utilized in the examples discussed in the next section, provides a
good compromise between the basic IV and full RIV algorithms which has proven very successful
in practical environmental applications over the past 15 years.
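The auxiliary model idea can be illustrated with a deliberately simple sketch: a first order TF, white measurement noise on the output, and a basic iterative IV estimator without any of the prefiltering used in the RIV and SRIV algorithms. Everything in the fragment (function names, model order, parameter values, number of iterations) is an assumption chosen for illustration and is not the algorithm used in the examples of this paper.

```python
import numpy as np

def simulate_first_order(a1, b0, u):
    """Noise free output of x(k) = -a1*x(k-1) + b0*u(k), with x(0) = 0."""
    x = np.zeros(len(u))
    for k in range(1, len(u)):
        x[k] = -a1 * x[k - 1] + b0 * u[k]
    return x

def ls_and_iv(u, y, n_iter=5):
    """Return (least squares, basic IV) estimates of [a1, b0]."""
    phi = np.column_stack([-y[:-1], u[1:]])        # regressors from noisy data
    target = y[1:]
    theta_ls = np.linalg.lstsq(phi, target, rcond=None)[0]
    theta = theta_ls.copy()
    for _ in range(n_iter):
        # Auxiliary model: noise free output generated from current estimates.
        x_aux = simulate_first_order(theta[0], theta[1], u)
        z = np.column_stack([-x_aux[:-1], u[1:]])  # instrumental variables
        theta = np.linalg.solve(z.T @ phi, z.T @ target)
    return theta_ls, theta

rng = np.random.default_rng(0)
a1_true, b0_true = -0.9, 1.0
u = rng.standard_normal(2000)
y = simulate_first_order(a1_true, b0_true, u) + rng.standard_normal(2000)
theta_ls, theta_iv = ls_and_iv(u, y)
```

In this synthetic setting the least squares estimate of the lag parameter is visibly biased by the output noise, whereas the IV estimate, whose instruments depend only on the input, lies much closer to the true value without any explicit noise model.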
where $u(k) = u_g(k)$ in this example, the recursive FIS estimated parameters $\hat{a}_1(k/N)$ and $\hat{b}_0(k/N)$
indicate little significant variation in the lag parameter $a_1(k)$ but considerable changes in the gain
parameter $b_0(k)$.
Bearing the initial TVP estimation results in mind, as well as the nature of the response and the
likely effect of the soil moisture non-linearity, it makes sense to constrain the $\hat{a}_1(k/N)$ lag
parameter estimate to be constant and allow the $\hat{b}_0(k/N)$ estimate to vary over the observation
interval. In this manner, the relatively invariant shape of the initial (or ‘quick flow’) response to a
rainfall event is preserved, since the time constant of the recession curve (see Figure 1, flow data)
is determined entirely by the lag parameter. However, the estimated $\hat{b}_0(k/N)$ variations allow for
those changes in the gain of the system, and the associated amplitude of the flow response, which
are likely to be caused in large part by the soil moisture non-linearity.
Figure 3 shows the FIS estimate $\hat{b}_0(k/N)$ together with its standard error bound obtained from
the first diagonal element of the covariance matrix $\mathbf{P}^*(k/N)$, as described in the previous section.
Note that these results are obtained from the analysis of only the first two-thirds of the data, so
that we are able later to evaluate the efficacy of the final estimated model over the latter portion of
the data. As explained in Reference 4, the performance of the FIS estimation algorithm is
controlled mainly by the noise variance ratio (NVR) parameter, which effectively defines the TVP
tracking capability of the estimation. The results in Figure 3 were obtained by optimizing the
NVR parameter in the FIS algorithm, as outlined in Appendix 1: this yielded an optimum value
of NVR = 0.0865 with an associated $R_T^2$ of 0.9852 and a residual variance of 14.33. We see that,
not surprisingly, the maximum accuracy of the FIS estimate is obtained during and just after
rainfall has occurred, where the amount of information on the gain parameter is at its highest
because of the natural excitation of the system by the rainfall event.
A scatter plot of $\hat{b}_0(k/N)$ against the flow $y(k)$ suggests a non-linear relationship between the
two variables, as indicated by the previous research on the bilinear model. The nature of this
relationship is clarified and strengthened, however, if we note the nature of the associated second
diagonal element of the weighting matrix $\mathbf{W}(k)$, as shown in Figure 4 normalized to a maximum
Figure 3. FIS estimate $\hat{b}_0(k/N)$ for the first 256 samples of the hourly rainfall-flow data with the standard error band shown dashed (horizontal axis: time (hours))
value of unity. This illustrates how the weighting is very high during rainfall events and quite
small at all other times. This reflects the fact that, in this simple example, the weighting is defined
by the inverse of the second diagonal element of $\mathbf{P}^*(k/N)$ which, not surprisingly, is at its smallest
when the flow response is being excited by the rainfall activity and the information in the data on
the parameter $b_0(k)$ is at its highest level. Figure 4 also shows the modified weighting effect
obtained by setting all weighting below a threshold of 0.2 to zero. Figure 5 is a scatter plot of the
FIS estimates $\hat{b}_0(k/N)$ versus flow $y(k)$ but with the estimates plotted only at those times when
the weighting is above this threshold. Here, the timing of the maximum rainfall events (all those
greater than 0.5 of the maximum rainfall over the period) is indicated by crosses on the
horizontal axis: the close link between the timing of these events and the most significant
parameter estimates is obvious.
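The normalization and thresholding of the weights described above amounts to a few array operations. The sketch below assumes the weights are the reciprocals of a variance series (the relevant diagonal element of the covariance matrix); the function name `thresholded_weights` and the sample variances are hypothetical.

```python
import numpy as np

def thresholded_weights(variance, threshold=0.2):
    """Inverse variance weights, scaled to a maximum of one, with all
    values below the threshold set to zero."""
    w = 1.0 / np.asarray(variance, dtype=float)
    w = w / w.max()
    return np.where(w >= threshold, w, 0.0)

# Small variances (well excited periods) receive the largest weights.
var_b0 = np.array([4.0, 1.0, 0.5, 10.0, 2.0])
w = thresholded_weights(var_b0)
```

Only samples whose normalized weight reaches the threshold then contribute to the scatter plot and to the subsequent weighted least squares fit.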
The plot in Figure 5 suggests a possible power law relationship between $\hat{b}_0(k/N)$ and $y(k)$, so
Figure 4. Temporal variation of weighting matrix element for weighted least squares estimation of a power law relationship between the $\hat{b}_0(k/N)$ estimate and flow: threshold (0.2 of maximum weighting) is shown as dashed line, with associated weighting shown as infill (horizontal axis: time (hours))
Figure 5. Weighted least squares estimate of power law curve compared with the most significant FIS $\hat{b}_0(k/N)$ estimates: rainfall greater than 50 per cent of maximum marked x on the flow axis (horizontal axis: flow (litres/sec))
that equation (11) takes the simple scalar form (since the lag parameter $\hat{a}_1(k/N)$ is constant in
this case and so can be omitted)
Figure 6. Final optimization of the power law exponent: residual variance of SRIV estimated model plotted against power law exponent (plot annotations: optimum exponent = 0.628; $\sigma^2 = 17.23$; $R_T^2 = 0.98$)
with the variance of the model residuals $\sigma^2 = 14.37$, compared with the output flow variance of
$\sigma_y^2 = 984.59$. (The standard errors noted in parentheses are based on the approximate covariance
matrix obtained from SRIV estimation and should be treated with due caution: see comments in
Section 2.) This yields a coefficient of determination (Appendix I), defined in terms of the model
errors (residuals) $\hat{\xi}(k)$, of $R_T^2 = 0.985$. The predictive properties of the model are also very good:
when it is applied to all of the data, including the one-third omitted for the identification and
estimation analysis, $R_T^2$ is only reduced to 0.978. Note that these high $R_T^2$ values indicate a very
good explanation of the data since they are based on the residual modelling errors $\hat{\xi}(k)$ and not
the one step ahead forecasts (see Appendix I).
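The quoted coefficient of determination can be checked from the two variances given above, assuming the usual definition of one minus the ratio of residual variance to output variance (the formal definition is in Appendix I, which is not reproduced here, so this form is an assumption). The function name is hypothetical.

```python
def coefficient_of_determination(residual_variance, output_variance):
    """One minus the ratio of model residual variance to output variance."""
    return 1.0 - residual_variance / output_variance

# Variances quoted in the text for the identification data.
r2 = coefficient_of_determination(14.37, 984.59)
```

To three decimal places this reproduces the value of 0.985 reported in the text.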
Having identified a suitable non-linear model form using the SDM analysis, it now remains to
complete the final model estimation stage, where the parameters in the identified non-linear
model are estimated by numerical optimization based on the whole data set of 400 samples. Here
we again utilize SRIV estimation, but this time within an iterative procedure in which the power
law exponent is optimized to minimize the SRIV model residual variance, as discussed briefly in
Appendix I. The results of this optimization are shown in Figure 6, where we see that the
optimum exponent is quite clearly defined at 0.628 with a residual variance of 17.23 and
associated $R_T^2$ = 0.980. (Note that these results cannot be compared directly with those obtained
in the SDM identification stage since they are based on the analysis of the whole data set.) SRIV
estimation of the TF model then yields the following parameter estimates:
The good quality of the model is illustrated in Figure 7, which compares the model output
$$\hat{x}(k) = \frac{\hat{B}(z^{-1})}{\hat{A}(z^{-1})}\, u_e(k)$$
with the flow data. Note how successfully the second order model dynamics have been able to
account for the base flow effects.
[Figure 7 axis: time (hours)]
Figure 7. Final estimated model output compared with the flow data: model error (+270) shown above
The model residuals $\hat{\xi}(k) = y(k) - \hat{x}(k)$ in Figure 7 show some evidence of autocorrelation but
no significant correlation with the input effective rainfall variable, or any indication of residual
non-linearity other than the obvious heteroscedasticity. This is linked with the amplitude of the
flow $y(k)$ and so is removed if the residuals are scaled by the flow, i.e. $\hat{\xi}(k)/y(k)$. These
observations suggest that the model might be improved still further if the residuals in (15) are
modelled explicitly. The Akaike information criterion (AIC) identifies a third order autoregressive
(AR(3)) model for $\hat{\xi}(k)$ with the following parameter estimates:
$\hat{c}_1 = -0.697\ (0.05)$; $\hat{c}_2 = -0.229\ (0.06)$; $\hat{c}_3 = -0.304\ (0.05)$.
The ACF of the AR(3) model residuals indicates that they are white with a residual variance of 8.533, which
can be compared with the variance of 17.23 for $\hat{\xi}(k)$.
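The order selection step can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes a least squares AR fit and the common form AIC = N ln(variance) + 2p, and the function names and synthetic series are hypothetical.

```python
import numpy as np

def fit_ar(x, p, start):
    """Least squares fit of x(k) + c1 x(k-1) + ... + cp x(k-p) = e(k), k >= start."""
    target = x[start:]
    if p == 0:
        return np.zeros(0), np.mean(target ** 2)
    X = np.column_stack([-x[start - j:len(x) - j] for j in range(1, p + 1)])
    c = np.linalg.lstsq(X, target, rcond=None)[0]
    return c, np.mean((target - X @ c) ** 2)

def select_ar_order(x, max_order=6):
    """Return (order, coefficients, residual variance) minimizing the AIC."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n_eff = len(x) - max_order            # same samples used for every order
    best = None
    for p in range(max_order + 1):
        c, s2 = fit_ar(x, p, max_order)
        aic = n_eff * np.log(s2) + 2 * p
        if best is None or aic < best[0]:
            best = (aic, p, c, s2)
    return best[1], best[2], best[3]

# Synthetic AR(2) series: x(k) = 1.2 x(k-1) - 0.5 x(k-2) + e(k), so c1 = -1.2, c2 = 0.5
rng = np.random.default_rng(0)
e = rng.normal(size=2000)
x = np.zeros(2000)
for k in range(2, 2000):
    x[k] = 1.2 * x[k - 1] - 0.5 * x[k - 2] + e[k]
order, coeffs, noise_var = select_ar_order(x)
```

The sign convention matches the negative first coefficient quoted above; whiteness of the final residuals would still need to be checked from their ACF.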
The complete, non-linear, stochastic model for the rainfall-flow data obtained in the above
manner can be written in the form
$$y(k) = \frac{19.832 - 19.169 z^{-1}}{1 - 1.764 z^{-1} + 0.767 z^{-2}}\, u(k)\, y(k)^{0.628} + \frac{1}{1 - 0.697 z^{-1} - 0.229 z^{-2} - 0.304 z^{-3}}\, e(k); \quad e(k) = N(0, 8.53), \qquad (16)$$
where the input-output part of the model explains 98 per cent of the flow variance, while the
AR(3) noise model only accounts for 2 per cent, with the residual white noise representing half of
this variance. Given these results, it would be possible to improve the efficiency of the estimates
by estimating all the parameters in (16) simultaneously using full RIV estimation, rather than in
the two stage SRIV approach. However, the improvement is small in this case and does not affect
the nature of the model to any large degree. It is much more important in the present context, and
indeed crucial to the DBM approach, that the model should be interpreted in physically
meaningful terms. In this regard, the second order model (16) not only explains the data very
well, but is also of considerable physical interest.
[Figure 8 labels: Rainfall; 66.6%; Time constant = 80.2 hrs]
Figure 8. Final estimated model interpreted as a parallel connection of two first order processes with different time
constants and percentage flow partitioning through the parallel pathways
$$T_i \frac{dx_i(t)}{dt} = G_i u_e(t) - x_i(t); \quad i = 1, 2, \qquad (17)$$
where $T_i$ and $G_i$, $i = 1, 2$, are, respectively, the associated time constants and steady state gains;
$u_e(t) = u(t)y(t)^{0.628}$ is the effective rainfall input entering via the partitioned pathways; and
$x_i(t)$, $i = 1, 2$, are the continuous time equivalents of the outflows in Figure 8. It is straightforward
to see how such an equation can be interpreted as a dynamic mass balance or storage
equation by defining the rate of change of water storage $S_i(t)$ in the ith zone as the volume of
water entering the zone in unit time minus the volume leaving in the same time, i.e.
where $G_i u_e(t)$ represents the volume equivalent of the rainfall input entering the ith zone. (See
Young and Lees (Reference 7) for a more general environmental interpretation as an active mixing volume
(AMV) model.) Now, by making the physically reasonable assumption that the outflow $x_i(t)$ is
proportional to the storage $S_i(t)$ at any time, i.e.
and substituting from equation (19) into (18), we obtain the identified equation (17) provided $T_i$
is a constant. The derivation of equation (17) in this manner is perfectly credible in physical
terms: indeed the derivation, including the assumption (19), is the basis of the physically based
'lag-and-route' models that have been used for some considerable time by hydrologists (see e.g.
Reference 38).
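The displayed forms of equations (18) and (19) are not legible in this copy. From the wording above they presumably read as in the first two lines of the following sketch, in which case the substitution leading to the first order equation in $x_i(t)$ is immediate. The labels are assumptions made for illustration.

```latex
\begin{align}
\frac{dS_i(t)}{dt} &= G_i u_e(t) - x_i(t)
  && \text{storage balance, presumably (18)} \\
x_i(t) &= \frac{S_i(t)}{T_i} \;\Longrightarrow\; S_i(t) = T_i\, x_i(t)
  && \text{outflow proportional to storage, presumably (19)} \\
T_i \frac{dx_i(t)}{dt} &= G_i u_e(t) - x_i(t)
  && \text{using } \frac{dS_i}{dt} = T_i \frac{dx_i}{dt} \text{ for constant } T_i
\end{align}
```

The final step is where the constancy of $T_i$ matters: if $T_i$ varied with time, an additional term in $dT_i/dt$ would appear.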
These physical interpretations of the model are not only in sympathy with the DBM approach
but also quite useful in practical terms, provided the effects of uncertainty on the parameter
estimates are taken into account. For example, the recognition that the system model, as defined
by the nominal (mean) value of the estimates, is unambiguously a parallel dynamic process has
the additional desirable effect of allowing the modeller to objectively estimate the 'base flow'
element of the river flow: this is simply obtained as the output $x_2(k)$ of the lower TF in Figure 8.
This can replace or supplement other methods of base flow estimation used in conventional
hydrological analysis. In addition, the storage interpretation of each parallel pathway given by
equation (18) provides a useful method of inferring the relative changes in water balance going on
in the main short and long term storage zones of the system.
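The parallel interpretation can be sketched numerically by partial fraction expansion of the second order TF. The helper below is hypothetical and rests on two assumptions not stated in this section: that the flow partition is taken from the steady state gains of the two first order paths, and that each time constant follows from its discrete-time pole $p_i$ as $T_i = -\Delta t/\ln p_i$. Since the printed coefficients are rounded and the slow pole lies close to the unit circle, the values it returns differ somewhat from those quoted in Figure 8.

```python
import numpy as np

def parallel_decomposition(b, a, dt=1.0):
    """Split (b0 + b1 z^-1)/(1 + a1 z^-1 + a2 z^-2) into two first order paths.

    Hypothetical helper, not from the paper. Returns time constants (units of
    dt), steady state gains and flow partition fractions, fastest path first.
    """
    b0, b1 = b
    poles = np.roots([1.0, a[1], a[2]])       # roots of z^2 + a1 z + a2
    if np.iscomplexobj(poles) and np.any(np.abs(poles.imag) > 1e-12):
        raise ValueError("complex poles: no parallel first order interpretation")
    p = np.sort(poles.real)                   # p[0] fast path, p[1] slow path
    # Partial fractions r_i / (1 - p_i z^-1), with r_i = (b0 p_i + b1) / (p_i - p_j)
    r = np.array([(b0 * p[0] + b1) / (p[0] - p[1]),
                  (b0 * p[1] + b1) / (p[1] - p[0])])
    gains = r / (1.0 - p)                     # steady state gain of each path
    time_constants = -dt / np.log(p)          # assumes 0 < p_i < 1
    partition = gains / gains.sum()
    return time_constants, gains, partition

# Rounded hourly coefficients as printed in equation (16)
T, G, frac = parallel_decomposition([19.832, -19.169], [1.0, -1.764, 0.767])
print("time constants (hours):", T)
print("flow partition:", frac)
```

The slow path output in such a decomposition is what the text identifies as the base flow component.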
One aspect of the model that is not physically reasonable is its bilinearity, which suggests a
positive feedback link between the output and input to the system; this is something which clearly
affects the stability of the model in simulation terms (although it does not affect it deleteriously in
forecasting terms since the stability of the forecast is not affected). Of course, this connection is
not really present in any physical interpretation of the process: rather, $y(k)$ is being used as a
convenient surrogate for a soil moisture effect (see Section 1). Consequently, in any simulation
application of the model, the measured $y(k)$ in the definition of effective rainfall should be
replaced by a variable $s(k)$ defined in terms of the input rainfall by the equation
$s(k) = \{\hat{B}(z^{-1})/\hat{A}(z^{-1})\}u(k)$, where $\hat{A}(z^{-1})$ and $\hat{B}(z^{-1})$ are the estimated model polynomials.
Of course, in relation to the presently estimated models, this involves an approximation, and so it
would be preferable, in such circumstances, to re-estimate the model using this modified
structure. Indeed, the model in this form can then be compared directly with the conventional
model (1)-(3), as indicated by our introduction of $s(k)$: in particular, the expression for $s(k)$
above can be compared with that in equations (1) and (2). This suggests that these more
conventionally estimated models may require the introduction of a power law in $s(k)$, which is
supported by the recent research of Jakeman et al. (Reference 9).
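A simulation version of the model along these lines might be sketched as below. The function names are hypothetical, the coefficients are the rounded hourly estimates, and, as noted above, using parameters estimated with measured flow in this modified structure is only an approximation; the clipping of $s(k)$ at zero is an added safeguard rather than part of the model.

```python
import numpy as np

def tf_filter(num, den, x):
    """Apply the discrete-time TF num(z^-1)/den(z^-1) to x, with den[0] == 1."""
    out = np.zeros(len(x))
    for k in range(len(x)):
        acc = 0.0
        for i in range(min(len(num), k + 1)):
            acc += num[i] * x[k - i]
        for j in range(1, min(len(den), k + 1)):
            acc -= den[j] * out[k - j]
        out[k] = acc
    return out

def simulate_flow(rain, b, a, gamma):
    """Simulate flow from rainfall alone (hypothetical helper, not from the paper)."""
    rain = np.asarray(rain, dtype=float)
    s = tf_filter(b, a, rain)          # s(k) = {B/A} u(k), surrogate for measured flow
    s = np.clip(s, 0.0, None)          # safeguard: the power law needs s(k) >= 0
    u_e = rain * s ** gamma            # effective rainfall without measured flow
    return tf_filter(b, a, u_e)

b_hat = [19.832, -19.169]
a_hat = [1.0, -1.764, 0.767]
rain = np.zeros(200)
rain[10] = 5.0                         # a single rainfall pulse
flow = simulate_flow(rain, b_hat, a_hat, 0.628)
```

Because no measured output enters the input, the positive feedback link discussed above is absent and the simulation inherits the stability of the linear TF.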
Of course, however reasonable the physical interpretation and the objectivity of the analysis, a
model should always be interpreted and used with caution. In the present case, for example, the
model derivation is based on only one set of time series data and further modelling, based on
additional data sets, is clearly necessary before the model can be confidently accepted as a
reasonable representation of the catchment dynamics. In addition, one special caveat is in order:
the uncertainty in the model parameter estimates leads to associated uncertainty in the derived
'physically meaningful' parameters, such as the partitioning percentages, steady state gains, time
constants and water storages. Clearly, such uncertainty needs to be carefully evaluated and taken
into account in any subsequent use of the model (e.g. Monte Carlo analysis; References 39 and 40).
Finally, it is interesting to note that on-line, adaptive versions of rainfall-flow models of the
type considered here can be useful in river management provided the above caveats are taken into
account. A practical example of this is the computer-based adaptive flood warning system for the
As in the hourly data case, this model unambiguously defines a parallel pathway model of the
form shown in Figure 8: in this case, however, the quick flow pathway has a time constant of 1.44
days and a flow partition of 33.1 per cent, while the slow flow pathway has a time constant of 24.7
days and a flow partition of 66.9 per cent.
[Figure 9 axis: Time (days)]
Figure 9. FIS estimate $\hat{b}_0(k|N)$ for the daily rainfall-flow data with the standard error band shown dashed
This model explains the data reasonably well, as we see in Figure 10, with an $R_T^2$ = 0.832 and a
residual variance of 3.54. However, the model residuals, as shown in Figure 10 where they are
enlarged for clarity, exhibit clear long term seasonality which is visually correlated negatively
with the temperature, which is plotted in scaled and inverted form (cf. Figure 2) as a dashed line.
This correlation is shown more clearly in Figure 11(a), where the residuals are plotted below the
scaled and inverted temperature plot; and it is quantified in the cross-correlation function plotted
in Figure 11(b), where we see that there is strong negative correlation at lag zero and a seasonal
pattern of variation in both variables.
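A cross-correlation function of this kind can be computed as in the following sketch. The function name, the normalization by the full record length and the synthetic seasonal series are assumptions made for illustration; the daily data themselves are not reproduced here.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Sample cross-correlation r_xy(lag) = (1/N) sum_k x(k) y(k + lag) of standardized series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            r[i] = np.sum(x[:n - lag] * y[lag:]) / n
        else:
            r[i] = np.sum(x[-lag:] * y[:n + lag]) / n
    return lags, r

# Synthetic example: a residual that varies inversely with a seasonal temperature
rng = np.random.default_rng(0)
days = np.arange(500)
temperature = np.sin(2.0 * np.pi * days / 365.0)
residual = -0.8 * temperature + 0.1 * rng.normal(size=500)
lags, r = cross_correlation(temperature, residual, 100)
r0 = r[lags == 0][0]
```

A strongly negative value at lag zero, as in this synthetic case, is the feature that motivates the temperature dependent term introduced next.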
On the basis of these results, it seems reasonable to include an additional input non-linearity in
terms of the smoothed temperature measurements (FIS estimation suggests a linear law in this
case); and non-linear optimization with associated SRIV identification and estimation yields the
following TF model:
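$$y(k) = \frac{0.3015 - 0.2788 z^{-1}}{1 - 1.5522 z^{-1} + 0.566 z^{-2}}\, u(k)\, [\,\cdots\,] + \xi(k). \qquad (20)$$
[The flow power law and linear temperature factor multiplying $u(k)$ are illegible in this copy; the surviving fragment reads: y ( k ) ' .1.95 ~ ~-{ O~O5ewup)]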
This model explains the data quite well, as shown in Figure 12, which compares the model output
$\hat{x}(k)$, defined as before, with the flow data. When compared with the results in Figure 10, there is
a clear improvement in fit, as reflected in the enhanced $R_T^2$ = 0.891 and residual variance of 2.33.
Finally, the model again suggests a parallel pathway model, this time with a quick flow pathway
time constant of 1.87 days and a flow partition of 39.1 per cent, combined with a slow flow
pathway time constant of 29.7 days and a flow partition of 60.9 per cent.
[Figure 10 axis: Time (days)]
Figure 10. Output of non-linearly optimized model with flow dependent input non-linearity: enlarged model error (x 10)
shown above (+70) and compared with the scaled, inverted and FIS smoothed temperature series (shown dashed)
[Figure 11 panels: (a) SRIV estimated model error plotted against Time (days); (b) cross correlation function between
scaled and FIS smoothed temperature and model error, plotted against lag (days)]
Figure 11. Correlation between the residual model error and scaled, inverted and FIS smoothed temperature (see text):
(a) comparative plot (b) cross-correlation function
[Figure 12 axis: Time (days)]
Figure 12. Output of non-linearly optimized model with flow and temperature dependent input non-linearities compared
with the flow data: model error (+70) shown above
It will be noted that no attempt has been made here to model the residual $\xi(k)$ as an AR
process. This is because the model represents only an initial exercise in modelling these daily data.
In particular, although the model appears reasonable at first sight, there are some indications,
both from an identification/estimation standpoint and in DBM terms, that a superior non-linear
representation is possible. First, it is clear during the non-linear optimization exercises that the
serial-type non-linearity in (17) is not necessarily well identified, with the optimum not too well
defined. Secondly, the FIS estimation tends to suggest that both of the first order model
parameters are probably time variable, with the FIS estimate $\hat{b}_0(k|N)$ correlated with flow and
$\hat{a}_1(k|N)$ inversely correlated with the smoothed temperature. Finally, from a DBM standpoint,
the serial temperature non-linearity is not particularly attractive and is somewhat difficult to
interpret in hydrological terms. On the other hand, if the $a_1(k)$ parameter is inversely
proportional to temperature, as indicated by the concurrent TVP estimation of both parameters,
then it makes some physical sense, implying that the time constant of the TF model is
changing seasonally in inverse relation to the temperature. This is physically meaningful since the
increased summer evapo-transpiration effects would be expected to reduce the time constant of
the flow recession because of the larger loss of water by evapo-transpiration.
Taken together, these observations indicate that the model (17) is not yet satisfactory and
needs refinement. This will be the subject of future research on these and other similar
hydrological data.
bidirectional flow processes (e.g. other applications would be solute transport and flow in tidal
estuaries, where the AMV-type model could replace the more conventional ADE description).
In related research, Young (References 39, 40) notes that, when applied within this generalized conservation
context, the DBM analysis nearly always yields models with real eigenvalues, so that the TF
model can be decomposed easily into serial, parallel or feedback connections of first order
processes, each of which can be considered as a first order AMV conservation equation of some
sort (as in the parallel flow interpretation of the rainfall-flow models in the present paper). The
importance of parallel processes, in particular, is emphasized by a number of rainfall-flow and
solute transport examples. Jakeman and his colleagues (see Reference 9 and the references
therein) have arrived at related conclusions in their analysis of rainfall-flow records from various
parts of the world.
5. CONCLUSIONS
This paper has introduced the concept of data-based mechanistic (DBM) modelling and
illustrated its practical utility with two applications in the area of rainfall-flow modelling.
These examples are inherently non-linear and demonstrate well how recursive fixed interval
smoothing (FIS) can be used as a tool in time and state dependent parameter estimation, as well
as provide a method of structure identification for non-linear stochastic, dynamic systems. A
novel aspect of the latter analysis is the use of weighted least squares (maximum likelihood)
estimation of the non-linear functions, with the weighting based on the covariance estimates
obtained in the FIS estimation.
The approach to non-linear modelling proposed here can be enhanced in various ways. First,
improved hypothesis tests are required to indicate whether the FIS estimates of the parameters
are exhibiting significant temporal variations which justify further evaluation by state dependent
modelling (SDM). Second, more research is required into the optimization of the noise variance
ratio (NVR) parameters, which are so important to SDM estimation. Third, a more systematic
SDM procedure needs to be developed, particularly the methods by which the FIS estimates are
related to the measured NMSS variables that characterize the process: so far, only a rather
inefficient, ad hoc approach has been utilized. Finally, once the model structure identification
analysis has been refined in this manner, it will be necessary to introduce better, statistically
based, non-linear optimization techniques in the final non-linear estimation phase of the analysis
(e.g. maximum likelihood or prediction error minimization), with accompanying, improved
statistical validation procedures appropriate to these kinds of non-linear model.
In a more general environmental systems context, we believe that one of the most important
aspects of the DBM analysis may be its potential for use in the objective identification and
estimation of active mixing volume (AMV) models which are able to characterize the dynamics
associated with mass, energy or momentum conservation in imperfectly mixed environmental
flow processes.
APPENDIX I: SRIV IDENTIFICATION AND ESTIMATION
This Appendix describes briefly the SRIV method and outlines the associated identification and
optimization procedures used in the main text.
$$\left[\sum_{k=1}^{N} \hat{x}^*(k)\, z^*(k)^T\right] \hat{a} = \sum_{k=1}^{N} \hat{x}^*(k)\, y^*(k). \qquad (22)$$
These are a modification of the associated and well known least squares normal equations for the
same model. The data vectors in (22) are defined as follows;
$z^*(k)^T = [-y^*(k-1)\ \ -y^*(k-2)\ \ldots\ -y^*(k-n)\ \ u^*(k)\ \ldots\ u^*(k-m)],$
The adaptation of both the auxiliary model and the pre-filters is performed within a three step
iterative procedure and, following the final iteration, the following two matrices are
generated, in addition to the estimated parameter vector $\hat{a}(N)$:
$$P(N) = \left[\sum_{k=1}^{N} \hat{x}^*(k)\, z^*(k)^T\right]^{-1},$$
and the covariance matrix,
$$P^*(N) = \hat{\sigma}^2 \left[\sum_{k=1}^{N} \hat{x}^*(k)\, \hat{x}^*(k)^T\right]^{-1},$$
where $\hat{\sigma}^2$ is the variance of the model residuals $\hat{\xi}(k)$, i.e.
$$\hat{\xi}(k) = y(k) - \hat{x}(k); \quad \hat{\sigma}^2 = \frac{1}{N} \sum_{k=1}^{N} \hat{\xi}(k)^2. \qquad (26)$$
The matrix inverse in the definition of $P^*(N)$ is generated separately after estimation is complete,
with the required $\hat{x}^*(k)$ and $u^*(k)$ variables obtained from auxiliary model and pre-filtering
operations based on the final iteration estimate of $\hat{a}(N)$. In the case of Gaussian white residuals, it
can be shown that $P^*(N)$ is an estimate of the covariance matrix associated with the parameter
estimate vector $\hat{a}(N)$ obtained at the final iteration: consequently, the square root of its diagonal
elements provides an estimate of the standard error on the elements of $\hat{a}(N)$. $P(N)$ is utilized to
generate the YIC criterion, as discussed below in Section 1.2. The recursive-iterative version of
this SRIV estimation procedure follows straightforwardly from the above non-recursive
equations (see e.g. Reference 37).
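The non-recursive equations above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the iteration is initialized by ordinary least squares, the pre-filter is taken as $1/\hat{A}(z^{-1})$ from the previous iteration, and the function names and synthetic first order example are hypothetical.

```python
import numpy as np

def filt(num, den, x):
    """Apply the discrete-time filter num(z^-1)/den(z^-1) to x, with den[0] == 1."""
    out = np.zeros(len(x))
    for k in range(len(x)):
        acc = 0.0
        for i in range(min(len(num), k + 1)):
            acc += num[i] * x[k - i]
        for j in range(1, min(len(den), k + 1)):
            acc -= den[j] * out[k - j]
        out[k] = acc
    return out

def regressors(o, i, n, m):
    """Rows [-o(k-1) ... -o(k-n), i(k) ... i(k-m)] for k = max(n, m), ..., N-1."""
    start, N = max(n, m), len(o)
    cols = [-o[start - j:N - j] for j in range(1, n + 1)]
    cols += [i[start - j:N - j] for j in range(0, m + 1)]
    return np.column_stack(cols)

def sriv(u, y, n, m, iterations=3):
    """Simplified refined IV estimate of [a1..an, b0..bm] (sketch, assumptions above)."""
    start = max(n, m)
    theta = np.linalg.lstsq(regressors(y, u, n, m), y[start:], rcond=None)[0]
    for _ in range(iterations):
        a_hat = np.r_[1.0, theta[:n]]
        x_hat = filt(theta[n:], a_hat, u)                 # auxiliary model output
        y_f, u_f, x_f = (filt([1.0], a_hat, s) for s in (y, u, x_hat))  # pre-filters
        Z = regressors(y_f, u_f, n, m)                    # pre-filtered data vectors
        X = regressors(x_f, u_f, n, m)                    # pre-filtered instruments
        theta = np.linalg.solve(X.T @ Z, X.T @ y_f[start:])
    # Quantities generated after the final iteration
    a_hat = np.r_[1.0, theta[:n]]
    x_hat = filt(theta[n:], a_hat, u)
    sigma2 = np.mean((y - x_hat) ** 2)                    # residual variance
    u_f, x_f = (filt([1.0], a_hat, s) for s in (u, x_hat))
    X = regressors(x_f, u_f, n, m)
    cov = sigma2 * np.linalg.inv(X.T @ X)                 # covariance matrix estimate
    return theta, np.sqrt(np.diag(cov)), sigma2

# Synthetic first order example: y = 0.5 / (1 - 0.8 z^-1) u + white noise
rng = np.random.default_rng(0)
u = rng.normal(size=500)
y = filt([0.5], [1.0, -0.8], u) + 0.05 * rng.normal(size=500)
theta, std_err, sigma2 = sriv(u, y, n=1, m=0)
```

The instruments are built from the noise-free auxiliary model output, which is what distinguishes the normal equations from their least squares counterparts.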
algorithm is exploited directly in the optimization loop; the parameters to be optimized are
selected by the optimization procedure to minimize the variance $\hat{\sigma}^2$ of the residuals $\hat{\xi}(k)$ resulting,
at each step in the search procedure, from SRIV estimation of the parameters in the linear TF
part of the model. Of course, this procedure is only possible when the linear and non-linear parts
of the model are separable, as in the hourly data example and the initial analysis of the daily data,
where only the numerator of the TF is assumed to have state dependent parameters. It is not
possible if the non-linearity affects the denominator of the TF, as seems likely when the evapo-
transpiration effects are taken into account in the daily data case.
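The separable search can be sketched as an outer loop over the exponent with a linear estimation step inside it. In this illustration ordinary least squares stands in for SRIV, a grid search stands in for the numerical optimizer, and the TF is first order; the function name, the optional delay argument and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_exponent(u, y, gammas, delay=0):
    """Grid search over the power law exponent (hypothetical helper).

    For each gamma, u_e(k) = u(k) * y(k)**gamma is formed, the first order TF
    x(k) = -a1 x(k-1) + b0 u_e(k - delay) is fitted by least squares, and the
    variance of the simulation residual y - x is recorded.
    """
    N = len(y)
    start = max(1, delay)
    variances, thetas = [], []
    for gamma in gammas:
        ue = u * y ** gamma
        Z = np.column_stack([-y[start - 1:N - 1], ue[start - delay:N - delay]])
        theta = np.linalg.lstsq(Z, y[start:], rcond=None)[0]   # [a1, b0]
        x = np.array(y, dtype=float)                           # x[:start] = y[:start]
        for k in range(start, N):
            x[k] = -theta[0] * x[k - 1] + theta[1] * ue[k - delay]
        variances.append(np.var(y[start:] - x[start:]))
        thetas.append(theta)
    best = int(np.nanargmin(variances))
    return gammas[best], variances[best], thetas[best]

# Synthetic check: noise-free data generated with gamma = 0.6 and a one-sample delay
rng = np.random.default_rng(1)
u = rng.uniform(0.0, 1.0, 300)
y = np.ones(300)
for k in range(1, 300):
    y[k] = 0.9 * y[k - 1] + 0.5 * u[k - 1] * y[k - 1] ** 0.6
best_gamma, best_var, best_theta = fit_exponent(u, y, np.linspace(0.4, 0.8, 5), delay=1)
```

The inner step stays linear in the TF parameters only because the non-linearity enters through the input, which is the separability condition stated above.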
ACKNOWLEDGEMENTS
The authors are grateful for the supply of the rainfall-flow data sets. The hourly rainfall-flow
data were made available by Dr Anthony Jakeman of the Centre for Resource and Environ-
mental Studies at the Australian National University, Canberra; and the daily data were provided
by Professor John Norton of the Department of Electronic and Electrical Engineering, University of
Birmingham, England.
REFERENCES
1. Young, P. C. ‘Recursive approaches to time-series analysis’, Bulletin of the Institute of Mathematics and
Applications, 10, 209-224 (1975).
2. Young, P. C. ‘A general theory of modelling for badly defined dynamic systems’, in Vansteenkiste,
G. C. (ed.), Modeling, Identification and Control in Environmental Systems, North-Holland, Amster-
dam, 1978, pp. 103-135.
3. Young, P. C. ‘The validity and credibility of models for badly defined systems’, in Beck, M. B. and van
Straten, G. (eds.), Uncertainty and Forecasting of Water Quality, Springer-Verlag, Berlin, 1983.
4. Young, P. C. ‘Time variable and state dependent modelling of nonstationary and nonlinear time series’,
in Subba Rao, T. (ed.), Developments in Time Series Analysis, Chapman and Hall, London, 1993, pp.
374-413.
5. Young, P. C. and Runkle, D. ‘Recursive estimation and modelling of nonstationary and nonlinear
time-series’, in Adaptive Systems in Control and Signal Processing, Vol. I, Institute of Measurement and
Control for IFAC, London, 1989, pp. 49-64.
6. Young, P. C. and Minchin, P. E. H. ‘Environmetric time-series analysis: modelling natural systems
from experimental time-series data’, International Journal of Biol. Macromolecules, 13, 190-201 (1991).
7. Young, P. C. and Lees, M. ‘The active mixing volume: a new concept in modelling environmental
systems’, in Barnett, V. and Turkman, R. (eds), Statistics and the Environment, Wiley, Chichester, 1992,
pp. 3-43.
8. Stigter, J. D. and Beck, M. B. ‘A new approach to the identification of model structure’, Environmetrics,
5, (1994).
9. Jakeman, A. J., Post, D. A. and Beck, M. B. ‘From data and theory to models of the environment: the
case of rainfall-runoff models’, Environmetrics, 5, (1994).
10. Beer, T. and Young, P. C. ‘Longitudinal dispersion in natural streams’, Jnl. Env. Eng. Div., American
Soc. Civ. Eng, 102, 1049-1067 (1988).
11. Wallis, S. G., Young, P. C. and Beven, K. J. ‘Experimental investigation of the Aggregated Dead Zone
(ADZ) model for longitudinal solute transport in stream channels’, Proceedings of the Institute of Civil
Engineers, Part 2, 87, 1-22 (1989).
12. Jakeman, A. J., Littlewood, I. G. and Whitehead, P. G. ‘Computation of the instantaneous unit
hydrograph and identifiable component flows with application to two small upland catchments’,
Journal of Hydrology, 117, 275-300 (1990).
13. Young, P. C. and Beven, K. J. ‘Computation of the instantaneous unit hydrograph and identifiable
component flows with application to two small upland catchments-comment’, Journal of Hydrology,
129, 389-396 (1991).
14. Weyman, D. R. Runoff Processes and Streamflow Modelling, Oxford University Press, Oxford, 1975.
15. Kraijenhoff, D. A. and Moll, J. R. River Flow Modelling and Forecasting (Water Science and
Technology Library), D. Reidel, Dordrecht, 1986.
16. Whitehead, P. G., Young, P. C. and Hornberger, G. M. ‘A systems model of stream flow and water
quality in the Bedford-Ouse River; I stream flow modelling’, Water Research, 13, 1155-1169 (1979).
17. Young, P. C. Recursive Estimation and Time-Series Analysis, Springer-Verlag, Berlin, 1984.
18. Lees, M., Young, P. C. and Ferguson, S. ‘Adaptive flood warning’, in Young, P. C. (ed.), Concise
Encyclopedia of Environmental Systems, Pergamon, Oxford, 1993, pp. 234-236.
19. Lees, M., Young, P. C., Beven, K. J., Ferguson, S. and Burns, J. ‘An adaptive flood warning system for
the River Nith at Dumfries’, in White, W. R. and Watts, J. (eds.), River Flood Hydraulics, Wallingford,
Institute of Hydrology, 1994.
20. Young, P. C., Behzadi, M. A., Wang, C. and Chotai, A. ‘Direct digital and adaptive control by
input-output, state variable feedback’, International Journal of Control, 46, 1861-1881 (1987).
42. Wellstead, P. E. ‘An instrumental product moment test for model order estimation’, Automatica, 14,
89-91 (1978).
43. Young, P. C. ‘Recursive estimation, forecasting and adaptive control’, in Leondes, C. T. (ed.), Control
and Dynamic Systems, Vol. 30, Academic Press, San Diego, 1989, pp. 119-165.
44. Akaike, H. ‘A new look at statistical model identification’. IEEE Transactions on Automatic Control,
AC-19, 716-723 (1974).