
Journal of Hydrology (2006) 331, 161-177


Towards a Bayesian total error analysis of conceptual rainfall-runoff models: Characterising model error using storm-dependent parameters
George Kuczera a,*, Dmitri Kavetski b, Stewart Franks a, Mark Thyer a

a School of Engineering, University of Newcastle, NSW 2308, Australia
b Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ 08544, USA

Received 19 August 2005; received in revised form 30 April 2006; accepted 9 May 2006

KEYWORDS
Conceptual rainfall-runoff modelling; Parameter calibration; Model error; Input uncertainty; Bayesian parameter estimation; Parameter variation; Model determinism

Summary  Calibration and prediction in conceptual rainfall-runoff (CRR) modelling are affected by the uncertainty in the observed forcing/response data and by the structural error in the model. This study works towards the goal of developing a robust framework for dealing with these sources of error and focuses on model error. The characterisation of model error in CRR modelling has been thwarted by the convenient but indefensible treatment of CRR models as deterministic descriptions of catchment dynamics. This paper argues that the fluxes in CRR models should be treated as stochastic quantities because their estimation involves spatial and temporal averaging. Acceptance that CRR models are intrinsically stochastic paves the way for a more rational characterisation of model error. The hypothesis advanced in this paper is that CRR model error can be characterised by storm-dependent random variation of one or more CRR model parameters. A simple sensitivity analysis is used to identify the parameters most likely to behave stochastically, with variation in these parameters yielding the largest changes in model predictions as measured by the Nash-Sutcliffe criterion. A Bayesian hierarchical model is then formulated to explicitly differentiate between forcing, response and model error. It provides a very general framework for calibration and prediction, as well as for testing hypotheses regarding model structure and data uncertainty. A case study calibrating a six-parameter CRR model to daily data from the Abercrombie catchment (Australia) demonstrates the considerable potential of this approach. Allowing storm-dependent variation in just two model parameters (with one of the parameters characterising model error and the other reflecting input uncertainty) yields a substantially improved model fit, raising the Nash-Sutcliffe statistic from 0.74 to 0.94. Of particular significance is the use of posterior diagnostics to test the key assumptions about the data and model errors. The assumption that the storm-dependent parameters are log-normally distributed is only partially supported by the data, which suggests that the parameter hyper-distributions have thicker tails. The results also indicate that in this case study the uncertainty in the rainfall data dominates model uncertainty.
© 2006 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +61 2 49 216038; fax: +61 2 49 216991. E-mail address: george.kuczera@newcastle.edu.au (G. Kuczera).
doi:10.1016/j.jhydrol.2006.05.010


Introduction

Catchment models simulate water balance dynamics at the catchment scale. Because of the significance of water in terrestrial ecosystems, catchment models are an integral part of virtually all environmental models formulated at the catchment scale, and their applications range from catchment water and nutrient balances to biophysical models. This paper focuses on conceptual rainfall-runoff (CRR) models. An important, perhaps defining, feature of CRR models is that their parameters are not directly measurable and must be inferred (calibrated) from the observed data (e.g., Beven and Binley, 1992). The advantage of this class of models is their ability to capture the dominant catchment dynamics while remaining parsimonious and computationally efficient.

Characterising the uncertainty in streamflow predicted by a CRR model has attracted the attention of hydrologists over many years. Yet in recent reviews of CRR model calibration, Kuczera and Franks (2002), Kavetski et al. (2002) and Vrugt et al. (2005) note the lack of a robust framework that accounts for all sources of error (input, model and response error). Although Vrugt et al. (2005) propose a simultaneous parameter optimization and data assimilation method for improved uncertainty analysis, their strategy merges input and model structural errors into a single forcing term [p. 11]. One concludes that realistic statistical models of input and model structural error remain to be articulated.

The lack of a robust calibration framework has a number of implications for CRR modelling: (i) quantifying the predictive uncertainty in streamflow and other model outputs is problematic; (ii) the regionalisation of CRR model parameters continues to be confounded by biases in the calibrated parameters and unreliable assessment of parameter uncertainty; and (iii) it is difficult to discriminate between competing CRR model hypotheses because poor model performance can hide behind the veil of ignorance about the sources of error.
This paper focuses on a more rigorous characterisation of the uncertainty associated with CRR models. The study builds on the Bayesian total error analysis of Kavetski et al. (2002, 2006c,d), who proposed a parameter estimation methodology that discriminates between input, model and response errors. The main contribution of this work is an explicit characterisation of model error that is open to scrutiny and improvement. When linked with statistical models of input and response error, a basic total error framework emerges. This framework advances both operational hydrology (it improves the quality of predictions and generates more meaningful uncertainty bounds) and scientific hydrology (it yields insights into model error and thus facilitates model development).

The paper is organised as follows: after a brief review of CRR modelling, the need for characterising model error is motivated by an example. It is then argued that the notion of a deterministic CRR model is indefensible, at least for the current generation of such models. Although there are many ways to make a CRR model stochastic, a simple approach is to abandon the notion that model parameters are fixed quantities and instead assume that they are random variables. Different probability distributions for these parameters can then be investigated. A starting point is to assume that the CRR model parameters vary from storm to storm. This strategy offers a simple characterisation of model error and, importantly, one that can be tested. A Bayesian inference framework is then developed, which requires the modeller to make explicit assumptions regarding model, input and response uncertainty and allows testing these hypotheses against the available evidence. A case study illustrates this approach and highlights the role of diagnostic checks of key assumptions.

The traditional view of CRR model error


CRR models can be described using the following equation:

q_t = h(x_t, s_t; η, κ)    (1)

The vector q_t contains the true responses of the catchment at time t, which are observable point or spatially/temporally averaged quantities. In the simplest case, q_t is a scalar containing the streamflow observed at the catchment outlet. Generally, the function h(·) is a probability density function (pdf), so that the true response q_t is a random sample from the pdf h(·). The usual practice of assuming that the CRR model is deterministic yields a special form of h(·), the Dirac delta function.

The catchment responds to external forcing or inputs. The vector x_t represents the true input and contains one or more observable point or spatially/temporally averaged quantities, typically rainfall and potential evapotranspiration within the catchment at time t. The term s_t refers to the set of internal state and flux variables. An internal variable is one that is not observed or measured for the purpose of testing the model hypothesis. It is stressed that an internal variable may be observable in the sense that there exists a technology to observe or measure it. However, if that technology has not been applied, then the variable is internal to the model and thus cannot be scrutinised.

The terms η and κ refer respectively to the sets of conceptual and physical parameters. These parameters determine q_t and s_t for a given external forcing x_t. Physical parameters can be estimated using procedures that are independent of observable catchment responses q_t (e.g., laboratory measurements of soil core permeabilities), whereas conceptual parameters can only be inferred (calibrated) by some process involving matching simulated catchment responses to observed values of q_t. In the remainder of this paper we omit reference to the internal states s_t and to the physical parameters κ, since the latter are inferred independently of the conceptual parameters η. The CRR model then reduces to

q_t = h(x_t; η)    (2)


All calibration methods in CRR modelling involve some form of matching simulated responses h(x_t; η) to the observed responses q_t. These methods necessarily make assumptions, either explicit or implicit, about how errors arise and propagate through the CRR model to affect the simulated catchment responses (see Kavetski et al. (2002) for a more complete overview). Fig. 1(a) summarises our current understanding of the uncertainty in CRR modelling. There are three distinctly different sources of error. The observation of forcing inputs, particularly rainfall but also potential evapotranspiration (PET), is subject to measurement error and, more importantly, is affected by the sampling uncertainty arising from incomplete sampling of the spatially/temporally distributed random fields. The response of the catchment, typically streamflow discharge at one or more locations, is itself subject to measurement and rating curve errors. Finally, given the simplifications made in deriving CRR models, they cannot be expected to reproduce the true response exactly even if error-free forcing and response data were available; this discrepancy is termed model or structural error.

In stark contrast, Fig. 1(b) shows the conceptualisation that underpins the calibration methods dominating current practice. The defining features of these methods are listed below:

1. Input error is ignored or assumed negligible, i.e., the observed forcing x̃_t is assumed to be equal to the true forcing x_t.
2. Model and response errors are lumped into a single random process, the simplest being the standard least squares (SLS) error model

   q̃_t = h(x_t; η) + ε_t    (3)

   where ε_t is a random independent Gaussian error with constant variance.

The objective of this paper is to analyse the effect of these assumptions on the calibrated parameters and model predictions, and to suggest practical strategies for making CRR calibration more consistent with the error propagation schematic shown in Fig. 1(a).
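A minimal sketch of the SLS scheme in point 2 may help make the lumped-error assumption concrete. The CRR model here is a hypothetical one-store stand-in for h(x_t; η), and all data are synthetic; this is an illustration of the calibration recipe, not the authors' code.

```python
import numpy as np
from scipy.optimize import minimize

def crr_model(rain, params):
    """Hypothetical stand-in for a CRR model h(x_t; eta):
    a single linear store, purely for illustration."""
    k, frac = params
    s, q = 0.0, np.empty_like(rain)
    for t, r in enumerate(rain):
        s += frac * r                  # infiltrating fraction of rainfall
        out = k * s                    # linear-reservoir outflow
        s -= out
        q[t] = out + (1 - frac) * r    # delayed flow plus quickflow
    return q

def sls_objective(params, rain, q_obs):
    """Standard least squares (Eq. 3): model and response errors
    lumped into one additive Gaussian term."""
    resid = q_obs - crr_model(rain, params)
    return np.sum(resid ** 2)

rng = np.random.default_rng(1)
rain = rng.gamma(0.3, 10.0, size=365)                     # synthetic daily rainfall
q_obs = crr_model(rain, (0.2, 0.6)) + rng.normal(0, 0.1, 365)
fit = minimize(sls_objective, x0=(0.1, 0.5), args=(rain, q_obs),
               method="Nelder-Mead")
```

Whatever structural or input error is present ends up folded into the residuals that this single objective minimises, which is precisely the limitation the paper examines.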

Significance of model error: an example


It is widely acknowledged that the conceptualisation of error propagation shown in Fig. 1(b) is a crude approximation to reality (e.g., Beven and Binley, 1992; Kuczera and Franks, 2002). This can be illustrated by an example involving the calibration of the Sacramento model to daily rainfall-runoff data for the 2770 km² Abercrombie River at Abercrombie (412028) in New South Wales, Australia. Thirteen parameters were calibrated to two years of data using the SLS criterion with the runoff data square-root transformed to account for the nonstationary variance of the runoff errors (two of the 13 calibrated parameters were poorly identified and highly correlated). Fig. 2 presents a scatter plot of observed and simulated daily runoff, including the 90% confidence and prediction limits. The confidence limits (based on a linear approximation) reflect the uncertainty in the fitted Sacramento model parameters, while the prediction limits account for both parameter uncertainty and the uncertainty arising from the noise term ε_t in Eq. (3), which lumps model and response errors. The Nash-Sutcliffe (NS) statistic was 0.73. What is striking is the limited contribution of parameter uncertainty to the overall predictive uncertainty, highlighting the often overlooked fact that poorly determined parameters do not necessarily lead to high predictive uncertainty. Instead, most of the predictive uncertainty is dominated by the noise

[Figure 1. Schematic of error propagation in CRR models (sources of errors shaded grey) (taken from Kavetski et al. (2002)). Panel (a), the true conceptualisation: input errors corrupt the observed input data, model errors affect the conceptual catchment model, and response errors corrupt the observed response data. Panel (b), the current conceptualisation: input errors are ignored and model and response errors are lumped into a single term ε_t.]

[Figure 2. Runoff scatter plot (ML/day) for the Sacramento model calibrated to two years of daily runoff for the Abercrombie River, showing observed against simulated runoff with 90% confidence and prediction limits.]
term ε_t. Noting that the 90% prediction limit interval represents 60-80% of the simulated runoff, it is highly unlikely that this magnitude of uncertainty is due to errors in estimating runoff, since a gauging station with a stable and well-developed rating curve is unlikely to have a coefficient of variation in errors exceeding 5-10%. The evidence therefore strongly suggests that the bulk of the predictive uncertainty is due to structural errors in the model and input data uncertainty, both of which are ignored by the SLS calibration.

Deeper insight into the nature of the model and forcing error is provided by Fig. 3, which presents a time series plot

of observed and simulated daily runoff. It is immediately evident that the model error is highly structured and completely at odds with the SLS assumption of independence from one day to the next. There are long runs of systematic over- and under-estimation. Many recessions are systematically mis-specified, while several peaks dominated by quickflow are either spuriously exaggerated or completely missed. These qualitative features are well known to practitioners and researchers: it is generally recognised that model and input errors induce a complex uncertainty structure in the model parameters and predictions. Our focus in this

[Figure 3. Runoff time series (ML/day) for the Sacramento model calibrated to two years of daily rainfall-runoff data for the Abercrombie River using the SLS method, showing observed and simulated runoff over days 400-1100.]

paper is to disentangle the contributions of model and forcing errors and to acquire a deeper understanding of how these uncertainties affect and propagate through CRR models.
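Two simple diagnostics underlying the discussion above, the Nash-Sutcliffe statistic and the day-to-day structure of the residuals, can be sketched as follows. The data here are synthetic; persistent AR(1) residuals mimic the structured error runs seen in Fig. 3, whereas the SLS assumption corresponds to white residuals.

```python
import numpy as np

def nash_sutcliffe(q_obs, q_sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / SS_tot.
    1 = perfect fit; 0 = no better than the mean of the observations."""
    q_obs, q_sim = np.asarray(q_obs, float), np.asarray(q_sim, float)
    sse = np.sum((q_obs - q_sim) ** 2)
    return 1.0 - sse / np.sum((q_obs - q_obs.mean()) ** 2)

def lag1_autocorr(resid):
    """Lag-1 autocorrelation of calibration residuals; values near one
    indicate structured model/input error rather than independent noise."""
    r = np.asarray(resid, float) - np.mean(resid)
    return np.sum(r[1:] * r[:-1]) / np.sum(r * r)

rng = np.random.default_rng(0)
white = rng.normal(size=2000)        # independent errors (SLS assumption)
ar1 = np.zeros(2000)                 # persistent errors
for t in range(1, 2000):
    ar1[t] = 0.9 * ar1[t - 1] + rng.normal()

print(lag1_autocorr(white))   # near 0: consistent with SLS
print(lag1_autocorr(ar1))     # near 0.9: structured error
```

A lag-1 autocorrelation far from zero in the calibration residuals is a cheap first check that the independence assumption of Eq. (3) is violated.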


A storm-based characterisation of model uncertainty


To better understand the nature of CRR model error, it is logical to review the assumptions that underpin CRR models. These models focus on the dominant catchment dynamics and in most cases are deliberately constructed to be parsimonious with regard to model parameters (to ease the burden of calibration). The resulting simplification of catchment processes is likely to be the major source of model error. Using this insight, we explore plausible mechanisms that give rise to structural error in CRR models.

CRR models typically route water through one or more conceptual storages. These one-dimensional stores represent two- or three-dimensional features of the catchment, and therefore the contents of these conceptual stores are necessarily spatially averaged. The flux of water entering and leaving a store is determined either by forcing inputs such as rainfall and PET, or by flux equations conditioned on the contents of the store and the parameter values. In addition, the observed forcings (such as rainfall and PET) are spatial and temporal averages of random fields. There are infinitely many spatially and temporally distributed rainfall fields that yield the same average catchment rainfall. However, each distinct rainfall field causes a different hydrologic response. For example, if the main mass of the rainfall field were located over the saturated part of the catchment, a significant quickflow response would occur. Conversely, if that same rainfall were located over an unsaturated part of the catchment, no quickflow response would occur and the soil store would be recharged. Models based on spatial and temporal averaging of input fields and store contents cannot replicate such differences and will generally produce errors in the response. Finally, since CRR models may have interconnected stores, an error in one flux can propagate through downstream stores and affect other fluxes.
For example, if the recharge flux to a groundwater store is underestimated, then the baseflow recession will be underestimated over a time scale comparable to the response time of the groundwater store: the groundwater store remembers the recharge flux error and induces a persistent baseflow flux error. This mechanism may be responsible for the characteristic systematic over- or under-estimation of the baseflow recession seen in Fig. 3.

These considerations suggest that it is unreasonable to expect a CRR model to deterministically predict the catchment response even if the exact spatially and temporally averaged forcing were known: spatial and temporal averaging induces unavoidable errors in the internal model fluxes. We argue that the notion of a deterministic CRR model, though statistically and mathematically convenient, is difficult to justify and needs to be relaxed if a rational description of model error is to be developed. The central question is how to relax the determinism that is currently embedded in the CRR paradigm. There are at least two ways to make the flux equations stochastic:

1. Relax the assumption that CRR parameters are time-invariant constants. By allowing some of the CRR parameters to be random variables over some characteristic time scales, stochastic variations in the fluxes can be modelled.
2. Perturb the internal states over some characteristic time scale.

The notion of time-varying parameters is not new. The state-space formulation underlying the Kalman filter naturally allows for time variation in parameters: in the extended Kalman filter, CRR parameters can be treated as state variables that are randomly perturbed at every update step (see Bras and Rodriguez-Iturbe (1985) for an overview of hydrologic applications). However, as Kavetski et al. (2002) observe, the extended Kalman filter approach to CRR modelling is hampered by assumptions of linearity of the state equation and Gaussian structure of all errors. Although recent work by Vrugt et al. (2005) using the ensemble Kalman filter has shown that model nonlinearity can be accommodated, their approach of lumping model and input error into an additive Gaussian error fails to address the fundamental differences between input and model error.

The critical issue to be addressed is the temporal variation of the random perturbations of the model fluxes. For example, if a CRR model uses an hourly time step for computation, should the fluxes also be randomly perturbed at hourly intervals? If the hourly time step is significantly less than the response time of the store receiving the flux, independent hourly perturbations would average out and the store would behave like a low-pass filter responding only to the average component of the input. Some means of representing the persistence in the random perturbation of the flux is therefore necessary. One way to introduce such persistence is to randomly perturb the model parameters at the beginning of each storm event. This strategy is consistent with the idea of storm-dependent parameters explored by Kuczera (1990).
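The low-pass filtering argument can be checked numerically: feed a slow linear store an inflow whose multiplier is redrawn either at every time step or once per storm-scale block, and compare the variability of the outflow. The store constant, block length and noise level below are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

def store_outflow(inflow, k=0.01):
    """Slow linear store (k << 1): outflow q = k * storage."""
    s, q = 100.0, np.empty_like(inflow)   # start at equilibrium for unit inflow
    for t, x in enumerate(inflow):
        s += x
        q[t] = k * s
        s -= q[t]
    return q

rng = np.random.default_rng(3)
n, block = 5000, 100                  # time steps; storm-scale block length
base = np.full(n, 1.0)                # constant mean inflow

# new multiplier every step vs one multiplier held over each storm-scale block
step_mult = rng.lognormal(0.0, 0.3, n)
storm_mult = np.repeat(rng.lognormal(0.0, 0.3, n // block), block)

q_step = store_outflow(base * step_mult)
q_storm = store_outflow(base * storm_mult)

# Independent per-step perturbations are averaged out by the slow store,
# while storm-scale perturbations persist in the outflow.
print(np.std(q_step[1000:]), np.std(q_storm[1000:]))
```

The outflow variability under storm-scale perturbation is several times larger, which is why the perturbation time scale, not just its magnitude, matters for characterising model error.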
Since the rainfall during a storm event represents the primary (and the most spatially and temporally heterogeneous) forcing of the catchment water balance, flux perturbations are likely to persist over storm-event time scales. This is arguably the simplest hypothesis that allows for persistence in model error, and it must be judged by its consistency with the available evidence.

The simplest stochastic perturbation approach partitions the set of conceptual parameters as η = (θ, ω), where ω is the n_d-vector of time-invariant (deterministic) parameters and θ is the n_s-vector of storm-dependent (stochastic) parameters. The latter are then described by sampling distributions (pdfs), such as the following simple Gaussian form:

θ_i ~ N(θ | μ_i, σ_i²),   i = 1, ..., n_s    (4)

where N(θ | μ_i, σ_i²) is an independent normal pdf with mean μ_i and variance σ_i². This distribution is stationary (constant mean and variance) if the catchment is assumed not to be undergoing change over time (natural and anthropogenic changes may invalidate this assumption).

An alternative approach to introducing stochasticity into the model fluxes involves randomly perturbing CRR store depths at the beginning of each storm event, while leaving

the CRR parameters time-invariant. This approach can be readily implemented in the state-space formulation of the Kalman filter by treating the depths of CRR stores as state variables and adding random perturbations at the beginning of each storm. However, this approach is fundamentally unsatisfying: the mass balance of a CRR store is deterministic, and the appearance of a mass imbalance is a consequence of spatial and temporal averaging, not a violation of mass conservation. Since it is best to work directly with the more plausible cause of model error, rather than with its symptom, the approach of perturbing CRR store depths is not explored further in this paper.

Identifying likely storm-dependent CRR parameters

Our hypothesis is that a major portion of model error can be attributed to flux errors arising from spatial and temporal averaging. We postulate that such errors can be described by randomly sampling one or more parameters (affecting flux magnitudes) from a probability distribution at the start of each storm. This section demonstrates the plausibility of this hypothesis and introduces a strategy for identifying the CRR parameters most likely to be storm-dependent.

Fig. 4 illustrates a typical CRR model, a member of the saturated path modelling (SPM) family (Kavetski et al., 2003), hereafter referred to as log SPM. The log SPM model contains three conceptual stores and seven parameters, as follows:

(1) The soil store with depth s (mm) is forced by rainfall (rain) and potential evapotranspiration (pet) to generate four fluxes according to the following water balance:

    ds/dt = rMult·rain - quickf - ssff - rgef - ets    (water balance)
    f = (1 + sF) / (1 + sF·99^(1 - exp(k·s)/100))      (saturation-soil depth function)
    quickf = f·rMult·rain                              (quickflow flux)
    ssff = f·ssfMax                                    (subsurface stormflow flux)
    rgef = f·rgeMax                                    (groundwater recharge flux)
    ets = pet·(1 - exp(-s))                            (actual evapotranspiration flux)    (5)

where sF, k, ssfMax, rgeMax are CRR parameters (all positive) and rMult is a rainfall storm depth multiplier. The relationship between the soil store depth and the model fluxes (excepting ET) is a modified logistic function.

(2) The groundwater store is a linear reservoir with depth h (mm) receiving a recharge flux from the soil store and discharging a baseflow flux into the stream according to

    dh/dt = rgef - bf_h    (water balance)
    bf_h = kBf·h           (baseflow flux)    (6)

where kBf is a CRR parameter.

(3) The stream store is a linear reservoir with depth S (mm) temporarily delaying the progress of water in the stream channel according to

    dS/dt = quickf + ssff + bf_h - q_S    (water balance)
    q_S = kStream·S                       (stream runoff flux)    (7)

where kStream is a CRR parameter.

Table 1 summarises the log SPM parameters. The seventh parameter, rMult, needs further comment. Kavetski et al. (2002, 2006c,d) used storm-dependent rainfall depth multipliers as an explicit (albeit approximate) representation of input uncertainty, which corresponds to the assumption that the rainfall errors are multiplicative (i.e., rain_true = rain_obs·rMult, with rMult varying from storm to storm). This paper uses the same approach to account for the uncertainty in the observed catchment rainfall.

The effects of storm-dependent parameter stochasticity were explored using a synthetic daily runoff time series Qo derived from the two-year daily rainfall record for the Abercrombie River. The series Qo was generated assuming all the parameters were deterministic (their values are given in Table 1 and were obtained by fitting the log SPM model to the Abercrombie runoff record with a Nash-Sutcliffe statistic of 0.73, the same as obtained with the more complex Sacramento model). Given the same rainfall series, a new runoff time series Qi was generated with θi (the ith log SPM parameter) selected as being stochastic (its distribution is assumed to be log-normal with expected value given in Table 1 and a given coefficient of variation CV), while keeping the remaining parameters time-invariant (values given in Table 1). A new value of θi was sampled from the assumed log-normal distribution at the beginning of each storm.

The Nash-Sutcliffe statistic NS(i) was then evaluated for the runoff time series Qo and Qi, where Qi was treated as the simulated time series and Qo as the traditional observed time series. Fig. 5 presents a plot of NS(i), i = 1, ..., 7, for a range of CVs. Several important observations can be made:

1. The model predictions, and hence the NS statistic, are most sensitive to storm-dependent variation in the parameter k. This parameter specifies how rapidly the saturated area grows as a function of the soil store depth s, and controls the production of saturation overland flow and subsurface stormflow. A CV of 20% reduces the NS statistic to values as low as 40-50%.
[Figure 4. Schematic of the log SPM model: rainfall (rMult·rain) and evapotranspiration (et) force the soil store; the saturated area function f partitions saturation overland flow (sof), subsurface stormflow (ssf) and recharge (rge); the groundwater linear store of depth h discharges baseflow (bf); the stream linear store discharges streamflow (q).]

Table 1  Summary of log SPM parameters

Parameter  Description                                    Expected value*
k          Exponent controlling saturated area fraction   0.02
sF         Exponent controlling saturated area fraction   2300
ssfMax     Subsurface stormflow at full saturation        0.62 mm/day
rgeMax     Groundwater recharge rate at full saturation   5.6 mm/day
kBF        Groundwater discharge constant                 6.3 × 10⁻⁵
kStream    Stream discharge constant                      0.47
rMult      Observed storm depth rainfall multiplier       1.21

* Expected value obtained by calibrating to two years of daily rainfall-runoff data at Abercrombie River.
2. The second most sensitive parameter is the rainfall multiplier rMult. This parameter regulates the magnitude of the error in the rainfall, the primary forcing of the model.
3. The remaining parameters display limited sensitivity to storm-dependent variation, suggesting they are best treated as time-invariant.

Using the insights from Fig. 5, we attempt to replicate the effects of parameter stochasticity on model calibration with the following experiment. A runoff series was generated by log SPM with the parameter k made log-normally storm-dependent with a CV of 30% and all other parameters set as time-invariant with the values given in Table 1. This time series is shown as the observed series in Fig. 6. The simulated series in Fig. 6 was obtained by SLS fitting of the log SPM model to this synthetic observed series, assuming all the parameters are deterministic (time-invariant). Comparison with Fig. 3 (which shows calibration to real observed data) reveals qualitative similarities: peaks are either missed or spuriously generated, while baseflow recessions are often systematically in error. Therefore, calibrating a stochastic-parameter model under the erroneous assumption that its parameters are time-invariant produces model mismatches that are qualitatively similar to those observed in typical hydrological calibrations.
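The synthetic experiment just described can be sketched end to end. The model below is a deliberately simplified two-store stand-in for log SPM, with an invented logistic saturated-area function and invented parameter values; only the recipe follows the text: redraw a log-normal storm-dependent parameter at the start of each epoch, then compare the deterministic series Qo against the stochastic-parameter series Qi via the Nash-Sutcliffe statistic.

```python
import numpy as np

def simple_crr(rain, k_storm, storms, ssf_max=0.6, k_bf=0.05):
    """Simplified stand-in for log SPM: soil store with a logistic
    saturated-area function f(s) plus a groundwater linear store.
    The parameter k is redrawn per storm epoch via k_storm[storms[t]]."""
    s, h = 50.0, 10.0
    q = np.empty_like(rain)
    for t, r in enumerate(rain):
        k = k_storm[storms[t]]                       # storm-dependent parameter
        f = 1.0 / (1.0 + np.exp(-k * (s - 100.0)))   # invented logistic form
        quickf = f * r                               # quickflow
        ssff = f * ssf_max                           # subsurface stormflow
        bf = k_bf * h                                # baseflow
        s = max(s + r - quickf - ssff - 0.2 * s, 0.0)   # crude soil balance
        h = max(h + 0.1 * ssff - bf, 0.0)               # crude groundwater balance
        q[t] = quickf + ssff + bf
    return q

def nash_sutcliffe(q_obs, q_sim):
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

rng = np.random.default_rng(42)
n_days, n_storms = 730, 60
rain = rng.gamma(0.3, 10.0, n_days)
storms = np.sort(rng.integers(0, n_storms, n_days))   # crude epoch labels

mu, cv = 0.05, 0.3                                    # mean and CV of storm-dependent k
sigma = np.sqrt(np.log(1.0 + cv ** 2))
k_stoch = mu * rng.lognormal(-0.5 * sigma ** 2, sigma, n_storms)  # mean mu, CV cv

q_o = simple_crr(rain, np.full(n_storms, mu), storms)  # deterministic parameters
q_i = simple_crr(rain, k_stoch, storms)                # storm-dependent k
print(nash_sutcliffe(q_o, q_i))   # below 1: parameter variability degrades the fit
```

Sweeping the CV and repeating this for each parameter in turn reproduces the kind of sensitivity ranking shown in Fig. 5.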

Incorporating model uncertainty into the BATEA framework


The Bayesian total error analysis (BATEA) methodology incorporates forcing uncertainty into the calibrated parameter distributions by treating some function of the true input (e.g., storm depth multipliers) as latent variables (Kavetski et al., 2002, 2006c,d). This framework can be readily extended to accommodate model uncertainty, especially when expressed in the form of time-dependent

[Figure 5. Sensitivity of the Nash-Sutcliffe statistic to storm-dependent parameter variability.]

[Figure 6. Time series of synthetic runoff (mm) generated by log SPM, illustrating the effects of ignoring potential stochasticity of the model. The observed series was generated with a storm-dependent k with 30% CV; the simulated series was obtained by SLS calibration assuming all parameters are fixed.]

(e.g., storm-dependent) parameters. In this section the error propagation process outlined in Fig. 1(a) is formally defined in terms of the hierarchical model shown in Fig. 7.

Suppose a hydrologic time series is partitioned into n epochs {(t_i, t_{i+1} - 1), i = 1, ..., n}, where t_i is the time step index corresponding to the beginning of the ith epoch. Each epoch begins with a storm event and ends with a dry spell exceeding a minimum duration. The observed response time series for the ith epoch is q̃_i = {q̃_t, t = t_i, ..., t_{i+1} - 1}, whereas q_i is the true response time series for the ith epoch. The vectors x̃_i and x_i contain the observed and true forcing time series respectively for the ith epoch.

The BATEA hypothesis of Kavetski et al. (2006c) assumed a function g(x̃_i, φ_i) that maps the observed forcing x̃_i into the true forcing x_i. The function g(·) accounts for the sampling and measurement error in the observed forcing. For example, Kavetski et al. (2002, 2006c) considered the special case of φ_i being a scalar storm depth multiplier, which yields the mapping x_i = φ_i·x̃_i (note that in log SPM, the parameter rMult serves as φ). The vector φ_i is assumed to vary from storm to storm and to be a random realisation from the probability model with pdf p(φ|α):

φ_i ~ p(φ | α)    (8)

where α is a vector of parameters describing the statistical properties of the input errors (e.g., the mean and variance of the multipliers).

Since the same averaged inputs can yield different internal dynamics and thus different catchment responses, we extend the BATEA hypothesis by relaxing the assumption that the CRR model is deterministic in the sense of producing

[Figure 7. Schematic of the hierarchical BATEA model. Input error: the observed input x̃_i is mapped to the true input x_i = g(x̃_i, φ_i), with φ_i ~ p(φ|α). Model error: storm-dependent CRR parameters are drawn as θ_i ~ p(θ|β). The true streamflow is q_i = h(x_i, θ_i, ω). Response error: the observed streamflow is distributed as q̃ ~ p(q̃|q, γ).]

a unique response for a given forcing input and a set of CRR parameters. Specifically, we assume that for each epoch there exists a CRR model h(x_i, θ_i, ω) that maps the true forcing x_i into the true response q_i, where θ_i is a set of event-specific CRR model parameters drawn from the hyper-distribution p(θ|β):

θ_i ~ p(θ | β)    (9)
where b are the CRR hyper-parameters (e.g., means and variances of the parameters). The storm-dependent parameters are therefore treated as latent (or hidden) variables. The true response for the ith storm epoch then becomes qi hxi ; hi ; x 10

where ω are the time-invariant CRR parameters. The observed response is corrupted by errors in the gauging process and is assumed to be distributed according to

q̃_i ~ p(q̃|q_i, γ)    (11)

where p(q̃|q, γ) describes the response measurement error and is conditioned on the true discharge q and the parameter set γ that characterises the error process.

The hierarchical BATEA model is atypical of Bayesian hierarchical models, since the sampling of the true response q is not independent of earlier storm epochs. This complication arises because the time memory of CRR models induces a dependence between storm epochs: storm-dependent parameters of early storm events affect model responses in subsequent events. This dependence requires careful attention and precludes routine application of Bayesian hierarchical model packages.

BATEA inference problem

The primary objective of BATEA is to identify the parameters α, β, ω and γ given the observed streamflow time series Q̃ = {q̃_i; i = 1, …, n}, the observed forcing time series X̃ = {x̃_i; i = 1, …, n} and any prior information. In the Bayesian framework this inference problem is described by the posterior pdf

p(α, β, ω, γ|Q̃, X̃) = ∫ p(α, β, ω, γ, θ_{1:n}, φ_{1:n}|Q̃, X̃) dθ_{1:n} dφ_{1:n}    (12)

where p(α, β, ω, γ, θ_{1:n}, φ_{1:n}|Q̃, X̃) is the full posterior pdf, θ_{1:n} = {θ_1, …, θ_n} contains the sets of storm-dependent CRR parameter realisations for all the storms, and φ_{1:n} = {φ_1, …, φ_n}. Direct evaluation of this integral is formidable due to its high dimensionality and strong nonlinearity. Following Kavetski et al. (2002), it is advantageous (both statistically and computationally) to work directly with the full posterior. Our primary interest in this study is to find the most probable (or modal) posterior parameters, given by the maximum of the posterior pdf

(α̂, β̂, ω̂, γ̂, θ̂_{1:n}, φ̂_{1:n}) = arg max over (α, β, ω, γ, θ_{1:n}, φ_{1:n}) of p(α, β, ω, γ, θ_{1:n}, φ_{1:n}|Q̃, X̃)    (13)

To focus on the key issues of characterising model uncertainty using storm-dependent parameters, the following simplifications can be made:

1. Since the latent variables of the input and structural error models are both associated with storm epochs, they are combined into θ, the set of storm-dependent parameters.
2. The response measurement error parameters γ are assumed to be known. In the case of streamflow, this is a reasonable assumption since at a well-maintained gauging station there would be numerous flow gaugings to develop the rating curve.

Exploiting these simplifications along with those offered by the hierarchical structure of BATEA yields the following posterior pdf:

p(β, ω, θ_{1:n}|Q̃, X̃, γ) = p(Q̃|β, ω, θ_{1:n}, X̃, γ) p(β, ω, θ_{1:n}|X̃, γ) / p(Q̃|X̃, γ)
                        = p(Q̃|ω, θ_{1:n}, X̃, γ) p(θ_{1:n}|β, ω, X̃, γ) p(β, ω|X̃, γ) / p(Q̃|X̃, γ)
                        = p(Q̃|ω, θ_{1:n}, X̃, γ) p(θ_{1:n}|β) p(β) p(ω) / p(Q̃|X̃, γ)
                        ∝ p(Q̃|ω, θ_{1:n}, X̃, γ) p(θ_{1:n}|β) p(β) p(ω)    (14)

where p(Q̃|ω, θ_{1:n}, X̃, γ) is the likelihood function (sampling distribution) of Q̃ which, according to Fig. 7, is independent of β, p(θ_{1:n}|β) is the hyper-distribution of θ_{1:n} which only depends on β, and p(ω) and p(β) are prior pdfs.

In addition to identifying the modal values of the storm-dependent parameters and the parameters of their hyper-distributions by maximising the objective function (14), the full distribution of these quantities could be obtained using a Monte Carlo or Markov chain Monte Carlo method. However, this is nontrivial due to the high dimensionality of the posterior pdf (14) and the dependence between storm epochs described in the previous section. Consequently, this paper limits itself to determining the modal values of all quantities of interest. In the case of the storm-dependent model parameters, this includes the mean and standard deviation of their hyper-distributions and therefore gives significant information regarding the overall shape of these distributions. Avenues for more thorough analysis of posterior BATEA pdfs will be investigated in future papers.
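To make the structure of (12)–(14) concrete, the following sketch builds the simplified log-posterior for a toy one-parameter linear store and locates its mode with a quasi-Newton scheme. Everything here is an illustrative stand-in: the model `crr_model` replaces log SPM, the synthetic storm epochs replace the Abercrombie data, and the weak Gaussian priors stand in for the scaled inverse-χ² hyper-priors used in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(42)

def crr_model(rain, theta, omega):
    """Toy stand-in for h(x_i, theta_i, omega): a single linear store whose
    outflow fraction k = sigmoid(theta) plays the role of a storm-dependent
    parameter; omega is an illustrative time-invariant runoff coefficient."""
    k = 1.0 / (1.0 + np.exp(-theta))
    s, q = 0.0, np.empty(len(rain))
    for t, r in enumerate(rain):
        s += omega * r
        q[t] = k * s
        s -= q[t]
    return q

def neg_log_post(z, rain, qobs, sigma_c=0.25):
    """Negative log of the simplified posterior (14): likelihood times
    hyper-distribution p(theta|beta) times weak priors on the hyper-parameters."""
    n = len(rain)
    omega, mu, log_sd = z[0], z[1], z[2]
    theta = z[3:]                       # one latent parameter per storm epoch
    sd = np.exp(log_sd)
    lp = 0.0
    for i in range(n):                  # likelihood p(Q~ | omega, theta_{1:n}, X~, gamma)
        lp += norm.logpdf(qobs[i], crr_model(rain[i], theta[i], omega), sigma_c).sum()
    lp += norm.logpdf(theta, mu, sd).sum()   # hyper-distribution of theta_{1:n}
    lp += norm.logpdf(mu, 0.0, 1.0)          # weak prior on the hyper-mean
    lp += norm.logpdf(log_sd, -1.0, 1.0)     # weak prior on the log hyper-sd
    return -lp                               # (stands in for the Inv-chi^2 prior)

# Synthetic "storm epochs": 5 epochs of daily rain, responses from known parameters
rain = [rng.gamma(2.0, 3.0, size=6) for _ in range(5)]
true_theta = rng.normal(-1.0, 0.3, size=5)
qobs = [crr_model(r, th, 0.7) + rng.normal(0.0, 0.25, len(r))
        for r, th in zip(rain, true_theta)]

z0 = np.concatenate([[0.5, 0.0, 0.0], np.zeros(5)])   # omega, mu, log_sd, theta_1..5
fit = minimize(neg_log_post, z0, args=(rain, qobs), method="L-BFGS-B")
```

Note that the search space has dimension n_d + n·n_s plus the hyper-parameters (here 1 + 5 + 2), which mirrors why the dimensionality of (14) grows with the number of storms.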

Maximisation of the posterior probability density function

The maximisation of the posterior pdf (14) is computationally challenging, since it has to be performed in a high-dimensional space well beyond sizes normally encountered in CRR modelling. If there are n storm epochs, n_s storm-dependent parameters and n_d time-invariant CRR parameters, the dimension of the space is n_d + n·n_s, and may include hundreds, perhaps thousands, of variables. This fact significantly limits the selection of optimisation algorithms for the BATEA inference problem.

Global optimisation methods that do not rely on the continuity of the objective function perform inefficiently for such high-dimensional problems. For example, the SCE-UA algorithm of Duan et al. (1992), often used in CRR model calibration, employs a population of simplexes and suffers a rapid increase in computing time and memory with respect to the dimension of the problem. The most efficient class of practical optimisation schemes for high-dimensional problems are those based on Newton-type methods (e.g., Nocedal and Wright, 1999; also see Kavetski et al., 2006b for a recent application in hydrology). However, these methods are gradient-based and require the objective function to be smooth with respect to the model parameters. In addition, further enhancements are needed to deal with the potential multi-modality of the objective function.

Unfortunately, the majority of current CRR models are not sufficiently smooth at the micro-scale and have a complex macro-scale structure (Duan et al., 1992; Thyer et al., 1999). However, Kavetski et al. (2003, 2006a,b) showed that the micro-scale roughness is a numerical artefact due to physical or numerical thresholds in the model, whereas the complex macro-scale structure is often due to the use of poorly stable explicit time stepping schemes. Both problems can be remedied using numerical techniques that do not materially affect the conceptual physical basis of CRR models.

The log SPM model used in this study conforms to the guidelines recommended by Kavetski et al. as follows: (i) it uses smooth constitutive functions and does not contain internal thresholds; (ii) it employs exact solutions for the linear ODEs (6) and (7) describing the groundwater and stream water balances; and (iii) it uses an implicit Euler scheme with convergence to machine precision for the nonlinear soil water balance ODE (5). These model implementation techniques enable the use of computationally fast Newton-type methods to maximise the objective function (14). For example, the BATEA calibration to 71 storm events with 146 parameters in the following case study takes about 3 min of CPU time on a standard 2 GHz laptop processor.

Case study

The Abercrombie catchment is revisited to explore the hypothesis that storm-dependent parameters adequately describe input (rainfall) and model uncertainty.

The likelihood function p(Q̃|ω, θ_{1:n}, X̃, γ) assumes that the daily streamflow measurement errors are independently and normally distributed with zero mean and a standard deviation of 0.25 mm. This streamflow measurement error model was selected for convenience and suffices for the purpose of this case study. The key point is that the measurement error model is inferred independently of the BATEA calibration, typically by analysis of rating curve residuals. This assists the BATEA inference because it reduces the effects of one of the three sources of uncertainty.

Storm epochs were defined by inter-storm dry spells of two or more days, followed by rainfall exceeding a 0.5 mm/day threshold. In the two-year daily record, 71 such storm epochs were identified. The log SPM model was first calibrated using SLS, which yielded a NS statistic of 0.736 (the same as for the more complex Sacramento model). The parameter sF was highly correlated with ssfMax and rgeMax, and was therefore fixed at its SLS value in all subsequent runs involving BATEA. This avoids the confounding effects associated with strong parameter interaction (which are important in practice but lie beyond the scope of this paper). In all calibrations, the model parameters were log-transformed to reduce the parameterisation nonlinearity of the objective function and to ensure that the parameters remain positive.

The search for the mode of the posterior pdf (14) was implemented in three steps:

1. The CRR parameters most likely to be storm-dependent were identified using the Nash–Sutcliffe sensitivity plot (Fig. 5); these were k and rMult.
2. The posterior mode of the parameters β, ω and θ_{1:n} was estimated by a quasi-Newton optimisation scheme using the SLS estimates as initial values. The optimisation was based on the logarithm of the posterior pdf to reduce the nonlinearity of the problem.
3. Posterior diagnostics were evaluated to check the assumptions of the storm-dependent parameter models.

Whereas uniform noninformative priors were specified for the deterministic parameters ω, weakly informative priors were prescribed for the hyper-parameters β, since otherwise the posterior pdf becomes unbounded and hence ill-posed (Kavetski et al., 2006c). For log rMult, the prior on the hyper-mean was a normal distribution with zero mean and standard deviation of 0.1, whereas the prior on the hyper-variance was a scaled inverse-χ² distribution with a single

degree of freedom and a scale of 0.2 (which is equivalent to a CV of 20% on rMult); specifying a single degree of freedom ensures that the prior is proper but very diffuse. For the storm-dependent parameter log k, the prior on the hyper-mean was a Gaussian pdf with a mean of 3.88 (equal to the SLS value) and standard deviation of 0.5, whereas the prior on the hyper-variance was a scaled inverse-χ² distribution with a single degree of freedom and scale of 0.5. Table 2 summarises the hyper-parameters and prior distributions used in this case study.

Under these assumptions, the posterior pdf (14) can be expressed using the notation of Gelman et al. (1995, Table A.1) as

p(β, ω, θ_{1:n}|Q̃, X̃, γ) ∝ p(Q̃|ω, θ_{1:n}, X̃, γ) p(θ_{1:n}|β) p(β) p(ω)    (15)

where

p(Q̃|ω, θ_{1:n}, X̃, γ) = Π_{i=1..n} N(q̃_i | h(x̃_i·rMult_i, θ_i, ω), σ_γ²)    (15a)
p(θ_{1:n}|β) = Π_{i=1..n} N(θ_i | μ_θ, σ_θ²)    (15b)
p(β) = N(μ_θ | μ_p, σ_p²) Inv-χ²(σ_θ² | ν, s²)    (15c)
p(ω) = constant    (15d)

Here N(z|μ, σ²) is the joint pdf of the vector z whose ith component is independently distributed as a normal variate with mean μ_i and variance σ_i², while Inv-χ²(y|ν, s²) is the joint pdf of the vector y whose ith component is independently distributed as a scaled inverse-χ² variate with degrees of freedom ν_i and scale s_i. The streamflow sampling distribution is N(q̃_i | h(x̃_i·rMult_i, θ_i, ω), σ_γ²), where the expected streamflow is given by the log SPM model h(·) with input during the ith storm epoch equal to x̃_i·rMult_i, and the standard deviation of measurement error σ_γ is equal to 0.25 mm. The hyper-parameter vector β consists of the mean μ_θ and the variance σ_θ² of the storm-dependent parameters θ. The vectors μ_p and σ_p² are, respectively, the prior mean and variance of μ_θ, while the vectors ν and s are, respectively, the prior degrees of freedom and prior scale of σ_θ².

Table 2  Summary of distributions used in the application of BATEA in the Abercrombie River case study

Variable          Probability model            Prior distribution
k                 log k ~ N(μ_k, σ_k²)         μ_k ~ N(3.88, 0.5²), σ_k² ~ Inv-χ²(1, 0.5²)
sF                (a)                          Uniform
ssfMax            (a)                          Uniform
rgeMax            (a)                          Uniform
kBF               (a)                          Uniform
kStream           (a)                          Uniform
rMult             log rMult ~ N(μ_r, σ_r²)     μ_r ~ N(0, 0.1²), σ_r² ~ Inv-χ²(1, 0.2²)
Streamflow error  q̃ ~ N(q, 0.25²)

(a) These variables were assumed time-invariant and therefore no probability model was required.

Three BATEA runs using different combinations of storm-dependent parameters were undertaken:

BATEA Run 1: storm-dependent parameter k. According to Fig. 5, stochastic variation of parameter k is likely to account for the largest variation between simulated and observed daily runoff. Table 3 reports the NS statistic for the posterior modal fit, the posterior pdf at the mode and modal values for the deterministic parameters ω and storm-dependent hyper-parameters β. The posterior modal fit uses the values of ω and θ_{1:n} that maximise the posterior pdf (14); this gives the best possible fit to the observed data because the storm-dependent parameters θ_{1:n} have been optimised for each epoch. Table 3 shows that the NS statistic climbs from 0.736 for the SLS fit using deterministic parameters to 0.897 if k is treated as storm-dependent.

BATEA Run 2: storm-dependent parameter rMult. Fig. 5 also suggests that stochastic variation of parameter rMult is likely to have similar sensitivity to parameter k. Table 3 reports that the NS statistic climbs to 0.938.

BATEA Run 3: storm-dependent parameters k and rMult. In the third run, parameters k and rMult are made storm-dependent. Table 3 reports that the NS statistic climbs to 0.947. This suggests that the NS statistic is starting to plateau and that significant further improvement in the goodness-of-fit is unlikely. Fig. 8 presents a time series plot of observed daily runoff and simulated runoff with storm-dependent parameters taking their modal value for each storm epoch. Unlike the SLS fit in Fig. 3, the BATEA fit is excellent, with only small discrepancies at peaks and in recessions.

The change in the standard deviation of the hyper-distribution of k and rMult, depending on which parameters are treated stochastically, is also noted. The modal standard deviation of log k is 0.209 in run 1, whereas in run 3 it drops to 0.075.
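The Nash–Sutcliffe statistic reported for each of these runs measures the fraction of the observed runoff variance explained by the simulation; a minimal implementation (with illustrative data) is:

```python
import numpy as np

def nash_sutcliffe(q_obs, q_sim):
    """NS = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).
    NS = 1 for a perfect fit; NS = 0 for a model no better than the mean."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_sim = np.asarray(q_sim, dtype=float)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

q = [1.0, 3.0, 2.0, 0.5]
print(nash_sutcliffe(q, q))                 # perfect fit -> 1.0
print(nash_sutcliffe(q, [np.mean(q)] * 4))  # mean predictor -> 0.0
```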
In run 1, k was the only storm-dependent parameter and had to compensate for both input and model error. In run 3, rMult was allowed to vary between storms and dealt more directly with input errors, thus allowing k to focus on model error. Comparison of the deterministic model parameters across the three runs reveals that their modal values are also sensitive to the choice of storm-dependent parameter. Table 3 shows the results of the SLS calibration (assuming all model parameters, including k and rMult, are time-invariant). Comparison with BATEA run 3 reveals that the parameters have shifted markedly, suggesting that SLS calibration ignoring model and input error can lead to significant parameter bias. This finding extends the results of Kavetski et al. (2002), who demonstrated parameter bias in a synthetic example with corrupt inputs and no model error. However, in the absence of uncertainty measures on the parameters (and since the true parameter values are never known in a real-data study), the suspected bias in this case study cannot be confirmed. The differences between the SLS and BATEA modal estimates for the rainfall multiplier rMult are of interest: SLS infers a modal rMult of 1.20, whereas BATEA run 3 infers a modal value of 0.74. This marked difference in the

Table 3  Summary of BATEA calibration in the Abercrombie River case study

BATEA run   NS statistic   Log-pdf at the posterior mode   Values at posterior mode: mean (standard deviation, where storm-dependent)
1           0.897          196.5   loge k 2.788 (0.209); loge sF 7.746; loge ssfMax 1.351; loge rgeMax 1.276; loge kBF 9.171; loge kStream 0.181; loge rMult 0.107
2           0.938          4.35    loge k 2.060; loge sF 7.746; loge ssfMax 4.456; loge rgeMax 3.303; loge kBF 8.840; loge kStream 0.984; loge rMult 0.324 (0.271)
3           0.947          196.6   loge k 2.127 (0.075); loge sF 7.746; loge ssfMax 4.492; loge rgeMax 3.358; loge kBF 8.933; loge kStream 0.973; loge rMult 0.300 (0.272)
SLS         0.736          816.8   loge k 3.864; loge sF 7.746; loge ssfMax 0.559; loge rgeMax 1.721; loge kBF 10.18; loge kStream 0.747; loge rMult 0.185

Figure 8  Time series of observed and calibrated (posterior modal) runoff for the Abercrombie River obtained using BATEA with storm-dependent log SPM parameters k and rMult.

estimated rainfall error profoundly affects the log SPM simulation of the groundwater store depth h. Over the calibration period, the SLS simulation forces the groundwater store to increase by about 400 mm, while in the BATEA simulation the groundwater store declines by a modest 30 mm over the same two-year period. Because the log SPM model has no capability to lose water, the SLS simulation has to soak up the excess rainfall by storing it in the groundwater store. Such physically unreasonable behaviour of internal model variables is a potential artefact

Figure 9  Normal probability plots for the calibrated (modal) storm-dependent parameters k and rMult [symbols] and fitted Gaussian hyper-distributions [solid lines].
of inappropriate characterisations of input and model error in the SLS calibration.

Posterior diagnostics
While the results presented in Fig. 8 are encouraging (allowing storm-dependent variation in just two parameters significantly improved the fit), it is necessary to probe, using posterior diagnostics, the specific assumptions made in this BATEA analysis. The major assumption is that the storm-dependent parameters are independent realisations from a log-normal distribution. This assumption can be readily evaluated since there are 71 storm epochs and hence 71 realisations. Fig. 9 presents normal probability plots for log k and log rMult along with the fitted hyper-distributions, while Table 4 summarises the Kolmogorov–Smirnov statistics. In the case of log k, the underlying distribution is largely normal, with the two outliers being largely responsible for the failure to conform to a normal distribution (it is clear that the fitted hyper-distribution is significantly affected by the outliers). However, the distribution of log rMult seems more complicated: while the normal distribution provides a reasonable first approximation, there are clear systematic departures from normality in the lower tail and near the median. Nonetheless, visual inspection suggests that the log-normal hyper-distribution describes the storm-dependent variation of parameter rMult better than the variation in parameter k.
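The normal-probability and Kolmogorov–Smirnov checks applied to the 71 modal realisations can be sketched as follows. The sample below is synthetic (not the Abercrombie values), and note that when the Gaussian parameters are fitted from the same sample, the standard KS critical values are only approximate:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in for the 71 modal values of a storm-dependent log-parameter
log_theta = rng.normal(loc=-2.1, scale=0.2, size=71)

# KS statistic against the Gaussian hyper-distribution fitted to the sample
mu, sd = log_theta.mean(), log_theta.std(ddof=1)
ks_stat = stats.kstest(log_theta, "norm", args=(mu, sd)).statistic

# Coordinates of a normal probability plot (cf. Fig. 9): ordered sample values
# against the standard normal deviates z of their plotting positions
(z, ordered), (slope, intercept, r) = stats.probplot(log_theta, dist="norm")
print(round(ks_stat, 3), round(slope, 3))   # slope estimates the hyper-sd
```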

Table 4  Posterior diagnostics for BATEA run 3

Parameter   Statistic                 Value of statistic   5% significance value
k           Kolmogorov–Smirnov        0.195                0.105
k           Nonparametric runs test   1.94                 1.96
rMult       Kolmogorov–Smirnov        0.139                0.105
rMult       Nonparametric runs test   1.23                 1.96
Further insight can be gleaned from the time series plots of the storm-dependent parameters shown in Fig. 10, along with the nonparametric runs test statistics in Table 4. While the runs test statistics do not reject the hypothesis that the storm-dependent parameters are independent, inspection of the time series plots reveals that the outliers tend to cluster and that the model parameter values for consecutive storm events are often nearly identical. These second-order effects suggest that the definition of storm epochs requires further refinement. More generally, the optimal definition of the time scale at which the model parameters vary stochastically is unresolved and will be investigated in future work. For example, sampling once a day seems less attractive than sampling at the beginning of a storm, since the latter represents the commencement of a forcing event and thus sets a natural time scale for the system. In the spirit of CRR modelling, the statistical representation of model uncertainty should be parsimonious, favouring fewer latent variables to avoid over-parameterisation (over-fitting) and statistical ill-posedness. Finally, increasing the time resolution of the time-dependent parameters raises the dimensionality of the objective function (14) and yields a progressively more difficult computational problem.

The relationship between the input data and the associated latent variables is further explored in Fig. 11, which shows scatter plots of storm rainfall depth versus the storm-dependent parameters. While parameter k exhibits no significant relationship with storm depth, parameter rMult appears to exhibit a statistically significant relationship with storm depth; the p value on the linear trend slope parameter is 0.0013. However, this is a tenuous relationship: if the two largest storms (storms 2 and 45) were removed, there would be no significant relationship with storm depth. The evidence supporting a relationship between rMult and storm depth is therefore inconclusive.

Overall, the assumption that k and rMult are independently and log-normally distributed seems reasonable in the Abercrombie case study. However, there is a clear need to accommodate outliers using distributions more kurtotic than the Gaussian distribution.

Figure 10  Time series of the calibrated (modal) log SPM storm-dependent parameters: k and rMult.

Figure 11  Relationship between storm depths and calibrated storm-dependent parameters [fitted trends: log k, y = 3E-05x - 2.127, R² = 0.0004; log rMult, y = -0.0035x - 0.231, R² = 0.141].

Another assumption that can be tested concerns the likelihood function p(Q̃|ω, θ_{1:n}, X̃, γ). It is assumed that, after allowance has been made for storm-dependent parameters, the residuals (defined as the difference between observed runoff and runoff computed using the modal parameter values for each storm event) are independently and normally distributed with zero mean and a standard deviation of 0.25 mm. Fig. 12 presents a normal probability plot of the residuals. While the distribution of residuals is symmetric (with mean of 0.001 and standard deviation of 0.245 mm), its tails are considerably fatter than expected for a Gaussian

distribution. The generalisation of the normal model described by Box and Tiao (1973) and implemented in BaRE (Thiemann et al., 2001) could therefore be preferable. Finally, Fig. 13 shows the residual time series: while the autocorrelation is not significantly different from zero, the nonparametric runs test yields a statistic of 12.37, strongly rejecting the assumption of independence. However, given the small magnitude of the residuals, this is a relatively minor issue.
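A nonparametric runs test of the kind quoted above (Wald–Wolfowitz) counts sign runs about the median and standardises the count. A hedged sketch, with a synthetic series standing in for the actual residuals:

```python
import numpy as np

def runs_test_z(x):
    """Wald-Wolfowitz runs test on the signs of a series about its median:
    returns the standardised z statistic; |z| > 1.96 rejects independence
    at the 5% level."""
    s = np.asarray(x) > np.median(x)
    n1, n2 = s.sum(), (~s).sum()
    runs = 1 + np.sum(s[1:] != s[:-1])
    mean = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    return (runs - mean) / np.sqrt(var)

# A heavily clustered series (random walk) produces long runs and is rejected,
# whereas white noise typically is not
rng = np.random.default_rng(3)
wn = rng.normal(size=200)
ar = np.cumsum(wn)                 # random walk: signs cluster strongly
print(round(runs_test_z(wn), 2), round(runs_test_z(ar), 2))
```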

Figure 12  Posterior check of the response error model: normal probability plot of the log SPM model residuals (defined as the discrepancy between observed and calibrated runoff).

Figure 13  Time series of the residual errors of the log SPM simulation with modal parameter values estimated using BATEA.

Predictive uncertainty arising from model and input error


Stochastic variation in parameter rMult accounts, albeit crudely, for input uncertainty, whereas variation in parameter k represents the model error of the deterministic CRR model (which would be unable to match the observed runoff exactly even if the true forcing were known). The BATEA formalism enables the contribution of input and model uncertainty to be explicitly evaluated. Fig. 14 presents 90% prediction limits derived by Monte Carlo sampling from the hyper-distributions for the parameters k and rMult for the same two-year period used in the calibration. These limits apply if the structure of the input uncertainty (mean and variance of the hyper-distribution of rMult) is the same as in the calibration period. If accurate input data were available, the prediction limits would be derived from stochastic variation in k alone; these limits are shown in Fig. 15. A comparison of Figs. 14 and 15 suggests that stochastic variation in the

rainfall multiplier rMult dominates the overall predictive uncertainty. Conversely, the variation in the model responses due to parameter k is relatively minor, suggesting that in this case study the model error is dominated by the uncertainty in observed inputs.
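The Monte Carlo construction of such prediction limits can be sketched as follows. The linear store, the epoch definition and the hyper-distribution moments below are all illustrative stand-ins (the store is even restarted at each epoch for brevity), but the key step is faithful to the approach: fresh storm-dependent parameters are drawn for every epoch from the calibrated hyper-distributions, and quantiles are taken over replicates.

```python
import numpy as np

rng = np.random.default_rng(7)

def linear_store(rain, log_k, log_mult):
    """Illustrative linear store standing in for log SPM: rainfall is scaled
    by exp(log_mult) (the storm multiplier) and routed with rate exp(log_k)."""
    k = min(np.exp(log_k), 1.0)
    s, q = 0.0, np.empty(len(rain))
    for t, r in enumerate(rain):
        s += np.exp(log_mult) * r
        q[t] = k * s
        s -= q[t]
    return q

rain = rng.gamma(2.0, 3.0, size=120)
epochs = np.split(np.arange(120), 12)          # 12 equal "storm epochs"

n_rep = 1000
sims = np.empty((n_rep, len(rain)))
for j in range(n_rep):
    q = np.empty(len(rain))
    for ep in epochs:                          # fresh parameter draws per epoch
        log_k = rng.normal(-2.1, 0.075)        # illustrative hyper-moments
        log_mult = rng.normal(-0.30, 0.272)
        q[ep] = linear_store(rain[ep], log_k, log_mult)
    sims[j] = q

median = np.percentile(sims, 50, axis=0)
lo, hi = np.percentile(sims, [5, 95], axis=0)  # 90% prediction limits
```

Holding log_mult fixed at its hyper-mean while resampling only log_k reproduces the k-only limits of Fig. 15.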

Figure 14  Time series of the median and 90% prediction limits due to the uncertainty in storm-dependent parameters k and rMult identified using BATEA.

Figure 15  Time series of the median and 90% prediction limits due to the uncertainty in storm-dependent parameter k only, identified using BATEA.

A reflection on BATEA and GLUE

The astute reader would be interested in how BATEA differs from the GLUE formalism of Beven and Binley (1992). GLUE exposed the shortcomings of traditional statistical models, such as nonlinear regression and the Kalman state-space formulations, which assume that the errors are additive and Gaussian. However, these shortcomings are not deficiencies of the Bayesian paradigm itself, but are brought about by ignoring the ample evidence that traditional model assumptions are not supported by the data and by a failure to recognise that a robust CRR framework must not only describe hydrologic processes, but also characterise the inherent uncertainty in this description. This paper shows that such a framework can be built using formal Bayesian theory.

GLUE and BATEA share the recognition that model error is significant and difficult to characterise. However, the conceptual frameworks are fundamentally different. While BATEA is built on the error propagation framework shown in Fig. 1(a) and explicitly differentiates between input, response and model error, GLUE utilises parameter uncertainty to represent all sources of error. GLUE remains rooted to the deterministic CRR model, since each CRR time series is generated using time-invariant parameters sampled from the behavioural parameter set, whereas BATEA allows parameters to vary stochastically from storm to storm.

The exclusive focus on parameter uncertainty in GLUE creates conceptual difficulties in its derivation from a strict Bayesian perspective. For example, although GLUE uses Bayesian updating, its likelihood functions are not proper; indeed, they are often termed pseudo-likelihood functions in recognition that subjective goodness-of-fit criteria are used to construct the likelihood function and that the likelihood function becomes independent of the number of observations. In contrast, BATEA attempts to directly represent input, response and model error within the standard Bayesian framework, making all assumptions explicit and open to challenge. Seen in this light, BATEA includes some of the philosophical basis of GLUE (which abandons the notion of single true parameters) and seeks to improve on it by explicitly disaggregating input, model and response error using formal Bayesian strategies.

Conclusions
The characterisation of model error in CRR modelling has been thwarted by the convenient but indefensible assumption that CRR models are deterministic descriptions of catchment dynamics. Explicit acceptance that CRR models are fundamentally stochastic paves the way for a more rational characterisation of model error. This paper argues that the fluxes in CRR models are fundamentally stochastic because they involve spatial and temporal averaging. The challenge is to characterise this stochasticity in a way that is consistent with available evidence and is statistically and computationally tractable.

We proposed the hypothesis that the structural error of CRR models can be characterised by storm-dependent random variation of one or more CRR model parameters. A sensitivity analysis was designed to identify the parameters most likely to vary between storms. A Bayesian hierarchical model (BATEA) was then developed to explicitly differentiate between input, response and model error using storm-dependent parameters. The hypothesis that storm-dependent parameters are independent and log-normally distributed was evaluated in a case study. Posterior diagnostics showed that this hypothesis is reasonably consistent with the evidence, although the need to deal with outliers was recognised.

This study moves one step closer to a total error formalism that (i) enables a rational assessment of predictive uncertainty, (ii) allows rigorous testing of competing CRR model hypotheses, and (iii) removes parameter biases that can confound regionalisation of CRR parameters. Nonetheless, significant problems remain, most notably the optimal characterisation of the apparent stochasticity of CRR models and the identification of the time scale at which this stochasticity operates. The intuitive approach of storm-dependent parameters proposed in this study is in general agreement with the evidence, yet it may be possible to derive more rigorous stochastic formulations by investigating the mechanics of spatial and temporal averaging.

The computational issues of accommodating storm-dependent parameters are formidable because the dimensionality of the problem depends on the number of storms in the calibration time period. In this study, we overcame these difficulties by selecting a numerically smooth CRR model amenable to optimisation using computationally fast Newton-type methods.
We are presently working on methods that avoid the growth of the dimensionality of the problem and permit the use of more common numerically nonsmooth CRR models.


Acknowledgment
This work was partially funded by a grant from the Australian Research Council.

References
Beven, K.J., Binley, A.M., 1992. The future of distributed hydrological models: model calibration and uncertainty prediction. Hydrological Processes 6, 279–298.
Box, G.E.P., Tiao, G.C., 1973. Bayesian Inference in Statistical Analysis. Addison-Wesley, Boston, MA.
Bras, R.L., Rodriguez-Iturbe, I., 1985. Random Functions and Hydrology. Addison-Wesley.
Duan, Q., Sorooshian, S., Gupta, V.K., 1992. Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resources Research 28, 1015–1031.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 1995. Bayesian Data Analysis. Chapman and Hall.
Kavetski, D., Franks, S.W., Kuczera, G., 2002. Confronting input uncertainty in environmental modelling. In: Duan, Q., Gupta, H., Sorooshian, S., Rousseau, A., Turcotte, R. (Eds.), Calibration of Watershed Models. Water Science and Application Series 6. American Geophysical Union, Washington, DC, pp. 49–68.
Kavetski, D., Kuczera, G., Franks, S.W., 2003. Semi-distributed hydrological modelling: a saturation path perspective on TOPMODEL and VIC. Water Resources Research 39 (9), 1246–1253. doi:10.1029/2003WR00212.
Kavetski, D., Kuczera, G., Franks, S.W., 2006a. Calibration of conceptual hydrological models revisited: 1. Overcoming numerical artefacts. Journal of Hydrology 320 (1–2), 173–186 (The Model Parameter Estimation Experiment MOPEX, Sapporo, Japan; edited by Schaake, J., Duan, Q.).
Kavetski, D., Kuczera, G., Franks, S.W., 2006b. Calibration of conceptual hydrological models revisited: 2. Improving optimisation and analysis. Journal of Hydrology 320 (1–2), 187–201 (The Model Parameter Estimation Experiment MOPEX, Sapporo, Japan; edited by Schaake, J., Duan, Q.).
Kavetski, D., Kuczera, G., Franks, S.W., 2006c. Bayesian analysis of input uncertainty in hydrological modelling: I. Theory. Water Resources Research 42, W03407. doi:10.1029/2005WR004368.
Kavetski, D., Kuczera, G., Franks, S.W., 2006d. Bayesian analysis of input uncertainty in hydrological modelling: II. Application. Water Resources Research 42, W03408. doi:10.1029/2005WR004376.
Kuczera, G., 1990. Estimation of runoff-routing model parameters using incompatible storm data. Journal of Hydrology 114 (1–2), 47–60.
Kuczera, G., Franks, S.W., 2002. Testing hydrologic models: fortification or falsification? In: Singh, V.P., Frevert, D.K. (Eds.), Mathematical Modelling of Large Watershed Hydrology. Water Resources Publications, Littleton, CO.
Nocedal, J., Wright, S.J., 1999. Numerical Optimization. Springer-Verlag, New York.
Thiemann, M., Trosset, M., Gupta, H., Sorooshian, S., 2001. Bayesian recursive parameter estimation for hydrological models. Water Resources Research 37, 2521–2535.
Thyer, M., Kuczera, G., Bates, B.C., 1999. Probabilistic optimization for conceptual rainfall-runoff models: a comparison of the shuffled complex evolution and simulated annealing algorithms. Water Resources Research 35 (3), 767–773.
Vrugt, J.A., Diks, C.G.H., Gupta, H.V., Bouten, W., Verstraten, J.M., 2005. Improved treatment of uncertainty in hydrological modelling: combining the strengths of global optimization and data assimilation. Water Resources Research 41, W01017. doi:10.1029/2004WR003059.
