Giovanni Dalmasso(1,3), Jude Musuuza(1,4), Ben Langenberg(1), Sabine Attinger(1,5), and Luis Samaniego(1)

(1) Department Computational Hydrosystems, UFZ—Helmholtz Centre for Environmental Research, Leipzig, Germany
(2) Now at Hydros Consulting Inc., Boulder, Colorado, USA
(3) Now at Lysosomal Systems Biology, German Cancer Research Center (DKFZ), Heidelberg, Germany
(4) Now at Department of Civil Engineering, University of Bristol, Bristol, UK
(5) Institute of Geosciences, University of Jena, Jena, Germany

Key Points:
- Developed fully automated sequential screening method
- Saves up to 70% of model evaluations in subsequent model diagnostics
- Demonstrated for hydrologic model in three hydrologically unique catchments

Correspondence to: M. Cuntz, matthias.cuntz@ufz.de

Citation: Cuntz, M., et al. (2015), Computationally inexpensive identification of noninformative model parameters by sequential screening, Water Resour. Res., 51, 6417–6441, doi:10.1002/2015WR016907.

Received 8 JAN 2015; Accepted 17 JUL 2015; Accepted article online 21 JUL 2015; Published online 16 AUG 2015.

Abstract. Environmental models tend to require increasing computational time and resources as physical process descriptions are improved or new descriptions are incorporated. Many-query applications such as sensitivity analysis or model calibration usually require a large number of model evaluations, leading to high computational demand. This often limits the feasibility of rigorous analyses. Here we present a fully automated sequential screening method that selects only informative parameters for a given model output. The method requires a number of model evaluations that is approximately 10 times the number of model parameters. It was tested using the mesoscale hydrologic model mHM in three hydrologically unique European river catchments. It identified around 20 informative parameters out of 52, with different informative parameters in each catchment. The screening method was evaluated with subsequent analyses using all 52 as well as only the informative parameters. Subsequent Sobol's global sensitivity analysis led to almost identical results yet required 40% fewer model evaluations after screening. mHM was calibrated with all and with only the informative parameters in the three catchments. Model performances for daily discharge were equally high in both cases, with Nash-Sutcliffe efficiencies above 0.82. Calibration using only the informative parameters needed just one third of the number of model evaluations. The universality of the sequential screening method was demonstrated using several general test functions from the literature. We therefore recommend the use of the computationally inexpensive sequential screening method prior to rigorous analyses of complex environmental models.
1. Introduction
Modern environmental models incorporate multiple physical processes. They tend to grow continually in the level of process detail and hence in their complexity. Most process descriptions have an empirical basis or represent a process that is itself a composite of several other processes. The descriptions therefore rely heavily on their parameterizations. The parameters in these parameterizations are, however, only known within a given uncertainty or can be seen as "effective" parameters altogether, for example in gridded distributed models where the parameters are intended to compensate for subgrid variability.
Therefore, they need to be calibrated for the model to fit observations. Even with ever increasing computa-
tional resources, real-time computation times of environmental models rarely decrease, given, among other
reasons, that the greater complexity requires additional floating-point operations. This may be exacerbated by the fact that practitioners tend to increase the resolution of their models in order to minimize the scale mismatch between the observations and the model. Many-query applications such as sensitivity analysis, uncertainty quantification, and model calibration usually require a large number of model evaluations, leading to a high computational demand that often makes rigorous analyses impractical.
© 2015. American Geophysical Union. All Rights Reserved.

Environmental models have typically been developed for a specific purpose and are trimmed to a specific output. Surface hydrology models, for instance, were developed primarily for the prediction and interpretation of river discharge time series [Freeze and Harlan, 1969]. However, multiple fluxes and states predicted by environmental models are now being used as predictive variables in addition to the primary output. Hydrologic models, for example, are now also being used for drought analysis, which uses soil moisture as
primary variable [Samaniego et al., 2013; Sheffield et al., 2009]. The additional output variables of interest
will, however, be sensitive to other process parameterizations and parameters as compared to the original
model output. The output variables may even have competing demands on the same processes so that the
parameters differ depending on whether the model matches one output variable or the other. Different
hydraulic conductivities, for example, lead to faster or slower removal of water from the vadose zone and
thus lead to different soil moisture states. They also influence groundwater tables and hence river base
flows. So the model may want to increase the hydraulic conductivities to increase river discharge, which
may then lead to a dry bias in soil moisture, depending on the formulation of runoff. It might therefore be
necessary to repeat model assessments for specific output variables during application.
One kind of model assessment is sensitivity analysis [e.g., Saltelli et al., 2008; Göhler et al., 2013]. Sensitivity analyses quantify the change in model output for a given numerical alteration of the parameters [Liepmann
and Stephanopoulos, 1985; Saltelli et al., 1993; Razavi and Gupta, 2015]. They are essential tools during
model development [Saltelli et al., 2000]: if a model is very sensitive to one of its parameters then the pro-
cess description, which includes this parameter, might need revisiting. Otherwise the results may be highly
dependent on this one parameter and its proper estimation. Sensitivity analyses are also interesting for
model calibration: the model might have several parameters that, in general or in the specific application,
do not significantly influence the desired output variable. These parameters cannot be estimated reliably
during model calibration and may lead to equifinality, among other problems [Medlyn et al., 2005]. This does not mean that these parameters are unimportant to the model. For example, a parameter related to the snow parameterization of a hydrologic model might be irrelevant in semiarid regions. It cannot, therefore, be constrained with observations of river discharge in the semiarid Mediterranean. River discharge also integrates over the whole catchment and smooths parameter importances over space and time. Thus, it might be
quite difficult to estimate a parameter that is important only during certain times of the year or in certain
areas of the river catchment. For example, it might be that the same snow parameter is important only sea-
sonally in ephemeral catchments. Or that the parameter is only important in a certain region of the catch-
ment, for example in mountainous areas.
It is therefore a valuable endeavor to find frugal methods that select parameters, which are important for
the observable in question, at least at some point in time or in some part of the model domain, and which
are identifiable with the available data. Screening methods are used for this purpose [e.g., Morris, 1991;
Campolongo et al., 2000; Makler-Pick et al., 2011]. They are similar to sensitivity analyses in that they analyze whether model output is sensitive to numerical alterations of the model parameters. They do, however, only tell whether a parameter is sensitive or not and do not provide importance rankings as full sensitivity analyses do. It was shown, however [Campolongo et al., 2007], that the specific screening method of Morris [1991] gives indexes that are highly correlated with the sensitivity indexes of Sobol' [1993]. The same
authors also demonstrated that the primary aim of screening methods, i.e., identifying noninformative
parameters, can be achieved with fewer model evaluations than needed to perform a full sensitivity
analysis.
The objective of this study is to demonstrate the effectiveness, in terms of model evaluations, and additionally the benefit and universality for sensitivity analysis and model calibration, of a newly developed, computationally inexpensive, fully automated sequential screening method. The method is applied here to several general test functions and also to the mesoscale hydrologic model mHM [Samaniego et al., 2010] in three distinct European catchments. It should, however, be generally applicable to mathematical computer codes. We hence hypothesize that screening out noninformative parameters with the proposed sequential screening method leads to the same results in subsequent sensitivity analyses or model calibrations as the same analyses with all model parameters.
The final aim of our research in general is the development of a robust, parsimonious hydrologic model
that can be calibrated unambiguously, for example, against river discharge. The sensitivity analysis
should, therefore, reflect similar characteristics to the ones relevant during model calibration, which is the
case for variance-based sensitivity analyses such as Sobol’s global sensitivity analysis [Sobol’, 1993]
applied here. The screening method of Morris [Morris, 1991] is based on similar assumptions as the Sobol’
method and therefore also fits in the context of the current study. Indeed, the radial design of Saltelli
et al. [2012] allows the practitioner to perform the screening method of Morris first, which can then be
expanded later to a full Sobol’ global sensitivity analysis.
We want to screen out noninformative model parameters before conducting computationally expensive
model assessments such as sensitivity analysis or model calibration. The paper is organized as follows: section 2 first introduces the chosen screening method of Elementary Effects (section 2.1).
This study proposes an extension of Elementary Effects that reuses previously gained information in the
process of screening. The sequential screening algorithm was developed using several analytical test func-
tions (section 2.2) and is explained in detail in section 2.3. Screening algorithms were originally proposed
because of the high computational costs of full sensitivity analyses; the latter is explained in section 2.4. Sec-
tion 3 introduces the hydrologic model applied (section 3.1) and three distinct catchments in Europe where
the screenings, sensitivity analyses and model calibrations were performed (section 3.2). The results and dis-
cussions show first the reference Sobol’ sensitivity analysis (section 4.1) and in comparison the normal Ele-
mentary Effects (section 4.2) for the hydrologic model in one test catchment. The performance of the new
sequential screening algorithm is then examined on this real-world problem (sections 4.3 and 4.4). The
impacts of the sequential screening on the overall performances of sensitivity analyses and model calibra-
tions are finally evaluated in sections 4.5 and 4.6.
2. Methods
The global sensitivity and screening methods applied in this study are based on the analysis of model out-
put if the parameters are varied in their a priori ranges. Because full sensitivity analyses are computationally
expensive, frugal screening methods were proposed. They do not describe the full sensitivity information
but rather identify only noninformative parameters, which can be excluded from further analyses such as a
full sensitivity analysis or model calibration. A popular screening method, which goes hand in hand with Sobol's sensitivity analysis (see below), is the method of "Elementary Effects," also known as the "Morris method" [Morris, 1991]. It does not sample the whole parameter space but rather analyzes the output along several trajectories through the parameter space (cf. section 2.1). The essential steps of screening
with Elementary Effects are demonstrated here on analytical test functions (section 2.2). Screening methods
still need to analyze a considerable number of trajectories with appreciably redundant information. This
study therefore builds on the idea of the project ‘‘Managing uncertainty in complex models’’ [MUCM, 2010]
to reuse already gained information in consecutive screening trajectories. A computationally inexpensive, fully automated sequential screening method is developed here using the test functions (section 2.3). It is compared to and applied before the global sensitivity analysis of the "Sobol' method" [Sobol', 1993] (section 2.4). The practitioner of sensitivity analyses or screening methods has to decide on two things: (1) Environmental models often produce time series of model output. One has to decide, therefore, how to aggregate
the time series or time-dependent sensitivity indexes (section 2.5). (2) One has to decide on a feasible num-
ber of model evaluations given the available resources. Section 2.6 explains how the number of model eval-
uations used in the screening and sensitivity analyses were determined through convergence criteria, and
how errors on the screening and sensitivity indexes were estimated.
Typical values for the number of trajectories R in the literature are on the order of tens to hundreds. This means that the Elementary Effects method would need between 500 and 5000 model evaluations for a model with 50 parameters.
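This evaluation count follows from the R(P + 1) cost of trajectory sampling; a minimal sketch (the helper name is ours):

```python
def morris_cost(R, P):
    """Model evaluations needed for R Morris trajectories over P parameters.

    Each trajectory evaluates the model once at its base point and once
    after each of the P one-at-a-time parameter changes: R * (P + 1) runs.
    """
    return R * (P + 1)

# A 50-parameter model with R between 10 and 100 trajectories:
low, high = morris_cost(10, 50), morris_cost(100, 50)  # 510 and 5100 evaluations
```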
Figure 1. Normalized absolute Elementary Effects η of the test functions for the development of the proposed sequential screening algorithm: (a) Sobol's G function, (b)-(g) Saltelli's G function, (h) Bratley function, (i) Saltelli's B function, (j) Ishigami function, (k) Oakley and O'Hagan function, and (l) Morris function. Gray circles indicate noninformative model parameters while black circles are retained in subsequent analyses. The solid lines are the fitted logistic functions and the dotted lines indicate the points of largest curvature. The final thresholds chosen are shown by the dashed gray lines. Mean and standard deviation (error bars) of the bootstraps are given.
identified as potentially noninformative. The point of largest curvature x_j lies in the convex portion of the curve. It can hence be below zero if the concave part of the curve best fits the indexes (e.g., Figure 1c). This case can happen only if at most one or two parameters are noninformative while a considerably larger number of very sensitive parameters exist. It is therefore prudent and not very restrictive to take all parameters in this case. The point of largest curvature should be rather low if noninformative parameters exist. It can be significantly high in two cases: first, if there are only informative parameters, which means the offset of the logistic curve is high (e.g., Figure 1i), or second, if the indexes fit the quasilinear rather than the convex part of the curve (e.g., Figure 1g). In these cases we check if the value at the point of largest curvature, L(x_j), is below 0.2. If not, we do not select any noninformative parameters but keep all parameters for further analyses. The threshold is then set arbitrarily to the value of the lowest index but could be set to zero without any difference (dashed gray lines in Figure 1). The value 0.2 is the user's choice, depending on how much variability one wants to retain in further model assessments. The value chosen here already seems to be a rather conservative choice because it retains more than 95% of the total variance in our cases (see Results and Discussion).
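A numerical sketch of this thresholding step, assuming a standard four-parameter logistic and a simple grid search for the curvature maximum (SciPy is used for the fit; the synthetic indexes and all names are ours):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, a, b, c, d):
    """Four-parameter logistic with offset: L(x) = a / (1 + exp(-b (x - c))) + d."""
    return a / (1.0 + np.exp(-b * (x - c))) + d

def largest_curvature(params, x_lo, x_hi, n=10001):
    """Numerically locate the point of largest curvature on the convex branch
    (left of the inflection point c) of the fitted logistic."""
    a, b, c, d = params
    x = np.linspace(x_lo, x_hi, n)
    y = logistic(x, a, b, c, d)
    dy = np.gradient(y, x)
    d2y = np.gradient(dy, x)
    kappa = np.abs(d2y) / (1.0 + dy**2) ** 1.5   # curvature of the fitted curve
    i = np.argmax(np.where(x < c, kappa, -np.inf))
    return x[i], float(logistic(x[i], a, b, c, d))

# Synthetic sorted, normalized screening indexes eta in [0, 1] for 20 parameters
rank = np.arange(20.0)
rng = np.random.default_rng(1)
eta = np.sort(np.clip(logistic(rank, 0.9, 0.8, 12.0, 0.02)
                      + 0.01 * rng.standard_normal(20), 0.0, 1.0))

p, _ = curve_fit(logistic, rank, eta, p0=[1.0, 1.0, 10.0, 0.0], maxfev=10000)
xj, Lxj = largest_curvature(p, rank.min(), rank.max())
eta_thresh = Lxj if 0.0 < Lxj < 0.2 else float(eta.min())
```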
For the first step of parameter screening, the calculation of sensitivity or screening indexes, we build on an
idea of the project ‘‘Managing uncertainty in complex models’’ [MUCM, 2010], which was presented in
another form in case study 1 of the toolkit. The idea is: if a parameter has a large Elementary Effect EEi in
one trajectory it will most probably be influential. This means that one does not have to calculate another
Elementary Effect EEi for this parameter and it can be discarded from further trajectories.
We therefore propose the following algorithm to identify noninformative parameters:

1. Sample M1 trajectories with all parameters following the strategy of Campolongo et al. [2007] or Saltelli et al. [2012] and calculate μ*_i.
2. Calculate η_i, defined as η_i = μ*_i / μ*_max with μ*_max = max_i(μ*_i). It ranges between 0 and 1 and is sorted in ascending order.
3. Fit the logistic function with offset to the sorted η_i and determine the point of largest curvature x_j of the fitted function. If L(x_j) is between 0 and 0.2, then the threshold η_thresh is set to L(x_j); otherwise it is set to the lowest η_i, i.e., min_i(η_i).
4. All parameters p_i with η_i ≥ η_thresh (or μ*_i ≥ μ*_thresh = η_thresh · μ*_max) are defined as influential parameters and are discarded in the remaining algorithm.
5. One trajectory is sampled with only the remaining parameters. All parameters with absolute Elementary Effects |EE_i| above μ*_thresh are defined as influential and hence discarded.
6. Repeat step 5 until no additional absolute Elementary Effect |EE_i| above μ*_thresh is found.
7. Sample another M2 trajectories with the remaining parameters and calculate μ*_i. All parameters p_i with μ*_i below the threshold μ*_thresh are the final noninfluential parameters.
In this algorithm, the sampled trajectories become shorter and shorter, and therefore fewer model evaluations are necessary because previously acquired information is reused. The numbers of trajectories M1 and M2 can be chosen using one's own sense of acceptability; we chose M1 = 3 and M2 = 5 here.
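The seven steps above can be sketched as follows. This is a simplified stand-in: a one-at-a-time design replaces the trajectory designs of Campolongo et al. [2007] and Saltelli et al. [2012], a fixed `thresh` replaces the logistic-fit threshold μ*_thresh, and all function names are ours:

```python
import numpy as np

rng = np.random.default_rng(42)

def one_at_a_time_effects(func, free, p_total, delta=0.2):
    """Absolute Elementary Effects from one trajectory: perturb each still-free
    parameter once from a common random base point (len(free) + 1 model runs)."""
    base = rng.uniform(0.0, 1.0 - delta, size=p_total)
    f0 = func(base)
    ee = {}
    for i in free:
        x = base.copy()
        x[i] += delta
        ee[i] = abs(func(x) - f0) / delta
    return ee

def sequential_screening(func, p_total, thresh, m1=3, m2=5):
    """Sketch of steps 1-7; `thresh` stands in for mu*_thresh, which the paper
    derives from the logistic fit to the first m1 trajectories."""
    free = set(range(p_total))
    # Steps 1-4: m1 full trajectories; discard parameters with mu* >= thresh.
    runs = [one_at_a_time_effects(func, free, p_total) for _ in range(m1)]
    mustar = {i: np.mean([r[i] for r in runs]) for i in free}
    influential = {i for i in free if mustar[i] >= thresh}
    free -= influential
    # Steps 5-6: single reduced trajectories until no new |EE_i| exceeds thresh.
    while free:
        ee = one_at_a_time_effects(func, free, p_total)
        new = {i for i in free if ee[i] > thresh}
        if not new:
            break
        influential |= new
        free -= new
    # Step 7: m2 confirmation trajectories on the remaining parameters.
    runs = [one_at_a_time_effects(func, free, p_total) for _ in range(m2)]
    mustar = {i: np.mean([r[i] for r in runs]) for i in free}
    noninformative = {i for i in free if mustar[i] < thresh}
    return influential | (free - noninformative), noninformative

# Toy model with 4 parameters of which only the first two matter:
model = lambda x: 10.0 * x[0] + 5.0 * x[1] ** 2 + 0.001 * x[2]
informative, noninformative = sequential_screening(model, p_total=4, thresh=0.5)
```

On this toy model the loop discards parameters 0 and 1 as influential after the first batch and never perturbs them again, which is where the evaluation savings come from.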
The conventional Morris method samples all trajectories, performs all model runs, and then calculates the final indexes such as μ*. The proposed method samples M1 trajectories, performs the M1(P + 1) model runs, and calculates first indexes. Additional reduced trajectories are sampled according to the results of the first trajectories. Trajectories are, therefore, sampled one after the other, i.e., sequentially.
V_i is the variance in f solely due to the variability of the ith parameter p_i; V_ij is the variance if both parameters p_i and p_j are varied. The first-order model sensitivity to each parameter p_i is quantified with the first-order Sobol' index:

$$S_i = \frac{V_i}{V}, \qquad (3)$$
which quantifies the effect if only parameter p_i is varied while all other parameters are fixed. The total-order Sobol' index sums all component variances that include a contribution of parameter p_i:

$$S_{Ti} = \frac{1}{V}\left( V_i + \sum_{j \ne i}^{P} V_{ij} + \dots + V_{12 \dots P} \right) = 1 - \frac{V_{\sim i}}{V}, \qquad (4)$$

where V_{∼i} denotes the variance if all parameters are varied except p_i. The total-order Sobol' index S_Ti represents the main effect of parameter p_i and all its interactions with all other parameters. Both Sobol' indexes range from 0 to 1. The sum of all S_Ti can be greater than 1, while the sum of all S_i must be less than or equal to 1. A model with a sum of S_i close to 1 is known as an additive model with independent parameters, i.e., with little parameter interaction [Saltelli et al., 2005].
An algorithm was proposed to compute both the first-order and total-order Sobol' indexes with only N(P + 2) model evaluations [Saltelli, 2002]. The calculations used in this study are given in Appendix D, which also details drawbacks of the algorithm and how they were overcome in this study.
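A sketch of such an N(P + 2) scheme with the usual sample matrices A, B, and A_B^(i). We use Jansen-style estimators here as an illustration, not the exact formulas of Appendix D; all names are ours:

```python
import numpy as np

def sobol_indices(func, p, n=50000, seed=0):
    """First- and total-order Sobol' indexes from N(P + 2) model evaluations
    using the A/B/A_B^(i) sampling scheme (Jansen-style estimators)."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, p))
    B = rng.uniform(size=(n, p))
    fA, fB = func(A), func(B)
    V = np.var(np.concatenate([fA, fB]))
    Si, STi = np.empty(p), np.empty(p)
    for i in range(p):
        AB = A.copy()
        AB[:, i] = B[:, i]                   # A with column i taken from B
        fAB = func(AB)
        Si[i] = np.mean(fB * (fAB - fA)) / V           # first-order index
        STi[i] = 0.5 * np.mean((fA - fAB) ** 2) / V    # total-order (Jansen)
    return Si, STi

# Additive toy model: analytically S_1 = 0.8, S_2 = 0.2, and S_i = S_Ti
f = lambda X: 2.0 * X[:, 0] + X[:, 1]
Si, STi = sobol_indices(f, p=2)
```

For this additive model the first-order indexes sum to 1 and equal the total-order indexes, the behavior described after equation (4).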
$$S_i = \frac{1}{T}\sum_{t=1}^{T} S_i(t) = \frac{1}{T}\sum_{t=1}^{T} \frac{V_i(t)}{V(t)}, \qquad (5)$$

$$S_{Ti} = \frac{1}{T}\sum_{t=1}^{T} S_{Ti}(t) = 1 - \frac{1}{T}\sum_{t=1}^{T} \frac{V_{\sim i}(t)}{V(t)}, \qquad (6)$$
where T is the number of time steps. A model output might, however, only be interesting if it is related to a
large flux. For example, the sensitivity of a soil evaporation parameter is not very interesting for an atmos-
pheric model in winter when there is very little evaporation. One would then flux-weight the index for a
more balanced consideration of the parameter’s sensitivity. Sensitivity indexes should, in our opinion, be
model inherent and independent of observations. One could then use the mean modeled flux for the flux-
weighting. We propose here to use the total variance at each time step V(t) for weighting instead because
larger model fluxes mostly have larger variances so that V may be a good surrogate for the flux:
$$S_i^{w} = \frac{\sum_{t=1}^{T} V(t)\, S_i(t)}{\sum_{t=1}^{T} V(t)} = \frac{\sum_{t=1}^{T} V_i(t)}{\sum_{t=1}^{T} V(t)}, \qquad (7)$$

$$S_{Ti}^{w} = \frac{\sum_{t=1}^{T} V(t)\, S_{Ti}(t)}{\sum_{t=1}^{T} V(t)} = 1 - \frac{\sum_{t=1}^{T} V_{\sim i}(t)}{\sum_{t=1}^{T} V(t)}. \qquad (8)$$
These are closed forms for the Sobol' indexes that are easily interpretable. The weighting is very similar to averaging ratios in other circumstances. The different meanings of S_i and S_Ti as opposed to S_i^w and S_Ti^w can be compared, for example, to the mean relative humidity calculated as the average of the individual relative humidities or calculated as the ratio of the average vapor pressures. The second form is hence probably appropriate if one is interested in cumulative effects of the model outcome rather than in each individual time point.
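The difference between equations (5) and (7) can be seen on hypothetical per-time-step variances (the seasonal shapes below are illustrative assumptions, not model output):

```python
import numpy as np

# Hypothetical per-time-step variances over one year of daily output:
T = 365
t = np.arange(T)
V = 1.0 + 0.8 * np.cos(2 * np.pi * t / T) ** 2               # total variance V(t)
Vi = 0.3 * V * (0.5 + 0.5 * np.sin(2 * np.pi * t / T) ** 2)  # partial variance V_i(t)

Si_t = Vi / V                  # time-dependent first-order index S_i(t)
Si_mean = Si_t.mean()          # simple average, cf. equation (5)
Si_w = Vi.sum() / V.sum()      # variance-weighted average, cf. equation (7)
```

Here the weighted index is smaller than the simple average because the parameter contributes least when the total variance, the surrogate for the flux, is largest.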
Instead of averaging the sensitivity indexes, one can also calculate a single measure from the time series
output. One possible and common measure is the root mean square error (RMSE) between the individual
time series of each parameter set and an observed time series. S_i and S_Ti are then calculated from these scalar values [Rosero et al., 2010; Rosolem et al., 2012; Rakovec et al., 2014]. Including observations might complicate the interpretation of the sensitivity measures because an ℓ2-norm such as the RMSE is sensitive to outliers in the observations as well as to single sensitivity runs that deviate markedly from the observations. We therefore use the root mean square deviation (RMSD) to emphasize that the deviations are calculated from the mean model state and not from observations. We calculate the RMSD from the average time series of all parameter sets and denote the resulting Sobol' indexes S_i^RMSD and S_Ti^RMSD.
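A minimal numeric sketch of this RMSD measure, computed here on a synthetic ensemble rather than actual mHM runs:

```python
import numpy as np

# Synthetic ensemble of model runs: rows = parameter sets, columns = time steps
rng = np.random.default_rng(0)
runs = rng.normal(loc=5.0, scale=1.0, size=(100, 50))

# RMSD of each run from the ensemble-mean time series, not from observations
mean_series = runs.mean(axis=0)
rmsd = np.sqrt(((runs - mean_series) ** 2).mean(axis=1))
```

One scalar per parameter set results, from which S_i and S_Ti can then be estimated as for any other scalar model output.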
flow-duration curves or only high flows. This would ideally be considered already during the screening pro-
cess where the model output would be the hydrologic signature. Here the screening is performed on the
whole discharge time series with the different averaging possibilities presented in section 2.5. The model
performance in terms of NSE is then compared with and without prior screening.
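For reference, the Nash-Sutcliffe efficiency used for this comparison can be written compactly (the discharge series below is illustrative):

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 for a perfect fit, 0 for predicting the
    observed mean, negative when worse than the mean."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

q_obs = np.array([1.0, 3.0, 2.0, 5.0, 4.0])  # hypothetical daily discharge
```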
Figure 2. (a) and (b) Total-order sensitivity indexes STi (right ordinates, black lines) for parameter 19 (a) and parameter 28 (b), respectively, for the Neckar river basin at the Rockenau
gauging station. All lines are mean annual cycles for the years 1970–1980. Also given is the total variance of all sensitivity runs at each time point (left ordinates, gray lines). The gray lines
of mean total variances are the same in Figure 2a–2c. (c) Mean annual discharge for 10 years (right ordinate, black line) and mean total variance of all sensitivity runs at each time point
(left ordinate, gray line).
variance at that time point produced by all sensitivity runs. Both time series are very similar with high values
in winter and low values in summer. They both exhibit peak values during snow melt and rather uneventful
times during the turn of the year. Different parameter sets seem to lead to larger differences during times
of peak discharge. This means that the total simulation variance is a good surrogate for the total flux and can hence be employed to weight the time-dependent sensitivity indexes, i.e., to calculate S_i^w and S_Ti^w (equations (7) and (8)).
Three forms of calculated sensitivity measures are presented in Figure 3: simple averaging, weighted averaging, and taking the RMSD relative to the mean simulated time series (cf. section 2.5). The figure is a stacked bar chart in radial coordinates where the azimuth is the parameter number and the radius, i.e., bar height, is the sensitivity measure. The lower bar in each stack is S_i and the upper portion is S_Ti − S_i so that the total height gives S_Ti. Error bars from bootstrapping are given for S_Ti only. Each parameter has three stacks,
one for each of the three possible transformations from time series to a scalar sensitivity measure. The colored sections are for orientation only and indicate to which modules the parameters mainly belong, i.e., on which part of the water balance equation they primarily act (bold letters on top). The water balance equation is thereby dS/dt = P − E − Q, with water storage S, precipitation P, evapotranspiration E, and discharge Q. One parameter is related to canopy interception and there are eight parameters in the snow
Table 2. Model Evaluations Until Convergence of the Sobol' Method Before (All) and After the Proposed Sequential Screening (After Screening), and Elementary Effects(a)

                Sobol' (All)           Sobol' (After Screening)   Elementary Effects     Sequential Screening
Neckar          1000(52+2) = 54,000    900(22+2) = 21,600         1100(52+1) = 58,300    433 [401, 451]
Sava            900(52+2) = 48,600     900(22+2) = 21,600         1700(52+1) = 90,100    413 [393, 434]
Guadalquivir    1100(52+2) = 59,400    1000(19+2) = 21,000        1200(52+1) = 63,600    406 [369, 454]

(a) The last column is the number of model evaluations needed for the sequential screening method: the mean and, in brackets, the minimum and maximum of five repetitions of the method.
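The evaluation counts in Table 2 follow directly from the cost formulas N(P + 2) for the Sobol' method and R(P + 1) for the Elementary Effects; a quick check (function names ours):

```python
def sobol_cost(N, P):
    """Model evaluations for the Sobol' estimator used here: N(P + 2)."""
    return N * (P + 2)

def trajectory_cost(R, P):
    """Model evaluations for R Elementary-Effects trajectories: R(P + 1)."""
    return R * (P + 1)

# Neckar row of Table 2:
before = sobol_cost(1000, 52)        # 54,000 evaluations with all 52 parameters
after = sobol_cost(900, 22)          # 21,600 with the 22 retained parameters
morris = trajectory_cost(1100, 52)   # 58,300 for converged Elementary Effects
```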
Figure 4. (a) Stacked bar chart of mean Sobol' indexes (S_i) before (darker stacks) and after screening with the proposed sequential method (lighter stacks) for the Neckar river basin at gauging station Rockenau. The lower bars in the stacks are the first-order indexes S_i and the upper parts are S_Ti − S_i so that the total heights give the total-order indexes S_Ti. (b) μ* of the Elementary Effects (gray bars). The stars mark the parameters that would be retained with the sequential screening method. The time series were bootstrapped 1000 times. Mean and standard deviation (error bars) of the bootstraps are given for all indexes.
Table 3. Nash-Sutcliffe Efficiencies of mHM Versus Observational Discharge for the Calibration and Validation Periods With All Parameters (All) or the Reduced Set of Parameters After Screening (After Screening)(a)
[Columns: NSE (Calibration, Validation); Function Evaluations (Calibration)]
to potential evapotranspiration [Hargreaves and Samani, 1985] and parameter 19, the scale of the saturated
hydraulic conductivity [Cosby et al., 1984]. The third most sensitive parameter (36) multiplies percolation
from the deepest soil layer into groundwater. Two other notable sensitivities are to the two parameters in
the formulation of slow interflow (34 and 35). This means that multipliers that apply directly to the compo-
nents of the water balance are most sensitive, compared to parameters that act on variables which then
affect the water balance. The Neckar has long periods of little base flow and exhibits large peaks in dis-
charge after snow melt. Fast interflow is therefore less dominant so consequently the fast interflow compo-
nent shows little sensitivity while the more dominant slow interflow component is more sensitive.
Twenty-one of the 52 model parameters produce 99% of the sum of the total-order sensitivity indexes ($\sum_{i=1}^{P} S_{Ti}$), and 31 produce 99.9% of the sum. So there are at least 52 − 31 = 21 parameters that are noninformative during model calibration and show (almost) no sensitivities. Removing 21 parameters from the sensitivity calculations would reduce the required 54,000 model runs by approximately 21,000 (cf. Table 2).
Leaving out one third of all parameters in a calibration procedure of order O(N^2) saves more than half of the model evaluations. Therefore, calibrating the model with 21 instead of 52 unknown parameters leads to a reduction of approximately two thirds of the previously required number of model evaluations (see
section 4.7 and Table 3).
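A one-line check of the O(N^2) argument; the quadratic cost model is the text's scaling assumption, not the exact behavior of SCE:

```python
# Quadratic cost model for calibration, cost proportional to N^2:
remaining = (2.0 / 3.0) ** 2   # fraction of cost left after dropping one third of N
saving = 1.0 - remaining       # 5/9, i.e., more than half of the evaluations saved
```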
The calibration procedure SCE [Duan et al., 1992] needs approximately 16,000 model evaluations for all 52 parameters. A sensitivity analysis with 54,000 model evaluations would thus have been wasteful if it were performed only to identify noninformative parameters to be excluded from model calibration. Screening methods are used in these circumstances to find noninformative parameters before a full sensitivity analysis or model calibration is performed. Figure 4b shows the screening indexes "Elementary Effects", or more precisely μ* (gray bars), the mean of the absolute Elementary Effects [Campolongo et al., 2007], which is supposed to be a good proxy for the total-order sensitivity index S_Ti. The resemblance between S_Ti in Figure 4a and μ* in Figure 4b is striking. All visible sensitivities S_Ti of Figure 4a are captured by the Elementary Effects. The relative magnitudes are similar for the large sensitivities but different for the small ones. The Elementary Effects μ* that seem to be zero belong to the 21 parameters that produce less than 0.1% of the sum of the total Sobol' indexes.
Elementary Effects, however, cannot be taken absolutely because they do not sample the whole parameter space. A cumulative sum to determine the cutoff value for noninformative parameters is thus less appropriate. One instead tries to identify a group of parameters with low indexes μ* that is separated from the other parameters having larger μ* [cf. Saltelli et al., 2008]. There is only a small step in the Elementary Effects of mHM. The logistic function of the sequential screening method proposed here deals with this situation by fitting only the convex part to the indexes. Fitting the logistic function to the Elementary Effects and identifying the point of largest curvature retains 18 parameters and excludes 34 from further analysis in mHM; the former is similar to the 21 parameters that produce 99% of the sum of the total-order Sobol' indexes S_Ti.
However, screening methods are supposed to require fewer model evaluations than full sensitivity analyses. But the μ* converge only slowly with increasing number of trajectories. The Elementary Effects vary strongly, especially at the beginning of the trajectory series. mHM needed 1100(52+1) = 58,300 model evaluations before all Elementary Effects had converged in the Neckar basin (Table 2). If one uses the radial screening proposed by Saltelli et al. [2012] (which uses the matrices A and A_B^(i) from the Sobol' method explained in
Figure 5. The steps of the screening method for the Neckar: (a) the first iteration with M1 = 3 trajectories, which determines the fitting function (black solid line) and hence the threshold η_thresh (dashed gray line) and subsequently μ*_thresh. The circles are the normalized μ* of the three trajectories. Gray circles did not show enough sensitivity yet while black circles are marked as influential parameters. (b) The second iteration with exactly one reduced trajectory, i.e., with fewer parameters. The circles are the absolute Elementary Effects |EE| of all parameters, with black circles above the threshold μ*_thresh (gray dashed line). (c) The third iteration with no new influential parameters; symbols and lines as in Figure 5b. (d) The final iteration with M2 = 5 trajectories. The circles are the μ* of the five trajectories; no parameter had a μ* above μ*_thresh.
Appendix D), then one can also calculate Sobol’ indexes although it requires a large number of model
evaluations.
algorithm: M2 = 5 trajectories were sampled and the μ* (the average of the five absolute Elementary Effects |EE|) were compared to μ*_thresh. This yielded no further informative parameters, so that all 30 parameters of this last step are the noninformative parameters of the model in the Neckar catchment (Figure 5d), retaining 22 informative parameters. The algorithm assured that each noninformative parameter was sampled at least M1 + M2 + 1 = 9 times, giving it a chance to demonstrate its sensitivity. The method required (during five repetitions) approximately 410 model evaluations on average in the three catchments Neckar, Sava, and Guadalquivir (Table 2), and always kept 22 ± 2 informative parameters per catchment.
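The comparison step, μ* (the mean of the absolute Elementary Effects over trajectories) against μ*_thresh, can be sketched as follows. The Elementary Effect values and the threshold below are invented for illustration; the paper derives μ*_thresh from a fitted logistic function, whereas this sketch takes the threshold as given:

```python
import numpy as np

def mu_star(ee):
    """Measure mu*: mean of the absolute Elementary Effects over all
    trajectories. ee has shape (n_trajectories, n_parameters)."""
    return np.mean(np.abs(ee), axis=0)

def informative(ee, threshold):
    """Indices of parameters whose mu* exceeds the threshold mu*_thresh."""
    return np.where(mu_star(ee) > threshold)[0]

# Two parameters over three trajectories: only the first is clearly sensitive.
ee = np.array([[ 2.0, -0.1],
               [-1.8,  0.2],
               [ 2.2, -0.1]])
print(informative(ee, threshold=1.0))   # -> [0]
```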
The 22 parameters for the Neckar catchment can be seen in Figure 4b as Elementary Effects after 1100 trajectories (bars) and as indicated by the sequential screening (stars). For example, parameter 1, the water holding capacity of vegetation, was labeled noninfluential with μ* = 4.89, whereas parameter 3, a degree-day factor for snow melt, was retained for further analysis with μ* = 7.48. The threshold from the sequential method was μ*_thresh = 7.81, i.e., larger than 7.48. The threshold μ*_thresh comes, however, from only M1 = 3 trajectories, whereas the μ* of Figure 4b have converged after 1100 trajectories. μ*_thresh and the Elementary Effects μ* of Figure 4b are hence not 100% comparable by number but are very close indeed. The logistic function applied directly to the μ* of Figure 4b would yield μ*_thresh = 9.02 and would indicate 18 informative parameters. These 18 informative parameters are included in the 22 parameters selected by the sequential method. The sequential method, however, selects four additional parameters with Elementary Effects μ* of 7.48 (parameter 3), 8.80 (parameter 31), 4.93 (parameter 44), and 1.74 (parameter 45). The last value is very small among the converged Elementary Effects, but parameter 45 became (slightly) sensitive in one of the iterations of the sequential method. This demonstrates the conservative approach chosen in the sequential method.
The selection by Elementary Effects can be compared to selections by Sobol' indexes. A common approach to select parameters from a Sobol' sensitivity analysis is to select all parameters that make up a certain percentage of the cumulative sum of the Sobol' indexes, say 99.9% [Rosolem et al., 2012]. For example, the 22 largest S_Ti add up to 99.4% of the sum of all S_Ti in the Neckar basin, while one needs 31 parameters to reach 99.9%. The sequential screening method varied around 22 parameters during five repetitions, with two parameters more or less. So it always selected the parameters that make up at least 99.2% of the cumulative sum of S_Ti.
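The cumulative-sum criterion can be sketched in a few lines. The index values below are invented for illustration; only the selection rule follows the text:

```python
import numpy as np

def select_by_cumsum(st, frac=0.999):
    """Smallest set of parameters whose total-order Sobol' indexes S_Ti
    make up at least `frac` of the sum of all S_Ti."""
    st = np.asarray(st, dtype=float)
    order = np.argsort(st)[::-1]                 # largest index first
    csum = np.cumsum(st[order]) / np.sum(st)     # normalized cumulative sum
    n = int(np.searchsorted(csum, frac) + 1)     # first position reaching frac
    return order[:n]

# Toy example with six parameters, three of which dominate:
st = [0.5, 0.3, 0.15, 0.04, 0.005, 0.005]
print(select_by_cumsum(st, frac=0.9))   # -> [0 1 2]
```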
Fifty-two radial trajectories for the 52 parameters of mHM would require 2756 model evaluations. This is to be compared to the around 430 model evaluations needed by the sequential screening (Table 2), which is equivalent to only eight full trajectories. There are alternative screening methods that can use fewer trajectories than model parameters, such as supersaturated designs (reviewed in Campolongo et al. [2000]), but they operate mostly at very few factor levels and require strong assumptions about model behavior, such as monotonicity [Saltelli et al., 2012].
Convergence is more important for sensitivity indexes than for screening methods. Sobol’ indexes give
quantitative measures of model sensitivities. They can, for example, be tracked in time to identify deficient
process descriptions in a model [Guse et al., 2014]. The Sobol' indexes should therefore more or less have converged before conclusions are drawn. The total-order Sobol' indexes S_Ti are calculated here with the formula of Jansen [1999]. This formula converges very quickly for small indexes; it is the large indexes that converge slowly. The total-order Sobol' indexes S_Ti converged after N = 1000 (i.e., 54,000 model evaluations), with the absolute difference between two consecutive error estimates of less than 0.1%. If we relaxed this criterion to 1%, the indexes had converged after N = 400, i.e., 21,600 model evaluations.
Saltelli et al. [2012] proposed a radial screening that can be used for screening with Elementary Effects first and can later be enhanced to a full Sobol' analysis. In this method, the screening reduces the number of parameters for further model calibration but it does not reduce the number of model evaluations for the sensitivity analysis. We propose here to perform the sequential screening method first, before any further analysis, be it sensitivity analysis or model calibration. It is shown in the next sections that one does not lose information content in the further analyses by using the sequential screening method.
Let’s assume for mHM, one would take 52 trajectories with radial screening, which means 2756 model eval-
uations. A model calibration would add about 5500 model evaluations after screening (Table 3), i.e., alto-
gether about 8250 model evaluations. N 5 400 for the sensitivity analysis would yield 21,600 model
evaluations but this includes the 2756 of the radial screening. In comparison, the sequential screening
method needs about 430 model evaluations. Model calibration would then also add about 5500, leading to
approximately 6000 model evaluations. But N 5 400 for the sensitivity analysis would then be only 9600
evaluations. Savings by the sequential screening method are therefore at least between 20% and 50%
depending on the analysis.
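The bookkeeping above is simple arithmetic. The evaluation counts are taken from the text; the (p + 2) · N cost of the Sobol' design follows from the p matrices A_B^(i) plus A and B:

```python
# Model evaluation budgets from the text (approximate counts).
p_all, p_kept = 52, 22            # parameters before/after screening
radial = p_all * (p_all + 1)      # 52 radial trajectories of 53 runs each
sequential = 430                  # sequential screening (Table 2)
calibration = 5500                # SCE calibration after screening (Table 3)

N = 400                           # base sample of the Sobol' analysis
sobol_all = (p_all + 2) * N       # Sobol' analysis with all parameters
sobol_kept = (p_kept + 2) * N     # Sobol' analysis after screening

print(radial, sobol_all, sobol_kept)            # 2756 21600 9600
print(radial + calibration)                     # about 8250 (radial + SCE)
print(sequential + calibration)                 # about 6000 (sequential + SCE)
```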
the chosen method for S_i (equation (D1)) by taking all available information into account when calculating the total variance. Differences between the formulas show up mostly for small parameter sensitivities. Large indexes converge similarly for all formulations. The S_i also have the tendency to be underestimated with a small number of evaluations (not always though, see next section) and then rise slowly to their final values. So part of the smaller S_i after screening comes from not fully converged indexes. Practitioners must decide what information they want to obtain from the sensitivity analysis and whether or not it is acceptable that the first-order indexes have not fully converged. In our case, we were most interested in the total-order indexes S_Ti for mHM because this information is most relevant for model calibration. We traced the development of the S_i and they had already reached two e-foldings toward convergence, which we assumed sufficient. We hypothesize, though, that the first-order and total-order Sobol' indexes should result in nearly the same values, within their uncertainties, before and after screening with a very large number of model evaluations.
Figure 6. Sobol' indexes before (darker stacks) and after screening with the proposed sequential method (lighter stacks) for (a) Neckar, Germany, (b) Sava, Slovenia, and (c) Guadalquivir, Spain. (d) Map of Europe with the shaded catchments of Figures 6a–6c. Figure 6a is the same as Figure 4a, with the same layout for Figures 6b and 6c. The time series were bootstrapped 1000 times. Mean and standard deviation (error bars) of the bootstraps are given for all indexes.
different sets of noninformative model parameters lead to equifinal results while informative parameters are in general rather unique. Equifinality also occurs with correlated parameters or with an objective function that is incompatible with the prior screening. There are correlated parameters in mHM, but they are rather minor, which is corroborated by the fact that the model is almost additive. Screening and optimization should ideally use the same output measure. If a practitioner is only interested in high flows, then the model output for screening would use only high flows and the objective function in the model calibration would also emphasize high flows. Equifinality can arise if screening and parameter estimation use inconsistent objectives, which is not the case here.
Figure 7. Discharge time series observed (blue circles) and modeled with all optimized parameters (Q_mHM^all, solid gray lines) and with only the reduced parameter set optimized (Q_mHM^screening, dotted gray lines) for (a) Neckar, Germany, (b) Sava, Slovenia, and (c) Guadalquivir, Spain, respectively. The modeling periods are the calibration periods of Table 1, and the given Nash-Sutcliffe efficiencies (NSE) are also for these time spans. Shown are 2 years out of the calibration period, starting in summer of the fifth year.
The SCE algorithm needed roughly a third of the previously required model evaluations for the reduced parameter sets while still maintaining high Nash-Sutcliffe efficiencies (NSE) (Table 3). The differences between the discharge time series cannot be spotted by eye in Figure 7 and are well within the predictive uncertainty of the model output (results not shown).
The hydrologic model mHM shows almost no performance loss between the calibration and validation periods for Neckar. Some fidelity is lost in Sava between calibration and validation, but the loss is minor in magnitude. NSE is very high in all three catchments, with the lowest NSE of 0.82 for Sava during validation. This is comparable to other hydrologic models [e.g., te Linde et al., 2008; Alcamo et al., 2003] and certainly sufficient for further model studies such as flood or drought forecasts [Samaniego et al., 2013].
Nevertheless, NSE stays nearly the same before and after screening during calibration as well as during validation in Sava. NSE is even higher after screening for Neckar in the validation period compared to the calibration period. This demonstrates the robustness of the proposed screening method in the three distinct catchments. Note that this analysis was not possible for Guadalquivir because the short data series meant that all data were used for calibration of the model.
5. Conclusions
We performed a global Sobol' sensitivity analysis of the hydrologic model mHM in three European river basins in distinct climatic regions. The analysis revealed that mHM has a very different set of sensitive parameters in the arid catchment in Spain (Guadalquivir) compared to the two humid catchments in Central Europe. Soil parameters became much more influential in the arid catchment, with the shape of the infiltration curve as the most important parameter.
We proposed a computationally inexpensive, fully automated sequential screening method to filter out the noninformative parameters before further diagnostics of a computational model, reducing the overall computational load. It has been demonstrated to work for test functions from the literature as well as for mHM in three distinct European catchments. The method should, however, be applicable to any kind of mathematical model. The sequential screening method always identified about 20 (out of 52) informative parameters in mHM in all three catchments, although different parameters were identified for each catchment. The noninformative parameters that were excluded contributed less than 1% to the sum of the total-order Sobol' indexes. The method thus seems to be rather conservative in excluding model parameters from further analysis. The method identifies as sensitive any parameter whose Elementary Effect exceeds 0.2 of the maximum Elementary Effect, which reflects current usage of Elementary Effects in the literature. The user should, however, check the screening results and adapt the threshold to their own sense of acceptability. Also, the low number of initial trajectories M1 = 3 might fail if there are oversensitive model parameters. This is, however, a problem in all screening and sensitivity analyses and should be rather evident from the final sequential screening results.
The sensitivity analysis was repeated with only the set of informative parameters, which yielded nearly the same results as using the complete parameter set while requiring approximately 60% fewer model evaluations. The majority of the differences could be attributed to the extent of convergence of the sensitivity indexes.
Calibrating mHM with all 52 transfer parameters needed about 3 times as many model evaluations as calibrating without the noninformative model parameters. The latter performed equally well compared to the full calibration in all three catchments. It was argued that calibration with the noninformative parameters can lead to equifinality in these noninformative parameters.
In summary, we recommend first applying the proposed automated sequential screening method to identify the informative model parameters for a given output. Further model diagnostics and model calibration gain robustness and efficiency when using only the informative parameters.
A1. G Function
The G function was introduced by Sobol' [1993]:

G(x) = \prod_{i=1}^{6} \frac{|4 x_i - 2| + a_i}{1 + a_i} .   (A1)

We used the coefficients a = [78, 12, 0.5, 2, 97, 33] from Saltelli et al. [2008, Table 3.1, p. 124]. The parameters were randomly sampled with x_i ~ U[0, 1] ∀i.
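As a minimal sketch, equation (A1) with the coefficients above translates directly into code:

```python
import numpy as np

# Coefficients from Saltelli et al. [2008]: small a_i means high sensitivity.
A_COEF = np.array([78.0, 12.0, 0.5, 2.0, 97.0, 33.0])

def g_function(x, a=A_COEF):
    """Sobol' G function, equation (A1)."""
    x = np.asarray(x, dtype=float)
    return np.prod((np.abs(4.0 * x - 2.0) + a) / (1.0 + a))

# Each factor equals 1 where |4 x_i - 2| = 1, e.g. at x_i = 0.75:
print(g_function([0.75] * 6))   # -> 1.0
```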
A2. G* Function
Saltelli et al. [2010] introduced an extension of the G function, which can be shifted and curved:

G^*(x) = \prod_{i=1}^{10} \frac{(1 + \alpha_i) \, |2 (x_i + \delta_i - I[x_i + \delta_i]) - 1|^{\alpha_i} + a_i}{1 + a_i} ,   (A2)

where I[.] denotes the integer part. G^* becomes the G function with \alpha_i = 1 and \delta_i = 0 ∀i. We used the following six sets of coefficients from Table 5 of Saltelli et al. [2010]:
with n Gaussian distributed coefficients ξ_i ~ N[0, σ_{ξ_i}] and the randomly sampled parameters x_i ~ N[0, σ_{x_i}] ∀i, where the tails were cut at the 5% and 95% quantiles. The variances were set after Saltelli et al. [2010]: σ_ξ = [0.7, 1.3, 1.4, 0.6, 0.95] and σ_x = [1.0, 1.1, 0.9, 1.2, 0.8].
f(x) = \sin(x_1) + a_1 \sin^2(x_2) + a_2 x_3^4 \sin(x_1) .   (A5)

Coefficients were set to a_1 = 0.5 and a_2 = 2.0, and the parameters were randomly sampled x_i ~ U[-\pi, \pi] ∀i.
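A direct transcription of equation (A5), the Ishigami function, with the coefficients above:

```python
import math

def ishigami(x1, x2, x3, a1=0.5, a2=2.0):
    """Ishigami function, equation (A5), with the coefficients of the text."""
    return math.sin(x1) + a1 * math.sin(x2) ** 2 + a2 * x3 ** 4 * math.sin(x1)

print(ishigami(math.pi / 2, 0.0, 0.0))   # -> 1.0 (only the sin(x1) term is nonzero)
```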
with coefficient matrices a_i and M given in Oakley and O'Hagan [2004] and Saltelli et al. [2008, Tables 3.5 and 3.6, pp. 129f]. They can be downloaded from http://www.jeremy-oakley.staff.shef.ac.uk/psa_example.txt.
+ \sum_{i<j<k<l}^{20} b_{ijkl} w_i w_j w_k w_l ,   (A7)

where w_i = 2 x_i - 1, except for i = 3, 5, and 7, where w_i = 2.2 x_i / (x_i + 0.1) - 1. The coefficients are: b_i = 20 for i = 1, ..., 10; b_{ij} = -15 for i, j = 1, ..., 6; b_{ijk} = -10 for i, j, k = 1, ..., 5; and b_{ijkl} = 5 for i, j, k, l = 1, ..., 4. The remaining first- and second-order coefficients are sampled from a standard normal distribution N[0, 1], and the remaining third- and fourth-order coefficients are set to zero. The parameters x_i are randomly sampled x_i ~ U[0, 1] ∀i.
L''(x) = \frac{-k^2 A \sinh(k [x - x_0])}{2 \{\cosh(k [x - x_0]) + 1\}^2} .   (B3)
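Equation (B3) is the second derivative of a logistic function of the form L(x) = A / (1 + exp(-k [x - x0])); a quick numerical check against a central finite difference confirms the closed form (the values of A, k, and x0 are made up for the check):

```python
import math

A, k, x0 = 1.0, 3.0, 0.2   # made-up logistic parameters for the check

def L(x):
    """Logistic function L(x) = A / (1 + exp(-k [x - x0]))."""
    return A / (1.0 + math.exp(-k * (x - x0)))

def L2(x):
    """Second derivative of L, equation (B3)."""
    u = k * (x - x0)
    return -k**2 * A * math.sinh(u) / (2.0 * (math.cosh(u) + 1.0) ** 2)

# Central finite difference of L should agree with the closed form:
h, x = 1e-4, 0.7
fd = (L(x + h) - 2.0 * L(x) + L(x - h)) / h**2
print(abs(fd - L2(x)) < 1e-5)   # -> True
```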
where (B)_j denotes the jth row of matrix B and f((B)_j) is the model output calculated with this parameter set.
The evaluations of the computational algorithms further showed that S_i became even more stable at the beginning of the parameter sample series if the total variance V was calculated from both sampling matrixes A and B, i.e., V([f(A), f(B)]), instead of just A as given in Saltelli et al. [2010]. The same evaluations exposed that S_Ti converged very quickly for noninformative parameters with the formulation of Jansen [1999] (i.e., (f) in Table 2 of Saltelli et al. [2010]):

S_{Ti} = \frac{1}{V} \frac{1}{2N} \sum_{j=1}^{N} \left[ f\left( (A_B^{(i)})_j \right) - f\left( (A)_j \right) \right]^2 .   (D2)
V is calculated in this case only from A, though, i.e., V(f(A)). The above equations (D1) and (D2) were chosen because of the fast convergence of the noninformative model parameters. However, there were no significant differences in the convergence of informative parameters between the different formulations of Saltelli et al. [2010], for both S_i and S_Ti.
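A minimal sketch of the Jansen estimator of equation (D2) on a toy additive model; the model, sample size, and random seed are invented for illustration, and V is taken from f(A) only, as in the text:

```python
import numpy as np

def total_order_jansen(model, A, B):
    """S_Ti by Jansen [1999], equation (D2):
    S_Ti = sum_j (f((A_B^(i))_j) - f((A)_j))^2 / (2 N V)."""
    N, p = A.shape
    fA = model(A)
    V = np.var(fA)                 # total variance from f(A) only
    st = np.empty(p)
    for i in range(p):
        AB = A.copy()
        AB[:, i] = B[:, i]         # ith column taken from B
        st[i] = np.sum((model(AB) - fA) ** 2) / (2.0 * N * V)
    return st

# Toy additive model f(x) = x1 + 2 x2 with x ~ U[0, 1]:
# exact total-order indexes are S_T = [1/5, 4/5].
rng = np.random.default_rng(1)
A = rng.random((20000, 2))
B = rng.random((20000, 2))
st = total_order_jansen(lambda x: x[:, 0] + 2.0 * x[:, 1], A, B)
print(np.round(st, 2))
```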
It was also argued that the sampling of the parameter sets should rather use quasirandom sequences
instead of random Monte-Carlo or stratified sampling such as Latin Hypercube Sampling [Saltelli et al.,
2010]. However, we realized that quasirandom sequences such as Sobol' sequences [Sobol', 1976] can indicate no sensitivity for purely mathematical reasons. In detail, the algorithm of Saltelli [2002] is based on two sample matrixes A and B. Additional matrixes A_B^(i) are built where the ith column of B is inserted in A. This gives identical parameter sets in A and A_B^(i) except for the ith parameter, which comes from B. A quasirandom sequence chooses parameters in a deterministic way, for example by cycling through the normalized parameter ranges with rational numbers. One point in the sequence can therefore choose, for example, 0.75 from the normalized parameter range of parameter p_i. Another point in the sequence might also choose 0.75 for parameter p_i but with very different values for the rest of the sequence. Parameter p_i is transferred from B to A_B^(i) and may therefore lead to identical parameter sets in the two matrixes. This would indicate to the algorithm that there is no output change with another parameter set and hence no sensitivity to parameter p_i. So identical numbers in the quasirandom number sequences should be avoided. Generating 2000 parameter sets with 20 parameters gives 268 identical parameter sets in A and A_B^(i). Skipping the first 30,000 elements of the Sobol' sequence reduces the number of identical parameter sets to 2. We could verify that the same reduction of identical sets occurs for at least 100 parameters when skipping the first 30,000 elements. In this study, we therefore always skip the first 30,000 elements of the Sobol' sequence to reduce the number of identical parameter sets.
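The degeneracy can be checked directly: a row of A_B^(i) is identical to the corresponding row of A exactly when the ith entries of that row in A and B coincide. A small helper with toy matrices (not an actual Sobol' sequence) counts such collisions:

```python
import numpy as np

def identical_sets(A, B):
    """Count rows of the A_B^(i) matrices that are identical to the
    corresponding row of A, i.e., entries where A[j, i] == B[j, i]."""
    return int(np.sum(A == B))

# Toy matrices sharing two entries, mimicking repeated quasirandom values:
A = np.array([[0.75, 0.25],
              [0.50, 0.10],
              [0.30, 0.90]])
B = np.array([[0.75, 0.60],
              [0.20, 0.10],
              [0.40, 0.80]])
print(identical_sets(A, B))   # -> 2 (row 0 in A_B^(1), row 1 in A_B^(2))
```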
Acknowledgments
Funding was provided to M. Cuntz by the Deutsche Forschungsgemeinschaft DFG (CU 173/2-1; WE 2681/6-1). The authors would like to acknowledge the E-OBS data set from the EU-FP6 project ENSEMBLES (http://ensembles-eu.metoffice.com) and the data providers of the ECA&D project (http://www.ecad.eu), the Global Runoff Data Centre (GRDC), the European Environment Agency (EEA), the NASA Land Processes Distributed Active Archive Center (LP DAAC), and other agencies and governmental organizations for providing the data employed in this study. The study is a contribution to the Helmholtz-Association climate initiative REKLIM.

References
Alcamo, J., P. Doll, T. Henrichs, F. Kaspar, B. Lehner, T. Rosch, and S. Siebert (2003), Development and testing of the WaterGAP 2 global model of water use and availability, Hydrol. Sci. J., 48(3), 317–337.
Behrangi, A., B. Khakbaz, J. A. Vrugt, Q. Duan, and S. Sorooshian (2008), Comment on "Dynamically dimensioned search algorithm for computationally efficient watershed model calibration" by Bryan A. Tolson and Christine A. Shoemaker, Water Resour. Res., 44, W12603, doi:10.1029/2007WR006429.
Bergström, S., and A. Forsman (1973), Development of a conceptual deterministic rainfall-runoff model, Nord. Hydrol., 4, 147–170.
Bratley, P., B. L. Fox, and H. Niederreiter (1992), Implementation and tests of low-discrepancy sequences, ACM Trans. Model. Comput. Simul., 2(3), 195–213.
Campolongo, F., J. Kleijnen, and T. H. Andres (2000), Screening methods, in Sensitivity Analysis, Wiley Ser. Probab. Stat., edited by A. Saltelli, K. Chan, and M. Scott, pp. 65–80, John Wiley, N. Y.
Campolongo, F., J. Cariboni, and A. Saltelli (2007), An effective screening design for sensitivity analysis of large models, Environ. Model. Softw., 22(10), 1509–1518.
Cariboni, J., D. Gatelli, R. Liska, and A. Saltelli (2007), The role of sensitivity analysis in ecological modelling, Ecol. Modell., 203(1–2), 167–182.
CEC (1993), CORINE Land Cover Technical Guide, Eur. Union, Dir.-Gen. Environ., Report EUR 12585EN, Nucl. Safety and Civ. Prot., Off. for Off. Publ. of the Eur. Comm., Luxembourg.
Cosby, B. J., G. M. Hornberger, R. B. Clapp, and T. R. Ginn (1984), A statistical exploration of the relationships of soil-moisture characteristics to the physical-properties of soils, Water Resour. Res., 20(6), 682–690.
Duan, Q. Y., S. Sorooshian, and H. V. Gupta (1992), Effective and efficient global optimization for conceptual rainfall-runoff models, Water
Resour. Res., 28(4), 1015–1031.
Evin, G., D. Kavetski, M. Thyer, and G. Kuczera (2013), Pitfalls and improvements in the joint inference of heteroscedasticity and autocorrela-
tion in hydrological model calibration, Water Resour. Res., 49, 4518–4524, doi:10.1002/wrcr.20284.
FAO IIASA ISRIC ISSCAS JRC (2012), Harmonized World Soil Database (version 1.2), FAO, Rome, Italy and IIASA, Laxenburg, Austria.
Farr, T. G., et al. (2007), The shuttle radar topography mission, Rev. Geophys., 45, RG2004, doi:10.1029/2005RG000183.
Freeze, R. A., and R. L. Harlan (1969), Blueprint for a physically-based, digitally-simulated hydrologic response model, J. Hydrol., 9, 237–258.
Göhler, M., J. Mai, and M. Cuntz (2013), Use of eigendecomposition in a parameter sensitivity analysis of the Community Land Model, J. Geophys. Res. Biogeosci., 118, 904–921, doi:10.1002/jgrg.20072.
Gupta, H. V., H. Kling, K. K. Yilmaz, and G. F. Martinez (2009), Decomposition of the mean squared error and NSE performance criteria: Impli-
cations for improving hydrological modelling, J. Hydrol., 377, 80–91.
Guse, B., D. E. Reusser, and N. Fohrer (2014), How to improve the representation of hydrological processes in SWAT for a lowland catch-
ment: Temporal analysis of parameter sensitivity and model performance, Hydrol. Processes, 28, 2651–2670.
Hargreaves, G. H., and Z. A. Samani (1985), Reference crop evapotranspiration from temperature, Appl. Eng. Agric., 1(2), 96–99.
Haylock, M. R., N. Hofstra, A. M. G. Klein Tank, E. J. Klok, P. D. Jones, and M. New (2008), A European daily high-resolution gridded data set
of surface temperature and precipitation for 1950–2006, J. Geophys. Res., 113, D20119, doi:10.1029/2008JD010201.
Herman, J. D., J. B. Kollat, P. M. Reed, and T. Wagener (2013), From maps to movies: High-resolution time-varying sensitivity analysis for spa-
tially distributed watershed models, Hydrol. Earth Syst. Sci., 17, 5109–5125.
Ishigami, T., and T. Homma (1990), An importance quantification technique in uncertainty analysis for computer models, in Proceedings of the First International Symposium on Uncertainty Modeling and Analysis (ISUMA '90), December 3–5, 1990, pp. 398–403, IEEE, Univ. of Maryland, College Park, Md.
Jansen, M. J. W. (1999), Analysis of variance designs for model output, Comput. Phys. Comm., 117, 35–43.
Koffi, E. N., P. J. Rayner, M. Scholze, F. Chevallier, and T. Kaminski (2013), Quantifying the constraint of biospheric process parameters by
CO2 concentration and flux measurement networks through a carbon cycle data assimilation system, Atmos. Chem. Phys., 13,
10,555–10,572.
Kucherenko, S., M. Rodriguez-Fernandez, C. Pantelides, and N. Shah (2009), Monte Carlo evaluation of derivative-based global sensitivity
measures, Reliab. Eng. Syst. Safety, 94, 1135–1148.
Kumar, R., L. Samaniego, and S. Attinger (2013a), Implications of distributed hydrologic model parameterization on water fluxes at multiple
scales and locations, Water Resour. Res., 49, 360–379, doi:10.1029/2012WR012195.
Kumar, R., B. Livneh, and L. Samaniego (2013b), Toward computationally efficient large-scale hydrologic predictions with a multiscale
regionalization scheme, Water Resour. Res., 49(9), 5700–5714, doi:10.1002/wrcr.20431.
Liepmann, D., and G. Stephanopoulos (1985), Development and global sensitivity analysis of a closed ecosystem model, Ecol. Modell., 30,
13–47.
Makler-Pick, V., G. Gal, M. Gorfine, M. R. Hipsey, and Y. Carmel (2011), Sensitivity analysis for complex ecological models: A new approach,
Environ. Model. Softw., 26(2), 124–134.
Medlyn, B. E., A. P. Robinson, R. Clement, and R. McMurtrie (2005), On the validation of models of forest CO2 exchange using eddy covari-
ance data: Some perils and pitfalls, Tree Physiol., 25, 839–857.
Morris, M. D. (1991), Factorial sampling plans for preliminary computational experiments, Technometrics, 33(2), 161–174.
MUCM (2010), The Managing Uncertainty in Complex Models (MUCM) Toolkit. [Available at http://mucm.aston.ac.uk/MUCM/MUCMToolkit/.]
Nash, J. E., and J. V. Sutcliffe (1970), River flow forecasting through conceptual models: Part I: A discussion of principles, J. Hydrol., 10,
282–290.
Oakley, J. E., and A. O’Hagan (2004), Probabilistic sensitivity analysis of complex models: A Bayesian approach, J. R. Stat. Soc. Ser. B, 66(Part
3), 751–769.
Rakovec, O., M. C. Hill, M. P. Clark, A. H. Weerts, A. J. Teuling, and R. Uijlenhoet (2014), Distributed Evaluation of Local Sensitivity Analysis
(DELSA), with application to hydrologic models, Water Resour. Res., 50, 409–426, doi:10.1002/2013WR014063.
Razavi, S., and H. V. Gupta (2015), What do we mean by sensitivity analysis? The need for comprehensive characterization of ‘Global’ sensi-
tivity in Earth and Environmental Systems Models, Water Resour. Res., 51, 3070–3092, doi:10.1002/2014WR016527.
Rosero, E., Z.-L. Yang, T. Wagener, L. E. Gulden, S. Yatheendradas, and G.-Y. Niu (2010), Quantifying parameter sensitivity, interaction,
and transferability in hydrologically enhanced versions of the Noah land surface model over transition zones during the warm sea-
son, J. Geophys. Res., 115, D03106, doi:10.1029/2009JD012035.
Rosolem, R., H. V. Gupta, W. J. Shuttleworth, X. Zeng, and L. G. G. de Gonçalves (2012), A fully multiple-criteria implementation of the Sobol’
method for parameter sensitivity analysis, J. Geophys. Res., 117, D07103, doi:10.1029/2011JD016355.
Saltelli, A. (2002), Making best use of model evaluations to compute sensitivity indices, Comput. Phys. Comm., 145(2), 280–297.
Saltelli, A., T. H. Andres, and T. Homma (1993), Sensitivity analysis of model output: An investigation of new techniques, Comput. Stat. Data
Anal., 15, 211–238.
Saltelli, A., S. Tarantola, and F. Campolongo (2000), Sensitivity analysis as an ingredient of modeling, Stat. Sci., 15(4), 377–395.
Saltelli, A., M. Ratto, S. Tarantola, and F. Campolongo (2005), Sensitivity analysis for chemical models, Chem. Rev., 105, 2811–2828.
Saltelli, A., M. Ratto, T. H. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, and S. Tarantola (2008), Global Sensitivity Analysis: The
Primer, John Wiley, Hoboken, N. J.
Saltelli, A., P. Annoni, I. Azzini, F. Campolongo, M. Ratto, and S. Tarantola (2010), Variance based sensitivity analysis of model output. Design
and estimator for the total sensitivity index, Comput. Phys. Comm., 181(2), 259–270.
Saltelli, A., M. Ratto, S. Tarantola, and F. Campolongo (2012), Update 1 of: Sensitivity analysis for chemical models, Chem. Rev., 112,
PR1–PR21.
Samaniego, L., R. Kumar, and S. Attinger (2010), Multiscale parameter regionalization of a grid-based hydrologic model at the mesoscale,
Water Resour. Res., 46, W05523, doi:10.1029/2008WR007327.
Samaniego, L., R. Kumar, and M. Zink (2013), Implications of parameter uncertainty on soil moisture drought analysis in Germany, J. Hydro-
meteorol., 14, 47–68.
Schoups, G., and J. A. Vrugt (2010), A formal likelihood function for parameter and predictive inference of hydrologic models with corre-
lated, heteroscedastic, and non-Gaussian errors, Water Resour. Res., 46, W10531, doi:10.1029/2009WR008933.
Sheffield, J., K. M. Andreadis, E. F. Wood, and D. P. Lettenmaier (2009), Global and continental drought in the second half of the twentieth
century: Severity–area–duration analysis and temporal variability of large-scale events, J. Clim., 22(8), 1962–1981.
Sobol’, I. M. (1976), Uniformly distributed sequences with an additional uniform property, USSR Comput. Math. Math. Phys., Engl. Transl.,
16(5), 236–242.
Sobol’, I. M. (1993), Sensitivity analysis for non-linear mathematical models, Math. Model. Comput. Exp., Engl. Transl., 1, 407–414.
te Linde, A. H., J. C. J. H. Aerts, R. T. W. L. Hurkmans, and M. Eberle (2008), Comparing model performance of two rainfall-runoff models in
the Rhine basin using different atmospheric forcing data sets, Hydrol. Earth Syst. Sci., 12(3), 943–957.
UNEP (1992), World Atlas of Desertification, Edward Arnold, London, U. K.