
SAFE Toolbox


F.A.Q

Here you can find a list of Frequently Asked Questions on GSA in general and SAFE in particular.

WHAT IS GSA USEFUL FOR? WHY SHOULD I USE IT IN THE FIRST PLACE?

We recently published a literature review that attempts to answer this question: what is GSA useful for in the context of earth system modelling, and what have we learnt over the years about the construction, testing and use of environmental models thanks to the application of GSA:

Wagener, T. & Pianosi, F. (2019), What has Global Sensitivity Analysis ever done for us? A systematic review to support scientific advancement and to inform policy-making in earth system modelling, Earth-Science Reviews, 149, 1-18.

This paper could be a useful starting point if you want to get ideas on the type of questions GSA can address.

HOW DO I SET UP MY GSA (HOW TO CHOOSE THE GSA METHOD, SAMPLING STRATEGY, SAMPLE SIZE, ETC.)?

A general introduction to the key choices in setting up GSA, with references for further reading, is given in:
Pianosi, F., Beven, K., Freer, J., Hall, J.W., Rougier, J., Stephenson, D.B., Wagener, T. (2016), Sensitivity analysis of environmental models: A systematic review with practical workflow, Environmental Modelling & Software, 79, 214-232. (Open Access)

More discussion of the practical effects of some of the key set-up choices, and of how to interpret them, is given in:
Noacco, V., Sarrazin, F., Pianosi, F. and Wagener, T. (2019), Matlab/R workflows to assess critical choices in Global Sensitivity Analysis using the SAFE toolbox, MethodsX, 6, 2258-2280.

Specific discussion of the choice of sample size for EET, RSA and VBSA can be found in:
Sarrazin, F., Pianosi, F., Wagener, T. (2016), Sensitivity analysis of environmental models: Convergence and validation, Environmental Modelling & Software, 79, 135-152. (Open Access)

I DO NOT HAVE A MATLAB LICENCE, CAN I STILL USE SAFE?

If you do not have Matlab, you can use SAFE in Octave. Octave is freely available at:
www.gnu.org/software/octave/download.html
In order to use SAFE in Octave, you also need to download the “statistics” package:
http://octave.sourceforge.net/statistics/
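For instance, the package can be installed and loaded directly from the Octave prompt (a minimal sketch; an internet connection is needed and the exact commands may vary with your Octave version):
pkg install -forge statistics   % download and install the statistics package from Octave Forge
pkg load statistics             % load the package in the current Octave session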

Alternatively, you can download the R or Python version of SAFE.

WHY DO I GET NEGATIVE VARIANCE-BASED SENSITIVITY INDICES? (OR INDICES LARGER THAN 1)

In principle, variance-based indices take values in [0,1]. However, this might not happen in practice because the indices are estimated through an approximation procedure (analytical computation being impossible in general). For instance, if the "true" (unknown) value of a sensitivity index is 0.05 and your estimation error is -0.06, you will get a sensitivity estimate of -0.01. So, obtaining indices below 0 (or above 1) is evidence that the approximation errors are relatively large.

>> HOW DO I ESTIMATE APPROXIMATION ERRORS?


The extent of approximation errors can be assessed by using the bootstrapping option to derive confidence intervals.
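In the Matlab version of SAFE, for example, bootstrapping is activated by passing the number of bootstrap resamples as an additional input argument to the index-estimation function. The sketch below is for VBSA; the value of Nboot and the names and order of the confidence-bound outputs are assumptions, so check the help of vbsa_indices in your copy of SAFE for the exact input/output list:
Nboot = 500 ; % number of bootstrap resamples (illustrative value)
% Estimate main (Si) and total (STi) effects together with their bootstrap
% standard deviations (*_sd) and confidence bounds (*_lb, *_ub):
[ Si, STi, Si_sd, STi_sd, Si_lb, STi_lb, Si_ub, STi_ub ] = vbsa_indices(YA, YB, YC, Nboot) ;
% The width of the intervals [Si_lb, Si_ub] and [STi_lb, STi_ub] gives an
% indication of how large the approximation errors are.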

>> HOW DO I REDUCE APPROXIMATION ERRORS?


If you want to reduce approximation errors (and hopefully obtain index values in [0,1]), you must increase the sample size. To do this efficiently, you can add new samples to an already existing dataset, rather than creating a new, bigger sample from scratch (see point 4 in "workflow_vbsa_hymod"). Depending on the case study, the sample size needed to achieve a good approximation may vary a lot, ranging from 1,000 up to 10,000 (or more) times the number of input factors (see for example Figure 5 in Pianosi et al. (2016)).
A final remark: if the true (unknown) value of the sensitivity index is 0, a negative estimate can be obtained even with a very large sample size, although it will be very small in absolute value, because in this case the estimate coincides with the approximation error.
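In the Matlab version of SAFE, adding new samples to an existing dataset can be done with the AAT_sampling_extend function, roughly as sketched below; the argument list, the distribution settings (distr_par) and whether the last argument is the extended or the additional sample size are assumptions to be checked against the function help and against point 4 in "workflow_vbsa_hymod":
% X is the existing input sample (N x M) previously built with AAT_sampling,
% e.g. X = AAT_sampling('lhs', M, 'unif', distr_par, N) ;
Next = 2*N ; % target size of the extended sample (illustrative choice)
Xext = AAT_sampling_extend(X, 'unif', distr_par, Next) ; % append new rows to the existing sample
% Then run the model for the newly added rows only, append the new outputs to
% the existing ones, and recompute the sensitivity indices (for VBSA, after
% applying vbsa_resampling to the extended sample).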

MY VARIANCE-BASED SENSITIVITY INDICES STILL HAVE VERY LARGE CONFIDENCE BOUNDS, WHAT CAN I
DO?

If you cannot afford to run more model evaluations to reduce the confidence intervals of the VBSA indices, you might:
1) Extract as much information as possible from the sensitivity estimates you already possess. For example, if confidence intervals are large but do not overlap, then you might still be able to derive reasonably robust conclusions about the ranking of the inputs, even if the sensitivity index values are not estimated exactly. This is also discussed in Sarrazin et al. (2016).
2) Apply a different, less computationally demanding method, for example the Elementary Effects Test (EET). Notice that if you used Saltelli's resampling strategy to generate the input/output samples for VBSA (as implemented in the vbsa_resampling function of SAFE), you can apply the EET to those samples without re-running the model (you only need to rearrange the samples in the right way before passing them to the EET_indices function). The workflow to do this (in Matlab) is available here.

HOW DO I SET A THRESHOLD FOR IDENTIFYING UNINFLUENTIAL FACTORS?

In theory, uninfluential input factors should have zero-valued sensitivity indices. However, since sensitivity indices are typically computed by numerical approximation rather than analytically, an uninfluential factor may still be associated with a non-zero (although small) index value. One way to identify uninfluential factors is to define a threshold value for the sensitivity indices: if the index is below the threshold, then the input factor is deemed uninfluential. The problem then is how to sensibly define the threshold. A simple and effective way to set the threshold for variance-based and PAWN sensitivity indices is to use the estimated sensitivity to a 'dummy parameter'. The approach is described and demonstrated in:
Khorashadi Zadeh, F., Nossent, J., Sarrazin, F., Pianosi, F., van Griensven, A., Wagener, T., Bauwens, W. (2017), Comparison of variance-based and moment-independent global sensitivity analysis approaches by application to the SWAT model, Environmental Modelling & Software, 91, 210-222.
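In practice (and as a sketch only, not the exact workflow of the paper above), the idea can be implemented by adding to the input sample an extra 'dummy' column that the model simply ignores, and using its estimated index, which in theory should be zero, as the screening threshold; the names my_model, S and distr_par below are illustrative placeholders:
% Sample M 'real' inputs plus one extra dummy column (distr_par stands for the
% distribution parameters of all M+1 columns, including the dummy):
X = AAT_sampling('lhs', M+1, 'unif', distr_par, N) ;
Y = my_model( X(:,1:M) ) ;   % run the model on the first M columns only, so the dummy has no effect
% Estimate the sensitivity indices S of all M+1 inputs with your GSA method of
% choice (note that VBSA needs its own resampling scheme rather than a plain
% AAT sample); the (M+1)-th index is the sensitivity to the dummy:
threshold = S(M+1) ;
influential = find( S(1:M) > threshold ) ;   % inputs whose index exceeds the dummy's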

ARE THERE OTHER METHODS TO ASSESS THE ROBUSTNESS OF RANKING AND SCREENING RESULTS?

Yes, for example the Model Variable Augmentation (MVA) method makes it possible to assess the quality of SA results without performing any additional model runs or requiring bootstrapping. More information about the method can be found in:
Mai, J., & Tolson, B. A. (2019), Model Variable Augmentation (MVA) for diagnostic assessment of sensitivity analysis results, Water Resources Research, 55, 2631-2651.
A Python implementation of this method is available here (code by Juliane Mai), and can be easily integrated into the Python version of SAFE.

MODELS WITH CORRELATED INPUT FACTORS: CAN I STILL APPLY VARIANCE-BASED GSA?

The SAFE functions currently implemented for VBSA require that input factors be independent. However, the literature on extending variance-based sensitivity estimators to the case of dependent/correlated inputs is growing. A good starting point is:
Kucherenko, S., Tarantola, S., Annoni, P. (2012), Estimation of global sensitivity indices for models with dependent variables, Computer Physics Communications, 183, 937-946.
A Python implementation of this method is available here (code by Alessio Ciullo), and can be easily integrated into the Python version of SAFE.

SAMPLING FROM DISCRETE UNIFORM DISTRIBUTION: HOW TO MODIFY THE LOWER BOUND OF THE RANGE?
(Matlab version)

The sampling functions OAT_sampling and AAT_sampling in SAFE rely on the Matlab/Octave function unidinv, which assumes that the lower bound of a discrete uniform distribution is 1; this is why OAT_sampling and AAT_sampling allow you to specify only one parameter (the upper bound) for discrete uniform distributions. If you want to sample from a range whose lower bound is different from 1, we suggest still using the OAT_sampling and AAT_sampling functions as they are, and simply shifting the results after sampling by adding/subtracting the due amount. A simple script with some examples is available here.
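For example, to sample an integer input taking values between a and b (with a different from 1), you can sample from the range 1, ..., b-a+1 and then shift the results. The sketch below uses a single input and arbitrary example values of a, b and N; check the AAT_sampling help for the exact format of the distribution parameters:
a = 3 ; b = 10 ; N = 100 ;   % desired bounds of the discrete input and sample size (example values)
% Sample as if the input took values in 1, 2, ..., b-a+1 (unidinv assumes the lower bound is 1):
X = AAT_sampling('lhs', 1, 'unid', b-a+1, N) ;
% Shift the sampled values so that they fall in a, a+1, ..., b:
X = X + (a - 1) ;
% With more than one input, apply the shift to the relevant column only, e.g. X(:,i) = X(:,i) + (a-1).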

WHY DO I GET AN 'OUT OF MEMORY' ERROR MESSAGE WHEN USING THE AAT_sampling FUNCTION? (Matlab version)

You can get an 'out of memory' error when using the AAT_sampling.m function with a large number of model evaluations (N) or input factors (M). The error is typically caused by another function, lhcube.m, which is called by AAT_sampling.m. The problem occurs when lhcube.m tries to calculate the minimum distance between two points in the Latin Hypercube Sample (lines 85-88 of lhcube.m). This step can take a lot of memory because it requires creating a matrix that contains the distances between all possible pairs of points in the LHS. In the lhcube.m function, we have two options to perform this step. The first option (default) is given on line 86, and it uses the pdist.m function of the Matlab Statistics Toolbox:
dk = min(pdist(Xk)) ; % Requires Statistics Toolbox
The second option (which should not be active in the copy of SAFE you have received) is given on the (commented) line 88, and it uses the ipdm.m function (a freely available function included in SAFE):
dk = ipdm(Xk,'metric',2) ; dk = min(dk(dk>0)) ;
The reason why we use the first option by default is that it uses less memory, so it should run into 'out of memory' errors less often. However, we also provide the second option for those who do not have the Statistics Toolbox; they just need to comment out line 86 and uncomment line 88.

>> HOW DO I FIX THE PROBLEM?

If you have the Statistics Toolbox and can thus use the more efficient pdist.m function, double check that this is the option being used by lhcube.m. If you are already using the first option (or if you do not have the Statistics Toolbox, so you cannot use this option) and you still get an 'out of memory' message, then you can try to increase the memory allocated to Matlab. Look at the Mathworks documentation on how to do that (for example, instructions for Matlab R2018b are given here).
As a last resort, you can force lhcube.m to skip the calculation of the minimum distance between points altogether by commenting out lines 85-95 in lhcube.m. What happens in those lines is that lhcube.m selects the 'best' LHS (to be delivered as output of the function) out of a certain number of 'candidate' LHSs (10 in our case), and the selection criterion is that the 'best' LHS is the one where the minimum distance between points is largest (the so-called 'maximin' approach). So, if you comment out these lines, the selection will not be made, and lhcube.m will simply return the last candidate LHS: still an LHS, but possibly suboptimal with respect to the maximin criterion.
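To illustrate the maximin criterion, the selection among candidate LHSs works roughly as in the generic sketch below (this is not the actual code of lhcube.m; it builds each LHS column by column, avoids pdist, and relies on implicit array expansion, i.e. Matlab R2016b or later, or Octave):
N = 50 ; M = 3 ; nrep = 10 ;   % sample size, number of inputs, number of candidate LHSs (example values)
dbest = -inf ;
for k = 1:nrep
    % Build one candidate Latin Hypercube Sample in the unit hypercube:
    Xk = zeros(N,M) ;
    for j = 1:M
        Xk(:,j) = ( randperm(N)' - rand(N,1) ) / N ;  % one point in each of the N strata of column j
    end
    % Minimum distance between any two points of this candidate:
    D  = sqrt( sum( ( permute(Xk,[1 3 2]) - permute(Xk,[3 1 2]) ).^2, 3 ) ) ;
    dk = min( D( triu(true(N),1) ) ) ;
    if dk > dbest   % 'maximin': keep the candidate whose minimum inter-point distance is largest
        dbest = dk ; X = Xk ;
    end
end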
[Want to learn more about LHS optimisation? See for instance: Damblin, Couplet, Iooss (2013), Numerical studies of space-filling designs: optimization of Latin Hypercube Samples and subprojection properties, Journal of Simulation]
