
WATER RESOURCES RESEARCH, VOL. 45, W00B15, doi:10.1029/2007WR006799, 2009

Aleatoric and epistemic uncertainty in groundwater flow and transport simulation
James L. Ross,1 Metin M. Ozbek,2 and George F. Pinder1
Received 7 December 2007; revised 2 March 2009; accepted 23 March 2009; published 16 May 2009.
[1] The characterization of aleatory hydrogeological parameter uncertainty has
traditionally been accomplished using probability theory. However, when consideration is
given to epistemic as well as aleatory uncertainty, probability theory is not necessarily
appropriate. This is especially the case where expert opinion is regarded as a suitable
source of information. When experts opine upon the uncertainty of a parameter value, both
aleatoric and epistemic uncertainties are introduced and must be modeled appropriately.
A novel approach to expert-provided parameter uncertainty characterization can be
defined that bridges an historical gap between probability theory and fuzzy set theory.
Herein, a random set, a generalization of a random variable, is employed to formalize
expert knowledge, and fuzzy sets are used to propagate this uncertainty to model estimates
of contaminant transport. The resultant random set-based concentration estimates are
shown to be more general than the corresponding random variable estimates. In some
cases, the random set-based results are shown as upper and lower probabilities that bound
the corresponding random variable’s cumulative distribution function.
Citation: Ross, J. L., M. M. Ozbek, and G. F. Pinder (2009), Aleatoric and epistemic uncertainty in groundwater flow and transport
simulation, Water Resour. Res., 45, W00B15, doi:10.1029/2007WR006799.

1 Center for Groundwater Remediation Design, School of Engineering, University of Vermont, Burlington, Vermont, USA.
2 Environ International Corporation, Princeton, New Jersey, USA.

Copyright 2009 by the American Geophysical Union.
0043-1397/09/2007WR006799

1. Introduction
[2] Uncertainty in groundwater flow and transport modeling comes in two forms: aleatory and epistemic. Such distinctions in uncertainty are most often identified in risk assessment and reliability engineering [Helton et al., 2000a, 2000b, 2004; Hofer et al., 2002; Helton and Oberkampf, 2004; Oberkampf et al., 2004]; and only recently have these distinctions been identified in hydrogeological applications [Srinivasan et al., 2007]. Aleatory uncertainty, also called stochastic or variable uncertainty, refers to uncertainty that cannot be reduced by more exhaustive measurements or a better model. Epistemic uncertainty, or subjective uncertainty, on the other hand, refers to uncertainty that can be reduced.
[3] Despite these apparent distinctions in uncertainty, probability theory alone has traditionally been used to characterize both forms of uncertainty in engineering applications [Apostolakis, 1990; Helton et al., 2004]. While it is commonly accepted that probability theory is ideal for the characterization of aleatory uncertainty [Ganoulis, 1996], the facility with which probability theory effectively captures epistemic uncertainty has been called into question [O’Hagan and Oakley, 2004], especially given the introduction of a number of alternative methods of epistemic uncertainty characterization [Choquet, 1954; Zadeh, 1965, 1978; Shafer, 1976].
[4] One such method, fuzzy set theory, has failed to gain acceptance in engineering, much less hydrogeological, applications. A possible reason for this is the necessary paradigm shift one must make in order to apply fuzzy set theory to uncertainty characterization.
[5] Random set theory [Zadeh, 1965] provides an intuitive means for both epistemic and aleatory uncertainty characterization. Whereas probability theory’s basic tool for uncertainty characterization is the probability density function (PDF), discrete random set theory is predicated upon the assignment of probabilities to intervals, rather than points, as with discrete PDFs. As such, a random set is a generalization of a random variable, since intervals are more imprecise than point values. The use of random set theory is more appropriate for representation of subjective knowledge because it does not rely upon means, variances and probabilistic models, which are inconsistent with the nature of human thought and discourse.
[6] Random set theory [Helton and Oberkampf, 2004; Joslyn and Kreinovich, 2005], however, is a general approach to subjective knowledge characterization, and in addition, random sets can be transformed into fuzzy sets [Joslyn and Booker, 2004; Joslyn and Ferson, 2004] with little difficulty. As will be shown, the transformation from random sets to fuzzy sets facilitates the efficient solution of groundwater flow and transport model equations characterized by uncertainty.
[7] Though a few applications of fuzzy set theory to expert knowledge characterization in hydrogeologic applications have been published [Bardossy et al., 1989, 1990a, 1990b, 1990c; Bagtzoglou et al., 1996; Dou et al., 1995, 1997a, 1997b, 1999; Fang and Chen, 1997; Demmico and Klir, 2004; Guan and Aral, 2004; Ozbek and Pinder, 2006; Ross et al., 2006, 2007, 2008], fuzzy sets are not a

W00B15 ROSS ET AL.: UNCERTAINTY IN GROUNDWATER SIMULATION W00B15

Figure 1. Inversely estimated hydraulic conductivity field; different numbers identify fields with
distinct hydraulic conductivity values. Gray ovals represent contaminant source locations. Dots denoted
as A and B are pumping wells.

mainstream method of uncertainty characterization in hydrogeology. Consequently, an explanation of the distinctions between probability and fuzzy set theories and aleatoric and epistemic uncertainties in the characterization of hydrogeological uncertainty is needed.
[8] The purpose of this paper is to frame uncertain hydraulic conductivity information in terms of aleatory and epistemic uncertainty, to show how it is related to random set theory, and to demonstrate how it can be used in groundwater flow and transport modeling. To this end, we present the random set characterization of hydraulic conductivity using both uncertainty types and the corresponding simultaneous use of random set, probability and fuzzy set theories. We also describe how to propagate both types of uncertainty in model estimates of concentration. In doing so, we provide a reasonable method of uncertainty characterization that is compatible with both probability and fuzzy set theories.

2. Problem Statement
2.1. Site Information
[9] In groundwater modeling problems where hydraulic conductivity measurements are few, a hydraulic conductivity field is often assumed to be composed of a few large subdomains of equal hydraulic conductivity, like the simplified representation of the Woburn, Massachusetts site presented in Figure 1, which we will use as our illustrative example problem. A small number of hydraulic conductivity measurements are available in each of these hydraulic conductivity subdomains. Total correlation is assumed within boundaries and zero correlation is assumed between the units. Constant head conditions are specified on the left (22.9 m) and right (38.1 m) boundaries, and no-flow boundary conditions are specified along the top and bottom of the domain. Contaminant sources (gray ovals in Figure 1) are located in formations four and five at concentrations of 2000 ppb and 1500 ppb, respectively. Finally, two pumping wells are placed in formation two. Wells A and B pump at 8.19 × 10^-4 m^3/s and 4.91 × 10^-4 m^3/s, respectively.
[10] The Princeton Transport Code (PTC), a three-space dimensional finite element simulator, was employed to model groundwater flow and transport, with a mesh density value of 15.2 m everywhere but at the well locations, where the density is increased to 3.0 m. Dispersivity, storativity and porosity values are defined as 0.3 m, 0.0001, and 0.2, respectively, throughout the domain. These are the default PTC values for the parameters, and were deemed adequate as the purpose of the study was the novel characterization of hydraulic conductivity uncertainty and model estimates of concentration. Though the low dispersivity value in conjunction with the mesh density suggests a high Peclet number and, as a result, the possibility of significant numerical errors in the finite element transport model, automatic upstream weighting of the convection term adjusts for small dispersivities.
2.2. Traditional Approach (Confidence Intervals)
[11] For a given subdomain, a representative random variable hydraulic conductivity value can be constructed from the mean and variance of that subdomain’s measurement data set. Because these measurements are themselves inherently uncertain owing to measurement and inverse model uncertainty, both the mean and variance of the measurement data set may not be representative of their true values. An appropriate expert familiar with the hydrogeology of the area may be asked to provide some measure of the uncertainty in the form of a 95% confidence interval. Given a mean value, the assumption of hydraulic conductivity’s lognormality, and this 95% confidence interval, a PDF defining the hydraulic conductivity random variable can be constructed. Such an approach to both epistemic and aleatoric uncertainty characterization is predicated strictly upon probability theory. The two sources of uncertainty, natural randomness and expert knowledge, are not distinguishable when both are built into a single probability distribution.
[12] The mean hydraulic conductivity values associated with the domain in Figure 1 are provided in Table 1, along with the expert-provided confidence intervals for the five


Table 1. Hydraulic Conductivity Random Variable Properties for the Domain in Figure 1

Unit Number   Mean LnK (m/s)   Confidence Interval   Variance
1             -25.6            [-27.4, -23.8]        0.83
2             -18.8            [-18.8, -18.8]        0
3             -19.2            [-20.9, -17.4]        0.79
4             -15.0            [-17.2, -12.8]        1.26
5             -11.9            [-11.9, -11.9]        0

formations and the resulting calculated variances. In this case, the expert possessed an awareness of the devices used to measure hydraulic conductivity at the various locations as well as an implicit understanding of the subdomain-wise homogeneity throughout the site. On the basis of this background knowledge, the intervals for three of the formations were specified to approximate 2 orders of magnitude variation in hydraulic conductivity. For simplicity, the remaining two formations were assigned zero variation.
[13] Consider some of the limitations of the pure probabilistic approach. Aside from blurring the two uncertainty sources (aleatory and epistemic) into a single probability distribution, the probabilistic form of the model estimates of concentration is significantly dependent upon the certainty with which the expert can define 95% confidence, a rather abstract notion, and the appropriateness of the lognormality assumption. In fact, the longstanding assumption of lognormality for hydraulic conductivity may not be correct in all cases [Ricciardi et al., 1998; Mathon et al., 2009]. The true random variable may actually be best defined using an alternative probability function.
[14] Using rank-ordered Latin hypercube sampling [Zhang and Pinder, 2003], model estimates of concentration were determined using the flow and transport simulator. The uncertainty in these concentration estimates is sensitive to relatively small variations in estimated hydraulic conductivity intervals. Note that 2 orders of magnitude change in a confidence interval is considered small relative to the range of hydraulic conductivity values one can encounter in the field, which, according to Domenico and Schwartz [1990], can range over 11 orders of magnitude from clay to gravel.
[15] The results of the three cases of varying hydraulic conductivity uncertainty presented in Table 2 are plotted in Figure 2 as estimates of discrete concentration random variables. The cumulative distribution functions plotted are constructed from the realizations of concentration estimates that result from the application of Latin hypercube sampling. The steeper distributions (case 2, black squares) result from the smaller hydraulic conductivity confidence intervals in Table 3. The longer, wider distributions (case 3, hollow circles) are the random variables resulting from a less certain expert, who provided wider confidence intervals. Moderate uncertainty (case 1, black circles) produces cumulative distributions situated between the two extreme cases. Thus, the opining expert’s certainty regarding model input parameters can produce larger changes in the model output random variables.

3. Random Set Uncertainty
[16] An approximation of a hydraulic conductivity subdomain’s uncertain hydraulic conductivity is provided by the discrete CDF calculated from the available measurements in that subdomain. However, as stated above, these individual measurements are themselves uncertain. Intuitively, then, this measurement error warrants characterization before the subdomain’s hydraulic conductivity random variable can be defined.
[17] Rather than restrict an expert to a single interval in an attempt to capture both the measurement error as well as the stochasticity of hydraulic conductivity throughout a particular hydraulic conductivity zone, as is traditionally accomplished by confidence intervals, it is more intuitive to permit the expert to opine upon the uncertainty in the individual measurement values that helped determine the random variable in the purely probabilistic approach, above.
[18] It has been demonstrated that an appropriate expert can simply opine upon the uncertainty of hydrogeological measurements by assigning an interval in which the true value is expected to lie [Joslyn and Kreinovich, 2005] using "what is known about the underlying quantity" [Ferson et al., 2002]. Ferson et al. [2002] aptly note that though this is the simplest approach, it is also the most difficult to defend to others. They also provide alternative methods for defining random set structures. For example, knowing the measuring device (i.e., pump test, slug test), an expert can simply define such an interval by stating that the true value lies within x orders of magnitude of the measured value [Ferson et al., 2002; Mathon et al., 2009]. Thus, in any one hydraulic conductivity subdomain with t equiprobable measurements, a collection of t equiprobable intervals is defined, representing t uncertain measurements. Where the measurements are used to construct a discrete CDF, these expert-provided intervals essentially bracket the unknown true random variable hydraulic conductivity. A collection of these intervals [Helton and Oberkampf, 2004; Joslyn and Kreinovich, 2005] forms a random set, which is isomorphic to a Dempster-Shafer body of evidence [Joslyn and Booker, 2004].
[19] In this framework, a random set and associated probability function are composed of a set of focal elements $\{F, m\}$, where $F$ is the set of focals (intervals) and $m$ is the basic mass assignment function that assigns nonzero probabilities to the focals. A random set can be transformed into lower and upper probability bounds, thereby bracketing the unknown true random variable. These bounds are called belief, $\mathrm{bel}(I_G)$, and plausibility, $\mathrm{pl}(I_G)$, respectively, for some arbitrary focal element $I_G \in G$, a set of focals. Because the random set definition requires less precision from the opining expert, the resulting upper and lower bounding curves (Figure 3) do not impose false precision in the parameter uncertainty characterization and avoid any inaccuracies that result from forcing an expert to provide information that

Table 2. Locations and Concentration Statistics for the Nodes Whose Locations Are Plotted in Figure 1^a

Node   Easting   Northing   Mean   Variance of Case 1   Variance of Case 2   Variance of Case 3
400    1107      635        238    1499                 1104                 2248
234    1590      1030       645    829                  234                  3609
162    1450      682        998    1113                 176                  8713
190    1420      634        722    4144                 979                  13689

^a Locations and concentration are in feet.
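The variance column of Table 1 can be related to the expert intervals with a short calculation. The sketch below is illustrative only: it assumes that each interval is a symmetric 95% confidence interval on ln K and that ln K is normally distributed (the lognormality assumption of section 2.2), and it assumes the tabulated ln K values are negative. The function name `lnk_variance_from_ci` is hypothetical and does not come from the paper.

```python
from statistics import NormalDist


def lnk_variance_from_ci(lower, upper, level=0.95):
    """Variance of ln K implied by a symmetric confidence interval.

    Assumes ln K is normally distributed, so the half-width of the
    interval equals z * sigma, with z the standard normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # roughly 1.96 for 95%
    sigma = (upper - lower) / (2.0 * z)
    return sigma ** 2


# Intervals for units 1, 3 and 4 as read from Table 1 (signs assumed negative;
# only the interval width affects the result).
intervals = {1: (-27.4, -23.8), 3: (-20.9, -17.4), 4: (-17.2, -12.8)}
for unit, (lo, hi) in intervals.items():
    print(unit, round(lnk_variance_from_ci(lo, hi), 2))
```

Under these assumptions the computed variances come out close to, though not exactly equal to, the tabulated ones, which is what one would expect if the published intervals were rounded.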


Figure 2. Random variables representing the concentrations at nodes 400 (top left), 234 (top right),
162 (bottom left), and 190 (bottom right). Intuitively, the variance changes throughout space and as the
expert-provided hydraulic conductivity confidence intervals are narrow (case 2) relative to the case of
interest (case 1) and relatively wide (case 3). The data for these random variables are given in Table 2.
The confidence intervals for the three cases are provided in Table 3.

would define a single random variable. Moreover, no probability model need be selected or assumed, which is desirable in light of the above-mentioned possible inaccuracies in the lognormality assumption for hydraulic conductivity.
[20] Formally, a hydraulic conductivity random set $(F, m_F)$ is defined on the Cartesian product $K = K_1 \times \cdots \times K_{n_1}$, where $K_j$ denotes the domain of the hydraulic conductivity value and $n_1$ is the number of uncertain hydraulic conductivity zones (in the trial case, $n_1 = 3$). In this formal definition, $m_F$ is a function mapping elements, $I_F \in F$, of $F$ to the interval [0, 1],

$$m_F : F \to [0, 1], \qquad I_F \in F,$$

$$\sum_{I_F : I_F \in F} m_F(I_F) = 1.$$

Since, in our trial example, the conductivity zones are assumed uncorrelated, the $n_1$-dimensional random sets are marginalized to random sets $(F_j, m_{F_j})$, $j = 1, \ldots, n_1$, each defined solely upon the individual domains $K_j$. In other words, one can specify $(F, m_F)$ by means of $n_1$ stochastically independent random sets.
[21] Each of these random sets is composed of a finite number of intervals $I_{F_j}$, $I_{F_j} \in F_j$, or focal elements, each associated with a probability mass assignment $m(I_{F_j})$. The propagation of the random set-based hydraulic conductivity values through to concentration values necessitates the use of a tool to extend the flow and transport model such that it can operate upon these focal elements. Such an extension would permit the calculation of concentration focal elements that can be aggregated into a concentration random set at each location throughout the spatial domain.

Table 3. Expert-Provided Hydraulic Conductivity Confidence Intervals for the Base Case, a High-Certainty Case, and a Low-Certainty Case^a

Unit Number   Confidence Interval of Case 1   Confidence Interval of Case 2   Confidence Interval of Case 3
1             [-27.4, -23.8]                  [-26.4, -24.8]                  [-28.5, -22.6]
2             [-18.8, -18.8]                  [-18.8, -18.8]                  [-18.8, -18.8]
3             [-20.9, -17.4]                  [-19.9, -18.4]                  [-21.9, -14.4]
4             [-17.2, -12.8]                  [-16.6, -13.4]                  [-18.5, -11.4]
5             [-11.9, -11.9]                  [-11.9, -11.9]                  [-11.9, -11.9]

^a Conductivity is in m/s. Base case is case 1, high-certainty case is case 2, and low-certainty case is case 3.


Figure 3. Cumulative random set hydraulic conductivity values for zone 1 (top left), zone 2 (top right),
and zone 5 (bottom), determined by the ±2 orders of magnitude uncertainty on available measurements.
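Cumulative bounds of the kind plotted in Figure 3 can be illustrated with a few lines of code. This is a minimal sketch rather than the authors' implementation: it assumes each measurement is turned into one equiprobable interval spanning a fixed number of orders of magnitude on either side of the measured value, and the three measurement values used in the usage note are invented for illustration. The function names are hypothetical.

```python
def focal_elements(measurements, orders=2.0):
    """One equiprobable interval per measurement: K / 10**orders to K * 10**orders."""
    mass = 1.0 / len(measurements)
    factor = 10.0 ** orders
    return [((k / factor, k * factor), mass) for k in measurements]


def cumulative_belief(focals, k):
    """Lower probability of {K <= k}: mass of intervals lying entirely at or below k."""
    return sum(m for (lo, hi), m in focals if hi <= k)


def cumulative_plausibility(focals, k):
    """Upper probability of {K <= k}: mass of intervals whose lower end is at or below k."""
    return sum(m for (lo, hi), m in focals if lo <= k)


# Hypothetical measurements in m/s (not taken from the paper).
focals = focal_elements([1e-9, 1e-8, 1e-7])
for k in (5e-11, 1e-8, 5e-7):
    print(k, cumulative_belief(focals, k), cumulative_plausibility(focals, k))
```

For any threshold the belief value never exceeds the plausibility value, so the two step functions bracket every cumulative distribution that is consistent with the intervals.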

[22] Consider the extension of the transport model $y = f(x)$, where $x = (x_1, \ldots, x_{n_1})$ is the vector of uncertain conductivities in $n_1$ zones and $y = (y_1, \ldots, y_{n_2})$ is the vector of uncertain concentration values at $n_2$ nodes. Assume, without loss of generality, that $n_1 = 1$ and $K_1 = W = \{k_1, \ldots, k_L\}$, the domain of $L$ possible hydraulic conductivity values, where focal elements are subsets of $W$. The random set extension principle [Dubois and Prade, 1991] defines the random set concentration $(G, m)$ at node $i$ as

$$G = \{ f(I_F) \mid I_F \in F \},$$

and

$$m(I_G) = \{ m(I_F) \mid I_G = f(I_F) \},$$

where $I_G$ represents a focal set of concentrations that is an image of a hydraulic conductivity focal element $I_F$ through the transport model $f$. This concentration focal element is defined by

$$I_G = f(I_F) = \{ f(k) \mid k \in I_F \}.$$

A significant drawback to the random set extension principle is its computational intensity, since a random set uses the power set of its domain rather than the domain itself. Therefore, an approximation of the random set hydraulic conductivities, such that the extended transport model can be carried out over $W$ rather than its power set, is needed.
[23] In general, simplifying or approximating a random set means approximating it by another random set, in which the number of the focals containing relevant information is reduced [Bauer, 1996]. An approach to approximating a random set, presented by Dubois and Prade [1990], is adopted in this paper. This approximation uses the following steps: (1) formation of sets of focals of the original random set that are the focals of the approximating random set and (2) allocation of basic probability masses to the sets from the first step using a process that is optimal in the sense that the resulting hydraulic conductivity focal elements of the approximating random set are the smallest in size, effectively minimizing the amount of imprecision introduced by the approximation.
[24] Approximating a random set by the above method results in nested focal elements, which comprise a fuzzy set $A$ [Klir and Yuan, 1995], a special type of random set defined by a membership function (discussed further below). This fact permits the solution of the transport model over the domain $W$, rather than its power set. For the exact representation of the approximating conductivity fuzzy set on $W$, the membership function is calculated by

Figure 4. Possibilistic approximations for the random set provided by the expert for the hydraulic
conductivity values associated with zone 1 (top left), zone 2 (top right), and zone 5 (bottom). The
hydraulic conductivity values for zones 3 and 4 are considered certain and precise.
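The step from nested focal elements to a fuzzy set such as those in Figure 4 can be sketched with the one-point coverage function, which sums the masses of all focal elements that contain a given value. The example below is a hedged illustration: the three nested intervals on ln K and their masses are invented, and the helper names `membership` and `alpha_cut` are hypothetical. The `alpha_cut` helper is only meaningful when the focal elements are nested.

```python
def membership(focals, k):
    """One-point coverage: total mass of the focal elements that contain k."""
    return sum(m for (lo, hi), m in focals if lo <= k <= hi)


def alpha_cut(focals, alpha):
    """Interval of values whose membership is at least alpha (nested focals only)."""
    widest_first = sorted(focals, key=lambda f: f[0][1] - f[0][0], reverse=True)
    cumulative = 0.0
    for (lo, hi), m in widest_first:
        cumulative += m  # membership of points in this interval but not in narrower ones
        if cumulative >= alpha:
            return (lo, hi)
    raise ValueError("alpha exceeds the total mass of the focal elements")


# Hypothetical nested focal elements on ln K, with masses summing to 1.
nested = [((-26.0, -25.0), 0.5), ((-27.0, -24.0), 0.3), ((-28.0, -23.0), 0.2)]
print(membership(nested, -25.5), membership(nested, -24.5), membership(nested, -23.5))
print(alpha_cut(nested, 0.1), alpha_cut(nested, 0.4), alpha_cut(nested, 0.9))
```

Values in the innermost interval receive full membership, and membership falls off stepwise toward the edges of the widest interval, which is the staircase shape of a possibility distribution built from a finite body of evidence.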

the one-point coverage function [Goodman and Nguyen, 1985] for random sets:

$$\text{for all } k \in W : \quad m_A(k) = \sum_{C_i : k \in C_i} m(C_i),$$

where $C_i$ are the focal elements of the approximating hydraulic conductivity random set and $m_A$ denotes the membership function of the hydraulic conductivity fuzzy set $A$. This fuzzy set representation of the uncertain hydraulic conductivity values allows for the use of the special case of the extension principle, described above, for fuzzy sets [Dubois and Prade, 1991], which states that in order to calculate the possibility value of an uncertain concentration value one must consider membership values of hydraulic conductivities used to calculate that concentration:

$$\text{for all } y \in f(W) : \quad m_{f(A)}(y) = \sup \{ m_A(k) \mid y = f(k) \},$$

where $m_{f(A)}$ represents the membership function of the fuzzy concentration at a given node.
[25] Where fuzzy sets are used to approximate random sets, the application of the extension principle [Klir and Yuan, 1995] to the model equations is relatively straightforward and has precedent in hydrogeological applications [Dou et al., 1995, 1997a, 1997b; Prasad and Mathur, 2007]. Nevertheless, in our example case the vertex method [Ross, 2004], an approximation to the extension principle, is applied to reduce computational effort. The vertex method results in, for each α-cut (an interval created by the horizontal cut of a fuzzy set at a given α, or membership value), eight concentration values at each location due to the eight possible permutations of the three lower and three upper hydraulic conductivity bounds (from three uncertain hydraulic conductivity values) of the α-cut. The minimum of these eight values is taken as the lower bound of the concentration α-cut, and the maximum is assigned as the upper bound.
[26] If the function being extended (i.e., the transport model) is nonlinear and nonmonotonic with respect to its variables (in our case, three hydraulic conductivity values), then there is a possibility that the function will take smaller or larger values for a combination of three hydraulic conductivity values that are not necessarily a permutation of the respective α-cut bounds, but rather a permutation of values sampled anywhere within these bounds. If one computes with the eight permutations of the bounds of the α-cuts of the three hydraulic conductivity fuzzy sets and calculates eight concentration values, one assumes that the concentration at that node cannot get any smaller than the minimum of the eight values and cannot get any larger than the maximum of the eight values. In other words, one


Figure 5. The cumulative belief and plausibility curves are relatively narrow and do not entirely bound
the corresponding cumulative distribution functions for the three cases at nodes 400 (top left) and 234
(top right), but are rather wide and do bound the cumulative distribution functions for nodes 162 (bottom
left) and 190 (bottom right).
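The vertex method described in the text evaluates the model only at the corners of the box formed by the input α-cuts and takes the smallest and largest results as the output α-cut. The sketch below illustrates the idea under stated assumptions: the function name `vertex_method` is hypothetical, and the two stand-in functions are simple algebraic expressions chosen for illustration, not the flow and transport model.

```python
from itertools import product


def vertex_method(f, alpha_cuts):
    """Approximate the output alpha-cut by evaluating f at every corner of the input box.

    alpha_cuts is a list of (lower, upper) pairs, one per uncertain input,
    so n inputs give 2**n corner evaluations (eight for three inputs).
    """
    values = [f(*corner) for corner in product(*alpha_cuts)]
    return min(values), max(values)


# Stand-in monotonic function of three inputs (hypothetical, for illustration).
def monotonic(k1, k2, k3):
    return k1 + 2.0 * k2 - k3


cuts = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
print(vertex_method(monotonic, cuts))

# A nonmonotonic stand-in: the true minimum over [1, 2] is 0 at k = 1.5,
# but both corners give the same value, so the vertex method misses it.
print(vertex_method(lambda k: (k - 1.5) ** 2, [(1.0, 2.0)]))
```

The second call shows why monotonicity matters: when an extreme value lies in the interior of an α-cut, the corner evaluations understate the width of the output interval.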

admits that there is no need to investigate the entire hydraulic conductivity α-cuts in order to determine the α-cut bounds of the concentration at any node.
[27] If the α-cuts of the fuzzy hydraulic conductivities have one or more extreme points in the interior, then the vertex method results can be taken only as approximations to the true global extreme values that determine the bounds of the concentration at the given α level. This will result in fuzzy nodal concentration values with narrower support, implying higher specificity in the information content than there actually exists [Klir and Yuan, 1995]. However, it is the authors’ opinion that the nonlinear relationship between nodal concentrations and hydraulic conductivity is mostly monotonic and that the extreme concentrations will be at the vertices (α-cut bounds), rather than anywhere between the vertices. Thus, for the contaminant transport problem considered here, it is more efficient to use the vertex method than to invoke a global optimization tool that implements the extension principle and, as such, searches the entire α-cut. Where monotonicity cannot be justified, the authors recommend applying the extension principle.
[28] Figure 4 provides examples of a fuzzy set defining an uncertain hydraulic conductivity value. The interpretation of these fuzzy sets in Figure 4 that is most pertinent to this topic is that of a possibility distribution [Zadeh, 1978]. Where model inputs are defined as fuzzy sets, model estimates of concentration are interpreted as possibility distributions. A possibility distribution defines for each value along the horizontal axis the degree to which that value is possible, given available evidence. In Figure 4 (top left), for instance, the hydraulic conductivity value 3.5 × 10^-9 m/s is most possible. This is similar to probability theory, where inspection of the peak of a probability density function would reveal the most probable value.
[29] Whereas, through the transformation to fuzzy sets, random sets offer a relatively facile strategy for computation with uncertainty, the most significant advantage to the random set approach is its potential to characterize both aleatory and epistemic uncertainty. The foundation for random sets lies in probability theory, which, as mentioned above, is ideally suited for aleatory uncertainty characterization. On the other hand, random sets are less specific, or less precise, than random variables [Joslyn and Booker, 2004], because focal elements, upon which random sets are based, are a source of imprecision in the uncertainty quantification process (focal elements associated with conventional probabilities, random variables, are points and therefore more precise than general random sets). This imprecision is a form of epistemic uncertainty. An expert who provides a body of evidence to characterize the

uncertainty regarding some hydraulic conductivity measure- enough so as to be meaningful. Since the error of various
ment actually admits to the existence of these two forms of measurement techniques are commonly acknowledged and,
uncertainty. The natural randomness (aleatory uncertainty) at times, quantified [Mathon et al., 2009], we perceive this
of hydraulic conductivity is captured by the stochastic task to be reasonable.
nature of random sets (the basic mass assignments). Owing
to an expert’s inability to precisely define this natural 4. Discussion of Results
randomness, a random set merely provides bounds on the
exact random variable (because the basic mass function is [33] The outcome of the random set approach to uncer-
defined over intervals of the conductivity domain rather tainty characterization is significantly distinct from that of
than the domain itself), thereby imprecisely defining the the traditional purely probabilistic approach. Intuitively,
random variable (epistemic uncertainty). the strictly probabilistic approach produces random vari-
[30] The starting point for the application of random set- able concentration values, whereas the random set-based
based uncertainty characterization is similar to the probabi- approach results in upper and lower probability bounds. As
listic approach presented above. Given the measurements in noted above, assuming that uncertainty in both approaches
each zone in Figure 1, the expert, armed with knowledge of was characterized by the same expert or different experts with
the measurement technique and aquifer characteristics, the same understanding of pertinent data, the upper and lower
provides the aforementioned intervals on each measured probabilities (plausibility and belief) should bracket the
value by specifying that the true hydraulic conductivity corresponding probability distribution (produced by the
value lies within ±2 orders of magnitude of the measure- stochastic approach) at a given location. Thus, the range of
ment. As mentioned above, where the measurements in a concentration values that result from the random set approach
particular zone are used to construct an approximate CDF, these expert-provided intervals become upper and lower bounds on the zone’s true random variable hydraulic conductivity (plausibility and belief, respectively). Figure 3 shows these upper and lower bounds for zones one, two and five. As in the Monte Carlo approach, the hydraulic conductivity values for zones three and four are considered certain and precise. The fuzzy set approximations of these random sets, whose construction is outlined above, are shown in Figure 4.
[31] Though the set of intervals and associated probabilities provided by the expert is a natural extension of the confidence interval in the Monte Carlo approach above, it comprises a greater amount of information. As such, the bodies of evidence provided by the expert capture both the uncertainty surrounding the mean hydraulic conductivity values and the imprecision with which the expert can truly characterize this uncertainty.
[32] The vertex method [Ross, 2004], an approximation to the extension principle, was applied to the finite element approximation equations of the groundwater flow and transport model in order to propagate the possibilistic uncertainty through to the concentration values. As a result, uncertain concentration estimates are described by possibility distributions. The possibilistic concentration values can be transformed into upper (plausibility) and lower (belief) bounds on the unknown random variable. The resulting bounds for the nodes of interest are plotted in Figure 5 with the corresponding random variables from all three cases of uncertainty in cumulative distribution function form (dashed lines) from Figure 2. If the intervals used to construct the random sets are certain to contain the value of the measured variable, the true (and unknown) probability distribution defining the random variable, which the confidence intervals aim to characterize, lies between the plausibility and belief curves, especially where these bounds are widely separated [Ferson et al., 2002]. Thus, the burden is upon the expert, who specified the magnitude of measurement error that creates the random sets, to ensure that the measurement intervals are wide enough to bound the true hydraulic conductivity measurement yet narrow
is greater than that which would result from the strict Monte Carlo method, owing to the imprecision inherent in the random set approach. However, the degree of precision presented in the random variable approach is, as argued herein, difficult to justify.
[34] Consider the same nodal locations whose concentration random variables are plotted in Figure 2. The upper and lower probabilities for these same nodal locations are plotted in Figure 5, along with the concentration random variables for all three cases of uncertainty presented in Table 1. Note that, at some nodes, the upper and lower probabilities entirely bound the corresponding random variables, whereas at other nodes they do not entirely bound the corresponding cumulative distribution function. Such a discrepancy is likely due to the fact that the expert employed to define the confidence intervals was not the same as the expert who provided the information used to construct the random set intervals.
[35] Uncertainty (variance) associated with concentration values changes throughout the spatial domain, as is evident from the different slopes in the cumulative distribution functions in Figure 2. Nevertheless, high variances may not simply be associated with high degrees of uncertainty, but rather with means of greater magnitude. Random set-based probability bounds, however, are independent of the magnitude of the concentration values and provide a true means of uncertainty identification. Wider bound separation signifies more uncertainty in concentration estimates and, as a result, identifies locations where more data are warranted. For instance, the estimate in Figure 5 (bottom right) is more uncertain than that in Figure 5 (top right) and, as such, is in need of additional data. In this particular instance, a combination of node 234’s distance from the contaminant source and the expert’s precision in providing error bounds on hydraulic conductivity measurements from zone 4 relative to the measurements in zone 1 contributed to the lower separation between the probability bounds for the concentration random set at node 234 (Figure 5, top right) relative to that for node 190 (Figure 5, bottom right). Likewise, the expert believed the measurements from zone 2 to be slightly more reliable than those in zone 4, and thus, the uncertainty
surrounding the concentration at node 400 is lower than it is for nodes 190 and 162.

5. Conclusion

[36] Because thorough hydrogeological investigations cost significant amounts of money and time, an efficient means of data acquisition and interpretation is valuable. One such means is expert knowledge extraction. However, the consideration of expert knowledge introduces epistemic uncertainty in addition to the existing stochasticity in hydrogeological parameters, such as hydraulic conductivity. Thus, appropriate characterizations of uncertainty should delineate aleatory uncertainty from epistemic uncertainty, as has been done in risk assessment and reliability engineering [Helton et al., 2000a, 2000b, 2004; Hofer et al., 2002; Helton and Oberkampf, 2004; Oberkampf et al., 2004].
[37] In this paper, we have sided with Ganoulis [1996], who also argued that probability theory alone cannot accomplish this. The hazard associated with applying traditional probability theory is that the opining expert may provide inaccurate and artificially precise characterizations of the random variable that best captures the naturally stochastic nature of hydraulic conductivity. Fuzzy set theory, on the other hand, has failed to find mainstream acceptance, perhaps as a result of its departure from probability theory.
[38] Random sets were introduced in this paper as an alternative and possibly more appropriate means for the characterization of both aleatory (the random variable) and epistemic (expert-characterized measurement error) uncertainty. This approach to uncertainty characterization provides a methodology for bounding an unknown random variable and properly capturing the imprecise nature of expert knowledge. In the provided example, expert knowledge was used to characterize the reducible uncertainty of individual hydraulic conductivity measurements. Random sets were collected from these individual measurement intervals and propagated through a groundwater flow and transport model using fuzzy set methodologies.
[39] Aside from avoiding the imposition of false precision, which is an unfortunate side effect of defining confidence intervals, it is important to note that uncertainty characterization via random sets eliminates the need for any probability model definition or assumption. Moreover, the approximation of the random sets by fuzzy sets and model execution with the fuzzy extension principle accomplishes what fuzzy set-based hydrogeological research has, as yet, failed to embrace: the combination of probability theory and fuzzy sets for the characterization of parameter uncertainty. If fuzzy set theory is to find a stronger foothold in engineering applications, researchers must endeavor to embrace hybrid frameworks that unite fuzzy sets with more traditional mathematical tools such as probability, as is illustrated by this paper.
[40] While the representation of model concentration estimates as upper and lower probabilities (plausibility and belief) provides a transparent comparison between the random variable and random set approaches defined above, the utility of data in such a form may not be immediately obvious. What does one do with an imprecise notion of a stochastic concentration estimate (Figure 5)? In fact, the representation of concentration estimates as possibility distributions (like the possibilistic hydraulic conductivity values in Figure 4), which contain the same information as, and can be transformed into, probability bounds, is quite intuitive and readily interpretable. Inspection of a possibility distribution reveals not only the most possible concentration value, but also a range of concentrations that are possible to lesser and varying degrees. In fact, algorithms have been developed to complement and refine possibilistic model estimates with new information [Fruhwirth-Schnatter, 1993; Pan and Klir, 1996; Yang, 1997; Ross et al., 2007, 2008].
[41] A benefit of separately characterizing aleatory and epistemic uncertainties is the possibility of identifying where and what type of additional information is most beneficial. The value of additional information is correlated with the reduction in reducible (epistemic) uncertainty realized by the consideration of the new information; this is easy to identify using belief and plausibility curves. The appropriate measure is the magnitude of epistemic uncertainty as indicated by the distance between the belief and plausibility curves. As noted above, the concentration estimate at node 190 is less precise than that at node 234 (Figure 5). Thus, additional data are more valuable at node 190, where the reducible uncertainty, and likewise the distance between belief and plausibility curves, is greater. Klir [2006] provides a set of measures to quantify the amount of information as well as uncertainty in random sets.
[42] Though the application presented above focuses upon the characterization of uncertainty in hydraulic conductivity measurements, other forms of uncertainty, such as boundary conditions, can also be considered. In the case of boundary conditions, which originate predominantly from expert insight, fuzzy sets can be used directly as a characterization methodology, bypassing the need for random sets. The propagation of these forms of uncertainty through a groundwater flow and transport model is executed as presented above.

[43] Acknowledgments. This material is based upon work supported by the Strategic Environmental Research and Development Program (SERDP). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of SERDP.

References
Apostolakis, G. (1990), The concept of probability in safety assessments of technological systems, Science, 250(4986), 1359–1364, doi:10.1126/science.2255906.
Bagtzoglou, A. C., A. Nedungadi, and B. Sagar (1996), A fuzzy rule-based model for flow simulation in heterogeneous media, in Computational Methods in Water Resources XI, pp. 629–637, Comput. Mech. Publ., Boston.
Bardossy, A., I. Bogardi, and W. E. Kelly (1989), Geostatistics utilizing imprecise (fuzzy) information, Fuzzy Sets Syst., 31, 311–328, doi:10.1016/0165-0114(89)90203-0.
Bardossy, A., I. Bogardi, and L. Duckstein (1990a), Fuzzy regression in hydrology, Water Resour. Res., 26(7), 1497–1508.
Bardossy, A., I. Bogardi, and W. E. Kelly (1990b), Kriging with imprecise (fuzzy) variograms I: Theory, Math. Geol., 22(1), 63–79, doi:10.1007/BF00890297.
Bardossy, A., I. Bogardi, and W. E. Kelly (1990c), Kriging with imprecise (fuzzy) variograms II: Application, Math. Geol., 22(1), 81–94, doi:10.1007/BF00890298.
Bauer, M. (1996), Approximation algorithms and decision making in the Dempster-Shafer theory of evidence: An empirical study, Int. J. Approximate Reasoning, 17, 217–237, doi:10.1016/S0888-613X(97)00013-3.
Choquet, G. (1954), Theory of capacities, Ann. Inst. Fourier, 5, 131–295.
Demicco, R. V., and G. J. Klir (2004), Fuzzy Logic in Geology, 347 pp., Elsevier, Amsterdam.
Domenico, P. A., and F. W. Schwartz (1990), Physical and Chemical Hydrogeology, 824 pp., John Wiley, New York.
Dou, C., W. Woldt, I. Bogardi, and M. Dahab (1995), Steady state groundwater flow simulation with imprecise parameters, Water Resour. Res., 31(11), 2709–2719, doi:10.1029/95WR02310.
Dou, C., W. Woldt, I. Bogardi, and M. Dahab (1997a), Numerical solute transport simulation using fuzzy sets approach, J. Contam. Hydrol., 27(1–2), 107–126, doi:10.1016/S0169-7722(96)00047-2.
Dou, C., W. Woldt, M. Dahab, and I. Bogardi (1997b), Transient groundwater flow simulation using a fuzzy set approach, Ground Water, 35(2), 205–215, doi:10.1111/j.1745-6584.1997.tb00076.x.
Dou, C., W. Woldt, and I. Bogardi (1999), Fuzzy rule-based approach to describe solute transport in the unsaturated zone, J. Hydrol., 220(1–2), 74–85, doi:10.1016/S0022-1694(99)00065-7.
Dubois, D., and H. Prade (1990), Consonant approximations of belief functions, Int. J. Approximate Reasoning, 4, 419–449, doi:10.1016/0888-613X(90)90015-T.
Dubois, D., and H. Prade (1991), Random sets and fuzzy interval analysis, Fuzzy Sets Syst., 42, 87–101, doi:10.1016/0165-0114(91)90091-4.
Fang, J. H., and H. C. Chen (1997), Fuzzy modeling and the prediction of porosity and permeability from the compositional and textural attributes of sandstone, J. Pet. Geol., 20(2), 185–204, doi:10.1111/j.1747-5457.1997.tb00772.x.
Ferson, S., V. Kreinovich, L. Ginzburg, D. S. Myers, and K. Sentz (2002), Constructing probability boxes and Dempster-Shafer structures, Tech. Rep. SAND2002-4015, Sandia Natl. Lab., Albuquerque, N. M.
Fruhwirth-Schnatter, S. (1993), On fuzzy Bayesian inference, Fuzzy Sets Syst., 60, 41–58, doi:10.1016/0165-0114(93)90288-S.
Ganoulis, J. (1996), Modeling hydrologic phenomena, Rev. Sci. Eau, 9(4), 421–434.
Goodman, I. R., and H. T. Nguyen (1985), Uncertainty Models for Knowledge-Based Systems: A Unified Approach to the Measurement of Uncertainty, 643 pp., Elsevier, Amsterdam.
Guan, J., and M. M. Aral (2004), Optimal design of groundwater remediation systems using fuzzy set theory, Water Resour. Res., 40, W01518, doi:10.1029/2003WR002121.
Helton, J. C., and W. L. Oberkampf (2004), Alternative representations of epistemic uncertainty (Guest Editorial), Reliab. Eng. Syst. Safety, 85(1–3), 1–10, doi:10.1016/j.ress.2004.03.001.
Helton, J. C., F. J. Davis, and J. D. Johnson (2000a), Characterization of stochastic uncertainty in the 1996 performance assessment for the Waste Isolation Pilot Plant, Reliab. Eng. Syst. Safety, 69(1–3), 167–189, doi:10.1016/S0951-8320(00)00031-4.
Helton, J. C., M.-A. Martell, and M. S. Tierney (2000b), Characterization of subjective uncertainty in the 1996 performance assessment for the Waste Isolation Pilot Plant, Reliab. Eng. Syst. Safety, 69(1–3), 191–204, doi:10.1016/S0951-8320(00)00032-6.
Helton, J. C., J. D. Johnson, and W. L. Oberkampf (2004), An exploration of alternative approaches to the representation of uncertainty in model predictions, Reliab. Eng. Syst. Safety, 85(1–3), 39–71, doi:10.1016/j.ress.2004.03.025.
Hofer, E., M. Kloos, B. Krzykacz-Hausmann, J. Peschke, and M. Woltereck (2002), An approximate epistemic uncertainty analysis approach in the presence of epistemic and aleatory uncertainties, Reliab. Eng. Syst. Safety, 77(3), 229–238, doi:10.1016/S0951-8320(02)00056-X.
Joslyn, C., and J. M. Booker (2004), Generalized information theory for engineering modeling and simulation, in Engineering Design and Reliability Handbook, edited by E. Nikolaidis et al., pp. 9-1–9-40, CRC, New York.
Joslyn, C., and S. Ferson (2004), Approximate representations of random intervals for hybrid uncertainty quantification in engineering modeling, in Sensitivity Analysis of Model Output (SAM2004), edited by K. M. Hanson and F. M. Hemez, pp. 453–469, Los Alamos Natl. Lab., Los Alamos, N. M.
Joslyn, C., and V. Kreinovich (2005), Convergence properties of an interval probabilistic approach to system reliability estimation, Int. J. Gen. Syst., 34(4), 465–482, doi:10.1080/03081070500033880.
Klir, G. J. (2006), Uncertainty and Information: Foundations of Generalized Information Theory, 499 pp., John Wiley, Hoboken, N. J.
Klir, G. J., and B. Yuan (1995), Fuzzy Sets and Fuzzy Logic: Theory and Applications, 574 pp., Prentice-Hall, Upper Saddle River, N. J.
Mathon, B., M. Ozbek, and G. F. Pinder (2009), Dempster-Shafer theory applied to uncertainty surrounding permeability, Math. Geosci., in press.
Oberkampf, W. L., J. C. Helton, C. A. Joslyn, S. F. Wojtkiewicz, and S. Ferson (2004), Challenge problems: Uncertainty in system response given uncertain parameters, Reliab. Eng. Syst. Safety, 85(1–3), 11–19, doi:10.1016/j.ress.2004.03.002.
O’Hagan, A., and J. E. Oakley (2004), Probability is perfect, but we can’t elicit it properly, Reliab. Eng. Syst. Safety, 85(1–3), 239–248, doi:10.1016/j.ress.2004.03.014.
Ozbek, M. M., and G. F. Pinder (2006), Non-probabilistic uncertainty in subsurface hydrology and its applications: An overview, Water Air Soil Pollut., 6, 35–46, doi:10.1007/s11267-005-9011-4.
Pan, Y., and G. J. Klir (1996), Bayesian inference of fuzzy probabilities, Int. J. Gen. Syst., 26(1–2), 73–90.
Prasad, R. K., and S. Mathur (2007), Groundwater flow and contaminant transport simulation with imprecise parameters, J. Irrig. Drain. Eng., 133(1), 61–70.
Ricciardi, K., G. F. Pinder, and G. P. Karatzas (1998), A new probability density function for hydraulic conductivity in optimal design, Eos Trans. AGU, 79(45), Fall Meet. Suppl., F280.
Ross, J., M. Ozbek, and G. F. Pinder (2006), Fuzzy Kalman filtering of hydraulic conductivity, in Computational Methods in Water Resources XVI [CD ROM], Comput. Mech. Publ., Boston.
Ross, J., M. M. Ozbek, and G. F. Pinder (2007), Hydraulic conductivity estimation via fuzzy analysis of grain size data, Math. Geol., 39(8), 765–780, doi:10.1007/s11004-007-9123-7.
Ross, J., M. Ozbek, and G. F. Pinder (2008), Kalman filter updating of possibilistic hydraulic conductivity, J. Hydrol., 354(1–4), 149–159, doi:10.1016/j.jhydrol.2008.03.005.
Ross, T. (2004), Fuzzy Logic with Engineering Applications, 654 pp., John Wiley, West Sussex, U. K.
Shafer, G. (1976), A Mathematical Theory of Evidence, 297 pp., Princeton Univ. Press, Princeton, N. J.
Srinivasan, G., D. M. Tartakovsky, B. A. Robinson, and A. B. Aceves (2007), Quantification of uncertainty in geochemical reactions, Water Resour. Res., 43, W12415, doi:10.1029/2007WR006003.
Yang, C. C. (1997), Fuzzy Bayesian inference, IEEE Trans. Syst. Man Cybern., 3, 2707–2712.
Zadeh, L. (1965), Fuzzy sets, Inf. Control, 8, 338–353, doi:10.1016/S0019-9958(65)90241-X.
Zadeh, L. (1978), Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets Syst., 1, 3–28, doi:10.1016/0165-0114(78)90029-5.
Zhang, Y., and G. Pinder (2003), Latin hypercube sample selection strategy for correlated random hydraulic conductivity fields, Water Resour. Res., 39(8), 1226, doi:10.1029/2002WR001822.

M. M. Ozbek, Environ International Corporation, 214 Carnegie Center, Princeton, NJ 08540, USA.
G. F. Pinder and J. L. Ross, Center for Groundwater Remediation Design, School of Engineering, University of Vermont, Burlington, VT 05405, USA. (jlross@uvm.edu)
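
Illustrative sketch. The plausibility and belief bounds, the vertex method of paragraph [32], and the bound separation of paragraph [41] can be illustrated with a minimal Python sketch. It is not the authors' implementation. The interval values, probability masses, and the one-line `seepage_velocity` model are hypothetical stand-ins for the expert-provided data and the finite element model, and all function names are assumptions introduced here. For simplicity, the sketch pushes each expert interval through the model directly, whereas the paper first approximates the random sets by fuzzy sets and applies the vertex method to those; the corner-evaluation step is the same idea in both cases.

```python
from itertools import product


def cdf_bounds(focal, x):
    """Return (belief, plausibility) of the event {value <= x}.

    focal: list of ((lo, hi), mass) pairs; masses are assumed to sum to 1.
    """
    # Belief counts only intervals that lie entirely at or below x.
    belief = sum(m for (lo, hi), m in focal if hi <= x)
    # Plausibility counts every interval that reaches down to x or below.
    plausibility = sum(m for (lo, hi), m in focal if lo <= x)
    return belief, plausibility


def vertex_method(model, box):
    """Approximate the range of `model` over a box from its corner values.

    box: one (lo, hi) interval per uncertain input.
    Exact when `model` is monotonic in each input over the box;
    otherwise it may underestimate the true range.
    """
    values = [model(*corner) for corner in product(*box)]
    return min(values), max(values)


def mean_gap(focal, grid):
    """Average plausibility-minus-belief over the points in `grid`."""
    gaps = []
    for x in grid:
        belief, plausibility = cdf_bounds(focal, x)
        gaps.append(plausibility - belief)
    return sum(gaps) / len(gaps)


def seepage_velocity(conductivity, gradient, porosity):
    """Hypothetical stand-in for the flow and transport model."""
    return conductivity * gradient / porosity


# Hypothetical expert intervals for hydraulic conductivity (m/d) with masses.
k_focal = [((8.0, 12.0), 0.5), ((6.0, 15.0), 0.3), ((4.0, 20.0), 0.2)]

# Propagate each interval; the gradient and porosity intervals are also assumed.
v_focal = [
    (vertex_method(seepage_velocity, [k, (0.01, 0.02), (0.25, 0.35)]), mass)
    for k, mass in k_focal
]

grid = [0.05 * i for i in range(41)]  # velocities from 0 to 2 m/d
print("bounds on P(v <= 0.5):", cdf_bounds(v_focal, 0.5))
print("mean bound separation:", mean_gap(v_focal, grid))
```

Under these assumptions, a wider set of expert intervals produces a larger mean separation between the two curves, which is the sense in which paragraph [41] uses the distance between belief and plausibility to indicate where additional data would help.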
