Environment International 52 (2013) 17–28
Water quality analysis in rivers with non-parametric probability distributions and fuzzy inference systems: Application to the Cauca River, Colombia
William Ocampo-Duque a,⁎, Carolina Osorio a, Christian Piamba a, Marta Schuhmacher b, José L. Domingo c
a Faculty of Engineering, Pontificia Universidad Javeriana, Cll. 18 #118-250, Cali, Colombia
b Department of Chemical Engineering, Universitat Rovira i Virgili, Av. Países Catalans 26, 43007 Tarragona, Spain
c Laboratory of Toxicology and Environmental Health, School of Medicine, IISPV, Universitat Rovira i Virgili, Sant Llorens 21, 43201 Reus, Spain
Article info
Article history: Received 20 April 2012; Accepted 16 November 2012; Available online 23 December 2012
Keywords: Water quality; Non-parametric density estimators; Uncertainty; Fuzzy inference systems; Monte Carlo simulation; Cauca River (Colombia)
Abstract
The integration of water quality monitoring variables is essential in environmental decision making. Nowadays, advanced techniques to manage subjectivity, imprecision, uncertainty, vagueness, and variability are required in such a complex evaluation process. We here propose a probabilistic fuzzy hybrid model to assess river water quality. Fuzzy logic reasoning has been used to compute a water quality integrative index. By applying a Monte Carlo technique, based on non-parametric probability distributions, the randomness of model inputs was estimated. Annual histograms of nine water quality variables were built with monitoring data systematically collected in the Colombian Cauca River, and probability density estimations using the kernel smoothing method were applied to fit data. Several years were assessed, and river sectors upstream and downstream of the city of Santiago de Cali, a big city with basic wastewater treatment and high industrial activity, were analyzed. The probabilistic fuzzy water quality index was able to explain the reduction in water quality, as the river receives a larger number of agricultural, domestic, and industrial effluents. The results of the hybrid model were compared to traditional water quality indexes. The main advantage of the proposed method is that it considers flexible boundaries between the linguistic qualifiers used to define the water status, with the belongingness of water quality to the diverse output fuzzy sets or classes reported as percentiles and histograms, which allows a better classification of the real water condition. The results of this study show that fuzzy inference systems integrated with stochastic non-parametric techniques may be used as complementary tools in water quality indexing methodologies.
© 2012 Elsevier Ltd. All rights reserved.
⁎ Corresponding author. Tel.: +57 2 321 8200. E-mail address: willocam@javerianacali.edu.co (W. Ocampo-Duque).
0160-4120/$ – see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.envint.2012.11.007
1. Introduction
Despite the huge numeric datasets collected nowadays, it is well known that the assessment of water quality still relies heavily upon subjective judgments and interpretation. Linguistic computations should be considered together with numerical scoring systems to give appropriate water quality classifications (Ocampo-Duque et al., 2006). There is no doubt that the introduction of intelligent linguistic operations to analyze databases is producing self-interpretable water quality indicators for a better assessment. Moreover, to simplify and improve the understanding and the interpretation of water quality, methodologies for integration, aggregation, and fusion of data must be developed (Sadiq and Tesfamariam, 2007). Data aggregation is not simply a problem of calculations; rather it is a problem of judgment. Therefore, it deals not only with uncertainty or variability related to random phenomena, but also with the subjective uncertainty related to linguistic, subjective, vague and imprecise concepts faced in decision-making processes. Consequently, Fuzzy Logic and Monte Carlo based methods are highly recommended in water quality management since they are appropriate tools to deal with all diverse types of uncertainties (Chowdhury et al., 2009; Darbra et al., 2008).
Fuzzy inference systems (FIS) have recently attracted the attention of environmental scientists as suitable platforms to evaluate multiple criteria related to water quality, and other environmental conditions (Marchini et al., 2009). A common application of FIS has been the integration of water quality variables to design suitable integrative systems, which are successfully compared to traditional indexing techniques. Water quality is a vague term that cannot be easily described using crisp data or limited indicators. Instead, water quality should be considered as a fuzzy term appropriately estimated with linguistic computations (Mahapatra et al., 2011). The amount of linguistic “if-then” rules, as well as the number of indicators considered, seems to be definitive for a robust and reliable evaluation (Lermontov et al., 2009). In a previous study, we developed a structured fuzzy hierarchy to interconnect various partial inference engines intended to define water quality (Ocampo-Duque et al., 2006). Here, the FIS contained an analytical hierarchy process to deal with the relative weight of the variables involved in the evaluation process. Adaptive and cooperative neuro-FIS models have also been implemented to provide water
quality management solutions (Ocampo-Duque et al., 2007, 2012). An integrated risk assessment methodology, based on the weight of evidence approach, which implemented a FIS in order to hierarchically aggregate a set of biological indicators following the precepts of the Water Framework Directive, was recently described (Gottardo et al., 2011). Also, Bayesian networks and probabilistic neural networks have been recently used to train a water quality index supported in FIS (Nikoo et al., 2011).
Probabilistic approaches are commonly applied in environmental analysis and modeling to control uncertainty propagation. Parameter uncertainty is a major aspect of the model-based estimation of the risk of human exposure to pollutants. The Monte Carlo method is extensively applied although it relies heavily on a statistical representation of available information. The probability distributions of each variable are defined according to the Bayesian theory (Ramaswami et al., 2005). For instance, in human health risk assessment some variables are usually managed as probability density functions (PDF) (Legay et al., 2011; Mari et al., 2009). Probabilistic Monte Carlo computations are powerful tools for water quality modeling (Cardona et al., 2011; Misha, 2011). However, their use in water quality indexing systems is scarce. A new probabilistic water quality index intended for use in the production of drinking water is described by Beamonte-Cordoba et al. (2010). In this approach, each water quality variable is considered random with normal distribution. Likewise, classical water quality indexes available worldwide could be computed with Monte Carlo methods, assuming probability distributions, or estimating them from monitoring data to provide a more comprehensive evaluation.
Recently, fuzzy-probabilistic methods have emerged to deal with complex problems related to water management (Chen et al., 2010; Zhang K. et al., 2009; Zhang X. et al., 2009). Hybrid methods allow addressing model parameter uncertainty in situations where available information is not sufficient to identify statistically representative distributions. Therefore, they assign fuzzy numbers when the amount of data is short, or when the information about the confidence intervals of variables and parameters is unknown (Baudrit et al., 2007; Kentel and Aral, 2005). For example, Faybishenko (2010) showed a recent application of combining probability and possibility theory for simulating a soil water balance. Moreover, fuzzy-stochastic hybrid methods are currently used to solve optimization and management issues associated with water pollution (Guo et al., 2010; Rehana and Mujumdar, 2009; Zhang K. et al., 2009; Zhang X. et al., 2009). In order to preserve the origin of uncertainties, some methods partitioning the total variance in risk analysis have been developed (Kumar et al., 2009). Likewise, current methodologies are handling both random uncertainty and epistemic uncertainty, because they can combine the fuzzy set theory and Monte Carlo simulations (Li and Zhang, 2010; Li et al., 2007).
The method proposed in the present study is somewhat inspired by the formal concept of fuzzy randomness, which was first introduced to structural analysis in civil engineering (Möller and Beer, 2004; Möller et al., 2002). The idea behind such concept is that stochastic as well as non-stochastic uncertainty is treated on the basis of the super-ordinated uncertainty model fuzzy randomness. This new uncertainty model contains the special cases of real valued random variables and fuzzy variables, and permits both uncertainty characteristics to be taken into account simultaneously. A hybrid stochastic fuzzy model was also applied for in-flight gas turbine engine diagnostics, where the random fluctuations of performance parameters were modeled with PDF while the complex functional relationships were dealt with Neural Networks with FIS structure, commonly called ANFIS (Ghiocel and Altmann, 2001). In the present study, the objective was to model variables with two layers of analysis for uncertainty estimation, one inner layer of FIS using fuzzy membership functions and rules, and one outer layer using Monte Carlo simulation. Randomness in water quality input variables was dealt with probability theory. Then, decision about the water quality status was made by integration of these variables with the help of a FIS. In that way, we introduce a combined stochastic fuzzy model to assess water quality in rivers. The purpose of this research was to manage both the random nature of input variables and the linguistic subjectivity present in the water quality indexing process. A case study, with information from a Colombian River, was selected to explain the application of the proposed method and its benefits. The results are here reported. Comparison with common indexes is also discussed. Consequently, the simulation outputs involved both kinds of uncertainty: fuzzy and probabilistic.

2. Methods

2.1. Case study: the Cauca River

The Cauca River is one of the most important water resources in Colombia. It has a length of 1350 km, with a basin area of approximately 63,300 km². It goes across the country from south to north through nine departments and a number of cities and towns without appropriate wastewater treatment plants (WWTP). In fact, there are municipalities without any kind of treatment of their sewage. In the Department of Valle del Cauca there is a notable deterioration of water quality in the river, especially when it receives discharges from the City of Santiago de Cali. In this zone, a number of big river releases from domestic, agricultural, and industrial activities are present. The City of Santiago de Cali, with more than two million inhabitants and several companies located at Yumbo Industrial Park, is the main source of river pollution. After crossing these areas, the organic loads are high enough to diminish dissolved oxygen levels below 1 mg/L, compromising the ecosystems living downstream and producing a clear reduction in its ecological status. Although the environmental concerns about water pollution in the river are commonly expressed by people and expert scientists, few actions to recover the river to its original good ecological status are undertaken.
For the current assessment, a water quality monitoring database including nineteen sampling sites was used. Data were provided by the regional environmental protection agency, called the CVC Corporation (www.cvc.gov.co). Data from ten years, considering four sampling campaigns per year, were used. Fig. 1 shows the sampling points where the data were collected: SP1 (Antes Suarez), SP2 (Antes Ovejas), SP3 (Antes Timba), SP4 (Paso de La Balsa), SP5 (Paso de La Bolsa), SP6 (Hormiguero), SP7 (Antes Navarro), SP8 (Juanchito), SP9 (Paso del Comercio), SP10 (Yumbo - Puerto Isaacs), SP11 (Paso de la Torre), SP12 (Vijes), SP13 (Yotoco), SP14 (Mediacanoa), SP15 (Puente Río frío), SP16 (Puente Guayabal), SP17 (Puente la Victoria), SP18 (Puente Anacaro), SP19 (Puente La Virginia) (CVC Corporation, 2004).

2.2. Water quality analysis and traditional indexes

According to the objectives of this study, the Cauca River was divided into three river sections: Section I (SP1 to SP6), Section II (SP7 to SP14), and Section III (SP15 to SP19). Thereby, the division includes a relatively less impacted area, an area highly impacted because of the discharges from the city of Santiago de Cali and its industrial parks, and an area where these impacts should be reduced due to natural attenuation. Table 1 displays the main statistics of water quality variables used in this study. These were: dissolved oxygen (DO), fecal coliforms (FC), biochemical oxygen demand (BOD5), temperature (T), phosphates (PO4), nitrates (NO3), turbidity (TUR), total solids (TS), and hydrogen potential (pH). Three equally spaced years are displayed. Sampling campaigns included monitoring data in field (pH, DO, T), and laboratory measurements of composite samples. The sampling campaigns were carried out during the same day in all sites. The 19 sites are monitored in 4 periods: February–March, May–June, July–August, and October–November, seeking stable hydrological conditions which are complex in tropical regions.
Traditional water quality indexes are designed to integrate water quality variables or indicators to provide a class or score about physicochemical and biological water quality status. It is intended that
Fig. 1. Map of the studied area: the Cauca River in the Valle Department (Colombia). Map labels: sampling points SP1 to SP19 along the Cauca River; departments of Valle del Cauca, Risaralda, Choco, and Quindio; scale bar 0 to 64,000 m.
they are useful in environmental decision making. A commonly referred water quality index was developed by the National Sanitation Foundation of United States (NSF_WQI) (Brown et al., 1970). It was defined for any use of water by simply determining the specifications required by that use. This index included various physical, chemical and biological characteristics. For each variable, the index included a quality-value function that expressed the equivalence between the variable and its quality level. The strongly subjective character of the equivalence functions is a problem with that index (Beamonte-Cordoba et al., 2010). The NSF_WQI is computed with Eq. (1),

$$\mathrm{NSF\_WQI} = \sum_{i=1}^{N} w_i Q_i \qquad (1)$$

where wi is the weight of the variable, usually defined by experts, N is the number of variables, and Qi is the quality value function of the variable i. At local level, in the Cauca river basin, the CVC Corporation also uses the ICAUCA index to evaluate the water status (Torres et al., 2010). This index is computed according to Eq. (2),

$$\mathrm{ICAUCA} = \prod_{i=1}^{N} I_i^{w_i} \qquad (2)$$

where Ii is a special function defined for the variable i to transform the real value to a normalized quality number. The functions to calculate both indexes may be consulted in (CVC Corporation, 2004).

2.3. Fuzzy inference systems

It has been recently shown that linguistic computations used in fuzzy inference systems (FIS) are superior to algebraic common expressions for water quality indexing evaluation (Lermontov et al.,
Table 1
Basic statistics of water quality variables involved in the study.
Indicator, abbr., units Year Section I Section II Section III
X s N Min Max X s N Min Max X s N Min Max
Dissolved oxygen, DO, % Sat. 2002 76.09 20.71 24 17.11 83.26 27.90 25.45 32 2.76 77.34 34.64 8.84 20 20.50 55.98
2006 73.08 14.87 24 37.99 94.42 47.95 26.16 32 7.47 85.32 32.00 7.51 20 18.20 45.90
2010 72.16 18.65 24 22.65 94.68 35.70 23.96 32 7.02 75.30 37.22 15.68 20 8.81 69.32
Fecal coliforms, FC, CFU/100 mL 2002 1.51E+05 5.13E+05 24 0.00E+00 2.40E+06 5.97E+07 1.01E+08 32 2.40E+04 2.40E+08 1.82E+05 5.30E+05 20 2.40E+03 2.40E+06
2006 1.12E+04 2.52E+04 24 2.30E+01 1.10E+05 2.72E+05 4.96E+05 32 7.50E+03 2.40E+06 1.79E+04 2.98E+04 20 2.40E+02 1.10E+05
2010 1.05E+05 2.61E+05 24 7.30E+02 9.30E+05 5.35E+06 1.62E+07 32 9.10E+04 9.30E+07 3.15E+05 7.16E+05 20 2.30E+03 2.40E+06
Biochemical oxygen demand, BOD5, mg/L 2002 1.55 1.12 24 0.30 5.30 5.28 2.92 32 1.30 13.80 2.79 0.83 20 1.20 4.30
2006 1.87 0.76 24 1.09 4.02 3.96 1.51 32 1.75 7.52 3.44 1.08 20 2.05 5.77
2010 8.51 3.31 24 5.33 16.00 19.68 6.99 32 9.82 36.90 20.66 24.20 20 6.73 121.00
Temperature, T, °C 2002 20.4 2.8 24 15.0 24.2 23.8 2.0 32 20.0 27.0 25.2 0.9 20 22.8 26.7
2006 21.3 1.9 24 18.0 28.8 21.1 1.2 32 18.0 23.0 24.0 1.4 20 21.5 26.3
2010 22.4 0.8 24 20.9 24.3 24.4 1.3 32 22.2 27.1 25.3 1.5 20 22.6 27.5
Phosphates, PO4, mg/L 2002 0.062 0.008 24 0.060 0.099 0.076 0.039 32 0.060 0.216 0.065 0.015 20 0.060 0.123
2006 0.034 0.010 24 0.021 0.050 0.099 0.047 32 0.031 0.241 0.083 0.026 20 0.053 0.142
2010 0.069 0.016 24 0.064 0.125 0.089 0.025 32 0.064 0.157 0.084 0.045 20 0.064 0.271
Nitrates, NO3, mg/L 2002 0.30 0.20 24 0.11 1.05 0.26 0.25 32 0.04 1.53 0.43 0.20 20 0.07 0.78
2006 0.42 0.02 24 0.40 0.44 0.57 0.10 32 0.45 0.69 0.57 0.38 20 0.40 2.01
2010 0.84 0.89 24 0.11 2.57 0.98 1.10 32 0.11 3.29 1.39 1.40 20 0.11 4.28
Turbidity, TUR, NTU 2002 30.8 20.0 24 3.0 75.0 67.3 53.4 32 30.0 300.0 61.2 34.5 20 29.0 185.0
2006 110.8 117.5 24 9.0 349.0 143.1 125.4 32 18.0 404.0 201.3 231.6 20 21.0 892.0
2010 79.1 95.1 24 2.0 344.0 131.1 135.6 32 17.0 670.0 107.4 75.4 20 23.0 265.0
Total solids, TS, mg/L 2002 131.33 63.25 24 68.00 310.00 172.94 51.33 32 68.00 302.00 203.55 80.53 20 136.00 406.00
2006 181.25 108.37 24 59.00 396.00 270.09 134.88 32 129.00 811.00 338.35 156.07 20 191.00 901.00
2010 163.29 145.42 24 58.00 721.00 233.25 121.32 32 116.00 621.00 256.71 139.37 20 0.08 551.00
Hydrogen potential, pH, (−) 2002 6.93 0.47 24 5.76 7.98 6.98 0.19 32 6.58 7.27 6.90 0.31 20 6.22 7.48
2006 6.88 0.69 24 5.30 7.62 6.82 0.32 32 5.65 7.39 7.04 0.36 20 6.40 7.54
2010 7.15 0.30 24 6.45 7.65 7.05 0.19 32 6.68 7.32 7.36 0.33 20 6.72 7.90
Note: X is the median, s is the standard deviation, N is the number of data, Min is the minimum, Max is the maximum. Abbr. is the abbreviation of the water quality variable.
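The two traditional aggregation forms of Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The weights and sub-index scores below are hypothetical placeholders, not the values used by the NSF or the CVC Corporation, and the function names are introduced only for this illustration.

```python
import math

def nsf_wqi(weights, q_values):
    """Weighted arithmetic aggregation, Eq. (1): sum of w_i * Q_i."""
    return sum(w * q for w, q in zip(weights, q_values))

def icauca(weights, i_values):
    """Weighted geometric aggregation, Eq. (2): product of I_i ** w_i."""
    return math.prod(i ** w for w, i in zip(weights, i_values))

# Hypothetical weights (summing to 1) and sub-index scores on a 0-100 scale.
weights = [0.5, 0.3, 0.2]
scores = [90.0, 60.0, 10.0]

arithmetic = nsf_wqi(weights, scores)   # additive form
geometric = icauca(weights, scores)     # multiplicative form
```

With weights that sum to one, the multiplicative form never exceeds the additive form, so a single very poor sub-index lowers the product-type index more strongly than the sum-type index.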
2009; Ocampo-Duque et al., 2006). Water quality assessment is a subjective task that must be carried out with tools able to manage such subjectivity and imprecision. Here, linguistic operations in a FIS frame are proposed to compute water quality by integrating parameters within an inference engine. Thus, a methodology to design a water quality index is proposed. It could be adapted to diverse purposes with different number of inputs. In this sense, a FIS is a mapping process from given water quality inputs to desired water quality index. The FIS involves three important parts: membership functions, fuzzy set operations, and inference rules. The Fuzzy Logic toolbox of MATLAB (R2010) was used to build and compute the FIS.
A FIS was parameterized to assess water quality considering nine input indicators (Table 1), using the same indicators as those included in the well-known NSF_WQI and the ICAUCA. The FIS output is a fuzzy water quality (FWQ) index. Table 2 summarizes the parameters of the membership functions. Five fuzzy sets were defined for input variables: “Very low”, “low”, “medium”, “high”, and “extreme”. In turn, the output water quality was defined according to five fuzzy sets (qualifiers): “poor”, “bad”, “regular”, “good”, and “excellent”. Gaussian functions were used at low, medium, high, bad, regular and good fuzzy sets, having the following expressions:

$$\mu(x; s, c) = \exp\left[\frac{-(x-c)^2}{2s^2}\right] \qquad (3)$$

where s and c are the parameters shown in Table 2, x is the value of the input, and μ is the belongingness (or membership) of the input to the respective fuzzy set, which is a number between 0 and 1, meaning none and total membership, respectively. The parameter c represents the center of the function in the abscissa where the membership value is 1, and the parameter s defines the width of the function. It is important to point out that in fuzzy logic reasoning an x value may belong to more than one fuzzy set. Z-shape functions were used in very low and poor fuzzy sets, having the following equations to represent them:

$$\mu(x; a, b) = \begin{cases} 1, & x \le a \\ 1 - 2\left(\dfrac{x-a}{b-a}\right)^2, & a \le x \le \dfrac{a+b}{2} \\ 2\left(\dfrac{x-b}{b-a}\right)^2, & \dfrac{a+b}{2} \le x \le b \\ 0, & x \ge b \end{cases} \qquad (4)$$

where a and b are the parameters displayed in Table 2. These parameters locate the extremes of the sloped portion of the curve. Finally, S-shape functions were used in extreme and excellent fuzzy sets, having the following equations to represent them:

$$\mu(x; d, e) = \begin{cases} 0, & x \le d \\ 2\left(\dfrac{x-d}{e-d}\right)^2, & d \le x \le \dfrac{d+e}{2} \\ 1 - 2\left(\dfrac{x-e}{e-d}\right)^2, & \dfrac{d+e}{2} \le x \le e \\ 1, & x \ge e \end{cases} \qquad (5)$$

where d and e are the parameters shown in Table 2. These parameters locate the extremes of the sloped portion of the curve.
The design and selection of membership functions from intervals of the input variables is a very subjective task. The main questions arise from the number of fuzzy sets used to divide the ranges of the variables, and the shape of these sets. A division in five fuzzy sets seems appropriate. However, the number of rules may considerably increase, especially if rules with more than one antecedent are desired. In continuous variables the number of fuzzy sets to represent any range could be selected from three to seven, five being a reasonable number. The shape of the functions selected above was considered because of the low number of parameters required. Notwithstanding, other functions could be also used, perhaps requiring more than two parameters. The Colombian Decree 1594/1984 and Resolution 2115/2007, the Spanish Decree 927/1988, the boundaries taken in the Lermontov fuzzy water quality index (Lermontov et al., 2009) and the limits set by our previous study (Ocampo-Duque et al., 2006) were used to define the ranges from very low to extreme that water quality variables could take. Then, the division in five qualifiers was given trying to equally divide the universe of discourse with appropriate fuzzy intersection between sets.
The inference engine is where the linguistic computations are executed. It was created considering two kinds of rules: rules with only one antecedent, and rules with two antecedents or water quality variables. Forty-five (45) rules were written with one antecedent and one consequent (9 water quality variables per 5 fuzzy sets or options, from very low to extreme). Nine hundred (900) rules were written with two antecedents and one consequent. All the likely combinations without repetitions were considered ((81 − 9)/2 = 36 pair combinations and 25 options). In each rule, the most conservative output was considered, and the importance of the rule was defined according to the importance of the variables involved. Rules with DO, pH, BOD5 and FC received a weight of 1.0. Rules with NO3, PO4 and T received a weight of 0.75. Finally, rules with TUR and TS received a weight of 0.5. More complex rules with three or more antecedents could be created,
Table 2
Parameters of the fuzzy inference system.
Indicator* Units Membership function parameters
“Very Low” “Low” “Medium” “High” “Extreme”
a b s c s c s c d e
Z-shape Gaussian Gaussian Gaussian S-shape
DO % Sat. 0.0 27.8 15.0 31.3 15.0 58.2 15.0 84.1 70.0 110.0
FC CFU/100 mL 58.9 272.0 143.3 337.5 143.3 675.0 143.3 1013.0 1078.0 1284.0
BOD5 mg/L 0.0 2.2 1.2 1.5 1.2 3.5 1.2 5.2 5.0 6.9
T °C 15.1 19.9 2.6 18.9 2.6 23.0 2.6 27.2 25.1 30.0
PO4 mg/L 0.0 0.15 0.07 0.14 0.07 0.25 0.07 0.4 0.3 0.5
NO3 mg/L 0.0 3.8 1.6 3.2 1.6 6.1 1.6 9.5 7.2 12.0
TUR NTU 3.0 30.7 15.0 33.5 15.0 70.7 15.0 107.4 88.7 136.8
TS mg/L 25.6 230.4 80.0 150.6 80.0 300.0 80.0 450.0 395.0 642.0
pH – 5.0 6.5 0.5 6.4 0.5 7.5 0.5 8.5 8.0 9.5
FWQ – “Poor” “Bad” “Regular” “Good” “Excellent”
a b s c s c s c d e
Z-shape Gaussian Gaussian Gaussian S-shape
0.0 38.9 10.5 35.5 11.6 60.0 9.4 81.4 68.2 100.0
*DO: dissolved oxygen, FC: fecal coliforms, BOD5: biochemical oxygen demand, T: temperature, PO4: phosphates, NO3: nitrates, TUR: turbidity, TS: total solids, FWQ: Fuzzy water
quality index. a, b, s, c, d, and e, are the parameters to build the membership functions according to Eqs. (3)–(5).
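As an illustration of how the Table 2 parameters define the fuzzy sets, the following Python sketch evaluates the Gaussian, Z-shape, and S-shape membership functions for the dissolved oxygen (DO) row. It is not the authors' MATLAB implementation; the function names are hypothetical, and the S-shape is written in the standard form that reaches a membership of 1 at x = e, which is an assumption about the intended curve.

```python
import math

def gauss_mf(x, s, c):
    """Gaussian membership with width s and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * s ** 2))

def z_mf(x, a, b):
    """Z-shape membership: 1 below a, 0 above b, smooth in between."""
    if x <= a:
        return 1.0
    if x <= (a + b) / 2.0:
        return 1.0 - 2.0 * ((x - a) / (b - a)) ** 2
    if x <= b:
        return 2.0 * ((x - b) / (b - a)) ** 2
    return 0.0

def s_mf(x, d, e):
    """S-shape membership: 0 below d, 1 above e, smooth in between."""
    if x <= d:
        return 0.0
    if x <= (d + e) / 2.0:
        return 2.0 * ((x - d) / (e - d)) ** 2
    if x <= e:
        return 1.0 - 2.0 * ((x - e) / (e - d)) ** 2
    return 1.0

# DO row of Table 2 (% saturation).
DO_SETS = {
    "very low": lambda x: z_mf(x, 0.0, 27.8),
    "low": lambda x: gauss_mf(x, 15.0, 31.3),
    "medium": lambda x: gauss_mf(x, 15.0, 58.2),
    "high": lambda x: gauss_mf(x, 15.0, 84.1),
    "extreme": lambda x: s_mf(x, 70.0, 110.0),
}

def fuzzify(x, sets):
    """Return the membership of a crisp value x in every fuzzy set."""
    return {label: mf(x) for label, mf in sets.items()}

memberships = fuzzify(58.2, DO_SETS)
```

A DO value at the center of the "medium" set has full membership there and partial membership in the neighboring "low" and "high" sets, which is the overlap between qualifiers that the text describes.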
Fig. 2. Conceptual integration of non-parametric Monte Carlo modeling with a Fuzzy Inference System.
although the improvements are not significant. Rules and ranges were tested with several environmental experts from the CVC Corporation and Academia. Some examples of rules are shown:

If “fecal coliform” is very low then “water quality” is excellent,
If “dissolved oxygen” is high then “water quality” is good,
If “phosphate” is medium then “water quality” is regular,
If “nitrate” is high then “water quality” is bad,
If “BOD5” is very high then “water quality” is poor,
If “fecal coliform” is very low and “dissolved oxygen” is very high then “water quality” is excellent.

Computations with words within the inference engine followed standard fuzzy set operations. These are: union (OR), intersection (AND) and additive complement (NOT). If two fuzzy sets A and B are defined on the universe X, for a given element x belonging to X, the following operations can be carried out:

$$\text{Intersection, AND:}\quad \mu_{A \cap B}(x) = \min\left(\mu_A(x), \mu_B(x)\right) \qquad (6)$$

$$\text{Union, OR:}\quad \mu_{A \cup B}(x) = \max\left(\mu_A(x), \mu_B(x)\right) \qquad (7)$$

$$\text{Additive complement, NOT:}\quad \mu_{\bar{A}}(x) = 1 - \mu_A(x) \qquad (8)$$

Vector inputs are fuzzified to enter to the inference engine using the membership functions. When there are two antecedents, fuzzy logic operations are applied to give a degree of support for these rules. In rules with one antecedent, their degree of support is the degree of membership. The degree of support for the entire rule is used to shape the output fuzzy set. The consequent of a fuzzy rule assigns an entire fuzzy set to the
Fig. 3. Propagation of uncertainty when a probabilistic variable is introduced to a fuzzy inference system.
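The fuzzy set operations of Eqs. (6)–(8) and the degree of support of a rule can be sketched in a few lines of Python. The membership values are hypothetical, and scaling the degree of support by the rule weight is an assumption about how the weights of 1.0, 0.75 and 0.5 enter the computation; the source does not spell this step out.

```python
def fuzzy_and(mu_a, mu_b):
    """Intersection, Eq. (6)."""
    return min(mu_a, mu_b)

def fuzzy_or(mu_a, mu_b):
    """Union, Eq. (7)."""
    return max(mu_a, mu_b)

def fuzzy_not(mu_a):
    """Additive complement, Eq. (8)."""
    return 1.0 - mu_a

def rule_support(antecedent_memberships, weight=1.0):
    """Degree of support of a rule: AND of all antecedents.

    Multiplying by the rule weight is an illustrative assumption.
    """
    support = antecedent_memberships[0]
    for mu in antecedent_memberships[1:]:
        support = fuzzy_and(support, mu)
    return weight * support

# Hypothetical memberships: "fecal coliform is very low" = 0.8,
# "dissolved oxygen is high" = 0.6.
one_antecedent = rule_support([0.8])
two_antecedents = rule_support([0.8, 0.6])
weighted = rule_support([0.8, 0.6], weight=0.75)
```

For a rule with one antecedent the support equals the membership itself, while a two-antecedent rule is limited by its weakest antecedent.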
output. This fuzzy set is represented by a membership function, which is chosen to indicate the qualities of the consequent. If the antecedent is only partially true (i.e., μ < 1), the output fuzzy set is truncated at this value. This procedure is called the minimum implication method. Since decisions are based on the testing of all the rules in the system, these must be aggregated to make a final decision. Therefore, output fuzzy sets of each rule are aggregated to a single output fuzzy set that may have a complex geometry. The aggregation procedure used here was the maximum method, which is the union of all truncated output fuzzy sets (Mathworks, 2012). The final step was defuzzification to provide a numerical water quality score. A convenient way to give FIS outputs is also by means of the linguistic fuzzy sets with their respective membership degrees. In the current study, the centroid method was used for defuzzification. It delivers a numeric score to water quality, so

$$\mathrm{FWQ} = \frac{\int \mu(z)\, z\, dz}{\int \mu(z)\, dz} \qquad (9)$$

where FWQ is a fuzzy water quality index which is a score between 0 and 100, and z is the independent variable of the output fuzzy set in each rule. Fuzzy water quality indexes have recently been proposed (Lermontov et al., 2009; Ocampo-Duque et al., 2006).

2.4. Monte Carlo simulation of FIS

When the fuzzy water quality index is stochastically computed with Monte Carlo method, a stochastic fuzzy water quality index is obtained. The stochastic model used in this study is described below. Obviously, the building of a FIS for water quality analysis is extremely subjective. The number of input variables should be considerably higher than nine, since the number of physicochemical, microbiological, and biological variables measured nowadays in rigorous water protection agencies may be greater than hundreds. The creation of appropriate fuzzy rules is an important issue for increasing the preciseness of the simulation. A considerable number of fuzzy rules may make the decision from the inference engine more accurate. However, if the number of input variables increases, the number of rules would also increase exponentially to thousands or millions, which would make the model far more complex, requiring powerful computation to deliver a single score under stochastic conditions. Moreover, the number and form of the rules, as well as the shape of the ranges of the membership functions, are also subjective complex decisions, which could be designed for specific, regional and/or local requirements. Therefore, we here propose a convenient method to build a FIS for water quality evaluation purposes rather than a standard index for use anywhere. Because of the high random uncertainty in water quality variables, due to experimental measurement, human errors, and propagation of error due to the methods used to measure the water quality variable, we propose treating the FIS inputs as stochastic. The conceptual model is depicted in Fig. 2.
The algorithm for Monte Carlo simulation assumes each computation with the FIS as deterministic. A vector with water quality variables is randomly selected according to its probability distribution over the domain. Then the corresponding water quality score to that vector is computed with the FIS. The computation is carried out a consistent number of times to cover the entire range of likely inputs, and to build a well-defined histogram of the water quality scores. Random numbers were generated with the inverse transform method. The quantity of random numbers was set at 10 000 in all cases.
Fig. 3 outlines the propagation of uncertainty when a probabilistic variable is introduced to a FIS. A and C are fuzzy sets. Arrows point out the information flow. Suppose a measured water quality variable X, continuous, positive, and random, with probability density function, f(X) ~ PDF, as shown in Fig. 3, to be introduced to the fuzzy system. Let X̃, Q1, and Q3 be the median, the 25th and 75th percentiles of the data, respectively. When X is introduced to the fuzzy system, the probabilistic or random uncertainty is transformed into fuzzy uncertainty. First, X̃ is fuzzified to take the membership value μA(X̃). μA(X̃) is the degree of membership of X̃ to the set A. Then, μA(X̃) is transformed to μC(y) according to the rule:

If X is A then y is C. (10)

Such transformation is computed according to the implication method in fuzzy reasoning. In the case of the Figure, the reasoning leads to the horizontal projection line from left to right, or μA(X̃) = μC(y), as shown in Fig. 3. The shape and size of the output fuzzy set is defined by the μC(y) value where the output set is truncated. The area of the output fuzzy set is shown in dark gray. Observe that ΔU is the uncertainty in the height of the output fuzzy set after fuzzification of the random variable X when the interquartile range (IQR) is computed. The area of the output fuzzy set in every rule is required in the centroid method to provide the final output single water quality score. The centroid computes the center of area under the curve resulting after aggregation of all fuzzy sets within the inference engine. Therefore, the uncertainty in the area of the fuzzy set does affect the water quality score. The Monte Carlo method allows computing the final effect over the propagation of uncertainty when dealing with a random input variable in a FIS to provide a final defuzzified water quality score, which also leaves the system with an empirical probability
density function. Thus, the shapes of the output fuzzy sets vary with each run as a random input is chosen. Propagation of uncertainty is expressed in this context as the transformation of probabilistic uncertainty into fuzzy uncertainty through every membership function and rule evaluation. Such propagation is graphically represented as the uncertainty in the area of the output fuzzy set (ΔU) when the random input takes a value between Q1 and Q3. To compute such uncertainty, deterministic computations of the FWQ index are performed depending on the probability of water quality inputs randomly chosen within the statistical range of the water quality variables. Therefore, two layers of uncertainty may clearly be identified. The fuzzy uncertainty is self-contained in the FWQ number, while the probabilistic uncertainty is observed through the output FWQ histogram.

2.5. Non-parametric kernel density estimator

The use of probability distributions to assess water quality, when the integration of variables is required, may provide a better estimation, since the outputs of the fuzzy water quality index will also have a probability density rather than a point estimate. Consequently, stochastic fuzzy water quality indexes are estimated. Thus, the final classification will be more realistic. The probability distribution of a continuous-valued random variable X is conventionally described in terms of its probability density function (PDF), from which probabilities associated with X can be determined using the relationship

\[ P(a < X < b) = \int_a^b f(x)\,dx. \tag{11} \]

The parametric approach for estimating f(x) is to assume some parametric family of probability distributions, and then to estimate the parameters of the assumed distribution from the data. This is the most common way to apply the PDF in environmental uncertainty analysis, with multiple tools available. The main disadvantage of the parametric approach is the lack of flexibility. Each parametric family of distributions imposes restrictions on the shapes that f(x) may have. For example, the density function of the normal distribution is symmetrical and bell-shaped, and therefore, it is unsuitable for representing skewed densities or bimodal densities, which may appear in real water quality datasets. The idea of the non-parametric approach is to avoid restrictive assumptions about the form of f(x), and to estimate it directly from the water quality monitoring data (Qin et al., 2011). It could be especially useful if data are limited. A well-known non-parametric estimator of the PDF is the histogram, when classes are properly defined. Likewise, the kernel density estimation method is widely used for density estimation.

Fig. 4. Examples of optimized fitting of non-parametric versus parametric distributions of two input variables. (Data of 2009, Section II).

Fig. 5. Box-and-Whisker plots for assessed water quality with the stochastic fuzzy water quality index (SFWQI) for different years and the three river sections. Reported values are the medians.

The most attractive feature of non-parametric kernel density estimation is that it directly makes use of sample data without the need to estimate characteristic parameters of a theoretical distribution. In other words, there is no error caused by the assumption of a theoretical distribution for the data or by a mismatch between estimated parameters and the actual behavior of water quality indicators. Let X1, X2, …, Xn denote n water quality variable samples. The real probability density function f(x) of a water quality variable can be estimated by the following density function:

\[ \hat{f}_n(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) \tag{12} \]

where h is the bandwidth, K is called the kernel function and n is the sample size. Gaussian functions are commonly selected as kernel functions:

\[ K\!\left(\frac{x - X_i}{h}\right) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{(x - X_i)^2}{2h^2}\right). \tag{13} \]

The determination of the bandwidth h is crucial for accurate estimation of water quality variable distributions. There are many ways to estimate an optimal bandwidth (h_opt). An approximation known as Silverman's rule (Silverman, 1998) has been proposed:

\[ h_{opt} = \sigma \left(\frac{4}{3n}\right)^{1/5} \tag{14} \]

where \( \sigma = \min\{ s,\ \mathrm{IQR}/(2 \times 0.6745) \} \), \( s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \), and IQR is the interquartile range of the data. Therefore, parametric or non-parametric PDF should be estimated for annual data sets to each input water quality
Table 3
Classification of the water quality according to the membership degree of the fuzzy sets.

Year  Section   Lower Quartile (0.25)      Median                     Upper Quartile (0.75)
                Bad    Regular  Good       Bad    Regular  Good       Bad    Regular  Good
2002  I         0.344  0.741    0.005      0.310  0.777    0.006      0.266  0.823    0.009
      II        0.408  0.678    0.003      0.355  0.730    0.005      0.317  0.769    0.006
      III       0.444  0.642    0.002      0.406  0.679    0.003      0.360  0.725    0.004
2006  I         0.336  0.750    0.005      0.302  0.786    0.007      0.268  0.822    0.009
      II        0.431  0.654    0.003      0.382  0.703    0.004      0.333  0.753    0.005
      III       0.459  0.628    0.002      0.418  0.667    0.003      0.369  0.716    0.004
2008  I         0.322  0.766    0.006      0.292  0.797    0.007      0.268  0.822    0.009
      II        0.379  0.707    0.004      0.347  0.739    0.005      0.314  0.773    0.088
      III       0.395  0.691    0.035      0.358  0.727    0.005      0.323  0.764    0.006
2009  I         0.348  0.737    0.005      0.326  0.760    0.006      0.301  0.787    0.007
      II        0.364  0.722    0.004      0.331  0.755    0.005      0.377  0.708    0.004
      III       0.419  0.667    0.003      0.377  0.708    0.004      0.339  0.747    0.005
2010  I         0.379  0.707    0.004      0.343  0.743    0.005      0.314  0.773    0.006
      II        0.376  0.709    0.004      0.315  0.745    0.005      0.313  0.775    0.006
      III       0.534  0.553    0.001      0.413  0.672    0.006      0.354  0.732    0.005
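
The kernel estimator of Eq. (12), with the Gaussian kernel of Eq. (13) and the bandwidth of Eq. (14), can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB routine used in the study: the function names and the sample values are hypothetical, and the quartile convention of `statistics.quantiles` is an assumption that may differ slightly from the one used by the authors.

```python
import math
import statistics


def silverman_bandwidth(samples):
    """Eq. (14): h_opt = sigma * (4 / (3 n))**(1/5), sigma = min(s, IQR / (2 * 0.6745))."""
    n = len(samples)
    s = statistics.stdev(samples)  # sample standard deviation (n - 1 denominator)
    q1, _, q3 = statistics.quantiles(samples, n=4)  # quartile convention is an assumption
    iqr = q3 - q1
    sigma = min(s, iqr / (2 * 0.6745))
    # Note: sigma (and hence h) is zero if the data have no spread; not handled here.
    return sigma * (4.0 / (3.0 * n)) ** 0.2


def kde_pdf(x, samples, h):
    """Eq. (12) evaluated at x, using the Gaussian kernel of Eq. (13)."""
    n = len(samples)
    total = sum(math.exp(-((x - xi) ** 2) / (2.0 * h * h)) for xi in samples)
    return total / (n * h * math.sqrt(2.0 * math.pi))


# Hypothetical, right-skewed BOD5-like sample in mg/L (not data from the study).
bod5_samples = [2.1, 2.4, 2.6, 3.0, 3.3, 3.9, 4.8, 6.5, 9.2, 12.0]
h_opt = silverman_bandwidth(bod5_samples)
print(f"bandwidth h_opt = {h_opt:.3f}")
print(f"estimated density at 3.0 mg/L = {kde_pdf(3.0, bod5_samples, h_opt):.4f}")
```

Because each Gaussian kernel integrates to one, the estimate integrates to one for any positive bandwidth; the bandwidth only controls how much the individual observations are smoothed.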
Fig. 6. Non parametric distributions of the stochastic fuzzy water quality index in the Cauca River for some selected years. (Panels: frequency histograms of the water quality score for Sections I, II and III in 2010, 2009 and 2008.)
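
The way a random input produces a histogram of defuzzified scores, as in Fig. 6, can be illustrated with a toy Monte Carlo loop around a one-input Mamdani-type FIS. Everything specific in this sketch is an assumption made for illustration: the single input (dissolved oxygen), the two rules, the piecewise-linear membership functions, the bandwidth and the sample values are hypothetical, whereas the study used nine inputs and MATLAB toolboxes. The sketch only shows the mechanics described in the text: truncation of the output sets by the rule strength, aggregation, and the centroid.

```python
import random
import statistics


def falling(x, a, b):
    """Membership equal to 1 up to a, falling linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising(x, a, b):
    """Membership equal to 0 up to a, rising linearly to 1 at b."""
    return 1.0 - falling(x, a, b)


def fuzzy_score(do_mg_l, grid_points=201):
    """Toy FIS: truncate the output sets, aggregate them, and take the centroid."""
    # Rule strengths from hypothetical input membership functions.
    w_low = falling(do_mg_l, 2.0, 6.0)   # IF DO is low  THEN quality is bad
    w_high = rising(do_mg_l, 2.0, 6.0)   # IF DO is high THEN quality is good
    num = den = 0.0
    for k in range(grid_points):
        y = 100.0 * k / (grid_points - 1)             # score axis, 0 to 100
        bad = min(w_low, falling(y, 20.0, 60.0))      # min implication truncates the set
        good = min(w_high, rising(y, 40.0, 80.0))
        mu = max(bad, good)                           # max aggregation
        num += y * mu
        den += mu
    return num / den                                  # discrete centroid


def monte_carlo_scores(samples, h, runs=2000, seed=0):
    """Draw inputs from a Gaussian kernel estimate and score each draw."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        # Sampling a Gaussian kernel estimate: pick an observation, add N(0, h) noise.
        x = rng.choice(samples) + rng.gauss(0.0, h)
        scores.append(fuzzy_score(x))
    return scores


# Hypothetical dissolved-oxygen samples in mg/L (not data from the study).
do_samples = [1.2, 1.8, 2.5, 3.1, 3.4, 4.0, 4.6, 5.2, 5.9, 6.8]
scores = monte_carlo_scores(do_samples, h=0.8)  # h is assumed; Eq. (14) would supply it
q1, median, q3 = statistics.quantiles(scores, n=4)
print(f"Q1={q1:.2f}  median={median:.2f}  Q3={q3:.2f}")
```

The list `scores` plays the role of the output histogram: its quartiles summarize the probabilistic layer of uncertainty, while the membership degrees computed inside `fuzzy_score` carry the fuzzy layer.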
variable prior to the FIS calculations. Fig. 4 depicts two examples of distribution fittings carried out to find the probability distributions best representing the variables BOD5 and total solids for 2009. Non-parametric and parametric distributions are compared, and the best fit corresponds to the kernel method in both cases.

The Kolmogorov–Smirnov (K–S) test was used to assess the goodness-of-fit of the PDFs of all the input variables. It is a nonparametric test for the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution. The most attractive characteristic of the K–S test is that it is applicable to any continuous variable distribution and any sample size. A smaller statistic represents a better goodness-of-fit between the assumed theoretical distribution and the actual variable samples. Therefore, the statistics can be used to rank the performances of all the water quality variable distributions, including the proposed non-parametric kernel distribution estimation (Qin et al., 2011). The K–S test evaluates whether a sample comes from a continuous distribution with specified parameters, against the alternative that it does not come from that distribution. The test rejects the null hypothesis at the 5% significance level (p < 0.05). All statistical calculations were performed with the Statistics Toolbox of MATLAB (R2010). The best goodness of fit for 84% of data was obtained with non-parametric kernel density estimators. The remaining 16% of data presented K–S statistics for parametric fittings similar to those for non-parametric estimators. That was a good reason to choose the nonparametric method to build all the probability distributions. The ksdensity (kernel smoothing density) algorithm was used to compute the probability density estimates of the input variables (Mathworks, 2012). The estimate is based on a normal kernel function, using a window parameter (bandwidth) that is a function of the number of points (Eq. (14)). The density is evaluated at equally spaced points that cover the range of the data. The ksdensity algorithm makes no assumptions about the mechanism producing the data or the form of the underlying distribution. Therefore, no parameter estimates are made. In other words, it produces a nonparametric density estimate that adapts itself to the data. Likewise, the ProbDistUnivKernel object, which represents a nonparametric probability distribution based on a normal kernel smoothing function, was used to deal with all the PDFs (Mathworks, 2012).

3. Results and discussion

The water condition of the Cauca River when crossing the Valle Department in Colombia has been assessed here with a water quality index built on a FIS. Input data have been provided by the CVC Corporation, a regional environmental protection agency. We assessed various years using stochastic modeling with non-parametric kernel estimators of inputs. Hence, integration between fuzzy and stochastic models was carried out to manage both random and linguistic
Table 4
Comparison of the fuzzy water quality index versus other indexes after Monte Carlo simulations (medians are provided). Membership values in linguistic scores, computed with fuzzy modeling, are provided.

Index               Year  Section I                 Section II                Section III
                          Numeric  Linguistic       Numeric  Linguistic       Numeric  Linguistic
                          score    score            score    score            score    score
Stochastic NSF_WQI  2002  63       Regular          48       Bad              49       Bad
                    2006  56       Regular          45       Bad              47       Bad
Stochastic ICAUCA   2002  42.65    Regular          27.49    Bad              30.09    Bad
                    2006  63.74    Good             54.95    Good             44.03    Regular
                    2008  63.03    Good             28.80    Bad              42.54    Regular
                    2009  74.96    Good             30.64    Bad              55.26    Good
                    2010  40.49    Regular          26.70    Bad              26.77    Bad
Stochastic FWQ      2002  51.58    0.31 bad         50.59    0.35 bad         49.74    0.40 bad
                                   0.77 regular              0.73 regular              0.67 regular
                    2006  51.81    0.30 bad         50.00    0.38 bad         49.47    0.41 bad
                    2008  51.98    0.29 bad         50.78    0.34 bad         50.54    0.35 bad
                                   0.79 regular              0.73 regular              0.72 regular
                    2009  51.18    0.32 bad         51.05    0.33 bad         50.10    0.37 bad
                    2010  50.85    0.34 bad         50.89    0.32 bad         49.45    0.41 bad
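
The goodness-of-fit ranking described in Section 2.5, where a smaller K–S statistic indicates a better fit, can be sketched as follows. This is a minimal illustration, not the MATLAB test used in the study: the helper names, the sample values and the fixed bandwidth are hypothetical, and only the statistic is computed, with no p-value.

```python
import statistics


def ks_statistic(samples, cdf):
    """One-sample K-S statistic: largest gap between the empirical CDF and cdf."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def normal_cdf_fit(samples):
    """Parametric candidate: normal distribution with the sample mean and stdev."""
    dist = statistics.NormalDist(statistics.mean(samples), statistics.stdev(samples))
    return dist.cdf


def kernel_cdf_fit(samples, h):
    """Non-parametric candidate: CDF of a Gaussian kernel estimate with bandwidth h."""
    std_normal = statistics.NormalDist()
    n = len(samples)
    return lambda x: sum(std_normal.cdf((x - xi) / h) for xi in samples) / n


# Hypothetical, right-skewed sample in mg/L (not monitoring data from the study).
total_solids = [88.0, 95.0, 102.0, 110.0, 118.0, 131.0, 150.0, 190.0, 260.0, 410.0]
h = 30.0  # assumed bandwidth; in practice it would come from Eq. (14)

d_normal = ks_statistic(total_solids, normal_cdf_fit(total_solids))
d_kernel = ks_statistic(total_solids, kernel_cdf_fit(total_solids, h))
print(f"K-S statistic, normal fit: {d_normal:.3f}")
print(f"K-S statistic, kernel fit: {d_kernel:.3f}")
```

Both candidates are fitted to the same sample they are then compared against, so the statistics are useful for ranking the fits rather than as exact significance tests.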
uncertainty in the analysis. Consequently, a stochastic fuzzy water quality index was developed. Fig. 5 shows the behavior of the stochastic FWQ index for five years and the three river sections assessed using a Box-and-Whisker plot. Most of the time, the stochastic FWQ index in river Section I was higher than that in river Section II and in river Section III. Therefore, the index indicates that water quality decreases downstream. Generally, there is a bigger dispersion in river Section III, as shown by the box heights. Moreover, the symmetry about the median is noticeable, as is the close distance between estimated boundary values. Such numeric scores agree well with expert and non-expert opinions, since water quality control is minimal in the area, and water pollution increases considerably downstream. Despite the limited number of monitored variables considered in the current study and included in the index, the FIS model adequately describes the observed condition. The low water quality scores can be inferred from a brief inspection of Table 1. Low dissolved oxygen concentrations, specifically in Sections II and III, were observed with very low saturation percentages. Also, high fecal coliform concentrations were found. Moderate (medium to high) concentrations of total solids and turbidity are also common features in the area. BOD5 was high in Section III and increased over time, since the 2010 values were considerably higher than those in 2006 and 2002.

With the aid of the fuzzy stochastic analysis, it is possible to map fuzzy random input parameters into fuzzy random responses. The stochastic fuzzy behavior of the FWQ index and some of its advantages are shown in Table 3. Here, the membership functions (μ) to the diverse output fuzzy sets are calculated. As mentioned above, the membership to each fuzzy set is a number between zero (0) and one (1), meaning none and total membership, respectively. Partial memberships are also possible, which is one of the advantages of fuzzy logic for environmental decision making. It must be remarked that the sum of specific set membership values could be higher than 1. The membership degrees may be stated as possibility values so as not to confuse them with probabilistic computations. Table 3 presents the calculated membership values to the sets bad, regular and good. The membership to the poor and excellent water quality fuzzy sets was zero in all years. In all cases, the estimated belongingness to the good water quality fuzzy sets was really low. The fuzzy sets with the highest membership values were those of the regular water quality classification during all years and through the three river sections, with medians varying between 0.797 and 0.667. Likewise, the membership value to the bad water quality classification during all years and through the three river sections had medians varying between 0.413 and 0.292, which is a considerable belongingness. Moreover, it is noticeable that the membership values to the regular water quality fuzzy sets decreased downstream while they increased for the bad water quality fuzzy sets. This result may be associated with the higher number of domestic, agricultural and industrial loads discharged downstream. Table 3 shows that the membership degrees for bad water quality are higher in the lower quartile than in the median, because the lower quartile of the index is the worst condition. For example, in 2002 and Section I, the membership degrees to the bad fuzzy set were 0.344, 0.310, and 0.266 for the lower quartile, median and upper quartile, respectively.

Output histograms are necessary in indexing computations since results based on point estimates could be limited. Fig. 6 shows the frequency histograms (or the nonparametric density estimations) for the stochastic FWQ index of some assessed years through the three sections. They point out the randomness of water quality integration outputs when random inputs are provided to the FIS. So, the output spread is easily observed in such figures. As can be noted, diverse shapes are possible. Some histograms showed peak shapes with relatively symmetrical variability. Likewise, some histograms with wide peaks (2008, Sections II and III) were calculated. The biggest dispersion was generally observed for river Section III. Moreover, bimodal histograms appeared in some cases, especially in Section III, although the peaks were close enough to give appropriate classification and unambiguous outputs.

In order to validate the performance of the stochastic fuzzy water quality index, similar stochastic computations were carried out for the indexes NSF_WQI and ICAUCA. Results of the medians are given in Table 4. From the NSF_WQI calculations, it can be observed that they always provided a consistent output, classifying river Section I as regular and river Sections II and III as bad for all the assessed years. In this case, numeric scores ranged between 63 and 41, with a spread of 22. The assessment with the ICAUCA index was less strict, delivering a good water quality classification in some cases. ICAUCA index outputs were between 74.96 and 26.70, with a range of 48.26. The outputs from the stochastic FWQ index were similar to those of the other indexes. The dispersion of the stochastic FWQ index results was lower than that of the other indexes, being within a maximum of 51.98 and a minimum of 49.45, with a range of 2.53. Although the numeric score of the defuzzified stochastic FWQ index is important, the advantage of the hybrid probabilistic fuzzy index over the others is that it provides a classification with more information related to the
belongingness to diverse possible classifications. In all cases, the stochastic FWQ index outputs have classified water quality in the studied area as “partially bad and partially regular”. The water quality has a lower possibility of belonging to the good class (in Table 3, observe that μ < 0.01). As stated above, the belongingness to the bad water quality sets increased downstream from river Section I to Section III, which agrees with the results from the NSF_WQI. Consequently, the belongingness to the regular class decreases downstream.

Water quality indexes based on fuzzy systems have recently been proposed in the scientific literature with relative success. The fuzzy framework clearly improves the conceptual design of the indexes, because they are computed with expert rules and sets to provide final numerical/linguistic scores, which include a convenient treatment of linguistic uncertainty and subjectivity. However, the computation of water quality index scores is clearly deterministic even within the fuzzy method. A vector of water quality variables is given to the FIS, and a unique water quality score is obtained. The challenge now is how to deal with computation in non-deterministic real-world scenarios. Water quality variables collected in rivers are essentially stochastic, and probability density functions may easily be computed. Then, the key question is how to perform computations of water quality indexes when sufficient data have been collected and the statistics are dependable. Currently, the easiest way to deal with stochastic computations is through Monte Carlo methods. Moreover, fuzzy alpha-cuts to deal with uncertainty in inputs could also be considered (Kumar et al., 2009). In this paper, we used Monte Carlo simulation to calculate the fuzzy index and to analyze historic and geographic trends in water quality. The method was powerful because it provided better water quality classification, and we observed graphically the consistency in fuzzy classification. However, the use of combined probabilistic and fuzzy methods is still under development, and a generalized theory of uncertainty is required (Zadeh, 2005). Moreover, the mathematical foundations of the propagation of probabilistic uncertainty through fuzzy systems may also require further research. Finally, we found that the method was powerful not only by providing consistent histograms of defuzzified water quality scores but also by delivering the membership values to more than one water quality class. The value of the membership function of the output fuzzy sets proved highly sensitive to input conditions. With this tool, decision makers may be able to relax the boundaries between two or more likely water quality classes. Moreover, a consistent classification in water quality after stochastic simulations was observed, which showed that the fuzzy index was stable in providing appropriate classification. Finally, the use of fuzzy systems avoids using crisp values for water quality classification, which is the most important fact in applying this methodology. With the Monte Carlo and FIS approach, the strongly subjective character of the equivalence functions of traditional water quality indexes is avoided, and the assessment is closer to human reasoning, making the technique very useful for many similar environmental assessment problems.

4. Conclusion

We have applied stochastic simulation to a fuzzy water quality index in order to improve the water quality assessment provided by deterministic indexes. The hybrid stochastic fuzzy method combined the benefits of Monte Carlo simulations with the advantages of fuzzy inference. The proposed method updated the design of indexing techniques to integrate water quality variables available to date. Non-parametric kernel density estimators proved to be appropriate tools to build empirical probability density functions from raw data, since normal and other parametric distributions did not fit the real data well, especially when the number of data was limited. The Monte Carlo simulation improved the results from point estimates of fuzzy water quality indexes since the dispersion of the final indexes was estimated. The water quality classification preserved the linguistic uncertainty of the subjective index and the randomness from real measurements. The main advantage of the proposed method is that membership to two or more classes is also possible, which gives decision makers a better conceptual assessment. When the developed method was applied to the Cauca River, the results for several years showed that water quality was possibly “regular” with a membership degree of approximately 0.7, and possibly “bad” with a membership degree of approximately 0.4. The index also predicted that water quality decreased downstream. The results have complex origins, since the river is plainly affected by the presence of towns and cities without adequate treatment for wastewater. We observed that the environmental impact was not reduced downstream. Intense sugarcane agriculture and some industrial plants could also be responsible for surface water pollution. An intensive environmental protection program from regional and national government is suggested if ecosystem restoration and biodiversity conservation are desired in the area.

Acknowledgments

The authors thank the Agencia Española de Cooperación Internacional para el Desarrollo (AECID) for financial support (Projects D/026977/09, and D/031370/10). We also thank the CVC Corporation for providing water quality monitoring data.

References

Baudrit C, Guyonnet D, Dubois D. Joint propagation of variability and imprecision in assessing the risk of groundwater contamination. J Contam Hydrol 2007;93:72–84.
Beamonte-Cordoba E, Casino Martinez A, Veres-Ferrer E. Water quality indicators: comparison of a probabilistic index and a general quality index. The case of the Confederación Hidrográfica del Júcar (Spain). Ecol Indic 2010;10:1049–54.
Brown RM, McClelland NI, Deininger RA, Tozer RG. A water quality index: do we dare? Water Sew Works 1970;117:339–43.
Cardona CM, Martin C, Salterain A, Castro A, San Martín D, Ayesa E. CALHIDRA 3.0: new software application for river water quality prediction based on RWQM1. Environ Model Softw 2011;26:973–9.
Chen Z, Zhao L, Lee K. Environmental risk assessment of offshore produced water discharges using a hybrid fuzzy-stochastic modeling approach. Environ Modell Softw 2010;25:782–92.
Chowdhury S, Champagne P, McLellan PJ. Uncertainty characterization approaches for risk assessment of DBPs in drinking water: a review. J Environ Manage 2009;90:1680–91.
CVC Corporation. Estudio de la calidad del agua del río cauca y sus principales tributarios mediante la aplicación de índices de calidad y contaminación. Project Report 0168, Oct 2004. Available at: http://190.97.204.39/cvc/Mosaic/dpdf2/Volumen10/1-ECARCpag1-158.pdf 2004. (Accessed 1/9/2012).
Darbra RM, Eljarrat E, Barcelo D. How to measure uncertainties in environmental risk assessment. Trends Anal Chem 2008;27:377–85.
Faybishenko B. Fuzzy-probabilistic calculations of water-balance uncertainty. Stoch Environ Res Risk A 2010;24:939–52.
Ghiocel DM, Altmann J. Hybrid stochastic-neuro-fuzzy model-based system for in-flight gas turbine engine diagnostics. In: Pusey HC, Pusey SC, Hobbs WR, editors. New frontiers in integrated diagnostics and prognostics. Proceedings of the 55th meeting of the Society for Machinery Failure Prevention Technology, Virginia; 2001.
Gottardo S, Semenzin E, Giove S, Zabeo A, Critto A, de Zwart D, et al. Integrated risk assessment for WFD ecological status classification applied to Llobregat river basin (Spain). Part I: fuzzy approach to aggregate biological indicators. Sci Total Environ 2011;409:4701–12.
Guo P, Huang GH, Zhu H, Wang XL. A two-stage programming approach for water resources management under randomness and fuzziness. Environ Modell Softw 2010;25:1573–81.
Kentel E, Aral M. 2D Monte Carlo and Monte Carlo-fuzzy health risk assessment. Stoch Environ Res Risk A 2005;19:86–96.
Kumar V, Mari M, Schuhmacher M, Domingo JL. Partitioning total variance in risk assessment: application to a municipal solid waste incinerator. Environ Modell Softw 2009;24:247–61.
Legay C, Rodriguez MJ, Sadiq R, Sérodes JB, Levallois P, Proulx F. Spatial variations of human health risk associated with exposure to chlorination by-products occurring in drinking water. J Environ Manage 2011;92:892–901.
Lermontov A, Yokoyama L, Lermontov M, Soares-Machado MA. River quality analysis using fuzzy water quality index: Ribeira do Iguape river watershed, Brazil. Ecol Indic 2009;9:1188–97.
Li H, Zhang K. Development of a fuzzy-stochastic nonlinear model to incorporate aleatoric and epistemic uncertainty. J Contam Hydrol 2010;111:1–12.
Li J, Huang GH, Zeng G, Maqsood I, Huang Y. An integrated fuzzy-stochastic modeling approach for risk assessment of groundwater contamination. J Environ Manage 2007;82:173–88.
Mahapatra SS, Nanda SK, Panigrahy BK. A cascaded fuzzy inference system for Indian River water quality prediction. Adv Eng Softw 2011;42:787–96.
Marchini A, Facchinetti T, Mistri M. F-IND: a framework to design fuzzy indices of environmental conditions. Ecol Indic 2009;9:485–96.
Mari M, Nadal M, Schuhmacher M, Domingo JL. Exposure to heavy metals and PCDD/Fs by the population living in the vicinity of a hazardous waste landfill in Catalonia, Spain: health risk assessment. Environ Int 2009;35:1034–9.
Mathworks. Product Documentation Matlab R2012a. Available at: http://www.mathworks.com/help/ 2012. Accessed 29/08/2012.
Misha A. Estimating uncertainty in HSPF based water quality model: Application of Monte-Carlo based techniques. PhD Thesis at Virginia Polytechnic Institute and State University, USA, 2011.
Möller B, Beer M. Fuzzy randomness: uncertainty in civil engineering and computational mechanics. Berlin Heidelberg New York: Springer-Verlag; 2004.
Möller B, Graf W, Beer M, Sickert J. Fuzzy randomness: towards a new modeling of uncertainty. In: Mang AH, Rammerstorfer FG, Eberhardsteiner J, editors. The Fifth World Congress on Computational Mechanics, Vienna; 2002.
Nikoo MR, Kerachian R, Malakpour-Estalaki S, Bashi-Azghadi SN, Azimi-Ghadikolaee MM. A probabilistic water quality index for river water quality assessment: a case study. Environ Monit Assess 2011;181:465–78.
Ocampo-Duque W, Ferré-Huguet N, Domingo JL, Schuhmacher M. Assessing water quality in rivers with fuzzy inference systems: a case study. Environ Int 2006;32:733–42.
Ocampo-Duque W, Schuhmacher M, Domingo JL. A neural-fuzzy approach to classify the ecological status in surface waters. Environ Pollut 2007;148:634–41.
Ocampo-Duque W, Juraske R, Kumar V, Nadal M, Domingo JL, Schuhmacher M. A concurrent neuro-fuzzy inference system for screening the ecological risk in rivers. Environ Sci Pollut Res 2012;19:983–99.
Qin Z, Li W, Xiong X. Estimating wind speed probability distribution using kernel density method. Electr Power Syst Res 2011;81:2139–46.
Ramaswami A, Milford JB, Small MJ. Integrated environmental modeling: pollutant transport, fate, and risk in the environment. John Wiley & Sons; 2005.
Rehana S, Mujumdar PP. An imprecise fuzzy risk approach for water quality management of a river system. J Environ Manage 2009;90:3653–64.
Sadiq R, Tesfamariam S. Probability density functions based weights for ordered weighted averaging (OWA) operators: an example of water quality indices. Eur J Oper Res 2007;182:1350–68.
Silverman BW. Density estimation for statistics and data analysis. London: Chapman&Hall/CRC, ISBN: 0-412-24620-1; 1998.
Torres P, Cruz C, Patiño P, Escobar JC, Pérez A. Aplicación de índices de calidad de agua - ICA orientados al uso de la fuente para consumo humano. Ing Investig 2010;30:86–95.
Zadeh LA. Toward a generalized theory of uncertainty (GTU): an outline. Inf Sci 2005;172:1–40.
Zhang K, Li H, Achari G. Fuzzy-stochastic characterization of site uncertainty and variability in groundwater flow and contaminant transport through a heterogeneous aquifer. J Contam Hydrol 2009;106:73–82.
Zhang X, Huang GH, Nie X. Robust stochastic fuzzy possibilistic programming for environmental decision making under uncertainty. Sci Total Environ 2009;408:192–201.
