net/publication/226101513
3 authors:
Mysore G. Satish
Dalhousie University
All content following this page was uploaded by Bithin Datta on 05 June 2014.
···································································································································································································································
Abstract
This paper evaluates the performance of an Artificial Neural Network (ANN) model for approximating the density-dependent
saltwater intrusion process in a coastal aquifer when the ANN model is trained with noisy training data. The data required for training,
testing and validation of the ANN model are generated using a numerical simulation model. The simulated data, consisting of
corresponding sets of input and output patterns, are used for training a multilayer perceptron using the back-propagation algorithm. The
trained ANN predicts the concentration at specified observation locations at different time steps. The performance of the ANN model
is evaluated using an illustrative study area. The evaluation results show that the ANN model predicts efficiently even
when trained with noisy data. A comparative study is also carried out to find the better transfer function for the artificial neurons
and the better training algorithm among those available in MATLAB for training the ANN model.
Keywords: artificial neural networks, saltwater intrusion process, groundwater, noisy data, approximation model
···································································································································································································································
*Assistant Professor, Dept. of Civil Engineering, Indian Institute of Technology, Guwahati 781039, Assam, India (Corresponding Author, E-mail:
rkbc@iitg.ernet.in)
**Professor, Dept. of Civil Engineering, Indian Institute of Technology, Kanpur 208016, India (E-mail: bithin@iitk.ac.in)
***Professor, Dept. of Civil Engineering, Dalhousie University, Halifax, Nova Scotia B3J1Z1, Canada (E-mail: mysore.satish@dal.ca)
Rajib Kumar Bhattacharjya, Bithin Datta, and Mysore G. Satish
regression as it can handle partial information. Many researchers have presented various approximation models to simulate the flow and transport processes in ground water. Ejaz and Peralta (1985) presented regression equations to predict the downstream concentration of several constituents from the upstream flow rate and constituent concentration. Alley (1986) presented regression equations to relate the variation in pumping and recharge rates at five decision wells to the concentration at nine control locations for two-dimensional transport processes in an aquifer. Lefkoff and Gorelick (1990) presented multiple linear regression equations to approximate the transport process in an aquifer. This approximation model predicts the change of ground water salinity resulting from the hydrologic conditions and water use decisions.
ANN has been used extensively for simulating groundwater flow and transport processes. Rogers and Dowla (1994) incorporated an artificial neural network model with the optimization model to predict the total solute mass removal for treatment. They trained an ANN model, for a simplified case, to predict whether or not a given set of pumping rates satisfies the containment constraints. Morshed and Kaluarachchi (1998) presented an ANN model to approximate the concentration break-through curves for a one-dimensional unsaturated flow and transport model. Aly and Peralta (1999) trained an ANN to model the response surface within the optimization model. They applied their model to design pump-and-treat systems for aquifer cleanup. Johnson and Rogers (2000) evaluated the effect of using ANN and linear approximators in conjunction with a simulated annealing driven search for two different two-dimensional ground water remediation problems. All these studies were confined to one-dimensional or two-dimensional flow and transport processes only. The application of ANN for approximating three-dimensional flow and transport processes is very limited (Bhattacharjya and Datta, 2005; Bhattacharjya et al., 2007). Bhattacharjya et al. (2007) used ANN for approximating three-dimensional flow and transport processes in a coastal aquifer. They used standard back-propagation algorithms with constant learning and momentum rates. However, they did not explore the performance of other, faster optimization algorithms for training the ANN model. Therefore, the main objective of the present study is to develop an efficient ANN model for predicting flow and transport processes in a coastal aquifer. ANN has also been successfully applied to many surface water related problems, such as the rainfall-runoff process (Hsu et al., 1995), daily reservoir forecasting (Coulibaly et al., 2000), river flow forecasting (Korre et al., 2000), and river stage forecasting (Liong et al., 2000).
The objectives of the study presented in this paper are (a) development of an ANN model as an approximator of three-dimensional flow and transport processes in coastal aquifers and assessment of its performance with noisy data, (b) performance evaluation of the developed model using a hypothetical study area, and (c) determination of the best transfer function for the artificial neurons as well as the best optimization algorithm for training the ANN model. The data generated by FEMWATER (Lin et al., 1997) for an illustrative coastal aquifer are used to train the ANN model with back-propagation algorithms. The ANN toolbox available in MATLAB is used in this study. This paper begins with a description of artificial neural networks, followed by a description of the two governing equations, the flow and transport equations, used to generate the required patterns for the ANN model. The details of the developed model are provided next, followed by a description of various standard statistical performance evaluation criteria. The paper then discusses the performance of the developed model and presents concluding remarks.

2. Artificial Neural Networks

An Artificial Neural Network (ANN), which is considered a universal approximator, mimics the function of the human brain by acquiring knowledge through a learning process. Studies of the brain show that it stores information as patterns. Some patterns are very complex in nature; e.g., the brain allows us to recognize individual faces from different angles. The learning of patterns by an ANN model involves finding an optimal set of weights for the synaptic connections between the artificial neurons of the network. The ability to gather knowledge through the process of learning, like a human brain, from sufficient input patterns makes it possible to apply the ANN to large-scale real-world problems. Once the ANN is trained, the relationship between the input and output is encoded in the network. Later it can be used to predict the output based on the information fed to the input nodes. Fig. 1 shows a single hidden layer feed-forward neural network. The information passes from the input layer to the output layer through the hidden layer, i.e. the information flows only in the forward direction. A synaptic weight, which represents the connection strength between two neurons, is assigned to each link. The synaptic weights are modified during the network training process. The input-output relation for this feed-forward network may be written as

$$ y = \phi_2\left[ V \cdot \phi_1\left( W \cdot x + b_1 \right) + b_2 \right] \tag{1} $$

where x is the input vector, y is the output vector, W is the weight matrix for the synaptic connections between the input and hidden layers, V is the weight matrix for the synaptic connections between the hidden and output layers, b_1 is the vector of biases in the hidden layer, b_2 is the vector of biases in the output layer, \phi_1 is the transfer function for the neurons in the hidden layer and \phi_2 is the transfer function for the neurons in the output layer.

Fig. 1. An ANN Model
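The input-output relation of Eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB toolbox implementation used in the study: the helper names (`matvec`, `forward`) and the small 2-3-1 network with its weight values are hypothetical, and the hidden and output transfer functions are assumed to be the hyperbolic tangent sigmoid (tansig) and the linear function (purelin) that the paper compares later.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def tansig(a):
    """Hyperbolic tangent sigmoid transfer function."""
    return math.tanh(a)


def purelin(a):
    """Linear transfer function."""
    return a


def forward(x, W, b1, V, b2, phi1=tansig, phi2=purelin):
    """Evaluate y = phi2[V . phi1(W . x + b1) + b2] for one input vector."""
    hidden = [phi1(a + b) for a, b in zip(matvec(W, x), b1)]
    return [phi2(a + b) for a, b in zip(matvec(V, hidden), b2)]


# Hypothetical 2-3-1 network; the weights are illustrative values only.
x = [0.5, -1.0]
W = [[0.1, 0.2], [0.0, -0.3], [0.4, 0.1]]   # input -> hidden weights
b1 = [0.0, 0.1, -0.1]                        # hidden biases
V = [[1.0, -1.0, 0.5]]                       # hidden -> output weights
b2 = [0.2]                                   # output bias
y = forward(x, W, b1, V, b2)
print(y)
```

Training then amounts to adjusting W, V, b1 and b2 so that the outputs of `forward` match the target patterns.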
able in the ANN toolbox of MATLAB which are much faster than the algorithms used in the traingd and traingdm functions. These faster algorithms can be divided into two categories. The first category uses heuristic techniques developed from the standard steepest descent algorithm. The second category uses standard numerical optimization techniques.
In the first category, two functions, namely traingda and trainrp, are available. The traingda function uses the standard steepest descent algorithm with a variable learning rate. The convergence of the steepest descent algorithm is highly sensitive to the learning rate. Therefore, this algorithm allows the learning rate to change during the training process. In the present study, if the current step error is less than the previous step error, the learning rate is increased by a factor of 1.04. The trainrp function, on the other hand, uses the resilient backpropagation training algorithm. When a sigmoid function is used as the transfer function, the slope of the function approaches zero at large input values. As a result, the changes in the magnitude of the weights and biases of the network are very small even though the weights and biases are far from their optimal values. Therefore, in the resilient backpropagation training algorithm, the size of the weight change is determined by a separate update value.
In the second category of faster algorithms, there are three types of standard numerical optimization techniques for training an ANN model: conjugate gradient, quasi-Newton, and Levenberg-Marquardt. In the conjugate gradient algorithms, a search is performed along conjugate directions, which generally produces faster convergence than steepest descent directions. In most of the conjugate gradient algorithms, the step size is adjusted after each iteration. Four different functions using variations of the conjugate gradient algorithm are available in the ANN toolbox of MATLAB. These are traincgf (Fletcher-Reeves update), traincgp (Polak-Ribière update), traincgb (Powell-Beale restarts), and trainscg (scaled conjugate gradient).
Newton's method is an alternative to the conjugate gradient method for fast optimization. Newton's method often converges faster than conjugate gradient methods, but it requires Hessian matrix information. However, the computation of the Hessian matrix for feed-forward neural networks is complex and very expensive. There is another class of algorithms based on Newton's method, called quasi-Newton algorithms. These algorithms approximate the Hessian matrix instead of calculating it exactly. The quasi-Newton algorithm updates an approximate Hessian matrix at each iteration of the algorithm. The trainbfg function uses the quasi-Newton algorithm. However, this algorithm requires more computational time and storage than the conjugate gradient methods. The trainoss function uses the one step secant algorithm. This algorithm does not store the complete Hessian matrix and assumes that at each iteration the previous Hessian was an identity matrix. Like the quasi-Newton method, the Levenberg-Marquardt algorithm also approximates the Hessian matrix. The trainlm function implements the Levenberg-Marquardt algorithm in the ANN toolbox of MATLAB.

3. Flow and Transport Equations

In the present study, the FEMWATER model is used for generating patterns for the ANN model. The finite element based numerical simulation model FEMWATER simulates the three-dimensional flow and transport processes, along with the initial and boundary conditions, for flow through saturated-unsaturated porous media. The numerical simulation model FEMWATER simulates the natural conditions, but cannot duplicate them. The model is highly sensitive to the grid/mesh cell size, and an improper selection may result in erroneous solutions. In spite of these limitations, FEMWATER is one of the most sophisticated simulation models for density-dependent flow and transport processes. Interested readers may refer to Lin et al. (1997) for details about FEMWATER.

3.1 Flow Equations
The 3D flow equation may be written as (Lin et al., 1997):

$$ \frac{\rho}{\rho_o} F \frac{\partial h}{\partial t} = \nabla \cdot \left[ K \cdot \left( \nabla h + \frac{\rho}{\rho_o} \nabla z \right) \right] + \frac{\rho^*}{\rho_o} q \tag{5} $$

where F is the storage coefficient, h is the pressure head [L], t is the time [T], K is the hydraulic conductivity tensor [LT⁻¹], z is the potential head [L], q is the source and/or sink [L³T⁻¹L⁻³], \rho is the water density at solute concentration C [ML⁻³], \rho_o is the reference water density corresponding to zero solute concentration [ML⁻³], and \rho^* is the density of either the injection fluid or the withdrawn water [ML⁻³].
The hydraulic conductivity K [LT⁻¹] depends on the fluid density \rho [ML⁻³], the viscosity \mu [ML⁻¹T⁻¹], and the acceleration due to gravity g [LT⁻²]. It is defined as

$$ K = \frac{k \rho g}{\mu} \tag{6} $$

where k is the permeability tensor [L²].
The density \rho [ML⁻³] and dynamic viscosity \mu [ML⁻¹T⁻¹] of water depend on the solute concentration. For saltwater intrusion, the relation between fluid density and solute concentration is expressed as (Lin et al., 1997)

$$ \rho = \rho_o \left( 1 + \varepsilon c \right), \quad \text{where} \quad \varepsilon = \frac{\rho_S - \rho_o}{\rho_o} \tag{7} $$

where c is the dimensionless solute concentration, \varepsilon is the density difference ratio, \rho_S is the fluid density corresponding to the maximum solute concentration [ML⁻³], and \rho_o is the reference water density corresponding to zero solute concentration [ML⁻³].
The Darcy velocity is calculated as given below:

$$ V = -K \cdot \left( \frac{\rho_o}{\rho} \nabla h + \nabla z \right) \tag{8} $$

3.2 Transport Equation
The governing equation of contaminant transport for disper-
(10)
Here, i is the index for location and j is the index for time step. CTop_ij [ML⁻³] is the saltwater concentration at the top of the aquifer
for location i and the j-th time step. CBot_ij [ML⁻³] is the saltwater concentration at the bottom of the aquifer for location i and the j-th
time step. P_ij [L³T⁻¹] is the pumping from location i for the j-th time step, and R_ij [L³T⁻¹] is the recharge at location i for the
j-th time step. No is the number of observation locations, Np is the number of pumping locations, NR is the number of recharge well
locations, and T is the total number of time steps.
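The index notation above can be made concrete with a small sketch that flattens the pumping, recharge and concentration arrays into one ANN input vector and one ANN output vector. The function names, the ordering of the entries (time step by time step, pumping wells before recharge wells, top before bottom) and the split of the wells into Np = 3 and NR = 3 are assumptions made for illustration only; the numerical values are placeholders, not data from the study.

```python
def build_input_pattern(P, R):
    """Flatten pumping P[i][j] and recharge R[i][j] into one input vector.

    The ordering (time step by time step, pumping before recharge) is an
    assumption made for this sketch.
    """
    T = len(P[0])
    x = []
    for j in range(T):
        x.extend(row[j] for row in P)   # P_ij for every pumping location i
        x.extend(row[j] for row in R)   # R_ij for every recharge location i
    return x


def build_output_pattern(CTop, CBot):
    """Flatten top and bottom concentrations into one output vector."""
    T = len(CTop[0])
    y = []
    for j in range(T):
        for i in range(len(CTop)):
            y.append(CTop[i][j])        # concentration at top, location i
            y.append(CBot[i][j])        # concentration at bottom, location i
    return y


# Hypothetical sizes: Np = 3, NR = 3, No = 3, T = 3 (placeholder values).
P = [[100.0 * (i + 1) + j for j in range(3)] for i in range(3)]
R = [[10.0 * (i + 1) + j for j in range(3)] for i in range(3)]
CTop = [[0.1 * (i + 1)] * 3 for i in range(3)]
CBot = [[0.2 * (i + 1)] * 3 for i in range(3)]

x = build_input_pattern(P, R)
y = build_output_pattern(CTop, CBot)
print(len(x), len(y))  # prints "18 18"
```

With (Np + NR) × T inputs and 2 × No × T outputs, these sizes give 18 inputs and 18 outputs.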
5. Performance Measure
$$ \mathrm{AARE} = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{C_n^{o} - C_n^{p}}{C_n^{p}} \right| \times 100 \tag{11} $$

$$ TS_x = \frac{n_x}{N} \times 100 \tag{12} $$

Fig. 2. Illustrative Study Area
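The two measures in Eqs. (11) and (12) can be computed as in the following sketch. It assumes that the superscripts o and p denote the observed (simulated) and ANN-predicted concentrations, that the relative error is taken in absolute value and divided by the predicted value as printed in Eq. (11), and that n_x counts the patterns whose absolute relative error is below x percent. The function names and sample values are hypothetical.

```python
def absolute_relative_errors(observed, predicted):
    """Absolute relative error (in percent) of each pattern."""
    return [100.0 * abs((o - p) / p) for o, p in zip(observed, predicted)]


def aare(observed, predicted):
    """Average absolute relative error, Eq. (11), in percent."""
    errors = absolute_relative_errors(observed, predicted)
    return sum(errors) / len(errors)


def threshold_statistic(observed, predicted, x):
    """Percentage of patterns with absolute relative error below x percent, Eq. (12)."""
    errors = absolute_relative_errors(observed, predicted)
    n_x = sum(1 for e in errors if e < x)
    return 100.0 * n_x / len(errors)


# Placeholder concentrations, not data from the study.
observed = [1.0, 2.0, 4.0]
predicted = [1.0, 2.5, 5.0]
print(aare(observed, predicted))
print(threshold_statistic(observed, predicted, 5.0))
```

A high TS_x at a small x means that most predictions are individually accurate, which AARE alone cannot show because it averages over all patterns.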
sidered for evaluating the performance of the ANN model. The area of the aquifer is approximately 2.59 km² and the thickness of the aquifer is 50 m. The confined aquifer system is subjected to saltwater intrusion along the coastal side of the study area. The left hand face of the aquifer is the ocean face, which allows saltwater to enter the aquifer through its bottom portion and also allows mixed water to exit from its top portion. It is assumed that mixed water can exit the aquifer through the top 20% of the aquifer at the ocean face (Das and Datta, 2000). The right hand side of the aquifer is the inland face, which allows fresh water to enter the aquifer. The other four faces (front, back, top and bottom) of the aquifer are considered impermeable. The three-dimensional confined hypothetical aquifer is assumed to be homogeneous and isotropic with respect to the fresh water hydraulic conductivity, molecular diffusion, and longitudinal and transverse dispersivities. The values of the aquifer parameters used in the study are listed in Table 1.
The boundary conditions of the aquifer (Fig. 2) are considered time invariant. The flow boundary condition on the ocean face is considered hydrostatic in the vertical direction. It is assumed to be constant throughout the ocean face. On the inland face, the reference hydraulic head varies linearly along the length of the face. The advective mixed outflux can exit through the top 20% portion of the aquifer. In this portion, the concentration gradient normal to the ocean face is equal to zero. In the remaining 80% of the ocean face, the solute concentration is equal to one, as it allows influx of saltwater into the aquifer. The solute concentration is equal to zero on the inland face. Further, the solute concentration gradient normal to the other four faces of the aquifer is set equal to zero. Steady state reference hydraulic head and concentration are assumed as the initial condition for transient flow and transport.
Fig. 2 also shows the locations of the pumping and recharge wells. Water is pumped from, and also recharged at, the middle layer of the aquifer. The pumping from the aquifer is considered transient. A stress period of 180 days is considered for pumping and recharge, and the pumping and recharge rates are taken as constant within a particular stress period. The pumping patterns are generated randomly using a uniform distribution function. The transient pumping and recharge values are taken as input to the ANN model. Therefore, the number of inputs to the ANN model is equal to 18 (6 wells and 3 time steps). The observation locations are also shown in Fig. 2. There are three observation locations, and the concentrations at the top and bottom of the aquifer are predicted by the ANN model. Therefore, the total number of outputs is equal to 18 (6 for each time step).

6.2 Generation of ANN Patterns
The training, validation and testing patterns are generated using the finite element based FEMWATER model. The input to the ANN model is the set of transient pumping and recharge rates. These values are generated using a uniform distribution between specified upper and lower limits. In this study, the upper limit of pumping is taken as 5000 m³/day and the upper limit of recharge is taken as 1000 m³/day. The lower limit for both pumping and recharge is taken as zero m³/day. The set of transient pumping and recharge values is then used as input to the FEMWATER model for transient simulation over three time periods. The resulting concentrations at selected observation locations at different time steps are specified as output patterns. Fig. 3 shows a typical concentration distribution in the aquifer. For training, validation and testing purposes, 636 patterns are generated using the FEMWATER model. The set of generated patterns is divided into three parts: 20% of the patterns are kept for testing, 20% for validation, and the remaining 60% are used for training the neural network. Once trained, the ANN model outputs the concentrations at the observation locations at different time steps.
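The sampling of input patterns and the 60/20/20 split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the fixed random seeds, the shuffling before the split, and the division of the six wells into three pumping and three recharge wells are hypothetical. The FEMWATER simulation that turns each input pattern into output concentrations is not reproduced here.

```python
import random


def generate_input_patterns(n_patterns, n_pump, n_recharge, n_steps,
                            max_pump=5000.0, max_recharge=1000.0, seed=0):
    """Draw pumping and recharge rates (m3/day) from uniform distributions."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_patterns):
        row = []
        for _ in range(n_steps):
            row += [rng.uniform(0.0, max_pump) for _ in range(n_pump)]
            row += [rng.uniform(0.0, max_recharge) for _ in range(n_recharge)]
        patterns.append(row)
    return patterns


def split_patterns(patterns, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle and split patterns into training, validation and testing sets."""
    rng = random.Random(seed)
    shuffled = patterns[:]
    rng.shuffle(shuffled)
    n_train = int(round(train_frac * len(shuffled)))
    n_val = int(round(val_frac * len(shuffled)))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# 636 patterns of 18 inputs each (assumed 3 pumping + 3 recharge wells, 3 steps).
patterns = generate_input_patterns(636, n_pump=3, n_recharge=3, n_steps=3)
train_set, val_set, test_set = split_patterns(patterns)
print(len(train_set), len(val_set), len(test_set))
```

In practice the same index split would be applied to the simulated output patterns so that inputs and outputs stay paired.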
6.3 Selection of Transfer Functions and Optimization Algorithm
It is not known in advance how many hidden neurons are required for better model performance. An increase in the number of hidden neurons may increase the capability of the ANN model to map a more complex system. However, computational complexity also increases with the number of hidden neurons. In this study, a trial and error method is adopted to find the number of hidden neurons. The study shows that 20 neurons in the hidden layer yield better model performance. Therefore, the ANN architecture considered in this study is 18-20-18.
A study is carried out to find the best combination of transfer functions for the artificial neurons in the hidden layer and in the output layer. The best combination of transfer functions is evaluated on the basis of the Mean Square Error (MSE) of the training patterns. The trainlm optimization function is used to train the networks. Table 2 shows the MSE values for the training patterns for different combinations of transfer functions when the data are normalized between -1 and +1. It may be observed that the tansig transfer function for the neurons in the hidden layer and the purelin transfer function for the neurons in the output layer result in better model performance. In this case, the value of MSE is 0.0020. Table 3 shows the MSE values for the training patterns for different combinations of transfer functions when the data are normalized to mean zero and Standard Deviation (SD) one. In this case also, the tansig-purelin combination shows better model performance with respect to the MSE statistic. The value of MSE is 0.0411. Therefore, the tansig-purelin combination of transfer functions is used for the rest of the study. It can be observed that the ANN model performance is better when the patterns are normalized between -1 and +1. Therefore, for the rest of the study, the input and output patterns are normalized between -1 and +1.
A separate study is also carried out to find the best optimization algorithm to train the networks. For this purpose also, the 18-20-18 ANN architecture is used, with the tansig transfer function for the neurons in the hidden layer and the purelin transfer function for the neurons in the output layer. The input and output patterns are normalized between -1 and +1. Table 4 shows the performance of the different optimization algorithms available in the ANN toolbox of MATLAB in terms of MSE of the training data, computational time required, and number of iterations performed. It may be observed that the trainlm optimization function produced better model performance in terms of mean square error (MSE). The value of MSE is 0.0020. This algorithm took 181.19 seconds of computational time. On the other hand, trainscg took less computational time, i.e. 4.15 seconds. However, in this case, the MSE (0.0081) is higher than the MSE achieved with the trainlm function. The traingda, traingdx, trainrp, traincgf, traincgp, and traincgb functions are better in terms of computational time requirement. These algorithms require only 4 to 10 seconds for training the networks. However, the performance of these algorithms is not better than that of trainlm. Therefore, the trainlm function is used for the rest of this study.

Table 2. Performance of Different Transfer Functions for Normalization 1 (data normalized between -1 and +1)
Hidden layer   Output layer   MSE
tansig         tansig         0.0054
tansig         logsig         0.1125
tansig         purelin        0.0020
logsig         tansig         0.2194
logsig         logsig         0.1151
logsig         purelin        0.0047
purelin        tansig         0.0091
purelin        logsig         0.1168
purelin        purelin        0.0093

Table 3. Performance of Different Transfer Functions for Normalization 2 (data normalized to mean zero and SD one)
Hidden layer   Output layer   MSE
tansig         tansig         0.1744
tansig         logsig         0.5664
tansig         purelin        0.0411
logsig         tansig         0.1660
logsig         logsig         0.5704
logsig         purelin        0.3790
purelin        tansig         0.1941
purelin        logsig         0.5792
purelin        purelin        0.0693

Table 4. Performance of Different Optimization Algorithms
Training algorithm   MSE      Computational time (sec)   Iterations
traingd              0.0059   1205.79                    57858
traingdm             0.0080   192.50                     8658
traingda             0.0206   6.48                       323
traingdx             0.0110   5.20                       263
trainrp              0.0085   5.29                       246
traincgf             0.0076   6.42                       118
traincgp             0.0054   9.23                       216
traincgb             0.0074   4.32                       85
trainscg             0.0081   4.15                       103
trainbfg             0.0064   43.70                      140
trainoss             0.0090   13.28                      107
trainlm              0.0020   180.19                     17

6.4 Performance of ANN Model
The evaluations of the ANN model are performed in terms of
Table 5. Performance of the ANN Model Trained with Error Free Data
Output AARE R TS5 TS10 TS20 TS30 TS40 TS50
1 1.59 0.998 94.92 98.98 100.00 100.00 100.00 100.00
2 1.38 0.999 97.63 100.00 100.00 100.00 100.00 100.00
3 1.30 0.995 97.29 100.00 100.00 100.00 100.00 100.00
4 1.60 0.997 94.24 98.98 100.00 100.00 100.00 100.00
5 1.47 0.998 97.29 100.00 100.00 100.00 100.00 100.00
6 1.00 0.994 99.32 100.00 100.00 100.00 100.00 100.00
7 3.28 0.996 83.05 95.93 97.97 99.32 100.00 100.00
8 2.96 0.995 82.37 97.29 98.64 100.00 100.00 100.00
9 1.69 0.994 96.61 99.32 100.00 100.00 100.00 100.00
10 2.14 0.996 90.17 98.98 100.00 100.00 100.00 100.00
11 2.39 0.994 90.51 98.31 99.66 100.00 100.00 100.00
12 1.32 0.993 98.64 100.00 100.00 100.00 100.00 100.00
13 8.04 0.960 44.07 69.49 93.56 98.98 99.32 100.00
14 7.98 0.960 43.73 72.54 92.88 97.63 99.32 100.00
15 2.77 0.969 85.08 97.97 100.00 100.00 100.00 100.00
16 6.06 0.969 53.90 80.00 97.29 98.98 99.66 99.66
17 5.81 0.969 53.90 86.78 98.64 99.66 100.00 100.00
18 1.57 0.982 99.32 100.00 100.00 100.00 100.00 100.00
Average 3.02 0.987 83.44 94.14 98.81 99.69 99.90 99.98
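The R column of Table 5 reports the coefficient of correlation between simulated and ANN-predicted concentrations for each output. A minimal sketch of that statistic is given below, assuming R is the standard Pearson correlation coefficient; the function name and sample values are hypothetical and are not taken from the study.

```python
import math


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Placeholder concentrations, not data from the study.
observed = [0.10, 0.25, 0.40, 0.80]
predicted = [0.12, 0.24, 0.43, 0.78]
print(correlation_coefficient(observed, predicted))
```

R measures how well the predictions track the trend of the simulated values, so it complements AARE and TS_x, which measure the size of individual errors.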
SD of 0.10, 0.15, and 0.20 respectively. Though the AARE values increase with the increase in random error, the AARE values encountered are in an acceptable range.
Fig. 8 shows the coefficient of correlation (R) values on the testing patterns when the training patterns are perturbed with different levels of noise. It can be observed that the R value does not decrease much with the increase in noise level. The average value of R is 0.987 when the training patterns are free from error. The average value of R decreases to 0.941 when the training patterns are perturbed with random error of SD = 0.20 and mean zero. It seems that the performance of the ANN model does not degrade much when the training patterns are perturbed with noise. The R values achieved are in an acceptable range.
Fig. 9 shows the performance of the ANN model with noisy data in terms of the TS statistics. It can be observed that the TS5 statistic is very sensitive to the perturbation error. The average value of TS5 is 83.45% when the ANN model is trained with error free patterns. This indicates that 83.45% of the predicted concentration values had an ARE value less than 5%. This implies good predictive capability of the developed ANN model. The TS5 value decreases to 78.61% when the training patterns are perturbed with random error of SD = 0.05 and mean zero. The TS5 value further decreases to 65.42%, 58.93%, and 50.02% when the training patterns are perturbed with random error of SD = 0.10, 0.15, and 0.20 respectively and mean zero. The ANN model is also a little sensitive with respect to TS10 and
References
Engg. Research and Development Center. Vicksburg, M.S. 1997.
Liong, S. Y., Lim, W. H., and Paudyal, G. N. (2000). “River stage forecasting in Bangladesh: Neural network approach.” J. of Computing in Civil Engineering, Vol. 4, No. 1, pp. 1-8.
Morshed, J. and Kaluarachchi, J. J. (1998). “Application of artificial neural networks and genetic algorithms in flow and transport simulation.” Adv. in Water Resour., Vol. 22, No. 2, pp. 145-158.
Rogers, L. L. and Dowla, F. U. (1994). “Optimization of ground water remediation using artificial neural networks and parallel solute transport modeling.” Water Resour. Res., Vol. 30, No. 2, pp. 457-481.
Rouve, G. and Stoessinger, W. (1980). Simulation of the transient position of the seawater intrusion in a coastal aquifer near Madras coast. Proceedings of the 3rd international conference finite element water resources. University of Mississippi, Oxford, Miss. 1980.
Sherif, M. M., Singh, V. P., and Amer, A. M. (1988). “A two-dimensional finite element model for dispersion (2D-FED) in coastal aquifer.” J. of Hydrology, Vol. 103, pp. 11-36.
Todd, D. K. (1980). Groundwater Hydrology, John Wiley & Sons. Singapore.
Willis, R. and Finney, B. A. (1988). “Planning model for optimal control of saltwater intrusion.” J. of Water Resour. Plng. and Mgmt., ASCE, Vol. 114, No. 2, pp. 163-178.