

KSCE Journal of Civil Engineering (2009) 13(3):205-215
DOI 10.1007/s12205-009-0205-6
www.springer.com/12205

Water Engineering

Performance of an Artificial Neural Network Model for Simulating Saltwater Intrusion Process in Coastal Aquifers when Training with Noisy Data

Rajib Kumar Bhattacharjya*, Bithin Datta**, and Mysore G. Satish***

Received July 25, 2008 / Revised February 13, 2009 / Accepted March 9, 2009

Abstract

This paper evaluates the performance of an Artificial Neural Network (ANN) model for approximating the density-dependent saltwater intrusion process in a coastal aquifer when the ANN model is trained with noisy training data. The data required for training, testing, and validation of the ANN model are generated using a numerical simulation model. The simulated data, consisting of corresponding sets of input and output patterns, are used to train a multilayer perceptron with the back-propagation algorithm. The trained ANN predicts the concentration at specified observation locations at different time steps. The performance of the ANN model is evaluated using an illustrative study area. The evaluation results show the efficient predictive capability of the ANN model even when it is trained with noisy data. A comparative study is also carried out to identify the most suitable transfer function for the artificial neurons and the best training algorithm available in MATLAB for training the ANN model.

Keywords: artificial neural networks, saltwater intrusion process, groundwater, noisy data, approximation model

1. Introduction

The demand for fresh water is accelerating as the world population increases at an alarming rate. Over-exploitation of groundwater resources has therefore become unavoidable in many parts of the world in order to cope with the increasing demand for freshwater. Moreover, it is reported that at least seventy percent of the world population lives in coastal areas (Bear et al., 1999). The main source of fresh water for these people is the freshwater aquifers near the coastal region. The unplanned exploitation of freshwater from coastal aquifers that are hydraulically connected with the sea or ocean may cause saltwater intrusion into the aquifers. In many cases, the exploitation of coastal aquifers is restricted because of saltwater intrusion. The contamination of coastal aquifers in different parts of the world due to saltwater intrusion has been reported by various researchers (Rouve and Stoessinger, 1980; Sherif et al., 1988; Willis and Finney, 1988; Cheng and Chen, 2001). As reported, the main causes of saltwater intrusion are the excessive exploitation of groundwater and the improper placement of pumping wells.

The heavier saltwater has a tendency to underlie freshwater because of the hydrodynamic mechanism. However, a mixing zone of varying density exists between freshwater and saltwater due to hydrodynamic dispersion. This zone is known as the transition zone. In this zone, the density of the mixed fluid increases gradually from that of freshwater to that of saltwater. The thickness of the transition zone generally varies from a few meters to more than a hundred meters (Todd, 1980). Some researchers have ignored the transition zone when its thickness is relatively small; in this case, saltwater and freshwater may be considered as immiscible fluids and the interface between them treated as sharp. However, the sharp-interface approximation does not provide any information about the nature of the zone of dispersion, and for a thick transition zone the sharp-interface model may give erroneous results. Therefore, it is necessary to consider the transition zone when modeling the real-world seawater intrusion process in a coastal aquifer.

The saltwater intrusion process in coastal aquifers can be simulated by solving the flow and transport equations for porous media. The density variations in the transition zone make the flow and transport processes highly complex and nonlinear. The simulation of these highly nonlinear processes is also complex and costly in terms of computer memory and computational time. An approximation of the saltwater intrusion process in a coastal aquifer would therefore be useful, especially when repetitive simulations are necessary.

Regression analysis and Artificial Neural Network (ANN) models can be used for approximating the flow and transport processes in groundwater. An artificial neural network, inspired by the workings of the human brain, is considered better than regression because it can handle partial information.

*Assistant Professor, Dept. of Civil Engineering, Indian Institute of Technology Guwahati, Guwahati 781039, Assam, India (Corresponding Author, E-mail: rkbc@iitg.ernet.in)
**Professor, Dept. of Civil Engineering, Indian Institute of Technology Kanpur, Kanpur 208016, India (E-mail: bithin@iitk.ac.in)
***Professor, Dept. of Civil Engineering, Dalhousie University, Halifax, Nova Scotia B3J 1Z1, Canada (E-mail: mysore.satish@dal.ca)


Many researchers have presented various approximation models to simulate the flow and transport processes in groundwater. Ejaz and Peralta (1985) presented regression equations to predict the downstream concentration of several constituents from the upstream flow rate and constituent concentration. Alley (1986) presented regression equations relating the variation in pumping and recharge rates at five decision wells to the concentrations at nine control locations for a two-dimensional transport process in an aquifer. Lefkoff and Gorelick (1990) presented multiple linear regression equations to approximate the transport process in an aquifer; this approximation model predicts the change in groundwater salinity resulting from hydrologic conditions and water-use decisions.

ANNs have been used extensively for simulating groundwater flow and transport processes. Rogers and Dowla (1994) combined an artificial neural network model with an optimization model to predict the total solute mass removed for treatment. They trained an ANN model, for a simplified case, to predict whether or not a given set of pumping rates satisfies the containment constraints. Morshed and Kaluarachchi (1998) presented an ANN model to approximate the concentration breakthrough curves of a one-dimensional unsaturated flow and transport model. Aly and Peralta (1999) trained an ANN to model the response surface within an optimization model and applied it to the design of pump-and-treat systems for aquifer cleanup. Johnson and Rogers (2000) evaluated the effect of using an ANN and a linear approximator in conjunction with a simulated-annealing-driven search for two different two-dimensional groundwater remediation problems. All these studies were confined to one- or two-dimensional flow and transport processes; the application of ANNs for approximating three-dimensional flow and transport processes is very limited (Bhattacharjya and Datta, 2005; Bhattacharjya et al., 2007). Bhattacharjya et al. (2007) used an ANN for approximating the three-dimensional flow and transport processes in a coastal aquifer. They used the standard back-propagation algorithm with constant learning and momentum rates, but did not explore the performance of other, faster optimization algorithms for training the ANN model. Therefore, the main objective of the present study is to develop an efficient ANN model for predicting the flow and transport processes in a coastal aquifer. ANNs have also been successfully applied to many surface-water problems, such as the rainfall-runoff process (Hsu et al., 1995), daily reservoir inflow forecasting (Coulibaly et al., 2000), river flow forecasting (Korre et al., 2000), and river stage forecasting (Liong et al., 2000).

The objectives of the study presented in this paper are (a) development of an ANN model as an approximator of the three-dimensional flow and transport processes in coastal aquifers and evaluation of its performance with noisy data, (b) performance evaluation of the developed model using a hypothetical study area, and (c) determination of the best transfer function for the artificial neurons as well as the best optimization algorithm for training the ANN model. The data generated by FEMWATER (Lin et al., 1997) for an illustrative coastal aquifer are used to train the ANN model with back-propagation algorithms. The ANN toolbox available in MATLAB is used in this study. This paper begins with a description of artificial neural networks, followed by a description of the two governing equations, the flow and transport equations, used to generate the required patterns for the ANN model. The details of the developed model are provided next, followed by a description of the standard statistical performance evaluation criteria, a discussion of the performance of the developed model, and concluding remarks.

2. Artificial Neural Networks

An Artificial Neural Network (ANN), which is considered a universal approximator, mimics the function of the human brain by acquiring knowledge through a learning process. Studies of the brain show that it stores information as patterns. Some patterns are very complex in nature; for example, the brain allows us to recognize individual faces from different angles. The learning of a pattern by an ANN model involves finding an optimal set of weights for the synaptic connections between the artificial neurons of the network. The ability to gather knowledge through the process of learning, like a human brain, from a sufficient number of input patterns makes it possible to apply ANNs to large-scale real-world problems. Once the ANN is trained, the relationship between the input and output is encoded in the network, and it can later be used to predict the output based on the information fed to the input nodes. Fig. 1 shows a single-hidden-layer feed-forward neural network. The information passes from the input layer to the output layer through the hidden layer, i.e., the information flow is only in the forward direction. A synaptic weight is assigned to each link, representing the connection strength between two neurons. The synaptic weights are modified during the network training process. The input-output relation of this feed-forward network may be written as

$y = \phi_2\left[\,V\,\phi_1(W x + b_1) + b_2\,\right]$   (1)

where x is the input vector, y is the output vector, W is the weight matrix for the synaptic connections between the input and hidden layers, V is the weight matrix for the synaptic connections between the hidden and output layers, b1 is the vector of biases in the hidden layer, b2 is the vector of biases in the output layer, φ1 is the transfer function of the neurons in the hidden layer, and φ2 is the transfer function of the neurons in the output layer.

Fig. 1. An ANN Model
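To make the notation in Eq. (1) concrete, the following MATLAB sketch evaluates the feed-forward relation for a single input pattern. The layer sizes, the random weights, and the use of the tansig and purelin transfer functions (defined in Section 2.3) are illustrative assumptions, not the trained values obtained in this study.

```matlab
% Minimal sketch of the feed-forward relation in Eq. (1) for one input
% pattern. Layer sizes, random weights and the tansig/purelin choice
% are illustrative assumptions, not trained values.
nIn  = 18;                          % input nodes
nHid = 20;                          % hidden neurons
nOut = 18;                          % output nodes

W  = rand(nHid, nIn)  - 0.5;        % input-to-hidden weight matrix
V  = rand(nOut, nHid) - 0.5;        % hidden-to-output weight matrix
b1 = rand(nHid, 1)    - 0.5;        % hidden-layer biases
b2 = rand(nOut, 1)    - 0.5;        % output-layer biases

x = rand(nIn, 1)*2 - 1;             % one normalized input pattern

phi1 = @(s) 2 ./ (1 + exp(-2*s)) - 1;   % tansig, Eq. (2)
phi2 = @(s) s;                          % purelin, Eq. (4)

y = phi2(V * phi1(W * x + b1) + b2);    % network output, Eq. (1)
```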


2.1 Training of an ANN Model

Training is the process of determining the synaptic weights of an ANN so that it produces the desired output. There are two types of training: supervised and unsupervised. Supervised training estimates the synaptic weights of the network based on specified input and output patterns. In unsupervised training, on the other hand, the network classifies the inputs without using any specified target patterns. In this study, supervised training is used.

In supervised training, both input and output data are required. In this study, the initial synaptic weights are generated randomly between -1 and +1. Based on the inputs and the initially generated random weights, the network is processed to obtain the outputs. The results obtained from the network are compared with the actual outputs. The errors are then propagated back through the network, and the synaptic weights are adjusted by minimizing the differences between the network outputs and the actual outputs. The training data sets are processed in batch mode many times to refine the synaptic weights. Training stops only when the network provides the desired outputs. Once the network is trained, its performance should be evaluated using another independent set of data to appraise its predictive capability.

The most popular algorithm for adjusting the synaptic weights of an artificial neural network is the back-propagation algorithm. This algorithm first computes the error signal at the output layer, which is then propagated back to the input layer through the hidden layer(s). After computing the error signal, the algorithm first adjusts the synaptic weights between the hidden and output layers and then the synaptic weights between the input and hidden layers, using an optimization algorithm. This procedure continues until the termination criteria are satisfied.

2.2 Termination Criteria
The training error decreases with an increasing number of training cycles. However, the error on a different data set that was not used for training, called the validation error, decreases monotonically to a minimum and then starts to increase, even as the training error continues to decrease (Burian et al., 2001; Hassoun, 1999). When an ANN is trained with noisy data, it initially learns the actual pattern, so the validation error decreases initially along with the training error. After learning the actual pattern, the ANN may start to learn the noise as well, so the validation error may increase even though the training error continues to decrease. In general practice, the validation error is carefully monitored during the training phase, and the training process is terminated just before the validation error begins to increase (Burian et al., 2001; Hassoun, 1999). However, in this study, training is continued for some additional iterations after the minimum of the validation error is reached, to check that the obtained solution is not a local optimum. Another new set, known as the testing set, which was not used for training or validation, is used for determining the predictive capability of the ANN model.

2.3 Transfer Function

An artificial neuron performs two functions, i.e., summation of the input signals and transformation of the summed input using a transfer function. The transfer function takes the summed input as its argument and produces the output. The following transfer functions available in MATLAB are considered in this study.

2.3.1 The tansig Function
This function takes input from -∞ to +∞ and generates output between -1 and +1. The function can be written as

$\mathrm{tansig}(x) = \dfrac{2}{1 + e^{-2x}} - 1$   (2)

2.3.2 The logsig Function
This function takes input from -∞ to +∞ and generates output between 0 and +1. The function can be written as

$\mathrm{logsig}(x) = \dfrac{1}{1 + e^{-x}}$   (3)

2.3.3 The purelin Function
This transfer function generates output linearly and can be written as

$\mathrm{purelin}(x) = x$   (4)

The function takes input from -∞ to +∞ and generates output between -∞ and +∞.

2.4 Optimization Algorithms
During training, the weights of the synaptic connections and the biases of the ANN model are adjusted iteratively using an optimization algorithm that minimizes the difference between the actual and predicted outputs of the ANN. Various functions are available in the ANN toolbox of MATLAB to implement different back-propagation training algorithms.

The traingd function uses the steepest descent algorithm with a learning rate and updates the weights and biases of the network in the direction of the negative gradient of the error function. The learning rate determines the step length. For a larger learning rate the step length is larger, and the algorithm may oscillate and become unstable; for a smaller learning rate the step length is small, and the algorithm takes more time to converge to the optimal solution. A learning rate of 0.05 is used with the traingd function. The algorithm used in the traingdm function is similar to that used in traingd, but it adds a momentum factor for faster convergence. The momentum factor helps the network to ignore local optima in the error surface. The learning rate and momentum rate used in this study are 0.05 and 0.8, respectively.

The back-propagation training algorithms implemented in the traingd and traingdm functions are often too slow for real-world problems.
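As an illustration of the toolbox usage described above, the following sketch configures a network for gradient-descent-with-momentum training with the learning and momentum rates stated in this section. The newff/train calls follow the older (pre-R2010) Neural Network Toolbox interface, and the data matrices P and T are random placeholders rather than the patterns used in this study.

```matlab
% Sketch of the traingdm set-up described above, using the older
% newff/train toolbox calls. P and T are random placeholders
% (one column per pattern); epoch and goal settings are illustrative.
P = rand(18, 380)*2 - 1;            % training inputs, normalized to [-1, +1]
T = rand(18, 380)*2 - 1;            % training targets, normalized to [-1, +1]

net = newff(minmax(P), [20 18], {'tansig', 'purelin'}, 'traingdm');
net.trainParam.lr     = 0.05;       % learning rate used in this study
net.trainParam.mc     = 0.80;       % momentum factor used in this study
net.trainParam.epochs = 5000;       % illustrative cap on training cycles
net.trainParam.goal   = 1e-3;       % illustrative MSE goal

[net, tr] = train(net, P, T);       % batch-mode back-propagation training
```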


Some high-performance algorithms available in the ANN toolbox of MATLAB are much faster than the algorithms used in the traingd and traingdm functions. These faster algorithms can be divided into two categories. The first category uses heuristic modifications of the standard steepest descent algorithm, whereas the second category uses standard numerical optimization techniques.

In the first category, two functions, namely traingda and trainrp, are available. The traingda function uses the standard steepest descent algorithm with a variable learning rate. The convergence of the algorithm is highly sensitive to the learning rate; therefore, this algorithm allows the learning rate to change during the training process. In the present study, if the current-step error is less than the previous-step error, the learning rate is increased by a factor of 1.04. The trainrp function, on the other hand, uses the resilient back-propagation training algorithm. When a sigmoid function is used as the transfer function, its slope approaches zero at large input values. As a result, the changes in the magnitudes of the weights and biases of the network become very small even though the weights and biases are far from their optimal values. Therefore, in the resilient back-propagation training algorithm, the size of the weight change is determined by a separate update value.

In the second category of faster algorithms, there are three types of standard numerical optimization techniques for training an ANN model: conjugate gradient, quasi-Newton, and Levenberg-Marquardt. In the conjugate gradient algorithms, a search is performed along conjugate directions, which generally produces faster convergence than steepest descent directions. In most conjugate gradient algorithms, the step size is adjusted after each iteration. Four functions implementing variations of the conjugate gradient algorithm are available in the ANN toolbox of MATLAB: traincgf (Fletcher-Reeves update), traincgp (Polak-Ribière update), traincgb (Powell-Beale restarts), and trainscg (scaled conjugate gradient).

Newton's method is an alternative to the conjugate gradient methods for fast optimization. Newton's method often converges faster than conjugate gradient methods, but it requires Hessian matrix information, and the computation of the Hessian matrix for feed-forward neural networks is complex and very expensive. There is another class of algorithms based on Newton's method, called quasi-Newton algorithms, which approximate the Hessian matrix instead of calculating it exactly; an approximate Hessian matrix is updated at each iteration. The trainbfg function uses a quasi-Newton algorithm. However, this algorithm requires more computational time and storage than the conjugate gradient methods. The trainoss function uses the one-step secant algorithm, which does not store the complete Hessian matrix and assumes that, after each iteration, the previous Hessian was an identity matrix. Like the quasi-Newton methods, the Levenberg-Marquardt algorithm also approximates the Hessian matrix; the trainlm function implements the Levenberg-Marquardt algorithm in the ANN toolbox of MATLAB.

3. Flow and Transport Equations

In the present study, the FEMWATER model is used for generating the patterns for the ANN model. The finite-element-based numerical simulation model FEMWATER simulates the three-dimensional flow and transport processes, with the associated initial and boundary conditions, for flow through saturated-unsaturated porous media. FEMWATER simulates natural conditions but cannot duplicate them; the model is highly sensitive to the grid/mesh cell size, and improper selection may result in erroneous solutions. In spite of these limitations, FEMWATER is one of the most sophisticated simulation models for density-dependent flow and transport processes. Interested readers may refer to Lin et al. (1997) for details about FEMWATER.

3.1 Flow Equations
The 3D flow equation may be written as (Lin et al., 1997):

$\dfrac{\rho}{\rho_o} F \dfrac{\partial h}{\partial t} = \nabla \cdot \left[ K \cdot \left( \nabla h + \dfrac{\rho}{\rho_o} \nabla z \right) \right] + \dfrac{\rho^{*}}{\rho_o} q$   (5)

where F is the storage coefficient, h is the pressure head [L], t is the time [T], K is the hydraulic conductivity tensor [LT⁻¹], z is the potential head [L], q is the source and/or sink [L³T⁻¹L⁻³], ρ* is the water density at the solute concentration C [ML⁻³], ρo is the reference water density corresponding to zero solute concentration [ML⁻³], and ρ is the density of either the injected fluid or the withdrawn water [ML⁻³].

The hydraulic conductivity K [LT⁻¹] depends on the fluid density ρ [ML⁻³], the dynamic viscosity µ [ML⁻¹T⁻¹], and the acceleration due to gravity g [LT⁻²], and is defined as

$K = \dfrac{k \rho g}{\mu}$   (6)

where k is the permeability tensor [L²].

The density ρ [ML⁻³] and dynamic viscosity µ [ML⁻¹T⁻¹] of water depend on the solute concentration. For saltwater intrusion, the relation between fluid density and solute concentration is expressed as (Lin et al., 1997)

$\rho = \rho_o (1 + \varepsilon c), \qquad \varepsilon = \dfrac{\rho_S - \rho_o}{\rho_o}$   (7)

where c is the dimensionless solute concentration, ε is the density difference ratio, ρS is the water density corresponding to the maximum solute concentration [ML⁻³], and ρo is the reference water density corresponding to zero solute concentration [ML⁻³].

The Darcy velocity is calculated as given below:

$V = -K \cdot \left( \dfrac{\rho_o}{\rho} \nabla h + \nabla z \right)$   (8)
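As a quick numerical illustration of the constitutive relations above, the following sketch evaluates Eqs. (6)-(8) at a single point for an isotropic medium. All numerical values (densities, permeability, viscosity, head gradients) are illustrative assumptions, not parameters taken from the study area.

```matlab
% Numerical illustration of Eqs. (6)-(8) at a single point for an
% isotropic medium. All values below are illustrative only.
rho_o = 1000;                       % reference freshwater density [kg/m^3]
rho_s = 1025;                       % density at maximum solute concentration [kg/m^3]
eps_r = (rho_s - rho_o) / rho_o;    % density difference ratio, Eq. (7) -> 0.025
c     = 0.40;                       % dimensionless solute concentration
rho   = rho_o * (1 + eps_r * c);    % mixed-fluid density, Eq. (7)

k  = 1.0e-11;                       % intrinsic permeability [m^2]
g  = 9.81;                          % gravitational acceleration [m/s^2]
mu = 1.0e-3;                        % dynamic viscosity [kg/(m s)]
K  = k * rho * g / mu;              % hydraulic conductivity, Eq. (6) [m/s]

grad_h = [-0.002; 0; 0];            % pressure-head gradient (illustrative)
grad_z = [0; 0; 1];                 % gradient of the potential head
V = -K * ((rho_o / rho) * grad_h + grad_z);   % Darcy velocity, Eq. (8) [m/s]
```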


3.2 Transport Equation
The governing equation for contaminant transport with dispersion/diffusion, adsorption, and source/sink terms used in the FEMWATER model is written as (Lin et al., 1997):

$\theta \dfrac{\partial C}{\partial t} + \rho_b \dfrac{\partial S}{\partial t} + V \cdot \nabla C - \nabla \cdot (\theta D \cdot \nabla C) = -\left( \alpha' \dfrac{\partial h}{\partial t} \right)(\theta C + \rho_b S) - \dfrac{\rho^{*}}{\rho} q C + \left( F \dfrac{\partial h}{\partial t} + \dfrac{\rho_o}{\rho} V \cdot \nabla\!\left( \dfrac{\rho}{\rho_o} \right) - \dfrac{\partial \theta}{\partial t} \right) C$   (9)

where θ is the moisture content, ρb is the bulk density of the medium [ML⁻³], C is the solute concentration in the aqueous phase [ML⁻³], S is the solute concentration in the adsorbed phase [ML⁻³], and D is the dispersion tensor [L²T⁻¹].

4. Model Development

This study employs the ANN technique for approximating the flow and transport processes in a coastal aquifer. The model predicts the salt concentration at pre-specified locations for different time steps. Since the salt concentration also varies along the depth of the aquifer, the model is developed to predict the salt concentration at both the top and the bottom of the aquifer. The inputs to the ANN model are the pumping and recharge values at different time steps. The relation between inputs and outputs may be expressed as:

$\left\{ (CTop_j^i,\; i = 1\ldots N_o,\; j = 1\ldots T),\; (CBot_j^i,\; i = 1\ldots N_o,\; j = 1\ldots T) \right\} = f\left\{ (P_j^i,\; i = 1\ldots N_p,\; j = 1\ldots T),\; (R_j^i,\; i = 1\ldots N_R,\; j = 1\ldots T) \right\}$   (10)

Here, i is the index for location and j is the index for time step. CTop_j^i [ML⁻³] is the saltwater concentration at the top of the aquifer at location i for the j-th time step, and CBot_j^i [ML⁻³] is the saltwater concentration at the bottom of the aquifer at location i for the j-th time step. P_j^i [L³T⁻¹] is the pumping at location i for the j-th time step, R_j^i [L³T⁻¹] is the recharge at location i for the j-th time step, N_o is the number of observation locations, N_p is the number of pumping locations, N_R is the number of recharge well locations, and T is the total number of time steps.

5. Performance Measure

The performance of the model is evaluated using three standard statistical performance evaluation criteria: the Average Absolute Relative Error (AARE) (Jain et al., 2005), the threshold statistic for an absolute relative error (Jain et al., 2005) level of x% (TSx) (Jain and Kumar, 2006), and the coefficient of correlation (R) (Jain et al., 2005). These three performance evaluation criteria can be calculated using the following equations:

$AARE = \dfrac{1}{N} \sum_{n=1}^{N} \left| \dfrac{C_n^o - C_n^p}{C_n^p} \right| \times 100$   (11)

$TS_x = \dfrac{n_x}{N} \times 100$   (12)

$R = \dfrac{\sum_{n=1}^{N} (C_n^o - \overline{C^o})(C_n^p - \overline{C^p})}{\sqrt{\sum_{n=1}^{N} (C_n^o - \overline{C^o})^2 \; \sum_{n=1}^{N} (C_n^p - \overline{C^p})^2}}$   (13)

Here, C_n^o is the observed saltwater concentration [ML⁻³], C_n^p is the predicted saltwater concentration [ML⁻³], \overline{C^o} is the mean of the observed concentrations [ML⁻³], \overline{C^p} is the mean of the predicted concentrations [ML⁻³], N is the number of data points, and n_x is the number of data points for which the Absolute Relative Error (ARE) is less than x%. The ARE may be written as

$ARE = \left| \dfrac{C_n^o - C_n^p}{C_n^p} \right| \times 100$   (14)

The threshold statistic is computed for ARE levels of 5%, 10%, 20%, 30%, 40%, and 50%. It may be noted that a lower AARE value and higher TSx and R values indicate better model performance.
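The following sketch computes the three criteria defined in Eqs. (11)-(14) for a pair of observed and predicted concentration vectors; the vectors are illustrative placeholders. Note that, following Eqs. (11) and (14), the relative error is taken with respect to the predicted concentration.

```matlab
% Computing AARE, TSx and R (Eqs. 11-14) for one output. Co and Cp are
% illustrative placeholder vectors of observed and ANN-predicted values.
Co = [0.82 0.45 0.10 0.67 0.30 0.95];     % observed concentrations
Cp = [0.80 0.47 0.11 0.70 0.28 0.93];     % predicted concentrations
N  = numel(Co);

ARE  = abs((Co - Cp) ./ Cp) * 100;        % absolute relative error, Eq. (14)
AARE = sum(ARE) / N;                      % Eq. (11)

x   = 5;                                  % threshold level in percent
TSx = sum(ARE < x) / N * 100;             % Eq. (12)

R = sum((Co - mean(Co)) .* (Cp - mean(Cp))) / ...
    sqrt(sum((Co - mean(Co)).^2) * sum((Cp - mean(Cp)).^2));   % Eq. (13)
```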


6. Performance Evaluation Using an Illustrative Study Area

6.1 Illustrative Study Area
An illustrative study area, as shown in Fig. 2, has been considered for evaluating the performance of the ANN model. The area of the aquifer is approximately 2.59 km² and the thickness of the aquifer is 50 m. The confined aquifer system is subjected to saltwater intrusion along the coastal side of the study area. The left-hand face of the aquifer is the ocean face, which allows saltwater to enter the aquifer through its lower part and allows the mixed water to exit from its upper part. It is assumed that mixed water can exit the aquifer through the top 20% of the ocean face (Das and Datta, 2000). The right-hand face of the aquifer is the inland face, which allows fresh water to enter the aquifer. The other four faces, front, back, top, and bottom, are considered impermeable. The three-dimensional confined hypothetical aquifer is assumed to be homogeneous and isotropic with respect to the freshwater hydraulic conductivity, molecular diffusion, and longitudinal and transverse dispersivities. The values of the aquifer parameters used in the study are listed in Table 1.

Fig. 2. Illustrative Study Area

The boundary conditions of the aquifer (Fig. 2) are considered time-invariant. The flow boundary condition on the ocean face is taken as hydrostatic in the vertical direction and is assumed to be constant over the entire ocean face. On the inland face, the reference hydraulic head varies linearly along the length of the face. The advective mixed outflux can exit through the top 20% of the ocean face; in this portion, the concentration gradient normal to the ocean face is set equal to zero. In the remaining 80% of the ocean face, the solute concentration is set equal to one, as this portion allows the influx of saltwater into the aquifer. The solute concentration is set equal to zero on the inland face. Further, the solute concentration gradient normal to the other four faces of the aquifer is set equal to zero. The steady-state reference hydraulic head and concentration are taken as the initial conditions for transient flow and transport.

Fig. 2 also shows the locations of the pumping and recharge wells. Water is pumped from, and recharged at, the middle layer of the aquifer. The pumping from the aquifer is considered transient. A stress period of 180 days is used for pumping and recharge, and the pumping and recharge rates are assumed constant within a particular stress period. The pumping patterns are generated randomly using a uniform distribution function. The transient pumping and recharge values are taken as input to the ANN model; therefore, the number of inputs to the ANN model is 18 (6 wells and 3 time steps). The observation locations are also shown in Fig. 2. There are three observation locations, and the concentrations at the top and bottom of the aquifer are predicted by the ANN model; therefore, the total number of outputs is also 18 (6 for each time step).

Table 1. Values of Aquifer Parameters

Aquifer parameter                                        Value
Hydraulic conductivity in x direction, K°xx (m/day)     20.000
Hydraulic conductivity in y direction, K°yy (m/day)      0.200
Hydraulic conductivity in z direction, K°zz (m/day)     20.000
Longitudinal dispersivity, αl (m)                       25.000
Lateral dispersivity, αt (m)                             5.000
Molecular diffusion, d° (m/day)                          0.660
Density difference ratio, ε                              0.025
Soil porosity, φ                                         0.260

6.2 Generation of ANN Patterns
The training, validation, and testing patterns are generated using the finite-element-based FEMWATER model. The input to the ANN model is a set of transient pumping and recharge rates. These values are generated using a uniform distribution between specified upper and lower limits. In this study, the upper limit of pumping is taken as 5,000 m³/day and the upper limit of recharge as 1,000 m³/day; the lower limit for both pumping and recharge is zero m³/day. The sets of transient pumping and recharge values are then used as input to the FEMWATER model for transient simulation over three time periods. The resulting concentrations at the selected observation locations at different time steps are specified as the output patterns. Fig. 3 shows a typical concentration distribution in the aquifer. For training, validation, and testing purposes, 636 patterns are generated using the FEMWATER model. The set of generated patterns is divided into three parts: 20% of the patterns are kept for testing, 20% for validation, and the remaining 60% are used for training the neural network. Once trained, the outputs of the ANN model are the concentrations at the observation locations at different time steps.

Fig. 3. Concentration Distribution in the Aquifer after the Second Time Step
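The following sketch illustrates the pattern-generation set-up described in Section 6.2: transient pumping and recharge rates drawn from uniform distributions within the stated limits, and a random 60/20/20 split of the 636 patterns. The split of the six wells into pumping and recharge wells is an assumption made only for this illustration, and the FEMWATER simulations that convert these inputs into concentration outputs are not shown.

```matlab
% Sketch of the pattern-generation set-up in Section 6.2. The split of
% the six wells into pumping/recharge wells (nPump/nRech) is assumed
% for illustration only; the FEMWATER runs are omitted.
nPat  = 636;                        % total number of generated patterns
nStep = 3;                          % stress periods of 180 days each
nPump = 3;  nRech = 3;              % assumed split of the six wells

pumpRates = 5000 * rand(nPump*nStep, nPat);   % pumping in [0, 5000] m^3/day
rechRates = 1000 * rand(nRech*nStep, nPat);   % recharge in [0, 1000] m^3/day
inputs    = [pumpRates; rechRates];           % 18 ANN inputs per pattern

idx      = randperm(nPat);
nTrn     = round(0.60 * nPat);
nVal     = round(0.20 * nPat);
trainIdx = idx(1:nTrn);                       % 60% for training
valIdx   = idx(nTrn+1 : nTrn+nVal);           % 20% for validation
testIdx  = idx(nTrn+nVal+1 : end);            % remaining 20% for testing
```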


6.3 Selection of Transfer Functions and Optimization Algorithm
It is not known in advance how many hidden neurons are required for good model performance. An increase in the number of hidden neurons may increase the capability of the ANN model to map a more complex system; however, the computational complexity also increases with the number of hidden neurons. In this study, a trial-and-error method is adopted to find the number of hidden neurons. The study shows that 20 neurons in the hidden layer yield the best model performance. Therefore, the ANN architecture considered in this study is 18-20-18.

A study is carried out to find the best combination of transfer functions for the artificial neurons in the hidden layer and in the output layer. The best combination of transfer functions is evaluated on the basis of the Mean Square Error (MSE) of the training patterns. The trainlm optimization function is used to train the networks. Table 2 shows the MSE values for the training patterns for different combinations of transfer functions when the data are normalized between -1 and +1. It may be observed that the tansig transfer function for the neurons in the hidden layer and the purelin transfer function for the neurons in the output layer result in the best model performance; in this case, the MSE is 0.0020. Table 3 shows the MSE values for the training patterns for different combinations of transfer functions when the data are normalized to zero mean and unit Standard Deviation (SD). In this case also, the tansig-purelin combination shows the best model performance with respect to the MSE statistic, with an MSE of 0.0411. Therefore, the tansig-purelin combination of transfer functions is used for the rest of the study. It can also be observed that the ANN model performance is better when the patterns are normalized between -1 and +1; therefore, for the rest of the study, the input-output patterns are normalized between -1 and +1.

Table 2. Performance of Different Transfer Functions for Normalization 1 (data normalized between -1 and +1)

Hidden layer   Output layer   MSE
tansig         tansig         0.0054
tansig         logsig         0.1125
tansig         purelin        0.0020
logsig         tansig         0.2194
logsig         logsig         0.1151
logsig         purelin        0.0047
purelin        tansig         0.0091
purelin        logsig         0.1168
purelin        purelin        0.0093

Table 3. Performance of Different Transfer Functions for Normalization 2 (data normalized to zero mean and unit standard deviation)

Hidden layer   Output layer   MSE
tansig         tansig         0.1744
tansig         logsig         0.5664
tansig         purelin        0.0411
logsig         tansig         0.1660
logsig         logsig         0.5704
logsig         purelin        0.3790
purelin        tansig         0.1941
purelin        logsig         0.5792
purelin        purelin        0.0693

A separate study is also carried out to find the best optimization algorithm for training the networks. For this purpose also, the 18-20-18 ANN architecture is used, with the tansig transfer function for the neurons in the hidden layer and the purelin transfer function for the neurons in the output layer, and the input and output patterns normalized between -1 and +1. Table 4 shows the performance of the different optimization algorithms available in the ANN toolbox of MATLAB in terms of the MSE of the training data, the computational time required, and the number of iterations performed. It may be observed that the trainlm optimization function produces the best model performance in terms of MSE, with a value of 0.0020; this algorithm took 181.19 seconds of computational time. On the other hand, trainscg took the least computational time, i.e., 4.15 seconds, but in this case the MSE (0.0081) is higher than that achieved with the trainlm function. The traingda, traingdx, trainrp, traincgf, traincgp, and traincgb functions are better in terms of computational time, requiring only 4 to 10 seconds to train the network; however, their performance is not better than that of trainlm. Therefore, the trainlm function is used for the rest of this study.

Table 4. Performance of Different Optimization Algorithms

Training algorithm   MSE      Computational time (sec)   Iterations
traingd              0.0059   1205.79                    57858
traingdm             0.0080    192.50                     8658
traingda             0.0206      6.48                      323
traingdx             0.0110      5.20                      263
trainrp              0.0085      5.29                      246
traincgf             0.0076      6.42                      118
traincgp             0.0054      9.23                      216
traincgb             0.0074      4.32                       85
trainscg             0.0081      4.15                      103
trainbfg             0.0064     43.70                      140
trainoss             0.0090     13.28                      107
trainlm              0.0020    180.19                       17


6.4 Performance of ANN Model
The ANN model is evaluated in terms of the magnitude of the errors during the testing phase using different standard statistical criteria: the average absolute relative error, the threshold statistics, and the coefficient of correlation.

Figs. 4 to 6 show the accuracy of the ANN model in predicting the flow and transport processes in the coastal aquifer. These scatter plots show the degree of correlation between the actual and predicted saltwater concentrations. The actual concentrations are the values simulated using the numerical simulation model, and the predicted values are those produced by the ANN model. It may be observed from the figures that the actual and predicted salt concentrations are almost equal, which shows the high predictive capability of the ANN model.

Fig. 4. Scatter Plot for Output 2
Fig. 5. Scatter Plot for Output 9
Fig. 6. Scatter Plot for Output 18

Table 5 summarizes the ANN model performance for the testing patterns in terms of the various statistical parameters. The prediction quality of the ANN model is superior in terms of the AARE and R criteria. The highest value of AARE encountered here is 8.04% and the lowest is 1.00%, with an average value of 3.02. Similarly, the highest value of R is 0.999 and the lowest is 0.960, with an average value of 0.987. The AARE values are small and within an acceptable range. Similarly, the R values are on the higher side and show the good predictive capability of the ANN model. The performance of the model is also superior in terms of the TSx statistics. The highest value of TS5 is 99.32% and the lowest is 43.73%, with an average of 83.44%; this shows that for 83.44% of the patterns the ARE values are less than 5%. From these evaluation results it may be concluded that the ANN model is capable of simulating the flow and transport processes with a high degree of accuracy for the illustrative study area considered in the study.

6.5 Performance of ANN Model with Erroneous Data
To evaluate the performance of the ANN model with noisy data, the training patterns are perturbed by adding normally distributed random errors with standard deviation SD and zero mean. The testing patterns used are error-free. Fig. 7 shows the performance of the ANN model in terms of the Average Absolute Relative Error (AARE) when the model is trained with noisy data. It may be observed that the AARE values increase with the increase in noise level. The AARE value for the testing patterns is 3.024 when the training patterns are free from error (SD = 0.00). The AARE value increases to 3.629 when the training patterns are perturbed with a random error of SD = 0.05 and zero mean. Similarly, the AARE value increases to 5.033, 5.568, and 6.667 when the training patterns are perturbed with random errors of SD = 0.10, 0.15, and 0.20, respectively.
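A sketch of the perturbation procedure described in Section 6.5 is given below: zero-mean, normally distributed errors with a chosen standard deviation are added to the normalized training patterns, while the testing patterns are left error-free. The pattern matrix is a random placeholder, and the retraining and re-evaluation steps are only indicated in comments.

```matlab
% Sketch of the noise perturbation used in Section 6.5 (placeholder data).
Ptrain = rand(18, 380)*2 - 1;           % normalized training patterns

for SD = [0.05 0.10 0.15 0.20]
    Pnoisy = Ptrain + SD * randn(size(Ptrain));   % zero-mean Gaussian noise
    % ... retrain the 18-20-18 network with the perturbed patterns and
    %     recompute AARE, TSx and R on the error-free testing set ...
end
```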


Table 5. Performance of the ANN Model Trained with Error-Free Data

Output    AARE (%)   R       TS5     TS10     TS20     TS30     TS40     TS50
1         1.59       0.998   94.92    98.98   100.00   100.00   100.00   100.00
2         1.38       0.999   97.63   100.00   100.00   100.00   100.00   100.00
3         1.30       0.995   97.29   100.00   100.00   100.00   100.00   100.00
4         1.60       0.997   94.24    98.98   100.00   100.00   100.00   100.00
5         1.47       0.998   97.29   100.00   100.00   100.00   100.00   100.00
6         1.00       0.994   99.32   100.00   100.00   100.00   100.00   100.00
7         3.28       0.996   83.05    95.93    97.97    99.32   100.00   100.00
8         2.96       0.995   82.37    97.29    98.64   100.00   100.00   100.00
9         1.69       0.994   96.61    99.32   100.00   100.00   100.00   100.00
10        2.14       0.996   90.17    98.98   100.00   100.00   100.00   100.00
11        2.39       0.994   90.51    98.31    99.66   100.00   100.00   100.00
12        1.32       0.993   98.64   100.00   100.00   100.00   100.00   100.00
13        8.04       0.960   44.07    69.49    93.56    98.98    99.32   100.00
14        7.98       0.960   43.73    72.54    92.88    97.63    99.32   100.00
15        2.77       0.969   85.08    97.97   100.00   100.00   100.00   100.00
16        6.06       0.969   53.90    80.00    97.29    98.98    99.66    99.66
17        5.81       0.969   53.90    86.78    98.64    99.66   100.00   100.00
18        1.57       0.982   99.32   100.00   100.00   100.00   100.00   100.00
Average   3.02       0.987   83.44    94.14    98.81    99.69    99.90    99.98

Fig. 7. Variation of R with Increase in Noise Level
Fig. 8. Performance of the ANN Model in Terms of AARE Value with Erroneous Data

Although the AARE values increase with increasing random error, the AARE values encountered remain within an acceptable range.

Fig. 8 shows the coefficient of correlation (R) values for the testing patterns when the training patterns are perturbed with different levels of noise. It can be observed that the R value does not decrease much with the increase in noise level. The average value of R is 0.987 when the training patterns are free from error, and it decreases to 0.941 when the training patterns are perturbed with a random error of SD = 0.20 and zero mean. Thus the performance of the ANN model does not degrade much with the perturbation of noise in the training patterns, and the R values achieved are within an acceptable range.

Fig. 9 shows the performance of the ANN model with noisy data in terms of the TS statistics. It can be observed that the model is quite sensitive to the perturbation error for the TS5 statistic. The average value of TS5 is 83.45% when the ANN model is trained with error-free patterns, indicating that 83.45% of the predicted concentration values have an ARE value of less than 5%; this implies a good predictive capability of the developed ANN model. The TS5 value decreases to 78.61% when the training patterns are perturbed with a random error of SD = 0.05 and zero mean, and it further decreases to 65.42%, 58.93%, and 50.02% when the training patterns are perturbed with random errors of SD = 0.10, 0.15, and 0.20, respectively, with zero mean. The ANN model is also somewhat sensitive with respect to the TS10 and TS20 values.


However, the model is less sensitive for the TS30, TS40, and TS50 statistics. For example, the TS40 value is 99.91% when the ANN model is trained with error-free patterns and 99.68% when the ANN model is trained with patterns perturbed with a random error of SD = 0.20 and zero mean. This indicates that the absolute relative errors for most of the data set are less than 40%. These evaluation results show that the ANN is capable of approximating the flow and transport processes in coastal aquifers for the study area considered here, and that the predictive capability of the ANN model does not degrade much when the model is trained with noisy training data.

Fig. 9. Performance of the ANN Model in Terms of TS Statistics Value with Erroneous Data

7. Conclusions

This study presents an ANN model for approximating the transient three-dimensional flow and transport processes in coastal aquifers. The training and validation patterns are generated using a finite-element-based numerical simulation model. The performance of the developed ANN model is evaluated for an illustrative study area consisting of a hypothetical confined aquifer. The performance of the ANN approximator is quite encouraging based on the AARE, R, and TSx statistics. The performance evaluations of the developed ANN model show that this approach is potentially useful for the approximate simulation of the transient, density-dependent, three-dimensional flow and transport processes in coastal aquifers. The performance of the developed ANN model is also evaluated for noisy data; in this case, the input training patterns are perturbed with random errors of different levels. It is observed that the performance of the model does not degrade much with the perturbation of noise in the training data. It may therefore be concluded that, even with the incorporation of field measurement errors in the training data, the predictive capability of the ANN model does not degrade much, and the model can be applied for simulating the flow and transport processes in a coastal aquifer. However, more rigorous evaluations using larger and more complex real-world study areas are necessary before the applicability of this approach can be fully established.

References

Alley, W. M. (1986). "Regression approximation for transport model constraint sets in combined aquifer simulation-optimization studies." Water Resour. Res., Vol. 22, No. 4, pp. 581-586.
Aly, A. H. and Peralta, R. C. (1999). "Optimal design of aquifer cleanup systems under uncertainty using a neural network and genetic algorithms." Water Resour. Res., Vol. 35, No. 8, pp. 2523-2532.
Bear, J., Cheng, A. H. D., Sorek, S., Ouazar, D., and Herrera, I. (1999). Seawater Intrusion in Coastal Aquifers - Concepts, Methods and Practices, in Theory and Application of Transport in Porous Media, edited by J. Bear, Kluwer Academic Publishers, Dordrecht, p. 625.
Bhattacharjya, R. K. and Datta, B. (2005). "Optimal management of coastal aquifers using link simulation optimization approach." J. Water Resources Management, Vol. 19, No. 3, pp. 295-320.
Bhattacharjya, R. K., Datta, B., and Satish, M. G. (2007). "Artificial neural networks approximation of density dependent saltwater intrusion process in coastal aquifers." Journal of Hydrologic Engineering, ASCE, Vol. 12, No. 3, pp. 273-282.
Burian, S. J., Durrans, S. R., Nix, S. J., and Pitt, R. E. (2001). "Training artificial neural networks to perform rainfall disaggregation." Journal of Hydrologic Engineering, ASCE, Vol. 6, No. 1, pp. 43-51.
Cheng, J. M. and Chen, C. X. (2001). "Three-dimensional modeling of density-dependent saltwater intrusion in multilayered coastal aquifers in Jahe river basin, Shandong province, China." Ground Water, Vol. 39, No. 1, pp. 137-143.
Coulibaly, P., Anctil, F., and Bobée, B. (2000). "Daily reservoir inflow forecasting using artificial neural networks with stopped training approach." Journal of Hydrology, Vol. 230, pp. 244-257.
Das, A. and Datta, B. (2000). "Optimization based solution of density dependent seawater intrusion in coastal aquifers." Journal of Hydrologic Engineering, Vol. 5, No. 1, pp. 82-89.
Ejaz, M. S. and Peralta, R. C. (1985). "Modeling for optimal management of agricultural and domestic wastewater loading to streams." Water Resour. Res., Vol. 31, No. 4, pp. 1087-1096.
Hassoun, M. H. (1999). Fundamentals of Artificial Neural Networks, Prentice Hall of India Pvt. Ltd., New Delhi, pp. 226-230.
Hsu, K., Gupta, V. H., and Sorooshian, S. (1995). "Artificial neural network modeling of the rainfall-runoff process." Water Resources Research, Vol. 31, No. 10, pp. 2517-2530.
Jain, A. and Kumar, A. (2006). "An evaluation of artificial neural network technique for the determination of infiltration model parameters." J. Soft Computing, Vol. 6, No. 3, pp. 272-282.
Jain, A., Srinivasulu, S., and Bhattacharjya, R. K. (2005). "Determination of an optimal unit pulse response function using real-coded genetic algorithm." Journal of Hydrology, Vol. 303, No. 1-4, pp. 199-214.
Johnson, V. M. and Rogers, L. L. (2000). "Accuracy of neural network approximations in simulation-optimization." Journal of Water Resources Planning and Management, ASCE, Vol. 126, No. 2, pp. 48-56.
Korre, A., Durucan, S., and Imrie, C. E. (2000). "River flow prediction using artificial neural networks: Generalized beyond calibration range." Journal of Hydrology, Vol. 233, pp. 138-153.
Lefkoff, L. J. and Gorelick, S. M. (1990). "Simulating physical processes and economic behavior in saline, irrigated agriculture: Model development." Water Resour. Res., Vol. 26, No. 7, pp. 1359-1369.
Lin, H. J., Richards, D. R., Talbot, C. A., Yeh, G. T., Cheng, J. R., Cheng, H. P., and Jones, N. L. (1997). A Three-Dimensional Finite Element Computer Model for Simulating Density-Dependent Flow and Transport in Variably Saturated Media: Version 3.1, U.S. Army Engineering Research and Development Center, Vicksburg, MS.


Liong, S. Y., Lim, W. H., and Paudyal, G. N. (2000). "River stage forecasting in Bangladesh: Neural network approach." Journal of Computing in Civil Engineering, Vol. 4, No. 1, pp. 1-8.
Morshed, J. and Kaluarachchi, J. J. (1998). "Application of artificial neural networks and genetic algorithms in flow and transport simulation." Advances in Water Resources, Vol. 22, No. 2, pp. 145-158.
Rogers, L. L. and Dowla, F. U. (1994). "Optimization of ground water remediation using artificial neural networks and parallel solute transport modeling." Water Resour. Res., Vol. 30, No. 2, pp. 457-481.
Rouve, G. and Stoessinger, W. (1980). "Simulation of the transient position of the seawater intrusion in a coastal aquifer near the Madras coast." Proceedings of the 3rd International Conference on Finite Elements in Water Resources, University of Mississippi, Oxford, Miss.
Sherif, M. M., Singh, V. P., and Amer, A. M. (1988). "A two-dimensional finite element model for dispersion (2D-FED) in coastal aquifer." Journal of Hydrology, Vol. 103, pp. 11-36.
Todd, D. K. (1980). Groundwater Hydrology, John Wiley & Sons, Singapore.
Willis, R. and Finney, B. A. (1988). "Planning model for optimal control of saltwater intrusion." Journal of Water Resources Planning and Management, ASCE, Vol. 114, No. 2, pp. 163-178.

