
Analytica Chimica Acta 658 (2010) 1–11. doi:10.1016/j.aca.2009.11.001


Modeling the performance of "up-flow anaerobic sludge blanket" reactor based wastewater treatment plant using linear and nonlinear approaches—A case study
Kunwar P. Singh a,*, Nikita Basant b, Amrita Malik a, Gunja Jain a

a Environmental Chemistry Division, Indian Institute of Toxicology Research (Council of Scientific & Industrial Research), Post Box No. 80, MG Marg, Lucknow-226 002, UP, India
b School of Graduate Studies-Multiscale Modeling, Computational Simulations and Characterization in Material and Life Sciences, University of Modena and Reggio E., Modena, Italy
* Corresponding author. Tel.: +91 522 2476091; fax: +91 522 2628227. E-mail addresses: kpsingh_52@yahoo.com, kunwarpsingh@gmail.com (K.P. Singh).

Article history: Received 24 June 2009; Received in revised form 16 October 2009; Accepted 2 November 2009; Available online 10 November 2009.

Keywords: Wastewater; Partial least squares regression; Multivariate polynomial regression; Artificial neural networks; Modeling; Levenberg–Marquardt algorithm

Abstract

The paper describes linear and nonlinear modeling of the wastewater data for the performance evaluation of an up-flow anaerobic sludge blanket (UASB) reactor based wastewater treatment plant (WWTP). Partial least squares regression (PLSR), multivariate polynomial regression (MPR) and artificial neural networks (ANNs) modeling methods were applied to predict the levels of biochemical oxygen demand (BOD) and chemical oxygen demand (COD) in the UASB reactor effluents using four input variables measured weekly in the influent wastewater during the peak (morning and evening) and non-peak (noon) hours over a period of 48 weeks. The performance of the models was assessed through the root mean squared error (RMSE), relative error of prediction in percentage (REP), the bias, the standard error of prediction (SEP), the coefficient of determination (R2), the Nash–Sutcliffe coefficient of efficiency (Ef), and the accuracy factor (Af), computed from the measured and model predicted values of the dependent variables (BOD, COD) in the WWTP effluents. Goodness of the model fit to the data was also evaluated through the relationship between the residuals and the model predicted values of BOD and COD. Although the model predicted values of BOD and COD by all the three modeling approaches (PLSR, MPR, ANN) were in good agreement with their respective measured values in the WWTP effluents, the nonlinear models (MPR, ANNs) performed relatively better than the linear ones. These models can be used as a tool for the performance evaluation of the WWTPs.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

For an effective management of industrial and domestic wastewater, an uninterrupted and efficient functioning of the wastewater treatment plants (WWTPs) is essential. This can be ensured through regular and continuous monitoring of the performance of the WWTP [1]. Improper functioning of a WWTP may lead to discharge of contaminated effluents to the receiving land or water body, causing severe public health risk in the surrounding area. The performance of a WWTP is generally expressed in terms of the ratios of the characteristic variables of the treated and raw wastewaters. The organic pollution indicator variables, such as the biochemical oxygen demand (BOD) and chemical oxygen demand (COD), are considered the key parameters for describing the wastewater characteristics, and their effluent-to-influent ratio serves as a measure of the overall performance of a conventional WWTP [2]. Since the characteristics and the flow rate of the raw wastewater entering the WWTP show large variations both with time (hours of the day, weeks, and seasons) and space (due to different lifestyles of the population), and also vary from one plant to another, it is very difficult to develop a common strategy for the performance evaluation of a WWTP [3]. Further, the tedious analytical procedures required for the quantitative determination of the performance indicator variables press for an alternative, reliable methodology.

Safer operation and control of a WWTP can be ensured by developing an appropriate mathematical model for predicting the plant performance based on past observations of certain key quality parameters. Modeling approaches have been accepted globally for the evaluation of WWTP performance. Mechanistic models can be used to simulate the plant behavior over a wide range of operating conditions [4–6]; however, these models require long simulation times because the broad range of time constants necessitates small integration steps, and because seasonal temperature variations require long-term (at least one year) influent data for a reliable performance evaluation covering the whole temperature range. Further, a large variability of the influent characteristics in terms of composition, strength, and flow rates might significantly influence the model parameters and, consequently, the operational control [7]. Recently, Singh et al. [2] used multi-way (multi-linear) data modeling approaches (PARAFAC and NPLS) for predicting the performance of a sewage treatment plant (STP) and reported a good agreement between the measured and model predicted values of the characteristic variables in the STP effluents. On the other hand, the complex physical, chemical and biological processes involved in wastewater treatment exhibit a nonlinear behavior, which is difficult to describe with linear mathematical models [3].

The up-flow anaerobic sludge blanket (UASB) reactor based WWTPs are among the most popular and are frequently used in full-scale installations for the anaerobic treatment of domestic and industrial wastewater [8,9]. In this process, wastewater is passed through the reactor in an up-flow mode. The up-flowing wastewater travels through a zone containing a large number of sludge particles held in suspension in the reactor, which provide a large surface area for attachment of the organic matter undergoing biodegradation. Wu and Hickey [10] developed a dynamic model for UASB reactors containing granular sludge by incorporating hydraulic concepts along with appropriate consideration of substrate degradation kinetics and mass transfer limitations. However, none of these mechanistic models is able to completely explain or predict the performance of a UASB reactor under various conditions. Therefore, modeling of a WWTP is a difficult task, and most of the available models are only approximations, often based on severe assumptions [7,11,12]. Artificial neural networks (ANNs), owing to their high accuracy, adequacy and promising applications in engineering [13–15], can be used for modeling such WWTP processes. An ANN normally relies on representative historical data of the process and has strong predictive capabilities. Most of the available literature on the application of ANNs for modeling the performance of WWTPs utilized the BOD and COD as the common key parameters [3,16–18].
In this work, an attempt has been made to model the wastewater data using linear and nonlinear approaches, namely partial least squares regression (PLSR), multivariate polynomial regression (MPR), and artificial neural networks (ANNs), to predict the performance of an UASB reactor based WWTP, using the data set pertaining to the selected characteristic variables measured in the influent and effluent wastewater. The data set comprised four characteristic variables measured in both the influent (untreated) and effluent (treated) wastewater samples collected on a weekly basis, three times a day, during both the peak (morning and evening) and non-peak (noon) flow regimes for a period of 48 weeks. The results of the modeling exercise would help to assess the expected effluent quality for a given influent wastewater stream and, hence, the performance of the WWTP.

2. Methodology

2.1. Wastewater dataset

The available dataset on four characteristic variables of the influent and effluent wastewater of the UASB reactor based WWTP, Kanpur (India) was selected. The data comprised four variables, viz. BOD, COD, ammoniacal nitrogen (NH4–N) and total Kjeldahl nitrogen (TKN), measured weekly in the influent (untreated) and effluent (treated) wastewater samples collected thrice a day, covering the peak (morning and evening) and non-peak (noon) flow regimes, over a period of 48 weeks. The wastewater samples were transported to the laboratory under low temperature (4 °C) conditions and analyzed within a week. The BOD, COD, TKN and NH4–N of the untreated and treated wastewater samples were determined following the standard methods of analysis [19]. The BOD and COD of the wastewater samples were measured following the Winkler and reflux methods, respectively. NH4–N was determined spectrophotometrically (Nesslerization method) using a UV–vis spectrophotometer (GBC Cintra-40, Australia), whereas TKN was determined by distillation (macro-Kjeldahl method) and titration with H2SO4; TKN is the sum of free ammonia and organic nitrogen compounds. The minimum detection limits for NH4–N and TKN were 0.02 and 0.2 mg L−1, respectively. All measurements were made in duplicate and the results presented are the respective mean values.

Fig. 1. Box and whisker plots of the measured variables in the (a) influent, and (b) effluents of the UASB reactor.

The dataset was analyzed statistically by generating box and whisker diagrams [20] (Fig. 1) for all the measured variables both in the influent and the effluents. These diagrams summarize each variable by three components: a central point to indicate the central tendency (median), a box to indicate the variability around this central tendency (25th and 75th percentiles), and whiskers around the box to indicate the range of the variable.
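Box and whisker summaries of this kind can be reproduced with standard plotting tools. The sketch below is only illustrative (not the authors' plotting code) and assumes the weekly measurements have been loaded into two pandas DataFrames named `influent` and `effluent` with columns for BOD, COD, NH4-N and TKN.

```python
import matplotlib.pyplot as plt
import pandas as pd

def box_whisker(influent: pd.DataFrame, effluent: pd.DataFrame) -> None:
    """Draw side-by-side box and whisker plots for the influent and effluent variables."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, frame, title in zip(axes, (influent, effluent), ("(a) Influent", "(b) Effluent")):
        # Each box shows the median, the 25th-75th percentile box and the whiskers (range).
        frame.boxplot(ax=ax)
        ax.set_title(title)
        ax.set_ylabel("Concentration (mg L$^{-1}$)")
    fig.tight_layout()
    plt.show()
```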
2.2. Multivariate modeling

The main aim of this study was to build a multivariate model able to predict the levels of the characteristic variables (BOD and COD) in the effluents of the UASB reactor based WWTP from the influent characteristics, so as to evaluate the performance of the plant; that is, to find a mathematical relationship between these two sets (influent and effluent) of variables. In this paper, three different multivariate modeling approaches, linear (PLSR), low-order nonlinear (MPR) and nonlinear (ANNs), are evaluated for the performance evaluation of the WWTP. Based on the existing measured values of the different variables and their correlation analysis, four variables (BOD, COD, NH4–N, TKN) pertaining to the influent wastewater were identified which affect the effluent characteristics (BOD, COD), and these were finally selected for the model development.

For the purpose of modeling, the data were partitioned into three sub-sets (calibration, validation, and test) using the Kennard–Stone (K–S) approach. The K–S algorithm designs the model set in such a way that the objects are scattered uniformly around the calibration domain; thus, all sources of the data variance are included in the calibration model [21]. In the present study, the complete input data set (144 samples × 4 variables) was partitioned into a calibration set (72 samples × 4 variables), a validation set (36 samples × 4 variables), and a test set (36 samples × 4 variables). Thus, the calibration, validation and test sets comprised 50%, 25% and 25% of the samples, respectively.
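The Kennard–Stone selection can be written in a few lines of NumPy. The sketch below is a minimal illustration under the assumption that `X` is the 144 × 4 influent matrix; it is not the exact routine used by the authors.

```python
import numpy as np

def kennard_stone(X: np.ndarray, n_select: int) -> np.ndarray:
    """Return the indices of n_select samples chosen by the Kennard-Stone algorithm."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise Euclidean distances
    selected = list(np.unravel_index(np.argmax(dist), dist.shape))  # start with the two most distant samples
    while len(selected) < n_select:
        remaining = [i for i in range(len(X)) if i not in selected]
        # add the sample whose minimum distance to the already selected set is largest
        next_idx = max(remaining, key=lambda i: dist[i, selected].min())
        selected.append(next_idx)
    return np.array(selected)

# Hypothetical usage: pick 72 calibration samples, then split the remainder into validation and test.
# cal_idx = kennard_stone(X, 72)
```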
2.2.1. Partial least squares regression (PLSR) modeling
Partial least squares regression, a linear multivariate calibration technique [22], aims to find the relationship between a set of predictor (independent) data, X, and a set of responses (dependent data), Y. It attempts to maximize the covariance between X and Y and searches for the factor space most congruent to both matrices [23]. A detailed description of the PLSR method and its algorithms can be found elsewhere [24]; in brief, it can be expressed as a bilinear decomposition of both X and Y [25,26],

X = TW^T + E_X    (1)

Y = UQ^T + E_Y    (2)

such that the scores of X and the scores of the yet unexplained part of Y have maximum covariance. Here T, W and U, Q are the X and Y PLS scores and loadings (weights) vectors, respectively; E_X and E_Y are the X and Y residuals. The decomposition models of X and Y and the expression relating these models through regression constitute the linear PLSR model. The PLSR model, performed in two stages, uses a set of calibration (training) samples to construct the model, which is employed to compute a set of regression coefficients (B_PLS). These coefficients are then used to predict the dependent variables (Y_new) for a new (test) experimental set as

Y_new = X_new B_PLS + E    (3)

The B_PLS vector is derived from the model parameters. Here, we have used the linear PLS1R (single dependent variable), PLS2R (set of two dependent variables) and augmented X-matrix PLS2R-A (X-matrix containing the variables along with their squared values) models to analyze our data set for predicting the BOD and COD levels in the WWTP effluents.

2.2.1.1. Data arrangement for PLSR modeling. X-block: The influent (raw) wastewater characteristic variables were arranged in a two-way array by taking the samples as the rows and the 4 process variables as the columns, yielding X-matrices of dimensions 72 × 4 for calibration and 36 × 4 each for the validation and test sets. The X-matrix data constituted the independent set of variables.
Y-block: In a similar way, the Y-block comprised the samples as rows and either a single variable (BOD or COD, in case of PLS1R) or the set of two variables (BOD and COD of the effluents, in case of PLS2R) as the columns. For PLS1R, the dimensions of the y-vector were 72 × 1 for calibration and 36 × 1 each for the validation and test sets, whereas for PLS2R the respective Y-dimensions were 72 × 2, 36 × 2 and 36 × 2.

2.2.1.2. Validation and prediction. An external validation approach was adopted for validating the PLSR models and selecting the optimum number of latent variables (LVs) [27]. The external validation was carried out through the validation set obtained by K–S splitting of the complete dataset. The optimum number of latent variables (LVs) was decided on the basis of the root mean square error of prediction (RMSEP) and the magnitude of the y-variance described by the PLS models. The selected PLSR models were then employed to predict the dependent variables using the test set of the data. The model predicted values of the dependent variables were used for calculating the figures of merit for evaluating the performance of the selected PLSR models.
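As a rough illustration of this workflow (not the authors' code), PLS1R-type models can be fitted with scikit-learn's PLSRegression, choosing the number of latent variables from the validation-set RMSEP. The array names (X_cal, y_cal, X_val, y_val) are assumed here.

```python
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error

def select_pls_model(X_cal, y_cal, X_val, y_val, max_lv=4):
    """Fit PLS models with 1..max_lv latent variables and keep the one with the lowest validation RMSEP."""
    best = None
    for n_lv in range(1, max_lv + 1):
        pls = PLSRegression(n_components=n_lv).fit(X_cal, y_cal)
        rmsep = mean_squared_error(y_val, pls.predict(X_val)) ** 0.5
        if best is None or rmsep < best[0]:
            best = (rmsep, n_lv, pls)
    return best  # (validation RMSEP, number of LVs, fitted model)
```

A PLS2R-type model is obtained by passing a two-column y (BOD and COD together), and a PLS2R-A-type model by augmenting X with its squared columns, e.g. `np.hstack([X, X**2])`.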
2.2.2. Multivariate polynomial regression (MPR) modeling
Multivariate polynomial regression is a low-order nonlinear method that fits a polynomial relationship between the X and y data sets so as to obtain the best estimates of y. The regression coefficients are estimated by the least squares method. The MPR model with interaction terms applied to the wastewater dataset has the form

y = a + b_1 x_1 + b_2 x_1^2 + b_3 x_2 + b_4 x_2^2 + b_5 x_3 + b_6 x_3^2 + b_7 x_4 + b_8 x_4^2 + b_9 x_1 x_2 + b_{10} x_1 x_3 + b_{11} x_2 x_3 + b_{12} x_1 x_4 + b_{13} x_2 x_4 + b_{14} x_3 x_4 + E    (4)

where y is the dependent variable, a is the intercept, b_1–b_14 are the polynomial coefficients, x_1–x_4 are the independent measured variables and E is the error. The quadratic model used here is of the interactive type, where the interactions between the various independent variables are also taken into account.

Here, the MPR model was applied to the data set (144 samples × 4 variables) pertaining to the influent wastewater, divided into the same three sub-sets (calibration, validation and prediction), so as to compare the model performance results with the other models. Separate MPR models were developed for predicting the BOD and COD in the effluents using the influent dataset. The model predicted values of the dependent variables were then used for calculating the figures of merit for evaluating the performance of the MPR models.
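A quadratic model with interaction terms of the form of Eq. (4) can be sketched with scikit-learn's PolynomialFeatures (degree 2) followed by ordinary least squares. This is an illustration only, under the assumption that X_cal and y_cal hold the influent predictors and the effluent BOD (or COD).

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# degree=2 generates the squared and pairwise interaction terms of Eq. (4);
# include_bias=False leaves the intercept a to LinearRegression.
mpr = make_pipeline(
    StandardScaler(),                                  # auto-scaling, as described in Section 2.2.4
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
# mpr.fit(X_cal, y_cal)
# y_pred = mpr.predict(X_test)
```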
2.2.3. Artificial neural networks (ANN) modeling
The artificial neural network, as the name implies, employs the model structure of a neural network, which is a very powerful computational technique for modeling complex nonlinear relationships, particularly in situations where the explicit form of the relation between the variables involved is unknown [28,29]. The main differences between the various types of ANNs are the network architecture and the method for determining the weights and functions for the inputs and neurodes (training) [30]. The multi-layer perceptron (MLP) neural network is designed to function well in modeling nonlinear phenomena. The number of hidden layers in an ANN model is usually determined by trial and error; a single hidden layer network is commonly sufficient for most problems. The basic structure of a feed-forward ANN model usually comprises one input layer, one or more hidden layers and an output layer. Each layer consists of a certain number of basic elements called neurons or nodes. A neuron is a nonlinear algebraic function, parameterized with boundary values [31]. The signal passing through a neuron is modified by weights and transfer functions; this process is repeated until the output layer is reached [13]. The number of neurons in the input, hidden and output layers depends on the problem. If the number of hidden neurons is small, the network may not have sufficient degrees of freedom to learn the process correctly; on the other hand, if the number is too high, the training will take a longer time and the network may over-fit the data [32]. The connections between the input layer and the middle or hidden layer carry weights, which are usually determined through training the system. The hidden layer sums the weighted inputs and uses the transfer function to create an output value. The transfer function is a relationship between the internal activation level of the neuron and the outputs.

Fig. 2. Conceptual ANNs for (a) separate prediction of the BOD and COD levels; (b) simultaneous prediction of the BOD and COD levels in the UASB reactor effluents.

Here, three-layer feed-forward neural networks with back-propagation (BP) learning were constructed for separate (Fig. 2a) and simultaneous (Fig. 2b) prediction of the BOD and COD levels in the effluents of the UASB reactor based WWTP, using the four selected input variables of pattern p (x_pi, i = 1, ..., 4). A feed-forward neural network (FFNN) is very powerful in function optimization modeling and has been used extensively for the prediction of water resources variables [33]. A single hidden layer is used in all the three networks. The input vector can therefore be denoted as x_p = (x_p1, x_p2, ..., x_p4). The output from the input layer then becomes the input net_{pj}^h to the hidden layer, which takes the form [34]

net_{pj}^h = \sum_{i=1}^{N} w_{ji}^h x_{pi} + \theta_j^h    (5)

where w_{ji}^h is the connection weight from the ith input node to the jth hidden node, the superscript h refers to quantities of the hidden layer, \theta_j^h is the bias term (bias weight) for the hidden layer and N is the dimension of the input vector. In the hidden layer, the activation function f_j^h is applied to the incoming quantity net_{pj}^h, yielding the output i_{pj}, as

i_{pj} = f_j^h(net_{pj}^h)    (6)

Similarly, the input to the output layer is the summation of the outputs from the hidden layer,

net_{pk}^o = \sum_{j=1}^{L} w_{kj}^o i_{pj} + \theta_k^o    (7)

where \theta_k^o is the bias term (bias weight) for the output layer and L is the dimension of the hidden layer. The output from the output layer is finally of the form

O_{pk} = f_k^o(net_{pk}^o)    (8)

Here, a nonlinear (tan-sigmoid) activation function was applied in both the hidden and output layers.
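Equations (5)–(8) amount to the forward pass sketched below. The weights and biases are random placeholders rather than the trained values, and tanh stands in for the tan-sigmoid transfer function.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 1            # e.g. the BOD network of Fig. 2a: 4 inputs, 8 hidden nodes, 1 output

W_h = rng.normal(size=(n_hidden, n_in))    # w_ji^h, input-to-hidden weights
b_h = rng.normal(size=n_hidden)            # theta_j^h, hidden-layer biases
W_o = rng.normal(size=(n_out, n_hidden))   # w_kj^o, hidden-to-output weights
b_o = rng.normal(size=n_out)               # theta_k^o, output-layer bias

def forward(x_p: np.ndarray) -> np.ndarray:
    """Propagate one scaled input pattern x_p through the 4-8-1 network (Eqs. (5)-(8))."""
    net_h = W_h @ x_p + b_h        # Eq. (5)
    i_p = np.tanh(net_h)           # Eq. (6), tan-sigmoid activation
    net_o = W_o @ i_p + b_o        # Eq. (7)
    return np.tanh(net_o)          # Eq. (8)

print(forward(np.array([0.1, -0.3, 0.5, 0.0])))
```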

The combined inputs are then transformed to the neuron output, which is generally distributed to various connection pathways to provide inputs to the other neurons. The sigmoid function transforms the independent variable x, having a range of −∞ to +∞, to the range −1 to +1. Maier and Dandy [35] have shown that the hyperbolic tangent transfer function (between −1 and 1) produces better performance in terms of RMSE and learns more quickly than the linear and bipolar sigmoid (between 0 and 1) functions. However, variables scaled to −1 and +1 may lead to saturation of the sigmoid function, so we scaled the variables between −0.9 and +0.9. After the feed-forward propagation process is completed, the back-propagation starts at the output layer, incrementally adjusting the connection weights between the output and hidden layers, and between the hidden and input layers, following the BP algorithm. Back-propagation (BP) is a commonly used learning algorithm in ANN applications, which uses back-propagation of the error gradient. This training algorithm distributes the error in order to arrive at a best fit or minimum error. After the information has gone through the network in a forward direction and the network has predicted an output, the back-propagation algorithm redistributes the error associated with this output back through the model, and the weights are adjusted accordingly. Minimization of the error is achieved through several iterations; one complete cycle is known as an 'epoch'. Each neuron in a layer is connected to every neuron in the next layer, and these links are given a synaptic weight that represents the connection strength [13]. Although traditional BP uses a gradient descent algorithm to determine the weights in the network, it converges rather slowly (linear convergence). Hence, the Levenberg–Marquardt algorithm (LMA), which is much faster as it adopts the method of approximate second derivatives [36], was used here. The LMA is similar to the quasi-Newton method, in which a simplified form of the Hessian matrix (second derivative) is used. The Hessian matrix can be approximated as [37]

H = J^T J    (9)

and the gradient g can be computed as

g = J^T e    (10)

where J is the Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, and e is the vector of network errors. One iteration of this algorithm can be written as

x_{k+1} = x_k - [J^T J + \mu I]^{-1} J^T e    (11)

where \mu is the learning rate and I is the identity matrix. During training, the learning rate \mu is incremented or decremented by a scale factor at the weight updates. When \mu is zero, this is just Newton's method using the approximate Hessian matrix; when \mu is large, it becomes gradient descent with a small step size. The LMA is reported to have the fastest convergence for neural networks that contain up to a few hundred neurons [38].
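One Levenberg–Marquardt update of Eq. (11) can be written directly in NumPy. The sketch below operates on a generic weight vector w, a Jacobian J and an error vector e, which are assumed to be supplied by the network; it is a didactic step, not the training code used in the study.

```python
import numpy as np

def lm_step(w: np.ndarray, J: np.ndarray, e: np.ndarray, mu: float) -> np.ndarray:
    """Apply one Levenberg-Marquardt update, w_new = w - (J^T J + mu*I)^-1 J^T e (Eq. (11))."""
    H = J.T @ J                     # approximate Hessian, Eq. (9)
    g = J.T @ e                     # gradient, Eq. (10)
    return w - np.linalg.solve(H + mu * np.eye(len(w)), g)

# In practice mu is decreased after a successful step (towards Gauss-Newton behaviour)
# and increased after a failed one (towards small-step gradient descent).
```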
During the training process, there are three factors associated with the weight optimization algorithms: (1) the initial weight matrix, (2) the learning rate, and (3) the stopping criteria, such as (3a) fixing the number of epochs, (3b) setting a target error goal, and (3c) fixing a minimum performance gradient. The learning rate is an indicator of the rate of convergence. If it is too small, the rate of convergence will be slow due to the large number of steps needed to reach the minimum error. If it is too large, the convergence will initially be fast but will produce undue oscillations and may not reach the minimum error. The value of the learning parameter is not fixed, as its optimization is highly problem dependent, and it should be selected so that oscillations in the error surface are avoided [14,35]. Hagan et al. [39] demonstrated that the learning becomes unstable for higher values (>0.035); thus, the learning rate was set to 0.001. The mean square error (MSE), used as the target error goal, is defined as [38,39]

MSE = \frac{1}{N} \sum_{i=1}^{N} (y_{pred} - y_{meas})^2    (12)

where y_pred and y_meas represent the model predicted and measured values of the variable, and N represents the number of observations. The maximum number of epochs, the target error goal (MSE) and the minimum performance gradient were set to 300, 10^{-10}, and 10^{-10}, respectively. Training stops when any of these conditions occurs.

The optimal architecture of the ANN models and their parameter variation were determined on the basis of the minimum value of the MSE of the training and validation sets. In optimizing the networks, four neurons were used in the hidden layer as an initial guess. With an increase in the number of neurons, the networks yielded several local minima with different MSE values for the training set. Through a trial and error approach, various combinations of the number of neurons in the hidden layer, back-propagation algorithms, and transfer functions (linear and sigmoid) were tested. The lowest MSE for the training and validation sets was the criterion for selecting the best case [38].

2.2.4. Data treatment
For the PLSR and MPR modeling, the experimental data, arranged in data matrices, were auto-scaled (column mean centered and scaled). This data treatment eliminates offsets and changes in measurement units, and focuses the analysis on proper modeling of the observed variances in the measured variables [40]. Since the data were pre-processed, they were transformed back to the original form prior to the post-modeling computations.

For ANN modeling, in view of the requirements of the computation algorithm, the raw data of both the independent and dependent variables were normalized to an interval by transformation. The transformation modifies the distribution of the input variables so that it matches the distribution of the estimated outputs. Here, all the variables were transformed to the same range, −0.9 to +0.9, as

z_i = low + \frac{(x_i - x_{min})(up - low)}{x_{max} - x_{min}}    (13)

where z_i is the scaled value, x_i is the ith value of the variable, x_max and x_min are the maximum and minimum values of the variable, and up and low are the highest and lowest values of the range selected for scaling the data. The model output results were back-transformed in order to make them comparable with the original target values, as

x_i = \frac{(z_i - low)(x_{max} - x_{min})}{up - low} + x_{min}    (14)

The ANNs were applied to provide a nonlinear relationship between the sets of inputs, comprised of the selected characteristic variables of the influent wastewater, and the network outputs (BOD and COD).
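Equations (13) and (14) are a plain min–max mapping onto [−0.9, +0.9] and its inverse; a small sketch (with illustrative names) follows.

```python
import numpy as np

LOW, UP = -0.9, 0.9

def scale(x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    """Map raw values onto [LOW, UP] as in Eq. (13)."""
    return LOW + (x - x_min) * (UP - LOW) / (x_max - x_min)

def unscale(z: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    """Back-transform scaled values to the original units as in Eq. (14)."""
    return (z - LOW) * (x_max - x_min) / (UP - LOW) + x_min

# A common practice (not spelled out in the paper) is to take x_min and x_max from the
# calibration data and reuse them for the validation and test sets, so that all sub-sets
# share the same mapping.
```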

Table 1
Performance parameters for the PLSR, MPR, and ANN models for prediction of the BOD and COD levels in the UASB reactor effluents.

Variable  Model  Sub-set  R2  RMSEP  Bias  Af  Ef  REP  SEP  R2 (residuals)a
BOD PLS1R Cal 0.93 2.55 0.00 1.04 0.93 4.59 2.57 0.00
Val 0.92 2.52 0.02 1.04 0.91 4.58 2.54 0.02
Test 0.88 2.28 0.00 1.04 0.88 4.37 2.31 0.02
PLS2R Cal 0.93 2.67 0.00 1.04 1.00 4.80 2.69 0.00
Val 0.92 2.62 0.03 1.05 0.90 4.78 2.65 0.02
Test 0.87 2.37 0.01 1.04 0.87 4.54 2.40 0.01
PLS2R-A Cal 0.92 2.85 0.00 1.05 1.00 5.12 2.87 0.00
Val 0.91 2.65 0.02 1.04 0.90 4.82 2.68 0.02
Test 0.86 2.44 0.01 1.04 0.86 4.68 2.47 0.01
MPR Cal 0.95 2.23 0.00 1.04 0.95 4.01 2.25 0.00
Val 0.91 2.75 0.03 1.05 0.89 5.01 2.78 0.00
Test 0.88 2.27 0.01 1.04 0.88 4.36 2.30 0.01
ANN1 Cal 0.97 1.63 −0.03 1.02 0.97 2.93 1.64 0.00
Val 0.93 2.35 0.00 1.04 0.92 4.29 2.39 0.15
Test 0.92 2.87 −0.04 1.05 0.80 4.50 2.88 0.04
ANN2 Cal 0.95 2.19 −0.07 1.03 0.95 3.93 2.20 0.00
Val 0.92 2.48 −0.01 1.03 0.91 4.52 2.52 0.00
Test 0.90 2.98 −0.03 1.05 0.79 5.72 3.01 0.04

COD PLS1R Cal 0.80 8.56 0.00 1.05 0.80 6.38 8.62 0.00
Val 0.73 7.22 0.03 1.05 0.68 5.45 7.31 0.12
Test 0.68 5.47 −0.02 1.04 0.61 4.21 5.55 0.16
PLS2R Cal 0.80 8.56 0.00 1.05 0.80 6.38 8.62 0.00
Val 0.73 7.22 0.03 1.05 0.68 5.45 7.31 0.12
Test 0.68 5.47 −0.02 1.04 0.61 4.21 5.55 0.16
PLS2R-A Cal 0.78 9.09 0.00 1.06 0.78 6.77 9.15 0.00
Val 0.72 7.34 0.04 1.05 0.67 5.54 7.44 0.09
Test 0.60 5.94 −0.01 1.04 0.54 4.57 6.02 0.13
MPR Cal 0.80 6.89 0.00 1.04 0.87 5.13 6.94 0.00
Val 0.83 6.11 0.08 1.04 0.77 4.62 6.17 0.06
Test 0.79 4.62 0.03 1.04 0.72 3.56 4.68 0.20
ANN1 Cal 0.90 6.41 −1.65 1.04 0.89 4.77 6.24 0.08
Val 0.87 5.61 0.08 1.03 0.81 4.23 5.64 0.07
Test 0.84 4.46 −0.07 1.04 0.76 4.97 6.52 0.01
ANN2 Cal 0.90 6.21 0.66 1.03 0.90 4.63 6.22 0.01
Val 0.84 6.39 0.08 1.04 0.75 4.82 6.44 0.18
Test 0.83 4.52 −0.09 1.04 0.75 5.02 6.56 0.06
a Correlation (R2) between the residuals and the model predicted values of the dependent variable.

2.2.5. Figures of merit
To determine the performance of each of the selected (PLSR, MPR, ANN) models, different figures of merit, such as the RMSE, the relative error in percentage (REP), the bias, the standard error of prediction (SEP), the coefficient of determination (R2), the Nash–Sutcliffe coefficient of efficiency (Ef), and the accuracy factor (Af), were calculated [33,41,42].

(a) The RMSE represents the error associated with the model and can be computed as

RMSE = \sqrt{\frac{\sum_{i=1}^{N} (y_{pred} - y_{meas})^2}{N}}    (15)

where y_pred and y_meas represent the model computed and measured values of the variable, and N represents the number of observations. The RMSE, a measure of the goodness-of-fit, best describes an average measure of the error in predicting the dependent variable; however, it does not provide any information on phase differences.

(b) The relative error of the dependent variable in percentage (REP) for the calibration, validation and prediction steps is calculated as

REP = \sqrt{\frac{\sum_{i=1}^{N} (y_{pred} - y_{meas})^2}{\sum_{i=1}^{N} y_{meas}^2}} \times 100    (16)

(c) The bias, or average value of the residuals (non-explained difference) between the actual and predicted values of the dependent variable, represents the mean of all the individual errors and indicates whether the model overestimates or underestimates the dependent variable. It is calculated as

Bias = \frac{1}{N} \sum_{i=1}^{N} (y_{pred} - y_{meas})    (17)

(d) The standard error of prediction is calculated as

SEP = \sqrt{\frac{\sum_{i=1}^{N} (y_{pred} - y_{meas} - bias)^2}{N - 1}}    (18)

(e) The coefficient of determination (R2), the square of the correlation coefficient, represents the percentage of variability that can be explained by the model and is calculated as

R^2 = \left[\frac{N \sum y_{meas} y_{pred} - \sum y_{meas} \sum y_{pred}}{\sqrt{\left(N \sum y_{meas}^2 - (\sum y_{meas})^2\right)\left(N \sum y_{pred}^2 - (\sum y_{pred})^2\right)}}\right]^2    (19)

(f) The Nash–Sutcliffe coefficient of efficiency (Ef), an indicator of the model fit, is computed as [43]

E_f = 1 - \frac{\sum_{i=1}^{N} (y_{pred} - y_{meas})^2}{\sum_{i=1}^{N} (y_{meas} - \bar{y}_{meas})^2}    (20)

where \bar{y}_{meas} is the mean of the measured values. The Ef is a normalized measure (−∞ to 1) that compares the mean square error generated by a particular model simulation to the variance of the target output sequence. An Ef value of 1 indicates perfect model performance (the model perfectly simulates the target output), an Ef value of zero indicates that the model is, on average, performing only as well as the use of the mean target value as prediction, and an Ef value < 0 indicates an altogether questionable choice of the model [33,44].

(g) The accuracy factor (Af), a simple multiplicative factor indicating the spread of the results about the prediction, is computed as [45]

A_f = 10^{\sum \left| \log(y_{pred}/y_{meas}) \right| / N}    (21)

The larger the value of Af, the less accurate is the average estimate; a value of one indicates perfect agreement between all the predicted and measured values. Each performance criterion described above conveys specific information regarding the predictive performance of a specific model. The goodness-of-fit of the selected models (PLSR, MPR, and ANN) was also checked through the analysis of the residuals.
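The figures of merit of Eqs. (15)–(21) reduce to a few NumPy one-liners. The function below is a compact sketch (not the authors' implementation) that returns them as a dictionary for one set of measured and predicted values.

```python
import numpy as np

def figures_of_merit(y_meas: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute RMSE, REP, bias, SEP, R2, Ef and Af (Eqs. (15)-(21))."""
    n = len(y_meas)
    resid = y_pred - y_meas
    bias = resid.mean()                                                        # Eq. (17)
    r = np.corrcoef(y_meas, y_pred)[0, 1]
    return {
        "RMSE": np.sqrt(np.mean(resid ** 2)),                                  # Eq. (15)
        "REP":  100 * np.sqrt(np.sum(resid ** 2) / np.sum(y_meas ** 2)),       # Eq. (16)
        "bias": bias,
        "SEP":  np.sqrt(np.sum((resid - bias) ** 2) / (n - 1)),                # Eq. (18)
        "R2":   r ** 2,                                                        # Eq. (19)
        "Ef":   1 - np.sum(resid ** 2) / np.sum((y_meas - y_meas.mean()) ** 2),  # Eq. (20)
        "Af":   10 ** np.mean(np.abs(np.log10(y_pred / y_meas))),              # Eq. (21)
    }
```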
3. Results and discussion

The results obtained by applying the different multivariate linear and nonlinear methods for modeling and prediction of the BOD and COD levels in the effluents of the UASB reactor from the influent variables are presented in Table 1. Three data sub-sets were generated using the Kennard–Stone approach and used for modeling; the first one was for calibration of the model, and the other two sets for validation and testing of the model. It may be emphasized here that all the data values were obtained through real measurements under field conditions varying throughout the year, so all possible sources of variation during the year were included.

3.1. Partial least squares regression (PLSR) modeling

The number of latent variables (LVs) in the calibration of the PLSR models (PLS1R, PLS2R, PLS2R-A) was selected on the basis of the percent of variance explained in X and captured by the models in Y in external validation, together with the minimum root mean square error of validation (RMSEV) for each model. A minimum RMSEV was obtained with three LVs for each of the PLSR models; hence, three-LV PLSR models were selected and run for the complete calibration, validation and prediction data sets. The three-LV PLS1R models, when run for the complete calibration data set, captured about 86% of the variance in X in both cases (BOD and COD), and 93 and 80% of the variance in y for the BOD and COD models, respectively. The performance criteria parameters (R2, RMSE, bias, Af, Ef, REP, and SEP) obtained for the PLS1R models for the calibration, validation and prediction sets are summarized in Table 1. The two PLS1R models yielded RMSE of 2.55 (BOD) and 8.56 (COD) with significantly (p < 0.001) high correlation coefficients (R2) of 0.93 and 0.80 between the measured and predicted values of BOD and COD, respectively, for the calibration set. The three-component model applied to the validation and test data sets for the prediction of BOD yielded RMSEP of 2.52 and 2.28, respectively, and R2 values of 0.92 and 0.88 for the two sets, whereas in case of COD the model yielded relatively higher RMSEP (7.22 and 5.47) and relatively lower correlation (R2) values (0.73 and 0.68) (Table 1). The low values of bias, REP and SEP and the Ef and Af values (Table 1) close to unity for the calibration, validation and prediction sets for BOD and COD (except Ef) suggest the adequacy and goodness-of-fit of the selected PLS1R models in predicting these variables.

In case of the PLS2R and PLS2R-A models, the amounts of variance explained in X were about 88 and 91% and those captured in Y were 86 and 85%, respectively. The performance criteria parameters (R2, RMSE, bias, Af, Ef, REP, and SEP) pertaining to the PLS2R models for the calibration, validation and prediction sets are summarized in Table 1. As is evident, for both the PLS2R and PLS2R-A models, significantly high correlations between the measured and model predicted values of the two dependent variables were obtained. The values of the correlation coefficients were comparable with those obtained for the respective PLS1R models. However, in terms of the estimated error parameters (RMSEP, REP, SEP), the predictive performance of the two PLS2R models was marginally inferior to that of the corresponding PLS1R models (Table 1). Normally, PLS1R is expected to give a better model for a single y-variable than PLS2R for an enlarged Y-variable set. However, PLS2R models performing equally well or even better than the corresponding PLS1R models have been reported [46]; this may be because the PLS1R model does not use all the information available in the Y-matrix. Moreover, the performance of the PLS2R models depends upon the correlation between the dependent variables modeled simultaneously, and in case of highly correlated Y-variables the performance of PLS2R models may be even better than that of the respective PLS1R models [46]. In this case, both the y-variables (BOD and COD) exhibited high correlations (0.70–0.90) in all the three sub-sets.

The measured and model predicted values of BOD and COD yielded by the three PLS models for the calibration, validation and prediction data sets are plotted in Figs. 3 and 4. A close pattern of variation among the measured and model predicted (by all the three PLS models) values of both the variables (BOD and COD) in all the three data sets is evident, suggesting a good predictive capability of the selected models.

Further, the correlation coefficient (R2) values between the residuals and y_pred (the model predicted values of BOD and COD) for the calibration, validation and prediction sets, as obtained for the PLSR models, are shown in Table 1. The relationship between the residuals and the model predicted values of the dependent variable (y_pred) can be informative regarding the model fit to a data set: if the residuals appear to behave randomly (low correlation), the model fits the data well, whereas if a non-random distribution is evident in the residuals, the model does not fit the data adequately [47]. The low values of the correlation coefficients (R2) between the residuals and the model predicted values of the dependent variables (BOD and COD) (Table 1) indicate a random pattern in the distribution of the residuals and, hence, the appropriateness of the PLSR model fits to all the data points.

From the various model diagnostics (Table 1), it is evident that the performance of all the PLSR models for COD was relatively poor compared with that for BOD. This may be due to a relatively higher degree of nonlinear relationship between the dependent (COD) and independent variables. In terms of the various model performance criteria considered here, the PLSR models fitted the measured data sets well and yielded satisfactory results, suggesting their suitability for predicting both the BOD and COD levels in the effluents of the UASB reactor from the input variables. However, since the process variables may have some degree of nonlinear relationship, we attempted to treat the data set using nonlinear modeling approaches and to make a comparative study.

3.2. Multivariate polynomial regression (MPR) modeling

The MPR modeling results for the BOD and COD of the effluents of the UASB reactor are presented in Table 1. From the model performance diagnostic parameters, it may be noted that for both the dependent variables (BOD and COD), the low-order nonlinear MPR models fitted the experimental data relatively better than the linear PLSR models.

Fig. 3. Comparison of the model predicted and measured BOD levels in the UASB effluents: (a) calibration, (b) validation, and (c) test sets using the PLSR, MPR, and ANN models.
Fig. 4. Comparison of the model predicted and measured COD levels in the UASB effluents: (a) calibration, (b) validation, and (c) test sets using the PLSR, MPR, and ANN models.

For both the dependent variables, the MPR models fitted to the three (calibration, validation, prediction) data sets yielded significantly (p < 0.001) high correlations (R2) between the measured and model predicted values of BOD (0.95, 0.91, and 0.88) and COD (0.80, 0.83, and 0.79), and relatively lower RMSE, bias, REP and SEP values, with Ef and Af values closer to unity (Table 1). The model performance diagnostics thus suggest a relatively better fit of the MPR models to the data. This may be attributed to the nonlinear nature of the MPR model and, hence, its capability to capture the nonlinear relationships among the variables. The plots of the measured and MPR model predicted values of BOD and COD obtained for the three data sets (Figs. 3 and 4) show a close pattern of variation among them, emphasizing the good predictive capability of these models. A low correlation between the model predicted values of the dependent variables and the corresponding residuals (Table 1) suggests a random distribution of the latter and, hence, the appropriateness of the quadratic model fit to all the data points.

However, as in the case of PLSR, the MPR model performed relatively better for BOD than for COD (Table 1); at the same time, the improvement in performance of the MPR model over the PLSR model was larger for COD. This suggests that the BOD data have a relatively lower degree of nonlinearity in their relationships with the other variables than the COD data, which, due to their more nonlinear nature, could be fitted relatively better by the nonlinear MPR model. This further emphasized the need for examining a more complex nonlinear model, such as an ANN, to predict the dependent variables.

3.3. Artificial neural network (ANN) modeling

In this study, three different ANNs were constructed to predict the BOD and COD levels (both separately and simultaneously) in the effluents of the UASB reactor based WWTP. Different ANN models were constructed and tested in order to determine the optimum number of nodes in the hidden layer and the transfer functions. The networks were trained using the training data set and then validated with the validation data set. The optimal network size was selected as the one which resulted in the minimum MSE in the training and validation data sets [48].
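The trial-and-error search over hidden-layer sizes can be mimicked with scikit-learn's MLPRegressor, keeping the configuration with the lowest validation MSE. This is only an approximation of the procedure in the paper: MLPRegressor trains with gradient-based solvers (e.g. L-BFGS) rather than Levenberg–Marquardt, and the array names are assumed.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def best_hidden_size(X_train, y_train, X_val, y_val, sizes=range(4, 10)):
    """Train one single-hidden-layer tanh network per candidate size and keep the lowest validation MSE."""
    best = None
    for n_hidden in sizes:
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="tanh",
                           solver="lbfgs", max_iter=300, random_state=0)
        net.fit(X_train, y_train)
        mse = mean_squared_error(y_val, net.predict(X_val))
        if best is None or mse < best[0]:
            best = (mse, n_hidden, net)
    return best  # (validation MSE, number of hidden nodes, fitted network)
```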

Fig. 5. Plot of residuals versus model predicted values of BOD in the UASB effluents: (a) calibration, (b) validation, and (c) test sets using the BOD ANN model.
Fig. 6. Plot of residuals versus model predicted values of COD in the UASB effluents: (a) calibration, (b) validation, and (c) test sets using the COD ANN model.

3.3.1. BOD and COD ANN1 models
The architecture of the best ANN models for the separate prediction of the BOD and COD of the UASB effluents is shown in Fig. 2a. The ANN model for BOD is composed of one input layer with four input variables, one hidden layer with eight nodes and one output layer with one output variable, whereas the COD model differed in the number of nodes in the hidden layer, as it optimized with seven nodes in this layer. For selecting the number of nodes in the hidden layer, a rule of thumb relies on the fact that the number of samples in the training set should be greater than the number of synaptic weights [49]; the number of hidden nodes n_h in these models is between n_I and 2n_I + 1 [50], where n_I is the number of input nodes (variables). The constructed ANN models (BOD and COD) were trained using the LMA, which is much faster than other algorithms used in BP. In both of these ANNs, a nonlinear transfer function (tansig) was used in both the hidden and output layers. The coefficient of determination (R2), RMSE, bias, REP and SEP values, and Ef and Af, computed for the training, validation and test data sets for the two models (BOD and COD), are presented in Table 1.

Fig. 3a–c shows the plots between the measured and model predicted values of BOD in the training, validation and test sets. The selected ANN (four nodes in the input layer, eight nodes in the hidden layer, and a single node in the output layer) for BOD provided a best-fit model for all the three data sets. The coefficient of determination (R2) values (p < 0.001) for the training, validation and test sets are 0.97, 0.93, and 0.92, respectively. The respective values of RMSE and bias for the three data sets are 1.63 and −0.03 for training, 2.35 and 0.00 for validation, and 2.87 and −0.04 for test. The closely followed pattern of variation between the measured and model predicted BOD levels in the UASB effluents (Fig. 3a–c) and the reasonably low values of REP and SEP for the three datasets (Table 1) indicate a good fit of the BOD-ANN model to the dataset and the adequacy of the selected model for predicting the BOD levels in the UASB effluents. Further, the values of Af (1.02, 1.04, and 1.05) and Ef (0.97, 0.92 and 0.80) for the training, validation and test sets, close to unity, suggest a near-perfect fit of the selected ANN model.

The model predicted BOD values (BODpred) and the residuals corresponding to the training, validation and test sets are plotted in Fig. 5. The observed relationship between the residuals and the model predicted BOD values for all the three sets shows almost complete independence and random distribution of the residuals. Fig. 5 shows that the points are well distributed on both sides of the horizontal line of zero ordinate representing the average of the residuals, suggesting that the model fits the data well. A plot of the residuals against the BODpred values can be particularly informative regarding model fit to a data set [47].

In case of COD, the selected ANN (four nodes in the input layer, seven nodes in the hidden layer and a single node in the output layer) provided a best-fit model for all the three (training, validation and test) sets. Fig. 4a–c shows the plots between the measured and model predicted values of COD in the training, validation and testing sets. The coefficient of determination (R2) values (p < 0.001) for the training, validation and test sets were 0.90, 0.87, and 0.84, respectively. The respective values of RMSE and bias for the three data sets are 6.41 and −1.65 for training, 5.61 and 0.08 for validation, and 4.46 and −0.07 for test (Table 1). Moreover, the relatively low REP and SEP values indicate a good fit of the selected model. The closely followed pattern of variation between the measured and model predicted COD values (Fig. 4a–c) and the R2 and RMSE values also suggest a good fit of the selected COD model to the dataset. Moreover, the values of Af close to unity for the training, validation and test sets suggest a near-perfect fit of the selected ANN model.

The model predicted COD values (CODpred) and the residuals corresponding to the training, validation and testing sets are plotted in Fig. 6. The observed relationship between the residuals and the model predicted COD values for all the three sets shows complete independence and random distribution of the residuals; this is further supported by the negligibly small values of the correlations. Fig. 6 shows that the points are well distributed on both sides of the horizontal line of zero ordinate representing the average of the residuals, suggesting that the model fits the data well.

The relatively better performance of the BOD model compared with the COD model (Table 1) suggests that the selected factors (input variables) have a relatively greater impact on BOD than on COD. The selection of the factors might affect the model output markedly [51]. Several computed values of COD deviate more from the actual measured values in the effluents (Fig. 4) because the measured values may be affected by many factors during the study.

3.3.2. Combined BOD–COD ANN2 model
The architecture of the best ANN model for the simultaneous prediction of BOD and COD in the treated wastewater is shown in Fig. 2b. The ANN for the BOD–COD model is composed of one input layer with four input variables, one hidden layer with eight nodes and one output layer with two output variables. The selected ANN model was trained using the LMA; a nonlinear transfer function (tansig) was used in both the hidden and output layers. The coefficient of determination (R2), RMSEP, bias, REP and SEP values, and Ef and Af, computed for the training, validation and test data sets used for the model, are presented in Table 1. Figs. 3 and 4 show the plots between the measured and model predicted values of BOD and COD in the training, validation and testing sets, respectively.

The selected ANN provided a best-fit model for the simultaneous prediction of both the variables (BOD and COD) in all the three sets. For the BOD values predicted by the combined model, the coefficient of determination (R2) values (p < 0.001) for the training, validation and test sets were 0.95, 0.92, and 0.90, respectively. The respective values of RMSE and bias are 2.19 and −0.07 for training, 2.48 and −0.01 for validation, and 2.98 and −0.03 for the test set. The closely followed pattern of variation between the measured and model predicted BOD values (Fig. 3a–c) and the R2, RMSE and bias values suggest a good fit of the model to the dataset. Moreover, the relatively low values of the errors (REP and SEP) and the accuracy factor (Af) and Nash–Sutcliffe coefficient (Ef) values close to unity for all the three sets suggest a near-perfect fit of the selected ANN model. Similarly, the simultaneously predicted COD values using the combined BOD–COD model yielded correlations of 0.90, 0.84, and 0.83 (p < 0.001) between the measured and model predicted values in the training, validation and test sets, respectively (Table 1; Fig. 4). The model yielded RMSE and bias values of 6.21 and −0.66 (training), 6.39 and 0.08 (validation), and 4.52 and −0.09 (test) for the three sets. The accuracy factor values close to unity for all the three sets suggest a near-perfect fit of the selected ANN model.

The model predicted values of BOD and COD and the residuals corresponding to the training, validation and testing sets are plotted in Figs. 7 and 8, respectively. The observed relationships between the residuals and the model predicted values of BOD and COD for all the three data sets show almost complete independence and random distribution, with negligibly small correlations between them (Table 1). For the simultaneously predicted values of BOD and COD, the respective correlations with their residuals are small. Fig. 7 (BOD) and Fig. 8 (COD) show that the points are well distributed on both sides of the horizontal line of zero ordinate representing the average of the residuals; the pattern suggests that the model fits the data well.

Fig. 7. Plot of residuals versus model predicted values of BOD in the UASB reactor effluents: (a) calibration, (b) validation, and (c) test sets using the combined BOD–COD ANN model.
Fig. 8. Plot of residuals versus model predicted values of COD in the UASB reactor effluents: (a) calibration, (b) validation, and (c) test sets using the combined BOD–COD ANN model.

In recent years, ANNs have been used for the performance prediction of WWTPs. Hanbay et al. [1] developed an ANN model for prediction of the levels of pH, conductivity, COD, TN, and TSS in the effluents using the set of variables in the influent as the model input. Hamed et al. [3] developed ANN models to predict the performance of a WWTP in terms of the BOD and total suspended solids (TSS) in the effluents using the same variables in the influent wastewater as the model input; correlation coefficients ranging between 0.63–0.81 and 0.45–0.65 were reported for the measured and model predicted levels of BOD and TSS in the WWTP effluents. Mjalli et al. [17] developed ANN models for predicting the BOD, COD, and TSS in the effluents using the levels of these variables in the influent wastewater, both individually and in various combinations; the correlation coefficient values between the measured and model predicted levels were between 0.57 and 0.95. Raduly et al. [18] developed an ANN model for rapid evaluation of the WWTP performance over a broad range of plant operating conditions by combining an influent disturbance generator with a mechanistic WWTP model; the predicted levels of NH4–N, BOD, and TSS in the effluents were in good agreement with their measured values. Hong et al. [52] applied a supervised neural network for discovering the complex dependence between the process variables and diagnosed the behavior of the municipal WWTP system. Tomenko et al. [53] developed models based on multiple regression analysis (MRA), multi-layer perceptron and radial basis function approaches for predicting the wastewater treatment efficiency of a constructed wetland treatment (CWT) system relating the input–output wastewater variables, and reported a high correlation (0.84–0.99) between the measured and the model predicted values of the effluent BOD.

The correlation between the measured and the model predicted values of the dependent variable is considered an important indicator of the predictive ability of a model. The linear and nonlinear models used here yielded high correlations between the measured and predicted values of BOD and COD in the UASB effluents in all the three sets (calibration, validation and test), and visibly it appears that the nonlinear models (MPR and ANNs) performed relatively better (Table 1). Therefore, the relative performance of the selected models was evaluated for each of the three different sets using the Steiger's Z test [54] for correlations at a significance level of 5%. The Steiger's test considers the Z-scores corresponding to the correlation values (Fisher's transformation) in the significance testing; the Z-critical values do not depend on the degrees of freedom. The Z-values were calculated for the different pairs of correlations obtained for the selected models in all the sets (calibration, validation and test). The computed Z-scores for the various correlation pairs were then compared with the Z-critical value corresponding to the significance level (p < 0.05). In all the cases (except for the ANN models in the calibration set pertaining to BOD), the computed Z-values for the correlation pairs were less than the Z-critical value (1.96), suggesting no statistically significant difference in the predictive ability of the different models used.
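A comparison of two correlations through their Fisher z-scores can be sketched as below. This is a simplified version that treats the two correlations as independent; Steiger's full test for correlations measured on the same samples additionally uses the correlation between the two prediction series, so the function is illustrative rather than a reproduction of the authors' calculation.

```python
import numpy as np
from scipy.stats import norm

def fisher_z(r: float) -> float:
    """Fisher's r-to-z transformation."""
    return np.arctanh(r)

def compare_correlations(r1: float, r2: float, n1: int, n2: int):
    """Z statistic and two-sided p-value for the difference between two correlations."""
    z = (fisher_z(r1) - fisher_z(r2)) / np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    p = 2 * (1 - norm.cdf(abs(z)))   # |z| > 1.96 rejects equality at p < 0.05
    return z, p

# Example with Table 1 values: PLS1R vs ANN1 test-set R2 for BOD (0.88 vs 0.92), 36 test samples each.
print(compare_correlations(np.sqrt(0.88), np.sqrt(0.92), 36, 36))
```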

4. Conclusions

Due to the high nonlinearity of the plant processes and the non-uniformity and variability of the raw wastewater characteristics, as well as the nature of the biological treatment process, modeling of a WWTP is a difficult task.

Both the linear and nonlinear modeling approaches (PLSR, MPR and ANNs) were implemented to address this problem and to discover the interdependency of the influent–effluent variables. The UASB reactor influent–effluent data were used to predict the plant behavior without mechanistic bio-modeling, which involves a great degree of complexity and uncertainty. The predictive abilities of the various linear and nonlinear models employed were evaluated through several diagnostics. All the modeling approaches (PLSR, MPR, and ANNs) yielded satisfactory results, predicting values of the dependent variables in close agreement with the respective measured values and showing high correlations between them. Although these correlations were not statistically (p < 0.05) different, the results of the other diagnostics suggested that the MPR and ANN models performed marginally better than the PLSR models. These models provided relatively better estimates for the BOD and COD levels in the effluents over the range of data used for the training, validation, and testing purposes. These modeling approaches can be used as a tool for the performance evaluation of the WWTP.

Acknowledgements

The authors thank the Director, Indian Institute of Toxicology Research, Lucknow (India) for his keen interest in the work. Financial assistance from CSIR, New Delhi is thankfully acknowledged.

References

[1] D. Hanbay, I. Turkoglu, Y. Demir, Expert Syst. Appl. 34 (2008) 1038.
[2] K.P. Singh, N. Basant, A. Malik, S. Sinha, G. Jain, Chemom. Intell. Lab. Syst. 95 (2009) 18.
[3] M.M. Hamed, M.G. Khalifullah, E.A. Hassanien, Environ. Model. Softw. 19 (2004) 919.
[4] X. Hao, M.C.M. van Loosdrecht, S.C.F. Meijer, Y. Qian, Water Res. 35 (2001) 2851.
[5] S. Salem, D. Berends, J.J. Heijnen, M.C.M. van Loosdrecht, Water Sci. Technol. 45 (2002) 169.
[6] J.B. Copp, The COST Simulation Benchmark: Description and Simulator Manual, Office for Official Publications of the European Communities, Luxembourg, 2002, ISBN 92-894-1658-0.
[7] M.F. Hamoda, I.A. Al-Gusain, A.H. Hassan, Water Sci. Technol. 40 (1999) 55.
[8] G. Lettinga, H. Pol, Water Sci. Technol. 24 (1991) 109.
[9] Z. Wang, Z. Cheng, Z. Qian, Proceedings of the 4th International Symposium on Anaerobic Digestion, Guangzhou, China, 1985.
[10] M.M. Wu, R.F. Hickey, J. Environ. Eng. 123 (1997) 244.
[11] D.S. Lee, J.M. Park, J. Biotech. 75 (1999) 229.
[12] I. Plazi, G. Pipus, M. Drolka, T. Koloini, Acta Chim. Slovenica 42 (1999) 289.
[13] R.S. Govindaraju, J. Hydrol. Eng. 5 (2000) 124.
[14] H.R. Maier, G.C. Dandy, Environ. Model. Softw. 15 (2000) 101.
[15] T.R. Neelakantan, T.R. Brion, S. Lingireddy, Water Sci. Technol. 43 (2001) 125.
[16] K.P. Oliveira-Esquerre, M. Mori, R.E. Bruns, Brazil. J. Chem. Eng. 19 (2002) 365.
[17] F.S. Mjalli, S. Al-Asheh, H.E. Alfadala, J. Environ. Manag. 83 (2007) 329.
[18] B. Raduly, K.V. Gernaey, A.G. Capodaglio, P.S. Mikkelsen, M. Henze, Environ. Model. Softw. 22 (2007) 1208.
[19] APHA, Standard Methods for the Examination of Water and Wastewater, 18th ed., American Public Health Association, Washington, DC, 1998.
[20] J.W. Tukey, Exploratory Data Analysis, Addison-Wesley, Reading, MA, 1977.
[21] M. Daszykowski, S. Serneels, K. Kaczmarek, P. van Espen, C. Croux, B. Walczak, Chemom. Intell. Lab. Syst. 85 (2007) 269.
[22] P. Geladi, B.R. Kowalski, Anal. Chim. Acta 185 (1986) 1.
[23] A. Donachie, A.D. Walmsley, S.J. Haswell, Anal. Chim. Acta 378 (1999) 235.
[24] H. Martens, T. Naes, Multivariate Calibration, Wiley, Chichester, 1989.
[25] K.P. Singh, A. Malik, N. Basant, P. Saxena, Anal. Chim. Acta 584 (2007) 385.
[26] C. Durante, M. Cocchi, M. Grandi, A. Marchetti, R. Bro, Chemom. Intell. Lab. Syst. 83 (2006) 54.
[27] L.P. Bras, M. Lopes, A.P. Ferreira, J.C. Menezes, J. Chemom. 22 (2008) 695.
[28] S.I. Gallant, Neural Network Learning and Expert Systems, The MIT Press, Massachusetts, USA, 1993.
[29] M. Smith, Neural Networks for Statistical Modelling, Van Nostrand Reinhold, NY, 1994, p. 235.
[30] M. Caudill, C. Butler, Understanding Neural Networks: Basic Networks, vol. 1, MIT Press, Cambridge, MA, 1992.
[31] G. Dreyfus, J.M. Martinez, M. Samuelides, M.B. Gordon, F. Badran, S. Thiria, L. Herault, Reseaux de Neurones: Methodologie et Applications, Editions Eyrolles, Paris, France, 2002.
[32] N. Karunanithi, W.J. Grenney, D. Whitley, K. Bovee, J. Comput. Civ. Eng. ASCE 8 (1994) 210.
[33] S. Palani, S. Liong, P. Tkalich, Mar. Pollut. Bull. 56 (2008) 1586.
[34] S. Haykin, Neural Networks: A Comprehensive Foundation, College Publishing Comp. Inc., 1994.
[35] H.R. Maier, G.C. Dandy, Environ. Model. Softw. 13 (1998) 193.
[36] Q.H. Wang, J. Qinghai Univ. 22 (2004) 82.
[37] A.P. Dedecker, P.L.M. Goethals, W. Gabriels, N. De Pauw, Ecol. Model. 174 (2004) 161.
[38] C. Karul, S. Soyupak, A.F. Cilesiz, N. Akbay, E. German, Ecol. Model. 134 (2000) 145.
[39] M.T. Hagan, H.P. Demuth, M. Beale, Neural Networks Design, PWS Publishing, Boston, MA, USA, 1996.
[40] D.L. Massart, B.G.M. Vandeginste, L.M.C. Buydens, S.D. Jong, P.J. Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics, Elsevier Science, Amsterdam, 1998.
[41] J.F. Chenard, D. Caissie, Hydrol. Process. (2008), doi:10.1002/hyp.6928.
[42] S. Platikanov, X. Puig, J. Martin, R. Tauler, Water Res. 41 (2007) 3394.
[43] J.E. Nash, J.V. Sutcliffe, J. Hydrol. 10 (1970) 282.
[44] B. Schaefli, H.V. Gupta, Hydrol. Process. (2007), doi:10.1002/hyp.6825.
[45] T. Ross, J. Appl. Bacteriol. 81 (1996) 501.
[46] K.H. Esbensen, Multivariate Data Analysis in Practice, CAMO, Corvallis, OR, USA, 2002.
[47] NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook, 2006.
[48] K.P. Singh, N. Basant, A. Malik, G. Jain, Ecol. Model. 220 (2009) 888.
[49] L. Tarassenko, A Guide to Neural Computing Applications, Arnold Publishers, London, 1998.
[50] R. Hecht-Nielsen, Kolmogorov's mapping neural network existence theorem, in: Proceedings of the 1st IEEE International Joint Conference on Neural Networks, Institute of Electrical and Electronics Engineers, New York, NY, 1987.
[51] Z. Ying, N. Jun, C. Fuyi, G. Liang, J. Zhejiang Univ. Sci. A 8 (2007) 1482.
[52] Y.T. Hong, M.R. Rosen, R. Bhamidimarri, Water Res. 37 (2003) 1608.
[53] V. Tomenko, S. Ahmed, V. Popov, Ecol. Model. 205 (2007) 355.
[54] J.H. Steiger, Psychol. Bull. 87 (1980) 245.
