
Journal of Hydrology 535 (2016) 211–225

Contents lists available at ScienceDirect

Journal of Hydrology
journal homepage: www.elsevier.com/locate/jhydrol

A comparison between wavelet based static and dynamic neural network approaches for runoff prediction

Muhammad Shoaib a,*, Asaad Y. Shamseldin a, Bruce W. Melville a, Mudasser Muneer Khan b

a Department of Civil and Environmental Engineering, The University of Auckland, Private Bag 92019, Auckland, New Zealand
b Department of Civil Engineering, Bahauddin Zakariya University, Multan, Pakistan

a r t i c l e   i n f o

Article history:
Received 3 August 2015
Received in revised form 26 January 2016
Accepted 28 January 2016
Available online 6 February 2016
This manuscript was handled by Corrado Corradini, Editor-in-Chief, with the assistance of Gokmen Tayfur, Associate Editor

Keywords:
Rainfall runoff modelling
Dynamic recurrent models
Discrete wavelet transformation

s u m m a r y

In order to predict runoff accurately from a rainfall event, multilayer perceptron neural network (MLPNN) models are commonly used in hydrology. Wavelet coupled MLPNN models have also been found superior to simple neural network models that are not coupled with wavelets. However, MLPNN models are considered static, memoryless networks: they lack the ability to examine the temporal dimension of the data. Recurrent neural network models, on the other hand, can learn from the preceding conditions of the system and are hence considered dynamic models. This study explores, for the first time, the potential of wavelet coupled time lagged recurrent neural network (TLRNN) models for runoff prediction using rainfall data. The Discrete Wavelet Transformation (DWT) is employed to decompose the input rainfall data using six of the most commonly used wavelet functions. The performance of the simple and the wavelet coupled static MLPNN models is compared with that of their counterpart dynamic TLRNN models. The study found that the dynamic wavelet coupled TLRNN models can be considered an alternative to the static wavelet MLPNN models. The study also investigated the effect of memory depth, which refers to how much past information (lagged data) is required and is not known a priori, on the performance of the static and dynamic neural network models. The db8 wavelet function is found to yield the best results with the static MLPNN models and with the TLRNN models having small memory depths. The performance of the wavelet coupled TLRNN models with large memory depths is insensitive to the selection of the wavelet function, as all wavelet functions perform similarly.

© 2016 Elsevier B.V. All rights reserved.

* Corresponding author.
E-mail addresses: msho127@aucklanduni.ac.nz (M. Shoaib), a.shamseldin@auckland.ac.nz (A.Y. Shamseldin), b.melville@auckland.ac.nz (B.W. Melville), mkha222@aucklanduni.ac.nz (M.M. Khan).
http://dx.doi.org/10.1016/j.jhydrol.2016.01.076
0022-1694/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Prediction of the runoff produced from a watershed as a result of a rainfall event is a key area of research in hydrology. It is considered one of the most complex hydrological processes to model because of the number of variables involved and the enormous spatial and temporal variability of watershed characteristics. Since the establishment of the rational method in 1850 (Mulvany, 1850) for the calculation of peak discharge, numerous hydrological models have been proposed. These models fall into two main categories: theory-driven (conceptual and physically-based) models and data-driven (empirical and black-box) models. Conceptual models describe the general sub-processes and physical mechanisms of the hydrological cycle without taking into consideration the spatial variability and stochastic characteristics of the rainfall–runoff process. Physically based models involve the solution of a system of partial differential equations in order to simulate the various constituent processes of the hydrological cycle. Data-driven models treat the hydrological system as a black box and try to establish a relationship between historical inputs (such as rainfall and evaporation) and outputs (such as runoff).

Among data-driven models, artificial neural network (ANN) models have emerged as powerful black-box models and have received great attention during the last two decades. The idea of the ANN is inspired by the operation of the biological neural networks of the central nervous system of the human brain. Mathematically, an ANN is a compound nonlinear function with numerous parameters that are adjusted in such a way that the ANN output becomes comparable to the observed output. The ANN approach has been successfully used for different modelling problems in various branches of science and engineering. In the field of hydrology, French et al. (1992) were the first to use an ANN for forecasting rainfall. Shamseldin (1997)

pioneered the use of ANNs in modelling the rainfall–runoff relationship. The ANN has since been successfully applied in many hydrological studies (e.g. Akiner and Akkoyunlu, 2012; Antar et al., 2006; Arsenault et al., 2015; Aziz et al., 2014; Jain et al., 2004; Lallahem and Mania, 2003; Mekanik et al., 2013; Nourani et al., 2009a; Piotrowski et al., 2015; Senthil Kumar et al., 2005). Extensive reviews of ANNs in hydrological applications can be found in ASCE Task Committee (2000a, 2000b), Abrahart et al. (2012) and Tayfur (2012).

However, Cannas et al. (2006) pointed out that ANN based models may not be able to deal with non-stationary data unless pre-processing of the input and/or output data is performed. Application of the wavelet transformation (WT) to time series data has been found effective in addressing this issue of non-stationarity (Nason and Sachs, 1999). The WT decomposes the time series into sub-constituents, which are then used as external inputs to the ANN. The resulting model is known as a hybrid wavelet model. These hybrid models improve the performance of the ANN by capturing the important temporal and spectral information embedded in the time series data. Various studies have used the WT to improve the results of ANN based hydrological models (e.g., Wang and Ding, 2003; Cannas et al., 2006; Nourani et al., 2009a, 2009b; Tiwari and Chatterjee, 2010; Adamowski and Sun, 2010; Kisi, 2011; Singh, 2012; Shoaib et al., 2014a, 2014b; Altunkaynak and Nigussie, 2015). Wang and Ding (2003) used a three layered hybrid wavelet feed forward neural network (FFNN) model with the back propagation (BP) training algorithm for forecasting shallow groundwater levels and river discharges. Cannas et al. (2006) applied hybrid wavelet MLPNN models for forecasting river flows. Nourani et al. (2009a) presented a hybrid wavelet MLPNN model for the prediction of precipitation, while Nourani et al. (2009b) utilized an FFNN model with the BP training algorithm for modelling the rainfall–runoff process. Tiwari and Chatterjee (2010) employed a wavelet coupled MLPNN model for flood forecasting. A wavelet coupled flow forecasting MLPNN model for non-perennial rivers was presented by Adamowski and Sun (2010). Likewise, Singh (2012) presented wavelet-MLPNN conjunction models for the prediction of flood events. Hybrid wavelet MLPNN and radial basis function neural network (RBFNN) models were used by Shoaib et al. (2014a) to compare the performance of various wavelet coupled models.

Most ANN models used in hydrology, including the simple and the wavelet coupled models, are static in nature, relying on the MLPNN model to learn the relationship between the observed input and the observed output. The MLPNN is a static network as it allows only one-way information flow, from the input layer to the output layer. Moreover, it is considered a memoryless network because of the absence of any memory or recursion component to store past information at any given time step. Furthermore, MLPNN models lack the capability to examine the temporal dimension of the data and cannot instinctively learn from the preceding conditions of the system (Saharia and Bhattacharjya, 2012). This is vital in the case of hydrological systems, since the current response of a hydrologic system can be very reliant on its preceding states. An implicit method of encoding temporal characteristics in a static ANN is to use a sliding window of input sequences (e.g. Coulibaly et al., 2000a, 2000b; Kisi et al., 2013; Lohani et al., 2012; Tayfur and Guldal, 2006; Tayfur et al., 2014). In this method, a form of static memory is implicitly provided to the MLPNN by selecting an input vector comprising a fixed number of past events relevant to the current system response. However, the inability of this method to encode temporal patterns with arbitrarily selected time intervals makes it unsuited to conditions that require high forecasting efficiency (Saharia and Bhattacharjya, 2012). The concept of signal delays plays an imperative role in the biological neural network system of the human brain. This concept has prompted the development of dynamic recurrent neural network (RNN) models. RNN models have the capability to learn from the preceding conditions of the system, as they facilitate time delay units through feedback connections, and have thus attracted much attention recently. Applications of RNNs can be found in many studies (e.g. Anmala et al., 2000; Assaad et al., 2005; Badjate and Dudul, 2009; Chang et al., 2012; Chiang et al., 2004; Coulibaly and Baldwin, 2005; Coulibaly and Evora, 2007; Güldal and Tongal, 2010; Kale and Dudul, 2009; Kote and Jothiprakash, 2008; Ma et al., 2008; Muluye, 2011; Serpen and Xu, 2003).

It is evident from the literature reviewed and cited in this paper that the use of static MLPNN and dynamic RNN models is increasing in hydrological studies, but most hybrid wavelet ANN models rely only on the static MLPNN. To our present knowledge, no study has yet been conducted to evaluate the potential of wavelet coupled dynamic neural network models. This study is therefore conducted to compare the performance of hybrid wavelet static MLPNN models and dynamic time lagged recurrent neural network models for runoff prediction using rainfall data. The performance of hybrid wavelet models is sensitive to the selection of a particular mother wavelet function, the choice of decomposition level and the choice of appropriate input variables. This study will therefore investigate the effect of the most commonly used wavelet functions, the choice of a suitable decomposition level and the selection of a suitable delay signal for the hybrid wavelet RNN models. The paper is arranged in the following manner. Section 1 gives the introduction and the review of literature. Section 2 is the methodology section, which also describes the data used in the study. In this section, the theoretical background of the MLPNN and the time-lagged neural network (TLNN) recurrent models, and the development of the simple and the hybrid wavelet static and dynamic models, are discussed, along with the performance indices used to evaluate the developed models. The results of the different developed models are discussed in Section 3. The conclusions of the paper are presented in Section 4.

2. Methodology

2.1. Artificial neural networks (ANN)

2.1.1. Multilayer perceptron neural network (MLPNN)

The MLPNN consists of a number of neurons arranged in a series of consecutive layers. Typically, it consists of an input layer, a hidden layer and an output layer. Each neuron receives an array of inputs and produces a single output. The output of a neuron in the input layer is input for the neurons in the hidden layer; similarly, the output of a neuron in the hidden layer is input for the output layer. Each neuron processes its input with a mathematical function known as the neuron transfer function. The neurons in the input layer are connected to the neurons in the hidden layer, while the neuron in the output layer is connected only to the neurons in the hidden layer; there is no direct connection between the neurons in the input layer and the neurons in the output layer. The MLPNN is the most widely used neural network type in hydrological applications (Dawson et al., 2002; Maier and Dandy, 2000). More theoretical background on ANNs and their various applications in water resources engineering can be found in Tayfur (2012).

2.1.2. Time-lagged recurrent neural network (TLNN)

Conventionally, MLPNNs, where neurons in one layer are only connected to neurons in the next layer, have been used for prediction and forecasting applications. Nevertheless, recurrent networks, where neurons in one layer can be connected to neurons

in the next layer, the previous layer, the same layer and even to themselves, have been proposed as alternatives to the MLPNN (Warner and Misra, 1996). Time is an important aspect of learning, and it may be included in a neural network design implicitly or explicitly (French et al., 1992). A time lagged recurrent neural network (TLNN) is used in the present study. In the TLRNN models, the static processing elements/neurons of the MLPNN are replaced by neurons with short-term memories. These short-term memories may be attached to any layer in the network, producing very sophisticated neural topologies which may be very useful for time series prediction and system identification (Charaniya and Dudul, 2013). The addition of a short-term memory structure only in the input layer of the static MLPNN is a straightforward method of implicit representation of time; the resulting network type is called a focused time-lagged recurrent neural network (FTLRNN). This memory is used to store past information, which can be used to analyse the temporal variations in the dataset in a more effective manner.

There are two methods of implementing the short-term memory structure in the TLRNN models: the tapped delay line (TDL) memory and the Gamma memory. The TDL memory may be viewed as a single-input, multiple-output network, as it consists of p unit delays with (p + 1) terminals, as shown in Fig. 1(a). In the figure, Z⁻¹ represents the unit delay. The memory depth (D) is fixed at k and the memory resolution (R) is fixed at unity. The memory depth (D) refers to how far into the past the memory stores information, while the memory resolution (R) refers to the degree to which information regarding the input is stored. The Gamma memory introduced by Hopfield (1982) provides control over the memory depth by building a feedback loop around each unit delay, as represented in Fig. 1(b). The unit delay Z⁻¹ of the TDL is replaced by the transfer function G(z) given below:

G(z) = μZ⁻¹ / (1 − (1 − μ)Z⁻¹)    (1)

and the output of each tap of a discrete Gamma memory structure is given by the following two equations (Motter and Principe, 1994):

X₀(t) = X(t)    (2)

Xₖ(t) = (1 − μ)Xₖ(t − 1) + μXₖ₋₁(t − 1),  k = 1, …, K    (3)

where μ is an adjustable parameter used to find an optimal compromise between memory depth and resolution during training. The order of the Gamma memory (K) is the product of (D) and (R). For a Kth order Gamma memory, the memory depth is approximated by (K/μ) and the memory resolution by μ. Lallahem and Mania (2003), Coulibaly et al. (2000a, 2000b), Sajikumar and Thandaveswara (1999) and Rummelhart et al. (1986) have successfully demonstrated applications of time-lagged recurrent models in hydrology.

2.1.3. Wavelet transformation (WT)

Mathematical transformations are often used to extract information from time series data that is not readily available in its raw form. Wavelets are mathematical functions which give a time-scale representation of the time series data and its relationships. They are suitable for analyzing time series data that contain non-stationarities. The wavelet transform (WT) is regarded as being capable of revealing aspects of the original raw time series data, such as trends, breakdown points and discontinuities, that other time series analysis techniques might miss (Adamowski and Sun, 2010; Singh, 2012). There are two types of wavelet transformation: the Continuous Wavelet Transformation (CWT) and the Discrete Wavelet Transformation (DWT). The CWT of a time series f(t) is defined as follows:

Fig. 1. Schematic diagram of TLRNN with (a) TDL memory, (b) Gamma memory.
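Equations (1)–(3) define the Gamma memory tap recursion sketched in Fig. 1(b). As a minimal illustrative sketch (not the authors' implementation), the recursion can be written in a few lines of NumPy; setting μ = 1 recovers the tapped delay line of Fig. 1(a):

```python
import numpy as np

def gamma_memory(x, K, mu):
    """Run a K-th order discrete Gamma memory over an input signal x.

    Implements the tap recursion of Eqs. (2)-(3):
        x_0(t) = x(t)
        x_k(t) = (1 - mu) * x_k(t - 1) + mu * x_{k-1}(t - 1),  k = 1..K
    mu trades off memory depth (approximately K/mu) against resolution (mu).
    Returns an array of shape (len(x), K + 1) with one column per tap.
    """
    taps = np.zeros((len(x), K + 1))
    for t in range(len(x)):
        taps[t, 0] = x[t]                     # Eq. (2): zeroth tap is the raw input
        if t > 0:
            for k in range(1, K + 1):         # Eq. (3): leaky cascade of unit delays
                taps[t, k] = (1 - mu) * taps[t - 1, k] + mu * taps[t - 1, k - 1]
    return taps

# With mu = 1 the Gamma memory degenerates to a tapped delay line (TDL):
x = np.array([1.0, 2.0, 3.0, 4.0])
tdl = gamma_memory(x, K=2, mu=1.0)
# tap k then simply holds x(t - k), e.g. tdl[3] == [4.0, 3.0, 2.0]
```

With μ < 1 each tap becomes a leaky average of its predecessor, which is how the Gamma memory extends the depth D = K/μ beyond the number of taps at the cost of resolution.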

W(a,b) = (1/√a) ∫₋∞⁺∞ f(t) ψ*((t − b)/a) dt    (4)

where * refers to the complex conjugate and ψ(t) is the wavelet function, or mother wavelet. The entire range of the signal/time series is analyzed by the wavelet function using the parameters 'a' and 'b', which are the dilation (scale) and translation (position) parameters, respectively. The calculation of the CWT coefficients at each scale 'a' and translation 'b' results in a large amount of data. This problem is resolved in the DWT, which operates with scaling (low pass filter) and wavelet (high pass filter) functions. The DWT scales and positions are based on powers of two (dyadic scales and positions) and can be defined for a discrete time series f(t) as:

W(a,b)D = 2^(j/2) ∫ ψ(2^(j/2)t − k) f(t) dt,  j = 1, …, J    (5)

where j and k are integers which control the wavelet dilation and translation, respectively. In the WT, a time series passes through a low pass filter and a high pass filter, producing the approximation a(n) and the detail information d(n), respectively. The high pass filters are used to analyse the high frequency content of the time series, while the low pass filters are used to analyse the low frequency content. The approximations are the high scale, low frequency components of the signal, while the details represent the low scale, high frequency components. The low frequency content of the time series data is the most significant part, as it gives the signal its identity, while the high frequency content imparts flavour or nuance. The detail signals can catch subtle attributes of interpretational value in the data, while the approximation shows the background information of the data (Nourani et al., 2009a; Tiwari and Chatterjee, 2010). This decomposition process continues until the desired level is achieved; the maximum possible number of levels depends on the length of the signal/data. At each decomposition level, the low pass and high pass filters produce signals spanning only half the frequency band. This doubles the frequency resolution, as the uncertainty in frequency is reduced by half. The decomposition of temporal data by the DWT satisfies the following condition: f(t) = d1 + a1 = d1 + d2 + a2 = d1 + d2 + d3 + a3, and so on. Further details on the DWT can be found in many textbooks, including Daubechies (1992) and Addison (2002).

2.2. Development of hybrid wavelet models

The static MLPNN and the dynamic TLNN models are integrated with the DWT in this study to develop hybrid wavelet models. The DWT is used to decompose the rainfall time series into sub-series of approximation and details in order to reveal its temporal and spectral information. The hybrid wavelet multilayer perceptron neural network (WMLPNN) and the hybrid wavelet time lagged neural network (WTLNN) models are then developed using the decomposed data as input and the observed runoff as the output. The performance of the hybrid wavelet models is compared with that of their respective simple multilayer perceptron neural network (SMLPNN) and simple time lagged neural network (STLNN) models. A schematic diagram of the hybrid wavelet models is presented in Fig. 2. The SMLPNN and the STLNN models comprise three layers: an input, a hidden and an output layer. The number of neurons in the input and output layers is fixed at one for the simple models. For the wavelet coupled models, the number of neurons in the input layer is determined on the basis of the selected decomposition level. The selection of the number of neurons in the hidden layer is important for maximizing the ANN performance. In the present study, the appropriate number of neurons in the hidden layer is determined by a trial and error procedure. This is accomplished by calibrating the network and evaluating its performance using an increasing number of hidden neurons, in order to obtain near maximum efficiency with as few neurons as necessary (Hammerstrom, 1993). The sigmoid activation function is used as the transfer function for the neurons of the hidden and the output layers for the simple and wavelet coupled models. Calibration of the simple and hybrid models in the present study is done using the Levenberg–Marquardt algorithm (LMA) because of its simplicity (Adamowski and Sun, 2010). The LMA is an iterative technique that minimizes an objective function which, in this study, is expressed as a sum of squares of nonlinear functions. The stopping criterion for the training of the developed models is either a maximum of 100 epochs, or training is set to terminate when the mean squared error (MSE) of the cross validation testing data set begins to increase. The latter is an indication that the network has begun to over-train, i.e. the network simply memorizes the training set and is unable to generalize the problem.

The calibrated network is then tested by presenting a different set of validation data which has not been used in calibration. The network is re-calibrated by changing the number of neurons in the hidden layer if it fails to perform satisfactorily during the validation phase. Testing of the network ensures that it has learned the general patterns of the system and has not simply memorized a given set of data.

2.3. Performance indices

In the present study, the performance of the developed models is assessed in terms of statistical measures of goodness of fit that quantify the errors associated with the models. Two statistical indices are used: the Root Mean Squared Error (RMSE) and the Nash–Sutcliffe Efficiency (NSE; Nash and Sutcliffe, 1970). More details on these statistical indices, along with their equations, can be found in Tayfur (2012). The RMSE measures the accuracy of the estimated output; its values range from zero upwards, where zero indicates a perfect match between the estimated and the observed discharges and larger values indicate a greater mismatch. Karunanithi et al. (1994) suggested that the RMSE is a good measure of the goodness of fit for high flows. The NSE is one of the most widely used criteria for the assessment of hydrological model performance and provides a measure of the ability of the model to predict observed values. In general, high values of NSE (up to 100%) and small values of RMSE indicate a good model. These two statistical measures can satisfactorily evaluate the performance of hydrological models (Legates and McCabe, 1999).

3. Data

The daily rainfall runoff data of two catchments located in different hydro-climatic conditions are used in the present study. The two catchments are the Baihe catchment, located in north eastern China, and the Brosna catchment, located in Ireland. The Baihe catchment, with an area of 61,780 km², is a sub-basin of the Upper Hanjiang River basin, as shown in Fig. 3(a). The Hanjiang River, the largest tributary of the Yangtze River in China, covers a total drainage area of approximately 151,000 km² (34°30′–30°49′N, 106°14′–114°56′E) with a total length of 1577 km. The altitude of the basin decreases from 3500 m in the northwest to 88 m at the Danjiangkou reservoir in the southeast (Sun et al., 2014). The Hanjiang River is about 737 km long from the northwest to the Baihe station in the southeast. The average width of the main stream of the Hanjiang is

Fig. 2. Schematic diagram of hybrid wavelet models.
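The additive identity stated in Section 2.1.3, f(t) = a1 + d1 = d1 + d2 + a2 = …, can be demonstrated with a hand-rolled one-level Haar decomposition. This is a sketch for illustration only (the study itself compares six wavelet functions, not just Haar):

```python
import numpy as np

def haar_level1(x):
    """One-level Haar DWT: split a series x (even length) into an approximation
    sub-signal a1 (low-pass content) and a detail sub-signal d1 (high-pass
    content), each reconstructed at the original length so that x == a1 + d1."""
    x = np.asarray(x, dtype=float)
    pairs = x.reshape(-1, 2)
    mean = pairs.mean(axis=1)                      # low-pass content of each 2-sample block
    half_diff = (pairs[:, 0] - pairs[:, 1]) / 2.0  # high-pass content of each block
    a1 = np.repeat(mean, 2)                        # approximation at level 1
    d1 = np.empty_like(a1)
    d1[0::2] = half_diff
    d1[1::2] = -half_diff                          # detail at level 1
    return a1, d1

rain = np.array([0.0, 4.0, 6.0, 2.0])   # hypothetical daily rainfall values
a1, d1 = haar_level1(rain)
# a1 + d1 recovers the raw series: the additive identity f(t) = a1 + d1
assert np.allclose(a1 + d1, rain)
```

Applying the same split recursively to a1 yields a2 and d2, and so on, which is exactly the multilevel decomposition used to build the wavelet coupled model inputs of Fig. 2.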

Fig. 3. Location of the study area (a) Baihe, (b) Brosna.

about 200–300 m, with an average slope of 6%, and it is surrounded by high and steep mountains, narrow valleys and swift torrents. This mountainous basin is semi-arid in nature and lies in a typical subtropical monsoon climate region, with rainfall mainly during the summer and autumn seasons. Normally, evaporation is greater than rainfall from November to April and in August, whereas evaporation is less than rainfall in the other months. Daily rainfall data for a period of eight years, starting from 1st January 1972, from nineteen rain gauge stations and three hydrological stations are used to obtain the daily average rainfall. The concurrent discharge data used in the study are the daily averaged data measured at the Baihe station. The data for a period of six years from 1972 to 1977 (2190 data points) are used for training the models, whereas the remaining data for a period of two years from 1978 to 1979 (730 data points) are used for validation.

The Brosna catchment, located in Ireland, with a drainage area of 1207 km² up to the Ferbane flow gauging station, is also used in this study. The catchment has a very flat topography, except for some undulations caused by glacial deposits. There is no noticeable evidence of substantial groundwater movement across the

topographical boundary of the catchment. Daily rainfall data for a approximation at level one (a1) and one detail (d1), level two
period for ten years (1st January 1969 to 31st December 1978) are decomposition gives approximation (a2) and two details (d1 and
lumped by averaging the data from four rain gauge stations. The d2), level three decomposition results in approximation (a3) and
concurrent discharge data used in the study is the daily averaged three details (d1, d2 and d3) and so on. Level one detail sub-series
data measured at the Ferbane station. The first eight years data d1 describes the features of original data detectable at a scale of
from 1969 to 1976 (2920 data points) is used for the training/ up to two days. The d2 sub-series represent features of the raw data
calibration purpose while remaining two years data from 1977 to measurable at a scale of 2–4 days. Likewise, d3 can detect features
1978 (730 data points) is used for the testing purpose. The location on approximately weekly basis in the original data. Various meth-
of the catchments along with rainfall gauging sites is shown in ods have been applied in different studies to select suitable decom-
Fig. 3(b). position level. Adamowski and Sun (2010), Partal and Kisßi (2007)
The summary of statistical characteristics of the training and and Kisi and Shiri (2011) used trial and error method to select
the testing data are given in Table 1. Furthermore, the results of decomposition level and employed level eight, level ten and level
cross correlation analysis between lagged rainfall data and dis- three decomposition, respectively. Aussem et al. (1998), Nourani
charge is shown in Fig. 4 for both selected catchments of Baihe et al. (2009a, 2009b), Tiwari and Chatterjee (2010) and
and Brosna. It can be seen from Fig. 4 that the lag-3 day rainfall Adamowski and Chan (2011) used the following formula to select
time series r(t  3) has the maximum correlation with the suitable decomposition level:
observed discharge.
L ¼ intðlogðNÞÞ ð6Þ
4. Results and discussion
where L is the level and N is the total data points in data. Neverthe-
4.1. Selection of suitable wavelet function less, Nourani et al. (2011) and Moosavi et al. (2013) argued that this
equation was derived only for fully autoregressive data by consider-
In order to decompose the raw time series data into approxima- ing the length of data only without giving consideration to the sea-
tion and details using WT, a wavelet function is required. There are sonal signature of the hydrologic process. Shoaib et al. (2014a, b)
different wavelet families available each containing different and Shoaib et al. (2015) favoured the use of level nine decomposi-
wavelet functions such as Daubechies wavelet families contains tion based on the results of regression analysis between wavelet
db1, db2, db3 and so on as the wavelet functions. These different transformed rainfall data at each decomposition level and runoff
wavelet functions are categorized by their unique features includ- data. Therefore, a regression analysis between wavelet transformed
ing the region of support and the number of vanishing moments. rainfall data and observed runoff is carried out in the present study
The region of support of wavelet is related with the span length for the two selected catchments to determine the appropriate
of the wavelet which affects its feature localization possessions decomposition level. For this purpose, lagged-1 rainfall data is
of a signal while the vanishing moment confines the wavelet’s abil- transformed for all possible decomposition levels (i.e. from level
ity to signify polynomial behaviour or information in a signal. The one to level nine at most) using six selected wavelet functions. A
details of different wavelet families can be found in many textbooks such as Daubechies (1992) and Addison (2002). This study uses six of the wavelet functions most commonly applied in hydrology: the simple Haar wavelet; the db4 and db8 wavelet functions of the most widely used Daubechies family; the sym4 and sym8 wavelet functions of the sharp-peaked Symlet family; and the discrete approximation of the Meyer (dmey) wavelet function.

4.2. Selection of decomposition level

A DWT involves the decomposition of the raw time series data into approximations and details. A DWT decomposition consists of at most log2(N) levels/stages, where N is the total number of data points. Selection of a suitable decomposition level is one of the important factors in the development of hybrid wavelet models, because the decomposition is associated with the seasonal and periodic variations embedded in the hydrologic data under consideration. Starting from level one, a total of 108 decomposition levels are obtained for both selected catchments. The results of the regression analysis are presented in Fig. 5 for both selected catchments. It is found from Fig. 5 that the coefficient of determination R2 has its lowest value at the first decomposition level. The value of R2 increases with the decomposition level and is found to be maximum at decomposition level nine. Level nine decomposition is therefore selected in the current study, as it gives the best results for both selected catchments. Level nine decomposition contains one large scale approximation sub-signal (a9) and nine small scale detail sub-signals (d1, d2, d3, d4, d5, d6, d7, d8 and d9). The detail sub-series d4 corresponds to variation up to 16 days, d5 up to 32 days (about a month), d6 to 64 days, d7 to 128 days (about four months), d8 to 256 days (about eight and a half months) and d9 to a 512-day mode (about seventeen months). The selected level nine decomposition therefore holds the d8 and d9 sub-signals, which are responsible for detecting seasonal variations in the input rainfall data on a nearly annual basis. This annual periodicity is a very important and foremost seasonal cycle in hydrologic time series data.
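For illustration, the multilevel decomposition described above can be sketched with the simple Haar wavelet alone (the study also uses db4/db8/sym4/sym8/dmey filters; the synthetic rainfall series and function names here are illustrative, not the catchment data):

```python
import numpy as np

def haar_dwt_step(x):
    """One level of the orthonormal Haar DWT: pairwise scaled sums give
    the approximation, pairwise scaled differences give the detail."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_wavedec(x, level):
    """Multilevel decomposition: returns (a_level, [d1, d2, ..., d_level])."""
    details = []
    approx = np.asarray(x, dtype=float)
    for _ in range(level):
        approx, d = haar_dwt_step(approx)
        details.append(d)
    return approx, details

# Synthetic positively skewed "rainfall" series of length N = 1024.
rainfall = np.random.default_rng(0).gamma(2.0, 1.3, size=1024)
max_level = int(np.log2(rainfall.size))   # at most log2(N) levels, as in the text
a9, ds = haar_wavedec(rainfall, 9)        # one approximation (a9) + nine details (d1..d9)
```

Because the Haar transform is orthonormal, the total energy of the sub-signals equals that of the original series, so no information is lost across levels.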

Table 1
Statistical summary of data of test catchments.

                             Calibration period                          Validation period
Catchment  Data type         Maximum    Average    Coefficient of       Maximum    Average    Coefficient of
                             (mm/day)   (mm/day)   variation            (mm/day)   (mm/day)   variation
Baihe      Rainfall          47.08      2.59       2.17                 79.98      2.48       2.53
           Evaporation       12.80      2.89       0.80                 8.10       2.53       0.71
           Discharge         28.25      1.04       1.89                 22.66      0.78       2.23
Brosna     Rainfall          32.67      2.20       1.63                 27.56      2.47       1.51
           Evaporation       9.80       1.31       1.04                 6.90       1.32       1.04
           Discharge         6.94       0.98       0.83                 6.62       1.22       0.86
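The coefficient of variation reported in Table 1 is the ratio of the standard deviation to the mean of each daily series; a minimal check (the sample values below are made up, not the Baihe/Brosna records):

```python
import numpy as np

def coefficient_of_variation(x):
    """CV = standard deviation / mean (dimensionless)."""
    x = np.asarray(x, dtype=float)
    return float(np.std(x) / np.mean(x))

# Illustrative daily rainfall sample (mm/day), including dry days.
sample = np.array([0.0, 0.0, 5.2, 12.4, 0.0, 1.1, 30.7, 0.0])
cv = coefficient_of_variation(sample)
```

Intermittent daily rainfall typically gives CV > 1, consistent with the rainfall rows of Table 1, whereas the smoother evaporation series have CV around or below 1.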

Fig. 4. Cross-correlation results of rainfall–runoff data for (a) Baihe, (b) Brosna. [Correlation coefficient vs. lag (days), training and testing series.]

Fig. 5. Variation of regression R2 with decomposition level (a) Baihe, (b) Brosna.

4.3. Selection of memory depth

4.3.1. Static MLPNN model
In order to use the temporal information contained in the time series data with the static MLPNN model, a straightforward method of implicit representation of time is to add a short-term memory structure in the input layer of the static MLPNN in the form of lagged rainfall data. The performance of the simple and hybrid wavelet static MLPNN models depends strongly on the selection of an appropriate number of lagged rainfall data series (the memory depth) in the input vector. This is because the response of a hydrological system inherently depends on its previous states, so past (time lagged) data are required in order to encode the temporal features of the input data. There are several approaches for selecting an appropriate input vector for data driven models. One of the most common is to consider a sliding window of input sequences. With this approach, the input vector starts by containing only the lagged-1 day time series and is then modified by successively adding one more lagged time series, continuing up to a specific lag time. This approach has been successfully applied in many hydrological studies, including Furundzic (1998), Tokar and Markus (2000), Riad et al. (2004), Chua et al. (2008), Moosavi et al. (2013) and Shoaib et al. (2015). The specific lag time can be estimated either by trial and error or by extending from the present time to the time at which the lagged data are most correlated with the observed discharge/runoff. The other common approach is to retain only the most correlated lagged time series in the input vector and to neglect the poorly correlated lagged rainfall data series. Nayak et al. (2005a, 2005b, 2007) have successfully applied this approach for selecting appropriate input vectors for data driven models. Another method of selecting an appropriate input vector for ANN models was suggested by Sudheer et al. (2002). This method selects the input vector on the basis of the cross-correlation, auto-correlation and partial correlation of the lagged input time series with the output time series. However, some hydrological studies, including Nayak et al. (2004) and Senthil Kumar et al. (2005), criticize the use of these statistical properties for input vector selection, as they assume a linear relationship between the input and output variables and do not account for any non-linear residual dependencies. Moreover, Shoaib et al. (2014a, 2014b) compared various input selection approaches in detail and favoured the most common approach for wavelet coupled ANN models. Hence, the following five input vectors are considered in the present study to predict runoff at the present time using the SMLPNN and WMLPNN models:

I1: r(t − 1)
I2: r(t − 1), r(t − 2)
I3: r(t − 1), r(t − 2), r(t − 3)
I4: r(t − 1), r(t − 2), r(t − 3), r(t − 4)
I5: r(t − 1), r(t − 2), r(t − 3), r(t − 4), r(t − 5)

The first input vector I1 contains only the lagged-1 day rainfall data series r(t − 1). The second input vector I2 is obtained by adding
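A minimal sketch of the sliding-window construction of the input vectors I1–I5, together with the lag-correlation check mentioned above (the synthetic series and function names are illustrative, not the study's data or code):

```python
import numpy as np

def lagged_inputs(r, n_lags):
    """Build the input matrix [r(t-1), ..., r(t-n_lags)] for every valid t.
    Row i corresponds to t = n_lags + i; column k holds r(t - (k+1))."""
    X = np.column_stack([r[n_lags - k: len(r) - k] for k in range(1, n_lags + 1)])
    t_index = np.arange(n_lags, len(r))
    return X, t_index

def lag_correlation(r, q, lag):
    """Pearson correlation between rainfall lagged by `lag` days and runoff."""
    return float(np.corrcoef(r[:-lag], q[lag:])[0, 1])

rng = np.random.default_rng(1)
rain = rng.gamma(2.0, 1.5, size=400)
# Synthetic runoff responding mostly to 1- and 2-day-old rainfall.
runoff = 0.5 * np.roll(rain, 1) + 0.3 * np.roll(rain, 2) + rng.normal(0, 0.1, 400)

X5, idx = lagged_inputs(rain, 5)   # the I5 input vector of the text
```

For such a series, the lag-1 correlation dominates and fades at larger lags, which is the pattern used to bound the window (cf. Fig. 4).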

the lagged-2 day rainfall data series r(t − 2) to the first input vector; the input vector is then further modified by successively adding one more lagged rainfall data series, and this continues up to the 5-day lagged rainfall data series, to predict runoff at the current time step.

The performance of the static simple and wavelet MLPNN models, in terms of the NSE (%) and RMSE performance indices, for both selected catchments and all six selected wavelet functions is shown in Figs. 6 and 7. Fig. 6(a) and (b) illustrates the NSE (%) and RMSE values of the WMLPNN models for the Baihe catchment with input vector I1. It is evident from the figure that the WMLPNN models outperformed their counterpart simple MLPNN models for all six selected wavelet functions. The NSE and RMSE values are found to be 4.5% and 1369 cumecs, respectively, with input vector I1 for the simple models. The WMLPNN models developed with the wavelet functions db8 and sym8 are found to be the best among all the wavelet functions, yielding NSE and RMSE values of about 50% and 900 cumecs, respectively, with the I1 input vector. For the I2 input vector, containing both the lagged one and lagged two day rainfall data series, the NSE increases to about 20% and the RMSE decreases to about 1200 cumecs in the case of the simple models (not shown here). The WMLPNN model developed with the wavelet function db8 is found to be the best among all the WMLPNN models developed, giving NSE and RMSE values of about 82% and 400 cumecs, respectively. As shown in Fig. 6(c) and (d), the NSE (%) value of the simple models further increases from 20% with the I2 input vector to about 55% with the I3 input vector, while the RMSE decreases from 1200 cumecs with I2 to 800 cumecs with I3. For input vector I3, the performance of the WMLPNN models developed with all selected wavelet functions except the Haar wavelet is found to be almost equal, with NSE and RMSE values of about 80% and 400 cumecs, respectively. For input vectors I4 and I5, the NSE values are further increased to 72% and 75%, respectively, for the simple MLPNN models, while the RMSE values are reduced to 700 cumecs with I4 and 600 cumecs with I5 (not shown here). However, no significant improvement can be seen for the WMLPNN models with input vectors I4 and I5 relative to input vector I3; for input vectors I3, I4 and I5, the NSE and RMSE are found to be about 80% and 400 cumecs, respectively, as shown in Fig. 6(c)–(f).

The NSE and RMSE values of the simple and wavelet coupled models for the Brosna catchment with all five input vectors are presented in Fig. 7. It is evident from the figure that the simple MLPNN model yielded NSE values of about 6% with input vector I1, 12% with I2, 19% with I3, 25% with I4 and 30% with I5, the input vectors containing one to five lagged rainfall data series to predict runoff at the current time step, as shown in Fig. 7(a), (c) and (e). An RMSE value of about 10 cumecs is found for the simple models with all five input vectors considered, without any significant improvement, as shown in Fig. 7(b), (d) and (f). In the case of the WMLPNN models, the wavelet coupled MLPNN model with the db8 wavelet function is found to be the best among all the wavelet coupled models for input vector I1, with an NSE value of about 40% and an RMSE value of about 8 cumecs, as shown in Fig. 7(a) and (b). For input vector I2, the WMLPNN models developed with the wavelet functions db8 and sym4 are found to be the best, yielding NSE and RMSE values of about 52% and 8 cumecs, respectively (not shown here). For input vectors I3 and I4, the wavelet models developed with the db8, sym4, sym8 and dmey wavelet functions are the best models, with NSE and RMSE values of about 60% and 7 cumecs, respectively. The wavelet coupled model developed with the db8 wavelet function is likewise found to be the best among all wavelet coupled models for input vector I5, with NSE and RMSE values of about 60% and 7 cumecs, respectively, as shown in Fig. 7(e) and (f). It can be concluded on the basis of the above analysis that the db8 wavelet function is the only function which
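The two performance indices used throughout this comparison can be computed as follows (a standard formulation; the authors' exact implementation is not given in this excerpt, and the flow values below are made up):

```python
import numpy as np

def nse(observed, simulated):
    """Nash–Sutcliffe efficiency in %: 100 for a perfect fit, 0 when the
    model is no better than predicting the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    num = np.sum((observed - simulated) ** 2)
    den = np.sum((observed - observed.mean()) ** 2)
    return float(100.0 * (1.0 - num / den))

def rmse(observed, simulated):
    """Root mean square error, in the units of the flow series (cumecs here)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((observed - simulated) ** 2)))

q_obs = np.array([12.0, 30.0, 8.0, 4.0, 25.0])
q_sim = np.array([10.0, 28.0, 9.0, 5.0, 27.0])
```

Because NSE is normalised by the variance of the observations, it allows comparison across the Baihe and Brosna catchments, whereas RMSE stays in flow units and so differs by an order of magnitude between the two.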

Fig. 6. Performance of static MLPNN models for the Baihe catchment. [Panels (a), (c), (e): NSE (%); panels (b), (d), (f): RMSE; training and testing, for input vectors I1, I3 and I5; models compared: Simple, Haar, db4, db8, sym4, sym8, dmey.]



Fig. 7. Performance of static MLPNN models for the Brosna catchment. [Panels (a), (c), (e): NSE (%); panels (b), (d), (f): RMSE; training and testing, for input vectors I1, I3 and I5; models compared: Simple, Haar, db4, db8, sym4, sym8, dmey.]

performed well for all the selected input vectors and for both selected catchments, Baihe and Brosna, during both the training and testing phases.

4.3.2. Dynamic TLRNN model
Time lagged recurrent neural networks (TLRNN) are an extended form of the static MLPNN with a short term memory structure and recurrent connections. The input layer presents the inputs lagged by multiple time steps to the network and thus forms the short term memory. There are several memory structures to choose from, such as TDL, Gamma and Laguerre. As stated earlier in Section 2.1.2, the Gamma memory is selected for the present study. In the case of the TLRNN, it is critical to determine how much short term memory is required to describe the temporal pattern of the data, because it is not known a priori where, in time, the relevant information lies. The depth parameter (D) of the Gamma memory determines the depth of the memory and is used to compute the number of taps used for storing past information. A short term memory of one to five days is considered in the development of the static MLPNN models, where input vectors I1 and I5 correspond to memories of one day and five days, respectively. Ideally, the same memory should be used for the TLRNN models. However, the same memory cannot be employed for the TLRNN models developed using the Gamma memory, as the number of taps of the Gamma memory is calculated by the formula T = 2D/3. This formula yields a memory of 0.67, 1.34, 2, 2.67, 3.34 and 4 days with D taken as one, two, three, four, five and six, respectively. Therefore, the value of D in the present study is varied from one to six in the development of the simple and hybrid wavelet TLRNN models, so that their performance can be compared with that of the static MLPNN models with a memory of one to five days.

Typically, the TLRNN has three layers, namely the input layer, the hidden layer and the output layer, with a feedback connection from the hidden layer back to the input layer. Training of the TLRNN is accomplished with back-propagation through time with trajectory learning factors. The simple time lagged recurrent neural network (STLRNN) models are developed with rainfall r(t) and runoff q(t) at the current time as the input and desired target, respectively. The hybrid wavelet time lagged recurrent neural network (WTLRNN) models are developed using the discrete wavelet transformed input rainfall data at the current time as input and the observed runoff q(t) as the desired target. The level nine decomposition is again employed for the wavelet transformation of the input rainfall data r(t), giving one approximation and nine detail sub-signals of the data.

Results of the STLRNN and WTLRNN models for the Baihe and Brosna catchments are presented in Figs. 8 and 9, respectively, for different values of D ranging from 1 to 6. It was found that the STLRNN models are the least performing models. The STLRNN model yielded NSE and RMSE values of about 2% and 1200 cumecs, respectively, for D = 1 and D = 2, which refer to taps of about 0.67–1.34 days. Furthermore, the WTLRNN models developed with the wavelet functions db8 and sym4 are the best performing models for D equal to one and two, yielding NSE and RMSE values of about 50% and 1000 cumecs, respectively. The results for D = 1 are presented in Fig. 8(a) and (b), while the results for D = 2 are not shown here. It is further found that the performance of the simple models increases sharply when D is set to three, which corresponds to a tap of two days, yielding NSE and RMSE values of about 60% and 800 cumecs, respectively, against the 75% NSE and 600 cumecs RMSE of the best WTLRNN model with the db8 wavelet function. The NSE value of the simple TLRNN is further increased to about 69% and 77% with D set as three and four, respectively; the corresponding RMSE values of the simple models are found to be about 750 and 600 cumecs. The WTLRNN model developed with the db8 wavelet function is again found to be the best performing wavelet model, yielding NSE and RMSE values of about 80% and 550 cumecs, respectively, with D set as four. The results for D = 3 are presented in Fig. 8(c) and (d), while the results for D = 4 are not presented here. The performance of all simple and wavelet models is found to be similar for D
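The tap/depth relation quoted above, and the leaky-integrator cascade that defines a standard Gamma memory, can be sketched as follows (the recursion is the textbook Gamma filter form; the exact implementation used in the study is not given in this excerpt):

```python
import numpy as np

def gamma_memory_days(depth):
    """Memory in days implied by the tap formula T = 2D/3 quoted in the text."""
    return 2.0 * depth / 3.0

def gamma_memory(x, taps, mu):
    """Standard Gamma memory: a cascade of leaky integrators.
    g_0[t] = x[t];  g_k[t] = (1 - mu) * g_k[t-1] + mu * g_{k-1}[t-1].
    Returns an array of shape (len(x), taps + 1); mu = 1 reduces the
    cascade to a plain tapped delay line."""
    x = np.asarray(x, dtype=float)
    g = np.zeros((len(x), taps + 1))
    g[:, 0] = x
    for t in range(1, len(x)):
        for k in range(1, taps + 1):
            g[t, k] = (1.0 - mu) * g[t - 1, k] + mu * g[t - 1, k - 1]
    return g

# D = 1..6 maps to roughly 0.67, 1.33, 2, 2.67, 3.33 and 4 days of memory
# (the paper rounds 4/3 and 10/3 to 1.34 and 3.34).
depths = {d: round(gamma_memory_days(d), 2) for d in range(1, 7)}
```

With mu < 1 each tap holds an exponentially fading summary of older inputs, which is why a Gamma memory of depth D covers a longer, softer window than D raw delay taps.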

Fig. 8. Performance of TLRNN models of Baihe catchment. [Panels (a), (c), (e): NSE (%); panels (b), (d), (f): RMSE; training and testing, for D = 1, 3 and 6; models compared: Simple, Haar, db4, db8, sym4, sym8, dmey.]

Fig. 9. Performance of TLRNN models of Brosna catchment. [Panels (a), (c), (e): NSE (%); panels (b), (d), (f): RMSE; training and testing, for D = 1, 3 and 6; models compared: Simple, Haar, db4, db8, sym4, sym8, dmey.]

taken as five and six. The results for D = 6 are presented in Fig. 8(e) and (f), while the results for D = 5 are not presented here. It is evident from the above analysis for the Baihe catchment that D has an important role in the performance of the TLRNN models. Furthermore, the performance of the simple models is poorest with D ranging from one to two compared with their counterpart wavelet coupled models; however, the difference between the performance of the simple and wavelet models in terms of NSE (%) and RMSE values is very small for D ranging from three to six.

In addition, the performance of the wavelet coupled models is found to be insensitive to the wavelet function at higher values of D (five and six), as all the wavelet functions are found to be

performing equally well. This may be because the larger memory depths contain the maximum temporal information, which can therefore be retrieved well by all the wavelet functions, including the simpler Haar wavelet. Fig. 9(a)–(f) shows the performance of the simple and wavelet coupled TLRNN models for the Brosna catchment with different values of D ranging from one to six. The performance of the simple TLRNN models for D from one to two remains almost the same in terms of NSE (%) and RMSE values; the simple TLRNN models yield NSE and RMSE values of about 3% and 11 cumecs, respectively. The WTLRNN models developed with the db8 wavelet function are found to be the best wavelet coupled models for D values of one and two, yielding NSE and RMSE values of about 45% and 9 cumecs, respectively, for D = 1, and about 50% NSE and 8 cumecs for D = 2. The results for D = 1 are shown in Fig. 9(a) and (b), while no results are presented here for D = 2. Similar to the Baihe catchment, the NSE value of the simple model increases sharply to about 40% with D as three and four, to about 45% with D as five and to about 48% with D set as six, as shown in Fig. 9(c) and (e). Similarly, the RMSE value of the simple models decreases from 10 cumecs with D set as two to about 8 cumecs with D set as six. The wavelet coupled TLRNN models developed with the wavelet functions db4 and db8 are found to be the best among the WTLRNN models developed with the other four wavelet functions for D set as three and four. The best WTLRNN models yield NSE values of about 65% and 70% for D set as three and four, respectively; likewise, corresponding RMSE values of about 7 cumecs are found for the best WTLRNN models. The results for D = 3 are shown in Fig. 9(c) and (d), while no results are presented here for D = 4. Similar to the Baihe catchment, the WTLRNN models developed with the six selected wavelet functions are found to perform equally well for D set as five and six, as shown in Fig. 9(e) and (f) for D = 6.

4.4. Discussion

The performance of the best wavelet coupled static MLPNN and dynamic TLRNN models is compared in this section for different memory depths. From the above analysis, the db8 wavelet function is found to produce the best results for small as well as large memory depths, so the wavelet coupled models developed with the db8 wavelet function are considered for this comparison. The good performance of the db8 wavelet function may be due to the fact that, by decomposing a time series up to the maximum level, the information embedded in the data can be revealed well by a composite wavelet function such as db8.

Moreover, in time series data the high frequencies mostly occur only over short periods, while the low frequencies span almost the whole range of the signal. Consequently, wavelets with a larger support length are capable of capturing the low frequencies more precisely. The db8 wavelet performed well compared with the other wavelets used in the present study as it has a reasonable support width and also possesses good time–frequency localization properties. Together, these characteristics allow the hybrid wavelet models developed with the db8 wavelet function to capture both the underlying trend and the short term variability in the time series data.

Figs. 10 and 11 present the performance of the different models for the Baihe and the Brosna catchments, respectively, for different memory depths. Fig. 10(a) and (b) shows the performance of the static wavelet coupled MLPNN models, while Fig. 10(c) and (d) presents the performance of the TLRNN models with different memory depths. It is evident from Fig. 10(a) and (b) that the performance of the wavelet coupled static MLPNN model in terms of NSE (%) and RMSE improves drastically (the NSE (%) value increases and the RMSE value decreases) from I1, with only the one day lagged rainfall data series, to the I2 input vector, with the lagged one and lagged two rainfall data series, and thereafter remains almost constant with the I3, I4 and I5 input vectors. The addition of further lagged data series beyond I2 did not improve the performance significantly. This may be attributed to the fact that the maximum temporal and spectral information contained in the data has already been presented to the network by the lagged one and lagged two wavelet transformed rainfall data series; the addition of the lagged three, four and five day input rainfall data provided no further information to the network and thus resulted in no significant improvement. However, the simple static MLPNN model, which uses the lagged rainfall data in its raw form in the input vector, behaved entirely differently from its counterpart wavelet coupled model. It is evident from Fig. 10(a) and (b) that the performance of the simple model kept increasing from I1 to I2, then I3 and up to I4, and thereafter remained almost constant. This may be because the rainfall data in its raw form does not contain much temporal information. That is why, with the addition of each further lagged time data series to the input vector, the performance of the simple models improved significantly, and this continued up to the I4 input vector, which contains the lagged one to lagged four rainfall data series. With the further addition of one more lagged rainfall data series beyond I4, no significant improvement can be seen, which means that the maximum temporal information contained in the rainfall data has been presented to the network. As far as the performance of the wavelet coupled TLRNN models with different memory depths is concerned, the results are presented in Fig. 10(c) and (d). It is evident from the figures that the memory depth affects the performance significantly, and the trend is almost similar for both the simple and wavelet coupled models: the performance of both increases drastically from memory depth two to memory depth three, which refer to memories of about 1.5 and 2 days, respectively. The performance of the wavelet coupled TLRNN model is consistent with that of the static MLPNN model, as both give maximum performance at a memory of about two days. Both the simple and wavelet coupled models are found to perform best with a memory depth of five, which corresponds to about three days, and thereafter performance starts decreasing. It can further be seen from Fig. 10 that the wavelet coupled MLPNN models outperformed their counterpart simple MLPNN models with input vectors I1, I2 and I3. However, with input vectors I4 and I5, which contain up to the lagged four and lagged five rainfall data series, respectively, the performance of the simple and wavelet coupled MLPNN models is not significantly different. The wavelet coupled TLRNN models, in contrast, outperformed the simple TLRNN models only with D set as one and two; with D taken as three up to six, both models follow almost the same trend.

Fig. 11 shows the impact of memory depth on the performance of the static simple and wavelet coupled MLPNN and dynamic TLRNN models. Fig. 11(a) and (b) presents the performance of the static simple and wavelet coupled models, while Fig. 11(c) and (d) shows the results for the TLRNN models. It is evident from Fig. 11(a) and (b) that the wavelet coupled models outperformed their counterpart simple MLPNN models for all the input vectors considered in the study. However, the wavelet coupled TLRNN models outperformed their counterpart simple models only with memory depths of one and two; with D taken as three up to six, the performance of the simple TLRNN models is not significantly different from that of their counterpart wavelet coupled models. It can, therefore, be concluded on the basis of the above analysis that the wavelet coupled static MLPNN and dynamic TLRNN models performed better relative to their counterpart simple
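The support-width argument above can be made concrete. For the orthogonal Daubechies and Symlet families, a wavelet with N vanishing moments has a filter of length 2N and a support width of 2N − 1 (a standard property of these families; the dmey wavelet, being a long FIR approximation of the Meyer wavelet, is not covered by this formula):

```python
# Vanishing moments of the db/sym wavelets used in the study
# (Haar is db1; dmey is excluded, see the note above).
vanishing_moments = {"haar": 1, "db4": 4, "db8": 8, "sym4": 4, "sym8": 8}

def support_width(n_moments):
    """Support width of an orthogonal dbN/symN wavelet: 2N - 1 samples."""
    return 2 * n_moments - 1

widths = {name: support_width(n) for name, n in vanishing_moments.items()}
```

The db8 and sym8 wavelets thus have a support width of 15 samples against 1 for Haar, which matches the observation that the longer-support wavelets resolve the low-frequency (seasonal) content better.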

Fig. 10. Impact of memory depth on the performance of models for Baihe catchment. [Panels (a), (b): NSE (%) and RMSE (cumecs) vs. input vector I1–I5 for the static models; panels (c), (d): the same vs. memory depth D1–D6 for the TLRNN models; simple vs. wavelet.]

Fig. 11. Impact of memory depth on the performance of models for Brosna catchment. [Panels (a), (b): NSE (%) and RMSE (cumecs) vs. input vector I1–I5 for the static models; panels (c), (d): the same vs. memory depth D1–D6 for the TLRNN models; simple vs. wavelet.]

models. However, in contrast with the simple MLPNN models, the performance of the simple TLRNN models with a memory depth of about 1.5 days is not significantly worse than that of their counterpart wavelet coupled models. Furthermore, the performance of the wavelet coupled MLPNN and TLRNN models is almost similar for both selected catchments, in contrast to their simple counterpart models. This may be due to the fact that the wavelet transformed data contain temporal and spectral information which, when presented to any network type, either static or recurrent, enhances the performance significantly; this is not the case for the simple models, which make use of the rainfall input data in its raw form.

Finally, the flow duration curves (FDCs) of the observed flows and the best models are presented in Fig. 12 in order to check the ability of the best models to simulate the high, medium and low flow tails of the observed hydrograph for both selected catchments. From the FDC, the 10 percentile flow (the flow that is equalled or exceeded 10% of the period of record) can be considered as the high flow threshold. Similarly, the 11 to 89 percentile flows are considered as the medium flow range, while the 90 percentile flow and beyond is considered as the low flow threshold. The medium flow range can be further divided into high medium and low medium percentiles, from 11 to 49 and from 50 to 89, respectively. Fig. 12(a) and (b) shows the FDCs of the static MLPNN and dynamic TLRNN models, respectively, for the Baihe catchment, while Fig. 12(c) and (d) presents the FDCs for the corresponding models of the Brosna catchment. It is evident from Fig. 12(a) that the simple model with input vector I2 (containing the lagged one and lagged two rainfall data series) can only capture the high flow tail of the observed hydrograph, while it overestimates during the medium and low flow tails. The performance of the corresponding wavelet model is much better than that of its counterpart simple model, as it follows the high, medium and low flow tails of the observed hydrograph well. It can further be seen that the performance of the simple and wavelet coupled models with the I5 input vector (containing the lagged
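The FDC construction described above (flow against the percent of time it is equalled or exceeded) can be sketched with a common empirical formulation; the plotting position used by the authors is not stated in this excerpt, so the Weibull form below is an assumption:

```python
import numpy as np

def flow_duration_curve(flows):
    """Empirical FDC: flows sorted in descending order against the percent
    of time each flow is equalled or exceeded (Weibull plotting position)."""
    q = np.sort(np.asarray(flows, dtype=float))[::-1]
    exceedance = 100.0 * np.arange(1, len(q) + 1) / (len(q) + 1)
    return exceedance, q

def percentile_flow(flows, percent_exceeded):
    """Flow equalled or exceeded `percent_exceeded` % of the record,
    e.g. 10 gives the high-flow threshold used in the text."""
    exceedance, q = flow_duration_curve(flows)
    return float(np.interp(percent_exceeded, exceedance, q))

daily_flows = np.arange(1.0, 100.0)   # illustrative record, not catchment data
q10 = percentile_flow(daily_flows, 10)   # high-flow threshold
q90 = percentile_flow(daily_flows, 90)   # low-flow threshold
```

Plotting the observed and simulated series through `flow_duration_curve` on a log discharge axis reproduces the comparison made in Fig. 12.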

Fig. 12. Flow duration curves. [Discharge (cumecs, log scale) vs. percent of time indicated flow was equalled or exceeded; (a) Baihe static models (observed, simple and wavelet with I2 and I5), (b) Baihe TLRNN models (observed, simple and wavelet with D2 and D5), (c) Brosna static models (I2 and I5), (d) Brosna TLRNN models (D2 and D6).]

one day rainfall data series up to the lagged five day rainfall data series) in following the low, medium and high flow tails of the observed hydrograph is almost similar, and these models are as good as the wavelet coupled models with input vector I2. An almost similar trend can be found for the dynamic TLRNN models for the Baihe catchment, as shown in Fig. 12(b); the wavelet coupled TLRNN model with input vector I2 can be considered the best model. It is apparent from Fig. 12(c) and (d) for the Brosna catchment that the simple and wavelet models (both static and dynamic) with input vectors I2 and I5 capture the features of the observed hydrograph well only for the high and medium high flows. The wavelet coupled models outperformed their counterpart simple models for the medium low and low flow tails of the observed hydrographs shown in Fig. 12(c). For the TLRNN models, the simple model with D taken as 2 only describes the average behaviour of the hydrograph, without giving any information regarding the low, medium and high flow tails of the observed hydrograph. The performance of the wavelet coupled TLRNN model with D taken as 2 is much better than that of its counterpart simple model in following the low, medium and high flows of the observed hydrographs. It is further evident from Fig. 12(d) that the simple TLRNN model with D taken as 6 is unable to follow the medium low and low flow periods of the observed hydrograph, whereas the counterpart wavelet coupled TLRNN model is found to be the best in following all the features of the observed hydrograph. It can be concluded on the basis of the above analysis of the FDCs that the wavelet coupled models capture the features of the observed hydrograph well compared with their simple counterparts, and this is true for both the static MLPNN and the dynamic TLRNN models.

5. Conclusions

The following conclusions can be drawn from the present study:

• The dynamic TLRNN models are found to perform equally well compared to the commonly used static MLPNN models and hence can be considered as an alternative to the MLPNN models.
• The performance of the wavelet coupled MLPNN models is found to be sensitive to the selection of wavelet function, with db8 found to be the best among the six selected wavelet functions. The performance of the dynamic TLRNN models, however, is only found to be sensitive to the selection of the wavelet function at smaller memory depths, and again the db8 wavelet function yields the best results. With large memory depths, the performance of all the wavelet coupled TLRNN models is almost similar for all six wavelet functions, even the simple Haar wavelet. It can, therefore, be concluded that the db8 wavelet function should be used with the static MLPNN models and with the dynamic TLRNN models at smaller memory depths, whereas any wavelet function can be used with the dynamic TLRNN models at larger memory depths.
• The performance of the wavelet coupled MLPNN and TLRNN models is equally good for all input vectors/memory depths. However, the simple TLRNN models outperformed their counterpart simple MLPNN models at small memory depths.

References

Abrahart, R.J., Anctil, F., Coulibaly, P., Dawson, C.W., Mount, N.J., See, L.M., Shamseldin, A.Y., Solomatine, D.P., Toth, E., Wilby, R.L., 2012. Two decades of anarchy? Emerging themes and outstanding challenges for neural network river forecasting. Prog. Phys. Geogr. 36 (4), 480–513.
Adamowski, J., Chan, H.F., 2011. A wavelet neural network conjunction model for groundwater level forecasting. J. Hydrol. 407 (1–4), 28–40.
Adamowski, J., Sun, K., 2010. Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J. Hydrol. 390 (1–2), 85–91.
Addison, P.S., 2002. The Illustrated Wavelet Transform Handbook. Institute of Physics Publishing, London.
224 M. Shoaib et al. / Journal of Hydrology 535 (2016) 211–225

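The decomposition step referred to in these conclusions can be illustrated with a minimal, hand-rolled single-level Haar DWT. Haar is shown only because it is the simplest of the six wavelet functions tested; the study itself decomposes the rainfall series at multiple levels with standard wavelet software, and the rainfall values below are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """Single-level discrete Haar wavelet transform.

    Returns the approximation (low-frequency) and detail
    (high-frequency) coefficients of an even-length series.
    """
    x = np.asarray(x, dtype=float)
    assert x.size % 2 == 0, "series length must be even"
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # smoothed trend
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # local fluctuations
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse single-level Haar transform (perfect reconstruction)."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

# Hypothetical daily rainfall series (mm)
rain = np.array([0.0, 2.5, 11.2, 7.4, 0.3, 0.0, 5.1, 9.8])
a, d = haar_dwt(rain)
assert np.allclose(haar_idwt(a, d), rain)  # invertibility check
```

In the wavelet coupled models, sub-series of this kind (rather than the raw rainfall) form the network input vector; higher-order wavelets such as db8 use longer filters but follow the same decomposition scheme.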