
Scour modeling using deep neural networks based on hyperparameter optimization

Mohammed Asim a, Adnan Rashid b, Tanvir Ahmad a,∗

a Department of Computer Engineering, Jamia Millia Islamia (Central University), New Delhi, India
b Department of Civil Engineering, Jamia Millia Islamia (Central University), New Delhi, India

∗ Corresponding author. E-mail addresses: mohammed187900@st.jmi.ac.in (M. Asim), rashid.adnan03@gmail.com (A. Rashid), tahmad2@jmi.ac.in (T. Ahmad).

Received 7 June 2021; received in revised form 19 July 2021; accepted 26 September 2021

Abstract
The design of bridge piers and abutments is significantly impacted by hydrodynamic processes that cause scouring of the foundation. Although many empirical formulae are available in the literature to estimate the depth of scouring, they suffer from several limitations. A major limitation of empirical formulae is that they are largely applicable only to the hydraulic conditions for which they were derived. In this research, a deep neural network (DNN) has been developed and applied to predict the depth of scour around bridge piers and abutments. The practicality of the proposed model has been demonstrated using experimental data sets consisting of 211 data points. The novelty of the DNN model applied herein lies in the use of the Adam Optimizer for optimizing the parameters of the DNN model. The performance of the DNN model was evaluated for each parameter set using statistical indicators such as the coefficient of determination, the root mean square error, and the mean absolute error. A regression equation based upon the available data set has also been proposed. Based upon the values of the statistical parameters, the DNN model has been found to be significantly better than the regression model. A distinct practical advantage of the model proposed herein is that it eliminates the need for a hit and trial procedure to determine the optimal parameter set for the model.
© 2021 The Author(s). Published by Elsevier B.V. on behalf of The Korean Institute of Communications and Information Sciences. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: DNN; Hyperparameter; Scour; Adam; Optimizer

1. Introduction

Scour is defined as the flow-induced erosion of soil particles around bridge piers and abutments due to waves, currents, or other hydrodynamic drivers. The process of scouring exacerbates during floods and high-flow events. Scour around piers has been found to be the major cause of failure of bridges [1,2]. Traditionally, scour depth has been estimated using various equations derived from experimental data. With these equations, the scour depth is either underestimated or overestimated, leading to unsafe or uneconomical design of foundations, respectively. Therefore, it is extremely important to predict scour depth with greater accuracy to provide safe and economical designs. Owing to the importance of scour estimation, numerous studies have been carried out in the past to ascertain the amount of scouring under different hydraulic conditions. A large number of studies are still ongoing owing to the great practical significance associated with the estimation of the maximum local scour around such structures.

Scour around bridge piers and abutments is a complex phenomenon due to the complicated non-linear relationship between the predictor and response variables. The traditional method for the prediction of scour around bridge piers and abutments is based upon experimental investigations. The problems experienced by researchers in the experimental and theoretical estimation of scour around bridge piers have led to interest in the application of artificial neural networks (ANNs) in scour modeling.

Several studies have demonstrated the effectiveness of ANNs in predicting scour depth at bridge piers [3,4]. However, determination of an appropriate network structure for an ANN is a daunting task. Specifically, choosing the size of the input layer, the type of learning curve, the number of hidden layers, and the number of nodes in the different layers, among other
parameters, is still quite challenging. Often, a hit and trial


method based on user’s intuition and experience is employed
to determine the best set of parameters. However, such a set
of parameters obtained through hit and trial method may not
necessarily be the optimal set of parameters for the problem
under consideration. Therefore, the benefits of ANNs are not
maximized when the optimal parameter set is not used while
implementing them. The application of novel optimization
algorithms for the determination of optimal parameter sets for
use in neural networks can lead to improved performance of
the model. In this paper, a deep neural network (DNN) based
on hyperparameter optimization has been proposed to predict
scour depths around circular bridge piers. A DNN is a type of
neural network that has multiple hidden layers.
Fig. 1. General structure of a DNN.
2. Dataset and methods

In this research, we used nine experimental datasets described in [5–11] and [12]. The dataset considered herein comprises 211 observations. An important consideration in the selection of the database was to ensure that the ranges of the different parameters affecting the scour process are sufficiently wide. The range of L/y shows that the dataset includes both short and long abutments. The range of U/Uc shows that the data points were collected under both clear-water and live-bed conditions. In addition, the range of σg shows that the experiments were conducted using both uniform and non-uniform sediments. The d50 range indicates that both fine and coarse sediments were studied.

For the development of the DNN model, 70% of the observations were used for training and the remaining 30% for testing. With the data set of 211 observations as input, the software MINITAB was used to conduct the multiple linear regression analysis. For the DNN model, a total of 147 observations was used to train the model, whereas the remaining 64 observations were used to test the model.
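As a rough illustration of this 70/30 partition, the split could be reproduced along the following lines; the file name scour_data.csv and the column names are hypothetical placeholders rather than the authors' actual files.

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file and column names; the 211 observations provide
# U/Uc, L/y, sigma_g, Fr and d50/y as predictors and ds/y as the response.
data = pd.read_csv("scour_data.csv")
X = data[["U_Uc", "L_y", "sigma_g", "Fr", "d50_y"]].values
y = data["ds_y"].values

# 70% of the 211 observations (147) for training, 30% (64) for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)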
2.1. Deep neural networks

Neural networks are among the most powerful learning algorithms. A neuron in the human body has dendrites that act as input wires and axons that act as output wires; the axons are the components that send signals to other neurons in the brain. Artificial Neural Networks (ANNs) were developed in the 1980s to simulate neurons, or networks of neurons, in the brain. Their popularity declined in the mid-1990s, mainly due to the computational complexity associated with their application to problems of moderate size. A typical neural network consists of an input layer, a hidden layer, and an output layer. A network can have one or more hidden layers; when a network has more than one hidden layer, it is called a DNN. Owing to the easy availability of large computing facilities, DNNs have gained huge popularity in recent times. Several applications of DNNs to complex problems from diverse fields can be found in the literature (for example, [13–16]).

A good choice of network architecture is critical for the efficient performance of a neural network. A network architecture essentially represents the connectivity pattern between the neurons and comprises the input layer, the hidden layers, and an output layer. The choice of the input and output layers is relatively straightforward: the number of neurons in the input layer is equal to the number of independent variables in a problem, whereas the number of neurons in the output layer is equal to the number of dependent variables. In classification problems, the number of neurons in the output layer is usually more than one. Evidently, the optimal number of hidden layers and the number of neurons in the respective hidden layers are two important factors that determine the performance of any neural network. In the present work, the number of hidden layers and the number of neurons in the respective hidden layers were determined using hyperparameter optimization [13] instead of a hit and trial procedure. However, the Adam optimizer itself contains several parameters. The major parameters of the Adam optimizer include the learning rate (α), the exponential decay rate for the first moment estimates (β1), the exponential decay rate for the second moment estimates (β2), and epsilon (ε). Kingma and Ba [13] suggest that α = 0.001, β1 = 0.9, β2 = 0.999, and ε = 10^−8 are good default settings for machine learning problems. In the Keras library used in this research, the values of the Adam optimizer parameters suggested by Kingma and Ba [13] are provided as an input.

The ReLU function defined by (1) is used as the activation function in the DNN model:

f(x) = x for x > 0, and f(x) = 0 for x ≤ 0; equivalently, f(x) = max(0, x)    (1)

The output of the ReLU function is the input itself when the input is positive, whereas the output is zero when the input is negative or zero. A general structure of a DNN is shown in Fig. 1, whereas the working of a typical neuron is shown in Fig. 2.

Fig. 2. Working of a typical neuron.
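For orientation, a network of this kind could be assembled with the Keras API roughly as follows. This is a minimal sketch, not the authors' code: the hidden-layer sizes shown (50, 90, and 60 neurons) match the optimum reported later in Section 3.2, and the optimizer is given the default settings quoted above, whereas in the study these choices were made by the tuner described in Section 2.2.

from tensorflow import keras
from tensorflow.keras import layers

# Five predictors (U/Uc, L/y, sigma_g, Fr, d50/y) and one response (ds/y).
model = keras.Sequential([
    keras.Input(shape=(5,)),
    layers.Dense(50, activation="relu"),
    layers.Dense(90, activation="relu"),
    layers.Dense(60, activation="relu"),
    layers.Dense(1)  # linear output neuron for ds/y
])

# Adam with the default settings alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
# and the mean squared error as the loss function.
adam = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                             epsilon=1e-8)
model.compile(optimizer=adam, loss="mse")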
2.2. Hyperparameter optimization

In the present research, hyperparameter optimization based on the Adam Optimizer was carried out using the Keras Tuner library
in Python to determine the optimal values of hyperparameters
that control the learning process for a learning algorithm [17].
The Keras Tuner library uses the backpropagation algorithm to train the network and a loss function to evaluate the model performance. In the present work, the mean squared error (MSE) was specified as the loss function. The objective was to minimize the mean squared error between the observed and the predicted values of the dependent variable, that is, the depth of scour. The parameter optimization was carried out using the Adam Optimizer [13].

Adam combines two gradient descent algorithms: AdaGrad (adaptive gradient algorithm) [18] and RMSProp (root mean square propagation) [19].

The following steps are used in the Adam Optimizer to determine the optimal set of weights and biases (a code sketch of these updates follows the list).

1. Select the values of α, β1, and β2.
2. Initialize the first moment vector m and the second moment vector v to zero. Set t = 0.
3. On iteration t, set t = t + 1 and obtain the gradient g_t = grad(θ_{t−1}).
4. Update the first moment: m_t = β1·m_{t−1} + (1 − β1)·g_t.
5. Update the second moment: v_t = β2·v_{t−1} + (1 − β2)·g_t².
6. Compute the bias-corrected first moment: m̂_t = m_t / (1 − β1^t).
7. Compute the bias-corrected second moment: v̂_t = v_t / (1 − β2^t).
8. Update the parameters: θ_t = θ_{t−1} − α·m̂_t / (√v̂_t + ε).
9. Stop the iterative procedure when the termination criterion defined by ε is met.
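A minimal NumPy sketch of these update rules for a single parameter vector is given below; grad_fn is a hypothetical function that returns the gradient of the loss with respect to the parameters, and the fixed step count stands in for the tolerance-based stopping rule of step 9.

import numpy as np

def adam_optimize(grad_fn, theta, alpha=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, n_steps=500):
    """Iterate the Adam update rules (steps 1-9 above) on theta."""
    m = np.zeros_like(theta)  # first moment vector, initialized to zero
    v = np.zeros_like(theta)  # second moment vector, initialized to zero
    for t in range(1, n_steps + 1):
        g = grad_fn(theta)                  # step 3: gradient at theta_{t-1}
        m = beta1 * m + (1 - beta1) * g     # step 4: first moment update
        v = beta2 * v + (1 - beta2) * g**2  # step 5: second moment update
        m_hat = m / (1 - beta1**t)          # step 6: bias-corrected m_t
        v_hat = v / (1 - beta2**t)          # step 7: bias-corrected v_t
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)  # step 8
    return theta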
Upon the termination of the loop, the optimal values of the weights and biases are expected to be found.

The parameters of the Adam optimizer (α, β1, β2, ε) are provided as an input to the Keras Tuner library. In addition to the parameters of the Adam optimizer, the ranges of values of the hyperparameters (number of hidden layers, number of neurons in each hidden layer, learning rate) are provided as an input to the Keras Tuner library. The optimal hyperparameter set based on the Adam optimizer is then determined using the Keras library. Once the optimal hyperparameter set was identified, the DNN model was applied to the data that was not seen by the model during the train-test split. The dataset used herein consists of 211 pier and abutment scour measurements, out of which a total of 147 observations was used to train the model, whereas the remaining 64 observations were used to test the model. The objective function in the Adam optimization algorithm is the MSE defined as

MSE = (1/n) Σ_{i=1}^{n} (Y_i − Ỹ_i)²    (2)

where n = number of data points, Y_i = observed values, and Ỹ_i = predicted values. The goal of the Adam optimizer is to minimize the objective function given by (2). The ReLU activation function was used, and the training was carried out with the Adam optimizer for 500 epochs.
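As a hedged illustration of how such a search could be set up with the Keras Tuner library [17], the sketch below varies the number of hidden layers, the neurons per layer, and the learning rate. The search ranges, the RandomSearch tuner, and the variables X_train and y_train (from the split sketch in Section 2) are assumptions for illustration, not the authors' exact settings.

import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    # Assumed search space: 1-4 hidden layers, 10-100 neurons per layer,
    # and two candidate learning rates for the Adam optimizer.
    model = keras.Sequential()
    model.add(keras.Input(shape=(5,)))
    for i in range(hp.Int("num_layers", 1, 4)):
        model.add(layers.Dense(hp.Int(f"units_{i}", 10, 100, step=10),
                               activation="relu"))
    model.add(layers.Dense(1))
    lr = hp.Choice("learning_rate", [1e-3, 1e-2])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr), loss="mse")
    return model

tuner = kt.RandomSearch(build_model, objective="val_loss",
                        max_trials=20, project_name="scour_dnn")
tuner.search(X_train, y_train, epochs=100, validation_split=0.2)
best_model = tuner.get_best_models(num_models=1)[0]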
3. Results and discussion

In this paper, the performance of the DNN model for the training and testing datasets was evaluated through statistical parameters such as the coefficient of determination (R2), the root mean square error (RMSE), and the mean absolute error (MAE).

3.1. Regression analysis of the scour prediction model

Based upon the nonlinear regression analysis conducted in this research, the following equation for the estimation of scour depth around bridge piers and abutments has been proposed:

ds/y = 2.668 (U/Uc)^(−0.116) (L/y)^(0.358) (σg)^(0.489) (d50/y)^(0.0165) (Fr)^(0.638)    (3)

where U = average velocity of approach flow, Uc = threshold velocity for sediment motion, ρ = mass density of the fluid, ρs = mass density of the sediment, g = acceleration due to gravity, L = projected length of the abutment in the direction of flow, y = approaching flow depth, d50 = median grain size, σg = geometric standard deviation of the sediments, and ds = equilibrium scour depth. The regression equation was developed using 70% of the data points, and the remaining 30% of the data was used to validate the accuracy of the predictions.
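To make the use of Eq. (3) concrete, a single prediction could be computed as in the sketch below; the numerical input values are purely illustrative and are not a data point from the experimental datasets.

def regression_scour_depth(U, Uc, L, y, sigma_g, d50, Fr):
    # Relative scour depth ds/y from the regression equation, Eq. (3).
    return (2.668
            * (U / Uc) ** -0.116
            * (L / y) ** 0.358
            * sigma_g ** 0.489
            * (d50 / y) ** 0.0165
            * Fr ** 0.638)

# Illustrative (made-up) flow condition, in SI units.
print(regression_scour_depth(U=0.35, Uc=0.40, L=0.20, y=0.10,
                             sigma_g=1.30, d50=0.0008, Fr=0.35))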

Fig. 3. Scatterplot of observed and predicted values from the regression model using the test dataset.

The scatter plot of observed and predicted values from the regression equation using the test data is shown in Fig. 3. A quantitative performance evaluation of the regression model based upon R2, MSE, and MAE was carried out for the test data as well as for the entire data set. The values of R2, MSE, and MAE for the test data and for the entire data set are presented in Table 1.

Table 1
Statistical performance of the DNN and regression models.

              Testing data                 Entire data
              MSE      MAE      R2         MSE      MAE      R2
DNN           0.0156   0.087    0.987      0.0151   0.077    0.988
Regression    0.103    0.253    0.933      0.219    0.286    0.821

The value of R2 was 0.933 when the test data was used to predict the values of ds/y using the equation developed in this work. A considerable degree of scatter was obtained around the 45° line, as shown in Fig. 3. The MSE computed using the observed values and the values predicted by the regression equation was 0.103, which is acceptable. For the case when the test data was considered, the value of the MAE was found to be 0.253.

Fig. 4. Scatterplot of observed values and values predicted from the regression equation using the entire dataset.

A comparison of observed ds/y and values of ds/y predicted by the regression equation for the entire dataset is presented in Fig. 4. The value of R2 was found to be 0.821, which was lower than that obtained when only the test data was provided as an input to the regression model. This was due to the greater number of data points considered in computing R2 when the entire dataset was provided as input to the regression equation. The MSE, in this case, was also higher (0.219) than in the case when only the test data was provided as input to the regression equation. A value of 0.286 was obtained for the MAE. An R2 value of 0.81 during the testing stage and 0.77 during the training stage of their regression model was reported in [20]; in our case, the values of R2 were significantly higher than those reported in [20].

3.2. Performance evaluation of the DNN model

The input layer of the DNN comprised five neurons (U/Uc, L/y, σg, Fr, d50/y), and the output layer consists of a single neuron (ds/y). The DNN model was trained with 147 sets of observed values, whereas the testing of the model was carried out using 64 sets of observed values. Using hyperparameter optimization, a DNN with three hidden layers containing 50, 90, and 60 neurons in the respective hidden layers, and a learning rate of 0.01, was found to be optimal. With this optimal set of parameters, the DNN model was trained for 500 epochs.

Fig. 5. Progression of the loss curve with epochs during the training process.
Fig. 6. Progression of the loss curve with epochs during the validation process.

The progression of the loss during the training process, where the loss is defined by the MSE, is shown in Fig. 5. The loss curve measures the model error, and it can be seen from Fig. 5 that the MSE drops abruptly from a value greater than 4 at the start of training to close to zero within the first few epochs. Thereafter, the MSE decreases steadily with epochs and becomes constant at around 450 epochs; no further improvement is visible after 450 epochs. It can be seen from the loss curves (Figs. 5 and 6) that the initial choice of training the model for 500 epochs was appropriate. A zig-zag loss curve indicates over-fitting of the model; in the present case, the loss curves for both training and validation are relatively smooth, which indicates that there is no over-fitting of the model.

The scatter plots of the observed values of ds/y and those predicted with the DNN model for the test data and the entire data are shown in Fig. 7 and Fig. 8, respectively. It is evident from Figs. 7 and 8 that there is very good agreement between the observed and predicted values for the DNN model trained using the optimal parameter set based on the Adam optimizer.
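The statistical indicators reported in Table 1 can be computed from the observed and predicted ds/y values along the following lines; this is a sketch, and y_test, X_test, and best_model are assumed to come from the earlier split and tuning sketches.

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_pred = best_model.predict(X_test).ravel()  # DNN predictions of ds/y

mse = mean_squared_error(y_test, y_pred)     # mean squared error
rmse = np.sqrt(mse)                          # root mean square error
mae = mean_absolute_error(y_test, y_pred)    # mean absolute error
r2 = r2_score(y_test, y_pred)                # coefficient of determination

print(f"MSE={mse:.4f}, RMSE={rmse:.4f}, MAE={mae:.4f}, R2={r2:.3f}")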
Fig. 7. Scatter plot of observed and predicted values of ds/y for the test data, DNN model.
Fig. 8. Scatter plot of observed and predicted values of ds/y for the entire data, DNN model.

Table 1 presents the values of the various statistical indicators computed using the output from the DNN as well as the regression model for the testing dataset and the entire dataset. It can be seen from Table 1 that the R2 value for the DNN model is significantly better than that for the regression model: for the DNN model, R2 was 0.987, compared to 0.933 for the regression model. It can also be seen from Table 1 that the DNN model has an MSE of 0.0156 and 0.0151 for the testing and the entire datasets, respectively; the corresponding values for the regression model are 0.103 and 0.219. The MAE of the DNN model for the testing and the entire data was 0.087 and 0.077, respectively; the corresponding values for the regression model are 0.253 and 0.286, which are significantly higher than those for the DNN model. Very good agreement was found between the observed relative scour depths and those predicted by the DNN model. From the results presented in Table 1, it is quite clear that the performance of the DNN model is far superior to that of the regression model.

4. Conclusions

The performance of both the regression model and the DNN model in predicting the scour depth has been evaluated using various statistical indicators. The focus of the research was, however, on the development of a DNN model whose parameters could be determined through hyperparameter optimization instead of a hit and trial procedure. The DNN predicted scour depth more accurately than the regression model developed using the same data set as that used for the DNN. The analysis of the statistical indicators clearly indicates that the DNN trained using the optimal parameter set obtained through the Adam Optimizer performed better than the regression model. The improved performance of the DNN may be attributed to the Adam Optimizer, which can solve complex nonlinear problems with greater efficiency than classical algorithms. The Adam optimizer could, therefore, be an efficient tool to determine the optimal parameter set for a DNN. A distinct practical advantage of the DNN model proposed in this research is that it is computationally efficient, requires little memory, and eliminates the need for a hit and trial procedure to determine the optimal parameter set.

CRediT authorship contribution statement

Mohammed Asim: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing. Adnan Rashid: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing. Tanvir Ahmad: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment

Approval of the version of the manuscript to be published (the names of all authors must be listed): Mohammed Asim, Adnan Rashid, Tanvir Ahmad.

The authors gratefully acknowledge the computing support received at the APPLE Inc. supported laboratory of the Department of Computer Engineering at Jamia Millia Islamia (Central University), New Delhi, India.

References

[1] U.C. Kothyari, K.G. Ranga Raju, Scour around spur dikes and bridge abutments, J. Hydraul. Eng. 39 (2001) 367–374, http://dx.doi.org/10.1080/00221680109499841.
[2] M. Pandey, Z. Ahmad, P.K. Sharma, Scour around impermeable spur dikes: a review, ISH J. Hydraul. Eng. (2017) 1–20, http://dx.doi.org/10.1007/s10652-017-9529-9.
[3] T.L. Lee, D.S. Jeng, G.H. Zhang, et al., Neural network modeling for estimation of scour depth around bridge piers, J. Hydrodyn. 19 (2007) 378–386, http://dx.doi.org/10.1016/S1001-6058(07)60073-0.
[4] A. Kaya, Artificial neural network study of observed pattern of scour depth around bridge piers, Comput. Geotech. 37 (3) (2010) 413–418.
[5] A.K. Barbhuiya, M.H. Mazumder, Live-bed local scour around vertical-wall abutments, ISH J. Hydraul. Eng. 20 (3) (2014) 339–351.
[6] A.E. Shahidi, M.S. Rohani, Prediction of scour at abutments using piecewise regression, Proceedings of the Institution of Civil Engineers Water Management 167 (WM2) (2014) 79–87, http://dx.doi.org/10.1680/wama.11.00100.
[7] S.Y. Kayatürk, Scour and Scour Protection at Bridge Abutments (Ph.D. thesis), Civil Eng. Department, METU, 2005.
[8] S. Dey, A.K. Barbhuiya, Time variation of scour at abutments, J. Hydraul. Eng. 131 (1) (2005) 11–23.
[9] E.S.R. Chaurasia, P.B.B. Lal, Local scour around bridge abutments, Int. J. Sed. Res. 17 (1) (2002) 48–74.
[10] E.V. Richardson, S.R. Davis, Evaluating scour at bridges, in: Hydraulic Engineering Circular No. 18, fourth ed., Federal Highway Administration, Arlington, VA, 2001.
[11] D.C. Froehlich, Local scour at bridge abutments, in: Proc. ASCE National Hydraulic Conference, Colorado Springs, Colorado, 1989, pp. 13–18.
[12] M.A. Gill, Erosion of sand beds around spur dikes, J. Hydraul. Div. ASCE 98 (HY9) (1972) 1587–1602.
[13] D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, 2015, arXiv preprint arXiv:1412.6980.
[14] C. Affonso, A.L.D. Rossi, F.H.A. Vieira, A.C.P. de Leon Ferreira, Deep learning for biological image classification, Expert Syst. Appl. 85 (2017) 114–122.
[15] E. Chong, C. Han, F.C. Park, Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies, Expert Syst. Appl. 83 (2017) 187–205.
[16] A. Guven, A multi-output descriptive neural network for estimation of scour geometry downstream from hydraulic structures, Adv. Eng. Softw. 42 (3) (2011) 85–93.
[17] F. Chollet, et al., Keras, 2015, URL https://github.com/keras-team/keras.
[18] J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12 (2011) 2121–2159.
[19] G. Hinton, N. Srivastava, K. Swersky, Neural Networks for Machine Learning, Lecture 6, University of Toronto and Coursera, 2012.
[20] M. Muzzammil, Application of neural networks to scour depth prediction at the bridge abutments, Eng. Appl. Comput. Fluid Mech. 2 (1) (2008) 30–40, http://dx.doi.org/10.1080/19942060.2008.11015209.
