
Accepted Manuscript

Slope stability prediction using integrated metaheuristic and machine learning approaches: A comparative study

Chongchong Qi, Xiaolin Tang

PII: S0360-8352(18)30064-0
DOI: https://doi.org/10.1016/j.cie.2018.02.028
Reference: CAIE 5090

To appear in: Computers & Industrial Engineering

Received Date: 28 August 2017
Revised Date: 15 January 2018
Accepted Date: 18 February 2018

Please cite this article as: Qi, C., Tang, X., Slope stability prediction using integrated metaheuristic and machine
learning approaches: A comparative study, Computers & Industrial Engineering (2018), doi: https://doi.org/
10.1016/j.cie.2018.02.028

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers
we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting proof before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Slope stability prediction using integrated metaheuristic and machine
learning approaches: A comparative study

Chongchong Qi¹, Xiaolin Tang²


¹ Ph.D. Student, School of Civil, Environmental and Mining Engineering, The University of Western Australia, Perth, Western Australia, Australia
² Ph.D. Student, Planning and Transport Research Centre, The University of Western Australia, Perth, Western Australia, Australia
Corresponding author: Chongchong Qi, email: 21948042@student.uwa.edu.au, telephone: 0415522736

Present address: 35 Stirling Highway, The University of Western Australia, Perth, WA, Australia


Abstract: Advances in dataset collection and machine learning (ML) algorithms are important contributors to stability analysis in industrial engineering, especially slope stability analysis. In the past decade, various ML algorithms have been used to estimate slope stability on different datasets, yet a comprehensive comparative study of the most advanced ML algorithms is lacking. In this article, we propose and compare six integrated artificial intelligence (AI) approaches for slope stability prediction based on metaheuristic and ML algorithms. Six ML algorithms, including logistic regression, decision tree, random forest, gradient boosting machine, support vector machine, and multilayer perceptron neural network, were used for the relationship modelling, and the firefly algorithm (FA) was used for hyper-parameter tuning. Three performance measures, namely confusion matrices, the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC), were used to evaluate the predictive performance of the AI approaches. We first demonstrate that integrated AI approaches have great potential for slope stability prediction and that FA is efficient in hyper-parameter tuning. The AUC values of all AI approaches on the testing set were between 0.822 and 0.967, denoting excellent performance. The optimum support vector machine model with Youden's cutoff is recommended in terms of the AUC value, the accuracy, and the true negative rate. We also investigated the relative importance of the influencing variables and found that cohesion was the most influential variable for slope stability, with an importance score of 0.310. This research provides useful recommendations for future slope stability analysis and can find wider application across industrial engineering.

Keywords: Slope stability prediction; Integrated AI approaches; Machine learning algorithms; Firefly algorithm; Variable importance

Nomenclature

AI Artificial intelligence
ANN(s) Artificial neural network(s)
AUC Area under the ROC curve
CV Cross validation
DT Decision tree
FA Firefly algorithm
GBM Gradient boosting machine
LR Logistic regression
ML Machine learning
MLPNN Multilayer perceptron neural network
MOAs Metaheuristic optimisation algorithms
OCM(s) Optimum classification model(s)
RF Random forest
ROC Receiver operating characteristic
SVM Support vector machine
TP, FP, TN, and FN True positive, false positive, true negative, and false negative
TPR and TNR True positive rate and true negative rate
D Training set
p Number of input variables
m Number of samples in the training set D
xi and yi Inputs and output
w, C, c, and b Model parameters
1. Introduction

The safety of many industrial engineering projects, such as mountain roads, earth dams and retaining walls, is influenced by the stability of slopes. Slope failures are therefore highly undesirable events that have caused disastrous consequences in many countries. With economic development and population growth, more man-made facilities have to be constructed under the threat of slope failures. There is, consequently, a pressing need for the quick estimation of slope stability before any crucial decisions regarding slope design and remedial support can be made.

Developing an approach toward slope stability prediction is very challenging as a precise estimation
of slope stability involves many physical and geometric variables. Such prediction approaches must
be user-friendly and provide a high level of accuracy. Furthermore, the prediction should be made in a
short computational time as fast estimations are demanded during the engineering application. These
requirements have increased the difficulty in developing prediction approaches for slope stability.
However, reliable and accurate slope stability prediction can identify collapse-prone areas, determine
the appropriate retaining structures and establish efficient excavation plans (Cheng & Hoang, 2015;
Ghosh et al., 2015). Therefore, many researchers have attempted to develop approaches to estimate
slope stability, which can be mainly classified into analytical methods, numerical modelling, and
artificial intelligence (AI) based methods.

Analytical methods can be used to identify the most probable sliding surface and calculate the factor
of safety based on the slope displacement model. The limit equilibrium method (LEM) and the
circular/non-circular failure surface method are the most widely used analytical methods for the slope
stability analysis. Faramarzi et al. (2017) employed the LEM to analyse the stability of the Cham-Shir
dam power plant pit in Iran. The slope stability analysis based on the circular failure surface method
has been used by Wang et al. (2016) to model lateral enlargement in dam breaches. Though analytical
methods are computationally efficient, they fail to provide a complete understanding of the slope
behaviour due to their inherent drawbacks, such as simplifications and required input parameters for
the whole studied region. Therefore, analytical methods are only applicable to slopes with simple
geometries and in small regions (Song et al., 2012).

Different from analytical methods, numerical modelling has emerged as a theoretically more
realistic and rigorous method for slope stability analysis (Li et al., 2016). Using numerical modelling,
Tsiampousi et al. (2016) investigated the influence of soil-atmosphere interaction on the stability and
serviceability of a slope cut in London clay. Kokutse et al. (2016) conducted a numerical analysis
using the software PLAXIS 2D to investigate the influence of vegetation on slope stability. Numerical
modelling was also used by Azarafza et al. (2017) to assess the stability of discontinuous rock slopes.
Nevertheless, the major drawback of numerical modelling is that its input parameters need to be back-analysed using in-situ measurements, which are not available in many cases (Qi et al., 2017).

Recently, artificial intelligence (AI) techniques have been utilised for the discrimination of slopes. AI
techniques are proposed based on machine learning (ML) algorithms to learn the relationship between
slope stability and its influencing variables from historical data (Cheng & Hoang, 2014). Das et al.
(2011) developed different artificial neural networks (ANNs) to predict the stability of slopes and
estimate the factors of safety. Gordan et al. (2016) used ANN and particle swarm optimisation to
estimate slope stability during earthquakes. Hoang and Bui (2017) carried out a comparative study of slope stability estimation using a radial basis function neural network, an extreme learning machine and a least squares support vector machine. Although the above-mentioned studies are significant, several problems still need to be properly addressed: (1) only a limited set of advanced ML algorithms has been used in slope stability prediction, and the feasibility of other advanced ML algorithms, such as random forest, has not been extensively explored; (2) the implementation of ML algorithms always requires a proper setting of hyper-parameters, whereas the capability of the firefly algorithm (FA) in optimising these hyper-parameters has not been fully investigated on slope datasets; (3) the critical failure mode should be carefully considered during the preparation of the slope dataset (Sakellariou & Ferentinou, 2005), which is ignored in several studies, such as Hoang & Pham (2016); (4) a systematic, quantitative comparison of the available ML algorithms is still lacking, although performance differences may be substantial in their application to slope stability prediction.

To address the above problems, this paper proposes and compares six integrated AI approaches for slope stability prediction. These AI approaches use ML algorithms for the relationship modelling and FA for the hyper-parameter tuning. Six ML algorithms were used, including logistic regression (LR), decision tree (DT), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), and multilayer perceptron neural network (MLPNN). This research can serve as a benchmark study for the application of AI approaches to slope stability prediction, which is of great significance during slope design. The outline of this study is as follows. Section 2 presents a brief introduction to the ML algorithms and FA and their applications in industrial engineering. Section 3 describes the slope stability problem, including the dataset collection and the selection of influencing variables. Section 4 demonstrates the methodology of the proposed AI approaches. Section 5 provides the results and discussion as well as the contributions and limitations of the current study, while Section 6 summarises the findings.

2. Machine learning and firefly algorithms

In this paper, ML algorithms were used to model the relationship between slope stability and its influencing variables, while FA was used for hyper-parameter tuning. A brief introduction to the ML algorithms and FA is presented in this section, together with their applications in industrial engineering.

2.1. Machine learning algorithms

Six ML algorithms were selected in the current study for the prediction of slope stability. The
selection of these six ML algorithms, including LR, DT, RF, GBM, SVM, and MLPNN, is based on
their common characteristics:

i) All algorithms have been widely used in industrial engineering and have been proven to have
good predictive performance (Qi et al., 2017);

ii) All algorithms can deal with classification problems with multiple influencing variables and
can model non-linear relationships between inputs and outputs;

iii) Standard procedures for the implementation of these ML algorithms have been established
and some of them are recognized as top data mining algorithms (Wu et al., 2008).

A brief introduction to each ML algorithm is provided and detailed descriptions can be found in the
relevant references (Kuhn & Johnson, 2013).

2.1.1. Logistic regression

LR, also known as logit regression, is a type of generalised linear model used to estimate the probability of a binary response based on one or more independent variables (Akgun, 2012). It allows a multivariate regression model to be established between an output variable and multiple input variables. Suppose a training set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$ with $x_i \in \mathbb{R}^p$ and $y_i \in \{-1, +1\}$, where $p$ is the number of input variables. The probability of $y$ given $x$ can be expressed by Equation (1):

$$P(y \mid x; w) = \frac{1}{1 + e^{-y w^T x}} \qquad (1)$$

where $w$ represents the model parameters (also known as the regression coefficients). As an optimisation problem, logistic regression minimises the following regularised cost function:

$$\min_{w,\, c}\ \frac{1}{2} w^T w + C \sum_{i=1}^{m} \log\!\left(\exp\!\left(-y_i \left(x_i^T w + c\right)\right) + 1\right) \qquad (2)$$

In industrial engineering, LR has been used to monitor heterogeneous usage rate for subscription-
based services (Samimi & Aghaie, 2011) and improve the effectiveness of training evaluation (Wang
et al., 2015).
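As a concrete illustration, for the positive class Equation (1) reduces to a sigmoid of the linear score, which can be computed in a few lines of pure Python. The weights and inputs below are hypothetical illustrations, not fitted values from this study:

```python
import math

def logistic_prob(w, x, b=0.0):
    """Sigmoid of the linear score w^T x + b, as in Equation (1)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical coefficients for two scaled influencing variables
# (e.g. cohesion and slope angle), purely for illustration:
w = [2.0, -1.5]
x = [0.8, 0.3]
p_stable = logistic_prob(w, x)
print(round(p_stable, 3))  # prints 0.76
```

A fitted LR model would instead obtain $w$ (and the intercept) by minimising the regularised cost of Equation (2).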

2.1.2. Decision tree


DT is a decision support technique that uses a tree-like graph to assist decision-making. It is a non-parametric model with no presumed relationship between output and input variables, which performs effectively in prediction problems. The procedure of DT comprises two major stages, i.e. growing the tree and pruning it.

Generally, the tree consists of a root node, intermediate nodes, leaf nodes, and the branches connecting them. According to the growing rules, the samples in the root node are divided into subsets so that samples within the same subset are as homogeneous as possible, while samples in different subsets are as heterogeneous as possible. All intermediate nodes comply with this growing rule. The iteration finishes once all samples in a node belong to the same class or the maximum depth is reached. The nodes at the ends of branches are marked as leaf nodes, each of which assigns a class to all samples it contains.

Pruning is an efficient approach to help DT avoid overfitting, consisting of pre-pruning and post-pruning. The principle of pruning is to cut branches that contribute little to the generalisation ability of the tree. The structure of a DT is shown in Fig. 1. In recent years, many studies in the literature have utilised DT as a tool in industrial engineering, e.g. training effectiveness evaluation (Wang et al., 2015) and online detection of mean shifts (Guh & Shiue, 2008).
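The homogeneity-driven growing rule can be illustrated with a single split, i.e. a depth-1 tree. The sketch below, with hypothetical data, picks the threshold that minimises the weighted Gini impurity of the two child nodes:

```python
def gini(labels):
    """Gini impurity of a node: lower means a more homogeneous node."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(xs, ys):
    """Choose the threshold on one variable minimising the weighted
    child impurity, mimicking a DT growing rule."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical cohesion values (kPa) with stability labels (1 = stable):
xs = [5, 8, 12, 20, 25, 30]
ys = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)  # prints 12 0.0
```

A full DT applies this rule recursively to each child node until a stopping criterion (e.g. maximum depth) is met.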

2.1.3. Random forest

RF is an ensemble of several unpruned DTs (i.e. the base learner), which are built randomly and are
different from each other. The RF algorithm constructs individual DTs based on bagging, using
bootstrap sampling where samples are taken randomly with replacement from the training set. The
main idea of the RF algorithm is to construct a collection of DTs with controlled variations.

Random feature selection is also important in the process of training. For every node of each DT, a
subset containing k influencing variables is selected from all the influencing variables contained in
this node. Then, only those selected influencing variables are used for splitting. The k selected
influencing variables are different in every DT, ensuring the diversity of individual DTs. Therefore, the generalisation ability can be effectively increased by this combination of sample disturbance and variable disturbance on DTs. Notably, every DT in the forest makes a prediction separately and the final decision is reached by majority voting of all DTs. The application of RF in industrial
engineering includes child garment size matching (Pierola et al., 2016) and customer profitability
estimation (Fang et al., 2016).
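The bagging and voting steps can be sketched as follows in pure Python, with hypothetical stand-ins for the trained trees (each "tree" here simply predicts the majority label of its bootstrap sample):

```python
import random

def bootstrap(data, rng):
    """Draw a bootstrap sample: same size as the data, with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Final RF decision: the class predicted by the most trees."""
    return max(set(predictions), key=predictions.count)

rng = random.Random(42)
# Hypothetical (case_id, label) pairs standing in for slope cases:
data = [("case%d" % i, i % 2) for i in range(10)]
samples = [bootstrap(data, rng) for _ in range(5)]  # one sample per "tree"

# Stand-in for five trained trees: each predicts its sample's majority label.
votes = [majority_vote([label for _, label in s]) for s in samples]
result = majority_vote(votes)
print(result)
```

In a real RF, each bootstrap sample would train an unpruned DT that also uses random feature selection at every split.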

2.1.4. Gradient boosting machine

GBM is an ensemble of weak prediction models, such as DTs, combined to increase the prediction accuracy. GBM builds the model in a stage-wise fashion, and the scope of GBM application is much wider than that of other boosting methods as it allows the optimisation of an arbitrary differentiable loss function.

GBM is an accurate and efficient technique that can be utilised for both classification and regression problems. GBM modelling has been widely used in many research fields, such as web search ranking and ecology. The advantages of GBM include the natural handling of mixed-type data, robust predictive performance, and effectiveness in dealing with outliers in the output space. In civil and industrial engineering, GBM has been applied to predict stope hangingwall stability (Qi et al., 2017).
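The stage-wise idea can be illustrated with a toy pure-Python gradient boosting for squared-error regression, where each stage fits a one-split stump to the current residuals. The data and settings are hypothetical, and this is a sketch of the principle, not the GBM implementation used in this study:

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the current residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda v, t=t, lm=lm, rm=rm: lm if v <= t else rm

def boost(xs, ys, stages=20, lr=0.3):
    """Stage-wise boosting for squared loss: each stage fits the negative
    gradient of the loss, which for squared error is simply the residual."""
    base = sum(ys) / len(ys)
    pred = [base] * len(xs)
    learners = []
    for _ in range(stages):
        residuals = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, residuals)
        learners.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda v: base + lr * sum(s(v) for s in learners)

# Hypothetical 1-D data with a jump at x > 3:
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.0, 3.2, 2.9]
model = boost(xs, ys)
print(round(model(2), 2), round(model(5), 2))
```

After a few stages the ensemble's predictions approach the two group means, illustrating how successive weak learners correct the remaining error.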

2.1.5. Support vector machine


SVM is an ML algorithm that aims to identify a decision boundary with the largest possible margin that can still separate different classes. Suppose there are m samples in the dataset, i.e. the training set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, $y_i \in \{-1, +1\}$, where $y_i$ represents the class label. The learning objective of SVM is to find a hyperplane with maximum margin in the p-dimensional space, where p is the number of influencing variables, to represent the decision boundary. The function of the hyperplane is shown in Equation (3):

$$w^T x + b = 0 \qquad (3)$$

where $w$ and $b$ are the SVM parameters. The decision boundary margin $\gamma$ is given by Equation (4):

$$\gamma = \frac{2}{\lVert w \rVert} \qquad (4)$$

Maximising the decision boundary margin means finding the largest value of $\gamma$, i.e.:

$$\max_{w,\, b}\ \frac{2}{\lVert w \rVert} \qquad (5)$$

$$\text{s.t.} \quad y_i \left( w^T x_i + b \right) \geq 1, \quad i = 1, 2, \ldots, m \qquad (6)$$

When the classification problem is not linearly separable, a kernel function, which computes the similarity between two inputs, can be used in SVM. The inner products of pairwise samples can be represented by the kernel function, so that explicit computation in the high-dimensional feature space can be avoided. A number of kernel functions, e.g. the linear function, polynomial kernel function, Gaussian radial basis function, and sigmoid function, are used in SVM under different situations. SVM has
been used by Kartal et al. (2016) for multi-attribute inventory classification. Moreover, it has been
proved to be an efficient technique in hangingwall stability prediction (Qi et al., 2017).
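Equations (3)-(6) can be checked numerically for a given hyperplane. The sketch below, with hypothetical 2-D data and labels in {−1, +1}, verifies the hard-margin constraints and computes the margin of Equation (4):

```python
import math

def margin(w):
    """Decision boundary margin per Equation (4): gamma = 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def satisfies_constraints(w, b, samples):
    """Check the hard-margin constraints of Equation (6):
    y_i * (w^T x_i + b) >= 1 for every training sample."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1
        for x, y in samples
    )

# Hypothetical separable 2-D data with labels in {-1, +1}:
samples = [([2.0, 2.0], 1), ([3.0, 1.0], 1), ([0.0, 0.0], -1), ([-1.0, 1.0], -1)]
w, b = [1.0, 1.0], -2.0
print(satisfies_constraints(w, b, samples), round(margin(w), 3))  # prints True 1.414
```

An SVM solver would search over all feasible (w, b) to find the pair that maximises this margin.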

2.1.6. Multilayer perceptron neural network

Artificial neural networks (ANNs) are a mathematical technique that uses an analogy to biological neurons to generate a general solution to a problem. All neural functions, as well as memory, are believed to be stored in the neurons and the connections between them. The training of ANNs is considered as the establishment of new connections between neurons or the adjustment of existing connections (Fig. 2).

MLPNN is a typical ANN architecture that has one or more hidden layers between the input and output layers. The simplest MLPNN consists of an input layer, an output layer, and a hidden layer connecting them. Each layer consists of neurons, which are connected to neurons in adjacent layers by weights that pass signals onward. When the weighted sum of signals received by a neuron exceeds its threshold, the activation function is triggered and the output is treated as an input to the next layer. The application of MLPNN to industrial engineering problems can be found in many research areas, e.g. material strength prediction (Qi et al., 2018) and multi-attribute inventory classification (Kartal et al., 2016).
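A forward pass through the simplest MLPNN can be sketched as follows. The weights, biases, and layer sizes below are hypothetical, and a sigmoid is used as the activation function:

```python
import math

def sigmoid(z):
    """Smooth activation mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

# Hypothetical 2-3-1 network (2 inputs, 3 hidden neurons, 1 output):
x = [0.6, 0.4]
hidden = layer(x, [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, -0.1])
output = layer(hidden, [[0.7, -0.5, 0.9]], [0.05])
print(round(output[0], 3))
```

Training would adjust the weights and biases (e.g. by backpropagation) so that this output matches the observed slope status.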

2.2. Firefly algorithm

This study examines the feasibility of six integrated AI approaches for the prediction of slope stability. Despite their usefulness, all six ML algorithms utilised in these approaches include hyper-parameters that have to be tuned. Better results can be achieved by experimenting with the hyper-parameters, but doing so manually is a tiresome task, and the available computing power may limit the scope of the parameter search. To address this, metaheuristic optimisation algorithms (MOAs) can be utilised to explore the optimum hyper-parameters of the six ML algorithms.

MOAs are usually nature-inspired with multiple interacting agents, which have attracted much
attention in the past decade. These algorithms are often proposed by mimicking the swarm
intelligence characteristics of biological agents such as ants and birds. Dozens of new MOAs have
appeared in the literature and they have been proven to have great potential in solving optimisation
problems (Kaboli, Fallahpour, et al., 2017; Kaboli et al., 2016; Kaboli, Selvaraj, et al., 2017; Mostafa
Modiri-Delshad et al., 2016; M. Modiri-Delshad, Kaboli, et al., 2013; M. Modiri-Delshad, Koohi-
Kamali, et al., 2013; Rafieerad et al., 2017; Sebtahmadi et al., 2017). For example, Kaboli et al. (2016)
used artificial cooperative search algorithm for the estimation of long-term electric energy
consumption. Mostafa Modiri-Delshad et al. (2016) used backtracking search algorithm to solve
economic dispatch problems. Among these new MOAs, FA has been demonstrated to be efficient in
solving multimodal, global optimisation problems (Yang & He, 2013).

FA is a swarm-based technique inspired by the flashing patterns and behaviour of fireflies. It was proposed by Yang (2008) for the search of global optimum solutions. The basic rules utilised in FA are as follows: (1) fireflies are unisex; (2) the attractiveness between fireflies is proportional to their brightness and decreases with their mutual distance; (3) the brightness is calculated based on the objective function.

FA was selected in this paper because, in addition to the advantages shared by other MOAs, it has two major advantages over them: the automatic subdivision of the population and the ability to deal with multimodality (Yang & He, 2013). Compared with trial-and-error, FA can save enormous amounts of time in generating optimised solutions, such as tuned hyper-parameters, and it has been widely used in industrial engineering (Kuo & Li, 2016; Madani-Isfahani et al., 2014; Qi et al., 2017). Previous studies have shown that FA is more efficient in global optimisation problems than other metaheuristic algorithms (Amiri et al., 2013; Banati & Bajaj, 2011; Zaman & Matin, 2012). A detailed introduction to FA, including its advantages, disadvantages, and how it overcomes computational drawbacks, is presented in the relevant references (Yang & He, 2013; Yang, 2008).

The procedure for the implementation of FA is illustrated in Fig. 3, which starts from defining the
objective function and creating an initial population of fireflies. This initial population, whose size is
m, is created from a random selection of the parameters in the parameter space. It has been proved
that the influence of randomness on the robustness of FA can be well controlled during iterations
(Yang & He, 2013). The objective function is utilised to assign a light intensity, or brightness, to each
firefly in the initial generation. The next generation is created through the movement of fireflies in the previous generation, based on the light intensity and the position update formula. This process continues until the maximum number of generations is reached.
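The loop in Fig. 3 can be sketched in a few lines of pure Python. This is a generic, illustrative implementation: the parameter values and the toy objective below are hypothetical, whereas in this study the brightness is the cross-validated AUC of a hyper-parameter combination:

```python
import math
import random

def firefly_optimize(objective, bounds, n=15, generations=30,
                     gamma=1.0, alpha=0.2, beta0=1.0, seed=0):
    """Minimal FA sketch: each dimmer firefly moves toward every brighter
    one, with attractiveness decaying with distance, plus a random step."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    for _ in range(generations):
        light = [objective(p) for p in pop]   # brightness = objective value
        for i in range(n):
            for j in range(n):
                if light[j] > light[i]:       # move firefly i toward brighter j
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pop[i] = [
                        min(max(xi + beta * (xj - xi)
                                + alpha * (rng.random() - 0.5), lo), hi)
                        for xi, xj, (lo, hi) in zip(pop[i], pop[j], bounds)
                    ]
            light[i] = objective(pop[i])
    best = max(pop, key=objective)
    return best, objective(best)

# Hypothetical stand-in objective with a single optimum at (0.3, 0.7):
best, value = firefly_optimize(
    lambda p: -(p[0] - 0.3) ** 2 - (p[1] - 0.7) ** 2,
    bounds=[(0.0, 1.0), (0.0, 1.0)])
print([round(v, 2) for v in best])
```

In practice the random step size alpha is often reduced over generations to damp the randomness as the swarm converges.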

3. Slope stability

The performance of the six integrated AI approaches was verified and compared on a multinational dataset collected from the literature. This dataset contains a considerable number of field case histories, from which the AI approaches can learn the relationship between slope stability and its influencing variables. The whole dataset consists of 168 slope cases collected from five published research works (Li & Wang, 2010; Sah et al., 1994; Xu et al., 1999; Yan & Li, 2011; Zhou & Chen, 2009), covering a wide range of variable and spatial values. As the successful application of the proposed AI approaches to slope stability prediction relies on similar slope failure mechanisms and geological conditions (Sakellariou & Ferentinou, 2005), the failure mechanisms of the slope cases in the dataset were examined. Twenty slope cases were found to have a cohesion value of zero; their critical failure mode is planar, in contrast to the circular failure mode of the slope cases with non-zero cohesion (Sakellariou & Ferentinou, 2005). These cases were therefore excluded, leaving a total of 148 slope cases for the training and testing of the classification models (Appendix). Based on the slope condition, such as the movement of the slope base, the status of each slope was classified into one of two classes, i.e. stable and unstable. The dataset contains 78 stable and 70 unstable slope cases.

For the purpose of slope stability prediction, six slope attributes were used: slope height (m), slope angle (°), pore water ratio, unit weight (kN/m³), cohesion (kPa), and internal friction angle (°). These attributes were considered to be the influencing variables that govern the stability of slopes and were selected based on findings from previous research (Rukhaiyar et al., 2017; Sakellariou & Ferentinou, 2005). Each of these influencing variables is introduced as follows:

Slope height is defined as the vertical distance from the slope base to the slope crest.

Slope angle is the angle between the inclined slope plane and the slope base plane.

Pore water ratio is defined as the ratio of the pore water pressure to the overburden pressure.

Unit weight is, by definition, the weight per unit volume of the soil/rock.

Cohesion determines the part of the shear strength that is independent of the normal effective stress during soil/rock movements.

Internal friction angle is a soil/rock parameter that measures its ability to withstand a shear stress.

Fig. 4 illustrates the distribution of the six influencing variables used in the construction of the prediction models (the diagonal). The interactions between input variables are also shown, with pairwise relationships in the upper triangle and correlation coefficients in the lower triangle. As shown, most variables have relatively poor correlations (R < 0.5) with one another (Koo & Li, 2016). Slope height was moderately correlated with unit weight, while there was almost no correlation between unit weight and pore water ratio. It can also be seen that the dataset was quite widely distributed and that the distribution of most influencing variables was not symmetric. Hence, scaling all input variables into the [0, 1] range based on their minimum and maximum values will greatly improve the computation speed of the classification models.
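The min-max scaling mentioned above is straightforward; a pure-Python sketch with hypothetical slope heights (it assumes the variable is not constant, so the max exceeds the min):

```python
def minmax_scale(column):
    """Scale one influencing variable into [0, 1] by its min and max."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical slope heights (m):
heights = [8.0, 12.0, 30.0, 50.0]
print([round(v, 3) for v in minmax_scale(heights)])  # prints [0.0, 0.095, 0.524, 1.0]
```

The scaling parameters (min and max) should be computed on the training set and reused on the testing set to avoid information leakage.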

4. Methodology

4.1. Dataset partition

In supervised classification problems, the performance of a classification model must be verified on a new dataset to test its generalisation capability. Thus, the slope stability dataset was randomly split into two parts: the training set and the testing set. The training set is used to train the classification models and search for the optimum hyper-parameters, while the testing set, as mentioned above, is used separately to test the generalisation capability of the classification models.

The selection of training data is important for the training of AI approaches, and the training set must be representative of the whole dataset. On the one hand, the relationship cannot be properly learned from a training set that is too small; on the other hand, the generalisation capability cannot be verified when the training set is too large, and over-fitting may also occur. In AI practice, the proportions of the training and testing sets are often determined by an optimisation analysis (Qi et al., 2017). In this paper, approximately 70% of the whole dataset (103 cases) was included in the training set and the remaining 30% (45 cases) was included in the testing set after the optimisation analysis.
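The random 70/30 partition can be sketched as follows (the seed is arbitrary; with 148 cases a 0.7 fraction reproduces the 103/45 split used here):

```python
import random

def split_dataset(cases, train_frac=0.7, seed=1):
    """Randomly partition cases into training and testing sets."""
    rng = random.Random(seed)
    shuffled = cases[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

cases = list(range(148))  # stand-ins for the 148 slope cases
train, test = split_dataset(cases)
print(len(train), len(test))  # prints 103 45
```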

4.2. Performance measures

In this work, the predictive performance of a classification model was evaluated using
confusion matrices, the receiver operating characteristic (ROC) curve and the area under the ROC
curve (AUC) value. A confusion matrix, also known as an error matrix, is a specific table layout that
allows visualization of a model’s performance. In the confusion matrix, each column represents the
instances in a predicted class while each row represents the instances in an actual class. The confusion
matrix utilised in the slope stability prediction is shown in Table 1.

True positive (TP) is the number of stable slopes correctly predicted as stable, and true negative (TN) is the number of unstable slopes correctly predicted as unstable. By contrast, false positive (FP) is the number of unstable slopes incorrectly predicted as stable, and false negative (FN) is the number of stable slopes incorrectly predicted as unstable. Using the above parameters, three performance indicators can be defined: the accuracy, the true positive rate (TPR), and the true negative rate (TNR).

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (7)$$

$$TPR = \frac{TP}{TP + FN} \qquad (8)$$

$$TNR = \frac{TN}{TN + FP} \qquad (9)$$
The ROC curve is plotted using the TPR as the vertical axis and the (1-TNR) as the horizontal axis,
which can be used to determine the true predictive power of classification models. As all ML
algorithms used in this paper can generate a continuous-valued prediction that indicates the class
membership probability (between 0 and 1), the selection of the cutting-point probability (the cutoff) is
of particular importance for the predictive performance of classification models. Many methods have
been developed for the selection of cutoff, such as the baseline cutoff (cutoff being set to 50%), the
top left cutoff (cutoff with the minimal distance to the upper left corner in the ROC plot), and the
Youden's cutoff (the cutoff with the maximum Youden's J index) (Youden, 1950). In this paper, the baseline and Youden's cutoffs were used, and Youden's J index is given as:

$$J = TPR + TNR - 1 \qquad (10)$$

The AUC value can be treated as a single-value evaluation of the predictive performance of classification models, and the most robust model is often considered to be the one with the largest AUC value (Kuhn & Johnson, 2013). A perfect classification model (one that correctly predicts all stable and unstable slopes) achieves an AUC value of 1, while an inefficient model (whose predictions are similar to random guessing) has an AUC value of around 0.5. As suggested by Hosmer & Lemeshow (2004), the performance of a classification model that achieves an AUC value above 0.9 is outstanding, between 0.8 and 0.9 is excellent, and between 0.7 and 0.8 is acceptable.
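Equations (7)-(10) and the cutoff selection can be sketched in pure Python. The predicted probabilities and labels below are hypothetical, chosen only to show the mechanics:

```python
def rates(tp, fp, tn, fn):
    """Accuracy, TPR and TNR per Equations (7)-(9)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return accuracy, tpr, tnr

def youden_cutoff(probs, labels, cutoffs):
    """Pick the cutoff maximising J = TPR + TNR - 1 (Equation (10))."""
    def j(cut):
        tp = sum(p >= cut and y == 1 for p, y in zip(probs, labels))
        fn = sum(p < cut and y == 1 for p, y in zip(probs, labels))
        tn = sum(p < cut and y == 0 for p, y in zip(probs, labels))
        fp = sum(p >= cut and y == 0 for p, y in zip(probs, labels))
        _, tpr, tnr = rates(tp, fp, tn, fn)
        return tpr + tnr - 1
    return max(cutoffs, key=j)

# Hypothetical predicted probabilities of "stable" with true labels:
probs = [0.9, 0.8, 0.65, 0.6, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 1, 0, 0]
print(youden_cutoff(probs, labels, cutoffs=[0.25, 0.5, 0.75]))  # prints 0.75
```

Sweeping the cutoff over all candidate values while recording (1 − TNR, TPR) pairs traces out the ROC curve itself.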

4.3. K-fold cross validation

There are several validation methods for classification models, including the simple substitution method, the holdout method, the bootstrap method, and the bolstered method (Braga-Neto et al., 2004; Efron & Tibshirani, 1993; Gascuel & Caraux, 1992; Rayens, 1993). One of these methods, and probably the most popular one, is k-fold cross validation (CV) (Stone, 1974). When k-fold CV is used during hyper-parameter tuning, the original training set is divided into k folds. A classification model is trained using k−1 folds and validated using the remaining fold. This process is repeated k times with a different fold used for validation each time, and the k-fold CV performance is the average of the performances achieved on each fold.

As mentioned above, the performance of k-fold CV depends on the partition of the original training set. Computation time and variance need to be taken into consideration when determining the value of k. In this paper, k was set to ten, as recommended by Kohavi (2001). Therefore, 10-fold CV was applied for each possible set of hyper-parameters during the construction of the classification models.
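The fold partition used in k-fold CV can be sketched as follows (an interleaved assignment of indices; for the 103 training cases and k = 10, the fold sizes are 11 or 10):

```python
def kfold_indices(n, k=10):
    """Split sample indices 0..n-1 into k roughly equal folds. Fold i is
    used for validation while the other k-1 folds train the model."""
    return [list(range(i, n, k)) for i in range(k)]

folds = kfold_indices(103, k=10)  # 103 training cases, as in this study
print([len(f) for f in folds])    # prints [11, 11, 11, 10, 10, 10, 10, 10, 10, 10]
```

This assignment is deterministic for clarity; practical implementations usually shuffle the indices before partitioning.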

4.4. Hyper-parameters tuning

As discussed before, the hyper-parameters of the ML algorithms were tuned using 10-fold CV and FA on the training set. The light intensity of each firefly was taken to be the average AUC value, which means that any combination of hyper-parameters achieving a larger average AUC value was represented by a brighter firefly. The population size m was set to 100. The light absorption coefficient, step size parameter, and maximum number of generations were set to 0.001, 0.15, and 20, respectively (Mo et al., 2013). The hyper-parameter tuning was performed for all six ML algorithms so that the optimum classification models (OCMs) could be trained before the approach comparison started. Details of the hyper-parameters of the six ML algorithms tuned with FA are shown in Table 2.

The overall procedure for the implementation of six integrated AI techniques, including data
preparation, performance measures, 10-fold CV, and hyper-parameters tuning, is demonstrated in Fig.
5.
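The tuning loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: scikit-learn is the assumed toolchain, only SVM's two hyper-parameters (C_penalty and tol, with the scopes from Table 2) are searched, and the population and generation counts are much smaller than the paper's for speed. The FA settings gamma = 0.001 and alpha = 0.15 follow the paper.

```python
# Firefly-algorithm sketch for hyper-parameter tuning: each firefly encodes a
# candidate (C, tol) pair, its light intensity is the mean CV AUC, and dimmer
# fireflies move toward brighter ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=100, n_features=6, random_state=0)

lo, hi = np.array([0.1, 1e-4]), np.array([10.0, 1e-2])  # scopes from Table 2

def intensity(pos):
    """Light intensity of a firefly = mean 5-fold CV AUC of its SVM."""
    C, tol = pos
    model = SVC(C=C, tol=tol, kernel="rbf")
    return cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()

n_fireflies, generations = 6, 3   # reduced from the paper's 100 and 20
gamma, alpha = 0.001, 0.15        # FA settings as in the paper

pop = rng.uniform(lo, hi, size=(n_fireflies, 2))
light = np.array([intensity(p) for p in pop])

for _ in range(generations):
    for i in range(n_fireflies):
        for j in range(n_fireflies):
            if light[j] > light[i]:   # firefly i moves toward brighter j
                beta = np.exp(-gamma * np.sum((pop[i] - pop[j]) ** 2))
                step = alpha * (rng.random(2) - 0.5) * (hi - lo)
                pop[i] = np.clip(pop[i] + beta * (pop[j] - pop[i]) + step,
                                 lo, hi)
                light[i] = intensity(pop[i])

best_C, best_tol = pop[np.argmax(light)]  # optimum hyper-parameters found
```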

5. Results and discussion

5.1. Results of hyper-parameters tuning

As mentioned above, FA was utilised in this paper for the hyper-parameters tuning of the six ML
algorithms. To analyse the effectiveness of FA in hyper-parameters tuning, the average and
optimum AUC values of each generation were traced during the evolution. Fig. 6 illustrates the
evolution of the average AUC value within the first six generations and the optimum AUC values
after hyper-parameters tuning. Note that no further increase in the optimum AUC value was
observed after the sixth generation in this paper.

It can be seen that different ML algorithms had different evolution patterns, both in the optimum
AUC values and in the convergence rate. The average AUC values of LR and SVM became stable from the
initial generation, denoting that the influence of the hyper-parameters on the performance of LR and
SVM was not distinct on the slope dataset. The convergence of DT and MLPNN was slightly slower,
requiring one generation, and it took about five generations for RF and GBM to converge.
It should be mentioned that the average AUC value was considered to have converged when no distinct
changes were observed afterwards. It can be concluded that the hyper-parameters had a significant
influence on the performance of MLPNN, as the average AUC value increased significantly during the
iterations (Fig. 6). By contrast, the influence of the hyper-parameters on the performance of the
other ML algorithms was relatively small on the slope dataset.

As for the optimum average AUC values on the training set, the highest value was achieved by
GBM (0.957), implying that it had outstanding performance on the training set. It is also
interesting to note that, even at the initial generation, the performance of GBM on the training set
was better than that of the other ML algorithms after several generations' evolution. The optimum
AUC values of RF, SVM, and DT were 0.949, 0.916, and 0.906, respectively, so these could also be
recognised as outstanding classification models (Jr & Lemeshow, 2004). Though the optimum AUC values
of MLPNN and LR were relatively low (0.871 for MLPNN and 0.853 for LR), their performance was still
excellent on the training set.

Once FA reached the maximum generation, the optimum AUC values were found and their
corresponding hyper-parameters were considered to be the optimum hyper-parameters. Classification
models with the optimum hyper-parameters were then trained using the whole training set and their
predictive performance was evaluated on the testing set. The optimum hyper-parameters of the six ML
algorithms are detailed below.

 LR: C_inverse = 261, tol = 1.26e-4.

 DT: criterion = Gini, max_depth = 13, min_samples_split = 4, min_samples_leaf = 7.

 RF: criterion = Entropy, n_estimators = 266, max_depth = 23, min_samples_split = 2,
min_samples_leaf = 1.

 GBM: loss = Exponential, n_estimators = 704, max_depth = 21, min_samples_split = 5,
min_samples_leaf = 6.

 SVM: C_penalty = 4, kernel = RBF, tol = 4.8e-3.

 MLPNN: hidden_layer_num = 3, hidden_layer_size = (8, 7, 8), i.e. 8 neurons in the first hidden
layer, 7 in the second, and 8 in the third.
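For readers reproducing these settings, the listed optima map onto scikit-learn estimators roughly as follows. The mapping is an assumption (e.g. C_inverse corresponds to LogisticRegression's C and C_penalty to SVC's C); any parameter not listed above keeps its library default.

```python
# The optimum hyper-parameters above, expressed as scikit-learn estimators
# (an assumed mapping, not the authors' exact implementation).
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

models = {
    "LR":    LogisticRegression(C=261, tol=1.26e-4),
    "DT":    DecisionTreeClassifier(criterion="gini", max_depth=13,
                                    min_samples_split=4, min_samples_leaf=7),
    "RF":    RandomForestClassifier(criterion="entropy", n_estimators=266,
                                    max_depth=23, min_samples_split=2,
                                    min_samples_leaf=1),
    "GBM":   GradientBoostingClassifier(loss="exponential", n_estimators=704,
                                        max_depth=21, min_samples_split=5,
                                        min_samples_leaf=6),
    "SVM":   SVC(C=4, kernel="rbf", tol=4.8e-3),
    "MLPNN": MLPClassifier(hidden_layer_sizes=(8, 7, 8)),  # 3 hidden layers
}
```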

5.2. Comparison of integrated AI approaches

As discussed before, six integrated AI approaches have been used for the prediction of slope stability
and their predictive performance on the testing set is discussed in this section. Confusion matrices, the
ROC curve, and the AUC value were utilised for the performance evaluation and comparison.

Table 3 presents the confusion matrices, as well as the calculated accuracy, of the six OCMs on the
testing set when the baseline and Youden's cutoffs were used. As can be seen, the OCMs' performance
with the Youden's cutoff was generally better than with the baseline cutoff. The average accuracy of
all OCMs increased by 4.8% from the baseline cutoff to the Youden's cutoff. The largest accuracy
increase was achieved by the optimum SVM model, whose accuracy improved from 73% to 96% with the
Youden's cutoff. A 2% accuracy increase occurred in the optimum LR, DT, and MLPNN models with the
Youden's cutoff, while there was no accuracy change in the optimum RF and GBM models. The highest
accuracy was achieved by the optimum SVM model with the Youden's cutoff, in which case 43 out of 45
slope cases were correctly predicted. The accuracy of the optimum SVM model with the Youden's
cutoff, as well as that of the optimum RF and GBM models, was larger than 90%, indicating they had
an advantage over the other OCMs in terms of accuracy.

The TPR and TNR calculated from Table 3 are illustrated in Fig. 7. As shown, the best classification
models regarding the TPR (correctly predicted stable slopes) were the optimum RF and GBM models; in
both cases all 20 stable slopes were correctly predicted. By contrast, the best TNR was achieved by
the optimum SVM classification model with the Youden's cutoff, with which all unstable slope cases
were correctly predicted. The baseline and Youden's cutoffs had no influence on the performance of
the optimum RF and GBM models in this paper. As no OCM achieved both the highest TPR and the highest
TNR, model selection should be based on the AUC values, the accuracy, and engineering requirements.
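The TPR and TNR can be read off a confusion matrix as follows (scikit-learn assumed; the counts mirror the optimum SVM model with the Youden's cutoff reported above, with 18/2 for actual-stable slopes and 0/25 for actual-unstable slopes):

```python
# Accuracy, TPR, and TNR from a confusion matrix, as in Table 3 and Fig. 7.
from sklearn.metrics import confusion_matrix

y_true = [1] * 20 + [0] * 25                 # 20 stable, 25 unstable slopes
y_pred = [1] * 18 + [0] * 2 + [0] * 25       # 2 stable slopes mislabelled

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 43/45, about 0.96
tpr = tp / (tp + fn)                         # true positive rate: 18/20
tnr = tn / (tn + fp)                         # true negative rate: 25/25
```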

Fig. 8 shows the ROC curves of the six OCMs from the integrated AI approaches on the testing set,
together with their AUC values. As can be seen, no classification model dominated the others
throughout the whole diagram. However, the optimum SVM, GBM, and RF models were generally closer to
the left and top axes than the other OCMs, denoting that they achieved higher overall performance.
This is also verified by the AUC values of these three OCMs (0.967, 0.962, and 0.957, respectively).
The performance of MLPNN on the testing set was excellent in terms of the AUC value (0.864).
Compared with RF and GBM, the performance of DT was relatively poor, showing that tree-based
ensemble algorithms (RF and GBM) can achieve higher predictive performance than DT in classification
problems. Moreover, the optimum LR classification model achieved the smallest AUC value, which is
likely due to LR's inability to model nonlinear relationships.

Overall, the optimum SVM model with the Youden's cutoff yielded better results than the other
OCMs, achieving a 0.967 AUC value, 0.96 accuracy, and the highest TNR. The optimum RF and GBM
models, which achieved the highest TPR, were not recommended owing to engineering requirements: as a
mistaken estimation of an unstable slope will produce severe loss of life and property, the TNR is
considered more important than the TPR in slope stability analysis. Moreover, SVM converged very
quickly (within one generation) when FA was used, indicating great computational efficiency. Owing
to its ability to model non-linear relationships, excellent flexibility, and outstanding
generalization ability, the optimum SVM model with the Youden's cutoff was recommended in this
paper for the prediction of slope stability.

Detailed results of the performance of the optimum SVM model on the testing set with the
baseline and Youden's cutoffs are shown in Fig. 9. As stated before, '1' represents stable slopes
while '0' represents unstable slopes. A huge improvement in the predictive performance of the
optimum SVM model can be seen when the Youden's cutoff was utilised. Moreover, the optimum SVM model
with the baseline cutoff was more likely to predict slopes as stable, and 11 out of 12 incorrectly
predicted slope cases were slopes incorrectly predicted to be stable. In contrast, slopes were more
likely to be labelled unstable with the Youden's cutoff. Therefore, the optimum SVM model with the
Youden's cutoff was more conservative during the prediction process, which may meet the requirement
of engineering projects.
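The Youden's cutoff used above is the probability threshold that maximises Youden's J statistic (J = TPR + TNR - 1, equivalently TPR - FPR) along the ROC curve, rather than the fixed baseline of 0.5. A sketch with scikit-learn (assumed toolchain; the scores are illustrative):

```python
# Selecting Youden's cutoff from the ROC curve and comparing predictions
# against the baseline cutoff of 0.5.
import numpy as np
from sklearn.metrics import roc_curve

y_true  = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.45, 0.42, 0.6, 0.4, 0.3, 0.2])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
youden_cutoff = thresholds[np.argmax(tpr - fpr)]   # maximises J = TPR - FPR

pred_baseline = (y_score >= 0.5).astype(int)            # baseline cutoff
pred_youden   = (y_score >= youden_cutoff).astype(int)  # Youden's cutoff
```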

5.3. Relative importance of influencing variables

The evaluation of the ‘importance’ of variables is difficult owing to two issues: (1) the
correlation effect, which means that irrelevant influencing variables may become relevant in the
context of others; and (2) the determined relative importance depends strongly on the evaluation
criterion and may therefore show great variation (Guyon et al., 2006). In this paper, a sensitivity
analysis of the influencing variables was implemented in the optimum RF and GBM models to
investigate the effect of the variables on the estimation of the slope instances. These two OCMs
were selected based on their shared characteristics:

i) Both of these two OCMs had outstanding performance on the testing set;

ii) The feature importance from the solution of the optimum SVM model, which also had
outstanding performance on the testing set, cannot be directly obtained when the RBF kernel
was used (Chen & Lin, 2006; Tuia et al., 2010);

iii) The relative importance from GBM and RF can be easily obtained by the Gini importance,
which is a measure of variable importance based on the Gini impurity index (Breiman et al.,
1984).

The final feature importance was determined by averaging the results from the optimum RF and
GBM models. Normalization was performed on the feature importance scores and the result is shown
in Fig. 10.
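The averaging-and-normalising procedure above can be sketched as follows (scikit-learn assumed; the data and feature names are synthetic stand-ins for the slope dataset, not its actual values):

```python
# Gini importances from fitted RF and GBM models, averaged and normalised.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=148, n_features=6, random_state=0)
features = ["unit weight", "cohesion", "friction angle",
            "slope angle", "slope height", "pore pressure ratio"]

rf  = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
gbm = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X, y)

avg = (rf.feature_importances_ + gbm.feature_importances_) / 2.0
importance = avg / avg.sum()       # normalised importance scores (sum to 1)
ranking = sorted(zip(features, importance), key=lambda t: -t[1])
```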

Apparently, cohesion was the most sensitive variable influencing slope stability in this paper,
accounting for almost one-third of the importance score among all variables. This is a reasonable
outcome and is in accordance with that obtained by the Box-Behnken statistical design (Kostić et
al., 2015). The importance scores of the remaining influencing variables decreased in the following
order: internal friction angle (0.208) > unit weight (0.163) > slope angle (0.125) > slope height
(0.118) > pore pressure ratio (0.076). It can be seen that most influencing variables had
non-negligible importance scores, which means they were also important; all these influencing
variables constitute the basic input parameters in most engineering projects (Sakellariou &
Ferentinou, 2005). Also, it is worth mentioning that different importance scores might be obtained
when different datasets and classification models are used (Guyon et al., 2006). More representative
results can be obtained as more valid slope cases become available in the future.
5.4. Contribution and limitations

The primary strength of this study is the proposal and comparison of six integrated AI approaches
for the stability prediction of slopes. This study contributes to slope analysis, as well as to
other fields of industrial engineering, in the following aspects: a) the integrated AI approaches
based on ML algorithms and FA are very promising for classification problems; b) the stableness and
robustness of classification models can be well investigated using confusion matrices, the ROC
curve, and the AUC value; c) a model recommendation has been made for slope stability prediction,
which can provide reliable suggestions for future researchers in similar studies; d) the methodology
in this paper has great potential for wider application in the rest of industrial engineering, where
classification problems are widely encountered.
The omission of other influencing variables for slope stability, such as vegetation coverage,
geological structure, and engineering disturbance, is a clear limitation of the present study. A
further limitation is that the slope dataset is still relatively small; the performance of the
proposed AI approaches will improve when more data are available. A final limitation is that slope
stability analysis may be better treated as a multiclass classification problem or a regression
problem, which is currently under investigation.
6. Summary and conclusion

In this study, a systematic verification and comparison of six integrated AI approaches for the
prediction of slope stability has been conducted. These integrated AI approaches used six ML
algorithms, namely LR, DT, RF, GBM, SVM, and MLPNN, for the relationship modelling and FA
for the hyper-parameters tuning. 148 slope cases were collected for the preparation of the dataset
after consideration of the slope failure mechanism and geological conditions. 10-fold CV was used as
the validation method, and the performance measures were chosen to be confusion matrices, the ROC
curve, and the AUC value. The relative importance of the influencing variables was investigated and
analysed using the results from the optimum RF and GBM models. Based on the analysis results, the
following conclusions can be drawn:

1. FA could effectively assist the hyper-parameters tuning of ML algorithms as the optimum AUC
values were obtained within the first six generations for all ML algorithms.

2. The AUC values of the OCMs on the testing set ranged from 0.822 to 0.967, indicating that the
OCMs had excellent to outstanding performance on the testing set. The largest AUC value on the
testing set was achieved by the optimum SVM model (0.967), followed by the optimum GBM model (0.962)
and the optimum RF model (0.957). All three models could be considered to have outstanding
performance for slope stability prediction.

3. Comparison between the baseline and Youden’s cutoffs indicates that models with the Youden’s
cutoff generally had better performance than models with the baseline cutoff in the slope stability
prediction.

4. The optimum SVM model with the Youden’s cutoff was recommended for slope stability
prediction in terms of the AUC value, the accuracy, and the TNR.

5. Cohesion was found to be the most influential variable for the prediction of slope stability, which
achieved a 0.310 importance score out of 1.

Conflict of interest: none

References
Akgun, A. (2012). A comparison of landslide susceptibility maps produced by logistic regression,
multi-criteria decision, and likelihood ratio methods: a case study at İzmir, Turkey.
Landslides, 9(1), 93-106.
Amiri, B., Hossain, L., Crawford, J.W. & Wigand, R.T. (2013). Community Detection in Complex
Networks: Multi-objective Enhanced Firefly Algorithm. Knowledge-Based Systems,
46(1), 1-11.
Azarafza, M., Asghari-Kaljahi, E. & Akgün, H. (2017). Assessment of discontinuous rock slope
stability with block theory and numerical modeling: a case study for the South Pars Gas
Complex, Assalouyeh, Iran. Environmental Earth Sciences, 76(11), 397.
Banati, H. & Bajaj, M. (2011). Firefly based feature selection approach. International Journal of
Computer Science Issues, 8(4), 473-480.
Braga-Neto, U., Hashimoto, R., Dougherty, E.R., Nguyen, D.V. & Carroll, R.J. (2004). Is cross-
validation better than resubstitution for ranking genes? Bioinformatics, 20(2), 253-258.
Breiman, L.I., Friedman, J.H., Olshen, R.A. & Stone, C.J. (1984). Classification and Regression Trees
(CART). 40(3), 358.
Chen, Y. & Lin, C. (2006). Feature Extraction: Foundations and Applications.
Cheng, M. & Hoang, N. (2014). Groutability Estimation of Grouting Processes with Microfine
Cements Using an Evolutionary Instance-Based Learning Approach. Journal of Computing in
Civil Engineering, 28(4), 04014014.
Cheng, M. & Hoang, N. (2015). A Swarm-Optimized Fuzzy Instance-based Learning approach for
predicting slope collapses in mountain roads. Knowledge-Based Systems, 76(Supplement C),
256-263.
Das, S.K., Biswal, R.K., Sivakugan, N. & Das, B. (2011). Classification of slopes and prediction of
factor of safety using differential evolution neural networks. Environmental Earth Sciences,
64(1), 201-210.
Efron, B. & Tibshirani, R.J. (1993). An Introduction to the Bootstrap: Chapman & Hall.
Fang, K., Jiang, Y. & Song, M. (2016). Customer profitability forecasting using Big Data analytics: A
case study of the insurance industry. Computers & Industrial Engineering, 101(Supplement
C), 554-564.
Faramarzi, L., Zare, M., Azhari, A. & Tabaei, M. (2017). Assessment of rock slope stability at Cham-
Shir Dam Power Plant pit using the limit equilibrium method and numerical modeling.
Bulletin of Engineering Geology and the Environment, 76(2), 783-794.
Gascuel, O. & Caraux, G. (1992). Distribution-free performance bounds with the resubstitution error
estimate. Pattern Recognition Letters, 13(11), 757-764.

Ghosh, J.K., Bhattacharya, D., Boccardo, P. & Samadhiya, N.K. (2015). Automated Geo-Spatial
Hazard Warning System GEOWARNS: Italian Case Study. Journal of Computing in Civil
Engineering, 29(5), 04014065.
Gordan, B., Jahed Armaghani, D., Hajihassani, M. & Monjezi, M. (2016). Prediction of seismic slope
stability through combination of particle swarm optimization and neural network. Engineering
with Computers, 32(1), 85-97.
Guh, R. & Shiue, Y. (2008). An effective application of decision tree learning for on-line detection of
mean shifts in multivariate control charts. Computers & Industrial Engineering, 55(2), 475-
493.
Guyon, I.M., Gunn, S.R., Nikravesh, M. & Zadeh, L. (2006). Feature Extraction, Foundations and
Applications. Springer Verlag. Studies in Fuzziness & Soft Computing, 205(12), 68-84.
Hoang, N. & Bui, D. (2017). Slope Stability Evaluation Using Radial Basis Function Neural Network,
Least Squares Support Vector Machines, and Extreme Learning Machine. In: Handbook of
Neural Computation (pp. 333-344): Elsevier.
Hoang, N. & Pham, A. (2016). Hybrid artificial intelligence approach based on metaheuristic and
machine learning for slope stability assessment: A multinational data analysis. Expert
Systems with Applications, 46, 60-68.
Jr, D.H.W. & Lemeshow, S. (2004). Applied Logistic Regression. Journal of the American Statistical
Association, 85(411).
Kaboli, S.H.A., Fallahpour, A., Selvaraj, J. & Rahim, N. (2017). Long-term electrical energy
consumption formulating and forecasting via optimized gene expression programming.
Energy, 126, 144-164.
Kaboli, S.H.A., Selvaraj, J. & Rahim, N. (2016). Long-term electric energy consumption forecasting
via artificial cooperative search algorithm. Energy, 115, 857-871.
Kaboli, S.H.A., Selvaraj, J. & Rahim, N. (2017). Rain-fall optimization algorithm: A population based
algorithm for solving constrained optimization problems. Journal of Computational Science,
19, 31-42.
Kartal, H., Oztekin, A., Gunasekaran, A. & Cebi, F. (2016). An integrated decision analytic
framework of machine learning with multi-criteria decision making for multi-attribute
inventory classification. Computers & Industrial Engineering, 101(Supplement C), 599-613.
Kohavi, R. (2001). A study of cross-validation and bootstrap for accuracy estimation and model
selection. In: International Joint Conference on Artificial Intelligence (pp. 1137-1143).
Kokutse, N.K., Temgoua, A.G.T. & Kavazović, Z. (2016). Slope stability and vegetation: Conceptual
and numerical investigation of mechanical effects. Ecological Engineering, 86(Supplement C),
146-153.
Koo, T.K. & Li, M.Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients
for reliability research. Journal of chiropractic medicine, 15(2), 155-163.
Kostić, S., Vasović, N. & Sunarić, D. (2015). A new approach to grid search method in slope stability
analysis using Box–Behnken statistical design. Applied Mathematics & Computation, 256(C),
425-437.
Kuhn, M. & Johnson, K. (2013). Applied Predictive Modeling.
Kuo, R.J. & Li, P. (2016). Taiwanese export trade forecasting using firefly algorithm based K-means
algorithm and SVR with wavelet transform. Computers & Industrial Engineering, 99, 153-
161.
Li, D., Xiao, T., Cao, Z., Phoon, K. & Zhou, C. (2016). Efficient and consistent reliability analysis of
soil slope stability using both limit equilibrium analysis and finite element analysis. Applied
Mathematical Modelling, 40(9), 5216-5229.
Li, J. & Wang, F. (2010). Study on the Forecasting Models of Slope Stability under Data Mining. In:
Biennial International Conference on Engineering, Construction, and Operations in
Challenging Environments; and Fourth Nasa/aro/asce Workshop on Granular Materials in
Lunar and Martian Exploration (pp. 765-776).
Madani-Isfahani, M., Tavakkoli-Moghaddam, R. & Naderi, B. (2014). Multiple cross-docks
scheduling using two meta-heuristic algorithms. Computers & Industrial Engineering, 74,
129-138.
Mo, Y., Ma, Y. & Zheng, Q. (2013). Optimal Choice of Parameters for Firefly Algorithm. In: Fourth
International Conference on Digital Manufacturing and Automation (pp. 887-892).
Modiri-Delshad, M., Kaboli, S.H.A., Taslimi-Renani, E. & Rahim, N.A. (2016). Backtracking search
algorithm for solving economic dispatch problems with valve-point effects and multiple fuel
options. Energy, 116, 637-649.
Modiri-Delshad, M., Kaboli, S.H.A., Taslimi, E., Selvaraj, J. & Rahim, N.A. (2013). An iterated-
based optimization method for economic dispatch in power system. In: 2013 IEEE
Conference on Clean Energy and Technology (CEAT) (pp. 88-92).
Modiri-Delshad, M., Koohi-Kamali, S., Taslimi, E., Kaboli, S.H.A. & Rahim, N.A. (2013). Economic
dispatch in a microgrid through an iterated-based algorithm. In: 2013 IEEE Conference on
Clean Energy and Technology (CEAT) (pp. 82-87).
Pierola, A., Epifanio, I. & Alemany, S. (2016). An ensemble of ordered logistic regression and
random forest for child garment size matching. Computers & Industrial Engineering,
101(Supplement C), 455-465.
Qi, C., Fourie, A. & Chen, Q. (2018). Neural network and particle swarm optimization for predicting
the unconfined compressive strength of cemented paste backfill. Construction and Building
Materials, 159(Supplement C), 473-478.
Qi, C., Fourie, A., Ma, G., Tang, X. & Du, X. (2017). Comparative Study of Hybrid Artificial
Intelligence Approaches for Predicting Hangingwall Stability. Journal of Computing in Civil
Engineering, 32(2), 04017086.
Rafieerad, A., Bushroa, A., Nasiri-Tabrizi, B., Kaboli, S., Khanahmadi, S., Amiri, A., Vadivelu, J.,
Yusof, F., Basirun, W. & Wasa, K. (2017). Toward improved mechanical, tribological,
corrosion and in-vitro bioactivity properties of mixed oxide nanotubes on Ti–6Al–7Nb
implant using multi-objective PSO. Journal of the mechanical behavior of biomedical
materials, 69, 1-18.
Rayens, W.S. (1993). Discriminant Analysis and Statistical Pattern Recognition. Journal of the Royal
Statistical Society, 35(3), 324-326.
Rukhaiyar, S., Alam, M. & Samadhiya, N. (2017). A PSO-ANN hybrid model for predicting factor of
safety of slope. International Journal of Geotechnical Engineering, 1-11.
Sah, N.K., Sheorey, P.R. & Upadhyaya, L.N. (1994). Maximum likelihood estimation of slope
stability. International Journal of Rock Mechanics & Mining Science & Geomechanics
Abstracts, 31(1), 47-53.
Sakellariou, M.G. & Ferentinou, M.D. (2005). A study of slope stability prediction using neural
networks. Geotechnical and Geological Engineering, 23(4), 419-445.
Samimi, Y. & Aghaie, A. (2011). Using logistic regression formulation to monitor heterogeneous
usage rate for subscription-based services. Computers & Industrial Engineering, 60(1), 89-98.
Sebtahmadi, S.S., Azad, H.B., Kaboli, S.H.A., Islam, M.D. & Mekhilef, S. (2017). A PSO-DQ
Current Control Scheme for Performance Enhancement of Z-source Matrix Converter to
Drive IM Fed by Abnormal Voltage. IEEE Transactions on Power Electronics.
Song, Y., Gong, J., Gao, S., Wang, D., Cui, T., Li, Y. & Wei, B. (2012). Susceptibility assessment of
earthquake-induced landslides using Bayesian network: A case study in Beichuan, China.
Computers & Geosciences, 42(Supplement C), 189-199.
Stone, M. (1974). Cross-Validatory Choice and Assessment of Statistical Predictions. Journal of the
Royal Statistical Society, 36(2), 111-147.
Tsiampousi, A., Zdravkovic, L. & Potts, D.M. (2016). Numerical study of the effect of soil–
atmosphere interaction on the stability and serviceability of cut slopes in London clay.
Canadian Geotechnical Journal, 54(3), 405-418.
Tuia, D., Camps-Valls, G., Matasci, G. & Kanevski, M. (2010). Learning Relevant Image Features
With Multiple-Kernel Classification. IEEE Transactions on Geoscience & Remote Sensing,
48(10), 3780-3791.
Wang, J., Lin, Y. & Hou, S. (2015). A data mining approach for training evaluation in simulation-
based training. Computers & Industrial Engineering, 80(Supplement C), 171-180.
Wang, L., Chen, Z., Wang, N., Sun, P., Yu, S., Li, S. & Du, X. (2016). Modeling lateral enlargement
in dam breaches using slope stability analysis based on circular slip mode. Engineering
Geology, 209(Supplement C), 70-81.

Wu, X., Kumar, V., Ross Quinlan, J., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu,
B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J. & Steinberg, D. (2008). Top 10
algorithms in data mining. Knowledge and Information Systems, 14(1), 1-37.
Xu, W., Xie, S. & Jean-Pascal, D. (1999). Slope stability analysis and evaluation with
probabilistic artificial neural network method. Site Investigation Science & Technology.
Yan, X. & Li, X. (2011). Bayes discriminant analysis method for predicting the stability of open pit
slope. In: International Conference on Electric Technology and Civil Engineering (pp. 147-
150).
Yang, X. & He, X. (2013). Firefly algorithm: recent advances and applications. International Journal
of Swarm Intelligence, 1(1), 36-50.
Yang, X.S. (2008). Nature-Inspired Metaheuristic Algorithms: Luniver Press.
Youden, W.J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32–35.
Zaman, M.A. & Matin, A. (2012). Nonuniformly spaced linear antenna array design using firefly
algorithm. International Journal of Microwave Science and Technology, 2012.
Zhou, K.P. & Chen, Z.Q. (2009). Stability Prediction of Tailing Dam Slope Based on Neural Network
Pattern Recognition. In: Second International Conference on Environmental and Computer
Science, Icecs 2009, Dubai, Uae, 28-30 December (pp. 380-383).

Figure Captions:
Fig. 1. An example of basic rules of the DT algorithm.

Fig. 2. Illustration of ANN: (a) single neuron; (b) ANN architecture.

Fig. 3. Procedure of FA.

Fig. 4. Distribution and interaction of influencing variables in the dataset (R represents the
correlation coefficient).

Fig. 5. Overall procedure for slope prediction using integrated AI techniques.

Fig. 6. Transition of the average AUC values along with generations.

Fig. 7. Predictive performance of six OCMs: (a) true positive rate; (b) true negative rate.

Fig. 8. ROC curve and AUC values of six OCMs on the testing set.

Fig. 9. Predictive performance of the optimum SVM model: (a) baseline cutoff; (b) Youden's
cutoff.

Fig. 10. Relative importance of influencing variables for slope stability.

Table 1. Confusion matrix for slope stability prediction.

                         Predicted condition
Actual condition         Stable (1)                      Unstable (0)
Stable (1)               True Positive (TP)              False Negative (FN)
                         (true positive rate)            (false negative rate)
Unstable (0)             False Positive (FP)             True Negative (TN)
                         (false positive rate)           (true negative rate)
Table 2. Hyper-parameters tuned in six ML algorithms.

Algorithms Parameters Definition Scope


LR tol Tolerance for stopping criteria. 1e-5-1e-3.
C_inverse Inverse of regularization strength. 0.1-10.
DT criterion Quality measurement of a split. Gini, Entropy.
max_depth The maximum depth of the tree. 1-100.
min_samples_split The minimum number of samples 2-10.
required to split an internal node.
min_samples_leaf The minimum number of samples 1-10.
required to be at a leaf node.
RF criterion Quality measurement of a split. Gini, Entropy.
n_estimators The number of trees in the forest. 1-1000.
GBM loss Loss function to be optimized. Deviance, Exponential.
n_estimators The number of boosting stages. 1-1000.
SVM C_penalty Penalty parameter of the error term. 0.1-10.
kernel Kernel type in the algorithm. Linear, RBF.
tol Tolerance for stopping criterion. 1e-4-1e-2.
MLPNN hidden_layer_num The number of hidden layers. 1-3.
hidden_layer_size The number of neurons in each hidden layer. 1-20.
Note: The max_depth, min_samples_split, and min_samples_leaf tuned in DT were also tuned in RF
and GBM.

Table 3. Testing set confusion matrices of six OCMs with different cutoffs.

                          Baseline cutoff                 Youden's cutoff
Model   Actual condition  Stable  Unstable  Accuracy      Stable  Unstable  Accuracy
LR      Stable            17      3         0.80          17      3         0.82
        Unstable          6       19                      5       20
DT      Stable            16      4         0.78          15      5         0.80
        Unstable          6       19                      4       21
RF      Stable            20      0         0.93          20      0         0.93
        Unstable          3       22                      3       22
GBM     Stable            20      0         0.93          20      0         0.93
        Unstable          3       22                      3       22
SVM     Stable            19      1         0.73          18      2         0.96
        Unstable          11      14                      0       25
MLPNN   Stable            17      3         0.82          17      3         0.84
        Unstable          5       20                      4       21

Research Highlights:

 Integrated approaches based on FA and ML are used for slope stability prediction.
 Three performance measures, confusion matrices, the ROC, and the AUC, are used.
 The results show that integrated approaches have excellent predictive performance.
 Support vector machine is recommended in slope stability prediction.
 Cohesion is found to be the most influential variable of slope stability.

Acknowledgement:
The first author is supported by the China Scholarship Council (grant number: 201606420046).

