2022 Employing A Genetic Algorithm and Grey Wolf Optimizer For Optimizing RF Models To Evaluate Soil Liquefaction Potential

Artificial Intelligence Review (2022) 55:5673–5705
https://doi.org/10.1007/s10462-022-10140-5
Employing a genetic algorithm and grey wolf optimizer for optimizing RF models to evaluate soil liquefaction potential
Jian Zhou1 · Shuai Huang1 · Tao Zhou2 · Danial Jahed Armaghani3 · Yingui Qiu1
Accepted: 14 January 2022 / Published online: 19 February 2022

© The Author(s), under exclusive licence to Springer Nature B.V. 2022
Abstract
Among the research hotspots in geological/geotechnical engineering, research on the pre-
diction of soil liquefaction potential is still limited. In this research, several machine-learn-
ing methods were developed to evaluate the liquefaction potential of soil using random
forest (RF) as the base model. The parameters of the RF model were optimized using two
optimization algorithms, namely, the grey wolf optimizer (GWO) and genetic algorithm
(GA). In the experiment, three in situ databases based on the standard penetration test
(SPT), shear wave velocity test (SWVT) and cone penetration test (CPT) were considered
and used to investigate the applicability of GA-RF and GWO-RF models. For comparison
purposes, a single RF model was also constructed to predict soil liquefaction. The devel-
oped models in this study were evaluated using four metrics, i.e., accuracy, recall, preci-
sion and F1-score (F1). Furthermore, receiver operating characteristic and precision-recall
curves were also proposed for evaluation purposes. The results showed that the developed
GA-RF and GWO-RF models can improve the performance of the original classifier. By
comparing the two hybrid models, it was found that the GWO-RF performs better on two
databases, i.e., CPT and SPT, while in the case of the SWVT database, the GA-RF has bet-
ter performance. Considering a variety of metrics, the two hybrid models can be employed
as powerful techniques to estimate soil liquefaction potential and may be feasible tools to
assist technicians in making correct decisions. By implementing sensitivity analysis, the
impact of each model predictor on soil liquefaction was evaluated, and the most influential
parameters were identified.
Keywords Soil liquefaction potential · Random forest · Genetic algorithm · Grey wolf
optimizer · Hybrid RF model
* Jian Zhou
j.zhou@csu.edu.cn; csujzhou@hotmail.com
* Tao Zhou
tzhou@szu.edu.cn
Extended author information available on the last page of the article
1 Introduction
Earthquakes are highly destructive natural disasters, and in the period following the
seismic waves, secondary disasters such as landslides, debris flows and soil liquefaction may
occur. Among them, soil liquefaction is considered a particularly severe disaster (Erzin and Ecemis
2015; Xue and Yang 2016; Hoang and Bui 2018). Several cases of earthquake-induced
soil liquefaction have been reported in many parts of the world. Especially in some coastal
areas, due to the proximity of the ocean, the probability of soil liquefaction caused by an
earthquake is greater (Chen et al. 2019). The reason is that the cohesive force between the
soil particles in the soil layer of these coastal areas is very small. In this situation, the soil
is loose and in a saturated state. Under these conditions, if an earthquake occurs, the pore
water pressure in the soil layer will rise sharply, resulting in significant degradation of soil
strength and loss of bearing capacity, which may lead to the occurrence of soil liquefaction
(Samui and Sitharam 2011; El Mohtar et al. 2014).
Once soil liquefaction occurs in a certain area, the most obvious phenomenon is that
the foundations of buildings in that specific area will lose their original stability. Because
these foundations will collapse, the surrounding area will also sink, which causes traffic
paralysis and casualties (Youd and Idriss 2001; Alobaidi et al. 2019; Ter-Martirosyan and
Le Duc 2020). Therefore, it is of great significance and interest to measure the potential of
soil liquefaction (Juang et al. 2000b; Samui et al. 2011). If the potential of soil liquefaction
is assessed more accurately, countermeasures can be developed in advance to avoid the
negative effects of soil liquefaction more effectively. However, there are many factors influ-
encing the occurrence and development of soil liquefaction (Mahmood et al. 2020; Ahmad
et al. 2021c). Since most of these factors do not have a linear relationship with soil lique-
faction, the prediction of soil liquefaction potential is difficult. To estimate the potential of
soil liquefaction, many attempts have been made by different authors worldwide, who have
tried to propose new, applicable and cutting-edge research methods to solve this problem
(Seed and Idriss 1971; Seed et al. 1983; Idriss and Boulanger 2006; Hanna et al. 2007a;
Rezania et al. 2010; Heidari and Andrus 2012; Juang et al. 2012; Kohestani et al. 2015;
Xue and Xiao 2016; Lianyang 1998; Ahmad et al. 2019a, b).
The “simplified procedure” developed by Seed and Idriss (Seed and Idriss 1971) has a
high status in evaluating the potential of soil liquefaction. Based on the “simplified proce-
dure”, many scholars have further expanded and developed it according to field test results
(Andrus and Stokoe 2000; Youd et al. 2001). Common field test methods include the stand-
ard penetration test (SPT), the shear wave velocity (Vs) test (SWVT) and the cone penetra-
tion test (CPT) (Goh 2002; Cetin et al. 2004; Moss et al. 2006; Seo et al. 2012; Samui and
Hariharan 2015; Zhang et al. 2004; Juang et al. 2002). Semiempirical methods represented
by the “simplified procedure” and pure empirical methods are traditional prediction meth-
ods, and they mostly rely on the “limit states”. The limit states can be employed to distin-
guish the liquefied region from the non-liquefied region (Pal 2006). However, the empirical
and semiempirical methods are not accurate enough to predict soil liquefaction potential.
In reality, the soil liquefaction caused by earthquakes can be influenced by a series of fac-
tors, such as the soil properties of the place where the earthquake occurs. However, the
characteristics of soil are highly uncertain; thus, it is relatively difficult to choose a reason-
able and appropriate empirical equation for estimating soil liquefaction potential (Xue and
Yang 2016). The reported techniques and equations therefore appear insufficiently effective,
and a different strategy is needed to achieve a higher prediction capacity for soil
liquefaction potential.
Over the past decades, artificial intelligence (AI) approaches have been developing rapidly.
In particular, machine learning (ML) methods have greatly promoted the progress
of many research directions in the engineering field (Zhou et al. 2019a, b, 2020a, b, c,
2021b, c, d, f; Feng et al. 2020, 2021; Le et al. 2019a, b; Bui et al. 2020; Guo et al. 2021;
Fang et al. 2021; Yu et al. 2021; Xie et al. 2021; Yong et al. 2021; Li et al. 2021a, b; Qiu
et al. 2021; Armaghani et al. 2021a, b). Accordingly, in the area of soil liquefaction predic-
tion, several scholars have used and developed AI and ML techniques (Hanna et al. 2007a;
Chern et al. 2008; Samui and Karthikeyan 2013; Zhou et al. 2019c; Rahbarzare and Azadi
2019; Ahmad et al. 2021b; Zhao et al. 2021; Zhang et al. 2021a, b). Furthermore, with the
increase in data collected by in situ tests over the past years, the application of AI and ML
methods in this field has increased rapidly. Compared to traditional prediction methods,
AI- and ML-based methods can generally achieve higher levels of accuracy
in predicting the potential of soil liquefaction (Lee and Chern 2013; Hoang and Bui 2018).
Among the many ML algorithms, artificial neural networks (ANNs), as powerful algo-
rithms, have been widely used in geotechnical engineering (Hanna et al. 2007a, b; Chern
and Lee 2009; Erzin and Ecemis 2015; Shahri 2016). Goh (Goh 1996) used an ANN and
samples obtained by CPT tests for the first time to assess soil liquefaction potential. Goh
also verified the feasibility of a probabilistic neural network (PNN) to evaluate soil lique-
faction potential through VS and CPT datasets (Goh 2002). Juang et al. (Juang et al. 2000a)
trained an ANN model with 243 data samples from the SPT database and established a
function of the liquefaction limit state. As a special neural network model, the adaptive
neuro-fuzzy inference system (ANFIS) also showed great potential for evaluating soil
liquefaction potential in an investigation conducted by Xue and Yang (Xue
and Yang 2013). Moreover, Bayesian belief networks (BNNs) are learning models that provide
an appropriate framework to address the uncertainty and causality of the target problem.
Ahmad et al. and Hu et al. conducted many important studies evaluating the potential of soil
liquefaction using BNNs and achieved good results, providing important references
for research in this field (Ahmad et al. 2019a, 2020a, b, 2021a, b; Hu and Liu 2019a,
b; Hu et al. 2015, 2016). In addition to the mentioned methods, other ML and AI meth-
odologies, such as support vector machine (SVM) (Pal 2006; Goh and Goh 2007; Samui
et al. 2011; Xue and Xiao 2016; Hoang and Bui 2018; Cai et al. 2021; Zhou et al. 2021a),
relevance vector machine (RVM) (Samui 2007; Samui and Karthikeyan 2014) and stochas-
tic gradient boosting (SGB) (Zhou et al. 2019c), have been successfully applied to evaluate
liquefaction potential. These emerging prediction AI models are not only generally supe-
rior to the traditional prediction methods in terms of prediction accuracy but also better
choices when there is a comprehensive database with many data samples available. Moreo-
ver, these methods do not need to obtain the correlation information between each variable
and the output result and can effectively deal with the complex interaction between each
characteristic variable (Kohestani et al. 2015).
However, there is no perfect method for predicting liquefaction potential, and these
AI- and ML-based models also have some limitations. The ANN model, which is the
most widely used ML methodology in this field, does not analyze the importance of each
variable when solving the target problem, so it cannot know the relationship between the
output results and characteristic variables. Additionally, ANNs do not have the ability to
fully explain the network structure within the model, which is known as the “black box”
property of this model (Samui et al. 2011). According to Xue and Xiao (Xue and Xiao
2016), the ANN model includes some defects, such as poor generalization performance,
an overfitting tendency and slow convergence. Considering that the phenomenon of soil
liquefaction is very complex to evaluate and is often affected by many geological factors,
the published prediction models based on AI technologies have relatively limited applica-
bility. Therefore, it is necessary to seek more predictive models to provide more options for
subsequent research.
Random forest (RF), developed by Breiman (Breiman 2001), is an ML algorithm with a
very mature system and high flexibility that has been widely applied in the field of civil and
geotechnical engineering (Harris and Grunsky 2015; Zhou et al. 2015, 2016). Kohestani
et al. (Kohestani et al. 2015) applied the RF model to evaluate the potential of soil liquefac-
tion and compared its results with two other ML models, namely, ANN and SVM. They
successfully showed that their developed RF model achieved a higher classification accu-
racy than the other two approaches. As a mature and widely used integrated model, RF has
many advantages. First, when solving various engineering problems, the high performance
of the RF model, intuitively represented by high accuracy, is beyond doubt. Second, RF
also has a very good processing capability when dealing with large data samples or high-
dimensional input parameters. More importantly, RF algorithms have the ability to obtain
the importance of each feature in the input sample (Genuer et al. 2010), which makes them
a particularly attractive choice among AI and ML techniques.
The goal of this paper is to investigate the feasibility and reliability of two hybrid mod-
els based on three in situ test databases for predicting soil liquefaction potential. These two
hybrid classification models are RF models optimized by the grey wolf optimizer (GWO)
and the genetic algorithm (GA), namely, GWO-RF and GA-RF, respectively. A model
built with only one database is often unable to prove the universality of its application;
therefore, using multiple databases to train the classification model and verify the reliable
performance of the model will make this study more rigorous. The effectiveness of
these two optimization algorithms has been demonstrated by a large number
of researchers. In nature, the success of many creatures is based on “survival of the
fittest” in their own population evolution, and GA addresses the target problem by simulat-
ing this natural law (Holland 1975). The GWO is a kind of swarm intelligence algorithm
that simulates the predation behaviour of the grey wolf (Mirjalili et al. 2014). During the
optimization process, each grey wolf as a separate solution will keep updating its position
until the prey is successfully captured.
The rest of this paper is organized as follows:
Section 2 gives an introduction of the three datasets used in this study and descriptions
of the corresponding characteristic variables. Section 3 introduces various algorithms and
models in predicting the potential of soil liquefaction. Section 4 describes the performance
evaluation indices used in this study. The modelling process and modelling evaluation of
different techniques used in this study are given in Sect. 5. Section 6 discusses the results
obtained, introduces the best model, and discusses the limitations of the study.
2 Historical datasets
The data used in this research include three different databases, i.e., Databases A, B and C.
Database A contains a total of 226 historical cases, among which the number of liquefied
samples is 133, and the remaining 93 historical cases are all non-liquefied samples. All the
data samples were obtained based on CPT tests (Goh and Goh 2007; Juang et al. 2003),
and there are six parameters related to soil liquefaction properties in the dataset. These
include the maximum horizontal ground surface acceleration αmax (g), the sleeve friction
ratio Rf (%), the effective stress S1 (kPa), the earthquake moment magnitude Mw, the total
stress S2 (kPa), and the cone tip resistance qc (MPa).
Samples in Database B were obtained based on the SPT tests. These data contain 620
historical cases documented by Hanna et al. (Hanna et al. 2007b), with 12 characteristic
variables and a decision result. These characteristic variables include the depth of the ground
water table dw (m), the cyclic stress ratio CSR, the total vertical stress Svo (kPa), the
earthquake magnitude Mw, the effective vertical stress S′vo (kPa), the initial soil friction angle
φ′, the corrected standard penetration blow number N1, the threshold acceleration αt (g),
the percentage of fines content smaller than 75 μm F75 (%), the depth of the soil specimen Z (m),
the shear wave velocity Vs (m/s), and the peak horizontal acceleration at the ground surface
αmax (g). The numbers of liquefied and non-liquefied cases in Database B are 256 and 364,
respectively.
Database C includes a total of 415 cases that were obtained by Vs tests in the study
carried out by Kayen et al. (Kayen et al. 2013). This database is also divided into two
categories, namely non-liquefaction samples and liquefaction samples. There are 128 and
287 of these two types of samples, respectively. There are 8 input parameters in Database
C, which include the effective stress normalized shear-wave velocity (Vs1, m/s), the
effective vertical stress (S′vo, kPa), the cyclic stress ratio (CSR), the water table depth (Depth to
GWT, m), the peak horizontal acceleration at the ground surface (αmax, g), the total
vertical stress (Svo, kPa), the nonlinear shear mass participation factor (rd), and the earthquake
magnitude (Mw).
In this research, the three original databases introduced in this section were divided into
training sets (containing 80% of the samples) and test sets (containing the remaining 20%
of the samples) for model training and performance verification. Given the large
difference between the proportions of liquefied and non-liquefied samples in the original
databases, simple random sampling could not keep the sampling bias within an
acceptable range. Therefore, to avoid the influence of sampling bias as much
as possible, a stratified sampling technique was employed in this study to ensure that the
difference in the class ratio between the sample and the population was close to zero.
In addition, to further reduce the differences between the samples used for model train-
ing and validation, the statistical consistency between the training and test sets was also
fully considered when dividing them to ensure that the differences in statistical indicators
such as the mean and standard deviation between samples were controlled within a very
small range.
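The stratified 80/20 split described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the data are synthetic (only the class counts of Database A, 133 liquefied and 93 non-liquefied, are taken from the text), and the use of scikit-learn's `train_test_split` with `stratify` is an assumption about tooling.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Database A: 226 cases, 6 predictors,
# 133 liquefied (label 1) and 93 non-liquefied (label 0).
rng = np.random.default_rng(0)
X = rng.normal(size=(226, 6))
y = np.array([1] * 133 + [0] * 93)

# stratify=y keeps the class ratio (almost) identical in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

ratio_all = y.mean()
ratio_train = y_train.mean()
ratio_test = y_test.mean()
print(f"liquefied fraction: all={ratio_all:.3f}, "
      f"train={ratio_train:.3f}, test={ratio_test:.3f}")

# Gaps in the mean and standard deviation of each predictor between subsets.
mean_gap = np.abs(X_train.mean(axis=0) - X_test.mean(axis=0))
std_gap = np.abs(X_train.std(axis=0) - X_test.std(axis=0))
print("largest mean gap:", mean_gap.max().round(3))
print("largest std gap:", std_gap.max().round(3))
```

Checking `mean_gap` and `std_gap` mirrors the statistical-consistency check mentioned in the text; in practice one might repeat the split with different seeds until these gaps are acceptably small.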
The descriptive statistics of the variables in each dataset are presented in Tables 1, 2, 3,
and Fig. 1 visually represents the value range, average value, outlier and other useful infor-
mation of each variable in the training and test sets separated from the three original data-
bases. Obviously, it is not practical to make the means and standard deviations of all vari-
ables in the training and test sets completely consistent. However, it can be seen intuitively
from Fig. 1 that the statistical differences of all variables in the samples of the training and
test sets after comprehensive analysis and processing were small.
Table 1 Descriptive statistics of variables in Database A
                                            Mean                          Standard deviation
Variable      Minimum   Median    Maximum   Original  Training  Test      Original  Training  Test
qc (MPa)      0.90      4.90      25.00     5.82      5.91      5.44      4.09      4.16      3.83
Rf (%)        0.10      0.90      5.20      1.22      1.22      1.22      1.05      1.07      0.95
S1 (kPa)      22.50     62.80     215.20    74.65     75.02     73.18     34.40     34.34     34.96
S2 (kPa)      26.60     90.30     274.00    106.89    106.63    107.92    55.36     54.58     58.92
αmax (g)      0.08      0.25      0.80      0.29      0.30      0.26      0.14      0.14      0.14
Mw            6.00      7.10      7.60      6.95      6.94      6.97      0.44      0.44      0.46
Table 2 Descriptive statistics of variables in Database B
                                            Mean                          Standard deviation
Variable      Minimum   Median    Maximum   Original  Training  Test      Original  Training  Test
Z (m)         0.80      6.70      19.80     7.66      7.69      7.51      4.90      4.93      4.79
N1            1.00      11.00     75.00     14.48     14.38     14.86     11.39     11.31     11.72
F75 (%)       1.00      74.50     100.00    62.99     62.19     66.18     34.28     34.65     32.67
dw (m)        0.35      1.10      10.00     1.45      1.44      1.50      1.20      1.20      1.23
Svo (kPa)     12.10     121.60    408.90    144.60    145.08    142.70    98.20     98.79     96.17
S′vo (kPa)    7.50      68.15     233.70    82.48     82.56     82.13     52.84     53.17     51.70
αt (g)        0.00      0.06      0.85      0.07      0.07      0.08      0.07      0.07      0.09
CSR           0.12      0.39      0.77      0.37      0.37      0.37      0.15      0.15      0.16
Vs (m/s)      37.00     155.00    500.00    166.98    166.20    170.09    67.09     65.62     72.87
φ′            23.46     31.41     52.08     31.96     31.96     31.98     4.85      4.88      4.74
Mw            7.40      7.40      7.60      7.49      7.49      7.50      0.10      0.10      0.10
αmax (g)      0.18      0.40      0.67      0.38      0.38      0.38      0.15      0.15      0.15
Table 3 Descriptive statistics of variables in Database C
                                                Mean                          Standard deviation
Variable          Minimum   Median    Maximum   Original  Training  Test      Original  Training  Test
Mw                5.90      7.00      9.00      7.12      7.13      7.09      0.53      0.52      0.57
Depth to GWT (m)  0.40      1.90      7.00      1.95      1.91      2.12      1.15      1.11      1.32
Svo (kPa)         17.40     79.28     331.68    89.69     90.12     87.98     45.07     45.59     43.15
S′vo (kPa)        11.03     51.89     176.08    56.20     56.20     56.19     25.57     25.40     26.42
αmax (g)          0.02      0.35      0.76      0.33      0.33      0.33      0.16      0.16      0.17
rd                0.51      0.88      1.00      0.85      0.85      0.87      0.11      0.11      0.10
CSR               0.02      0.26      0.73      0.28      0.28      0.29      0.15      0.14      0.17
Vs1 (m/s)         81.70     159.80    362.90    166.70    165.65    170.89    40.20     39.44     43.09
Fig. 1 Boxplots of variables for training (TR) and test (TE) sets of the three databases: a Database A; b Database B
and c Database C. The legend marks the mean value, the median, the 25% to 75% range and abnormal values
3 Research methodologies
3.1 Random forest (RF)
3.1.1 The principle of RF
The RF is an ML algorithm proposed by Breiman (Breiman 2001) that combines
bagging ensemble learning theory (Breiman 1996) with the random subspace method
proposed by Ho (Ho 1998). As the base classifier of RF, the decision tree (DT) has a simple
structure, but a single DT is prone to overfitting when the input variables are relatively
complex and may therefore fail to deal with classification problems effectively.
Compared to the traditional DT model, RF has a stronger generalization ability and a
better classification effect (Breiman 2001). The randomness of the RF model is mainly
reflected in the following two aspects. First, for the training set input into the model, the
bootstrap method is applied to randomly select K new sample sets, and each new sample
set is used to train a DT. Therefore, there are K DTs in the RF model as the base classifier.
The unselected samples constitute the out-of-bag (OOB) datasets. Second, during DT
construction, a certain number of features are randomly drawn from all N features
of the input variables.
In the classification problem, each DT outputs a classification result, and the
K DTs then vote to determine the final classification result of the RF.
The final decision result of the RF model is shown as follows (Belgiu and Dragut 2016):
$$ H(X) = \arg\max_{Y} \sum_{i=1}^{K} I\left(h_i(X) = Y\right) \tag{1} $$
where H(X) refers to an RF model composed of DTs; h_i refers to individual DTs; K
represents the number of DTs in this model; I(·) is the indicator function; and X and Y are
the input vector and correct classification vector, respectively.
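The voting rule in Eq. (1) can be illustrated with a few lines of NumPy. This is only a sketch: the vote matrix is made up, and the function name `majority_vote` is hypothetical.

```python
import numpy as np

def majority_vote(tree_predictions):
    """Eq. (1): for each sample, return the class receiving the most votes.

    tree_predictions: int array of shape (K, n_samples), one row per tree.
    """
    votes = np.asarray(tree_predictions)
    n_classes = votes.max() + 1
    # counts[c, j] = number of trees voting class c for sample j
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)

# K = 5 hypothetical trees voting on 4 samples (0 = non-liquefied, 1 = liquefied).
preds = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
])
final = majority_vote(preds)
print(final)  # [1 0 1 0]
```

Note that some libraries aggregate differently; scikit-learn's `RandomForestClassifier`, for example, averages the class probabilities predicted by the trees instead of counting hard votes.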
3.1.2 Parameters in the RF model
In the RF model, there are many built-in parameters, and the two optimization algorithms
selected in this paper (GWO and GA) are both used to optimize three parameters of RF:
(1) the number of DTs in the RF model (n_estimators); if the value of this parameter is too
small, the combined model fails to fit the input variables well, and the classification per-
formance of the model is correspondingly low, but if the value is too large, the calculation
amount of the model will be too large, and the running speed will be greatly slowed down;
(2) the minimum number of samples needed for a node in the tree to split (min_samples_
split), and (3) the maximum depth of the generated tree (max_depth).
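As an illustration of the three tuned parameters, the sketch below fits a random forest with scikit-learn, whose parameter names match those used in the text. The dataset is synthetic and the parameter values are arbitrary examples, not the optimized values from this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary data standing in for an in situ database.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# The three parameters tuned in this study; the values are arbitrary examples.
rf = RandomForestClassifier(
    n_estimators=100,      # number of DTs
    min_samples_split=4,   # minimum samples needed to split a node
    max_depth=5,           # maximum depth of each tree
    oob_score=True,        # also score the forest on the out-of-bag samples
    random_state=0,
)
rf.fit(X, y)

deepest = max(tree.get_depth() for tree in rf.estimators_)
print("number of trees:", len(rf.estimators_))
print("deepest tree:", deepest)
print("OOB accuracy:", round(rf.oob_score_, 3))
print("feature importances:", rf.feature_importances_.round(3))
```

The `feature_importances_` attribute corresponds to the feature-importance capability of RF mentioned in the Introduction.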
3.2 Genetic Algorithm (GA)
According to Darwin’s theory of “survival of the fittest”, all kinds of creatures in nature
have to undergo a long period of evolution to make it easier for them to survive, and those
individuals who do not have good adaptability will have a high probability of being elimi-
nated. This process of evolution actually reflects the evolution of organisms’ chromosomes,
and chromosomes with better performance tend to make organisms more adaptable to the
environment; as a result, these chromosomes are more likely to be preserved and passed
on to the next generation. The core idea of the GA is to use this theory to find the optimal
solutions.
When optimizing a problem using GA, an initial population containing a number of
individuals is first established, and these individuals are all called “chromosomes”, each
of which represents a possible solution. According to the biological point of view, these
chromosomes reflect the characteristics of the individual; the specific difference is that the
genes in various chromosomes are different. In GA, the most commonly used encoding
method of chromosomes is binary encoding, that is, the gene of the chromosome is “0” or
“1”. Chromosomes are the basic unit for various operations in GA, among which there are
mainly three operators, namely, selection, crossover and mutation.
(1) Selection operator
In the process of biological evolution, all organisms must adapt to their living
environment, and only the individuals that the environment selects survive.
This phenomenon is called natural selection, and the selection
operator of GA simulates it. The higher the fitness of a
chromosome, the less likely it is to be eliminated.
(2) Crossover operator
Among the three basic operators of GA, the crossover operator plays the most impor-
tant role in the algorithm. Crossover refers to the process by which parts of the gene
fragments on two chromosomes are exchanged to produce another two new chromo-
somes. The crossover probability PCX should not be too small; otherwise, the searching
ability of GA will be significantly reduced, and the diversity of the population cannot
be guaranteed.
(3) Mutation operator
In the course of biological evolution, not all chromosomes cross over, and
occasionally some of them mutate to produce unforeseen new features. In GA, the probability
of chromosome mutation (PMU) is small, generally 0.0001–0.5, but the existence of a
mutation operator can keep the algorithm from converging prematurely, make
the direction of evolution more diversified, and improve the local search ability of GA.
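The three operators can be sketched on binary chromosomes as follows. The text does not specify the selection scheme or any elitism, so tournament selection, single-point crossover, keeping the best individual, and the toy objective (maximizing the number of 1-bits) are all assumptions made for illustration.

```python
import random

def select(population, fitness, k=3):
    """Tournament selection: best of k randomly drawn chromosomes."""
    contenders = random.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best][:]

def crossover(parent_a, parent_b, p_cx=0.8):
    """Single-point crossover applied with probability p_cx."""
    if random.random() < p_cx and len(parent_a) > 1:
        point = random.randint(1, len(parent_a) - 1)
        return (parent_a[:point] + parent_b[point:],
                parent_b[:point] + parent_a[point:])
    return parent_a[:], parent_b[:]

def mutate(chromosome, p_mu=0.05):
    """Flip each gene independently with probability p_mu."""
    return [1 - g if random.random() < p_mu else g for g in chromosome]

random.seed(0)
# Toy problem (hypothetical): maximise the number of 1-bits.
population = [[random.randint(0, 1) for _ in range(12)] for _ in range(20)]
initial_best = max(sum(c) for c in population)

for generation in range(40):
    fitness = [sum(c) for c in population]
    elite = max(population, key=sum)[:]        # keep the best individual
    children = [elite]
    while len(children) < len(population):
        a, b = crossover(select(population, fitness),
                         select(population, fitness))
        children.extend([mutate(a), mutate(b)])
    population = children[:len(population)]

best = max(population, key=sum)
print(best, sum(best))
```

Because the best chromosome is copied unchanged into each new generation, the best fitness can never decrease from one generation to the next.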
3.3 Grey Wolf Optimizer (GWO)
By studying the collective hunting process of the grey wolf population, Mirjalili et al. pro-
posed the GWO algorithm (Mirjalili et al. 2014). There are approximately 5 to 12 wolves
in every grey wolf population in nature, but the status of these wolves is not equal. In the
mathematical model of GWO, each grey wolf is regarded as a candidate solution, and the
goal of each grey wolf is to search in the search space to find the prey. The grey wolves in
the population are divided into four grades from bottom to top: ω, δ, β and α (as shown in
Fig. 2a).
In the GWO, the grey wolves in the population follow specific formulas to encircle the
target prey at first:
$$ \vec{D} = \left| \vec{C} \cdot \vec{X}_P(t) - \vec{X}(t) \right| \tag{2} $$
$$ \vec{X}(t+1) = \vec{X}_P(t) - \vec{A} \cdot \vec{D} \tag{3} $$
where $\vec{X}_P$ indicates the position vector of the prey, $\vec{X}$ refers to the position vector of a wolf,
and t represents the current iteration. $\vec{A}$ and $\vec{C}$ are both coefficient vectors, and their
calculation formulas are shown in Eqs. (4) and (5), respectively.
$$ \vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a} \tag{4} $$
$$ \vec{C} = 2\vec{r}_2 \tag{5} $$
where $\vec{r}_1$ and $\vec{r}_2$ are both random vectors with a value range of [0, 1], and $\vec{a}$ is not a constant
value, which will decrease linearly from 2 to 0 as the number of iterations increases.
Fig. 2 Mathematical model of GWO: a hierarchy; b position updating of grey wolves; c attack of the prey
(c1, |A| < 1) versus search for other prey (c2, |A| > 1)
The location of the target prey is not clearly known in the mathematical model of GWO.
To better model the hunting behaviour of wolves, the algorithm assumes that α, β, and δ
in the pack know more about the location of the prey compared with other wolves (ω),
especially the α wolf. The other grey wolves update their positions under the leadership
of the best three grey wolves, as presented in Eq. (6). A schematic diagram of the location
update for ω is shown in Fig. 2b.
$$ \vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{6} $$
$$ \vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \quad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta, \quad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta \tag{7} $$
$$ \vec{D}_\alpha = \left| \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \right|, \quad \vec{D}_\beta = \left| \vec{C}_2 \cdot \vec{X}_\beta - \vec{X} \right|, \quad \vec{D}_\delta = \left| \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \right| \tag{8} $$
where $\vec{X}(t+1)$ is the new position of grey wolves other than α, β, and δ, and $\vec{X}_\alpha$, $\vec{X}_\beta$, and $\vec{X}_\delta$
are the positions of α, β, and δ.
In Fig. 2c, (c1) and (c2) represent the two situations in which grey wolves attack the prey
(|A| < 1) and move away from it to search for other prey (|A| > 1); this mechanism is
controlled by the value of $\vec{A}$.
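The position update in Eqs. (4) to (8) can be written compactly in NumPy. The sketch below is a minimal, assumption-laden version: it minimizes a toy objective, updates every wolf (including the current leaders) in each iteration, and clips positions to the search bounds, none of which is prescribed by the text. The function name `gwo_minimize` is hypothetical.

```python
import numpy as np

def gwo_minimize(objective, lower, upper, n_wolves=10, n_iter=50, seed=0):
    """Minimal GWO following Eqs. (4)-(8); minimises `objective`."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    wolves = rng.uniform(lower, upper, size=(n_wolves, dim))

    for t in range(n_iter):
        scores = np.array([objective(w) for w in wolves])
        leaders = wolves[np.argsort(scores)[:3]].copy()   # alpha, beta, delta
        a = 2.0 * (1.0 - t / n_iter)                      # decreases linearly from 2

        for i in range(n_wolves):
            candidates = []
            for leader in leaders:
                A = 2.0 * a * rng.random(dim) - a          # Eq. (4)
                C = 2.0 * rng.random(dim)                  # Eq. (5)
                D = np.abs(C * leader - wolves[i])         # Eq. (8)
                candidates.append(leader - A * D)          # Eq. (7)
            new_position = np.mean(candidates, axis=0)     # Eq. (6)
            wolves[i] = np.clip(new_position, lower, upper)  # assumed bound handling

    scores = np.array([objective(w) for w in wolves])
    return wolves[scores.argmin()], float(scores.min())

# Toy objective (hypothetical): the sphere function, minimum 0 at the origin.
best_x, best_f = gwo_minimize(lambda x: float(np.sum(x ** 2)),
                              lower=[-5, -5, -5], upper=[5, 5, 5],
                              n_wolves=20, n_iter=100)
print(best_x, best_f)
```

Early on, `a` is close to 2, so components of `A` often exceed 1 in magnitude and the wolves spread out (exploration); as `a` shrinks, `|A| < 1` and the pack contracts around the three leaders (exploitation).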
3.4 GA‑based RF Model (GA‑RF)
Selection of the RF parameters is crucial for the classification accuracy, and the appropri-
ate parameter combination will achieve a better classification effect. The GA-RF model,
which enjoys the advantages of both RF and GA approaches (Hassan et al. 2015; Ye et al.
2018; Wang et al. 2020), will serve as the first hybrid model in this research. The imple-
mentation steps of GA-RF are described as follows:
Step 1 Initialize the population. An initial population is composed of a certain number
of chromosomes that are constructed randomly at first, and all chromosomes are binary-
coded. If a gene encodes “0”, it indicates that a certain feature is not selected.
Step 2 Evaluate the fitness of each individual. The individual with higher fitness is
closer to the optimal solution sought by the algorithm. The fitness function used in this
hybrid model is based on the average accuracy after five-fold cross-validation.
Step 3 GA operators. First, the selection operator is used to screen out several pairs of
chromosomes according to the fitness of different individuals; thus, a parent population for
subsequent operations is constructed. Then, through the crossover operator and mutation
operator set in GA, individuals in the parent population execute crossover and mutation
operations to form candidate individuals with new characteristics.
Step 4 Output of the hybrid model. Step 3 will be repeated until the maximum number
of iterations is reached. When the stop condition is met, the model will output the optimal
RF parameters.
The framework of the proposed GA-RF model is shown in Fig. 3 (Wang et al. 2020;
Zhang and Wang 2021). In this figure, GA is employed to find a better combination of the
three parameters in RF (n_estimators, min_samples_split and max_depth).
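A possible shape for the fitness evaluation in Step 2 is sketched below. The text does not describe how a binary chromosome maps to the three RF parameters, so the bit layout and search ranges in `PARAM_BITS`, the helper names `decode` and `fitness`, and the synthetic data are all assumptions; only the use of mean five-fold cross-validation accuracy as the fitness comes from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical encoding: (lower bound, number of bits) per parameter.
PARAM_BITS = {
    "n_estimators": (10, 6),       # 10 .. 73
    "min_samples_split": (2, 3),   # 2 .. 9
    "max_depth": (1, 4),           # 1 .. 16
}

def decode(chromosome):
    """Map a binary chromosome (list of 0/1 genes) to RF parameters."""
    params, pos = {}, 0
    for name, (low, n_bits) in PARAM_BITS.items():
        bits = chromosome[pos:pos + n_bits]
        params[name] = low + int("".join(map(str, bits)), 2)
        pos += n_bits
    return params

def fitness(chromosome, X, y):
    """Mean accuracy over five-fold cross-validation on the training set."""
    model = RandomForestClassifier(random_state=0, **decode(chromosome))
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

# Synthetic training data standing in for one of the databases.
X_train, y_train = make_classification(n_samples=150, n_features=6, random_state=0)

chromosome = [0, 1, 0, 1, 0, 0,   0, 1, 0,   0, 1, 0, 0]
params = decode(chromosome)
score = fitness(chromosome, X_train, y_train)
print(params, round(score, 3))
```

A GA loop would call `fitness` for every chromosome in each generation, apply selection, crossover and mutation, and finally decode the fittest chromosome into the RF parameters reported by Step 4.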
3.5 GWO‑based RF model (GWO‑RF)
In addition to GA, another optimization algorithm, GWO, is also used in this study to
optimize the RF model. Thus, the second hybrid model proposed in this study, GWO-RF
(Maroufpoor et al. 2019; Golafshani et al. 2020; Yu et al. 2020), is formed. The general
frameworks of the two hybrid models are quite similar, and the main difference between
them refers to their main optimization loops. The main steps of the GWO-RF model are as
follows (Amirsadri et al. 2018):
Step 1 Population initialization. First, an initial population with each grey wolf repre-
senting a candidate solution is constructed, which serves as the basis for subsequent opera-
tions. Then, the internal parameters of the GWO algorithm are initialized.
Step 2 Fitness evaluation. The fitness function in the GWO-RF model is also set by the
five-fold cross-validation approach. The fitness value indicates the advantages and disad-
vantages of the individual and directly determines the choice of the first three wolves (α, β
and δ) in the population.
Step 3 Iterative optimization process. When the initial population is created and the
three initial optimal individuals (α, β and δ) are selected, the iterative process begins.
In addition to the three optimal individuals, the remaining grey wolves also have the
potential to become the optimal individuals, but their positions need to be updated
under the guidance of the currently selected three optimal solutions to get closer to
the target prey. After the positions of these grey wolves are updated, a population
representing a new solution set is created. Then, the control parameters in the algorithm,
including a⃗, A⃗ and C⃗, will be updated. Next, the fitness of all the individuals in the new
population is calculated, and the individuals are sorted by fitness. The three individuals
with the highest fitness are saved as α, β and δ. In the GWO-RF model, the above process
will be repeated in each iteration until the termination condition is met.
Fig. 3 Flowchart of the GA-RF predictive model
Step 4 Output the optimal solution and test the optimized model. When the termina-
tion condition set in the hybrid model is triggered (the termination condition is either
the maximum number of iterations given in the model or a sufficiently high value of
fitness), the final α wolf screened by the GWO-RF represents the optimal solution; oth-
erwise, the system will repeat from Step 2.
After going through the complete process described above, the optimal parameters
are obtained. The whole process is displayed in Fig. 4.
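The four steps above can be sketched in a few lines of Python. This is a minimal illustration of a standard grey wolf optimizer, not the authors' implementation: the function name `gwo_maximize`, the toy objective `toy_fitness`, the linear decay of a from 2 to 0 and the clipping to bounds are all assumptions made for the example. In the GWO-RF setting, the fitness callable would instead return the five-fold cross-validation score of an RF trained with the hyperparameters encoded by a wolf's position.

```python
import random


def gwo_maximize(fitness, bounds, n_wolves=10, n_iter=50, seed=0):
    """Maximize `fitness` over the box `bounds` with a basic grey wolf optimizer.

    fitness: callable taking a position (list of floats); higher is better.
    bounds:  list of (low, high) pairs, one per dimension.
    """
    rng = random.Random(seed)

    def top_three(candidates):
        # Rank by fitness; the three best become alpha, beta and delta.
        ranked = sorted(candidates, key=fitness, reverse=True)
        return [list(w) for w in ranked[:3]]

    # Step 1: random initial population (needs at least three wolves).
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    # Step 2: evaluate fitness and pick the three initial leaders.
    alpha, beta, delta = top_three(wolves)

    # Step 3: iterative position updates guided by alpha, beta and delta.
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # control parameter a decays linearly from 2
        updated = []
        for wolf in wolves:
            position = []
            for d, (lo, hi) in enumerate(bounds):
                pulls = []
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a  # coefficient A in [-a, a)
                    C = 2.0 * rng.random()          # coefficient C in [0, 2)
                    distance = abs(C * leader[d] - wolf[d])
                    pulls.append(leader[d] - A * distance)
                # New coordinate: mean of the three leader-guided estimates,
                # clipped to the search bounds.
                position.append(min(max(sum(pulls) / 3.0, lo), hi))
            updated.append(position)
        wolves = updated
        # Re-select the leaders; earlier leaders stay eligible so the best
        # solution found so far is never lost.
        alpha, beta, delta = top_three(wolves + [alpha, beta, delta])

    # Step 4: the final alpha wolf is returned as the best solution found.
    return alpha, fitness(alpha)


def toy_fitness(x):
    # Hypothetical stand-in for a cross-validated score; maximum 0 at (3, -1).
    return -((x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2)


best, score = gwo_maximize(toy_fitness, [(-10.0, 10.0), (-10.0, 10.0)],
                           n_wolves=20, n_iter=100)
print(best, score)
```

The sketch stops only on the iteration budget; a fitness threshold, the second termination condition mentioned in Step 4, could be added as an early `break` inside the loop.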
Fig. 4 Flowchart of the GWO-RF predictive model
4 Performance evaluation criteria
In this study, four commonly used indicators were selected and applied to determine
the performance of the classification models mentioned above, namely, accuracy, preci-
sion, recall and F1-Score (F1) (Hu and Liu 2019c; Mahmood et al. 2020; Zhang and
Wang 2021; Zhou et al. 2022). In addition, the precision-recall curve (called the P-R
curve) (Saito and Rehmsmeier 2015) and receiver operating characteristic curve (called
the ROC curve) (Fawcett 2006) were also introduced as references to assess the perfor-
mance of the models.
In the process of binary classification, some samples will be included in the opposite
categories. The function of accuracy is to evaluate whether the classification model is
effective for the classification of samples, including both positive and negative samples.
Precision is the percentage of true positive cases among all cases that are judged to be
positive. Recall can measure whether the model is effective in classifying the positive
samples in the input variables. The calculation equations of these three indices are
shown in Eqs. 9, 10 and 11:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{TP + TN}{N} \tag{9}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{10}$$
$$\mathrm{Recall\ (Sensitivity)} = \frac{TP}{TP + FN} \tag{11}$$
where TN and TP represent the numbers of correctly classified negative samples and
correctly classified positive samples, respectively, and N is the total number of samples.
Correspondingly, FN and FP refer to the numbers of positive samples misclassified as
negative and negative samples misclassified as positive, respectively
(Sokolova and Lapalme 2009).
The F1-score is an indicator that links both precision and recall. When the input data are
unbalanced, this indicator is more reliable than the above three indicators:
$$\mathrm{F1\text{-}Score} = \frac{2 \times (\mathrm{Precision} \cdot \mathrm{Recall})}{\mathrm{Precision} + \mathrm{Recall}} \tag{12}$$
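As a worked illustration of Eqs. 9 to 12, the short sketch below computes the four indicators from confusion-matrix counts. The helper name `classification_metrics` is hypothetical, and the counts in the example are illustrative assumptions rather than values quoted from the text; they were picked because they reproduce the rounded GA-RF test scores for Database C reported in Table 8.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts (Eqs. 9-12)."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative (assumed) counts for an 83-sample test set.
metrics = classification_metrics(tp=57, tn=23, fp=3, fn=0)
print({name: round(value, 4) for name, value in metrics.items()})
```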
In addition to these four indicators, there is another important and widely used indica-
tor called AUC, which is the area under the ROC curve. The ROC curve is able to directly
reflect the relationship between the false-positive rate (FPR) and the true-positive rate
(TPR). These two indicators are the abscissa and ordinate of the ROC curve, respectively,
as presented in Eqs. (13) and (14). In the ROC curve graph, the curve starts at the point
where both TPR and FPR are 0 and ends at (1, 1). The closer the ROC curve of a
classification model lies to the upper left corner of the figure, the larger the AUC value
and the better the model performance.
$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{13}$$
$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{14}$$
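To make the construction of the ROC curve and the AUC concrete, the sketch below sweeps a decision threshold over predicted scores, records (FPR, TPR) pairs and integrates them with the trapezoidal rule. The function names and the six-sample example are hypothetical; labels are assumed to be 0 or 1, and both classes must be present.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points from (0, 0) to (1, 1), lowering the threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels (1 = liquefied) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, probs)
print(curve[0], curve[-1], auc(curve))
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, so the area equals 8/9.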
The process of evaluating the prediction performance of the two hybrid classification
models developed in this study based on three in situ databases is shown in Fig. 5, in which
the four performance indicators introduced in this section are employed.
5 Results and discussion
5.1 Parameter settings
In this study, three models were employed to assess the liquefaction potential of soil,
including a base classification model (RF) and two hybrid models (GWO-RF and
GA-RF). In the modelling process, three field databases (i.e., A, B and C) were used.
To make a more comprehensive comparison of the classification performance of the
Fig. 5 Graphical representation of the whole process for the evaluation of soil liquefaction potential
two hybrid models, the same eight population sizes were set in both models
(10, 20, 30, 40, 50, 100, 150 and 200) (Zhou et al. 2021e; Qiu et al. 2021). According
to Tables 4, 5 and 6, the two hybrid classification techniques performed well at all
population sizes, but it is impractical to carry all of these models forward.
Therefore, it is necessary to comprehensively consider
and score the indicators of the hybrid models after the completion of training to select
Table 4 The performance of the two hybrid models under Database A

Population Accuracy Rank Precision Rank Recall Rank F1 Rank Total
Training (GA-RF)
10 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
20 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
30 0.9944 7 1.0000 8 0.9906 7 0.9953 7 29
40 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
50 0.9889 6 0.9906 7 0.9906 7 0.9906 6 26
100 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
150 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
200 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
Testing (GA-RF)
10 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
20 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
30 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
40 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
50 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
100 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
150 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
200 0.9783 8 1.0000 8 0.9630 8 0.9811 8 32
Training (GWO-RF)
10 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
20 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
30 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
40 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
50 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
100 0.9944 7 0.9907 7 1.0000 8 0.9953 7 29
150 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
200 0.9889 6 0.9906 6 0.9906 7 0.9906 6 25
Testing (GWO-RF)
10 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
20 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
30 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
40 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
50 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
100 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
150 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
200 0.9783 7 1.0000 8 0.9630 7 0.9811 7 29
Table 5 The performance of the two hybrid models under Database B

Population Accuracy Rank Precision Rank Recall Rank F1 Rank Total
Training (GA-RF)
10 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
20 0.9960 6 0.9903 7 1.0000 8 0.9951 6 27
30 0.9919 5 0.9902 6 0.9902 6 0.9902 5 22
40 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
50 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
100 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
150 0.9980 7 1.0000 8 0.9951 7 0.9976 7 29
200 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
Testing (GA-RF)
10 0.8952 8 0.9130 7 0.8235 8 0.8660 8 31
20 0.8871 7 0.8936 6 0.8235 8 0.8571 6 27
30 0.8952 8 0.9524 8 0.7843 7 0.8602 7 30
40 0.8952 8 0.9130 7 0.8235 8 0.8660 8 31
50 0.8871 7 0.8936 6 0.8235 8 0.8571 6 27
100 0.8952 8 0.9130 7 0.8235 8 0.8660 8 31
150 0.8952 8 0.9130 7 0.8235 8 0.8660 8 31
200 0.8871 7 0.8936 6 0.8235 8 0.8571 6 27
Training (GWO-RF)
10 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
20 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
30 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
40 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
50 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
100 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
150 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
200 1.0000 8 1.0000 8 1.0000 8 1.0000 8 32
Testing (GWO-RF)
10 0.8871 6 0.8936 5 0.8235 6 0.8571 5 22
20 0.9032 7 0.9149 7 0.8431 7 0.8776 6 27
30 0.9032 7 0.8980 6 0.8627 8 0.8800 7 28
40 0.9032 7 0.9149 7 0.8431 7 0.8776 6 27
50 0.9113 8 0.9167 8 0.8627 8 0.8889 8 32
100 0.9113 8 0.9167 8 0.8627 8 0.8889 8 32
150 0.8871 6 0.8936 5 0.8235 6 0.8571 5 22
200 0.9032 7 0.8980 6 0.8627 8 0.8800 7 28
the best population size. In Table 7, all the optimal population sizes of the GWO-RF
and the GA-RF are presented, and the optimal RF parameter combinations output by the
models are also given in this table.
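The scoring scheme used in Tables 4 to 6 can be reproduced with a short sketch. The rule is not spelled out in the text; the code assumes, from the rank columns of the tables, that for each indicator the best value receives a rank equal to the number of population sizes (8), each next distinct value receives one less, ties share a rank, and the four ranks are summed. The function names are hypothetical; the data are the Testing (GA-RF) rows of Table 5.

```python
def dense_rank_scores(values, top):
    """Best value gets `top`; each next distinct (lower) value gets one less."""
    distinct = sorted(set(values), reverse=True)
    rank_of = {value: top - i for i, value in enumerate(distinct)}
    return [rank_of[value] for value in values]


def total_scores(table):
    """table maps population size -> (accuracy, precision, recall, F1)."""
    populations = list(table)
    totals = {p: 0 for p in populations}
    for column in range(4):
        column_values = [table[p][column] for p in populations]
        ranks = dense_rank_scores(column_values, top=len(populations))
        for p, rank in zip(populations, ranks):
            totals[p] += rank
    return totals


# Testing (GA-RF) rows of Table 5 (Database B).
table5_ga_rf_testing = {
    10: (0.8952, 0.9130, 0.8235, 0.8660),
    20: (0.8871, 0.8936, 0.8235, 0.8571),
    30: (0.8952, 0.9524, 0.7843, 0.8602),
    40: (0.8952, 0.9130, 0.8235, 0.8660),
    50: (0.8871, 0.8936, 0.8235, 0.8571),
    100: (0.8952, 0.9130, 0.8235, 0.8660),
    150: (0.8952, 0.9130, 0.8235, 0.8660),
    200: (0.8871, 0.8936, 0.8235, 0.8571),
}
print(total_scores(table5_ga_rf_testing))
```

Under this assumed rule, the totals match the Total column of the corresponding rows in Table 5 (for example, 31 for a population size of 10 and 30 for a population size of 30).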
Table 6 The performance of the two hybrid models under Database C

Population Accuracy Rank Precision Rank Recall Rank F1 Rank Total
Training (GA-RF)
10 0.9940 6 0.9914 6 1.000 8 0.9957 6 26
20 0.9970 7 0.9957 7 1.000 8 0.9978 7 29
30 0.9970 7 0.9957 7 1.000 8 0.9978 7 29
40 0.9819 5 0.9746 5 1.000 8 0.9871 5 23
50 0.9970 7 0.9957 7 1.000 8 0.9978 7 29
100 1.000 8 1.000 8 1.000 8 1.000 8 32
150 0.9970 7 0.9957 7 1.000 8 0.9978 7 29
200 0.9940 6 0.9914 6 1.000 8 0.9957 6 26
Testing (GA-RF)
10 0.9398 6 0.9333 5 0.9825 7 0.9573 6 24
20 0.9398 6 0.9483 6 0.9649 6 0.9565 5 23
30 0.9518 7 0.9492 7 0.9825 7 0.9655 7 28
40 0.9639 8 0.9500 8 1.000 8 0.9744 8 32
50 0.9398 6 0.9333 5 0.9825 7 0.9573 6 24
100 0.9639 8 0.9500 8 1.000 8 0.9744 8 32
150 0.9398 6 0.9333 5 0.9825 7 0.9573 6 24
200 0.9518 7 0.9492 7 0.9825 7 0.9655 7 28
Training (GWO-RF)
10 1.000 8 1.000 8 1.000 8 1.000 8 32
20 1.000 8 1.000 8 1.000 8 1.000 8 32
30 0.9759 6 0.9664 6 1.000 8 0.9829 6 26
40 0.9910 7 0.9871 7 1.000 8 0.9935 7 29
50 0.9910 7 0.9871 7 1.000 8 0.9935 7 29
100 0.9759 6 0.9664 6 1.000 8 0.9829 6 26
150 0.9759 6 0.9664 6 1.000 8 0.9829 6 26
200 0.9759 6 0.9664 6 1.000 8 0.9829 6 26
Testing (GWO-RF)
10 0.9398 7 0.9333 7 0.9825 8 0.9573 7 29
20 0.9398 7 0.9333 7 0.9825 8 0.9573 7 29
30 0.9518 8 0.9492 8 0.9825 8 0.9655 8 32
40 0.9518 8 0.9492 8 0.9825 8 0.9655 8 32
50 0.9518 8 0.9492 8 0.9825 8 0.9655 8 32
100 0.9518 8 0.9492 8 0.9825 8 0.9655 8 32
150 0.9518 8 0.9492 8 0.9825 8 0.9655 8 32
200 0.9518 8 0.9492 8 0.9825 8 0.9655 8 32
5.2 Analysis and description of classification performance
This study systematically analysed and compared the classification performance
of GA-RF and GWO-RF based on Databases A, B and C in the training
and testing phases to explore an intelligent prediction model that is more suitable for
practical engineering problems. In addition, it is also necessary to determine whether
Table 7 The optimal parameters of two hybrid models for different databases

Database Database A Database B Database C
GA-RF
The optimal population size 150 100 100
n_estimators 91 267 70
max_depth 28 24 15
min_samples_split 2 4 3
GWO-RF
The optimal population size 10 100 20
n_estimators 170 216 115
max_depth 9 29 21
min_samples_leaf 2 3 3
GWO and GA improve the classification performance of RF. This can be determined
by including an unoptimized RF model in the comparison process.
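For reuse, the values of Table 7 can be collected in a small lookup structure, as in the sketch below. The dictionary layout and the helper `rf_kwargs` are hypothetical. The parameter names are copied from Table 7 and coincide with keyword arguments of scikit-learn's `RandomForestClassifier`; that the models were built with scikit-learn is an assumption, since the section does not name the library.

```python
# Optimal settings transcribed from Table 7, keyed by (model, database).
OPTIMAL_PARAMS = {
    ("GA-RF", "A"): {"population_size": 150, "n_estimators": 91,
                     "max_depth": 28, "min_samples_split": 2},
    ("GA-RF", "B"): {"population_size": 100, "n_estimators": 267,
                     "max_depth": 24, "min_samples_split": 4},
    ("GA-RF", "C"): {"population_size": 100, "n_estimators": 70,
                     "max_depth": 15, "min_samples_split": 3},
    ("GWO-RF", "A"): {"population_size": 10, "n_estimators": 170,
                      "max_depth": 9, "min_samples_leaf": 2},
    ("GWO-RF", "B"): {"population_size": 100, "n_estimators": 216,
                      "max_depth": 29, "min_samples_leaf": 3},
    ("GWO-RF", "C"): {"population_size": 20, "n_estimators": 115,
                      "max_depth": 21, "min_samples_leaf": 3},
}


def rf_kwargs(model, database):
    """Return only the RF hyperparameters (the optimizer's population size is dropped)."""
    params = dict(OPTIMAL_PARAMS[(model, database)])
    params.pop("population_size")
    return params


print(rf_kwargs("GA-RF", "A"))
```

If scikit-learn is available, the returned dictionary could be unpacked as `RandomForestClassifier(**rf_kwargs("GA-RF", "A"))`.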
In the training process of the models, each hybrid model can obtain 8 different
change curves of fitness with different numbers of iterations. All fitness change curves
generated by the two hybrid classification models based on Databases A, B and C are
displayed in Figs. 6, 7, 8. In these figures, the fitness change curves obtained by the
GA-RF model are on the left, while the figures on the right are related to the GWO-RF
model.
By comparing these figures, it is noted that when the optimization algorithm was the
GWO, the fitness change curves with different population sizes converged faster. It is
also observed that most of the GWO runs had stabilized before iteration number 50.
In the GA-RF model, as the number of iterations increased, the fitness change
curves with different population sizes fluctuated considerably, and the convergence process
lagged behind that of the GWO-RF model.
On the premise of the same dataset, the comparison of the performances of the two
hybrid models in this study was based on the optimal population size screened out above.
Table 8 demonstrates the performances of the GWO-RF and GA-RF models and a simple
(i.e., pre-developed) RF model using Databases A-C to test the classification models.
Fig. 6 Optimizing the RF model with GA (a1) and GWO (a2) for different population sizes based on Database A
Fig. 7 Optimizing the RF model with GA (b1) and GWO (b2) for different population sizes based on Database B
Fig. 8 Optimizing the RF model with GA (c1) and GWO (c2) for different population sizes based on Database C
Upon analysis of the data presented in Table 8, it can be seen that whether the GA or the
GWO algorithm was used to optimize the RF, the values of all measurement indicators
were better than those of the RF when the samples of the three testing datasets were
adopted for verification. Thus, the results indicate that GWO-RF and GA-RF can obtain
better classification accuracy and have better feasibility and reliability in processing
liquefaction datasets and applying them to engineering decisions compared with the
unoptimized basic classifier RF.
For the test sets separated from Databases A and B, the GWO-RF model was clearly
superior to the GA-RF model. In particular, when Database A was employed for
model construction, the scores of all indicators output by GWO-RF were 1.0 in both the
training and test phases, showing that this model classified all samples in Database A
into the correct categories. For Database C, the situation was reversed: all four
indicators of the GA-RF model were higher than those of the GWO-RF model, although the
differences were small. Notably, the recall rate obtained by GA-RF was 1.0, indicating
that this model correctly classified all the positive cases in the input database. To further
analyse the performance differences between the RF model and the two hybrid models
Table 8 Performance comparison of the two hybrid models and the pre-developed RF model

Model Accuracy Precision Recall F1
Database A (Training)
RF 1.0000 1.0000 1.0000 1.0000
GA-RF 1.0000 1.0000 1.0000 1.0000
GWO-RF 1.0000 1.0000 1.0000 1.0000
Database A (Testing)
RF 0.9565 1.0000 0.9259 0.9615
GA-RF 0.9783 1.0000 0.9630 0.9811
GWO-RF 1.0000 1.0000 1.0000 1.0000
Database B (Training)
RF 1.0000 1.0000 1.0000 1.0000
GA-RF 1.0000 1.0000 1.0000 1.0000
GWO-RF 1.0000 1.0000 1.0000 1.0000
Database B (Testing)
RF 0.8790 0.8913 0.8039 0.8454
GA-RF 0.8952 0.9130 0.8235 0.8660
GWO-RF 0.9113 0.9167 0.8627 0.8889
Database C (Training)
RF 1.0000 1.0000 1.0000 1.0000
GA-RF 1.0000 1.0000 1.0000 1.0000
GWO-RF 1.0000 1.0000 1.0000 1.0000
Database C (Testing)
RF 0.9277 0.9322 0.9649 0.9483
GA-RF 0.9639 0.9500 1.0000 0.9744
GWO-RF 0.9398 0.9333 0.9825 0.9573
as well as the differences between the two hybrid models, more analyses are needed for
a comprehensive comparison. The ROC curves of each model are shown in Fig. 9, and
the AUC values of the RF model and the two hybrid models with the optimal population
size are clearly demonstrated in this figure. As presented in Fig. 9a, the AUCs of both
hybrid models were 1.0, demonstrating the excellent performance of these two hybrid
classification models in dealing with Database A. For Databases B and C, the AUC
Fig. 9 ROC curves (true positive rate versus false positive rate) generated by the GA-RF, GWO-RF and RF models based on a Database A, b Database B and c Database C. AUC values shown in the panels: a GA-RF 1.0, GWO-RF 1.0, RF 1.0; b GA-RF 0.9600, GWO-RF 0.9640, RF 0.9562; c GA-RF 0.9771, GWO-RF 0.9838, RF 0.9750
values obtained by the GA-RF and GWO-RF models were higher than those of the RF
model.
When dealing with binary classification problems, the recall and precision rates are
often more convincing, and the P-R curve derived from these two metrics is considered
an important reference for binary classification problems (Saito and Rehmsmeier 2015).
Figure 10a, b and c shows the P-R curves of the three models when the input samples were
from Databases A, B and C, respectively. In the P-R curve, the abscissa represents the
recall rate, and the ordinate represents the precision rate. Although the area enclosed by
the coordinate axis and the curve can show the performance of the model intuitively, which
is similar to the ROC curve and the AUC, calculating this area becomes quite challeng-
ing. Therefore, this part only roughly judges the performance of the models through the
intuitive observation of the P-R curves for each model. Since the starting point of the P-R
curve is (0, 1) in the coordinate system, once the P-R curve output by a model is closer to
the upper right corner of the coordinate system, it indicates that the classification ability of
this model is better. In Fig. 10a, b and c, the P-R curves of the RF model were enclosed
by those of the two hybrid models. It is worth noting that for the two hybrid models,
although the shapes of the curves were different, the areas enclosed with the coordinate axes
were not significantly different. The P-R curves of the GWO-RF and GA-RF models in
Fig. 10c were close to the upper right corner of the graph. For Database B, the P-R curves
of the two hybrid models were not very close to the upper right corner of the figure, but
they were greatly improved compared with that of the RF model. This indicates that if the
P-R curve is employed as the evaluation standard of model performance, the GWO and GA
optimization algorithms do improve the performance of the RF model to a certain extent.
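Although the paper judges the P-R curves by visual inspection only, a numerical summary is possible. One common choice, shown here as an assumption and not as the authors' procedure, is the average precision: the precision at each threshold weighted by the increase in recall. The function names and the toy data below are hypothetical; labels are assumed to be 0 or 1 with at least one positive sample.

```python
def pr_points(y_true, scores):
    """(recall, precision) after each distinct threshold, highest score first."""
    pos = sum(y_true)
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        points.append((tp / pos, tp / (tp + fp)))
    return points


def average_precision(points):
    """Step-wise area under the P-R curve: sum of (recall increase) x precision."""
    ap, previous_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


# Hypothetical labels (1 = liquefied) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(average_precision(pr_points(labels, probs)))
```

A value closer to 1 corresponds to a curve closer to the upper right corner of the P-R plot.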
In addition, according to the definition of the confusion matrix (Sokolova and Lapalme
2009), the matrix for a binary classification problem contains four parts; the closer
the value in the upper left corner (true negative rate, TNR) and the value in the lower right
corner (true positive rate, TPR) are to 1, the better the performance of the model.
The confusion matrices obtained after testing the three models presented in
this research by using Databases A, B and C are displayed in Fig. 11a, b and c,
respectively. According to the data in the 9 confusion matrices, the TPR and TNR values of
the GA-RF and GWO-RF models were significantly higher than those of the RF model.
The TPR values of the GWO-RF on Databases A and B were 0.0741 and 0.0588 higher
than those of the unoptimized RF model, respectively. For Database C, the TPR and TNR of GA-RF were
Fig. 10 P-R curves generated by the GA-RF, GWO-RF and RF models based on a Database A, b Database B and c Database C
Fig. 11 Confusion matrices generated by the GA-RF, GWO-RF and RF models based on a Database A, b Database B and c Database C
0.0351 and 0.0384 higher than those of RF, respectively. These data show that the
classification performance of the RF classifier was significantly improved with the help of the
two optimization strategies. In addition, it is particularly noteworthy that the TNR and TPR
values of the GWO-RF model in Fig. 11a were both 1.0, which indicates that GWO-RF had
excellent performance on Database A. The TPR of the GWO-RF model based on
Databases A and B was 0.0370 and 0.0392 higher than that of GA-RF, respectively, which was
considered a large difference. For Database C, the TNR and TPR of GA-RF were 0.0384
and 0.0175 higher than those of the GWO-RF, respectively, which indicates
that the performance of GA-RF on Database C was superior to that of the GWO-RF.
5.3 Feature importance analysis
Since the three databases employed in this experiment were all in situ test datasets, each
characteristic variable had a different influence on the determination of liquefaction cases
or non-liquefaction cases. Some characteristic variables had a crucial impact on the classi-
fication of samples, while others had negligible effects. In addition, according to the study
of Ahmad et al. (2019b), the relationship between different characteristics and soil lique-
faction potential is not consistent. For example, the liquefaction potential of soil and the
destructive power of liquefaction increase with the earthquake magnitude Mw. Under the
same effective vertical stress S′vo, loose soils more easily liquefy than dense soils. Under
the condition of the same density, the soil under high effective stress is more prone to liq-
uefaction than that under low effective stress. In addition, the resistance to soil liquefaction
is inversely proportional to the depth of the ground water table dw. Therefore, it is neces-
sary to clarify the order of importance of the characteristic variables in the three databases
to provide a preliminary explanation for the prediction process of the models (Ahmad et al.
2019b).
The classification models selected in this study were constructed based on RF, which
is an integrated ML method that can effectively analyse the input characteristic variables.
Different from the classification model where the population size is a fixed value, the two
hybrid models that need to be compared in this study had the same 8 population sizes. In
this way, each classification model can obtain eight importance order charts of character-
istic variables. In this study, the importance order chart obtained by the hybrid model with
the optimal population size was selected to represent the feature importance order of the
same classification model based on the same dataset.
The ranking charts of the importance of characteristic variables obtained from the two
hybrid classification models based on the three databases are shown in Fig. 12. In this fig-
ure, parts (a), (b) and (c) show the order of feature importance of Databases A, B and C
input into the GA-RF model, and the corresponding ranking diagrams of feature impor-
tance of Databases A, B and C input into the GWO-RF model are shown in Fig. 12d, e and
f.
As shown in Fig. 12a and d, when the dataset input into the two models was Data-
base A, the order of importance of characteristic variables was consistent, and the order
from high to low was qc, amax, Rf, S2, S1, Mw. In the prediction of the target variable,
the variable with the highest degree of usefulness was qc, and the importance levels of
this variable were 0.3275 in GA-RF and 0.3257 in GWO-RF. Such high values indi-
cate that qc plays a nearly decisive role in the classification process of models, followed
by amax. The characteristic variable that plays the least role in the classification of the
target variable was Mw (the importance values of this variable in GA-RF and GWO-
RF were 0.0528 and 0.0498, respectively). For Database B, the order of importance of
characteristic variables was also consistent. The characteristic variables that play the
most critical role in the prediction of the two hybrid models in Database B were F75
and φ′. Among the 12 characteristic variables, the summed importance of these two variables
was 0.3786 in the GA-RF model and 0.3708 in the GWO-RF model. The variable with the
least influence was Mw (GA-RF: 0.0053, GWO-RF: 0.0064), and the importance of this
Fig. 12 Feature importance plots obtained by the two hybrid models: a, b, c Databases A, B and C as the input datasets of the GA-RF model, respectively; and d, e, f Databases A, B and C as the input datasets of the GWO-RF model, respectively
variable was close to 0 in both models, indicating that this variable had almost no effect
on the prediction of the target variable or that both hybrid models were insensitive to
this parameter. As shown in Fig. 12c and f, when the dataset used for model training
and testing was Database C, there was a small difference in the importance order of the
characteristic variables of the two models. The first ranked variable was VS1, whose
importance in the prediction process of the two hybrid models was more than
0.25, and the four most important variables were VS1, CSR, amax and Mw. The summed
importance of these four variables in the two hybrid models was 0.69, i.e., more than
two-thirds of the total.
6 Summary and conclusions
Due to the complexity of soil liquefaction and the randomness of its occurrence, assessing
the potential of soil liquefaction has always been a crucial problem for the scientific
community. In this paper, the GA and GWO were used to optimize the parameters of the RF
model to find more reasonable parameter combinations, yielding the GA-RF and GWO-RF
models. To that end, three different databases collected by the CPT, SPT and SWVT methods
were employed to estimate the potential of soil liquefaction. To compare the two hybrid
models, eight population sizes were considered for each hybrid model: 10, 20, 30, 40, 50,
100, 150 and 200. Then, in the training and testing stages of the models, the two hybrid
models with different population sizes were scored according to the four metrics, i.e., F1,
accuracy, recall and precision, to select the optimal population size for each model. In
addition to these four indicators, the ROC curve (especially the area under the curve, AUC),
P-R curve and confusion matrix were also used to analyse the performance of the models
proposed in this paper.
According to the results on the testing samples, the hybrid RF-based models performed
better than the single RF model across the various performance indices. In addition, the
comparison of the two hybrid models showed that the GWO-RF model was clearly better than
the GA-RF model for the CPT and SPT databases, especially for the CPT database (GWO-RF:
accuracy = 1.0, precision = 1.0, recall = 1.0, F1 = 1.0, AUC = 1.0; GA-RF: accuracy = 0.9783,
precision = 1.0, recall = 0.9630, F1 = 0.9811, AUC = 1.0). For the SWVT database, the
GA-RF model was judged to be superior according to the corresponding results (GA-RF:
accuracy = 0.9639, precision = 0.9500, recall = 1.0, F1 = 0.9744, AUC = 0.9771).
Furthermore, importance order graphs of the characteristic variables of each database were
also generated, which can directly reflect the relationships between different variables and
liquefaction potential.
The performance of the two hybrid models presented in this study is relatively satis-
factory, and the developed optimization algorithms were able to effectively improve the
accuracy of the RF model. However, there are limitations to the study. Although the feature
importance of variables obtained from the two hybrid models was analysed in this study,
feature screening was not carried out to further improve the classification accuracy of the
models. In addition, only the binary classification models used to predict soil liquefaction
potential were studied in this paper, and the sample sizes of all databases were relatively
limited.
Acknowledgements This research was funded by the Innovation‐Driven Project of Central South Univer-
sity (2020CX040), the National Natural Science Foundation of China (Nos. 52004161 and 42177164), and
the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (No. 2019ZT08G315).
References
Ahmad M, Tang XW, Qiu JN, Ahmad F (2019) Evaluating seismic soil liquefaction potential using bayesian
belief network and C4.5 decision tree approaches. Appl Sci-Basel 9(20):4226
Ahmad M, Tang XW, Qiu JN, Ahmad F (2019) Interpretive structural modeling and MICMAC analysis for
identifying and benchmarking significant factors of seismic soil liquefaction. Appl Sci-Basel 9(2):233
Ahmad M, Tang X-W, Qiu J-N, Ahmad F (2021a) Evaluation of liquefaction-induced lateral displacement
using Bayesian belief networks. Front Struct Civ Eng 15(1):80–98
Ahmad M, Tang X-W, Qiu J-N, Ahmad F, Gu W-J (2020a) A step forward towards a comprehensive frame-
work for assessing liquefaction land damage vulnerability: exploration from historical data. Front
Struct Civ Eng 14(6):1476–1491
Ahmad M, Tang X-W, Qiu J-N, Ahmad F, Gu W-J (2021b) Application of machine learning algorithms for
the evaluation of seismic soil liquefaction potential. Front Struct Civ Eng 15(2):490–505
Ahmad M, Tang X-W, Qiu J-N, Gu W-J, Ahmad F (2020b) A hybrid approach for evaluating CPT-
based seismic soil liquefaction potential using Bayesian belief networks. J Central South Univ
27(2):500–516
Ahmad M, Tang X, Ahmad F, Hadzima-Nyarko M, Nawaz A, Farooq A (2021c) Elucidation of seismic soil
liquefaction significant factors. In: Earthquakes. IntechOpen
Alobaidi MH, Meguid MA, Chebana F (2019) Predicting seismic-induced liquefaction through ensemble
learning frameworks. Sci Rep 9:12
Amirsadri S, Mousavirad SJ, Ebrahimpour-Komleh H (2018) A Levy flight-based grey wolf optimizer
combined with back-propagation algorithm for neural network training. Neural Comput Appl
30(12):3707–3720
Andrus RD, Stokoe KH (2000) Liquefaction resistance of soils from shear-wave velocity. J Geotech Geoen-
viron Eng 126(11):1015–1025
Armaghani DJ, Harandizadeh H, Momeni E, Maizir H, Zhou J (2021a) An optimized system of GMDH-
ANFIS predictive model by ICA for estimating pile bearing capacity. Artif Intell Rev. https://doi.org/
10.1007/s10462-021-10065-5
Armaghani DJ, Yagiz S, Mohamad ET, Zhou J (2021b) Prediction of TBM performance in fresh through
weathered granite using empirical and statistical approaches. Tunnel Undergr Space Technol
118:104183
Belgiu M, Dragut L (2016) Random forest in remote sensing: a review of applications and future directions.
ISPRS J Photogramm Rem Sens 114:24–31
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Bui XN, Nguyen H, Choi Y, Nguyen-Thoi T, Zhou J, Dou J (2020) Prediction of slope failure in open-pit
mines using a novel hybrid artificial intelligence model based on decision tree and evolution algo-
rithm. Sci Rep 10(1):1–17
Cai M, Hocine O, Mohammed AS, Chen X, Amar MN, Hasanipanah M (2021) Integrating the LSSVM
and RBFNN models with three optimization algorithms to predict the soil liquefaction potential. Eng
Comput. https://doi.org/10.1007/s00366-021-01392-w
Cetin KO, Seed RB, Der Kiureghian A, Tokimatsu K, Harder LF, Kayen RE, Moss RES (2004) Standard
penetration test-based probabilistic and deterministic assessment of seismic soil liquefaction poten-
tial. J Geotech Geoenviron Eng 130(12):1314–1340
Chen G, Kong M, Khoshnevisan S, Chen W, Li X (2019) Calibration of Vs-based empirical models for
assessing soil liquefaction potential using expanded database. Bull Eng Geol Env 78(2):945–957
Chern SG, Lee CY (2009) CPT-based simplified liquefaction assessment by using fuzzy-neural network. J
Marine Sci Technol-Taiwan 17(4):326–331
Chern SG, Lee CY, Wang CC (2008) CPT-based liquefaction assessment by using fuzzy-neural network. J
Marine Sci Technol-Taiwan 16(2):139–148
El Mohtar CS, Bobet A, Drnevich VP, Johnston CT, Santagata MC (2014) Pore pressure generation in sand
with bentonite: from small strains to liquefaction. Geotechnique 64(2):108–117
Erzin Y, Ecemis N (2015) The use of neural networks for CPT-based liquefaction screening. Bull Eng Geol
Env 74(1):103–116
Fang Q, Nguyen H, Bui XN, Nguyen-Thoi T, Zhou J (2021) Modeling of rock fragmentation by firefly
optimization algorithm and boosted generalized additive model. Neural Comput Appl 33(8):3503–3519
Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27(8):861–874
Feng DC, Cetiner B, Kakavand MRA, Taciroglu E (2021) Data-driven approach to predict the plastic hinge
length of reinforced concrete columns and its application. J Struct Eng 147(2):04020332
Feng DC, Liu ZT, Wang XD, Jiang ZM, Liang SX (2020) Failure mode classification and bearing capacity
prediction for reinforced concrete columns based on ensemble machine learning algorithm. Adv Eng
Inform 45:101126
Genuer R, Poggi J-M, Tuleau-Malot C (2010) Variable selection using random forests. Pattern Recogn Lett
31(14):2225–2236
Goh ATC (1996) Neural-network modeling of CPT seismic liquefaction data. J Geotech Engng, ASCE
122(1):70–73
Goh ATC (2002) Probabilistic neural network for evaluating seismic liquefaction potential. Can Geotech J
39(1):219–232
Goh ATC, Goh SH (2007) Support vector machines: their use in geotechnical engineering as illustrated
using seismic liquefaction data. Comput Geotech 34(5):410–421
Golafshani EM, Behnood A, Arashpour M (2020) Predicting the compressive strength of normal and
high-performance concretes using ANN and ANFIS hybridized with grey wolf optimizer. Constr Build
Mater 232:117266
Guo H, Zhou J, Koopialipoor M, Armaghani DJ, Tahir MM (2021) Deep neural network and whale
optimization algorithm to assess flyrock induced by blasting. Eng Comput 37(1):173–186
Hanna AM, Ural D, Saygili G (2007a) Evaluation of liquefaction potential of soil deposits using artificial
neural networks. Eng Comput 24(1–2):5–16
Hanna AM, Ural D, Saygili G (2007b) Neural network model for liquefaction potential in soil deposits using
Turkey and Taiwan earthquake data. Soil Dyn Earthq Eng 27(6):521–540
Harris JR, Grunsky EC (2015) Predictive lithological mapping of Canada’s North using Random Forest
classification applied to geophysical and geochemical data. Comput Geosci 80:9–25
Hassan H, Badr A, Abdelhalim MB (2015) Prediction of O-glycosylation sites using random forest and
GA-Tuned PSO technique. Bioinform Biol Insights 9:103–109
Heidari T, Andrus RD (2012) Liquefaction potential assessment of Pleistocene beach sands near Charleston,
South Carolina. J Geotech Geoenviron Eng 138(10):1196–1208
Ho T (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach
Intell 20(8):832–844
Hoang ND, Bui DT (2018) Predicting earthquake-induced soil liquefaction based on a hybridization of
kernel Fisher discriminant analysis and a least squares support vector machine: a multi-dataset study.
Bull Eng Geol Env 77(1):191–204
Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
Hu J-L, Tang X-W, Qiu J-N (2015) A Bayesian network approach for predicting seismic liquefaction
based on interpretive structural modeling. Georisk-Assessment Manage Risk Eng Syst Geohazards
9(3):200–217
Hu J-L, Tang X-W, Qiu J-N (2016) Assessment of seismic liquefaction potential based on Bayesian network
constructed from domain knowledge and history data. Soil Dyn Earthq Eng 89:49–60
Hu J, Liu H (2019a) Bayesian network models for probabilistic evaluation of earthquake-induced
liquefaction based on CPT and Vs databases. Eng Geol 254:76–88
Hu J, Liu H (2019b) Identification of ground motion intensity measure and its application for predicting soil
liquefaction potential based on the Bayesian network method. Eng Geol 248:34–49
Idriss IM, Boulanger RW (2006) Semi-empirical procedures for evaluating liquefaction potential during
earthquakes. Soil Dyn Earthq Eng 26(2–4):115–130
Juang CH, Chen CJ, Jiang T, Andrus RD (2000a) Risk-based liquefaction potential evaluation using
standard penetration tests. Can Geotech J 37(6):1195–1208
Juang CH, Chen CJ, Tang WH, Rosowsky DV (2000b) CPT-based liquefaction analysis, Part 1:
determination of limit state function. Geotechnique 50(5):583–592
Juang CH, Ching JY, Luo Z, Ku CS (2012) New models for probability of liquefaction using standard
penetration tests based on an updated database of case histories. Eng Geol 133:85–93
Juang CH, Jiang T, Andrus RD (2002) Assessing probability-based methods for liquefaction potential
evaluation. J Geotech Geoenviron Eng 128(7):580–589
Juang CH, Yuan HM, Lee DH, Lin PS (2003) Simplified cone penetration test-based method for evaluating
liquefaction resistance of soils. J Geotech Geoenviron Eng 129(1):66–80
Kayen R, Moss RES, Thompson EM, Seed RB, Cetin KO, Kiureghian AD, Tanaka Y, Tokimatsu K (2013)
Shear-wave velocity-based probabilistic and deterministic assessment of seismic soil liquefaction
potential. J Geotech Geoenviron Eng 139(3):407–419
Kohestani VR, Hassanlourad M, Ardakani A (2015) Evaluation of liquefaction potential based on CPT data
using random forest. Nat Hazards 79(2):1079–1089
Lee C-Y, Chern S-G (2013) Application of a support vector machine for liquefaction assessment. J Mar Sci
Technol 21(3):318–324
Le LT, Nguyen H, Dou J, Zhou J (2019a) A comparative study of PSO-ANN, GA-ANN, ICA-ANN, and
ABC-ANN in estimating the heating load of buildings’ energy efficiency for smart city planning.
Appl Sci 9(13):2630
Le LT, Nguyen H, Zhou J, Dou J, Moayedi H (2019b) Estimating the heating load of buildings for smart city
planning using a novel artificial intelligence technique PSO-XGBoost. Appl Sci 9(13):2714
Li E, Yang F, Ren M, Zhang X, Zhou J, Khandelwal M (2021a) Prediction of blasting mean fragment size
using support vector regression combined with five optimization algorithms. J Rock Mech Geotech
Eng 13(6):1380–1397
Li E, Zhou J, Shi X, Armaghani DJ, Yu Z, Chen X, Huang P (2021b) Developing a hybrid model of salp
swarm algorithm-based support vector machine to predict the strength of fiber-reinforced cemented
paste backfill. Eng Comput 37(4):3519–3540
Lianyang Z (1998) Predicting seismic liquefaction potential of sands by optimum seeking method. Soil Dyn
Earthq Eng 17(4):219–226
Mahmood A, Tang X-W, Qiu J-N, Gu W-J, Feezan A (2020) A hybrid approach for evaluating
CPT-based seismic soil liquefaction potential using Bayesian belief networks. J Central South Univ
27(2):500–516
Maroufpoor S, Maroufpoor E, Bozorg-Haddad O, Shiri J, Yaseen ZM (2019) Soil moisture simulation using
hybrid artificial intelligent model: Hybridization of adaptive neuro fuzzy inference system with grey
wolf optimizer algorithm. J Hydrol 575:544–556
Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
Moss RES, Seed RB, Kayen RE, Stewart JP, Kiureghian AD, Cetin KO (2006) CPT-based probabilistic
and deterministic assessment of in situ seismic soil liquefaction potential. J Geotech Geoenviron Eng
132(8):1032–1051
Pal M (2006) Support vector machines-based modelling of seismic liquefaction potential. Int J Numer Anal
Meth Geomech 30(10):983–996
Qiu Y, Zhou J, Khandelwal M, Yang H, Yang P, Li C (2021) Performance evaluation of hybrid
WOA-XGBoost, GWO-XGBoost and BO-XGBoost models to predict blast-induced ground vibration. Eng
Comput. https://doi.org/10.1007/s00366-021-01393-9
Rahbarzare A, Azadi M (2019) Improving prediction of soil liquefaction using hybrid optimization
algorithms and a fuzzy support vector machine. Bull Eng Geol Env 78(7):4977–4987
Rezania M, Javadi AA, Giustolisi O (2010) Evaluation of liquefaction potential based on CPT results using
evolutionary polynomial regression. Comput Geotech 37(1–2):82–92
Saito T, Rehmsmeier M (2015) The precision-recall plot is more informative than the ROC plot when
evaluating binary classifiers on imbalanced datasets. PLoS One 10(3):e0118432
Samui P (2007) Seismic liquefaction potential assessment by using relevance vector machine. Earthq Eng
Eng Vib 6(4):331–336
Samui P, Hariharan R (2015) A unified classification model for modeling of seismic liquefaction potential
of soil based on CPT. J Adv Res 6(4):587–592
Samui P, Karthikeyan J (2013) Determination of liquefaction susceptibility of soil: a least square support
vector machine approach. Int J Numer Anal Meth Geomech 37(9):1154–1161
Samui P, Karthikeyan J (2014) The use of a relevance vector machine in predicting liquefaction potential.
Indian Geotech J 44(4):458–467
Samui P, Kim D, Sitharam TG (2011) Support vector machine for evaluating seismic-liquefaction potential
using shear wave velocity. J Appl Geophys 73(1):8–15
Samui P, Sitharam TG (2011) Machine learning modelling for predicting soil liquefaction susceptibility. Nat
Hazards Earth Syst Sci 11(1):1–9
Seed HB, Idriss IM (1971) Simplified procedure for evaluating soil liquefaction potential. J Soil Mech
Found Eng Div ASCE 97(9):1249–1273
Seed HB, Idriss IM, Arango I (1983) Evaluation of liquefaction potential using field performance data. J
Geotech Eng Div ASCE 109(3):458–482
Seo MW, Olson SM, Sun CG, Oh MH (2012) Evaluation of liquefaction potential index along western coast
of South Korea using SPT and CPT. Mar Georesour Geotechnol 30(3):234–260
Shahri AA (2016) Assessment and prediction of liquefaction potential using different artificial neural
network models: a case study. Geotech Geol Eng 34(3):807–815
Sokolova M, Lapalme G (2009) A systematic analysis of performance measures for classification tasks. Inf
Process Manage 45(4):427–437
Ter-Martirosyan A, Le Duc A (2020) Calculation of the settlement of pile foundations taking into account
the influence of soil liquefaction. In XXIII International Scientific Conference on Advance in Civil
Engineering: "Construction - The Formation of Living Environment" (FORM-2020), 23–26 Sept.
2020. IOP Publishing, UK, vol. 869, pp. 052025 (9 pp.)
Wang JH, Yan WZ, Wan ZJ, Wang Y, Lv JK, Zhou AP (2020) Prediction of permeability using random
forest and genetic algorithm model. CMES-Comput Model Eng Sci 125(3):1135–1157
Xie C, Nguyen H, Bui XN, Choi Y, Zhou J, Nguyen-Trang T (2021) Predicting rock size distribution in
mine blasting using various novel soft computing models based on meta-heuristics and machine
learning algorithms. Geosci Front 12(3):101108
Xue XH, Xiao M (2016) Application of genetic algorithm-based support vector machines for prediction
of soil liquefaction. Environ Earth Sci 75(10):11
Xue XH, Yang XG (2013) Application of the adaptive neuro-fuzzy inference system for prediction of
soil liquefaction. Nat Hazards 67(2):901–917
Xue XH, Yang XG (2016) Seismic liquefaction potential assessed by support vector machines
approaches. Bull Eng Geol Env 75(1):153–162
Ye X, Dong L-A, Ma D (2018) Loan evaluation in P2P lending based on random forest optimized by
genetic algorithm with profit score. Electron Commer Res Appl 32:23–36
Yong W, Zhou J, Jahed Armaghani D, Tahir MM, Tarinejad R, Pham BT, Van Huynh V (2021) A new
hybrid simulated annealing-based genetic programming technique to predict the ultimate bearing
capacity of piles. Eng Comput 37(3):2111–2127
Youd TL, Idriss IM (2001) Liquefaction resistance of soils: summary report from the 1996 NCEER and
1998 NCEER/NSF workshops on evaluation of liquefaction resistance of soils. J Geotech
Geoenviron Eng 127(4):297–313
Youd TL, Idriss IM, Andrus RD, Arango I, Castro G, Christian JT, Dobry R, Finn WDL, Harder LF,
Hynes ME, Ishihara K, Koester JP, Liao SSC, Marcuson WF, Martin GR, Mitchell JK, Moriwaki
Y, Power MS, Robertson PK, Seed RB, Stokoe KH (2001) Liquefaction resistance of soils:
summary report from the 1996 NCEER and 1998 NCEER/NSF workshops on evaluation of
liquefaction resistance of soils. J Geotech Geoenviron Eng 127(10):817–833
Yu Z, Shi XZ, Zhou J, Chen X, Miao XH, Teng B, Ipangelwa T (2020) Prediction of blast-induced rock
movement during bench blasting: use of gray wolf optimizer and support vector regression. Nat
Resour Res 29(2):843–865
Yu Z, Shi X, Miao X, Zhou J, Khandelwal M, Chen X, Qiu Y (2021) Intelligent modeling of
blast-induced rock movement prediction using dimensional analysis and optimized artificial neural
network technique. Int J Rock Mech Min Sci 143:104794
Zhang G, Robertson PK, Brachman RWI (2004) Estimating liquefaction-induced lateral
displacements using the standard penetration test or cone penetration test. J Geotech Geoenviron Eng
130(8):861–871
Zhang JF, Wang YH (2021) An ensemble method to improve prediction of earthquake-induced soil
liquefaction: a multi-dataset study. Neural Comput Appl 33(5):1533–1546
Zhang Y-G, Qiu J, Zhang Y, Wei Y (2021a) The adoption of ELM to the prediction of soil liquefaction
based on CPT. Nat Hazards 107(1):539–549
Zhang Y, Qiu J, Zhang Y, Xie Y (2021b) The adoption of a support vector machine optimized by GWO
to the prediction of soil liquefaction. Environ Earth Sci 80(9):1–9
Zhao Z, Duan W, Cai G (2021) A novel PSO-KELM based soil liquefaction potential evaluation system
using CPT and Vs measurements. Soil Dyn Earthq Eng 150:106930
Zhou J, Chen C, Armaghani DJ, Ma S (2020a) Developing a hybrid model of information entropy and
unascertained measurement theory for evaluation of the excavatability in rock mass. Eng Comput.
https://doi.org/10.1007/s00366-020-01053-4
Zhou J, Huang S, Wang M, Qiu Y (2021a) Performance evaluation of hybrid GA-SVM and GWO-SVM
models to predict earthquake-induced liquefaction potential of soil: a multi-dataset investigation.
Eng Comput. https://doi.org/10.1007/s00366-021-01418-3
Zhou J, Koopialipoor M, Li E, Armaghani DJ (2020b) Prediction of rockburst risk in underground
projects developing a neuro-bee intelligent system. Bull Eng Geol Env 79(8):4265–4279
Zhou J, Li C, Arslan CA, Hasanipanah M, Bakhshandeh Amnieh H (2021b) Performance evaluation
of hybrid FFA-ANFIS and GA-ANFIS models to predict particle size distribution of a muck-pile
after blasting. Eng Comput 37(1):265–274
Zhou J, Li C, Koopialipoor M, Armaghani DJ, Pham BT (2021c) Development of a new methodology
for estimating the amount of PPV in surface mines based on prediction and probabilistic models
(GEP-MC). Int J Min Reclam Environ 35(1):48–68
Zhou J, Li E, Wei H, Li C, Qiao Q, Armaghani DJ (2019a) Random forests and cubist algorithms for
predicting shear strengths of rockfill materials. Appl Sci-Basel. https://doi.org/10.3390/app9081621
Zhou J, Li E, Yang S, Wang M, Shi X, Yao S, Mitri HS (2019b) Slope stability prediction for circular
mode failure using gradient boosting machine approach based on an updated database of case
histories. Saf Sci 118:505–518
Zhou J, Li EM, Wang MZ, Chen X, Shi XZ, Jiang LS (2019c) Feasibility of stochastic gradient
boosting approach for evaluating seismic liquefaction potential based on SPT and CPT case histories. J
Perform Constr Facil 33(3):04019024
Zhou J, Li XB, Mitri HS (2015) Comparative performance of six supervised learning methods for the
development of models of hard rock pillar stability prediction. Nat Hazards 79(1):291–316
Zhou J, Li XB, Mitri HS (2016) Classification of rockburst in underground projects: comparison of ten
supervised learning methods. J Comput Civil Eng 30(5):04016003
Zhou J, Qiu Y, Armaghani DJ, Zhang W, Li C, Zhu S, Tarinejad R (2021d) Predicting TBM penetration rate
in hard rock condition: a comparative study among six XGB-based metaheuristic techniques. Geosci
Front 12(3):101091
Zhou J, Qiu Y, Zhu S, Armaghani DJ, Khandelwal M, Mohamad ET (2020c) Estimation of the TBM
advance rate under hard rock conditions using XGBoost and Bayesian optimization. Undergr Space.
https://doi.org/10.1016/j.undsp.2020.05.008
Zhou J, Qiu Y, Zhu S, Armaghani DJ, Li C, Nguyen H, Yagiz S (2021e) Optimization of support vector
machine through the use of metaheuristic algorithms in forecasting TBM advance rate. Eng Appl
Artif Intell 97:104015
Zhou J, Qiu Y, Khandelwal M, Zhu S, Zhang X (2021f) Developing a hybrid model of Jaya algorithm-based
extreme gradient boosting machine to estimate blast-induced ground vibrations. Int J Rock Mech Min
Sci 145:104856
Zhou J, Zhu S, Qiu Y, Armaghani DJ, Zhou A, Yong W (2022) Predicting tunnel squeezing using support
vector machine optimized by whale optimization algorithm. Acta Geotechnica 1–24.
https://doi.org/10.1007/s11440-022-01450-7
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Authors and Affiliations
Jian Zhou1 · Shuai Huang1 · Tao Zhou2 · Danial Jahed Armaghani3 · Yingui Qiu1

Shuai Huang
205511038@csu.edu.cn
Danial Jahed Armaghani
danialarmaghani@susu.ru
Yingui Qiu
195512085@csu.edu.cn
1 School of Resources and Safety Engineering, Central South University, Changsha 410083, China
2 Guangdong Provincial Key Laboratory of Deep Earth Sciences and Geothermal Energy
Exploitation and Utilization, Institute of Deep Earth Sciences and Green Energy, College of Civil
and Transportation Engineering, Shenzhen University, Shenzhen, China
3 Department of Urban Planning, Engineering Networks and Systems, Institute of Architecture
and Construction, South Ural State University, 76, Lenin Prospect, Chelyabinsk 454080, Russia