DOI: 10.1002/aic.18115
RESEARCH ARTICLE
Process Systems Engineering
1 Department of Process and Environmental Engineering, Biomaterials and Transport Phenomena Laboratory (LBMPT), University of Yahia Fares, Faculty of Technology, Medea, Algeria
2 Univ Rennes, Ecole Nationale Supérieure de Chimie de Rennes, CNRS, Rennes, France
Correspondence
Imane Euldji, Department of Process and Environmental Engineering, Biomaterials and Transport Phenomena Laboratory (LBMPT), University of Yahia Fares, Faculty of Technology, Medea 26000, Algeria.
Email: euldji.imane@univ-medea.dz; imene.eull@gmail.com
Funding information
Algerian Ministry of Higher Education and Scientific Research, Grant/Award Number: PRFU Project A16N01UN260120220003; the University Yahia Fares of Medea
Abstract
The purpose of this work was to compare the performance of seven meta-heuristic algorithms, namely the Dragonfly (DA), Ant Lion (ALO), Grey Wolf (GWO), Artificial Bee Colony (ABC), Particle Swarm (PSO), Whale (WOA), and hybrid Particle Swarm with Grey Wolf (HPSOGWO) optimizers, in fine-tuning the hyper-parameters of a hybrid quantitative structure-property relationship (QSPR)-support vector regression (SVR) model for predicting the mole fraction solubilities of drug compounds in supercritical carbon dioxide (SC-CO2). A dataset of 168 drug compounds, 13 inputs, and 4490 experimental data points was used. All seven models were validated statistically and graphically, and the HPSOGWO-SVR model was found to outperform the others, with an average absolute relative deviation (AARD) of 0.706% and an AIC of 14,434.249. The model was subjected to an external test (validation) using 160 experimental data points that were not used in the training and test sets. The overall results showed that the obtained model has good predictive ability and robustness.
KEYWORDS
drugs, metaheuristic algorithms, solubility, supercritical carbon dioxide, support vector
regression
1 | INTRODUCTION
“Finish early, but with fewer errors.” For decades, this concept has been adopted and held sacred by the pharmaceutical industry. When it comes to human life, any mistake or delay is indeed forbidden. From their discovery to their validation, new drug candidates pass through a long and complicated process full of obstacles and issues, usually related to ineffectiveness caused by unsatisfactory pharmacokinetics and pharmacodynamics, and hence bioavailability. This ineffectiveness arises from five critical compound properties in early compound screening: solubility, dissociation constant, permeability, stability, and lipophilicity.1,2
For decades, it has been clearly demonstrated that solubility is the major factor affecting the in vivo systemic exposure of drug candidates.3–5 Solubility can be considered a major hurdle in both the preclinical and clinical phases, where statistics have shown that more than 70% of drug candidates were eliminated in these stages due to poor water solubility. In other words, solubility plays a key role in formulation development to achieve the bioavailability and therapeutic action of the drug at the target site.6 As explained by Kakran et al.,7 numerous techniques have been used to overcome the challenge of poor water solubility, and they can be divided into three main categories: physical modification, chemical modification, and miscellaneous methods.2 Supercritical fluid
technology (SCF), particle size reduction, complexation, co-solvency, and electrospinning are the well-known methods adopted,8–31 while SCF is the most promising approach in pharmaceutical research32 thanks to its environmental safety and economy. The most widely used SCF is carbon dioxide (SC-CO2), accounting for almost 98% of the applications in the pharmaceutical industry,33 due to its many advantages: a low critical temperature (Tc = 31.10 °C) and critical pressure (Pc = 73.8 bar), and the fact that it is nontoxic, nonflammable, and inexpensive.32 Ideally, the solubility of pharmaceutical molecules, whether in SC-CO2 or not, should be known for all compounds over the temperature and pressure ranges and mixture compositions for which data are required. This can only be achieved through computational methods for solubility prediction, given that experimental studies are very expensive and time-consuming. Consequently, efforts to develop reliable models for predicting water solubility have led to a significant number of publications, in which different approaches were proposed. A review of the literature indicates that the solubility of solid compounds in SCF is usually predicted using a thermodynamic approach based on equations of state (EoS), which is mainly divided into three classes (cubic EoS, complex noncubic EoS, and the conductor-like screening model (COSMO) based on quantum mechanical calculation), empirical or semi-empirical correlations, or intelligent computer techniques. More details are available in a previous paper.34
Reddy and Garlapati35 proposed a new empirical model, developed on the basis of a degree-of-freedom analysis, to correlate the solubilities of 27 industrially important pharmaceutical compounds in SC-CO2; the proposed model was found to correlate better in terms of average absolute relative deviation (AARD) than the existing models mentioned in the article. Amooey36 contributed an empirical correlation based on temperature and density models for 31 solubility data series of different drugs in SC-CO2; the model generally resulted in minimum values of AARE% compared to commonly used semi-empirical models. Nejad et al.37 built an empirical equation using 16 publications to correlate the solute solubility in SC-CO2 with the temperature, pressure, and density of pure CO2; the model produces a reasonably accurate correlation of the mole fraction solubility of solutes in SC-CO2 with respect to pressure, temperature, and density of pure SC-CO2. Belghait et al.38 performed a comparison study of the correlation performance of 21 semi-empirical models using a dataset of 210 solid solutes in SC-CO2, counting 5550 data points, and proposed a density-based semi-empirical model. Si-Moussa et al.39 investigated a density-based model obtained by a simple modification of the six parameters of Jouyban's model to correlate the solubility of 100 drugs, accounting for 2891 experimental data points. Su and Chen40 applied the regular solution model with the Flory–Huggins equation to correlate the solubility of 60 pharmaceutical molecules in SC-CO2. Yazdizadeh et al.41 studied the effect of applying cubic equations of state and mixing rules to calculate the solubilities of 52 solid compounds in SC-CO2, accounting for 1776 experimental data points. Wang and Lin42 used the Peng–Robinson with COSMOSAC EoS to predict the solubility of 46 drugs in SC-CO2, accounting for 1160 data points. Eric et al.43 applied a reliable model for the prediction of aqueous solubility based on the implementation of an algorithm for the automatic adjustment of the descriptors' relative importance in counter-propagation artificial neural networks, with a dataset of 374 diverse drug-like molecules. Baghban et al.44 utilized a least squares support vector machine (LSSVM) with an RBF kernel function, with and without coupling to PSO, to estimate the solubility of 33 different drug compounds in SC-CO2 based on their physical properties. El Hadj et al.45 established a hybrid model based on a feed-forward artificial neural network and the particle swarm optimization algorithm (PSO-ANN) for predicting the solubility of solid drugs in SC-CO2. Lasharboloki et al.46 and Mehdizadeh and Movagharnejad47 performed comparison studies between ANN and thermodynamic modeling approaches for predicting the solubility of compounds in SC-CO2. Sodeifian et al.48 performed a comprehensive comparison among four approaches, namely EoS, empirical and semi-empirical models, solution models, and the ANN method, for correlating the solubilities of different pharmaceutical compounds in SC-CO2. Mehdizadeh and Movagharnejad49 performed a comparison study between the GA-LSSVM algorithm and seven semi-empirical equations for modeling the solubility of 25 solutes in SC-CO2. Vatani et al.50 tested the ability of methods based on empirical equations and fuzzy-genetic models to estimate the solubility of solutes in SC-CO2. The proposed models were compared to other empirical models. Huuskonen et al.51 performed a QSAR study on a set of 191 drug-like compounds extracted from the AQUASOL database, based on their structural and physicochemical properties, to predict water solubility. Gurikov et al.52 used a hybrid thermodynamic/QSPR approach to develop a robust model that would allow solubility prediction based entirely on molecular structure and the parameters of two models (the Chrastil equation and the modified, expanded liquid model), involving more than 300 original publications of solubility data for high-boiling point compounds in SC-CO2 and its mixtures with modifiers. Valenzuela Roediger et al.53 developed a hybrid QSAR-semiempirical model to predict the parameters of the Chrastil solubility equation using a group of compounds in SC-CO2 under different pressure and temperature conditions. In a previous study, we contributed an ANN-QSPR model for predicting the solubility of drug compounds in SC-CO2 with a dataset of 148 molecules accounting for 3971 EDP.34 None of the approaches mentioned earlier proved to be superior to the others. However, it should be noted that machine learning approaches such as the support vector machine, which has recently been improved by combining it with meta-heuristic techniques for better convergence to the desired solutions, are currently one of the hottest topics in various research fields.54–69 Classical optimization techniques are usually time-consuming and do not provide an exact or feasible solution. Therefore, there is a need to implement meta-heuristic algorithms. The term meta-heuristics refers to global search techniques, or all modern nature-inspired optimization algorithms, used in various problems where the solution methods are inexact and near-optimal. These algorithms are highly effective and have general applications with high performance and a great driving force for optimality. Different nature-inspired systems can be introduced, such as the genetic algorithm (GA) and the particle swarm optimizer (PSO). The optimal method for a specific problem can only be selected using comparison trials.70
The aim of this study was to select the best optimization algorithm for fine-tuning the support vector regression algorithm, in the form of a comparison study between seven selected meta-heuristic algorithms: the Dragonfly (DA), Ant Lion (ALO), Grey Wolf (GWO), Artificial Bee Colony (ABC), Particle Swarm (PSO), Whale (WOA), and hybrid Particle Swarm with Grey Wolf (HPSOGWO) optimizers.
EULDJI ET AL. 3 of 17
On the other hand, this article aims to validate a hybrid QSPR-SVR model able to predict and correlate the solubility of drug compounds in SC-CO2 with acceptable accuracy using various statistical and graphical appraisals, including an external test.
2 | EXPERIMENTAL DATABASE
A total of 168 drug compounds accounting for 4490 experimental data points (EDP) were collected, with solubilities (mole fraction 1 × 10^-9 to 0.131) measured in supercritical carbon dioxide in the temperature range of 298–373.15 K and the pressure range of 80–500 bar (the detailed drug solubility data are tabulated in the Supporting Information). These data were updated from previous work.34 In that work, QSPR/QSAR modeling was applied to 148 drug compounds (3971 EDP) to determine the most suitable combination of features able to predict and correlate this property. The result of that study met all the OECD principles for QSAR validation and showed that a combination of 13 descriptors is sufficient for modeling the solubility. The selected PaDEL descriptors were as follows: AATS3v, MATS2e, GATS4c, GATS3v, GATS4e, GATS3s, nBondsM, AVP-0, SHBd, MLogP, and MLFER_S, with the addition of T(K) and P(MPa). The output property was converted to the corresponding
$$\text{Maximize}\ \left\{ -\frac{1}{2}\sum_{i,j=1}^{n}(\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)\,\varphi(x_i)\cdot\varphi(x_j) - \epsilon\sum_{i=1}^{n}(\alpha_i+\alpha_i^*) + \sum_{i=1}^{n} y_i(\alpha_i-\alpha_i^*) \right\} \tag{3}$$
$$\text{Subject to}\ \begin{cases} \sum_{i=1}^{n}(\alpha_i-\alpha_i^*) = 0 \\ 0 \le \alpha_i \le C, & i = 1, 2, 3, \dots, n \\ 0 \le \alpha_i^* \le C, & i = 1, 2, 3, \dots, n \end{cases}$$
Here $\alpha_i$, $\alpha_i^*$ are nonlinear Lagrangian multipliers. Solving the dual maximization problem in Equation (3) gives the SVR function:
$$F(x, \alpha_i, \alpha_i^*) = \sum_{i=1}^{n}(\alpha_i-\alpha_i^*)\,\varphi(x_i)\cdot\varphi(x_j) + b \tag{4}$$
The vector inner product $\varphi(x_i)\cdot\varphi(x_j)$ represents the mapping function. That is, it can be replaced by a kernel function $k(x_i, x_j)$, as shown in Equation (5):
$$F(x, \alpha_i, \alpha_i^*) = \sum_{i=1}^{n}(\alpha_i-\alpha_i^*)\,k(x_i, x_j) + b \tag{5}$$
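As a minimal sketch of how the kernel form of the SVR function in Equation (5) is evaluated for a query point, the following Python snippet may help. It assumes a Gaussian (RBF) kernel with the scaling 1/(2σ²), which is a common convention but is not stated in this section; the function names and the toy support vectors are hypothetical.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian (RBF) kernel; the 1/(2*sigma**2) scaling is an assumption."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Equation (5): F(x) = sum_i (alpha_i - alpha_i*) k(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* that belongs
    to support_vectors[i].
    """
    return sum(
        coef * rbf_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + b


# Hypothetical toy model with two support vectors in a 2-D input space.
svs = [[0.0, 0.0], [1.0, 1.0]]
coefs = [2.0, -1.0]
print(svr_predict([0.0, 0.0], svs, coefs, b=0.5, sigma=1.0))
```

Only training points with a nonzero difference between the two multipliers contribute to the sum, which is why the prediction is written over support vectors rather than over the whole training set.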
$$S_i = -\sum_{j=1}^{n}\left(X - X_j\right) \tag{7}$$
where $X$ and $X_j$ are the positions of the current individual and the jth neighboring individual, respectively, and n is the number of neighboring individuals.
Alignment
This step represents the velocity matching between dragonflies of the same group. It is given by:
$$A_i = \frac{\sum_{j=1}^{n} V_j}{n} \tag{8}$$
where $V_j$ is the velocity of neighboring individual j.
Cohesion
Cohesion refers to the tendency of members to move toward the center of mass of the neighborhood. It can be defined as:
$$C_i = \frac{\sum_{j=1}^{n} X_j}{n} - X \tag{9}$$
Attraction to food
Since the individuals' aim is to survive, all dragonflies must move toward the food, as shown by the following formula:
$$F_i = X^{+} - X \tag{10}$$
where $X^{+}$ is the position of the food source.
Distraction from enemy
This final step refers to individuals moving away from the enemy in order to survive; it is calculated as follows:
$$E_i = X^{-} + X \tag{11}$$
$$w = 0.9 - i\,\frac{0.9 - 0.4}{I} \tag{15}$$
where s, a, c, f, e, and w are the weights of their corresponding elements. w is calculated using Equation (15), i is the current iteration, I is the number of iterations, and e is calculated in Equation (14); s, a, and c are three different random numbers between 0 and 2e, and f is a random number between 0 and 2. More details can be found in Reference [80].
2.2.2 | Ant Lion optimizer
This technique is inspired by the hunting mechanism of antlions in nature and was proposed by Mirjalili84 in 2015. The antlions (doodlebugs) belong to the Myrmeleontidae family of insects and the Neuroptera order. They have an average lifespan of about 3 years, most of which is spent as larvae, with only 3–5 weeks spent in adulthood.85 In the larval phase, antlions are known for their unique hunting process, preferably hunting ants. Using their massive jaws, antlions dig conical pits in the sand. Then, the larvae hide and wait at the bottom of the pit, which is sharp enough for the prey (in most cases, ants) to be trapped in. Once the antlion notices that prey is inside the trap, it starts to throw sand out of the hole toward the prey. Consequently, the ants fail to escape from the pit and slide to the bottom. After that, the antlion consumes the prey and throws away the leftovers. With this strategy, individuals prepare the pit for the next hunt, and their chance of survival increases.85,86
The mathematical representation of this algorithm is given as follows84–86:
Random walks of Ants
$$X(t) = \left[0,\ \mathrm{cumsum}(2r(t_1) - 1),\ \mathrm{cumsum}(2r(t_2) - 1),\ \dots,\ \mathrm{cumsum}(2r(t_n) - 1)\right] \tag{16}$$
where n is the maximum number of iterations, cumsum calculates the cumulative sum, and t is the step of the random walk. Hence,
Trapping in Antlion's pit
The antlion's trap affects the random walks of ants. It is given by:
$$c_i^t = \mathrm{Antlion}_j^t + c^t,\qquad d_i^t = \mathrm{Antlion}_j^t + d^t \tag{20}$$
where $c^t$ represents the minimum of all variables at the tth iteration, $d^t$ indicates the vector including the maximum of all variables at the tth iteration, $c_i^t$ is the minimum of all variables for the ith ant, $d_i^t$ is the maximum of all variables for the ith ant, and $\mathrm{Antlion}_j^t$ is the position of the selected jth antlion at the tth iteration.
Building trap
In this step, the ALO uses a roulette wheel operator to select antlions based on their fitness during optimization. This technique gives the fitter antlions a higher chance of catching ants.
Sliding Ants toward Antlion
Here, the radius of the hypersphere of the ants' random walks is decreased adaptively in the mathematical model by applying Equation (21):
$$c^t = \frac{c^t}{I},\qquad d^t = \frac{d^t}{I} \tag{21}$$
where I is a ratio, $c^t$ is the minimum of all variables at the tth iteration, and $d^t$ indicates the vector including the maximum of all variables at the tth iteration.
Catching prey and rebuilding the pit
The following is the final hunting stage. Based on the position of the latest hunted ant, the antlions update their position to improve the
where $R_A^t$ is the random walk around the antlion selected by the roulette wheel at the tth iteration, $R_E^t$ is the random walk, and $\mathrm{Ant}_i^t$ represents the position of the ith ant at the tth iteration.
2.2.3 | Grey wolf optimizer
The GWO is another meta-heuristic technique, proposed by Mirjalili et al.87 for solving optimization problems. This algorithm is mainly inspired by the social leadership and hunting behavior of grey wolves in nature.88 Grey wolves are social animals that live in packs of 5–12 members. In each pack, there are four types of wolves, divided according to their responsibilities and decision-making roles during the process of prey hunting: the alpha (α) wolf as the group leader, the beta (β) wolves as the second in command, and the delta (δ) wolves as the subordinates. These three are considered the best solutions and lead the rest of the wolves, known as omega (ω) wolves, the fourth type, toward promising areas with the aim of reaching the global solution.77,88,89 These individuals have a special hunting mechanism with three main phases. In the beginning, grey wolves approach the prey using a process of tracking and chasing. Then, they encircle the prey and harass it until it stops moving. The final step is to attack it, as mentioned in Reference [90]. The GWO algorithm follows the mathematical model of hunting given below77,87–90:
Encircling
The encircling of prey by grey wolves is mathematically modeled as follows:
$$\vec{D} = \left|\vec{C}\cdot\vec{X}_p(t) - \vec{X}(t)\right| \tag{24}$$
$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A}\cdot\vec{D} \tag{25}$$
Here, t is the current iteration, $\vec{X}_p$ is the vector of the prey position, $\vec{X}$ indicates the vector of the grey wolf position, and $\vec{A}$, $\vec{C}$ are coefficient vectors, which can be calculated as follows:
$$\vec{A} = 2\vec{a}\cdot\vec{r}_1 - \vec{a} \tag{26}$$
$$\vec{C} = 2\cdot\vec{r}_2 \tag{27}$$
where the components of $\vec{a}$ are linearly decreased from 2 to 0 over the course of iterations, and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1].
Hunting
The mathematical model assumes that α (the best candidate solution), β, and δ have better knowledge about the location of the prey. For that reason, it is mandatory to save the first three best solutions obtained so far and to update the positions of the other wolves (ω) according to those solutions. The formulas of the hunting mechanism are given by:
$$\vec{D}_\alpha = \left|\vec{C}_1\cdot\vec{X}_\alpha - \vec{X}\right|,\quad \vec{D}_\beta = \left|\vec{C}_2\cdot\vec{X}_\beta - \vec{X}\right|,\quad \vec{D}_\delta = \left|\vec{C}_3\cdot\vec{X}_\delta - \vec{X}\right| \tag{28}$$
$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1\cdot\vec{D}_\alpha,\quad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2\cdot\vec{D}_\beta,\quad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3\cdot\vec{D}_\delta \tag{29}$$
$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{30}$$
where C1, C2, and C3 are calculated by Equation (27), Xα, Xβ, and Xδ are the first three best solutions at iteration t, A1, A2, and A3 are calculated as in Equation (26), and Dα, Dβ, and Dδ are defined as in Equation (28).
Each time the wolves are repositioned, an iteration has occurred and the solutions are re-evaluated using the objective function. Then, the alpha, beta, and delta wolves become the three best solutions, respectively. The omega wolves, on the other hand, update their positions in the next iteration. These moves are repeated until the stopping criterion is met.
Attacking
The moment the wolves start the attack, the hunting mode is concluded. This can be mathematically represented by the value of $\vec{a}$, which is linearly decreased over the course of iterations and controls the exploration and exploitation:
$$\vec{a}(t) = 2 - \frac{2t}{\mathrm{MaxIter}} \tag{31}$$
Between the prey position and their current position, wolves change their positions randomly. More details can be found elsewhere.87
2.2.4 | Artificial bee colony
The artificial bee colony approach was proposed in 2005 by Karaboga and Basturk,91 and its performance was then analyzed in 2007.92 The inspiration comes from the behavior of honeybees in finding food sources, as well as in sharing that information with the rest of the nest. In the algorithm, individual bees are classified into three types (employed, onlooker, and scout), where each agent plays a different role in the process, as explained in Reference [81].
2.2.5 | Particle swarm optimizer
The PSO is one of the most well-known population-based meta-heuristic optimization algorithms. This technique was first introduced by Kennedy and Eberhart93–95 in 1995. Its strategy was inspired by the social behavior of bird flocking, fish schooling, and swarms of bees, and sometimes by the social behavior of humans, when searching for food.95,96
2.2.6 | Whale optimizer algorithm
The WOA is one of the meta-heuristic methods and was introduced by Mirjalili and Lewis.97 The inspiration comes from the hunting behavior of humpback whales. Humpback whales have a special hunting behavior known as the bubble-net feeding technique, used for hunting schools of fish. The whales create a 9-shaped net of bubbles that encircles the prey and makes it easy for the whales to eat them during hunting.98–100 This technique is described in more detail in References [97,99]. The mathematical model for performing the optimization follows three phases:
Encircling prey
In this phase, humpback whales encircle the prey that they have found. Then, they modify their position toward the best search agent over the course of iterations. The algorithm considers the position of the target prey to be the best solution or near the optimum. The mathematical representation is given by97,99,100:
$$\vec{D} = \left|\vec{C}\cdot\vec{X}^{*}(n) - \vec{X}(n)\right| \tag{32}$$
$$\vec{X}(n+1) = \vec{X}^{*}(n) - \vec{A}\cdot\vec{D} \tag{33}$$
where n indicates the current iteration, $\vec{X}^{*}$ is the position vector of the best solution obtained so far at iteration n, $\vec{X}$ is the position vector of each agent, and $\vec{A}$, $\vec{C}$ are calculated as follows:
$$\vec{A} = 2\vec{a}\cdot\vec{r}_1 - \vec{a} \tag{34}$$
$$\vec{C} = 2\cdot\vec{r}_2 \tag{35}$$
where the components of $\vec{a}$ are linearly decreased from 2 to 0 over the course of iterations, and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1].
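The encircling update of Equations (32)–(35), which has the same form as the grey wolf encircling step of Equations (24)–(27), can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the random coefficients are drawn independently for each dimension, and the linear schedule for a is borrowed from Equation (31) as an assumption for the whale case.

```python
import random


def a_schedule(t, max_iter):
    """Linear decrease of a from 2 to 0, in the form of Equation (31)."""
    return 2.0 - 2.0 * t / max_iter


def encircle(position, best, a, rng=random):
    """One encircling move of a search agent toward the best position.

    position, best: lists of floats (current agent and best solution so far).
    a: current value of the control parameter.
    """
    new_position = []
    for x, x_best in zip(position, best):
        r1, r2 = rng.random(), rng.random()
        A = 2.0 * a * r1 - a           # Equation (34)
        C = 2.0 * r2                   # Equation (35)
        D = abs(C * x_best - x)        # Equation (32)
        new_position.append(x_best - A * D)  # Equation (33)
    return new_position


# Hypothetical usage: one agent moving in a 2-D search space at iteration 10 of 100.
random.seed(0)
print(encircle([1.0, 2.0], [3.0, 4.0], a_schedule(10, 100)))
```

When a reaches 0, A is 0 in every dimension and the agent lands exactly on the best position, which is the limiting case of pure exploitation.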
Bubble-net attacking method
This attack combines two strategies that can be mathematically defined as follows:
1. Shrinking encircling mechanism: This technique is simulated by decreasing the value of $\vec{a}$ in (34), which is explained in References [97,99].
2. Spiral updating position: the helix-shaped movement of humpback whales is simulated using Equations (36) and (37):
$$\vec{X}(n+1) = \vec{D}'\cdot e^{bv}\cdot\cos(2\pi v) + \vec{X}^{*}(n) \tag{36}$$
$$\vec{D}' = \left|\vec{X}^{*}(n) - \vec{X}(n)\right| \tag{37}$$
and exploitation of an individual wolf in the search space. For combining PSO and GWO variants, Equation (42a) is proposed for the velocity, while Equation (42b) is proposed for updating101:
$$\vec{d}_\alpha = \left|\vec{c}_1\cdot\vec{x}_\alpha - \omega\cdot\vec{x}\right|,\quad \vec{d}_\beta = \left|\vec{c}_1\cdot\vec{x}_\beta - \omega\cdot\vec{x}\right|,\quad \vec{d}_\delta = \left|\vec{c}_1\cdot\vec{x}_\delta - \omega\cdot\vec{x}\right| \tag{41}$$
$$v_i^{k+1} = \omega\left(v_i^{k} + c_1 r_1\left(x_1 - x_i^{k}\right) + c_2 r_2\left(x_2 - x_i^{k}\right) + c_3 r_3\left(x_3 - x_i^{k}\right)\right) \tag{42a}$$
$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \tag{42b}$$
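A single hybrid PSO-GWO move following Equations (41), (42a), and (42b) could be sketched as below. This is only an illustration under stated assumptions: the intermediate positions x1, x2, x3 are taken from the grey wolf hunting step (Equation 29) with the distances of Equation (41), the coefficient values are placeholders rather than the settings used in this study, and the function name and arguments are hypothetical.

```python
import random


def hpsogwo_step(x, v, alpha, beta, delta, w, a, c=(0.5, 0.5, 0.5), rng=random):
    """One position/velocity update of a single agent.

    x, v: current position and velocity (lists of floats).
    alpha, beta, delta: the three best positions found so far.
    w: inertia weight (omega); a: GWO control parameter.
    c: acceleration coefficients c1, c2, c3 (placeholder values).
    """
    new_x, new_v = [], []
    for j in range(len(x)):
        pull = 0.0
        for k, leader in enumerate((alpha, beta, delta)):
            A = 2.0 * a * rng.random() - a          # coefficient as in Equation (26)
            d = abs(c[0] * leader[j] - w * x[j])    # Equation (41), c1 as printed
            x_k = leader[j] - A * d                 # leader-guided position, Equation (29)
            pull += c[k] * rng.random() * (x_k - x[j])
        v_j = w * (v[j] + pull)                     # Equation (42a)
        new_v.append(v_j)
        new_x.append(x[j] + v_j)                    # Equation (42b)
    return new_x, new_v


# Hypothetical usage with three leaders in a 1-D search space.
random.seed(1)
print(hpsogwo_step([0.0], [0.0], [4.0], [3.5], [3.0], w=0.7, a=1.0))
```

The velocity term lets each agent keep some memory of its previous direction, while the three leader-guided positions supply the grey wolf style exploitation.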
FIGURE 1 Flowchart of the proposed hybrid SVR models: data collection (168 drug-like compounds); dataset division into training and test sets; optimization process with the applied metaheuristic algorithms (DA, ALO, GWO, ABC, PSO, WOA, and HPSOGWO); evaluation of the SVR model based on AARD%, repeated until the maximum iteration is reached.
SVR, GWO-SVR, ABC-SVR, PSO-SVR, WOA-SVR, and HPSOGWO-SVR) respectively. The flowchart of the proposed SVR hybrid models is summarized in Figure 1.
2.4 | Statistical criteria for evaluation of models' performances
The evaluation of the performance of a regression model, also known as validation, is now recognized as the most important concept for the development and validation of any model. This process confirms the reliability of the developed model for its possible application to a new set of data, so that the confidence of prediction can be judged and a statistical comparison can be made between several models and various inputs of a model. Hence, a model can lead to false predictions of the response if it is not validated correctly. The prediction accuracy of the seven models was examined using various statistical criteria and through graphical appraisal, that is, scatter plots, bar diagrams, and so on. More details are addressed in the Results and Discussion section. The statistical performance indicators are given by:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2}{N}} \tag{43}$$
$$R = r = \frac{\sum_{i=1}^{N}\left(y_i^{obs} - \bar{y}^{obs}\right)\left(y_i^{pred} - \bar{y}^{pred}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i^{obs} - \bar{y}^{obs}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(y_i^{pred} - \bar{y}^{pred}\right)^2}} \tag{44}$$
$$R^2 = r^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2}{\sum_{i=1}^{N}\left(y_i^{obs} - \bar{y}^{obs}\right)^2} \tag{45}$$
$$\mathrm{AARD\%} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{y_i^{obs} - y_i^{pred}}{y_i^{obs}}\right| \tag{46}$$
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2 \tag{47}$$
$$Q_{F3}^2 = 1 - \frac{\sum_{i=1}^{N_{OUT}}\left(y_i^{obs} - y_i^{pred}\right)^2 / N_{OUT}}{\sum_{i=1}^{N_{TR}}\left(y_i^{obs} - \bar{y}_{TR}\right)^2 / N_{TR}} \tag{51}$$
$$\mathrm{AIC}_s = N\ln\left(\frac{\mathrm{SSE}}{N}\right) + 2n_p + \frac{2n_p\left(n_p + 1\right)}{N - \left(n_p + 1\right)} \tag{52}$$
$$\mathrm{SSE} = \sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2 \tag{53}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i^{obs} - y_i^{pred}\right| \tag{54}$$
$$B_f = 10^{\left(\sum_{i=1}^{N}\log\left(y_i^{obs}/y_i^{pred}\right)\right)/N} \tag{55}$$
$$A_f = 10^{\left(\sum_{i=1}^{N}\left|\log\left(y_i^{obs}/y_i^{pred}\right)\right|\right)/N} \tag{56}$$
where r is the correlation coefficient between the observed and predicted values of the compounds with an intercept, R2 is the coefficient of determination, RMSE is the root-mean-squared error, AARD is the average absolute relative deviation, MSE is the mean squared error, RE is the relative error, AICs is Akaike's information criterion, MAE is the mean absolute error, Af is the precision factor, and Bf is the bias factor. The error parameters, namely AARD, RMSE, MSE, and MAE, are usually used for comparison between different models and not to validate their performance, while the statistics R, R2, Q2F1, Q2F2, Q2F3, Af, Bf, and AICs are the statistical criteria used for the evaluation of model performance and validation. Moreover, the statistics AICs and Q2F3 were found to be the most suitable statistics for the evaluation of similar studies, as explained in References [102] and [103], respectively. More details can be found elsewhere.104–108
2.5 | Applicability domain
William's plot is a plot of standardized residuals versus leverage values, that is, the hat diagonal values (hi); it was used to visualize the respective AD and to detect response outliers. Outliers are usually compounds with standardized residuals greater than ±3 SD units on the y-axis, and structurally influential chemicals are those with hi > h* on the x-axis, where h* is a threshold leverage value. The leverage (h) value of a compound is defined as follows:
$$h_i = x_i\left(X^{T}X\right)^{-1}x_i^{T} \tag{57}$$
where xi is the descriptor-row vector of the query compound, and X is an n × k matrix containing the k descriptor values for each one of the n training compounds.
The warning leverage h* is given by:
$$h^{*} = \frac{3\left(p + 1\right)}{n} \tag{58}$$
where n is the total number of samples in the training set, and p is the number of descriptors involved in the correlation.
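To make Equations (57) and (58) concrete, the sketch below computes the leverage of a query compound and the warning leverage using only the Python standard library. It is an illustration, not the code used in this study: the helper names are hypothetical, the training matrix is a toy example with one row per compound, and the linear system is solved by plain Gauss-Jordan elimination, which is adequate only for small, well-conditioned cases.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [m_r - factor * m_c for m_r, m_c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def leverage(X, x):
    """Equation (57): h = x (X^T X)^-1 x^T, with one training compound per row of X."""
    k = len(x)
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    z = solve(XtX, x)  # z = (X^T X)^-1 x^T
    return sum(x_j * z_j for x_j, z_j in zip(x, z))


def warning_leverage(p, n):
    """Equation (58): h* = 3 (p + 1) / n."""
    return 3.0 * (p + 1) / n


# Toy training set: three compounds described by two descriptors (hypothetical values).
X_train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in X_train:
    print(leverage(X_train, row))
```

A useful sanity check is that the leverages of the training compounds sum to the number of descriptor columns, since they are the diagonal of the hat matrix.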
When hi < h*, the predicted value is regarded as reliable, whereas molecules with hi > h* strongly influence the quality of fit of the developed model. However, such compounds are not necessarily outliers, because their residuals may be low.
It can be observed that compounds with a high leverage value and a good fit in the developed model may stabilize the model. In contrast, compounds with a bad fit in the developed model can be outliers. Hence, the standardized residual and the leverage must be used together to describe the applicability domain of the developed model.
2.5.2 | Insubria graph
The Insubria graph is a plot of hat diagonal values versus predicted values, used in the case of chemicals without experimental data to visualize interpolated and extrapolated predictions. A zone of higher reliability is defined both for structures (hi < h*) and for predictions (ln(y2) between the maximum and minimum values of the observed solubility of the training set).
3 | RESULTS AND DISCUSSION
This study adopted various analysis methods, such as the convergence curve, to compare the performance of the seven hybrid QSPR-SVR models and to determine the most suitable meta-heuristic algorithm for tuning the hyper-parameters of the SVR model for a dataset of 168 drug compounds, accounting for 4490 experimental data points of solubility in supercritical carbon dioxide, and 13 selected inputs.
Figure 2 shows the convergence curves of the seven selected optimization approaches. Based on the objective function (AARD%) over 100 iterations, the hybrid HPSOGWO, colored in blue, led to the lowest error (0.7063), followed by the WOA algorithm with an error of 0.7271. The performance of the latter approach was asymptotic to that of HPSOGWO. Thereafter came PSO, ABC, ALO, GWO, and then DA, with errors of 0.7280, 0.7345, 0.7419, 0.7481, and 0.7508, respectively. This confirms that the HPSOGWO meta-heuristic outperforms the other six algorithms in terms of minimizing the cost function (AARD%) and convergence capacity. Table 1 shows the hyper-parameters of the best performance obtained with each meta-heuristic algorithm used for tuning the parameters of the hybrid QSPR-SVR algorithm.
Figure 3 shows the scatter plots, that is, predicted versus observed values of the mole fraction solubility of drug compounds in SC-CO2, for the global, train, and test sets for the top three hybrid SVR models at the best iteration. In general, the plots of all algorithms show a positive correlation between the calculated values and the experimental values, with a few outliers. This confirms the good quality of the models and their predictive ability.
Table 2 lists the statistical results for AARD, RMSE, MSE, MAE, r, R2, Q2F1, Q2F2, Q2F3, and AIC for the top three developed models. It can be observed from Table 2 that the calculated parameters were all valid, with a high correlation coefficient r as well as great robustness; the coefficients of determination R2 were above 0.99, and the RMSE, MSE, and MAE of all models were below 0.3. In summary, the results above were in total agreement with the acceptability criteria for the statistical validation of regression models.
TABLE 1 Hyper-parameters of the best performance of each meta-heuristic algorithm.
Model          C          σ        ε
DA-SVR         102.2490   1.7118   0.0076
ALO-SVR        65.0157    1.7484   0.0066
GWO-SVR        60         2        0.0068
ABC-SVR        60         1.6724   0.0067
PSO-SVR        62.4902    1.7045   0.0068
WOA-SVR        52.0924    1.5791   0.0063
HPSOGWO-SVR    62.9392    1.6780   0.0065
Abbreviations: ABC, Artificial Bee Colony; ALO, Ant Lion optimizers; DA, Dragonfly; GWO, Grey Wolf; HPSOGWO, hybrid particle Swarm with Grey Wolf optimizers; PSO, Particle Swarm optimizers; SVR, support vector regression; WOA, Whale optimizers.
FIGURE 2 Convergence curve of the seven algorithms.
F I G U R E 3 Scatter plots predicted versus observed values for the (1) global, (2) train and (3) test sets, for the top three models: (A) Particle
Swarm optimizers (PSO), (B) Whale optimizers (WOA), (C) hybrid particle Swarm with Grey Wolf optimizers (HPSOGWO).
Model          AARD%    RMSE     MSE      MAE      r        R2       Q2F1     Q2F2     Q2F3     Af       Bf       AIC
PSO-SVR        0.7280   0.1968   0.0387   0.0709   0.9971   0.9941   0.9941   0.9941   0.9941   1.0073   0.9999   14051.7736
WOA-SVR        0.7271   0.1929   0.0372   0.0705   0.9972   0.9944   0.9944   0.9944   0.9945   1.0073   0.9997   14225.7070
HPSOGWO-SVR    0.7063   0.1883   0.0355   0.0693   0.9973   0.9946   0.9946   0.9946   0.9947   1.0070   0.9996   14434.2487

Abbreviations: AARD, average absolute relative deviation; Af, precision factor; AIC, Akaike's information criterion; Bf, bias factor; HPSOGWO, hybrid Particle Swarm with Grey Wolf optimizers; MAE, mean absolute error; MSE, mean squared error; PSO, Particle Swarm optimizers; RMSE, root-mean-squared error; SVR, support vector regression; WOA, Whale optimizers.
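Several of the statistics in the table above can be computed as in the following Python sketch. It assumes the usual textbook definitions (R2 taken as the coefficient of determination, 1 − SS_res/SS_tot), which may differ in detail from the definitions used by the authors; the function name `regression_metrics` and the sample data are hypothetical. The last lines check that the rounded RMSE and MSE values reported in the table are mutually consistent (MSE ≈ RMSE²).

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 under the usual definitions (assumed here)."""
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical values, for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(observed, predicted)

# Rounded (RMSE, MSE) pairs as reported in the table above.
reported = {
    "PSO-SVR": (0.1968, 0.0387),
    "WOA-SVR": (0.1929, 0.0372),
    "HPSOGWO-SVR": (0.1883, 0.0355),
}
# True where RMSE squared matches MSE to within rounding of four decimals.
consistent = {name: abs(rmse ** 2 - mse) < 1e-4 for name, (rmse, mse) in reported.items()}
```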
[Two bar charts: distribution of the absolute relative error (%) for the PSO, WOA, and HPSOGWO models.]
Consequently, all the models performed well in model development, suggesting an excellent ability to predict future data. The models were therefore stable, robust, and predictive, although HPSOGWO-SVR gave the best performance and the best AIC value of 14434.2487.

To visualize quantitatively the distribution of relative errors, and to compare the generalization capacity and the tendency toward over-fitting of the different models, Figures 4 and 5 show bar distributions of the relative error for the training and test sets, respectively.
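A bar distribution of this kind can be obtained by counting how many data points fall into each interval of absolute relative error. The Python sketch below shows one way to do this; the bin edges, the function name `bin_relative_errors`, and the sample values are all hypothetical, since the intervals used in the figures are not given in the text.

```python
def bin_relative_errors(observed, predicted, edges):
    """Count points per absolute-relative-error bin (in %).

    Bin i covers [edges[i], edges[i + 1]); the last bin is open-ended.
    `edges` must be sorted in increasing order and start at 0.
    """
    counts = [0] * len(edges)
    for o, p in zip(observed, predicted):
        error = 100.0 * abs((p - o) / o)
        # Index of the last edge that does not exceed the error.
        index = 0
        for i, edge in enumerate(edges):
            if error >= edge:
                index = i
        counts[index] += 1
    return counts


# Hypothetical values and bin edges, for illustration only.
observed = [10.0, 10.0, 10.0, 10.0, 10.0]
predicted = [10.05, 9.85, 10.3, 10.6, 10.02]
edges = [0.0, 1.0, 2.0, 5.0]  # bins: [0,1), [1,2), [2,5), [5, inf)
counts = bin_relative_errors(observed, predicted, edges)
shares = [100.0 * c / len(observed) for c in counts]  # percentage of points per bin
```

Computing the same counts separately for the training and test sets, and finding similar shares in the low-error bins for both, is the kind of evidence the comparison above relies on when assessing the tendency toward over-fitting.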
Those results confirmed the high quality of the HPSOGWO-SVR model. Since the error values were close for the training and test sets and there were no significantly large residual values for the validation set displayed in Figure 8, it can be concluded that the model does not over-fit. In summary, the HPSOGWO-SVR model has by far the best robustness (>0.9) and high predictive power, and can be recommended for predicting future data falling into the defined applicability domain.

4 | CONCLUSION

In this article, for the first time, a comparative study was carried out between seven meta-heuristic algorithms in terms of fine-tuning the hyper-parameters of a hybrid QSPR-SVR model defined for predicting drug solubility in supercritical carbon dioxide. Solubility data for 168 drug compounds were first collected and then correlated with their molecular structure by the QSPR technique and with two independent intensive state variables (temperature and pressure) of a previous study. The seven models DA-SVR, ALO-SVR, GWO-SVR, ABC-SVR, PSO-SVR, WOA-SVR, and HPSOGWO-SVR, built with the seven meta-heuristic algorithms DA, ALO, GWO, ABC, PSO, WOA, and HPSOGWO, were all statistically and graphically approved, while the hybrid HPSOGWO-SVR model outperformed the six other models. The HPSOGWO-SVR model proved to have good predictive ability and robustness, and thus it can be used to estimate the solubility of drug compounds for which no experimental data are available in the literature. The validity of the model predictions was further confirmed by the external test on 160 experimental data points (EDP) compared to experimental values, considering new compounds which should belong to the applicability domain (Williams plot and Insubria graph). The SVR model presented in this work showed better statistical parameter values and better predictability results. However, due to the stochastic nature of all swarm intelligence algorithms, finding an optimal solution is never guaranteed for any problem, which always opens the door for new possibilities.

AUTHOR CONTRIBUTIONS
Imane Euldji: Methodology (equal); validation (equal); visualization (equal); writing – original draft (equal); writing – review and editing (equal). Aicha Belghait: Resources (equal); writing – review and editing (equal). Cherif Si-Moussa: Methodology (equal); supervision (equal); writing – original draft (equal); writing – review and editing (equal). Othmane Benkortbi: Methodology (equal); supervision (equal); writing – original draft (equal); writing – review and editing (equal). Abdeltif Amrane: Writing – review and editing (equal).

ACKNOWLEDGMENTS
The authors gratefully acknowledge the Algerian Ministry of Higher Education and Scientific Research (PRFU Project A16N01UN260120220003) and the University Yahia Fares of Medea.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The additional data of this article can be found online (Database-SvM.PDF). Software, web servers, and calculation tools belong to their respective developers and copyright holders. All statistical and graphical analyses were obtained with codes that we developed in the Matlab® R2021B environment.

ORCID
Imane Euldji https://orcid.org/0000-0002-8025-4996
EULDJI ET AL. 15 of 17
17. Sodeifian G, Ardestani NS, Sajadian SA, Panah HS. Experimental 34. Euldji I, Si-Moussa C, Hamadache M, Benkortbi O. QSPR modelling
measurements and thermodynamic modeling of coumarin-7 solid of the solubility of drug and drug-like compounds in supercritical
solubility in supercritical carbon dioxide: production of nanoparticles carbon dioxide. Molecular Informatics. 2022;41(10):2200026.
via RESS method. Fluid Phase Equilib. 2019;483:122-143. 35. Reddy TA, Garlapati C. Dimensionless empirical model to correlate
18. Sodeifian G, Alwi RS, Razmimanesh F, Abadian M. Solubility of pharmaceutical compound solubility in supercritical carbon dioxide.
Dasatinib monohydrate (anticancer drug) in supercritical CO2: Chem Eng Technol. 2019;42(12):2621-2630.
experimental and thermodynamic modeling. J Mol Liq. 2022;346: 36. Amooey AA. A simple correlation to predict drug solubility in super-
117899. critical carbon dioxide. Fluid Phase Equilib. 2014;375:332-339.
19. Sodeifian G, Detakhsheshpour R, Sajadian SA. Experimental study 37. Nejad SJ, Abolghasemi H, Moosavian M, Maragheh M. Prediction of
and thermodynamic modeling of esomeprazole (proton-pump inhibi- solute solubility in supercritical carbon dioxide: a novel semi-
tor drug for stomach acid reduction) solubility in supercritical carbon empirical model. Chem Eng Res Des. 2010;88(7):893-898.
dioxide. J Supercrit Fluids. 2019;154:104606. 38. Belghait A, Si-Moussa C, Laidi M, Hanini S. Semi-empirical correla-
20. Sodeifian G, Razmimanesh F, Sajadian SA. Solubility measurement of tion of solid solute solubility in supercritical carbon dioxide: compar-
a chemotherapeutic agent (Imatinib mesylate) in supercritical carbon ative study and proposition of a novel density-based model. C R
dioxide: assessment of new empirical model. J Supercritical Fluids. Chim. 2018;21(5):494-513.
2019;146:89-99. 39. Si-Moussa C, Belghait A, Khaouane L, Hanini S, Halilali A. Novel
21. Sodeifian G, Sajadian SA, Razmimanesh F, Hazaveie SM. Solubility of density-based model for the correlation of solid drugs solubility in
ketoconazole (antifungal drug) in SC-CO2 for binary and ternary sys- supercritical carbon dioxide. C R Chim. 2017;20(5):559-572.
tems: measurements and empirical correlations. Sci Rep. 2021;11(1): 40. Su C-S, Chen Y-P. Correlation for the solubilities of pharmaceutical
7546. compounds in supercritical carbon dioxide. Fluid Phase Equilib. 2007;
22. Sodeifian G, Hazaveie SM, Sajadian SA, Razmimanesh F. Experimen- 254(1–2):167-173.
tal investigation and modeling of the solubility of oxcarbazepine 41. Yazdizadeh M, Eslamimanesh A, Esmaeilzadeh F. Thermodynamic
(an anticonvulsant agent) in supercritical carbon dioxide. Fluid Phase modeling of solubilities of various solid compounds in supercritical
Equilibria. 2019;493:160-173. carbon dioxide: effects of equations of state and mixing rules.
23. Sodeifian G, Alwi RS, Razmimanesh F. Solubility of Pholcodine (anti- J Supercrit Fluids. 2011;55(3):861-875.
tussive drug) in supercritical carbon dioxide: experimental data and 42. Wang L-H, Lin S-T. A predictive method for the solubility of drug in
thermodynamic modeling. Fluid Phase Equilibria. 2022;556:113396. supercritical carbon dioxide. J Supercrit Fluids. 2014;85:81-88.
24. Sodeifian G, Alwi RS, Razmimanesh F, Tamura K. Solubility of quetia- 43. Eric S, Kalinic M, Popovic A, Zloh M, Kuzmanovski I. Prediction of
pine hemifumarate (antipsychotic drug) in supercritical carbon diox- aqueous solubility of drug-like molecules using a novel algorithm for
ide: experimental, modeling and Hansen solubility parameter automatic adjustment of relative importance of descriptors imple-
application. Fluid Phase Equilibria. 2021;537:113003. mented in counter-propagation artificial neural networks. Int J
25. Sodeifian G, Hazaveie SM, Sodeifian F. Determination of Galanta- Pharm. 2012;437(1–2):232-241.
mine solubility (an anti-alzheimer drug) in supercritical carbon diox- 44. Baghban A, Jalali A, Mohammadi AH, Habibzadeh S. Efficient model-
ide (CO2): experimental correlation and thermodynamic modeling. ing of drug solubility in supercritical carbon dioxide. J Supercrit Fluids.
J Mol Liq. 2021;330:115695. 2018;133:466-478.
26. Sodeifian G, Ardestani NS, Sajadian SA, Panah HS. Measurement, 45. Abdallah El Hadj A, Laidi M, Si-Moussa C, Hanini S. Novel approach
correlation and thermodynamic modeling of the solubility of for estimating solubility of solid drugs in supercritical carbon dioxide
Ketotifen fumarate (KTF) in supercritical carbon dioxide: evaluation and critical properties using direct and inverse artificial neural net-
of PCP-SAFT equation of state. Fluid Phase Equilib. 2018;458: work (ANN). Neu Comput Appl. 2017;28:87-99.
102-114. 46. Lashkarbolooki M, Vaferi B, Rahimpour M. Comparison the capabil-
27. Sodeifian G, Sajadian SA, Derakhsheshpour R. Experimental mea- ity of artificial neural network (ANN) and EOS for prediction of solid
surement and thermodynamic modeling of lansoprazole solubility in solubilities in supercritical carbon dioxide. Fluid Phase Equilib. 2011;
supercritical carbon dioxide: application of SAFT-VR EoS. Fluid Phase 308(1–2):35-43.
Equilib. 2020;507:112422. 47. Mehdizadeh B, Movagharnejad K. A comparison between neural
28. Sodeifian G, Razmimanesh F, Sajadian SA, Panah HS. Solubility mea- network method and semi empirical equations to predict the solubil-
surement of an antihistamine drug (Loratadine) in supercritical car- ity of different compounds in supercritical carbon dioxide. Fluid
bon dioxide: assessment of qCPA and PCP-SAFT equations of state. Phase Equilib. 2011;303(1):40-44.
Fluid Phase Equilib. 2018;472:147-159. 48. Sodeifian G, Sajadian SA, Razmimanesh F, Ardestani NS. A compre-
29. Sodeifian G, Nasri L, Razmimanesh F, Abadian M. Measuring and model- hensive comparison among four different approaches for predicting
ing the solubility of an antihypertensive drug (losartan potassium, the solubility of pharmaceutical solid compounds in supercritical car-
Cozaar) in supercritical carbon dioxide. J Mol Liq. 2021;331:115745. bon dioxide. Korean J Chem Eng. 2018;35:2097-2116.
30. Sodeifian G, Ardestani NS, Razmimanesh F, Sajadian SA. Experimen- 49. Mehdizadeh B, Movagharnejad K. A comparative study between LS-
tal and thermodynamic analyses of supercritical CO2-solubility of SVM method and semi empirical equations for modeling the solubil-
minoxidil as an antihypertensive drug. Fluid Phase Equilib. 2020;522: ity of different solutes in supercritical carbon dioxide. Chem Eng Res
112745. Des. 2011;89(11):2420-2427.
31. Sodeifian G, Hazaveie SM, Sajadian SA, Saadati Ardestani N. Deter- 50. Vatani Z, Ramezanian Bajgiran S, Amini G, Tayyebi S. Solubility
mination of the solubility of the repaglinide drug in supercritical car- modeling of supercritical fluid extraction in a wide range com-
bon dioxide: experimental data and thermodynamic modeling. pounds: comparison between fuzzy-genetic and new empirical
J Chem Eng Data. 2019;64(12):5338-5348. models. Energy Sources A: Recovery Util Environ Eff. 2020;42(3):
32. Thakkar FMV, Soni T, Gohel M, Gandhi T. Supercritical fluid technol- 365-374.
ogy: a promising approach to enhance the drug solubility. J Pharm 51. Huuskonen J, Livingstone DJ, Manallack DT. Prediction of drug solu-
Sci Res. 2009;1(4):1. bility from molecular structure using a drug-like training set. SAR
33. Kankala RK, Zhang YS, Wang SB, Lee CH, Chen AZ. Supercritical QSAR Environ Res. 2008;19(3–4):191-212.
fluid technology: an emphasis on drug delivery and related biomedi- 52. Gurikov P, Lebedev I, Kolnoochenko A, Menshutina N. Prediction
cal applications. Adv Healthc Mater. 2017;6(16):1700433. of the solubility in supercritical carbon dioxide: a hybrid
16 of 17 EULDJI ET AL.
thermodynamic/QSPR approach. Comput Aided Chem Eng. 2016;38: 71. Hu X. Support Vector Machine and its Application to Regression and
1587-1592. Classification. 2017.
53. Valenzuela Roediger LM, Reveco-Chilla A, Del Valle Lladser JM. 72. Jakkula V. Tutorial on support vector machine (svm). School of EECS,
Modeling solubility in supercritical carbon dioxide using quantitative Washington State University 2006 37(2.5):3.
structure-property relationships. 2014. 73. Benimam H, Moussa CS, Hentabli M, Hanini S, Laidi M. Dragonfly-
54. Zhang J, Wang Y. Evaluating the bond strength of FRP-to-concrete support vector machine for regression modeling of the activity coef-
composite joints using metaheuristic-optimized least-squares sup- ficient at infinite dilution of solutes in imidazolium ionic liquids using
port vector regression. Neu Comput Appl. 2021;33:3621-3635. σ-profile descriptors. J Chem Eng Data. 2020;65(6):3161-3172.
55. Zhang J, Huang Y, Ma G, Sun J, Nener B. A metaheuristic-optimized 74. Yu P-S, Chen S-T, Chang I-F. Support vector regression for real-time
multi-output model for predicting multiple properties of pervious flood stage forecasting. J Hydrol. 2006;328(3–4):704-716.
concrete. Construct Build Mater. 2020;249:118803. 75. Tatar A, Barati A, Yarahmadi A, Najafi A, Lee M, Bahadori A. Predic-
56. Tran D-H, Luong D-L, Chou J-S. Nature-inspired metaheuristic tion of carbon dioxide solubility in aqueous mixture of methyldietha-
ensemble model for forecasting energy consumption in residential nolamine and N-methylpyrrolidone using intelligent models. Int J
buildings. Energy. 2020;191:116552. Greenhouse Gas Con. 2016;47:122-136.
57. Zhou J, Qiu Y, Zhu S, et al. Optimization of support vector machine 76. Baghban A, Mohammadi AH, Taleghani MS. Rigorous modeling of
through the use of metaheuristic algorithms in forecasting TBM CO2 equilibrium absorption in ionic liquids. Int J Greenhouse Gas
advance rate. Eng Appl Artif Intel. 2021;97:104015. Con. 2017;58:19-41.
58. Panahi M, Gayen A, Pourghasemi HR, Rezaie F, Lee S. Spatial predic- 77. Xu C, Nait Amar M, Ghriga MA, Ouaer H, Zhang X, Hasanipanah M.
tion of landslide susceptibility using hybrid support vector regression Evolving support vector regression using Grey wolf optimization;
(SVR) and the adaptive neuro-fuzzy inference system (ANFIS) with forecasting the geomechanical properties of rock. Eng Comput.
various metaheuristic algorithms. Sci Total Environ. 2020;741: 2022;38:1819-1833.
139937. 78. Farhat NH. Photonic neural networks and learning machines. IEEE
59. Balogun A-L, Rezaie F, Pham QB, et al. Spatial prediction of landslide Expert. 1992;7(5):63-72.
susceptibility in western Serbia using hybrid support vector regres- 79. Amroune M, Bouktir T, Musirin I. Power system voltage stability
sion (SVR) with GWO, BAT and COA algorithms. Geosci Frontiers. assessment using a hybrid approach combining dragonfly optimiza-
2021;12(3):101104. tion algorithm and support vector regression. Arabian J Sci Eng.
60. Abbaszadeh Shahri A, Maghsoudi Moud F, Mirfallah Lialestani SP. A 2018;43:3023-3036.
hybrid computing model to predict rock strength index properties 80. Mirjalili S. Dragonfly algorithm: a new meta-heuristic optimization
using support vector regression. EngComput. 2022;38(1):579-594. technique for solving single-objective, discrete, and multi-objective
61. Caraka RE, Chen RC, Bakar SA, et al. Employing best input SVR problems. Neu Comput Appl. 2016;27:1053-1073.
robust lost function with nature-inspired metaheuristics in wind 81. Yasen M, Al-Madi N, Obeid N. Optimizing neural networks using
speed energy forecasting. IAENG Int J Comput Sci. 2020;47(3): dragonfly algorithm for medical prediction. Paper presented at:
572-584. 2018 8th international conference on computer science and infor-
62. Musa B, Yimen N, Abba SI, Adun HH, Dagbasi M. Multi-state load mation technology (CSIT). 2018.
demand forecasting using hybridized support vector regression inte- 82. Salam MA, Zawbaa HM, Emary E, Ghany KKA, Parv B. A hybrid
grated with optimal design of off-grid energy systems—a metaheur- dragonfly algorithm with extreme learning machine for prediction.
istic approach. Processes. 2021;9(7):1166. Paper presented at: 2016 International symposium on innovations in
63. Bonah E, Huang X, Hongying Y, et al. Detection of salmonella Typhi- intelligent systems and applications (INISTA). 2016.
murium contamination levels in fresh pork samples using electronic 83. Reynolds CW. Flocks, herds and schools: A distributed behavioral
nose smellprints in tandem with support vector machine regression model. Paper presented at: Proceedings of the 14th annual confer-
and metaheuristic optimization algorithms. J Food Sci Technol. 2021; ence on Computer graphics and interactive techniques. 1987.
58:3861-3870. 84. Mirjalili S. The ant lion optimizer. Adv Eng Software. 2015;83:80-98.
64. Malik A, Tikhamarine Y, Souag-Gamane D, Rai P, Sammen SS, Kisi O. 85. Saha S, Mukherjee V. A novel quasi-oppositional chaotic antlion opti-
Support vector regression integrated with novel meta-heuristic algo- mizer for global optimization. Appl Intelligence. 2018;48:2628-2660.
rithms for meteorological drought prediction. Meteorol Atmos Phys. 86. Gupta E, Saxena A. Performance evaluation of antlion optimizer
2021;133:891-909. based regulator in automatic generation control of interconnected
65. Rahmati O, Darabi H, Panahi M, et al. Development of novel hybrid- power system. J Eng. 2016;2016:1-14.
ized models for urban flood susceptibility mapping. Sci Rep. 2020; 87. Mirjalili S, Mirjalili SM, Lewis A. Grey wolf optimizer. Adv Eng Soft-
10(1):12937. ware. 2014;69:46-61.
66. Fadhillah MF, Lee S, Lee C-W, Park Y-C. Application of support vec- 88. Nadimi-Shahraki MH, Taghian S, Mirjalili S. An improved grey wolf
tor regression and metaheuristic optimization algorithms for ground- optimizer for solving engineering problems. Expert SystAppl. 2021;
water potential mapping in Gangneung-si, South Korea. Remote Sens 166:113917.
(Basel). 2021;13(6):1196. 89. Emary E, Yamany W, Hassanien AE, Snasel V. Multi-objective gray-
67. Setiawan IN, Kurniawan R, Yuniarto B, Caraka RE, Pardamean B. wolf optimization for attribute reduction. Procedia Comput Sci. 2015;
Parameter optimization of support vector regression using Harris 65:623-632.
hawks optimization. Procedia Comput Sci. 2021;179:17-24. 90. Kumar A, Pant S, Ram M. System reliability optimization using gray
68. da Silva Santos CE, Sampaio RC, dos Santos Coelho L, Bestard GA, wolf optimizer algorithm. Qual Reliab EngInt. 2017;33(7):1327-1335.
Llanos CH. Multi-objective adaptive differential evolution for SVM/ 91. Karaboga D, Basturk B. On the performance of artificial bee colony
SVR hyperparameters selection. Pattern Recognit. 2021;110:107649. (ABC) algorithm. Appl Soft Comput. 2008;8(1):687-697.
69. Malla C, Panigrahi I. Review of condition monitoring of rolling ele- 92. Karaboga D. An Idea Based on Honey Bee Swarm for Numerical Opti-
ment bearing using vibration analysis and other techniques. JVib Eng mization: Technical Report-TR06. Erciyes University, Engineering
Technol. 2019;7:407-414. Faculty, Computer; 2005.
70. Okwu MO, Tartibu LK. Metaheuristic Optimization: Nature-Inspired 93. Okwu MO, Tartibu LK. Particle swarm optimisation. Metaheuristic
Algorithms Swarm and Computational Intelligence, Theory and Applica- Optimization: Nature-Inspired Algorithms Swarm and Computational
tions. Vol 927. Springer Nature; 2020. Intelligence, Theory and Applications. Springer Nature; 2021:5-13.
EULDJI ET AL. 17 of 17
94. Eberhart R, Kennedy J. A new optimizer using particle swarm theory. 107. Roy K, Mitra I. On various metrics used for validation of predictive
Paper presented at: MHS'95. Proceedings of the Sixth International QSAR models with applications in virtual screening and focused
Symposium on Micro Machine and Human Science 1995. library design. Comb Chem High Throughput Screen. 2011;14(6):
95. Garg H. A hybrid PSO-GA algorithm for constrained optimization 450-474.
problems. Appl Math Comput. 2016;274:292-305. 108. Falyouna O, Eljamal O, Maamoun I, Tahara A, Sugihara Y. Magnetic
96. Şenel FA, Gökçe F, Yüksel AS, Yig it T. A novel hybrid PSO–GWO zeolite synthesis for efficient removal of cesium in a lab-scale con-
algorithm for optimization problems. Eng Comput. 2019;35:1359- tinuous treatment system. J Colloid Interface Sci. 2020;571:66-79.
1373. 109. Hamadache M, Hanini S, Benkortbi O, Amrane A, Khaouane L,
97. Mirjalili S, Lewis A. The whale optimization algorithm. Adv Eng Soft- Moussa CS. Artificial neural network-based equation to predict the
ware. 2016;95:51-67. toxicity of herbicides on rats. Chemom Intel Lab Syst. 2016;154:7-15.
98. Hemeida A, Alkhalaf S, Mady A, Mahmoud E, Hussein M, 110. Bouarra N, Kherouf S, Bouakkadia A, Messadi D. QSPR application
Eldin AMB. Implementation of nature-inspired optimization algo- on modeling of boiling point of polycyclic aromatic hydrocarbons.
rithms in some data mining tasks. Ain Shams Eng J. 2020;11(2): Res J Pharma Biol Chem Sci. 2017;8(6):19-28.
309-318. 111. Zhao X, Pan Y, Jiang J, Xu S, Jiang J, Ding L. Thermal hazard of ionic liq-
99. Nasiri J, Khiyabani FM. A whale optimization algorithm (WOA) uids: modeling thermal decomposition temperatures of imidazolium ionic
approach for clustering. Cogent math. Stat. 2018;5(1):1483565. liquids via QSPR method. Ind Eng Chem Res. 2017;56(14):4185-4195.
100. Kaur G, Arora S. Chaotic whale optimization algorithm. J Comput 112. Mansourian M, Saghaie L, Fassihi A, Madadkar-Sobhani A,
Des Eng. 2018;5(3):275-284. Mahnam K. Linear and nonlinear QSAR modeling of 1, 3,
101. Singh N, Singh S. Hybrid algorithm of particle swarm optimization 8-substituted-9-deazaxanthines as potential selective a 2B AR
and grey wolf optimizer for improving convergence performance. antagonists. Med Chem Res. 2013;22:4549-4567.
J Appl Mathematics. 2017;2017:1-15.
102. Kuonen D. Book review: regression modeling strategies: with appli-
cations to linear models, logistic regression, and survival analysis.
SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Euldji I, Belghait A, Si-Moussa C, Benkortbi O, Amrane A. A new hybrid quantitative structure property relationships-support vector regression (QSPR-SVR) approach for predicting the solubility of drug compounds in supercritical carbon dioxide. AIChE J. 2023;e18115. doi:10.1002/aic.18115