
Received: 16 July 2022 Revised: 12 March 2023 Accepted: 2 April 2023

DOI: 10.1002/aic.18115

RESEARCH ARTICLE
Process Systems Engineering

A new hybrid quantitative structure property relationships-support vector regression (QSPR-SVR) approach for predicting the solubility of drug compounds in supercritical carbon dioxide

Imane Euldji 1 | Aicha Belghait 1 | Cherif Si-Moussa 1 | Othmane Benkortbi 1 | Abdeltif Amrane 2

1 Department of Process and Environmental Engineering, Biomaterials and Transport Phenomena Laboratory (LBMPT), University of Yahia Fares, Faculty of Technology, Medea, Algeria
2 Univ Rennes, Ecole Nationale Supérieure de Chimie de Rennes, CNRS, Rennes, France

Correspondence
Imane Euldji, Department of Process and Environmental Engineering, Biomaterials and Transport Phenomena Laboratory (LBMPT), University of Yahia Fares, Faculty of Technology, Medea 26000, Algeria.
Email: euldji.imane@univ-medea.dz; imene.eull@gmail.com

Funding information
Algerian Ministry of Higher Education and Scientific Research, Grant/Award Number: PRFU Project A16N01UN260120220003; the University Yahia Fares of Medea

Abstract
The purpose of this work was to compare the performance of seven meta-heuristic algorithms, namely the Dragonfly (DA), Ant Lion (ALO), Grey Wolf (GWO), Artificial Bee Colony (ABC), Particle Swarm (PSO), Whale (WOA), and hybrid Particle Swarm with Grey Wolf (HPSOGWO) optimizers, in fine-tuning the hyper-parameters of a hybrid quantitative structure property relationships (QSPR)-support vector regression (SVR) model for the prediction of molar fraction solubilities of drug compounds in supercritical carbon dioxide (SC-CO2). A dataset of 168 drug compounds, 13 inputs, and 4490 experimental data points was used to achieve this goal. All seven models were statistically and graphically approved, while the HPSOGWO-SVR was found to outperform the others with an average absolute relative deviation (AARD) of 0.706% and an AIC of 14,434,249. The model was subjected to an external test (validation) using 160 experimental data points that were not used in the training and test sets. The overall results showed that the obtained model has good predictive ability and robustness.

KEYWORDS
drugs, metaheuristic algorithms, solubility, supercritical carbon dioxide, support vector
regression

1 | INTRODUCTION

“Finish early, but with fewer errors.” For decades, this principle has been adopted and held sacred by the pharmaceutical industry. When it comes to human life, any mistake or delay is unacceptable. From discovery to validation, new drug candidates pass through a long and complicated process full of obstacles and issues, usually related to ineffectiveness caused by unsatisfactory pharmacokinetics and pharmacodynamics, and hence bioavailability. This ineffectiveness is tied to five critical compound properties in early compound screening: solubility, dissociation constant, permeability, stability, and lipophilicity.1,2

For decades, it has been clearly demonstrated that solubility is the major factor affecting the in vivo systemic exposure of drug candidates.3–5 Solubility can be considered a major threshold in both the preclinical and clinical phases: reported statistics indicate that more than 70% of drug candidates were eliminated in these stages because of poor water solubility. In other words, solubility plays a key role in formulation development to achieve the bioavailability and therapeutic action of the drug at the target site.6 As explained by Kakran et al.,7 numerous techniques have been used to overcome the challenge of poor water solubility, and they can be divided into three main categories: physical modification, chemical modification, and miscellaneous methods.2 Supercritical fluid

AIChE J. 2023;e18115. wileyonlinelibrary.com/journal/aic © 2023 American Institute of Chemical Engineers. 1 of 17


https://doi.org/10.1002/aic.18115
2 of 17 EULDJI ET AL.

technology (SCF), particle size reduction, complexation, co-solvency, and electrospinning are the well-known methods adopted, while SCF8–31 is the most promising approach in pharmaceutical research32 thanks to its environmental safety and economy. The most widely used SCF is carbon dioxide (SC-CO2), which accounts for almost 98% of the applications in the pharmaceutical industry33 owing to its many advantages: a low critical temperature (Tc = 31.10 °C), a critical pressure of Pc = 73.8 bar, nontoxicity, nonflammability, and low cost.32 Ideally, the solubility of pharmaceutical molecules, whether in SC-CO2 or not, should be known for all compounds over the temperature and pressure ranges and the mixture compositions for which data are required. This can only be achieved through computational methods for solubility prediction, because experimental studies are very expensive and time-consuming. Consequently, efforts to develop reliable models for predicting water solubility have led to a significant number of publications, in which different approaches were proposed. A review of the literature indicates that the solubility of solid compounds in SCF is usually predicted using a thermodynamic approach based on equations of state (EoS), which is mainly divided into three classes: cubic EoS, complex noncubic EoS, and the conductor-like screening model (COSMO) based on quantum mechanical calculation; empirical or semi-empirical correlations; or intelligent computer techniques. More details are available in a previous paper.34

Reddy and Garlapati35 proposed a new empirical model, developed from a degree-of-freedom analysis, to correlate the solubilities of 27 industrially important pharmaceutical compounds in SC-CO2; the proposed model was found to correlate better in terms of average absolute relative deviation (AARD) than the existing models mentioned in the article. Amooey36 contributed an empirical correlation based on temperature and density models for 31 solubility data series of different drugs in SC-CO2; the model generally resulted in minimum values of AARE% compared to commonly used semi-empirical models. Nejad et al.37 built an empirical equation using 16 publications to correlate the solute solubility in SC-CO2 with temperature, pressure, and density of pure CO2; the model produces a reasonably accurate correlation of the mole fraction solubility of solutes in SC-CO2 with respect to pressure, temperature, and density of pure SC-CO2. Belghait et al.38 performed a comparison study of the correlation performance of 21 semi-empirical models using a dataset of 210 solid solutes in SC-CO2, counting 5550 data points, and proposed a density-based semi-empirical model. Si-Moussa et al.39 investigated a density-based model obtained by a simple modification of the six parameters of Jouyban's model to correlate the solubility of 100 drugs accounting for 2891 experimental data points. Su and Chen40 applied the regular solution model with the Flory–Huggins equation to correlate the solubility of 60 pharmaceutical molecules in SC-CO2. Yazdizadeh et al.41 studied the effect of applying the cubic equation of state and mixing rules to calculate the solubilities of 52 solid compounds in SC-CO2, accounting for 1776 experimental data points. Wang and Lin42 used the Peng–Robinson with COSMOSAC EoS to predict the solubility of 46 drugs in SC-CO2, accounting for 1160 data points. Eric et al.43 applied a reliable model for the prediction of aqueous solubility based on the implementation of an algorithm for the automatic adjustment of the descriptors' relative importance in counter-propagation artificial neural networks, with a dataset of 374 diverse drug-like molecules.

Baghban et al.44 used a least square support vector machine (LSSVM) with an RBF kernel function, with and without coupling to PSO, to estimate the solubility of 33 different drug compounds in SC-CO2 based on their physical properties. El Hadj et al.45 established a hybrid model based on a feed-forward artificial neural network and the particle swarm optimization algorithm (PSO-ANN) for predicting the solubility of solid drugs in SC-CO2. Lasharboloki et al.46 and Mehdizadeh and Movagharnejad47 performed comparison studies between ANN and thermodynamic modeling approaches for predicting the solubility of compounds in SC-CO2. Sodeifian et al.48 performed a comprehensive comparison among four approaches, namely EoS, empirical and semi-empirical models, solution models, and the ANN method, for correlating the solubilities of different pharmaceutical compounds in SC-CO2. Mehdizadeh and Movagharnejad49 performed a comparison study between the GA-LSSVM algorithm and seven semi-empirical equations for modeling the solubility of 25 solutes in SC-CO2. Vatani et al.50 tested the ability of methods based on empirical equations and fuzzy-genetic models to estimate the solubility of solutes in SC-CO2; the proposed models were compared to other empirical models. Huuskonen et al.51 performed a QSAR study on a set of 191 drug-like compounds extracted from the AQUASOL database, based on their structural and physicochemical properties, to predict water solubility. Gurikov et al.52 used a hybrid thermodynamic/QSPR approach to develop a robust model that would allow solubility prediction based entirely on molecular structure and the parameters of two models (the Chrastil equation and the modified, expanded liquid model), involving more than 300 original publications of solubility data for high-boiling point compounds in SC-CO2 and its mixtures with modifiers. Valenzuela Roediger et al.53 developed a hybrid QSAR-semiempirical model to predict the parameters of the Chrastil solubility equation using a group of compounds in SC-CO2 under different pressure and temperature conditions. We contributed in a previous study an ANN-QSPR model for predicting the solubility of drug compounds in SC-CO2 with a dataset of 148 molecules accounting for 3971 EDP.34 None of the approaches mentioned above proved to be superior to the others. However, it should be noted that machine learning approaches such as the support vector machine, which has recently been improved by combination with meta-heuristic techniques for better convergence to the desired solutions, are currently one of the hottest topics in various research fields.54–69 Classical optimization techniques are usually time-consuming and do not provide an exact or feasible solution. Therefore, it is necessary to implement meta-heuristic algorithms. The term meta-heuristics refers to global search techniques, or to all modern nature-inspired optimization algorithms, used in various problems where the solution methods are inexact and near-optimal. These algorithms are highly effective and have general applications with high performance and a great driving force for optimality. Different systems inspired by nature can be introduced, such as the genetic algorithm (GA) and the particle swarm optimizer (PSO). The optimal method for a specific problem can only be selected using comparison trials.70

The aim of this study was to select the best optimization algorithm for fine-tuning the support vector regression algorithm, in the form of a comparison study between seven (7) selected meta-heuristic algorithms: the Dragonfly (DA), Ant Lion (ALO), Grey Wolf (GWO), Artificial Bee Colony (ABC), Particle Swarm (PSO), Whale (WOA), and hybrid Particle Swarm with Grey Wolf (HPSOGWO) optimizers.

On the other hand, this article aims to validate a hybrid QSPR-SVR model able to predict and correlate the solubility of drug compounds in SC-CO2 with acceptable accuracy using various statistical and graphical appraisals, including an external test.

2 | EXPERIMENTAL DATABASE

A total of 168 drug compounds accounting for 4490 experimental data points (EDP) were collected, with solubilities (mole fraction 0.000000001–0.131) measured in supercritical carbon dioxide in the temperature range of 298–373.15 K and the pressure range of 80–500 bar (the detailed drug solubility data are tabulated in the Supporting Information). These data were updated from previous work.34 In that work, QSPR/QSAR modeling was applied to 148 drug compounds (3971 EDP) to determine the most suitable combination of features able to predict and correlate this property. The result of that study met all the OECD principles for QSAR validation and showed that a combination of 13 descriptors is sufficient for modeling the solubility. The selected PaDEL descriptors were therefore as follows: AATS3v, MATS2e, GATS4c, GATS3v, GATS4e, GATS3s, nBondsM, AVP-0, SHBd, MLogP, and MLFER_S, with the addition of T (K) and P (MPa). The output property was converted to the corresponding log-transformed property “−Ln(y2)” to guarantee a linear distribution, as shown in Reference [34]. The final dataset was divided into two main sets. The first set of 4330 EDP was retained for modeling, while the second set of 160 EDP was kept hidden from the models, that is, it served as an external test.

2.1 | Support vector regression

Support vector machines are classification (SVC) and regression (SVR) tools that use machine learning concepts to increase predictive accuracy. The method was first introduced in 1995 by Vladimir Vapnik for the purpose of surpassing the traditional neural network algorithms, which had suffered from severe difficulties with generalization and producing models.71,72 For regression modeling, the SVR hypothesis decreases the prediction error in the learning process and reduces overfitting.73,74 The linear regression function is defined as follows74–77:

$$
f(x) = \omega \cdot \varphi(x) + b \tag{1}
$$

where $\varphi(x)$ is the nonlinear mapping function, and $\omega$ and $b$ denote the weight vector and the bias term, respectively. These can be obtained by minimizing the cost function77,78:

$$
\text{cost function} = \frac{1}{2}\lVert\omega\rVert^{2} + C\sum_{i=1}^{n}\left(\xi_i^{+} + \xi_i^{-}\right) \tag{2}
$$

$$
\text{Subject to}\quad
\begin{cases}
y_i - \left(\omega \cdot \varphi(x_i) + b\right) \le \epsilon + \xi_i^{+} \\
\left(\omega \cdot \varphi(x_i) + b\right) - y_i \le \epsilon + \xi_i^{-} \\
\xi_i^{+},\ \xi_i^{-} \ge 0,\quad i = 1, 2, 3, \ldots, n
\end{cases}
$$

A standard dualization method using Lagrangian multipliers was applied to solve the optimization problem in Equation (2) easily. The dual formulation of this problem is represented as follows79:

$$
\text{Maximize}\ \left\{ -\frac{1}{2}\sum_{i,j=1}^{n}\left(\alpha_i - \alpha_i^{*}\right)\left(\alpha_j - \alpha_j^{*}\right)\varphi(x_i)\cdot\varphi(x_j) - \epsilon\sum_{i=1}^{n}\left(\alpha_i + \alpha_i^{*}\right) + \sum_{i=1}^{n} y_i\left(\alpha_i - \alpha_i^{*}\right) \right\} \tag{3}
$$

$$
\text{Subject to}\quad
\begin{cases}
\sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right) = 0 \\
0 \le \alpha_i \le C,\quad i = 1, 2, 3, \ldots, n \\
0 \le \alpha_i^{*} \le C,\quad i = 1, 2, 3, \ldots, n
\end{cases}
$$

Here $\alpha_i$ and $\alpha_i^{*}$ are the Lagrangian multipliers. Solving the dual maximization problem in Equation (3) gives the SVR function:

$$
F\left(x, \alpha_i, \alpha_i^{*}\right) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right)\varphi(x_i)\cdot\varphi(x_j) + b \tag{4}
$$

The inner product $\varphi(x_i)\cdot\varphi(x_j)$ of the mapping function can be replaced by a kernel function $k(x_i, x_j)$, as shown in Equation (5):

$$
F\left(x, \alpha_i, \alpha_i^{*}\right) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right)k(x_i, x_j) + b \tag{5}
$$

In this study, the Gaussian kernel function of Equation (6) was chosen:

$$
k(x_i, x_j) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \tag{6}
$$

2.2 | Meta-heuristic approaches

2.2.1 | Dragonfly algorithm

The dragonfly algorithm (DA) is an optimization algorithm that was first proposed by Seyedali Mirjalili in 2015.80 This algorithm is inspired by the static and dynamic behavior of dragonflies in nature. In the static swarm process, the dragonflies hunt prey; they behave similarly to the exploitation phase in meta-heuristic algorithms, flying in small groups back and forth over a well-determined small area, much closer to the land. The dynamic behavior is similar to the exploration phase: migratory swarms form large groups (hundreds of thousands of dragonflies) and fly in a single direction over long distances to find food resources.79–81 The life cycle of the agents is divided into two different phases: the nymph phase and the adult phase.82 The natural behavior of each dragonfly in the swarm drives it to move toward food sources and away from enemies.79,81

According to Reynolds,83 the behavior of the swarm follows five different principles that are necessary for finding the weights solution79–81,83:

Separation
The purpose of this step is to avoid collisions between an individual and the neighbors close to its position (the static swarm). It is defined as follows:

$$
S_i = -\sum_{j=1}^{n}\left(X - X_j\right) \tag{7}
$$

where $X$ and $X_j$ are the positions of the current individual and the jth neighboring individual, respectively, and $n$ is the number of neighboring individuals.

Alignment
This step represents the velocity matching between dragonflies of the same group. It is given by:

$$
A_i = \frac{\sum_{j=1}^{n} V_j}{n} \tag{8}
$$

where $V_j$ is the velocity of neighboring individual j.

Cohesion
Cohesion refers to the tendency of members to move toward the center of the swarm's group, that is, the center of mass of the neighborhood. It can be defined as:

$$
C_i = \frac{\sum_{j=1}^{n} X_j}{n} - X \tag{9}
$$

Attraction to food
Since the aim of the individuals is to survive, all dragonflies must move toward the food, as expressed by the following formula:

$$
F_i = X^{+} - X \tag{10}
$$

where $X^{+}$ is the position of the food source.

Distraction from enemy
This final step refers to individuals moving far away from the enemy's sources to survive; it is calculated as follows:

$$
E_i = X^{-} + X \tag{11}
$$

where $X^{-}$ is the position of the enemy.

The position of each dragonfly is updated, for the purpose of testing another weight solution and obtaining another fitness value, by calculating $\Delta X$ and $X$ using Equations (12) and (13)80,81:

$$
\Delta X_{t+1} = \left(sS_i + aA_i + cC_i + fF_i + eE_i\right) + w\,\Delta X_t \tag{12}
$$

$$
X_{t+1} = X_t + \Delta X_{t+1} \tag{13}
$$

$$
e = 0.1 - i\left(\frac{0.1}{I/2}\right) \tag{14}
$$

$$
w = 0.9 - i\left(\frac{0.9 - 0.4}{I}\right) \tag{15}
$$

where s, a, c, f, e, and w are the weights of their corresponding elements. w is calculated using Equation (15), where i is the current iteration and I is the number of iterations; e is calculated using Equation (14); s, a, and c are three different random numbers between 0 and 2e; and f is a random number between 0 and 2. More details can be found in Reference [80].

2.2.2 | Ant Lion optimizer

This technique is inspired by the hunting mechanism of antlions in nature and was proposed by Mirjalili84 in 2015. Antlions (doodlebugs) belong to the Myrmeleontidae family of insects and the Neuroptera order. They have an average lifespan of about 3 years, which is spent as larvae except for 3–5 weeks spent in adulthood.85 In the larval phase, antlions are known for their unique hunting process and their preference for hunting ants. Using their massive jaws, antlions dig conical pits in the sand. The larvae then hide and wait at the bottom of the pit, whose edge is sharp enough for the prey (in most cases, ants) to be trapped. Once the antlion notices that prey is inside the trap, it starts to throw sand out of the hole toward the prey. Consequently, the ants fail to escape from the pit and slide to its bottom. After that, the antlion consumes the prey and throws away the leftovers. With this strategy, individuals prepare the pit for the next hunt, and their chance of survival increases.85,86

The mathematical representation of this algorithm is given as follows84–86:

Random walks of Ants

$$
X(t) = \left[0,\ \mathrm{cumsum}\left(2r(t_1) - 1\right),\ \mathrm{cumsum}\left(2r(t_2) - 1\right),\ \ldots,\ \mathrm{cumsum}\left(2r(t_n) - 1\right)\right] \tag{16}
$$

where n is the maximum number of iterations, cumsum calculates the cumulative sum, and t is the step of the random walk. Hence,

$$
r(t) =
\begin{cases}
1 & \text{if } rand > 0.5 \\
0 & \text{if } rand \le 0.5
\end{cases} \tag{17}
$$

Here, r(t) is a stochastic function and rand is a random number generated with a uniform distribution in the interval [0, 1]. The positions of the ants are saved and considered during optimization in the matrix:

$$
M_{Ant} =
\begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,d} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,d} \\
\vdots & \vdots & \ddots & \vdots \\
A_{n,1} & A_{n,2} & \cdots & A_{n,d}
\end{bmatrix} \tag{18}
$$

where $M_{Ant}$ is the matrix for saving the position of each ant, $A_{i,j}$ is the value of the jth variable of the ith ant, n is the number of ants, and d is the number of variables. According to the random walk, ants update their positions at each step of the optimization. The min–max normalization equation is used to normalize the random walks:

$$
X_i^{t} = \frac{\left(X_i^{t} - a_i\right)\left(d_i^{t} - c_i^{t}\right)}{d_i - a_i} + c_i^{t} \tag{19}
$$

where $a_i$ is the minimum of the random walk of the ith variable, $d_i$ is the maximum of the random walk of the ith variable, $c_i^{t}$ is the minimum of the ith variable at the tth iteration, and $d_i^{t}$ is the maximum of the ith variable at the tth iteration.

Trapping in Antlion's pit
The antlion's trap affects the random walks of the ants. It is given by:

$$
\begin{aligned}
c_i^{t} &= Antlion_j^{t} + c^{t} \\
d_i^{t} &= Antlion_j^{t} + d^{t}
\end{aligned} \tag{20}
$$

where $c^{t}$ represents the minimum of all variables at the tth iteration, $d^{t}$ is the vector including the maximum of all variables at the tth iteration, $c_i^{t}$ is the minimum of all variables for the ith ant, $d_i^{t}$ is the maximum of all variables for the ith ant, and $Antlion_j^{t}$ is the position of the selected jth antlion at the tth iteration.

Building trap
In this step, the ALO uses a roulette wheel operator to select antlions based on their fitness during optimization. This technique gives the fitter antlions higher chances of catching ants.

Sliding Ants toward Antlion
Here, the radius of the hypersphere of the ants' random walks is decreased adaptively by applying Equation (21):

$$
\begin{aligned}
c^{t} &= \frac{c^{t}}{I} \\
d^{t} &= \frac{d^{t}}{I}
\end{aligned} \tag{21}
$$

where I is a ratio, $c^{t}$ is the minimum of all variables at the tth iteration, and $d^{t}$ is the vector including the maximum of all variables at the tth iteration.

Catching prey and rebuilding the pit
This is the final hunting stage. Based on the position of the latest hunted ant, the antlions update their position to improve the possibility of catching new prey. The mathematical formula is the following:

$$
Antlion_j^{t} = Ant_i^{t} \quad \text{if } f\left(Ant_i^{t}\right) > f\left(Antlion_j^{t}\right) \tag{22}
$$

where t represents the current iteration, $Antlion_j^{t}$ is the position of the selected jth antlion at the tth iteration, and $Ant_i^{t}$ represents the position of the ith ant at the tth iteration.

Elitism
In the ALO algorithm, the elite refers to the best antlion obtained over all iterations. This elite affects the movement of the rest of the ants during the iterations. Therefore, it is assumed that every ant walks randomly around both an antlion selected by the roulette wheel and the elite, as follows:

$$
Ant_i^{t} = \frac{R_A^{t} + R_E^{t}}{2} \tag{23}
$$

where $R_A^{t}$ is the random walk around the antlion selected by the roulette wheel at the tth iteration, $R_E^{t}$ is the random walk around the elite at the tth iteration, and $Ant_i^{t}$ represents the position of the ith ant at the tth iteration.

2.2.3 | Grey wolf optimizer

The GWO is another meta-heuristic technique, proposed by Mirjalili et al.87 for solving optimization problems. This algorithm is mainly inspired by the social leadership and hunting behavior of grey wolves in nature.88 Grey wolves are social animals that live in packs of 5–12 members. In each pack, there are four types of wolves, divided according to their responsibilities and decision-making roles during prey hunting: the alpha (α) wolf as the group leader, the beta (β) wolves as second in command, and the delta (δ) wolves as subordinates. These three are considered the best solutions and lead the remaining wolves, known as omega (ω) wolves, the fourth type, toward promising areas with the aim of reaching the global solution.77,88,89 These individuals have a special hunting mechanism with three main phases. In the beginning, grey wolves approach the prey by tracking and chasing it. Then, they encircle the prey and harass it until it stops moving. The final step is to attack it, as mentioned in Reference [90]. The GWO algorithm follows the mathematical model of hunting given below77,87–90:

Encircling
The encircling of prey by grey wolves is mathematically modeled as shown:

$$
\vec{D} = \left|\vec{C}\cdot\vec{X}_P(t) - \vec{X}(t)\right| \tag{24}
$$

$$
\vec{X}(t+1) = \vec{X}_P(t) - \vec{A}\cdot\vec{D} \tag{25}
$$

Here, t is the current iteration, $\vec{X}_P$ is the vector of the prey position, $\vec{X}$ is the vector of the grey wolf position, and $\vec{A}$ and $\vec{C}$ are coefficient vectors that can be calculated as follows:

$$
\vec{A} = 2\vec{a}\cdot\vec{r}_1 - \vec{a} \tag{26}
$$

$$
\vec{C} = 2\,\vec{r}_2 \tag{27}
$$

where the components of $\vec{a}$ are linearly decreased from 2 to 0 over the course of the iterations, and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1].

Hunting
The mathematical modeling assumes that α (the best candidate solution), β, and δ have better knowledge about the location of the prey. For that reason, it is mandatory to save the first three best solutions obtained so far and to update the positions of the other wolves (ω) according to those solutions. The formulas of the hunting mechanism are given by:

$$
\vec{D}_\alpha = \left|\vec{C}_1\cdot\vec{X}_\alpha - \vec{X}\right|,\quad
\vec{D}_\beta = \left|\vec{C}_2\cdot\vec{X}_\beta - \vec{X}\right|,\quad
\vec{D}_\delta = \left|\vec{C}_3\cdot\vec{X}_\delta - \vec{X}\right| \tag{28}
$$

$$
\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1\cdot\vec{D}_\alpha,\quad
\vec{X}_2 = \vec{X}_\beta - \vec{A}_2\cdot\vec{D}_\beta,\quad
\vec{X}_3 = \vec{X}_\delta - \vec{A}_3\cdot\vec{D}_\delta \tag{29}
$$

$$
\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{30}
$$

where $\vec{C}_1$, $\vec{C}_2$, and $\vec{C}_3$ are calculated by Equation (27); $\vec{X}_\alpha$, $\vec{X}_\beta$, and $\vec{X}_\delta$ are the first three best solutions at iteration t; $\vec{A}_1$, $\vec{A}_2$, and $\vec{A}_3$ are calculated as in Equation (26); and $\vec{D}_\alpha$, $\vec{D}_\beta$, and $\vec{D}_\delta$ are defined as in Equation (28).

Each time the wolves are repositioned, an iteration has occurred and the solutions are re-evaluated using the objective function. The alpha, beta, and delta wolves then become the three best solutions, respectively, while the omega wolves update their positions in the next iteration. These moves are repeated until the stopping criterion is met.

Attacking
The hunting mode is concluded the moment the wolves start the attack. This can be mathematically represented by the value of $\vec{a}$, which is linearly decreased over the course of the iterations and controls the exploration and exploitation:

$$
\vec{a}(t) = 2 - \frac{2t}{MaxIter} \tag{31}
$$

Wolves change their positions randomly between the prey position and their current position. More details can be found elsewhere.87

2.2.4 | Artificial bee colony

The artificial bee colony approach was proposed in 2005 by Karaboga and Basturk,91 and its performance was then analyzed in 2007.92 The inspiration comes from the behavior of honeybees in finding food sources and sharing that information with the rest of the nest. In the algorithm, individual bees are classified into three types (employed, onlooker, and scout), and each type of agent plays a different role in the process, as explained in Reference [81].

2.2.5 | Particle swarm optimizer

The PSO is one of the most well-known population-based meta-heuristic optimization algorithms. This technique was first introduced by Kennedy and Eberhart93–95 in 1995. Its strategy was inspired by the social behavior of bird flocking, fish schooling, and bee swarming, and sometimes by the social behavior of humans, when searching for food.95,96

2.2.6 | Whale optimizer algorithm

The WOA is a meta-heuristic method that was introduced by Mirjalili and Lewis.97 The inspiration comes from the hunting behavior of humpback whales. Humpback whales have a special hunting behavior known as the bubble-net feeding technique, used for hunting schools of fish: the whales create a 9-shaped net of bubbles that encircles the prey and makes it easy for the whales to eat it during hunting.98–100 This technique is described in more detail in References [97,99]. The mathematical model for performing the optimization follows three phases:

Encircling prey
In this phase, humpback whales encircle the prey that they have found and then modify their position toward the best search agent over the course of the iterations. The algorithm considers that the position of the target prey is the best solution or close to the optimum. The mathematical representation is given by97,99,100:

$$
\vec{D} = \left|\vec{C}\cdot\vec{X}^{*}(n) - \vec{X}(n)\right| \tag{32}
$$

$$
\vec{X}(n+1) = \vec{X}^{*}(n) - \vec{A}\cdot\vec{D} \tag{33}
$$

where n indicates the current iteration, $\vec{X}^{*}$ is the position vector of the best solution obtained so far at iteration n, $\vec{X}$ is the position vector of each agent, and $\vec{A}$ and $\vec{C}$ are calculated as follows:

$$
\vec{A} = 2\vec{a}\cdot\vec{r}_1 - \vec{a} \tag{34}
$$

$$
\vec{C} = 2\,\vec{r}_2 \tag{35}
$$

where the components of $\vec{a}$ are linearly decreased from 2 to 0 over the course of the iterations, and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1].
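As a hedged illustration of the encircling update in Equations (32)–(35), the following Python sketch applies the update one dimension at a time. The function names `encircle_step` and `a_schedule` are hypothetical, and the linear schedule for a is assumed to have the same form as Equation (31); this is a minimal sketch, not the implementation used in this study.

```python
import random


def a_schedule(n, max_iter):
    """Assumed linear decrease of a from 2 to 0, same form as Equation (31)."""
    return 2.0 - 2.0 * n / max_iter


def encircle_step(position, best, a, rng):
    """One encircling update of a single agent, Equations (32)-(35).

    position: current agent position (list of floats)
    best:     best position found so far, X* (list of floats)
    a:        current value of the control parameter a
    rng:      random.Random instance supplying r1 and r2
    """
    new_position = []
    for x, x_best in zip(position, best):
        r1, r2 = rng.random(), rng.random()
        A = 2.0 * a * r1 - a                 # Equation (34)
        C = 2.0 * r2                         # Equation (35)
        D = abs(C * x_best - x)              # Equation (32)
        new_position.append(x_best - A * D)  # Equation (33)
    return new_position


if __name__ == "__main__":
    rng = random.Random(0)
    agent, best = [4.0, -3.0], [0.5, 2.0]
    max_iter = 20
    for n in range(max_iter + 1):
        agent = encircle_step(agent, best, a_schedule(n, max_iter), rng)
    print(agent)
```

Because A = a(2r1 − 1), its magnitude never exceeds a, so lowering a toward 0 shrinks the admissible step around the best agent; once a reaches 0 the agent lands exactly on the best position.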

Bubble-net attacking method
This attack combines two strategies that can be mathematically defined as follows:
1. Shrinking encircling mechanism: This technique is simulated by decreasing the value of $\vec{a}$ in Equation (34), as explained in References [97,99].
2. Spiral updating position: The helix-shaped movement of humpback whales is simulated using Equations (36) and (37):

$$
\vec{X}(n+1) = \vec{D}'\cdot e^{bv}\cdot\cos(2\pi v) + \vec{X}^{*}(n) \tag{36}
$$

$$
\vec{D}' = \left|\vec{X}^{*}(n) - \vec{X}(n)\right| \tag{37}
$$

Here, v is a random number in [−1, 1], b is a constant defining the shape of the logarithmic spiral, $\vec{X}^{*}$ is the position vector of the prey, and $\vec{X}$ is the position vector of the humpback whale. Note that there is a probability of 50% that the humpback whales swim either around the prey within a shrinking circle or along a helical path. The mathematical formula is:

$$
\vec{X}(n+1) =
\begin{cases}
\vec{X}^{*}(n) - \vec{A}\cdot\vec{D} & \text{if } p < 0.5 \\
\vec{D}'\cdot e^{bv}\cdot\cos(2\pi v) + \vec{X}^{*}(n) & \text{if } p \ge 0.5
\end{cases} \tag{38}
$$

where p is a random number in [0, 1].

Search for prey
In the bubble-net method, humpback whales search randomly for prey in order to find the position of the optimal agent. The mathematical formula is given by:

$$
\vec{D} = \left|\vec{C}\cdot\vec{X}_{rand} - \vec{X}\right| \tag{39}
$$

$$
\vec{X}(n+1) = \vec{X}_{rand} - \vec{A}\cdot\vec{D} \tag{40}
$$

where $\vec{X}_{rand}$ is a random position vector. More details can be found elsewhere.97–99

2.2.7 | Hybrid particle swarm optimization grey wolf optimizer

The hybridized version of particle swarm optimization with the GWO algorithm was developed by applying a low-level co-evolutionary mixed hybrid, as explained in Reference [96]. The idea is to combine the exploitation ability of PSO with the exploration ability of GWO, benefiting from the strengths of both variants to reduce the possibility of being trapped in a local minimum.96,101

The HPSOGWO algorithm uses the following modified equations. Equation (41) is used to update the search space of the first three agent locations. The inertia constant ω is used to control the exploration and exploitation of an individual wolf in the search space. To combine the PSO and GWO variants, Equation (42a) is proposed for the velocity and Equation (42b) for the position update101:

$$
\begin{aligned}
\vec{d}_\alpha &= \left|\vec{c}_1\cdot\vec{x}_\alpha - \omega\cdot\vec{x}\right| \\
\vec{d}_\beta &= \left|\vec{c}_2\cdot\vec{x}_\beta - \omega\cdot\vec{x}\right| \\
\vec{d}_\delta &= \left|\vec{c}_3\cdot\vec{x}_\delta - \omega\cdot\vec{x}\right|
\end{aligned} \tag{41}
$$

$$
v_i^{k+1} = \omega\cdot\left(v_i^{k} + c_1 r_1\left(x_1 - x_i^{k}\right) + c_2 r_2\left(x_2 - x_i^{k}\right) + c_3 r_3\left(x_3 - x_i^{k}\right)\right) \tag{42a}
$$

$$
x_i^{k+1} = x_i^{k} + v_i^{k+1} \tag{42b}
$$

2.3 | Hybrid support vector regression models

SVR is a robust supervised learning technique for solving nonlinear regression problems. The main concern for researchers with this approach is the difficulty of defining the best hyper-parameter values for a given database, that is, values that ensure sufficient accuracy with minimal error. Note that unsuitable SVR parameter values may cause over-fitting or under-fitting. Traditionally, the grid search and gradient descent algorithms were the most used algorithms for determining SVM parameters. These techniques suffer from many difficulties, such as convergence to local minima, computational complexity, and high computational time requirements, which led to the development of meta-heuristic optimization algorithms. Some of them, notably GA and PSO, were adopted to determine the SVR parameters as a replacement for the traditional algorithms. So far, there is no consensus among researchers on which optimization technique is the most suitable in every aspect, not even for a specific optimization problem. This motivates the development of new algorithms as well as the comparison of their performance where needed.

The present work applies seven meta-heuristic algorithms (DA, ALO, GWO, ABC, PSO, WOA, and HPSOGWO) to seek the optimal values of the three SVR parameters: the constant C (BoxConstraint), the epsilon ε, and the kernel function parameter σ (KernelScale), with the Gaussian function chosen as the kernel. The model was developed with the fitrsvm function of MATLAB® R2021b. The cross-validation “holdout” method was set to 0.3, so that 70% of the data were used for the training set and 30% for the test set. The meta-heuristic algorithm initially fed the SVR with a random combination of hyper-parameters within their ranges. Over a number of iterations that depends on the algorithm used, the steps from the division of the data to the development of the SVR model were repeated, and the minimum AARD obtained was saved as the best. Then, each meta-heuristic algorithm generated a new population of hyper-parameters for the SVR algorithm, and the same set of steps was repeated to achieve a new best AARD. This was done for 100 trials, among which the minimum AARD corresponded to the resulting optimal hybrid SVR model. Note that the calculation was performed several times to guarantee robustness. The association of the SVR model with the selected meta-heuristic algorithms (DA, ALO, GWO, ABC, PSO, WOA, and HPSOGWO) builds the following hybrid models (DA-SVR, ALO-
8 of 17 EULDJI ET AL.

F I G U R E 1 Flowchart of the hybrid


Start metaheuristic-QSPR-support vector
regression (SVR) model.

Data Collection
(168 Drug-like compounds)

Selection of 13 features from


Inputs Selection the previous QSPR modelling
study

Kernel function selection


(Gaussien)

Applied metaheuristics
Algorithms (DA, ALO, Dataset Division Test Set
GWO, ABC, PSO,
WOA & HPSOGWO)

Training Set

Optimizing The SVR


Hyperparameters Training SVR Model

Optimization Process
No Max
Evaluation of SVR
(iteration) ?
Model base on AARD%

Yes

Obtine Optimal SVR


hyperparamiters

Test and evaluate the SVR


model

End

SVR, GWO-SVR, ABC-SVR, PSO-SVR, WOA-SVR, and HPSOGWO- development and validation of any model. This process confirms the reli-
SVR) respectively. The flowchart of the proposed SVR hybrid models is ability of the developed model for its possible application on a new set of
summarized in Figure 1. data, and confidence of prediction can thus be judged along with the pos-
sibility of receiving a statistical comparison judgment between several
models and various inputs of a model. Hence, a model can lead to the
2.4 | Statistical criteria for evaluation of models' false prediction of response if the developed model is not validated cor-
performances rectly. The prediction accuracy of the seven models was examined by
using various statistical criteria and through graphical appraisal, that is,
The evaluation of the performance of a regression model, also known as scatter plot, bar diagrams, and so on. More details are addressed in the
validation, is now known as the most important concept for the R&D section. The statistical performance indicators are given by:
\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2}{N}} \tag{43} \]

\[ R = r = \frac{\sum_{i=1}^{N}\left(y_i^{obs} - \bar{y}^{obs}\right)\left(y_i^{pred} - \bar{y}^{pred}\right)}{\sqrt{\sum_{i=1}^{N}\left(y_i^{obs} - \bar{y}^{obs}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(y_i^{pred} - \bar{y}^{pred}\right)^2}} \tag{44} \]

\[ R^2 = r^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2}{\sum_{i=1}^{N}\left(y_i^{obs} - \bar{y}^{obs}\right)^2} \tag{45} \]

\[ \mathrm{AARD\%} = 100 \times \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i^{obs} - y_i^{pred}}{y_i^{obs}}\right| \tag{46} \]

\[ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2 \tag{47} \]

\[ \mathrm{RE\%} = 100 \times \left|\frac{y_i^{obs} - y_i^{pred}}{y_i^{pred}}\right| \tag{48} \]

\[ Q_{F1}^2 = 1 - \frac{\sum_{i=1}^{N_{OUT}}\left(y_i^{obs} - y_i^{pred}\right)^2}{\sum_{i=1}^{N_{OUT}}\left(y_i^{obs} - \bar{y}_{TR}\right)^2} \tag{49} \]

\[ Q_{F2}^2 = 1 - \frac{\sum_{i=1}^{N_{OUT}}\left(y_i^{obs} - y_i^{pred}\right)^2}{\sum_{i=1}^{N_{OUT}}\left(y_i^{obs} - \bar{y}_{OUT}\right)^2} \tag{50} \]

\[ Q_{F3}^2 = 1 - \frac{\sum_{i=1}^{N_{OUT}}\left(y_i^{obs} - y_i^{pred}\right)^2 / N_{OUT}}{\sum_{i=1}^{N_{TR}}\left(y_i^{obs} - \bar{y}_{TR}\right)^2 / N_{TR}} \tag{51} \]

\[ \mathrm{AICs} = N \ln\left(\frac{\mathrm{SSE}}{N}\right) + 2 n_p + \frac{2 n_p (n_p + 1)}{N - (n_p + 1)} \tag{52} \]

\[ \mathrm{SSE} = \sum_{i=1}^{N}\left(y_i^{obs} - y_i^{pred}\right)^2 \tag{53} \]

\[ \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i^{obs} - y_i^{pred}\right| \tag{54} \]

\[ B_f = 10^{\left(\frac{\sum_{i=1}^{N}\log\left(y_i^{obs} / y_i^{pred}\right)}{N}\right)} \tag{55} \]

\[ A_f = 10^{\left(\frac{\sum_{i=1}^{N}\left|\log\left(y_i^{obs} / y_i^{pred}\right)\right|}{N}\right)} \tag{56} \]

where r is the correlation coefficient between the observed and predicted values of the compounds with an intercept, R2 is the coefficient of determination, RMSE is the root-mean-squared error, AARD is the average absolute relative deviation, MSE is the mean squared error, RE is the relative error, AICs is Akaike's information criterion, MAE is the mean absolute error, Af is the precision factor, and Bf is the bias factor. N_TR and N_OUT are the numbers of training and external data points, with ȳ_TR and ȳ_OUT the corresponding mean observed values, and n_p is the number of model parameters. The error parameters, namely AARD, RMSE, MSE, and MAE, are usually used for comparison between different models and not to validate their performance, while the statistics R, R2, Q2F1, Q2F2, Q2F3, Af, Bf, and AICs are the statistical criteria used for the evaluation of model performance and validation. Moreover, the statistics AICs and Q2F3 were found to be the most suitable statistics for the evaluation of similar studies, as explained in References [102] and [103], respectively. More detail can be found elsewhere.104–108

2.5 | Applicability domain

According to the third OECD principle of QSAR model validation, any model should have an applicability domain (AD). It is a theoretical region, defined by the descriptors of the model, that delimits the area in which the model can be used for screening new chemicals. In other words, chemicals whose structures are "similar" to the ones in the training set are the only molecules that can be predicted reliably. Therefore, it is practically impossible to predict all chemicals using a single model. This study addressed the use of the Williams plot and the Insubria plot.107,109–112

2.5.1 | William's plot

William's plot is a plot of standardized residuals versus leverage values, that is, the diagonal values (hi) of the Hat matrix; it was used to visualize the respective AD and to detect response outliers. Outliers are usually compounds with standardized residuals beyond ±3 SD units on the y-axis, and structurally influential chemicals are those with hi > h* on the x-axis, where h* is a threshold leverage value. The leverage (h) value of a compound is defined as follows:

\[ h_i = x_i \left(X^{T} X\right)^{-1} x_i^{T} \tag{57} \]

where xi is the descriptor-row vector of the query compound, and X is the n × k matrix containing the k descriptor values for each one of the n training compounds.

The warning leverage h* is given by:

\[ h^{*} = \frac{3(p + 1)}{n} \tag{58} \]

where n is the total number of samples in the training set, and p is the number of descriptors involved in the correlation.
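As a minimal sketch of Equations (57) and (58), the following pure-Python snippet computes leverages and the warning leverage. The helper names (`leverages`, `warning_leverage`, `invert`) and the toy descriptor matrix are illustrative assumptions and are not taken from the study; X is stored with one row per training compound. The value n = 3031 in the usage lines is also an assumption, inferred as 70% of the 4330 development data points.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting (meant for small k x k matrices)."""
    k = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(k)]
           for i, row in enumerate(m)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("X^T X is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def leverages(x_train, x_query):
    """h_i = x_i (X^T X)^-1 x_i^T for every row x_i of x_query (Equation 57)."""
    xtx_inv = invert(matmul(transpose(x_train), x_train))
    result = []
    for x in x_query:
        k = len(x)
        tmp = [sum(x[a] * xtx_inv[a][b] for a in range(k)) for b in range(k)]
        result.append(sum(t * xb for t, xb in zip(tmp, x)))
    return result


def warning_leverage(n, p):
    """h* = 3 (p + 1) / n (Equation 58)."""
    return 3 * (p + 1) / n


if __name__ == "__main__":
    toy_X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3 compounds x 2 descriptors
    print(leverages(toy_X, toy_X))
    print(warning_leverage(3031, 13))  # assumed n; p = 13 selected inputs
```

With p = 13 and the assumed n = 3031, the warning leverage comes out at roughly 0.0139. For training compounds the leverages sum to the number of descriptors, which is a quick sanity check on any implementation.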
When hi < h*, the observed and predicted values are likely to agree, while molecules with hi > h* strongly influence the quality of fit of the developed model. However, such compounds are not necessarily outliers, because their residuals may be low.
It can be observed that compounds with a high leverage value and a good fit in the developed model may stabilize the model. In contrast, compounds with a bad fit in the developed model can be outliers. Hence, the standardized residual and the leverage must be used together to describe the applicability domain of the developed model.

2.5.2 | Insubria graph

The Insubria graph is a plot of Hat diagonal values versus predicted values that is used in the case of chemicals without experimental data to provide a visualization of interpolated and extrapolated predictions. A zone of higher reliability is delimited for both the structures (hi < h*) and the (ln(y2)) predictions (between the maximum and the minimum value of the observed solubility of the training set).

3 | RESULTS AND DISCUSSION

3.1 | Comparisons of performances

This study adopted various analysis methods, such as the convergence curve, to compare the performance of the seven hybrid QSPR-SVR models and to determine the most suitable meta-heuristic algorithm so far for tuning the hyper-parameters of the SVR model for a dataset of 168 drug compounds, accounting for 4490 experimental data points regarding solubility in supercritical carbon dioxide, and 13 selected inputs.
Figure 2 represents the convergence curve of the seven selected optimization approaches. Based on the objective function (AARD%) over 100 iterations, the hybrid HPSOGWO, colored in blue, led to the lowest error (0.7063), followed by the WOA algorithm with an error of 0.7271. The performance of the latter approach was asymptotic to that of HPSOGWO. Thereafter came PSO, ABC, ALO, GWO, then DA, with errors of 0.7280, 0.7345, 0.7419, 0.7481, and 0.7508, respectively. This confirms that the meta-heuristic HPSOGWO outperforms the other six algorithms in terms of minimizing the cost function (AARD%) and of convergence capacity. Table 1 shows the hyper-parameters of the best performance obtained so far with each meta-heuristic algorithm used for tuning the parameters of the hybrid QSPR-SVR algorithm.

TABLE 1 Hyperparameters of the SVR models.

Model          C          σ        ε
DA-SVR         102.2490   1.7118   0.0076
ALO-SVR        65.0157    1.7484   0.0066
GWO-SVR        60         2        0.0068
ABC-SVR        60         1.6724   0.0067
PSO-SVR        62.4902    1.7045   0.0068
WOA-SVR        52.0924    1.5791   0.0063
HPSOGWO-SVR    62.9392    1.6780   0.0065

Abbreviations: ABC, Artificial Bee Colony; ALO, Ant Lion optimizers; DA, Dragonfly; GWO, Grey Wolf; HPSOGWO, hybrid particle Swarm with Grey Wolf optimizers; PSO, Particle Swarm optimizers; SVR, support vector regression; WOA, Whale optimizers.

Figure 3 represents the scatter plots, that is, predicted versus observed values of the solubility fraction of drug compounds in SC-CO2 for the global, train, and test sets for the top three hybrid SVR models of the best iteration. In general, the plots of all algorithms show a positive correlation between the calculated values and the experimental values, with a few outliers. This confirms the good quality of all models and their predictive ability.
Table 2 lists the statistical results for AARD, RMSE, MSE, MAE, r, R2, Q2F1, Q2F2, Q2F3, and AICs for the top three developed models. It can be observed from Table 2 that the calculated parameters were all valid, with a high correlation coefficient r and great robustness: the coefficients of determination R2 were close to 1, and the RMSE, MSE, and MAE of all models were below 0.3. In summary, the results above were in total agreement with the acceptability criteria of the statistical validation of regression models.

FIGURE 2 Convergence curve of the seven algorithms.
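The kind of tuning loop that produces a convergence curve can be sketched with a plain particle swarm optimizer. This is an illustrative sketch and not the authors' MATLAB code: the objective `toy_aard` is a synthetic stand-in for training an SVR on the training split and computing AARD% on held-out data, and the search bounds, swarm size, and PSO coefficients are assumed values chosen only for the example.

```python
import random

# Hypothetical search ranges for (C, sigma, epsilon); the study's ranges are not given here.
BOUNDS = [(1.0, 200.0), (0.1, 5.0), (0.001, 0.1)]


def toy_aard(params):
    """Synthetic surrogate objective (an arbitrary bowl, not fitted to any data).

    In a real workflow this function would train an SVR with the given
    hyper-parameters and return the AARD% of its predictions.
    """
    c, sigma, eps = params
    return 0.7 + ((c - 60.0) / 100.0) ** 2 + (sigma - 1.7) ** 2 + (100.0 * (eps - 0.007)) ** 2


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO; returns best position, best value, and the best-so-far curve."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = []
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)  # one point of the convergence curve
    return gbest, gbest_val, history


if __name__ == "__main__":
    best, best_val, curve = pso_minimize(toy_aard, BOUNDS)
    print(best, best_val, len(curve))
```

The `history` list is the best-so-far objective per iteration, so it can only decrease or stay flat, which is the shape expected of a convergence curve. Swapping in another optimizer only changes the position-update rule; the evaluate-and-keep-best loop stays the same.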
FIGURE 3 Scatter plots of predicted versus observed values for the (1) global, (2) train, and (3) test sets, for the top three models: (A) Particle Swarm optimizers (PSO), (B) Whale optimizers (WOA), (C) hybrid particle Swarm with Grey Wolf optimizers (HPSOGWO).

TABLE 2 Statistical parameters of the top three SVR models.

Model         AARD%   RMSE    MSE     MAE     r       R2      Q2F1    Q2F2    Q2F3    Af      Bf      AICs
PSO-SVR       0.7280  0.1968  0.0387  0.0709  0.9971  0.9941  0.9941  0.9941  0.9941  1.0073  0.9999  14051.7736
WOA-SVR       0.7271  0.1929  0.0372  0.0705  0.9972  0.9944  0.9944  0.9944  0.9945  1.0073  0.9997  14225.7070
HPSOGWO-SVR   0.7063  0.1883  0.0355  0.0693  0.9973  0.9946  0.9946  0.9946  0.9947  1.0070  0.9996  14434.2487

Abbreviations: AARD, average absolute relative deviation; Af, precision factor; AICs, Akaike's information criterion; Bf, bias factor; HPSOGWO, hybrid particle Swarm with Grey Wolf optimizers; MAE, mean absolute error; MSE, mean squared error; PSO, Particle Swarm optimizers; RMSE, root-mean-squared error; SVR, support vector regression; WOA, Whale optimizers.
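For readers who want to reproduce statistics of the kind listed in Table 2, the error and fit measures of Equations (43), (45) to (47), and (54) to (56) can be sketched as below. The function name `regression_stats` and the sample numbers are hypothetical; the snippet assumes strictly positive observed and predicted values so that the logarithms in Af and Bf are defined.

```python
import math


def regression_stats(y_obs, y_pred):
    """Error and fit statistics for paired observed/predicted values."""
    n = len(y_obs)
    resid = [o - p for o, p in zip(y_obs, y_pred)]
    sse = sum(e * e for e in resid)                      # Equation (53)
    mean_obs = sum(y_obs) / n
    sst = sum((o - mean_obs) ** 2 for o in y_obs)
    logs = [math.log10(o / p) for o, p in zip(y_obs, y_pred)]
    return {
        "AARD%": 100.0 / n * sum(abs(e / o) for e, o in zip(resid, y_obs)),
        "MSE": sse / n,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in resid) / n,
        "R2": 1.0 - sse / sst,
        "Bf": 10 ** (sum(logs) / n),                     # bias factor
        "Af": 10 ** (sum(abs(v) for v in logs) / n),     # precision factor
    }


if __name__ == "__main__":
    # Hypothetical values, only to show the call.
    print(regression_stats([1.0, 100.0], [10.0, 10.0]))
```

Because RMSE is the square root of MSE by construction, the two columns of a results table can be cross-checked against each other; for example, 0.1883 squared is about 0.0355.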
FIGURE 4 Bar distribution diagrams of the absolute relative error [%] versus number of data points in percentage of the training set. [x-axis: absolute relative error (%); y-axis: number of data points (%); series: PSO, WOA, HPSOGWO.]

FIGURE 5 Bar distribution diagrams of the absolute relative error [%] versus number of data points in percentage of the test set. [x-axis: absolute relative error (%); y-axis: number of data points (%); series: PSO, WOA, HPSOGWO.]

FIGURE 6 William's plot for the hybrid particle Swarm with Grey Wolf optimizers-support vector regression (HPSOGWO-SVR) model.

Consequently, all the models performed well in model development, suggesting an excellent prediction ability for future data. The models were therefore stable, robust, and predictive, with HPSOGWO-SVR giving the best performance and the best AIC value of 14434.2487.

In order to quantitatively visualize the distribution of relative errors, and to compare the generalization capacity and the tendency to over-fit of the different models, Figures 4 and 5 illustrate a bar distribution of the relative error for the training and test sets, respectively.
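A bar distribution of this kind can be produced by binning the absolute relative errors and converting the counts to percentages. The sketch below is only an assumption about how such a diagram may be built: the bin width of 1%, the 11 bins, and the choice to collect all larger errors in the last bin are illustrative, as is the function name `error_distribution`.

```python
def error_distribution(y_obs, y_pred, n_bins=11, width=1.0):
    """Percentage of data points per absolute-relative-error bin.

    Bin k covers [k * width, (k + 1) * width) percent; the last bin also
    collects every error above its lower edge.
    """
    counts = [0] * n_bins
    for o, p in zip(y_obs, y_pred):
        are = 100.0 * abs((o - p) / o)          # absolute relative error in %
        counts[min(int(are // width), n_bins - 1)] += 1
    total = len(y_obs)
    return [100.0 * c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical values, only to show the call.
    print(error_distribution([10.0, 10.0, 10.0, 10.0], [10.05, 9.95, 10.55, 20.0]))
```

The first entry of the returned list is the share of points with an error below 1%, which is the quantity compared between the training and test sets.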
FIGURE 7 Insubria graph for the hybrid particle Swarm with Grey Wolf optimizers-support vector regression (HPSOGWO-SVR) model.

From both Figures 4 and 5, all models showed a good distribution, since the majority of the data points fall in the range of 0 to less than 1% absolute relative error: more than 90% for the training set, and more than 51% for the test set. In other words, all models had similar generalization capacity and showed no over-fitting. In conclusion, these results demonstrate that the HPSOGWO algorithm is superior to the other six algorithms in terms of optimizing the SVR hyper-parameters. Overall, the results also illustrate that the HPSOGWO-SVR model learns quickly and has a great generalization capacity for solving nonlinear regression problems. The SVR is an effective and powerful technique for forecasting the solubility of drug compounds in SC-CO2.
Figure 6 shows a plot of standardized residuals (Y-axis) versus leverage values (X-axis), referred to as the Williams plot, for the HPSOGWO-SVR model, while the reliable prediction zone of the model, based on structural similarity to the training compounds (leverage value) and the predicted value of solubility, was defined with the Insubria graph in Figure 7.
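The two membership tests behind these plots can be written down compactly. The following is a hedged sketch: the function names are hypothetical, the ±3 residual limit and the structure of the checks follow the description of the Williams plot and Insubria graph in this article, and the example numbers (h* = 0.0139 and the response range 2.0956 to 20.7233) are the values quoted in the discussion of Figures 6 and 7.

```python
def williams_region(h, std_residual, h_star, residual_limit=3.0):
    """Classify one compound on the Williams plot."""
    high_leverage = h > h_star
    response_outlier = abs(std_residual) > residual_limit
    if high_leverage and response_outlier:
        return "high leverage and response outlier"
    if high_leverage:
        return "high leverage"
    if response_outlier:
        return "response outlier"
    return "inside domain"


def insubria_reliable(h, y_pred, h_star, y_min, y_max):
    """Insubria check for compounds without experimental data:
    leverage below h* and prediction inside the training response range."""
    return h < h_star and y_min < y_pred < y_max


if __name__ == "__main__":
    print(williams_region(0.004, 1.2, h_star=0.0139))
    print(insubria_reliable(0.004, 9.5, 0.0139, 2.0956, 20.7233))
```

Keeping the two checks separate mirrors how they are used: the Williams classification needs an observed value to form a residual, whereas the Insubria check needs only the descriptors and the model prediction.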
Analyzing Figure 6, there was no structurally influential compound in the whole dataset with a leverage higher than the warning value h* (0.0139). 98.24% of the whole dataset was located within the horizontal lines (range of ±3), and only 1.76% fell outside these limits. Due to its high predictive ability, the proposed model could be used to screen existing databases or virtual chemical structures to estimate solubility. In this case, the applicability domain can serve as a valuable tool for filtering out "dissimilar" chemical structures.
In Figure 7, a plot of leverage values (X-axis) versus predicted values (Y-axis) for both the training and test sets, referred to as the Insubria graph of the HPSOGWO-SVR model, was considered. The terms "-ln y (Max)" and "-ln y (Min)" refer to the maximum (20.7233) and the minimum (2.0956) values of observed solubility in the training set, respectively. The predicted results are reliable if both conditions hold: hi < h* and -ln y (Min) < -ln y (Pred) < -ln y (Max).
We found that all compounds from the test set were located within the model's applicability domain. This demonstrates that the model obtained in this work has high applicability to new drugs, and it can be applied to screen and prioritize them for future experiments or for filling data gaps.

FIGURE 8 Scatter plot of predicted versus observed values for the hybrid particle Swarm with Grey Wolf optimizers-support vector regression (HPSOGWO-SVR) model, validation set.

As mentioned earlier in this article, the whole dataset was randomly divided into two main sets. The first set (4330 experimental data points, EDP) was used for model development, and the second set (160 EDP) was kept hidden from the built HPSOGWO-SVR model to check the model's capacity to predict untested data (Figure 8).
Goodness-of-fit, robustness, and predictive power were confirmed by the values of R (0.9951), R2 (0.9900), Q2F1 (0.9900), Q2F2 (0.9900), and Q2F3 (0.9900), and by the relatively low error values: AARD% (1.6034), RMSE (0.3102), and MSE (0.0962). Moreover, the scatter plot of Figure 8 shows a visual correlation between observed and predicted solubility fraction values for the validation set.
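The external-validation statistics Q2F1, Q2F2, and Q2F3 of Equations (49) to (51) differ only in the reference variance they use. A minimal sketch, with a hypothetical function name and made-up numbers, is given below; it assumes that the training observations are available alongside the external observations and predictions.

```python
def q2_external(y_train, y_out, y_out_pred):
    """Return (Q2F1, Q2F2, Q2F3) for an external set."""
    n_tr, n_out = len(y_train), len(y_out)
    mean_tr = sum(y_train) / n_tr
    mean_out = sum(y_out) / n_out
    # Prediction error sum of squares on the external set.
    press = sum((o - p) ** 2 for o, p in zip(y_out, y_out_pred))
    q2_f1 = 1.0 - press / sum((o - mean_tr) ** 2 for o in y_out)
    q2_f2 = 1.0 - press / sum((o - mean_out) ** 2 for o in y_out)
    q2_f3 = 1.0 - (press / n_out) / (sum((t - mean_tr) ** 2 for t in y_train) / n_tr)
    return q2_f1, q2_f2, q2_f3


if __name__ == "__main__":
    # Hypothetical values, only to show the call.
    print(q2_external([1.0, 2.0, 3.0], [2.0, 4.0], [2.0, 3.0]))
```

Q2F3 normalizes by the training-set variance, so it does not depend on how the external points happen to be distributed.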
Those results confirmed the high quality of the HPSOGWO-SVR model. Since the error values were close to those of the training and test sets and there were no significantly large residual values for the validation set displayed in Figure 8, the absence of over-fitting of the model can be concluded. In summary, the HPSOGWO-SVR model has by far the best robustness (>0.9) and high predictive power and can be recommended for predicting future data falling into the defined applicability domain.

4 | CONCLUSION

In this article, for the first time, a comparative study was carried out between seven meta-heuristic algorithms in terms of fine-tuning the hyper-parameters of a hybrid QSPR-SVR model defined for predicting drug solubility in supercritical carbon dioxide. The solubility of 168 drug compounds was first collected and then correlated with their molecular structure by the QSPR technique and two independent intensive state variables (temperature and pressure) of a previous study. The seven models DA-SVR, ALO-SVR, GWO-SVR, ABC-SVR, PSO-SVR, WOA-SVR, and HPSOGWO-SVR, built with the seven meta-heuristic algorithms DA, ALO, GWO, ABC, PSO, WOA, and HPSOGWO, were all statistically and graphically approved, while the hybrid HPSOGWO-SVR model outperformed the six other models. The HPSOGWO-SVR model proved to have good predictive ability and robustness, and thus it can be used to estimate the solubility of drug compounds without experimental data available in the literature. The validity of the model predictions was further supported by the external test on 160 experimental data points compared to experimental values, considering new compounds which should belong to the applicability domain (William's plot and Insubria graph). The SVR model presented in this work showed better statistical parameter values and better predictability results. However, due to the stochastic nature of all swarm intelligence algorithms, it is never guaranteed to find an optimal solution for any problem, which always opens the door for new possibilities.

AUTHOR CONTRIBUTIONS
Imane Euldji: Methodology (equal); validation (equal); visualization (equal); writing – original draft (equal); writing – review and editing (equal). Aicha Belghait: Resources (equal); writing – review and editing (equal). Cherif Si-Moussa: Methodology (equal); supervision (equal); writing – original draft (equal); writing – review and editing (equal). Othmane Benkortbi: Methodology (equal); supervision (equal); writing – original draft (equal); writing – review and editing (equal). Abdeltif Amrane: Writing – review and editing (equal).

ACKNOWLEDGMENTS
The authors gratefully acknowledge the Algerian Ministry of Higher Education and Scientific Research (PRFU Project A16N01UN26 0120220003) and the University Yahia Fares of Medea.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The additional data of this article can be found online in (Database-SvM.PDF). Software, web servers and calculation tools belong to their respective developers and copyright holders. All statistical and graphic analyses have been obtained with codes that we have developed using the Matlab® R2021B environment.

ORCID
Imane Euldji https://orcid.org/0000-0002-8025-4996

REFERENCES
1. Alsenz J, Kansy M. High throughput solubility measurement in drug discovery and development. Adv Drug Deliv Rev. 2007;59(7):546-567.
2. Vimalson DC. Techniques to enhance solubility of hydrophobic drugs: an overview. Asian J Pharmaceut. 2016;10(2):67-75.
3. Tihanyi KK, Vastag M. Solubility, Delivery and ADME Problems of Drugs and Drug Candidates. Bentham Science Publishers; 2011.
4. Sodeifian G, Sajadian SA, Razmimanesh F. Solubility of an antiarrhythmic drug (amiodarone hydrochloride) in supercritical carbon dioxide: experimental and modeling. Fluid Phase Equilib. 2017;450:149-159.
5. Sodeifian G, Sajadian SA, Ardestani NS. Determination of solubility of Aprepitant (an antiemetic drug for chemotherapy) in supercritical carbon dioxide: empirical and thermodynamic models. J Supercrit Fluids. 2017;128:102-111.
6. Savjani KT, Gajjar AK, Savjani JK. Drug solubility: importance and enhancement techniques. Int Sch Res Notices. 2012;2012:1-10.
7. Kakran M, Li L, Müller RH. Overcoming the challenge of poor drug solubility. Pharm Eng. 2012;32(7–8):1-7.
8. Sodeifian G, Garlapati C, Hazaveie SM, Sodeifian F. Solubility of 2, 4, 7-Triamino-6-phenylpteridine (triamterene, diuretic drug) in supercritical carbon dioxide: experimental data and modeling. J Chem Eng Data. 2020;65(9):4406-4416.
9. Sodeifian G, Nasri L, Razmimanesh F, Abadian M. CO2 utilization for determining solubility of teriflunomide (immunomodulatory agent) in supercritical carbon dioxide: experimental investigation and thermodynamic modeling. J CO2 Util. 2022;58:101931.
10. Hazaveie SM, Sodeifian G, Sajadian SA. Measurement and thermodynamic modeling of solubility of Tamsulosin drug (anti cancer and anti-prostatic tumor activity) in supercritical carbon dioxide. J Supercrit Fluids. 2020;163:104875.
11. Sodeifian G, Garlapati C, Razmimanesh F, Sodeifian F. The solubility of Sulfabenzamide (an antibacterial drug) in supercritical carbon dioxide: evaluation of a new thermodynamic model. J Mol Liq. 2021;335:116446.
12. Sodeifian G, Razmimanesh F, Sajadian SA. Prediction of solubility of sunitinib malate (an anti-cancer drug) in supercritical carbon dioxide (SC–CO2): experimental correlations and thermodynamic modeling. J Mol Liq. 2020;297:111740.
13. Sodeifian G, Razmimanesh F, Sajadian SA, Hazaveie SM. Experimental data and thermodynamic modeling of solubility of Sorafenib tosylate, as an anti-cancer drug, in supercritical carbon dioxide: evaluation of Wong-Sandler mixing rule. J Chem Thermodyn. 2020;142:105998.
14. Sodeifian G, Sajadian SA. Experimental measurement of solubilities of sertraline hydrochloride in supercritical carbon dioxide with/without menthol: data correlation. J Supercrit Fluids. 2019;149:79-87.
15. Sodeifian G, Garlapati C, Razmimanesh F, Sodeifian F. Solubility of amlodipine besylate (calcium channel blocker drug) in supercritical carbon dioxide: measurement and correlations. J Chem Eng Data. 2021;66(2):1119-1131.
16. Sodeifian G, Garlapati C, Razmimanesh F, Nateghi H. Experimental solubility and thermodynamic modeling of empagliflozin in supercritical carbon dioxide. Sci Rep. 2022;12(1):9008.
EULDJI ET AL. 15 of 17

17. Sodeifian G, Ardestani NS, Sajadian SA, Panah HS. Experimental 34. Euldji I, Si-Moussa C, Hamadache M, Benkortbi O. QSPR modelling
measurements and thermodynamic modeling of coumarin-7 solid of the solubility of drug and drug-like compounds in supercritical
solubility in supercritical carbon dioxide: production of nanoparticles carbon dioxide. Molecular Informatics. 2022;41(10):2200026.
via RESS method. Fluid Phase Equilib. 2019;483:122-143. 35. Reddy TA, Garlapati C. Dimensionless empirical model to correlate
18. Sodeifian G, Alwi RS, Razmimanesh F, Abadian M. Solubility of pharmaceutical compound solubility in supercritical carbon dioxide.
Dasatinib monohydrate (anticancer drug) in supercritical CO2: Chem Eng Technol. 2019;42(12):2621-2630.
experimental and thermodynamic modeling. J Mol Liq. 2022;346: 36. Amooey AA. A simple correlation to predict drug solubility in super-
117899. critical carbon dioxide. Fluid Phase Equilib. 2014;375:332-339.
19. Sodeifian G, Detakhsheshpour R, Sajadian SA. Experimental study 37. Nejad SJ, Abolghasemi H, Moosavian M, Maragheh M. Prediction of
and thermodynamic modeling of esomeprazole (proton-pump inhibi- solute solubility in supercritical carbon dioxide: a novel semi-
tor drug for stomach acid reduction) solubility in supercritical carbon empirical model. Chem Eng Res Des. 2010;88(7):893-898.
dioxide. J Supercrit Fluids. 2019;154:104606. 38. Belghait A, Si-Moussa C, Laidi M, Hanini S. Semi-empirical correla-
20. Sodeifian G, Razmimanesh F, Sajadian SA. Solubility measurement of tion of solid solute solubility in supercritical carbon dioxide: compar-
a chemotherapeutic agent (Imatinib mesylate) in supercritical carbon ative study and proposition of a novel density-based model. C R
dioxide: assessment of new empirical model. J Supercritical Fluids. Chim. 2018;21(5):494-513.
2019;146:89-99. 39. Si-Moussa C, Belghait A, Khaouane L, Hanini S, Halilali A. Novel
21. Sodeifian G, Sajadian SA, Razmimanesh F, Hazaveie SM. Solubility of density-based model for the correlation of solid drugs solubility in
ketoconazole (antifungal drug) in SC-CO2 for binary and ternary sys- supercritical carbon dioxide. C R Chim. 2017;20(5):559-572.
tems: measurements and empirical correlations. Sci Rep. 2021;11(1): 40. Su C-S, Chen Y-P. Correlation for the solubilities of pharmaceutical
7546. compounds in supercritical carbon dioxide. Fluid Phase Equilib. 2007;
22. Sodeifian G, Hazaveie SM, Sajadian SA, Razmimanesh F. Experimen- 254(1–2):167-173.
tal investigation and modeling of the solubility of oxcarbazepine 41. Yazdizadeh M, Eslamimanesh A, Esmaeilzadeh F. Thermodynamic
(an anticonvulsant agent) in supercritical carbon dioxide. Fluid Phase modeling of solubilities of various solid compounds in supercritical
Equilibria. 2019;493:160-173. carbon dioxide: effects of equations of state and mixing rules.
23. Sodeifian G, Alwi RS, Razmimanesh F. Solubility of Pholcodine (anti- J Supercrit Fluids. 2011;55(3):861-875.
tussive drug) in supercritical carbon dioxide: experimental data and 42. Wang L-H, Lin S-T. A predictive method for the solubility of drug in
thermodynamic modeling. Fluid Phase Equilibria. 2022;556:113396. supercritical carbon dioxide. J Supercrit Fluids. 2014;85:81-88.
24. Sodeifian G, Alwi RS, Razmimanesh F, Tamura K. Solubility of quetia- 43. Eric S, Kalinic M, Popovic A, Zloh M, Kuzmanovski I. Prediction of
pine hemifumarate (antipsychotic drug) in supercritical carbon diox- aqueous solubility of drug-like molecules using a novel algorithm for
ide: experimental, modeling and Hansen solubility parameter automatic adjustment of relative importance of descriptors imple-
application. Fluid Phase Equilibria. 2021;537:113003. mented in counter-propagation artificial neural networks. Int J
25. Sodeifian G, Hazaveie SM, Sodeifian F. Determination of Galanta- Pharm. 2012;437(1–2):232-241.
mine solubility (an anti-alzheimer drug) in supercritical carbon diox- 44. Baghban A, Jalali A, Mohammadi AH, Habibzadeh S. Efficient model-
ide (CO2): experimental correlation and thermodynamic modeling. ing of drug solubility in supercritical carbon dioxide. J Supercrit Fluids.
J Mol Liq. 2021;330:115695. 2018;133:466-478.
26. Sodeifian G, Ardestani NS, Sajadian SA, Panah HS. Measurement, 45. Abdallah El Hadj A, Laidi M, Si-Moussa C, Hanini S. Novel approach
correlation and thermodynamic modeling of the solubility of for estimating solubility of solid drugs in supercritical carbon dioxide
Ketotifen fumarate (KTF) in supercritical carbon dioxide: evaluation and critical properties using direct and inverse artificial neural net-
of PCP-SAFT equation of state. Fluid Phase Equilib. 2018;458: work (ANN). Neu Comput Appl. 2017;28:87-99.
102-114. 46. Lashkarbolooki M, Vaferi B, Rahimpour M. Comparison the capabil-
27. Sodeifian G, Sajadian SA, Derakhsheshpour R. Experimental mea- ity of artificial neural network (ANN) and EOS for prediction of solid
surement and thermodynamic modeling of lansoprazole solubility in solubilities in supercritical carbon dioxide. Fluid Phase Equilib. 2011;
supercritical carbon dioxide: application of SAFT-VR EoS. Fluid Phase 308(1–2):35-43.
Equilib. 2020;507:112422. 47. Mehdizadeh B, Movagharnejad K. A comparison between neural
28. Sodeifian G, Razmimanesh F, Sajadian SA, Panah HS. Solubility mea- network method and semi empirical equations to predict the solubil-
surement of an antihistamine drug (Loratadine) in supercritical car- ity of different compounds in supercritical carbon dioxide. Fluid
bon dioxide: assessment of qCPA and PCP-SAFT equations of state. Phase Equilib. 2011;303(1):40-44.
Fluid Phase Equilib. 2018;472:147-159. 48. Sodeifian G, Sajadian SA, Razmimanesh F, Ardestani NS. A compre-
29. Sodeifian G, Nasri L, Razmimanesh F, Abadian M. Measuring and model- hensive comparison among four different approaches for predicting
ing the solubility of an antihypertensive drug (losartan potassium, the solubility of pharmaceutical solid compounds in supercritical car-
Cozaar) in supercritical carbon dioxide. J Mol Liq. 2021;331:115745. bon dioxide. Korean J Chem Eng. 2018;35:2097-2116.
30. Sodeifian G, Ardestani NS, Razmimanesh F, Sajadian SA. Experimen- 49. Mehdizadeh B, Movagharnejad K. A comparative study between LS-
tal and thermodynamic analyses of supercritical CO2-solubility of SVM method and semi empirical equations for modeling the solubil-
minoxidil as an antihypertensive drug. Fluid Phase Equilib. 2020;522: ity of different solutes in supercritical carbon dioxide. Chem Eng Res
112745. Des. 2011;89(11):2420-2427.
31. Sodeifian G, Hazaveie SM, Sajadian SA, Saadati Ardestani N. Deter- 50. Vatani Z, Ramezanian Bajgiran S, Amini G, Tayyebi S. Solubility
mination of the solubility of the repaglinide drug in supercritical car- modeling of supercritical fluid extraction in a wide range com-
bon dioxide: experimental data and thermodynamic modeling. pounds: comparison between fuzzy-genetic and new empirical
J Chem Eng Data. 2019;64(12):5338-5348. models. Energy Sources A: Recovery Util Environ Eff. 2020;42(3):
32. Thakkar FMV, Soni T, Gohel M, Gandhi T. Supercritical fluid technol- 365-374.
ogy: a promising approach to enhance the drug solubility. J Pharm 51. Huuskonen J, Livingstone DJ, Manallack DT. Prediction of drug solu-
Sci Res. 2009;1(4):1. bility from molecular structure using a drug-like training set. SAR
33. Kankala RK, Zhang YS, Wang SB, Lee CH, Chen AZ. Supercritical QSAR Environ Res. 2008;19(3–4):191-212.
fluid technology: an emphasis on drug delivery and related biomedi- 52. Gurikov P, Lebedev I, Kolnoochenko A, Menshutina N. Prediction
cal applications. Adv Healthc Mater. 2017;6(16):1700433. of the solubility in supercritical carbon dioxide: a hybrid
16 of 17 EULDJI ET AL.

thermodynamic/QSPR approach. Comput Aided Chem Eng. 2016;38:1587-1592.
53. Valenzuela Roediger LM, Reveco-Chilla A, Del Valle Lladser JM. Modeling solubility in supercritical carbon dioxide using quantitative structure-property relationships. 2014.
54. Zhang J, Wang Y. Evaluating the bond strength of FRP-to-concrete composite joints using metaheuristic-optimized least-squares support vector regression. Neu Comput Appl. 2021;33:3621-3635.
55. Zhang J, Huang Y, Ma G, Sun J, Nener B. A metaheuristic-optimized multi-output model for predicting multiple properties of pervious concrete. Construct Build Mater. 2020;249:118803.
56. Tran D-H, Luong D-L, Chou J-S. Nature-inspired metaheuristic ensemble model for forecasting energy consumption in residential buildings. Energy. 2020;191:116552.
57. Zhou J, Qiu Y, Zhu S, et al. Optimization of support vector machine through the use of metaheuristic algorithms in forecasting TBM advance rate. Eng Appl Artif Intel. 2021;97:104015.
58. Panahi M, Gayen A, Pourghasemi HR, Rezaie F, Lee S. Spatial prediction of landslide susceptibility using hybrid support vector regression (SVR) and the adaptive neuro-fuzzy inference system (ANFIS) with various metaheuristic algorithms. Sci Total Environ. 2020;741:139937.
59. Balogun A-L, Rezaie F, Pham QB, et al. Spatial prediction of landslide susceptibility in western Serbia using hybrid support vector regression (SVR) with GWO, BAT and COA algorithms. Geosci Frontiers. 2021;12(3):101104.
60. Abbaszadeh Shahri A, Maghsoudi Moud F, Mirfallah Lialestani SP. A hybrid computing model to predict rock strength index properties using support vector regression. Eng Comput. 2022;38(1):579-594.
61. Caraka RE, Chen RC, Bakar SA, et al. Employing best input SVR robust lost function with nature-inspired metaheuristics in wind speed energy forecasting. IAENG Int J Comput Sci. 2020;47(3):572-584.
62. Musa B, Yimen N, Abba SI, Adun HH, Dagbasi M. Multi-state load demand forecasting using hybridized support vector regression integrated with optimal design of off-grid energy systems—a metaheuristic approach. Processes. 2021;9(7):1166.
63. Bonah E, Huang X, Hongying Y, et al. Detection of salmonella Typhimurium contamination levels in fresh pork samples using electronic nose smellprints in tandem with support vector machine regression and metaheuristic optimization algorithms. J Food Sci Technol. 2021;58:3861-3870.
64. Malik A, Tikhamarine Y, Souag-Gamane D, Rai P, Sammen SS, Kisi O. Support vector regression integrated with novel meta-heuristic algorithms for meteorological drought prediction. Meteorol Atmos Phys. 2021;133:891-909.
65. Rahmati O, Darabi H, Panahi M, et al. Development of novel hybridized models for urban flood susceptibility mapping. Sci Rep. 2020;10(1):12937.
66. Fadhillah MF, Lee S, Lee C-W, Park Y-C. Application of support vector regression and metaheuristic optimization algorithms for groundwater potential mapping in Gangneung-si, South Korea. Remote Sens (Basel). 2021;13(6):1196.
67. Setiawan IN, Kurniawan R, Yuniarto B, Caraka RE, Pardamean B. Parameter optimization of support vector regression using Harris hawks optimization. Procedia Comput Sci. 2021;179:17-24.
68. da Silva Santos CE, Sampaio RC, dos Santos Coelho L, Bestard GA, Llanos CH. Multi-objective adaptive differential evolution for SVM/SVR hyperparameters selection. Pattern Recognit. 2021;110:107649.
69. Malla C, Panigrahi I. Review of condition monitoring of rolling element bearing using vibration analysis and other techniques. J Vib Eng Technol. 2019;7:407-414.
70. Okwu MO, Tartibu LK. Metaheuristic Optimization: Nature-Inspired Algorithms Swarm and Computational Intelligence, Theory and Applications. Vol 927. Springer Nature; 2020.
71. Hu X. Support Vector Machine and its Application to Regression and Classification. 2017.
72. Jakkula V. Tutorial on support vector machine (svm). School of EECS, Washington State University 2006 37(2.5):3.
73. Benimam H, Moussa CS, Hentabli M, Hanini S, Laidi M. Dragonfly-support vector machine for regression modeling of the activity coefficient at infinite dilution of solutes in imidazolium ionic liquids using σ-profile descriptors. J Chem Eng Data. 2020;65(6):3161-3172.
74. Yu P-S, Chen S-T, Chang I-F. Support vector regression for real-time flood stage forecasting. J Hydrol. 2006;328(3–4):704-716.
75. Tatar A, Barati A, Yarahmadi A, Najafi A, Lee M, Bahadori A. Prediction of carbon dioxide solubility in aqueous mixture of methyldiethanolamine and N-methylpyrrolidone using intelligent models. Int J Greenhouse Gas Con. 2016;47:122-136.
76. Baghban A, Mohammadi AH, Taleghani MS. Rigorous modeling of CO2 equilibrium absorption in ionic liquids. Int J Greenhouse Gas Con. 2017;58:19-41.
77. Xu C, Nait Amar M, Ghriga MA, Ouaer H, Zhang X, Hasanipanah M. Evolving support vector regression using Grey wolf optimization; forecasting the geomechanical properties of rock. Eng Comput. 2022;38:1819-1833.
78. Farhat NH. Photonic neural networks and learning machines. IEEE Expert. 1992;7(5):63-72.
79. Amroune M, Bouktir T, Musirin I. Power system voltage stability assessment using a hybrid approach combining dragonfly optimization algorithm and support vector regression. Arabian J Sci Eng. 2018;43:3023-3036.
80. Mirjalili S. Dragonfly algorithm: a new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems. Neu Comput Appl. 2016;27:1053-1073.
81. Yasen M, Al-Madi N, Obeid N. Optimizing neural networks using dragonfly algorithm for medical prediction. Paper presented at: 2018 8th international conference on computer science and information technology (CSIT). 2018.
82. Salam MA, Zawbaa HM, Emary E, Ghany KKA, Parv B. A hybrid dragonfly algorithm with extreme learning machine for prediction. Paper presented at: 2016 International symposium on innovations in intelligent systems and applications (INISTA). 2016.
83. Reynolds CW. Flocks, herds and schools: A distributed behavioral model. Paper presented at: Proceedings of the 14th annual conference on Computer graphics and interactive techniques. 1987.
84. Mirjalili S. The ant lion optimizer. Adv Eng Software. 2015;83:80-98.
85. Saha S, Mukherjee V. A novel quasi-oppositional chaotic antlion optimizer for global optimization. Appl Intelligence. 2018;48:2628-2660.
86. Gupta E, Saxena A. Performance evaluation of antlion optimizer based regulator in automatic generation control of interconnected power system. J Eng. 2016;2016:1-14.
87. Mirjalili S, Mirjalili SM, Lewis A. Grey wolf optimizer. Adv Eng Software. 2014;69:46-61.
88. Nadimi-Shahraki MH, Taghian S, Mirjalili S. An improved grey wolf optimizer for solving engineering problems. Expert Syst Appl. 2021;166:113917.
89. Emary E, Yamany W, Hassanien AE, Snasel V. Multi-objective gray-wolf optimization for attribute reduction. Procedia Comput Sci. 2015;65:623-632.
90. Kumar A, Pant S, Ram M. System reliability optimization using gray wolf optimizer algorithm. Qual Reliab Eng Int. 2017;33(7):1327-1335.
91. Karaboga D, Basturk B. On the performance of artificial bee colony (ABC) algorithm. Appl Soft Comput. 2008;8(1):687-697.
92. Karaboga D. An Idea Based on Honey Bee Swarm for Numerical Optimization: Technical Report-TR06. Erciyes University, Engineering Faculty, Computer; 2005.
93. Okwu MO, Tartibu LK. Particle swarm optimisation. Metaheuristic Optimization: Nature-Inspired Algorithms Swarm and Computational Intelligence, Theory and Applications. Springer Nature; 2021:5-13.
94. Eberhart R, Kennedy J. A new optimizer using particle swarm theory. Paper presented at: MHS'95. Proceedings of the Sixth International Symposium on Micro Machine and Human Science 1995.
95. Garg H. A hybrid PSO-GA algorithm for constrained optimization problems. Appl Math Comput. 2016;274:292-305.
96. Şenel FA, Gökçe F, Yüksel AS, Yiğit T. A novel hybrid PSO–GWO algorithm for optimization problems. Eng Comput. 2019;35:1359-1373.
97. Mirjalili S, Lewis A. The whale optimization algorithm. Adv Eng Software. 2016;95:51-67.
98. Hemeida A, Alkhalaf S, Mady A, Mahmoud E, Hussein M, Eldin AMB. Implementation of nature-inspired optimization algorithms in some data mining tasks. Ain Shams Eng J. 2020;11(2):309-318.
99. Nasiri J, Khiyabani FM. A whale optimization algorithm (WOA) approach for clustering. Cogent Math Stat. 2018;5(1):1483565.
100. Kaur G, Arora S. Chaotic whale optimization algorithm. J Comput Des Eng. 2018;5(3):275-284.
101. Singh N, Singh S. Hybrid algorithm of particle swarm optimization and grey wolf optimizer for improving convergence performance. J Appl Mathematics. 2017;2017:1-15.
102. Kuonen D. Book review: regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. Stat Methods Med Res. 2004;13(5):415-416.
103. Consonni V, Ballabio D, Todeschini R. Comments on the definition of the Q² parameter for QSAR validation. J Chem Inf Model. 2009;49(7):1669-1678.
104. Todeschini R, Ballabio D, Grisoni F. Beware of unreliable Q²! A comparative study of regression metrics for predictivity assessment of QSAR models. J Chem Inf Model. 2016;56(10):1905-1913.
105. Soleimani R, Saeedi Dehaghani AH, Shoushtari NA, Yaghoubi P, Bahadori A. Toward an intelligent approach for predicting surface tension of binary mixtures containing ionic liquids. Korean J Chem Eng. 2018;35:1556-1569.
106. Veerasamy R, Rajak H, Jain A, Sivadasan S, Varghese CP, Agrawal RK. Validation of QSAR models-strategies and importance. Int J Drug Des Discov. 2011;3:511-519.
107. Roy K, Mitra I. On various metrics used for validation of predictive QSAR models with applications in virtual screening and focused library design. Comb Chem High Throughput Screen. 2011;14(6):450-474.
108. Falyouna O, Eljamal O, Maamoun I, Tahara A, Sugihara Y. Magnetic zeolite synthesis for efficient removal of cesium in a lab-scale continuous treatment system. J Colloid Interface Sci. 2020;571:66-79.
109. Hamadache M, Hanini S, Benkortbi O, Amrane A, Khaouane L, Moussa CS. Artificial neural network-based equation to predict the toxicity of herbicides on rats. Chemom Intel Lab Syst. 2016;154:7-15.
110. Bouarra N, Kherouf S, Bouakkadia A, Messadi D. QSPR application on modeling of boiling point of polycyclic aromatic hydrocarbons. Res J Pharma Biol Chem Sci. 2017;8(6):19-28.
111. Zhao X, Pan Y, Jiang J, Xu S, Jiang J, Ding L. Thermal hazard of ionic liquids: modeling thermal decomposition temperatures of imidazolium ionic liquids via QSPR method. Ind Eng Chem Res. 2017;56(14):4185-4195.
112. Mansourian M, Saghaie L, Fassihi A, Madadkar-Sobhani A, Mahnam K. Linear and nonlinear QSAR modeling of 1,3,8-substituted-9-deazaxanthines as potential selective A2B AR antagonists. Med Chem Res. 2013;22:4549-4567.

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Euldji I, Belghait A, Si-Moussa C, Benkortbi O, Amrane A. A new hybrid quantitative structure property relationships-support vector regression (QSPR-SVR) approach for predicting the solubility of drug compounds in supercritical carbon dioxide. AIChE J. 2023;e18115. doi:10.1002/aic.18115
