
Molecular Diversity

https://doi.org/10.1007/s11030-019-10026-9

ORIGINAL ARTICLE

Predictive QSAR modeling for the antioxidant activity of natural compounds derivatives based on Monte Carlo method
Shahin Ahmadi1 · Hosein Ghanbari1 · Shahram Lotfi2 · Neda Azimi3

Received: 13 October 2019 / Accepted: 23 December 2019


© Springer Nature Switzerland AG 2020

Abstract
In this research, QSAR modeling based on the SMILES notation of compounds and the Monte Carlo method was carried out to predict the antioxidant activity of 79 derivatives of pulvinic acid, 23 derivatives of coumarine, and nine structurally unrelated compounds against three radical sources: Fenton reaction, gamma radiation, and UV radiation. The QSAR models were built with CORAL software using a recent optimization criterion known as the index of ideality of correlation (IIC). The full set of antioxidant compounds was randomly distributed into four sets (training, invisible training, calibration, and validation); this division was repeated three times. The optimal descriptors were selected from a hybrid model combining hydrogen-suppressed graph (HSG) and SMILES descriptors on the basis of the objective function. The predictability of the models was assessed on the validation sets. The results for the three random splits showed that simple, robust, reliable, and predictive models were obtained for the training, invisible training, calibration, and validation sets of all three models. The main descriptors responsible for a decrease or an increase of the activity were identified. This simple QSAR approach can be useful for predicting the antioxidant activity of numerous antioxidants.
Graphic abstract

[Figure: predicted versus experimental Fenton antioxidant activity for the training, invisible training, calibration, and validation sets.]

Keywords QSAR · Antioxidant activity · CORAL software · Monte Carlo

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11030-019-10026-9) contains supplementary material, which is available to authorized users.

Extended author information available on the last page of the article


Introduction

Antioxidants are chemical molecules that delay autoxidation either by preventing the formation of free radicals or by interrupting their propagation. Free radicals are perpetually produced via many biological or chemical processes, and they can also emerge spontaneously [1]. Many factors promote the formation of free radicals, for example smoking, consuming alcohol or certain types of food, exposure to UV and ionizing radiation (such as γ-rays), ozone, or low-molecular-weight complexes of transition metals, such as iron via the Fenton reaction. Free radicals have a key part in many physiological, biological, and pathological conditions. In biological systems, they are usually defined as reactive molecules containing oxygen, nitrogen, and sulfur. They are related to material degradation, food deterioration, and many different human diseases, including degenerative eye disease, cardiovascular disease, diabetes, Alzheimer’s, cancer, and neurodegenerative diseases. Free radical formation can be regulated by antioxidants through different mechanisms. The effectiveness of antioxidants is associated with the activation energy, rate constants, oxidation–reduction potential, and antioxidant solubility [2–6].

In this context, plant foods containing a large number of antioxidants, such as tocopherols, ascorbic acid, glutathione, or other natural antioxidants, can prevent human disease. Natural antioxidants also prove to be suitable additives in cosmetic products, foods, and food supplements [7, 8]. Pulvinic acids and coumarine derivatives are natural compounds; the former are orange pigments found in lichens and mushrooms, and the latter are found in a variety of plant sources, green plants in particular [9, 10]. These compounds and their derivatives display various biological activities, such as antimicrobial [11], antiviral, antitumor, antifungal [12], cytotoxic [13], as well as anti-inflammatory, antiallergic, anticarcinogenic, hepatoprotective, and antioxidant activities [9].

Ionizing radiation is used in the medical and industrial sectors, for example in cancer diagnosis and treatment and in power generation in nuclear plants. There are many pharmacological approaches to decrease the detrimental consequences of radiation [14–16]. Among them, antioxidants can be used as radiation protection agents. These compounds can effectively scavenge free radicals by trapping them or by absorbing radiation [5, 17]. The ability of antioxidants to act against free radicals can be tested in vivo or in vitro, but such tests have disadvantages: they are expensive, time-consuming, and questionable for ethical reasons (animal sacrifice) [18].

Quantitative structure–activity relationship (QSAR) modeling is a powerful theoretical technique for predicting the activity of compounds. A predictive model that rapidly evaluates the potential activity of compounds can be an interesting alternative to determining the activities of chemical compounds experimentally. Various multivariate linear and nonlinear techniques can be employed to derive correlation models between molecular structure and biological activity, including multiple linear regression (MLR) [19–28], partial least squares (PLS) [29, 30], principal component regression (PCR) [28], artificial neural networks (ANN) [31], and support vector machines (SVM) [31].

Over the last two decades, QSAR models have developed quickly and have been broadly applied by chemists for predicting the chemical properties and biological activities of different compound types. The QSAR technique is now also applied to correlate the chemical structure and activity of certain antioxidants. For example, Jorge et al. built a QSAR model with Dragon software and a neural network technique based on the MLP (multilayer perceptron) to predict the radical scavenging ability of 1373 chemical compounds. Their models showed acceptable performance for the training set (R² = 0.713) and the test set (Q²ext = 0.654) [32]. In another work, Le Roux et al. presented a QSAR model for predicting the antioxidative potency of 30 derivatives of pulvinic acid under Fenton conditions and UV irradiation based on counter-propagation artificial neural networks (CP-ANN). They aimed to predict 80 novel molecules with high antioxidant activity to be considered for chemical synthesis. Among them, they found that a coumarine derivative has excellent protective potency under UV irradiation and moderate potency under Fenton conditions [18].

Moreover, Alisi et al. developed a QSAR model to predict the antioxidant activities of 47 curcumin derivatives based on the DPPH (2,2-diphenyl-1-picrylhydrazyl) assay using a genetic function algorithm (GFA), with satisfactory internal and external validations [33]. Martinčič et al. measured the antioxidant activity of a series of coumarine and pulvinic acid derivatives experimentally and modeled the activity of the compounds with CP-ANN, SVR, and MLR to discover new compounds with high antioxidant properties. The resulting data pointed to ten new compounds from the chemical classes of tetronic acid and barbituric acid with promising antioxidant activity, comparable or even superior to certain standard antioxidants [31]. Recently, Ahmadi et al. constructed a QSAR model with CORAL to predict the radical scavenging activities of 96 natural antioxidants on the basis of the graph of atomic orbitals. The QSAR models were validated internally and externally and had satisfactory R² and Q² values for the training and validation sets [4].

Here, the data set on the activity of antioxidants from Martinčič et al. was utilized. They constructed three QSAR models for this data set based on a combination of descriptors from Codessa, Dragon, as well as Volsurf software, and


utilizing multiple linear regression (MLR), counter-propagation artificial neural networks (CP-ANN), as well as support vector regression (SVR) [31]. This research aims to assess CORAL-software-based (http://www.insilico.eu/coral) QSAR models for the antioxidant activity of some natural compounds and to assess the IIC as a criterion of predictability. The CORAL software represents chemical molecules with the simplified molecular-input line-entry system (SMILES), which can be obtained rapidly. Because it requires no molecular optimization and saves a lot of time and money, it has an important role in the molecular design process. Furthermore, the constructed models can be interpreted in CORAL software, whereas interpreting models based on 3D descriptors and other linear or nonlinear multivariate calibrations (e.g., PLS, PCR, and ANN) is not simple. Using the increasing and decreasing descriptors identified by a CORAL model, new compounds with high antioxidant activity can be designed easily and rapidly. This is not the case for many QSAR models built with 3D descriptors and multivariate calibration such as PLS, PCR, and ANN.

Method

Data set

The experimental data on the antioxidant activity of 79 derivatives of pulvinic acid, 23 derivatives of coumarine, and nine structurally unrelated compounds were obtained from Martinčič et al. [31]. The antioxidant activities of the compounds were determined with three sources of radicals: gamma radiation, UV radiation, and the Fenton reaction [6, 18]. For modeling purposes, the terms Fenton, UV, and gamma activity are used in the rest of the text when discussing the antioxidant activity of the molecules. Fenton and UV activities were assessed using 100 μM of each compound, whereas the gamma activity assay was performed at a concentration of 50 μM. All activities were expressed as the percentage of thymidine that remained intact after exposure to the source of radicals in the presence of the antioxidant. For Fenton and UV activities, all compounds in the data set were assessed experimentally, whereas for gamma activity only 91 compounds were tested. The molecular structures of the antioxidants were converted into canonical SMILES with the free software BIOVIA Draw 2016. The data set was randomly divided into training, calibration, and validation sets. Different numbers of compounds were used for the training (training and invisible training), calibration, and validation sets, because the data set is structurally diverse. However, the models did not reach adequate statistical quality in cross-validation and external validation when more than 10 percent of the compounds were placed in the validation set. Finally, for each of three random splits, the data were divided into training (about 80%), calibration (about 10%), and validation (about 10%) sets, and good statistical quality was obtained for the training and validation sets.

Descriptors

The optimal descriptors used in this study to build the QSAR model are a combination of hydrogen-suppressed graph (HSG) and SMILES descriptors.

The model based on the optimal descriptor of correlation weights (DCW) that predicts the antioxidant activity of compounds is defined as follows:

$$\mathrm{Activity} = C_0 + C_1 \times \mathrm{DCW}(T^*, N^*) \tag{1}$$

where C_0 and C_1 are the regression coefficients, T* is the optimal threshold used to define rare molecular features, and N* is the optimal number of epochs of the optimization [4, 34, 35].

The hybrid model is defined as the combination of the SMILES and graph descriptors:

$$\mathrm{DCW}(T^*, N^*) = {}^{\mathrm{SMILES}}\mathrm{DCW}(T^*, N^*) + {}^{\mathrm{Graph}}\mathrm{DCW}(T^*, N^*) \tag{2}$$

The optimal graph-based descriptor (DCW) is computed as follows:

$${}^{\mathrm{HSG}}\mathrm{DCW}(T, N_{\mathrm{epoch}}) = \mathrm{CW}(C5) + \mathrm{CW}(C6) + \sum \mathrm{CW}(pt3_k) + \sum \mathrm{CW}({}^{0}EC_k) + \sum \mathrm{CW}({}^{1}EC_k) + \sum \mathrm{CW}({}^{2}EC_k) \tag{3}$$

where C5 and C6 denote the presence of five- and six-membered rings, respectively; pt3_k is the number of paths of length 3 that start from vertex k; and ⁰EC_k, ¹EC_k, and ²EC_k are the hierarchy of Morgan’s extended connectivity.

$${}^{\mathrm{SMILES}}\mathrm{DCW}(T, N_{\mathrm{epoch}}) = \mathrm{CW}(\mathrm{HARD}) + \mathrm{CW}(\mathrm{BOND}) + \sum \mathrm{CW}(S_k) + \sum \mathrm{CW}(SS_k) \tag{4}$$

where HARD is the association of BOND, NOSP, and HALO in a united structural code, and S_k and SS_k are combinations of one and two SMILES symbols, respectively. CW(x) is the correlation weight of a SMILES attribute or an HSG invariant.

For Fenton activity modeling, a hybrid model was built from a combination of EC0, EC1, EC2, Sk, SSk, and HARD. For UV, a combination of EC0, EC1, EC2, Sk, SSk, NOSP, and HARD was used. Finally, for gamma, a hybrid model from the combination of EC0, EC1, EC2, pt3, C5, C6, Sk, SSk, and HARD was used.

There exist two target functions to create the QSAR model in CORAL software: target function on the basis of


the balance of correlation (TF1) and the target function based on the index of ideality of correlation (TF2) [36].

$$TF_1 = R_{\mathrm{TRN}} + R_{\mathrm{iTRN}} - |R_{\mathrm{TRN}} - R_{\mathrm{iTRN}}| \times c \tag{4}$$

where R_TRN and R_iTRN are the correlation coefficients between the observed and computed values of an endpoint for the training and invisible training sets, respectively, and c is an empirical constant.

In this work, TF2 has been applied:

$$TF_2 = TF_1 + IIC \tag{5}$$

where IIC (the index of ideality of correlation) is calculated through the formula below [37]:

$$IIC = r_{\mathrm{CAL}} \times \frac{\min({}^{-}\mathrm{MAE}_{\mathrm{CAL}},\ {}^{+}\mathrm{MAE}_{\mathrm{CAL}})}{\max({}^{-}\mathrm{MAE}_{\mathrm{CAL}},\ {}^{+}\mathrm{MAE}_{\mathrm{CAL}})} \tag{6}$$

where r_CAL is the correlation coefficient between the experimental and computed values of an endpoint for the calibration set.

MAE is the mean absolute error, which is computed separately for negative and non-negative residuals:

$${}^{-}\mathrm{MAE}_{\mathrm{CAL}} = \frac{1}{{}^{-}N} \sum_{y=1}^{{}^{-}N} |\Delta_y|, \quad \Delta_y < 0, \quad {}^{-}N \text{ is the number of } \Delta_y < 0 \tag{7}$$

$${}^{+}\mathrm{MAE}_{\mathrm{CAL}} = \frac{1}{{}^{+}N} \sum_{y=1}^{{}^{+}N} |\Delta_y|, \quad \Delta_y \ge 0, \quad {}^{+}N \text{ is the number of } \Delta_y \ge 0 \tag{8}$$

$$\Delta_y = \mathrm{Obs}_y - \mathrm{Calc}_y \tag{9}$$

where Obs_y is the experimental value and Calc_y is the calculated value of the endpoint (%).

The values of T and N are varied from 1 to 3 and from 1 to 30, respectively.

After searching the response surface of TF2 for the validation set, T* and N* are taken as the values that give the maximum of TF2 between antioxidant activity and DCW(T*, N*).

Validation of the QSAR model

The main objective of any QSAR modeling is to establish a robust model with the capacity to predict the activity of novel compounds objectively, reliably, and precisely [38]. There are several ways to evaluate the predictive capability of QSAR models, including internal validation or cross-validation, external validation, and data randomization (Y-scrambling).

Various standard statistical criteria were applied to validate the QSAR models, including the correlation coefficient (R²), concordance correlation coefficient (CCC), the index of ideality of correlation (IIC), cross-validated correlation coefficient (Q²), correlation coefficient for external data (Q²ext), standard error of estimation (s), mean absolute error (MAE), Fisher ratio (F), and novel metrics (R²m and the MAE-based metric).

Applicability domain

One of the OECD principles for model validation is the definition of the applicability domain (AD) of a QSAR model, that is, the chemical space in which the model can be applied. Here it is constructed using the attribute defect, Defect_AK, as in Eq. (10) [39]:

$$\mathrm{Defect}_{A_K} = \begin{cases} \dfrac{|P_{\mathrm{TRN}}(A_K) - P_{\mathrm{CAL}}(A_K)|}{N_{\mathrm{TRN}}(A_K) + N_{\mathrm{CAL}}(A_K)} & \text{if } A_K > 0 \\ 1 & \text{if } A_K = 0 \end{cases} \tag{10}$$

P_TRN(A_K) is the probability of the presence of attribute A_K in the training set, as shown in Eq. (11):

$$P_{\mathrm{TRN}}(A_K) = \frac{N_{\mathrm{TRN}}(A_K)}{N_{\mathrm{TRN}}} \tag{11}$$

P_CAL(A_K) is the probability of the presence of attribute A_K in the calibration set, as shown in Eq. (12):

$$P_{\mathrm{CAL}}(A_K) = \frac{N_{\mathrm{CAL}}(A_K)}{N_{\mathrm{CAL}}} \tag{12}$$

N_TRN(A_K) is the number of occurrences of A_K in the training set; N_TRN is the total number of compounds in the training set; N_CAL(A_K) is the number of occurrences of A_K in the calibration set; N_CAL is the total number of compounds in the calibration set.

$$\mathrm{Defect}_{\mathrm{Molecule}} = \sum_{K} \mathrm{Defect}_{A_K} \tag{13}$$

where the sum runs over the attributes A_K of the molecule. A substance, represented by its SMILES, falls within the domain of applicability if:

$$\mathrm{Defect}_{\mathrm{Molecule}} < 2 \times \overline{\mathrm{Defect}}_{\mathrm{TRN}} \tag{14}$$

where the right-hand side uses the mean of Defect_Molecule over the training set.

Model interpretation

A mechanistic interpretation of the studied phenomena is possible with the developed QSAR models. From the numerical data on the correlation weights obtained in several Monte Carlo optimization runs, one may extract three classes of features [34]: (1) those with a positive value of the correlation weight in all runs (promoters of the endpoint increase); (2) those with a negative value of the correlation weight in all runs


(promoters of the endpoint decrease); and (3) those with both negative and positive values of the correlation weight in different optimization runs. Features of the third class have an unclear role (they cannot be classified as promoters of an increase or a decrease of the endpoint).

Results and discussion

QSAR models

The equations below give the QSAR models for the antioxidant activity of natural compounds against the Fenton reaction, gamma radiation, and UV radiation, obtained with the Monte Carlo optimization method for three random data splits according to TF2:

Fenton activity model

Split 1

$$\text{Fenton activity} = 14.4150(\pm 0.3316) + 1.9843(\pm 0.0237) \times \mathrm{DCW}(1, 30) \tag{15}$$

Split 2

$$\text{Fenton activity} = -40.5829(\pm 0.6171) + 2.8038(\pm 0.0282) \times \mathrm{DCW}(1, 28) \tag{16}$$

Split 3

$$\text{Fenton activity} = -5.7652(\pm 0.460) + 1.8719(\pm 0.0251) \times \mathrm{DCW}(1, 13) \tag{17}$$

Gamma activity model

Split 1

$$\text{Gamma activity} = -3.9081(\pm 0.6500) + 1.0488(\pm 0.0110) \times \mathrm{DCW}(1, 22) \tag{18}$$

Split 2

$$\text{Gamma activity} = -8.0740(\pm 0.6581) + 1.0751(\pm 0.0095) \times \mathrm{DCW}(1, 29) \tag{19}$$

Split 3

$$\text{Gamma activity} = -5.2544(\pm 0.8484) + 0.8831(\pm 0.0108) \times \mathrm{DCW}(1, 10) \tag{20}$$

UV activity model

Split 1

$$\text{UV activity} = -3.3205(\pm 0.3548) + 0.5704(\pm 0.0066) \times \mathrm{DCW}(1, 28) \tag{21}$$

Split 2

$$\text{UV activity} = -6.5173(\pm 0.2603) + 0.6680(\pm 0.0052) \times \mathrm{DCW}(1, 26) \tag{22}$$

Split 3

$$\text{UV activity} = -4.8531(\pm 0.5135) + 0.4683(\pm 0.0069) \times \mathrm{DCW}(1, 10) \tag{23}$$

The statistical criteria of the QSAR models of Eqs. 15–23 for predicting the antioxidant activity of compounds are given in Table 1. The satisfactory statistical criteria in Table 1 indicate that these models are robust, perform well, and may be utilized to predict the antioxidant activity of compounds.

The models are appropriate for all splits. Furthermore, the reliability of all constructed models was assessed through the randomization method (Y-scrambling) to examine their robustness. A model may be defined as robust if CR²p is greater than 0.5 [34]. Table S1 (Supplementary Data) lists the SMILES of the antioxidant compounds, the set to which each compound belongs, the observed and calculated Fenton, UV, and gamma activities, and the applicability domain status of the compounds in the three splits. Figure 1a–c shows the predicted versus experimental values for the split 1 models of Fenton, gamma, and UV activity, respectively. These plots indicate good agreement between predicted and experimental data. Table S2 contains the outlier molecules of each split and the R²Val values obtained after removing them. Based on Table S2, for Fenton activity, after removing the outliers, the model derived from split 3 has the highest R²Val (0.9762) and is the best model. For UV activity, after removing five molecules, the model derived from split 2 has the highest R²Val (0.9263) and is the best one. For gamma activity, after removing four molecules, the model derived from split 3, with R²Val (0.9762), is the best. Table 2 gives examples of promoters of increase and decrease for the antioxidant activity of compounds. As observed in Table 2, the frequencies of these attributes are suitably distributed over the training and calibration sets, so a valid positive or negative impact of the attributes on the antioxidant activity of compounds may be concluded. For Fenton activity, the presence of branching from an oxygen atom (O…(…….) is a promoter of an increase in antioxidant activity, whereas the presence of double covalent bonds (=………..) is a promoter of a decrease in Fenton activity. For UV and gamma activity, the presence of at least one ring (1………..) is a promoter of an increase in antioxidant activity (the low values of activity for molecules No. 30 and


Table 1  Summary of the statistical quality of the QSAR models obtained for Fenton, UV, and gamma activity for three different random splits ("-" marks a metric not given for that set)

Split Set                n   R²      IIC     Q²      R²m     CR²p    r̄²m     Δr²m    s      MAE    F

Fenton
1     Training           49  0.7518  0.8324  0.7311  -       0.7447  -       -       14.5   10.7   142
      Invisible training 40  0.8791  0.6288  0.8686  -       0.8607  -       -       12.6   10.3   276
      Calibration        13  0.6837  0.8267  0.5539  -       0.6147  -       -       16.7   12.4   24
      Validation         8   0.6981  0.7845  0.4727  0.6049  -       0.6079  0.0060  15.1   12.1   14
2     Training           49  0.7984  0.6701  0.7822  -       0.7820  -       -       12.2   8.6    186
      Invisible training 43  0.8116  0.8299  0.791   -       0.7940  -       -       12.8   9.9    177
      Calibration        9   0.7673  0.8758  0.5721  -       0.6777  -       -       16.4   13.0   23
      Validation         9   0.8625  0.6701  0.6737  0.8184  -       0.7675  0.1018  13.9   11.8   44
3     Training           48  0.7378  0.7268  0.7143  -       0.7309  -       -       13.1   9.7    129
      Invisible training 45  0.7379  0.5494  0.7161  -       0.7287  -       -       16.6   12.0   121
      Calibration        9   0.8419  0.9168  0.6219  -       0.7849  -       -       16.6   13.5   37
      Validation         8   0.8003  0.6570  0.5248  0.6179  -       0.6899  0.1440  16.4   12.9   24

Gamma
1     Training           35  0.8357  0.8634  0.8197  -       0.8159  -       -       9.5    7
      Invisible training 33  0.8386  0.6751  0.8171  -       0.8253  -       -       10.3   7      161
      Calibration        13  0.8571  0.9258  0.8105  -       0.8228  -       -       10.6   8.2    66
      Validation         10  0.919   0.8011  0.8851  0.5176  -       0.6035  0.1718  9.4    7.0    91
2     Training           38  0.8572  0.7495  0.8445  -       0.8513  -       -       9.1    6.2    216
      Invisible training 31  0.9171  0.7008  0.9069  -       0.8946  -       -       10.7   8.3    321
      Calibration        13  0.8100  0.8999  0.7365  -       0.7720  -       -       12.0   8.3    47
      Validation         9   0.7955  0.7585  0.6791  0.6305  -       0.7115  0.1619  11.4   10.1   27
3     Training           35  0.7983  0.8438  0.778   -       0.7872  -       -       10.4   8.0    131
      Invisible training 30  0.8894  0.7991  0.8762  -       0.8811  -       -       12.3   9.8    225
      Calibration        14  0.8934  0.9452  0.8579  -       0.8651  -       -       13.7   10.4   101
      Validation         12  0.8583  0.4606  0.8003  0.5641  -       0.6456  0.1631  11.8   9.2    61

UV
1     Training           41  0.8417  0.6499  0.8248  -       0.8230  -       -       4.4    3.2    207
      Invisible training 41  0.8485  0.853   0.8337  -       0.8194  -       -       5.1    3.9    218
      Calibration        18  0.8099  0.8999  0.7599  -       0.7760  -       -       6.7    5.6    68
      Validation         10  0.8030  0.8376  0.7079  0.5148  -       0.6132  0.1968  5.7    4.4    33
2     Training           44  0.8525  0.9233  0.8411  -       0.8367  -       -       4.68   3.59   243
      Invisible training 36  0.8765  0.6832  0.8616  -       0.8627  -       -       5.1    4.11   241
      Calibration        16  0.7797  0.8829  0.7172  -       0.7296  -       -       6.5    5.5    50
      Validation         14  0.7882  0.6608  0.7110  0.5538  -       0.6487  0.1898  5.04   4.22   45
3     Training           44  0.7926  0.7419  0.7684  -       0.7894  -       -       5.02   4.19   161
      Invisible training 43  0.8161  0.7865  0.8010  -       0.8034  -       -       5.65   4.48   182
      Calibration        12  0.7682  0.8761  0.6853  -       0.7289  -       -       6.19   4.82   33
      Validation         11  0.7771  0.6392  0.6707  0.5659  -       0.6604  0.1890  4.94   3.54   31
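
The Y-scrambling criterion behind the CR²p values reported with Table 1 can be illustrated with a minimal Python sketch. It assumes the commonly used definition CR²p = R × sqrt(R² − R²r), where R²r is the mean R² obtained after randomly permuting the activities. The function names, the number of trials, the seed, and the toy data are illustrative assumptions, and the one-variable model is simply refitted by least squares instead of re-running the Monte Carlo optimization in CORAL.

```python
import random
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation, equal to R² of a one-variable least-squares fit."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def c_r2_p(dcw, activity, n_trials=100, seed=0):
    """Y-scrambling metric, assuming CR²p = R * sqrt(R² - mean randomized R²)."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    r2_model = r_squared(dcw, activity)
    shuffled = list(activity)
    r2_random = []
    for _ in range(n_trials):
        rng.shuffle(shuffled)          # break the structure-activity pairing
        r2_random.append(r_squared(dcw, shuffled))
    gap = max(r2_model - mean(r2_random), 0.0)
    return (r2_model ** 0.5) * (gap ** 0.5)


# Toy data (hypothetical, not taken from the paper's data set).
dcw = [float(i) for i in range(20)]
activity = [2.0 * v + 1.0 for v in dcw]
score = c_r2_p(dcw, activity)
print(score > 0.5)  # the robustness criterion quoted in the text
```

A real model would be scored with its own DCW values and experimental activities for each set, and the result compared with the 0.5 threshold.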

31 confirm this well), whereas the presence of carbon with branching (C…(…….) is a promoter of a decrease in UV and gamma activity.

The classification of molecular attributes as promoters of an increase or a decrease of the studied activity follows the scheme given in the model interpretation section. Antioxidant compound 24 was selected as the template molecule for interpreting and examining the increasing and decreasing descriptors in order to design new antioxidant compounds. Seven new antioxidants were designed based on certain promoters of increase or decrease, and the activities of the designed structures were calculated with the QSAR models of split 1. The structure, SMILES notation, and predicted activity of all designed antioxidants are given in Table 3. One of the identified promoters of an increase in Fenton activity is the presence of branching from an oxygen atom (O…(…….); one hydroxyl group was therefore added to the template molecule M0 (molecule 24). The calculated Fenton


Fig. 1  The graph of the experimental versus predicted values for different antioxidant activities, based on split 1 for: a Fenton activity; b gamma activity; c UV activity. Each panel plots predicted against experimental antioxidant activity for the training, invisible training, calibration, and validation sets.
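
For calibration points such as those plotted in Fig. 1, the index of ideality of correlation weighs the correlation coefficient by the balance between negative and non-negative residuals (Eqs. 6–9), and TF1 and TF2 are built on top of it. The following Python sketch follows those equations directly. The function names, the handling of the case where all residuals have one sign, and the toy numbers are assumptions made for illustration and are not part of CORAL.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def iic(observed, calculated):
    """Index of ideality of correlation for a calibration set (Eqs. 6-9)."""
    deltas = [o - c for o, c in zip(observed, calculated)]   # Eq. 9
    negative = [abs(d) for d in deltas if d < 0]             # Eq. 7
    positive = [abs(d) for d in deltas if d >= 0]            # Eq. 8
    if not negative or not positive:
        # Assumption: the source does not say what to do in this case.
        raise ValueError("IIC needs residuals of both signs")
    mae_neg, mae_pos = mean(negative), mean(positive)
    r_cal = pearson_r(observed, calculated)
    return r_cal * min(mae_neg, mae_pos) / max(mae_neg, mae_pos)  # Eq. 6


def tf1(r_trn, r_itrn, c):
    """Balance-of-correlation target function."""
    return r_trn + r_itrn - abs(r_trn - r_itrn) * c


def tf2(r_trn, r_itrn, c, iic_cal):
    """Target function with the IIC term (Eq. 5)."""
    return tf1(r_trn, r_itrn, c) + iic_cal


# Toy calibration data (hypothetical values).
obs = [10.0, 20.0, 30.0, 40.0]
calc = [12.0, 19.0, 33.0, 38.0]
print(iic(obs, calc), tf2(0.9, 0.8, 0.1, iic(obs, calc)))
```

In the toy data the negative residuals average 2.5 and the non-negative ones 1.5, so the IIC is 0.6 times the correlation coefficient; a perfectly balanced set of residuals would leave the correlation coefficient unchanged.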


Table 2  The correlation weights, number of each attribute in each set, and interpretation of the most important attributes for the increase and decrease of Fenton, UV, and gamma activity of split 1

Attribute (MFk)  CW(MFk) run 1  CW(MFk) run 2  CW(MFk) run 3  N_T^a  N_iT^b  N_C^c  Defect(MFk)  Interpretation of descriptor

Fenton, increase
O…(…….          0.81    0.56    0.75    49  39  13  0        Branching of molecular skeleton started from oxygen
EC2-C…19..      1.81    1.62    1.37    47  33  11  0.0019   The presence of Morgan connectivity second order for carbon equal to 19
EC1-C…9…        5.38    4.94    4.56    45  34  12  0.0001   The presence of Morgan connectivity first order for carbon equal to 9

Fenton, decrease
(………..          −1.00   −0.44   −0.88   49  39  13  0        Branching
=………..          −1.13   −1.31   −0.75   49  39  13  0        The presence of double bond
C………..          −0.75   −0.87   −0.75   49  40  13  0        The presence of aliphatic carbon atom
O………..          −1.81   −1.68   −0.81   49  39  13  0        The presence of oxygen atom
C…=…….          −1.38   −0.87   −2.38   48  37  13  0.0003   The presence of aliphatic carbon atom with double bond
EC0-O…2…        −2.62   −1.75   −1.19   48  37  13  0.0003   The presence of Morgan connectivity zero order for oxygen equal to 2

Gamma, increase
1………..          4.13    3.63    4.37    44  42  12  0        The presence of one ring
EC0-O…1…        0.50    0.62    0.50    44  43  12  0        The presence of Morgan connectivity zero order for oxygen equal to 1
EC1-C…7…        2.19    2.31    2.00    44  40  12  0        The presence of Morgan connectivity first order for carbon equal to 7
O…C…….          2.31    2.75    2.88    44  40  12  0        The presence of oxygen connected to the aliphatic carbon
2………..          3.63    4.75    3.75    42  41  10  0.0023   The presence of two rings

Gamma, decrease
(………..          −0.44   −0.31   −0.57   44  43  12  0        Branching
C…(…….          −0.38   −0.50   −0.50   44  41  12  0        The presence of aliphatic carbon with branching
O…=…….          −0.75   −0.88   −0.50   44  42  12  0        The presence of oxygen atom with double bond

UV, increase
1………..          3.07    3.50    2.69    33  33  13  0.0012   The presence of one ring
PT3-O…3…        2.38    0.43    0.18    32  33  13  0.0019   The presence of the path of length 3 equal to 3 for oxygen atom
PT3-C…4…        1.13    1.00    0.63    32  32  13  0.0019   The presence of the path of length 3 equal to 4 for carbon atom
PT3-C…5…        2.63    2.56    3.69    32  33  11  0.0016   The presence of the path of length 3 equal to 5 for carbon atom

UV, decrease
EC0-C…2…        −1.00   −0.75   −1.07   34  33  12  0.0011   The presence of Morgan connectivity zero order for carbon atom equal to 2
C…(…….          −1.13   −0.56   −1.50   32  33  13  0.0019   The presence of aliphatic carbon with branching
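
The Defect(MFk) column of Table 2 can be related to Eqs. 10–14 with a short Python sketch. It assumes that the N_T and N_C columns are the attribute counts in the training and calibration sets, that the set sizes are the 49 training and 13 calibration compounds listed for Fenton split 1 in Table 1, and that the second case of Eq. 10 applies when the attribute occurs in neither set. The function names are hypothetical.

```python
def attribute_defect(n_trn_attr, n_trn, n_cal_attr, n_cal):
    """Defect of one attribute from its counts in training and calibration (Eq. 10)."""
    if n_trn_attr + n_cal_attr == 0:
        return 1.0  # assumption: attribute absent from both sets
    p_trn = n_trn_attr / n_trn   # Eq. 11
    p_cal = n_cal_attr / n_cal   # Eq. 12
    return abs(p_trn - p_cal) / (n_trn_attr + n_cal_attr)


def molecule_defect(attribute_defects):
    """Sum of the defects of the attributes found in one molecule (Eq. 13)."""
    return sum(attribute_defects)


def in_domain(defect_molecule, training_defects):
    """Applicability-domain test of Eq. 14 against the training-set mean."""
    mean_trn = sum(training_defects) / len(training_defects)
    return defect_molecule < 2 * mean_trn


# Fenton rows of Table 2: attribute -> (count in training, count in calibration).
N_TRN, N_CAL = 49, 13
fenton_rows = {
    "O…(…….": (49, 13),
    "EC2-C…19..": (47, 11),
    "EC1-C…9…": (45, 12),
    "C…=…….": (48, 13),
}
for name, (n_t, n_c) in fenton_rows.items():
    print(name, round(attribute_defect(n_t, N_TRN, n_c, N_CAL), 4))
```

Under these assumptions the four rows round to 0, 0.0019, 0.0001, and 0.0003, which agrees with the Defect values listed for the same Fenton attributes in Table 2.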

activity for the designed antioxidant M1 was 61.5, more than 20 percentage points higher than the Fenton activity of molecule M0 (40.4). The presence of double covalent bonds (=………..) was identified as a promoter of a decrease in Fenton activity; when molecule M0 is modified with an ethylene (vinyl, C=C) group to give M2, the activity decreases from 40.4 to 33.7. Molecule M3 carries an additional hydroxyl group on the five-membered ring of M0, which illustrates the presence of oxygen connected to an aliphatic carbon as a promoter of increase; gamma activity increases from 61.5 to 64.6. Molecules M4 and M5, obtained by adding one or two isopropyl groups to the M0 template, illustrate carbon with branching (C…(…….) as a promoter of decrease; gamma activity falls from 61.5 to 38.7 and 9.7 for M4 and M5, respectively. The branching effect of the aliphatic carbon atom was also investigated by designing molecules M6 and M7, adding one and two isopropyl groups to the M0 template, which illustrates carbon with branching (C…(…….) as decreasing


Table 3  The structure, SMILES notation, and predicted activity of all designed antioxidants (the added group is shown in blue)

No.  SMILES, activity, and promoter illustrated

M0   OC1=C(C(=O)OC1)c2ccccc2
     Exp. Fenton activity: 40.4; Exp. gamma activity: 61.5; Exp. UV activity: 27.2

M1   OC1=C(C(=O)OC1)c2cccc(O)c2
     Predicted Fenton activity from Eq. 15: 61.5
     The presence of oxygen with branching from oxygen atom (O...(.......) as increasing promoter

M2   OC1=C(C(=O)OC1)c2cccc(C=C)c2
     Predicted Fenton activity from Eq. 15: 33.7
     The presence of double covalent bonds (=...........) as decreasing promoter

M3   OC1OC(=O)C(=C1O)c2ccccc2
     Predicted gamma activity from Eq. 18: 64.6
     The presence of oxygen connected to the aliphatic carbon as increasing promoter

M4   CC(C)c1ccc(cc1)C2=C(O)COC2=O
     Predicted gamma activity from Eq. 18: 38.7
     The presence of carbon with branching (C...(.......) as decreasing promoter

M5   CC(C)c1ccc(C2=C(O)COC2=O)c(c1)C(C)C
     Predicted gamma activity from Eq. 18: 9.7
     The presence of more carbon with branching (C...(.......) as decreasing promoter

M6   CC(C)c1ccc(cc1)C2=C(O)COC2=O
     Predicted UV activity from Eq. 21: 20.1
     The presence of carbon with branching (C...(.......) as decreasing promoter

M7   CC(C)c1ccc(C2=C(O)COC2=O)c(c1)C(C)C
     Predicted UV activity from Eq. 21: 5.5
     The presence of more carbon with branching (C...(.......) as decreasing promoter
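
The effect of each structural modification in Table 3 can be summarized as the change relative to the template M0. The short Python sketch below does this arithmetic on the tabulated values only. The variable names are illustrative, and the comparison sets the experimental values of M0 against the predicted values of the designed molecules, as the table does.

```python
# Experimental activities of the template molecule M0 (Table 3).
template = {"Fenton": 40.4, "gamma": 61.5, "UV": 27.2}

# Designed molecule -> (endpoint, predicted activity), as listed in Table 3.
designed = {
    "M1": ("Fenton", 61.5),
    "M2": ("Fenton", 33.7),
    "M3": ("gamma", 64.6),
    "M4": ("gamma", 38.7),
    "M5": ("gamma", 9.7),
    "M6": ("UV", 20.1),
    "M7": ("UV", 5.5),
}

changes = {}
for name, (endpoint, predicted) in designed.items():
    reference = template[endpoint]
    absolute = predicted - reference          # change in percentage points
    relative = 100.0 * absolute / reference   # change relative to M0, in percent
    changes[name] = (absolute, relative)
    print(f"{name} ({endpoint}): {absolute:+.1f} points, {relative:+.1f}% relative to M0")
```

The sign of each change matches the role assigned to the corresponding attribute: positive for M1 and M3 (promoters of increase) and negative for M2 and M4 to M7 (promoters of decrease).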

promoters; UV activity falls from 27.2 to 20.1 and 5.5 for M6 and M7, respectively.

Table 4 lists the statistical characteristics of the QSAR models reported in the previous literature for the antioxidant activity of the studied compounds [31], so that the suggested models can be compared with those presented in previous studies. In contrast to the common conformation-dependent QSAR methods, CORAL is capable of generating molecular descriptors and developing QSAR models with appropriate mechanistic interpretation without any 3D optimization, which has high computational costs and long development times. According to Table 4, the R²training values of the models developed by Martinčič et al. are comparable to those of the present study. We used about 10% of the data for validation because the data set is structurally diverse. By using the larger part of the data set for training, we obtained good statistical quality for the test set. Also, we designed some new molecules based on


Table 4  The comparison between characteristics of previous models from published literature and the current study

No.  Radiation source  n training  n test  Descriptor generator                   Regression method  R² training  R² test  Ref.
1    Fenton            79          31      Codessa, Dragon, and Volsurf packages  SVM                0.83         0.52     [31]
2    UV                83          27      Codessa, Dragon, and Volsurf packages  SVM                0.85         0.64     [31]
3    Gamma             58          33      Codessa, Dragon, and Volsurf packages  SVM                0.86         0.66     [31]
4    Fenton            101         9       CORAL package                          LR                 0.79         0.86     This work
5    UV                100         10      CORAL package                          LR                 0.83         0.80     This work
6    Gamma             79          12      CORAL package                          LR                 0.86         0.86     This work

decreasing and increasing descriptors and showed the effect of some parameters on antioxidant activity.

Conclusion

In the current research, new QSAR models were developed to predict the antioxidant activity of a diverse set of tested antioxidants against three radical sources: the Fenton reaction, gamma radiation, and UV radiation. The QSAR models were created via CORAL software based on the Monte Carlo optimization technique. The predictive ability of the suggested models was evaluated through appropriate statistical criteria. All of the established models were suitable for predicting the activity of novel antioxidant candidates, so that such candidates can be assessed before synthesis. The hybrid attributes optimized with the IIC target function revealed a meaningful relationship between antioxidant activity and certain global attributes of the hydrogen-suppressed graph (HSG) along with SMILES features; the correlation weights for these molecular attributes were obtained via the Monte Carlo technique.

A novel mechanistic interpretation of the increasing and decreasing attributes has been made by analyzing the correlation weights of different molecular attributes gained in three Monte Carlo optimization runs. These descriptors were applied to design new and more potent antioxidants in silico.

Acknowledgements The authors are thankful to Dr. Alla P. Toropova and Dr. Andrey A. Toropov for providing the CORAL software.

Compliance with ethical standards

Conflict of interest The authors declare no conflicts of interest.

References

1. Lü JM, Lin PH, Yao Q, Chen C (2010) Chemical and molecular mechanisms of antioxidants: experimental approaches and model systems. J Cell Mol Med 14(4):840–860
2. Lee A, Mercader AG, Duchowicz PR, Castro EA, Pomilio AB (2012) QSAR study of the DPPH radical scavenging activity of di(hetero)arylamines derivatives of benzo[b]thiophenes, halophenols and caffeic acid analogues. Chemometr Intell Lab Syst 116:33–40
3. Valko M, Leibfritz D, Moncol J, Cronin MT, Mazur M, Telser J (2007) Free radicals and antioxidants in normal physiological functions and human disease. Int J Biochem Cell Biol 39(1):44–84
4. Ahmadi S, Mehrabi M, Rezaei S, Mardafkan N (2019) Structure-activity relationship of the radical scavenging activities of some natural antioxidants based on the graph of atomic orbitals. J Mol Struct 1191:165–174
5. Brewer M (2011) Natural antioxidants: sources, compounds, mechanisms of action, and potential applications. Compr Rev Food Sci Food Saf 10(4):221–247
6. Habrant D, Poigny S, Ségur-Derai M, Brunel Y, Heurtaux BT, Le Gall T, Strehle A, Saladin R, Meunier S, Mioskowski C (2009) Evaluation of antioxidant properties of monoaromatic derivatives of pulvinic acids. J Med Chem 52(8):2454–2464
7. Fusi J, Bianchi S, Daniele S, Pellegrini S, Martini C, Galetta F, Giovannini L, Franzoni F (2018) An in vitro comparative study of the antioxidant activity and SIRT1 modulation of natural compounds. Biomed Pharmacother 101:805–819
8. Jeremić S, Radenković S, Filipović M, Antić M, Amić A, Marković Z (2017) Importance of hydrogen bonding and aromaticity indices in QSAR modeling of the antioxidative capacity of selected (poly)phenolic antioxidants. J Mol Graph Model 72:240–245
9. Kostova I (2005) Synthetic and natural coumarins as cytotoxic agents. Curr Med Chem-Anti-Cancer Agents 5(1):29–46
10. Bourdreux Y, Bodio E, Willis C, Billaud C, Le Gall T, Mioskowski C (2008) Synthesis of vulpinic and pulvinic acids from tetronic acid. Tetrahedron 64(37):8930–8937
11. Benedict R, Brady L (1972) Antimicrobial activity of mushroom metabolites. J Pharm Sci 61(11):1820–1822
12. Dias D, White J, Urban S (2007) Pinastric acid revisited: a complete NMR and X-ray structure assignment. Nat Prod Res 21(4):366–376
13. Osman H, Arshad A, Lam CK, Bagley MC (2012) Microwave-assisted synthesis and antioxidant properties of hydrazinyl thiazolyl coumarin derivatives. Chem Cent J 6(1):32
14. Hosseinimehr SJ (2007) Trends in the development of radioprotective agents. Drug Discov Today 12(19–20):794–805
15. Weiss JF, Landauer MR (2009) History and development of radiation-protective agents. Int J Radiat Biol 85(7):539–573
16. Le Roux A, Meunier S, Le Gall T, Denis JM, Bischoff P, Wagner A (2011) Synthesis and radioprotective properties of pulvinic acid derivatives. ChemMedChem 6(3):561–569
17. Okunieff P, Swarts S, Keng P, Sun W, Wang W, Kim J, Yang S, Zhang H, Liu C, Williams JP (2008) Antioxidants reduce consequences of radiation exposure. Oxygen Transport to Tissue XXIX. Springer, Boston, pp 165–178
18. Le Roux A, Kuzmanovski I, Habrant D, Meunier S, Bischoff P, Nadal B, Thetiot-Laurent SA-L, Le Gall T, Wagner A, Novic M (2011) Design and synthesis of new antioxidants predicted by the model developed on a set of pulvinic acid derivatives. J Chem Inf Model 51(12):3050–3059
19. Ahmadi S, Khani R, Moghaddas M (2018) Prediction of anticancer activity of 1,8-naphthyridin derivatives by using of genetic algorithm-stepwise multiple linear regression. Med Sci J Islam Azad Univ-Tehran Med Branch 28(3):181–194
20. Ahmadi S, Khazaei MR, Abdolmaleki A (2014) Quantitative structure–property relationship study on the intercalation of anticancer drugs with ct-DNA. Med Chem Res 23(3):1148–1161
21. Ahmadi S (2012) A QSPR study of association constants of macrocycles toward sodium cation. Macroheterocycles 5(1):23–31
22. Habibpour E, Ahmadi S (2017) QSAR modeling of the arylthioindole class of colchicine polymerization inhibitors as anticancer agents. Curr Comput Aided Drug Des 13(2):143–159
23. Ahmadi S, Habibpour E (2017) Application of GA-MLR for QSAR modeling of the arylthioindole class of tubulin polymerization inhibitors as anticancer agents. Anti-Cancer Agents Med Chem (Formerly Curr Med Chemistry-Anti-Cancer Agents) 17(4):552–565
24. Ahmadi S, Ganji S (2016) Genetic algorithm and self-organizing maps for QSPR study of some N-aryl derivatives as butyrylcholinesterase inhibitors. Curr Drug Discov Technol 13(4):232–253
25. Ahmadi S, Babaee E (2014) Application of self organizing maps and GA-MLR for the estimation of stability constant of 18-crown-6 ether derivatives with sodium cation. J Incl Phenom Macrocycl Chem 79(1–2):141–149
26. Ahmadi S (2012) Application of GA-MLR method in QSPR modeling of stability constants of diverse 15-crown-5 complexes with sodium cation. J Incl Phenom Macrocycl Chem 74(1–4):57–66
27. Ghasemi JB, Ahmadi S, Ayati M (2010) QSPR modeling of stability constants of the Li-hemispherands complexes using MLR: a theoretical host-guest study. Macroheterocycles 3(4):234–242
28. Ghasemi JB, Zohrabi P, Khajehsharifi H (2010) Quantitative structure–activity relationship study of nonpeptide antagonists of CXCR2 using stepwise multiple linear regression analysis. Monatshefte Chemie-Chemical Monthly 141(1):111–118
29. Ghasemi J, Ahmadi S (2007) Combination of genetic algorithm and partial least squares for cloud point prediction of nonionic surfactants from molecular structures. Annali di Chimica J Anal, Environ Cultural Herit Chem 97(1–2):69–83
30. Ghasemi JB, Ahmadi S, Brown S (2011) A quantitative structure–retention relationship study for prediction of chromatographic relative retention time of chlorinated monoterpenes. Environ Chem Lett 9(1):87–96
31. Kuzmanovski I, Wagner A, Novič M (2015) Development of models for prediction of the antioxidant activity of derivatives of natural compounds. Anal Chim Acta 868:23–35
32. Goya Jorge E, Rayar A, Barigye S, Jorge Rodríguez M, Sylla-Iyarreta Veitía M (2016) Development of an in silico model of DPPH free radical scavenging capacity prediction of antioxidant activity of coumarin type compounds. Int J Mol Sci 17(6):881
33. Alisi IO, Uzairu A, Abechi SE, Idris SO (2018) Evaluation of the antioxidant properties of curcumin derivatives by genetic function algorithm. J Adv Res 12:47–54
34. Ahmadi S, Mardinia F, Azimi N, Qomi M, Balali E (2019) Prediction of chalcone derivative cytotoxicity activity against MCF-7 human breast cancer cell by Monte Carlo method. J Mol Struct 1181:305–311
35. Ahmadi S, Akbari A (2018) Prediction of the adsorption coefficients of some aromatic compounds on multi-wall carbon nanotubes by the Monte Carlo method. SAR QSAR Environ Res 29(11):895–909
36. Toropova AP, Toropov AA (2019) Quasi-SMILES: quantitative structure–activity relationships to predict anticancer activity. Mol Divers 23(2):403–412
37. Ahmadi S (2019) Mathematical modeling of cytotoxicity of metal oxide nanoparticles using the index of ideality correlation criteria. Chemosphere 242:125192
38. Kumar P, Kumar A, Sindhu J (2019) Design and development of novel focal adhesion kinase (FAK) inhibitors using Monte Carlo method with index of ideality of correlation to validate QSAR. SAR QSAR Environ Res 30(2):63–80
39. Toropov AA, Toropova AP, Cappellini L, Benfenati E, Davoli E (2018) QSPR analysis of threshold of odor for the large number of heterogenic chemicals. Mol Divers 22(2):397–403

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Affiliations

Shahin Ahmadi1 · Hosein Ghanbari1 · Shahram Lotfi2 · Neda Azimi3

* Shahin Ahmadi
  ahmadi.chemometrics@gmail.com

1 Department of Chemistry, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran
2 Department of Chemistry, Payame Noor University (PNU), 19395-4697 Tehran, Iran
3 Department of Chemical Engineering, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran
