SSRN Id4049717

In silico design of EGFR L858R/T790M/C797S inhibitors via 3D-QSAR,
molecular docking, ADMET properties and molecular dynamics simulations
Hanine Hadni and Menana Elhallaoui
LIMAS, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
Corresponding author: E-mail address: hadni.hanine@yahoo.fr (Hanine Hadni).
Abstract:
The emergence of L858R/T790M/C797S mutations in EGFR is one of the main causes of
resistance after third-generation treatment of non-small cell lung cancer (NSCLC). The
development of fourth-generation drugs therefore needs urgent attention. To overcome this
resistance, in silico drug discovery and design approaches were applied to a library of 29
novel 9-heterocyclyl substituted 9H-purine derivatives with EGFR L858R/T790M/C797S
inhibitory activity against NSCLC. The CoMSIA/EHA model (Q² = 0.584, R² = 0.816 and
R²pred = 0.73) showed a stable and reliable ability to predict NSCLC activity, which was
tested by several validation methods. Molecular docking studies revealed interactions that
are crucial for EGFR L858R/T790M/C797S inhibition. Based on these theoretical methods, we
designed 10 new compounds with good predicted activity, which were then evaluated for their
ADMET properties. Finally, the docking results were verified by molecular dynamics
simulations, which confirmed the stability of the hydrogen bonding interactions with crucial
residues such as MET790, MET793 and SER797. These interactions are essential for the design
of fourth-generation EGFR inhibitors to combat drug-resistant NSCLC.
Keywords: non-small cell lung cancer, EGFR L858R/T790M/C797S inhibition, 3D-QSAR, molecular
docking, ADMET properties, molecular dynamics simulation.
1. Introduction
Lung cancer is responsible for the highest cancer mortality in men and women worldwide [1].
According to statistics from the World Health Organization (WHO) in 2020 [2], the number
of lung cancer cases is expected to exceed 2.21 million and cause 1.80 million deaths
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4049717
worldwide. The cure rate for lung cancer is less than 20%, indicating that efforts to fight
the disease must be strengthened. Non-small cell lung cancer (NSCLC) accounts for 85% of
lung cancers [3]. The epidermal growth factor receptor (EGFR) is a receptor tyrosine kinase
of the ErbB family that plays a key role in the cell signalling that leads to proliferation,
differentiation, apoptosis and migration [4–6]. The discovery of EGFR tyrosine kinase
inhibitors (EGFR-TKIs) was an important milestone in NSCLC treatment [7]. However,
mutations in the EGFR gene have had a considerable impact on treatment, resulting in three
generations of EGFR-TKIs. First-generation EGFR-TKIs (gefitinib and erlotinib) are effective
in treating NSCLC in patients with the L858R mutation in EGFR [8–10]. However, the
appearance of the T790M mutation in the EGFR gene has led to the therapeutic failure of
first-generation inhibitors [11]. To overcome the resistance caused by the T790M mutation,
second-generation EGFR-TKIs, namely afatinib and dacomitinib, were developed [12]. However,
these inhibitors cause serious side effects [13]. Third-generation EGFR-TKIs (rociletinib,
osimertinib) were then developed to address both the mutations and the observed toxicity;
they show acceptable side effects and improved action against the T790M mutation [14].
Despite the significant therapeutic success of third-generation EGFR-TKIs, new on-target
resistance has emerged through the C797S mutation, which renders osimertinib clinically
ineffective [15]. Therefore, the discovery of new drugs to overcome the L858R/T790M/C797S
mutants of EGFR represents an urgent therapeutic need for the treatment of NSCLC.
In recent years, molecular modeling techniques have yielded impressive results in the drug
discovery process [16–18]. In this work, three-dimensional quantitative structure-activity
relationship (3D-QSAR) analysis, molecular docking, pharmacokinetic (ADMET) prediction and
molecular dynamics (MD) simulation were used to design new molecules able to overcome the
resistance caused by the EGFR L858R/T790M/C797S mutations. First, 29 EGFR L858R/T790M/C797S
inhibitors were studied using 3D-QSAR and molecular docking to identify the key structural
factors affecting inhibitory activity. These results provided recommendations for the design
of new drug candidates from the ZINC database, which aims to catalog all biologically
relevant molecules in 2D and 3D representations. To test their drug-likeness, each designed
compound was evaluated by calculating its ADMET parameters. Furthermore, 100 ns MD
simulations were performed to estimate the stability of the ligand-receptor complex under
normal physiological conditions.
2. Material and methods
2.1. Database and biological activity
In this work, a series of 29 novel 9-heterocyclyl substituted 9H-purine derivatives
(Figure 1) acting as EGFR-TKIs against the L858R/T790M/C797S mutants was collected from the
recent literature [19]. The observed activity values (IC50) were converted to their negative
logarithm, pIC50 (-log IC50), to construct the 3D-QSAR models. The dataset of 29 compounds
was randomly divided into a training set of 21 compounds used to generate the 3D-QSAR models
(Table 1), while the remaining 8 compounds (marked with a superscript "*") were used as a
test set to verify the performance and accuracy of the established models.
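The conversion between IC50 and pIC50 can be sketched as follows. This is a minimal illustration that assumes the IC50 values are expressed in μM, as in the header of Table 1; the function names are hypothetical and not taken from the original work.

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in micromolar."""
    return -math.log10(ic50_um * 1e-6)


def ic50_um_from_pic50(pic50: float) -> float:
    """Inverse conversion: return the IC50 in micromolar for a given pIC50."""
    return 10.0 ** (6.0 - pic50)


# Two training-set activities taken from Table 1 (compounds 1 and 21).
for compound, ic50 in {"1": 0.11, "21": 0.00088}.items():
    print(compound, pic50_from_ic50_um(ic50))
```

The inverse function is the one needed later to turn a predicted pIC50 back into a predicted IC50 for a designed compound.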
Figure 1. Structures of 9-heterocyclyl substituted 9H-purines derivatives.
Table 1. Observed and predicted anticancer activity against NSCLC of 9-heterocyclyl substituted 9H-
purine derivatives. Test set compounds are marked with "*". The R2 substituents are given as structure
drawings in the original table; only their stereodescriptor is listed here.
Compound  R1  R2 configuration  IC50 (μM)  pIC50 (obs)  pIC50 (pred)  Residual
1         H   (S)               0.11       6.958        6.726          0.231
2         F   (S)               0.58       6.236        6.124          0.112
3*        H   (R)               0.012      7.921        7.607          0.313
4         F   (R)               0.027      7.568        7.48           0.087
5*        F   (S)               0.12       6.921        7.222         -0.301
6         F   (R)               0.28       6.553        6.709         -0.156
7         H   (S)               0.53       6.276        6.919         -0.643
8         F   (S)               1.22       5.914        6.332         -0.418
9*        H   (R)               0.065      7.187        6.875          0.311
10        F   (R)               0.3        6.523        6.319          0.204
11*       F   (S)               0.038      7.421        6.98           0.441
12        F   (R)               0.12       6.921        6.784          0.136
13        H   (S)               0.17       6.769        6.221          0.547
14        H   (S)               0.26       6.585        7.068         -0.483
15*       H   (S)               0.017      7.769        7.696          0.072
16        H   (S)               0.0016     8.796        8.058          0.738
17        F   (S)               0.38       6.421        6.534         -0.113
18        F   (S)               1.52       5.818        5.638          0.179
19        F   (S)               1.81       5.742        5.722          0.019
20*       F   (S)               0.0075     8.125        7.892          0.232
21        H   (R)               0.00088    9.055        8.919          0.135
22        F   (R)               0.35       6.456        7.092         -0.635
23        F   (R)               0.0025     8.602        8.345          0.256
24        F   (S)               0.0018     8.745        8.029          0.715
25        F   (S)               0.0086     8.065        8.759         -0.694
26*       F   (R)               0.066      7.181        6.959          0.221
27        F   -                 0.01       8            8.163         -0.163
28        F   -                 0.0075     8.125        8.179         -0.054
29*       F   -                 0.05       7.301        7.023          0.278
2.2. Structural optimization and alignment

Molecular alignment is considered one of the most important parameters of 3D-QSAR
studies, as this step is a determining factor for a successful model [20]. All ligands were
constructed with the sketch module and optimized with the Tripos force field, using
Gasteiger-Hückel charges [21]. The convergence criterion of the Powell gradient algorithm
was set to 0.005 kcal/(mol Å), with a maximum of 10,000 iterations, to obtain a stable
conformation [22]. All optimized molecular structures were aligned to the most active
compound of the series, compound 21, using the ALIGN DATABASE method available in SYBYL-X
2.0 software.
2.3. Generation of 3D-QSAR models
Ligand-guided alignment was used to derive the CoMSIA models using SYBYL-X 2.1. The
aligned inhibitors of the training set were placed inside a 3D cubic lattice with a grid spacing
of 2 Å in all Cartesian directions. The CoMSIA descriptors, which include five fields
(steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor), were
calculated using an sp3-hybridized carbon probe atom with a van der Waals radius of 1.52 Å
and a net charge of +1.0. The energy cut-off was set at 30 kcal/mol. The minimum column
filtering and the attenuation factor were set to their default values of 2.0 kcal/mol and
0.3, respectively.
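To make the role of the attenuation factor concrete, the sketch below evaluates a Gaussian-type similarity index at a single grid point, assuming the commonly cited CoMSIA form A(q) = -Σ_i w_probe · w_i · exp(-α · r_iq²) with α = 0.3. It is an illustration under that assumption, not the SYBYL implementation; the function name, the coordinates and the property values are hypothetical.

```python
import numpy as np


def comsia_index(grid_point, atom_xyz, atom_props, probe_prop=1.0, alpha=0.3):
    """Gaussian similarity index at one grid point (assumed textbook CoMSIA form).

    grid_point : (3,) coordinates of the lattice point, in angstroms
    atom_xyz   : (n_atoms, 3) atomic coordinates of the aligned ligand
    atom_props : (n_atoms,) property of each atom for one field (e.g. partial charge)
    """
    xyz = np.asarray(atom_xyz, dtype=float)
    props = np.asarray(atom_props, dtype=float)
    # Squared distance from every atom to the grid point.
    d2 = np.sum((xyz - np.asarray(grid_point, dtype=float)) ** 2, axis=1)
    # Each atom contributes with a Gaussian weight; no energy cut-off singularities arise.
    return -float(np.sum(probe_prop * props * np.exp(-alpha * d2)))


# Toy two-atom "ligand" probed at two lattice points 2 angstroms apart.
toy_xyz = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
toy_charges = [0.2, -0.2]
for point in ([0.0, 0.0, 0.0], [2.0, 0.0, 0.0]):
    print(point, comsia_index(point, toy_xyz, toy_charges))
```

Because the Gaussian decays smoothly with distance, the index stays finite even when a grid point coincides with an atom.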
2.3.1. Partial least squares (PLS) analysis.

The partial least squares (PLS) method was used to build the CoMSIA models [23]. First,
leave-one-out (LOO) cross-validation was applied to obtain the optimal number of components
(ONC), defined as the number giving the highest cross-validated coefficient of determination
(Q²). Second, with the ONC fixed, the CoMSIA models were built and assessed with statistical
parameters such as the coefficient of determination (R²), the standard error of estimate
(SEE), the F-value (Fisher test) and the contribution of each field. To test the predictive
ability and stability of the models, several validation strategies were also applied.
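The two-step procedure can be sketched with NumPy as below. This is a minimal illustration, assuming a single-response PLS (PLS1, NIPALS algorithm) and Q² = 1 - PRESS/SS; it is not the SYBYL implementation, and the descriptor matrix and activities are synthetic placeholders rather than CoMSIA fields.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; return (x_mean, y_mean, regression coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X and y
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def loo_q2(X, y, n_components):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        x_mean, y_mean, coef = pls1_fit(X[mask], y[mask], n_components)
        preds[i] = y_mean + (X[i] - x_mean) @ coef
    press = np.sum((y - preds) ** 2)
    ss = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss


# Synthetic stand-in for a 21-compound training set with 6 made-up descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(21, 6))
y_demo = X_demo[:, 0] - 0.5 * X_demo[:, 1] + rng.normal(scale=0.3, size=21)
q2_by_nc = {nc: loo_q2(X_demo, y_demo, nc) for nc in range(1, 5)}
onc = max(q2_by_nc, key=q2_by_nc.get)   # ONC = component count with the highest Q2
print(onc, q2_by_nc[onc])
```

Once the ONC is chosen, a final call to `pls1_fit` on the whole training set gives the non-cross-validated model from which R², SEE and F are computed.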
2.4. Validation of QSAR models.
2.4.1. Y-randomization
The Y-randomization technique is a widely used approach to ensure the robustness of a 3D-QSAR
model [24]. In this test, only the values of the dependent variable (biological activity)
are randomly shuffled several times, while the values of the independent variables
(descriptors) remain unchanged. These randomized data are then used to generate new 3D-QSAR
models, and the Q²yrand and R²yrand values of the random models are tested against the
Eriksson and Wold criteria [25], represented by the following rules:
$R^2_{yrand} < 0.2$ and $Q^2_{yrand} < 0.2$: no chance correlation
any $R^2_{yrand}$ and $0.2 < Q^2_{yrand} < 0.3$: negligible chance correlation
any $R^2_{yrand}$ and $0.3 < Q^2_{yrand} < 0.4$: tolerable chance correlation
any $R^2_{yrand}$ and $Q^2_{yrand} > 0.4$: recognized chance correlation.
Another criterion, based on the parameter cR²r, was also calculated. It must be greater than
0.5 and is given by the following equation, where R² is the coefficient of determination of
the original model and R²r that of a randomized model:
$cR^2_r = R \times \sqrt{R^2 - R^2_r}$
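The cR²r criterion can be checked numerically as sketched below. The sketch assumes the usual form with a square root, cR²r = R · sqrt(R² - R²r), which is the form that reproduces the cR²r values reported later in Table 3 from R² = 0.852; the function name is hypothetical.

```python
import math


def c_r2_r(r2: float, r2_rand: float) -> float:
    """cR2r = R * sqrt(R2 - R2_rand); the difference is clipped at zero."""
    return math.sqrt(r2) * math.sqrt(max(r2 - r2_rand, 0.0))


R2_MODEL = 0.852                                       # R2 of the CoMSIA/EHA model
r2_yrand = [0.340, 0.388, 0.330, 0.360, 0.404]         # R2yrand values from Table 3

for iteration, r2r in enumerate(r2_yrand, start=1):
    value = c_r2_r(R2_MODEL, r2r)
    print(iteration, value, "pass" if value > 0.5 else "fail")
```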
2.4.2. External validation of the CoMSIA model

To verify the predictive capability of the proposed models, the biological activities of the
8 test set compounds were predicted. The predictive ability of the models is measured by
the external validation coefficient of determination (R²pred), calculated by the following
formula:
$R^2_{pred} = (SD - PRESS)/SD$
where SD is the sum of squared deviations between the biological activities of the test set
and the average activity of the training set, and PRESS is the sum of squared deviations
between the predicted and experimental activities of the test set. To test the reliability of
3D-QSAR models in predicting the activity of new compounds, additional statistical criteria
were introduced by Golbraikh and Tropsha [26] for external validation, represented by the
following equations:
$r_0^2 = 1 - \frac{\sum (Y_{test(pred)} - k\,Y_{test(pred)})^2}{\sum (Y_{test(pred)} - \bar{Y}_{test(pred)})^2}$
$r_0'^2 = 1 - \frac{\sum (Y_{test} - k'\,Y_{test})^2}{\sum (Y_{test} - \bar{Y}_{test})^2}$
$k = \frac{\sum (Y_{test}\,Y_{test(pred)})}{\sum (Y_{test(pred)})^2}$
$k' = \frac{\sum (Y_{test}\,Y_{test(pred)})}{\sum (Y_{test})^2}$
where r² is the squared correlation coefficient between the predicted and experimental
activities of the test set; r0² and r'0² are the squared correlation coefficients of
predicted versus experimental and experimental versus predicted activity for the test set at
zero intercept, respectively; and k and k' are the slopes of the plots of predicted versus
observed and observed versus predicted activity for the test set at zero intercept,
respectively.
Furthermore, according to Roy's statistical criteria [25], to further validate the predictive
capacity of the model, it is important to account for the difference between r² and r0² and
between r² and r'0², according to the following equations:
$r_m^2 = r^2 \left(1 - \sqrt{r^2 - r_0^2}\right)$
$r_m'^2 = r^2 \left(1 - \sqrt{r^2 - r_0'^2}\right)$
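The external validation statistics can be sketched as follows. This is an illustration only: the arrays are made-up toy values rather than the study's data, the function names are hypothetical, and the r²m function assumes the square-root form of Roy's metric with an absolute value inside the root.

```python
import numpy as np


def r2_pred(y_obs, y_pred, train_mean):
    """External R2pred = (SD - PRESS) / SD, with SD taken around the training mean."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    sd = np.sum((y_obs - train_mean) ** 2)
    press = np.sum((y_obs - y_pred) ** 2)
    return (sd - press) / sd


def slopes_through_origin(y_obs, y_pred):
    """Return (k, k'): zero-intercept slopes of observed vs predicted and the reverse."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    cross = np.sum(y_obs * y_pred)
    return cross / np.sum(y_pred ** 2), cross / np.sum(y_obs ** 2)


def r2_m(r2, r2_0):
    """Roy's metric, assumed form: r2 * (1 - sqrt(|r2 - r0^2|))."""
    return r2 * (1.0 - np.sqrt(abs(r2 - r2_0)))


# Toy test set (hypothetical pIC50 values) and a hypothetical training-set mean.
obs = np.array([7.9, 6.9, 7.2, 7.4])
pred = np.array([7.6, 7.2, 6.9, 7.0])
print(r2_pred(obs, pred, train_mean=7.1))
print(slopes_through_origin(obs, pred))
```

Keeping the training-set mean as an explicit argument makes it hard to confuse R²pred with an ordinary R² computed around the test-set mean.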
2.4.3. Applicability domain
The applicability domain (AD) [27] was used to determine the region of chemical space in
which the 3D-QSAR model can reliably predict new compounds. In this study, the AD was
determined with the leverage approach, one of the simplest available methods. The leverage
values (h) were calculated for each molecule, for graphical detection of outliers in the
Williams plot, by the following relationship:
$h_i = x_i^t (X^t X)^{-1} x_i$, with i = 1, 2, …, n
In this equation, x_i is the descriptor vector of the tested compound and X is the n*(k-1)
matrix, where n is the number of compounds and k the number of descriptors in the training
set; the exponent t denotes matrix/vector transposition. A prediction is considered reliable
if the value of h is lower than the critical leverage value h* = 2.5(k + 1)/n [28,29]. The
AD of the 3D-QSAR model is located to the left of the vertical line h* = 0.47.
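The leverage calculation can be sketched with NumPy as below. The descriptor matrix is a random placeholder and the function names are hypothetical. The value k = 3 used in the example is an assumption: together with the n = 21 training compounds it gives 2.5(k + 1)/n ≈ 0.476, close to the quoted h* = 0.47.

```python
import numpy as np


def leverages(X_train, X_query):
    """h_i = x_i^t (X^t X)^-1 x_i for every row x_i of X_query."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form, without building the full hat matrix.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)


def critical_leverage(k, n, factor=2.5):
    """h* = factor * (k + 1) / n, with the factor 2.5 used in the text."""
    return factor * (k + 1) / n


rng = np.random.default_rng(0)
X_train = rng.normal(size=(21, 3))      # placeholder: 21 compounds, 3 descriptors
X_new = rng.normal(size=(4, 3))         # placeholder: 4 query compounds

h_star = critical_leverage(k=3, n=21)
inside_domain = leverages(X_train, X_new) < h_star
print(h_star, inside_domain)
```

A Williams plot is then obtained by plotting the standardized residual of each compound against its leverage and drawing a vertical line at h*.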
2.5. Molecular docking
Molecular docking is a reliable computational method for analyzing the interactions between
potential drugs and the active site of the target protein, in order to better understand the
key structural requirements of a geometric model as a function of binding energy. In the
present work, we performed a molecular docking study of compounds 4 and 2, which have the
same R1 and R2 substitution but are different enantiomers (R and S), with compound 4 having
higher activity than compound 2. The compounds were docked in the active site of
EGFR L858R/T790M/C797S (PDB code 6lud), which was downloaded from the RCSB Protein Data
Bank [30]. First, we used Discovery Studio software [31] to prepare the protein by removing
all water molecules, ligands and non-protein parts. AutoDock Tools software version 1.5.6
was used to analyze ligand-protein interactions [32]. The 3D grid was created by the
AUTOGRID algorithm to measure the energies of ligand-protein interactions [33]. The grid
maps were generated using 60 Å in all Cartesian directions, with a default grid point
spacing of 0.375 Å. The grid centre coordinates were approximately (-53.330 Å, -5.149 Å,
-17.579 Å), corresponding to the location of the ligand in the complex. The 2D and 3D
visualization and the analysis of the docked conformations according to the established
interactions were performed using Discovery Studio [31].
2.5.1. Docking validation protocol
The molecular docking simulation was validated by re-docking the co-crystallized ligand into
the protein (6lud.pdb). The native ligand was separated from the protein and docked into the
same protein. The lowest-energy pose of the docked ligand was superimposed on the native
ligand in order to calculate the root mean square deviation (RMSD). For the docking protocol
to be considered valid, the RMSD must be less than 2 Å [34,35].
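The re-docking check can be sketched as a plain RMSD between matched heavy-atom coordinates. The sketch assumes both poses are already expressed in the same receptor frame with atoms in the same order, so no fitting is applied; the coordinates are toy values and the function name is hypothetical.

```python
import numpy as np


def pose_rmsd(coords_a, coords_b):
    """RMSD (in the units of the inputs) between two poses with matched atom order."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("poses must contain the same atoms in the same order")
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


# Toy three-atom ligand: the "docked" pose is the native pose shifted by 0.5 angstrom in x.
native = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.4, 0.0]])
docked = native + np.array([0.5, 0.0, 0.0])

rmsd = pose_rmsd(native, docked)
print(rmsd, "valid" if rmsd < 2.0 else "invalid")
```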
2.6. In silico pharmacokinetics ADMET
Advances in computational technology in the pharmaceutical field have made it possible to
identify new drug candidates while reducing the number of experimental tests and improving
the success rate. With this in mind, the in silico study provides estimates of the ADMET
(Absorption, Distribution, Metabolism, Excretion and Toxicity) pharmacokinetic parameters
[36,37]: absorption describes the uptake of the drug in the human gut; distribution covers
the penetration of the drug into the central nervous system and across the blood-brain
barrier; metabolism is the chemical biotransformation of the drug by the body; excretion is
the elimination of the drug by the body; and toxicity describes the toxicity level of the
drug.
2.7. Molecular dynamic simulation

Molecular dynamics (MD) simulations were performed using the Nanoscale Molecular
Dynamics (NAMD) program [38]. The NAMD input files were generated with CHARMM-GUI [39],
using the CHARMM36 force field. The system was solvated with the TIP3 water model in a cubic
box extending 10 Å around the protein and neutralised by adding KCl at an ionic
concentration of 0.15 M, using the Monte-Carlo method for ion positioning [40]. The energy
was minimised for 10,000 steps using the steepest descent method. After minimisation, the
system was equilibrated at 310 K for 100 ps in an ensemble with a constant number of atoms,
volume and temperature (NVT). The system was then submitted to an unrestrained 100 ns
production MD simulation in a constant number of atoms, pressure and temperature (NPT)
ensemble, with a reference temperature of 310 K and a pressure of 1 atm. The MD trajectories
were analysed with Visual Molecular Dynamics (VMD) software [41] to obtain the root mean
square deviation (RMSD) and root mean square fluctuation (RMSF), which were used to check
the stability of the systems.
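The two stability measures can be sketched on a coordinate array as below. This is an illustration with NumPy rather than the VMD workflow used in the study; it assumes the trajectory frames have already been aligned to a common reference, and the trajectory is a synthetic placeholder.

```python
import numpy as np


def rmsd_to_first_frame(traj):
    """Per-frame RMSD relative to frame 0; traj has shape (n_frames, n_atoms, 3)."""
    diff = traj - traj[0]
    return np.sqrt(np.mean(np.sum(diff ** 2, axis=2), axis=1))


def rmsf_per_atom(traj):
    """Per-atom RMSF around the time-averaged position."""
    diff = traj - traj.mean(axis=0)
    return np.sqrt(np.mean(np.sum(diff ** 2, axis=2), axis=0))


# Synthetic placeholder: 100 frames of 5 atoms fluctuating around fixed positions.
rng = np.random.default_rng(0)
reference = rng.normal(size=(5, 3))
trajectory = reference + rng.normal(scale=0.2, size=(100, 5, 3))

print(rmsd_to_first_frame(trajectory)[:3])
print(rmsf_per_atom(trajectory))
```

A plateau in the RMSD curve suggests that the complex has equilibrated, while peaks in the RMSF profile point to flexible regions.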
3. Results and discussion

3.1. Molecular alignment
The compound-based molecular alignment method was performed using the available
alignment rule to build a powerful 3D-QSAR model. The most active compound of the series,
compound 21, was chosen as the template molecule to align the data set and to visualise the
CoMSIA contour maps. Figure 2 shows the alignment of all 3D molecular structures of the data
set to the common core, based on the best conformation of compound 21.
Figure 2. Superposition and alignment of the 29 studied compounds using molecule 21 as a template.
3.2. CoMSIA studies
The statistical parameters of the CoMSIA models are presented in Table 2. For the CoMSIA
analysis, combinations of the five molecular fields were used to develop the different
CoMSIA models. However, the hydrogen bond donor field is zero for the molecules in the
series. The results (Table 2) indicate that the best combination was the electrostatic,
hydrophobic and hydrogen bond acceptor fields (EHA), with contributions of 24.7 %, 11.7 %
and 63.6 %, respectively. In the CoMSIA/EHA model, Q² was 0.571 with two as the optimal
number of components, R² was 0.852 with a reliable SEE of 0.434, and the F-test value was
51.812. This model also gave the highest predictive value for the external validation of the
test set, R²pred = 0.715.
Table 2. PLS statistical results of the CoMSIA models for different molecular field combinations.
                                                          Fractions
Model         Q²     R²     RMSE   F       ONC  R²pred  Ster   Elec   Hyd    Don  Acc
CoMSIA/SEA    0.584  0.844  0.445  48.687  2    0.603   0.121  0.216  -      -    0.663
CoMSIA/SEH    0.386  0.816  0.497  25.198  3    0.408   0.249  0.432  0.319  -    -
CoMSIA/SHA    0.579  0.849  0.437  50.773  3    0.676   0.170  -      0.132  -    0.698
CoMSIA/EHA    0.571  0.852  0.434  51.812  2    0.715   -      0.247  0.117  -    0.636
CoMSIA/SEHA   0.567  0.846  0.442  69.893  2    0.680   0.112  0.198  0.108  -    0.582
Overall, a proposed model is considered a reliable predictive model if the Q² and R² values
are greater than 0.5 and 0.6, respectively, with an R²pred value for the activity of new
compounds greater than 0.6. Thus, the CoMSIA/EHA model shows statistical significance and
good predictive quality, which was confirmed by the external validation. To further assess
the stability and predictive power of the CoMSIA/EHA model, several validation methods were
used: the Y-randomization test, the Golbraikh and Tropsha criteria and the Roy criteria.
Table 3 presents the results of the Y-randomization test of the CoMSIA/EHA model.
Table 3. Q²yrand, R²yrand and cR²r values of the CoMSIA/EHA model after several Y-randomization
tests.
Iteration  Q²yrand  R²yrand  cR²r
1          -0.22    0.340    0.66
2          -0.239   0.388    0.628
3          -0.176   0.330    0.666
4          0.060    0.360    0.647
5          -0.143   0.404    0.617
For the results in Table 3, five random shuffles of the Y vector were performed, following
the Y-randomization test described above. The values of Q²yrand, R²yrand and cR²r indicate
that the chance correlation in the training set is tolerable. This shows that the results
obtained from the original CoMSIA/EHA model were not due to chance correlation. The results
of the external validation test with the Golbraikh and Tropsha criteria and the Roy criteria
for the CoMSIA/EHA model are listed in Table 4.
Table 4. Statistical parameters for the validation of the CoMSIA/EHA model.
Parameter         Validation criterion   CoMSIA/EHA
Q²                Q² > 0.5               0.571
r²                r² > 0.6               0.715
|r0² - r'0²|      < 0.3                  0.02
k                 0.85 < k < 1.15        1.026
(r² - r0²)/r²     < 0.1                  0.02
k'                0.85 < k' < 1.15       0.973
(r² - r'0²)/r²    < 0.1                  0.084
r²m               r²m > 0.5              0.628
r'²m              r'²m > 0.5             0.581
The results in Table 4 show that the CoMSIA/EHA model satisfies the Golbraikh and Tropsha
criteria as well as the Roy criteria. The CoMSIA/EHA model passed all validation tests,
indicating good accuracy in predicting the activity of new compounds. To determine the
applicability domain of this model, we used the Williams plot presented in Figure 3.
Figure 3. Williams plot for the developed CoMSIA/EHA model.
The AD of the CoMSIA/EHA model was assessed by a leverage analysis expressed as a Williams
plot (Figure 3). The results indicate that all leverage values of the training and test sets
were below the critical leverage value (h* = 0.47), except for compound 29, an outlier that
lies above the critical leverage and belongs to the test set. Because there were no outliers
in the training set, the test set of the CoMSIA/EHA model was accurately predicted.
Therefore, we can reliably predict the anticancer activity of new compounds using this
model. The CoMSIA/EHA contour maps were then used to analyse the structural requirements
for the design of new active compounds.
3.3. Graphical interpretation of CoMSIA model
The CoMSIA/EHA model was used to visualise the three-dimensional equipotential contour maps,
using compound 21, which has the highest activity, as a template. Figure 4 shows the
electrostatic, hydrophobic and hydrogen bond acceptor field contour maps of the CoMSIA
model.
Figure 4. (a) Electrostatic, (b) hydrophobic and (c) hydrogen bond acceptor contour maps of the
CoMSIA analysis of compound 21.
In the CoMSIA electrostatic contour map, we observed a red contour near the pyrrolidine
and piperidine rings of the R2 substituent, indicating that an electronegative substitution
can improve the activity. This is consistent with the fact that all compounds with
electronegative substituents at the R2 position show higher activity. Therefore, the
presence of electronegative groups in the R2 substituent could improve activity.
In the CoMSIA hydrophobic contour map, we observed a gray contour near the pyrrolidine
and piperidine rings of the R2 substituent, indicating that a hydrophilic substitution is
required in this region. This is consistent with the fact that compounds 1-6, with
tert-butyl formate as the hydrophilic group, have a higher activity than compounds 7-12,
respectively. Thus, the presence of a hydrophilic group bound to the pyrrolidine or
piperidine ring can increase the biological activity.
In the hydrogen bond acceptor contour map, the red contours near the R2 substituent indicate
that a hydrogen bond acceptor in this position is unfavourable. The presence of hydrogen
bond acceptor groups thus decreases the biological activity, which is due to the nature of
the receptor in this region, which can itself be a hydrogen bond acceptor. This is supported
by the fact that compounds 24 and 28, with hydrogen bond donor groups, show better
activities, and by the fact that compounds with hydrogen in the R1 position are more active
than compounds with fluorine in the same position. However, a small magenta contour near the
nitrogen atom of the pyrrolidine or piperidine ring of R2 is also observed, indicating that
hydrogen bond acceptor atoms are favourable in this position only. This can explain why
compound 21 shows the best activity of the series. In general, hydrogen bond acceptor groups
are unfavourable to inhibitory activity. From Table 2, we notice that the hydrogen bond
acceptor field plays a key role in predicting anticancer activity. In the CoMSIA/EHA model,
the hydrogen bond acceptor field explains 63.6% of the variance, which explains why hydrogen
bond acceptor groups are essential for the inhibition of this process.
3.4. Molecular docking

The molecular docking study was conducted to obtain information on key structural
requirements and to analyze the interactions established with the EGFR L858R/T790M/C797S
protein. Figure 5 presents the interaction modes obtained by molecular docking for compounds
2 (IC50 = 0.58 μM) and 3 (IC50 = 0.012 μM).
Figure 5. 2D and 3D docking poses showing interactions of compounds 3 and 2 in the binding site of
the EGFR L858R/T790M/C797S protein. (a) Compound 3 (binding energy -9.52 kcal/mol). (b) Compound 2
(binding energy -9.96 kcal/mol).
The docking of compound 3 with EGFR L858R/T790M/C797S showed three hydrogen bonding
interactions, with MET790, MET793 and SER797, at distances of 3.07 Å, 1.65 Å and 2.52 Å,
respectively. In addition, Pi-sulfur interactions are observed between the two purine rings
and the amino acid MET792 (4.12 Å, 5.38 Å). Compound 2, however, forms only two hydrogen
bonding interactions, with MET790 (2.19 Å) and MET793 (1.79 Å), and one halogen bonding
interaction with ASP855 (2.69 Å), as well as two Pi-sulfur interactions between the two
purine rings and MET790 (4.35 Å, 5.86 Å).
Compounds 3 and 2 thus formed two hydrogen bonds and Pi-sulfur interactions with the same
residues and in the same positions. However, the more active compound 3 formed an
additional hydrogen bond with the most important mutated residue, SER797, in the EGFR
binding region, which enhanced the inhibitory activity. The presence of these hydrogen
bonds strengthened the binding of the compounds to the protein and allowed the compounds to
have strong inhibitory activity.
According to Table 1, the fluorine substituent in the R1 group decreased the inhibitory
activity compared to the hydrogen substituent. This could be due to the formation of a
halogen interaction with the fluorine substituent that blocks the ligand in the active site,
thus decreasing the stability of the ligand-receptor complex. Furthermore, the CoMSIA/EHA
hydrogen bond acceptor contour map shows that hydrogen bond acceptor groups on the ligands
are unfavourable for anticancer activity. This observation is consistent with the 3D docking
results, which show that the EGFR L858R/T790M/C797S protein acts as a hydrogen bond acceptor
at this position, supporting the hypothesis made in the CoMSIA contour map section.
3.4.1. Docking validation protocol
To validate the ability of the docking algorithm to predict the conformation of the ligand
bound to the EGFR L858R/T790M/C797S protein, the co-crystallized ligand was re-docked to
test the accuracy of the docking procedure. Figure 6 shows the superposition of the docked
and native ligand conformations, with an RMSD value of 0.869 Å, which is below the 2 Å
threshold. To assess the quality of the docking pose in the protein, the interactions of the
crystallized and docked ligands were visually inspected (Figure 7).
Figure 6. Re-docking pose with an RMSD value of 0.869 Å (green = native pose, blue = docked pose).
Figure 7. (a) 2D visualization of the interactions of the predicted ligand pose. (b) 2D
visualization of the interactions of the crystallographic ligand pose.
The visual inspection shows that the docked pose reproduces the same interaction modes as
the experimental pose (Figure 7). This indicates that the docking protocol is highly
reliable in reproducing the binding mode of the novel EGFR L858R/T790M/C797S inhibitors.
3.5. Design of new compounds

The main objective of this study is to design new EGFR inhibitors that overcome the
L858R/T790M/C797S mutations. To this end, new drug candidates were designed using the
recommendations from the 3D-QSAR and molecular docking analyses (Figure 8), based on the
structural characteristics of compound 21. Ten 9H-purine derivatives (N1-N10) were designed
to improve anticancer activity; the proposed substituents were taken from the ZINC database.
These newly designed compounds were aligned using compound 21 as a template. The previously
established CoMSIA/EHA model and molecular docking with the EGFR L858R/T790M/C797S protein
were used to predict the activity of these new compounds. The structures of the newly
designed molecules, the predicted pIC50 and IC50 values, the calculated leverage values and
the molecular docking interactions are presented in Table 5.
Figure 8. Summary of structural requirements based on the analysis of the CoMSIA/EHA contour maps
and the molecular docking study.
Table 5. Molecular docking interactions with the EGFR L858R/T790M/C797S protein, the leverage threshold
h* and predicted IC50 based on the CoMSIA/EHA model for the newly designed compounds. The structures of
the newly designed molecules are given as drawings in the original table.
Compound  pIC50 (pred)  IC50 (pred)  Leverage threshold h*  Binding affinity (kcal/mol)  Number of hydrogen bonds  Amino acid residues with hydrogen bonding (distances)
N1        7.5432        0.0286       0.8065                 -10.3                        2                         MET793 (1.81 Å, 2.54 Å)
N2        9.1590        0.0007       0.3717                 -10.55                       3                         SER797 (2.35 Å, 2.73 Å); MET793 (1.71 Å); MET790 (2.57 Å)
N3        8.8819        0.0013       0.3486                 -9.17                        4                         ASP800 (2.92 Å); SER797 (2.27 Å); MET793 (1.81 Å); LEU718 (2.19 Å)
N4        7.0516        0.0887       0.4616                 -9.95                        2                         GLU804 (2.31 Å); SER797 (2.41 Å)
N5        8.6883        0.002        0.4160                 -9.3                         3                         SER797 (2.22 Å); MET793 (2.04 Å, 2.71 Å)
N6        7.0456        0.09         0.2360                 -10.69                       2                         MET790 (2.73 Å); MET793 (1.78 Å)
N7        7.2338        0.058        0.4589                 -10.57                       2                         SER797 (2.35 Å); MET793 (1.91 Å)
N8        8.0158        0.0096       0.1332                 -9.3                         3                         SER797 (2.38 Å); MET793 (2.05 Å, 2.53 Å)
N9        7.5243        0.0299       0.2335                 -9.2                         3                         SER797 (1.96 Å); MET793 (2.31 Å, 2.49 Å)
N10       7.3045        0.0496       0.6577                 -10.38                       2                         SER797 (2.24 Å, 2.90 Å); MET793 (2.21 Å, 2.16 Å); MET790 (2.37 Å)
The first step of a reliable prediction is to verify that all predicted new molecules belong to
the application domain of the proposed model. The CoMSIA/EHA model has a critical
leverage value h* = 0.47, all the designed novel compounds have a threshold leverage value
er
h* lower than the critical leverage value, except for compounds N1 and N10 which could
belong to another chemical family, this result shows that N2-N9 compounds were reliably
pe
predicted anticancer activity. Using the CoMSIA/EHA model, the compound T2 also showed
better activity than all compounds in the data set. Furthermore, we carried out molecular
docking for ten new compounds designed with the EGFRL858R/T790M/C797S protein. The results
indicate that almost all compounds formed hydrogen bonding interactions with the amino
ot
acids at positions 797 and 790 of the protein, both of which play an important role in the
therapeutic failure of the anticancer drug. However, the inhibitory activity of compounds N1
tn
and N10 cannot be reliably predicted, but the molecular docking results are encouraging and
may lead to a new inhibitor family. Overall, all compounds show a good level of inhibitory
activity and can theoretically overcome the problem of drug resistance in lung cancer.
3.6. ADMET prediction
To assess whether the designed compounds could become drugs, we predicted their ADMET
pharmacokinetic parameters with the online tool pkCSM [42]. The ADMET parameters of the
newly designed compounds are listed in Table 6.
Table 6. In silico ADMET properties of the newly designed compounds.

           Absorption    Distribution                            Metabolism (CYP)                      Excretion        Toxicity
           Intestinal    VDss        BBB           CNS           Substrate  Inhibitor                  Total clearance  AMES
           absorption    (human)     permeability  permeability  (Yes/No)   (Yes/No)                   (log ml/min/kg)  toxicity
Compound   (% absorbed)  (log L/kg)  (log BB)      (log PS)      2D6  3A4   1A2  2C19  2C9  2D6  3A4                    (Yes/No)
N1         90.063        0.556       -1.608        -2.823        No   No    No   Yes   Yes  No   Yes   0.758            No
N2         87.61         0.623       -1.43         -2.525        No   Yes   No   Yes   Yes  No   Yes   0.229            No
N3         67.671        0.432       -1.492        -3.755        No   No    No   No    Yes  No   Yes   0.485            Yes
N4         78.808        0.585       -1.709        -2.826        No   No    No   Yes   Yes  No   Yes   0.221            Yes
N5         82.792        -0.197      -1.252        -2.146        No   Yes   Yes  Yes   Yes  No   Yes   0.302            Yes
N6         97.823        -0.039      -1.485        -3.391        No   Yes   No   Yes   Yes  No   Yes   0.638            No
N7         78.764        0.036       -1.775        -3.944        No   No    No   No    Yes  No   Yes   0.149            Yes
N8         81.253        0.058       -1.917        -4.339        No   No    No   No    Yes  No   Yes   0.089            No
N9         90.08         0.063       -1.697        -4.004        No   No    No   No    Yes  No   Yes   0.328            No
N10        94.626        0.079       -1.489        -3.844        No   No    No   Yes   Yes  No   Yes   0.5              No
An absorption value below 30% is considered poor intestinal absorption; the ten designed
compounds showed values between 67.671% and 97.823%, indicating good intestinal absorption.
The volume of distribution (VDss) is considered low if logVDss < -0.15 and high if
logVDss > 0.45. For blood-brain barrier (BBB) permeability, a compound with LogBB < -1 is
poorly distributed to the brain, while a compound with LogBB > 0.3 is likely to cross the
BBB. For central nervous system (CNS) permeability, a compound with LogPS > -2 is considered
to penetrate the CNS, while a compound with LogPS < -3 has difficulty entering the CNS [43].
Thus, the compounds N1, N2, N6 and N8-N10 have an excellent potential to cross the barriers.
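The cut-offs quoted above can be applied mechanically to the values in Table 6. The sketch below is only an illustration of those thresholds: the function name and the category labels ("not poor", "moderate", "intermediate") are assumptions introduced here, not pkCSM output.

```python
def classify_admet(absorption_pct, log_vdss, log_bb, log_ps):
    """Label one compound using the thresholds quoted in the text."""
    return {
        # < 30 % absorbed is considered poor intestinal absorption
        "absorption": "poor" if absorption_pct < 30 else "not poor",
        # logVDss < -0.15 is low, > 0.45 is high
        "vdss": "low" if log_vdss < -0.15 else "high" if log_vdss > 0.45 else "moderate",
        # LogBB < -1 is poorly distributed to the brain, > 0.3 likely crosses the BBB
        "bbb": "poor" if log_bb < -1 else "likely" if log_bb > 0.3 else "intermediate",
        # LogPS > -2 penetrates the CNS, < -3 has difficulty entering the CNS
        "cns": "penetrant" if log_ps > -2 else "poor" if log_ps < -3 else "intermediate",
    }


# Two rows of Table 6: (% absorbed, logVDss, LogBB, LogPS)
table6 = {
    "N1": (90.063, 0.556, -1.608, -2.823),
    "N8": (81.253, 0.058, -1.917, -4.339),
}

for name, values in table6.items():
    print(name, classify_admet(*values))
```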
Enzymatic metabolism refers to the chemical biotransformation of drugs in the human body
and plays a crucial role in their metabolic stability [44]. The cytochrome P450 enzymes
(CYP1A2, CYP3A4, CYP2C19, CYP2D6 and CYP2C9) found in the liver are the main
drug-metabolising enzymes and are responsible for the biotransformation of more than 90% of
drugs. Inhibition of these enzymes can increase the concentration of the active drug in the
body. CYP3A4 is the main human enzyme responsible for the metabolism of the third-generation
drug used to treat NSCLC [45–47]. The results show that all newly designed compounds are
predicted to be CYP3A4 inhibitors, but only compounds N2, N5 and N6 are predicted to be
CYP3A4 substrates. All newly designed compounds showed a low total clearance value, which
suggests accumulation and persistence of the drugs in the body. Finally, compounds N1, N2,
N6 and N8-N10 were predicted to be non-toxic in the AMES test. Overall, the newly designed
compounds N1, N2, N6 and N8-N10 exhibit good pharmacokinetic properties and may represent
excellent drug candidates for the treatment of NSCLC that overcome the L858R/T790M/C797S
mutations in EGFR. Figure 9 shows the 2D visualization of the molecular docking results of
these best predicted compounds, which were then subjected to a molecular dynamics study.
Figure 9. 2D docking poses showing the interactions of compounds N1, N2, N6 and N8-N10 in
the binding site of the EGFRL858R/T790M/C797S protein.
3.7. MD simulations
After the molecular docking and ADMET studies of the predicted compounds, MD simulations of
the best predicted compounds N1, N2, N6 and N8-N10 were performed, using the RMSD and RMSF
parameters to analyse the dynamic behaviour and stability of the target protein. The RMSD
and RMSF plots of the EGFRL858R/T790M/C797S protein in complex with Osimertinib and with the
best predicted compounds are shown in Figure 10 (a) and (b), respectively.
Figure 10. (a) RMSD values of the EGFRL858R/T790M/C797S protein in complex with Osimertinib
and the six best ligands over 100 ns, and (b) RMSF values of the EGFRL858R/T790M/C797S
protein residues in complex with Osimertinib and the six best ligands.
Analysis of the RMSD plots shows that all systems exhibited a rapid increase in RMSD from
0.53 Å to 1.2 Å within the first 40 ns. Thereafter, all systems fluctuated within a similar
range of 1.2 Å to 1.5 Å, implying that all systems reached a state of stability and
equilibrium. The most stable complex was the experimental crystallographic structure of the
EGFRL858R/T790M/C797S protein bound to the drug Osimertinib (PDB ID: 6LUD), obtained from
the RCSB Protein Data Bank, with an RMSD value of 1.082 Å. Osimertinib serves as the
reference ligand in this study. The RMSD plot shows that all the proposed ligands follow a
trajectory similar to that of the reference ligand, with only slight differences: the RMSD
values of ligands N1, N2, N6, N8, N9 and N10 complexed with EGFRL858R/T790M/C797S were
1.231 Å, 1.137 Å, 1.173 Å, 1.169 Å, 1.159 Å and 1.251 Å, respectively. All ligands had a
value below 1.3 Å, and the N2 ligand had the best stability. According to Beura and Chetti
[48], an RMSD of less than 3 Å is an indicator of the conformational stability of
ligand-protein complexes. Therefore, it can be concluded that all predicted ligand complexes
with the EGFRL858R/T790M/C797S protein show high conformational stability.
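For readers unfamiliar with the metric, the RMSD between two conformations can be sketched as follows. This is a generic NumPy illustration using Kabsch superposition, not the VMD workflow used in the study; the function name and the toy coordinates are assumptions.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove the translation by centring both structures on their centroids
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Kabsch algorithm: optimal rotation from the SVD of the covariance matrix
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Correct for a possible reflection so that R is a proper rotation
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


# Toy example: a reference "structure" and a slightly perturbed copy
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 3))
frame = reference + rng.normal(scale=0.1, size=reference.shape)
print(f"RMSD = {kabsch_rmsd(frame, reference):.3f}")
```

Applying such a function to every frame of a trajectory against a fixed reference gives an RMSD-versus-time curve of the kind shown in Figure 10 (a).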
The RMSF profiles measure how ligand binding affects the flexibility of the protein during
the 100 ns MD simulation, which gives crucial information on the receptor's stability,
stiffness and compactness. A high RMSF value indicates that a residue is flexible, while a
low RMSF value indicates that it is stable. In general, the majority of residues share
similar RMSF values, with larger fluctuations at a few residues such as LYS754 (1.35 Å),
LEU782 (1.14 Å), GLY874 (1.20 Å), SER921 (1.12 Å) and GLU1005 (1.11 Å). These residues are
not involved in binding, as they are located in the inactive regions of the
EGFRL858R/T790M/C797S protein. However, crucial residues in the active site such as LEU718,
ASP800, GLU804, MET790, MET793, SER797 and ARG858 show smaller fluctuations, with RMSF
values below 0.4 Å, which could be related to the formation of more hydrogen bonding
interactions that stabilise the ligands in the EGFRL858R/T790M/C797S protein. These data
confirm the RMSD results: all predicted ligand complexes with the EGFRL858R/T790M/C797S
protein show high conformational stability.
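The per-residue RMSF can be sketched in the same spirit. The snippet below assumes the trajectory frames have already been superimposed on a common reference and that one representative atom per residue is used; the function name and the toy data are illustrative only.

```python
import numpy as np


def rmsf(frames):
    """Per-atom RMSF from an (n_frames, n_atoms, 3) array of pre-aligned coordinates."""
    X = np.asarray(frames, dtype=float)
    mean_pos = X.mean(axis=0)                      # time-averaged position of each atom
    sq_dev = np.sum((X - mean_pos) ** 2, axis=2)   # squared deviation per frame and atom
    return np.sqrt(sq_dev.mean(axis=0))            # root of the time-averaged deviation


# Toy trajectory: atom 0 is rigid, atom 1 oscillates by +/- 1 along x
toy_frames = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
]
print(rmsf(toy_frames))  # the rigid atom gives 0, the oscillating atom gives 1
```

Residues with low values in such a profile are the rigid ones; flexible loops show up as peaks.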
4. Conclusion
In summary, 3D-QSAR, docking, ADMET and MD simulation methods were used to design new drug
candidates capable of overcoming resistance to third-generation drugs in NSCLC. In the
3D-QSAR study, the best selected model (CoMSIA/EHA) has high stability and predictive
ability, which were assessed using external validation, the Y-randomisation test and the
applicability domain. The CoMSIA/EHA contour map analyses provided a better understanding of
the relationship between structure and activity, which helped guide the design of new potent
compounds. The molecular docking analysis shows the importance of the hydrogen bonds
established with key residues and confirms the importance of residues such as MET790, MET793
and SER797 in the active site of the EGFR protein. Based on the recommendations provided by
the 3D-QSAR and molecular docking analyses, 10 new compounds with considerable predicted
activity were designed using the ZINC virtual database. The ADMET properties were used to
select the compounds with the best pharmacokinetic profile. Finally, MD simulations verified
the reliability and stability of the molecular docking results, in which essential residues
formed hydrogen bonds between the proposed compounds and the EGFR protein. The newly
designed compounds N1, N2, N6 and N8-N10 could be good drug candidates to overcome
resistance to third-generation drugs against NSCLC.
References
[1] H. Sung, J. Ferlay, R.L. Siegel, M. Laversanne, I. Soerjomataram, A. Jemal, F. Bray,
Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality
Worldwide for 36 Cancers in 185 Countries, CA. Cancer J. Clin. 71 (2021) 209–249.
https://doi.org/10.3322/CAAC.21660.
[2] Cancer, (n.d.). https://www.who.int/news-room/fact-sheets/detail/cancer (accessed
March 27, 2021).
[3] K. Seegobin, U. Majeed, N. Wiest, R. Manochakian, Y. Lou, Y. Zhao, Immunotherapy
in Non-Small Cell Lung Cancer With Actionable Mutations Other Than EGFR, Front.
Oncol. 11 (2021) 5040. https://doi.org/10.3389/FONC.2021.750657.
[4] F. Ciardiello, G. Tortora, EGFR antagonists in cancer treatment, N. Engl. J. Med. 358
(2008) 1160–1174. https://doi.org/10.1056/NEJMRA0707704.
[5] A. Harandi, A.S. Zaidi, A.M. Stocker, D.A. Laber, Clinical efficacy and toxicity of
anti-EGFR therapy in common cancers, J. Oncol. (2009).
https://doi.org/10.1155/2009/567486.
[6] S. V. Sharma, D.W. Bell, J. Settleman, D.A. Haber, Epidermal growth factor receptor
mutations in lung cancer, Nat. Rev. Cancer 7 (2007) 169–181.
https://doi.org/10.1038/nrc2088.
[7] I. Solassol, F. Pinguet, X. Quantin, FDA- and EMA-Approved Tyrosine Kinase
Inhibitors in Advanced EGFR-Mutated Non-Small Cell Lung Cancer: Safety,
Tolerability, Plasma Concentration Monitoring, and Management, Biomolecules 9
(2019) 668. https://doi.org/10.3390/BIOM9110668.
[8] M.H. Cohen, G.A. Williams, R. Sridhara, G. Chen, W.D. McGuinn, D. Morse, S.
Abraham, A. Rahman, C. Liang, R. Lostritto, A. Baird, R. Pazdur, United States Food
and Drug Administration Drug Approval summary: Gefitinib (ZD1839; Iressa) tablets,
Clin. Cancer Res. 10 (2004) 1212–1218.
https://doi.org/10.1158/1078-0432.CCR-03-0564.
[9] A.F. Gazdar, Activating and resistance mutations of EGFR in non-small-cell lung
cancer: role in clinical response to EGFR tyrosine kinase inhibitors, Oncogene. 28
Suppl 1 (2009) S24–S31. https://doi.org/10.1038/ONC.2009.198.
[10] M. Tiseo, M. Bartolotti, F. Gelsomino, P. Bordi, Emerging role of gefitinib in the
treatment of non-small-cell lung cancer (NSCLC), Drug Des. Devel. Ther. 4 (2010) 98.
https://doi.org/10.2147/DDDT.S6594.
[11] W. Pao, V. Miller, M. Zakowski, J. Doherty, K. Politi, I. Sarkaria, B. Singh, R. Heelan,
V. Rusch, L. Fulton, E. Mardis, D. Kupfer, R. Wilson, M. Kris, H. Varmus, EGF
receptor gene mutations are common in lung cancers from “never smokers” and are
associated with sensitivity of tumors to gefitinib and erlotinib, Proc. Natl. Acad. Sci.
101 (2004) 13306–13311. https://doi.org/10.1073/PNAS.0405220101.
[12] M. Singh, H.R. Jadhav, Targeting non-small cell lung cancer with small-molecule
EGFR tyrosine kinase inhibitors, Drug Discov. Today. 23 (2018) 745–753.
https://doi.org/10.1016/J.DRUDIS.2017.10.004.
[13] D. Westover, J. Zugazagoitia, B.C. Cho, C.M. Lovly, L. Paz-Ares, Mechanisms of
acquired resistance to first- and second-generation EGFR tyrosine kinase inhibitors,
Ann. Oncol. 29 (2018) i10–i19. https://doi.org/10.1093/ANNONC/MDX703.
[14] M.R. V. Finlay, M. Anderton, S. Ashton, P. Ballard, P.A. Bethel, M.R. Box, R.H.
Bradbury, S.J. Brown, S. Butterworth, A. Campbell, C. Chorley, N. Colclough, D.A.E.
Cross, G.S. Currie, M. Grist, L. Hassall, G.B. Hill, D. James, M. James, P. Kemmitt, T.
Klinowska, G. Lamont, S.G. Lamont, N. Martin, H.L. McFarland, M.J. Mellor, J.P.
Orme, D. Perkins, P. Perkins, G. Richmond, P. Smith, R.A. Ward, M.J. Waring, D.
Whittaker, S. Wells, G.L. Wrigley, Discovery of a potent and selective EGFR inhibitor
(AZD9291) of both sensitizing and T790M resistance mutations that spares the wild
type form of the receptor, J. Med. Chem. 57 (2014) 8249–8267.
https://doi.org/10.1021/JM500973A.
[15] J.J. Chabon, A.D. Simmons, A.F. Lovejoy, M.S. Esfahani, A.M. Newman, H.J.
Haringsma, D.M. Kurtz, H. Stehr, F. Scherer, C.A. Karlovich, T.C. Harding, K.A.
Durkin, G.A. Otterson, W.T. Purcell, D.R. Camidge, J.W. Goldman, L. V. Sequist, Z.
Piotrowska, H.A. Wakelee, J.W. Neal, A.A. Alizadeh, M. Diehn, Circulating tumour
DNA profiling reveals heterogeneity of EGFR inhibitor resistance mechanisms in lung
cancer patients, Nat. Commun. 7 (2016). https://doi.org/10.1038/NCOMMS11815.
[16] H. Hadni, M. Bakhouch, M. Elhallaoui, 3D-QSAR, molecular docking, DFT and
ADMET studies on quinazoline derivatives to explore novel DHFR inhibitors,
J. Biomol. Struct. Dyn. (2021) 1–15.
https://doi.org/10.1080/07391102.2021.2004233.
[17] S. Sarvagalla, S.B. Syed, M.S. Coumar, An Overview of Computational Methods,
Tools, Servers, and Databases for Drug Repurposing, in: In Silico Drug Des., Elsevier,
2019: pp. 743–780. https://doi.org/10.1016/b978-0-12-816125-8.00025-0.
[18] H. Hadni, M. Elhallaoui, 3D-QSAR, docking and ADMET properties of aurone
analogues as antimalarial agents, Heliyon. 6 (2020) e03580.
https://doi.org/10.1016/j.heliyon.2020.e03580.
[19] H. Lei, S. Fan, H. Zhang, Y.J. Liu, Y.Y. Hei, J.J. Zhang, A.Q. Zheng, M. Xin, S.Q.
Zhang, Discovery of novel 9-heterocyclyl substituted 9H-purines as
L858R/T790M/C797S mutant EGFR tyrosine kinase inhibitors, Eur. J. Med. Chem.
186 (2020) 111888. https://doi.org/10.1016/J.EJMECH.2019.111888.
[20] G. Klebe, U. Abraham, T. Mietzner, Molecular Similarity Indices in a Comparative
Analysis (CoMSIA) of Drug Molecules To Correlate and Predict Their Biological
Activity, J. Med. Chem. 37 (1994) 4130–4146. https://doi.org/10.1021/jm00050a010.
[21] R.R. Mittal, L. Harris, R.A. McKinnon, M.J. Sorich, Partial charge calculation method
affects CoMFA QSAR prediction accuracy, J. Chem. Inf. Model. 49 (2009) 704–709.
https://doi.org/10.1021/ci800390m.
[22] M.J.D. Powell, Restart procedures for the conjugate gradient method, Math. Program.
12 (1977) 241–254. https://doi.org/10.1007/BF01593790.
[23] S. Wold, A. Ruhe, H. Wold, W.J. Dunn, III, The Collinearity Problem in Linear
Regression. The Partial Least Squares (PLS) Approach to Generalized Inverses, SIAM
J. Sci. Stat. Comput. 5 (1984) 735–743. https://doi.org/10.1137/0905052.
[24] K. Roy, On some aspects of validation of predictive quantitative structure-activity
relationship models, Expert Opin. Drug Discov. 2 (2007) 1567–1577.
https://doi.org/10.1517/17460441.2.12.1567.
[25] K. Roy, I. Mitra, On Various Metrics Used for Validation of Predictive QSAR Models
with Applications in Virtual Screening and Focused Library Design, Comb. Chem.
High Throughput Screen. 14 (2011) 450–474.
https://doi.org/10.2174/138620711795767893.
[26] A. Golbraikh, A. Tropsha, Beware of q2!, J. Mol. Graph. Model. 20 (2002) 269–276.
https://doi.org/10.1016/S1093-3263(01)00123-1.
[27] K. Roy, S. Kar, P. Ambure, On a simple approach for determining applicability domain
of QSAR models, Chemom. Intell. Lab. Syst. 145 (2015) 22–29.
https://doi.org/10.1016/j.chemolab.2015.04.013.
[28] T.I. Netzeva, A.P. Worth, T. Aldenberg, R. Benigni, M.T.D. Cronin, P. Gramatica, J.S.
Jaworska, S. Kahn, G. Klopman, C.A. Marchant, G. Myatt, N. Nikolova-Jeliazkova,
G.Y. Patlewicz, R. Perkins, D.W. Roberts, T.W. Schultz, D.T. Stanton, J.J.M. van de
Sandt, W. Tong, G. Veith, C. Yang, Current Status of Methods for Defining the
Applicability Domain of (Quantitative) Structure-Activity Relationships, Altern. to
Lab. Anim. 33 (2005) 155–173. https://doi.org/10.1177/026119290503300209.
[29] S. Kar, K. Roy, J. Leszczynski, Applicability Domain: A Step Toward Confident
Predictions and Decidability for QSAR Modeling, Methods Mol. Biol. 1800 (2018)
141–169. https://doi.org/10.1007/978-1-4939-7899-1_6.
[30] K. Kashima, H. Kawauchi, H. Tanimura, Y. Tachibana, T. Chiba, T. Torizawa, H.
Sakamoto, CH7233163 Overcomes Osimertinib-Resistant EGFR-Del19/T790M/C797S
Mutation, Mol. Cancer Ther. 19 (2020) 2288–2297.
https://doi.org/10.1158/1535-7163.MCT-20-0229.
[31] D.S. BIOVIA, Discovery studio modeling environment, San Diego, Dassault Syst.
Release 4 (2015). https://doi.org/10.11436/mssj.17.98.
[32] G.M. Morris, R. Huey, W. Lindstrom, M.F. Sanner, R.K. Belew, D.S. Goodsell, A.J.
Olson, AutoDock4 and AutoDockTools4: Automated docking with selective receptor
flexibility, J. Comput. Chem. 30 (2009) 2785–2791. https://doi.org/10.1002/jcc.21256.
[33] G.M. Morris, D.S. Goodsell, R.S. Halliday, R. Huey, W.E. Hart, R.K. Belew, A.J.
Olson, Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical
Binding Free Energy Function, J. Comput. Chem. 19 (1998) 1639–1662.
https://doi.org/10.1002/jcc.20634.
[34] K. Onodera, K. Satou, H. Hirota, Evaluations of Molecular Docking Programs for
Virtual Screening, J. Chem. Inf. Model. 47 (2007) 1609–1618.
https://doi.org/10.1021/ci7000378.
[35] G.L. Warren, C.W. Andrews, A.-M. Capelli, B. Clarke, J. LaLonde, M.H. Lambert, M.
Lindvall, N. Nevins, S.F. Semus, S. Senger, G. Tedesco, I.D. Wall, J.M. Woolven, C.E.
Peishoff, M.S. Head, A Critical Assessment of Docking Programs and Scoring
Functions, J. Med. Chem. 49 (2006) 5912–5931. https://doi.org/10.1021/jm050362n.
[36] L.L.G. Ferreira, A.D. Andricopulo, ADMET modeling approaches in drug discovery,
Drug Discov. Today. 24 (2019) 1157–1165.
https://doi.org/10.1016/j.drudis.2019.03.015.
[37] C.Y. Jia, J.Y. Li, G.F. Hao, G.F. Yang, A drug-likeness toolbox facilitates ADMET
study in drug discovery, Drug Discov. Today. 25 (2020) 248–258.
https://doi.org/10.1016/j.drudis.2019.10.014.
[38] J.C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R.D.
Skeel, L. Kalé, K. Schulten, Scalable molecular dynamics with NAMD, J. Comput.
Chem. 26 (2005) 1781–1802. https://doi.org/10.1002/JCC.20289.
[39] S. Jo, T. Kim, V.G. Iyer, W. Im, CHARMM-GUI: A web-based graphical user
interface for CHARMM, J. Comput. Chem. 29 (2008) 1859–1865.
https://doi.org/10.1002/JCC.20945.
[40] W. Im, S. Seefeld, B. Roux, A Grand Canonical Monte Carlo–Brownian Dynamics
Algorithm for Simulating Ion Channels, Biophys. J. 79 (2000) 788–801.
https://doi.org/10.1016/S0006-3495(00)76336-3.
[41] W. Humphrey, A. Dalke, K. Schulten, VMD: Visual molecular dynamics, J. Mol.
Graph. 14 (1996) 33–38. https://doi.org/10.1016/0263-7855(96)00018-5.
[42] D.E. V. Pires, T.L. Blundell, D.B. Ascher, pkCSM: Predicting Small-Molecule
Pharmacokinetic and Toxicity Properties Using Graph-Based Signatures, J. Med.
Chem. 58 (2015) 4066–4072. https://doi.org/10.1021/acs.jmedchem.5b00104.
[43] D.E. Clark, In silico prediction of blood–brain barrier permeation, Drug Discov. Today.
8 (2003) 927–933. https://doi.org/10.1016/S1359-6446(03)02827-7.
[44] S. Kok-Yong, L. Lawrence, Drug Distribution and Drug Elimination, in: Basic
Pharmacokinet. Concepts Some Clin. Appl., InTech, 2015.
https://doi.org/10.5772/59929.
[45] D.R. Duckett, M.D. Cameron, Metabolism considerations for kinase inhibitors in
cancer treatment, Expert Opin. Drug Metab. Toxicol. 6 (2010) 1193.
https://doi.org/10.1517/17425255.2010.506873.
[46] M.K. Bollinger, A.S. Agnew, G.P. Mascara, Osimertinib: A third-generation tyrosine
kinase inhibitor for treatment of epidermal growth factor receptor-mutated non-small
cell lung cancer with the acquired Thr790Met mutation, J. Oncol. Pharm. Pract. 24
(2018) 379–388. https://doi.org/10.1177/1078155217712401.
[47] A. Kenneth MacLeod, D. Lin, J.T.J. Huang, L.A. McLaughlin, C.J. Henderson, C.
Roland Wolf, Identification of Novel Pathways of Osimertinib Disposition and
Potential Implications for the Outcome of Lung Cancer Therapy, Clin. Cancer Res. 24
(2018) 2138–2147. https://doi.org/10.1158/1078-0432.CCR-17-3555.
[48] S. Beura, P. Chetti, In-silico strategies for probing chloroquine based inhibitors against
SARS-CoV-2, J. Biomol. Struct. Dyn. 39 (2021) 3747–3759.
https://doi.org/10.1080/07391102.2020.1772111.