Professional Documents
Culture Documents
A Comparative Linear Interaction Energy and MM/PBSA Study On SIRT1-Ligand Binding Free Energy Calculation
A Comparative Linear Interaction Energy and MM/PBSA Study On SIRT1-Ligand Binding Free Energy Calculation
Article
■ INTRODUCTION
A quantitative knowledge of protein−ligand binding affinities is
theory9 and the molecular mechanics/Poisson−Boltzmann
surface area (MM/PBSA) approach.10
In LIE, ΔGbind is assumed to be linearly proportional with
essential in understanding molecular recognition; hence,
the differences in van der Waals and electrostatic interactions
efficient and accurate binding free energy (ΔGbind) calculations involving the ligand and its environment, as obtained from
are a central goal in computer-aided drug design.1 An arsenal simulations of its protein-bound and unbound states in solvent.
of ΔGbind calculation approaches is available, ranging from Differences in these interactions (modeled using Lennard-
rigorous alchemical methods such as free energy perturbation Jones (LJ) and Coulomb potential-energy functions) are scaled
(FEP)2 and thermodynamic integration (TI)3 to fast empirical by LIE parameters α and β, respectively.9 Originally, β was set
scoring functions featured in molecular docking.4 The latter are to several fixed values according to a series of studied
prominent in predicting protein−ligand binding poses and in systems.9,11−15 Later, it turned out that fixed values for β are
discriminating binders and nonbinders within large chemical usually only suitable for particular systems of interest and not
databases,5 but they typically lack accuracy in quantitatively generally transferable between different systems.16 To mitigate
ranking and predicting ΔGbind values.6 In contrast, rigorous this, a proposal was made to treat both α and β as freely
alchemical methods may provide reliable estimates for ΔGbind adjustable parameters that can be fitted based on a set of
but require extensive sampling of multiple intermediate experimentally determined ΔGbind values.17,18 The fitted values
nonphysical states and are thus computationally more of α and β (and optionally offset parameter γ) determine the
expensive and still impractical for use in high-throughput LIE scoring function to be used for predicting ΔGbind for
scenarios.7 ligands with unknown affinity, which is given by9
Compared to these counterparts, the alternate end-point
methods offer an intermediate in terms of effectiveness and ΔGbind = αΔV vdw + β ΔV ele + γ (1)
efficiency in ΔGbind computation by allowing one to explicitly
with γ set to zero in this work (unless noted otherwise), and
explore ligand, protein−ligand, and solvent configurational
space in the protein−ligand bound and unbound states only.8 ΔV vdw = ⟨Vligvdw vdw
− surr ⟩bound − ⟨Vlig − surr ⟩unbound (2)
This provides advantages both over rigorous free energy
calculations (in terms of efficiency) and empirical scoring
functions (in terms of potential accuracy). Frequently applied Received: July 24, 2019
end-point methods make use of linear interaction energy (LIE) Published: August 28, 2019
Table 1. Compound ID, Molecular Structure, and Experimental Values for Binding Free Energies (ΔGbind) of SIRT1 Inhibitors
Used in the Current Study, with ΔGbind Values Derived from Experimental Data Obtained by Disch et al.37,a
Table 1. continued
Table 1. continued
a
The compound numbering (IDs) as listed below is the same as used by Disch et al.37
In eq 4, Gcomplex represents the free energy of the complex, binding poses as those manually selected by Karaman and
and Gprotein and Gligand represent the free energies of the Sippl31 in their MM/GBSA study as the most probable ones.
unbound protein and ligand, respectively. The separate free In a final step, we monitor the contributions from the different
energy terms Gx for the protein, ligand, or protein−ligand terms in the central LIE and MM/PBSA equations to the
complex are each quantified as23 calculations in (trends in) binding affinity, in order to evaluate
the possibility to further increase efficiency and accuracy in
Gx = V bonded + V vdw + V ele + G polar + Gnonpolar − TS computing SIRT1 affinities.
where V bonded
(5)
comprises bond-stretch, angle-bend, torsion, and
improper-dihedral energies, and Vvdw and Vele are the van der
■ METHODS
Reference Data and Input File Preparation. Reference
Waals and electrostatic nonbonded interaction energies, data for the SIRT1 ligand set (i.e., experimentally determined
respectively. Together, the sum of these terms make up the values ΔGexp for protein−ligand binding free energies) were
vacuum MM energy terms, while Gpolar and Gnonpolar constitute taken from Disch et al.37 (Table 1). All 27 compounds have a
the solvation free energies calculated using a continuum thieno[3,2-d]pyrimidine-6-carboxamide scaffold, and ΔGexp
(implicit) solvation model, representing the free energy change ranges between −29.3 and −50.3 kJ mol−1. Values for ΔGexp
due to converting a solute in vacuum into its solvated state. In were consistently obtained from the set of IC50 inhibition
MM/PBSA, Gpolar is obtained by solving the Poisson− constants reported by Disch et al., using the Cheng−Prusoff
Boltzmann equation, while the Gnonpolar term is often relation38,39
approximated by a solvent accessible surface area (SASA) IC
term.24 Replacing Poisson−Boltzmann with Generalized ΔGexp = RT ln 50
2 (6)
Born25 for calculating Gpolar is on the basis of the alternative
MM/GBSA approach.10 Here, T and S denote the temperature where R is the gas constant, T is the temperature, and the
of the system and the entropy of the solute in vacuum, concentration of agonist used to measure the IC50 data is
respectively. The TS terms entering eq 4 via Gcomplex, Gprotein, assumed to be equal to its EC50 value.39
and Gligand account for the change in conformational entropy of Analogous to previous MD studies,31,40 the SIRT1 crystal
the protein−ligand complex upon ligand binding. This can be structure with PDB ID 4I5I41 was used as the protein template
approximated, for example, via normal-mode analysis of the structure for docking and subsequent MD. Along with the
protein−ligand coordinates obtained from the simulation of protein atom coordinates, the positions of the zinc ion and one
the complex. Often this term is neglected,26 which is justified, of the water molecules in the binding pocket31 were retained to
for example, by the typical interest in relative binding free obtain the template. Protein structure preparation was
energies of similar compounds. performed using the QuickPrep function of Molecular
As previously reported by Genheden and Ryde27 among Operating Environment (MOE) version 2015.1042 to add
others, LIE can be more compute efficient up to almost 1 order hydrogens, assign protonation states and partial charges, and
of magnitude when compared to MM/GBSA (which is due to perform energy minimization prior to docking. During
the entropy27 and polar term28 in eq 5), while the relative minimization, the protein and zinc ion were described using
accuracy of both methods in predicting experimental data can the Amber14SB force field.43
be system dependent, and this conclusion also applies to MM/ To obtain starting structures for ligand binding poses, two
PBSA.27 In the current work, we evaluate if LIE can achieve alternative strategies were used: ligand−binding poses were
similar (or higher) accuracy compared to MM/PBSA in either taken directly from Karaman and Sippl31 or obtained
calculating binding affinities of 27 thieno[3,2-d]pyrimidine-6- from our docking. An advantage of using the Karaman poses
carboxamide analogs for the sirtuin 1 (SIRT1) receptor. SIRT1 can be that these were carefully selected for all ligands such
is a member of the sirtuin family which is responsible for that their scaffold showed maximal structural overlap with the
regulating and/or deacetylating stress resistance and metabo- scaffold of ligands 11c, 28, and 31, which were co-crystallized
lism processes,29 and inhibition of SIRT1 has been proposed in PDB structures 4JSR, 4JT8, and 4JT9 of the homologous
for treating cancer, HIV-1 infections, mental retardation SIRT3 protein, respectively,37 after fitting these onto the 4I5I
syndrome, and parasitic diseases.30 Results will also be SIRT1 structure. In this way, the superimposed poses of all
compared to a MM/GBSA model of Karaman and Sippl31 ligands resemble the poses of the co-crystallized inhibitors,
for these ligands binding to SIRT1. which suggests that they occupy both the acetyl lysine
Although LIE needs experimental data to train the model substrate channel and the nicotinamide C-pocket of SIRT1
parameters in eq 1, their calibration gives access to direct that overlaps with the NAD+ binding channel.31,37
estimates of ΔGbind for individual (query) compounds. This Ligand-binding poses were also generated using the
makes it straightforward to use a Boltzmann-like weighting PLANTS (Protein−Ligand Ant System) docking software
scheme to combine results of multiple MD simulations starting version 1.244 with the speed 1 setting and evaluated using the
from different protein conformations and/or ligand binding ChemPLP scoring function.45 Coordinates for the docking
modes into a single calculation per compound.32,33 Previously, center were defined as the average of the center-of-mass of
we successfully used this strategy to compute binding affinities ligands in the Karaman binding poses, and a docking radius of
for flexible proteins such as Cytochrome P450s that can bind 0.8 nm was used. For each ligand, 100 docking poses were
substrates and/or inhibitors in multiple ways.33−36 In the generated and subjected to hierarchical clustering with the
current work, we evaluate the potential of this strategy to maximum number of clusters set to 3. For each cluster, a
compute SIRT1 binding free energies by concatenating results representative pose with the smallest deviation to the average
of MD simulations starting with binding poses obtained from cluster coordinate set was selected as an initial structure for
unsupervised docking and clustering. In addition, we MD simulations. Ligand topologies for MD were generated
investigate if the Boltzmann-like weighting scheme can identify according to the general Amber force field (GAFF)46 at the
4022 DOI: 10.1021/acs.jcim.9b00609
J. Chem. Inf. Model. 2019, 59, 4018−4033
Journal of Chemical Information and Modeling Article
Table 2. Model Parameters α and β of LIE Models for SIRT1 and Their Respective Root-Mean-Square Error (RMSE),
Standard Deviation of Error in Prediction (SDEP, based on leave-one-out cross validation), and Correlation Metrics
Represented by r2, q2, Pearson’s r (rP), and Spearman’s ρ (ρS) for the Data Set of 27 Compoundsa
α β RMSE SDEP r2 q2 rP ρS
LIEsingle 0.40 −0.04 4.21 4.66 0.52 0.35 0.72 0.72
LIEγ=−10.24 0.30 −0.05 4.00 4.54 0.52 0.38 0.72 0.73
LIEmult 0.42 −0.19 6.16 6.59 0.16 −0.30 0.40 0.43
LIEcomb 0.40 −0.05 4.30 4.88 0.49 0.29 0.70 0.70
LIEBoltzmann4 0.40 −0.02 4.21 4.67 0.49 0.35 0.70 0.69
LIEfirst1ns 0.41 −0.04 3.96 4.39 0.54 0.42 0.74 0.70
LIEfirst5ns 0.41 −0.02 4.41 4.91 0.46 0.28 0.68 0.68
LIEfirst10ns 0.40 −0.04 4.27 4.73 0.51 0.33 0.71 0.72
LIEfirst15ns 0.40 −0.04 4.27 4.71 0.51 0.34 0.71 0.71
LIElast1ns 0.41 0.03 4.58 5.01 0.48 0.25 0.69 0.65
LIElast5ns 0.40 −0.03 4.28 4.75 0.52 0.32 0.72 0.69
LIElast10ns 0.40 −0.05 4.41 4.83 0.50 0.30 0.70 0.70
LIElast15ns 0.40 −0.05 4.30 4.74 0.52 0.33 0.72 0.72
LIEequil 0.41 −0.02 4.50 5.30 0.45 0.16 0.67 0.66
LIEβ=0 0.41 − 4.24 4.40 0.51 0.42 0.71 0.67
LIEα=0 − −1.67 31.3 32.1 0.05 −29.9 0.23 0.18
vdwbound 0.19 − 3.79 3.94 0.58 0.54 0.76 0.76
vdwunbound 0.35 − 4.39 4.56 0.49 0.38 0.70 0.67
a
RMSEs and SDEPs are given in kJ mol−1.
where γsurf is related to the surface tension of the solvent and is from the ideal in-plane 180° orientation and the heavy
set to its default value of 2.26778 kJ mol−1 nm−2, while a atom acceptor neighbor−acceptor−H angle did not
default value (3.84982 kJ mol−1) was also used for fitting deviate more than 90°.77 Donor atom SYBYL types: N.3,
parameter b. A solvent probe radius of 0.14 nm was employed N.2, N.acid, N.ar, N.am, N.4, N.pl3, N.plc, and O.3 with
to determine the SASA.24 Due to its high computational at least one attached hydrogen. Acceptor atom SYBYL
demand and large standard error compared to other types: N.3, N.2, N.1, N.acid, N.ar, O.3, O.co2, O.2, S.m,
contributing terms,6,20,31,66,69−72 the entropic term was not and S.a not of a donor character. H-bonds part of a salt
included in the MM/PBSA models presented in this work. bridge were labeled as such by the salt-bridge detection
Protein−ligand Interaction Profiling. To analyze method.
protein−ligand interactions that occurred during the MD • Salt bridges: identified as all interactions between the
simulations, we built a dedicated Python-based biomolecular heavy atoms representing the geometric centers of two
analysis library MDInteract, which we made freely available at charged groups of opposite signs within 0.55 nm
https://github.com/MD-Studio/MDInteract. MDInteract adds distance.78 The H-bond component of the salt bridge
structure analysis methods to the API of the popular pandas was added by the H-bond detection method.
data analysis toolkit.73 The MD trajectory and structure file • Cation−π interactions: identified as all interactions
formats are imported using the MDTraj library.74 MD between the center of an aromatic ring and a center of
trajectories were analyzed for the following biomolecular positive charge if the distance was within 0.6 nm and the
interaction types at a 100 ps resolution similar to previous offset between charge center and ring center after
studies performed by us:35,61,75 projection on the same plane was less than 0.2 nm. In
• Hydrophobic interactions: identified as all contacts case the charge center was a tertiary amine, the angle
between carbon atoms within 0.4 nm having only C or between charge center and ring plane should be within
H atoms as neighbors. Interacting pairs from the same 90° ± 30°.79
source atom to multiple target atoms in the same • Halogen bonds: identified as pairs of halogen bond donor
residue, in both directions, were filtered to retain only (I, Br, Cl, F) and acceptor atoms (O, N, and S having a
the pair with the smallest distance. Note that atoms single atom neighbor of type P, C, or S) within 0.41 nm
involved in π- or T-stacking were separately considered where the donor angle (C−X−O) was within 165° ±
(see below). 30° and the acceptor angle (Y−O−X) was within 120°
• π- and T-stacking: identified as the geometric centers of ± 30°.80
two aromatic rings within 0.55 nm, with an offset of no The frequency of interactions per ligand pose to interacting
more than 0.2 nm after projection of the geometric amino acid residues was normalized by the number of
center of one ring onto the other and an angle between replicates (i.e., by dividing interaction counts by two) and
the normals of both rings of 180° ± 30° for π-stacking or was represented as heat maps using the seaborn Python
90° ± 30° for T-stacking.76 Aromatic rings were package version 0.8.81 Ligand−residue interaction counts that
detected using a SSSR algorithm (Smallest Set of are less than 20 were discarded for further analysis.
Smallest Rings) filtering for aromatic planar rings.
• Hydrogen bonds (H-bond): identified as all interactions
between H-bond donor−acceptor pairs within 0.41 nm
■ RESULTS AND DISCUSSION
LIE and MM/PBSA Models. Tables 2 and 3 summarize the
where the donor−H−acceptor angle was within 50° predictive ability of our LIE and MM/PBSA models, which
4024 DOI: 10.1021/acs.jcim.9b00609
J. Chem. Inf. Model. 2019, 59, 4018−4033
Journal of Chemical Information and Modeling Article
Table 3. Performance of MM/PBSA Models for SIRT1 in When calibrating a LIE model using per ligand the binding
Terms of Their Correlation Metrics, Represented by r2, q2, pose proposed by Karaman and Sippl31 as a (single) starting
Pearson’s r (rP), and Spearman’s ρ (ρS) for the Data Set of pose for MD, we obtain a model (LIEsingle in Table 2) with
27 Compounds RMSE = 4.21 kJ mol−1, which is within the typical
experimental uncertainty of 4−5 kJ mol−1.87,88 Overall, this
r2 rP ρS
suggests satisfactory performance of the model parameters
MM/PBSAε=2 0.41 0.64 0.61 used. Including the optional offset γ parameter did not improve
MM/PBSAε=1 0.39 0.63 0.60 model performance further: The resulting LIEγ=−10.24 model
MM/PBSAε=4 0.22 0.47 0.45 (with γ = −10.24 kJ mol−1) showed a slight reduction in
MM/PBSAε=8 0.11 0.33 0.31 RMSE and SDEP of 0.21 and 0.12 kJ mol−1, respectively, but it
MM/PBSAf irst1ns 0.28 0.53 0.46 did not show an increase in correlation coefficients compared
MM/PBSAf irst5ns 0.41 0.64 0.64 to LIEsingle (Table 2). Figure 1 illustrates that LIEsingle shows a
MM/PBSAf irst10ns 0.42 0.65 0.63
slightly better correlation between calculated and experimental
MM/PBSAf irst15ns 0.41 0.64 0.63
ΔGbind values than the MM/PBSA model in which εsolute is set
MM/PBSAlast1ns 0.36 0.60 0.56
to its default g_mmpbsa value of 2 (MM/PBSAε=2 in Table 3).
MM/PBSAlast5ns 0.35 0.59 0.56
This is confirmed by the slightly higher r2 (0.52 vs 0.41) and
MM/PBSAlast10ns 0.36 0.60 0.58
Pearson’s and Spearman’s coefficients (0.72 vs 0.64 or 0.61) of
MM/PBSAlast15ns 0.39 0.62 0.60
LIEsingle compared to MM/PBSAε=2 (Tables 2 and 3). It should
MM/PBSAele 0.19 0.43 0.08
be kept in mind however that α and β in the LIE equations are
MM/PBSAvdW 0.60 0.77 0.76
fitted parameters. Here, q2 of LIEsingle (0.35) is in fact slightly
MM/PBSApolar 0.36 −0.60 −0.57
lower than r2 for MM/PBSAε=2 (0.41), and the value of −0.04
MM/PBSASASA 0.57 0.76 0.75
for β hints at possible overfitting. As discussed in more detail
MM/PBSAnoele 0.04 0.20 0.14
below, this could be resolved by constraining β to 0 in a
MM/PBSAnopolar 0.49 0.70 0.71
separate LIEβ=0 model, which has a q2 value of 0.42 (Table 2).
In accordance with previous findings from MM/PBSA
was evaluated in terms of obtained deviations from and studies on other protein systems,66 we found a clear effect of
correlation with experimental data. The latter was assessed by εsolute on obtained correlations in the MM/PBSA models.
calculating r2, Pearson’s r (rP), and Spearman’s ρ (ρS) Setting its value to 1 or 4 led to slightly and significantly lower
coefficients.82 For the calibrated LIE models, we also report correlation coefficients, respectively, while using εsolute = 8
correlation coefficients q2 obtained from leave-one-out cross further decreased the quality of the model when compared to
validation (LOO-CV). Deviations of computed ΔGbind values MM/PBSAε=2 (Table 3). Even though, values of 6−7 or higher
from experiments were analyzed in terms of root-mean-square have been reported as average estimates for dielectric
errors (RMSEs) and standard deviations of errors in permittivities of protein binding pockets,89 εsolute is often set
predictions based on LOO-CV (SDEPs). In Tables 2 and 3, to 1 or 2.24,90 Our finding that the most successful models of
RMSEs and SDEPs are only reported for our LIE models MM/PBSA in predicting SIRT1−ligand binding use a low
because the MM/PBSA calculations do not allow one to value of εsolute is in line with the hydrophobic acetyl−lysine
directly compare predicted and experimental ΔGbind values for binding interface in SIRT1;91 as discussed by Karaman and
individual ligands.66,83−86 This is exemplified by the range of Sippl,31 the binding site of all inhibitors is characterized by six
predicted values obtained from one of the MM/PBSA models hydrophobic (Ala22, Ile30, Phe33, Phe57, Ile107, Phe174) and
discussed below, as presented in Figure 1 (right panel). two charged side chains (Asp108, His123).
Figure 1. Scatter plots for experimental (exp) and calculated (calc) free energies ΔGbind of ligand−SIRT1 binding. Calculated values are obtained
using the LIEsingle (left panel) or MM/PBSAε=2 model (right panel). The solid blue lines represent linear fits to the data, while the distributions of
data are represented by kernel density plots.
Figure 2. Ligand−residue interaction profiles for MD simulations used in the LIEsingle and MM/PBSA models (upper panel) and in the LIEmult
model (lower panel). Interaction counts are presented for individual SIRT1 residues (x-axis labels) and averaged over duplicated 20-ns runs per
ligand (y-axis labels in upper panel) or per combination of ligand and starting pose used for MD (y-axis labels in lower panel).
Our MM/PBSAε=2 model performs similarly as one of the Inclusion of Multiple Binding Poses. In a next step, we
MM/GBSA models presented by Karaman and Sippl,31 who investigated if we could use the iterative LIE scheme to
used a εsolute value of 1 and reported r2 values ranging between improve calculations and to develop a binding affinity model in
0.4 and 0.6 in their MM/GBSA study on SIRT binding. In an unsupervised manner. For that purpose, multiple binding
poses were directly obtained from docking and clustering.
addition, the similar performance of LIEsingle and MM/PBSAε=2
These were used as starting structures in MD simulations i in
is in line with the earlier work of Uciechowska et al., who eq 8. The resulting LIEmult model showed significantly higher
found comparable performance of LIE and MM/PBSA in RMSE and lower correlation coefficients than the LIEsingle
reproducing experimental data for a smaller set of SIRT2 model (Table 2). We analyzed if this decrease in predictive
binders.72 quality was due to either a lack of correct MD starting poses in
4026 DOI: 10.1021/acs.jcim.9b00609
J. Chem. Inf. Model. 2019, 59, 4018−4033
Journal of Chemical Information and Modeling Article
the LIEmult model or the fact that the protein does not bind its actions, iterative LIE training can subsequently be performed
ligands in multiple orientations (or due to a combination of in an unsupervised manner.
both). Here, we evaluated if Boltzmann-like weighting (eq 7)
We first explored the interactions between the ligands and within the LIEcomb model predicted the manually prepared
SIRT1 residues during simulations used in the LIEsingle and binding poses of Karaman and Sippl as the most probable ones
LIEmult models. The obtained interaction heat maps are by inspecting per ligand the starting pose used in the MD
depicted in Figure 2. They show high frequencies in three simulation with the highest weight Wi (Figure 4). We found
ligand−SIRT1 hot-spot interactions that were present in the that for 22 of the 27 ligands, the simulation starting from the
starting poses31 and persistent during the simulations used in binding pose used in the LIEsingle model shows the largest
the LIEsingle and MM/PBSA models (i.e., with Phe33, Ile107, weight Wi. For the five ligands for which the largest weight
and Asp108; Figure 3). Figure 2 shows that these frequencies comes from a simulation starting from one of the docked poses
used in LIEmult (19, 20, 21, 27, 38), we inspected the
corresponding pose (Figure 5) and discovered that the
scaffolds of three out of five ligands (19, 20, 21) show clear
structural overlap with the ligand pose used for LIEsingle (cf.
Figure 5 and the relatively low RMSDs in Table S1).
Interestingly, docked starting poses of ligands 27 (pose 1)
and 38 (pose 1) for the simulations with the highest weight Wi
in the LIEcomb model show less overlap (Figure 5) and higher
RMSDs (Table S1), but they still share similar interaction
profiles with the simulations used for these ligands in LIEsingle
(Figure 2). This highlights that similarity in protein−ligand
interactions may well be a better measure for comparable
modes and/or affinities of binding than structural overlap
alone, in agreement with results from previous studies in which
we used iterative LIE to evaluate binding to flexible proteins
such as CYPs or nuclear receptor FXR.34,35,61 This is further
exemplified by MD simulations starting from poses which
show relatively high contributions to LIEcomb calculations such
as in the case of pose 1 of compound 26, which does not have
the lowest RMSD of poses 1−3 when compared to the LIEsingle
starting pose (Table S1), but shows a similar pattern of
protein−ligand interactions (Figure 2). Subsequently, we
trained a LIE model using per ligand the simulation with the
highest weight Wi from LIEcomb, and we found that this
Figure 3. Residues of SIRT1 (Phe33, Ile107, Asp108) that show hot- LIEBoltzmann4 model performs similarly as LIEsingle and LIEcomb
spot interactions with the ligands during the simulations used, for
example, in the LIEsingle model. All ligands included in the model are (Table 2), confirming that the latter does not utilize too many
depicted in stick representation using different colors (in their starting different MD starting poses.
structure used for MD), while the interacting residues are labeled and Impact of Simulation Length. We also explored the
shown in ball-and-stick representation. effect of simulation length on ΔGbind computations by the
LIEsingle and MM/PBSAε=2 models. For that purpose, models
are significantly lower for most of the LIEmult simulations, were developed based on averages for the interaction energy
especially those involving interactions with Asp108. This and solvation terms over either the first or the last 1, 5, 10, or
finding indicates that the unsupervised selection of starting 15 ns of the production simulations starting from the Karaman
poses for MD leads to a larger spread in protein−ligand and Sippl poses (and using ε = 2 in the MM/PBSA models).
interactions during simulation than in the LIEsingle model. For The performance of these models is listed in Tables 2 and 3
most ligands, we found a relatively large root-mean-square (using subscripts f irstXns or lastXns) and was compared to the
deviation (RMSD) in atomic positions between the MD models based on 20-ns averages as described above.
starting poses used in LIEmult and the LIEsingle pose; for only MM/PBSA studies in the literature often report simulations
eight of the ligands, at least one of the three docked starting of tens of nanoseconds from which the last few nanoseconds
poses showed an RMSD of less than 0.1 nm (Table S1). Only are used to compute averages for the different terms in eq 5 or
when combining results of the MD simulations initiated from 9.64 Our analysis shows that for the current system of interest a
the starting poses of LIEsingle with those from LIEmult, we could simulation length of 5 ns (in the MM/PBSAf irst5ns model) is
train a combined model LIEcomb with similar accuracy and sufficient to obtain comparable correlation coefficients for
correlation coefficients (and α and β) as LIEsingle (Table 2). MM/PBSAε=2. However, by running longer simulation times
These findings indicate that docking and clustering for the and taking the last 1 or 5 ns to average over gives a slight
purpose of LIEmult training did not lead to sufficiently deterioration in model performance (cf. MM/PBSAlast1ns and
appropriate binding poses, while the retained predictive quality MM/PBSAlast5ns in Table 3, respectively). It is also worth
of the LIEcomb model (compared to LIEsingle) implies that mentioning that 1-ns LIE simulations are sufficient to achieve a
incorporating too many binding poses does not necessarily model (LIEf irst1ns) that shows even slightly better performance
lead to lower predictive quality. Thus, when being able to compared to LIEsingle (Table 2). The finding that shorter
generate and select appropriate binding poses from docking simulation lengths can improve LIE calculations is in line with
and/or experimental information on protein−ligand inter- previous results in iterative models that showed improved
4027 DOI: 10.1021/acs.jcim.9b00609
J. Chem. Inf. Model. 2019, 59, 4018−4033
Journal of Chemical Information and Modeling Article
Figure 4. Boltzmann weights Wi for the simulations (x-axis) used per ligand (y-axis) in the combined LIEcomb model. Starting poses 0 and 1−3 were
starting poses for the simulations used in the LIEsingle and LIEmult model, respectively, and the intensity of the color indicates the size of Wi per
simulation.
Figure 5. Comparison of ligand starting poses for the MD simulations that have the highest weights Wi in the combined LIEcomb model (solid cyan)
with the LIEsingle pose (transparent green).
performance when restricting sampling to local parts of typically observed time series of the different MM/PBSA and
conformational space,62,92 and it can be seen as a clear LIE terms. In conclusion, longer equilibration times do not
advantage of LIE in terms of its efficiency. It prompted us to necessarily lead to an improved agreement with experimental
generate a LIE model (LIEequil) using the output of our thermal data which makes the selection of the optimal simulation time
equilibration simulation at 300 K only, but this in turn showed to be an effective model parameter.
a slightly higher RMSE (4.50 kJ mol−1) and lower rP (0.67) Contributions of Different LIE and MM/PBSA Terms
than those for LIEsingle and LIEf irst1ns (Table 2). The reason that to ΔGbind Calculations. In the LIE models discussed above, β
longer equilibration may not necessarily lead to improved is close or practically equal to zero (Table 2). This indicates
correlation for the MM/PBSA model (which is in line with that electrostatic interactions do not significantly contribute to
previous findings in literatures66,93) and that longer simulations trends in binding affinity, which can be in line with the
are more required for MM/PBSA than for LIE is due to the hydrophobic character of the SIRT1 binding pocket.91 In a
relatively slow convergence of the MM/PBSA polar term.90,94 previous LIE study on a protein with a hydrophobic active site
This is illustrated in Figure 6, which shows an example of a (i.e., Cytochrome P450 1A2 or CYP1A2), fitted values for β
4028 DOI: 10.1021/acs.jcim.9b00609
J. Chem. Inf. Model. 2019, 59, 4018−4033
Journal of Chemical Information and Modeling Article
■
suggesting that hot-spot and other hydrogen-bonding inter-
actions with SIRT1 residues compensate for the ligand− CONCLUSIONS
solvent interactions in the unbound state. As a result, training a
LIE model with α set to zero (and using only β as fitting By comparing fitted LIE models for a set of 27 compounds
parameter) is of no meaning at all (cf. LIEα=0 in Table 2). binding to SIRT1 with single-trajectory-based MM/PBSA
Similarly as for LIE, we tried to simplify the MM/PBSA models, we find calculated ΔGbind values from our LIE models
models by only taking van der Waals interactions or the with a RMSE of 4.2 kJ mol−1, while the obtained correlations
nonpolar/SASA term into account. Table 3 shows that models are comparable (Pearson’s r = 0.72 and 0.64 for LIEsingle and
that only include the van der Waals VvdW complex interactions (MM/
MM/PBSAε=2, respectively). This indicates that if agreement
PBSAvdW, rP = 0.77) or the SASA ΔGnonpolar term in eq 9 (MM/ with experimental affinities in absolute terms is not of interest,
PBSASASA, rP = 0.76) show higher predictivity than models both methods have a comparable predictive quality. We also
based on electrostatic Vele
complex interactions (rP = 0.43) or the
found that longer equilibration may not necessarily lead to
polar ΔGpolar term only (rP = −0.60) and even than the overall improved correlation with experiments and that longer
MM/PBSAε=2 model. Our finding that LIE and MM/PBSA simulations are more required for MM/PBSA than for LIE
models with a reduced number of model terms can give similar due to the relatively slow convergence of the MM/PBSA polar
or slightly improved performance would make it of interest to term. In addition to the higher cost in computing this term
explore the use of extended LIE approaches such as the one when compared to the terms in the central LIE equation, this
recently proposed by He and co-workers,96 in which MM/ makes LIE more efficient.
PBSA-like terms are included together with six fitting For the studied protein and ligands, calibration of an
parameters. Based on our current findings, this number may iterative LIE model (that uses weighted contributions from
also be effectively reduced in a model for ligand binding to multiple simulations starting from different ligand-binding
SIRT1. In addition to the small electrostatic contribution to poses) did not lead to satisfactory computations when
trends in binding to the hydrophobic SIRT1 pocket (Table generating starting poses for MD simulations in unsupervised
S2), the reason that neglecting electrostatics and polar docking and clustering. During these simulations, we found
contributions can improve our MM/PBSA predictions can relatively low frequencies in ligand interactions with hot-spot
be explained by the relatively large uncertainty in determining protein residues, as analyzed using our Python library
complex and especially ΔG
Vele polar
(cf. Figure 6). The difficulty to MDInteract. In other words, careful selection of proper MD
reach convergence in the latter can be due to fluctuations in starting poses can be crucial in generating iterative LIE models.
contributions not only from the protein binding site but also Encouragingly, when combining results from simulations
from distant protein residues that may be less relevant for starting from the docked poses and from poses based on
ligand binding.22 The uncertainty in determining ΔGpolar was homologous crystal structure data (in the LIEcomb model),
4029 DOI: 10.1021/acs.jcim.9b00609
J. Chem. Inf. Model. 2019, 59, 4018−4033
Journal of Chemical Information and Modeling Article
Boltzmann-like weighting of the multiple simulations was able ment Fund for Education (LPDP), Ministry of Finance,
to identify appropriate starting configurations. Republic of Indonesia.
■
For our simulations with starting structures obtained in a
supervised way, we found relatively small values for the REFERENCES
difference ΔVele in electrostatic interactions in the bound and
unbound state. Apparently, ligand interactions with hot-spot (1) Murcko, M. A. Computational Methods to Predict Binding Free
Energy in Ligand-Receptor Complexes. J. Med. Chem. 1995, 38,
residues can largely compensate for the loss in polar 4953−4967.
interactions with the solvent upon binding. As a result, LIE (2) Zwanzig, R. W. High-Temperature Equation of State by a
parameter β was fitted to values close to zero. By constraining Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22,
it to zero, we could obtain a model with reduced overfitting 1420−1426.
compared to models in which β was used as a free parameter. (3) Kirkwood, J. G. Statistical Mechanics of Fluid Mixtures. J. Chem.
This was illustrated by a lower SDEP and higher q2 for the Phys. 1935, 3, 300−313.
former model in leave-one-out cross validation. We found that (4) Kitchen, D. B.; Decornez, H.; Furr, J. R.; Bajorath, J. Docking
the correlation between calculated and experimental ΔGbind and Scoring in Virtual Screening for Drug Discovery: Methods and
values could be maximized by including protein−ligand van Applications. Nat. Rev. Drug Discovery 2004, 3, 935.
der Waals interactions only (i.e., either in a MM/PBSA model (5) Shoichet, B. K.; McGovern, S. L.; Wei, B.; Irwin, J. J. Lead
Discovery Using Molecular Docking. Curr. Opin. Chem. Biol. 2002, 6,
complex in eq 9 as the only term or in a LIE model with α
with VvdW
439−446.
as the only free parameter and which only includes results from (6) Genheden, S.; Ryde, U. The MM/PBSA and MM/GBSA
the bound-complex simulations). On the other hand, our Methods to Estimate Ligand-Binding Affinities. Expert Opin. Drug
models could be made most efficient by only including ligand Discovery 2015, 10, 449−461.
properties, e.g., in a SASA-only-based model or in a LIE model (7) Amaro, R. E.; Baron, R.; McCammon, J. A. An Improved
that only includes van der Waals interaction energies involving Relaxed Complex Scheme for Receptor Flexibility in Computer-Aided
the unbound ligand. The finding that predictive models could Drug Design. J. Comput.-Aided Mol. Des. 2008, 22, 693−705.
be obtained in terms of van der Waals interactions only for this (8) de Ruiter, A.; Oostenbrink, C. Free Energy Calculations of
specific protein is in line with the hydrophobic character of its Protein−Ligand Interactions. Curr. Opin. Chem. Biol. 2011, 15, 547−
binding site and with the uncertainty in determining the 552.
ΔGpolar MM/PBSA term. (9) Åqvist, J.; Medina, C.; Samuelsson, J.-E. A New Method for
■
Predicting Binding Affinity in Computer-Aided Drug Design. Protein
Eng., Des. Sel. 1994, 7, 385−391.
ASSOCIATED CONTENT (10) Srinivasan, J.; Cheatham, T. E.; Cieplak, P.; Kollman, P. A.;
*
S Supporting Information Case, D. A. Continuum Solvent Studies of the Stability of DNA, RNA,
The Supporting Information is available free of charge on the and Phosphoramidate- DNA Helices. J. Am. Chem. Soc. 1998, 120,
ACS Publications website at DOI: 10.1021/acs.jcim.9b00609. 9401−9409.
(11) Åqvist, J.; Hansson, T. On the Validity of Electrostatic Linear
Root-mean-square deviations (RMSDs) of ligand−atom Response in Polar Solvents. J. Phys. Chem. 1996, 100, 9512−9521.
positions between starting poses used in the three (12) Hansson, T.; Marelius, J.; Åqvist, J. Ligand Binding Affinity
simulations per ligand in the LIEmult model and the Prediction by Linear Interaction Energy Methods. J. Comput.-Aided
starting pose of the LIEsingle simulations (Table S1). Mol. Des. 1998, 12, 27−35.
ΔVvdw and ΔVele values in kJ mol−1 per ligand as used in (13) Wang, W.; Wang, J.; Kollman, P. A. What Determines the van
the LIEsingle model (Table S2) (PDF) der Waals Coefficient β in the LIE (Linear Interaction Energy)
Method to Estimate Binding Free Energies Using Molecular
Ligand force field parameters: GROMACS include Dynamics Simulations? Proteins: Struct., Funct., Genet. 1999, 34,
topology (.itp) files per ligand (using ligand IDs from 395−402.
Table 1 for file nomenclature) and a separate atom type (14) Almlöf, M.; Carlsson, J.; Åqvist, J. Improving the Accuracy of
file (attype.itp) to specify van der Waals parameters used the Linear Interaction Energy Method for Solvation Free Energies. J.
in the GROMACS simulations (ZIP) Chem. Theory Comput. 2007, 3, 2162−2175.
■
(15) Gutiérrez-de Terán, H.; Åqvist, J. Computational Drug Discovery
and Design; Springer, 2012; pp 305−323.
AUTHOR INFORMATION (16) Valiente, P. A.; Gil, L. A.; Batista, P. R.; Caffarena, E. R.; Pons,
Corresponding Author T.; Pascutti, P. G. New Parameterization Approaches of the LIE
Method to Improve Free Energy Calculations of PlmII-Inhibitors
*E-mail: d.p.geerke@vu.nl. Complexes. J. Comput. Chem. 2010, 31, 2723−2734.
ORCID (17) Carlson, H. A.; Jorgensen, W. L. An Extended Linear Response
Daan P. Geerke: 0000-0002-5262-6166 Method for Determining Free Energies of Hydration. J. Phys. Chem.
1995, 99, 10667−10673.
Notes (18) Wall, I. D.; Leach, A. R.; Salt, D. W.; Ford, M. G.; Essex, J. W.
The authors declare no competing financial interest.
■
Binding Constants of Neuraminidase Inhibitors: an Investigation of
the Linear Interaction Energy Method. J. Med. Chem. 1999, 42,
ACKNOWLEDGMENTS 5142−5152.
(19) Wang, C.; Greene, D.; Xiao, L.; Qi, R.; Luo, R. Recent
Dr. Berin Karaman (Biruni University) and Prof. Wolfgang
Developments and Applications of the MMPBSA Method. Front. Mol.
Sippl (Martin-Luther-University of Halle-Wittenberg) are Biosci. 2018, 4, 87.
gratefully acknowledged for sharing their SIRT1−ligand (20) Š packová, N.; Cheatham, T. E.; Ryjácek, F.; Lankaš, F.; van
complex structures with us. We thank The Netherlands Meervelt, L.; Hobza, P.; Š poner, J. Molecular Dynamics Simulations
Organisation for Scientific Research (NWO, VIDI Grant and Thermodynamics Analysis of DNA-Drug Complexes. Minor
723.012.105) for financial support, and E.A.R. gratefully Groove Binding between 4, 6-diamidino-2-phenylindole and DNA
acknowledges financial support from the Indonesia Endow- Duplexes in Solution. J. Am. Chem. Soc. 2003, 125, 1759−1769.
(21) Lepšík, M.; Kříž, Z.; Havlas, Z. Efficiency of a Second- and Cheng-Prusoff Equations. Br. J. Pharmacol. 1993, 109, 1110−
Generation HIV-1 Protease Inhibitor Studied by Molecular Dynamics 1119.
and Absolute Binding Free Energy Calculations. Proteins: Struct., (40) Andika, A.; Erlina, L.; Azminah, A.; Yanuar, A. Molecular
Funct., Bioinf. 2004, 57, 279−293. Dynamics Simulation of SIRT1 Inhibitor from Indonesian Herbal
(22) Gilson, M. K.; Zhou, H.-X. Calculation of Protein-Ligand Database. J. Young Pharm. 2018, 10, 3−6.
Binding Affinities. Annu. Rev. Biophys. Biomol. Struct. 2007, 36, 21−42. (41) Zhao, X.; Allison, D.; Condon, B.; Zhang, F.; Gheyi, T.; Zhang,
(23) Kollman, P. A.; Massova, I.; Reyes, C.; Kuhn, B.; Huo, S.; A.; Ashok, S.; Russell, M.; MacEwan, I.; Qian, Y.; Jamison, J. A.; Luz,
Chong, L.; Lee, M.; Lee, T.; Duan, Y.; Wang, W.; Donini, O.; Cieplak, J. G. The 2.5 Å Crystal Structure of the SIRT1 Catalytic Domain
P.; Srinivasan, J.; Case, D. A.; Cheatham, T. E. Calculating Structures Bound to Nicotinamide Adenine Dinucleotide (NAD+) and an
and Free Energies of Complex Molecules: Combining Molecular Indole (EX527 Analogue) Reveals a Novel Mechanism of Histone
Mechanics and Continuum Models. Acc. Chem. Res. 2000, 33, 889− Deacetylase Inhibition. J. Med. Chem. 2013, 56, 963−969.
897. (42) Molecular Operating Environment (MOE), Version 2015.10;
(24) Kumari, R.; Kumar, R.; Open Source Drug Discovery Chemical Computing Group, Inc., Canada.
Consortium; Lynn, A. g_mmpbsa−A GROMACS Tool for High- (43) Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.;
Throughput MM-PBSA Calculations. J. Chem. Inf. Model. 2014, 54, Hauser, K. E.; Simmerling, C. ff14SB: Improving the Accuracy of
1951−1962. Protein Side Chain and Backbone Parameters from ff99SB. J. Chem.
(25) Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. Theory Comput. 2015, 11, 3696−3713.
Semianalytical Treatment of Solvation for Molecular Mechanics and (44) Korb, O.; Stützle, T.; Exner, T. E. An Ant Colony Optimization
Dynamics. J. Am. Chem. Soc. 1990, 112, 6127−6129. Approach to Flexible Protein−Ligand Docking. Swarm Intell 2007, 1,
(26) Shirts, M. R.; Mobley, D. L.; Brown, S. P. Drug Design: 115−134.
Structure- and Ligand-Based Approaches; Cambridge University Press: (45) Korb, O.; Stutzle, T.; Exner, T. E. Empirical Scoring Functions
New York, 2010; pp 61−86. for Advanced Protein-Ligand Docking with PLANTS. J. Chem. Inf.
(27) Genheden, S.; Ryde, U. Comparison of the Efficiency of the Model. 2009, 49, 84−96.
LIE and MM/GBSA Methods to Calculate Ligand-Binding Energies. (46) Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D.
J. Chem. Theory Comput. 2011, 7, 3768−3778. A. Development and Testing of a General Amber Force Field. J.
(28) Feig, M.; Onufriev, A.; Lee, M. S.; Im, W.; Case, D. A.; Brooks, Comput. Chem. 2004, 25, 1157−1174.
C. L., III Performance Comparison of Generalized Born and Poisson (47) Jakalian, A.; Bush, B. L.; Jack, D. B.; Bayly, C. I. Fast, Efficient
Methods in the Calculation of Electrostatic Solvation Energies for Generation of High-Quality Atomic Charges. AM1-BCC Model: I.
Protein Structures. J. Comput. Chem. 2004, 25, 265−284. Method. J. Comput. Chem. 2000, 21, 132−146.
(29) Guarente, L. Sirtuins in Aging and Disease. Cold Spring Harbor (48) Wang, J.; Wang, W.; Kollman, P. A.; Case, D. A. Automatic
Atom Type and Bond Type Perception in Molecular Mechanical
Symp. Quant. Biol. 2007, 72, 483−488.
(30) Villalba, J. M.; Alcaín, F. J. Sirtuin Activators and Inhibitors. Calculations. J. Mol. Graphics Modell. 2006, 25, 247−260.
(49) Case, D. A.; Berryman, J. T.; Betz, R. M.; Cerutti, D. S.;
BioFactors 2012, 38, 349−359.
Cheatham, T. E., III; Darden, T. A.; Duke, R. E.; Giese, T. J.; Gohlke,
(31) Karaman, B.; Sippl, W. Docking and Binding Free Energy
H.; Goetz, A. W.; Homeyer, N.; Izadi, S.; Janowski, P.; Kaus, J.;
Calculations of Sirtuin Inhibitors. Eur. J. Med. Chem. 2015, 93, 584−
Kovalenko, A.; Lee, T. S.; LeGrand, S.; Li, P.; Luchko, T.; Luo, R.;
598.
Madej, B.; Merz, K. M.; Monard, G.; Needham, P.; Nguyen, H.;
(32) Stjernschantz, E.; Oostenbrink, C. Improved Ligand-Protein
Nguyen, H. T.; Omelyan, I.; Onufriev, A.; Roe, D. R.; Roitberg, A.;
Binding Affinity Predictions Using Multiple Binding Modes. Biophys.
Salomon-Ferrer, R.; Simmerling, C. L.; Smith, W.; Swails, J.; Walker,
J. 2010, 98, 2682−2691.
R. C.; Wang, J.; Wolf, R. M.; Wu, X.; York, D. M.; Kollman, P. A.
(33) Perić-Hassler, L.; Stjernschantz, E.; Oostenbrink, C.; Geerke, D.
AMBER 2015; University of California, San Francisco, 2015.
P. CYP 2D6 Binding Affinity Predictions Using Multiple Ligand and
(50) Sousa da Silva, A. W.; Vranken, W. F. ACPYPE-Antechamber
Protein Conformations. Int. J. Mol. Sci. 2013, 14, 24514−24530. Python Parser Interface. BMC Res. Notes 2012, 5, 367.
(34) Capoferri, L.; Verkade-Vreeker, M. C.; Buitenhuis, D.; (51) Jorgensen, W. L.; Madura, J. D. Quantum and Statistical
Commandeur, J. N. M.; Pastor, M.; Vermeulen, N. P. E.; Geerke, Mechanical Studies of Liquids. 25. Solvation and Conformation of
D. P. Linear Interaction Energy Based Prediction of Cytochrome Methanol in Water. J. Am. Chem. Soc. 1983, 105, 1407−1413.
P450 1A2 Binding Affinities with Reliability Estimation. PLoS One (52) Hockney, R. W.; Eastwood, J. W. Computer Simulation Using
2015, 10, No. e0142232. Particles; CRC Press, 1988.
(35) van Dijk, M.; Ter Laak, A. M.; Wichard, J. D.; Capoferri, L.; (53) Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An N
Vermeulen, N. P. E.; Geerke, D. P. Comprehensive and Automated log (N) Method for Ewald Sums in Large Systems. J. Chem. Phys.
Linear Interaction Energy Based Binding-Affinity Prediction for 1993, 98, 10089−10092.
Multifarious Cytochrome P450 Aromatase Inhibitors. J. Chem. Inf. (54) Feenstra, K. A.; Hess, B.; Berendsen, H. J. Improving Efficiency
Model. 2017, 57, 2294−2308. of Large Time-Scale Molecular Dynamics Simulations of Hydrogen-
(36) Vosmeer, C. R.; Pool, R.; van Stee, M.; Perić-Hassler, L.; Rich Systems. J. Comput. Chem. 1999, 20, 786−798.
Vermeulen, N. P. E.; Geerke, D. P. Towards Automated Binding (55) Hess, B.; Bekker, H.; Berendsen, H. J.; Fraaije, J. G. LINCS: a
Affinity Prediction Using an Iterative Linear Interaction Energy Linear Constraint Solver for Molecular Simulations. J. Comput. Chem.
Approach. Int. J. Mol. Sci. 2014, 15, 798−816. 1997, 18, 1463−1472.
(37) Disch, J. S.; Evindar, G.; Chiu, C. H.; Blum, C. A.; Dai, H.; Jin, (56) Berendsen, H. J.; Postma, J.; van Gunsteren, W. F.; DiNola, A.;
L.; Schuman, E.; Lind, K. E.; Belyanskaya, S. L.; Deng, J.; Coppo, F.; Haak, J. Molecular Dynamics with Coupling to an External Bath. J.
Aquilani, L.; Graybill, T. L.; Cuozzo, J. W.; Lavu, S.; Mao, C.; Vlasuk, Chem. Phys. 1984, 81, 3684−3690.
G. P.; Perni, R. B. Discovery of Thieno [3,2-d] Pyrimidine-6- (57) Pronk, S.; Páll, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.;
Carboxamides as Potent Inhibitors of SIRT1, SIRT2, and SIRT3. J. Apostolov, R.; Shirts, M. R.; Smith, J. C.; Kasson, P. M.; van der
Med. Chem. 2013, 56, 3666−3679. Spoel, D.; Hess, B.; Lindahl, E. GROMACS 4.5: a High-Throughput
(38) Cheng, Y.; Prusoff, W. H. Relationship Between the Inhibition and Highly Parallel Open Source Molecular Simulation Toolkit.
Constant (KI) and the Concentration of Inhibitor Which Causes 50 Bioinformatics 2013, 29, 845−854.
Per Cent Inhibition (I50) of an Enzymatic Reaction. Biochem. (58) Capoferri, L.; van Dijk, M.; Rustenburg, A. S.; Wassenaar, T. A.;
Pharmacol. 1973, 22, 3099−3108. Kooi, D. P.; Rifai, E. A.; Vermeulen, N. P. E.; Geerke, D. P. eTOX
(39) Lazareno, S.; Birdsall, N. Estimation of Competitive Antagonist ALLIES: an Automated PipeLine for Linear Interaction Energy-Based
Affinity from Functional Inhibition Curves Using the Gaddum, Schild Simulations. J. Cheminf. 2017, 9, 58.
(59) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; (76) McGaughey, G. B.; Gagné, M.; Rappé, A. K. π-Stacking
Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Interactions Alive and Well in Proteins. J. Biol. Chem. 1998, 273,
Dubourg, V.; Vanderplas, J.; Passos, A.; Cournapeau, D.; Brucher, M.; 15458−15463.
Perrot, M.; Duchesnay, É . Scikit-learn: Machine Learning in Python. J. (77) Hubbard, R. E.; Kamran Haider, M. Encyclopedia of Life Sciences
Mach. Learn. Res. 2011, 12, 2825−2830. (ELS); Wiley Online Library, 2010; pp 1−7.
(60) Van Lipzig, M. M.; Ter Laak, A. M.; Jongejan, A.; Vermeulen, (78) Barlow, D. J.; Thornton, J. Ion-Pairs in Proteins. J. Mol. Biol.
N. P. E.; Wamelink, M.; Geerke, D. P.; Meerman, J. H. Prediction of 1983, 168, 867−885.
Ligand Binding Affinity and Orientation of Xenoestrogens to the (79) Gallivan, J. P.; Dougherty, D. A. Cation-π Interactions in
Estrogen Receptor by Molecular Dynamics Simulations and the Structural Biology. Proc. Natl. Acad. Sci. U. S. A. 1999, 96, 9459−
Linear Interaction Energy Method. J. Med. Chem. 2004, 47, 1018− 9464.
1030. (80) Auffinger, P.; Hays, F. A.; Westhof, E.; Ho, P. S. Halogen
(61) Rifai, E. A.; van Dijk, M.; Vermeulen, N. P. E.; Geerke, D. P. Bonds in Biological Molecules. Proc. Natl. Acad. Sci. U. S. A. 2004,
Binding Free Energy Predictions of Farnesoid X Receptor (FXR) 101, 16789−16794.
Agonists Using a Linear Interaction Energy (LIE) Approach with (81) Waskom, M.; Botvinnik, O.; O’Kane, D.; Hobson, P.; Ostblom,
Reliability Estimation: Application to the D3R Grand Challenge 2. J. J.; Lukauskas, S.; Gemperline, D. C.; Augspurger, T.; Halchenko, Y.;
Comput.-Aided Mol. Des. 2018, 32, 239−249. Cole, J. B.; Warmenhoven, J.; de Ruiter, J.; Pye, C.; Hoyer, S.;
(62) Hritz, J.; de Ruiter, A.; Oostenbrink, C. Impact of Plasticity and Vanderplas, J.; Villalba, S.; Kunter, G.; Quintero, E.; Bachant, P.;
Flexibility on Docking Results for Cytochrome P450 2D6: a Martin, M.; Meyer, K.; Miles, A.; Ram, Y.; Brunner, T.; Yarkoni, T.;
Combined Approach of Molecular Dynamics and Ligand Docking. Williams, M. L.; Evans, C.; Fitzgerald, C.; Brian; Qalieh, A. Seaborn,
J. Med. Chem. 2008, 51, 7469−7477. v0.9.0; 2018; DOI: 10.5281/zenodo.1313201.
(63) Baker, N. A.; Sept, D.; Joseph, S.; Holst, M. J.; McCammon, J. (82) To evaluate correlation coefficients, we used the online tool
A. Electrostatics of Nanosystems: Application to Microtubules and that is available at https://www.socscistatistics.com.
the Ribosome. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 10037−10041. (83) Brown, S. P.; Muchmore, S. W. Large-Scale Application of
(64) Homeyer, N.; Gohlke, H. Free Energy Calculations by the High-Throughput Molecular Mechanics with Poisson-Boltzmann
Molecular Mechanics Poisson-Boltzmann Surface Area Method. Mol. Surface Area for Routine Physics-Based Scoring of Protein-Ligand
Inf. 2012, 31, 114−122. Complexes. J. Med. Chem. 2009, 52, 3159−3165.
(65) Honig, B.; Nicholls, A. Classical Electrostatics in Biology and (84) Yang, T.; Wu, J. C.; Yan, C.; Wang, Y.; Luo, R.; Gonzales, M.
Chemistry. Science 1995, 268, 1144−1149. B.; Dalby, K. N.; Ren, P. Virtual Screening Using Molecular
(66) Hou, T.; Wang, J.; Li, Y.; Wang, W. Assessing the Performance Simulations. Proteins: Struct., Funct., Bioinf. 2011, 79, 1940−1951.
of the MM/PBSA and MM/GBSA Methods. 1. The Accuracy of (85) Hou, T.; Wang, J.; Li, Y.; Wang, W. Assessing the Performance
Binding Free Energy Calculations Based on Molecular Dynamics of the Molecular Mechanics/Poisson Boltzmann Surface Area and
Simulations. J. Chem. Inf. Model. 2011, 51, 69−82.
Molecular Mechanics/Generalized Born Surface Area Methods. II.
(67) Bondi, A. Van der Waals Volumes and Radii. J. Phys. Chem.
The Accuracy of Ranking Poses Generated from Docking. J. Comput.
1964, 68, 441−451.
Chem. 2011, 32, 866−877.
(68) Sitkoff, D.; Sharp, K. A.; Honig, B. Accurate Calculation of
(86) Oehme, D. P.; Brownlee, R. T.; Wilson, D. J. Effect of Atomic
Hydration Free Energies Using Macroscopic Solvent Models. J. Phys.
Charge, Solvation, Entropy, and Ligand Protonation State on MM-PB
Chem. 1994, 98, 1978−1988.
(69) Weis, A.; Katebzadeh, K.; Söderhjelm, P.; Nilsson, I.; Ryde, U. (GB) SA Binding Energies of HIV Protease. J. Comput. Chem. 2012,
Ligand Affinities Predicted with the MM/PBSA Method: Dependence 33, 2566−2580.
on the Simulation Method and the Force Field. J. Med. Chem. 2006, (87) Åqvist, J.; Luzhkov, V. B.; Brandsdal, B. O. Ligand Binding
49, 6596−6606. Affinities from MD Simulations. Acc. Chem. Res. 2002, 35, 358−365.
(70) Aldeghi, M.; Bodkin, M. J.; Knapp, S.; Biggin, P. C. Statistical (88) Shirts, M. R.; Pitera, J. W.; Swope, W. C.; Pande, V. S.
Analysis on the Performance of Molecular Mechanics Poisson− Extremely Precise Free Energy Calculations of Amino Acid Side
Boltzmann Surface Area Versus Absolute Binding Free Energy Chain Analogs: Comparison of Common Molecular Mechanics Force
Calculations: Bromodomains as a Case Study. J. Chem. Inf. Model. Fields for Proteins. J. Chem. Phys. 2003, 119, 5740−5761.
2017, 57, 2203−2221. (89) Li, L.; Li, C.; Zhang, Z.; Alexov, E. On the Dielectric Constant
(71) Genheden, S.; Kuhn, O.; Mikulskis, P.; Hoffmann, D.; Ryde, U. of Proteins: Smooth Dielectric Function for Macromolecular
The Normal-Mode Entropy in the MM/GBSA Method: Effect of Modeling and Its Implementation in DelPhi. J. Chem. Theory Comput.
System Truncation, Buffer Region, and Dielectric Constant. J. Chem. 2013, 9, 2126−2136.
Inf. Model. 2012, 52, 2079−2088. (90) Wang, E.; Sun, H.; Wang, J.; Wang, Z.; Liu, H.; Zhang, J. Z.;
(72) Uciechowska, U.; Schemies, J.; Scharfe, M.; Lawson, M.; Hou, T. End-Point Binding Free Energy Calculation with MM/PBSA
Wichapong, K.; Jung, M.; Sippl, W. Binding Free Energy Calculations and MM/GBSA: Strategies and Applications in Drug Design. Chem.
and Biological Testing of Novel Thiobarbiturates as Inhibitors of the Rev. 2019, 119, 9478−9508.
Human NAD+ Dependent Histone Deacetylase Sirt2. MedChem- (91) Sanders, B. D.; Jackson, B.; Marmorstein, R. Structural Basis for
Comm 2012, 3, 167−173. Sirtuin Function: What We Know and What We Don’t. Biochim.
(73) McKinney, W. Data Structures for Statistical Computing in Biophys. Acta, Proteins Proteomics 2010, 1804, 1604−1616.
Python. In Proceedings of the 9th Python in Science Conference, 2010; pp (92) Vosmeer, C. R.; Kooi, D. P.; Capoferri, L.; Terpstra, M. M.;
51−56. Vermeulen, N. P. E.; Geerke, D. P. Improving the Iterative Linear
(74) McGibbon, R. T.; Beauchamp, K. A.; Harrigan, M. P.; Klein, Interaction Energy Approach Using Automated Recognition of
C.; Swails, J. M.; Hernández, C. X.; Schwantes, C. R.; Wang, L.-P.; Configurational Transitions. J. Mol. Model. 2016, 22, 31.
Lane, T. J.; Pande, V. S. MDTraj: a Modern Open Library for the (93) Genheden, S.; Ryde, U. How to Obtain Statistically Converged
Analysis of Molecular Dynamics Trajectories. Biophys. J. 2015, 109, MM/GBSA Results. J. Comput. Chem. 2010, 31, 837−846.
1528−1532. (94) Zhou, Y.; Feig, M.; Wei, G.-W. Highly Accurate Biomolecular
(75) Luirink, R. A.; Dekker, S. J.; Capoferri, L.; Janssen, L. F.; Electrostatics in Continuum Dielectric Environments. J. Comput.
Kuiper, C. L.; Ari, M. E.; Vermeulen, N. P. E.; Vos, J. C.; Chem. 2008, 29, 87−97.
Commandeur, J. N. M.; Geerke, D. P. A Combined Computational (95) Vasanthanathan, P.; Olsen, L.; Jørgensen, F. S.; Vermeulen, N.
and Experimental Study on Selective Flucloxacillin Hydroxylation by P. E.; Oostenbrink, C. Computational Prediction of Binding Affinity
Cytochrome P450 BM3 Variants. J. Inorg. Biochem. 2018, 184, 115− for CYP1A2-Ligand Complexes Using Empirical Free Energy
122. Calculations. Drug Metab. Dispos. 2010, 38, 1347−1354.
(96) He, X.; Man, V. H.; Ji, B.; Xie, X.-Q.; Wang, J. Calculate
Protein−ligand Binding Affinities with the Extended Linear
Interaction Energy Method: Application on the Cathepsin S Set in
the D3R Grand Challenge 3. J. Comput.-Aided Mol. Des. 2019, 33,
105−117.
(97) Case, D. A.; Cheatham, T. E., III; Darden, T.; Gohlke, H.; Luo,
R.; Merz, K. M., Jr; Onufriev, A.; Simmerling, C.; Wang, B.; Woods,
R. J. The Amber Biomolecular Simulation Programs. J. Comput. Chem.
2005, 26, 1668−1688.
(98) Salomon-Ferrer, R.; Case, D. A.; Walker, R. C. An Overview of
the Amber Biomolecular Simulation Package. Wiley Interdiscip. Rev.:
Comput. Mol. Sci. 2013, 3, 198−210.