
An Estimation Method of Binding Free Energy in Terms of ABEEMσπ/MM and Continuum Electrostatics Fused Into LIE Method
SHU-LING CHEN, DONG-XIA ZHAO, ZHONG-ZHI YANG
School of Chemistry and Chemical Engineering, Liaoning Normal University, Dalian 116029, People's Republic of China
Received 4 February 2010; Revised 14 May 2010; Accepted 10 June 2010
DOI 10.1002/jcc.21625
Published online 26 July 2010 in Wiley Online Library (wileyonlinelibrary.com).
Abstract: A method is proposed for estimating the absolute binding free energy of interaction between proteins and ligands. The linear interaction energy (LIE) method is combined with the atom-bond electronegativity equalization method at the σπ level fused into molecular mechanics (the ABEEMσπ/MM force field) and with a generalized Born continuum model calculation of the electrostatic solvation. The parameters of the method are calibrated on a training set of 24 HIV-1 protease–inhibitor complexes (PDB entry 1AAQ). A correlation coefficient of 0.93 was obtained with a root mean square deviation of 0.70 kcal mol⁻¹. The approach is further tested on seven inhibitor–protease complexes, for which it gives a small root mean square deviation between the calculated and experimental binding free energies without reparametrization. Comparing the radii of gyration and the ligand–protein hydrogen bond distances of three training model molecules gives a result consistent with the comparison of their binding free energies. This indicates that the method, combined with appropriate structural analysis, can be applied to quickly assess new inhibitors of HIV-1 protease. To test whether the parameters of the method can be applied to other drug targets, we have also validated it on the drug target cyclooxygenase-2.
© 2010 Wiley Periodicals, Inc. J Comput Chem 32: 338–348, 2011
Key words: LIE binding free energy; ABEEMσπ/MM; continuum electrostatics; HIV-1 protease; structural analysis
Introduction
HIV-1 protease carries out the essential proteolytic cleavage of viral proteins into functional units [1]. It is one of the major targets in AIDS therapies. Much effort has been devoted to finding effective inhibitors of HIV-1 protease, including many computational studies [2]. Characterizing the structure and energetics of molecular complexes is of great importance for understanding many biological functions, and it has been a major goal of computational chemistry in molecular recognition and drug design. Appropriate structural analysis and accurate estimation of the binding free energy of ligand–receptor complexes are therefore the two most important steps of the structure-based ligand design approach [3–8]. The binding free-energy calculation can be time consuming, because it requires an accurate determination of the interactions and a reliable treatment of solvent effects. Finding methods for evaluating the ligand-binding affinity that are fast enough to treat thousands of candidate ligands in a reasonable time is a major challenge in drug design [9]. A great deal of effort has been invested in methods for calculating the binding free energy, ranging from quick estimates using empirical scoring functions [10–20] to more complicated calculations of free energies involving thermal averaging, such as the rigorous free-energy perturbation (FEP) technique and the thermodynamic integration (TI) method [21–23]. Although most empirical scoring functions are very fast to evaluate, they contain rather crude approximations that often result in poor predictive ability. The rigorous approaches, FEP and TI, are more accurate but are typically restricted to small variations in the ligand structure and require intensive computing time, which prevents them from being used routinely [23]. Recently, several semiempirical methods based on linear approximations to the free energy have been introduced and used with success [24].
Contract/grant sponsor: National Natural Science Foundation of China; contract/grant numbers: 20633050, 20703022, 20873055, 21011120087
Contract/grant sponsor: Educational Department of Liaoning Province; contract/grant numbers: 2009T057, 2008S133
Correspondence to: Zhong-Zhi Yang; e-mail: zzyang@lnnu.edu.cn

The linear interaction energy (LIE) approximation is a way of combining molecular mechanics calculations with experimental data to build a model scoring function for the evaluation of ligand–protein binding free energies. The LIE method was first suggested by Åqvist and co-workers [25–28] and is based on conformational sampling by molecular dynamics (MD) or Monte Carlo (MC) trajectories. In contrast to FEP/TI, in which a large number of intermediate windows must be evaluated, the LIE method only requires simulations of the two end states. In the LIE approximation, the binding free energy is divided into a polar and a nonpolar contribution and is calculated according to
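$$\Delta G_{\mathrm{bind}}^{\mathrm{LIE}} = \beta\left(\langle E^{\mathrm{elec}}\rangle_{\mathrm{bound}} - \langle E^{\mathrm{elec}}\rangle_{\mathrm{free}}\right) + \alpha\left(\langle E^{\mathrm{vdw}}\rangle_{\mathrm{bound}} - \langle E^{\mathrm{vdw}}\rangle_{\mathrm{free}}\right) \tag{1}$$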
where $E^{\mathrm{elec}}$ and $E^{\mathrm{vdw}}$ are the electrostatic and van der Waals interaction energies, respectively, between the ligand and its surroundings. The surroundings are either the solvated receptor binding site (bound state) or just solvent (free state). The brackets ⟨...⟩ denote MD or MC averages of the nonbonded electrostatic (elec) and van der Waals (vdw) interactions of the ligand with its surroundings. In other words, two simulations are required: one with the ligand free in solution and one with it bound to the solvated receptor. The coefficients α and β are scaling factors for these energies; the coefficient α is determined empirically [26]. The linear response approximation provides a physical basis for the treatment of the electrostatic contribution to the binding free energy and predicts a value of β = 0.5 [29,30]. In practice, the electrostatic scaling factor is also treated as a free parameter in the fitting, except in a few studies characterized by either a small number of ligands [26,27] or large deviations in some of the predicted binding energies [31,32]. The LIE method has already been applied to the calculation of binding free energies in many areas with accurate results. Although the LIE method of Åqvist et al. is much faster than a full free-energy simulation, it is still too slow for screening a large number of ligands. Recently, Zhou et al. [33] and Caflisch and co-workers [34,35] described a modified LIE method based on a continuum treatment of the solvent instead of explicit solvent. Inspired by the LIE approach, we have developed a simplified method, although of a different form, for the estimation of absolute binding free energies. The aim is the same as that of Caflisch et al., i.e., ideally to obtain a method for estimating binding free energies that is fast, accurate, and general. In our approach, we replace the MD sampling with energy minimization and combine the LIE method with a continuum treatment of electrostatics. The formula of the present method is
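$$
\begin{aligned}
\Delta G_{\mathrm{bind}} &= \alpha\,(\Delta E_{\mathrm{vdw}}) + \beta\,(\Delta G_{\mathrm{elec}}) + \Delta G_{\mathrm{tr,rot}} \\
\Delta E_{\mathrm{vdw}} &= \Delta E_{\mathrm{vdw}}^{\mathrm{complex}} - \left(\Delta E_{\mathrm{vdw}}^{\mathrm{protein}} + \Delta E_{\mathrm{vdw}}^{\mathrm{ligand}}\right) \\
\Delta E_{\mathrm{elec,coul}} &= \Delta E_{\mathrm{elec,coul}}^{\mathrm{complex}} - \left(\Delta E_{\mathrm{elec,coul}}^{\mathrm{protein}} + \Delta E_{\mathrm{elec,coul}}^{\mathrm{ligand}}\right) \\
\Delta E_{\mathrm{elec,solv}} &= \Delta E_{\mathrm{elec,solv}}^{\mathrm{complex}} - \left(\Delta E_{\mathrm{elec,solv}}^{\mathrm{protein}} + \Delta E_{\mathrm{elec,solv}}^{\mathrm{ligand}}\right) \\
\Delta G_{\mathrm{elec}} &= \Delta E_{\mathrm{elec,coul}} + \Delta E_{\mathrm{elec,solv}}
\end{aligned}
\tag{2}
$$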
where $\Delta E_{\mathrm{elec,coul}}$ is the electrostatic interaction energy in vacuo between the ligand and the protein, $\Delta E_{\mathrm{vdw}}$ is the van der Waals interaction energy between ligand and protein, and $\Delta E_{\mathrm{elec,solv}}$ is the electrostatic solvation free-energy contribution to the binding free energy, equal to the difference between the solvation free energy of the complex and those of the isolated ligand and protein. $\Delta G_{\mathrm{elec}}$ is the sum of the ligand–protein Coulombic energy in vacuo and the electrostatic solvation energy in the continuum model, and $\Delta G_{\mathrm{tr,rot}}$ accounts for the loss of translational and rotational degrees of freedom on binding [36,37].
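The subtraction scheme of eq. (2) can be illustrated with a minimal Python sketch. The `Components` container, the function name `delta_terms`, and all numerical values are hypothetical and serve only to show how the complex, protein, and ligand single-point energies would be combined; they are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Components:
    """Energy components of one species (hypothetical container, kcal/mol)."""
    vdw: float   # van der Waals energy
    coul: float  # Coulombic energy in vacuo
    solv: float  # electrostatic solvation energy


def delta_terms(complex_, protein, ligand):
    """Return (dE_vdw, dG_elec) following the pattern of eq. (2):
    complex value minus the sum of the isolated protein and ligand values."""
    de_vdw = complex_.vdw - (protein.vdw + ligand.vdw)
    de_coul = complex_.coul - (protein.coul + ligand.coul)
    de_solv = complex_.solv - (protein.solv + ligand.solv)
    return de_vdw, de_coul + de_solv


# Hypothetical single-point energies, chosen only for illustration.
cplx = Components(vdw=-1200.0, coul=-5000.0, solv=-900.0)
prot = Components(vdw=-1100.0, coul=-4900.0, solv=-950.0)
lig = Components(vdw=-20.0, coul=-60.0, solv=-30.0)

de_vdw, dg_elec = delta_terms(cplx, prot, lig)
print(de_vdw, dg_elec)
```

The two returned quantities are the inputs that the scaling factors α and β multiply in the first line of eq. (2).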
On the basis of the electronegativity equalization principle, Yang, Wang, et al. [38–42] designed the atom-bond electronegativity equalization method (ABEEM) for computing the charge distribution of large organic and biological molecules. The ABEEM model has since been fused with MM to give ABEEM/MM, which has been applied to water and ion–water systems [43–48] as well as to the conformations of alkanes and peptides [49–51]. Recently, the ABEEM/MM model has been used to perform dynamics simulations of proteins [52]. In this study, we apply the ABEEM model at the σπ level (ABEEMσπ) [41,49,53], which treats the charge regions explicitly, including atoms, bonds, lone-pair electrons, and π regions, to calculate the charge distribution of the ligand docked in the protein environment. The electrostatic solvation energy contribution to the binding energy is calculated using the generalized Born (GB) model proposed by Still et al. [54] in the TINKER program [55].
This work aims to calculate the binding free energy in order to study the interaction between inhibitor and protease. The article is organized as follows. The methods and related details are summarized in section 2. The results are discussed in section 3. Finally, section 4 gives concluding remarks and an outlook on future applications.
Methodology
The ABEEMσπ/MM Model
In this method, the potential energy function of a complex of a protein and its inhibitor is evaluated as the sum of the components in eq. (3): the bond stretching and angle bending terms $E_{\mathrm{bond}}$ and $E_{\mathrm{angle}}$, the torsional energy $E_{\mathrm{torsion}}$, the improper dihedral angle term $E_{\mathrm{imptors}}$, and the nonbonded energy $E_{\mathrm{nb}}$. The nonbonded energy is computed as a sum of the Lennard-Jones and Coulomb contributions for pairwise intra- and intermolecular interactions.
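$$E_{\mathrm{ABEEM}\sigma\pi/\mathrm{MM}} = \sum_{\mathrm{bonds}} E_{b} + \sum_{\mathrm{angles}} E_{\theta} + \sum_{\mathrm{torsion}} E_{\phi} + \sum_{\mathrm{imptors}} E_{\mathrm{imptors}} + \sum_{\mathrm{nonbonded}} \left(E_{\mathrm{vdw}} + E_{\mathrm{elec}}\right) \tag{3}$$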
The bond stretching and angle bending energies are obtained
in accordance with the following formulas:
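$$E_{\mathrm{bond}} = \sum_{\mathrm{bonds}} k_{b}\,(r - r_{\mathrm{eq}})^{2} \tag{4}$$
$$E_{\mathrm{angle}} = \sum_{\mathrm{angles}} k_{\theta}\,(\theta - \theta_{\mathrm{eq}})^{2} \tag{5}$$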
Here, $k_{b}$ and $k_{\theta}$ represent the force constants of stretching and bending, respectively; r and θ are the actual values of the bond lengths and angles, respectively; and $r_{\mathrm{eq}}$ and $\theta_{\mathrm{eq}}$ denote the equilibrium values of the bond length and angle, respectively. The torsional term is computed as follows:
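$$E_{\mathrm{torsion}} = \sum_{i} \left\{ \frac{V_{1}}{2}\left[1 + \cos(\phi_{i})\right] + \frac{V_{2}}{2}\left[1 - \cos(2\phi_{i})\right] + \frac{V_{3}}{2}\left[1 + \cos(3\phi_{i})\right] \right\} \tag{6}$$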
The improper dihedral angle term is written as eq. (7):
$$E_{\mathrm{imptors}} = \sum_{\mathrm{imptors}} \nu\,(1 - \cos 2\phi) \tag{7}$$
Here, $V_{1}$, $V_{2}$, and $V_{3}$ are the dihedral angle force constants, and ν is the improper dihedral angle force constant. The nonbonded part contains the Lennard-Jones and Coulomb contributions for pairwise intra- and intermolecular interactions. $E_{\mathrm{vdw}}$ describes the van der Waals nonbonded atom–atom interaction:
$$E_{\mathrm{vdw}} = \sum_{i<j} 4 f_{ij}\,\varepsilon_{ij}\left(\frac{\sigma_{ij}^{12}}{r_{ij}^{12}} - \frac{\sigma_{ij}^{6}}{r_{ij}^{6}}\right) \tag{8}$$
Geometric combining rules are used for the Lennard-Jones coefficients: $\sigma_{ij} = (\sigma_{ii}\sigma_{jj})^{1/2}$ and $\varepsilon_{ij} = (\varepsilon_{ii}\varepsilon_{jj})^{1/2}$. The summation runs over all pairs of atoms i < j on molecules A and B, or on A and A for the intramolecular interactions. In the latter case, the coefficient $f_{ij}$ is equal to 0.0 for any i–j pair connected by a valence bond (1–2 pairs) or a valence bond angle (1–3 pairs), $f_{ij}$ = 0.5 for 1,4 interactions (atoms separated by exactly three bonds), and $f_{ij}$ = 1.0 for all other cases. The electrostatic interaction energy $E_{\mathrm{elec}}$ is expressed as:
$$E_{\mathrm{elec}} = \sum_{i<j} \frac{k\,q_{i} q_{j}}{r_{ij}} \tag{9}$$
For the Coulomb term, the partial charges are calculated by the atom-bond electronegativity equalization method at the σπ level (ABEEMσπ). In eq. (9), $q_{i}$ and $q_{j}$ are the partial charges of sites i and j, $r_{ij}$ is the separation between sites i and j, and k is an overall correction coefficient, equal to 0.57 [39] in the ABEEMσπ model unless otherwise specified. The electrostatic interaction term in the ABEEMσπ/MM model has been described in detail in Refs. 49 and 53.
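As an illustration of the functional forms in eqs. (4), (6), (8), and (9), the following Python sketch evaluates single terms of each sum. It is not the ABEEMσπ/MM implementation: the function names are hypothetical, the charges are simply passed in rather than obtained from electronegativity equalization, and no unit conversion is applied to the Coulomb term (the result is in whatever units the inputs imply).

```python
import math


def bond_energy(k_b, r, r_eq):
    """One term of eq. (4): harmonic bond stretching."""
    return k_b * (r - r_eq) ** 2


def torsion_energy(v1, v2, v3, phi):
    """One term of eq. (6); phi is the dihedral angle in radians."""
    return (0.5 * v1 * (1.0 + math.cos(phi))
            + 0.5 * v2 * (1.0 - math.cos(2.0 * phi))
            + 0.5 * v3 * (1.0 + math.cos(3.0 * phi)))


def lj_pair_energy(sigma_i, eps_i, sigma_j, eps_j, r, f_ij=1.0):
    """One pair term of eq. (8) with geometric combining rules.
    f_ij is 0.0 for 1-2 and 1-3 pairs, 0.5 for 1-4 pairs, 1.0 otherwise."""
    sigma = math.sqrt(sigma_i * sigma_j)
    eps = math.sqrt(eps_i * eps_j)
    sr6 = (sigma / r) ** 6
    return 4.0 * f_ij * eps * (sr6 * sr6 - sr6)


def coulomb_pair_energy(q_i, q_j, r, k=0.57):
    """One pair term of eq. (9); k defaults to the coefficient quoted in the text.
    Units are left to the caller (assumption of this sketch)."""
    return k * q_i * q_j / r


# At r = 2^(1/6) * sigma the Lennard-Jones term reaches its minimum, -eps.
r_min = 2.0 ** (1.0 / 6.0) * 3.0
print(lj_pair_energy(3.0, 0.1, 3.0, 0.1, r_min))
```

The Lennard-Jones minimum at $2^{1/6}\sigma$ is a general property of the 12-6 form and gives a quick sanity check for any implementation of eq. (8).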
The Electrostatic Solvation Energy Model
The electrostatic solvation energy contribution to the binding energy is calculated using the GB model proposed by Still et al. [54] in the TINKER program [55]. The formula is as follows:
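$$\Delta G_{\mathrm{pol}} = -166.0\left(1 - \frac{1}{\varepsilon}\right)\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{q_{i} q_{j}}{\left(r_{ij}^{2} + \alpha_{ij}^{2}\,e^{-D_{ij}}\right)^{0.5}} \tag{10}$$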
where $q_{i}$ and $q_{j}$ are the net charges of the atoms in the molecule, $\alpha_{ij} = (\alpha_{i}\alpha_{j})^{0.5}$, $D_{ij} = r_{ij}^{2}/(2\alpha_{ij})^{2}$, and the double sum runs over all pairs of atoms (i and j). $\alpha_{i}$ is the so-called Born radius of atom i, $r_{ij}$ is the distance between atoms i and j, and ε is the dielectric constant of the solvent, here set to 78.3 for liquid water. Obtaining accurate Born radii is the main difficulty in computing the electrostatic solvation free energy. Here, the Born radius $\alpha_{i}$ was calculated with a fast analytical approach [56]. In this study, the net partial charges are obtained with the ABEEMσπ method, which treats the charge regions explicitly, including atoms, bonds, lone-pair electrons, and π regions. The bond stretching and angle bending parameters, the ABEEMσπ parameters (χ* and 2η*), the Lennard-Jones parameters (σ and ε), and all other parameters are taken from previous articles [41,49,53]. All atomic radii were set to the ABEEMσπ/MM van der Waals radii ($\sqrt[6]{2}\,\sigma/2$).
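A direct transcription of eq. (10) into Python may help to make the double sum concrete. This is a minimal sketch, not the TINKER implementation: the function name is hypothetical, the Born radii are assumed to be supplied by the caller (the analytical approach of Ref. 56 is not reproduced), and distances are assumed to be in Å with charges in units of the elementary charge.

```python
import math


def gb_polarization_energy(charges, coords, born_radii, dielectric=78.3):
    """Evaluate eq. (10): GB polarization energy in kcal/mol.

    The double sum runs over all i and j, including i == j, where
    r_ij = 0 and the term reduces to q_i**2 / alpha_i.
    """
    prefactor = -166.0 * (1.0 - 1.0 / dielectric)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            alpha_ij = math.sqrt(born_radii[i] * born_radii[j])
            d_ij = r2 / (2.0 * alpha_ij) ** 2
            denom = math.sqrt(r2 + alpha_ij ** 2 * math.exp(-d_ij))
            total += charges[i] * charges[j] / denom
    return prefactor * total


# Single unit charge with a (hypothetical) Born radius of 2 Angstrom:
# eq. (10) then reduces to -166.0 * (1 - 1/eps) * q**2 / alpha.
single_ion = gb_polarization_energy([1.0], [(0.0, 0.0, 0.0)], [2.0])
print(single_ion)
```

For a single charge the expression collapses to the Born formula, which is a convenient limiting case for checking the prefactor and the self terms.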
Manipulation of Training Set
The coordinates of HIV-1 protease in complex with the inhibitor Ala-Ala-Phe-C-{}-Ala-Val-Val-OMe were obtained from the Brookhaven Protein Data Bank [57] as a 2.5 Å resolution X-ray structure (PDB entry 1AAQ [58]). The water molecule bridging the two flaps was retained as a structural water in the binding of the inhibitors considered in this study. A monoprotonated state of the catalytic aspartates is used, following the study by Huang and Caflisch [34]. The crystal structure of the 1AAQ complex contains the largest compound from a set of 24 HIV-1 PR inhibitors (Fig. 1) with inhibition constant ($K_{i}$) values ranging from 0.4 nM to 6.5 μM [58]. The remaining 23 inhibitors were modeled manually by deleting parts of the inhibitor in 1AAQ.
Minimization and Energy Calculations
Hydrogen atoms were added to all structures using the TINKER program [55] and minimized with the ABEEMσπ fluctuating charge force field. Partial charges were assigned using the ABEEMσπ method. All protein–inhibitor complexes were minimized by the conjugate gradient algorithm to a root mean square (rms) gradient of 0.01 kcal mol⁻¹ Å⁻¹. A nonbonded cutoff of 14 Å was used. Residues more than 20 Å away from the active site of the protein, and the water molecule in HIV-1 PR, were kept fixed during minimization. The minimized structures were used for evaluating the van der Waals and electrostatic interaction energies.
The van der Waals and electrostatic interaction energies were calculated by subtracting the values of the isolated components from the energy of the complex. The van der Waals energy was calculated with the ABEEMσπ force field using the default cutoff of 14 Å. The electrostatic energy is the sum of the Coulombic energy in vacuo and the electrostatic solvation energy. The former was calculated with ABEEMσπ/MM, treating the partial charges on atoms, bonds, lone-pair electrons, and π regions as variables that respond to their environment in a way similar to the polarization response of real molecules. The electrostatic solvation energy was calculated using the GB model proposed by Still et al. [54] in the TINKER program [55]. The binding free-energy formula is used for the fitting of a three-parameter model [31],
340 Chen, Zhao, and Yang • Vol. 32, No. 2 • Journal of Computational Chemistry
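$$\Delta G_{\mathrm{bind}} = \alpha\,(\Delta E_{\mathrm{vdw}}) + \beta\,(\Delta G_{\mathrm{elec}}) + \Delta G_{\mathrm{tr,rot}} \tag{11}$$
and a two-parameter model [26],
$$\Delta G_{\mathrm{bind}} = \alpha\,(\Delta E_{\mathrm{vdw}}) + \beta\,(\Delta G_{\mathrm{elec}}) \tag{12}$$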
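The two models of eqs. (11) and (12) can be evaluated with a few lines of Python. The sketch below uses the fitted coefficients reported in Table 2 and the energy components of training inhibitors 1 and 24 from Table 1; the function and variable names are illustrative choices, not part of the original work.

```python
# Coefficients as reported in Table 2 (dg_tr_rot in kcal/mol).
THREE_PARAM = {"alpha": 0.3274, "beta": 0.2342, "dg_tr_rot": 8.6742}
TWO_PARAM = {"alpha": 0.2109, "beta": 0.1981, "dg_tr_rot": 0.0}


def binding_free_energy(de_vdw, de_coul, de_solv, params):
    """Eq. (11), or eq. (12) when dg_tr_rot is zero; energies in kcal/mol."""
    dg_elec = de_coul + de_solv  # last line of eq. (2)
    return params["alpha"] * de_vdw + params["beta"] * dg_elec + params["dg_tr_rot"]


# (dE_vdw, dE_ele,coul, dE_ele,solv) for inhibitors 1 and 24, from Table 1.
TABLE1_ROWS = {
    1: (-72.91, -34.05, 66.70),
    24: (-93.74, -83.72, 120.42),
}

results = {}
for label, row in TABLE1_ROWS.items():
    dg3 = binding_free_energy(*row, THREE_PARAM)
    dg2 = binding_free_energy(*row, TWO_PARAM)
    results[label] = (round(dg3, 2), round(dg2, 2))
    print(f"inhibitor {label}: 3-parameter {dg3:.2f}, 2-parameter {dg2:.2f} kcal/mol")
```

Rounded to two decimals, these values agree with the calculated binding free energies tabulated for the same inhibitors in Table 1.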
Results and Discussion
Binding Free Energy
The calculated binding free energies and the experimental binding free energies of the 24 receptor–ligand complexes (Table 1 and Figure 1, respectively) were used to determine the optimal coefficients α, β, and $\Delta G_{\mathrm{tr,rot}}$ for the models of eqs. (11) and (12). The parameters were optimized by the least-squares method, which minimizes the rms deviation and the mean unsigned deviation (⟨|dev|⟩), in kilocalories per mole, between the calculated and experimental values for the 24 complexes.
Figure 1. HIV-1 PR inhibitors tested by Dreyer et al. [58]
The resulting parameters and some common statistical figures of merit for linear regression models are summarized in Table 2 [59]. The multiple correlation coefficient r², a measure of the overall fit of the model, is calculated as r² = SSR/(SSR + SSE), where SSR is the explained sum of squared deviations, $\mathrm{SSR} = \sum_{i} \left(\Delta G_{\mathrm{complex}}(i) - \langle\Delta G_{\mathrm{expt}}\rangle\right)^{2}$, SSE is the residual (unexplained) sum of squared deviations, $\mathrm{SSE} = \sum_{i} \left(\Delta G_{\mathrm{expt}}(i) - \Delta G_{\mathrm{complex}}(i)\right)^{2}$, and i ranges from 1 to 24. Here $\Delta G_{\mathrm{complex}}(i)$ is the value calculated for complex i. Overfitting of the data was assessed by leave-one-out cross-validation: optimal parameters were found for each of the 24 data sets obtained by omitting one of the compounds. Each resulting parameterization was used to predict the left-out ΔG from the left-out simulation data, and the sum of the squared deviations of these predictions from the experimental values is the "Predictive Residual Sum of Squares" (PRESS). The leave-one-out cross-validated correlation coefficient is then $q^{2}_{\mathrm{LOO}} = 1 - \mathrm{PRESS}/\mathrm{SSR}$, and the cross-validated standard deviation is $s^{\mathrm{LOO}}_{\mathrm{PRESS}} = \sqrt{\mathrm{PRESS}/(n - p - 1)}$, where n = 24 is the training set size and p is the number of parameters optimized in the model.
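The fitting and cross-validation statistics defined above can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the authors' code: the least-squares fit is solved through the normal equations, p is taken as the number of fitted columns (including the constant term), and the six data rows are synthetic, generated from hypothetical coefficients (0.3, 0.2, 8.0) because the experimental training-set values are not listed in the text.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Least-squares coefficients via the normal equations (A^T A) c = A^T y."""
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(ata, aty)


def predict(row, coeffs):
    return sum(c * v for c, v in zip(coeffs, row))


def loo_statistics(rows, y):
    """Return r2, q2_LOO, s_PRESS and the full-data coefficients."""
    n, p = len(rows), len(rows[0])
    coeffs = fit_least_squares(rows, y)
    calc = [predict(r, coeffs) for r in rows]
    mean_y = sum(y) / n
    ssr = sum((c - mean_y) ** 2 for c in calc)            # explained
    sse = sum((yi - c) ** 2 for yi, c in zip(y, calc))    # residual
    press = 0.0
    for k in range(n):                                    # leave compound k out
        c_k = fit_least_squares(rows[:k] + rows[k + 1:], y[:k] + y[k + 1:])
        press += (y[k] - predict(rows[k], c_k)) ** 2
    return {
        "r2": ssr / (ssr + sse),
        "q2_loo": 1.0 - press / ssr,
        "s_press": math.sqrt(press / (n - p - 1)),
        "coeffs": coeffs,
    }


# Synthetic rows: (dE_vdw, dG_elec, 1.0); the constant column fits dG_tr,rot.
rows = [[-70.0, 30.0, 1.0], [-80.0, 35.0, 1.0], [-90.0, 30.0, 1.0],
        [-85.0, 45.0, 1.0], [-75.0, 40.0, 1.0], [-95.0, 50.0, 1.0]]
y = [0.3 * v + 0.2 * e + 8.0 for v, e, _ in rows]
stats = loo_statistics(rows, y)
print(stats)
```

Because the synthetic data are exactly linear, the fit recovers the generating coefficients and both r² and q² are essentially 1; with real data the gap between r² and q² indicates how much the fit depends on individual compounds.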
Table 2 reports the coefficients for the three- and two-parameter models and some statistical figures of merit. The correlation between the calculated and experimental binding free energies for the 24 HIV-1 PR inhibitors is shown in Figures 2 and 3. The correlation equation for the binding free energies calculated with the three-parameter model versus the experimental values is given in Figure 2, and the linear correlation coefficient r is about 0.93.
From Table 2 and Figure 3, the three-parameter model has better predictive accuracy than the two-parameter model. Nevertheless, both models give low rms deviations (0.70 and 0.77 kcal mol⁻¹ for the three- and two-parameter models, respectively) and low mean unsigned deviations ⟨|dev|⟩ (0.56 and 0.55 kcal mol⁻¹, respectively), which is very encouraging. To guard against overfitting the data, leave-one-out cross-validation was performed on the models. For the three-parameter model, the cross-validated $q^{2}_{\mathrm{LOO}}$ is 0.82 and $s^{\mathrm{LOO}}_{\mathrm{PRESS}}$ is 0.78 kcal mol⁻¹, and the multiple correlation coefficient r² is 0.87 and 0.67 for the three- and two-parameter models, respectively. Overall, the three-parameter model has good predictive accuracy. It is also worth noting that the third parameter, $\Delta G_{\mathrm{tr,rot}}$ = 8.67 kcal mol⁻¹, is within the range of 7–11 kcal mol⁻¹ observed experimentally [36,37].
The predictive ability of the present approach was further tested on a set of seven inhibitor complexes with PDB codes 1hvi [60], 1hvj [60], 1hvk [60], 1hvl [60], 1bvg [61], 9hvp [62], and 1hvr [63]. The structures, the experimental binding free energies, and the $K_{i}$ values are listed in Table 3. The calculated energy components and the calculated binding free energies for the seven test complexes are reported in Table 4. The seven complexes were minimized in the same way as described above; the 1hvr complex does not contain the water molecule bridging the flaps. The rms deviations for the seven test inhibitors are 1.24 and 2.39 kcal mol⁻¹ for the three- and two-parameter models, respectively. The predictive power is good if one considers that the seven inhibitors have rather different chemical structures.

Table 1. Interaction Energies and Solvation Energies Calculated for the Training Set Compounds, As Well As the Calculated Binding Free Energies of the Three- and Two-Parameter Models.

Inhibitor   ΔE_vdw   ΔE_ele,coul   ΔE_ele,solv   ΔG_calc (3-parameter fit)   ΔG_calc (2-parameter fit)
1           -72.91   -34.05         66.70         -7.55                       -8.91
2           -76.47   -33.84         67.47         -8.49                       -9.47
3           -78.12   -33.66         64.64         -9.65                      -10.34
4           -79.29   -38.14         69.00        -10.06                      -10.61
5           -87.20   -40.38         75.08        -11.75                      -11.52
6           -82.78   -55.20         98.59         -8.27                       -8.86
7           -82.95   -64.29        102.19         -9.61                       -9.99
8           -81.43   -70.14        103.70        -10.13                      -10.53
9           -79.72   -74.09        100.14        -11.33                      -11.65
10          -92.97   -66.49        103.85        -13.01                      -12.21
11          -87.41   -73.82        111.29        -11.17                      -11.01
12          -90.81   -72.47        110.77        -12.09                      -11.56
13          -95.64   -58.72        106.17        -11.53                      -10.77
14          -90.75   -79.08        114.72        -12.69                      -12.08
15          -95.50   -69.95        106.73        -13.98                      -12.85
16          -84.92   -85.29        120.68        -10.84                      -10.90
17          -91.13   -82.58        118.27        -12.80                      -12.15
18          -97.82   -75.35        117.24        -13.54                      -12.33
19          -94.91   -86.28        124.53        -13.44                      -12.44
20          -95.69   -80.02        122.97        -12.60                      -11.67
21          -85.24   -82.02        116.32        -11.20                      -11.18
22          -84.47   -86.01        117.33        -11.65                      -11.61
23          -94.70   -79.29        118.91        -13.05                      -12.12
24          -93.74   -83.72        120.42        -13.42                      -12.50

All energies are in kcal mol⁻¹.

Table 2. Coefficients for the Two- and Three-Parameter Models and Some Statistical Figures of Merit.

Model             α        β        ΔG_tr,rot (kcal mol⁻¹)   rmsd (kcal mol⁻¹)   ⟨|dev|⟩ (kcal mol⁻¹)   r² (a)   q²_LOO (b)   s_PRESS^LOO (c) (kcal mol⁻¹)
Two-parameter     0.2109   0.1981   n/a                      0.77                0.55                   0.67     0.61         0.95
Three-parameter   0.3274   0.2342   8.6742                   0.70                0.56                   0.87     0.82         0.78

(a) The multiple correlation coefficient r² of a model measures what proportion of the variance observed in the experimental binding data is explained by the model.
(b) Leave-one-out cross-validated correlation coefficient.
(c) Leave-one-out cross-validated standard deviation, in kcal mol⁻¹.
The values of the coefficient α are 0.2109 and 0.3274 for the two- and three-parameter models discussed above, respectively (Table 2). They are close to the values of α obtained in the study by Huang and Caflisch [34], who used a different force field and solvation model. The similar values of the scaling parameter for the van der Waals interaction indicate that it is rather robust with respect to the physicochemical characteristics of the binding site. On the other hand, the values of β are completely different: 0.0168 in their work versus 0.1981 here for the two-parameter model, and 0.0636 versus 0.2342 for the three-parameter model. From this comparison, we conclude that the role of electrostatics in the binding process is completely changed. The electrostatic energy is the sum of the Coulombic energy in vacuo and the electrostatic solvation energy. The former was calculated with ABEEMσπ/MM, treating the partial charges on atoms, bonds, lone-pair electrons, and π regions as variables that respond to their environment in a way similar to the polarization response of real molecules. In the study by Huang and Caflisch, partial charges were assigned using the MPEOE method, in which the charge distribution, described as a set of atom-centered monopoles, is calculated directly. The Coulombic energy in our method therefore differs from that in their article. In general, the solvation penalty is also smaller in the current study than in that of Huang and Caflisch; the difference may be caused both by the use of GB instead of Poisson–Boltzmann and by the method of assigning partial charges. We think it is reasonable to assume that the parameters are system dependent and may also be force field dependent. Moreover, the ratio of β to α is close to the ratio obtained in the study by Zhou et al. [33].
The Parameter Transferability
Having obtained good results for HIV-1 protease inhibitors, we tested whether the parameters of this method can be applied to other drug targets by validating the method on the drug target cyclooxygenase-2 (COX-2).
System Preparation
The PDB structure of murine COX-2 (PDB code 1CX2 [64]) was selected as the starting point for this study. Hydrogen atoms were added to all structures using the TINKER program [55] and minimized with the ABEEMσπ fluctuating charge force field. Partial charges were assigned using the ABEEMσπ method. All protein–inhibitor complexes were minimized by the conjugate gradient algorithm to an rms gradient of 0.01 kcal mol⁻¹ Å⁻¹. For the COX-2 model, to reduce the computational cost, only the protein residues that have at least one heavy atom within about 20 Å of any heavy atom of the ligand were retained [65]. The structure of the COX-2 inhibitors is shown in Figure 4, where the substituent R is given for each compound. The perturbations selected in this work are typical of those carried out in a binding free-energy study; they run from 1 to 2, 3 to 1, and 4 to 3. The relative binding free energies for these perturbations are also calculated and compared with the available experimental results [65].
The van der Waals and electrostatic interaction energies were calculated as described above. We validate the method for the drug target COX-2 by calculating the binding free energies and the relative binding free energies ($\Delta\Delta G_{\mathrm{binding}(A\to B)}$) of a set of COX-2 inhibitors. The calculated binding free energies are compared with the experimental results [66]. The relative binding free energies were calculated using the following formula:
$$\Delta\Delta G_{\mathrm{binding}(A\to B)} = \Delta G_{\mathrm{binding}}(B) - \Delta G_{\mathrm{binding}}(A) \tag{13}$$
Figure 2. Comparison of the binding free energy calculated using eq. (11) with the experimental binding free energies for 24 HIV-1 PR inhibitors; r is the correlation coefficient.
Figure 3. Calculated binding free energies using eqs. (11) and (12) versus experimental binding free energies for 24 HIV-1 PR inhibitors. The diagonal is drawn as a visual aid, and the r value in parentheses is the correlation coefficient. The black dots represent the three-parameter model and the open triangles represent the two-parameter model. If the calculated results agreed perfectly with the experimental values, the data points would lie on the diagonal line.
Table 3. Complexes of the Testing Set, Experimental Inhibition Constants (nM), and Experimental Binding Free Energies (kcal mol⁻¹).

Complex (a)   K_i (nM) (b)   ΔG_bind^exp (kcal mol⁻¹) (c)   Ref.
1hvi          0.012          -15.54                          60
1hvj          0.004          -16.22                          60
1hvk          0.011          -15.6                           60
1hvl          0.112          -14.16                          60
1bvg          0.270          -13.59                          61
9hvp          4.5            -11.59                          62
1hvr          0.31           -13.50                          63

(a) Code in the PDB.
(b) Experimental inhibition constants (nM).
(c) Experimental binding free energy (kcal mol⁻¹).
The Structure column of the original table (chemical structure drawings) is not reproduced here.
where $\Delta G_{\mathrm{binding}}(B)$ and $\Delta G_{\mathrm{binding}}(A)$ refer to the binding free energies of the different inhibitors in the complexes. The binding free energies calculated with the three- and two-parameter models for COX-2, the relative binding free energies (A→B) calculated with the three-parameter model, and the experimental binding free energies and relative binding free energies of COX-2 are listed in Table 5.
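Eq. (13) can be checked against the tabulated values with a short Python sketch. The three-parameter binding free energies below are read from Table 5; the function and variable names are illustrative only.

```python
def relative_binding_free_energy(dg_a, dg_b):
    """Eq. (13): relative binding free energy for the perturbation A -> B."""
    return dg_b - dg_a


# Three-parameter calculated binding free energies of COX-2 inhibitors 1-4
# (kcal/mol), as listed in Table 5.
dg_calc = {1: -9.55, 2: -8.74, 3: -7.31, 4: -8.79}

# Perturbations considered in the text: 1 -> 2, 3 -> 1, and 4 -> 3.
perturbations = [(1, 2), (3, 1), (4, 3)]

ddg_calc = {
    (a, b): round(relative_binding_free_energy(dg_calc[a], dg_calc[b]), 2)
    for a, b in perturbations
}
for (a, b), value in ddg_calc.items():
    print(f"{a} -> {b}: {value:+.2f} kcal/mol")
```

The resulting values (0.81, -2.24, and 1.48 kcal mol⁻¹) match the calculated relative binding free energies given in the last column of Table 5.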
The rms deviations for the four COX-2 inhibitors are 1.23 and 1.77 kcal mol⁻¹ for the three- and two-parameter models, respectively. The largest deviations from experiment occur for COX-2 (3): 1.86 and 3.25 kcal mol⁻¹ for the three- and two-parameter models, respectively. Comparing the experimental and calculated values of $\Delta\Delta G_{\mathrm{binding}(A\to B)}$ shows that the present method reproduces the trend of the relative binding free energies between the inhibitors and the enzyme. From this discussion, we conclude that the present method can be applied to predict the relative binding free energy between different inhibitors and the enzyme. For absolute binding free energies, however, the parameters of the model should be refitted before it is applied to other systems, which means that the parameters are not always transferable to new systems.
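The rms deviations quoted for COX-2 can be reproduced from the Table 5 entries. The following Python sketch is only a consistency check on the tabulated numbers; the function and variable names are illustrative.

```python
import math


def rms_deviation(calc, expt):
    """Root mean square deviation between calculated and experimental values."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / len(calc))


# Binding free energies of COX-2 inhibitors 1-4 (kcal/mol), from Table 5.
dg_expt = [-10.09, -8.28, -5.45, -10.22]
dg_three_param = [-9.55, -8.74, -7.31, -8.79]
dg_two_param = [-10.24, -9.59, -8.70, -9.76]

rmsd_three = rms_deviation(dg_three_param, dg_expt)
rmsd_two = rms_deviation(dg_two_param, dg_expt)
print(f"three-parameter: {rmsd_three:.2f}, two-parameter: {rmsd_two:.2f} kcal/mol")
```

The largest single contribution in both models comes from inhibitor 3, in line with the deviations of 1.86 and 3.25 kcal mol⁻¹ noted in the text.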
Comparison With the Recent LIE Method
It is necessary to compare the present approach with the method previously proposed by Huang and Caflisch [34] and with another recent simplified approach for the estimation of absolute binding free energies. Huang and Caflisch applied the LIE method in combination with energy minimization and a finite-difference Poisson calculation of the electrostatic solvation energy to estimate the absolute free energy of binding. We chose the GB model of Still et al. to calculate the electrostatic solvation energy and thus avoid the time-consuming solution of the Poisson–Boltzmann equation. It is encouraging that, compared with the three- and two-parameter models of Huang and Caflisch, the present method achieves the same or a better correlation coefficient between the calculated energies and the experimental values. The rms deviations of our two models are 0.70 and 0.77 kcal mol⁻¹, respectively, close to or lower than the 0.73 and 0.89 kcal mol⁻¹ in their report. The mean unsigned deviations in this work are 0.56 and 0.55 kcal mol⁻¹ for the three- and two-parameter models, respectively, which is also encouraging in comparison with their report.
The recent simplified approach similar to that by Zoete
et al.
9
performed conformational sampling by MD in vacuo (dis-
tance-dependent dielectric function). For a training set of 16
Figure 4. Structure of the inhibitors of the cyclooxygenase-2 con-
sidered in this study.
Table 5. Energy Components and the Calculated Binding Free Energies of Three- and Two-Parameter Model
for COX-2, the Calculated Relative Binding Free Energies (A?B) of Three-Parameter Model, and the
Experiment Results of Binding Free Energies and Relative Binding Free Energies of COX-2.
Inhibitors DE
vdw
DE
ele,coul
DE
ele,solv
DG
cacl
3-Parameter fit
DG
cacl
2-Parameter fit DG
exp
bind
a
DDG
exp
binding
b
(A?B)
DDG
cacl
binding (A?B)
COX-2 (1) 278.31 0.49 31.19 29.55 210.24 210.09 (1t2) 1.82 (1t2) 0.81
COX-2 (2) 277.77 0.68 33.69 28.74 29.59 28.28 (3t1) \24.64 (3t1) 22.24
COX-2 (3) 273.04 3.63 30.23 27.31 28.70 25.45 (4t3) [4.77 (4t3) 1.48
COX-2 (4) 275.95 1.85 29.77 28.79 29.76 210.22
All energies are in kcal/mol.
a
Experimental binding free energies for COX-2.
b
Experimental relative binding free energies of COX-2 with the inhibitors.
Table 4. Energy Components and the Calculated Binding Free Energies
of Three- and Two-Parameter Models for the HIV-1 PR Testing Set.
inhibitors DE
vdw
DE
ele,coul
DE
ele,solv
DG
cacl
3-Parameter fit
DG
cacl
2-Parameter fit
1hvi 2105.26 244.63 87.90 215.65 213.63
1hvj 2113.06 239.95 90.88 216.41 213.76
1hvk 2119.12 228.18 92.38 215.29 212.40
1hvl 2120.76 224.10 97.08 213.77 211.01
1bvg 2114.22 243.02 110.85 212.84 210.65
1hvr 2130.24 240.63 117.70 215.92 212.20
9hvp 2106.99 252.45 106.80 213.63 211.80
All energies are in kcal mol
21
.
345 Estimation of Binding Free Energy
Journal of Computational Chemistry DOI 10.1002/jcc
HIV-1 protease-inhibitor complexes of known three-dimensional structure, they proposed a four-parameter model based on the electrostatic interaction energy between the ligand and the protein, the difference of the electrostatic solvation free energies on binding, the buried surface, and a constant term. The first three energy terms were averaged over 50 snapshots saved along 100 ps of MD simulation. Their four-parameter model gives a correlation coefficient of 0.91, a leave-one-out correlation coefficient of 0.8403, an rms deviation of 0.90 kcal mol⁻¹, and a mean unsigned deviation of 0.8 kcal mol⁻¹. Our three-parameter model gives a correlation coefficient of 0.93, a leave-one-out correlation coefficient of 0.82, an rms deviation of 0.70 kcal mol⁻¹, and a mean unsigned deviation of 0.56 kcal mol⁻¹. On this comparison, we conclude that our method gives a better fit, with a higher correlation coefficient and smaller deviations. Furthermore, minimization is in any case easier and more efficient than obtaining an average structure by MD sampling, even if the MD sampling is run in vacuo.
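The figures of merit compared here (correlation coefficient, rms deviation, and mean unsigned deviation) can be computed as in the following sketch. The input values are hypothetical and serve only to illustrate the definitions; they are not data from either study, and the function names are not taken from the paper.

```python
import math


def rms_deviation(calc, expt):
    """Root mean square deviation between calculated and experimental values."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / len(calc))


def mean_unsigned_deviation(calc, expt):
    """Mean of the absolute deviations |calc - expt|."""
    return sum(abs(c - e) for c, e in zip(calc, expt)) / len(calc)


def correlation_coefficient(calc, expt):
    """Pearson linear correlation coefficient r."""
    n = len(calc)
    mean_c = sum(calc) / n
    mean_e = sum(expt) / n
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, expt))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov / math.sqrt(var_c * var_e)


# Hypothetical binding free energies in kcal/mol (illustration only).
dg_calc_demo = [-8.0, -9.5, -11.0, -12.5]
dg_expt_demo = [-8.5, -9.0, -11.5, -12.0]

rmsd_demo = rms_deviation(dg_calc_demo, dg_expt_demo)
mud_demo = mean_unsigned_deviation(dg_calc_demo, dg_expt_demo)
r_demo = correlation_coefficient(dg_calc_demo, dg_expt_demo)

print(f"r = {r_demo:.3f}, rmsd = {rmsd_demo:.2f}, <|dev|> = {mud_demo:.2f}")
```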
Structure Analysis
Radius of Gyration
The radius of gyration, R_g, represents the degree of packing compactness of the protein–ligand complex [67]. Previous research showed that the interface of a complex is packed as compactly as the protein interior [68,69]. The radius of gyration therefore characterizes the quality of the geometric fit: a small R_g value corresponds to compact packing and a good geometric fit at the interface. The radius of gyration is defined as
$$
R_g = \left[ \frac{\sum_{i=1}^{N_p + N_l} \left[ (x_i - \bar{x})^2 + (y_i - \bar{y})^2 + (z_i - \bar{z})^2 \right]}{N_p + N_l} \right]^{1/2} \tag{14}
$$
where N_p and N_l are the numbers of non-hydrogen atoms of the protein and the ligand, respectively; x_i, y_i, and z_i are the coordinates of those non-hydrogen atoms; and $\bar{x}$, $\bar{y}$, and $\bar{z}$ are the coordinates of the center of mass. For the systems studied here, the protease is kept in its crystal structure during the calculation, so the value of R_g reflects the quality of the geometric fit at the interface.
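A minimal Python sketch of eq. (14) is given below. It assumes that every heavy atom carries equal weight, as the formula is written, so the "center of mass" is taken here as the unweighted centroid of the coordinates. The function name and the test coordinates are illustrative only.

```python
import math


def radius_of_gyration(coords):
    """Radius of gyration of a set of (x, y, z) heavy-atom coordinates.

    All atoms are weighted equally, so the reference point is the plain
    centroid of the coordinates (an assumption based on how eq. (14) is
    written). The result has the same length unit as the input.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("at least one atom is required")
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    squared = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    )
    return math.sqrt(squared / n)


# Four points on the corners of a 2 x 2 square: every point lies
# sqrt(2) from the centroid, so R_g equals sqrt(2).
square = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
print(radius_of_gyration(square))
```

In practice the coordinate list would hold the non-hydrogen atoms of both the protein and the ligand, in that case N_p + N_l entries.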
We chose complexes 1aaq6, 1aaq11, and 1aaq21 as model molecules to compare the geometric fit of three different protein–ligand complexes. When the proteins are kept in the crystal structure during minimization, the radii of gyration of complexes 1aaq6, 1aaq11, 1aaq21, and the isolated protease are 1.7521, 1.7494, 1.7491, and 1.7695 nm, respectively. Moreover, the variation of the radius of gyration during the 1000 ps dynamics simulation of the three complexes is shown in Figure 5; the average radii of gyration of the three complexes are 1.7523, 1.7495, and 1.7493 nm. Both the minimization and the dynamics simulation give the ordering of the radii of gyration as 1aaq6 > 1aaq11 > 1aaq21. Complex 1aaq21 has the smallest R_g value, which indicates that it has the best geometric fit of the three. This is consistent with the ordering of the binding free energies of 1aaq6, 1aaq11, and 1aaq21, which are −7.65, −9.46, and −11.47 kcal mol⁻¹, respectively. Furthermore, the R_g values of the three complexes are all smaller than that of the isolated protease, which indicates that the three ligands all lie inside the binding pocket of the protease.
Hydrogen Bond
Hydrogen bonds are important for stabilizing a protein–ligand conformation. Possible hydrogen bond donors and acceptors of the ligand should be saturated as far as possible. Under this assumption, we also examined the hydrogen bond distances in the three HIV-1 protease–inhibitor complexes (1aaq6, 1aaq11, and 1aaq21). As commonly accepted, a hydrogen bond exists if the distance between the donor and acceptor heavy atoms lies in the range 2.5–3.5 Å, the distance between the H atom and the acceptor atom lies in the range 1.6–2.3 Å, and the donor–H–acceptor angle is larger than 150°. If the distance between the donor and acceptor heavy atoms is smaller than 2.5 Å, the interaction takes on more covalent character and the hydrogen bond is stronger.

Figure 5. Radius of gyration as a function of simulation time during 700 ps molecular dynamics runs of equilibrated structures of the three complexes.

Table 6. The Average Hydrogen Bond Distances of Atom Pairs N···O and O···O in HIV-1 Protease–Inhibitor Complexes (1aaq6, 1aaq11, and 1aaq21) of ABEEMrp/MM Simulated Structures.
                      Minimization                Molecular dynamics
Atom pairs            1aaq6    1aaq11   1aaq21    1aaq6    1aaq11   1aaq21
Asp25 O1···O          3.103    3.015    3.039     3.730    3.408    3.219
Asp25 O2···O          3.034    3.018    2.921     3.462    3.178    3.170
Asp25′ O1···O         3.216    3.193    3.113     3.960    3.869    3.494
Asp25′ O2···O         3.074    3.057    3.032     3.580    3.715    3.316
Gly27 O···N           3.314    3.303    3.150     3.325    3.302    3.327
Gly27′ O···N          2.965    2.941    2.994     3.075    3.037    3.151
Asp29′ N···O          3.517    3.681    3.437     5.465    3.399    3.623
Gly48′ O···N          2.893    2.892    3.051     4.134    3.023    3.095
WT O···O1=C           3.611    3.480    3.111     4.086    3.843    3.563
WT O···O2=C           3.117    3.091    2.956     3.104    3.230    3.416
Ile50 N···O(WT)       2.911    2.910    2.910     2.911    2.911    2.911
Ile50′ N···O(WT)      3.229    3.229    3.229     3.229    3.229    3.229
Average distance      3.165    3.151    3.079     3.672    3.345    3.293
Hydrogen bond distances are in Å.
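The geometric hydrogen bond criteria stated above (donor–acceptor distance of 2.5–3.5 Å, H–acceptor distance of 1.6–2.3 Å, and donor–H–acceptor angle larger than 150°) can be sketched as a simple test on three atomic positions. This is an illustrative sketch, not the authors' analysis code. The function names are hypothetical, coordinates are assumed to be in Å, and `math.dist` requires Python 3.8 or later.

```python
import math


def angle_degrees(a, vertex, b):
    """Angle a-vertex-b in degrees for three (x, y, z) points."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.dist(a, vertex) * math.dist(b, vertex)
    # Clamp to [-1, 1] to guard against rounding just outside the domain.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def is_hydrogen_bond(donor, hydrogen, acceptor):
    """Apply the distance and angle criteria quoted in the text (units: angstrom)."""
    d_heavy = math.dist(donor, acceptor)
    d_h_acc = math.dist(hydrogen, acceptor)
    angle = angle_degrees(donor, hydrogen, acceptor)
    return 2.5 <= d_heavy <= 3.5 and 1.6 <= d_h_acc <= 2.3 and angle > 150.0


# Linear donor-H...acceptor arrangement: all three criteria are met.
linear = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
# Strongly bent arrangement: distances are acceptable but the angle is not.
bent = is_hydrogen_bond((0.0, 0.0, 0.0), (0.7, 0.7, 0.0), (2.6, 0.0, 0.0))
print(linear, bent)
```

A donor–acceptor distance below 2.5 Å is deliberately rejected by this test, since the text treats that case separately as a stronger interaction with partial covalent character.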
The average distances of the atom pairs N···O and O···O in the ABEEMrp/MM simulated structures of the HIV-1 protease–inhibitor complexes (1aaq6, 1aaq11, and 1aaq21) are listed in Table 6. Figure 6 shows the overall scheme of inhibitor binding, with the main hydrogen bonds between the inhibitor and the binding pockets of HIV-1 protease [58], in which Asp25 is protonated.
To further investigate the geometric fit of these three protein–ligand complexes, we compared the hydrogen bond distances between the three ligands and the protease. From Table 6, the average hydrogen bond distances of complexes 1aaq6, 1aaq11, and 1aaq21 are 3.165, 3.151, and 3.079 Å, respectively, in the minimized structures, and 3.672, 3.345, and 3.293 Å, respectively, in the dynamics simulations. The average hydrogen bond distances therefore follow the order 1aaq6 > 1aaq11 > 1aaq21, so the hydrogen bond strengths follow the order 1aaq21 > 1aaq11 > 1aaq6. This conclusion agrees with the comparison of the radii of gyration and supports the conclusion that complex 1aaq21 has a better geometric fit than the other two complexes.
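The averaging and ranking described in this paragraph can be reproduced directly from the minimization columns of Table 6. The short sketch below does so; the variable names are illustrative only.

```python
# N...O and O...O distances (angstrom) from the minimization columns of Table 6,
# listed in the row order of the table.
minimized = {
    "1aaq6": [3.103, 3.034, 3.216, 3.074, 3.314, 2.965,
              3.517, 2.893, 3.611, 3.117, 2.911, 3.229],
    "1aaq11": [3.015, 3.018, 3.193, 3.057, 3.303, 2.941,
               3.681, 2.892, 3.480, 3.091, 2.910, 3.229],
    "1aaq21": [3.039, 2.921, 3.113, 3.032, 3.150, 2.994,
               3.437, 3.051, 3.111, 2.956, 2.910, 3.229],
}

# Average hydrogen bond distance per complex, rounded as in the table.
averages = {
    name: round(sum(distances) / len(distances), 3)
    for name, distances in minimized.items()
}

# Longest average distance first; a shorter average distance is read
# as stronger hydrogen bonding overall.
by_distance = sorted(averages, key=averages.get, reverse=True)

print(averages)
print(" > ".join(by_distance))
```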
Conclusions
The LIE method is combined with the ABEEMrp fluctuating charge force field and a GB continuum model of electrostatic solvation to calculate the absolute free energy of binding. The method gives good results for a training set of 24 HIV-1 protease inhibitors: a correlation coefficient of 0.93 was obtained, with an rms deviation of 0.70 kcal mol⁻¹ and a mean unsigned deviation of only 0.56 kcal mol⁻¹. When applied to seven HIV-1 protease–inhibitor complexes of different structures, without any reparametrization, the method provides a satisfactory correlation with an rms deviation of 1.24 kcal mol⁻¹.
As Huang et al. discussed, the original LIE method based on MD (or Monte Carlo) sampling might be more appropriate than LIE combined with minimization for flexible binding sites. However, binding site flexibility requires longer MD simulations to reach convergence, which might not be computationally feasible for a large library of compounds in virtual screening. Furthermore, the present approach does not need the Born correction term for ionized systems that is required in explicit solvent [26], and the entropic contribution to binding is taken into account implicitly through the ΔG_tr,rot term, which accounts for the loss of translational and rotational degrees of freedom on binding. In addition, the present method can be applied to predict the relative binding free energies of different inhibitors of other drug targets. However, as with the calculation of absolute binding free energies, the parameters of the present method are not completely transferable, and it may therefore be necessary to refit them before applying the method to other systems.
By comparing the radii of gyration and hydrogen bond distances of three model complexes, the quality of the geometric fit of protein–ligand conformations is discussed. Because these structural results correlate well with the binding free energies, the present approach for calculating binding free energy, together with structural analysis, can be applied to quickly assess new inhibitors of HIV-1 protease.
Acknowledgements
We are very grateful to the editor and reviewers for their helpful suggestions on the manuscript. We also thank Professor Jay William Ponder for providing the TINKER programs.
References
1. Debouck, C.; Gorniak, J. G.; Strickler, J. E.; Meek, T. D.; Metcalf,
B. W.; Rosenberg, M. Proc Natl Acad Sci USA 1987, 84, 8903.
2. Wlodawer, A. Annu Rev Med 2002, 53, 595.
3. Caflisch, A.; Karplus, M. Perspect Drug Discov Design 1995, 3, 51.
4. Caflisch, A.; Walchi, R.; Ehrhardt, C. News Physiol Sci 1998, 13,
182.
5. Almlof, M.; Brandsdal, B. O.; Åqvist, J. J Comput Chem 2004, 25, 1242.
6. Kollman, P. A.; Massova, I.; Reyes, C.; Kuhn, B.; Huo, S.; Chong, L.;
Lee, M.; Lee, T.; Duan, Y.; Wang, W.; Donini, O.; Cieplak, P.; Srini-
vasan, J.; Case, D. A.; Cheatham, T. E. Acc Chem Res 2000, 33, 889.
7. Lamb, M. L.; Jorgensen, W. L. Curr Opin Chem Biol 1997, 1, 449.
8. Jorgensen, W. L. Science 2004, 303, 1813.
9. Zoete, V.; Michielin, O.; Karplus, M. J Comput Aided Mol Des
2003, 17, 861.
10. Goodford, P. J. J Med Chem 1985, 28, 849.
11. Tomioka, N.; Itai, A.; Iitaka, Y. J Comput Aided Mol Des 1987, 1,
197.
12. Meng, E. C.; Shoichet, B. K.; Kuntz, I. D. J Comput Chem 1992,
13, 505.
Figure 6. Essential hydrogen bonds to the inhibitor and four binding pockets (S2–S2′) are indicated. Hydrogen bonds are indicated by dotted lines.
13. Krystek, S.; Stouch, T.; Novotny, J. J Mol Biol 1993, 234, 661.
14. Rotstein, S. H.; Murcko, M. A. J Med Chem 1993, 36, 1700.
15. Böhm, H. J. J Comput Aided Mol Des 1994, 8, 243.
16. Wallqvist, A.; Jernigan, R. L.; Covell, D. G. Protein Sci 1995, 4, 1881.
17. Verkhivker, G.; Appelt, K.; Freer, S. T.; Villafranca, J. E. Protein
Eng 1995, 8, 677.
18. Head, R. D.; Smythe, M. L.; Oprea, T. I.; Waller, C. L.; Greene, S.;
Marshall, G. R. J Am Chem Soc 1996, 118, 3959.
19. Novotny, J.; Bruccoleri, R. E.; Saul, F. A. Biochemistry 1989, 28,
4735.
20. Wang, R. X.; Liu, L.; Lai, L. H.; Tang, Y. Q. J Mol Model 1998, 4,
379.
21. Jorgensen, W. L. Acc Chem Res 1989, 22, 184.
22. Straatsma, T. P.; McCammon, J. A. Annu Rev Phys Chem 1992, 43, 407.
23. Kollman, P. A. Chem Rev 1993, 93, 2395.
24. Apostolakis, J.; Caflisch, A. Comb Chem High Throughput Screen
1999, 2, 91.
25. Åqvist, J.; Hanson, T. J Phys Chem 1996, 100, 9512.
26. Åqvist, J.; Medina, C.; Samuelsson, J. E. Protein Eng 1994, 7, 385.
27. Hansson, T.; Åqvist, J. Protein Eng 1995, 8, 1137.
28. Ljungberg, K. B.; Marelius, J.; Musil, D.; Svensson, P.; Norden, B.; Åqvist, J. Eur J Pharm Sci 2001, 12, 441.
29. Warshel, A.; Russell, S. T. Q Rev Biophys 1984, 17, 283.
30. Roux, H. A.; Yu, B.; Karplus, M. J Phys Chem 1990, 94, 4683.
31. Wang, W.; Wang, J.; Kollman, P. A. Proteins Struct Funct Genet
1999, 34, 395.
32. Wang, J.; Dixon, R.; Kollman, P. A. Proteins Struct Funct Genet
1999, 34, 69.
33. Zhou, R.; Friesner, R. A.; Ghoshs, A.; Rizzo, R. C.; Jorgensen, W.
J. J Phys Chem B 2001, 102, 10388.
34. Huang, D. Z.; Caflisch, A. J Med Chem 2004, 47, 5791.
35. Zhou, T.; Huang, D. Z.; Caflisch, A. J Med Chem 2008, 51, 4280.
36. Williams, D. H.; Cox, J. P. L.; Adrew, J. D.; Mark, G.; Ute, G.
J Am Chem Soc 1991, 113, 7020.
37. Searle, M. S.; Williams, D. H.; Gerhard, U. J Am Chem Soc 1992,
114, 10697.
38. Yang, Z. Z.; Wang, C. S. J Phys Chem A 1997, 101, 6315.
39. Wang, C. S.; Li, S. M.; Yang, Z. Z. J Mol Struct (Theochem) 1998,
430, 191.
40. Wang, C. S.; Yang, Z. Z. J Chem Phys 1999, 110, 6189.
41. Cong, Y.; Yang, Z. Z. Chem Phys Lett 2000, 316, 324.
42. Yang, Z. Z.; Wang, C. S. J Theor Comput Chem 2003, 2, 273.
43. Yang, Z. Z.; Wu, Y.; Zhao, D. X. J Chem Phys 2004, 120, 2541.
44. Wu, Y.; Yang, Z. Z. J Phys Chem A 2004, 108, 7563.
45. Li, X.; Yang, Z. Z. J Theor Comput Chem 2006, 5, 341.
46. Li, X.; Yang, Z. Z. J Theor Comput Chem 2006, 5, 75.
47. Yang, Z. Z.; Li, X. J Chem Phys 2005, 123, 094507.
48. Yang, Z. Z.; Li, X. J Phys Chem A (letters) 2005, 109, 3517.
49. Yang, Z. Z.; Zhang, Q. J Comput Chem 2006, 27, 1.
50. Yang, Z. Z.; Zhang, Q. Chem Phys Lett 2005, 403, 242.
51. Yang, Z. Z.; Qian, P. J Chem Phys 2006, 125, 064311.
52. Guan, Q. M.; Yang, Z. Z. J Theor Comput Chem 2007, 6, 731.
53. Zhao, D. X.; Liu, C.; Wang, F. F.; Yu, C. Y.; Gong, L. D.; Liu, S.
B.; Yang, Z. Z. J Chem Theor Comput 2010, 6, 795.
54. Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. J Am
Chem Soc 1990, 112, 6127.
55. Ponder, J. W. TINKER: Software Tools for Molecular Design, version 4.2; Washington University: St Louis, MO, 2004.
56. Qiu, D.; Shenkin, P. S.; Hollinger, F. P.; Still, C. W. J Phys Chem
A 1997, 101, 3005.
57. Bernstein, F. C.; Koetzle, T. F.; Williams, T. F.; Meyer, G. J. B.,
Jr.; Brice, M. D.; Rodgers, J. R.; Kennard, O.; Shimanouchi, T.;
Tasumi, M. J Mol Biol 1977, 112, 535.
58. Dreyer, G. B.; Lambert, D. M.; Meek, T. D.; Carr, T. J.; Tomaszek,
J. Biochemistry 1992, 31, 6646.
59. Wold, S. Quant Struct Act Relatsh 1991, 10, 191.
60. Hosur, M. V.; Bhat, T. N.; Kempf, D. J.; Baldwin, E. T.; Liu, B.;
Gulnik, S.; Wideburg, N. E.; Norbeck, D. W.; Applet, K.; Erickson,
J. W. J Am Chem Soc 1994, 116, 847.
61. Yamazaki, T.; Hinck, A. P.; Wang, Y. X.; Nichoson, L. K.; Lam, P.
Y. Protein Sci 1996, 5, 495.
62. Erickson, J.; Neidhart, D. J.; Vandrie, J.; Kempe, D. J. Science
1990, 249, 527.
63. Lam, P. Y.; Jadhav, P. K.; Eyermann, C. J.; Hodge, C. N.; Ru, Y.
Science 1994, 263, 380.
64. Kurumbail, R. G.; Stevens, A. M.; Gierse, J. K.; Mcdonald, J. J.; Stege-
man, R. A.; Pak, J. Y.; Gildehaus, D.; Miyashiro, J. M.; Penning, T.
D.; Seibert, K.; Isakson, P. C.; Stallings, W. C. Nature 1996, 384, 644.
65. Michel, J.; Verdonk, M. L.; Essex, J. W. J Med Chem 2006, 49, 7427.
66. Wesolowski, S. S.; Jorgensen, W. L. Bioorg Med Chem Lett 2002,
12, 267.
67. Marcia, E. N.; Barbara, A. L.; lorante, A. Q. J Biol Chem 1981,
256, 13218.
68. Chothia, C.; Janin, J. Nature 1975, 256, 705.
69. Walls, P.; Stemberg, M. J Mol Biol 1992, 228, 227.

e. is much faster than a full free-energy simulation. and the nonbonded energy Enb.solv þ DEligand Þ elec.41.32 The LIE method was already applied to many areas in the calculation of binding free energy ˚ with an accurate result.coul elec. Zhou et al.solv ¼ DEcomplex elec. . The electrostatic solvation energy contribution to the binding energy is calculated by using the generalized Born (GB) model proposed by Still et al. the torsional energy Etorsion. which has been applied to the water systems and ion–water systems 43–48 as well as to the conformations of alkane and peptide. (3): bond stretching and angle-bending terms Ebond and Eangle.rot accounts for the loss of translational and rotational degrees of freedom on binding. The aim is the same as that of Caflisch et al. DEelec. the coefficient a is determined empirically. which predicts a value of b 5 0.1002/jcc Journal of Computational Chemistry . We have developed a simplified method. to calculate the charge distribution for the ligand to be docked in the protein environment. Lately. ideally to obtain a method for estimating binding free energies that is fast.solv (5) where DEelec. The h. The results are discussed in section 3.Estimation of Binding Free Energy 339 ligand–protein binding free energies. the improper dihedral angle term Eimptors. between the ligand and its surroundings. section 4 gives the concluding remarks and outlook to future applications.53 which treats the charge regions explicitly including atoms. respectively.30 In fact.55 This work aims to calculate the binding free energy to study the interaction of inhibitor and protease. The coefficients a and b are scaling factors for these energies. Yang and Wang et al.54 in TINKER programmes. 
we apply ABEEM model at rp level (ABEEMrp).rot DEvdw ¼ DEcomplex À ðDEprotien þ DEligand Þ vdw vdw vdw DEelec.35 described a modified LIE method based on a continuum treatment of the solvent.25–28 which was based on conformational sampling by molecular dynamics (MD) or Monte Carlo trajectory. respectively. lone-pair electrons. accurate. the binding free energy is divided into a polar and a nonpolar contribution. and p-regions. DEvdw is van der Waals Here.coul DEelec. EABEEMrp=MM ¼ X bonds Eb þ X angles Eh þ X torsion E/ þ X imptors Eimptors þ X ðEvdw þ Eelec Þ ð3Þ nonbonded The bond stretching and angle bending energies are obtained in accordance with the following formulas: Ebond ¼ Eangle ¼ X bonds kb ðr À req Þ2 kh ðh À heq Þ2 (4) X angles DGelec ¼ DEelec.coul is the electrostatic interaction energy in vacuo between the ligand and the protein.49–51 Recently.38–42 designed the Atom-Bond Electronegativity Equalization Method (ABEEM) for large organic and biological molecular charge distribution.solv is the electrostatic solvation free-energy contribution to the binding free energy.e. LIE method was first sug˚ gested by Aqvist and co-workers.coul ¼ DEcomplex À ðDEprotien þ DEligand Þ elec. and general. The surroundings are either the solvated receptor binding site (bound state) or just solvent (free state). This general outline is organized as follows. for the estimation of absolute binding free energies inspired by the LIE approach. i. it is equal to the difference of the solvation free energies of the complex and the isolated ligand and protein. The nonbonded energy is computed as a sum of the Lennard-Jones and Coulomb contributions for pairwise intra.solv À ðDEprotien elec. Finally. bonds.31. the ABEEM/MM model has been used to perform dynamics simulations for proteins. ABEEM/MM. 
two simulations are required: one with the ligand free in solution and one with it bound to solvated receptor.27 or large deviations in some of the predicted binding energies.52 In this study. Methodology The ABEEMrp/MM Model In this method. . the electrostatic scaling factor is also considered a free parameter in the fitting except for a few studies characterized by either a small number of ligands26..33 and Caflisch and co-workers34.solv ð2Þ interaction energy between ligand and protein. the LIE method only requires simulations of the two ending windows. Although the LIE method of Aqvist et al. ABEEM model has been fused with MM. In the LIE approximation. although different form. i. DGelec is the sum of the ligand–protein Columbic energy in vacuo and electrostatic solvation energy in continuum model.5. instead of using explicit solvent. The methods and related details are summarized in section 2. we replace the MD with energy minimization and combine the LIE method with a treatment of continuum electrostatics. and DGtr.i denotes MD or MC averages of the nonbonded electrostatic (elec) and van der Waals (vdw) interactions of the ligand with its surroundings. the formula of the present method is DGbind ¼ aðDEvdw Þ þ bðDGelec Þ þ DGtr. kb and kh represent the force constants of stretching and bending.26 The linear response approximation provides a physical basis for the treatment of the electrostatic contribution to the binding free energy. But in contrast to FEP/TI in which a large number of intermediate windows must be evaluated.. r and h are actual values of bond lengths DOI 10. and calculated according to KGLIE ¼ bðhEelec ibound À hEelec ifree Þ þ aðhEvdw ibound À hEvdw ifree Þ bind (1) where Eelec and Evdw are the electrostatic and van der Waals interaction energies.29.36.coul þ DEelec.coul elec. In our approach.and intermolecular interactions. Recently. it is still too slow for screening a large number of ligands. 
In other words.49.37 On the basis of the electronegativity equalization principle. potential energy function of complexes of a protein and its inhibitors is evaluated as a sum of the following components in eq.

here set to 78. k is an overall correction coefficient 0. The Electrostatic Solvation Energy Model The electrostatic solvation energy contribution to the binding energy is calculated using the GB model proposed by Still et al. A monoprotonated state at the catalytic aspartates is considered as reference from the study by Huang and Caflisch. bonds. respectively. 49 and 53. which respond to their environments in a way similar to the polarization response of real molecules. the Born radius ai was calculated with a fast analytical approach. in eq. (7): Eimptors ¼ X imptors vð1 À cos 2/Þ (7) Here. for the intramolecular interactions. The electrostatic interaction energy Eelec is expressed as: Eelec ¼ X i<j The coordinates of HIV-1 protease in complex with the inhibitor Ala-Ala-Phe-C-{}-Ala-Val-Val-OMe were obtained from the ˚ Brookhaven Protein Data Bank. 2 • Journal of Computational Chemistry and angles. aij 5 (aiaj)0.5 for 1. Evdw describes the van der Waals nonbonded atom–atom interaction: Evdw ¼ X i<j where qi and qj are the net charge of the atoms in the molecule. The water bridging the two flaps was retained as structural water binding of the inhibitors considered in this study. The electrostatic solvation energy was calculated by using the GB model proposed by Still et al. No. rij is the distance between atoms i and j. Moreover.55 The binding free-energy formula is used for the fitting of a three-parameter model. and p regions as variables. and m are the dihedral angle and improper dihedral angle force constants.34 The crystal structure of the 1AAQ complex contains the largest compound from a set of 24 HIV-1 PR inhibitors (Fig.54 in TINKER program. in the latter case.340 Chen. Manipulation of Training Set   12 6 4fij eij r12 =rij À r6 =rij ij ij (8) Geometric combining rules for the Lennard-Jones coefficients used are rij 5 (riirjj)1/2 and eij 5 (eiiejj)1/2. We take the parameters of bond stretching and angle bending. 
and all the other parameters from the previous articles. Minimization and Energy Calculations kqi qj =rij (9) For the Coulomb term. Zhao. the coefficient fij is equal to 0. The electrostatic interaction term in the ABEEMrp/MM model has been described in detail in Refs. V1. and Yang • Vol.1002/jcc Journal of Computational Chemistry .4 interactions (atoms separated by exactly three bonds) and fij 5 1. The minimized structures were used for evaluating the van der Waals energy and electrostatic interaction energies. The nonbonded part contains the Lennard-Jones and Coulomb contributions for pairwise intra. respectively.55 The formula is described as follows:   N N 1 XX DGpol ¼ À166:0 1 À  e i¼1 j¼1 qi qj 2 rij þ a2 eÀDij ij 0:5 (10) Here. Partial charges were assigned using the ABEEMrp method. 32. (9).and intermolecular interactions.41. respectively.4 nM to 6.01 kcal mol21 A21. qi and qj are the partial charges of sites i and j. rij is separation of sites i and j.31 DOI 10.5 and Dij 5 r2 /(2aij)2 and the double sum runs over ij all pairs of atoms (i and j). we deal with the net partial charges by ABEEMrp method. Nonbonding cut˚ was used. The electrostatic energy is the sum of the Coulombic energy in vacuo and the electrostatic solvation energy. fij 5 0.5739 in the ABEEMrp model if there is no otherwise specification. e is commonly called the solvent dielectric constant of the media. ai is so-called Born radius of atom i. bonds. and p regions.57 with a 2.54 in TINKER program. The torsional term is computed as follows: Etorsion ¼ X  V1 V2 : ½1 þ cosð/i ފ þ ½1 À cosð2/i ފ 2 2 i þ V3 ½1 þ cosð3/i ފ 2  ð6Þ The improper dihedral angle term is written as eq.58 The remaining 23 inhibitors were modeled manually by deleting parts of the inhibitor in 1AAQ.0 for any i–j pair connected by a valence bond (1–2 pairs) or a valence bond angle (1–3 pairs). The former was calculated with ABEEMrp/MM. which treat the charge regions explicitly including atoms. 
The van der Waals energy was calculated with ABEEMrp force field using the default cutoff of ˚ 14 A. V2. Residues greater than 20 A away from ˚ off of 14 A the active site of protein and the water molecule in HIV-1 PR were kept fixed during minimization. Here. The van der Waals and electrostatic interaction energies were calculated by subtracting the values of the isolated components from the energy of the complex. hydrogen atoms were added to all structures using TINKER program55 and minimized with the ABEEMrp fluctuating charge force field. lone-pair electrons.56 In this study. V3. and req and heq are used to denote the equilibrium values of the bond length and angle. treating partial charges on atoms. the partial charges qi are calculated by atom–bond electronegativity equalization method (ABEEMrp). lone-pair electrons. the ABEEMrp parameters (v* and 2g*) and LennardJones parameters (r and e).49. 1) with inhibition constant (Ki) values ranging from 0.3 for the solvent is water liquid.5 A resolution X-ray structure (PDB entry 1AAQ58). The summation runs over all of the pairs of atoms i \ j on molecules A and B or A and A.0 for all of the other cases.53 All atomic radii were set to the pffiffi 6 2 ABEEMrp/MM van der Waals radii ( 2 r). It is the main problem to obtain accurate Born radii to compute the electrostatic solvation free energy. All protein–inhibitor complexes were minimized by the conjugate gradient algorithm to root mean square ˚ (rms) of the gradient of 0.5 lM.

Estimation of Binding Free Energy 341 Figure 1. respectively.1002/jcc . were used to determine optimal Journal of Computational Chemistry DOI 10.26 DGbind ¼ aðDEvdw Þ þ bðDGelec Þ (11) Results and Discussion Binding Free Energy (12) Calculated binding free energies and the experimental binding free energy for the 24 receptor–ligand complexes listed in Table 1 and Figure 1. HIV-1 PR inhibitors tested by Dreyer et al.58 DGbind ¼ aðDEvdw Þ þ bðDGelec Þ þ DGtr.rot and a two-parameter model.

06 211.47 258.01 279.70. and the multiple correlation coefficient r2 of three. 32. between the calculated and experimental values for the 24 complexes. Each resulting model parameterization was used to predict the left out DG from the left out simulation data.solv 3-Parameter fit 2-Parameter fit 234. 0.24 and 2.60 1hvk.67 0.75 28.77 kcal mol21 and 0. The rms deviations for the seven inhibitors of the test are 1.47 278.27 117.29 270.coul DEele. The cross-validated statistical figures of q2 and sLOO PRESS LOO are 0.87 and 0.28 280.72 279.21 211.77 212.47 64.1002/jcc .56 210. in kilocalories per mole. and p is the number optimized in the model.53 212. Table 2 reports the coefficients for the three. missing one of the compounds.and Three-Parameter Models and Some Statistical Figures of Merit.53 122.42 28. where n 5 PRESS 24 is the training set size.rot 5 8.85 210. No.27 29. 2 • Journal of Computational Chemistry Table 1. 0. optimal parameters were found using each of the 24 data sets.24 284. the 1hvr complex does not contain the water molecular bridging the flaps.91 29.60 1bvg61.77 106. for the model of eqs.49 29. and the linear correlation coefficient r is around 0.64 290.61 210.02 286.20 211.14 240.65 210.85 111.15 212.78 8.37 The predictive ability of the present approach was further tested on a set of inhibitors. is calculated as: r2 5 SSR/(SSR 1 SSE).84 212. Above all.47 210. in kcal mol21.35 286.19 103.99 210.82 and 0.95 281. Data i Training Set Compounds.12 212.33 118. and the square sum of deviations of these predictions from experimental values is the ‘‘Predictive Residual Sum of Squares.84 233. DGcacl DGcacl DEele.67 211.78 282.18 211.12 279. and i range from 1 to 24.90 212. leave-one-out cross-validation has been performed on the models.34 210.08 269.00 75.61 212.86 29. c Leave-one-out cross-validated standard deviation.60 211.70 100.3274 b 0. 
and Yang • Vol.6742 a The multiple correlation coefficient r2 of a model measures what proportion of the variance observed in the experimental binding data.87 q2 b LOO 0.05 233.50 284. b Leave-one-out cross-validated correlation coefficient. respectively. 9hvp62. Interaction Energies.29 283.91 120. and Ki are listed in Table 3. it is found that the three-parameter model has better predictive accuracy than the two-parameter model.65 213. b. the values of experimental binding free energy.74 All energies are in kcal mol21.97 116.59 102.08 212.61 211.and two-parameter model is 0.91 295.29 282.09 266.52 28. their PDB code is 1hvi.13 211.02 282.67 kcal mol21 is within the range of 7–11 kcal mol21 observed experimentally.72 106.63 The structures. From Table 2 and Figure 3.41 290.80 213. SSR 5 (DGcomplex(i) 2 hDGexpti)2 and SSE is the i residual unexplained square sum of deviations SSE 5 overfitting were assessed by leave-one-out cross-validation.59 the multiple correlation coefficient r2.49 273.77 0.09 211.17 212. Journal of Computational Chemistry DOI 10.29 287.44 212.64 69.95 0.53 211. Solvation Energies Calculated for the P (DGexpt(i) 2 DGcomplex(i))2.70 67. which is very encouraging. coefficients.67. The parameter optimizations were performed by leastsquares optimization method. Zhao.and Two-Parameter Models.95 285.01 211. The correlation equation about the calculated binding free energies with the three fitting parameters model versus the experiment value is given in Figure 2.342 Chen.72 66.13 297.14 103.78 kcal mol21 of three-parameter model. and 1hvr.33 212.1981 0. As Well As the Calculated Binding Free Energies of Three. All the results of these parameters and also some common statistical figures of merit for linear regression models are summarized in Table 2. and DGtr.32 117. (11) and (12).38 255. respectively.39 kcal mol21 for the threeand two-parameter model.05 213. the three-parameter model has the good predictive accuracy.73 120.75 295. 
But the two models with three or two parameters give result in lower rms deviations and mean unsigned deviations h|dev|i values.42 27.58 275.82 294.55 0.14 274. To avoid overfitting the data.60 1hvj.17 114.’’ The leave-one-out cross-validated correlation coefficient is then q2 LOO 5 1 2 (PRESS/SSR) and the cross-validated pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi standard deviation is sLOO ¼ PRESS=ðn À p À 1Þ. respectively. where SSR is the square sum of deviations P explained.55 kcal mol21. a.68 118.36.70 h|dev|i (kcal mol21) 0.66 238.33 213.91 276.61 0.69 285.43 279.20 282.93.97 287.69 213.2109 0.82 272. Coefficients for the Two.82 sLOO c (kcal mol21) PRESS 0.rot (kcal mol21) rmsd (kcal mol21) 0.24 124. The predictive power is Table 2. a measure of the overall fit of the model.92 291.and two-parameter models and some statistical figures of merit. A correlation of the calculated binding free energy versus experimental binding free energies for 24 HIV-1 PR inhibitors is shown in Figures 2 and 3. 0.20 264.2342 DGtr.56 r2a 0.08 98. It is also deserved to note that the third parameter DGtr.70 293.65 212.54 213. a 0. The seven complexes were minimized in the same method as proposed above.98 210.56.55 28.50 Inhibitors 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 DEvdw 272.81 295.60 1hvl.44 211.29 110.47 294.rot.01 211. The calculated energy components and the calculated binding free energy for seven testing sets are reported in Table 4. which minimizes the rms deviation and mean unsigned deviation (h|dev|i).72 292.

good if one considers that the seven inhibitors have rather different chemical structures.

The values of the coefficient α are 0.2109 and 0.3274 for the two- and three-parameter models discussed above, respectively (Table 2). They are close to the values of α obtained in the study by Huang and Caflisch,34 who used a different force field and solvation model. The similar values of the scaling parameter for the van der Waals interaction indicate that it is rather robust with respect to the physicochemical characteristics of the binding site. On the other hand, the values of β are completely different: 0.0636 versus 0.1981 for the two-parameter model and 0.0168 versus 0.2342 for the three-parameter model. In the study by Huang and Caflisch, partial charges were assigned using the MPEOE method. Here, partial charges were assigned using the ABEEMσπ method. So the Coulombic energy in our method is different from the results in their article, and the difference may be caused by both the use of GB versus Poisson-Boltzmann and the method of assigning partial charges. Moreover, the solvation penalty is also smaller in the current study than in that of Huang and Caflisch. Through comparison, we conclude that the role of electrostatics in the binding process is completely changed. We think that it is reasonable to assume that the parameters should be system dependent and might also be force field dependent. In general, the ratio of the value of β to the value of α is close to the ratio obtained in the study by Zhou et al.33

Figure 2. Comparison of binding free energy calculated using eq. (11) versus experimental binding free energies for 24 HIV-1 PR inhibitors. The diagonal is drawn for visual help, while r is the correlation coefficient.

Figure 3. Calculated binding free energy using eqs. (11) and (12) versus experimental binding free energies for 24 HIV-1 PR inhibitors. The black dots represent the three-parameter model and the open triangles represent the two-parameter model. If the calculated results perfectly agree with the experimental values, the data points should be on the diagonal line, while the r value in parentheses is the correlation coefficient.

The Parameter Transferability

In this article, we reported good results for the validation of this method for HIV-1 protease inhibitors. To test whether the parameters of this method can be applied to other drug targets, we validated this method for the drug target cyclooxygenase (COX)-2. We validate this method for the drug target COX-2 by calculation of the binding free energies and the relative binding free energies (ΔΔG_binding(A→B)) of a set of inhibitors of COX-2.65 The structure of the inhibitor of COX-2 is shown in Figure 4, the substituent R is shown for each compound, and the perturbations considered are from 1 to 2, 3 to 1, and 4 to 3. The perturbations selected in this work are typical of those carried out in a binding free-energy study.66 The relative binding free energies were calculated using the following formula:

$$\Delta\Delta G_{\mathrm{binding}}(A \rightarrow B) = \Delta G_{\mathrm{binding}}(B) - \Delta G_{\mathrm{binding}}(A) \qquad (13)$$

System Preparation

The PDB structure of murine COX-2 was selected as the starting point for this study (PDB code 1CX2).64 To reduce the computational cost, only the protein residues that have one heavy atom within about 20 Å of any heavy atom of the ligand were retained.65 The hydrogen atoms were added to all structures using the TINKER program55 and minimized with the ABEEMσπ fluctuating charge force field, which enables the charge distribution, described as a set of atom-centered monopoles, to be calculated directly, treating partial charges on atoms, bonds, lone-pair electrons, and π regions as variables, which respond to their environments in a way similar to the polarization response of real molecules. All protein–inhibitor complexes were minimized by the conjugate gradient algorithm to an rms of the gradient of 0.01 kcal mol⁻¹ Å⁻¹. The van der Waals and electrostatic interaction energies were calculated as described above. The electrostatic energy is the sum of the Coulombic energy in vacuo and the electrostatic solvation energy. The former was calculated with ABEEMσπ/MM. The calculated binding free energies are compared with the experimental results. The relative binding free energies for these perturbation complexes are also calculated and compared with the available experimental results. For the model of COX-2,
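The way the energy terms described here combine into a binding free energy estimate, and how eq. (13) turns two such estimates into a relative binding free energy, can be sketched as follows. The functional form alpha*DEvdw + beta*DEele (+ DGtr,rot) is an assumption about eqs. (11) and (12), which are not reproduced in this part of the text. The function names, coefficients, and energies below are hypothetical placeholders, not the fitted parameters of Table 2.

```python
def lie_binding_free_energy(de_vdw, de_coul, dg_solv, alpha, beta, dg_tr_rot=0.0):
    """Assumed LIE-type estimate (kcal/mol).

    de_vdw   : van der Waals interaction energy
    de_coul  : Coulombic interaction energy in vacuo
    dg_solv  : electrostatic solvation energy change on binding
    dg_tr_rot: constant third parameter; 0.0 gives the two-parameter form
    """
    # The electrostatic term is the sum of Coulombic and solvation parts.
    de_ele = de_coul + dg_solv
    return alpha * de_vdw + beta * de_ele + dg_tr_rot


def relative_binding_free_energy(dg_a, dg_b):
    """Eq. (13): DDG_binding(A->B) = DG_binding(B) - DG_binding(A)."""
    return dg_b - dg_a


# Hypothetical inputs in kcal/mol (illustration only).
ALPHA, BETA = 0.2, 0.1
dg_a = lie_binding_free_energy(-80.0, -50.0, 40.0, ALPHA, BETA)
dg_b = lie_binding_free_energy(-90.0, -45.0, 38.0, ALPHA, BETA)
print(f"DG(A) = {dg_a:.2f}, DG(B) = {dg_b:.2f}, "
      f"DDG(A->B) = {relative_binding_free_energy(dg_a, dg_b):.2f} kcal/mol")
```

A negative DDG(A→B) means that inhibitor B is predicted to bind more strongly than inhibitor A.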

Complexa 1hvi Structure Ki(nM)b 0.59 62 1hvr 0.270 213.5 211.004 216.54 Refs. Complexes of Testing Set.112 214. b . Experimental Inhibition Constants (nM). 60 1hvj 0.50 63 a Code in the PDB.16 60 1bvg 0. Experimental inhibition constants (nM).6 60 1hvl 0. and Experimental Binding Free Energy (kcal mol21).59 61 9hvp 4.22 60 1hvk 0.012 DGexp c(kcal mol21l) bind 215. c Experimental binding free energy (kcal mol21).31 213.011 215.Table 3.
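Table 3 lists both inhibition constants and experimental binding free energies. The two are commonly related through DG = RT ln Ki, and the sketch below shows that conversion. It is an assumption that this relation underlies the tabulated values. The text does not state the temperature, so the default of 298.15 K and the function name are hypothetical choices.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_free_energy_from_ki(ki_nanomolar, temperature_k=298.15):
    """Convert an inhibition constant in nM to DG = RT ln(Ki) in kcal/mol.

    Ki is expressed relative to a 1 M standard state. The temperature is an
    assumed default, not a value taken from the paper.
    """
    ki_molar = ki_nanomolar * 1.0e-9
    return R_KCAL * temperature_k * math.log(ki_molar)


# Hypothetical inhibition constants in nM (illustration only).
for ki in (0.01, 1.0, 100.0):
    print(f"Ki = {ki:7.2f} nM  ->  DG = {binding_free_energy_from_ki(ki):.2f} kcal/mol")
```

A tenfold decrease in Ki lowers DG by RT ln 10, which is about 1.4 kcal/mol near room temperature.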

the Calculated Relative Binding Free Energies (A?B) of Three-Parameter Model.22 2130. which is also excited in their report.45 210. The rms deviations for the four inhibitors of COX-2 are 1. we conclude that the present method can be applied to predict the relative binding free energy between the different inhibitors and enzyme.80 inhibitors 1hvi 1hvj 1hvk 1hvl 1bvg 1hvr 9hvp DEvdw 2105.76 2114. near or lesser than 0.80 215.76 DDGcacl binding DGexp a bind 210.31 277.63 213.65 216.Estimation of Binding Free Energy 345 Table 4.63 239.59 28.10 243.25 kcal mol21 for the three.88 92.70 29.85 DEele.63 213.24 (4t3) 1. respectively.77 212.28 25. Energy Components and the Calculated Binding Free Energies of Three.34 and another recent simplified approach for the estimation of absolute binding free energies. the rms deviations of our two models are 0.coul 0. It is necessary to compare the present approach with a previous method proposed by Huang and Caflisch.63 1.22 DDGexp b binding (A?B) (A?B) Inhibitors COX-2 COX-2 COX-2 COX-2 (1) (2) (3) (4) DEvdw 278.56 and 0.31 28. Comparing the value of DDGbinding (A?B) between the experimental and calculated results. respectively.24 2106.06 2119. From discussion.and Two-Parameter Model for COX-2.63 252.and two-parameter models. For a training set of 16 Table 5.74 27. and the experimental results of binding free energies and relative binding free energies of COX-2 were listed in Table 5.41 215.77 273.24 29. Energy Components and the Calculated Binding Free Energies of Three.84 215.29 213.65 212.89 kcal mol21 in their report.26 2113.77 kcal mol21 for the three.55 for three.solv 3-Parameter fit 2-Parameter fit 244.69 30.70 106.23 and 1. respectively. DGcacl DGcacl DEele.76 212.99 All energies are in kcal mol21.02 240.85 117.92 213. DGcacl 3-Parameter fit 29.68 3. it should be done to refit the parameters of the model to be applied to the other systems. 
Comparison With the Recent LIE Method Figure 4.45 87.77 kcal mol21.48 All energies are in kcal/mol.18 224.79 DGcacl 2-Parameter fit 210. respectively. It is encouraging that compared with the results of three.95 DEele.and two-parameter models. The calculated binding free energies of three. the present method can gain the same or better correlation coefficient between the calculated energies and experimental values. the present method can give the correct prediction of the relative binding free energies between the inhibitors and enzyme.coul DEele. Journal of Computational Chemistry DOI 10.and two-parameter model for COX-2.77 (1t2) 0.55 28.1002/jcc .77 (1t2) 1.82 (3t1) \ 24.86 and 3.19 33.73 and 0.08 110. and the experimental results are 1. which means the parameters are not always transferable for applying to new systems. We choose GBstill model. where DGbinding (B) and DGbinding (A) refer to the binding free energies of the different inhibitors in the complexes. The recent simplified approach similar to that by Zoete et al. b Experimental relative binding free energies of COX-2 with the inhibitors.12 2120.95 228.20 211.01 210. a Experimental binding free energies for COX-2.49 0.40 211.38 97. The mean unsigned deviations in this work are 0. the calculated relative binding free energies (A?B) of three-parameter model.solv 31.90 90.and Two-Parameter Models for the HIV-1 PR Testing Set.23 29.9 performed conformational sampling by MD in vacuo (distance-dependent dielectric function).64 (4t3) [ 4. The large deviations occur between the COX-2 (3).09 28. but in the case of calculation of absolute binding free energies.04 275.70 and 0.81 (3t1) 22. to calculate the electrostatic solvation energy and to avoid calculating the timeconsuming Poisson-Boltzmann equation. and the Experiment Results of Binding Free Energies and Relative Binding Free Energies of COX-2. 
Structure of the inhibitors of the cyclooxygenase-2 considered in this study.and two-parameter models.or two-parameter models of Huang and Caflisch. Huang and Caflisch applied LIE method in combination with energy minimization and finite difference Poisson calculation of electrostatic solvation energy for the estimation of the absolute free energy of binding.
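The comparisons in this section rest on three statistics between calculated and experimental binding free energies: the rms deviation, the mean unsigned deviation, and the correlation coefficient. A minimal sketch of how they can be computed is given below. The function names and the sample values are hypothetical and are not taken from Tables 4 or 5.

```python
import math


def rms_deviation(calc, exp):
    """Root mean square deviation between calculated and experimental values."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, exp)) / len(calc))


def mean_unsigned_deviation(calc, exp):
    """Mean of the absolute deviations, <|dev|>."""
    return sum(abs(c - e) for c, e in zip(calc, exp)) / len(calc)


def correlation_coefficient(calc, exp):
    """Pearson correlation coefficient r."""
    n = len(calc)
    mc = sum(calc) / n
    me = sum(exp) / n
    cov = sum((c - mc) * (e - me) for c, e in zip(calc, exp))
    var_c = sum((c - mc) ** 2 for c in calc)
    var_e = sum((e - me) ** 2 for e in exp)
    return cov / math.sqrt(var_c * var_e)


# Hypothetical binding free energies in kcal/mol (illustration only).
dg_calc = [-10.2, -12.9, -14.1, -11.0]
dg_exp = [-11.0, -12.5, -13.6, -11.8]
print(f"rmsd = {rms_deviation(dg_calc, dg_exp):.2f} kcal/mol")
print(f"<|dev|> = {mean_unsigned_deviation(dg_calc, dg_exp):.2f} kcal/mol")
print(f"r = {correlation_coefficient(dg_calc, dg_exp):.2f}")
```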

1aaq11.911 3.316 3. the rms deviation is 0.314 2. 32. and the angle of donor–H–acceptor is larger than 1508. through the 1000 ps dynamic simulation of three complexes. When minimizations keep the proteins as the crystal structure.150 2.075 5.90 kcal mol21.921 3.91. xi.034 3. presents the degree of cumulate compactness of protein–ligand complex. the rms deviation is 0. Moreover.5–3.293 The radius of gyration. respectively. HIV-1 protease-inhibitor complexes of known three-dimensional structure.892 3.032 3.965 3. As commonly accepted.091 2. Complex 21 has smallest Rg value. and 1aaq21) of ABEEMrp/MM Simulated Structures. The formula of the radius of gyration is i91=2 8PN þN h < i p l ðxi À xÞ2 þ ðyi À yÞ2 þ ðzi À Þ2 =   z : NP þ Nl .580 3. and 1aaq21 as the model molecule to investigate the relationship of geometric fit between these three different protein–ligand complexes. and 1. Furthermore.465 4. 1aaq21.910 3. the Rg of three complexes are all smaller than the isolated protease. small Rg value corresponds to compact packing and good fit of geometry on the interface. this proves that complex 21 has the best geometric fit relatively and this is consistent with the binding free energy compared result.170 3. and zi are the coordinates of   those non-hydrogen atoms.165 1aaq11 3. the radii of gyration of complexes 1aaq6.994 3. and Yang • Vol. This denotes that three ligands are all inside the pockets of the proteases. and  are the coordinates of z center of mass. and the mean unsigned deviation is 0. 29.7495. if the distance Table 6.437 3.111 2.151 3.8 kcal mol21.229 3.303 2. 1.911 3. and the isolated protease are 1. Journal of Computational Chemistry DOI 10. a hydrogen bond exists if the distance between donor and ˚ acceptor heavy atoms lie in 2. the radius of gyration characters the quality of geometric fit. Rg. Hydrogen Bond Hydrogen bond is important for the stabilization of a protein– ligand conformation. 
compared with our three-parameter model in which the correlation coefficient is 0. the leave-one-out method correlation coefficient is 0. the difference of the electrostatic solvation free energies on binding.956 2.117 2. Atom pairs Asp25 O1ÁÁÁO Asp25 O2ÁÁÁO Asp250 O1ÁÁÁO Asp250 O2ÁÁÁO Gly27 OÁÁÁN Gly270 OÁÁÁN Asp290 NÁÁÁO Gly480 OÁÁÁN WT OÁÁÁO1¼C ¼ WT OÁÁÁO2¼C ¼ Ile50 NÁÁÁO(WT) Ile500 NÁÁÁO(WT) Average distance 1aaq6 3. yi. Under this assumption.47 kcal mol21. The first three energy terms were averaged over 50 snapshots saved along 100 ps of MD simulation. x. In the results of their four-parameter model. No.5 A. 1aaq11. 2 • Journal of Computational Chemistry Figure 5.67 The previous research expressed that the interface of compound has the same degree of cumulate compactness as the inside of protein.104 2.408 3. For the systems of our research. It is found that.018 3.623 3.494 3.229 3.079 Rg ¼ (14) where Np and Nl are the number of non-hydrogen atoms of protein and ligand.82. the correlation coefficient is 0.730 3. and 1aaq21).960 3. 1. Radius of gyration as a function of simulations time during 700 ps molecular dynamics runs of equilibrated structures of three complexes.074 3.869 3.051 3. the shifts of radius of gyration for three complexes are shown in Figure 5.7521.193 3.086 3.095 3.037 3.70 kcal mol21.911 3.325 3.023 3. the value of Rg denotes the correlation of quality of the geometric fit on the interface. and 1.229 3. The Average Hydrogen Bond Distances of Atom Pairs NÁÁÁO and OÁÁÁO in HIV-1 Protease–Inhibitor Complexes (1aaq6. We chose the complexes 6 (1aaq6).219 3.68. and 211. the leave-one-out method correlation coefficient is 0.480 3.941 3.69 So.7494.229 3. from the results of optimization and the dynamic simulation. y.611 3.057 3.93.7695 nm.015 3. the distance between H ˚ atom and acceptor atom lies in 1.229 3.3 A.910 3. Minimization Molecular dynamics 1aaq6 3.230 2. 1aaq11. Of course.7491. and 1aaq21 are 27. we keep the ˚ Hydrogen bond distance in A. 
Possible hydrogen bond donors or acceptors of the ligand should be saturated as much as possible.8403. and a constant term. On comparison. 1aaq11.911 3.843 3.7493 nm.462 3. the average radii of gyration of three complexes are 1.039 2.134 4. Structure Analysis Radius of Gyration protease as the crystal structure in the calculation.113 3.56 kcal mol21.563 3.345 1aaq21 3.6–2.399 3.302 3.681 2. Zhao.7523. we added the investigation of the hydrogen bond distance to identify among three HIV-1 protease–inhibitor complexes (1aaq6. they proposed a four-parameter model based on the electrostatic interaction energy between the ligand and the protein.672 1aaq11 3.103 3.151 1aaq21 3. minimization is easier and more efficient than obtaining an average structure by MD sampling even if one runs MD sampling in vacuo. 1. 1aaq11. Furthermore.416 2. the radii of the gyration of three complexes are related as 1aaq6 [ 1aaq11 [ 1aaq21. the binding free energies of 1aaq6.327 3. In any case.517 2.346 Chen. so. and the mean unsigned deviation is 0. we conclude that our method gives better result.65.178 3.1002/jcc .229 3.715 3. the buried surface. respectively.216 3.893 3.46. respectively.
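The radius of gyration of eq. (14) is the root mean square distance of the Np + Nl non-hydrogen atoms of protein and ligand from their common center. A minimal sketch is given below. It assumes equal weights for all heavy atoms, so the center is the geometric centroid rather than a mass-weighted center of mass. The function name and the coordinates are hypothetical.

```python
import math


def radius_of_gyration(protein_xyz, ligand_xyz):
    """Rg over the non-hydrogen atoms of protein and ligand (eq. 14 form).

    protein_xyz, ligand_xyz: lists of (x, y, z) tuples for heavy atoms.
    Assumption: unweighted centroid is used as the center.
    """
    atoms = list(protein_xyz) + list(ligand_xyz)
    n = len(atoms)  # Np + Nl
    cx = sum(p[0] for p in atoms) / n
    cy = sum(p[1] for p in atoms) / n
    cz = sum(p[2] for p in atoms) / n
    total = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in atoms)
    return math.sqrt(total / n)


# Hypothetical coordinates in nm (illustration only).
protein = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
ligand = [(0.0, -1.0, 0.0)]
print(f"Rg = {radius_of_gyration(protein, ligand):.4f} nm")
```

Applied to each saved snapshot of a trajectory, the same function gives the Rg time series of the kind plotted in Figure 5. A smaller value indicates more compact packing at the interface.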

the present approach for calculating binding free energy with structural analysis can be applied to quickly assess new inhibitors of HIV-1 proteases. more characteristics of covalent bond exist and the intensity of the hydrogen bond will be stronger. M. it may be necessary to refit the parameters for applying to other systems.. the parameters of present method are not completely transferable. E. Science 2004. O. and 1aaq21) of ABEEMrp/MM simulated structures are listed in Table 6. the average hydrogen bond ˚ distances are 3. The overall scheme of the inhibitor binding is shown in Figure 6 about the main hydrogen bond between the inhibitor and binding pockets of HIV-1 protease58 in which the Asp25 is protonated..165. and 1aaq21. Caflisch. O.. Kuhn. and 3. Chong.079 A of complexes 1aaq6. Karplus. the present approach does not need to add a Born correction term for ionized systems as required in explicit solvent. 1242. 1. Acknowledgements We are very grateful to the editor and reviewers’ nice suggestions on the manuscript.672.. Perspect Drug Discov Design 1995. the average distances of atom pairs NÁÁÁO and OÁÁÁO in HIV-1 protease–inhibitor complexes (1aaq6. K. A.. and 1aaq21. Furthermore. Karplus. By comparing the radii of gyration and hydrogen bond distances of three model complexes. Brandsdal. 2. Donini. T. Aqvist... Essential hydrogen bonds to the inhibitor and four binding pockets (S2-S20 ) are indicated.. Debouck. Cieplak... Cheatham. and ˚ 3. E. With the encouragement that the computed results have a good correlation with the binding free energy. 25. E. B. W. A... 13. M..5 A. 1. 53. Lee. M. Meek. Hydrogen bonds are indicated by dotted lines.. W. 51. 3. When applied to seven HIV-1 protease–inhibitor complexes of different structures. S. 449. 33. Almlof.rot term and simultaneous consideration of the loss of translational and rotational degrees of freedom on binding.. 1aaq11. M. Duan. Y. 
the present method can be applied to predict the relative binding free energy between the different inhibitors of other target drugs. T. respectively. We also greatly thank Professor Jay William Ponder for providing the TINKER programs.. Wang. We gain the conclusion that the average hydrogen bond distance of three complexes is 1aaq6 [ 1aaq11 [ 1aaq21. References 1. I. ˚ 5. 17. Metcalf.56 kcal Journal of Computational Chemistry DOI 10. 861. M. 595. This conclusion is well in accordance with those compared result of the radius of gyration and proves the accuracy of the conclusion that complex 1aaq21 has better geometric fit than other two complexes. 849. Caflisch. 1aaq11.. To further investigate the relationship of geometric fit between these three different protein–ligand complexes. Srinivasan. L. D. P. Ehrhardt. discussed in their article that the original LIE method based on MD (or Monte Carlo) sampling might be more appropriate than LIE combined with minimization for flexible binding sites. From Table 6. N. O. Massova.. Proc Natl Acad Sci USA 1987. Jorgensen. In addition. W. W. A. 3. 303. Rosenberg. J Comput Chem 1992. C. J Comput Chem 2004.151. I.24 kcal mol21. 12.345.. from the dynamic simulation results. B. 4. News Physiol Sci 1998. and the average hydrogen bond distances are 3. 182.26 and the entropic contribution to the binding is taken into account implicitly through the DGtr. Iitaka. C. P...1002/jcc . B. C. 6. 505. Strickler. Goodford. 8903. Here. J. Michielin. J. so the hydrogen bond strength of three complexes is 1aaq21 [ 1aaq11 [ 1aaq6. Huo. the method provides a satisfactory correlation with rms deviation of 1. A. 889. T.70 kcal mol21 and an unsigned mean deviation of only 0. B. A. from the minimized results. Itai. Meng. Conclusions The LIE method is combined with ABEEMrp fluctuating charge force field and GB continuum models calculation of electrostatic solvation for the calculation of the absolute free energy of binding. 8. Y. P. L. 
but similar to the calculation of absolute binding free energy. J Comput Aided Mol Des 2003. Wlodawer. without any reparametrization. which might not be computationally feasible for a large library of compounds in virtual screening.. Kuntz. Jorgensen. Acc Chem Res 2000. As Huang et al. J Comput Aided Mol Des 1987. Shoichet.. respectively. 11. 10. Lamb. ˚ between donor and acceptor heavy atoms is smaller than 2. A correlation coefficient of 0. G. the quality of geometric fit of protein–ligand conformations is discussed. J. 13. Gorniak. D. D... Tomioka. 3. 3.Estimation of Binding Free Energy 347 Figure 6. 9. and therefore. The present method has been shown to give good results for a training set of 24 HIV-1 protease inhibitors. V.93 was obtained with an rms deviation of 0. M. Reyes. 84. L. binding site flexibility requires longer MD simulations to reach convergence. 28. 1813. C. L.293 A of 1aaq6.. Curr Opin Chem Biol 1997. Annu Rev Med 2002.. A. Walchi. 197. 1aaq11.. J Med Chem 1985. J. 7. Zoete.. R. J.. Lee. mol21.. Case. Kollman. we compared the distances of hydrogen bond between these three ligands and protease.

22. Williams.. C. J Theor Comput Chem 2006. Z. 403... J Comput Chem 2006. Huang. J. L. Carr. 69. M. Ru. U. K. 39. Baldwin. X. Miyashiro. 55.. S. B. Yang. J Mol Model 1998. 6127. J. A. S. P. I. ˚ Hansson. 256. J Med Chem 2006. Proteins Struct Funct Genet 1999. 116. 23. Yang. J Phys Chem A 1997. R. A. Z.. J Am Chem Soc 1992. C. T. Yang.. Greene. Z. J.. P... J Mol Biol 1993. X. Janin. Waller. J. D.... Qiu. Stallings. Freer. 91. Walls. M. L.. Barbara. G. Lambert. 47. P. 7563. Wang. J Am Chem Soc 1996.. H. Kennard. J Phys Chem A (letters) 2005. 527. J. 7020. Rodgers. 43. Erickson. L. Yang. Z.. Rotstein. 26. T. 53. 7427. 8. T. J. Bernstein.. S. 63. J Mol Struct (Theochem) 1998. Z. 191. Z. M. 2004.. D. Williams. Z. R. Michel. Yang.. D. 12.. T. Guan. Acc Chem Res 1989. Villafranca. 5791. 2. Z. B. 6189.. Z. 27. Ghoshs. Lai. F. S. Yang. Z. Penning. J Chem Phys 2006. 32. Q Rev Biophys 1984. D. Zhao. Marelius. 1700. A. N. S. 19. Z. Tempczyk. R.. Krystek.. Chem Phys Lett 2000. 3959. Stemberg.. 795.. Yang. 3005. N. 102. C.. Bruccoleri. S. Liu.. Kollman. J Theor Comput Chem 2007. 661. S. J Biol Chem 1981. L.. Kempe. Z. W.. Liu. W. Bioorg Med Chem Lett 2002. A. W. A. Shenkin. 395. Biochemistry 1992.. Wang. Hawley. 38. C. V. B. Z. J Med Chem 2008. Li. Liu. J. Mark.. 54.. 8. B. T.. G. A. Still... Apostolakis. D. Searle. ¨ Bohm. 75. 112. D.. Bhat. Chothia. J.. Wang. 58. Ann Rev Phys Chem 1992. Z. C. A. C. J. L. Z. D. Tasumi. 93. 379. 6. Stegeman.. P.. J Med Chem 2004. Erickson. 273. Z. 242. M. X. Huang. Marcia. 101. Liu. P. 4683. 37. Y. 51. Musil. Yang. and Yang • Vol. 6646. S. Gong. 9512. 45. J. Stouch. S. 21. J. C.. Jernigan. Rizzo. Gulnik. 2 • Journal of Computational Chemistry 13.. P. Yang. Williams. Yamazaki. Protein Eng 1995. Yang. T. S. 2541. Z.. Jorgensen. K. Appelt. 316. 644. J Med Chem 1993. Z. Dreyer. A. Protein Sci 1996. Tang.. A.. L. E.. P. Z. Protein Eng 1995. D. Oprea. 60. Z.. M. Washington University: St Louis. Y.. 14. Lam. J. 6315. Pak. 35. 10388. H.. H. T... Jadhav. 
J. X. J. Jr.. ˚ Aqvist. 34. 46. 4. Z. J. MaCammon. 101. Eyermann. Yang... J Phys Chem B 2001. Biochemistry 1989... 36. D.. D. Q... 64. K. R. 5. Medina.. C. No. 4735. Wallqvist. 110. 341. Novotny. Q. Y. Ljungberg. 51. Ute.. F.. 191....... Wold. Y. WTINKER—Software Tools for Molecular Design. F. 57. 17. T. Hosur. Stevens. H. C. Zhou. J. Covell. Ponder. Gildehaus. Science 1994. 3517. Smythe.. Y. 120. Proteins Struct Funct Genet 1999. Norbeck. Yang. K.. Adrew. P... 49. ˚ Aqvist. P. Zhao. 243. S. Kempf. Caflisch. Head. Li. Brice. J Chem Phys 2005. E. 430. 17. 256.. Hollinger.. 59..... T. J Phys Chem 1990. 15. R. A. 705. J Mol Biol 1992. Y. Straasma. S. Still. B. W.. 441. C. J... Jorgensen. Wang. J. 6. Meyer. K. lorante. Verkhivker. 731. 20. 4 1881. S. Russell. T.. 68. Yang. R. F. D. A. L. Z. Q.. 18. M. 677. P. Isakson.. T. 094507. 25... F.348 Chen. 118. Cong. 1137. Z. 34. J Chem Theor Comput 2010. J Comput Aided Mol Des 1994. J Phys Chem 1996. D. 407. R. 113. M. Yang.. 535. R.. 67. 66. 249. K. Saul. Wang. N. D. T.. F. Caflisch. 263. 24. 125.. 495. 10. Z. S. Verdonk. Journal of Computational Chemistry DOI 10. A. 064311. Caflisch. Z. S. 267. J Am Chem Soc 1990. R. A.. 123.. 380. 2395. J Am Chem Soc 1994. B. E. Cox. M.. Kollman... 1. A. T. Qian. Chem Rev 1993. Zhang.. 50. 10697. C.. Koetzle. Wesolowski.. 27... 4280... J.. G. C. Z. 324. 28. Z. 62. Q. C. Friesner. X. Wang.. Yu. P. 384. Wideburg.. D. J Am Chem Soc 1991. W. Li. Aqvist. 33. Tomaszek.. Quant Struct Act Relatsh 1991. J Chem Phys 2004. R. M. 13218.. 34. Nature 1996. 65. R. 16. H. Wang. Li. J.. F. A. 29.. Y. D.. C. M.. D. Yu. T. M. Warshel. N. Q. Z. Chem Phys Lett 2005. Kurumbail. Comb Chem High Throughput Screen 1999.. 184. Hanson. Z. 36. Norden. 52. L.. G. 32. E. C. 112. J Theor Comput Chem 2003. Li. A. 7. 31. Wu. 22. Y. Roux.. T. X. J.. Gierse. E. Yang. D. Marshall... G. J. 49. J. Z. J. 61. R. W. MO.. M. M. Lam. J Chem Phys 1999. G. version 42. Nichoson. 114. T. Nature 1975. Wang. Essex. 234. J. 48. W. Wu. P. 
44.. Vandrie. Applet. T. Z. Hodge. J.. Z. Mcdonald. L. A. M.. 228.. H. Z. C. 47. J. Zhao. J.. J. Seibert. Gerhard. Novotny. X. J. Kollman.. P. W. J. 227.. 847. Z. W. 109. X. T. Wang. 31. 5. P. Zhang. 40. 94. Yang. J Mol Biol 1977. Samuelsson. J. Svensson.. J Theor Comput Chem 2006. A. L. Science 1990. Jorgensen. Y. E. J.. 100. 12. J. Hinck. J Phys Chem A 1997. 41. B. 283.. R. J. J. 2. D. Protein Eng 1994. Dixon...1002/jcc . 108. L. C. Murcko. A. Neidhart. Protein Sci 1995. Meek. D. A. D. Karplus. O. Y. 5. 8. K. L. 28. J.. ˚ Aqvist. S. Hendrickson. G. P. 42. 56. W. 385. C. 69. Shimanouchi. 30. Eur J Pharm Sci 2001. J Phys Chem A 2004. Wang. 43. Zhou. S. G. D.