
REVIEW

pubs.acs.org/crt

Improving Drug Candidates by Design: A Focus on Physicochemical Properties As a Means of Improving Compound Disposition and Safety
Nicholas A. Meanwell*
Department of Medicinal Chemistry, Bristol Myers Squibb Research and Development, 5 Research Parkway, Wallingford,
Connecticut 06492, United States

ABSTRACT: The development of small molecule drug candidates from the discovery phase to a marketed product continues to be a
challenging enterprise with very low success rates that have fostered
the perception of poor productivity by the pharmaceutical industry.
Although there have been significant advances in preclinical profiling
that have improved compound triaging and altered the underlying
reasons for compound attrition, the failure rates have not appreciably
changed. As part of an effort to more deeply understand the reasons
for candidate failure, there has been considerable interest in analyzing
the physicochemical properties of marketed drugs for the purpose of
comparing with drugs in discovery and development as a means of capturing recent trends in drug design. The scenario that has emerged is one
in which contemporary drug discovery is thought to be focused too heavily on advancing candidates with profiles that are most easily satisfied
by molecules with increased molecular weight and higher overall lipophilicity. The preponderance of molecules expressing these properties is
frequently a function of increased aromatic ring count when compared with that of the drugs launched in the latter half of the 20th century and
may reflect a preoccupation with maximizing target affinity rather than taking a more holistic approach to drug design. These attributes not
only present challenges for formulation and absorption but also may influence the manifestation of toxicity during development. By providing
some definition around the optimal physicochemical properties associated with marketed drugs, guidelines for drug design have been
developed that are based largely on calculated parameters and which may readily be applied by medicinal chemists as an aid to understanding
candidate quality. The physicochemical properties of a molecule that are consistent with the potential for good oral absorption were initially
defined by Lipinski, with additional insights allowing further refinement, while deeper analyses have explored the correlation with metabolic
stability and toxicity. These insights have been augmented by careful analyses of physicochemical aspects of drug-target interactions, with
thermodynamic profiling indicating that the signature of best-in-class drugs is a dependence on enthalpy to drive binding energetics rather
than entropy, which is dependent on lipophilicity. Optimization of the entropic contribution to the binding energy of a ligand to its target
is generally much easier than refining the enthalpic element. Consequently, in the absence of a fundamental understanding of the
thermodynamic complexion of an interaction, the design of molecules with increased lipophilicity becomes almost inevitable. The application
of ligand efficiency, a measure of affinity per heavy atom, group efficiency, which assesses affinity in the context of structural changes, and
lipophilic ligand efficiency, which relates potency to lipophilicity, offers less sophisticated but practically useful analytical algorithms to assess the
quality of drug-target interactions. These parameters are readily calculated and can be applied to lead optimization programs in a fashion that
helps to maximize potency while minimizing the kind of lipophilic burden that has been dubbed “molecular obesity”. Several recently
described lead optimization campaigns provide illustrative, informative, and productive examples of the effect of paying close attention to
carefully controlling physicochemical properties by monitoring ligand efficiency and lipophilic ligand efficiency. However, to be successful
during the lead optimization phase, drug candidate identification programs will need to adopt a holistic approach that integrates multiple
parameters, many of which will have unique dependencies on both the drug target and the specific chemotype under prosecution.
Nevertheless, there are many important drug targets that necessitate working in space beyond that which has been defined by the retrospective
analyses of marketed drugs and which will require adaptation of some of the guideposts that are useful in directing lead optimization.
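The efficiency metrics named in the abstract are described as readily calculated. A minimal sketch of that arithmetic follows, assuming the commonly used forms LE ≈ 1.37 × pIC50 / (number of heavy atoms) and LLE = pIC50 − clog P. The function names, the 1.37 kcal/mol conversion factor (2.303RT near 300 K), the use of pIC50 as a surrogate for binding free energy, and the example compound are all illustrative assumptions rather than definitions taken from this review.

```python
# Hypothetical helper names. The 1.37 kcal/mol factor is an assumption
# (2.303*R*T near 300 K), and pIC50 stands in for the binding free energy.
KCAL_PER_LOG_UNIT = 1.37


def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """Approximate binding free energy per non-hydrogen atom (kcal/mol per atom)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return KCAL_PER_LOG_UNIT * pic50 / heavy_atoms


def lipophilic_ligand_efficiency(pic50: float, clogp: float) -> float:
    """Potency corrected for lipophilicity: pIC50 minus clog P."""
    return pic50 - clogp


# Made-up compound, for illustration only.
le = ligand_efficiency(pic50=8.0, heavy_atoms=30)
lle = lipophilic_ligand_efficiency(pic50=8.0, clogp=3.0)
print(f"LE = {le:.2f} kcal/mol per heavy atom, LLE = {lle:.1f}")
```

Under these assumptions, adding heavy atoms or lipophilicity without a matching gain in potency lowers LE or LLE, which is the sense in which the two metrics flag "molecular obesity" during lead optimization.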

CONTENTS

1. Introduction
2. Characterizing the Perceived Problems in Contemporary Drug Design
2.1. Analyses of the Structural Elements Associated with Oral Bioavailability
2.2. Time-Related Differences in the Physical Properties of Oral Drugs
2.3. Correlates between Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) Profiles and Calculated Physical Properties
2.4. Physical Properties and Toxicity
2.5. Aromatic Ring Count and the Probability of Successful Drug Development
2.6. Physicochemical Properties and Drug Success: The Effect of sp2 Atom Count
2.7. Physical Properties and Solubility
3. Incorporating Physicochemical Properties into Drug Design
3.1. Molecular Properties Influencing Oral Absorption: Rotatable Bonds
3.2. Physicochemical Properties: Lipophilicity and Oral Absorption
3.3. Golden Triangle Observations: Optimizing Oral Absorption and Clearance
3.4. Physicochemical Properties of CNS Drugs: A Special Case
3.5. Further Defining CNS Drug Space and Candidate Success: A CNS Multi-Parameter Optimization Tool
3.6. Optimizing Drug-Target Interactions: Thermodynamics of Ligand-Protein Binding
3.7. Thermodynamic Signatures and Enthalpy-Optimized Drug Candidates
3.8. H-Bonding in Drug-Protein Interactions
3.9. Ligand Efficiency, Binding Efficiency Index, and Surface Efficiency Index
3.10. Lipophilic Ligand Efficiency: LLE or LipE
3.11. Analysis of the BEI and SEI for 92 Marketed Oral Drugs
3.12. Ligand Efficiency and Molecular Size
3.13. Group Efficiency: An Assessment of the LE of Drug Fragments Used in Lead Optimization
3.14. Analysis of the Ligand Efficiency of Leads and the Resultant Drugs
4. Some Recent Examples of the Application of LE and LLE in Drug Optimization
4.1. Cyclin-Dependent Kinase-2 Inhibitors (Astex)
4.2. Protein Kinase B Inhibitors (Astex)
4.3. Soluble Epoxide Hydrolase Inhibitors (Sumitomo)
4.4. CB2 Agonists (Pfizer)
4.5. CB2 Agonists/CB1 Inverse Agonists (Solvay Pharmaceuticals)
4.6. ATP-Competitive Akt Inhibitors (Pfizer)
4.7. Dual PI3K/mTOR Inhibitors (Pfizer)
4.8. HIV Non-Nucleoside Reverse Transcriptase Inhibitors: The Discovery of Lersivirine (Pfizer)
5. Epilogue
Author Information
Acknowledgment
Abbreviations
References

Received: May 18, 2011
Published: July 26, 2011

© 2011 American Chemical Society | dx.doi.org/10.1021/tx200211v | Chem. Res. Toxicol. 2011, 24, 1420–1456

1. INTRODUCTION

Attrition rates for small molecule drug candidates at various stages of development remain stubbornly high despite several scientific advances that have improved compound profiling and candidate triaging at the preclinical stage.1 Although the pattern of underlying causes of candidate attrition has changed, the failure rate has not, leading to the current perception of low productivity by the pharmaceutical industry. This phenomenon is set against the backdrop of a rapidly rising investment in research and development activities that has not been translated into a commensurate increase in drug approvals, a paradigm that has driven the continuous increase in the estimated costs of developing a drug to launch (data summarized in Table 1). These statistics have raised questions and considerable concern about the sustainability of such an enterprise.2,3

Poor pharmacokinetic (PK) properties were estimated to be an important contributor to clinical failure in 1991 (data summarized in Table 2).1 However, within 10 years, poor human PKs had declined significantly as a source of candidate failure, contributing to the demise of just 8% of candidates in 2000 compared to 39% in 1991 (Table 2), a statistic that reflects advances in preclinical profiling and improved methods of extrapolating in vitro and in vivo PK data to humans. Interestingly, commercial reasons and cost of goods have emerged as more significant sources of drug failure, presumably a consequence of a more competitive business environment. Formulation issues and toxicity have also increased as sources of attrition, the former reflecting the challenging physicochemical properties of contemporary drug candidates, while the latter may be due to an increase in the number of compounds advancing beyond phase 1 PK studies. As part of an initiative by the medicinal chemistry community to more deeply understand the reasons behind drug failure, there has been considerable interest over the past decade in trying to define the physicochemical properties of drug candidates that predict long-term viability. These studies have been conducted with a view to incorporating the insights prospectively into contemporary drug design programs as a means of improving candidate quality. Several analyses of marketed drugs have attempted to equate physicochemical properties with success, by necessity a retrospective analysis of drugs approved in the latter half of the 20th century.4–10 What has transpired is a scenario in which contemporary drug design appears to be focused too heavily on addressing ever more challenging targets by relying upon structural fragments that display a high dependence on sp2-based ring systems and which are all too easily assembled into molecules with increased molecular weight (MW) and high overall lipophilicity.4–12 This movement, which has been characterized as driving potency by relying on molecular obesity,11 appears to be a function of identifying drug candidates that are more heavily dependent on entropic rather than enthalpic contributions to the thermodynamics of drug-target interactions.11–13 Although there is a heightened awareness of the importance of the physicochemical properties of a drug candidate to its long-term success, the integration of multiple parameters into drug design will be essential if improved molecules with greater potential to succeed are to be identified. However, it remains to be seen just how successful this enterprise will be to pharmaceutical industry productivity, and given the length of time it takes to develop a drug, it may be a decade or more before the impact of any changes in practices in drug design may be felt.

Table 1. Research and Development Spending, Drug Approvals, and Estimated Costs of Developing a Drug

The PhRMA, non-PhRMA, and total columns give research and development spending (millions of $); the NMEs and BLAs columns give drug approvals.

year  PhRMA  non-PhRMA  total  NMEs(a)  BLAs(b)  estimated cost of developing a drug (millions)
1979                                             $100
1991                                             $300
1995  15.2
1996  16.5                     53       3
1997  18.8                     39       6
1998  21.0                     30       7
1999  22.3                     35       3
2000  26.0                     27       2        $800
2001  29.8                     24       5
2002  31.0                     17       7
2003  34.5                     21       6
2004  37.0   10.8       47.8   31       5
2005  39.9   11.9       51.8   18       2        $1300
2006  42.4   15.7       58.1   18       4
2007  47.9   13.3       61.2   16       2
2008  47.4   14.3       61.7   21       2
2009  46.8   19.5       66.3   19       6
2010                           15       6
(a) New molecular entity. (b) Biologic license application.

Table 2. Estimated Sources of Drug Candidate Failure in 1991 and 2000

source of failure    1991   2000
lack of efficacy     30%    24%
PK/bioavailability   39%    8%
clinical safety      10%    12%
toxicity             11%    19%
commercial           5%     19%
cost of goods        0%     8%
formulation          0%     4%
other/unknown        5%     6%

Figure 1. Lipinski’s “rule of 5” for predicting drug permeability.

This review presents a synopsis of recent studies published in the medicinal chemistry literature and captures some of the more prominent physicochemical guideposts that have been developed as useful aids to decision making during lead optimization. The material summarized includes published elements of presentations made as part of a symposium entitled Improving Drug Candidates By Design: A Focus on Physical Properties to Improve Disposition and Safety convened by Nicholas A. Meanwell and F. Peter Guengerich under the joint sponsorship of the Divisions of Chemical Toxicology and Medicinal Chemistry at the 240th American Chemical Society National Meeting held in Boston, Massachusetts on Tuesday, August 24th, 2010.14 Speakers at the session, in order of appearance, were: Dr. James Empfield (AstraZeneca), Physicochemical and Pharmacological Properties as Predictors of Drug Safety and Success; Dr. Simon J. F. Macdonald (GlaxoSmithKline), Aromatic Ring Count and Compound Developability; Dr. Stephen Johnson (Bristol-Myers Squibb), Molecular Matched Pairs Derived QSAR for the Optimization of ADMET Properties; and Dr. Travis T. Wager (Pfizer), Moving Beyond Rules: The Development of a Central Nervous System Multi-Parameter Optimization (CNS MPO) Approach to Enable Alignment of Drug-Like Properties.

2. CHARACTERIZING THE PERCEIVED PROBLEMS IN CONTEMPORARY DRUG DESIGN

The properties of a molecule are inherent to its structure, and once synthesized, all further studies of a drug candidate during development are essentially focused on understanding its biological activities, metabolism and pharmacokinetics, toxicological profile, and pharmaceutical properties. The fundamental attributes of a molecule can only be managed as it progresses through the successive phases of drug development, and the factors involved in the long term success of a drug are still somewhat enigmatic. The rising awareness that decisions made during lead optimization are of critical importance to the ultimate success of a drug candidate has led to a developing belief that molecules can be designed more effectively if physicochemical principles are given due consideration and applied in a constructive fashion.9 This is dependent on embracing a deeper understanding of the physicochemical aspects of a molecule and its interaction with its biological receptor as well as the alternate proteins that give rise to off-target toxicities, metabolizing enzymes, and the biological membranes encountered in vivo that modulate drug delivery. At a fundamental level of drug design, this involves avoiding structural elements associated with poor outcomes (toxicophores) and maximizing drug-target association in a fashion that reduces dependence on entropy (lipophilicity) by increasing enthalpy-based interactions.12,13 However, the successful oral delivery of a drug candidate necessitates orchestrating a compromise between the properties that confer high potency, reasonable pharmaceutic properties, high membrane permeability, and acceptable metabolic stability.15–22 Although there has been a significant and inevitable focus on understanding the latter elements, interest has begun to evolve toward understanding the role of physicochemical properties in predicting toxicological outcomes as a means of enhancing overall candidate viability.

2.1. Analyses of the Structural Elements Associated with Oral Bioavailability. The landmark assessment of the physicochemical properties associated with the oral bioavailability of drugs and advanced candidates conducted by Christopher Lipinski and his colleagues at Pfizer that has been codified in the simple mnemonic known as the “rule of 5” stimulated considerable interest in providing a deeper and broader perspective on this algorithm.16 The “rule of 5” predicted the potential for a compound to exhibit good absorption and was based on an
analysis of 2,245 drugs captured in the World Drug Index prior to 1995 that were selected on the basis of consistency with clinical exposure or occurrence in the United States Adopted Names (USAN) or International Nonproprietary Names (INN) databases. This collection was designated as the USAN library, and the physicochemical parameters analyzed were log P, MW, polar surface area (PSA), and the number of H-bond donors (HBDs) and acceptors (HBAs). The analysis related the potential for good oral bioavailability to the physicochemical boundaries summarized in Figure 1, with permeability potentially compromised for compounds that violated 2 or more of the rules. The two notable exceptions recognized by Lipinski were natural products and drugs that are substrates of transporters.

Further insights emerged from an analysis of over 1,100 preclinical compounds in the SmithKline Beecham Pharmaceuticals collection that equated a series of physicochemical properties with oral bioavailability in the rat. The results confirmed the Lipinski observations but added an additional factor for consideration, the number of rotatable bonds (RBs) in a molecule.17 A total of ≤10 rotatable bonds was associated with good oral exposure, and this criterion, when combined with a PSA of ≤140 Å², was considered to be sufficient to predict that a compound would exhibit a high probability of showing ≥20% oral bioavailability in the rat. However, a subsequent analysis of 434 Pharmacia compounds culled from several therapeutic areas indicated that the correlation between oral exposure in the rat and the number of rotatable bonds was less stringent.18 Within some projects, compounds possessing 15–20 rotatable bonds were associated with acceptable exposure, providing a cautionary note to the generalization of this property.18 In humans, 13 rotatable bonds have been identified as an upper limit to predict ≥20% oral bioavailability based on an analysis of 1,014 marketed drugs.19

2.2. Time-Related Differences in the Physical Properties of Oral Drugs. Paul Leeson and colleagues at AstraZeneca have conducted several analyses of the changes in drug properties over time in an effort to identify and understand underlying trends.7,9,10 The initial study focused on comparing the physicochemical properties of 864 orally administered drugs launched prior to 1983 with 329 molecules launched between 1983 and 2002.7 The physicochemical properties that formed the basis of the evaluation were MW, clog P, PSA, the number of HBDs (OH and NH) and HBAs (O and N atoms), the number of RBs, and the number of rings in a molecule. In addition, the analysis was extended to a comparison of the properties of drugs across 5 of the major therapeutic areas launched between 1983 and 2002 in order to provide insight into any differences based on the nature of disease targets. The therapeutic areas that formed the basis for this study were categorized as cardiovascular, gastrointestinal and metabolic, infectious diseases, nervous system, and respiratory and inflammation.

Table 3. Time Related Changes in the Physicochemical Properties of Orally Bioavailable Drugs Launched before 1983 Compared to Those Launched from 1983–2002

property         pre-1983   1983–2002   Δ mean values
number           864        329
MW               331        377         14%
clog P           2.27       2.50        10%
% PSA            21.1       21.0        0%
Σ OH + NH        1.81       1.77        -2%
Σ O + N          5.14       6.33        23%
Σ HBA            2.95       3.74        27%
rotatable bonds  4.97       6.42        28%
rings            2.56       2.88        13%

Table 4. Comparison of the Physicochemical Properties of Orally Bioavailable Drugs Launched Pre-1983 with Those Launched between 1983 and 1992, and 1993 and 2002 (All Data Are Mean Values)

property         pre-1983   1983–1992   1993–2002   Δ mean values
number           864        175         154
MW               331        374         382         2.1%
clog P           2.27       2.39        2.61        9.2%
% PSA            21.1       20.9        21.2        1.4%
Σ OH + NH        1.81       1.75        1.80        2.9%
Σ O + N          5.14       6.33        6.32        -0.2%
Σ HBA            2.95       3.66        3.82        4.4%
rotatable bonds  4.97       6.29        6.58        4.6%
rings            2.56       2.77        3.02        9.0%

These analyses revealed several trends in drug properties that differed between those launched before 1983 and those marketed between 1983 and 2002. Mean and median MW, the sum of O and N atoms, HBAs, RBs, and the number of rings all increased, while clog P, % PSA, and the sum of HBDs (OH and NH) were not significantly different (Table 3).7 Drugs launched between 1983 and 2002 were found to be an average of 46 Da larger than those in the pre-1983 data set, with 6.7% of the pre-1983 drugs violating Lipinski’s MW rule of >500 Da, a number that almost doubled to 11.3% of the drugs launched between 1983 and 2002. However, the increase in MW was not accompanied by an increase in mean lipophilicity, an observation that suggested an increase in the incorporation of polar or H-bonding elements in the 1983–2002 drug set. Indeed, the number of O and N atoms increased in the latter drug set compared to that in pre-1983 drugs, but the number of HBDs was unaltered, attributed to their importance as determinants of oral bioavailability. This presumably reflects a Darwinian-like effect on drug attrition that naturally selects compounds with both good permeability and good overall properties in preclinical species. The number of rings also increased by 13% to 2.88 from the mean of 2.56 noted for pre-1983 drugs.

An interesting observation revolved around differences in the distribution of PSA between the two data sets. The % PSA distribution of the 1983–2002 drugs was found to be narrower than that for the pre-1983 data set, with the 10–90th percentile of % PSA spanning 4.5–39.5% for the pre-1983 drugs. This contrasts with the 1983–2002 cohort where the distribution was 25% narrower, ranging from 9.9 to 35.9%, an observation thought to be related to increases in drug size and complexity. This notion was supported by the higher number of RBs occurring in the 1983–2002 drug set compared to that in the pre-1983 collection, a statistic that on the surface would appear to offer some counterbalance to the finding that oral bioavailability in preclinical species decreased with increasing rotatable bond count in the proprietary set of compounds.17,18 However, the number of rotatable bonds in both marketed drug sets is significantly lower,
4.97 in the pre-1983 set and 6.43 in the 1983–2002 compounds, than the average of 8.19 (upper and lower quartile numbers averaged 6.17 and 10.22) for the GlaxoSmithKline preclinical data set.17 Moreover, the average MW of both marketed drug data sets was below that at which rotatable bond count appeared to exert a significant influence on oral bioavailability in the rat.17

Leeson also examined changes within the 1983–2002 drug set, dividing the analyzed drugs by decade in an effort to more effectively capture contemporary trends. Although all physicochemical properties exhibited upward trends, none achieved statistical significance (Table 4). However, of particular note, oral drugs launched between 1993 and 2002 contributed to the increases in both clog P and the number of rings observed in the 1983–2002 data set when compared to those in the pre-1983 drug cohort.

The profile of drug properties across therapeutic areas highlighted significant variation, particularly between anti-infective and neuroscience drugs which exhibited the most extreme properties, as summarized in Table 5.7,23,24 Anti-infective drugs possessed the highest mean MW, the lowest mean lipophilicity, and the highest HBA and O and N atom counts.7,23 In contrast, and presumably reflecting the restrictive nature of the blood-brain barrier since most drugs included in the data set acted centrally, nervous system drugs showed the lowest mean MW, the lowest mean HBA, O and N count, and the fewest rotatable bonds.7,24 CNS drug space represents a special circumstance that will be discussed in more detail later. Drugs in therapeutic categories beyond anti-infective exhibited a similar distribution of lipophilicity, emphasizing the importance of this property to oral bioavailability irrespective of therapeutic area, with the conclusion that this may be a more stringent drug-like property than MW.

Table 5. Comparison of the Physicochemical Properties of Orally Bioavailable Drugs Launched from 1983–2002 Arranged by Therapeutic Category (All Data Are Mean Values; Number of Drugs in Parentheses)

property         CV (79)  NS (74)  GI and Met (38)  infection (64)  respiratory and inflammation (46)  cancer (14)  other (14)
MW               389      310      378              456             396                                313          309
clog P           3.05     2.50     1.90             1.56            3.34                               3.02         1.93
% PSA            19.8     16.3     26.7             24.6            20.5                               20.8         22.9
Σ OH + NH        1.46     1.50     2.71             2.41            1.37                               1.00         1.64
Σ O + N          6.73     4.32     6.84             8.78            6.17                               4.50         4.29
Σ HBA            3.77     2.12     4.34             5.28            4.24                               2.86         2.64
rotatable bonds  8.23     4.70     7.63             6.83            5.52                               5.00         4.57
rings            2.84     2.85     2.32             3.45            3.02                               2.36         2.36

John Proudfoot scrutinized the physicochemical properties of 1791 drugs marketed in the 60 years spanning 1937–1997, a collection of compounds from which diagnostic, metal-containing drugs, and unmodified natural products were specifically excluded.8 In this data set, median MW increased over time, with drugs launched in the period 1937–1950 generally <300 Da, while a MW of >400 Da was more frequently encountered in the drugs launched after 1980 (statistics compiled in Table 6). Only 7% of the complete data set exhibited a MW above the Lipinski guideline of 500 Da. Indeed, the steady increase in MW was the most notable change in properties observed, with just 7 drugs in the 1937–1951 data set having a MW >500 Da, while 15 drugs in the 1983–1997 drug set had a MW >500 Da. The median Alog P showed no upward or downward trend over the time frame analyzed, with 8.5% of compounds exhibiting an Alog P of greater than 5 and 5.2% less than 1, which compares with the Lipinski guideline of a clog P < 5. These results were generally similar to those observed in the Leeson analysis. The median HBD count increased significantly over time but less than 1.1% of the data set incorporated more than 5 HBDs, reflecting the relationship between the number of HBDs and absorption and the observation that HBDs are frequently involved in phase 2 metabolism. It was noted that median HBA count began to increase in the mid-1970s, with more substantial increases observed in drugs launched in the 1990s, although the latter comprised a relatively small cohort. Compounds with more than 10 HBAs amounted to 4.8% of the data set, and only 0.6% of the 1791 drugs possessed both a MW over 500 Da and more than 5 HBDs, two of the Lipinski rules. Compounds combining a MW of >500 Da with more than 3 HBDs comprised 2% of the data set, while the combination of a MW of >500 Da and an Alog P of >5 occurred in 2% of the drugs, and less than 5% contained more than 4 H-bond donors.

Table 6. Mean Physicochemical Properties of 1791 Drugs Marketed Between 1937 and 1997

property                mean   90th percentile
# of oral drugs         1791
MW                      333    469
log P                   2.5    4.8
H-bond donors           1.5    3
H-bond acceptors        5.1    9

Since the launch of a drug typically occurs a decade or more after its design and discovery, an examination of the physicochemical properties of marketed drugs will not adequately capture contemporary practices. In an attempt to examine this phenomenon in more detail, Leeson and Springthorpe compared the physicochemical properties of 592 oral drugs launched between 1983 and 2007 with those of compounds disclosed in patent applications originating from 4 major pharmaceutical houses that were published between 2001 and 2007.9 Merck, AstraZeneca, Pfizer, and GlaxoSmithKline were selected as substrates for the patent estate evaluation on the basis of their interest in a broad range of therapeutic areas and substantial productivity. In the drug data set, there was a median of 10.5 years between the publication and launch dates, supporting the notion that emerging practices in drug design are more likely to be captured by profiling compounds disclosed in recent patent applications. Of particular interest, a temporal analysis that looked at trends in clog P and MW of compounds disclosed in patent applications published by the individual companies was also included.

Two notable basic trends that emerged from the analysis of the 592 approved drugs were an increase of both the median clog P and
Table 7. Observations and Trends Captured by an Analysis of Key ADMET and Calculated Descriptors of Preclinical Compounds at GlaxoSmithKline

Each entry gives the function, the number of compounds evaluated in parentheses, and the key observations and trends.

solubility (44,584)
    As MW increases, solubility decreases.
    Solubility is dependent on the ionization state with acids generally more soluble than bases, attributed to the ionization state in the pH 7.4 buffer.
    Solubility correlates negatively with clog P although reducing clog P to <3 brought neutral molecules into the solubility range of ionizable molecules.
    The effects of MW and clog P on solubility were found to be independent, with a minimal correlation between the two parameters, r2 = 0.096.

permeability across an artificial membrane (50,641)
    Permeability decreased as MW increased.
    Acids and zwitterions were the least permeable, neutral molecules were the most permeable, and basic compounds fell in between.
    For acids, bases, and zwitterions, permeability increased with clog P.
    The permeability and clog P of neutral molecules showed a nonlinear relationship.

bioavailability in the rat (4,431)
    The average bioavailability of compounds with MW <300 was ∼18% and 10% for MW >700.
    The bioavailability of neutral (15%), basic (13%), zwitterionic (10%), and acidic (18%) molecules showed a limited dependence on the ionization state.
    There was no statistically significant correlation between bioavailability and clog P.

volume of distribution (Vd) (9,375)
    As MW increased, the log Vd increased, but the relationship was very weakly correlated.
    Basic molecules showed the highest Vd, acids had the lowest Vd, and neutral and zwitterionic molecules fell in between. This was attributed to acids generally binding tightly to serum albumin and bases readily associating with negatively charged phospholipid membranes.
    The Vd of neutral and basic molecules increased with clog P, but that was not the case for acids and zwitterions.

CNS penetration (3,059)
    CNS penetration decreased as MW increased. Compounds with MW <300 had brain/blood ratios of 2.2 compared to 0.1 for MW >700.
    CNS penetration was dependent on ionization state descending in the order basic > neutral > zwitterionic > acidic molecules.
    CNS penetration increases with clog P, but this correlation is weaker than that for MW.

P-gp efflux (1,975)
    The P-gp efflux ratio increased with MW.
    Ionization state played only a minor role in P-gp efflux, descending in the order zwitterionic > neutral = basic > acidic molecules.
    The correlation between clog P and P-gp efflux was weak, but molecules with clog P between 3 and 5 had higher mean ratios than those with clog Ps <3 or >5.

plasma protein binding (2,939)
    Plasma protein binding increased with MW, averaging 72% for MW <300, 54% for MW 300–500, and 98.2% for MW 500–700.
    Ionization state influenced protein binding with the order acidic > neutral > zwitterionic > basic molecules.
    As clog P increased, plasma protein binding increased.

brain tissue binding (986)
    Brain tissue binding increases as MW increases.
    Larger molecules bound more tightly to both plasma proteins and brain tissue.
    Brain tissue binding was not dramatically affected by ionization state, which contrasts with the observations in plasma protein binding.
    More lipophilic compounds bound more tightly to brain tissue.

in vivo clearance (11,490)
    No significant relationship between clearance and MW.
    Acidic molecules showed lower clearance than neutral and zwitterionic compounds, while bases were generally cleared most rapidly.
    There was a weak but significant correlation between an increase in clog P and increased clearance.

1425 dx.doi.org/10.1021/tx200211v |Chem. Res. Toxicol. 2011, 24, 1420–1456


Chemical Research in Toxicology REVIEW

Table 7. Continued
function    number of compounds evaluated    key observations and trends

hERG inhibition 35,200 Mean pIC50 values increased as MW increased.


Neutral and acidic molecules exhibited weaker hERG inhibition than zwitterions and bases.
Mean pIC50 values increased as clog P increased for bases and zwitterions but not for acids and neutral molecules.

P450 1A2 inhibition 49,837 Mean pIC50 values decreased as MW increased, reflecting the narrow active site of this enzyme.
There was no significant correlation between ionization state and 1A2 inhibition.
Lipophilicity exerted a limited effect on pIC50 values.

P450 2C9 inhibition 51,097 There was a parabolic relationship between MW and 2C9 inhibition, with compounds with a MW of
300–700 exhibiting a more potent inhibitory effect.
Neutral and acidic molecules exhibited higher affinity for 2C9 than bases and zwitterions.
2C9 inhibition increased with higher clog P values, and the magnitude of this effect was more pronounced for neutral
and acidic molecules.

P450 2C19 inhibition 48,464 2C19 inhibition showed minimal dependency on MW and ionization state, but affinity increased with clog P.

P450 2D6 inhibition 50,886 There was a weak parabolic relationship between 2D6 affinity and MW.
Basic molecules were more potent inhibitors than zwitterions > neutral > acidic compounds.
Molecules with higher clog P values exhibited greater inhibitory activity, an effect that was more pronounced for
neutral, basic, and zwitterionic molecules than acids.

P450 3A4 inhibition 42,987 Mean pIC50 values increased with MW.
Neutral molecules were more potent 3A4 inhibitors than basic or zwitterionic compounds which were more
potent than acids.
An increase in clog P was associated with an increase in the potency of 3A4 inhibition.

MW with time, while the number of drugs approved per year worldwide decreased between 1983 and 2006 as did the proportion of drugs with a MW of less than 350 Da. Trends of increasing clog P and MW compared to the marketed drug data set were observed in compounds abstracted from AstraZeneca, GlaxoSmithKline, Merck, and Pfizer patent applications. The median MW of compounds in the patent applications was 450 Da, and the median clog P was 4.1, figures that compared with a median MW of 432 Da and a median clog P of 3.1 for orally bioavailable drugs launched between 1990 and 2007. Pfizer, however, represented a notable exception from the other 3 pharmaceutical houses, with compounds disclosed in their patent applications generally exhibiting a lower median MW and clog P. This reflects a heightened awareness of the importance of controlling physicochemical properties in the design of drug candidates that presumably has its origin in Lipinski’s analysis of drug oral bioavailability. This phenomenon is reflected in several in depth analyses of the relationship between physicochemical properties and drug developability that have been published by Pfizer scientists over recent years.
2.3. Correlates between Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) Profiles and Calculated Physical Properties. Using a principal components analysis, Gleeson analyzed data from a large and structurally diverse set of preclinical compounds profiled at GlaxoSmithKline seeking a correlation between 15 ADMET assays and 12 calculated physicochemical descriptors.20 The ADMET assays comprised solubility, permeability across an artificial membrane, oral bioavailability, clearance, volume of distribution (Vd) and CNS penetration in the rat, rat brain tissue and plasma protein binding, P-gp efflux, inhibition of hERG, and inhibition of the P450 isozymes 1A2, 2C9, 2C19, 2D6, and 3A4, while the physicochemical descriptors evaluated were log P, log D, HBAs, HBDs, positive and negative ionization states, molecular flexibility, molar refractivity, MW, PSA, RBs, and HAC. The observations and trends are captured synoptically in Table 7 and provide useful insights into the preferred physicochemical properties that satisfy basic in vitro and in vivo profiling criteria. Ionization state, clog P, and MW were considered to be the most useful predictors of potential ADMET problems with MW and clog P identified as the key drivers, leading to the succinct conclusion that preferable molecules were those with a MW of <400 and a clog P of <4 (ref 20).

Table 8. Odds Ratios for the Appearance of Toxicity in Vivo in a Pfizer Data Set of 245 Compounds Based on TPSA and clog P (ref 25)

              total drug                      free drug
              TPSA >75 Å²    TPSA <75 Å²      TPSA >75 Å²    TPSA <75 Å²
clog P < 3    0.39 (57)      1.08 (27)        0.38 (44)      0.5 (27)
clog P > 3    0.41 (38)      2.4 (85)         0.81 (29)      2.59 (61)

2.4. Physical Properties and Toxicity. An examination of the relationship between physicochemical properties and toxicity observed in animal in vivo toleration studies of 245 preclinical candidates nominated for development at Pfizer between

2002 and 2006 provided interesting insights into compound survival.25–27 The data set comprised 50% basic, 40% neutral, and 10% acidic molecules, and both measured and calculated properties were used for the analysis.25 PK data were available for all of the compounds, and correlates with both the free and total drug concentrations were evaluated, with 10 μM set as the toxicity threshold for total drug and 1 μM for free drug concentrations. The descriptors that emerged as being most closely related to the observation of toxicity were topological polar surface area (TPSA) and clog P. Thresholds were set at a TPSA of 75 Å² and a clog P of 3 for the analysis of toxicity odds, the ratio of toxic to nontoxic compounds, which were determined for the data set and are captured in Table 8. Compounds with a low clog P and a high TPSA were found to be 2.5-fold less likely to be toxic, while those with a high clog P and a low TPSA were 2.5-fold more likely to be toxic. This afforded an odds ratio of >6 between the two extremes, and the results were similar whether calculations were based on total or free drug concentration in plasma. The presence of a single risk factor was found to moderately increase the potential for the appearance of toxicity, but compounds with both risk factors showed a significant propensity to manifest problems. As noted by the authors, a low TPSA is associated with deeper tissue penetration, which would increase the chances for toxicity, although, interestingly, there was no apparent correlation between the volume of distribution and toxicity.
Biochemical promiscuity was also assessed by analyzing CEREP BioPrint profiling data, available for 108 compounds that had been evaluated in 48 assays, with promiscuous behavior defined as >50% activity at a concentration of 10 μM in 3 or more assays. Biochemical promiscuity showed a similar correlation with TPSA and clog P to the observation of toxicity in vivo, with the strongest effect again seen when both risk factors were present (Table 9). Compounds with a high clog P and a low TPSA were 25-fold less likely to show specificity than those exhibiting a low clog P and a high TPSA.25 Thus, less polar, more lipophilic compounds were more likely to be problematic, an observation that demonstrated consistency across a broad range of toxicities and chemical structural matter.

Table 9. Odds Ratios for Biochemical Promiscuity in a Pfizer Data Set of 108 Compounds, Defined As >50% Activity in 3 in Vitro Assays, on the Basis of TPSA and clog P

promiscuity    TPSA >75 Å²    TPSA <75 Å²
clog P < 3     0.25 (25)      0.8 (18)
clog P > 3     0.44 (13)      6.25 (29)

An interrogation of the biochemical fidelity associated with a set of 3138 compounds at Novartis provided additional insights into correlates with physicochemical properties while also identifying structural elements that are more frequently associated with promiscuity.28 The modeling exercise was based on 2512 compounds selected randomly, with the remaining 626 compounds and 119 marketed drugs used as test sets. The compounds analyzed had been screened in at least 50 of 79 assays, the majority of which were G-protein coupled receptors, and a target hit-rate parameter (THR10) was developed as the index of promiscuity, defined as follows:

$$\mathrm{THR}_{10} = \frac{\text{no. of targets inhibited by 50\% at 10 } \mu\text{M}}{\text{no. of targets tested}}$$

A THR10 of ≥20% was used as the criterion to define a promiscuous compound, while compounds with a THR10 of ≤5% were considered to be selective and those with a THR10 in the 5–20% range viewed as being of moderate promiscuity.28 This analysis allowed categorization of the data set into 604 (24%) promiscuous compounds, 1171 (47%) selective molecules with the remaining 737 (29%) falling into the moderately promiscuous category. MW and Alog P were higher for promiscuous compounds by 10 and 33%, respectively, than for selective compounds, and compounds exhibiting promiscuity incorporated a significantly higher number of rings and rotatable bonds (data summarized in Table 10). Promiscuous compounds also possessed a higher number of N atoms but a lower number of O atoms than selective compounds, possibly reflecting problems with the presence of basic amines, while the prevalence of HBDs and HBAs was not significantly different across the 3 categories.28

Table 10. Physicochemical Properties Associated with 2,512 Compounds from a Novartis Data Set Defined As Promiscuous, Moderately Promiscuous, and Selective on the Basis of an Analysis of Activity in Biochemical Assays (a)

              all compounds    promiscuous    moderately promiscuous    selective
MW            460              493            472                       436
Alog P        3.7              4.4            3.9                       3.3
HBA           5.2              5.4            5.2                       5.2
HBD           2.0              2.1            2.1                       1.9
O count       3.0              2.5            2.8                       3.3
N count       4.0              4.6            4.2                       3.6
rot bonds     7.0              7.7            7.2                       6.6
ring count    4.0              4.6            4.3                       3.6
(a) The data presented are mean numbers.

A deeper analysis that sought to equate the appearance of specific substructures with compound promiscuity revealed a preponderance of indole, furan, and piperazine heterocycles in the promiscuous compound data set, while carboxylic acids were found to be more frequently associated with the selective class. The latter observation was attributed to the presence of a negative charge potentially leading to unfavorable interactions between a drug candidate and proteins, a tactic well-known to be helpful in avoiding inhibition of the hERG ion channel.29 Compounds containing tetrazole and sulfonamide moieties exhibited little difference in their distribution across the promiscuity landscape, although an indication of the presence of overt acidity in these elements was not provided. The overall conclusion of the synopsis was that small hydrophilic compounds with a carboxylic acid moiety were the most likely to show selectivity while bulky, hydrophobic amines were the most likely to distribute to the promiscuous set. The authors developed a naïve Bayesian model using a range of descriptor fingerprints that compared the frequency of features between selective and promiscuous sets of compounds. The best models showed lower promiscuity for marketed drugs than those in early development or those that failed during clinical development. Interestingly, the majority of the marketed drugs that were predicted to be promiscuous were compounds that targeted the central nervous system.28
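The THR10 index and the cut-offs quoted for the Novartis analysis (≥20% promiscuous, ≤5% selective, otherwise moderately promiscuous) can be illustrated with a minimal Python sketch. The function names `thr10` and `classify_thr10` are hypothetical and chosen only for this illustration, and the hit counts in the usage loop are made-up inputs rather than data from the study. The sketch assumes the caller has already counted how many targets met the inhibition criterion at 10 μM, and it does not enforce the requirement that a compound be screened in at least 50 of the 79 assays.

```python
def thr10(n_hits: int, n_tested: int) -> float:
    """Return THR10: targets meeting the hit criterion at 10 uM / targets tested."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_hits <= n_tested:
        raise ValueError("n_hits must lie between 0 and n_tested")
    return n_hits / n_tested


def classify_thr10(n_hits: int, n_tested: int) -> str:
    """Apply the >=20% and <=5% cut-offs quoted in the text (labels are illustrative)."""
    thr10(n_hits, n_tested)  # reuse the input validation above
    # Integer comparisons avoid floating-point edge cases exactly at the cut-offs.
    if 5 * n_hits >= n_tested:    # THR10 >= 20%
        return "promiscuous"
    if 20 * n_hits <= n_tested:   # THR10 <= 5%
        return "selective"
    return "moderately promiscuous"


if __name__ == "__main__":
    # Hypothetical hit counts for three compounds, each tested against 60 targets.
    for hits, tested in [(12, 60), (6, 60), (3, 60)]:
        print(hits, tested, f"{thr10(hits, tested):.0%}", classify_thr10(hits, tested))
```

A compound hitting 12 of 60 targets sits exactly on the 20% boundary and is therefore classed as promiscuous under the stated criterion, while 3 of 60 sits exactly on the 5% boundary and is classed as selective.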

At Roche, a cohort of 213 compounds originating from 62 projects that had been profiled in a series of biochemical assays between 2004 and 2007 was assessed for promiscuity, defined by counting the number of off-target hits.30 The compounds were divided into 2 categories, those exhibiting ≥30% and ≥90% inhibition at a concentration of 10 μM, and the molecular descriptors analyzed for a potential relationship with promiscuity were clog P, clog D, MW, and pKa. The compounds demonstrating ≥30% inhibition in one assay revealed that pronounced promiscuity was not associated with molecules with a clog P of <2 or a clog D of <1. A large variation in specificity was observed for compounds with a clog P in the 2–6 range, while those with a clog P of >6 were classified as moderately promiscuous. Positively charged compounds were more prone to exhibit promiscuity than neutral or negatively charged compounds, and promiscuity increased with pKa. This result corroborates similar observations made from an analysis of toxicity in compounds synthesized at AstraZeneca, and in this data set, there was no correlation between promiscuity and MW.27
However, when the analysis focused on compounds associated with ≥90% inhibition in any of the in vitro assays, the potential for promiscuity was found to be highly dependent on the presence of a positive charge, and all compounds that interacted with >4% of targets incorporated a basic element.30 In particular, off-target activity was common among several series of compounds designed as ligands for aminergic GPCRs or reuptake transporters, which together accounted for 49% of the compounds in this category. Over half (55%) of basic compounds from nonaminergic GPCR programs also exhibited promiscuity, confirming that the presence of a basic amine leads to reduced specificity regardless of target class.27,30
A potentially confounding observation revolved around several highly lipophilic compounds for which clog P values exceeded 6 and which were associated with low promiscuity. Most of these compounds were also poorly soluble, <10 μM, raising the concern that precipitation may have obscured biological effect. To address this issue, an analysis of the relationship between solubility and promiscuity in the broader set of compounds was conducted, with the result that both poorly and highly soluble compounds were similarly distributed across the promiscuity spectrum, suggesting that low solubility was not a contributing factor.
The behavior associated with 2,133 drugs that had been evaluated in 200 assays from the Cerep BioPrint database was assessed on the basis of the criterion that a molecule expressing >30% inhibition toward a target at a concentration of 10 μM was considered to be promiscuous.9,31 Bases and quaternary bases exhibited higher levels of promiscuity than zwitterions, acids, or neutral compounds, while increased lipophilicity was associated with reduced specificity irrespective of the ionization class. An apparent relationship in which promiscuity increased with the number of rings in a molecule could not be distinguished from lipophilicity. The relationship between MW and specificity for this data set was complicated and failed to reproduce earlier observations with a larger cohort of compounds in which promiscuity declined as MW increased.9,32
An increase in the lipophilicity of drug molecules has also been equated with a higher propensity for inhibition of the hERG cardiac ion channel, an activity associated with arrhythmogenic potential that is manifested in clinical studies as torsades de pointes.29,33 Insight was gleaned from an analysis of 7,685 AstraZeneca compounds for which hERG inhibition data were available in a proprietary whole-cell electrophysiology-based screening assay.33 Compounds were divided into acidic, basic, neutral, and zwitterionic species, with the distribution of ion class across this data set compiled in Table 11 along with hERG inhibition data and mean lipophilicity profiles. The lipophilicity data are based on experimental log D measured at pH 7.4, available for 1,211 compounds, or calculated as AZlog D using a proprietary algorithm. The data were analyzed and presented as the probability of achieving a compound with a hERG IC50 of >10 μM for the individual ion classes in relationship to their lipophilicity (results summarized in Table 12). Basic compounds exhibited the highest propensity to inhibit hERG, exacerbated by increased AZlog D, while acids, zwitterions, and neutral compounds offered lower potential, although a higher AZlog D correlated with increased inhibitory activity. These data were configured to provide a suggested upper limit of lipophilicity for each ion class that would offer improved odds of reducing the potential to encounter hERG inhibition (Table 13). Lipophilic basic molecules showed the lowest probability to achieve a hERG IC50 of >10 μM, and a clog P of <2 was suggested as the target for achieving a 70% chance of avoiding potent hERG inhibition. Lipophilic acids, neutral compounds, and zwitterions were considerably less problematic with a wider acceptable range of clog P and log D.33

Table 11. Correlation between hERG Inhibition Data and Lipophilicity Associated with 7,685 Astra-Zeneca Compounds

                   acid    base    neutral    zwitterion
n                  350     4302    2598       435
mean hERG pIC50    3.7     5.2     4.5        4.4
mean AZlog D       0.71    2.5     2.9        1.5
mean clog P        3.1     3.6     3.2        4.4

Table 12. Probability of hERG inhibition versus Lipophilicity for Acidic, Basic, Neutral, and Zwitterionic Compounds in an Astra-Zeneca Data Set

Table 13. Upper Limits of Lipophilicity to Avoid hERG Inhibition in Acidic, Basic, Neutral, and Zwitterionic Compounds in an Astra-Zeneca Data Set

target upper limits of log D and clog P to predict that >70% of compounds achieve a hERG IC50 >10 μM
          acids    bases    neutrals    zwitterions
log D     >4       1.4      3.3         2.3
clog P    >9       1.9      4.0         4.4

Phospholipidosis describes the phenomenon of drug-induced accumulation of phospholipids in cells in vivo that has been

associated most prominently with lipophilic cationic molecules that in the most rudimentary form combine the structural features of a lipophilic ring element and a hydrophilic side chain incorporating a basic amine moiety.34,35 Although there is no clear correlation between the occurrence of phospholipidosis and the manifestation of toxicity, the phenomenon is one of concern since it has the potential to delay or prevent the development of a compound. The physicochemical properties of 40 compounds encompassing inducers of phospholipidosis in vitro or in vivo and compounds free of this effect were analyzed for a correlation.36 Cationic amphiphilic compounds that scored as positive clustered differently from less basic or less lipophilic compounds that were categorized as negative or weak inducers.36 An analysis of plots of clog P and pKa against the observation of the induction of phospholipidosis led Ploemen and colleagues to develop a simple predictive guideline that equated the potential for a biological effect with these readily calculated parameters. Thus, if the sum of the square of the most basic pKa and the square of the clog P is ≥90, then the molecule was predicted to be a positive for the potential to induce phospholipidosis, while a score of ≤90 or a pKa of <8 or a clog P of <1 was considered to be predictive of a negative outcome (summarized in Figure 2).36 A more detailed analysis of 85 known inducers of phospholipidosis and 116 negative compounds led to a refinement of the equation that predicted that a molecule would have the potential to induce phospholipidosis if the sum of the square of the most basic pKa and the square of the clog P is ≥50 provided that the pKa was ≥6, and the clog P was ≥2 (ref 37). Molecules with a pKa of <6 or a clog P of <2 are predicted to be negative. The Tomizawa model uses a slightly different algorithm developed after analyzing 33 compounds for their potential to induce lipid accumulation in rat hepatocytes in vitro.38 A plot of clog P and pKa revealed a strong correlation with phospholipidosis for monobasic compounds but was unable to predict the potential of zwitterions to cause lipid accumulation. A plot of net charge at pH 4 against clog P provided an improved correlation, and compounds with a net charge of between 1 and 2 and a clog P of >1 scored as positive in a 30 compound test set with a predictive accuracy of 98%.38
In an attempt to extend the utility of these algorithms to more reliably predict the potential of a molecule to induce phospholipidosis in vivo, Hanumegowda and colleagues have introduced volume of distribution as an additional dependent factor that incorporates the disposition of a compound.39 The analysis was based on a total of 103 compounds, 53 of which were known inducers of phospholipidosis and 50 were in the negative set, and arrived at an empirically derived correlation. Compounds for which the most basic pKa × clog P × Vd was ≥180 were successfully predicted as inducers of phospholipidosis with 82% accuracy, while negative compounds were predicted with 94% accuracy, for an overall 88% concordance that is superior to the methods relying solely on pKa and clog P.39 Taken together, these insights provide simple computational methodology to assess the potential of a molecule to induce phospholipidosis that may be useful in guiding compound modification in a direction that abrogates risk.40

Figure 2. Predictions for the potential for phospholipidosis based on physicochemical properties.

Table 14. Changes in Physical Properties Associated with Compound Developability in a Data Set at GlaxoSmithKline

                                                # of aromatic rings
property                       no. of compds    0      1      2      3      4      5      6
solubility (μg/mL)             ∼31,000          161    100    79     57     36     28     14
clog P                         ∼26,000          0.8    1.9    2.9    3.7    4.4    5.1    6.5
log D (pH 7.4)                 10,464           1.8    1.3    2.1    2.4    2.7    2.9    2.9
HSA binding (%)                7856             85     78     88     93     96     96     96
P450 3A4 inhibition (pIC50)    15,178           4.9    4.7    4.9    5.2    5.4    5.6    5.7
hERG inhibition (pIC50)        11,105           5.2    5.2    5.6    5.7    5.7    5.5    5.3

Table 15. Mean Aromatic Ring Count for Drugs at Various Stages of Development in the GlaxoSmithKline Pipeline

                            preclinical    FIH    phase 1    phase 2    proof of concept
number of compounds         50             68     35         53         96
mean aromatic ring count    3.3            2.9    2.5        2.7        2.3

2.5. Aromatic Ring Count and the Probability of Successful Drug Development. The relationship between aromatic ring count and a series of physicochemical properties associated with drug developability in candidate molecules prepared at GlaxoSmithKline highlighted clear negative trends in all properties as

the number of aromatic rings increased.41 The data compiled in Table 14 are mean values that reveal that clog P, log D, inhibition of cytochrome P450 3A4, inhibition of hERG, and binding to human serum albumin all increase with aromatic ring count, while solubility declines. The average number of aromatic rings in molecules in the GlaxoSmithKline pipeline at various stages of development was also compared. As is evident from Table 15, the mean aromatic ring count was found to decline as compounds advanced through clinical trials, suggesting that compounds with fewer aromatic rings are more likely to be successful.
The authors provided a succinct summary of their observations by stating that “The fewer the number of aromatic rings contained in an oral drug candidate, the more developable that candidate is likely to be; specifically, more than three aromatic rings in a molecule correlates with poorer compound developability and, therefore, an increased risk of compound attrition.”41
A second study by this group took a more in-depth look at the effect of the nature of the aromatic ring on developability parameters by comparing the properties of compounds incorporating carboaromatic rings with those that are based on heteroaromatic, carboaliphatic, and heteroaliphatic rings.42 As summarized in Table 16, carboaromatic rings cause considerably more problems than the other types of ring systems, most notably decreasing solubility and increasing log D while increasing the propensity to bind to both human serum albumin and α1-acid glycoprotein. A general increase in cytochrome P450 inhibition, with the exception of 2D6, which was attributed to the small active site of this isozyme, was also noted for compounds incorporating carboaromatic rings. In contrast, the effect of heteroaromatic rings was considerably less dramatic, with decreases in solubility and log D and increases in HSA binding, cytochrome P450 3A4, and 2C9 inhibition that were far more modest in comparison to those of carboaromatic rings. However, the increase in solubility expected with the introduction of heteroaromatic elements was less than anticipated, considered to be due to an opposing effect of the presence of planar rings.42

Table 16. Comparison of the Effects of Increasing Ring Count with Developability Parameters in a GlaxoSmithKline Data Set (a)
(a) Red = strong detrimental impact; yellow = modest detrimental impact; green = modest beneficial impact; no color = no significant impact.

Carboaliphatic ring count had minimal effect on developability parameters, although this data set was small since 85% of the compounds evaluated were completely devoid of aliphatic rings. Increasing heteroaliphatic ring count generally improved all of the key parameters with the exception of hERG inhibition, attributed to the presence of a basic amine since this effect was absent in neutral molecules.
Further analysis examined the impact of the ratio of aromatic to heteroaromatic rings in molecules, with compounds divided into categories based on all of the possible combinations in the range of 3 heteroaromatic/0 aromatic rings to 0 heteroaromatic/3 aromatic rings. Developability parameters were found to gradually deteriorate as the proportion of carboaromatic rings increased, reflected in increases in MW, lipophilicity, protein binding, and both cytochrome P450 and hERG inhibition, and reduced solubility. These data led to the suggestion that replacing carboaromatic with heteroaromatic rings is likely to improve developability, while an analysis of the effect of fused ring systems noted improved developability parameters compared to nonfused counterparts, presumably because of the presence of fewer heavy atoms.42

Table 17. Mean MW and Fsp3 for Compounds at Different Stages of Development

phase     discovery     phase 1    phase 2    phase 3    drugs
MW        449           436        429        417        350
Fsp3      0.36          0.38       0.43       0.45       0.47
number    2.2 × 10^6    376        591        188        1179

2.6. Physicochemical Properties and Drug Success: The Effect of sp2 Atom Count. The effect of saturation level on the success of candidate molecules was examined by assessing the fraction of sp3-hybridized carbon atoms present in a large drug data set.43 The descriptor Fsp3, originated by Gasteiger44 and defined as follows was used as the index of saturation, with a higher fraction viewed as a straightforward way of increasing structural diversity accompanied by only a minimal increase in

molecular weight while reducing planarity.

$$\mathrm{Fsp}^3 = \frac{\text{number of sp}^3\text{ hybridized C atoms}}{\text{total C atom count}}$$

Dimethylpyridine was provided as a simple but illustrative example of the sacrifice in complexity based on a reliance on compounds with a high sp2 atom count. Dimethylpyridine has a MW = 107, exists as 5 possible isomers, and has an Fsp3 = 0.29, which contrasts with dimethylpiperidine as a markedly more structurally diverse heterocycle that has 34 possible isomers, including all of the possible enantiomers, for a modest increase in molecular weight to MW = 113 (a less than 6% change) and an Fsp3 that is unity.
The thesis explored in this study focused on understanding the relationship between increasing saturation and the potential of a molecule to be successfully developed as a drug and analyzed compounds in the GVK BIO database of drugs and candidates in development.43 Calculated parameters examined were Fsp3 as the measure of the degree of saturation, the number of stereocenters and MW, while the relationship between Fsp3 and physicochemical properties was also assessed. There was a 22% decrease in mean MW and a 31% increase in mean Fsp3 between discovery compounds and marketed drugs, as captured in Table 17. After removing compounds violating the “rule of 5” or possessing more than 10 rotatable bonds, the fraction of compounds with one or more stereocenters was found to increase with progression through the development process (summarized in Table 18). The fraction of marketed drugs containing one or more stereocenters was found to be 33% higher than that in compounds categorized as being in the discovery phase.

Table 18. Fraction of Compounds with ≥1 Stereocenters at Different Stages of Development

phase           discovery     phase 1    phase 2    phase 3    drugs
fraction        0.53          0.52       0.60       0.63       0.64
number          2.2 × 10^6    374        586        187        1448
fraction (a)    0.46          0.49       0.52       0.62       0.61
number          1.3 × 10^6    249        369        120        1089
(a) After the removal of compounds that had any “rule of 5” violation or had >10 rotatable bonds.

Structural complexity, as reported by Fsp3, was compared with solubility (log S) for 1,202 compounds and with melting point data, which were available for 4,432 representatives and were categorized into ranges of ±25 °C. Melting points were lower, and solubilities were higher in molecules with a large fraction of sp3 centers (favorable trends captured in Tables 19 and 20). This study concluded that compounds with higher degrees of saturation were more likely to succeed as drugs and that increasing saturation has the advantage of increasing solubility and decreasing melting point, important factors in drug performance in vivo. This thesis was corroborated by the observation that compounds in later stages of development incorporated a higher number of stereocenters than typical discovery compounds.

Table 19. Variation of Fsp3 As a Function of Melting Point in a Cohort of 4,432 Compounds

mp (°C)          25      75      125     175     225     275     325     375
no. of compds    75      657     1153    1253    815     375     93      11
Fsp3             0.34    0.33    0.31    0.27    0.24    0.18    0.11    0.10

Table 20. Variation of Fsp3 As a Function of Solubility for 1,202 Compounds

log S            −12    −10    −8      −6      −4      −2      0       2
no. of compds    1      5      46      104     362     473     194     17
Fsp3             0      0      0.07    0.31    0.42    0.38    0.56    0.67

The increased sp3 atom count that is a more typical trait of natural products compared to that of synthetic compounds has also been associated with reduced promiscuity, as detected by the propensity of over 15,000 compounds harvested from commercial (6,152), academic (6,623), and natural (2,477) sources to bind to 100 diverse proteins measured using small-molecule microarrays.45,46 In this analysis, stereochemical complexity was defined by the number of stereogenic C atoms divided by the total number of carbon atoms, while shape complexity was captured by the ratio of sp3 C atoms to the sum of sp2 and sp3 C atoms in a molecule. These metrics, which are independent of compound size, revealed that natural products exhibited the highest stereochemical complexity, with a mean Cstereogenic/Ctotal = 0.24, while commercial compounds were the lowest, mean Cstereogenic/Ctotal = 0.022. Commercially sourced compounds also contained the lowest proportion of sp3 carbon atoms, mean = 0.27, while the proportion in natural products was 2-fold higher at a mean of 0.55. A closer analysis revealed that in the commercial compound data set, the sp3 carbon atoms were more likely to be peripheral to the core scaffold in contrast to natural products where sp3 carbon atoms comprised almost half of the core scaffold atoms. The natural products collection exhibited greater selectivity, with 13% defined as a hit in any assay, which compared to 23% for commercial compounds and 26% for the academic collection, while compounds incorporating an intermediate level of stereochemical complexity were less likely to be promiscuous.45
Concerns about the role of high sp2 atom count in molecules also surfaced in an analysis of the ionization class of marketed orally bioavailable drugs.10 The ionization class of a compound exerts a significant effect on ADME properties, with acidic compounds generally highly bound to plasma proteins, a phenomenon that leads to a low volume of distribution and necessitates molecules with high metabolic stability in order to achieve a reasonable T1/2 in vivo. In contrast, basic compounds generally exhibit a high volume of distribution in vivo, which favors an improved T1/2 but also increases the potential for off-target problems based on the broader opportunity for a molecule to interact with a wider range of proteins.10 As a consequence, basic compounds are a class more frequently associated with an increased risk of toxicity, most frequently manifested as hERG channel inhibition29,33 or phospholipidosis.34–40 However, the absence of an ionizable element in neutral compounds generally compromises solubility. The ionization class of 2,056 marketed, orally bioavailable drugs distributed to 352 (17%) acidic, 803 (39%) basic, 714 (35%) neutral, and 126 (6%) zwitterionic molecules, while the remaining 61 (3%) were cations.10 These data were compared with 10,271 compounds abstracted from patent applications published between 2000 and 2009, which revealed that the distribution of ion class has remained comparatively constant since the 1950s. However, there was a marked increase in zwitterionic compounds
1431 dx.doi.org/10.1021/tx200211v |Chem. Res. Toxicol. 2011, 24, 1420–1456
Chemical Research in Toxicology REVIEW

Table 21. Plot of log P vs Melting Point Based on the General Solubility Equation
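
The general solubility equation behind Table 21, log S = 0.5 − log P − 0.01(mp − 25), can be evaluated directly. The following is a minimal sketch rather than the method used to build the table: the function names and the example inputs are hypothetical, log S is taken to be the base-10 logarithm of molar solubility, and the melting point term is set to zero for compounds melting at or below 25 °C (a common convention for liquids that is an assumption here). The solubility bands use the 30 and 200 μM cut-offs quoted for the color coding.

```python
def gse_log_s(log_p: float, mp_celsius: float) -> float:
    """Estimate log10 molar aqueous solubility from log P and melting point (°C)."""
    # Assumption: no melting point penalty for compounds melting at or below 25 °C.
    mp_term = max(mp_celsius - 25.0, 0.0)
    return 0.5 - log_p - 0.01 * mp_term


def solubility_band(log_s: float) -> str:
    """Map log S (mol/L) onto the poor / intermediate / good bands (30 and 200 μM)."""
    micromolar = (10.0 ** log_s) * 1e6
    if micromolar < 30.0:
        return "poor"
    if micromolar <= 200.0:
        return "intermediate"
    return "good"


if __name__ == "__main__":
    # Hypothetical compound: log P = 2.7, melting point = 125 °C.
    log_s = gse_log_s(2.7, 125.0)
    print(round(log_s, 2), solubility_band(log_s))
```

Keeping the band assignment separate from the equation makes it easy to substitute different cut-offs without touching the solubility estimate itself.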

Table 22. Solubility Targets for Drug Candidates Based on Aqueous Solubility

dose (mpk)   Caco permeability   solubility (μg/mL)
0.1          low                 21
0.1          medium              5
0.1          high                1
1            low                 207
1            medium              52
1            high                10
10           low                 2100
10           medium              520
10           high                100

Table 23. Biopharmaceutics Classification System for Drugs Based on Solubility and Permeability

class 1: high solubility, high permeability    class 2: low solubility, high permeability
class 3: high solubility, low permeability     class 4: low solubility, low permeability

noted for the periods 1980–84 and 1985–89, attributed to the intense interest in the quinolone class of antibacterial gyrase inhibitors and the antihypertensive angiotensin II receptor antagonists. Basic compounds exhibited the highest clog P, while clog D was the highest for neutral compounds, lower for basic compounds and the lowest for acidic molecules. Zwitterions showed the highest PSA and H-bonding properties, while acids were noted as possessing the highest sp2 atom count and the fewest chiral centers.10
These observations led to the introduction of a simple equation to capture the aromatic atom count, designated as Ar-sp3, and defined as the total number of aromatic atoms minus the number of sp3 carbon atoms.10 This provides a relatively straightforward scale that is independent of MW and which can readily be interpreted, with a higher Ar-sp3 correlating with an increase in flatness and a lower Ar-sp3 reflecting increased 3-dimensionality. Only 14% of the oral drug data set contained more than 2 aromatic rings, while the fraction of compounds in patent applications possessing more than 2 aryl rings was considerably higher at 63%. However, the distribution of sp3 atoms was found to be similar in marketed drugs and the patent compounds, with a mean of 7.8 and 8.2, respectively. The increased aromatic content in patent compounds was considered to be an important contributor to the increases in clog P and MW that have been observed in the more contemporary compound data sets. Although the Ar-sp3 atom count was found to be well distributed in both data sets, there were significant differences between the drugs and patent compounds. The 90th percentile for sp3 atom count among drugs occurred at 11 but was considerably higher at 18 for patent compounds, representing a difference of one additional 6-membered aromatic ring in the patent compounds. Several drugs that exhibited a negative Ar-sp3 atom count, ranging from −14 to −20, were mostly steroidal in nature. This study concluded that Ar-sp3 atom count has been relatively constant over time in orally bioavailable drugs but has increased in patented drug molecules independent of MW and log P.
2.7. Physical Properties and Solubility. Both permeability and solubility are dependent on lipophilicity, with the relationship for the latter provided by the general solubility equation which states that log S = −log P − 0.01(mp − 25) + 0.5. Solubility values based on this relationship are compiled in Table 21 for a range of melting points along with color coding to reflect poor (<30 μM, red), intermediate (30–200 μM, yellow), and good (>200 μM, green) solubility.47 The average drug expresses a clog P of 2.7, a figure that is at the upper limit of what is generally considered to be good solubility.
Drug solubility is of critical importance to oral absorption since only a soluble drug is bioavailable. The data compiled in Table 22 provide guidance for targeted aqueous solubility as a function of 3 different doses of drug based on membrane permeability and an intestinal volume of ∼250 mL.48,49 These figures are likely to vary if intestinal fluid is used as the vehicle, dependent on the physicochemical properties of a compound. The biopharmaceutics classification system (BCS) for drugs captures the relationship between drug dissolution and membrane permeability as a means of predicting oral bioavailability, formulated in Table 23.50,51
In order to develop a predictive tool for assessing potential solubility problems, a model was devised on the basis of a recursive partitioning model derived from an analysis of 2363 diverse compounds selected as the training set and 1200 compounds used as the

Figure 3. Classification tree to predict solubility on the basis of aromatic proportion (AP, the number of aromatic atoms divided by the total number of atoms) and molecular weight (MW).

Table 24. Adaptation of Lipinski’s “Rule of 5” at Different Stages of Drug Optimization

rule   application   MW     log P   H-bond donors   H-bond acceptors (N + O)   PSA (Å2)
3      fragments     ≤300   ≤3      ≤3              ≤3                         ≤60
4      leads         ≤400   ≤4      ≤4              ≤8                         ≤120
5      drugs         ≤500   ≤5      ≤5              ≤10                        ≤140
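
The stage-dependent limits in Table 24 lend themselves to a simple lookup. The sketch below is illustrative only: the class, dictionary, and function names are hypothetical, and the example compound uses the oral drug centroid quoted in the text (MW = 316, clog P = 2.3, HBAs = 4, HBDs = 2) together with an assumed PSA of 70 Å2, which is not taken from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    mw: float
    log_p: float
    hbd: int
    hba: int
    psa: float


# Upper limits transcribed from Table 24 (rule of 3, 4, and 5).
RULES = {
    "fragment": Limits(mw=300, log_p=3, hbd=3, hba=3, psa=60),
    "lead": Limits(mw=400, log_p=4, hbd=4, hba=8, psa=120),
    "drug": Limits(mw=500, log_p=5, hbd=5, hba=10, psa=140),
}


def violations(stage: str, mw: float, log_p: float, hbd: int, hba: int, psa: float) -> List[str]:
    """Return the names of the properties that exceed the limit for the given stage."""
    limit = RULES[stage]
    observed = {"MW": (mw, limit.mw), "log P": (log_p, limit.log_p),
                "HBD": (hbd, limit.hbd), "HBA": (hba, limit.hba),
                "PSA": (psa, limit.psa)}
    return [name for name, (value, cap) in observed.items() if value > cap]


if __name__ == "__main__":
    # Centroid of oral drug space from the text; the PSA value is an assumption.
    for stage in RULES:
        print(stage, violations(stage, mw=316, log_p=2.3, hbd=2, hba=4, psa=70))
```

Returning the list of violated properties, rather than a single pass or fail flag, keeps the output useful for deciding which parameter to address first.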

test set that had measured aqueous solubility.52 The compounds were selected from advanced drug discovery projects and represented 10 major chemical series from 6 programs. Solubility thresholds were set at 20, 30, 40, 50, and 60 μM, with compounds defined as soluble or insoluble at each concentration. Five models were developed on the basis of these solubility thresholds, and 7 key descriptors were evaluated: MW, Alog P, aromatic proportion (AP), PSA, rotatable bonds, HBDs, and HBAs. Aromatic proportion (AP) was defined as the number of aromatic atoms divided by the total number of atoms, an index that was designated as Fsp3 by Lovering and is related to both Leeson’s Ar-sp3 and Schreiber’s complexity definitions.10,43,45 The final model was based on the two descriptors that gave the best performance: MW and AP, with PSA, HBDs, and HBAs exhibiting no significant correlation with solubility. The distribution of aqueous solubility across 3,563 compounds was plotted against the 3 dimensions of MW, AP, and Alog P, with the majority of the data mapping to a defined domain that broadly adhered to Lipinski’s rules. A solubility classification tree was developed from the analysis that comprised 5 branches and 11 terminal leaves designed to facilitate decision making on the basis of AP and MW. This approach to compound classification provided 81% accuracy and 75% precision for the test set of 1,200 compounds; the parameters are captured in Figure 3 in which the success of the predictions is captured in the individual leaves.52

3. INCORPORATING PHYSICOCHEMICAL PROPERTIES INTO DRUG DESIGN
With the analyses presented above as a backdrop, a number of attempts have been made to provide medicinal chemists with pragmatic parametric guideposts based on physicochemical properties designed to facilitate decision making during lead optimization. While many of these mnemonics have proven to be individually useful in compound analysis, an integrated approach that incorporates an appreciation of several of the guidelines in a fashion tailored for a specific problem is proving to be the most effective strategy to optimize drug design. The fundamental need to maximize biological activity via increased affinity for a target is frequently a key focus of the medicinal chemist, but as a singular objective, this is clearly inadequate since harmonization with physical properties that confer good ADME properties, particularly oral absorption and metabolic stability, is required.53 These properties are frequently difficult to align in a similar direction, and a specific drug–target interaction will present both unique challenges and opportunities. The discussion that follows is divided into studies of the

properties associated with oral bioavailability and compound durability in the development phase followed by a focus on optimizing drug–target interactions. These elements of drug design are subsequently brought together in the context of recent examples from the literature that illustrate the application of an integrated approach to the design of drug candidates with improved properties. However, it should be noted that many of these represent recent discovery campaigns, and it will be some time before the effect of an early consideration of physicochemical parameters on compound design and developability will be fully understood.
Lipinski’s “rule of 5” provided straightforward and relatively simple parameters for predicting the potential for a compound to show good oral absorption.16 A more recent analysis of oral drug space as a function of clog P and MW for 617 approved drugs identified a centroid occurring at a MW = 316 and a clog P = 2.3, with the number of HBAs = 4 and the number of HBDs = 2.54 These data conform nicely with the “rule of 5” guidelines.
In an effort to broaden the utility of the “rule of 5”, an adaptation for the different stages of drug optimization has been proposed that provides additional guidance (presented in Table 24).55–60 In extending the “rule of 5” concept to leads and fragments, more stringent rules have been applied, with a “rule of 4” suggested for leads and a “rule of 3” for the kind of elementary molecules that are typically employed in fragment-based drug design campaigns. These guidelines represent a pragmatic reflection of the historical trends observed during lead optimization campaigns.
3.1. Molecular Properties Influencing Oral Absorption: Rotatable Bonds. The analysis of the relationship between physical properties and oral bioavailability in the rat for >1,100 drug candidates at SmithKline Beecham surfaced the association between rotatable bond count and oral bioavailability.17 Artificial membrane permeability data were available for more than 3,000 compounds, while liver microsomal clearance data were available for >4,000 molecules. The physicochemical properties assessed for a potential correlation included clog P, HBDs and HBAs, PSA, and rotatable bond count, which was included as a measure of molecular flexibility. A cutoff of 20% oral bioavailability was used, considered to be the low end of the acceptable range for compound advancement, and compounds were divided into MWs of greater or less than 500 Da based on Lipinski’s rule. The MW for the set of compounds ranged from 220 to 770 Da.
Rotatable bond count was found to influence oral bioavailability, with 65% of compounds with ≤7 rotatable bonds exhibiting oral bioavailability of ≥20%. In contrast, 75% of compounds with ≥10 rotatable bonds had an oral bioavailability of ≤20%. A closer analysis revealed that if the MW was <400 Da, rotatable bond count was not a correlate since 90% of compounds with a MW <400 Da had ≤10 RBs and 70% had ≤7 RBs. These data suggested a MW threshold below which flexibility is sufficiently restricted such that it has limited impact on oral bioavailability.
The remainder of the analysis confirmed the earlier physical chemistry correlates, with 80% of the compound data set meeting 3 of 4 of the Lipinski “rule of 5” criteria. Higher oral bioavailability was associated with a lower MW, a lower H-bond count (≤12 H-bond acceptors and donors), and a lower PSA (≤140 Å2) but exhibited only a limited dependence on clog P, suggesting that only a minimum lipophilicity was required. Increased RB count reduced the rate of permeation through an artificial membrane, while compounds with a reduced PSA rather than an increased clog P were found to be more permeable. No correlation was observed between RB count and LM stability.17
Additional studies have suggested that setting the upper limit for good absorption at 10 rotatable bonds may be an oversimplification.18,19 Compounds incorporating between 15 and 20 RBs were associated with good absorption in the rat in a cohort of 434 preclinical compounds at Pharmacia, properties that showed some dependence on the therapeutic class under consideration.18 However, this study did confirm the association of absorption with a PSA of ≤140 Å2, an important guiding parameter that was reinforced by an analysis of drug exposure in humans.19 Of 1,014 compounds with ≥20% oral bioavailability in humans, 80.8% conformed to a PSA of ≤140 Å2 and a rotatable bond count of ≤10. However, the upper limit of rotatable bonds compatible with good absorption was determined to be 13 and the best predictors of good oral absorption distilled down to just 2 descriptors: a PSA of ≤160 Å2 and a log P > 2.2.19
3.2. Physicochemical Properties: Lipophilicity and Oral Absorption. Insight into aspects of the relationship between lipophilicity and oral absorption was provided by an analysis of Caco-2 permeability data (Papp) available for 9,571 structurally diverse compounds in the AstraZeneca collection.61 The Papp for the data set spanned 4 orders of magnitude, with 58% of compounds exhibiting a Papp of <100 nm/s and 42% a Papp of >100 nm/s. The physicochemical parameters studied for correlation were AZlog D, clog P, PSA, HBDs, HBAs, RB count, and MW. Caco-2 permeability was compromised for compounds with a low AZlog D and high PSA, high HBD, HBA, and RB counts, and a high MW. There was no dependence on clog P, a parameter that ignores ionization state.
Recursive partitioning methodology provided the deeper insight that MW and AZlog D were the most discriminating factors affecting membrane permeability, with molecules with a MW of less than 414 Da and AZlog D of more than 1.3 predicted to have a 74% probability of achieving a high Papp. For compounds with a MW higher than 414 Da and an AZlog D of less than 2.4, there was an 81% chance that the Papp would be low. These data were captured more effectively as a 50% chance of achieving high permeability (>100 nm/s) when MW was in the range of 350–400 Da and AZlog D was greater than 1.7. For compounds in the MW range 400–450 Da, an AZlog D of >3.1 was required for a high Papp, parameters that are compiled in Table 25.61
By dividing the data into MW bands and repeating the regression analysis, the AZlog D required for a 50% probability of obtaining good permeability of >100 nm/s in a parallel artificial membrane permeability assay (PAMPA) could be estimated for different MW ranges. PAMPA data were similar to the Caco-2 Papp, but the PAMPA assay avoids the involvement of transporters; nevertheless, the AZlog D is in a similar range for good permeability in the MW range of 300–500 Da (Table 26). The acceptable lipophilicity limit (AZlog D) increased with MW, and the reliability of MW and AZlog D for predicting good membrane permeability was found to be comparable or superior to established methodology.61
3.3. Golden Triangle Observations: Optimizing Oral Absorption and Clearance. An analysis of large data sets of Pfizer compounds for which in vitro permeability data in a Caco-2 cell line (16,227 compounds) and in vitro clearance parameters derived from HLM stability data (47,018 compounds) were available provided useful correlates with physicochemical properties.62 The physicochemical properties examined for a

potential correlation at this nexus of the two most important properties affecting oral bioavailability were both an experimentally derived and a calculated log D. Both permeability and HLM stability were recognized as being dependent on a range of properties that included polar surface area, molecular volume and surface area, RB count, the number of heteroatoms, and the number of HBDs and HBAs. In order to simplify the analysis, log D and MW were considered to be reasonable surrogates, with outliers identified as compounds that may depend on additional factors. The in vitro permeability and metabolic stability data were plotted against log D, with additional dimensions capturing the size of the cohort at a particular point in the graphical representation and the percentage of compounds meeting a passing or failing grade identified by color coding.

Table 25. Estimates of the Minimum AZlog D Required for a 50% Chance of Achieving Good Permeability Based on a Specific MW Range

MW        AZlog D
<300      >0.5
300–350   >1.1
350–400   >1.7
400–450   >3.1
450–500   >3.4
>500      >4.5

Table 26. Estimates of the AZlog D Required for a 50% Chance of Achieving Good Permeability in a PAMPA Assay for Different MW Ranges

MW        AZlog D
<300      1.7
300–350   2.2
350–400   2.6
400–450   2.7
450–500   2.5
>500      >4.5

The plot of permeability against MW revealed a parabolic relationship in which permeability increased with log D but then fell off at high levels of lipophilicity. In contrast, the relationship between HLM stability and MW exhibited a markedly different relationship, with reduced clearance correlating with lower lipophilicity and MW. This was interpreted in the context of higher MW compounds providing greater opportunity for metabolic modification, although some high MW compounds were found to exhibit acceptable stability.
Not surprisingly, when the data for permeability and HLM stability were combined in order to identify compounds meeting both criteria, the percentage of compounds that passed this more stringent hurdle was considerably lower than that for the individual parameters. These observations and trends were captured in a plot of log D versus MW in which the potential to optimally align both parameters mapped to a triangular area, referred to as the Golden Triangle. The Golden Triangle was defined by the base set at a MW of 200 Da where the acceptable range of log D fell between −2 and +5, with the apex occurring at a molecular weight of 450 Da where the acceptable log D fell in a much narrower range, between 1 and 2. Thus, as MW decreases, the lipophilicity range for satisfying both criteria increases, while for higher MW molecules, the range of lipophilicity where both parameters align is considerably more restrictive. Within a log D bin, the relative number of permeable and stable compounds increased as MW decreased, and within the log D range of 1.0–2.0, the opportunity to obtain good permeability and metabolic stability was higher. Permeability and metabolic stability show opposing relationships with log D, with permeability correlated positively and metabolic stability correlated negatively. A balance is struck at a log D of between 1.0 and 2.0. At a MW = 350 Da and log D = 1.5, 25% of compounds passed both criteria, while at a MW = 450 Da and log D = 0 or 3.0, only 3% of compounds passed both criteria, representing an 8-fold lower chance of identifying molecules satisfying both parameters.
Outliers to the Golden Triangle were examined because of their potential to be instructive in drug design. The CF3-substituted CCR5 antagonists 1 and 2 and pyridinylpiperidine 3 (Figure 4) exhibit high molecular weights and high elog D values

Figure 4. Compounds falling outside the parameters defined by the Golden Triangle.
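
Whether a compound falls inside or outside the Golden Triangle can be checked from the two anchor points given in the text: a base at MW = 200 Da spanning log D −2 to +5, and an apex at MW = 450 Da spanning log D 1 to 2. The sketch below is a geometric reading of that description, not the published procedure: it assumes the sides are straight lines between those points, treats anything below 200 Da or above 450 Da as outside, and uses hypothetical function names.

```python
from typing import Optional, Tuple

BASE_MW, APEX_MW = 200.0, 450.0          # Da, from the text
BASE_LOGD = (-2.0, 5.0)                  # acceptable log D range at the base
APEX_LOGD = (1.0, 2.0)                   # acceptable log D range at the apex


def logd_window(mw: float) -> Optional[Tuple[float, float]]:
    """Return the (low, high) log D window at a given MW, or None outside 200-450 Da."""
    if not BASE_MW <= mw <= APEX_MW:
        return None
    # Assumption: the window narrows linearly from base to apex.
    t = (mw - BASE_MW) / (APEX_MW - BASE_MW)
    low = BASE_LOGD[0] + t * (APEX_LOGD[0] - BASE_LOGD[0])
    high = BASE_LOGD[1] + t * (APEX_LOGD[1] - BASE_LOGD[1])
    return low, high


def in_golden_triangle(mw: float, log_d: float) -> bool:
    """True if (MW, log D) lies within the interpolated window."""
    window = logd_window(mw)
    return window is not None and window[0] <= log_d <= window[1]


if __name__ == "__main__":
    print(in_golden_triangle(350.0, 1.5))   # the MW/log D pair with the higher pass rate
    print(in_golden_triangle(450.0, 3.0))   # one of the pairs with the lower pass rate
```

Under these assumptions the pair MW = 350 Da, log D = 1.5 lies inside the window, while MW = 450 Da with log D = 0 or 3.0 lies outside, in line with the pass rates quoted in the text.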


Table 27. Physicochemical Properties Associated with Marketed CNS Drugs

physicochemical    mean values for the    suggested   top 25 CNS drugs in   preferred   top 25 CNS drugs in
parameter          top 25 CNS drugs       limits      suggested range       range       preferred range
PSA (Å2)           47                     <90         96%                   <70         76%
H-bond donors      0.8                    <3          100%                  0–1         92%
clog P             2.8                    2–5         68%                   2–4         52%
clog D (pH 7.4)    2.1                    2–5         61%                   2–4         61%
MW                 293                    <500        100%                  <450        100%

but still passed permeability and stability criteria. This observation was attributed to the halogens (CF3) possibly causing MW to overestimate the actual size of the molecule, giving rise to the suggestion that the use of heavy atom count (HAC) or surface area may provide a better correlation. In addition, CF3 moieties may block metabolism, a phenomenon known to act both locally and globally.63–65 Compounds 4 and 5 (Figure 4) exhibit low lipophilicity, but both passed permeability criteria. However, these compounds contain a CO2H moiety, with 4 incorporating the potential for intramolecular H-bonding and 5 being zwitterionic, structural elements which may mask the polarity of the compounds.66 Alternatively, these compounds may be subject to active transport, particularly the HMG-CoA reductase inhibitor 4.
This analysis highlighted the importance of MW and log D as surrogate drivers of permeability and metabolic stability and emphasized the close relationship with ligand efficiency (LE) and lipophilic ligand efficiency (LLE) as a means of allowing contextual visualization. Permeability and metabolic stability could be maximized by optimizing physicochemical properties toward the center of the Golden Triangle in order to more effectively target the drug-like space. However, opportunities to successfully explore space outside of the region defined by the Golden Triangle suggested the judicious use of intramolecular H-bonds to reduce molecule polarity and thereby increase permeability or the careful deployment of CF3 moieties and halogens to reduce metabolism.
3.4. Physicochemical Properties of CNS Drugs: A Special Case. Profiling of the top 25 marketed CNS drugs calibrated the physicochemical parameters affecting brain exposure, with mean values and upper limits suggested in order to achieve satisfactory blood–brain barrier penetration (data compiled in Table 27 along with the preferred ranges).24
3.5. Further Defining CNS Drug Space and Candidate Success: A CNS Multi-Parameter Optimization Tool. A detailed analysis of a broad range of CNS compounds conducted by a team at Pfizer has attempted to provide a deeper understanding of physicochemical properties required for brain penetration and compound durability that could be useful in a predictive sense.67,68 The a priori recognition that the complexion of physicochemical properties associated with compound delivery to the CNS can differ led to a focus on developing a Multiple Parameter Optimization (MPO) tool. Although this approach was designed to improve the potential to design compounds with good CNS penetration, the underlying principles were fundamental in nature such that they extended beyond the CNS drug space.
The initial analysis focused on 119 marketed, orally bioavailable CNS drugs and 108 Pfizer CNS candidate molecules that were synthesized over a 20 year time frame, a data set that encompassed 13 different mechanisms. Attention was focused on understanding the relationship between physicochemical properties and in vitro ADME attributes. Six physicochemical parameters, clog P, clog D, MW, PSA, HBD, and pKa, were evaluated for their relationship with membrane permeability, clearance in HLM, P-glycoprotein (P-gp) substrate potential, cytochrome P450, and hERG inhibition, while the viability of a transformed liver epithelial cell (THLE Cv) was used as a measure of toxicity. Primary binding data at the pharmacological target allowed ligand efficiency and lipophilic ligand efficiency to be calculated for the molecules, and the mean values are compiled in Table 28. The objective of the study was to define trends and properties related to success and to develop an optimal profile for a drug targeting the CNS, with the target values for the parameters under consideration compiled in Table 29. The optimal physicochemical properties for a compound targeting the CNS were rationalized as a clog P = 2.8, a clog D = 1.7, a MW = 305.3, a TPSA = 44.8, a HBD count = 1, and a pKa = 8.4.

Table 28. Physicochemical Parameters Associated with 119 Marketed CNS Drugs and 108 Pfizer CNS Drugs

property           CNS drugs                   Pfizer candidates
clog P             2.7                         3.5
clog D             1.7                         2.2
MW                 303.5                       357.4
PSA (Å2)           48.5                        53.6
H-bond donors      1.1                         1.2
pKa                8.0                         8.1
Papp               75% high                    51% high
                   13% moderate                41% moderate
P-gp               75% low                     55% low
CLint              71% low                     48% low
LE                 0.53                        0.50
LLE                6.2                         6.3
P450 inhibition    85–95% low risk             82–94% low risk
hERG inhibition    57% with ≤15% inhibition    43% with ≤15% inhibition
THLE Cv            79% with IC50 >100 μM       71% with IC50 >100 μM

A higher percentage of marketed CNS drugs met all of the ADME and safety criteria than did the Pfizer candidates, and it was quickly recognized that aligning key physicochemical attributes was an important contributor to the success of a compound. The MPO approach scored the quality of each of the 6 physicochemical properties, clog P, clog D, MW, PSA, HBD, and pKa, on a 0 to 1 scale, resulting in a composite score based on the summation of individual attributes ranging from 0 to 6. This approach allowed a holistic assessment of drug-like ADME and safety properties that did not rely on a single property, although the selected parameters reflect a significant bias toward lipophilicity as the key element. The overall score represents the extent of alignment of the individual properties, with a higher MPO score predicting an improved potential for success and a target score of ≥4 deemed optimal. The individual property scores

Table 29. Target Values for in Vitro Profiling Assays for CNS Compounds

ADME
  high permeability                  Papp >10⁻⁵ cm/s
  low P-gp liability                 ratio ≤2.5
  low LM clearance                   CLint ≤100 mL/min/kg
binding efficiencies
  ligand efficiency (LE)             0.46
  lipophilic ligand efficiency (LLE) 6.4
safety
  low DDI risk (P450 inhibition)     ≤25% inhibition
  low hERG risk                      ≤15% dofetilide inhibition
  high cell viability                THLE Cv >100 μM

Table 30. Inflection Points for 6 Key Physicochemical Parameters Used to Define the Optimal Properties for Drugs Targeting the CNS

parameter        shape of the function   weighting   desirable range (T0 = 1.0)   less desirable range (T0 = 0.0)
clog P           monotonic decreasing    1.0         ≤3                           >5
clog D           monotonic decreasing    1.0         ≤2                           >4
MW               monotonic decreasing    1.0         ≤360                         >500
PSA              hump function           1.0         >40 and ≤90                  ≤20 or >120
H-bond donors    monotonic decreasing    1.0         ≤0.5                         >3.5
pKa              monotonic decreasing    1.0         ≤8                           >10
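
The inflection points in Table 30 are enough to sketch the scoring scheme in code. The following is an illustrative reimplementation under stated assumptions, not the authors' spreadsheet: each property scores 1.0 in its desirable range, 0.0 in its less desirable range, and is interpolated linearly in between; the PSA hump is assumed to rise linearly from 20 to 40 Å2 and fall linearly from 90 to 120 Å2; all six properties carry a weight of 1.0. The function names are hypothetical.

```python
def ramp_down(value: float, good: float, bad: float) -> float:
    """Monotonic decreasing score: 1.0 at or below `good`, 0.0 at or above `bad`."""
    if value <= good:
        return 1.0
    if value >= bad:
        return 0.0
    return (bad - value) / (bad - good)


def psa_hump(psa: float) -> float:
    """Hump-shaped score for PSA: zero at 20 and 120, plateau of 1.0 from 40 to 90."""
    if psa <= 20.0 or psa >= 120.0:
        return 0.0
    if psa < 40.0:
        return (psa - 20.0) / 20.0
    if psa <= 90.0:
        return 1.0
    return (120.0 - psa) / 30.0


def cns_mpo(clogp: float, clogd: float, mw: float, psa: float, hbd: float, pka: float) -> float:
    """Sum the six equally weighted property scores to give a value from 0 to 6."""
    return (ramp_down(clogp, 3.0, 5.0)
            + ramp_down(clogd, 2.0, 4.0)
            + ramp_down(mw, 360.0, 500.0)
            + psa_hump(psa)
            + ramp_down(hbd, 0.5, 3.5)
            + ramp_down(pka, 8.0, 10.0))


if __name__ == "__main__":
    # Property values quoted in the text as the optimal CNS profile.
    print(round(cns_mpo(clogp=2.8, clogd=1.7, mw=305.3, psa=44.8, hbd=1, pka=8.4), 2))
```

Because each term is computed separately, the individual property scores remain available for pinpointing which attribute pulls a composite score down.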

Table 31. Structures, Physicochemical Attributes, and MPO Score for 3 Pfizer CNS Candidate Drugs Advanced into Clinical
Study

are also useful since they offer a means of identifying specific problems. The inflection points for the individual attributes were selected on the basis of medicinal chemistry experience and intuition, with the most desirable inflection point defined as T0 = 1.0 and the least desirable inflection point as T0 = 0. Parameters that fall between T0 = 1 and T0 = 0 were scored linearly on the basis of the slope of the graph between the inflection points, which are summarized in Table 30. To facilitate the use of this tool, the authors have provided an Excel spreadsheet for calculating compound scores.
The 119 marketed CNS drugs and the 108 Pfizer CNS candidates were assessed against the MPO algorithm with the result that 74% of marketed drugs and 60% of the Pfizer candidates had an MPO score ≥4. As the MPO score increased, the chances of finding compounds with desirable in vitro ADME and safety profiles improved, and the chances of aligning these in a single molecule also increased. Of the marketed drugs, 91–96% that had an MPO score of >5 showed good ADME and safety profiles, while 77% of the drug data set that had an MPO score >5 showed full alignment of all 3 of the ADME attributes, HLM stability, Papp, and potential to be recognized as a P-gp substrate. The MPO tool was also used to analyze 11,303 Pfizer compounds across a broad range of physicochemical property space and therapeutic targets with the result that similar trends were observed, suggesting value in applying this approach to non-CNS compounds.67,68
To further probe the utility of the MPO approach and to provide illustrative examples that emphasized the holistic nature of the analysis, the MPO scores for 3 Pfizer compounds that have undergone clinical evaluation were examined.68 The structure of the 3 compounds, the PDE 9 and 10 inhibitors 7 and 8 and the histamine H3 antagonist 9, and the associated physicochemical data are compiled in Table 31, along with the individual and composite MPO scores. Although all 3 compounds present good overall MPO scores in the range suggested as optimal, the individual profiles of the compounds are quite different, each offering a unique disposition that emphasizes the importance of adopting a broader perspective on compound properties. Interestingly, four marketed drugs, the nootropic aniracetam (10), the benzodiazepine antagonist flumazenil (11), the sedative/hypnotic zaleplon (12), and the widely consumed stimulant caffeine

Figure 5. Structures of 4 drugs that exhibit perfect MPO scores of 6.

(13) (Figure 5), exhibited MPO scores of 6, reflecting a perfect alignment of all of the key parameters.68

Figure 6. Free energy relationships associated with drug–target interactions.

Table 32. Gibbs Free Energy Associated with Different Binding Affinities of Drugs for Targets

binding affinity (nM)   Gibbs free energy (kcal/mol at 300 K)
1000                    −8.28
100                     −9.64
10                      −11.00
1                       −12.36
0.1                     −13.72
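
The link between binding affinity and Gibbs free energy tabulated in Table 32 follows from the standard relationship ΔG = RT ln Kd, and the entropic term can then be obtained from ΔG = ΔH − TΔS once ΔH has been measured. The sketch below is a worked illustration under stated assumptions: Kd is expressed relative to a 1 M standard state, the temperature defaults to 300 K, the function names are hypothetical, and the ΔH value in the example is invented for demonstration. The computed values come out close to, but not identical with, the tabulated figures, which presumably reflect rounding or a slightly different temperature.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_nanomolar: float, temperature_k: float = 300.0) -> float:
    """Return the binding free energy in kcal/mol from a dissociation constant in nM."""
    kd_molar = kd_nanomolar * 1e-9
    return R_KCAL * temperature_k * math.log(kd_molar)


def entropic_term(delta_g: float, delta_h: float) -> float:
    """Return -T*dS in kcal/mol, rearranged from dG = dH - T*dS."""
    return delta_g - delta_h


if __name__ == "__main__":
    for kd in (1000.0, 100.0, 10.0, 1.0, 0.1):
        print(f"{kd:>7} nM  dG = {binding_free_energy(kd):.2f} kcal/mol")
    # Hypothetical calorimetric enthalpy of -8.0 kcal/mol for a 10 nM ligand.
    print(round(entropic_term(binding_free_energy(10.0), -8.0), 2))
```

A negative value returned by `entropic_term` indicates a favorable entropic contribution, and a positive value indicates an entropic penalty that the enthalpy has to overcome.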
3.6. Optimizing Drug-Target Interactions: Thermodynamics of Ligand-Protein Binding. Freire has focused attention on the importance of assessing the thermodynamic signature of drug-target interactions, dissecting the enthalpic and entropic components of binding using the key equations presented in Figure 6.12,13,55,69,70 The binding energy of a ligand is governed mostly by intermolecular van der Waals attractive forces, H-bonding interactions, and repulsive forces like the hydrophobic effect that help to drive a molecule from an aqueous environment into the hydrophobic cavity of a protein. High affinity binding requires positive contributions from both enthalpy and entropy, but the simultaneous optimization of these elements is challenging since enthalpic optimization can often be offset by a loss in the entropic contribution. In contrast, since entropy is based on the hydrophobic effect, it is generally much easier to optimize in the absence of an understanding of the thermodynamic signature by simply increasing the lipophilicity of a ligand. This results in the kind of lipophilic, poorly soluble drug candidates that are perceived to be becoming ever more common in contemporary drug discovery programs.11–13

Optimizing enthalpy is difficult based on its dependence on establishing productive H-bonding and van der Waals interactions that can frequently be offset by the entropic penalty associated with the desolvation of incorrectly positioned polar groups.11,12,70–72 The challenge with establishing H-bond interactions is the requirement for optimal geometry since H-bond interactions are highly sensitive to both distance and angle, while van der Waals interactions are also maximized when geometric fit is ideal. A suboptimal distance and angle contributes to an overall negative effect because the enthalpy change is dependent on the difference in H-bonding strength between a drug and its target and the drug and the aqueous environment, the fundamental equilibrium describing ligand association. The energy required to desolvate a polar group is ∼8 kcal/mol, 10-fold higher than that required to desolvate a nonpolar group. Thus, entropic optimization is much easier because of the positive contributions arising from desolvating a hydrophobic molecule and from conformational entropy changes. Favorable desolvation entropy from the release of ordered water molecules is a dominant force associated with the binding energy of hydrophobic groups, and it has been estimated that burying a carbon atom from solvent contributes 25 cal/mol/Å² to binding affinity. However, conformational entropy change is almost always unfavorable based on the loss of conformational degrees of freedom for both a drug and its protein target, but this can be addressed by introducing conformational constraints into a drug molecule. Note that these observations are restricted to drug-target interactions, and problems with the optimization of an orally bioavailable drug candidate can be exacerbated by the need to preserve the physicochemical properties associated with absorption where Lipinski’s “rule of 5” and related parameters are dominant.

Determination of the thermodynamic signature of a molecule requires isothermal titration calorimetry (ITC) experiments which measure the heat evolved, ΔH, when a drug interacts with its macromolecular target.73,74 The Gibbs free energy associated with binding is related to the affinity of the interaction, and a ligand with a 10 nM binding affinity has a total Gibbs free energy of −11 kcal/mol. This equates to a free energy change of approximately 1.36 kcal/mol for a 10-fold difference in potency (data compiled in Table 32). Knowledge of ΔH and ΔG allows calculation of the entropic contribution to the free energy using the equation ΔGbinding = ΔH − TΔS.

3.7. Thermodynamic Signatures and Enthalpy-Optimized Drug Candidates. Freire’s analysis of the evolution of inhibitors of HIV protease and HMG-CoA reductase has highlighted that best-in-class molecules are optimized for a high dependence on an enthalpic contribution to the binding energetics.12,13,75 However, the process of identifying these molecules was almost certainly focused on the common drug optimizing paradigm of simultaneously improving potency, selectivity, solubility and pharmacokinetic properties rather than carefully analyzing binding energetics.

For the HIV protease inhibitor (PI) class, saquinavir (14) was the first drug launched in 1995, while the most recent PI, darunavir (22), was marketed in 2006 and exhibits an affinity for

Figure 7. Structures of HIV protease inhibitors.

Table 33. Thermodynamic Signatures for Marketed HIV Protease Inhibitors

compd            launch date    Kd (pM)    ΔG (kcal/mol)    ΔH (kcal/mol)    −TΔS (kcal/mol)
14 saquinavir    1995           400        −13.0            1.2              −14.2
15 indinavir     1996           480        −12.4            1.8              −14.2
16 ritonavir     1996           29         −13.7            −4.3             −9.4
17 nelfinavir    1997           260        −12.8            3.1              −15.9
18 amprenavir    1999           390        −13.2            −6.9             −6.3
19 lopinavir     2000           7.7        −15.1            −3.8             −11.3
20 atazanavir    2003           150        −14.3            −4.2             −10.1
21 tipranavir    2005           8          −14.6            −0.7             −13.9
22 darunavir     2006           4.5        −15.0            −12.7            −2.3
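The thermodynamic signatures in Table 33 can be cross-checked against the relationship ΔG = ΔH − TΔS. The Python sketch below is illustrative only: it assumes that the entropic column is −TΔS and that negative values are favorable, with positive ΔH for saquinavir, indinavir, and nelfinavir, which is the sign assignment under which every row sums correctly. The `enthalpic_fraction` helper is a hypothetical convenience for ranking the compounds and is not a metric defined in the source.

```python
# (name, launch year, Kd in pM, dG, dH, -TdS); energies in kcal/mol.
# Signs are an assumption: they are chosen so that dG = dH + (-TdS) for each row.
PROTEASE_INHIBITORS = [
    ("saquinavir", 1995, 400, -13.0, 1.2, -14.2),
    ("indinavir", 1996, 480, -12.4, 1.8, -14.2),
    ("ritonavir", 1996, 29, -13.7, -4.3, -9.4),
    ("nelfinavir", 1997, 260, -12.8, 3.1, -15.9),
    ("amprenavir", 1999, 390, -13.2, -6.9, -6.3),
    ("lopinavir", 2000, 7.7, -15.1, -3.8, -11.3),
    ("atazanavir", 2003, 150, -14.3, -4.2, -10.1),
    ("tipranavir", 2005, 8, -14.6, -0.7, -13.9),
    ("darunavir", 2006, 4.5, -15.0, -12.7, -2.3),
]


def inconsistent_rows(rows, tol=0.051):
    """Names of rows in which dG differs from dH + (-TdS) by more than tol."""
    return [name for name, _, _, dg, dh, mtds in rows if abs(dg - (dh + mtds)) > tol]


def enthalpic_fraction(dh, minus_tds):
    """Share of the favorable binding energy supplied by enthalpy (0 if dH is unfavorable)."""
    favorable_h = max(-dh, 0.0)
    favorable_s = max(-minus_tds, 0.0)
    total = favorable_h + favorable_s
    return favorable_h / total if total else 0.0


if __name__ == "__main__":
    print("rows failing dG = dH - TdS:", inconsistent_rows(PROTEASE_INHIBITORS))
    for name, year, _, _, dh, mtds in PROTEASE_INHIBITORS:
        print(f"{name:<11} {year}  enthalpic fraction = {enthalpic_fraction(dh, mtds):.2f}")
```

Under these assumptions darunavir shows the largest enthalpic share of the set and amprenavir is close to an even split, in line with the discussion of these two compounds in the text.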

Table 34. Thermodynamic Signatures for Marketed HMG-CoA Reductase Inhibitors

compd              launch year    ΔG (kcal/mol)    ΔH (kcal/mol)    −TΔS (kcal/mol)
23 pravastatin     1991           −9.7             −2.5             −7.2
24 fluvastatin     1993           −9.0             0                −9.0
25 atorvastatin    1996           −10.9            −4.3             −6.6
26 cerivastatin    1997           −11.4            −3.3             −8.1
27 rosuvastatin    2003           −12.3            −9.3             −3.0

Figure 8. Structures of HMG-CoA reductase inhibitors fluvastatin (24) and rosuvastatin (27).

HIV protease that is significantly higher than that of saquinavir (14) (Figure 7 and Table 33). By analyzing the thermodynamic signatures of marketed protease inhibitors, Freire found that the more potent and highly refined inhibitors were associated with an increased enthalpic contribution to binding affinity and a much reduced reliance on entropy (data compiled in Table 33).12,13 Darunavir (22) is particularly striking in this set of compounds not only because of its class-leading potency but also because it has maximized the enthalpic contribution to binding at −12.7 kcal/mol at the expense of a dependence on entropy which, at −2.3 kcal/mol, is the smallest value within this drug class. By way of contrast, the early PIs saquinavir (14), indinavir (15), ritonavir (16), nelfinavir (17), lopinavir (19), and tipranavir (21) rely far more on an entropic contribution to affinity in order to achieve high potency. Darunavir (22) is an optimized derivative of amprenavir (18), an apparently good starting point based on its thermodynamic signature which reveals an almost evenly balanced contribution from enthalpic and entropic components. In arriving at darunavir (22), there was a specific focus on increasing the reliance on H-bonding and drug-protein backbone interactions, optimized by the signature bis-THF moiety. This approach minimized the exposure of darunavir (22) beyond the substrate envelope and contributes to the high genetic barrier to resistance associated with this molecule.

An analysis of the thermodynamic signatures of the statins, drugs that inhibit HMG-CoA reductase and which are widely used clinically to control cholesterol levels, revealed a similar pattern of optimization.12,75 Pravastatin (23) is a natural product, while the remainder of the compounds compiled in Table 34 are synthetic analogues that maintain the critical dihydroxy acid moiety recognized by the enzyme but replace the decahydronaphthalene ring with alternative structural elements that show

Figure 9. Structures and enthalpy of interaction associated with a series of plasmepsin inhibitors.

Figure 11. Definitions of ligand efficiency (LE), binding efficiency index (BEI), and surface efficiency index (SEI).
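The unitless forms of these indices, as they are described in the accompanying discussion, reduce to simple ratios of a pKi, pKd, or pIC50 to a size or polarity descriptor. The Python sketch below is a hedged illustration with hypothetical function names; lipophilic ligand efficiency (LLE), which the review defines separately as potency relative to lipophilicity, is included for convenience and is assumed here to be p(affinity) minus log P.

```python
import math


def p_value(conc_molar):
    """Negative log10 of a Ki, Kd, or IC50 given in mol/L (pKi, pKd, or pIC50)."""
    return -math.log10(conc_molar)


def ligand_efficiency(p_affinity, heavy_atoms):
    """Unitless LE: p(affinity) per non-hydrogen atom."""
    return p_affinity / heavy_atoms


def binding_efficiency_index(p_affinity, mw_da):
    """BEI: p(affinity) divided by the molecular weight expressed in kDa."""
    return p_affinity / (mw_da / 1000.0)


def surface_efficiency_index(p_affinity, psa_a2):
    """SEI: p(affinity) divided by the polar surface area normalized to 100 square angstroms."""
    return p_affinity / (psa_a2 / 100.0)


def lipophilic_ligand_efficiency(p_affinity, log_p):
    """LLE (LipE): p(affinity) minus log P (log D or clog P may be substituted)."""
    return p_affinity - log_p


if __name__ == "__main__":
    # Reference compound described in the text: 1 nM affinity, MW 333, 25 heavy atoms, PSA 50.
    p = p_value(1e-9)
    print(f"LE  = {ligand_efficiency(p, 25):.2f}")
    print(f"BEI = {binding_efficiency_index(p, 333):.1f}")
    print(f"SEI = {surface_efficiency_index(p, 50):.1f}")
    print(f"LLE = {lipophilic_ligand_efficiency(p, 3.0):.1f}  (assuming log P = 3)")
```

Keeping each index as its own function makes it straightforward to tabulate them side by side for a compound series during lead optimization.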

some similarity across the series. The most recently marketed compound, rosuvastatin (27), distinguishes itself from the earlier drugs by the significantly increased reliance of the binding interaction with the enzyme on enthalpy, particularly compared to fluvastatin (24), which relies entirely on entropy to drive the binding energetics (Figure 8). This is reflected in the physicochemical properties of the drugs with fluvastatin (24) much more lipophilic, clog P = 4.05 and a TPSA = 81 Å², than rosuvastatin (27), which exhibits a clog P = 0.59 and a TPSA = 140 Å².

3.8. H-Bonding in Drug-Protein Interactions. Isothermal titration calorimetry measurements of a homologous series of plasmepsin inhibitors determined that the enthalpy (ΔH) of the interaction varied markedly with both the nature and position of the aromatic substituent (data highlighted in Figure 9).12 The requirement for a H-bond donor was clearly identified by the ITC results since replacing the para-OH of 29 by a CH3 (30) led to a reduction in ΔH of 4.4 kcal/mol. These data confirmed that a H-bond donor at C-4 was optimal for plasmepsin inhibition but that the complementary acceptor in the enzyme could be partially accessed from C-3 (28).

However, an illustration of the concept that establishing a strong H-bonding interaction between a drug and its protein target does not necessarily lead to improved binding affinity based on the problem of compensating entropy changes is provided by the HIV protease inhibitor KNI-10033 (31) (Figure 10).12,76 An optimally placed H-bond offers between 4 and 5 kcal/mol of binding energy, which equates with between a 1000- and 5000-fold increase in potency if the entropy change is zero. The dissociation constant for KNI-10033 (31) from HIV protease is 13 pM, an affinity based on favorable enthalpic, ΔH = −8.2 kcal/mol, and entropic contributions, −TΔS = −6.7 kcal/mol, data that are captured in Table 35. The sulfone homologue 32 (Figure 10) establishes a strong H-bonding interaction with Asp30B of the protease, reflected in an improved binding enthalpy of −12.1 kcal/mol. However, the dissociation constant does not appreciably change because the enthalpy gain is compensated by a loss in entropy, attributed to the ordering induced by establishing the H-bond and a lower energy associated with desolvation. Strategies that might overcome the enthalpy/entropy compensation problem have been suggested.12 One approach is to target H-bonds presented in structured regions of a protein, readily identified by X-ray B factors, hydrogen-deuterium exchange experiments, or computer-aided drug design (CADD) analyses. Also, by targeting several H-bonds to the same location, establishing the first H-bond interaction will pay the entropic penalty, thereby allowing the others to bind to a now structured region. Another potential source of entropy compensation is the forced exposure of hydrophobicity due to H-bond formation, illustrated by the circumstance in which a H-bond donor/acceptor attached to a phenyl ring, for example, could induce exposure of this hydrophobic moiety. This would lead to a reduction in favorable desolvation entropy that partially neutralizes the enthalpy gained by establishing the H-bond between an inhibitor and its target.

Figure 10. Structures of the HIV protease inhibitor KNI-10033 (31) and its analogue KNI-10075 (32).

Table 35. Thermodynamic Parameters for the Binding of Inhibitors 31 and 32 to HIV Protease

compd             ΔG (kcal/mol)    ΔH (kcal/mol)    −TΔS (kcal/mol)
31 (KNI-10033)    −14.9            −8.2             −6.7
32 (KNI-10075)    −14.6            −12.1            −2.5

3.9. Ligand Efficiency, Binding Efficiency Index, and Surface Efficiency Index. In the absence of ITC data, several binding indexes have been introduced that relate elements of the structure of a molecule, heavy atom count, MW, or PSA, to potency and are easily calculated and which provide the medicinal chemist with some useful guideposts for decision making during lead optimization.49,77–81 The earliest study of ligand-protein energetics published in 1999 surveyed data associated with a large number of high affinity complexes in which the dominant forces were van der Waals and hydrophobic interactions.77 The free energy of binding of ligands was found to increase linearly with the number of non-H atoms, with the initial slope of 1.5 kcal/mol representing a measure of the maximal free energy of binding per non-H atom. This concept has been referred to as ligand efficiency (LE), a term introduced by Hopkins in 2004 that reflects the ratio of the affinity of a drug for its target with the number of non-hydrogen atoms, the heavy atom count (Figure 11).78 The measures of affinity that are typically used for

Table 36. LE Values for 5 Molecular Weights with the Associated Typical Heavy Atom Counts for the Potency Range 1 nM to 100 μM Based on Free Energy Data^a

            ΔGbinding     MW = 100    MW = 200     MW = 300     MW = 400     MW = 500
potency     (kcal/mol)    (HAC ∼7)    (HAC ∼14)    (HAC ∼21)    (HAC ∼29)    (HAC ∼36)
1 nM        −12.36        1.77        0.88         0.59         0.43         0.34
10 nM       −11.00        1.57        0.79         0.52         0.38         0.31
100 nM      −9.74         1.39        0.69         0.46         0.34         0.27
1 μM        −8.38         1.20        0.60         0.40         0.29         0.23
10 μM       −7.02         1.00        0.50         0.33         0.24         0.19
100 μM      −5.66         0.81        0.40         0.27         0.19         0.16
1 mM        −4.30         0.61        0.30         0.20         0.15         0.12

^a The data are defined as kcal/mol/non-H atom. LE = kcal/mol per heavy atom; ΔGbinding = ∼1.36 kcal/mol for a 10-fold change in potency.
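The entries of Table 36 are the magnitude of the binding free energy divided by the heavy atom count. A short Python sketch, taking ΔG as the (negative) binding free energy and using the typical heavy atom counts listed in the column headers, reproduces the first two rows; the function name is illustrative and not part of the source.

```python
def ligand_efficiency_kcal(delta_g_kcal, heavy_atoms):
    """LE in kcal/mol per non-H atom: magnitude of the binding free energy per heavy atom."""
    return -delta_g_kcal / heavy_atoms


# Typical heavy atom counts quoted in the table header for each molecular weight.
TYPICAL_HAC = {100: 7, 200: 14, 300: 21, 400: 29, 500: 36}

if __name__ == "__main__":
    for potency, delta_g in (("1 nM", -12.36), ("10 nM", -11.00)):
        cells = "  ".join(
            f"MW {mw}: {ligand_efficiency_kcal(delta_g, hac):.2f}"
            for mw, hac in TYPICAL_HAC.items()
        )
        print(f"{potency:>6}  {cells}")
```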

this kind of analysis are the Gibbs free energy, pKi, pKd, or the pIC50, and the units for LE are kcal/mol per non-H atom when the Gibbs free energy is used and unitless when based on the pKi, pKd, or pIC50. Thus, a molecule with 25 heavy atoms and a Ki = 1 nM at 300 K will have a binding energy of 12.4 kcal/mol, affording an LE = ∼0.5 kcal/mol/non-H atom. However, it has been more common and convenient to use pKi, pKd, or pIC50 in the equation that affords slightly different, unitless figures. The figures calculated by the 2 approaches are compared in Tables 36 and 37. A compilation of LE values associated with 5 molecular weights based on typical heavy atom count across the potency range 1 nM to 1 mM in kcal/mol per heavy atom is provided in Table 36. The term non-H atom encompasses a broad range of elements but ignores molecular weight, a potential deficiency addressed by the binding efficiency index (BEI), defined as pKi, pKd, or the pIC50 divided by the MW in kDa. Using the BEI, an inhibitor with 1 nM potency and a MW of 350 Da has a BEI = 25.7. LE and BEI values for 3 molecular weights with the associated typical heavy atom counts for the potency range 1 nM to 100 μM are captured in Table 37 (data based on using the pIC50 as the measure of ligand affinity).

Table 37. LE and BEI Values for 3 Molecular Weights with the Associated Typical Heavy Atom Counts for the Potency Range 1 nM to 100 μM Based on pIC50 Data for Affinity^a

                    MW = 300 (HAC 21)    MW = 400 (HAC 29)    MW = 500 (HAC 36)
IC50      pIC50     LE      BEI          LE      BEI          LE      BEI
0.1 nM    10        0.47    33.3         0.35    25.0         0.28    20.0
1 nM      9         0.43    30.0         0.31    22.5         0.25    18.0
10 nM     8         0.38    26.7         0.28    20.0         0.22    16.0
100 nM    7         0.33    23.3         0.24    17.5         0.19    14.0
1 μM      6         0.29    20.0         0.21    15.0         0.17    12.0
10 μM     5         0.24    16.7         0.17    12.5         0.14    10.0
100 μM    4         0.19    13.3         0.14    10.0         0.11    8.0

^a This analysis defines the values as unitless. LE = pIC50/HAC; BEI = pIC50/MW in kDa.

The surface efficiency index (SEI), defined as pKi, pKd, or the pIC50 divided by PSA normalized to 100 Å², provides a means of relating potency to the polar surface area of a molecule.79,80 In this approach to the analysis of drug-target interactions, PSA is normalized to 100 Å² since this is a value that has been associated with a marked change in oral bioavailability in small molecule drug candidates (Table 38). An SEI = 18 has been suggested as a target value, and although independent of LE and BEI, SEI can be used in conjunction with these parameters to analyze drug properties. Notably, a high dependence of biological activity on polarity, manifested as a low SEI, generally leads to compounds with poor membrane permeability.

Table 38. SEI for Compounds with Potency Values of 1 and 10 nM and Polar Surface Areas Ranging from 40–125 Å²^a

potency    PSA (Å²)    SEI
1 nM       40          22.5
           50          18
           75          12
           100         9
           125         7.2
10 nM      40          20
           50          16
           75          10.66
           100         8
           125         6.4

^a SEI = pKi, pKd, or pIC50 ÷ PSA/100 Å².

Reference values that have been proposed for LE, BEI, and SEI based on a compound that expresses 1 nM affinity for its target, has a MW = 333 with 25 heavy atoms and a PSA = 50 Å², are compiled in Table 39.

Table 39. Reference Values for LE, BEI, and SEI^a

index    reference value               unitless reference value
LE       0.50 kcal/mol/heavy atom      0.36
BEI      37.1 kcal/mol/heavy atom      27.0
SEI      24.7 kcal/mol/Å²              18.0

^a Reference values based on: Ki, Kd, or IC50 = 1 nM; pKi, pKd, or pIC50 = 9; ΔGbinding = −12.36 kcal/mol; MW = 0.333 kDa; ∼25 non-H:HAC = 25; PSA = 50 Å².

3.10. Lipophilic Ligand Efficiency: LLE or LipE. Another index that has emerged with utility in lead optimization is lipophilic ligand efficiency (LLE), also designated as LipE and defined in Figure 12. LLE relates potency to lipophilicity and was introduced initially by Leeson and Springthorpe.9,26 This relationship reflects the importance of lipophilicity in drug design and can be used in conjunction with log P, log D, or clog D. LLE provides an estimate of the binding efficiency in the context of lipophilicity and is thus an index of lipophilicity per unit of potency. For a 1 nM


inhibitor with a log P = 3, the LLE = 6, while for a 10 nM inhibitor with a log P = 3, LLE = 5, and the optimal target range for LLE is generally considered to be between 5 and 7.

Figure 12. Definition of lipophilic ligand efficiency (LLE).

3.11. Analysis of the BEI and SEI for 92 Marketed Oral Drugs. A map of the SEI and BEI distribution for 92 marketed oral drugs afforded a centroid at a BEI = 27 and a SEI = 18. However, it was noted that different therapeutic targets will have different optimum efficiency indices. For example, the epidermal growth factor receptor kinase inhibitor iressa (33, Figure 13) has a BEI = 17 and an SEI = 11.2 based on an IC50 = 20 nM and a PSA = 68.7 Å². These data provide a marked contrast with the brain penetrant antipsychotic haloperidol (34, Figure 13), which has a BEI = 25 and a SEI = 23.3 based on higher intrinsic potency, IC50 = 0.35 nM, and a lower PSA = 40.5 Å².79,80

Figure 13. Structures of iressa (33) and haloperidol (34).

3.12. Ligand Efficiency and Molecular Size. The original description of the LE of compound association with proteins noted that the free energy of binding began to plateau at 15 heavy atoms, with very little additional binding energy associated with larger molecules.77 A deeper analysis that assessed over 8,000 ligands binding to 28 protein targets using either IC50 data, available for 6,072 compounds, or Ki, measured for 2,581 ligands, provided additional insight into the relationship between LE and molecular size.82–84 For the data set examined, the IC50 numbers ranged from 0.01 nM to 5.5 mM, while the Ki data ranged from 0.01 pM to 213 μM. The HAC for these ligands ranged from 7 to 62 for the IC50 cohort and 6–78 where potency was defined by a Ki, with the average values presented in Table 40.

Table 40. Average Ki, HAC, and Ligand Efficiency for a Series of 8,000 Ligands

data set    average potency (nM)    average HAC    average LE
Ki          7.8 ± 1.5               36.1 ± 10.4    0.24 ± 0.09
IC50        6.6 ± 1.2               26.8 ± 6.8     0.26 ± 0.07

Plots of IC50 or Ki versus HAC revealed that ligand efficiency does not show a linear relationship with molecular size, confirming the earlier observations of Kuntz.77,82 Lower MW molecules exhibited a higher LE than molecules of higher MW, a phenomenon most effectively illustrated by a plot of only the most potent ligands at each size. Both the best and average LE fell off dramatically between 10 and 25 heavy atoms, with the average LE for both data sets close to the 0.3 value that has been suggested as a reference standard. Thus, while a LE of 0.3 is acceptable for a HAC of >20, LE values can be much higher when the HAC is less than 20. This lack of linearity between LE and MW across a broad range of ligands and protein target classes assumes importance when assessing the LE of small molecules in the context of an individual series.

In order to accommodate this observation, an empirical normalization factor was devised on the basis of an interpretation of the experimental data that provides an appropriate scaling of the LE of an individual molecule based on its size. The scaling factor LE_Scale, which indexes the change in ligand efficiency with MW as defined in Figure 14, equates the highest LE observed for a specific HAC over the range 10–45. This equation modifies the LE metric to reflect the decline as MW increases, fitting the maximum LE expected for a specific HAC at different potency levels to a simple exponential, an equation that allows a ready assessment of ligand efficiency in the context of molecular size.

Figure 14. Definition of LE_Scale, the scaling factor to normalize ligand efficiency to the molecular weight.

To further facilitate application of this observation, a fit quality (FQ) was calculated for each LE score, defined as FQ = LE/LE_Scale. This equation scales the most efficient ligands to a normalized score of 1 that is independent of molecular size, with FQ scores close to 1 indicative of near optimal binding based on the established data. It was noted that while FQ could exceed 1 for ligand-target interactions not adequately predicted by the data set, a low FQ is reflective of suboptimal binding efficiency. A plot of FQ versus HAC provided a normalized score, a chart that allows a ready evaluation of LE compared to molecular size and facilitates an understanding of the relative position of a molecule on the exponential curve.82–84

An illustrative example was provided by an analysis of the 3 carbonic anhydrase inhibitors presented in Table 41. The simplest inhibitor, 4-methylbenzenesulfonamide (35), has a high LE and an FQ = 1 whereas the fused thiophene derivative 36 maintains an FQ = 1, despite a 2-fold higher HAC and a lower LE. In contrast, the t-Bu derivative 37 exhibits a markedly poorer LE for the same HAC as the thiophene derivative 36, reflected very clearly

Table 41. Ligand Efficiencies Associated with a Series of Carbonic Anhydrase Inhibitors


Figure 17. Definition of group efficiency (GE).
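Group efficiency relates the change in binding free energy to the number of heavy atoms added. A minimal Python sketch, assuming GE = −ΔΔG/ΔHAC and the conversion of roughly 1.36 kcal/mol per 10-fold change in potency quoted in this review (the function and parameter names are illustrative), estimates the potency gain an added group must deliver to meet a target GE. A conversion factor nearer 1.4 kcal/mol appears to track the tabulated potency gains in Table 42 more closely; that is an inference from the numbers rather than something stated in the source.

```python
import math


def group_efficiency(fold_gain, added_heavy_atoms, kcal_per_decade=1.36):
    """GE in kcal/mol per added heavy atom for a given fold improvement in potency."""
    return kcal_per_decade * math.log10(fold_gain) / added_heavy_atoms


def required_fold_gain(target_ge, added_heavy_atoms, kcal_per_decade=1.36):
    """Fold improvement in potency an added group needs in order to reach a target GE."""
    return 10 ** (target_ge * added_heavy_atoms / kcal_per_decade)


if __name__ == "__main__":
    # Target GE values used for overall MW of about 500, 400, and 300 in Table 42.
    for added in (1, 3, 6, 12):
        gains = ", ".join(
            f"GE {ge}: {required_fold_gain(ge, added, kcal_per_decade=1.4):.3g}-fold"
            for ge in (0.31, 0.39, 0.52)
        )
        print(f"+{added:>2} heavy atoms -> {gains}")
```

Exposing the conversion factor as a parameter keeps the temperature assumption explicit, since the required gain grows exponentially with the number of added atoms and is therefore sensitive to this choice.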


in the FQ, which is less than 1 and revealing this to be an inhibitor with lower than optimal binding characteristics. The fit quality score has been suggested to be a useful metric with application across the spectrum of drug discovery ranging from an assessment of the quality of fragments discovered in fragment-based drug design exercises through lead optimization to drug candidates.

The LE_Scale equation was subsequently refined by using only Ki data, considered to be more reliable, and expanding the HAC range to encompass 10–50 atoms, providing the relationship presented in Figure 15.84 However, an alternative interpretation of the LE versus size curve based on a set of 6,945 Ki values abstracted from the BindingDB database led to the derivation of a simpler equation that was confirmed by the analysis of 16,384 pIC50 data points.85 Size-independent ligand efficiency (SILE) was related to HAC by the equation presented in Figure 16 where the affinity can be defined by RTlnIC50, pIC50, pKi, or the experimentally measured ΔG.

Figure 15. Refined definition of LE_Scale that encompasses 10–50 heavy atoms.

Figure 16. Definition of size-independent ligand efficiency (SILE).

The origins of the observation that LE erodes with increasing size are inadequately understood. It has been proposed as potentially being related to larger, more flexible ligands paying higher entropic costs and that the probability of a good fit between a ligand and protein becomes smaller as molecules increase in size and complexity. An additional factor considered was the reduction in accessible ligand surface area per atom as the size of a molecule increases.82–84 In an effort to further illuminate the source of the erosion of LE with HAC, the thermodynamics of 102 ligand-protein complexes representing 14 target classes were evaluated, with a focus on analyzing the relationship between enthalpy and entropy and ligand binding and efficiency, the latter using the free energy of the interaction as the basis.86 The plot of enthalpy versus entropy showed a significant negative correlation, with those ligands displaying a favorable entropic contribution to binding suffering offsetting positive enthalpies. In contrast, ligands that relied upon a strong enthalpic component suffered from a compensating loss in the entropic contribution to the binding energy. These findings reflect Freire’s observations that simultaneously optimizing enthalpy and entropy is challenging because of the propensity for these factors to align in opposing directions.12,13 Plots of enthalpy or entropy against the free energy of binding for the data set showed no discernible correlation, indicating that neither can be used as a proxy for free energy. However, a plot of enthalpy and entropy in relationship to molecular size identified subtle trends that were more effectively revealed by plotting the efficiencies of enthalpy and entropy, that is, by dividing these energies by the HAC. Under these circumstances, enthalpy and entropy separated into distinct patterns of behavior, with enthalpic efficiency eroding as molecular size increased, a pattern similar to that observed for the free energy of interaction. In contrast, entropic efficiency was largely unchanged as HAC increased, indicating that the entropic component to binding free energy is not responsible for the erosion of ligand efficiency as HAC increases.86

3.13. Group Efficiency (GE): An Assessment of the LE of Drug Fragments Used in Lead Optimization. Group efficiency (GE) was introduced as an extension of the concept of ligand efficiency with a specific focus on application in the lead optimization phase that allows a straightforward understanding of the relationship between changes in structure and potency based on heavy atom count.81 The units of GE, defined in Figure 17, are kcal/mol/HAC, although pKi, pKd, or pIC50 values can, in principle, be used to afford a unitless index. GE is a sensitive and useful metric that defines the quality of an added structural element and which offers a simple guideline to calibrate the expected gain in potency as a function of the size of an added group. This equation is easily applied in lead optimization campaigns, providing a straightforward means of assessing changes in potency with structure. The targeted potency gains for groups in which HAC changes in the range 1–12 for 3 different GE values are presented in Table 42.

Table 42. Potency Gain As a Function of the Number of Added Atoms at 3 Molecular Weights

        GE = 0.31    GE = 0.39    GE = 0.52
ΔHAC    (MW ∼500)    (MW ∼400)    (MW ∼300)
1       1.7          1.9          2.3
2       2.8          3.6          5.5
3       4.6          6.8          13
4       7.7          13           30
5       13           24           71
6       22           46           170
7       36           88           390
8       60           170          920
9       100          320          2200
10      170          600          5100
11      280          1100         12000
12      460          2200         28000

As an illustrative example of the use of GE as a guideline, the addition of the 6 heavy atoms associated with introducing a simple phenyl ring would require a 22-fold increase in potency in order to achieve a LE of 0.31 when targeting an overall MW of 500 Da. Alternatively, if the addition of a phenyl ring is accompanied by only a 10-fold increase in potency, then the group efficiency equates to 1.37 kcal/mol ÷ 6 = 0.21 kcal/mol/HAC, which rises to a GE = 0.285 kcal/mol/HAC if potency increases by 20-fold (data summarized in Table 43 for a molecule with 25 heavy atoms).81

Caveats on the application of GE indexes that were recognized included the potential for oversimplification based on the assumption that the effect of the added group is modular in nature and independent of the context of the overall molecule. In addition, GE incorporates the effect of an added group on conformation and

electronics that could be allosteric in origin, with X-ray cocrystallographic analysis the most reliable approach to understanding this aspect of structure-function relationships.81

Table 43. Group Efficiency Values for the Addition of a Phenyl Ring As a Function of Heavy Atom Count and Potency

compd     HAC    IC50 (nM)    pIC50    LE      GE
A         25     10           8        0.32
A + Ph    31     1            9        0.29    0.21
A + Ph    31     0.1          10       0.32    0.285

3.14. Analysis of the Ligand Efficiency of Leads and the Resultant Drugs. In an effort to provide a perspective on the changes in ligand efficiency associated with a typical lead optimization program, several drug discovery campaigns in which both the leads and the resulting drugs could clearly be identified were carefully analyzed.87 The data set selected was restricted to pairs of molecules in which the reported binding affinity for both was conducted under the same assay conditions and withdrawn drugs were excluded. This resulted in 60 pairs of leads and the resulting drugs that were suitable for evaluation, with the variation in binding efficiencies compared to relevant calculated descriptors. The data set comprised 2 drugs approved between 1978 and 1990 and 58 approved in the 1991–2008 interval based on discoveries occurring at 40 different companies across targets encompassing 23 enzymes and 16 receptors. The distribution of the source of lead structures is compiled in Table 44, with 25% of the leads originating with literature compounds, 18% arising from scaffold morphing, 23% discovered by high throughput screening (HTS), and 17% designed on the basis of substrate or transition state mimics. The data set split into 60% of the compounds being based on known molecules or elements, while 40% were completely novel. Affinity was typically assessed by Ki or IC50 data rather than the more rigorous Kd, and the BEI was used as the basis for the analysis, with MW reported in kDa.

Table 44. Source of Lead Structures for 60 Lead/Drug Pairs Used to Analyze Binding Efficiencies

source of lead                              number of leads
literature compound                         15
HTS                                         14
scaffold morphing from literature lead      11
substrate or TS analogue                    10
diversity screen                            5
pharmacophore screen                        3
screen against related enzyme               1
derivative of literature compound           1

The MW of the drugs was found to be evenly distributed across the range 200–600 Da, while 75% of the leads exhibited a MW in the 200–400 Da range. The median MW was 328 Da for leads and 436 Da for drugs, with the MW greater for the drug in 82% of the cases examined and an average increase in MW of 89.5 Da. Not surprisingly, the distribution of ligand affinity was found to be wider for leads than drugs, with leads more typically populating the low affinity range, as might be anticipated. The majority of leads exhibited Ki values between 10 and 10,000 nM, while drugs were more refined, expressing Ki values ranging from 0.1 to 100 nM. The drugs were more potent than the lead in 90% of the cases examined, with the average potency difference = 100-fold. For the BEI, the distribution was wider for leads, which populated the lower end of the efficiency spectrum. Leads with BEIs as low as 6.8, which corresponds to a MW = 631 Da and a Ki = 53 μM, were successfully optimized to approved drugs. The BEI for 90% of the leads was >12.4, while the BEI for 90% of the drugs was >14.7, and binding efficiency changes of ≥20% in either direction were observed in half of the pairs. Drugs were found to be an average of 11% more efficient than leads, and the drugs were more efficient than their leads in 58% of cases.

An interesting observation consistent with earlier studies was that an increase in molecular weight tended to correlate with reduced BEI.87 For all lead/drug combinations where the BEI ratio was <1, the MW ratio was >1; thus, efficiency always decreased when MW increased. However, an increase in MW did not inevitably lead to reduced efficiency since 69% of drugs that were more efficient than the lead had a higher MW than the progenitor.

Surprisingly, and perhaps reflective of the time period under study, lipophilicity for this data set did not increase as a lead was developed into a drug since the median clog P for leads was 3.14 and for drugs was 3.04, and 75% of the drugs exhibited a clog P between 0 and 6. The variation in clog P between leads and drugs in the great majority of cases was less than 2 units, with a median difference of 0. Thus, for those campaigns where the drug had a higher MW than the lead, the additional structural elements that were introduced relied upon an appropriate balance of polarity and lipophilicity. The analysis of LLE trends revealed a median increase of 2 units from lead to drug based on clog P data, and LLE increased in 80% of the individual drug optimization campaigns. The median LLE increase was 1.53 units, indicating that the drugs were generally more potent while maintaining a similar level of lipophilicity to the lead, recognized as an important contributor to success.

In a series of illustrative examples, discovery campaigns were categorized into cases where the scaffold was constant and the change in BEI from lead to drug was greater than 1.3, cases where the scaffold was retained but the change in BEI was less than 1.3, and those where the scaffold was altered and the change in BEI was >1.3. Representative examples of the 3 categories and the associated data are presented in Tables 45, 46, and 47.

The conclusions of this synopsis recognized that drugs and leads exhibited similar BEI and that increases or decreases during optimization were quite common. The affinity of a drug for its target was generally much greater than the lead, but the clog P of a drug was often similar to the clog P of the lead, with the result that the LLE associated with a drug was improved over the lead. Increasing MW to improve potency was often inevitable, but avoiding an increase in clog P in the process was highlighted as a key element associated with success. Large increases in binding affinity of >30% were achieved even when the original scaffold was retained. X-ray cocrystal data were advocated as helpful for optimization based on the rational understanding of effects of structural changes, while dissecting leads and building on the most efficient fragments was a productive strategy at the early stages of lead optimization.

4. SOME RECENT EXAMPLES OF THE APPLICATION OF LE AND LLE IN DRUG OPTIMIZATION

4.1. Cyclin-Dependent Kinase-2 Inhibitors (Astex). The design of a potent cyclin-dependent kinase 2 (CDK2) inhibitor
1444 dx.doi.org/10.1021/tx200211v |Chem. Res. Toxicol. 2011, 24, 1420–1456
Chemical Research in Toxicology REVIEW

Table 45. Constant Scaffold, BEI Drug Has Changed by >1.3 BEI of Lead, i.e., the BEI Ratio Is >1.3

Table 46. Examples: Constant Scaffold, BEI of Drug Has Changed <1.3 from the Lead

Table 47. Examples: Altered Scaffold, ΔLE >1.3
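
The lead-to-drug comparison behind Tables 45–47 can be sketched in a few lines. The following is a minimal illustration, assuming BEI is pKi divided by MW in kDa (as stated in the text) and assuming that the "BEI ratio" is the larger BEI divided by the smaller one; the source does not spell out the ratio convention. The function names are hypothetical, the molecular weights are the medians quoted in the text, and the Ki values are invented for the example.

```python
import math


def bei(ki_nm: float, mw_da: float) -> float:
    """Binding efficiency index: pKi divided by MW expressed in kDa."""
    p_ki = 9.0 - math.log10(ki_nm)  # Ki given in nM
    return p_ki / (mw_da / 1000.0)


def bei_ratio(lead_bei: float, drug_bei: float) -> float:
    """Assumed convention: larger BEI over smaller BEI, so the ratio is >= 1."""
    return max(lead_bei, drug_bei) / min(lead_bei, drug_bei)


def categorize(scaffold_retained: bool, ratio: float, cutoff: float = 1.3) -> str:
    """Sort a lead/drug pair into the categories described in the text."""
    if scaffold_retained:
        if ratio > cutoff:
            return "constant scaffold, BEI ratio > 1.3"
        return "constant scaffold, BEI ratio < 1.3"
    if ratio > cutoff:
        return "altered scaffold, BEI ratio > 1.3"
    # This fourth case is not one of the three categories tabulated in the text.
    return "altered scaffold, small BEI change"


# Hypothetical pair: median MWs from the text, invented Ki values (nM).
lead_bei = bei(ki_nm=1000.0, mw_da=328.0)
drug_bei = bei(ki_nm=10.0, mw_da=436.0)
category = categorize(True, bei_ratio(lead_bei, drug_bei))
```

In this invented case a 100-fold gain in affinity accompanied by roughly 108 Da of added mass leaves BEI almost unchanged, which is in line with the observation that leads and drugs tend to exhibit similar BEI.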


derived from a lead fragment identified by an X-ray cocrystal screen provided an interesting and informative example of the use of ligand efficiency to guide decision making during optimization.88,89 The 4 structural fragments 38–41 (Figure 18) were found to bind to CDK2 after conducting a series of soaking experiments. Of these, the naphthylsulfonamide 40 offered poor vectors for optimization, while attempts to optimize the pyrazine 38 and the pyrazolopyrimidine 41, although initially promising, were ultimately unsuccessful because in both cases potency reached a plateau.

The indazole 39 proved to be a more useful lead, and an amide was introduced to C-3 of the heterocycle in order to engage the C=O of Leu83 of CDK2 via the H-bond donating NH, while the SO2NH2 moiety was added to the phenyl ring in order to establish an interaction with Asp86. These structural manipulations afforded 42 as a potent inhibitor of CDK2, IC50 = 660 nM, that exhibited good LE (Figure 19). Binding of this compound to CDK2 also relied upon a H-bonding interaction between the indazole NH and the carbonyl moiety of Glu81, while the indazole sp2 N atom accepted a H-bond from the NH of Leu83.88 At this point, a divergent strategy was adopted that evaluated the simpler pyrazole 43, which, although less potent than 42 with an IC50 = 970 nM, fully preserved both ligand efficiency and the binding mode of the progenitor. The 4-position of the pyrazole ring provided a vector useful to access the DFG region of the enzyme, and an amide moiety was introduced to install a H-bonding element, affording 44 as a compound exhibiting both improved potency and LE. The increase in potency was attributed to the establishment of a H-bond between the amide C=O of 44 and the backbone NH of Asp145 of CDK2. In addition, an intramolecular H-bond between the exocyclic NH and the pendent C-3 amide C=O provided conformational stabilization that contributed to the overall improvement in LE to 0.44. In the next phase of optimization, the CH3 of the acetamide of 44 was replaced with a phenyl ring to provide 45, a molecule with both lower potency and LE (Figure 20). However, an X-ray cocrystal structure revealed some movement in the protein, and the phenyl moiety was twisted 51° out of the plane of the amide. This conformation could be reinforced by the introduction of ortho substituents, with 2,6-difluorophenyl evaluated initially, affording the potent CDK2 inhibitor 46, IC50 = 3 nM, that demonstrates a high LE of 0.45. However, 46 suffered from low cell permeability, a deficiency that was improved by replacing the 4-fluorophenyl moiety with saturated rings to reduce the overall lipophilicity of the molecule. The added polarity provided by a piperidine ring afforded 47, a compound that represented a reasonable compromise between potency and physical properties. A 2,6-dichlorophenyl moiety gave a better pocket filling interaction than a 2,6-difluorophenyl, affording AT-7519 (48) as a refined molecule that met targeted criteria for development.

This exercise provides an elegant example of an integrated approach to lead optimization that relies upon X-ray cocrystallographic data to evaluate the effects of structural changes while carefully monitoring LE during lead optimization to ensure compound quality. With the indazole core, an analysis of LE indicated that increasing MW was affording suboptimal increases in potency. By reducing the lead indazole to a pyrazole core, LE was maintained, while an important new vector was introduced that allowed access to useful binding pockets within the protein. During the latter stages of lead optimization, the overall physicochemical properties of the molecule became more important than ligand efficiency as a means of addressing the poor activity observed in cell-based assays. In this particular

Figure 18. Structures of fragments that bind to CDK2 identified by an X-ray cocrystal screen.

Figure 19. Optimization of CDK2-inhibiting indazole 39 to pyrazole 44.

Figure 20. Optimization of pyrazole 45 to pyrazole 48.
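
The LE values quoted for the CDK2 series can be related to potency with a short calculation. The sketch below assumes that LE is the free energy of binding per heavy atom, that IC50 is used as a surrogate for Kd, and that the temperature is 298.15 K; none of these details is confirmed by the text for this particular series, and the function names are hypothetical. Because heavy atom counts are not given in the text, the second function works backward from a reported IC50 and LE to the heavy atom count they imply.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(ic50_nm: float, temp_k: float = 298.15) -> float:
    """Approximate delta G in kcal/mol, treating IC50 as a stand-in for Kd."""
    return R_KCAL * temp_k * math.log(ic50_nm * 1e-9)


def ligand_efficiency(ic50_nm: float, heavy_atoms: int, temp_k: float = 298.15) -> float:
    """LE = -delta G / HAC, in kcal/mol per heavy atom."""
    return -binding_free_energy(ic50_nm, temp_k) / heavy_atoms


def implied_heavy_atoms(ic50_nm: float, le: float, temp_k: float = 298.15) -> float:
    """Heavy atom count consistent with a reported IC50 and LE."""
    return -binding_free_energy(ic50_nm, temp_k) / le


# Values reported in the text for compound 46: IC50 = 3 nM, LE = 0.45.
hac_46 = implied_heavy_atoms(ic50_nm=3.0, le=0.45)
```

Under these assumptions the figures reported for 46 correspond to a molecule of about 26 heavy atoms, which illustrates how nanomolar potency in a small molecule translates into a high LE.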


example, clog P underestimated the lipophilicity of the molecule due to the presence of an intramolecular H-bond. AT-7519 (48), a molecule with a low MW (381 Da), excellent LE (0.45), and high solubility as the HCl salt (>25 mg/mL) was advanced into clinical trials as an intravenously administered agent.88,89

4.2. Protein Kinase B Inhibitors (Astex). Another instructive example is provided by the fragment-based drug design approach adopted to optimize inhibitors of protein kinase B (PKB) based on the lead pyrazole 49 (Figure 21) that bound to the hinge region of the enzyme active site.90 The methyl substituent of 49 was found to exert a minimal effect on potency based on the evaluation of 50. The introduction of a basic NH2 to the phenyl ring was probed in order to access an electronegative region in the ribose pocket, structural modifications that led to the evaluation of 51–53 as compounds that exhibited a 10–30-fold increase in potency while maintaining LE. An additional phenyl ring was installed on 52 in a fashion designed to project this moiety into a lipophilic pocket in the enzyme, affording 54, which exhibits 10-fold increased potency with only a modest erosion of LE (Figure 22). A chlorine atom at the para position of the phenyl ring (56) increased potency a further 10-fold, a consistent SAR observation since it was also reflected in the piperidines 55 and 57. An evaluation of group efficiency within this series of PKB inhibitors was conducted using a Free-Wilson analysis, which assesses the contributions of individual elements to the observed change in energy of interaction as the fragments are introduced. A group efficiency of ≥0.3 kcal/mol/HAC was set as a target value. The majority of the intrinsic binding affinity was found to be associated with the pyrazole ring, while the Cl atom proved to be a particularly efficient substituent, as summarized in Table 48. However, it was noted that group contributions were more evenly distributed if group efficiency was assessed using MW as the basis rather than HAC.

4.3. Soluble Epoxide Hydrolase Inhibitors (Sumitomo). As depicted in Figure 23, inhibition of soluble epoxide hydrolase (sEH) prevents the formation of 59, thereby elevating levels of epoxyeicosatrienoic acids (EETs) 58, which mediate vasodilation-induced hypotension, improved glucose tolerance, and anti-inflammatory effects.91 Reflecting the structure of the substrate, most sEH inhibitors are highly lipophilic but a group at Sumitomo sought small, hydrophilic leads in an effort to accommodate the anticipated increase in MW and lipophilicity that typically accompanies lead optimization campaigns.92 Virtual screening of a proprietary compound collection using several X-ray cocrystal structures of ligand-bound human sEH led to the construction of a focused library of 735 diverse compounds from which 68 active inhibitors were identified with an IC50 of <1 μM, a high hit rate of 9%. Inhibitors were generally amides or ureas which act as transition state mimics based on X-ray data, although one interesting β-hydroxyamine was identified in the active set. The triage process involved removing 26 known sEH inhibitors and structures considered to be nonlead-like. For the remainder, a plot of LE (LE = ΔG/HAC) vs BEI (BEI = pIC50/MW) was informative, with LE and BEI correlating in a linear relationship except when the number of heavy atoms (Cl, Br, S, and P) differed. By limiting the MW to <380 Da and the clog P to <3.5, compounds with lower ligand efficiency were removed, an exercise that reduced the collection of leads to 17 ligand-efficient inhibitors. The lead molecule selected for optimization, the cyclopropylamine derivative 60, was selected on the basis of its high LE rather than potency (data summarized in Table 49). The SEI for 60 was calculated to be in the preferred range of 5–25, which suggested good permeability, while X-ray cocrystallography revealed the binding mode. Parallel synthesis methodology was employed in hit-to-lead studies, with library inputs selected on the basis of MW and clog P such that compounds were synthesized with a target MW of <380 Da and a clog P of <4. This exercise led to the identification of 38 potent sEH inhibitors with IC50 values of <10 nM, all of which exhibited a LE of >0.37 and a BEI of >20.0. Lead optimization maintained a focus on LE while assessing liabilities in parallel, leading to the series of inhibitors summarized

Figure 21. Structures of lead protein kinase B inhibitors.

Figure 22. Structures of optimized PKB inhibitors.

Figure 23. Reaction catalyzed by soluble epoxide hydrolase (sEH).

Table 48. Free-Wilson Analysis of Protein Kinase B Inhibitors
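
The group efficiency target of 0.3 kcal/mol/HAC can be put in context with a simple pairwise estimate. This is a minimal sketch and not the Free-Wilson regression used in the cited study: it assumes that a fold change in potency converts to a free energy change via RT ln(fold) at 298.15 K, that a phenyl ring adds 6 heavy atoms, and that a chlorine adds 1. The function name is hypothetical; the 10-fold potency gains are the ones quoted in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def group_efficiency(fold_gain: float, added_heavy_atoms: int, temp_k: float = 298.15) -> float:
    """Pairwise estimate: free energy gained per heavy atom added (kcal/mol/HAC)."""
    delta_delta_g = -R_KCAL * temp_k * math.log(fold_gain)  # negative for a potency gain
    return -delta_delta_g / added_heavy_atoms


TARGET_GE = 0.3  # kcal/mol/HAC, target value quoted in the text

ge_phenyl = group_efficiency(fold_gain=10.0, added_heavy_atoms=6)  # added phenyl ring
ge_chloro = group_efficiency(fold_gain=10.0, added_heavy_atoms=1)  # para chlorine atom
```

On this rough estimate a 10-fold gain from a single chlorine atom is worth about 1.4 kcal/mol per heavy atom, well above the target, whereas the same gain from a six-atom phenyl ring falls a little below it. That is consistent with the description of the Cl atom as a particularly efficient substituent and of the phenyl ring as causing a modest erosion of LE.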


Table 49. Physicochemical Profile of the Lead Soluble Epoxide Hydrolase (sEH) Inhibitor 60

Table 50. Profiles of Optimized Soluble Epoxide Hydrolase (sEH) Inhibitors 61–66
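
The triage described for the sEH screening hits lends itself to a small worked example. The sketch below assumes LE is computed from pIC50 at 298.15 K, BEI is pIC50 divided by MW in kDa, and SEI is pIC50 divided by PSA in units of 100 Å²; the `Hit` class, the function names, and all three example compounds are hypothetical and are not taken from the study. Only the MW and clog P cutoffs, the SEI range, and the 68-of-735 hit count come from the text.

```python
import math
from dataclasses import dataclass

# Free energy per log unit of potency at 298.15 K (assumed temperature), kcal/mol.
RT_LN10 = 1.987204e-3 * 298.15 * math.log(10)


@dataclass
class Hit:
    name: str
    pic50: float
    mw: float      # Da
    clogp: float
    hac: int       # heavy atom count
    psa: float     # polar surface area, square angstroms


def le(h: Hit) -> float:
    return RT_LN10 * h.pic50 / h.hac


def bei(h: Hit) -> float:
    return h.pic50 / (h.mw / 1000.0)


def sei(h: Hit) -> float:
    return h.pic50 / (h.psa / 100.0)


def triage(hits, max_mw: float = 380.0, max_clogp: float = 3.5):
    """Keep hits under the MW and clog P limits quoted in the text."""
    return [h for h in hits if h.mw < max_mw and h.clogp < max_clogp]


hits = [
    Hit("hit_A", pic50=6.5, mw=290.0, clogp=2.1, hac=21, psa=60.0),
    Hit("hit_B", pic50=7.2, mw=455.0, clogp=4.8, hac=33, psa=75.0),
    Hit("hit_C", pic50=6.1, mw=350.0, clogp=3.9, hac=25, psa=50.0),
]
kept = triage(hits)
hit_rate_percent = 100.0 * 68 / 735  # screening hit rate reported in the text
```

In this invented set the most potent compound is discarded for size and lipophilicity, while the smaller, less potent one survives with the better efficiency metrics, mirroring the decision to select lead 60 for its LE rather than its potency.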

Figure 24. Structures of lead CB2 agonists 67 and 68.

in Table 50. Although the oxadiazole 66 matched the targeted properties profile, it was not pursued further because of the appearance of structurally similar compounds in a published patent application from a competitor.

4.4. CB2 Agonists (Pfizer). Structure–activity relationships in a series of CB2 agonists with potential application in the treatment of pain were examined using a parallel synthesis strategy based on the lead benzimidazole 67 (Figure 24), with an appreciation of physicochemical properties emphasized during the design phase.93 Ring saturation and the introduction of a nitrogen atom was a useful strategy to facilitate rapid structural derivatization via the tetrahydroimidazopyridine 68 (Figure 24). A library of 126 compounds was prepared across 5 classes, either by reductive amination to afford alkyl derivatives or by reaction with electrophilic species to afford families of carbamates, ureas, amides, and sulfonamides. Monomers were selected on the basis of the established SAR, and library design criteria were focused on targeting a clog P in the range of 0–5 with a mean of 3 and a molecular weight of <500 Da, with a mean of 377 Da. The distribution of MW versus clog P was plotted in advance of compound synthesis and the upper limit of clog P set high based on the inherently lipophilic nature of the natural ligand. As a consequence, 66% of the molecules targeted had a clog P in the range 2–4, representing a compromise with respect to oral absorption and first pass metabolism.

A plot of clog P versus the pIC50 for the 5 series gave a distribution along lines of identical LLE values, with all of the compounds except the sulfonamide series exhibiting low to modest LLE values based on their generally low potency and relatively high clog P.87 The sulfonamides 69–75 were more interesting, providing potent full agonists for which an analysis of LLE revealed high intrinsic quality (values summarized in Table 51). The sulfonamides were tolerant of structural changes that increased hydrophilicity with, for example, a phenyl (71) to 3-pyridyl (72) substitution reducing clog P by 1.5 without appreciably eroding potency. However, the intrinsic clearance in rat liver microsomes (RLM CLint) was higher than anticipated based on the good clog P and LLE values. This problem

Table 51. CB2 Agonist Potency, Rat Liver Microsome Clearance, and Physical Properties for a Series of Sulfonamides

Table 52. In Vitro and Physicochemical Profiles of the CB2 Agonists/CB1 Inverse Agonists 76–79
compd 76 77 78 79
CB2 Ki (nM) 21.0 130 0.8 1.03
MW 476 438 532 363
BEI 16 16 17 25
LE 0.31 0.29 0.38 0.45
LLE 0.1 3.4 4.9 4.7
Alog P 7.6 3.5 4.2 4.3
log P(HPLC) 4.4 1.8 3.5
cPSA 47 95 140 47
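
The BEI and LLE rows of Table 52 can be recomputed from the Ki, MW, and Alog P rows as a consistency check. The sketch below assumes BEI is pKi divided by MW in kDa and LLE is pKi minus Alog P; the dictionary layout and function names are hypothetical, while the numerical inputs are those listed in the table.

```python
import math

# Ki (nM), MW (Da), and Alog P as listed in Table 52.
table_52 = {
    "76": {"ki_nm": 21.0, "mw": 476.0, "alogp": 7.6},
    "77": {"ki_nm": 130.0, "mw": 438.0, "alogp": 3.5},
    "78": {"ki_nm": 0.8, "mw": 532.0, "alogp": 4.2},
    "79": {"ki_nm": 1.03, "mw": 363.0, "alogp": 4.3},
}


def p_ki(ki_nm: float) -> float:
    """Negative log10 of Ki in molar units, given Ki in nM."""
    return 9.0 - math.log10(ki_nm)


def bei(ki_nm: float, mw_da: float) -> float:
    return p_ki(ki_nm) / (mw_da / 1000.0)


def lle(ki_nm: float, alogp: float) -> float:
    return p_ki(ki_nm) - alogp


recomputed = {
    compd: (round(bei(v["ki_nm"], v["mw"])), round(lle(v["ki_nm"], v["alogp"]), 1))
    for compd, v in table_52.items()
}
```

With these definitions the recomputed values agree with the tabulated BEI and LLE for all four compounds, and the contrast between 76 and 79 shows how a large drop in Alog P at similar potency lifts LLE from near zero to almost 5.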

Figure 25. Structures of CB2 agonists/CB1 inverse agonists.

was addressed by removing metabolically vulnerable moieties rather than reducing clog P since it had been determined that a clog P of >1.9 was required in order to maintain high affinity for the hydrophobic receptor. Compounds 69 and 71–74 emerged from this effort as representing reasonable compromises between potency and physicochemical properties.

4.5. CB2 Agonists/CB1 Inverse Agonists (Solvay Pharmaceuticals). Optimization of compounds that combined CB2 agonism with CB1 inverse agonism focused on a series of imidazoles designed after the pyrazole SR-144528 (76, Figure 25).94 Several physicochemical parameters were carefully monitored during the optimization phase, including BEI, LE, LLE, Alog P, and cPSA. The physicochemical properties of compounds emerging from this effort were compared with 76–78, compounds originating from competitors (Figure 25 and Table 52). The imidazole moiety offered intrinsically higher polarity than the reference agents, necessitating the careful optimization of substituents given the high lipophilicity associated with the cannabinoid receptor ligand. This exercise afforded imidazole 79 as a molecule with the highest BEI and LE within the comparator set presented in Table 52. Imidazole 79 also demonstrated a high LLE, while the Alog P was lower than that of 76, a profile achieved without increasing the PSA to a level that compromised the potential for good brain penetration, a problem with the highly polar bis-sulfone-sulfonamide 78.

4.6. ATP-Competitive Akt Inhibitors (Pfizer). During the optimization of a series of Akt inhibitors, consideration of LLE became a critical element in the evolution of the lead structures.95 A series of derivatives of 3-(aminomethyl)-1-(5-methyl-7H-pyrrolo[2,3-d]pyrimidin-4-yl)pyrrolidin-3-amine were probed for Akt inhibitory activity, with aniline 80 offering good potency but poor HLM stability and presenting a hERG issue (Table 53). An analysis of trends with earlier representatives of this chemotype indicated that a clog P of <3 was required to abrogate hERG inhibition and optimize HLM stability, leading to a focus on closely analyzing LLE for the series. Acetamide 81 was an intrinsically weak Akt inhibitor but exhibited a low clog P and high LLE, while benzamide 82 offered a 10-fold improvement in potency but, importantly, maintained the clog P below 2 and preserved the good LLE associated with 81. A systematic evaluation of fluoro substitution (83–85) revealed that the para position of the benzamide ring of 80 was optimal and that the significant potency increase observed with 85 compensated for an increase in clog P, with the overall result that LLE was improved.

Further optimization used resolved substrate and either a 4-fluoro or 2,4-difluoro benzamide moiety, with the C-5 substituent of the core optimized in a fashion that maintained the clog P below 3, carefully monitored in the design process. A nitrile substituent at C-5 of the core reduced clog P and improved HLM but reduced both intrinsic potency and activity in a cell-based assay (compounds 86 and 89 in Table 54), while lipophilic C-5 substituents exhibited opposing properties (compounds

Table 53. Potency and Physicochemical Properties Associated with a Series of Akt Inhibitors 80–85

Table 54. Potency and Physicochemical Properties Associated with a Series of Optimized Benzamide-Based Akt Inhibitors

87, 88, and 90 in Table 54). A compromise was reached with an ethyl group at C-5, which was paired with a 2,4-difluorobenzamide moiety to afford 91, which demonstrated 900-fold selectivity over protein kinase A due to the steric demands incurred by the C-5 ethyl moiety.95 This compound was subsequently nominated for clinical development. A critical element in this program was developing and understanding the effect of the overall lipophilicity of the molecules on both potency and in vitro ADME properties. By monitoring changes in LLE, the best modifications were introduced in a fashion that avoided a bias toward depending on lipophilicity to improve potency. This resulted in more physicochemically balanced molecules that exhibited good kinase selectivity.

4.7. Dual PI3K/mTOR Inhibitors (Pfizer). The phosphatidylinositol 3-kinase (PI3K) pathway plays an important role in cell growth, proliferation, and survival, and genetic aberrations in the pathway have been closely linked to the development and progression of a wide range of cancers. Several kinases are active in this pathway, including PI3K, Akt, and the mammalian target of rapamycin (mTOR), leading to a focus on the concept of designing a dual inhibitor of PI3Kα and mTOR.96 A high throughput screening campaign conducted using mouse PI3Kα was followed by profiling the hits for mTOR inhibition, with lead inhibitors triaged on the basis of an analysis of ligand efficiency. The lead molecule identified by this approach, 92 (Table 55), had high in vitro clearance in HLM based on an extraction ratio (ER) = 0.68. The HLM stability of analogues correlated with log P/log D, with those compounds expressing a lower log P/log D and a LLE of <6 showing a lower ER. In order to improve liver microsomal stability, metabolically labile moieties were either removed or modified, with LLE carefully monitored to avoid low solubility, which resulted in poor absorption in rats. X-ray crystallographic data was used to identify sites where log P could effectively be modulated, leading to the introduction of polarity into the amide substituent which projects into the ribose pocket of the enzyme. Unfortunately, the cyclohexanol moiety found in 95 (Figure 26) entered into a redox cycle in vivo; therefore, the introduction of O-substituents was probed, with the CH2CH2OH derivative 94 found to be optimal. A plot of PI3Kα inhibitory potency versus clog P afforded diagonal lines representing compounds with equivalent LLE, with 94 demonstrating a LLE of 8 compared to an LLE of <6 for the lead.

4.8. HIV Non-Nucleoside Reverse Transcriptase Inhibitors: The Discovery of Lersivirine (Pfizer). The pyrazole derivative


Table 55. Potency and Physicochemical Properties Associated with Dual PI3Kα and mTOR Inhibitors

Figure 26. Structure of the dual PI3Kα and mTOR inhibitor 95.

Figure 27. Structures of HIV NNRTIs.

96 (Figure 27) was identified as an advanced HIV-1 non-nucleoside reverse transcriptase inhibitor (NNRTI) that demonstrated potent activity toward the wild-type and drug-resistant virus but suffered from rapid metabolism in HLM, attributed to the high lipophilicity which was reflected in a clog P of 4.3.97 The strategy adopted to optimize 96 focused on improving LLE, which was initially challenging because the substitution of the C-3 or C-5 ethyl moieties with polar elements invariably reduced antiviral potency. Replacing the benzylic methylene in 96, a potential site of metabolism, with an O atom gave the ether 97, a modification that improved potency several fold while preserving LE and LLE and increasing stability toward metabolic oxidation.97 However, pyrazole 97 was rapidly glucuronidated on the primary alcohol, a persistent problem in this series that could not be solved by modification of the alcohol since this element contributed to the pharmacophore. Reducing the lipophilicity of the dichlorophenyl moiety was embraced as a useful avenue to address the fundamental problem, and a survey of substitution patterns was conducted while carefully monitoring LE and LLE. This initiative initially identified the 3-cyano derivative 98 as a compound with increased polarity, clog P = 2.7, at the cost of only a 3-fold diminution in potency, figures that contributed to an increase in LLE from 2.23 for 97 to 3.77 for 98. However, the stability of 98 in HLM, T1/2 = 27 min, was found to be markedly poorer than that of 97, T1/2 = 89 min, attributed to reduced microsomal protein binding based on the measured change in free fraction between 97 (13%) and 98 (80%).97 Gratifyingly, the glucuronidation rate of 98 was significantly reduced compared to that of 97, leading to a lower overall clearance in vitro. Further optimization focused on the phenyl 5-substituent where a cyano moiety proved to be particularly beneficial, restoring symmetry and giving rise to lersivirine (99) as a potent HIV-1 antiviral agent that satisfied targeted physical properties with a clog P = 2.1, LE = 0.43, and LLE = 4.92, the latter based on enzyme inhibition data.98 Lersivirine (99) is currently undergoing clinical evaluation for the treatment of HIV-1 infection.

5. EPILOGUE

There is an increasing awareness that contemporary drug discovery campaigns are moving toward designing compounds that rely too heavily on increased MW and lipophilicity (log P) to drive potency, leading to the synthesis of molecules with physicochemical properties that present significant challenges to drug developability and durability. The analysis of the thermodynamics of binding of marketed HIV protease and HMG-CoA reductase inhibitors has suggested that the energetics of the best-in-class drugs in each series of compounds, both developed over a 10 year time frame, are more heavily dependent on enthalpy rather than entropy. In the absence of isothermal titration calorimetry data, ligand efficiency, binding efficiency, group efficiency,

lipophilic ligand efficiency, and the surface efficiency index provide readily determined guideposts that are useful in decision making during lead optimization. An additional consideration that is an important part of the issue is a tendency to rely on scaffolds and structural elements that are too heavily dependent on sp2 atoms, particularly carboaromatic rings which have been associated with poor developability prospects. This dependence may have its roots in a reliance on the kinds of contemporary chemical reactions that offer reliability, versatility, and compatibility with a broad range of functionality while avoiding chirality. However, a recent analysis of the reactions used by medicinal chemists at AstraZeneca, Pfizer, and GlaxoSmithKline examined this concept in some detail and concluded that although aromatic rings are ubiquitous in their deployment, there was no apparent overdependence on planar molecules.99 It should be noted, however, that this analysis was focused on articles published in major journals during 2008 and may not, therefore, reflect contem-

Figure 28. Structure of ABT-263.

Figure 29. Structure of HCV NS3 protease inhibitors.

Figure 30. Structures of CETP inhibitors.

Figure 31. Structure of the HCV NS5A inhibitor BMS-790052.


Table 56. Physical Properties Associated with Compounds 96–105a

a Physicochemical data taken from the physical properties module of SciFinder. A red coloration indicates that the highlighted value is in excess of the Lipinski “rule of 5” value or rotatable bond count based on rat data that is consistent with good permeability.
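
The kind of flagging described in the footnote to Table 56 can be sketched as a simple check against the familiar guideline limits. This is a minimal illustration, assuming the Lipinski rule of 5 cutoffs (MW 500 Da, clog P 5, 5 H-bond donors, 10 H-bond acceptors) and a limit of 10 rotatable bonds from the rat bioavailability analysis of Veber and co-workers; the class, the function name, and the example property values are hypothetical and are not taken from Table 56.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    mw: float              # molecular weight, Da
    clogp: float
    hbd: int               # H-bond donors
    hba: int               # H-bond acceptors
    rotatable_bonds: int


def guideline_flags(p: Properties) -> list:
    """Return the guideline limits that a compound exceeds."""
    flags = []
    if p.mw > 500:
        flags.append("MW > 500")
    if p.clogp > 5:
        flags.append("clog P > 5")
    if p.hbd > 5:
        flags.append("HBD > 5")
    if p.hba > 10:
        flags.append("HBA > 10")
    if p.rotatable_bonds > 10:
        flags.append("rotatable bonds > 10")
    return flags


# Hypothetical large, macrocycle-like compound (values invented for illustration).
example = Properties(mw=730.0, clogp=4.6, hbd=3, hba=11, rotatable_bonds=9)
flags = guideline_flags(example)
```

A compound of this sort exceeds the MW and H-bond acceptor limits while staying under the rotatable bond limit, which is the type of profile the text associates with macrocyclic inhibitors.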

porary trends as effectively as an assessment of reactions published in recent patent applications.

The physicochemical properties of some protein targets are such that large molecules that populate areas beyond that considered to be drug-like space are essential for biological activity, necessitating working under more challenging conditions.100 The Bcl inhibitor ABT-263 (100, Figure 28),101 the hepatitis C virus (HCV) NS3 protease inhibitors telaprevir (101), boceprevir (102), danoprevir (103), TMC-434350 (104), vaniprevir (105) and MK-5172 (106) (Figure 29),102 the cholesteryl ester transfer protein (CETP) inhibitors torcetrapib (107) and anacetrapib (108) (Figure 30),103 and the HCV NS5A inhibitor BMS-790052 (109) (Figure 31)104 are examples of compounds that are orally bioavailable but violate several of the rules that are considered as guidelines for absorption and developability (properties summarized in Table 56). Notably, the HCV NS3 protease inhibitors danoprevir (103), TMC-434350 (104), vaniprevir (105), and MK-5172 (106) take advantage of macrocyclization to reduce rotatable bond count compared to the acyclic inhibitor telaprevir (101). With the exception of torcetrapib (107), all of these molecules are currently in clinical trials, and telaprevir (101) and boceprevir (102) have recently been approved for marketing by the FDA. Several of the properties of these molecules fall beyond the drug-like space and represent some of the emerging problems in drug design that have been highlighted by the retrospective analyses of marketed drugs. However, the biological targets of these molecules have presented some of the most difficult challenges encountered in contemporary drug discovery and have required the adoption of unique tactical approaches. The future application of physicochemical principles to the design of drugs of this type will likely require some adaptation of the use of the guideposts that have been developed to influence decision making during the lead optimization phase. For example, one approach that may be a useful strategy to increase permeability in molecules with properties that extend beyond the rule of 5 is to take advantage of intramolecular H-bonding to reduce the exposure of polar elements within a molecule while promoting a more compact conformation that may facilitate absorption.62,105

AUTHOR INFORMATION

Corresponding Author
*Tel: 203-677-6679. Fax: 203-677-7884. E-mail: Nicholas.Meanwell@bms.com.

ACKNOWLEDGMENT

I thank my colleagues Drs. Dinesh Vyas and Michael A. Walker for their critical review of the manuscript and constructive comments.

ABBREVIATIONS

ADME, absorption, distribution, metabolism, excretion; ADMET, absorption, distribution, metabolism, excretion, and toxicity; AP, aromatic proportion, the number of aromatic atoms divided by the total number of atoms; BEI, binding efficiency index; CADD, computer-aided drug design; CB, cannabinoid receptor; CDK, cyclin-dependent kinase; clog P, calculated logarithm of the octanol/water partition coefficient; CNS, central nervous system; ER, extraction ratio; GE, group efficiency; GPCR, G-protein-coupled receptor; HAC, heavy atom count; HBD, hydrogen-bond donor; HBA, hydrogen-bond acceptor; HCV, hepatitis C virus; hERG, the human Ether-a-go-go Related Gene; HLM, human liver microsomes; LE, ligand efficiency; LLE, lipophilic ligand efficiency; LM, liver microsomes; log P, logarithm of the octanol/water partition coefficient; MW, molecular weight; mp, melting point; MPO, multiparameter optimization; mTOR, mammalian target of rapamycin; NNRTI, non-nucleoside reverse transcriptase inhibitor; PAMPA, parallel artificial membrane permeability assay; P-gp, P-glycoprotein; PI, protease inhibitor; PK, pharmacokinetic; PKB, protein kinase B; PSA, polar surface area; RB, rotatable bond; sEH, soluble epoxide hydrolase; SEI, surface efficiency index; TPSA, topological polar surface area; Vd, volume of distribution.

REFERENCES

(1) Kola, I., and Landis, J. (2004) Can the pharmaceutical industry reduce attrition rates? Nature Rev. Drug Discovery 3, 711–715.
(2) Pharmaceutical Research and Manufacturers of America, Chart Pack, available at http://www.phrma.org/sites/default/files/159/phrma_chart_pack.pdf.
(3) Pharmaceutical Research and Manufacturers of America, Pharmaceutical Industry Profile 2011 (Washington, DC: PhRMA, March 2011) available at http://www.phrma.org/sites/default/files/159/phrma_profile_2011_final.pdf.
(4) Proudfoot, J. R. (2002) Drugs, leads, and drug-likeness: an analysis of some recently launched drugs. Bioorg. Med. Chem. Lett. 12, 1647–1650.
(5) Wenlock, M. C., Austin, R. P., Barton, P., Davis, A. M., and Leeson, P. D. (2003) A comparison of physiochemical property profiles of development and marketed oral drugs. J. Med. Chem. 46, 1250–1256.


(6) Vieth, M., Siegel, M. G., Higgs, R. E., Watson, I. A., Robertson, D. H., Savin, K. A., Durst, G. L., and Hipskind, P. A. (2004) Characteristic physical properties and structural fragments of marketed oral drugs. J. Med. Chem. 47, 224–232.
(7) Leeson, P. D., and Davis, A. M. (2004) Time-related difference in the physical property profiles of oral drugs. J. Med. Chem. 47, 6338–6348.
(8) Proudfoot, J. R. (2005) The evolution of synthetic oral drug properties. Bioorg. Med. Chem. Lett. 15, 1087–1090.
(9) Leeson, P. D., and Springthorpe, B. (2007) The influence of drug-like concepts on decision making in medicinal chemistry. Nature Rev. Drug Discovery 6, 881–890.
(10) Leeson, P. D., St-Galley, S. A., and Wenlock, M. C. (2011) Impact of ion class and time on oral drug molecular properties. MedChemComm 2, 91–105.
(11) Hann, M. M. (2011) Molecular obesity, potency and other addictions in drug discovery. MedChemComm 2, 349–355.
(12) Freire, E. (2008) Do enthalpy and entropy distinguish first in class from best in class? Drug Discovery Today 13, 869–874.
(13) Ladbury, J. E., Klebe, G., and Freire, E. (2010) Adding calorimetric data to decision making in lead discovery: a hot tip. Nat. Rev. Drug Discovery 9, 23–27.
(14) Divisions of Chemical Toxicology and Medicinal Chemistry, 240th American Chemical Society National Meeting and Exposition, Boston, Massachusetts, August 24, 2010, Abstracts 4043.
(15) Navia, M., and Chaturvedi, P. R. (1996) Design principles for orally bioavailable drugs. Drug Discovery Today 13, 179–189.
(16) Lipinski, C. A., Lombardo, F., Dominy, B. W., and Feeney, P. J. (1997) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 23, 3–25.
(17) Veber, D. F., Johnson, S. R., Cheng, H.-Y., Smith, B. R., Ward, K. W., and Kopple, K. D. (2002) Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem. 45, 2615–2623.
(18) Lu, J. J., Crimin, K., Goodwin, J. T., Crivori, P., Orrenius, C., Xing, L., Tandler, P. J., Vidmar, T. J., Amore, B. M., Wilson, A. G. E., Stouten, P. F. W., and Burton, P. S. (2004) Influence of molecular flexibility and polar surface area metrics on oral bioavailability in the rat. J. Med. Chem. 47, 6104–6107.
(19) Tian, S., Li, Y., Wang, J., Zhang, J., and Hou, T. (2011) ADME evaluation in drug discovery. 9. Prediction of oral bioavailability in humans based on molecular properties and structural fingerprints. Mol. Pharmaceutics 8, 841–851.
(20) Gleeson, M. P. (2008) Generation of a set of simple, interpretable ADMET rules of thumb. J. Med. Chem. 51, 817–834.
(21) Varma, M. V. S., Obach, R. S., Rotter, C., Miller, H. R., Chang, G., Steyn, S. J., El-Kattan, A., and Troutman, M. D. (2010) Physicochemical space for optimum oral bioavailability: contribution of human intestinal absorption and first-pass metabolism. J. Med. Chem. 53, 1098–1108.
(22) Ritchie, T. J., Ertl, P., and Lewis, R. (2011) The graphical representation of ADME-related molecule properties for medicinal chemists. Drug Discovery Today 16, 65–72.
(23) O’Shea, R., and Moser, H. E. (2008) Physicochemical properties of antibacterial compounds: implications for drug discovery. J. Med. Chem. 51, 2871–2878.
(24) Hitchcock, S. A., and Pennington, L. D. (2006) Structure-brain exposure relationships. J. Med. Chem. 49, 7559–7583.
(25) Hughes, J. D., Blagg, J., Price, D. A., Bailey, S., DeCrescenzo, G. A., Devraj, R. V., Ellsworth, E., Fobian, Y. M., Gibbs, M. E., Gilles, R. W., Greene, N., Huang, E., Krieger-Burke, T., Loesel, J., Wager, T., Whitely, L., and Zhang, Y. (2008) Physicochemical drug properties associated with in vivo toxicological outcomes. Bioorg. Med. Chem. Lett. 18, 4872–4875.
(26) Edwards, M. P., and Price, D. A. (2010) Role of physicochemical properties and ligand lipophilicity efficiency in addressing drug safety risks. Annu. Rep. Med. Chem. 45, 381–391.
(27) Leeson, P. D., and Empfield, J. R. (2010) Reducing the risk of drug attrition associated with physicochemical properties. Annu. Rep. Med. Chem. 45, 393–407.
(28) Azzaoui, K., Hamon, J., Faller, B., Whitebread, S., Jacoby, E., Bender, A., Jenkins, J. L., and Urban, L. (2007) Modeling promiscuity based on in vitro safety pharmacology profiling data. ChemMedChem 2, 874–880.
(29) Jamieson, C., Moir, E. M., Rankovic, Z., and Wishart, G. (2006) Medicinal chemistry of hERG optimizations: highlights and hang-ups. J. Med. Chem. 49, 5029–5046.
(30) Peters, J.-E., Schnider, P., Mattei, P., and Kansy, M. (2009) Pharmacological promiscuity: dependence on compound properties and target specificity in a set of recent Roche compounds. ChemMedChem 4, 680–686.
(31) Kresja, C. M., Horvath, D., Rogalski, S. L., Penzotti, J. E., Mao, B., Barbosa, F., and Migeon, J. C. (2003) Predicting ADME properties and side effects: the BioPrint approach. Curr. Opin. Drug Discovery Dev. 6, 470–480.
(32) Hopkins, A. L., Mason, J. S., and Overington, J. P. (2006) Can we rationally design promiscuous drugs? Curr. Opin. Struct. Biol. 16, 127–136.
(33) Waring, M. J., and Johnstone, C. (2007) A quantitative assessment of hERG liability as a function of lipophilicity. Bioorg. Med. Chem. Lett. 17, 1759–1764.
(34) Anderson, N., and Borlak, J. (2006) Drug-induced phospholipidosis. FEBS Lett. 580, 5533–5540.
(35) Alakoskela, J.-M., Vitovic, P., and Kinnunen, P. K. J. (2009) Screening for the drug-phospholipid interaction: correlation to phospholipidosis. ChemMedChem 4, 1224–1251.
(36) Ploemen, J.-P.H.T.M., Kelder, J., Hafmans, T., van de Sandt, H., van Burgsteden, J. A., Saleminki, P. J., and van Esch, E. (2004) Use of physicochemical calculation of pKa and CLogP to predict phospholipidosis-inducing potential: a case study with structurally related piperazines. Exp. Toxicol. Pathol. 55, 347–355.
(37) Pelletier, D. J., Gehlhaar, D., Tilloy-Ellul, A., Johnson, T. O., and Greene, N. (2007) Evaluation of a published in silico model and construction of a novel Bayesian model for predicting phospholipidosis inducing potential. J. Chem. Inf. Model. 47, 1196–1205.
(38) Tomizawa, K., Sugano, K., Yamada, H., and Horii, I. (2006) Physicochemical and cell-based approach for early screening of phospholipidosis-inducing potential. J. Toxicol. Sci. 31, 315–324.
(39) Hanumegowda, U. M., Wenke, G., Regueiro-Ren, A., Yordanova, R., Corradi, J. P., and Adams, S. P. (2010) Phospholipidosis as a function of basicity, lipophilicity, and volume of distribution of compounds. Chem. Res. Toxicol. 23, 749–755.
(40) Ratcliffe, A. J. (2009) Medicinal chemistry strategies to minimize phospholipidosis. Curr. Med. Chem. 16, 2816–2823.
(41) Ritchie, T. R., and Macdonald, S. J. F. (2009) The impact of aromatic ring count on compound developability: are too many aromatic rings a liability in drug design? Drug Discovery Today 14, 1011–1020.
(42) Ritchie, T. R., Macdonald, S. J. F., Young, R. J., and Pickett, S. D. (2011) The impact of aromatic ring count on compound developability: further insights by examining carbo- and hetero-aromatic and -aliphatic ring types. Drug Discovery Today 16, 164–171.
(43) Lovering, F., Bikker, J., and Humblet, C. (2009) Escape from flatland: increasing saturation as an approach to improving clinical success. J. Med. Chem. 52, 6752–6756.
(44) Yan, A., and Gasteiger, J. (2003) Prediction of aqueous solubility of organic compounds by topological descriptors. QSAR Comb. Sci. 22, 821–829.
(45) Bodycombe, N. E., Carrinski, H. A., Wilson, J. A., Shamji, A. F., Wagner, B. K., Koehler, A. N., and Schreiber, S. L. (2010) Small molecules of different synthetic and natural origins have distinct distributions of structural complexity that correlate with protein binding profiles. Proc. Natl. Acad. Sci. U.S.A. 107, 18787–18792.
(46) Dancik, V., Seiler, K. P., Young, D. W., Schreiber, S. L., and Clemons, P. A. (2010) Distinct biological network properties between the targets of natural products and disease genes. J. Am. Chem. Soc. 132, 9259–9261.
(47) Hill, A. P., and Young, R. J. (2010) Getting physical in drug discovery: a contemporary perspective on solubility and hydrophobicity. Drug Discovery Today 15, 648–655.
(48) Lipinski, C. A. (2000) Drug-like properties and the causes of poor solubility and poor permeability. J. Pharmacol. Toxicol. Methods 44, 235–249.
(49) Curatolo, W. (1998) Physical chemical properties of oral drug candidates in the discovery and exploratory development settings. Pharm. Sci. Technol. Today 1, 387–393.
(50) Amidon, G. L., Lennernas, H., Shah, V. P., and Crison, J. R. (1995) A theoretical basis for a biopharmaceutics drug classification: the correlation of in vitro drug product dissolution and in vivo bioavailability. Pharm. Res. 12, 413–420.
(51) Takagi, T., Ramachandran, C., Bermejo, M., Yamashita, S., Yu, L. X., and Amidon, G. L. (2006) A provisional biopharmaceutical classification of the top 200 oral drug products in the United States, Great Britain, Spain, and Japan. Mol. Pharmaceutics 3, 631–643.
(52) Lamanna, C., Bellini, M., Padova, A., Westerberg, G., and Maccari, L. (2008) Straightforward recursive partition model for discarding insoluble compounds in the drug discovery process. J. Med. Chem. 51, 2891–2897.
(53) Gleeson, M. P., Hersey, A., Montanari, D., and Overington, J. (2011) Probing the links between in vitro potency, ADMET and physicochemical parameters. Nature Rev. Drug Discovery 10, 197–208.
(54) Paolini, G. V., Shapland, R. H. B., van Hoorn, W. P., Mason, J. S., and Hopkins, A. L. (2006) Global mapping of pharmacological space. Nat. Biotechnol. 24, 805–815.
(55) Smith, G. (2009) Medicinal chemistry by the numbers: the physicochemistry, thermodynamics and kinetics of modern drug design. Prog. Med. Chem. 48, 1–29.
(56) Teague, S. J., Davis, A. M., Leeson, P. D., and Oprea, T. (1999) The design of leadlike combinatorial libraries. Angew. Chem., Int. Ed. 38, 3743–3748.
(57) Oprea, T. I., Davis, A. M., Teague, S. J., and Leeson, P. D. (2001) Is there a difference between leads and drugs? A historical perspective. J. Chem. Inf. Comput. Sci. 41, 1308–1315.
(58) Hann, M. M., Leach, A. R., and Harper, G. (2001) Molecular complexity and its impact on the probability of finding leads for drug discovery. J. Chem. Inf. Comput. Sci. 41, 856–864.
(59) Hann, M. M., and Oprea, T. I. (2004) Pursuing the leadlikeness concept in pharmaceutical research. Curr. Opin. Chem. Biol. 8, 255–263.
(60) Congreve, M., Carr, R., Murray, C., and Jhoti, H. (2003) A ‘rule of three’ for fragment-based lead discovery? Drug Discovery Today 8, 876–877.
(61) Waring, M. J. (2009) Defining optimum lipophilicity and molecular weight ranges for drug candidates: molecular weight dependent lower logD limits based on permeability. Bioorg. Med. Chem. Lett. 19, 2844–2851.
(62) Johnson, T. W., Dress, K. R., and Edwards, M. (2009) Using the Golden Triangle to optimize clearance and oral absorption. Bioorg. Med. Chem. Lett. 19, 5560–5564.
(63) Park, R., and Kitteringham, N. R. (1994) Effects of fluorine substitution on drug metabolism: pharmacological and toxicological implications. Drug Metab. Rev. 26, 605–643.
(64) Diana, G. D., Rudewicz, P., Pevear, D. C., Nitz, T. J., Aldous, S. C., Aldous, D. J., Robinson, D. T., Draper, T., Dutko, F. J., Aldi, C., Gendron, G., Oglesby, R. C., Volkots, D. L., Reurnan, M., Bailey, T. R., Czerniak, R., Block, T., Roland, R., and Oppermand, J. (1996) Picornavirus inhibitors: trifluoromethyl substitution provides a global protective effect against hepatic metabolism. J. Med. Chem. 38, 1355–1371.
(65) Swaminathan, S., Siddiqui, A. U., Pinkerton, N. G., Wilson, W. K., and Schroepfer, G. J. (1994) Inhibitors of sterol synthesis: 3β-hydroxy-25,26,26,26,27,27-heptafluoro-5α-cholestan-15-one, an analog of a potent hypocholesterolemic agent in which its major metabolism is blocked. Biochem. Biophys. Res. Commun. 201, 168–173.
(66) Kuhn, B., Mohr, P., and Stahl, M. (2010) Intramolecular hydrogen bonding in medicinal chemistry. J. Med. Chem. 53, 2601–2611.
(67) Wager, T. T., Chandrasekaran, R. Y., Hou, X., Troutman, M. D., Verhoest, P. R., Villalobos, A., and Will, Y. (2010) Defining desirable central nervous system drug space through alignment of molecular properties, in vitro ADME, and safety attributes. ACS Chem. Neurosci. 1, 420–434.
(68) Wager, T. T., Hou, X., Verhoest, P. R., and Villalobos, A. (2010) Moving beyond rules: the development of a central nervous system multiparameter optimization (CNS MPO) approach to enable alignment of druglike properties. ACS Chem. Neurosci. 1, 435–449.
(69) Ferenczy, G. G., and Keserű, G. M. (2010) Thermodynamics guided lead discovery and optimization. Drug Discovery Today 15, 919–932.
(70) Ferenczy, G., and Keserű, G. M. (2010) Enthalpic efficiency of ligand binding. J. Chem. Inf. Model. 50, 1536–1541.
(71) Lafont, V., Armstrong, A. A., Ohtaka, H., Kiso, Y., Amzel, L. M., and Freire, E. (2007) Compensating enthalpic and entropic changes hinder binding affinity optimization. Chem. Biol. Drug Des. 69, 413–422.
(72) Freire, E. (2009) A thermodynamic approach to the affinity optimization of drug candidates. Chem. Biol. Drug Des. 74, 468–472.
(73) Holdgate, G. A., and Ward, H. J. (2005) Measurements of binding thermodynamics in drug discovery. Drug Discovery Today 10, 1543–1550.
(74) Freire, E. (2004) Isothermal titration calorimetry: controlling binding forces in lead optimization. Drug Discovery Today: Technologies 1, 295–299.
(75) Carbonell, T., and Freire, E. (2005) Binding thermodynamics of statins to HMG-CoA reductase. Biochemistry 44, 11741–11748.
(76) Velazquez-Campoy, A., Luque, I., Todd, M. J., Milutinovich, M., Kiso, Y., and Freire, E. (2000) Thermodynamic dissection of the binding energetics of KNI-272, a potent HIV-1 protease inhibitor. Protein Sci. 9, 1801–1809.
(77) Kuntz, I. D., Chen, K., Sharp, K. A., and Kollman, P. A. (1999) The maximal affinity of ligands. Proc. Natl. Acad. Sci. U.S.A. 96, 9997–10002.
(78) Hopkins, A. L., Groom, C. R., and Alex, A. (2004) Ligand efficiency: a useful metric for lead selection. Drug Discovery Today 9, 430–431.
(79) Abad-Zapetero, C., and Metz, J. T. (2005) Ligand efficiency indices as guideposts for drug discovery. Drug Discovery Today 10, 464–469.
(80) Abad-Zapetero, C. (2007) Ligand efficiency indices for effective drug discovery. Exp. Opin. Drug Discovery 2, 469–488.
(81) Verdonk, M. L., and Rees, D. C. (2008) Group efficiency: a guideline for hits-to-leads chemistry. ChemMedChem 3, 1179–1180.
(82) Reynolds, C. H., Bembenek, S. D., and Tounge, B. A. (2007) The role of molecular size in ligand efficiency. Bioorg. Med. Chem. Lett. 17, 4258–4261.
(83) Reynolds, C. H., Tounge, B. A., and Bembenek, S. D. (2008) Ligand binding efficiency: trends, physical basis, and implications. J. Med. Chem. 51, 2432–2438.
(84) Bembenek, S. D., Tounge, B. A., and Reynolds, C. H. (2009) Ligand binding efficiency and fragment-based drug discovery. Drug Discovery Today 14, 278–283.
(85) Nissink, J. W. M. (2009) Simple, size-independent measure of ligand efficiency. J. Chem. Inf. Model. 49, 1617–1622.
(86) Reynolds, C. H., and Holloway, M. K. (2011) Thermodynamics of ligand binding and efficiency. ACS Med. Chem. Lett. 2, 433–437.
(87) Perola, E. (2010) An analysis of the binding efficiencies of drugs and their leads in successful drug discovery programs. J. Med. Chem. 53, 2986–2997.
(88) Wyatt, P. G., Woodhead, A. J., Berdini, V., Boulstridge, J. A., Carr, M. G., Cross, D. M., Davis, D. J., Devine, L. A., Early, T. R., Feltell, R. E., Lewis, E. J., McMenamin, R. L., Navarro, E. F., O’Brien, M. A., O’Reilly, M., Reule, M., Saxty, G., Seavers, L. C. A., Smith, D.-M., Squires, M. S., Trewartha, G., Walker, M. T., and Woolford, A. J.-A. (2008) Identification of N-(4-piperidinyl)-4-(2,6-dichlorobenzoylamino)-1H-pyrazole-3-carboxamide (AT7519), a novel cyclin dependent kinase inhibitor using fragment-based X-ray crystallography and structure based design. J. Med. Chem. 51, 4986–4999.
(89) Squires, M. S., Feltell, R. E., Wallis, N. G., Lewis, E. J., Smith, D.-M., Cross, D. M., Lyons, J. F., and Thompson, N. T. (2009)
Biological characterization of AT7519, a small-molecule inhibitor of cyclin-dependent kinases, in human tumor cell lines. Mol. Cancer Ther. 8, 324–332.
(90) Saxty, G., Woodhead, S. J., Berdini, V., Davies, T. G., Verdonk, M. L., Wyatt, P. G., Boyle, R. G., Barford, D., Downham, R., Garrett, M. D., and Carr, R. A. (2007) Identification of inhibitors of protein kinase B using fragment-based lead discovery. J. Med. Chem. 50, 2293–2296.
(91) Imig, J. D., and Hammock, B. D. (2009) Soluble epoxide hydrolase as a therapeutic target for cardiovascular diseases. Nature Rev. Drug Discovery 8, 794–805.
(92) Tanaka, D., Tsuda, Y., Shiyama, T., Nishimura, T., Chiyo, N., Tominaga, Y., Sawada, N., Mimoto, T., and Kusunoes, N. (2011) A practical use of ligand efficiency indices out of the fragment-based approach: ligand efficiency-guided lead identification of soluble epoxide hydrolase inhibitors. J. Med. Chem. 54, 851–857.
(93) Ryckmans, T., Edwards, M. P., Horne, V. A., Monica Correia, A., Owen, D. R., Thompson, L. R., Tran, I., Tutt, M. F., and Young, T. (2009) Rapid assessment of a novel series of selective CB2 antagonists using parallel synthesis protocols: a lipophilic efficiency analysis. Bioorg. Med. Chem. Lett. 19, 4406–4409.
(94) Lange, J. H. M., van der Neut, M. A. W., Wals, H. C., Kuil, G. D., Borst, A. J. M., Mulder, A., den Hartog, A. P., Zilaout, H., Goutier, W., van Stuivenberg, H. H., and van Vliet, B. J. (2010) Synthesis and SAR of novel imidazoles as potent and selective cannabinoid CB2 receptor antagonists with high binding efficiencies. Bioorg. Med. Chem. Lett. 20, 1084–1089.
(95) Freeman-Cook, K. D., Autry, C., Borzillo, G., Gordon, D., Barbacci-Tobin, E., Bernardo, V., Briere, D., Clark, T., Corbett, M., Jakubczak, J., Kakar, S., Knauth, E., Lippa, B., Luzzio, M. J., Mansour, M., Martinelli, G., Marx, M., Nelson, K., Pandit, J., Rajamohan, F., Robinson, S., Subramanyam, C., Wei, L., Wythes, M., and Morris, J. (2011) Design of selective ATP-competitive inhibitors of Akt. J. Med. Chem. 53, 4615–4622.
(96) Cheng, H., Bagrodia, S., Bailey, S., Edwards, M., Hoffman, J., Hu, Q., Kania, R., Knighton, D. R., Marx, M. A., Ninkovic, S., Sun, S., and Zhang, E. (2010) Discovery of highly potent PI3K/mTOR dual inhibitor PF-04691502 through structure based drug design. MedChemComm 1, 139–144.
(97) Mowbray, C. E., Corbau, R., Hawes, M., Jones, L. H., Mills, J. E., Perros, M., Selby, M. D., Stupple, P. A., Webster, R., and Wood, A. (2009) Pyrazole NNRTIs 3: Optimisation of physicochemical properties. Bioorg. Med. Chem. Lett. 19, 5603–5606.
(98) Mowbray, C. E., Burt, C., Corbau, R., Gayton, S., Hawes, M., Perros, M., Tran, I., Price, D. A., Quinton, F. J., Selby, M. D., Stupple, P. A., Webster, R., and Wood, A. (2009) Pyrazole NNRTIs 4: Selection of UK-453,061 (lersivirine) as a development candidate. Bioorg. Med. Chem. Lett. 19, 5857–5860.
(99) Roughley, S. D., and Jordan, A. M. (2011) The medicinal chemist’s toolbox: an analysis of reactions used in the pursuit of drug candidates. J. Med. Chem. 54, 3451–3479.
(100) Zhao, H. (2011) Optimization in the nondrug-like space. Drug Discovery Today 16, 158–163.
(101) Wendt, M. (2008) Discovery of ABT-263, a Bcl-family protein inhibitor: observations on targeting a large protein-protein interaction. Exp. Opin. Drug Discovery 3, 1123–1143.
(102) Chen, K. X., and Njoroge, F. G. (2009) A review of HCV protease inhibitors. Curr. Opin. Invest. Drugs 10, 821–837.
(103) Shinkai, H. (2009) Cholesteryl ester transfer protein inhibitors as high-density lipoprotein raising agents. Exp. Opin. Ther. Patents 19, 1229–1237.
(104) Gao, M., Nettles, R. E., Belema, M., Snyder, L. B., Nguyen, V. N., Fridell, R. A., Serrano-Wu, M. H., Langley, D. R., Sun, J.-H., O’Boyle, D. R., II, Lemm, J. A., Wang, C., Knipe, J. O., Chien, C., Colonno, R. J., Grasela, D. M., Meanwell, N. A., and Hamann, L. G. (2010) Chemical genetics strategy identifies an HCV NS5A inhibitor with a potent clinical effect. Nature 465, 96–100.
(105) Alex, A., Millan, D. S., Perez, M., Waknenhut, F., and Whitlock, G. (2011) Intramolecular hydrogen bonding to improve membrane permeability and absorption in beyond rule of five chemical space. MedChemComm 2, 669–674.
