
CHAPTER 21
Discovery and Development of Lead
Compounds from Natural Sources Using
Computational Approaches
José L. Medina-Franco
Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México,
Mexico City, Mexico

OUTLINE

21.1 Introduction
21.2 NPs in Drug Discovery
21.3 Chemoinformatic Analysis of Natural Products
    21.3.1 Physicochemical Properties
    21.3.2 Molecular Scaffolds and Substructural Features
    21.3.3 Structural Fingerprints
    21.3.4 Molecular Complexity
21.4 Molecular Databases Focused on NPs and NP Derivatives
    21.4.1 Pharmacological Profiling of NP Databases
21.5 Virtual Screening and Target Fishing
    21.5.1 Virtual Screening
    21.5.2 Target Fishing and Reverse Pharmacognosy
21.6 NPs as Leads for Challenging and Emerging Targets
    21.6.1 NPs as Compounds for Modulating Protein-Protein Interactions
    21.6.2 NPs as Lead Compounds for DNA Methyltransferase Inhibitors
21.7 Uncovering Bioactivities of NPs of Dietary Origin
21.8 Concluding Remarks
Acknowledgments
References
List of Abbreviations

Evidence-Based Validation of Herbal Medicine
http://dx.doi.org/10.1016/B978-0-12-800874-4.00021-0
Copyright © 2015 Elsevier Inc. All rights reserved.

21.1 INTRODUCTION

Computer-aided drug design has gained enormous momentum in drug discovery and its contributions to drug discovery are increasing. Computational (also referred to as in silico) techniques play important roles in the various stages of lead discovery and development. In silico methods applied to drug discovery can be roughly classified into two major approaches: structure-based and ligand-based. This classification depends on the level of target structural information available to support the computational calculations. Structure-based methods use the available three-dimensional (3D) information on the molecular target of interest, which is typically obtained, for example, from X-ray crystallography, nuclear magnetic resonance, or homology modeling. Ligand-based approaches use the available information on a series of active ligands (and inactive compounds, when available) in a given assay or set of assays. Despite the fact that substantial progress has been made in a number of computational methodologies, there is still significant room for improvement, not only in the development of the techniques themselves but also in the rational and adequate application of such approaches by experts and nonexperts [1]. Cases are exemplified by reviews of the progress in molecular docking [2,3], quantitative structure-activity relationships (QSARs) [4,5], virtual screening [6,7], molecular dynamics [8], pharmacophore modeling [9], and chemoinformatics [10], to name a few.

Natural products (NPs), from either terrestrial or aquatic organisms, have a long tradition as sources of active compounds for health-related benefits. It is well acknowledged that over millions of years, Nature has optimized and selected chemical structures to generate chemical scaffolds and compounds enriched with biological function. Drawbacks of NPs that frequently diminish the enthusiasm to pursue active compounds from natural origin include challenges in the isolation and purification procedures, very small available amounts of lead compounds, the difficulty in synthesizing NPs with high structural complexity, and the associated synthesis scale-up issues. Also, in a drug discovery context, caution should be taken with natural compounds that have been designed by Nature for defense and are toxic. As such, one can expect that not all NPs have a beneficial effect on health. However, the large success of using NPs to produce bioactive compounds or bioactive mixtures has inspired the preparation of synthetic molecules that have become drugs. In addition, as reviewed later in this chapter, the unique structural features of NPs represent a promising opportunity to identify active compounds for emerging targets and for “tough targets” difficult to address with classical organic small molecules.

FIGURE 21.1 Examples of the many roles of computational approaches applied to NP-based drug discovery. NP, natural product.

NP research is increasingly being combined with computer-aided drug design techniques to accelerate the identification of novel and improved drug candidates from natural origin and to further understand and quantify the coverage of NPs in chemical space. In this chapter, we discuss the progress on the synergy between NP-based drug discovery and chemoinformatic and molecular modeling methods. Representative applications are schematically outlined in Figure 21.1. This chapter represents an update of previous reviews of such synergistic combination [11-13]. The chapter is divided into eight major sections. After this general introduction, we briefly discuss an overview of the role of NPs in drug discovery. This section is not intended to be a comprehensive analysis of the status of NP research because this point is extensively discussed in other chapters of this book. Instead, the purpose of this section is to put the reader into the context of the challenges faced by the NP drug discovery that are being addressed by the use of computational methods. The third section discusses major structural classes of NPs and emphasizes the computational approaches used to characterize their structural diversity and coverage in chemical space. We also discuss measures of structural complexity, which is a distinct feature of NPs. Section 4 focuses on NP databases including collections available in the public domain. In the same section we present examples of chemoinformatic analysis of various NP collections and discuss the pharmacological profiling of such collections, which, in light of the increasing importance of chemogenomics, is a current trend in modern drug discovery. In the next section, entitled “Virtual Screening and Target Fishing,” we discuss the application of computational techniques to help identify bioactive compounds of natural origin for a given molecular target (virtual screening). We also present the complementary approach, that is, identifying possible molecular targets for NPs (target fishing). Thus, this section shows how virtual screening and target fishing are used to filter out compounds or molecular targets, respectively, to focus the experimental screening efforts (which usually are time consuming and expensive) on smaller sets of ligands or targets with increased probability of having the desired biological effect. Section 6 discusses the critical role of NPs in addressing molecular targets not easily tackled with typical small molecules, such as protein-protein interactions (PPIs). In this section we also discuss the use of NPs against emerging targets, such as epigenetic targets. The next section is dedicated to presenting the application of computational methods to uncover bioactivities
of NPs of dietary origin. It also shows the intersection of NPs and food chemicals to systematically identify compounds with health-related benefits. Finally, in the last section we present a summary and conclusions.

21.2 NPs IN DRUG DISCOVERY

From ancient times to the modern era, NPs have been extensively used as medicines, dietary products, and nutritional supplements [14]. For example, for a number of years, 80% of drugs were either NPs or their derivatives. Even after the widespread use of techniques such as high-throughput screening (HTS) of synthetic libraries, 50% of the new drugs approved from 1981 to 2010 were NPs, derivatives, or structural analogues [15,16]. It has been estimated that more than 100 NP compounds are currently in clinical trials. NPs are valuable not only as potential therapeutic agents but also as molecular probes to identify targets of pharmaceutical interest and facilitate the characterization of biological processes underlying a disease [17]. As reviewed below, NPs have the advantage of uncovering distinct structural classes [18,19] because of their better coverage of chemical space relative to large synthetic compounds [20]. Therefore, the chemical diversity of NPs can be used to access bioactive compounds with novel scaffolds [20-22]. Some combinatorial libraries are inspired by NP frameworks [23,24].

The reader is referred to extensive and excellent reviews of the role of NPs in modern drug discovery [14,16,20]. The rationale that botanicals may exert their activity owing to the interaction of the bioactive mixtures with multiple biological endpoints in a synergistic manner is contributing to the shift of the current paradigm of drug discovery from single-target to multitarget drug discovery [25]. The authors of that work also envision that the next paradigm in NP-based drug discovery is the synergistic combination of traditional NP research with other drug discovery strategies. The next sections of this chapter are focused on the productive combination of NP-based drug discovery with computational approaches.

21.3 CHEMOINFORMATIC ANALYSIS OF NATURAL PRODUCTS

NPs can be classified in several different manners, for example, by the source, e.g., terrestrial or marine; by kingdom, e.g., plants, bacteria, fungi; or by chemical class, e.g., small molecules, macrocycles, peptides. Of note, although peptides and macrocycles (among which NPs make an important contribution) are less attractive than small molecules in typical drug discovery campaigns, the development of chemical libraries focused on these chemical classes has gained interest [26]. Computational approaches, in particular using chemoinformatics methods (vide infra), have enabled the methodical structural analysis and classification of NPs.

Structural diversity and complexity are distinct features of the chemical structures of NPs that have particular significance in lead discovery and development. Despite the fact that noncomputational experts can easily recognize the distinct structural features of NPs by eye, systematic and quantitative studies are required to derive metrics of such features. A number of well-validated chemoinformatic methodologies are available to characterize the structures of NPs and compound collections of various origins commonly used in screening campaigns [27,28].

Briefly, chemoinformatics, also referred to as “cheminformatics” or “chemical information science,” can be understood as “the application of informatics methods to solve chemical problems” [29]. Willett further adds that chemoinformatics is focused on “the manipulation of information about chemical structures, either in the form of planar two-dimensional (2D) structure diagrams or (increasingly) in the form of 3D atomic coordinates, with the manipulations encompassing a range of searching, modeling, and statistical approaches” [30].

Chemical libraries are frequently compared using one or more of the following major criteria:

1. Whole molecule properties such as physicochemical properties,
2. Molecular scaffolds and substructural features,
3. Molecular fingerprints.

As discussed below, because each of these criteria has its own advantages and disadvantages, it has been proposed that several of these criteria should be considered for a comprehensive analysis of compound collections [31].

Using one or more of the general criteria above, compound databases are usually compared in terms of the concept of chemical space [32]. Chemical space has several definitions [32], e.g., one formulated by Virshup et al.: “an M-dimensional Cartesian space in which compounds are located by a set of M physicochemical and/or chemoinformatic descriptors” [33]. In simpler terms, the concept of chemical space is intuitive if one makes an analogy with the cosmic universe in which chemical compounds would be represented by the stars [34,35]. However, in contrast to the cosmic universe, the chemical space is relative and depends strongly on the representations used to define the space [36]. A number of chemoinformatic techniques have been developed to characterize and generate visual representations of the chemical space [32,37-39].
Here we review a representative chemoinformatic analysis of NPs that has been published [40]. For the sake of discussion, we divide this section of the chapter based on the major type of criteria, although more than one molecular representation has been used across several studies.

21.3.1 Physicochemical Properties

Physicochemical properties are intuitive and straightforward to interpret. These types of properties are frequently used to define metric-based or empirical rules that attempt to predict “drug-likeness” or to classify compounds as drug-like/non-drug-like [41,42]. A prominent example is Lipinski’s Rule of Five [43] to predict passive oral absorption based on molecular weight (MW), octanol/water partition coefficient (Slog P), hydrogen bond acceptors (HBAs), and hydrogen bond donors (HBDs). Over the years other rules have been proposed that also highlight the importance of the number of rotatable bonds (RBs) and the polar surface area for drug development [44,45].

Shultz has presented a critical discussion of metric-based rules frequently used in drug discovery [41]. Of note, these rules are commonly formulated based on existing data and may exclude novel compounds that are outside of the traditional relevant chemical space, e.g., chemical space relevant to emerging molecular targets. One needs to have in mind that existing data are highly influenced by the “relevant” molecular targets heavily investigated by the scientific community. In many instances, the relevance of the molecular targets is guided (and biased) by market/profit interests or the likelihood of receiving funding [26]. Therefore, it has been pointed out that metric-based rules should be used only as a guide to design libraries or select compounds from existing libraries for further development [26].

Numerous studies have been reported comparing the physicochemical properties of NPs with other types of compounds relevant in drug discovery [40,46-48]. Indeed, the unique properties of NPs represent an excellent exception to Lipinski’s Rule of Five. It is not uncommon to find NPs that have an MW well over 1000 Da and break many of the other “rules” that define the structures of so-called “drug-like” molecules, and yet are orally bioavailable and have acceptable pharmaceutical properties [49].

Ganesan analyzed the drug-like properties of 24 distinct NPs that were discovered and led to an approved drug in the period 1970-2006. He identified two major types of NPs, those that comply with the Rule of Five and those that are outside of the so-called “Lipinski universe.” However, despite the fact that several NPs that have reached the market are very large and flexible, they do comply with the number of HBDs and, more importantly, with log P values. Indeed, Ganesan concludes that “the single most important lesson from NP lies in their ability to maintain low log P values regardless of other characteristics” [50].

Singh et al. compared NPs contained in the ZINC database [51] with drugs, combinatorial libraries, and the Molecular Libraries Small Molecule Repository (MLSMR) [31]. In that study, three important molecular properties were described: size by MW; flexibility by RB; and molecular polarity by Slog P, topological polar surface area (TPSA), HBA, and HBD. In the same work other criteria such as molecular scaffolds and fingerprint-based diversity (vide infra) were included. Concerning the molecular properties it was concluded that NPs from the ZINC database have similar distributions of HBAs, HBDs, and RBs compared to drugs. The distribution of Slog P values showed that NPs are slightly more hydrophobic than drugs and overall have a slightly larger MW, as previously observed for other NP data sets [46,52].

A total number of 28,000 compounds from the Traditional Chinese Medicine (TCM) database [53] (vide infra) were compared to a database of a commercial vendor library and a collection of small molecules obtained with combinatorial chemistry and containing 30 different core scaffolds. The same set of six drug-like physicochemical properties was used: MW, RB, HBA, HBD, TPSA, and Slog P. Results of that study showed that, overall, TCM has the largest values of HBD, HBA, Slog P, and TPSA compared to the other collections. Concerning the size, in general, TCM had the largest molecules as measured by MW [54].

In a separate study, the chemical space of 2477 NPs from a commercial vendor was compared to that of 5963 synthetic compounds from academic sources using diversity-oriented synthesis (DOS) [55] and 6152 synthetic compounds from a commercial vendor typically employed in screening campaigns. Six drug-like properties (MW, RB, HBA, HBD, TPSA, and Slog P) were used in the comparison. It was concluded that DOS compounds considered in that study are heavier and more lipophilic than the NPs or the synthetic commercial compounds [56].

Manallack et al. conducted an analysis of the distribution of ionization constants of 89,425 NPs from ZINC [57]. The profile was compared to drugs, a chemogenomics data set, and other compound databases from ZINC. Results indicated that NPs have a distinct distribution of ionization constants, e.g., higher proportions of complex ionizable compounds and a greater number of zwitterionic molecules. However, NPs from ZINC have some overlap with approved drugs. The distribution of pKa values of single acids and single bases in
NPs was more similar to that of drugs than that of screening compounds [57]. In a subsequent study, Manallack et al. performed a similar characterization of the acid/base profile and physicochemical properties of 25,566 NPs obtained from ChEMBL [58]. In this second work, the profile was compared with that of human small-molecule metabolites and drugs.

A visual representation of 24 ADME (absorption, distribution, metabolism, and elimination)-related properties for the TCM database and NPs from ZINC was obtained with principal component analysis. The so-called “ADME space” of the NP libraries was compared to a collection of approved drugs, commercial vendor compounds, a general diverse collection obtained from the National Cancer Institute database, and combinatorial libraries. It was concluded that TCM covers a vast region of this property space, including areas uncharted by drugs. NPs from ZINC occupy the same area as drugs [27]. Of note, physicochemical properties along with substructural features, e.g., functional groups, are also used as criteria to filter out compounds with potential toxicity issues early in the drug discovery process [59].

Figure 21.2 shows a visual representation of the property-based chemical space of 1200 approved drugs, 2000 natural products for screening from the commercial vendor AnalytiCon, and 13,387 (synthetic) molecules for screening. The visual representation was obtained using the Web-based public tool ChemGPS-NPWeb [60,61]. ChemGPS-NP [60,62] is a global chemical positioning system [63] that uses principal component analysis to assist with navigation through biologically relevant chemical space [61]. The first four dimensions of the ChemGPS-NP map capture 77% of data variance. The first dimension (principal component 1, PS1) represents size, shape, and polarizability (main contribution is size); PS2 is associated with aromatic and conjugation-related properties (main influence is aromaticity); PS3 describes lipophilicity, polarity, and hydrogen bond capacity (major contribution is lipophilicity); and PS4 expresses flexibility and rigidity. Small molecules can be positioned onto this map using interpolation in terms of principal component analysis score prediction. Details of ChemGPS are provided elsewhere [61].

Figure 21.2 clearly shows that NPs (green spheres) occupy regions of the chemical space that are different from the regions explored by the synthetic compounds (blue spheres) and also cover regions sparsely populated by drugs (orange spheres). The synthetic compounds show a large overlap with the property space of drugs. ChemGPS-NPWeb has been used to compare the chemical space of approved drugs with TCM and compounds derived from combinatorial libraries available in PubChem [64].

21.3.2 Molecular Scaffolds and Substructural Features

Although physicochemical properties are broadly used by the scientific community to compare compound data sets, a disadvantage is that they do not provide direct information on the structural features such as chemical connectivity, structural novelty, or complexity. Indeed, different chemical structures can have the same or similar physicochemical properties. A complementary approach is to use molecular scaffolds or frameworks [65]. These terms are used to describe the core structure of a molecule [66]. Similar to physicochemical descriptors, molecular scaffolds (also called chemotypes) are straightforward to interpret and enable easy communication with medicinal chemists and biologists. For example, molecular scaffolds are strongly associated with the concepts of “scaffold hopping” [67] and “privileged structures” [68,69].
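
To make the notion of a “core structure” concrete, one simple scaffold definition strips terminal atoms repeatedly until only the rings and the linkers between them remain. The sketch below is a minimal illustration under stated assumptions: the molecule is supplied as a hand-written list of heavy-atom bonds (a toy input, not a chemical file format), bond orders and atom types are ignored, the function name `cyclic_system` is hypothetical, and the code is not a reimplementation of any published scaffold method.

```python
def cyclic_system(bonds):
    """Return the atoms that survive repeated pruning of terminal atoms.

    `bonds` is an iterable of (atom, atom) pairs describing a heavy-atom graph.
    Atoms with at most one neighbor are treated as side-chain ends and removed
    until none are left; rings and ring-to-ring linkers remain.
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    while True:
        terminal = [atom for atom, nbrs in neighbors.items() if len(nbrs) <= 1]
        if not terminal:
            return set(neighbors)
        for atom in terminal:
            # Detach the terminal atom from whatever it is still bonded to.
            for nbr in neighbors.pop(atom):
                neighbors[nbr].discard(atom)


# Toy heavy-atom graphs (atom labels are arbitrary integers).
ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
ethyl_substituted_ring = ring + [(1, 7), (7, 8)]   # six-membered ring + two-atom chain
open_chain = [(1, 2), (2, 3), (3, 4)]              # no ring at all

print(sorted(cyclic_system(ethyl_substituted_ring)))
print(sorted(cyclic_system(open_chain)))
```

In this simplified view every acyclic molecule collapses to an empty scaffold, and an exocyclic atom such as a carbonyl oxygen is pruned like any other side chain; real scaffold definitions differ in how they treat such cases.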
FIGURE 21.2 Visual representation of the chemical space of a collection of 2000 NPs (green), 1200 drugs approved for clinical use (orange), and a general screening collection with 13,387 synthetic compounds (blue). The plot was generated using the ChemGPS-NP prediction scores calculated using the online tool ChemGPS-NPWeb. NP, natural product.

There are a number of ways to represent the molecular scaffold in a consistent and systematic manner [65]. One definition is the “cyclic systems” that result from iteratively removing the side chains of the molecule. Cyclic systems are part of the chemotype methodology relying on the molecular equivalence indices developed by Johnson and Xu [70,71] and are similar to the “atomic frameworks” of Bemis and Murcko [72]. A remarkable feature of the scaffolds of Johnson and Xu used to compare compound collections is that molecules classified in a scaffold do not lie in any other chemotypic
class [73]. This approach has been extensively used to classify compound collections [31,73-75].

One of the disadvantages of the scaffold analysis is the lack of information regarding the side chains around the molecular framework and the structural relationship between the scaffolds themselves. A straightforward solution is the analysis not only of the molecular scaffolds but also of the side chains and functional groups and other substructural analysis strategies [76].

Scaffold analysis is used to compare compound data sets, to assess the performance of virtual screening methodologies, to retrieve novel scaffolds, and to analyze the structure-activity relationships (SAR) of a set of molecules with measured activity. For instance, we reported a chemotype-based hierarchical classification of the NCI AIDS database to systematically identify the scaffolds associated with anti-AIDS activity [73].

Measuring and comparing the scaffold diversity of compound collections depend on several factors, including the specific approach to describe the scaffolds, the size of the database, and the distribution of the molecules in those scaffold classes [77]. Often, scaffold diversity is measured based on frequency counts. Although these measures are correct in the way they are defined, they do not provide sufficient information concerning the specific distribution of the molecules across the different scaffolds, particularly the most populated ones. Medina-Franco et al. [77] proposed the use of an entropy-based metric to measure this distribution as a complementary metric for the comprehensive scaffold diversity analysis of compound data sets. Using this metric, Yongye et al. measured the scaffold diversity of five NP databases available in the public domain (see Section 21.4 of this chapter). The NP libraries were compared with a general screening collection and libraries frequently used in in vivo screening. They found that the general screening library had the largest scaffold diversity. In addition to benzene and acyclic molecules, flavones, coumarins, and flavanones were identified as the most frequent scaffolds across the NP collections analyzed in that work [78].

Koch et al. reported a structural classification of NPs (SCONP) [79]. The SCONP was based on scaffold analysis of the comprehensive (although not publicly available) CRC Dictionary of Natural Products (Table 21.1). The scaffold classification can be visualized in a tree-like fashion that resembles the approach previously published by Medina-Franco et al. [73]. Koch et al. employed that structural classification to develop a novel class of selective and potent inhibitors of 11β-hydroxysteroid dehydrogenase type 1 [79].

TABLE 21.1 Examples of Databases of Natural Products Commercially or Publicly Available

Database | Description and size | URL

Dictionary of Natural Products | Comprehensive and fully edited database of NPs. It contains more than 259,859 compounds in over 68,000 entries. The database is frequently updated. | www.crcpress.com/; dnp.chemnetbase.com/intro/

SEARCHABLE ONLINE
HIM-Herbal Ingredients In vivo Metabolism database | Collects in vivo metabolism information for active herbal ingredients, as well as their corresponding bioactivity, organ and/or tissue distribution, toxicity, ADME, and clinical research profile. Information for 361 ingredients and 1104 metabolites from 673 herbs. | 58.40.126.120/him/
NuBBE database | Approximately 640 compounds collected from publications of the NuBBE group in Brazil. | nubbe.iq.unesp.br/nubbeDB.html
Super Natural II | 352,811 purchasable compounds. It includes information on the 2D structures, physicochemical properties, and vendors. | Bioinf-applied.charite.de/supernatural_new/index.php
Traditional Chinese Medicine (TCM) database @ Taiwan | Database with 37,170 (32,364 nonduplicate) TCM compounds from 352 TCM ingredients. | tcm.cmu.edu.tw
Universal Natural Products Database (UNPD) | Repository of 197,201 compounds obtained from other NP databases. | pkuxxj.pku.edu.cn/UNPD
UNIIQUIM database | More than 3000 NPs collected from publications of the Chemistry Institute, UNAM in Mexico. | uniiquim.iquimica.unam.mx

COMPOUNDS FOR IN SILICO AND/OR EXPERIMENTAL SCREENING
AfroDb | Representative subset of 954 compounds from African medicinal plants. | zinc.docking.org/catalogs/afronp
Analyticon-MEGx | Pure natural compounds from plants and 1300 pure natural compounds from microorganisms. | www.ac-discovery.com
MicroSource pure natural products collection | 800 compounds fully characterized with 95% purity or more. | www.msdiscovery.com/natprod.html

NPs, natural products; TCM, Traditional Chinese Medicine; 2D, two-dimensional.

Feher and Schmidt compared NPs, molecules from combinatorial synthesis, and drug molecules. They used a number of physicochemical properties, atom counts, and other substructural features. They concluded that NPs are different from compounds derived from combinatorial synthesis based on the number of chiral centers, the prevalence of aromatic rings, the introduction of complex ring systems, and the degree of the saturation of the molecule, as well as the number and ratios of different heteroatoms [46].

As part of an effort to compute a natural product-likeness score, P. Ertl et al. showed that natural products consisted of fewer aromatic rings and were less flexible relative to drugs and an in-house set of synthetic compounds [80].

Chen et al. compared the molecular topologies of natural products with those of drugs, human metabolites, clinical candidates, and general bioactive compounds [81]. Chen et al. showed that biologically relevant NPs and human metabolites had the highest ratios of single ring system compounds, among other measures.

21.3.3 Structural Fingerprints

Typical molecular and structural fingerprints encode the information of the entire compound structure, i.e., not only molecular scaffolds, and they have been applied to a number of computer-aided and chemoinformatic applications. The reader is referred to detailed descriptions of the various types of molecular fingerprints commonly used [82].

A disadvantage of some fingerprints is that they are more difficult to interpret. Also, it is well known that chemical space will depend on the type of fingerprints used [83]. To reduce the dependence of chemical space on the structure representation, it has been proposed that multiple methods be used, for example, multiple fingerprint representations, and common or consensus conclusions be obtained [84]. Indeed, the aggregation or combination of methods is a common practice in similarity searching (called “data fusion”) [85], in molecular docking (“consensus scoring”) [86], in activity landscape modeling (e.g., “consensus activity cliffs”) [87,88], and in clustering [89].

Molecular fingerprints are frequently used as a molecular representation for diversity analysis. Singh et al. analyzed the structural diversity of NPs from ZINC using three types of molecular fingerprints of different designs, namely Molecular Access System (MACCS) keys, graph-based three-point pharmacophores (GpiDAPH3), and typed graph distance (TGD) [31]. As discussed above, fingerprints with different designs were used to reduce the dependence of chemical space on structure representation. The diversity was compared with approved drugs, MLSMR, and four combinatorial libraries. Results showed that the
magnitude of the similarity values computed with different fingerprints strongly depends on the design and resolution of the fingerprints, which has been observed in other studies [84,87,90]. Overall, it was concluded that drugs are the most structurally diverse (showing the lowest mean and median similarities for the three fingerprints). The NP collection was the second most diverse database. In that study, fingerprint-based characterization of the NPs complemented the analysis conducted with physicochemical properties and molecular scaffolds [31].

The interlibrary similarity of TCM with drugs and a commercial vendor library was assessed using MACCS keys, GpiDAPH3, and TGD fingerprints by means of nearest-neighbor curves (distribution of the maximum similarity values to the reference collections). It was observed that TCM has compounds with chemical structures different from those of drugs and the diverse collection of commercial compounds. In the same study, the intralibrary similarity of TCM was compared to the other reference compound databases using MACCS keys by means of pairwise similarity values with the
Tanimoto coefficient [91,92]. It was concluded that, in general, NPs in TCM have lower structural diversity compared to drugs and commercial compounds as captured by MACCS keys/Tanimoto coefficient [54].

The intralibrary similarity of five NP collections whose chemical structures are in the public domain has also been measured using pairwise similarity values computed with MACCS keys and the Tanimoto coefficient. Results showed that the intralibrary similarity strongly depends on the type of NP database analyzed [78]. The fingerprint-based characterization of the five collections was part of a chemoinformatic characterization of the databases that also included a comprehensive scaffold analysis (vide supra) [78].

To illustrate the comparison of compound databases using molecular fingerprints, Figure 21.3 shows the distribution of the maximum MACCS keys/Tanimoto similarities of 13,388 compounds from a general screening collection (blue curve) and 2000 compounds in an NP database from a commercial vendor, AnalytiCon (green curve) with a set of 1200 approved drugs for clinical use (not represented in the plot). Compounds in the general screening collection have a median similarity of 0.71, whereas the NPs have a median similarity of 0.77. These

21.3.4 Molecular Complexity

Molecular complexity is a concept of growing interest to select compounds from existing collections for screening or designing chemical databases [26]. Lovering et al. showed that structurally complex molecules, as measured by the fraction of saturated carbons, have higher success rates in the drug discovery pipeline [93]. The authors of that work also hypothesized that compounds with increased complexity may increase selectivity. This hypothesis is supported by the study of Clemons et al. that screened three types of compound collections across 100 diverse proteins [94]. This study showed that increasing the content of sp3-hybridized and stereogenic atoms relative to compounds from commercial sources improves selectivity and frequency of binding. Evidently, increasing molecular complexity is not the only criterion to consider when screening existing collections or designing new libraries. As the authors discussed, other properties have to be balanced [26].

Quantifying molecular complexity is not a trivial task, and several metrics have been suggested, including MW [95-98]. One metric commonly used is the carbon bond saturation defined by fraction sp3 (Fsp3) where
statistics and the curves of the distribution indicate that, Fsp3 ¼ (number of sp3 hybridized carbons/total carbon
in general, the chemical structures of NPs are slightly count) [93]. Using this metric, López-Vallejo et al.
more similar to those of drugs than the general screening compared the molecular complexity of NPs in the
collection (considering the MACCS keys structural rep- TCM database with the structural complexity of 30
resentation). Similar conclusions were obtained small-molecule synthetic combinatorial libraries [54].
comparing different general screening and NP data- Results of this study demonstrated the high structural
bases with a set of drugs [54]. complexity of NPs, suggesting the possibility of using
these collections to interrogate novel regions in the
currently neglected chemical space [54].
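As a minimal sketch of the Fsp3 definition given above, the following Python function computes the fraction from two carbon counts. The function name `fsp3` and the hand-counted example molecules are illustrative assumptions and are not taken from the cited studies; in practice the counts would be derived from chemical structures with a cheminformatics toolkit.

```python
def fsp3(n_sp3_carbons: int, n_carbons: int) -> float:
    """Fraction of sp3-hybridized carbons: sp3 carbons / total carbons."""
    if n_carbons <= 0:
        raise ValueError("molecule must contain at least one carbon")
    if not 0 <= n_sp3_carbons <= n_carbons:
        raise ValueError("sp3 carbon count must lie between 0 and the total carbon count")
    return n_sp3_carbons / n_carbons


# Hand-counted (sp3 carbons, total carbons) for three simple molecules.
# These inputs are illustrative assumptions, not data from the chapter.
examples = {
    "benzene": (0, 6),       # fully aromatic, no sp3 carbons
    "ethylbenzene": (2, 8),  # two sp3 carbons in the ethyl group
    "cyclohexane": (6, 6),   # fully saturated ring
}

for name, (sp3, total) in examples.items():
    print(f"{name}: Fsp3 = {fsp3(sp3, total):.2f}")
```

Toolkits such as RDKit expose an equivalent descriptor (`rdMolDescriptors.CalcFractionCSP3`) that works directly on parsed structures, which avoids manual counting for large collections.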

21.4 MOLECULAR DATABASES FOCUSED ON NPs AND NP DERIVATIVES

The advancement of synthetic chemistry and HTS has largely contributed to the growth of the number of molecules available. Large numbers of compounds can be conveniently stored in chemical databases, which play a key role in modern drug discovery [99].

Compound databases may contain existing or virtual compounds. The second type usually comprises hypothetical molecules that could be synthesized later. A comprehensive collection of virtual compounds in the public domain is the Generated Database of Chemical Space (GDB) [37,100]. GDB has been used in virtual screening followed by chemical synthesis and biological testing [101,102]. Libraries of existing compounds may be proprietary, also called in-house libraries; commercial; or public. Sources of screening libraries have been reviewed [103–105].

FIGURE 21.3 Nearest-neighbor curves comparing a natural products collection from a commercial vendor and a general screening collection with a collection of approved drugs for clinical use. The curves represent the cumulative distribution function of the maximum similarity of each compound in the screening and natural products collection to all the drugs as measured with MACCS keys and the Tanimoto coefficient. MACCS, Molecular Access System.
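The nearest-neighbor analysis behind Figure 21.3 can be sketched in a few lines of Python. The example below is a simplified illustration, not the procedure used in the cited studies: fingerprints are represented as sets of "on" bit indices, the toy bit sets are hypothetical rather than real MACCS keys, and the function names are assumptions introduced here.

```python
from statistics import median


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbor_similarities(library, reference):
    """Maximum similarity of each library compound to any reference compound."""
    return [max(tanimoto(fp, ref) for ref in reference) for fp in library]


def cumulative_distribution(values):
    """Empirical CDF as (similarity, fraction of compounds <= similarity) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Toy fingerprints (hypothetical bit sets standing in for MACCS keys).
drugs = [frozenset({1, 2, 3, 4}), frozenset({2, 5, 6})]
screening = [frozenset({1, 2, 3}), frozenset({7, 8}), frozenset({2, 5, 6, 9})]

nn = nearest_neighbor_similarities(screening, drugs)
print("maximum similarity per compound:", nn)
print("median of the maxima:", median(nn))
print("points of the nearest-neighbor curve:", cumulative_distribution(nn))
```

Plotting the pairs returned by `cumulative_distribution` for each library against the same reference set gives curves of the kind shown in the figure, and the median of the maxima is the summary statistic quoted in the text. The reference collection must not be empty, since the maximum over an empty set is undefined.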
Depending on the goals of the project or the screening campaign, distinct types of compound libraries can be developed and screened [50]. Examples include DOS [55], focused libraries, diverse libraries, combinatorial libraries, Libraries from Libraries [106], and NPs or synthetic analogues of NPs [107].

Recently Medina-Franco et al. [26] reviewed different approaches to designing focused libraries with confined chemical spaces. In that work, the authors discussed two broad types of confinement: (1) library design focused on a relevant therapeutic target or disease and (2) library design focused on chemistry or a desired molecular function [26]. NP databases are important sources of molecules from which to select compounds or inspire the synthesis of molecules for a target of therapeutic interest or with a desired molecular function [108,109]. In this regard, Over et al. analyzed and validated the role of NP-derived fragments for fragment-based ligand discovery [110].

Table 21.1 summarizes examples of NP databases either commercially available or in the public domain [111–115]. Some databases are designed to conduct structure and properties searches online, whereas others are collections of compounds for purchase intended for virtual and experimental screening. A more complete set of NP catalogs for screening is collected in the ZINC database [51]. Of note, by July 2014, ZINC contained over 35 million molecules and 13 NP catalogs (available at http://zinc.docking.org/browse/catalogs/natural-products).

In 2012, Yongye and Medina-Franco compiled one of the first lists of NP databases whose structures are readily accessible on the Web [78]. At the time of that study such databases contained between 560 and 89,000 compounds. Subsequently, the number of NP databases with structures in the public domain doubled.

The TCM database is one of the major sources of natural products freely available online [53]. TCM has been extensively characterized using physicochemical properties and structural fingerprints (vide supra) [54]. Based on this database, the cloud computing system iScreen (available at http://iScreen.cmu.edu.tw/) was developed; it is a Web server for docking TCM compounds followed by customized de novo drug design [116]. TCM has been used successfully to identify pancreatic triacylglycerol lipase inhibitors using in silico approaches [117].

Another important source of NPs freely available online is the Universal Natural Products Database (UNPD) [118]. UNPD is a collection of 197,201 chemical structures obtained from plants, animals, and microorganisms. The physicochemical profile of this database has been analyzed. The physicochemical properties were employed as a basis to generate a visual comparison of the chemical space covered by UNPD and approved drugs for clinical use, concluding that there is a large overlap [118].

Ntie-Kang et al. published the ConMedNP library [119], an extension of the previously published database CamMedNP, which contains 1859 NPs and derivatives obtained in Cameroon [120]. The augmented library ConMedNP represents a collection of 3177 compounds, not only from Cameroon but also from the Central African flora. Ntie-Kang et al. published a physicochemical characterization of ConMedNP using typical drug-like properties and ADME-related descriptors. In addition, that group made publicly available the 3D structures for other computational applications such as virtual screening (vide infra) [119]. Ntie-Kang et al. also published AfroDb [111], which is a relatively small but representative subset of African medicinal plants containing around 1000 3D structures. AfroDb is available in ZINC.

Esquivel and colleagues at the Informatics Unit of the Chemistry Institute of the National Autonomous University of Mexico are assembling a database of NPs that have been published by the Chemistry Institute of the same university. It is estimated that the database will have information for more than 3000 chemical substances isolated and characterized. Other representative databases of NPs are summarized in Table 21.1 and others are reviewed elsewhere [20,107].

21.4.1 Pharmacological Profiling of NP Databases

Chemogenomics is evolving as a multidisciplinary research field that uses in vitro and in silico methods to better understand a ligand–target SAR matrix or chemogenomics knowledge space [121,122]. The relationship between chemogenomics and related topics of major interest in current drug discovery, such as polypharmacology, drug repurposing, phenotypic screening, and high-throughput in vivo testing, has been discussed in an integrated manner [25]. In two independent publications, Bajorath and Rognan, respectively, reflected on the perspectives of computational chemogenomics [123,124].

Along this line of thinking, there is increased awareness that drugs exert their clinical effects through interactions with multiple targets. This is illustrated by the drugs dabrafenib and trametinib, approved in 2013 by the Food and Drug Administration (FDA) of the United States for the treatment of “unresectable or metastatic melanoma with BRAF V600E mutation as detected by an FDA-approved test” [125]. Both drugs target multiple kinases [125]. This awareness has increased the relevance of and interest in systematically screening chemical compounds, including NPs, across different
biological endpoints, for example, against multiple molecular targets. This is a common practice after isolating and characterizing novel NPs.

An example of large-scale NP profiling was reported by Clemons et al., who tested the binding specificity of 15,000 compounds, including 2477 NPs, with 100 sequence-unrelated proteins. The results of the screening were made freely accessible to the scientific community and the reader has access to the chemical structures and the corresponding biological profiles [94]. This data set has been subjected to a series of computational studies aimed at elucidating the SAR and identifying structural patterns associated with the selectivity or promiscuity of the molecules using fingerprint or substructure representations [126–128]. For example, this data set was the basis for developing approaches for identifying structural changes that have a significant impact on the number of proteins to which a compound binds. For instance, Yongye and Medina-Franco [126] proposed the structure–promiscuity index difference (SPID) metric. SPID encodes the relationship between structure similarity and the number of different proteins to which each pair of compounds binds [126]. In a separate study, Dimova et al. analyzed the same large microarray data set, using the concept of matched molecular pairs [129]. Dimova et al. identified single-site substitutions that lead to large differences in compound promiscuity [127].

21.5 VIRTUAL SCREENING AND TARGET FISHING

21.5.1 Virtual Screening

Because experimental testing of molecules is expensive and time consuming, virtual screening, also called in silico or computational screening, represents a valuable tool to guide and focus experimental efforts on smaller, filtered sets of compounds with increased probability of showing the desired biological activity [6,130,131]. This is particularly attractive for filtering NPs for experimental testing because, in many instances, NPs are available in small quantities. Ideally, virtual screening is included in an iterative cycle of prediction and experimental validation followed by rounds of refinement. If the final goal of virtual screening is to identify potent compounds, one of the main objectives of the first iteration cycle is to identify novel molecular scaffolds [132]. Similar to experimental screenings, virtual screening workflows are project-specific, tailored to the need of a particular target or biological context [133].

Virtual screening approaches can be classified into two major groups: structure-based and ligand-based methods. Structure-based approaches use the 3D structure of the target and ligand-based approaches utilize ligand information in light of structure–activity data derived from a set of known actives. Successful applications of virtual screening to identify bioactive compounds have been reviewed [3,7].

Traditionally, bioactive NPs with a promising therapeutic indication have been identified through random or fortuitous approaches. Thus, computational screening of NPs represents a valuable synergy for identifying bioactive NPs in a systematic manner. Indeed, virtual screening has been applied to screen small sets and large databases of NPs, giving rise to the identification of bioactive molecules [134–138]. Table 21.2 summarizes representative examples of virtual screening approaches applied to NPs. Figure 21.4 shows the corresponding chemical structures of the NPs uncovered by virtual screening summarized in Table 21.2. The reader is referred to a number of reviews that show the progress in the virtual screening of NPs [13,20,140–144].

TABLE 21.2 Examples of Recent Applications of Virtual Screening to Identify Bioactive NPs(a)

Computational approach(es) | Key findings | References
--- | --- | ---
Docking-based virtual screening of an in-house collection with 4000 NPs with matrix metalloproteinases (MMPs). | 19 NPs with diverse structures are identified as MMP inhibitors. The most potent inhibitor (compound 5) also represses MMP-2 and active MMP-9 expression in MDA-MB-231 cancer cells and suppresses the migration of MDA-MB-231 in a wound healing assay. | [135]
QSAR based on molecular topology and virtual screening. | Calcein, a natural dye, showed potent inhibitory activity against interleukin-6 and therefore is potentially effective in ulcerative colitis. | [137]
Inverse virtual screening of a small library of phenolic NPs with a panel of 163 targets involved in the genesis and progression of cancer. | Xanthohumol and isoxanthohumol showed inhibitory activity on PDK1 and PKC protein kinases in in vitro assays. | [139]
Structure-based virtual screening of an in-house library of 1430 NPs and their derivatives with a novel target for anti-HIV therapy. | 18 hits were identified that bind at the interface of HIV-1 integrase and human LEDGF/p75. The two most potent inhibitors had IC50 values of 0.32 and 0.26 μM, respectively. NPD170 had the highest antiviral activity with an EC50 of 1.81 μM. | [134]
Structure-based virtual screening of an in-house database with more than 4000 NPs with two estrogen receptor (ER) modulators. | Eleven compounds were identified as ER modulators: 3 agonists and 8 antagonists. The most potent antagonist (compound 4) had EC50 values of 2.55 and 4.68 μM for ERα and ERβ, respectively. | [138]
Docking-based virtual screening of 216 diverse NPs with acetylcholinesterase as potential leads for Alzheimer disease. | One of the top-ranked compounds, NDGA, had an IC50 of 46.2 μM. In silico studies of NDGA were conducted to design compounds with improved CNS activity. | [136]

QSAR, quantitative structure–activity relationships.
(a) Chemical structures of selected hit compounds are in Figure 21.4.

FIGURE 21.4 Chemical structures of exemplary NPs identified from virtual screening followed by experimental validation. See Table 21.2 for details. NPs, natural products.

Notably, a unique collection of NPs can be used in virtual screening through the Drug Discovery Portal (DDP; available at www.ddp.strath.ac.uk) [145]. The DDP developed a database that contains purified NPs obtained from plant, soil, or marine sources, and synthetic compounds, some of which were inspired by NPs. Molecules stored in this database are available in research groups from various countries including Scotland (where DDP is based), Australia, France, and the United States. The DDP also collects biological targets that are being screened in academic laboratories. The goal of this initiative is to match a chemist’s compound to a biological assay using computational techniques and then validate the computational predictions in the biological assay that is available [20].

Virtual screenings of NP databases for later experimental validation have been reported. For example, 197,201 compounds in the UNPD (vide supra) were docked with 332 target proteins relevant to approved drugs. Based on a docking-score-weighted prediction model, the most promising NPs with potential bioactivity were identified [118]. Virtual screenings of NPs for emerging and challenging targets are discussed below.

21.5.2 Target Fishing and Reverse Pharmacognosy

As discussed above, the goal of virtual screening is to identify new ligands for known targets. The inverse approach, i.e., identifying new targets for known ligands, is called “target fishing” [122] or inverse screening. Thus, in the context of chemogenomics, virtual screening is associated with ligand screening and target fishing is related to ligand profiling [121]. Similar to virtual screening, depending on the experimental information available, target fishing can be performed using structure-based methods, e.g., inverse docking, or ligand-based methods, e.g., similarity searching [146]. Ideally the combination of both approaches can be used if enough structural and SAR data are available. This point is emphasized by Yue et al., who have discussed progress on the target profiling of NPs using experimental (genomics and proteomics) and computational approaches [147]. In that review, Yue et al. point out the convenience of integrating various methods such as inverse docking (docking compounds across different targets), mapping ligand–target profiling space, and network analysis.

A notable example of a ligand-based method that can greatly benefit target fishing of NPs is the similarity ensemble approach (SEA) [20,148]. SEA is a statistical ligand-based design approach in which the structures of a series of ligands with known targets are used to train a model capable of predicting the polypharmacological profile of other molecules. This approach was employed to predict the binding profiles of 3600 known drugs or compounds developed as potential drugs based on hundreds of thousands of bioactive molecules and their known targets. Two of three high-confidence predictions were retrospectively shown to be accurate [149].

A second remarkable example of a ligand-based method that can benefit NP research is prediction of activity spectra for substances (PASS) [150]. PASS predicts simultaneously more than 500 biological activities, including pharmacological main and side effects, mechanisms of action, mutagenicity, carcinogenicity, teratogenicity, and embryotoxicity. PASS is based on a regression approach applied to noncongeneric chemical series. The developers of this approach have expanded the capabilities of the PASS algorithm to predict the sites of metabolites for xenobiotics [151]. Lagunin et al. have discussed in detail the application of PASS to evaluate the multitarget profile of selected NPs [152].

Target fishing using computational approaches is increasingly being used to uncover biological activities of NPs. An example has been published by Gianluigi et al., who conducted inverse docking of 10 antioxidant phenolic NPs with 163 molecular targets involved in cancer progression. Xanthohumol and isoxanthohumol showed activity with the protein kinases PDK1 and PKC [139]. In a separate work, Carregal et al. developed a protocol combining molecular docking, refinement by molecular dynamics simulations, and quantum mechanics/molecular mechanics to determine the pharmacological receptors for NPs isolated from Cerrado species in Brazil [153]. Other applications of target fishing for NPs are reviewed elsewhere [13]. More recently, Wolber and Rollinger discussed the application of virtual screening and target fishing for NPs using 3D pharmacophores [154].

As part of the FoodInformatics symposia held at the 245th American Chemical Society National Meeting in 2013 [155] (vide infra), Quoc-Tuan Do explained the principles of reverse pharmacognosy [156], emphasizing the many roles of chemoinformatic approaches, including inverse screening, to accelerate the identification of the bioactive compounds of an organism. Quoc-Tuan Do discussed two successful and published examples of reverse pharmacognosy using Selnergy™, a platform developed at Greenpharma to predict, based on docking, interaction energies of a ligand with a target protein [157,158].

21.6 NPs AS LEADS FOR CHALLENGING AND EMERGING TARGETS

The overall unique structural diversity, structural features, and molecular complexity of NPs make them attractive for interrogating molecular targets that represent significant challenges using traditional small molecules typically used for common targets. Similarly, NPs represent a very attractive approach to address emerging molecular targets for which the “relevant” chemical space is not clearly defined yet. This section illustrates these points by focusing on the role of NPs in the pursuit of modulators of PPIs and of the emerging epigenetic targets.

21.6.1 NPs as Compounds for Modulating Protein–Protein Interactions

Several cellular functions are regulated by multiprotein complexes that are controlled by PPIs between protein subunits. It is also well known that human diseases can be caused by abnormal PPIs. Therefore PPI modulators, either inhibitors or stabilizing agents, are attractive in drug discovery [159]. Despite significant progress made toward the modulation of PPIs, these are still difficult to target with small molecules because of the structural characteristics of the protein–protein interfaces. For example: (1) in several cases, the contact surfaces involved in PPIs are large (~1500–3000 Å²) compared with those involved in protein–small-molecule interactions (~300–1000 Å²); (2) in general, the contact surfaces between proteins are flat as opposed to the types of pockets and grooves found in typical surfaces of proteins bound to small molecules; (3) in contrast to typical proteins of pharmaceutical relevance, PPIs lack endogenous small molecules that can be used as a reference or starting point to design modulators [160]. Despite these challenges, several strategies are being followed that
have succeeded in bringing compounds to clinical trials. Sperandio et al. have reviewed general strategies to design libraries focused on PPIs [161].

In an insightful review, Wells and McClendon reflect that the fact that many HTS efforts have yielded PPI hits with moderate potency is heavily influenced by historical reasons: common screening libraries of small molecules contain chemotypes dominated by past drug discovery focused on classical targets, e.g., G-protein-coupled receptors and protein kinases. Moreover, the authors of that review point out that PPI inhibitors reaching clinical trials have MW between 500 and 900 Da, with Ki values less than 1 μM [160], being clear exceptions to the traditional medicinally relevant chemical space (which contains molecules with MW less than 500 Da). Taking all these considerations together, it was then proposed that the structural features of NPs make them excellent candidates for PPI modulation [161,162]. It has been hypothesized that, in general, the rigidity of PPI inhibitors is required for good binding affinity and target selectivity. Some NPs have the desired profile of having a rigid molecular framework [159].

Several NPs have pharmacological activity as PPI blockers. Examples include FK506, rapamycin, and ascomycin, a family of closely related polyketide natural products derived from soil actinomycetes that have largely different cellular effects via binding of the FK506-binding protein immunophilin and modulation of the PPIs involved in the signal transduction pathways of T cell activation and growth. Other examples are paclitaxel, epothilone A, and discodermolide, which affect mitosis by modulating the PPIs implicated in tubulin polymerization [162]. Rolitetracycline is an inhibitor of the hypoxia-inducible factor-1 (HIF-1) pathway, which is a key regulator of angiogenic and glucose metabolic processes and is used by tumor cells for both survival and growth. Most of the HIF-1 inhibitors are either NPs or synthetic compounds based on NPs. Chetomin is an NP that also interferes with the HIF-1 signaling pathway. Although further development of rolitetracycline and chetomin was hampered by a lack of activity in cell-based assays and toxicity in animal models, respectively, the chemical scaffolds of both compounds represent a unique architecture to develop distinct scaffolds [159].

To accelerate the identification of lead molecules that modulate PPIs, computational approaches are being developed to predict modes of PPIs as well as hot spots at the protein interface. Progress in the development of these computational methods is presented in an excellent review by Bienstock [163]. These techniques can be conveniently applied to NPs. For example, once hot spots have been identified for a given protein–protein interface, pharmacophore filtering can be applied using NP databases discussed earlier in this chapter.

21.6.2 NPs as Lead Compounds for DNA Methyltransferase Inhibitors

Emerging molecular targets such as DNA methyltransferases (DNMTs) and other epigenetic enzymes [164] are becoming attractive targets for the treatment of cancer and several other diseases. Among the epigenetic targets, DNMTs are a family of enzymes that catalyze the transfer of a methyl group from S-adenosyl-L-methionine to the carbon-5 position of cytosine residues leading to an epigenetic modification [165]. The human genome encodes four distinct DNMTs: DNMT1, DNMT2, DNMT3A, and DNMT3B. Of these, DNMT1 and DNMT3B constitute the major activities. It has been demonstrated that inhibition of DNMT activity can lead to demethylation and reactivation of epigenetically silenced tumor suppressor genes [166]. Thus, DNA methylation represents a central mechanism for mediating epigenetic gene regulation, and the development of DNMT inhibitors provides novel opportunities for cancer therapy [167]. Inhibition of DNA methylation has also emerged as a promising strategy for the treatment of immunodeficiency and brain disorders [168,169]. However, DNMT inhibitors currently in clinical use, 5-azacytidine and 5-aza-2′-deoxycytidine, are nonselective cytosine analogues with significant cytotoxic side effects. Different approaches are being followed to identify distinct non-nucleoside DNMT inhibitors; these methods include chemical synthesis of lead compounds [170,171], docking-based virtual screening of chemical databases from various sources, and similarity searching of databases of approved drugs. Similarity searching, which is a ligand-based approach, led to the identification of olsalazine, an anti-inflammatory drug approved for the treatment of ulcerative colitis, as a novel DNA hypomethylating agent [172]. This case represents an example of a synergy by which computational approaches accelerate drug repurposing.

Because environmental exposures are usually assumed to have a major impact on the onset of abnormal DNA methylation patterns, a frequent uptake of DNA demethylating agents is believed to have a chemopreventive effect [173]. This could be achieved through the dietary uptake of natural product DNMT inhibitors [173]. Several examples of NPs with DNMT-inhibitory and/or hypomethylating activity have been reported. A classic example is (-)-epigallocatechin-3-gallate (EGCG), the main polyphenol compound from green tea. EGCG has been proposed to inhibit DNMT1 by blocking the active site of the enzyme and reactivating methylation-silenced genes in cancer cells [174]. Catechin and epicatechin, and the bioflavonoids quercetin, fisetin, and myricetin, are other tea polyphenols that have also been associated with
the inhibition of DNA methylation. Curcumin, the major component of the Indian curry spice turmeric, and parthenolide, the principal sesquiterpene lactone of feverfew, have also been reported to inhibit DNMT1 [175]. Psammaplin A and several other disulfide bromotyrosine derivatives isolated from the marine sponge Pseudoceratina purpurea have been described as potent inhibitors of DNMT1 [176,177]. Kuck et al. have reported nanaomycin A as a selective inhibitor of human DNMT3B [178]. Several other NPs reported as DNMT inhibitors or hypomethylating agents are reviewed elsewhere [173,179–181].

Nanaomycin A (Figure 21.5) is a quinone antibiotic isolated from a culture of Streptomyces that has been described as the first non-S-adenosyl-L-homocysteine (SAH) analogue acting as a DNMT3B-selective inhibitor that induces genomic demethylation. Nanaomycin A treatment reduced the global methylation levels in three cell lines and reactivated transcription of the RASSF1A tumor suppressor gene [178]. To explain at the molecular level the activity of this NP, nanaomycin A was docked with a homology model of the catalytic site of DNMT3B that was built based on the crystal structure of DNMT3A as a template [182]. Results of the docking study showed that the carboxylic acid, hydroxyl group, and adjacent carbonyl oxygen atoms are predicted to form an extensive hydrogen bond network with the side chains of arginine amino acids. The study also suggested that the hydroxyl group of nanaomycin A makes a hydrogen bond with the side chain of a glutamic acid residue (Figure 21.5). Similar hydrogen bond interactions with the equivalent glutamic acid and arginine residues that participate in the mechanism of methylation were not observed in the docking models obtained with DNMT1. These results provided a possible structural explanation for the enzyme selectivity of nanaomycin A for DNMT3B. Based on the binding model of nanaomycin A with DNMT3B, along with the experimental and theoretical evidence of the reaction between quinones and cysteine-rich proteins, the catalytic cysteines were hypothesized to perform a nucleophilic Michael 1,4 addition to the α,β-unsaturated carbonyl system of nanaomycin A. A similar plausible reaction was not observed in the binding model with DNMT1, which supported the experimental selectivity toward DNMT3B [178].

The occurrence of DNMT inhibitors in dietary products highlights the relevance of identifying additional inhibitors of natural origin [173]. The systematic search for active compounds in NP databases can thus be the basis for further support of the use of complementary and alternative medicine products for inhibition of DNA methylation. The systematic search can be greatly facilitated by the application of computational approaches. Indeed, in silico methods have been broadly used to propose the mechanism of action of DNMT inhibitors at the molecular level, to screen compound libraries to identify novel inhibitors, and to guide the lead optimization efforts using structure-based techniques [183,184]. For example, Yoo et al. reported docking models of a number of NPs with DNMT1. The docking models of the NPs, along with the predicted binding models of small-molecule DNMT inhibitors of different origins, were the basis for developing a pharmacophore model [185,186]. Such a model was subsequently used to help explain, at the molecular level, the activity of trimethylaurintricarboxylic acid [187].

The chemical spaces of two NP collections, including compounds from the TCM database, were compared to the property space of a DNMT-focused library, approved drugs, and synthetic commercial compounds. It was concluded that the DNMT-focused library and the two NP databases have molecules with properties similar to those of approved drugs [188].

FIGURE 21.5 Docking model of the NP nanaomycin A with the catalytic site of human DNMT3B. NP, natural product.
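
The kind of property-space comparison summarized above for the NP collections and reference libraries [188] can be illustrated with a minimal Python sketch. It is not the procedure used in [188]: the chapter does not state here which descriptors or statistics were employed, so the four properties (molecular weight, logP, hydrogen bond donors and acceptors), the use of medians, the helper names `property_profile` and `profile_gap`, and all numeric values below are assumptions chosen only for illustration.

```python
from statistics import median

# Assumed descriptor set; the source does not specify which properties were compared.
PROPERTIES = ("mw", "logp", "hbd", "hba")


def property_profile(library):
    """Return the median of each property over a list of compound dicts."""
    return {p: median(c[p] for c in library) for p in PROPERTIES}


def profile_gap(profile_a, profile_b):
    """Mean relative difference between two median profiles (0 means identical)."""
    gaps = []
    for p in PROPERTIES:
        # Scale by the larger magnitude so each property contributes a value in [0, 1].
        scale = max(abs(profile_a[p]), abs(profile_b[p]), 1e-9)
        gaps.append(abs(profile_a[p] - profile_b[p]) / scale)
    return sum(gaps) / len(gaps)


# Hypothetical, hand-made descriptor tables: NOT data from the chapter or from [188].
toy_np_library = [
    {"mw": 300.0, "logp": 2.0, "hbd": 2, "hba": 5},
    {"mw": 420.0, "logp": 3.0, "hbd": 3, "hba": 7},
    {"mw": 360.0, "logp": 1.0, "hbd": 1, "hba": 6},
]
toy_reference_library = [
    {"mw": 320.0, "logp": 2.5, "hbd": 2, "hba": 5},
    {"mw": 400.0, "logp": 2.0, "hbd": 1, "hba": 6},
    {"mw": 360.0, "logp": 3.5, "hbd": 2, "hba": 4},
]

np_profile = property_profile(toy_np_library)
ref_profile = property_profile(toy_reference_library)
gap = profile_gap(np_profile, ref_profile)
print(np_profile, ref_profile, round(gap, 4))
```

A small gap between two median profiles would only suggest that the libraries occupy a similar region of property space under these assumed descriptors; in practice the full distributions, rather than medians alone, would normally be inspected.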
As reviewed above, several bioactive compounds of natural origin have been discovered fortuitously. However, these findings have spurred follow-up studies to systematically uncover bioactive compounds with the aid of computational approaches. As part of an effort to identify novel inhibitors of DNMT of natural origin, the authors reported the virtual screening of a large collection of NPs from the ZINC database. Starting from 89,000 compounds, a subset of approximately 14,000 molecules was selected that was subjected to a multistep docking approach using three docking algorithms. Fifty-eight compound consensus hits with a docking score that was similar to or better than that of the reference molecule were selected and moved forward to experimental validation, which is ongoing at the time of writing. Notably, one of the consensus hits has reported DNMT1 inhibitory activity [173]. It is anticipated that computational approaches will continue to be used to develop DNMT inhibitors as promising epigenetic drugs.

21.7 UNCOVERING BIOACTIVITIES OF NPs OF DIETARY ORIGIN

For centuries, people from various cultures have used herbal remedies to try to maintain or improve their health (www.nlm.nih.gov/medlineplus/herbalmedicine.html). Despite the fact that plant extracts in the form of teas, decoctions, or tinctures may contain complex mixtures of compounds, they are broadly used in a safe and effective manner. In fact, plant-based extracts with defined composition, e.g., botanicals or phytopharmaceuticals, are registered for clinical use as dietary supplements or drugs, depending on the regulations of the country in which they are used. However, as commented above, one of the reasons pharmaceutical companies are reluctant to make large-scale use of NPs is because of the complexity of the mixtures of compounds in NP extracts [20]. Therefore, for many years it has been of interest to isolate, identify, and purify the bioactive components of plant extracts.

Chemoinformatics and molecular modeling approaches have been used to systematically identify bioactive components. In this context, foodinformatics is an emerging research field that is focused on uncovering health-related benefits of food components using chemoinformatic and other computational approaches [155,189]. Geldenhuys et al. have published an insightful review of the role of NPs from dietary sources as lead compounds for virtual screening and structure-based drug design. These authors put particular emphasis on compounds such as resveratrol, curcumin, caffeine, and genistein [144].

As part of a program to characterize natural extracts and identify their active compounds and their mode of action, Guasch et al. identified, from 11 extracts with known antidiabetic activity, 12 molecules as potential partial agonists of peroxisome proliferator-activated receptor γ. To that end, the authors used a validated virtual screening protocol based on a combination of pharmacophore modeling, docking, and shape-based similarity searching [190].

More recently Medina-Franco et al. [189] compared the physicochemical properties of NP databases with chemical structures in the public domain with more than 2000 food materials in the FEMA “Generally Recognized as Safe” (GRAS) list [189]. The authors concluded that NP databases from different sources have distinct distributions of physicochemical properties and structural diversity in support of previous conclusions derived from the scaffold analysis of the same databases [78]. The study also concluded that the GRAS chemicals analyzed in that work (discrete chemical entities only) have a high structural diversity, comparable to the high structural diversity of NPs and other reference libraries [189].

To identify potential bioactivity among the food flavoring components in the FEMA GRAS list, Martinez-Mayorga et al. carried out ligand-based virtual screening for compounds with structures similar to those of approved antidepressant drugs [191]. The virtual screening was performed by means of fingerprint-based similarity searching using the MACCS keys and the Tanimoto coefficient. Hit compounds in the FEMA GRAS list were selected as the compounds most similar (ranked with the highest similarity values) to any of 32 approved antidepressant drugs. Selected compounds represented the “nearest neighbors” of the approved antidepressants. Results indicated that valproic acid was the antidepressant most similar to GRAS compounds. Following the rationale that the inhibition of histone deacetylase-1 (HDAC1) could be associated with the efficacy of valproic acid in the treatment of bipolar disorder, Martinez-Mayorga et al. screened the GRAS compounds most similar to valproic acid for HDAC1 inhibition. The GRAS chemicals nonanoic acid and 2-decenoic acid inhibited HDAC1 at the micromolar level, with a potency comparable to that of valproic acid. It was pointed out that GRAS compounds are not expected to exhibit strong enzymatic inhibitory effects at the concentrations typically employed in flavor formulations designed for use in foods and beverages. However, as shown in that work, GRAS chemicals were able to bind to a relevant therapeutic target. Additional studies on bioavailability, toxicity at higher concentrations, and off-target effects are warranted. That study also exemplified the feasibility of exploring the FEMA GRAS flavoring list using computational
methods as a potential source of biologically active molecules. In addition, the study demonstrated that similarity searching followed by experimental evaluation can be used for rapid identification of GRAS chemicals with potential bioactivity [191].

21.8 CONCLUDING REMARKS

Historically, NPs have made outstanding contributions to drug discovery, providing bioactive molecules that have reached patients for clinical applications or that have inspired and stimulated the development of a significant portion of today’s pharmaceuticals. Moreover, since ancient times NPs have been broadly employed as medicines, dietary products, and nutritional supplements. The development of compounds that can be used safely to deliver the desirable clinical effect(s) is a complex problem that demands the synergistic combination of major and complementary approaches. As such, the discovery and development of NPs as lead compounds for drug discovery can be accelerated by the rational application of computational methods. As reviewed in this chapter, computational methods play many roles in NP-based drug discovery. A broad set of computational strategies that are tailored to the specific project needs and information on the system available are increasingly making contributions to drug discovery campaigns based on NPs. Of note, the computational methods themselves are far from perfect and are constantly subject to improvement and refinement. It is also important to stress that by no means are computational strategies intended to erase traditional NP research. Instead, computational methods are meant to further improve common practices that have been successful over the years. On the other hand, there is a false expectation that computational methods alone are capable of designing by themselves molecules that will reach the market, i.e., the false notions of “computer-to-bedside” or “hit-one-button drug discovery.” In this chapter we intended to make clear that computational approaches are part of a multidisciplinary and collective effort in which experimental strategies play a fundamental role.

Computational methods have clear applications in NP-based drug discovery that include but are not limited to:
• Filtering compound databases to select subsets of compounds for experimental screening;
• Systematic screening of compound data sets with few or many NPs;
• Measuring the physicochemical profile of NPs so that they can be compared in a consistent manner with compound collections from different sources (e.g., synthetic libraries) or performing interlibrary comparisons of various types of NP databases;
• Organizing and mining molecular databases;
• Quantifying the structural diversity and complexity of NP collections;
• Characterizing the coverage of the chemical space;
• Characterizing SARs systematically;
• Designing synthetic analogues of NPs using structure- or ligand-based approaches or both;
• Identifying and optimizing leads for “difficult” and emerging molecular targets;
• Deconvoluting bioactive mixtures;
• Uncovering possible therapeutic and health-related applications of NPs of dietary origin.

Over the years, around the world, diverse research groups have investigated and developed NPs and synthetic compounds inspired by NPs. Thousands of NPs and NP-like compounds are available and many compounds are assembled in compound databases. The molecular complexity, physicochemical profile, and structural diversity of such compound collections (commercial, public, or in-house data sets) can be readily assessed using chemoinformatic approaches.

The distinct structural characteristics of NPs offer a great promise for identifying compounds for use against molecular targets that are difficult to tackle using typical small-molecule combinatorial libraries or emerging molecular targets. Indeed, it has been acknowledged that “new classes of molecular targets may need new chemical scaffolds.” In this context, molecular modeling and chemoinformatics provide relevant information to characterize the structures of the difficult and emerging targets and help find the NPs that satisfy the requirements for binding.

It is anticipated that the synergistic combination of two well-established approaches, NP-based drug discovery and computer-aided drug design, will continue evolving and delivering therapeutic compounds or molecules with health-related benefits to the best interest of the patients.

Acknowledgments

We thank Dr. Karina Martínez-Mayorga, Dr. Gerald M. Maggiora, and Dr. Nathalie Meurice for fruitful discussions and critical reading of the manuscript.

References

[1] Ritchie TJ, McLay IM. Should medicinal chemists do molecular modelling? Drug Discovery Today 2012;17:534–7.
[2] Yuriev E, Ramsland PA. Latest developments in molecular docking: 2010–2011 in review. J Mol Recognit 2013;26:215–39.
[3] Bello M, Martinez-Archundia M, Correa-Basurto J. Automated docking for novel drug discovery. Expert Opin Drug Discovery 2013;8:821–34.
[4] Scior T, Medina-Franco JL, Do QT, Martínez-Mayorga K, Yunes Rojas JA, Bernard P. How to recognize and workaround pitfalls in QSAR studies: a critical review. Curr Med Chem 2009;16:4297–313.
[5] Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, et al. QSAR modeling: where have you been? where are you going to? J Med Chem 2014;57:4977–5010.
[6] Scior T, Bender A, Tresadern G, Medina-Franco JL, Martínez-Mayorga K, Langer T, et al. Recognizing pitfalls in virtual screening: a critical review. J Chem Inf Model 2012;52:867–81.
[7] Lavecchia A, Di Giovanni C. Virtual screening strategies in drug discovery: a critical review. Curr Med Chem 2013;20:2839–60.
[8] Durrant J, McCammon JA. Molecular dynamics simulations and drug discovery. BMC Biol 2011;9:71.
[9] Sanders MPA, Barbosa AJM, Zarzycka B, Nicolaes GAF, Klomp JPG, de Vlieg J, et al. Comparative analysis of pharmacophore screening tools. J Chem Inf Model 2012;52:1607–20.
[10] Duffy BC, Zhu L, Decornez H, Kitchen DB. Early phase drug discovery: cheminformatics and computational techniques in identifying lead series. Bioorg Med Chem 2012;20:5324–42.
[11] Pfisterer PH, Wolber G, Efferth T, Rollinger JM, Stuppner H. Natural products in structure-assisted design of molecular cancer therapeutics. Curr Pharm Des 2010;16:1718–41.
[12] Barlow DJ, Buriani A, Ehrman T, Bosisio E, Eberini I, Hylands PJ. In-silico studies in chinese herbal medicines’ research: evaluation of in-silico methodologies and phytochemical data sources, and a review of research to date. J Ethnopharmacol 2012;140:526–34.
[13] Medina-Franco JL. Advances in computational approaches for drug discovery based on natural products. Rev Latinoam Quim 2013;41:95–110.
[14] Cragg GM, Grothaus PG, Newman DJ. New horizons for old drugs and drug leads. J Nat Prod 2014;77:703–23.
[15] Li JW-H, Vederas JC. Drug discovery and natural products: end of an era or an endless frontier? Science 2009;325:161–5.
[16] Newman DJ, Cragg GM. Natural products as sources of new drugs over the 30 years from 1981 to 2010. J Nat Prod 2012;75:311–35.
[17] Schmitt EK, Moore CM, Krastel P, Petersen F. Natural products as catalysts for innovation: a pharmaceutical industry perspective. Curr Opin Chem Biol 2011;15:497–504.
[18] Harvey AL. Natural products in drug discovery. Drug Discovery Today 2008;13:894–901.
[19] Bohlin L, Göransson U, Alsmark C, Wedén C, Backlund A. Natural products in modern life science. Phytochem Rev 2010;9:279–301.
[20] Harvey AL, Clark RL, Mackay SP, Johnston BF. Current strategies for drug discovery through natural products. Expert Opin Drug Discovery 2010;5:559–68.
[21] Rosén J, Gottfries J, Muresan S, Backlund A, Oprea TI. Novel chemical space exploration via natural products. J Med Chem 2009;52:1953–62.
[22] Kombarov R, Altieri A, Genis D, Kirpichenok M, Kochubey V, Rakitina N, et al. Biocores: identification of a drug/natural product-based privileged structural motif for small-molecule lead discovery. Mol Diversity 2010;14:193–200.
[23] Newman DJ. Natural products as leads to potential drugs: an old process or the new hope for drug discovery? J Med Chem 2008;51:2589–99.
[24] Boldi AM. Libraries from natural product-like scaffolds. Curr Opin Chem Biol 2004;8:281–6.
[25] Medina-Franco JL, Giulianotti MA, Welmaker GS, Houghten RA. Shifting from the single to the multitarget paradigm in drug discovery. Drug Discovery Today 2013;18:495–501.
[26] Medina-Franco JL, Martinez-Mayorga K, Meurice N. Balancing novelty with confined chemical space in modern drug discovery. Expert Opin Drug Discovery 2014;9:151–65.
[27] Medina-Franco JL. Chemoinformatic characterization of the chemical space and molecular diversity of compound libraries. In: Andrea T, editor. Diversity-oriented synthesis: Basics and applications in organic synthesis, drug discovery, and chemical biology. Hoboken (New Jersey): John Wiley & Sons, Inc.; 2013. p. 325–52.
[28] Medina-Franco JL. Interrogating novel areas of chemical space for drug discovery using chemoinformatics. Drug Dev Res 2012;73:430–8.
[29] Engel T. Basic overview of chemoinformatics. J Chem Inf Model 2006;46:2267–77.
[30] Willett P. Chemoinformatics: a history. WIREs Comput Mol Sci 2011;1:46–56.
[31] Singh N, Guha R, Giulianotti MA, Pinilla C, Houghten RA, Medina-Franco JL. Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository. J Chem Inf Model 2009;49:1010–24.
[32] Medina-Franco JL, Martínez-Mayorga K, Giulianotti MA, Houghten RA, Pinilla C. Visualization of the chemical space in drug discovery. Curr Comput-Aided Drug Des 2008;4:322–33.
[33] Virshup AM, Contreras-García J, Wipf P, Yang W, Beratan DN. Stochastic voyages into uncharted chemical space produce a representative library of all possible drug-like compounds. J Am Chem Soc 2013;135:7296–303.
[34] Bohanec S, Zupan J. Structure generation of constitutional isomers from structural fragments. J Chem Inf Comput Sci 1991;31:531–40.
[35] Pearlman RS, Smith KM. Novel software tools for chemical diversity. Perspect Drug Discovery Des 1998;9–11:339–53.
[36] Sheridan RP, Kearsley SK. Why do we need so many chemical similarity search methods? Drug Discovery Today 2002;7:903–11.
[37] Ruddigkeit L, Blum LC, Reymond J-L. Visualization and virtual screening of the chemical universe database gdb-17. J Chem Inf Model 2013;53:56–65.
[38] Owen JR, Nabney IT, Medina-Franco JL, López-Vallejo F. Visualization of molecular fingerprints. J Chem Inf Model 2011;51:1552–63.
[39] Rabal O, Oyarzabal J. Biologically relevant chemical space navigator: from patent and structure–activity relationship analysis to library acquisition and design. J Chem Inf Model 2012;52:3123–37.
[40] Wetzel S, Schuffenhauer A, Roggo S, Ertl P, Waldmann H. Cheminformatic analysis of natural products and their chemical space. Chimia 2007;61:355–60.
[41] Shultz MD. Setting expectations in molecular optimizations: strengths and limitations of commonly used composite parameters. Bioorg Med Chem Lett 2013;23:5980–91.
[42] Yusof I, Segall MD. Considering the impact drug-like properties have on the chance of success. Drug Discovery Today 2013;18:659–66.
[43] Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Delivery Rev 1997;23:3–25.
[44] Veber DF, Johnson SR, Cheng HY, Smith BR, Ward KW, Kopple KD. Molecular properties that influence the oral bioavailability of drug candidates. J Med Chem 2002;45:2615–23.
[45] Hopkins AL, Keseru GM, Leeson PD, Rees DC, Reynolds CH. The role of ligand efficiency metrics in drug discovery. Nat Rev Drug Discovery 2014;13:105–21.
[46] Feher M, Schmidt JM. Property distributions: differences between drugs, natural products, and molecules from combinatorial chemistry. J Chem Inf Comput Sci 2003;43:218–27.
[47] Shelat AA, Guy RK. The interdependence between screening methods and screening libraries. Curr Opin Chem Biol 2007;11:244–51.
[48] Kong D-X, Li X-J, Zhang H-Y. Where is the hope for drug discovery? Let history tell the future. Drug Discovery Today 2009;14:115–9.
[49] Koehn FE, Carter GT. The evolving role of natural products in drug discovery. Nat Rev Drug Discov 2005;4:206–20.
[50] Ganesan A. The impact of natural products upon modern drug discovery. Curr Opin Chem Biol 2008;12:306–17.
[51] Irwin JJ, Shoichet BK. ZINC - a free database of commercially available compounds for virtual screening. J Chem Inf Model 2005;45:177–82.
[52] Lee ML, Schneider G. Scaffold architecture and pharmacophoric properties of natural products and trade drugs: application in the design of natural product-based combinatorial libraries. J Comb Chem 2001;3:284–9.
[53] Chen CY-C. TCM database@Taiwan: the world’s largest traditional chinese medicine database for drug screening in silico. PLoS One 2011;6:e15939.
[54] López-Vallejo F, Giulianotti MA, Houghten RA, Medina-Franco JL. Expanding the medicinally relevant chemical space with compound libraries. Drug Discovery Today 2012;17:718–26.
[55] O’Connell KMG, Galloway WRJD, Spring DR. The basics of diversity-oriented synthesis. In: Andrea T, editor. Diversity-oriented synthesis: Basics and applications in organic synthesis, drug discovery, and chemical biology. Hoboken (New Jersey): John Wiley & Sons, Inc.; 2013. p. 1–26.
[56] Clemons PA, Wilson JA, Dancik V, Muller S, Carrinski HA, Wagner BK, et al. Quantifying structure and performance diversity for sets of small molecules comprising small-molecule screening collections. Proc Natl Acad Sci USA 2011;108:6817–22.
[57] Manallack DT, Prankerd RJ, Nassta GC, Ursu O, Oprea TI, Chalmers DK. A chemogenomic analysis of ionization constants-implications for drug discovery. ChemMedChem 2013;8:242–55.
[58] Manallack DT, Dennis ML, Kelly MR, Prankerd RJ, Yuriev E, Chalmers DK. The acid/base profile of the human metabolome and natural products. Mol Inf 2013;32:505–15.
[59] Yongye AB, Medina-Franco JL. Systematic characterization of structure–activity relationships and ADMET compliance: a case study. Drug Discovery Today 2013;18:732–9.
[60] Larsson J, Gottfries J, Muresan S, Backlund A. ChemGPS-NP: Tuned for navigation in biologically relevant chemical space. J Nat Prod 2007;70:789–94.
[61] Rosen J, Lovgren A, Kogej T, Muresan S, Gottfries J, Backlund A. ChemGPS-NPWeb: chemical space navigation online. J Comput-Aided Mol Des 2009;23:253–9.
[62] Larsson J, Gottfries J, Bohlin L, Backlund A. Expanding the ChemGPS chemical space with natural products. J Nat Prod 2005;68:985–91.
[63] Oprea TI, Gottfries J. Chemography: the art of navigating in chemical space. J Comb Chem 2001;3:157–66.
[64] Medina-Franco JL, Waddell J. Towards the bioassay activity landscape modeling in compound databases. J Mex Chem Soc 2012;56:163–8.
[65] Brown N, Jacoby E. On scaffolds and hopping in medicinal chemistry. Mini-Rev Med Chem 2006;6:1217–29.
[66] Schuffenhauer A, Varin T. Rule-based classification of chemical structures by scaffold. Mol Inf 2011;30:646–64.
[67] Schneider G, Neidhart W, Giller T, Schmid G. Scaffold-hopping by topological pharmacophore search: a contribution to virtual screening. Angew Chem Int Ed 1999;38:2894–6.
[68] Evans BE, Rittle KE, Bock MG, DiPardo RM, Freidinger RM, Whitter WL, et al. Methods for drug discovery: development of potent, selective, orally effective cholecystokinin antagonists. J Med Chem 1988;31:2235–46.
[69] Mason JS, Morize I, Menard PR, Cheney DL, Hulme C, Labaudiniere RF. New 4-point pharmacophore method for molecular similarity and diversity applications: overview of the method and applications, including a novel approach to the design of combinatorial libraries containing privileged substructures. J Med Chem 1999;42:3251–64.
[70] Xu Y, Johnson M. Algorithm for naming molecular equivalence classes represented by labeled pseudographs. J Chem Inf Comput Sci 2001;41:181–5.
[71] Xu YJ, Johnson M. Using molecular equivalence numbers to visually explore structural features that distinguish chemical libraries. J Chem Inf Comput Sci 2002;42:912–26.
[72] Bemis GW, Murcko MA. The properties of known drugs. 1. Molecular frameworks. J Med Chem 1996;39:2887–93.
[73] Medina-Franco JL, Petit J, Maggiora GM. Hierarchical strategy for identifying active chemotype classes in compound databases. Chem Biol Drug Des 2006;67:395–408.
[74] López-Vallejo F, Peppard TL, Medina-Franco JL, Martínez-Mayorga K. Computational methods for the discovery of mood disorder therapies. Expert Opin Drug Discovery 2011;6:1227–45.
[75] López-Vallejo F, Castillo R, Yépez-Mulia L, Medina-Franco JL. Benzotriazoles and indazoles are scaffolds with biological activity against Entamoeba histolytica. J Biomol Screening 2011;16:862–8.
[76] Villar HO, Hansen MR, Kho R. Substructural analysis in drug discovery. Curr Comput-Aided Drug Design 2007;3:59–67.
[77] Medina-Franco JL, Martínez-Mayorga K, Bender A, Scior T. Scaffold diversity analysis of compound data sets using an entropy-based measure. QSAR Comb Sci 2009;28:1551–60.
[78] Yongye AB, Waddell J, Medina-Franco JL. Molecular scaffold analysis of natural products databases in the public domain. Chem Biol Drug Des 2012;80:717–24.
[79] Koch MA, Schuffenhauer A, Scheck M, Wetzel S, Casaulta M, Odermatt A, et al. Charting biologically relevant chemical space: a structural classification of natural products (SCONP). Proc Natl Acad Sci USA 2005;102:17272–7.
[80] Ertl P, Roggo S, Schuffenhauer A. Natural product-likeness score and its application for prioritization of compound libraries. J Chem Inf Model 2008;48:68–74.
[81] Chen H, Engkvist O, Blomberg N, Li J. A comparative analysis of the molecular topologies for drugs, clinical candidates, natural products, human metabolites and general bioactive compounds. Med Chem Comm 2012;3:312–21.
[82] Leach AR, Gillet VJ. An introduction to chemoinformatics. Dordrecht (The Netherlands): Kluwer Academic Publishers; 2003.
[83] Shanmugasundaram V, Maggiora GM, Lajiness MS. Hit-directed nearest-neighbor searching. J Med Chem 2005;48:240–8.
[84] Medina-Franco JL, Martínez-Mayorga K, Bender A, Marín RM, Giulianotti MA, Pinilla C, et al. Characterization of activity landscapes using 2D and 3D similarity methods: consensus activity cliffs. J Chem Inf Model 2009;49:477–91.
[85] Willett P. Similarity-based virtual screening using 2D fingerprints. Drug Discovery Today 2006;11:1046–53.
[86] Feher M. Consensus scoring for protein-ligand interactions. Drug Discovery Today 2006;11:421–8.
[87] Yongye A, Byler K, Santos R, Martínez-Mayorga K, Maggiora GM, Medina-Franco JL. Consensus models of activity landscapes with multiple chemical, conformer and property representations. J Chem Inf Model 2011;51:1259–70.
[88] Medina-Franco JL, Yongye AB, López-Vallejo F. Consensus models of activity landscapes. In: Matthias D, Kurt V, Danail B,
editors. Statistical modeling of molecular descriptors in QSAR/QSPR. Weinheim, Germany: Wiley-VCH; 2012. p. 307–26.
[89] Chu C-W, Holliday JD, Willett P. Combining multiple classifications of chemical structures using consensus clustering. Bioorg Med Chem 2012;20:5366–71.
[90] Pérez-Villanueva J, Santos R, Hernández-Campos A, Giulianotti MA, Castillo R, Medina-Franco JL. Towards a systematic characterization of the antiprotozoal activity landscape of benzimidazole derivatives. Bioorg Med Chem 2010;18:7380–91.
[91] Jaccard P. Etude comparative de la distribution florale dans une portion des alpes et des jura. Bull Soc Vaudoise Sci Nat 1901;37:547–79.
[92] Willett P, Barnard JM, Downs GM. Chemical similarity searching. J Chem Inf Comput Sci 1998;38:983–96.
[93] Lovering F, Bikker J, Humblet C. Escape from flatland: Increasing saturation as an approach to improving clinical success. J Med Chem 2009;52:6752–6.
[94] Clemons PA, Bodycombe NE, Carrinski HA, Wilson JA, Shamji AF, Wagner BK, et al. Small molecules of different origins have distinct distributions of structural complexity that correlate with protein-binding profiles. Proc Natl Acad Sci USA 2010;107:18787–92.
[95] Bertz SH. The 1st general index of molecular complexity. J Am Chem Soc 1981;103:3599–601.
[96] Barone R, Chanon M. A new and simple approach to chemical complexity. Application to the synthesis of natural products. J Chem Inf Comput Sci 2001;41:269–72.
[97] Allu TK, Oprea TI. Rapid evaluation of synthetic and molecular complexity for in silico chemistry. J Chem Inf Model 2005;45:1237–43.
[98] Schuffenhauer A, Brown N, Selzer P, Ertl P, Jacoby E. Relationships between molecular complexity, biological activity, and structural diversity. J Chem Inf Model 2006;46:525–35.
[99] Miller MA. Chemical database techniques in drug discovery. Nat Rev Drug Discov 2002;1:220–7.
[100] Reymond J-L, van Deursen R, Blum LC, Ruddigkeit L. Chemical space as a source for new drugs. Med Chem Comm 2010;1:30–8.
[101] Nguyen KT, Syed S, Urwyler S, Bertrand S, Bertrand D, Reymond JL. Discovery of NMDA glycine site inhibitors from the chemical universe database GDB. ChemMedChem 2008;3:1520–4.
[102] Nguyen KT, Luethi E, Syed S, Urwyler S, Bertrand S, Bertrand D, et al. 3-(aminomethyl)piperazine-2,5-dione as a novel NMDA glycine site inhibitor from the chemical universe database GDB. Bioorg Med Chem Lett 2009;19:3832–5.
[103] Scior T, Bernard P, Medina-Franco JL, Maggiora GM. Large compound databases for structure-activity relationships studies in drug discovery. Mini-Rev Med Chem 2007;7:851–60.
[104] Bender A. Databases compound bioactivities go public. Nat Chem Biol 2010;6:309.
[105] Barbosa AJM, Del Rio A. Freely accessible databases of commercial compounds for high-throughput virtual screenings. Curr Top Med Chem 2012;12:866–77.
[106] López-Vallejo F, Nefzi A, Bender A, Owen JR, Nabney IT, Houghten RA, et al. Increased diversity of libraries from libraries: chemoinformatic analysis of bis-diazacyclic libraries. Chem Biol Drug Des 2011;77:328–42.
[107] Fullbeck M, Michalsky E, Dunkel M, Preissner R. Natural products: sources and databases. Nat Prod Rep 2006;23:347–56.
[108] Haustedt LO, Mang C, Siems K, Schiewe H. Rational approaches to natural-product-based drug design. Curr Opin Drug Discovery Dev 2006;9:445–62.
[109] Grabowski K, Baringhaus K-H, Schneider G. Scaffold diversity of natural products: Inspiration for combinatorial library design. Nat Prod Rep 2008;25:892–904.
[110] Over B, Wetzel S, Grutter C, Nakai Y, Renner S, Rauh D, et al. Natural-product-derived fragments for fragment-based ligand discovery. Nat Chem 2013;5:21–8.
[111] Ntie-Kang F, Zofou D, Babiaka SB, Meudom R, Scharfe M, Lifongo LL, et al. AfroDb: a select highly potent and diverse natural product library from african medicinal plants. PLoS One 2013;8:e78085.
[112] Dunkel M, Fullbeck M, Neumann S, Preissner R. Supernatural: a searchable database of available natural compounds. Nucleic Acids Res 2006;34:D678–83.
[113] Kang H, Tang K, Liu Q, Sun Y, Huang Q, Zhu R, et al. HIM-herbal ingredients in-vivo metabolism database. J Cheminf 2013;5:28.
[114] Lei J, Zhou J. A marine natural product database. J Chem Inf Comput Sci 2002;42:742–8.
[115] Valli M, dos Santos RN, Figueira LD, Nakajima CH, Castro-Gamboa I, et al. Development of a natural products database from the biodiversity of Brazil. J Nat Prod 2013;76:439–44.
[116] Tsai T-Y, Chang K-W, Chen C. Iscreen: world’s first cloud-computing web server for virtual screening and de novo drug design based on TCM database@Taiwan. J Comput-Aided Mol Des 2011;25:525–31.
[117] Chen K-Y, Chang S-S, Chen CY-C. In silico identification of potent pancreatic triacylglycerol lipase inhibitors from traditional chinese medicine. PLoS One 2012;7:e43932.
[118] Gu J, Gui Y, Chen L, Yuan G, Lu H-Z, Xu X. Use of natural products as chemical library for drug discovery and network pharmacology. PLoS One 2013;8:e62839.
[119] Ntie-Kang F, Onguene PA, Scharfe M, Owono LCO, Megnassan E, Mbaze LM, et al. ConMedNP: a natural product library from central african medicinal plants for drug discovery. RSC Adv 2014;4:409–19.
[120] Ntie-Kang F, Mbah JA, Mbaze LM, Lifongo LL, Scharfe M, Hanna JN, et al. CamMedNP: Building the cameroonian 3d structural natural products database for virtual screening. BMC Complementary Altern Med 2013;13:88.
[121] Jacoby E. Computational chemogenomics. Wiley Interdiscip Rev: Comput Mol Sci 2011;1:57–67.
[122] Rognan D. Structure-based approaches to target fishing and ligand profiling. Mol Inf 2010;29:176–87.
[123] Bajorath J. A perspective on computational chemogenomics. Mol Inf 2013;32:1025–8.
[124] Rognan D. Towards the next generation of computational chemogenomics tools. Mol Inf 2013;32:1029–34.
[125] Mullard A. 2013 FDA drug approvals. Nat Rev Drug Discov 2014;13:85–9.
[126] Yongye AB, Medina-Franco JL. Data mining of protein-binding profiling data identifies structural modifications that distinguish selective and promiscuous compounds. J Chem Inf Model 2012;52:2454–61.
[127] Dimova D, Hu Y, Bajorath J. Matched molecular pair analysis of small molecule microarray data identifies promiscuity cliffs and reveals molecular origins of extreme compound promiscuity. J Med Chem 2012;55:10220–8.
[128] Yongye AB, Medina-Franco JL. Toward an efficient approach to identify molecular scaffolds possessing selective or promiscuous compounds. Chem Biol Drug Des 2013;82:367–75.
[129] Dossetter AG, Griffen EJ, Leach AG. Matched molecular pair analysis in drug discovery. Drug Discovery Today 2013;18:724–31.
[130] Klebe G. Virtual ligand screening: strategies, perspectives and limitations. Drug Discovery Today 2006;11:580–94.
[131] Heikamp K, Bajorath J. The future of virtual compound screening. Chem Biol Drug Des 2013;81:33–40.
[132] Muegge I. Synergies of virtual screening approaches. Mini-Rev Med Chem 2008;8:927–33.

[133] Clark DE. What has virtual screening ever done for drug discovery? Expert Opin Drug Discovery 2008;3:841–51.
[134] Hu GP, Li X, Zhang X, Li YZ, Ma L, Yang LM, et al. Discovery of inhibitors to block interactions of HIV-1 integrase with human LEDGF/p75 via structure-based virtual screening and bioassays. J Med Chem 2012;55:10108–17.
[135] Wang LY, Li X, Zhang SD, Lu WQ, Liao S, Liu XF, et al. Natural products as a gold mine for selective matrix metalloproteinases inhibitors. Bioorg Med Chem 2012;20:4164–71.
[136] Remya C, Dileep KV, Tintu I, Variyar EJ, Sadasivan C. In vitro inhibitory profile of NDGA against AChE and its in silico structural modifications based on ADME profile. J Mol Model 2013;19:1179–94.
[137] Galvez-Llompart M, Recio Iglesias M d C, Galvez J, Garcia-Domenech R. Novel potential agents for ulcerative colitis by molecular topology: suppression of IL-6 production in Caco-2 and RAW 264.7 cell lines. Mol Diversity 2013;17:573–93.
[138] Cao X, Jiang J, Zhang S, Zhu L, Zou J, Diao Y, et al. Discovery of natural estrogen receptor modulators with structure-based virtual screening. Bioorg Med Chem Lett 2013;23:3329–33.
[139] Lauro G, Masullo M, Piacente S, Riccio R, Bifulco G. Inverse virtual screening allows the discovery of the biological activity of natural compounds. Bioorg Med Chem 2012;20:3596–602.
[140] Schuster D, Wolber G. Identification of bioactive natural products by pharmacophore-based virtual screening. Curr Pharm Des 2010;16:1666–81.
[141] Ehrman TM, Barlow DJ, Hylands PJ. Phytochemical informatics and virtual screening of herbs used in Chinese medicine. Curr Pharm Des 2010;16:1785–98.
[142] Shen JH, Xu XY, Cheng F, Liu H, Luo XM, Shen JK, et al. Virtual screening on natural products for discovering active compounds and target information. Curr Med Chem 2003;10:2327–42.
[143] Ma DL, Chan DSH, Leung CH. Molecular docking for virtual screening of natural product databases. Chem Sci 2011;2:1656–65.
[144] Geldenhuys WJ, Bishayee A, Darvesh AS, Carroll RT. Natural products of dietary origin as lead compounds in virtual screening and drug design. Curr Pharm Biotechnol 2012;13:117–24.
[145] Clark RL, Johnston BF, Mackay SP, Breslin CJ, Robertson MN, Harvey AL. The drug discovery portal: a resource to enhance drug discovery from academia. Drug Discovery Today 2010;15:679–83.
[146] Nettles JH, Jenkins JL, Bender A, Deng Z, Davies JW, Glick M. Bridging chemical and biological space: “Target fishing” using 2D and 3D molecular descriptors. J Med Chem 2006;49:6802–10.
[147] Yue R, Shan L, Yang X, Zhang W. Approaches to target profiling of natural products. Curr Med Chem 2012;19:3841–55.
[148] Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nat Biotech 2007;25:197–206.
[149] Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ, et al. Predicting new molecular targets for known drugs. Nature 2009;462:175–82.
[150] Lagunin A, Stepanchikova A, Filimonov D, Poroikov V. PASS: prediction of activity spectra for biologically active substances. Bioinformatics 2000;16:747–8.
[151] Rudik AV, Dmitriev AV, Lagunin AA, Filimonov DA, Poroikov VV. Metabolism site prediction based on xenobiotic structural formulas and PASS prediction algorithm. J Chem Inf Model 2014:498–507.
[152] Lagunin A, Filimonov D, Poroikov V. Multi-targeted natural products evaluation based on biological activity prediction with PASS. Curr Pharm Des 2010;16:1703–17.
[153] Carregal AP, Comar M, Alves SN, de Siqueira JM, Lima LA, Taranto AG. Inverse virtual screening studies of selected natural compounds from cerrado. Int J Quantum Chem 2012;112:3333–40.
[154] Gerhard W, Judith MR. Virtual screening and target fishing for natural products using 3D pharmacophores. In: Jacoby E, editor. Computational chemogenomics. Florida (United States): Taylor & Francis Group; 2013. p. 117–39.
[155] Martínez-Mayorga K, Medina-Franco JL, Organizers. Symposium: Foodinformatics: applications of chemical information to food chemistry. Division of Chemical Information. 245th ACS National Meeting. New Orleans, LA, United States; 2013.
[156] Blondeau S, Do QT, Scior T, Bernard P, Morin-Allory L. Reverse pharmacognosy: another way to harness the generosity of nature. Curr Pharm Des 2010;16:1682–96.
[157] Do QT, Lamy C, Renimel I, Sauvan N, Andre P, Himbert F, et al. Reverse pharmacognosy: Identifying biological properties for plants by means of their molecule constituents: application to meranzin. Planta Med 2007;73:1235–40.
[158] Bernard P, Dufresne-Favetta C, Favetta P, Do QT, Himbert F, Zubrzycki S, et al. Application of drug repositioning strategy to TOFISOPAM. Curr Med Chem 2008;15:3196–203.
[159] Zinzalla G, Thurston DE. Targeting protein-protein interactions for therapeutic intervention: a challenge for the future. Future Med Chem 2009;1:65–93.
[160] Wells JA, McClendon CL. Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature 2007;450:1001–9.
[161] Sperandio O, Reynes CH, Camproux AC, Villoutreix BO. Rationalizing the chemical space of protein-protein interaction inhibitors. Drug Discovery Today 2010;15:220–9.
[162] Whitty A, Kumaravel G. Between a rock and a hard place? Nat Chem Biol 2006;2:112–8.
[163] Bienstock RJ. Computational drug design targeting protein-protein interactions. Curr Pharm Des 2012;18:1240–54.
[164] Knapp S, Weinmann H. Small-molecule modulators for epigenetics targets. ChemMedChem 2013;8:1885–91.
[165] Rius M, Lyko F. Epigenetic cancer therapy: Rationales, targets and drugs. Oncogene 2012;31:4257–65.
[166] Egger G, Liang G, Aparicio A, Jones PA. Epigenetics in human disease and prospects for epigenetic therapy. Nature 2004;429:457–63.
[167] Svedruzic ZM. Mammalian cytosine DNA methyltransferase Dnmt1: enzymatic mechanism, novel mechanism-based inhibitors, and RNA-directed DNA methylation. Curr Med Chem 2008;15:92–106.
[168] Miller CA, Gavin CF, White JA, Parrish RR, Honasoge A, Yancey CR, et al. Cortical DNA methylation maintains remote memory. Nat Neurosci 2010;13:664–6.
[169] Zawia NH, Lahiri DK, Cardozo-Pelaez F. Epigenetics, oxidative stress, and Alzheimer disease. Free Radical Biol Med 2009;46:1241–9.
[170] Castellano S, Kuck D, Viviano M, Yoo J, López-Vallejo F, Conti P, et al. Synthesis and biochemical evaluation of Δ2-isoxazoline derivatives as DNA methyltransferase 1 inhibitors. J Med Chem 2011;54:7663–77.
[171] Rilova E, Erdmann A, Gros C, Masson V, Aussagues Y, Poughon-Cassabois V, et al. Design, synthesis and biological evaluation of 4-amino-N-(4-aminophenyl)benzamide analogues of quinoline-based SGI-1027 as inhibitors of DNA methylation. ChemMedChem 2014;9:590–601.
[172] Méndez-Lucio O, Tran J, Medina-Franco JL, Meurice N, Muller M. Towards drug repurposing in epigenetics: olsalazine as a novel hypomethylating compound active in a cellular context. ChemMedChem 2014;9:560–5.
[173] Medina-Franco JL, López-Vallejo F, Kuck D, Lyko F. Natural products as DNA methyltransferase inhibitors: a computer-aided discovery approach. Mol Diversity 2011;15:293–304.
[174] Lee WJ, Shim JY, Zhu BT. Mechanisms for the inhibition of DNA methyltransferases by tea catechins and bioflavonoids. Mol Pharmacol 2005;68:1018–30.
[175] Liu ZF, Xie ZL, Jones W, Pavlovicz RE, Liu SJ, Yu JH, et al. Curcumin is a potent DNA hypomethylation agent. Bioorg Med Chem Lett 2009;19:706–9.
[176] Pina IC, Gautschi JT, Wang GYS, Sanders ML, Schmitz FJ, France D, et al. Psammaplins from the sponge Pseudoceratina purpurea: Inhibition of both histone deacetylase and DNA methyltransferase. J Org Chem 2003;68:3866–73.
[177] Pereira R, Benedetti R, Perez-Rodriguez S, Nebbioso A, Garcia-Rodriguez J, Carafa V, et al. Indole-derived psammaplin A analogues as epigenetic modulators with multiple inhibitory activities. J Med Chem 2012;55:9467–91.
[178] Kuck D, Caulfield T, Lyko F, Medina-Franco JL. Nanaomycin A selectively inhibits DNMT3B and reactivates silenced tumor suppressor genes in human cancer cells. Mol Cancer Ther 2010;9:3015–23.
[179] Hauser AT, Jung M. Targeting epigenetic mechanisms: potential of natural products in cancer chemoprevention. Planta Med 2008;74:1593–601.
[180] Li Y, Tollefsbol TO. Impact on DNA methylation in cancer prevention and therapy by bioactive dietary components. Curr Med Chem 2010;17:2141–51.
[181] Cherblanc FL, Davidson RWM, Di Fruscia P, Srimongkolpithak N, Fuchter MJ. Perspectives on natural product epigenetic modulators in chemical biology and medicine. Nat Prod Rep 2013;30:605–24.
[182] Jia D, Jurkowska RZ, Zhang X, Jeltsch A, Cheng X. Structure of Dnmt3a bound to Dnmt3L suggests a model for de novo DNA methylation. Nature 2007;449:248–51.
[183] Medina-Franco JL, Caulfield T. Advances in the computational development of DNA methyltransferase inhibitors. Drug Discovery Today 2011;16:418–25.
[184] Yoo J, Medina-Franco JL. Inhibitors of DNA methyltransferases: Insights from computational studies. Curr Med Chem 2012;19:3475–87.
[185] Yoo J, Medina-Franco JL. Homology modeling, docking, and structure-based pharmacophore of inhibitors of DNA methyltransferase. J Comput-Aided Mol Des 2011;25:555–67.
[186] Yoo J, Kim JH, Robertson KD, Medina-Franco JL. Molecular modeling of inhibitors of human DNA methyltransferase with a crystal structure: discovery of a novel DNMT1 inhibitor. Adv Protein Chem Struct Biol 2012;87:219–47.
[187] Yoo J, Medina-Franco JL. Trimethylaurintricarboxylic acid inhibits human DNA methyltransferase 1: Insights from enzymatic and molecular modeling studies. J Mol Model 2012;18:1583–9.
[188] Yoo J, Medina-Franco JL. Chemoinformatic approaches for inhibitors of DNA methyltransferases: comprehensive characterization of screening libraries. Comput Mol Biosci 2011;1:7–16.
[189] Medina-Franco JL, Martínez-Mayorga K, Peppard TL, Del Rio A. Chemoinformatic analysis of GRAS (Generally Recognized as Safe) flavor chemicals and natural products. PLoS One 2012;7:e50798.
[190] Guasch L, Sala E, Castell-Auvi A, Cedo L, Liedl KR, Wolber G, et al. Identification of PPARgamma partial agonists of natural origin (I): development of a virtual screening procedure and in vitro validation. PLoS One 2012;7:e50816.
[191] Martinez-Mayorga K, Peppard TL, López-Vallejo F, Yongye AB, Medina-Franco JL. Systematic mining of generally recognized as safe (GRAS) flavor chemicals for bioactive compounds. J Agric Food Chem 2013;61:7507–14.

LIST OF ABBREVIATIONS

2D Two-dimensional
3D Three-dimensional
DDP Drug Discovery Portal
DNMT DNA methyltransferase
DOS Diversity-oriented synthesis
EGCG (-)-Epigallocatechin-3-gallate
FDA Food and Drug Administration
GDB Generated Database of Chemical Space
GpiDAPH3 Graph-based three-point pharmacophore
GRAS Generally Recognized as Safe
HBA Hydrogen bond acceptor
HBD Hydrogen bond donor
HDAC1 Histone deacetylase-1
HTS High-throughput screening
MACCS Molecular Access System
MLSMR Molecular Libraries Small Molecule Repository
MW Molecular weight
NMR Nuclear magnetic resonance
NPs Natural products
PASS Prediction of activity spectra for substances
PPIs Protein–protein interactions
PSA Polar surface area
QSAR Quantitative structure–activity relationship
RB Rotatable bond
SCONP Structural classification of natural products
SEA Similarity ensemble approach
SAH S-adenosyl-L-homocysteine
SAM S-adenosyl-L-methionine
Slog P Octanol/water partition coefficient
SPID Structure–promiscuity index difference
TCM Traditional Chinese Medicine
TGD Typed graph distance
TPSA Topological polar surface area
UNPD Universal Natural Products Database
