
ADMETopt: A Web Server for ADMET Optimization in Drug Design via Scaffold
Hopping

Article in Journal of Chemical Information and Modeling · September 2018


DOI: 10.1021/acs.jcim.8b00532



Application Note
ADMETopt: A Web Server for ADMET
Optimization in Drug Design via Scaffold Hopping
Hongbin Yang, Lixia Sun, Zhuang Wang, Weihua Li, Guixia Liu, and Yun Tang
J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00532 • Publication Date (Web): 25 Sep 2018
Downloaded from http://pubs.acs.org on October 1, 2018

ADMETopt: A Web Server for ADMET Optimization in Drug Design via Scaffold Hopping

Hongbin Yang, Lixia Sun, Zhuang Wang, Weihua Li, Guixia Liu, Yun Tang*

Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China

*Corresponding author, E-mail: ytang234@ecust.edu.cn


Abstract:

Drug-likeness, which encompasses ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties, plays a significant role in early drug discovery. However, current lead optimization strategies still focus on in vitro potency, which may cause “molecular obesity” (poor ADMET properties). Optimization of ADMET properties would therefore be a valuable complement in drug discovery. In this paper, we present a web server, ADMETopt, which applies scaffold hopping and ADMET screening to lead optimization. More than 50 thousand unique scaffolds were extracted by fragmenting chemicals deposited in the ChEMBL and Enamine databases. Up to 15 ADMET properties can be predicted to screen the candidate molecules, including 7 physicochemical properties and 8 biological properties. All the models were built in our previous studies and are available in our web server admetSAR. To assess the plausibility of the modified molecules, synthetic accessibility (SA) and the quantitative estimate of drug-likeness (QED) were also implemented. As a case study, a scaffold similarity network was constructed for compounds with bioactivity on estrogen receptors. The results demonstrated the feasibility and practicability of our web server. The web server is publicly accessible at http://lmmd.ecust.edu.cn/admetsar2/admetopt/.

Keywords: ADMET; scaffold hopping; lead optimization; web server


Introduction

The process of drug discovery and development is costly and risky: thousands of chemicals have to be synthesized and tested, but most of them fail due to lack of efficacy, unacceptable adverse effects, or poor pharmacokinetics. The general pipeline of drug discovery includes 1) lead discovery, 2) lead optimization in terms of bioactivity, 3) lead optimization in terms of ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties, 4) pre-clinical studies, and 5) clinical trials.1 Over the past decades, high-throughput screening (HTS) has been widely applied to lead discovery in both industry and academia.2 Meanwhile, with the development of virtual screening techniques, which mainly comprise structure-based and ligand-based strategies,3-5 and the support of large databases of purchasable chemicals such as ZINC,6 lead discovery has become much easier than ever before. Accordingly, researchers have shown a preference for rational drug design.

A compound with low drug-likeness and poor ADMET properties will not be advanced into preclinical study despite high bioactivity. Poor ADMET properties remain one of the most common causes of drug failure, although remarkable progress has been made as more attention has been paid to these properties in recent years. Toxicity in particular is still the major cause of drug failure in clinical trials.7, 8 Compared with experimental determination of chemical ADMET properties, in silico methods have shown great advantages: they are fast, cheap, green, and accurate. With the development of various machine learning methods, many computational models have been developed for the prediction of various endpoints, including drug-induced liver injury, acute oral toxicity, etc.9-11 Our group has developed a web server for chemical ADMET prediction, namely admetSAR, which included 22 qualitative classification and 5 quantitative regression models.12 We have been continually updating the web server and adding predictive models, and it was recently updated to version 2 with 47 predictive models.13-17 More recently, Daina et al. developed SwissADME, which can also predict chemical ADME properties.18 With the help of these tools, we can judge whether a hit or lead compound can be moved forward to preclinical research. In addition, ADMET filters can be used in early drug discovery, for example in the selection of screening libraries of natural products.19, 20

However, these predictive tools cannot guide the structural optimization of ADMET properties. So far, medicinal chemists tend to modify molecules based on experience: for example, adding a carboxyl or hydroxyl group to improve water solubility, replacing bicyclic scaffolds with smaller ones to reduce molecular weight, or avoiding nitro groups and condensed rings that may lead to mutagenicity. Here, we present a web server, named ADMETopt, in which scaffold hopping is implemented to optimize chemical ADMET properties. We assume that slight changes to a scaffold will not greatly affect affinity, although activity cliffs may occur during optimization.21 The goal of the web server is to help medicinal chemists optimize lead compounds with respect to ADMET properties.


Materials and Methods

Construction of scaffold library

A scaffold was defined as a single ring or a collection of fused or spiro rings, including exocyclic terminal bonds.22 The number of atoms in each ring of the smallest set of smallest rings (SSSR) must be between 4 and 7. We did not consider linkers between two or more rings, since this might complicate the scaffold space and make chemists overly reliant on the replacement algorithm. As illustrated in Figure S4, the three scaffolds of tamoxifen, a selective estrogen receptor α (ERα) antagonist used for the treatment of breast cancer, are all benzene rings. Two of them have one substituent, while the other has two para-substituents. Although the “key scaffold” of tamoxifen should be the double bond together with the surrounding aromatic rings, we do not consider this kind of scaffold since it might be irreplaceable. One compound may contain more than one scaffold. In that case, users should select one of the scaffolds to replace according to their own requirements before optimization.

The scaffold library was constructed by fragmenting all the chemicals from ChEMBL23 and the Enamine database (http://www.enamine.net/), which contained about 579 thousand and 1.75 million unique compounds, respectively. The scaffolds were extracted as canonical SMILES in which substituents were represented as “*”. Scaffolds without any substituent were removed. Duplicated scaffolds were removed according to their canonical SMILES. Note that scaffolds with substituents in different positions were regarded as different scaffolds and thus were not removed as duplicates. The chirality of the scaffolds was preserved.
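
The filtering and de-duplication steps described above can be illustrated with a minimal sketch. It assumes that each input string is already a canonical SMILES (the canonicalization and fragmentation steps are not shown), and the scaffold strings and the function name `build_scaffold_library` are hypothetical examples, not the server's actual code.

```python
def build_scaffold_library(scaffold_smiles):
    """Keep unique scaffolds that carry at least one '*' connection point.

    Assumption: every input string is already a canonical SMILES, so
    identical scaffolds are represented by identical strings.
    """
    library = []
    seen = set()
    for smi in scaffold_smiles:
        if "*" not in smi:   # scaffold without any substituent: removed
            continue
        if smi in seen:      # duplicated canonical SMILES: removed
            continue
        seen.add(smi)
        library.append(smi)
    return library


# Hypothetical scaffold strings, for illustration only.
raw = ["*c1ccccc1", "*c1ccccc1", "c1ccccc1", "*c1ccc(*)cc1", "*c1cccc(*)c1"]
library = build_scaffold_library(raw)
```

In this toy input the para- and meta-disubstituted benzenes are both kept, since scaffolds with substituents in different positions count as different scaffolds.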

Scaffold fingerprint and scaffold hopping

Scaffold hopping replaces a selected scaffold of a query compound with a similar one from the library. The similarity was calculated as the Tanimoto similarity coefficient between the fingerprints of the two scaffolds, defined in Eq. 1.

$$S_{i,j} = \frac{\sum_{k=1}^{m} \left[ FP_k(i) = FP_k(j) \neq 0 \right]}{\sum_{k=1}^{m} \left[ FP_k(i) \neq 0 \ \text{or} \ FP_k(j) \neq 0 \right]} \tag{1}$$

where i and j are the scaffold indices, m is the length of the scaffold fingerprint (1021 bits), and FP_k(i) denotes the kth bit of the fingerprint of scaffold i. The expression in square brackets evaluates to 1 if it is true and 0 otherwise.
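
As a rough illustration of Eq. 1, the following sketch computes the similarity for two fingerprints stored as plain Python lists. The function name, the toy fingerprints, and the convention of returning 0.0 when both fingerprints are empty are assumptions made for this example; they are not taken from the web server's implementation.

```python
def scaffold_tanimoto(fp_i, fp_j):
    """Similarity of two equal-length scaffold fingerprints, following Eq. 1."""
    if len(fp_i) != len(fp_j):
        raise ValueError("fingerprints must have the same length")
    # Numerator: positions where both values are equal and non-zero.
    shared = sum(1 for a, b in zip(fp_i, fp_j) if a == b and a != 0)
    # Denominator: positions where at least one value is non-zero.
    occupied = sum(1 for a, b in zip(fp_i, fp_j) if a != 0 or b != 0)
    # Assumed convention: two all-zero fingerprints get similarity 0.0.
    return shared / occupied if occupied else 0.0


# Toy fingerprints (hypothetical values, much shorter than 1021 positions).
fp_a = [1, 0, 2, 0, 1]
fp_b = [1, 1, 2, 0, 0]
similarity = scaffold_tanimoto(fp_a, fp_b)  # 2 shared / 4 occupied
```

Because the bracket in the numerator tests equality rather than joint presence, the same function works whether a position holds a binary flag or a small count.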

Scaffold fingerprints were generated by calculating the properties listed in Table S1 with Python scripts based on RDKit.24 The concept of the scaffold fingerprint was mainly inspired by Rabal et al.,22 with reference to current studies on scaffold similarity and hopping techniques.25-29 The properties can be categorized into two groups: the first 14 properties are statistics involving the ring systems, elements, and especially the carbons; the remaining 4 properties are distance-related feature vectors. The diversity points include 6 atom types, i.e. sp3 carbon, sp2 carbon, aromatic carbon, aromatic nitrogen, aliphatic nitrogen, and other atom types. The pharmacophores include hydrogen donors and 4 types of hydrogen acceptors, i.e. exocyclic acceptor, nitrogen, oxygen, and other acceptor atoms. These 11 atom types formed 66 atom type pairs in combination, including diversity point-diversity point pairs, diversity point-pharmacophore pairs, and pharmacophore-pharmacophore pairs. For each atom type pair, a 15-bit vector was generated according to the occurrence at the corresponding distance. Another distance-related feature vector, called the shape feature, was also a 15-bit vector counting the number of atoms at a corresponding distance from the diversity points, which can reflect the topological properties of the ring systems. No weights were assigned to the properties, so each bit of the scaffold fingerprint is of equal importance. The feature vectors are very sparse, although they contain many more descriptor bits than the statistical properties.
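
The count of 66 atom type pairs follows from taking unordered pairs of the 11 atom types with same-type pairs allowed. The short sketch below reproduces that arithmetic; the type labels are shorthand invented for this example, and the exact bit layout of the 1021-bit fingerprint is not specified in the text, so only the pair-related counts are shown.

```python
from itertools import combinations_with_replacement

# Shorthand labels (hypothetical names) for the atom types listed in the text.
DIVERSITY_POINT_TYPES = [
    "C_sp3", "C_sp2", "C_aromatic", "N_aromatic", "N_aliphatic", "other",
]
PHARMACOPHORE_TYPES = [
    "donor", "acceptor_exocyclic", "acceptor_N", "acceptor_O", "acceptor_other",
]
ATOM_TYPES = DIVERSITY_POINT_TYPES + PHARMACOPHORE_TYPES

# Unordered pairs, including pairs of the same type: 11 * 12 / 2 = 66.
pairs = list(combinations_with_replacement(ATOM_TYPES, 2))

BITS_PER_PAIR = 15  # one position per distance, as stated in the text
pair_vector_bits = len(pairs) * BITS_PER_PAIR

# Breakdown by pair class.
dp = set(DIVERSITY_POINT_TYPES)
dp_dp = sum(1 for a, b in pairs if a in dp and b in dp)
ph_ph = sum(1 for a, b in pairs if a not in dp and b not in dp)
dp_ph = len(pairs) - dp_dp - ph_ph

print(len(pairs), dp_dp, dp_ph, ph_ph, pair_vector_bits)
```

Under these assumptions the pair vectors alone would account for 990 positions, which is consistent with the remark that the feature vectors contain far more bits than the 14 statistical properties.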

During scaffold hopping, we required the number of diversity points of the new scaffold to be the same as that of the replaced one. The order of the connection points of the new scaffold can differ from that of the replaced one, i.e. the two scaffolds need not be aligned to match the connection points (Figure S4). When the number of connection points was greater than two, a distance algorithm was used to decide how the scaffold is replaced. A detailed description of the implementation of scaffold hopping in the web server is provided in the Supporting Information.
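
The candidate selection step can be sketched as follows, under several assumptions: diversity points are taken to be the “*” connection points of the scaffold SMILES, fingerprints are toy lists, and the function names are hypothetical. The distance algorithm that maps connection points between scaffolds is described only in the Supporting Information and is not reproduced here.

```python
def tanimoto(fp_i, fp_j):
    """Scaffold similarity as in Eq. 1 (0.0 if both fingerprints are empty)."""
    shared = sum(1 for a, b in zip(fp_i, fp_j) if a == b and a != 0)
    occupied = sum(1 for a, b in zip(fp_i, fp_j) if a != 0 or b != 0)
    return shared / occupied if occupied else 0.0


def count_connection_points(smiles):
    """Assumption: '*' appears in the SMILES only as a connection point."""
    return smiles.count("*")


def rank_replacements(query_smiles, query_fp, library, top_n=3):
    """Return (similarity, scaffold) tuples with a matching connection count."""
    n_points = count_connection_points(query_smiles)
    candidates = []
    for smiles, fp in library.items():
        if smiles == query_smiles:
            continue  # do not propose the query scaffold itself
        if count_connection_points(smiles) != n_points:
            continue  # the number of connection points must match
        candidates.append((tanimoto(query_fp, fp), smiles))
    # Most similar first; ties broken alphabetically for reproducibility.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return candidates[:top_n]


# Hypothetical library: scaffold SMILES mapped to toy fingerprints.
library = {
    "*c1ccccc1": [1, 1, 0, 0],
    "*c1ccncc1": [1, 1, 1, 0],
    "*C1CCCCC1": [1, 0, 0, 1],
    "*c1ccc(*)cc1": [1, 1, 0, 0],
}
ranked = rank_replacements("*c1ccccc1", library["*c1ccccc1"], library)
```

In this toy library the disubstituted benzene is excluded despite its identical fingerprint, because it has two connection points while the query has one.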

Calculation of ADMET properties of new compounds

As listed in Table 1, 15 properties were set as constraints to optimize the compounds, including 5 topological parameters, i.e. molecular weight (MW), number of H-bond acceptors (HBA), number of H-bond donors (HBD), number of rotatable bonds (RB), and number of halogens (Halo); 2 calculated physicochemical properties, i.e. partition coefficient (AlogP)30 and topological polar surface area (TPSA);31 and 8 predicted binary ADMET properties, i.e. blood-brain barrier penetration (BBB), P-glycoprotein inhibition (P-gpi), carcinogenicity (Carc), Ames mutagenicity (Ames), acute oral toxicity (AO), hERG inhibition (hERG), CYP450 inhibitory promiscuity (CypPro), and human intestinal absorption (HIA). The details of the predictive models for these properties are described in the Supporting Information, and their predictive performance is shown in Table S2.

The predictive models were built using support vector machines (SVM) implemented via LibSVM,32 and the chemicals were represented as MACCS fingerprints calculated via Open Babel.33 The carcinogenicity model was originally built by a ternary classification method, in which chemicals were categorized as strongly carcinogenic (TD50 < 10 mg/kg/day), weakly carcinogenic (TD50 > 10 mg/kg/day), or inactive.15 For the purpose of optimization, both weakly and strongly carcinogenic compounds were labeled as positive, and only inactive compounds were labeled as negative. The acute oral toxicity model classified compounds into four categories, I, II, III, and IV, corresponding to LD50 values of ≤ 50, > 50, > 500, and > 5000 mg/kg, respectively. We labeled I/II as positive and III/IV as negative.
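
The relabeling of the multi-class endpoints into binary constraints can be written out as a small sketch. The function names and the class strings "strong", "weak", and "inactive" are hypothetical; the LD50 boundaries are those given in the text, with each category read as extending up to the next threshold.

```python
def acute_oral_category(ld50_mg_per_kg):
    """Map an LD50 value (mg/kg) to category I-IV using the stated thresholds."""
    if ld50_mg_per_kg <= 50:
        return "I"
    if ld50_mg_per_kg <= 500:
        return "II"
    if ld50_mg_per_kg <= 5000:
        return "III"
    return "IV"


def acute_oral_positive(ld50_mg_per_kg):
    """Categories I and II are labeled positive; III and IV negative."""
    return acute_oral_category(ld50_mg_per_kg) in ("I", "II")


def carcinogenicity_positive(ternary_class):
    """Collapse the ternary class: strong and weak are positive, inactive negative."""
    if ternary_class not in ("strong", "weak", "inactive"):
        raise ValueError("unknown class: %r" % (ternary_class,))
    return ternary_class in ("strong", "weak")
```

For example, a compound with an LD50 of 300 mg/kg falls into category II and is therefore treated as positive for acute oral toxicity, whereas one at 2000 mg/kg (category III) is treated as negative.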

In addition to the properties mentioned above, the quantitative estimate of drug-likeness (QED) and synthetic accessibility (SA) were also calculated for the modified molecules, facilitating the selection of candidates from a large number of recommended structures. A detailed description of these two metrics can be found in the Supporting Information.


Results and Discussion

Overview of scaffolds

The numbers of scaffolds obtained from ChEMBL and Enamine are shown in Figure S5. Although Enamine contains more compounds than ChEMBL, its scaffolds are not as diverse as those from ChEMBL. After removing scaffolds without any connection point, 53659 unique scaffolds were obtained. The statistics of some properties of the scaffolds are depicted in Figure S6, including the numbers of rings, heterocycles, aromatic rings, HBA, HBD, connection points, and halogens. Most of the scaffolds contained 1-3 rings, with 2-ring scaffolds being the most frequent in the library. The ubiquity of heterocycles is not unexpected, since their chemical space is larger than that of homocycles. The distribution of HBA numbers indicates that acceptors are commonly located in the rings (scaffolds) of molecules, and most scaffolds have at least one HBA atom such as N or O. By comparison, the number of HBD is smaller. The distribution of halogen numbers showed that most of the scaffolds in the database have no halogen, and scaffolds with more than 2 halogens are rare, which explains why we set the default range of halogens to 0~2 (Table 1).

Figure S7A shows the most frequent scaffolds extracted from ChEMBL and Enamine. Note that a scaffold occurring twice or more in a compound was counted only once. Among the mono-ring scaffolds, benzene is the most common in the database, accounting for five of the ten most frequent scaffolds. N,N-substituted piperazine and N-substituted piperidine are the most frequent heterocyclic scaffolds. These two scaffolds can replace an N,N-dimethyl group to enhance the bioactivity of drugs such as chlorpromazine. As shown in Figure S7B, all of the ten most frequent bicyclic scaffolds have at least one benzene ring. The high frequency of these scaffolds may be related to their contribution to “drug-likeness” and ADMET properties. However, we did not use frequency as a factor for prioritization in scaffold hopping, since it may be affected by the popularity of some specific targets or structural cores.

Statistics of ADMET properties

To understand the tendencies of the ADMET models implemented in this web server, we predicted the ADMET properties for all the compounds in ChEMBL, including P-gpi, hERG, HIA, AO, Ames, Carc, BBB, and CypPro. The distribution of the eight endpoints across ChEMBL is shown in Figure S8. HIA and BBB are important for drug absorption and the discovery of oral drugs. The statistics showed that most of the chemicals in ChEMBL were predicted as positive for these two endpoints, especially HIA, for which the positives (HIA+) covered 95% of the dataset. The other six endpoints are relevant to toxicity or side effects. Most of the chemicals were predicted as negative for acute oral toxicity, Ames mutagenicity, and carcinogenicity, while for the other three endpoints only about half of the chemicals were predicted as negative. Interestingly, a large number of chemicals in ChEMBL (~50%) were predicted as positive for hERG, which may result from the inaccuracy of the model (SP=0.786) and the threshold of IC50 ≤ 50 μM. This indicates that using this constraint may narrow the chemical space in drug discovery. Nevertheless, we cannot neglect the fact that hERG inhibition is one of the major causes of drug failure. Similarly, P-gp inhibition and CYP450 inhibitory promiscuity are strongly associated with drug-drug interactions and adverse drug reactions (ADR) that may lead to drug failure.34, 35 The histogram in Figure S8 showed that these three endpoints may filter out many compounds, so we suggest removing these constraints in early drug discovery to explore a larger chemical space, and then activating them later to avoid the potential risk of ADR.

Interface and functionality

The query compound can be entered by the user as SMILES or drawn in the built-in structure editor. The physicochemical and topological properties of the query compound are then displayed, including MW, ALogP, logS, RB, HBA, and HBD, together with the replaceable scaffolds of the query compound, one of which can be selected by the user as the query scaffold to be optimized (Figure 1).

During optimization, users can select the properties to be considered and set the constraints. If a constraint is not set, the default range is used, as listed in Table 1. Figure 1B shows the interface for property selection. Four “quick start” buttons allow constraints to be added quickly: rule-of-five, rule-of-four, rule-of-three, and non-toxic. Users can then edit the constraints freely, including adding or removing a property and setting the range or target of a property. At least one constraint is required, and we suggest that users constrain the number of halogens, because otherwise scaffolds that differ only slightly in halogens (e.g. adding one or two halogens, or replacing chlorine with bromine or iodine) will always be output since their scaffold fingerprints are similar.
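
Constraint-based filtering of the generated molecules can be sketched as below. The preset encodes the commonly cited rule-of-five limits (MW ≤ 500, logP ≤ 5, HBD ≤ 5, HBA ≤ 10); how the server's “quick start” buttons are actually encoded is not stated, so the dictionary layout, the function name `satisfies`, and the candidate values are all assumptions for illustration. The halogen range 0~2 is the default mentioned earlier in the text.

```python
# Assumed encoding of a "rule-of-five" preset as inclusive (low, high) ranges.
RULE_OF_FIVE = {
    "MW": (0, 500),
    "AlogP": (float("-inf"), 5),
    "HBD": (0, 5),
    "HBA": (0, 10),
}


def satisfies(properties, ranges, targets=None):
    """Check numeric range constraints and binary target constraints."""
    for name, (low, high) in ranges.items():
        if not low <= properties[name] <= high:
            return False
    for name, wanted in (targets or {}).items():
        if properties[name] != wanted:
            return False
    return True


# Hypothetical candidate molecules with precomputed properties.
candidates = {
    "cand_a": {"MW": 371.5, "AlogP": 4.2, "HBD": 1, "HBA": 3, "Halo": 0, "Ames": False},
    "cand_b": {"MW": 612.3, "AlogP": 4.9, "HBD": 2, "HBA": 6, "Halo": 1, "Ames": False},
    "cand_c": {"MW": 450.0, "AlogP": 4.8, "HBD": 0, "HBA": 2, "Halo": 3, "Ames": False},
    "cand_d": {"MW": 402.1, "AlogP": 3.1, "HBD": 1, "HBA": 4, "Halo": 1, "Ames": True},
}

ranges = dict(RULE_OF_FIVE, Halo=(0, 2))  # add the default halogen range
targets = {"Ames": False}                 # require a negative Ames prediction

kept = [name for name, props in candidates.items()
        if satisfies(props, ranges, targets)]
```

Here only the first candidate passes: the second exceeds the molecular weight limit, the third has too many halogens, and the fourth is predicted Ames-positive.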

The resultant molecules that fulfill the constraints are displayed in a grid after several seconds. The QED and SA of each molecule are shown under the structure to help the user select among the optimized molecules (Figure 1C). After viewing the details of an optimized molecule, the user can continue to modify it or follow a link to predict more ADMET properties of the molecule in admetSAR.

Case study: network analysis for ERα binding compounds

To exemplify the application of ADMETopt, we analyzed the scaffold similarity network of compounds with bioactivity on ERα. Some ERα modulators already exist, such as tamoxifen and diethylstilbestrol, but their serious adverse effects and drug resistance make it urgent to discover new selective ERα modulators.

The compounds and the bioactivity data were collected from ChEMBL (target id: CHEMBL206). Only compounds with at least one record showing an IC50 or Ki of less than 10 µM were considered. In the network, nodes represent compounds and edges represent scaffold hopping. Isolated compounds, which cannot be hopped to another compound with a bioactivity record in ChEMBL, are not shown in the network. Figure S9 displays the complete scaffold hopping network, which consists of 1360 nodes and 7821 edges. The width of an edge indicates the similarity between the two scaffolds of the compounds. We then predicted the toxicity of each compound in the network. Compounds with at least one positive endpoint among Ames mutagenicity, carcinogenicity, acute oral toxicity, and hERG inhibition were considered toxic and labeled with a red border. The other, non-toxic compounds were labeled with a green border. We found that compounds in the same cluster generally have similar predicted toxicity. This is not surprising, since scaffold hopping is based on similar scaffolds and the compounds in each pair are very similar. Figure S9 also shows the details of two typical clusters. Molecules in the left cluster are tamoxifen derivatives and are mostly predicted as non-toxic, while those in the right cluster are mostly predicted as toxic. In the left cluster, there are three factions corresponding to the three scaffolds of the center molecule that can be hopped. Within each faction, any molecule can be hopped to any other molecule. Supposing that the center molecule is the query compound submitted by a user, all the surrounding molecules can be reached via scaffold hopping, and the red nodes will not be displayed if the user sets the “non-toxic” constraint. For the right cluster, however, the “non-toxic” constraint may not be a good filter for optimizing the molecule, because the search may fall into a “black hole” and fail to find any suitable molecule. For example, if molecule (1) in Figure S9 is the query compound, all the neighbors of the query compound are predicted as toxic, although there are four green nodes in this subnetwork. Therefore, to reach the final goal of safe compounds, one should set proper constraints during scaffold hopping, considering both the query structure and the applicability of the prediction models.
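
The labeling rule and the “black hole” situation described above can be illustrated on a toy network. The node names, edges, and predictions below are invented for this sketch and do not correspond to compounds in Figure S9; the function names are likewise hypothetical.

```python
TOXICITY_ENDPOINTS = ("Ames", "Carc", "AO", "hERG")


def is_toxic(predictions):
    """A compound is labeled toxic if any of the four endpoints is positive."""
    return any(predictions[endpoint] for endpoint in TOXICITY_ENDPOINTS)


def non_toxic_neighbors(query, edges, predictions):
    """Compounds reachable from the query by one hop and predicted non-toxic."""
    neighbors = set()
    for a, b in edges:  # edges are undirected scaffold-hopping links
        if a == query:
            neighbors.add(b)
        elif b == query:
            neighbors.add(a)
    return sorted(n for n in neighbors if not is_toxic(predictions[n]))


def negative():
    """Predictions with every toxicity endpoint negative."""
    return {endpoint: False for endpoint in TOXICITY_ENDPOINTS}


# Toy subnetwork: q is the query, g is a non-toxic node two hops away.
edges = [("q", "n1"), ("q", "n2"), ("n1", "g")]
predictions = {
    "q": dict(negative(), hERG=True),
    "n1": dict(negative(), Ames=True),
    "n2": dict(negative(), Carc=True, hERG=True),
    "g": negative(),
}

reachable = non_toxic_neighbors("q", edges, predictions)
```

With a “non-toxic” constraint, the query in this toy network has no admissible direct replacement even though a non-toxic node exists elsewhere in the subnetwork, which is the situation the text calls a black hole.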

Conclusions

We described a web server that uses scaffold hopping and ADMET prediction to optimize molecules in drug discovery. In this web server, over 50 thousand unique scaffolds, prioritized by their similarity to the query scaffold, are used to modify a molecule. Up to 15 ADMET properties can be set as constraints to filter out unwanted molecules, including common physicochemical and topological properties that can be calculated by cheminformatics toolkits, and more complex ones that can be predicted by computational models. Researchers can iteratively modify the scaffolds of the query molecule until a satisfactory one is obtained.

The robustness of the predictive models is very important for the practicability of this web server. Inaccurate predictions may mislead researchers and increase the risk of failure. Currently, the number of ADMET predictive models in this web server is still inadequate, so the constraints that users can set are limited and cannot filter out many undesirable scaffolds. In addition, the applicability domains of the predictive models are not fully implemented, which may cause inaccuracy or bias in the predictions. In the future, we will continue to improve the models and consider more methods that can help optimize the ADMET properties of a molecule. For example, the presence of structural alerts may increase the risk of toxicity, and inhibition of several CYP450 enzymes may result in drug-drug interactions.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (Grant 2016YFA0502304) and the National Natural Science Foundation of China (Grant 81872800).


Associated Content

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website.

Detailed descriptions of the architecture and implementation of the web server (Figures S1-S3), and statistics and examples of the scaffolds (Tables S1-S2, Figures S4-S9).


References

1. Giri, S.; Bader, A., A low-cost, high-quality new drug discovery process using patient-derived induced pluripotent stem cells. Drug Discov. Today 2015, 20, 37-49.
2. Auld, D. S.; Jimenez, M.; Yue, K.; Busby, S.; Chen, Y. C.; Bowes, S.; Wendel, G.; Smith, T.; Zhang, J. H., Matrix-Based Activity Pattern Classification as a Novel Method for the Characterization of Enzyme Inhibitors Derived from High-Throughput Screening. J. Biomol. Screen. 2016, 21, 1075-1089.
3. Hao, G. F.; Jiang, W.; Ye, Y. N.; Wu, F. X.; Zhu, X. L.; Guo, F. B.; Yang, G. F., ACFIS: a web server for fragment-based drug discovery. Nucleic Acids Res. 2016, 44, W550-556.
4. Muralidharan, A. R.; Selvaraj, C.; Singh, S. K.; Sheu, J. R.; Thomas, P. A.; Geraldine, P., Structure-Based Virtual Screening and Biological Evaluation of a Calpain Inhibitor for Prevention of Selenite-Induced Cataractogenesis in an in Vitro System. J. Chem. Inf. Model. 2015, 55, 1686-1697.
5. Cruz-Monteagudo, M.; Schurer, S.; Tejera, E.; Perez-Castillo, Y.; Medina-Franco, J. L.; Sanchez-Rodriguez, A.; Borges, F., Systemic QSAR and phenotypic virtual screening: chasing butterflies in drug discovery. Drug Discov. Today 2017, 22, 994-1007.
6. Irwin, J. J.; Sterling, T.; Mysinger, M. M.; Bolstad, E. S.; Coleman, R. G., ZINC: a free tool to discover chemistry for biology. J. Chem. Inf. Model. 2012, 52, 1757-1768.
7. Hay, M.; Thomas, D. W.; Craighead, J. L.; Economides, C.; Rosenthal, J., Clinical development success rates for investigational drugs. Nat. Biotechnol. 2014, 32, 40-51.
8. Waring, M. J.; Arrowsmith, J.; Leach, A. R.; Leeson, P. D.; Mandrell, S.; Owen, R. M.; Pairaudeau, G.; Pennie, W. D.; Pickett, S. D.; Wang, J.; Wallace, O.; Weir, A., An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nat. Rev. Drug Discov. 2015, 14, 475-486.
9. Lei, T.; Li, Y.; Song, Y.; Li, D.; Sun, H.; Hou, T., ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling. J. Cheminform. 2016, 8, 6.
10. Xu, Y.; Dai, Z.; Chen, F.; Gao, S.; Pei, J.; Lai, L., Deep Learning for Drug-Induced Liver Injury. J. Chem. Inf. Model. 2015, 55, 2085-2093.
11. Yang, H.; Sun, L.; Li, W.; Liu, G.; Tang, Y., In Silico Prediction of Chemical Toxicity for Drug Design Using Machine Learning Methods and Structural Alerts. Front. Chem. 2018, 6, 30.
12. Cheng, F.; Li, W.; Zhou, Y.; Shen, J.; Wu, Z.; Liu, G.; Lee, P. W.; Tang, Y., admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties. J. Chem. Inf. Model. 2012, 52, 3099-3105.
13. Yang, H. B.; Li, X.; Cai, Y. C.; Wang, Q.; Li, W. H.; Liu, G. X.; Tang, Y., In silico prediction of chemical subcellular localization via multi-classification methods. MedChemComm 2017, 8, 1225-1234.


Page 17 of 22 Journal of Chemical Information and Modeling

1
2
3
14. Li, X.; Chen, L.; Cheng, F.; Wu, Z.; Bian, H.; Xu, C.; Li, W.; Liu, G.; Shen, X.; Tang, Y., In silico prediction of chemical acute oral toxicity using multi-classification methods. J. Chem. Inf. Model. 2014, 54, 1061-1069.
15. Li, X.; Du, Z.; Wang, J.; Wu, Z.; Li, W.; Liu, G.; Shen, X.; Tang, Y., In Silico Estimation of Chemical Carcinogenicity with Binary and Ternary Classification Methods. Mol. Inform. 2015, 34, 228-235.
16. Du, H.; Cai, Y.; Yang, H.; Zhang, H.; Xue, Y.; Liu, G.; Tang, Y.; Li, W., In Silico Prediction of Chemicals Binding to Aromatase with Machine Learning Methods. Chem. Res. Toxicol. 2017, 30, 1209-1218.
17. Yang, H.; Lou, C.; Sun, L.; Li, J.; Cai, Y.; Wang, Z.; Li, W.; Liu, G.; Tang, Y., admetSAR 2.0: web-service for prediction and optimization of chemical ADMET properties. Bioinformatics 2018. doi: 10.1093/bioinformatics/bty707.
18. Daina, A.; Michielin, O.; Zoete, V., SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci. Rep. 2017, 7, 42717.
19. Gilad, Y.; Nadassy, K.; Senderowitz, H., A reliable computational workflow for the selection of optimal screening libraries. J. Cheminform. 2015, 7, 61.
20. Tian, S.; Li, Y.; Wang, J.; Xu, X.; Xu, L.; Wang, X.; Chen, L.; Hou, T., Drug-likeness analysis of traditional Chinese medicines: 2. Characterization of scaffold architectures for drug-like compounds, non-drug-like compounds, and natural compounds from traditional Chinese medicines. J. Cheminform. 2013, 5, 5.
21. Hu, X.; Hu, Y.; Vogt, M.; Stumpfe, D.; Bajorath, J., MMP-Cliffs: systematic identification of activity cliffs on the basis of matched molecular pairs. J. Chem. Inf. Model. 2012, 52, 1138-1145.
22. Rabal, O.; Amr, F. I.; Oyarzabal, J., Novel Scaffold FingerPrint (SFP): applications in scaffold hopping and scaffold-based selection of diverse compounds. J. Chem. Inf. Model. 2015, 55, 1-18.
23. Gaulton, A.; Hersey, A.; Nowotka, M.; Bento, A. P.; Chambers, J.; Mendez, D.; Mutowo, P.; Atkinson, F.; Bellis, L. J.; Cibrian-Uhalte, E.; Davies, M.; Dedman, N.; Karlsson, A.; Magarinos, M. P.; Overington, J. P.; Papadatos, G.; Smit, I.; Leach, A. R., The ChEMBL database in 2017. Nucleic Acids Res. 2017, 45, D945-D954.
24. Landrum, G. RDKit. http://www.rdkit.org (2.8.18).
25. Lewell, X. Q.; Jones, A. C.; Bruce, C. L.; Harper, G.; Jones, M. M.; McLay, I. M.; Bradshaw, J., Drug rings database with web interface. A tool for identifying alternative chemical rings in lead discovery programs. J. Med. Chem. 2003, 46, 3257-3274.
26. Ertl, P., Database of bioactive ring systems with calculated properties and its use in bioisosteric design and scaffold hopping. Bioorg. Med. Chem. 2012, 20, 5436-5442.
27. Ertl, P., Intuitive ordering of scaffolds and scaffold similarity searching using scaffold keys. J. Chem. Inf. Model. 2014, 54, 1617-1622.
28. Schneider, G.; Neidhart, W.; Giller, T.; Schmid, G., "Scaffold-Hopping" by Topological Pharmacophore Search: A Contribution to Virtual Screening. Angew. Chem. Int. Ed. Engl. 1999, 38, 2894-2896.
29. Lovering, F.; Bikker, J.; Humblet, C., Escape from flatland: increasing saturation as an approach to improving clinical success. J. Med. Chem. 2009, 52, 6752-6756.
30. Wildman, S. A.; Crippen, G. M., Prediction of Physicochemical Parameters by Atomic Contributions. J. Chem. Inf. Comput. Sci. 1999, 39, 868-873.
31. Ertl, P.; Rohde, B.; Selzer, P., Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties. J. Med. Chem. 2000, 43, 3714-3717.
32. Chang, C. C.; Lin, C. J., LIBSVM: A Library for Support Vector Machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1-27.
33. O'Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R., Open Babel: An open chemical toolbox. J. Cheminform. 2011, 3, 33.
34. Cheng, F.; Yu, Y.; Zhou, Y.; Shen, Z.; Xiao, W.; Liu, G.; Li, W.; Lee, P. W.; Tang, Y., Insights into molecular basis of cytochrome P450 inhibitory promiscuity of compounds. J. Chem. Inf. Model. 2011, 51, 2482-2495.
35. Broccatelli, F.; Carosati, E.; Neri, A.; Frosini, M.; Goracci, L.; Oprea, T. I.; Cruciani, G., A novel approach for predicting P-glycoprotein (ABCB1) inhibition using molecular interaction fields. J. Med. Chem. 2011, 54, 1740-1751.
Table 1. Available ADMET properties

Property  Description                                          Model/Source  Default range
MW        Molecular weight                                     RDKit         50~500
AlogP     Logarithmic lipo-hydro partition coefficient         RDKit         -2~10
TPSA      Topological polar surface area                       RDKit         20~130
HBA       Number of H-bond acceptors                           RDKit         0~10
HBD       Number of H-bond donors                              RDKit         0~5
RB        Rotatable bonds                                      RDKit         0~5
Halo      Number of halogens                                   RDKit         0~2
BBB       Blood brain barrier                                  admetSAR      Positive
P-gpi     P-glycoprotein inhibitor                             admetSAR      Positive
Carc      Carcinogens                                          admetSAR      Negative
Ames      Ames toxicity                                        admetSAR      Negative
AO        Acute oral toxicity                                  admetSAR      Negative
hERG      Human ether-a-go-go-related gene inhibition          admetSAR      Negative
HIA       Human intestinal absorption                          admetSAR      Positive
CypPro    CYP450 inhibitory promiscuity                        admetSAR      Negative
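
As an illustration only, the default constraints in Table 1 can be written down as a small lookup and used to flag which properties of a candidate fall outside them. The sketch below is not ADMETopt's implementation: the function name `violations`, the treatment of the numeric ranges as inclusive bounds, and the example property values are all assumptions made here. In practice the numeric values would come from RDKit and the class labels from admetSAR; they are typed in by hand so that the example has no dependencies.

```python
from typing import Dict, List, Tuple, Union

# Default numeric ranges transcribed from Table 1.
# Treating both bounds as inclusive is an assumption of this sketch.
NUMERIC_DEFAULTS: Dict[str, Tuple[float, float]] = {
    "MW": (50, 500),
    "AlogP": (-2, 10),
    "TPSA": (20, 130),
    "HBA": (0, 10),
    "HBD": (0, 5),
    "RB": (0, 5),
    "Halo": (0, 2),
}

# Default desired class labels transcribed from Table 1.
CLASS_DEFAULTS: Dict[str, str] = {
    "BBB": "Positive",
    "P-gpi": "Positive",
    "Carc": "Negative",
    "Ames": "Negative",
    "AO": "Negative",
    "hERG": "Negative",
    "HIA": "Positive",
    "CypPro": "Negative",
}


def violations(props: Dict[str, Union[float, str]]) -> List[str]:
    """Return the names of supplied properties that miss the Table 1 defaults.

    Properties absent from `props` are skipped rather than counted as failures.
    """
    failed: List[str] = []
    for name, (low, high) in NUMERIC_DEFAULTS.items():
        if name in props and not (low <= props[name] <= high):
            failed.append(name)
    for name, wanted in CLASS_DEFAULTS.items():
        if name in props and props[name] != wanted:
            failed.append(name)
    return failed


# Hypothetical candidate: the values are made up for illustration.
candidate = {
    "MW": 342.4, "AlogP": 3.1, "TPSA": 75.0,
    "HBA": 5, "HBD": 1, "RB": 7, "Halo": 1,
    "Ames": "Negative", "hERG": "Positive",
}
print(violations(candidate))  # ['RB', 'hERG']
```

Here the candidate fails on rotatable bonds (7 is above the default maximum of 5) and on the hERG label, which is the kind of result that would point a user toward changing the scaffold or relaxing a constraint.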
Figure 1. The interface of ADMETopt. (A) Input of query compound and selection of query scaffold. (B) Setting of constraints. (C) Result of the recommended molecules hopped from the query scaffold. (D) The details of the scaffold hopping with the link to continue optimization or ADMET prediction.
TOC