Perspective
How Beyond Rule of 5 Drugs and Clinical Candidates Bind to Their Targets
Bradley Croy Doak, Jie Zheng, Doreen Dobritzsch, and Jan Kihlberg
J. Med. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jmedchem.5b01286 • Publication Date (Web): 12 Oct 2015
Downloaded from http://pubs.acs.org on October 14, 2015

Just Accepted

“Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted
online prior to technical editing, formatting for publication and author proofing. The American Chemical
Society provides “Just Accepted” as a free service to the research community to expedite the
dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts
appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been
fully peer reviewed, but should not be considered the official version of record. They are accessible to all
readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered
to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published
in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just
Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor
changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers
and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors
or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Medicinal Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036
Published by American Chemical Society. Copyright © American Chemical Society.
However, no copyright claim is made to original U.S. Government works, or works
produced by employees of any Commonwealth realm Crown government in the course
of their duties.

How Beyond Rule of 5 Drugs and Clinical Candidates Bind to Their Targets

Bradley C. Doak, Jie Zheng, Doreen Dobritzsch and Jan Kihlberg*

Department of Chemistry - BMC, Uppsala University, Box 576, SE-751 23 Uppsala, Sweden

ABSTRACT

To improve discovery of drugs for difficult targets, the opportunities of chemical space beyond the rule of 5 (bRo5) were examined by retrospective analysis of a comprehensive set of structures for complexes between drugs and clinical candidates and their targets. The analysis illustrates the potential of compounds far beyond rule of 5 space to modulate novel and difficult target classes that have large, flat and groove-shaped binding sites. However, ligand efficiencies are significantly reduced for flat- and groove-shaped binding sites, suggesting that adjustments of how to use such metrics are required. bRo5 ligands appear to benefit from an appropriate balance between rigidity and flexibility to bind with sufficient affinity to their targets, with macrocycles and non-macrocycles being found to have similar flexibility. However, macrocycles were more disc- and sphere-like, which may contribute to their superior binding to flat sites, while rigidification of non-macrocycles leads to rod-like ligands that bind well to groove-shaped binding sites. These insights should contribute to altering perceptions of what targets are considered "druggable" and provide support for drug design in beyond rule of 5 space.

1. INTRODUCTION

Drug discovery is at a crossroads where ground-breaking advances in our understanding of how diseases develop are now made at an unprecedented pace. However, the efficiency of drug discovery has continued to decline as the number of new drugs approved each year has essentially been constant during the last 30 years, while the costs of pharmaceutical development have increased dramatically.1-3 This decline has been attributed to a few fundamental issues, including a need to deliver first-in-class treatments for complex diseases, while at the same time meeting increased demands for safety and efficacy.4 As a result there is high attrition in phase II and III clinical trials, mainly due to lack of efficacy and safety issues.2, 5, 6 Therefore, it has been emphasised that improved selection of targets that are associated with diseases is the single most important factor required to increase efficacy and deliver innovative medicines.2

During the last two decades the human genome and various other genomes have been mapped,7 and significant progress has been made towards mapping the human proteome.8, 9 These rapid advances have made an increased number of potential drug targets accessible that belong to both established and novel target classes. Despite the advances in target identification, less than a quarter of recently approved drugs are directed against novel targets, and the majority of these drugs target established classes of G-protein coupled receptors (GPCRs), transporters or enzymes.1, 10 A limiting factor may be that approximately 3,000 of the genes in the human genome have been estimated to be related to disease. Out of these only 600-1500 have been considered amenable for manipulation with "traditional" small molecule drugs,11 i.e. drugs that comply with the rule of 5 (Ro5) guidelines and are highly likely to be cell permeable and orally bioavailable. Still, it has been pointed out that large portions of well-established target classes, such as ion channels, GPCRs and nuclear receptors, remain unexplored.10 However, an even larger number of targets from less explored and novel classes which are "difficult-to-drug" using Ro5 compliant compounds could provide significant, additional opportunities for drug discovery. For example, the human proteome8, 9 is estimated to have 100,000 to 1,000,000 binary protein-protein interactions (PPIs),12, 13 which may constitute one of the most important sources of novel targets for drug discovery. However, the proportions of the proteome and its massive number of PPIs that are involved in pathogenic mechanisms remain to be established. Even with that caveat, the recent and rapid developments in target identification urgently need to be matched by innovative approaches for modulating non-traditional target classes, such as PPIs.14, 15

Targets currently classified as "difficult-to-drug" with Ro5 compliant ligands characteristically have binding sites that are large, highly lipophilic or highly polar, flexible, flat or featureless (i.e. contain few opportunities for molecular interactions such as hydrogen bond donors and acceptors).16-19 In addition, the perceived lack of oral bioavailability outside of Ro5 space has led many to abandon these targets and classify them as "undruggable". Thus, what initially appears as vast opportunities of novel targets emerging from advances in genomics and proteomics will, to a large extent, require small molecule drug discovery to move outside of Ro5 space into what has been termed beyond Ro5 (bRo5) or "middle space".20 Interestingly, recent analysis of drugs and clinical candidates that fall outside of Ro5 space has shown that this space offers significant possibilities for discovery of orally bioavailable and cell permeable compounds, possibly more than previously thought.21 It can therefore be argued that too strict an implementation of the Ro5 may have hindered the pharmaceutical industry from seizing opportunities involving novel but more difficult targets.21-25

We and others have hypothesised the benefits of using bRo5 drugs for difficult targets,14, 15, 21, 26, 27 and examples and case studies have been reported in the literature. Here we present a comprehensive analysis of bRo5 drugs and clinical candidates that highlights their ability to modulate difficult targets, thereby expanding the number of targets for which we can design oral and parenteral drugs. First, we assessed what target classes current drugs and clinical candidates outside Ro5 space are directed towards in comparison to Ro5 compliant drugs. Analysis then focused on how drugs and clinical candidates outside Ro5 space bind to their targets based on crystal structures of 130 clinically relevant complexes, which were compared to drug-target complexes in Ro5 space. This allowed us to define to what extent binding site and ligand characteristics such as size, shape, molecular interactions, affinity and ligand efficiencies differ between different drug spaces. The influence of conformational flexibility of the ligand and its shape was also investigated for compounds in beyond Ro5 space. The results are then discussed to provide guidance for design of bioactive small molecule drugs outside of Ro5 space for difficult targets.

2. THE DRUGS AND CLINICAL CANDIDATES DATASETS

To facilitate this in-depth analysis of how drugs and clinical candidates that do not comply with the Ro5 bind to their targets, a comprehensive dataset of 475 drugs and clinical candidates with MW >500 Da was classified by the compounds' calculated physicochemical properties. They were then divided into two datasets where intuitive and natural divisions in the ligand property distributions appeared, as previously reported,21 each representing different chemical spaces (Figure 1a). Two datasets of Ro5 compliant drugs were also compiled from ChEMBL28 and the recent literature10 for comparison during analysis (Figures 1, 2 and 3). In this analysis compounds in rule of 5 space adhere to all of Lipinski's guidelines, whereas compounds that break one Ro5 guideline (MW 500-700 Da) and also have other properties which may extend a short distance outside strict Ro5 space were classified as being in extended Ro5 space (eRo5).21 Finally, compounds in beyond Ro5 space (bRo5) all have MW >500 Da and in addition have one or more properties outside the eRo5 ranges. They are thus far beyond Ro5 space, but with an upper MW limit of 3000 Da set to exclude biologics such as insulin. The classification into eRo5 and bRo5 space is useful to completely separate compounds in Ro5 space from those that reside far away in bRo5 space and do not conform to its trends. Thus, eRo5 space may be thought of as a buffer zone between Ro5 and bRo5 space, representing the natural tail of the distribution of Ro5 drugs, in line with the original report of Lipinski,29 and the beginning of bRo5 space. The rationale for the eRo5 and bRo5 classification is further highlighted by the two datasets having mean quantitative estimates of drug-likeness (QED) scores of 0.31 and 0.16, respectively, providing a single measure of the distance from traditional rule of 5 space. Both of these QED scores are significantly below 0.67 and 0.49, which are the mean values identified by medicinal chemists for compounds being "attractive" and "unattractive" for drug development, respectively.21, 30 Our dataset of drugs and clinical candidates was obtained by searching different databases for compounds with MW ranging from 500 to 3000 Da followed by filtering to remove contrast agents, veterinary products etc.21 Therefore some drugs and clinical candidates outside of our strict definition of Ro5 space, i.e. those with MW <500 Da and one of HBD >5, HBA >10 or ClogP >5 or <0, have not been included in the analysis. We also highlight that some calculated properties are highly correlated to each other (e.g. HBA and PSA, rs = 0.89-0.94), as illustrated in the correlation tables and principal component analysis, which can be found in the supplementary information (Supporting Information Figure S1-2). It is nonetheless useful to base the classification on these seemingly redundant properties to aid filtering and analysis in different situations, ranging from computer assisted to practical, "back of the envelope" calculations. The three datasets were then analysed and compared extensively across different ligand-target interaction properties. Differences are described as being "significant" where statistically significant different means were found (unpaired t test, with a p-value <0.05); full details and p-values can be found in the supplementary information and are also denoted in the figures.

Figure 1a (classification scheme):

Original dataset: comprehensive list of 475 drugs and clinical candidates, 500-3000 Da

Extended Ro5 (eRo5), all of: MW 500-700 Da, ClogP 0-7.5, HBD ≤5, HBA ≤10, PSA ≤200 Å², NRotB ≤20. N = 195, 71% oral.

Beyond Ro5 (bRo5), MW >500 Da and at least one of: MW 700-3000 Da, ClogP <0 or >7.5, HBD >5, HBA >10, PSA >200 Å², NRotB >20. N = 280, 30% oral.

[Figure 1b: chart not reproduced in this text version]

Figure 1. (a) Classification of 475 drugs and clinical candidates that have MW >500 Da into extended rule of 5 (blue) and beyond rule of 5 (green) chemical space based on calculated physicochemical properties. (b) Development pipeline by chemical space and chemical class showing peptides/peptidomimetics (green), natural products and derivatives (blue) and de novo designed drugs and clinical candidates (red) by phase. Orals are in dark and parenterals in light colors, respectively.
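
The classification criteria summarised in Figure 1a can be expressed as a short function. The following Python sketch is illustrative only and is not the authors' code: the `Props` record, the `classify` function, the label strings and the handling of boundary values (for example, treating a MW of exactly 700 Da as eRo5) are assumptions, and the example property values are made up.

```python
from dataclasses import dataclass


@dataclass
class Props:
    """Calculated physicochemical properties of one compound (hypothetical record)."""
    mw: float     # molecular weight, Da
    clogp: float  # calculated logP
    hbd: int      # hydrogen bond donors
    hba: int      # hydrogen bond acceptors
    psa: float    # polar surface area, Å^2
    nrotb: int    # number of rotatable bonds


def classify(p: Props) -> str:
    """Assign a compound to Ro5, eRo5 or bRo5 space using the ranges in Figure 1a."""
    if p.mw > 3000:
        return "excluded"  # above the upper MW limit set to exclude biologics
    if p.mw <= 500:
        ro5 = 0 <= p.clogp <= 5 and p.hbd <= 5 and p.hba <= 10
        # Compounds with MW <=500 Da that break another guideline were not analysed
        return "Ro5" if ro5 else "not analysed"
    in_ero5 = (
        p.mw <= 700  # assumption: exactly 700 Da counts as eRo5
        and 0 <= p.clogp <= 7.5
        and p.hbd <= 5
        and p.hba <= 10
        and p.psa <= 200
        and p.nrotb <= 20
    )
    # MW >500 Da with any property outside the eRo5 ranges is bRo5
    return "eRo5" if in_ero5 else "bRo5"


# Made-up property values, for illustration only
print(classify(Props(mw=650, clogp=6.0, hbd=3, hba=9, psa=150, nrotb=12)))    # eRo5
print(classify(Props(mw=1200, clogp=3.0, hbd=5, hba=12, psa=280, nrotb=15)))  # bRo5
```

Because several of the properties are highly correlated, a compound that exceeds one eRo5 range will often exceed others as well; the function only needs a single violation to assign bRo5.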

The 475 drugs and clinical candidates in eRo5 and bRo5 space that make up the current dataset were previously curated and used to investigate oral bioavailability in eRo5 and bRo5 space.21 The dataset was also classified with regard to chemical class, route of administration and phase of development.21 This allows discussion of trends in drug development and demonstrates that de novo designed compounds form the largest class (43%), with equal numbers of natural products and peptides/peptidomimetics (26% each) across the full dataset.21 The majority of de novo designed compounds are oral (64%), whereas natural products and in particular peptides/peptidomimetics are mainly parenteral (59 and 80%, respectively). Analysing the dataset by phase of development, chemical space and chemical class demonstrates that de novo designed compounds dominate strongly in all clinical phases in eRo5 space, and that the majority of them are intended for oral administration (Figure 1b). In bRo5 space peptides constitute the largest group across clinical candidates, although the proportions of de novo designed compounds and natural products are only somewhat lower. Among phase II, phase III and approved compounds, bRo5 natural products and peptides are mainly for parenteral administration, whereas the proportion of orals is >45% for de novo designed compounds. In summary, the drug discovery industry is focusing on development of de novo designed drug candidates for oral administration in eRo5 space, while relying on all three chemical classes and more on parenteral delivery in bRo5 space. It is noteworthy that a significant number of compounds (161 in total) from all three chemical classes in bRo5 space are in clinical development, indicating a willingness to venture outside of the Ro5. This is further supported by the emergence of a number of biotech companies that focus on this chemical space,27 often in partnership with larger pharmaceutical companies. In the current analysis we aim to globally assess bRo5 ligand-target interactions to define if moving far from traditional Ro5 space is warranted in efforts to conquer difficult targets and for what target classes and types of binding sites compounds in bRo5 space provide advantages.

3. TARGET CLASS MODULATION BY CHEMICAL SPACE

To analyse the target class preferences of drugs and clinical candidates in eRo5 and bRo5 space, the dataset of 475 drugs and clinical candidates with MW >500 Da21 was first classified using a taxonomy similar to that employed to dissect trends and innovations in drug development for a large dataset of approved drugs (Figure 2a).10 A Ro5 target class reference dataset was obtained by filtering this large literature dataset of approved drugs10 by all of the Ro5 guidelines and selecting only one target per drug. Drugs in eRo5 and bRo5 space showed interesting differences in their target class preferences compared to Ro5 compliant drugs (Figure 2b). For instance, an increased proportion of eRo5 and bRo5 drugs and clinical candidates modulate protease and kinase targets, which have been increasingly explored only in the last decade.10 As for the approved Ro5 compliant kinase inhibitors in our dataset, tyrosine kinases were the largest subgroup targeted by eRo5 and bRo5 drugs and clinical candidates (42%). Kinase inhibitors currently in clinical trials originate from Ro5, as well as from eRo5 and bRo5 space, and no significant trends linking subgroups of kinases to a particular chemical space were apparent. Nevertheless it is clear that medicinal chemists are drawing from compounds outside traditional Ro5 space to target an expanding number of kinases. Structural and adhesion targets such as tubulin, as well as transferases and isomerases, are also more prevalent for eRo5 and/or bRo5 drugs and clinical candidates. In our analysis there is also a higher prevalence of bRo5 and eRo5 drugs and clinical candidates at "other" targets, a class consisting of antioxidants, vitamin and hormone replacements, orphan drugs and other unclassifiable molecular targets. Moreover, as compared to Ro5 drugs a smaller proportion of bRo5 drugs and clinical candidates bind to well-established target classes such as ion channels and nuclear receptors. Similar trends in target class preference are also observed for the oral only and approved only subsets of eRo5 and bRo5 drugs and clinical candidates (Supporting Information Figure S3). Importantly, a number of classes that are more frequently targeted by eRo5 and bRo5 drugs and clinical candidates, such as proteases, kinases and transferases, are among those recently concluded to be underexplored in drug discovery.10

As different targets have been found to have different preferred ligand chemical spaces,31, 32 we conclude that increased exploration of eRo5 and bRo5 space should be beneficial for future development of drugs for underexplored target classes. The discovery of protein kinase inhibitors, now commonly used in oncology, is an important example of how exploration of novel chemical space can expand what targets are considered "druggable".33 It should also be noted that eRo5 and bRo5 compounds appear to be among the most suitable for modulating the increasing number of protein-protein interactions (PPI) that are emerging as therapeutic targets.14, 34

Figure 2a (creation of target class preference datasets):

Ro5: Rask-Andersen et al. dataset of approved drugs, filtered by all of: MW ≤500 Da, ClogP 0-5, HBD ≤5, HBA ≤10. Primary target selected for each drug. Ro5 drug-target pairs: N = 579, approved.

eRo5: N = 195, 71% oral (59 App., 32 PIII, 67 PII, 37 PI). Target classification similar to Rask-Andersen et al., removal of unclassified compounds. eRo5 drug-target pairs: N = 185, 71% oral, 30% approved.

bRo5: N = 280, 30% oral (119 App., 37 PIII, 88 PII, 36 PI). Target classification similar to Rask-Andersen et al., removal of unclassified compounds. bRo5 drug-target pairs: N = 228, 31% oral, 38% approved.

[Figure 2b: bar chart not reproduced in this text version]

Figure 2. (a) Creation of target class preference datasets. The dataset compiled by Rask-Andersen et al.10 was filtered by all the rule of 5 (Ro5) guidelines and the primary target reported in the literature was selected for each drug to give the Ro5 dataset. The primary targets reported in the literature for the full set of extended Ro5 (eRo5) and beyond Ro5 (bRo5) drugs and clinical candidates were classified using the same taxonomy. (b) Proportion of Ro5 drugs (red), eRo5 (blue) and bRo5 (green) drugs and clinical candidates modulating the indicated target classes. The proportion of compounds in each of the three chemical space datasets that modulates a specific target class (the number of compounds that modulate that target class divided by the total number in the dataset, N), is shown by the vertical bars. Proportions were calculated to show differences in target preferences as the number of compounds differs significantly between the three datasets. Compounds that are in phase (P) I, II or III or approved (App.) are shown by increasingly darker colour shadings of the datasets. Targets are arranged by Ro5 target class preference from highest (left) to lowest (right). Alternate plots that show only approved, clinical or orally bioavailable drugs and clinical candidates in eRo5 and bRo5 space, as well as the exact number of compounds in each category, are included in Supporting Information Figure S3.
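
The proportions plotted in Figure 2b are defined in the caption as the number of compounds that modulate a target class divided by the total number in the dataset, N. A minimal Python sketch of that normalisation is given here; the function name `target_class_proportions` and the toy list of target classes are hypothetical and do not reproduce the actual datasets.

```python
from collections import Counter


def target_class_proportions(primary_targets):
    """Return {target class: count / N} for one chemical space dataset."""
    n = len(primary_targets)
    if n == 0:
        raise ValueError("dataset is empty")
    return {cls: count / n for cls, count in Counter(primary_targets).items()}


# Toy input: one primary target class per compound (made-up data)
toy_dataset = ["kinase", "kinase", "protease", "GPCR"]
print(target_class_proportions(toy_dataset))
# {'kinase': 0.5, 'protease': 0.25, 'GPCR': 0.25}
```

Normalising by N makes the three chemical space datasets comparable even though they contain different numbers of drug-target pairs.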

4. CHARACTERIZATION OF DRUG-TARGET COMPLEXES BY CHEMICAL SPACE

4.1. Generating drug-target structure datasets. To probe how drugs outside of Ro5 space bind to their targets, structural data for all available complexes of eRo5 and bRo5 drugs and clinical candidates with their targets were extracted by cross-referencing the 475 drugs and clinical candidates with the Protein Data Bank (PDB). In total, 93 drugs had crystal structures of relevant drug-target complexes that fulfilled the chosen quality requirements (20% of the dataset, Figure 3). A reference set of 37 crystal structures of relevant Ro5 drug-target complexes was also obtained after clustering a Ro5 filtered ChEMBL drugs dataset according to their physicochemical properties (Figure 3). In order to remove bias towards highly explored targets or drug classes and ensure that conclusions reflect the true variation between chemical spaces, redundant complexes between a target and other members of the same drug class were excluded. For example, the erythromycin A-ribosome complex 1JZY was used as representative of all erythronolide-ribosome complexes. This produced three non-redundant datasets of crystal structures of Ro5 drugs (N=29), eRo5 (N=26) and bRo5 (N=22) drugs and clinical candidates bound to their targets (Table 1). An extensive set of additional annotated figures for both the all-structures and the non-redundant datasets can be found in the supplementary information along with statistical analysis.

Figure 3 (workflow):

Starting datasets:
Ro5: ChEMBL drugs dataset, N = 10,460, filtered by all of: MW ≤500 Da, ClogP 0-5, HBD ≤5, HBA ≤10, then clustered by physicochemical properties.
eRo5: N = 195, 71% oral (59 App., 32 PIII, 67 PII, 37 PI).
bRo5: N = 280, 30% oral (119 App., 37 PIII, 88 PII, 36 PI).

Cross-referencing with the PDB and selection of crystal structures: resolution 0-3.8 Å, good density at binding site interface, clinically relevant drug-target complex.

All structures (results in Supporting Information):
All representative Ro5 drug-target structures: N = 37, 86% oral, all approved.
All eRo5 drug-target structures: N = 47, 72% oral (21 App., 5 PIII, 11 PII, 10 PI).
All bRo5 drug-target structures: N = 46, 65% oral (32 App., 5 PIII, 8 PII, 1 PI).

Filtering to remove redundant drug class-target binding sites, e.g. erythromycin A was selected to represent all erythronolide-ribosome complexes.

Non-redundant structures (results in paper and Supporting Information):
Non-redundant Ro5 structures: N = 29, 86% oral, all approved.
Non-redundant eRo5 structures: N = 26, 58% oral (11 App., 4 PIII, 7 PII, 4 PI).
Non-redundant bRo5 structures: N = 22, 48% oral (17 App., 1 PIII, 3 PII, 1 PI).

Figure 3. Cross-referencing the extended Ro5 and beyond Ro5 datasets and a representative set of ChEMBL28 Ro5 drugs with the Protein Data Bank (PDB) and filtering by quality constraints gave three datasets of relevant Ro5, eRo5 and bRo5 drug-target structures. To avoid bias towards highly explored drug classes, further filtering to remove redundant drug class-target structures gave unbiased, non-redundant datasets which contain only one compound per drug-target class (e.g. one erythronolide, one azole anti-infective etc.). The number of compounds approved (App.) and in Phases (P) III, II and I development are shown. Results of the analysis of the complete datasets can be found in the supporting information for comparison, while the non-redundant datasets are analysed within the paper as well as in the supporting information.
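
The redundancy filter described for Figure 3 keeps a single compound per drug class-target pair. One way such a filter could be written is sketched below in Python. This is an assumption-laden illustration, not the procedure used by the authors: the record fields, the `select_non_redundant` function, the "hypothetical" entries and the rule for picking a representative are all invented for the example; only the erythromycin A (1JZY) and argatroban (1DWC) entries come from Table 1.

```python
def select_non_redundant(complexes, representatives=()):
    """Keep one complex per (drug class, target) pair.

    The first complex seen for a pair is kept unless a later one is named
    in `representatives`, in which case it replaces the earlier choice.
    """
    chosen = {}
    for c in complexes:
        key = (c["drug_class"], c["target"])
        if key not in chosen or c["drug"] in representatives:
            chosen[key] = c
    return list(chosen.values())


complexes = [
    # Hypothetical second erythronolide, included only to show the filtering
    {"drug": "hypothetical macrolide X", "drug_class": "erythronolide",
     "target": "ribosome", "pdb": None},
    {"drug": "erythromycin A", "drug_class": "erythronolide",
     "target": "ribosome", "pdb": "1JZY"},
    # The drug class label below is a placeholder, not taken from the paper
    {"drug": "argatroban", "drug_class": "hypothetical class label",
     "target": "thrombin", "pdb": "1DWC"},
]

kept = select_non_redundant(complexes, representatives={"erythromycin A"})
print([c["drug"] for c in kept])
# ['erythromycin A', 'argatroban']
```

Keying on the pair rather than on the drug class alone means that one drug class bound to two different targets would contribute two complexes.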

Table 1. Analysed non-redundant drug-target complexes in extended and beyond Ro5 chemical space

Compound name | Macromolecule (Target) | Indication | PDB code | Oral/Parenteral(a) | Clinical/Approved

Beyond Ro5 (N=22)
Argatroban | Thrombin | Haematology | 1DWC | parenteral | App
β-Acarbose | α-Amylase | Endocrinology | 1PPI | parenteral | App
Birinapant | E3 ubiquitin-protein ligase XIAP | Oncology | 4KMP | parenteral | Phase II
Capreomycin | Ribosome | Infection | 3KNL | parenteral | App
Cyclosporine A | Cyclophilin A | Immunology | 1CWA | oral | App
Dactinomycin | DNA | Oncology | 1I3W | parenteral | App
Doxorubicin | DNA | Oncology | 1P20 | parenteral | App
Eptifibatide | Integrin alpha-IIB | Cardiovascular | 2VDN | parenteral | App
Erythromycin A | Ribosome (D. radiodurans) | Infection | 1JZY | oral | Phase III
Etoposide | DNA topoisomerase-IIb | Oncology | 3QX3 | oral | App
Eritoran | Toll-like receptor 4 | Infection | 2Z65 | parenteral | App
Itraconazole | Lanosterol 14-α demethylase | Infection | 4K0F | oral | App
Ivermectin 22,23-dihydro B1a | Glutamate-gated chloride channel | Infection | 3RHW | oral | App
Navitoclax | B cell lymphoma-2, Bcl-2 | Oncology | 4LVT | oral | Phase II
Ouabain | Na-K ATPase | Cardiovascular | 3A3Y | oral | App
Paclitaxel | Tubulin α-chain | Oncology | 1JFF | oral | Phase II
PF-03715455 | Mitogen-activated protein kinase | Respiratory | 2YIS | parenteral | Phase I
Quinupristin | Ribosome (H. marismortui) | Infection | 1YJW | parenteral | App
Rapamycin | FK506 binding protein | Immunology | 4DRI | oral | App
Rifampicin | DNA-directed RNA polymerase | Infection | 4KMU | oral | App
Simeprevir | Hepatitis C virus NS3/4A protease | Infection | 3KEE | oral | App
Thiostrepton | Ribosome (D. radiodurans) | Infection | 3CF5 | parenteral | App

Extended Ro5 (N=26)
Aliskiren | Renin | Cardiovascular | 2V0Z | oral | App
AMG-131 | Peroxisome proliferator-activated receptor-γ | Cardiovascular | 3FUR | oral | Phase II
Atorvastatin | HMG-CoA reductase | Cardiovascular | 1HWK | oral | App
BGJ-398 | Fibroblast growth factor 1 | Oncology | 3TT0 | oral | Phase I
BMS-777607 | Hepatocyte growth factor receptor | Oncology | 3F82 | oral | Phase I
BMS-791325 | Hepatitis C virus NS5b subunit | Infection | 4NLD | oral | Phase II
Ceritinib | Anaplastic lymphoma kinase | Oncology | 4MKC | oral | Phase II
Cobimetinib | Mitogen-activated protein kinase kinase | Oncology | 4AN2 | oral | Phase III
Dalfopristin | Ribosome (D. radiodurans) | Infection | 1SM1 | parenteral | App
EPZ-5676 | DOT1-like histone H3 methyltransferase | Oncology | 4HRA | parenteral | Phase I
Ergotamine | Serotonin receptor 1B chimera | Pain | 4IAR | parenteral | App
Fedratinib | Bromodomain BRD4 | Oncology | 4OGJ | oral | Phase III
Homoharringtonine | Ribosome (H. marismortui) | Infection | 3G6E | parenteral | App
Intedanib | Vascular endothelial growth factor receptor 2 | Oncology | 3C7Q | parenteral | Phase III
Ispinesib | Kinesin Eg5 | Oncology | 4A5Y | parenteral | Phase II
Lapatinib | Epidermal growth factor receptor | Oncology | 1XKK | oral | App
Lonafarnib | Protein farnesyltransferase | Oncology | 1O5M | oral | Phase II
Mometasone furoate | Glucocorticoid receptor | Respiratory | 4P6W | parenteral | App
Nilotinib | Tyrosine-protein kinase ABL1 | Oncology | 3CS9 | oral | App
Pictilisib | Phosphoinositide-3 kinase | Oncology | 3DBS | oral | Phase II
Pseudomonic acid A | Isoleucyl-tRNA synthetase | Infection | 1QU2 | parenteral | App
PU-H71 | Heat shock protein-90 | Oncology | 2FWZ | parenteral | Phase I
Saquinavir | HIV-1 protease | Infection | 3OXC | oral | App
Taladegib | Smoothened homolog | Oncology | 4JKV | oral | Phase II
Tubocurarine | Soluble acetylcholine receptor | Anaesthesiology | 3PMZ | parenteral | App
Volasertib | Polo-like kinase 1 | Oncology | 3FC2 | parenteral | Phase III

(a) Route of administration used in the indicated phase of development.
4.2 Shape and size of binding sites. The binding site shape of drug-target complexes can be assessed manually or with the aid of descriptors calculated from binding sites identified by automated methods. Two such methods, the recently described Difference of Gaussian Site (DoGSite)35, 36 and MetaPocket 2.0,37 the latter of which is based on several algorithms, were used to analyse the three datasets of drug-target complexes. The threshold for successful identification of binding sites was set as >20% of the volume of the bound drug being covered by the calculated binding site. With this lenient cut-off only 54% and 43% of all bRo5 drug-target binding sites were successfully identified by DoGSite and MetaPocket, respectively (Supporting Information Figure S4a). In addition, the successfully calculated bRo5 binding sites covered a significantly lower proportion of the bound drug than Ro5 drug binding sites (mean 54% vs 94%, respectively; Supporting Information Figure S4). Hence, binding site shape was also assessed manually by visual inspection and classification as being flat, groove, tunnel, pocket or internal. These correspond to the drug interacting with its target by a single face for a flat site, two or three faces for a groove and four faces with two non-interacting opposing faces for a tunnel-shaped site. Interaction of the drug through four or five faces, leaving one non-interacting face, characterises a pocket, and for an internal binding site the drug is completely buried inside the target (Supporting Information Figure S5). For binding sites that were successfully calculated with DoGSite, descriptors such as enclosure and depth corresponded well to the shape classifications and thereby support the manual classification (Supporting Information Figure S6). However, volume and sphericity descriptors failed to accurately describe the size and shape of flat- and groove-shaped binding sites as the calculated sites poorly covered the actual drug or clinical candidate binding site. The manual classification is also supported by the increase in the mean proportion of drug surface areas (SA) that become buried upon binding to the target from flat to internal binding sites (Supporting Information Figure S8). Due to the low success rate in binding site calculation, and the inability of descriptors to characterize those sites that were successfully calculated, the manual classification of binding site shapes was used throughout the current analysis.

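The coverage threshold and the face-based shape classes described above can be written down as a small decision rule. The following is only an illustrative sketch: the classification in the text was made by visual inspection, and the six-face box model, the function names and the `open_faces_opposing` flag are assumptions introduced here, not part of the original analysis.

```python
def site_identified(coverage_fraction, threshold=0.20):
    """Return True if the calculated site covers more than `threshold`
    of the volume of the bound drug (the >20% cut-off quoted in the text)."""
    return coverage_fraction > threshold


def classify_site(n_contact_faces, open_faces_opposing=False):
    """Map the number of ligand faces in contact with the target to a shape class.

    Assumption: the ligand is idealised as a box with six faces, so a
    completely buried ligand corresponds to six contacting faces.
    """
    if not 1 <= n_contact_faces <= 6:
        raise ValueError("expected between 1 and 6 contacting faces")
    if n_contact_faces == 1:
        return "flat"
    if n_contact_faces in (2, 3):
        return "groove"
    if n_contact_faces == 4 and open_faces_opposing:
        # four faces in contact, the two free faces opposite each other
        return "tunnel"
    if n_contact_faces in (4, 5):
        return "pocket"
    return "internal"
```

For example, `classify_site(4, open_faces_opposing=True)` gives a tunnel, whereas the same number of contacting faces with adjacent free faces is treated as a pocket under this assumed model.
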
16
17
18
19
20 The distribution of binding site shapes showed striking differences between the three sets of drug-
21
22 target complexes (Figure 4a), with higher proportions of bRo5 drugs and clinical candidates binding to
23
24
25 the "difficult" open, flat and groove binding sites compared to Ro5 drugs. In contrast, Ro5 drugs
26
27 display a preference for pocket and internal binding sites, which conforms well with the view of such
28
29
sites as being highly "druggable" with Ro5 compliant compounds. Binding site shapes observed for
30
31
32 eRo5 drugs and clinical candidates are more evenly distributed between groove, tunnel, pocket and
33
34 internal sites, revealing an ability of compounds residing just outside Ro5 chemical space to target a
35
36
37 wide range of sites, with the exception of flat binding sites. It should be noted, that compounds in
38
39 eRo5 space also bind to pocket shaped and internal sites, which are then larger than those that have
40
41 Ro5 compliant ligands (Supporting Information Figure S8). Too few ligands in bRo5 space bind to
42
43
44 pockets to draw any statistically significant conclusions regarding binding site size. The shape of the
45
46 ligands, in their target bound conformations, were also assessed using normalised principle moment of
47
48 inertia (nPMI) plots, which characterize ligands by their similarity to rod, disc and sphere shapes
49
50
51 (Figure 4b). In agreement with previous analyses based on calculated 3D conformations,38 Ro5
52
53 compliant drugs were predominantly rod-like, while those in eRo5 and particularly in bRo5 space were
54
55
56
more disc- and sphere-like. Flat and groove binding sites were also found to have ligands that were
57
58
59
60 15
ACS Paragon Plus Environment
Journal of Medicinal Chemistry Page 16 of 59

1
2
3
significantly more disc- and sphere-like compared to ligands for pocket and internal binding site
4
5
6 shapes (Supporting Information Figure S9-S10).
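For readers unfamiliar with nPMI plots, the sketch below shows one common way to obtain the two plotted ratios, I1/I3 and I2/I3, from the three principal moments of inertia (I1 ≤ I2 ≤ I3) of a conformation. In this representation an ideal rod lies at (0, 1), a disc at (0.5, 0.5) and a sphere at (1, 1). The code is not the authors' implementation: the function names are hypothetical, unit atomic masses are assumed unless masses are supplied, and the eigenvalues are obtained with the standard trigonometric method for symmetric 3x3 matrices.

```python
import math


def inertia_tensor(coords, masses=None):
    """Return the 3x3 inertia tensor about the centre of mass."""
    masses = masses or [1.0] * len(coords)  # assumption: unit masses by default
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    t = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        x, y, z = (c[k] - com[k] for k in range(3))
        t[0][0] += m * (y * y + z * z)
        t[1][1] += m * (x * x + z * z)
        t[2][2] += m * (x * x + y * y)
        t[0][1] -= m * x * y
        t[0][2] -= m * x * z
        t[1][2] -= m * y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t


def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix in ascending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [smallest, 3.0 * q - largest - smallest, largest]


def npmi(coords, masses=None):
    """Return (I1/I3, I2/I3) for a set of atomic coordinates."""
    i1, i2, i3 = symmetric_eigenvalues(inertia_tensor(coords, masses))
    return i1 / i3, i2 / i3
```

Three collinear points such as `[(-1, -1, 0), (0, 0, 0), (1, 1, 0)]` fall at the rod corner, while four points on a square in a plane fall at the disc corner. The function expects at least two non-coincident atoms.
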
Figure 4. (a) Distribution of binding site shapes, (b) ligand normalised principal moment of inertia (nPMI) shape plot, (c) buried ligand surface area (SA, Å²) as a function of total ligand SA (Å²), (d) box plot of buried ligand SA (Å²) and (e) box plot of proportion of buried ligand SA. Each figure presents data for rule of 5 drugs (Ro5, red), extended Ro5 (eRo5, blue) and beyond Ro5 (bRo5, green) drugs and clinical candidates in complex with their respective targets. Protein-protein interaction (PPI) interface SA data were extracted from Luo et al.39 and reanalysed. Box plots show minimum and maximum values as whiskers, the 25th, 50th and 75th percentiles as boxes and means as crosses. Horizontal lines indicate unpaired and unequal variance t-test of compared datasets with their p-values.

In addition to binding site shape, the ligand surface area (SA) that is buried upon binding and its proportion of the total ligand SA also provide useful information about the nature of the binding sites. While the buried SA indicates the size of the binding site, the proportion of buried ligand SA can indicate how open or exposed the binding site is. The plot of buried ligand SA against total ligand SA shows that eRo5 and bRo5 drugs have larger SAs buried in complexes with their targets than Ro5 drugs (Figure 4c, d). The proportion of buried ligand SA is, however, lower for drugs in eRo5 and bRo5 space compared to Ro5 space (Figure 4c, e, Supporting Information Figure S8), which is consistent with the preference of eRo5 and bRo5 drugs for flat and groove binding sites. Although the buried SA of eRo5 and bRo5 drugs is larger than that of Ro5 drugs, it is still slightly smaller than interface areas in weak protein-protein interactions (PPIs with Kd values >1 µM) and significantly smaller than those in strong protein-protein interactions (PPIs with Kd values <1 µM, Figure 4d).39 However, most of the affinity of protein-protein interactions arises from smaller hotspot areas within the interface40, 41 and drugs in bRo5 space may reach the critical size required to bind such hotspots and gain sufficient affinity towards PPI interfaces.14, 25 In conclusion, drugs in bRo5 space are less rod-like in ligand shape and bind to larger binding sites than Ro5 drugs, but with a lower proportion of their total SA buried in the complex. This agrees well with the observation that eRo5 and particularly bRo5 drugs bind to difficult, larger and more open binding sites that approach the size of PPI interfaces. In addition, compounds in bRo5 space are likely large enough to interact effectively with hotspots of PPI interfaces.

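The two surface-area quantities used above can be illustrated with a minimal sketch. It assumes that the buried SA is taken as the difference between the ligand SA in the free state (in its bound conformation) and the ligand SA that remains exposed in the complex; the text does not specify how the values were computed, and the function names and numbers below are hypothetical.

```python
def buried_surface_area(sa_free, sa_exposed_in_complex):
    """Ligand surface area (in square angstroms) buried upon binding."""
    return sa_free - sa_exposed_in_complex


def buried_proportion(sa_free, sa_exposed_in_complex):
    """Fraction of the total ligand surface area buried in the complex."""
    if sa_free <= 0:
        raise ValueError("total ligand surface area must be positive")
    return buried_surface_area(sa_free, sa_exposed_in_complex) / sa_free


# Hypothetical ligands: a large one in an open site and a small one in a pocket.
large_open = (buried_surface_area(900.0, 360.0), buried_proportion(900.0, 360.0))
small_enclosed = (buried_surface_area(450.0, 45.0), buried_proportion(450.0, 45.0))
```

With these made-up values the large ligand buries more area (540 versus 405 Å²) but a smaller proportion of its surface (0.6 versus 0.9), which is the pattern the text describes for bRo5 versus Ro5 drugs.
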
4.3 Molecular interactions at binding sites. The overall polarity of the drug and the target at the binding site and the number of intermolecular interactions, such as hydrogen bonds, are important descriptors for binding sites. The proportion of non-polar heavy atoms at the binding site interface, i.e. the number of carbon and sulfur atoms divided by the total number of heavy atoms at the interface, provides a measure of the overall polarity of the ligand and target interfaces. No significant differences in the means of this measure of interface polarity from Ro5 to bRo5 space were found for the ligand or target interface datasets (Figure 5a, Supporting Information Figure S11). The mean proportion of non-polar atoms at the target interfaces is similar to previous estimates42 and is also consistently lower than that of the ligand interfaces. Since bRo5 drugs have a smaller proportion of their total surface area buried upon binding to the target compared to Ro5 drugs, they might enrich polar or lipophilic atoms at the binding site interface to improve interactions. However, the polarity of the ligand binding site interface is similar to the overall polarity of the ligand for all three chemical spaces, indicating that neither polarity nor lipophilicity is enriched at the interface with the target (Figure 5a).

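The polarity measure defined above is a simple ratio and can be sketched as follows. The function name and the input format (a list of element symbols for the atoms at the interface) are assumptions made for illustration; how interface atoms were selected is not described in this section.

```python
NON_POLAR_ELEMENTS = {"C", "S"}  # carbon and sulfur, as defined in the text


def nonpolar_fraction(elements):
    """Proportion of non-polar heavy atoms among all heavy atoms.

    `elements` is an iterable of element symbols; hydrogens are ignored
    because only heavy atoms enter the ratio.
    """
    heavy = [e.capitalize() for e in elements if e.capitalize() not in ("H", "D")]
    if not heavy:
        raise ValueError("no heavy atoms supplied")
    return sum(e in NON_POLAR_ELEMENTS for e in heavy) / len(heavy)


# Hypothetical interface: four non-polar atoms out of six heavy atoms.
example = nonpolar_fraction(["C", "C", "C", "S", "N", "O", "H", "H"])
```

Note that, under this definition, halogens count towards the polar remainder because only carbon and sulfur are treated as non-polar.
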
Figure 5. (a) Proportion of non-polar heavy atoms (carbon and sulfur) to the total number of heavy atoms for the ligand binding site interface, the overall ligand, and the target binding site interface. The distributions of the number of (b) hydrogen bond acceptor (HBA) atom interactions and (c) hydrogen bond donor (HBD) atom interactions of ligands with their targets. Each figure presents data for rule of 5 (Ro5, red) drugs, extended Ro5 (eRo5, blue) and beyond Ro5 (bRo5, green) drugs and clinical candidates in complexes with their respective targets. Box plots show minimum and maximum values as whiskers, the 25th, 50th and 75th percentiles as boxes and means as crosses. Horizontal lines indicate unpaired and unequal variance t-test of compared datasets with their p-values.

Similarly to interface polarity, the number of atoms forming hydrogen bonding interactions between the ligand and target does not differ significantly between the three sets of drugs. Ligands in bRo5 space display a slightly higher mean number of HBA atom interactions but a statistically similar mean number of HBD atom interactions compared to Ro5 ligands (Figure 5b-c, Supporting Information Figure S13-S14). At first glance it may appear that bRo5 drugs show a much wider distribution, particularly a higher maximum number of HBD interactions. However, it should be taken into account that a criterion for including drugs in the Ro5 and eRo5 datasets is <5 HBD atoms, limiting their possible HBD interactions. In addition, the eRo5 and bRo5 datasets encompass both oral and parenteral drugs and clinical candidates, with two parenterally delivered polysaccharides, amikacin and β-acarbose, being the only two drugs with >5 HBD interactions in the dataset. The number of π-π interactions in drug-target complexes is also similar for all three datasets (Supporting Information Figure S15). Other types of interactions, such as ionic interactions, cation-π interactions and halogen bonding, were less frequently observed in all three datasets, hence no confident conclusions could be drawn about these types of interactions. Therefore, despite the increase in size and the difference in shape observed for binding sites of bRo5 drugs, the overall polarity as well as the type and number of molecular interactions in the binding site remain similar to those of Ro5 drugs. These findings indicate that the same approaches for lead optimisation may be applied to the design of bRo5 ligands as for Ro5 ligands, with recent illustrative examples being the discovery of hepatitis C virus NS3/4A protease inhibitors.43

4.4 Affinities and ligand efficiencies. Affinities, measured as equilibrium dissociation constants, inhibition constants or concentrations giving 50% inhibition of target activity (Kd, Ki, IC50), were extracted from the literature for drug-target complexes in the three datasets where available. Affinity data were consistent with those previously reported for a large dataset of drugs,44 and drugs in Ro5, eRo5 and bRo5 space had similar means and distributions of affinities (Figure 6a). This also holds true for 102 of the 177 approved drugs in the original eRo5 and bRo5 datasets21 for which affinities were available (Supporting Information Figure S16), and leads to at least two important conclusions. Firstly, drugs outside Ro5 space do not require higher affinities for their targets compared to Ro5 compliant drugs to compensate for any perceived or actual unfavourable pharmacokinetics. Secondly, despite being perceived as "difficult", binding sites that are larger and more open can be modulated by drugs with similar affinities as drugs directed to sites traditionally considered highly "druggable". This correlates well with the observations that similar numbers and types of molecular interactions are formed between large, open and smaller, enclosed binding sites and their respective ligands (cf. Figure 5 and Supporting Information Figure S11-S15).

Ligand efficiency metrics have found widespread use;45 however, they also have some limitations associated with their application, particularly outside traditional Ro5 drug space.46 We nonetheless believe it is useful to characterise the ligand efficiency (LE) and lipophilic ligand efficiency (LLE) distributions observed in eRo5 and bRo5 space to provide guides for those who wish to use them in drug development. As the drugs in eRo5 and bRo5 space are significantly bigger than Ro5 drugs, i.e. they have higher molecular weights and more heavy atoms, their LE is significantly lower (Figure 6b, Supporting Information Figure S17). While LE is known to not be completely independent of the size of compounds,47 we also find that it is correlated to the proportion of buried SA and, hence, also the shape of the binding site (Figure 6c). It should be noted that this correlation is not lost for size-corrected ligand efficiencies (see Supporting Information Figure S18-19 for full details). Hence, ligand efficiencies are significantly lower for flat and groove binding sites than for pocket and internal sites, with mean values increasing from 0.19 to 0.42 kcal/mol.HAC from flat to internal sites. LLE,45 a measure of the affinity of a compound with respect to its lipophilicity, is similar for all three datasets and across all binding site shapes (means ± standard deviations: 4.0 ± 1.4 to 6.6 ± 3.7, Supporting Information Figure S17), indicative of Ro5, eRo5 and bRo5 drugs having similar lipophilicities and affinities. Therefore efforts to develop drugs in eRo5 and bRo5 space should aim to generate compounds with similar affinities to those in Ro5 space but with altered guidelines for LE, particularly for difficult binding sites. Typical bRo5 drugs have a LE of 0.11-0.30 kcal/mol.HAC, with both increased compound size and open, flat binding sites contributing to their reduced values. It should be emphasised that the ligand efficiencies discussed herein reflect the historical development of drugs and that a recent review of inhibitors of PPIs indicates that LE values in the top half of our dataset, 0.20 and upward, are possible for compounds targeting difficult binding sites.14

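As a worked illustration of the two metrics, the sketch below uses the commonly quoted definitions LE = -ΔG/HAC, with ΔG = RT ln(Kd), and LLE = pKd - logP. These definitions, the temperature of 298.15 K and the function names are assumptions made here; the section itself does not spell out the formulas, and Ki or IC50 values are often substituted for Kd in practice.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Free energy of binding in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temperature * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atom_count, temperature=298.15):
    """LE in kcal/mol per heavy atom (assumed definition: -dG / HAC)."""
    return -binding_free_energy(kd_molar, temperature) / heavy_atom_count


def lipophilic_ligand_efficiency(kd_molar, logp):
    """LLE (assumed definition: pKd minus logP)."""
    return -math.log10(kd_molar) - logp


# A 10 nM compound with 36 heavy atoms (about 500 Da).
le_guideline = ligand_efficiency(1e-8, 36)
# The same affinity spread over a hypothetical 70 heavy atom bRo5 compound.
le_large = ligand_efficiency(1e-8, 70)
```

Under these assumptions `le_guideline` evaluates to roughly 0.30 kcal/mol per heavy atom, and `le_large` to roughly 0.16, which shows how an unchanged affinity leads to a lower LE simply because the compound is larger.
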
Figure 6. (a) Affinities and (b) ligand efficiencies (LE, kcal/mol.HAC) for rule of 5 (Ro5, red), extended Ro5 (eRo5, blue) and beyond Ro5 (bRo5, green) drugs and clinical candidates. (c) The relationship between LE and binding site shape. As a rule of thumb, Ro5 drug candidates are optimised to have 10 nM affinities, corresponding to LEs of 0.30 for a compound with a molecular weight of 500 Da (heavy atom count, HAC ~36). This "guideline" for optimisation is marked with a grey line in (b) and (c). Box plots show minimum and maximum values as whiskers, the 25th, 50th and 75th percentiles as boxes and means as crosses. Horizontal lines indicate unpaired and unequal variance t-test of compared datasets with their p-values.

5. EXAMINING BEYOND RULE OF 5 DRUGS AND CLINICAL CANDIDATES
A key result of this analysis is that drugs and clinical candidates in bRo5 space are of particular interest because of their greater ability to modulate difficult, more open, flat and groove-shaped binding sites. Two recent investigations provide additional support for this observation. The first found that a representative set of macrocyclic natural products with high MW bind either face-on to flat binding sites or edge-on to groove-shaped binding sites.25 The second highlighted that de novo designed inhibitors of protein-protein interactions that entered clinical trials during the last decade are predominantly non-macrocyclic and bind to groove- or pocket-shaped binding sites in rod-like conformations, which likely reflects the higher druggability of groove- and pocket-shaped binding sites.14 Our non-redundant dataset of bRo5 ligand-target structures contains three drug classes that bind to flat binding sites, all of which are used to treat infectious disease (Figure 7), and eleven that target groove-shaped binding sites, five of which are used in oncology. Investigating whether the chemical class of the ligand in bRo5 space (i.e. de novo designed, natural product or peptide/peptidomimetic) affected the properties of the complex with the target showed no significant differences for the properties discussed above. This is most likely due to the small number of complexes in each subset of the bRo5 dataset.

[Figure 7: chemical structure drawings are not recoverable from the extracted text; the panel labels are listed below as compound, status, route of administration; target.]

Flat binding site
Infection
Ivermectin, Approved, Oral; Avermectin-sensitive glutamate-gated chloride channel GluCl α
Simeprevir, Approved, Oral; NS3/4A protease
Thiostrepton, Approved, Parenteral; Bacterial ribosome and proteins

Groove shaped binding site
Oncology
Birinapant, Phase II, Parenteral; E3 ubiquitin-protein ligase XIAP
Dactinomycin, Approved, Parenteral; DNA
Doxorubicin, Approved, Parenteral; DNA
Paclitaxel, Phase II, Oral; Tubulin α-chain
Navitoclax, Phase II, Oral; Apoptosis regulator Bcl-2
Infection
Capreomycin, Approved, Parenteral; Bacterial ribosome and proteins
Metabolism
β-Acarbose, Approved, Parenteral; α-Amylase
Immunology
Rapamycin, Approved, Oral; Peptidyl-prolyl cis-trans isomerase FKBP5
Cyclosporine A, Approved, Oral; Peptidyl-prolyl cis-trans isomerase A
Cardiovascular
Argatroban, Approved, Parenteral; Thrombin
Eptifibatide, Approved, Parenteral; Integrin α-IIB

Figure 7. Chemical structures of drugs and clinical candidates in bRo5 space that bind to flat and groove-shaped binding sites on targets. Compounds are grouped by therapeutic indication and their status, route of administration and clinically relevant targets are given below each structure.

Interestingly, more than half of the bRo5 drugs and clinical candidates in the non-redundant dataset that bind to flat and groove-shaped binding sites are macrocycles (Figure 7). Examination of the full dataset of 475 drugs and clinical candidates that have a MW >500 Da also reveals that orally bioavailable, as well as parenteral, macrocycles are significantly enriched in bRo5 space compared to non-macrocycles (Figure 8a), a finding that has been highlighted before.21, 25, 27 Though the reason for this enrichment has not been conclusively identified, it is known that conformational constraints imposed by the macrocyclic structure can convey improved pharmacokinetics, but also higher potency and better selectivity as compared to related acyclic analogues at similar binding sites.26, 27, 48, 49 Conformational restriction has therefore been postulated as a general principle leading to macrocycle enrichment in bRo5 space26, 27, 48, 49 and is consistent with studies showing that increasing the number of rotatable bonds has a negative effect on the oral bioavailability of drugs, independent of their chemical class.50, 51 Generally, increasing the conformational flexibility of drug candidates is expected to reduce both the affinity and selectivity of target binding,48 although the correlation between flexibility and promiscuity has been questioned.52, 53 As the influence of conformational restriction through macrocyclization on target binding still remains unclear, at least in some cases,54 we also investigated flexibility and shape of macrocycles and non-macrocycles for our dataset of bRo5 drugs and clinical candidates to obtain an overview of flexibility in bRo5 space.

5.1 Flexibility of macrocycles and non-macrocycles. We first investigated whether macrocyclic drugs and clinical candidates are more rigid than non-macrocyclic ones in bRo5 chemical space. Due to the well-known difficulty in predicting the conformations of bRo5 compounds using computational methods,49, 55 we analysed all available experimental conformers from crystal structures in the Protein Data Bank (PDB) and Cambridge Structural Database (CSD) for our dataset of bRo5 drugs and clinical candidates. Arguably such an analysis is limited by the size of the dataset and the data that are available for each compound, and possibly biased by crystal packing artefacts and poorly refined geometries,56, 57 but it has the advantage of being based on experimental data. We found a total of 24 drugs and clinical candidates in bRo5 space, with 12 non-redundant drug classes that displayed multiple conformations in the crystalline state. In spite of its somewhat limited size, this dataset consists of the largest number of bRo5 drugs for which experimentally determined conformations have been analysed, and allowed us to reach some interesting, but potentially preliminary, conclusions.

All conformers observed for a given drug or clinical candidate were clustered to identify a set of representative conformers that showed >1 Å root mean square deviation (RMSD) of all heavy atoms between different conformers. This ensured that the subsequent analysis was not biased by multiple crystal structures of the same conformation. The representative conformers were then compared using average RMSD, where a high value indicates that the compound is flexible and can adopt one or more significantly different conformations. In addition to the RMSD values for all atoms, RMSD values for the macrocyclic core atoms subset and the core plus single heavy atoms attached directly to the core (peripheral atoms) subset were also calculated for macrocyclic drugs and clinical candidates.

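The selection of representative conformers and the average RMSD described above might be sketched as follows. The clustering algorithm is not named in the text, so a simple greedy ("leader") selection is used here as a stand-in. The sketch also assumes that all conformers list the same heavy atoms in the same order and have already been superimposed; no alignment step is performed. All function names are hypothetical.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """Root mean square deviation between two pre-aligned coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("conformers must contain the same, non-zero number of atoms")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def representative_conformers(conformers, cutoff=1.0):
    """Greedily keep conformers that differ by more than `cutoff` (in angstroms)
    from every representative kept so far. Returns indices into `conformers`."""
    reps = []
    for idx, conf in enumerate(conformers):
        if all(rmsd(conf, conformers[r]) > cutoff for r in reps):
            reps.append(idx)
    return reps


def mean_pairwise_rmsd(conformers):
    """Average RMSD over all pairs; 0.0 if fewer than two conformers are given."""
    pairs = list(combinations(conformers, 2))
    if not pairs:
        return 0.0
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# Toy two-atom "conformers": b is a near-duplicate of a, c is clearly different.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)]
c = [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)]
reps = representative_conformers([a, b, c])
average = mean_pairwise_rmsd([[a, b, c][i] for i in reps])
```

Restricting the coordinate lists to a subset of atoms, for example the macrocyclic core, before calling `rmsd` would give the subset RMSD values mentioned in the text.
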
We found that RMSD values of the macrocyclic core atoms are similar to those of the core plus peripheral atom subset, but that RMSDs are significantly higher when all atoms in the macrocycle drugs are taken into account (Figure 8b, Supporting Information Figure S20). This indicates that the side chains are commonly the most dynamic regions of macrocyclic drugs. Furthermore, the all-atom RMSD of macrocycles is similar to that of non-macrocyclic drugs and clinical candidates in bRo5 space, suggesting that both chemical classes have a similar degree of overall flexibility. Although RMSD values are useful for comparison of conformations, they give little insight into the source of conformational flexibility. Therefore an analysis of the location of bonds that rotate to give the different conformations was conducted. For macrocyclic drugs in bRo5 space, bonds from all regions, i.e. the macrocyclic core, the side chains as well as bonds linking the two, were rotated to form the different conformers (Figure 8c, Supporting Information Figure S21). However, most bonds throughout all regions of the macrocycle display only modest or limited rotational freedom. Non-macrocyclic drugs in bRo5 space also demonstrate a similar distribution of rotational freedom about bonds (Figure 8d, Supporting Information Figure S21). Conformationally constrained regions of non-macrocyclic drugs arise from aromatic rings, other π-systems such as amides, and substituted aliphatic rings that occur in higher proportions than for macrocyclic drugs (Supporting Information Figure S20e & f). The flexibility of non-macrocycles mainly originates from rotation around bonds that connect these rigid elements. Overall, this analysis of conformational flexibility and its origins indicates that macrocycles are not more rigid than non-macrocyclic drugs in bRo5 space across different drug classes. Instead, macrocyclisation could be considered as a complementary strategy to other, more traditional approaches for introducing rigidity into drugs. Hence decreased flexibility may not be the primary reason for the enrichment of oral macrocycles in bRo5 space.

Figure 8. (a) Distribution of parenterals (triangles) and orals (circles) for macrocycles (right) in beyond Ro5 (bRo5 MC, green) and extended Ro5 space (eRo5 MC, blue), as well as non-macrocycles (left) in the two chemical spaces (bRo5 non-MC, red and eRo5 non-MC, orange). The dashed black box shows the eRo5 space limits for ClogP and MW. (b) Comparison of average root mean square deviation (RMSD, Å) values of the representative conformers of bRo5 macrocyclic drugs (N=8, mean number of representative conformers = 3.5) and bRo5 non-macrocyclic drugs (N=4, mean number of representative conformers = 2.5). For the macrocycles, RMSD values for core atoms (atoms in the macrocycle ring), core plus periphery atoms (the core atoms plus all single heavy atoms attached to the core) as well as all atoms are compared. Box plots show minimum and maximum values as whiskers, the 25th, 50th and 75th percentiles as boxes and means as crosses. Horizontal lines indicate unpaired and unequal variance t-test of compared datasets with their p-values. (c) and (d) Chemical structure, all observed crystal structures and the most representative structure coloured by the circular standard deviation of each bond of erythromycin A and ritonavir, respectively. For erythromycin A core atoms are in purple, periphery atoms in orange and side chain atoms in black. The superimpositions show all experimentally determined structures from the Cambridge Structural Database and Protein Data Bank for the two drugs with heavy atoms coloured by chemical element (green for carbon, blue for nitrogen, red for oxygen and yellow for sulfur). Flexible bonds giving rise to different conformers of each drug were identified from the circular standard deviation of the dihedral angles and are colour-coded from white (0) to red (1) representing rigid to flexible bonds, respectively.

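Because dihedral angles wrap around at ±180°, their spread has to be measured with circular rather than ordinary statistics. The sketch below computes the mean resultant length R of a set of dihedral angles and two standard measures derived from it: the circular variance 1 - R, which runs from 0 to 1, and the circular standard deviation sqrt(-2 ln R), which is unbounded. Which of these, or which normalisation, underlies the 0 to 1 colour scale in Figure 8 is not stated in the caption, so the correspondence is an assumption; the function names are hypothetical.

```python
import math


def mean_resultant_length(angles_deg):
    """Length R of the mean unit vector of the angles (0 = spread out, 1 = identical)."""
    if not angles_deg:
        raise ValueError("at least one angle is required")
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    return min(1.0, math.hypot(c, s))  # clamp rounding errors above 1


def circular_variance(angles_deg):
    """Circular variance 1 - R, bounded between 0 (rigid) and 1 (fully spread)."""
    return 1.0 - mean_resultant_length(angles_deg)


def circular_std(angles_deg):
    """Circular standard deviation sqrt(-2 ln R) in radians (unbounded)."""
    r = mean_resultant_length(angles_deg)
    if r == 0.0:
        return math.inf
    return math.sqrt(max(0.0, -2.0 * math.log(r)))
```

For instance, dihedrals of -170° and 170° differ by only 20° across the wrap-around point and give a circular variance of about 0.015, whereas a naive standard deviation of the raw numbers would suggest a very flexible bond.
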
5.2 Ligand efficiency and shapes of ligands and binding sites. To further investigate the differences and similarities of macrocyclic and non-macrocyclic drugs and clinical candidates in bRo5 space, the affinity, ligand efficiency (LE), binding site shape and ligand shape were analysed. Affinities did not differ significantly between macrocyclic and non-macrocyclic drugs and clinical candidates in bRo5 space, but macrocycles were found to have a slightly lower LE compared to non-macrocycles (Figure 9a-b, Supporting Information Figure S22). This is consistent with the observations that drugs and clinical candidates binding to flat binding sites have lower LE and that only macrocyclic drugs in bRo5 space bind to flat binding sites for the non-redundant datasets (Figure 9c). In contrast, non-macrocyclic drugs display a slight preference for groove- and pocket-shaped binding sites. Although the number of non-redundant drugs and clinical candidates in these two categories is low, the complete bRo5 drug-target structure dataset also retains a higher proportion of flat binding sites for macrocycles (Supporting Information Figure S23). In line with these findings, the nPMI shape of bRo5 drugs indicates that macrocycles tend to be more disc- and sphere-like than non-macrocycles (Figure 9d and Supporting Information Figure S23). As already discussed above, recent investigations of macrocyclic natural products25 and of inhibitors of protein-protein interactions14 provide additional support for these trends. Thus, one may conclude that the shape of macrocycles in combination with suitable rigidity and conformational preferences makes them well suited for binding to difficult flat binding sites with sufficient potency and selectivity. The more linear and often aromatic non-macrocycles appear to be somewhat better adapted for groove- and pocket-shaped binding sites, although macrocycles also bind to these sites.

Figure 9. (a) Affinities, (b) ligand efficiencies (LE), (c) binding site shapes and (d) normalised principal moment of inertia (nPMI) plot of bioactive conformations of drugs for beyond Ro5 macrocyclic (bRo5 MC, green) and beyond Ro5 non-macrocyclic (bRo5 non-MC, red) drugs and targets. As a rule of thumb, Ro5 drug candidates are optimised to have 10 nM affinities, corresponding to a LE of 0.30 for a compound with a molecular weight of 500 Da (heavy atom count, HAC ~36). This "guideline" for optimisation is marked with a grey line in (b). Box plots show minimum and maximum values as whiskers, the 25th, 50th and 75th percentiles as boxes and means as crosses.

5.3 Advantages of macrocycles in beyond rule of 5 space. As highlighted in the literature,26, 27, 48 and discussed above, macrocyclisation of a linear compound can improve both oral bioavailability and affinity at the binding site of a specific target. As macrocyclisation decreases flexibility, it has been suggested that decreased flexibility is the main reason for enrichment of macrocycles in bRo5 space.48 However, in the current dataset we examine drugs and clinical candidates acting at different targets, giving a global picture of flexibility in bRo5 space. This indicates that both macrocyclic and non-macrocyclic drugs in bRo5 space have similar flexibilities and affinities across different targets and binding sites. Interestingly, we also find that disc- and sphere-like macrocycles bind more commonly to flat binding sites than rod-like non-macrocycles, which more frequently target groove-shaped binding sites. Hence, we conclude that the unique ability of macrocycles to adopt disc- and sphere-like shapes that are better suited for binding to flat and groove-shaped sites is an important reason for enrichment of macrocycles in bRo5 space. Improved permeability across membranes, which translates into higher oral bioavailability, has been demonstrated for a number of macrocycles and constitutes another reason.48 It should also be remembered that most macrocycles in bRo5 space are natural products that were discovered prior to target-based drug discovery and high throughput screening. Thus, their current enrichment as oral drugs in beyond Ro5 space also reflects the history of drug discovery for difficult targets, while at the same time providing inspiration and insight for future discovery of oral drugs for difficult targets.

9
10
11
12
13
6. PERSPECTIVE

6.1 Designing drugs beyond the rule of 5. A key conclusion from this analysis is that drugs and clinical candidates in bRo5 space are better suited than Ro5-compliant drugs to modulate "difficult" and emerging target classes, in particular when binding sites are large and flat or groove-shaped (Table 2). In previous analyses we demonstrated that 93% of the current oral drugs and clinical candidates in eRo5 and bRo5 space fall within an "outer limit" of physicochemical space where there still remains a reasonable chance to design orally bioavailable drugs.²¹,²⁷ This space was delineated by MW <1000 Da, -2 < ClogP < 10, HBD <6, HBA <15, PSA <250 Å² and NRotB <20. The current target binding analysis and the previous oral bioavailability analyses therefore provide further guidance for design of bRo5 ligands for "difficult" targets, in particular with regard to the shape of the binding site, the flexibility of the ligand and the LE that may be attained for binding sites of different shapes.
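As an illustration of how the "outer limit" criteria listed above could be applied to a candidate compound, the following minimal sketch checks six precomputed descriptors against the quoted thresholds. The class and function names and the example values are hypothetical, and the strict inequalities simply mirror the sentence above; how boundary values such as HBD = 6 should be treated is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mw: float     # molecular weight, Da
    clogp: float  # calculated logP
    hbd: int      # hydrogen bond donors
    hba: int      # hydrogen bond acceptors
    psa: float    # polar surface area, Å²
    nrotb: int    # number of rotatable bonds


def outer_limit_violations(p):
    """Return the names of the outer-limit criteria that the compound fails."""
    checks = {
        "MW < 1000 Da": p.mw < 1000,
        "-2 < ClogP < 10": -2 < p.clogp < 10,
        "HBD < 6": p.hbd < 6,
        "HBA < 15": p.hba < 15,
        "PSA < 250 Å²": p.psa < 250,
        "NRotB < 20": p.nrotb < 20,
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up descriptor values, not taken from any real drug.
inside = Properties(mw=850, clogp=4.2, hbd=4, hba=12, psa=190, nrotb=14)
outside = Properties(mw=1200, clogp=4.2, hbd=7, hba=12, psa=190, nrotb=14)
print(outer_limit_violations(inside))
print(outer_limit_violations(outside))
```

Returning the list of failed criteria, rather than a single pass/fail flag, shows which property would need attention during optimisation.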
When designing oral drugs in bRo5 space, molecular weight may be increased up to 1,000 Da, which allows high-affinity binding to "difficult", extended, open, flat and groove-shaped binding sites that bury up to 800-900 Å² of ligand surface area (Table 2, Supporting Information Figure S25). In contrast, drugs that adhere strictly to the Ro5 bind to pockets and internal sites and bury up to 600 Å² of ligand surface area. Overall polarity at the ligand interface is similar for bRo5, eRo5 and Ro5 drugs and clinical candidates, which is mirrored in their similar lipophilicity, i.e. ClogP values centred around 4. Hence, an increase in 2D PSA up to 250 Å² is often found at higher molecular weights in bRo5 space to maintain this balance of polarity. The majority of this increased PSA should originate from an increased number of HBA, as HBD must be strictly controlled at ≤6 to avoid reducing oral bioavailability. This corresponds well to the observed small increase in the number of ligand HBA atom interactions between the ligand and its target in bRo5 space as compared to eRo5 and Ro5 space, while ligand HBD atom interactions are similar. Similar overall ligand-target affinities were observed between Ro5, eRo5 and bRo5 space, which resulted in reduced ligand efficiency in bRo5 space. Additional, detailed comparisons of the properties of ligand-target interfaces in each chemical space revealed that trends are similar between orals and parenterals for drugs and clinical candidates (Supporting Information Figures S24-S28). However, parenterals in bRo5 space have reduced LE and increased LLE as compared to orals in bRo5 space (Supporting Information Figure S27). These differences are caused by drugs in parenteral bRo5 space which have high molecular weight and high polarity; they are often peptidic in nature and thereby difficult to administer orally. In conclusion, this analysis indicates that the same approaches for optimisation of oral drugs can be applied in bRo5 space as in Ro5 space, provided that physicochemical properties are kept within the recently reported "outer limits".²¹,²⁷

Compounds in bRo5 space require an overall appropriate balance between flexibility and rigidity that allows them to bind with high affinity to the target without paying an unnecessary entropic penalty. Conformational flexibility may be reduced through macrocyclisation, or through use of aromatic and substituted aliphatic rings and other π-systems such as amides, with equal success but at different types of binding sites. Interestingly, macrocyclisation appears to favour disc- and sphere-like conformations that facilitate binding of the ligand to the very difficult, flat binding sites. Incorporation of other rigidifying structural elements in non-macrocycles is more prone to yield rod-like conformations that bind to groove-shaped binding sites. During optimisation of beyond Ro5 drugs it is also important to adjust goals for LE. Compounds and series with an LE value below 0.30 kcal/mol.HAC are often not prioritised in drug discovery. However, during optimisation of bRo5 drugs for difficult targets, leads with LEs <0.30 kcal/mol.HAC should not be discarded, as this analysis indicates that they retain a reasonable chance of having sufficient affinity for their target while still remaining within the "possible to be oral" drug space. Instead, the size and flexibility of the ligand and the shape of the target binding site should be taken into account, allowing progression of compounds that may give candidate drugs with ligand efficiencies >0.12 kcal/mol.HAC, a guideline that captures 90% of current oral drugs and clinical candidates in bRo5 space. Finally, intramolecular hydrogen bonding, saturation of efflux transporters with high doses, use of improved formulations and capitalising on selective transporter-mediated distribution to target organs can also play a role in producing a cell permeable and orally bioavailable drug in bRo5 space.²¹
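To make the LE thresholds discussed above tangible, the sketch below converts between affinity and ligand efficiency using the usual definition LE = -ΔG/HAC with ΔG = RT ln(Kd). The temperature of 298.15 K, the function names and the example heavy atom count of 70 are assumptions made for illustration; they are not taken from the analysis itself.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(affinity_molar, heavy_atoms, temperature_k=298.15):
    """LE in kcal/mol per heavy atom from an affinity (Kd or Ki) in mol/L."""
    delta_g = R_KCAL * temperature_k * math.log(affinity_molar)  # negative below 1 M
    return -delta_g / heavy_atoms


def affinity_for_le(target_le, heavy_atoms, temperature_k=298.15):
    """Affinity (mol/L) needed to reach a target LE at a given heavy atom count."""
    delta_g = -target_le * heavy_atoms
    return math.exp(delta_g / (R_KCAL * temperature_k))


# The Ro5 rule of thumb: 10 nM affinity with about 36 heavy atoms.
print(round(ligand_efficiency(10e-9, 36), 2))

# Affinity implied by LE = 0.12 for a hypothetical 70-heavy-atom ligand.
print(affinity_for_le(0.12, 70))
```

Under these assumptions the first call reproduces the LE of about 0.30 quoted for a 10 nM compound with roughly 36 heavy atoms, while the second shows that a much larger ligand can sit at the lower guideline of 0.12 and still reach sub-micromolar affinity.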
Table 2. Summary of interactions between drugs and clinical candidates and their targets by chemical spaceᵃ

Propertyᵃ                                     Ro5                    extended Ro5                   beyond Ro5ᵇ
Target classes                                GPCR, ion channel,     GPCR, protease/hydrolase,      GPCR, protease/hydrolase,
                                              nuclear hormone        kinase, transferase,           transferase, isomerases,
                                              receptor               structural & adhesion,         structural & adhesion,
                                                                     other                          enzyme regulators, other
Shape                                         internal, pocket       groove, internal, pocket       flat, groove
Buried ligand surface area                    285-575 Å²             440-760 Å²                     415-820 Å²
Ligand buried surface area proportion         63-99%                 56-96%                         36-78%
Ligand interface non-polar atom proportion    60-88%                 64-81%                         63-78%
H bond interactions                           0-4 Acc., 0-2 Don.     1-4 Acc., 0-3 Don.             1-7 Acc., 0-3 Don.
-log(Affinity)                                4.6-9.7                7.3-10.2                       6.1-9.9
LE (kcal/mol.HAC)                             0.27-0.66              0.23-0.39                      0.14-0.26ᶜ
LLE                                           2.0-8.1                2.5-7.0                        1.6-10.2
Flexibility in bRo5 space                     Macrocyclisation favours disc- and sphere-like shapes for flat binding sites.
                                              Rings and π-systems (aromatics, amides etc.) favour rod-like shapes for groove and pocket binding sites.

ᵃ Property values shown are the 10th to 90th percentiles of all structures, encompassing both orals and parenterals. Orals and parenterals show very similar property values (Supporting Information Figures S24-S28). ᵇ Boxes indicate differences between bRo5 space and small molecule drug space. ᶜ LE values should be adjusted based on the size of the ligand and the shape of the binding site during optimisation, with the final aim of being in this range and as high as possible.
6.2 Conclusion and future challenges. This analysis supports the view that many of the currently recognised "difficult" targets, and the multitude of novel targets that are emerging from genomics and proteomics, are out of reach for drugs in Ro5 space, but that they may be suited to manipulation by drugs in bRo5 space. Disc- and sphere-shaped macrocycles and rod-like non-macrocycles in bRo5 space are particularly well suited to targets that have large, flat or groove-shaped binding sites, respectively. Orally bioavailable and cell permeable drugs for difficult targets can be designed within the current "outer limits" of physicochemical property space, beyond which pharmacokinetics currently presents an almost insurmountable challenge.²¹ In order to capitalise on opportunities in bRo5 space, changes are required in the perception of which target properties are considered "druggable" and which ligand properties allow oral bioavailability. Realising that physicochemical property space can be expanded, and introducing reduced LE goals tailored to binding site shape and size, will enable development of drugs in bRo5 space. It may also be important to focus optimisation so that candidate drugs have an appropriate balance between rigidity and flexibility in order to obtain satisfactory potency and ADMET properties.

The increased interest in drug discovery for difficult targets is becoming apparent from the increasing number of relevant publications from industry and academia, the emergence of several companies specialising in this chemical space and the growing number of approved drugs outside Ro5 chemical space.²¹ Thus, it is important to consider the discoveries and breakthroughs that will be required to enhance drug discovery in bRo5 space. Improved lead generation for novel classes of targets with difficult binding sites will be key to success. Extrapolation of current trends indicates that natural products and peptides will continue to constitute valuable starting points in this chemical space.²¹ Fragment-based lead generation is also beginning to prove its worth in delivering drugs and clinical candidates for difficult targets,⁵⁸ and may become of increasing value for bRo5 lead discovery. Compounds in bRo5 space that bind to difficult targets will be bigger and more complex than Ro5 drugs. Although significant progress has been made in preparation of complex structures, e.g. macrocycles,⁵⁹ synthesis will likely remain difficult, time-consuming and costly for the immediate future. Therefore, development of improved predictive methods would also have a major impact on bRo5 drug discovery by allowing only the most highly prioritised compounds to be selected for synthesis. Access to efficient and reliable methods for 3D conformer generation would allow more accurate predictions of flexibility, lipophilicities and polar surface areas and facilitate modelling of cell permeability and target binding. Predictive models for cellular efflux and metabolism would also be extremely valuable, even though it may not be realistic to expect them to be developed in the near future. In summary, we believe that developments in lead generation, synthetic methodology and predictive methods, in combination with insights into how targets are engaged and improved understanding of ADMET properties, will allow more effective drug discovery in bRo5 space in the near future. It is our hope that this will ultimately contribute to an enhanced delivery of innovative medicines and increased pharmaceutical R&D efficiency.
ASSOCIATED CONTENT

Supporting Information

Full methods for generation of the dataset and its analysis. Figures for all data discussed in the manuscript as well as figures for the full structure dataset along with statistical analysis. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*J.K.: phone, +46 (0)18 4713801; e-mail: jan.kihlberg@kemi.uu.se

Author contributions

All authors contributed to generation of data, its analysis and to the writing of the manuscript. They have approved the final version of the manuscript.

Notes

The authors declare no competing financial interests.
Biographies

Bradley Doak is a post-doctoral researcher in the Department of Organic Chemistry, Uppsala University, Sweden. Brad obtained a Bachelor of Medicinal Chemistry at the Department of Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, Australia. He then completed a PhD in Synthetic Medicinal Chemistry at Monash University in 2012, before taking his current position in Sweden. His research interests include beyond rule of 5 drug design, specifically macrocycles, as well as fragment-based drug design and difficult targets.

Jie Zheng holds a Master's degree in Organic Chemistry from the Department of Organic Chemistry, Uppsala University, Sweden. She obtained a Bachelor of Environmental Engineering from Shandong University of Technology, China and then went on to study organic chemistry at Umeå University and Uppsala University, Sweden. Following research in environmental and technical chemistry, her research interests now include computational chemistry with an emphasis on how compounds bind to their targets and applications in molecular design.

Doreen Dobritzsch has been a senior lecturer in Biochemistry at Uppsala University, Sweden, since 2013. Previously she held an Assistant Professorship at the Karolinska Institute, where she also performed her postdoctoral studies after obtaining a PhD in Biochemistry from the Martin-Luther-Universität Halle-Wittenberg, Germany, in 1999. Her research in the field of structural biochemistry is focused on structure-function relationships in enzymes, primarily those involved in pyrimidine degradation, and interactions occurring between immune molecules and self-antigens in rheumatoid arthritis.

Jan Kihlberg has held a chair in Organic Chemistry at Uppsala University, Sweden, since 2013. During the previous ten years at AstraZeneca R&D Mölndal he held positions as Director of Medicinal Chemistry, then as Director of Competitive Intelligence and Business Foresight Analysis. He became Professor in Organic Chemistry at Umeå University in 1996 after having established his research group at Lund Institute of Technology in 1991. He holds a PhD in Organic Chemistry from Lund Institute of Technology. His key research interests are understanding which properties confer cell permeability and target binding on compounds outside the rule of 5, and the chemical biology of glycopeptides, peptides and their mimetics.
ACKNOWLEDGMENTS

B. C. D. was supported by a postdoctoral fellowship funded by Uppsala University. The authors would like to thank Prof. Helena Danielson, Uppsala University, Sweden for providing valuable comments on the manuscript.

ABBREVIATIONS USED

bRo5, beyond rule of 5; eRo5, extended rule of 5; LLE, lipophilic ligand efficiency; nPMI, normalised principal moment of inertia; QED, quantitative estimate of drug-likeness; Ro5, rule of 5.
REFERENCES

1. Rask-Andersen, M.; Almen, M. S.; Schioth, H. B. Trends in the exploitation of novel drug targets. Nat. Rev. Drug Discov. 2011, 10, 579-590.
2. Bunnage, M. E. Getting pharmaceutical R&D back on target. Nat. Chem. Biol. 2011, 7, 335-339.
3. Kinch, M. S.; Hoyer, D.; Patridge, E.; Plummer, M. Target selection for FDA-approved medicines. Drug Discov. Today 2015, 20, 784-789.
4. Scannell, J. W.; Blanckley, A.; Boldon, H.; Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discov. 2012, 11, 191-200.
5. Hay, M.; Thomas, D. W.; Craighead, J. L.; Economides, C.; Rosenthal, J. Clinical development success rates for investigational drugs. Nat. Biotech. 2014, 32, 40-51.
6. Cook, D.; Brown, D.; Alexander, R.; March, R.; Morgan, P.; Satterthwaite, G.; Pangalos, M. N. Lessons learned from the fate of AstraZeneca's drug pipeline: a five-dimensional framework. Nat. Rev. Drug Discov. 2014, 13, 419-431.
7. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 2004, 431, 931-945.
8. Kim, M. S.; Pinto, S. M.; Getnet, D.; Nirujogi, R. S.; Manda, S. S.; Chaerkady, R.; Madugundu, A. K.; Kelkar, D. S.; Isserlin, R.; Jain, S.; Thomas, J. K.; Muthusamy, B.; Leal-Rojas, P.; Kumar, P.; Sahasrabuddhe, N. A.; Balakrishnan, L.; Advani, J.; George, B.; Renuse, S.; Selvan, L. D.; Patil, A. H.; Nanjappa, V.; Radhakrishnan, A.; Prasad, S.; Subbannayya, T.; Raju, R.; Kumar, M.; Sreenivasamurthy, S. K.; Marimuthu, A.; Sathe, G. J.; Chavan, S.; Datta, K. K.; Subbannayya, Y.; Sahu, A.; Yelamanchi, S. D.; Jayaram, S.; Rajagopalan, P.; Sharma, J.; Murthy, K. R.; Syed, N.; Goel, R.; Khan, A. A.; Ahmad, S.; Dey, G.; Mudgal, K.; Chatterjee, A.; Huang, T. C.; Zhong, J.; Wu, X.; Shaw, P. G.; Freed, D.; Zahari, M. S.; Mukherjee, K. K.; Shankar, S.; Mahadevan, A.; Lam, H.; Mitchell, C. J.; Shankar, S. K.; Satishchandra, P.; Schroeder, J. T.; Sirdeshmukh, R.; Maitra, A.; Leach, S. D.; Drake, C. G.; Halushka, M. K.; Prasad, T. S.; Hruban, R. H.; Kerr, C. L.; Bader, G. D.; Iacobuzio-Donahue, C. A.; Gowda, H.; Pandey, A. A draft map of the human proteome. Nature 2014, 509, 575-581.
9. Wilhelm, M.; Schlegl, J.; Hahne, H.; Moghaddas Gholami, A.; Lieberenz, M.; Savitski, M. M.; Ziegler, E.; Butzmann, L.; Gessulat, S.; Marx, H.; Mathieson, T.; Lemeer, S.; Schnatbaum, K.; Reimer, U.; Wenschuh, H.; Mollenhauer, M.; Slotta-Huspenina, J.; Boese, J. H.; Bantscheff, M.; Gerstmair, A.; Faerber, F.; Kuster, B. Mass-spectrometry-based draft of the human proteome. Nature 2014, 509, 582-587.
10. Rask-Andersen, M.; Masuram, S.; Schioth, H. B. The druggable genome: evaluation of drug targets in clinical trials suggests major shifts in molecular class and indication. Annu. Rev. Pharmacol. Toxicol. 2014, 54, 9-26.
11. Hopkins, A. L.; Groom, C. R. The druggable genome. Nat. Rev. Drug Discov. 2002, 1, 727-730.
12. Venkatesan, K.; Rual, J. F.; Vazquez, A.; Stelzl, U.; Lemmens, I.; Hirozane-Kishikawa, T.; Hao, T.; Zenkner, M.; Xin, X.; Goh, K. I.; Yildirim, M. A.; Simonis, N.; Heinzmann, K.; Gebreab, F.; Sahalie, J. M.; Cevik, S.; Simon, C.; de Smet, A. S.; Dann, E.; Smolyar, A.; Vinayagam, A.; Yu, H.; Szeto, D.; Borick, H.; Dricot, A.; Klitgord, N.; Murray, R. R.; Lin, C.; Lalowski, M.; Timm, J.; Rau, K.; Boone, C.; Braun, P.; Cusick, M. E.; Roth, F. P.; Hill, D. E.; Tavernier, J.; Wanker, E. E.; Barabasi, A. L.; Vidal, M. An empirical framework for binary interactome mapping. Nat. Methods 2009, 6, 83-90.
13. Zhang, Q. C.; Petrey, D.; Deng, L.; Qiang, L.; Shi, Y.; Thu, C. A.; Bisikirska, B.; Lefebvre, C.; Accili, D.; Hunter, T.; Maniatis, T.; Califano, A.; Honig, B. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature 2012, 490, 556-560.
14. Arkin, M. R.; Tang, Y.; Wells, J. A. Small-molecule inhibitors of protein-protein interactions: progressing toward the reality. Chem. Biol. 2014, 21, 1102-1114.
15. Surade, S.; Blundell, T. L. Structural biology and drug discovery of difficult targets: the limits of ligandability. Chem. Biol. 2012, 19, 42-50.
16. Seco, J.; Luque, F. J.; Barril, X. Binding site detection and druggability index from first principles. J. Med. Chem. 2009, 52, 2363-2371.
17. Krasowski, A.; Muthas, D.; Sarkar, A.; Schmitt, S.; Brenk, R. DrugPred: A structure-based approach to predict protein druggability developed using an extensive nonredundant data set. J. Chem. Inf. Model. 2011, 51, 2829-2842.
18. Perola, E.; Herman, L.; Weiss, J. Development of a rule-based method for the assessment of protein druggability. J. Chem. Inf. Model. 2012, 52, 1027-1038.
19. Hajduk, P. J.; Huth, J. R.; Fesik, S. W. Druggability indices for protein targets derived from NMR-based screening data. J. Med. Chem. 2005, 48, 2518-2525.
20. Terrett, N. Drugs in middle space. MedChemComm 2013, 4, 474-475.
21. Doak, B. C.; Over, B.; Giordanetto, F.; Kihlberg, J. Oral druggable space beyond the rule of 5: insights from drugs and clinical candidates. Chem. Biol. 2014, 21, 1115-1142.
22. Abad-Zapatero, C. A sorcerer's apprentice and the rule of five: from rule-of-thumb to commandment and beyond. Drug Discov. Today 2007, 12, 995-997.
23. Zhang, M.-Q.; Wilkinson, B. Drug discovery beyond the ‘rule-of-five’. Curr. Opin. Biotech. 2007, 18, 478-488.
24. Walters, W. P. Going further than Lipinski's rule in drug design. Exp. Opin. Drug Discov. 2012, 7, 99-107.
25. Villar, E. A.; Beglov, D.; Chennamadhavuni, S.; Porco Jr, J. A.; Kozakov, D.; Vajda, S.; Whitty, A. How proteins bind macrocycles. Nat. Chem. Biol. 2014, 10, 723-731.
26. Driggers, E. M.; Hale, S. P.; Lee, J.; Terrett, N. K. The exploration of macrocycles for drug discovery - an underexploited structural class. Nat. Rev. Drug Discov. 2008, 7, 608-624.
27. Giordanetto, F.; Kihlberg, J. Macrocyclic drugs and clinical candidates: what can medicinal chemists learn from their properties? J. Med. Chem. 2014, 57, 278-295.
28. Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100-1107.
29. Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 2001, 46, 3-26.
30. Bickerton, G. R.; Paolini, G. V.; Besnard, J.; Muresan, S.; Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4, 90-98.
31. Vieth, M.; Sutherland, J. J. Dependence of molecular properties on proteomic family for marketed oral drugs. J. Med. Chem. 2006, 49, 3451-3453.
32. Morphy, R. The influence of target family and functional activity on the physicochemical properties of pre-clinical compounds. J. Med. Chem. 2006, 49, 2969-2978.
33. Zhang, J.; Yang, P. L.; Gray, N. S. Targeting cancer with small molecule kinase inhibitors. Nat. Rev. Cancer 2009, 9, 28-39.
34. Nero, T. L.; Morton, C. J.; Holien, J. K.; Wielens, J.; Parker, M. W. Oncogenic protein interfaces: small molecules, big challenges. Nat. Rev. Cancer 2014, 14, 248-262.
35. Wirth, M.; Volkamer, A.; Zoete, V.; Rippmann, F.; Michielin, O.; Rarey, M.; Sauer, W. H. Protein pocket and ligand shape comparison and its application in virtual screening. J. Comput. Aided Mol. Des. 2013, 27, 511-524.
36. Volkamer, A.; Kuhn, D.; Grombacher, T.; Rippmann, F.; Rarey, M. Combining global and local measures for structure-based druggability predictions. J. Chem. Inf. Model. 2012, 52, 360-372.
37. Huang, B. MetaPocket: a meta approach to improve protein ligand binding site prediction. OMICS 2009, 13, 325-330.
38. Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J. L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 2012, 52, 2864-2875.
39. Luo, J.; Guo, Y.; Zhong, Y.; Ma, D.; Li, W.; Li, M. A functional feature analysis on diverse protein-protein interactions: application for the prediction of binding affinity. J. Comput. Aided Mol. Des. 2014, 28, 619-629.
40. Clackson, T.; Wells, J. A. A hot spot of binding energy in a hormone-receptor interface. Science 1995, 267, 383-386.
41. Bogan, A. A.; Thorn, K. S. Anatomy of hot spots in protein interfaces. J. Mol. Biol. 1998, 280, 1-9.
42. Schmidtke, P.; Barril, X. Understanding and predicting druggability. A high-throughput method for detection of drug binding sites. J. Med. Chem. 2010, 53, 5858-5867.
43. Rosenquist, A.; Samuelsson, B.; Johansson, P. O.; Cummings, M. D.; Lenz, O.; Raboisson, P.; Simmen, K.; Vendeville, S.; de Kock, H.; Nilsson, M.; Horvath, A.; Kalmeijer, R.; de la Rosa, G.; Beumont-Mauviel, M. Discovery and development of simeprevir (TMC435), a HCV NS3/4A protease inhibitor. J. Med. Chem. 2014, 57, 1673-1693.
44. Overington, J. P.; Al-Lazikani, B.; Hopkins, A. L. How many drug targets are there? Nat. Rev. Drug Discov. 2006, 5, 993-996.
45. Hopkins, A. L.; Keseru, G. M.; Leeson, P. D.; Rees, D. C.; Reynolds, C. H. The role of ligand efficiency metrics in drug discovery. Nat. Rev. Drug Discov. 2014, 13, 105-121.
46. Kenny, P. W.; Leitao, A.; Montanari, C. A. Ligand efficiency metrics considered harmful. J. Comput. Aided Mol. Des. 2014, 28, 699-710.
47. Reynolds, C. H.; Tounge, B. A.; Bembenek, S. D. Ligand binding efficiency: trends, physical basis, and implications. J. Med. Chem. 2008, 51, 2432-2438.
48. Mallinson, J.; Collins, I. Macrocycles in new drug discovery. Future Med. Chem. 2012, 4, 1409-1438.
49. Chen, I. J.; Foloppe, N. Tackling the conformational sampling of larger flexible compounds and macrocycles in pharmacology and drug discovery. Bioorg. Med. Chem. 2013, 21, 7898-7920.
50. Veber, D. F.; Johnson, S. R.; Cheng, H. Y.; Smith, B. R.; Ward, K. W.; Kopple, K. D. Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem. 2002, 45, 2615-2623.
51. Varma, M. V.; Obach, R. S.; Rotter, C.; Miller, H. R.; Chang, G.; Steyn, S. J.; El-Kattan, A.; Troutman, M. D. Physicochemical space for optimum oral bioavailability: contribution of human intestinal absorption and first-pass elimination. J. Med. Chem. 2010, 53, 1098-1108.
52. He, M. W.; Lee, P. S.; Sweeney, Z. K. Promiscuity and the conformational rearrangement of drug-like molecules: insight from the Protein Data Bank. ChemMedChem 2014, 10, 238-244.
53. Haupt, V. J.; Daminelli, S.; Schroeder, M. Drug promiscuity in PDB: protein binding site similarity is key. PLoS ONE 2013, 8, e65894.
54. DeLorbe, J. E.; Clements, J. H.; Whiddon, B. B.; Martin, S. F. Thermodynamic and structural effects of macrocyclic constraints in protein-ligand interactions. ACS Med. Chem. Lett. 2010, 1, 448-452.
55. Watts, K. S.; Dalal, P.; Tebben, A. J.; Cheney, D. L.; Shelley, J. C. Macrocycle conformational sampling with MacroModel. J. Chem. Inf. Model. 2014, 54, 2680-2696.
56. Sondergaard, C. R.; Garrett, A. E.; Carstensen, T.; Pollastri, G.; Nielsen, J. E. Structural artifacts in protein-ligand X-ray structures: implications for the development of docking scoring functions. J. Med. Chem. 2009, 52, 5673-5684.
57. Liebeschuetz, J.; Hennemann, J.; Olsson, T.; Groom, C. R. The good, the bad and the twisted: a survey of ligand geometry in protein crystal structures. J. Comput. Aided Mol. Des. 2012, 26, 169-183.
58. Baker, M. Fragment-based lead discovery grows up. Nat. Rev. Drug Discov. 2013, 12, 5-7.
59. Marsault, E.; Peterson, M. L. Macrocycles are great cycles: applications, opportunities, and challenges of synthetic macrocycles in drug discovery. J. Med. Chem. 2011, 54, 1961-2004.

TOC GRAPHIC
Original dataset: comprehensive list of 475 drugs and clinical candidates, 500-3000 Da

extended Ro5 (all of): MW 500-700 Da, ClogP 0-7.5, HBD ≤5, HBA ≤10, PSA ≤200 Å², NRotB ≤20
eRo5, N = 195, 71% oral

beyond Ro5 (MW >500 Da and at least one of): MW 700-3000 Da, ClogP <0 or >7.5, HBD >5, HBA >10, PSA >200 Å², NRotB >20
bRo5, N = 280, 30% oral

(a) (b)

Figure 1.
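The eRo5 and bRo5 definitions summarised in this figure translate directly into a small classification rule. The following sketch is one possible reading of those criteria; the function name and example values are hypothetical, and the handling of compounds lying exactly on a boundary (for example MW of exactly 700 Da) is an assumption, since the figure does not specify it.

```python
def classify_chemical_space(mw, clogp, hbd, hba, psa, nrotb):
    """Assign a compound to eRo5 or bRo5 space from six descriptors."""
    # The dataset only covers compounds between 500 and 3000 Da.
    if not 500 < mw <= 3000:
        return "outside dataset"
    # bRo5: MW > 500 Da and at least one property beyond the eRo5 limits.
    beyond = (
        mw > 700
        or clogp < 0
        or clogp > 7.5
        or hbd > 5
        or hba > 10
        or psa > 200
        or nrotb > 20
    )
    return "bRo5" if beyond else "eRo5"


# Made-up descriptor sets, not taken from the dataset.
print(classify_chemical_space(650, 5.0, 3, 9, 150, 12))
print(classify_chemical_space(650, 5.0, 6, 9, 150, 12))
```

Because bRo5 membership needs only one property outside the eRo5 limits, a compound of moderate molecular weight can still fall in bRo5 space through, for instance, a high hydrogen bond donor count.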
Ro5: Rask-Andersen et al. dataset of approved drugs → rule of 5, filtered by all of: MW ≤500 Da, ClogP 0-5, HBD ≤5, HBA ≤10 → primary target selected for each drug → Ro5 drug-target pairs, N = 579, approved
eRo5: N = 195, 71% oral (59 App., 32 PIII, 67 PII, 37 PI) → target classification similar to Rask-Andersen et al., removal of unclassified compounds → eRo5 drug-target pairs, N = 185, 71% oral, 30% approved
bRo5: N = 280, 30% oral (119 App., 37 PIII, 88 PII, 36 PI) → target classification similar to Rask-Andersen et al., removal of unclassified compounds → bRo5 drug-target pairs, N = 228, 31% oral, 38% approved

(a) (b)

Figure 2.
Ro5: ChEMBL drugs dataset, N = 10,460 → rule of 5, filtered by all of: MW ≤500 Da, ClogP 0-5, HBD ≤5, HBA ≤10 → clustered by physicochemical properties
eRo5: N = 195, 71% oral (59 App., 32 PIII, 67 PII, 37 PI)
bRo5: N = 280, 30% oral (119 App., 37 PIII, 88 PII, 36 PI)

Cross-referencing with the PDB and selection of crystal structures: resolution 0-3.8 Å, good density at binding site interface, clinically relevant drug-target complex

All structures (results in Supporting Information)
Ro5: all representative Ro5 drug-target structures, N = 37, 86% oral, all approved
eRo5: all eRo5 drug-target structures, N = 47, 72% oral (21 App., 5 PIII, 11 PII, 10 PI)
bRo5: all bRo5 drug-target structures, N = 46, 65% oral (32 App., 5 PIII, 8 PII, 1 PI)

Filtering to remove redundant drug class-target binding sites, e.g. Erythromycin A was selected to represent all erythronolide-ribosome complexes

Non-redundant structures (results in paper & Supporting Information)
Ro5: non-redundant Ro5 structures, N = 29, 86% oral, all approved
eRo5: non-redundant eRo5 structures, N = 26, 58% oral (11 App., 4 PIII, 7 PII, 4 PI)
bRo5: non-redundant bRo5 structures, N = 22, 48% oral (17 App., 1 PIII, 3 PII, 1 PI)

Figure 3.
Table 1. Analysed non-redundant drug-target complexes in extended and beyond Ro5 chemical space

Compound name | Macromolecule (target) | Indication | PDB code | Oral/Parenteral^a | Clinical/Approved

Beyond Ro5 (N=22)
Argatroban | Thrombin | Haematology | 1DWC | parenteral | App
β-Acarbose | α-Amylase | Endocrinology | 1PPI | parenteral | App
Birinapant | E3 ubiquitin-protein ligase XIAP | Oncology | 4KMP | parenteral | Phase II
Capreomycin | Ribosome | Infection | 3KNL | parenteral | App
Cyclosporine A | Cyclophilin A | Immunology | 1CWA | oral | App
Dactinomycin | DNA | Oncology | 1I3W | parenteral | App
Doxorubicin | DNA | Oncology | 1P20 | parenteral | App
Eptifibatide | Integrin alpha-IIB | Cardiovascular | 2VDN | parenteral | App
Erythromycin A | Ribosome (D. radiodurans) | Infection | 1JZY | oral | Phase III
Etoposide | DNA topoisomerase-IIb | Oncology | 3QX3 | oral | App
Eritoran | Toll-like receptor 4 | Infection | 2Z65 | parenteral | App
Itraconazole | Lanosterol 14-α demethylase | Infection | 4K0F | oral | App
Ivermectin 22,23-dihydro B1a | Glutamate-gated chloride channel | Infection | 3RHW | oral | App
Navitoclax | B cell lymphoma-2, Bcl-2 | Oncology | 4LVT | oral | Phase II
Ouabain | Na-K ATPase | Cardiovascular | 3A3Y | oral | App
Paclitaxel | Tubulin α-chain | Oncology | 1JFF | oral | Phase II
PF-03715455 | Mitogen-activated protein kinase | Respiratory | 2YIS | parenteral | Phase I
Quinupristin | Ribosome (H. marismortui) | Infection | 1YJW | parenteral | App
Rapamycin | FK506 binding protein | Immunology | 4DRI | oral | App
Rifampicin | DNA-directed RNA polymerase | Infection | 4KMU | oral | App
Simeprevir | Hepatitis C virus NS3/4A protease | Infection | 3KEE | oral | App
Thiostrepton | Ribosome (D. radiodurans) | Infection | 3CF5 | parenteral | App

Extended Ro5 (N=26)
Aliskiren | Renin | Cardiovascular | 2V0Z | oral | App
AMG-131 | Peroxisome proliferator-activated receptor-γ | Cardiovascular | 3FUR | oral | Phase II
Atorvastatin | HMG-CoA reductase | Cardiovascular | 1HWK | oral | App
BGJ-398 | Fibroblast growth factor 1 | Oncology | 3TT0 | oral | Phase I
BMS-777607 | Hepatocyte growth factor receptor | Oncology | 3F82 | oral | Phase I
BMS-791325 | Hepatitis C virus NS5b subunit | Infection | 4NLD | oral | Phase II
Ceritinib | Anaplastic lymphoma kinase | Oncology | 4MKC | oral | Phase II
Cobimetinib | Mitogen-activated protein kinase kinase | Oncology | 4AN2 | oral | Phase III
Dalfopristin | Ribosome (D. radiodurans) | Infection | 1SM1 | parenteral | App
EPZ-5676 | DOT1-like histone H3 methyltransferase | Oncology | 4HRA | parenteral | Phase I
Ergotamine | Serotonin receptor 1B chimera | Pain | 4IAR | parenteral | App
Fedratinib | Bromodomain BRD4 | Oncology | 4OGJ | oral | Phase III
Homoharringtonine | Ribosome (H. marismortui) | Infection | 3G6E | parenteral | App
Intedanib | Vascular endothelial growth factor receptor 2 | Oncology | 3C7Q | parenteral | Phase III
Ispinesib | Kinesin Eg5 | Oncology | 4A5Y | parenteral | Phase II
Lapatinib | Epidermal growth factor receptor | Oncology | 1XKK | oral | App
Lonafarnib | Protein farnesyltransferase | Oncology | 1O5M | oral | Phase II
Mometasone furoate | Glucocorticoid receptor | Respiratory | 4P6W | parenteral | App
Nilotinib | Tyrosine-protein kinase ABL1 | Oncology | 3CS9 | oral | App
Pictilisib | Phosphoinositide-3 kinase | Oncology | 3DBS | oral | Phase II
Pseudomonic acid A | Isoleucyl-tRNA synthetase | Infection | 1QU2 | parenteral | App
PU-H71 | Heat shock protein-90 | Oncology | 2FWZ | parenteral | Phase I
Saquinavir | HIV-1 protease | Infection | 3OXC | oral | App
Taladegib | Smoothened homolog | Oncology | 4JKV | oral | Phase II
Tubocurarine | Soluble acetylcholine receptor | Anaesthesiology | 3PMZ | parenteral | App
Volasertib | Polo-like kinase 1 | Oncology | 3FC2 | parenteral | Phase III

^a Route of administration used in the indicated phase of development.
Figure 4. Panels a) to e).
Figure 5. Panels a) to c).
Figure 6. Panels a) to c).
Figure 7.
Figure 8. Panels a) to d).
Figure 9. Panels a) to d).
Table 2. Summary of interactions between drugs and clinical candidates and their targets by chemical space^a

Property^a | Ro5 | extended Ro5 | beyond Ro5^b
Target classes | GPCR, ion channel, nuclear hormone receptor | GPCR, protease/hydrolase, kinase, transferase, structural & adhesion, other | GPCR, protease/hydrolase, transferase, isomerases, structural & adhesion, enzyme regulators, other
Shape | internal, pocket | groove, internal, pocket | flat, groove
Buried ligand surface area | 285-575 Å² | 440-760 Å² | 415-820 Å²
Ligand buried surface area proportion | 63-99% | 56-96% | 36-78%
Ligand interface non-polar atom proportion | 60-88% | 64-81% | 63-78%
H bond interactions | 0-4 Acc., 0-2 Don. | 1-4 Acc., 0-3 Don. | 1-7 Acc., 0-3 Don.
-log(Affinity) | 4.6-9.7 | 7.3-10.2 | 6.1-9.9
LE (kcal/mol per heavy atom) | 0.27-0.66 | 0.23-0.39 | 0.14-0.26^c
LLE | 2.0-8.1 | 2.5-7.0 | 1.6-10.2

Flexibility in bRo5 space:
    Macrocyclisation favours disk and sphere-like shapes for flat binding sites.
    Rings and π-systems (aromatics, amides, etc.) favour rod-like shapes for groove and pocket binding sites.

^a Property values shown are the 10th to 90th percentiles of all structures, encompassing both orals and parenterals. Orals and parenterals show very similar property values (Supporting Information Figures S24-28).
^b Boxes indicate differences between bRo5 space and small molecule drug space.
^c LE values should be adjusted based on the size of the ligand and the shape of the binding site during optimization, with the final aim of being in this range and as high as possible.
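
The LE and LLE rows in Table 2 can be made concrete with a short calculation. This is a sketch under stated assumptions: the table does not give the formulas, so the commonly used definitions are assumed here (LE as binding free energy per heavy atom, with the free energy estimated from -log(Affinity); LLE as -log(Affinity) minus ClogP). The function names, the temperature of 298.15 K, and the input values in the usage lines are all illustrative choices, not values from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(p_affinity: float, heavy_atoms: int,
                      temp_k: float = 298.15) -> float:
    """LE in kcal/mol per heavy atom (assumed definition).

    Uses dG = -R*T*ln(10)*pAffinity and LE = -dG / heavy atom count.
    """
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    delta_g = -R_KCAL * temp_k * math.log(10) * p_affinity
    return -delta_g / heavy_atoms


def lipophilic_ligand_efficiency(p_affinity: float, clogp: float) -> float:
    """LLE as pAffinity minus ClogP (assumed definition)."""
    return p_affinity - clogp


# Usage with invented inputs: a 60-heavy-atom ligand with -log(Affinity) = 9
le = ligand_efficiency(9.0, 60)
lle = lipophilic_ligand_efficiency(9.0, 4.5)
print(f"LE = {le:.2f} kcal/mol per heavy atom, LLE = {lle:.1f}")
```

Under these assumptions, a large ligand needs high affinity just to stay inside the bRo5 LE range reported in the table, which is consistent with footnote c: LE targets have to be judged relative to ligand size.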
TOC GRAPHIC
Structures. Drug chemical space: Rule of 5, extended Ro5, beyond Ro5.
Similar polarity, interactions and affinity.
Small, internal & pocket, LE (Rule of 5); large, open, flat & grooves, LE (beyond Ro5).