Deep Learning For Drug Discovery: Wengong Jin Massachusetts Institute of Technology
Drug discovery is a challenging search problem
Data source: PhRMA.org
Number of possible drug-like molecules ≈ 10^60 (Kirkpatrick et al., 2004)
• Experimental facilities in industry can only test 10^5 compounds/day
Let AI find good drugs!
Computational drug discovery: three schemes
Simulation | Virtual screening | De novo drug design
[Figure: schematic of the different approaches toward molecular design. Figure source: Sanchez-Lengeling et al., Science 361, 360–365 (2018)]
Simulation is often too slow
Simulation | Virtual screening | De novo drug design
Simulation: takes days for one compound
[Figure: schematic of the different approaches toward molecular design. Figure source: Sanchez-Lengeling et al., Science 361, 360–365 (2018)]
Virtual screening
• Virtual screening models (Walters et al., 1998; McGregor et al., 2007; …)
Compound → Virtual screening model → Prediction: good! → Experiments
• Virtual screening is much faster than experimental screening in wet labs. It can test 10^8 compounds within a day, while experimental screening takes years
• Advantage: no need to synthesize
• De novo drug design: directly generate a compound with desired properties (Moon et al., 1991; Clark et al., 1995; Schneider & Fechner, 2005; …)
Property criteria (potency, safety, …) → Drug design model → A good drug → Experiments
We need to solve an inverse problem
[Figure: Figure 2. Initial Model Training and the Identification of Halicin.
(A) Primary screening data for growth inhibition of E. coli by 2,560 molecules within the FDA-approved drug library supplemented with a natural product collection. Shown is the mean of two biological replicates. Red are growth inhibitory molecules; blue are non-growth inhibitory molecules.
(B) ROC-AUC plot evaluating model performance after training. Dark blue is the mean of six individual trials (cyan).
(C) Rank-ordered prediction scores of Drug Repurposing Hub molecules that were not present in the training dataset.
(D) The top 99 predictions from the data shown in (C) were curated for empirical testing for growth inhibition of E. coli. Fifty-one of 99 molecules were true positives based on a cut-off of OD600 < 0.2. Shown is the mean of two biological replicates.]
De novo drug design: inherent trade-off
• Virtual screening is restricted to commercially available compounds (e.g., ZINC library)
• Limitation 2: traditional techniques explore the space based on hand-designed rules (e.g., genetic algorithms)
[Figure axes: ease for synthesis vs. coverage; labeled region: de novo drug design]
Deep learning: a promising direction
• Deep learning has achieved human-level accuracy in computer vision (He et al., 2016)
Feature learning
• Deep generative models can generate realistic text and images with desired properties
"Generate an image of an armchair in the shape of an avocado" → Deep generative models → image (Ramesh et al., 2020)
• De novo drug design: generate a compound with desired properties
Property criteria (potency, safety, …) → Use deep generative models → A good drug
Silver et al., “Mastering the game of Go with deep neural networks and tree search”, Nature (2016).
Ramesh et al., “DALL-E: creating images from text”, OpenAI blog
Main technique: graph neural networks
Graph encoding (graphs → property): virtual screening / molecular property prediction (Duvenaud et al., 2015; Kearnes et al., 2016; Jin et al., 2017; Gilmer et al., 2017; …)
Graph generation (property → graphs): de novo drug design (Olivecrona et al., 2018; Gomez-Bombarelli et al., 2018; Jin et al., 2018; Popova et al., 2018; …)
Property = numerical attributes
Example: discovery of new antibiotics
Part 1: antibiotic discovery
History of antibiotic discovery
• Since the 1990s, we have struggled to discover novel antibiotic classes (Silver et al., 2011; Brown et al., 2014; Shore & Coukell, 2016)
Figure source: ReAct group. FDA = U.S. Food and Drug Administration
Virtual screening for antibiotic discovery
• Through collaboration with the Broad Institute, we collected 2,560 molecules with measured growth inhibition against E. coli (BW25113)
Traditional approach: hand-crafted features
• Traditional methods are based on fixed, hand-engineered molecular features.
• Molecular weight, number of heavy atoms, etc.
• Problem: we don’t know all the antibacterial patterns
• Graph neural networks automatically learn a feature representation from data
Traditional: Compound → Features designed by experts → Model → Prediction: good!
GNN: Compound → Features are learned automatically → Prediction: good!
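A minimal sketch of what "fixed, hand-engineered features" means in practice, assuming a molecule is given only as a molecular formula. The function names are hypothetical, and the formula used for halicin (C5H3N5O2S3) is supplied here for illustration and does not come from the slides. Real pipelines compute such descriptors with a cheminformatics toolkit.

```python
# Standard atomic masses, rounded to three decimals.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def molecular_weight(formula):
    """Sum of atomic masses; `formula` maps element symbol -> atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def heavy_atom_count(formula):
    """Number of non-hydrogen atoms."""
    return sum(count for element, count in formula.items() if element != "H")


def handcrafted_features(formula):
    """A fixed feature vector: it is the same whatever property we predict."""
    return [molecular_weight(formula), heavy_atom_count(formula)]


# Assumed formula for halicin (illustration only): C5H3N5O2S3.
halicin = {"C": 5, "H": 3, "N": 5, "O": 2, "S": 3}
features = handcrafted_features(halicin)
```

Descriptors like these are chosen before any data is seen, which is the limitation the slide points at: if the relevant antibacterial pattern is not among them, the model cannot recover it.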
Graph neural network (GNN)
• Rich history of GNNs (Gori et al., 2005, Scarselli et al., 2009, Duvenaud et
al. 2015, Kearnes et al. 2016, Jin et al., 2017, Gilmer et al., 2017, Zitnik et
al., 2018, etc.)
Graph neural network (GNN)
Graph convolution → Pooling
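The two stages named on this slide, graph convolution followed by pooling, can be sketched without any library. This is a deliberately simplified illustration: a real GNN applies learned weight matrices and nonlinearities at every step, which are omitted here, and the toy graph and function names are assumptions.

```python
def graph_convolution(node_features, neighbors):
    """One round of message passing: each atom adds its neighbors' vectors to its own."""
    updated = []
    for atom, own_vector in enumerate(node_features):
        message = list(own_vector)
        for neighbor in neighbors[atom]:
            message = [m + x for m, x in zip(message, node_features[neighbor])]
        updated.append(message)
    return updated


def sum_pooling(node_features):
    """Collapse all atom vectors into one molecule-level vector."""
    return [sum(column) for column in zip(*node_features)]


# Toy molecule C-C-O with one-hot atom features [is_carbon, is_oxygen].
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

hidden = graph_convolution(features, neighbors)
molecule_vector = sum_pooling(hidden)
```

After one convolution each atom vector summarizes its immediate neighborhood. Stacking more rounds widens that neighborhood, and the pooled vector is what a property predictor would consume.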
Graph neural network (GNN)
Hand-crafted features: fixed features, learned mapping to the antibacterial property
Deep learned features: learned features, learned mapping to the antibacterial property
• We virtually screened 10^4 compounds in the Broad drug repurposing hub
• We experimentally tested the top 99 compounds at the Broad Institute
• 51 of them are indeed antibacterial (hit rate = 51.5%)
51 drugs → filters: low toxicity, structural novelty → Compound SU3327 (renamed as Halicin)
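The hit rate quoted above is the fraction of the top-ranked predictions that were confirmed experimentally. A small sketch of that calculation follows; the compound names, scores, and assay outcomes are hypothetical toy data, and only the 51-of-99 figure comes from the slide.

```python
def top_k_hit_rate(predicted_scores, measured_active, k):
    """Fraction of the k highest-scoring compounds that are experimentally active."""
    ranked = sorted(predicted_scores, key=predicted_scores.get, reverse=True)
    top_k = ranked[:k]
    hits = sum(1 for compound in top_k if measured_active[compound])
    return hits / len(top_k)


# Hypothetical toy data: model scores and assay outcomes for four compounds.
predicted_scores = {"cpd_a": 0.91, "cpd_b": 0.15, "cpd_c": 0.78, "cpd_d": 0.40}
measured_active = {"cpd_a": True, "cpd_b": False, "cpd_c": False, "cpd_d": True}
toy_rate = top_k_hit_rate(predicted_scores, measured_active, k=2)

# The figure reported on the slide: 51 validated hits among the top 99.
reported_rate = 51 / 99
```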
• Halicin shows potent growth inhibition against E. coli in vitro
• It is also structurally different from known antibiotics
[Figure panels: inhibition (OD at 600 nm vs. halicin concentration); low Tanimoto similarity to existing antibiotics (training set, Broad library)]
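Structural novelty in these panels is measured with Tanimoto similarity. A minimal sketch, assuming each molecule is represented as a set of "on" fingerprint bits: the bit sets below are made up, and a real workflow would derive them from a fingerprinting method such as Morgan fingerprints.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Shared bits divided by total distinct bits (Jaccard index on bit sets)."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def max_similarity(candidate, reference_set):
    """Similarity to the nearest known molecule; low values suggest novelty."""
    return max(tanimoto(candidate, reference) for reference in reference_set)


# Made-up fingerprints: one candidate and two known antibiotics.
candidate = {1, 4, 9, 16}
known_antibiotics = [{1, 2, 3, 4}, {9, 10, 11}]
nearest = max_similarity(candidate, known_antibiotics)
```

A candidate whose nearest-neighbor similarity to every known antibiotic is low is "structurally different" in the sense used on the slide.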
Strong in vivo inhibition of pan-resistant A. baumannii
Strong in vivo inhibition against resistant C. difficile
[Figure: in vivo results for A. baumannii CDC 288 and C. difficile; treatment groups: vehicle, metronidazole, halicin; x-axis: time after infection (hours)]
• Applied the same model to screen compounds in the ZINC library
[Figure: ZINC15 predictions and candidate compounds (e.g., ZINC000100032716, ZINC000225434673); panels show prediction score, Tanimoto score, molecular weight (Daltons), LogP, minimum inhibitory concentration (MIC, μg/ml), and CFU/ml over time for BW25113]
Compare GNN with other models
• Only GNN ranks Halicin among the top 100 compounds.
Model                         Feature                      Rank of Halicin
Feed-forward neural network   RDKit features (fixed)       273
Feed-forward neural network   Morgan fingerprint (fixed)   1217
Learned features are better than hand-designed features
Part 2: infuse biological knowledge in GNNs
• Part 1: graph neural networks for antibiotic discovery
[ICML’17, NeurIPS’17, JCIM’19, Cell’20]
Motivation for biology-aware models
[Diagram labels: biology-aware representation; biology-aware property]
Case study: COVID-19 drug combinations
Mortality rate in a recent clinical trial
(Beigel et al., 2020)
Placebo 15.2%
Case study: COVID-19 drug combinations
• Two drugs A and B are synergistic if effect(A, B) >> effect(A) + effect(B)
• Challenge: training data is limited (fewer than 200 drug combinations), but deep neural networks are very data hungry
Big data + Neural network vs. Data + Knowledge + Neural network
Biological knowledge of viral replication
How can a drug block COVID-19 infection?
ComboNet incorporates biology & chemistry
• Synergy comes from inhibition of certain biological targets (e.g., proteins)
[Diagram: chemical representation (to be learned) → … → antiviral activity]
ComboNet learns drug-target interaction
1. Predict drug-target interaction: whether drug A inhibits target B
Targets involved in COVID-19 infection: 3CLpro, ACE2, …, HDAC2
Drug-target interaction data (ChEMBL and NCATS): too sparse to use as features. Instead, predict whether a drug inhibits a biological target
Learned representation of the molecular structure → predicted inhibition of each target
[Figure: compounds × targets matrix of drug-target interaction values between 0 and 1]
ComboNet learns antiviral activity
2. Single-agent antiviral activity prediction
Target inhibition (…, HDAC2) → Antiviral activity pA
Single-drug antiviral activity data (NCATS):
Drug:        Reserpine   Remdesivir   Penicillin   Halicin
Antiviral?:  Yes         Yes          No           No
ComboNet learns antiviral synergy
3. Predict synergy for drug combinations
Compound A → zA (3CLpro, ACE2, …, HDAC2) → single-drug antiviral activity pA
Compound B → zB (3CLpro, ACE2, …, HDAC2) → single-drug antiviral activity pB
Feature representation of drug combination (A, B): zAB = zA + zB − zA ⋅ zB
zAB → combination antiviral activity pAB → Bliss synergy sAB
Training data: combination synergy data (NCATS)
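The combination rule zAB = zA + zB − zA ⋅ zB can be sketched as follows, reading each entry of z as the predicted probability that a drug inhibits one target. The synergy score below uses the standard Bliss independence excess. Whether ComboNet computes sAB in exactly this way is an assumption here, and the function names and numbers are illustrative.

```python
def combine_target_profiles(z_a, z_b):
    """z_AB = z_A + z_B - z_A * z_B, applied independently to each target."""
    return [a + b - a * b for a, b in zip(z_a, z_b)]


def bliss_expected(p_a, p_b):
    """Expected combined activity if the two drugs act independently."""
    return p_a + p_b - p_a * p_b


def bliss_synergy(p_ab, p_a, p_b):
    """Excess of the combined activity over the Bliss expectation (> 0: synergy)."""
    return p_ab - bliss_expected(p_a, p_b)


# Illustrative inhibition probabilities for two targets.
z_a = [0.9, 0.2]
z_b = [0.5, 0.0]
z_ab = combine_target_profiles(z_a, z_b)

# Illustrative activities: each drug alone reaches 0.5, the pair reaches 0.9.
synergy = bliss_synergy(p_ab=0.9, p_a=0.5, p_b=0.5)
```

The same "A or B" formula appears twice: once on target-inhibition vectors to build the combination's features, and once on activities as the baseline that a synergistic pair must exceed.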
ComboNet performance
• Training set (88 drug combinations); Test set (71 drug combinations)
ComboNet AUC is 0.8 on average
Removing chemical or biological information hurts
Standard models cannot generalize
[Bar chart; y-axis: ROC-AUC]
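For readers unfamiliar with the metric, ROC-AUC is the probability that a randomly chosen positive example is scored above a randomly chosen negative one. A minimal pairwise sketch with made-up labels and scores (not the ComboNet test set):

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Made-up example: 1 = synergistic combination, 0 = not synergistic.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which puts the reported average of 0.8 in context.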
Discover new drug combinations
• Collaboration with the National Center for Advancing Translational Sciences (NCATS)
Part 3: de novo drug design
• Part 1: graph neural networks for antibiotic discovery
[ICML’17, NeurIPS’17, JCIM’19, Cell’20]
Motivation for de novo drug design
• Deep learning can discover new antibiotics and COVID-19 drugs
• Simple approach: train a GNN to rank all the compounds in our library
• Problem: number of drug-like molecules = 10^60. We can’t rank all of them.
Compound library (10^4 to 10^8) → Candidates
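A quick back-of-the-envelope check of why exhaustive ranking is hopeless. The chemical-space size of 10^60 is the figure used in the talk; the throughput of 10^8 predictions per day is a hypothetical rate assumed only for this calculation.

```python
CHEMICAL_SPACE = 10**60       # drug-like molecules (figure used in the talk)
ASSUMED_RATE_PER_DAY = 10**8  # hypothetical model throughput, for illustration


def days_to_rank(num_molecules, rate_per_day):
    """Days needed to score every molecule once (rounded up)."""
    return -(-num_molecules // rate_per_day)


days = days_to_rank(CHEMICAL_SPACE, ASSUMED_RATE_PER_DAY)
years = days // 365

# Share of chemical space covered by even a 10^8-compound library.
library_fraction = 10**8 / CHEMICAL_SPACE
```

Under these assumptions the scan would take 10^52 days, so a ranking model can only ever see a vanishing fraction of the space.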
Graph generation for de novo drug design
• Learn a distribution whose mass is concentrated around “good” molecules
• It can efficiently explore the entire chemical space (10^60 molecules)
How to generate molecular graphs?
Previous solution 1: sequence-based methods
• Prior work used recurrent neural networks to generate molecular graphs
Weininger, D. SMILES, a chemical language and information system. Journal of Chemical Information and Computer Sciences, 28(1):31–36, 1988.
Problems of sequence-based approach
• Prior work used sequence-based generative models for molecular graphs
Previous solution: node-by-node generation
• A straightforward approach: generate a graph node-by-node (Liu et al., 2018)
Add nodes one by one ……
• In total: O(N^2) edge predictions
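The quadratic count can be verified with a short sketch. It assumes the simplest scheme, in which every newly added node requires one edge decision per node already in the graph. This is an illustrative model of the cost, not a description of the cited implementation.

```python
def edge_predictions(num_nodes):
    """Total edge decisions when nodes are added one at a time."""
    total = 0
    for existing_nodes in range(num_nodes):
        # The new node is compared against every node already present.
        total += existing_nodes
    return total


small = edge_predictions(4)
large = edge_predictions(100)
```

The total is N(N − 1)/2, so a 100-atom molecule already needs several thousand edge decisions, each of which is a chance to make an error.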
Failure of node-by-node generation
• Node-by-node generation via a variational autoencoder (VAE) (Liu et al., 2018)
Example: the COVID-19 drug remdesivir is encoded and then decoded; input and output should be the same
[Figure: reconstruction accuracy vs. molecule size (number of atoms, 20 to 100)]
We need to leverage inductive bias
Inductive bias?
Increasing complexity: Sequence (text) → Grid graph (images) → Molecular graphs (low treewidth) → Dense graphs
Junction tree variational autoencoder
Motifs are small due to low treewidth
Tree decomposition: molecular graph → junction tree of motifs
Motif vocabulary: 250K graphs ⇒ 638 motifs, 99.9% coverage (new graphs)
Molecular graph → encode → molecular representation → decode → molecular graph
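How a motif vocabulary and its coverage could be measured, as a sketch. It assumes each graph has already been decomposed into a list of motif names (the tree decomposition itself is not implemented here) and that coverage is counted per graph. Both are assumptions, and the motif names are toy data.

```python
def build_vocabulary(decomposed_graphs):
    """Collect every distinct motif seen in the training graphs."""
    vocabulary = set()
    for motifs in decomposed_graphs:
        vocabulary.update(motifs)
    return vocabulary


def coverage(vocabulary, new_graphs):
    """Fraction of new graphs whose motifs are all in the vocabulary."""
    covered = sum(1 for motifs in new_graphs if set(motifs) <= vocabulary)
    return covered / len(new_graphs)


# Toy decompositions: each graph is a list of motif names.
training = [["benzene", "C-N", "C-O"], ["benzene", "C-C"], ["pyridine", "C-N"]]
held_out = [["benzene", "C-C", "C-N"], ["pyridine", "C-O"], ["furan", "C-C"]]

vocabulary = build_vocabulary(training)
held_out_coverage = coverage(vocabulary, held_out)
```

High coverage on unseen graphs is what makes a small, fixed vocabulary usable as the decoder's building blocks.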
Hierarchical graph encoder
Node vector → Propagate node vectors → Motif vector
Hierarchical graph decoder
Motif vocabulary
Hierarchical graph decoder
Motif-by-motif generation
Motif-by-motif versus node-by-node
• Training objective: minimize reconstruction loss
The encoded and decoded molecule should be the same
[Figure: reconstruction accuracy vs. molecule size (number of atoms, 20 to 100); curves: motif-by-motif generation and node-by-node generation]
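Reconstruction accuracy, as plotted here, is the fraction of molecules that survive an encode-decode round trip unchanged. The sketch below uses a hypothetical stand-in codec that keeps at most four tokens, purely so that the metric has something to measure. It is not a VAE and not the model from the talk.

```python
def reconstruction_accuracy(molecules, encode, decode):
    """Fraction of molecules for which decode(encode(m)) returns m exactly."""
    exact = sum(1 for m in molecules if decode(encode(m)) == m)
    return exact / len(molecules)


# Hypothetical lossy codec: the "latent code" keeps only the first four tokens.
def toy_encode(molecule):
    return tuple(molecule[:4])


def toy_decode(code):
    return list(code)


# Toy molecules written as token lists of different lengths.
molecules = [["C", "C", "O"], ["C", "N", "C", "O", "C"], ["N"]]
accuracy = reconstruction_accuracy(molecules, toy_encode, toy_decode)
```

Only the five-token molecule fails to round-trip in this toy setting, which mirrors the trend on the slide: exact reconstruction gets harder as molecules grow.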
Results: molecular optimization
• Task: learn to modify a non-drug-like molecule into a drug-like molecule
A local modification significantly improves drug-likeness
[Bar chart: success rate for sequence, node-by-node, and motif-by-motif methods; axis ticks 50 to 70; one bar labeled 58.5]
Part 3: de novo drug design
• Part 1: graph neural networks for antibiotic discovery
[ICML’17, NeurIPS’17, JCIM’19, Cell’20]
Deep learning for molecular sciences
Deep learning applied to:
Drug discovery (e.g., de novo drug design)
• Dahl et al., 2015;
Chemistry (e.g., reaction prediction)
• Duvenaud et al., 2015;
Thanks to my collaborators
Regina Barzilay Tommi Jaakkola Klavs Jensen William Green Phillip A. Sharp James Collins Caroline Uhler
Rafael Gomez-Bombarelli Connor Coley Camille Bilodeau Peter Sorger Rachel Wu Jonathan Stokes Kyle Swanson
David Alvarez-Melis Guang-He Lee Allison Tam Nienke Moret Anne Fischer Kevin Yang Tao Lei