
Pharmacophores in Chemoinformatics:
1. Pharmacophore Patterns & Topological Fingerprints
Dragos Horvath
Laboratoire d’InfoChimie
UMR 7177 CNRS – Université de Strasbourg
horvath@chimie.u-strasbg.fr
The Pharmacophore Way of Life – A Medicinal
Chemist’s Dream
• (Bio)Molecular Recognition is based on ligand-site
interactions of extremely complicated nature
– Understanding them requires a solid knowledge of statistical
physics and, therefore, of higher maths…
– But medicinal chemists hate maths… so they developed a
simplified rule set to rationalize ligand binding.
• Functional groups of similar physicochemical behavior
represent pharmacophore types:
– Hydrophobic, Aromatic, Hydrogen Bond (HB) donors, Cations,
HB Acceptors, Anions.
– Now, we just need to know how each of the six types interacts
with the site… welcome to the “pharmacophore” paradigm,
farewell higher maths (for the moment, at least)
The Interaction Saga: (1) van der Waals
Interactions
• Atoms are more or less hard spheres – squeezing them
against each other causes a sharp rise in energy:
– $E_{rep} = A_{ij}\,d^{-12}$
• At distances larger than the sum of their « van der Waals
spheres », an attractive term due to dipole-induced dipole
interactions (London dispersion term) is predominant…
– $E_{att} = -B_{ij}\,d^{-6}$
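The two terms above can be combined into a single pair energy. The sketch below is only an illustration of the formulas on this slide: the function names and the numerical coefficients in the usage note are made up, and real force fields tabulate the coefficients per atom-type pair.

```python
def vdw_energy(d, a_ij, b_ij):
    """Sum of the repulsive d^-12 term and the attractive d^-6 term."""
    e_rep = a_ij / d**12      # E_rep = A_ij * d^-12
    e_att = -b_ij / d**6      # E_att = -B_ij * d^-6
    return e_rep + e_att


def optimal_distance(a_ij, b_ij):
    """Distance where the summed energy is minimal.

    Setting dE/dd = -12*A/d^13 + 6*B/d^7 = 0 gives d^6 = 2*A/B.
    """
    return (2.0 * a_ij / b_ij) ** (1.0 / 6.0)
```

With the illustrative coefficients A=1 and B=2, `optimal_distance(1.0, 2.0)` returns 1.0 and `vdw_energy(1.0, 1.0, 2.0)` returns -1.0; squeezing the atoms closer makes the energy rise steeply.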
The Interaction Saga: (2) Electrostatics &
Solvation
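• Coulomb charge-charge interactions are easy to compute, once the partial charges Qk are assigned on the atoms…
– $E_{Coul} = Q_i Q_j / (4\pi\varepsilon d)$
• … and the solvent molecules are explicitly modeled – accounting for all possible solvation shell structures, in order to estimate a solvation free energy.
• Alternatively, a continuum solvent model may be employed.

[Figure: continuum solvent scheme for two atoms i and k with charges Qi and Qk, surface elements p, t, u, v carrying the labels BEi;σi and BEk;σk, surface normals np and nt, and one term annotated "neglected!"]

$\sigma_k/\varepsilon_0 = (\mathbf{E}_p\cdot\mathbf{n}_p)_{p\in k}\,(1 - \varepsilon_{ext}/\varepsilon_{int})$
$\sigma_i/\varepsilon_0 = (\mathbf{E}_p\cdot\mathbf{n}_p)_{p\in i}\,(1 - \varepsilon_{ext}/\varepsilon_{int})$
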

D. Horvath et al., J. Chem. Phys. 104, 6679 (1996)


The Interaction Saga: (2bis) The Hydrophobic
Effect
• The mysterious force that separates grease and water is
not due to grease-grease van der Waals interactions being
stronger than grease-water attraction!
• It is not of electrostatic nature either, because greasy alkyl
chains have no charges!
• Actually, it’s not a force at all, but the consequence of the
drift towards a more probable state of matter (?!)
• For practical purposes, however, it makes sense to believe
that hydrophobes « attract » each other – for making
hydrophobic contacts significantly improves binding
affinity!
Physical Chemistry For Dummies: The Rules
• Hydrophobes make favorable contacts with other
hydrophobes (we do not want to know why!). Assume
strength proportional to the buried hydrophobic area.
• Hydrophobes in close contact to polar groups cause
frustration, for they chase away the water molecules
favorably solvating the latter and offer no substitute
interactions
• Hydrogen bond donors seek to pair with acceptors, so that
they may reestablish the water hydrogen bonds they lost
• Cations seek to pair with anions and avoid hydrophobes.
• Shape is of paramount importance: groups of the same kind
may replace each other if they are similarly shaped
BioIsoSteres – Equivalent Functional Groups
• Wikipedia: bioisosteres are substituents or groups with
similar physical or chemical properties that impart similar
biological properties to a chemical compound

[Figure: structure drawings of example bioisosteric groups]
Pharmacophore Patterns
• The pharmacophore pattern of a molecule
characterizes the relative arrangement of all its
pharmacophore types
– What pharmacophore types are represented?
– How are they arranged (spatially, topologically) with
respect to each other ?
– How can these aspects be captured numerically to yield
molecular descriptors of the pharmacophore pattern?
• Note: Pharmacophore patterns are essentially 3D.
Since geometry is determined by connectivity, 2D
“pharmacophore patterns” also make sense!
Exploiting pharmacophore patterns…

• N-dimensional vector D(M)=[D1(M), D2(M), …, DN(M)];
each Di encodes an element of the pharmacophore pattern
– Allows meaningful quantitative definitions of molecular
similarity:
• Neighborhood Behavior: Similar molecules - characterized by covariant
vectors - are likely to display similar biological properties
• As chemists do not easily perceive the pharmacophore pattern, such
covariance may reveal hidden but real molecular relatedness…
– May serve as starting point for searching a binding
pharmacophore – the subset of features that really
participate in binding to a receptor
• Machine learning to select those elements Di that are systematically
present in actives, but not in inactives of a molecular learning set!
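As an illustration of a "quantitative definition of molecular similarity" on descriptor vectors, here is a minimal sketch of two common choices: a Tanimoto coefficient generalized to real-valued vectors, and a Pearson correlation coefficient as one way to express covariance. Neither is claimed to be the metric used by the author; both function names are illustrative.

```python
import math


def tanimoto(d1, d2):
    """Tanimoto coefficient for real-valued vectors: d1.d2 / (|d1|^2 + |d2|^2 - d1.d2)."""
    dot = sum(x * y for x, y in zip(d1, d2))
    denom = sum(x * x for x in d1) + sum(y * y for y in d2) - dot
    # Two all-zero vectors are treated as identical (an arbitrary convention).
    return dot / denom if denom else 1.0


def pearson(d1, d2):
    """Pearson correlation between two descriptor vectors of equal length."""
    n = len(d1)
    m1, m2 = sum(d1) / n, sum(d2) / n
    cov = sum((x - m1) * (y - m2) for x, y in zip(d1, d2))
    v1 = sum((x - m1) ** 2 for x in d1)
    v2 = sum((y - m2) ** 2 for y in d2)
    if v1 == 0 or v2 == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(v1 * v2)
```

For example, `tanimoto([1, 1, 0], [1, 0, 0])` gives 0.5, since the vectors share one of the two populated elements.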
Some examples of "hidden similarity"

[Figure: three bar charts (scale 0 to 100) with the accompanying compound structures, over a panel of targets: CGRP, MAPkin, IL-8, NEUPTh, HIVP, PK55fyn, EGF-TK, PKC, PDEIV, PDEII, Elast, CatB, K-ATP, V1Ah, Sigma1, 5HTUpt, 5HT6h, 5HT3h, 5HT2ch, 5HT1D, 5HT1Ah, Muh, NPY, NK1h, M3h, M1h, ML1, H1c, Galan, ETAh, DaUpt, D2h, D1h, CCKAh, B2h, Bomb, BZDc, AT1h, Beta1h, Alpha2, Alpha1, A1h]
Tricentric Pharmacophore Fingerprints:
monitoring feature arrangement
• Topological: the distance between two features equals the
(minimal) number of chemical bonds between them

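[Figure: example molecule (containing Cl, O and N atoms) with topological distances of 4, 9 and 11 bonds marked between pairs of features]

• Spatial: if stable conformers are known, use the distance in
Å between two features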
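The topological distance defined above is a shortest-path length on the molecular graph, so a breadth-first search is enough to obtain it. The sketch below assumes a hypothetical input format (atoms numbered from 0, bonds given as index pairs); a cheminformatics toolkit would normally supply the distance matrix directly.

```python
from collections import deque


def topological_distances(n_atoms, bonds, source):
    """Minimal number of bonds from `source` to every atom (None if unreachable)."""
    adjacency = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    dist = [None] * n_atoms
    dist[source] = 0
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if dist[neighbor] is None:          # first visit = shortest path
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist
```

For a six-membered ring (atoms 0 to 5) with a substituent atom 6 attached to atom 0, the distances from atom 6 are `[1, 2, 3, 4, 3, 2, 0]`: the ring is traversed in whichever direction is shorter.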
Example: Binary Pharmacophore Triplets

Basis Triplets:
• all possible feature combinations
• at a given series of distances…

[Figure: example molecule with the edge lengths between its features, above the list of basis triplets (labels combining types such as Hp, Ar and PC with edge lengths); each basis triplet is associated with one element of the binary vector below]

0 0 0 … 0 0 … … 1 … … … 0 … … 0 …

Pickett, Mason & McLay, J. Chem. Inf. Comp. Sci. 36:1214-1223 (1996)
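To make the idea of a binary triplet fingerprint concrete, here is a minimal topological sketch. It is not the cited implementation: the input format (a list of `(atom_index, type)` features plus a nested distance lookup) is assumed, and the label convention (each corner type followed by the length of the opposite edge, corners sorted) is borrowed from the canonical representation used later in this deck.

```python
from itertools import combinations


def triplet_label(corners):
    """corners: three (type, opposite_edge_length) pairs -> canonical label."""
    return "-".join(f"{t}{d}" for t, d in sorted(corners))


def molecular_triplets(features, dist):
    """Labels of all feature triplets found in one molecule.

    features: list of (atom_index, pharmacophore_type)
    dist:     dist[i][j] = topological distance between atoms i and j
    """
    labels = set()
    for (i, ti), (j, tj), (k, tk) in combinations(features, 3):
        # Each corner is paired with the edge joining the two other corners.
        labels.add(triplet_label([(ti, dist[j][k]),
                                  (tj, dist[i][k]),
                                  (tk, dist[i][j])]))
    return labels


def binary_fingerprint(labels, basis):
    """One bit per basis triplet: 1 if present in the molecule, else 0."""
    return [1 if b in labels else 0 for b in basis]
```

With features Hp, Ar and PC separated by 6 (Hp to Ar), 4 (Hp to PC) and 7 (Ar to PC) bonds, the single triplet is labeled `Ar4-Hp7-PC6`, and only the matching position of the basis list is set to 1.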
First key improvement: Fuzzy mapping of
atom triplets onto basis triplets in 2D-FPT

[Figure: the example molecule and the list of basis triplets; molecular triplets now add to the occupancy of the basis triplets they approximately match (increments such as +6 and +3) instead of setting single bits]

0 0 0 … 0 0 … +6 … … +3 … … … … 0 …

Di(m) = total occupancy of basis triplet i in molecule m.


Combinatorial enumeration of basis triplets
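• Example: there are 36796 basis triplets verifying the triangle
inequalities when considering 6 pharmacophore types and
11 edge lengths from Emin=3 to Emax=13 with an
increment of Estep=1: (3, 4, 5, …, 13)
– Canonical representation: T1d23-T2d13-T3d12 with T3≥T2≥T1
(alphabetically); each corner type is followed by the length of the
opposite edge.

[Figure: triangle example with the two labels Hp7-Ar4-PC6 and Ar4-Hp7-PC6; only the alphabetically ordered Ar4-Hp7-PC6 is canonical]

– Out of two corners of the same type, priority is given to the one
opposed to the shorter edge.

[Figure: triangle example with two Hp corners and the two labels Ar4-Hp7-Hp6 and Ar5-Hp6-Hp7]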
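The enumeration can be sketched by noting that a canonical triplet is simply a sorted multiset of three (type, opposite edge) corners. The sketch below assumes strict triangle inequalities (degenerate triangles are excluded) and uses the type abbreviations Ar, HA, HD, Hp, NC, PC; under these assumptions it yields 36796 triplets for the setup quoted above. It is an illustration, not the author's code.

```python
from itertools import combinations_with_replacement, product

TYPES = ("Ar", "HA", "HD", "Hp", "NC", "PC")  # six pharmacophore types


def is_triangle(a, b, c):
    """Strict triangle inequalities on the three edge lengths."""
    return a < b + c and b < a + c and c < a + b


def enumerate_basis_triplets(types=TYPES, e_min=3, e_max=13, e_step=1):
    """Canonical labels of all basis triplets for the given edge-length grid."""
    edges = range(e_min, e_max + 1, e_step)
    # A corner is (type, length of the opposite edge); sorting the corners
    # orders them by type, then by opposite edge (shorter edge first).
    corners = sorted(product(types, edges))
    labels = []
    for combo in combinations_with_replacement(corners, 3):
        if is_triangle(*(d for _, d in combo)):
            labels.append("-".join(f"{t}{d}" for t, d in combo))
    return labels
```

Because `combinations_with_replacement` emits each sorted multiset exactly once, no explicit duplicate check over corner permutations is needed. Calling the function with `e_min=2, e_max=12, e_step=2` gives 4494 labels under the same assumptions.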
Triplet matching procedure

• The triplet matching score represents the optimal degree of
pharmacophore field overlap:
– if corner k of the triplet is of pharmacophore type T, i.e. F(k,T)=1,
then it contributes to the total pharmacophore field of type T,
observed at a point P of the plane:

$\Psi_T(P) = \sum_{k=1}^{3} F(k,T)\,\exp(-\rho_T\, d_{k,P}^{2})$


Horvath, D. ComPharm pp. 395-439; in "QSPR /QSAR Studies by Molecular Descriptors", Diudea, M.,
Editor, Nova Science Publishers, Inc., New York, 2001
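The field of a given type at a point can be evaluated as a sum of Gaussians centered on the matching corners. The sketch below assumes hypothetical 2D corner coordinates and a binary flag F(k,T), equal to 1 when the corner type matches and 0 otherwise; the overlap optimization that produces the actual matching score is not shown.

```python
import math


def pharmacophore_field(corners, point, ptype, rho):
    """Psi_T(P) = sum over corners k of F(k,T) * exp(-rho_T * d_kP^2).

    corners: list of ((x, y), pharmacophore_type) for the three corners
    point:   (x, y) coordinates of the observation point P
    ptype:   pharmacophore type T of the field being evaluated
    rho:     Gaussian fuzziness parameter rho_T
    """
    total = 0.0
    for (x, y), corner_type in corners:
        if corner_type == ptype:                       # F(k,T) = 1
            d2 = (x - point[0]) ** 2 + (y - point[1]) ** 2
            total += math.exp(-rho * d2)
    return total
```

For a triangle with Hp corners at (0, 0) and (3, 0) and an Ar corner at (0, 4), the Hp field at the origin with ρ=0.6 is 1 + exp(-5.4): the corner sitting at the point contributes fully, the distant one almost nothing.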
Control parameters for triplet enumeration &
matching in two 2D-FPT versions.
Parameter | Description | FPT-1 | FPT-2
Emin | Minimal edge length of basis triangles (number of bonds between two pharmacophore types) | 2 | 4
Emax | Maximal edge length of basis triangles | 12 | 15
Estep | Edge length increment for enumeration of basis triangles | 2 | 2
e | Edge length excess parameter: in a molecule, triplets with edge length > Emax+e are ignored | 0 | 2
Δ | Maximal edge length discrepancy tolerated when attempting to overlay a molecular triplet atop a basis triangle | 2 | 2
ρHp = ρAr | Gaussian fuzziness parameter for apolar (Hydrophobic and Aromatic) types | 0.6 | 0.9
ρPC = ρNC | Gaussian fuzziness parameter for charged (Positive and Negative Charge) types | 0.6 | 0.8
ρHA = ρHD | Gaussian fuzziness parameter for polar (Hydrogen bond Donor and Acceptor) types | 0.6 | 0.7
l | Aromatic-Hydrophobic interchangeability level | 0.6 | 0.5
(none) | Number of basis triplets at given setup | 4494 | 7155
Second key improvement: Proteolytic
equilibrium dependence of 2D-FPT

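[Figure: one molecule shown as two microspecies with populations of 12% and 88%; the triplets drawn involve Ar, PC and NC corners with edge lengths of 5 and 8]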
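One plausible reading of this slide is that each microspecies gets its own fingerprint and the molecular fingerprint is their population-weighted average. The sketch below makes that assumption explicit; the dictionary-based fingerprint format and the function name are illustrative and not taken from the source.

```python
def population_weighted_fingerprint(microspecies):
    """Average microspecies fingerprints, weighted by their populations.

    microspecies: list of (population, fingerprint) pairs, where each
    fingerprint maps a basis triplet label to its occupancy.
    Populations are renormalized, so fractions or percentages both work.
    """
    total = sum(population for population, _ in microspecies)
    if total <= 0:
        raise ValueError("populations must sum to a positive value")

    averaged = {}
    for population, fingerprint in microspecies:
        weight = population / total
        for label, occupancy in fingerprint.items():
            averaged[label] = averaged.get(label, 0.0) + weight * occupancy
    return averaged
```

With populations of 12% and 88%, a triplet that exists only in the minor form keeps 12% of its occupancy, so a small shift in the ionization equilibrium changes the fingerprint gradually rather than switching a feature on or off.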
Some ‘activity cliffs’ in rule-based descriptor
space are smoothed out in 2D-FPT-space
[Figure: example compounds whose ionizable groups are labeled Neutral, Cation or Anion, annotated with population percentages such as 50% and 90%]
Pharmacophore Pattern-Based Similarity
Queries: Lead Hopping!
[Workflow diagram, box labels: Pharmacophore Hypothesis; Reference Fingerprint; Automated Fingerprint Matching...; Potential Pharmacophore Fingerprint Library; Nearest Neighbors; Best Matching Candidates; Superposition-based Similarity Scoring; Docking (?)]
Successful Virtual Screening Simulations
[Figure: panels plotting % Retrieved Seed Compounds against Selection Size (0 to 200) for confirmed actives and confirmed inactives, with curves for PF, OPT3 and FPT-2; panel labels include D2 and TK]
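The quantity on the vertical axis can be computed from a ranked hit list as follows. This is a minimal sketch under the assumption that "seed compounds" are known compounds hidden in the screened library and that the selection consists of the top-ranked nearest neighbors; identifiers and the function name are hypothetical.

```python
def percent_retrieved(ranked_ids, seed_ids, selection_sizes):
    """Percentage of seed compounds found among the top-N ranked compounds.

    ranked_ids:      compound identifiers sorted by decreasing similarity
    seed_ids:        identifiers of the seeded compounds
    selection_sizes: values of N at which the curve is evaluated
    """
    seeds = set(seed_ids)
    if not seeds:
        raise ValueError("at least one seed compound is required")
    curve = []
    for size in selection_sizes:
        hits = len(seeds.intersection(ranked_ids[:size]))
        curve.append(100.0 * hits / len(seeds))
    return curve
```

A useful virtual screen retrieves a large share of the seeded actives at small selection sizes while retrieving few of the seeded inactives.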
Successful QSAR model construction with 2D-FPT: predicting c-Met TK activity

[Figure: experimental pIC50 vs. calculated pIC50 (both axes from 4 to 9) for learning set compounds and validation set compounds]

• 25 variables entering nonlinear model
• 153 molecules for training: RMSE=0.4 (log units), R2=0.82
• 40 molecules for validation: RMSE=0.8 (log units), R2=0.53
• 8 validation molecules out of 40 mispredicted by more than 1 log
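The statistics quoted for the model can be computed from paired experimental and calculated pIC50 values as sketched below. R2 is taken here as the coefficient of determination (1 minus residual over total sum of squares), which is an assumption: the slide does not say which definition was used.

```python
import math


def rmse(experimental, calculated):
    """Root-mean-square error, in the same (log) units as the pIC50 values."""
    n = len(experimental)
    return math.sqrt(sum((e - c) ** 2 for e, c in zip(experimental, calculated)) / n)


def r_squared(experimental, calculated):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(experimental) / len(experimental)
    ss_res = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    ss_tot = sum((e - mean) ** 2 for e in experimental)
    return 1.0 - ss_res / ss_tot


def count_mispredicted(experimental, calculated, threshold=1.0):
    """Number of compounds whose error exceeds `threshold` log units."""
    return sum(abs(e - c) > threshold for e, c in zip(experimental, calculated))
```

For the made-up values experimental = [5, 6, 7, 8] and calculated = [5.5, 6, 7, 6.5], the RMSE is about 0.79 log units, R2 is 0.5, and one compound is off by more than 1 log unit.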
What more could be done?

• 3D FPT version under study


– does it pay off to generate conformers? How many would you
need to get better results than with 2D-FPT? What’s the best
conformational sampler to use?
• Accessibility-weighted fingerprints?
– class to return (topological and/or 3D) estimate of the solvent-
accessible fraction of an atom?
• Tautomer-dependent fingerprints?
– if tautomers and their percentage were enumerated like any other
microspecies…
THE END
Pharmacophore Hypotheses

(A): From individual Active Leads: 2D/3D


• ALL features in the Lead assumed relevant for binding
(B): Consensus hypotheses from set of Leads: 2D/3D
• Ignore features that can be deleted without losing activity
(C): Site-Ligand interaction models: 3D*
• Select Ligand features shown to interact with the site in the
3D X-ray structure of the site-ligand complex.
(D): Active Site filling models: 3D*
• Design a pharmacophoric feature distribution complementary
to the groups available in the active site
* In these cases, docking may be performed starting from pharmacophore-based
overlays
ComPharm Overlay…

- chosen conformer of the reference
- chosen conformer of the candidate
- pair of matching atoms
- 3 Euler angles
- mirroring toggle

GA-controlled overlay optimization
ComPharm Pharmacophoric Fields

Rows: reference atoms; columns: pharmacophoric features

Atom | Alk. | Aro. | HBA | HBD | (+) | (-)
1 | X11 | X12 | X13 | X14 | X15 | X16
2 | X21 | X22 | X23 | X24 | X25 | X26
3 | X31 | X32 | X33 | X34 | X35 | X36
4 | X41 | X42 | X43 | X44 | X45 | X46
5 | X51 | X52 | X53 | X54 | X55 | X56

• A descriptor of the nature of the molecule’s pharmacophoric
neighborhood “seen” by every reference atom, assuming an
optimal overlay of the molecule on the reference...
